How does it compare to the largest DeepSeek and Claude Opus 4.6?
I got used to blazing-fast speed and accurate results. I’m not buying a server and 128 GB of RAM just to run a model similar to GPT-4.
ATLAS has some benchmarks in the repo, and going by those it’s comparable to Opus 4.6. You don’t actually even need 128 GB of memory for that: an 8-bit quantized model will run in around 32 GB and still perform quite well.
You should be able to get very decent performance with 128 GB of VRAM running Qwen 3.6 with something like https://github.com/itigges22/ATLAS, especially if you run the MTP build: https://huggingface.co/unsloth/Qwen3.6-27B-MTP-GGUF
A friend of mine gets something like 50 tokens a second with it, and output quality is quite decent.
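For anyone wondering where the "around 32 GB" figure for an 8-bit quant comes from, here is a rough back-of-the-envelope sketch. It assumes a 27B-parameter model (taken from the repo name in the link above) and counts the weights only; the helper name `weight_memory_gb` is made up for illustration, and the 8.5 bits-per-weight row is an approximation for block-quantized 8-bit formats that store a scale alongside each block. KV cache, activations, and runtime overhead come on top and depend on context length, so treat this as an estimate rather than a measurement.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GB (10^9 bytes).

    Hypothetical helper for illustration; ignores KV cache and runtime overhead.
    """
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


if __name__ == "__main__":
    # 27 is an assumption based on the "27B" in the linked model name.
    params_billions = 27
    for bits in (16, 8.5, 8, 4):
        gb = weight_memory_gb(params_billions, bits)
        print(f"{bits} bits/weight: ~{gb:.1f} GB for weights")
```

At 8 bits per weight that works out to about 27 GB for the weights, which leaves a few GB of a 32 GB budget for the KV cache and everything else.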