Azeez @AtlasInference

Building Atlas, a pure Rust inference engine with custom CUDA kernels | Ambassador @Alibaba_Qwen | atlasinference.io | Joined March 2026

Tweets

92
Followers

525
Following

42
Likes

99

Azeez @AtlasInference

3 days ago

@RisingSayak @huggingface We're trying to enable Qwen3.6, pushed some modified kernels and made a PR for transformers GDN support on DGX Spark :) github.com/huggingface/tr…

0 0 4 142 1

Cross-architecture from a single codebase is exactly why we built SCALE. Thrilled to see @AtlasInference getting this running! More performance optimizations for both @AMD and @nvidia are on the way. scale-lang.com

Azeez @AtlasInference

5 days ago

Atlas Inference is running Qwen3.6-27B on AMD Strix Halo 🥳 Using @SpectralCom's SCALE ROCm backend, our CUDA kernels compile and run on RDNA⚙️ Cross-architecture inference from ONE codebase 🗣️ Thank you @AIatAMD for the gift 🙏 POC ✅ excited to keep tuning performance⚡️

7 2 31 2K 10

1 1 4 165 0

Azeez @AtlasInference

4 days ago

@worawisut @seree @SpectralCom @AIatAMD I think we needed it but I hope @worawisut also needed it 😉

1 0 0 16 0

Azeez @AtlasInference

5 days ago

@SpectralCom @AIatAMD We picked @Alibaba_Qwen's series to test because it is the de facto standard for local LLMs! Join our Discord for access to early releases, feature requests, and any help you may need with serving 🫂 discord.com/invite/6vDbKaK… GitHub linked below🔗 github.com/Avarok-Cyberse…

0 0 0 284 0

Azeez @AtlasInference

5 days ago

@no_stp_on_snek Very soon😉 thanks again @no_stp_on_snek

0 0 1 37 0

Azeez @AtlasInference

6 days ago

@RisingSayak @NVIDIAAI Makes sense. We technically support vision for the Qwen3.6-suite but maybe not exactly what you're looking for just yet. Happy to build for any fitting use cases though!

1 0 1 41 0

Azeez @AtlasInference

7 days ago

@LottoLabs

0 0 3 533 0

Azeez @AtlasInference

7 days ago

@seree Thanks for taking the time to run through these! I think the default mem allocation may be higher than needed for a smaller dense model like this. Plz dm or post the details in #bugs regarding any of these other pieces, should be customizable/avoidable :) appreciate the feedback

1 0 3 161 0

Azeez @AtlasInference

a week ago

@Alibaba_Qwen Excited to try Qwen3.7-Max (plz OSS release soon🙏) Look at how deeply embedded we are in optimizing @Alibaba_Qwen: 3.5/3.6-35B, 3.5/3.6-27B, 3.5-122B (EP=2), 3-Next-80B (GDN/Mamba-2), 3-VL, 3-Coder. Achieved 130 tok/s on 3.5-35B. The Qwen series is genuinely WHY we built Atlas!

1 1 4 513 0

Azeez @AtlasInference

a week ago

It’s official: @AtlasInference is now a @Alibaba_Qwen ambassador! 🤝 Our mission started with Qwen. It remains our top priority and most optimized series. Qwen revolutionized open-source AI, and we’re excited to keep pushing its limits ⚡️ Thank you to our amazing community❤️‍🔥

5 3 24 949 2

Azeez @AtlasInference

2 weeks ago

@torfi_F_Olafss @huggingface Yes we optimize per {model}_{quant} pair! So to answer your question @torfi_F_Olafss this should definitely help the NVFP4 kernel landscape. Also just as a random sidenote I have many more hours on Minecraft than Atlas inference so take that as you will lol

1 0 1 77 0

Azeez @AtlasInference

2 weeks ago

DGX Spark lovers 🚨 Thank you @huggingface for merging SM_121 support into kernel-builder, every dev can now pull optimized kernels via get_kernel() 🚀 @AtlasInference pushed to make sure the DGX Spark community had representation 💾 Let's keep squeezing these GB10 chips 📈

5 5 59 3K 36

Azeez @AtlasInference

2 weeks ago

@huggingface See github.com/huggingface/ke… for more details. Special thanks to @RisingSayak and the broader hf team for working quickly to resolve this, and being open to collaboration from the incredible open source community 💯

0 0 6 466 2

Azeez @AtlasInference

2 weeks ago

@no_stp_on_snek @TheAhmadOsman @pupposandro Our community is amazing 💯🙏 thank you @no_stp_on_snek TQ+ will definitely help elevate Atlas, especially from THE founder himself! 🧠

0 0 1 24 0

Tom Turney @no_stp_on_snek

2 weeks ago

@TheAhmadOsman @pupposandro That's right, that's why I'm working on TQ+ cache compression techniques for @AtlasInference right now in my copious amounts of spare time... now back to hand-tuning Metal kernels: github.com/Avarok-Cyberse…

1 1 5 347 2

Azeez @AtlasInference

3 weeks ago

@therealbifkn @NVIDIAAI 🔥

0 0 1 49 0

Azeez @AtlasInference

3 weeks ago

@jun_song When Gemini translates it probably destroys your original structure and flow. And it brings more of an AI flavor to it. Happens to me when I go from Urdu all the time. Either way, you can't really control it. Bi-directional encoders do have their limits lol