Working through our GPU Puzzles? Don't sleep on our companion YouTube series that walks through puzzles 1 through 5. Follow along, pause, and rewind to make sure you grok the solution: youtube.com/watch?v=-VsP4k…
At @AMD AI DevDay, @clattner_llvm showed that AMD MI355X paired with the Modular platform delivers image gen performance equivalent to Blackwell at 5.5x lower total cost.
Watch Chris' luminary talk: youtube.com/watch?v=FjFC__…
Thanks again to @AIatAMD for a great event!
We were happy to sponsor #MLSys 2026. Across the talks, posters, and keynotes, three themes defined the current state of inference serving:
1. Agentic engineering
2. KV Cache optimization
3. Heterogeneous hardware
Our read on each: modular.com/blog/three-tre…
In the latest Modular Tech Talk, Mojo Compiler Engineer Billy Zhu presents Mojo's attribute-based expression system and how it enables Mojo's powerful type-safe meta-programming:
youtu.be/4DKInnobCjY
One meeting: BLAS routines in Mojo, raylib Mojo bindings, consumer hardware support, MAX vs. vLLM, and where clauses in parametric structs. The community brought presentations, demos, and questions.
Catch up with the full recording of yesterday's community meeting: youtube.com/watch?v=BasUPN…
Haven't had time to explore Mojo 1.0 beta yet? @InfoWorld published a piece on Mojo 1.0 that will get you up to speed on language basics, metaprogramming, Python interop, GPU support, and more:
infoworld.com/article/417315…
Tomorrow's community meeting is a deep dive into 3 community Mojo projects: a Mojo implementation of BLAS routines, Raylib v6 Mojo bindings, and replacing OpenCL with a Mojo kernel within Darktable, an open source photo editing program.
See what the community is building: luma.com/may-modular
Seoul showed up! Packed room, sharp questions, a special message from @clattner_llvm, and an intro to Mojo 🔥 and MAX. Our first developer meetup in Korea. Thank you to SqueezeBits for making this happen, and to everyone who came out.
Traditional load balancers were built for stateless services, but LLM inference backends aren't stateless.
Part 2 of our inference routing series covers the data layer that queries live backend state across hundreds of pods at microsecond latency:
modular.com/blog/why-llm-i…
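To make the "not stateless" point concrete, here is a minimal sketch of state-aware routing. It is not the router described in the linked post. The `Backend` fields, the `pick_backend` function, and the pod names are all hypothetical, and the scoring rule (prefer a pod with a warm KV cache for the prompt prefix, then the shortest queue) is just one plausible assumption about which backend state matters.

```python
from dataclasses import dataclass, field


@dataclass
class Backend:
    """Hypothetical snapshot of one inference pod's live state."""
    name: str
    queue_depth: int = 0  # requests currently waiting on this pod
    cached_prefixes: set = field(default_factory=set)  # prefixes with a warm KV cache


def pick_backend(backends, prompt_prefix):
    """Prefer a pod that already caches this prefix; break ties by queue depth."""
    def score(b):
        cache_miss = prompt_prefix not in b.cached_prefixes
        # False sorts before True, so cache hits win over shorter queues.
        return (cache_miss, b.queue_depth)
    return min(backends, key=score)


pods = [
    Backend("pod-a", queue_depth=1),
    Backend("pod-b", queue_depth=3, cached_prefixes={"system-prompt-v1"}),
]

print(pick_backend(pods, "system-prompt-v1").name)  # cache hit beats the shorter queue
print(pick_backend(pods, "unseen-prefix").name)     # no cache hit anywhere, shortest queue wins
```

A round-robin balancer would ignore both fields, which is why it fits stateless services better than inference backends.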
Agentic engineering is changing what one developer can ship with Mojo in a few weeks.
@ehsanmok set out to build a pastebin service in pure Mojo 🔥 using our recently released first Mojo 1.0 beta. Using AI coding agents and Mojo's agent skills, he built 10 libraries from scratch: a full networking stack, SQLite bindings, high-performance JSON, reflection-driven serde, fuzz testing tools, and more. The app is live at mobin.fly.dev.
Read about his stress test of Mojo 1.0 beta: modular.com/blog/how-i-bui…
AI agents in healthcare face tight constraints: latency can't exceed 800ms per turn, the first turn processes 10k tokens of context, and safety models analyze the conversation in parallel.
Using our MAX framework, @hippocraticai keeps patient conversations instant (sub-second TTFT), hits aggressive performance targets without sacrificing model accuracy, and runs across accelerators as new hardware comes to market.
A look at how regulated enterprises like Hippocratic AI use MAX in production for real-time patient conversations:
modular.com/blog/hippocrat…
The MAX-LLM book just made it even easier to build an LLM from scratch.
The new notebook format lets you run the GPT-2 components interactively, inspect real tensor shapes, and generate text from pretrained weights.
Prefer to browse first? The pre-rendered version shows all outputs without running a cell:
github.com/modular/max-ll…
🚀 Ring-2.6-1T is now open source.
A trillion-scale flagship thinking model built for real-world complex tasks: Agent workflows, coding & engineering, long-horizon tasks, complex reasoning, research, and enterprise automation.
It is designed to move beyond “answering” toward execution: understanding context, planning steps, calling tools, and staying stable across long task chains.
Highlights:
- Advanced agentic workflow support.
- Reasoning effort levels: high for agentic tasks, xhigh for complex reasoning.
- Scalable asynchronous RL via the IcePop algorithm, enabling stable, trillion-scale training for long-horizon agentic RL.
Two can't-miss events coming up in Seoul:
👉 Modular's Judy Heflin presents an intro to Mojo & MAX at the Efficient AI Offline Meetup on May 16th at Dream Plus Gangnam: event-us.kr/squeezebits/ev…
👉 Our inaugural Modular Developer Meetup in Seoul on May 19th at Belgium Jazz Cafe: luma.com/modular-seoul
Come for the talks from Modular & SqueezeBits. Stay for the NVIDIA RTX 5080 + AMD Radeon RX 9070 raffles 👀
Mojo has minimal boilerplate, a strict type system, and compile-time validation of code, all things that make it well-suited for use with AI coding agents.
We're taking this up a level by publishing a set of Mojo agent skills that make translating code to Mojo a breeze.
Full writeup + CUDA kernel ➡️ Mojo translation demo: modular.com/blog/translati…
Bolt's big dark: your best friend Bolt is a small silver robot with one wobbly antenna and a tiny light on his chest that blinks when he's nervous. He says the dark feels too big and too quiet and he doesn't know what's in it. Bedtime is in 10 minutes, and it's up to you to reassure him that everything will be okay:
inkwell.modular.com/shared/bolt-s-…
What would you build with lightning fast image generation?
Inkwell is @iamtimdavis' answer: a dynamic storybook-building app that uses @bfl_ml's FLUX2 and @googlegemma 4 to write and illustrate in real time. Powered by Modular Cloud.
We sat down with Tim to talk through how Inkwell works under the hood: youtube.com/watch?v=F1X5bm…
The short version: LLM tokens stream directly into the image prompt before the story finishes generating. First pixel under 500ms. Built on Mojo kernels and MAX serving infra.
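A rough sketch of the streaming idea, assuming nothing about Inkwell's actual code: a generator stands in for the LLM token stream, and a second generator emits an image prompt as soon as enough tokens have arrived rather than waiting for the full story. The names `story_tokens` and `image_prompts`, the token list, and the `every` threshold are all hypothetical.

```python
def story_tokens():
    """Stand-in for an LLM token stream (hypothetical tokens)."""
    for tok in ["Bolt", " the", " robot", " looked", " at", " the", " dark", "."]:
        yield tok


def image_prompts(tokens, every=3):
    """Yield the story-so-far as an image prompt each time `every` new tokens arrive."""
    buf = []
    for tok in tokens:
        buf.append(tok)
        if len(buf) % every == 0:
            yield "".join(buf)


# The first prompt is available after three tokens, long before the stream ends.
first_prompt = next(image_prompts(story_tokens()))
print(first_prompt)

# Consuming the whole stream yields progressively longer prompts.
all_prompts = list(image_prompts(story_tokens()))
print(all_prompts)
```

Because both stages are lazy generators, the image side can start work on `first_prompt` while later tokens are still being produced.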