Get the latest ideas from a16z.
Yuka Lee and Justine Moore speak with Mohammad Norouzi, CEO and founder of Ideogram, a Toronto-based generative AI company that just released its first open-weight image generation model.
The conversation explores Ideogram's strategic shift from closed-source to open-weight models, focusing on its 9.3-billion-parameter model, which achieves state-of-the-art text rendering and design capabilities.
Mohammad discusses the technical innovations behind the company's JSON prompting system, which enables precise layout control and professional design workflows while maintaining the artistic "taste" that differentiates Ideogram from benchmark-optimized competitors.
The discussion covers enterprise customization, the future of creative AI workflows, and how smaller, specialized models can outperform larger general-purpose alternatives for specific use cases like graphic design and marketing.
Strategic Shift to Open-Weight Model Release
Ideogram transitioned from closed-source to open-weight to focus on foundation model development while partnering with inference providers, chip makers, and enterprise customers for customization and on-premise deployment.
"We decided to focus a little more on the model side. We think that's where a lot of potential exists" - Mohammad on concentrating resources on core model innovation rather than maintaining full-stack applications.
The open-weight approach enables new partnerships across the stack, from app developers to hardware optimizers, while maintaining direct user feedback through first-party applications.
JSON Prompting Innovation for Professional Design Control
The model uses detailed JSON prompting with thousands of words describing each image element, including bounding boxes, layout control, and positioning for precise design workflows.
"We don't want people to write in JSON. We don't think that's a natural way of interacting with these models" - Mohammad explains the backend translation from simple prompts to structured JSON representations.
JSON serves as an intermediate representation where language models describe images in structured format, then image generation processes the detailed specifications for consistent, editable output.
Professional users can access and modify the actual JSON input to the model, enabling precise control and consistency that "rolling the dice" approaches cannot provide.
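As a rough illustration of the intermediate representation described above, the sketch below builds a structured prompt from a short one. The schema is hypothetical: field names such as `elements`, `bbox`, and `kind`, and the helper `expand_prompt`, are assumptions made for this example and are not Ideogram's actual format. In the real system a language model would do the expansion; here it is a hard-coded stand-in.

```python
import json

# Hypothetical schema: field names are illustrative, not Ideogram's actual format.
def make_element(kind, description, bbox, text=None):
    """Describe one image element; bbox is (x, y, width, height) in 0..1 canvas units."""
    x, y, w, h = bbox
    if not (x >= 0 and y >= 0 and w > 0 and h > 0 and x + w <= 1 and y + h <= 1):
        raise ValueError(f"bbox {bbox} falls outside the canvas")
    element = {
        "kind": kind,
        "description": description,
        "bbox": {"x": x, "y": y, "w": w, "h": h},
    }
    if text is not None:
        element["text"] = text  # exact string the image should render
    return element

def expand_prompt(short_prompt):
    """Stand-in for the language-model step that expands a short prompt."""
    return {
        "prompt": short_prompt,
        "canvas": {"aspect_ratio": "4:5"},
        "elements": [
            make_element("background", "warm paper texture", (0.0, 0.0, 1.0, 1.0)),
            make_element("text", "bold serif headline", (0.1, 0.08, 0.8, 0.2),
                         text="SUMMER SALE"),
            make_element("image", "illustrated lemon, flat colors",
                         (0.25, 0.35, 0.5, 0.5)),
        ],
    }

spec = expand_prompt("poster for a summer sale")
print(json.dumps(spec, indent=2))
```

Because the specification is plain data, a professional user could change one bounding box or one text string and resubmit, rather than rewording a free-form prompt and hoping for the same layout.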
Text Rendering Breakthrough and Brand Differentiation
Ideogram's first model three years ago focused on accurate text generation when "image generation was synonymous with garbled text," establishing typography quality as their core brand differentiator.
The current model achieves paragraph-length text rendering accuracy matching closed-source competitors like GPT Image, despite being significantly smaller at 9.3 billion parameters.
Training involves AI models converting images to detailed text descriptions with bounding box information, then training the reverse process from structured text back to images.
"Text is very important part of image generation. And that became a very important part of our brand" - Mohammad on discovering the graphic design and storytelling market demand.
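The training recipe mentioned above (describe each image as structured text, then learn the reverse mapping) can be sketched as a data-preparation step. Everything here is a simplified assumption: `describe_image` is a placeholder for a captioning model and returns a fixed description, and the `TrainingPair` layout is invented for illustration.

```python
import json
from dataclasses import dataclass

@dataclass
class TrainingPair:
    condition: str   # serialized structured description (generator input)
    image_path: str  # target image the generator learns to reproduce

def describe_image(image_path):
    """Placeholder for a captioning model; returns a fixed structured description."""
    return {
        "elements": [
            {"kind": "text", "text": "OPEN", "bbox": [0.2, 0.1, 0.6, 0.2]},
        ]
    }

def build_pairs(image_paths, captioner=describe_image):
    """Image -> structured text, stored so training can run text -> image."""
    pairs = []
    for path in image_paths:
        description = captioner(path)
        pairs.append(TrainingPair(json.dumps(description, sort_keys=True), path))
    return pairs

pairs = build_pairs(["sign_001.png", "sign_002.png"])
for pair in pairs:
    print(pair.image_path, pair.condition)
```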
Small Model Efficiency and Enterprise Customization
At 9.3 billion parameters versus competitors' 80 billion, the model runs on single GPUs and consumer devices, enabling privacy-focused on-device inference for enterprises.
"We focused on innovation. We think there's still so much more to do to innovate" - Mohammad on competing through technical advancement rather than compute scale against Google-sized competitors.
Enterprise customers can fine-tune with as few as 15 images, with artists reporting 3x productivity improvements in comic book creation workflows.
"Every artist who has at least like 50 pieces of art... they can really customize this model to the nuances of their style, the texture of their canvas, and really get 2K output" - Mohammad on democratizing model customization.
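The single-GPU claim above can be sanity-checked with back-of-the-envelope arithmetic on weight storage alone. The precisions listed are assumptions (the episode summary does not say which one the model ships in), and the estimate ignores activations, text encoders, and other runtime overhead.

```python
# Bytes needed to store one parameter at common precisions.
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "int8": 1}

def weight_memory_gb(num_params, dtype):
    """Memory for the weights only, in decimal gigabytes."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9

for name, params in [("9.3B model", 9.3e9), ("80B model", 80e9)]:
    for dtype in ("bf16", "int8"):
        print(f"{name} @ {dtype}: {weight_memory_gb(params, dtype):.1f} GB")
```

At 16-bit precision the weights come to roughly 18.6 GB for 9.3 billion parameters versus 160 GB for 80 billion, which is the difference between fitting on one high-memory GPU and needing several.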
Taste-Driven Development vs Benchmark Optimization
"We really want our models to have taste" - Mohammad emphasizes artistic quality and style diversity over leaderboard performance, working with designers for side-by-side model comparisons.
Training deliberately avoids reinforcement learning that creates homogeneous outputs, instead preserving the diverse artistic styles learned during training so that results stay visually distinct.
"If you've seen some of the frontier models actually that score very highly in the leaderboards, they don't have a lot of kind of design variation. They always produce the same exact look" - Mohammad on differentiation strategy.
Internal evaluation prioritizes designer feedback over automated metrics, as "AI is not very good at doing the actual taste evaluation yet."
Future of Creative AI Workflows and Editing
Upcoming editable text and layout control features will use the same JSON structure, allowing modification of single elements while preserving overall composition for professional design iteration.
"For a lot of design and marketing use cases, we need editable design, not a single flat image" - Mohammad on the next frontier beyond static image generation.
Agentic workflows enable large-scale creative exploration through API calls, generating thousands of design variations to set high-level direction before humans refine the results in a traditional UI.
The visual representation space offers more customization potential than language models because "you immediately recognize the differences between brands" in visual design versus written communication.
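The single-element editing and large-scale variation ideas in this section can be sketched on a structured layout. This is a minimal illustration under assumptions: the `id`, `text`, and `bbox` fields and both helper functions are hypothetical, and no real Ideogram API is called; the code only produces the specifications an agent might submit.

```python
import copy

def replace_element(spec, element_id, **changes):
    """Return a new spec with one element changed; every other element is untouched."""
    new_spec = copy.deepcopy(spec)
    for element in new_spec["elements"]:
        if element["id"] == element_id:
            element.update(changes)
            return new_spec
    raise KeyError(f"no element with id {element_id!r}")

def headline_variations(spec, element_id, headlines):
    """One spec per candidate headline, all sharing the same layout."""
    return [replace_element(spec, element_id, text=h) for h in headlines]

# Hypothetical base layout with two elements.
base = {
    "elements": [
        {"id": "headline", "kind": "text", "text": "SUMMER SALE",
         "bbox": [0.1, 0.08, 0.8, 0.2]},
        {"id": "hero", "kind": "image", "description": "illustrated lemon",
         "bbox": [0.25, 0.35, 0.5, 0.5]},
    ]
}

variants = headline_variations(base, "headline", ["WINTER SALE", "SPRING SALE"])
for variant in variants:
    print(variant["elements"][0]["text"], variant["elements"][1]["bbox"])
```

Copying the specification before each change keeps the base layout intact, so every variant differs from it in exactly one field.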