Google Gemini Now Creates Music From Text and Images

Google just turned Gemini into a music producer. Starting today, users can generate custom 30-second tracks directly in the Gemini app using Lyria 3, Google’s latest AI music model. Type a prompt, upload an image, and get a fully produced audio clip in seconds.

Google’s announcement positions this as “a new way to express yourself,” which is marketing speak for: we’re taking on Suno, Udio, and every other AI music startup that’s gained traction over the past year. The difference? Gemini’s music creation lives inside an app millions already use.

What Lyria 3 Actually Does

Here’s the practical bit: Lyria 3 generates what Google calls “high-quality” tracks capped at 30 seconds. You can prompt it with text descriptions like “upbeat jazz piano for a coffee shop” or feed it an image and let the model interpret the mood.

That 30-second limit isn’t arbitrary. It keeps generation time fast and sidesteps some thorny copyright issues around longer compositions. It also makes these clips perfect for social media, which is probably the point.

How It Compares to What’s Out There

Suno and Udio currently let users generate full-length songs with lyrics. Meta released an AI music tool last year. OpenAI has Jukebox, though it hasn’t seen much development lately. What makes Gemini’s approach interesting is the multimodal angle—using images as prompts feels native to how Gemini already works.

Google has been methodically expanding what Gemini can do beyond text. The company recently rolled out student-focused features targeting education users. Music generation is a different play entirely—this is about creative tools and keeping Gemini relevant as competitors add more multimedia capabilities.

The Licensing Question Nobody’s Answering

Google’s blog post doesn’t mention training data. That’s not surprising, but it matters. Every AI music model faces questions about what copyrighted material went into the training set. Some startups have signed licensing deals with major labels. Others are fighting lawsuits.

Google has deep pockets and a legal team that can handle challenges, but the company also has YouTube, which means existing relationships with music rights holders. Whether those relationships translate into cleaner training data for Lyria 3 is anyone’s guess.

Why This Matters Now

The timing feels significant. Music generation has been the sleeper hit of generative AI—less hyped than chatbots or image generators, but quietly building large user bases. Suno claims millions of tracks created. Udio has gone viral multiple times on social platforms.

Google putting music creation into Gemini makes it mainstream-adjacent in a way standalone apps never quite achieve. It’s the same dynamic that played out with image generation: DALL-E reached everyday users once it was built into ChatGPT, while standalone tools like Midjourney stayed enthusiast territory.

The move also highlights where Google sees AI heading: not as a single-purpose tool, but as a unified interface for multiple creative tasks. Text, images, code, and now music—all from one prompt box. That’s the bet, anyway.

As more companies pile features into their flagship AI assistants, we’ll likely see a split. Power users will stick with specialized tools that offer more control. Casual creators will gravitate toward whatever’s already installed on their phone. Google is clearly betting on the second group being much larger.

Whether 30-second clips are enough to make Gemini music creation sticky is another question. But as AI capabilities continue expanding into creative domains, having music generation baked into a product millions already use gives Google a real distribution advantage.