AI Music Generation 2026 — Suno, Udio, Stable Audio, MusicGen, Mubert, ElevenLabs, Lyria — Where Are We Really?
By Youngju Kim (@fjvbn20031)
Prologue — What Has Changed in Two Years
Summer 2023: AI-generated music was a toy. One-bar melodies, awkward rhythms, vocals either absent or unintelligible. When Meta open-sourced MusicGen, the reaction was "neat" rather than "I'll write a song with this."
Spring 2024: Suno shipped v3, Udio opened its beta, and the mood shifted. A single text prompt produced a two-minute song with actual vocals. Rough in places, but for the first time people said "wait, this is real." Three months later, in June 2024, the RIAA sued both Suno and Udio for massive copyright infringement. Industry attention had arrived in earnest.
May 2026: the landscape has shifted again. Suno v5.5 clones a user's voice and supports personal fine-tunes. Udio has signed licensing settlements with Universal, Warner, Kobalt, and Merlin in sequence. Google acquired Riffusion's successor ProducerAI and folded it into Lyria 3. ElevenLabs expanded from voice into music. On the open-source side, YuE, ACE-Step, and DiffRhythm offer full-song models with vocals that run on a single RTX 4090.
And yet — vocals are still the hardest part. Korean lyrics still sound less natural than English. Anything past four minutes loses coherence. Models with airtight commercial licensing are still rare. The Suno summary judgment hearing is set for July 2026.
This post tries to map that landscape. Which tool fits which job, why vocals are difficult, where open source stands, how the lawsuits are unfolding, and what real workflows look like for indie game soundtracks, podcast intros, YouTube BGM, and songwriting ideation. This is not "AI is killing music" nor "AI is saving music." It is the middle ground that the actual practitioners live in.
One-line take: 2026 AI music is not about "replacing humans" but about "people who couldn't make music starting to make music." Knowing that boundary makes the tool choice easy.
1 · The Birth of the Category — What Happened in 2023–2024
1.1 Two Technical Lineages
AI music generation is the merger of two technical lineages.
Lineage 1: Autoregressive token models. Like text LLMs, tokenize audio and predict the next token. Meta's MusicGen (2023), Google's MusicLM (2023), and Suno's early versions belong here. Training works by compressing audio through a neural audio codec like EnCodec into tokens, then training a transformer on those token sequences.
Lineage 2: Diffusion-based audio. Apply image-diffusion architectures (Stable Diffusion) to audio. Stability AI's Stable Audio is the canonical example. Riffusion used a clever trick — convert audio to a spectrogram (a frequency image), run image diffusion on it, then convert the result back to audio.
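Riffusion's spectrogram trick is easier to grasp in code. Below is a minimal, illustrative round trip between a waveform and its frequency-image representation in plain NumPy. This is a sketch of the idea only: Riffusion's actual pipeline uses mel-scaled magnitude spectrograms and phase reconstruction (Griffin-Lim), whereas this toy keeps the complex phase so the inverse is exact.

```python
import numpy as np

def stft(x, n_fft=512, hop=128):
    """Slice audio into overlapping windows and FFT each one.
    The magnitudes of this array are the 'image' a diffusion model edits."""
    win = np.hanning(n_fft)
    frames = [x[i:i + n_fft] * win for i in range(0, len(x) - n_fft, hop)]
    return np.array([np.fft.rfft(f) for f in frames])

def istft(spec, n_fft=512, hop=128):
    """Overlap-add inverse: turn the frequency image back into a waveform."""
    win = np.hanning(n_fft)
    out = np.zeros(hop * len(spec) + n_fft)
    norm = np.zeros_like(out)
    for i, frame in enumerate(spec):
        s = i * hop
        out[s:s + n_fft] += np.fft.irfft(frame, n=n_fft) * win
        norm[s:s + n_fft] += win ** 2
    return out / np.maximum(norm, 1e-8)

# One second of a 440 Hz tone at 16 kHz stands in for "audio".
sr = 16_000
t = np.arange(sr) / sr
audio = np.sin(2 * np.pi * 440 * t)

spec = stft(audio)        # audio -> frequency image
recon = istft(spec)       # frequency image -> audio
# Interior samples (full window overlap) reconstruct almost exactly.
err = np.max(np.abs(audio[1024:15000] - recon[1024:15000]))
```

The hard part Riffusion had to solve is what this sketch sidesteps: after diffusion edits the magnitude image, the phase is gone and must be estimated before the inverse transform.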
By 2024 the two lineages cross-pollinated and vocal synthesis was bolted on. The real leap for Suno and Udio was producing a "full song with vocals and lyrics from text" — until then, almost everything was instrumental backing only.
1.2 Why Quality Jumped Suddenly
Three variables moved at once.
- Data. Access to large licensed music catalogs (or — as the lawsuits allege — scraped catalogs) became viable for training. MusicGen alone was trained on roughly 20,000 hours of licensed music.
- Compute. H100/H200 clusters made training multi-billion-parameter audio models feasible in reasonable time.
- Architecture. Neural audio codecs like EnCodec and SoundStream opened the door to handling audio as LLM-style tokens.
With those three in place, the trick that worked for text LLMs — "predict the next plausible token" — started working for music.
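That pipeline can be sketched end to end in a few lines. Everything here is a toy stand-in: real codecs like EnCodec learn their codebooks and use residual vector quantization, and the predictor is a transformer, not a bigram count table. The shape of the idea, audio in, token IDs out, next token predicted, is the same.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 'codec': a fixed codebook of K short waveform patches.
K, frame = 16, 8
codebook = rng.normal(size=(K, frame))

def encode(audio):
    """Audio -> token IDs: nearest codebook entry per frame."""
    frames = audio[: len(audio) // frame * frame].reshape(-1, frame)
    dists = ((frames[:, None, :] - codebook[None]) ** 2).sum(-1)
    return dists.argmin(axis=1)          # one integer token per frame

def decode(tokens):
    """Token IDs -> audio: concatenate the codebook patches back."""
    return codebook[tokens].reshape(-1)

audio = rng.normal(size=256)
tokens = encode(audio)                   # the sequence an LM would model

# 'Predict the next plausible token': a trivial bigram count table
# stands in for the billion-parameter transformer.
bigram = np.zeros((K, K))
for a, b in zip(tokens[:-1], tokens[1:]):
    bigram[a, b] += 1
next_token = bigram[tokens[-1]].argmax()
```

Generation is then just sampling tokens from the predictor and running `decode` on the result, which is, in miniature, what the autoregressive lineage does.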
1.3 The RIAA Bomb — June 2024
On June 24, 2024, the Recording Industry Association of America, representing Universal, Warner, and Sony, filed two copyright infringement suits — against Suno in the District of Massachusetts and Udio in the Southern District of New York. The core claim: "trained on copyrighted recordings without permission." The defense from both companies: "transformative fair use."
This is not an isolated dispute. It will decide the commercial fate of the entire AI music category. If the training data is ruled infringing, model retraining is required and the licensing structure for outputs changes fundamentally. That is why the wave of settlements started arriving in late 2025.
2 · Consumer Tools — Suno, Udio, Lyria, ElevenMusic
2.1 Suno — The Category Leader
As of May 2026, the most-used text-to-song tool is Suno. The progression: v3 (early 2024), v4 (2025), v5 (late 2025), v5.5 (March 26, 2026).
Three pillars in v5.5:
- Voices. Users record about thirty seconds of their own singing voice, register it, and the AI sings in that timbre. Pro and Premier subscribers only. Private by default.
- Custom Models. Upload your own catalog (e.g., songs you have made) to fine-tune v5.5 toward that style. Up to three per account.
- Studio. Receive stems separated by track — vocals, bass, drums, harmony, instrumentation. Drop them into a DAW for post-production.
Quality? For English lyrics in mainstream genres like pop, rock, electronic, or folk, a first-time listener will believe a human made it. Korean and other less-trained languages still struggle with pronunciation and prosody (steadily improving since 2025, still weaker than English). Structurally complex genres like jazz improvisation or full classical orchestration remain weak spots.
Commercial licensing is explicitly granted on Pro and above, though marketing anything as "100% safe" is hard while the RIAA case is pending.
2.2 Udio — A Different Aesthetic
Udio was founded in December 2023 by former Google DeepMind researchers, led by CEO David Ding. The April 2024 seed round of $10M was led by Andreessen Horowitz, with notable participation from Instagram co-founder Mike Krieger, will.i.am, Common, and other music-industry figures.
Udio's output has a subtly different character from Suno's. Where Suno tends toward "polished pop," Udio leans toward "track produced by a producer." It scores especially well in hip-hop, R&B, Latin, and electronic.
On October 29, 2025, Universal Music Group settled with Udio — a payment plus a licensing deal for a joint AI music platform launching in 2026. On November 25, Warner settled too (a multi-million-dollar settlement plus a licensing partnership, with Suno acquiring Songkick from Warner as part of the package). Kobalt and Merlin followed. As of May 2026, Sony is the only major still actively litigating against Udio.
2.3 Lyria 3 (Google DeepMind)
Google moved on two fronts.
Lyria the model. From Lyria 2 (May 2025) to Lyria 3 (February 18, 2026). 48kHz stereo, up to three minutes, working directly on audio tokens rather than spectrograms. SynthID watermarking is mandatory. Access via Vertex AI and the Gemini API.
Riffusion acquisition. On February 24, 2026, Google acquired ProducerAI (formerly Riffusion). ProducerAI was a conversational music-generation agent with a million users. After acquisition it was folded into Lyria 3. The spectrogram-diffusion lineage that Riffusion pioneered now lives inside Lyria 3.
2.4 Lyria RealTime — A Different Usage Model
Lyria RealTime is a separate beast. Not "generate a song" but "control streaming audio in real time." You adjust style, tempo, and mood live while infinite music plays. Primary use cases: live streaming, game BGM, interactive installations. Accessed via the Gemini API.
2.5 ElevenMusic (ElevenLabs)
ElevenLabs, known for voice synthesis, launched Eleven Music on August 5, 2025. On April 1, 2026, it relaunched as ElevenMusic with a standalone iOS app and a full consumer platform.
The differentiator is licensing. ElevenLabs signed training-data deals with Merlin Network, Kobalt Music Group, and SourceAudio in advance. Marketing positions ElevenMusic as "cleared for commercial use." The key signal: it deliberately did not train on the major labels' RIAA-side catalogs.
Functionally, you can control length and lyric presence, and remix existing tracks (genre and tempo shifts). The free tier covers seven songs per day. Combined with ElevenLabs' voice synthesis, finer vocal-character control is possible.
2.6 Comparison — Consumer Tools
| Tool | Vocal Quality | Instrumental | Korean Lyrics | Length | Commercial License | Primary Use |
|---|---|---|---|---|---|---|
| Suno v5.5 | Very high | High | OK | Up to 8 min | Pro and above, explicit | Songwriting, content |
| Udio | High | Very high | OK | 4+ min | Standard and above | Producing, hip-hop/R&B |
| Lyria 3 | Medium (lyric-light) | Very high | Weak | Up to 3 min | Vertex AI terms | Enterprise integration |
| ElevenMusic | High | High | Not benchmarked | Up to 5 min | Explicitly cleared | Content creators |
| Lyria RealTime | None | High | N/A | Infinite stream | API terms | Games, live |
3 · Open Source and Local Options — MusicGen, Stable Audio, YuE, ACE-Step
3.1 Why Open Source
Three reasons.
- Cost. No subscription, unlimited generation. Runs on a single local RTX 4090.
- Privacy. Lyrics and concepts never leave your machine. Crucial for unreleased projects.
- Control. Fine-tuning, fixed seeds, batch generation, and automation pipelines become possible.
The cost — quality lags consumer tools by a half-step, and licensing terms need careful reading.
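The "control" point, fixed seeds plus batch generation, is what an automation pipeline actually looks like. A minimal sketch follows; `generate` here is a hypothetical placeholder, not any real model's API (YuE, ACE-Step, and the rest each have their own CLI or Python entry point). The point is only that pinning seeds makes every take reproducible.

```python
import random

def generate(prompt: str, seed: int) -> str:
    """Hypothetical stand-in for a local model call. A real pipeline
    would invoke the model here and write an audio file to disk."""
    random.seed(seed)                      # fixed seed => identical output
    return f"{prompt}-take{random.randint(0, 9999)}"

prompts = ["tense cyberpunk alley, 100 BPM", "melancholy synth pad"]
# Batch: every (prompt, seed) pair, three seeded takes per prompt.
takes = {(p, s): generate(p, s) for p in prompts for s in range(3)}

# Rerunning with the same seed reproduces the exact same take.
assert generate(prompts[0], 0) == takes[(prompts[0], 0)]
```

Consumer tools expose none of this; with a local model the whole loop is scriptable.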
3.2 MusicGen (Meta, 2023)
The starting point of open-source AI music. Released August 2023 as part of the AudioCraft framework. Text-to-instrumental.
- Parameters. Three sizes — 300M, 1.5B, 3.3B. The 3.3B variant wants 16GB+ VRAM.
- Data. About 20,000 hours of music Meta owns or licensed.
- License. Model weights are CC BY-NC 4.0 — non-commercial use only. This is widely misread. Self-hosting does not grant commercial rights.
- 2026 status. No meaningful update since 2024. Quality is visibly behind Suno and Udio. Cannot do vocals.
Still useful for "learning," "offline experiments," "non-commercial projects," and "as a baseline for comparing other models."
3.3 Stable Audio 2.5 / Stable Audio Open
The two Stability AI lines are easy to confuse.
Stable Audio 2.5. Commercial SaaS. Up to three minutes, complex structure (intro, development, outro). Better response to mood prompts like "uplifting" or "lush synthesizers." Strong for sound effects, ad music, and video tracks.
Stable Audio Open. Open source. The base model maxes at 47 seconds. Stable Audio Open Small (341M parameters, built with Arm) generates 11 seconds of audio in under 8 seconds on a smartphone CPU. Licensed under the Stability AI Community License, free for commercial and non-commercial use.
Stable Audio Open is stronger for sound design — short SFX, loops, textures, foley — than for full songs.
3.4 YuE — Open-Source Full-Song Model
YuE arrived in 2025 as an open-source full-song model. It does what MusicGen does not: turn text plus lyrics into a complete song with vocals.
- Hardware. Recommended 24GB VRAM. Quantized versions run in 8–16GB. On a 4090, 30 seconds takes roughly 360 seconds.
- Optimized forks. DeepBeepMeep's GPU-poor branch generates a 1-minute song in about 4 minutes on a 4090.
- License. Apache 2.0 — commercial use allowed. The cleanest license among open-source music models.
Quality does not match Suno v5, but YuE is the first open-source model to combine "open + commercial + vocals."
3.5 ACE-Step 1.5 — Another Local Contender
ACE-Step 1.5 stands out for supporting Mac, AMD, Intel, and CUDA backends. That it runs on M-series Macs matters a lot. Solid generation quality with vocal support makes it the often-recommended "2026 local starting point."
3.6 Comparison — Open Source / Local
| Model | Vocals | License | Min VRAM | Length | Strength |
|---|---|---|---|---|---|
| MusicGen 3.3B | No | CC BY-NC 4.0 (non-commercial) | 16GB | 30 sec | Learning, baseline |
| Stable Audio Open | No | Stability Community | 8GB | 47 sec | Sound design |
| YuE | Yes | Apache 2.0 | 24GB rec. | 1–5 min | Full songs, commercial |
| ACE-Step 1.5 | Yes | Open source | 12–24GB | Full song | Multi-platform |
| DiffRhythm | Yes | Open source | 16GB | Full song | Fast inference |
4 · Where It Actually Works
4.1 Indie Game Soundtracks
One of the strongest fits. The reason is simple — an indie game typically needs 10 to 30 tracks. Commissioning all of them from a composer costs roughly ten to fifty thousand dollars. Filling the gap from royalty-free libraries means the same music turns up in other games.
AI music slots neatly into that gap.
- Volume. Dozens of tracks per hour, keep what you like.
- Uniqueness. Unlike libraries, your track will not appear in another game.
- Variation control. Adjust the seed and prompt to generate similar tracks for the same mood.
- Loop-friendly. Game BGM loops anyway. You do not need a full four-minute song.
A workflow used by actual indie studios.
1. Write a mood sheet for the game: "neon-lit cyberpunk alley, tense but melancholy, 100 BPM"
2. Generate 10 to 20 tracks in Suno or Udio, shortlist favorites
3. Separate stems on the 1 to 2 chosen tracks
4. Adjust BPM and key in a DAW, build loop points
5. Import into Unity or Unreal as .ogg or .wav
6. Configure interactive layers in an adaptive music system like FMOD or Wwise
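Step 4's loop-point construction is the fiddly part, and the core move is a crossfade of the track's tail into its head. A minimal NumPy sketch of that idea, assuming mono float audio (real stems would come from a tool's stem export, and you would still audition the seam by ear):

```python
import numpy as np

def make_loop(audio, sr, fade_s=0.5):
    """Crossfade a track's tail into its head so it loops seamlessly.
    Constant-gain ramps: fade_in + fade_out == 1 at every sample."""
    n = int(sr * fade_s)
    fade_in = np.sin(np.linspace(0, np.pi / 2, n)) ** 2
    fade_out = 1.0 - fade_in
    head, tail, body = audio[:n], audio[-n:], audio[n:-n]
    blended = head * fade_in + tail * fade_out   # tail melts into the head
    return np.concatenate([blended, body])       # loop point = sample 0

# Fake 4-second BGM stem: a decaying 220 Hz tone at 22.05 kHz.
sr = 22_050
t = np.arange(sr * 4) / sr
track = np.sin(2 * np.pi * 220 * t) * np.exp(-t / 8)

loop = make_loop(track, sr)   # shorter by the fade length, loops cleanly
```

The engine side (FMOD/Wwise) then only needs to restart the file; no audible click at the seam.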
A caution: verify the licensing of AI output against your distribution channel (Steam, consoles). Suno Pro and above, or a clean model like ElevenMusic, is the safe choice.
4.2 Podcast Intros and Outros
A 15 to 30-second signature sound. AI music's main weakness — long-term coherence — barely matters here.
Workflow.
- Prompt mood and genre: "upbeat tech podcast intro, synth-driven, 20 seconds, fade-out"
- Generate 10 to 20, pick one
- Polish around the voiceover
- Use the same track on every episode — it becomes "brand sound"
Cost: Suno Pro runs about $10 a month; next to a commissioned custom intro (typically $300 to $1,000), that is negligible.
4.3 YouTube and Short-Form BGM
This is where Mubert shines. Mubert is not text-to-song — it is mood-based infinite track generation. It can produce 25-minute background tracks and 25 variations quickly. The royalty-free license is unambiguous. Musicians upload their sample packs and receive 80 percent of track sales, so the training-data origin is comparatively clean.
For a YouTuber, the appeal is "no Content ID claims." Vocal-bearing Suno tracks rarely trigger claims either, but Mubert is the most clearly safe option.
4.4 Songwriting Ideation
Professional songwriters and composers are surprisingly aggressive users. Two patterns.
Motif generation. Quickly try "what would this chord progression with this vocal melody sound like." They do not use the output directly — they steal the idea and weave it into their own track.
Guide track. Write lyrics first, then make an AI demo. Listen to the demo to judge "this part works, this part needs to change." Then build the real song. The AI music acts as an MVP.
The core mindset: use AI output as a design tool, not a finished product. Masterpieces will not pop out — the right position for AI music is "idea generator."
4.5 Where It Does Not Work
The same honesty applies to limits.
- Advanced classical composition. Four-voice fugues, sonata-form structures — still weak.
- Replacing live performance. Cannot manufacture stage energy.
- Jazz improvisation. No coherent motivic development.
- Big commercial IP. Major film soundtracks and lead ad tracks remain out of reach — not for quality reasons but for legal safety.
- Distinctive vocal character. Suno Voices cloning a user's own voice is roughly the ceiling.
5 · Quality Reality — Vocals Are the Hardest Part
5.1 Why Vocals Are Hard
The two hardest problems in audio generation are (a) long-term coherence and (b) vocals. Vocals are especially hard, for layered reasons.
Phonemes and pronunciation. The human voice changes phonemes roughly every 50 ms. The model has to map lyric text to a sequence of pronounced audio tokens. English has rich training data and works well; Korean, Japanese, Arabic, and similar languages have far less audio data per phoneme.
Prosody (intonation). Singing "I love you" sadly versus joyfully sounds different. The model must combine lyric meaning with song mood to shape the intonation curve.
Pitch stability. Human singers hold pitch within roughly ±10 cents. AI sometimes wavers ±50 cents. The ear hears it as "off."
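Those cent figures translate directly into frequency. Cents are a logarithmic pitch measure (100 cents = one semitone, 1200 = one octave), so the deviation between two frequencies is computable:

```python
import math

def cents(f: float, f_ref: float) -> float:
    """Pitch error in cents between f and a reference frequency."""
    return 1200 * math.log2(f / f_ref)

def detune(f_ref: float, c: float) -> float:
    """Frequency that sits c cents away from f_ref."""
    return f_ref * 2 ** (c / 1200)

a4 = 440.0
human = detune(a4, 10)   # ~442.5 Hz: a good singer's +10-cent wobble
ai = detune(a4, 50)      # ~452.9 Hz: +50 cents, halfway to the next semitone
```

A 10-cent error is about 2.5 Hz at A4, below most listeners' threshold; a 50-cent error is nearly 13 Hz, which is exactly the "off" the ear catches.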
Intelligibility. Listeners need to hear the lyrics. Vocals are not finished when melody is in place — the words must be audible. Hard consonant clusters (like "strengths") often blur in AI output.
5.2 The Extra Penalty for Non-English Lyrics
Korean has roughly one-tenth to one-twentieth the training data of English. Consequences:
- Final consonants (especially ㄹ and ㅇ) sound awkward.
- English-style vocal phrasing is forced onto Korean (consonants run together instead of being articulated).
- Natural prosody of the lyric is missed.
Mitigations: (a) Suno v5.5 is visibly better than v4 on Korean. (b) Explicit style tags like "korean ballad," "k-pop," or "trot" help. (c) When awkwardness remains, generate with English lyrics and re-record the vocal in Korean during post.
5.3 Instrumentals Are Surprisingly Solid
Conversely, instrumentals are near-human-quality from late 2025 onward. Electronic, synth pop, lo-fi, cinematic scores, ambient — telling them apart from human work is nearly impossible. That is why games, podcasts, and YouTube BGM exploded first.
5.4 Length and Coherence
Past three minutes, the model starts losing track of "where this song is going." Specifically:
- Motif forgetting. A hook introduced at one minute disappears by three.
- Structural drift. Verse-chorus-bridge structure erodes as length grows.
- Quality drift. After four minutes, vocals sometimes turn grainy or the mix shifts.
Workarounds: (a) generate short pieces and stitch in a DAW, (b) use Suno's Extend feature in segments, (c) for anything past five minutes, go instrumental.
6 · Lawsuits and the Copyright Debate — Honestly
6.1 What Is at Issue
The RIAA suits have two core issues.
- Training data use. "Trained on copyrighted recordings without permission." Both defendants invoke "transformative fair use."
- Output similarity. Plaintiffs claim Suno and Udio can reproduce specific training songs nearly verbatim.
The legal question reduces to whether AI training passes the four-factor fair-use test (purpose, nature, amount, market effect).
6.2 Status as of May 2026
Suno. Contesting all claims on fair-use grounds against Universal, Warner, and Sony in the District of Massachusetts. Suno filed for summary judgment in March 2026, with the key hearing scheduled for July 2026. Cited precedent: the Second Circuit's 2024 Bartz v. SoundAI ruling, which treated AI training as transformative use.
Udio. Successive licensing settlements with Universal (October 2025), Warner (November 2025), Kobalt, and Merlin. Sony remains the only major actively litigating. The Universal deal includes a joint AI music platform launching in 2026.
Independent artists. In October 2025, separately from the majors, a class of independent musicians sued both Suno and Udio.
6.3 Three Possible Outcomes
Scenario A — Suno wins (fair use upheld). AI training becomes legitimized. Every AI model uses a similar defense. The music industry shifts to a separate licensing market (e.g., the Universal-Udio joint platform). Users get the most freedom.
Scenario B — Suno loses (licensing required). Suno is forced into licensing settlements or model retraining. Costs rise sharply and subscription prices follow. New entrants cannot start without licensing. "Pre-licensed" models like ElevenMusic gain a structural advantage.
Scenario C — Settlement. The most likely scenario. The Universal-Udio template — majors + licensing + revenue sharing — becomes the industry standard. The entire industry aligns to that shape.
6.4 What Users Should Do
Safe to do: subscribe to Suno or Udio at Pro tier or above (plans that explicitly grant commercial usage rights), and avoid explicitly imitating named major artists.
Safer still: models like ElevenMusic with provable pre-licensed training data, or Apache 2.0 open-source models like YuE or ACE-Step run locally.
Avoid: prompts attempting to clone a specific named artist's voice ("in the style of [famous singer]"), then commercially distributing the output. That is the clearest risk.
7 · Decision Framework — What to Pick
7.1 "Situation → Recommended Tool"
| Situation | First choice | Second choice | Note |
|---|---|---|---|
| Songwriting demos | Suno v5.5 | Udio | Vocal quality first |
| Indie game BGM | Suno Pro | Mubert | Stem separation matters |
| Podcast intro | Suno | ElevenMusic | 30 seconds works anywhere |
| YouTube background | Mubert | Stable Audio 2.5 | Mood-based infinite tracks |
| Ad track (commercial) | ElevenMusic | Stable Audio 2.5 | License cleanliness first |
| Live game BGM | Lyria RealTime | (few alternatives) | Real-time control |
| Local / private experiment | YuE | ACE-Step | Data does not leave the box |
| Sound design (short SFX) | Stable Audio Open | (DAW plugins) | 11 to 47 seconds |
| Learning / research | MusicGen | YuE | Non-commercial OK |
| Korean-lyric songs | Suno v5.5 | Udio | Plan for vocal post-processing |
7.2 Decision Tree
Start
│
├─ Need vocals?
│ ├─ No → Mubert / Stable Audio / MusicGen / Lyria RealTime
│ └─ Yes ↓
│
├─ Commercial use?
│ ├─ No (research / learning) → Anything goes, MusicGen included
│ └─ Yes ↓
│
├─ License cleanliness top priority?
│ ├─ Yes → ElevenMusic or YuE / ACE-Step self-hosted
│ └─ No ↓
│
├─ Non-English lyrics?
│ ├─ Yes → Suno v5.5 first, expect post-processing
│ └─ No ↓
│
├─ What aesthetic?
│ ├─ Pop / electronic polish → Suno
│ ├─ Hip-hop / R&B / producer tone → Udio
│ └─ Enterprise / Vertex AI → Lyria 3
7.3 By Budget
| Budget | Recommendation |
|---|---|
| $0 / month | MusicGen + 4090 or cloud GPU. Suno free tier (5 songs / day). |
| $10 / month | Suno Pro alone. Enough for most content creators. |
| $30 / month | Suno Pro + Udio Standard + Mubert. Rich aesthetic choices. |
| $100+ / month | Suno Premier + ElevenMusic + Stable Audio 2.5. Commercial production. |
| $1,000+ | Own 4090 box + YuE self-hosted + subscriptions. Studios, game teams. |
Epilogue — Checklist, Anti-Patterns, What's Next
AI music has gone from 2023's "neat" to 2026's "I'll release this." The pivot is that vocals now sound like vocals, lengths reach actual song duration, and aesthetic differences have settled into genre. At the same time — Korean vocals, coherence past four minutes, and airtight commercial licensing remain unsolved. The Suno summary judgment hearing in July 2026 will likely decide the category's next year.
Tool Selection Checklist
- Do you need vocals? — If not, Mubert or Stable Audio is a much safer pick.
- Are you using it commercially? — Pro tier or higher, explicit license, permanent-rights confirmation.
- Is the language English? — If not, budget for post-processing and vocal re-recording.
- How long is the piece? — Past three minutes, use Extend or stitching, or stay instrumental.
- What genre aesthetic? — Suno (pop), Udio (hip-hop / R&B), Lyria (enterprise).
- Need stem separation? — Suno Studio is one of the few that really delivers.
- Online dependency a burden? — Consider YuE or ACE-Step locally.
- Workflow repetitive? — Use the Mubert API, Suno API, or Lyria RealTime API.
- Copyright safety top priority? — ElevenMusic, or models that document training data.
- Are you ready to treat AI output as a draft, not a final? — The most important question.
Anti-Patterns
| Anti-pattern | Why it's bad | Instead |
|---|---|---|
| Shipping the first generation | Average quality is low | Generate 10 to 20, curate |
| Naming famous artists in prompts | License gray zone, Content ID risk | Abstract descriptions like "late-80s synth-pop" |
| Judging Korean songs by English assumptions | Awkward pronunciation slips through | At least one native-speaker review |
| Releasing commercially on a free tier | License violation | Subscribe at Pro or above |
| Generating a 4-minute song in one shot | Late-track coherence falls apart | Generate short, stitch, or use Extend |
| Using MusicGen output in a commercial ad | CC BY-NC 4.0 violation | YuE / ACE-Step or consumer tools |
| Skipping vocal intelligibility checks | Releasing songs no one can parse | Three external listeners read the lyrics back |
| Treating Lyria 3 like a free tool | Vertex AI pricing not understood | Cost-calculate per minute |
| Crediting AI output as "I composed this" | Disclosure and copyright risk | Mark as "AI-assisted composition" |
| Relying on one model only | Model limits become work limits | Pair 2 to 3 models by aesthetic |
What's Next
The next post is "AI Video Generation 2026 — Sora, Veo, Runway, Pika, Kling — and How They Actually Differ." Same pattern as this one: the category's explosion (2024 Sora demo) and maturation (commercial tools in 2026), the hardest part analogous to vocals (long-term coherence, character identity, fingers), open-source options (Open-Sora, Mochi, Wan), real use cases (ads, short video, concept visuals), and the copyright debate (NYT-OpenAI, Disney's licensing model) at the same depth.
References
- Suno v5.5 announcement
- Suno official site
- Suno v5.5 — Music Business Worldwide
- Udio official site
- Udio Wikipedia
- Udio company profile — Sacra
- Music Ally — Udio launch
- Universal Music and Udio settlement — Billboard
- Udio-Kobalt licensing deal — MBW
- RIAA press release — Suno and Udio lawsuits
- RIAA Suno complaint PDF
- RIAA Udio complaint PDF
- Music Industry AI Lawsuits Tracker — Chartlex
- AI Music Lawsuits Settlements Timeline — Dynamoi
- Lyria 3 — Google DeepMind
- Lyria RealTime — Google DeepMind
- Lyria 2 announcement — DeepMind Blog
- Google acquires ProducerAI/Riffusion — Awesome Agents
- ElevenLabs Music official
- ElevenLabs music app — TechCrunch
- ElevenLabs commercial-licensed music — TechCrunch 2025
- ElevenMusic launch — Billboard
- Meta AudioCraft official
- MusicGen on Hugging Face
- Meta AudioCraft announcement blog
- Stable Audio 2.5 — Stability AI
- Stable Audio Open announcement
- Stable Audio Open 1.0 — Hugging Face
- Stable Audio Open Small + Arm
- YuE GitHub
- YuEGP GPU-poor fork
- ACE-Step 1.5 GitHub
- Riffusion-hobby GitHub
- Riffusion on Hugging Face
- Mubert official site
- Mubert API
- Spheron — open-source music models on GPU cloud
- 10 Best AI Music Generators 2026 — fal.ai
- Billboard — biggest AI music stories of 2025
- AI Music Copyright Legal Risks 2026 — Silverman Sound