May 12, 2026 · 3 min read
Turn Any MP3 Into a Lyric Video in 3 Minutes
Upload an audio URL. Whisper transcribes it. Flux paints a backdrop. Captions sync to the beat. That's it.
Lyric videos used to require a designer with motion graphics chops. Now it's a 3-minute job. Here's what actually happens between "paste an MP3 URL" and "mp4 in your downloads folder."
The pipeline, step by step
1. Audio fetch (~3s). You paste a direct mp3 URL (yours, with rights). Reviral pulls the file, checks it's under 25MB (Whisper's upload cap), and stores a copy in our private bucket so the renderer doesn't depend on your CDN staying up.
2. Lyric transcription (~30s). OpenAI Whisper transcribes the audio with word-level timestamps. Each word gets a start + end time. If the track is instrumental Whisper returns nothing — we warn you and render with empty captions.
3. Backdrop generation (~5s). Fal's flux/schnell model paints a single still image for the backdrop. We auto-derive a vibe prompt from the first lyric line, or you can override ("moonlit beach at midnight").
4. Composition (~2 min). AWS Lambda + Remotion combines: your audio on the audio track, the flux image as a static backdrop with a slow ken-burns push, and word-level karaoke captions synced to the transcript timestamps. Your captions highlight in time with the lyrics.
Why this works for short-form
Most lyric videos that go viral on TikTok / Reels are exactly this format: one striking backdrop, big karaoke-style captions that hit on the beat. The actual visual variation is doing very little work — the captions and the audio are the show. AI gen does well at this because it doesn't have to maintain character consistency across cuts.
Important: only your own music
We don't check copyright. You're responsible for having the rights to whatever audio you upload. Lyric videos posted to YouTube without rights get auto-muted or claimed. If you're an artist promoting your own track, this works perfectly. If you're trying to lyric-video someone else's song, that's on you.
Try it
Sign up for Reviral (100 free credits), pick the "Lyric video" workflow on /app/create, paste your MP3 URL + a one-line vibe prompt for the backdrop. Render time is under 3 minutes for tracks 30-60 seconds long.