MuseTalk

Built-in lip-sync on your FlyMyAI account. Give it a reference avatar video and a speech audio track, and it renders a talking-head MP4 where the avatar's mouth follows the audio. This is a non-realtime batch step - it submits the job, waits for it to finish, and saves the result as an agent_file with a public_url you can feed into the next tool.

What it can do

| Method | What it does |
| --- | --- |
| `lip_sync` | Lip-sync a reference avatar video (`video_url`) to speech audio (`audio_url`) and return a talking-head MP4 `agent_file` plus an `output_url` on object storage. Optional `bbox_shift` (mouth-region tuning, leave 0) and `fps` override. |

Both video_url and audio_url must be public HTTPS URLs reachable by the service. The audio is typically the agent_file.public_url from a prior elevenlabs.text_to_speech call.
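The URL requirement above can be checked before a job is submitted. The following is a minimal sketch, not part of FlyMyAI: the helper names `is_public_https_url` and `build_lip_sync_args` are hypothetical, and passing the parameters as a plain dictionary is an assumption. Only the parameter names (`video_url`, `audio_url`, `bbox_shift`, `fps`) come from the table above. The check covers URL form only (scheme and host); it cannot confirm that the service can actually reach the link.

```python
from urllib.parse import urlparse


def is_public_https_url(url: str) -> bool:
    """Return True if url is an absolute HTTPS URL with a host part."""
    parts = urlparse(url)
    return parts.scheme == "https" and bool(parts.netloc)


def build_lip_sync_args(video_url, audio_url, bbox_shift=0, fps=None):
    """Hypothetical helper: assemble lip_sync arguments after a basic URL check."""
    for name, value in (("video_url", video_url), ("audio_url", audio_url)):
        if not is_public_https_url(value):
            raise ValueError(f"{name} must be a public HTTPS URL, got {value!r}")
    # bbox_shift stays at 0 unless the mouth region needs tuning.
    args = {"video_url": video_url, "audio_url": audio_url, "bbox_shift": bbox_shift}
    # fps is an optional override, so it is only included when set.
    if fps is not None:
        args["fps"] = fps
    return args
```

A local path or an `http://` link fails this check early, which is cheaper than waiting for the "Input URL not fetched" error from a submitted job.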

Typical pipeline

Lip-sync is usually the last step in an avatar pipeline:

1. elevenlabs.text_to_speech - turn your script into speech audio, then take result.agent_file.public_url.
2. musetalk.lip_sync - pass that audio as audio_url and your avatar base clip as video_url.

Call lip_sync on its own when you already have both URLs and only need the lip-sync.
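The two-step pipeline above can be sketched as a small chaining function. This is an illustration under stated assumptions, not the FlyMyAI API: `run_avatar_pipeline` is a hypothetical name, the two tools are passed in as plain callables standing in for `elevenlabs.text_to_speech` and `musetalk.lip_sync`, and the result is assumed to be dictionary-shaped with an `agent_file` entry holding `public_url`.

```python
from typing import Any, Callable, Mapping


def run_avatar_pipeline(
    script: str,
    avatar_video_url: str,
    text_to_speech: Callable[[str], Mapping[str, Any]],
    lip_sync: Callable[..., Mapping[str, Any]],
) -> Mapping[str, Any]:
    """Chain text-to-speech into lip-sync.

    Both callables are stand-ins for the real tools; how they are
    actually invoked depends on your agent setup.
    """
    tts_result = text_to_speech(script)
    # Assumed result shape: {"agent_file": {"public_url": "https://..."}}
    audio_url = tts_result["agent_file"]["public_url"]
    # The speech audio becomes audio_url; the avatar base clip is video_url.
    return lip_sync(video_url=avatar_video_url, audio_url=audio_url)
```

Passing the tools in as arguments keeps the sketch independent of any particular client library and makes the chaining easy to exercise with stubs.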

How to get credentials

None - MuseTalk is a built-in tool. It runs on FlyMyAI infrastructure using your account. Just enable it.

Fields to fill in FlyMyAI

None.

Troubleshooting

Input URL not fetched - both video_url and audio_url must be direct public HTTPS links reachable by the service. Pre-upload via download_link or another agent tool if the source requires auth.
Job timed out - long videos take longer to render. Keep the avatar clip and audio short, or split the audio into segments.
Wrong tool for realtime - MuseTalk is a batch render. For a live realtime avatar endpoint, use flymyai_deploy instead.
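
For the "Job timed out" case, splitting the audio into segments can be done locally before upload. The sketch below assumes the speech audio is an uncompressed WAV file and uses only Python's standard `wave` module; `split_wav` is a hypothetical helper, not a FlyMyAI tool. Audio in other formats (for example MP3) would need converting first.

```python
import wave


def split_wav(path: str, segment_seconds: float, out_prefix: str) -> list[str]:
    """Split a WAV file into consecutive segments of at most segment_seconds."""
    if segment_seconds <= 0:
        raise ValueError("segment_seconds must be positive")
    out_paths = []
    with wave.open(path, "rb") as src:
        params = src.getparams()
        frames_per_segment = int(src.getframerate() * segment_seconds)
        if frames_per_segment < 1:
            raise ValueError("segment_seconds is shorter than one audio frame")
        index = 0
        while True:
            frames = src.readframes(frames_per_segment)
            if not frames:
                break  # end of the source audio
            out_path = f"{out_prefix}_{index:03d}.wav"
            with wave.open(out_path, "wb") as dst:
                # Copy channels, sample width and rate; the frame count
                # in the header is corrected when the file is closed.
                dst.setparams(params)
                dst.writeframes(frames)
            out_paths.append(out_path)
            index += 1
    return out_paths
```

Each segment is a local file, so it still has to be uploaded to a public HTTPS location before it can be passed as audio_url. This split cuts at fixed time offsets and may land mid-word; cutting at pauses in the speech would need extra analysis that is not shown here.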

Built with care by FlyMy.AI.
