FFmpeg Remove Audio: Mute, Strip, or Replace Audio Tracks

April 9, 2026 · RenderIO

What FFmpeg remove audio actually does

When someone says they want to remove audio with FFmpeg, they usually mean one of three things:

  1. Mute the video: strip all audio and keep only the video stream

  2. Extract audio before deleting it: save the audio as a separate file first, then remove it from the video

  3. Replace audio: swap the existing soundtrack for something better

Each has a different command. This guide covers all three, plus some edge cases that trip people up (multiple audio streams, duration mismatches, batch processing via API).

Mute a video with -an

The simplest case. You want a video file with no audio track at all:

ffmpeg -i input.mp4 -an output.mp4

-an drops all audio streams. Without a codec option, FFmpeg re-encodes the video, which wastes time and loses a little quality. Add -c:v copy to stream-copy the video without re-encoding:

ffmpeg -i input.mp4 -an -c:v copy output.mp4

With stream copy, even a 2GB file typically finishes in seconds, limited mostly by disk speed. FFmpeg reads the container, discards the audio packets, and writes the video packets straight through. No quality loss.

What -an actually does at the container level: it removes the audio stream mapping for the output file. The input file is unchanged. The original audio still exists in the source. If you need it back later, go back to the original.
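To confirm the mute worked, you can count the audio streams left in the output. This is a minimal sketch: count_audio_streams is a hypothetical helper name, and it assumes ffprobe is on your PATH.

```shell
# Print how many audio streams a file contains (0 means fully muted).
# count_audio_streams is an illustrative helper, not part of FFmpeg.
count_audio_streams() {
  ffprobe -v error -select_streams a \
    -show_entries stream=index -of csv=p=0 "$1" | wc -l | tr -d '[:space:]'
}

# Example usage (requires ffmpeg and ffprobe):
#   ffmpeg -i input.mp4 -an -c:v copy output.mp4
#   [ "$(count_audio_streams output.mp4)" -eq 0 ] && echo "no audio left"
```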

Muting a time range (not the whole video)

Sometimes you only want to silence part of a video, maybe a stretch where a microphone picked up something it shouldn't have. Keep the audio track present but zero its volume:

ffmpeg -i input.mp4 -af "volume=enable='between(t,10,30)':volume=0" -c:v copy output.mp4

This silences the audio between the 10-second and 30-second marks. The audio is re-encoded because it passes through a filter, while -c:v copy leaves the video untouched. The audio track still exists in the output; it's just silent during that window. Some editors treat "no audio track" differently from "silent audio track," so this approach keeps the track structure intact.
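If you need to silence several windows in one pass, the enable expression can sum multiple between() terms, since any non-zero value turns the filter on. The sketch below builds that expression; mute_expr is a hypothetical helper name and the ranges are illustrative.

```shell
# Build an enable expression that is non-zero inside any of the given
# "start,end" ranges (in seconds). mute_expr is an illustrative helper.
mute_expr() (
  expr=""
  for range in "$@"; do
    # Join terms with "+" so that any matching range enables the filter.
    expr="${expr:+$expr+}between(t,$range)"
  done
  printf '%s\n' "$expr"
)

# Example usage (requires ffmpeg):
#   ffmpeg -i input.mp4 \
#     -af "volume=0:enable='$(mute_expr 10,30 45,50)'" \
#     -c:v copy output.mp4
```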

Strip a specific audio stream

Video files sometimes carry multiple audio streams. Broadcast content ships with separate language tracks. Cameras can record dual-channel audio on separate tracks. GoPro cameras notoriously write two audio streams. If you want to remove one track but keep the others, -an is too blunt.

First, see what you're working with:

ffprobe -v quiet -print_format json -show_streams input.mp4 | grep -E "codec_type|index"

This tells you how many streams exist and what type each one is. Note that the index ffprobe prints counts every stream in the file, while the a:N specifier used below counts audio streams only, starting at 0. Then target the stream you want to remove:

# Remove the second audio stream (0:a:1), keep everything else
ffmpeg -i input.mp4 -map 0 -map -0:a:1 -c copy output.mp4

-map 0 adds every stream from the first input to the output mapping, then -map -0:a:1 removes the second audio stream (audio index 1). The minus sign means "exclude this from the mapping." -c copy stream-copies everything, so it runs fast.

If you want to keep only the first audio track and drop any others:

ffmpeg -i input.mp4 -map 0:v -map 0:a:0 -c copy output.mp4

This is more explicit: take the video from input 0, take only its first audio stream (0:a:0), and copy both without re-encoding. Subtitle and data streams are not mapped, so they are dropped as well.
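Because -map counts audio streams separately from the file-wide stream index, it can help to print both side by side before choosing what to drop. A minimal sketch, assuming ffprobe is installed; list_audio_streams is a hypothetical helper name.

```shell
# List each audio stream with the specifier -map expects (0:a:N),
# its file-wide index, and its codec. Illustrative helper only.
list_audio_streams() {
  ffprobe -v error -select_streams a \
    -show_entries stream=index,codec_name -of csv=p=0 "$1" |
    awk -F, '{ printf "0:a:%d  absolute index %s  codec %s\n", NR-1, $1, $2 }'
}

# Example usage (requires ffprobe):
#   list_audio_streams input.mp4
```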

Extract audio before you remove it

If there's any chance you'll want the original audio later, pull it out first. FFmpeg never modifies the input file, so the audio survives as long as you keep the source, but a separate backup makes preservation explicit.

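Extract the audio without re-encoding (this example assumes the track is AAC, the usual case for MP4):

ffmpeg -i input.mp4 -vn -c:a copy audio_backup.aac

-vn drops video streams. -c:a copy copies audio without re-encoding, so the encoded audio is identical to what was in the container; only the wrapper changes (raw .aac files use ADTS framing). If ffprobe reports a different codec, pick a matching extension, or use .mka, which accepts most codecs.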

Extract to MP3 if you need something more portable:

ffmpeg -i input.mp4 -vn -c:a libmp3lame -q:a 2 audio_backup.mp3

-q:a 2 is a VBR quality setting that averages roughly 190kbps. For speech, this is overkill; for music, it's close to the best MP3 can offer (-q:a 0 is the top VBR setting).

After extracting, you can run the mute command above knowing the audio is safely stored elsewhere. The extract audio from video tool does this without needing FFmpeg locally if that's easier.
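A stream-copy backup only succeeds when the output container accepts the source codec, so one option is to choose the extension from what ffprobe reports. The sketch below is one way to do that. The helper names and the codec-to-extension table are assumptions for illustration, and the input path is assumed to have a file extension.

```shell
# Map an audio codec name (as printed by ffprobe) to a file extension
# whose container accepts that codec with stream copy.
audio_ext_for_codec() {
  case "$1" in
    aac)    echo m4a ;;
    mp3)    echo mp3 ;;
    opus)   echo opus ;;
    vorbis) echo ogg ;;
    flac)   echo flac ;;
    ac3)    echo ac3 ;;
    *)      echo mka ;;  # Matroska audio accepts most codecs
  esac
}

# Copy the first audio stream of FILE into FILE-stem_audio.<ext>.
backup_audio() (
  codec=$(ffprobe -v error -select_streams a:0 \
    -show_entries stream=codec_name \
    -of default=noprint_wrappers=1:nokey=1 "$1")
  ffmpeg -i "$1" -map 0:a:0 -c:a copy \
    "${1%.*}_audio.$(audio_ext_for_codec "$codec")"
)

# Example usage (requires ffmpeg and ffprobe):
#   backup_audio input.mp4
```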

Replace audio with a new track

This is the most common real-world use case. Screen recording caught your laptop fan noise. Stock footage came with generic music you can't use. You want to swap in something specific.

ffmpeg -i input.mp4 -i new_audio.mp3 -c:v copy -map 0:v -map 1:a -shortest output.mp4

Breaking this down:

  • -i input.mp4: original video

  • -i new_audio.mp3: the replacement audio

  • -c:v copy: stream-copy the video, no re-encoding

  • -map 0:v: take video from the first input

  • -map 1:a: take audio from the second input

  • -shortest: stop when the shorter stream ends

The -shortest flag matters more than it looks. If your replacement audio is 5 minutes but your video is 4 minutes, then without that flag the output runs the full 5 minutes: the video stream ends at 4 minutes and the audio keeps playing, which most players show as a frozen last frame or a black screen. Add it and the output stops when the shorter stream ends.

When audio is slightly shorter than video

If your replacement audio ends a few seconds before the video, you might want the last few seconds to be silent rather than cutting the video short. Pad with silence:

ffmpeg -i input.mp4 -i new_audio.mp3 \
  -filter_complex "[1:a]apad=pad_dur=5[a_padded]" \
  -c:v copy \
  -map 0:v \
  -map "[a_padded]" \
  output.mp4

apad=pad_dur=5 adds 5 seconds of silence to the end of the audio stream. Adjust the value to match how much padding you need.
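Rather than guessing the padding, you can derive it from the two file durations. The following is a sketch under the assumption that ffprobe reports a usable container duration for both files; media_duration and apad_filter are hypothetical helper names.

```shell
# Print a file's container duration in seconds.
media_duration() {
  ffprobe -v error -show_entries format=duration \
    -of default=noprint_wrappers=1:nokey=1 "$1"
}

# Print an apad filter that extends AUDIO to VIDEO's length.
# Prints nothing when the audio is already at least as long.
apad_filter() {
  awk -v v="$(media_duration "$1")" -v a="$(media_duration "$2")" \
    'BEGIN { gap = v - a; if (gap > 0) printf "apad=pad_dur=%.3f\n", gap }'
}

# Example usage (requires ffmpeg and ffprobe):
#   ffmpeg -i input.mp4 -i new_audio.mp3 \
#     -filter_complex "[1:a]$(apad_filter input.mp4 new_audio.mp3)[a_padded]" \
#     -c:v copy -map 0:v -map "[a_padded]" output.mp4
```

The usage line only makes sense when apad_filter prints a filter, so check for empty output first if the audio might be longer than the video.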

Fade audio out at the end

Hard cuts in audio sound jarring. Fade the replacement audio out before it stops:

ffmpeg -i input.mp4 -i background_music.mp3 \
  -c:v copy \
  -map 0:v \
  -map 1:a \
  -af "afade=t=out:st=55:d=5" \
  -shortest \
  output.mp4

afade=t=out:st=55:d=5 fades out starting at the 55-second mark over 5 seconds, which suits a 60-second video. Adjust st and d for your video length: st should equal the video duration minus d. If your video is 3 minutes, st=175:d=5 gives you a 5-second fade starting 5 seconds before the end.
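The fade start is just the video duration minus the fade length, so it can be computed instead of hard-coded. A minimal sketch, assuming ffprobe can read the video's duration and the music is at least as long as the video; the helper names are hypothetical.

```shell
# Print a file's container duration in seconds.
media_duration() {
  ffprobe -v error -show_entries format=duration \
    -of default=noprint_wrappers=1:nokey=1 "$1"
}

# Print an afade filter that ends exactly at the end of FILE.
# Second argument is the fade length in seconds (default 5).
fade_out_filter() {
  awk -v dur="$(media_duration "$1")" -v len="${2:-5}" \
    'BEGIN { st = dur - len; if (st < 0) st = 0;
             printf "afade=t=out:st=%.3f:d=%s\n", st, len }'
}

# Example usage (requires ffmpeg and ffprobe):
#   ffmpeg -i input.mp4 -i background_music.mp3 -c:v copy \
#     -map 0:v -map 1:a -af "$(fade_out_filter input.mp4 5)" \
#     -shortest output.mp4
```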

Mix original audio with new audio

Sometimes you want to layer rather than replace. Keep the original dialogue but bring in background music underneath:

ffmpeg -i input.mp4 -i background.mp3 \
  -filter_complex "[0:a]volume=1.0[a_orig];[1:a]volume=0.25[a_bg];[a_orig][a_bg]amix=inputs=2:duration=first" \
  -c:v copy \
  -map 0:v \
  output.mp4

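volume=0.25 drops the background music to 25% before mixing. The amix filter handles blending, and because its output is unlabeled, FFmpeg adds the mixed audio to the output file automatically. duration=first means the output runs for the length of the first input (your original video), cutting off the music if it runs longer. By default amix also scales its inputs down to avoid clipping, so the mix can come out quieter than the source; on FFmpeg 4.4 and later, adding normalize=0 keeps the levels you set.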

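The volume values here are the main thing to experiment with. Speech intelligibility tends to drop once the music's volume factor goes above roughly 0.15-0.20 for talking-head content, so treat the 0.25 used above as an upper bound and lower it if dialogue gets hard to follow. For montage video with no dialogue, you can bring it up.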
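Linear volume factors are easier to reason about when converted to decibels, using dB = 20 * log10(factor). The volume filter also accepts values written with a dB suffix. The sketch below does the conversion with awk; factor_to_db is a hypothetical helper name.

```shell
# Convert a linear volume factor to decibels: dB = 20 * log10(factor).
# awk's log() is the natural logarithm, hence the division by log(10).
factor_to_db() {
  awk -v f="$1" 'BEGIN { printf "%.2f\n", 20 * log(f) / log(10) }'
}

# Example usage:
#   factor_to_db 0.25   # prints -12.04
#   factor_to_db 0.5    # prints -6.02
```

So volume=0.25 is roughly the same as volume=-12dB, and halving the factor lowers the level by about 6 dB.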

FFmpeg remove audio via API

If you're processing video in an application rather than on a local machine, running FFmpeg yourself means owning the infrastructure. An API handles the queue, the compute, the storage. The RenderIO API takes any FFmpeg command and runs it in the cloud.

Mute video via API

curl -X POST https://renderio.dev/api/v1/run-ffmpeg-command \
  -H "Content-Type: application/json" \
  -H "X-API-KEY: your_api_key" \
  -d '{
    "ffmpeg_command": "-i {{in_video}} -an -c:v copy {{out_video}}",
    "input_files": {"in_video": "https://your-storage.com/input.mp4"},
    "output_files": {"out_video": "muted.mp4"}
  }'

Replace audio via API

curl -X POST https://renderio.dev/api/v1/run-ffmpeg-command \
  -H "Content-Type: application/json" \
  -H "X-API-KEY: your_api_key" \
  -d '{
    "ffmpeg_command": "-i {{in_video}} -i {{in_audio}} -c:v copy -map 0:v -map 1:a -shortest {{out_video}}",
    "input_files": {
      "in_video": "https://your-storage.com/video.mp4",
      "in_audio": "https://your-storage.com/music.mp3"
    },
    "output_files": {"out_video": "with_new_audio.mp4"}
  }'

The {{double_brace}} syntax tells RenderIO where to substitute the file references. Each job gets an isolated container. Once submitted, you poll for completion using the command_id from the response.

The curl examples guide has copy-paste requests for audio extraction, muting, and 18 other operations, which is faster than composing commands from scratch.

Processing multiple videos in parallel

The API shines when you're running the same operation across a batch of files. You can submit all jobs immediately and let them run in parallel:

const videoUrls = [
  'https://your-storage.com/video1.mp4',
  'https://your-storage.com/video2.mp4',
  'https://your-storage.com/video3.mp4',
]

const jobs = await Promise.all(videoUrls.map(async url => {
  const res = await fetch('https://renderio.dev/api/v1/run-ffmpeg-command', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'X-API-KEY': process.env.RENDERIO_API_KEY
    },
    body: JSON.stringify({
      ffmpeg_command: '-i {{in_video}} -an -c:v copy {{out_video}}',
      input_files: { in_video: url },
      output_files: { out_video: 'muted.mp4' }
    })
  })
  // Surface HTTP failures instead of parsing an error response as a job
  if (!res.ok) throw new Error(`Submit failed for ${url}: HTTP ${res.status}`)
  return res.json()
}))

// jobs is an array of {command_id, status}; poll each for completion

Each job runs in its own container. Your server doesn't block. Because the requests go out concurrently, submitting a batch of 50 videos takes about as long as a single API round trip, and the jobs then process in parallel.
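For shell-based pipelines, the same batch submission can be sketched with curl and background jobs. The endpoint, header, and body fields below are copied from the requests above. The helper names, the urls.txt file, and the assumption that URLs contain no double quotes or backslashes are illustrative.

```shell
# Build the JSON body for a mute job.
# Assumes neither argument contains double quotes or backslashes.
mute_job_body() {
  printf '{"ffmpeg_command":"-i {{in_video}} -an -c:v copy {{out_video}}","input_files":{"in_video":"%s"},"output_files":{"out_video":"%s"}}\n' "$1" "$2"
}

# Submit one mute job; prints the API's JSON response.
submit_mute_job() {
  curl -sS -X POST https://renderio.dev/api/v1/run-ffmpeg-command \
    -H "Content-Type: application/json" \
    -H "X-API-KEY: $RENDERIO_API_KEY" \
    -d "$(mute_job_body "$1" "$2")"
}

# Example usage: submit every URL in urls.txt (one per line) in parallel.
#   while IFS= read -r url; do
#     submit_mute_job "$url" muted.mp4 &
#   done < urls.txt
#   wait
```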

Common errors and fixes

Trailing option(s) found in the command or similar errors when using -an with stream copy

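FFmpeg prints this when options come after the output filename. Each option applies to the next file named on the command line, so anything after the last output has nothing to attach to and is ignored. Input options go before the -i they apply to; output options such as -an and -c:v copy go after the inputs and before the output file. This is wrong:

ffmpeg -i input.mp4 output.mp4 -an -c:v copy  # wrong: options after the output file

This is right:

ffmpeg -i input.mp4 -an -c:v copy output.mp4  # correct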

Output file has no audio but the video plays weirdly

If you used stream copy and the output behaves oddly in some players, the cause is usually something inherited from the source, such as irregular timestamps, because -c:v copy passes the video packets through untouched. Re-encode the video: drop -c:v copy and let FFmpeg transcode with default settings, which writes a fresh stream with regular timestamps.

Audio and video are out of sync after replacing the audio track

This is the most common issue with audio replacement. Check whether your video has a start-time offset: some files start at a non-zero timestamp. Running ffprobe -v quiet -show_streams input.mp4 will show the start_time for each stream. If video start_time is non-zero, you may need -itsoffset to align the audio:

ffmpeg -i input.mp4 -itsoffset 0.5 -i new_audio.mp3 -c:v copy -map 0:v -map 1:a -shortest output.mp4

Adjust 0.5 to match the offset you measured. -itsoffset applies to the input that follows it, so here it shifts the new audio: a positive value delays it, a negative value makes it start earlier.
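As a starting point for the offset, you can measure how far the first video stream starts after the container's own start time. This is a sketch, not a guaranteed fix: video_start_offset is a hypothetical helper, and it assumes ffprobe reports numeric start_time values for the file.

```shell
# Print how many seconds the first video stream starts after the
# container's start time. Illustrative helper only.
video_start_offset() (
  v=$(ffprobe -v error -select_streams v:0 \
    -show_entries stream=start_time \
    -of default=noprint_wrappers=1:nokey=1 "$1")
  f=$(ffprobe -v error -show_entries format=start_time \
    -of default=noprint_wrappers=1:nokey=1 "$1")
  awk -v v="$v" -v f="$f" 'BEGIN { printf "%.3f\n", v - f }'
)

# Example usage (requires ffmpeg and ffprobe):
#   ffmpeg -i input.mp4 -itsoffset "$(video_start_offset input.mp4)" \
#     -i new_audio.mp3 -c:v copy -map 0:v -map 1:a -shortest output.mp4
```

If the result still drifts, fine-tune the value by ear in small steps.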

Quick reference

Goal | Command
Mute video (remove all audio) | ffmpeg -i input.mp4 -an -c:v copy output.mp4
Remove specific audio stream | ffmpeg -i input.mp4 -map 0 -map -0:a:1 -c copy output.mp4
Silence a time range | ffmpeg -i input.mp4 -af "volume=enable='between(t,10,30)':volume=0" -c:v copy output.mp4
Extract audio before removing | ffmpeg -i input.mp4 -vn -c:a copy backup.aac
Replace audio track | ffmpeg -i video.mp4 -i audio.mp3 -c:v copy -map 0:v -map 1:a -shortest output.mp4
Mix original + new audio | ffmpeg -i video.mp4 -i bg.mp3 -filter_complex "[0:a][1:a]amix=inputs=2:duration=first" -c:v copy output.mp4

For the full set of FFmpeg audio and video manipulation commands, the FFmpeg cheat sheet organizes 50 of the most common operations by category. The complete API guide covers the RenderIO request/response format, authentication, polling, and webhooks in detail.