FFmpeg Concat Guide: Demuxer, Filter, Protocol and API

Pick the right method before you run anything

FFmpeg has three separate mechanisms for concatenating videos, and choosing the wrong one wastes time. Here's the decision table:

Situation	Method	Re-encodes?
Same codec, resolution, frame rate	Concat demuxer	No
Different codecs or resolutions	Concat filter	Yes
Existing `.ts` / HLS segments	Concat protocol	No
Variable frame rate clips (phone video)	Concat filter	Yes
Audio files only	Concat demuxer or filter	Depends

If your clips came from the same source (same camera, same export preset), start with the demuxer. It's fast because it copies bytes instead of re-encoding. If anything doesn't match, switch to the filter.
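The decision table can also be written down as a small function, which is handy if a script has to pick the method for you. This is only a sketch: the `Clip` fields and the `pick_method` name are made up for illustration (they are not part of FFmpeg), and you would fill the fields from ffprobe output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clip:
    """Hypothetical summary of one clip; fill these fields from ffprobe output."""
    container: str            # e.g. "mp4" or "ts"
    video_codec: str          # e.g. "h264"
    width: int
    height: int
    fps: float
    variable_frame_rate: bool = False


def pick_method(clips):
    """Return 'protocol', 'demuxer', or 'filter', following the decision table."""
    if not clips:
        raise ValueError("need at least one clip")
    first = clips[0]
    same_params = all(
        (c.video_codec, c.width, c.height, c.fps)
        == (first.video_codec, first.width, first.height, first.fps)
        for c in clips
    )
    if not same_params or any(c.variable_frame_rate for c in clips):
        return "filter"       # mismatched or VFR clips need a re-encode
    if all(c.container == "ts" for c in clips):
        return "protocol"     # matching MPEG-TS segments can be joined byte-wise
    return "demuxer"          # matching clips: stream copy, no re-encode
```

Two clips that differ only in resolution come back as "filter"; two identical MP4 clips come back as "demuxer".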

The FFmpeg merge videos guide covers the full workflow for combining clips. This guide goes deeper on the concat mechanisms themselves and how to diagnose when things break.

The concat demuxer

The demuxer reads a list of files and joins them without touching the encoded data. It's the fastest option when it applies.

# files.txt
file 'clip1.mp4'
file 'clip2.mp4'
file 'clip3.mp4'

ffmpeg -f concat -safe 0 -i files.txt -c copy output.mp4

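-f concat tells FFmpeg to use the concat demuxer. -safe 0 turns off the demuxer's safe-path check, which otherwise rejects absolute paths and file names containing unusual characters. -c copy skips re-encoding. Relative paths in the list are resolved against the directory that holds files.txt, not the directory you run ffmpeg from.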

Generate the file list with a script instead of typing it:

printf "file '%s'\n" /path/to/clips/*.mp4 > files.txt
ffmpeg -f concat -safe 0 -i files.txt -c copy output.mp4
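The printf one-liner breaks as soon as a file name contains a single quote. A Python version can escape those names before writing the list. This is a minimal sketch, assuming the demuxer's usual quoting rule (a literal ' inside a quoted path is written as '\''); the function names are hypothetical.

```python
from pathlib import Path


def concat_list_line(path):
    """Format one `file` directive, escaping single quotes for the demuxer."""
    # Inside single quotes, a literal ' is written as '\''
    # (close the quote, add an escaped quote, reopen the quote).
    escaped = str(path).replace("'", "'\\''")
    return f"file '{escaped}'"


def write_concat_list(clips, list_path):
    """Write one `file` line per clip, using absolute paths, and return the lines."""
    # Absolute paths keep the list valid no matter where the list file lives.
    lines = [concat_list_line(Path(clip).resolve()) for clip in clips]
    Path(list_path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return lines
```

Because the list holds absolute paths, keep -safe 0 on the ffmpeg command that reads it.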

Common demuxer errors:

DTS ... out of order: Your clips have inconsistent timestamps. Fix by regenerating them. -fflags +genpts is an input option, so it has to come before -i:

ffmpeg -fflags +genpts -f concat -safe 0 -i files.txt -c copy output.mp4

Not strictly monotonically increasing: Same root cause. Same fix.

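Invalid data found when processing input: FFmpeg could not parse something it was given. With the demuxer this often points to a malformed files.txt (a missing file keyword, unbalanced quotes, a byte-order mark at the start) or an unreadable clip. If the list is well-formed and the clips really do use different codecs, switch to the concat filter.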

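Unsafe file name: The list contains an absolute path or a file name with characters outside the demuxer's safe set, and -safe 0 is missing. Add -safe 0 before -i, or use relative paths made of plain letters, digits, dots, hyphens, and underscores.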

The concat filter

When your clips have different codecs, resolutions, or frame rates, you need the concat filter. It re-encodes everything to a common format.

ffmpeg -i clip1.mp4 -i clip2.mov -i clip3.webm \
  -filter_complex "[0:v][0:a][1:v][1:a][2:v][2:a]concat=n=3:v=1:a=1[outv][outa]" \
  -map "[outv]" -map "[outa]" \
  -c:v libx264 -crf 23 -c:a aac -b:a 128k \
  output.mp4

The filter syntax breaks down like this:

[0:v][0:a]: video and audio from the first input
[1:v][1:a]: video and audio from the second input
[2:v][2:a]: video and audio from the third input
concat=n=3:v=1:a=1: join 3 inputs, each contributing 1 video stream and 1 audio stream
[outv][outa]: name the output streams so -map can reference them

When resolutions don't match:

The concat filter refuses to join inputs with different dimensions. Scale and pad them to a common size first:

ffmpeg -i clip1.mp4 -i clip2.mp4 \
  -filter_complex \
    "[0:v]scale=1920:1080:force_original_aspect_ratio=decrease,pad=1920:1080:(ow-iw)/2:(oh-ih)/2[v0]; \
     [1:v]scale=1920:1080:force_original_aspect_ratio=decrease,pad=1920:1080:(ow-iw)/2:(oh-ih)/2[v1]; \
     [v0][0:a][v1][1:a]concat=n=2:v=1:a=1[outv][outa]" \
  -map "[outv]" -map "[outa]" \
  -c:v libx264 -crf 23 -c:a aac -b:a 128k \
  output.mp4

When frame rates don't match:

Add fps=30 (or whatever target rate) to each input's filter chain:

ffmpeg -i clip1.mp4 -i clip2.mp4 \
  -filter_complex \
    "[0:v]fps=30[v0]; \
     [1:v]fps=30[v1]; \
     [v0][0:a][v1][1:a]concat=n=2:v=1:a=1[outv][outa]" \
  -map "[outv]" -map "[outa]" \
  -c:v libx264 -crf 23 -c:a aac -b:a 128k \
  output.mp4
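With more than two or three clips, typing the per-input chains by hand gets error-prone. The sketch below generates a filter graph that applies the same scale, pad, and fps steps shown above to every input before the concat. The function name and its defaults are assumptions for illustration, and it assumes every input has exactly one video and one audio stream.

```python
def build_normalized_graph(n, width=1920, height=1080, fps=30):
    """Build a filter_complex string that scales, pads, and retimes n inputs,
    then joins them with the concat filter."""
    if n < 1:
        raise ValueError("need at least one input")
    chains = []
    pairs = []
    for i in range(n):
        # Same scale + pad + fps steps as the hand-written commands, per input.
        chains.append(
            f"[{i}:v]scale={width}:{height}:force_original_aspect_ratio=decrease,"
            f"pad={width}:{height}:(ow-iw)/2:(oh-ih)/2,"
            f"fps={fps}[v{i}]"
        )
        pairs.append(f"[v{i}][{i}:a]")
    joined = "".join(pairs)
    chains.append(f"{joined}concat=n={n}:v=1:a=1[outv][outa]")
    return ";".join(chains)
```

Pass the returned string as the argument to -filter_complex and keep the -map "[outv]" -map "[outa]" flags as in the commands above.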

Common filter errors:

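Input link ... parameters do not match: The resolutions (or sample aspect ratios) differ between inputs. Add the scale + pad block shown above. If the message lists differing SAR values, append setsar=1 to each input's chain as well.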

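[Parsed_concat] Input streams must have the same timebase: The inputs use different time bases. Normalize the frame rate first with the fps filter as shown above, or add -vsync vfr before the output file name (newer FFmpeg releases deprecate -vsync in favor of -fps_mode vfr).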

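Audio stream has no pts: Usually happens with variable-bitrate audio. Resample with the aresample=async=1 audio filter (the older -async 1 option is deprecated, and newer builds may no longer accept it), or convert the audio to a fixed format first.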

The concat protocol

The protocol is the oldest method and the most limited. It concatenates at the byte level using a pipe-separated list of files.

ffmpeg -i "concat:clip1.ts|clip2.ts|clip3.ts" -c copy output.ts

It only works reliably with MPEG-TS (.ts) files. MP4, MOV, and MKV store header data in ways the protocol can't handle across file boundaries.

If you have .ts segments from an HLS stream or a live recording, this is the fastest way to rejoin them:

# Merge HLS segments back to MP4, converting ADTS AAC headers for the MP4 container
ffmpeg -i "concat:segment000.ts|segment001.ts|segment002.ts" \
  -c copy -bsf:a aac_adtstoasc \
  output.mp4

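The -bsf:a aac_adtstoasc flag converts the ADTS-framed AAC that MPEG-TS carries into the MPEG-4 format that MP4 containers expect. Recent FFmpeg builds usually insert this filter automatically when writing MP4. On older builds, skipping it makes the mux fail or leaves you with audio that won't play.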

For anything other than .ts files, use the demuxer instead.
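If you build the protocol input from a segment list in a script, it helps to validate the names first. The sketch below is illustrative: the function names are hypothetical, and the `.ts` check is a policy taken from the advice above, not something FFmpeg itself enforces.

```python
def concat_protocol_url(segments):
    """Return the 'concat:a.ts|b.ts' input string for the concat protocol."""
    if not segments:
        raise ValueError("need at least one segment")
    for seg in segments:
        if "|" in seg:
            # The pipe is the separator, so it cannot appear inside a name.
            raise ValueError(f"'|' not allowed in segment name: {seg!r}")
        if not seg.lower().endswith(".ts"):
            raise ValueError(f"expected an MPEG-TS segment: {seg!r}")
    return "concat:" + "|".join(segments)


def protocol_command(segments, output):
    """Build the ffmpeg argument list; pass it to subprocess.run to execute."""
    return ["ffmpeg", "-i", concat_protocol_url(segments), "-c", "copy", output]
```

Passing the command as an argument list to subprocess.run avoids shell quoting problems around the pipe characters.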

Audio-only concat

Audio files work with either mechanism. If you're joining MP3 files that are identical in format:

printf "file '%s'\n" track1.mp3 track2.mp3 track3.mp3 > audio_list.txt
ffmpeg -f concat -safe 0 -i audio_list.txt -c copy output.mp3

If the codecs, bitrates, or sample rates differ:

ffmpeg -i track1.mp3 -i track2.ogg -i track3.m4a \
  -filter_complex "[0:a][1:a][2:a]concat=n=3:v=0:a=1[outa]" \
  -map "[outa]" \
  -c:a aac -b:a 192k \
  output.m4a

Note v=0:a=1 in the concat filter: no video streams, just audio. The rest of the syntax is the same.

Building concat commands in scripts

For anything beyond a handful of files, you want to generate the command programmatically.

Bash: merge all clips in a directory:

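#!/bin/bash
set -euo pipefail

OUTPUT=$1
CLIPS=("${@:2}")

# Template ends in X's so it works with both GNU and BSD (macOS) mktemp
LIST=$(mktemp /tmp/concat.XXXXXX)
trap 'rm -f "$LIST"' EXIT

# The demuxer resolves relative paths against the list file's directory (/tmp here),
# so write absolute paths. Requires realpath (GNU coreutils or recent macOS).
for clip in "${CLIPS[@]}"; do
  printf "file '%s'\n" "$(realpath "$clip")" >> "$LIST"
done

ffmpeg -f concat -safe 0 -i "$LIST" -c copy "$OUTPUT"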

Usage: ./merge.sh output.mp4 clip1.mp4 clip2.mp4 clip3.mp4

Python: build the filter_complex dynamically:

import subprocess

def concat_videos(inputs, output):
    n = len(inputs)
    input_flags = []
    for f in inputs:
        input_flags += ["-i", f]

    streams = "".join(f"[{i}:v][{i}:a]" for i in range(n))
    filter_complex = f"{streams}concat=n={n}:v=1:a=1[outv][outa]"

    cmd = (
        ["ffmpeg"]
        + input_flags
        + ["-filter_complex", filter_complex,
           "-map", "[outv]", "-map", "[outa]",
           "-c:v", "libx264", "-crf", "23",
           "-c:a", "aac", "-b:a", "128k",
           output]
    )
    subprocess.run(cmd, check=True)

concat_videos(["clip1.mp4", "clip2.mp4", "clip3.mp4"], "output.mp4")

Running concat via the RenderIO API

Local FFmpeg works fine for one-off tasks. For production workloads (merging user uploads, building automated video pipelines, batch processing hundreds of clips), running FFmpeg on your own server means sizing for peak load and managing the binary yourself.

The RenderIO FFmpeg API accepts the same FFmpeg commands you'd run locally and handles the compute remotely. No binary to install, no server to scale.

Concat demuxer via API (curl):

curl -X POST https://renderio.dev/api/v1/run-ffmpeg-command \
  -H "Content-Type: application/json" \
  -H "X-API-KEY: your_api_key" \
  -d '{
    "ffmpeg_command": "-f concat -safe 0 -i {{in_list}} -c copy {{out_video}}",
    "input_files": {
      "in_list": "https://example.com/files.txt",
      "in_clip1": "https://example.com/clip1.mp4",
      "in_clip2": "https://example.com/clip2.mp4"
    },
    "output_files": {
      "out_video": "merged.mp4"
    }
  }'

Concat filter via API (curl):

curl -X POST https://renderio.dev/api/v1/run-ffmpeg-command \
  -H "Content-Type: application/json" \
  -H "X-API-KEY: your_api_key" \
  -d '{
    "ffmpeg_command": "-i {{in_clip1}} -i {{in_clip2}} -filter_complex \"[0:v][0:a][1:v][1:a]concat=n=2:v=1:a=1[outv][outa]\" -map \"[outv]\" -map \"[outa]\" -c:v libx264 -crf 23 -c:a aac -b:a 128k {{out_video}}",
    "input_files": {
      "in_clip1": "https://example.com/clip1.mp4",
      "in_clip2": "https://example.com/clip2.mp4"
    },
    "output_files": {
      "out_video": "merged.mp4"
    }
  }'

The API returns a command_id. Poll until it's done:

curl https://renderio.dev/api/v1/commands/COMMAND_ID \
  -H "X-API-KEY: your_api_key"

When status is SUCCESS, the response includes a download URL for the merged file.

Node.js: concat filter with submit-poll pattern:

const API_KEY = "ffsk_your_api_key";
const BASE = "https://renderio.dev/api/v1";

async function concatVideos(clipUrls) {
  const n = clipUrls.length;
  const inputs = clipUrls.map((_, i) => `-i {{in_clip${i}}}`).join(" ");
  const streams = clipUrls.map((_, i) => `[${i}:v][${i}:a]`).join("");
  const filterComplex = `${streams}concat=n=${n}:v=1:a=1[outv][outa]`;

  const inputFiles = {};
  clipUrls.forEach((url, i) => { inputFiles[`in_clip${i}`] = url; });

  // Submit
  const submitRes = await fetch(`${BASE}/run-ffmpeg-command`, {
    method: "POST",
    headers: { "Content-Type": "application/json", "X-API-KEY": API_KEY },
    body: JSON.stringify({
      ffmpeg_command: `${inputs} -filter_complex "${filterComplex}" -map "[outv]" -map "[outa]" -c:v libx264 -crf 23 -c:a aac -b:a 128k {{out_video}}`,
      input_files: inputFiles,
      output_files: { out_video: "merged.mp4" },
    }),
  });
  const { command_id } = await submitRes.json();

  // Poll
  while (true) {
    await new Promise((r) => setTimeout(r, 3000));
    const status = await fetch(`${BASE}/commands/${command_id}`, {
      headers: { "X-API-KEY": API_KEY },
    }).then((r) => r.json());

    if (status.status === "SUCCESS") return status;
    if (status.status === "FAILED") throw new Error(status.error);
  }
}

const result = await concatVideos([
  "https://example.com/intro.mp4",
  "https://example.com/main.mp4",
  "https://example.com/outro.mp4",
]);
console.log("Download:", result.output_files.out_video);

For a deeper walkthrough of the Node.js patterns, including webhooks so you don't need to poll, the FFmpeg API Node.js guide has the full setup.

Concat for specific platforms

Platform delivery often needs a particular resolution and codec, not just "merged."

TikTok / Instagram Reels (9:16, 1080x1920, H.264):

ffmpeg -i clip1.mp4 -i clip2.mp4 \
  -filter_complex \
    "[0:v]scale=1080:1920:force_original_aspect_ratio=decrease,pad=1080:1920:(ow-iw)/2:(oh-ih)/2[v0]; \
     [1:v]scale=1080:1920:force_original_aspect_ratio=decrease,pad=1080:1920:(ow-iw)/2:(oh-ih)/2[v1]; \
     [v0][0:a][v1][1:a]concat=n=2:v=1:a=1[outv][outa]" \
  -map "[outv]" -map "[outa]" \
  -c:v libx264 -profile:v high -level 4.2 -crf 23 \
  -c:a aac -b:a 128k -ar 44100 \
  tiktok_merged.mp4

YouTube (16:9, 1920x1080):

ffmpeg -i clip1.mp4 -i clip2.mp4 \
  -filter_complex \
    "[0:v]scale=1920:1080:force_original_aspect_ratio=decrease,pad=1920:1080:(ow-iw)/2:(oh-ih)/2[v0]; \
     [1:v]scale=1920:1080:force_original_aspect_ratio=decrease,pad=1920:1080:(ow-iw)/2:(oh-ih)/2[v1]; \
     [v0][0:a][v1][1:a]concat=n=2:v=1:a=1[outv][outa]" \
  -map "[outv]" -map "[outa]" \
  -c:v libx264 -crf 20 -preset slow \
  -c:a aac -b:a 192k \
  youtube_merged.mp4
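The two commands differ only in target size and encoder flags, so a script can keep those in a lookup table. The sketch below copies its values from the commands above. The preset names and the `platform_command` function are hypothetical, and it assumes every clip has one video and one audio stream.

```python
# Sizes and encoder flags copied from the two commands above; names are hypothetical.
PRESETS = {
    "tiktok": {
        "size": (1080, 1920),
        "video": ["-c:v", "libx264", "-profile:v", "high", "-level", "4.2", "-crf", "23"],
        "audio": ["-c:a", "aac", "-b:a", "128k", "-ar", "44100"],
    },
    "youtube": {
        "size": (1920, 1080),
        "video": ["-c:v", "libx264", "-crf", "20", "-preset", "slow"],
        "audio": ["-c:a", "aac", "-b:a", "192k"],
    },
}


def platform_command(platform, clips, output):
    """Return the ffmpeg argument list for merging clips for one platform."""
    preset = PRESETS[platform]
    w, h = preset["size"]
    cmd = ["ffmpeg"]
    for clip in clips:
        cmd += ["-i", clip]
    # One scale + pad chain per input, then a single concat over all of them.
    chains = [
        f"[{i}:v]scale={w}:{h}:force_original_aspect_ratio=decrease,"
        f"pad={w}:{h}:(ow-iw)/2:(oh-ih)/2[v{i}]"
        for i in range(len(clips))
    ]
    pairs = "".join(f"[v{i}][{i}:a]" for i in range(len(clips)))
    chains.append(f"{pairs}concat=n={len(clips)}:v=1:a=1[outv][outa]")
    cmd += ["-filter_complex", ";".join(chains), "-map", "[outv]", "-map", "[outa]"]
    return cmd + preset["video"] + preset["audio"] + [output]
```

Run the result with subprocess.run(cmd, check=True), the same way as the Python example earlier in this guide.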

For the full scale filter reference, see the FFmpeg cheat sheet.

FAQ

What's the difference between concat demuxer and concat filter?

The demuxer reads a list of files and joins them at the container level without decoding. The filter decodes each input, processes it through the filter graph, and re-encodes. Demuxer: fast, lossless, requires matching formats. Filter: slower, re-encodes, handles anything.

Can I concat MP4 and MOV files?

Yes, as long as the streams inside match: same codecs (H.264 + AAC is common to both), same resolution, frame rate, and audio sample rate. The demuxer handles it with -c copy. If the internal codecs or parameters differ, use the concat filter.

Why does the concat demuxer produce corrupted output?

Usually a format mismatch: the clips look compatible but have subtly different parameters (different audio sample rates, different video profiles). Run ffprobe clip1.mp4 and ffprobe clip2.mp4 and compare the output. Any difference in codec, resolution, frame rate, or sample rate will cause problems.
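Comparing two ffprobe dumps by eye is tedious, so the check can be scripted. The sketch below asks ffprobe for JSON and reports every field that differs between two clips. The function names and the `FIELDS` list are assumptions, and the list is not a complete set of the parameters that matter.

```python
import json
import subprocess

# Stream fields to compare; a hedged, non-exhaustive selection.
FIELDS = ("codec_name", "profile", "width", "height", "pix_fmt",
          "r_frame_rate", "sample_rate", "channels", "time_base")


def probe_streams(path):
    """Run ffprobe and return its list of stream dicts (needs ffprobe on PATH)."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-show_streams", "-of", "json", path],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)["streams"]


def diff_streams(streams_a, streams_b, fields=FIELDS):
    """Return (stream_index, field, value_a, value_b) for every mismatch."""
    mismatches = []
    if len(streams_a) != len(streams_b):
        mismatches.append((None, "stream_count", len(streams_a), len(streams_b)))
    # zip stops at the shorter list, so extra streams only show up in stream_count.
    for index, (a, b) in enumerate(zip(streams_a, streams_b)):
        for field in fields:
            if a.get(field) != b.get(field):
                mismatches.append((index, field, a.get(field), b.get(field)))
    return mismatches
```

Calling diff_streams(probe_streams("clip1.mp4"), probe_streams("clip2.mp4")) returns an empty list when the compared fields agree. Anything else in the list is a candidate reason to use the concat filter.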

How do I concat videos without losing quality?

Use the concat demuxer with -c copy when formats match. This copies the encoded streams directly with no quality loss. If you have to use the concat filter (formats differ), use a lower CRF value like 18-20 for visually near-lossless output. The file will be larger, but most viewers won't see a difference from the source.

Is there a limit on how many files I can concat?

The demuxer reads from a text file, so there's no practical limit (thousands of entries work fine). The filter needs every input on the command line and opens and decodes all of them in one process, so it becomes unwieldy much sooner: command-line length, open-file limits, and memory all start to bite as the count climbs into the hundreds. For large batches, use the demuxer.

Does ffmpeg concat preserve audio sync?

Yes, if the source material has correct timestamps. Problems happen when clips were trimmed improperly (cut mid-GOP without re-encoding) or come from variable-frame-rate sources like phone recordings. The fix is to re-encode with the concat filter and add -vsync cfr to force constant frame rate (newer FFmpeg releases spell this -fps_mode cfr).

How do I concat videos with different audio tracks or no audio?

If only some clips have audio, FFmpeg will error out when building the concat filter, because every segment must supply the same number of streams. Add a silent audio track to clips that have none. The video is copied untouched and only the generated silence is encoded:

ffmpeg -i silent_clip.mp4 -f lavfi -i anullsrc=channel_layout=stereo:sample_rate=44100 \
  -map 0:v -map 1:a \
  -c:v copy -c:a aac \
  -shortest \
  with_audio.mp4

Then concat all clips normally via the filter.