FFmpeg Transcode Video: Complete Codec Conversion Guide

March 7, 2026 · RenderIO

Transcoding is not the same as converting

Most people use "convert" and "transcode" interchangeably. They're not the same thing, and the difference matters when you're trying to figure out why your "conversion" took 45 minutes instead of 2 seconds. (If you're brand new to FFmpeg, start with the command line beginner guide for installation and basic syntax.)

When you transcode video with FFmpeg, you decode it from one codec and re-encode it into another. The pixels get decompressed, then recompressed using a different algorithm. This takes time and CPU because the encoder has to make millions of decisions about how to represent each frame.

When you remux (sometimes called "rewrap"), you move the existing compressed stream into a different container without touching the codec. No decoding, no re-encoding. It finishes in seconds because FFmpeg just copies bytes.

Here's the difference in practice:

# Transcoding: decode H.264, re-encode as H.265 (slow, CPU-heavy)
ffmpeg -i input.mp4 -c:v libx265 -crf 28 -c:a aac output.mp4

# Remuxing: copy streams into a new container (fast, no quality loss)
ffmpeg -i input.mkv -c:v copy -c:a copy output.mp4

The -c:v copy flag is the tell. When you see copy, nothing gets transcoded. When you see a codec name like libx264 or libx265, FFmpeg is doing real work.

A quick way to check what codecs are inside a file before deciding:

ffprobe -v error -show_entries stream=codec_name,codec_type -of csv=p=0 input.mp4
# Output:
# h264,video
# aac,audio

The ffprobe tutorial covers more inspection techniques — JSON output, batch scripting, and extracting specific fields — if you need to automate these checks.

If the output codec matches what you need and only the container is wrong, remux. Otherwise, transcode.
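That remux-or-transcode decision is easy to script. A minimal sketch (choose_action is a made-up helper for this example; the ffprobe call is the same inspection shown above, guarded so the script no-ops when ffprobe or the file is missing):

```shell
# Decide whether a file needs a remux or a transcode for a given target codec.
# choose_action ACTUAL_CODEC TARGET_CODEC -> prints "remux" or "transcode"
choose_action() {
  if [ "$1" = "$2" ]; then
    echo "remux"       # codec already matches: just rewrap the container
  else
    echo "transcode"   # codec differs: a real re-encode is required
  fi
}

# Wire it to ffprobe (no-op when ffprobe or input.mkv is absent):
if command -v ffprobe >/dev/null && [ -f input.mkv ]; then
  codec=$(ffprobe -v error -select_streams v:0 \
    -show_entries stream=codec_name -of csv=p=0 input.mkv)
  echo "video codec is $codec -> $(choose_action "$codec" h264)"
fi
```

The helper only compares codec names; container compatibility still needs a human (or the formats guide's table) to confirm.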

So when do you actually need to transcode video with FFmpeg? When the codec itself needs to change. Maybe you shot ProRes on a camera and need H.264 for the web. Maybe you have old H.264 files and want H.265 to cut storage costs in half. Maybe a client sent you VP9 WebM and your player only handles MP4. Or maybe you're converting animated GIFs to MP4 for web delivery — that's a transcode from GIF's frame-by-frame encoding to H.264's inter-frame compression, and it typically shrinks the file by 10-50x. If your goal is purely to reduce file size within the same codec (or pick the best codec for compression), the video compression guide covers CRF tuning, two-pass encoding, and platform-specific compression in detail.
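For the GIF case specifically, a commonly used recipe looks like this (filenames assumed; the scale expression rounds both dimensions down to even numbers, which yuv420p requires, and the loop is guarded so it no-ops when ffmpeg or the files are absent):

```shell
# Convert every GIF in the current directory to an H.264 MP4.
for f in *.gif; do
  [ -f "$f" ] && command -v ffmpeg >/dev/null || continue
  ffmpeg -i "$f" \
    -vf "scale=trunc(iw/2)*2:trunc(ih/2)*2" \
    -pix_fmt yuv420p -movflags +faststart \
    "${f%.gif}.mp4"
done
```

The -movflags +faststart flag moves the index to the front of the file so browsers can start playback before the download finishes.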

If your file just needs a different container (MKV to MP4, for example) and the codecs are already compatible, remuxing is always the right call. No reason to burn CPU re-encoding something that already works. Same goes for trimming video — if you just need to cut a clip and the codec is fine, stream copy (-c copy) skips re-encoding entirely. The FFmpeg formats guide has a compatibility table showing which codecs fit inside which containers, so you can check before deciding whether to remux or transcode. For a quick reference of common commands, the FFmpeg cheat sheet has 50 commands organized by task. Need to look up what a specific flag does? The FFmpeg options reference documents each option with syntax and practical examples.

H.264 to H.265: the most common transcode

H.265 (HEVC) produces files roughly 40-50% smaller than H.264 at the same visual quality. Independent codec comparisons back this up: at equivalent VMAF scores, H.265 consistently needs 40-45% fewer bits than H.264. The trade-off is encoding speed. H.265 runs at about 0.3-0.5x the speed of H.264, so a file that takes 5 minutes with libx264 might take 12-15 minutes with libx265.

The basic command:

ffmpeg -i input.mp4 -c:v libx265 -crf 28 -preset medium -tag:v hvc1 -c:a copy output.mp4

What each flag does:

  • -c:v libx265 selects the x265 encoder

  • -crf 28 sets the quality level (lower = better quality, bigger file)

  • -preset medium balances encoding speed against compression efficiency

  • -tag:v hvc1 marks the stream so Apple devices and QuickTime can play it (without this, macOS and iOS may refuse the file)

  • -c:a copy keeps the audio stream as-is

CRF values for H.265

CRF (Constant Rate Factor) is how you control quality in x265. The default is 28, which is a reasonable starting point, but "reasonable" depends on what you're doing with the output.

Here's how different CRF values compare for a typical 1080p source:

CRF   Quality                  Relative file size   Use case
18    Visually lossless        ~3x baseline         Archival, post-production
23    High quality             ~1.5x baseline       Streaming, downloads
28    Good quality (default)   1x baseline          General distribution
32    Acceptable               ~0.6x baseline       Mobile, low bandwidth
38    Noticeable artifacts     ~0.3x baseline       Previews, thumbnails

Note: CRF scales are not directly comparable between codecs. CRF 23 in x264 and CRF 28 in x265 produce roughly similar quality. People get confused by this and set CRF 23 in x265 thinking it's the same target as x264. It's not. You'll get a much larger file than you expected.

You can verify this yourself with FFmpeg's built-in VMAF filter:

ffmpeg -i transcoded.mp4 -i original.mp4 \
  -lavfi libvmaf="model=version=vmaf_v0.6.1" -f null -

This gives you an objective quality score (0-100) so you're not guessing whether CRF 28 is "good enough" for your content.

Presets and what they actually do

The -preset flag controls how hard the encoder works to find optimal compression. Slower presets produce smaller files at the same quality, but the time difference is real:

# Fast encode, bigger file
ffmpeg -i input.mp4 -c:v libx265 -crf 28 -preset fast output_fast.mp4

# Slow encode, smaller file at same quality
ffmpeg -i input.mp4 -c:v libx265 -crf 28 -preset slow output_slow.mp4

For a 10-minute 1080p clip, fast might finish in 8 minutes, medium in 15, and slow in 40. The file size difference between fast and slow is typically 10-20%. Worth it for archival. Not worth it for a batch of 500 clips you need by tomorrow morning.

The full preset list, fastest to slowest: ultrafast, superfast, veryfast, faster, fast, medium, slow, slower, veryslow, placebo. Don't use placebo. It exists to prove a point, not to be useful.
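Rules of thumb aside, the honest answer is to measure on your own footage. A quick benchmark sketch (assumes ffmpeg is installed and a local input.mp4 exists; the loop no-ops otherwise):

```shell
# Encode the same clip at three presets, then compare wall time and file size.
presets="fast medium slow"
for p in $presets; do
  if command -v ffmpeg >/dev/null && [ -f input.mp4 ]; then
    echo "--- preset: $p ---"
    time ffmpeg -hide_banner -loglevel error -y -i input.mp4 \
      -c:v libx265 -crf 28 -preset "$p" -c:a copy "out_${p}.mp4"
    ls -lh "out_${p}.mp4"
  fi
done
```

The CRF stays constant across runs, so the quality target is fixed and only encode time and file size vary between presets.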

H.264 to AV1: smaller files, if you can wait

AV1 is royalty-free (unlike H.265, which has a notoriously complex patent landscape), and it typically produces files 20-30% smaller than H.265 at the same quality. Hardware decoder support has caught up. Chrome, Firefox, and Edge support AV1 playback, Safari plays it on hardware with an AV1 decoder (A17 Pro and M3 or later), and most smartphones shipped since 2023 include hardware AV1 decoders (Qualcomm Snapdragon 888+, MediaTek Dimensity 1000+, Apple A17 Pro and later, Samsung Exynos 2200+). The catch is still encoding speed.

FFmpeg supports three AV1 encoders:

libaom-av1 (reference encoder): Best quality, absurdly slow. Runs at roughly 0.05-0.1x the speed of H.264. A 10-minute 1080p encode can take hours.

ffmpeg -i input.mp4 -c:v libaom-av1 -crf 30 -b:v 0 -cpu-used 4 -c:a libopus output.mkv

The -b:v 0 flag is required for CRF mode with libaom. The -cpu-used flag ranges from 0 (slowest, best) to 8 (fastest, worst). Setting it to 4 is a reasonable middle ground. At 0 you might finish the encode next week.

libsvtav1 (SVT-AV1): The practical choice. Developed by Intel and Netflix, SVT-AV1 runs at 0.3-0.6x the speed of H.264, roughly on par with libx265 but with better compression. This is what changed AV1 from "technically interesting" to "actually usable."

ffmpeg -i input.mp4 -c:v libsvtav1 -crf 30 -preset 6 -c:a libopus output.mkv

SVT-AV1 presets range from 0-13 (13 is for testing only). Preset 6 is a good balance for production work. Below 4 gets slow; above 10 sacrifices noticeable quality. If you want to squeeze more compression out of longer GOP structures, add -g 300 for a 30fps source (10-second GOPs).

librav1e: Rust-based encoder. Decent quality but slower than SVT-AV1 and not as widely available. Unless you have a specific reason, SVT-AV1 is the better pick.

For most workflows right now, AV1 makes sense for on-demand content where you encode once and serve millions of times (think YouTube, Netflix). It doesn't make sense for real-time or high-volume batch processing where encoding speed matters. For a deeper look at how different services handle codec choices at scale, check the complete FFmpeg API guide.

ProRes to H.264: camera footage for the web

If you work with footage from professional cameras or Final Cut Pro exports, you're probably sitting on ProRes files. They're great for editing (every frame is independently encoded, so scrubbing and seeking are instant). They're terrible for the web. A 2-minute ProRes 422 clip at 1080p runs 2-4 GB. ProRes 4444 with alpha doubles that.

ffmpeg -i input.mov -c:v libx264 -crf 18 -preset slow \
  -pix_fmt yuv420p -c:a aac -b:a 192k output.mp4

The -pix_fmt yuv420p flag matters here. ProRes often uses 4:2:2 or 4:4:4 chroma subsampling, but most web players expect 4:2:0. Without this flag, some browsers and mobile players will show a black screen or error out. FFmpeg may add it automatically in some cases, but being explicit avoids surprises.

Why CRF 18? ProRes footage is usually high quality source material. You want to preserve that quality when transcoding to a lossy codec. CRF 18 with libx264 is considered "visually lossless" for most content. Going higher (worse quality) on premium footage feels wasteful.

If you need to hit a specific bitrate for delivery (some broadcast specs require it), use two-pass encoding instead of CRF:

# Pass 1: analyze the video (on Windows, write to NUL instead of /dev/null)
ffmpeg -i input.mov -c:v libx264 -b:v 8M -pass 1 -an -f null /dev/null

# Pass 2: encode using the analysis
ffmpeg -i input.mov -c:v libx264 -b:v 8M -pass 2 -c:a aac -b:a 192k output.mp4

Two-pass encoding gives FFmpeg a complete picture of the video's complexity before it starts encoding. The first pass builds a log file of scene complexity. The second pass uses that log to distribute bits more intelligently. Scenes with fast motion get more bits; static shots get fewer. The result is more consistent quality than single-pass at the same average bitrate.

VP9 to H.264: WebM files that won't play everywhere

VP9 in WebM containers is common if you're pulling video from web sources. Chrome and Firefox handle it fine, but Safari support is inconsistent and many native mobile players don't support it at all.

ffmpeg -i input.webm -c:v libx264 -crf 22 -c:a aac -b:a 128k output.mp4

If you also need VP9 output for web delivery:

ffmpeg -i input.mp4 -c:v libvpx-vp9 -crf 30 -b:v 0 \
  -row-mt 1 -c:a libopus output.webm

VP9 encoding has a quirk: you need -b:v 0 for pure CRF mode. Without it, FFmpeg defaults to a constrained quality mode that behaves differently. The -row-mt 1 flag enables row-based multithreading, which can double VP9 encoding speed on multi-core machines. It's off by default for historical reasons.

Batch transcoding: processing hundreds of files

One file at a time doesn't scale. If you have a folder of 200 MOV files that need to become H.264 MP4s, you need a loop.

Bash loop (simple)

for f in *.mov; do
  ffmpeg -i "$f" -c:v libx264 -crf 22 -preset fast -c:a aac \
    "${f%.mov}.mp4"
done

This processes files sequentially. Fine if you're running it overnight. Slow if you're waiting.

GNU Parallel (use all your cores)

find . -name "*.mov" | parallel -j 4 \
  ffmpeg -i {} -c:v libx264 -crf 22 -preset fast -c:a aac {.}.mp4

The -j 4 flag runs 4 encodes simultaneously. Adjust based on your CPU cores. Each H.264 encode uses roughly 2-4 threads, so on an 8-core machine, 2-4 parallel jobs is usually the sweet spot. Going higher will slow everything down as encoders fight for CPU time.
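The core math above can be made explicit. A small sketch (pick_jobs is a made-up helper; nproc is the standard Linux way to get the core count):

```shell
# Pick a parallel job count: roughly half the core count, never below 1.
pick_jobs() {
  jobs=$(( $1 / 2 ))
  if [ "$jobs" -lt 1 ]; then jobs=1; fi
  echo "$jobs"
}

# Feed it the real core count when launching parallel:
#   parallel -j "$(pick_jobs "$(nproc)")" ffmpeg -i {} ... {.}.mp4
echo "8 cores -> $(pick_jobs 8) jobs"   # prints: 8 cores -> 4 jobs
```

Halving the core count accounts for each encode already using 2-4 threads of its own; tune up or down after watching CPU utilization on a real batch.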

xargs (portable)

find . -name "*.mov" -print0 | xargs -0 -I {} -P 4 \
  sh -c 'ffmpeg -i "$1" -c:v libx264 -crf 22 -preset fast -c:a aac "${1%.mov}.mp4"' _ {}

Same idea as parallel, using tools that ship with every Unix system.

Batch transcoding with different output codecs

Sometimes you need multiple formats from the same source. FFmpeg can write multiple outputs in a single pass:

ffmpeg -i input.mov \
  -c:v libx264 -crf 22 -c:a aac output.mp4 \
  -c:v libvpx-vp9 -crf 30 -b:v 0 -c:a libopus output.webm

This decodes the input once and encodes it twice. More efficient than running two separate commands because the decode step happens only once.

For large-scale batch processing, managing FFmpeg jobs on your own hardware gets complicated fast. Tracking progress, handling failures, retrying, managing output storage. If you're processing hundreds or thousands of files regularly, an FFmpeg REST API can handle the orchestration while you focus on the logic. You can also wire it into automation tools. The n8n video conversion guide walks through building a format conversion workflow with drag-and-drop nodes.

Hardware acceleration: when CPU encoding is too slow

Software encoding gives the best quality-per-bit. But it's slow. Hardware encoders trade some compression efficiency for speed, often encoding 5-10x faster than software.

NVIDIA NVENC (NVIDIA GPUs):

ffmpeg -hwaccel cuda -hwaccel_output_format cuda \
  -i input.mp4 -c:v h264_nvenc -preset p4 -cq 23 output.mp4

The -hwaccel cuda -hwaccel_output_format cuda flags keep decoded frames in GPU memory. Without them, FFmpeg copies frames back and forth over PCIe, cutting throughput roughly in half. NVENC supports H.264 and H.265, and newer GPUs (RTX 40 series and later) also support AV1 via av1_nvenc. The full CUDA and NVENC GPU acceleration guide covers setup, CUDA filters, multi-GPU configs, and troubleshooting in detail.

For H.265 with NVENC:

ffmpeg -hwaccel cuda -hwaccel_output_format cuda \
  -i input.mp4 -c:v hevc_nvenc -preset p5 -cq 28 output.mp4

Apple VideoToolbox (macOS):

ffmpeg -i input.mp4 -c:v hevc_videotoolbox -q:v 55 -tag:v hvc1 output.mp4

Uses the hardware encoder built into Apple Silicon and recent Intel Macs. The -q:v scale is 1-100 (higher = better quality). Around 55-65 produces good results. Apple Silicon's media engine is particularly fast. An M1 can encode 4K H.265 at 200+ fps.

Intel VAAPI (Linux with Intel GPUs):

ffmpeg -hwaccel vaapi -hwaccel_output_format vaapi \
  -hwaccel_device /dev/dri/renderD128 \
  -i input.mp4 -c:v h264_vaapi -b:v 5M output.mp4

Hardware encoders are worth it when speed matters more than optimal compression. If you're doing live streaming or burning through a high-volume batch and would rather finish in 2 hours than 20, hardware wins. For archival or bandwidth-sensitive delivery, software encoding still wins on quality-per-bit.

The NVIDIA developer blog has an in-depth guide on GPU-accelerated transcoding if you want to go deeper on NVENC tuning.

Common transcoding mistakes

A few things that trip people up regularly:

Using -c:v copy and wondering why nothing changed. Copy means no transcoding. If you want to change the codec, you have to specify a codec name. This is the "transcoding vs remuxing" distinction from earlier.

Setting CRF the same across codecs. CRF 23 in x264 is not the same as CRF 23 in x265. They use different scales. x265 default is 28, which roughly matches x264's 23. Set CRF based on the codec, not habit.

Forgetting -b:v 0 with VP9/AV1 CRF mode. Without it, libaom and libvpx default to constrained quality mode, which caps bitrate and can hurt quality on complex scenes. Always pass -b:v 0 when using CRF with these codecs.

Using ultrafast preset for production output. Ultrafast is for testing commands. The quality and file size are significantly worse than medium. If you used ultrafast and the output looks bad, try medium before blaming the codec.

Not copying audio when you don't need to change it. Every time you re-encode audio, you lose quality. If the audio codec is already AAC or compatible with your target container, -c:a copy saves time and preserves quality.

Skipping -pix_fmt yuv420p for web delivery. If your source is 4:2:2 or 4:4:4 (common with ProRes and professional formats), the output will inherit that chroma format. Most browsers and mobile devices can't decode it. Add -pix_fmt yuv420p to be safe.

Ignoring the -tag:v hvc1 flag for H.265. Without it, Apple devices may not recognize the file as playable. Takes two seconds to add, saves hours of debugging playback issues.
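Most of these mistakes are visible in ffprobe output before anyone hits play. A sketch of a pre-ship check (expect_field and the filename are made up for this example; the ffprobe fields are standard):

```shell
# Tiny helper: does the probe output contain the expected field=value pair?
expect_field() {   # expect_field "pix_fmt=yuv420p" "$probe_output"
  case "$2" in
    *"$1"*) echo ok ;;
    *)      echo mismatch ;;
  esac
}

# Probe the finished file (no-op when ffprobe or the file is missing):
if command -v ffprobe >/dev/null && [ -f output.mp4 ]; then
  probe=$(ffprobe -v error -select_streams v:0 \
    -show_entries stream=codec_name,pix_fmt,codec_tag_string \
    -of default=noprint_wrappers=1 output.mp4)
  # An Apple-friendly H.265 file should report hevc / yuv420p / hvc1.
  expect_field "codec_name=hevc" "$probe"
  expect_field "pix_fmt=yuv420p" "$probe"
  expect_field "codec_tag_string=hvc1" "$probe"
fi
```

Run it at the end of a batch job and grep for "mismatch" instead of spot-checking files by hand.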

FAQ

What's the difference between transcoding and converting?

Converting is the general term. It covers any format change. Transcoding specifically means decoding from one codec and re-encoding to another. Remuxing (changing the container without re-encoding) is also a form of converting, but it's not transcoding. When someone says "convert my video," you need to figure out whether they need a codec change (transcode) or just a container change (remux).

How long does it take to transcode a 1-hour video?

It depends on the codec, preset, resolution, and hardware. As a rough guide for 1080p on a modern CPU (8 cores): H.264 at medium preset takes roughly 15-25 minutes. H.265 at medium takes 30-60 minutes. AV1 with SVT-AV1 at preset 6 takes 40-80 minutes. Hardware encoding (NVENC, VideoToolbox) can cut these times by 5-10x. If you're also resizing the video during transcoding, add the scale filter and transcode in one pass to avoid encoding twice.

Does transcoding lose quality?

Yes, always. Every encode/decode cycle introduces generational loss. The amount depends on your CRF setting and the codec. At CRF 18 (x264) or CRF 22 (x265), the loss is imperceptible to most people. Avoid transcoding the same file multiple times. Always go back to the highest-quality source. This is especially relevant when merging videos with different codecs — the concat filter re-encodes everything, so pick your quality settings carefully.

Should I use H.265 or AV1?

If compatibility matters (and it usually does), H.265 is the safer choice — it plays on more devices. If you're encoding for web delivery and your audience is on modern browsers, AV1 gives better compression for free (no royalties). For archival, AV1 with SVT-AV1 is worth the extra encode time. See the FFmpeg commands list for ready-to-use API calls for both codecs.

Can I transcode video without FFmpeg?

You can use HandBrake (GUI), VLC (basic conversion), or cloud APIs. If you're integrating transcoding into an application or automation workflow, a hosted FFmpeg API lets you send the same FFmpeg commands over HTTP without installing or maintaining FFmpeg yourself.

Or skip the command line entirely

Everything above assumes you want to manage FFmpeg yourself. Install it, babysit encodes, handle failures, manage storage.

If you're transcoding video as part of an application or workflow, there's a simpler path. Send the FFmpeg command to an API, get the transcoded file back. No server to manage, no binary to install.

curl -X POST https://api.renderio.dev/api/v1/run-ffmpeg-command \
  -H "Content-Type: application/json" \
  -H "X-API-KEY: your_api_key" \
  -d '{
    "ffmpeg_command": "-i {{input}} -c:v libx265 -crf 28 -preset medium -tag:v hvc1 -c:a aac {{output}}",
    "input_files": { "input": "https://storage.example.com/raw/video.mp4" },
    "output_files": { "output": "transcoded.mp4" }
  }'

RenderIO runs the command in a sandboxed environment, stores the output, and gives you a download URL. Same FFmpeg syntax you already know. Grab an API key and you can transcode video without touching a server. The supported codec list includes libx264, libx265, libvpx-vp9, libsvtav1, AAC, MP3, and Opus. For more on how this works, see the FFmpeg REST API tutorial or the full API commands reference.

This is especially practical for batch workflows. Instead of managing parallel encodes on your machine, you can submit hundreds of transcode jobs and let the API handle concurrency and storage. Pair it with n8n or Zapier and you have a video pipeline without writing infrastructure code. If you're weighing the build-vs-buy decision, the hosted vs self-hosted comparison breaks down the cost math at different scales.
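As a sketch of what batch submission could look like with the endpoint from the example above (submit_job and the storage URLs are placeholders; the curl call is commented out so the loop only prints the payloads it would send):

```shell
# Build one job payload per source URL; the API handles concurrency.
submit_job() {   # submit_job SOURCE_URL OUTPUT_NAME -> prints the JSON payload
  printf '{"ffmpeg_command":"-i {{input}} -c:v libx265 -crf 28 -c:a aac {{output}}","input_files":{"input":"%s"},"output_files":{"output":"%s"}}' "$1" "$2"
}

for url in https://storage.example.com/raw/a.mp4 https://storage.example.com/raw/b.mp4; do
  payload=$(submit_job "$url" "$(basename "$url" .mp4)_hevc.mp4")
  echo "$payload"
  # Uncomment to actually submit:
  # curl -X POST https://api.renderio.dev/api/v1/run-ffmpeg-command \
  #   -H "Content-Type: application/json" -H "X-API-KEY: your_api_key" \
  #   -d "$payload"
done
```

Each iteration fires one job and moves on; there's no local queue to manage because the concurrency lives on the API side.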

Whether you need to run FFmpeg in the cloud without a server or on bare metal, the commands are the same.