Run FFmpeg in the Cloud Without Managing a Server

February 20, 2026 · RenderIO

The simplest way to run FFmpeg in the cloud: send a command over HTTP, get the output file back. No SSH. No scaling. No server. If you've already tried running FFmpeg online in a browser tab and hit the 2GB memory ceiling or single-threaded slowness, you want something faster and more capable — without adopting infrastructure.

There are four main options. They range from "still kind of managing a server" to "truly zero infrastructure." Each has real tradeoffs that depend on your video length, volume, and how much infrastructure you're willing to babysit.

If you're looking for a quick reference of FFmpeg commands before deciding where to run them, start there.

Running FFmpeg on AWS Lambda

Lambda is the first thing most people try. It's serverless. FFmpeg is just a binary. Should be easy.

It's not.

The deployment package problem

Lambda deployment packages max out at 250MB (unzipped). A statically-compiled FFmpeg binary is 70-90MB. Add your runtime, dependencies, and you're already tight. FFmpeg with full codec support pushes past the limit. For a deeper dive into all five Lambda limitations, see why Lambda is a poor fit for FFmpeg.

The workaround is a Lambda Layer for the FFmpeg binary. This works but adds cold start time (500ms-2s) and complicates deployment. You're now managing a layer version alongside your function version, and updates to FFmpeg mean rebuilding the layer.
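Getting the layer layout right is the fiddly part: layers are extracted under /opt at runtime, so the binary must sit at bin/ffmpeg inside the zip to appear at /opt/bin/ffmpeg. A minimal packaging sketch, assuming you already have a static ffmpeg build on disk:

```python
import zipfile

def build_ffmpeg_layer(binary_path: str, zip_path: str = "ffmpeg-layer.zip") -> str:
    """Package a static ffmpeg binary as a Lambda layer zip.

    Layers are extracted under /opt, so placing the binary at
    bin/ffmpeg inside the archive makes it available at /opt/bin/ffmpeg.
    """
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        info = zipfile.ZipInfo("bin/ffmpeg")
        info.external_attr = 0o755 << 16  # preserve the executable bit
        with open(binary_path, "rb") as f:
            zf.writestr(info, f.read())
    return zip_path
```

Publish the archive with `aws lambda publish-layer-version --layer-name ffmpeg --zip-file fileb://ffmpeg-layer.zip`, then attach the returned layer ARN to your function.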

Timeouts kill longer videos

Lambda functions max out at 15 minutes. A 1080p to 720p transcode of a 10-minute video takes 3-8 minutes depending on preset. Video compression with slower presets or H.265 can easily exceed that. Anything over 10 minutes of source video at reasonable quality settings is risky.

4K content? Forget it. A single 4K to 1080p downscale with CRF encoding runs about 0.3-0.5x realtime on Lambda's vCPUs. A 5-minute 4K clip would need 10-17 minutes of processing time, right at or past the timeout wall.

The /tmp storage constraint

Lambda gives you 512MB of /tmp by default (configurable to 10GB for extra cost). Your input AND output files must fit in this space. A raw 1080p file at moderate bitrate is ~150MB per minute. A 4-minute input plus the output fills 512MB.
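A rough preflight check helps here. The sketch below estimates whether input plus output fit in /tmp, using per-minute sizes you supply (the 150MB/min figure above is a ballpark, not a constant):

```python
def fits_in_tmp(duration_min: float, input_mb_per_min: float,
                output_mb_per_min: float, tmp_mb: int = 512) -> bool:
    """Rough preflight: Lambda must hold input AND output in /tmp at once."""
    needed_mb = duration_min * (input_mb_per_min + output_mb_per_min)
    return needed_mb <= tmp_mb

# A 4-minute 1080p input at ~150MB/min plus a ~75MB/min output
# overflows the default 512MB /tmp:
# fits_in_tmp(4, 150, 75) -> False
```

If the check fails, raise the function's ephemeral storage (up to 10GB, for extra cost) or pick a platform without the constraint.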

What the code looks like

import subprocess
import boto3

s3 = boto3.client('s3')

def handler(event, context):
    input_key = event['input_key']
    output_key = event['output_key']

    s3.download_file('my-bucket', input_key, '/tmp/input.mp4')

    # -y overwrites any /tmp/output.mp4 left behind by a previous warm
    # invocation; without it, ffmpeg refuses to overwrite and the run fails
    subprocess.run([
        '/opt/bin/ffmpeg', '-y',
        '-i', '/tmp/input.mp4',
        '-c:v', 'libx264', '-crf', '23',
        '/tmp/output.mp4'
    ], check=True)

    s3.upload_file('/tmp/output.mp4', 'my-bucket', output_key)

    return {'statusCode': 200, 'output_key': output_key}

That's the happy path. In production, you also need: an API Gateway trigger, S3 bucket with proper CORS, IAM roles (at least 3), a Lambda Layer with FFmpeg, CloudWatch alarms for timeouts, a DLQ for failed invocations, and a deployment pipeline. Most teams spend 10-20 hours getting this production-ready.

Lambda IAM setup: the permissions you actually need

IAM is where most Lambda FFmpeg setups break. Here's the minimum policy that actually works.

Your Lambda function needs an execution role with two permissions: the ability to write logs to CloudWatch, and the ability to read/write your S3 bucket. Nothing else.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "CloudWatchLogs",
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "arn:aws:logs:*:*:*"
    },
    {
      "Sid": "S3Access",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject"
      ],
      "Resource": "arn:aws:s3:::your-video-bucket/*"
    },
    {
      "Sid": "S3ListBucket",
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::your-video-bucket"
    }
  ]
}

The trust relationship for the execution role (the policy that lets Lambda assume the role):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "lambda.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}

Attach both via the IAM console or CLI:

# Create the role
aws iam create-role \
  --role-name ffmpeg-lambda-role \
  --assume-role-policy-document file://trust-policy.json

# Attach the permissions policy
aws iam put-role-policy \
  --role-name ffmpeg-lambda-role \
  --policy-name ffmpeg-lambda-permissions \
  --policy-document file://permissions-policy.json

# Use the role ARN when creating the function
aws lambda create-function \
  --function-name ffmpeg-processor \
  --role arn:aws:iam::ACCOUNT_ID:role/ffmpeg-lambda-role \
  ...

Two common mistakes: giving the role s3:* instead of specific actions (fine for testing, bad practice for production), and forgetting the s3:ListBucket permission on the bucket itself (not the /* path). Without ListBucket, any SDK call that lists prefixes will fail even if read/write works. Before you run FFmpeg on any file, probe it first with ffprobe — knowing the codec, resolution, and duration upfront saves you from Lambda timeouts on files that were never going to transcode in time.
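A gate on the probe result can live in whatever code triggers the function. The sketch below parses the JSON from `ffprobe -v error -show_entries format=duration -of json input.mp4` and applies the ~10-minute practical ceiling discussed above (the threshold is this article's rule of thumb, not a Lambda constant):

```python
import json

PRACTICAL_LIMIT_SEC = 10 * 60  # practical Lambda ceiling at quality presets

def safe_for_lambda(probe_output: str, limit_sec: int = PRACTICAL_LIMIT_SEC) -> bool:
    """Gate a transcode on ffprobe output before invoking Lambda.

    Expects the JSON printed by:
      ffprobe -v error -show_entries format=duration -of json input.mp4
    """
    data = json.loads(probe_output)
    duration = float(data["format"]["duration"])
    return duration <= limit_sec
```

A 12-minute source (`{"format": {"duration": "720.5"}}`) gets rejected before you pay for a doomed invocation.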

Lambda cost breakdown

$0.20 per 1M requests plus $0.0000166667 per GB-second. A 5-minute encode at 3GB memory costs about $0.015. At 1,000 videos/month: ~$15 in compute plus $5-10 in S3 and transfer. Cheap on paper, but the engineering time to set it up and maintain it costs a lot more.

Running FFmpeg on Google Cloud Run

Cloud Run gives you a container with more breathing room than Lambda: a request timeout configurable up to 60 minutes, up to 32GB RAM, and up to 8 vCPUs.

FROM node:20-slim
RUN apt-get update && apt-get install -y --no-install-recommends ffmpeg \
    && rm -rf /var/lib/apt/lists/*
COPY . /app
WORKDIR /app
CMD ["node", "server.js"]

Where Cloud Run wins over Lambda

No deployment size limit: your container can include FFmpeg with every codec. The 60-minute timeout handles most video lengths. You get more CPU (up to 8 vCPUs) which matters for preset slow or veryslow encodes. And the container stays warm between requests, so you skip cold starts on consecutive calls.

Where it still hurts

You're managing a Docker image, a container registry, IAM, and storage integration. Every FFmpeg update means rebuilding and pushing the image. Concurrency defaults to 80 requests per container, which is fine for a web server but problematic when each FFmpeg job pegs 4 cores. You'll want concurrency set to 1-2 for video work, which means Cloud Run spins up more instances and costs more.

Cold starts are also heavier than Lambda: 2-5 seconds for a container with FFmpeg installed. Not a problem for batch work, but noticeable in synchronous pipelines.

Cloud Run cost breakdown

$0.00002400 per vCPU-second plus $0.00000250 per GiB-second. A 5-minute encode on 4 vCPUs with 8GB RAM costs about $0.035. At 1,000 videos/month: ~$35 in compute. Lower operational overhead than Lambda (no size limits, longer timeouts), but you're still managing infrastructure.

Running FFmpeg on AWS Fargate

Fargate runs containers without managing EC2 instances. You define a task, Fargate allocates the resources. It's the most flexible option for heavy video work, but also the most infrastructure to wire together.

{
  "containerDefinitions": [{
    "name": "ffmpeg-worker",
    "image": "your-ecr-repo/ffmpeg-worker:latest",
    "cpu": 4096,
    "memory": 8192,
    "command": ["node", "worker.js"]
  }]
}

What you need to build around Fargate

You need ECR for images, ECS for task management, SQS for queuing (Fargate tasks don't accept HTTP directly the way Cloud Run services do), S3 for storage, CloudWatch for monitoring, and IAM gluing it all together, plus an ECS service or Step Functions to orchestrate the queue-to-task flow.

That's 6-7 AWS services wired together before you write a line of video code. Most teams spend 20-40 hours getting a production Fargate video pipeline running. Updates are painful because touching any piece risks breaking the chain.

Where Fargate earns its complexity

No timeout limits. Up to 16 vCPUs and 120GB RAM per task. You can process 4K content, hour-long videos, and complex multi-input filter graphs without hitting walls. For very high-volume pipelines (10,000+ videos/day), Fargate with Spot pricing can be cost-effective.

Fargate cost breakdown

$0.04048 per vCPU per hour plus $0.004445 per GB per hour. A 5-minute task on 4 vCPUs with 8GB: ~$0.016. Similar to Lambda in per-job cost but with much more flexibility. The cost is in engineering time, not compute.
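The per-job figures in these three cost breakdowns are simple rate arithmetic. A sketch for the 5-minute encode used throughout (rates as quoted above; rounding is approximate):

```python
SECONDS = 5 * 60  # the 5-minute encode used in each example

def lambda_cost(memory_gb: float) -> float:
    """Lambda compute: billed per GB-second."""
    return memory_gb * SECONDS * 0.0000166667

def cloud_run_cost(vcpus: float, gib: float) -> float:
    """Cloud Run compute: billed per vCPU-second plus per GiB-second."""
    return vcpus * SECONDS * 0.000024 + gib * SECONDS * 0.0000025

def fargate_cost(vcpus: float, gb: float) -> float:
    """Fargate compute: billed per vCPU-hour plus per GB-hour."""
    hours = SECONDS / 3600
    return vcpus * 0.04048 * hours + gb * 0.004445 * hours

# lambda_cost(3)       ~ $0.015
# cloud_run_cost(4, 8) ~ $0.035
# fargate_cost(4, 8)   ~ $0.016
```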

Running FFmpeg through an API

The FFmpeg as a service model. Send an HTTP request with your FFmpeg command. Get back a processed file. No containers, no queues, no storage management.

curl -X POST https://renderio.dev/api/v1/run-ffmpeg-command \
  -H "Content-Type: application/json" \
  -H "X-API-KEY: your_api_key" \
  -d '{
    "ffmpeg_command": "-i {{in_video}} -c:v libx264 -crf 23 -c:a aac {{out_video}}",
    "input_files": {
      "in_video": "https://your-bucket.s3.amazonaws.com/input.mp4"
    },
    "output_files": {
      "out_video": "output.mp4"
    }
  }'

That's it. You skip the Docker image, IAM roles, and monitoring setup. The same FFmpeg syntax you'd use locally, just swap file paths for URLs and placeholders. For language-specific integration, see the Node.js, Python, or curl guides.
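The same request from Python, using only the standard library (the endpoint, headers, and payload shape are copied from the curl example above):

```python
import json
import urllib.request

API_URL = "https://renderio.dev/api/v1/run-ffmpeg-command"

def build_job(command: str, input_files: dict, output_files: dict) -> dict:
    """Assemble the JSON body the API expects, same shape as the curl example."""
    return {
        "ffmpeg_command": command,
        "input_files": input_files,
        "output_files": output_files,
    }

def run_command(api_key: str, job: dict) -> dict:
    """POST the job and return the parsed JSON response."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(job).encode(),
        headers={"Content-Type": "application/json", "X-API-KEY": api_key},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=600) as resp:
        return json.loads(resp.read())
```

For example, `build_job("-i {{in_video}} -c:v libx264 -crf 23 {{out_video}}", {"in_video": "https://your-bucket.s3.amazonaws.com/input.mp4"}, {"out_video": "output.mp4"})` produces the same body as the curl call.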

API cost breakdown

RenderIO's Starter plan is $9/month for 500 commands. Growth is $29/month for 1,000 commands. Business is $99/month for 20,000 commands. Zero egress fees on all plans. Check the FFmpeg API pricing comparison for how this stacks up against other providers.

Engineering time

30 minutes to integrate. Zero ongoing maintenance. No FFmpeg version bumps, no container rebuilds, no IAM policy debugging at 2am.

Side-by-side comparison

Factor                  Lambda       Cloud Run    Fargate        API service
Setup time              10-20 hrs    8-15 hrs     20-40 hrs      30 min
Monthly cost (1K vids)  ~$20         ~$35         ~$25           $29
Max video duration      ~10 min*     60 min       No limit       No limit
Maintenance             Medium       Medium       High           None
Scaling                 Automatic    Automatic    Manual config  Automatic
Cold starts             Yes (1-2s)   Yes (2-5s)   No             No
4K support              Marginal     Yes          Yes            Yes
Custom codecs           Via Layer    Via Docker   Via Docker     Standard build

*Practical limit with quality presets, not Lambda's hard 15-minute timeout.

The hidden cost nobody calculates

The compute costs are similar across all options. The real difference is engineering time.

Setting up Lambda + FFmpeg properly (error handling, retries, monitoring, deployment pipeline) takes 2-3 days. Maintaining it takes 4-8 hours per month: FFmpeg updates, dependency patches, debugging timeout failures on videos that were 30 seconds too long.

At an engineer's fully-loaded cost of $75-150/hour, that's $300-1,200/month in labor. For a Fargate pipeline, double it.

An API integration takes about 30 minutes to set up and zero minutes to maintain.

Unless you're processing at very high volume (10,000+ videos/month) or need custom codecs that aren't in the standard FFmpeg build, the math favors the API. The hosted vs self-hosted comparison breaks down exactly where the crossover point is.

Moving your existing FFmpeg commands to the cloud

You already have FFmpeg commands that work locally. Migrating them takes three changes:

  1. Replace input file paths with {{placeholder}} names

  2. Replace output file paths with {{placeholder}} names

  3. Map placeholders to input URLs and output filenames

# Before (local)
ffmpeg -i input.mp4 -vf "scale=1280:720" -c:v libx264 output.mp4

# After (API)
{
  "ffmpeg_command": "-i {{in_video}} -vf scale=1280:720 -c:v libx264 {{out_video}}",
  "input_files": { "in_video": "https://example.com/input.mp4" },
  "output_files": { "out_video": "output.mp4" }
}

Same FFmpeg syntax. Different execution environment. The complete API guide has more examples with multi-input commands, webhooks, and error handling.

FAQ

Can AWS Lambda handle 4K video?

Barely. Lambda's max 10GB memory and 15-minute timeout make 4K processing unreliable. A 2-minute 4K clip downscaled to 1080p needs 4-8 minutes on Lambda's vCPUs. Longer clips or complex filter chains will time out. Cloud Run or Fargate are better options for 4K.

Is Cloud Run cheaper than Fargate for video processing?

At low volume (under 1,000 videos/month), Cloud Run is slightly more expensive per job ($0.035 vs $0.016 per 5-min encode) but significantly cheaper in setup and maintenance time. At high volume with Fargate Spot, Fargate pulls ahead on compute cost, if you're willing to invest 20-40 hours building the pipeline.

What happens if my Lambda FFmpeg job times out?

The function is killed mid-encode. No output file is produced. The input file in /tmp is lost. You need retry logic (either in your client or via a DLQ + re-trigger pattern) to handle this. Lambda doesn't give you a partial result or a graceful shutdown period.
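Client-side, the retry half of that pattern is a small wrapper. A generic sketch; the `submit` callable and backoff parameters stand in for your own invoke logic:

```python
import time

def with_retries(submit, max_attempts: int = 3, base_delay: float = 1.0):
    """Retry a job submission with exponential backoff.

    `submit` is any zero-argument callable that raises on failure
    (for example, a Lambda invoke whose function timed out).
    """
    for attempt in range(max_attempts):
        try:
            return submit()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last failure
            time.sleep(base_delay * (2 ** attempt))
```

Pair it with an idempotent job design (same input key, same output key) so a retry after a mid-encode kill can't produce duplicates.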

Can I use GPU acceleration in the cloud without servers?

Not on Lambda (no GPU access). Cloud Run has added GPU support (NVIDIA L4), but availability is limited by region and quota. Fargate doesn't run GPU tasks; on AWS you'd need ECS backed by GPU-equipped EC2 instances. For GPU-accelerated encoding with CUDA and NVENC, you need a dedicated GPU instance or a service that manages GPU infrastructure for you.

How do I handle videos longer than 15 minutes on serverless?

Lambda can't. Cloud Run handles up to 60 minutes of processing time. Fargate has no timeout. An API service handles the timeout internally. You submit the job and poll for completion or receive a webhook callback. For long videos, async processing with webhooks is the most reliable pattern regardless of which option you pick.
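The polling half of that pattern reduces to a small loop. A generic sketch: `fetch_status` is a caller-supplied callable, and the status strings are illustrative rather than any specific provider's API:

```python
import time

def poll_until_done(fetch_status, interval: float = 5.0, timeout: float = 3600.0) -> str:
    """Poll a job-status callable until it reports completion or failure.

    `fetch_status` returns a status string such as "queued", "processing",
    "done", or "failed" (illustrative names, not a specific API).
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = fetch_status()
        if status == "done":
            return status
        if status == "failed":
            raise RuntimeError("job failed")
        time.sleep(interval)
    raise TimeoutError("job did not finish before the polling deadline")
```

For long videos, prefer a webhook callback over polling when the service offers one; the loop above is the fallback.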