Serverless FFmpeg: Process Video Without Infrastructure

February 23, 2026 · RenderIO

"Serverless" FFmpeg isn't really serverless

You've seen the blog posts: "Run FFmpeg on AWS Lambda." They make it sound simple. Add a Lambda Layer, write a handler, done.

Then you try it. You hit the 250MB size limit. You fight with Lambda Layers. You realize cold starts add 2 seconds to every invocation. Your 15-minute video times out. Your /tmp fills up.

Running FFmpeg on Lambda is serverless in the billing sense. You don't pay for idle time. But in the operational sense, you're still managing infrastructure: deployment packages, container images, IAM roles, S3 buckets, and monitoring dashboards.

Real serverless means you don't think about infrastructure at all. You call an API. That's what FFmpeg as a service actually looks like. (And if you don't need an API at all and just want to convert a single small file, browser-based FFmpeg online tools work for files under 2GB without any setup.)

The challenges of FFmpeg on Lambda

Let's be specific about what goes wrong.

Binary size: A static FFmpeg binary with common codecs is 70-90MB. Lambda Layers have a 250MB total limit across all layers (unzipped). Your FFmpeg layer plus your runtime dependencies must fit within this.
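One way to catch the size problem before deploying is to measure the unzipped layer contents locally, since the 250MB limit applies to unzipped size, not the zip you upload. A minimal sketch (the directory paths are hypothetical):

```python
import os

LAYER_LIMIT_BYTES = 250 * 1024 * 1024  # Lambda's unzipped limit

def dir_size(path):
    # Sum the size of every file under `path`
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total

def fits_in_layer_limit(*layer_dirs):
    # True if the combined unzipped size stays under the 250MB limit
    return sum(dir_size(d) for d in layer_dirs) <= LAYER_LIMIT_BYTES
```

Running this against `layers/ffmpeg` plus your dependency directories before each deploy catches the failure locally instead of at `aws lambda publish-layer-version` time.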

Cold starts: Lambda containers that include FFmpeg take 1-3 seconds to cold start. If your function processes user-facing video, that's noticeable latency before processing even begins.

No GPU access: Serverless platforms don't offer GPU instances. If you need CUDA and NVENC hardware acceleration for faster encoding, you're out of luck on Lambda, Cloud Functions, or any FaaS platform.

Execution timeout: 15 minutes maximum. An H.264 encode of a 20-minute 1080p video at medium preset takes 10-15 minutes. Anything longer fails when Lambda kills the process at the timeout, leaving you with a partial or corrupt output.

Temporary storage: /tmp defaults to 512MB. A 10-minute 1080p video is ~500MB. You need to download the input AND write the output. That's 1GB minimum. You can extend /tmp to 10GB, but it costs extra.
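The arithmetic above can be sketched directly. This assumes, as a rough rule of thumb only, that a re-encode at similar quality produces an output about the size of the input:

```python
def tmp_needed_mb(input_mb, output_ratio=1.0, headroom=1.2):
    # The input must be downloaded and the output written side by side,
    # so /tmp has to hold both at once, plus some headroom for muxing.
    return (input_mb + input_mb * output_ratio) * headroom

# A ~500MB 1080p input needs roughly 1.2GB of /tmp --
# well past Lambda's 512MB default.
print(tmp_needed_mb(500))  # 1200.0
```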

Concurrency: Each Lambda invocation handles one video. If 100 videos arrive at once, you need 100 concurrent Lambdas. The default account limit is 1,000 concurrent executions across all functions. Video processing can eat through that quickly.

No streaming: Lambda can't stream partial results. The entire encode must complete before you can return anything. No progress updates.

We cover the Lambda-specific issues in more depth in the Lambda vs FFmpeg API comparison. Here's the reality of a "serverless" FFmpeg setup on AWS:

# serverless.yml - not as simple as it looks
service: ffmpeg-processor

provider:
  name: aws
  runtime: python3.12
  timeout: 900  # 15 minutes max
  memorySize: 3008  # Need CPU power
  environment:
    OUTPUT_BUCKET: ${self:custom.outputBucket}

layers:
  ffmpeg:
    path: layers/ffmpeg
    description: FFmpeg binary

functions:
  processVideo:
    handler: handler.process
    layers:
      - {Ref: FfmpegLambdaLayer}
    events:
      - sqs:
          arn: !GetAtt ProcessingQueue.Arn
          batchSize: 1

resources:
  Resources:
    ProcessingQueue:
      Type: AWS::SQS::Queue
      Properties:
        VisibilityTimeout: 1080
    InputBucket:
      Type: AWS::S3::Bucket
    OutputBucket:
      Type: AWS::S3::Bucket

That's the "simple" version. Add error handling, dead letter queues, monitoring, alerting, and retry logic, and you have a full infrastructure project.

What truly serverless FFmpeg looks like

Truly serverless means: no infrastructure to configure, deploy, or monitor. You make an HTTP request. Processing happens. You get a result.

curl -X POST https://renderio.dev/api/v1/run-ffmpeg-command \
  -H "Content-Type: application/json" \
  -H "X-API-KEY: your_api_key" \
  -d '{
    "ffmpeg_command": "-i {{in_video}} -c:v libx264 -preset medium -crf 23 -c:a aac -b:a 128k {{out_video}}",
    "input_files": {
      "in_video": "https://example.com/input.mp4"
    },
    "output_files": {
      "out_video": "encoded.mp4"
    }
  }'

No YAML files. No Lambda Layers. No SQS queues. No S3 bucket policies. No IAM roles.

The command runs in an isolated container. There's no timeout wall at 15 minutes. There's no /tmp limit. There's no cold start penalty.
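The same request can be made from Python with a short poll for completion. This is a minimal sketch: the `command_id` and `status` fields are the ones the batch example later in this post relies on, but the exact shape of a completed response (for instance, where the download URL lives) is an assumption.

```python
import time
import requests

BASE_URL = "https://renderio.dev/api/v1"
HEADERS = {"Content-Type": "application/json", "X-API-KEY": "your_api_key"}

def run_command(ffmpeg_command, input_files, output_files):
    # Submit the job; the API returns a command_id to poll
    resp = requests.post(f"{BASE_URL}/run-ffmpeg-command", headers=HEADERS, json={
        "ffmpeg_command": ffmpeg_command,
        "input_files": input_files,
        "output_files": output_files,
    })
    resp.raise_for_status()
    return resp.json()["command_id"]

def wait_for(command_id, interval=2):
    # Poll until the job reaches a terminal state
    while True:
        status = requests.get(f"{BASE_URL}/commands/{command_id}", headers=HEADERS).json()
        if status["status"] in ("completed", "failed"):
            return status
        time.sleep(interval)
```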

How RenderIO achieves this

RenderIO uses Cloudflare Sandbox containers. Each FFmpeg command gets its own isolated environment that spins up in milliseconds. Here's what happens behind the scenes:

  1. Your request hits Cloudflare's edge network (300+ cities)

  2. A sandbox container starts with FFmpeg pre-installed

  3. Input files download from your URLs

  4. FFmpeg runs your exact command

  5. Output files upload to R2 storage

  6. You get a pre-signed download URL

The container exists only for the duration of your job. No idle resources. No long-running servers.

Comparing the approaches

Lambda approach (to process 1,000 videos/month):

  • Setup: 15-20 hours of engineering

  • Infrastructure: Lambda + SQS + S3 + CloudWatch + IAM

  • Ongoing maintenance: 4-6 hours/month

  • Cost: ~$20/month in AWS services

  • Limitations: 15-min timeout, 10GB storage, cold starts

RenderIO approach (same volume):

  • Setup: 30 minutes

  • Infrastructure: None

  • Ongoing maintenance: 0 hours/month

  • Cost: $29/month (Growth plan, 1,000 commands)

  • Limitations: Standard FFmpeg build (no custom codecs)

The $9/month difference in compute is nothing compared to the engineering time you avoid. For a full cost breakdown at every volume level, see the FFmpeg API pricing comparison.
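The trade-off is easier to see as a total-cost-of-ownership calculation. A sketch, assuming a hypothetical $100/hour engineering rate and taking the maintenance estimates above at face value:

```python
def monthly_tco(infra_cost, maintenance_hours, hourly_rate=100):
    # Total cost of ownership: infrastructure plus engineering time
    return infra_cost + maintenance_hours * hourly_rate

lambda_tco = monthly_tco(20, 5)    # ~$20 AWS + ~5h maintenance -> $520/month
renderio_tco = monthly_tco(29, 0)  # $29 plan, no maintenance -> $29/month
print(lambda_tco, renderio_tco)
```

Even halving the hourly rate or the maintenance estimate, the engineering time dominates the raw compute cost.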

Real code: processing a batch of videos

Here's how batch processing looks with a truly serverless approach:

import requests
import time
from concurrent.futures import ThreadPoolExecutor

API_KEY = "ffsk_your_api_key"
BASE_URL = "https://renderio.dev/api/v1"
HEADERS = {"Content-Type": "application/json", "X-API-KEY": API_KEY}

videos = [
    "https://example.com/video1.mp4",
    "https://example.com/video2.mp4",
    "https://example.com/video3.mp4",
    # ... hundreds more
]

def process_video(url, index):
    response = requests.post(f"{BASE_URL}/run-ffmpeg-command", headers=HEADERS, json={
        "ffmpeg_command": "-i {{in_video}} -vf scale=-1:720 -c:v libx264 -crf 23 {{out_video}}",
        "input_files": {"in_video": url},
        "output_files": {"out_video": f"processed_{index}.mp4"}
    })
    return response.json()["command_id"]

# Submit all jobs in parallel (enumerate yields (index, url) pairs,
# so unpack them in the order process_video expects)
with ThreadPoolExecutor(max_workers=20) as executor:
    command_ids = list(executor.map(
        lambda args: process_video(args[1], args[0]),
        enumerate(videos)
    ))

print(f"Submitted {len(command_ids)} jobs")

# Poll for all results
completed = set()
while len(completed) < len(command_ids):
    for cmd_id in command_ids:
        if cmd_id in completed:
            continue
        status = requests.get(f"{BASE_URL}/commands/{cmd_id}", headers=HEADERS).json()
        if status["status"] in ("completed", "failed"):
            completed.add(cmd_id)
            print(f"{cmd_id}: {status['status']}")
    time.sleep(2)

Submit 100 videos at once. Each runs in its own container. No concurrency limits to worry about. No queue configuration. No autoscaling policy.
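In production you would still want basic retry logic around submission. A sketch, under the assumption that transient network errors are the main failure mode (the retry policy is ours, not part of the API):

```python
import time
import requests

def submit_with_retry(submit_fn, retries=3, backoff=1.0):
    # Retry transient network failures with exponential backoff
    for attempt in range(retries):
        try:
            return submit_fn()
        except requests.RequestException:
            if attempt == retries - 1:
                raise
            time.sleep(backoff * (2 ** attempt))
```

Each submission in the batch loop would then go through `submit_with_retry(lambda: process_video(url, i))` instead of calling `process_video` directly.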

For more batch processing patterns, see batch processing AI videos for social media.

Integrating with automation tools

If you don't want to write code at all, a truly serverless FFmpeg API works with any automation platform that can make HTTP requests. We have dedicated guides for n8n video processing and Zapier FFmpeg integration. Since the API is standard REST, it also works with Make, Pipedream, and anything else with an HTTP node.

When Lambda still makes sense

Lambda is a reasonable choice when:

  • You already have AWS infrastructure and deep Lambda expertise

  • Your videos are under 5 minutes and reliably fit within Lambda's constraints

  • You need custom FFmpeg builds with specific codecs

  • You're processing within a VPC that can't make external API calls

For everything else, an API is simpler, faster to implement, and cheaper when you account for engineering time. The 2026 FFmpeg API comparison covers all the managed options side by side.

FAQ

Is serverless FFmpeg actually cheaper than a dedicated server?

It depends on volume. At under 5,000 videos/month, a managed API like RenderIO ($9-49/month) costs less than the engineering time to maintain Lambda. At 50,000+ videos/month, a dedicated VPS cluster is cheaper on raw infrastructure — but you need someone to run it. The pricing comparison has exact numbers.

Can Lambda handle long videos?

Lambda has a hard 15-minute timeout. A 20-minute 1080p encode at medium preset can take 10-15 minutes of compute time, which puts you right at the edge. Anything longer will fail. If your videos regularly exceed 10 minutes, Lambda is a bad fit.

What about AWS Fargate or ECS instead of Lambda?

Fargate avoids the 15-minute timeout and storage limits, but you're back to managing container definitions, task definitions, VPC configuration, service discovery, and autoscaling policies. It's more capable than Lambda but also more work. At that point you're essentially self-hosting with managed containers.

Get started in 5 minutes

Start with the Starter plan at $9/mo. Scale to Growth ($29/mo) or Business ($99/mo) as volume increases. The complete FFmpeg API guide walks through making your first call. Or just grab your API key and go.