Varnish Cache Setup Guide: Accelerate Your Website with Full-Page Caching

What Varnish Cache Actually Does

Varnish is an HTTP reverse proxy cache. It sits between your web server (Nginx, Apache) and the internet, intercepting incoming requests before they reach your backend. When a request arrives for a page that Varnish has already cached, it serves the response directly from memory — bypassing PHP, your database, and your web server entirely. The backend never sees the request.

This is fundamentally different from application-level caching (Redis, Memcached) or opcode caching (OPcache). Those tools reduce backend processing time. Varnish eliminates backend processing altogether for cached requests. A typical Nginx + PHP-FPM stack serves a Magento 2 category page in 800-1200ms. Varnish serves the same page in 1-5ms. That is not a typo.

Varnish stores cached objects in memory. It uses a custom domain-specific language called VCL (Varnish Configuration Language) to define caching rules — what to cache, for how long, how to handle cookies, how to invalidate stale content, and how to behave when the backend is down.

Architecture: Where Varnish Fits in Your Stack

The standard production architecture puts Varnish between your SSL termination layer and your web server:

Client -> Cloudflare/CDN -> Nginx (SSL termination, port 443)
  -> Varnish (port 6081) -> Nginx/Apache (backend, port 8080)

In this layout, the outer Nginx handles SSL termination and proxies unencrypted HTTP to Varnish on port 6081. Varnish checks its cache. On a cache hit, it responds immediately. On a miss, it forwards the request to the backend web server on port 8080, caches the response, and returns it to the client.

Why not put Varnish at the edge? Varnish does not natively handle SSL/TLS. You need a termination layer in front of it. Nginx is the standard choice because it handles SSL efficiently and can also serve static assets (images, CSS, JS) directly without involving Varnish.

A simplified single-server setup for smaller sites:

Client -> Nginx (port 443, SSL) -> Varnish (port 6081) -> Nginx (port 8080, backend)

For high-traffic deployments, you can run multiple Varnish instances behind a load balancer, each with its own cache. This is common in Magento 2 deployments handling thousands of concurrent users.

Installation on Ubuntu 22.04 / 24.04

Install Varnish from the official repository to get the latest stable release. The default Ubuntu packages are often outdated.

# Install dependencies
sudo apt-get update
sudo apt-get install -y debian-archive-keyring curl gnupg apt-transport-https

# Add Varnish 7.x official repository
curl -s https://packagecloud.io/install/repositories/varnishcache/varnish75/script.deb.sh | sudo bash

# Install Varnish
sudo apt-get install -y varnish

# Verify installation
varnishd -V

After installation, configure the Varnish daemon to listen on port 6081. The backend address (port 8080) is defined in the VCL file rather than on the command line. Edit the systemd service override:

sudo systemctl edit varnish

Add the following configuration:

[Service]
ExecStart=
ExecStart=/usr/sbin/varnishd \
  -a :6081 \
  -f /etc/varnish/default.vcl \
  -s malloc,2G \
  -p thread_pool_min=200 \
  -p thread_pool_max=4000 \
  -p thread_pool_timeout=120

The -s malloc,2G flag allocates 2 GB of RAM for the cache. Size this based on your available memory and working set. A Magento 2 store with 10,000 products and 500 category pages typically needs 1-4 GB. A WordPress site with 2,000 posts can get by with 512 MB to 1 GB.
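
To sanity-check a -s malloc value, you can multiply the expected number of cached objects by their average size and add per-object overhead. The sketch below is a rough estimate only: the function name estimate_cache_mb is made up for this guide, and the default of about 1 KB of bookkeeping overhead per object is an assumption you should replace with measurements from your own store.

```shell
# estimate_cache_mb OBJECTS AVG_KB [OVERHEAD_KB]
# Prints an approximate cache size in MB, rounded up.
# OVERHEAD_KB defaults to 1 (assumed per-object bookkeeping overhead).
estimate_cache_mb() {
    awk -v n="$1" -v avg="$2" -v oh="${3:-1}" 'BEGIN {
        kb = n * (avg + oh)
        mb = kb / 1024
        # round up to the next whole MB
        printf "%d\n", ((mb == int(mb)) ? mb : int(mb) + 1)
    }'
}

# Example: 10,500 cached pages averaging 150 KB each
estimate_cache_mb 10500 150   # prints 1549
```

The result covers one cached variant per page. Multiple store views, currencies, or customer groups multiply the object count, so treat the number as a lower bound.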

Reconfigure your backend Nginx to listen on port 8080 instead of port 80:

server {
    listen 8080;
    server_name example.com;
    # ... rest of your existing configuration
}

Then configure the frontend Nginx to proxy to Varnish:

server {
    listen 443 ssl http2;
    server_name example.com;

    ssl_certificate /etc/letsencrypt/live/example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/example.com/privkey.pem;

    location / {
        proxy_pass http://127.0.0.1:6081;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}

Restart everything in order:

sudo systemctl daemon-reload
sudo systemctl restart nginx
sudo systemctl restart varnish
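
After the restart, it helps to confirm that responses really pass through Varnish. Varnish adds an X-Varnish header by default, and that header carries two transaction IDs when the object was served from cache. The helper below is a small sketch built on that behavior; the function name varnish_verdict is hypothetical, and it only inspects headers you feed it.

```shell
# varnish_verdict: read HTTP response headers on stdin and report whether
# Varnish handled the request and whether it looks like a cache hit.
varnish_verdict() {
    tr -d '\r' | awk -F': *' '
        tolower($1) == "x-varnish" { n = split($2, ids, " "); found = 1 }
        END {
            if (!found)      print "no-varnish"   # header missing entirely
            else if (n >= 2) print "hit"          # two IDs: served from cache
            else             print "miss"         # one ID: fetched from backend
        }'
}

# Usage (needs the running stack):
#   curl -sI https://example.com/ | varnish_verdict
```

Run the curl command twice against a cacheable page. A "miss" followed by a "hit" suggests the chain is wired correctly, while "no-varnish" usually means the frontend Nginx is still proxying straight to the backend.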

Basic VCL Configuration

VCL defines how Varnish handles every request. The language is compiled to C and loaded into the running Varnish process. Here is a minimal production VCL:

vcl 4.1;

backend default {
    .host = "127.0.0.1";
    .port = "8080";
    .connect_timeout = 5s;
    .first_byte_timeout = 90s;
    .between_bytes_timeout = 2s;
}

sub vcl_recv {
    # Strip tracking parameters that bust the cache
    if (req.url ~ "(\?|&)(utm_source|utm_medium|utm_campaign|utm_content|utm_term|gclid|fbclid)=") {
        set req.url = regsuball(req.url, "(utm_source|utm_medium|utm_campaign|utm_content|utm_term|gclid|fbclid)=[^&]+&?", "");
        set req.url = regsub(req.url, "(\?|&)$", "");
    }

    # Only GET and HEAD are cacheable; pass every other method (POST, PUT, DELETE, ...)
    if (req.method != "GET" && req.method != "HEAD") {
        return (pass);
    }

    # Do not cache authenticated sessions
    if (req.http.Authorization) {
        return (pass);
    }

    # Remove cookies for static assets
    if (req.url ~ "\.(css|js|jpg|jpeg|png|gif|ico|svg|woff|woff2|ttf|eot)$") {
        unset req.http.Cookie;
        return (hash);
    }

    return (hash);
}

sub vcl_backend_response {
    # Cache static assets for 30 days
    if (bereq.url ~ "\.(css|js|jpg|jpeg|png|gif|ico|svg|woff|woff2|ttf|eot)$") {
        set beresp.ttl = 30d;
        unset beresp.http.Set-Cookie;
    }

    # Default TTL for dynamic pages: 2 hours
    if (beresp.ttl <= 0s) {
        set beresp.ttl = 2h;
    }

    # Enable grace mode (serve stale content while fetching fresh)
    set beresp.grace = 6h;

    return (deliver);
}

sub vcl_deliver {
    # Add debug headers (remove in production)
    if (obj.hits > 0) {
        set resp.http.X-Cache = "HIT";
        set resp.http.X-Cache-Hits = obj.hits;
    } else {
        set resp.http.X-Cache = "MISS";
    }

    return (deliver);
}

Key concepts in this VCL:

vcl_recv processes every incoming request. It decides whether to look up the request in cache (hash), bypass cache entirely (pass), or pipe the connection directly to the backend (pipe).
vcl_backend_response processes the response from your backend. It sets TTLs, removes cookies that would prevent caching, and configures grace periods.
vcl_deliver processes the response before sending it to the client. Useful for adding debug headers or stripping internal headers.
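
The tracking-parameter rule in vcl_recv is easier to reason about when you can try it on sample URLs. The sketch below mimics the same guard and the same two substitutions with grep and sed, outside Varnish. The function name strip_tracking is hypothetical, and sed's extended regular expressions are only an approximation of the PCRE engine Varnish uses, so confirm edge cases against a real instance.

```shell
# strip_tracking URL: apply the same guard and two regex steps as vcl_recv.
strip_tracking() {
    params='utm_source|utm_medium|utm_campaign|utm_content|utm_term|gclid|fbclid'
    if printf '%s\n' "$1" | grep -Eq "(\?|&)($params)="; then
        printf '%s\n' "$1" |
            sed -E -e "s/($params)=[^&]+&?//g" -e 's/(\?|&)$//'
    else
        # no tracking parameter present: leave the URL untouched
        printf '%s\n' "$1"
    fi
}

strip_tracking '/shoes?utm_source=news&color=red'   # /shoes?color=red
strip_tracking '/shoes?color=red&utm_medium=email'  # /shoes?color=red
strip_tracking '/shoes?gclid=abc'                   # /shoes
```

All three URLs collapse to one of two cache keys instead of three, which is the point of the rule: campaign links stop creating a new cache entry per visitor.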

Magento 2 VCL Generation and Customization

Magento 2 has built-in Varnish support. It generates a VCL file tailored to your store configuration. Generate it from the admin panel or CLI:

bin/magento varnish:vcl:generate \
  --backend-host=127.0.0.1 \
  --backend-port=8080 \
  --access-list=127.0.0.1 \
  --grace-period=300 \
  --export-version=6 > /etc/varnish/default.vcl

Enable Varnish as the full-page cache backend in Magento:

bin/magento setup:config:set --http-cache-hosts=127.0.0.1:6081
bin/magento config:set system/full_page_cache/caching_application 2
bin/magento cache:flush

The generated VCL handles Magento-specific concerns: admin panel bypass, customer sessions, form keys, and category/product page caching. However, you will almost always need to customize it.

Common customizations to the Magento VCL:

Excluding specific URLs from cache:

sub vcl_recv {
    # Bypass cache for checkout and customer account
    if (req.url ~ "^/(checkout|customer|wishlist|catalog/product_compare)") {
        return (pass);
    }

    # Bypass cache for API endpoints
    if (req.url ~ "^/rest/" || req.url ~ "^/graphql") {
        return (pass);
    }
}

Handling multiple store views:

sub vcl_recv {
    # Different cache entries per store code
    if (req.http.X-Magento-Store) {
        set req.http.X-Store = req.http.X-Magento-Store;
    }
}

sub vcl_hash {
    hash_data(req.http.X-Store);
}

Increasing cache TTL for product pages:

sub vcl_backend_response {
    if (bereq.url ~ "^/catalog/product/view") {
        set beresp.ttl = 24h;
    }
}

After modifying the VCL, reload it without restarting Varnish:

# Compile and load new VCL
varnishadm vcl.load new_config /etc/varnish/default.vcl
varnishadm vcl.use new_config

This performs a zero-downtime reload. The old VCL continues serving requests until the new one is active.
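
One practical wrinkle: vcl.load refuses a configuration name that is already loaded, so reusing new_config fails on the second reload. A minimal sketch of a wrapper that generates a fresh name each time is shown below. The function names vcl_label and reload_vcl and the reload_ prefix are illustrative choices, not Varnish conventions.

```shell
# vcl_label: build a unique configuration name such as reload_20240101_120000.
vcl_label() {
    printf 'reload_%s\n' "$(date +%Y%m%d_%H%M%S)"
}

# reload_vcl [FILE]: compile FILE under a fresh name, then activate it.
# vcl.use only runs if vcl.load succeeded, so a syntax error leaves the
# currently active VCL untouched.
reload_vcl() {
    file="${1:-/etc/varnish/default.vcl}"
    name="$(vcl_label)"
    varnishadm vcl.load "$name" "$file" && varnishadm vcl.use "$name"
}

# Usage (needs a running varnishd and access to its admin socket):
#   reload_vcl /etc/varnish/default.vcl
```

Old configurations stay loaded until you remove them. varnishadm vcl.list shows what is loaded, and varnishadm vcl.discard drops a configuration that is no longer active.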

WordPress Integration: proxy_cache vs Varnish

WordPress sites have two main full-page caching options: Nginx’s built-in proxy_cache (or fastcgi_cache) and Varnish.

Nginx fastcgi_cache is simpler to set up, stores cached pages on disk, and requires no additional services. It works well for sites under 50,000 monthly visitors where the cache hit ratio stays high and the working set fits comfortably in the OS page cache.

Varnish is the better choice when you need fine-grained cache invalidation (purge individual URLs, ban by pattern), grace mode for backend failures, Edge Side Includes (ESI) for partial page caching, or when your traffic patterns require serving thousands of concurrent requests from cache.

For WordPress with Varnish, the VCL needs to handle WordPress-specific cookies and admin paths:

sub vcl_recv {
    # Pass WordPress admin and login
    if (req.url ~ "^/wp-(admin|login|cron)" || req.url ~ "^/xmlrpc.php") {
        return (pass);
    }

    # Pass WooCommerce dynamic pages
    if (req.url ~ "^/(cart|my-account|checkout|addons|wp-json)") {
        return (pass);
    }

    # Pass requests with WordPress login cookies
    if (req.http.Cookie ~ "wordpress_logged_in_|wp-postpass_|comment_author_") {
        return (pass);
    }

    # Strip all other cookies for cacheable pages
    if (!(req.url ~ "^/wp-(admin|login)")) {
        unset req.http.Cookie;
    }
}

Install a cache purge plugin (Proxy Cache Purge or Varnish HTTP Purge) to automatically invalidate cached pages when content is updated in wp-admin.

Cache Invalidation: Purge and Ban

Cache invalidation is the hardest problem in caching. Varnish provides two mechanisms: purge and ban.

Purge removes a single cached object by URL. It is immediate and precise:

acl purge {
    "127.0.0.1";
    "10.0.0.0"/8;
}

sub vcl_recv {
    if (req.method == "PURGE") {
        if (!client.ip ~ purge) {
            return (synth(405, "Purge not allowed from this IP"));
        }
        return (purge);
    }
}

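Trigger a purge from the command line or your application. Send the site's hostname in the Host header, because Varnish hashes the host together with the URL and a purge for a different host will not find the object:

curl -X PURGE -H "Host: example.com" http://127.0.0.1:6081/catalog/product/view/id/42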

Ban invalidates every cached object matching an expression. It is more flexible but works differently: nothing is removed immediately. Each ban is added to a ban list, and cached objects are tested against that list lazily, when a request looks them up:

sub vcl_recv {
    if (req.method == "BAN") {
        if (!client.ip ~ purge) {
            return (synth(405, "Ban not allowed from this IP"));
        }
        ban("req.url ~ " + req.url);
        return (synth(200, "Banned"));
    }
}

Ban all product pages at once:

curl -X BAN http://127.0.0.1:6081/catalog/product/

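Ban everything carrying a specific tag. This relies on the Magento-generated VCL, which turns a PURGE request with an X-Magento-Tags-Pattern header into a ban on the X-Magento-Tags response header: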

curl -H "X-Magento-Tags-Pattern: catalog_product_42" \
  -X PURGE http://127.0.0.1:6081/

When to use which: Purge for single URL invalidation (product update, page edit). Ban for pattern-based invalidation (all products in a category, all pages by a specific author, everything tagged with a cache key).
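
When an application needs to purge several URLs after one content change, a small loop around curl is often enough. The sketch below assumes the PURGE handler shown earlier is in place. The names purge_paths, VARNISH_ADDR, PURGE_HOST, and DRY_RUN are hypothetical and exist only in this example. The Host header is sent explicitly because Varnish includes the host in its cache key by default.

```shell
# purge_paths: read URL paths on stdin, one per line, and send a PURGE for each.
# Set DRY_RUN=1 to print the curl commands instead of running them.
purge_paths() {
    addr="${VARNISH_ADDR:-127.0.0.1:6081}"
    host="${PURGE_HOST:-example.com}"
    while IFS= read -r path; do
        [ -n "$path" ] || continue          # skip blank lines
        if [ -n "${DRY_RUN:-}" ]; then
            printf 'curl -X PURGE -H "Host: %s" http://%s%s\n' "$host" "$addr" "$path"
        else
            # -f makes curl return non-zero on 4xx/5xx, e.g. the 405 from the ACL check
            curl -sf -o /dev/null -X PURGE -H "Host: $host" "http://$addr$path" ||
                printf 'purge failed: %s\n' "$path" >&2
        fi
    done
}

# Preview what would be sent for two product pages
printf '/catalog/product/view/id/42\n/catalog/product/view/id/43\n' | DRY_RUN=1 purge_paths
```

For more than a handful of URLs, a single ban on a pattern or tag is usually cheaper than many individual purges.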

Health Checks

Varnish can monitor backend health and route traffic away from failing servers. This is essential in production:

backend default {
    .host = "127.0.0.1";
    .port = "8080";
    .probe = {
        .url = "/health-check.php";
        .timeout = 2s;
        .interval = 5s;
        .window = 5;
        .threshold = 3;
    }
}

This configuration probes /health-check.php every 5 seconds. It considers the backend healthy if at least 3 of the last 5 probes succeed. If the backend is marked sick, Varnish will serve stale cached content (if grace mode is configured) instead of returning errors.
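
The window and threshold rule can be pictured as a sliding count over recent probe results. The sketch below is a simplified model of that rule, not Varnish's actual implementation: it ignores probe attributes such as .initial, and the function name backend_state is made up for illustration.

```shell
# backend_state WINDOW THRESHOLD RESULT...
# RESULT is 1 for a good probe and 0 for a failed one, oldest first.
# Prints "healthy" when at least THRESHOLD of the last WINDOW probes succeeded.
backend_state() {
    window="$1"; threshold="$2"; shift 2
    printf '%s\n' "$@" | tail -n "$window" |
        awk -v t="$threshold" '$1 == 1 { good++ }
            END { print ((good + 0 >= t) ? "healthy" : "sick") }'
}

backend_state 5 3  1 1 0 1 0          # healthy: 3 of the last 5 succeeded
backend_state 5 3  1 1 1 0 0 1 0 0    # sick: only 1 of the last 5 succeeded
```

With a 5-second interval, a backend that starts failing every probe is marked sick after the third failure, roughly 15 seconds in, because only 2 of the last 5 results are still good at that point.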

Create a simple health check endpoint on your backend:

<?php
// /health-check.php
try {
    $pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass');
    $pdo->query('SELECT 1');
    http_response_code(200);
    echo 'OK';
} catch (Exception $e) {
    http_response_code(503);
    echo 'FAIL';
}

For multi-backend setups, define separate backends with individual probes and use a director to distribute traffic:

import directors;

backend web1 {
    .host = "10.0.1.10";
    .port = "8080";
    .probe = { .url = "/health-check.php"; .interval = 5s; .threshold = 3; .window = 5; }
}

backend web2 {
    .host = "10.0.1.11";
    .port = "8080";
    .probe = { .url = "/health-check.php"; .interval = 5s; .threshold = 3; .window = 5; }
}

sub vcl_init {
    new cluster = directors.round_robin();
    cluster.add_backend(web1);
    cluster.add_backend(web2);
}

sub vcl_recv {
    set req.backend_hint = cluster.backend();
}

Grace Mode: Serving Stale Content During Downtime

Grace mode is one of Varnish’s most valuable production features. When a cached object expires and the backend is slow or unavailable, grace mode serves the stale cached version while Varnish fetches a fresh copy in the background.

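import std;

sub vcl_backend_response {
    # Allow objects to be served stale for up to 24 hours past their TTL
    set beresp.grace = 24h;

    # After grace ends, keep objects 8 more hours for conditional
    # revalidation with the backend (kept objects are not served stale)
    set beresp.keep = 8h;
}

sub vcl_recv {
    # req.grace caps the object's grace for this request:
    # use the full 24 hours only when the backend is sick
    if (!std.healthy(req.backend_hint)) {
        set req.grace = 24h;
    } else {
        set req.grace = 6h;
    }
}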

With this configuration, if your backend goes down for maintenance or crashes, Varnish continues serving cached pages for up to 24 hours. Users see the slightly stale cached version rather than an error page. When the backend recovers, Varnish begins fetching fresh content again.

Grace mode also helps during traffic spikes. When a cached object expires and 1,000 users request it simultaneously, Varnish serves the stale copy to all of them and sends a single background request to the backend to refresh it. This prevents the thundering herd problem that can bring backends to their knees when popular objects expire.
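
The TTL, grace, and keep windows follow one another on the object's timeline, which a few lines of arithmetic make concrete. The sketch below classifies an object by its age. The function name object_state and the state labels are invented for this example, and the sample values (2-hour TTL, 6 hours of grace, 8 hours of keep) are illustrative.

```shell
# object_state AGE TTL GRACE KEEP   (all values in seconds)
object_state() {
    awk -v age="$1" -v ttl="$2" -v grace="$3" -v keep="$4" 'BEGIN {
        if (age < ttl)                     print "fresh"            # normal cache hit
        else if (age < ttl + grace)        print "stale-servable"   # served while refetching
        else if (age < ttl + grace + keep) print "revalidate-only"  # kept for conditional fetches
        else                               print "gone"             # eligible for removal
    }'
}

object_state 3600  7200 21600 28800   # fresh
object_state 10000 7200 21600 28800   # stale-servable
object_state 30000 7200 21600 28800   # revalidate-only
object_state 60000 7200 21600 28800   # gone
```

Only the first two states produce a response from cache. An object in the keep window saves backend work through conditional requests, but it does not shield users from an outage.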

Monitoring: varnishstat and varnishlog

Varnish ships with powerful monitoring tools. Use them. Blind caching is worse than no caching because you cannot debug performance problems.

varnishstat shows real-time cache performance counters:

varnishstat

Key metrics to watch:

Metric	What It Tells You
MAIN.cache_hit	Requests served from cache
MAIN.cache_miss	Requests forwarded to backend
MAIN.cache_hitpass	Requests marked uncacheable
MAIN.n_object	Number of objects in cache
MAIN.n_expired	Objects expired from cache
MAIN.backend_conn	Connections to backend
MAIN.sess_dropped	Dropped client sessions (capacity issue)

A healthy cache should show a hit ratio above 80%. Calculate it:

hit_ratio = cache_hit / (cache_hit + cache_miss) * 100

If your hit ratio is below 60%, you are likely caching too few pages, stripping too few cookies, or setting TTLs too short.
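
To avoid doing the division by hand, you can pipe the one-shot output of varnishstat through a short awk filter. The sketch below assumes the usual varnishstat -1 layout, with the counter name in the first column and its value in the second. The function name hit_ratio is hypothetical.

```shell
# hit_ratio: read `varnishstat -1` output on stdin and print the hit ratio
# as a percentage with one decimal place, or "n/a" when there is no traffic.
hit_ratio() {
    awk '$1 == "MAIN.cache_hit"  { hit = $2 }
         $1 == "MAIN.cache_miss" { miss = $2 }
         END {
             total = hit + miss
             if (total == 0) { print "n/a"; exit }
             printf "%.1f\n", hit * 100 / total
         }'
}

# Usage (needs a running varnishd):
#   varnishstat -1 -f MAIN.cache_hit -f MAIN.cache_miss | hit_ratio
```

These counters are cumulative since Varnish started, so the result is a lifetime average. To see the effect of a VCL change, compare two readings taken some time apart rather than a single value.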

varnishlog shows detailed per-request logs:

# Show all cache misses
varnishlog -q "VCL_call eq MISS"

# Show requests taking more than 2 seconds
varnishlog -q "Timestamp:Resp[2] > 2.0"

# Show all requests to a specific URL
varnishlog -q "ReqURL ~ '/catalog/product/'"

# Show why objects are not being cached
varnishlog -q "VCL_call eq PASS" -g request

varnishtop shows the most frequent log entries:

# Most requested URLs
varnishtop -i ReqURL

# Most common response status codes
varnishtop -i RespStatus

# Most common User-Agents
varnishtop -I ReqHeader:User-Agent

For production monitoring, export Varnish metrics to Prometheus using the prometheus-varnish-exporter and visualize them in Grafana. This gives you historical data, alerting, and dashboards showing cache hit ratio trends, backend health, and request latency distributions.

Common Mistakes

These are the issues we see repeatedly across customer deployments.

Caching pages with Set-Cookie headers. If your backend sends a Set-Cookie header, Varnish will not cache the response by default. This is the number one reason for low hit ratios. Audit your backend and remove unnecessary cookies from cacheable pages. In VCL, strip Set-Cookie from responses you know should be cached:

sub vcl_backend_response {
    if (bereq.url !~ "^/(checkout|customer|cart)") {
        unset beresp.http.Set-Cookie;
    }
}

Not stripping cookies from requests. Even if the backend does not set cookies, the browser might send them. Google Analytics cookies (_ga, _gid), Facebook cookies (_fbp), and other tracking cookies bust the cache because Varnish's built-in VCL passes any request that carries a Cookie header to the backend without a cache lookup. Strip non-essential cookies in vcl_recv:

sub vcl_recv {
    if (req.http.Cookie) {
        # Remove tracking cookies, keep only session cookies
        set req.http.Cookie = regsuball(req.http.Cookie,
            "(^|;\s*)(_ga|_gid|_gat|_fbp|_fbc|__utm[a-z]+|__gads)=[^;]*", "");
        # Clean up empty cookie header
        if (req.http.Cookie ~ "^\s*$") {
            unset req.http.Cookie;
        }
    }
}
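
You can try the cookie regex on sample headers before deploying it. The sketch below runs the same pattern through sed, with [[:space:]] standing in for \s. The function name strip_tracking_cookies is hypothetical, and sed is only an approximation of Varnish's PCRE engine, so verify the behavior with varnishlog on a real instance.

```shell
# strip_tracking_cookies COOKIE_HEADER: remove analytics cookies, keep the rest.
strip_tracking_cookies() {
    printf '%s\n' "$1" |
        sed -E 's/(^|;[[:space:]]*)(_ga|_gid|_gat|_fbp|_fbc|__utm[a-z]+|__gads)=[^;]*//g'
}

strip_tracking_cookies 'PHPSESSID=abc; __utma=1.2'                # PHPSESSID=abc
strip_tracking_cookies '_ga=GA1.2.3; PHPSESSID=abc; _gid=GA1.2.9' # ; PHPSESSID=abc
strip_tracking_cookies '_ga=GA1.2.3; _gid=GA1.2.9'                # (empty line)
```

Two things stand out. When the first cookie is removed, a leading "; " is left behind, which most backends tolerate but which you may want to trim with one more substitution. And cookies with suffixed names, such as the _ga_<ID> cookies set by Google Analytics 4, do not match the pattern and would still reach the backend.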

Setting TTLs too short. A 60-second TTL means your backend is still handling the majority of traffic. For content that changes infrequently (product pages, blog posts, category listings), set TTLs of hours or days and use active purge/ban for invalidation when content changes.

Ignoring grace mode. Without grace mode, a backend crash means every user sees errors. With grace mode configured, users see the last cached version. There is no valid reason to run Varnish in production without grace mode enabled.

Not monitoring the hit ratio. If you do not measure it, you cannot improve it. A cache running at 40% hit ratio is mostly decorative. Check varnishstat regularly and investigate pages that are not being cached.

Caching personalized content. Cart contents, logged-in user menus, and wishlist indicators should never be cached as full pages. Use ESI (Edge Side Includes) to cache the static page shell and dynamically inject personalized fragments, or handle personalization entirely in the browser with JavaScript after the cached page loads.

Running Varnish with insufficient memory. If the cache is too small, objects get evicted before they are requested again, driving the hit ratio down. Monitor MAIN.n_lru_nuked in varnishstat — if this counter is climbing, your cache is too small and objects are being evicted to make room. Increase the -s malloc allocation.

Benchmarking the Difference

Before and after numbers from production deployments we have managed:

Metric	Without Varnish	With Varnish
Magento 2 category page (TTFB)	920ms	3ms
WordPress homepage (TTFB)	340ms	2ms
Concurrent users before degradation	200	4,000+
Backend CPU under load	85%	12%
Database queries per page view	150-300	0 (cache hit)

These numbers represent cache hits. Cache misses still go to the backend at full cost. The goal is to maximize your hit ratio so that the vast majority of requests never touch the backend.
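
To collect comparable numbers for your own site, you can measure time to first byte with curl and take the median of several runs. The sketch below is one simple way to do that. The function names ttfb and median are made up for this guide, and the measurements include network latency from wherever you run them.

```shell
# ttfb URL: print time-to-first-byte in seconds for one request.
ttfb() {
    curl -s -o /dev/null -w '%{time_starttransfer}\n' "$1"
}

# median: read one number per line on stdin and print the median.
median() {
    sort -n | awk '{ v[NR] = $1 }
        END {
            if (NR == 0) exit
            if (NR % 2) print v[(NR + 1) / 2]
            else        print (v[NR / 2] + v[NR / 2 + 1]) / 2
        }'
}

# Usage (needs network access to the site):
#   for i in 1 2 3 4 5; do ttfb https://example.com/; done | median
```

Run it once against a URL that bypasses the cache and once against a cached page. The median is less sensitive than the mean to the single slow request that fills the cache on the first miss.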

Next Steps

Once Varnish is running and your hit ratio is above 80%, consider these optimizations:

ESI (Edge Side Includes) for partial page caching. Cache the page layout for hours and inject dynamic blocks (cart count, logged-in user name) with short TTLs or pass-through.
Varnish modules (VMODs) for advanced functionality. The xkey VMOD enables tag-based invalidation by surrogate key, an alternative to the ban-based X-Magento-Tags purging that the generated Magento 2 VCL uses to invalidate by product ID or category ID.
Separate static asset handling. Let Nginx serve static files directly without involving Varnish. This frees Varnish memory and threads for dynamic content caching.
Connection pooling. Configure Varnish backend connection reuse to reduce TCP handshake overhead between Varnish and your web server.

Varnish is often the single highest-impact performance optimization for a server-rendered website. If your site runs on Magento, WordPress, or another PHP application without a full-page cache layer, you may be leaving an order-of-magnitude performance improvement on the table.