How to set proxy in Playwright

In this article I will describe how to set a proxy in Playwright (Node.js version of Playwright).

Playwright is one of the best and most modern tools for automating browsers in 2024. It supports Chromium, Firefox and WebKit out of the box, and it talks to Chromium over the Chrome DevTools Protocol (CDP). It is open source and very well maintained. Its main use cases are UI test automation and web scraping. Setting up a proxy is useful for both, especially for web scraping, where high-quality proxies are crucial. The Playwright SDK is available for several programming languages (C#, Python, Java, Node.js); this blog post is dedicated to the Node.js version. By the way, if you can't decide between the Node.js and Python versions of Playwright, read my article about their differences.

A Playwright proxy can be set at two levels:

  • on the global browser instance, and
  • on a browser context

The simplest way to set proxy in Playwright

Chances are you need just this way!

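import { chromium } from 'playwright';

// Every request made by this browser instance goes through the proxy
const browser = await chromium.launch({
  proxy: {
    server: 'http://myproxy.com:3128',
    username: 'usr',
    password: 'pwd'
  }
});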

That's all you need!

If your proxy credentials come in the standard http://username:pw@host:port syntax, you need to convert them into the Playwright format. Here is how I usually do it in my Node.js projects:

My way to set up a proxy in Node.js Playwright projects

I don't recommend hardcoding the proxy into your JavaScript/TypeScript code. Even if you don't change your proxy every day, hardcoding it means committing sensitive credentials to the git repository, which is usually considered bad practice. Instead, I recommend using environment variables loaded from a .env file. I also recommend using the standard one-line syntax for the proxy:

PROXY_URL=http://user:pw@proxy-host:port
Put this into the .env file in the root of your project

Now, in the main file of your project, initialize dotenv (do not forget to install it with the npm i dotenv command) and convert the proxy URL:

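import dotenv from 'dotenv'
dotenv.config()

// Split a one-line proxy URL into the { server, username, password } object Playwright expects
function convertProxyToPlaywrightFormat(proxyUrl) {
    const url = new URL(proxyUrl);
    return {
        server: `${url.protocol}//${url.host}`,
        username: url.username,
        password: url.password
    };
}

// Read the proxy URL that dotenv loaded from the .env file
const proxyUrl = process.env.PROXY_URL;
const proxyOptions = convertProxyToPlaywrightFormat(proxyUrl);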

This way, a single env variable replaces the three (username, password, host) we would otherwise need just for the proxy.
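
One detail worth knowing about the conversion: the URL class keeps the username and password percent-encoded, so a password containing characters such as @ has to be decoded before it is handed to Playwright. Below is a minimal sketch of that variant. The helper name toPlaywrightProxy and the proxy.example.com host are made up for illustration; the function otherwise mirrors the conversion function above.

```javascript
// Hypothetical variant of the conversion function that also decodes
// percent-encoded credentials (e.g. "p%40ss" becomes "p@ss").
function toPlaywrightProxy(proxyUrl) {
    const url = new URL(proxyUrl);
    return {
        server: `${url.protocol}//${url.host}`,
        username: decodeURIComponent(url.username),
        password: decodeURIComponent(url.password),
    };
}

// Placeholder proxy URL, not a real proxy
const example = toPlaywrightProxy('http://user:p%40ss@proxy.example.com:3128');
console.log(example);
// → { server: 'http://proxy.example.com:3128', username: 'user', password: 'p@ss' }
```

If your credentials never contain special characters, the simpler version without decoding behaves the same.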

And here is the full code for using this proxy in Playwright:

import 'dotenv/config';
import { chromium } from 'playwright';

function convertProxyToPlaywrightFormat(proxyUrl) {
    const url = new URL(proxyUrl);
    return {
        server: `${url.protocol}//${url.host}`,
        username: url.username,
        password: url.password
    };
}

async function main() {
    const proxyUrl = process.env.PROXY_URL;

    if (!proxyUrl) {
        console.error('Proxy URL not found in .env file');
        process.exit(1);
    }

    const proxyOptions = convertProxyToPlaywrightFormat(proxyUrl);

    const browser = await chromium.launch({
        proxy: proxyOptions,
    });

    const page = await browser.newPage();
    await page.goto('http://example.com');

    await browser.close();
}

main();
A simple way to feed a proxy into Playwright via an environment variable

Playwright for web scraping: rotating proxies and retries

If you use rotating proxies for real-world scraping with Playwright, the code above won't work reliably. Proxies, even good, high-quality ones, are another moving part that reduces the overall reliability of the connection to the target website, so requests will inevitably fail a lot (this code can fail even without a proxy, but with a proxy it will fail much more often!). To mitigate this, having a retry strategy is crucial. Here is a very simple way to retry a Playwright navigation:

import 'dotenv/config';
import { chromium } from 'playwright';

function convertProxyToPlaywrightFormat(proxyUrl) {
    const url = new URL(proxyUrl);
    return {
        server: `${url.protocol}//${url.host}`,
        username: url.username,
        password: url.password
    };
}

async function tryNavigate(page, url, maxRetries = 3) {
    for (let attempt = 1; attempt <= maxRetries; attempt++) {
        try {
            await page.goto(url);
            return; // If successful, return without throwing an error
        } catch (error) {
            console.error(`Attempt ${attempt} failed: ${error.message}`);
            if (attempt === maxRetries) {
                throw error; // Rethrow the last error if all retries fail
            }
        }
    }
}

async function main() {
    const proxyUrl = process.env.PROXY_URL;

    if (!proxyUrl) {
        console.error('Proxy URL not found in .env file');
        process.exit(1);
    }

    const proxyOptions = convertProxyToPlaywrightFormat(proxyUrl);
    const browser = await chromium.launch({
        proxy: proxyOptions,
    });

    try {
        const page = await browser.newPage();
        await tryNavigate(page, 'http://example.com');
    } catch (error) {
        console.error(`Failed to navigate: ${error.message}`);
    } finally {
        await browser.close();
    }
}

main();
The same code with simple retry strategy
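
The loop above retries immediately after a failure. When the failure is caused by a temporarily overloaded proxy, waiting a bit between attempts often helps. Here is a rough sketch of the same idea with exponential backoff; the names retryWithBackoff, sleep and baseDelayMs are my own for this example, not part of Playwright.

```javascript
// Resolve after the given number of milliseconds
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: run an async action, retrying with a growing delay.
async function retryWithBackoff(action, { maxRetries = 3, baseDelayMs = 500 } = {}) {
    let lastError;
    for (let attempt = 1; attempt <= maxRetries; attempt++) {
        try {
            return await action(attempt);
        } catch (error) {
            lastError = error;
            console.error(`Attempt ${attempt} failed: ${error.message}`);
            if (attempt < maxRetries) {
                // Delay doubles each time: baseDelayMs, 2x, 4x, ...
                await sleep(baseDelayMs * 2 ** (attempt - 1));
            }
        }
    }
    // All attempts failed: rethrow the last error
    throw lastError;
}

// Usage inside main(), with a page created as in the code above:
// await retryWithBackoff(() => page.goto('http://example.com'));
```

Because the helper takes any async function, the same wrapper can be reused for other flaky steps, not only page.goto.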

When loading web pages using Playwright with a proxy, it is also often a good idea to reduce the number of loaded resources, for example by blocking requests by resource type.
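
Blocking by resource type can be sketched with a route handler like the one below. The set of blocked types and the name blockHeavyResources are my assumptions for this example; which types you can safely drop depends on the target website.

```javascript
// Resource types to skip. This selection is an assumption: adjust it to your target site.
const BLOCKED_RESOURCE_TYPES = new Set(['image', 'media', 'font', 'stylesheet']);

// Route handler: abort requests for blocked types, let everything else through.
async function blockHeavyResources(route) {
    if (BLOCKED_RESOURCE_TYPES.has(route.request().resourceType())) {
        await route.abort();
    } else {
        await route.continue();
    }
}

// Usage, with a page created as in the code above:
// await page.route('**/*', blockHeavyResources);
// await page.goto('http://example.com');
```

Register the handler before calling page.goto, otherwise the first requests go out unfiltered.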

Setting different proxies for one Playwright instance

Sometimes you want to reduce the number of browser instances while still using different proxies for different requests. This helps to reduce hardware and RAM usage (and Playwright is very resource intensive!). This is where Playwright contexts become useful: BrowserContexts provide a way to operate multiple independent browser sessions. It is important to understand that two browser contexts created from one Playwright browser share no cookies, cache or local storage: to websites, these two contexts essentially look like two different browsers (read more in the official Playwright docs). Let's say you have this kind of .env file:

PROXY_URL=http://user:pw@proxy-host:port
PROXY2_URL=http://user2:pw@proxy-host2:port

Here is how you can use these two different proxies in one Playwright instance:

import 'dotenv/config';
import { chromium } from 'playwright';

function convertProxyToPlaywrightFormat(proxyUrl) {
    const url = new URL(proxyUrl);
    return {
        server: `${url.protocol}//${url.host}`,
        username: url.username,
        password: url.password
    };
}

async function main() {
    const proxyUrl = process.env.PROXY_URL;
    const proxy2Url = process.env.PROXY2_URL;

    if (!proxyUrl || !proxy2Url) {
        console.error('One or both proxy URLs not found in .env file');
        process.exit(1);
    }

    const proxyOptions = convertProxyToPlaywrightFormat(proxyUrl);
    const proxy2Options = convertProxyToPlaywrightFormat(proxy2Url);

    const browser = await chromium.launch();

    // Create two different contexts with different proxies
    const context1 = await browser.newContext({ proxy: proxyOptions });
    const context2 = await browser.newContext({ proxy: proxy2Options });

    const page1 = await context1.newPage();
    const page2 = await context2.newPage();

    // Do something with both pages. 
    // Cookies and sessions are not shared between page1 and page2
    await page1.goto('http://example.com');
    await page2.goto('http://example.com');

    // Close the browser contexts
    await context1.close();
    await context2.close();

    // Close the browser
    await browser.close();
}

main();
Setting Playwright proxy on context level
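
With more than two proxies, writing one variable per context gets tedious. One possible approach, sketched below, is a small round-robin picker that hands out the next proxy URL each time a new context is created. The name createProxyRotator is made up for this example.

```javascript
// Hypothetical helper: returns a function that cycles through the list forever.
function createProxyRotator(proxyUrls) {
    if (proxyUrls.length === 0) {
        throw new Error('At least one proxy URL is required');
    }
    let index = 0;
    return () => {
        const next = proxyUrls[index];
        index = (index + 1) % proxyUrls.length; // wrap around at the end
        return next;
    };
}

// Usage, with browser and convertProxyToPlaywrightFormat as in the code above:
// const nextProxy = createProxyRotator([process.env.PROXY_URL, process.env.PROXY2_URL]);
// const context = await browser.newContext({
//     proxy: convertProxyToPlaywrightFormat(nextProxy()),
// });
```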

Thank you for reading this writeup! If you enjoyed this article, you might also be interested in how I investigated Cloudflare anti-scraping protections and bypassed them, and in how to download a PDF in Playwright.