Getting started with Scraping Browser

Learn about Bright Data’s Scraping Browser solution, how to get started, and some tips for best use.

Bright Data’s Scraping Browser

Scraping Browser is one of our proxy-unlocking solutions. It lets you focus on your multi-step data collection from browsers while we take care of the full proxy and unblocking infrastructure for you, including CAPTCHA solving.

You can easily access and navigate target websites via browser automation libraries such as Puppeteer, Playwright, and Selenium (see our full list here), and interact with your target site’s HTML in a multi-step manner to extract the data you need.

Behind the scenes, Scraping Browser combines our complete proxy infrastructure with our dynamic unlocking capabilities to get you the exact data you need, wherever it may be.

Best for

    • Puppeteer, Playwright, and Selenium integration

    • Navigating through a website, clicking buttons, scrolling to load a full page, hovering, solving CAPTCHAs, and more (see the sketch after this list)

    • Teams that don’t have a reliable browser unblocking infrastructure in-house 
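
For example, a typical multi-step interaction built on the Puppeteer quick-start example below might look like the following. This is a minimal sketch: the URL and selectors are hypothetical placeholders, and `page` is assumed to come from a connected Scraping Browser session.

    // Hypothetical multi-step flow; the URL and selectors below are placeholders.
    // `page` is assumed to come from a connected Scraping Browser session (see quick start).
    await page.goto('https://example.com/products', { timeout: 2 * 60 * 1000 });
    await page.hover('#categories-menu');    // hover to reveal a dropdown menu
    await page.click('#load-more');          // click a button on the page
    // scroll to the bottom to trigger lazy loading of the full page
    await page.evaluate(() => window.scrollTo(0, document.body.scrollHeight));
    await page.waitForSelector('.product');  // wait for the newly loaded content to render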

Quick start

  1. Sign in to your Bright Data control panel
    • If you haven’t yet signed up for Bright Data, you can sign up for free, and when adding your payment method, you’ll receive a $5 credit to get you started!
  2. Create your new Scraping Browser proxy
    • Navigate to ‘My Proxies’ page, and under ‘Scraping Browser’ click ‘Get started’
      Note: If you already have an active proxy, simply choose ‘Add proxy’ at the top right
  3. In the ‘Create a new proxy’ page, enter a name for your new Scraping Browser proxy zone
    Note: Please select a meaningful name, as the zone's name cannot be changed once created

  4. To create and save your proxy, click ‘Add proxy’

    A note on Account verification:

    If you haven’t yet added a payment method, you’ll be prompted to add one at this point in order to verify your account. If it’s your first time using Bright Data, then you’ll also receive a $5 bonus credit to get you started!

    Be advised: You will not be charged anything at this point; this step is solely for verification purposes.

     

  5. Obtain your API credentials

    Once your account is verified, you can view your API credentials.

    In your proxy zone’s ‘Access parameters’ tab, you’ll find your API credentials which include your Username and Password. You will use them to launch your first Scraping Browser session below. 
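    The Username and Password form the AUTH string used in all of the examples below. The username is zone-scoped; based on Bright Data’s standard credential format, it typically looks like the following (the customer ID and zone name here are placeholders; copy the exact values from your zone’s ‘Access parameters’ tab):

    // Placeholders only - use the exact values from your zone's 'Access parameters' tab
    const AUTH = 'brd-customer-<CUSTOMER_ID>-zone-<ZONE_NAME>:<ZONE_PASSWORD>';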
  6. Create your first Scraping Browser session in Node.js, Python, or C# (see support for other languages)

    Choose your preferred language and browser navigation library below to continue:

  • Install puppeteer-core via npm

    npm i puppeteer-core

    Try running the example script below (swap in your credentials, zone, and target URL):

    const puppeteer = require('puppeteer-core');
    const AUTH = 'USER:PASS';
    const SBR_WS_ENDPOINT = `wss://${AUTH}@brd.superproxy.io:9222`;

    async function main() {
        console.log('Connecting to Scraping Browser...');
        const browser = await puppeteer.connect({
            browserWSEndpoint: SBR_WS_ENDPOINT,
        });
        try {
            console.log('Connected! Navigating...');
            const page = await browser.newPage();
            await page.goto('https://example.com', { timeout: 2 * 60 * 1000 });
            console.log('Taking screenshot to page.png');
            await page.screenshot({ path: './page.png', fullPage: true });
            console.log('Navigated! Scraping page content...');
            const html = await page.content();
            console.log(html);
            // CAPTCHA solving: if you are likely to encounter a CAPTCHA on your target page,
            // add the following lines to get the status of Scraping Browser's automatic CAPTCHA solver.
            // Note 1: if no CAPTCHA was found, it returns a not_detected status after detectTimeout.
            // Note 2: once a CAPTCHA is solved, if there is a form to submit, it is submitted by default.
            // const client = await page.target().createCDPSession();
            // const { status } = await client.send('Captcha.solve', { detectTimeout: 30 * 1000 });
            // console.log(`Captcha solve status: ${status}`);
        } finally {
            await browser.close();
        }
    }

    if (require.main === module) {
        main().catch(err => {
            console.error(err.stack || err);
            process.exit(1);
        });
    }

    Run the script:

    node script.js

    Automatically opening devtools to view your live browser session

    The Scraping Browser Debugger enables developers to inspect, analyze, and fine-tune their code alongside Chrome DevTools, resulting in better control, visibility, and efficiency. You can integrate the following code snippet to launch DevTools automatically for every session:

    // Node.js Puppeteer - launch devtools locally

    const { exec } = require('child_process');
    const chromeExecutable = 'google-chrome';

    const delay = ms => new Promise(resolve => setTimeout(resolve, ms));
    const openDevtools = async (page, client) => {
        // get the current frameId
        const frameId = page.mainFrame()._id;
        // get the devtools URL from Scraping Browser
        const { url: inspectUrl } = await client.send('Page.inspect', { frameId });
        // open the devtools URL in local chrome
        exec(`"${chromeExecutable}" "${inspectUrl}"`, error => {
            if (error) {
                throw new Error('Unable to open devtools: ' + error);
            }
        });
        // wait for the devtools UI to load
        await delay(5000);
    };

    const page = await browser.newPage();
    const client = await page.target().createCDPSession();
    await openDevtools(page, client);
    await page.goto('http://example.com');
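
    Note that this snippet assumes an existing `browser` connection from the Puppeteer example above, and that `google-chrome` resolves to a local Chrome executable on your machine; `Page.inspect` is the CDP command through which Scraping Browser returns the devtools URL for the remote session.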
  • Install playwright via npm

    npm i playwright

    Try running the example script below (swap in your credentials, zone, and target URL):

    const pw = require('playwright');
    const AUTH = 'USER:PASS';
    const SBR_CDP = `wss://${AUTH}@brd.superproxy.io:9222`;

    async function main() {
        console.log('Connecting to Scraping Browser...');
        const browser = await pw.chromium.connectOverCDP(SBR_CDP);
        try {
            console.log('Connected! Navigating...');
            const page = await browser.newPage();
            await page.goto('https://example.com', { timeout: 2 * 60 * 1000 });
            console.log('Taking screenshot to page.png');
            await page.screenshot({ path: './page.png', fullPage: true });
            console.log('Navigated! Scraping page content...');
            const html = await page.content();
            console.log(html);
        } finally {
            await browser.close();
        }
    }

    if (require.main === module) {
        main().catch(err => {
            console.error(err.stack || err);
            process.exit(1);
        });
    }
  • Install selenium via npm

    npm i selenium-webdriver

    Try running the example script below (swap in your credentials, zone, and target URL):

    const fs = require('fs/promises');
    const { Builder, Browser } = require('selenium-webdriver');

    const AUTH = 'USER:PASS';
    const SBR_WEBDRIVER = `https://${AUTH}@brd.superproxy.io:9515`;

    async function main() {
        const driver = await new Builder()
            .forBrowser(Browser.CHROME)
            .usingServer(SBR_WEBDRIVER)
            .build();
        try {
            console.log('Connected! Navigating...');
            await driver.get('https://example.com');
            console.log('Taking page screenshot to file page.png');
            const screenshot = await driver.takeScreenshot();
            await fs.writeFile('./page.png', Buffer.from(screenshot, 'base64'));
            console.log('Navigated! Scraping page content...');
            const html = await driver.getPageSource();
            console.log(html);
        } finally {
            await driver.quit();
        }
    }

    if (require.main === module) {
        main().catch(err => {
            console.error(err.stack || err);
            process.exit(1);
        });
    }

    Run the script:

    node script.js
  • Install playwright via pip

    pip3 install playwright

    Try running the example script below (swap in your credentials, zone, and target URL):

    import asyncio
    from playwright.async_api import async_playwright

    AUTH = 'USER:PASS'
    SBR_WS_CDP = f'wss://{AUTH}@brd.superproxy.io:9222'

    async def run(pw):
        print('Connecting to Scraping Browser...')
        browser = await pw.chromium.connect_over_cdp(SBR_WS_CDP)
        try:
            print('Connected! Navigating...')
            page = await browser.new_page()
            await page.goto('https://example.com', timeout=2*60*1000)
            print('Taking page screenshot to file page.png')
            await page.screenshot(path='./page.png', full_page=True)
            print('Navigated! Scraping page content...')
            html = await page.content()
            print(html)
            # CAPTCHA solving: if you are likely to encounter a CAPTCHA on your target page,
            # add the following lines to get the status of Scraping Browser's automatic CAPTCHA solver.
            # Note 1: if no CAPTCHA was found, it returns a not_detected status after detectTimeout.
            # Note 2: once a CAPTCHA is solved, if there is a form to submit, it is submitted by default.
            # client = await page.context.new_cdp_session(page)
            # solve_result = await client.send('Captcha.solve', {'detectTimeout': 30*1000})
            # status = solve_result['status']
            # print(f'Captcha solve status: {status}')
        finally:
            await browser.close()

    async def main():
        async with async_playwright() as playwright:
            await run(playwright)

    if __name__ == '__main__':
        asyncio.run(main())

    Run the script:

    python scrape.py
  • Install selenium via pip

    pip3 install selenium

    Try running the example script below (swap in your credentials, zone, and target URL):

    from selenium.webdriver import Remote, ChromeOptions
    from selenium.webdriver.chromium.remote_connection import ChromiumRemoteConnection

    AUTH = 'USER:PASS'
    SBR_WEBDRIVER = f'https://{AUTH}@brd.superproxy.io:9515'

    def main():
        print('Connecting to Scraping Browser...')
        sbr_connection = ChromiumRemoteConnection(SBR_WEBDRIVER, 'goog', 'chrome')
        with Remote(sbr_connection, options=ChromeOptions()) as driver:
            print('Connected! Navigating...')
            driver.get('https://example.com')
            print('Taking page screenshot to file page.png')
            driver.get_screenshot_as_file('./page.png')
            print('Navigated! Scraping page content...')
            html = driver.page_source
            print(html)

    if __name__ == '__main__':
        main()

    Run the script:

    python scrape.py
  • Install PuppeteerSharp

    dotnet add package PuppeteerSharp

    Try running the example script below (swap in your credentials, zone, and target URL):

    using PuppeteerSharp;
    using System.Net.WebSockets;
    using System.Text;

    var AUTH = "USER:PASS";
    var SBR_WS_ENDPOINT = $"wss://{AUTH}@brd.superproxy.io:9222";

    var Connect = (string ws) => Puppeteer.ConnectAsync(new()
    {
        BrowserWSEndpoint = ws,
        WebSocketFactory = async (url, options, cToken) =>
        {
            var socket = new ClientWebSocket();
            var authBytes = Encoding.UTF8.GetBytes(new Uri(ws).UserInfo);
            var authHeader = "Basic " + Convert.ToBase64String(authBytes);
            socket.Options.SetRequestHeader("Authorization", authHeader);
            socket.Options.KeepAliveInterval = TimeSpan.Zero;
            await socket.ConnectAsync(url, cToken);
            return socket;
        },
    });

    Console.WriteLine("Connecting to Scraping Browser...");
    using var browser = await Connect(SBR_WS_ENDPOINT);
    Console.WriteLine("Connected! Navigating...");
    var page = await browser.NewPageAsync();
    await page.GoToAsync("https://example.com", new NavigationOptions()
    {
        Timeout = 2 * 60 * 1000,
    });
    Console.WriteLine("Taking page screenshot to file page.png");
    await page.ScreenshotAsync("./page.png", new ()
    {
        FullPage = true,
    });
    Console.WriteLine("Navigated! Scraping page content...");
    var html = await page.GetContentAsync();
    Console
    .WriteLine(html);
  • Install Microsoft.Playwright

    dotnet add package Microsoft.Playwright

    Try running the example script below (swap in your credentials, zone, and target URL):

    using Microsoft.Playwright;

    var AUTH = "USER:PASS";
    var SBR_CDP = $"wss://{AUTH}@brd.superproxy.io:9222";

    Console.WriteLine("Connecting to Scraping Browser...");
    using var pw = await Playwright.CreateAsync();
    await using var browser = await pw.Chromium.ConnectOverCDPAsync(SBR_CDP);
    Console.WriteLine("Connected! Navigating...");
    var page = await browser.NewPageAsync();
    await page.GotoAsync("https://example.com", new()
    {
        Timeout = 2 * 60 * 1000,
    });
    Console.WriteLine("Taking page screenshot to file page.png");
    await page.ScreenshotAsync(new()
    {
        Path = "./page.png",
        FullPage = true,
    });
    Console.WriteLine("Navigated! Scraping page content...");
    var html = await page.ContentAsync();
    Console
    .WriteLine(html);
  • Install Selenium.WebDriver

    dotnet add package Selenium.WebDriver

    Try running the example script below (swap in your credentials, zone, and target URL):

    using OpenQA.Selenium;
    using OpenQA.Selenium.Chrome;
    using OpenQA.Selenium.Remote;

    var AUTH = "USER:PASS";
    var SBR_WEBDRIVER = $"https://{AUTH}@brd.superproxy.io:9515";

    Console.WriteLine("Connecting to Scraping Browser...");
    var options = new ChromeOptions();
    using var driver = new RemoteWebDriver(new Uri(SBR_WEBDRIVER), options);
    Console.WriteLine("Connected! Navigating to https://example.com...");
    driver.Navigate().GoToUrl("https://example.com");

    Console.WriteLine("Taking page screenshot to file page.png");
    var screenshot = driver.GetScreenshot();
    screenshot.SaveAsFile("./page.png");

    Console.WriteLine("Navigated! Scraping page content...");
    var html = driver.PageSource;
    Console.WriteLine(html);

Additional Info and Resources

How To

Find out more about common library navigation functions, with examples for browser automation in general and for Scraping Browser in particular.

FAQ

Check out our frequently asked questions regarding Scraping Browser.

Troubleshooting

Tackle navigation challenges and learn more about our Scraping Browser Debugger, which enables you to inspect, analyze, and fine-tune your code alongside Chrome DevTools.

Pricing & Billing

See more about how your sessions are calculated and priced.

Scraping Browser Intro Video

 
