When you automatically collect public data from professional networking sites, you may receive an error response followed by a JavaScript redirect to an endpoint that sets cookies blocking your IP for several hours. You may then be redirected again to the login page. If the login flow completes, you have been granted access to the requested resource. Below are some tips for collecting public data successfully with Bright Data:

  • Use dedicated residential IPs
  • Rotate between as many unblocked IPs as possible
  • Make sure your requests appear to come from a browser
  • Make sure cookie behavior matches that of a real user
  • Rotate the User-Agent header
  • Periodically throttle (slow down) your request rate
  • Some websites set trap links; make sure to avoid them
  • Crawl pages in the same order that a real user would (e.g. do a search before viewing a profile)
  • Use a Referer header that matches your crawling pattern
  • Randomize crawling patterns; do not use the same one every time
  • Use remote DNS
  • Use a headless browser (a mismatch between the User-Agent you send and the browser actually in use can get you flagged)
  • Use the Proxy Manager and check the Success Rate of your requests
  • Ask your success manager about using the Proxy Manager in debug mode
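
Several of the tips above (one User-Agent per session, a Referer that matches the crawl path, search before profile, randomized order, throttling, and per-session cookies) can be combined in a small request planner. The following is a minimal sketch using only the Python standard library, not an official Bright Data client. The names `USER_AGENTS`, `plan_session`, and `run_session` are hypothetical, as are the default delay values, the example URLs, and the proxy URL, which you would replace with the address and credentials from your own zone.

```python
import http.cookiejar
import random
import time
import urllib.error
import urllib.request

# Hypothetical pool of User-Agent strings; keep it consistent with the
# browser you actually run, since a mismatch can get you flagged.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/17.0 Safari/605.1.15",
    "Mozilla/5.0 (X11; Linux x86_64; rv:121.0) Gecko/20100101 Firefox/121.0",
]


def plan_session(search_url, profile_urls, rng, base_delay=4.0, jitter=3.0):
    """Plan one session: the search page first, then profiles in random order."""
    user_agent = rng.choice(USER_AGENTS)  # rotate per session, not per request
    profiles = list(profile_urls)
    rng.shuffle(profiles)  # avoid repeating the same crawl pattern
    steps = [{"url": search_url, "headers": {"User-Agent": user_agent}, "delay": 0.0}]
    for url in profiles:
        steps.append({
            "url": url,
            # The Referer mirrors how a user would arrive from the search page.
            "headers": {"User-Agent": user_agent, "Referer": search_url},
            # Throttle with a randomized pause before each profile request.
            "delay": base_delay + rng.uniform(0.0, jitter),
        })
    return steps


def run_session(steps, proxy_url, timeout=30):
    """Send the planned requests through a proxy with a fresh cookie jar."""
    jar = http.cookiejar.CookieJar()  # cookies persist within this session only
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url}),
        urllib.request.HTTPCookieProcessor(jar),
    )
    statuses = []
    for step in steps:
        time.sleep(step["delay"])
        request = urllib.request.Request(step["url"], headers=step["headers"])
        try:
            with opener.open(request, timeout=timeout) as response:
                statuses.append(response.status)
        except urllib.error.HTTPError as err:
            statuses.append(err.code)  # record blocks and errors too
    return statuses
```

A session could be planned with `plan_session("https://example.com/search?q=engineer", profile_urls, random.Random())` and then passed to `run_session` together with your proxy URL. The share of 2xx codes in the returned list gives a rough success rate to compare against what the Proxy Manager reports. Planning is kept separate from sending so that the crawl pattern can be inspected without making any network calls.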
