Website using DataDome gets captcha blocked while scraping using Selenium and Python

Mikycid picture Mikycid · Jun 4, 2020 · Viewed 7.3k times · Source

I'm actually trying to scrape some car datas from different websites, i've been using selenium with chromebrowser but some websites actually block selenium with captcha validation(example: https://www.leboncoin.fr/), and this in just 1 or 2 requests. I tried changing $_cdc in the chromebrowser but this didn't resolve the problem, and I've been using those options for the chromebrowser

user_agent = 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.97 Safari/537.36'
options = webdriver.ChromeOptions()
options.add_argument(f'user-agent={user_agent}')
options.add_argument('start-maximized')
options.add_argument('disable-infobars')
options.add_argument('--profile-directory=Default')
options.add_argument("--incognito")
options.add_argument("--disable-plugins-discovery")
options.add_experimental_option("excludeSwitches", ["ignore-certificate-errors", "safebrowsing-disable-download-protection", "safebrowsing-disable-auto-update", "disable-client-side-phishing-detection"])
options.add_argument('--disable-extensions')
browser = webdriver.Chrome(chrome_options=options)

browser.delete_all_cookies()

browser.set_window_size(800,800)

browser.set_window_position(0,0)

The website I'm trying to scrape uses DataDome for bot security, any clue ?

Answer

AzyCrw4282 picture AzyCrw4282 · Jun 4, 2020

It could be happening due to a myriad of reasons. Try going through the answer here that gives someway in you can prevent this problem.

A simple solution that worked for me sometimes is to use Waits/Sleep calls in selenium, see here from the docs about Waits. Or sleep calls can be done like so

Import time
time.sleep(2)