Selenium - python. how to capture network traffic's response

Rich Rajah picture Rich Rajah · Oct 3, 2018 · Viewed 19.7k times · Source

I am using python Django to create a web app. i am using selenium to launch a headless browser(phantomjs) and making some clicks till i reach a particular page. I wish to capture network traffic and get the response of a particular network call. This network call actually holds a html doc as it's response.

Any way to achieve this ?

Answer

hellpanderr picture hellpanderr · Oct 4, 2018

You can get access to browser or chromedriver logs, they are slightly different when it comes to network responses. The browser log is called performance and the driver log is called driver. They return a json-like object, which you can parse to extract events with Network methods inside them:

{'level': 'INFO',
  'message': '{"message":{"method":"Page.frameStoppedLoading","params":{"frameId":"FB10764A3ABF7FFC83110C39C5F7BF77"}},"webview":"C2D13BD13CF743B6D0695B35E9CC935C"}',
  'timestamp': 1538607113832},
 {'level': 'INFO',
  'message': '{"message":{"method":"Page.frameDetached","params":{"frameId":"FB10764A3ABF7FFC83110C39C5F7BF77"}},"webview":"C2D13BD13CF743B6D0695B35E9CC935C"}',
  'timestamp': 1538607113838},
 {'level': 'INFO',
  'message': '{"message":{"method":"Network.requestWillBeSent","params":{"documentURL":"https://stackoverflow.com/questions/52633697/selenium-python-how-to-capture-network-traffics-response","frameId":"C2D13BD13CF743B6D0695B35E9CC935C","hasUserGesture":false,"initiator":{"type":"other"},"loaderId":"5331BFDC4F466FCED920CFC9F033D2EC","request":{"headers":{"Upgrade-Insecure-Requests":"1","User-Agent":"Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Safari/537.36"},"initialPriority":"VeryHigh","method":"GET","mixedContentType":"none","referrerPolicy":"no-referrer-when-downgrade","url":"https://stackoverflow.com/questions/52633697/selenium-python-how-to-capture-network-traffics-response"},"requestId":"5331BFDC4F466FCED920CFC9F033D2EC","timestamp":104499.729,"type":"Document","wallTime":1538607113.838206}},"webview":"C2D13BD13CF743B6D0695B35E9CC935C"}',
  'timestamp': 1538607113839},...}

You need to enable logging in DesiredCapabilities and then parse it using JSON module:

import json
from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

caps = DesiredCapabilities.CHROME
caps['goog:loggingPrefs'] = {'performance': 'ALL'}
driver = webdriver.Chrome(desired_capabilities=caps)
driver.get('https://stackoverflow.com/questions/52633697/selenium-python-how-to-capture-network-traffics-response')

def process_browser_log_entry(entry):
    response = json.loads(entry['message'])['message']
    return response

browser_log = driver.get_log('performance') 
events = [process_browser_log_entry(entry) for entry in browser_log]
events = [event for event in events if 'Network.response' in event['method']]

I don't know if you can get access to response data itself using this, but you can get a url of the response.

Another option is to use a library like selenium-wire.

UPDATE 2020-10-07 ⬇

As @Roey B and @Inactivist explain in the comments, you can access response body using Network.getResponseBody command:

driver.execute_cdp_cmd('Network.getResponseBody', {'requestId': events[0]["params"]["requestId"]})