I've done this with BeautifulSoup but it's a bit cumbersome, and I'm trying to figure out if I can do it directly with Selenium.
Let's say I have the following HTML, which repeats multiple times in the page source with identical elements but different contents:
<div class="person">
  <div class="title">
    <a href="http://www.url.com/johnsmith/">John Smith</a>
  </div>
  <div class="company">
    <a href="http://www.url.com/company/">SalesForce</a>
  </div>
</div>
I need to build a dictionary where the entry for each person looks like:
dict = {'name' : 'John Smith', 'company' : 'SalesForce'}
I can easily get Selenium to produce a list of the contents of each top level element by doing:
driver.find_elements_by_class_name('person')
But then I can't iterate through the list because the above method doesn't narrow the scope/source to just the contents of that element.
If I try to do something like this:
people = driver.find_elements_by_class_name('person')
for person in people:
    print(person.find_element_by_xpath('//div[@class="title"]//a').text)
I just get the same name over and over again.
I need to do this group by group because in my case, iterating through the whole page and appending each tag individually won't work (there's infinite scrolling, so it would be really inefficient).
Does anyone know whether it's possible to do this directly in Selenium, and if so how?
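Use find_elements_by_class_name() to get all the person blocks, then call find_element_by_xpath() on each block to get its title and company. Start each XPath with a dot (.//) so that it is evaluated relative to that element. An expression that begins with // searches from the document root no matter which element you call it on, which is why you got the same name over and over: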
persons = []
for person in driver.find_elements_by_class_name('person'):
    title = person.find_element_by_xpath('.//div[@class="title"]/a').text
    company = person.find_element_by_xpath('.//div[@class="company"]/a').text
    persons.append({'title': title, 'company': company})
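On recent Selenium releases the find_element_by_* helpers were deprecated in Selenium 4 and later removed, so the same loop would look roughly like the sketch below. It has not been run against your page. collect_people is a name made up here, the fallback By class exists only so the snippet can be tried without Selenium installed, and the dictionary uses the 'name' key from the question instead of 'title'.

```python
try:
    from selenium.webdriver.common.by import By
except ImportError:
    # Assumption: stand-in with the same locator strings Selenium uses,
    # only so this sketch can be exercised without Selenium installed.
    class By:
        CLASS_NAME = "class name"
        XPATH = "xpath"


def collect_people(driver):
    """Return one {'name': ..., 'company': ...} dict per div.person block."""
    people = []
    for person in driver.find_elements(By.CLASS_NAME, "person"):
        # The leading "." anchors the XPath at this element,
        # not at the document root.
        name = person.find_element(By.XPATH, './/div[@class="title"]/a').text
        company = person.find_element(By.XPATH, './/div[@class="company"]/a').text
        people.append({"name": name, "company": company})
    return people
```

With a live driver you would call collect_people(driver) after each scroll and only process the blocks you have not seen yet, since every lookup stays inside its own person element.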