How to use CSS selectors to retrieve specific links inside a given class using BeautifulSoup?

Flecha · Jul 17, 2014 · Viewed 65.9k times

I am new to Python and am learning it for scraping purposes. I am using BeautifulSoup to collect links (i.e. the href of 'a' tags). I am trying to collect the links under the "UPCOMING EVENTS" tab of the site http://allevents.in/lahore/. I used Firebug to inspect the element and get its CSS path, but the code below returns nothing. I am looking for a fix, and also for suggestions on how to choose proper CSS selectors to retrieve the desired links from any site. I wrote this piece of code:

from bs4 import BeautifulSoup
import requests

url = "http://allevents.in/lahore/"
r = requests.get(url)
data = r.text

# Name the parser explicitly so the result does not depend on which parsers are installed
soup = BeautifulSoup(data, "html.parser")
for link in soup.select('html body div.non-overlay.gray-trans-back div.container div.row div.span8 div#eh-1748056798.events-horizontal div.eh-container.row ul.eh-slider li.h-item div.h-meta div.title a[href]'):
    print(link.get('href'))

Answer

Lavish Saluja · Sep 28, 2017
soup.select('div')
# All elements named <div>

soup.select('#author')
# The element with an id attribute of author

soup.select('.notice')
# All elements that use a CSS class attribute named notice

soup.select('div span')
# All elements named <span> that are within an element named <div>

soup.select('div > span')
# All elements named <span> that are directly within an element named <div>,
# with no other element in between

soup.select('input[name]')
# All elements named <input> that have a name attribute with any value

soup.select('input[type="button"]')
# All elements named <input> that have an attribute named type with value button
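
The selector patterns listed above can be tried out on a small inline document. This is a minimal sketch: the HTML string is made up for illustration and does not come from any real site, and "html.parser" is simply the parser bundled with Python.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, made up purely to exercise the selectors above.
html = """
<div id="author" class="notice">
  <span>direct child</span>
  <p><span>nested deeper</span></p>
  <input name="q" type="text">
  <input type="button" value="Go">
</div>
"""

soup = BeautifulSoup(html, "html.parser")

print(len(soup.select("div")))                   # elements named <div>
print(soup.select("#author")[0].name)            # the element with id="author"
print(len(soup.select(".notice")))               # elements with class "notice"
print(len(soup.select("div span")))              # <span> anywhere inside a <div>
print(len(soup.select("div > span")))            # <span> that is a direct child of a <div>
print(len(soup.select("input[name]")))           # <input> with a name attribute
print(len(soup.select('input[type="button"]')))  # <input> with type="button"
```

Note that select() always returns a list, so an empty list means nothing matched the selector. Comparing the counts for 'div span' and 'div > span' shows the difference between the descendant and the direct-child combinators.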

You may also be interested in this book.
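
Applied to the question, a shorter selector is usually more robust than a full path copied from a browser inspector. The sketch below rests on two unverified assumptions: that the numeric id in the original selector (eh-1748056798) is generated per page load, and that the class names in the question still describe the page. The HTML fragment and the example.com URLs are hypothetical stand-ins, not the site's actual markup.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment that reuses class names from the question's selector;
# the real page's markup may differ.
html = """
<div id="eh-1234567890" class="events-horizontal">
  <ul class="eh-slider">
    <li class="h-item"><div class="h-meta"><div class="title">
      <a href="http://example.com/event-1">Event 1</a></div></div></li>
    <li class="h-item"><div class="h-meta"><div class="title">
      <a href="http://example.com/event-2">Event 2</a></div></div></li>
  </ul>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Anchor on a few stable-looking classes instead of the full path from <html>,
# and leave out the numeric id, which is assumed to change between page loads.
links = [a["href"] for a in soup.select("ul.eh-slider li.h-item div.title a[href]")]
print(links)
```

If a short selector still returns an empty list on the real page, it may be worth checking r.text directly: an inspector shows the DOM after scripts have run, which can differ from the HTML that requests downloads.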