I have links looks like this
<div class="systemRequirementsMainBox">
<div class="systemRequirementsRamContent">
<span title="000 Plus Minimum RAM Requirement">1 GB</span> </div>
I'm trying to get 1 GB
from there. I tried
tt = [a['title'] for a in soup.select(".systemRequirementsRamContent span")]
for ram in tt:
if "RAM" in ram.split():
print (soup.string)
It outputs None
.
I tried a['text']
but it gives me KeyError. How can I fix this and what is my mistake?
You can use a css selector, pulling the span you want using the title text :
soup = BeautifulSoup("""<div class="systemRequirementsMainBox">
<div class="systemRequirementsRamContent">
<span title="000 Plus Minimum RAM Requirement">1 GB</span> </div>""", "xml")
print(soup.select_one("span[title*=RAM]").text)
That finds the span with a title attribute that contains RAM, it is equivalent to saying in python, if "RAM" in span["title"]
.
Or using find with re.compile
import re
print(soup.find("span", title=re.compile("RAM")).text)
To get all the data:
from bs4 import BeautifulSoup
r = requests.get("http://www.game-debate.com/games/index.php?g_id=21580&game=000%20Plus").content
soup = BeautifulSoup(r,"lxml")
cont = soup.select_one("div.systemRequirementsRamContent")
ram = cont.select_one("span")
print(ram["title"], ram.text)
for span in soup.select("div.systemRequirementsSmallerBox.sysReqGameSmallBox span"):
print(span["title"],span.text)
Which will give you:
000 Plus Minimum RAM Requirement 1 GB
000 Plus Minimum Operating System Requirement Win Xp 32
000 Plus Minimum Direct X Requirement DX 9
000 Plus Minimum Hard Disk Drive Space Requirement 500 MB
000 Plus GD Adjusted Operating System Requirement Win Xp 32
000 Plus GD Adjusted Direct X Requirement DX 9
000 Plus GD Adjusted Hard Disk Drive Space Requirement 500 MB
000 Plus Recommended Operating System Requirement Win Xp 32
000 Plus Recommended Hard Disk Drive Space Requirement 500 MB