I am using urllib in Python to get stock prices from Yahoo Finance. Here is my code so far:
import urllib
import re
name = raw_input(">")
htmlfile = urllib.urlopen("http://finance.yahoo.com/q?s=%s" % name)
htmltext = htmlfile.read()
# The problem area
regex = '<span id="yfs_l84_%s">(.+?)</span>' % name
pattern = re.compile(regex)
price = re.findall(pattern, htmltext)
print price
So I enter a ticker symbol, and the stock price should come out. But so far I can't get it to display a price, just an empty list []. I have added a comment where I believe the problem is. Any suggestions? Thanks.
Try escaping the forward slash in your regex, although Python's re module does not strictly require it, since / is not a special character there. Change your regex from:
<span id="yfs_l84_%s">(.+?)</span>
to
<span id="yfs_l84_goog">(.+?)<\/span>
This should fix your problem, assuming you enter the company's ticker symbol exactly as it appears in the page's id, e.g. goog for Google.
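One way to check the pattern without fetching the page is to run it against a hand-written snippet. The sketch below assumes the page markup looks like the id shown above; `sample_html` and `find_price` are made-up names for illustration, and the price is invented. It compares the escaped and unescaped closing tags and shows that the match is case-sensitive on the ticker symbol.

```python
import re

# Hypothetical snippet imitating the markup the question targets.
sample_html = '<div><span id="yfs_l84_goog">1,234.56</span></div>'

def find_price(html, symbol):
    """Return every captured price for the given ticker symbol."""
    # re.escape guards against regex metacharacters in the symbol.
    pattern = re.compile(r'<span id="yfs_l84_%s">(.+?)</span>' % re.escape(symbol))
    return pattern.findall(html)

plain = find_price(sample_html, "goog")      # unescaped closing tag
escaped = re.findall(r'<span id="yfs_l84_goog">(.+?)<\/span>', sample_html)
upper = find_price(sample_html, "GOOG")      # id in the snippet is lowercase

print(plain)
print(escaped)
print(upper)
```

If the third result is empty while the first two are not, normalizing the input (for example with `name.lower()`) before building the pattern is worth trying.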
That said, regex is a poor fit for pulling data out of HTML. As suggested by others, explore BeautifulSoup, a Python library built for exactly that. With BeautifulSoup your code can be as simple as:
from bs4 import BeautifulSoup
import requests
name = raw_input('>')
url = 'http://finance.yahoo.com/q?s={}'.format(name)
r = requests.get(url)
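# Name the parser explicitly so the result does not depend on what is installed
soup = BeautifulSoup(r.text, 'html.parser')
# Build the element id from the ticker, e.g. yfs_l84_goog
data = soup.find('span', attrs={'id': 'yfs_l84_{}'.format(name)})
# find() returns None when no matching element exists
print data.text if data is not None else 'No price found'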
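If installing bs4 is not an option, the same id-based lookup can be sketched with the standard library's HTML parser instead. This is a swapped-in alternative, not BeautifulSoup's API: it is written for Python 3 (where the module is `html.parser`), `SpanTextParser` and `span_text` are hypothetical names, the sample markup is invented, and it assumes the target span contains no nested spans.

```python
from html.parser import HTMLParser

class SpanTextParser(HTMLParser):
    """Collect the text inside the first <span> whose id equals target_id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.inside = False   # True while parsing inside the target span
        self.done = False     # True once the target span has been closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        if tag == "span" and not self.done and dict(attrs).get("id") == self.target_id:
            self.inside = True

    def handle_endtag(self, tag):
        if tag == "span" and self.inside:
            self.inside = False
            self.done = True

    def handle_data(self, data):
        if self.inside:
            self.chunks.append(data)

def span_text(html, target_id):
    """Return the span's text, or None if no span with that id was found."""
    parser = SpanTextParser(target_id)
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks) if parser.done else None

# Hypothetical snippet imitating the markup the question targets.
sample_html = '<div><span id="yfs_l84_goog">1,234.56</span></div>'
found = span_text(sample_html, "yfs_l84_goog")
missing = span_text(sample_html, "yfs_l84_msft")
print(found)
print(missing)
```

Returning None for a missing id mirrors how `soup.find` behaves when nothing matches, so the caller can check the result before using it.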