I am not too familiar with python and have to write a script to perform a host of functions. Basically the module i still need is how to check a website code for matching links provided beforehand.
Matching links what? Their HREF attribute? The link display text? Perhaps something like:
from BeautifulSoup import BeautifulSoup, SoupStrainer
import re
import urllib2
doc = urllib2.urlopen("http://somesite.com").read()
links = SoupStrainer('a', href=re.compile(r'^test'))
soup = [str(elm) for elm in BeautifulSoup(doc, parseOnlyThese=links)]
for elm in soup:
print elm
That will grab the HTML content of somesite.com
and then parse it using BeautifulSoup, looking only for links whose HREF attribute starts with "test". It then builds a list of these links and prints them out.
You can modify this to do anything using the documentation.