I have a list of PubMed entries along with their PubMed IDs. I would like to write a Python script that accepts a PubMed ID as input and fetches the corresponding abstract from the PubMed website.
So far I have come across NCBI E-utilities and the importurl library in Python, but I don't know how I should go about writing a template.
Any pointers will be appreciated.
Thank you,
Using Biopython's Entrez module, you can get the abstract, along with all the other metadata, quite easily. This will print the abstract:
    from Bio.Entrez import efetch

    def print_abstract(pmid):
        # Request the abstract as plain text rather than XML.
        handle = efetch(db='pubmed', id=pmid, retmode='text', rettype='abstract')
        print(handle.read())
        handle.close()
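Since the question mentions E-utilities directly: if you would rather not depend on Biopython, the same efetch request can be made with the standard library alone. This is only a minimal sketch. The helper names `build_efetch_url` and `fetch_abstract_text` and the `timeout` default are my own choices for illustration, and I am assuming Python 3 and the public efetch endpoint with the same `db`, `id`, `rettype` and `retmode` parameters used above.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Base URL of the NCBI E-utilities efetch endpoint.
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(pmid):
    """Build the efetch URL for a plain-text abstract (illustrative helper)."""
    params = {
        "db": "pubmed",
        "id": str(pmid),
        "rettype": "abstract",
        "retmode": "text",
    }
    return EFETCH_URL + "?" + urlencode(params)


def fetch_abstract_text(pmid, timeout=30):
    """Download the plain-text abstract for one PubMed ID (needs network access)."""
    with urlopen(build_efetch_url(pmid), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `fetch_abstract_text(pmid)` with one of your IDs should return roughly the same text that `print_abstract` prints. If you loop over a long list of IDs, pause between requests so you stay within NCBI's usage limits.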
And here is a function that will fetch XML and return just the abstract:
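    from Bio.Entrez import efetch, read

    def fetch_abstract(pmid):
        handle = efetch(db='pubmed', id=pmid, retmode='xml')
        records = read(handle)
        handle.close()
        # Newer Biopython releases return a dict keyed by 'PubmedArticle';
        # older ones returned a plain list of records.
        if isinstance(records, dict):
            records = records['PubmedArticle']
        try:
            article = records[0]['MedlineCitation']['Article']
            # First AbstractText element only.
            return article['Abstract']['AbstractText'][0]
        except (KeyError, IndexError):
            # No such record, or the record has no abstract.
            return None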
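Note that `fetch_abstract` returns only the first `AbstractText` element. Structured abstracts (Background, Methods, Results, ...) are split into several such elements, so you may want to join them. Here is a rough sketch; `join_abstract_sections` is a name I made up, and I am assuming each element behaves like a string and may carry an `attributes` dict with a `Label` key, as Biopython's parsed elements usually do.

```python
def join_abstract_sections(article):
    """Join all AbstractText sections of a parsed Article into one string.

    Returns None if the article has no abstract (hypothetical helper).
    """
    try:
        sections = article['Abstract']['AbstractText']
    except KeyError:
        return None

    parts = []
    for section in sections:
        # Plain strings have no 'attributes'; fall back to an empty dict.
        label = getattr(section, 'attributes', {}).get('Label')
        text = str(section)
        parts.append('%s: %s' % (label, text) if label else text)

    return '\n'.join(parts) if parts else None
```

You would call it on the same `article` value that `fetch_abstract` extracts, in place of the `[0]` lookup.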
P.S. I actually had the need to do this kind of stuff in a real task, so I organized the code into a class -- see this gist.