Running BLAST queries with BioPython

Jon · Nov 3, 2009 · Viewed 8.3k times

I would like to

  1. BLAST several sequences
  2. Retrieve the top 100 hits or so from each query
  3. Pool the downloaded sequences
  4. Remove duplicates

How can I do this in BioPython?

Answer

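Chirag Matkar · Jun 6, 2015

    from Bio.Blast import NCBIWWW
    # Read the query sequences (FASTA format) from the local file "myfasta"
    with open("myfasta") as fasta_file:
        fasta_string = fasta_file.read()
    # Submit the queries to NCBI's online BLAST service (blastn against nt)
    result_handle = NCBIWWW.qblast("blastn", "nt", fasta_string)
    print(result_handle.read())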

Here, myfasta is your own FASTA sequence file; its contents are submitted to NCBI's online BLAST service.

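You can then parse result_handle with NCBIXML and process the hits however you like (e.g. keep the top 100 per query, remove duplicates). Note that read() consumes the handle, so parse it instead of printing it if you want to work with the results.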
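As a rough sketch of the parsing and pooling step, something like the following might work. The function names blast_records and pool_unique_hits are made up for illustration, and treating two hits as duplicates when they share an accession is an assumption about what "duplicate" means here. The Biopython import sits inside blast_records so the pooling helper can be tried without a network connection.

```python
def blast_records(fasta_path, hitlist_size=100):
    """Send every sequence in fasta_path to NCBI BLAST; yield one record per query.

    Hypothetical helper: needs Biopython and network access.
    """
    from Bio.Blast import NCBIWWW, NCBIXML

    with open(fasta_path) as fasta_file:
        fasta_string = fasta_file.read()
    # hitlist_size asks NCBI to return up to that many hits per query.
    result_handle = NCBIWWW.qblast(
        "blastn", "nt", fasta_string, hitlist_size=hitlist_size
    )
    yield from NCBIXML.parse(result_handle)


def pool_unique_hits(records, max_hits=100):
    """Pool the first max_hits alignments of each record into one dict.

    Assumption: hits with the same accession count as duplicates.
    Returns {accession: hit description}, keeping the first occurrence.
    """
    pooled = {}
    for record in records:
        for alignment in record.alignments[:max_hits]:
            # setdefault leaves an existing entry alone, so repeats are dropped.
            pooled.setdefault(alignment.accession, alignment.hit_def)
    return pooled
```

Usage would be along the lines of pooled = pool_unique_hits(blast_records("myfasta")). The BLAST XML only carries the aligned segments, so downloading the full hit sequences would be a separate step, for example by passing the pooled accessions to Bio.Entrez.efetch.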