Python re.findall prints output as list instead of string

Ilea · Mar 29, 2015 · Viewed 35.2k times

My re.findall search is matching and returning the right string, but when I try to print the result, it prints it as a list instead of a string. Example below:

> line = 'ID=id5;Parent=rna1;Dbxref=GeneID:653635,Genbank:NR_024540.1,HGNC:38034;gbkey=misc_RNA;gene=WASH7P;product=WAS protein family homolog 7 pseudogene;transcript_id=NR_024540.1'

> print re.findall(r'gene=[^;\n]+', line)

>     ['gene=WASH7P']

I would like print to output just gene=WASH7P. How can I adjust my code so that it prints only the match, without the brackets and quotes around it?

Thank you!

Answer

Ilea · Mar 29, 2015

Thank you for everyone's help!

Both of the snippets below successfully print the output as a string.

> re.findall(r'gene=[^;\n]+', line)[0]  

> re.search(r'gene=[^;\n]+', line).group()
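To make the difference concrete, here is a minimal sketch using the sample line from the question. `re.findall` returns a list of every match, so indexing with `[0]` gives the first one as a string, while `re.search` returns a match object whose `group()` method gives the matched text. The variable names are illustrative only.

```python
import re

line = ("ID=id5;Parent=rna1;Dbxref=GeneID:653635,Genbank:NR_024540.1,"
        "HGNC:38034;gbkey=misc_RNA;gene=WASH7P;"
        "product=WAS protein family homolog 7 pseudogene;"
        "transcript_id=NR_024540.1")

pattern = r'gene=[^;\n]+'

all_matches = re.findall(pattern, line)      # list containing every match
first_match = all_matches[0]                 # first element, a plain string
searched = re.search(pattern, line).group()  # text of the first match only

print(all_matches)
print(first_match)
print(searched)
```

Both approaches assume the pattern matches at least once in the line.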

However, I kept getting "list index out of range" errors for one of my regexes, even though results printed when I used re.findall() on its own.

> re.findall(r'transcript_id=[^\s]+',line)

I realized that this seemingly impossible result was because I was calling re.findall() within a for loop that was iterating over every line in a file. There were matches for some lines but not for others, so I was receiving the "list index out of range" error for those lines in which there was no match.
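The failure can be reproduced with a small sketch. The two sample lines here are hypothetical stand-ins for lines in the file: only the first contains a `transcript_id` field, so `re.findall` returns an empty list for the second and indexing it with `[0]` raises `IndexError`.

```python
import re

# Hypothetical sample lines: only the first has a transcript_id field.
lines = [
    "gene=WASH7P;transcript_id=NR_024540.1",
    "gene=WASH7P;gbkey=Gene",
]

pattern = r'transcript_id=[^\s]+'

results = []
for line in lines:
    matches = re.findall(pattern, line)
    try:
        results.append(matches[0])
    except IndexError:
        # findall returned [] for this line, so there is no element 0
        results.append("IndexError: no match on this line")

print(results)
```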

The code below resolved the issue:

> if re.findall(r'transcript_id=[^\s]+', line):

>     transcript = re.findall(r'transcript_id=[^\s]+', line)[0]

> else:

>     transcript = "NA"
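As a possible variation, the same check can be done with a single `re.search` call per line, since `re.search` returns `None` when nothing matches. This is only a sketch: the helper name `first_match_or_default` and the sample lines are hypothetical and not part of the original code.

```python
import re

def first_match_or_default(pattern, text, default="NA"):
    """Return the first match of pattern in text, or default if none."""
    match = re.search(pattern, text)  # None when the pattern does not match
    return match.group() if match else default

# Hypothetical sample lines: only the first has a transcript_id field.
lines = [
    "gene=WASH7P;transcript_id=NR_024540.1",
    "gene=WASH7P;gbkey=Gene",
]

transcripts = [
    first_match_or_default(r'transcript_id=[^\s]+', line) for line in lines
]

print(transcripts)
```

This avoids running the regex twice on lines that do match, and keeps the "NA" fallback in one place.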

Thank you!