My re.findall search is matching and returning the right string, but when I try to print the result, it prints it as a list instead of a string. Example below:
> line = 'ID=id5;Parent=rna1;Dbxref=GeneID:653635,Genbank:NR_024540.1,HGNC:38034;gbkey=misc_RNA;gene=WASH7P;product=WAS protein family homolog 7 pseudogene;transcript_id=NR_024540.1'
> print re.findall(r'gene=[^;\n]+', line)
> ['gene=WASH7P']
I would like print to output just gene=WASH7P, without the square brackets and quotation marks around it.
How can I adjust my code so that it prints only the matched string?
Thank you!
Thank you for everyone's help!
Both of the snippets below successfully returned the match as a string rather than a list.
> re.findall(r'gene=[^;\n]+', line)[0]
> re.search(r'gene=[^;\n]+', line).group()
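For anyone reading later, here is a minimal sketch comparing the two approaches on the example line from my question. The variable names (`pattern`, `as_list`, `first_from_list`, `first_from_search`) are only for illustration, and I used `print()` with parentheses so it also runs on Python 3.

```python
import re

# The example line from the question, split across string literals for readability.
line = ("ID=id5;Parent=rna1;Dbxref=GeneID:653635,Genbank:NR_024540.1,HGNC:38034;"
        "gbkey=misc_RNA;gene=WASH7P;product=WAS protein family homolog 7 pseudogene;"
        "transcript_id=NR_024540.1")

pattern = r'gene=[^;\n]+'

# re.findall always returns a list, even when there is only one match.
as_list = re.findall(pattern, line)

# Indexing the list gives the first match as a plain string.
first_from_list = as_list[0]

# re.search returns a match object; .group() (with parentheses) gives the string.
first_from_search = re.search(pattern, line).group()

print(first_from_list)
print(first_from_search)
```

Both calls assume the line contains at least one match: indexing an empty list raises IndexError, and calling `.group()` on the None that `re.search` returns for no match raises AttributeError.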
However, I kept getting "list index out of range" errors with one of my regexes, even though results were printed when I used re.findall() on its own.
> re.findall(r'transcript_id=[^\s]+',line)
I realized that this seemingly impossible result was because I was calling re.findall() within a for loop that was iterating over every line in a file. There were matches for some lines but not for others, so I was receiving the "list index out of range" error for those lines in which there was no match.
The code below resolved the issue:
> if re.findall(r'transcript_id=[^\s]+',line):
>     transcript = re.findall(r'transcript_id=[^\s]+',line)[0]
> else:
>     transcript = "NA"
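A slightly tidier variant of the same idea, as a sketch only: it uses re.search so the regex runs once per line, and falls back to "NA" when there is no match. The helper name `extract_transcript`, the constant `TRANSCRIPT_RE`, and the two sample lines are made up for illustration; they stand in for the lines of my real file.

```python
import re

# Compile once, since the same pattern is applied to every line.
TRANSCRIPT_RE = re.compile(r'transcript_id=[^\s]+')

def extract_transcript(line, default="NA"):
    """Return the first transcript_id match in line, or default if there is none."""
    match = TRANSCRIPT_RE.search(line)  # None when the line has no match
    return match.group() if match else default

# Hypothetical sample lines: the first has a transcript_id, the second does not.
sample_lines = [
    "ID=id5;gene=WASH7P;transcript_id=NR_024540.1",
    "ID=gene0;gene=DDX11L1;gbkey=Gene",
]

transcripts = [extract_transcript(sample) for sample in sample_lines]
print(transcripts)
```

Checking the match object for None before calling `.group()` plays the same role as checking that the findall list is non-empty before indexing it.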
Thank you!