I really don't know what the problem is. I get the following error:
File "C:\Python27\lib\xml\dom\expatbuilder.py", line 223, in parseString
parser.Parse(string, True)
ExpatError: junk after document element: line 5, column 0
I don't see any junk! Any help? This is driving me crazy.
text = """<questionaire>
<question>
<questiontext>Question1</questiontext>
<answer>Your Answer: 99</answer>
</question>
<question>
<questiontext>Question2</questiontext>
<answer>Your Answer: 64</answer>
</question>
<question>
<questiontext>Question3</questiontext>
<answer>Your Answer: 46</answer>
</question>
<question>
<questiontext>Bitte geben</questiontext>
<answer>Your Answer: 544</answer>
<answer>Your Answer: 943</answer>
</question>
</questionaire>"""
cleandata = text.split('<questionaire>')
cleandatastring= "".join(cleandata)
stripped = cleandatastring.strip()
planhtml = stripped.split('</questionaire>')[0]
clean= planhtml.strip()
from xml.dom import minidom
doc = minidom.parseString(clean)
for question in doc.getElementsByTagName('question'):
    for answer in question.getElementsByTagName('answer'):
        if answer.childNodes[0].nodeValue.strip() == 'Your Answer: 99':
            question.parentNode.removeChild(question)
print doc.toxml()
Thanks!
Your original text string is well-formed XML. The split, join, and strip calls you run on it afterwards remove the <questionaire> root element, and that is what breaks it. Parse your original text, and you will be fine.
XML is required to have exactly one top-level element. By the time you parse it, it has a number of top-level <question> tags. The XML parser is parsing the first one as a root element, and then is surprised to find another top-level element.
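
To make that concrete, here is a minimal sketch using a shortened copy of the questionnaire (two questions instead of four; that shortening is my assumption, not your data). It first strips the root element to reproduce the error, then parses the untouched string and removes the matching question. The names inner, stripped_parses, error_message and remaining are just illustrative. It is written with print() so it should run on both Python 2.7 and Python 3.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# Shortened version of the original document (assumption: two questions are enough).
text = """<questionaire>
<question>
<questiontext>Question1</questiontext>
<answer>Your Answer: 99</answer>
</question>
<question>
<questiontext>Question2</questiontext>
<answer>Your Answer: 64</answer>
</question>
</questionaire>"""

# Removing the root leaves two top-level <question> elements,
# which is not well-formed XML, so the parser rejects it.
inner = text.replace('<questionaire>', '').replace('</questionaire>', '').strip()
try:
    minidom.parseString(inner)
    stripped_parses = True
    error_message = ''
except ExpatError as exc:
    stripped_parses = False
    error_message = str(exc)

# Parsing the original string, root element included, works.
doc = minidom.parseString(text)

# Copy the result into a plain list before removing nodes from the tree.
for question in list(doc.getElementsByTagName('question')):
    for answer in question.getElementsByTagName('answer'):
        if answer.childNodes[0].nodeValue.strip() == 'Your Answer: 99':
            question.parentNode.removeChild(question)
            # Stop here: once removed, the question has no parentNode,
            # so a second matching answer would raise an AttributeError.
            break

# Collect the question texts that are still in the document.
remaining = [
    q.getElementsByTagName('questiontext')[0].childNodes[0].nodeValue
    for q in doc.getElementsByTagName('question')
]

print(error_message)
print(doc.toxml())
```

The break matters for your last question, which has two answers: without it, a question that matched twice would be removed twice and the second attempt would fail.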