I want to extract parts of an XML file and make a note that I extracted some part in that file, like "here something was extracted".
I'm trying to do this with Nokogiri, but it seems to not really be documented on how to:
<Nokogiri::XML::Element>
inner_text
of that complete elementAny clues?
Nokogiri makes this pretty easy. Using this document as an example, the following code will find all vitamins
tags, remove their children (and the children's children, etc.), and change their inner text to say "Children removed.":
require 'nokogiri'
io = File.open('sample.xml', 'r')
doc = Nokogiri::XML(io)
io.close
doc.search('//vitamins').each do |node|
node.children.remove
node.content = 'Children removed.'
end
A given food
node will go from looking like this:
<food>
<name>Avocado Dip</name>
<mfr>Sunnydale</mfr>
<serving units="g">29</serving>
<calories total="110" fat="100"/>
<total-fat>11</total-fat>
<saturated-fat>3</saturated-fat>
<cholesterol>5</cholesterol>
<sodium>210</sodium>
<carb>2</carb>
<fiber>0</fiber>
<protein>1</protein>
<vitamins>
<a>0</a>
<c>0</c>
</vitamins>
<minerals>
<ca>0</ca>
<fe>0</fe>
</minerals>
</food>
to this:
<food>
<name>Avocado Dip</name>
<mfr>Sunnydale</mfr>
<serving units="g">29</serving>
<calories total="110" fat="100"/>
<total-fat>11</total-fat>
<saturated-fat>3</saturated-fat>
<cholesterol>5</cholesterol>
<sodium>210</sodium>
<carb>2</carb>
<fiber>0</fiber>
<protein>1</protein>
<vitamins>Children removed.</vitamins>
<minerals>
<ca>0</ca>
<fe>0</fe>
</minerals>
</food>