Inserting and deleting XML nodes and elements using Nokogiri

hans zeckinger · Aug 13, 2009 · Viewed 22.3k times

I want to extract parts of an XML file and leave a note in that file where something was removed, like "here something was extracted".

I'm trying to do this with Nokogiri, but I can't find documentation on how to:

  1. delete all children of a <Nokogiri::XML::Element>
  2. change the inner_text of that whole element

Any clues?

Answer

Pesto · Aug 14, 2009

Nokogiri makes this pretty easy. Using this document as an example, the following code finds every vitamins element, removes its children (and, with them, all of their descendants), and sets its inner text to "Children removed.":

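require 'nokogiri'

# The block form of File.open closes the file even if parsing raises.
doc = File.open('sample.xml', 'r') { |io| Nokogiri::XML(io) }

doc.search('//vitamins').each do |node|
  node.children.remove                # unlink every child node, descendants included
  node.content = 'Children removed.'  # set the element's content to plain text
end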

A given food node will go from looking like this:

<food>
    <name>Avocado Dip</name>
    <mfr>Sunnydale</mfr>
    <serving units="g">29</serving>
    <calories total="110" fat="100"/>
    <total-fat>11</total-fat>
    <saturated-fat>3</saturated-fat>
    <cholesterol>5</cholesterol>
    <sodium>210</sodium>
    <carb>2</carb>
    <fiber>0</fiber>
    <protein>1</protein>
    <vitamins>
        <a>0</a>
        <c>0</c>
    </vitamins>
    <minerals>
        <ca>0</ca>
        <fe>0</fe>
    </minerals>
</food>

to this:

<food>
    <name>Avocado Dip</name>
    <mfr>Sunnydale</mfr>
    <serving units="g">29</serving>
    <calories total="110" fat="100"/>
    <total-fat>11</total-fat>
    <saturated-fat>3</saturated-fat>
    <cholesterol>5</cholesterol>
    <sodium>210</sodium>
    <carb>2</carb>
    <fiber>0</fiber>
    <protein>1</protein>
    <vitamins>Children removed.</vitamins>
    <minerals>
        <ca>0</ca>
        <fe>0</fe>
    </minerals>
</food>