How to make Nokogiri not convert &nbsp; to a space

ywenbo · Dec 18, 2010 · Viewed 8.4k times

I fetch an HTML fragment like

"<li>市&nbsp;场&nbsp;价"

which contains "&nbsp;", but after calling to_s on the Nokogiri NodeSet, it becomes

"<li>市 场 价"

I want to keep the original HTML fragment. I tried setting the :save_with option on the to_s method, but that failed.

Has anyone encountered the same problem who can help? Thank you in advance.

Answer

Mike Dotterer · Dec 22, 2010

I encountered a similar situation, and what I came up with was a bit of a hack, but it seems to work well.

require "nokogiri"
# Parsing "&nbsp;" yields the decoded non-breaking space character.
nbsp = Nokogiri::HTML("&nbsp;").text
text.gsub(nbsp, " ")

In my case, I wanted the nbsp to be a regular space. I think in your case, you want them to be returned to a "&nbsp;", so you could do something like:

# Same lookup, but substitute the entity back in instead of a plain space.
nbsp = Nokogiri::HTML("&nbsp;").text
html.gsub(nbsp, "&nbsp;")
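
For what it's worth, the character that "&nbsp;" decodes to is the non-breaking space, U+00A0, so the same substitution can be sketched in plain Ruby without parsing a throwaway document. This is only a minimal sketch: the helper names restore_nbsp and normalize_nbsp are made up for illustration, and it assumes the string returned by to_s is UTF-8 and contains the literal U+00A0 character.

```ruby
# U+00A0 is the non-breaking space character that the "&nbsp;" entity decodes to.
NBSP = "\u00A0"

# Hypothetical helper: put the "&nbsp;" entity back into serialized HTML.
def restore_nbsp(html)
  html.gsub(NBSP, "&nbsp;")
end

# Hypothetical helper: turn non-breaking spaces into ordinary spaces.
def normalize_nbsp(text)
  text.gsub(NBSP, " ")
end

# Stand-in for the string that to_s returned (assumed to contain U+00A0).
serialized = "<li>市#{NBSP}场#{NBSP}价"

puts restore_nbsp(serialized)   # prints <li>市&nbsp;场&nbsp;价
puts normalize_nbsp(serialized) # prints the fragment with ordinary spaces
```

Note that this is still a post-processing step on the serialized string, like the gsub above. It does not change how Nokogiri itself serializes the node.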