I am using a REXML Ruby parser to parse an XML file. But on a 64 bit AIX box with 64 bit Ruby, I am getting the following error:
REXML::ParseException: #<REXML::ParseException: #<RegexpError: Stack overflow in
regexp matcher:
/^<((?>(?:[\w:][\-\w\d.]*:)?[\w:][\-\w\d.]*))\s*((?>\s+(?:[\w:][\-\w\d.]*:)?[\w:][\-\w\d.]*\s*=\s*(["']).*?\3)*)\s*(\/)?>/mu>
The call for the same is something like this:
REXML::Document.new(File.open(actual_file_name, "r"))
Does anyone have an idea regarding how to solve this issue?
I've had several issues for REXML, it doesn't seem to be the most mature library. Usually I use Nokogiri for Ruby XML parsing stuff, it should be faster and more stable than REXML. After installing it with sudo gem install nokogiri
, you can use something like this to get a DOM instance:
doc = Nokogiri.XML(File.open(actual_file_name, 'rb'))
# => #<Nokogiri::XML::Document:0xf1de34 name="document" [...] >
The documentation on the official webpage is also much better than that of REXML, IMHO.