Scala XML.loadString vs literal expression

Henri Bauer · Dec 9, 2010

I have been experimenting with Scala and XML, and I found a strange difference in behavior between an XML element created with XML.load (or loadString) and one written as a literal. Here is the code:

import scala.xml._
// creating a classical link HTML tag
val in_xml = <link type="text/css" href="/css/main.css" rel="stylesheet" xmlns="http://www.w3.org/1999/xhtml"></link>
// The same as a String
val in_str = """<link type="text/css" href="/css/main.css" rel="stylesheet" xmlns="http://www.w3.org/1999/xhtml"></link>"""
// Convert the String into XML
val from_str = XML.loadString(in_str)

println("in_xml  : " + in_xml)
println("from_str: "+ from_str)
println("val_xml == from_str: "+ (in_xml == from_str))
println("in_xml.getClass() == from_str.getClass(): " +
  (in_xml.getClass() == from_str.getClass()))

And here is the output:

in_xml  : <link href="/css/main.css" rel="stylesheet" type="text/css" xmlns="http://www.w3.org/1999/xhtml"></link>
from_str: <link rel="stylesheet" href="/css/main.css" type="text/css" xmlns="http://www.w3.org/1999/xhtml"></link>
val_xml == from_str: false
in_xml.getClass() == from_str.getClass(): true

The types are the same, but the two values are not equal. The order of the attributes changes, and in neither case does it match the original order. The attributes of the literal come out alphabetically sorted (is that only a coincidence?).

This would not be a problem if both solutions did not behave differently when I try to transform them. I picked up some interesting code from Daniel C. Sobral at "How to change attribute on Scala XML Element" and wrote my own rule to remove the leading slash of the "href" attribute. The RuleTransformer works well with in_xml, but has no effect on from_str!

Unfortunately, most of my programs have to read their XML via XML.load(...), so I'm stuck. Does anyone know what is going on here?

Best regards,

Henri

Answer

huynhjl · Dec 11, 2010

From what I can see, in_xml and from_str are not equal because the order of the attributes is different. This is unfortunate, and it is due to the way the compiler builds the XML literal. As a result, the attributes themselves compare as different:

scala> in_xml.attributes == from_str.attributes
res30: Boolean = false

You can see that if you replace the attributes, the comparison works:

scala> in_xml.copy(attributes=from_str.attributes) == from_str
res32: Boolean = true
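
If the goal is only to check that two elements carry the same attributes regardless of order, one option is to compare them as key/value maps. This is a minimal sketch, not something from the original code: `AttrCompare` and `sameAttributes` are hypothetical names, and it relies on `MetaData.asAttrMap` from scala.xml.

```scala
import scala.xml._

object AttrCompare {
  // Hypothetical helper: compares attributes as a Map[String, String],
  // so the order in which they are chained in MetaData does not matter.
  def sameAttributes(a: Node, b: Node): Boolean =
    a.attributes.asAttrMap == b.attributes.asAttrMap
}
```

With the values from the question, this would be called as AttrCompare.sameAttributes(in_xml, from_str). Note that the xmlns declaration is held in the element's scope rather than in its attributes, so it is not part of this comparison.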

With that said, I'm not clear why that would cause a different behavior in the code that replaces the href attribute. In fact I suspect that something is wrong with the way attribute mapping works. For instance, if I replace the in_str with:

val in_str = """<link type="text/css" rel="stylesheet" href="/css/main.css" 
xmlns="http://www.w3.org/1999/xhtml"></link>"""

It works fine. Could it be that the attribute code from Daniel only works if the attribute is in the head position of MetaData?
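
If that suspicion is right, a rule that looks the attribute up by name instead of walking the MetaData chain should not care where "href" sits. The following is a sketch under that assumption and is not Daniel's code: `StripLeadingSlash` and `HrefDemo` are hypothetical names, and the rule replaces the attribute through `Elem.%`.

```scala
import scala.xml._
import scala.xml.transform.{RewriteRule, RuleTransformer}

// Hypothetical rule: strips one leading slash from an unprefixed "href"
// attribute, wherever that attribute sits in the MetaData chain.
object StripLeadingSlash extends RewriteRule {
  override def transform(n: Node): Seq[Node] = n match {
    case e: Elem =>
      // Look the value up by name; this is "" when the attribute is absent.
      val href = (e \ "@href").text
      if (href.startsWith("/"))
        e % new UnprefixedAttribute("href", href.drop(1), Null)
      else
        e
    case other => other
  }
}

object HrefDemo {
  private val transformer = new RuleTransformer(StripLeadingSlash)

  // Applies the rule to a whole tree and returns the rewritten root.
  def stripHref(n: Node): Node = transformer(n)
}
```

Calling HrefDemo.stripHref(in_xml) and HrefDemo.stripHref(from_str) should then treat the literal and the parsed element the same way, because neither call depends on attribute order.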


Side note: unless in_xml is null, equals and == would return the same value. The == version will check whether the first operand is null before calling equals.