Context: I have a web application that processes and shows huge log files. They're usually only about 100k lines long, but it can be up to 4 million lines or more. To be able to scroll through that log file (both user initiated and via JavaScript) and filter the lines with decent performance I create a DOM element for each line as soon as the data arrives (in JSON via ajax). I found this better for performance then constructing the HTML at the back-end. Afterwards I save the elements in an array and I only show the lines that are visible.
For max 100k lines this takes only about a few seconds, but anything more takes up to one minute for 500k lines (not including the download). I wanted to improve the performance even more, so I tried using HTML5 Web Workers. The problem now is that I can't create elements in a Web Worker, not even outside the DOM. So I ended up doing only the json to HTML conversion in the Web Workers and send the result to the main thread. There it is created and stored in an array. Unfortunately this worsened the performance and now it takes at least 30 seconds more.
Question: So is there any way, that I'm not aware of, to create DOM elements, outside the DOM tree, in a Web Worker? If not, why not? It seems to me that this can't create concurrency problems, as creating the elements could happen in parallel without problems.
Alright, I did some more research with the information @Bergi provided and found the following discussion on W3C mailing list:
http://w3-org.9356.n7.nabble.com/Limited-DOM-in-Web-Workers-td44284.html
And the excerpt that answers why there is no access to the XML parser or DOM parser in the Web Worker:
You're assuming that none of the DOM implementation code uses any sort of non-DOM objects, ever, or that if it does those objects are fully threadsafe. That's just not not the case, at least in Gecko.
The issue in this case is not the same DOM object being touched on multiple threads. The issue is two DOM objects on different threads both touching some global third object.
For example, the XML parser has to do some things that in Gecko can only be done on the main thread (DTD loading, offhand; there are a few others that I've seen before but don't recall offhand).
There is however also a workaround mentioned, which is using a third-party implementation of the parsers, of which jsdom is an example. With this you even have access to your own separate Document.