I've already seen lots of posts on the site for RTF to HTML and some other posts talking about some HTML to RTF converters, but I'm really trying to get a full breakdown of what is considered the most widely used commercial product, open source product or if people recommend going home grown. Apologies if you consider this a duplicate question, but I'm trying to create a product matrix to see what is the most viable for our application. I also think this would be helpful for others.
The converter would be used in an ASP.NET 2.0 application (we're upgrading to 3.5 shortly but still sticking with WebForms) using SQLServer 2005 (soon 2008) as the DB.
From reading a few posts, SautinSoft appears to be popular as a commercial component. Are there other commercial components that you'd recommend for converting HTML to RTF? Price does matter, but even if it's a little on the expensive side, please list it.
For open source, I read that OpenOffice.org can be run as a service so that it can convert files. However, this appears to be only Java based. I imagine, I'd need some kind of interop to use this? What .NET open source components, if any, are out there for converting HTML to RTF?
For home grown, is an XSLT the way to go with XHTML? If so, what component do you recommend for generating XHTML? Otherwise, what other home grown avenuses do you recommend.
Also, please note that I currently don't care so much about RTF to HTML. If a commercial component offers this and the price is still the same, fine, otherwise please don't mention it.
For what its worth and in no particular order.
A while ago i wanted to export to RTF and then import from RTF the RTF in question being manipulated by MS Word.
The first problem is RTF is not an open standard. It is an internal MS standard and there fore they alter it as and when they like and do not generally worry about compatibility. Currently the versions of RTF are 1.3 to 1.9 and they are all different. Internally they use twips for measurement just for good measure.
I bought the O'Reilly pocket book on the subject which helped and read a lot of the MS documentation which is good, but there is a lot of it and lots for each version.
Because of the way RTF is coded using regex to manipulate is incredibly hard work and needs careful handling and concentration to test and get to work. I use a Mac editor that had built in regex so i could steadily test each section and build it into the code.
Because of the number of versions there is also a lot of incompatibility between versions but there is a lot of commonality and in the end it was reasonably hard/easy to get where i wanted (after about a weeks reading and a weeks coding) and producing a really simple version.
I never found a commercial solution but i had to have a free on because of budget so that cut a lot out but take great care in choosing one to make sure it does what you want and has support.
I don't think where you are coming from HTML/XML/XHTML, i was converting CSV formats, it the RTF.
I am not sure if i would advise to DIY or buy. Probably on balance DIY but your own circumstances will dictate that.
Edit: One thing going from content to RTF is easier than vice versa.
BTW not criticising MS fior the RTF versions, hey it's theirs and proprietary so they can do what they like.