Web Application Translation, methods and tools

CiscoIPPhone · Mar 23, 2010

I've developed a web application. It needs to be translated to languages other than English in the future, and ideally the translators shouldn't need to know HTML/JS/C++ to provide the translation. The server side of the web application is written in C++ and the majority of the localised text is in the HTML files.

My question is: what approaches are there to translating web applications?

  • Are there any existing tools that would enable a translator who doesn't understand HTML to translate a site?
  • Should I write an application that extracts the localised text from an HTML file and can re-substitute the translated text?
  • Do you just provide the HTML file to your translators to be localised?

I'm aware the question isn't strictly programming-related, but the solution may involve programming and may require some software engineering.

Answer

newtover · Mar 31, 2010

Having some experience with localizing applications, I can tell you the following:

  • Any translator you can rely on will have no problems with HTML (assuming the translation does not break the design).
  • Most professional translators use translation memory applications (e.g., Transit, Trados) that can parse many document formats (XML, HTML, PDF, .DOC, etc.) and separate markup from content. They will deliver a translated copy in the same format as the original.
  • All messages to be translated that are used within your program code should be isolated in resource bundles. Almost all popular web-application frameworks provide a mechanism for this. The bundles are usually just plain text files with key/value pairs. The translator should not see the code.
  • Messages in resource bundles can be format strings for printf-like functions. In that case, you should document the expected placeholders ('fillers').
  • When you provide resource bundles to translators, be sure to attach instructions on how to reach each text in the application interface, so that the translator knows the context of a given message.
  • If any labels must not exceed a certain length, tell the translators in advance.
  • If the application uses company-specific terminology, you should provide a glossary so that the translation is consistent.
  • Do your best to get rid of text in images. It will be a headache.
  • If you translate from English, you may need to introduce additional logic to cover grammatical features of the target language (correct case, gender).
  • It is very smart to store user manual text and similar texts in a simple XML format (a subset of XHTML, DocBook) and apply an XSL transformation to produce the resulting HTML. This makes it easy to outsource the translation and validate the format of the document.
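Since your server side is C++, the resource-bundle points could look roughly like the sketch below. It is only an illustration under my own assumptions: the `key = value` file layout, the `Bundle` alias, and the `parseBundle` and `translate` helpers are hypothetical names, not part of any particular framework. I also use numbered placeholders such as `{0}` instead of printf-style `%s`, because they let a translator reorder the fillers when the target language needs a different word order.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical bundle type: message key -> translated text.
using Bundle = std::map<std::string, std::string>;

// Strip leading and trailing spaces, tabs and carriage returns.
std::string trim(const std::string& s) {
    const char* ws = " \t\r";
    const auto first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";
    const auto last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Parse "key = value" lines. Blank lines and lines starting with '#'
// are skipped, as are lines without an '=' sign.
Bundle parseBundle(std::istream& in) {
    Bundle bundle;
    std::string line;
    while (std::getline(in, line)) {
        const std::string t = trim(line);
        if (t.empty() || t[0] == '#') continue;
        const auto eq = t.find('=');
        if (eq == std::string::npos) continue;
        bundle[trim(t.substr(0, eq))] = trim(t.substr(eq + 1));
    }
    return bundle;
}

// Look up a message and replace {0}, {1}, ... with the given arguments
// in a single pass. A missing key falls back to the key itself, so an
// untranslated message is easy to spot in the interface.
std::string translate(const Bundle& bundle, const std::string& key,
                      const std::vector<std::string>& args = {}) {
    assert(!key.empty());  // message keys are expected to be non-empty
    const auto it = bundle.find(key);
    const std::string text = (it != bundle.end()) ? it->second : key;

    std::string out;
    for (std::size_t i = 0; i < text.size();) {
        if (text[i] == '{') {
            const auto close = text.find('}', i);
            if (close != std::string::npos && close > i + 1 && close - i <= 4) {
                const std::string digits = text.substr(i + 1, close - i - 1);
                if (digits.find_first_not_of("0123456789") == std::string::npos) {
                    const std::size_t index = std::stoul(digits);
                    if (index < args.size()) {
                        out += args[index];
                        i = close + 1;
                        continue;
                    }
                }
            }
        }
        out += text[i++];  // ordinary character or unknown placeholder
    }
    return out;
}
```

With a bundle line such as `files.moved = {1} nach {0} verschoben`, a call like `translate(bundle, "files.moved", {"/tmp", "a.txt"})` fills the placeholders in the order the translator chose. The translator only ever edits the text file, and the comment lines in it are a natural place to document the expected fillers and the context of each message.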

The list is certainly not web-application specific.
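As an illustration of the "additional logic" for gender mentioned above, one possible approach (my assumption, not the only way) is to let the bundle carry several variants of the same message and pick one at run time. The `.m` / `.f` key suffixes, the `Gender` enum, and the `pickMessage` helper below are hypothetical names chosen for this sketch.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical grammatical gender of the person or object a message refers to.
enum class Gender { Masculine, Feminine, Neutral };

// Build the key of the gender-specific variant, e.g. "welcome" -> "welcome.f".
// Neutral uses the base key unchanged.
std::string variantKey(const std::string& baseKey, Gender gender) {
    switch (gender) {
        case Gender::Masculine: return baseKey + ".m";
        case Gender::Feminine:  return baseKey + ".f";
        case Gender::Neutral:   break;
    }
    return baseKey;
}

// Prefer the gender-specific variant, fall back to the base message,
// and finally to the key itself if the bundle has neither.
std::string pickMessage(const std::map<std::string, std::string>& bundle,
                        const std::string& baseKey, Gender gender) {
    assert(!baseKey.empty());  // message keys are expected to be non-empty
    auto it = bundle.find(variantKey(baseKey, gender));
    if (it == bundle.end()) it = bundle.find(baseKey);
    return it != bundle.end() ? it->second : baseKey;
}
```

The fallback matters because the English bundle would normally contain only the base key (`welcome`), while a Spanish bundle might add `welcome.m` and `welcome.f`. The calling code stays the same for every language, and each translator decides whether variants are needed.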