Is it possible to convert XML to UTF-8 encoding in Delphi 6?
This is what I am currently doing:
WideStringVariable = AnsiToUtf8(Doc.XML.Text);
Write WideStringVariable to a file using TFileStream, adding a UTF-8 BOM at the beginning of the file.
Code:
procedure SaveAsUTF8(const Name: String; Data: TStrings);
const
  cUTF8 = $BFBBEF;
var
  W_TXT: WideString;
  fs: TFileStream;
  wBOM: Integer;
begin
  if Trim(Data.Text) <> '' then begin
    W_TXT := AnsiToUTF8(Data.Text);
    fs := TFileStream.Create(Name, fmCreate);
    try
      wBOM := cUTF8;
      fs.WriteBuffer(wBOM, SizeOf(wBOM) - 1);
      fs.WriteBuffer(W_TXT[1], Length(W_TXT) * SizeOf(W_TXT[1]));
    finally
      fs.Free
    end;
  end;
end;
If I open the file in Notepad++ or another editor that detects encoding, it shows UTF-8 with BOM. However, the text does not seem to be properly encoded.
What is wrong and how can I fix it?
UPDATE: XML Properties:
XMLDoc.Version := '1.0';
XMLDoc.Encoding := 'UTF-8';
XMLDoc.StandAlone := 'yes';
You can save the file using the standard SaveToFile method of the TXMLDocument variable: http://docs.embarcadero.com/products/rad_studio/delphiAndcpp2009/HelpUpdate2/EN/html/delphivclwin32/XMLDoc_TXMLDocument_SaveToFile.html
Whether the resulting file is actually UTF-8 is something you have to check with local tools such as the aforementioned Notepad++, a hex editor, or anything else.
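A minimal sketch of that approach, assuming a VCL application where COM is already initialized (the default MSXML DOM vendor needs it). The procedure name, the 'root' element and its text are hypothetical placeholders; NewXMLDocument, Encoding, StandAlone, AddChild and SaveToFile are the XMLDoc/XMLIntf members as I understand them, so check them against your Delphi 6 units.

```pascal
uses
  XMLDoc, XMLIntf;

// Hypothetical helper: builds a tiny document and lets the DOM
// implementation serialize it in the declared encoding.
procedure SaveXmlAsUtf8(const FileName: string);
var
  Doc: IXMLDocument;
  Root: IXMLNode;
begin
  Doc := NewXMLDocument;          // version defaults to '1.0'
  Doc.Encoding := 'UTF-8';        // same properties as in the question
  Doc.StandAlone := 'yes';
  Root := Doc.AddChild('root');   // placeholder content
  Root.Text := 'sample';
  Doc.SaveToFile(FileName);       // no intermediate string or stream
end;
```

With this route there is no manual BOM handling or string conversion in your own code, so the only thing left is to inspect the output file as described above.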
If you insist on using an intermediate string and a file stream, you should use the proper variable type. AnsiToUTF8 returns a UTF8String, and that is the type to use.
Compiling `WideStringVar := AnsiStringSource` would issue a compiler warning, and it is a proper warning. Googling for "Delphi WideString", or reading the Delphi manuals on the topic, shows that WideString (aka Microsoft OLE BSTR) keeps its data in UTF-16 format. http://delphi.about.com/od/beginners/l/aa071800a.htm
Thus assigning an 8-bit source to a UTF-16 string necessarily converts the data, so dumping WideString data cannot be dumping UTF-8 text, by the very definition of WideString.
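The conversion described above can be observed with a small console program. This is only an illustrative sketch: the program name and the sample byte #$E9 (an accented "e" if the system ANSI code page is Windows-1252, which is an assumption) are not from the original code.

```pascal
program WideVsUtf8;
{$APPTYPE CONSOLE}

uses
  SysUtils;

var
  U: UTF8String;
  W: WideString;
begin
  // One non-ASCII ANSI character, encoded to UTF-8 (several bytes).
  U := AnsiToUtf8(#$E9);
  // Implicit conversion: every 8-bit char is widened to a 16-bit unit.
  W := U;
  // Element size is 1 byte for UTF8String and 2 bytes for WideString.
  Writeln('UTF8String: ', Length(U), ' chars x ', SizeOf(U[1]), ' byte(s)');
  Writeln('WideString: ', Length(W), ' chars x ', SizeOf(W[1]), ' byte(s)');
end.
```

Writing `Length(W) * SizeOf(W[1])` bytes from the WideString therefore puts 16-bit code units after the BOM, which is why an editor reports UTF-8 but the text looks wrong.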
procedure SaveAsUTF8(const Name: String; Data: TStrings);
const
  // UTF-8 byte order mark, stored byte by byte
  cUTF8: array [1..3] of Byte = ($EF, $BB, $BF);
var
  W_TXT: UTF8String;   // 8-bit string, so no UTF-16 conversion happens
  fs: TFileStream;
  Trimmed: AnsiString;
begin
  Trimmed := Trim(Data.Text);
  if Trimmed <> '' then begin
    W_TXT := AnsiToUTF8(Trimmed);
    fs := TFileStream.Create(Name, fmCreate);
    try
      fs.WriteBuffer(cUTF8[1], SizeOf(cUTF8));
      fs.WriteBuffer(W_TXT[1], Length(W_TXT) * SizeOf(W_TXT[1]));
    finally
      fs.Free
    end;
  end;
end;
BTW, this code of yours does not create a file at all, not even an empty one, if the source data is empty. That looks rather suspicious, though it is up to you to decide whether it is an error with respect to the rest of your program.
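If an output file is wanted even for empty input, one possible variant is sketched below. The name SaveAsUTF8AlwaysCreate is hypothetical, and writing a BOM into an otherwise empty file is a design choice assumed here, not something the original code does.

```pascal
uses
  Classes, SysUtils;

// Hypothetical variant: always creates the file, even for empty data.
procedure SaveAsUTF8AlwaysCreate(const Name: string; Data: TStrings);
const
  cUTF8: array [1..3] of Byte = ($EF, $BB, $BF);
var
  U: UTF8String;
  fs: TFileStream;
begin
  U := AnsiToUtf8(Trim(Data.Text));
  fs := TFileStream.Create(Name, fmCreate);
  try
    fs.WriteBuffer(cUTF8[1], SizeOf(cUTF8));
    // Guard the index: U[1] must not be touched on an empty string.
    if Length(U) > 0 then
      fs.WriteBuffer(U[1], Length(U));
  finally
    fs.Free;
  end;
end;
```

Whether callers expect a BOM-only file, a zero-byte file, or no file at all depends on whatever consumes the output.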
Properly "uploading" the resulting file or stream to the web is yet another issue (best asked as a separate question on a Q&A site like SO), related to testing conformance with HTTP. As a starting point, you can read some hints in "WWW server reports error after POST Request by Internet Direct components in Delphi".