I am trying to write a program in C# that will split a vCard (VCF) file with multiple contacts into individual files for each contact. I understand that the vCard needs to be saved as ANSI (1252) for most mobile phones to read them.
However, if I open a VCF file using StreamReader and then write it back with StreamWriter (setting 1252 as the Encoding format), all special characters like å, æ and ø are getting written as ?. Surely ANSI (1252) would support these characters. How do I fix this?
Edit: Here's the piece of code I use to read and write the file.
private void ReadFile()
{
    StreamReader sreader = new StreamReader(sourceVCFFile);
    string fullFileContents = sreader.ReadToEnd();
}

private void WriteFile()
{
    StreamWriter swriter = new StreamWriter(sourceVCFFile, false, Encoding.GetEncoding(1252));
    swriter.Write(fullFileContents);
}
You are correct in assuming that Windows-1252 supports the special characters you listed above (for a full list see the Wikipedia entry).
using (var writer = new StreamWriter(destination, true, Encoding.GetEncoding(1252)))
{
    writer.WriteLine(source);
}
In my test app using the code above it produced this result:
Look at the cool letters I can make: å, æ, and ø!
No question marks to be found. Are you setting the encoding when you're reading it in with StreamReader?
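To illustrate why the reader's encoding matters, here is a minimal in-memory sketch of one way the symptom can arise. It assumes the bytes on disk are Windows-1252 while the reader decodes them as something else (StreamReader defaults to UTF-8 when no encoding is given). The class and method names are hypothetical, made up for this illustration.

```csharp
using System.Text;

public static class EncodingMismatchDemo
{
    // Windows-1252 needs the code pages provider on .NET Core / .NET 5+.
    // On .NET Framework, drop the RegisterProvider line; the code page is built in.
    public static Encoding Win1252
    {
        get
        {
            Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
            return Encoding.GetEncoding(1252);
        }
    }

    // Simulates a file stored as Windows-1252 that is decoded with `assumed`
    // and then written back out as Windows-1252.
    public static string RoundTrip(string text, Encoding assumed)
    {
        byte[] onDisk = Win1252.GetBytes(text);      // bytes as stored in the file
        string decoded = assumed.GetString(onDisk);  // what the reader hands back
        byte[] written = Win1252.GetBytes(decoded);  // what the writer emits
        return Win1252.GetString(written);
    }
}
```

Under these assumptions, RoundTrip("å", Encoding.UTF8) should come back as "?", because the lone byte 0xE5 is not valid UTF-8 and gets replaced during decoding, while RoundTrip("å", EncodingMismatchDemo.Win1252) keeps the character intact.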
EDIT:
You should just be able to use Encoding.Convert to convert the UTF-8 VCF file into Windows-1252. No need for Regex.Replace. Here is how I would do it:
// You might want to think of a better method name.
// Round-trips the text through Windows-1252; any character that code page
// cannot represent comes back as '?'.
public string ConvertUTF8ToWin1252(string source)
{
    Encoding utf8 = new UTF8Encoding();
    Encoding win1252 = Encoding.GetEncoding(1252);

    byte[] input = source.ToUTF8ByteArray(); // Note the use of my extension method
    byte[] output = Encoding.Convert(utf8, win1252, input);

    return win1252.GetString(output);
}
And here is how my extension method looks:
public static class StringHelper
{
    // It should be noted that this method is expecting UTF-8 input only,
    // so you probably should give it a more fitting name.
    public static byte[] ToUTF8ByteArray(this string str)
    {
        Encoding encoding = new UTF8Encoding();
        return encoding.GetBytes(str);
    }
}
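Because a .NET string is already Unicode in memory, the encoding only really matters at the byte boundary, so the same Encoding.Convert call can also be applied to the raw file bytes. The following is a rough sketch under the assumption that the source file really is UTF-8; the class name, method names, and the byte order mark handling are my own additions for illustration.

```csharp
using System.IO;
using System.Text;

public static class VCardTranscoder
{
    // Converts UTF-8 bytes to Windows-1252 bytes, skipping a UTF-8 byte
    // order mark if one is present (it has no Windows-1252 equivalent).
    public static byte[] Utf8BytesToWin1252(byte[] utf8Bytes)
    {
        // Needed on .NET Core / .NET 5+; drop this line on .NET Framework.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        Encoding win1252 = Encoding.GetEncoding(1252);

        bool hasBom = utf8Bytes.Length >= 3
            && utf8Bytes[0] == 0xEF && utf8Bytes[1] == 0xBB && utf8Bytes[2] == 0xBF;
        int offset = hasBom ? 3 : 0;

        return Encoding.Convert(Encoding.UTF8, win1252,
                                utf8Bytes, offset, utf8Bytes.Length - offset);
    }

    // Reads a UTF-8 file and writes its Windows-1252 equivalent.
    public static void ConvertFile(string sourcePath, string destinationPath)
    {
        byte[] converted = Utf8BytesToWin1252(File.ReadAllBytes(sourcePath));
        File.WriteAllBytes(destinationPath, converted);
    }
}
```

Working on bytes avoids any guess the reader might make about the source encoding, at the cost of loading the whole file into memory, which should be fine for typical vCard files.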
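Also, you'll probably want to wrap the StreamReader and StreamWriter in your ReadFile and WriteFile methods in using blocks, so they are flushed and closed when you are done with them.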
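For example, the two methods could look roughly like this. This is a sketch rather than a drop-in replacement: I have turned the paths and contents into parameters and made the source encoding explicit, which your original code does not do, and the class name is made up.

```csharp
using System.IO;
using System.Text;

public static class VCardFileIO
{
    // Reads the whole file using the encoding the caller believes it is in.
    public static string ReadFile(string path, Encoding sourceEncoding)
    {
        using (var reader = new StreamReader(path, sourceEncoding))
        {
            return reader.ReadToEnd();
        }
    }

    // Writes the contents as Windows-1252, overwriting any existing file.
    public static void WriteFile(string path, string contents)
    {
        // Needed on .NET Core / .NET 5+; drop this line on .NET Framework.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        using (var writer = new StreamWriter(path, false, Encoding.GetEncoding(1252)))
        {
            writer.Write(contents);
        }
    }
}
```

Without the using block, the StreamWriter may never flush its buffer, so part or all of the text can be missing from the file on disk.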