C#: Converting byte[] to UTF8 encoded string

dotmartin · Aug 4, 2010 · Viewed 30.6k times

I am using a library called EXIFextractor to extract metadata from images. The library relies in part on System.Drawing.Imaging.PropertyItem to do all the hard work. According to the Microsoft documentation, some of the data in PropertyItem, such as the image description, is returned as an ASCII string stored in a byte[].

My problem is that international characters (å, ä, ö, etc.) are dropped and replaced by question marks. When I debug the code, it is apparent that the byte[] actually holds a UTF-8 encoded string.

I'd like to parse the byte[] as a UTF-8 string. How can I do this without losing any information in the process?

Thanks in advance!


Update:

I have been asked to provide a snippet from my code:

The first snippet is from the class I use, namely EXIFextractor.cs, written by Asim Goheer:

foreach( System.Drawing.Imaging.PropertyItem p in parr )
{
    string v = "";

    // ...

    else if( p.Type == 0x2 )
    {
        // string
        v = ascii.GetString(p.Value);
    }

And this is my code, where I try my best to handle the results of the above:

try {
    EXIFextractor exif = new EXIFextractor(ref bmp, "");
    object o;
    if ((o = exif["Image Description"]) != null)
        MediaFile.Description = Tools.UTF8Encode(o.ToString());

I have also tried a couple of other ways of getting my precious å, ä, ö from the data, but nothing seems to do the trick. I am starting to think Hans Passant is right about his conclusions in his answer below.
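
To illustrate why fixing the string afterwards probably cannot work, here is a minimal sketch. It assumes the bytes really are UTF-8 and that the decoder in the library snippet behaves like the standard `Encoding.ASCII`; the class and method names are made up for the example. Once the ASCII decoder has turned every byte above 0x7F into '?', the original bytes are gone, so re-encoding the resulting string only produces encoded question marks.

```csharp
using System;
using System.Text;

// Hypothetical demo class, not part of EXIFextractor.
public static class LossyDecodeDemo
{
    // Decodes the bytes with the ASCII decoder, as the library snippet does.
    // Bytes outside the 7-bit ASCII range are replaced with '?'.
    public static string DecodeAsAscii(byte[] raw)
    {
        return Encoding.ASCII.GetString(raw);
    }

    // Returns true if re-encoding the decoded string as UTF-8
    // gives back exactly the original bytes.
    public static bool SurvivesRoundTrip(byte[] raw)
    {
        byte[] again = Encoding.UTF8.GetBytes(DecodeAsAscii(raw));
        if (again.Length != raw.Length) return false;
        for (int i = 0; i < raw.Length; i++)
        {
            if (again[i] != raw[i]) return false;
        }
        return true;
    }
}
```

For example, passing the UTF-8 bytes of "åäö" (`Encoding.UTF8.GetBytes("\u00E5\u00E4\u00F6")`) to `SurvivesRoundTrip` returns false, while plain ASCII text such as "abc" returns true.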

Answer

Scoregraphic · Aug 4, 2010
string yourText = System.Text.Encoding.UTF8.GetString(yourByteArray);
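
Applied to the question, the call has to be made on the raw byte[] (for example `p.Value`) before anything decodes it as ASCII. Below is a small sketch of a helper that does this; `ExifText` and `DecodeUtf8` are hypothetical names, and it assumes the tag really contains UTF-8. Since PropertyItem documents type 2 values as null-terminated, the sketch also trims trailing '\0' characters.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of EXIFextractor or System.Drawing.
public static class ExifText
{
    // Decodes raw property bytes as UTF-8 and removes the trailing
    // null terminator(s) that type 2 (string) values carry.
    public static string DecodeUtf8(byte[] value)
    {
        if (value == null || value.Length == 0)
        {
            return string.Empty;
        }
        return Encoding.UTF8.GetString(value).TrimEnd('\0');
    }
}
```

If you can edit EXIFextractor.cs, one option would be to replace `ascii.GetString(p.Value)` in the string branch with `ExifText.DecodeUtf8(p.Value)`. Pure ASCII values decode the same way as before, because ASCII is a subset of UTF-8.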