fwrite() and UTF8

Lizard · Jun 13, 2011 · Viewed 50.5k times

I am creating a file using PHP's fwrite(), and I know all my data is in UTF-8 (I have tested this extensively: when the data is saved to the DB and output on a normal web page, everything works and reports as UTF-8). However, I am being told that the file I am outputting contains non-UTF-8 data :( Is there a command in bash (CentOS) to check the format of a file?

When using vim it shows the content as:

Donâ~@~Yt do anything .... Itâ~@~Ys a great site with everything....Weâ~@~Yve only just launched/

Any help would be appreciated: either confirming that the file is UTF-8, or showing how to write UTF-8 content to a file.

UPDATE

To clarify how I know my data is in UTF-8, I have done the following:

  1. DB is set to utf8
  2. When saving data to the database I run this first:

    $enc = mb_detect_encoding($data);

    $data = mb_convert_encoding($data, "UTF-8", $enc);

  3. Just before I run fwrite I have checked the data with the following. Note that each piece of data returns 'IS utf-8':

    if (strlen($data)==mb_strlen($data, 'UTF-8')) print 'NOT UTF-8'; else print 'IS utf-8';
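A note on the check in step 3: comparing strlen() with mb_strlen() appears to show only whether the string contains multibyte sequences, not whether its bytes are well-formed UTF-8 (plain ASCII, which is valid UTF-8, would print 'NOT UTF-8'). A minimal sketch of a stricter check, assuming the mbstring extension is loaded; the helper name isValidUtf8 is hypothetical and not part of the original code:

```php
<?php
// Hypothetical helper: true when $data is well-formed UTF-8.
// mb_check_encoding() validates the byte sequence itself.
function isValidUtf8($data)
{
    return mb_check_encoding($data, 'UTF-8');
}

$samples = array(
    'plain ascii'  => "Dont do anything",
    'utf-8 quote'  => "Don\xE2\x80\x99t do anything", // U+2019 encoded as UTF-8
    'latin-1 byte' => "caf\xE9",                      // lone 0xE9 is not valid UTF-8
);

foreach ($samples as $label => $s) {
    echo $label, ': ', isValidUtf8($s) ? 'valid' : 'invalid', "\n";
}
```

This only reports whether the bytes could be UTF-8; it cannot tell whether text was converted twice or decoded with the wrong charset earlier in the pipeline.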

Thanks!

Answer

Florin Sima · Aug 31, 2012

If you know the data is in UTF-8, then you want to set up the header.

I wrote a solution in answer to another thread.

The solution is the following: as the UTF-8 byte-order mark is \xef\xbb\xbf, we should add it to the document's header, that is, write it as the first three bytes of the file.

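<?php
// Write $string to $file, prefixed with the UTF-8 byte-order mark.
function writeStringToFile($file, $string){
    $f=fopen($file, "wb");
    $string="\xEF\xBB\xBF".$string; // this is what makes the magic: prepend the BOM to the content
    fputs($f, $string);
    fclose($f);
}
?>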

You can adapt it to your code. Basically, you just want to make sure that you write a UTF-8 file (as you said, you know your content is UTF-8 encoded).
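To confirm what actually ended up on disk, one option is to read the file back and inspect its bytes. The sketch below is an illustration under assumptions, not part of the original answer: the helper name describeFileEncoding is hypothetical, and it assumes the mbstring extension is loaded. It reports whether the file starts with the BOM and whether the rest is well-formed UTF-8. For what it is worth, the â~@~Y shown by vim in the question looks consistent with the bytes E2 80 99, which is the UTF-8 encoding of the right single quotation mark (U+2019) displayed with a non-UTF-8 charset.

```php
<?php
// Hypothetical helper: inspects the raw bytes of a file.
// Returns null if the file cannot be read; otherwise an array with
// 'bom' (does it start with EF BB BF?) and 'valid_utf8'.
function describeFileEncoding($path)
{
    $bytes = file_get_contents($path);
    if ($bytes === false) {
        return null;
    }
    $hasBom = strncmp($bytes, "\xEF\xBB\xBF", 3) === 0;
    // Cast guards against substr() returning false on older PHP versions.
    $body = $hasBom ? (string) substr($bytes, 3) : $bytes;

    return array(
        'bom'        => $hasBom,
        'valid_utf8' => mb_check_encoding($body, 'UTF-8'),
    );
}

// Usage: write a BOM-prefixed string to a temporary file and inspect it.
$path = tempnam(sys_get_temp_dir(), 'utf8');
file_put_contents($path, "\xEF\xBB\xBF" . "Don\xE2\x80\x99t do anything");
var_dump(describeFileEncoding($path));
unlink($path);
```

Whether to keep the BOM depends on the consumer of the file: some tools use it to detect UTF-8, while others treat it as stray leading characters.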