PHP - UTF-8 to Chinese ANSI (GB2312?) - Export CSV file

david_b · Jul 27, 2012 · Viewed 8.7k times

I am posting this after several hours of research, spread over several attempts, without finding an answer.

My goal is to write a CSV file using PHP. The file must be in the Chinese ANSI encoding (I suppose that is GB2312 for simplified Chinese; Notepad++ only shows "ANSI" as the encoding). This is required in order to import the file into another tool.

[Important note]

We are currently converting a file with notepad++ and a PC that has Chinese as default language. The process is:

  • get the UTF-8 CSV from the web app
  • save it as CSV with Excel 2003 on the Chinese PC
  • open it in Notepad++ (the encoding is already ANSI at this point) and delete the single leading "?" at the beginning of the file.

I ran a test: I renamed my .csv file to .php and replaced its content with the following code, so that the file keeps the same encoding:

<?php echo mb_detect_encoding("test"); ?>

This will print: "ASCII".

So I am not sure which encoding my CSV output should have: GB2312, ASCII, or ANSI? I am not even clear on the difference between them.

I also read that a file saved as CSV with Excel 2007 on a Chinese PC is accepted by this tool.

[/Important Note]
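
On the difference between the three names, here is a minimal sketch (not part of the original post) that may help. It assumes the mbstring extension is loaded and that the script itself is saved as UTF-8. Plain ASCII text such as "test" has the same bytes in ASCII, UTF-8 and GB2312, which is why the test above reports "ASCII" and says nothing about the file. "ANSI" in Notepad++ is not an encoding of its own: it means the Windows system code page, which on a simplified Chinese PC is code page 936 (GBK, a superset of GB2312). The variable names are illustrative only.

```php
<?php
// Sketch: compare the byte representation of ASCII and Chinese text.
// Assumes this file is saved as UTF-8 and mbstring is available.
$ascii   = "test";
$chinese = "东西"; // two Chinese characters, UTF-8 in this source file

// ASCII-only text is byte-for-byte identical after conversion to GB2312.
$asciiUnchanged = mb_convert_encoding($ascii, "GB2312", "UTF-8") === $ascii;

// Chinese characters differ: 3 bytes each in UTF-8, 2 bytes each in GB2312.
$utf8Length   = strlen($chinese);
$gb2312Length = strlen(mb_convert_encoding($chinese, "GB2312", "UTF-8"));

var_dump($asciiUnchanged, $utf8Length, $gb2312Length);
```

So only a string that actually contains Chinese characters can show whether the output is UTF-8 or GB2312.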

So far I have not managed to get it right. When I open the generated file in Notepad++, the encoding is still shown as UTF-8. That is evidently correct, because the Chinese characters look fine, whereas they should look "broken" :-).

I am sending the following headers:

header("Content-type: text/csv; charset=GB2312");
header("Content-Disposition: attachment; filename=$filename.csv");
header("Content-Transfer-Encoding: binary"); 
header("Cache-Control: must-revalidate, post-check=0, pre-check=0");
header("Pragma: no-cache");
header("Expires: 0");

[Additional information]

This is how my file is built (simplified to keep it readable):

//header, hard coded in Chinese
$csv = "东西,东西,东西\n"; //example "stuff,stuff,stuff"
[...]
//write line by line, status is also hard coded (行)
$csv .= $DB_data_1.",".$DB_data_2.",行\n"; //行=OK

[/Additional information]

I also convert my CSV string to GB2312 with iconv before printing it (I also tried mb_convert_encoding):

setlocale(LC_ALL,'zh_CN');
$csv = iconv("UTF-8","GB2312",$csv);
echo($csv);

My .php file is saved as UTF-8 with a BOM (not as "UTF-8 without BOM").

Basically, I always get a UTF-8 file as output, but I need ANSI. There seem to be so many parameters and attributes involved that I cannot get it right. Your help would be appreciated!

Thanks!

David

[Additional information]

As an example, one column of my header should go through the following encoding change:

  • in PHP source code (UTF-8 file, English computer): 商品序号 (meaning: SKU, item code)
  • in the final CSV file (ANSI file, English computer): ÉÌÆ·ÐòºÅ
  • in the final CSV file (ANSI file, Chinese computer): 商品序号

[/Additional information]
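
The three views in this example can be reproduced with a short sketch (not part of the original post). It assumes mbstring is loaded and the script is saved as UTF-8. The GB2312 bytes of 商品序号 are C9 CC C6 B7 D0 F2 BA C5; an English PC reads each of those bytes as one Western character, which gives exactly the "broken" text shown above. ISO-8859-1 is used here as a stand-in for the Western ANSI code page, which is an assumption that holds for these particular bytes.

```php
<?php
// Sketch: show why correct GB2312 output looks "broken" on an English PC.
// Assumes this file is saved as UTF-8 and mbstring is available.
$header = "商品序号";

// Convert the UTF-8 string to GB2312 (2 bytes per Chinese character).
$gb2312 = mb_convert_encoding($header, "GB2312", "UTF-8");
$hex    = bin2hex($gb2312);

// Reinterpret the same bytes as Western single-byte text,
// which is what an editor on an English PC does with an "ANSI" file.
$asSeenOnEnglishPc = mb_convert_encoding($gb2312, "UTF-8", "ISO-8859-1");

echo $hex, "\n", $asSeenOnEnglishPc, "\n";
```

If the downloaded file shows 商品序号 correctly on the English PC, the bytes are most likely still UTF-8.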

Answer

xdazz · Jul 27, 2012

string mb_convert_encoding ( string $str , string $to_encoding [, mixed $from_encoding ] )

Note that the second parameter is the target encoding and the third is the source encoding, so the call should be:

$csv = mb_convert_encoding($csv, "GB2312", "UTF-8");
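
Putting this together with the question, a minimal sketch of the whole conversion step might look like the following. The function name csv_to_gb2312 and the $rows structure are hypothetical and not from the original post; the sketch assumes mbstring is loaded and the input strings are UTF-8. It also strips a leading UTF-8 BOM, on the assumption that this is the "?" the asker deletes by hand in Notepad++, since a BOM has no GB2312 equivalent.

```php
<?php
// Hypothetical helper (not from the original post): join rows into CSV text,
// drop a leading UTF-8 BOM if present, then convert UTF-8 to GB2312.
function csv_to_gb2312(array $rows): string
{
    $csv = '';
    foreach ($rows as $row) {
        // Naive join, as in the question; no quoting or escaping of fields.
        $csv .= implode(',', $row) . "\n";
    }

    // A UTF-8 BOM is the byte sequence EF BB BF at the start of the data.
    if (strncmp($csv, "\xEF\xBB\xBF", 3) === 0) {
        $csv = substr($csv, 3);
    }

    // Arguments: string, target encoding, source encoding.
    return mb_convert_encoding($csv, 'GB2312', 'UTF-8');
}

$output = csv_to_gb2312([
    ['东西', '东西', '东西'],
    ['a', 'b', '行'],
]);
```

The result in $output can then be sent with echo after the headers from the question. If the .php file itself is saved with a BOM, that BOM is emitted before any header() call, so saving the script as UTF-8 without BOM is probably worth trying as well.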