strlen() and UTF-8 encoding

Jon Lyles · Jun 14, 2012 · Viewed 30.4k times

Assuming UTF-8 encoding, and strlen() in PHP, is it possible that this string has a length of 4?

I'm only interested in strlen(), not other functions.

This is the string:

$1�2

I have tested it on my own computer, and I have verified UTF-8 encoding, and the answer I get is 6.

I don't see anything in the manual for strlen or anything I've read on UTF-8 that would explain why some of the characters above would count for less than one.

PS: This question and its answer (4) come from a mock test for the ZCE that I bought on eBay.

Answer

Anton · Jun 14, 2012

How about using mb_strlen()?

http://lt.php.net/manual/en/function.mb-strlen.php
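To illustrate the difference, here is a minimal sketch. It assumes the third character of the string is U+FFFD (the replacement character), as it appears on this page, and that the mbstring extension is loaded with default settings. strlen() counts bytes, while mb_strlen() counts characters in the given encoding:

```php
<?php
// Assumption: the garbled character is U+FFFD, which UTF-8 encodes
// as the three bytes EF BF BD. The other three characters are ASCII.
$s = '$1' . "\xEF\xBF\xBD" . '2';

// strlen() counts bytes: 1 + 1 + 3 + 1.
echo strlen($s), "\n";             // 6

// mb_strlen() counts characters in the named encoding.
echo mb_strlen($s, 'UTF-8'), "\n"; // 4

// Hex dump of the raw bytes, to see where the extra two come from.
echo bin2hex($s), "\n";            // 2431efbfbd32
```

This would account for the 6 reported in the question: the one non-ASCII character takes three bytes, not one.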

But if you need to use strlen, it's possible to set the mbstring.func_overload directive to 2 in your PHP configuration, so that calls to strlen in your scripts are automatically replaced with mb_strlen. Note that this directive was deprecated in PHP 7.2 and removed in PHP 8.0.
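If you want to know whether this overloading is active on a given server, a script can check the directive at runtime. This is only a sketch: the helper name strlen_is_overloaded is made up for illustration, and it assumes PHP 7.0 or later for the return type declaration.

```php
<?php
// Hypothetical helper (not part of PHP): reports whether the string
// functions, including strlen(), are overloaded by mbstring.
function strlen_is_overloaded(): bool
{
    // ini_get() returns false when the directive does not exist,
    // for example when mbstring is not loaded or on PHP 8.0 and later.
    $value = ini_get('mbstring.func_overload');
    if ($value === false) {
        return false;
    }
    // Bit 2 of the setting covers the string functions.
    return ((int) $value & 2) === 2;
}

echo strlen_is_overloaded()
    ? "strlen() is overloaded and counts characters\n"
    : "strlen() counts bytes\n";
```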