PHP: strtolower and UTF-8

16.03.2009 Tags: tutorial, PHP

I have a small function that I use for formatting strings, mainly for URL rewriting. It converts capital letters to lowercase, replaces white space with '-', removes special characters, and so on.
The 'capital letters' part is done by the simple, yet efficient strtolower():

<?php

function urlrewrite($string) {

$string = strtolower($string);
...

}

?>


Working with the UTF-8 charset, I noticed that a string like '2ème TEST' comes back as a wonderful 2�me test... not really what I wanted.
Having a look at php.net, I found that strtolower() uses the charset defined by the current locale. So whether or not you work in UTF-8 (page encoding, database connection and data), strtolower() doesn't care and uses the locale's charset. The string then typically gets treated as a single-byte charset such as ISO-8859-1, or as plain ASCII, so the bytes of multibyte characters can get mangled and no longer display correctly.
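To see why the string breaks, it helps to look at the bytes. The sketch below assumes the locale is a single-byte ISO-8859-1 one (an assumption, the real locale depends on the server) and simulates what a byte-by-byte lowercase would do to the UTF-8 encoding of 'è'. The helper latin1_like_lower() is hypothetical and only stands in for that behaviour. It also assumes the mbstring extension is loaded.

```php
<?php

// 'è' encoded in UTF-8 is two bytes: 0xC3 0xA8.
$utf8 = "2\xC3\xA8me TEST";

// Hypothetical helper: lowercases byte by byte the way a single-byte
// ISO-8859-1 locale would (A-Z, plus 0xC0-0xDE except 0xD7, shifted by 0x20).
function latin1_like_lower($bytes) {
    $out = '';
    for ($i = 0; $i < strlen($bytes); $i++) {
        $c = ord($bytes[$i]);
        $isAsciiUpper  = ($c >= 0x41 && $c <= 0x5A);
        $isLatin1Upper = ($c >= 0xC0 && $c <= 0xDE && $c !== 0xD7);
        $out .= chr(($isAsciiUpper || $isLatin1Upper) ? $c + 0x20 : $c);
    }
    return $out;
}

$broken = latin1_like_lower($utf8);

echo bin2hex("\xC3\xA8"), "\n";                  // c3a8
echo bin2hex(substr($broken, 1, 2)), "\n";       // e3a8
var_dump(mb_check_encoding($broken, 'UTF-8'));   // bool(false)
```

The first byte of 'è' (0xC3) is read as 'Ã' and lowercased to 'ã' (0xE3), so the two-byte sequence is no longer valid UTF-8 and the browser shows a replacement character.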
So, what now? There are two main solutions.
The first would be to convert from UTF-8 to a single-byte charset (utf8_decode() targets ISO-8859-1), use strtolower(), then convert the result back to UTF-8 like this:

<?php

function strtolower_utf8($string) {

$result = utf8_decode($string);
$result = strtolower($result);
$result = utf8_encode($result);
return $result;

}

?>


It works, but it's not really... sexy.
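One caveat with the round trip: ISO-8859-1 only covers Western European characters, so anything outside that set cannot survive the conversion. A minimal sketch, using mb_convert_encoding() as a stand-in for the decode and encode steps (my assumption, not what the function above calls) and assuming mbstring is loaded with its default substitution character:

```php
<?php

// 'Ő' (U+0150) has no ISO-8859-1 equivalent; 'É' (U+00C9) does.
$hungarian = "\xC5\x90";   // Ő in UTF-8
$french    = "\xC3\x89";   // É in UTF-8

// Stand-ins for the two conversion steps of the round trip.
$toLatin1 = function ($s) { return mb_convert_encoding($s, 'ISO-8859-1', 'UTF-8'); };
$toUtf8   = function ($s) { return mb_convert_encoding($s, 'UTF-8', 'ISO-8859-1'); };

var_dump($toUtf8($toLatin1($french)) === $french);         // bool(true)
var_dump($toUtf8($toLatin1($hungarian)) === $hungarian);   // bool(false)
echo $toUtf8($toLatin1($hungarian)), "\n";                 // ?
```

The character that does not fit is replaced on the way out and cannot be restored on the way back.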
The other solution: use the function mb_strtolower(). It also lowercases capital letters, but lets you specify which charset to use.
So with a lighter function, we get the same result:

<?php

function strtolower_utf8($string) {

// Lowercase using UTF-8 rules instead of the current locale.
return mb_strtolower($string, 'UTF-8');

}

?>
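A quick check of the mb_strtolower() approach, assuming the mbstring extension is loaded and the script file is saved as UTF-8. The function is written here with an explicit return so the snippet runs on its own; the sample strings are only illustrative.

```php
<?php

function strtolower_utf8($string) {
    // Lowercase using UTF-8 rules, regardless of the current locale.
    return mb_strtolower($string, 'UTF-8');
}

echo strtolower_utf8('2ÈME TEST'), "\n";   // 2ème test
echo strtolower_utf8('ESPAÑA'), "\n";      // españa
echo strtolower_utf8('ŐRSÉG'), "\n";       // őrség
```

Passing 'UTF-8' explicitly avoids depending on whatever internal encoding mbstring happens to be configured with.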


Comments

Panagiotis 29.04.2010, 01:31:42 Thanks man
lacus 12.11.2010, 07:48:08 Thanks for the solution!
I was in the stack with hungarian characters too, but this works fine.
Astanos 07.12.2010, 20:33:18 You're welcome lacus ;)
bertrand 03.07.2011, 13:57:43 It doesnt work for ñ for examples
Fumagally 30.09.2011, 21:38:36 thanks!! usefull.
flashfs 23.01.2012, 16:44:32 Thank you, that solved my problem. I was using mb_strtolower without specifying 'UTF-8'
Astanos 23.01.2012, 16:48:58 You're welcome flashfs :)
twitter unfollowers 21.08.2012, 11:22:38 Thank you for information.
That solved my day while struggling with UTF-8 characters.
Astanos 21.08.2012, 13:17:21 Glad it was useful :)