Replace function utf8_decode() - deprecated since PHP 8.2
Summary:
The function utf8_decode() was a shortcut to convert strings
from UTF-8 to ISO-8859-1 ("Latin 1").
This function has been deprecated since PHP 8.2 and will be removed
in PHP 9:
https://wiki.php.net/rfc/remove_utf8_decode_and_utf8_encode
As mentioned in the RFC, if $string is a valid UTF-8 string,
the following can be used to count its number of code points:
strlen(utf8_decode($string))
It works because every code point becomes exactly one byte in the
output: code points that exist in ISO-8859-1 map to a single byte,
and any unmappable code point is replaced with the single byte '?'.
However, the correct native approach is:
mb_strlen($string, 'UTF-8');
Another good approach is:
iconv_strlen($string, 'UTF-8')
Note that mb_strlen() was introduced in PHP 4, so there
are no compatibility issues in using it.
Note also that the mbstring extension is already required by the
installation documentation, so this change should not affect anyone.
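As a rough illustration of the replacement described above, the sketch below compares the byte count with the code point counts for a hypothetical sample string (the string and variable names are assumptions for illustration, not part of the patch). It assumes the mbstring and iconv extensions are loaded.

```php
<?php
// Hypothetical sample: "café €5". In UTF-8, 'é' takes 2 bytes and '€' takes 3 bytes.
$string = "caf\u{00E9} \u{20AC}5";

// strlen() counts bytes, not code points.
$byte_count = strlen($string);

// Both calls below count code points when given the 'UTF-8' encoding.
$mb_count = mb_strlen($string, 'UTF-8');
$iconv_count = iconv_strlen($string, 'UTF-8');

printf("bytes=%d mb=%d iconv=%d\n", $byte_count, $mb_count, $iconv_count);
// prints "bytes=10 mb=7 iconv=7"
```

Passing the encoding explicitly avoids depending on the configured default internal encoding.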
https://wiki.php.net/rfc/remove_utf8_decode_and_utf8_encode
https://www.php.net/manual/en/function.utf8-decode
https://www.php.net/manual/en/function.mb-convert-encoding.php
Closes T15188
Test Plan:
- I was able to execute "arc lint" with PHP 8.2
- I was able to execute this "arc diff" with PHP 8.2
- With this patch you can still run "arc lint" with your local PHP version
Reviewers: O1 Blessed Committers, avivey
Reviewed By: O1 Blessed Committers, avivey
Subscribers: speck, tobiaswiese, Matthew, Cigaryno
Maniphest Tasks: T15188
Differential Revision: https://we.phorge.it/D25092