When usinc `arc diff` in PHP 8.1 and above, this exception can be raised:
```
strlen(): Passing null to parameter #1 ($string) of type string is deprecated
```
Example specific case:
```
$ arc diff --trace
Running unit tests...
>>> [57] (+50,812) <connect> phabricator_config
<<< [57] (+50,813) <connect> 571 us
>>> [58] (+50,813) <query> SELECT * FROM `phabricator_config`.`config_entry` WHERE namespace = 'default' AND isDeleted = 0
<<< [58] (+50,813) <query> 165 us
[2023-03-24 19:51:00] EXCEPTION: (RuntimeException) strlen(): Passing null to parameter #1 ($string) of type string is deprecated at [<arcanist>/src/error/PhutilErrorHandler.php:261]
arcanist(head=master, ref.master=9e1bb955fac9), phorge(head=arcpatch-D25066_workboard_refactor, ref.master=c6f56b8221ba, ref.arcpatch-D25066_workboard_refactor=6c38649ba830)
#0 PhutilErrorHandler::handleError(integer, string, string, integer) called at [<phorge>/src/infrastructure/env/PhabricatorEnv.php:128]
#1 PhabricatorEnv::initializeCommonEnvironment(boolean) called at [<phorge>/src/infrastructure/env/PhabricatorEnv.php:75]
#2 PhabricatorEnv::initializeScriptEnvironment(boolean) called at [<phorge>/scripts/init/lib.php:26]
#3 init_phabricator_script(array) called at [<phorge>/scripts/init/init-script.php:7]
#4 require_once(string) called at [<phorge>/scripts/__init_script__.php:3]
#5 require_once(string) called at [<phorge>/src/infrastructure/testing/PhabricatorTestCase.php:62]
#6 PhabricatorTestCase::willRunTestCases(array) called at [<arcanist>/src/unit/engine/PhutilUnitTestEngine.php:64]
#7 PhutilUnitTestEngine::run() called at [<arcanist>/src/unit/engine/ArcanistConfigurationDrivenUnitTestEngine.php:148]
#8 ArcanistConfigurationDrivenUnitTestEngine::run() called at [<arcanist>/src/workflow/ArcanistUnitWorkflow.php:170]
#9 ArcanistUnitWorkflow::run() called at [<arcanist>/src/workflow/ArcanistDiffWorkflow.php:1231]
#10 ArcanistDiffWorkflow::runUnit() called at [<arcanist>/src/workflow/ArcanistDiffWorkflow.php:1133]
#11 ArcanistDiffWorkflow::runLintUnit() called at [<arcanist>/src/workflow/ArcanistDiffWorkflow.php:363]
#12 ArcanistDiffWorkflow::run() called at [<arcanist>/scripts/arcanist.php:427]
<<< [1] (+51,023) <exec> 51,023,728 us
```
This is just a specific exception but there are lot of them in the source code. It happens since this is executed:
```lang=php
<?php
$v = null;
if (strlen($v) ) {
echo "I have a non-empty string";
}
```
That above snippet in PHP 8.1 is not accepted anymore since NULL is not anymore an accepted value in the input domain of `strlen()`.
https://wiki.php.net/rfc/deprecate_null_to_scalar_internal_arg
> Passing null to non-nullable scalar internal arguments will be forbidden in the future, which constitutes a backwards compatibility break. The behavior for user-defined functions, strict typing mode and non-scalar arguments is not affected, as all these cases already generate a TypeError for null arguments.
More general info:
> Scalar types for built-in functions are nullable by default, this behaviour is deprecated to align with the behaviour of user-defined functions, where scalar types need to be marked as nullable explicitly.
> https://www.php.net/manual/en/migration81.deprecated.php
== What to Avoid (1) ==
Suppressing the warning is not much feasible:
```lang=diff
- if (strlen($test)) {
+ if (@strlen($test)) {
```
If something is deprecated we should avoid using it, not avoid knowing the problem.
== General Fix ==
This is the literal needed fix for some cases:
```lang=diff
- if (strlen($test)) {
+ if ($test !== null && strlen($test)) {
```
So strlen never receives NULL.
But, this probably does not fix the root problem. That is: probably we should avoid to use `strlen()` in some cases.
NOTE: Also Evan had not a strong opinion about this: https://we.phorge.it/book/flavor/article/php_pitfalls/ «A better test is __probably__ strlen()»
For example, we should avoid to use `strlen()` if we are not interested in knowing the exact length of the string in bytes.
For example, we should adopt other methods to check whenever a string is empty. For example this is the literal check that is more efficient and not deprecated:
```lang=diff
- if (strlen($test)) {
+ if (is_string($test) && $test !== '') {
```
The above also fixes the deprecation, and it does in a way that is the literal solution: assure to have a populated string (not null, not an array, etc.).
The above solution is also more efficient because it does not calculate the length of the whole string just to understand if it's a string and if it's empty.
https://www.php.net/manual/en/function.is-string.php
https://stackoverflow.com/q/75898776/3451846
Truth table:
| Value | `if($v)` |`!empty($v)`|`is_string($v) && $v !== ''`| `if(strlen($v))` |`phutil_nonempty_string($v)`|
|---------|------------|------------|----------------------------|----------------------|----------------------------|
| `null` |false | false | false |false (**Deprecated**)| false |
|`false` |false | false | false |false |Exception|
|`true` |true | true | false |**true** |Exception|
| `""` |false | false | false |false | false |
| `"0"` |**false** | **false** | true |true | true |
| `"aaa"` |true | true | true |true | true |
== About PhabricatorEnv::getEnvConfig() ==
This deserves a dedicated strategy since it's very frequent.
NOTE: See {T15199}
== The most safe Universal solution to check for a Populated String? ==
→ Casting to string is safe.
Example:
```lang=diff
-if(strlen($v))
+if(phutil_string_cast($v) !== '')
```
Notes upstream:
- AphrontRequest::getStr only ever returns a string, and is safe to use phutil_nonempty_string.
- PhabricatorEnv::getEnvConfig can return non-string things so any values coming from there should never use phutil_nonempty_string.
- AphrontRequest::getHTTPHeader indicates it could return wild so phutil_nonempty_string should not be used.
- AphrontRequest::getURIData isn't clear if it could return non-string data, so never use phutil_nonempty_string.
But, if you know that you should receive a string, this is really better and more readable:
```lang=diff
-if(strlen($v))
+if(phutil_nonempty_string($v))
```
But, note that `phutil_nonempty_string()` will throw an exception by design if the input value is an array or an object or something really unexpected. This may be a good extra check in most cases.
IMPORTANT: In some cases `phutil_nonempty_string()` can throw an exception also when before it was somehow tolerated. For example, if `$v` is false, that method throws an exception. Anyway, note that `strlen(true)` returns 1 so using booleans with `strlen()` is already nonsense and deserves stricter checks.
Anyway sometime it's just better a simple `is_string($v) && $v !== ''` for generic types.
Also, using a simple `if($v)` can be useful in contexts where you hare sure that you have a string/null, and you are sure that the string "`0`" is nonsense for your context:
```lang=diff
-if(strlen($v))
+if($v)
```
== In short ==
* `phutil_string_cast($v) !== ''`: assures to have something that, casted to string (like an object with __toString(), etc.), or the boolean true that becomes `"1"`, etc., it's not an empty string. This is really what happens under the hood most of the time.
* `phutil_nonempty_string($v)`: assures to have a non-empty string (not null) and throws an exception for alien stuff like booleans, object and arrays, etc.
* `is_string($test) && $test !== ''` assures to have a non-empty string and does not throw anything otherwise
* `if($test)` assures to have a non-empty value (but the string "0" is considered empty)
→ ~~Going blindly~~, it might make ~~a lot of~~ sense to replace ~~each~~ `if(strlen($v))` with `if(phutil_nonempty_string($v))`.
Example upstream:
https://secure.phabricator.com/rARCe5b92735c6dcb8c5fcb90b68d3ef79bce7a5b2dd
IMPORTANT: As already mentioned, that damn function is too much pissed off and explodes violently by design it it receives `false` or other non-strings. Caution.
## Automatic fixes
This is a sed regular expression that could be used to match a single `strlen($v)` or a single `!strlen($v)` inside an `if` and replace it to the respective `phutil_nonempty_string($v)` or `!phutil_nonempty_string($v)`:
```
(if *\([ !]*)strlen\(( *\$?[a-z1-9A-Z_]+ *)\)( *\))
```
It splits the stuff before strlen, in the function argument of strlen, and after. So, the replacement can be something like:
```
\1phutil_nonempty_string(\2)\3
```
And this is the related GNU sed rule:
```
sed -E 's/(if *\([ !]*)strlen\(( *\$?[a-z1-9A-Z_]+) *\)( *\))/\1phutil_nonempty_string(\2)\3/g' FILE
```
So I created a small script to do a single replacement on a file:
```
name=~/fix-strlen.sh
FILE="$1"
sed -E 's/(if *\([ !]*)strlen\(( *\$?[a-z1-9A-Z_]+) *\)( *\))/\1phutil_nonempty_string(\2)\3/g' "$FILE"
```
Then I'm able to execute this fix on every damn file:
```
find . -type f -name "*.php" -exec ~/fix-strlen.sh {} \;
```