diff --git a/src/docs/contributor/php_coding_standards.diviner b/src/docs/contributor/php_coding_standards.diviner index bb54478fa3..6c68d194ca 100644 --- a/src/docs/contributor/php_coding_standards.diviner +++ b/src/docs/contributor/php_coding_standards.diviner @@ -1,178 +1,182 @@ @title PHP Coding Standards @group standards This document describes PHP coding standards for Phorge and related projects (like Arcanist). = Overview = This document outlines technical and style guidelines which are followed in Phorge and Arcanist. Contributors should also follow these guidelines. Many of these guidelines are automatically enforced by lint. These guidelines are essentially identical to the Facebook guidelines, since I basically copy-pasted them. If you are already familiar with the Facebook guidelines, you probably don't need to read this super thoroughly. = Spaces, Linebreaks and Indentation = - Use two spaces for indentation. Don't use tab literal characters. - Use Unix linebreaks ("\n"), not MSDOS ("\r\n") or OS9 ("\r"). - Put a space after control keywords like `if` and `for`. - Put a space after commas in argument lists. - Put a space around operators like `=`, `<`, etc. - Don't put spaces after function names. - Parentheses should hug their contents. - Generally, prefer to wrap code at 80 columns. = Case and Capitalization = - Name variables and functions using `lowercase_with_underscores`. - Name classes using `UpperCamelCase`. - Name methods and properties using `lowerCamelCase`. - Use uppercase for common acronyms like ID and HTML. - Name constants using `UPPERCASE`. - Write `true`, `false` and `null` in lowercase. = Comments = - Do not use "#" (shell-style) comments. - Prefer "//" comments inside function and method bodies. = PHP Language Style = - Use "" tag. - Prefer casts like `(string)` to casting functions like `strval()`. - Prefer type checks like `$v === null` to type functions like `is_null()`. - Avoid all crazy alternate forms of language constructs like "endwhile" and "<>". - Always put braces around conditional and loop blocks. = PHP Language Features = - Use PHP as a programming language, not a templating language. - Avoid globals. - Avoid extract(). - Avoid eval(). - Avoid variable variables. - Prefer classes over functions. - Prefer class constants over defines. - Avoid naked class properties; instead, define accessors. - Use exceptions for error conditions. - Use type hints, use `assert_instances_of()` for arrays holding objects. = Examples = **if/else:** lang=php if ($some_variable > 3) { // ... } else if ($some_variable === null) { // ... } else { // ... } You should always put braces around the body of an if clause, even if it is only one line long. Note spaces around operators and after control statements. Do not use the "endif" construct, and write "else if" as two words. **for:** lang=php for ($ii = 0; $ii < 10; $ii++) { // ... } Prefer $ii, $jj, $kk, etc., as iterators, since they're easier to pick out visually and react better to "Find Next..." in editors. **foreach:** lang=php foreach ($map as $key => $value) { // ... } **switch:** lang=php switch ($value) { case 1: // ... break; case 2: if ($flag) { // ... break; } break; default: // ... break; } `break` statements should be indented to block level. **array literals:** lang=php $junk = array( 'nuts', 'bolts', 'refuse', ); Use a trailing comma and put the closing parenthesis on a separate line so that diffs which add elements to the array affect only one line. **operators:** lang=php $a + $b; // Put spaces around operators. $omg.$lol; // Exception: no spaces around string concatenation. $arr[] = $element; // Couple [] with the array when appending. $obj = new Thing(); // Always use parens. **function/method calls:** lang=php // One line eject($cargo); // Multiline AbstractFireFactoryFactoryEngine::promulgateConflagrationInstance( $fuel, $ignition_source); **function/method definitions:** lang=php function example_function($base_value, $additional_value) { return $base_value + $additional_value; } class C { public static function promulgateConflagrationInstance( IFuel $fuel, IgnitionSource $source) { // ... } } **class:** lang=php class Dog extends Animal { const CIRCLES_REQUIRED_TO_LIE_DOWN = 3; private $favoriteFood = 'dirt'; public function getFavoriteFood() { return $this->favoriteFood; } } + +# Extra Readings + +* @{article:PHP Pitfalls} diff --git a/src/docs/flavor/php_pitfalls.diviner b/src/docs/flavor/php_pitfalls.diviner index e15ca9e2bb..b280cb2394 100644 --- a/src/docs/flavor/php_pitfalls.diviner +++ b/src/docs/flavor/php_pitfalls.diviner @@ -1,329 +1,376 @@ @title PHP Pitfalls @group php This document discusses difficult traps and pitfalls in PHP, and how to avoid, work around, or at least understand them. = `array_merge()` in Incredibly Slow When Merging A List of Arrays = If you merge a list of arrays like this: COUNTEREXAMPLE, lang=php $result = array(); foreach ($list_of_lists as $one_list) { $result = array_merge($result, $one_list); } ...your program now has a huge runtime because it generates a large number of intermediate arrays and copies every element it has previously seen each time you iterate. In a libphutil environment, you can use @{function@arcanist:array_mergev} instead. = `var_export()` Hates Baby Animals = If you try to `var_export()` an object that contains recursive references, your program will terminate. You have no chance to intercept or react to this or otherwise stop it from happening. Avoid `var_export()` unless you are certain you have only simple data. You can use `print_r()` or `var_dump()` to display complex variables safely. = `isset()`, `empty()` and Truthiness = A value is "truthy" if it evaluates to true in an `if` clause: lang=php $value = something(); if ($value) { // Value is truthy. } If a value is not truthy, it is "falsey". These values are falsey in PHP: null // null 0 // integer 0.0 // float "0" // string "" // empty string false // boolean array() // empty array -Disregarding some bizarre edge cases, all other values are truthy. Note that -because "0" is falsey, this sort of thing (intended to prevent users from making -empty comments) is wrong in PHP: - - COUNTEREXAMPLE - if ($comment_text) { - make_comment($comment_text); - } - -This is wrong because it prevents users from making the comment "0". //THIS -COMMENT IS TOTALLY AWESOME AND I MAKE IT ALL THE TIME SO YOU HAD BETTER NOT -BREAK IT!!!// A better test is probably `strlen()`. +Disregarding some bizarre edge cases, all other values are truthy. In addition to truth tests with `if`, PHP has two special truthiness operators which look like functions but aren't: `empty()` and `isset()`. These operators help deal with undeclared variables. In PHP, there are two major cases where you get undeclared variables -- either you directly use a variable without declaring it: COUNTEREXAMPLE, lang=php function f() { if ($not_declared) { // ... } } ...or you index into an array with an index which may not exist: COUNTEREXAMPLE function f(array $mystery) { if ($mystery['stuff']) { // ... } } When you do either of these, PHP issues a warning. Avoid these warnings by using `empty()` and `isset()` to do tests that are safe to apply to undeclared variables. `empty()` evaluates truthiness exactly opposite of `if()`. `isset()` returns `true` for everything except `null`. This is the truth table: -| Value | `if()` | `empty()` | `isset()` | -|-------|--------|-----------|-----------| -| `null` | `false` | `true` | `false` | -| `0` | `false` | `true` | `true` | -| `0.0` | `false` | `true` | `true` | -| `"0"` | `false` | `true` | `true` | -| `""` | `false` | `true` | `true` | -| `false` | `false` | `true` | `true` | -| `array()` | `false` | `true` | `true` | -| Everything else | `true` | `false` | `true` | +| Value | `if()` | `empty()` | `isset()` | +|---------------|--------|-----------|-----------| +| `null` | `false`| `true` | `false` | +| `0` | `false`| `true` | `true` | +| `0.0` | `false`| `true` | `true` | +| `"0"` | `false`| `true` | `true` | +| `""` | `false`| `true` | `true` | +|`false` | `false`| `true` | `true` | +|`array()` | `false`| `true` | `true` | +|Everything else| `true` | `false` | `true` | The value of these operators is that they accept undeclared variables and do not issue a warning. Specifically, if you try to do this you get a warning: ```lang=php, COUNTEREXAMPLE if ($not_previously_declared) { // PHP Notice: Undefined variable! // ... } ``` But these are fine: ```lang=php if (empty($not_previously_declared)) { // No notice, returns true. // ... } if (isset($not_previously_declared)) { // No notice, returns false. // ... } ``` So, `isset()` really means `is_declared_and_is_set_to_something_other_than_null()`. `empty()` really means `is_falsey_or_is_not_declared()`. Thus: - If a variable is known to exist, test falsiness with `if (!$v)`, not `empty()`. In particular, test for empty arrays with `if (!$array)`. There is no reason to ever use `empty()` on a declared variable. - When you use `isset()` on an array key, like `isset($array['key'])`, it will evaluate to "false" if the key exists but has the value `null`! Test for index existence with `array_key_exists()`. Put another way, use `isset()` if you want to type `if ($value !== null)` but are testing something that may not be declared. Use `empty()` if you want to type `if (!$value)` but you are testing something that may not be declared. += Check for non-empty strings = + +As already mentioned, note that you cannot just use an `if` or `empty()` to +check for a non-empty string, mostly because "0" is falsey, so you cannot rely +on this sort of thing to prevent users from making empty comments: + + COUNTEREXAMPLE + if ($comment_text) { + make_comment($comment_text); + } + +This is wrong because it prevents users from making the comment "0". + +//THE COMMENT "0" IS TOTALLY AWESOME AND I MAKE IT ALL THE TIME SO YOU HAD +BETTER NOT BREAK IT!!!// + +Another way //was// also `strlen()`: + + COUNTEREXAMPLE + if (strlen($comment_text)) { + make_comment($comment_text); + } + +But using `strlen(null)` causes a deprecation warning since PHP 8.1. Also, +using `strlen()` uses too many CPU cycles to just check of a non-empty. + +In short, outside Phorge, this is a general way to check for non-empty strings +for most wild input types: + +```lang=php + $value_str = (string) $value; + if ($value_str !== '') { + // do something + } +``` + +To do the same thing in Phorge, use this better and safer approach: + +```lang=php + $value_str = phutil_string_cast($value); + if ($value_str !== '') { + // do something + } +``` + +And, if you are 100% sure that you are __only__ working with string and +null, evaluate this instead: + +```lang=php + if (phutil_nonempty_string($value)) { + // do something + } +``` + +WARNING: The function `phutil_nonempty_string()` is designed to throw a nice +exception if it receives `true`, `false`, an array, an object or anything +alien that is not a string and not null. Do your evaluations. + = usort(), uksort(), and uasort() are Slow = This family of functions is often extremely slow for large datasets. You should avoid them if at all possible. Instead, build an array which contains surrogate keys that are naturally sortable with a function that uses native comparison (e.g., `sort()`, `asort()`, `ksort()`, or `natcasesort()`). Sort this array instead, and use it to reorder the original array. In a libphutil environment, you can often do this easily with @{function@arcanist:isort} or @{function@arcanist:msort}. = `array_intersect()` and `array_diff()` are Also Slow = These functions are much slower for even moderately large inputs than `array_intersect_key()` and `array_diff_key()`, because they can not make the assumption that their inputs are unique scalars as the `key` varieties can. Strongly prefer the `key` varieties. = `array_uintersect()` and `array_udiff()` are Definitely Slow Too = These functions have the problems of both the `usort()` family and the `array_diff()` family. Avoid them. = `foreach()` Does Not Create Scope = Variables survive outside of the scope of `foreach()`. More problematically, references survive outside of the scope of `foreach()`. This code mutates `$array` because the reference leaks from the first loop to the second: ```lang=php, COUNTEREXAMPLE $array = range(1, 3); echo implode(',', $array); // Outputs '1,2,3' foreach ($array as &$value) {} echo implode(',', $array); // Outputs '1,2,3' foreach ($array as $value) {} echo implode(',', $array); // Outputs '1,2,2' ``` The easiest way to avoid this is to avoid using foreach-by-reference. If you do use it, unset the reference after the loop: ```lang=php foreach ($array as &$value) { // ... } unset($value); ``` = `unserialize()` is Incredibly Slow on Large Datasets = The performance of `unserialize()` is nonlinear in the number of zvals you unserialize, roughly `O(N^2)`. | zvals | Approximate time | |-------|------------------| | 10000 |5ms | | 100000 | 85ms | | 1000000 | 8,000ms | | 10000000 | 72 billion years | = `call_user_func()` Breaks References = If you use `call_use_func()` to invoke a function which takes parameters by reference, the variables you pass in will have their references broken and will emerge unmodified. That is, if you have a function that takes references: ```lang=php function add_one(&$v) { $v++; } ``` ...and you call it with `call_user_func()`: ```lang=php, COUNTEREXAMPLE $x = 41; call_user_func('add_one', $x); ``` ...`$x` will not be modified. The solution is to use `call_user_func_array()` and wrap the reference in an array: ```lang=php $x = 41; call_user_func_array( 'add_one', array(&$x)); // Note '&$x'! ``` This will work as expected. = You Can't Throw From `__toString()` = If you throw from `__toString()`, your program will terminate uselessly and you won't get the exception. = An Object Can Have Any Scalar as a Property = Object properties are not limited to legal variable names: ```lang=php $property = '!@#$%^&*()'; $obj->$property = 'zebra'; echo $obj->$property; // Outputs 'zebra'. ``` So, don't make assumptions about property names. = There is an `(object)` Cast = You can cast a dictionary into an object. ```lang=php $obj = (object)array('flavor' => 'coconut'); echo $obj->flavor; // Outputs 'coconut'. echo get_class($obj); // Outputs 'stdClass'. ``` This is occasionally useful, mostly to force an object to become a JavaScript dictionary (vs a list) when passed to `json_encode()`. = Invoking `new` With an Argument Vector is Really Hard = If you have some `$class_name` and some `$argv` of constructor arguments and you want to do this: ```lang=php new $class_name($argv[0], $argv[1], ...); ``` ...you'll probably invent a very interesting, very novel solution that is very wrong. In a libphutil environment, solve this problem with @{function@arcanist:newv}. Elsewhere, copy `newv()`'s implementation. = Equality is not Transitive = This isn't terribly surprising since equality isn't transitive in a lot of languages, but the `==` operator is not transitive: ```lang=php $a = ''; $b = 0; $c = '0a'; $a == $b; // true $b == $c; // true $c == $a; // false! ``` When either operand is an integer, the other operand is cast to an integer before comparison. Avoid this and similar pitfalls by using the `===` operator, which is transitive. = All 676 Letters in the Alphabet = This doesn't do what you'd expect it to do in C: ```lang=php for ($c = 'a'; $c <= 'z'; $c++) { // ... } ``` This is because the successor to `z` is `aa`, which is "less than" `z`. The loop will run for ~700 iterations until it reaches `zz` and terminates. That is, `$c` will take on these values: ``` a b ... y z aa // loop continues because 'aa' <= 'z' ab ... mf mg ... zw zx zy zz // loop now terminates because 'zz' > 'z' ``` Instead, use this loop: ```lang=php foreach (range('a', 'z') as $c) { // ... } ```