final class ArcanistScriptAndRegexLinter
libphutil Technical Documentation

Simple glue linter which runs some script on each path, and then uses a regex to parse lint messages from the script's output. (This linter uses a script and a regex to interpret the results of some real linter; it does not itself lint scripts or regexes.)

Configure this linter by setting these keys in your .arclint section:

  • script-and-regex.script Script command to run. This can be the path to a linter script, but may also include flags or use shell features (see below for examples).
  • script-and-regex.regex The regex to process output with. This regex uses named capturing groups (detailed below) to interpret output.

The script will be invoked from the project root, so you can specify a relative path like scripts/lint.sh or an absolute path like /opt/lint/lint.sh.
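As an illustration, the Python sketch below builds a minimal .arclint file that wires up this linter. The linter name example-lint, the include pattern, and the script path are hypothetical. The sketch also assumes the usual .arclint layout (a top-level linters map whose entries carry a type key) and that this linter's type is script-and-regex.

```python
import json

# Hypothetical linter entry. Only the two "script-and-regex.*" keys come from
# the documentation above; the name, include pattern, and paths are made up.
arclint = {
    "linters": {
        "example-lint": {
            "type": "script-and-regex",  # assumed configuration name
            "include": "(\\.sh$)",
            "script-and-regex.script": "scripts/lint.sh --",
            "script-and-regex.regex": "/^(?P<message>.*)$/m",
        }
    }
}


def render_arclint(config):
    """Serialize the configuration as the JSON text of a .arclint file."""
    return json.dumps(config, indent=2) + "\n"


print(render_arclint(arclint), end="")
```

Because .arclint is JSON, a backslash in the regex (for example in \d) has to be written as \\ in the file; json.dumps takes care of that escaping here.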

This linter is necessarily more limited in its capabilities than a normal linter which can perform custom processing, but may be somewhat simpler to configure.

Script...

The script will be invoked once for each file that is to be linted, with the file passed as the first argument. The file may begin with a "-"; ensure your script will not interpret such files as flags (perhaps by ending your script configuration with "--", if its argument parser supports that).

Note that when run via arc diff, the list of files to be linted includes deleted files and files that were moved away by the change. The linter should not assume the path it is given exists, and it is not an error for the linter to be invoked with paths which are no longer there. (Every affected path is subject to lint because some linters may raise errors in other files when a file is removed, or raise an error about its removal.)

The script should emit lint messages to stdout, which will be parsed with the provided regex.

For example, you might use a configuration like this:

/opt/lint/lint.sh --flag value --other-flag --
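A script that fits this calling convention might look like the following Python sketch. It is only an illustration: the line-length rule, the output format, and the function names are hypothetical. It treats everything after "--" as the path, reports nothing for paths that no longer exist, writes messages to stdout, and finishes with exit status 0.

```python
import os
import sys

MAX_LENGTH = 80  # hypothetical rule: flag lines longer than this


def parse_path(argv):
    """Return the path to lint: the argument after "--", else the last one."""
    if "--" in argv:
        rest = argv[argv.index("--") + 1:]
        return rest[0] if rest else None
    return argv[-1] if argv else None


def lint_file(path):
    """Return 'warning:<line> <message>' strings for overlong lines."""
    if path is None or not os.path.isfile(path):
        # Deleted or moved-away paths are expected; report nothing.
        return []
    messages = []
    with open(path, encoding="utf-8", errors="replace") as handle:
        for number, text in enumerate(handle, start=1):
            if len(text.rstrip("\n")) > MAX_LENGTH:
                messages.append(
                    f"warning:{number} Line exceeds {MAX_LENGTH} characters."
                )
    return messages


def main(argv):
    for message in lint_file(parse_path(argv)):
        print(message)  # lint messages go to stdout


if __name__ == "__main__":
    main(sys.argv[1:])  # falls off the end, so the exit status is 0
```

Handling "--" explicitly means a file named something like -odd.txt is still read as a path rather than as a flag.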

stderr is ignored. If you have a script which writes messages to stderr, you can redirect stderr to stdout by using a configuration like this:

sh -c '/opt/lint/lint.sh "$0" 2>&1'

The return code of the script must be 0, or an exception will be raised reporting that the linter failed. If you have a script which exits nonzero under normal circumstances, you can force it to always exit 0 by using a configuration like this:

sh -c '/opt/lint/lint.sh "$0" || true'

Multiple instances of the script will be run in parallel if there are multiple files to be linted, so the script should not depend on any single shared resource, such as a fixed output file. For instance, this configuration would not work properly, because several processes may attempt to write to the file at the same time:

sh -c '/opt/lint/lint.sh --output /tmp/lint.out "$0" && cat /tmp/lint.out'
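One way to avoid the collision is to give every invocation its own output file. The Python wrapper below is a minimal sketch of that idea, assuming the underlying tool accepts the --output flag used in the example above; the function name run_with_private_output is hypothetical.

```python
import os
import subprocess
import tempfile


def run_with_private_output(command, path):
    """Run `command --output <unique file> <path>` and return the report text."""
    # mkstemp creates a file with a unique name, so parallel runs never collide.
    handle, report_path = tempfile.mkstemp(prefix="lint-", suffix=".out")
    os.close(handle)
    try:
        # check=False: a nonzero exit from the tool is not treated as a failure.
        subprocess.run(command + ["--output", report_path, path], check=False)
        with open(report_path, encoding="utf-8", errors="replace") as report:
            return report.read()
    finally:
        # Always clean up the private file, even if reading it failed.
        os.remove(report_path)
```

A wrapper script built around this function would print the returned text to stdout, where the regex can pick it up.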

There are necessary limits to how gracefully this linter can deal with edge cases, because it is just a script and a regex. If you need to do things that this linter can't handle, you can write a phutil linter and move the logic to handle those cases into PHP. PHP is a better general-purpose programming language than regular expressions are, if only by a small margin.

...and Regex

The regex must be a valid PHP PCRE regex, including delimiters and flags.

The regex will be matched against the entire output of the script, so it should generally be in this form if messages are one-per-line:

/^...$/m
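The effect of the m flag can be seen in a small experiment. The sketch below uses Python's re module rather than PHP's PCRE, so it is an approximation; re.MULTILINE plays the role of the trailing m, and the sample output is made up.

```python
import re

output = "first problem\nsecond problem"

# Without the multiline flag, ^ and $ anchor to the whole output, so a
# two-line output does not match at all.
whole = re.findall(r"^(?P<message>.+)$", output)

# With the flag (the trailing "m" in /^...$/m), each line matches separately.
per_line = re.findall(r"^(?P<message>.+)$", output, flags=re.MULTILINE)

print(whole)     # []
print(per_line)  # ['first problem', 'second problem']
```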

The regex should capture these named patterns with (?P<name>...):

  • message (required) Text describing the lint message. For example, "This is a syntax error.".
  • name (optional) Text summarizing the lint message. For example, "Syntax Error".
  • severity (optional) The word "error", "warning", "autofix", "advice", or "disabled", in any combination of upper and lower case. Alternatively, you may match groups called error, warning, advice, autofix, or disabled. These allow you to match output formats like "E123" and "W123" to indicate errors and warnings, even though the word "error" is not present in the output. If no severity capturing group is present, messages are raised with "error" severity. If multiple severity capturing groups are present, messages are raised with the highest captured severity. Capturing groups like error supersede the severity capturing group.
  • error (optional) Match some nonempty substring to indicate that this message has "error" severity.
  • warning (optional) Match some nonempty substring to indicate that this message has "warning" severity.
  • advice (optional) Match some nonempty substring to indicate that this message has "advice" severity.
  • autofix (optional) Match some nonempty substring to indicate that this message has "autofix" severity.
  • disabled (optional) Match some nonempty substring to indicate that this message has "disabled" severity.
  • file (optional) The name of the file to raise the lint message in. If not specified, defaults to the linted file. It is generally not necessary to capture this unless the linter can raise messages in files other than the one it is linting.
  • line (optional) The line number of the message. If no text is captured, the message is assumed to affect the entire file.
  • char (optional) The character offset of the message.
  • offset (optional) The byte offset of the message. If captured, this supersedes line and char.
  • original (optional) The text the message affects.
  • replacement (optional) The text that should automatically replace the range captured by original in order to resolve the message.
  • code (optional) A short error type identifier which can be used elsewhere to configure handling of specific types of messages. For example, "EXAMPLE1", "EXAMPLE2", etc., where each code identifies a class of message like "syntax error", "missing whitespace", etc. This allows configuration to later change the severity of all whitespace messages, for example.
  • ignore (optional) Match some nonempty substring to ignore the match. You can use this if your linter sometimes emits text like "No lint errors".
  • stop (optional) Match some nonempty substring to stop processing input. Remaining matches for this file will be discarded, but linting will continue with other linters and other files.
  • halt (optional) Match some nonempty substring to halt all linting of this file by any linter. Linting will continue with other files.
  • throw (optional) Match some nonempty substring to throw an error, which will stop arc completely. You can use this to fail abruptly if you encounter unexpected output. All processing will abort.

Numbered capturing groups are ignored.
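To make the severity rules above concrete, here is a simplified Python model of them. It is a sketch and not the linter's PHP implementation: the sample output, the names PATTERN, match_severity, and parse, and the ordering of severities from highest to lowest are assumptions.

```python
import re

# Hypothetical linter output that uses letter codes instead of severity words.
OUTPUT = "E123 line 4: Missing semicolon.\nW201 line 9: Trailing whitespace.\n"

# The error and warning groups match the leading letter of the code.
PATTERN = re.compile(
    r"^(?P<code>(?:(?P<error>E)|(?P<warning>W))\d+) "
    r"line (?P<line>\d+): (?P<message>.*)$",
    re.MULTILINE,
)

# Assumed ordering, highest severity first.
SEVERITIES = ("error", "warning", "autofix", "advice", "disabled")


def match_severity(groups):
    """Pick a severity: named groups first, then the severity word, else error."""
    for name in SEVERITIES:
        if groups.get(name):
            return name
    word = (groups.get("severity") or "").lower()
    return word if word in SEVERITIES else "error"


def parse(output):
    """Return (severity, code, line, message) tuples for every match."""
    results = []
    for match in PATTERN.finditer(output):
        groups = match.groupdict()
        results.append(
            (match_severity(groups), groups["code"],
             int(groups["line"]), groups["message"])
        )
    return results


for message in parse(OUTPUT):
    print(message)
```

Here E123 is raised as an error and W201 as a warning, even though neither word appears in the output.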

For example, if your lint script's output looks like this:

error:13 Too many goats!
warning:22 Not enough boats.

...you could use this regex to parse it:

/^(?P<severity>warning|error):(?P<line>\d+) (?P<message>.*)$/m
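The same pattern can be tried outside arc. The sketch below uses Python's re module, whose named-group syntax matches the one shown; the PHP delimiters are dropped and the m flag becomes re.MULTILINE. The helper name parse_messages is hypothetical.

```python
import re

OUTPUT = "error:13 Too many goats!\nwarning:22 Not enough boats.\n"

PATTERN = re.compile(
    r"^(?P<severity>warning|error):(?P<line>\d+) (?P<message>.*)$",
    re.MULTILINE,
)


def parse_messages(output):
    """Return one dict of named captures per lint message in the output."""
    return [match.groupdict() for match in PATTERN.finditer(output)]


for message in parse_messages(OUTPUT):
    print(message["severity"], message["line"], message["message"])
```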

The simplest valid regex for line-oriented output is something like this:

/^(?P<message>.*)$/m

Tasks

Human Readable Information

Runtime State

Executing Linters

  • public function didLintPaths($paths) — Hook called after a list of paths are linted.

Linting

  • public function willLintPaths($paths) — Run the script on each file to be linted.
  • public function lintPath($path) — Run the regex on the output of the script.

Linter Information

  • public function getLinterName() — Return the short name of the linter.

Parsing Output

  • private function getMatchLineAndChar($match, $path) — Get the line and character of the message from the regex match.
  • private function getMatchSeverity($match) — Map the regex matching groups to a message severity. We look for either a nonempty severity name group like 'error', or a group called 'severity' with a valid name.

Validating Configuration

No methods for this task.

Other Methods

Methods

public function __get($name)
Inherited

This method is not documented.
Parameters
$name
Return
wild

public function __set($name, $value)
Inherited

This method is not documented.
Parameters
$name
$value
Return
wild

public function current()
Inherited

This method is not documented.
Return
wild

public function key()
Inherited

This method is not documented.
Return
wild

public function next()
Inherited

This method is not documented.
Return
wild

public function rewind()
Inherited

This method is not documented.
Return
wild

public function valid()
Inherited

This method is not documented.
Return
wild

private function throwOnAttemptedIteration()
Inherited

This method is not documented.
Return
wild

public function getPhobjectClassConstant($key, $byte_limit)
Inherited

Phobject

Read the value of a class constant.

This is the same as just typing self::CONSTANTNAME, but throws a more useful message if the constant is not defined and allows the constant to be limited to a maximum length.

Parameters
string $key Name of the constant.
int|null $byte_limit Maximum number of bytes permitted in the value.
Return
string Value of the constant.

public function getInfoURI()
Inherited

ArcanistLinter

Return an optional informative URI where humans can learn more about this linter.

For most linters, this should return a link to the project home page. This is shown on arc linters.

Return
string|null Optionally, return an informative URI.

public function getInfoDescription()

ArcanistLinter

Return a brief human-readable description of the linter.

These should be a line or two, and are shown on arc linters.

ArcanistScriptAndRegexLinter
This method is not documented.
Return
string|null Optionally, return a brief human-readable description.

public function getAdditionalInformation()
Inherited

ArcanistLinter

Return arbitrary additional information.

Linters can use this method to provide arbitrary additional information to be included in the output of arc linters.

Return
map<string, string> A mapping of header to body content for the additional information sections.

public function getInfoName()

ArcanistLinter

Return a human-readable linter name.

These are used by arc linters, and can let you give a linter a more presentable name.

ArcanistScriptAndRegexLinter
This method is not documented.
Return
string Human-readable linter name.

final public function getActivePath()
Inherited

This method is not documented.
Return
wild

final public function setActivePath($path)
Inherited

This method is not documented.
Parameters
$path
Return
wild

final public function setEngine($engine)
Inherited

This method is not documented.
Parameters
ArcanistLintEngine $engine
Return
wild

final protected function getEngine()
Inherited

This method is not documented.
Return
wild

final public function setLinterID($id)
Inherited

ArcanistLinter

Set the internal ID for this linter.

This ID is assigned automatically by the ArcanistLintEngine.

Parameters
string $id Unique linter ID.
Return
this

final public function getLinterID()
Inherited

ArcanistLinter

Get the internal ID for this linter.

Retrieves an internal linter ID managed by the ArcanistLintEngine. This ID is a unique scalar which distinguishes linters in a list.

Return
string Unique linter ID.

public function willLintPaths($paths)

ArcanistLinter

Hook called before a list of paths are linted.

Parallelizable linters can start multiple requests in parallel here, to improve performance. They can implement didLintPaths() to collect results.

Linters which are not parallelizable should normally ignore this callback and implement lintPath() instead.

ArcanistScriptAndRegexLinter

Run the script on each file to be linted.

Parameters
list<string> $paths A list of paths to be linted.
Return
void

public function lintPath($path)

ArcanistLinter

Hook called for each path to be linted.

Linters which are not parallelizable can do work here.

Linters which are parallelizable may want to ignore this callback and implement willLintPaths() and didLintPaths() instead.

ArcanistScriptAndRegexLinter

Run the regex on the output of the script.

Parameters
string $path Path to lint.
Return
void

public function didLintPaths($paths)
Inherited

ArcanistLinter

Hook called after a list of paths are linted.

Parallelizable linters can collect results here.

Linters which are not parallelizable should normally ignore this callback and implement lintPath() instead.

Parameters
list<string> $paths A list of paths which were linted.
Return
void

public function getLinterPriority()
Inherited

This method is not documented.
Return
wild

public function setCustomSeverityMap($map)
Inherited

This method is not documented.
Parameters
array $map
Return
wild

public function addCustomSeverityMap($map)
Inherited

This method is not documented.
Parameters
array $map
Return
wild

public function setCustomSeverityRules($rules)
Inherited

This method is not documented.
Parameters
array $rules
Return
wild

final public function getProjectRoot()
Inherited

This method is not documented.
Return
wild

final public function getOtherLocation($offset, $path)
Inherited

This method is not documented.
Parameters
$offset
$path
Return
wild

final public function stopAllLinters()
Inherited

This method is not documented.
Return
wild

final public function didStopAllLinters()
Inherited

This method is not documented.
Return
wild

final public function addPath($path)
Inherited

This method is not documented.
Parameters
$path
Return
wild

final public function setPaths($paths)
Inherited

This method is not documented.
Parameters
array $paths
Return
wild

private function filterPaths($paths)
Inherited

ArcanistLinter

Filter out paths which this linter doesn't act on (for example, because they are binaries and the linter doesn't apply to binaries).

Parameters
list<string> $paths
Return
list<string>

final public function getPaths()
Inherited

This method is not documented.
Return
wild

final public function addData($path, $data)
Inherited

This method is not documented.
Parameters
$path
$data
Return
wild

final protected function getData($path)
Inherited

This method is not documented.
Parameters
$path
Return
wild

public function getCacheVersion()
Inherited

This method is not documented.
Return
wild

final public function getLintMessageFullCode($short_code)
Inherited

This method is not documented.
Parameters
$short_code
Return
wild

final public function getLintMessageSeverity($code)
Inherited

This method is not documented.
Parameters
$code
Return
wild

protected function getDefaultMessageSeverity($code)
Inherited

This method is not documented.
Parameters
$code
Return
wild

final public function isMessageEnabled($code)
Inherited

This method is not documented.
Parameters
$code
Return
wild

final public function getLintMessageName($code)
Inherited

This method is not documented.
Parameters
$code
Return
wild

final protected function addLintMessage($message)
Inherited

This method is not documented.
Parameters
ArcanistLintMessage $message
Return
wild

final public function getLintMessages()
Inherited

This method is not documented.
Return
wild

final public function raiseLintAtLine($line, $char, $code, $description, $original, $replacement)
Inherited

This method is not documented.
Parameters
$line
$char
$code
$description
$original
$replacement
Return
wild

final public function raiseLintAtPath($code, $desc)
Inherited

This method is not documented.
Parameters
$code
$desc
Return
wild

final public function raiseLintAtOffset($offset, $code, $description, $original, $replacement)
Inherited

This method is not documented.
Parameters
$offset
$code
$description
$original
$replacement
Return
wild

public function canRun()
Inherited

This method is not documented.
Return
wild

public function getLinterName()

Return the short name of the linter.

Return
string Short linter identifier.

public function getVersion()
Inherited

This method is not documented.
Return
wild

final protected function isCodeEnabled($code)
Inherited

This method is not documented.
Parameters
$code
Return
wild

public function getLintSeverityMap()
Inherited

This method is not documented.
Return
wild

public function getLintNameMap()
Inherited

This method is not documented.
Return
wild

public function getCacheGranularity()
Inherited

This method is not documented.
Return
wild

public function getLinterConfigurationName()

ArcanistLinter

If this linter is selectable via .arclint configuration files, return a short, human-readable name to identify it. For example, "jshint" or "pep8".

If you do not implement this method, the linter will not be selectable through .arclint files.

ArcanistScriptAndRegexLinter
This method is not documented.
Return
wild

public function setLinterConfigurationValue($key, $value)

This method is not documented.
Parameters
$key
$value
Return
wild

protected function canCustomizeLintSeverities()
Inherited

This method is not documented.
Return
wild

protected function shouldLintBinaryFiles()
Inherited

This method is not documented.
Return
wild

protected function shouldLintDeletedFiles()
Inherited

This method is not documented.
Return
wild

protected function shouldLintDirectories()
Inherited

This method is not documented.
Return
wild

protected function shouldLintSymbolicLinks()
Inherited

This method is not documented.
Return
wild

protected function getLintCodeFromLinterConfigurationKey($code)
Inherited

ArcanistLinter

Map a configuration lint code to an arc lint code. Primarily, this is intended for validation, but can also be used to normalize case or otherwise be more permissive in accepted inputs.

If the code is not recognized, you should throw an exception.

Parameters
string $code Code specified in configuration.
Return
string Normalized code to use in severity map.

private function getMatchLineAndChar($match, $path)

Get the line and character of the message from the regex match.

Parameters
dict $match Captured groups from regex.
$path
Return
pair<int|null,int|null> Line and character of the message.

private function getMatchSeverity($match)

Map the regex matching groups to a message severity. We look for either a nonempty severity name group like 'error', or a group called 'severity' with a valid name.

Parameters
dict $match Captured groups from regex.
Return
const ArcanistLintSeverity constant.