Page MenuHomePhorge

Restrict maximum number of search tokens per query to 16
Changes PlannedPublic

Authored by aklapper on Sun, Jun 1, 18:18.

Details

Summary

Currently the Query text field in Search allows an arbitrary number of tokens (characters divided by whitespace) up to a total of 1024 bytes. This feels excessive and could decrease performance in public Phorge instances.

This patch mitigates T15831 by not allowing a huge amount of search tokens which would trigger a "Too many tables" MariaDB error (plus it displays a clearer error message).
This only applies to parsing the Query string and not to typeahead fields or such.
The threshold of 16 is entirely arbitrary and welcomes excessive bikeshedding before accepting this patch.
(For better understanding, note that a Query term like foo AND (bar OR meouw) counts as 5 tokens, not 3 tokens.)

See also rP72cb3d3c84905c0d75074e4ecf74c493e3a2d527 (though that older commit introduced analyzing search tokens while this very change just drops the query with an error, as I personally interpret too many search tokens as PEBKAC).

This patch should also decrease the likeliness of facing https://secure.phabricator.com/T12993 and https://phabricator.wikimedia.org/T383068 (extremely long running queries) leading to decreased DB performance.

Closes T15831

Test Plan
  1. Go to http://phorge.localhost/typeahead/class/PhabricatorSearchDatasource/?__path__=%2ftypeahead%2fclass%2fPhabricatorSearchDatasource%2f&__ajax__=true&__metablock__=569&q=i%20have%20a%20hacker%20and%20scammer%20that%20is%20on%20me%2024%20hoirs%20a%20day%20and%20had%20access%20to%20me%20icloud%20snd%20other%20accounts%20i%20cakled%20everyonevand%20even%20fraud%20cant%20help%20me%20this%20gurl%20is%20smart%20snd%20nadechersrlf%20the%20admin%20of%20my%20accounts%20while%20i%20was%20in%20a%20comma%20in%20the%20hospital%20for%20over%20a%20month%20ina%20half%20she%20is%20rtryingcto%20take%20over%20my%20identity%20and%20i%20refuse%20tovletcthat%20happen%20how%20do%20i%20stop%20her%20from%20being%20the%20admin%20of%20ny%20accounts%20i%20bever%20had%20inevand%20dintvneed%20one%20cause%20she%20doesntvhelp%20me%20she%20hackescinto%20ky%20accounts%20and%20scanms%20mecand&raw=i%20have%20a%20hacker%20and%20scammer%20that%20is%20on%20me%2024%20hoirs%20a%20day%20and%20had%20access%20to%20me%20icloud%20snd%20other%20accounts%20i%20cakled%20everyonevand%20even%20fraud%20cant%20help%20me%20this%20gurl%20is%20smart%20snd%20nadechersrlf%20the%20admin%20of%20my%20accounts%20while%20i%20was%20in%20a%20comma%20in%20the%20hospital%20for%20over%20a%20month%20ina%20half%20she%20is%20rtryingcto%20take%20over%20my%20identity%20and%20i%20refuse%20tovletcthat%20happen%20how%20do%20i%20stop%20her%20from%20being%20the%20admin%20of%20ny%20accounts%20i%20bever%20had%20inevand%20aaa and do not blow up anymore with Too many tables; MariaDB can only use 61 tables in a join when Phorge queries the user.nametoken DB table but instead get a more meaningful error message.
  2. Go to http://phorge.localhost/search/query/advanced/ and enter ds ad sd rj a fdj dj fjw wj sdj sdg s jsdk jdsd jks djksdj (16 tokens) in the top bar search field, works as expected.
  3. Go to http://phorge.localhost/search/query/advanced/ and enter ds ad sd rj a fdj dj fjw wj sdj sdg s jsdk jdsd jks djksdj jsd (17 tokens) in the top bar search field, shows error as expected.
  4. Enter the same two strings from step 2 and step 3 on http://phorge.localhost in the top bar search field on the right.
  5. Go to http://phorge.localhost/search/query/HCvkIIvHK4Db/#R and combine a Query string with 16 tokens with other Typeahead fields (enter some Authors, Tags etc), query still works.
  6. Go to an Advanced Search of an application, like http://phorge.localhost/maniphest/query/advanced/, and repeat step 5.
  7. Go to http://phorge.localhost/search/query/advanced/ and enter a random CJK string like 㪳㪸㫆㫓㫕㫱㫠㫝㫴㫙㪼㫸㬄㫣㪴㪿㫬㫽㫱㫥㫵㬆㬈㬕㬮㯅㯊㮮𢬎𢬢𢭃𢭏 and see that it does not throw an error, as expected, as PhutilSearchQueryCompiler special-cases via phutil_utf8_is_cjk().

Diff Detail

Repository
rP Phorge
Branch
searchTokensMaxT15831 (branched from master)
Lint
Lint Passed
Unit
Tests Passed
Build Status
Buildable 2045
Build 2045: arc lint + arc unit

Event Timeline

aklapper requested review of this revision.Sun, Jun 1, 18:18