
newOAuth1Future doesn't set a User-Agent
Closed, ResolvedPublic

Description

It is good practice to always set a User-Agent.

While debugging why Cloudflare was showing a captcha to Phorge, I realised it was because of the lack of a User-Agent. I have exempted it from the rule, but I don't see why a User-Agent shouldn't be added (and it should probably be added to any other external HTTP(S) requests as well).

Affected code is https://we.phorge.it/source/phorge/browse/master/src/applications/auth/adapter/PhutilOAuth1AuthAdapter.php;cb934602c2e8c7da4b1c15793fe2511fe47c108d$102?as=source&blame=off

Event Timeline

https://we.phorge.it/source/phorge/browse/master/src/applications/auth/adapter/PhutilGitHubAuthAdapter.php$57-58 uses a boring
$future->addHeader('User-Agent', __CLASS__); for this.

I wonder if both should use something like PhabricatorEnv::getEnvConfig('phabricator.base-uri'); instead.

That would probably be a better User-Agent.
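A minimal sketch of what combining the two ideas could look like: identify both the adapter class and the installation in the User-Agent, so remote operators can tell who is calling and how to reach them. The `build_user_agent()` helper name is hypothetical, not from the codebase; in the adapter, the base URI would come from the `PhabricatorEnv::getEnvConfig('phabricator.base-uri')` call mentioned above.

```php
<?php
// Hypothetical helper (not actual Phorge code): compose a User-Agent
// from the adapter class name and the instance's base URI.
function build_user_agent($class_name, $base_uri) {
  // Trim any trailing slash so the URI reads cleanly in the header.
  return sprintf('%s (+%s)', $class_name, rtrim($base_uri, '/'));
}

// In the adapter this might then be:
//   $future->addHeader(
//     'User-Agent',
//     build_user_agent(__CLASS__, PhabricatorEnv::getEnvConfig('phabricator.base-uri')));
echo build_user_agent('PhutilOAuth1AuthAdapter', 'https://we.phorge.it/');
// prints "PhutilOAuth1AuthAdapter (+https://we.phorge.it)"
```

The `(+URL)` convention is the one commonly used by well-behaved crawlers to point at a contact page.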

I would like to better understand the root problem. I see that Cloudflare was showing a captcha to Phorge (which Phorge?). It seems it was because a user agent was missing. But aklapper said that we are already setting a user agent.

Why would changing the current user agent improve the situation?

@valerio.bozzolan: that's not what Andre says. He says another area of the code uses a terrible user agent, and we should change that to a better one at the same time.

I would like to better understand the root problem.

Why would changing the current user agent improve the situation?

A User-Agent string which makes it easier to contact the owner of the software/website sending those requests is generally helpful when the software/website is scraping, compared to a generic user agent. Wikimedia SRE often blocks scraping activity from/with generic user agents, as Wikimedia SRE has no other way to contact the folks running that software/website. See especially https://phabricator.wikimedia.org/T319423 (and maybe https://phabricator.wikimedia.org/T371039 or https://phabricator.wikimedia.org/T366363 ).
Many requests coming in with the very same generic user agent from a bunch of different IPs may be interpreted as a DDoS and are thus more likely to get blocked.