Repository Identity "Automatically Detected User": it reads unverified emails, with spam concerns
Closed, ResolvedPublic
Actions

Assigned To

Authored By

	valerio.bozzolan
	Dec 5 2024, 09:57

Description

Description of the Problem

A registered Phorge user may add unverified emails in their profile. Doing so, these users can effectively claim some Repository Identities, which may create spam in Diffusion commits that is difficult to combat at the moment, because of this sub-problem:

T15556: Improve Diffusion identity reassignment propagation

Attack Surface

Phorge/Phabricator installations are affected if they may have untrusted users, that may add unverified email in their profile.

A Phorge installation is particularly affected by this problem if it has an open registration (check your /auth/register/ page).

Examples:

check if a generic user can add a new unverified personal email from /settings/user/my.user.name.lol/page/email/ (yes by default for obvious reasons)
check if a new user can register from /auth/register/, for example via email
check if your page /auth/register/ allows to login using a social network (with or without "Trust Email Addresses" from /auth/config/). And check if later that user can add unverified emails

Affected Code - part 1 (CLI)

The workflow PhabricatorRepositoryManagementRebuildIdentitiesWorkflow seems affected by this problem. That is, this script:

./bin/repository rebuild-identities

It reads whatever email, and not just verified emails:

https://we.phorge.it/source/phorge/browse/master/src/applications/repository/management/PhabricatorRepositoryManagementRebuildIdentitiesWorkflow.php;48fd3f1c40def49cdfec3ad342e32225adc31a79$283-285

Proposed solution:

diff --git a/src/applications/repository/management/PhabricatorRepositoryManagementRebuildIdentitiesWorkflow.php b/src/applications/repository/management/PhabricatorRepositoryManagementRebuildIdentitiesWorkflow.php
index 9a0f68713e..9df88ad055 100644
--- a/src/applications/repository/management/PhabricatorRepositoryManagementRebuildIdentitiesWorkflow.php
+++ b/src/applications/repository/management/PhabricatorRepositoryManagementRebuildIdentitiesWorkflow.php
@@ -281,7 +281,7 @@ final class PhabricatorRepositoryManagementRebuildIdentitiesWorkflow
           $user->getMonogram()));
 
       $emails = id(new PhabricatorUserEmail())->loadAllWhere(
-        'userPHID = %s',
+        'userPHID = %s AND isVerified = 1',
         $user->getPHID());
       if ($emails) {
         $this->rebuildEmails(mpull($emails, 'getAddress'));

Affected Code - part 2 (DiffusionResolveUserQuery)

Also (end probably most importantly) the method didUpdateEmailAddress deserves attention since it seems it's triggered even if the email is unverified:

https://we.phorge.it/source/phorge/browse/master/src/applications/people/editor/PhabricatorUserEditor.php;48fd3f1c40def49cdfec3ad342e32225adc31a79$88-93

So whenever you add an unverified email it seems it triggers this:

https://we.phorge.it/source/phorge/browse/master/src/applications/repository/worker/PhabricatorRepositoryIdentityChangeWorker.php

That ... seems to call PhabricatorUser::loadOneWithEmailAddress() at a certain point:

https://we.phorge.it/source/phorge/browse/master/src/applications/diffusion/query/DiffusionResolveUserQuery.php;48fd3f1c40def49cdfec3ad342e32225adc31a79$95-102

That finds user by whatever UNVERIFIED addresses:

https://we.phorge.it/source/phorge/browse/master/src/applications/people/storage/PhabricatorUser.php;48fd3f1c40def49cdfec3ad342e32225adc31a79$655-665

Proposed solution, adding a new function that finds users by their verified email addresses and use that:

diff --git a/src/applications/diffusion/query/DiffusionResolveUserQuery.php b/src/applications/diffusion/query/DiffusionResolveUserQuery.php
index 8ad13f660e..69a2e66c5f 100644
--- a/src/applications/diffusion/query/DiffusionResolveUserQuery.php
+++ b/src/applications/diffusion/query/DiffusionResolveUserQuery.php
@@ -93,7 +93,7 @@ final class DiffusionResolveUserQuery extends Phobject {
 
 
   private function findUserByEmailAddress($email_address) {
-    $by_email = PhabricatorUser::loadOneWithEmailAddress($email_address);
+    $by_email = PhabricatorUser::loadOneWithVerifiedEmailAddress($email_address);
 
     if ($by_email) {
       return $by_email->getPHID();
diff --git a/src/applications/people/storage/PhabricatorUser.php b/src/applications/people/storage/PhabricatorUser.php
index 2c9cf9f877..b20deb6ca6 100644
--- a/src/applications/people/storage/PhabricatorUser.php
+++ b/src/applications/people/storage/PhabricatorUser.php
@@ -652,6 +652,28 @@ final class PhabricatorUser
     return $this->getUsername();
   }
 
+  /**
+   * Load one user from their verified email address.
+   * @param string $address
+   * @return PhabricatorUser|null
+   */
+  public static function loadOneWithVerifiedEmailAddress($address) {
+    $email = id(new PhabricatorUserEmail())->loadOneWhere(
+      'address = %s AND isVerified = 1',
+      $address);
+    if (!$email) {
+      return null;
+    }
+    return id(new PhabricatorUser())->loadOneWhere(
+      'phid = %s',
+      $email->getUserPHID());
+  }
+
+  /**
+   * Load one user from their potentially unverified email address.
+   * @param string $address
+   * @return PhabricatorUser|null
+   */
   public static function loadOneWithEmailAddress($address) {
     $email = id(new PhabricatorUserEmail())->loadOneWhere(
       'address = %s',

This seems the most safe approach to do not alter the PhabricatorPasswordAuthProvider business logic that may use an unverified email to try the login.

Discussion

Thanks for any opinion lol

Revisions and Commits

rP Phorge
	D25845	rP7429da91d293 Repository Identity "Automatically Detected User": don't trust unverified emails

Related Objects
Search...

Status	Assigned	Task
Open	valerio.bozzolan	T15939 Calendar Import Invitees 2.0
Open	None	T15938 Understand if Diffusion public identities could be used to expand Calendar invitees
Resolved	valerio.bozzolan	T15965 Repository Identity "Automatically Detected User": it reads unverified emails, with spam concerns

Event Timeline

valerio.bozzolan created this task.Dec 5 2024, 09:57

Herald added subscribers: Matthew, chris. · View Herald TranscriptDec 5 2024, 09:57

valerio.bozzolan added a revision: D25845: Repository Identity "Automatically Detected User": don't trust unverified emails.Dec 5 2024, 09:58

Adding @aklapper as subscriber in this security issue since I trust this user (unclear if this should be flagged as security thought, feel free to open)

valerio.bozzolan claimed this task.Dec 5 2024, 10:09

valerio.bozzolan triaged this task as Low priority.

valerio.bozzolan updated the task description. (Show Details)

valerio.bozzolan added a project: Diffusion.

What can a malicious user accomplish by claiming unverified email for commits? The idea outlined here sounds right but I’d like to understand what potential harm could be done on its current state, and also whether there’s any legitimate use case for the current behavior.

We might want to flesh out behavior here, e.g. allow this but indicate in the UI that an account claims the commit but not verified by Phorge. For example I believe in upstream Evan uses (I think?) a fake email for his commits likely to avoid having his email be published/accessible in public git commit histories. Yet he would still want to claim that email on his Phabricator account to link authorship to the account.

avivey subscribed.Dec 8 2024, 12:23

In T15965#20052, @speck wrote:

What can a malicious user accomplish by claiming unverified email for commits?

The harm at the moment seems:

at minimum: the easy and sneaky possibility to steal credits of other people's edits from any repository (also private ones), without the need to do any user action auditable from the Diffusion interface, but just adding some personal dummy unverified emails. A bit difficult to proactively detect since personal emails are private and every user can add personal unverified emails under their own account.
at maximum: ability to guess emails of users (that is technically a data leak)

Limitation: to steal a commit identity, it must be the default.

In T15965#20052, @speck wrote:

whether there’s any legitimate use case for the current behavior.

I think it's safe to change this default guesser, because we will not change the manual workflow.

The most awful thing that can happen I think, is that users without a verified email may then need a couple of clicks for manual identity association.

For example I believe in upstream Evan uses (I think?) a fake email for his commits likely to avoid having his email be published/accessible in public git commit histories. Yet he would still want to claim that email on his Phabricator account to link authorship to the account.

I think this can be continued like before, by manually setting the identity

Limitation: to steal a commit identity, it must be the default. Sorry I forgot to say.

"Steal credit" might actually lead to a real issue: If a new user can get themselves identified as an old, trusted, user based on commit history, their changes might not be checked as rigorously by the rest of the team - similar to the XZ Utils backdoor issue, only faster.

Leaking users' emails is also concerning, from PII POV.

I think this can be continued like before, by manually setting the identity

what does this flow look like?
Github also uses fake emails in commits for some commits made via the web (possibly defaults to on?) - e.g. rPdea952aa48.

Take for example this commit that has a default (empty) identity:

https://we.phorge.it/diffusion/identity/view/10/

(Found from https://we.phorge.it/rPb8b392481fd128f1ab39264451e2a0e4a13aca09)

In this exact case I can steal this credit by registering git@epriestley.com in my user profile (unverified).

But, if I visit the above page and I manually pick somebody else, that will be used instead of the bad default.

valerio.bozzolan renamed this task from Repository Identity: it reads unverified emails, with spam concerns to Repository Identity "Automatically Detected User": it reads unverified emails, with spam concerns.Dec 9 2024, 08:56

valerio.bozzolan added a project: Spam mitigation.Dec 10 2024, 13:54

Another edge case: Most of my contributions to Phorge happened as part of my work for Wikimedia. Those commits are under an email address that I no longer have access to, since I am no longer employed at the Wikimedia Foundation.

Yep, manually setting your unverified (and not verifiable) email would still be possible 👍 just two clicks are needed from this kind of pages:

https://we.phorge.it/diffusion/identity/view/10/

What is changing is, that unverified email will not match your unverified email as default, so that should need these 2 clicks manual configs (or, find a way to verify the email)

In T15965#20144, @valerio.bozzolan wrote:

What is changing is, that unverified email will not match your unverified email as default, so that should need these 2 clicks manual configs (or, find a way to verify the email)

FWIW I am in favor of this change.

If there are no objections I would be happy to accept the diff. @speck are your concerns addressed or should we continue discussion / consider other options?

Let’s do it

valerio.bozzolan closed this task as Resolved by committing rP7429da91d293: Repository Identity "Automatically Detected User": don't trust unverified emails.Dec 11 2024, 08:31

valerio.bozzolan added a commit: rP7429da91d293: Repository Identity "Automatically Detected User": don't trust unverified emails.

valerio.bozzolan changed the visibility from "Custom Policy" to "Public (No Login Required)".Dec 11 2024, 08:31

Herald added subscribers: revi, tobiaswiese. · View Herald TranscriptDec 11 2024, 08:31

valerio.bozzolan mentioned this in Next Up.Dec 11 2024, 08:39

valerio.bozzolan mentioned this in Z1: Phorge.Mar 13 2025, 17:29

avivey mentioned this in 2025.18.May 2 2025, 11:35

valerio.bozzolan added a parent task: T15938: Understand if Diffusion public identities could be used to expand Calendar invitees.May 4 2025, 08:17