
Discuss Arcanist as a barrier to adoption of Phorge and how to address the underlying issues.
Open, Needs Triage, Public

Description

Context

As discussed in the Phorge meeting of April 19, 2022, and with much history recorded in Phabricator T5000:

The Problem

Arcanist is a significant barrier to adoption for many potential Phabricator and thus Phorge users.

The sad story of almost adopting Differential at the Wikimedia Foundation

Some thoughts from @20after4. This is a very sore subject for me personally, however, I'm trying to overcome my bitterness for the greater good.

I'm trying to convey a rough outline of the issues I came up against during my very difficult, years-long battle against the Arcanist haters, which ultimately ended with me leaving the Wikimedia Foundation just as they are adopting GitLab to replace Gerrit for code review. This comes more than 7 years after a public decision-making process concluded with the decision to adopt Phabricator and I, a new Wikimedia employee, set out to make it happen.

The original plan was to start by adopting Maniphest in place of Bugzilla. Differential was to follow once the Bugzilla migration was completed.

This never happened. The migration to Differential was ultimately hamstrung by a significant push-back against Arcanist and a widespread view (among at least a large minority of existing staff) that a pure git workflow was vastly superior. So much so that, at least in the minds of many influential Wikimedia developers, Phabricator was seen as completely unfit to replace Gerrit. Ultimately the project died and I went on to work on other things, spirit crushed, burned out and disillusioned. I never really recovered.

Ultimately I tried and failed to win people over to the benefits that Differential has to offer. This is despite the fact that Gerrit and Phabricator share a nearly identical workflow. I believe that the major sticking point was Arcanist, which is slightly baffling since Wikimedia users typically use git review to interact with Gerrit. git review is a client-side tool which is conceptually and superficially similar to Arcanist, just much simpler and providing much less functionality. The important distinction, however, is that Gerrit does not require git review; it's purely optional and really just provides a few CLI shortcuts for interfacing with Gerrit. Seemingly the other major advantage is that git review isn't written in PHP. I guess what I'm trying to say is that some people really hate PHP and will avoid it at all costs.

I am a long-time user of Gerrit and a believer in Phabricator's superiority in nearly every important measure. Nonetheless, I will still concede that Gerrit's git interface is pretty nice to use and vastly easier to get started with than installing Arcanist and learning the nuances of arc diff. With Gerrit it's rather pleasant to push a change even without using git review to do so. A developer can mostly treat Gerrit as just another git remote that happens to use an unconventional naming convention for remote refs. All it takes to submit a new code review to Gerrit is this incantation:

git push gerrit-server HEAD:refs/for/{branch-name}

Gerrit then creates a new revision for review, as a diff against the branch named by {branch-name} in the fictitious ref that the push targets. It's slightly ugly but conceptually it's fairly clean and it just works. Everything else about Gerrit is pretty much one disaster after another, but the interface between the git client and the Gerrit server is pretty decent.
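To make the magic-ref idea concrete, here is a minimal Python sketch of how a server might recognise such a push and work out which branch the review is against. The function name review_target is made up for illustration; this is not Gerrit or Phorge code, just the parsing step under the assumption that the target ref looks like refs/for/{branch-name}, optionally followed by %-style push options.

```python
from typing import Optional

# Prefix of the "fictitious" ref that signals "open a review" instead of
# "update a branch".
MAGIC_PREFIX = "refs/for/"


def review_target(pushed_ref: str) -> Optional[str]:
    """Return the branch a review is aimed at, or None for an ordinary push.

    Hypothetical helper: it only inspects the ref name and knows nothing
    about the repository itself.
    """
    if not pushed_ref.startswith(MAGIC_PREFIX):
        return None
    branch = pushed_ref[len(MAGIC_PREFIX):]
    # Drop anything after '%', e.g. refs/for/main%topic=foo
    branch = branch.split("%", 1)[0]
    return branch or None
```

With this, refs/for/main is treated as a review against main, while refs/heads/main is left alone as a normal branch update.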

So with all that baggage out of the way, a bit more commentary and hopefully some opportunities for progress:

From what I've gathered, Wikimedia folks weren't the only ones who had an allergic reaction to Arcanist coupled with a strong preference for a "pure" git interface. A careful reading of Phabricator T5000 will reveal that several potential Phabricator migrations were killed by users hostile to the idea of Arcanist as a front end to their code review system.

A Phorge without client-side Arcanist?

That same upstream task also discusses a potential path to supporting a git-based interface to Differential; it's just a bit more than trivial to implement. It's totally tractable though and should probably be considered as one possible way forward. Essentially it boils down to just running Arcanist server side against a branch that is pushed to some temporary place on the server. That is pretty much exactly what Gerrit does, if you overlook the fact that Gerrit is 100% Java and completely re-invents the low-level details instead of using native git.

There are still some advantages to running tests and linters on the client side for faster iteration, instead of pushing and waiting for the remote CI system to spit back errors at you asynchronously. I don't personally think that we should kill arc so much as maybe offer an alternative arc-free mode of operation.

An outline of the problems with Arcanist and the constellation of somewhat-related barriers to adoption of Phorge in 2022:

  1. The need to install something client side in order to have a reasonable code review interface is off-putting to many people.
    1. A lot of people seem to especially dislike that it is a PHP application.
    2. Windows support is second-class for various reasons.
    3. This could likely be partially alleviated if Arcanist were to be written in Go; however, that's a huge undertaking and only a partial solution.
    4. There is a pretty compelling argument to be made for the way Gerrit works with git. It's really not very different from the Phabricator workflow; however, git push origin HEAD:refs/for/main is fairly straightforward, nearly elegant even, using only the standard git client tools with no further dependencies.
  2. It seems that the pull request workflow, specifically the GitHub & Gitlab implementations of it, have almost completely won in the minds of the vast majority of younger developers.
    1. The war may have already been lost; however, I still believe there is value in write, review, merge, publish.
    2. Phorge might be doomed to obscurity if we fail to address the on-boarding friction and general confusion that new users experience when learning the Phorge workflow with expectations set by their prior git(hub/lab) experience.
      1. The mental models are different enough that Phorge can be very confusing and counter-intuitive to someone who has only ever experienced a PR-based code review process.
      2. This results in an initial hostility towards the tool and many wind up hating the tool and never getting over the learning curve to appreciate the benefits that Phorge can offer.

Event Timeline

I find this rather interesting (and a little bit weird, to some extent), because IMO the arcanist command line tool is one of the things which _add_ value to Phabricator and set it apart from its alternatives.

  • IMO, the UI of Phabricator vs Gitlab/Github are pretty much on par. I can't offhand think of any big differences.
  • Diffs vs Branches. I prefer the former, but I don't think there is a meaningful difference between diffs vs short-lived branches done right.

What I like about Phabricator:

  • I do not need to create a branch on remote and fiddle with UI to create a PR. This is handled directly from the command line with arc work + arc diff.
  • I have colleagues who use PR workflows. They spent months setting up and tweaking their git hooks and other crap to ensure their workflow was satisfactory. And with all that, they still have no idea whether everyone's workflows are set up correctly and run on each diff (completely reliant on everyone's setup working correctly). All of that comes "for free" with arcanist.

Honestly, if gitlab or github offered an arcanist-like tool for their code review flows, I'd probably have switched a couple of years ago. If we decide to shift (much more likely now that Phacility has "ceased" operating), we might end up building one for ourselves.

That being said, I agree with some of the key points:

  • It's unfortunate that arcanist is the only really practical way to do code reviews. It creates a barrier to entry, and infrequent contributors rarely have the patience for it.
  • Being forced to deal with PHP sucks, as does the fact that it doesn't function very well on Windows.

Really, if you look at Phabricator, what are the killer features which set it apart from Gitlab and other competitors? To me, there are two:

  • Herald: allows you to script business rules / workflows into the platform.
  • arcanist: handles most of the workflow scripting wrt code reviews.

Everything else is either equivalent - or better - in other solutions, IMO. If I had time and resources to spare, I'd probably look into building this kind of tool against a PR-based workflow.

@micax: Good points and it's helpful to hear another perspective on this. From my past experience using Phabricator on a corporate team I definitely think that arcanist helped keep everyone's workflow consistent and simple.

Honestly the effort to set up arcanist isn't huge (the flow for setting up your CLI cert couldn't be more perfect and user friendly IMO) and it's all worth it because of the productivity gained by automatic lint fixes, easy patch submission and code review checkouts (arc patch is awesome!)

I am definitely not in the arc haters club, I've just seen a lot of feedback from people who expressed some of the perspectives that I tried to capture here.

I think a lot of what differentiates the various code review systems and workflows is actually the subtle ways that they nudge you towards certain behaviors. Phabricator and Gerrit both encourage small changes that are a single unit of work. Branch-based workflows facilitate the opposite. You can have the Phabricator-style workflow with GitHub but it requires discipline. In short: subtle differences in the tooling can have an unexpectedly big impact on behaviors because people tend to take the path of least resistance.

In T15096#2229, @micax wrote:

Honestly, if gitlab or github offered an arcanist-like tool for their code review flows, I'd probably have switched a couple of years ago. If we decide to shift (much more likely now that Phacility has "ceased" operating), we might end up building one for ourselves.

FWIW, I still think there is a bit of life left in the platform, especially with Phorge development progressing. We're moving slowly but several useful bits have merged already and I'm sure more are forthcoming.
There is still some activity upstream in core Phabricator as well. Just basic maintenance but don't count the project dead yet!

Really, if you look at Phabricator, what are the killer features which set it apart from Gitlab and other competitors? To me, there are two:

  • Herald: allows you to script business rules / workflows into the platform.
  • arcanist: handles most of the workflow scripting wrt code reviews.

Agreed, Herald is awesome. It's used extensively by Wikimedia for all kinds of obscure operations.

Everything else is either equivalent - or better - in other solutions, IMO. If I had time and resources to spare, I'd probably look into building this kind of tool against a PR-based workflow.

While this may be true for the most part, github is proprietary and gitlab is largely commercial software - the community edition lacks a lot of the nice features. For these reasons I think there is still room for Phorge to be a useful free/open source alternative.

One big advantage vs. gitlab off the top of my head: Phabricator is _much_ easier to self-host. Gitlab is a complex web of services that require a lot of sysadmin work to set up and maintain. Phabricator largely takes care of itself and the well-understood LAMP stack is very reliable / easy to operate, very scalable and not particularly demanding on server resources.

Definitely agree that the effort to set up arcanist isn't huge. And at my current work, it's baked into our common Dev PC setup, so it's almost zero effort. But there is an effort, and a dev/user who is just passing by to fix a typo or suggest a one-line change in some code isn't going to be willing to make that effort.

I also agree about the benefits of Arcanist (especially the subtle ones). Of all the tools I've used, arcanist (and Phabricator) are definitely the tools I feel have contributed best to a healthy code review culture without requiring some sort of carrot/whip approach. The alternative requires discipline, as you say, and in my experience discipline tends to slip very easily. Even just simple stuff like having people delete their branches after creating them. I personally also like the psychological effect that your code is not in the repository until _after_ the diff is landed.

That being said, there is a threshold required to get into using arcanist which one shouldn't under-estimate. I've introduced Phabricator into two engineering teams so far (at two different companies), and it takes time to get people comfortable (though the arc work/diff/land flow was an improvement). And when issues arise in someone's workflow (as they inevitably will at some point), people tend to get lost.

Good point about Phabricator being simple to self-host and maintain onsite. That's definitely a factor in why I've used Phabricator over the alternatives. And I'm not discounting the product yet either (still using it), though I do admit I doubt its future as a full competitor to things like Gitlab. As I think I mentioned in one of the discussions last year, I have more belief in it as a light-weight competitor that does one or two things (one of those things being code reviews) really, really well. As such, I'd love to see:

  • Improvements to the code review toolset (such as you discuss in the original post). I think the underlying problem is the lack of compatibility with the PR workflow, though, rather than arcanist itself (although getting rid of PHP would certainly help).
  • Being actually integration friendly. Not that Phabricator itself is necessarily unfriendly, but Phacility were pretty clear that they didn't care to integrate with anyone else. I think that was a mistake. It's possible for me to argue for using Phabricator as an alternative to Crucible, Yousee, Gerrit, etc (and where I use Phabricator today, it's almost exclusively as a code review tool). I am never going to be able to convince business that Phabricator is _the_ alternative to Jira or Confluence in a company that can afford such products.
  • Ditch stuff like Harbormaster. Even when they started talking about it, IMO it was a bridge way too far. It would have been much better to ensure Phabricator had seamless integration possibilities with Jenkins/TeamCity/Bamboo, etc. In my current setup, we have Phabricator's code reviews integrated into our CI/CD workflow (building off Matt Oliver's excellent work), but it's - IMO - more complicated than it should be to do this, because Phabricator insists that Harbormaster is its CI tool.

Thank you for these write-ups. I'll need more time to review; however, I noticed Evan recently started a task in the upstream where it looks like he's investigating compiling PHP to a library for use with a custom native entrypoint, which would allow distributing arcanist as a single binary (he estimates ~10mb in size).
https://secure.phabricator.com/T13675

As for my thoughts on https://secure.phabricator.com/T5000, the current opinion I hold is that we should translate local/development branches into phorge revisions, rather than managing the branches in the upstream repo. So if a user creates a new branch and pushes it, phorge would check against a list of branches that it's meant to track (I think this config already exists?) and if it's not there it would convert the branch into a revision instead. One difficulty however is tracking which local branches belong to which revisions -- today this is managed due to arcanist modifying the local commit message to reference the revision, so that subsequent updates/pushes will update the revision instead of making a new one.
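To illustrate, here is a rough Python sketch of the decision a server-side hook could make for each pushed branch. Everything here is an assumption for the sake of the example: classify_push, the action names and the tracked-branch list are made up and are not existing Phorge APIs, and the revision lookup assumes the commit message carries the "Differential Revision: .../D123" line that arc normally adds.

```python
import re
from typing import Iterable, Optional, Tuple

# Line that ties a local commit to an existing revision, for example
# "Differential Revision: https://example.test/D123" (example.test is a
# placeholder host).
REVISION_LINE = re.compile(
    r"^Differential Revision:\s*\S*?D(\d+)\s*$", re.MULTILINE
)


def classify_push(
    branch: str, tracked_branches: Iterable[str], commit_message: str
) -> Tuple[str, Optional[int]]:
    """Decide what a push to `branch` should mean (hypothetical policy)."""
    # Branches the repository is configured to track are updated as usual.
    if branch in set(tracked_branches):
        return ("update-branch", None)
    # An untracked branch whose commit already names a revision updates it.
    match = REVISION_LINE.search(commit_message)
    if match:
        return ("update-revision", int(match.group(1)))
    # Otherwise the branch is converted into a brand new revision.
    return ("create-revision", None)
```

The last case is where the tracking problem shows up: with no client-side tool to rewrite the commit message, something else has to remember that the branch now belongs to the newly created revision.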

For mercurial I think the same approach could be used, but utilizing "topics" instead of mercurial branches.

In T15096#2233, @speck wrote:

Thank you for these write-ups. I'll need more time to review; however, I noticed Evan recently started a task in the upstream where it looks like he's investigating compiling PHP to a library for use with a custom native entrypoint, which would allow distributing arcanist as a single binary (he estimates ~10mb in size).
https://secure.phabricator.com/T13675

That's very interesting!

As for my thoughts on https://secure.phabricator.com/T5000, the current opinion I hold is that we should translate local/development branches into phorge revisions, rather than managing the branches in the upstream repo. So if a user creates a new branch and pushes it, phorge would check against a list of branches that it's meant to track (I think this config already exists?) and if it's not there it would convert the branch into a revision instead.

I agree with converting branches into revisions; this is essentially what Gerrit does and it works well. The code still exists in Gerrit's repo, just in a Gerrit-specific location (refs/changes/) instead of a branch (refs/heads/). One sort of unique aspect of Gerrit is that it has no database other than git repositories on disk. They eliminated the MySQL database a long time ago. It does use Elasticsearch for indexing, however.

One difficulty however is tracking which local branches belong to which revisions -- today this is managed due to arcanist modifying the local commit message to reference the revision, so that subsequent updates/pushes will update the revision instead of making a new one.

The way Gerrit handles this is to require a "Change-Id" footer in the commit message, rejecting the push if the ID isn't present. The rejection message includes instructions with a one-line command that copies the standard commit-message hook from the server and installs it into the git hooks directory, so that the Change-Id is generated client side. Then just git commit --amend and push again. It's only slightly inconvenient the first time you try to submit to a repo, and after that it's fully automatic.
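As a sketch of the server-side half of that check, the Python below looks for a Change-Id line in the last paragraph of a commit message. It is a simplified illustration, not Gerrit's actual implementation: find_change_id is a made-up name, and the pattern assumes the usual form of the ID, a capital "I" followed by 40 hex digits.

```python
import re
from typing import Optional

# Assumed footer form: "Change-Id: I" followed by 40 lowercase hex digits.
CHANGE_ID_LINE = re.compile(r"^Change-Id: (I[0-9a-f]{40})$")


def find_change_id(commit_message: str) -> Optional[str]:
    """Return the Change-Id from the footer, or None if it is missing."""
    # Treat the last blank-line-separated paragraph as the footer.
    paragraphs = re.split(r"\n\s*\n", commit_message.strip())
    for line in paragraphs[-1].splitlines():
        match = CHANGE_ID_LINE.match(line.strip())
        if match:
            return match.group(1)
    return None
```

A push would be rejected when this returns None. When it returns an ID, the server can look up whether a review with that ID already exists and update it instead of opening a new one.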

I'd point out that Gerrit was originally intended to work with a client-side tool called repo (which also manages multi-repo code trees).
The git push ... <magic ref> is kind of a workaround for users that found having a client-side tool annoying to use :)

The way Gerrit handles local commits also runs counter to the way Phorge does: we believe the local commits are completely irrelevant, whereas Gerrit forces the user to clean those up manually (and GitHub just exposes them as-is. Insert rant against pull requests here).

I agree that "the pull request workflow, specifically the GitHub & Gitlab implementations of it, have almost completely won in the minds of the vast majority of younger developers", but I think that's something we should still fight against. It really is bad enough, IMO, that we should not surrender to it, but try to educate away from it.

The way forward, I think, should be something like this:

For new users, support the magic-ref style of Gerrit, but with web-UI, wizard-based next steps, where we walk them through creating/updating the right revision, maybe explain what happens to all the commits and what other users will experience. Treat this as an onboarding experience - with the expectation that full-time contributors will eventually switch to Arcanist.
Maybe upsell Arcanist at this point.

Then, to simplify the installation flows - bundle Arcanist with a Windows and Mac style installer that includes its own php.
I'm somewhat tempted to say "let's also add a mini-Arcanist implemented in Go", but that's a lot of work (specifically, duplicating the arc-diff flow is a major undertaking) and only slightly simplifies the setup.
However, maybe we can cheat here: A single binary download, that, when invoked, downloads and installs a private PHP and the whole Arcanist client - or just runs it if it's already installed. This simplifies a lot of the hassle for the user, and doesn't require us to maintain 2 copies of anything. With any luck, most users won't know they run PHP...
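The logic of such a launcher is small. Here is a Python sketch of just the decision it would make, as a way of showing the idea rather than a proposal for the implementation language. The directory layout (php/bin/php and arcanist/bin/arc under one private install directory), the "install" placeholder step and the name plan_launch are all assumptions made up for this example.

```python
from pathlib import Path
from typing import Callable, List


def plan_launch(
    install_dir: Path,
    args: List[str],
    exists: Callable[[Path], bool] = Path.exists,
) -> List[List[str]]:
    """Return the steps a bootstrap launcher would take, in order.

    Hypothetical layout: a private PHP and an Arcanist checkout living
    side by side under `install_dir`.
    """
    php = install_dir / "php" / "bin" / "php"
    arc = install_dir / "arcanist" / "bin" / "arc"
    steps: List[List[str]] = []
    # First run (or broken install): fetch the private PHP and Arcanist.
    if not (exists(php) and exists(arc)):
        steps.append(["install", str(install_dir)])
    # Every run ends by handing the user's arguments to the real arc.
    steps.append([str(php), str(arc), *args])
    return steps
```

On a machine where both files already exist, the plan is a single step that runs arc with the user's arguments, so the user never has to deal with PHP directly.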

In T15096#2329, @avivey wrote:

For new users, support the magic-ref style of Gerrit, but with web-UI, wizard-based next steps, where we walk them through creating/updating the right revision, maybe explain what happens to all the commits and what other users will experience. Treat this as an onboarding experience - with the expectation that full-time contributors will eventually switch to Arcanist.
Maybe upsell Arcanist at this point.

For me as a new developer, the way I learned to use the PR workflow was by using GitHub's web UI. Allowing a user to make quick simple changes with nothing but a web browser is IMO the single best way to encourage new contributors.

Allowing a user to make quick simple changes with nothing but a web browser is IMO the single best way to encourage new contributors.

Adding a web-based editor is another often-requested feature, but this is much more complicated in Phorge because we don't do branches. I guess we can sort-of-simply make a "create Revision for single file" flow directly in the browser - it might make sense for some use-cases like typo-fixes.