Page MenuHomePhorge

arc browse: add support to Subversion repos
Needs ReviewPublic

Authored by valerio.bozzolan on Sep 10 2024, 11:26.

Details

Summary

The command arc browse FILE introduced in 2020, finally supports Subversion repositories.

Closes T15541

Test Plan

✅ Subversion tests (simple checkout)

See that Subversion finally works. Tested on my Subversion repository on my Phorge, with this checkout:

svn checkout svn+ssh://git@gitpull.it/source/example-svn-repo/ example-svn-repo-root
Working directoryTest commandBeforeAfter
./example-svn-repo-rootarc browse README.mdlinklink
./example-svn-repo-rootarc browse asd/lol/OTHER.mdlinklink
./example-svn-repo-root/asdarc browse lol/OTHER.mdlinklink
./example-svn-repo-root/asd/lolarc browse OTHER.mdlinklink
./example-svn-repo-root/asd/lolarc browse ../../README.mdlinklink
./example-svn-repo-rootarc browse --branch wtfbranch README.mdlinklink ("BRANCH OPTION NOT AVAILABLE")

See that finally SVN works.

The last example uses the git option --branch that has little sense in SVN so it works but with a warning, to omit --branch in SVN.

Note that if you have not the .arcconfig file this does not work, with and without the change.

✅ Extra Subversion tests (deep checkout)

See that Subversion finally works (tested on my Subversion repository on my Phorge):

svn checkout svn+ssh://git@gitpull.it/source/example-svn-repo/asd example-svn-repo-subdir
Working directoryTest commandBeforeAfter
./example-svn-repo-subdirarc browse lollinklink
./example-svn-repo-subdirarc browse lol/OTHER.mdlinklink

Note that if you have not the .arcconfig file this does not work, with and without the change.

Note that if you checked a partial sub-directory (like in this example) you cannot indeed access parent files from the command line (as design in SVN).

✅ Git tests

See that git still works. Tested on master current local branch.

Working DirectoryTest commandBeforeAfter
./phorgearc browse README.mdlink✅ same
./phorgearc browse src/docs/user/userguide/remarkup_cowsay.divinerlink✅ same
./phorge/src/docs/userarc browse userguide/remarkup_cowsay.divinerlink✅ same
./phorge/src/docs/userarc browse ../../../README.mdlink✅ same
./phorgearc browse --branch stable README.mdlink✅ same
./phorgearc browse --branch stable src/docs/user/userguide/remarkup_cowsay.divinerlink✅ same
./phorge/src/docs/userarc browse --branch stable userguide/remarkup_cowsay.divinerlink✅ same
./phorge/src/docs/userarc browse --branch stable ../../../README.mdlink✅ same

Other Tests

From the arcanist directory, these still work, same result before and after the patch:

arc browse T15541https://we.phorge.it/T15541
arc browse -- T15541https://we.phorge.it/T15541
arc browse HEADhttps://we.phorge.it/rARCb3d45c71041478c1410bfae2aaf88a95106cfc5b
arc browse -- HEADhttps://we.phorge.it/rARCb3d45c71041478c1410bfae2aaf88a95106cfc5b or https://we.phorge.it/D25823 (it depends on your last commit)
arc browse --help✅ Still shows help message
arc browse -- --help✅ Unable to resolve argument "--help"

Test narration

You stand in the middle of a small clearing in the woods of your Phorge.

Create a new Subversion repository in Diffusion with some contents, like these:

README.md
trunk/lol.txt
/tags/17_04_2019/lol.txt

Then clone that repository with your usual URL, that is:

svn checkout svn+ssh://phab@phorge.localhost/source/example-svn-repository example-svn-repository

All of these finally work in SVN:


In the very same SVN repository, manually go to the Diffusion > Subversion menu and set "trunk/"
as sub-directory - so the web interface only shows the trunk directory.

All of these still work:

  • arc browse trunk/lol.txt
  • cd trunk; arc browse lol.txt; cd -

Inside the directory of arcanist (git), all of these still work as expected, opening the browser
web on the specified file and on the specified branch:

  • arc browse README.md
  • arc browse bin/arc
  • arc browse README.md --branch stable
  • arc browse README.md --branch master
  • arc browse bin/arc --branch stable
  • arc browse bin/arc --branch master
  • cd bin; arc browse arc; cd -
  • arc browse bin/arc

The browser web carefully opens as expected.
If it worked before, it works now, but also on Subversion repos.

Diff Detail

Repository
rARC Arcanist
Branch
arcpatch-D25823_1
Lint
Lint Passed
Unit
Tests Passed
Build Status
Buildable 1623
Build 1623: arc lint + arc unit

Event Timeline

Note that I introduced the method isBranchJustFilesystemPath() that returns something different for Subversion, just to recognize Subversion,

So in the future we can replace some occurrences in this way that is less hacky and more object-oriented:

- if ($api instanceof ArcanistSubversionAPI) {
+ if ($api->isBranchJustFilesystemPath()) {

I have not a strong opinion against the name of that method. 🤔 I've invested all my creativity there, so, suggestions are very welcome.

a. I'm not sure I like isBranchJustFilesystemPath() to check Is this SVN - if we're trying to be generic, then there may be a different vcs that does this particular trick, but has different logic; And if we're not trying to be generic, then isSVN() is probably a good-ish name? or isVCS('SVN').

b. If the user explicitly specifies a branch, shouldn't that take precedent over whatever is calculated from the $api?

c. don't we have a matching logic for "get branch" for other VCSes?

src/browse/query/ArcanistBrowsePathURIHardpointQuery.php
59

if you default to $ref->getBranch() here, you won't need the else.

64

Did you actually test it on a branch that contains / ?

src/ref/ArcanistRepositoryRef.php
104–107

won't this convert both '' and '/' to empty string? vs the comment saying we want '/'?

valerio.bozzolan marked 2 inline comments as done.

integrate tips

wow, it also makes sense to add --branch support to SVN! added

a. I'm not sure I like isBranchJustFilesystemPath() to check Is this SVN - if we're trying to be generic, then there may be a different vcs that does this particular trick, but has different logic; And if we're not trying to be generic, then isSVN() is probably a good-ish name? or isVCS('SVN').

Some thoughts:

  • it we try to add a method ArcanistSubversionAPI#isSVN() that returns true, we should also add ArcanistRepositoryAPI#isSVN() that returns false; and it may create some desire to have also isGit() etc. but that may become a bit weird to implement them all. Also, if we decide to do not add them all (skipping isGit()) it still will be a bit not-intuitive. At this point, it's probably better to just run if ($api instanceof ArcanistSubversionAPI).
  • we may try the "smarter way" to add ArcanistSubversionAPI#isSVN() implemented like return $this instanceof ArcanistSubversionAPI but that would introduce a circular dependency, so still probably not a desiderata
  • it we try to add a method ArcanistRepositoryAPI#isVCS('SVN') we should also probably add a data-member "SVN" / "GIT" / "MERCURIAL" in all classes, but that it's probably overkill, and maybe if ($api instanceof ArcanistSubversionAPI) would be better. Moreover, that will need some extra backward-compatibility extras to support eventual unknown downstream additional esoteric APIs. Not really something critical, but something still somehow overkill, compared with if ($api instanceof ArcanistSubversionAPI).
  • by the way, the semantic of isBranchJustFilesystemPath() somehow makes sense! (surprised face) I'm really surprised that it's also somehow readable, if we look at ArcanistRepositoryRef and its logics to escape a filesystem path (not needed if the branch is already a filesystem path). So we tried to describe the specific nature of the branch. In any case, even if the name is a bit "unusual", avoiding instanceof is probably still something interesting, for the fun of object-oriented fans.

b. If the user explicitly specifies a branch, shouldn't that take precedent over whatever is calculated from the $api?

  • premising that in git this hasn't changed, so, before yes; after yes
  • for SVN, premising this never worked: hey, you are 100% right, it may make sense to implement too! I've added support right now and added in test plan (arc browse --branch /tags/17_04_2019/ lol.txt - wow) 👍

c. don't we have a matching logic for "get branch" for other VCSes?

  • OK I don't get this question. I think the answer is that the default behavior of arc browse was to operate on ArcanistRepositoryRef#getDefaultBranch()
  • but allowing to to specify a arc browse --branch FOO and that is $ref->getBranch()

Maybe I should just try to improve the ArcanistRepositoryAPI#getDefaultBranch() 🤔 since the root problem is there, and the current solution just involves more code to avoid to fallback on that "master" default

mm, isBranchJustFilesystemPath() makes more sense now, as in "the branch name is part of the URI path"...

Also, I think I've literally never used SVN with Arc/Phabricator.

if "branch" is just part of the path, wouldn't the full file-path be the same in the local disk and in the server repository?

mm, isBranchJustFilesystemPath() makes more sense now, as in "the branch name is part of the URI path"...

Nice 👌

Also, I think I've literally never used SVN with Arc/Phabricator.

That's an interesting wild experience. In the company I work for, they have kind of 100/200 SVN repositories, and Phorge saves their life.

if "branch" is just part of the path, wouldn't the full file-path be the same in the local disk and in the server repository?

I've wrote this comment for 20 minutes to answer correctly. The short answer is yes, you are probably very right.

So probably, even if arc browse is useful, arc browse --branch in Subversion still has probably completely no sense in 99.99999% of cases...

Let's pretend to have this remote repository structure:

/README.md
/trunk/lol.txt
/tags/17_04_2019/lol.txt
/tags/17_04_2020/lol.txt
/tags/17_04_2021/lol.txt

The above directory structure is surprisingly common. Note that the "trunk" directory is just a directory (that is the "master") and the "tags" directory is just a directory (with copy-pasted old versions of trunk in it - let's say - lol).

Anything really is just a directory: yes, worldwide Subversion users just somehow communicated to each-other that the "trunk" directory contains the last version; and the "tags" directory contain old versions. That's it. Don't ask me why. LOL

Subversion users can download everything with:

svn checkout svn+ssh://phab@phorge.example.com/source/myrepo myRepoComplete

And they obtain on your computer the very exact remote complete structure. At this point you can run arc browse correctly. Some working examples:

$ arc browse README.md
http://phorge.localhost/source/myrepo/browse/README.md

$ arc browse trunk/lol.txt
http://phorge.localhost/source/myrepo/browse/trunk/lol.txt

$ arc browse tags/17_04_2021/lol.txt
http://phorge.localhost/source/myrepo/browse/tags/17_04_2021/lol.txt

And it works.

At this point you may also try some "intuitive" things, like, opening lol.txt on a different "branch" but... it's interestingly nonsense.

$ arc browse --branch tags/17_04_2021   trunk/lol.txt
→ 404 page: http://phorge.localhost/source/myrepo/browse/tags/17_04_2021/trunk/lol.txt
→ expected: http://phorge.localhost/source/myrepo/browse/tags/17_04_2021/lol.txt
→   but we really cannot imagine that... how we should strip "trunk/"?

$ # Maybe this?
$ arc browse --branch tags/17_04_2021   lol.txt
→ Unable to resolve argument "lol.txt"

Or, what about opening tags/17_04_2021/lol.txt but on the trunk?

$ arc browse --branch trunk   tags/17_04_2021/lol.txt
→ 404 page: http://phorge.localhost/source/myrepo/browse/trunk/tags/17_04_2021/lol.txt
→ expected: http://phorge.localhost/source/myrepo/browse/trunk/lol.txt
→   but we really cannot imagine that... how we should strip "tags/17_04_2021/"?

In short, this seems just nonsense.

Also... Subversion users can also download just part of the remote tree. That's interesting.

For example, you can download only the trunk and only work on that:

svn checkout svn+ssh://phab@phorge.example.com/source/myrepo/trunk myRepoOnlyTrunk

The above command in fact only downloads the trunk/ and it only generates this filesystem structure under myRepoOnlyTrunk:

/.svn
/lol.txt

So, even in the above situation, now arc browse lol.txt surprisingly works and gets the expected remote URL on the trunk.

But what about our esoteric intention to open the file lol.txt on the branch /tags/17_04_2019?

$ arc browse --branch tags/17_04_2019 lol.txt
→ 404 page: https://sviluppo.erinformatica.it/source/fe_data_imprt/browse/tags/17_04_2019/trunk/lol.txt

In Short

Premising that arc browse would be still super-useful also in Subversion,

We we should probably block the feature arc browse --branch for Subversion users, since this only leads to a path full of tears and madness.

Ok, so --branch probably doesn't make much sense in that case, right.

What if the user used the svn checkout svn+ssh://phab@phorge.example.com/source/myrepo/trunk myRepoOnlyTrunk form - are we still able to find the right target when arc browse lol.txt? Is it a common use-case?

I will add a warning about --branch and better management

What if the user used the svn checkout svn+ssh://phab@phorge.example.com/source/myrepo/trunk myRepoOnlyTrunk form - are we still able to find the right target when arc browse lol.txt? Is it a common use-case?

Yes and yes!

The potential complication is that if the .arcconfig sits at the root of the repo only, then checking out a slice below that in the directory tree won't have the .arcconfig. But it's kindof implied by the note at https://we.phorge.it/book/phorge/article/arcanist_new_project/#arcconfig-basics that one should commit .arcconfig files alongside your other code and follow normal branching workflow (aka whatever perverse bespoke branching workflow this particular group of SVN users has chosen) with it, and you just generally won't have a proper config otherwise, so that's not at all specific to arc browse. And I've tested and the fallback plan of specifying the values on the commandline, aka arc --config repository.callsign="MYREPO" --conduit-uri http://phabricator.gmcl.internal browse lol.txt (though I actually just used a checkout of a real repo at my work), does work fine.


Our somewhat pathological use of SVN at my work shows how complicated getting --browse to work properly for SVN could be, because our structure is essentially:

  • Our Big Repo Containing Most Projects
    • Project1
      • trunk
      • branches
        • release11_4
        • release11_5
    • Project2
      • trunk
      • branches
        • release11_4
        • release11_5
    • working
      • trunk
        • external pointing to Project1/trunk
        • external pointing to Project2/trunk
      • release11_4
        • external pointing to Project1/branches/release11_4
        • etc etc etc, you get the idea.

(If unfamiliar with SVN you can think of externals roughly the same as Git submodules, or more accurately like symlinks.)

So what does a branch even, like, mean, man? It's clear enough to a human considering context and nomenclature, but from arc's perspective it's unclear as hell. The plan to warn SVN users about --branch is probably the best possible approach, barring rather complex logic that humans would still find some way to break the assumptions of ;)

(One approach we used to try and square Phabricator's concepts of what a repo is with our own multi-project-within-one-repo usage was to create additional repositories in Phabricator that each used the special Import Only option to specify the project in question's path, and from there then one would see the trunk, branches, and tags folders for each project in its standalone "repository" in Diffusion and etc. The benefits of this though ended up being meager, and the proliferation of "repositories" that are all covered under the real repository was a bit ugly and junked up the listings as we started using some Git repos in addition to the two SVN repos our main development efforts live in, so we abandoned that.)

(The Arguments have a supports feature for these cases - like https://we.phorge.it/source/arcanist/browse/master/src/workflow/ArcanistLintWorkflow.php$67 . Not sure what it actually does, but 🤷🏻‍♂️)

Hi @keithzg (and any SVN hacker) do you like the new added Subversion support?

The --branch feature in SVN is ignored with a nice warning, so this warning cannot break anything.

valerio.bozzolan edited the test plan for this revision. (Show Details)

@avivey Thanks for the tip about arguments allow-list in getArguments() https://we.phorge.it/source/arcanist/browse/master/src/workflow/ArcanistLintWorkflow.php$67 but I was not able to use that since here it's using getWorkflowArguments() 🤔 So I've just added a warning. It works lol. Check "BRANCH OPTION NOT AVAILABLE"

Hi @keithzg (and any SVN hacker) do you like the new added Subversion support?

Oh yeah this work has been very nice and convenient for my uses! I have kept my local copy tracking this :)

Poking at it a bit more intentionally today, I was briefly surprised that arc browse alone didn't work, but it does give a clear enough warning that "This repository API ('svn') does not support the requested capability.". Judging by what it does on a Git repo, that perhaps makes sense.

Conversely, arc browse . works for what I expected a pathless invocation to do, and funny enough that doesn't work on Git repos! So we have feature parity ;)

... Conversely, arc browse . works for what I expected a pathless invocation to do, and funny enough that doesn't work on Git repos! So we have feature parity ;)

Nice catch.

About arc browse .

Nice catch. In git that never worked, before and after this change. Let's continue in T15957: 'arc browse .' should work in git but this specific fix will probably be not covered by this patch. Patch welcome :D

I'm curious how it works for you in SVN now, since to me it never worked anywhere (git+svn) and it still never work after this patch 💃 So I think this patch is consistent for that - lol. Again, let's continue in T15957.

About arc browse

Uh! Nice catch bonus level. I have absolutely no idea how to make this work in Subversion.

I see that git tries to open the HEAD commit. So I think a root cause is the missing of "HEAD" alternative for Subversion:

diff --git a/src/repository/api/ArcanistSubversionAPI.php b/src/repository/api/ArcanistSubversionAPI.php
index e0733b47..5bdd98ff 100644
--- a/src/repository/api/ArcanistSubversionAPI.php
+++ b/src/repository/api/ArcanistSubversionAPI.php
@@ -230,6 +230,10 @@ final class ArcanistSubversionAPI extends ArcanistRepositoryAPI {
     return null;
   }
 
+  public function getHeadCommit() {
+    return $this->getSVNBaseRevisionNumber();
+  }
+
   public function getSVNBaseRevisionNumber() {
     if ($this->svnBaseRevisionNumber) {
       return $this->svnBaseRevisionNumber;

But this diff only unlocks other sub-problems :D Maybe because there is the wrong assumption that the commit is an unique identifier among all type of repositories. So the query works in git but not in Subversion (since all Subversion repos have the very same incremental diffs from 1, 2 etc.) Still unclear. Bug report welcome on this. I don't open it by myself since I'm afraid that it's a rabbit hole :D But Bug Reports welcome in Arcanist <3

Glad that this patch seems good to you! :)