I apologize for the somewhat click-baity title. I finally got some time to organize thoughts that have been floating in my head for the past couple of months. Strap in for a lengthy monologue; brevity is not my virtue 😄
I've been pondering the idea of starting a small hosted-Phorge business, similar to what Phacility was.
The motivation for this is to expose Phorge to a wider audience: people who are not ready to set up Phorge themselves but nevertheless need a focused all-in-one platform for small-to-medium sized teams. The profits from this endeavor could be channeled back into Phorge, funding critical development effort (the pipe dream of every OSS dev, I know).
Immediately this idea faces a harsh fact: Phacility shut down its operations. Understanding why is important. I certainly don't know all the details, but I think I understand the technical side of the decision. It seems the business had its customers, so it was likely a question of cost/benefit ratio, and, as I hope will become evident by the end of this post, the technical side is very much linked to that ratio. It is worth noting that Phorge has improved on one very important aspect: community management. For Phab it was clear that the BDFL style had reached the end of its usefulness.
Diving into technical details, I started analyzing what it would take to run a hosted Phorge service.
My background is in infrastructure and highly distributed systems, so I look at most problems through the lens of distributed systems and the complexity they bring. As a LAMP stack application Phorge is definitely a distributed system, with a lot of standalone components:
- HTTP server
- Database
- Main PHP application
- Additional PHP applications (daemons, long-running tasks, mail, etc)
- Additional non-PHP applications (Aphlict, a JavaScript websocket component)
- Client-side JavaScript logic
- Maybe more...
The number of components for what is a single web application is high. It all comes from the fact that PHP was designed for web 1.0, where request-response was all you ever needed. PHP, as it is typically deployed, has no threads, no background tasks, no async capabilities. Rasmus Lerdorf, creator of PHP, has explained on multiple occasions why that is. PHP was designed to be a simple sandbox that gets spun up on request and torn down after the response is sent. It is stateless, and it works well for the design specs it was built for.
The web of today is no longer limited by this simple model. Today users expect applications to be highly interactive, collaborative, etc. Lots of use cases can't be satisfied with a simple request-response model. That is why Phorge has daemons, Aphlict, etc. All of those are separate, standalone, long-running processes that address the limitations of PHP.
As an infra engineer I am vividly and painfully aware when a distributed system accumulates too much accidental complexity. The burden of carrying this complexity is always transferred from application developers to infra and operations engineers:
- deployment procedures become more complex
- there is often a need for "glue" components like shell scripts and cron jobs
- operations require orchestration tools: Ansible, Puppet
- workload orchestration tools come into play: Kubernetes, Nomad, Mesos, Docker Swarm, etc.
- glue-like applications such as caches and queues appear: memcached, Redis, Kafka, RabbitMQ, ZeroMQ, etc.
- monitoring and logging become a pain, and now you need distributed tracing and logging frameworks, etc.
Usually each system component also brings its own configuration language/syntax, a different programming language, different bugs. Add a popular cloud provider into the mix and you have the life sentence of a typical devops engineer.
In the end the system gets too complex to understand, and different groups of specialists emerge: database/backend engineers, frontend/UI engineers, devops/SRE/operations. Sprinkle that with the need for management and coordination and you have a classic picture of a corporation or enterprise where all joy of life is sucked out by agile poker and standups.
Phorge definitely has these problems: the devops in me screams "no" when I think about managing multiple isolated instances of Phorge. There are just too many moving pieces. Too many to make it a viable product. The cost of maintenance and innovation is just too high.
Phorge's most noticeable problems are caused by PHP lacking the capabilities a modern web application tool needs. Arcanist, a client-side tool recognized to be a friction point for adoption, is needed because there is no server-side merge capability. Nothing checks whether a Diff is still mergeable, and clients can't use the Gerrit-like approach of a git push to refs/for/branch because nothing would create a Diff from a temporary ref. Adding it as yet another daemon is possible, but that creates yet another standalone component. There is a reason why this has not been done yet: doing it in the main request flow is slow or impossible, and building yet another daemon is very difficult, "unsexy" and error-prone grunt work by definition (basically a programmer becomes a glue-writer for the gaps in the distributed system, gaps that 100% can be automated or removed with better architecture).
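To illustrate the Gerrit-like flow, here is a minimal sketch in Python of the first step such a server-side component would need: recognizing that a pushed ref is a review request and extracting the target branch. Gerrit uses the magic ref prefix refs/for/, optionally followed by %-separated push options; the function name `review_target` and the constant `REVIEW_PREFIX` are my own illustrative names, not anything that exists in Phorge today.

```python
from typing import Optional

# Gerrit-style "magic" ref prefix for review pushes. Reusing the same prefix
# here is an assumption for illustration, not an existing Phorge convention.
REVIEW_PREFIX = "refs/for/"


def review_target(ref: str) -> Optional[str]:
    """Return the target branch if `ref` is a review push, else None.

    Anything after a '%' (Gerrit's push-option separator) is ignored.
    """
    if not ref.startswith(REVIEW_PREFIX):
        return None  # an ordinary push, e.g. refs/heads/master
    branch = ref[len(REVIEW_PREFIX):].split("%", 1)[0]
    return branch or None  # "refs/for/" with no branch is not valid
```

A receive hook could call `review_target(ref)` for every pushed ref and, for a non-None result, create a Diff against that branch instead of updating it. Creating the Diff and checking mergeability are the hard parts and are deliberately left out here.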
I believe the situation can be improved with a coordinated effort to reduce accidental complexity. As an example I want to demonstrate an alternative web application technical stack that has far fewer components:
- Database
- HTTP server + websockets + daemons + server-side rendered HTML + mail + rendering, all in one application, written in one language (it could be Python, Node.js, Go, or some other popular web stack)
- minimal JavaScript on the client to process data from the websocket and perform DOM patching if needed.
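To make the "all in one application" idea concrete, here is a minimal sketch in Python using only the standard library: one process that serves HTTP and also runs a long-lived background worker, with an in-process queue standing in for what would otherwise be a separate daemon plus an external queue. All names (`jobs`, `results`, `worker`, `Handler`, `start`) are hypothetical, and this is a toy under those assumptions, not a proposal for how Phorge should be built.

```python
import json
import queue
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# In-process job queue shared by the request handlers and the worker.
jobs = queue.Queue()
results = []


def worker():
    """Long-running background task living inside the web process."""
    while True:
        job = jobs.get()
        if job is None:  # sentinel value: shut down cleanly
            jobs.task_done()
            break
        results.append(f"processed {job}")  # placeholder for slow work
        jobs.task_done()


class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length).decode("utf-8")
        jobs.put(body)  # hand the slow work to the background worker
        payload = json.dumps({"queued": body}).encode("utf-8")
        self.send_response(202)  # accepted; processing continues later
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def start(port=0):
    """Start the worker and the HTTP server in the same process."""
    worker_thread = threading.Thread(target=worker, daemon=True)
    worker_thread.start()
    server = ThreadingHTTPServer(("127.0.0.1", port), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, worker_thread
```

Calling `start(8080)` gives one process to deploy and monitor; a POST returns immediately with 202 while the worker picks up the job. The trade-off is that queued jobs live in memory, so a real system would still need to persist them, most likely in the database that is already part of the stack.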
It is not a call for a rewrite. Not yet. Rewrites are super hard, but I was looking at Elixir and the Phoenix framework, and at Gitea and Go (Go has decent async capabilities), and I can't avoid noticing that their accidental complexity is orders of magnitude lower, because they can keep 90% of the system in one application.
Thoughts?