The hardest thing about VersionPress is database merging; everything else is relatively simple. Very few dare to take on this problem so I was excited when Delicious Brains announced their newest project, Mergebot, yesterday. I'd like to share some early thoughts on it.
BTW, it has not really been much of a surprise to anyone who follows this space closely. Brad talked about the then-called “Data Hawk” / datahawk.io in this episode of Apply Filters, and I've personally chatted with the devs at various WordCamps. Don't expect any secrets though; I didn't invest enough beers into learning the dark magic they use, and I don't know much more than what is publicly available.
First of all, I have to say that Mergebot is the only solution I know of (other than VersionPress) that really tries to tackle the problem properly. For example, people often mention Revisr but its goals were always much simpler. Here's a table from last year that's still valid:
Mergebot, on the other hand, checks the “database merge” column which is exciting. So, how does it compare to VersionPress?
(Just to repeat, I don't have any hands-on experience yet and am awaiting the beta program.)
On the outside, VersionPress and Mergebot are vastly different. VersionPress is an open source project that you run locally on your site. It has no dependency on an external service (but it does have a local dependency on Git, which makes it hard to run on shared hosts), you never send your data anywhere, you pay nothing, etc. Mergebot is practically the opposite on all of these points, being a paid SaaS app, partly proprietary, etc. There's no good or bad here, and I'm sure Mergebot's approach and the strong, reputable company behind it will attract many professional WordPress users.
What I'm mostly interested in is how our solutions compare on the technical level, and here's what I've gathered so far.
Mergebot builds on the same premise as VersionPress: to merge two databases, you need to know their history. It's similar to traditional version control systems: they can provide painless merging only because they know the history of the data, can compare with common ancestors etc. In other words, painless merges are impossible if all you have are two static snapshots of the data.
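To make the "common ancestor" point concrete, here is a minimal sketch of a three-way merge over simple key/value data. It is not how VersionPress or Mergebot actually merge; the function name and the sample data are made up for illustration.

```python
# Illustrative only: a tiny three-way merge over key/value records.
# Names and data are hypothetical, not taken from VersionPress or Mergebot.

_MISSING = object()  # marks "key does not exist on this side"


def three_way_merge(base, ours, theirs):
    """Merge two dicts that both descend from `base`.

    Returns (merged, conflicts). A key conflicts only when both sides
    changed it, and changed it to different values.
    """
    merged, conflicts = {}, []
    for key in sorted(set(base) | set(ours) | set(theirs)):
        b = base.get(key, _MISSING)
        o = ours.get(key, _MISSING)
        t = theirs.get(key, _MISSING)
        if o == t:        # both sides agree (or both deleted the key)
            value = o
        elif o == b:      # only "theirs" changed it
            value = t
        elif t == b:      # only "ours" changed it
            value = o
        else:             # both changed it differently
            conflicts.append(key)
            value = o     # keep "ours" provisionally
        if value is not _MISSING:
            merged[key] = value
    return merged, conflicts


base = {"blogname": "My Site", "page_on_front": 42}
staging = {"blogname": "My Site", "page_on_front": 57}    # changed the front page
live = {"blogname": "My New Site", "page_on_front": 42}   # renamed the site

merged, conflicts = three_way_merge(base, staging, live)
```

With `base` available, both changes are kept and nothing conflicts. With only `staging` and `live`, every differing value looks like a conflict, because there is no way to tell which side actually changed it.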
VersionPress builds on the power of Git directly (all its merges, as well as other things, are powered by Git) but it's not the only way and Mergebot uses “query recording” as Brad mentions in the blog post. I haven't seen how exactly their solution works but I can imagine it.
Then comes the tricky part: updating references. Really, even something as simple as a page_on_front option containing the number 42 is a huge challenge when it comes to DB merging, because ID no. 42 could already be taken in the target database, so you need to re-number the post to 43 (or whatever ID is free) and update all the references to it. In our case, the option page_on_front needs to be updated to 43 or else the site is broken. Now, add 3rd party plugins, shortcodes, meta values, serialized data etc. to the mix and you've got a seriously hard problem.
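Here is a rough sketch of the renumbering step for the page_on_front example. It makes big simplifying assumptions: posts are a plain dict keyed by ID, every clashing incoming ID is treated as a different post, and the list of options that hold post IDs is hard-coded. All function and variable names are hypothetical; this is neither VersionPress nor Mergebot code.

```python
# Illustrative only: renumber clashing post IDs and fix option references.
# The data model and all names here are assumptions made for this sketch.

# Options known to store a post ID (hard-coded for the example).
REFERENCE_OPTIONS = ("page_on_front", "page_for_posts")


def merge_posts(target_posts, incoming_posts, incoming_options):
    """Copy incoming posts into the target, renumbering IDs that clash.

    Returns (merged_posts, merged_options, id_map), where id_map maps
    each incoming ID to the ID it ended up with in the target.
    """
    used = set(target_posts) | set(incoming_posts)
    next_free = max(used, default=0) + 1  # never collides with any known ID

    id_map = {}
    merged_posts = dict(target_posts)
    for old_id in sorted(incoming_posts):
        new_id = old_id
        if old_id in target_posts:  # ID already taken in the target
            new_id = next_free
            next_free += 1
        id_map[old_id] = new_id
        merged_posts[new_id] = incoming_posts[old_id]

    # Update every option that points at a renumbered post.
    merged_options = dict(incoming_options)
    for name in REFERENCE_OPTIONS:
        if name in merged_options:
            old = merged_options[name]
            merged_options[name] = id_map.get(old, old)

    return merged_posts, merged_options, id_map


target_posts = {41: "About", 42: "Contact"}
incoming_posts = {42: "New landing page"}
incoming_options = {"page_on_front": 42}

posts, options, id_map = merge_posts(target_posts, incoming_posts, incoming_options)
```

In this run the incoming post 42 becomes 43 and page_on_front follows it. The hard part in practice is the hard-coded `REFERENCE_OPTIONS`: a real site has references hidden in plugin tables, shortcodes and serialized values that no fixed list will cover.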
How do you solve it? In VersionPress, we introduced the “schema.yml” format which describes the relations in the WordPress database. For example, this is the options table. It's somewhat complicated but I don't think it can be avoided.
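To show why a declarative description helps, here is a small sketch in which generic code rewrites references based on a relations map. The `SCHEMA` dict below is a made-up, heavily simplified structure for illustration; it is not the actual schema.yml syntax, and the function name is hypothetical too.

```python
# Illustrative only: schema-driven reference rewriting.
# SCHEMA is a hypothetical, simplified structure, NOT the real schema.yml format.

SCHEMA = {
    # table name -> {key that holds a reference: entity it points to}
    "options": {"page_on_front": "post", "page_for_posts": "post"},
    "postmeta": {"_thumbnail_id": "post"},
}


def rewrite_references(table, rows, id_maps):
    """Return a copy of `rows` with known references remapped.

    rows:    {key: value} pairs from one table
    id_maps: {entity: {old_id: new_id}}, e.g. {"post": {42: 43}}
    """
    rules = SCHEMA.get(table, {})
    result = {}
    for key, value in rows.items():
        entity = rules.get(key)
        if entity is not None:
            # Only keys declared as references are ever touched.
            value = id_maps.get(entity, {}).get(value, value)
        result[key] = value
    return result


rows = {"page_on_front": 42, "posts_per_page": 42, "blogname": "My Site"}
fixed = rewrite_references("options", rows, {"post": {42: 43}})
```

Note that posts_per_page also contains 42 but is left alone, because nothing declares it as a reference. That distinction is exactly what a schema has to capture, and why plugins would need to describe their own data.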
Mergebot does something similar. I haven't seen their format yet, and it may not be public at all, though I doubt that: they will need plugin adoption to handle real-world sites, and publishing the format will be a prerequisite for that.
UPDATE: In the comments on the Delicious Brains post, Brad showed an example of their DB schema format. It will be a JSON looking like this:
So apart from a slightly different syntax, they are creating a very similar schema format. I've already suggested to them that the two formats could converge so let's see if that happens.
Overall, I currently believe that Mergebot is tackling the problem the right way. Kudos to the team behind it, and certainly give it a try once they launch the beta. In the end, I believe that WordPress users deserve a proper solution to the database merging problem, and having more options to choose from is always good.
And, as I obviously love tables, here's another one hopefully providing a brief and fair summary:
| | VersionPress | Mergebot |
|---|---|---|
| DB tracked in | Git | Custom format |
| Shared host friendliness | No | Yes |
| Distribution model | Free plugin | SaaS |