Flashlight Firmware Repository

ToyKeeper · June 2, 2019, 3:00am

Warning: Long post, lots of complaining, totally safe to skip.

It seems like git has all the right tools, or at least most of the right tools, but its interface design (and resulting cultural norms) could use some work.

Normally the way I work is… branch off the trunk so I have a dev area to work in. Hack code there for as long as it takes. Sometimes this is an hour, sometimes it’s a few months. Sometimes I merge upstream changes into my branch along the way, especially if it’s a long-lived branch. Then when it’s all done and fully tested, merge it back into trunk. Typically I then move the old working tree into an “old” or “merged” directory, because it often has extra files in it which aren’t and shouldn’t be committed into the repository itself. For example, notes from clients, todo lists for that specific branch, intermediate calculations and scripts, measurements, IDE clutter, etc. I don’t want to delete those files, but I don’t want them in the actual repository either.

And there are often quite a few of these branches being developed simultaneously. I tend to have some shells, editors, and maybe other things open for each one, and I usually leave that stuff open until the branch is merged and ready to be archived.

Any time I need to compare branches, it’s trivial. Standard filesystem tools can be used… whatever tools I like. And the actual process of creating branches is simple too; simply copying a branch creates a new one. Every copy is its own branch, and its directory name is the branch name.

The default behavior in git is all wrong for this type of workflow. And more generally, the interface is a bit unintuitive.

For starters, that first step (creating a branch to work in) usually isn’t done with the “branch” command. It’s “git checkout -b”. That’s simple enough though. The branch command can create a branch but won’t switch to it, and it isn’t for switching branches at all; it’s mostly used for listing branches and deleting old ones. The checkout command is used for creating branches and switching between them.
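
A minimal sketch of that division of labor, run in a throwaway repository. The temp directory, the branch names feature-x and feature-y, and the committer identity are placeholders made up for illustration.

```shell
#!/bin/sh
# Sketch only: which command creates, switches, lists, and deletes branches.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the initial branch name to "master"
git config user.name "Example"            # placeholder identity, local to this repo
git config user.email "example@example.invalid"
git commit -q --allow-empty -m "initial commit"

git checkout -q -b feature-x    # create a branch AND switch to it
git branch feature-y            # create a branch, but stay where we are
current=$(git symbolic-ref --short HEAD)
echo "still on: $current"

git checkout -q master          # switch branches
git branch                      # list branches; '*' marks the current one
git branch -d feature-y         # delete a branch that has nothing unmerged
```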

When it comes time to merge though, git’s default behavior (whenever trunk hasn’t changed since the branch was created) is to not merge at all. Instead, it fast-forwards: it pretends the branch never happened, and makes it look like all the commits happened on trunk (er, master). … and even though it’s the default, it’s a behavior I literally never want. If I want to pretend history was linear, I’ll use the rebase command. (As an aside: after merging, if I delete the old branch to get it out of the list of active branches, there is no record that the branch ever happened, even if there was an actual revision for the merge instead of a fast-forward.)

So I set a global config option to make “merge” always use the “--no-ff” option. This tells it to do something sane by default instead of fast-forwarding to make it look like history was linear.
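
A rough sketch of that setting, assuming the config key merge.ff is what is meant here. It is set per-repository in this example so it can run in a scratch directory; adding --global would apply it everywhere. The repository, branch name, and identity are made up for illustration.

```shell
#!/bin/sh
# Sketch only: with merge.ff=false, a merge that could fast-forward
# still produces a real merge commit.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"            # placeholder identity
git config user.email "example@example.invalid"
git commit -q --allow-empty -m "initial commit"

git config merge.ff false                 # same effect as --no-ff on every merge

git checkout -q -b feature-x
git commit -q --allow-empty -m "work on feature-x"
git checkout -q master
git merge -q -m "Merge feature-x" feature-x

# Count merge commits reachable from master.
merges=$(git rev-list --merges --count HEAD)
echo "merge commits on master: $merges"
```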

But then “git pull” breaks. Oops. Because “pull” is just an alias for “fetch” followed by “merge”, and “merge” has been told not to fast-forward. There is no first-class concept of updating the current branch to match its upstream counterpart; it’s implemented as two separate steps which don’t necessarily have quite the same meaning.

So git eventually implemented a workaround for that. And I put it in my global git config, to make it do something sane without extra options. I set pull to use “--ff” and “--ff-only”. So now it works again.
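
Roughly what that combination looks like, assuming the config key involved is pull.ff with the value "only" (the equivalent of passing --ff-only to every pull). The two scratch repositories, "upstream" and "work", are hypothetical names for this sketch.

```shell
#!/bin/sh
# Sketch only: merges never fast-forward, but pulls only fast-forward.
set -eu
base=$(mktemp -d)
git init -q "$base/upstream"
cd "$base/upstream"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"            # placeholder identity
git config user.email "example@example.invalid"
git commit -q --allow-empty -m "initial commit"

git clone -q "$base/upstream" "$base/work"

# Upstream moves ahead after the clone was made.
git commit -q --allow-empty -m "upstream change"

cd "$base/work"
git config merge.ff false     # "merge" always creates a merge commit...
git config pull.ff only       # ...but "pull" overrides that and only fast-forwards
git pull -q

merges=$(git rev-list --merges --count HEAD)
commits=$(git rev-list --count HEAD)
echo "commits: $commits, merge commits: $merges"
```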

I’ve tried to override some other defaults too, like setting --no-commit by default during merge, because I want to make sure the tests pass before committing any merges. Merge, test, commit. But I haven’t found a way to make it do that yet. It tries really hard to enforce “merge, commit, test, fix, commit” instead of “merge, test, fix, commit”… and this tends to put broken revisions on the mainline, which is a big no-no.
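
The “merge, test, commit” order can at least be done by hand, passing the flags explicitly each time rather than through a default. A sketch, where run_tests is a hypothetical stand-in for a real test suite and the file and branch names are invented for the example:

```shell
#!/bin/sh
# Sketch only: stage the merge, run tests, and commit only if they pass.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"            # placeholder identity
git config user.email "example@example.invalid"
printf 'one\n' > a.txt
git add a.txt
git commit -q -m "initial commit"

git checkout -q -b feature-x
printf 'two\n' > b.txt
git add b.txt
git commit -q -m "add b.txt"
git checkout -q master

# Hypothetical test suite: here it just checks that both files exist.
run_tests() { test -f a.txt && test -f b.txt; }

# Merge into the working tree and index, but stop before committing.
git merge -q --no-ff --no-commit feature-x

if run_tests; then
    git commit -q -m "Merge feature-x (tested)"
else
    git merge --abort                     # leave master untouched
fi

merges=$(git rev-list --merges --count HEAD)
echo "merge commits on master: $merges"
```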

Oh, and git has no concept of a mainline. So it can’t really tell which revisions were stable, well-tested parts of the trunk, and which were sloppy dev branch versions which are likely to have problems. This breaks the bisection tool, and makes it harder to read the history. In part because of this, it has become the cultural norm in git circles to make sure no one ever commits any broken revisions… even in dev branches. People are expected to do their development and then rewrite history afterward to make sure each individual step works correctly. It creates extra work which shouldn’t be necessary.
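
Git does not store a mainline as such, but when merges into master are never fast-forwarded, following only the first parent of each commit gives a rough stand-in for one. A sketch of that, using the --first-parent option of log and rev-list; the branch name and commit messages are placeholders:

```shell
#!/bin/sh
# Sketch only: first-parent history as an approximation of a mainline.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"            # placeholder identity
git config user.email "example@example.invalid"
git commit -q --allow-empty -m "initial commit"

git checkout -q -b feature-x
git commit -q --allow-empty -m "wip: half-finished"
git commit -q --allow-empty -m "wip: now it works"
git checkout -q master
git merge -q --no-ff -m "Merge feature-x" feature-x

# Only the commits made directly on master, skipping the branch's internals.
git --no-pager log --first-parent --format='%s' master

mainline=$(git rev-list --first-parent --count master)
everything=$(git rev-list --count master)
echo "mainline: $mainline, all commits: $everything"
```

This only holds as long as nothing is fast-forwarded onto master, so it is a convention layered on top rather than something git enforces.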

Anyway, there’s still the problem of git wanting to keep all the branches in the same directory in the filesystem. This is completely incompatible with my workflow. So I tried making copies for each branch with “git clone”. And then do work in the clones, doing things as normal… but then when it comes time to merge, I discover, oops, those clones don’t count as different branches. Different copies of a branch are treated as being still the same branch. But that’s not too difficult to work around. Instead of just doing a clone, do a clone followed by a “checkout -b” with the same name. Work in clone X, in branch X. Then when it’s time to merge, don’t go back into the original copy… merge clone X branch X into clone X branch master. And then the original copy can fetch updates from the clone and update its head pointer. Kind of awkward.
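
The clone-per-branch workaround, sketched end to end. The directory names "main" and "feature-x" are placeholders, and the final step uses a pull from the clone's path as one way for the original copy to pick up the result.

```shell
#!/bin/sh
# Sketch only: one clone per branch, merged inside the clone.
set -eu
base=$(mktemp -d)
git init -q "$base/main"
cd "$base/main"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"            # placeholder identity
git config user.email "example@example.invalid"
git commit -q --allow-empty -m "initial commit"

# Clone X, then create branch X inside it.
git clone -q "$base/main" "$base/feature-x"
cd "$base/feature-x"
git config user.name "Example"
git config user.email "example@example.invalid"
git checkout -q -b feature-x
git commit -q --allow-empty -m "work on feature-x"

# Merge clone X branch X into clone X branch master.
git checkout -q master
git merge -q --no-ff -m "Merge feature-x" feature-x

# The original copy fetches the result and moves its head forward.
cd "$base/main"
git pull -q --ff-only "$base/feature-x" master

merges=$(git rev-list --merges --count master)
echo "merge commits in original copy: $merges"
```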

It’s a bit wasteful having the entire repository copied each time, but that’s okay. Normally I avoid this in bzr by doing “bzr init-repo” in the parent directory, so each branch effectively only has a new working tree without having to duplicate the history data. One parent dir with the metadata, many subdirs where each one is a branch. Pretty simple, straightforward, convenient, and reasonably disk-efficient.

Git finally added something similar, using the “worktree” command. I’ve only just discovered this though, and haven’t had a chance to see how well it works in practice.

The “worktree” feature wasn’t added until ten years after Git was first released. It should have been the default behavior, yet wasn’t available for an entire decade. And from what I’ve seen so far, it’s designed as sort of an afterthought so it’s still a bit awkward to use.

For example, it appears to not use a shared parent directory… instead, one sibling is the primary, and other siblings are kinda just linked back to the primary. Technically, they don’t even have to be siblings; the other working trees can be anywhere on the filesystem. If I understand correctly, the primary needs to know about all the secondaries, and the secondaries each need a link back to the primary. It also appears that one cannot create a branch of a branch this way; each one must be branched off the primary.
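
A small sketch of the worktree layout, with sibling directories named after their branches. The names are invented for the example. One detail worth checking against the above: "git worktree add" takes an optional starting point after the path, so the second tree here starts from feature-x rather than from the primary's branch.

```shell
#!/bin/sh
# Sketch only: one primary checkout plus linked working trees as siblings.
set -eu
base=$(mktemp -d)
git init -q "$base/primary"
cd "$base/primary"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"            # placeholder identity
git config user.email "example@example.invalid"
git commit -q --allow-empty -m "initial commit"

# New working tree in a sibling directory, on a new branch of the same name.
git worktree add -b feature-x "$base/feature-x"

# Work in that directory as in any other checkout.
cd "$base/feature-x"
git commit -q --allow-empty -m "work on feature-x"

# Another tree, this time starting from feature-x instead of master.
git worktree add -b feature-y "$base/feature-y" feature-x

git worktree list
trees=$(git worktree list --porcelain | grep -c '^worktree ')
echo "working trees: $trees"
```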

Regardless, it seems a lot less awkward than working in a bunch of completely independent clones. It’s just not as coherent or as well integrated as the default branching behavior in bzr. So I’m quite disappointed to see bzr being left to die, because I find it to be a better-designed DVCS tool.

Linus does good work with the kernel… but he really, really should have consulted some user interface designers and VCS / SCM experts during the early phases of creating git.