
CVu Journal Vol 32, #3 - July 2020 + Process Topics

Title: The Trouble with GitHub Forks

Author: Bob Schmidt

Date: Mon, 06 July 2020 17:05:47 +01:00

Summary: Silas S. Brown describes a problem with stale copies.

Body: 

Originally, ‘forking’ a project meant taking a copy of that project for independent development (think OpenOffice versus LibreOffice). Many developers saw it as a last resort, since, although it is possible for new features and fixes to be applied to both versions, this takes extra effort (especially as they diverge), and frequently one version ends up lagging behind. Nevertheless, the ability to fork is an important test of a project’s freedom.

GitHub provides a one-click ‘fork’ option on their Web interface: you can be looking at somebody else’s project, and click to ‘fork’ it, copying it into your own GitHub account in its current state. If you want to propose a change to a public project for which you don’t have commit rights, the standard method is to create a ‘fork’, change your fork, and then create a ‘pull request’ which is delivered to the upstream maintainers for their consideration. Like the creation of forks, the creation and review of pull requests can be done via the Web interface; so far so good.

The problem is what happens to a fork after the pull request is done, or if someone decides they don’t want to make changes after all. I’ll write this from the viewpoint of the upstream maintainer. I publish a project on GitHub, and somebody forks it. Fine. Then I find and fix a bug. Now, their fork still has the bug which I have fixed upstream. They don’t bother to update, and their version is visible to the public. Now, I don’t mind my embarrassing mistakes being preserved for historians, but it should be made clear that this is an old version and has been fixed. Although the ‘desktop’ version of GitHub’s Web interface does say that their project is forked from mine, and is N commits behind, the mobile version does not make this very clear at all, and anyone landing on their account instead of mine might think their outdated version is the latest when it’s not.

GitHub makes it easy to click to create a fork, but there’s nowhere to click to bring that fork up to date with upstream, even when there are no conflicts. You must do it from the command line (Listing 1), and most people who keep forks in their accounts don’t seem to know how.

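# One-off setup: clone your fork and register the original as 'upstream'
git clone git@github.com:ME/REPO.git
cd REPO
git remote add upstream https://github.com/AUTHOR/REPO.git
...
# Whenever upstream changes: fetch it, rebase onto it, publish to your fork
git fetch upstream
git rebase upstream/master
git push
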
Listing 1
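
If you want to try this without touching GitHub, the sketch below simulates it locally. It is only an illustration under stated assumptions: two throwaway directories stand in for the upstream repository and the stale fork, the names (upstream, fork, file.txt) are made up, and it uses 'git merge --ff-only' rather than the rebase in Listing 1, because a fast-forward simply refuses to proceed if the fork has commits of its own. The count printed by 'git rev-list' is the same kind of 'N commits behind' figure that the desktop site shows.

```shell
#!/bin/sh
# Local simulation: plain directories stand in for GitHub repositories.
# All names here (upstream, fork, file.txt) are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"

# 'upstream' plays the original author's repository.
git init -q upstream
cd upstream
git config user.email "author@example.com"
git config user.name "Author"
echo "v1" > file.txt
git add file.txt
git commit -q -m "Initial version (with bug)"
cd ..

# 'fork' plays the copy sitting in somebody else's account.
git clone -q upstream fork

# The author fixes the bug after the fork was taken.
cd upstream
echo "v2" > file.txt
git commit -q -am "Fix bug"

# In the fork: register the original as 'upstream' and fetch it.
cd "$work/fork"
git remote add upstream "$work/upstream"
git fetch -q upstream

# Use whatever the default branch is called (master, main, ...).
branch=$(git rev-parse --abbrev-ref HEAD)

# Count the commits upstream has that the fork lacks.
behind=$(git rev-list --count "HEAD..upstream/$branch")
echo "behind before update: $behind"

# Fast-forward only: fails rather than merging if the fork has diverged.
git merge -q --ff-only "upstream/$branch"

behind_after=$(git rev-list --count "HEAD..upstream/$branch")
echo "behind after update: $behind_after"
```

In a real fork you would follow the fast-forward with 'git push', as in Listing 1, so that the copy in your GitHub account catches up as well.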

Moreover, I as the upstream maintainer have absolutely no way of sending a courtesy message to the people who’ve forked me to let them know that a fix is available, unless I sleuth out their email and hope to make it past their spam filters. I cannot raise an issue against their fork (since most forks have issues disabled), and so far there is no such thing as a ‘reverse pull request’ from the upstream project to the fork. When they copy my code into their own (non-fork) projects, it might be harder for me to find them, but when I do, at least I stand a chance of getting a message through like ‘please update to latest upstream version, here, have a pull request’. But when they fork me, I can do nothing except sit here and hope they notice the tiny message (only on the desktop version of GitHub) telling them how many commits their fork has fallen behind.

I myself have forked other people’s projects, but I generally delete the fork from my account when it is no longer necessary (e.g. my pull request has been accepted and I’m not planning on making another soon). I don’t use forking as a protection against the rare event of an original repository being deleted by its maintainer, since an easier way to do that is simply to clone it on your local machine and type ‘git pull’ every so often to keep up-to-date (and then consider republishing only if they actually do delete it; no need to clutter up your own account with ‘just in case’ copies). But I do think the whole idea of GitHub forking would work much better if they made it easier to keep forks up-to-date, perhaps even providing an option to do so automatically (which people can disable if they know what they’re doing).

Silas S. Brown

Silas is a partially-sighted Computer Science post-doc in Cambridge who currently works in part-time assistant tuition and part-time for Oracle. He has been an ACCU member since 1994.
