Setting up Jitsi Meet


With the ongoing Corona pandemic, video conferencing is one of the means to stay in contact. Since some offerings have a dubious privacy or security record, an open source solution that you can self-host is a good thing to have. Since Snowden's revelations, anybody assuming that centralized applications are not monitored is naive at best.

An example of an application with a dubious security history is Zoom: to my knowledge they're the only third-party software vendor whose software was removed by the operating system vendor in a security update because of their lax security. They had installed a component that allowed any malicious website to enable your camera and spy on you. And they didn't learn much from it: later they were caught sending user data to Facebook even for users without a Facebook account. With that security and privacy record one cannot advocate the use of that application.

Audio or video conferencing with more than two participants usually involves unencrypted (clear) voice or video at the server. Even if you connect via https and your audio and video streams are encrypted in transit, they are decrypted at the server and re-encrypted towards the other participants of the conference. The reason is that doing otherwise each participant would have to send streams encrypted for every other participant: a simple implementation would need n·(n−1) streams for n participants, i.e. the number of streams grows quadratically. A more sophisticated implementation would encrypt a single stream per participant, but that makes joining and leaving a conference hard and is not supported by the usual secure protocols used for audio and video transport (keys would have to be exchanged over a separate channel). This is, technically, why most video conferencing applications have cleartext audio and video on the server, and why it is easy for the server operator to monitor everything. Jitsi-Meet is no different: we have cleartext on the server. The good news is that you can host the server yourself.

On the Jitsi-Meet web page you can find instructions for adding the Jitsi-Meet package repository to a Debian or Ubuntu based Linux installation. With that in place you can install Jitsi-Meet with the usual apt-get install jitsi-meet.
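The steps boil down to roughly the following (a sketch; take the exact repository line and key URL from the current official instructions):

echo 'deb https://download.jitsi.org stable/' | sudo tee /etc/apt/sources.list.d/jitsi-stable.list
wget -qO - https://download.jitsi.org/jitsi-key.gpg.key | sudo apt-key add -
sudo apt-get update
sudo apt-get install jitsi-meet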

Once that is done, the resulting Jitsi-Meet installation allows anyone to create new conferences. For most installations this is not what you want. There are instructions on the Jitsi-Meet GitHub pages to allow only moderators to create new conferences.

Note that the guest domain, guest.jitsi-meet.example.com in the example, need not be in DNS; it is only used internally for all non-authenticated users.
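For reference, the changes from those instructions boil down to roughly the following (a sketch using the example domains above; file names and details may differ between Jitsi-Meet versions). In /etc/prosody/conf.avail/jitsi-meet.example.com.cfg.lua the main virtual host gets real authentication and a guest host is added:

VirtualHost "jitsi-meet.example.com"
    authentication = "internal_hashed"

VirtualHost "guest.jitsi-meet.example.com"
    authentication = "anonymous"
    c2s_require_encryption = false

Then anonymousdomain: 'guest.jitsi-meet.example.com' is set in /etc/jitsi/meet/jitsi-meet.example.com-config.js, org.jitsi.jicofo.auth.URL=XMPP:jitsi-meet.example.com is added to /etc/jitsi/jicofo/sip-communicator.properties, and moderator accounts are created with prosodyctl register <user> jitsi-meet.example.com <password>.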

With the resulting server you can host your own video conferences. There is, however, a problem with the Firefox browser interacting badly with the Jitsi-Meet implementation; the details are documented in a Jitsi-Meet bug-tracker ticket. The effect is that audio and video become flaky, not just for the Firefox users but for all participants in the conference as soon as a single Firefox user is present. For this reason it's a good idea to keep Firefox browsers out of conferences until this issue is fixed. To do this, edit the file /usr/share/jitsi-meet/interface_config.js in the Jitsi-Meet installation. There are two config items: OPTIMAL_BROWSERS, which includes firefox by default, and UNSUPPORTED_BROWSERS, which is empty by default. To exclude Firefox, move the firefox entry from OPTIMAL_BROWSERS to UNSUPPORTED_BROWSERS.
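After the change the relevant part of /usr/share/jitsi-meet/interface_config.js looks roughly like this (a sketch; the exact default browser list differs between versions):

OPTIMAL_BROWSERS: [ 'chrome', 'chromium', 'electron', 'safari' ],
UNSUPPORTED_BROWSERS: [ 'firefox' ],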

With this setup I now have a running conference server and don't have to trust dubious online offerings with doubtful security and privacy practices.

No sound after upgrade to Debian Buster


I recently upgraded my desktop to Debian Buster. I don't use sound very often and only discovered after some time that sound output did not work at all.

Symptoms: The tool pavucontrol just showed a dummy device.

pacmd list-cards

Just output:

0 card(s) available.

speaker-test did run (and displayed the typical 'Front Left', 'Front Right' messages) but did not output any sound.

In alsamixer I could successfully find my Intel sound card and change its settings, so the kernel seemed to know about the device. But everything else in the system refused to cooperate. Turning to a web search, I found the following on askubuntu:

https://askubuntu.com/questions/1085174/no-sound-after-upgrade-to-18-10-only-a-dummy-device-is-shown

This suggests removing timidity-daemon. And after a quick:

% sudo dpkg --purge timidity-daemon
(Reading database ... 541273 files and directories currently installed.)
Removing timidity-daemon (2.14.0-8) ...
Purging configuration files for timidity-daemon (2.14.0-8) ...
Processing triggers for systemd (241-7~deb10u3) ...

Everything started to work immediately: the still-running pavucontrol recognized a new device and I could play sound as before.
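Presumably timidity-daemon was the culprit because it keeps a TiMidity++ process running that holds the ALSA device, so PulseAudio cannot claim the card; that is my reading of the askubuntu answers, not something I verified afterwards. A quick way to check which processes are holding the sound devices before purging packages would have been:

% fuser -v /dev/snd/*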

In search of a general transmission line simulator


I'm currently in search of a good formula for the impedance of a transmission line with an almost arbitrary cross-section. One project is a directional coupler, another the simulation of a Log-Periodic antenna. When trying to model a two-band Log-Periodic antenna [1] originally pioneered by Günter Lindemann, DL9HCG [2] (sk), the model did not fit reality too well: the VSWR was higher than reported by people who have built this antenna. The original antenna has two square booms, but since NEC2 only supports round wires, the antenna was modelled with round wires.

For re-modelling the antenna with transmission lines (NEC2 supports those), I was searching for the impedance of the two square booms acting as a transmission line. I (re-)discovered the transmission line calculator by Hartwig Harm, DH2MIC [3], via his article in the German magazine Funkamateur [4]. But his model does not (yet?) include parameters for two square wires. Harm uses atlc2 for estimating the parameters of his model. The software atlc2 is a reimplementation of Dave Kirkby's arbitrary transmission line calculator atlc (which is available as source code and shipped with some Linux distributions) [5], but at least for round conductors I'm getting errors of several percent when trying to model the case of one round conductor against a wall, as also reported by Harm [3]. Since atlc does not support conductors in free space, we need to simulate walls at a great distance when modelling such conductors.
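For round conductors there are textbook closed-form results that can serve as a sanity check for such simulations (free space, εr = 1; D is the centre-to-centre spacing, d the wire diameter, h the height of the wire centre above the wall):

Z0 ≈ 120 Ω · arccosh(D/d) ≈ 276 Ω · log10(2D/d)   (two parallel round wires; the approximation holds for D ≫ d)
Z0 ≈ 60 Ω · arccosh(2h/d)                          (one round wire above a conducting wall, half the above by image theory)

These cover only round cross-sections, not the square booms in question, but they give a baseline against which atlc results can be checked.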

So, in search of a formula for this, I discovered Owen Duffy's work [6] (via a re-implementation of his calculator by Serge Y. Stroobandt, ON4AA [7], who acknowledges Duffy). He also uses atlc [5] to compute the parameters of a model. When fitting the values of Duffy's calculator to Harm's model, I get a K-factor of 1.65, but the first two values don't agree (the first value, for d = 10 and D = 15, i.e. D/d = 1.5, is off by 6.4%; for D = 20 it's still 1.5%). Since Duffy states that "figures below about 100 Ω are likely to be underestimates" [6], I trust Harm's model more for those low values, but I haven't measured this, and I don't understand Duffy's argument about the proximity effect since the model of atlc is size-independent (it just uses D/d, expressed via a pixel drawing of the model). But since we can't fully trust the atlc model anyway, I'm fine with the achieved accuracy.

I've not yet plugged the estimated impedance of the two square booms into an antenna model – but it seems the impedance is much higher than the 50 Ω of the real antenna.

Network goes down shortly after reboot with Debian Buster


I recently upgraded an existing Debian based virtual machine (running under KVM) to Debian Buster (the latest stable Debian Release as of this writing).

After the reboot everything seemed fine, but a short time later (about 10 minutes) the machine was no longer reachable. After another reboot the same thing happened again.

I then turned on VNC (the machine hosting that VM is on a hosted infrastructure and only reachable via network, so I had to forward a VNC port to my local machine for testing) and discovered that the machine was running fine – just without a network connection. The interface was up but did not have an IPv4 address.

Investigating further, I discovered that the machine was coming up with a wrong time (one hour ahead of the current time, i.e. the clock was in the future), and the time was set back one hour by ntp once it started up. From the log:

Jan 22 19:19:35 tux4 dhclient[351]: DHCPOFFER of 10.33.33.4 from 10.33.33.254
...
Jan 22 19:19:36 tux4 ntpd[391]: ntpd 4.2.8p12@1.3728-o (1): Starting

Jan 22 19:19:36 tux4 ntpd[420]: kernel reports TIME_ERROR: 0x41: Clock Unsynchronized
Jan 22 18:19:47 tux4 ntpd[420]: receive: Unexpected origin timestamp 0xe1d310c2.72d51f2c does not match aorg 0000000000.00000000 from server@XX.XXX.XXX.XXX xmt 0xe1d302b3.0dae523c

Notice how the time changed to one hour earlier in the last log entry.

So, my next move was to check the dhcp leases file in /var/lib/dhcp/dhclient.eth0.leases and I found:

renew 3 2020/01/22 18:39:54;
rebind 3 2020/01/22 18:44:51;
expire 3 2020/01/22 18:46:06;

So the lease had expired. Obviously the ISC dhcp client was able to remove the IP address from the interface when the lease expired, but it did not renew the lease in time (probably this would have happened one hour later, when the clock again reached the point from which it had been set back). It looks like the ISC dhcp client uses two different mechanisms for timing these events: lease expiry is timed with a mechanism that still works when the clock is set back (see the leases file, where the time is the correct one, not one hour in the future), while renewal is not. I haven't looked into the code to figure out why this is so.

So what was the root cause of the wrong time? It turns out that I had always configured this virtual machine to run with its clock in local time (CET is one hour ahead of UTC), so the time at boot must always have been wrong. But the problem above only appeared with Debian Buster.

Now I'm starting the machine with base UTC and everything works again as expected. I'm directly using KVM so the correct option turned out to be:

-rtc base=utc
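(If the VM were managed through libvirt instead of a direct KVM invocation, the equivalent setting would be the clock element in the domain XML, i.e. <clock offset='utc'/>; I'm not using libvirt here, so this is just for reference.)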

Second SPI chipselect for Orange-Pi Zero


I recently tried to get a TPM module from Infineon, made for the Raspberry-Pi, working on the Orange-Pi Zero. The chipselect of the TPM module is hard-wired to chipselect 1 (via a 0 Ω resistor that could be soldered to another location to use chipselect 0; the board has the necessary unpopulated solder pads).

The Orange-Pi uses the same connector as the Raspberry-Pi but uses SPI-1 (instead of SPI-0 on the Raspi) on pins 19 (MOSI), 21 (MISO), 23 (CLK), and 24 (CS-0, the hardware-supported chipselect). Chipselect CS-1, which is a native chipselect on the Raspi, is a normal GPIO (PA10) on the Orange-Pi Zero. The SPI driver for the Allwinner H2 processor in the Linux kernel is supposed to support normal GPIOs as additional chipselects, but this didn't work in my case.

So I first tried to find out whether something in my Device-Tree configuration was wrong or whether I needed a patch to fix the SPI initialization. This search is documented in a discussion thread in the Armbian Forum. It turned out that my Device-Tree configuration was OK; the following is the Device-Tree overlay I'm currently using to enable GPIO PA10 on the Orange-Pi Zero as the second chipselect CS1:

// Enable CS1 on SPI1 for the orange pi zero
/dts-v1/;
/plugin/;

/ {
    fragment@0 {
        target = <&pio>;
        __overlay__ {
            spi1_cs1: spi1_cs1 {
                pins = "PA10";
                function = "gpio_out";
                output-high;
            };
        };
    };

    fragment@1 {
        target = <&spi1>;
        __overlay__ {
            status = "okay";
            pinctrl-names = "default", "default";
            pinctrl-1 = <&spi1_cs1>;
            /* PA10 GPIO_ACTIVE_HIGH (last 0) */
            cs-gpios = <0>, <&pio 0 10 0>;
        };
    };
};
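On Armbian (as used in the forum thread) such a user overlay can be compiled and activated with the armbian-add-overlay helper; a manual dtc run works as well. Both lines below are a sketch, the file name is of course your own:

% sudo armbian-add-overlay spi1-cs1.dts
% dtc -@ -I dts -O dtb -o spi1-cs1.dtbo spi1-cs1.dts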

But the clock line of the SPI bus is set high before the SPI bus transfer starts. This is shown in the following oscilloscope screen dump for the builtin chipselect CS0:

[Image: /wp-content/uploads/2019/native-cs0.bmp (oscilloscope capture, native chipselect CS0)]

The clock line (blue) goes high before the transfer and then low when the chipselect (yellow) goes low (the chipselect is active low; the data on MISO/MOSI, not shown in the oscilloscope picture, is sampled when the clock line goes high). This may confuse some devices but works in most cases.

The picture looks different when we use the second chipselect CS1. In that case the clock (blue) also goes high before the transfer, but it does not go low when the chipselect (yellow) becomes active (low). This is a problem because a high clock line tells the SPI slave to sample the bus at that point. So all bits are right-shifted by one bit and the SPI slave only receives garbage.

[Image: /wp-content/uploads/2019/additional_cs1.bmp (oscilloscope capture, additional chipselect CS1 before the patch)]

When I saw this, I remembered a patch at the end of the SUNXI project spidev page which fixes the problem of the clock line going high before the transfer (see the end of that page under the title "HIGH on SCK line right before transfer"). So I applied the patch.

The next screen dump from the oscilloscope, again showing CS1 in yellow and the clock in blue, shows that the clock now does not go high before the transfer starts. I haven't made a picture of this with CS0 (the default chipselect) because CS0 stays low all the time: it is low before SPI is initialized and then stays low when the transfer starts. It also no longer shows the bump in the clock line seen in the very first picture before the patch.

[Image: /wp-content/uploads/2019/working_cs1.bmp (oscilloscope capture, chipselect CS1 after the patch)]

I intend to convince the original author of the patch on the SUNXI project spidev page to submit it to the Linux kernel, because without it a second chipselect won't work. Failing that (so far I don't have an answer), I'll try to get the patch accepted myself.

Moved blog to static website generator


I've finally moved my blog to a static site generator. The criteria for selection were:

  • Support for ReStructuredText as the markup language

  • A responsive design

And since I previously had Wordpress (WP), one of the benefits of the chosen software, the static website generator Nikola, which I'm using now, is that it can import WP content from the WP XML backup format (in the latest 1.2 version).

The WP importer can convert to HTML or MD (Markdown); I found the latter produced too many artefacts and chose the HTML format. I had to change the size of one picture and fix the captions of some pictures that had a caption in WP (it seems the converter did not recognize these). Otherwise the conversion was fine.

Converting the comments, however, was more work: The converter produces .wpcomment files, one for each comment. They need the static_comments plugin for Nikola.

The generated files need a specification of the compiler (which produces static HTML for the comment) inside the comment file, so I had to fix this line in all of them. I chose ReStructuredText; the directive is:

.. compiler: rest

which, of course, requires editing the comments wherever they use HTML entities.

In addition, the static_comments plugin needs some localization strings that are used in the template. It mentions that these have to be added but fails to mention how and where. I've opened a GitHub issue for this.

In my old WP blog, WP marked links in blog comments, both the URL of the commenter and links in the comment text, as rel=external nofollow. I wanted to retain this for the comments and changed the Jinja template that comes with static_comments as indicated in a second GitHub issue. This works when a blog-comment author has specified an author URL. So far I have found no way of specifying that a rendered link generated by ReStructuredText should contain rel=nofollow, so currently all comments that contain a link inside the comment text will have search engines follow those links (I've verified this causes no harm in my case). For people converting from WP with more posts than my blog has, conversion of comments will potentially be a lot of work.

I hope to be able to blog more now that the awkward markup language of WP will no longer be an excuse :-)

Peer Production License


Recently, discussions about new licensing models for open cooperative production have come up (again). This discussion resurrects the “Peer Production License” proposed in 2010 by John Magyar and Dmytri Kleiner [1], which is also available on the p2pfoundation website [2], although it’s not clear if the latter is a modified version. The license is put forward by Michel Bauwens and Vasilis Kostakis, accompanied by a theoretical discussion [3] of why such a license would enhance the current state of the art in licensing. The proposal has already sparked criticism in the form of critical replies, which I will cite in the following where appropriate.

The theoretical argument (mostly based on Marxist theories I don’t have the patience to dig into) boils down to differentiating “good” from “bad” users of the licensed product. A “good” user is a “workerowned business” or “workerowned collective” [2] while a “bad” user seems to be a corporation. Note that the theoretical discussion seems to allow corporate users who contribute, “as IBM does with Linux. However, those who do not contribute should pay a license fee” [3] (p.358). I haven’t found a clause in the license that defines this contribution exception. Instead the license makes clear that “you may not exercise any of the rights granted to You in Section 3 above in any manner that is primarily intended for or directed toward commercial advantage or private monetary compensation”. Finally it is made clear “for the avoidance of doubt” that “the Licensor reserves the exclusive right to collect such royalties for any exercise by You of the rights granted under this License” [2].

With the clauses cited above, the “new” license is very similar to a Creative Commons license with a non-commercial clause, as others have already noted [4] (p.363). Although the missing clause in the license granting a contribution exception for those who are not workerowned collectives or businesses is probably only an oversight — Rigi [5] (p.396) also understands the license this way — this is not the major shortcoming.

For me the main point is: who is going to be the institution that distinguishes “good” from “bad” users of the product, those who have to pay and those who don’t? The license mentions a “collecting society” for this purpose. Whatever this institution is going to be, it might be a “benevolent dictator” [6] at the start but will soon deteriorate into a real dictatorship. Why? As the software-focused “benevolent dictator for life” Wikipedia article [7] notes, the dictator has an incentive to stay benevolent due to the possibility of forking a project; this was first documented in ESR’s “Homesteading the Noosphere” [8]. Now since our dictator is the “Licensor [who] reserves the exclusive right to collect such royalties” [2], there are other, monetary, incentives for forking a project, which have to be prevented by other means covered in the license. Collection of royalties is incompatible with the right to fork a project. We have an “owner” who decides about “good” vs. “bad” and uses a license to stay in power. A recipe for disaster or, as a friend put it in a recent discussion, a “design for corruption” [9].

Another problem is the management of contributions. As Meretz has already pointed out, “only people can behave in a reciprocal way” [4] (p.363). Contributors are people. They may belong to an institution that is deemed “good” by the dictator at one point and may later change to an institution that is deemed “bad”. So a person may be excluded from using the product just because they belong to a “bad” institution. Take myself as an example: I’ve been running an open source business for ten years, “primarily intended for or directed toward commercial advantage or private monetary compensation” [2]. I guess I wouldn’t qualify for free use of a “peer production license” licensed product. One of the reasons for the success of open source / free software like Linux was that employees could use it for solving their day-to-day problems. This use often resulted in contributions, but only after using it for some time.

Which leads to the next problem: the license tries to force “good” behaviour, but you must first prove to be “good” by contributing before you’re eligible to use the product. As noted by Rigi, the “GPL stipulated reciprocity does not fit into any of these forms [usually known to economists (my interpretation)]” [5] (p.398) because a contributor always gets more (e.g. a working software package) than their own contribution. This is exactly one reason, if not the main reason, people are motivated to contribute. Openness creates more ethical behaviour than a license that tries to force ethics. Force or control will destroy that motivation, as exemplified in the Linus vs. Tanenbaum discussion where Tanenbaum stated:

If Linus wants to keep control of the official version, and a group of eager beavers want to go off in a different direction, the same problem arises. I don’t think the copyright issue is really the problem. The problem is co-ordinating things. Projects like GNU, MINIX, or LINUX only hold together if one person is in charge. During the 1970s, when structured programming was introduced, Harlan Mills pointed out that the programming team should be organized like a surgical team–one surgeon and his or her assistants, not like a hog butchering team–give everybody an axe and let them chop away.
Anyone who says you can have a lot of widely dispersed people hack away on a complicated piece of code and avoid total anarchy has never managed a software project. [10] (Post 1992-02-05 23:23:26 GMT)

To which Linus replied:

This is the second time I’ve seen this “accusation” from ast, who feels pretty good about commenting on a kernel he probably haven’t even seen. Or at least he hasn’t asked me, or even read alt.os.linux about this. Just so that nobody takes his guess for the full thruth, here’s my standing on “keeping control”, in 2 words (three?):
I won’t.
[10] (Post 1992-02-06 10:33:31 GMT)

and then goes on to explain how kernel maintenance works (at the time).

What becomes clear from this discussion is that the main focus of choosing a license is to attract contributors — preventing others from appropriating a version or influencing derived works is only secondary. Many successful open source projects use licenses that are more permissive than the GNU General Public License (GPL) Version 2 [11], and the new Version 3 of the GPL [12], which is more restrictive, sees less use. The programming language Python is a prominent example of a successful project using a more permissive license [13]. Armin Ronacher documents in a blog post [14] that there is a trend away from the GPL to less restrictive licenses. This is also confirmed statistically by other sources [15].

One reason for this trend is the growing mess of incompatible licenses. One of the ideas of open source / free software is that it should be possible to reuse existing components in order not to reinvent the wheel. This is increasingly difficult due to incompatible licenses; Ronacher’s essay touches only the tip of the iceberg [14]. License incompatibility has already been used to release software under an open source license while still not allowing Linux developers to incorporate the released software into Linux [14].

Given the reuse argument, adding another incompatible license to the mix (the proposed Peer Production License is incompatible with the GPL and probably other licenses) is simply insane. The new license isn’t even an open source license [16], much less does it fit the free software definition [17], due to the commercial restrictions; both definitions require that the software be free for any purpose.

When leaving the field of software and other artefacts protected by copyright we enter the field of hardware licensing. Hardware, unlike software, is not protected by copyright (with the exception of some artefacts like printed circuit boards, where the printed circuit is directly protected by copyright). So it is possible, for private or research purposes, to reverse-engineer a mechanical part and print it on a 3D printer. If the part is not protected by a patent, it is even legal to publish the reverse-engineered design documents for others to replicate the design. This was shown in a study of UK law by Bradshaw et al. [18] but probably carries over to EU law. Note that the design documents are protected by copyright but the manufactured artefact is not. This has implications for the protection of open source hardware because the finding can be turned around: a company may well produce an open source design without contributing anything back; even manufacturing a modified or improved design that is not given back to the community would probably be possible.

Hardware could be protected with patents, but this is not a road the open source community wants to travel. The current state of hardware licensing seeks to protect users of a design from contributors who might later enforce patents against it, by incorporating clauses under which contributors license the patents they hold to the project. This was pioneered by the TAPR open hardware license [19] and is also reflected in the CERN open hardware license [20].

To sum up: apart from the inconsistencies between the theoretical paper [3] and the actual license [2], I pointed out that such a license is a recipe for corruption when money is involved, due to the restrictions on forking a project. In addition the license would hamper reuse of existing components because it adds to the “license compatibility clusterfuck” [14]. Moreover it won’t protect what it set out to protect: hardware artefacts — with some exceptions — are not covered by copyright and therefore not by a license. We can only protect the design; the production of artefacts from that design is not subject to copyright law.

Last but not least: thanks to Franz Nahrada for inviting me to the debate.

[1] Dmytri Kleiner, The Telekommunist Manifesto. Network Notebooks 03, Institute of Network Cultures, Amsterdam, 2010.
[2] Dmytri Kleiner, Peer Production License, 2010. Copy at p2pfoundation.org (not sure if this is the original license by Kleiner or a modification).
[3] Michel Bauwens and Vasilis Kostakis. From the communism of capital to capital for the commons: Towards an open co-operativism. tripleC: Communication, Capitalism & Critique, Journal for a Global Sustainable Information Society, 12(1):356-361, 2014.
[4] Stefan Meretz. Socialist licenses? A rejoinder to Michel Bauwens and Vasilis Kostakis. tripleC: Communication, Capitalism & Critique, Journal for a Global Sustainable Information Society, 12(1):362-365, 2014.
[5] Jakob Rigi. The coming revolution of peer production and revolutionary cooperatives. A response to Michel Bauwens, Vasilis Kostakis and Stefan Meretz. tripleC: Communication, Capitalism & Critique, Journal for a Global Sustainable Information Society, 12(1):390-404, 2014.
[6] Wikipedia, Benevolent dictatorship, accessed 2014-05-27.
[7] Wikipedia, Benevolent dictator for life, accessed 2014-05-27.
[8] Eric S. Raymond, Homesteading the Noosphere, 1998-2000.
[9] Michael Franz Reinisch, private communication.
[10] Andy Tanenbaum and Linus Benedict Torvalds. LINUX is obsolete, discussion on USENET news, reprinted under the title The Tanenbaum-Torvalds Debate in Open Sources: Voices from the Open Source Revolution, 1999. The discussion was continued under the subject “Unhappy campers”.
[11] GNU General Public License version 2. Software license, Free Software Foundation, 1991.
[12] GNU General Public License version 3. Software license, Free Software Foundation, 2007.
[13] Python Software Foundation. History and License, 2001-2014.
[14] Armin Ronacher, Licensing in a Post Copyright World. Blog entry, July 2013.
[15] Matthew Aslett, On the continuing decline of the GPL. Blog entry, December 2011.
[16] Bruce Perens, The Open Source Definition. Online document, Open Source Initiative, 1997.
[17] Free Software Foundation, The Free Software Definition. Essay, 2001-2010.
[18] Simon Bradshaw, Adrian Bowyer, and Patrick Haufe, The intellectual property implications of low-cost 3D printing. SCRIPTed — A Journal of Law, Technology & Society, 7(1):5-31, April 2010.
[19] John Ackermann, TAPR Open Hardware License version 1.0, Tucson Amateur Packet Radio, May 2007.
[20] Javier Serrano, CERN Open Hardware Licence v1.1, Open Hardware Repository, September 2011.

Migrating to GIT with Reposurgeon


I’ve recently worked a lot with reposurgeon, a tool by Eric S. Raymond for doing surgery on version control data. With this tool it is possible to migrate from almost any version control system to almost any other version control system — although these days the most feature-complete system is GIT, which is the recommended and best-supported target system (and the only one I’ve tested). Reposurgeon comes with a migration guide, the DVCS migration HOWTO.

Beyond just converting data, reposurgeon can be used to clean up artefacts, the simplest of which is to reformat commit comments to conform to established standards on GIT commit messages.

My conversion of the history of the pyst project, a Python library to connect to the different network interfaces of the Asterisk telephony engine, is a good example of what can be done with reposurgeon. The project, originally started by Karl Putland with version control at the time in CVS, was later taken over by Matthew Nicholson, who used Monotone for version control. When I took over maintenance in 2010, I used Subversion. So we had three separate source code repositories in different version control systems. No effort was ever made to convert the version history from one system to another; each new maintainer imported the last release version into the new version control system and continued from there.

Fortunately, Monotone has a GIT export feature and reposurgeon can natively read the other two formats. So I used separate reposurgeon scripts to clean up the three repositories and then used the reposurgeon graft command to unite them into one. The Subversion repo started with release 0.2 but there had been some commits after 0.2 in Monotone (which were later merged in Subversion) so the commits after 0.2 were put on a branch in the new repository. The last step then was to write out the resulting repository in GIT fast-export format and import into a GIT repository.
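The overall shape of such a conversion script is roughly the following; this is only a sketch, the exact read sources, selection sets and graft usage are described in the DVCS migration HOWTO, and the repository names here are made up:

read cvs-checkout/          # pyst history from CVS
read pyst-monotone.fi       # fast-import stream produced by Monotone's git export
read pyst-svn.svn           # dump of the Subversion repository
choose pyst-svn
graft pyst-monotone         # attach the older history (see the HOWTO for the exact form)
prefer git
write >pyst-all.fi          # fast-export stream, to be fed to git fast-import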

What artefacts did I clean up? Let me give two examples, both of which are problems I have when using Subversion.

Subversion doesn’t have the concept of a tag like other version control systems have. Instead tags are emulated by copying the to-be-tagged content to a new location in the repository, effectively creating a new branch; it’s just a naming convention that this is called a tag.

The first example deals with last-minute changes when doing a release: it’s not possible in Subversion to really remove a tag as in other systems. In GIT, when something in the release process doesn’t work (and the release hasn’t been published yet), we can still move the release tag as long as we haven’t pushed our changes to the public repository. This isn’t possible in Subversion. So I frequently end up with a release tag that isn’t just one commit but several, and the changes are merged either from the trunk to the tag-branch or the other way round from the tag-branch to the trunk. An example of a commit on a tag (r22, “fix PACKAGE definition for SF release”) and the subsequent merge back to trunk can be seen in the following illustration, created with the graph command of reposurgeon. The tag, originally Subversion commit r21, has already been tagified by reposurgeon. But the commit on the tag is now on the branch V_0_3.

First example

This can be fixed with reposurgeon with the following commands:


debranch V_0_3 trunk
[/^V_0_3\//] paths sup
:70 unmerge
:70 tagify --canonicalize
tag emptycommit-23 delete
tag V_0_3-root rename V_0_3
tag V_0_3 move :69

This puts the branch V_0_3 back onto the trunk and creates a new subdirectory V_0_3 there. Then this subdirectory is removed with the paths command of reposurgeon. We then make the merge commit a normal commit with only one predecessor using unmerge and create a tag from this new commit, which is possible because it doesn’t change any files. Finally we delete the tag just created and rename and move the V_0_3 tag.

The second example involves accidental commits on a release tag. This frequently happens to me when using Subversion for doing a release and happens as follows:

  • Create new tag by copying to a subdirectory in tags
  • Switch to this new tag using svn switch
  • Do the release
  • Forget to switch back to trunk
  • Come back later, do some accidental commits on the tag
  • Merge accidental changes back to trunk
  • Revert the changes on the tag

An example of this problem can be seen in the following figure.
Second example

This example is from my svnpserver project and shows a series of commits on the V_0_4 tag. Just before the next release I noticed this, merged the commits from the tag to trunk, and reverted the erroneous commits on the tag. This is fixed with reposurgeon as follows:

:14063 delete
debranch V_0_4 trunk
[/^V_0_4\//] paths sup
tag V_0_5-root rename V_0_5
tag V_0_5 move :14061
:14062 unmerge
:14062 tagify --canonicalize
tag emptycommit-4734 delete

First we delete the commit that reverts the changes on the tag. Then we move the commits from the tag to trunk and remove the resulting path prefix V_0_4. The new V_0_5 tag is moved to what was previously the last commit on the tag because we’re going to eliminate the merge-commit next: we make the merge commit a normal commit by removing the earlier ancestor using unmerge. The last step is to convert this commit into a tag (which is possible because it now doesn’t modify anything) and delete the resulting tag.

Modifying history is usually a bad idea when converting repositories; after all, the version control system is there to preserve the history. My rule is to remove only artefacts of the version control system in use that would never have occurred with another system. All the problems above would have been avoided by using, e.g., GIT in the first place: with GIT we can simply move a tag (if we haven’t pushed yet), and the erroneous commits on the tag could never have happened because we don’t have to switch branches for doing a release with GIT, so forgetting to switch back from the branch is not possible.

So, using reposurgeon, we now have a GIT repository for pyst that spans the entire history of the project, united from the three different version control systems used over its lifetime.

ICMPv6 and prefixes


In IPv4, address assignment is coupled with the assignment of a subnet mask, which implies the insertion of a route to the given subnet.

In IPv6, address assignment is separate from on-link determination; for the latter an interface maintains a list of prefixes. All addresses matched by these prefixes are directly reachable. In addition, routers may issue ICMPv6 redirects, and the target of such a redirect is also considered on-link even if it is not covered by a known prefix of the interface.

Unfortunately the dhclient from ISC doesn't get this right, and I spent some time figuring out why a prefix different from /64 that I wanted for an interface didn't work. It turns out dhclient always configures the interface with a /64 subnet mask and the associated on-link route.
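The symptom looks roughly like this (hypothetical addresses; the DHCPv6 server hands out a single address, yet the interface ends up with a /64 and a matching on-link route):

% ip -6 addr show dev eth0
    inet6 2001:db8:1:2::42/64 scope global
% ip -6 route show dev eth0
    2001:db8:1:2::/64 proto kernel metric 256

The address should have been configured as a /128, without the invented on-link /64.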

RFC4861 on IPv6 Neighbor Discovery, later updated by RFC5942, "IPv6 Subnet Model: The Relationship between Links and Subnet Prefixes", makes clear that such prefixes may only be set by the following means (RFC5942, p.4):

The Prefix List is populated via the following means:

  • Receipt of a valid Router Advertisement (RA) that specifies a prefix with the L-bit set. Such a prefix is considered on-link for a period specified in the Valid Lifetime and is added to the Prefix List. (The link-local prefix is effectively considered a permanent entry on the Prefix List.)
  • Indication of an on-link prefix (which may be a /128) via manual configuration, or some other yet-to-be-specified configuration mechanism.

And it makes clear that this also holds for DHCPv6 (RFC5942, p.7):

The assignment of an IPv6 address — whether through IPv6 stateless address autoconfiguration [RFC4862], DHCPv6 [RFC3315], or manual configuration — MUST NOT implicitly cause a prefix derived from that address to be treated as on-link and added to the Prefix List. …

It even lists the bug of dhclient under the heading "Observed Incorrect Implementation Behavior" (RFC5942, p.8):

… An address could be acquired through the DHCPv6 identity association for non-temporary addresses (IA_NA) option from [RFC3315] (which does not include a prefix length), or through manual configuration (if no prefix length is specified). The host incorrectly assumes an invented prefix is on-link. This invented prefix typically is a /64 that was written by the developer of the operating system network module API to any IPv6 application as a “default” prefix length when a length isn’t specified…

And code inspection (client/dhc6.c, line 3844 dhcp-4.3.0a1) shows the value is really hard-coded in dhclient:

/* Current practice is that all subnets are /64's, but
 * some suspect this may not be permanent.
 */
client_envadd(client, prefix, "ip6_prefixlen",
              "%d", 64);
client_envadd(client, prefix, "ip6_address",
              "%s", piaddr(addr->address));

I've filed a bug report (#35178; not the first one, as I discovered later, but the bug tracker doesn't seem to be public, and the reporter of the Debian bug has also submitted a report) and hope this will be fixed. The bug is also present in older versions, for example isc-dhcp-4.2.2 in Debian stable (wheezy). Debian also has a bug report referencing RFC 5942 that has existed since 2012 (RFC 5942 is from 2010).

The fix would probably be to hard-code the prefix length /128 for the newly assigned address and leave on-link configuration to ICMPv6 router advertisements (see RFC4861).
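In terms of the code quoted above the change would be little more than this (my sketch of the suggested fix, not an official patch):

client_envadd(client, prefix, "ip6_prefixlen",
              "%d", 128);  /* a single address; on-link prefixes come from RAs */

The on-link prefix (whatever its real length) would then be installed from the Router Advertisements, as RFC 5942 intends.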

I hope this will finally be fixed, as DHCPv6 is the only autoconfiguration mechanism in IPv6 that can handle prefix lengths different from /64 (on Ethernet; there may be other layer-2 protocols with a different interface identifier length for stateless autoconfiguration).