Archive for the ‘open source’ Category

Peer Production License

Wednesday, May 28th, 2014

Recently, discussions about new licensing models for open cooperative production have come up (again). They resurrect the “Peer Production License” proposed in 2010 by John Magyar and Dmytri Kleiner [1], which is also available on the p2pfoundation website [2], although it’s not clear whether the latter is a modified version. The license is now proposed by Michel Bauwens and Vasilis Kostakis, accompanied by a theoretical discussion [3] of why such a license would enhance the current state of the art in licensing. The proposal has already sparked critical replies, which I will cite in the following where appropriate.

The theoretical argument (mostly based on Marxist theories I don’t have the patience to dig into) boils down to differentiating “good” from “bad” users of the licensed product. A “good” user is a “workerowned business” or “workerowned collective” [2] while a “bad” user seems to be a corporation. Note that the theoretical discussion seems to allow corporate users who contribute “as IBM does with Linux. However, those who do not contribute should pay a license fee” [3] (p.358). I’ve not found a clause in the license that defines this contribution exception. Instead it makes clear that “you may not exercise any of the rights granted to You in Section 3 above in any manner that is primarily intended for or directed toward commercial advantage or private monetary compensation”. Finally it is made clear “for the avoidance of doubt” that “the Licensor reserves the exclusive right to collect such royalties for any exercise by You of the rights granted under this License” [2].

With the clauses cited above, the “new” license is very similar to a Creative Commons license with a non-commercial clause, as others have already noted [4] (p.363). The missing clause for a contribution exception for non-workerowned collectives or businesses is probably only an oversight (Rigi [5] (p.396) also understands the license this way), but this is not the major shortcoming.

For me the main point is: who is going to be the institution that distinguishes “good” from “bad” users of the product, those who have to pay and those who don’t? The license mentions a “collecting society” for this purpose. Whatever this institution is going to be, it might be a “benevolent dictator” [6] at the start but will soon deteriorate into a real dictatorship. Why? As the software-focused “benevolent dictator for life” Wikipedia article [7] notes, the dictator has an incentive to stay benevolent because the project can be forked; this was first documented in ESR’s “Homesteading the Noosphere” [8]. Now since our dictator is the “Licensor [who] reserves the exclusive right to collect such royalties” [2], there are additional, monetary incentives for forking a project, so forking has to be prevented by other means covered in the license. Collection of royalties is incompatible with the right to fork a project. We have an “owner” who decides about “good” vs. “bad” and uses a license to stay in power. This is a recipe for disaster or, as a friend put it in a recent discussion, “design for corruption” [9].

Another problem is the management of contributions. As Meretz has already pointed out, “only people can behave in a reciprocal way” [4] (p.363). Contributors are people. They may belong to one institution that is deemed “good” by the dictator at one point and may later move to an institution that is deemed “bad” by the dictator. So a person may be excluded from using the product just because they belong to a “bad” institution. Take myself as an example: I have been running an open source business for ten years, “primarily intended for or directed toward commercial advantage or private monetary compensation” [2]. I guess I wouldn’t qualify for free use of a “peer production license” licensed product. One of the reasons for the success of open source / free software like Linux was that employees could use it for solving their day-to-day problems. This use often resulted in contributions, but only after using it for some time.

Which leads to the next problem: The license tries to force “good” behaviour. But you must first prove to be “good” by contributing before you’re eligible for using the product. As noted by Rigi the “GPL stipulated reciprocity does not fit into any of these forms [usually known to economists (my interpretation)]” [5] (p.398) because a contributor always gets more (e.g. a working software package) than his own contribution. This is exactly one of the reasons, if not the main reason, people are motivated to contribute. Openness creates more ethical behaviour than a license that tries to force ethics. Force or control will destroy that motivation, as exemplified in the Linus vs. Tanenbaum discussion where Tanenbaum stated:

If Linus wants to keep control of the official version, and a group of eager beavers want to go off in a different direction, the same problem arises. I don’t think the copyright issue is really the problem. The problem is co-ordinating things. Projects like GNU, MINIX, or LINUX only hold together if one person is in charge. During the 1970s, when structured programming was introduced, Harlan Mills pointed out that the programming team should be organized like a surgical team–one surgeon and his or her assistants, not like a hog butchering team–give everybody an axe and let them chop away.
Anyone who says you can have a lot of widely dispersed people hack away on a complicated piece of code and avoid total anarchy has never managed a software project. [10] (Post 1992-02-05 23:23:26 GMT)

To which Linus replied:

This is the second time I’ve seen this “accusation” from ast, who feels pretty good about commenting on a kernel he probably haven’t even seen. Or at least he hasn’t asked me, or even read alt.os.linux about this. Just so that nobody takes his guess for the full thruth, here’s my standing on “keeping control”, in 2 words (three?):
I won’t.
[10] (Post 1992-02-06 10:33:31 GMT)

and then goes on to explain how kernel maintenance works (at the time).

What becomes clear from this discussion is that the main focus of choosing a license is to attract contributors; preventing others from appropriating a version or influencing derived works is only secondary. Many successful open source projects use licenses that are more permissive than the GNU General Public License (GPL) Version 2 [11], and the newer Version 3 of the GPL [12], which is more restrictive, sees less use. The programming language Python is a prominent example of a successful project using a more permissive license [13]. Armin Ronacher documents in a blog post [14] that there is a trend away from the GPL to less restrictive licenses. This is also confirmed statistically by other sources [15].

One reason for this trend is the growing mess of incompatible licenses. One of the ideas of open source / free software is that it should be possible to reuse existing components in order not to reinvent the wheel. This is increasingly difficult due to incompatible licenses; Ronacher’s essay touches only the tip of the iceberg [14]. License incompatibility has already been used to release software under an open source license while still preventing Linux developers from incorporating the released software into Linux [14].

Given the reuse argument, adding another incompatible license to the mix (the proposed Peer Production License is incompatible with the GPL and probably other licenses) is simply insane. Due to its commercial restrictions the new license isn’t even an open source license [16], much less one that fits the free software definition [17]: both definitions require that the software is free for any purpose.

When leaving the field of software and other artefacts protected by copyright we’re entering the field of hardware licensing. Hardware, unlike software, is not protected by copyright (with the exception of some artefacts like printed circuit boards, where the printed circuit is directly protected by copyright). So it is possible for private or research purposes to reverse-engineer a mechanical part and print it on a 3D printer. If the part is not protected by a patent, it is even legal to publish the reverse-engineered design documents for others to replicate the design. This was shown in a study for UK law by Bradshaw et al. [18] but probably carries over to EU law. Note that the design documents are protected by copyright but the manufactured artefact is not. This has implications for the protection of open source hardware because this finding can be turned around: a company may well manufacture an open source design without contributing anything back. Even producing a modified or improved design that is not given back to the community would probably be possible.

Hardware could be protected with patents, but this is not a road the open source community wants to travel. The current state in hardware licensing seeks to protect users of the design from contributors who later want to enforce patents against the design by incorporating clauses where contributors license patents they hold for the project. This was pioneered by the TAPR open hardware license [19] and is also reflected in the CERN open hardware license [20].

To sum up: Apart from the inconsistencies between the theoretical paper [3] and the actual license [2], I pointed out that such a license is a recipe for corruption when money is involved, due to the restrictions on forking a project. In addition, the license would hamper reuse of existing components because it adds to the “license compatibility clusterfuck” [14]. Finally, it won’t protect what it set out to protect: hardware artefacts are, with some exceptions, not covered by copyright and therefore not by a license. We can only protect the design; the production of artefacts from that design is not subject to copyright law.

Last but not least: thanks to Franz Nahrada for inviting me to the debate.

[1] Dmytri Kleiner, The Telekommunist Manifesto. Network Notebooks 03, Institute of Network Cultures, Amsterdam, 2010.
[2] Dmytri Kleiner, Peer Production License, 2010. Copy on the p2pfoundation website (not sure if this is the original license by Kleiner or a modification).
[3] Michel Bauwens and Vasilis Kostakis. From the communism of capital to capital for the commons: Towards an open co-operativism. tripleC communication capitalism & critique, Journal for a Global Sustainable Information Society, 12(1):356-361, 2014.
[4] Stefan Meretz. Socialist licenses? A rejoinder to Michel Bauwens and Vasilis Kostakis. tripleC communication capitalism & critique, Journal for a Global Sustainable Information Society, 12(1):362-365, 2014.
[5] Jakob Rigi. The coming revolution of peer production and revolutionary cooperatives. A response to Michel Bauwens, Vasilis Kostakis and Stefan Meretz. tripleC communication capitalism & critique, Journal for a Global Sustainable Information Society, 12(1):390-404, 2014.
[6] Wikipedia, Benevolent dictatorship, accessed 2014-05-27.
[7] Wikipedia, Benevolent dictator for life, accessed 2014-05-27.
[8] Eric S. Raymond, Homesteading the Noosphere 1998-2000.
[9] Michael Franz Reinisch, private communication.
[10] Andy Tanenbaum, Linus Benedict Torvalds. LINUX is obsolete, discussion on Usenet news, reprinted under the title The Tanenbaum-Torvalds Debate in Open Sources: Voices from the Open Source Revolution, 1999. The discussion was continued under the subject “Unhappy campers”.
[11] GNU General Public License version 2. Software license, Free Software Foundation, 1991
[12] GNU General Public License version 3. Software license, Free Software Foundation, 2007
[13] Python Software Foundation. History and License 2001-2014
[14] Armin Ronacher, Licensing in a Post Copyright World, Blog entry, Jul 2013
[15] Matthew Aslett, On the continuing decline of the GPL. Blog entry, December 2011
[16] Bruce Perens, The Open Source Definition, Online document, Open Source Initiative, 1997
[17] Free Software Foundation, The free software definition. Essay, 2001-2010
[18] Simon Bradshaw, Adrian Bowyer, and Patrick Haufe, The intellectual property implications of low-cost 3D printing. SCRIPTed — A Journal of Law, Technology & Society 7(1):5-31, April 2010.
[19] John Ackermann, TAPR open hardware license version 1.0, Tucson Amateur Packet Radio, May 2007
[20] Javier Serrano, CERN open hardware license v1.1, Open Hardware Repository, September 2011

Migrating to GIT with Reposurgeon

Friday, April 18th, 2014

I’ve recently worked a lot with reposurgeon, a tool by Eric S. Raymond to do surgery on version control data. With this tool it is possible to migrate from almost any version control system to almost any other version control system, although these days the most feature-complete system is GIT, which is the recommended and best supported target system (and the only one I’ve tested). Reposurgeon comes with a migration guide, the DVCS migration HOWTO.

Beyond just converting data, reposurgeon can be used to clean up artefacts, the simplest of which is to reformat commit comments to conform to established standards on GIT commit messages.
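To illustrate what "conforming to GIT commit message conventions" means, here is a minimal Python sketch. It is not reposurgeon's implementation and the function name normalize_comment is made up for this example. It assumes the first line of a comment is meant as the summary, which is not always true for comments written in CVS or Subversion.

```python
def normalize_comment(comment: str) -> str:
    """Reshape a commit comment into: summary line, blank line, body."""
    # Drop surrounding blank lines and trailing whitespace on each line.
    lines = [line.rstrip() for line in comment.strip().splitlines()]
    if len(lines) < 2:
        # Empty or single-line comments need no separator.
        return "\n".join(lines) + "\n"
    summary, rest = lines[0], lines[1:]
    # Insert the blank separator line only when it is missing.
    if rest[0] != "":
        rest.insert(0, "")
    return "\n".join([summary] + rest) + "\n"


old_style = "Fix PACKAGE definition.\nNeeded for the release.  \n"
new_style = normalize_comment(old_style)
```

A comment that already has the blank separator line passes through unchanged apart from whitespace cleanup.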

My conversion of the history of the pyst project, a Python library to connect to different network interfaces of the Asterisk telephony engine, is a good example of what can be done with reposurgeon. The project was originally started by Karl Putland, who kept it in CVS, and was later taken over by Matthew Nicholson, who used Monotone for version control. When I took over maintenance in 2010, I used Subversion. So we had three separate source code repositories in different version control systems. No effort was ever made to convert the version history from one system to another, so each new maintainer imported the last release version into the new version control system and continued from there.

Fortunately, Monotone has a GIT export feature and reposurgeon can natively read the other two formats. So I used separate reposurgeon scripts to clean up the three repositories and then used the reposurgeon graft command to unite them into one. The Subversion repo started with release 0.2 but there had been some commits after 0.2 in Monotone (which were later merged in Subversion) so the commits after 0.2 were put on a branch in the new repository. The last step then was to write out the resulting repository in GIT fast-export format and import into a GIT repository.

What artefacts did I clean up? Let me give two examples, both of which are problems I have when using Subversion.

Subversion doesn’t have the concept of a tag like other version control systems have. Instead tags are emulated by copying the to-be-tagged content to a new location in the repository, effectively creating a new branch. It’s just a naming convention that this is called a tag.

The first example deals with last-minute changes when doing a release. In GIT, if something in the release process doesn’t work (and no release has been published yet), we can still move the release tag as long as we haven’t pushed our changes to the public repository. In Subversion it is not possible to really remove or move a tag as in other systems. So I frequently have the situation that a release tag isn’t just one commit but several, and the changes are merged either from the trunk to the tag-branch or the other way round, from the tag-branch to the trunk. An example of a commit on a tag (r22 “fix PACKAGE definition for SF release”) and the subsequent merge back to trunk can be seen in the following illustration, created with the graph command of reposurgeon. The tag, originally Subversion commit r21, has already been tagified by reposurgeon, but the commit on the tag is now on the branch V_0_3.

[Figure: First example]

This can be fixed with reposurgeon with the following commands:

debranch V_0_3 trunk
[/^V_0_3\//] paths sup
:70 unmerge
:70 tagify --canonicalize
tag emptycommit-23 delete
tag V_0_3-root rename V_0_3
tag V_0_3 move :69

This puts the branch V_0_3 back onto the trunk and creates a new subdirectory V_0_3 there. Then this subdirectory is removed with the paths reposurgeon command. We then make the merge commit a normal commit with only one predecessor with unmerge and create a tag from this new commit which is possible because it doesn’t change any files. Finally we delete the tag just created and rename and move the V_0_3 tag.

The second example involves accidental commits on a release tag. This frequently happens to me when using Subversion for doing a release and happens as follows:

  • Create new tag by copying to a subdirectory in tags
  • Switch to this new tag using svn switch
  • Do the release
  • Forget to switch back to trunk
  • Come back later, do some accidental commits on the tag
  • Merge accidental changes back to trunk
  • Revert the changes on the tag

An example of this problem can be seen in the following figure.

[Figure: Second example]

This example is from my svnpserver project and shows a series of commits on the V_0_4 tag. Just before the next release I noticed, merged the commits from the tag to trunk, and reverted the erroneous commits on the tag. This is fixed with reposurgeon as follows:

:14063 delete
debranch V_0_4 trunk
[/^V_0_4\//] paths sup
tag V_0_5-root rename V_0_5
tag V_0_5 move :14061
:14062 unmerge
:14062 tagify --canonicalize
tag emptycommit-4734 delete

First we delete the commit that reverts the changes on the tag. Then we move the commits from the tag to trunk and remove the resulting path prefix V_0_4. The new V_0_5 tag is moved to what was previously the last commit on the tag, because we’re going to eliminate the merge commit next: first we make the merge commit a normal commit by removing the earlier ancestor using unmerge. The last step is to convert this commit into a tag (which is possible because it now doesn’t modify anything) and remove the resulting tag.

Modifying history is usually a bad idea when converting repositories. After all, the version control system is there to preserve the history. My rule is to remove only artefacts of the version control system in use that would never have occurred with another system. All the problems above would have been avoided by using, e.g., GIT in the first place: with GIT we can simply move a tag (if we haven’t pushed yet), and the erroneous commits on the tag could never have happened because we don’t have to switch branches for doing a release with GIT, so forgetting to switch back from the branch is not possible.

So by using reposurgeon we now have a GIT repository for pyst that spans the entire history of the project united from three different version control systems in use over the duration of the project.

ICMPv6 and prefixes

Friday, January 10th, 2014

In IPv4 the address assignment is coupled with the assignment of a subnet-mask — which means the insertion of a route to the given subnet.

In IPv6, address assignment is separate from on-link determination. For the latter, an interface maintains a list of prefixes, and all addresses matched by these prefixes are directly reachable. In addition, routers may issue ICMPv6 redirects, and the target of such a redirect is also on-link even if it is not contained in a known prefix of the interface.
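The on-link decision can be sketched in a few lines of Python with the standard ipaddress module. This is only an illustration of the model, not how any real network stack implements it: the class and method names are made up, the addresses come from the documentation prefix 2001:db8::/32, and prefix lifetimes are ignored.

```python
from ipaddress import IPv6Address, IPv6Network


class Interface:
    """Hypothetical model of the per-interface on-link state."""

    def __init__(self):
        # The link-local prefix is effectively a permanent entry.
        self.prefix_list = [IPv6Network("fe80::/10")]
        # Targets of ICMPv6 redirects are on-link as well.
        self.redirect_targets = set()

    def add_prefix(self, prefix):
        """Record an on-link prefix, e.g. from a router advertisement."""
        self.prefix_list.append(IPv6Network(prefix))

    def add_redirect_target(self, address):
        self.redirect_targets.add(IPv6Address(address))

    def is_on_link(self, address):
        addr = IPv6Address(address)
        if addr in self.redirect_targets:
            return True
        return any(addr in prefix for prefix in self.prefix_list)


iface = Interface()
iface.add_prefix("2001:db8:1:2::/80")            # made-up on-link prefix
inside = iface.is_on_link("2001:db8:1:2::5")      # covered by the /80
outside = iface.is_on_link("2001:db8:1:2:1::5")   # same /64, outside the /80
iface.add_redirect_target("2001:db8:1:2:1::5")
redirected = iface.is_on_link("2001:db8:1:2:1::5")
```

Note that assigning an address to the interface never touches prefix_list in this model; that separation is the point of the paragraph above.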

Unfortunately the dhclient from ISC doesn’t get this right, so I spent some time finding out why a prefix I wanted for an interface, which is different from /64, didn’t work. It turns out that dhclient always configures an interface with a /64 subnet mask and associated route.

RFC4861 on IPv6 Neighbor Discovery, later updated by RFC5942 “IPv6 Subnet Model: The Relationship between Links and Subnet Prefixes”, makes clear that such prefixes may only be set by the following means (RFC5942, p.4):

The Prefix List is populated via the following means:

  • Receipt of a valid Router Advertisement (RA) that specifies a prefix with the L-bit set. Such a prefix is considered on-link for a period specified in the Valid Lifetime and is added to the Prefix List. (The link-local prefix is effectively considered a permanent entry on the Prefix List.)
  • Indication of an on-link prefix (which may be a /128) via manual configuration, or some other yet-to-be-specified configuration mechanism.

And makes clear this also holds for DHCPv6 (RFC5942, p.7):

The assignment of an IPv6 address — whether through IPv6 stateless address autoconfiguration [RFC4862], DHCPv6 [RFC3315], or manual configuration — MUST NOT implicitly cause a prefix derived from that address to be treated as on-link and added to the Prefix List. …

It even lists the bug of dhclient under the heading “Observed Incorrect Implementation Behavior” (RFC5942, p.8):

… An address could be acquired through the DHCPv6 identity association for non-temporary addresses (IA_NA) option from [RFC3315] (which does not include a prefix length), or through manual configuration (if no prefix length is specified). The host incorrectly assumes an invented prefix is on-link. This invented prefix typically is a /64 that was written by the developer of the operating system network module API to any IPv6 application as a “default” prefix length when a length isn’t specified…

And code inspection (client/dhc6.c, line 3844 dhcp-4.3.0a1) shows the value is really hard-coded in dhclient:

/* Current practice is that all subnets are /64's, but
 * some suspect this may not be permanent.
 */
client_envadd(client, prefix, "ip6_prefixlen",
              "%d", 64);
client_envadd(client, prefix, "ip6_address",
              "%s", piaddr(addr->address));

I’ve filed a bug-report (#35178) and hope this will be fixed. As I discovered later, mine is not the first one: the bug-tracker doesn’t seem to be public, and the reporter of the Debian bug has also submitted a report. The bug is also present in older versions, for example isc-dhcp-4.2.2 in Debian stable (wheezy). Debian has had a bug-report referencing RFC 5942 since 2012 (RFC 5942 is from 2010).

The fix would probably be to hard-code the netmask /128 for the newly-assigned address and leave the configuration to ICMPv6 router advertisements (see RFC4861).
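The difference between the invented /64 and a /128 can be shown with Python's ipaddress module. The addresses below are made up from the documentation prefix, and the snippet only compares the two prefix lengths; it says nothing about how dhclient would actually apply them.

```python
from ipaddress import IPv6Address, IPv6Interface

# Hypothetical address as it might be assigned via DHCPv6 (no prefix length).
assigned = "2001:db8:1:2::10"

# What results from the hard-coded /64: a whole subnet is treated as on-link.
invented = IPv6Interface(assigned + "/64").network
# The proposed fix: /128 covers only the assigned address itself.
host_only = IPv6Interface(assigned + "/128").network

# Some other host that happens to share the first 64 bits.
other = IPv6Address("2001:db8:1:2:ffff::1")

wrongly_on_link = other in invented    # the /64 route claims it is reachable
still_on_link = other in host_only     # the /128 makes no such claim
covered = host_only.num_addresses      # size of the /128
```

With the /128, whether other hosts are on-link is decided only by prefixes from router advertisements (or manual configuration), which is what RFC5942 asks for.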

I hope this will finally be fixed, as DHCPv6 is the only autoconfiguration mechanism in IPv6 that can handle netmasks different from /64 (on Ethernet, there may be other layer-2 protocols with a different interface identifier length for stateless autoconfiguration).

Elevate Festival

Tuesday, October 22nd, 2013

On Friday, October 25th, at 12:00 I’ll give a talk about Open Source at the Elevate Festival in Graz. I’ll present some applications, explain the term “Open Source” (using the Open Source Definition) and give an outlook towards Open Hardware.

Unix Domain Sockets

Tuesday, November 9th, 2010

I recently had to find a solution for a communication problem: An application running on a web-server should update configuration files that are only readable by a privileged user, and these should not be directly writeable by the web-server user.

So the idea was to write an update-server running under the privileged account which receives update requests (and can perform additional checks) from the unprivileged web-server user.

One of the checks I wanted to make was that only the web-server user (www-data on Debian) should be able to send update requests. So I had to find out the user sending a request via the Unix-domain socket. Google found a nice socket howto on Henning Makholm’s blog which told me most of what I needed to know: “so I ended up just checking the EUID of the client process after the connection has been accept()ed. For your reference, the way to do this is getsockopt() with SO_PEERCRED for Linux”.

But one issue remained: I didn’t need a SOCK_STREAM socket but wanted to send datagrams to the other side (and didn’t want to fiddle with implementing my own datagram layer on top of a stream socket). With normal SOCK_DGRAM datagram sockets there is no connection, and therefore I can’t determine the user sending the datagram from the other side of the socket.

Looking further I discovered that Linux has had connection-oriented datagram sockets for quite some time under the name SOCK_SEQPACKET. With this type of socket you first connect() to the other side and then send a datagram. Since there is now a connection, the trick with SO_PEERCRED described above works, too.

Code for Server (python):

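import os
import pwd
from socket import socket, SOCK_SEQPACKET, AF_UNIX, SOL_SOCKET
from struct import calcsize, unpack
try:
    # Not implemented in python 2.6, maybe higher
    from socket import SO_PEERCRED
except ImportError:
    SO_PEERCRED = 17  # Linux

sock = socket(AF_UNIX, SOCK_SEQPACKET)
path = '/path/to/socket'
try:
    os.remove(path)  # remove a stale socket left over from a previous run
except OSError:
    pass             # nothing to remove
sock.bind(path)
sock.listen(1)
conn, adr = sock.accept()
# struct ucred consists of pid, uid and gid as three native ints
ucred = conn.getsockopt(SOL_SOCKET, SO_PEERCRED, calcsize('3i'))
pid, uid, gid = unpack('3i', ucred)
if uid != pwd.getpwnam('www-data').pw_uid:
    # only the web-server user may send update requests
    conn.close()
else:
    data = conn.recv(4096)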

Code for client (python):

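from socket import socket, SOCK_SEQPACKET, AF_UNIX
s = socket(AF_UNIX, SOCK_SEQPACKET)
s.connect('/path/to/socket')
request = b'...'  # placeholder: the actual update request goes here
s.send(request)
s.close()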
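
To try both properties of SOCK_SEQPACKET without setting up a server, the following sketch uses socketpair() so that both ends live in one process. It is Linux-only, the helper name peer_credentials is made up, and the fallback value 17 for SO_PEERCRED is an assumption that holds on common Linux platforms but not on all architectures.

```python
import os
import socket
import struct

# SO_PEERCRED is Linux-specific; fall back to 17 if the module lacks it.
SO_PEERCRED = getattr(socket, "SO_PEERCRED", 17)


def peer_credentials(conn):
    """Return (pid, uid, gid) of the process on the other end of conn."""
    fmt = "3i"  # struct ucred: three native ints
    raw = conn.getsockopt(socket.SOL_SOCKET, SO_PEERCRED, struct.calcsize(fmt))
    return struct.unpack(fmt, raw)


# Two connected SOCK_SEQPACKET sockets in the same process.
server, client = socket.socketpair(socket.AF_UNIX, socket.SOCK_SEQPACKET)

# Each send() is one datagram; the boundaries survive on the receiving side.
client.send(b"first request")
client.send(b"second request")
messages = [server.recv(4096), server.recv(4096)]

# The connection lets the receiver ask the kernel who is on the other side.
pid, uid, gid = peer_credentials(server)

client.close()
server.close()
```

With a stream socket the two requests could arrive glued together in one recv(); here each recv() returns exactly one request, and uid is the effective user id of the sender as recorded by the kernel.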