25 June 2016

Linking in the Permanent Web

Discourse on the future of the web has long revolved around decentralization, culminating in this month’s Decentralized Web Summit. Judging by the keynotes and the people involved, this future will also be heavily shaped by the concept of permanence - versioning, archival and protection against broken links, built directly into whatever protocol succeeds HTTP.

With hypertext documents becoming immutable, links will naturally evolve away from being mutable pointers to content that, five years from now, might with luck not have diverged too much from the thing you originally referenced. Instead, we will link to documents directly, via a hash of their contents. In fact, content-addressable documents were so omnipresent at the DWS that one speaker even apologized for bringing them up again. Content-addressing and immutability are some of the same good ideas we have come to appreciate in our version control systems, data structures, file systems and peer-to-peer networks. Without considering additional, human-readable naming layers like DNS, the phrase

“All those (location-oriented) links will be lost in time, like tears in rain.”

will, in the web of the future, remain *forever* accessible by a link like


This form of linking is perfectly sufficient to build and work with the permanent web. Still, it might not hurt to consider what more we could do to make linking even more powerful.

Consider a whitepaper published to the permanent web under /sha3/256/somehash-1. To make it easily accessible, it is also made available under a mutable, human-readable address by pointing /ada-lovelace/on-linking to /sha3/256/somehash-1. Now the author updates the paper. Its content-address changes to /sha3/256/somehash-2, and she updates the human-readable link to point to the updated version: /ada-lovelace/on-linking -> /sha3/256/somehash-2. All is well for new readers, as they will always reach the most recent version.
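To make the scenario concrete, here is a minimal sketch of a content-addressed store with a mutable naming layer on top. The class PermanentStore and the helper content_address are hypothetical names invented for illustration; only the /sha3/256/ address prefix and the /ada-lovelace/on-linking name come from the example above. It assumes everything lives in one in-memory dictionary, which a real permanent web would of course not do.

```python
import hashlib
from typing import Dict, Optional


def content_address(data: bytes) -> str:
    """Derive a path-style address from the SHA3-256 digest of the data."""
    return "/sha3/256/" + hashlib.sha3_256(data).hexdigest()


class PermanentStore:
    """Hypothetical in-memory stand-in for the permanent web."""

    def __init__(self) -> None:
        self.blobs: Dict[str, bytes] = {}  # content address -> immutable bytes
        self.names: Dict[str, str] = {}    # mutable name -> content address

    def publish(self, data: bytes, name: Optional[str] = None) -> str:
        """Store the data under its content address; optionally repoint a name."""
        address = content_address(data)
        self.blobs[address] = data          # same bytes always map to same key
        if name is not None:
            self.names[name] = address      # the only mutable step
        return address

    def resolve(self, path: str) -> bytes:
        """Accept either a mutable name or a content address."""
        address = self.names.get(path, path)
        return self.blobs[address]
```

Publishing two editions under the same name leaves both content addresses resolvable, while the name only ever yields the latest one. The old address carries no hint that a successor exists, which is the gap discussed next.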

One of the strengths of the permanent web is the stability of links. This means that a colleague’s paper, referencing and commenting on the original version at /sha3/256/somehash-1 (its source), will still be perfectly valid and readable.

But still, it would be useful to at least know that a newer, maybe better version of the source is available. With just the content-address, we would need some kind of pub-sub system that notifies authors (and readers) whenever their sources are updated. Crucially, no party given just the content-address can check for themselves. So keeping a mutable address around is still valuable, as it preserves the intent of the original reference.

Another way to think about this: Immutable, interlinked documents form a Merkle DAG. Starting at any inner node, the DAG can only be traversed from newer to older versions, never the other way around. So there has to be meta-information pointing to the heads of the Merkle DAG.
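The one-way nature of that traversal can be sketched in a few lines. The functions put and history and the node layout (a body plus a list of parent addresses) are assumptions made for illustration, not a description of any particular protocol. Because a node's address is a hash over its parent addresses, a node can only name versions that already existed when it was written.

```python
import hashlib
import json
from typing import Dict, List, Sequence

# Hypothetical in-memory node store: content address -> node
store: Dict[str, dict] = {}


def put(body: str, parents: Sequence[str] = ()) -> str:
    """Store a node whose address covers both its body and its parent links."""
    node = {"body": body, "parents": list(parents)}
    encoded = json.dumps(node, sort_keys=True).encode("utf-8")
    address = "/sha3/256/" + hashlib.sha3_256(encoded).hexdigest()
    store[address] = node
    return address


def history(address: str) -> List[str]:
    """Walk from a node towards older versions by following parent links."""
    seen: List[str] = []
    stack = [address]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.append(current)
        stack.extend(store[current]["parents"])
    return seen
```

Starting from the second edition, history reaches the first. Starting from the first edition, it reaches nothing newer, because the first edition cannot contain the hash of a document written after it. Finding the head therefore needs information kept outside the DAG, such as a mutable name.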

A fully specified link should consist of both an immutable content-address and a mutable name that communicates intent.

With such a link, even many years later, any reader could easily check what is available at /ada-lovelace/on-linking and compare it with the version that was originally referenced. Should the /ada-lovelace domain have changed owners in the meantime and now point to something completely unrelated, the reader can easily detect this.
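As a rough sketch of what such a check might look like, assume a link is simply a pair of name and content-address, and that some resolver can tell us where a name currently points. FullLink, check_link and the three status strings are hypothetical, chosen only for this illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class FullLink:
    """A link carrying both the mutable name and the immutable address."""
    name: str     # communicates intent, may be repointed later
    address: str  # the exact version that was referenced


def check_link(link: FullLink,
               resolve_name: Callable[[str], Optional[str]]) -> str:
    """Compare where the name points today with the originally cited version.

    resolve_name is an assumed lookup returning the current content address
    for a name, or None if the name no longer resolves.
    """
    current = resolve_name(link.name)
    if current is None:
        return "name-unresolvable"  # the cited version is still reachable
    if current == link.address:
        return "unchanged"          # name still points at the cited version
    return "changed"                # newer revision, or something unrelated
```

A result of "changed" only says that the name has moved. Whether the new target is a later revision or something unrelated is left to the reader, who can fetch both versions by content-address and compare them.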