Social-network threat models

There have been a couple of comments on my peer-to-peer blogging post, both addressing different threat models than I was looking at.

My posts were looking at countermeasures to continue blogging in the event that public web hosting service providers are taken out by IP enforcement action. The aim of such enforcement action is to prevent distribution of copyrighted content: since I don’t actually want to do that I am not trying to evade the enforcement as such, just trying to avoid being collateral damage.  The major challenges are to avoid conventional abuse, and to maintain sufficient availability, capacity and reliability without the resources of a centralised service with a proper data centre.

Sconzey mentioned DIASPORA*.  That is an interesting project, but it is motivated by a different threat model – the threat from the service providers themselves.  Social-networking providers like Facebook or Google have, by virtue of their position, privileged access to the data people share, and are explicitly founded on the possibilities of profiting from that access. Diaspora aims to free social-networking data from those service providers, whose leverage is based on their ownership of the sophisticated server software and on lock-in and network effects.  To use Diaspora effectively, you need a good-quality host.  Blogging software is already widespread – if you have the infrastructure you need to run Diaspora, you can already run wordpress.  The “community pods” that exist for Diaspora could be used for copyright infringement and would be vulnerable to the SOPA-like attacks.

James A. Donald says “we are going to need a fully militarized protocol, since it is going to come under state sponsored attack.” That’s another threat model again. Fundamentally, defending against that should be impossible for open publication: if you publish something, the attacker can receive it. Having received it, he can trace back one step to where it came from, and demand to know where they got it from.  If refused, or if the intermediate node is deliberately engineered so messages cannot be traced back further, then the attacker can threaten to shut down or isolate the node provider.

In practice it can be possible to evade that kind of attacker by piggy-backing on something the attacker cannot shut down, because he relies on it himself.  That is a moving target, because what is essential changes over time.

(One could avoid using fixed identifiable locations altogether – e.g. WiMAX repeaters in vehicles. That’s not going to be cheap or easy.)

James seems to be thinking more about private circles, where end-to-end encryption can be used. That’s more tractable technically, but it’s not useful to me. I don’t have a circle of trusted friends to talk about this stuff with: I’m throwing ideas into the ether to see what happens. Any of you guys could be government agents for all I know, so carefully encrypting my communications with you doesn’t achieve anything.

More on peer-to-peer blogging

I was musing a few days ago on how to do blogging if SOPA-like measures take out hosting providers for user content.

Aaron Davies in a comment suggests freenet. I’m not sure about that; because you don’t choose at all what other content you’re hosting, I would expect the whole system to drown in movie rips and porn. The bittorrent idea where the stuff which you help distribute is the stuff which you want to consume seems less vulnerable. alt.binaries didn’t die because of copyright enforcement, it died because the copyright infringement made such large demands on capacity that it was not worth distributing.

Bear in mind that I’m not going “full paranoid” here: my threat scenario is not “the feds want to ban my blog”, it’s “Blogger and the like have so much difficulty complying with IP law that they’re becoming painful and/or expensive to use”.

In that circumstance, simply running wordpress or geeklog on my own machine is an option, but rather a crappy one in capacity and reliability terms. I’ve already looked into using a general web hosting provider, and I could move onto that for probably five quid a month, but I’ve again been put off by reliability issues. Also, in the threat scenario under consideration, third-party web hosting might be affected too.

But Davies in passing mentioned email. When I saw that I went “D’oh”. I hadn’t thought of using SMTP. I’d thought of NNTP, which I have a soft spot for¹, but rejected it. SMTP could well be the answer — like NNTP, it was designed for intermittent connections. Running mailman or something on your home PC is a lot simpler and safer than running wordpress. The beauty of it is that not even Hollywood can get email banned. And if they tried, all you need to keep dodging is a non-government-controlled DNS, which is something people are already working on.

You still need a published archive though; one that people can link to. But that can work over SMTP too, as a request-response daemon. Those were actually quite common before the web: you could get all sorts of information by sending the right one-line email to the right address.
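A request-response archive of that kind can be sketched in a few lines. This is only an illustration using Python's standard `email` package: the `INDEX` and `GET <post-id>` commands, the `ARCHIVE` dictionary and the `build_reply` function are all made-up names, and actually receiving and sending the mail (for example via a local mail server) is left out.

```python
from email import message_from_string
from email.message import EmailMessage

# Hypothetical in-memory archive: post id -> post text.
ARCHIVE = {
    "2012-01-20": "More on peer-to-peer blogging ...",
}


def build_reply(raw_request: str, archive: dict, me: str) -> EmailMessage:
    """Turn a one-line request email into a reply email.

    The command set (INDEX, GET <post-id>) is an assumption for this sketch.
    """
    request = message_from_string(raw_request)
    # Only plain single-part requests are understood; anything else is
    # treated as an empty (and therefore unknown) request.
    body = "" if request.is_multipart() else request.get_payload()
    lines = body.strip().splitlines()
    first_line = lines[0] if lines else ""
    command, _, argument = first_line.partition(" ")
    command, argument = command.upper(), argument.strip()

    reply = EmailMessage()
    reply["From"] = me
    reply["To"] = request["From"]
    reply["Subject"] = "Re: " + (request["Subject"] or "")
    if command == "GET" and argument in archive:
        reply.set_content(archive[argument])
    elif command == "INDEX":
        reply.set_content("\n".join(sorted(archive)))
    else:
        reply.set_content("Unknown request. Send INDEX or GET <post-id>.")
    return reply
```

A reader would send a mail whose body is just `GET 2012-01-20`, and the daemon would mail the post back to the address in the `From` header.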

There were actually applications that ran over SMTP. One which lasted well into web days, and may even still exist here and there, was the diplomacy judge, for playing the board game Diplomacy over email.

Unmoderated comments would have to go under this scenario, whatever the technology, but moderated comments would be easy enough; the moderator would just forward acceptable comments onto the publication queue. Email clients in the days when mailing lists were very common were designed specifically to make following lists in this way easy (I remember mutt was much favoured for the purpose). Each list became a folder (by using procmail or the like), each post a thread, and each comment a reply. My own email is still set up that way, though I pretty much never look at the list folders any more. I think a couple of them are still being populated for things like development of the linux b43 wireless chipset driver.

The problem with using mail is spam. Everyone who wants to subscribe has to give me their email address — that’s probably the biggest reason why the use of mailing lists declined; that and the impact of false positives from spam filtering.

If generic publishing networks drown in media, and mail drowns in spam, then some more private network is needed.

Requirements:

  •  Anyone can access posts, as easily as possible
  •  I only have to process posts from sources I’ve chosen

Our big advantage is that the actual storage and bandwidth needed for blogging are rounding error in a world of digital video.

Reliable access requires that there are multiple sources for posts, to compensate for the fact we’re not running industrial data centres.

The obvious approach is that if I follow a blog, I mirror it. Someone wanting to read one of my posts can get it from my server, or from any of my regular readers’ servers. That just leaves the normal P2P problems:

  • locating mirrors, in spite of dynamic IP assignment
  • traversing NAT gateways which don’t allow incoming connections
  • authenticating content (which might have been spoofed by a mirror)

Authentication is trivial: there’s no complex web of trust. Each blog has an id, and that id is the public key used to check the digital signatures on its posts. The first two are difficult, but have been solved by all the P2P networks. Unlike some of them, we do want to persistently identify sources of data, so presumably each node regularly notifies the other nodes it knows about of its current location. Possibly other already-existing p2p networks could be used for this advertisement function. There’s a DoS vulnerability there with attackers spoofing location notifications, so probably the notifications have to be signed. I guess the node id is distinct from the blog id (blogs could move, nodes could originate more than one blog) so it’s also a distinct key. Like a blog id, a node id essentially is the public key. NAT traversal I’m not sure about: there’s stuff like STUN and ICE which I haven’t really dealt with.
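As a rough sketch of the signed location notifications: a node could keep, for each node id, only the newest notice whose signature checks out. Everything named here (`LocationNotice`, `LocationTable`, the `counter` field, the `address|counter` message layout) is invented for illustration. The signature check itself is passed in as a function, because the Python standard library has no public-key signature scheme; a real node would plug in something like Ed25519 from a third-party library.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class LocationNotice:
    node_id: str      # stands in for the node's public key
    address: str      # e.g. "203.0.113.7:8080"
    counter: int      # assumed to increase with every notice a node issues
    signature: bytes  # signature over address and counter, made with the node key


# verify(node_id, message, signature) -> bool, supplied by the caller.
Verifier = Callable[[str, bytes, bytes], bool]


class LocationTable:
    """Remembers the latest verified location for each node id."""

    def __init__(self, verify: Verifier) -> None:
        self._verify = verify
        self._latest: Dict[str, LocationNotice] = {}

    def offer(self, notice: LocationNotice) -> bool:
        """Accept a notice only if it is correctly signed and newer than what we hold."""
        message = f"{notice.address}|{notice.counter}".encode()
        if not self._verify(notice.node_id, message, notice.signature):
            return False  # spoofed or corrupted notice
        current = self._latest.get(notice.node_id)
        if current is not None and notice.counter <= current.counter:
            return False  # stale or replayed notice
        self._latest[notice.node_id] = notice
        return True

    def locate(self, node_id: str) -> Optional[str]:
        notice = self._latest.get(node_id)
        return notice.address if notice else None
```

The counter is there so that an attacker cannot replay an old but genuinely signed notice to point readers back at an address the node has left.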

Assuming we can map a persistent node id to an actual service interface of some kind, this is what it would have to provide:

  • List blogs that this is the authoritative source for
  • List blogs that this mirrors (also returning authoritative source)
  • List other known mirrors for a blog id
  • List posts by blog id (optional date ranges etc)
  • Retrieve posts by blog id and post id
  • Retrieve moderated comments by blog id and post id (optional comment serial range)
  • Retrieve posts and moderated comments modified since (seq num)
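The read-only part of that list might look something like the following in-memory sketch. The class, method and field names are all hypothetical, comments are left out to keep it short, and nothing here says how the calls would travel over the network.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class Post:
    blog_id: str            # stands in for the blog's public key
    post_id: str
    seq: int                # node-local sequence number, bumped on every change
    body: str
    signature: bytes = b""  # would be made with the blog key; not checked here


@dataclass
class NodeService:
    own_blogs: Set[str] = field(default_factory=set)
    mirrored: Dict[str, str] = field(default_factory=dict)             # blog id -> authoritative node id
    known_mirrors: Dict[str, Set[str]] = field(default_factory=dict)   # blog id -> mirror node ids
    posts: Dict[Tuple[str, str], Post] = field(default_factory=dict)   # (blog id, post id) -> post

    def list_authoritative(self) -> List[str]:
        return sorted(self.own_blogs)

    def list_mirrored(self) -> Dict[str, str]:
        return dict(self.mirrored)

    def list_mirrors(self, blog_id: str) -> List[str]:
        return sorted(self.known_mirrors.get(blog_id, set()))

    def list_posts(self, blog_id: str) -> List[str]:
        mine = [p for p in self.posts.values() if p.blog_id == blog_id]
        return [p.post_id for p in sorted(mine, key=lambda p: p.seq)]

    def get_post(self, blog_id: str, post_id: str) -> Optional[Post]:
        return self.posts.get((blog_id, post_id))

    def modified_since(self, seq: int) -> List[Post]:
        newer = [p for p in self.posts.values() if p.seq > seq]
        return sorted(newer, key=lambda p: p.seq)
```

A mirror would remember the highest `seq` it has seen from a node and call `modified_since` with it on the next visit, so that it only pulls what has changed.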

The service is not authenticated, but posts and moderated blog comments are signed with the blog key. (Comments optionally signed by the commenter’s key too, but a comment author signature is distinguishable from a comment moderator signature).
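Because the signature travels with the post, a reader can take it from whichever mirror answers first and still know it is genuine. A minimal sketch of that loop, with the network fetch and the signature check both passed in as plain functions (`fetch_verified`, `Fetcher` and `Verifier` are illustrative names, not part of any existing library):

```python
from typing import Callable, Iterable, Optional

Fetcher = Callable[[str], Optional[bytes]]  # post id -> raw signed post, or None
Verifier = Callable[[str, bytes], bool]     # (blog id, raw post) -> signature ok?


def fetch_verified(
    blog_id: str,
    post_id: str,
    mirrors: Iterable[Fetcher],
    verify: Verifier,
) -> Optional[bytes]:
    """Return the first copy of the post whose signature matches the blog key."""
    for fetch in mirrors:
        try:
            raw = fetch(post_id)
        except OSError:
            continue  # mirror is offline or unreachable; try the next one
        if raw is not None and verify(blog_id, raw):
            return raw
        # Otherwise the mirror had nothing, or served a spoofed copy: keep looking.
    return None
```

A mirror that serves garbage costs the reader one wasted request, nothing more, which is why the service itself does not need to be authenticated.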

The service owner can also

  • Create post
  • Add post to a blog
  • Edit post
  • Add a moderated comment to a blog
  • Check mirrored blogs for new posts & comments & mirror list updates

There’s a case for mirroring linked posts on non-followed blogs: if I link to a post, I include it on my server so that whoever reads it can read the link too.  Ideally, there should be an http side to the service as well, so people outside the network can link to posts and see them if they have the good luck to catch the right server being available at the time.  That all needs more thought.

¹When RSS was coming in, I argued that it was just reinventing NNTP and we ought to use that instead.

SOPA

I never blogged on the SOPA kerfuffle; it happened while my creative(?) energies were elsewhere.

Looking back, a few minor points emerge:

Some commentators got all excited: “look what we did! What shall we do next?!” “We” meaning right-thinking internet-type people. The answer, obviously, is nothing: this, “we” agreed about; most things, we don’t. I think Wikipedia’s claim, “Although Wikipedia’s articles are neutral, its existence is not”, was basically justified.

Libertarian commentators had a lot of fun jeering at leftist techies who wanted every aspect of the economy to be regulated by the government except the internet. The criticism is only justified against those who demand that government regulate things but don’t specify exactly how they should regulate them (others can say they’re in favour of regulation, but just want it to be better). But that’s most people. So yeah.

In some ways, it’s a disappointment that SOPA didn’t go through; the circumvention techniques that would have been developed if it had would have been interesting and useful. At the end of the day, the biggest threat to free computing isn’t legislation, it’s that in a stable market, locked-down “appliance” devices are more useful to the non-tinkering user than general-purpose, hackable devices. So far, we tinkerers still have the GP devices, because the locked-down ones go obsolete too quickly even for lay users. I’m not sure whether that situation will persist for the long term: I’ve looked at the question before.

But if the government makes stupid laws that can easily be circumvented using general-purpose devices, the demand for those devices will be helpfully supported.

Note that when I talk about circumvention, I’m not talking about copyright infringement. That was not what the argument was about. While I lean toward the view that copyright is necessarily harmful, I’m not certain and it’s not that big a deal. The important argument is all about enforcement costs: given that copyright exists, whose responsibility is it to enforce it? The problem with SOPA was that it would have put crippling copyright enforcement costs on any facilitator of internet communication.

Currently, internet discussion is structured mostly around large service providers — in the case of this blog Google — providing platforms for user content. If those service providers become legally liable for infringing user content, the current structure collapses. The platforms would either have to go offshore, with users relying on the many easy ways of circumventing the SOPA provisions attempting to limit access to offshore infringers, or else evade the enforcers by going distributed, redundant and mobile. What will be to Blogger as Kazaa and then BitTorrent were to Napster?  It would have been interesting to find out, and possibly beneficial. There is a lot of marginal censorship that can be applied to easy-target platforms like Blogger or Wikipedia that will not induce sufficient users to create alternatives, as the sheer idiot clumsiness of SOPA would probably have done.

(Note Wikipedia might have been spared, but it would have suffered, because if existing less respectable platforms were removed, their content would migrate to the likes of Wikipedia. If 4chan did not exist, Wikipedia would become 4chan.)

Actually, it’s interesting to think about how to blog over a pure P2P framework. Without comments, you’re publishing a linear collection of documents. (I don’t think you can handle comments — we’d need something more like trackbacks). Posts would need to be cryptographically signed and have unique ids. Serial numbers would be useful so readers would know if they’d missed anything. I wonder if anyone’s worked on it. A sort of bittorrent-meets-git hybrid would be really interesting — search this list of hosts for any git commits signed by any of these keys…
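The serial-number idea is easy to illustrate. Assuming, purely for this sketch, that a blog numbers its posts 1, 2, 3, … with no deliberate gaps, a reader can work out which posts it has missed from the serials it has already collected:

```python
from typing import Iterable, List


def missing_serials(seen: Iterable[int]) -> List[int]:
    """List the serial numbers below the newest one seen that have not arrived."""
    seen_set = set(seen)
    if not seen_set:
        return []
    newest = max(seen_set)
    return [n for n in range(1, newest + 1) if n not in seen_set]
```

For example, `missing_serials([1, 2, 4, 7])` gives `[3, 5, 6]`. The check cannot reveal posts newer than the latest one the reader holds, so it complements asking mirrors for updates rather than replacing it.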

The dance of censorship and evasion is very difficult to predict in detail. I found some time ago that the way to find the text of an in-copyright book is to take a short phrase from it (that isn’t a well known quotation or the title) and google it. That used to work. I wanted some text from Evelyn Waugh’s Decline and Fall the other day, so I did the usual, and got pages and pages of forum posts, containing chunks of the book interspersed with links to pages selling MMO currency and fake LVMH crap. My access to illicit literature was being messed up by someone else’s illicit SEO.

Goodbye scribefire

I gradually noticed that some of the extensions I’m using in firefox are not actually free software. Oh well, I thought, it doesn’t matter much. I put looking for free alternatives on my list of things to do, somewhere near the bottom.

I just now noticed that my last two blog posts have transparent tracking images at the bottom.

I posted them with scribefire.

I feel like the tough-guy in one of the Elmore Leonards I’ve been reading recently.

You have to be kidding me. You put tracking images on my blog posts and you expect me to just accept it? What are you, nuts?

Scribefire will be uninstalled shortly. And I will never install a non-free firefox extension again.

Apple Sells Unencrypted Music

It’s a good thing, certainly.   To my mind, the bigger change is that downloads are starting to appear at substantial discounts to CDs – I bought a new-release big-name album as an MP3 download from Amazon last month for only three pounds – the first time it’s been cheaper for me to download than to buy the disc.  Of course, competition from Amazon is one of the main reasons for this new development from Apple.

So ends the music shop.  I suspect that retailing will never recover from the current downturn.  The proportion of economic activity devoted to retailing seems vastly excessive.

Seizure of Intellectual Property

There have been two stories recently involving governments seizing intellectual property.

In one, the US government was seeking to take ownership of the trademark rights over certain symbols or logos used by motorcycle gangs.

The problem with this, for me, is that it’s an abuse of trademark rights, albeit a familiar one. The purpose of trademark law is to protect consumers from being deceived about goods or services they are buying. There is no welfare justification in preventing me from painting a Nike swoosh on my own T-shirt, unless I attempt to sell the T-shirt to a mug who thinks it is made by Nike.

Therefore having trademarks would not allow the government to do anything useful (i.e. stopping gang members from wearing their gang colours) without abusing the trademark rights.

The second case is the State of Kentucky seeking to seize internet domain names used by gambling operators, as “gambling devices”. Now the question of whether a domain name is a device is very debatable, but aside from that, the rights to a domain name are assets, and can be seized by government if the law permits.

That does seem rather odd, but it really isn’t. The reason it seems odd is that a domain name is essentially an entry in a directory, and it seems odd that a directory entry can be controlled in that way. But, to get all Aristotelian for a bit, while it is essentially an entry in a directory, there is also accidentally something associated with it that is an owned, tradable right – the right to specify which IP address the name in the directory will be listed against. Since the domain name owner could, voluntarily, sell or hand over his rights to the domain name to the government, then, given appropriate legal power, the government can perfectly well take it.

If the domain name system were not accidentally based on tradable rights – if names were allocated arbitrarily and finally by a central domain authority – then there would be no basis for the State of Kentucky to order the domain authority to change the use of that domain name. The system could work that way, but as a matter of fact it doesn’t (at least, not for .com domains), and the State can order a body subject to its law to hand over the contractual rights to it, as it could order it to hand over physical property or assign other assignable contractual rights.

Whee – I was wrong again.

My argument would apply if the State proceeded against the owners of the domain names themselves. It appears, looking at the details, that they went directly to the internet registries, and demanded ownership of the domains, without reference to who actually owned them or used them.

My original rationalisation of the process was that the State was effectively forcing the owner of the domain name rights to transfer the rights to the State, under some law that gave them power to do so. To simply announce that the rights now belong to it, without asserting jurisdiction over, or even identifying, the owners, is something else.

With regard to actual devices physically present in the State of Kentucky, it is reasonable that the State might have the power to seize them irrespective of who owned or operated them. But without establishing jurisdiction over the owners of the domain names, it’s more difficult. It comes down to my point that the domain name itself (an entry in a register) is not the same thing as the contractual right to control that entry. Only the second is actually property, and therefore only the second can actually be seized. Even if, by a stretch, the domain name itself is classed as some kind of abstract “device” used in gambling, the rights are something else.

Courts generally try to be sensible, even when the formulation of the laws is downright weird. The 44-page PDF of the court’s opinion contains justification for considering the domain names as property, and justification for proceeding against them without reference to who actually owns or controls them. It then goes very badly astray (emphasis mine):

As the evidence in the record stands, the Defendants 141 Domain Names transport the virtual premises of an Internet gambling casino inside the houses of Kentucky residents, and are not providing information or advertising only.

The reason for that conclusion is indeed the confusion between the domain name as a name, the domain name as an entry in a directory, and a domain name as a contractual right to control the entry in the directory.

The court reasonably concludes that the right to control the domain name’s entry is property. It then observes the name all over every page of the casino website, and concludes that “the presence of … the internet domain names … is continuous and systematic”. However, only the name itself has a continuous presence; the directory entry is only referenced once, by the name resolver in the user’s operating system when they first go to the site. That might be enough to justify seizing the trademark, as in the Mongols case, but not the directory entry.

Good reason to buy fakes

Luxury watchmakers are apparently starting to add anti-forgery features to make it easier to identify counterfeits (via Division of Labour).

How embarrassing for them to have to admit that the cheap knock-offs are otherwise indistinguishable from the real thing.

As I’ve said before, the legitimate purpose of trademark law is to protect the consumer from inferior goods. A technical or artistic innovation is a positive externality, which copyright and patent law are designed to internalize. A brand established by costly marketing is of no general benefit, and cannot be deserving of state protection except inasmuch as it is a guarantee to the consumer of superior quality.

Perception of Value

The most revealing aspect of the row over Prince’s release of his album in the Mail on Sunday is the choice of words by the co-chairman of the Entertainment Retailers’ Association: “It would be yet another example of the damaging covermount culture which is destroying any perception of value around recorded music”.

The “perception of value around recorded music” is the music industry’s main asset – one it spent billions creating. New, fashionable recordings are sold at a very high markup. However, this does not mean that the industry is reaping huge profits, because the cost of making new recordings fashionable is very high.

That is not a criticism of the industry – there is no reason why they should not market their product in that way. However, it does mean that, should new technology make their current strategy impossible, we should not conclude that no other strategy is possible. Without the “perception of value” and the high gross margins it produces, the heavy promotion of popular music would not be feasible, but its production and distribution would be.

The main thread of the Prince story – the dispute between the producer and the high-street retailers – is not interesting. In any industry which sells its products through shops, the retailer performs a double function of physical distribution and advertising. The extra benefit that the producer receives from the retailer’s advertising is usually paid for in one way or another, and keeping “the channel” happy is a concern across industries. Disputes such as this between a producer and a retailer are commonplace.

Rockbox

Excited by the EMI/Apple announcement of imminent DRM-free downloads, I checked whether my audio player — an iAudio M5 — could play the AAC format that Apple sells. I found that it couldn’t, but that the open-source Rockbox player software, which can, has recently been ported to the M5.

I’ve installed it, and it works. I like the plugins – there’s a chess program, and a sudoku. The metadata database feature doesn’t seem to work, and the interface is sometimes slow to respond, which is irritating. (It can take a couple of seconds sometimes for a submenu to come up, and if you’ve repeated your selection, the extra events then take effect afterwards).

These are quibbles; I’m very impressed with rockbox. I’ll dig into the database issue over the Easter weekend, and maybe come up with some patches if I can work out what it’s supposed to be doing.

There are other obstacles to taking advantage of the Apple thing: there is some question as to whether rockbox can play 256kbps AAC in realtime, but I suspect on the M5 it can, as it has a more powerful CPU than some of rockbox’s older targets. I also understand you can’t just buy iTunes from the web, you need to install the software. Apple may change that, or I might be able to get it working with Wine.

There is also a question as to whether the iTunes offer is value for money. I currently get music by buying CDs from the likes of Play or 101cd, typically at GBP5-8 each (very little music that has come out in the 21st century has interested me). I will try it out if I can, just because it’s a step that has to be encouraged.