Thesis Introduction Draft

Before a censorship-resistant publication system can be designed, we must clarify what censorship resistance means. What is being censored, who is censoring it, and how does the system resist this censorship?

We will confine the scope of the project to censorship of websites. There are many other types of communication: printed publications, person-to-person conversations, online chats, mailing lists, newsgroups. However, in order to determine the tolerances of the system in aspects such as latency and throughput, it is useful to limit the scope to a single network protocol and publication model.

In order to define censorship and censorship resistance, we will have to explore attackers. The first step in designing a censorship-resistant publication system is to determine what attackers the system is designed to protect against and the relative priority of these attackers. This is an often overlooked step, as systems are designed to provide "perfect" security against "all" attackers. This is equivalent to prioritizing all attackers equally or randomly. Since not all attacks are equally likely or costly in the real world, this leads to a suboptimal design. The best possible design for solving real censorship issues can only be achieved by first examining what censorship attacks are currently occurring in the world, their relative cost, and the relative severity of the results of the attacks.

A necessary prelude to choosing attackers is to first determine what attacks have actually occurred. To this end, I have scanned through the Your Rights Online section of the popular news site slashdot.org. I have selected from these stories of online censorship a number of real attacks against online content. From these examples we can estimate the frequency and severity of attacks and the resources of the attackers.
From this information we can determine a general set of vulnerabilities which open websites to attack, and from there determine what characteristics a censorship-resistant system needs in order to resist the attacks which actually occur in the real world.

Attackers

RIAA

The RIAA's purported goal is to crack down on sharing of music not authorized by the copyright holder. However, in pursuing this goal they have also curtailed much authorized music sharing. The RIAA has in the past shut down sites aimed at authorized music distribution under the pretense that they might also be distributing unauthorized music. Since there is no technical method to determine whether a music file contains content under the jurisdiction of the RIAA, there is no technical means to prove that no RIAA music is being shared. The only mechanism to do this is legal: going to trial against the RIAA's lawyers, a costly venture with an unknowable outcome. Thus, the RIAA can attack whomever they want.

Their means of attack is legalistic. They send threatening letters to websites, companies that make peer-to-peer file sharing software, and users of that software. Other similar associations such as the Harry Fox Agency have used the same technique to shut down websites with material such as guitar tablature and television show transcripts. It is important to note that none of these sites have been found to actually be doing anything illegal. No court case has set a precedent on the legality of hosting such content. The RIAA and similar organizations need not prove that their claims of illegality are valid, as a threatening letter alleging illegal activity is sufficient to shut down most websites. If the website maintainer does not comply, then a similar letter to the ISP will usually be effective.
The RIAA has increased the severity of its attacks by including a statement in its letters to Kazaa users that, in order to avoid litigation, they must send the RIAA money, ranging from $3,000 to $7,000. While sending this money provides no actual legal protection for the recipient of the letter, it has caused the RIAA to receive a significant number of checks for the simple act of sending a letter.

Hackers

High-profile websites, both commercial and politically controversial, are common targets of hackers. Such attacks are usually brief, one-time affairs. The motivations range from the pride and status gained from taking down a high-traffic commercial site, or the personal satisfaction gained from disabling a site considered to represent something distasteful, to a desire to educate the world about dangerous security vulnerabilities which site maintainers and software vendors attempt to conceal. There are hackers in many countries all over the world with different beliefs and cultures, so their motivations and targets vary. Their attacks, however, fall into two basic categories: security exploits and denial of service (DoS) attacks.

Security exploits are used to gain access to the computer that hosts the website, at which point the website can be shut down or the content changed in ways that are either blatant or subtle and humorous. Defending against security exploits is an open problem which is outside the scope of this research and so will not be dealt with here. This issue is best addressed by running an operating system with fewer severe security vulnerabilities and ensuring that the administrator keeps the machine up to date on critical security patches.

Even secure machines, however, are vulnerable to a denial of service attack. A DoS attack uses up the resources available to the site until they are exhausted, leaving the machine unable to serve legitimate clients. Examples of such resources are CPU time and network bandwidth.
If the webserver is unable to keep up with the number of HTTP requests received, then it will be unable to serve pages to legitimate users. Similarly, if there is no network bandwidth available, then no pages can be sent.

Scientologists

Many claims have been made about the Church of Scientology and its activities. While not all of these claims can be verified as organized efforts by the organization, such alleged attacks are important at the very least because they are a perceived threat which publishers will worry about when publishing information which they think might draw the ire of such an organization. Therefore, a censorship-resistant publication system should protect against such attacks.

In addition to using legalistic threats similar to those of the RIAA, the Church of Scientology has been alleged to threaten physical violence against maintainers of websites publishing information alleging wrongdoing on the part of the church. Members of the church were allegedly given the names, addresses, and other personal information of the maintainers of anti-Scientology websites and encouraged to give them a hard time in various unspecified ways. This is an interesting attack to note because it is the only attack which bypasses the abstract world of online, legal, and financial threats and goes directly to the real-world physical body of the site maintainers. While hypothetical situations involving political dissidents being shot for using circumvention software in totalitarian regimes are often discussed, here is a real, or allegedly real, example in the United States where running a website might result in physical violence to your person or property.

The Government of China

The Chinese government has a unique and very interesting method of censoring websites. All Internet traffic between China and the rest of the world passes through government-owned computers.
These computers scan Internet connections to see if they are connecting from inside China to an outside computer on a blacklist of known banned sites. If so, the connection is rejected. This stops people inside China from connecting to news sites outside of China which carry news critical of the Chinese government. Creating news sites critical of the Chinese government is illegal inside China, so such sites are dealt with by the police.

Similarities in Attack Methods

The RIAA uses a legalistic method: sending threatening letters written in legal language to maintainers of websites, software companies, individuals, and ISPs, sometimes just demanding that the website be shut down and sometimes demanding money as well. Hackers employ technical methods such as using up the CPU time and network bandwidth of the website so that it cannot serve legitimate users. The Church of Scientology allegedly uses a direct method, threatening critics with physical violence. The government of China uses a technical method of blocking traffic leaving China with a destination on a blacklist of known sites providing material critical of the government.

These methods might at first seem totally different, but they do have a similar basis. For the RIAA to send a threatening letter to an individual, they must obtain this individual's address. If the individual owns the domain name of the site, the RIAA can find their address from the domain name registrar, which requires contact information from anyone buying a domain name. In the case of people using peer-to-peer file sharing, which does not require a domain name to find servers providing the desired information, they find the IP address of the computer serving the information and then ask the ISP providing that IP address for the customer's address, as the ISP can be easily determined given the IP address.
The key vulnerabilities, then, which allow the RIAA to attack an individual running a website are that the domain name and IP address can be traced back to the individual. The alleged attacks of the Church of Scientology used exactly the same methodology as the RIAA in identifying the individual whom they wish to attack. So, while the actual method of attack is different, the vulnerability is the same: the connection between an individual and the IP address and domain name of their website.

The attack method of the Chinese government has a similar requirement. They block Internet connections by destination IP address. Since they have total control of Internet connections leaving China, there is no need to use the IP address to find the person who maintains the website. They simply put the IP address on their blacklist, and the site is then effectively shut down for everyone in China.

The DoS attacks of the hackers, however, rely on a very different vulnerability: the limited resources of the machine hosting the website. This attack can be defeated simply by having more resources than a hacker can exhaust. This is why you won't see a successful DoS attack against, say, Google. Google has local mirrors in countries all over the world connected by a dedicated high-speed backbone, each mirror consisting of a huge load-balanced cluster. Their average traffic far exceeds that of the largest DoS attacks any hacker has managed to muster so far. However, most websites are hosted on a single machine with a limited amount of bandwidth, or with bandwidth pricing which quickly skyrockets once a meager allotment is exceeded. Many small websites get shut down every year due to the accidental DoS attack caused by having their site linked to from Slashdot. The increased traffic either crashes their webserver or causes their ISP to shut down their account or charge them an outrageous amount of money for the bandwidth consumed.
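The dependence of a DoS attack on limited resources can be made concrete with a little arithmetic. The following is a minimal sketch, not a model of any real server: it assumes that the host cannot distinguish attack requests from legitimate ones and serves a uniform share of whatever arrives, up to a fixed capacity. The function name and all of the numbers are hypothetical and chosen only for illustration.

```python
def fraction_served(capacity, legitimate, attack):
    """Fraction of legitimate requests answered per second.

    Assumption: the server treats all requests alike, so when the
    total load exceeds capacity, every request (legitimate or not)
    has the same chance, capacity / total, of being served.
    """
    total = legitimate + attack
    if total <= capacity:
        return 1.0
    return capacity / total


# Hypothetical figures: a single host handling 100 requests/second,
# 50 legitimate requests/second, and an attacker sending 950/second.
single_host = fraction_served(capacity=100, legitimate=50, attack=950)

# The same attack against twenty mirrors of the same (assumed) size.
mirrored = fraction_served(capacity=100 * 20, legitimate=50, attack=950)

print(f"single host serves {single_host:.0%} of legitimate requests")
print(f"mirrored site serves {mirrored:.0%} of legitimate requests")
```

Under these assumed numbers the single host answers only a tenth of its legitimate requests, while the same attack against twenty times the capacity leaves legitimate users unaffected.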
There are, therefore, two major vulnerabilities which websites must overcome in order to resist censorship. The IP address of the site can be used either to blacklist the site or to find the person maintaining the site and legally or physically harass them, and the machine hosting the site can be overwhelmed to the point that it is no longer able to serve the site. These are very different problems with different solutions; however, their solutions can be integrated into a single comprehensive system which provides useful censorship resistance against significant real-world attackers.

Other Attacks

A number of possible attacks have been ignored because there is insufficient evidence that they actually matter. In particular, legal attacks have been ignored. The RIAA attacks are, as has been pointed out, not actually legal in nature, but rather legalistic. The system specified herein is designed to protect people who are not engaging in illegal activity but are nonetheless victims of censorship. There are many illegal activities which are not being addressed, such as child pornography, fraud, sedition, or black market commerce in items such as drugs, assassination politics, and bootleg DVDs. Much of this has already been covered elsewhere in the literature. This needs to be pointed out not to defend the legitimacy of the work contained herein, but because there are a number of considerations which must be taken into account to design a system which can support such activities, and these are ignored here.

Necessary Properties of a Censorship-Resistant Publication System

Having decided on the scope of the system and what sort of attackers it should defend against, it is possible to determine what properties such a system needs in order to accomplish its goals. The system needs properties which allow it to overcome the vulnerabilities which allow websites to be censored. The system should not, however, interfere with the medium which it is attempting to protect.
Since our chosen medium is websites, our publication system must reproduce some of the properties of websites. It must have reasonably low latency so that realtime browsing is possible. It must integrate with web browsers. It must be able to reliably publish sites containing multiple files including images, JavaScript, and stylesheets. It should also obey HTTP caching conventions so as to integrate best with both web servers and web browsers.

In order to protect against RIAA, Church of Scientology (CoS), and China attacks, the system needs to hide the IP address of the webserver from clients. If the attacker cannot determine the IP address of the server, it cannot be blacklisted and it cannot be used to find the maintainer of the website.

In order to protect against DoS attacks, the system must include a mechanism for reducing the load on the webserver, both CPU load and bandwidth consumption. This amounts, basically, to reducing the number of requests that the webserver must handle, via caching or mirroring.

Other Properties

There are a number of properties which other censorship-resistant system designs have attempted to attain, but which are not included here because there is insufficient evidence that they are useful. High among their ranks is "plausible deniability", which is the property that a user of the system can claim that files which they have downloaded were not downloaded intentionally but were placed there by the system. This property is not considered useful as it is a legal protection and we include no legal attackers in our attack model. Plausible deniability does not stop an RIAA or CoS attack, but hiding your IP address will.

Another major property which is not included in our system but which is a major aspect of much research in this area is the unlinkability of the client and the server. A number of sophisticated techniques such as mixing and chaffing and winnowing are unnecessary if this is not a desirable property.
We ignore this property because it assumes an attacker who is monitoring both ends of the connection and attempting to determine which clients are talking to which servers. We have no attacker with this capability in our attack model. This simplifies our problem somewhat, but unfortunately it also makes a large body of work in this area irrelevant.

Secondary Attacks

On the one hand, only the primary attacks are really important. If a system can protect against those attacks, then it is already better than every other system, as there do not currently exist any systems which provide adequate protection from the attackers in our attack model. On the other hand, it would be negligent not to speculate to some degree about what attacks might be adopted to counteract the protections given by this system. Even without fully specifying the system, some general properties of the system can be anticipated and attacks can be projected based on those properties.

It can be assumed that the solution will take the form of software, which will run on a number of computers, which will form a censorship-resistance network. Obvious attacks, therefore, are to attack the creators of the software, the website that hosts the software, the network, or the people who operate nodes in the network. The methodology of attack is likely to be very similar to the methods of primary attack. In particular, the website that distributes the software is just another website and so can be censored using the same attacks used against other websites. It's possible that the same system can be used to protect this website as other websites. However, if a protected website cannot be reached without using the software, then the initial website where the software is acquired will be unable to use the system for protection. This becomes somewhat of a chicken-and-egg problem, but is probably tractable.
Attacks on the network are more difficult than attacks on an individual site because there is presumably a large and ever-changing membership of active nodes. The general attacks here are to shut down certain key nodes in order to cripple the network, to corrupt certain key nodes and turn them to evil, to run a large number of nodes and insinuate them into key positions in the network, to run a large enough number of nodes to overwhelm the network and take it over, or to shut down all nodes in the entire network. The general solution to selectively destroying or corrupting key nodes is to make a network which does not have key nodes, but where all nodes are equal and where the network checks itself for nodes which are not behaving properly. The solution to overwhelming the network with evil nodes is to make it costly to run a node and to have a large legitimate network, so that building a large evil network will cost more than the attacker can afford. The solution to shutting down all of the network is to make it too costly for the attacker to find the IP addresses of all of the nodes in the network. If the attacker cannot find all of the nodes, then they cannot all be shut down.
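The argument about overwhelming the network with evil nodes can be illustrated with a rough cost estimate. This is only a sketch under stated assumptions: the fraction of the network an attacker must control in order to "take over" is not specified here, so it is left as a parameter, and the per-node cost, the attacker's budget, and the function names are all hypothetical. If there are H legitimate nodes and the attacker wants to control p percent of the whole network, the attacker needs at least p * H / (100 - p) nodes.

```python
def attacker_nodes_needed(honest_nodes, target_percent):
    """Smallest n with n / (n + honest_nodes) >= target_percent / 100.

    Integer arithmetic is used so that the result is exact.
    """
    if not 0 <= target_percent < 100:
        raise ValueError("target_percent must be in the range [0, 100)")
    numerator = target_percent * honest_nodes
    denominator = 100 - target_percent
    return -(-numerator // denominator)  # ceiling division


def attack_cost(honest_nodes, target_percent, cost_per_node):
    """Total cost of running enough nodes to reach the target share."""
    return attacker_nodes_needed(honest_nodes, target_percent) * cost_per_node


# Hypothetical figures: the attacker wants half of the network,
# each node costs 20 units to run, and the attacker can spend 50,000.
budget = 50_000
small_network = attack_cost(honest_nodes=1_000, target_percent=50, cost_per_node=20)
large_network = attack_cost(honest_nodes=10_000, target_percent=50, cost_per_node=20)

print("small network attack affordable:", small_network <= budget)
print("large network attack affordable:", large_network <= budget)
```

With these assumed figures the attack is affordable against the small network and unaffordable against the large one. Both levers named above appear in the estimate: raising the cost of running a node and enlarging the legitimate network each raise the attacker's total cost proportionally.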