The Four Generations
Peer-to-peer file sharing has a long history dating back to the 1970s, and this method of sharing information among users has evolved considerably since then. The significant changes to the file-sharing technology known as “peer-to-peer” (P2P) are described here as the four generations of P2P. Understanding this evolution is important for appreciating why existing firewalls and other network technologies are not fully able to protect networks against the security threats and risks associated with P2P.
First P2P Generation: Server-Client
To understand peer-to-peer file sharing, and what was in fact its very first implementation, you need to go back to a time before the popularized form of the Internet as we know it. Peer-to-peer file sharing was first used on WWIVnet, a network similar to the Internet. WWIVnet resembled FidoNet, but it used a distributed model of nodes in which traffic was routed based on the shortest distance between nodes. It worked very much like the Internet, but without a constant, always-on connection. The Internet existed before WWIVnet, but it was available only to academic institutions, governments and large corporations. FidoNet was a hierarchical (server/client) network and therefore not peer-to-peer. WWIVnet was the first widely available distributed network model that you could bring into your home. It did not, however, have a built-in capability to share files. Only with the introduction of Linker34 by Jayson Cowan did the first P2P application appear on a distributed end-user network. Requests for file lists and for specific files were handled by the peer itself, much as in second-generation peer-to-peer file sharing, and no central server was involved in the process.
The first generation of peer-to-peer file-sharing networks on the Internet used a centralized server system that controlled traffic among the users. The servers stored directories of the users’ shared files, and these directories were updated whenever a user logged on. In this centralized model, a user sent a search for the desired file to the central server, which returned a list of peers holding the data and facilitated the connection and download. The server-client system was efficient because the central directory was constantly updated and all users had to be registered to use the program. However, the central server was also a single point of failure: if it went down, the whole network could collapse. In addition, the directory could contain out-of-date information or broken links if the server was not refreshed.
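The centralized lookup described above can be illustrated with a minimal sketch. The class name `CentralIndex`, its methods and the peer addresses are hypothetical and are not taken from Napster or any other real system; the sketch only assumes that the server keeps a directory mapping file names to the peers that share them.

```python
from collections import defaultdict


class CentralIndex:
    """Hypothetical central directory: maps file names to the peers sharing them."""

    def __init__(self):
        self._peers_by_file = defaultdict(set)

    def login(self, peer, shared_files):
        # A peer reports its list of shared files when it logs on.
        for name in shared_files:
            self._peers_by_file[name].add(peer)

    def logout(self, peer):
        # Without this refresh the directory would keep handing out stale entries.
        for peers in self._peers_by_file.values():
            peers.discard(peer)

    def search(self, name):
        # The server only answers "who has it"; the download itself is peer to peer.
        return sorted(self._peers_by_file.get(name, set()))


index = CentralIndex()
index.login("alice:6699", ["song.mp3", "notes.txt"])
index.login("bob:6699", ["song.mp3"])
print(index.search("song.mp3"))  # ['alice:6699', 'bob:6699']
index.logout("alice:6699")
print(index.search("song.mp3"))  # ['bob:6699']
```

If the single `CentralIndex` instance disappears, no peer can find any other, which is the weakness of this design.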
The first file-sharing programs on the Internet were characterized by queries to a server, which either held the data ready for download itself or referred the user to the peers and nodes that had it, so that the download could take place there. Two examples were Napster (now a paid service) and eDonkey2000 in its server-based version (it has since also become decentralized through the Overnet and Kad networks). Another notable peer-to-peer file-sharing program, which also offered a free version, is LimeWire; as a Gnutella client, however, it belongs technically to the decentralized second generation.
Second P2P Generation: Decentralization
After Napster encountered legal troubles, Justin Frankel of Nullsoft set out to create a network without a central index server, and Gnutella was the result. Unfortunately, the Gnutella model, in which all nodes are equal, soon broke down: as Napster refugees poured in and the network grew, bottlenecks developed. FastTrack solved this problem by making some nodes ‘more equal than others’.
By electing some higher-capacity nodes to be indexing nodes, with lower-capacity nodes branching off from them, FastTrack allowed for a network that could scale to a much larger size. Gnutella quickly adopted this model, and most current peer-to-peer networks implement this design, as it allows for large and efficient networks without central servers. WinMX also falls into this category.
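The idea of promoting higher-capacity nodes to indexing nodes can be sketched as follows. This is an illustration only: the actual election criteria used by FastTrack are not described here, so the sketch simply assumes a single numeric capacity per node, and the function names `elect_supernodes` and `attach_leaves` are hypothetical.

```python
def elect_supernodes(capacity_by_node, count):
    """Pick the `count` highest-capacity nodes as indexing nodes."""
    # Ties are broken by node name so that the result is deterministic.
    ranked = sorted(capacity_by_node, key=lambda n: (-capacity_by_node[n], n))
    return ranked[:count]


def attach_leaves(capacity_by_node, supernodes):
    """Spread the remaining nodes across the indexing nodes round-robin."""
    leaves = sorted(n for n in capacity_by_node if n not in supernodes)
    assignment = {s: [] for s in supernodes}
    for i, leaf in enumerate(leaves):
        assignment[supernodes[i % len(supernodes)]].append(leaf)
    return assignment


capacities = {"a": 100, "b": 5, "c": 80, "d": 2, "e": 10}
supers = elect_supernodes(capacities, 2)
print(supers)                             # ['a', 'c']
print(attach_leaves(capacities, supers))  # {'a': ['b', 'e'], 'c': ['d']}
```

A search then only needs to visit the indexing nodes, each of which answers on behalf of its attached leaves, instead of flooding every node in the network.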
Also included in the second generation are distributed hash tables (DHTs), which help solve the scalability problem by electing various nodes to index certain hashes (which are used to identify files), allowing for fast and efficient searching for any instances of a file on the network. This is not without drawbacks; perhaps most significantly, DHTs do not directly support keyword searching (as opposed to exact-match searching).
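A minimal sketch can show both the strength and the drawback of a DHT. It assumes a Kademlia-style rule in which a key is stored on the node whose ID is closest to it by XOR distance. The class `TinyDHT` is hypothetical, runs in a single process, and hashes the file name for simplicity, whereas real networks usually hash the file contents.

```python
import hashlib


def key_of(text):
    """Hash text to a 160-bit integer key."""
    return int.from_bytes(hashlib.sha1(text.encode("utf-8")).digest(), "big")


class TinyDHT:
    """Hypothetical in-process DHT; every node indexes only the keys near its ID."""

    def __init__(self, node_names):
        self.name_by_id = {key_of(name): name for name in node_names}
        self.store = {name: {} for name in node_names}

    def responsible_node(self, key):
        # Kademlia-style rule (an assumption of this sketch): closest ID by XOR.
        closest_id = min(self.name_by_id, key=lambda node_id: node_id ^ key)
        return self.name_by_id[closest_id]

    def publish(self, file_name, holder):
        key = key_of(file_name)
        self.store[self.responsible_node(key)].setdefault(key, set()).add(holder)

    def lookup(self, file_name):
        key = key_of(file_name)
        return sorted(self.store[self.responsible_node(key)].get(key, set()))


dht = TinyDHT(["node-1", "node-2", "node-3", "node-4"])
dht.publish("ubuntu.iso", "peer-a")
print(dht.lookup("ubuntu.iso"))  # ['peer-a']
print(dht.lookup("ubuntu"))      # [] because a keyword fragment hashes to a different key
```

The second lookup returns nothing because hashing destroys any similarity between "ubuntu" and "ubuntu.iso", which is why keyword search needs an additional index on top of the DHT.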
The best examples are Gnutella, Kazaa and eMule with Kademlia, although Kazaa still uses a central server for logging in. Together, eDonkey2000/Overnet, Gnutella, FastTrack and Ares Galaxy had approximately 10.3 million users (as of April 2006, according to slyck.com). This number does not necessarily correspond to the actual number of people who use these networks, since some of them must be assumed to run multiple clients for different networks.
Third P2P Generation: Indirect, Encrypted and Anonymous P2P
The third generation of peer-to-peer networks comprises those that have anonymity features built in. Examples of anonymous networks are ANts P2P, RShare, Freenet, I2P, GNUnet and Entropy.
A degree of anonymity is realized by routing traffic through other users’ clients, which have the function of network nodes. This makes it harder for someone to identify who is downloading or who is offering files. Most of these programs also have strong encryption to resist traffic sniffing.
Friend-to-friend networks allow only already-known users (also known as “friends”) to connect to the user’s computer; each node can then forward requests and files anonymously between its own friends’ nodes.
Third-generation networks have now reached mass usage for file sharing in countries where very fast Internet access is commonplace. In countries such as the US, the UK and Japan, a number of anonymous file-sharing clients have already become highly popular, with users numbering in the tens of millions.
An example might be: Steve gives a file to Mike, and Mike then gives the file to Ann. Steve and Ann never learn each other’s identity and are thus protected. Virtual IP addresses are often used to obfuscate the users’ network locations, so that Steve knows only the virtual IP of Ann. Although real IP addresses are always necessary to establish a connection between Steve and Mike, nobody knows whether Ann really requested the file and Steve really shared it, or whether they were merely forwarding it (as long as they do not tell anyone their virtual IPs).
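The forwarding in the Steve, Mike and Ann example can be sketched as follows. The `Node` class and its fields are hypothetical, and the sketch leaves out virtual IP addresses and encryption entirely; it only shows that each node sees its direct neighbour and nothing beyond it.

```python
class Node:
    """Hypothetical relay node that only ever sees its direct neighbours."""

    def __init__(self, name, files=None):
        self.name = name
        self.files = dict(files or {})
        self.neighbours = []
        self.seen_from = []  # direct neighbours that sent this node a request

    def connect(self, other):
        self.neighbours.append(other)
        other.neighbours.append(self)

    def request(self, file_name, sender=None, ttl=4):
        if sender is not None:
            self.seen_from.append(sender.name)
        if file_name in self.files:
            return self.files[file_name]
        if ttl == 0:
            return None  # hop limit reached, give up on this path
        for neighbour in self.neighbours:
            if neighbour is sender:
                continue
            # Forward on behalf of whoever asked; the next hop sees only this node.
            data = neighbour.request(file_name, sender=self, ttl=ttl - 1)
            if data is not None:
                return data
        return None


steve = Node("Steve", {"file.txt": b"hello"})
mike = Node("Mike")
ann = Node("Ann")
steve.connect(mike)
mike.connect(ann)

print(ann.request("file.txt"))  # b'hello'
print(steve.seen_from)          # ['Mike']
```

From Steve’s point of view the request came from Mike, and Steve cannot tell whether Mike wanted the file himself or was forwarding the request for someone else.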
Additionally, all transfers are encrypted, so that even network administrators cannot see what was sent to whom. Example software includes WASTE and Tor. These clients differ greatly in their goals and implementation. WASTE is designed only for small groups and may therefore be considered a darknet; ANts and I2P are public peer-to-peer systems in which anonymization is provided exclusively by routing.
Fourth P2P Generation: Streams over P2P
Apart from traditional file sharing, there are services that send streams instead of files over a P2P network. This makes it possible to listen to radio and watch television without any server involved, because the streaming media is distributed over the P2P network itself. An important point is that, instead of a tree-like network structure, these services use the swarming technique known from BitTorrent. Examples include Peercast, Miro and Wuala.
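The difference between swarming and a fixed tree can be illustrated with a toy simulation. It is a sketch under strong assumptions: the stream is cut into numbered chunks, every peer can reach every other peer, and each peer copies at most one missing chunk per round. The function name `swarm_round` is hypothetical, and the code is not the BitTorrent protocol.

```python
def swarm_round(have):
    """One exchange round: each peer copies one missing chunk from any peer holding it."""
    # Work from a snapshot so that chunks received in this round are not passed on yet.
    snapshot = {peer: set(chunks) for peer, chunks in have.items()}
    for peer in sorted(have):
        for other in sorted(snapshot):
            wanted = sorted(snapshot[other] - have[peer])
            if other != peer and wanted:
                have[peer].add(wanted[0])
                break
    return have


# Each peer starts with a different chunk of a three-chunk stream.
have = {"a": {0}, "b": {1}, "c": {2}}
rounds = 0
while any(chunks != {0, 1, 2} for chunks in have.values()):
    swarm_round(have)
    rounds += 1

print(rounds)  # 2
```

No peer acts as the root here: every peer both receives and supplies chunks, so the load is shared across the swarm rather than concentrated at the top of a tree.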