File sharing

From Example Problems

File sharing is the practice of making files available to other users for download over the Internet and smaller networks. Usually file sharing follows the peer-to-peer (P2P) model, where the files are stored on and served by personal computers of the users. Most people who engage in file sharing are also downloading files that other users share. Sometimes these two activities are linked together. File sharing is distinct from file trading in that downloading files from a P2P network does not require uploading, although some networks either provide incentives for uploading such as credits or force the sharing of files being currently downloaded.

History

Napster, originally a centralized system, was the first major file-sharing tool and brought file sharing to a mass audience. It provided a central index of the MP3 files shared by the users logged into the system, along with IRC-like chat and instant-messaging features. Many later clients follow its example in design. Limited to MP3 sharing, Napster was eventually shut down as a result of lawsuits brought by the music industry. It was openly attacked by some artists (notably Dr. Dre and Metallica) and supported by others (Mötley Crüe, Limp Bizkit, Courtney Love, Dave Matthews, David Crowder Band).

There was widespread media coverage of unreleased Madonna songs leaking onto the web before their official commercial release, but there was no evidence that this hurt sales. A similar leak of Radiohead's album Kid A has even been cited as evidence that Napster could stimulate sales. Tracks from Kid A appeared on Napster three months before the CD's release, and millions of people had downloaded the music by the time it reached record stores. The album was not expected to sell well: it was an experimental work by a band that had never reached the Top 20 in the US, it received very little marketing, and few radio stations played it, so Napster was expected to erode whatever market was left. Instead, when the CD was released, Radiohead went to the top of the charts. Having put the music in the hands of so many people, Napster appears to have been a force behind this success. Nonetheless, the record industry was reluctant to credit a company it was suing.

Even before Napster's legal problems, the community had created an alternative: OpenNap. A reverse-engineered implementation of the Napster protocol, it was released as an open source server for Napster users. OpenNap networks continued to exist after Napster's collapse, and many clients using the protocol have appeared, helped in particular by the Napigator server list, an effort to bring all of the different servers and networks together in one place.

Afterward, a decentralized network known as Gnutella appeared. This service is fully open source and allows users to search for almost any file type; users can find more than just MP3s on these networks. It was created in response to the threat posed toward any centralized body like Napster. The purpose behind decentralization is to prevent any single broken link from compromising the entire network.

Napster and Gnutella continue to define file sharing today. They sit at opposite legal extremes in the wake of the civil lawsuits that the RIAA began filing against computer users in September 2003: Gnutella is a free and open protocol and service, while Napster has been resurrected as a commercial online music service that competes with other commercial services such as iTunes and Rhapsody. Most file-sharing systems since have tried to position themselves somewhere between these two extremes.

Today a variety of file-sharing programs are available on several different networks. Availability depends partly on operating system, and different networks have different features (for example, multiple-source downloads, different sorts of search limiting, and so on). Commercial file-sharing clients commonly bundle intrusive advertising software or spyware, while non-commercial ones usually do not.

Network architecture

There are several major issues surrounding file sharing. Of these, the two most important are centralization vs decentralization and the privacy and anonymity of users. The latter takes on added importance when the legality of file-sharing is challenged by some copyright owners. A third issue is the collection and sale of data about users, using software referred to by its detractors as "spyware".

In the early days, client software was protocol-specific, so there were "Napster" clients and "Gnutella" clients. There is an ever-present push toward graphical clients that can use multiple protocols. The argument is that a user should not have to load several different applications to do what is, in their mind, the same thing.

Where users value building collections, some people will have a great deal to share and will find themselves surrounded by eager downloaders. This causes problems when the collector cannot keep up with demand. Decentralization is one means to alleviate this problem, especially in cases where it is possible to ensure that multiple copies of a popular item are available from multiple sources (even simultaneously, as with multi-source downloading).
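
Multi-source downloading can be pictured as splitting a file into byte ranges and spreading those ranges over the peers that hold a copy. The following is only a minimal sketch under simple assumptions: the function name `plan_chunks`, the round-robin assignment, and the peer labels are hypothetical, and real clients also weigh peer speed and availability.

```python
def plan_chunks(file_size, chunk_size, sources):
    """Assign each byte range of a file to a source, round-robin.

    Returns a list of (start, end, source) tuples, where `end` is exclusive.
    This is an illustrative scheme, not any specific client's scheduler.
    """
    if chunk_size <= 0 or not sources:
        raise ValueError("need a positive chunk size and at least one source")
    plan = []
    for index, start in enumerate(range(0, file_size, chunk_size)):
        end = min(start + chunk_size, file_size)  # last chunk may be shorter
        plan.append((start, end, sources[index % len(sources)]))
    return plan
```

With two sources, a 10-byte file and 4-byte chunks, the plan alternates between the sources, so no single collector has to serve the whole file.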

Decentralization has also been pushed as a means of overcoming the threats posed to a centralized network, either by legal disputes or hostile users. A decentralized network has no central body to attack; only its individual active members may be targeted, and even if a small portion of them is removed, the remaining peers on the network will still be able to function.

Leeching, or hoarding, occurs when a person collects files from others but refuses to make them available in return. Trade and ratio systems evolved to reduce the impact of leeching. Under these systems, a person shares when he can expect to get something in return. KaZaA, for instance, has a very simple rating system: the client calculates the user's priority and tells the sources what level of downloading priority they should give that user. Because the client reports its own level, hacked clients soon appeared that told the sources that the user had one of the highest priority levels regardless of his actual sharing.

Another client with a rating system is eMule. The eMule client, which uses MFTP as its protocol, tracks how much has been downloaded from and uploaded to each individual peer. At times this rating system does not seem to have a large effect on download speed; the size of the upload queue and the chunk size may be the reason. When an upload slot becomes free, the client takes the peer at the top of the queue, transfers 8 MB to it, and moves it to the end of the queue. A peer with a rating of x2 has to wait only half as long for an upload slot as a peer with a rating of x1. Furthermore, after a client has received an 8 MB chunk, it should upload an 8 MB chunk to the sending peer as soon as possible if that peer has a download pending. The other client will then upload another chunk in return, and the download speeds of both clients increase.
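
The effect of a rating multiplier on waiting time can be illustrated with a toy queue. This is a sketch under stated assumptions, not eMule's actual implementation: the `Peer` class, the `serve_rounds` function, and the scoring rule (rounds waited multiplied by rating) are hypothetical, chosen so that a peer rated x2 reaches the front of the queue in roughly half the time of a peer rated x1.

```python
from dataclasses import dataclass

CHUNK_MB = 8  # chunk size mentioned in the text


@dataclass
class Peer:
    name: str
    rating: float         # credit multiplier, e.g. 1.0 or 2.0 (assumed model)
    waited: int = 0       # rounds spent waiting since last being served
    received_mb: int = 0  # total data received so far


def serve_rounds(peers, rounds):
    """Simulate `rounds` upload slots becoming free, one per round.

    Each round every peer waits one more round; the peer with the highest
    score (waited * rating) gets one chunk and goes back to the end of the
    queue, modelled here by resetting its waiting time to zero.
    """
    order = []
    for _ in range(rounds):
        for peer in peers:
            peer.waited += 1
        best = max(peers, key=lambda p: p.waited * p.rating)
        best.received_mb += CHUNK_MB
        best.waited = 0
        order.append(best.name)
    return order
```

Running this with one peer rated 2.0 and two peers rated 1.0 gives the higher-rated peer a larger share of the upload slots. With a long queue, though, every peer still waits many rounds between chunks, which may be why the rating often seems to matter little in practice.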

BitTorrent also ties download speed to sharing. Downloads are slow if a client does not upload, but BitTorrent can easily be the fastest protocol if the swarm is large enough.

Today we are left with a slew of clients with functionality designed around making sharing files more effective, both in the real sense of uploading and downloading (like anti-leeching functions) and in the more ethereal sense of being bulletproof toward legal issues (as with anonymity and decentralization).

Generational classification of peer-to-peer file sharing networks

Some people describe peer-to-peer file sharing networks by their "generation". This taxonomy only concerns itself with the popular internet-based file sharing networks, not earlier research- and business-oriented peer-to-peer systems, which pre-date them.

First generation

The first generation of peer-to-peer file sharing networks had a centralized file list, like Napster. Courts in the United States ruled that whoever controlled this centralized file list containing works whose copyright was being infringed was responsible for any infringement. Ultimately, Napster was held liable even if it used the most advanced technology available to identify works copyright holders had asked it to block, because no technology that can identify works with 100% certainty exists or can exist. Napster continues to operate today, but the company has taken a new direction, and is now legally distributing music under a subscription-based model.

In the centralized peer-to-peer model, a user sends the central server a search for what they are looking for (e.g., a song, video, or movie). The server then sends back a list of the peers that have the data and facilitates the connection and download.
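
The search flow described above can be sketched as a small in-memory index. This is an illustration only and does not reproduce the Napster protocol: the `CentralIndex` class and its method names are hypothetical, and the index simply maps file names to the peers that announced them. The files themselves never pass through the server.

```python
from collections import defaultdict


class CentralIndex:
    """Toy central server: knows who shares what, stores no file data."""

    def __init__(self):
        self._files = defaultdict(set)  # lower-cased file name -> peer addresses

    def register(self, peer, filenames):
        """Called when a peer logs in and announces its shared files."""
        for name in filenames:
            self._files[name.lower()].add(peer)

    def unregister(self, peer):
        """Called when a peer logs out; its files drop out of the index."""
        for peers in self._files.values():
            peers.discard(peer)

    def search(self, term):
        """Return {file name: sorted peers} for names containing `term`."""
        term = term.lower()
        return {
            name: sorted(peers)
            for name, peers in self._files.items()
            if term in name and peers
        }
```

A client would call `search`, pick a peer from the result, and then connect to that peer directly. Because every query goes through one index, whoever runs it is a single point of failure, both technically and legally.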

Second generation

After Napster encountered legal troubles, Justin Frankel of Nullsoft set out to create a network without a central index server, and Gnutella was the result. Unfortunately, the Gnutella model, in which all nodes are equal, quickly ran into bottlenecks as the network grew with the arrival of Napster refugees. FastTrack solved the problem by making some nodes 'more equal than others'.

By electing higher-capacity nodes as indexing nodes and having lower-capacity nodes branch off from them, FastTrack allowed a network to scale to a much larger size. Gnutella quickly adopted this model, and most current peer-to-peer networks follow it, since it allows for large and efficient networks without central servers.
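
The two-tier idea can be sketched in a few functions. This is a simplified illustration, not FastTrack's or Gnutella's real election logic: the function names, the single numeric capacity per node, and the round-robin attachment of leaves are all assumptions made for the example.

```python
def elect_supernodes(capacity, count):
    """Pick the `count` highest-capacity nodes as indexing nodes."""
    return sorted(capacity, key=capacity.get, reverse=True)[:count]


def attach_leaves(capacity, supernodes):
    """Attach every remaining node to a supernode, round-robin."""
    if not supernodes:
        raise ValueError("at least one supernode is required")
    leaves = [node for node in capacity if node not in supernodes]
    assignment = {s: [] for s in supernodes}
    for i, leaf in enumerate(leaves):
        assignment[supernodes[i % len(supernodes)]].append(leaf)
    return assignment


def search(assignment, shared, term):
    """Query only the supernodes; each answers for itself and its leaves."""
    hits = []
    for supernode, leaves in assignment.items():
        for node in [supernode, *leaves]:
            hits += [(node, name) for name in shared.get(node, []) if term in name]
    return hits
```

A query now touches only the few indexing nodes rather than every peer, which is what lets the network grow without a central server.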

Also included in the second generation are distributed hash tables, which solve the scalability problem by electing various nodes to index certain hashes (which are used to identify files), allowing for fast and efficient searching for any instances of a file on the network. They are not without their own drawbacks; perhaps most significantly, DHTs do not directly support keyword searching (as opposed to exact-match searching).
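
Why a DHT finds exact matches quickly but cannot search by keyword can be seen in a toy hash ring. This is a sketch and not any particular network's DHT: the `ToyDHT` class, its method names, the use of SHA-1, and the rule that a key belongs to the first node at or after it on the ring are all assumptions chosen for illustration.

```python
import hashlib
from bisect import bisect_left


def ring_id(data: bytes) -> int:
    """Map arbitrary bytes to a position on the hash ring."""
    return int.from_bytes(hashlib.sha1(data).digest(), "big")


class ToyDHT:
    """In-memory stand-in for a DHT: each node indexes a slice of key space."""

    def __init__(self, node_names):
        self._ring = sorted((ring_id(n.encode()), n) for n in node_names)
        self._ids = [node_id for node_id, _ in self._ring]
        self._store = {n: {} for n in node_names}  # node -> {key: holders}

    def responsible_node(self, key: int) -> str:
        """First node at or after `key` on the ring, wrapping around."""
        pos = bisect_left(self._ids, key) % len(self._ring)
        return self._ring[pos][1]

    def publish(self, content: bytes, holder: str) -> int:
        """Record that `holder` has a file with this exact content."""
        key = ring_id(content)
        node = self.responsible_node(key)
        self._store[node].setdefault(key, set()).add(holder)
        return key

    def lookup(self, key: int) -> set:
        """Ask the one responsible node who holds the file with this hash."""
        return self._store[self.responsible_node(key)].get(key, set())
```

A lookup goes straight to the single node responsible for a hash, so it is fast. A partial name or keyword hashes to an unrelated position on the ring, which is why keyword search needs a separate mechanism.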

Third generation

The third generation of peer-to-peer networks consists of those with anonymity features built in. Examples of anonymous networks are Freenet, I2P, GNUnet, and Entropy.

Friend-to-friend networks allow only already-known users ("friends") to connect to a user's computer; each node can then forward requests and files anonymously between its own friends' nodes. Examples of such networks are MUTE, ANts P2P, and WASTE.

Third-generation networks, however, have not reached mass usage for file sharing because of the overhead that anonymity features introduce: each intermediary that relays a file adds another full transfer, multiplying the bandwidth required to send it.
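
The bandwidth cost of relaying can be put in rough numbers. The sketch below assumes plain store-and-forward relaying, in which every intermediary receives the whole file and sends it on once. It ignores encryption padding and protocol overhead, and the function name is hypothetical.

```python
def total_transfer_mb(file_mb: float, intermediaries: int) -> float:
    """Total data sent across the network to deliver one file.

    With k intermediaries the file crosses k + 1 hops, so it is
    transmitted k + 1 times in full (simplified model).
    """
    if intermediaries < 0:
        raise ValueError("intermediaries must be non-negative")
    hops = intermediaries + 1
    return file_mb * hops
```

Under this model, a 700 MB file sent directly costs 700 MB of upload in total, while the same file relayed through three intermediaries costs 2800 MB spread across the sender and the relays.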

Copyright issues

File-sharing systems such as Gnutella and Napster grew in popularity with the proliferation of high-speed Internet connections and the relatively small file sizes and high quality of the MP3 audio format. Although file sharing is a legal technology with legal uses, many users use it to download copyrighted materials without explicit permission. This has led some copyright owners to counterattack against file sharing in general.

There has been great discussion over perceived and actual legal issues surrounding file sharing. In circumstances where trading partners are in different countries with different legal codes, there are significant problems to contend with. What if a person in Canada wishes to share a piece of source code which, if compiled, has encryption capabilities? In some countries, a citizen may not request or receive such information without special permission.

Throughout the early 2000s, the entire file-sharing community has been in a state of flux. In 2000, there was speculation over how seriously the record companies and the RIAA would pursue the file-sharing community, given its limitations compared with more traditional forms of media [1]. However, the communities came under strain as record companies and the RIAA tried to shut down as much of file sharing as possible. Although they have forced Napster and Grokster into cooperating against copyright violations, they are fighting an uphill battle, since the community has flourished and produced many different clients based on several different underlying protocols. Third-generation P2P protocols such as Freenet do not depend on a central server as Napster did, and because they encrypt the shared data, it is much harder to shut these systems down through court action. Another tactic, used by the maintainers of KaZaA, is to change the company's organization or country of origin so that legal action against it becomes impossible or useless.

The Electronic Frontier Foundation (EFF) is a donor-supported group which protects users' digital rights. It is one of the most influential online human rights organizations, and it is involved in legislation, court cases, and campaigns to make the public aware of their rights. The EFF has opposed the RIAA in its onslaught of lawsuits against users of file sharing applications. The foundation supports the idea that P2P file-sharing can exist while allowing users to compensate artists for their copyrighted material.

External links

  • Slyck - A popular file sharing news site and community
  • ZeroPaid - Another popular file sharing news site and community
  • P2P United - Pro-file sharing activism

Canada's approach to P2P and copyright

  • C|net - Article entitled "Judge: File Sharing is Legal in Canada." In spring 2004, file sharing via P2P networks was affirmed as legal in Canada.
  • Canadian Private Copying Collective - Since 1998, private copying of music for personal use has been entirely legal in Canada. A levy was established on blank media sales in order to compensate artists for lost revenue due to file sharing, and the CPCC was founded to oversee the levy process. The levy was renewed in December 2004.
  • Copyright Board of Canada - Government copyright policy.
  • Copyfight: A Tale of Two Cities - Article contrasting Ottawa and Washington, D.C.'s assessment of copyright parameters and the issue of criminal sanctions.
  • Digital Rights Management in Canada - An advocacy website offering an introduction to digital rights management and links to licensed file sharers.
  • Digital Copyright Canada - A forum created to host public dialogue concerning digital copyright and related issues. Includes a news aggregator.
  • Numbers Don't Crunch Against Downloading - Michael Geist, a professor at the University of Ottawa, discusses the financial impact of downloading on the recording industry. This column appeared in the Toronto Star on November 29, 2004.
  • p2pnet.net - An op-ed urging solidarity between musicians and Canadian consumers.


References
