Podmailing: the next p2p craze?

Harry Tormey @ 7 August, 2008 (10:42) | P2P Software | No comments

A couple of weeks ago I had the opportunity to interview Louis Choquel, CEO of zSlide, the company behind a new service called podmailing. Podmailing is a simple way to send and receive large files and folders by e-mail. Before launching in the US last month, the service had over 40,000 registered users, mostly in France and Spain. You can listen to a podcast of the interview here.

The service works by uploading files from your computer, through the podmailing client, to a tracker/server hosted on Amazon Web Services. The provided software is a modified version of the original open source BitTorrent, Inc. Python client. Once you register and upload your files, a link to the torrent and a direct download option are emailed to the recipient, who may choose either. Seeding of the torrent is handled by the podmailing servers.

To reach either the torrent or the direct download link, the user must first visit a web page covered in advertisements. Currently the service is free, with few restrictions on file sizes. I asked Louis directly whether advertising was the only planned source of revenue for the service. His reply was that users will eventually be able to purchase premium accounts, which will offer larger file hosting, encryption and higher-bandwidth downloads.

At present the average file or folder sent through the service is 200 megabytes. Another interesting point is that the number of successfully completed downloads made using the direct download link is approximately equal to the number made using the podmailing BitTorrent client.

While podmailing is an interesting twist on the existing one-click hosting concept, I am unsure whether the p2p aspect sufficiently differentiates it from giants such as RapidShare.


Three reasons why video is the Holy Grail of P2P

Niall O'Higgins @ 21 July, 2008 (15:55) | BitTorrent Protocol, P2P Software | No comments

Peer-to-peer technology has many extremely useful applications. Fundamentally P2P is about increasing network resilience and decreasing bandwidth costs. Privacy, anonymity and security are all secondary to these essential principles. While BitTorrent has been an extremely successful P2P protocol for certain types of P2P applications, such as patch distribution for Blizzard’s World of Warcraft, it has also been a failure in other areas.

The Holy Grail

Streaming video is one of the largest consumers of bandwidth today. Many estimates put it ahead of P2P in terms of gross bandwidth consumption. Sites like YouTube and Google Video attract vast numbers of viewers. Various online video streaming services are starting up, offered by companies like Amazon and Netflix. It seems that streaming video, and the sales and advertising opportunities which come along with it, represents an irresistible revenue source for large companies.

However, streaming video also has large bandwidth costs associated with it. According to the Wikipedia page on streaming video, streaming a standard video to 1,000 viewers would require 300 Mbit/sec of bandwidth using a traditional unicast approach. All this bandwidth is expensive. How can streaming providers spread this cost out? With P2P: make your consumers pitch in to host the video. As the number of viewers increases, so should the number of peers which can serve up the data, allowing you to scale up the number of participants without proportionate increases in your own bandwidth.
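
The arithmetic behind that figure can be sketched in a few lines. The 300 kbit/sec per-viewer rate is inferred from the numbers above (300 Mbit/sec divided by 1,000 viewers), and the 80% peer share is purely hypothetical, chosen only to show how the provider's bill shrinks as peers take over.

```python
def origin_bandwidth_mbit(viewers, stream_kbit, peer_share=0.0):
    """Bandwidth the provider itself must supply, in Mbit/sec.

    peer_share is the fraction of the stream that viewers fetch from
    each other instead of from the origin (0.0 means pure unicast).
    """
    total_kbit = viewers * stream_kbit
    return total_kbit * (1.0 - peer_share) / 1000.0


# Pure unicast: 1,000 viewers at an assumed 300 kbit/sec each.
print(origin_bandwidth_mbit(1000, 300))  # 300.0

# Hypothetical swarm in which peers supply 80% of the data (about 60 Mbit/sec).
print(round(origin_bandwidth_mbit(1000, 300, peer_share=0.8), 1))
```

The model is deliberately crude: it ignores protocol overhead and assumes peers have the upstream capacity to deliver their share.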

Why not BitTorrent

As we have written about before on this blog, BitTorrent is not good for streaming video due to its rarest-first download ordering policy. In order to stream video, or music, or whatever - you want it to arrive in a predictable order. Typically that order is linear, starting at the start. This way data arrives in the order of consumption. But BitTorrent does not provide this. In fact, it almost explicitly guarantees that it will not order data in a linear fashion. BitTorrent trades predictable ordering for replication increases. Under BitTorrent, the rarest pieces of data will be replicated the most, and so become less rare.
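
To make the difference concrete, here is a small sketch of the two selection policies. The swarm state is invented for illustration, and real clients layer further rules on top of plain rarest-first, so treat this as a toy model rather than what any particular client does.

```python
def rarest_first(needed, availability):
    """Pick the needed piece held by the fewest peers (ties: lowest index)."""
    return min(needed, key=lambda piece: (availability[piece], piece))


def in_order(needed):
    """Pick the earliest needed piece, as a streaming player would want."""
    return min(needed)


# Hypothetical swarm state: piece index -> number of peers holding it.
availability = {0: 9, 1: 7, 2: 1, 3: 4}
needed = {0, 1, 2, 3}

print(rarest_first(needed, availability))  # 2
print(in_order(needed))                    # 0
```

A player needs piece 0 first, but rarest-first fetches piece 2 because only one peer holds it. That is good for the health of the swarm and bad for playback.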

So what, then?

Companies are instead developing their own protocols. The EU has given 19 million euro to one P2P group which is modifying the BitTorrent protocol to support streaming - presumably by doing away with the rarest-first policy.

China has a number of well-funded start-ups developing their own P2P video streaming technologies. Blin.cn claims to be 50x faster than BitTorrent for video streaming. Google is an investor in Chinese streaming company Xunlei.

And of course BitTorrent, Inc have been working to develop their own video streaming version of their protocol.

Who will come out on top remains to be seen.


Why Python is better than C

Niall O'Higgins @ 17 July, 2008 (17:33) | P2P Software | 3 comments

I love C. I’ve written a little bit of C code in my time - both UNIX user land and kernel stuff. I co-wrote OpenBSD’s rum(4) 802.11a/b/g wireless driver for Ralink USB devices [article here] and also made large contributions to OpenRCS and OpenCVS [articles here, here and here]. I’m also the author of the small, portable and efficient BitTorrent implementation, Unworkable, which is part of our work at P2P Research. So I am relatively familiar with the language.

I’ve been hacking Python code for around two years now, really developing a taste for it from my day job. I would not consider myself a Python guru by any stretch, but I’ve worked with many different parts of the standard library, and used enough of the features (generators, lambdas, list comprehensions, classes etc) that I reckon I have a pretty solid handle on what it offers.

The majority of the crawling and data analysis software developed here at P2P Research is written in Python - with a little bit of C here and there, for performance. I suppose that the system features our stuff uses can be broken down into the following categories:

  • String manipulation / parsing.
  • Fast dynamic data structures. Lists and dictionaries, at a high level, including sorting etc.
  • Networking. Specifically, a lot of HTTP is spoken.
  • Threading. For increased throughput.
  • File I/O. For archival purposes.
  • Database. We use PostgreSQL for some reporting and analysis.

I’m going to do a brief comparison of the two languages for each of these items. All these things can be achieved relatively straightforwardly with both C and Python. Consider how many network servers, text editors and databases are written purely in C. The POSIX and ANSI standards actually give you a pretty good set of library functions for doing these things, too - apart from the data structure area I suppose. There are mature interfaces available for working with databases.

What Python really gives you that C does not, in my opinion, are the following:

  • Largely eliminates the headaches of memory management.

  • Similarly, makes string manipulation much less painful, while maintaining much of C’s performance by interfacing directly with the printf family of functions. Consider the following C snippet, followed by the Python equivalent:

    /* Format a HTTP 1.0 GET request safely in C */
    l = snprintf(request, GETSTRINGLEN,
        "GET %s%s HTTP/1.0\r\nHost: %s\r\nUser-agent: Unworkable/%s\r\n\r\n", path,
        params, host, UNWORKABLE_VERSION);
    if (l == -1 || l >= GETSTRINGLEN)
            goto trunc;
    /* ... */
    trunc:
    trace("announce: string truncation detected");
    xfree(params);
    xfree(request);
    xfree(tparams);
    return (-1);

    # Format a HTTP 1.0 GET request safely in Python
    request = "GET %s%s HTTP/1.0\r\nHost: %s\r\nUser-agent: Unworkable/%s\r\n\r\n" % (
        path, params, host, UNWORKABLE_VERSION)

    The big difference in this case is really the amount of care you need to take with memory cleanup and error checking in C. Python is far more lenient than C when it comes to string and memory manipulation, which saves a great deal of complexity.

  • There are good, relatively straightforward implementations of various data structures for C. Well-known examples are the venerable sys/queue.h for various sorts of linked lists, and the similar sys/tree.h for red-black trees or splay trees, typically used to implement dictionaries.

    But these C macros, while extremely helpful, are still tricky. It is not obvious, for example, how to allow an object (in C, something declared with the struct keyword) to be a member of an arbitrary set of TAILQs. In fact, you need a fairly convoluted definition, let alone complex management code:

    /* An actual node, which can be used in arbitrary lists */
    struct node {
            char *key;
    };
     
    /* Separated list structure for managing nodes */
    struct node_list_entry {
            TAILQ_ENTRY(node_list_entry)     node_list;
            struct node *item;
    };

    It makes you appreciate Python code like this:

    mylist = []
    mylist.append("foo")
    mylist.append(1)
    # Mixed types need an explicit key to be sortable in Python 3
    mylist.sort(key=str)

    And after investigating what is involved in getting dictionary-like storage from C (left as an exercise to the reader), code like this:

    mydict = {}
    mydict['foo'] = 'bar'
    del mydict['foo']

  • The TCP/IP stacks in all major operating systems are written in C, and a good number of extremely popular network clients and servers are also (Apache, Sendmail, OpenSSH). One could perhaps even argue that networking is one of the things that C is best suited for, in fact, particularly very low level networking. However, just opening a TCP socket safely is quite a lot of C code:

    /* C snippet to connect to a remote host via TCP */
    struct addrinfo hints, *res, *res0;
    int error, sockfd;
    memset(&hints, 0, sizeof(hints));
    hints.ai_family = PF_INET;
    hints.ai_socktype = SOCK_STREAM;
    error = getaddrinfo(host, port, &hints, &res0);
    if (error) {
      /* handle error */
    }
    res = res0;
    sockfd = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
    if (sockfd == -1) {
     /* handle error */
    }
    if (connect(sockfd, res->ai_addr, res->ai_addrlen) == -1) {
     /* handle error */
    }
    freeaddrinfo(res0);
    return (sockfd);

    Now compare this to the Python equivalent:

    import socket
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.connect((HOST, PORT))

    When it comes to HTTP, or other protocols, the difference is even greater. Of course, much of this can be attributed to string and memory handling. To be fair, implementing a basic HTTP/1.0 client in C is not that hard - I did it in under 500 lines of code in Unworkable. However, Python’s standard library - whether via urllib, urllib2 or httplib directly - just makes it at least an order of magnitude less of a headache compared to C.

  • In the realm of threading, it seems pretty clear to me that the POSIX threads (pthreads) interface has won. Of course, the API is available on all POSIX compliant operating systems. I don’t have a huge amount of experience with using it through C - a few years ago I did some very simple stuff with it. While not impossible, it is complicated and tricky to deal with. On the other hand, Python offers its own threading module, loosely based on Java’s API. I find it very easy to use threads in Python - perhaps the most striking feature being that the Python threading module supports both an object-oriented paradigm - where you extend the Thread class with your own - and also a functional approach. The functional approach makes great sense to me - I very much like the idea. Creating a thread like this is as simple as:

    # Simple Python threads example, using functional paradigm
    import threading
    def worker():
        while True:
            # do work then break
            break
     
    t = threading.Thread(target=worker)
    t.start()

  • File I/O is an area where straight C really isn’t too bad. You have your POSIX interface, via open(2), read(2), write(2), etc - and you have your ANSI buffered I/O functions with fopen(3), fread(3), fwrite(3), etc. Many of the shell commands for file system manipulation map very closely to libc calls. For example, mkdir(2), dirname(3), stat(2) and so on. Python - once again mostly thanks to being able to handle the memory management for you - helps a lot in the situation where you are reading from a file, of which the size is unknown (for example, a pipe, or a network socket).

    I would also mention that Python’s standard library has a concept of ‘file-like objects’ which are essentially opaque data buffers which can be accessed through exactly the same interfaces as actual files. Common examples are StringIO, urllib and urllib2.

  • When it comes to working with databases, Python has the usual advantage of making it easy to deal with dynamic result sets. Additionally, abstractions like DB API 2 and some of the advanced language features such as list comprehensions and generators, can greatly reduce the amount of code required for filtering and processing data from databases. Furthermore, I have found that psycopg2 (the website of which is unfortunately in bad shape) works extremely well in a threaded environment.
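
To illustrate the database point, here is a minimal sketch of a generator and a list comprehension working over a DB API 2 cursor. It uses the standard library's sqlite3 module as a stand-in for psycopg2 so that it runs without a PostgreSQL server; the table and its rows are made up for the example.

```python
import sqlite3


def rows(cursor, batch_size=100):
    """Yield rows lazily using the DB API 2 fetchmany() call."""
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            break
        for row in batch:
            yield row


# In-memory database holding a hypothetical table of crawled torrents.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute("CREATE TABLE torrents (name TEXT, category TEXT, seeders INTEGER)")
# sqlite3 uses ? placeholders; psycopg2 uses %s instead.
cursor.executemany(
    "INSERT INTO torrents VALUES (?, ?, ?)",
    [("a", "movies", 12), ("b", "music", 0), ("c", "movies", 3)],
)
cursor.execute("SELECT name, category, seeders FROM torrents ORDER BY name")

# A list comprehension over the generator filters rows on the client side.
seeded_movies = [name for name, category, seeders in rows(cursor)
                 if category == "movies" and seeders > 0]
print(seeded_movies)  # ['a', 'c']
conn.close()
```

In practice a filter this simple belongs in the SQL WHERE clause; the comprehension earns its keep when the test is awkward to express in SQL.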

In conclusion, Python allows you to write complicated, useful applications, with fewer bugs, much faster than in C. It removes many (but not all) headaches associated with memory management and data structures. Many of the portability issues are taken care of for you. Essentially with Python you stand on the shoulders of giants. While C is still extremely useful and important, Python makes excellent sense for many classes of program.


P2P Research at Google wrap up and slides

Niall O'Higgins @ 16 July, 2008 (20:50) | BitTorrent Protocol, P2P Software, Piracy Research | No comments

I gave a talk at Google/bayPIGgies last week. I was very pleasantly surprised by the turnout - and most of all by the excellent questions asked by the audience. The interest from people at the talk crossed many domains - people were generally curious about many aspects, from security to technical scalability concerns to legal issues.

Four employees from BitTorrent, Inc were present. It was wonderful to have the chance to talk to some of the pioneers in this technology. They seemed very interested in the work we are doing and invited us to call in to their offices in downtown SF some time.

Overall, it was wonderful to have so much interest in our research and to be able to spread some information about BitTorrent and p2p to a wider audience. I believe the talk was recorded on video and should be made available by Google on YouTube or Google Video; however, I don’t have the URL just yet. I did the slides in S5 though, and you can view the slides for the p2p research talk at Google here in a regular web browser.


P2P Research Talk @ Google in Mountain View tonight

Niall O'Higgins @ 10 July, 2008 (09:53) | P2P Software, Piracy Research | No comments

Just a quick note, I will be speaking on the subject of our research at the Google amphitheatre this evening. Details on the talk, along with directions etc, can be found at the BayPiggies site.

I will post my slides online and I believe there will be a good-quality recording of the talk made available on YouTube or Google Video.


Have your software email you when it is in trouble with Python logging

Niall O'Higgins @ 29 June, 2008 (09:08) | P2P Software | 3 comments

Here at P2P Research, we have numerous long-running software agents written mostly in Python, which crawl BitTorrent and perform various types of analysis. While these agents are relatively robust, every few months they might hiccup - perhaps not even from a condition within their control, like a full disk.

In any case, we want these agents to be running as much as possible. If any of the agents exits for some reason, we want to know about it. One of the most important pieces in any production software system is robust logging. Long-running daemon processes should at a minimum log to an on-disk file in a clean, consistent format with timestamps. After very basic logging, the next requirement is some log rotation scheme, to prevent the disk from filling up with logs, and perhaps the introduction of log levels, to make it easy to distinguish warnings from critical events from debug events and so on.

Fortunately, Python’s logging module makes this trivial to implement. Where you might usually reach for print when debugging, you just use one of the logging functions. Instead of

print("message", file=sys.stderr)

you probably want

logging.warning("message")

or

logging.critical("message")

Setting up rotated logging is very easy, since Python has the RotatingFileHandler already implemented in its standard library. To set it up for use in your program, you can add something like the following snippet:

# Rotated logging setup snippet
import logging
from logging.handlers import RotatingFileHandler

LOG_FILE = '/path/to/logfile'
LOG_FORMAT = '%(asctime)s %(levelname)s %(message)s'
MAX_BYTES = 1024 * 100  # 100KB
BACKUP_COUNT = 10

# ...

rotating_handler = RotatingFileHandler(LOG_FILE, 'a', MAX_BYTES, BACKUP_COUNT)
rotating_handler.setFormatter(logging.Formatter(LOG_FORMAT))
root_logger = logging.getLogger()
# The root logger defaults to WARNING, so lower it or info() calls are dropped
root_logger.setLevel(logging.INFO)
root_logger.addHandler(rotating_handler)

logging.info("Logging setup complete")

Logging to disk is a great start. But what if you want to notify a human when something goes horribly wrong? We can define “horribly wrong” as anything logged with logging.critical(), and set up an SMTPHandler to send email whenever this happens:

# Email logging setup snippet
import logging
from logging.handlers import SMTPHandler

LOG_FORMAT = '%(asctime)s %(levelname)s %(message)s'
# who do we send the log email to
TO_LIST = ['admin@example.com', 'me@mydomain.com']
# who is it from
FROM = 'agent@example.com'
# subject line
SUBJECT = '[my agent] Critical event'
# SMTP server
SMTP_SERVER = 'localhost'

# ...

# SMTP handler, only emails CRITICAL level messages
smtp_handler = SMTPHandler(SMTP_SERVER, FROM, TO_LIST, SUBJECT)
smtp_handler.setLevel(logging.CRITICAL)
smtp_handler.setFormatter(logging.Formatter(LOG_FORMAT))
logging.getLogger().addHandler(smtp_handler)
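
If you want to check the level filtering without a mail server, a sketch like the following may help. ListHandler and the "agent_demo" logger name are made up for this example; the handler simply collects formatted messages where SMTPHandler would send them.

```python
import logging


class ListHandler(logging.Handler):
    """Stand-in for SMTPHandler: collects messages instead of mailing them."""

    def __init__(self):
        logging.Handler.__init__(self)
        self.messages = []

    def emit(self, record):
        self.messages.append(self.format(record))


logger = logging.getLogger("agent_demo")  # hypothetical logger name
logger.setLevel(logging.DEBUG)
logger.propagate = False  # keep the demo away from the root logger's handlers

alerts = ListHandler()
alerts.setLevel(logging.CRITICAL)
alerts.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
logger.addHandler(alerts)

logger.warning("disk is nearly full")
logger.critical("disk is full, exiting")

print(alerts.messages)  # ['CRITICAL disk is full, exiting']
```

The warning passes the logger's own level but is dropped by the handler, which is the behaviour you want from the email handler alongside a more talkative file handler.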

Now we are configured to send notices via email on any critical event. However, what about uncaught exceptions, where nothing calls logging.critical() and the interpreter just exits? We install our own uncaught exception handler, which calls logging.critical(), of course!

# Snippet to setup custom uncaught exception handler
import logging
import os
import sys

def except_hook(exc_type, value, tb):
    ''' handler for exceptions which would cause the interpreter to exit '''
    logging.critical("Uncaught exception of type '%s' with value '%s'", exc_type, value)
    os._exit(1)

sys.excepthook = except_hook
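
A variant worth considering passes the exception details through exc_info, so that the full traceback lands in the log (and in the email). This is a sketch, not a drop-in replacement: the hook name is made up, the StringIO buffer only stands in for a real handler so that the output can be inspected, and the hook is called by hand rather than by a real crash.

```python
import io
import logging
import sys


def logging_except_hook(exc_type, value, tb):
    """Log an uncaught exception, traceback included, at CRITICAL level."""
    logging.critical("Uncaught exception", exc_info=(exc_type, value, tb))


sys.excepthook = logging_except_hook

# Demonstration: capture an exception and hand it to the hook directly.
buffer = io.StringIO()
logging.getLogger().addHandler(logging.StreamHandler(buffer))
try:
    raise ValueError("disk full")
except ValueError:
    logging_except_hook(*sys.exc_info())

print(buffer.getvalue())
```

The sketch leaves out the explicit exit call, on the assumption that the hook runs for an uncaught exception in the main thread, where the interpreter exits on its own once the hook returns.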

And there you have it!


The most popular things on BitTorrent? You might be surprised.

Niall O'Higgins @ 28 June, 2008 (14:46) | Piracy Research | 1 comment

We have conducted a survey of 300,000 individual torrents available on four of the best-known BitTorrent trackers on the Internet. Part of our analysis has been to see what category they belong to. Is porn more popular than software? Is music more numerous than TV?

Pie chart of BitTorrent category distribution

Category    Share of torrents
Movies      26.26%
Software    20.00%
Music       18.54%
TV          11.06%
Games        7.23%
Porn         4.61%
Books        3.83%
Pictures     1.18%
Other        7.29%
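
For a sense of scale, the shares can be turned into rough torrent counts. This sketch assumes the percentages apply to the full sample of 300,000 torrents, so the counts are approximations rather than figures from the survey itself.

```python
SAMPLE_SIZE = 300000  # torrents surveyed, from the text above

shares = {
    "Movies": 26.26, "Software": 20.00, "Music": 18.54, "TV": 11.06,
    "Games": 7.23, "Porn": 4.61, "Books": 3.83, "Pictures": 1.18,
    "Other": 7.29,
}

# Sanity check: the published shares should account for the whole sample.
total = sum(shares.values())

# Approximate torrent count per category, largest first.
counts = {name: round(SAMPLE_SIZE * pct / 100) for name, pct in shares.items()}
for name, count in sorted(counts.items(), key=lambda item: item[1], reverse=True):
    print(name, count)
```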

Why so little porn?

While it is true that certain torrent aggregators/trackers specialise in a specific category, sites included in the survey such as Mininova and TPB have a very wide variety of content types. Given the apparent prevalence of porn on the Internet as a whole, I’m surprised that the numbers are so low for it. It is possible that much of this material was mis-categorised, or lumped in with another category such as movies or pictures.

More music than TV?

Another surprise, especially considering the huge amount of talk about rampant sharing of TV shows. Also, I would posit that there may be more TV episodes released per annum than music albums. However, perhaps there is a tendency - as with DVD releases of the same shows - to group individual TV shows into one torrent per season. On the other hand, the same pattern exists with music, where an artist’s entire discography is released in a single torrent.


Why Python sucks for BitTorrent and P2P

Niall O'Higgins @ 19 June, 2008 (06:10) | P2P Software | No comments

Python is a wonderful programming language. We use it very heavily at P2P Research for all our statistical analysis and data mining of peer-to-peer networks. It has a great deal of expressiveness and enables a software author to very quickly jump from idea to implementation. It removes many - but by no means all - of the headaches around memory management and choosing efficient data structures. With its ‘batteries included’ philosophy, the standard library contains a vast array of very useful modules straight off - for the majority of tasks, there is already something in the standard library which will greatly simplify them, from excellent Berkeley DB support to convenient secure temporary file handling.

However, Python’s “one size fits all” standard library can also be a weakness in cases where you need certain performance optimisations. My specific example here is the mmap module. On UNIX systems, the mmap(2) system call supports an ‘offset’ parameter. Mmap is extremely useful for BitTorrent because it lets you read and write on-disk data directly, efficiently and easily. Mmap access can be implemented by the kernel using zero-copy semantics, and avoids all the overhead (system call and otherwise) of the read(2) and write(2) family of functions. BitTorrent fundamentally involves a huge amount of IO dancing and mmap(2) offers an elegant and high performance boulevard for this.
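
As a minimal sketch of the idiom, the following maps a whole scratch file and then reads and writes fixed-size pieces by slicing the map. The piece size and the scratch file are made up for illustration; nothing here is taken from an actual BitTorrent client.

```python
import mmap
import os
import tempfile

PIECE_SIZE = 16  # tiny illustrative piece size; real torrents use far larger pieces

# Create a small scratch file standing in for a partially downloaded torrent.
fd, path = tempfile.mkstemp()
os.write(fd, b"\0" * (PIECE_SIZE * 4))

# Map the whole file; slices of the map read and write the file contents directly.
data = mmap.mmap(fd, 0)


def write_piece(index, payload):
    start = index * PIECE_SIZE
    data[start:start + len(payload)] = payload


def read_piece(index):
    start = index * PIECE_SIZE
    return data[start:start + PIECE_SIZE]


write_piece(2, b"p" * PIECE_SIZE)  # pieces can arrive in any order
write_piece(0, b"q" * PIECE_SIZE)
data.flush()
piece_two = read_piece(2)

data.close()
os.close(fd)

# Confirm the writes reached the file itself, then clean up.
with open(path, "rb") as f:
    on_disk = f.read()
os.remove(path)

print(piece_two == b"p" * PIECE_SIZE, on_disk[:PIECE_SIZE] == b"q" * PIECE_SIZE)
```

No explicit seek, read or write call appears anywhere in the piece handling, which is what makes the idiom attractive for random-order IO.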

While the UNIX system call supports an ‘offset’ parameter, the Python mmap module does not. From what I have been able to find out about why this is the case it appears to me that the biggest reason for not having this support is that the Python mmap module also supports Windows, and so any additions or changes to it - for example adding support for an offset parameter - must be ported to Windows also. While Windows does support offsets in its memory mapping routines it seems that this has proven difficult to use for the Python developers. Presumably there is not enough demand for the feature to truly warrant implementation. Now that I think of it, perhaps I will myself take a stab at implementing it, since I have no prejudice against Windows users being able to run my software.

In any case, because of this situation, Python BitTorrent implementations - such as Bram Cohen’s mainline client - are limited to using the more unwieldy (for random access) and less efficient read and write calls. As a result, the mainline BitTorrent code is more complicated than it could be, and is slower. On the other hand, my high performance BitTorrent implementation, written in C and portable to UNIX and Windows, is able to take advantage of memory mapped file support - precisely because it is written in C and not Python, giving me sufficient control.

The same idiom of using memory mapped files to avoid system call overhead and allow the kernel’s VM to optimise IO for you of course applies to a much broader spectrum of applications than just BitTorrent. Just about all the major P2P applications which I am aware of could use this feature to efficiently read and write large blocks of data in random order. Unfortunately, until this is fixed, Python remains a sub-optimal choice for BitTorrent and P2P applications.


MacGyver: The entertainment industry should embrace piracy to kill piracy

Harry Tormey @ 3 June, 2008 (12:27) | Piracy Research | No comments

Recently I purchased series one through five of the hit TV show MacGyver from amazon.com on DVD for roughly $14 a pop including shipping. The inspiration for this whimsy came from a $4 trip to my local rental place which resulted in a straight through viewing of disc 1 of season 1. Prior to making the decision to purchase I spent quite some time weighing up the plethora of media options available.

If I was willing to tolerate adverts and poor quality, AOL TV seemed like a reasonable choice. Amazon Unbox was another tempting “now” option, but what if I wanted some sweet MacGyver lovin’ on the bus home from work with my iPod? Unbox’s DRM does not support the iPod so I’d be SOL on that count. Besides these restrictions Unbox wanted to charge me $1.99 an episode which at roughly 24 episodes per season was quite a bit more than I was willing to pay!

How about Netflix? Well, I usually don’t watch that many movies a month, so a Netflix account would stay dormant at least half the time, and its digital delivery service suffers from the same DRM problems that afflict Unbox. A quick search of all the main torrent trackers yielded nothing promising (a seedless Love Boat crossover and a French language version of MacGyver series 2), so even if I wanted to sneak into the show for free I’d have to wait an undetermined amount of time for something to appear. Taking all of the above into consideration, along with the totally reasonable price, I decided to do it the old fashioned way and buy the DVDs.

This whole process of evaluation took more than an hour to complete and was way more trouble than your average consumer would be willing to put up with. With all of these catches it’s no wonder that piracy is so prevalent. If a good MacGyver torrent had been readily available at the time I was about to click buy, maybe I would have thought twice.

BitTorrent is really good at catering to the latest trends; where it falls down is in catering to the more obscure pieces of media. In my opinion it is this inconsistent nature which is the entertainment industry’s biggest opportunity to stem the tide in its losing battle against piracy.

Consider what I was doing in my survey of the media options available to me. I was looking for the best bang for my buck. If I had gone to Swedish BitTorrent site The Pirate Bay, searched for MacGyver, and the only promising thing that came up was an advertisement for Netflix or Amazon, and the price was right, I might have clicked buy. Certainly I would have been more likely to click on that than on an advertisement for pheromones.

By not embracing BitTorrent aggregator sites, the entertainment industry is missing out on a valuable opportunity to extract revenue from potential customers. Picture a service that trawled all of the major aggregators and provided an up to the minute searchable index of every torrent available on the Internet. A Google for BitTorrent if you will. Now picture text advertisements on this service presenting you with the relative costs of all media available - free with advertisements, cheap DVD postal subscription, DRM streamed immediate download. In my opinion, such a service would be an excellent way to regain lost revenue from piracy.


P2P Case Study: On the illegal distribution of the hit film 300

Niall O'Higgins @ 30 May, 2008 (16:31) | Piracy Research | No comments

Introduction

Distribution of media in peer to peer networks tends to mirror the mainstream media. A new cinematic release will suddenly become the hottest thing to download on BitTorrent. But who releases the films, and how? Usually there are many different individual releases for a given film, and the cycle typically starts with initial low-quality offerings, with higher-quality copies being published as time goes on. This article is an in-depth look at the BitTorrent release progression of one major 2007 Hollywood film, 300.

Time line

  • March 9th 2007, blockbuster film 300 opens in movie theatres across the United States. While movie-goers crowd out cinemas on release day, Internet piracy gangs spring into action, working secretly beneath the radar of copyright enforcers in shady arrangements with theatre staff.
  • March 10th 2007, so-called “cam” releases - crudely videotaped reproductions - of the hit film appear on the BitTorrent P2P network.
  • March 12th 2007, a mere three days after the film has opened to the public, highly sophisticated “telesync” reproductions - which practically necessitate insider help to produce - flood onto BitTorrent.
  • March 20th 2007, less than two weeks since the film’s opening, BitTorrent is inundated with leaked “workprint” copies of the flick - very high quality digital reproductions.
  • March 21st 2007, piracy groups begin circulating misappropriated DVD “Screeners” - advance retail-quality copies of the film intended for critics and distributors - on BitTorrent.
  • April 11th, 2007, while continuous streams of pirated copies have poured onto P2P networks since the film’s release - including versions in French, Spanish and German along with versions with fan-made “alternative soundtracks” - the first authentic “DVDRip” copies appear on BitTorrent, barely more than a month after the initial release date.
  • July 21st, 2007, the first in a string of ultra high quality HDTV duplications are released by pirates.
  • By the end of 2007, more than 332 different pirate versions of the film have been released on BitTorrent.
  • A whopping 25% of all pirated versions of 300 were distributed through infamous Swedish BitTorrent site, The Pirate Bay.

Analysis

The time line above makes it clear that despite huge investments made by the entertainment industry in piracy prevention, piracy is rampant. Indeed, it is easier now than ever before to download a pirated movie. What is unclear, however, is the impact that various offerings, such as Amazon’s online movie store or Netflix’s, are having on piracy. At what point will people switch over to simply paying for a download rather than firing up their torrent client? What needs to change in terms of cost, software and licensing? These are some of the questions we are really interested in, and will continue to research.
