how to upload a file on archive.org using php and curl

Archive.org is an interesting non-profit project based in San Francisco. Active since 1996, it was founded to create and maintain a persistent, free-to-access online library of different kinds of media, from ebooks to movies. And yes, you can upload your own files (if you own the rights, of course) for free and make them available to everyone in different formats. For example, when you upload an mp3 audio file, the servers automatically convert it to ogg and VBR mp3 as well (check the FAQ for more info about derivatives), and afterwards you can download it with a direct link or a torrent, or even stream it.

What makes this project (among its other projects, like the Wayback Machine) even more interesting is the possibility of storing our media files through the archive’s web service API, an Amazon S3-like storage service. To use the API you need an account on the library; you can then get your API keys from this page.

I assume you know what a web service is, and I’m not going to explain here how an Amazon S3 server technically works.
In this article I’m going to show how to upload an audio file from our server and store it in the archive with an HTTP request, using cURL and PHP. It’s a basic code snippet that will hopefully help other web developers looking for examples, since the official documentation lacks information about it.

The server supports the request methods GET, POST, PUT and DELETE. In the request we will include the HTTP headers and a few other settings that create an item (or bucket) in the archive and put the uploaded file in it.

Before the request we need to open the file that we are going to upload from our server:

$file = 'my-new-song.mp3';
$file_read = fopen($file, 'r');
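
fopen() returns false when the file is missing or unreadable, so it may be worth checking before going on. Here is a minimal sketch, assuming a hypothetical helper called open_upload_source() that is not part of the original script:

```php
<?php
// Hypothetical helper (not in the original script): open a local file for
// reading and return the handle together with its size in bytes.
function open_upload_source(string $path): array
{
    if (!is_file($path) || !is_readable($path)) {
        throw new RuntimeException("Cannot read file: $path");
    }

    // 'rb' reads the file as binary, which is what we want for media files.
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open file: $path");
    }

    return [$handle, filesize($path)];
}

// Usage, with the file name from the article:
// [$file_read, $size] = open_upload_source('my-new-song.mp3');
```

The size is returned together with the handle because the PUT request needs both.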

Now it’s time to specify an array of headers we send together with the file.

$headers = array(
'Authorization: LOW accesskey:secretkey',
'x-amz-auto-make-bucket:1',
'Content-Type: audio/mp3',
'x-archive-meta-mediatype:audio',
'x-archive-meta01-collection:Songs',
'x-archive-meta-title:New song'
);

We are sending a simple Authorization header with our API keys (don’t forget to replace accesskey and secretkey with the API keys you get here). Setting 'x-amz-auto-make-bucket:1' (1 means true) automatically creates the archive item where the file will end up. Then we specify the content type of the file and some other optional metadata (check the documentation for a complete list of possibilities).
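
If you upload more than one file, the header list can be built from an array of metadata instead of being typed by hand. This is only a sketch: build_upload_headers() is a hypothetical name, and it assumes every field uses the plain x-archive-meta- prefix, as the mediatype and title headers do here.

```php
<?php
// Hypothetical helper (not in the original script): build the header list
// from the API keys, a content type and an array of metadata fields.
function build_upload_headers(string $accessKey, string $secretKey, string $contentType, array $meta): array
{
    $headers = [
        "Authorization: LOW $accessKey:$secretKey",
        'x-amz-auto-make-bucket:1',
        "Content-Type: $contentType",
    ];

    foreach ($meta as $name => $value) {
        // Refuse line breaks so a value cannot smuggle in extra headers.
        if (preg_match('/[\r\n]/', $name . $value)) {
            throw new InvalidArgumentException("Invalid metadata field: $name");
        }
        $headers[] = "x-archive-meta-$name:$value";
    }

    return $headers;
}

$headers = build_upload_headers('accesskey', 'secretkey', 'audio/mp3', [
    'mediatype' => 'audio',
    'title'     => 'New song',
]);
```

The resulting $headers array can be passed to CURLOPT_HTTPHEADER exactly like the handwritten one.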

To upload a file with cURL we need to create a PUT request, which requires two additional pieces of information: the file and its size.

curl_setopt($ch, CURLOPT_PUT, true);
curl_setopt($ch, CURLOPT_INFILE, $file_read);
curl_setopt($ch, CURLOPT_INFILESIZE, filesize($file));
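
The same three options can also be set in one call with curl_setopt_array(). Here is a small sketch, assuming a hypothetical helper name; it only configures the handle and does not send anything:

```php
<?php
// Hypothetical helper (not in the original script): configure an existing
// cURL handle for a PUT upload of an already opened file.
function set_put_upload($ch, $fileHandle, int $size): bool
{
    return curl_setopt_array($ch, [
        CURLOPT_PUT        => true,        // send the request as HTTP PUT
        CURLOPT_INFILE     => $fileHandle, // stream to read the body from
        CURLOPT_INFILESIZE => $size,       // body length in bytes
    ]);
}

// Usage with the variables from the article:
// $ch = curl_init();
// set_put_upload($ch, $file_read, filesize($file));
```

curl_setopt_array() returns false as soon as one option cannot be set, which gives a single place to check.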

In the request URL we need to include the identifier under which the file is going to be available, and we can even choose a custom file name. I want to upload my new song to an album called ‘Demo songs’, so in this case the album will be our identifier; after the upload the song will be available at http://s3.us.archive.org/demo-songs/my-new-song.mp3.

curl_setopt($ch, CURLOPT_URL, "http://s3.us.archive.org/demo-songs/$file");

You can replace the $file variable with any custom value, which will become the definitive filename.
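
A custom file name may contain spaces or other characters that are not safe in a URL, so it can help to encode it first. The following is a sketch with a hypothetical helper name; the local path in the usage line is made up for illustration.

```php
<?php
// Hypothetical helper (not in the original script): build the upload URL
// from an item identifier and the file name to use inside the item.
function build_upload_url(string $identifier, string $filename): string
{
    // rawurlencode() escapes spaces and other characters that are not
    // allowed as-is in a URL path segment.
    return 'http://s3.us.archive.org/'
        . rawurlencode($identifier) . '/'
        . rawurlencode($filename);
}

// basename() drops any local directory part, so only the name is uploaded.
$url = build_upload_url('demo-songs', basename('/var/music/my new song.mp3'));
echo $url, "\n"; // http://s3.us.archive.org/demo-songs/my%20new%20song.mp3
```
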
Now we have everything we need to perform the upload; let’s put it all together with some comments:

// file on our server
$file = 'my-new-song.mp3';
$file_read = fopen($file, 'r');

// headers we need to send
$headers = array(
'Authorization: LOW accesskey:secretkey',
'x-amz-auto-make-bucket:1',
'Content-Type: audio/mp3',
'x-archive-meta-mediatype:audio',
'x-archive-meta01-collection:Songs',
'x-archive-meta-title:New song'
);

// curl in action
$ch = curl_init();
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);

curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);

curl_setopt($ch, CURLOPT_URL, "http://s3.us.archive.org/demo-songs/$file");

curl_setopt($ch, CURLOPT_PUT, true);
curl_setopt($ch, CURLOPT_INFILE, $file_read);
curl_setopt($ch, CURLOPT_INFILESIZE, filesize($file));

curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_VERBOSE, false);

$data = curl_exec($ch);
echo $data;

curl_close($ch);
// release the local file handle
fclose($file_read);

Notice that I’m printing the response with the variable $data; if you need additional info I suggest setting CURLOPT_VERBOSE to true.
You can now run the script and you should get a response like this:

HTTP/1.1 200 Ok
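
Since CURLOPT_HEADER is on, $data starts with the raw response headers, and there can be more than one status line (for instance an interim 100 Continue, or a redirect followed by the final response). Here is a small sketch, with a hypothetical function name, that pulls out the last status code instead of reading the output by eye:

```php
<?php
// Hypothetical helper (not in the original script): return the status code
// of the last HTTP status line found in a raw response, or null if none.
function last_http_status(string $rawResponse): ?int
{
    // Match every line that looks like "HTTP/1.1 200 Ok".
    if (!preg_match_all('/^HTTP\/[\d.]+\s+(\d{3})/m', $rawResponse, $matches)) {
        return null;
    }

    return (int) end($matches[1]);
}

$example = "HTTP/1.1 100 Continue\r\n\r\nHTTP/1.1 200 Ok\r\nContent-Length: 0\r\n\r\n";
var_dump(last_http_status($example)); // int(200)
```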

This means that the request was completed successfully and the file is now on the archive.org servers, where it will be put in a queue to be processed; it will be available as soon as possible (depending on server load) at the address we requested. For small files like an mp3 audio file it usually takes around 15 minutes (or less) to complete the publishing process, so don’t panic if you don’t see your file right away.

This snippet can be improved with a lot more details, for example error handling, but I’ll keep it basic. Consider it a starting point to explore the possibilities of the great service archive.org offers.
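
As a first step towards error handling, the values cURL reports after curl_exec() can be turned into a readable message. This is only a sketch under my own assumptions: describe_upload() is a hypothetical name, and treating every non-2xx status as a rejection is a simplification.

```php
<?php
// Hypothetical helper (not in the original script): turn the outcome of a
// cURL transfer into a short message, given the values cURL reports.
function describe_upload(int $curlErrno, string $curlError, int $httpCode): string
{
    if ($curlErrno !== 0) {
        // Transport-level failure: DNS, connection, timeout and so on.
        return "cURL error $curlErrno: $curlError";
    }
    if ($httpCode < 200 || $httpCode >= 300) {
        // The server answered, but did not accept the upload.
        return "Upload rejected with HTTP status $httpCode";
    }
    return 'Upload accepted';
}

// Usage, right after curl_exec($ch) in the script above:
// echo describe_upload(
//     curl_errno($ch),
//     curl_error($ch),
//     (int) curl_getinfo($ch, CURLINFO_HTTP_CODE)
// );
```

Keeping the decision in a plain function, separate from the cURL calls, makes it easy to try out without touching the network.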
