RAWS tutorial

The 'RAWS tutorial' series is designed to get our users up to speed quickly with Rambla's Web Services (RAWS) and their usage. Our APIs are based on REST principles and open standards (e.g. ATOM and the Atom Publishing Protocol) and allow rapid client application development in any programming language.

The following tutorials are already available:

In the previous tutorial, we showed the different uses of the "atom:link" element. In this tutorial, we will use the PHP client that was introduced in tutorial 2 to show you how to handle this element. We'll also learn how to retrieve partial Atom responses. The complete source code for this tutorial is also part of the client download package (samples/raws_links_and_pagination.php). You can try running it yourself after editing the named constants at the top of the script. For more information on using the PHP client, see tutorial 2.

This tutorial assumes that a directory named "tutorial6" has already been created on the CDN and that 15 files have been uploaded to this directory (named "test1.mp4", "test2.mp4" ...). The sample script creates these files from a local file, whose path needs to be set in the named constant 'LOCAL_FILE'.

Getting entry links

You can use the properties of an $entry object (in this case, the entry returned by RASS in response to a POST item request) to directly access each link element and its attributes. The following code loops through all 'atom:link' elements that are part of the entry and displays the relevant attributes.

foreach ($item_entry->link as $link) {
    echo "\nType of relation: " . $link->rel;
    echo "\nLink URI: " . $link->href;
    echo "\nExpected type: " . $link->type . "\n";
}
>> Type of relation: self
>> Link URI: http://rass.cdn03.rambla.be/meta/monty/tutorial6/test15.mp4/
>> Expected type: application/atom+xml
>>
>> Type of relation: edit
>> Link URI: http://rass.cdn03.rambla.be/item/tutorial6/test15.mp4/
>> Expected type: video/mp4
>> ...

The $entry object also has helper methods which you can use to retrieve the 'href' value of a link with a given relation type. Since this entry is returned by the RASS item resource, it will contain an enclosure link that points to the CDN location from which end-users can download the file attached to this resource.

echo "\nGetting alternate link via helper method: " . $item_entry->get_alternate_url();
>> http://rass.cdn03.rambla.be/meta/monty/tutorial6/test15.mp4/?alt=atom
echo "\nGetting enclosure link via helper method: " . $item_entry->get_enclosure_url() . "\n";
>> http://monty.cdn03.rambla.be/tutorial6/test15.mp4
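
To make the helper methods less of a black box, here is a minimal sketch of how a lookup by relation type could work. It does not use the RAWS client: it parses a hand-written entry with PHP's SimpleXML extension, and find_link_href() is a hypothetical function name, not part of the client library. The sample entry reuses the link values printed above; the enclosure link's 'type' attribute is left out because it is not shown in the output.

```php
<?php
// Sample entry, hand-written for this sketch from the link values printed above.
$xml = <<<XML
<entry xmlns="http://www.w3.org/2005/Atom">
  <link href="http://rass.cdn03.rambla.be/meta/monty/tutorial6/test15.mp4/" rel="self" type="application/atom+xml"/>
  <link href="http://rass.cdn03.rambla.be/item/tutorial6/test15.mp4/" rel="edit" type="video/mp4"/>
  <link href="http://monty.cdn03.rambla.be/tutorial6/test15.mp4" rel="enclosure"/>
</entry>
XML;

// Hypothetical helper: returns the href of the first link with the given
// relation type, or null if the element has no such link.
function find_link_href(SimpleXMLElement $element, string $rel): ?string
{
    foreach ($element->link as $link) {
        if ((string) $link['rel'] === $rel) {
            return (string) $link['href'];
        }
    }
    return null;
}

$entry = simplexml_load_string($xml);
echo find_link_href($entry, 'enclosure') . "\n"; // prints the enclosure URL
var_dump(find_link_href($entry, 'next'));        // no 'next' link in an entry
```

Returning null for a missing relation type lets the calling code test for a link's presence before following it.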

Getting feed links

In the same way as for an entry, you can use a $feed object's properties (or helper methods) to directly access each link element and its attributes. The following code loops through all 'atom:link' elements that are part of the feed and displays the relevant attributes.

foreach ($feed->link as $link) {
    echo "\nType of relation: " . $link->rel;
    echo "\nLink URI: " . $link->href;
    echo "\nExpected type: " . $link->type . "\n";
}
>> Type of relation: self
>> Link URI: http://rass.cdn03.rambla.be/dir/tutorial6/?kind=file;page=1;paginate_by=...
>> Expected type: application/atom+xml
>> ...

Using pagination

In the previous tutorial, we already learned how the 'atom:link' element is being used by some RAWS resources to return partial collections. These resources will return a feed that may contain an 'atom:link' element with its 'rel' attribute set to "next", to indicate that not all entries could be fitted inside of the response. The next page can then be retrieved by sending a GET request to the URI inside of the link element's 'href' attribute.

Most resources that support pagination also return a 'last' link that points to a URL that can be used to retrieve the last batch of entries from a given collection. You can use the $feed object's helper methods get_next_link() and get_last_link() to retrieve these URLs.

For this tutorial we have created a directory that contains 15 files. By passing 'paginate_by' as one of the query-string arguments (for details, see the 'pagination' wiki page) and setting it to "10", we instruct RASS to return only 10 entries in each (partial) response. Therefore, the response should also contain a 'next' link that points to a URI that can be used to retrieve the 5 remaining entries.

$query->setPaginateBy("10");
$feed = $rass->getDirFeed($dir_entry->path, $query);
echo "\nNext link URI: " . $feed->get_next_link() . "\n";
>> Next link URI: http://rass.cdn03.rambla.be/dir/tutorial6/?kind=file;page=2;paginate_by=10;
echo "Last link URI: " . $feed->get_last_link() . "\n";
>> Last link URI: http://rass.cdn03.rambla.be/dir/tutorial6/?kind=file;page=2;paginate_by=10;

Note that in this case the 'next' and 'last' links both point to the same URL. This is normal, since RASS only needs to return the 5 remaining item entries in its next response. If our 'tutorial6' directory contained more than 20 files instead, the 'next' and 'last' links would point to different URLs.
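
The relation between the number of files, 'paginate_by' and the 'next' and 'last' links comes down to simple arithmetic. The sketch below is illustrative only: it assumes that pages are numbered from 1 and that the 'last' link carries the highest page number, as in the output above. page_count() is a hypothetical helper, not a client method.

```php
<?php
// Hypothetical helper: number of partial feeds needed for $total entries.
function page_count(int $total, int $paginate_by): int
{
    return (int) ceil($total / $paginate_by);
}

foreach ([15, 20, 21] as $total) {
    $last = page_count($total, 10);
    // After requesting page 1, the 'next' link (if any) refers to page 2.
    $relation = ($last === 2) ? 'the same URL' : 'different URLs';
    printf("%d files: last page is %d, so next and last are %s\n", $total, $last, $relation);
}
```

With 15 or 20 files the second page is also the last one; a 21st file is what pushes the 'last' link to page 3.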

The connection object also has two helper methods - getNextFeed() and getLastFeed() - that automatically return the $feed object for the 'next' or 'last' link (by sending a GET request to the URL inside their 'href' attribute). When calling these methods, you need to pass the original $feed object as their argument.

   $next_feed = $rass->getNextFeed($feed);

Pagination best practices

If you're using a RAWS resource that supports pagination and you want to retrieve all entries from a given collection, you should always check for the presence of a 'next' link inside the 'atom:feed' element. You can automate this by calling the connection object's getNextFeed() inside a while-loop that tests the $feed object returned by this call. If no 'next' link is present (meaning there are no more entries), the $feed variable will be set to null and your code will exit the loop.

$feed = $rass->getDirFeed($dir_entry->path, $query);
while ($feed) {
    foreach ($feed as $entry) {
        # process your entries here..
        echo "\nFound entry with path = " . $entry->path;
    }
    # get next feed, by sending a new request to the next link inside this page
    $feed = $rass->getNextFeed($feed);
}
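
If you want to see the same loop without the client library or a network connection, the following sketch walks through two hand-written partial feeds using SimpleXML. Everything in it is an assumption made for illustration: the example.com URLs are placeholders, and fetch_feed() and next_url() are hypothetical stand-ins for the HTTP GET request and for the client's 'next' link lookup.

```php
<?php
// Two hand-written partial feeds standing in for server responses.
$page1 = <<<XML
<feed xmlns="http://www.w3.org/2005/Atom">
  <link href="http://example.com/dir/?page=2" rel="next" type="application/atom+xml"/>
  <entry><id>test1.mp4</id></entry>
  <entry><id>test2.mp4</id></entry>
</feed>
XML;
$page2 = <<<XML
<feed xmlns="http://www.w3.org/2005/Atom">
  <entry><id>test3.mp4</id></entry>
</feed>
XML;
// Placeholder URLs, only used as array keys.
$pages = [
    'http://example.com/dir/?page=1' => $page1,
    'http://example.com/dir/?page=2' => $page2,
];

// Stand-in for an HTTP GET: returns the parsed feed, or null if there is no URL.
function fetch_feed(array $pages, ?string $url): ?SimpleXMLElement
{
    return $url === null ? null : simplexml_load_string($pages[$url]);
}

// Returns the href of the feed's 'next' link, or null on the last page.
function next_url(SimpleXMLElement $feed): ?string
{
    foreach ($feed->link as $link) {
        if ((string) $link['rel'] === 'next') {
            return (string) $link['href'];
        }
    }
    return null;
}

$ids = [];
$feed = fetch_feed($pages, 'http://example.com/dir/?page=1');
while ($feed !== null) {
    foreach ($feed->entry as $entry) {
        $ids[] = (string) $entry->id;
    }
    // Follow the 'next' link; the loop ends when the feed has none.
    $feed = fetch_feed($pages, next_url($feed));
}
echo implode(", ", $ids) . "\n";
```

The loop never inspects page numbers itself; it only follows whatever 'next' link the server returns.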

In this tutorial we will zoom in on the Atom Protocol's link element and the way it's being used in the RAWS APIs. More specifically, we will look at the different kinds of atom:link elements that may be present inside of an atom:feed or atom:entry element. In our next tutorial, you will learn how to handle these links using the RAWS PHP client libraries.

The atom:link element

According to the Atom Syndication Format (RFC 4287), the 'atom:link' element defines a reference from an entry or feed to a Web resource. The reference itself is contained inside the "href" attribute. The "rel" attribute is used to give meaning to the reference: it specifies the type of relation that exists between the entry or feed that contains the link and the web resource that is referenced by the link. The "type" attribute contains a hint about the type of the representation that you can expect when following the link. The specification defines a few more attributes, but they are currently not being used by RAWS.

For example, a feed or entry that discusses the performance of the search engine at "http://search.example.com" might contain the following element:

<link href="http://search.example.com/" rel="related" type="text/html" />

The link references a web resource located at "http://search.example.com/", which is 'related' to the current feed or entry and will (in all probability) return an HTML page.
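
As a small illustration, the three attributes of this element can be read with PHP's SimpleXML extension. This is only a sketch and independent of the RAWS client; the fallback to "alternate" reflects the rule in RFC 4287 that a link without a "rel" attribute is interpreted as an alternate link.

```php
<?php
$link = simplexml_load_string(
    '<link href="http://search.example.com/" rel="related" type="text/html" />'
);

// Attributes are read with array-style access and cast to strings.
$href = (string) $link['href'];
$type = (string) $link['type'];
// Per RFC 4287, a link without a "rel" attribute is treated as "alternate".
$rel = isset($link['rel']) ? (string) $link['rel'] : 'alternate';

printf("%s link to %s, expected type %s\n", $rel, $href, $type);
```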

The following relation types are currently being used by RAWS: 'self', 'alternate', 'edit', 'enclosure', 'first', 'next' and 'last'.

Self, Alternate and Edit links

These types of relations allow for some introspection regarding the current feed or entry. Below are some examples of how they are being used (taken from the RATS example - get output list on our wiki).

<link href="http://rats.enc01.rambla.be/output/75/" rel="self" type="application/atom+xml"/>

The value 'self' signifies that the IRI in the value of the "href" attribute identifies a resource equivalent to the containing element (= the entry or feed). Basically, it refers to the location where the current entry or feed can be retrieved. In this case, the entry represents a RATS output profile.

<link href="http://rats.enc01.rambla.be/output/75/?alt=html" rel="alternate" type="text/html"/>

The value alternate signifies that the IRI in the value of the "href" attribute identifies an alternate version of the resource described by the containing element. In this case, the output profile in the atom entry can also be rendered as a web page, which can be inferred from the "type" attribute.

<link href="http://rats.enc01.rambla.be/output/75/" rel="edit" type="application/atom+xml"/>

The value edit signifies that the IRI in the value of the "href" attribute identifies a resource that is editable. In most cases, this means that you can change the resource - represented by the entry - by sending a PUT request to the edit link.
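
The sketch below puts the three relation types side by side. It wraps the link elements shown above in a hand-written entry and pairs each relation type with the request a client would typically send to it. The $actions table is our own summary of the explanations in this section, not something defined by RAWS or by the client library.

```php
<?php
// The three links above, wrapped in a hand-written entry for this sketch.
$xml = <<<XML
<entry xmlns="http://www.w3.org/2005/Atom">
  <link href="http://rats.enc01.rambla.be/output/75/" rel="self" type="application/atom+xml"/>
  <link href="http://rats.enc01.rambla.be/output/75/?alt=html" rel="alternate" type="text/html"/>
  <link href="http://rats.enc01.rambla.be/output/75/" rel="edit" type="application/atom+xml"/>
</entry>
XML;

// What a client would typically do with each relation type.
$actions = [
    'self'      => 'GET to re-fetch this entry',
    'alternate' => 'GET to retrieve another representation',
    'edit'      => 'PUT to change the resource',
];

$entry = simplexml_load_string($xml);
$plan = [];
foreach ($entry->link as $link) {
    $rel = (string) $link['rel'];
    $plan[$rel] = sprintf(
        '%s (%s): %s',
        (string) $link['href'],
        (string) $link['type'],
        $actions[$rel]
    );
}
echo implode("\n", $plan) . "\n";
```

Note that the 'self' and 'edit' links share the same URL here; it is the relation type, not the URL, that tells the client what the link is for.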

Enclosure links

The value enclosure signifies that the IRI in the value of the "href" attribute identifies a related resource that is potentially large in size and might require special handling. In the RAWS APIs, enclosure is used in entries that contain metadata about a file. The type attribute can be used as an indication of the file's MIME type. Depending on the specific service and/or resource, this file may be located on the CDN or may be accessible through a web service.

<link href="http://rats.enc01.rambla.be/transc/media/monty/test.flv/" rel="enclosure" type="video/x-flv"/>

An enclosure link inside an entry which is part of a RATS response always refers to a file that is available from RATS. This means that only the same, authenticated user is able to access the file (= in this case the result of a transcoding operation).

<link href="http://monty.cdn03.rambla.be/tutorial5/snake.mp4" rel="enclosure" type="video/mp4"/>

An enclosure link that is returned by the RASS 'item', 'dir' or 'meta' resource refers to the public location of a file on the CDN.

First, Next and Last links

These types of links are used for pagination: for collections that have too many entries to be returned in a single feed, the Atom Publishing Protocol defines a mechanism that allows the server to return the entries in consecutive, partial feeds. To control the pagination, a client application can check for "atom:link" elements in the feed with a "first", "next" or "last" relation type.

<link href="http://rass.cdn03.rambla.be/dir/tutorial/?kind=file;paginate_by=50;page=2" rel="next" type="application/atom+xml"/>

If a feed contains an atom:link element with a "next" relation type, this means that the response contains a partial feed and more entries can be retrieved by sending an HTTP GET request to the URL in the "href" attribute. The client may keep doing this as long as the returned feeds contain a "next" link. When the server returns the last partial feed (meaning all entries have been returned), that feed no longer contains a "next" link.

<link href="http://rass.cdn03.rambla.be/dir/tutorial/?kind=file;paginate_by=50;page=2" rel="last" type="application/atom+xml"/>

If a feed contains an atom:link element with a "last" relation type, the client can jump to the last (partial) feed by sending an HTTP GET request to the URL in the "href" attribute.

<link href="http://rass.cdn03.rambla.be/dir/tutorial/?kind=file;paginate_by=50;page=1" rel="first" type="application/atom+xml"/>

If a feed contains an atom:link element with a "first" relation type, the client can jump to the first (partial) feed by sending an HTTP GET request to the URL in the "href" attribute.
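
The pagination URLs above separate their query-string arguments with ';' rather than '&'. PHP's parse_str() splits on '&' by default, so a client that wants to inspect the 'page' or 'paginate_by' value of such a link has to split the query-string itself. The following is a minimal sketch of one way to do that; parse_raws_query() is a hypothetical helper and not part of the RAWS client.

```php
<?php
// Hypothetical helper: splits the query-string of a URL whose arguments are
// separated by ';' into a name => value array.
function parse_raws_query(string $url): array
{
    $query = (string) parse_url($url, PHP_URL_QUERY);
    $args = [];
    foreach (explode(';', $query) as $pair) {
        if ($pair === '') {
            continue; // tolerate an empty query or a trailing ';'
        }
        [$name, $value] = array_pad(explode('=', $pair, 2), 2, '');
        $args[$name] = urldecode($value);
    }
    return $args;
}

$next = 'http://rass.cdn03.rambla.be/dir/tutorial/?kind=file;paginate_by=50;page=2';
$args = parse_raws_query($next);
printf("page %s, %s entries per page\n", $args['page'], $args['paginate_by']);
```

In most cases a client does not need to parse these URLs at all: following the 'href' value as-is is enough.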

Currently, pagination support within RAWS is limited to resources that return large feeds. These include the RASS dir and meta resources and most of the RAMS resources. All other RAWS resources will return their collections in one single feed. For more details about RAWS and pagination, see the dedicated wiki page.

In the previous tutorial, we learned how to retrieve ATOM collections from RAWS. More specifically, we retrieved information about our files and directories on the CDN using the RASS API's dir resource. This time, we will repeat these requests using the PHP client from tutorial 2.

The complete source code for this tutorial is also part of the client download package ('rass_basics_get_feed.php' in the 'samples' directory). You can try running it yourself after editing the named constants at the top of the script. For more information about using the PHP client, see tutorial 2.

Pre-configuration

This tutorial assumes that some files and directories have already been created on the CDN for user 'monty'. The sample script creates the files from a local file, whose path needs to be set in the named constant 'LOCAL_FILE'.

  • Files located under the root-directory: test1.mp4, test2.mp4.
  • Files located under a 'bucks' sub-directory: bunny1.mp4, bunny2.mp4.

Get files list

First we'll retrieve a list of all files inside our root-directory, by sending a GET request to "http://rass.cdn01.rambla.be/dir/?kind=file". To do this, we create a Rass connection object and call getDirFeed() on it, passing "/" as the first argument. As the second argument, we pass a Rass_DirQuery object which has a 'kind' property that has been set to "file" (in the constructor, see below for more).

If the request succeeds, our method will return a Rass_DirFeed object. Otherwise, it will throw a Zend_Gdata_App_Exception.

require_once './RawsClient/Raws/Rass.php';
$rass = new Rass("monty", "mypwd", "http://rass.cdn01.rambla.be/");
$query = new Rass_DirQuery("file");
$dir_feed = $rass->getDirFeed("/", $query);

The returned $dir_feed object contains an array of Rass_DirEntry objects, which encapsulate the ATOM entries from the response (see tutorial 2). These entries provide information about our files and/or directories.

foreach ($dir_feed as $entry) {
    # Retrieve the entry element's "kind" and "path" attributes
    echo "Found " . $entry->kind . " entry with path = " . $entry->path . "\n";
    # Retrieve the value of some file properties
    echo "Filename: " . $entry->content->params->name->text . "\n";
    echo "Filesize: " . $entry->content->params->size->text . "\n";
    # Retrieve the public URL of the file (for download by end-users from the CDN)
    echo "Download URL: " . $entry->get_enclosure_url() . "\n";
}
>> Found file entry with path = /test1.mp4
>> Filename: test1.mp4
>> Filesize: 13827
>> Download URL: http://monty.cdn01.rambla.be/test1.mp4
>>
>> Found file entry with path = /test2.mp4
>> ...

Manipulating query-string arguments

Query-string arguments are used by RAWS to further specify a request. Depending on the resource, they can be used as a search query, a resource filter, a response format indicator...

The PHP client defines resource specific query-string objects (= all derived from RawsQuery) which follow a naming convention. For example, if you want to send a request to the RASS 'dir' resource, you instantiate a Rass_DirQuery object.

The actual query-string arguments can be passed in the constructor of these objects or can be set on them as properties. In our previous example, we created an instance of Rass_DirQuery like this:

$query = new Rass_DirQuery("file");

Which can also be written as:

$query = new Rass_DirQuery();
$query->setKind("file");
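
To illustrate why both forms are equivalent, here is a rough, self-contained sketch of how such a query object could be put together. The class name DirQuerySketch and its toQueryString() method are invented for this example and do not reflect the actual implementation of Rass_DirQuery; joining the arguments with ';' is an assumption based on the URLs that RASS returns in its feeds.

```php
<?php
// Hypothetical stand-in for a resource-specific query object.
class DirQuerySketch
{
    private $args = [];

    // Passing the kind to the constructor is just a shortcut for setKind().
    public function __construct(?string $kind = null)
    {
        if ($kind !== null) {
            $this->setKind($kind);
        }
    }

    public function setKind(string $kind): void
    {
        $this->args['kind'] = $kind;
    }

    public function setPaginateBy(string $count): void
    {
        $this->args['paginate_by'] = $count;
    }

    // Joins the arguments with ';' (assumed separator, see above).
    public function toQueryString(): string
    {
        $pairs = [];
        foreach ($this->args as $name => $value) {
            $pairs[] = rawurlencode($name) . '=' . rawurlencode($value);
        }
        return $pairs ? '?' . implode(';', $pairs) : '';
    }
}

$a = new DirQuerySketch("file");
$b = new DirQuerySketch();
$b->setKind("file");
var_dump($a->toQueryString() === $b->toQueryString());

$b->setPaginateBy("10");
echo $b->toQueryString() . "\n";
```

Keeping the arguments in an array until the request is sent means they can be changed or added in any order, which is what makes it possible to reuse one query object for several requests.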

Get files list for sub-directory

To get the list of files stored under our 'bucks' sub-directory, we only need to change the first getDirFeed() parameter from "/" to "/bucks". The client will now send a GET request to 'http://rass.cdn01.rambla.be/dir/bucks/?kind=file'.

$dir_feed = $rass->getDirFeed("/bucks", $query);
foreach ($dir_feed as $entry) {
    echo "Found " . $entry->kind . " entry with path = " . $entry->path . "\n";
}
>> Found file entry with path = /bucks/bunny1.mp4
>> Found file entry with path = /bucks/bunny2.mp4

Get list of sub-directories

In this case, we simply set the 'kind' property of the Rass_DirQuery object to "dir". This will result in a GET request to "http://rass.cdn01.rambla.be/dir/?kind=dir".

$query->setKind("dir");
$feed = $rass->getDirFeed("/", $query);
foreach ($feed as $entry) {
    echo "\nFound " . $entry->kind . " entry with path = " . $entry->path . "\n";
}
>> Found dir entry with path = /bucks

In this third episode of our RAWS tutorial series, we will show you how to retrieve a collection of resources. We'll also explain how to tweak RAWS requests using query-string arguments. To demonstrate some of the REST mechanics involved, we will be using the cURL command line tool. In our next tutorial, we will show you how to make the same requests using the PHP client libraries.

Most RAWS resources are able to return a collection of resources in response to a GET request, in the form of an ATOM feed. The RASS API's dir resource returns a feed if the URL path part (after the resource indicator 'dir') points to a directory on the CDN. This feed contains information about files, sub-directories and/or the directory itself, depending on the arguments passed in the query-string part of the GET request.

To demonstrate this, some files and directories have been created on the CDN for user 'monty':

  • Files located under the root-directory: test1.mp4, test2.mp4.
  • Files located under a 'bucks' sub-directory: bunny1.mp4, bunny2.mp4.

Get files list

We'll start this tutorial by sending a 'GET dir' request for our root directory in order to retrieve the files it contains: "http://rass.cdn01.rambla.be/dir/?kind=file".

  • The path part of the URL is set to "/dir/". Since the resource indicator is only followed by a slash, the GET request is pointed at the root-directory.
  • The query-string contains a kind argument which is set to "file", indicating that we only want to receive entries that reference files.
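
The two rules above can be captured in a few lines of code. The function below is a hypothetical helper written for this tutorial (it is not part of any RAWS library): it builds the URL of a 'GET dir' request from a directory path on the CDN and a value for the kind argument.

```php
<?php
// Hypothetical helper: builds a 'GET dir' URL from a CDN path and a kind.
function dir_url(string $base, string $path, string $kind): string
{
    // Encode each path segment separately and drop empty segments, so that
    // "/", "/bucks" and "/bucks/" all produce a well-formed path.
    $segments = array_filter(explode('/', $path), 'strlen');
    $encoded = array_map('rawurlencode', $segments);
    $dir = $encoded ? implode('/', $encoded) . '/' : '';
    return rtrim($base, '/') . '/dir/' . $dir . '?kind=' . rawurlencode($kind);
}

$base = 'http://rass.cdn01.rambla.be/';
echo dir_url($base, '/', 'file') . "\n";      // files in the root-directory
echo dir_url($base, '/bucks', 'file') . "\n"; // files in the 'bucks' sub-directory
echo dir_url($base, '/', 'dir') . "\n";       // sub-directories of the root-directory
```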

The (relevant) output from curl looks like this:

curl --url "http://rass.cdn01.rambla.be/dir/?kind=file" --request GET --user monty:mypwd --verbose
> GET /dir/?kind=file HTTP/1.1
> Host: rass.cdn01.rambla.be

< HTTP/1.1 200 OK
< Content-Type: application/atom+xml

<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:raws="http://rambla.be/raws/ns-metadata/1.0">
  <id>http://rass.cdn01.rambla.be/dir/?kind=file;page=1;paginate_by=100</id>
  <link href="http://rass.cdn01.rambla.be/dir/?kind=file;page=1;paginate_by=100" rel="self" type="application/atom+xml"/>
  <entry raws:kind="file" raws:path="/test1.mp4">
    <id>http://rass.cdn01.rambla.be/item/test1.mp4/</id>
    <link href="http://rass.cdn01.rambla.be/meta/monty/test1.mp4/" rel="self" type="application/atom+xml"/>
    <link href="http://rass.cdn01.rambla.be/item/test1.mp4/" rel="edit" type="video/mp4"/>
    <link href="http://monty.cdn01.rambla.be/test1.mp4" rel="enclosure" type="video/mp4"/>
    <content type="application/xml">
      <params xmlns="http://rambla.be/raws/ns-metadata/1.0">
        <name>test1.mp4</name>
        <size>13827</size>
        <updated>2010-11-18 09:35:45</updated>
        <mimetype>video/mp4</mimetype>
      </params>
    </content>
  </entry>
  <entry raws:kind="file" raws:path="/test2.mp4">
    <id>http://rass.cdn01.rambla.be/item/test2.mp4/</id>
    <link href="http://rass.cdn01.rambla.be/meta/monty/test2.mp4/" rel="self" type="application/atom+xml"/>
    <link href="http://rass.cdn01.rambla.be/item/test2.mp4/" rel="edit" type="video/mp4"/>
    <link href="http://monty.cdn01.rambla.be/test2.mp4" rel="enclosure" type="video/mp4"/>
    <content type="application/xml">
      <params xmlns="http://rambla.be/raws/ns-metadata/1.0">
        <name>test2.mp4</name>
        <size>13827</size>
        <updated>2010-11-18 09:35:45</updated>
        <mimetype>video/mp4</mimetype>
      </params>
    </content>
  </entry>
</feed>

The response body has an atom:feed root-element which is the container for a number of atom:entry sub-elements, each containing information about a file located directly under the root-directory. In this example, the feed contains 2 entries.

Each entry element has an atom:id sub-element, containing the URI at which the corresponding item resource can be accessed. Also note the 'enclosure' link which points to the file's public URL, to be used by end-users for downloading the file from the CDN. The file's properties are inside the raws:params element (sub-element of atom:content).
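
If you prefer to see how a client could pick these values out of the response, here is a minimal sketch using PHP's SimpleXML extension. It works on a shortened, hand-copied version of the response above (one entry, only the enclosure link) and does not use the RAWS client libraries, which are covered in the next tutorial. The raws:kind and raws:path attributes and the params element are read through the RAWS metadata namespace.

```php
<?php
// Shortened copy of the response above: one entry, only the enclosure link.
$xml = <<<XML
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:raws="http://rambla.be/raws/ns-metadata/1.0">
  <id>http://rass.cdn01.rambla.be/dir/?kind=file;page=1;paginate_by=100</id>
  <entry raws:kind="file" raws:path="/test1.mp4">
    <id>http://rass.cdn01.rambla.be/item/test1.mp4/</id>
    <link href="http://monty.cdn01.rambla.be/test1.mp4" rel="enclosure" type="video/mp4"/>
    <content type="application/xml">
      <params xmlns="http://rambla.be/raws/ns-metadata/1.0">
        <name>test1.mp4</name>
        <size>13827</size>
      </params>
    </content>
  </entry>
</feed>
XML;

const RAWS_NS = 'http://rambla.be/raws/ns-metadata/1.0';

$feed = simplexml_load_string($xml);
$rows = [];
foreach ($feed->entry as $entry) {
    // Namespaced attributes and elements are selected by namespace URI.
    $attrs  = $entry->attributes(RAWS_NS);
    $params = $entry->content->children(RAWS_NS)->params;

    // Find the public download URL among the entry's links.
    $download = null;
    foreach ($entry->link as $link) {
        if ((string) $link['rel'] === 'enclosure') {
            $download = (string) $link['href'];
        }
    }

    $rows[] = [
        'kind' => (string) $attrs['kind'],
        'path' => (string) $attrs['path'],
        'name' => (string) $params->name,
        'size' => (int) $params->size,
        'url'  => $download,
    ];
}

foreach ($rows as $row) {
    printf("%s %s (name %s, size %d) -> %s\n",
        $row['kind'], $row['path'], $row['name'], $row['size'], $row['url']);
}
```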

Get files list for sub-directory

To get the files that are stored under our 'bucks' sub-directory, we simply have to add this directory to the URL path of our GET request: "http://rass.cdn01.rambla.be/dir/bucks/?kind=file". RASS will now return a feed with 2 entries referencing 'bunny1.mp4' and 'bunny2.mp4'. The feed's atom:id element looks like this:

http://rass.cdn01.rambla.be/dir/bucks/?kind=file;page=1;paginate_by=100

To find out where a file is located, look at the raws:path attribute of the entry element which points to the relative path of the file on the CDN.

<entry raws:kind="file" raws:path="/bucks/bunny1.mp4">

Get list of sub-directories

In our previous requests, we've indicated to RASS that we only want information about files. In the same way, we can retrieve information about sub-directories by setting the 'kind' query-string argument to "dir". The following GET request will return all sub-directories of our CDN root-directory: "http://rass.cdn01.rambla.be/dir/?kind=dir".

<feed xmlns="http://www.w3.org/2005/Atom" xmlns:raws="http://rambla.be/raws/ns-metadata/1.0">
  <id>http://rass.cdn01.rambla.be/dir/?kind=dir;page=1;paginate_by=100</id>
  <entry raws:kind="dir" raws:path="/bucks">
    <id>http://rass.cdn01.rambla.be/dir/bucks/</id>
    <link href="http://rass.cdn01.rambla.be/dir/bucks/?kind=dir" rel="self" type="application/atom+xml"/>
    <link href="http://rass.cdn01.rambla.be/dir/bucks/" rel="edit" type="application/atom+xml"/>
    <summary>31750 bytes, updated on 2010-11-18 09:35:45</summary>
    <content type="application/xml">
      <params xmlns="http://rambla.be/raws/ns-metadata/1.0">
        <name>bucks</name>
        <size>31750</size>
        <updated>2010-11-18 09:35:45</updated>
      </params>
    </content>
  </entry>
</feed>

Since we have only created a single sub-directory named "bucks", RASS returns a feed with a single entry. As you can see, the entry's raws:kind attribute is now set to "dir". The raws:path attribute contains the directory's relative path.
