Less Talk, More Code: Distributed objects - how to cope with objects scattered across multiple Fedoras

I was asked this question by Chris Wilper a couple of weeks ago:

From http://blogs.sun.com/georg/entry/open_respository_2008_day_3
"Scalability : objects can be placed in any object store on the
network, and located via their object meta data. This means that
scaling over multiple Fedora instances is a no brainer."

Sounds cool. How'd you do resolution? (e.g. a request comes in,
which repository is it in?)

At the time, I had tried a good number of solutions, with one more waiting on something to become stable enough for it to work. They all had their pluses and minuses, and they generally followed the tried and tested method of either a centralised source or a DNS-style system.

What I didn't write to him about was an idea that's been fermenting in my mind for a little while now but which wasn't thought through enough to explain at that time.

De-centralised database for Fedora object URIs

The base premise is to use something called a distributed hash table, or DHT, to hold the link between URI and base Fedora URL. Now, as DHTs are kinda new and tend to be found in 2 main fields - research projects and trackerless BitTorrent - I'll just write a brief summary of what they are.

Distributed Hash Tables (DHT)

From Wikipedia:

Distributed hash tables (DHTs) are a class of decentralized distributed systems that provide a lookup service similar to a hash table: (name, value) pairs are stored in the DHT, and any participating node can efficiently retrieve the value associated with a given name. Responsibility for maintaining the mapping from names to values is distributed among the nodes, in such a way that a change in the set of participants causes a minimal amount of disruption. This allows DHTs to scale to extremely large numbers of nodes and to handle continual node arrivals, departures, and failures.

DHTs form an infrastructure that can be used to build more complex services, such as distributed file systems, peer-to-peer file sharing and content distribution systems, cooperative web caching, multicast, anycast, domain name services, and instant messaging. Notable distributed networks that use DHTs include BitTorrent (with extensions), eDonkey network, YaCy, and the Coral Content Distribution Network.

(Emphasis my own)

As other, much more qualified people have written about the underlying algorithms that power DHTs, I will simply link to the articles that I have found illuminating. (A DHT can also be treated as a black-box technology if you wish.)

Distributed Hash Tables, Part I October 1st, 2003 by Brandon Wiley, Linux Journal

Wikipedia: Distributed Hash Tables

OpenDHT: User's Guide

Let's illustrate how having a DHT of URI-to-Fedora-server pairs will help us, by skipping ahead and imagining that each Fedora server already has a DHT node service of its own - just like a peer-to-peer filesharing program, it maintains a list of key-value pairs for the items it holds.

Let's now imagine 3 commands that a Fedora instance can use to work with its own DHT node - get, put and remove:

get key - gets the values associated with a given key
put key value - puts the given key-value pair into the DHT
remove key value - removes the given key-value pair from the DHT

So, as a new item is allocated to a given Fedora, it can add a pair to the DHT, the key being the item's URI <info:fedora/ns:id>, and the value being the base URL for that Fedora, e.g. <http://host:8080/fedora>


$ ./put.py info:fedora/uuid:00e41229-1c9f-4c1d-ac3a-b51d34bbbe8f http://archive.sers.ox.ac.uk:8080/fedora
Success

$ ./get.py info:fedora/uuid:00e41229-1c9f-4c1d-ac3a-b51d34bbbe8f 
http://archive.sers.ox.ac.uk:8080/fedora
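
To make the semantics of the three commands concrete, here is a minimal sketch of a single-process stand-in for a DHT node. The ToyDHT class is purely hypothetical and does no networking at all; it only shows that a key can hold several values and that remove takes away one specific key-value pair.

```python
from collections import defaultdict


class ToyDHT:
    """Hypothetical in-memory stand-in for a DHT node (no networking)."""

    def __init__(self):
        # Each key maps to a set of values, since a key may hold several.
        self._table = defaultdict(set)

    def put(self, key, value):
        """Store the given key-value pair."""
        self._table[key].add(value)

    def get(self, key):
        """Return all values stored under the key, as a sorted list."""
        return sorted(self._table.get(key, ()))

    def remove(self, key, value):
        """Remove one specific key-value pair, if it is present."""
        values = self._table.get(key)
        if values is None:
            return
        values.discard(value)
        if not values:
            del self._table[key]


dht = ToyDHT()
uri = "info:fedora/uuid:00e41229-1c9f-4c1d-ac3a-b51d34bbbe8f"
dht.put(uri, "http://archive.sers.ox.ac.uk:8080/fedora")
print(dht.get(uri))
```

A real DHT spreads the table across many nodes, but from the point of view of a single Fedora instance the interface can stay this small.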

Specific implementation notes

Bamboo is a very useful DHT implementation with a good amount of helpful documentation. It is also the basis for a good test-bed service, the OpenDHT service mentioned earlier.

In fact, the three commands illustrated above have real, live Python implementations, which are presented on the OpenDHT site: get.py, put.py, and rm.py. The underlying protocol is just simple XML-RPC, so most languages have solid libraries for interfacing with the API.
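
As a rough sketch of what such an XML-RPC client might look like in present-day Python, the functions below wrap a proxy object. The argument order (key, value, TTL, application name for put; key, maximum values, placemark, application name for get) and the use of a SHA-1 digest as the key are my recollection of how the OpenDHT example scripts work, so treat them as assumptions and check the user's guide. The application label and the default TTL are made up for illustration.

```python
import hashlib
import xmlrpc.client

APPLICATION = "fedora-resolver"  # hypothetical application label


def dht_key(uri):
    """Hash the URI to a fixed-size binary key (assumed: SHA-1 digest)."""
    digest = hashlib.sha1(uri.encode("utf-8")).digest()
    return xmlrpc.client.Binary(digest)


def put(proxy, uri, base_url, ttl_seconds=3600):
    """Store uri -> base_url; returns whatever status the gateway sends back."""
    value = xmlrpc.client.Binary(base_url.encode("utf-8"))
    return proxy.put(dht_key(uri), value, ttl_seconds, APPLICATION)


def get(proxy, uri, max_values=10):
    """Collect every value stored under the URI, following placemarks."""
    placemark = xmlrpc.client.Binary(b"")
    found = []
    while True:
        values, placemark = proxy.get(
            dht_key(uri), max_values, placemark, APPLICATION
        )
        found.extend(v.data.decode("utf-8") for v in values)
        if not placemark.data:  # an empty placemark means no more results
            break
    return found


# Usage, assuming GATEWAY_URL points at a gateway you have access to:
#   proxy = xmlrpc.client.ServerProxy(GATEWAY_URL)
#   put(proxy, "info:fedora/ns:id", "http://host:8080/fedora")
#   print(get(proxy, "info:fedora/ns:id"))
```

Passing the proxy in as an argument keeps the functions easy to try out against a fake object before pointing them at a live gateway.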

I see this service hooking into Fedora by using a listening service on the ActiveMQ message queue - put'ing the URI/URL pair when items are added, and rm'ing it when they are purged. The Fedora and the Bamboo service should be started and shut down together, adding or removing whole sets of pairs to or from the DHT, so that the DHT accurately reflects the accessibility of each Fedora item.
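
A minimal sketch of that listener logic follows, leaving out the ActiveMQ connection itself. It assumes the listener has already pulled a method name and an object id out of each message; the method names "ingest" and "purgeObject", the SetDHT stand-in and all of the function names are assumptions for illustration, not part of any existing service.

```python
class SetDHT:
    """Hypothetical stand-in for the local DHT node (in-memory only)."""

    def __init__(self):
        self.pairs = set()

    def put(self, key, value):
        self.pairs.add((key, value))

    def remove(self, key, value):
        self.pairs.discard((key, value))

    def get(self, key):
        return sorted(v for k, v in self.pairs if k == key)


def handle_event(dht, base_url, method, pid):
    """Mirror one repository event into the DHT."""
    uri = "info:fedora/" + pid
    if method == "ingest":          # assumed name of the "added" event
        dht.put(uri, base_url)
    elif method == "purgeObject":   # assumed name of the "purged" event
        dht.remove(uri, base_url)
    # Any other event leaves the lookup table untouched.


def announce_all(dht, base_url, pids):
    """On startup, publish every object this Fedora holds."""
    for pid in pids:
        dht.put("info:fedora/" + pid, base_url)


def withdraw_all(dht, base_url, pids):
    """On shutdown, withdraw every object this Fedora holds."""
    for pid in pids:
        dht.remove("info:fedora/" + pid, base_url)


BASE_URL = "http://host:8080/fedora"
node = SetDHT()
handle_event(node, BASE_URL, "ingest", "ns:id")
print(node.get("info:fedora/ns:id"))
```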

Now that we have a DHT, why not...

Another beneficial use of the hash table may be to encode certain information, such as the template URL for a HTML splash page (should one exist) - perhaps the template style as used by OpenSearch 1.1 - e.g. "http://archive.sers.ox.ac.uk:5000/resolve/{uri}" (Note: I have added this resolving service to handle both info:fedora/ns:id and info:fedora/ns:id/dsid type URIs - the first leads to the splash page and the second form redirects to the download for that given datastream)
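
As a sketch of how a client might use such a template, the snippet below fills in the {uri} slot and tells the two URI forms apart. Whether the resolving service expects the URI percent-encoded is an assumption on my part, and the function names are hypothetical.

```python
from urllib.parse import quote

TEMPLATE = "http://archive.sers.ox.ac.uk:5000/resolve/{uri}"
PREFIX = "info:fedora/"


def expand(template, uri):
    """Fill the {uri} slot, percent-encoding the URI (assumed to be wanted)."""
    return template.replace("{uri}", quote(uri, safe=""))


def classify(uri):
    """Return ("object", id, None) or ("datastream", id, dsid)."""
    if not uri.startswith(PREFIX):
        raise ValueError("not an info:fedora URI: " + uri)
    parts = uri[len(PREFIX):].split("/")
    if len(parts) == 1 and parts[0]:
        return ("object", parts[0], None)       # leads to the splash page
    if len(parts) == 2 and all(parts):
        return ("datastream", parts[0], parts[1])  # leads to the download
    raise ValueError("unrecognised info:fedora URI: " + uri)


print(expand(TEMPLATE, "info:fedora/ns:id"))
print(classify("info:fedora/ns:id/dsid"))
```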

Important final note

It is very important in distributed environments that you have universally unique identifiers (which may or may not involve the numerical UUIDs I promote the use of) - the important part is that each Fedora identifier is unique across the whole set of Fedora instances. As the only issue with joining these URI lookup tables across institutional boundaries is political, it may be a good thing to adopt a consistent and bullet-proof mechanism for ensuring that your id system is not going to collide with someone else's.
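
For what it's worth, minting an identifier in the info:fedora/uuid:... form used in the example above takes very little code. This is only a sketch, with a hypothetical function name, and it assumes random (version 4) UUIDs are acceptable for your repository.

```python
import uuid


def mint_fedora_uri():
    """Return a new info:fedora URI built on a random (version 4) UUID."""
    return "info:fedora/uuid:" + str(uuid.uuid4())


print(mint_fedora_uri())
```

Because nothing needs to be coordinated between institutions to generate these, each Fedora can mint ids independently with a negligible chance of collision.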

4 comments:

Anonymous said...: Ben, this seems like a lot of work to avoid using URLs. What is it buying you?; 23 April 2008 at 04:45
Ben O'Steen said...: If I could guarantee the URLs for the lifetime of the object, I would use URLs instead.

But the object can and does move around the system - whether it's from a staging area to the main archive, or whether it's a simple change of host for pragmatic reasons.

In a nutshell, it allows us to reap some of the benefits of unchanging URLs (copy-by-reference, etc) without maintaining a direct link between the retrieval URL and object itself.; 23 April 2008 at 07:30
Chris Wilper said...: Hi Ben, this is interesting. I've also wondered about the feasibility of using a decentralized resolver with Fedora (or any repository system, for that matter).

Centralized/hierarchical resolvers have certainly been shown to scale, and can be designed with a lot of fault tolerance. But a decentralized resolver could provide a lot of freedom (independence from DNS, or from baking protocol/owner assumptions into your IDs).

You may be interested in what's been done with Wuala. Here's a recent techtalk I ran across: http://youtube.com/watch?v=3xKZ4KGkQY8; 23 April 2008 at 11:10
Graham Fawcett said...: I wonder whether you could use consistent hashing, without using DHTs, and simplify your overhead somewhat. I know that some DHTs use consistent hashing; but perhaps CH alone might address your issue.

You may find this blog article of interest.; 5 May 2008 at 08:27