Monday 28 January 2008

Update

I am still here...

Just thought I'd put in a quick note to say that yes, I am still posting to this blog and yes, it is worth waiting for (I hope).

The posts that are sitting in my drafts at the moment are:

Image annotation and how to describe regions of images in a sound and stable manner. From talking to people in the #swig channel on freenode IRC, most notably Ben Nowack, I think that the format for doing this has pretty much crystallised.

Why is image region annotation important? It gives a consistent mechanism for recording information, such as:

- relating the area on a scanned page to words from an OCR'd text.

- identifying the same thing in multiple images: for example, identifying a person in multiple photos, labelling a burial site both in a photo and in a topographical survey map, or showing a particular phenotype being expressed on multiple slides of fruit flies.
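To make the two points above a little more concrete, here is a minimal Python sketch. It is not the format discussed on #swig, which this post does not describe. The `Region` and `Annotation` classes, the `images_showing` helper, the file names and the `person:alice` style labels are all hypothetical, made up purely for illustration, and it assumes simple rectangular regions measured in pixels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """A rectangular area of an image, in pixels (hypothetical structure)."""
    image: str   # identifier or URL of the image
    x: int       # left edge
    y: int       # top edge
    width: int
    height: int


@dataclass(frozen=True)
class Annotation:
    """Ties a region to whatever it depicts (hypothetical structure)."""
    region: Region
    about: str   # e.g. a word from OCR'd text, a person, a burial site


def images_showing(annotations, about):
    """Return the sorted, de-duplicated images in which `about` is marked."""
    return sorted({a.region.image for a in annotations if a.about == about})


annotations = [
    # An area on a scanned page related to a word from the OCR'd text.
    Annotation(Region("page-001.png", 120, 340, 85, 22), "word:archive"),
    # The same person identified in two different photos.
    Annotation(Region("photo-a.jpg", 40, 60, 200, 260), "person:alice"),
    Annotation(Region("photo-b.jpg", 310, 95, 180, 240), "person:alice"),
]

print(images_showing(annotations, "person:alice"))
# prints ['photo-a.jpg', 'photo-b.jpg']
```

Because every annotation has the same shape, finding the same thing across images becomes a simple filter, whatever the real format ends up looking like.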

I think that is enough to whet your appetite for why this is handy to have sorted out early on.

And as this blog is called 'Less Talk, More Code', the answer to 'but how do we add this?' is that I have found a good GPLv2 JavaScript library for drawing regions on images, and the format is simple to write for.


Another thing lurking in my drafts folder is "HOWTO: Building a fedora-commons backed web site from scratch, using open source tools" - Parts 1 & 2.


My aim is that someone following both parts should end up with a web site that will allow them to do what is normally expected of an archive. Handily, some of these points are listed in this photo from the JISC-CRIG unconference - http://www.flickr.com/photos/wocrig/2197484000/


"Put stuff in, get stuff out, Find stuff in, relationships between objects, annotations" plus content types, edit, event logging and visualisation and record pages built from one or more metadata sources.


Part 1 covers the mundane details of installing an OS (Ubuntu JeOS) and then acquiring all the dependencies, such as Sun Java 1.5, Fedora, Apache Solr, and the requisite Python libraries (PyXML, Pylons, and the libraries I've written), ready for part 2: building the interface itself.


The key to it, I hope, is that the build of the Python interface is described in such a way that it will help people do the same and allow them to think about what can be done further.


Progress? Part 1 is almost done and I am re-jigging part 2 to be more readable; I am trying to put part 2 into an order so that the simple things are dealt with first and the more complex things later, rather than in a 'function-by-function' structure.
