Creating good web URIs


This is taken from our BCAD project blog, but it might be worth sharing here too, seeing as I’m meant to be writing web development type stuff ;)

Well, one of the requirements of the BCAD project was to ensure that the new website came with a good set of consistent, stable Uniform Resource Identifiers (URIs).

Looking at page content, a page such as a biography on Carl Giles can be easily represented as follows:

http://www.cartoons.ac.uk/artists/carl-giles/biography

Human readable, simple to remember and “hackable” (i.e. you are able to break off parts of the URI to view a different resource). An additional bonus, though not what defines a good URI, is that the URI is SEO (Search Engine Optimisation) friendly. This is the easy part of mapping and determining URIs for the BCAD project, and we now have a good solid mapping in place for some key areas of the web site.
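To make “hackable” a bit more concrete, here is a minimal Python sketch that lists the URIs you get by breaking segments off the end of the path. The function name `parent_uris` is made up for illustration, and whether each shorter URI actually resolves depends on how the site is mapped.

```python
from urllib.parse import urlsplit, urlunsplit


def parent_uris(uri):
    """Return the URIs obtained by removing trailing path segments one at a time."""
    parts = urlsplit(uri)
    segments = [s for s in parts.path.split("/") if s]
    parents = []
    # Drop one trailing segment at a time, stopping before the bare site root.
    for end in range(len(segments) - 1, 0, -1):
        path = "/" + "/".join(segments[:end])
        parents.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return parents


for candidate in parent_uris("http://www.cartoons.ac.uk/artists/carl-giles/biography"):
    print(candidate)
```

For a hackable URI scheme, each of the printed URIs should lead somewhere sensible, such as the artist’s page and a list of artists.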

The above example happens to work neatly, but the archive records are a different problem. They are not so simple because of the way we pull a record out of the database. Every record comes with a unique reference number, which must appear somewhere in the URI for us to be able to obtain the resource.

http://www.cartoons.ac.uk/record/carl-giles/0001 was an initial suggestion, but it wasn’t acceptable because:

  1. It relies on results ALWAYS being returned in the same order. It assumes that you could change the artist and look up reference number ‘0001’ for that artist, but the system isn’t set up that way.
  2. Every record has a unique reference number, no matter who the artist is. The artist part of the above URI is actually meaningless.
  3. It is not “hackable”: http://www.cartoons.ac.uk/record/carl-giles does not point to anything.

All of this suggested that we were stuck with the following URI:

http://www.cartoons.ac.uk/record/0001

It fits the definition of a URI for sure, and it ticks a good number of the URI boxes, but it doesn’t really say much about the resource itself. So, without identifying records by anything other than the ID (i.e. the primary key or reference number), how can we improve things, SEO included?

Although this seems to be a bit of a dark area, with not much written about URIs for archive resources, not all is lost. An interesting blog article discusses the concept of ‘slugs’ within URIs: resource-describing “fluff” added to the URI purely to improve SEO (we say “fluff” because, although it has meaning for the resource, it doesn’t determine which record to grab). Where does the “fluff” come from? The record resource itself.

Remembering that we need to keep to Jakob Nielsen’s rule of URIs being less than 78 characters long, the most obvious field to choose is still the ‘Artist’ field. It would be nice to integrate ‘Caption’ and ‘Publisher’ into the URI, but while SEO might improve, readability and consistency would degrade dramatically (the caption, for instance, is more likely to change). By using URI slugs and the ‘Artist’ field, we end up with a record URI like so:

http://www.cartoons.ac.uk/record/0001-carl-giles
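As a rough illustration of how such a URI could be generated, here is a minimal Python sketch. The helper names `slugify` and `record_uri`, and the fallback to the bare reference number when the URI would hit the 78 character limit, are assumptions made for this example rather than a description of the actual BCAD code.

```python
import re
import unicodedata

MAX_URI_LENGTH = 78  # the length limit mentioned above
BASE = "http://www.cartoons.ac.uk/record/"


def slugify(text):
    """Lower-case the text and reduce it to ASCII letters and digits joined by hyphens."""
    normalized = unicodedata.normalize("NFKD", text)
    ascii_text = normalized.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


def record_uri(reference, artist):
    """Build a record URI from the reference number plus an artist slug."""
    slug = slugify(artist)
    uri = f"{BASE}{reference}-{slug}" if slug else f"{BASE}{reference}"
    # Assumption: if the slug makes the URI too long, drop the slug entirely.
    if len(uri) >= MAX_URI_LENGTH:
        uri = f"{BASE}{reference}"
    return uri


print(record_uri("0001", "Carl Giles"))
```

Because the slug is built only from the ‘Artist’ field, the URI stays stable even if a record’s caption is later edited.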

Here the “carl-giles” part of the URI is purely superficial and plays no part in determining the resource (hence the “fluff” tag); it just helps to describe the resource better within the URI and improves SEO as a result. Additionally, the idea is that if you were to enter http://www.cartoons.ac.uk/record/0001 without the artist name, the URI would still work and still point to the same resource.
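The lookup side of this can be sketched in a few lines of Python. This is only an illustration: it assumes reference numbers consist of digits only, and the names `RECORD_PATH` and `reference_from_path` are invented for the example.

```python
import re

# Assumption: a reference number is a run of digits; anything after the
# first hyphen is the optional slug and is ignored for lookup purposes.
RECORD_PATH = re.compile(r"^/record/(?P<reference>\d+)(?:-(?P<slug>[a-z0-9-]+))?/?$")


def reference_from_path(path):
    """Return the record reference number from a URI path, or None if there is none."""
    match = RECORD_PATH.match(path)
    return match.group("reference") if match else None


for path in ("/record/0001-carl-giles", "/record/0001", "/record/carl-giles"):
    print(path, "->", reference_from_path(path))
```

Both the slugged and the bare form yield the same reference number, while a path with no reference number yields nothing to look up. A site could also redirect the bare form to the slugged one so that each record has a single preferred URI, though that is a design choice beyond what is described here.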

