Digitised objects are not all the same size. Users want different things from them. What to do?
Descriptions of things live in library, archive and museum catalogues. Sometimes, each thing in the catalogue has its own page on the web.
Things get digitised. Now the institution has a digital version of the thing where there was once just a catalogue record. Now there might be a lot of content to show for each digitised thing in the catalogue. At least one image, possibly thousands. There could be transcriptions of the text in those images, and all sorts of other related content produced by digitisation. There could be content contributed only because the thing is now digitised. There can be huge variation in the amount of content associated with a thing, or the amount of content that might later accumulate for it over time.
Catalogue records are generally about the same size as each other, and any variation in the size of a catalogue record doesn’t really matter anyway. If there’s more metadata, it’s just a longer web page. But the things themselves are not all the same size. One thing might be a scrap of paper, another thing might be a 36-volume encyclopaedia. One thing might be a single-page letter, another thing might be the contents of a large archival box.
What was once a user experience challenge about how to present similar-looking things (catalogue records, API endpoints, blobs of data) now becomes a challenge about how to represent wildly different things (representations of objects and their associated content).
Consider these examples of catalogue records and their digital representations, from the Royal Society and elsewhere.
1 – Stomachs
The physical artifact is a pencil drawing of the stomachs of a turtle and a shark side by side. Two smaller drawings have been glued on top, one of a frog’s stomach and one of a snake’s stomach. The three drawings are separately created things, and have each been given their own identifiers in the archival hierarchy – there are three separate records, PT/73/1/20, PT/73/1/21, and PT/73/1/22.
There is only one image, because there was only one thing to photograph: the sheet of paper on which the turtle and shark stomachs were drawn and the other two attached. If we presented these three archive records as distinct resources on the web, each presentation would show the same image. We could crop the image differently in the three cases, but this would hurt the user’s experience of looking at the object. In a viewer, you’d want to see the whole piece of paper, just as you would if your interest in a drawing of a snake’s stomach led you to the archive in person and you had the sheet in front of you on the table. Although there are three catalogue records, they are all about different parts of one real world object.
2 – A letter
Referee’s report by Charles Robert Darwin on a paper by [William Benjamin] Carpenter, [‘Researches on the Foraminifera’]
This is a four-page letter, and a transcript is available. It makes sense to represent this object as a single reading experience in a viewer; it corresponds naturally to a human concept of a thing.
3 – A book
This example is A Tour in Wales, first published in 1778, here shown in the National Library of Wales: https://viewer.library.wales/4692237. The viewer owns most of the real estate of the web page, to avoid confusion about information conveyed about the object on the page (none in this case) and information conveyed within the viewer (all of it).
Other content and metadata that isn’t rendered by the viewer would have to live elsewhere, on a different page. 
4 – Sets of photographs
Sometimes a catalogue record describes more than one photograph. In this example from the Wellcome library, one archival record (and hence one web page with one viewer on it) comprises 73 assorted photographs and diagrams.
The record level item is a set of sometimes quite distinct things. A user can share a deep link to a particularly interesting photograph, but there’s no concept of that photograph as an independently addressable thing in its own right (at least, not for the user, not at the level of web pages). That photograph doesn’t have its own web page, somewhere additional content about it can live. Perhaps this photograph should have its own page, whereas any one of millions of book pages presented in the same way should not. But as far as the presentation goes it is exactly the same kind of thing as a book page; it is an image in a sequence rendered by a viewer. 
In The Royal Society pilot project each photograph generally has its own catalogue identifier and therefore (in our current thinking) its own item page. One of the case studies we will be looking at is Polar Exploration. For each of these photographs we have a whole web page, as long as we like, to say what we want about it and present its relationship with the rest of the world.
The different decisions about cataloguing the photographs in these two examples (which may have as much to do with time constraints and reasonable workload as with institutional practice) determine the prominence of an individual photograph on the web, its identity as a thing of interest in its own right, in a way that was irrelevant for physical visitors to the library requesting to look at the material in person. When there is one web page per catalogue entry, the user experience is quite different for photographs catalogued as singletons and photographs catalogued as members of a set. And therefore the significance of the object on the web is different in the two cases.
5 – A manuscript from the Newton Papers
This is one page from Newton’s early drafts of the Principia. The archival item is a 1560 page manuscript.
The volume of content that each image in the manuscript could carry is huge. Some of the Royal Society material is like this. There is a lot to say about each image, a lot of links inbound and outbound, a lot of commentary, a lot of tagging. The user experience must satisfy the needs of users who want to look at the image content, and the needs of those who want the wealth of surrounding information. Each image of the manuscript has the user interface requirements of a web page.
What is the focus of attention?
If something isn’t digitised, a visitor’s gaze can only alight on text on web pages that look pretty much the same regardless of the thing they describe. But if the thing is digitised, there are all sorts of places to look. There may be thousands of distinct views to see, or texts to read. When you can see objects, they look different from each other.
The user experience is usually addressed by putting the object in a box – a viewer – and including the viewer on a web page. This might be the catalogue page, or something directly linked from it. This can work very well, especially in large digitised collections and for book-like things, because the transition from web pages to enclosed navigation in a viewer makes sense for the object in that context. On the object’s page, the user explores the visible content of the object within its box. This allows the web page in which the viewer sits to delegate any potentially complex interior structure or content of an object to the viewer application. The page is for item-level descriptive metadata and content; if the user wants anything else, her focus of attention has to stay in the viewer when interacting with the object. There’s a clear boundary between the concerns of the hosting web page and the concerns of a viewer-of-objects. If the hosting web site is a library catalogue (or a web site that behaves in a catalogue-like way), then this separation of concerns works; the site’s purpose is to help the user find the object, so now let her look at it. The web page is about the thing; how a user interacts with the constituent parts of the thing is an interaction between the user and the viewer application, not an interaction between the user and the item page.
What about user experiences that are not catalogue-like? It’s not always just about finding the object THEN having a look at it. The institution might have a lot more to say about some objects than others. And the same object can have a different focus of attention in different contexts. A digitised version of Newton’s Principia or the manuscript of Middlemarch may be served well in a book reading UI in a catalogue-like user interface, but have a different presentation entirely in an interface made for close reading, annotation or study of that object, where every image has a story to tell, or a single paragraph could be the subject of an essay.
There’s quite a lot going on on any one page of Newton’s notes and drafts for the Principia (https://cudl.lib.cam.ac.uk/view/MS-ADD-03965/9) or in notebooks:
The amount of content, and its variety, suggests a web page for every image if we want to provide transcripts, commentary and ask users to contribute too. There’s simply too much going on to expect a generic book-reading interface to do what we want. We’re going to need to craft something extra.
In the Science in the Making project, we have archival items (catalogue records), and we think there will be a page for each item. We’re not, at this stage, required to handle book-like material in the same user interface. We want to show a lot of content for some of the things in the collection. Some of that content is about the item, and some of it is specific to individual item images. A transcription of a page of a letter, tags of people in a page, commentary on figures – these are examples of content that belongs to a particular image. We think we have enough of this content, and enough variation in user requirements, that each canvas also needs to have its own web page, its own place on the web that is about that image and carries some content about that image. But that means competition for the user’s attention in the information architecture of the site. Are we introducing a tension between a page for the object, and a page for each of its images? They have different requirements.
The item page carries content about the object. Its title, its description, its author, and links to other people associated with it through various roles. The item has links to other items. The item usually has descriptive metadata because the item was the thing that was catalogued.
A child page for each image gives us room for transcripts, direct tags on photographs or drawings or mentions of entities in text, commentary, explanation, notes and other annotated content. It’s a different kind of page.
Here are some wireframes for an item page for a multi-image object, a letter:
This page has object level information, the letter’s relationships to people and other objects. But in some treatments it also has the image of the first page visible, and a transcription of that first page. In those treatments the first page of the letter is favoured; we need additional UI to get to other image pages.
What happens on a child page? Is there any object level information there? Just the image-specific content?
The tension between these two different types of page is most clear when there is only one image for an item. Is that going to lead to confusion? For the user, what’s the difference between these two pages? They both appear to be about the same drawing, but in different ways. Why am I seeing two different pages for the same thing? If there is one image, do we merge the functionality of an object page and an image page? Would that also lead to confusion?
What happens when we don’t have any content (other than an image) for a page? Especially if it’s the only page and we separate out object functionality on one page and image functionality on another?
Is there anything wrong with offering multiple ways of viewing an object? To have an enclosed viewer if the user wants to just read through the letter, but provide the complexities and potential of item page and image pages for closer study and exploration? If we’re describing the objects using open standards then it’s easy for us to offer multiple ways of viewing the material, but is that desirable? If so, do we offer all the options all the time, for any object, or do we favour particular presentations for particular types of material? We don’t know what the user’s intention is or their relationship with the material. What to one user is just another page of a book is to another the foundation of a thesis.
The pilot project will allow us to put some of these ideas to the test.
The focus of attention is different…
- The focus of attention is different for different kinds of object.
- The focus of attention is different for the same object in different contexts.
- The focus of attention is different for different people.
- The focus of attention is different for the same object in the same context at different times, depending on how much additional content is available.
In the examples earlier, A Tour in Wales happens to be one of eight volumes, and all of them are part of a crowdsourcing project where they are presented in a website with one page per image. It’s the same resource driving the user interface in both cases. We’re providing a different environment to view it in, with very different functionality, for viewing and contributing complex additional structured content. The Royal Society content feels more like this, because of the amount of functionality available per image. In the Pennant example the difference between the page for the volume and the page for each image is obvious – the volume page has thumbnails linking to the image pages.
So, we have some competing UX requirements to balance in producing a template-driven presentation of the Science in the Making archive material, even though we don’t need to tackle things like books and newspapers. These tensions come from the different sizes of the material, and the different amounts of content. What other issues are there?
Apart from the tension between image and item pages there is another potential problem with presenting archive material outside of the archival context. We’re taking records that were created as members of a hierarchy, and reusing them in a site where navigation is driven by topics. We’re taking items out of a tree and expecting them to work well in a web. Or rather, we’re taking some items out of a tree – the pilot material is not the whole archive. We think that this will work for the pilot material, but that assumption is one that needs testing with user research as the pilot develops and starts to be used.
The four “focus of attention” assertions made above are hypotheses for user testing once the pilot is ready. Possible outcomes could be that a single set of templates might not be enough to cover all these cases or that we might need to compromise the UX in one direction. User types can range widely from a layperson who happens to be just a bit curious about the subject to a seasoned researcher. A single UX/UI approach to very different information needs might be possible, and might be achieved only after a few cycles of design and user testing.
Footnotes – IIIF implementation
 In IIIF terms, there could be one manifest for each of these three archival records, and each manifest would contain one canvas annotated by the same image in each case. We could use IIIF fragment selectors to crop the image differently in the three cases, but this would hurt the user’s experience of looking at the object. Although there are three catalogue records, it’s really just one object, and really just one manifest. Here, three catalogue records might all point to different parts of the same IIIF manifest, because the IIIF manifest is modelling the physical object, the sheet of drawing paper.
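As a rough illustration of the "three records, one manifest" idea, the sketch below maps each catalogue record to a region of a single shared canvas using an `#xywh=` fragment. The canvas URI, the pixel regions and the function names are all assumptions made up for this example; only the three record identifiers come from the catalogue, and no claim is made here about which record covers which drawing.

```python
# Hypothetical canvas URI; example.org is a placeholder, not a real endpoint.
CANVAS = "https://example.org/iiif/stomachs/canvas/1"

# Catalogue record -> (x, y, width, height) region on the shared canvas.
# The pixel values are invented purely for illustration.
REGIONS = {
    "PT/73/1/20": (0, 0, 4000, 3000),
    "PT/73/1/21": (2600, 200, 900, 1200),
    "PT/73/1/22": (300, 1500, 1100, 1000),
}


def record_target(record_id):
    """Return the canvas URI plus an xywh fragment for one catalogue record."""
    x, y, w, h = REGIONS[record_id]
    return f"{CANVAS}#xywh={x},{y},{w},{h}"


def canvas_of(target):
    """Strip the fragment to recover the canvas that the target points into."""
    return target.split("#", 1)[0]


for record_id in REGIONS:
    print(record_id, "->", record_target(record_id))
```

Every record resolves to the same canvas, so a viewer can always show the whole sheet and use the region only as a highlight or starting point rather than as a crop.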
 A 300 page book and a 73-image set of photographs are both described by IIIF manifests and rendered in a viewer. IIIF gives us extra presentation metadata to help convey the bookness of the first object – enough information about pagination to ensure a reading experience is an accurate representation of the physical object in a viewer that observes IIIF viewing hints. But the decision about the individuality of an image within a sequence is an information architecture and user experience decision, made in two steps: when modelling the objects in IIIF, and when deciding how to render them to the user.
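The two steps can be sketched in code. The dictionaries below loosely follow the shape of a IIIF Presentation 2.x manifest but are heavily simplified (canvases have no dimensions or images) and are not validated against the specification. The URIs, the helper names and the rule "only give canvases their own web pages when the sequence is marked `individuals`" are assumptions for illustration; that rule is one possible site policy, not something IIIF prescribes, and a paged object such as the Newton manuscript could equally warrant image pages.

```python
def build_manifest(manifest_id, label, canvas_count, viewing_hint):
    """Modelling step: describe an object and record a viewing hint."""
    canvases = [
        {"@id": f"{manifest_id}/canvas/{n}", "@type": "sc:Canvas", "label": str(n)}
        for n in range(1, canvas_count + 1)
    ]
    return {
        "@id": manifest_id,
        "@type": "sc:Manifest",
        "label": label,
        "sequences": [
            {
                "@type": "sc:Sequence",
                "viewingHint": viewing_hint,
                "canvases": canvases,
            }
        ],
    }


def image_page_urls(manifest, site_root):
    """Rendering step (hypothetical policy): one web page per canvas,
    but only when the sequence is hinted as 'individuals'."""
    sequence = manifest["sequences"][0]
    if sequence.get("viewingHint") != "individuals":
        return []
    return [
        f"{site_root}/image/{position}"
        for position, _canvas in enumerate(sequence["canvases"], start=1)
    ]


# Placeholder identifiers; neither URI refers to a real manifest.
book = build_manifest("https://example.org/iiif/book", "A book", 300, "paged")
photos = build_manifest("https://example.org/iiif/photos", "Photographs", 73, "individuals")

print(len(image_page_urls(book, "https://example.org/book")))
print(len(image_page_urls(photos, "https://example.org/photos")))
```

Both objects are modelled the same way; the only differences are the hint recorded at modelling time and the policy the site applies when rendering.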
 In IIIF terms, each of the canvases of each manuscript of the Newton Papers could be the carrier for content linked via the mechanism of annotation. Transcription, commentary, tags, citations, links outbound and inbound, descriptions – these are all content that a web page for a single image could convey to the user, and also content that we might want to capture from users on that page.
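A minimal sketch of how an image page might gather that content, assuming annotations shaped loosely like the Open Annotation objects used with IIIF Presentation 2.x. The canvas URI, the annotation texts and both helper functions are hypothetical placeholders; a real implementation would fetch annotation lists from a server rather than build them in memory.

```python
from collections import defaultdict

# Hypothetical canvas URI for one image of a manuscript.
CANVAS = "https://example.org/iiif/manuscript/canvas/9"


def make_annotation(target, motivation, text):
    """Build a simplified annotation carrying plain text onto a canvas."""
    return {
        "@type": "oa:Annotation",
        "motivation": motivation,
        "resource": {
            "@type": "cnt:ContentAsText",
            "format": "text/plain",
            "chars": text,
        },
        "on": target,
    }


def content_for_canvas(annotations, canvas_id):
    """Group annotation texts by motivation for a single canvas.

    The 'on' value may carry an #xywh fragment, so only the part
    before '#' is compared with the canvas URI.
    """
    grouped = defaultdict(list)
    for annotation in annotations:
        if annotation["on"].split("#", 1)[0] == canvas_id:
            grouped[annotation["motivation"]].append(annotation["resource"]["chars"])
    return dict(grouped)


# Placeholder content, not taken from any real manuscript.
annotations = [
    make_annotation(CANVAS, "sc:painting", "(placeholder transcription)"),
    make_annotation(f"{CANVAS}#xywh=100,200,300,150", "oa:commenting", "(placeholder commentary)"),
    make_annotation(f"{CANVAS}#xywh=400,50,80,80", "oa:tagging", "(placeholder tag)"),
    make_annotation("https://example.org/iiif/manuscript/canvas/10", "oa:commenting", "(another page)"),
]

print(content_for_canvas(annotations, CANVAS))
```

Grouping by motivation gives a page template separate slots for transcript, commentary and tags, and content contributed by users could be stored as further annotations targeting the same canvas.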