A couple of weeks ago Jessica wrote a post about her visit to Goldsmiths College to see how they were using EPrints software to make their digital collections accessible. Since then, as part of our project looking at all areas of ingest of digital archives, we have been thinking about what information users will want when searching our digital archive. Once again, what we thought would be a simple task has ballooned into something rather more complex, and it has opened up a number of questions about archive cataloguing.
To start at the beginning, the context of an archive collection is just as important as the content. Put simply, we can gain a greater understanding of the content of a letter if we know it was written by a Conservative Minister of Education rather than a Labour Minister (or a teacher, academic, or member of a pressure group). To provide this context, archive cataloguing is hierarchical: you start with the broad and work down to the specific. We follow certain ‘rules’, such as creating different levels of description and not repeating information at lower levels. We’ve already discovered that this causes problems when searching the online catalogue, because records appear in the hitlist detached from their hierarchy. It becomes even more difficult when trying to represent a catalogue using repository software. For example, if a search in EPrints returns records that exist merely to represent the hierarchy of the catalogue, it’s going to be pretty frustrating for the user to find there is nothing attached to them. It got us thinking about how people search, how relevant the hierarchy is to them, and how we can represent the hierarchy of the catalogue. At the moment we are thinking that a static diagram (probably displayed through the browse function) might be the best way to go.
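To make the hierarchy problem a little more concrete, here is a rough sketch in Python of how a record in a hitlist could be shown alongside the context it inherits from the levels above it. This is only an illustration under our own assumptions: the `Record` class, its field names, the reference codes and the sample titles are all made up for the example, and it says nothing about how EPrints or any other repository software actually stores records.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    """One level of description in a hypothetical hierarchical catalogue."""
    ref: str                          # reference code (invented for this example)
    level: str                        # e.g. "collection", "series", "file"
    title: str
    creator: Optional[str] = None     # often recorded only at the top level
    parent: Optional["Record"] = None


def breadcrumb(record):
    """Return the titles from the top of the hierarchy down to this record."""
    titles = []
    node = record
    while node is not None:
        titles.append(node.title)
        node = node.parent
    return " > ".join(reversed(titles))


def inherited_creator(record):
    """Walk up the hierarchy until a level that names a creator is found."""
    node = record
    while node is not None:
        if node.creator:
            return node.creator
        node = node.parent
    return None


# Invented sample data: the creator is stated once, at collection level.
collection = Record("ABC", "collection", "Papers of an Education Minister",
                    creator="A. N. Example")
series = Record("ABC/1", "series", "Committee papers", parent=collection)
item = Record("ABC/1/1", "file", "Agenda1", parent=series)

print(breadcrumb(item))
print(inherited_creator(item))
```

The point of the sketch is that the catalogue can keep following the ‘no repetition’ rule while the display layer fills in the missing context at search time, so a hit such as ‘Agenda1’ never appears on its own.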
It also dawned on us that we’re going to have to revisit our descriptive metadata. The details displayed in hitlist results (e.g. author, title, date) need to give the searcher enough information to decide whether a record is relevant to them. That’s when we realised titles like ‘Agenda1’ with no author are not going to be helpful in the slightest! It means those archivist ‘rules’, such as not repeating information at lower levels of a catalogue and using the original title given to a file by the creator, are less easily applied. It has also made clear to us how much work creators will need to do on their collections before passing them to us, to ensure that we have a good level of meaningful metadata (we’re all guilty of having titles like ‘Doc1’ or ‘FinalFinal’). We will need to work much more closely with our depositors to provide guidance on file naming, organisation and metadata creation, meaning the process of archive cataloguing will change significantly. Our aim is to have some procedures in place by the end of the project, and our next step will be to use JISC-funded guidance such as the Versions toolkit developed by LSE.
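As a rough idea of the kind of check that could be run on a deposit before it reaches us, the Python sketch below flags records that are missing the fields a hitlist needs, or whose title is made up entirely of generic words and numbers. Everything in it is an assumption for illustration: the field names, the list of generic words and the function names are our own inventions, not part of any existing tool or of the guidance mentioned above.

```python
import re

# Hypothetical list of words that say nothing useful on their own.
GENERIC_WORDS = {"doc", "document", "agenda", "final", "draft",
                 "untitled", "copy", "new"}

# Fields we assume a hitlist needs in order to be useful.
REQUIRED_FIELDS = ("author", "title", "date")


def is_generic_title(title):
    """True if every word in the title is a generic word or a number."""
    # Split on capital letters and digits so 'FinalFinal' -> Final, Final
    # and 'Agenda1' -> Agenda, 1.
    tokens = re.findall(r"[A-Z]?[a-z]+|[A-Z]+|\d+", title)
    if not tokens:
        return True
    return all(t.isdigit() or t.lower() in GENERIC_WORDS for t in tokens)


def metadata_problems(record):
    """Return a list of human-readable problems for one record (a dict)."""
    problems = [f"missing {field}" for field in REQUIRED_FIELDS
                if not (record.get(field) or "").strip()]
    title = (record.get("title") or "").strip()
    if title and is_generic_title(title):
        problems.append("uninformative title")
    return problems


for sample in ("Agenda1", "Doc1", "FinalFinal",
               "Agenda for governors meeting 1985"):
    print(sample, "->", is_generic_title(sample))

print(metadata_problems({"title": "Agenda1"}))
```

A check like this would not replace guidance for depositors, but it could give them quick feedback on which files need a better title or an author before transfer.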