One of the best ways of learning is by doing, but getting started with digitisation can be daunting. This Quick Start Guide gives a short overview of digitisation and some important questions to ask yourself, with links to more detailed information.
This guide focuses on digitising single specimens, or individual objects, rather than collections or groups of specimens. If you are interested in capturing data on collections or groups, we recommend reading the overview of the Latimer Core data standard.
What is digitisation?
Digitisation is used as a blanket term to mean any creation of digital data relevant to collections. In a wider context, it usually means any creation of digital data from analogue objects or sources. For natural science collections, digitisation could include: where the specimen is stored and its conservation condition; photographs of specimens; transcribed data from labels, registers or notebooks; georeferenced data; any molecular and chemical analytical data derived from or associated with a specimen; linking up data to make it more findable and interoperable; and more!
Why are you digitising?
Your reason for digitisation will determine and influence what you need to prepare and consider. The three main reasons (which can overlap) are:
- Documentation/inventorying - usually taking a mass or industrial digitisation approach, capturing minimal data to increase searchability and support collections management but often allowing for subsequent enhancement. This can also include the valuation of collections.
- Research - natural science collections support a wide range of research. For research-based digitisation we recommend close collaboration with the researcher or research team to understand their requirements. Capturing high-level specimen data aids discoverability and enables research, especially when the data are published to GBIF.
- Public engagement - creating content for social media, virtual exhibitions, or more narrative-based pages.
How do you prioritise?
Prioritisation is usually linked to your reason for digitising, but may also be influenced by an institutional strategy or plan. It is often very specific to the institution, as it depends on many factors, including available resources and expertise. If you need inspiration on where to get started with prioritisation, we recommend our pages on plans and strategy, and the report by Bakker et al. (2018).
What are you digitising?
The kinds of specimens and their method of preservation will determine the methodology and approach. We have tried and tested workflows for collections of pinned insects, microscope slides and herbarium sheets. See our Workflows pages for more information.
Do you need to digitise in-house?
While many institutions start by doing digitisation themselves with their own staff, there are other options. If you lack the expertise, or need to scale up rapidly, you can consider outsourcing all or part of the digitisation. Outsourced work can be carried out off-site or on-site.
Examples of companies that offer outsourcing include Picturae and Bioshare.
How much are you digitising and over what period of time?
If you are uncertain about how many specimens are in the collection you plan to digitise, we recommend either counting, or estimating the size of the collection. If you combine this with a small pilot then you will be able to estimate how long it will take to digitise a collection, which in turn can be used to estimate staff costs.
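As a rough illustration of the estimate described above, the sketch below scales the throughput measured in a pilot up to a whole collection. The function name, the seven-hour working day and all of the figures are assumptions made for the example, not recommended values.

```python
import math


def estimate_project(specimen_count, pilot_specimens, pilot_hours,
                     hourly_cost, hours_per_day=7.0):
    """Scale a pilot's measured throughput up to the whole collection."""
    # Specimens digitised per staff hour during the pilot.
    rate = pilot_specimens / pilot_hours
    staff_hours = specimen_count / rate
    return {
        "rate_per_hour": rate,
        "staff_hours": staff_hours,
        # Round up: a partly used day still has to be staffed.
        "staff_days": math.ceil(staff_hours / hours_per_day),
        "staff_cost": staff_hours * hourly_cost,
    }


# Hypothetical figures: 12,000 specimens, and a pilot in which
# 200 specimens were digitised in 5 staff hours.
estimate = estimate_project(
    specimen_count=12000,
    pilot_specimens=200,
    pilot_hours=5,
    hourly_cost=20.0,
)
print(estimate)
```

A pilot rate tends to be optimistic or pessimistic depending on how representative the pilot specimens were, so it is worth treating the result as a starting point and revising it as real rates become available.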
Can you divide your digitisation project into chunks?
We recommend dividing your digitisation projects into smaller, manageable chunks. This helps with planning, prioritisation, funding, team morale and practicality. Consider breaking up projects based on how the specimens are stored (e.g., by drawers, cabinets or rooms), and treating each chunk as a project management milestone. This gives you time to reflect on whether estimates and rates were accurate, to decide whether you need to modify a digitisation workflow, and to celebrate completing part of a project.
You may also want to have separate phases for the task clusters, as summarised in the Digitisation section.
What do you need to prepare in advance?
Some preparation can make your digitisation projects run more smoothly. This is covered comprehensively in the pre-digitisation checklist section of the guides. Some general preparation we recommend includes:
- Auditing/assessing the collection prior to digitisation to assess conservation or curatorial preparation requirements
- Running a pilot to test assumptions and new workflows
- Gathering or checking any relevant information and references, including taxonomic checklists
- Discussing your project with others to get feedback and advice
What identifiers or barcodes should you use?
When digitising, you should assign institutionally unique identifiers to specimens, and we strongly recommend that you encode these identifiers on barcode labels that are both human-readable and machine-readable.
Many older collections will have historical identifiers, so if you use a simple numbering system you should take care to avoid duplicates. Most modern collection management systems can generate specimen identifiers and check that they are unique. You can create specimen identifiers in Excel, but this is not recommended, as it is easy to make serious mistakes that can cost you time and money.
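The duplicate check described above can be sketched in a few lines. This is only an illustration of the idea, not a replacement for a collection management system: the `XYZ` prefix, the seven-digit width and the function name are all hypothetical.

```python
def next_identifiers(prefix, existing, count, width=7):
    """Return `count` new identifiers that do not clash with `existing`."""
    # Normalise historical identifiers so that stray spaces or a
    # different letter case cannot hide a duplicate.
    taken = {identifier.strip().upper() for identifier in existing}
    new_ids = []
    number = 1
    while len(new_ids) < count:
        candidate = f"{prefix}{number:0{width}d}"
        if candidate.upper() not in taken:
            new_ids.append(candidate)
            taken.add(candidate.upper())
        number += 1
    return new_ids


# Hypothetical historical identifiers already in use.
historical = ["XYZ0000001", "XYZ0000003", "xyz0000004 "]
print(next_identifiers("XYZ", historical, count=3))
# ['XYZ0000002', 'XYZ0000005', 'XYZ0000006']
```

Normalising before comparing matters because historical registers are rarely consistent in spacing or capitalisation.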
In addition to specimen identifiers, there are other kinds of identifiers you may encounter or want to use when digitising specimens.
Do you need to take photographs of a specimen to digitise?
The decision on whether to take a photograph, and what kind of photograph to capture, depends on the purpose of the digitisation project. For many purposes an overview image of the specimen at a reasonably high resolution (e.g., 600 dpi) has been considered a standard. However, for other purposes, including curation and transcription, an image of specimen labels can be more useful than a photograph of the specimen itself (e.g., a specimen where the diagnostic characters are not visible). In some cases photographing a label or other documentation can be a lot easier or safer than handling the specimen itself (e.g., liquid-preserved specimens where labels are visible on a jar or container, or very fragile specimens).
In general, the recommendation has been to take an overview image of the specimen, whether as part of a whole-drawer image or for the individual specimen. We also recommend photographing specimen labels or documentation: this information is often interpreted during transcription, so the photographs will be the only reference for checking the original verbatim information.
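To get a feel for what a given resolution means in practice, the following sketch converts the physical size of the area being imaged into pixel dimensions. The 297 x 420 mm area is a hypothetical example, not a standard specimen size, and the function name is an assumption made for the illustration.

```python
MM_PER_INCH = 25.4


def pixels_needed(width_mm, height_mm, dpi=600):
    """Return (width_px, height_px, megapixels) for an imaged area."""
    width_px = round(width_mm / MM_PER_INCH * dpi)
    height_px = round(height_mm / MM_PER_INCH * dpi)
    megapixels = width_px * height_px / 1_000_000
    return width_px, height_px, megapixels


# Hypothetical imaged area of 297 x 420 mm at 600 dpi.
width_px, height_px, megapixels = pixels_needed(297, 420, dpi=600)
print(width_px, height_px, round(megapixels, 1))
# 7016 9921 69.6
```

A calculation like this helps when checking whether a camera's sensor can deliver the target resolution in a single shot for the largest objects in a project.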
What equipment will you use?
The basics for digitisation are:
- a computer with an internet connection to record data and edit images
- a camera to photograph specimens, labels or other supporting documents (for documents, this could be a smart phone)
- a printer to create labels and barcodes
We have more detailed pages on different equipment setups and software you may want to consider depending on your needs.
What additional information will you capture?
Additional information can include where the specimen is stored in the collection, its conservation status, acquisition information, what the specimen is (taxonomic name), where it was collected (at country level), who collected it, and when it was collected.
A Minimal Information about a Digital Specimen (MIDS) standard is being developed that provides guidance about which data should be prioritised for capture at different stages of digitisation. This standard recognises three main levels of digitisation. Level 1 is a basic record, capturing the institution, the kind of specimen and a unique specimen identifier to provide a basic catalogue of the collection. Level 2 includes information that is generally considered important for specimen discovery and broad research questions, such as where and when the specimen was collected and who collected it. Level 3 expects full data entry and links to related information.
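The level descriptions above can be turned into a simple completeness check for your own records. The sketch below is only an illustration based on the summary in this guide: the field names are hypothetical and are not taken from the MIDS specification, and Level 3 is left out because "full data entry" cannot be judged from a short list of fields.

```python
# Hypothetical field names, loosely following the descriptions above.
LEVEL_1_FIELDS = ("institution", "specimen_type", "identifier")
LEVEL_2_FIELDS = ("locality", "collection_date", "collector")


def has_all(record, fields):
    """True if every field is present and not blank."""
    return all(str(record.get(field) or "").strip() for field in fields)


def capture_level(record):
    """Return 2, 1, or 0 (0 meaning 'not yet a Level 1 record')."""
    if not has_all(record, LEVEL_1_FIELDS):
        return 0
    if not has_all(record, LEVEL_2_FIELDS):
        return 1
    return 2


record = {
    "institution": "Example Museum",
    "specimen_type": "pinned insect",
    "identifier": "XYZ0000002",
    "collector": "",  # not yet transcribed
}
print(capture_level(record))
# 1
```

Running a check like this across a dataset gives a quick picture of how many records are ready for each stage and where further transcription effort is needed.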
How will you store images and data?
This is often one of the most challenging aspects of a large digitisation project and can be underestimated. A good strategy for digital preservation will provide a good foundation for a digitisation programme. Consider the level of access required since storage that does not include instant access may be significantly cheaper. See the IT Infrastructure pages for more information.
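Because storage needs are easy to underestimate, a back-of-the-envelope calculation early in planning can help. The sketch below multiplies the expected number of images by an average file size, the number of copies kept, and a margin for growth. Every figure and parameter name here is an assumption for illustration; measure real file sizes in a pilot before relying on the result.

```python
def storage_needed_tb(image_count, avg_image_mb, copies=2, growth_margin=0.2):
    """Estimate storage in terabytes (decimal: 1 TB = 1,000,000 MB)."""
    total_mb = image_count * avg_image_mb * copies * (1 + growth_margin)
    return total_mb / 1_000_000


# Hypothetical project: 100,000 images averaging 50 MB each,
# kept as two copies, with a 20% margin for growth.
estimate_tb = storage_needed_tb(100_000, 50, copies=2, growth_margin=0.2)
print(f"Roughly {estimate_tb:.1f} TB")
```

Splitting the estimate by access level, for example a working copy with instant access and an archive copy without, makes it easier to compare the cost of different storage options.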
How will you share or publish your digitised specimens?
GBIF (the Global Biodiversity Information Facility) is the aggregator for biological data, but it does not include earth sciences specimen data. GeoCASE is the aggregator for geological and palaeontological data.
References
Bakker, Hannco P.A.J., Willemse, Luc, van Egmond, Emily, Casino, Ana, Gödderz, Karsten, & Vermeersch, Xavier. (2018). Inventory of criteria for prioritization of digitisation of collections focussed on scientific and societal needs. Zenodo. DOI: 10.5281/zenodo.2579156
Nelson G, Paul D, Riccardi G, Mast A (2012) Five task clusters that enable efficient and effective digitization of biological collections. ZooKeys 209: 19-45. DOI: 10.3897/zookeys.209.3135
Latimer Core Guidance Documentation
Citation
Livermore, L. & Haston, E. (2023) DiSSCo Digitisation Guide: Digitisation Quick Start Guide. v.1.0. Available at: https://dissco.github.io/
Licence
CC-BY
Document Control
Version: 1.0
Changes since last version:
Last Updated: 28 July 2023