Meise Botanic Garden Herbarium Sheets
Table of contents
Overview
This workflow was designed for the first mass digitisation project DOE! (Digitale Ontsluiting erfgoedcollecties = Unlocking Heritage collections) of the herbarium BR at Meise Botanic Garden. The aim of this project was to digitise the entire African and Belgian collection of vascular plants housed at BR (ca. 1,2 million flat herbarium sheets) within 3 years (2015-2018).
Two workflow lines were created: one for the major part of the project where an external company was hired for the image capture and data transcription, a second line for digitising the exceptions in house.
Workflow
Pre-Digitisation Curation
There are two key tasks in the pre digitisation step at Meise Botanic Garden, both conducted in the herbarium rooms:
- pre-digitisation curation of the specimens
- pre-digitisation of the covers
These workflows were followed for all specimens and folders already inserted into the collection, so not for new incoming material or returned loans.
1. pre-digitisation curation of the specimens
This task was conducted by 15 technicians on a half time basis for 1,5 years. Also volunteers regularly helped. They have checked and prepared 1,2 million specimens this way.
\
Adding a white folder with a red dot was for marking specimens/sheets that didn’t need to be digitised by the external company. For example, specimens already digitised in previous projects, pictures, literature, manuscripts and photos of herbarium specimens were put in these folders.
\
All exceptions were digitised in house and were extracted from the collection and kept in separate boxes. For example multi gatherings, specimens completely kept in envelopes, sheets with only label information were kept aside. When a sheet was extracted, it was replaced by a post-it with the collector and number written on it. This way they could be inserted easily after digitisation.
\
2. pre-digitisation of the covers
The vascular plant herbarium specimens of BR are kept in 3 different subcollections: an African collection, a Belgian collection and a general collection. During this project all the specimens of the African and Belgian collection were digitised.
The specimens are stored in an alphabetical order by family, genus and species. There is a name tag every time the filing name changes. A QR barcode was added to the cover every time the name was different from the previous one. If the name wasn’t fully written on the first folder with that name, this name needed to be written on the folder so the external company could capture the complete name and no errors could be made.
Image Capture
As mentioned before they were two streams of imaging:
- the bulk part on a conveyor belt system by Picturae,
- the exceptions on our internal infrastructure
1. Outsourced image capture
Picturae installed a conveyor belt in a room next to the collection. They digitised the folders with a QR code and all the specimens. The specimens and folders were picked up in the collection and were digitised in the same order as they are stored in the collection. After imaging, they were brought back to the collection. Picturae used a tracking system to make sure that all the specimens were at their original location after digitisation. Some 5000 specimens were imaged a day. It took them less than a year to image all 1,2 million sheets.
2. Internal workflow
For the inhouse imaging 3 stations are used. They all use a Pentax 645Z camera, image transmitter software and a black background with a fixed color scale. Images are made with a resolution of 450 DPI.
While imaging, the TIFF files are stored on the server. For each imaging session, the operator needs to create a new folder using following structure: name operator/project/date
The specimens are renamed automatically after imaging. The filename of the image is changed into the barcode which is used afterwards for linking with the label data.
Image Processing
These processes are described in detail in the following publication: Nieva de la Hidalga , Paul L Rosin , Xianfang Sun , Ann Bogaerts , Niko De Meeter , Sofie De Smedt , Maarten Strack van Schijndel , Paul Van Wambeke , Quentin Groom Designing an Herbarium Digitisation Workflow with Built-In Image Quality Management Biodiversity Data Journal 8: e47051 doi: https://doi.org/10.3897/BDJ.8.e47051
The major difference in the workflow between the images created internally and the images created by Picturae is the creation of the JP2 and jpg derivatives. Picturae delivered us as well TIFF files as their JP2 and jpg derivatives while they needed to be generated in house for the internal images.
Electronic Data Capture
1. workflow outsourced label transcription
Label transcription was done by Alembo, a subcontractor of Picturae, based on the images. A protocol was set up for this transcription together with the three parties (Picturae, Alembo and BR) and look up lists were foreseen by BR for filing names, collectors, phytoregions and countries.
The following fields were transcribed: filing name, barcode, collector, collector number, country as given, country code, phytoregion, collection date, locality, altitude, altitude unit and coordinates as given
2. workflow internal process
The online data sources for filing names are the following:
IPNI
TROPICOS
GBIF
African Plants Database
The label information is transcribed from the specimen itself, not from the digital image.
BGBase is the content management system that is used.
3. Crowdsourcing
For the Belgian collection, another approach for label transcription was chosen then for the African collection.
DoeDat, a multilingual crowdsourcing platform based on DigiVol, was created to transcribe the label information from our Belgian collection by volunteers.
Quality control of the outsourced label transcription
The quality of the data was measured using a subsample of the data file. The size of the subsample was determined using the table below:
\
Two types of errors were distinguished: Identification and Transcription errors 1) Identification errors occur when:
- Data is entered into the wrong field or incorrect data is entered in a field;
- Data has not been entered despite it being present on the label. 2) Transcription errors are when data that have not been correctly transcribed from the label (typos). For each field, a penalty calculation was made and determined on the retrievability of the collections. This information was made available to the contractor via one of the tender annexes.
Georeferencing
No georeferencing was requested to the contractor or was done in house. Only when the coordinates were available on the sheet, they were transcribed as they appeared on the label.
Preserving and Publishing Data
TIFF files are stored on tape at 3 different locations at the Flemish institute of Archiving (meemoo) for long term preservation. JP2 and JPEG derivatives are stored at the Botanic Garden on servers at two different locations. These derivatives are used for display on our virtual herbarium botanicalcollections and GBIF
After the data export from BGBase, the related images are extracted from the archive and displayed on botanicalcollections and GBIF. All specimens have a permanent URI and the data is rdf readable.
Part of our digital collection is also consultable on plants.jstor.org and europeana.
Requirements
Hardware
Software
Image transmitter 2
Adobe
Set up digitisation station
Pentax 645Z
Lens 90 mm
Photostand
Lighting
Camera Settings
TIFF format
450 ppi
TV ⅛
AV F16
ISO 100
Other Sources
Section for links out to other related resources e.g blog posts, journal articles
Authors
Sofie De Smedt & Ann Bogaerts
Contributors
Mathias Dillen
References
Michel Giraud, Quentin Groom, Ann Bogaerts, Sofie De Smedt, Mathias Dillen, Hannu Saarenmaa, Noortje Wijkamp, Sarah Philips, Steven ven der Mije, Agnes Wijers, Zhengzhe Wu. Best practice guidelines for imaging of herbarium specimens ICEDIG deliverable 3.6
Nieva de la Hidalga , Paul L Rosin , Xianfang Sun , Ann Bogaerts , Niko De Meeter , Sofie De Smedt , Maarten Strack van Schijndel , Paul Van Wambeke , Quentin Groom (2020) Designing an Herbarium Digitisation Workflow with Built-In Image Quality Management Biodiversity Data Journal 8: e47051 doi: https://doi.org/10.3897/BDJ.8.e47051
Licence
Citation
De Smedt, S. & Bogaerts, A. (2022) Meise Botanic Garden Herbarium Sheet Workflow. version 1.0. Available at: https://dissco.github.io/HerbariumSheets/MeiseBGHerbariumSheets.html
Document Control
Version: 1.0
Changes since last version: N/A
Last Updated: 1 April 2022
Edit This Page
You can suggest changes to this page on our GitHub