Plant and fungal collections: Current status, future perspectives



[...] and in digital data. We also consider what collection types need to be further developed to support research, such as environmental DNA and cryopreservation of desiccation-sensitive seeds. Around 31% of vascular plant species are represented in botanic gardens, and 17% of known fungal species are held in culture collections, both these living collections showing a bias toward northern temperate taxa. Only 21% of preserved collections are available via the Global Biodiversity Information Facility (GBIF), with Asia, central and north Africa, and Amazonia being relatively under-represented. Supporting long-term collection facilities in biodiverse areas should be considered by governmental and international aid agencies, in addition to short-term project funding. Institutions should consider how best to speed up digitization of collections and to disseminate all data via aggregators such as GBIF, which will greatly facilitate use, research, and community curation to improve quality. There needs to be greater alignment between biodiversity informatics initiatives and standards to allow more comprehensive analysis of collections data and to facilitate linkage of extended information, facilitating broader use. Much can be achieved with greater coordination through existing initiatives and strengthening relationships with users.

KEYWORDS
botanical garden, culture collection, DNA and tissue bank, fungarium, GBIF, herbarium, seed bank, specimen

| INTRODUCTION
Natural history collections are a unique resource for research using morphological and molecular techniques (Funk, 2018). They also provide evidence for research into conservation and to tackle broader societal challenges, and are a resource for public education (Bakker et al., 2020; Kvaček et al., 2016). Each specimen represents a unique collection event. It is made up of the physical sample with its observable data and associated taxonomic, spatial, and temporal information. Further data can be added, including images, ecological information, and physical preparations (DNA sequences, slides, dissections), creating an "extended specimen" (Lendemer et al., 2020; Webster, 2017).

| HERBARIA AND FUNGARIA
According to the data in Index Herbariorum (http://sweetgum.nybg.org/science/ih/), as of December 2019, there are 3,324 active herbaria in the world, containing 392,353,689 specimens (Thiers, 2020). There are 178 countries with at least one herbarium (Thiers, 2020). Index Herbariorum organizes herbaria into regions following Brummitt, Pando, Hollis, and Brummitt (2001) (Figure 1a). The large specimen total for Europe reflects the European origin of the herbarium tradition and the fact that European herbaria hold many specimens from outside Europe gathered during the colonial expeditions of the 17th to 19th centuries. Temperate Asia, which includes both Russia and China, ranks third in terms of the number of herbaria and specimens, but has more staff associated with herbaria than either Europe or North America. The number of staff relative to the specimen total may serve as a proxy for the level of research and curation activity in regional herbaria. These ratios range from 1.6 staff per 100,000 specimens in Europe to a high of 11 staff per 100,000 specimens in the Pacific region. Some botanically diverse areas have few herbaria: the island of New Guinea has five herbaria and a vascular plant flora of 13,634 species (Cámara-Leret et al., 2020), compared to the UK with 223 herbaria and a vascular plant flora of around 7,400 native and naturalized species (BSBI, 2020).

FIGURE 1 [...] For institutions listed in GardenSearch, 32% have provided collection data to BGCI's PlantSearch, which presently includes 1,471,901 records representing 107,304 accepted species. The source of data in PlantSearch largely matches that of the location of the collections, most data coming from collections in Europe and North America; (c) a chart showing the proportion of microbial culture collections held in each continent (%) as recorded by WDCM; and (d) the results of a survey of the representation of both crop and wild seed plant taxa in the form of seeds in ex situ storage (short-, medium-, and/or long-term storage, including cryopreservation) at institutes across the world, including genebanks, seed banks, and botanic gardens. It is a compilation of separate but overlapping species lists from: (i) the MSB itself plus collections elsewhere within the MSB Partnership, but not duplicated at the MSB; (ii) seedbanks worldwide, using records from PlantSearch; and (iii) genebanks elsewhere worldwide, using accessions recorded on GENESYS-PGR, the global online portal for Plant Genetic Resources for Food and Agriculture.

Only a small proportion of herbaria in Index Herbariorum have provided information on their holdings by taxonomic group, that is, how many specimens are held, and how many are databased and imaged, for seed plants, algae, bryophytes, ferns and related groups, and fungi. As a result, these data are still too provisional to be of use, and as discussed later, this restricts understanding of collection gaps and what still needs to be digitized.
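The staffing proxy used in this section (staff per 100,000 specimens) is simple arithmetic, and a minimal sketch may make the unit explicit. The regional totals below are hypothetical placeholders chosen only to reproduce the two ends of the reported range; they are not values from Index Herbariorum, and the function name is an assumption for illustration.

```python
def staff_per_100k(staff: int, specimens: int) -> float:
    """Return the number of staff per 100,000 specimens."""
    if specimens <= 0:
        raise ValueError("specimens must be a positive count")
    return staff / specimens * 100_000


# Hypothetical regional totals (NOT taken from Index Herbariorum).
regions = {
    "Region A": {"staff": 320, "specimens": 20_000_000},
    "Region B": {"staff": 55, "specimens": 500_000},
}

# Round to one decimal place, matching the precision quoted in the text.
ratios = {
    name: round(staff_per_100k(r["staff"], r["specimens"]), 1)
    for name, r in regions.items()
}

for name, value in ratios.items():
    print(f"{name}: {value} staff per 100,000 specimens")
```

A higher value means more staff relative to holdings, which is why the text treats it as a rough indicator of curation and research activity rather than of collection size.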

Fungal cultures held in international microbial Biological Resource Centres underpin research and development and the global bioeconomy. These centers are well placed to strengthen infrastructure to aid governments as they strive to deliver their commitments to the United Nations' sustainable development goals (Antonelli, Smith, & Simmonds, 2019). However, to achieve this objective, they must not only consolidate their existing capacities but also evolve their approaches to meet the ever-changing requirements of their users [...] comm.). There is also evidence that at least some fungal spores and pollen can be conserved like orthodox seeds (Hong et al., 1999; Hong, Jenkins, Ellis, & Moore, 1998

| SEED BANKING
Seed banks were originally developed as a cost-effective means of ex situ conservation of plant diversity, to mitigate the anticipated loss of genetic resources from the world's major crops (Li & Pritchard, 2009). Of the more than 1,750 agricultural seed banks worldwide, most focus on species currently covered by the International Treaty on Plant Genetic Resources for Food and Agriculture (Hay & Probert, 2013). In total 57,051 species (17% of seed plants) have been banked including more than 9,000 taxa that are threatened with extinction; and 6,881 tree species, more than half of which are single country endemics and represent species from more than 166 countries.
Major seed banks index their collections using different stan- [...] Hence, further analysis is restricted to a single, diverse collection: that at the MSB (Box 1).

| DNA AND TISSUE BIOBANKS
The molecular revolution in plant and fungal science has driven a dramatic increase in the demand for biological samples of sufficient quality for genomic research. To satisfy this demand, biodiversity repositories and institutes have increasingly developed dedicated biobanks for preserving both tissue material (usually [...] In fungal groups, collections are dominated by Ascomycetes and Basidiomycetes (Figure 3b). We did not analyze the geographic origins of the collections, as the current goal of GGBN is to achieve a wide taxonomic representation rather than in-depth geographical coverage; hence only a few taxa are represented from more than one country.

| DIGITALLY ACCESSIBLE DATA
There has been a huge increase in the mobilization and use of digital collections data (Lendemer et al., 2020; Nelson & Ellis, 2019; Schindel & Cook, 2018). Figure 4a shows the broad application of GBIF mediated data in scientific pub-

| Taxonomic coverage and data mobilization
There are taxonomic, spatial, and temporal biases, uncertainties, and errors related to the data already mobilized ( [...] Heberling and Isaac (2018) note that there is a significant delay in making digital specimen records accessible. Figure 5b shows that vascular plant and, to a lesser extent, fungal specimen records have accumulated in GBIF more slowly for dates after 2012 than for earlier dates, and more slowly than observation records. Meyer et al. (2016) reported that although absolute numbers of unrecorded species were highest in the tropics, there was not a "tropical data gap" in the pattern of proportional taxonomic coverage.

| Spatial coverage
These authors noted that some emerging economies are even more under-represented regarding digitally accessible information than species-rich, low-income countries in the tropics. These authors

| What are the main taxonomic and geographical collection gaps?
Around 31% of vascular plant species are represented in botanic gardens, and 17% of known fungal species are held in culture collections. These living collections are geographically biased toward collections in Europe and North America and, for fungal cultures, Asia. Wild seed banks and plant and fungal biobanks are relatively recent and cover a smaller proportion of species (Figures 1d and 3).
Taxonomic and geographic diversity of non-digitized herbaria and [...]

FIGURE 6 Point distribution of specimen occurrence and species representation of plant and fungal specimen data in GBIF. Point distribution of (a) Vascular Plants; (b) Fungi; and (c) Bryophytes. GBIF maps were downloaded using leaflet (Cheng, Karambelkar, & Xie, 2019) in R (R Core Team, 2019) using the GBIF maps API (https://www.gbif.org/developer/maps). Vascular plants, fungi, and bryophytes were queried using taxonid and for preserved specimens only; (d) Ratio of species present in GBIF data compared to reported species at World Geographical Scheme for Recording Plant Distributions (WGSRPD) level 3 areas (Brummitt et al., 2001) from WCVP (2020), showing a mean ratio of 0.82 of the total species that we believe should be in each WGSRPD level 3 area according to WCVP. Areas in dark red are where the total number of species is relatively poorly represented by GBIF coverage; areas in dark blue are where the total number of species shows relatively good representation. WCVP data are likely to under-represent the presence of non-native species, thus ratios are likely to be inflated.

Meyer et al. (2016) report that high geographic coverage of collections was often associated with the botanical interests of institutions and major research and data mobilization programmes.
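The per-area ratio described for panel (d) of Figure 6 can be sketched as a count of species recorded in GBIF divided by the count of species a checklist reports for the same area. The sketch below is an illustration under stated assumptions, not the authors' workflow: the area codes, species names, and the function `coverage_ratios` are all hypothetical, and the ratio is taken to be a simple count ratio rather than a set overlap.

```python
from statistics import mean


def coverage_ratios(gbif_species, checklist_species):
    """Return {area: species count in GBIF / species count in checklist}.

    Both arguments map an area code to a set of species names.
    Areas with an empty checklist are skipped to avoid division by zero.
    """
    ratios = {}
    for area, expected in checklist_species.items():
        if not expected:
            continue
        observed = gbif_species.get(area, set())
        ratios[area] = len(observed) / len(expected)
    return ratios


# Hypothetical inputs: area codes and species names are placeholders.
checklist = {
    "AAA": {"sp1", "sp2", "sp3", "sp4"},
    "BBB": {"sp1", "sp5"},
}
gbif = {
    "AAA": {"sp1", "sp2"},          # half of the expected species are recorded
    "BBB": {"sp1", "sp5", "sp9"},   # sp9 stands for a non-native species absent from the checklist
}

ratios = coverage_ratios(gbif, checklist)
mean_ratio = mean(list(ratios.values()))

print(ratios)
print(mean_ratio)
```

Area "BBB" illustrates the caveat in the caption: a species recorded in GBIF but missing from the checklist pushes the ratio above 1, so a count-based ratio can overstate how well an area is covered.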

| Geographic and taxonomic gaps
Collection and digitization programmes need to consider where they can have the most impact on research and conservation and in part will be driven by the policies of biodiverse countries.
There is a tension between general collection and digitization to fill geographic and taxonomic gaps, and focused collection to provide evidence to solve particular science questions or societal challenges. Sampling of under-collected areas could be made more efficient by greater coordination of institutional priorities (Box 2). This coordination could be provided by existing regional initiatives such as the Latin American Botanical Network

BOX 2 Addressing collection gaps in the Karoo, South Africa
The plant and fungal diversity in large areas of South Africa remains poorly explored (Victor, Smith, Wyk, & Ribeiro, 2015). One such area is the Karoo, a biodiverse but poorly delimited region that has been earmarked for shale gas exploration, as well as the Square Kilometre

BOX 3 National level support for collections
The Natural Science Collections Facility (NSCF) of South Africa is a virtual facility comprising a network of 14 natural science collections institutions, including museums, herbaria, science councils and universities (www.nscf.co.za).
The purpose of the NSCF is to promote and upgrade natural science collections, making the collections and their asso-

| Supporting collections-based science
Strengthening collections in tropical biodiverse countries and countries with emerging economies should be an international priority, and financial mechanisms need to be developed to support the development of collections-based science (Heywood, 2017). Funding is often project based and focused on short-term delivery. It is crucial that long-term national infrastructure is developed, which in turn can provide support to shorter-term projects. The creation and support of expertise in collecting, managing, and using collections is also important. For example, Ethiopian government support and Swedish aid funding enabled the development of the Ethiopian National Herbarium and the training of Ethiopian nationals to create a center of expertise for the country and the surrounding region (Demissew, 2014). Box 3 gives a further example of national level support and Box 4 focuses on fungi.

| Digitization and data mobilization
Only 21% of preserved collections are available via GBIF, and 95% of these records cover only 38% and 26% [...] science platforms such as iNaturalist (Heberling & Isaac, 2018) and the growth of observation data in GBIF demonstrate the potential for increasing the number of recorded occurrences and for enabling a broader audience to participate in annotating specimens regarding their identity or other attributes of the organism, and in enhancing or correcting the data provided with the original collection. Persistent identifiers on specimen data will enable the tracking of specimen use and citation measures. Unique identifiers are also critical for linking specimen information to other genomic, trait, or relationship data, for linking occurrences of the same specimen in different datasets or aggregators, and for ensuring that information subsequently added to the specimen is available to all potential users (Bakker et al., 2020; Hedrick et al., 2019; Lendemer et al., 2020).

| Data quality
Inaccuracies in geographic coordinate information and inconsistent use of taxonomic names in collection data create difficulties in analyzing data sets (Ball-Damerow et al., 2019; Meyer et al., 2016; Mounce et al., 2017). Paul and Fisher (2018) and Ball-Damerow et al. (2019) stress the importance of creating better automated solutions to flag errors, such as the newly developed software CoordinateCleaner (Zizka et al., 2019), and efficient mechanisms to report data quality issues back to the source and correct them there. Although automated flagging of erroneous records can improve certain analyses (e.g., Maldonado et al., 2015), it is often crucial that specialists critically validate results for their taxonomic group of expertise (Zizka, Carvalho, et al., 2020). Making linkages between preserved specimen duplicates will help propagate annotations made on one specimen to other duplicate specimens, reducing curator time and increasing data quality (Nicolson, Paton, Phillips, & Tucker, 2018). Linking field images to specimen data via platforms such as iNaturalist will also assist broader, community level curation (Heberling & Isaac, 2018). The more complete the data associated with the specimen, the more potential uses it will likely have. A standard method for identifying the extent of digital information available from specimens in a collection is being developed (Wu et al., 2018).
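To make the idea of automated flagging concrete, the following is a minimal sketch of a few generic coordinate sanity checks (missing values, out-of-range values, zero coordinates, identical latitude and longitude). It is only loosely inspired by the kinds of tests such tools run; it does not reproduce the CoordinateCleaner API or its test suite, and the record identifiers, field names, and the function `flag_coordinates` are assumptions for illustration. Flags mark records for specialist review rather than for automatic deletion.

```python
def flag_coordinates(lat, lon):
    """Return a list of flags describing suspicious coordinates.

    An empty list means none of these simple checks fired; it does not
    prove that the coordinates are correct.
    """
    if lat is None or lon is None:
        return ["missing"]
    flags = []
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        flags.append("out_of_range")
    if lat == 0 and lon == 0:
        # (0, 0) is a common placeholder written when no location is known.
        flags.append("zero")
    elif lat == lon:
        # Identical values often indicate a data-entry or conversion error.
        flags.append("equal_lat_lon")
    return flags


# Hypothetical specimen records; identifiers and values are placeholders.
records = [
    {"id": "rec-1", "lat": -33.2, "lon": 22.5},
    {"id": "rec-2", "lat": 0.0, "lon": 0.0},
    {"id": "rec-3", "lat": 95.0, "lon": 10.0},
    {"id": "rec-4", "lat": None, "lon": 18.4},
    {"id": "rec-5", "lat": 12.5, "lon": 12.5},
]

flagged = {r["id"]: flag_coordinates(r["lat"], r["lon"]) for r in records}

for rec_id, flags in flagged.items():
    print(rec_id, flags if flags else "no flags")
```

Returning flags instead of dropping records keeps the specialist in the loop, in line with the point above that automated results should be validated by experts in the relevant taxonomic group.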
Recognizing the need for greater alignment between biodiversity informatics efforts, GBIF hosted the second Global Biodiversity Informatics Conference in 2018. An outcome of the conference was a call for action (Hobern et al., 2019)

| User engagement
Although much can be done with existing resources, engaging with a broad range of users will help recruit resources to assist with the development of collections to better address user needs.
Kvaček [...] There is a risk that if collections are not used they can be lost. [...] DiSSCo face a similar challenge. However, they can provide the overall coordination for a broader engagement with potential stakeholders than would be possible for a single institution.

| CONCLUSIONS-FUTURE LOOK
Data and material from collections can support a much broader range of study and use than has been seen historically and currently. Increase in use has been enabled by the development and wide accessibility of DNA sequencing technologies and other molecular analyses which expand the utility of collections (Bakker et al., 2020), the increase in computer power to analyze vast amounts of data, the development of web technologies allowing the semantic linking of information, and the proactive engagement of potential users. The actions suggested below would help capitalize on these developments and maximize the impact of collection data on research and societal challenges:
• Governmental and international aid agencies should aim to support long-term collection facilities and staff training, including collecting from identified biodiversity hotspots and poorly explored areas. This infrastructure will better support short-term funded projects to develop societal benefits from plant and fungal diversity.
• Collection-holding institutions should speed up digitization of collections. This will require changes in work practices and additional resources. Making all data visible via aggregators such as GBIF will greatly facilitate use, research and community curation to improve quality.
• There should be greater alignment between biodiversity informatics initiatives and standards such as agreed species level consensus taxonomies and use of specimen persistent identifiers. GBIF and the Alliance for Biodiversity Knowledge are playing a key role.
• Greater institutional coordination is needed to assist with both in-depth collection of a broad range of taxa in key areas to address national priorities and targeted collecting in under-collected areas. Regional and international initiatives that bring botanists and mycologists together could better coordinate activity to fill gaps.
• Institutions should invest in environmental biobank collections, following and developing GGBN data standards to better represent fungal material.
• Further research into the low-temperature ex situ storage of species with desiccation-sensitive seed is required.
• Collections-based institutions should work together to proactively build relationships with their current and potential new users.