Chapter 2 The importance of questions and sources of data

2.1 Questions

Any question to be asked of biodiversity data should be put as simply and succinctly as possible. With the number of different subject areas and techniques used, analyses can quickly become complex.

2.2 Taxonomies

It is important to be aware of likely taxonomic anomalies before working within a region. Checklists are very important, especially when working across several regions or countries. Whilst many tools automatically check the validity of a name, they do not check the validity of that species’ occurrence. For example, Sphagnum auriculatum and S. denticulatum are both valid names. S. auriculatum is the currently accepted species in Europe, but in the British Isles, Ireland and the Netherlands S. denticulatum is the most frequently recorded taxon. Using data from across the European region without acknowledging this disagreement would affect the results of any research undertaken. For taxa known to be capable of dispersing over great distances (e.g. birds), this becomes even more difficult, especially when using community-sourced data.
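
As a minimal sketch of why this matters, the Python snippet below counts records before and after mapping regionally used names onto a single accepted name. The mapping table, the `harmonise` helper and the four records are hypothetical and hand-made for illustration; in practice the mapping would come from an agreed checklist rather than being typed in by hand, and which name is treated as accepted is an assumption taken from the Sphagnum example above.

```python
from collections import Counter

# Hypothetical, hand-made mapping from a regionally used name to the name
# treated as accepted. Assumption: S. auriculatum is the accepted name,
# following the European usage described in the text.
ACCEPTED_NAME = {
    "Sphagnum denticulatum": "Sphagnum auriculatum",
}


def harmonise(name):
    """Return the accepted name for a recorded name.

    Names that are not in the mapping are passed through unchanged.
    """
    cleaned = " ".join(name.split())  # collapse stray whitespace
    return ACCEPTED_NAME.get(cleaned, cleaned)


# Invented occurrence records as (recorded name, country code); not real data.
records = [
    ("Sphagnum auriculatum", "SE"),
    ("Sphagnum denticulatum", "GB"),
    ("Sphagnum denticulatum", "NL"),
    ("Sphagnum palustre", "SE"),
]

# Counting on the raw names splits one taxon across two names.
raw_counts = Counter(name for name, _ in records)

# Counting on harmonised names pools the records under the accepted name.
harmonised_counts = Counter(harmonise(name) for name, _ in records)

print(raw_counts)
print(harmonised_counts)
```

Without the mapping, the same taxon appears as two separate entries with fewer records each, which would distort any richness or frequency estimate drawn across regions.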

For Sweden, there is an agreed species taxonomy, accessible through Dyntaxa and the R library dyntaxa.

2.3 Data sources

Depending on the questions being asked, there are many different resources available. We focus on biodiversity data.

2.3.1 Biodiversity record data

There are a large number of online resources available. These include, but are not limited to, the following (R libraries that connect to each database are given in parentheses):

  • Swedish Biodiversity Data Infrastructure - Sweden’s data portal for biodiversity data
  • Global Biodiversity Information Facility - International organization aggregating biodiversity data. Contains data from a mixture of sources: curated collections, community science data, ecological research projects etc. (rgbif, spocc)
  • BioCASE - A European transnational biodiversity repository
  • eBird - American database of bird observations (auk, rebird, spocc)
  • iNaturalist - International community science observation repository (spocc)
  • Berkeley ecoengine - Access to UC Berkeley’s natural history data (spocc)
  • VertNet - Vertebrate biodiversity collections (rvert, spocc)
  • iDigBio - Integrated Digitized Biocollections (ridigbio)
  • OBIS - Ocean Biodiversity Information System (robis)
  • ALA - Atlas of Living Australia (galah)
  • Neotoma Paleoecology Database (neotoma)

2.3.2 Taxonomic diversity

To keep track of the ever-changing taxonomy of species, there are different databases that follow different standards.

2.3.3 Functional diversity

Very broadly, functional diversity is the diversity of what organisms do (Petchey and Gaston 2006). It can be described using direct physical measurements of the traits of the organisms involved and/or data summarized from published works. There are databases dedicated to distributing such scientific data. Such resources include:

2.3.4 Genetic databases

Genetic data may relate directly to the samples used, to phylogenetic trees generated from another data set, or to some other genetic aspect. Such resources include:

2.3.5 Environmental data

There are a number of environmental repositories available. Static data sets for global resources include:

2.3.5.1 Sweden

2.3.5.2 Global and European

References

Petchey, Owen L., and Kevin J. Gaston. 2006. “Functional Diversity: Back to Basics and Looking Forward.” Ecology Letters 9 (6): 741–58. https://doi.org/10.1111/j.1461-0248.2006.00924.x.
Tyler, Torbjörn, Lina Herbertsson, Johan Olofsson, and Pål Axel Olsson. 2021. “Ecological Indicator and Traits Values for Swedish Vascular Plants.” Ecological Indicators 120 (January): 106923. https://doi.org/10.1016/j.ecolind.2020.106923.