Using R tools for analysis
of primary biodiversity data provided by SBDI
2024-05-06
Introduction
Biodiversity resources are increasingly international. SBDI has made an effort to channel biodiversity data and resources to help the research community access and analyse Swedish primary biodiversity data. Each research question brings its own challenges. Our aim here is to provide a few examples that prompt questions you may need to ask at different stages of the process. Whether a particular method is valid and appropriate must be judged by the individual researcher(s). For a comprehensive workflow on how to treat and analyse primary biodiversity data, please refer to our tutorial on biodiversity analysis tools, where we go through the complete workflow: Data -> Cleaning -> Fitness evaluation -> Analysis.
R and Mirroreum
The present tutorial is focused on the statistical programming language R. R is a free software environment for statistical computing and graphics that is widely used within the scientific community and where the complete analysis workflow can be documented in a fully reproducible way.
At SBDI we provide researchers and students with access to Mirroreum, an online web-based environment for Reproducible Open Research in the area of biodiversity analysis. Mirroreum is based on a Free and Open Source software stack. After logging in, you immediately get access to a web-based version of RStudio with a large number of pre-installed packages, including all the packages offered by rOpenSci.
Compared to running RStudio on your own machine, Mirroreum offers more computational resources and a standardized environment where you can rely on all the relevant packages being installed and the configuration parameters being set appropriately. To learn more about Mirroreum or to request an account, please visit the SBDI documentation site.
sbdi4r2 - a new R 📦 to search and access data
The sbdi4r2 package enables the R community to directly access data and resources hosted by SBDI. The goal is to enable observations of species to be queried and output in a range of standard formats. It includes filter functions that let you narrow a query before download, as well as some simple functions for summarising and exploring data. The examples included in this tutorial also show how you can continue exploring and analysing the data using other R packages.
Please refer to the package documentation for details on how to install it. Once installed the sbdi4r2 package must be loaded for each new R session:
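A minimal sketch of that step. The `requireNamespace()` guard and the `sbdi_available` name are additions for illustration, so that the snippet also runs where the package is missing; in an interactive session a plain `library(sbdi4r2)` should be enough.

```r
# Check whether sbdi4r2 is installed before attaching it.
# `sbdi_available` is a hypothetical helper name used only in this sketch.
sbdi_available <- requireNamespace("sbdi4r2", quietly = TRUE)

if (sbdi_available) {
  library(sbdi4r2)
} else {
  message("sbdi4r2 is not installed; see the package documentation.")
}
```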
Various aspects of the sbdi4r2 package can be customized.
E-mail address
Each download request to SBDI servers is accompanied by an “e-mail address” string that identifies the user making the request. You will need to provide an e-mail address registered with SBDI. You can create an account here. Once an e-mail address is registered with SBDI, it should be stored in the config:
Alternatively, you can provide this e-mail address as a parameter directly to each call of the function occurrences().
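A minimal sketch of storing the address, assuming that `sbdi_config()` accepts an `email` argument in the same way as its galah counterpart `galah_config()`. The address shown is a placeholder, and `my_email` is a name introduced here for illustration only.

```r
# Placeholder address; replace it with the e-mail registered with SBDI.
my_email <- "your.email@mail.com"

# Assumption: sbdi_config() takes an `email` argument.
# The call is skipped when sbdi4r2 is not installed.
if (requireNamespace("sbdi4r2", quietly = TRUE)) {
  sbdi4r2::sbdi_config(email = my_email)
}
```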
Setting the download reason
SBDI requires that you provide a reason when downloading occurrence data (via the sbdi4r2 atlas_occurrences() function). You can provide this as a parameter directly to each call of atlas_occurrences(), or you can set it once per session using:
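A minimal sketch of the per-session setting, assuming that `sbdi_config()` accepts a `download_reason_id` argument in the same way as galah's `galah_config()`. The value 10 is the ID this tutorial lists for “testing”; `reason_id` is a name introduced here for illustration only.

```r
# 10 is the ID listed for "testing"; pick the ID that matches your use.
reason_id <- 10

# Assumption: sbdi_config() takes a `download_reason_id` argument.
# The call is skipped when sbdi4r2 is not installed.
if (requireNamespace("sbdi4r2", quietly = TRUE)) {
  sbdi4r2::sbdi_config(download_reason_id = reason_id)
}
```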
See sbdi_reasons() for valid download reasons, e.g.:
- 3 for “education”
- 7 for “ecological research”
- 8 for “systematic research/taxonomy”
- 10 for “testing”
Other packages needed
Some additional packages are needed for these examples. Install them if necessary with the following script:
# Packages used in the examples of this tutorial
to_install <- c("colorRamps", "cowplot", "dplyr",
                "ggplot2", "leaflet", "maps", "mapdata",
                "remotes", "sf", "tidyr", "xts")
# Keep only the packages that are not installed yet
to_install <- to_install[!sapply(to_install,
                                 requireNamespace,
                                 quietly = TRUE)]
if (length(to_install) > 0) {
  install.packages(to_install,
                   repos = "https://cran.us.r-project.org")
}
# BIRDS is installed from its GitHub repository
remotes::install_github("Greensway/BIRDS")
Your collaboration is appreciated
Open source also means that you can contribute. You don’t need to know how to program; every input is appreciated. Did you find something that is not working? Do you have suggestions for examples or text? You can always:
- Reach out to us via the support center
- Submit an issue to the GitHub code repository (see how)
- Contribute your code or document modifications by “forking” the code and submitting a “pull request”
The repositories you can contribute to are:
- Mirroreum https://github.com/mskyttner/mirroreum
- sbdi4r2 https://github.com/biodiversitydata-se/sbdi4r2
- the general analysis workflows https://github.com/biodiversitydata-se/biodiversity-analysis-tools
- this R-tools tutorial https://github.com/biodiversitydata-se/r-tools-tutorial