Introduction

Biodiversity resources are increasingly international. The SBDI has made an effort to channel biodiversity data and resources to help the research community access and analyse Swedish primary biodiversity data. Each research question poses its own unique challenges. Our aim here is to provide a few examples that prompt questions that may be asked at different stages of the process. Whether a particular method is valid and appropriate is for the individual researcher(s) to judge. For a comprehensive workflow on how to treat and analyse primary biodiversity data, please refer to our tutorial on biodiversity analysis tools, where we go through the complete workflow: Data → Cleaning → Fitness evaluation → Analysis.
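As a rough illustration of the four workflow stages named above, the following minimal sketch runs a made-up occurrence table through them using base R only. All values, the `min_n` threshold, and the choice of species richness per year as the "analysis" are assumptions for illustration, not recommendations.

```r
# Data: a made-up occurrence table (all values invented for illustration)
obs <- data.frame(
  species = c("Parus major", "Parus major", "Pica pica", "Pica pica", NA),
  lon     = c(17.6, 17.6, 18.1, NA, 18.0),
  lat     = c(59.9, 59.9, 59.3, 59.3, 59.4),
  year    = c(2019, 2019, 2020, 2020, 2021)
)

# Cleaning: drop records with missing fields, then exact duplicates
clean <- unique(obs[complete.cases(obs), ])

# Fitness evaluation: keep species with at least `min_n` records
# (the threshold is arbitrary here; it depends on your question)
min_n <- 1
n_per_species <- table(clean$species)
fit <- clean[clean$species %in% names(n_per_species)[n_per_species >= min_n], ]

# Analysis: here simply the number of distinct species per year
richness <- tapply(fit$species, fit$year, function(x) length(unique(x)))
```

In a real analysis each stage involves many more decisions, which is what the rest of the tutorial series addresses.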

[Figure: workflow diagram of biodiversity analysis tools. It leads from the research question and data query (Biodiversity Atlas Sweden, GBIF, eBird, others) through data cleaning, fitness for purpose, and exploration and transformation, to the analysis groups: diversity, species distribution models, trends and patterns, and spatial multivariate analysis. Legend: web portal, R package, spreadsheet, database, API, GIS, other software, dataset.]

R and Mirroreum

The present tutorial is focused on the statistical programming language R. R is a free software environment for statistical computing and graphics that is widely used within the scientific community, and in which the complete analysis workflow can be documented in a fully reproducible way.

At SBDI we provide access for researchers and students to Mirroreum, an online web-based environment for Reproducible Open Research in the area of biodiversity analysis. Mirroreum is based on a Free and Open Source stack of software. After logging in, you immediately get access to a web-based version of RStudio with a large number of pre-installed packages, including all the packages offered by rOpenSci.

Compared to running RStudio on your own machine, Mirroreum offers more computational resources and a standardized environment where you can rely on all the relevant packages being installed and the configuration parameters being set appropriately. To know more about Mirroreum or to request an account, please visit the SBDI documentation site.

Mirroreum - An RStudio session on a server

sbdi4r2 - a new R 📦 to search and access data

The sbdi4r2 package enables the R community to directly access data and resources hosted by SBDI. The goal is to enable observations of species to be queried and output in a range of standard formats. It includes filter functions that let you filter the data before download, as well as some simple summary functions and some functions for simple data exploration. The examples included in this tutorial also show how you can continue exploring and analyzing the data using other R packages.

Please refer to the package documentation for details on how to install it. Once installed, the sbdi4r2 package must be loaded for each new R session:

library(sbdi4r2)

Various aspects of the sbdi4r2 package can be customized.

E-mail address

Each download request to SBDI servers is also accompanied by an “e-mail address” string that identifies the user making the request. You will need to provide an email address registered with the SBDI. You can create an account here. Once an email is registered with the SBDI, it should be stored in the config:

sbdi_config(email = "your.registered@emailaddress.com")

Otherwise, you can provide this e-mail address as a parameter directly to each call of the function occurrences().
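If you share your scripts, you may prefer not to hard-code the address. The following is a minimal sketch that reads it from an environment variable instead. The helper `get_sbdi_email()` and the variable name `SBDI_EMAIL` are hypothetical and not part of sbdi4r2; the shape check is deliberately loose.

```r
# Hypothetical helper (not part of sbdi4r2): read the registered address
# from an environment variable so it stays out of shared scripts.
# The variable name SBDI_EMAIL is an assumption for illustration.
get_sbdi_email <- function(var = "SBDI_EMAIL") {
  email <- Sys.getenv(var, unset = "")
  # Very loose shape check: something@something.tld, no spaces
  if (!grepl("^[^@[:space:]]+@[^@[:space:]]+\\.[^@[:space:]]+$", email)) {
    stop("Set ", var, " to the e-mail address registered with SBDI")
  }
  email
}

# Usage, assuming sbdi4r2 is installed and loaded:
# sbdi_config(email = get_sbdi_email())
```

The variable can be set once per user, for example in your `.Renviron` file, so that the same script runs unchanged for different collaborators.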

Setting the download reason

SBDI requires that you provide a reason when downloading occurrence data (via the sbdi4r2 atlas_occurrences() function). You can provide this as a parameter directly to each call of atlas_occurrences(), or you can set it once per session using:

sbdi_config(download_reason_id = "your_reason_id")

(See sbdi_reasons() for valid download reasons, e.g. 3 for “education”, 7 for “ecological research”, 8 for “systematic research/taxonomy”, and 10 for “testing”.)
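To avoid bare numbers in your scripts, the reason ids quoted above can be kept in a small named vector. This is only a sketch: the helper `reason_id()` is hypothetical, not part of sbdi4r2, and it covers only the four ids mentioned in this tutorial. The output of sbdi_reasons() remains the authoritative list.

```r
# Reason ids copied from the examples in this tutorial; not the full list.
download_reasons <- c(
  "education"                    = 3L,
  "ecological research"          = 7L,
  "systematic research/taxonomy" = 8L,
  "testing"                      = 10L
)

# Hypothetical helper (not part of sbdi4r2): look up an id by its label
# and fail loudly on labels that are not in the vector above.
reason_id <- function(label) {
  if (!label %in% names(download_reasons)) {
    stop("Unknown download reason: ", label)
  }
  download_reasons[[label]]
}

# Usage, assuming sbdi4r2 is installed and loaded:
# sbdi_config(download_reason_id = reason_id("testing"))
```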

Privacy

NO other personal identification information is sent. You can see all configuration settings, including the user-agent string that is being used, with the command:

sbdi_config()

Other packages needed

Some additional packages are needed for these examples. Install them if necessary with the following script:

# Packages used in the examples
to_install <- c("colorRamps", "cowplot", "dplyr",
                "ggplot2", "leaflet", "maps", "mapdata",
                "remotes", "sf", "tidyr", "xts")
# Keep only those that are not installed yet
to_install <- to_install[!sapply(to_install,
                                 requireNamespace,
                                 quietly = TRUE)]
if (length(to_install) > 0)
    install.packages(to_install,
                     repos = "http://cran.us.r-project.org")

# BIRDS is installed from GitHub
remotes::install_github("Greensway/BIRDS")

Your collaboration is appreciated

Open Source also means that you can contribute. You don’t need to know how to program; every input is appreciated. Did you find something that is not working? Do you have suggestions for examples or text? You can always:

  1. Reach out to us via the support center
  2. Submit an issue to the GitHub code repository (see how)
  3. Or contribute your own code or document modifications by “forking” the code and submitting a “pull request”

The repositories you can contribute to are: