GBIF and its partner nodes store content in hundreds of different fields, and users often require thousands or millions of records at a time. To reduce download time and limit the complexity of the resulting `tibble`, it is sensible to restrict the fields returned by [atlas_occurrences()]. This function allows easy selection of individual fields, or of commonly requested groups of columns, following syntax shared with `dplyr::select()`.

The full list of available fields can be viewed with `show_all(fields)`. Note that `select()` and `sbdi_select()` are supported for all atlases that allow downloads, with the exception of GBIF, for which all columns are returned.

Usage

sbdi_select(..., group)

Arguments

...

zero or more individual column names to include

group

`string`: (optional) name of one or more column groups to include. Valid options are `"basic"`, `"event"`, `"media"` and `"assertions"`.

Value

A tibble specifying the name and type of each column to include in the call to `atlas_counts()` or `atlas_occurrences()`.

Details

Setting `group = "basic"` returns the following columns:

* `decimalLatitude`
* `decimalLongitude`
* `eventDate`
* `scientificName`
* `taxonConceptID`
* `recordID`
* `dataResourceName`
* `occurrenceStatus`

Using `group = "event"` returns the following columns:

* `eventRemarks`
* `eventTime`
* `eventID`
* `eventDate`
* `samplingEffort`
* `samplingProtocol`

Using `group = "media"` returns the following columns:

* `multimedia`
* `multimediaLicence`
* `images`
* `videos`
* `sounds`

Using `group = "assertions"` returns all quality assertion-related columns. The list of assertions is shown by `show_all_assertions()`.
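The way groups and individual column names combine can be illustrated with a small base-R sketch. This is not the package's implementation: `column_groups` and `expand_selection()` are hypothetical names introduced here for illustration, the ordering of the result is an arbitrary choice, and the `"assertions"` group is left out because its contents are looked up rather than fixed. The sketch only shows that a field listed in more than one group (such as `eventDate`) needs to be requested once.

```r
# Illustrative only: `column_groups` and `expand_selection()` are hypothetical
# helpers, not functions exported by the package.
column_groups <- list(
  basic = c("decimalLatitude", "decimalLongitude", "eventDate",
            "scientificName", "taxonConceptID", "recordID",
            "dataResourceName", "occurrenceStatus"),
  event = c("eventRemarks", "eventTime", "eventID", "eventDate",
            "samplingEffort", "samplingProtocol"),
  media = c("multimedia", "multimediaLicence", "images", "videos", "sounds")
)

expand_selection <- function(columns = character(), group = character()) {
  # Reject group names that are not defined in `column_groups`
  unknown <- setdiff(group, names(column_groups))
  if (length(unknown) > 0) {
    stop("Unknown group(s): ", paste(unknown, collapse = ", "))
  }
  # Flatten the requested groups, append individual columns, drop duplicates
  group_columns <- unlist(column_groups[group], use.names = FALSE)
  unique(c(group_columns, columns))
}

selected <- expand_selection(columns = "basisOfRecord",
                             group = c("basic", "event"))
print(selected)
```

Under these assumptions, `eventDate` appears only once in `selected` even though it belongs to both the `"basic"` and `"event"` groups.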

See also

[search_taxa()], [sbdi_filter()] and [sbdi_geolocate()] for other ways to restrict the information returned by [atlas_occurrences()] and related functions; [atlas_counts()] for how to get counts by levels of variables returned by `sbdi_select`; `show_all(fields)` to list available fields.

Examples

if (FALSE) {
# Download occurrence records of *Perameles*,
# Only return scientificName and eventDate columns
sbdi_config(email = "your-email@email.com")
sbdi_call() |>
  sbdi_identify("perameles") |>
  sbdi_select(scientificName, eventDate) |>
  atlas_occurrences()

# Only return the "basic" group of columns and the basisOfRecord column
sbdi_call() |>
  sbdi_identify("perameles") |>
  sbdi_select(basisOfRecord, group = "basic") |>
  atlas_occurrences()

# When used in a pipe, `sbdi_select()` and `select()` are synonymous.
# Hence the previous example can be rewritten as:
request_data() |>
  identify("perameles") |>
  select(basisOfRecord, group = "basic") |>
  collect()
}