galah is an R interface to biodiversity data hosted by the Global Biodiversity Information Facility (GBIF) and its subsidiary node organisations. GBIF and its partner nodes collate and store observations of individual life forms using the ‘Darwin Core’ data standard.
To install from CRAN:
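The released version can be installed with the usual base R call; nothing here is specific to galah beyond the package name.

```r
# Install the released version of galah from CRAN
install.packages("galah")
```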
Or install the development version from GitHub:
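A sketch of a GitHub installation follows. It assumes the {remotes} package is used as the installer and that the development repository is AtlasOfLivingAustralia/galah-R; neither detail is stated in the text above, so check the project page for the current location.

```r
# Assumption: remotes is used to install from GitHub
install.packages("remotes")

# Assumption: the development repository lives at this GitHub path
remotes::install_github("AtlasOfLivingAustralia/galah-R")
```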
Load the package
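Attaching the package for the current session is done with a standard library() call:

```r
# Attach galah so its functions are available without the galah:: prefix
library(galah)
```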
By default, galah downloads information from the Atlas of Living
Australia (ALA). To show the full list of organisations currently
supported by galah, use show_all(atlases).
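Run as code, the call named above looks like this. The list of organisations it returns depends on the installed version of galah.

```r
library(galah)

# List the organisations (atlases) that galah can currently query
show_all(atlases)
```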
## # A tibble: 10 × 4
##    region         institution                                                             acronym url                    
##    <chr>          <chr>                                                                   <chr>   <chr>                  
##  1 Australia      Atlas of Living Australia                                               ALA     https://www.ala.org.au 
##  2 Austria        Biodiversitäts-Atlas Österreich                                         BAO     https://biodiversityat…
##  3 Brazil         Sistemas de Informações sobre a Biodiversidade Brasileira               SiBBr   https://sibbr.gov.br   
##  4 France         Portail français d'accès aux données d'observation sur les espèces      OpenObs https://openobs.mnhn.fr
##  5 Global         Global Biodiversity Information Facility                                GBIF    https://gbif.org       
##  6 Guatemala      Sistema Nacional de Información sobre Diversidad Biológica de Guatemala SNIBgt  https://snib.conap.gob…
##  7 Portugal       GBIF Portugal                                                           GBIF.pt https://www.gbif.pt    
##  8 Spain          GBIF Spain                                                              GBIF.es https://gbif.es        
##  9 Sweden         Swedish Biodiversity Data Infrastructure                                SBDI    https://biodiversityda…
## 10 United Kingdom National Biodiversity Network                                           NBN     https://nbn.org.uk
Use galah_config() to set the node organisation using
its region, name, or acronym. Once set, galah will
automatically populate the server configuration for your selected GBIF
node. To download occurrence records from your chosen GBIF node, you
will need to register an account with them (using their website), then
provide your registration email to galah. To download from GBIF itself, you
will need to provide the email address, username, and password for your GBIF account.
galah_config(atlas = "GBIF",
             username = "user1",
             email = "email@email.com",
             password = "my_password")
You can find a full list of configuration options by running
?galah_config.
The standard method to construct queries in {galah} is
via piped functions. Pipes in galah start with the
galah_call() function, and typically end with
collect(), though collapse() and
compute() are also supported. The development team use the
base pipe by default (|>), but the
{magrittr} pipe (%>%) should work too.
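A minimal sketch of such a pipe, assuming the default atlas (ALA) and a working network connection. It counts every occurrence record available, so the number returned will change over time.

```r
library(galah)

# Start a query, ask for a record count, then send the request to the atlas
galah_call() |>
  count() |>
  collect()
```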
## # A tibble: 1 × 1
##       count
##       <int>
## 1 146185520
To pass more complex queries, you can use additional
{dplyr} functions such as filter(),
select(), and group_by().
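As a sketch of a filtered query, the pipe below restricts the count to records from 2020 onwards. It assumes the default atlas (ALA) and that year is a valid field there, as it is in the examples later in this guide.

```r
library(galah)

# Count only the records observed in 2020 or later
galah_call() |>
  filter(year >= 2020) |>
  count() |>
  collect()
```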
## # A tibble: 1 × 1
##      count
##      <int>
## 1 40200358
Each GBIF node allows you to query using its own set of in-built
fields. You can investigate which fields are available using
show_all() and search_all():
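A sketch of a field search follows. The search string "australian states" is an assumption chosen to be consistent with the output below; the original query is not shown in this text.

```r
library(galah)

# Assumption: the search string is illustrative, not taken from the original
# Search the available fields for a matching description
search_all(fields, "australian states")
```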
## # A tibble: 2 × 3
##   id     description                            type  
##   <chr>  <chr>                                  <chr> 
## 1 cl2013 ASGS Australian States and Territories fields
## 2 cl22   Australian States and Territories      fields
To narrow your search to a particular taxonomic group, use
identify(). Note that this function only accepts scientific
names and is not case sensitive. It’s good practice to first use
search_taxa() to check that the names you provide return
the correct taxonomic results.
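A sketch of this two-step check, assuming the taxon "reptilia" and the same year filter used in the species query later in this guide. The original code for this step is not shown here, so the count it returns may differ from the output below.

```r
library(galah)

# Step 1: confirm that the name resolves to the intended taxon
search_taxa("reptilia")

# Step 2: restrict the record count to that taxon
# Assumption: the year filter mirrors the later species example
galah_call() |>
  identify("reptilia") |>
  filter(year >= 2020) |>
  count() |>
  collect()
```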
## # A tibble: 1 × 9
##   search_term scientific_name taxon_concept_id                               rank  match_type kingdom phylum class issues
##   <chr>       <chr>           <chr>                                          <chr> <chr>      <chr>   <chr>  <chr> <chr> 
## 1 reptilia    REPTILIA        https://biodiversity.org.au/afd/taxa/682e1228… class exactMatch Animal… Chord… Rept… noIss…
## # A tibble: 1 × 1
##    count
##    <int>
## 1 338434
If you want to query something other than the number of records,
modify the type argument in galah_call(). Here
we’ll query the number of species:
galah_call(type = "species") |>
  identify("reptilia") |> 
  filter(year >= 2020) |> 
  count() |>
  collect()
## # A tibble: 1 × 1
##   count
##   <int>
## 1   883
To download records, rather than count how many are
available, simply remove the count() function from your
pipe.
result <- galah_call() |>
  identify("Litoria") |>
  filter(year >= 2020, cl22 == "Tasmania") |>
  select(basisOfRecord, group = "basic") |>
  collect()
## Retrying in 1 seconds.
# Preview the first six rows of the downloaded records
result |> head()
## # A tibble: 6 × 9
##   recordID            scientificName taxonConceptID decimalLatitude decimalLongitude eventDate           occurrenceStatus
##   <chr>               <chr>          <chr>                    <dbl>            <dbl> <dttm>              <chr>           
## 1 00052544-d943-42e9… Litoria ewing… https://biodi…           -42.9             147. 2022-09-19 00:00:00 PRESENT         
## 2 00168ca6-84d0-4af1… Litoria ranif… https://biodi…           -41.2             146. 2023-12-21 10:20:19 PRESENT         
## 3 001a43fe-8586-4064… Litoria ewing… https://biodi…           -43.0             147. 2021-08-07 00:00:00 PRESENT         
## 4 00250163-ec50-4eda… Litoria ranif… https://biodi…           -41.2             147. 2023-08-23 11:49:28 PRESENT         
## 5 003e0f63-9f95-4af9… Litoria ewing… https://biodi…           -42.9             148. 2022-12-24 06:27:00 PRESENT         
## 6 0070521f-bb45-46fb… Litoria ewing… https://biodi…           -43.1             147. 2023-12-20 14:29:23 PRESENT         
## # ℹ 2 more variables: dataResourceName <chr>, basisOfRecord <chr>
Check out our other vignettes for more detail on how to use these functions.