Accessing the AURIN Data Provider via RStudio
This tutorial is also available on the following GitHub repository: GitHub – AURIN-OFFICE/training
The goal of this tutorial is to introduce how to use RStudio to connect to the AURIN Data Provider (ADP) and download data. It assumes no prior experience with R.
The ADP is an interface that allows you to use programming scripts to interact with and access the AURIN Data Catalogue. You can send a request to the AURIN Data Provider using R scripts and get a response to download specific datasets.
The ADP is based on Open Geospatial Consortium Web Feature Service (WFS) Interface Standards. With ADP credentials, AURIN’s data repository is at your fingertips.
Before commencing this step, please ensure you have generated your unique ADP Credentials (username and password) using the ADP Access Dashboard.
In this tutorial, our goal is to use RStudio to connect to the ADP and download a dataset. You will learn how to:
- Choose between different R environments
- Install related R packages
- Use R coding environments
- Interact with the ADP
- Set up connection parameters
- List available datasets
- Download data
- View data
- Visualise data
- Filter data by bounding box
1. Choose between Different R Environments
R is a free and open-source statistical software program and programming language. It is a very powerful tool for data analysts from all disciplines, from economics to ecology and geography. The geographic information system (GIS) capabilities of R have developed significantly over the last decade.
If you are getting started with R, we suggest you use a cloud environment, such as R Studio Cloud, as it has R and its prerequisites already installed. If you wish to use your personal computer, we recommend using R version 4.2.1 or later on your computer. We hope the image below may help you understand the benefits and limitations of each option.
a) Cloud Environment
R Studio Cloud is a free cloud service accessible here.
To gain access, follow the steps below:
- Click Cloud – Free and Sign Up
- Enter your login details and click Sign Up
- Create a New Project → New RStudio Project
b) Local Environment
We recommend using R version 4.2.1 or later on your computer. Please note: you must download R first, then download RStudio.
Download and install R from the following link: https://cran.csiro.au
Download and install RStudio from the following link: Download the RStudio IDE
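If you are working locally, one quick way to confirm that your installation meets the recommendation above is to compare the running version against 4.2.1. The following is a minimal sketch using base R only; the variable names are illustrative and not part of the tutorial.

```r
# Recommended minimum version stated in this tutorial.
recommended_version <- "4.2.1"

# getRversion() returns the version of the R session running this code.
current_version <- getRversion()
meets_recommendation <- current_version >= recommended_version

message("Running R ", as.character(current_version),
        "; meets the recommended minimum: ", meets_recommendation)
```

If the message reports FALSE, updating R before installing the packages in the next section may save some troubleshooting.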
2. Packages and Libraries
R packages are libraries of functions developed by the R community.
The packages that we are going to use in this tutorial include:
- sf – Provides support for simple features, a standardized way to use geospatial data.
- httr – This is a web service package that can be used to create a request to AURIN.
- tidyverse – A huge collection of R packages designed for data science.
- ows4R – Provides an interface to web services defined as standards by the Open Geospatial Consortium (OGC).
- mapview – Enables functions to quickly and conveniently create interactive visualisations of spatial data.
3. Use R coding environments
Now is the time to introduce you to the R coding environment.
Open RStudio and use the install.packages() function to install the packages described in section 2.
To do this you can copy and paste the following code into the RStudio console and hit Enter to execute the code:
install.packages(c("sf", "httr", "tidyverse", "ows4R", "mapview"))
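Re-running install.packages() reinstalls packages you already have, which can be slow for a large collection such as tidyverse. As an optional variation, the sketch below installs only what is missing. It uses base R functions only; the variable names are illustrative, and the interactive() guard is a design choice so that the script only reports missing packages when it is not run from a console.

```r
# Packages used in this tutorial (from section 2).
required_packages <- c("sf", "httr", "tidyverse", "ows4R", "mapview")

# installed.packages() lists everything found in the current library paths.
already_installed <- rownames(installed.packages())
to_install <- setdiff(required_packages, already_installed)

if (length(to_install) == 0) {
  message("All tutorial packages are already installed.")
} else if (interactive()) {
  # Only install when someone is at the console to watch the output.
  install.packages(to_install)
} else {
  message("Missing packages: ", paste(to_install, collapse = ", "))
}
```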
After installation, we can use library() to load these packages. Again, you can copy the code below, and paste it into the Console.
library(sf)
library(httr)
library(tidyverse)
library(ows4R)
library(mapview)
library(utils)
4. Interact with the ADP
a. Set up connection parameters
To connect to the ADP you need to let RStudio know the address of the ADP and your access credentials.
Please replace yourName and yourPassword below with your own ADP username and password and execute the code to define the connection parameters. If you don’t have ADP credentials, please generate your own unique credentials in the ADP Access Dashboard.
#### ------ Setup variables ------- ####
wfs_url <- "https://adp.aurin.org.au/geoserver/wfs"
user_name <- "yourName"
password <- "yourPassword"

#### ------ Define url ----- ####
url <- parse_url(wfs_url)
url$hostname <- paste(user_name, ":", password, "@", url$hostname, sep = "")
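Typing credentials directly into a script makes them easy to share by accident. One possible alternative, sketched below, reads them from environment variables and percent-encodes them before they are pasted in front of the hostname. The variable names ADP_USERNAME and ADP_PASSWORD and the helper encode_credential() are hypothetical, chosen for this illustration, and the sketch assumes that the tool performing the download decodes percent-encoded credentials in the URL.

```r
# Hypothetical environment variable names; set them outside the script,
# for example in your .Renviron file. Placeholders are used if they are unset.
user_name <- Sys.getenv("ADP_USERNAME", unset = "yourName")
password  <- Sys.getenv("ADP_PASSWORD", unset = "yourPassword")

if (identical(user_name, "yourName") || identical(password, "yourPassword")) {
  message("ADP credentials not found in the environment; using placeholders.")
}

# Percent-encode reserved characters so they cannot be confused with the
# ':' and '@' separators in user:password@host.
encode_credential <- function(x) {
  utils::URLencode(x, reserved = TRUE, repeated = TRUE)
}

# This string can be pasted in front of the hostname, followed by "@".
user_info <- paste0(encode_credential(user_name), ":", encode_credential(password))

# Example with a made-up password containing reserved characters.
print(encode_credential("p@ss:word"))  # "p%40ss%3Aword"
```

With this in place, the hostname line would become url$hostname <- paste0(user_info, "@", url$hostname).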
Use the code below to create a new adp_client that connects to the ADP, which is itself a Web Feature Service (WFS).
adp_client <- WFSClient$new(url = wfs_url, user = user_name, pwd = password, serviceVersion = '2.0.0')
b. List available datasets
We can view the name and title of the first 10 datasets:
# Get all available layers
head(adp_client$getFeatureTypes(pretty = TRUE), 10)
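The full list of layers can be long, so it may help to search it by keyword rather than scrolling. The sketch below assumes that getFeatureTypes(pretty = TRUE) returns a data frame with name and title columns, as described above. To stay runnable without a connection it uses a small stand-in data frame: the two names are ADP IDs that appear in this tutorial, but the titles are placeholders, and find_layers() is a hypothetical helper.

```r
# Stand-in for adp_client$getFeatureTypes(pretty = TRUE).
# Assumption: the real result has 'name' and 'title' columns.
layers <- data.frame(
  name = c("datasource-VIC_Govt_DELWP-VIC_Govt_DELWP:datavic_VMFEAT_CFA_FIRE_STATION",
           "datasource-OSM-UoM_AURIN_DB:osm_lines_2017"),
  title = c("CFA Fire Stations (Points)",               # placeholder title
            "Placeholder title for an OSM lines layer"),  # placeholder title
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep rows whose name or title contains the keyword.
find_layers <- function(layers, keyword) {
  hits <- grepl(keyword, layers$title, ignore.case = TRUE) |
    grepl(keyword, layers$name, ignore.case = TRUE)
  layers[hits, , drop = FALSE]
}

fire_layers <- find_layers(layers, "fire")
print(fire_layers$name)
```

In a live session you would replace the stand-in with layers <- adp_client$getFeatureTypes(pretty = TRUE).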
c. Download data
You can search the AURIN Data Catalogue to find datasets you would like to download.
For example, if you are interested in a dataset describing the locations of fire stations in Victoria, enter “fire station” and click search. Browse the results and select VIC DELWP – Vicmap Features of Interest – Country Fire Authority (CFA) Fire Stations (Points) and view its description, including its metadata table.
To download that dataset into R, find and copy its ADP ID in the metadata table; in this case it is datasource-VIC_Govt_DELWP-VIC_Govt_DELWP:datavic_VMFEAT_CFA_FIRE_STATION. Then create the following query, pasting the ADP ID into the typeName variable within RStudio.
url$query <- list(service = "wfs",
                  #version = "2.0.0", # optional
                  request = "GetFeature",
                  typeName = "datasource-VIC_Govt_DELWP-VIC_Govt_DELWP:datavic_VMFEAT_CFA_FIRE_STATION",
                  srsName = "EPSG:4326")
Use the following functions to build the query and download the data from the ADP and save it. In this case we are downloading the data in GML format and saving it with the file name data_fire.gml. You can see a list of other supported ADP output formats here.
request <- build_url(url)

### ---- Download the data ---- ###
download.file(request, destfile = "./data_fire.gml", mode = 'wb')

### ---- Read the data ---- ####
data <- read_sf("data_fire.gml")
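To illustrate how a different output format might be requested, the sketch below adds an outputFormat parameter to the same query and prints the resulting URL without downloading anything. It assumes the ADP accepts "application/json", the format name GeoServer uses for GeoJSON; check the list of supported ADP output formats before relying on it. Credentials are left out here, so the printed URL is for inspection only.

```r
library(httr)

url <- parse_url("https://adp.aurin.org.au/geoserver/wfs")

url$query <- list(service = "wfs",
                  request = "GetFeature",
                  typeName = "datasource-VIC_Govt_DELWP-VIC_Govt_DELWP:datavic_VMFEAT_CFA_FIRE_STATION",
                  srsName = "EPSG:4326",
                  # Assumption: the ADP accepts GeoServer's GeoJSON format name.
                  outputFormat = "application/json")

# build_url() assembles the pieces and percent-encodes the query values.
request_json <- build_url(url)
print(request_json)

# With credentials added to the hostname as in step 4a, the download would be:
# download.file(request_json, destfile = "data_fire.geojson", mode = "wb")
# data_json <- sf::read_sf("data_fire.geojson")
```

Printing the request before downloading is also a convenient way to check a query when a download fails.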
d. View data
Use the head() function again, this time on the downloaded object (head(data)), to view some of the dataset’s rows:
Now you will see information about the fire stations in tabular format:
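Beyond head(), a few other calls are useful for a first look at a spatial dataset. The sketch below shows them on a small stand-in object so that it runs without a download: the two points and their names are made up, not real fire stations. In your session you would call the same functions on the data object read in the previous step.

```r
library(sf)

# Stand-in for the downloaded layer: two made-up points in EPSG:4326.
stations <- st_as_sf(
  data.frame(name = c("Station A", "Station B"),
             lon = c(144.96, 145.20),
             lat = c(-37.81, -37.90)),
  coords = c("lon", "lat"),
  crs = 4326
)

print(head(stations))                # first rows: attributes plus geometry
n_features <- nrow(stations)         # number of features
attribute_names <- names(stations)   # column names, including the geometry column
crs_code <- st_crs(stations)$epsg    # EPSG code of the coordinate reference system

message(n_features, " features, columns: ", paste(attribute_names, collapse = ", "),
        ", EPSG:", crs_code)
```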
e. Visualise data
Use the mapview() function, for example mapview(data), to visualise the dataset:
Now you will see fire stations on a map as individual point locations:
f. Filter data by bounding box
The ADP supports spatial queries that restrict your data to a particular area. For example, you can filter the data by bounding box (BBOX). Entire datasets can be very large and mostly irrelevant to your project; by adding a BBOX you download only the data for your area of interest. Here we show an example that uses the Melbourne CBD as the area of interest.
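The BBOX parameter allows you to search for features that are contained (or partially contained) inside a box of user-defined coordinates. The format of the BBOX parameter is bbox=a1,b1,a2,b2, where a1, b1, a2 and b2 represent the coordinate values. The full example in this section also appends a coordinate reference system code (EPSG:4326) as a fifth element.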
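A mistyped BBOX string is an easy mistake to make when copying coordinates by hand. The sketch below checks a string before it is sent. It assumes the order used in this tutorial's example, namely minimum longitude, minimum latitude, maximum longitude, maximum latitude; parse_bbox() is a hypothetical helper written for this illustration, using base R only.

```r
# Hypothetical helper: split a "a1,b1,a2,b2" string and check it is a valid box.
parse_bbox <- function(bbox) {
  parts <- suppressWarnings(as.numeric(strsplit(bbox, ",", fixed = TRUE)[[1]]))
  if (length(parts) != 4 || anyNA(parts)) {
    stop("BBOX must contain exactly four numeric values separated by commas.")
  }
  names(parts) <- c("xmin", "ymin", "xmax", "ymax")
  if (parts[["xmin"]] >= parts[["xmax"]] || parts[["ymin"]] >= parts[["ymax"]]) {
    stop("BBOX minimum values must be smaller than the maximum values.")
  }
  parts
}

# The Melbourne CBD box used in this tutorial.
cbd <- parse_bbox("144.927135,-37.828836,145.000648,-37.799408")
print(cbd)
```

Calling parse_bbox() on a string with a missing value or swapped corners stops with an error instead of sending a request that would return nothing useful.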
We recommend using BBox finder to create your BBOX using a base map. Click the rectangle icon and draw a rectangle using your mouse to cover the Melbourne CBD area or any other areas you are interested in.
The selected rectangle is now shaded pink. Check that it covers the area you want to collect data from, then copy the BBOX coordinates of the highlighted area and use them to replace the coordinates assigned to bbox in the code block below.
You also need to replace yourName and yourPassword in the code block below with your ADP username and password. If you don’t have ADP credentials, please generate your credentials via the ADP Access Dashboard.
###### ------ Libraries ------- ####
library(sf)
library(httr)
library(tidyverse)
library(ows4R)
library(mapview)
library(utils)

##### ----- Credentials ------ #####
wfs_url <- "https://adp.aurin.org.au/geoserver/wfs"
user_name <- "yourName"
password <- "yourPassword"

#### ------ Define url ----- ####
url <- parse_url(wfs_url)
url$hostname <- paste(user_name, ":", password, "@", url$hostname, sep = "")

#### ------ Select the data set ----- #####
ADP_ID = 'datasource-OSM-UoM_AURIN_DB:osm_lines_2017'

### ------ Copy vector from http://bboxfinder.com/ ---- #####
bbox = '144.927135,-37.828836,145.000648,-37.799408'

### ------ Create request ---- #####
url$query <- list(service = "WFS",
                  version = '2.0.0',
                  request = "GetFeature",
                  typeNames = ADP_ID,
                  bbox = paste0(bbox, ',EPSG:4326'))
request <- build_url(url)

#### ---- Download data from server ----- ####
download.file(request, destfile = "data_bbox.gml", mode = 'wb')

### ---- Read the data ---- ####
data <- read_sf("data_bbox.gml")

### --- Show the map --- ###
mapview(data)
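Because a BBOX query also returns features that are only partially inside the box, it can be useful to have the box itself as a spatial object, either to draw it on the map or to test which features fall inside it. The sketch below builds that polygon with sf and tests two stand-in points against it; the points are made up for illustration and are not part of any ADP dataset.

```r
library(sf)

# The same BBOX string used in the request above.
bbox <- '144.927135,-37.828836,145.000648,-37.799408'
coords <- as.numeric(strsplit(bbox, ",", fixed = TRUE)[[1]])

# Turn the four corner values into a rectangle polygon in EPSG:4326.
aoi <- st_as_sfc(st_bbox(c(xmin = coords[1], ymin = coords[2],
                           xmax = coords[3], ymax = coords[4]),
                         crs = st_crs(4326)))

# Stand-in features: one point inside the box and one outside it.
pts <- st_as_sf(
  data.frame(id = c("inside", "outside"),
             lon = c(144.9631, 145.2000),
             lat = c(-37.8136, -37.9000)),
  coords = c("lon", "lat"),
  crs = 4326
)

# TRUE where a feature touches or lies within the area of interest.
inside <- st_intersects(pts, aoi, sparse = FALSE)[, 1]
print(data.frame(id = pts$id, inside = inside))
```

In a live session, mapview(aoi) + mapview(data) would be one way to overlay the box on the downloaded features.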