AURIN Data61-GNAF Geocoder

Using the AURIN Data61-GNAF Geocoder

The AURIN Data61-GNAF Geocoder is an AURIN-hosted implementation of the open-source geocoder published by Data61, which runs on top of the Open Geocoded National Address File (G-NAF/GNAF) produced by PSMA Australia Limited (PSMA).

In this tutorial we will explore both the web-interface and the API of the AURIN Data61-GNAF Geocoder, using a web-browser and Python3 respectively. The goal is to obtain valid latitude and longitude coordinates for a number of sample addresses.

The source code of the Data61-GNAF Geocoder can be found here.


Software Required

The AURIN Data61-GNAF Geocoder can be used in two ways: through the web-interface, accessed with a web-browser, or through an API, which can be called from any software able to create API requests. Both options are equally capable of identifying addresses and returning a valid latitude and longitude; only the input and output methods differ.

  • Please note that the geocoder returns coordinates following the EPSG:4326 (WGS 84) coordinate reference system.
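Because EPSG:4326 expresses positions as latitude and longitude in degrees, a simple range check can catch swapped or malformed values before they are used. The following is a minimal sketch; the function name `is_valid_wgs84` is hypothetical and is not part of the geocoder.

```python
def is_valid_wgs84(latitude, longitude):
    """Return True if the pair lies within the valid EPSG:4326 degree ranges."""
    return -90.0 <= latitude <= 90.0 and -180.0 <= longitude <= 180.0


# Coordinates for one of the sample addresses used later in this tutorial.
print(is_valid_wgs84(-37.79458685, 144.9567222))

# Passing the values in the wrong order is rejected for most Australian
# locations, because their longitudes are greater than 90 degrees.
print(is_valid_wgs84(144.9567222, -37.79458685))
```

This check is worth keeping in mind because the sample output later in the tutorial lists longitude before latitude.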

Web-Interface

To use the web-interface of the AURIN Data61-GNAF Geocoder the only software you will require is a web-browser.

The AURIN Data61-GNAF Geocoder web-interface can be found here.

Geocoder API

The AURIN Data61-GNAF Geocoder has the ability to communicate through an API. In this tutorial we will explore using the API through Python3. This requires the following Python3 libraries:

json, csv, urllib

You will need to have Python3 installed and be able to run Python3 scripts. These libraries are part of the Python3 standard library, so they should be available in a default installation.

The following is the skeleton code for geocoding through the AURIN Data61-GNAF Geocoder API:

AURIN Geocoder Python Skeleton Code


Using the Geocoder Through a Web-browser

Using a web-browser and the geocoder web-interface is the simplest method of acquiring information about an address.

Begin by visiting the AURIN Data61-GNAF Geocoder web-interface at the URL provided above. On entering the web-interface you should first see the geocoding UI with “Address Lookup” selected by default. This single-address lookup tool returns a list of the top 10 locations that match the address being searched.

  1. The location field allows you to set a coordinate as a central point and a radial distance from that point within which all searches are limited.
    • If you do not want to limit the search extent, select “Don’t use location” in the search distance drop-down and click “update”.
  2. Enter the address you would like to search for into this field.
    • The address should be, where possible, formatted similar to:
      [Unit Number] [Street Number] [Street Name] [Suburb] [State] [Postcode]
      e.g. 69 Fitzgibbon St Parkville VIC 3052
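When addresses are stored as separate components, they can be assembled into the recommended order before searching. This is a minimal sketch; the helper `format_address` and its parameter names are hypothetical. Postcodes are passed as strings so that a leading zero (as in 0870) is preserved.

```python
def format_address(street_number, street_name, suburb, state, postcode,
                   unit_number=None):
    """Join address components in the order recommended above.

    Empty or missing components are skipped.
    """
    parts = [unit_number, street_number, street_name, suburb, state, postcode]
    cleaned = (str(part).strip() for part in parts if part is not None)
    return " ".join(part for part in cleaned if part)


print(format_address("69", "Fitzgibbon St", "Parkville", "VIC", "3052"))
# 69 Fitzgibbon St Parkville VIC 3052
```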

The next tab available, “Nearest Address”, allows you to find addresses near the coordinates entered into the location field (1). This is a form of reverse geocoding.

  1. Once the location has been set, including the search distance, you will be able to search for known addresses within the search distance. You are able to refine the selection by first filtering by street name and then choosing an address.
    • To increase the precision of the search, a search distance of 50m is recommended.
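To get a feel for what a 50 m search distance means, the great-circle distance between two coordinates can be estimated with the haversine formula. This is only an illustrative sketch: how the geocoder itself measures distance is not described here, and the function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Approximate great-circle distance in metres between two points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


def within_search_distance(centre, candidate, search_distance_m=50.0):
    """Check whether candidate (lat, lon) lies within the distance of centre."""
    return haversine_m(*centre, *candidate) <= search_distance_m
```

Both points are given as (latitude, longitude) tuples in degrees.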

The final tab available on the web-interface is the bulk lookup tab, which allows you to geocode a list of addresses at once. As with the single address lookup, this can be limited by using the location option (1).

  1. Multiple addresses can be input into the text box, where each new line is a separate address to geocode.
    • This returns one result per address: the location the geocoder considers the best match.
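Since each line of the text box is treated as a separate address, it can help to tidy a list of addresses before pasting it in. The sketch below collapses stray whitespace and embedded line breaks so that one address cannot be split across two lines; the helper name `to_bulk_lookup_text` is hypothetical.

```python
def to_bulk_lookup_text(addresses):
    """Return one cleaned address per line, skipping blank entries."""
    cleaned = (" ".join(address.split()) for address in addresses)
    return "\n".join(address for address in cleaned if address)


text = to_bulk_lookup_text([
    "69 Fitzgibbon St Parkville VIC 3052",
    "   ",
    "24 Railway Terrace Alice Springs NT\n0870",
])
print(text)
```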

Web-interface Example

For this example, we will be attempting to geocode the following addresses:

69 Fitzgibbon St Parkville VIC 3052

Liverpool Street Hyde Park South Sydney NSW 2000

24 Railway Terrace Alice Springs NT 0870

44 Harlequin St Lightning Ridge NSW 2834

385 Cardigan St Carlton VIC 3053

To do this we will be using the bulk lookup capability of the AURIN Data61-GNAF Geocoder. Copying the above addresses into the textbox and ensuring that the location option is not being used will result in the following:

  • You can click on the icon for each result to bring up additional information, including the address type and the reliability of the geocode. The reliability definitions are:

    G-NAF Data Product Description 2019


Using the Geocoder API

Using the Data61-GNAF Geocoder API we can geocode a large number of addresses very quickly. A Python3 script is run on a CSV file containing the addresses to geocode; the script sends each address to the geocoding API and receives its coordinates and geocoding score in return.
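The workflow just described can be sketched as follows. This is not the skeleton script referred to in this tutorial, only an illustration of the same loop. The endpoint URL is a placeholder, and the request body (`addr`, `numHits`) and response fields (`hits`, `lon`, `lat`, `score`) are assumptions that must be checked against the API's actual documentation. The input CSV is assumed to hold one address per row in the first column, with no header row.

```python
import csv
import json
import urllib.request

# Placeholder only: replace with the real geocoder API endpoint.
GEOCODER_URL = "https://geocoder.example.org/search"


def parse_top_hit(result):
    """Extract (longitude, latitude, score) from an ASSUMED response layout."""
    hits = result.get("hits", [])
    if not hits:
        return None
    top = hits[0]
    return top["lon"], top["lat"], top["score"]


def geocode_address(address, url=GEOCODER_URL):
    """POST one address as JSON and return the top hit, or None if no match."""
    payload = json.dumps({"addr": address, "numHits": 1}).encode("utf-8")
    request = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return parse_top_hit(json.loads(response.read().decode("utf-8")))


def geocode_csv(input_csv, output_csv, geocode=geocode_address):
    """Geocode every address in input_csv and write the results to output_csv."""
    with open(input_csv, newline="", encoding="utf-8") as src, \
            open(output_csv, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst)
        writer.writerow(["address", "longitude", "latitude", "score"])
        for row in csv.reader(src):
            if not row or not row[0].strip():
                continue  # skip blank lines
            address = row[0].strip()
            hit = geocode(address)
            if hit is None:
                writer.writerow([address, "", "", ""])
            else:
                writer.writerow([address, *hit])
```

Passing the lookup function in as the `geocode` argument keeps the CSV handling separate from the network call, so the loop can be tried with a stand-in function before pointing it at the real API.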

The following CSV, which contains the same addresses as the previous web-interface example, will be used.

Sample Address CSV
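If the sample file is not to hand, an equivalent CSV can be created with Python's `csv` module. The layout used here (one address per row in a single column, with no header row) is an assumption; adjust it to match whatever the geocoding script expects.

```python
import csv

SAMPLE_ADDRESSES = [
    "69 Fitzgibbon St Parkville VIC 3052",
    "Liverpool Street Hyde Park South Sydney NSW 2000",
    "24 Railway Terrace Alice Springs NT 0870",
    "44 Harlequin St Lightning Ridge NSW 2834",
    "385 Cardigan St Carlton VIC 3053",
]


def write_address_csv(path, addresses):
    """Write one address per row to a single-column CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        for address in addresses:
            writer.writerow([address])


# Example call (writes into the current directory):
# write_address_csv("sample_addresses.csv", SAMPLE_ADDRESSES)
```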

Begin by opening the Python3 script provided in the “Software Required” section of the tutorial. On lines 6 and 7 of the script you will need to specify the input and output CSVs. In this case the lines should be similar to:

#Identify input and output CSVs here#
input_csv="sample_addresses.csv"
output_csv="geocoded_sample_addresses.csv"
#

Running the Python script with the CSV in the same location as the script file will produce an output CSV with results similar to:

address                                           longitude    latitude      score
69 Fitzgibbon St Parkville VIC 3052               144.9567222  -37.79458685  0.578070819
Liverpool Street Hyde Park South Sydney NSW 2000  151.2109499  -33.87574516  0.693060875
24 Railway Terrace Alice Springs NT 0870          133.878813   -23.697436    0.881190479
44 Harlequin St Lightning Ridge NSW 2834          147.9823345  -29.42926474  0.7332865
385 Cardigan St Carlton VIC 3053                  144.9657563  -37.79727558  0.53554368

The script creates a CSV file containing each address with its longitude, latitude and geocoder search score. The geocoder API produces JSON-formatted output, and the script can be adjusted to your requirements.
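One way to work with the output is to load it back into Python and list the lowest-scoring matches first, so they can be checked by hand. This sketch assumes the output CSV has the header row `address,longitude,latitude,score` and that a higher score indicates a closer match. Treat the ordering as a rough guide only, since scores returned for different queries may not be directly comparable. The function name `load_results` is hypothetical.

```python
import csv


def load_results(lines):
    """Parse geocoder output rows and return them sorted by ascending score.

    `lines` is any iterable of CSV text lines, such as an open file.
    Rows with an empty score (no match recorded) are skipped.
    """
    results = []
    for row in csv.DictReader(lines):
        if not row["score"]:
            continue
        results.append({
            "address": row["address"],
            "longitude": float(row["longitude"]),
            "latitude": float(row["latitude"]),
            "score": float(row["score"]),
        })
    return sorted(results, key=lambda result: result["score"])


# Example use with the output file named earlier in this tutorial:
# with open("geocoded_sample_addresses.csv", newline="") as handle:
#     for result in load_results(handle):
#         print(result["score"], result["address"])
```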

The Data61-GNAF Geocoder uses Lucene as the base for its search index. In-depth technical information on the scoring methodology can be found here.