# Getis-Ord Local G

The $G_{i}$ statistics are useful for identifying “hot and/or cold spots” and for checking for heterogeneity in a dataset. $G_{i}$ statistics (Getis and Ord, 1992) are the ratio of the sum of values in neighbouring locations, defined by a given distance, to the sum over all observations. A global variant can be calculated with the Getis-Ord Global G tool.
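Written out, the ratio described above can be sketched as follows. This is our reading of the original Getis and Ord (1992) form, assuming a binary weight $w_{ij}(d)$ that equals 1 when location $j$ lies within distance $d$ of $i$ and 0 otherwise:

$G_{i}(d) = {{\sum\nolimits_{j}w_{ij}(d)X_{j}}\over{\sum\nolimits_{j}X_{j}}}, j\neq i$

The numerator sums the values found within distance $d$ of $i$, and the denominator sums the values at all locations other than $i$.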

Like Local Moran’s I, $G_{i}$ statistics can detect local ‘pockets’ of dependence that may not show up when using global statistics; for example, they isolate micro-concentrations in the data that are otherwise swamped by the data’s overall randomness. The form of the $X$ matrix is additive, either $(X_{j})$ or $(X_{i} + X_{j})$ for the self-included measure. This contrasts with the Moran statistic, where the matrix takes the form $(X_{i} - \overline{X})(X_{j}- \overline{X})$, and with Geary’s C, which specifies the matrix in the form $(X_{i} - X_{j})^2$. $G_{i}$ measures are measures of concentration, whereas the Moran statistic examines the correlation or covariance of values in neighbouring regions relative to the data’s overall variance (and Geary’s C calculates differences). $G_{i}$ statistics evaluate association by examining ‘additive qualities’ (Getis and Ord, 1996, p. 262): they compare local weighted averages to global averages to isolate ‘coldspots’ and ‘hotspots’.

The interpretation of $G_{i}$ is also somewhat different from other measures of spatial association. When $G_{i}$ is larger than its expected value, the overall distribution of the variable being measured can be seen as characterised by positive spatial autocorrelation, with high-value clusters prevalent. If $G_{i}$ is smaller than its expected value, then the overall distribution of the variable being measured is still characterised by positive spatial autocorrelation, but with low values clustered together.

A special feature of this statistic is that it equals 0 when all $X$ values are the same. Also, while the weighted value of an $X$ value might be expected to rise with the number of neighbours (or weighted regions), a region with a greater number of neighbours does not, all else equal, receive a greater $G_{i}$. $G_{i}$ rises only when the observed estimate in the vicinity of region $i$ varies significantly from the mean of the variable (Getis and Ord, 1992).

Ord and Getis (1995) suggested a slightly different form of $G_{i}$. The statistic $G_{i}(d)$, originally proposed for elements of a symmetric binary weights matrix, was extended to variables that do not have a natural origin and to non-binary standardised weight matrices (Ord and Getis, 1995, p. 289). For each region $i$, this statistic is:

$G_{i}(d) = {{\sum\nolimits_{j}w_{ij}(d)X_{j}-W_{i}\mu}\over{\sigma\{[(n-1)S_{1i}-W_{i}^2]/(n-2)\}^{1/2}}}, j\neq i$

where $w_{ij}(d)$ is the spatial weight matrix element, $X_{j}$ is the variable, $W_{i} = \sum\nolimits_{j}w_{ij}(d)$, $S_{1i} = \sum\nolimits_{j}w_{ij}(d)^2$, and $\mu$ and $\sigma$ are the usual sample mean and standard deviation computed over the $n-1$ observations excluding $i$. $d$ is the threshold distance from $i$. $G_{i}^*(d)$ includes the case where $i = j$.
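To make the standardised statistic concrete, here is a minimal sketch in plain Python of the self-included variant $G_{i}^*$. It rests on assumptions that are not spelled out above: we take the $G_{i}^*$ denominator to be $\sigma\{[nS_{1i}^* - W_{i}^{*2}]/(n-1)\}^{1/2}$, with the mean and a population standard deviation computed over all $n$ observations, as we understand Ord and Getis (1995). The function name `local_g_star` and the toy data are hypothetical, and this is not necessarily how the tool itself computes its output.

```python
import math

def local_g_star(x, w):
    """Return the standardised G_i* value for every area.

    x: list of n values; w: n x n weights (list of lists) with w[i][i] > 0.
    """
    n = len(x)
    mean = sum(x) / n
    # Population standard deviation (divide by n): an assumption, see the lead-in.
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    z = []
    for i in range(n):
        w_sum = sum(w[i])                        # W_i*
        s1 = sum(v * v for v in w[i])            # S_1i*
        weighted = sum(w[i][j] * x[j] for j in range(n))
        numerator = weighted - w_sum * mean
        denominator = std * math.sqrt((n * s1 - w_sum ** 2) / (n - 1))
        z.append(numerator / denominator)
    return z

# Toy example: five areas in a row, each neighbouring itself and adjacent areas.
x = [10.0, 9.0, 1.0, 0.0, 1.0]
w = [[1.0 if abs(i - j) <= 1 else 0.0 for j in range(5)] for i in range(5)]
for area, value in enumerate(local_g_star(x, w)):
    print(area, round(value, 3))
```

In the toy data the high values sit at one end of the row, so the first area gets a positive value and the last a negative one. Shifting or rescaling all of `x` leaves the results unchanged, which is why the statistic suits variables without a natural origin.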

The Getis-Ord $G_{i}$ are statistics for local spatial association but are not LISAs given the criteria established by Anselin (1995). Their individual components are not related to the global statistic of spatial association ($G$). Anselin notes that “this requirement is not needed for the identification of local spatial clusters, but it is important for the second interpretation of a LISA, as a diagnostic of local instability in measures of global spatial association (for example in the presence of significant global association)” (Anselin, 1995, p. 102).

The tool first produces $G_{i}$ for each area $i$ as a standardised z-value. Getis and Ord (1992) argued that inference, as with global measures, is based on calculating a standardised value and comparing it against a null distribution that is assumed to be normal. However, a normally distributed null may not be an appropriate assumption, as the local $G_{i}$ are, by design, not independent of each other (Ord and Getis, 1995): one region may appear in the weighting vectors of several other regions. This raises a general issue for local measures of spatial autocorrelation: inference is complicated because the statistics are correlated whenever their weights share elements, which they do. This is a problem of multiple statistical comparisons and reflects “the built-in correlatedness of measures for adjoining locations” (Anselin, 1995, p. 112). A more stringent test is therefore required to assert spatial non-randomness, that is, the presence of spatial autocorrelation at the local level. Anselin (1995, p. 96) notes: “This means that when the overall significance associated with the multiple comparisons (correlated tests) is set to $\alpha$, and there are $m$ comparisons, then the individual significance $\alpha_{i}$ should be set to $\alpha/m$ (Bonferroni) or $1 - (1 - \alpha)^{1/m}$” (the latter being the Šidák correction).
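The two adjusted significance levels are simple to compute. The sketch below uses hypothetical function names and a made-up number of comparisons ($m = 1000$) purely for illustration:

```python
def bonferroni_alpha(alpha, m):
    """Individual significance level alpha / m."""
    return alpha / m

def sidak_alpha(alpha, m):
    """Individual significance level 1 - (1 - alpha)^(1/m)."""
    return 1 - (1 - alpha) ** (1 / m)

alpha, m = 0.05, 1000  # m is illustrative only, not taken from the dataset
print(bonferroni_alpha(alpha, m))
print(sidak_alpha(alpha, m))
```

Both thresholds shrink quickly as $m$ grows, with the second form always slightly less strict than $\alpha/m$. This is why far fewer areas are flagged as significant once the adjustment is applied.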

### Set Up

• Select ABS – Socio-Economic Indexes for Areas (SEIFA) – The Index of Relative Socio-economic Disadvantage (SA1) 2016 as your dataset, selecting all variables.
• Spatialise the above dataset using the Spatialise Aggregated Dataset tool or from the dataset’s spanner menu.
• Generate a Contiguous Spatial Weight Matrix for the spatialised dataset, using 1st order Queen contiguity, row-standardised style, and island parameter selected.

Once this has been completed, continue to the next section for instructions on using the tool.
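For readers curious about what a 1st order Queen contiguity, row-standardised matrix looks like, the following is a minimal sketch on a hypothetical regular grid, where cells count as neighbours if they share an edge or a corner. It does not reproduce the portal's tool, which works on the actual SA1 boundaries. The function name `queen_weights` is an assumption made for illustration.

```python
def queen_weights(n_rows, n_cols, row_standardise=True):
    """Queen contiguity weights for an n_rows x n_cols grid, cells numbered row by row."""
    n = n_rows * n_cols
    w = [[0.0] * n for _ in range(n)]
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    inside = 0 <= rr < n_rows and 0 <= cc < n_cols
                    if (dr, dc) != (0, 0) and inside:
                        w[i][rr * n_cols + cc] = 1.0
    if row_standardise:
        for row in w:
            total = sum(row)
            if total > 0:  # areas with no neighbours ("islands") stay all zero
                row[:] = [v / total for v in row]
    return w

w = queen_weights(3, 3)
print(w[4])  # centre cell: eight neighbours with equal weight
print(w[0])  # corner cell: three neighbours with equal weight
```

After row standardisation every row with at least one neighbour sums to 1, so each area's weighted sum is an average over its neighbours.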

### Inputs

Once you have selected these, open the Getis-Ord Local G parameter input window (Tools → Spatial Autocorrelation → Getis-Ord Local G) and enter the parameters as in the image shown below.

• Dataset Input: The dataset that contains the variable(s) to be tested. Select the spatialised dataset we generated.
• Spatial Weights Matrix: The spatial weight matrix to be used. Select the spatial weights matrix we generated.
• Key Column: Specify the unique codes for your areas. Select SA1 11-digit code 2016.
• Variable: The variable to be tested. Select IRSD Score.
• Matrix type: This indicates whether the spatial weights matrix should include $w_{ii} > 0$. Inclusion of the self weight $w_{ii} > 0$ corresponds to $G_{i}^*$. Tick this box.
• Significance level: This corresponds to the significance level used to filter the results. P-values above this will be labelled as “Not Significant” in the output. Type 0.05.

Once you have entered your parameters, click Run Tool.

### Outputs

Your output will be a dataset that can be mapped based on a number of the variables produced by the analysis. These are explained below:

• Z-value: The standardised $G_{i}$ statistic (z-score) for the variable that you chose to include in the analysis, in this instance the IRSD score. It measures how far the weighted sum of values around each area lies from its expected value, in standard deviation units. Large positive values point to clusters of high values and large negative values to clusters of low values.
• p-value_(norm): The statistical significance of your z score using normal significance testing.
• p-value_(bon): The statistical significance of your z score using a Bonferroni significance testing adjustment method.
• map_group_(norm): The map group that each area belongs to when using normal statistical significance testing:
  • 1 = Cluster of High Values
  • 2 = Cluster of Low Values
  • 0 = Non Significant
• map_group_(bon): The map group that each area belongs to when using Bonferroni statistical significance testing:
  • 1 = Cluster of High Values
  • 2 = Cluster of Low Values
  • 0 = Non Significant
• map_group_name(norm): The name of the map group that each area belongs to when using normal statistical significance testing:
  • High = Cluster of High Values
  • Low = Cluster of Low Values
  • Non Significant
• map_group_name(bon): The name of the map group that each area belongs to when using Bonferroni statistical significance testing:
  • High = Cluster of High Values
  • Low = Cluster of Low Values
  • Non Significant
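The relationship between the z-value, the p-values and the map groups can be sketched as follows. This is an illustration under stated assumptions rather than the tool's actual implementation: we assume a two-sided test against the standard normal distribution, and a Bonferroni adjustment that multiplies the p-value by the number of comparisons `m` and caps it at 1. The function names are hypothetical.

```python
import math

def two_sided_p(z):
    """Two-sided p-value of z under a standard normal null (assumed test)."""
    return math.erfc(abs(z) / math.sqrt(2))

def map_group(z, alpha=0.05, m=1):
    """Return (map_group, map_group_name); m > 1 applies the assumed Bonferroni adjustment."""
    p = min(1.0, two_sided_p(z) * m)
    if p > alpha:
        return 0, "Non Significant"
    return (1, "High") if z > 0 else (2, "Low")

for z in (2.5, -2.5, 1.0):
    print(z, map_group(z), map_group(z, m=100))
```

An area with a z-value of 2.5 falls in the High group under normal testing, but drops to Non Significant once the p-value is adjusted for 100 comparisons.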

You can create a Choropleth of these variables. For the image below, we have chosen map_group_name(norm), using 3 classes, and a Qualitative, Dark2 palette. Green indicates SA1s of high index scores (low disadvantage) clustered together, orange indicates clusters of low index scores (high disadvantage), and purple indicates areas that are not significant.

### References

Anselin, L. (1995). Local indicators of spatial association—LISA. Geographical Analysis, 27(2), 93–115.

Getis, A., & Ord, J. K. (1992). The analysis of spatial association by use of distance statistics. Geographical Analysis, 24(3), 189–206.

Ord, J. K., & Getis, A. (1995). Local spatial autocorrelation statistics: Distributional issues and an application. Geographical Analysis, 27(4), 286–306.
