# Moran's I

Moran’s I is a global measure of spatial autocorrelation across an entire study area. The Moran’s I statistic provides an indication of the degree of linear association between the observation vector ($x$) and a vector of spatially weighted averages of neighbouring values ($W_x$), where $W$ is the spatial weight matrix which formalises the neighbourhood or contiguity structure of the dataset (Moran, 1948).

Moran’s I is calculated as:

$\LARGE{I=\frac{R\sum_{i=1}^{R}\sum_{j=1}^{R}x_i x_j w_{ij}} {R_b\sum_{i=1}^{R}x^2_i}}$

Where $R$ is the number of regions in the dataset; $R_b$ is the sum of the weights, which simplifies to $R$ if the spatial weight matrix is row-standardised; and $x_i$ is the variable being tested, measured as a deviation from the mean, i.e. $x_i=X_i-\bar{X}$. The proximity between areas $i$ and $j$ is given by $w_{ij}$, the corresponding element of the spatial weight matrix.
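The formula can be sketched in a few lines of NumPy. This is a minimal illustration rather than the tool's own implementation: the function name `morans_i` and the four-region example are hypothetical, and the weight matrix is assumed to be a dense $R \times R$ array.

```python
import numpy as np

def morans_i(values, weights):
    """Global Moran's I for R values and an R x R spatial weight matrix."""
    X = np.asarray(values, dtype=float)
    W = np.asarray(weights, dtype=float)
    x = X - X.mean()               # deviations from the mean: x_i = X_i - X-bar
    R = x.size                     # number of regions
    R_b = W.sum()                  # sum of all weights
    numerator = R * (x @ W @ x)    # R * sum_i sum_j x_i x_j w_ij
    denominator = R_b * (x @ x)    # R_b * sum_i x_i^2
    return numerator / denominator

# Hypothetical example: four regions in a line, each touching the next.
W_binary = np.array([[0, 1, 0, 0],
                     [1, 0, 1, 0],
                     [0, 1, 0, 1],
                     [0, 0, 1, 0]], dtype=float)
# Row-standardise so every row sums to 1 (R_b then equals R).
W_row = W_binary / W_binary.sum(axis=1, keepdims=True)
values = np.array([1.0, 2.0, 3.0, 4.0])

print(morans_i(values, W_row))
```

With the row-standardised matrix the sum of the weights equals the number of regions, so the leading ratio $R/R_b$ drops out of the calculation.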

The Moran’s I statistic can be expressed as a standard normal z-value for inference purposes, computed by:

$\LARGE{Z_I=\frac{I-E(I)}{sd(I)}}$

Where $I$ is Moran’s I statistic, $E(I)$ is the theoretical mean and $sd(I)$ the theoretical standard deviation of Moran’s I statistic.
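As a rough sketch of this standardisation, the code below computes the z-value under the normality assumption, using $E(I)=-1/(R-1)$ and the standard textbook variance expression in terms of the weight sums $S_0$, $S_1$ and $S_2$. The function name and example data are hypothetical, and we are assuming (not confirming) that the tool uses the same expressions internally.

```python
import numpy as np

def moran_z_normality(values, weights):
    """Return (I, E(I), Var(I), z) under the normality assumption."""
    X = np.asarray(values, dtype=float)
    W = np.asarray(weights, dtype=float)
    x = X - X.mean()
    R = x.size
    S0 = W.sum()                                        # sum of all weights
    S1 = 0.5 * ((W + W.T) ** 2).sum()                   # pairwise weight term
    S2 = ((W.sum(axis=1) + W.sum(axis=0)) ** 2).sum()   # row plus column sums
    I = (R / S0) * (x @ W @ x) / (x @ x)
    expected = -1.0 / (R - 1)                           # theoretical mean E(I)
    variance = ((R**2 * S1 - R * S2 + 3 * S0**2)
                / ((R**2 - 1) * S0**2)) - expected**2
    z = (I - expected) / np.sqrt(variance)
    return I, expected, variance, z

# Hypothetical example: four regions in a line with binary weights.
W_line = np.array([[0, 1, 0, 0],
                   [1, 0, 1, 0],
                   [0, 1, 0, 1],
                   [0, 0, 1, 0]], dtype=float)
I, expected, variance, z = moran_z_normality([1.0, 2.0, 3.0, 4.0], W_line)
print(I, expected, variance, z)
```

The variance under the randomisation assumption additionally depends on the kurtosis of the variable and is not shown here.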

Anselin (1992, p. 134) points out the different assumptions that can be made about the data that affect the calculation of the standard deviation. The tool allows the standard deviation to be calculated under the assumption of either randomisation or normality. Under both assumptions the computed z-value asymptotically follows a normal distribution, so its significance can be evaluated by means of a standard normal table. Alternatively, a user may prefer either a Saddlepoint approximation (Tiefelsdorf, 2002) or an exact test (Tiefelsdorf & Boots, 1995), both of which are also provided in the tool’s output. In general, these make little difference to the significance of global tests unless $R$ is quite small.

Moran’s I values generally fall between -1 and 1. Under the null hypothesis of no spatial autocorrelation the expected value is $E(I)=-1/(R-1)$, which approaches 0 as $R$ grows, so an estimate close to 0 implies no spatial autocorrelation. For a significant estimate, the closer it is to 1, the stronger the positive spatial autocorrelation; the closer it is to -1, the stronger the negative spatial autocorrelation.

Anselin (1996) shows the relationship between the Moran Scatterplot and the Moran’s I statistic. The Moran Scatterplot visualises and identifies the degree of local spatial instability in spatial association that is present in the Moran’s I statistic. The Moran’s I statistic can be interpreted as the regression coefficient in a bivariate regression of the spatially lagged variable, $W_x$, on the original variable, $x$ (in deviations from the mean). The spatially lagged variable, $W_x$, is the average of observations at neighbouring locations, that is, locations for which $w_{ij}\neq0$.
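The regression interpretation can be checked numerically. The sketch below uses a hypothetical ring of six regions with a row-standardised weight matrix and simulated values; it fits a line to the lagged variable against the original variable and compares the slope with Moran’s I computed directly. The equality relies on the matrix being row-standardised, so that the sum of the weights equals $R$.

```python
import numpy as np

# Hypothetical layout: six regions arranged in a ring, each with two neighbours.
R = 6
W = np.zeros((R, R))
for i in range(R):
    W[i, (i - 1) % R] = 1.0
    W[i, (i + 1) % R] = 1.0
W = W / W.sum(axis=1, keepdims=True)   # row-standardise

rng = np.random.default_rng(0)
X = rng.normal(size=R)                 # simulated values, for illustration only
x = X - X.mean()                       # deviations from the mean
lag = W @ x                            # spatially lagged variable

# Slope of the least-squares line of the lagged variable on the original variable.
slope, intercept = np.polyfit(x, lag, 1)

# Moran's I computed directly from the formula.
moran_i = (R / W.sum()) * (x @ W @ x) / (x @ x)

print(slope, moran_i)
```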

This interpretation of Moran’s I translates easily to a bivariate scatterplot with $x$ on the x-axis, $W_x$ on the y-axis and Moran’s I as the slope of the line of best fit. The scatterplot is useful for identifying observations that do not conform to the global pattern, differing significantly from the global Moran’s I in magnitude and/or direction. The scatterplot is centred on the point where the mean of the variable meets the mean of the lagged variable, and the four quadrants of the plot relative to this point give information about the type of association that is present. The upper right and lower left quadrants represent positive spatial association, while the upper left and lower right quadrants contain observations that have negative spatial association. The density of points in each quadrant indicates which spatial pattern dominates and also provides information on the distribution of the individual spatial associations and the contribution of each to the global statistic. Further, the plot shows outliers and leverage points and can provide an overall picture of the consistency of the global indicator.
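The quadrant logic described above can be sketched as a small classifier. The labels `HH`, `LL`, `LH` and `HL` (value first, lagged value second) are a common naming convention that we are assuming here, and the function name and example data are hypothetical.

```python
import numpy as np

def moran_quadrants(values, weights):
    """Label each observation by its Moran scatterplot quadrant."""
    X = np.asarray(values, dtype=float)
    W = np.asarray(weights, dtype=float)
    lag = W @ X                          # spatially lagged variable
    high_x = X > X.mean()                # right of the plot centre
    high_lag = lag > lag.mean()          # above the plot centre
    labels = np.empty(X.size, dtype="<U2")
    labels[high_x & high_lag] = "HH"     # upper right: positive association
    labels[~high_x & ~high_lag] = "LL"   # lower left: positive association
    labels[~high_x & high_lag] = "LH"    # upper left: negative association
    labels[high_x & ~high_lag] = "HL"    # lower right: negative association
    return labels

# Hypothetical example: four regions in a line, row-standardised weights.
W_line = np.array([[0.0, 1.0, 0.0, 0.0],
                   [0.5, 0.0, 0.5, 0.0],
                   [0.0, 0.5, 0.0, 0.5],
                   [0.0, 0.0, 1.0, 0.0]])
clustered = moran_quadrants([1.0, 2.0, 3.0, 4.0], W_line)
alternating = moran_quadrants([1.0, 4.0, 1.0, 4.0], W_line)
print(clustered, alternating)
```

Counting the labels gives a quick view of which quadrants, and therefore which type of association, dominate.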

### Set Up

To demonstrate this tool in use, we will look at socio-economic disadvantage data in Greater Perth to examine the degree of spatial autocorrelation.

Select Greater Perth GCCSA as your area.

Select ABS – Socio-Economic Indexes for Areas (SEIFA) – The Index of Relative Socio-economic Disadvantage (SA2) 2016 as your dataset, and select IRSD Score as the variable.

Use the Spatialise Aggregated Dataset tool to spatialise the dataset.

Use the Contiguous Spatial Weight Matrix tool to build a Spatial Weights Matrix for the spatialised dataset, using 1st order, row-standardised, Queen contiguity.
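To make the idea of 1st order, row-standardised Queen contiguity concrete, the sketch below builds such a matrix for a hypothetical regular grid, where cells sharing an edge or a corner count as neighbours. The tool itself works from the SA2 polygon boundaries, so the grid and the function name `queen_weights_grid` are illustrative assumptions only.

```python
import numpy as np

def queen_weights_grid(n_rows, n_cols, row_standardise=True):
    """1st order Queen contiguity weights for an n_rows x n_cols grid.

    Assumes every cell has at least one neighbour (grid larger than 1 x 1).
    """
    n = n_rows * n_cols
    W = np.zeros((n, n))
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            # Visit the eight surrounding cells (edges and corners).
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < n_rows and 0 <= cc < n_cols:
                        W[i, rr * n_cols + cc] = 1.0
    if row_standardise:
        # Divide each row by its neighbour count so the row sums to 1.
        W = W / W.sum(axis=1, keepdims=True)
    return W

W_grid = queen_weights_grid(3, 3)
print(W_grid[4])   # weights for the centre cell of the 3 x 3 grid
```

In the 3 x 3 grid the centre cell has eight neighbours and each corner cell has three, so after row-standardisation their non-zero weights are 1/8 and 1/3 respectively.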

### Inputs

Open the Moran’s I tool (Tools → Spatial Autocorrelation → Moran’s I) and enter the following parameters:

• Dataset Input: The dataset that contains the variable to be tested. Select the Spatialised Dataset.
• Spatial Weights Matrix: The spatial weight matrix to be used. Select the Contiguous Spatial Weight Matrix.
• Key Column: Specify the unique codes for your areas. Select SA2 9-digit Code.
• Variable: The variable to be tested. Select IRSD Score.
• Alternative Hypothesis: Specifies the alternative hypothesis. Select two.sided.
    • two.sided: a priori assumption that the difference between I and the expected E[I] is not equal to zero (spatial autocorrelation).
    • greater: a priori assumption that I is greater than the expected E[I] (positive spatial autocorrelation).
    • less: a priori assumption that I is less than the expected E[I] (negative spatial autocorrelation).
• Inference: Indicates the assumption under which the variance should be calculated. A tick indicates randomisation, blank indicates normality. Select ticked.
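The three Alternative Hypothesis options correspond to different tails of the standard normal distribution. The sketch below shows one way a p-value could be derived from a z-value for each option; the function name `p_value_from_z` is hypothetical, and we are assuming the normal approximation rather than the Saddlepoint or exact methods.

```python
import math

def p_value_from_z(z, alternative="two.sided"):
    """p-value of a standard normal z-value for the chosen alternative."""
    upper_tail = 0.5 * math.erfc(z / math.sqrt(2))   # P(Z >= z)
    if alternative == "greater":
        return upper_tail                  # evidence that I exceeds E(I)
    if alternative == "less":
        return 1.0 - upper_tail            # evidence that I is below E(I)
    if alternative == "two.sided":
        return 2.0 * min(upper_tail, 1.0 - upper_tail)
    raise ValueError("alternative must be 'two.sided', 'greater' or 'less'")

for option in ("two.sided", "greater", "less"):
    print(option, p_value_from_z(2.5, option))
```

A two.sided test spreads the rejection region across both tails, so for the same z-value its p-value is twice that of the matching one-sided test.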

The input parameters are summarised in the image below. Once complete, click Run Tool.

### Outputs

Once the tool has finished, tick both boxes and click Display Output. This will open up two outputs.

The first output is a data file that you can map, containing your input variables and the following:

• <input_variable>_Lagged: the average value of the variable for the areas surrounding each area.
• <input_variable>_Scaled: the value of the variable for your area, scaled (z-score).
• <input_variable>_Lagged_Scaled: the average value of the variable for the areas surrounding each area, scaled (z-score).
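The three added columns can be approximated as follows. This is a sketch under stated assumptions, not the tool's code: we assume a row-standardised weight matrix, z-scores based on the sample standard deviation, and that the scaled lag is the z-score of the lagged column itself. The function name and the dictionary keys are hypothetical.

```python
import numpy as np

def moran_output_columns(values, weights):
    """Approximate the Lagged, Scaled and Lagged_Scaled output columns."""
    X = np.asarray(values, dtype=float)
    W = np.asarray(weights, dtype=float)
    lagged = W @ X                                   # neighbour average
    scaled = (X - X.mean()) / X.std(ddof=1)          # z-score of the variable
    # Assumption: the lagged column is standardised on its own mean and sd.
    lagged_scaled = (lagged - lagged.mean()) / lagged.std(ddof=1)
    return {"Lagged": lagged, "Scaled": scaled, "Lagged_Scaled": lagged_scaled}

# Hypothetical example: four regions in a line, row-standardised weights.
W_line = np.array([[0.0, 1.0, 0.0, 0.0],
                   [0.5, 0.0, 0.5, 0.0],
                   [0.0, 0.5, 0.0, 0.5],
                   [0.0, 0.0, 1.0, 0.0]])
columns = moran_output_columns([1.0, 2.0, 3.0, 4.0], W_line)
print(columns["Lagged"])
```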

The second output is a text window with a matrix of outputs for your Moran’s I test under all of the different conditions:

• Moran’s I:
    • Estimate: The Moran’s I statistic under the normality assumption.
    • Std. Deviate: The standard deviate of the Moran’s I statistic under the normality assumption, which can also be interpreted as a z-score.
    • p-value: The p-value of the Moran’s I statistic under the normality assumption.
    • Expectation: The expected Moran’s I statistic under the normality assumption.
    • Variance: The variance of the Moran’s I statistic under the normality assumption.
    • Estimate: The Moran’s I statistic with a Saddlepoint approximation.
    • Saddlepoint: The standard deviate of the Moran’s I statistic with a Saddlepoint approximation, which can also be interpreted as a z-score.
    • p-value (sad): The p-value of the Moran’s I statistic with a Saddlepoint approximation.
• Moran’s I (Exact):
    • Estimate: The Moran’s I statistic under an Exact test.
    • Exact SD: The standard deviate of the Moran’s I statistic under an Exact test, which can also be interpreted as a z-score.
    • p-value (exact): The p-value of the Moran’s I statistic under an Exact test.

We can see from the results that the Moran’s I value is moderately high in all instances and is also significant (p < 0.05). Therefore, we can reject the null hypothesis (no spatial autocorrelation) and accept the alternative hypothesis that there is moderate positive spatial autocorrelation (clustering) of IRSD scores by SA2 across Greater Perth.

Anselin, L. (1992). SpaceStat Tutorial: A workbook for using SpaceStat in the analysis of spatial data. UC-Santa Barbara: Department of Geography.

Anselin, L. (1995). Local Indicators of Spatial Association – LISA. Geographical Analysis, 27(2), 93-115.

Anselin, L. (1996). The Moran Scatterplot as an ESDA Tool to Assess Local Instability in Spatial Association. In M. M. Fischer, H. J. Scholten & D. Unwin (Eds.), Spatial Analytical Perspectives on GIS (pp. 111-125). Taylor & Francis, London.

Moran, P. A. (1948). The interpretation of statistical maps. Journal of the Royal Statistical Society. Series B (Methodological), 10(2), 243-251.

Tiefelsdorf, M. & Boots, B. (1995). The Exact Distribution of Moran’s I. Environment and Planning A: Economy and Space, 27(6), 985-999.

Tiefelsdorf, M. (2002). The Saddlepoint Approximation of Moran’s I’s and Local Moran’s I’s Reference Distributions and Their Numerical Evaluation. Geographical Analysis, 34(3), 187-206.
