# Bivariate Moran's I

Bivariate Moran’s I is a global measure of spatial autocorrelation that measures the influence one variable has on the occurrence of another variable in close proximity. Whereas the original Moran’s I statistic measures the degree of linear association between the values of a single variable in neighbouring regions, the Bivariate Moran’s I statistic indicates the degree of linear association between one variable, $x$, in a region and a different variable in neighbouring regions, $\sum_{j}y_j w_{ij}$ (but not in the same region).

Bivariate Moran’s I is computed as:

$\LARGE{I_t=\frac{R\sum_{i=1}^{R}\sum_{j=1}^{R}x_i y_j w_{ij}} {R_b\sum_{i=1}^{R}x^2_i}}$

Where $R$ is the number of regions in the dataset; $R_b$ is the sum of the weights, which simplifies to $R$ if the spatial weight matrix is row-standardised; $x_i$ is the first variable, measured as a deviation from the mean, i.e. $x_i=X_i-\bar{X}$; and $y_i$ is the second variable, also measured as a deviation from the mean, i.e. $y_i=Y_i-\bar{Y}$. The spatial proximity of regions $i$ and $j$ is given by $w_{ij}$, the corresponding element of the spatial weight matrix.
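As an illustration only, the formula can be evaluated directly in a few lines of plain Python. This is a minimal sketch and not the tool's own implementation: the function names `row_standardise` and `bivariate_morans_i`, the four-region weights matrix, and the values of `X` and `Y` are all hypothetical.

```python
def row_standardise(w):
    """Scale each row of a weights matrix so that it sums to 1."""
    standardised = []
    for row in w:
        total = sum(row)
        # Regions without neighbours keep an all-zero row.
        standardised.append([v / total if total else 0.0 for v in row])
    return standardised


def bivariate_morans_i(X, Y, w):
    """Bivariate Moran's I for raw values X, Y and weights matrix w."""
    R = len(X)
    x_bar = sum(X) / R
    y_bar = sum(Y) / R
    x = [v - x_bar for v in X]  # deviations from the mean
    y = [v - y_bar for v in Y]
    R_b = sum(sum(row) for row in w)  # sum of all weights
    cross = sum(x[i] * w[i][j] * y[j] for i in range(R) for j in range(R))
    return (R * cross) / (R_b * sum(v * v for v in x))


# Hypothetical toy data: four regions in a row, each adjacent to the next.
binary_w = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
w = row_standardise(binary_w)
X = [1.0, 2.0, 3.0, 4.0]
Y = [2.0, 1.0, 4.0, 3.0]

I = bivariate_morans_i(X, Y, w)
print(round(I, 3))  # 0.8
```

Because the weights are row-standardised here, the sum of the weights equals the number of regions, so the leading ratio $R/R_b$ is 1.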

In a similar way to Moran’s I, the Bivariate Moran’s I statistic is expressed as a standardised normal Z-score for inference purposes, computed by:

$\LARGE{Z_I=\frac{I-E(I)}{sd(I)}}$

Where $I$ is Bivariate Moran’s I statistic, $E(I)$ is the theoretical mean and $sd(I)$ the theoretical standard deviation of the Bivariate Moran’s I statistic.

The Bivariate Moran’s I statistic ranges between -1 and 1. An estimate of 0 implies no spatial autocorrelation. For a significant estimate, the closer it is to 1, the stronger the positive spatial autocorrelation; the closer it is to -1, the stronger the negative spatial autocorrelation.

The Bivariate Moran’s I statistic can be interpreted as the regression coefficient in a bivariate regression of the spatially lagged second variable, $W_y$, on the original variable, $x$ (both in deviations from the mean). The spatially lagged second variable, $W_y$, is the average of the observations of the second variable at neighbouring locations, that is, locations for which $w_{ij}\neq0$. In the corresponding scatterplot, the first variable, $x$, is placed on the x-axis and the lagged version of the second variable, $W_y$, on the y-axis. As with the univariate Moran’s I, the slope of the line of best fit is the Bivariate Moran’s I value.
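The regression interpretation can be checked numerically. The following is a minimal sketch in plain Python, assuming a row-standardised weights matrix; the helper names `spatial_lag` and `ols_slope` and the four-region toy data are hypothetical and are not part of the tool.

```python
def spatial_lag(values, w):
    """Weighted average of neighbouring values for each region."""
    return [sum(w_ij * v for w_ij, v in zip(row, values)) for row in w]


def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    s_xx = sum((a - x_bar) ** 2 for a in x)
    return s_xy / s_xx


# Hypothetical toy data: four regions in a row, row-standardised weights.
w = [
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
]
X = [1.0, 2.0, 3.0, 4.0]
Y = [2.0, 1.0, 4.0, 3.0]

# Express both variables as deviations from their means.
x = [v - sum(X) / len(X) for v in X]
y = [v - sum(Y) / len(Y) for v in Y]

W_y = spatial_lag(y, w)     # values plotted on the y-axis
slope = ols_slope(x, W_y)   # slope of the line of best fit

# Direct evaluation of the statistic for comparison (row-standardised case).
direct = sum(a * b for a, b in zip(x, W_y)) / sum(a * a for a in x)
print(round(slope, 3), round(direct, 3))  # 0.8 0.8
```

The two numbers agree because $x$ is centred on its mean, so the intercept of the fitted line does not affect the slope.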

Words of Caution:

Inference for Bivariate Moran’s I is based on the standardised normal Z-score (null hypothesis = spatial randomness). However, because the statistic is only asymptotically normal (that is, as $n$ increases, the actual distribution of the test statistic gets closer to the normal distribution), results for small sample sizes ($n<40$) should be interpreted with caution, as the assumption of normality may not be valid (Czaplewski & Reich, 1993). For small sample sizes, permutation-based approaches should be used.

Furthermore, Anselin (2019) points out that the concept of bivariate spatial correlation is often misinterpreted as the correlation between one variable and the spatial lag of another variable. But this does not account for the inherent correlation between the two variables. Hence, this statistic should be interpreted with caution, as it can overestimate the spatial aspect of the correlation that instead may be due mostly to the in-place correlation.
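The permutation-based approach recommended above for small samples can be sketched as follows. This is an illustrative outline in plain Python, not the procedure used by the tool or described by Czaplewski and Reich: it assumes that the second variable is reshuffled across regions, that the pseudo p-value is two-sided and compares absolute values of the statistic, and that 999 permutations are enough. The function names and the six-region toy data are hypothetical.

```python
import random


def bivariate_morans_i(x, y, w):
    """Bivariate Moran's I for mean-centred x, y and weights matrix w."""
    n = len(x)
    total_weight = sum(sum(row) for row in w)
    cross = sum(x[i] * w[i][j] * y[j] for i in range(n) for j in range(n))
    return (n * cross) / (total_weight * sum(v * v for v in x))


def permutation_p_value(x, y, w, n_perm=999, seed=0):
    """Two-sided pseudo p-value from random reshuffles of y across regions."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = bivariate_morans_i(x, y, w)
    shuffled = list(y)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(bivariate_morans_i(x, shuffled, w)) >= abs(observed):
            extreme += 1
    # Add 1 to numerator and denominator to count the observed arrangement.
    return observed, (extreme + 1) / (n_perm + 1)


# Hypothetical toy data: six regions in a row, each adjacent to the next.
n = 6
w = [[0.0] * n for _ in range(n)]
for i in range(n - 1):
    w[i][i + 1] = w[i + 1][i] = 1.0
w = [[v / sum(row) for v in row] for row in w]  # row-standardise

X = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
Y = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
x = [v - sum(X) / n for v in X]
y = [v - sum(Y) / n for v in Y]

observed, p_value = permutation_p_value(x, y, w)
print(round(observed, 3), p_value)
```

The pseudo p-value is the share of reshuffled datasets whose statistic is at least as extreme as the observed one, so it never relies on the normal approximation.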

### SET UP

To demonstrate this tool in use, we will examine the degree of spatial autocorrelation between education and employment in Greater Brisbane.

Select Greater Brisbane GCCSA as your area.

Select ABS – Data by Region – Education & Employment (SA2) 2011-2017 as your dataset. Select Labour Force Statistics Participation Rate % and Persons With Post School Qualifications Bachelor Degree % as the variables.

Use the Spatialise Aggregated Dataset tool to spatialise the dataset.

Use the Contiguous Spatial Weight Matrix tool to build a Spatial Weights Matrix for the spatialised dataset, using 1st order, row-standardised, Queen contiguity.

### Inputs

Open the Bivariate Moran’s I tool (Tools → Spatial Autocorrelation → Bivariate Moran’s I) and enter the following parameters:

• Dataset Input: The dataset that contains the variable to be tested. Select the Spatialised Dataset.
• Spatial Weights Matrix: The spatial weight matrix to be used. Select the Contiguous Spatial Weight Matrix.
• Key Column: Specify the unique codes for your areas. Select SA2 Code.
• X Variable: The first variable. Select Persons With Post School Qualifications Bachelor Degree %.
• Y Variable: The second (or alternative) variable. Select Labour Force Statistics Participation Rate %.
• Alternative Hypothesis: Specifies the alternative hypothesis. Select two.sided.
  • two.sided: a priori assumption that the difference between I and the expected E[I] is not equal to zero (spatial autocorrelation).
  • greater: a priori assumption that I is greater than the expected E[I] (positive spatial autocorrelation).
  • less: a priori assumption that I is less than the expected E[I] (negative spatial autocorrelation).
• Inference: Indicates the assumption under which the variance should be calculated. A tick indicates randomisation; blank indicates normality. Leave unticked.
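To make the three alternative hypotheses concrete, the sketch below converts a standard deviate (z-score) into a p-value under a normal approximation. It is an illustration of the usual convention rather than the tool's own code; the function name `p_value` and the example deviate of 1.96 are hypothetical.

```python
from statistics import NormalDist


def p_value(z, alternative="two.sided"):
    """p-value for a standard deviate z under a standard normal null."""
    cdf = NormalDist().cdf  # standard normal cumulative distribution
    if alternative == "two.sided":
        return 2.0 * (1.0 - cdf(abs(z)))  # I differs from E[I]
    if alternative == "greater":
        return 1.0 - cdf(z)               # I is greater than E[I]
    if alternative == "less":
        return cdf(z)                     # I is less than E[I]
    raise ValueError(f"unknown alternative: {alternative!r}")


z = 1.96  # hypothetical standard deviate
print(round(p_value(z, "two.sided"), 3))  # 0.05
print(round(p_value(z, "greater"), 3))    # 0.025
print(round(p_value(z, "less"), 3))       # 0.975
```

For a positive deviate, the one-sided "greater" test gives half the two-sided p-value, which is why the alternative should be chosen before looking at the result.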

The input parameters are summarised in the image below. Once complete, click Run Tool.

### Outputs

Once the tool has finished, tick both boxes and click Display Output. This will open up two outputs.

The first output is a data file that you can map, containing your input variables and the following:

• <Y_variable>_Lagged: the average value of the Y variable for the areas surrounding each area.
• <X_variable>_Scaled: the value of the X variable for each area, scaled (z-score).
• <Y_variable>_Lagged_Scaled: the average value of the Y variable for the areas surrounding each area, scaled (z-score).

The second output is a text window with a matrix of outputs for your Bivariate Moran’s I test:

• Estimate: The Bivariate Moran’s I statistic.
• Std. Deviate: The standard deviate of the Bivariate Moran’s I statistic; this can also be interpreted as a z-score.
• p-value: The p-value of the Bivariate Moran’s I statistic.
• Expectation: The expected Bivariate Moran’s I statistic.
• Variance: The variance of the Bivariate Moran’s I statistic.

The results show that the Bivariate Moran’s I value is positive but low, and is also significant (p < 0.05). Therefore, we can reject the null hypothesis (no spatial autocorrelation) in favour of the alternative hypothesis that there is a weak positive relationship between the percentage of persons with bachelor degrees and labour force participation in neighbouring areas. This indicates that SA2s with higher percentages of persons with bachelor degrees tend to be surrounded by SA2s with higher rates of labour force participation.

Anselin, L. (2019, June 3). Global Spatial Autocorrelation (2) – Bivariate, Differential and EB Rate Moran Scatter Plot. GeoDa. https://geodacenter.github.io/workbook/5b_global_adv/lab5b.html

Centre of Full Employment and Equity. (2015). AURIN spatial statistics and econometrics e-tools help file. University of Newcastle.

Czaplewski, R. L. & Reich, R. M. (1993). Expected Value and Variance of Moran’s Bivariate Spatial Autocorrelation Statistic for a Permutation Test. (U.S. Department of Agriculture, Rocky Mountain Forest and Range Experiment Station, Research Paper RM-309).
