Housing Stress and Income
The first element of our housing data that we will look at relates to housing stress and its relationship to other factors such as income.
To start off, we will make some maps:
Create a Choropleth Map of the Tenure Type Owned With A Mortgage % Variable
Create a Choropleth Centroid Map of the Household Stress Households With Mortgage Repayments Greater Than Or Equal To 30% Of Household Income % Variable
Your maps should look something like the images below


Key Research Question: Do you think there is a relationship between the proportion of households in an SA2 who are servicing a mortgage, and the proportion of households who are spending more than 30% of their income on servicing a mortgage?
To answer the above question, it is worthwhile first of all to view the shape of this potential relationship with a scatterplot. A scatterplot allows a preliminary ‘eyeballing’ of the data before bringing more rigorous statistical tests to bear on it.
Create an Interactive Scatterplot of Tenure Type Owned With A Mortgage % (x axis) vs Household Stress Households With Mortgage Repayments Greater Than Or Equal To 30% Of Household Income % (y axis)
Your scatterplot should look something like the image below

This suggests that there is a relationship, but on closer inspection of the variables that we have mapped, it is not particularly insightful: areas with more households servicing a mortgage also have more households spending 30% or more of their income on that mortgage. As one goes up, the other is likely to go up.
What we are potentially more interested in is understanding which areas have higher proportions of households spending 30% or more of their income on a mortgage as a proportion of the households servicing a mortgage, not as a proportion of all households.
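To make the difference between the two denominators concrete, here is a minimal Python sketch of the arithmetic. The area names, field names (`mortgage_pct`, `stress_pct`, `prop_ms_m`) and figures are hypothetical and chosen only for illustration; they are not taken from the portal data.

```python
# Hypothetical SA2 records: both percentages are expressed as a share of ALL households.
sa2_rows = [
    {"sa2": "Area A", "mortgage_pct": 40.0, "stress_pct": 4.0},
    {"sa2": "Area B", "mortgage_pct": 20.0, "stress_pct": 4.0},
]


def stress_share_of_mortgaged(stress_pct, mortgage_pct):
    """Return stressed mortgaged households as a share of mortgaged households."""
    if mortgage_pct == 0:
        return None  # no mortgaged households, so the share is undefined
    return stress_pct / mortgage_pct


for row in sa2_rows:
    # Dividing one percentage by the other cancels the shared denominator (all households).
    row["prop_ms_m"] = stress_share_of_mortgaged(row["stress_pct"], row["mortgage_pct"])
    print(row["sa2"], row["prop_ms_m"])
```

In this made-up example both areas have 4% of all households in mortgage stress, yet the share of mortgaged households in stress is twice as high in Area B, which is the distinction the new measure is meant to capture.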
We can generate this measure in the portal by using the Generate tool.
Using the Generate tool, create a new dataset with a new variable that is Household Stress Households With Mortgage Repayments Greater Than Or Equal To 30% Of Household Income % divided by Tenure Type Owned With A Mortgage %
Create a Choropleth Map of this new variable (making sure to switch off your other layers as well!)
Your map should look something like the images below


Key Research Question: What is the potential relationship between the proportion of mortgaged households spending more than 30% of their income on their mortgage and the average income and median mortgage repayments?
Now we are starting to ask some interesting questions: do areas with a greater proportion of mortgaged households spending more than 30% of their income on their mortgage have higher or lower average mortgage repayments, or higher or lower incomes?
To answer this question, first we want to map mortgage repayments on top of our proportion map.
Create a Choropleth Centroid Map of the Median Monthly Mortgage Repayment
Create an Interactive Scatterplot of Proportion of Mortgaged Households Spending 30% or more of Income on Mortgage vs Median Monthly Mortgage Repayments
Create an Interactive Scatterplot of Proportion of Mortgaged Households Spending 30% or more of Income on Mortgage vs Median Weekly Total Household Income
Your map and scatterplot should look something like the images below

Median Monthly Mortgage Repayments:

Median Weekly Household Income:

We can see that there is a data point which is an outlier, and which could skew any future statistical analysis. We need to remove this point from our dataset.
Using the Dataset Attribute Filter tool, create a new dataset that has this row removed (hint: only keep rows where the value of the column you created is less than 1!)
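The filter rule in the hint can be sketched in Python as follows. This is only an illustration of the logic, not the portal's implementation; the rows, the `prop_ms_m` column name and the values are hypothetical. The threshold of 1 follows the hint: because stressed mortgaged households are a subset of mortgaged households, a ratio above 1 should not occur in consistent data.

```python
# Hypothetical rows; "prop_ms_m" stands in for the column created with the Generate tool.
rows = [
    {"sa2": "Area A", "prop_ms_m": 0.10},
    {"sa2": "Area B", "prop_ms_m": 0.20},
    {"sa2": "Area C", "prop_ms_m": 3.50},  # implausible value, treated as the outlier
    {"sa2": "Area D", "prop_ms_m": None},  # missing value (an assumption about the data)
]


def keep_below(rows, column="prop_ms_m", upper=1.0):
    """Keep only rows whose value in `column` is present and strictly less than `upper`."""
    return [r for r in rows if r[column] is not None and r[column] < upper]


filtered = keep_below(rows)
print([r["sa2"] for r in filtered])
```

Dropping rows with missing values is a choice made for this sketch; how the portal's filter treats them is not stated in the text.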
Now that we have filtered our dataset, we are going to run a correlation analysis of our data. We want to understand whether there are positive or negative relationships between the variable that we created and either the median amount spent on mortgage repayments or the median household income in each area.
Remember, a correlation value ranges from -1 to 1. The sign of the correlation tells us whether it is positive or negative. Positive correlations mean that as one variable goes up, the other does too. Negative correlations mean that as one variable goes up, the other goes down. The distance of a correlation value from 0 towards either -1 or 1 indicates how strong the correlation is: the closer to zero, the weaker the relationship; the closer to -1 or 1, the stronger the relationship.
Using the Correlation tool, run a correlation matrix for the three variables: median monthly mortgage repayments, median weekly total household income, and the proportion of mortgaged households spending more than 30% of their income servicing their mortgage.
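As a rough illustration of how such a value is obtained, the sketch below computes a Pearson correlation coefficient by hand. It assumes the Correlation tool reports Pearson's r, which the text does not state, and the input lists are invented purely to show the sign behaviour.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (at least 2 values)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations, and of squared deviations for each variable.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Illustrative data only: y rises with x in the first case and falls in the second.
r_up = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
r_down = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])
print(r_up, r_down)
```

The first pair moves perfectly together, so its coefficient sits at the positive end of the range; the second moves perfectly in opposite directions, so it sits at the negative end.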
The output of your analysis should look something like the image below

The problem with this output is that it is shown as plain text in a text editor, which makes it hard to read. However, if you copy and paste these values into Excel, you should be able to see the data structured in a more meaningful way, similar to the table below.
Correlation R value and P Value Matrix
| | PropMS_M | M0_median_mortgage_repay_monthly | M0_median_tot_hhd_inc_weekly |
| PropMS_M | 1 | 0 | 0 |
| M0_median_mortgage_repay_monthly | -0.3007 | 1 | 0 |
| M0_median_tot_hhd_inc_weekly | -0.4755 | 0.865 | 1 |
The matrix is mirrored, so you only need to look at the three values in the bottom left of the table (in bold/blue) for the correlation values. The first column shows that there are moderate negative relationships between the proportion of mortgaged households spending 30% or more of their income servicing their mortgage and both the median mortgage repayments (-0.3007) and the median household income (-0.4755). The top right of the table represents the statistical significance of the relationships. These relationships are statistically significant (the P-Values are negligibly close to zero), so they are shown in green.
Key Take-Home Message #1: Areas with lower incomes, and with lower monthly mortgage repayments, nonetheless have higher proportions of households spending more than 30% of their total income servicing their mortgages.
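The reading rule described above (correlation values below the diagonal, P-values above it) can be sketched in Python. The matrix values are copied from the table; the function name `read_pairs` and the tuple layout are illustrative choices, not part of the Correlation tool's output.

```python
labels = [
    "PropMS_M",
    "M0_median_mortgage_repay_monthly",
    "M0_median_tot_hhd_inc_weekly",
]

# Combined matrix from the table: R values below the diagonal, P values above it.
matrix = [
    [1.0, 0.0, 0.0],
    [-0.3007, 1.0, 0.0],
    [-0.4755, 0.865, 1.0],
]


def read_pairs(labels, matrix):
    """Return (row_label, column_label, r_value, p_value) for each pair of variables."""
    pairs = []
    for i in range(len(labels)):
        for j in range(i):  # j < i selects the cells below the diagonal
            r_value = matrix[i][j]  # bottom-left cell holds the correlation
            p_value = matrix[j][i]  # mirrored top-right cell holds its P value
            pairs.append((labels[i], labels[j], r_value, p_value))
    return pairs


for row_label, col_label, r_value, p_value in read_pairs(labels, matrix):
    print(f"{row_label} vs {col_label}: r = {r_value}, p = {p_value}")
```

Pairing each bottom-left cell with its mirrored top-right cell keeps the correlation and its significance together, which is how the table is meant to be read.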