Case study – Data to find the missing
Aims and benefits
In this case study, Paul Clarke from FiguringOutData.com describes how data was used to find people who needed help, even though they had disappeared from sight, leaving few traces behind them.
The people were tenants of a Housing Association Group – a not-for-profit enterprise that provides homes for those unable to do so for themselves, paid for via a government-provided housing benefit.
The Association is in touch with most of them – through rent payments, fire and safety checks at the property, and the occasional call or email.
But there are some tenants who effectively disappear. Rent payments dwindle and then stop, and efforts to reach them fail.
The tenant might have simply left, in which case the Association can offer the house to someone else. They might, however, be in need of help – due to illness, bereavement, or simply a descent into a chaotic lifestyle over which they have little control.
If someone has effectively abandoned their home, they are unlikely to qualify for another. So the question became: ‘can we identify those at greatest risk of abandoning their home – in time to help?’
To find this group we needed data. But as tenants steadily disengage, any data relating to them also dries up. So finding them was going to be tough.
For each person identified as being at risk, some form of intervention would be needed, but at what cost? There were about 40k tenants in total. If just 1% were flagged as being at risk, that would still be 400 people. Would this mean front line staff knocking on 400 doors, talking to 400 sets of neighbours – trying to find a clue about the tenant’s whereabouts and circumstances?
A precise prediction method was needed – one that could attach a probability of the tenant being in some sort of distress to each address in the tenant database, and be capable of pinpointing the few who were the most urgent cases – before it was too late.
Not only would the data be in short supply, but little of it might relate to the tenant themselves. For example, calls to the tenant might occasionally be answered, but by a family member rather than the tenant. They might say that the tenant is OK, and offer assurances that the tenant would be in touch. But this didn’t amount to evidence that the tenant was still there, or could still be contacted should help be offered.
The challenge was therefore to detect signs of potential home abandonment – at a point when there was evidence that the tenant was still in contact.
For this we needed some powerful data science.
- Finding patterns in the scarce data we had available, and creating data where there was none
The patterns captured involved the changing period of time between contacts, changes in the frequency and amount of rent payments, increasing difficulty accessing the property for inspections, the lexicon used by staff recording each exchange with the tenant, and the frequency of use of key words in emails and texts that indicated increasing levels of stress and inability to engage positively with the Association.
The front line Housing Team, who quickly developed a sixth sense about their tenants, also listed the signs they look for, some of which may have been noted in an inspection report for the estate as a whole, e.g. a build-up of rubbish in gardens.
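As an illustration of how two of the patterns above might be turned into numbers, here is a minimal Python sketch. It is not the Association's actual code: the function names, the keyword list and the sample records are all hypothetical, and the case study does not say how the features were really computed. It assumes each tenant has a list of confirmed contact dates and a list of free-text staff notes.

```python
from datetime import date

# Hypothetical keyword list; the real key words are not given in the case study.
STRESS_KEYWORDS = {"arrears", "unable", "refused", "missed", "eviction"}


def contact_gaps(contact_dates):
    """Days between consecutive confirmed contacts, oldest first."""
    ordered = sorted(contact_dates)
    return [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]


def gap_trend(gaps):
    """Mean of the later half of the gaps minus the mean of the earlier half.

    A positive value means contact is becoming less frequent.
    """
    if len(gaps) < 2:
        return 0.0
    mid = len(gaps) // 2
    early, late = gaps[:mid], gaps[mid:]
    return sum(late) / len(late) - sum(early) / len(early)


def keyword_rate(notes):
    """Share of all words in the notes that appear in STRESS_KEYWORDS."""
    words = [w.strip(".,;:!?").lower() for note in notes for w in note.split()]
    if not words:
        return 0.0
    return sum(w in STRESS_KEYWORDS for w in words) / len(words)


# Made-up records for one tenant.
dates = [date(2023, 1, 1), date(2023, 1, 15), date(2023, 2, 1),
         date(2023, 3, 15), date(2023, 5, 20)]
notes = ["Tenant unable to pay.", "No answer"]

gaps = contact_gaps(dates)
features = {"gap_trend": gap_trend(gaps), "keyword_rate": keyword_rate(notes)}
```

Each tenant ends up with a small dictionary of numeric features. Signs that only exist in staff knowledge or inspection reports, such as rubbish building up in gardens, would need to be recorded as further entries in the same dictionary.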
- Using those patterns to identify key features that could be used in a prediction model
Initially, over 30 key features were identified that could be extracted from the data and that correlated to some degree with the circumstances that were leading people to abandon their homes.
- Finding historical records of tenants who had been listed as ‘abandoning their homes’
In order to build a prediction model, a set of ‘training data’ was required. This was a set of records for past tenants including those who had been confirmed as having abandoned their homes.
The algorithm searched this training dataset for the 30 or so features identified. Using the confirmed outcome for each tenant (they had either left or remained), each feature was then weighted to reflect the strength of its correlation with the outcome.
This revealed the most important features. These included the decline in rent payments, the decline in the frequency of confirmed contact, the increased use of about 20 key words by staff recording each interaction, and the use of specific key words and phrases by family and friends.
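The case study does not name the algorithm used for the weighting. As one simple way to picture "weighting each feature by the strength of its correlation with the outcome", the sketch below computes a Pearson correlation between each feature and a 0/1 outcome (1 = abandoned, 0 = remained) and ranks the features by its absolute value. The function names and the tiny training set are hypothetical, and a production model would more likely use something like logistic regression.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no signal
    return cov / sqrt(var_x * var_y)


def feature_weights(rows, outcomes):
    """Map each feature name to its correlation with the outcome.

    rows: one dict of feature values per past tenant.
    outcomes: 1 if that tenant abandoned the home, 0 if they remained.
    """
    return {name: pearson([row[name] for row in rows], outcomes)
            for name in rows[0]}


def rank_features(weights):
    """Feature names ordered from strongest to weakest correlation."""
    return sorted(weights, key=lambda name: abs(weights[name]), reverse=True)


# Made-up training data: four past tenants, two features.
rows = [
    {"gap_trend": 1.0, "noise": 5.0},
    {"gap_trend": 2.0, "noise": 5.0},
    {"gap_trend": 3.0, "noise": 5.0},
    {"gap_trend": 4.0, "noise": 5.0},
]
outcomes = [0, 0, 1, 1]

weights = feature_weights(rows, outcomes)
ranking = rank_features(weights)
```

In this toy data the feature that rises with abandonment gets a weight close to 1, while the constant feature gets 0 and drops to the bottom of the ranking.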
- Developing a prediction model based on the historical records
The prediction model was applied to the prior 6 months of records for the current population of tenants, and it yielded a significant probability of potential home abandonment for approximately 1% of the population of tenants.
But this figure was way too high given that 20 residents per year, at most, would leave their homes without warning.
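To make the scoring step and the size of the problem concrete, here is a short Python sketch. The logistic scoring function is an assumption (the case study does not say how probabilities were produced), and its names are hypothetical. The arithmetic uses only figures quoted in the case study: about 40k tenants, roughly 1% flagged, and at most 20 unannounced departures per year.

```python
from math import exp


def abandonment_probability(features, weights, bias):
    """Hypothetical logistic score: weighted sum of features squashed to 0..1."""
    z = bias + sum(weights[name] * value for name, value in features.items())
    return 1.0 / (1.0 + exp(-z))


def flag_tenants(scored, threshold):
    """Tenant ids whose probability is at or above the threshold."""
    return [tenant_id for tenant_id, p in scored.items() if p >= threshold]


# Figures quoted in the case study.
TOTAL_TENANTS = 40_000
FLAGGED_SHARE_PERCENT = 1
MAX_ABANDONMENTS_PER_YEAR = 20

flagged = TOTAL_TENANTS * FLAGGED_SHARE_PERCENT // 100
# Even if every real case were among those flagged, this is the best possible hit rate.
best_case_hit_rate = MAX_ABANDONMENTS_PER_YEAR / flagged
false_alarms_at_least = flagged - MAX_ABANDONMENTS_PER_YEAR
```

With these numbers, `flagged` is 400 and `best_case_hit_rate` is 0.05: at best one flagged tenant in twenty would really be about to leave, and at least 380 door-knocks would be false alarms.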
- Refining predictions to the point at which they could become usable
The 1% flagged was too big a number to check out using visits to the property, so the sixth sense of the Housing Team was brought into play. Members of the team were first able to confirm that many of the tenants within the 1% identified indeed bore the hallmarks of potential abandonment. They were then able to pinpoint the top 20 who, in their view, were the most urgent cases and worthy of immediate investigation.
The outcome was a system capable of detecting very faint signals in all the noise associated with supporting and maintaining a large number of social housing tenants.
But it depends upon finding data that would otherwise remain hidden from view – and will remain so unless inquisitive people ask questions about the people out there who are invisible and in need of help.
It also depends upon records for each tenant that are entered and updated the same way each and every time, so that patterns build up in the text which can be extracted and relied upon within the search task.
And finally, it depends upon the wisdom of the people who know the tenant well, who know what to look out for, but who need the help of the digital systems around them for the pointers and steers they need.