Social network analysis: the key to churn prediction in the prepaid market
Social network analysis is a data mining technique that seeks to explain the behaviour of individuals through their social ties, says Liam Furman, senior analyst at OLRAC SPS.
Conventional churn models, which are designed to predict the likelihood that a customer will terminate their subscription to a service, treat people as isolated individuals. That is clearly not the case for most of us: we are social creatures who are influenced by those around us.
Social network analysis (SNA) is a data mining technique that seeks to explain the behaviour of individuals through their social ties.
Wait, are you talking about Facebook and Twitter?
A common misconception is that SNA involves the analysis of data from popular social networking sites like Facebook and Twitter. Rather, a social network refers to a particular representation of the relationships between individuals that is based on the mathematical field of graph theory.
Figure 1 below illustrates a small social network. In this type of display, individual customers correspond to nodes and links between them indicate social relationships.
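As a minimal sketch of this representation, the graph can be stored as a mapping from each person to the set of people they are linked to. The names follow the Figure 1 caption; the Sarah-John and Grant-Hayley links are assumed purely for illustration, since the caption says only that Doug is indirectly related to John and Hayley. The helper names `build_graph` and `hops_from` are hypothetical.

```python
from collections import deque

# Links in the example network. Only Doug-Sarah and Doug-Grant come from the
# Figure 1 caption; the other two links are assumptions for illustration.
EDGES = [
    ("Doug", "Sarah"),
    ("Doug", "Grant"),
    ("Sarah", "John"),    # assumed
    ("Grant", "Hayley"),  # assumed
]


def build_graph(edges):
    """Build an undirected graph as a dict: node -> set of neighbouring nodes."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def hops_from(graph, start):
    """Breadth-first search: number of links separating start from each node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


graph = build_graph(EDGES)
hops = hops_from(graph, "Doug")
direct = sorted(name for name, d in hops.items() if d == 1)
indirect = sorted(name for name, d in hops.items() if d > 1)
```

Under these assumed links, `direct` holds Doug's friends and `indirect` holds the people he can reach only through someone else.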
SNA is not new; it has been used in the social sciences for more than 80 years. However, only fairly recently have businesses woken up to the significant potential gains that can be realised through its application. The telecommunications sector is probably the most mature in its use of SNA, for two key reasons:
* Churn problem: As mobile penetration increases and the market approaches saturation, the emphasis of service providers is shifting from customer acquisition to retention, because the only way to grow the customer base is to poach customers from rival networks. Increasingly, telecommunications networks rely on predictive models to identify at-risk subscribers and target them with special offers before they move to the competition. SNA is especially useful in the prepaid segment, where little demographic data exists for conventional data mining techniques.
* Availability of data: Telecommunications networks have access to vast quantities of high-quality data detailing whom their customers contact and who contacts them, which allows rich and comprehensive social network graphs to be built.
But how does SNA work?
First, a social network graph is built from call detail record (CDR) data. These data sets typically contain detailed information about the voice calls, SMS messages and other services that subscribers use. Churn predictions are based on a word-of-mouth premise, whereby churners influence their friends to churn, who in turn spread that influence to others, and so on. Therefore, the next step is to simulate the spreading influence of known past churners through the social network. This gives an estimate of the amount of "churn influence" that each customer has received. Those with higher "churn influence" are more likely to discontinue their subscription and, consequently, should be priority targets for retention efforts.
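The two steps can be sketched in a few lines. This is an illustration under stated assumptions, not the method used in any particular production system: the CDR rows are invented, tie strength is taken to be total talk time, and the diffusion is a simple spreading-activation scheme in which each node passes on a fixed share of the influence it has just received. The names `build_weighted_graph`, `spread_churn_influence`, `spreading_factor` and `rounds` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical CDR rows: (caller, callee, call_seconds).
# Real CDRs carry many more fields (timestamps, SMS counts, cell IDs, ...).
CDRS = [
    ("A", "B", 600),
    ("B", "A", 300),
    ("B", "C", 100),
    ("C", "D", 400),
]


def build_weighted_graph(cdrs):
    """Undirected graph; tie strength = total talk time between two subscribers."""
    graph = defaultdict(lambda: defaultdict(float))
    for caller, callee, seconds in cdrs:
        graph[caller][callee] += seconds
        graph[callee][caller] += seconds
    return graph


def spread_churn_influence(graph, churners, spreading_factor=0.5, rounds=3):
    """Diffuse influence outwards from known churners.

    In each round, every node that received influence in the previous round
    passes spreading_factor of it to its neighbours, split in proportion to
    tie strength. Returns the total influence received by each non-churner.
    """
    received = defaultdict(float)
    active = {churner: 1.0 for churner in churners}  # each churner starts with 1 unit
    for _ in range(rounds):
        next_active = defaultdict(float)
        for node, energy in active.items():
            total_strength = sum(graph[node].values())
            if total_strength == 0:
                continue  # isolated node: nothing to pass on
            for neighbour, strength in graph[node].items():
                share = spreading_factor * energy * strength / total_strength
                received[neighbour] += share
                next_active[neighbour] += share
        active = next_active
    return {n: score for n, score in received.items() if n not in churners}


graph = build_weighted_graph(CDRS)
scores = spread_churn_influence(graph, churners={"A"})
# Subscribers ordered from most to least churn influence received.
ranking = sorted(scores, key=scores.get, reverse=True)
```

The subscribers at the top of `ranking` would be the first candidates for a retention offer. The spreading factor and number of rounds are tuning choices; other diffusion models could be substituted without changing the overall approach.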
Figure 1: An example of a small social network represented as a graph. Doug is friends with Sarah and Grant, but he is only indirectly related to John and Hayley
Before going much further, it is worth noting that SNA does not preclude traditional churn modelling. The two approaches are, in fact, complementary: outputs from SNA should be combined with information on the past behaviour and demographics of customers. This produces more accurate predictive models; academic literature on the subject reports performance improvements of a factor of 10 or more over conventional churn modelling techniques.
There are some drawbacks to using SNA, most notably the computing requirements. CDR data typically runs to hundreds of gigabytes, and existing infrastructure may be insufficient to handle such volumes effectively. However, the emergence of new technologies that can be scaled to meet the demands of the industry means this is becoming less of an issue every day, and SNA has now become a viable production solution.
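One simple way to picture the combination is to treat the SNA output as one more column alongside the conventional features and feed everything into a single scoring model. The sketch below uses a logistic score for illustration only; the feature names, weights and bias are invented, and in practice the weights would be fitted to historical churn labels rather than set by hand.

```python
import math


def feature_row(customer_id, behaviour, influence):
    """Join a customer's conventional features with their SNA-derived score."""
    row = dict(behaviour.get(customer_id, {}))
    row["churn_influence"] = influence.get(customer_id, 0.0)
    return row


def churn_probability(row, weights, bias):
    """Logistic score: sigmoid of a weighted sum of the features."""
    z = bias + sum(w * row.get(name, 0.0) for name, w in weights.items())
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical inputs: two customers with identical behaviour but different
# amounts of churn influence received from their social network.
behaviour = {
    "B": {"days_since_last_recharge": 20.0, "avg_monthly_spend": 50.0},
    "C": {"days_since_last_recharge": 20.0, "avg_monthly_spend": 50.0},
}
influence = {"B": 0.6, "C": 0.0}

# Hand-set weights for illustration; a real model would learn these.
weights = {
    "days_since_last_recharge": 0.05,
    "avg_monthly_spend": -0.01,
    "churn_influence": 2.0,
}
bias = -1.0

p_b = churn_probability(feature_row("B", behaviour, influence), weights, bias)
p_c = churn_probability(feature_row("C", behaviour, influence), weights, bias)
```

A conventional model would score these two customers identically; with the extra column, the customer who is closer to past churners is ranked as the higher risk.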
What else is SNA good for?
Although this article has focused on churn prediction, there are many more applications for SNA, including:
* Customer value management: A number of key business processes rely on estimates of the future value of a customer. For example, it is worth offering reduced rates to a potential churner only if the cost to the company is less than the expected future spend of the at-risk subscriber. However, it is not enough simply to consider historical spend, because a low-value customer may have influence over a high-value social network. SNA provides a means of incorporating a customer's relationships into their value calculation.
* Social network fingerprinting: In the prepaid segment it is particularly difficult to measure revolving churners. These are subscribers who cancel a product and then replace it with a new one from the same network. The problem is that the personal details of prepaid customers are often not collected, making it almost impossible to track them. SNA can be used to map "new" customers into existing social networks that one or more individuals have previously left. Once identified, these revolving churners can be treated differently and monitored.
* Marketing campaign improvement: If resources are limited, it is often beneficial to target direct marketing to influencers only. These are individuals with a large number of strong social ties. It has been shown that influencers promote products to their social networks, improving the response to marketing campaigns. SNA enables businesses to identify these individuals.
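Each of the three applications above reduces to a small calculation on the graph. The sketch below shows one possible form of each and rests on several assumptions: a customer's network value is their own expected spend plus a fixed share of their neighbours' spend, a fingerprint match is measured by the Jaccard overlap between contact sets, and influence is approximated by the sum of a customer's tie strengths. All function names, the `influence_share` parameter and the `threshold` value are hypothetical.

```python
def worth_retention_offer(offer_cost, own_future_spend, neighbour_future_spend,
                          influence_share=0.1):
    """Customer value: make the offer only if it costs less than the
    subscriber's expected value, including an assumed share of the spend
    of the neighbours they influence."""
    network_value = own_future_spend + influence_share * sum(neighbour_future_spend)
    return offer_cost < network_value


def jaccard(a, b):
    """Overlap of two contact sets: size of intersection over size of union."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_revolving_churner(new_contacts, departed_contacts, threshold=0.6):
    """Fingerprinting: return the departed subscriber whose contact set best
    matches the new subscriber's, or None if no match reaches the threshold."""
    best_id, best_score = None, 0.0
    for subscriber_id, contacts in departed_contacts.items():
        score = jaccard(new_contacts, contacts)
        if score > best_score:
            best_id, best_score = subscriber_id, score
    return best_id if best_score >= threshold else None


def top_influencers(graph, k=1):
    """Campaign targeting: rank customers by the sum of their tie strengths.

    graph maps each customer to a dict of neighbour -> tie strength."""
    strength = {node: sum(ties.values()) for node, ties in graph.items()}
    return sorted(strength, key=strength.get, reverse=True)[:k]


# Hypothetical usage.
offer_ok = worth_retention_offer(30, 20, [100, 200])
match = match_revolving_churner(
    {"x", "y", "z"},
    {"old1": {"x", "y", "z", "q"}, "old2": {"m"}},
)
targets = top_influencers(
    {"A": {"B": 900}, "B": {"A": 900, "C": 100}, "C": {"B": 100}},
    k=1,
)
```

In the first call, a subscriber whose own expected spend would not justify the offer still qualifies once the assumed value of their neighbours is counted, which is the point made in the customer value bullet.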
In summary, SNA has emerged as an important technique for studying complex, real-world networks. Its most developed business application is in the telecommunications sector, where its use can dramatically improve the performance of churn prediction models. This is especially true in the prepaid segment, which ordinarily suffers from a lack of customer demographic data. With further applications in marketing and customer segmentation, SNA is set to become a common term in the world of predictive analytics. OLRAC SPS strives to get maximal value out of your data, and it believes that social networks are too important to ignore.