How CompassRed’s Dr. Steve Poulin and his team leveraged unstructured data and external data to identify applicants likely to succeed and save $10 million for every percentage point reduction in turnover.

When 92% of the agents hired by a large life insurance company were leaving within their first year, the company realized its approach to retention was not sustainable. The turnover was driving significant, continuous recruitment and hiring costs. Using unstructured data from applicants’ resumes and US Census data about the areas in which they live, a predictive analytics process was developed to identify which applicants were most likely to sell the amount of insurance required to become a successful agent with the company.

Background and Predictive Analytics Objectives

CompassRed’s Dr. Steven Poulin led the modeling effort to understand and reduce the mass exodus that was occurring each year. These were the facts given:

  • The life insurance agency maintains a record for each agent of the amount of insurance they sell.
  • Agents must sell above a specific threshold to remain with the company; an agent who does not sell above this threshold within their first year is terminated.
  • Many agents leave voluntarily within their first year if they are not confident they will sell above the required threshold, or if they decide that they do not want to pursue a career in selling life insurance.

Because 92% of their agents leave before the end of their first year, the company is constantly searching for new applicants through the CareerBuilder and Monster employment websites.  Applicants for the company’s life insurance agent positions must submit their resume, but they are not required to complete an application form.  The company’s human resources department screens the resumes, and the applicants who are hired must go through an orientation and training process.

Dr. Poulin was asked to use predictive modeling to decrease the percentage of agents who leave within a year by identifying the applicants with the highest probability of selling the required level of life insurance. The company’s actuarial department has calculated that, by reducing recruitment and hiring costs, the company will save $10 million with every percentage point reduction in its turnover rate.


The resumes submitted by applicants through CareerBuilder and Monster are matched regularly to the records of new hires. Dr. Poulin and his team applied text analytics to the resume content using SPSS technology. The software extracted words and phrases from the unstructured text and distilled them into categories. Jobs mentioned in the resumes were mapped to standardized job categories. Each applicant’s highest level of education was derived from the mentions of educational achievement in their resume.
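To make the education step concrete, here is a minimal sketch of how a highest education level might be derived from resume text. It is not the SPSS process the team used; the category names, the keyword patterns, and the `highest_education` function are all assumptions made for illustration.

```python
import re

# Hypothetical keyword patterns, ordered from highest to lowest level.
# These are illustrative assumptions, not the categories used in the project.
EDUCATION_PATTERNS = [
    ("doctorate", re.compile(r"\b(?:ph\.?d|doctorate)\b", re.I)),
    ("masters", re.compile(r"\b(?:master(?:['’]s|s)?|mba)\b", re.I)),
    ("bachelors", re.compile(r"\b(?:bachelor(?:['’]s|s)?|b\.s\.|b\.a\.)", re.I)),
    ("associate", re.compile(r"\bassociate(?:['’]s)? degree\b", re.I)),
    ("high_school", re.compile(r"\b(?:high school|ged)\b", re.I)),
]


def highest_education(resume_text):
    """Return the highest education level mentioned in the text, or 'unknown'."""
    # Patterns are checked from highest to lowest, so the first hit wins.
    for level, pattern in EDUCATION_PATTERNS:
        if pattern.search(resume_text):
            return level
    return "unknown"


print(highest_education("B.S. in Finance, 2009. MBA, 2014."))
print(highest_education("10 years of retail sales experience"))
```

Simple keyword matching like this produces false positives (a job title such as "Scrum Master" would be read as a master's degree), which is one reason a dedicated text analytics tool is usually preferred for this kind of categorization.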

Additional predictors were obtained by matching data from the US Census Bureau’s annual American Community Survey to hired applicants by zip code.  These predictors included the average household income, the average number of hours worked, average age, educational profile, cost of housing, and health insurance coverage by each new employee’s zip code.
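The zip code match described above is essentially a lookup join. The sketch below shows one way it could be done; the field names, the `attach_census` function, and all of the values are made up for illustration and do not come from the project or from actual American Community Survey data.

```python
# Hypothetical zip-level aggregates (invented values, for illustration only).
census_by_zip = {
    "19801": {"avg_household_income": 52000, "avg_age": 36.1},
    "19711": {"avg_household_income": 78000, "avg_age": 39.4},
}

# Hypothetical hire records. Zip codes are kept as strings so that
# leading zeros are not lost.
hires = [
    {"hire_id": 1, "zip": "19801"},
    {"hire_id": 2, "zip": "19711"},
    {"hire_id": 3, "zip": "00000"},  # no census record for this zip code
]


def attach_census(hires, census_by_zip):
    """Return a copy of each hire record with zip-level predictors added."""
    enriched = []
    for hire in hires:
        census = census_by_zip.get(hire["zip"], {})
        row = {**hire, **census}
        # Flag unmatched hires so missing predictors can be handled explicitly.
        row["census_matched"] = hire["zip"] in census_by_zip
        enriched.append(row)
    return enriched


for row in attach_census(hires, census_by_zip):
    print(row)
```

Keeping an explicit match flag makes it easy to see how many hires have no census predictors before any model is fit.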

Several predictive models were built with these predictors, including regression-based, rule induction, and machine learning models. All of these models were evaluated for their ability to accurately predict whether a new hire sold the required amount of life insurance during their first year. Using the predicted probability that each employee would sell the required amount of insurance, the models were also evaluated on how well they identified successful employees compared with the number of successful employees that random selection would yield.
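Comparing a model's picks against random selection is commonly summarized as lift. The sketch below assumes that is the kind of comparison meant here; the `lift_at_top` function and the scored records are hypothetical and are not the project's data or results.

```python
def lift_at_top(scored, fraction):
    """Lift for the top `fraction` of hires, ranked by predicted probability.

    `scored` is a list of (predicted_probability, succeeded) pairs, where
    `succeeded` is 1 if the hire sold the required amount and 0 otherwise.
    """
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    top_n = max(1, round(len(ranked) * fraction))
    top_rate = sum(outcome for _, outcome in ranked[:top_n]) / top_n
    # The overall success rate is what random selection achieves on average.
    base_rate = sum(outcome for _, outcome in ranked) / len(ranked)
    if base_rate == 0:
        raise ValueError("lift is undefined when no hire succeeded")
    return top_rate / base_rate


# Invented example: 10 hires, 2 of whom succeeded.
scored = [
    (0.90, 1), (0.80, 0), (0.70, 1), (0.60, 0), (0.50, 0),
    (0.40, 0), (0.30, 0), (0.20, 0), (0.10, 0), (0.05, 0),
]

print(lift_at_top(scored, 0.2))
```

In this invented example the top 20% of hires contains one success out of two, against an overall rate of two in ten, so the model finds successful hires at 2.5 times the rate of random selection.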

Finally, using the best-performing models, the predictive analytics process ranks new applicants in descending order by their probability of successfully selling the required amount of insurance. These ranked lists are presented to the people responsible for selecting applicants to interview, so that they interview the applicants most likely to succeed as life insurance agents.
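The ranking step can be sketched in a few lines. The applicant records, the `p_success` field, and the `rank_applicants` function are hypothetical; in practice the probabilities would come from whichever model performed best.

```python
# Hypothetical scored applicants (invented probabilities, for illustration).
applicants = [
    {"applicant_id": "A1", "p_success": 0.12},
    {"applicant_id": "A2", "p_success": 0.41},
    {"applicant_id": "A3", "p_success": 0.07},
    {"applicant_id": "A4", "p_success": 0.29},
]


def rank_applicants(applicants, top_k=None):
    """Sort applicants by predicted probability of success, highest first."""
    ranked = sorted(applicants, key=lambda a: a["p_success"], reverse=True)
    return ranked if top_k is None else ranked[:top_k]


# Shortlist the two applicants with the highest predicted probability.
for applicant in rank_applicants(applicants, top_k=2):
    print(applicant["applicant_id"], applicant["p_success"])
```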

CompassRed is a data and analytics agency that leverages predictive analytics technologies and big data concepts to help organizations use their data in a better way. Contact us at

Author: Patrick Callahan