Paper
Estimating Car Insurance Premia: a Case Study in High-Dimensional Data Inference
Estimating insurance premia from data is a difficult regression problem for several reasons: the large number of variables, many of which are discrete, and the very peculiar shape of the noise distribution, which is asymmetric with fat tails, with a large majority of zeros and a few unreliable and very large values. We introduce a methodology for estimating insurance premia that has been applied in the car insurance industry. It is based on mixtures of specialized neural networks, in order to reduce the effect of outliers on the estimation. Statistical comparisons with several different alternatives, including decision trees and generalized linear models, show that the proposed method is significantly more precise, making it possible to identify the least and most risky contracts and to reduce the median premium by charging more to the most risky customers.
Authors: Nicolas Chapados · Yoshua Bengio · Pascal Vincent · Joumana Ghosn · Charles Dugas · Ichiro Takeuchi · Linyan Meng