Projects

2024

Student Thesis Title Supervisors Industry Partner Download
Denis Schaub Semester Random Durations in Insurance Pricing Mario Wüthrich
Abstract: This semester project follows the ideas of Lindskog-Lindholm-Palmquist (Scandinavian Actuarial Journal 2023). It considers non-life insurance contracts that may be terminated early, an issue which is tackled with a change of measure. Additionally, it proposes a distribution-free, locally unbiased predictor built on a previously chosen (potentially biased) predictor. It offers insight into partitioning methods for the covariate space and into how to handle the variable duration problem. This leads to an automatic, data-driven tariffing method, where the number and size of the tariffs correspond to the partitioning. The procedure is illustrated both with simulated data and with data from the freMTPL2 data set.

2023

Student Thesis Title Supervisors Industry Partner Download
Luca Aschmann MSc Meta-Labeling Architectures for Return Classification Patrick Cheridito
Jacques Joubert
Abu Dhabi Investment Authority
Abstract: Lopez de Prado (2018) introduced (Single) Meta-Labeling, an approach in which a secondary model is trained to use a primary exogenous model in order to extract predicted probabilities for dynamic position sizing; it still remains widely unexamined in the literature. We introduce two novel architectures, Relative and Dual Meta-Labeling, which in addition allow for a dynamic switching mechanism between two primary models. The goal of this thesis is to formalize the architectures and examine whether Meta-Labeling can increase the performance of an exogenous primary model. We utilize slow and fast time-series momentum strategies as primary models as well as Random Forests, XGBoost and LightGBM as secondary models and backtest our experiment on monthly S&P 500 data. We find that Relative Meta-Labeling outperforms all baselines, mainly uses volatility-based features for adjusting its switching dynamics and allows for valuable inference about market regimes. The Single Meta-Labeling architecture likewise outperforms all primary models, yet suffers from long subsequent periods of inactivity. Dual Meta-Labeling achieves the highest risk-adjusted returns and aids time-series momentum strategies in responding to momentum turning points.
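For readers unfamiliar with the mechanics, a minimal single meta-labeling sketch in Python is given below. The toy returns, feature choices and the random forest secondary model are illustrative assumptions, not the thesis setup.

# Single meta-labeling sketch (toy data; feature and model choices are assumptions).
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
ret = pd.Series(rng.normal(0.005, 0.04, 600))            # toy monthly returns

primary = np.sign(ret.rolling(12).mean()).shift(1)        # slow momentum signal (primary model)
meta_y = ((primary * ret) > 0).astype(int)                # 1 if the primary call was profitable

features = pd.DataFrame({
    "vol": ret.rolling(12).std().shift(1),                # volatility-based feature
    "mom_fast": ret.rolling(3).mean().shift(1),
    "mom_slow": ret.rolling(12).mean().shift(1),
}).assign(signal=primary)

data = pd.concat([features, meta_y.rename("y")], axis=1).dropna()
split = int(len(data) * 0.7)
train, test = data.iloc[:split], data.iloc[split:]

clf = RandomForestClassifier(n_estimators=300, min_samples_leaf=5, random_state=0)
clf.fit(train.drop(columns="y"), train["y"])

p = clf.predict_proba(test.drop(columns="y"))[:, 1]        # confidence in the primary call
position = test["signal"] * p                              # probability-weighted position size
strategy_ret = position * ret.loc[test.index]
print("annualized mean return:", 12 * strategy_ret.mean())
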
Nico Ehrler MSc Individual Claims Reserving with Machine Learning Methods Mario Wüthrich
Frank Ettwein
Baloise
Abstract: For all accidents happening during the insurance period of an insurance contract, the insurance company needs to cover the corresponding losses. The cash flows for these future losses therefore have to be estimated so that an insurance company can determine the size of its liabilities. Usually, aggregated methods, such as modifications of the Chain-Ladder algorithm, are used for this purpose. This thesis determines the size of the liabilities through predictions on each individual claim instead of on an aggregated level. These predictions are made with gradient boosting machines, namely the LightGBM package in R. The aggregated Chain-Ladder algorithm is then compared with the individual predictions on a claim as well as on a portfolio basis.
Egemen Erdogdu MSc Analysis of the Distribution of Corporate Defaults with Bayesian Methods Patrick Cheridito
Kai Schnee
Gabriel Visentin
Abstract: This thesis proposes Bayesian approaches for the model validation of three different credit risk models: a Gaussian one-factor model, a reduced-form model, and a Restricted Boltzmann Machine (RBM) credit risk model. The primary objective is to evaluate the calibration of these models using Bayesian techniques, which provide an intuitive and visual way to validate models. The proposed methodologies are implemented on a real default data set of US speculative-grade borrowers.
Tatjana Mäder Semester Variable Annuity Hedging Patrick Cheridito
Abstract:
Tatjana Mäder MSc A Reinsurance Pricing Model for Nuclear Pools Mario Wüthrich
Philipp Arbenz
SCOR
Abstract: The current discussion about how and with which fuels more electricity can best be generated also concerns the use of nuclear energy. Despite past negative headlines, the nuclear power sector continues to grow, and as a result it is becoming increasingly interesting for reinsurers to write nuclear risk business. In this context, reinsurance companies participate in nuclear pools, as the loss of a nuclear accident exceeds the capacity of a single insurer. This thesis deals with the pricing of such a pool. The goal is to derive a model that can quantify the expected total loss amount in the event of a nuclear accident. For the derivation of the model, data on nuclear accidents as well as characteristics of all reactors operating worldwide have been collected. Based on this, an attritional frequency-severity model was developed. The main focus lay on estimating the frequency of a large-loss nuclear accident by means of influencing factors. In order to incorporate such key drivers, a generalized linear model was used. As the final model, a Poisson log-linear model was chosen, which depends on the exposure, i.e., the global number of operating reactors, as well as a time factor given by the number of calendar years since the first nuclear reactor was commissioned. In a further analysis, the historical loss sizes were examined and a severity distribution was fitted. Due to several past extreme losses, the focus lay on heavy-tailed distributions. It was found that a log-gamma distribution provides the best fit, resulting in an infinite-mean model. The final result is a pricing model for nuclear pools, which models losses resulting from nuclear accidents using a compound Poisson-log-gamma model for the large losses.
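To illustrate the frequency part, the sketch below fits a Poisson log-linear GLM with the number of operating reactors as an exposure offset, using statsmodels. The data are simulated and the covariates are assumptions; they are not the data collected for the thesis.

# Poisson log-linear frequency model with exposure offset (simulated data).
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(1)
years = np.arange(1960, 2021)
df = pd.DataFrame({
    "years_since_first_reactor": years - 1954,             # assumed time covariate
    "reactors": np.clip((years - 1955) * 8, 1, 440),       # assumed global exposure
})
true_rate = np.exp(-7.0 - 0.01 * df["years_since_first_reactor"])
df["accidents"] = rng.poisson(true_rate * df["reactors"])

X = sm.add_constant(df[["years_since_first_reactor"]])
model = sm.GLM(df["accidents"], X,
               family=sm.families.Poisson(),
               offset=np.log(df["reactors"]))               # exposure enters as an offset
fit = model.fit()
print(fit.summary())

# expected number of accidents in a hypothetical year with 440 operating reactors
new = sm.add_constant(pd.DataFrame({"years_since_first_reactor": [70]}), has_constant="add")
print(fit.predict(new, offset=np.log([440])))
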
Mohamed Fadhel Omar MSc Optimising the Allocation of Time Supplements in Railways Timetables: A Heuristic Approach Dan Burkolter
Burkhard Franke
Francesco Corman
Patrick Cheridito
trafIT solutions GmbH
Abstract:
Christopher Panizzolo Semester A Combined Approach of Multidimensional Lee-Carter Model and Hidden Markov Model Mario Wüthrich
Hélène Schernberg
Abstract: This semester paper describes a multi-dimensional extension of the Lee-Carter mortality model in which the mortality indices are driven by a hidden Markov process. This allows capturing transitions between mortality regimes characterized by different trends and volatilities. We build on the paper “Multidimensional Lee–Carter model with switching mortality processes” by Hainaut (2012), which describes a two-dimensional Lee-Carter model driven by a two-state Markov process. We extend this paper in two ways. First, we allow for as many dimensions of the temporal indices and as many states of the Markov process as desired by the user (e.g., based on predictive power). Second, we rely on an improved method for identifying the Markov chain and calibrating the model, namely the Baum-Welch algorithm instead of a method based on the Hamilton filter.
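The classical single-regime Lee-Carter fit that this extension builds on can be sketched with a singular value decomposition; the mortality surface below is simulated purely for illustration.

# Classical Lee-Carter fit via SVD on a simulated (ages x years) log-mortality matrix.
import numpy as np

rng = np.random.default_rng(2)
ages, years = 90, 60
true_ax = np.linspace(-8.0, -1.0, ages)
true_bx = np.full(ages, 1.0 / ages)
true_kt = np.cumsum(rng.normal(-0.5, 0.3, years))
log_m = true_ax[:, None] + np.outer(true_bx, true_kt) + rng.normal(0, 0.02, (ages, years))

ax = log_m.mean(axis=1)                                    # static age profile a_x
U, s, Vt = np.linalg.svd(log_m - ax[:, None], full_matrices=False)
bx = U[:, 0] / U[:, 0].sum()                               # identification: sum(bx) = 1
kt = s[0] * Vt[0] * U[:, 0].sum()                          # time index k_t

drift = np.diff(kt).mean()                                 # random walk with drift for k_t
kt_forecast = kt[-1] + drift * np.arange(1, 11)
log_m_forecast = ax[:, None] + np.outer(bx, kt_forecast)   # 10-year mortality forecast
print(log_m_forecast.shape)
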
Panagiotis Papakonstantinou MSc Hedging Options with Deep Learning Patrick Cheridito
Stephan Eckstein
Abstract: The incredible speed improvement in computers has been a defining feature of the 21st century. This progress has allowed us to use complex math methods to solve tough problems. One area that has become really popular in the last decade is called deep learning. Simply put, deep learning is a way to teach computers to do math tasks without telling them exactly how to do it. They learn by trying things out, just like humans do through trial and error. This is made possible by neural networks, which are like our brain’s digital version. These networks have interconnected parts, like brain cells, that respond when given data. The goal of training them is to make them good at responding correctly when they see lots of examples, so they can make accurate conclusions or predictions.
This thesis focuses on the use of deep learning in the field of financial modeling, with a specific emphasis on pricing and risk management of financial securities. The particular security type under analysis is called an option, which is a financial contract linked to an underlying asset like a stock or bond. Options grant the buyer the choice to buy or sell the asset at a predetermined price and time. They are known as financial derivatives since their value is derived from the underlying asset. The task of determining a fair price for this derivative is closely connected to reducing its risk through hedging.
The primary aim of this thesis is to reduce the risk of a portfolio containing an option by employing hedging with a neural network. The objective is to develop an agent that can automate the tasks performed by a human trader using quantitative data, following the principles of machine learning. Initially, we apply this approach in a straightforward setting, such as the Black-Scholes model, and subsequently, we evaluate its performance in a more realistic scenario, opting for the Heston model.
The results of our study reveal that the deep hedging approach, when applied to simulations based on the Black-Scholes model, generates a hedging strategy that is comparable and slightly superior to the theoretical optimal hedging strategy for Black-Scholes. Furthermore, in the Heston model, the resulting hedging strategy exhibits performance similar to the traditional hedging method derived from the analytical pricing model.
Adrien Perroud Semester Prediction Intervals with Generalized Linear Models Mario Wüthrich
Abstract: Prediction intervals estimate, with a certain probability, ranges for future values based on observed values. In this paper, we present different methods to construct such intervals. The first part focuses on presenting the algorithms. In the second part, we apply the prediction intervals to simulated data with the help of generalized linear models. Finally, we analyse their performance and computational burden. The first two methods are based on a popular statistical tool, bootstrapping. We proceed by resampling the original data many times and fitting a GLM to each resample. The resulting regressors are then used to construct the prediction interval. Another way of constructing these intervals is conformal inference. This relatively new concept is based on testing whether a set of candidate values can be accepted in the prediction interval. More precisely, we check how conformal a candidate is using a specified distance function. The resulting distance, called the conformity score, is then compared to a set of distances for some data, and if the new distance is small enough, the candidate is conformal. The goal is to find such sets of distances to compare our candidate values with. We present four different methods to construct such sets. After presenting the methods, we test the prediction intervals on a set of simulated data. We consider a dataset of Swedish motorcycle claims and resample the data with replacement. We fit a GLM to the resampled data and find the average point prediction for every data point. The response variables are then generated by a gamma distribution with the average point prediction as the scale parameter and shape parameter equal to 1. The intervals are computed on this new set and analysed.
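A minimal sketch of the bootstrap variant is shown below, with a gamma GLM with log link fitted to simulated data; the covariates, sample size and the point of prediction are assumptions.

# Bootstrap prediction interval for a gamma GLM (simulated data, assumed covariates).
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 500
X = pd.DataFrame({"x1": rng.normal(size=n), "x2": rng.uniform(size=n)})
mu = np.exp(1.0 + 0.5 * X["x1"] - 0.3 * X["x2"])
y = pd.Series(rng.gamma(shape=1.0, scale=mu))              # gamma responses, shape 1

exog = sm.add_constant(X)
x_new = sm.add_constant(pd.DataFrame({"x1": [0.2], "x2": [0.5]}), has_constant="add")

B, sims = 500, []
for _ in range(B):
    idx = rng.integers(0, n, n)                            # resample with replacement
    fit_b = sm.GLM(y.iloc[idx], exog.iloc[idx],
                   family=sm.families.Gamma(link=sm.families.links.Log())).fit()
    mu_b = np.asarray(fit_b.predict(x_new))[0]             # point prediction at x_new
    sims.append(rng.gamma(shape=1.0, scale=mu_b))          # simulate a new response

lower, upper = np.percentile(sims, [2.5, 97.5])
print(f"95% bootstrap prediction interval at x_new: [{lower:.2f}, {upper:.2f}]")
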
Florian Rossmannek PhD The Curse of Dimensionality and Gradient-based Training of Neural Networks: Shrinking the Gap between Theory and Applications Patrick Cheridito
Arnulf Jentzen
DOI
Abstract: Neural networks have gained widespread attention due to their remarkable performance in various applications. Two aspects are particularly striking: on the one hand, neural networks seem to enjoy approximation capacities superior to those of classical methods. On the other hand, neural networks are trained successfully with gradient-based algorithms despite the training task being a highly nonconvex optimization problem. This thesis advances the theory behind these two phenomena.
On the aspect of approximation, we develop a framework for showing that neural networks can break the so-called curse of dimensionality in different high-dimensional approximation problems, meaning that the complexity of the neural networks involved scales at most polynomially in the dimension. Our approach is based on the notion of a catalog network, which is a generalization of a feed-forward neural network in which the nonlinear activation functions can vary from layer to layer as long as they are chosen from a predefined catalog of functions. As such, catalog networks constitute a rich family of continuous functions. We show that, under appropriate conditions on the catalog, these catalog networks can efficiently be approximated with rectified linear unit (ReLU)-type networks and provide precise estimates of the number of parameters needed for a given approximation accuracy. As special cases of the general results, we obtain different classes of functions that can be approximated with ReLU networks without the curse of dimensionality.
On the aspect of optimization, we investigate the interplay between neural networks and gradient-based training algorithms by studying the loss surface. On the one hand, we discover an obstruction to successful learning due to an unfortunate interplay between the architecture of the network and the initialization of the algorithm. More precisely, we demonstrate that stochastic gradient descent fails to converge for ReLU networks if their depth is much larger than their width and the number of random initializations does not increase to infinity fast enough. On the other hand, we establish positive results by conducting a landscape analysis and applying dynamical systems theory. These positive results deal with the landscape of the true loss of neural networks with one hidden layer and ReLU, leaky ReLU, or quadratic activation. In all three cases, we provide a complete classification of the critical points in the case where the target function is affine and one-dimensional. Next, we prove a new variant of a dynamical systems result, a center-stable manifold theorem, in which we relax some of the regularity requirements usually imposed. We verify that ReLU networks with one hidden layer fit into the new framework. Building on our classification of critical points, we deduce that gradient descent avoids most saddle points. We proceed to prove convergence to global minima if the initialization is sufficiently good, which is expressed by an explicit threshold on the limiting loss.
Matthias Schmickler MSc Industrial Property Insurance Rate Making Mario Wüthrich
Adrian Kolly
Adrian Lüssy
Swiss Re
Abstract: Nowadays, data analytics is common practice in the pricing of non-life insurance products. In the reinsurance industry, however, it is not easy to use a fully data-driven approach to pricing. Therefore, many prices are based on a combination of data and expert judgement. This is mainly due to the heterogeneity of the data and the fact that little data from direct insurers ends up in reinsurance. In this thesis, we explore what a data-driven approach might look like. The first part is devoted to the theory of pricing contracts in the reinsurance industry. In the second part, we apply the methods to a dataset provided by Swiss Re. The results show that the approach is promising but highly depends on the data quality.
Trevor Winstral MSc Network Statistics and Systemic Risk in Financial Networks Stefano Battiston
Patrick Cheridito
Abstract: The subsequent work begins with a literature review of modern research in the field of systemic risk in financial networks. This covers the statistical methods used in the evaluation of the topologies of empirical financial networks, the framework used to model financial contagion (NEVA), and the results found from various specifications of NEVA applied to theoretical and empirical financial systems. Next, novel Bayesian and frequentist statistical methodologies are proposed for improved evaluation of empirical network topologies. These methods allow for understanding the degree to which empirical networks fit idealized Core-Periphery networks. Finally, a method to introduce fire sales, illiquidity, and leverage requirements into the NEVA framework is presented, adding parameters that were previously unaccounted for in NEVA. The dependence of the size of financial crash cascades on the aforementioned parameters, as well as on the deviation of the topology from an ideal Core-Periphery network, is then studied.
Haoyun Ying Semester An Analysis of the Electricity Infrastructure in South Africa Mario Wüthrich
Suchita Srinivasan
Abstract: This project proposes a mathematical optimization approach to determine the number and locations of power plants in South Africa. The objective is to minimize the total operation and transmission costs while meeting the growing demand for electricity. We formulate the problem as an optimization model with fixed activation and variable transportation costs. We investigate two clustering methods: k-medoid clustering based on the sum of distances and the computationally easier k-means clustering based on the sum of squared distances. We determine the optimal number of power plants by comparing total costs and validate the results from k-means clustering through silhouette scores. We present the numerical results of our approach and compare them to the current power plant locations in South Africa. Our approach can provide decision makers with valuable insights for optimizing the country’s electricity grid system.
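The k-means and silhouette-score step can be sketched as follows with scikit-learn; the coordinates below are simulated stand-ins for the demand points, not the South African data used in the project.

# Cluster demand points into candidate plant locations and score the number of clusters.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(4)
demand_points = rng.uniform(low=[22.0, -35.0], high=[32.0, -22.0], size=(300, 2))

for k in range(2, 8):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(demand_points)
    score = silhouette_score(demand_points, km.labels_)
    print(f"k = {k}: silhouette = {score:.3f}")
    # km.cluster_centers_ would be the candidate plant locations for this k
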
Philipp Zimmermann PhD Inverse Problems for Variable Coefficient Nonlocal Operators Patrick Cheridito
Joaquim Serra
DOI
Abstract: The purpose of this PhD thesis is the study of the class of nonlocal inverse problems, which can be regarded as nonlocal generalizations of the famous Calderón problem. This thesis is divided into two parts. In Part 1, we consider linear nonlocal inverse problems and in Part 2 nonlinear nonlocal inverse problems.

2022

Student Thesis Title Supervisors Industry Partner Download
Davide Apolloni MSc Significance Tests for Neural Networks Mario Wüthrich
Abstract: We present a pivotal test to assess the statistical significance of feature variables in a single-layer sigmoid neural network. We provide an asymptotic estimator of the test statistic and derive its asymptotic distribution under the corresponding null hypothesis. This test allows for variable selection within single-layer sigmoid neural networks. The key tool to prove such results is sieve estimation, where the complexity of the estimator increases with the size of the data. Universality theorems say that this allows us to precisely reconstruct the true regression function, which then is the basis for the variable significance test.
Rayen Ayari BSc Mortgage Default Prediction Patrick Cheridito
Abstract: The mortgage probability of default is a key quantity in credit portfolio risk management which quantifies the likelihood of a client not paying back their debt and interest. To my knowledge, prior literature has relied either on mathematical models or on machine learning models to quantify this key concept without implementing a framework explaining the impact of every variable (ML feature impact). Credit risk assessment models have mostly been studied using machine learning techniques, while the literature on deep learning remains limited. In this paper, we implement state-of-the-art ensemble machine learning models that include transition matrices in computing the PD (probability of default). Specifically, we focus on implementing a credit scoring model composed of an ensemble of convolutional neural networks and conditional transition matrices applied to more than 7.2 million loans and 88.6 million monthly records offered by the US-based mortgage loan company Freddie Mac. We discuss the results of this model compared to Random Forests and Gradient Boosted Trees (the two ML models most used by banks and rating agencies). We also discuss the explainability of these three models using SHAP values to get a global explanation of the impact of every feature on the PD.
Berno Binkert MSc Optimal Liquidity Pool Graphs Patrick Cheridito
Roger Wattenhofer
Abstract:
Richard Breitschopf MSc Deep Reinforcement Learning for Optimal Trade Execution Patrick Cheridito
Moritz Weiss
Abstract:
Niels Cariou-Kotlarek Semester Hawkes Iterative Time-dependent Estimation of Parameters Patrick Cheridito
Abstract:
Niels Cariou-Kotlarek MSc Jump-Diffusion Models for Financial Bubbles Modelling: A Multi-scale Type-II Bubble Model with Self-Excited Crashes Patrick Cheridito
Didier Sornette
Roger Wehrli
Abstract:
Federica Casanova BSc Mortality Modeling using Frailty Methods Mario Wüthrich
Abstract: Standard methods for calculating mortality rates consider a homogeneous population, which tends to overestimate mortality. This thesis aims to improve mortality rate estimation, so that future mortality can also be better predicted. To this end, we add a frailty parameter, which leads to considering a heterogeneous population, thus not ignoring differences in longevity between individuals. We will consider two different baseline mortalities and different distributions for the frailty variable, comparing estimated and forecasted mortalities with and without frailty, seeing which one better predicts the actual data. This way, we can have a more accurate estimation and forecast, especially for the elderly.
Walid Chatt Semester Machine Learning for Fraud Detection: An Initial Approach to Tackle the Issues of Class Imbalance and the Shortage of Labels Helmut Bölcskei
Patrick Cheridito
Silvia Mongellu
UBS
Abstract:
Chantal Emmenegger MSc Elicitability and Consistency in Statistical Estimation Mario Wüthrich
Abstract: Elicitability and consistency are central concepts in semiparametric statistical estimation, specifically when considering M-estimation through loss functions; the corresponding counterpart for Z-estimation and identification functions is called identifiability. For estimation purposes one relates these terms to consistent estimation in order to avoid a bias. In the multivariate setting we make note of Osband’s principle, which associates M- with Z-estimators and vice versa under certain conditions. Because these restrictive conditions have to be met, a gap arises between loss and identification functions, as there are far more identification functions than loss functions, which through efficient estimation may eventually lead to an efficiency gap. To illustrate this gap we study the double quantile model in a theoretical and a numerical setting and complement it with the pair of the first two moments and the (mean, variance) pair, for which no gap arises.
Selim Gatti MSc Optimal Insurance Policies under Ambiguity Aversion Patrick Cheridito
Abstract:
Arianna Guadagnini BSc Evaluating the Tail Risk of Multivariate Aggregate Losses Mario Wüthrich
Abstract: This thesis examines tail risk measures for several widely used multivariate aggregate loss models where the claim counts are dependent while the claim sizes are mutually independent and independent of the claim counts. First, we derive formulas for moment transforms of the multivariate aggregate losses, showing how they relate to moment transforms of the claim counts and claim sizes. Using these formulas, we evaluate popular risk measures, such as the multivariate tail conditional expectation (MTCE) and the multivariate tail covariance (MTCov) of aggregated losses. Moreover, we determine capital allocations.
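A Monte Carlo sketch of one common MTCE convention (conditioning on every component exceeding its marginal quantile) is given below; the common-shock Poisson counts and lognormal severities are illustrative assumptions, not the models analysed in the thesis.

# Monte Carlo estimate of a multivariate tail conditional expectation (MTCE).
import numpy as np

rng = np.random.default_rng(5)
n_sim, q = 200_000, 0.95

common = rng.poisson(1.0, n_sim)                           # shared shock -> dependent counts
N1 = common + rng.poisson(2.0, n_sim)
N2 = common + rng.poisson(3.0, n_sim)

def aggregate(counts, mu, sigma):
    # sum of counts[i] iid lognormal severities for each simulation i
    sev = rng.lognormal(mu, sigma, counts.sum())
    out = np.zeros(len(counts))
    np.add.at(out, np.repeat(np.arange(len(counts)), counts), sev)
    return out

S1 = aggregate(N1, mu=0.0, sigma=1.0)
S2 = aggregate(N2, mu=0.5, sigma=0.8)

v1, v2 = np.quantile(S1, q), np.quantile(S2, q)
tail = (S1 > v1) & (S2 > v2)                               # joint tail event
mtce = np.array([S1[tail].mean(), S2[tail].mean()])
print("joint tail probability:", tail.mean())
print("MTCE estimate:", mtce)
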
André Emanuel Jacob BSc Analysis of Peer-to-peer Insurance Mario Wüthrich
Abstract: We consider a peer-to-peer (P2P) insurance scheme where the higher layer of the total risk is covered by a (re-)insurer and the global retention level grows proportionally with respect to the total number of participants. The retained losses are then distributed among the participants according to the conditional mean risk sharing rule. The individual retention levels are analyzed as the number of participants increases. The results depend on the proportional rate of increase of the global retention level, as well as on the existence of the Esscher transform of the individual losses brought into the pool.
Yan-Xing Lan MSc Multi-Dimensional Exponential Family for Claim Size Modeling Mario Wüthrich
Simon Rentzman
AXA Winterthur
Abstract: Generalized linear models (GLMs) have been and still are important tools for statistical modeling. They are a cornerstone in the pricing of insurance contracts. However, the modeling of the dispersion parameter in the framework of GLMs is often neglected and only approximated outside of the modeling framework. Therefore, we investigate double generalized linear models (DGLMs) and also the direct approach of using the maximum likelihood estimator for the parameters of the exponential family (EF), which both also model and estimate the dispersion parameter. In the first part of this thesis we describe the theory behind DGLMs and parameter estimation in the EF, where we establish these two approaches; in the case of the gamma and inverse Gaussian distributions they actually give the same results. In the second part we model claim sizes of insurance data using gamma and inverse Gaussian distributions. We conclude that, depending on the data, DGLMs can improve the modeling.
Yining Li Semester The Application of Quantum Computing in Financial Portfolio Optimization Problems Patrick Cheridito
Stephan Eckstein
Abstract:
Andrin Melliger BSc Modeling Mortality with the Lee-Carter Model Mario Wüthrich
Michael Koller
Amlin Re
Abstract: As demographic data reveals, mortality rates vary between different subgroups of the human population. Moreover, they are by no means constant over time but rather they have generally declined in the past. Such changes can have financially material consequences for the life insurance industry, which necessitates reliable models to forecast mortality trends. In a first part of this bachelor thesis, one potential model - the Lee-Carter model - is thoroughly explained and its performance is analyzed based on Swiss mortality data. The backtesting procedure conducted reveals that the Lee-Carter approach is only suitable to some limited extent for modeling this data from Switzerland. The second part of this work focuses on examining the conceptual model risk entailed in the Lee-Carter modeling approach by critically appraising its underlying assumptions. While it is put forward that the general modeling assumptions which the Lee-Carter model is based on seem to be more or less plausible, it is also stressed that assessing and, in particular, quantifying the conceptual model risk is an extremely challenging endeavor. Moreover, possible ways of extending and improving the model are discussed, one of which consists in introducing a capability to account for cohort effects.
Rebecca Morger MSc Imputation Algorithms with Principal Component Analysis for Financial Data Patrick Cheridito
Pawel Kuczera
Philipp Arbenz
SCOR
Abstract: The goal of this thesis is the imputation of missing values in the financial data set of SCOR. Among others, the data consist of equity indices and yield curves. There are several reasons why financial data contain missing values: A currency might not have existed yet, counterparties in a certain rating category were not available, and many others. However, for reinsurance modeling purposes, the data set needs to be complete.
The method used for the imputation is Principal Component Analysis (PCA), which takes into account the correlation structure of the data variables. The extension of PCA to the case of missing data yields a non-convex optimization problem. We focus on the “Iterative PCA Algorithm” as well as gradient-based “Subspace Learning Algorithms”. These algorithms are compared and used to impute the returns of the financial variables. The transformation from the level of returns to the level of the original data is constructed with Brownian bridges.
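The iterative PCA idea can be sketched in a few lines: alternate between filling the missing cells and refitting a low-rank reconstruction. The data, rank and stopping rule below are illustrative assumptions, not SCOR's setup.

# Iterative PCA imputation of missing returns (simulated low-rank data).
import numpy as np

def iterative_pca_impute(X, rank=2, n_iter=100, tol=1e-8):
    mask = np.isnan(X)                                     # True where values are missing
    filled = np.where(mask, np.nanmean(X, axis=0), X)      # start from column means
    for _ in range(n_iter):
        mu = filled.mean(axis=0)
        U, s, Vt = np.linalg.svd(filled - mu, full_matrices=False)
        low_rank = mu + U[:, :rank] @ np.diag(s[:rank]) @ Vt[:rank]
        new_filled = np.where(mask, low_rank, X)           # only overwrite missing cells
        if np.max(np.abs(new_filled - filled)) < tol:
            break
        filled = new_filled
    return filled

rng = np.random.default_rng(6)
factors = rng.normal(size=(200, 2))
returns = factors @ rng.normal(size=(2, 8)) + 0.05 * rng.normal(size=(200, 8))
holes = rng.random(returns.shape) < 0.1                    # knock out ~10% of the entries
observed = np.where(holes, np.nan, returns)

imputed = iterative_pca_impute(observed, rank=2)
print("RMSE on imputed cells:",
      np.sqrt(np.mean((imputed[holes] - returns[holes]) ** 2)))
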
Bianca Morrone MSc Dynamic Classification Patrick Cheridito
Gregor Heyne
UBS
Abstract: The problem of dynamic classification for temporal sequences is to construct a classifier which makes incremental decisions that are sensitive to the changes in a temporal environment, and such that the classifier achieves a reliable classification at an effective point in time. In this work, we introduce a dynamic classification framework which requires a dynamic model, that can perform incremental evaluations of a sequence, along with a time-sensitive environment to make reliable classification decisions in. We implement a few of the proposed methods to demonstrate how dynamic classification can be applied to fields which can benefit from a more time-sensitive approach, such as anomaly detection. We then compare these methods to conventional full sequence classifiers and observe the inherent tradeoff between timeliness and accuracy.
Fabian Rohner Semester Estimating the Discount Curve Patrick Cheridito
Abstract:
Matthias Schmickler BSc Mathematics of Stochastic Loss Reserving Patrick Cheridito
Abstract:
Lena Schütte MSc A Churn Model for Swiss Mandatory Health Insurance Patrick Cheridito
Azenes
Abstract: In this thesis, we investigate the use of churn models in an actuarial pricing context. We fit logistic regression, classification tree and gradient boosting machine models to a large data set of a Swiss health insurer. Here, the actuarial premiums are implicitly based on an assumed portfolio structure, which is predicted by the churn model. We therefore develop a pricing loss function that measures the impact of the churn prediction error on the predicted profits and can be seen as a proxy for the error of the actuarial premium resulting from the error in the churn model. Each model’s performance is then compared with respect to the Pricing Loss, the Binomial Deviance and the AUC. As pricing is linked to setting a market premium, we aim to incorporate into the churn model the impact of the insurer’s premium in a competitive market. To do this, we include the insurer’s premium, premium changes and the premiums of the main competitors as explanatory variables. For logistic regression and the gradient boosting machine, we then deduce an approximation of the premium sensitivity of the insured.
Lena Schütte Semester Modeling Churns with ANNs for Actuarial Pricing Patrick Cheridito
Abstract:
Trevor Winstral Semester Review of Systemic Risk in Financial Networks Stefano Battiston
Patrick Cheridito
Abstract:
Jianing Yang Semester Statistical Analysis of Telematics Car Driving Data Mario Wüthrich
Abstract: In this semester paper we analyze a synthetic dataset of 100’000 car insurance policies generated by So, Boucher and Valdez (2021), which contains observations of classical risk features as well as telematics-related features. Regression against claim frequency is carried out using various machine learning methods, including generalized linear models, generalized additive models, regression trees and Poisson tree boosting. We compare the predictive power of each regression estimator, and through our analysis we gain insight into the most important features affecting claim frequency.

2021

Student Thesis Title Supervisors Industry Partner Download
Mayeul Cayette Semester Computational General Equilibrium Model of Optimal Carbon Policies Mario Wüthrich
Alexey Minabutdinov
Abstract: We derive the optimal contribution to global climate policy for a given country with a fixed emission limit. To this end, we study the Ramsey model, which describes the transition and consumption of a given pollution capital budget, and we determine the optimal path towards this limiting budget. These considerations are carried out for a finite time horizon and for an infinite time horizon; for the infinite-horizon problem we derive the steady states of the model, giving an asymptotic balanced growth rate. Numerically, these problems are solved by value function iteration and by backward iteration. We derive these algorithms and calibrate the models, under different utility functions, to the real-world problem to derive short-run and long-run consumption of capital and pollution budgets.
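As an illustration of value function iteration, the sketch below solves a textbook deterministic growth model with log utility and Cobb-Douglas production, a deliberately simplified stand-in for the pollution-budget model of the project; in this special case the iteration should recover the known closed-form policy k' = alpha*beta*k^alpha.

# Value function iteration on a stylized Ramsey-type growth model (made-up parameters).
import numpy as np

alpha, beta = 0.3, 0.95                                    # assumed technology / discount factor
grid = np.linspace(0.05, 0.5, 400)                         # capital (budget) grid
V = np.zeros_like(grid)

c = grid[:, None] ** alpha - grid[None, :]                 # consumption for each (k, k') pair
feasible = c > 0
log_u = np.where(feasible, np.log(np.where(feasible, c, 1.0)), -np.inf)

for _ in range(2000):
    value = log_u + beta * V[None, :]                      # Bellman right-hand side
    V_new = value.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-9:
        break
    V = V_new

policy = grid[value.argmax(axis=1)]                        # numerical policy k'(k)
closed_form = alpha * beta * grid ** alpha                 # known solution of this special case
print("max abs policy error:", np.max(np.abs(policy - closed_form)))
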
Eneko Clemente Semester MisGAN: A Machine Learning Approach to Learn from Incomplete Data Mario Wüthrich
Abstract: One often has to deal with incomplete datasets. This paper studies an imputation approach that essentially consists of completing the missing values of the data. We discuss the MisGAN imputer, which uses Wasserstein Generative Adversarial Networks to perform this task. While this approach is built on a strong theoretical foundation, we show that its application in practice is difficult and does not fully live up to the theoretical promises.
Hadi Eghlidi DAS Machine Learning and Deep Learning for Business Sales Forecasting Patrick Cheridito
VAT Group
Abstract: In recent years, machine learning algorithms and methods have been increasingly employed in different aspects of the manufacturing and business-to-business (B2B) industries. One area of interest is leveraging data to understand the dynamics and drivers of such complicated industries and to forecast future business volume and performance. This helps in different aspects of the industry, such as managing manufacturing capacity and the supply chain, planning investment in technology transitions, and production planning and control. In this project, we employ data science and machine learning approaches to forecast the future sales of a B2B company and to understand their relation with a supply chain indicator. The workflow includes a literature review, communicating with the company’s business managers to determine the relevant drivers, collecting and preparing the company’s sales data and an important supply chain driver, training and validating various machine learning methods, and visualizing the results to draw business conclusions. As for the machine learning techniques, we use a multi-regressor technique developed by Facebook (Prophet) and a few popular deep learning techniques used for time-series forecasting.
Daisuke Frei MSc Insurance Claim Size Modelling with Mixture Distributions Mario Wüthrich
Simon Rentzmann
AXA Winterthur
Abstract: Insurance claim size data often cannot be modeled precisely by a single (one- or two-parameter) probability distribution, because the small and large claims can have a very different behavior. In this thesis we use mixture distributions to model insurance claim size data. We consider mixtures of light-tailed and heavy-tailed distributions in order to obtain an accurate approximation of both small and large claims. In the theoretical part of this thesis, we introduce the distributions which we will consider and the methods that we will use to fit them to the data. In particular, to fit mixture distributions we use the Expectation-Maximization algorithm. We first study homogeneous models and then we improve them by making use of the generalized linear model framework. In the application part of the thesis we compare the fit of different mixture distributions to a sample of insurance claim size data, and we conclude that the proposed framework can accurately capture the features of the claims.
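A minimal EM sketch for a two-component mixture is given below. It fits a lognormal mixture on log-claims (where the M-step is available in closed form), a simpler stand-in for the light-/heavy-tailed mixtures studied in the thesis, on simulated data.

# EM algorithm for a two-component lognormal mixture of claim sizes (simulated data).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(7)
claims = np.concatenate([rng.lognormal(7.0, 0.7, 4000),    # "small" claims
                         rng.lognormal(10.0, 1.2, 400)])   # "large" claims
x = np.log(claims)

w, mu, sigma = np.array([0.5, 0.5]), np.array([6.0, 11.0]), np.array([1.0, 1.0])  # initial guess
for _ in range(200):
    # E-step: posterior probability that each claim belongs to each component
    dens = w * norm.pdf(x[:, None], mu, sigma)             # shape (n, 2)
    resp = dens / dens.sum(axis=1, keepdims=True)
    # M-step: weighted mixture weights, means and standard deviations
    nk = resp.sum(axis=0)
    w = nk / len(x)
    mu = (resp * x[:, None]).sum(axis=0) / nk
    sigma = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk)

print("weights:", w)        # roughly (0.91, 0.09) for this simulation
print("log-means:", mu)     # roughly (7.0, 10.0)
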
Selim Gatti Semester Pareto Optimal Insurance Policies Patrick Cheridito
Abstract: Uncertainty affects every activity around the world. Methods were thus developed to pool risks among individuals, leading to insurance. Nowadays an insured person buys a contract from an insurer which guarantees coverage if a certain type of loss occurs. Many forms of contracts exist, such as policies with a deductible or policies with full insurance up to an upper limit and coinsurance above it. In this paper, we formulate an insurance problem mathematically using expected utility theory, solve it, and give the conditions under which these forms can be seen as optimal.
Nick Gebert BSc Detection of Asset Price Bubbles Patrick Cheridito
Abstract: This work gives an introduction to the bubble process of an asset in an arbitrage-free complete market case using martingale theory. Building on this, a detection criterion for an asset price bubble is presented and implemented using a neural network. Lastly, the network is employed to detect asset price bubbles in real-world assets.
Massimo Michele Jörin MSc Hedging Options with Reinforcement Learning Mario Wüthrich
Abstract: This thesis shows how different reinforcement learning algorithms can be implemented to calculate trading strategies for hedging problems within the Cox-Ross-Rubinstein and the Black-Scholes-Merton models. While the Cox-Ross-Rubinstein and the Black-Scholes-Merton models allow us to explicitly hedge and price options in finite discrete and finite continuous time, respectively, the assumptions made for these models are rather restrictive (e.g., w.r.t. transaction costs, impossibility of holding arbitrary amounts of stock assets, bid-offer spread, and negative interest rates). Reinforcement learning provides us with a way of finding optimal (or close to optimal) solutions to the hedging and pricing problem of options where no closed-form solution is available. We present the model-based algorithms Value Iteration and Policy Iteration, which require complete knowledge of the stochastic model. Furthermore, we also present model-free algorithms, which do not require complete knowledge of the model and compensate for this by interacting with the environment/model and processing these observations. In the case of a discrete state space and a discrete action space we present the algorithms SARSA, Q-Learning, and Double Q-Learning for finding such solutions. For a continuous state space and a discrete action space we use Deep Q Network (DQN) and Deep Double Q Network (DDQN).
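A toy tabular Q-learning sketch for hedging a call in a small Cox-Ross-Rubinstein tree is shown below; the hedge-ratio grid, the reward (minus the squared one-step hedging error) and all parameters are illustrative assumptions, not the thesis implementation.

# Tabular Q-learning for one-step hedging in a CRR tree; the learned action should
# approach the CRR delta (toy setup with r = 0 and made-up parameters).
import numpy as np

rng = np.random.default_rng(8)
T, S0, u, d, K = 5, 100.0, 1.1, 0.9, 100.0
p = (1.0 - d) / (u - d)                                    # risk-neutral up-probability (r = 0)

S = np.zeros((T + 1, T + 1))                               # S[t, j]: price after j up-moves
C = np.zeros((T + 1, T + 1))                               # CRR option values (used for rewards)
for t in range(T + 1):
    for j in range(t + 1):
        S[t, j] = S0 * u**j * d**(t - j)
C[T, :] = np.maximum(S[T, :] - K, 0.0)
for t in range(T - 1, -1, -1):
    for j in range(t + 1):
        C[t, j] = p * C[t + 1, j + 1] + (1 - p) * C[t + 1, j]

actions = np.linspace(0.0, 1.0, 21)                        # candidate hedge ratios
Q = np.zeros((T, T + 1, len(actions)))
lr, eps = 0.05, 0.2

for episode in range(50_000):
    j = 0
    for t in range(T):
        a = rng.integers(len(actions)) if rng.random() < eps else Q[t, j].argmax()
        up = rng.random() < p
        j_next = j + 1 if up else j
        hedge_err = actions[a] * (S[t + 1, j_next] - S[t, j]) - (C[t + 1, j_next] - C[t, j])
        reward = -hedge_err**2                             # penalize the one-step hedging error
        future = Q[t + 1, j_next].max() if t + 1 < T else 0.0
        Q[t, j, a] += lr * (reward + future - Q[t, j, a])
        j = j_next

learned_delta = actions[Q[0, 0].argmax()]
crr_delta = (C[1, 1] - C[1, 0]) / (S[1, 1] - S[1, 0])
print(f"learned initial hedge: {learned_delta:.2f}, CRR delta: {crr_delta:.2f}")
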
Fang Rui Lim MSc Entropy Martingale Optimal Transport and Utility Induced Divergences Patrick Cheridito
Abstract: The objective of this thesis is to investigate the dual of the Entropy Martingale Optimal Transport problem, introduced by Doldi et al. “Entropy Martingale Optimal Transport and Nonlinear Pricing-Hedging Duality”, for non-compact subsets of R^n. We provide conditions under which the duality representation, a type of inf-sup relation, holds and when this infimum is achieved. As an application, we concentrate on the case when the Entropy Martingale Optimal Transport problem involves the control of the marginal distribution of measures via divergence terms induced by utility functions as in Doldi et al.
Nikolaos Mourdoukoutas MSc Probabilistic Approaches to Invariance Patrick Cheridito
Gunnar Rätsch
Abstract: We propose three novel Bayesian models that can learn invariances from data alone by inferring a posterior distribution over different weight-sharing schemes. We show that our last method, which is a Bayesian neural network, outperforms other noninvariant architectures when trained on datasets that contain specific invariances. The same holds true when no data augmentation is performed. Finally, we overview some already existing approaches for modeling invariant functions with Gaussian processes.
Simon Müller MSc On the Transformation of Actuarial Loss Models into Synthetic NatCat Loss Tables Philipp Arbenz
Patrick Cheridito
SCOR
Abstract: In (re-)insurance, when it comes to loss modelling and aggregation, the actuarial and natural catastrophe modelling approaches are rather separate.
- On the actuarial side, loss modelling through aggregate or frequency-severity distributions is often used and aggregated through dependence assumptions such as copulas or correlation matrices.
- On the natural catastrophe side, loss simulations through NatCat models (using hazard, exposure and vulnerability components) are used and aggregated by adding up loss amounts by event.

The thesis brings these two worlds closer by bridging the gap so that actuarial models can be aggregated in NatCat aggregation systems. Such systems need a consistent set of so-called “Event IDs” to aggregate losses across different contracts or portfolios, since for each modelled contract the loss amounts are linked to these event IDs. SCOR has built an algorithm which translates standard actuarial frequency-severity models into a model setup that allows such event IDs to be attached to simulated losses.

The thesis studies the different approximation steps and mathematically analyses the errors and approximations occurring in this transformation. Three error sources were identified: one in the frequency transformation (general frequency to Poisson (2)), one in the severity transformation (effectively dropping losses in some cases), and one in the event ID injection (if the number of simulations is too low). For all three cases, precise mathematical derivations and error bounds allow us to understand the performance and behaviour of the algorithm.
Luca Pedrazzini MSc Reserving Methods: A Practical Overview Patrick Cheridito
Sari De Martin
Stefan Bregy
Ernst & Young
Abstract: This Master’s thesis presents a possible way of estimating future reserves by applying the Mack Chain Ladder, Bornhuetter-Ferguson and Cape Cod methods. By implementing the three methods in the programming language R, we try to output the best estimate for each accident year, focusing on practical aspects rather than on the theoretical background.
The whole process relies on the use of claim triangles and, through model selection, it allows us to find the reserving method which optimises the estimation of the reserve. This is done by minimising two different quantities: the claim development result and the actual versus expected. Using different data sets, we are able to examine more results and to reach a general conclusion.
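For reference, the deterministic chain-ladder step underlying the Mack method can be sketched in a few lines; the cumulative triangle below is a small made-up example, not the thesis data.

# Volume-weighted chain-ladder development factors and reserves on a toy triangle.
import numpy as np

# cumulative payments: rows = accident years, columns = development years (NaN = future)
tri = np.array([
    [1000., 1800., 2100., 2200.],
    [1100., 1950., 2250., np.nan],
    [1200., 2150., np.nan, np.nan],
    [1300., np.nan, np.nan, np.nan],
])

n = tri.shape[1]
factors = []
for j in range(n - 1):
    both = ~np.isnan(tri[:, j]) & ~np.isnan(tri[:, j + 1])
    factors.append(tri[both, j + 1].sum() / tri[both, j].sum())   # volume-weighted f_j

full = tri.copy()
for j in range(n - 1):
    missing = np.isnan(full[:, j + 1]) & ~np.isnan(full[:, j])
    full[missing, j + 1] = full[missing, j] * factors[j]          # project to ultimate

reserves = full[:, -1] - np.array([row[~np.isnan(row)][-1] for row in tri])
print("development factors:", np.round(factors, 3))
print("estimated reserves by accident year:", np.round(reserves, 1))
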
Nicola Ruckstuhl MSc Multi-Population Mortality Modeling using Tensor Decomposition Mario Wüthrich
Abstract: The topic of mortality modeling is important in many different fields such as insurance, biology and medicine. Insurance companies use mortality tables, which depict the mortality rates in a specific population for different ages and calendar years, to price life insurance products such as annuities and death benefits. Mortality models can usually be divided into two categories: Single- and multi-population mortality models. In single-population modeling, the mortality rates in a single population are modeled in isolation. Single populations include for example the total population of some country or the female or male populations of that country. One of the most renowned single-population mortality models is the Lee-Carter (LC) model, which was established by Lee and Carter (1992), who used a matrix decomposition, specifically singular value decomposition, to fit and forecast U.S. mortality rates. Many mortality models are based on the LC model.
In multi-population modeling, multiple single populations are modeled simultaneously, which means that the module for each specific population within the model is affected by the other populations. Considering that it is reasonable to assume correlations between mortality trends of different countries, the aim with this type of modeling is that the single populations benefit from the increased number of observations. In order to increase this effect, it is useful for the populations to be as similar as possible when it comes to variables that might affect mortality in some way. Multi-population mortality modeling was pioneered by Li and Lee (2005). Whereas in single-population modeling, the mortality rates are considered as functions of only age and calendar year and can thus be depicted as matrices, in multi-population modeling another dimension is added by considering multiple populations simultaneously. Thus the rates are given as 3-dimensional arrays. Multi-dimensional arrays are known as tensors. In this thesis we study multi-population tensor decompositions in a similar way as in Russolillo-Giordano-Haberman (2011) and Dong-Huang-Yu-Haberman (2020).
Rui Wang MSc Discriminating Modelling Approaches for Point in Time Economic Scenario Generation Patrick Cheridito
Binghuan Lin
UBS
Abstract: We introduce the notion of Point in Time Economic Scenario Generation (PiT ESG) with a clear mathematical problem formulation to unify and compare economic scenario generation approaches conditional on forward looking market data. Such PiT ESGs should provide quicker and more flexible reactions to sudden economic changes than traditional ESGs calibrated solely to long periods of historical data. We specifically take as economic variable the S&P500 Index with the VIX Index as forward looking market data to compare the nonparametric filtered historical simulation, GARCH model with joint likelihood estimation (parametric), Restricted Boltzmann Machine and the conditional Variational Autoencoder (Generative Networks) for their suitability as PiT ESG. Our evaluation consists of statistical tests for model fit and benchmarking the out of sample forecasting quality with a strategy backtest using model output as stop loss criterion. We find that both Generative Networks outperform the nonparametric and classic parametric model in our tests, but that the CVAE seems to be particularly well suited for our purposes: yielding more robust performance and being computationally lighter.

2020

Student Thesis Title Supervisors Industry Partner Download
Carlo Casati Semester Tontines in the Light of Systematic Longevity Risk Mario Wüthrich
Irina Gemmo
Abstract: Tontines were introduced in 1653 by the Italian banker Lorenzo de Tonti as an investment vehicle. Recently, tontines have gained a lot of popularity in the life and pension community, as they offer an alternative pension tool in which tontine subscribers bear the financial and longevity risk in a self-organized way. The purpose of this semester thesis is to review tontines and to better understand how longevity risks act on tontine subscribers under the assumption of a heterogeneous tontine community.
Daria Filippova MSc Modelling Propensity to Type 2 Diabetes using Medical Data Mario Wüthrich
Francesca Volpe
Swiss Re
Abstract: From the insurance perspective, the increase in type 2 diabetes cases observed in the last decades leads to a significant surge in costs. Therefore, the goal of this thesis is to develop a model which is able to identify individuals with a high propensity to develop type 2 diabetes within a predetermined time period. A successful classification model may be used as an early warning system to notify individuals that are at risk and to prescribe them a prevention or mitigation program. This results in a reduction of claims and improved risk management for health insurance companies. The classification model is based on Logistic Regression, which is part of the framework of Generalised Linear Models. However, an insurer's interest is not only "if" but also "when" a case of type 2 diabetes will be diagnosed. To answer this, Survival Analysis is leveraged. For the survival analysis of medical or health data, non-parametric statistical estimation is preferred. For this thesis, the Cox Proportional Hazards model is used since it allows for multiple predictors. It extends the classification model by a new dimension, namely the time frame. The survival model uses machine learning techniques and survival analysis in order to estimate the expected time until the diagnosis of type 2 diabetes.
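A minimal sketch of the two modelling steps, logistic regression for the "if" and a Cox proportional hazards fit for the "when", is given below on simulated data. The feature names (age, bmi, glucose), the data-generating process and the use of scikit-learn and lifelines are assumptions for illustration only.

# Classification ("if") and survival ("when") sketch on simulated health-style data.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from lifelines import CoxPHFitter

rng = np.random.default_rng(9)
n = 5000
df = pd.DataFrame({
    "age": rng.uniform(20, 70, n),                         # assumed covariates
    "bmi": rng.normal(27, 4, n),
    "glucose": rng.normal(5.5, 0.8, n),
})
risk = (0.05 * (df["age"] - 45) + 0.15 * (df["bmi"] - 27)
        + 0.8 * (df["glucose"] - 5.5)).to_numpy()
time_to_dx = rng.exponential(scale=1.0 / np.exp(-3.0 + 0.3 * risk))   # years to diagnosis
df["event"] = (time_to_dx < 10).astype(int)                # diagnosed within 10 years?
df["time"] = np.minimum(time_to_dx, 10.0)                  # right-censored at 10 years

# "if": propensity of a diagnosis within the observation window
clf = LogisticRegression(max_iter=1000).fit(df[["age", "bmi", "glucose"]], df["event"])
print("10-year propensity, first record:",
      clf.predict_proba(df[["age", "bmi", "glucose"]].iloc[[0]])[0, 1])

# "when": Cox proportional hazards model for the time to diagnosis
cph = CoxPHFitter()
cph.fit(df, duration_col="time", event_col="event")
cph.print_summary()
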
Andrea Gabrielli PhD Claims Reserving and Neural Networks Mario Wüthrich
Patrick Cheridito
Franco Moriconi
DOI
Abstract: In non-life insurance, an insurance claim can generally not be settled immediately at occurrence. A claims development process often takes several years. Future cash flows for claims that have occurred in the past are called outstanding loss liabilities, and a prediction thereof provides the claims reserves. Typically, the claims reserves are the largest position in the balance sheet of a non-life insurance company. This underlines the importance of the claims reserving exercise. Traditional claims reserving models such as for example Mack’s chain-ladder model or the over-dispersed Poisson reserving model work on aggregated data. These models neglect claim-specific information, which implies a considerable loss of information. Moving towards more data-driven techniques has the potential to improve the accuracy of the claims reserves. The rising popularity of machine learning methods in the last couple of years has promoted the development of new claims reserving techniques. One of the most popular machine learning methods is neural networks. In simple words, neural networks can be used as high-dimensional non-linear regression functions. In this thesis we combine the art of claims reserving with the power of neural networks. This thesis consists of five research papers: In Paper A we use neural networks to develop a stochastic simulation machine that generates individual non-life insurance claims. These synthetic individual claims allow us to back-test classical claims reserving models as well as to develop new claims reserving techniques. In Paper B we use the individual claims history simulation machine of Paper A in order to back-test the chain-ladder model. This study provides a general intuition of how the chain-ladder claims reserving model behaves, particularly when the portfolio size increases. In Paper C we embed the over-dispersed Poisson reserving model into a neural network. We start the neural network calibration exactly in this over-dispersed Poisson model. Such a nested model allows us to learn model structure beyond the classical reserving model. In Paper D we extend the embedding of the over-dispersed Poisson model for claim amounts of Paper C to a joint embedding of separate over-dispersed Poisson models for both claim amounts and claim counts, exploring additional information provided by the claim counts. In Paper E we provide an individual claims reserving model for reported claims. This model uses claim-specific feature and past payment information in order to calculate claims reserves for individual reported claims. For this task we design one single neural network.
Vito Gallo MSc XVA Analysis for Bilateral Derivatives in Continuous Time Patrick Cheridito
Abstract: XVAs are add-ons that a bank dealing in bilateral derivatives charges to its clients to account for counterparty risk and its capital and funding implications. In this thesis we reformulate the continuous-time analysis of XVAs of [AC18], adding important theoretical results from the theory of invariance times of [CS17] that help us set rigorous assumptions for the well-posedness of the XVA equations and an improved definition of the capital value adjustment (KVA) problem. We also generalise two important assumptions: we separate the margin value adjustment (MVA) from the funding value adjustment (FVA), and we allow the liquidation period of a trade due to default of the client to be positive. These generalisations permit us to obtain a more realistic XVA model, in which we distinguish between the variation and the initial margin. We also obtain a generalised counterparty exposure cash flow, which is used in the formulas for the credit value adjustment (CVA) and the debt value adjustment (DVA). At the end of the thesis we present a simple case study portfolio of interest rate swaps that could be used in an implementation of the XVA problem. As in [AC18], we take a balance sheet perspective on the pricing and risk management of the bilateral derivatives portfolio of the bank, studying not only the pricing, but also the relative collateralisation, accounting, and dividend policy of the bank. Since the bank cannot hedge against default exposure cash flows (of clients and of the bank itself), the bank’s shareholders have to set aside capital at risk, and a wealth transfer from shareholders to bondholders occurs at the default of the bank. As a consequence, the bank charges to the clients, on top of the fair valuation of counterparty risk, the so-called contra-liabilities and a cost of capital at inception of each new trade. This results in an all-inclusive XVA formula given by CVA + FVA + MVA + KVA.
Yan-Xing Lan BSc Variational Autoencoders Mario Wüthrich
Abstract: The variational autoencoder is one of the most popular approaches to unsupervised learning and generative modelling. In the few years after its initial inception there has been extensive research on its extensions and applications. To get a better understanding of these powerful models the main focus of this bachelor thesis is to formulate the theory behind the variational autoencoder and to explore the theory on an explicit example. Therefore, we describe the mathematical theory behind the framework of the variational autoencoder in the first part and apply it on the MNIST data set in the second part of the thesis.
Marcello Monga MSc Deep Portfolio Optimization Sebastian Becker
Patrick Cheridito
Abstract: This work introduces a new machine-learning-based approach to portfolio optimization. Its most important feature is that, in contrast with classical models, our method makes it possible to handle market frictions such as transaction costs and market impact. This Master’s thesis is divided into two parts. The first part corresponds to the first section and consists of an introduction to portfolio optimization and asset and liability management. The second part is the most important one and coincides with the second section. Here, we formalize our approach and then apply it to three different examples. In the first one, we play the role of an investor that does not have any external cash flows and only acts on the market. After that, we assume the point of view of a pension fund and then of an insurance company. Our examples are presented together with graphs and tables obtained using Python, which show how our method works. The code is reported at the end of the work.
Marc Nübel BSc Matrix Mittag-Leffler Distributions with Applications to Insurance Mario Wüthrich
Abstract: This thesis explains the construction and properties of Matrix Mittag-Leffler distributions. Furthermore, it explores the use of this family of distribution functions on a motor third party liability (MTPL) insurance data set available from the R package CASdatasets.
Nicola Ruckstuhl Semester Unintuitive Modelling Effects in Non-Proportional Reinsurance Contracts Philipp Arbenz
Patrick Cheridito
SCOR
Abstract:
Fanny Siegwart MSc Robust Wasserstein Profile Inference And Applications To Machine Learning Mario Wüthrich
Abstract: This thesis studies robust generalized linear model fitting using the framework of distributionally robust optimization. We describe the theory, which essentially amounts to optimizing over Wasserstein balls around the empirical distribution, and we relate this to ridge and LASSO regularization. Furthermore, this approach is explored on simulated and on real data.
Robin Vogtland BSc Calibration of Stochastic Volatility Models using different Neural Network Approaches Patrick Cheridito
Abstract: In finance, parametric models are often used to price various derivative contracts. The model parameters have to be calibrated to quoted market data, i.e., they have to be chosen such that the model best fits the behaviour observed in the financial market. This leads to optimization problems that can only be solved efficiently when closed or semi-closed option pricing formulas can be derived for the model in question. For more realistic models, one has to resort to Monte Carlo sampling, making the calibration a computationally expensive optimization problem. In recent years, research [10, 2, 8] has discussed speeding up this procedure using neural networks. One possibility is splitting the calibration procedure into two steps: first, learning the mapping from the parameter space to the prices of the contracts (or, similarly, implied volatilities) using observed data; second, using this mapping to optimize for the best choice of model parameters. This has been shown to work well in Horvath et al. [10]. An alternative approach is to directly learn the mapping from prices and contract parameters to the model parameters. The main advantage of these approaches is the ability to train the neural networks prior to application, using large quantities of data. Once the network is trained, the calibration task can be performed relatively fast, which is of extreme importance for application in the financial industry. Therefore, the range of models that can be used in practice is extended to include more sophisticated and accurate models, which could previously not be used due to their calibration times. In this thesis, we compare the different calibration procedures on several examples. This shows advantages for the two-step approach under different error metrics. The direct approach only has the benefit of performing calibration extremely fast once the network is trained.
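A sketch of the two-step approach is given below. Black-Scholes stands in for an expensive pricing model so that the learned surrogate can be checked against a closed form; the strike grid, network size and parameter ranges are made-up assumptions, not the settings of the thesis.

# Two-step calibration: (1) learn parameter -> prices offline, (2) calibrate on the surrogate.
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize_scalar
from sklearn.neural_network import MLPRegressor

S0, r, T = 1.0, 0.0, 1.0
strikes = np.linspace(0.8, 1.2, 9)                         # assumed strike grid

def bs_call(sigma, K):
    # Black-Scholes call price; stands in for an expensive pricing model
    d1 = (np.log(S0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
    return S0 * norm.cdf(d1) - K * np.exp(-r * T) * norm.cdf(d1 - sigma * np.sqrt(T))

# step 1: offline training of the parameter-to-price-surface map
rng = np.random.default_rng(10)
sigmas = rng.uniform(0.05, 0.6, 5000)
prices = np.array([bs_call(s, strikes) for s in sigmas])
net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=5000, tol=1e-7, random_state=0)
net.fit(sigmas.reshape(-1, 1), prices)

# step 2: fast calibration to "market" quotes using only the cheap surrogate
true_sigma = 0.23
market = bs_call(true_sigma, strikes)
objective = lambda s: np.sum((net.predict([[s]])[0] - market) ** 2)
res = minimize_scalar(objective, bounds=(0.05, 0.6), method="bounded")
print(f"calibrated sigma: {res.x:.3f} (data generated with sigma = {true_sigma})")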