The link between previous life trajectories and a later life outcome: A feature selection approach

Title: The link between previous life trajectories and a later life outcome: A feature selection approach
Publication type: Journal Article
Year of publication: 2020
Authors: Bolano, D
Secondary authors: Studer, M
Journal: LIVES Working paper
Volume: 82
Pagination: 1-38
Date published: 01/2020
ISSN: 2296-1658
Keywords: life course methodology, machine learning, sequence analysis, variable selection
Abstract

Several studies have investigated the link between a previous trajectory and a given later-life outcome. Trajectories are complex objects, and identifying which of their aspects are relevant is of primary interest both for prediction and for testing specific theories. In this work, we propose an innovative approach based on data mining feature selection algorithms. The approach proceeds in two steps. In the first step, we automatically extract several properties of the sequences. Following a life course approach, we focus here on features related to three key aspects of the life course: the sequencing, timing and duration of life events. In the second step, we use feature selection algorithms to identify the properties most relevant to the outcome. We discuss the use of two feature selection approaches: a random forest approach (Boruta) and a LASSO method (Stability Selection). We also discuss the inclusion of control variables, such as socio-demographic characteristics of the respondent, in this selection process. The proposed approach is illustrated through a study of the effects of family and work trajectories between ages 20 and 40 on health and income conditions in midlife.
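
As a rough illustration of the first step only, the following Python sketch derives duration, timing and sequencing features from a single yearly state sequence. It is not the authors' implementation: the function name `extract_features`, the state labels and the feature naming scheme are all hypothetical, and the working paper's actual feature definitions may differ.

```python
from itertools import permutations


def extract_features(sequence, alphabet, start_age=20):
    """Illustrative features for one yearly state sequence (hypothetical names)."""
    features = {}
    first_age = {}
    for state in alphabet:
        # Duration: number of years spent in the state.
        features[f"duration_{state}"] = sequence.count(state)
        # Timing: age at first entry into the state (None if never observed).
        if state in sequence:
            first_age[state] = start_age + sequence.index(state)
        else:
            first_age[state] = None
        features[f"first_age_{state}"] = first_age[state]
    # Sequencing: 1 if both states occur and a is first observed before b.
    for a, b in permutations(alphabet, 2):
        both_seen = first_age[a] is not None and first_age[b] is not None
        features[f"{a}_before_{b}"] = int(both_seen and first_age[a] < first_age[b])
    return features


# Invented example: a ten-year family trajectory starting at age 20.
alphabet = ["single", "couple", "parent"]
trajectory = ["single"] * 3 + ["couple"] * 2 + ["parent"] * 5
features = extract_features(trajectory, alphabet)
```

Applying such a function to every respondent would yield a feature table that a selection algorithm, such as the Boruta or Stability Selection procedures named in the abstract, could then screen against the outcome.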

DOI: 10.12682/lives.2296-1658.2020.82