ZüKoSt: Seminar on Applied Statistics

Spring Semester 2024

Date / Time

Speaker

Title

Location

1 February 2024
15:30-16:30

Andrew Vickers
Memorial Sloan Kettering Cancer Center, New York

Event Details

ZüKoSt Zürcher Kolloquium über Statistik

Title	If calibration, discrimination, Brier, lift gain, precision recall, F1, Youden, AUC, and 27 other accuracy metrics can’t tell you if a prediction model (or diagnostic test, or marker) is of clinical value, what should you use instead?
Speaker, Affiliation	Andrew Vickers, Memorial Sloan Kettering Cancer Center, New York
Date, Time	1 February 2024, 15:30-16:30
Location	AKI Lecture Hall 1&2, Hirschengraben 86, 8001 Zürich
Abstract	A typical paper on a prediction model (or diagnostic test or marker) presents some accuracy metrics (say, an AUC of 0.75 and a calibration plot that doesn’t look too bad) and then recommends that the model (or test or marker) be used in clinical practice. But how high an AUC (or Brier or F1 score) is high enough? What level of miscalibration would be too much? The problem is compounded when comparing two different models (or tests or markers). What if one prediction model has better discrimination but the other has better calibration? What if one diagnostic test has better sensitivity but worse specificity? Note that it doesn’t help to state a general preference, such as “if we think sensitivity is more important, we should take the test with the higher sensitivity”, because this does not allow us to evaluate trade-offs (e.g. test A with sensitivity of 80% and specificity of 70% vs. test B with sensitivity of 81% and specificity of 30%). The talk will start by showing a series of everyday examples of prognostic models, demonstrating that it is difficult to tell which is the better model, or whether to use a model at all, on the basis of routinely reported accuracy metrics such as AUC, Brier score or calibration. We then give the background to decision curve analysis, a net benefit approach first introduced about 15 years ago, and show how this methodology gives clear answers about whether to use a model (or test or marker) and which is best. Decision curve analysis has been recommended in editorials in many major journals, including JAMA, JCO and the Annals of Internal Medicine, and is very widely used in the medical literature, with well over 1000 empirical uses a year. We are pleased to invite you - see you there!
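The trade-off example in the abstract (test A vs. test B) can be made concrete with a small net benefit calculation. The sketch below is not taken from the talk: it uses the standard net benefit formula from decision curve analysis (true positives per patient minus false positives per patient weighted by the odds of the threshold probability), and the prevalence (20%) and threshold probability (10%) are assumed values chosen purely for illustration. The function name `net_benefit` is hypothetical.

```python
def net_benefit(sensitivity, specificity, prevalence, threshold):
    """Net benefit per patient of acting on a positive test result.

    Standard decision-curve formula: TP/n - FP/n * pt / (1 - pt),
    where pt is the threshold probability.
    """
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    odds = threshold / (1 - threshold)
    true_pos_rate = sensitivity * prevalence                # TP per patient
    false_pos_rate = (1 - specificity) * (1 - prevalence)   # FP per patient
    return true_pos_rate - false_pos_rate * odds


# Assumed values for illustration only; the abstract does not give them.
PREVALENCE = 0.20
THRESHOLD = 0.10

nb_a = net_benefit(0.80, 0.70, PREVALENCE, THRESHOLD)   # test A from the abstract
nb_b = net_benefit(0.81, 0.30, PREVALENCE, THRESHOLD)   # test B from the abstract
nb_all = net_benefit(1.00, 0.00, PREVALENCE, THRESHOLD) # "treat everyone" strategy

print(f"test A: {nb_a:.4f}  test B: {nb_b:.4f}  treat all: {nb_all:.4f}")
```

Under these assumed values, test A has the higher net benefit, and test B does worse than simply treating everyone. A different prevalence or threshold probability could change the ranking, which is why a decision curve plots net benefit across a range of thresholds.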

29 February 2024
15:15-16:15

Manuela Brunner
WSL Institute for Snow and Avalanche Research SLF

Event Details

Title	Exceptional flood events: insights from three simulation approaches
Speaker, Affiliation	Manuela Brunner, WSL Institute for Snow and Avalanche Research SLF
Date, Time	29 February 2024, 15:15-16:15
Location	HG G 19.2
Abstract	Exceptional floods, i.e. flood events with magnitudes or spatial extents occurring only once or twice a century, are rare by definition. It is therefore challenging to estimate their frequency, magnitude, and future changes. In this talk, I discuss three methods that increase sample size and thereby enable us to study exceptional extreme events absent from observational records: stationary and non-stationary stochastic simulation, reanalysis ensemble pooling, and single-model initialized large ensembles. I apply these techniques to (1) study the frequency of widespread floods, (2) quantify future changes in spatial flood extents, (3) estimate the magnitude of floods happening once or twice a century, and (4) shed light on the relationship between future increases in extreme precipitation and flooding. These applications suggest that simulation approaches that substantially increase sample size provide a better picture of flood variability and help to improve our understanding of the characteristics, drivers, and changes of exceptional extreme events.

11 April 2024
15:15-16:15

Arnoldo Frigessi
University of Oslo

Event Details

Title	Probabilistic preference learning from incomplete rank data
Speaker, Affiliation	Arnoldo Frigessi, University of Oslo
Date, Time	11 April 2024, 15:15-16:15
Location	HG G 19.1
Abstract	Ranking data are ubiquitous: we rank items as citizens, as consumers, and as scientists, and we are collectively characterised, individually classified, and given recommendations based on estimates of our preferences. Preference data arise when we express comparative opinions about a set of items by rating, ranking, pairwise comparison, liking, choosing, or clicking, usually in an incomplete and possibly inconsistent way. The purpose of preference learning is to (i) infer the shared consensus preference of a group of users, or (ii) estimate each user's individual ranking of the items when the user indicates only incomplete preferences; the latter is an important part of recommender systems. I present a Bayesian preference-learning framework based on the Mallows rank model with any right-invariant distance, to infer the consensus ranking of a group of users and to estimate the complete ranking of the items for each user. MCMC-based inference is possible through an importance-sampling approximation of the normalising function, but mixing can be slow. We propose a Variational Bayes approach to posterior inference, based on a pseudo-marginal approximating distribution on the set of permutations of the items. The approach scales well and has useful theoretical properties. Partial rankings and non-transitive pairwise comparisons are handled by Bayesian augmentation. The Bayes-Mallows approach produces well-calibrated uncertainty quantification of estimated preferences, which is useful for recommendation and leads to excellent accuracy and increased diversity compared, for example, to matrix factorisation. Simulations and real-world applications help illustrate the method. This talk is based on joint work with Elja Arjas, Marta Crispino, Qinghua Liu, Ida Scheel, Øystein Sørensen, and Valeria Vitelli.
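For readers unfamiliar with the Mallows rank model mentioned in the abstract, the following is a minimal sketch, not the speaker's implementation. It assumes the Kendall distance (one example of a right-invariant distance) and the classic parameterisation P(r) proportional to exp(-alpha * d(r, rho)); some authors scale alpha by the number of items. The normalising constant is computed by brute-force enumeration, which is only feasible for a handful of items, so none of the approximation or inference machinery from the talk is shown. The function names are hypothetical.

```python
from itertools import combinations, permutations
from math import exp


def kendall_distance(r, rho):
    """Number of item pairs that rank vectors r and rho order differently."""
    return sum(
        1
        for i, j in combinations(range(len(r)), 2)
        if (r[i] - r[j]) * (rho[i] - rho[j]) < 0
    )


def mallows_pmf(rho, alpha):
    """Return {ranking: probability} for a Mallows model centred at rho.

    rho is a tuple of ranks 1..n; alpha >= 0 controls concentration.
    The normaliser is obtained by enumerating all n! rankings.
    """
    n = len(rho)
    weights = {
        r: exp(-alpha * kendall_distance(r, rho))
        for r in permutations(range(1, n + 1))
    }
    z = sum(weights.values())
    return {r: w / z for r, w in weights.items()}


# Assumed consensus ranking and concentration, for illustration only.
consensus = (1, 2, 3, 4)
pmf = mallows_pmf(consensus, alpha=1.0)
most_likely = max(pmf, key=pmf.get)
print(most_likely, round(pmf[most_likely], 4))
```

The consensus ranking is the mode of the distribution, and the probability of any other ranking decays exponentially with its distance from the consensus. With alpha = 0 the model reduces to the uniform distribution over all rankings.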

19 April 2024
15:15-16:15

Joshua Warren
Yale University

Event Details

Title	A Bayesian framework for incorporating exposure uncertainty into health analyses with application to air pollution and stillbirth
Speaker, Affiliation	Joshua Warren, Yale University
Date, Time	19 April 2024, 15:15-16:15
Location	HG G 19.1
Abstract	Studies of the relationships between environmental exposures and adverse health outcomes often rely on a two-stage statistical modeling approach, where exposure is modeled/predicted in the first stage and used as input to a separately fit health outcome analysis in the second stage. Uncertainty in these predictions is frequently ignored, or accounted for in an overly simplistic manner when estimating the associations of interest. Working in the Bayesian setting, we propose a flexible kernel density estimation (KDE) approach for fully utilizing posterior output from the first stage modeling/prediction to make accurate inference on the association between exposure and health in the second stage, derive the full conditional distributions needed for efficient model fitting, detail its connections with existing approaches, and compare its performance through simulation. Our KDE approach is shown to generally have improved performance across several settings and model comparison metrics. Using competing approaches, we investigate the association between lagged daily ambient fine particulate matter levels and stillbirth counts in New Jersey (2011–2015), observing an increase in risk with elevated exposure 3 days prior to delivery. The newly developed methods are available in the R package KDExp.
