Seminar overview

Spring Semester 2024

Date & Time Speaker Title Location
Thu 01.02.2024
15:30-16:30
Andrew Vickers
Memorial Sloan Kettering Cancer Center, New York
Abstract
A typical paper on a prediction model (or diagnostic test or marker) presents some accuracy metrics (say, an AUC of 0.75 and a calibration plot that doesn’t look too bad) and then recommends that the model (or test or marker) be used in clinical practice. But how high an AUC (or Brier or F1 score) is high enough? What level of miscalibration would be too much? The problem is redoubled when comparing two different models (or tests or markers). What if one prediction model has better discrimination but the other has better calibration? What if one diagnostic test has better sensitivity but worse specificity? Note that it does not help to state a general preference, such as “if we think sensitivity is more important, we should take the test with the higher sensitivity”, because this does not allow us to evaluate trade-offs (e.g. test A with sensitivity of 80% and specificity of 70% vs. test B with sensitivity of 81% and specificity of 30%). The talk will start by showing a series of everyday examples of prognostic models, demonstrating that it is difficult to tell which is the better model, or whether to use a model at all, on the basis of routinely reported accuracy metrics such as AUC, Brier score or calibration. We then give the background to decision curve analysis, a net benefit approach first introduced about 15 years ago, and show how this methodology gives clear answers about whether to use a model (or test or marker) and which is best. Decision curve analysis has been recommended in editorials in many major journals, including JAMA, JCO and the Annals of Internal Medicine, and is very widely used in the medical literature, with well over 1000 empirical uses a year. We are pleased to invite you - see you there!
ZüKoSt Zürcher Kolloquium über Statistik
If calibration, discrimination, Brier, lift gain, precision recall, F1, Youden, AUC, and 27 other accuracy metrics can’t tell you if a prediction model (or diagnostic test, or marker) is of clinical value, what should you use instead?
AKI Lecture Hall 1&2, Hirschengraben 86, 8001 Zürich
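The net benefit calculation at the heart of decision curve analysis is simple enough to sketch in a few lines. The following toy example (simulated data; the sample size, risk predictions and thresholds are invented for illustration) computes net benefit = TP/N - FP/N x pt/(1 - pt) at several threshold probabilities pt and compares the model against the treat-all and treat-none strategies:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1000
y = rng.binomial(1, 0.3, size=N)                       # observed outcomes
p = np.clip(0.2 * y + rng.uniform(0, 0.8, N), 0, 1)    # toy risk predictions

def net_benefit(y, p, pt):
    """Net benefit of treating patients with predicted risk >= pt."""
    treat = p >= pt
    tp = np.mean(treat & (y == 1))   # true positives per patient
    fp = np.mean(treat & (y == 0))   # false positives per patient
    return tp - fp * pt / (1 - pt)

for pt in (0.1, 0.2, 0.3):
    nb_model = net_benefit(y, p, pt)
    nb_all = y.mean() - (1 - y.mean()) * pt / (1 - pt)  # treat everyone
    print(f"pt={pt:.1f}  model={nb_model:.3f}  treat-all={nb_all:.3f}  treat-none=0")
```

A decision curve is just this comparison traced over a grid of thresholds: at each pt, the preferred strategy is the one with the highest net benefit.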
Thu 29.02.2024
15:15-16:15
Manuela Brunner
WSL Institute for Snow and Avalanche Research SLF
Abstract
Exceptional floods, i.e. flood events with magnitudes or spatial extents occurring only once or twice a century, are rare by definition. Therefore, it is challenging to estimate their frequency, magnitude, and future changes. In this talk, I discuss three methods that enable us to study exceptional extreme events absent from observational records by increasing the sample size: stationary and non-stationary stochastic simulation, reanalysis ensemble pooling, and single-model initialized large ensembles. I apply these techniques to (1) study the frequency of widespread floods, (2) quantify future changes in spatial flood extents, (3) estimate the magnitude of floods happening once or twice a century, and (4) shed light on the relationship between future increases in extreme precipitation and flooding. These applications suggest that simulation approaches that substantially increase sample size provide a better picture of flood variability and help to increase our understanding of the characteristics, drivers, and changes of exceptional extreme events.
ZüKoSt Zürcher Kolloquium über Statistik
Exceptional flood events: insights from three simulation approaches
HG G 19.2
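A toy illustration of the core idea, enlarging the sample by simulation: assuming, purely for illustration, that annual maximum discharge follows a Gumbel distribution, a 50-year record estimates the 100-year return level far less reliably than a large stochastic ensemble drawn from the same distribution:

```python
import numpy as np

rng = np.random.default_rng(1)
# Assumption for illustration only: annual maximum discharge is Gumbel(100, 25)
loc, scale = 100.0, 25.0

def return_level_empirical(x, T=100):
    return np.quantile(x, 1 - 1 / T)   # empirical (1 - 1/T) quantile

short_record = rng.gumbel(loc, scale, size=50)        # ~50 years of observations
large_ensemble = rng.gumbel(loc, scale, size=50_000)  # stochastically enlarged sample
true_level = loc - scale * np.log(-np.log(1 - 1 / 100))  # exact Gumbel quantile

print("true 100-year level:    ", round(true_level, 1))
print("50-year record estimate:", round(return_level_empirical(short_record), 1))
print("large-ensemble estimate:", round(return_level_empirical(large_ensemble), 1))
```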
Thu 07.03.2024
16:15-17:15
Elliot Young
The University of Cambridge
Abstract
We study partially linear models in settings where observations are arranged in independent groups but may exhibit within-group dependence. Existing approaches estimate linear model parameters through weighted least squares, with optimal weights (given by the inverse covariance of the response, conditional on the covariates) typically estimated by maximising a (restricted) likelihood from random effects modelling or by using generalised estimating equations. We introduce a new ‘sandwich loss’ whose population minimiser coincides with the weights of these approaches when the parametric forms for the conditional covariance are well-specified, but can yield arbitrarily large improvements in linear parameter estimation accuracy when they are not. Under relatively mild conditions, our weighted least squares (within a double machine learning framework) estimated coefficients are asymptotically Gaussian and enjoy minimal variance among estimators with weights restricted to a given class of functions, when user-chosen regression methods are used to estimate nuisance functions. We further expand the class of functional forms for the weights that may be fitted beyond parametric models by leveraging the flexibility of modern machine learning methods within a new gradient boosting scheme for minimising the sandwich loss. We demonstrate the effectiveness of both the sandwich loss and what we call ‘sandwich boosting’ in a variety of settings with simulated and real-world data.
Research Seminar in Statistics
Sandwich Boosting for accurate estimation in partially linear models for grouped data
HG G 19.1
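As a rough sketch of the weighting idea (not the authors' sandwich loss or boosting scheme), the following compares ordinary and weighted least squares for grouped data under an exchangeable working covariance; all data-generating numbers are invented. Both estimators are consistent; the point of good weights, and of the sandwich loss when the working form is misspecified, is variance reduction:

```python
import numpy as np

rng = np.random.default_rng(2)
n_groups, m = 200, 5
beta = 2.0
X, Y = [], []
for _ in range(n_groups):
    x = rng.normal(size=m)
    u = rng.normal() * 0.8      # shared group effect induces within-group correlation
    y = beta * x + u + rng.normal(size=m) * 0.5
    X.append(x)
    Y.append(y)

def wls_slope(X, Y, rho, sigma2=1.0):
    """Weighted least squares slope with exchangeable working covariance."""
    num = den = 0.0
    for x, y in zip(X, Y):
        k = len(x)
        S = sigma2 * ((1 - rho) * np.eye(k) + rho * np.ones((k, k)))
        W = np.linalg.inv(S)    # weights = inverse working covariance
        num += x @ W @ y
        den += x @ W @ x
    return num / den

print("OLS slope (rho=0):       ", round(wls_slope(X, Y, 0.0), 3))
print("weighted slope (rho=0.7):", round(wls_slope(X, Y, 0.7), 3))
```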
Thu 21.03.2024
16:15-17:15
Bryon Aragam
The University of Chicago Booth School of Business
Abstract
One of the key paradigm shifts in statistical machine learning over the past decade has been the transition from handcrafted features to automated, data-driven representation learning. A crucial step in this pipeline is to identify latent representations from observational data along with their causal structure. In many applications, the causal variables are not directly observed and must be learned from data, often using flexible, nonparametric models such as deep neural networks. These settings present new statistical and computational challenges that will be the focus of this talk. We will revisit the statistical foundations of nonparametric latent variable models as a lens into the problem of causal representation learning. We discuss our recent work on developing methods for identifying and learning causal representations from data with rigorous guarantees, and discuss how even basic statistical properties are surprisingly subtle. Along the way, we will explore the connections between causal graphical models, deep generative models, and nonparametric mixture models, and how these connections lead to a useful new theory for causal representation learning.
Research Seminar in Statistics
Research Seminar on Statistics - FDS Seminar joint talk: Statistical aspects of nonparametric latent variable models and causal representation learning
HG D 1.2
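One way to see why "even basic statistical properties are surprisingly subtle": already in the simplest latent variable model, a two-component Gaussian mixture, the latent classes are identifiable only up to permutation of the labels. A toy sketch with simulated data, far simpler than the nonparametric setting of the talk:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(3)
z = rng.integers(0, 2, size=500)                       # unobserved latent class
x = np.where(z == 1, rng.normal(3.0, 1.0, 500),
             rng.normal(-3.0, 1.0, 500)).reshape(-1, 1)

gmm = GaussianMixture(n_components=2, random_state=0).fit(x)
z_hat = gmm.predict(x)
# the latent labels are identifiable only up to permutation
acc = max(np.mean(z_hat == z), np.mean(z_hat == 1 - z))
print("recovery accuracy (up to label swap):", round(acc, 3))
print("estimated component means:", gmm.means_.ravel().round(2))
```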
Thu 11.04.2024
15:15-16:15
Arnoldo Frigessi
University of Oslo
Abstract
Ranking data are ubiquitous: we rank items as citizens, as consumers, as scientists, and we are collectively characterised, individually classified and recommended, based on estimates of our preferences. Preference data occur when we express comparative opinions about a set of items, by rating, ranking, pair comparing, liking, choosing or clicking, usually in an incomplete and possibly inconsistent way. The purpose of preference learning is to i) infer on the shared consensus preference of a group of users, or ii) estimate for each user their individual ranking of the items, when the user indicates only incomplete preferences; the latter is an important part of recommender systems. I present a Bayesian preference-learning framework based on the Mallows rank model with any right-invariant distance, to infer on the consensus ranking of a group of users, and to estimate the complete ranking of the items for each user. MCMC-based inference is possible, by importance-sampling approximation of the normalising function, but mixing can be slow. We propose a Variational Bayes approach to performing posterior inference, based on a pseudo-marginal approximating distribution on the set of permutations of the items. The approach scales well and has useful theoretical properties. Partial rankings and non-transitive pair comparisons are handled by Bayesian augmentation. The Bayes-Mallows approach produces well-calibrated uncertainty quantification of estimated preferences, which is useful for recommendation, leading to excellent accuracy and increased diversity compared, for example, to matrix factorisation. Simulations and real-world applications help illustrate the method. This talk is based on joint work with Elja Arjas, Marta Crispino, Qinghua Liu, Ida Scheel, Øystein Sørensen, and Valeria Vitelli.
ZüKoSt Zürcher Kolloquium über Statistik
Probabilistic preference learning from incomplete rank data
HG G 19.1
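The Mallows model assigns P(r | rho, alpha) proportional to exp(-alpha * d(r, rho)) for a ranking r, consensus rho and right-invariant distance d. A minimal sketch with four items and the Kendall distance, small enough that the normalising function and the consensus can be brute-forced; the talk's importance-sampling and variational machinery exists precisely because this does not scale:

```python
import itertools
import numpy as np

def kendall(r1, r2):
    """Number of discordant item pairs between two rankings (as orderings)."""
    pos1 = {item: i for i, item in enumerate(r1)}
    pos2 = {item: i for i, item in enumerate(r2)}
    d = 0
    for a, b in itertools.combinations(r1, 2):
        d += (pos1[a] - pos1[b]) * (pos2[a] - pos2[b]) < 0
    return d

rng = np.random.default_rng(4)
rho_true, alpha = (0, 1, 2, 3), 1.5
perms = list(itertools.permutations(rho_true))
# exact Mallows probabilities via enumeration (feasible only for few items)
w = np.array([np.exp(-alpha * kendall(p, rho_true)) for p in perms])
data = [perms[i] for i in rng.choice(len(perms), size=300, p=w / w.sum())]

# maximum-likelihood consensus: minimise total Kendall distance to the data
consensus = min(perms, key=lambda p: sum(kendall(p, r) for r in data))
print("estimated consensus:", consensus)   # should recover (0, 1, 2, 3)
```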
Fri 19.04.2024
15:15-16:15
Joshua Warren
Yale University
Abstract
Studies of the relationships between environmental exposures and adverse health outcomes often rely on a two-stage statistical modeling approach, where exposure is modeled/predicted in the first stage and used as input to a separately fit health outcome analysis in the second stage. Uncertainty in these predictions is frequently ignored, or accounted for in an overly simplistic manner when estimating the associations of interest. Working in the Bayesian setting, we propose a flexible kernel density estimation (KDE) approach for fully utilizing posterior output from the first stage modeling/prediction to make accurate inference on the association between exposure and health in the second stage, derive the full conditional distributions needed for efficient model fitting, detail its connections with existing approaches, and compare its performance through simulation. Our KDE approach is shown to generally have improved performance across several settings and model comparison metrics. Using competing approaches, we investigate the association between lagged daily ambient fine particulate matter levels and stillbirth counts in New Jersey (2011–2015), observing an increase in risk with elevated exposure 3 days prior to delivery. The newly developed methods are available in the R package KDExp.
ZüKoSt Zürcher Kolloquium über Statistik
A Bayesian framework for incorporating exposure uncertainty into health analyses with application to air pollution and stillbirth
HG G 19.1
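A schematic version of the two-stage idea (not the paper's full model; all numbers are invented): posterior draws of an exposure from a first-stage model are summarised by a KDE, whose density is then used to propagate exposure uncertainty into a toy second-stage health model instead of plugging in a point estimate:

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(5)
# Stage 1: pretend these are posterior draws of the true exposure at one site
exposure_draws = rng.normal(12.0, 2.0, size=2000)
kde = gaussian_kde(exposure_draws)   # smooth summary of the first-stage posterior

# Stage 2: propagate exposure uncertainty into a toy log-linear health model,
# log E[count] = b0 + b1 * exposure, with b1 fixed here for illustration
b0, b1 = 0.1, 0.05
x = np.linspace(4, 20, 400)
dx = x[1] - x[0]
dens = kde(x)
dens /= dens.sum() * dx              # renormalise on the grid
rate_integrated = (np.exp(b0 + b1 * x) * dens).sum() * dx
rate_plugin = np.exp(b0 + b1 * exposure_draws.mean())   # ignores uncertainty
print("rate averaging over exposure uncertainty:", round(rate_integrated, 4))
print("plug-in rate at the posterior mean:      ", round(rate_plugin, 4))
```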
Mon 22.04.2024
17:00-18:30
Max Simchowitz
MIT
Mohammad Lotfollahi
Cambridge University
Zhijing Jin
MPI and ETH Zurich
Discussant: Nicolai Meinshausen
ETH Zurich
Abstract
1) Statistical Learning under Heterogeneous Distribution Shift, Max Simchowitz, MIT
Abstract: What makes a trained predictor, e.g. neural network, more or less susceptible to performance degradation under distribution shift? In this talk, we will investigate a less well-studied factor: that of the statistical complexity of the individual features themselves. We will show that, for a very general class of predictors with a certain additive structure, empirical risk minimization is less sensitive to distribution shifts in "simple features" than "complex" ones, where simplicity/complexity are measured in terms of natural statistical quantities. We demonstrate that this arises because standard ERM learns the dependence on the "simpler" feature more quickly, whilst avoiding the risk of overfitting to more "complex" features. We will conclude by drawing connections to the orthogonal machine learning literature, and validating our theory on various experimental domains (even those in which the additivity assumption fails to hold).

2) Generative Machine Learning to Model Cellular Perturbations, Mohammad Lotfollahi, Cambridge University
Abstract: The field of cellular biology has long sought to understand the intricate mechanisms that govern cellular responses to various perturbations, be they chemical, physical, or biological. Traditional experimental approaches, while invaluable, often face limitations in scalability and throughput, especially when exploring the vast combinatorial space of potential cellular states. Enter generative machine learning that has shown exceptional promise in modeling complex biological systems. This talk will highlight recent successes, address the challenges and limitations of current models, and discuss the future direction of this exciting interdisciplinary field. Through examples of practical applications, we will illustrate the transformative potential of generative ML in advancing our understanding of cellular perturbations and in shaping the future of biomedical research.

3) A Paradigm Shift in Addressing Distribution Shifts: Insights from Large Language Models, Zhijing Jin, MPI and ETH Zurich
Abstract: Traditionally, the challenge of distribution shifts—where the training data distribution differs from the test data distribution—has been a central concern in statistical learning and model generalization. Traditional methods have primarily focused on techniques such as domain adaptation, and transfer learning. However, the rise of large language models (LLMs) such as ChatGPT has ushered in a novel empirical success, triggering a significant "shift" in problem formulation and approach for traditional distribution shift problems. In this talk, I will start with two formulations for LLMs: (1) the engineering heuristics aimed at transforming "out-of-distribution" (OOD) problems into "in-distribution" scenarios, which is further accompanied by (2) the hypothesized "emergence of intelligence" through massive scaling of data and model parameters, which challenges our traditional views on distribution shifts. I will sequentially examine these aspects, first by presenting behavioral tests of these models' generalization capabilities across unseen data, and then by conducting intrinsic checks to uncover the mechanisms LLMs learned. This talk seeks to provoke thoughts on several questions: Do the strategies of "making OOD problem IID" and facilitating the "emergence of intelligence" by scaling, truly stand up to scientific scrutiny? Furthermore, what do these developments imply for the field of statistical learning and the broader evolution of AI?

Discussant: Nicolai Meinshausen, ETH Zurich
Young Data Science Researcher Seminar Zurich
Joint webinar of the IMS New Researchers Group, Young Data Science Researcher Seminar Zürich, and the YoungStatS Project: Extrapolation to unseen domains: from theory to applications
Zoom
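The first talk's theme can be mimicked in a toy simulation (my construction, not the speaker's): with an additive target y = x1 + sin(3*x2), a fitted model that captures the linear "simple" feature exactly but approximates the "complex" one imperfectly degrades little when the distribution of x1 shifts and badly when that of x2 shifts:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(6)

def design(x1, x2):
    # additive design: x1 enters linearly ("simple"); x2 only through a
    # degree-5 polynomial basis, standing in for an imperfect "complex" fit
    return np.column_stack([x1] + [x2 ** k for k in range(1, 6)])

def make_data(n, shift1=0.0, shift2=0.0):
    x1 = rng.normal(shift1, 1.0, n)
    x2 = rng.normal(shift2, 1.0, n)
    y = x1 + np.sin(3 * x2) + rng.normal(0, 0.1, n)
    return design(x1, x2), y

Xtr, ytr = make_data(2000)
model = LinearRegression().fit(Xtr, ytr)

for name, kw in [("no shift", {}), ("shift simple x1", {"shift1": 2.0}),
                 ("shift complex x2", {"shift2": 2.0})]:
    Xte, yte = make_data(5000, **kw)
    mse = np.mean((model.predict(Xte) - yte) ** 2)
    print(f"{name:16s} test MSE = {mse:.3f}")
```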
Fri 26.04.2024
15:15-16:15
Richard De Veaux
Williams College
Abstract
As we are all too aware, organizations accumulate vast amounts of data from a variety of sources nearly continuously. Big data and data science advocates promise the moon and the stars as you harvest the potential of all these data. And now, AI threatens our jobs and perhaps our very existence. There is certainly a lot of hype. There’s no doubt that some savvy organizations are fueling their strategic decision making with insights from big data, but what are the challenges? Much can go wrong in the data science process, even for trained professionals. In this talk I'll discuss a wide variety of case studies from a range of industries to illustrate the potential dangers and mistakes that can frustrate problem solving and discovery, and that can unnecessarily waste resources. My goal is that by seeing some of the mistakes I (and others) have made, you will learn how to better take advantage of data insights without committing the "Seven Deadly Sins."
Research Seminar in Statistics
The Seven Deadly Sins of Data Science
HG G 19.1
Tue 07.05.2024
15:15-16:15
Zijian Guo
Rutgers University
Abstract
Empirical risk minimization may lead to poor prediction performance when the target distribution differs from the source populations. This talk discusses leveraging data from multiple sources and constructing more generalizable and transportable prediction models. We introduce an adversarially robust prediction model to optimize a worst-case reward concerning a class of target distributions and show that our introduced model is a weighted average of the source populations' conditional outcome models. We leverage this identification result to robustify arbitrary machine learning algorithms, including, for example, high-dimensional regression, random forests, and neural networks. In our adversarial learning framework, we propose a novel sampling method to quantify the uncertainty of the adversarially robust prediction model. Moreover, we introduce guided adversarially robust transfer learning (GART), which uses a small amount of target domain data to guide adversarial learning. We show that GART achieves a faster convergence rate than the model fitted with the target data. Our comprehensive simulation studies suggest that GART can substantially outperform existing transfer learning methods, attaining higher robustness and accuracy.
Short bio: Zijian Guo is an associate professor in the Department of Statistics at Rutgers University. He obtained a Ph.D. in Statistics in 2017 from the Wharton School, University of Pennsylvania. His research interests include causal inference, multi-source and transfer learning, high-dimensional statistics, and nonstandard statistical inference.
Research Seminar in Statistics
Adversarially Robust Learning: Identification, Estimation, and Uncertainty Quantification
HG G 19.2
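A drastically simplified sketch of the worst-case aggregation idea (my toy version, not the talk's estimator or its uncertainty quantification): with two sources that disagree on a regression slope, choose the slope minimising the worst-case excess risk across the sources; the solution hedges between them:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 2000
x = rng.normal(size=n)
y1 = 1.0 * x + rng.normal(0, 0.5, n)   # source 1: true slope 1
y2 = 3.0 * x + rng.normal(0, 0.5, n)   # source 2: true slope 3

b1 = (x @ y1) / (x @ x)                # per-source least-squares slopes
b2 = (x @ y2) / (x @ x)

def worst_case_excess_mse(b):
    # excess prediction MSE of slope b under each source, relative to
    # that source's own best slope; take the worse of the two
    return max((b - b1) ** 2, (b - b2) ** 2) * np.mean(x ** 2)

grid = np.linspace(0.0, 4.0, 401)
b_star = grid[np.argmin([worst_case_excess_mse(b) for b in grid])]
print("source slopes:", round(b1, 2), round(b2, 2))
print("maximin slope:", round(b_star, 2))   # hedges between the sources (~2)
```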
Thu 16.05.2024
15:15-16:15
Jiwei Zhao
University of Wisconsin–Madison
Abstract
In studies ranging from clinical medicine to policy research, complete data are usually available from a population P, but the quantity of interest is often sought for a related but different population Q. In this talk, we consider the unsupervised domain adaptation setting under the label shift assumption. In the first part, we estimate a parameter of interest in population Q by leveraging information from P, where three ingredients are essential: (a) the common conditional distribution of X given Y, (b) the regression model of Y given X in P, and (c) the density ratio of the outcome Y between the two populations. We propose an estimation procedure that only needs some standard nonparametric technique to approximate the conditional expectations with respect to (a), while requiring no estimate or model for (b) or (c); i.e., it is doubly flexible to model misspecification of both (b) and (c). In the second part, we pay special attention to the case where the outcome Y is categorical. In this scenario, traditional label shift adaptation methods either suffer from large estimation errors or require cumbersome post-prediction calibrations. To address these issues, we propose a moment-matching framework for adapting the label shift, and an efficient label shift adaptation method where the adaptation weights can be estimated by solving linear systems. We rigorously study the theoretical properties of our proposed methods. Empirically, we illustrate our proposed methods in the MIMIC-III database as well as in some benchmark datasets including MNIST, CIFAR-10, and CIFAR-100.
Research Seminar in Statistics
A Semiparametric Perspective on Unsupervised Domain Adaptation
HG G 19.1
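The moment-matching idea of the second part has a well-known linear-system form. This sketch follows the generic label shift recipe with an invented confusion matrix, not necessarily the authors' exact estimator: under label shift, the predicted-label distribution on the target satisfies mu = C^T q, so q, and hence the weights w = q/p, can be recovered by solving a linear system:

```python
import numpy as np

# hypothetical classifier behaviour on the source: rows = true class,
# columns = predicted class, C[i, j] = P(y_hat = j | y = i)
C = np.array([[0.8, 0.1, 0.1],
              [0.1, 0.7, 0.2],
              [0.1, 0.2, 0.7]])
p_y = np.array([0.5, 0.3, 0.2])   # source label distribution
q_y = np.array([0.2, 0.3, 0.5])   # target label distribution (unknown in practice)

# under label shift, the predicted-label distribution on the target is C^T q;
# in practice mu is estimated from classifier predictions on unlabeled target data
mu = C.T @ q_y
q_hat = np.linalg.solve(C.T, mu)  # moment matching: solve the linear system
w = q_hat / p_y                   # adaptation weights q(y) / p(y)
print("recovered q(y):  ", q_hat.round(3))
print("weights w = q/p: ", w.round(3))
```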
Mon 27.05.2024
17:30-18:30
Sebastien Bubeck
Microsoft
Abstract
Large language models (LLMs) have taken the field of AI by storm. But how large do they really need to be? I'll discuss the phi series of models from Microsoft, which exhibit many of the striking emergent properties of LLMs despite having merely a few billion parameters.
ETH-FDS Stiefel Lectures
Small Language Models
HG F 30