Seminar overview


Autumn Semester 2012

Date & Time Speaker Title Location
Tue 18.09.2012
15:00-16:15
Jodi Lapidus
Oregon Health & Science University, Portland, OR, USA
Abstract
Recently, investigations dedicated to identifying and evaluating biomarkers have increased considerably. Rather than searching for a single biomarker to diagnose disease or predict an outcome, studies often focus on combining information from multiple sources to improve classification. While ample classification methods have been proposed in the statistical literature, McIntosh and Pepe (2002) showed that decision rules based on the likelihood ratio function, or equivalently the risk score, are optimal. Logistic regression can be used to generate a risk score, and the c-statistic or area under the receiver operating characteristic curve (AUC) based on that risk score can then be used to assess classification performance. When many candidate biomarkers are collected (for example, a multi-analyte assay panel containing hundreds of proteins), it is labor-intensive to check all possible combinations. Additionally, large-scale cohort studies often collect a host of demographic, medical-history, and clinical information, as well as serum or other laboratory-based biomarkers, and these measures may be used to predict subsequent health outcomes. One could use standard variable selection methods (e.g. best subsets) to build classification/prediction models, but these do not guarantee optimal performance in terms of AUC. We propose a new procedure to select markers for inclusion in a logistic regression model based on improvement in AUC. The procedure begins by noting the equivalence of the non-parametric two-sample test statistic (Mann-Whitney U) and the AUC. We make use of the jagged ordered multivariate optimization algorithm for partial ROC curves outlined in Baker (2000) to select additional markers, and we build in a stopping rule based on the category-free version of the Net Reclassification Index (NRI) proposed by Pencina (2011). We will illustrate the algorithm using various datasets, including a protein biomarker discovery project based on a small preterm labor cohort, and predictors of fracture in a large multi-site cohort of community-dwelling aging US men.
ZüKoSt Zürcher Kolloquium über Statistik
A Variable Selection Method for Logistic Regression Models Based on the Receiver Operating Characteristic Curve
HG G 19.1
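As a rough illustration of the AUC-driven selection idea in the Lapidus abstract above, here is a minimal Python sketch of greedy forward selection by in-sample AUC. The function name and the AUC-gain threshold are our own inventions; a plain improvement threshold stands in for the jagged ordered multivariate optimization step and the NRI-based stopping rule of the actual procedure.

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score

    def forward_select_by_auc(X, y, min_auc_gain=0.005):
        """Greedily add the marker whose inclusion in the logistic
        regression risk score most improves the AUC; stop when the
        improvement falls below a threshold (a crude stand-in for the
        category-free NRI stopping rule)."""
        selected, remaining, best_auc = [], list(range(X.shape[1])), 0.5
        while remaining:
            gains = []
            for j in remaining:
                cols = selected + [j]
                model = LogisticRegression(max_iter=1000).fit(X[:, cols], y)
                auc_j = roc_auc_score(y, model.decision_function(X[:, cols]))
                gains.append((auc_j, j))
            auc, j = max(gains)
            if auc - best_auc < min_auc_gain:
                break
            selected.append(j)
            remaining.remove(j)
            best_auc = auc
        return selected, best_auc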
Thu 27.09.2012
16:15-17:30
Nicolas Städler
Nederlands Kanker Instituut, Amsterdam
Abstract
Early, pioneering applications of hidden Markov models (HMMs) to genome data (see [1]) considered univariate or low-dimensional observations (such as the gene sequence itself). In recent years, however, technological advances have begun to permit truly multivariate studies. For example, using technologies such as DamID [2] or ChIP-seq [3] it is now possible to measure the binding of proteins to the DNA across the entire genome for hundreds of proteins, and the dimensionality of such approaches continues to increase. In the moderate-to-large dimensional setting, estimation for HMMs remains challenging in practice, due to several concerns arising from the hidden nature of the states. We consider penalized estimation in HMMs with multivariate Normal observations. Penalization and the setting of associated parameters are non-trivial in this latent variable setting: we propose a penalty that automatically adapts to the number of states K and to the state-specific sample sizes, and that can cope with scaling issues arising from the unknown states. The methodology is adaptive and very general, applying in particular to both low- and high-dimensional settings. Furthermore, our approach explores the number of states K in an efficient manner by exploiting the relationship between parameter estimates for successive candidate values of K. We consider genome-wide binding data of 53 chromatin proteins in the embryonic Drosophila cell line Kc167 (data from [4]). We demonstrate the ability of our approach to yield substantial gains in predictive power and to deliver far richer estimates than currently used methods.
[1] Durbin, R., Eddy, S. R., Krogh, A. and Mitchison, G. J. (1998) Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge University Press.
[2] van Steensel, B. and Henikoff, S. (2000) Identification of in vivo DNA targets of chromatin proteins using tethered dam methyltransferase. Nature Biotechnology, 18, 424-428.
[3] Park, P. (2009) ChIP-seq: advantages and challenges of a maturing technology. Nature Reviews Genetics, 10, 669-680.
[4] Filion, G. J., van Bemmel, J. G., Braunschweig, U., Talhout, W., Kind, J., Ward, L. D., Brugman, W., de Castro, I. J., Kerkhoven, R. M., Bussemaker, H. J. and van Steensel, B. (2010) Systematic protein location mapping reveals five principal chromatin types in Drosophila cells. Cell, 143, 212-224.
ZüKoSt Zürcher Kolloquium über Statistik
Penalized hidden Markov models for high-dimensional genome analysis
HG G 19.1
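The penalized estimation described in the Städler abstract above can be pictured in the M-step of an EM algorithm: given state responsibilities from the forward-backward pass, each state's covariance is estimated under an l1 penalty whose strength shrinks with the effective state-specific sample size. The sketch below is ours, not the talk's estimator; the penalty scaling base_penalty / sqrt(n_k) is a made-up placeholder for the adaptive penalty proposed in the talk.

    import numpy as np
    from sklearn.covariance import graphical_lasso

    def penalized_m_step(X, resp, base_penalty=0.1):
        """Penalized M-step for an HMM with multivariate Normal emissions.
        resp[t, k] is the responsibility of state k for observation t."""
        n, p = X.shape
        K = resp.shape[1]
        means, covs = np.zeros((K, p)), np.zeros((K, p, p))
        for k in range(K):
            w = resp[:, k]
            n_k = w.sum()                         # effective sample size
            means[k] = w @ X / n_k
            Xc = X - means[k]
            emp_cov = (w[:, None] * Xc).T @ Xc / n_k
            # heavier shrinkage for states with little effective data
            covs[k], _ = graphical_lasso(emp_cov,
                                         alpha=base_penalty / np.sqrt(n_k))
        return means, covs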
Tue 02.10.2012
15:15-16:15
Shaowei Lin
University of California, Berkeley
Abstract
Many parameter estimation and integral approximation problems in machine learning suffer, not from the curse of dimensionality as commonly believed, but from the curse of singularities. A common way of overcoming such problems is regularization using sparse penalties. Recent developments in the learning theory of singular models might be the key to understanding this phenomenon. In this talk, we give a brief introduction to Sumio Watanabe's Singular Learning Theory, as outlined in his book "Algebraic Geometry and Statistical Learning Theory". We will learn how geometry and resolution of singularities help us approximate integrals efficiently.
Research Seminar in Statistics
Understanding the curse of singularities in machine learning
HG G 19.1
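For orientation on the Lin abstract above: the central result of Watanabe's singular learning theory is an asymptotic expansion of the Bayesian free energy (stochastic complexity). Stated here in its standard form (constants and conditions should be checked against the book), with $L_n$ the empirical log-loss, $w_0$ a true parameter, and $\varphi$ the prior,

    $$ -\log \int e^{-n L_n(w)}\,\varphi(w)\,dw \;=\; n L_n(w_0) + \lambda \log n - (m-1)\log\log n + O_p(1), $$

where the learning coefficient $\lambda$ is the real log canonical threshold obtained by resolving the model's singularities and $m$ is its multiplicity; for regular models $\lambda = d/2$ and $m = 1$, recovering the familiar BIC penalty.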
Fri 19.10.2012
15:15-16:30
Gabor Lugosi
Universitat Pompeu Fabra, Barcelona
Abstract
We consider the problem of detecting correlations in high-dimensional noisy data. Our goal is to understand the possibilities and limitations of such correlation detection problems. The mathematical analysis reveals some interesting phase transitions. We also discuss an interesting connection with random geometric graphs. (The talk is mostly based on joint work with Ery Arias-Castro and Sébastien Bubeck.)
Research Seminar in Statistics
Detection of correlations in high dimension
HG G 19.1
Thu 25.10.2012
16:15-17:30
Simon Barthelmé
Bernstein Center for Computational Neuroscience Berlin
Abstract
The measurement of eye movements is central to neuroscience and psychology, not only because of what eye movements reveal about the distribution of attention, but also for their own sake as a central aspect of motor behaviour. Eye movements are quite complex, but analysis often focuses on fixation locations: the locations at which the eyes stay still. We show how the analysis of fixation locations can be thought of as a spatial statistics problem, and how point process models can be used to characterise patterns of fixation. We also discuss how the time dimension can be integrated into the analysis through non-parametric Markov (in time) models, and how these can be treated in essentially the same way as inhomogeneous Poisson point process models. Joint work with Hans Trukenbrod (U Potsdam), Ralf Engbert (U Potsdam), and Felix Wichmann (U Tübingen).
ZüKoSt Zürcher Kolloquium über Statistik
Point process models for eye movements
HG G 19.1
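As a toy version of the point-process viewpoint in the Barthelmé abstract above, the following sketch (our own, with made-up argument names) estimates the intensity function of an inhomogeneous Poisson process from fixation locations by kernel smoothing on a grid.

    import numpy as np

    def fixation_intensity(points, grid_x, grid_y, bandwidth=1.0):
        """Gaussian-kernel estimate of the intensity lambda(x, y) of an
        inhomogeneous Poisson point process from fixation locations,
        given as an array of shape (n, 2)."""
        gx, gy = np.meshgrid(grid_x, grid_y)
        grid = np.column_stack([gx.ravel(), gy.ravel()])
        sq_dist = ((grid[:, None, :] - points[None, :, :]) ** 2).sum(-1)
        lam = np.exp(-sq_dist / (2 * bandwidth ** 2)).sum(axis=1)
        return (lam / (2 * np.pi * bandwidth ** 2)).reshape(gx.shape)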
Thu 01.11.2012
16:15-17:30
Oliver Sander
Novartis, Basel
Abstract
Clinical development of a new drug requires a series of complex decisions: for example, which study designs, doses, and dosing regimens to use, which patient characteristics to include, and, importantly, whether to continue or stop development at key milestones. These decisions are best supported by continuously integrating all available information along the development process. Non-linear mixed effects models provide an elegant framework for integrating relevant information, for example on drug dose, timing of doses, exposure, and clinical response, across multiple studies. Such a model-based approach makes the best use of the available longitudinal data, accounts for typical trends as well as for different sources of variability, takes an integrated rather than a study-by-study perspective, and allows for simulation in order to explore what-if scenarios. The talk will present the basics of non-linear mixed effects models, frequently used model types, and their applications in drug development projects.
ZüKoSt Zürcher Kolloquium über Statistik
Non-linear mixed effects models in drug development
HG G 19.1
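To make the Sander abstract above concrete, here is a small simulation from the kind of model the talk covers: a one-compartment oral-absorption PK model with log-normal between-subject variability and proportional residual error. All parameter values are invented for illustration; real analyses would fit such models with dedicated NLME software rather than this sketch.

    import numpy as np

    def simulate_pk(n_subjects=20, dose=100.0, ka=1.0, cl=3.0, v=20.0,
                    omega=0.3, sigma=0.1, seed=0):
        """Simulate concentration-time profiles from a one-compartment
        model; CL and V vary log-normally across subjects (the random
        effects) and measurements carry proportional error."""
        rng = np.random.default_rng(seed)
        times = np.linspace(0.5, 24.0, 12)
        profiles = []
        for _ in range(n_subjects):
            cl_i = cl * np.exp(rng.normal(0.0, omega))   # subject-level CL
            v_i = v * np.exp(rng.normal(0.0, omega))     # subject-level V
            ke = cl_i / v_i
            conc = dose * ka / (v_i * (ka - ke)) * (
                np.exp(-ke * times) - np.exp(-ka * times))
            profiles.append(conc * (1.0 + rng.normal(0.0, sigma, times.size)))
        return times, np.array(profiles)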
Fri 09.11.2012
15:15-16:30
Garvesh Raskutti
University of California, Berkeley, USA
Abstract
The phenomenon of overfitting is ubiquitous throughout statistics and is particularly problematic in non-parametric problems. As a result, regularization or shrinkage estimators are used. The early stopping strategy is to run an iterative algorithm for a fixed but finite number of iterations. Early stopping of iterative algorithms is known to achieve regularization, since it implicitly shrinks the solution of the un-regularized objective towards the starting point of the algorithm. When using early stopping as a strategy for regularization, a critical issue is determining when to stop. In this talk, I present an analysis of an iterative update corresponding to gradient descent applied to the non-parametric least-squares loss in an appropriately chosen coordinate system. In particular, for this iterative update, I present a computable, data-dependent stopping rule that achieves minimax optimal rates in mean-squared error over Sobolev spaces and finite-rank reproducing kernel Hilbert spaces (RKHSs). Importantly, the stopping rule does not require data-intensive methods such as cross-validation or hold-out data. This is joint work with my former advisors, Martin Wainwright and Bin Yu.
Research Seminar in Statistics
Early stopping of gradient descent for non-parametric regression: An optimal data-dependent stopping rule
HG G 19.1
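The Raskutti abstract above describes gradient descent on the least-squares loss with a data-dependent stopping time. The sketch below is our schematic reading of that idea for kernel regression: iterate on the fitted values and stop when an eigenvalue-based complexity term overtakes the shrinking bias term. The comparison used here paraphrases the published rule from memory, so the exact form and constants should be taken from the paper, not from this code.

    import numpy as np

    def early_stopped_kernel_gd(K, y, sigma, eta=None, max_iter=10000):
        """Gradient descent on the kernel least-squares fits with a
        stopping time driven by the empirical eigenvalues of K / n
        (schematic; constants differ from the published rule)."""
        n = K.shape[0]
        mu = np.linalg.eigvalsh(K / n)[::-1]     # eigenvalues, descending
        eta = eta if eta is not None else 1.0 / mu[0]
        f = np.zeros(n)
        for t in range(1, max_iter + 1):
            f = f + eta * (K / n) @ (y - f)      # one gradient step
            eps2 = 1.0 / (eta * t)               # shrinking squared radius
            complexity = np.sqrt(np.minimum(mu, eps2).sum() / n)
            if complexity > eps2 / (2 * np.e * sigma):
                return f, t                      # the two terms have crossed
        return f, max_iter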
Fri 16.11.2012
15:15-16:30
Ya'acov Ritov
Hebrew University, Jerusalem
Abstract
In this talk we present some empirical Bayes results. In particular, we consider a new approach to the very classical Poisson model and to the use of proxies (also known as covariates).
Research Seminar in Statistics
Some empirical Bayes results
HG G 19.1
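The "very classical Poisson model" in the Ritov abstract above is the compound setting in which Robbins' nonparametric empirical Bayes estimator is the textbook result; we recall it here only for background (the talk's new approach is not reproduced). If X_i | theta_i ~ Poisson(theta_i), then E[theta | X = x] is estimated by (x + 1) N(x + 1) / N(x), with N(x) the count of observations equal to x.

    import numpy as np

    def robbins(x):
        """Robbins' empirical Bayes estimate of the Poisson means:
        (x + 1) * #{X_i = x + 1} / #{X_i = x} for each observation."""
        x = np.asarray(x)
        counts = np.bincount(x, minlength=x.max() + 2)
        return (x + 1) * counts[x + 1] / counts[x]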
Fri 30.11.2012
15:15-16:15
Sebastian Reich
Universität Potsdam
Abstract
Sequential filtering relies on the propagation of uncertainty under a given model dynamics within a Monte Carlo (MC) setting, combined with the assimilation of observations using Bayes' theorem. The recursive application of Bayes' theorem within a dynamic MC framework poses major computational challenges. The popular class of sequential Monte Carlo methods (SMCMs) relies on a proposal step and an importance resampling step. However, SMCMs are subject to the curse of dimensionality, and alternative methods are needed for filtering in high dimensions. The ensemble Kalman filter (EnKF) has emerged as a promising alternative to SMCMs but is also known to lead to asymptotically inconsistent results. Following an introduction to sequential filtering, I will discuss a McKean approach to Bayesian inference and its implementation using optimal couplings. Applying this approach to sequential filtering leads to new perspectives on both SMCMs and EnKFs, as well as to novel filter algorithms.
Research Seminar in Statistics
Bayesian inference and sequential filtering: An optimal coupling of measures perspective
HG G 19.1
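For readers unfamiliar with the filters compared in the Reich abstract above, here is a minimal sketch (ours) of one EnKF analysis step with perturbed observations; the McKean/optimal-coupling algorithms of the talk are not reproduced here.

    import numpy as np

    def enkf_analysis(ensemble, y, H, R, rng):
        """One EnKF analysis step: each ensemble member (a column of
        `ensemble`) is nudged toward the perturbed data using the
        Kalman gain formed from the ensemble covariance."""
        n_ens = ensemble.shape[1]
        P = np.cov(ensemble)                              # ensemble covariance
        gain = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)   # Kalman gain
        y_pert = y[:, None] + rng.multivariate_normal(
            np.zeros(len(y)), R, size=n_ens).T            # perturbed obs
        return ensemble + gain @ (y_pert - H @ ensemble)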
Fri 30.11.2012
16:15-17:00
Aad van der Vaart
Leiden University
Abstract
Bayesian nonparametric procedures for function estimation (of densities, regression functions, drift functions, etc.) have been shown to perform well if some care is taken in the choice of the prior. Unlike priors on finite-dimensional parameters, many nonparametric priors do not "wash out" as the number of data points increases; but by introducing hyperparameters they can give reconstructions that adapt to the properties of large classes of true underlying functions, similar to the best non-Bayesian procedures for function estimation. Besides a reconstruction, a posterior distribution also gives a sense of the remaining uncertainty about the true parameter, through its spread. In practice, "credible sets", which are central sets of prescribed posterior probability, are often treated as if they were confidence sets. We present some results showing that this practice can be justified, but also results showing that it can be extremely misleading. The situation is particularly delicate if the prior is adapted through hyperparameters (by either empirical or hierarchical Bayes). General, non-Bayesian difficulties with nonparametric confidence sets play an important role in the resulting difficulties. Although the message of the results is thought to be general, the talk will be limited to the special case of prior distributions furnished by Gaussian processes.
Research Seminar in Statistics
Nonparametric Credible Sets
HG G 19.1
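The credible sets discussed in the van der Vaart abstract above arise naturally in Gaussian process regression, as in this minimal sketch (ours, with fixed hyperparameters; the delicate adaptive case discussed in the talk instead tunes them by empirical or hierarchical Bayes).

    import numpy as np

    def gp_credible_band(x, y, x_star, length=0.2, sigma=0.1):
        """Posterior mean and pointwise 95% credible band for regression
        under a squared-exponential Gaussian process prior."""
        k = lambda a, b: np.exp(-(a[:, None] - b[None, :]) ** 2
                                / (2 * length ** 2))
        Kxx = k(x, x) + sigma ** 2 * np.eye(len(x))
        Ks = k(x_star, x)
        mean = Ks @ np.linalg.solve(Kxx, y)
        # posterior variance: prior variance 1 minus the explained part
        var = np.maximum(1.0 - np.einsum('ij,ji->i', Ks,
                                         np.linalg.solve(Kxx, Ks.T)), 0.0)
        half = 1.96 * np.sqrt(var)
        return mean, mean - half, mean + half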
Fri 07.12.2012
15:15-16:30
Michael Wolf
Universität Zürich
Abstract
Many statistical applications require an estimate of a covariance matrix and/or its inverse. When the matrix dimension is large compared to the sample size, which happens frequently, the sample covariance matrix is known to perform poorly and may suffer from ill-conditioning. There already exists an extensive literature concerning improved estimators in such situations. In the absence of further knowledge about the structure of the true covariance matrix, the most successful approach so far, arguably, has been shrinkage estimation. Shrinking the sample covariance matrix to a multiple of the identity, by taking a weighted average of the two, turns out to be equivalent to linearly shrinking the sample eigenvalues to their grand mean, while retaining the sample eigenvectors. Our paper extends this approach by considering nonlinear transformations of the sample eigenvalues. We show how to construct an estimator that is asymptotically equivalent to an oracle estimator suggested in previous work. As demonstrated in extensive Monte Carlo simulations, the resulting bona fide estimator can result in sizeable improvements over the sample covariance matrix and also over linear shrinkage.
Research Seminar in Statistics
Nonlinear Shrinkage Estimation of Large-Dimensional Covariance Matrices
HG G 19.1
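The linear-shrinkage baseline that the Wolf abstract above generalizes can be written in two lines, and the equivalence stated in the abstract (shrinking the matrix toward the identity equals shrinking each sample eigenvalue linearly toward the grand mean while keeping the eigenvectors) is visible directly. The weight rho is left as an argument here; Ledoit and Wolf derive an asymptotically optimal choice.

    import numpy as np

    def linear_shrinkage(S, rho):
        """Shrink the sample covariance S toward mu * I, where mu is the
        grand mean of the sample eigenvalues (mu = trace(S) / p). The
        eigenvectors of S are unchanged, and each eigenvalue l becomes
        (1 - rho) * l + rho * mu."""
        p = S.shape[0]
        mu = np.trace(S) / p
        return (1 - rho) * S + rho * mu * np.eye(p)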
Mon 10.12.2012
15:00-16:00
Andreas Buja
The Wharton School, University of Pennsylvania
Abstract
It is common practice in statistical data analysis to perform data-driven variable selection and to derive statistical inference from the resulting model. Such inference enjoys none of the guarantees that classical statistical theory provides for tests and confidence intervals when the model has been chosen a priori. We propose to produce valid "post-selection inference" by reducing the problem to one of simultaneous inference, and hence by suitably widening conventional confidence and retention intervals. Simultaneity is required for all linear functions that arise as coefficient estimates in all submodels. By purchasing "simultaneity insurance" for all possible submodels, the resulting post-selection inference is rendered universally valid under all possible model selection procedures. This inference is therefore generally conservative for particular selection procedures, but it is always less conservative than full Scheffé protection. Importantly, it does not depend on the truth of the selected submodel, and hence it produces valid inference even in wrong models. We describe the structure of the simultaneous inference problem and give some asymptotic results. Joint work with Richard Berk, Larry Brown, Kai Zhang, and Linda Zhao.
Research Seminar in Statistics
Valid Post-Selection Inference
HG G 19.1
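The "simultaneity insurance" of the Buja abstract above can be sampled by brute force when the number of predictors is small: simulate the maximum absolute t-like statistic over every coefficient in every submodel and take its upper quantile. The sketch below (ours) assumes a known noise level sigma = 1; the actual work handles unknown sigma and much larger model universes.

    import numpy as np
    from itertools import combinations

    def posi_constant(X, n_sim=2000, alpha=0.05, seed=0):
        """Monte Carlo estimate of the constant K such that intervals
        estimate +/- K * se are simultaneously valid for all coefficients
        in all submodels (known sigma = 1, schematic only)."""
        rng = np.random.default_rng(seed)
        n, p = X.shape
        submodels = [list(c) for r in range(1, p + 1)
                     for c in combinations(range(p), r)]
        pinvs = [np.linalg.pinv(X[:, s]) for s in submodels]
        scales = [np.sqrt((P * P).sum(axis=1)) for P in pinvs]  # coef s.e.
        maxima = np.empty(n_sim)
        for b in range(n_sim):
            eps = rng.normal(size=n)
            maxima[b] = max(np.max(np.abs(P @ eps) / s)
                            for P, s in zip(pinvs, scales))
        return np.quantile(maxima, 1 - alpha)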
Fri 14.12.2012
14:15-15:30
Mohamed Hebiri
Université Paris-Est Marne-la-Vallée
Abstract
Sparse estimation methods based on ℓ1 relaxation, such as the Lasso and the Dantzig selector, are powerful tools for estimating high-dimensional linear models. However, in order to properly tune these methods, the variance of the noise is often required. This constitutes a major obstacle for practical applications of these methods in various frameworks (such as time series, random fields, and inverse problems) for which the noise is rarely homoscedastic and its level is hard to know in advance. In this paper, we propose a new approach to the joint estimation of the conditional mean and the conditional variance in a high-dimensional (auto-)regression setting. An attractive feature of our proposed estimator is that it is computable by solving a second-order cone program (SOCP). We present numerical results assessing the performance of the proposed procedure both on simulations and on real data. We also establish non-asymptotic risk bounds that are nearly as strong as those for ordinary ℓ1-penalized estimators. This is joint work with Arnak Dalalyan, Katia Meziani, and Joseph Salmon.
Research Seminar in Statistics
Learning heteroscedastic models via SOCP under group sparsity
HG G 19.1
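The simplest member of the SOCP family invoked in the Hebiri abstract above is the square-root Lasso, whose tuning parameter can be chosen without knowing the noise level; we sketch it here (with the cvxpy modelling package) only to make the "computable by solving a SOCP" claim concrete. The talk's estimator additionally models the conditional variance under group sparsity, which this sketch does not attempt.

    import numpy as np
    import cvxpy as cp

    def sqrt_lasso(X, y, lam):
        """Square-root Lasso: minimize ||y - X b||_2 + lam * ||b||_1.
        The first term becomes a second-order cone constraint in
        epigraph form, so the whole problem is an SOCP."""
        b = cp.Variable(X.shape[1])
        objective = cp.norm(y - X @ b, 2) + lam * cp.norm(b, 1)
        cp.Problem(cp.Minimize(objective)).solve()
        return b.value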
Thu 20.12.2012
16:15-17:30
Stefano Castruccio
Department of Statistics, The University of Chicago
Abstract
Climate sensitivity to anthropogenic forcing can be investigated with global climate models, which reproduce physical processes on a global scale and predict variables such as temperature. A collection of different runs (a model ensemble) can be obtained by setting different initial conditions and greenhouse gas concentrations. The purpose of this work is to show how the runs of a precomputed ensemble can be reproduced (emulated) with a global space/time statistical model that captures nonstationarities in latitude more effectively than current alternatives in the literature. Exploiting the gridded geometry of the data, the proposed algorithm is able to fit massive datasets with millions of observations within a few hours. In the last part of the talk, an application to the recent CMIP5 multi-model ensemble will be introduced and compared with reanalysis data. An extension to modeling land/ocean nonstationarities will also be discussed.
ZüKoSt Zürcher Kolloquium über Statistik
Space-time global models for climate ensembles
HG G 19.1