Seminar overview


Autumn Semester 2013

Date & Time Speaker Title Location
Mon 02.09.2013
11:15-12:00
Julien Gagneur
Ludwig Maximilian University of Munich (LMU)
Abstract
1. The interpretation of data-driven experiments in genomics often involves a search for biological categories that are enriched for the responder genes identified by the experiments. With Model-based Gene Set Analysis (MGSA), we tackle the problem by turning the question around: instead of searching for all significantly enriched groups, we search for a minimal set of groups that can explain the data. 2. Systems genetics with environment. We show how non-additive effects between genotype and environment can be exploited for causal inference in molecular networks. Using genome-wide perturbation assays in yeast, we experimentally demonstrate the validity of the approach. Bio: Julien Gagneur has a background in applied mathematics. His contributions include the development of computational methods for a wide range of genomic applications (metabolic and protein networks, gene set enrichment, transcription) and insights into gene regulation mechanisms from genome-wide data (cis-regulatory modules, antisense expression). His lab, started in July 2012 at the Gene Center in Munich, focuses on computational approaches to understanding mechanisms of gene regulation and their phenotypic impact from genome-wide assays. http://www.gagneur.genzentrum.lmu.de (A toy sketch of the minimal-set idea follows this entry.)
Research Seminar in Statistics
Model-based Gene Set Analysis and Systems genetics with environment
HG G 19.1
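A toy sketch of the "minimal explanatory set" idea from the Gagneur abstract above, written as a greedy set cover in Python. MGSA itself is a Bayesian model over category activation states; the greedy heuristic and the gene/category names below are invented purely for illustration.

# Toy version of MGSA's question: instead of testing every category for
# enrichment, look for a small set of categories that jointly explain the
# responder genes. (Greedy set cover; MGSA itself is Bayesian.)
def minimal_explaining_sets(responders, categories):
    """categories: dict mapping a category name to its set of genes."""
    uncovered = set(responders)
    chosen = []
    while uncovered:
        # pick the category that explains the most still-unexplained genes
        best = max(categories, key=lambda c: len(categories[c] & uncovered))
        if not categories[best] & uncovered:
            break  # no remaining category explains anything further
        chosen.append(best)
        uncovered -= categories[best]
    return chosen, uncovered

responders = {"g1", "g2", "g3", "g4"}
categories = {"ribosome": {"g1", "g2"},
              "cell cycle": {"g3", "g4", "g5"},
              "transport": {"g1"}}
print(minimal_explaining_sets(responders, categories))
# 'ribosome' and 'cell cycle' together explain all four responders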
Fri 20.09.2013
15:15-16:00
Robin Evans
University of Cambridge, UK
Abstract
Causal models based upon directed acyclic graphs (DAGs, or Bayesian networks) have gained wide attention over the past 20 years, but accounting for the effect of hidden variables in this context has proved extremely challenging. The resulting marginalized DAG models (mDAGs) fail to display many of the nice properties of ordinary DAGs, and they are difficult to describe mathematically. We introduce these models and give some recent results on their characterization. The nested Markov models of Richardson et al. (2013) provide approximations to the mDAG models which are much easier to work with; we show that the nested model is 'complete', in that it gives a complete algebraic description of the mDAG. If there is time, we will also discuss some methods for finding inequality constraints in mDAG models, and how these may be used (in principle) to distinguish between different causal hypotheses, even using only observational data.
Research Seminar in Statistics
Distinguishing between different causal models
HG G 19.1
Fri 27.09.2013
15:15-16:00
Sylvain Sardy
Université de Genève
Abstract
I derive new tests for fixed and random ANOVA based on a thresholded point estimate. The pivotal quantity is the threshold that sets all the coefficients of the null hypothesis to zero. Thresholding can be employed coordinatewise or blockwise, or both, which leads to tests with good power properties under alternative hypotheses that are either sparse or dense. (A Monte Carlo toy version follows this entry.)
Research Seminar in Statistics
Blockwise and coordinatewise thresholding to combine tests of different natures in modern ANOVA
HG G 19.1
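A toy Monte Carlo version of the construction in the Sardy abstract above: the test statistic is the smallest threshold that sets every coefficient under the null to zero (coordinatewise that is the largest absolute estimate, blockwise the block norm), and its null distribution is simulated. Standard-normal effect estimates under H0 are an assumption made purely for this illustration.

import numpy as np

rng = np.random.default_rng(0)

def threshold_stat(effects, blockwise=False):
    # smallest threshold that zeroes all coefficients at once
    effects = np.asarray(effects)
    return np.linalg.norm(effects) if blockwise else np.abs(effects).max()

def mc_pvalue(effects, n_mc=10_000, blockwise=False):
    obs = threshold_stat(effects, blockwise)
    null = np.array([threshold_stat(rng.standard_normal(len(effects)), blockwise)
                     for _ in range(n_mc)])
    return (1 + (null >= obs).sum()) / (1 + n_mc)

# a sparse alternative: one large effect among five
effects = [3.5, 0.1, -0.2, 0.0, 0.3]
print("coordinatewise p:", mc_pvalue(effects))
print("blockwise p     :", mc_pvalue(effects, blockwise=True))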
Tue 22.10.2013
12:15-13:00
Po-Ling Loh
Seminar für Statistik, ETH Zürich
Abstract
We present recent results concerning local optima of various regularized M-estimators, where both loss and penalty functions are allowed to be nonconvex. We show that whenever the loss function satisfies restricted strong convexity and the penalty function satisfies suitable regularity conditions, all local optima of the composite objective function lie within statistical precision of the true parameter vector. Applications of interest include the corrected Lasso for errors-in-variables models and regression in generalized linear models with nonconvex regularizers such as SCAD and MCP. We also show that a simple adaptation of composite gradient descent may be used to efficiently optimize such nonconvex objectives. This is joint work with Martin Wainwright. (A small numerical sketch follows this entry.)
Research Seminar in Statistics
Local optima of nonconvex regularized M-estimators
HG G 19.1
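A minimal sketch of composite gradient descent for a nonconvex regularizer, in the spirit of the Loh abstract above: least-squares loss plus the MCP penalty, whose proximal map is the firm-thresholding operator. The step size, lambda and gamma values are illustrative; the side constraints and theory of the Loh-Wainwright paper are omitted.

import numpy as np

def mcp_prox(z, t, lam, gamma):
    """Prox of the MCP penalty for step size t (requires gamma > t):
    soft-threshold at t*lam, rescale by 1/(1 - t/gamma), identity beyond gamma*lam."""
    soft = np.sign(z) * np.maximum(np.abs(z) - t * lam, 0.0)
    return np.where(np.abs(z) <= gamma * lam, soft / (1.0 - t / gamma), z)

def composite_gradient_descent(X, y, lam=0.1, gamma=3.0, iters=500):
    n, p = X.shape
    t = n / np.linalg.norm(X, 2) ** 2        # 1 / Lipschitz constant of the loss
    beta = np.zeros(p)
    for _ in range(iters):
        grad = X.T @ (X @ beta - y) / n      # gradient of 0.5/n * ||y - X beta||^2
        beta = mcp_prox(beta - t * grad, t, lam, gamma)
    return beta

rng = np.random.default_rng(1)
X = rng.standard_normal((200, 50))
beta_true = np.zeros(50); beta_true[:3] = [2.0, -1.5, 1.0]
y = X @ beta_true + 0.5 * rng.standard_normal(200)
print(np.round(composite_gradient_descent(X, y)[:5], 2))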
Thu 24.10.2013
16:15-17:00
Lieven Clement
Ghent University
Abstract
The advent of next-generation sequencing (seq) technology enables researchers to assess genome-wide 'omics profiles at an unprecedented resolution. The downstream statistical analysis is based on the number of sequenced reads mapping to the genomic regions of interest. The seq technology conceptually allows for generating count profiles at single-nucleotide resolution. The majority of algorithms for seq data focus either on segmentation or on differential analysis, but they seldom perform both tasks simultaneously. Most statistical methods for differential analysis aggregate counts based on existing annotation, which implies that all reads mapping to unannotated regions are discarded. Improvements are possible by developing methods that (a) provide inference at single-base resolution and (b) perform segmentation, discovery and differential analysis simultaneously. Within this context we explore the use of wavelet-based functional models. We first introduce wavelets and show how they can be used as building blocks in a functional model for count profiles. We then describe estimation and inference procedures. Finally, we illustrate our approach in a case study and show its potential for simultaneous discovery and differential analysis in sequencing studies. (A toy wavelet sketch follows this entry.)
ZüKoSt Zürcher Kolloquium über Statistik
Wavelet based functional models for 'omics count profiles
HG G 19.1
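A small Python sketch conveying the "wavelets as building blocks" idea from the Clement abstract above: Haar-transform a per-base count profile, hard-threshold the detail coefficients, and reconstruct a piecewise-smooth profile. The talk's actual models are likelihood-based functional models for counts; this hard-thresholding toy (with an arbitrary threshold) only illustrates the decomposition.

import numpy as np

def haar_forward(x):
    """Haar transform of a length-2^k signal: scaling coefficient + detail levels."""
    x = x.astype(float).copy()
    coeffs = []
    while len(x) > 1:
        s = (x[0::2] + x[1::2]) / np.sqrt(2)   # smooth part
        d = (x[0::2] - x[1::2]) / np.sqrt(2)   # detail part
        coeffs.append(d)
        x = s
    return x, coeffs

def haar_inverse(s, coeffs):
    x = s
    for d in reversed(coeffs):
        out = np.empty(2 * len(x))
        out[0::2] = (x + d) / np.sqrt(2)
        out[1::2] = (x - d) / np.sqrt(2)
        x = out
    return x

rng = np.random.default_rng(2)
# a toy count profile with one "differential region" of elevated coverage
profile = rng.poisson(lam=np.r_[np.full(32, 2.0), np.full(32, 10.0)])
s, details = haar_forward(profile)
details = [np.where(np.abs(d) > 3.0, d, 0.0) for d in details]  # illustrative threshold
print(np.round(haar_inverse(s, details), 1))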
Thu 31.10.2013
16:15-17:00
John Copas
University of Warwick, UK
Abstract
It is often suspected that outcomes in medical trials are selectively reported. A systematic review for a particular outcome of interest can only include trials where that outcome was reported, and may omit, for example, a trial which considers several outcome measures but only reports those giving significant results. Using information about studies considered in a systematic review but not included in the meta-analysis, I will describe a likelihood-based model for estimating the effect of outcome reporting bias on confidence intervals and p-values. Correcting for outcome reporting bias moves estimated treatment effects towards the null value and hence leads to more cautious assessments of significance. The bias can be very substantial, sometimes sufficient to completely overturn previous claims of significance. The seminar will be based on a forthcoming paper in Biostatistics: John Copas, Kerry Dwan, Jamie Kirkham and Paula Williamson (2014). A model-based correction for outcome reporting bias in meta-analysis. Biostatistics, to appear. (A toy simulation of the bias follows this entry.)
ZüKoSt Zürcher Kolloquium über Statistik
Correcting for outcome reporting bias in meta-analysis
HG F 33.1
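A toy Monte Carlo illustrating the bias the Copas abstract above addresses (this is not the Copas et al. likelihood model): when each trial reports an outcome only if it reached significance, the naive mean of the reported effects is pushed away from the null. All sample sizes and the unit-variance normal model are invented for the illustration.

import numpy as np

rng = np.random.default_rng(3)
true_effect, n_trials, n_per_arm = 0.1, 2000, 50
se = np.sqrt(2 / n_per_arm)                     # s.e. of a difference in means
est = rng.normal(true_effect, se, n_trials)     # per-trial effect estimates
reported = est[np.abs(est / se) > 1.96]         # only "significant" outcomes reported

print(f"true effect        : {true_effect:.3f}")
print(f"mean of all trials : {est.mean():.3f}")
print(f"mean of reported   : {reported.mean():.3f} "
      f"({len(reported)}/{n_trials} trials reported)")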
Fri 08.11.2013
15:15-16:00
Fabian Wauthier
University of Oxford, Department of Statistics
Abstract
The Lasso is a cornerstone of modern multivariate data analysis, yet its performance suffers in the common situation in which covariates are correlated. This limitation has led to a growing number of Preconditioned Lasso algorithms that pre-multiply X and y by matrices P_X, P_y prior to running the standard Lasso. A direct comparison of these and similar Lasso-style algorithms to the original Lasso is difficult because the performance of all of these methods depends critically on an auxiliary penalty parameter \lambda. We propose an agnostic, theoretical framework for comparing Preconditioned Lasso algorithms to the Lasso without having to choose \lambda, and apply it to three Preconditioned Lasso instances. (A schematic implementation follows this entry.)
Research Seminar in Statistics
A Comparative Framework for Preconditioned Lasso Algorithms
HG G 19.1
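A minimal Python sketch of the Preconditioned Lasso template from the Wauthier abstract above: form P_X X and P_y y, then run the ordinary Lasso on the transformed data (here via plain ISTA). The particular P below, which whitens the column space of X so the transformed design has orthonormal columns, is just one illustrative choice and not necessarily one of the three instances studied in the paper.

import numpy as np

def ista_lasso(X, y, lam, iters=2000):
    """Minimize 0.5*||y - X b||^2 + lam*||b||_1 by proximal gradient (ISTA)."""
    t = 1.0 / np.linalg.norm(X, 2) ** 2          # 1 / Lipschitz constant
    b = np.zeros(X.shape[1])
    for _ in range(iters):
        z = b - t * (X.T @ (X @ b - y))
        b = np.sign(z) * np.maximum(np.abs(z) - t * lam, 0.0)
    return b

def preconditioned_lasso(X, y, lam):
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    P = U @ np.diag(1.0 / s) @ U.T               # here P_X = P_y = P
    return ista_lasso(P @ X, P @ y, lam)

rng = np.random.default_rng(4)
n, p = 100, 20
C = 0.7 * np.ones((p, p)) + 0.3 * np.eye(p)      # strongly correlated covariates
X = rng.standard_normal((n, p)) @ np.linalg.cholesky(C).T
beta = np.zeros(p); beta[:2] = [1.0, -1.0]
y = X @ beta + 0.1 * rng.standard_normal(n)
print(np.round(preconditioned_lasso(X, y, lam=0.05)[:4], 2))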
Thu 14.11.2013
16:15-17:00
Werner Stahel
Seminar für Statistik, ETH Zürich
Abstract
Statistics is an important tool for the empirical sciences. As soon as deterministic formulas no longer suffice to describe reality, one tries to capture its regularities with models that incorporate randomness. Statistical regression models are fundamental for this. They start from the notion that there are universally valid laws of nature, and build in random deviations because our observations are not perfect. On the other hand, models are always "only models", that is, incomplete descriptions of reality. There is also the pragmatic approach, which does not aim to represent the "true relationships" but merely serves "prediction": we want to determine a "response variable" as well as possible when we know the values of other, related variables, the "input variables". In developing and applying statistical regression models we oscillate between pragmatic use and an interpretation as "true models". In the talk, I will use an example and some general considerations to show how one can deal with this dilemma.
ZüKoSt Zürcher Kolloquium über Statistik
The True Model!?
HG F 30
Fri 15.11.2013
15:15-16:00
Dennis Kristensen
University College London
Abstract
We develop a novel method for estimation and filtering of continuous-time models with stochastic volatility and jumps using so-called Approximate Bayesian Computation, which builds likelihoods based on limited information. The proposed estimators are computationally attractive relative to standard likelihood-based estimators since they rely on low-dimensional auxiliary statistics and so avoid computation of high-dimensional integrals. We also develop a simple filtering algorithm that allows one to track the latent volatility process in real time without any heavy computational burden. Despite their computational simplicity, we find that the estimators and filters perform well in practice and lead to precise estimates of model parameters and latent variables. We show how the methods can incorporate intra-daily information to improve on the estimation and filtering. In particular, the availability of realized volatility measures helps us in learning about parameters and latent states. The method is employed in the estimation of a flexible stochastic volatility model for the dynamics of the Stoxx50 equity index. We find evidence of the presence of jumps and in favor of a structural break in parameters. During the recent financial crisis, volatility has a higher mean and variance, and is less persistent than before the crisis. Jumps occur slightly less frequently, and are more likely to be negative when they do occur. (A toy ABC sketch follows this entry.)
Research Seminar in Statistics
Limited information likelihood inference in stochastic volatility jump-diffusion models
HG G 19.1
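A toy ABC rejection sampler in the spirit of the Kristensen abstract above: simulate the model, compare a low-dimensional vector of auxiliary statistics to that of the data, and keep the parameter draws that match, thereby avoiding the intractable likelihood. The AR(1) log-volatility model, the uniform priors, the two statistics and the tolerance are all illustrative choices, far simpler than the paper's.

import numpy as np

rng = np.random.default_rng(5)

def simulate_returns(phi, sigma_v, T=500):
    """Toy stochastic-volatility model: AR(1) log-volatility, Gaussian returns."""
    e = rng.standard_normal(T)
    h = np.zeros(T)
    for t in range(1, T):
        h[t] = phi * h[t - 1] + sigma_v * e[t]
    return np.exp(h / 2) * rng.standard_normal(T)

def summaries(r):
    # low-dimensional auxiliary statistics: scale and volatility clustering
    a = np.abs(r)
    return np.array([r.std(), np.corrcoef(a[:-1], a[1:])[0, 1]])

s_obs = summaries(simulate_returns(phi=0.95, sigma_v=0.3))   # "observed" data

accepted = []
for _ in range(2000):
    phi, sigma_v = rng.uniform(0.5, 0.999), rng.uniform(0.05, 0.8)
    if np.linalg.norm(summaries(simulate_returns(phi, sigma_v)) - s_obs) < 0.1:
        accepted.append((phi, sigma_v))

print(len(accepted), "draws accepted")
if accepted:
    print("approximate posterior mean (phi, sigma_v):",
          np.round(np.mean(accepted, axis=0), 3))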
Tue 26.11.2013
16:15-17:00
Volkan Cevher
EPFL, Lausanne
Abstract
We propose a variable metric framework for minimizing the sum of a self-concordant function and a possibly non-smooth convex function endowed with a computable proximal operator. We theoretically establish the convergence of our framework without relying on the usual Lipschitz gradient assumption on the smooth part. An important highlight of our work is a new set of analytic step-size selection and correction procedures based on the structure of the problem. We describe concrete algorithmic instances of our framework for several interesting large-scale applications, such as graph learning, Poisson regression with total variation regularization, and heteroscedastic LASSO. The technical parts of the presentation are available at http://arxiv.org/abs/1308.2867 (A sketch of the analytic step size follows this entry.)
Research Seminar in Statistics
Composite self-concordant minimization
HG G 19.2
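One piece of the Cevher abstract above can be shown compactly: for a self-concordant function, the damped Newton step with the analytic step size 1/(1 + Newton decrement) needs no line search. Below is a minimal Python sketch on the classic self-concordant log-barrier (computing the analytic center of a polytope {x : Ax < b}); the proximal/variable-metric machinery of the talk is omitted, and the polytope is invented.

import numpy as np

def analytic_center(A, b, iters=50):
    """Damped Newton on f(x) = -sum(log(b - A x)); assumes x = 0 is feasible."""
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        r = 1.0 / (b - A @ x)                    # barrier residuals
        grad = A.T @ r
        hess = A.T @ (r[:, None] ** 2 * A)
        step = np.linalg.solve(hess, grad)
        decrement = np.sqrt(grad @ step)         # Newton decrement lambda(x)
        x -= step / (1.0 + decrement)            # analytic, line-search-free step
        if decrement < 1e-8:
            break
    return x

A = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0], [1.0, 1.0]])
b = np.array([1.0, 1.0, 1.0, 1.0, 1.5])
print(np.round(analytic_center(A, b), 4))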
Thu 28.11.2013
16:15-17:00
Danny Williamson
University of Exeter, UK
Abstract
Uncertainty quantification for climate models in the form of Bayesian calibration requires a detailed assessment of structural error or model discrepancy: the difference between the model and the reality it seeks to describe, arising from imperfect knowledge of the physics and from compromises made for practical and computational reasons. However, many perceived structural biases in current climate models may not in fact be examples of this type of error, but may instead be due to non-optimal choices of the model parameters. Given these perceptions, how are statisticians to work with climate scientists in order to elicit probabilistic judgements for model discrepancy? In this talk I will discuss history matching, a method for ruling out regions of parameter space that lead to poor representations of climate. I will compare history matching with Bayesian calibration and show that it can be used to remove currently perceived structural biases and to identify the real sources of structural error, and that it may therefore be a vital tool in tuning and climate model development. I will apply history matching to the fully coupled, non-flux-adjusted atmosphere-ocean general circulation model HadCM3, and show that many perceived structural biases in ocean flows may not be structural at all. (A toy implausibility computation follows this entry.)
ZüKoSt Zürcher Kolloquium über Statistik
Identification and removal of structural biases in climate models, a statistical approach
HG G 19.1
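A minimal history-matching sketch in the spirit of the Williamson abstract above: rule out any parameter value x whose implausibility I(x) = |z - f(x)| / sqrt(observation variance + discrepancy variance) exceeds 3. A cheap toy function stands in for the climate model (or an emulator of it), and all numbers are invented.

import numpy as np

def model(x):                                   # toy stand-in for HadCM3 / emulator
    return 2.0 * x[..., 0] + np.sin(3.0 * x[..., 1])

z, var_obs, var_disc = 1.3, 0.05, 0.1           # observation and its error budget

rng = np.random.default_rng(6)
candidates = rng.uniform(-1, 1, size=(100_000, 2))
impl = np.abs(z - model(candidates)) / np.sqrt(var_obs + var_disc)
not_ruled_out = candidates[impl < 3.0]          # the "not ruled out yet" space
print(f"{len(not_ruled_out) / len(candidates):.1%} of parameter space survives")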
Fri 29.11.2013
15:15-16:00
Simon Broda
University of Amsterdam
Abstract
A large number of exact inferential procedures in statistics and econometrics involve the sampling distribution of ratios of random variables. If the denominator variable is positive, then tail probabilities of the ratio can be expressed as those of a suitably defined difference of random variables. If, in addition, the joint characteristic function of numerator and denominator is known, then standard Fourier inversion techniques can be used to reconstruct the distribution function. Most research in this field has been based on this correspondence, which, however, breaks down when both numerator and denominator are supported on the entire real line. The present manuscript derives inversion formulae and saddlepoint approximations that remain valid in this case and reduce to known results when the denominator is almost surely positive. Applications include the IV estimator of a structural parameter in a just-identified equation. (The basic correspondence is spelled out after this entry.)
Research Seminar in Statistics
On distributions of ratios
HG G 19.1
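For readers unfamiliar with the correspondence the Broda abstract above invokes, it can be written out (this is the standard argument, not the talk's extension): if $P(D>0)=1$, then $P(N/D \le t) = P(N - tD \le 0)$, so tail probabilities of the ratio are those of the difference $X_t = N - tD$. If the joint characteristic function $\varphi_{N,D}(u,v) = E[e^{i(uN+vD)}]$ is known, then $\varphi_{X_t}(u) = \varphi_{N,D}(u,-tu)$, and Gil-Pelaez inversion gives $P(X_t \le 0) = \frac{1}{2} - \frac{1}{\pi}\int_0^\infty \mathrm{Im}[\varphi_{X_t}(u)]\,\frac{du}{u}$. It is the first equality that fails when $P(D<0)>0$; the talk's formulae remain valid in that case.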
Fri 29.11.2013
16:15-17:00
Thomas Mikosch
University of Copenhagen
Abstract
We give an asymptotic theory for the eigenvalues of the sample covariance matrix of a multivariate time series. The time series constitutes a linear process across time and between components. The input noise of the linear process has regularly varying tails with index $\alpha\in (0,4)$; in particular, the time series has infinite fourth moment. We derive the limiting behavior of the largest eigenvalues of the sample covariance matrix and show point process convergence of the normalized eigenvalues. The limiting process has an explicit form involving points of a Poisson process and eigenvalues of a non-negative definite matrix. Based on this convergence we derive limit theory for a host of other continuous functionals of the eigenvalues, including the joint convergence of the largest eigenvalues, the joint convergence of the largest eigenvalue and the trace of the sample covariance matrix, and the ratio of the largest eigenvalue to the sum of all eigenvalues. This is joint work with Richard A. Davis (Columbia NY) and Oliver Pfaffel (Munich). (A quick simulation follows this entry.)
Research Seminar in Statistics
Asymptotic theory for the sample covariance matrix of a heavy-tailed multivariate time series
HG G 19.1
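A quick simulation sketch of the phenomenon behind the Mikosch abstract above, in the simplest i.i.d. case rather than the paper's linear-process setting: with tail index $\alpha < 2$, the largest eigenvalue of $X^\top X$ is driven by the extremes and is asymptotically of the order of the largest squared entry of $X$. The symmetrized Pareto noise and all parameters are illustrative.

import numpy as np

rng = np.random.default_rng(7)
n, p, alpha = 100_000, 5, 1.5                 # alpha < 2: infinite variance
# i.i.d. regularly varying entries: Pareto magnitudes with random signs
X = rng.pareto(alpha, size=(n, p)) * rng.choice([-1.0, 1.0], size=(n, p))
eig = np.linalg.eigvalsh(X.T @ X)             # only a p x p eigenproblem
print(f"largest eigenvalue   : {eig[-1]:.3e}")
print(f"largest squared entry: {(X ** 2).max():.3e}")  # typically of the same order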
Thu 05.12.2013
16:15-17:00
Jon Wakefield
University of Washington
Abstract
For many diseases, clinical illness can arise as the result of different genetic or viral pathogens. For example, cases of hand, foot and mouth disease (HFMD) arise due to different enteroviruses and can be clinically classified as either mild or severe. While clinical illness may be measured on the majority of a population, along with disease severity, the specific pathogen responsible will often be recorded for only a small subsample of individuals, sampled on the basis of disease severity. This talk will have two halves. In the first half, we develop models for key transmission probabilities that allow an understanding of the transmission of the multiple pathogens in this non-random sampling setting. In the second half, we develop designs and inferential methods for community intervention vaccination trials in which direct, indirect, overall and total effects are estimated. Vaccines may target one pathogen, and we build on the modeling in the first half, since the vaccine efficacy measures are functions of the probabilities that are modeled within this aim. This is joint work with Leigh Fisher, Cici Chen and Steve Self.
ZüKoSt Zürcher Kolloquium über Statistik
The Modeling of Pathogen-Specific Counts for Hand, Foot and Mouth Disease
HG G 19.1
Fri 06.12.2013
15:15-16:00
Mikyoung Jun
Texas A&M University
Abstract
Many spatial processes in environmental applications, such as climate variables and climate model errors on a global scale, exhibit complex nonstationary dependence structure, not only in their marginal covariance but also in their cross-covariance. Flexible cross-covariance models for processes on a global scale are critical for an accurate description of each spatial process as well as their cross-dependence, and for improved prediction. We propose various ways of producing cross-covariance models, based on the Matérn covariance model class, that are suitable for describing prominent nonstationary characteristics of global processes. In particular, we seek nonstationary versions of Matérn covariance models whose smoothness parameters vary over space, coupled with a differential operators approach for modeling large-scale nonstationarity. We compare their performance to some existing models in terms of the AIC and spatial prediction in two application problems: joint modeling of surface temperature and precipitation, and joint modeling of errors of climate model ensembles. (A short implementation of the Matérn kernel follows this entry.)
Research Seminar in Statistics
Matérn-based nonstationary cross-covariance models for global processes
HG G 19.1
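For reference, the building block named in the Jun abstract above, as a short Python function: the stationary Matérn covariance with variance sigma2, range rho and smoothness nu (the talk's nonstationary versions let the smoothness vary over space). Parameter values below are illustrative.

import numpy as np
from scipy.special import gamma, kv

def matern(d, sigma2=1.0, rho=1.0, nu=1.5):
    """Matern covariance C(d) = sigma2 * 2^(1-nu)/Gamma(nu) * s^nu * K_nu(s),
    with s = sqrt(2 nu) d / rho; expects a 1-D array of distances d >= 0."""
    d = np.asarray(d, dtype=float)
    scaled = np.sqrt(2 * nu) * d / rho
    out = np.full_like(d, sigma2)                # C(0) = sigma2
    nz = scaled > 0
    out[nz] = (sigma2 * 2 ** (1 - nu) / gamma(nu)
               * scaled[nz] ** nu * kv(nu, scaled[nz]))
    return out

print(np.round(matern(np.array([0.0, 0.5, 1.0, 2.0]), nu=0.5), 4))
# nu = 0.5 reduces to the exponential covariance exp(-d / rho)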