Statistics research seminar


Spring Semester 2010

Research Seminar in Statistics

Title: Graphical modelling for cancer signalling
Speaker, Affiliation: Sach Mukherjee, University of Warwick
Date, Time: 26 February 2010, 15:15-16:15
Location: HG G 19.1
Abstract: Networks of proteins called "signalling networks" play a key role in the control of diverse cellular processes; their aberrant functioning is heavily implicated in many diseases, including cancer. Aberrations in cancer cells are thought to perturb the normal connectivity of these networks, with important biological and therapeutic implications. Yet cancer-specific signalling remains poorly understood, especially at the level of relevant (post-translational) protein modifications. However, modern high-throughput biochemical technologies are now able to make measurements on components of these systems, and can, in principle, shed light on a variety of open questions concerning signalling in cancer. I will discuss statistical approaches for interpreting these data, in particular how graphical models can be used to integrate biochemical data and prior knowledge of signalling biology to facilitate the discovery process.
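
One standard way graphical models combine data with prior knowledge of signalling biology, as the abstract describes, is to add an informative network prior to the model score. The sketch below is my illustration only, not the speaker's method; the penalty form and all names (`log_network_prior`, `lam`, `G_ref`) are assumptions made for the example.

```python
import numpy as np

# Illustrative sketch only: score a candidate network by combining its data
# log-likelihood with a prior that penalizes disagreement with a reference
# signalling graph (an "informative network prior"). The energy form and
# names here are assumptions, not the talk's exact method.

def log_network_prior(G, G_ref, lam):
    """G, G_ref: 0/1 adjacency matrices (p x p); lam >= 0: prior strength.
    Log-prior, up to a constant, is -lam * (number of edge disagreements)."""
    return -lam * np.sum(G != G_ref)

def network_score(log_lik, G, G_ref, lam):
    # Posterior score up to an additive constant: log-likelihood + log-prior.
    # log_lik would come from fitting the graphical model G to protein data.
    return log_lik + log_network_prior(G, G_ref, lam)
```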

Research Seminar in Statistics

Title: Do We Really Need Both BEKK and DCC? A Tale of Two Multivariate GARCH Models
Speaker, Affiliation: Michael McAleer, Erasmus University Rotterdam
Date, Time: 5 March 2010, 15:15-16:15
Location: HG G 19.1
Abstract: The management and monitoring of very large portfolios of financial assets are routine for many individuals and organizations. The two most widely used models of conditional covariances and correlations in the class of multivariate GARCH models are BEKK and DCC. It is well known that BEKK suffers from the archetypal "curse of dimensionality", whereas DCC does not. It is argued in this paper that this is a misleading interpretation of the suitability of the two models for use in practice. The primary purpose of this paper is to analyze the similarities and dissimilarities between BEKK and DCC, both with and without targeting, on the basis of the structural derivation of the models, the availability of analytical forms for the sufficient conditions for existence of moments, sufficient conditions for consistency and asymptotic normality of the appropriate estimators, and computational tractability for ultra large numbers of financial assets. Based on theoretical considerations, the paper sheds light on how to discriminate between BEKK and DCC in practical applications.
Keywords: Conditional Correlations, Conditional Covariances, Diagonal Models, Forecasting, Generalized Models, Hadamard Models, Scalar Models, Targeting
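
The "curse of dimensionality" contrast can be made concrete by counting free parameters. The sketch below is mine, not the paper's, and assumes the usual counts for a full BEKK(1,1) with triangular intercept and a scalar DCC(1,1) with univariate GARCH(1,1) margins and correlation targeting.

```python
# Minimal sketch (assumptions above): free-parameter counts for N assets.

def bekk_params(n: int) -> int:
    # C: n(n+1)/2 intercept terms; full A and B matrices: n*n terms each.
    return n * (n + 1) // 2 + 2 * n * n

def dcc_params(n: int) -> int:
    # 3 parameters per univariate GARCH(1,1), plus the two scalar
    # correlation-dynamics parameters (a, b); the unconditional correlation
    # matrix is targeted with the sample estimate, not estimated by ML.
    return 3 * n + 2

for n in (10, 100, 1000):
    print(f"N={n:5d}  full BEKK: {bekk_params(n):>10,d}  scalar DCC: {dcc_params(n):,d}")
```

For N = 1000 assets this gives roughly two million BEKK parameters against about three thousand for scalar DCC, which is the tractability gap the abstract argues is a misleading basis for choosing between the models.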

Research Seminar in Statistics

Title: High-Dimensional Non-Linear Variable Selection through Hierarchical Kernel Learning
Speaker, Affiliation: Francis Bach, INRIA - Willow Project, Paris
Date, Time: 19 March 2010, 15:15-16:15
Location: HG G 19.1
Abstract: We consider the problem of high-dimensional non-linear variable selection for supervised learning. Our approach is based on performing linear selection among exponentially many appropriately defined positive definite kernels that characterize non-linear interactions between the original variables. To select efficiently from these many kernels, we use the natural hierarchical structure of the problem to extend the multiple kernel learning framework to kernels that can be embedded in a directed acyclic graph; we show that it is then possible to perform kernel selection through a graph-adapted sparsity-inducing norm, in polynomial time in the number of selected kernels. Moreover, we study the consistency of variable selection in high-dimensional settings, showing that under certain assumptions, our regularization framework allows a number of irrelevant variables which is exponential in the number of observations.
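
One reason "exponentially many" kernels indexed by subsets of variables can be tractable at all is an algebraic factorization. The check below is my sketch, assuming product kernels over variable subsets (not necessarily the talk's exact kernel family): the sum over all 2^p subsets collapses into a product of p factors.

```python
from itertools import combinations
import numpy as np

# Numerical check (illustration only) of the identity
#   sum over subsets V of {1..p} of  prod_{v in V} k_v(x_v, y_v)
#     =  prod_{v=1}^{p} (1 + k_v(x_v, y_v)),
# evaluated at one fixed pair of points (x, y).

rng = np.random.default_rng(0)
p = 4
kv = rng.random(p)  # base kernel values k_v(x_v, y_v) for the pair (x, y)

brute = sum(
    np.prod([kv[v] for v in V])       # empty subset contributes 1
    for r in range(p + 1)
    for V in combinations(range(p), r)
)
factored = np.prod(1 + kv)
print(brute, factored)                # equal up to floating-point rounding
```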

Research Seminar in Statistics

Title: Graphical Model Selection and Statistical Inference About Large Inverse Matrices
Speaker, Affiliation: Cun-Hui Zhang, Rutgers University, USA
Date, Time: 26 March 2010, 15:15-16:15
Location: HG G 19.1
Abstract: We propose a new method of learning a large symmetric target matrix whose inverse can be directly approximated by data. Our primary example of the target matrix is the inverse of a population correlation or covariance matrix. The proposed method, called GMACS, naturally leads to a uniform error bound which can be viewed as simultaneous confidence intervals for the elements of the target matrix from an oracle expert. Since the uniform error bound essentially turns a statistical experiment with indirect observations into one with direct observations, conservative solutions follow for a number of hypothesis testing and structure estimation problems, including graphical model selection and consistent estimation of the target under the spectrum and other norms. Our main ideas and results are applicable to general symmetric data and target matrices.

Research Seminar in Statistics

Title: Sparsity oracle inequalities for the mirror averaging aggregate
Speaker, Affiliation: Arnak Dalalyan, Ecole des Ponts ParisTech
Date, Time: * 16 April 2010, 15:00-16:15
Location: HG G 26.5
Abstract: (joint work with A.B. Tsybakov) We consider the problem of aggregating the elements of a (possibly infinite) dictionary for building a decision procedure that aims at minimizing a given criterion. Along with the dictionary, an independent identically distributed training sample is assumed available, on which the performance of a given procedure can be tested. In a fairly general set-up, we establish an oracle inequality for the mirror averaging aggregate based on any prior distribution. This oracle inequality is applied in the context of sparse coding for different tasks of statistics and machine learning such as regression, density estimation and binary classification.
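
As a rough illustration of the aggregate in question, the sketch below applies the mirror averaging recipe to a finite dictionary: exponential weights are formed from cumulative losses and a prior, and the weight vectors are averaged along the sample path. This is my reading of the construction, not the talk's code; the `beta` temperature and the array shapes are assumptions.

```python
import numpy as np

# Toy sketch of mirror averaging over a finite dictionary f_1, ..., f_M.

def mirror_averaging_weights(losses, prior, beta):
    """losses: (n, M) array, per-observation loss of each dictionary element;
    prior: (M,) prior distribution over the dictionary;
    beta > 0: temperature parameter of the exponential weights."""
    n, M = losses.shape
    cum = np.zeros(M)            # cumulative losses of observations seen so far
    avg = np.zeros(M)
    for k in range(n):
        w = prior * np.exp(-cum / beta)   # exponential weights before step k
        w /= w.sum()
        avg += w
        cum += losses[k]         # update cumulative losses after using them
    return avg / n               # averaged weights define the aggregate

# Usage: the aggregate prediction at x is sum_j weights[j] * f_j(x).
```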

Research Seminar in Statistics

Title: How to merge normal mixture components for cluster analysis
Speaker, Affiliation: Christian Hennig, University College London
Date, Time: 21 May 2010, 15:15-16:15
Location: HG G 19.1
Abstract: Normal mixture models are often used for cluster analysis. Usually, every component of the mixture is interpreted as a cluster. This, however, is often not appropriate: a mixture of two normal components can be unimodal and quite homogeneous. In particular, mixtures of several normals can be needed to approximate homogeneous non-normal distributions. Even if there are non-normal subpopulations in the data, the normal mixture model is still a good tool for clustering because of its flexibility.

This presentation is about methods to decide whether, after having fitted a normal mixture, several mixture components should be merged in order to be interpreted as a single cluster. Note that this cannot be formulated as a statistical estimation problem, because the likelihood and the general fitting quality of the model do not depend on whether single mixture components or sets of mixture components are interpreted as clusters. So any method depends on a specification of what the user wants to regard as a "cluster". There are at least two different cluster concepts, namely identifying clusters with modes (and therefore merging unimodal mixtures) and identifying clusters with clear patterns in the data (which, for example, means that scale mixtures, though unimodal, should not necessarily be merged). Furthermore, it has to be specified how strong a separation is required between different clusters.

The methods proposed and compared in this presentation are all hierarchical: from an estimated mixture, pairs of components (and later pairs of already merged mixtures) are merged until the members of a pair are separated enough to be interpreted as different clusters. This separation can be measured in many different ways, depending on the underlying cluster concept. Apart from the discussed methodology, some implications for how to think about cluster analysis problems in general will be discussed.
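
The abstract's claim that a mixture of two normal components can be unimodal is easy to verify numerically. The small check below is my illustration, not the speaker's code: it counts grid-approximated modes of the mixture density for a close pair and a well-separated pair of components.

```python
import numpy as np
from scipy.stats import norm

# Count local maxima of a normal-mixture density evaluated on a fine grid.

def count_modes(means, sds, weights, grid=np.linspace(-10, 10, 20001)):
    dens = sum(w * norm.pdf(grid, m, s)
               for w, m, s in zip(weights, means, sds))
    # interior grid points that exceed both neighbours are approximate modes
    return int(np.sum((dens[1:-1] > dens[:-2]) & (dens[1:-1] > dens[2:])))

print(count_modes(means=[0, 1], sds=[1, 1], weights=[0.5, 0.5]))  # 1: unimodal
print(count_modes(means=[0, 5], sds=[1, 1], weights=[0.5, 0.5]))  # 2: bimodal
```

A mode-based cluster concept would merge the two components in the first case but not the second, even though both are genuine two-component mixtures.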

Research Seminar in Statistics

Title: Statistical Analysis of Stellar Evolution
Speaker, Affiliation: David van Dyk, University of California
Date, Time: * 28 May 2010, 15:15-16:15
Location: HG G 26.5
Abstract: Color-Magnitude Diagrams (CMDs) are plots that compare the magnitudes (luminosities) of stars in different wavelengths of light (colors). Highly non-linear correlations among the mass, color, and surface temperature of newly formed stars induce a long narrow curved point cloud in a CMD known as the main sequence. Aging stars form new CMD groups of red giants and white dwarfs. The physical processes that govern this evolution can be described with mathematical models and explored using complex computer models. These calculations are designed to predict the plotted magnitudes as a function of parameters of scientific interest such as stellar age, mass, and metallicity. Here, we describe how we use the computer models as a component of a complex likelihood function in a Bayesian analysis that requires sophisticated computing, corrects for contamination of the data by field stars, accounts for complications caused by unresolved binary-star systems, and aims to compare competing physics-based computer models of stellar evolution. (Joint work with Steven DeGennaro, Nathan Stein, William H. Jeffery, Ted von Hippel, and Elizabeth Jeffery)

Research Seminar in Statistics

Title: Acyclic directed mixed graphs for binary data
Speaker, Affiliation: Thomas Richardson, University of Washington, Seattle
Date, Time: * 23 July 2010, 14:15-15:30
Location: HG G 19.1
Abstract: Acyclic directed mixed graphs (ADMGs), also known as semi-Markov models, represent the conditional independence structure induced on an observed margin by a DAG model with latent variables. Ancestral graphs (without undirected edges) are a subclass of ADMGs. In this talk we first present a parametrization of these models for binary data. We then describe a maximum-likelihood fitting algorithm that may be used for scoring.

Research Seminar in Statistics

Title: Distribution estimators and confidence intervals for Cavalieri estimators
Speaker, Affiliation: Dr. Johanna Ziegel, The University of Melbourne
Date, Time: * 1 September 2010, 14:15-16:00
Location: HG G 19.1
Abstract: Volume estimators based on Cavalieri's principle are widely used in the biosciences. For example in neuroscience, where volumetric measurements of brain structures are of interest, systematic samples of serial sections are obtained by magnetic resonance imaging or by a physical cutting procedure. The volume is then estimated by the sum over the areas of the structure of interest in the section planes multiplied by the width of the sections. Assessing the precision of such volume estimates is a question of great practical importance, but statistically a challenging task due to the strong spatial dependence of the data and typically small sample sizes. The approach we take is more ambitious than earlier methodologies, the goal of which has been estimation of the variance of a volume estimator \hat{v}, rather than estimation of the distribution of \hat{v}; see e.g. Cruz-Orive (1999), Gundersen et al. (1999), García-Fiñana and Cruz-Orive (2004), and Ziegel et al. (2010). We use a bootstrap method to obtain a consistent estimator of the distribution of \hat{v} conditional on the observed data. Confidence intervals are then derived from the distribution estimate. We treat the case where serial sections are exactly periodic, as well as the case where the physical cutting procedure introduces errors in the placement of the sampling points. To illustrate the performance of our method we conduct a simulation study with synthetic data and also apply our results to real data sets. Joint work with Peter Hall, The University of Melbourne.

References:
Cruz-Orive, L. M. (1999). Precision of Cavalieri sections and slices with local errors. J. Microsc., 193, 182-198.
García-Fiñana, M. and Cruz-Orive, L. M. (2004). Improved variance prediction for systematic sampling on R. Statistics, 38(3), 243-272.
Gundersen, H. J. G., Jensen, E. B. V., Kiêu, K., and Nielsen, J. (1999). The efficiency of systematic sampling - reconsidered. J. Microsc., 193, 199-211.
Ziegel, J., Baddeley, A., Dorph-Petersen, K.-A., and Jensen, E. B. V. (2010). Systematic sampling with errors in sample locations. Biometrika, 97, 1-13.
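
The estimator described in the abstract's opening has a one-line form: with sections a distance t apart and measured areas A_1, ..., A_n, the volume estimate is \hat{v} = t (A_1 + ... + A_n). A minimal sketch follows; the toy ball example and all names are mine, not from the talk.

```python
import numpy as np

# Cavalieri volume estimator: section spacing times the sum of section areas.

def cavalieri_volume(areas, t):
    """areas: measured cross-sectional areas of the structure in each
    section plane; t: spacing (width) between consecutive sections."""
    return t * np.sum(areas)

# Toy check on a ball of radius r = 1, sectioned systematically every t = 0.05:
r, t = 1.0, 0.05
heights = np.arange(-r + t / 2, r, t)        # systematic section positions
areas = np.pi * (r**2 - heights**2)          # circle area at each height
print(cavalieri_volume(areas, t))            # close to the true volume
print(4 / 3 * np.pi * r**3)                  # true volume 4*pi/3 ~ 4.18879
```

The talk's contribution concerns what this snippet does not show: estimating the full distribution of \hat{v} under the spatial dependence induced by systematic sampling, via the bootstrap method described above.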

Notes: Events marked with an asterisk (*) take place at a time and/or location that differs from the usual one.

Organizers: Peter Bühlmann, Leonhard Held, Hans-Rudolf Künsch, Marloes Maathuis, Sara van de Geer
