Seminar overview

Spring Semester 2010

Fri 26.02.2010
15:15-16:15
Sach Mukherjee
University of Warwick
Abstract
Networks of proteins called "signalling networks" play a key role in the control of diverse cellular processes; their aberrant functioning is heavily implicated in many diseases, including cancer. Aberrations in cancer cells are thought to perturb the normal connectivity of these networks, with important biological and therapeutic implications. Yet cancer-specific signalling remains poorly understood, especially at the level of relevant (post-translational) protein modifications. However, modern high-throughput biochemical technologies are now able to make measurements on components of these systems, and can, in principle, shed light on a variety of open questions concerning signalling in cancer. I will discuss statistical approaches for interpreting these data, in particular how graphical models can be used to integrate biochemical data and prior knowledge of signalling biology to facilitate the discovery process.
Research Seminar in Statistics
Graphical modelling for cancer signalling
HG G 19.1
Fri 05.03.2010
15:15-16:15
Michael McAleer
Erasmus University Rotterdam
Abstract
The management and monitoring of very large portfolios of financial assets are routine for many individuals and organizations. The two most widely used models of conditional covariances and correlations in the class of multivariate GARCH models are BEKK and DCC. It is well known that BEKK suffers from the archetypal "curse of dimensionality", whereas DCC does not. It is argued in this paper that this is a misleading interpretation of the suitability of the two models for use in practice. The primary purpose of this paper is to analyze the similarities and dissimilarities between BEKK and DCC, both with and without targeting, on the basis of the structural derivation of the models, the availability of analytical forms for the sufficient conditions for existence of moments, sufficient conditions for consistency and asymptotic normality of the appropriate estimators, and computational tractability for ultra-large numbers of financial assets. Based on theoretical considerations, the paper sheds light on how to discriminate between BEKK and DCC in practical applications.
Keywords: Conditional Correlations, Conditional Covariances, Diagonal Models, Forecasting, Generalized Models, Hadamard Models, Scalar Models, Targeting
Research Seminar in Statistics
Do We Really Need Both BEKK and DCC? A Tale of Two Multivariate GARCH Models
HG G 19.1
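To make the modelling choice above concrete: DCC keeps estimation tractable by driving all conditional correlations with two scalar parameters. Below is a minimal sketch of the scalar DCC recursion with correlation targeting, assuming GARCH-standardized residuals eps and illustrative parameter values a and b; this is a textbook implementation, not code from the paper.

    import numpy as np

    def dcc_correlations(eps, a, b):
        # Scalar DCC recursion with correlation targeting:
        #   Q_t = (1 - a - b) * Qbar + a * e_{t-1} e_{t-1}' + b * Q_{t-1}
        #   R_t = diag(Q_t)^{-1/2} Q_t diag(Q_t)^{-1/2}
        # eps: (T, N) array of GARCH-standardized residuals; a, b >= 0, a + b < 1.
        T, N = eps.shape
        Qbar = eps.T @ eps / T            # "targeted" unconditional correlation
        Q = Qbar.copy()
        R = np.empty((T, N, N))
        for t in range(T):
            d = 1.0 / np.sqrt(np.diag(Q))
            R[t] = Q * np.outer(d, d)     # rescale Q_t to a correlation matrix
            Q = (1 - a - b) * Qbar + a * np.outer(eps[t], eps[t]) + b * Q
        return R

    # illustrative call on simulated residuals
    rng = np.random.default_rng(0)
    print(dcc_correlations(rng.standard_normal((500, 3)), a=0.05, b=0.90)[-1])

Targeting, i.e. replacing Qbar by its sample analogue rather than estimating it, is one of the model features the paper compares across BEKK and DCC.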
Thu 11.03.2010
16:15-17:30
Willi Maurer
Novartis Pharma AG
Abstract
In confirmatory clinical trials the Type I error rate must be controlled for claims forming the basis for approval and labelling of a new drug. Strong control of the familywise error rate is usually needed for families of hypotheses related to the claims. Such families of hypotheses arise naturally from comparing several treatments with a control, combined non-inferiority and superiority testing for primary and secondary variables, the presence of multiple primary or secondary endpoints, and combinations thereof. A variety of sequentially rejective, weighted Bonferroni-type tests have been proposed for this purpose, such as parallel and serial gatekeeping procedures. They allow, in principle, mapping the partially hierarchical structure of the hypotheses' importance onto multiple test procedures. Since these procedures are based on the closed testing principle, they require the explicit specification of a large number of intersection hypothesis tests. For a procedure defined this way it is difficult to recognize the basic principle behind the choice, and vice versa. In this talk a graphical approach* for constructing and performing the test procedure is presented. The relative importance of the hypotheses and their partially hierarchical structure are mapped onto weighted directed graphs whose vertices represent the hypotheses. An update algorithm for the graphs enables sequentially rejective testing. Some properties of procedures generated this way are discussed and illustrated with the visualization of common gatekeeping strategies and of procedures for more complex testing situations that have arisen in clinical practice. * Bretz, F., Maurer, W., Brannath, W., and Posch, M. (2009), "A graphical approach to sequentially rejective multiple test procedures," Statistics in Medicine, 28, 586-604.
ZüKoSt Zürcher Kolloquium über Statistik
A Graphical Approach to Sequentially Rejective Multiple Test Procedures and its Application in Clinical Trials
HG G 19.1
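The update algorithm from the referenced paper is simple enough to state in a few lines. Below is a sketch of the sequentially rejective graphical procedure of Bretz et al. (2009): hypotheses are vertices carrying weights w, and the matrix G holds the edge weights along which the local significance level of a rejected hypothesis is propagated.

    import numpy as np

    def graphical_mtp(p, w, G, alpha=0.05):
        # p: raw p-values; w: initial hypothesis weights (sum <= 1);
        # G: transition matrix (zero diagonal, row sums <= 1).
        # Returns boolean rejections with strong FWER control at level alpha.
        m = len(p)
        w = np.asarray(w, float).copy()
        G = np.asarray(G, float).copy()
        rej = np.zeros(m, bool)
        while True:
            cand = [j for j in range(m) if not rej[j] and p[j] <= w[j] * alpha]
            if not cand:
                return rej
            j = cand[0]
            rej[j] = True
            wj, w[j] = w[j], 0.0
            for l in range(m):            # pass on the local level of H_j
                if not rej[l]:
                    w[l] += wj * G[j, l]
            Gn = np.zeros_like(G)         # rewire the remaining graph
            for l in range(m):
                for k in range(m):
                    if l != k and not rej[l] and not rej[k]:
                        den = 1.0 - G[l, j] * G[j, l]
                        Gn[l, k] = (G[l, k] + G[l, j] * G[j, k]) / den if den > 0 else 0.0
            G = Gn

    # equal weights on a complete two-vertex graph: this is Holm's procedure
    print(graphical_mtp([0.01, 0.04], w=[0.5, 0.5], G=[[0, 1], [1, 0]]))

With equal weights and a complete graph the procedure reduces to Holm's step-down test, which is a useful sanity check.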
Fri 19.03.2010
15:15-16:15
Francis Bach
INRIA - Willow Project, Paris
Abstract
We consider the problem of high-dimensional non-linear variable selection for supervised learning. Our approach is based on performing linear selection among exponentially many appropriately defined positive definite kernels that characterize non-linear interactions between the original variables. To select efficiently from these many kernels, we use the natural hierarchical structure of the problem to extend the multiple kernel learning framework to kernels that can be embedded in a directed acyclic graph; we show that it is then possible to perform kernel selection through a graph-adapted sparsity-inducing norm, in polynomial time in the number of selected kernels. Moreover, we study the consistency of variable selection in high-dimensional settings, showing that under certain assumptions, our regularization framework allows a number of irrelevant variables which is exponential in the number of observations.
Research Seminar in Statistics
High-Dimensional Non-Linear Variable Selection through Hierarchical Kernel Learning
HG G 19.1
Fri 26.03.2010
15:15-16:15
Cun-Hui Zhang
Rutgers University, USA
Abstract
We propose a new method of learning a large symmetric target matrix whose inverse can be directly approximated by data. Our primary example of the target matrix is the inverse of a population correlation or covariance matrix. The proposed method, called GMACS, naturally leads to a uniform error bound which can be viewed as simultaneous confidence intervals for the elements of the target matrix from an oracle expert. Since the uniform error bound essentially turns a statistical experiment with indirect observations into one with direct observations, conservative solutions follow for a number of hypothesis testing and structure estimation problems, including graphical model selection and consistent estimation of the target under the spectrum norm and other norms. Our main ideas and results are applicable to general symmetric data and target matrices.
Research Seminar in Statistics
Graphical Model Selection and Statistical Inference About Large Inverse Matrices
HG G 19.1
Thu 15.04.2010
16:15-17:30
Arnak Dalalyan
Ecole des Ponts Paris Tech
Abstract
(joint work with R. Keriven) We propose a new approach to the problem of robust estimation for an inverse problem arising in multiview geometry. Inspired by recent advances in the statistical theory of recovering sparse vectors, we define our estimator as a Bayesian maximum a posteriori with a multivariate Laplace prior on the vector describing the outliers. This leads to an estimator in which the fidelity to the data is measured by the L_\infty-norm, while the regularization is done by the L_1-norm. The proposed procedure is fairly fast, since the outlier removal is done by solving one linear program (LP). An important difference compared to existing algorithms is that our estimator requires specifying neither the number nor the proportion of the outliers; only an upper bound on the maximal measurement error for the inliers needs to be specified. We present theoretical results assessing the accuracy of our procedure, as well as numerical examples illustrating its efficiency on synthetic and real data.
ZüKoSt Zürcher Kolloquium über Statistik
Robust Estimation for an Inverse Problem Arising in Multiview Geometry
HG G 19.1
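To illustrate the L_\infty/L_1 construction in the abstract above, here is a generic sketch for a plain linear model with sparse outliers; the multiview-geometry formulation in the talk is more involved, and the names, the noise bound sigma, and the toy data below are illustrative assumptions.

    import numpy as np
    from scipy.optimize import linprog

    def robust_fit(X, y, sigma):
        # Minimize ||omega||_1 subject to ||y - X @ theta - omega||_inf <= sigma,
        # written as one LP in (theta, omega_pos, omega_neg).
        n, d = X.shape
        I = np.eye(n)
        c = np.concatenate([np.zeros(d), np.ones(2 * n)])   # sum of |omega_i|
        A = np.vstack([np.hstack([ X,  I, -I]),             # -(residual) <= sigma
                       np.hstack([-X, -I,  I])])            #   residual  <= sigma
        b = np.concatenate([sigma + y, sigma - y])
        bnd = [(None, None)] * d + [(0, None)] * (2 * n)
        res = linprog(c, A_ub=A, b_ub=b, bounds=bnd, method="highs")
        theta = res.x[:d]
        omega = res.x[d:d + n] - res.x[d + n:]
        return theta, np.abs(omega) > 1e-8                  # estimate, outlier flags

    # illustrative data: a noisy line with two gross outliers
    rng = np.random.default_rng(1)
    X = np.column_stack([np.ones(50), np.linspace(0, 1, 50)])
    y = X @ np.array([1.0, 2.0]) + rng.uniform(-0.05, 0.05, 50)
    y[[3, 40]] += 5.0
    print(robust_fit(X, y, sigma=0.05))

Only the inlier error bound sigma is supplied; as in the abstract, neither the number nor the proportion of outliers needs to be specified.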
Fri 16.04.2010
15:00-16:15
Arnak Dalalyan
Ecole des Ponts Paris Tech
Abstract
(joint work with A.B. Tsybakov) We consider the problem of aggregating the elements of a (possibly infinite) dictionary for building a decision procedure that aims at minimizing a given criterion. Along with the dictionary, an independent identically distributed training sample is assumed to be available, on which the performance of a given procedure can be tested. In a fairly general set-up, we establish an oracle inequality for the Mirror Averaging aggregate based on any prior distribution. This oracle inequality is applied in the context of sparse coding for different tasks of statistics and machine learning such as regression, density estimation and binary classification.
Research Seminar in Statistics
Sparsity oracle inequalities for the mirror averaging aggregate
HG G 26.5
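For a finite dictionary, one common form of the Mirror Averaging aggregate uses exponential weights based on cumulative past losses, averaged over the sample. The sketch below illustrates that form only; the uniform prior, the temperature beta, and the toy data are illustrative assumptions, not the paper's setting.

    import numpy as np

    def mirror_averaging_weights(losses, beta, prior=None):
        # losses[i, j]: loss of dictionary element f_j on training point Z_i.
        # Exponential weights from cumulative past losses, Cesaro-averaged over
        # the sample; the aggregate decision rule is then sum_j w[j] * f_j.
        n, M = losses.shape
        prior = np.full(M, 1.0 / M) if prior is None else np.asarray(prior, float)
        past = np.vstack([np.zeros(M), np.cumsum(losses, axis=0)[:-1]])
        logw = np.log(prior) - past / beta
        w = np.exp(logw - logw.max(axis=1, keepdims=True))
        w /= w.sum(axis=1, keepdims=True)
        return w.mean(axis=0)             # average over steps k = 0..n-1

    # toy use: aggregate three constant predictors under squared loss
    rng = np.random.default_rng(2)
    yv = rng.normal(1.0, 0.5, size=100)
    preds = np.array([0.0, 1.0, 2.0])               # the "dictionary"
    losses = (yv[:, None] - preds[None, :]) ** 2
    print(mirror_averaging_weights(losses, beta=4.0))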
Thu 22.04.2010
16:15-17:30
Berend Snijder
ETH Zürich
Abstract
Systematically interfering with the components that make up a larger system is a good way of getting first insights into how the system works. In molecular cell biology, RNA interference (RNAi) has become the method of choice to silence individual genes, the building blocks of life. We use RNAi to silence thousands of human genes, one at a time, in order to find out which of these genes are required for virus infection. Automated microscopy combined with high-content image analysis is used to get quantitative readouts of the many millions of cells that make up such large-scale screens. Using these methods we discovered that virus infection in human cells is much more complex than previously assumed, with different viruses preferentially infecting different sub-populations of cells. We therefore apply several statistical methods to model single-cell behavior in our RNAi screens. Although these statistics add a layer of complexity to the process of RNAi screening, we find that the quality and types of information we can extract from RNAi screens are greatly increased. The improved understanding of cellular behavior and virus infection opens up new avenues in the search for antiviral drug targets, and the methods described here generalize to many different types of cellular perturbation screens.
ZüKoSt Zürcher Kolloquium über Statistik
Single cell modeling in image based RNAi screens
HG G 19.1
Thu 29.04.2010
16:15-17:30
Adrian Bowman
University of Glasgow, UK
Abstract
Three-dimensional surface imaging, through laser-scanning or stereo-photogrammetry, provides high-resolution data defining the shape of objects. In an anatomical setting this can provide invaluable quantitative information, for example on the success of surgery. Two particular applications are in the success of breast reconstruction and in facial surgery following conditions such as cleft lip and palate. An initial challenge is to extract suitable information from these images, to characterise the surface shape in an informative manner. Landmarks are traditionally used to good effect, but these clearly do not adequately represent the much richer information present in each digitised image. Curves with clear anatomical meaning provide a good compromise between informative representations of shape and simplicity of structure. Some of the issues involved in analysing data of this type will be discussed and illustrated. Modelling issues include the measurement of asymmetry and longitudinal patterns of growth.
ZüKoSt Zürcher Kolloquium über Statistik
Statistics with a human face
HG G 19.1
Fri 21.05.2010
15:15-16:15
Christian Hennig
University College London
Abstract
Normal mixture models are often used for cluster analysis. Usually, every component of the mixture is interpreted as a cluster. This, however, is often not appropriate. A mixture of two normal components can be unimodal and quite homogeneous. In particular, mixtures of several normals may be needed to approximate homogeneous non-normal distributions. Even if there are non-normal subpopulations in the data, the normal mixture model is still a good tool for clustering because of its flexibility. This presentation is about methods to decide whether, after having fitted a normal mixture, several mixture components should be merged in order to be interpreted as a single cluster. Note that this cannot be formulated as a statistical estimation problem, because the likelihood and the general fitting quality of the model do not depend on whether single mixture components or sets of mixture components are interpreted as clusters. So any method depends on a specification of what the user wants to regard as a "cluster". There are at least two different cluster concepts, namely identifying clusters with modes (and therefore merging unimodal mixtures) and identifying clusters with clear patterns in the data (which for example means that scale mixtures, though unimodal, should not necessarily be merged). Furthermore, it has to be specified how strong a separation is required between different clusters. The methods proposed and compared in this presentation are all hierarchical. From an estimated mixture, pairs of components (and later pairs of already merged sets of components) are merged until all remaining pairs are separated clearly enough to be interpreted as different clusters. Separation can be measured in many different ways, depending on the underlying cluster concept. Apart from the proposed methodology, some implications for how to think about cluster analysis problems in general will be discussed.
Research Seminar in Statistics
How to merge normal mixture components for cluster analysis
HG G 19.1
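As a minimal illustration of the modality-based cluster concept above: the sketch below merges two fitted normal components when their combined density looks unimodal along the segment joining the means, a cheap stand-in for a full ridgeline analysis. It is a crude transitive variant of the hierarchical schemes discussed in the talk, with all names and thresholds as illustrative assumptions.

    import numpy as np
    from scipy.stats import multivariate_normal

    def pair_is_unimodal(w1, m1, S1, w2, m2, S2, grid=200):
        # Evaluate the two-component mixture density along the segment joining
        # the means and count its peaks; one peak suggests a single cluster.
        t = np.linspace(0.0, 1.0, grid)
        pts = np.outer(1 - t, m1) + np.outer(t, m2)
        f = (w1 * multivariate_normal.pdf(pts, m1, S1)
             + w2 * multivariate_normal.pdf(pts, m2, S2))
        peaks = int(f[0] > f[1]) + int(f[-1] > f[-2])
        peaks += int(((f[1:-1] > f[:-2]) & (f[1:-1] > f[2:])).sum())
        return peaks <= 1

    def merge_labels(weights, means, covs):
        # Transitively merge all pairs that pass the unimodality check;
        # components sharing a label are then reported as one cluster.
        lab = list(range(len(weights)))
        for a in range(len(weights)):
            for b in range(a + 1, len(weights)):
                if pair_is_unimodal(weights[a], means[a], covs[a],
                                    weights[b], means[b], covs[b]):
                    la, lb = lab[a], lab[b]
                    lab = [la if x == lb else x for x in lab]
        return lab

    # two strongly overlapping components: merged into a single cluster
    print(merge_labels([0.5, 0.5],
                       [np.array([0.0, 0.0]), np.array([1.0, 0.0])],
                       [np.eye(2), np.eye(2)]))

Note this simple criterion implements only the "clusters as modes" concept; the pattern-based concept mentioned in the abstract requires a different separation measure.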
Thu 27.05.2010
16:15-17:30
Roger Kaufmann

Abstract
The outcome of a football match is unclear in advance, which is a large part of the sport's appeal. But is football really unpredictable? We discuss this question with reference to the 2010 FIFA World Cup in South Africa. The first part of the talk shows the connection between football and mathematics, and presents a mathematical approach for computing the possible outcomes of football matches. We then take a look at the match Switzerland - Spain, which will be played in South Africa on 16 June. In the final part of the presentation we discuss backtesting and show how football managers can profit from such calculations. Most of the talk will be accessible to listeners without a background in statistics. The talk will be in German, with English slides.
ZüKoSt Zürcher Kolloquium über Statistik
Football Mathematics: Predictions for the World Cup (takes place in HG G 5)
HG G 5
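The abstract does not spell out the model, but a standard baseline for match-outcome calculations of this kind treats the two teams' goal counts as independent Poisson variables. The sketch below computes win/draw/loss probabilities this way; the scoring rates for the Switzerland - Spain example are purely illustrative, not the speaker's estimates.

    import numpy as np
    from scipy.stats import poisson

    def outcome_probs(lam_a, lam_b, max_goals=10):
        # Win/draw/loss probabilities for team A vs team B when goal counts
        # are independent Poisson(lam_a) and Poisson(lam_b).
        g = np.arange(max_goals + 1)
        M = np.outer(poisson.pmf(g, lam_a), poisson.pmf(g, lam_b))
        return np.tril(M, -1).sum(), np.trace(M), np.triu(M, 1).sum()

    # hypothetical goal rates for Switzerland vs Spain
    print(outcome_probs(0.8, 1.9))

Backtesting, as discussed in the talk, would compare such predicted probabilities with the outcomes of many past matches.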
Fri 28.05.2010
15:15-16:15
David van Dyk
University of California
Abstract
Color-Magnitude Diagrams (CMDs) are plots that compare the magnitudes (luminosities) of stars in different wavelengths of light (colors). Highly non-linear correlations among the mass, color and surface temperature of newly formed stars induce a long narrow curved point cloud in a CMD known as the main sequence. Aging stars form new CMD groups of red giants and white dwarfs. The physical processes that govern this evolution can be described with mathematical models and explored using complex computer models. These calculations are designed to predict the plotted magnitudes as a function of parameters of scientific interest such as stellar age, mass, and metallicity. Here, we describe how we use the computer models as a component of a complex likelihood function in a Bayesian analysis that requires sophisticated computing, corrects for contamination of the data by field stars, accounts for complications caused by unresolved binary-star systems, and aims to compare competing physics-based computer models of stellar evolution. (Joint work with Steven DeGennaro, Nathan Stein, William H. Jeffery, Ted von Hippel, and Elizabeth Jeffery)
Research Seminar in Statistics
Statistical Analysis of Stellar Evolution
HG G 26.5
Fri 23.07.2010
14:15-15:30
Thomas Richardson
University of Washington, Seattle
Abstract
Acyclic directed mixed graphs (ADMGs), also known as semi-Markov models, represent the conditional independence structure induced on an observed margin by a DAG model with latent variables. Ancestral graphs (without undirected edges) are a subclass of ADMGs. In this talk we first present a parametrization of these models for binary data. We then describe a maximum-likelihood fitting algorithm that may be used for scoring.
Research Seminar in Statistics
Acyclic directed mixed graphs for binary data
HG G 19.1
Wed 01.09.2010
14:15-16:00
Johanna Ziegel
The University of Melbourne
Abstract
Volume estimators based on Cavalieri's principle are widely used in the biosciences. For example in neuroscience, where volumetric measurements of brain structures are of interest, systematic samples of serial sections are obtained by magnetic resonance imaging or by a physical cutting procedure. The volume is then estimated by the sum over the areas of the structure of interest in the section planes multiplied by the width of the sections. Assessing the precision of such volume estimates is a question of great practical importance, but statistically a challenging task due to the strong spatial dependence of the data and typically small sample sizes. The approach we take is more ambitious than earlier methodologies, the goal of which has been estimation of the variance of a volume estimator \hat{v}, rather than estimation of the distribution of \hat{v}; see e.g. Cruz-Orive (1999); Gundersen et al. (1999); García-Fiñana and Cruz-Orive (2004); Ziegel et al. (2010). We use a bootstrap method to obtain a consistent estimator of the distribution of \hat{v} conditional on the observed data. Confidence intervals are then derived from the distribution estimate. We treat the case where serial sections are exactly periodic as well as when the physical cutting procedure introduces errors in the placement of the sampling points. To illustrate the performance of our method we conduct a simulation study with synthetic data and also apply our results to real data sets. Joint work with Peter Hall, The University of Melbourne. References: Cruz-Orive, L. M. (1999). Precision of Cavalieri sections and slices with local errors. J. Microsc., 193, 182-198. García-Fiñana, M. and Cruz-Orive, L. M. (2004). Improved variance prediction for systematic sampling on R. Statistics, 38(3), 243-272. Gundersen, H. J. G., Jensen, E. B. V., Kiêu, K., and Nielsen, J. (1999). The efficiency of systematic sampling - reconsidered. J. Microsc., 193, 199-211. Ziegel, J., Baddeley, A., Dorph-Petersen, K.-A., and Jensen, E. B. V. (2010). Systematic sampling with errors in sample locations. Biometrika, 97, 1-13.
Research Seminar in Statistics
Distribution estimators and confidence intervals for Cavalieri estimators
HG G 19.1
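For readers unfamiliar with the estimator under study: the Cavalieri volume estimate itself is just the section spacing times the summed section areas, \hat{v} = t * sum(areas), as sketched below with made-up numbers. The talk's contribution, the bootstrap estimate of the distribution of \hat{v} under spatially dependent placement errors, is not reproduced here.

    import numpy as np

    def cavalieri_volume(areas, t):
        # Cavalieri estimate: section spacing t times the summed section areas.
        return t * np.sum(areas)

    # illustrative data: 10 serial sections of a structure, spaced 0.5 mm apart
    areas_mm2 = [0.4, 2.9, 7.5, 12.1, 14.8, 14.0, 10.6, 6.2, 2.3, 0.3]
    print(cavalieri_volume(areas_mm2, t=0.5), "mm^3")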