Seminar overview
Spring Semester 2010
Date & Time | Speaker | Title | Location |
---|---|---|---|
Fri 26.02.2010 15:15-16:15 |
Sach Mukherjee, University of Warwick |
Abstract
Networks of proteins called "signalling networks" play a key role in the control of diverse cellular processes; their aberrant functioning is heavily implicated in many diseases, including cancer. Aberrations in cancer cells are thought to perturb the normal connectivity of these networks, with important biological and therapeutic implications. Yet cancer-specific signalling remains poorly understood, especially at the level of relevant (post-translational) protein modifications. However, modern high-throughput biochemical technologies are now able to make measurements on components of these systems, and can, in principle, shed light on a variety of open questions concerning signalling in cancer. I will discuss statistical approaches for interpreting these data, in particular how graphical models can be used to integrate biochemical data and prior knowledge of signalling biology to facilitate the discovery process.
Research Seminar in Statistics: Graphical modelling for cancer signalling |
HG G 19.1 |
Fri 05.03.2010 15:15-16:15 |
Michael McAleer, Erasmus University Rotterdam |
Abstract
The management and monitoring of very large portfolios of financial assets are routine for many individuals and organizations. The two most widely used models of conditional covariances and correlations in the class of multivariate GARCH models are BEKK and DCC. It is well known that BEKK suffers from the archetypal "curse of dimensionality", whereas DCC does not. It is argued in this paper that this is a misleading interpretation of the suitability of the two models for use in practice. The primary purpose of this paper is to analyze the similarities and dissimilarities between BEKK and DCC, both with and without targeting, on the basis of the structural derivation of the models, the availability of analytical forms for the sufficient conditions for existence of moments, sufficient conditions for consistency and asymptotic normality of the appropriate estimators, and computational tractability for ultra large numbers of financial assets. Based on theoretical considerations, the paper sheds light on how to discriminate between BEKK and DCC in practical applications.
Keywords: Conditional Correlations, Conditional Covariances, Diagonal Models, Forecasting, Generalized Models, Hadamard Models, Scalar models, Targeting
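As a point of reference for the model class discussed above, here is a minimal sketch of the scalar-DCC correlation recursion with correlation targeting; the parameter values a and b are purely illustrative, not taken from the paper:

```python
import numpy as np

def dcc_correlations(z, a=0.05, b=0.90):
    """Filter scalar-DCC conditional correlation matrices R_t from
    standardized residuals z (T x N). a, b are illustrative values."""
    T, N = z.shape
    S = np.corrcoef(z, rowvar=False)          # unconditional correlation (targeting)
    Q = S.copy()
    R = np.empty((T, N, N))
    for t in range(T):
        d = 1.0 / np.sqrt(np.diag(Q))
        R[t] = Q * np.outer(d, d)             # rescale Q_t to a correlation matrix
        # quasi-correlation update driven by the current residual
        Q = (1 - a - b) * S + a * np.outer(z[t], z[t]) + b * Q
    return R
```

Because the recursion only involves outer products and rescaling, the per-step cost is O(N^2), which is what makes DCC attractive for large cross-sections.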
Research Seminar in Statistics: Do We Really Need Both BEKK and DCC? A Tale of Two Multivariate GARCH Models |
HG G 19.1 |
Thu 11.03.2010 16:15-17:30 |
Willi Maurer, Novartis Pharma AG |
Abstract
In confirmatory clinical trials the Type I error rate must be controlled for claims forming the basis for approval and labelling of a new drug. Strong control of the familywise error rate is usually needed for families of hypotheses related to the claims. Such families of hypotheses arise naturally from comparing several treatments with a control, combined non-inferiority and superiority testing for primary and secondary variables, the presence of multiple primary or secondary endpoints, and combinations thereof. A variety of sequentially rejective, weighted Bonferroni-type tests have been proposed for this purpose, such as parallel and serial gatekeeping procedures. They allow, in principle, mapping the partially hierarchical structure of the hypotheses' relative importance onto multiple test procedures. Since these procedures are based on the closed testing principle, they require the explicit specification of a large number of intersection hypothesis tests. For a procedure defined this way it is difficult to recognize the basic principle behind the choice, and vice versa. In this talk a graphical approach* for constructing and performing the test procedure is presented. The relative importance of the hypotheses and their partially hierarchical structure are mapped onto weighted directed graphs whose vertices represent the hypotheses. An update algorithm for the graphs enables sequentially rejective testing. Some properties of procedures generated this way are discussed and illustrated with the visualization of common gatekeeping strategies and of procedures for more complex testing situations that have arisen in clinical practice.
* Bretz, F., Maurer, W., Brannath, W., and Posch, M. (2009). "A graphical approach to sequentially rejective multiple test procedures," Statistics in Medicine, 28, 586-604.
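The graph update algorithm of the cited paper can be sketched as follows. This is a simplified rendering: the hypothesis weights, transition matrix, and p-values in the usage example are illustrative, with the Holm procedure (complete graph, equal weights) serving as the sanity check:

```python
def graphical_test(pvals, w, G, alpha=0.05):
    """Sequentially rejective weighted Bonferroni test on a graph.
    w: initial hypothesis weights (summing to at most 1); G: transition
    matrix (zero diagonal, row sums <= 1). Returns rejected indices."""
    m = len(pvals)
    w = list(w)
    G = [row[:] for row in G]
    active = set(range(m))
    rejected = set()
    while True:
        # reject any active hypothesis meeting its current local level
        cand = [j for j in sorted(active) if pvals[j] <= w[j] * alpha]
        if not cand:
            return rejected
        j = cand[0]
        active.remove(j)
        rejected.add(j)
        # redistribute the weight of H_j along the graph's edges
        new_w, new_G = w[:], [row[:] for row in G]
        for k in active:
            new_w[k] = w[k] + w[j] * G[j][k]
            for l in active:
                if l == k:
                    continue
                denom = 1.0 - G[k][j] * G[j][k]
                new_G[k][l] = (G[k][l] + G[k][j] * G[j][l]) / denom if denom > 0 else 0.0
            new_G[k][j] = 0.0
        w, G = new_w, new_G

# Holm's procedure as a graphical test: equal weights, complete graph
holm2 = graphical_test([0.01, 0.04], [0.5, 0.5], [[0, 1], [1, 0]])
```

With p-values (0.01, 0.04) at level 0.05, Holm rejects both hypotheses, and so does the graph-based version above.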
ZüKoSt Zürcher Kolloquium über Statistik: A Graphical Approach to Sequentially Rejective Multiple Test Procedures and its Application in Clinical Trials |
HG G 19.1 |
Fri 19.03.2010 15:15-16:15 |
Francis Bach, INRIA Willow project, Paris |
Abstract
We consider the problem of high-dimensional non-linear variable selection for supervised learning. Our approach is based on performing linear selection among exponentially many appropriately defined positive definite kernels that characterize non-linear interactions between the original variables. To select efficiently from these many kernels, we use the natural hierarchical structure of the problem to extend the multiple kernel learning framework to kernels that can be embedded in a directed acyclic graph; we show that it is then possible to perform kernel selection through a graph-adapted sparsity-inducing norm, in polynomial time in the number of selected kernels. Moreover, we study the consistency of variable selection in high-dimensional settings, showing that under certain assumptions, our regularization framework allows a number of irrelevant variables which is exponential in the number of observations.
Research Seminar in Statistics: High-Dimensional Non-Linear Variable Selection through Hierarchical Kernel Learning |
HG G 19.1 |
Fri 26.03.2010 15:15-16:15 |
Cun-Hui Zhang, Rutgers University, U.S.A. |
Abstract
We propose a new method of learning a large symmetric target matrix whose inverse can be directly approximated by data. Our primary example of the target matrix is the inverse of a population correlation or covariance matrix. The proposed method, called GMACS, naturally leads to a uniform error bound which can be viewed as simultaneous confidence intervals for the elements of the target matrix from an oracle expert. Since the uniform error bound essentially turns a statistical experiment with indirect observations into one with direct observations, conservative solutions follow for a number of hypothesis testing and structure estimation problems, including graphical model selection and consistent estimation of the target under the spectrum and other norms. Our main ideas and results are applicable to general symmetric data and target matrices.
Research Seminar in Statistics: Graphical Model Selection and Statistical Inference About Large Inverse Matrices |
HG G 19.1 |
Thu 15.04.2010 16:15-17:30 |
Arnak Dalalyan, École des Ponts ParisTech |
Abstract
(joint work with R. Keriven)
We propose a new approach to the problem of robust estimation for an inverse problem arising in multiview geometry. Inspired by recent advances in the statistical theory of recovering sparse vectors, we define our estimator as a Bayesian maximum a posteriori with multivariate Laplace prior on the vector describing the outliers. This leads to an estimator in which the fidelity to the data is measured by the L_\infty-norm while the regularization is done by the L_1-norm. The proposed procedure is fairly fast since the outlier removal is done by solving one linear program (LP). An important difference compared to existing algorithms is that for our estimator it is not necessary to specify either the number or the proportion of the outliers; only an upper bound on the maximal measurement error for the inliers should be specified. We present theoretical results assessing the accuracy of our procedure, as well as numerical examples illustrating its efficiency on synthetic and real data.
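The general shape of such an LP can be sketched with a simplified linear-regression stand-in for the multiview-geometry problem: minimize the L1-norm of the outlier vector w subject to an L-infinity bound eps on the inlier residuals. The data and variable names below are illustrative, not the paper's:

```python
import numpy as np
from scipy.optimize import linprog

def robust_fit(X, y, eps):
    """Minimize ||w||_1 subject to ||y - X b - w||_inf <= eps.
    Nonzero entries of w flag outliers; eps bounds the inlier noise."""
    n, p = X.shape
    # LP variables: b (p), w (n), s (n) with |w_i| <= s_i
    c = np.concatenate([np.zeros(p + n), np.ones(n)])
    I, Z = np.eye(n), np.zeros((n, n))
    A_ub = np.vstack([
        np.hstack([ X,  I, Z]),                 #  X b + w <= y + eps
        np.hstack([-X, -I, Z]),                 # -X b - w <= eps - y
        np.hstack([np.zeros((n, p)),  I, -I]),  #  w - s <= 0
        np.hstack([np.zeros((n, p)), -I, -I]),  # -w - s <= 0
    ])
    b_ub = np.concatenate([y + eps, eps - y, np.zeros(2 * n)])
    bounds = [(None, None)] * (p + n) + [(0, None)] * n
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
    return res.x[:p], res.x[p:p + n]

# toy data: a line with one gross outlier
x = np.arange(10.0)
X = np.column_stack([x, np.ones(10)])
y = 2 * x + 1
y[5] += 10
beta, w = robust_fit(X, y, eps=0.05)
```

Here the recovered slope/intercept stay close to (2, 1) while w is nonzero only at the contaminated observation, without the number of outliers ever being specified.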
ZüKoSt Zürcher Kolloquium über Statistik: Robust Estimation for an Inverse Problem Arising in Multiview Geometry |
HG G 19.1 |
Fri 16.04.2010 15:00-16:15 |
Arnak Dalalyan, École des Ponts ParisTech |
Abstract
(joint work with A.B. Tsybakov)
We consider the problem of aggregating the elements of a (possibly infinite) dictionary for building a decision procedure that aims at minimizing a given criterion. Along with the dictionary, an independent identically distributed training sample is assumed to be available, on which the performance of a given procedure can be tested. In a fairly general set-up, we establish an oracle inequality for the Mirror Averaging aggregate based on any prior distribution. This oracle inequality is applied in the context of sparse coding for different tasks of statistics and machine learning such as regression, density estimation and binary classification.
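The flavor of the scheme can be conveyed by a simplified exponential-weights sketch in the spirit of mirror averaging (this is not the paper's exact estimator; the temperature beta, the prior, and the dictionary in the example are illustrative):

```python
import numpy as np

def exp_weights_aggregate(preds, y, beta=1.0, prior=None):
    """Aggregate dictionary predictions preds (M x n) for targets y (n,)
    by averaging exponential-weights mixtures over growing samples.
    Returns the final weight vector over the dictionary."""
    M, n = preds.shape
    prior = np.full(M, 1.0 / M) if prior is None else np.asarray(prior)
    cum_loss = np.zeros(M)
    weights = np.zeros(M)
    for k in range(n):
        w = prior * np.exp(-cum_loss / beta)   # posterior-like weights
        w /= w.sum()
        weights += w / n                       # average over steps (mirror averaging)
        cum_loss += (preds[:, k] - y[k]) ** 2  # cumulative squared loss
    return weights
```

Averaging the weight vectors over the sample, rather than keeping only the last one, is the ingredient that yields the oracle-inequality-type guarantees discussed in the abstract.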
Research Seminar in Statistics: Sparsity oracle inequalities for the mirror averaging aggregate |
HG G 26.5 |
Thu 22.04.2010 16:15-17:30 |
Berend Snijder, ETH Zürich |
Abstract
Systematically interfering with the components that make up a larger system is a good way of getting first insights into how the system works. In molecular cell biology, RNA interference (RNAi) has become the method of choice to silence individual genes, the building blocks of life. We use RNAi to silence thousands of human genes, one at a time, in order to find out which of these genes are required for virus infection. Automated microscopy combined with high-content image analysis is used to get quantitative readouts of the many millions of cells that make up such large-scale screens. Using these methods we discovered that virus infection in human cells is much more complex than previously assumed, with different viruses preferentially infecting different sub-populations of cells. We therefore apply several statistical methods to model single-cell behavior in our RNAi screens. Although these statistics add a layer of complexity to the process of RNAi screening, we find that the quality and types of information we can extract from RNAi screens is greatly increased. The improved understanding of cellular behavior and virus infection opens up new avenues in the search for antiviral drug targets, and the methods described here generalize to many different types of cellular perturbation screens.
ZüKoSt Zürcher Kolloquium über Statistik: Single-cell modeling in image-based RNAi screens |
HG G 19.1 |
Thu 29.04.2010 16:15-17:30 |
Adrian Bowman, University of Glasgow, UK |
Abstract
Three-dimensional surface imaging, through laser-scanning or stereo-photogrammetry, provides high-resolution data defining the shape of objects. In an anatomical setting this can provide invaluable quantitative information, for example on the success of surgery. Two particular applications are in the success of breast reconstruction and in facial surgery following conditions such as cleft lip and palate. An initial challenge is to extract suitable information from these images, to characterise the surface shape in an informative manner. Landmarks are traditionally used to good effect, but these clearly do not adequately represent the very much richer information present in each digitised image. Curves with clear anatomical meaning provide a good compromise between informative representations of shape and simplicity of structure. Some of the issues involved in analysing data of this type will be discussed and illustrated. Modelling issues include the measurement of asymmetry and longitudinal patterns of growth.
ZüKoSt Zürcher Kolloquium über Statistik: Statistics with a human face |
HG G 19.1 |
Fri 21.05.2010 15:15-16:15 |
Christian Hennig, University College London |
Abstract
Normal mixture models are often used for cluster analysis. Usually, every component of the mixture is interpreted as a cluster. This, however, is often not appropriate: a mixture of two normal components can be unimodal and quite homogeneous. In particular, mixtures of several normals may be needed to approximate homogeneous non-normal distributions.
Even if there are non-normal subpopulations in the data, the normal mixture model is still a good tool for clustering because of its flexibility. This presentation is about methods to decide whether, after having fitted a normal mixture, several mixture components should be merged in order to be interpreted as a single cluster.
Note that this cannot be formulated as a statistical estimation problem, because the likelihood and the general fitting quality of the model do not depend on whether single mixture components or sets of mixture components are interpreted as clusters. So any method depends on a specification of what the user wants to regard as a "cluster". There are at least two different cluster concepts, namely identifying clusters with modes (and therefore merging unimodal mixtures) and identifying clusters with clear patterns in the data (which for example means that scale mixtures, though unimodal, should not necessarily be merged). Furthermore, it has to be specified how strong a separation is required between different clusters.
The methods proposed and compared in this presentation are all hierarchical. From an estimated mixture, pairs of components (and later pairs of already merged mixtures) are merged until members of a pair are separated enough in order to be interpreted as different clusters. This can be measured in many different ways, depending on the underlying cluster concept.
Apart from the discussed methodology, some implications about how to think about cluster analysis problems in general will be discussed.
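One concrete instance of the modal cluster concept is to merge two fitted one-dimensional normal components when their combined density is unimodal. The toy sketch below checks this with a grid-based mode count (the grid size and example parameters are illustrative, not from the talk):

```python
import numpy as np

def normal_pdf(x, mu, sd):
    return np.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))

def merged_is_unimodal(w1, mu1, sd1, w2, mu2, sd2, n_grid=2000):
    """True if the two-component mixture density has at most one local
    maximum on a fine grid covering both components."""
    lo = min(mu1 - 4 * sd1, mu2 - 4 * sd2)
    hi = max(mu1 + 4 * sd1, mu2 + 4 * sd2)
    x = np.linspace(lo, hi, n_grid)
    d = w1 * normal_pdf(x, mu1, sd1) + w2 * normal_pdf(x, mu2, sd2)
    # interior grid points that beat both neighbors are local maxima
    interior_maxima = (d[1:-1] > d[:-2]) & (d[1:-1] > d[2:])
    return int(interior_maxima.sum()) <= 1
```

For equal weights and equal variances, the mixture is known to be unimodal exactly when the means are less than two standard deviations apart, which the grid check reproduces.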
Research Seminar in Statistics: How to merge normal mixture components for cluster analysis |
HG G 19.1 |
Thu 27.05.2010 16:15-17:30 |
Roger Kaufmann |
Abstract
The outcome of a football match is unclear in advance, which is precisely what makes this sport so appealing. But is football unpredictable? We discuss this question using the 2010 FIFA World Cup in South Africa. The first part shows the connection between football and mathematics, and a mathematical approach for computing possible outcomes of football matches is presented. We then take a look at the match Switzerland vs. Spain, which will take place in South Africa on 16 June. In the last part of the presentation we talk about backtesting and show how football managers can profit from such computations.
Most of the talk will be accessible to listeners without a statistical background.
The talk will be in German, with English slides.
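A common baseline for such match predictions, not necessarily the speaker's model, treats the two teams' goal counts as independent Poisson variables; the attack rates below are purely illustrative:

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    return lam ** k * exp(-lam) / factorial(k)

def match_probabilities(lam_a, lam_b, max_goals=10):
    """P(win/draw/loss) for team A when the teams score independent
    Poisson(lam_a) and Poisson(lam_b) goals; tail truncated at max_goals."""
    win = draw = loss = 0.0
    for i in range(max_goals + 1):
        for j in range(max_goals + 1):
            p = poisson_pmf(i, lam_a) * poisson_pmf(j, lam_b)
            if i > j:
                win += p
            elif i == j:
                draw += p
            else:
                loss += p
    return win, draw, loss
```

Backtesting, as mentioned in the abstract, then amounts to comparing such predicted outcome probabilities with the results of past matches.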
ZüKoSt Zürcher Kolloquium über Statistik: Football mathematics: predictions for the World Cup (takes place in HG G 5) |
HG G 5 |
Fri 28.05.2010 15:15-16:15 |
David van Dyk, University of California |
Abstract
Color-Magnitude Diagrams (CMDs) are plots that compare the magnitudes (luminosities) of stars in different wavelengths of light (colors). Highly non-linear correlations among the mass, color and surface temperature of newly formed stars induce a long narrow curved point cloud in a CMD known as the main sequence. Aging stars form new CMD groups of red giants and white dwarfs. The physical processes that govern this evolution can be described with mathematical models and explored using complex computer models. These calculations are designed to predict the plotted magnitudes as a function of parameters of scientific interest such as stellar age, mass, and metallicity. Here, we describe how we use the computer models as a component of a complex likelihood function in a Bayesian analysis that requires sophisticated computing, corrects for contamination of the data by field stars, accounts for complications caused by unresolved binary-star systems, and aims to compare competing physics-based computer models of stellar evolution.
(Joint work with Steven DeGennaro, Nathan Stein, William H. Jeffery, Ted von Hippel, and Elizabeth Jeffery)
Research Seminar in Statistics: Statistical Analysis of Stellar Evolution |
HG G 26.5 |
Fri 23.07.2010 14:15-15:30 |
Thomas Richardson, University of Washington, Seattle |
Abstract
Acyclic directed mixed graphs (ADMGs), also known as semi-Markov models, represent the conditional independence structure induced on an observed margin by a DAG model with latent variables. Ancestral graphs (without undirected edges) are a subclass of ADMGs. In this talk we first present a parametrization of these models for binary data. We then describe a maximum-likelihood fitting algorithm that may be used for scoring.
Research Seminar in Statistics: Acyclic directed mixed graphs for binary data |
HG G 19.1 |
Wed 01.09.2010 14:15-16:00 |
Dr. Johanna Ziegel, The University of Melbourne |
Abstract
Volume estimators based on Cavalieri's principle are widely used in the biosciences. For example in neuroscience, where volumetric measurements of brain structures are of interest, systematic samples of serial sections are obtained by magnetic resonance imaging or by a physical cutting procedure. The volume is then estimated by the sum over the areas of the structure of interest in the section planes multiplied by the width of the sections. Assessing the precision of such volume estimates is a question of great practical importance, but statistically a challenging task due to the strong spatial dependence of the data and typically small sample sizes. The approach we take is more ambitious than earlier methodologies, the goal of which has been estimation of the variance of a volume estimator \hat{v}, rather than estimation of the distribution of \hat{v}; see e.g. Cruz-Orive (1999); Gundersen et al. (1999); García-Fiñana and Cruz-Orive (2004); Ziegel et al. (2010). We use a bootstrap method to obtain a consistent estimator of the distribution of \hat{v} conditional on the observed data. Confidence intervals are then derived from the distribution estimate. We treat the case where serial sections are exactly periodic as well as when the physical cutting procedure introduces errors in the placement of the sampling points. To illustrate the performance of our method we conduct a simulation study with synthetic data and also apply our results to real data sets.
Joint work with Peter Hall, The University of Melbourne.
References:
Cruz-Orive, L. M. (1999). Precision of Cavalieri sections and slices with local errors. J. Microsc., 193, 182-198.
García-Fiñana, M. and Cruz-Orive, L. M. (2004). Improved variance prediction for systematic sampling on R. Statistics, 38(3), 243-272.
Gundersen, H. J. G., Jensen, E. B. V., Kiêu, K., and Nielsen, J. (1999). The efficiency of systematic sampling - reconsidered. J. Microsc., 193, 199-211.
Ziegel, J., Baddeley, A., Dorph-Petersen, K.-A., and Jensen, E. B. V. (2010). Systematic sampling with errors in sample locations. Biometrika, 97, 1-13.
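For reference, the Cavalieri estimator itself is simple: the sum of section areas times the section period. A sketch for a unit ball sectioned systematically (the period and random-start offset below are illustrative):

```python
import numpy as np

def cavalieri_volume(area, period, offset, lo, hi):
    """Cavalieri estimate: period times the sum of section areas at the
    systematic positions lo+offset, lo+offset+period, ... inside [lo, hi]."""
    x = np.arange(lo + offset, hi, period)
    return period * np.sum(area(x))

# unit ball: the section at height x has area pi * (1 - x^2)
ball_area = lambda x: np.pi * np.clip(1 - x ** 2, 0, None)
v = cavalieri_volume(ball_area, period=0.05, offset=0.02, lo=-1.0, hi=1.0)
```

Even with only 40 sections the estimate is very close to the true volume 4*pi/3; it is the distribution of this estimator across random section placements, rather than just its variance, that the talk's bootstrap method targets.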
Research Seminar in Statistics: Distribution estimators and confidence intervals for Cavalieri estimators |
HG G 19.1 |