Seminar overview
Autumn Semester 2014
Date & Time | Speaker | Title | Location |
---|---|---|---|
Fri 26.09.2014 15:15-16:00 |
Eleni Sgouritsa, MPI for Intelligent Systems, Tuebingen |
Abstract
Drawing causal conclusions from observed statistical dependencies is a fundamental problem. Conditional-independence based causal discovery (e.g., PC or FCI) cannot be used when no conditional independences are observed. Alternative methods investigate a different set of assumptions, namely restricting the model class, e.g., additive noise models. In this talk, I will present two causal inference methods that employ different kinds of assumptions from those above.
The first is a method to infer the existence of, and identify, a finite confounder attaining a small number of values. It is based on a kernel method to identify finite mixtures of nonparametric product distributions. The number of mixture components is found by embedding the joint distribution into a reproducing kernel Hilbert space. The mixture components are then recovered by clustering according to an independence criterion.
In the second part I will focus on the problem of causal inference in the two-variable case (assuming causal sufficiency). The proposed method is based on the assumption that, if X causes Y, the marginal distribution P(X) contains no information about P(Y|X). In contrast, P(Y) may contain information about P(X|Y). Consequently, semi-supervised and unsupervised learning (inferring the conditional from the marginal) should be possible in the latter but not in the former case. Accordingly, a method is proposed to decide upon the causal structure.
Research Seminar in Statistics: Identifying confounders and telling cause from effect using latent variable models |
HG G 19.1 |
Fri 10.10.2014 15:15-16:00 |
Hannes Leeb, Universität Wien |
Abstract
One of the most widely used properties of the multivariate Gaussian distribution, besides its tail behavior, is the fact that conditional means are linear and that conditional variances are constant. Here we show that this property is also shared, in an approximate sense, by a large class of non-Gaussian distributions. We allow for several conditioning variables and we provide explicit non-asymptotic results, whereby we extend earlier findings of Hall and Li (1993) and Leeb (2013).
(This is joint work with Lukas Steinberger.)
Research Seminar in Statistics: On conditional moments of high-dimensional random vectors given lower-dimensional projections |
HG G 19.1 |
Thu 16.10.2014 16:15-17:00 |
Frank Bretz, Novartis, Basel |
Abstract
Clinical trials play a critical role in pharmaceutical drug development. New trial designs often
depend on historical data, which, however, may provide inaccurate information for the current study
due to changes in study populations, patient heterogeneity, or different medical facilities. As a
result, the original plan and study design may need to be adjusted or even altered to accommodate
new findings and unexpected interim results. The goal of using adaptive methods in clinical trials
is to enhance the flexibility of trial conduct as well as maintain the integrity of trial findings.
Through carefully thought out and planned adaptation, we can pinpoint the right dose faster, treat
patients more effectively, identify treatment effects more efficiently, and thus expedite the drug
development process. In this presentation we will provide an overview of various adaptive methods
for Phase I to Phase III clinical trials. Accordingly, different types of adaptive designs will be
introduced, such as adaptive modifications of treatment randomization probabilities, adaptive dose
escalation and dose finding trials, group sequential designs (including early termination), blinded
or unblinded sample size re-estimation, and adaptive designs for treatment or subgroup selection.
ZüKoSt Zürcher Kolloquium über Statistik: Adaptive Methods in Clinical Trials |
HG G 19.1 |
Thu 23.10.2014 16:15-17:00 |
Ingo Scholtes and Frank Schweitzer, ETH Zürich (MTEC) |
Abstract
Recent research has highlighted limitations of studying complex systems with time-varying topologies from the perspective of static, time-aggregated networks. Non-Markovian characteristics resulting from the ordering of interactions in time-varying networks were identified as one important mechanism that alters causality and affects dynamical processes. So far, an analytical explanation for this phenomenon and for the significant variations observed across different systems is missing. Summarizing our recent research in this area, in this talk we will introduce a methodology that allows us to predict analytically the causality-driven changes of diffusion speed in non-Markovian time-varying networks. In particular, we will summarize the so-called "ensemble perspective", which is commonly applied in the stochastic modeling of complex networks, and we will show first results on the extension of this ensemble perspective to time-varying networks.
ZüKoSt Zürcher Kolloquium über Statistik: Modeling Time-Varying Complex Networks: The Importance of Non-Markovianity |
HG G 19.1 |
Thu 30.10.2014 16:15-17:00 |
Manuela Zucknick, German Cancer Research Center (DKFZ), Heidelberg |
Abstract
When using high-dimensional genomic data in cancer research, the identification of prognostic
factors, which can influence clinical parameters such as therapy response or survival outcome, and the evaluation of their prediction performance are some of the main issues. In these applications, the number of genome features p is usually much larger than the number of observations n (p >> n problem).
Penalized likelihood methods, for example lasso regression, are often applied in this context.
Frequentist lasso estimates correspond to Bayesian posterior mode estimates, when the regression
parameters have independent double-exponential priors. To better understand certain properties
of the lasso, it is useful to exploit this connection and to move to the Bayesian framework. I
will present a comparison study, where we investigated the lasso method in the frequentist and
Bayesian frameworks in the context of Cox models for survival endpoints.
Bayesian variable selection (BVS) can be used as an alternative way to perform risk prediction with automatic variable selection, which I will demonstrate through some applications in genomics in the second part of the talk. BVS models are very flexible, both in their setup and with regard to possibilities for model inference, for example allowing one to interpret and rank genomic features by their posterior variable selection probabilities. The models are flexible enough to easily allow the integration of several genomic data sources in a biologically meaningful manner.
ZüKoSt Zürcher Kolloquium über Statistik: Bayesian models for risk prediction with high-dimensional (integrative) genomics |
HG G 19.1 |
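The correspondence between frequentist lasso estimates and Bayesian posterior modes mentioned in the abstract above can be sketched as follows. This is a generic illustration, not the speaker's formulation: $\ell(\beta)$ denotes a log-likelihood (for a Cox model, the log partial likelihood, treated here as if it were a full likelihood), and the penalty parameter $\lambda$ and all notation are assumptions introduced for illustration.

```latex
% Lasso estimate: maximizer of the L1-penalized log-likelihood
\hat{\beta}_{\text{lasso}} = \arg\max_{\beta} \Big\{ \ell(\beta) - \lambda \sum_{j=1}^{p} |\beta_j| \Big\}

% Independent double-exponential (Laplace) priors on the coefficients
\pi(\beta) = \prod_{j=1}^{p} \frac{\lambda}{2} \exp\big( -\lambda |\beta_j| \big)

% Log-posterior, up to an additive constant that does not depend on \beta
\log \pi(\beta \mid \text{data}) = \ell(\beta) - \lambda \sum_{j=1}^{p} |\beta_j| + \text{const}
```

Maximizing the log-posterior over $\beta$ therefore gives the same point as the lasso objective. The Bayesian framework additionally provides a full posterior distribution rather than only its mode.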
Fri 14.11.2014 15:15-16:00 |
Harrison Zhou, Yale University, New Haven, CT |
Abstract
Canonical correlation analysis is a widely used multivariate statistical technique for exploring the relation between two sets of variables. In this talk we consider the problem of estimating the leading canonical correlation directions in high-dimensional settings. Recently, under the assumption that the leading canonical correlation directions are sparse, various procedures have been proposed for many high-dimensional applications involving massive data sets. However, little theoretical justification has been available in the literature. In this talk, we establish rate-optimal non-asymptotic minimax estimation with respect to an appropriate loss function for a wide range of model spaces. Two interesting phenomena are observed. First, the minimax rates are not affected by the presence of nuisance parameters, namely the covariance matrices of the two sets of random variables, though they need to be estimated in the canonical correlation analysis problem. Second, we allow the presence of the residual canonical correlation directions. However, they do not influence the minimax rates under a mild condition on the eigengap. A generalized sin-theta theorem and an empirical process bound for Gaussian quadratic forms under rank constraint are used to establish the minimax upper bounds, which may be of independent interest.
If time permits, we will discuss a computationally efficient two-stage estimation procedure which consists of a convex programming based initialization stage and a group Lasso based refinement stage, and show some encouraging numerical results on simulated data sets and a breast cancer data set.
Research Seminar in Statistics: Sparse Canonical Correlation Analysis: Minimaxity and Adaptivity |
HG G 19.1 |
Thu 04.12.2014 16:15-17:00 |
Martin Schumacher, Universität Freiburg |
Abstract
Conditional survival (CS) is defined as the probability of surviving a further t years given that a patient has already survived s years after the diagnosis of a chronic disease. It has attracted attention in recent years in either an absolute or a relative form; the latter is based on a comparison with an age-adjusted normal population and is highly relevant from a public health perspective. In its absolute form, CS is the quantity of major interest in a clinical context. CS constitutes the simplest form of a dynamic prediction in which other events in the course of the disease or biomarker values measured up to time s can be incorporated. In the presentation we review applications in clinical medicine, especially in oncology, and aspects related to statistical modelling with special emphasis on assessment of predictive accuracy. CS provides valuable and relevant information on how prognosis develops over time; it also serves as a starting point for identifying factors related to long-term survival and for developing more complex dynamic predictions that can be used for disease monitoring.
Martin Schumacher and Stefanie Hieke
Institute of Medical Biometry and Statistics, University Medical Center Freiburg
References
1. Van Houwelingen HC, Putter H. Dynamic Prediction in Clinical Survival Analysis. CRC Press, Boca Raton 2012.
2. Zamboni BA et al. Conditional survival and the choice of conditioning set for patients with colon cancer. J Clin Oncol. 2010 May 20; 28 (15): 2544-8. doi: 10.1200/JCO.2009.23.0573.
3. Schoop R, Schumacher M, Graf E. Measures of prediction error for survival data with longitudinal covariates. Biom J. 2011 Mar; 53 (2): 275-93. doi: 10.1002/bimj.201000145.
ZüKoSt Zürcher Kolloquium über Statistik: From conditional survival to dynamic predictions – aspects of application, statistical modelling and assessment |
HG G 19.1 |
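The verbal definition of conditional survival in the abstract above (absolute form) can be written as a formula. This is a minimal sketch: the symbols $T$ for the survival time since diagnosis and $S$ for its survival function are notation assumed here, not taken from the abstract.

```latex
% T: survival time since diagnosis; S(u) = P(T > u): survival function
CS(t \mid s) = P(T > s + t \mid T > s) = \frac{S(s + t)}{S(s)}, \qquad S(s) > 0
```

As a purely hypothetical illustration, if $S(5) = 0.5$ and $S(10) = 0.4$, then the probability of surviving a further 5 years, given survival to 5 years, is $0.4 / 0.5 = 0.8$.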
Thu 11.12.2014 16:15-17:00 |
Syed Ejaz Ahmed, Brock University, Ontario |
Abstract
In high-dimensional settings where the number of variables exceeds the number of observations, or where the number of variables increases with the sample size, many penalized regularization strategies have been studied for simultaneous variable selection and post-estimation. However, a model may have sparse signals as well as a number of predictors with weak signals. In this scenario, variable selection methods may not distinguish between predictors with weak signals and those with sparse signals. The prediction based on a selected submodel may not be preferable in such cases. For this reason, we propose a high-dimensional shrinkage estimation strategy to improve the prediction performance of a submodel. Such a high-dimensional shrinkage estimator (HDSE) is constructed by shrinking a ridge estimator in the direction of a candidate submodel. We demonstrate that the proposed HDSE performs uniformly better than the ridge estimator. Interestingly, it improves the prediction performance of a given candidate submodel generated from most existing variable selection methods. The relative performance of the proposed HDSE strategy is appraised by both simulation studies and real data analysis.
ZüKoSt Zürcher Kolloquium über Statistik: Big Data Big Bias Small Surprise! |
HG G 19.1 |