Seminar overview
Autumn Semester 2016
Date & Time | Speaker | Title | Location |
---|---|---|---|
Thu 15.09.2016 16:15-17:00 |
Emmanuel Lesaffre L-BioStat, Leuven |
Abstract
We propose a novel multivariate multilevel model that expresses both the mean and covariance structure as a multivariate mixed effects model. We call this the multilevel covariance regression (MCR) model. Two versions of this model are presented. In the first version, the covariance matrix of the multivariate response is allowed to depend on covariates and random effects, and the random effects of the covariance part are assumed to be independent of the random effects of the mean structure. In the second version, this assumption is relaxed by allowing the two types of random effects to be dependent.
The motivating data set is obtained from the RN4CAST (Sermeus et al. 2011) FP7 project which involves 33,731 registered nurses in 2,169 nursing units in 486 hospitals in 12 European countries. As response we have taken the three classical burnout dimensions (Maslach and Jackson, 1981) extracted from a 22-item questionnaire, i.e. emotional exhaustion (EE), depersonalization (DP) and personal accomplishment (PA). There are four levels in the total data set: nurses, nursing units, hospitals and (for the whole data set) countries. The first model is applied to the total data set, while the second model is applied to only the Belgian part of the data. The two models address the following nurse research questions simultaneously: 1) how much variation of burnout could be explained by the level-specific fixed and random effects? 2) do the variances and correlations among burnout stay constant across level-specific characteristics and units at each level? The two models are explored with respect to their statistical properties, but are also compared on the Belgian part of the study.
We opted for the Bayesian approach to estimate the parameters of the model. To this end we made use of the JAGS Markov chain Monte Carlo program through the R package rjags.
ZüKoSt Zürcher Kolloquium über Statistik: Modeling multivariate multilevel continuous responses with a hierarchical regression model for the mean and covariance matrix applied to a large nursing data set |
HG G 19.1 |
Fri 16.09.2016 15:15-16:00 |
Venkat Chandrasekaran California Institute of Technology, USA |
Abstract
Regularization techniques are widely employed in the solution of
inverse problems in data analysis and scientific computing due to
their effectiveness in addressing difficulties due to ill-posedness.
In their most common manifestation, these methods take the form of penalty functions added to the objective in optimization-based approaches for solving inverse problems. The purpose of the penalty function is to induce a desired structure in the solution, and these functions are specified based on prior domain-specific expertise. We
consider the problem of learning suitable regularization functions
from data in settings in which prior domain knowledge is not directly available. Previous work under the title of 'dictionary learning' or 'sparse coding' may be viewed as learning a polyhedral regularizer from data. We describe generalizations of these methods to learn semidefinite regularizers by computing structured factorizations of
data matrices. Our algorithmic approach for computing these
factorizations combines recent techniques for rank minimization
problems along with operator analogs of Sinkhorn scaling. The
regularizers obtained using our framework can be employed effectively in semidefinite programming relaxations for solving inverse problems.
(Joint work with Yong Sheng Soh)
Research Seminar in Statistics: Learning Semidefinite Regularizers via Matrix Factorization |
HG G 19.1 |
Fri 23.09.2016 15:15-16:00 |
Helen Ogden University of Southampton, UK |
Abstract
Many statistical models have likelihoods which are intractable: it is impossible or infeasibly expensive to compute the likelihood exactly. In such settings, a common approach is to replace the likelihood with an approximation, and proceed with inference as if the approximate likelihood were the exact likelihood. For example, in latent variable models, where the likelihood is an integral over the latent variables, a Laplace approximation to the likelihood is often used in place of the exact likelihood to do inference. I will describe general conditions which guarantee that this naive inference with an approximate likelihood has the same first-order asymptotic properties as inference with the exact likelihood, and discuss in detail the implications of these results for inference using a Laplace approximation to the likelihood in generalized linear mixed models.
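As a rough illustration of the Laplace approximation mentioned in this abstract, the sketch below approximates the likelihood contribution of a single cluster in a random-intercept logistic model and compares it with brute-force numerical integration. The model, the data vector `y`, the parameter values and all function names are assumptions made for this example; they are not taken from the talk.

```python
import math

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

def log_integrand(u, y, beta, sigma):
    """Log of prod_j p(y_j | u) * N(u; 0, sigma^2) for a random-intercept logistic model."""
    eta = beta + u
    loglik = sum(y) * eta - len(y) * math.log1p(math.exp(eta))
    logprior = -0.5 * (u / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))
    return loglik + logprior

def laplace_likelihood(y, beta, sigma, iters=50):
    """Laplace approximation: exp(h(u_hat)) * sqrt(2*pi / -h''(u_hat))."""
    n, s = len(y), sum(y)
    u = 0.0
    for _ in range(iters):
        # Newton steps towards the mode of the (strictly concave) log integrand.
        p = expit(beta + u)
        grad = s - n * p - u / sigma ** 2
        hess = -n * p * (1.0 - p) - 1.0 / sigma ** 2
        u -= grad / hess
    p = expit(beta + u)
    hess = -n * p * (1.0 - p) - 1.0 / sigma ** 2
    return math.exp(log_integrand(u, y, beta, sigma)) * math.sqrt(2.0 * math.pi / -hess)

def quadrature_likelihood(y, beta, sigma, lo=-10.0, hi=10.0, steps=20000):
    """Trapezoidal rule on a fine grid, used here as the reference value."""
    h = (hi - lo) / steps
    vals = [math.exp(log_integrand(lo + i * h, y, beta, sigma)) for i in range(steps + 1)]
    return h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))

# Hypothetical binary responses for one cluster.
y = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
approx = laplace_likelihood(y, beta=0.3, sigma=1.0)
reference = quadrature_likelihood(y, beta=0.3, sigma=1.0)
print(approx, reference, abs(approx - reference) / reference)
```

In a full generalized linear mixed model the same approximation would be applied to every cluster and the product maximized over the fixed parameters; the sketch only shows the per-cluster step.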
Research Seminar in Statistics: Inference with approximate likelihoods |
HG G 19.1 |
Thu 13.10.2016 16:15-17:00 |
Torsten Hothorn Universität Zürich |
Abstract
Transformation models are a surprisingly large and useful class of models
for conditional and also unconditional distributions. Many known
transformation models, for example the Cox proportional hazards model or
proportional odds logistic regression, have been known for decades in
survival or categorical data analysis. The strong connections between
these models and other commonly used procedures, for example normal
or binary linear models, are not very well known. It is very stimulating,
both from an intellectual and a practical point of view, to interpret
such classical models as transformation models and therefore as models for
describing distributions instead of means.
We will look at a cascade ranging from very simple to rather complex
unconditional and conditional transformation models theoretically and
practically. The R add-on package "mlt" (Most Likely Transformations)
allows fitting many such transformation models in the maximum
likelihood framework and will be used to illustrate how one can estimate
and analyse interesting transformation models in R.
References
http://dx.doi.org/10.1111/rssb.12017
http://arxiv.org/abs/1508.06749
http://CRAN.R-project.org/package=mlt
https://cran.r-project.org/web/packages/mlt.docreg/vignettes/mlt.pdf
ZüKoSt Zürcher Kolloquium über Statistik: Understanding and Applying Transformation Models |
HG G 19.1 |
Thu 20.10.2016 16:15-17:00 |
Thomas Hofmann ETHZ Zürich |
Abstract
This talk will provide an overview of recent trends in deep learning for natural language understanding. The focus will be on the structure and architecture of the network models used in this area, which has seen significant advances and innovations in recent years. In passing, the talk will also give a cursory introduction to key problems in language understanding.
ZüKoSt Zürcher Kolloquium über Statistik: Natural Language Understanding by Deep Networks |
HG G 19.1 |
Fri 28.10.2016 15:15-16:00 |
Samantha Leorato Università Tor Vergata, Roma |
Abstract
Given a continuous random variable Y and a random vector X
defined on the same probability space, the conditional
distribution function (CDF) and the conditional quantile
function (CQF) give rise to two competing approaches to the
estimation of the conditional distribution of Y given X. One
approach -- distribution regression (DR) -- is based on direct
estimation of the conditional distribution function (CDF); the
other approach -- quantile regression (QR) -- is instead based on
direct estimation of the conditional quantile function (CQF).
Since the CDF and the CQF are generalized inverses of each
other, estimates of any functional of the distribution may be
obtained by appropriately transforming the direct estimates of
the CDF and the CQF. Similarly, indirect estimates of the CQF and
the CDF may be obtained by taking the generalized inverse of the
direct estimates. Unlike the QR estimator, which typically
refers to a conditional ALAD estimator, there is no unique
choice for the DR estimator. One possibility is to define a
binary choice model for any given threshold $y$ and the
corresponding dummy variable $\{Y\leq y\}$. This choice is
particularly suited to comparisons with the QR estimator, since,
in the unconditional case, the two approaches are equivalent.
Our paper focuses on comparing QR and DR approaches, and their
performances in terms of efficiency, both asymptotically and for
finite samples. Asymptotic efficiency is measured by asymptotic
MSE of the rescaled estimators of the CDF (or of the CQF), where
asymptotic MSE is the sum of the asymptotic variance and of the
squared asymptotic bias. Asymptotic bias is allowed to be
nonzero, thus taking into account some form of local
misspecification of either the QR or the DR models. For the
asymptotic variance, we show that the choice of the link
function used for DR estimation matters, and that under the most
popular error distributions (i.e. logistic and normal) the QR is
uniformly more efficient (in expectation).
The finite sample performance is assessed by an extensive Monte
Carlo exercise.
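The remark that the two approaches coincide in the unconditional case can be illustrated with a small sketch. It assumes an intercept-only setting, where the DR fit at threshold y reduces to the sample proportion of observations at or below y (the empirical CDF) and the QR fit is a minimizer of the check loss. The sample values and function names are made up for this example.

```python
def ecdf(sample, y):
    """Empirical CDF: the intercept-only DR estimate of P(Y <= y)."""
    return sum(v <= y for v in sample) / len(sample)

def generalized_inverse(sample, tau):
    """Indirect quantile estimate: inf{y : F_n(y) >= tau}, searched over sample points."""
    for v in sorted(sample):
        if ecdf(sample, v) >= tau:
            return v
    return max(sample)

def check_loss(sample, q, tau):
    """Quantile regression objective: sum of r * (tau - 1{r < 0}) with r = v - q."""
    return sum((tau - (v < q)) * (v - q) for v in sample)

def qr_intercept(sample, tau):
    """Direct quantile estimate: the sample point minimizing the check loss."""
    return min(sample, key=lambda q: check_loss(sample, q, tau))

# Hypothetical sample with distinct values; n * tau is not an integer for the
# levels below, so the check-loss minimizer is unique.
sample = [2.3, -1.0, 0.7, 5.1, 3.3, 0.1, 4.2]
for tau in (0.25, 0.5, 0.9):
    print(tau, generalized_inverse(sample, tau), qr_intercept(sample, tau))
```

With covariates the two estimators generally differ, which is the situation the efficiency comparison in the abstract addresses.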
Research Seminar in Statistics: Distribution and Quantile Regressions |
HG G 19.1 |
Wed 02.11.2016 16:15-17:00 |
Søren Højsgaard Aalborg University, DK |
Abstract
Mixed models in R (www.r-project.org) are currently usually handled
with the lme4 package. Until recently, inference (hypothesis testing) in
linear mixed models with lme4 was commonly based on the limiting $\chi^2$
distribution of the likelihood ratio statistic. The pbkrtest package
provides two alternatives: 1) A Kenward-Roger approximation for
calculating (or estimating) the denominator degrees of freedom for an
"F-like" test statistic. 2) $p$-values based on simulating the
reference distribution of the likelihood ratio statistic via
parametric bootstrap. In the talk, I will illustrate the package
through various examples, and discuss some directions for further
developments.
ZüKoSt Zürcher Kolloquium über Statistik: Inference in mixed models in R - beyond the usual asymptotic likelihood ratio test |
HG G 26.1 |
Fri 04.11.2016 15:15-16:00 |
Davy Paindaveine Universität Brüssel |
Abstract
We revisit, from an original and challenging perspective, the problem of testing the null hypothesis that the mode of a directional signal is equal to a given value. Motivated by a real data example where the signal is weak, we consider this problem under asymptotic scenarios for which the signal strength goes to zero at an arbitrary rate $\eta_n$. Both under the null and the alternative, we focus on rotationally symmetric distributions. We show that, while they are asymptotically equivalent under fixed signal strength, the classical Wald and Watson tests exhibit very different (null and non-null) behaviours when the signal becomes arbitrarily weak. To fully characterize how challenging the problem is as a function of $\eta_n$, we adopt a Le Cam, convergence-of-statistical-experiments point of view and show that the resulting limiting experiments crucially depend on $\eta_n$. In light of these results, the Watson test is shown to be adaptively rate-consistent and essentially adaptively Le Cam optimal. Throughout, our theoretical findings are illustrated via Monte Carlo simulations. The practical relevance of our results is also shown on the real data example that motivated the present work.
Research Seminar in Statistics: CANCELED: Inference on the mode of weak directional signals: A Le Cam perspective on hypothesis testing near singularities |
HG G 19.2 |
Thu 17.11.2016 16:15-17:00 |
Gilles Monneret Université Pierre et Marie Curie, Paris |
Abstract
Gene network inference from transcriptomic data is a recent and major methodological challenge, usually based on partial correlations within a Gaussian graphical model framework. Recent methodological advances that fully exploit both observational and interventional (i.e., knock-out or knock-down) data go one step further by enabling the inference of causal networks.
I will start with a method first proposed by Rau et al. (2013), which is based on Bayesian networks and can use a mix of observational and interventional data, even with several interventions at the same time. To do so, we use an MCMC procedure that works on the space of topological orders and yields a posterior probability for each ordering. We can then compute, for example, a mean network for our data.
Second, I will define a novel causal test to identify marginal causality for each of the interaction pairs. The proposed procedure is very fast and can be applied to thousands of genes simultaneously, which allows the pre-selection of a group of genes of interest for downstream causal network inference around an interventional gene. I will show that we obtain results very similar to those of the differential analysis currently used in genomics.
I will illustrate these two methods with an application to an example using biological data.
Research Seminar in Statistics: Identification of causal relationships in gene networks, from observational and interventional expression data |
HG G 19.1 |
Fri 18.11.2016 15:15-16:00 |
Gabor Lugosi Universitat Pompeu Fabra |
Abstract
Given n independent, identically distributed copies of a
random variable, one is interested in estimating the expected value.
Perhaps surprisingly, there are still open questions concerning this
very basic problem in statistics. In this talk we are primarily
interested in non-asymptotic sub-Gaussian estimates for potentially
heavy-tailed random variables. We discuss various estimates and
extensions to high dimensions. We apply the estimates for
statistical learning and regression function estimation problems.
The methods improve on classical empirical minimization techniques.
This talk is based on joint work with Emilien Joly, Luc Devroye,
Matthieu Lerasle, Roberto Imbuzeiro Oliveira, and Shahar Mendelson.
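The abstract does not name a specific estimator. One well-known example of a mean estimator designed for heavy-tailed data is the median-of-means estimator, sketched below as an assumption about the kind of method meant; the function name and the block count `k` are illustrative choices.

```python
import statistics

def median_of_means(xs, k):
    """Split xs into k equal-sized blocks, average each block, and return
    the median of the block means. Trailing elements that do not fill a
    complete block are dropped."""
    m = len(xs) // k
    if m == 0:
        raise ValueError("need at least k observations")
    block_means = [sum(xs[i * m:(i + 1) * m]) / m for i in range(k)]
    return statistics.median(block_means)

# A single extreme observation ruins the sample mean but not the median of means.
data = [1.0] * 99 + [1e6]
print(sum(data) / len(data), median_of_means(data, k=10))
```

In practice the observations would typically be shuffled before blocking, and the number of blocks is tied to the desired confidence level; both choices are left out of this sketch.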
Research Seminar in Statistics: How to estimate the mean of a random variable? |
HG G 19.1 |
Fri 02.12.2016 15:15-16:00 |
Martyn Plummer IARC Lyon, France |
Abstract
We consider approximate Bayesian model choice for model selection
problems that involve models whose Fisher information matrices may fail
to be invertible along other competing submodels. Such singular models
do not obey the regularity conditions underlying the derivation of
Schwarz’s Bayesian information criterion (BIC) and the penalty
structure in BIC generally does not reflect the frequentist large
sample behaviour of their marginal likelihood. Although large sample
theory for the marginal likelihood of singular models has been
developed recently, the resulting approximations depend on the true
parameter value and lead to a paradox of circular reasoning. Guided by
examples such as determining the number of components of mixture
models, the number of factors in latent factor models or the rank in
reduced rank regression, we propose a resolution to this paradox and
give a practical extension of BIC for singular model selection
problems.
Research Seminar in Statistics: A Bayesian Information Criterion for Singular Models |
HG G 19.1 |
Thu 08.12.2016 16:15-17:00 |
Nicolas Städler F. Hoffmann-La Roche Ltd, Basel |
Abstract
Our aim at Roche is for every person who needs our products to be able to access and benefit from them. Market access, that is, the coverage and reimbursement of our products by payers, is a crucial success factor in achieving this goal. As healthcare spending accelerates, payers and public health authorities are carefully assessing the benefits of new drugs over and above those of drugs already on the market. Health Technology Assessment (HTA) agencies have therefore adopted stringent product evaluation strategies, and their expectations in terms of evidence on the effectiveness of a new product very often exceed those required for regulatory approval. In this talk I will present work-in-progress examples where we use advanced statistics to inform robust payer evidence. Firstly, I will discuss surrogate endpoint validation and show how in some cases this is a useful approach for predicting how effects measured on biomarkers or on surrogate endpoints translate into effects that are considered payer relevant. Secondly, I will discuss network meta-analysis and explain how we used this approach in chronic lymphocytic leukemia to inform payers on the comparative effectiveness of our product relative to others on the market. I will further discuss our ideas on how to extend network meta-analysis to also include non-randomized trials. Finally, I will discuss extrapolation of survival curves as a key ingredient in calculating the so-called Incremental Cost Effectiveness Ratio (ICER), which serves many payers as an important reference value in their decision making. I will discuss the limitations of classical parametric extrapolation and show how we use advanced techniques based on mixture models to improve extrapolation and to obtain more accurate estimates of the ICER.
ZüKoSt Zürcher Kolloquium über Statistik: Opportunities and Challenges of Statistics in Health Technology Assessment |
HG G 19.1 |
Tue 13.12.2016 15:15-16:00 |
William Aeberhard Dalhousie University, Halifax |
Abstract
State-space models (SSMs) encompass a wide range of popular models encountered in various fields such as mathematical finance, control engineering and ecology. SSMs are essentially characterized by a hierarchical structure, with latent (unobserved) variables governed by Markovian dynamics. Classical estimation of fixed parameters in these models, for instance by maximizing an approximated marginal likelihood, is known to be highly sensitive to the correct specification of the model. This sensitivity is all the more problematic because assumptions about latent variables cannot be verified by the data analyst. Motivated by the highly non-linear models used for fish stock assessments, we introduce robust estimators for general SSMs which remain stable under deviations from the assumed model. The implementation relies on Laplace's method, where automatic differentiation allows the user to robustly fit such a model in a matter of minutes. A real-life fish stock assessment example illustrates the reliable inference these estimators can yield and how robustness weights can be used as diagnostic tools.
Research Seminar in Statistics: Robust fitting of state-space models with application to fish stock assessments |
HG G 19.2 |