Statistics research seminar

Spring Semester 2017

Date / Time Speaker Title Location
28 February 2017
Po-Ling Loh
University of Wisconsin-Madison
Influence maximization in stochastic and adversarial settings  HG G 19.1
Abstract: We consider the problem of influence maximization in fixed networks, for both stochastic and adversarial contagion models. In the stochastic setting, nodes are infected in waves according to linear threshold or independent cascade models. We establish upper and lower bounds for the influence of a subset of nodes in the network, where the influence is defined as the expected number of infected nodes at the conclusion of the epidemic. We quantify the gap between our upper and lower bounds in the case of the linear threshold model and illustrate the gains of our upper bounds for independent cascade models in relation to existing results. Importantly, our lower bounds are monotonic and submodular, implying that a greedy algorithm for influence maximization is guaranteed to produce a maximizer within a 1-1/e factor of the truth. In the adversarial setting, an adversary is allowed to specify the edges through which contagion may spread, and the player chooses sets of nodes to infect in successive rounds. We establish upper and lower bounds on the pseudo-regret for possibly stochastic strategies of the adversary and player. This is joint work with Justin Khim and Varun Jog.
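The greedy guarantee mentioned in the abstract can be illustrated with a small sketch. It assumes only a monotone, submodular set function; the names `greedy_maximize`, `coverage` and the toy graph `NEIGHBOURS` are hypothetical, and the coverage function merely stands in for the influence lower bounds of the talk, which are not reproduced here.

```python
from typing import Callable, FrozenSet, Hashable, Iterable, List


def greedy_maximize(
    candidates: Iterable[Hashable],
    objective: Callable[[FrozenSet[Hashable]], float],
    budget: int,
) -> List[Hashable]:
    """Pick up to `budget` elements, each time adding the largest marginal gain."""
    remaining = list(candidates)
    chosen: List[Hashable] = []
    current_value = objective(frozenset())
    for _ in range(budget):
        best_gain, best_node = 0.0, None
        for node in remaining:
            gain = objective(frozenset(chosen + [node])) - current_value
            if gain > best_gain:
                best_gain, best_node = gain, node
        if best_node is None:  # no candidate improves the objective
            break
        chosen.append(best_node)
        remaining.remove(best_node)
        current_value += best_gain
    return chosen


# Hypothetical toy objective: number of nodes covered by the seeds and their
# one-step neighbours (a coverage function, which is monotone and submodular).
NEIGHBOURS = {
    "a": {"a", "b", "c"},
    "b": {"b", "c"},
    "c": {"c", "d"},
    "d": {"d", "e", "f"},
}


def coverage(seeds: FrozenSet[Hashable]) -> float:
    covered = set()
    for s in seeds:
        covered |= NEIGHBOURS[s]
    return float(len(covered))


seeds = greedy_maximize(NEIGHBOURS, coverage, budget=2)
```

For any monotone submodular objective, the value of the greedy set is at least a 1-1/e fraction of the best value attainable with the same budget; in this toy graph the two greedy seeds happen to cover all six nodes.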
7 April 2017
Tommaso Proietti
University of Rome, Tor Vergata
Optimal linear prediction of stochastic trends  HG G 19.1 
Abstract: A recent strand of the time series literature has considered the problem of estimating high-dimensional autocovariance matrices for the purpose of out-of-sample prediction. For an integrated time series, the Beveridge-Nelson trend is defined as the current value of the series plus the sum of all forecastable future changes. For the optimal linear projection of all future changes onto the space spanned by the past of the series, we need to solve a high-dimensional Toeplitz system involving 𝑛 autocovariances, where 𝑛 is the sample size. The paper proposes a non-parametric estimator of the trend that relies on banding, or tapering, the sample partial autocorrelations by means of a regularized Durbin-Levinson algorithm. We derive the properties of the estimator and compare it with alternative parametric estimators based on the direct and indirect finite-order autoregressive predictors.
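To make the idea of banding the partial autocorrelations concrete, here is a minimal sketch of the classical Durbin-Levinson recursion in which partial autocorrelations beyond a chosen lag are set to zero. The function name `banded_durbin_levinson` and the `band` parameter are assumptions made for illustration; the estimator presented in the talk may use a different taper or normalization.

```python
from typing import List, Sequence, Tuple


def banded_durbin_levinson(
    acov: Sequence[float], band: int
) -> Tuple[List[float], List[float], float]:
    """Durbin-Levinson recursion with partial autocorrelations zeroed beyond `band`.

    acov[k] is the lag-k autocovariance, k = 0..p.
    Returns (partial autocorrelations, order-p predictor coefficients,
    one-step prediction error variance).
    """
    p = len(acov) - 1
    pacf: List[float] = []
    coeffs: List[float] = []  # coefficients of the order-(k-1) predictor
    variance = acov[0]        # prediction error variance of the order-0 predictor
    for k in range(1, p + 1):
        if k <= band:
            # Standard update of the lag-k partial autocorrelation.
            num = acov[k] - sum(coeffs[j] * acov[k - 1 - j] for j in range(k - 1))
            kappa = num / variance
        else:
            kappa = 0.0       # banding: drop partial autocorrelations at long lags
        # Order-k coefficients obtained from the order-(k-1) coefficients.
        coeffs = [coeffs[j] - kappa * coeffs[k - 2 - j] for j in range(k - 1)] + [kappa]
        variance *= 1.0 - kappa * kappa
        pacf.append(kappa)
    return pacf, coeffs, variance


# Toy input: autocovariances of an AR(1) process with coefficient 0.5 and unit
# innovation variance, gamma(k) = 0.5**k / (1 - 0.25).
ar1_acov = [0.5 ** k / 0.75 for k in range(5)]
pacf, coeffs, variance = banded_durbin_levinson(ar1_acov, band=2)
```

With exact AR(1) autocovariances the partial autocorrelations beyond lag 1 are already zero, so banding changes nothing here; with sample autocovariances, banding removes the noisy long-lag terms.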
10 April 2017
Shahar Mendelson
Australian National University, Canberra, and Technion, Haifa
The small-ball method and the structure of random coordinate projections   HG G 19.2 
Abstract: We study the geometry of the natural function class extension of a random projection of a subset of $R^d$: for a class of functions $F$ defined on the probability space $(\Omega,\mu)$ and an iid sample $X_1,\dots,X_N$ with each of the $X_i$'s distributed according to $\mu$, the corresponding coordinate projection of $F$ is the set $\{ (f(X_1),\dots,f(X_N)) : f \in F\} \subset R^N$. We explain how structural information on such random sets can be derived and then used to address various questions in high-dimensional statistics (e.g., regression problems), high-dimensional probability (e.g., the extremal singular values of certain random matrices) and high-dimensional geometry (e.g., Dvoretzky-type theorems). Our focus is on results that are (almost) universally true, with minimal assumptions on the class $F$; these results are established using the recently introduced small-ball method.
10 April 2017
Mladen Kolar
The University of Chicago
Some Recent Advances in Scalable Optimization  HG G 19.2 
Abstract: In this talk, I will present two recent ideas that can help solve large-scale optimization problems. In the first part, I will present a method for solving ell-1 penalized linear and logistic regression problems where data are distributed across many machines. In such a scenario it is computationally expensive to communicate information between machines. Our proposed method requires a small number of rounds of communication to achieve the optimal error bound. Within each round, every machine only communicates a local gradient to the central machine and the central machine solves an ell-1 penalized shifted linear or logistic regression. In the second part, I will discuss the use of sketching as a way to solve linear and logistic regression problems with large sample size and many dimensions. This work is aimed at solving large-scale optimization problems on a single machine, while the extension to a distributed setting is work in progress.
12 May 2017
Walter Distaso
Imperial College
Testing for jump spillovers without testing for jumps  HG G 19.1 
Abstract: The analysis of jump spillovers across assets and markets is fundamental for risk management and portfolio diversification. This paper develops statistical tools for testing conditional independence among the jump components of the quadratic variation, which are measured as the sum of squared jump sizes over a day. To avoid sequential bias distortion, we do not pretest for the presence of jumps. We proceed in two steps. First, we derive the limiting distribution of the infeasible statistic, based on the unobservable jump component. Second, we provide sufficient conditions for the asymptotic equivalence of the feasible statistic based on realized jumps. When the null is true and both assets have jumps, the statistic weakly converges to a Gaussian random variable. When instead at least one asset has no jumps, the statistic approaches zero in probability. We then establish the validity of moon bootstrap critical values. If the null is true and both assets have jumps, both statistics have the same limiting distribution. In the absence of jumps in at least one asset, the bootstrap-based statistic converges to zero at a slower rate. Under the alternative, the bootstrap statistic diverges at a slower rate. Altogether, this means that the use of bootstrap critical values ensures a consistent test with asymptotic size equal to or smaller than alpha. We finally provide an empirical illustration using transactions data on futures and ETFs.
6 June 2017
Yuansi Chen
University of California, Berkeley
CNNs meet real neurons: transfer learning and pattern selectivity in V4.  HG G 26.3 
Abstract: Vision in humans and in non-human primates is mediated by a constellation of hierarchically organized visual areas. Visual cortex area V4, which has highly nonlinear response properties, is a challenging visual cortex area after V1 and V2 on the ventral pathway. We demonstrate that area V4 of the primate visual cortex can be accurately modeled via transfer learning of convolutional neural networks (CNNs). We also find that several different neural network architectures lead to similar predictive performance. This fact, combined with the high dimensionality of the models, makes model interpretation challenging. Hence, we introduce stability-based principles to interpret these models and explain V4 neurons' pattern selectivity.
13 June 2017
Caroline Uhler
Massachusetts Institute of Technology IDSS
Permutation-based causal inference algorithms with interventions  HG G 19.2 
Abstract: A recent breakthrough in genomics makes it possible to perform perturbation experiments at a very large scale. In order to learn gene regulatory networks from the resulting data, efficient and reliable causal inference algorithms are needed that can make use of both observational and interventional data. I will present an algorithm of this type and prove that it is consistent under the faithfulness assumption. This algorithm is based on a greedy permutation search and it is a hybrid approach that uses conditional independence relations in a score-based method. Hence, this algorithm is non-parametric, which makes it useful for analyzing inherently non-Gaussian gene expression data. We will end by analyzing its performance on simulated data, protein signaling data, and single-cell gene expression data.
16 June 2017
Kim Hendrickx
Hasselt University, Hasselt
Current status linear regression  HG G 19.1
Abstract: In statistics one often has to find a method for analyzing data which are only indirectly given. One such situation is when one has "current status data", which only give the information that a certain event has taken place or, on the other hand, has not yet happened. So one observes the "current status" of the matter. We consider a simple linear regression model where the dependent variable is not observed due to current status censoring and where no assumptions are made on the distribution of the unobserved random error terms. For this model, the theoretical performance of the maximum likelihood estimator (MLE), maximizing the likelihood of the data over all possible distribution functions and all possible regression parameters, is still an open problem. We construct $\sqrt{n}$-consistent and asymptotically normal estimates for the finite dimensional regression parameter in the current status linear regression model, which do not require any smoothing device and are based on maximum likelihood estimates (MLEs) of the infinite dimensional parameter. We also construct estimates, again only based on these MLEs, which are arbitrarily close to efficient estimates, if the generalized Fisher information is finite.
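The observation scheme described in this abstract can be written out explicitly. The notation below ($Y_i$, $X_i$, $T_i$, $\Delta_i$, $\beta$, $F$) and the independence of the error from the inspection time and covariate are assumptions introduced for illustration; this is a sketch of the usual current status setup, not necessarily the exact formulation used in the talk.

```latex
% Model: Y_i = \beta X_i + \varepsilon_i, where \varepsilon_i has an unknown distribution function F.
% Observed: (T_i, X_i, \Delta_i), where T_i is the inspection time and
\Delta_i = \mathbf{1}\{Y_i \le T_i\} = \mathbf{1}\{\varepsilon_i \le T_i - \beta X_i\}.
% Assuming \varepsilon_i is independent of (T_i, X_i):
P(\Delta_i = 1 \mid T_i, X_i) = F(T_i - \beta X_i),
% so the log-likelihood to be maximized over (\beta, F) is
\ell_n(\beta, F) = \sum_{i=1}^{n} \Bigl[ \Delta_i \log F(T_i - \beta X_i)
    + (1 - \Delta_i) \log\bigl(1 - F(T_i - \beta X_i)\bigr) \Bigr].
```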
29 June 2017
Victor Chernozhukov
Double/debiased machine learning for treatment and structural parameters  HG G 19.1 
Abstract: We revisit the classic semiparametric problem of inference on a low dimensional parameter $\theta_0$ in the presence of high-dimensional nuisance parameters $\eta_0$. We depart from the classical setting by allowing for $\eta_0$ to be so high-dimensional that the traditional assumptions, such as Donsker properties, that limit complexity of the parameter space for this object break down. To estimate $\eta_0$, we consider the use of statistical or machine learning (ML) methods which are particularly well-suited to estimation in modern, very high-dimensional cases. ML methods perform well by employing regularization to reduce variance and trading off regularization bias with overfitting in practice. However, both regularization bias and overfitting in estimating $\eta_0$ cause a heavy bias in estimators of $\theta_0$ that are obtained by naively plugging ML estimators of $\eta_0$ into estimating equations for $\theta_0$. This bias results in the naive estimator failing to be $N^{-1/2}$-consistent, where $N$ is the sample size. We show that the impact of regularization bias and overfitting on estimation of the parameter of interest $\theta_0$ can be removed by using two simple, yet critical, ingredients: (1) using Neyman-orthogonal moments/scores that have reduced sensitivity with respect to nuisance parameters to estimate $\theta_0$, and (2) making use of cross-fitting which provides an efficient form of data-splitting. We call the resulting set of methods double or debiased ML (DML). We verify that DML delivers point estimators that concentrate in a $N^{-1/2}$-neighborhood of the true parameter values and are approximately unbiased and normally distributed, which allows construction of valid confidence statements.
The generic statistical theory of DML is elementary and simultaneously relies on only weak theoretical requirements which will admit the use of a broad array of modern ML methods for estimating the nuisance parameters such as random forests, lasso, ridge, deep neural nets, boosted trees, and various hybrids and ensembles of these methods. We illustrate the general theory by applying it to provide theoretical properties of DML applied to learn the main regression parameter in a partially linear regression model, DML applied to learn the coefficient on an endogenous variable in a partially linear instrumental variables model, DML applied to learn the average treatment effect and the average treatment effect on the treated under unconfoundedness, and DML applied to learn the local average treatment effect in an instrumental variables setting. In addition to these theoretical applications, we also illustrate the use of DML in three empirical examples.
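The two ingredients named in the abstract, an orthogonal score and cross-fitting, can be sketched for the partially linear regression model. Everything in the block is an illustrative assumption: the function names, the fold assignment by index, the pooling of residuals across folds, and in particular `ols_learner`, a simple least-squares fit that stands in for the ML methods (random forests, lasso, and so on) that would be used in practice.

```python
from typing import Callable, Sequence

Predictor = Callable[[float], float]
Learner = Callable[[Sequence[float], Sequence[float]], Predictor]


def ols_learner(x: Sequence[float], y: Sequence[float]) -> Predictor:
    """Stand-in nuisance learner: least squares of y on (1, x)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return lambda x_new: my + slope * (x_new - mx)


def dml_partially_linear(
    y: Sequence[float],
    d: Sequence[float],
    x: Sequence[float],
    n_folds: int = 2,
    learner: Learner = ols_learner,
) -> float:
    """Cross-fitted partialling-out estimate of theta in y = theta*d + g(x) + noise."""
    n = len(y)
    res_y = [0.0] * n
    res_d = [0.0] * n
    for fold in range(n_folds):
        test = [i for i in range(n) if i % n_folds == fold]
        train = [i for i in range(n) if i % n_folds != fold]
        # Cross-fitting: nuisance functions E[y|x] and E[d|x] are fitted
        # on the other folds only, then evaluated on the held-out fold.
        fit_y = learner([x[i] for i in train], [y[i] for i in train])
        fit_d = learner([x[i] for i in train], [d[i] for i in train])
        for i in test:
            res_y[i] = y[i] - fit_y(x[i])
            res_d[i] = d[i] - fit_d(x[i])
    # Orthogonal (partialling-out) score: regress y-residuals on d-residuals.
    return sum(rd * ry for rd, ry in zip(res_d, res_y)) / sum(rd * rd for rd in res_d)


# Hypothetical noise-free toy data with theta = 2 and g(x) = 3x.
x_toy = [i / 10 for i in range(40)]
v_toy = [((7 * i) % 11 - 5) / 5 for i in range(40)]  # variation in d not explained by x
d_toy = [xi + vi for xi, vi in zip(x_toy, v_toy)]
y_toy = [2 * di + 3 * xi for di, xi in zip(d_toy, x_toy)]
theta_hat = dml_partially_linear(y_toy, d_toy, x_toy)  # close to 2 for this toy data
```

Because the toy data are noise-free and the nuisance functions are linear, the estimate recovers the coefficient up to rounding; with noisy data and flexible learners the same structure applies, but the estimate is only approximately equal to the true value.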

© 2017 Eidgenössische Technische Hochschule Zürich