Research Seminar

Spring Semester 2013

Date / Time Speaker Title Location
22 February 2013
15:15-16:15
Thanh Mai Pham Ngoc
Université de Paris Sud Orsay

Research Seminar in Statistics

Title: Goodness of fit tests for noisy directional data
Speaker, Affiliation: Thanh Mai Pham Ngoc, Université de Paris Sud Orsay
Date, Time: 22 February 2013, 15:15-16:15
Location: HG G 19.1
Abstract: In astrophysics, a burning issue is to understand the behaviour of the so-called Ultra High Energy Cosmic Rays (UHECR). The latter are cosmic rays with an extreme kinetic energy and the rarest particles in the universe. The source of those most energetic particles remains a mystery. Finding out more about the probability law of their incoming directions is crucial to gain insight into the mechanisms generating the UHECR. Astrophysicists have at their disposal directional data, which are measurements of the incoming directions of the UHECR on Earth. Unfortunately, these trajectories are deflected by Galactic and intergalactic fields. A first way to model the deflection of the incoming directions is the following model with random rotations: Z_i = ε_i X_i, i = 1, ..., N. We define a nonparametric test procedure to distinguish H0: "the density f of X_i is the uniform density f0 on the sphere" from H1. We show that an adaptive procedure cannot have a faster rate of separation than ψ_ad(s) = (N / log log N)^(-2s/(2s+2ν+1)), and we provide a procedure which reaches this rate. We illustrate the theory by implementing our test procedure for various kinds of noise on SO(3) and by comparing it to other procedures. Applications to real data in astrophysics and paleomagnetism are provided.
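A minimal numerical sketch of the random-rotation model Z_i = ε_i X_i, assuming an invented noise level; the uniformity check below is a classical Rayleigh-type statistic, not the adaptive procedure of the talk:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 2000

def uniform_sphere(n, rng):
    """Uniform points on S^2 via normalised Gaussians."""
    x = rng.standard_normal((n, 3))
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def random_rotations(n, kappa, rng):
    """Small random rotations: random axis, angle ~ N(0, 1/kappa)."""
    axes = uniform_sphere(n, rng)
    angles = rng.standard_normal(n) / np.sqrt(kappa)
    K = np.zeros((n, 3, 3))                      # cross-product matrices [a]_x
    K[:, 0, 1], K[:, 0, 2] = -axes[:, 2], axes[:, 1]
    K[:, 1, 0], K[:, 1, 2] = axes[:, 2], -axes[:, 0]
    K[:, 2, 0], K[:, 2, 1] = -axes[:, 1], axes[:, 0]
    s = np.sin(angles)[:, None, None]
    c = (1 - np.cos(angles))[:, None, None]
    return np.eye(3) + s * K + c * (K @ K)       # Rodrigues formula

X = uniform_sphere(N, rng)                # H0: uniform incoming directions
eps = random_rotations(N, kappa=50.0, rng=rng)
Z = np.einsum("nij,nj->ni", eps, X)       # observed noisy directions

# Rayleigh-type statistic: 3 N ||mean(Z)||^2 is approximately chi^2_3 under H0.
rayleigh = 3 * N * np.sum(Z.mean(axis=0) ** 2)
```

Since rotations preserve the sphere, Z stays on S^2; under H0 the statistic stays near its chi-squared scale.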
1 March 2013
15:15-16:15
Sébastien Loustau
Université d'Angers, France

Title: Inverse Statistical Learning
Speaker, Affiliation: Sébastien Loustau, Université d'Angers, France
Date, Time: 1 March 2013, 15:15-16:15
Location: HG G 19.1
Abstract: We consider the problem of statistical learning when we observe a contaminated sample. More precisely, we state fast rates of convergence in classification with errors in variables for deconvolution empirical risk minimizers. These rates depend on the ill-posedness, the margin and the complexity of the problem. The cornerstone of the proof is a bias-variance decomposition of the excess risk. After a theoretical study of the problem, we turn to more practical considerations by presenting a new algorithm for noisy finite-dimensional clustering, called noisy K-means.
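The clustering setting can be illustrated on a toy contaminated sample; the sketch below runs plain Lloyd iterations on the noisy observations and omits the deconvolution correction that distinguishes noisy K-means (cluster locations and noise level are invented for the example):

```python
import numpy as np

rng = np.random.default_rng(1)

# Two well-separated planar clusters, observed through additive measurement error.
n = 300
labels = rng.integers(0, 2, n)
centers_true = np.array([[-3.0, 0.0], [3.0, 0.0]])
X = centers_true[labels] + rng.standard_normal((n, 2))   # latent clean sample
Z = X + 0.5 * rng.standard_normal((n, 2))                # contaminated observations

def lloyd(Z, k, iters=50):
    """Plain Lloyd iterations on the noisy sample.
    Deterministic init (points spread along the x-order) for reproducibility."""
    order = np.argsort(Z[:, 0])
    centers = Z[order[np.linspace(0, len(Z) - 1, k).astype(int)]].copy()
    for _ in range(iters):
        d = np.linalg.norm(Z[:, None, :] - centers[None, :, :], axis=2)
        assign = d.argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):
                centers[j] = Z[assign == j].mean(axis=0)
    return centers, assign

centers, assign = lloyd(Z, 2)
```

With strong separation the naive centres land near the truth; the point of the talk is that with heavier contamination a deconvolution step is needed.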
8 March 2013
15:15-16:15
Alexei Onatski
University of Cambridge

Title: Asymptotic Analysis of the Squared Estimation Error in Misspecified Factor Models
Speaker, Affiliation: Alexei Onatski, University of Cambridge
Date, Time: 8 March 2013, 15:15-16:15
Location: HG G 19.1
Abstract: In this paper, we obtain asymptotic approximations to the mean squared error of the least squares estimator of the common component in large approximate factor models with a possibly misspecified number of factors. The approximations are derived under both strong and weak factors asymptotics, assuming that the cross-sectional and temporal dimensions of the data are comparable. We develop consistent estimators of these approximations and propose to use them as new criteria for selection of the number of factors. We show that the estimators of the number of factors that minimize these criteria are asymptotically loss efficient in the sense of Shibata (1980), Li (1987), and Shao (1997).
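A toy illustration of the least squares (principal components) estimator of the common component under a misspecified number of factors. Dimensions, factor strengths and the grid of candidate k are arbitrary, and the talk's loss-efficient selection criteria are not implemented; only the oracle squared error is computed:

```python
import numpy as np

rng = np.random.default_rng(2)
n, T, r = 100, 100, 3                      # cross-section, time, true factor count

Lam = rng.standard_normal((n, r))          # loadings
F = rng.standard_normal((T, r))            # factors
common = Lam @ F.T                         # the object to be recovered
X = common + rng.standard_normal((n, T))   # idiosyncratic noise

def ls_common_component(X, k):
    """Least squares (principal components) estimate with k factors."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k]

# Oracle mean squared error of the common component for each candidate k.
mse = {k: np.mean((ls_common_component(X, k) - common) ** 2) for k in range(1, 7)}
best_k = min(mse, key=mse.get)
```

Under-specifying k drops an entire strong factor, while over-specifying only adds a noise principal component, so the oracle loss is minimised at the true k here.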
* 27 March 2013
15:15-16:15
Patrik Guggenberger
University of California, San Diego

Title: Subset inference in the linear IV model
Speaker, Affiliation: Patrik Guggenberger, University of California, San Diego
Date, Time: 27 March 2013, 15:15-16:15
Location: HG G 19.1
Abstract: In the linear instrumental variables model we are interested in testing a hypothesis on the coefficient of an exogenous variable when one right hand side endogenous variable is present. Under the assumption of conditional homoskedasticity but no restriction on the reduced form coefficient vector, we derive the asymptotic size of the subset Lagrange multiplier (LM) test and provide the nonrandom size corrected (SC) critical value that ensures that the resulting SC subset LM test has correct asymptotic size. We introduce an easy-to-implement generalized moment selection plug-in SC (GMS-PSC) subset LM test that uses a data-dependent critical value. We compare the local power properties of the GMS-PSC subset LM and subset AR test and also provide a Monte Carlo study that compares the finite-sample properties of the two tests. The GMS-PSC is shown to have competitive power properties.
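As background, the (full-vector) Anderson-Rubin statistic mentioned in the abstract can be sketched as follows under homoskedasticity. The data-generating values are invented, and the subset LM test with its size-corrected critical value is not implemented:

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 500, 3
beta0 = 1.0                                 # hypothesised coefficient

Z = rng.standard_normal((n, k))             # instruments
pi = np.array([0.5, 0.3, 0.2])              # first-stage coefficients
v = rng.standard_normal(n)
u = 0.8 * v + 0.6 * rng.standard_normal(n)  # endogeneity: corr(u, v) != 0
Y = Z @ pi + v                              # endogenous regressor
y = Y * beta0 + u                           # H0 holds in this simulation

def ar_stat(y, Y, Z, b0):
    """Anderson-Rubin statistic; F(k, n-k) under H0 with homoskedastic errors."""
    r = y - Y * b0
    n, k = Z.shape
    Pr = Z @ np.linalg.solve(Z.T @ Z, Z.T @ r)   # projection of r onto span(Z)
    rss_p = Pr @ Pr                              # explained sum of squares
    rss_m = r @ r - rss_p                        # residual sum of squares
    return (rss_p / k) / (rss_m / (n - k))

stat = ar_stat(y, Y, Z, beta0)
```

At the true value the statistic behaves like an F(k, n-k) draw, while at a wrong value it picks up the first-stage signal and explodes.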
* 2 April 2013
15:15-16:15
Joris M. Mooij
Radboud University Nijmegen

Title: Cyclic Causal Discovery from Equilibrium Data
Speaker, Affiliation: Joris M. Mooij, Radboud University Nijmegen
Date, Time: 2 April 2013, 15:15-16:15
Location: HG G 19.2
Abstract: Causal feedback loops play important roles in many biological systems. In the absence of time series data, inferring the structure of cyclic causal systems can be extremely challenging. An example of such a biological system is a cellular signalling network that plays an important role in human immune system cells (Sachs et al., Science 2005), consisting of several interacting proteins and phospholipids. The protein concentration data measured by Sachs et al. utilizing flow cytometry have been analyzed by different researchers in order to evaluate various causal inference methods. Most of these methods only consider acyclic causal structures, even though the data shows strong evidence that feedback loops are present. In this talk I will propose a new method for cyclic causal discovery from a combination of observational and interventional equilibrium data. I will show that the method indeed finds evidence for feedback loops in the flow cytometry data and that it gives a more accurate quantitative description of the data at comparable model complexity.
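A linear cyclic structural equation model at equilibrium gives a minimal picture of the setting. The coefficient matrix below is an invented three-node feedback loop, and do()-interventions are implemented by cutting the incoming edges of the intervened node:

```python
import numpy as np

rng = np.random.default_rng(4)

# Linear cyclic SEM x = B x + e with a feedback loop x0 -> x1 -> x2 -> x0.
B = np.array([[0.0,  0.0, 0.3],
              [0.5,  0.0, 0.0],
              [0.0, -0.4, 0.0]])   # spectral radius < 1, so equilibrium exists

def equilibrium_sample(B, n, rng):
    """Observational equilibrium: x = (I - B)^{-1} e."""
    e = rng.standard_normal((n, 3))
    return e @ np.linalg.inv(np.eye(3) - B).T

def do_intervention(B, j, value, n, rng):
    """do(x_j = value): cut incoming edges of node j, clamp it, re-solve."""
    Bint = B.copy()
    Bint[j, :] = 0.0
    e = rng.standard_normal((n, 3))
    e[:, j] = value
    return e @ np.linalg.inv(np.eye(3) - Bint).T

obs = equilibrium_sample(B, 5000, rng)
interv = do_intervention(B, 0, 2.0, 5000, rng)
```

After do(x0 = 2), the clamped node is constant and its downstream neighbour x1 = 0.5 x0 + e1 shifts to mean 1, while the observational sample stays centred at zero.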
19 April 2013
15:15-16:15
Alexander Sokol
University of Copenhagen

Title: Stochastic differential equations as causal models
Speaker, Affiliation: Alexander Sokol, University of Copenhagen
Date, Time: 19 April 2013, 15:15-16:15
Location: HG G 19.1
Abstract: We define a notion of intervention in a stochastic differential equation based on simple substitution in the SDE. We prove that this notion of intervention is the same as can be obtained by making do()-interventions in the Euler scheme for the SDE and taking the limit. We show that when the driving semimartingale is a Lévy process and there are no latent variables, the post-intervention distribution is always identifiable from the observational distribution. We also relate our results to the literature on weak conditional local independence by Gégout-Petit and Commenges.
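The substitution notion of intervention can be illustrated directly on the Euler scheme. The two-dimensional SDE below is an invented example, not one from the talk:

```python
import numpy as np

rng = np.random.default_rng(5)
dt, steps = 0.01, 1000   # Euler step size and number of steps (horizon T = 10)

def euler_x2(intervene_x1=None):
    """Euler scheme for dX1 = -X1 dt + dW1,  dX2 = (X1 - X2) dt + dW2.
    Passing intervene_x1 = c substitutes X1 = c at every step, i.e. the
    do(X1 = c) intervention on the discretised system."""
    x1, x2 = 0.0, 0.0
    for _ in range(steps):
        dW = np.sqrt(dt) * rng.standard_normal(2)
        if intervene_x1 is None:
            x1 = x1 - x1 * dt + dW[0]
        else:
            x1 = intervene_x1        # substitution replaces the X1 update
        x2 = x2 + (x1 - x2) * dt + dW[1]
    return x2

# Under do(X1 = 2), X2 becomes an Ornstein-Uhlenbeck process pulled toward 2.
post = np.array([euler_x2(2.0) for _ in range(300)])
obs = np.array([euler_x2() for _ in range(300)])
```

The post-intervention sample of X2 concentrates around the clamped value, while the observational sample stays centred at zero.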
* 19 April 2013
16:30-17:30
Johanna G. Neslehova and Christian Genest
McGill University, Montréal, Canada

Title: Tests of independence for sparse contingency tables and beyond
Speaker, Affiliation: Johanna G. Neslehova and Christian Genest, McGill University, Montréal, Canada
Date, Time: 19 April 2013, 16:30-17:30
Location: HG G 19.1
Abstract: New statistics are proposed for testing the hypothesis that arbitrary random variables are mutually independent. These tests are consistent and well-behaved for any type of data, even for sparse contingency tables and tables whose dimension depends on the sample size. The statistics are Cramér-von Mises and Kolmogorov-Smirnov type functionals of the empirical checkerboard copula. The asymptotic behavior of the corresponding empirical process will be characterized and illustrated; it will also be shown how replicates from the limiting process can be generated using a multiplier bootstrap procedure. As will be seen through simulations, the new tests are considerably more powerful than those based on the Pearson chi-squared, likelihood ratio, and Zelterman statistics often used in this context.
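A stripped-down Cramér-von Mises-type statistic based on the plain empirical copula (not the checkerboard copula of the talk, and without the multiplier bootstrap for p-values) can be sketched as:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 200
x = rng.standard_normal(n)
y = 0.8 * x + 0.6 * rng.standard_normal(n)    # dependent pair

def cvm_independence(x, y):
    """Cramer-von Mises-type distance between the empirical copula and the
    independence copula, evaluated at the pseudo-observations."""
    n = len(x)
    u = (np.argsort(np.argsort(x)) + 1.0) / (n + 1)   # pseudo-observations
    v = (np.argsort(np.argsort(y)) + 1.0) / (n + 1)
    # empirical copula C_n(u_i, v_i) = (1/n) #{j : u_j <= u_i, v_j <= v_i}
    C = np.mean((u[None, :] <= u[:, None]) & (v[None, :] <= v[:, None]), axis=1)
    return np.sum((C - u * v) ** 2)

s_dep = cvm_independence(x, y)                         # dependent data
s_ind = cvm_independence(x, rng.standard_normal(n))    # independent data
```

The statistic is small under independence and grows when the empirical copula departs from the product copula; calibration in practice requires the bootstrap step omitted here.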
3 May 2013
15:15-16:15
Niels Richard Hansen
University of Copenhagen

Title: Non-parametric estimation of linear filters for point processes
Speaker, Affiliation: Niels Richard Hansen, University of Copenhagen
Date, Time: 3 May 2013, 15:15-16:15
Location: HG G 19.1
Abstract: A main challenge in neuroscience is to model the dynamic activity of the brain and how it responds to external stimuli. We present models of neuron network activity based on multichannel spike data. The models form a class of point process models with spike rates determined through linear filters of the spike histories. The filters are given in terms of filter functions that are estimated non-parametrically as elements in, e.g., a reproducing kernel Hilbert space. We discuss how the models can be used to infer network connectivity and to predict the effects of stimuli (interventions). The methods used are available via the R package ppstat.
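A discrete-time caricature of a spike rate driven by a linear filter of the spike history; the filter shape and baseline are invented, and the non-parametric RKHS estimation of the talk (and the R package ppstat) is not reproduced:

```python
import numpy as np

rng = np.random.default_rng(7)

# Spike probability in bin t is a sigmoid of a linear filter of the history.
T = 5000
h = 0.5 * np.exp(-np.arange(1, 21) / 5.0)    # excitatory filter over 20 bins
mu = -3.0                                     # baseline log-odds

spikes = np.zeros(T)
for t in range(T):
    past = spikes[max(0, t - 20):t][::-1]     # most recent bin first
    drive = mu + np.dot(h[:len(past)], past)  # linear filter of spike history
    p = 1.0 / (1.0 + np.exp(-drive))
    spikes[t] = rng.random() < p

rate = spikes.mean()
```

Each spike transiently raises the firing probability; with this subcritical filter the simulated rate stays moderately above the baseline level.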
* 20 June 2013
15:15-16:15
Andrew B. Nobel
University of North Carolina at Chapel Hill

Title: Large Average Submatrices of a Gaussian Random Matrix: Landscapes and Local Optima.
Speaker, Affiliation: Andrew B. Nobel, University of North Carolina at Chapel Hill
Date, Time: 20 June 2013, 15:15-16:15
Location: HG G 19.1
Abstract: The problem of finding large average submatrices of a real-valued matrix arises in the exploratory analysis of data from a variety of disciplines, ranging from genomics to social sciences. This talk details several new theoretical results concerning the asymptotic behavior of large average submatrices of an n x n Gaussian random matrix. The first result identifies the average and joint distribution of the (globally optimal) k x k submatrix having largest average value. We then turn our attention to submatrices with dominant row and column sums, which arise as the local optima of a useful iterative search procedure for large average submatrices. Paralleling the result for global optima, the second result identifies the average and joint distribution of a typical locally optimal k x k submatrix. The last part of the talk considers the *number* of locally optimal k x k submatrices, L_n(k), beginning with the asymptotic behavior of its mean and variance for fixed k and increasing n. The final result is a Gaussian central limit theorem for L_n(k) that is based on a new variant of Stein's method for normal approximation. Joint work with Shankar Bhamidi and Partha S. Dey.
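The iterative search for locally optimal submatrices alternates between choosing the k rows with largest sums over the current columns and vice versa; a minimal sketch (matrix size, k and the iteration cap are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(8)
n, k = 60, 4
W = rng.standard_normal((n, n))          # Gaussian random matrix

def local_search(W, k, rng, iters=100):
    """Alternate: given k columns, pick the k rows with largest sums over
    those columns, then update the columns symmetrically. A fixed point is
    a k x k submatrix with dominant row and column sums (a local optimum)."""
    cols = rng.choice(W.shape[1], k, replace=False)
    for _ in range(iters):
        rows = np.argsort(W[:, cols].sum(axis=1))[-k:]
        new_cols = np.argsort(W[rows, :].sum(axis=0))[-k:]
        if set(new_cols) == set(cols):   # column set stabilised: stop
            break
        cols = new_cols
    return np.sort(rows), np.sort(cols)

rows, cols = local_search(W, k, rng)
avg = W[np.ix_(rows, cols)].mean()
```

The local optimum found this way has an average well above the overall matrix mean of roughly zero.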
2 July 2013
15:15-16:00
Rajen Shah
Statistical Laboratory, University of Cambridge, UK

Title: Large-scale regression with sparse data
Speaker, Affiliation: Rajen Shah, Statistical Laboratory, University of Cambridge, UK
Date, Time: 2 July 2013, 15:15-16:00
Location: HG G 19.1
Abstract: The “Big Data” era in which we are living has brought with it a combination of statistical and computational challenges that often must be met with approaches that draw on developments from both the fields of statistics and computer science. In this talk I will present a method for performing regression where the n by p design matrix may have both n and p in the millions, but where the design matrix is sparse, that is most of its entries are zero; such sparsity is common in many large-scale applications such as text analysis. In this setting, performing regression using the original data can be computationally infeasible. Instead, we first map the design matrix to an n by L matrix with L << p, using a modified version of a scheme known as b-bit min-wise hashing in computer science. From a statistical perspective, we study the performance of regression using this compressed data, and give finite sample bounds on the prediction error. Interestingly, despite the loss of information through the compression scheme, we will see that ordinary least squares or ridge regression applied to the reduced data can actually allow us to fit a model containing interactions in the original data. This is joint (and ongoing) work with Nicolai Meinshausen.
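A sketch of the compression step: b-bit min-wise hashing of a sparse binary design followed by ridge regression on the compressed matrix. The hash family, dimensions and regularisation are arbitrary choices, and this simplified mapping ignores the refinements of the actual scheme:

```python
import numpy as np

rng = np.random.default_rng(9)
n, p, L, b = 500, 2000, 64, 2

# Sparse binary design: each row activates 20 of the p features.
rows = [np.sort(rng.choice(p, size=20, replace=False)) for _ in range(n)]
beta = np.zeros(p)
beta[:200] = rng.standard_normal(200)          # 200 relevant features
y = np.array([beta[r].sum() for r in rows]) + 0.1 * rng.standard_normal(n)

# b-bit min-wise hashing: for each of L random hash functions keep only the
# lowest b bits of the minimum hash value, one-hot encoded into L * 2^b columns.
M = 2**31 - 1
a = rng.integers(1, M, size=L)
c = rng.integers(0, M, size=L)

S = np.zeros((n, L * 2**b))
for i, r in enumerate(rows):
    h = (a[:, None] * r[None, :] + c[:, None]) % M   # L x 20 hash values
    bits = h.min(axis=1) % (2**b)                    # b-bit min-hash per function
    S[i, np.arange(L) * 2**b + bits] = 1.0

# Ridge regression on the compressed n x (L * 2^b) design.
lam = 1.0
coef = np.linalg.solve(S.T @ S + lam * np.eye(S.shape[1]), S.T @ y)
resid = y - S @ coef
```

Each row of the compressed design has exactly one active entry per hash function, so similar sparse rows receive similar sketches, which is what lets regression on the compressed matrix retain predictive signal.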

Note: events marked with an asterisk (*) take place at a time and/or location different from the usual one. You can also subscribe to the iCal/ics calendar.

Organizers: Peter Bühlmann, Leonhard Held, Hans Rudolf Künsch, Marloes Maathuis, Sara van de Geer, Michael Wolf
