Statistics research seminar

Autumn Semester 2023

Date / Time Speaker Title Location
21 August 2023
16:00-17:00
Cun-Hui Zhang
Rutgers University, USA
Event Details

Research Seminar in Statistics

Title Chi-Squared and Normal Approximations in Large Contingency Tables
Speaker, Affiliation Cun-Hui Zhang, Rutgers University, USA
Date, Time 21 August 2023, 16:00-17:00
Location HG G 26.5
Abstract We provide necessary and sufficient conditions for the chi-squared and normal approximations of Pearson's chi-squared statistics for the test of independence and the goodness-of-fit test, as well as necessary and sufficient conditions for the normal approximation of the likelihood ratio and Hellinger statistics, when the cell probabilities of the multinomial data are in general pattern and the dimension diverges with the sample size. A cross-sample chi-squared statistic for testing independence applies to two-way contingency tables with diverging dimensions. A degrees-of-freedom adjusted chi-squared approximation applies continuously throughout the high-dimensional regime and matches Pearson's chi-squared statistic in both the mean and variance. Specific examples are provided to demonstrate the asymptotic normality of the three types of test statistics when the classical regularity conditions for the chi-squared and normal approximations are violated. Simulation results demonstrate that the chi-squared and normal approximations are more robust for the likelihood ratio and Hellinger statistics, compared with Pearson's chi-squared statistics. This talk is based on joint work with Chong Wu and Yisha Yao.
22 September 2023
15:15-16:15
Zijian Guo
Rutgers University, USA
Event Details

Research Seminar in Statistics

Title Joint talk: Robust Causal Inference with Possibly Invalid Instruments: Post-selection Problems and A Solution Using Searching and Sampling
Speaker, Affiliation Zijian Guo, Rutgers University, USA
Date, Time 22 September 2023, 15:15-16:15
Location HG G 19.1
Abstract Instrumental variable methods are among the most commonly used causal inference approaches to deal with unmeasured confounders in observational studies. The presence of invalid instruments is the primary concern for practical applications, and a fast-growing area of research is inference for the causal effect with possibly invalid instruments. This paper illustrates that the existing confidence intervals may undercover when the valid and invalid instruments are hard to separate in a data-dependent way. To address this, we construct uniformly valid confidence intervals that are robust to the mistakes in separating valid and invalid instruments. We propose to search for a range of treatment effect values that lead to sufficiently many valid instruments. We further devise a novel sampling method, which, together with searching, leads to a more precise confidence interval. Our proposed searching and sampling confidence intervals are uniformly valid and achieve the parametric length under the finite-sample majority and plurality rules. We apply our proposal to examine the effect of education on earnings. The proposed method is implemented in the R package RobustIV, available from CRAN.
29 September 2023
15:15-16:15
Leonardo Egidi
University of Trieste
Event Details

Research Seminar in Statistics

Title Prediction, skepticism, and the Bayes Factory
Speaker, Affiliation Leonardo Egidi, University of Trieste
Date, Time 29 September 2023, 15:15-16:15
Location HG G 19.1
Abstract Nowadays a Bayesian model needs to be reproducible, generative, predictive, robust, computationally scalable, and able to provide sound inferential conclusions. In this wide framework, Bayes factors (BFs) still represent one of the most well-known and commonly adopted tools to perform model selection and hypothesis testing; however, they are usually criticized due to their intrinsic lack of calibration, and they are rarely used to measure the predictive accuracy arising from competing models. We propose two distinct approaches relying on BFs from our most recent research. With regard to prediction, we propose a new algorithmic protocol to transform Bayes factors into measures that evaluate the pure and intrinsic predictive capabilities of models in terms of posterior predictive distributions, by assessing some preliminary theoretical properties (joint work with Ioannis Ntzoufras). Then, regarding the analysis of replication studies (Held, 2020), we follow the stream outlined by Pawel and Held (2022) and propose a skeptical mixture prior which represents the prior of an investigator who is unconvinced by the original findings. Its novelty lies in the fact that it incorporates skepticism while controlling for prior-data conflict (Egidi et al., 2022). Consistency properties of the resulting skeptical BF are provided together with a thorough analysis of the main features of our proposal (joint work with Guido Consonni).
Short Bibliography:
Egidi, L., Pauli, F., & Torelli, N. (2022). Avoiding prior–data conflict in regression models via mixture priors. Canadian Journal of Statistics, 50(2), 491-510.
Held, L. (2020). A new standard for the analysis and design of replication studies. Journal of the Royal Statistical Society Series A: Statistics in Society, 183(2), 431-448.
Pawel, S., & Held, L. (2022). The sceptical Bayes factor for the assessment of replication success. Journal of the Royal Statistical Society Series B: Statistical Methodology, 84(3), 879-911.
30 November 2023
15:15-16:15
Xinwei Shen
ETH Zurich
Event Details

Research Seminar in Statistics

Title Engression: Extrapolation for Nonlinear Regression?
Speaker, Affiliation Xinwei Shen, ETH Zurich
Date, Time 30 November 2023, 15:15-16:15
Location HG G 43
Abstract Extrapolation is crucial in many statistical and machine learning applications, as it is common to encounter test data outside the training support. However, extrapolation is a considerable challenge for nonlinear models. Conventional models typically struggle in this regard: while tree ensembles provide a constant prediction beyond the support, neural network predictions tend to become uncontrollable. This work aims at providing a nonlinear regression methodology whose reliability does not break down immediately at the boundary of the training support. Our primary contribution is a new method called ‘engression’ which, at its core, is a distributional regression technique for pre-additive noise models, where the noise is added to the covariates before applying a nonlinear transformation. Our experimental results indicate that this model is typically suitable for many real data sets. We show that engression can successfully perform extrapolation under some assumptions such as a strictly monotone function class, whereas traditional regression approaches such as least-squares regression and quantile regression fall short under the same assumptions. We establish the advantages of engression over existing approaches in terms of extrapolation, showing that engression consistently provides a meaningful improvement. Our empirical results, from both simulated and real data, validate these findings, highlighting the effectiveness of the engression method. The software implementations of engression are available in both R and Python.
14 December 2023
15:15-16:15
Shuheng Zhou
University of California
Event Details

Research Seminar in Statistics

Title Concentration of measure bounds for matrix-variate data with missing values
Speaker, Affiliation Shuheng Zhou, University of California
Date, Time 14 December 2023, 15:15-16:15
Location HG G 43
Abstract We consider the following data perturbation model, where the covariates incur multiplicative errors. For two random matrices $U$, $X$, we denote by $U \circ X$ the Hadamard or Schur product, which is defined as $(U \circ X)_{ij} = U_{ij} X_{ij}$. In this paper, we study the subgaussian matrix variate model, where we observe the matrix variate data through a random mask $U$: $\mathcal{X} = U \circ X$, with $X = B^{1/2} Z A^{1/2}$. Here $Z$ is a random matrix with independent subgaussian entries, and $U$ is a mask matrix with either zero or positive entries, where $E[U_{ij}] \in [0,1]$ and all entries are mutually independent. Under the assumption of independence between $X$ and $U$, we introduce componentwise unbiased estimators for estimating the covariances $A$ and $B$, and prove concentration of measure bounds in the sense of guaranteeing that the restricted eigenvalue (RE) conditions hold on the unbiased estimator for $B$ when columns of the data matrix are sampled with different rates. We further develop multiple regression methods for estimating the inverse of $B$ and show the statistical rate of convergence. Our results provide insight for sparse recovery for relationships among entities (samples, locations, items) when features (variables, time points, user ratings) are present in the observed data matrix $\mathcal{X}$ with heterogeneous rates. Our proof techniques can certainly be extended to other scenarios. We provide simulation evidence illuminating the theoretical predictions.
