ZüKoSt: Seminar on Applied Statistics

Spring Semester 2017

Date / Time Speaker Title Location
22 February 2017
Stef van Buuren
Utrecht University
Gerko Vink
Utrecht University
A quick tour with the mice package for imputing missing data  HG G 19.1 
Abstract: Nearly all data-analytic procedures in R are designed for complete data and fail if the data contain NAs. Most procedures simply ignore incomplete rows or rely on ad hoc fixes such as replacing each NA with a "best value". However, such fixes may introduce serious bias in the ensuing statistical analysis. Multiple imputation is a principled solution to this problem and is implemented in the R package mice. In this talk we will give a compact overview of the capabilities of mice for R experts, followed by a discussion.
2 March 2017
Ben Marwick
University of Washington, Seattle
Reproducible Research Compendia via R packages  HG G 19.1 
Abstract: "Long considered an axiom of science, the reproducibility of scientific research has recently come under scrutiny after some highly-publicized failures to reproduce results. This has often been linked to the failure of the current model of journal publishing to provide enough details for reviewers to adequately assess the correctness of papers submitted for publication. One early proposal for ameliorating this situation is to bundle the different files that make up a research result into a publicly-available 'compendium'. At the time it was originally proposed, creating a compendium was a complex process. In this talk I show how modern software tools and services have substantially lightened the burden of making compendia. I describe current approaches to making these compendia to accompany journal articles. Several recent projects of varying sizes are briefly presented to show how my colleagues and I are using R and related tools (e.g. version control, continuous integration, containers, repositories) to make compendia for our publications. I explain how these approaches, which we believe to be widely applicable to many types of research work, subvert the constraints of the typical journal article, and improve the efficiency and reproducibility of our research."
6 April 2017
Sebastian Engelke
EPFL Lausanne
Models for extremes on graphs  HG G 19.1 
Abstract: Max-stable processes are suitable models for extreme events that exhibit spatial dependencies. The dependence measure is usually a function of Euclidean distance between two locations. In this talk, we explore two models for extreme events on an underlying graphical structure. Dependence is more complex in this case as it can no longer be explained by classical geostatistical tools. The first model concentrates on river discharges on a network in the upper Danube catchment, where flooding regularly causes huge damage. Inspired by the work by Ver Hoef and Peterson (2010) for non-extreme data, we introduce a max-stable process on the river network that allows flexible modeling of flood events and that enables risk assessment even at locations without a gauging station. The second approach studies conditional independence structures for threshold exceedances, which result in a factorization of the likelihoods of extreme events. This allows for the construction of parsimonious dependence models that respect the underlying graph.
27 April 2017
Marjolein Fokkema
Department of Methods and Statistics, Leiden University, NL
Prediction rule ensembles, or a Japanese gardening approach to random forests  HG G 19.1 
Abstract: Most statistical prediction methods provide a trade-off between accuracy and interpretability. For example, single classification trees may be easy to interpret, but likely provide lower predictive accuracy than many other methods. Random forests, on the other hand, may provide much better accuracy, but are more difficult to interpret, and are sometimes even termed black boxes. Prediction rule ensembles (PREs) aim to strike a balance between accuracy and interpretability. They generally consist of only a small set of prediction rules, which in turn can be depicted as very simple decision trees that are easy to interpret and apply. Friedman and Popescu (2008) proposed an algorithm for deriving PREs, which derives a large initial ensemble of prediction rules from the nodes of CART trees and selects a sparse final ensemble by regularized regression of the outcome variable on the prediction rules. The R package 'pre' takes a similar approach to deriving PREs and offers several additional advantages. For example, it employs an unbiased tree induction algorithm, allows for a random-forest-type approach to deriving prediction rules, and allows for plotting of the final ensemble. In this talk, I will introduce PRE methodology and the package 'pre', illustrate it with examples based on psychological research data, and discuss some future directions.
11 May 2017
Alexandre Pintore
Winton Capital Management
Title T.B.A. HG G 19.2 
18 May 2017
Philip O'Neill
University of Nottingham
Title T.B.A.  HG G 19.1 
Abstract: T.B.A.

Page URL: https://www.math.ethz.ch/sfs/news-and-events/seminar-applied-statistics.html
© 2017 Eidgenössische Technische Hochschule Zürich