S_38 Statistical Tools in Climate Research: Hypothesis Testing

This course introduces the basic ideas behind the art of hypothesis testing. After an introduction to the general formalism and terminology, a few examples that illustrate these ideas will be presented. Pitfalls that may arise when the 'test recipe' is applied without care will also be discussed.

 

Exercises:

There will be time in the afternoon to work on statistical problems related to the course. The problems should be defined and solved by the students. The material provided for defining the problems is a new 100-member ensemble of 20th-century simulations (see Super Ensemble below). All problems should be guided by the overarching question of whether and how the increase in atmospheric greenhouse gas (GHG) concentrations over the past 150 years alters the statistics of the climate relative to a control climate obtained under pre-industrial GHG concentrations. Ideally, the statistics you consider should be related to your PhD topic. For instance, if you are working in the atmospheric department, you may want to consider statistics describing cloud properties or atmospheric variability on, e.g., synoptic or longer time scales.

Please contact me directly before the course to find out how to formulate your problem more precisely and how to obtain the Super Ensemble data needed to address it.

NOTE: The exercises in the afternoon depend on your interests. Two extreme outcomes are possible: a) If nobody signs up, we will skip the exercises. b) If several groups sign up for a series of problems, we may want to combine the results into a paper that answers the overarching question raised above. This may require additional effort after the course.

Super Ensemble:

This is a 100-member ensemble of simulations of the 20th century, covering 1850 to 2005, obtained with the newly released MPI-ESM. Unlike the control run, which is run under pre-industrial conditions, the 20th-century simulations are driven by the observed time evolution of the CO2 concentration and other natural and anthropogenic forcings. For the control run and all ensemble members, data are stored four times per day, so that 100 realizations of the simulated 20th-century state are available at every output date from 1850 to 2005. This time-varying ensemble will allow us to assess changes over the 20th century in more detail than is possible with standard ensembles, whose size is of the order of 5-10 members.
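To give a feeling for the kind of question such an ensemble makes testable, here is a minimal sketch of a two-sided permutation test for a difference in means between the 100 realizations at one output date and a sample from the control run. It is an illustration only, not the method prescribed for the exercises: the function name `permutation_test`, its parameters, and the synthetic Gaussian data are assumptions made for this example, and no Super Ensemble output is used. The test also assumes that the pooled values are exchangeable under the null hypothesis, which need not hold for serially correlated control-run data.

```python
import random
from statistics import fmean


def permutation_test(sample_a, sample_b, n_perm=2000, seed=0):
    """Two-sided permutation test for a difference in means.

    Returns (observed difference, p-value). The function name and
    parameters are hypothetical, chosen for this illustration.
    """
    rng = random.Random(seed)
    observed = fmean(sample_a) - fmean(sample_b)
    pooled = list(sample_a) + list(sample_b)
    n_a = len(sample_a)
    count = 0
    for _ in range(n_perm):
        # Under the null hypothesis the group labels carry no information,
        # so reshuffling them yields the null distribution of the statistic.
        rng.shuffle(pooled)
        diff = fmean(pooled[:n_a]) - fmean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            count += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (count + 1) / (n_perm + 1)
    return observed, p_value


# Synthetic stand-ins (assumption): NOT actual model output.
data_rng = random.Random(42)
control = [data_rng.gauss(0.0, 1.0) for _ in range(100)]   # "control climate"
ensemble = [data_rng.gauss(1.0, 1.0) for _ in range(100)]  # "forced climate"

diff, p = permutation_test(ensemble, control)
print(f"difference in means: {diff:.2f}, p-value: {p:.4f}")
```

With real data, `ensemble` would hold the values of one statistic across the 100 members at a fixed output date, and `control` a suitably subsampled set of values from the control run. Choosing that subsample so that its values are approximately independent is one of the places where the 'test recipe' needs care.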