One-Day Short Course on
Bayesian Modeling, Inference and Prediction
Presenter: David Draper
Department of Applied Mathematics and Statistics
University of California, Santa Cruz
Fri 10 Dec 2004, 8am-5.30pm
(7 hours of material covered in a 9.5-hour time slot, with a 1-hour
check-in and coffee period first thing, 15-20 minutes of breaks in each of
the morning and afternoon sessions, and a one-hour break for lunch)
Location: Hotel@MIT
20 Sidney Street, Cambridge, Massachusetts USA 02139
Telephone: 617.577.0200
www.hotelatmit.com/home/home.html
Sponsored by the Boston Chapter
of the American Statistical Association (ASA)
Initial registration deadline:
--> Tuesday 23 Nov 2004 (please see below for details)
Summary of Short Course Contents
This is an award-winning short course on Bayesian modeling, inference and
prediction, based on a series of case studies and assuming no previous
exposure to Bayesian ideas or methods.
Topics will include a review of classical, frequentist, and Bayesian
definitions of probability; sequential learning via Bayes' Theorem;
coherence as a form of internal calibration; Bayesian decision theory via
maximization of expected utility; review of frequentist modeling and
maximum-likelihood inference; exchangeability as a Bayesian concept
parallel to frequentist independence; prior, posterior, and predictive
distributions; Bayesian conjugate analysis of binary outcomes, and
comparison with frequentist modeling; integer-valued outcomes (Poisson
modeling); continuous outcomes (Gaussian modeling); multivariate unknowns
and marginal posterior distributions; introduction to simulation-based
computation, including rejection sampling and Markov chain Monte Carlo
(MCMC) methods; MCMC implementation strategies; introduction to Bayesian
hierarchical modeling; fitting and interpreting fixed- and random-effects
Poisson regression models; hierarchical modeling with latent variables as
an approach to mixture modeling; Bayesian model specification via
out-of-sample predictive validation (as a form of external calibration)
and the deviance information criterion (DIC).
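
For readers who would like a concrete feel for "Bayesian conjugate analysis
of binary outcomes" before the course, here is a minimal sketch in Python
(not taken from the course materials). It assumes a Beta prior on an unknown
success probability and exchangeable binary outcomes; the class name Beta
and all of the counts are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Beta:
    """Beta(alpha, beta) distribution for an unknown success probability."""
    alpha: float
    beta: float

    def mean(self) -> float:
        return self.alpha / (self.alpha + self.beta)

    def update(self, successes: int, failures: int) -> "Beta":
        """Conjugate update: add the observed counts to the parameters."""
        return Beta(self.alpha + successes, self.beta + failures)


# Hypothetical data: 3 successes and 17 failures, starting from a uniform prior.
prior = Beta(alpha=1.0, beta=1.0)
posterior = prior.update(successes=3, failures=17)

# Sequential learning: processing the same data in two batches
# leads to the same posterior as processing it all at once.
in_two_steps = prior.update(1, 9).update(2, 8)

# In this model the predictive probability that the next outcome
# is a success equals the posterior mean.
predictive_next_success = posterior.mean()
print(posterior, in_two_steps, predictive_next_success)
```

The sketch keeps the prior, posterior, and predictive quantities as separate
names so that each can be inspected on its own.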
The case studies will be drawn from medicine (diagnostic screening for
HIV; hospital-specific prediction of patient-level mortality rates;
hospital length of stay for premature births; a randomized controlled
trial of in-home geriatric assessment) and the physical sciences
(measurement of physical constants), but the methods illustrated will
apply to a broad range of subject areas in the natural and social
sciences, business (including topics of direct relevance to pharmaceutical
companies), and public policy.
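
The diagnostic-screening case study rests on Bayes' Theorem. A minimal
Python sketch of the calculation follows; the function name and the
prevalence, sensitivity, and specificity values are hypothetical and are
not the figures used in the course.

```python
def posterior_positive(prevalence: float, sensitivity: float,
                       specificity: float) -> float:
    """P(condition present | positive test result), by Bayes' Theorem."""
    true_positive = sensitivity * prevalence
    false_positive = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive / (true_positive + false_positive)


# Hypothetical inputs, for illustration only.
after_one_test = posterior_positive(prevalence=0.01,
                                    sensitivity=0.99,
                                    specificity=0.95)

# Sequential learning: yesterday's posterior becomes today's prior.
# This step ASSUMES the second test is conditionally independent of the first.
after_two_tests = posterior_positive(prevalence=after_one_test,
                                     sensitivity=0.99,
                                     specificity=0.95)
print(after_one_test, after_two_tests)
```

With a rare condition, most positive results from a single screening test
can be false positives even when the test itself is quite accurate, which is
why the prior (the prevalence) matters.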
The course will liberally illustrate user-friendly implementations of MCMC
sampling via the freeware program WinBUGS.
The course is intended mainly for people who often use statistics in their
research; some graduate coursework in statistics will provide sufficient
mathematical background for participants. To get the most out of the
course, participants should be comfortable with hearing the course
presenter mention (at least briefly) (a) differentiation and integration
of functions of several variables and (b) discrete and continuous
probability distributions (joint, marginal, and conditional) for several
variables at a time, but all necessary concepts will be approached in a
sufficiently intuitive manner that rustiness on these topics will not
prevent understanding of the key ideas.
Registration Fee: $ 95 for full-time students
                  $145 for non-student members of the Boston ASA chapter
                  $195 for all other participants
(The registration fee includes extensive materials [see below], lunch, and
refreshments for AM and PM breaks. As a point of reference, the LearnSTAT
program run by the national office of the American Statistical Association
charges $500 for ASA members and $600 for non-members for one-day courses
like this one, with no special fee for students.)
Participants will be provided with 225-250 pages of materials (essentially
they will receive a copy of the draft book the short course presenter is
writing on this topic), including detailed computer sessions with (a) a
leading freeware statistical computing package (R); (b) one of the two
most widely used symbolic computing packages (Maple); and (c) WinBUGS.
Initial registration deadline: Tuesday 23 Nov 2004
If not enough people have paid their registration fees by this date, the
course may need to be postponed (in that case people who have registered
by 23 Nov will have their checks returned by mail). Assuming that the
course does go ahead, as is highly likely, registration will remain open
until the day of the short course, as long as there is still room for
additional participants, but
--> to ensure yourself a place, please register early.
Registration: Please send a check payable to
"Boston Chapter of the ASA" or "BCASA"
to
BCASA, c/o Michael Posner, Treasurer
313 Summit Ave #3, Brighton, MA 02135
Please include your name, address, phone number, and e-mail with your
check.
Additional information may be found online at
www.amstat.org/chapters/boston/
The web page for the short course is
www.ams.ucsc.edu/~draper/Boston2004.html
Brief Biography of Instructor
David Draper is a Professor in, and Chair of, the Department of Applied
Mathematics and Statistics in the Baskin School of Engineering at the
University of California, Santa Cruz. From 2001 to 2003 he served in turn as
President-Elect, President, and Past President of the International
Society for Bayesian Analysis (ISBA). His research is in the areas of
Bayesian inference and prediction, model uncertainty and empirical
model-building, hierarchical modeling, Markov chain Monte Carlo methods,
and Bayesian semi-parametric methods, with applications mainly in health
policy, education, and environmental risk assessment. An earlier version
of this short course, given at the Anaheim Joint Statistical Meetings (JSM)
in 1997, received the 1998 ASA Excellence in Continuing Education award,
and a short course he gave on intermediate- and advanced-level topics in
Bayesian hierarchical modeling at the San Francisco JSM in 2003 received
the 2004 ASA Excellence in Continuing Education award. He has won or been
nominated for major teaching awards
everywhere he has taught (the University of Chicago; the RAND Graduate
School of Public Policy Studies; the University of California, Los
Angeles; the University of Bath (UK); and the University of California,
Santa Cruz). He has a particular interest in the exposition of complex
statistical methods and ideas in the context of real-world applications.
Approximate Structure of the Short Course
8.00-9.00am: Check-in and coffee
9.00-9.30am: Quantification of uncertainty. Classical, frequentist, and
Bayesian definitions of probability. Subjectivity and objectivity.
Sequential learning; Bayes' Theorem. Inference (science) and
decision-making (policy and business). Bayesian decision theory;
coherence. Maximization of expected utility. Case Study: Diagnostic
screening for HIV.
9.30-11.00am: Exchangeability and conjugate modeling. Probability as
quantification of uncertainty about observables. Binary outcomes. Review
of frequentist modeling and maximum-likelihood inference. Exchangeability
as a Bayesian concept parallel to frequentist independence. Prior,
posterior, and predictive distributions. Inference and prediction.
Coherence and calibration. Conjugate analysis. Comparison with frequentist
modeling. Case Study: Hospital-specific prediction of patient-level
mortality rates.
11.00-11.15am: Coffee break
11.15am-noon: Integer-valued outcomes; Poisson modeling. Case Study:
Hospital length of stay for birth of premature babies.
noon-12.30pm: Continuous outcomes; Gaussian modeling. Multivariate
unknowns; marginal posterior distributions. Case Study: Measurement of
physical constants (NB10).
12.30-1.30pm: Lunch break
1.30-3.30pm: Simulation-based computation. IID sampling; rejection
sampling. Introduction to Markov chain Monte Carlo (MCMC) methods: the
Metropolis-Hastings algorithm and Gibbs sampling. User-friendly
implementation of Gibbs and Metropolis-Hastings sampling via BUGS and
WinBUGS. MCMC implementation strategies. Case Study: the NB10 data
revisited.
3.30-3.45pm: Coffee break
3.45-4.40pm: Hierarchical models: formulation, selection, and diagnostics.
Poisson fixed-effects modeling. Additive and multiplicative treatment
effects. Expansion of a simple model that does not satisfy all diagnostic
checks, by embedding it in a richer class of models of which it's a
special case. Random-effects Poisson regression: hierarchical modeling
with latent variables as an approach to mixture modeling. Case Study: a
randomized controlled trial of in-home geriatric assessment (IHGA).
4.40-4.45pm: Get-up-and-move-around break
4.45-5.30pm: Bayesian model specification. Predictive diagnostics. Model
selection as a decision problem. Bayesian cross-validation as an approach
to diagnostics: comparing outcomes from omitted cases with their
predictive distributions given the rest of the data. 3CV: 3-way
cross-validation. The log score as a model-selection method, and its
relationship to the deviance information criterion (DIC). Case Study:
continuation of IHGA example.
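
The idea of comparing each omitted case with its predictive distribution
given the rest of the data can be sketched in a few lines of Python. This is
one simple leave-one-out version of a log score, assuming a Poisson model
for counts with a conjugate Gamma(shape, rate) prior on the mean; it is not
necessarily the definition or the 3CV procedure used in the course. The
function names, the prior settings, and the counts are all hypothetical.

```python
import math


def log_predictive(y_new: int, shape: float, rate: float) -> float:
    """Log predictive probability of the count y_new when the Poisson mean
    has a Gamma(shape, rate) distribution (a negative binomial pmf)."""
    return (math.lgamma(shape + y_new) - math.lgamma(shape)
            - math.lgamma(y_new + 1)
            + shape * math.log(rate / (rate + 1.0))
            - y_new * math.log(rate + 1.0))


def loo_log_score(counts, prior_shape=1.0, prior_rate=1.0) -> float:
    """Average log predictive probability of each case given all the others."""
    total, n = sum(counts), len(counts)
    scores = []
    for y in counts:
        # Conjugate posterior from the data with case y left out.
        shape = prior_shape + (total - y)
        rate = prior_rate + (n - 1)
        scores.append(log_predictive(y, shape, rate))
    return sum(scores) / n


# Hypothetical counts, for illustration only.
counts = [0, 2, 1, 4, 0, 3, 1, 2]
score = loo_log_score(counts)
print(score)
```

To compare models in this way, one would compute the same score under each
candidate model on the same data; larger (less negative) values indicate
better out-of-sample predictive performance.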
(yes, this looks like a lot to cover in a single day :-), but I've given
this course a number of times to more than 600 participants, and almost
everybody seems happy with how things go)