2.5-Day Short Course on
Practical Bayesian Non-Parametric and Semi-Parametric Modeling

Presenters: David Draper and Thanasis Kottas
Department of Applied Mathematics and Statistics
University of California, Santa Cruz

Wed-Fri 15-17 June 2005
(12.5 hours of material spread out over 2 1/2 days)

Location: Brigham Young University (Provo, Utah USA)

Keynote event of the 30th Annual Summer Institute of Applied Statistics
Sponsored by the Department of Statistics, Brigham Young University

For more details please see
http://statweb.byu.edu/summerinstitute/index.php

Schedule - Summer Institute

Check-In: Wednesday, June 15, 2005 at 8:30am
Room 200 TMCB (Talmage Building)
Brigham Young University, Provo, UT

Lecture Workshop Sessions will be held in room 1170 TMCB all day Wednesday, all day Thursday, and Friday morning. Thursday evening the traditional BYU cookout will take place. The sessions will conclude with a luncheon on Friday. Cost of the cookout and Friday luncheon is included in the registration fee.

Nearest airport: Salt Lake City, Utah, USA

Rates/Registration

Advance registration is requested (and will save you money).

  Academic Registration BY May 20, 2005          US$450
  Academic Registration AFTER May 20, 2005       $600
  Non-Academic Registration BY May 20, 2005      $700
  Non-Academic Registration AFTER May 20, 2005   $850

Electronic registration is available at
http://statweb.byu.edu/summerinstitute/index.php

For CES and student rates, and any other information about the Summer Institute, please get in touch with

  Kathi Carter
  Department of Statistics
  230 TMCB
  Brigham Young University
  Provo, UT 84602 USA
  email: kathi_carter@byu.edu
  Tel: +1-801-422-4506
  Fax: +1-801-422-0635

Short bios of the presenters

David Draper is Professor in, and Chair of, the Department of Applied Mathematics and Statistics in the Baskin School of Engineering at the University of California, Santa Cruz.
From 2001 to 2003 he served as President-Elect, President, and Past President of the International Society for Bayesian Analysis (ISBA). His research is in the areas of Bayesian inference and prediction, model uncertainty and empirical model-building, hierarchical modeling, Markov Chain Monte Carlo methods, and Bayesian non-parametric and semi-parametric methods, with applications mainly in medicine, health policy, education, and environmental risk assessment. He has a particular interest in the exposition of complex statistical methods and ideas in the context of real-world applications.

Thanasis Kottas is Assistant Professor in the Department of Applied Mathematics and Statistics, Baskin School of Engineering, University of California, Santa Cruz. His research is in the areas of Bayesian non-parametric modeling and inference, mixture models, semi-parametric regression, spatial statistics, and survival analysis. He is interested in the application of Bayesian non-parametric methods in various fields, including econometrics, epidemiology, and population dynamics.

Short summary of course content

Parametric Bayesian statistical modeling -- based typically on (a) the specification of prior distributions on numbers, vectors, and matrices arising in parametric likelihood functions and (b) the use of Bayes' Theorem to update these prior distributions in light of new data -- has gained tremendously in scope, power, and application over the past 15 years with the increasing ease of use of Markov Chain Monte Carlo (MCMC) algorithms. However, to achieve its widest applicability the Bayesian paradigm also has to be able to work with distributions on functions: placing priors on cumulative distribution functions (or densities) and smooth regression surfaces allows the Bayesian approach to adapt flexibly to virtually any data-generating mechanism, not just those that may be indexed parametrically.
This -- working with probability distributions on functions -- is the task of Bayesian non-parametric (BNP) and Bayesian semi-parametric (BSP) modeling. In this 2.5-day course the presenters will briefly review parametric Bayesian modeling, motivate the need for BNP/BSP modeling, and cover a wide variety of contemporary techniques (including Dirichlet processes, Polya trees, and Gaussian processes) for working with distributions on functions. The material will be presented in an intuitive fashion in the context of a series of case studies, and sufficient MCMC implementation details will be given to permit participants to do their own BNP/BSP modeling.

Course prerequisites

It will be assumed that participants have some familiarity with parametric Bayesian modeling. Background equivalent to a Master's degree in statistics will provide sufficient preparation for the course. Placing distributions on functions is an application of the theory of stochastic processes, so one or more courses in that subject would be helpful preparation (but not required): all necessary ideas in the course will be presented in a self-contained fashion.

Tentative schedule

Wed 15 June 2005

8.30-9.00am    Registration and check-in

9.00-10.15am   First morning session (DD)
               Brief review of parametric Bayesian modeling and its strengths and weaknesses. The need for Bayesian non-parametric (BNP) and Bayesian semi-parametric (BSP) modeling.

10.15-10.45am  Break

10.45am-noon   Second morning session (DD)
               How BNP/BSP arises naturally from exchangeability considerations and a desire to specify Bayesian models in a coherent manner. Low-technology BNP via Dirichlet-multinomial modeling (a generalization of the Bayesian bootstrap).
noon-1.30pm    Lunch

1.30-2.45pm    First afternoon session (DD)
               General approaches for construction of BNP priors: exchangeability, partitioning (Polya trees), neutral-to-the-right priors, expansion of finite-dimensional parametric models.

2.45-3.15pm    Break

3.15-4.45pm    Second afternoon session (TK)
               Dirichlet process (DP) priors and mixtures of DP priors: definitions, properties, and methods. Applications to Bayesian bio-assay (dose-response) modeling.

Thu 16 June 2005

9.00-10.15am   First morning session (TK)
               Dirichlet process mixture models: definitions, examples, methods for posterior inference and prediction.

10.15-10.45am  Break

10.45am-noon   Second morning session (TK)
               Applications of DP mixture models: density estimation, nonparametric quantile regression, hierarchical generalized linear models, multivariate ordinal data analysis, survival analysis.

noon-1.30pm    Lunch

1.30-3.00pm    First afternoon session (TK)
               Extensions of DP priors and DP mixture models: dependent DP priors, spatial DP prior models. Illustrations with spatial disease incidence data.

3.00-3.30pm    Break

3.30-4.45pm    Second afternoon session (DD)
               Polya tree (PT) priors and mixtures of PT priors: definitions, properties, and methods. A case study of BNP: using Polya trees for risk assessment in nuclear waste disposal.

6.30pm         Barbecue dinner

Fri 17 June 2005

9.00-10.15am   First morning session (TK)
               BNP regression and classification using Gaussian process priors. Illustrations with population dynamics data.

10.15-10.45am  Break

10.45-11.30am  Second morning session (DD, TK)
               Overview and wrap-up: strengths and limitations of BNP/BSP.

noon-2pm       Closing luncheon