David Draper - Writings and Creative Activities

CV (current as of 17 October 2005; 41 pages) is available here.

Personal statement (current as of 17 October 2005; 40 pages) is available here.

Table of Contents of This Page

I. Research Contributions

II. Grant and Contract Support (Research and Teaching Proposals)

III. Dissertations Supervised

IV. Teaching Contributions

V. Administrative Contributions

I. Research Contributions

(Citation data (from the Web of Science, and available only for articles) were current as of October 2005; self-citations were removed. Summary across writings where citation data are available: 2,223 total citations; mean 79, median 56; 3 articles with more than 200 citations, 5 with more than 150, 11 with more than 100, 15 with more than 50. For comparison, a random sample of Web of Science entries indicates that the median number of citations of a paper in a leading statistics journal 10 years after publication is 4 (95% interval, 0-12).)

75. (monograph, in progress) Draper D. Bayesian Modeling, Inference and Prediction (draft 6; 305 pages). Contract offered. (I am about 65% finished with this 400-page book, which uses many case studies and mixes theoretical and methodological ideas with symbolic and numerical computing in Maple and R to create a graduate-level introduction to Bayesian modeling.)

74. (monograph, in progress) Draper D. Bayesian Hierarchical Modeling (draft 7; 183 pages). Contract offered. (I am about 35% finished with this 450-page book, which is meant to be a definitive monograph on the subject, illustrating the standard methodologies (and some new ones) with many case studies. I have made two chapters of this book, together with associated software, available for free on the web; to date more than 3,000 people from at least 19 countries have downloaded this material. When finished this book will be a sequel to Bayesian Modeling, Inference and Prediction.)

73. (textbook, in progress) Draper D. Thinking About Uncertainty: An Introduction to Probability and Statistics (draft 10; 253 pages). Contract offered. (This is an introductory text, mainly on statistics, for a lower-division undergraduate audience like the students in AMS 5 at UCSC; I'm about 60% finished with it. To date I've taught more than 1,600 students from the partially-finished book at 6 universities in the US and UK.)

72. (article, in progress) Krnjajic M, Kottas A, Draper D. Parametric and nonparametric Bayesian model specification: a case study (22 pages). (In this paper, which is about 75% finished, we undertake a simulation study to explore the ability of Bayesian parametric and nonparametric models to provide an adequate fit to count data, of the type that would routinely be analyzed parametrically either through fixed-effects or random-effects Poisson models. The context of the study is a randomized controlled trial with two groups (treatment and control). Our nonparametric approach utilizes several modeling formulations based on Dirichlet process (DP) mixture and mixtures of DP priors. We find that the nonparametric models are able to flexibly adapt to the data, to offer rich posterior inference, and to provide, in a variety of settings, more accurate predictive inference than parametric models.)

71. (article) Draper D, Krnjajic M. Bayesian model specification (submitted; 30 pages). (A standard (data-analytic) approach to statistical model specification, practiced with equal vigor in both Bayesian and non-Bayesian approaches to model-building, involves the initial choice, for the structure of the model, of one or another of a variety of standard parametric families, followed by modification of this initial choice - once data begin to arrive - if the data suggest deficiencies in the original specification. In this paper (a) we argue that this approach is formally incoherent, because it amounts to using the data both to specify the prior distribution on structure space and to update using this data-determined prior; (b) we identify two approaches to avoiding (at least in principle, and with a fair amount of data) the incoherence in (a): (1) Bayesian semi-parametric modeling and (2) three-way out-of-sample predictive validation; (c) we provide details on implementing (2); (d) we argue that to make progress in coherent Bayesian model specification in complicated problems You (the modeler) have to either implicitly or explicitly choose a utility structure which defines, for You, when the model currently being examined is "good enough"; (e) we argue that it is best to make this choice explicitly on the basis of real-world considerations regarding the use to which the model will be put; and (f) we contrast model selection methods based on the log score and on the deviance information criterion (DIC) as two examples of (e) with utilities governed by predictive accuracy.)
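(Illustrative sketch, not taken from the paper: one plausible reading of a three-way out-of-sample scheme is to divide the data at random into modeling, validation, and calibration subsets, and to score a candidate model by its average log predictive density on held-out observations. The split fractions, the conjugate Gamma-Poisson model, the prior values a = b = 1, the toy counts, and all function names below are assumptions made only for this example.)

```python
import math
import random


def three_way_split(data, fractions=(0.5, 0.25, 0.25), seed=0):
    """Randomly divide data into modeling, validation, and calibration subsets."""
    rng = random.Random(seed)
    shuffled = list(data)
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_model = int(fractions[0] * n)
    n_valid = int(fractions[1] * n)
    return (shuffled[:n_model],
            shuffled[n_model:n_model + n_valid],
            shuffled[n_model + n_valid:])


def poisson_gamma_log_predictive(y_new, y_train, a=1.0, b=1.0):
    """Log posterior predictive mass of y_new under a Poisson likelihood
    with a Gamma(a, b) prior on the rate (a negative binomial distribution)."""
    a_post = a + sum(y_train)
    b_post = b + len(y_train)
    return (math.lgamma(a_post + y_new) - math.lgamma(a_post)
            - math.lgamma(y_new + 1)
            + a_post * math.log(b_post / (b_post + 1.0))
            - y_new * math.log(b_post + 1.0))


def log_score(y_train, y_test, a=1.0, b=1.0):
    """Average log predictive mass over held-out observations (higher is better)."""
    return sum(poisson_gamma_log_predictive(y, y_train, a, b)
               for y in y_test) / len(y_test)


# Hypothetical count data, invented for illustration only.
counts = [0, 1, 1, 2, 0, 3, 1, 0, 2, 1, 4, 0, 1, 2, 1, 0]
modeling, validation, calibration = three_way_split(counts)

# Explore models on the modeling subset and check them on the validation subset ...
validation_score = log_score(modeling, validation)
# ... then refit on both and report a final check on the untouched calibration subset.
calibration_score = log_score(modeling + validation, calibration)
```

Because the calibration subset plays no part in choosing the model, its log score is not affected by the data having already been used to shape the specification.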

70. (article) Draper D, Toland JF. Nonparametric prior specification: A case study (in preparation; 42 pages). (Shows how to use techniques from functional analysis to compute bounds on Bayes factors in an infinite-dimensional class of prior distributions, as a way to deal more realistically with uncertainty in the process of specifying priors. Other people have dealt in the past with unimodality as a qualitative prior constraint, using Khintchine's characterization of unimodal distributions as mixtures of uniforms; in this paper we use quite different methods to deal with monotonicity and convexity constraints.)

69. (article) Draper D. On the relationship between model uncertainty and inferential/predictive uncertainty (submitted; 10 pages). (Demonstrates that increasing the uncertainty in the modeling process by expanding a model hierarchically can lead either to an increase or a decrease in uncertainty about quantities of direct inferential or predictive interest.)

68. (article) Fouskakis D, Draper D. Stochastic optimization methods for cost-effective quality assessment in health (submitted; 53 pages). (Uses Bayesian decision theory to solve the general problem of variable selection in generalized linear models subject to a data collection cost constraint on the predictor variables. The particular case study in which this methodology is developed involves the creation of a cost-effective scale for measuring sickness at admission for hospital patients. We use simulated annealing (SA), genetic algorithms (GA), and tabu search (TS) to find (near-)optimal subsets of predictor variables; the optimization is of a real-valued function of p binary inputs, and in our largest application the space of input vectors over which we search has 10**25 elements. We use simulation methods to explore a wide variety of user-defined input settings for the optimization methods we examine, without tuning these methods specifically to the structure of our utility-maximization problem, and we also create a context-specific version (ISA) of simulated annealing (the optimization method whose generic implementation performed most poorly) and document the improvement over its generic counterpart. We find in our optimization problem that (a) when p is modest (i) genetic algorithms performed relatively poorly for all but the very best user-defined input configurations, and generic simulated annealing also did not perform well, whereas (ii) tabu search had excellent median performance and was much less sensitive to suboptimal choice of user-defined inputs; and (b) for large p the best versions of GA and ISA outperformed TS and generic SA. Our results are phrased in the language of health policy but apply with equal force to other quality assessment settings with dichotomous outcomes, such as the examination of drop-out rates in education, the study of retention rates in the workplace and the creation of cost-effective credit scores in business. 
This work (1) provides a relatively new perspective on variable selection in generalized linear models, (2) offers new insights into the comparative advantages and flaws of competing stochastic optimization methods, and (3) produces results of direct potential use in quality assessment in health policy and other fields.)
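(Illustrative sketch, not the authors' implementation: a generic simulated annealing search over p binary inclusion indicators. The additive benefit-minus-cost utility, the numerical values, the geometric cooling schedule, and the function names are all assumptions for this example; the utility in the paper is based on predictive performance and is not additive.)

```python
import math
import random


def utility(subset, benefit, cost):
    """Toy utility: total benefit minus total data-collection cost
    of the variables whose inclusion indicator is 1."""
    return sum(b - c for keep, b, c in zip(subset, benefit, cost) if keep)


def simulated_annealing(benefit, cost, n_iter=5000, t0=1.0, cooling=0.995, seed=1):
    """Maximize the toy utility over binary vectors using single-bit flips."""
    rng = random.Random(seed)
    p = len(benefit)
    current = [0] * p                      # start from the empty subset
    current_u = utility(current, benefit, cost)
    best, best_u = current[:], current_u
    temperature = t0
    for _ in range(n_iter):
        j = rng.randrange(p)               # propose flipping one indicator
        proposal = current[:]
        proposal[j] = 1 - proposal[j]
        proposal_u = utility(proposal, benefit, cost)
        delta = proposal_u - current_u
        # Always accept improvements; accept worse moves with a probability
        # that shrinks as the temperature falls.
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, current_u = proposal, proposal_u
            if current_u > best_u:
                best, best_u = current[:], current_u
        temperature *= cooling             # geometric cooling schedule
    return best, best_u


# Hypothetical per-variable benefits and costs for p = 8 predictors.
benefit = [0.9, 0.2, 0.7, 0.1, 0.5, 0.3, 0.8, 0.4]
cost = [0.3, 0.4, 0.2, 0.5, 0.6, 0.1, 0.9, 0.2]
best_subset, best_u = simulated_annealing(benefit, cost)
```

With an additive utility the best subset can be read off directly (keep every variable whose benefit exceeds its cost), which makes the toy problem a convenient correctness check. Stochastic search earns its keep only when the utility has no such structure and the 2**p subsets cannot be enumerated.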

67. (invited discussion) Draper D (2006). Coherence and calibration: comments on subjectivity and ``objectivity'' in Bayesian analysis. Discussion of The case for objective Bayesian analysis by J Berger and Subjective Bayesian analysis: principles and practice by M Goldstein, Bayesian Analysis, forthcoming (5 pages). (Examines the crucial role of both coherence and calibration in Bayesian analysis, and argues (a) that all Bayesian work is inherently subjective but that (b) ``objective'' prior distributions play a valuable role in achieving good calibration when (in your judgment) the past and future are exchangeable.)

66. (monograph chapter) Draper D (2006). Bayesian multilevel analysis and MCMC. Chapter 3 in Handbook of Quantitative Multilevel Analysis (de Leeuw J, editor), New York: Springer, forthcoming (59 pages). (My goal in writing this chapter was to produce a definitive introduction to the Bayesian paradigm and how it is applied in contemporary statistical work to the analysis of multilevel, or hierarchical, models, using Markov chain Monte Carlo methods as the basis of computation.)

65. (invited discussion) Draper D (2005). Discussion of ``Local model uncertainty and incomplete-data bias,'' by Copas J and Eguchi S, Journal of the Royal Statistical Society Series B, 67, 502-503. (Comments upon differences between frequentist and Bayesian approaches to accounting for model uncertainty, and discusses the use of random-effects meta-analytic models to create uncertainty bands that appropriately reflect bias in the measurement process, using estimation of the speed of light in physics in the 20th century as an example.) (Citation data unavailable)

64. (discussion article) Browne WJ, Draper D (2006). A comparison of Bayesian and likelihood-based methods for fitting multilevel models (with discussion and rejoinder). Bayesian Analysis, forthcoming (31 pages). (Demonstrates that Bayesian MCMC-based estimation outperforms likelihood and quasi-likelihood methods in variance components and random-effects logistic regression models with respect to bias of point estimates and coverage and length of interval estimates, and therefore recommends the use of maximum likelihood estimation during the model exploration phase of a multilevel study (for computational speed), and Bayesian estimation using MCMC to produce final publishable results.)

63. (invited discussion) Draper D (2005). Discussion of ``Multiple bias modeling for analysis of observational data'' by S Greenland, Journal of the Royal Statistical Society Series A, 168, 301. (Offers suggestions on how to perform both process and outcome evaluation of the method proposed by Greenland to judgmentally estimate variance components (a) for nonexchangeability between the observed units in an observational study and units in the population of real scientific interest and (b) for the effects of unmeasured confounders in such studies.) (Citation data on discussion unavailable; article under discussion cited 5 times)

62. (article) Hanks B, McDowell C, Draper D, Krnjajic M (2004). Program quality with pair programming in CS1. ACM SIGCSE Bulletin, 36, 176-180. (Pair programming transforms what has traditionally been a solitary activity into a cooperative effort. While pair programming, two software developers (the driver and the navigator, roles which are switched at regular intervals) share a single computer monitor and keyboard. Prior research has shown that compared with students who work alone, students who pair demonstrate increased confidence in their work, and greater success in their first computer science class (CS1); however, these earlier studies were flawed in that paired and solo students were not given the same programming assignments. We use a design that holds assignments constant, and we employ Bayesian methods to quantify the improvement in both process and outcome measures of program quality under pair programming in our stronger experimental design.) (Citation data unavailable)

61. (discussion article) Draper D, Gittoes M (2004). Statistical analysis of performance indicators in UK higher education (with discussion). Journal of the Royal Statistical Society, Series A, 167, 449-474 (discussion, 447-448, 497-499; we were not given an opportunity to rejoin). (Attempts to measure the quality with which institutions such as hospitals and universities carry out their public mandates have gained in frequency and sophistication over the last decade. In this paper we examine methods for creating performance indicators (PIs) in multilevel settings (e.g., students nested within universities) based on a dichotomous outcome variable (e.g., drop-out from the higher education system). The profiling methods we study involve the indirect measurement of quality, by comparing institutional outputs after adjusting for inputs, rather than directly attempting to measure the quality of the processes unfolding inside the institutions. In the context of an extended case study of the creation of PIs for universities in the UK higher education system, we (a) demonstrate the large-sample functional equivalence between a method based on indirect standardization and an approach based on fixed-effects multilevel modeling, (b) offer simulation results on the performance of the standardization method in null and non-null settings, (c) examine the sensitivity of this method to inadvertent omission of relevant input variables, (d) explore random-effects reformulations and characterize settings in which they are preferable to fixed-effects multilevel modeling in this type of quality assessment, and (e) discuss extensions to longitudinal quality modeling and the overall pros and cons of institutional profiling. 
Our results are couched in the language of higher education but apply with equal force to other settings with dichotomous response variables, such as the examination of observed and expected rates of mortality (or other adverse outcomes) in the study of the quality of health care.) (1 citation [December 2004], in statistics, in a journal published in the UK)

60. (invited discussion) Draper D (2004). Discussion of ``Ecological inference for 2 by 2 tables'' by J Wakefield, Journal of the Royal Statistical Society Series A, 167, 435-436. (Emphasizes how violently sensitive inferential answers at the individual level are to assumptions and prior inputs when all that is available is aggregate data, and discusses the relationship between sampling-theory and model-based approaches to ecological inference.) (Citation data on discussion unavailable; article under discussion cited 3 times)

59. (article) Fouskakis D, Draper D (2002). Stochastic optimization: a review. International Statistical Review, 70, 315-349. (We review three leading stochastic optimization methods: simulated annealing, genetic algorithms, and tabu search. In each case we analyze the method, give the exact algorithm, detail advantages and disadvantages, and summarize the literature on optimal values of the inputs. As a motivating example we describe the solution - using Bayesian decision theory, via maximization of expected utility - of a variable selection problem in generalized linear models, which arises in the cost-effective construction of a patient sickness-at-admission scale as part of an effort to measure quality of hospital care.) (4 citations [most recent January 2005], in computer science, ecology, and statistics, in journals published in Holland, the UK and the US)

58. (invited discussion) Draper D (2002). Discussion of ``Bayesian measures of model complexity and fit'' by DJ Spiegelhalter, NG Best, BP Carlin, and A van der Linde, Journal of the Royal Statistical Society Series B, 64, 630-631. (Criticizes the view taken by the authors that model choice can be made in a context-free manner, and advocates a decision-theoretic basis for model selection based on maximization of expected utility.) (Citation data on discussion unavailable; article under discussion cited 151 times)

57. (article) Browne WJ, Draper D, Goldstein H, Rasbash J (2002). Bayesian and likelihood methods for fitting multilevel models with complex level-1 variation. Computational Statistics and Data Analysis, 39, 203-225. (In multilevel modeling it is common practice to assume constant variance at level 1 across individuals. In this paper we consider situations where the level-1 variance depends on predictor variables. We examine two cases using a dataset from educational research; in the first case the variance at level 1 of a test score depends on a continuous ``intake score'' predictor, and in the second case the variance is assumed to be different for different genders. We contrast two maximum-likelihood methods based on iterative generalized least squares with two MCMC methods based on adaptive hybrid versions of the Metropolis-Hastings (MH) algorithm, and we use two simulation experiments to compare these four methods. We find that all four approaches have good repeated-sampling behavior in the classes of models we simulate. We conclude by contrasting raw- and log-scale formulations of the level-1 variance function, and we find that adaptive MH sampling is considerably more efficient than adaptive rejection sampling when the heteroscedasticity is modeled polynomially on the log scale.) (3 citations [most recent August 2005], in veterinary research and statistics, in journals published in France, the UK and the US)

56. (invited discussion) Draper D (2002). Discussion of ``Commissioned analysis of surgical performance by using routine data: lessons from the Bristol inquiry'' by DJ Spiegelhalter, P Aylin, NG Best, SJW Evans, and GD Murray, Journal of the Royal Statistical Society Series A, 165, 227. (Emphasizes the value of simulation studies and Bayesian decision theory as a basis for setting practical cutpoints to identify ``good'' and ``bad'' institutions in input-output quality assessment.) (Citation data on discussion unavailable; article under discussion cited 7 times)

55. (monograph chapter) Draper D, Saltelli A, Tarantola S, Prado P (2000). Scenario and parametric sensitivity and uncertainty analyses in nuclear waste disposal risk assessment: the case of GESAMAC. Chapter 13 in Mathematical and Statistical Methods for Sensitivity Analysis (Saltelli A, Chan K, Scott M, eds.), New York: Wiley, 275-292, 427-447. (Shows that variance-based sensitivity analyses are not fully adequate in determining the factors most responsible for high radiologic doses arising from the failure of underground storage facilities for nuclear waste, and that about 30% of the overall predictive uncertainty for log dose arises from uncertainty about the scenario describing how the facility will fail - a source of uncertainty previously largely ignored or treated qualitatively. Also explores the use of projection pursuit regression in sensitivity analysis.) (Citation data unavailable)
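(Illustrative sketch: one standard way to express the share of predictive uncertainty due to scenario uncertainty is the law of total variance, in which the overall variance is the expected within-scenario variance plus the variance of the scenario means. Whether the chapter computes its 30% figure in exactly this way is an assumption here, and the scenario probabilities, means, and variances below are invented and are not the GESAMAC values.)

```python
def scenario_variance_share(probs, means, variances):
    """Fraction of total predictive variance attributable to between-scenario
    differences: Var(E[y | S]) / (E[Var(y | S)] + Var(E[y | S]))."""
    overall_mean = sum(p * m for p, m in zip(probs, means))
    within = sum(p * v for p, v in zip(probs, variances))
    between = sum(p * (m - overall_mean) ** 2 for p, m in zip(probs, means))
    return between / (within + between)


# Two hypothetical, equally likely scenarios with the same
# within-scenario variance but different predictive means.
share = scenario_variance_share(probs=[0.5, 0.5],
                                means=[0.0, 1.0],
                                variances=[1.0, 1.0])
```

An analysis that conditions on a single scenario sets the between-scenario term to zero and so understates the overall predictive variance.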

54. (article) Draper D, Fouskakis D (2000). A case study of stochastic optimization in health policy: problem formulation and preliminary results. Journal of Global Optimization, 18, 399-416. (We use Bayesian decision theory to address a variable selection problem arising in attempts to indirectly measure the quality of hospital care, by comparing observed mortality rates to expected values based on patient sickness at admission. Our method weighs data collection costs against predictive accuracy to find an optimal subset of the available admission sickness variables. The approach involves maximizing expected utility across possible subsets, using Monte Carlo methods based on random division of the available data into N modeling and validation splits to approximate the expectation. After exploring the geometry of the solution space, we compare a variety of stochastic optimization methods - including genetic algorithms (GA), simulated annealing (SA), threshold acceptance (TA), messy simulated annealing (MSA), and tabu search (TS) - on their performance in finding good subsets of variables, and we clarify the role of N in the optimization. Preliminary results indicate that TS is somewhat better than TA and SA in this problem, with MSA and GA well behind the other three methods. Sensitivity analysis reveals broad stability of our conclusions.) (5 citations [most recent August 2005], in artificial intelligence, medicine, and statistics, in journals published in Australia, Holland, the UK and the US)
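(Illustrative sketch, not the authors' code: approximating an expected utility by averaging over N random divisions of the data into modeling and validation subsets. The utility used here, negative Brier score on the validation half minus a fixed data-collection cost, the intercept-only predictor, and all names and numbers are assumptions for this example. The paper's utility also depends on which admission sickness variables are included, which this sketch leaves out.)

```python
import random


def split_utility(modeling, validation, cost):
    """Hypothetical utility for one split: predict the validation outcomes with
    the modeling-subset event rate, then penalize squared error and cost."""
    rate = sum(modeling) / len(modeling)
    brier = sum((y - rate) ** 2 for y in validation) / len(validation)
    return -brier - cost


def expected_utility(outcomes, cost, n_splits=100, modeling_fraction=0.5, seed=0):
    """Monte Carlo estimate of expected utility over random modeling/validation splits."""
    rng = random.Random(seed)
    n_model = int(modeling_fraction * len(outcomes))
    total = 0.0
    for _ in range(n_splits):
        shuffled = list(outcomes)
        rng.shuffle(shuffled)
        total += split_utility(shuffled[:n_model], shuffled[n_model:], cost)
    return total / n_splits


# Hypothetical dichotomous outcomes (1 = death, 0 = survival).
outcomes = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0]
estimate = expected_utility(outcomes, cost=0.1)
```

Larger values of n_splits reduce the Monte Carlo noise in the estimate at a proportional increase in computing time, which is the trade-off governed by N in the description above.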

53. (software manual) Rasbash J, Browne WJ, Goldstein H, Yang M, Plewis I, Healy M, Woodhouse G, Draper D, Langford I, Lewis T (2000). A User's Guide to MLwiN, Version 2.1d. London: Institute of Education, University of London (286 pages; ISBN 085473 6123). (Bill Browne and I are the co-developers of the MCMC capabilities in this multi-level modeling package, with a user base of more than 3,000 people worldwide.) (Citation data unavailable)

52. (article) Browne WJ, Draper D (2000). Implementation and performance issues in the Bayesian and likelihood fitting of multilevel models. Computational Statistics, 15, 391-420. (Examines (a) the relative performance - in the sense of point and interval estimation accuracy - of likelihood and Bayesian fitting methods in random-slopes regression models, and (b) some performance comparisons in random effects logistic regression models - in the sense of required CPU time to achieve a given accuracy of posterior summary - between several MCMC fitting methods, including adaptive rejection sampling and an approach we have developed specifically for MLwiN based on adaptive hybrid Gibbs-Metropolis sampling.) (15 citations [most recent August 2005], in entomology, machine learning, medicine, psychology, and statistics, in journals published in Holland, the UK and the US)

51. (invited discussion) Draper D (1999). Model uncertainty yes, discrete model averaging maybe. Statistical Science, 14, 405-409 (discussion of ``Bayesian model averaging: a tutorial,'' by Hoeting JA, Madigan D, Raftery AE, Volinsky CT). (Argues that variable selection uncertainty in generalized linear models should be dealt with in a continuous manner via hierarchical modeling rather than through discrete model averaging, and advocates the use of expected utility maximization as a basis for model choice.) (Citation data on discussion unavailable; article under discussion cited 129 times)

50. (monograph) Bergdahl M, Black O, Bowater R, Chambers R, Davies P, Draper D, Elvers E, Full S, Holmes D, Lundqvist P, Lundstrom S, Nordberg L, Perry J, Pont M, Prestwood M, Richardson I, Skinner C, Smith P, Underwood C, Williams M (1999). Model Quality Reports in Business Statistics (Volume 1). Luxembourg: Eurostat (442 pages in four volumes). (Report produced by team consisting of people from the Office for National Statistics (UK), Statistics Sweden, and the Universities of Bath and Southampton (UK), on best practice in evaluating the design and analysis of complex sample surveys.) (Citation data unavailable)

49. (invited discussion) Draper D (1999). Discussion of ``Decision models in screening for breast cancer,'' by G Parmigiani. In Bayesian Statistics 6, JM Bernardo, J Berger, P Dawid, and AFM Smith, eds., Oxford: Oxford University Press, 541-543. (Attempts to bring the results of an expected utility analysis onto a more interpretable scale for women choosing whether, and how often, to be screened for breast cancer.) (Citation data unavailable)

48. (invited discussion) Draper D (1999). Discussion (proposer of the vote of thanks) of ``Bayesian nonparametric inference for random distributions and related functions'' by SG Walker, P Damien, PW Laud, and AFM Smith, Journal of the Royal Statistical Society Series B, 61, 510-513. (Emphasizes the importance of Bayesian nonparametrics as the way to finally make operational de Finetti's representation theorem for continuous outcomes, illustrates some of the finer points in using Polya trees, and makes a connection between Polya trees and wavelet density estimation.) (Citation data on discussion unavailable; article under discussion cited 25 times)

47. (monograph chapter) Draper D, Bowater R (1999). Model assumption errors. Chapter 9 in Model Quality Reports in Business Statistics, by Bergdahl M, Black O, Bowater R, Chambers R, Davies P, Draper D, Elvers E, Full S, Holmes D, Lundqvist P, Lundstrom S, Nordberg L, Perry J, Pont M, Prestwood M, Richardson I, Skinner C, Smith P, Underwood C, Williams M; Luxembourg: Eurostat, Volume I, 138-162. (Examines a number of areas of sample survey design and analysis in which statistical models feature prominently, including index formulae, bench-marking, seasonal adjustment, cut-off sampling, small-area estimation, and non-ignorable nonresponse, and makes recommendations on best practice in assessing what can happen when assumptions in these models are wrong.) (Citation data unavailable)

46. (monograph chapter) Draper D, Bowater R (1999). Sampling errors under non-probability sampling. Chapter 4 in Model Quality Reports in Business Statistics, by Bergdahl M, Black O, Bowater R, Chambers R, Davies P, Draper D, Elvers E, Full S, Holmes D, Lundqvist P, Lundstrom S, Nordberg L, Perry J, Pont M, Prestwood M, Richardson I, Skinner C, Smith P, Underwood C, Williams M; Luxembourg: Eurostat, Volume I, 65-81. (Examines a variety of non-probability-based sampling methods in routine use in business surveys, including voluntary sampling, quota sampling, judgmental sampling, and cut-off sampling, and makes recommendations on best practice in assessing the biases that can arise with these methods.) (Citation data unavailable)

45. (monograph review) Fouskakis D, Draper D (1999). Review of Tabu Search, by F Glover and M Laguna, Amsterdam: Kluwer (1997). The Statistician, 48, 616-619. (Critique of a book on a popular stochastic optimization method written by the method's developers.) (Citation data unavailable)

44. (invited discussion) Draper D (1999). Discussion (seconder of the vote of thanks) of ``Some statistical heresies'' by JK Lindsey, Journal of the Royal Statistical Society Series D - The Statistician, 48, 27-28. (Argues for a combination of Bayesian and non-Bayesian outlook and methods in which out-of-sample predictive validation is used to calibrate Bayesian results.) (Citation data unavailable)

43. (article) Draper D, Pereira A, Prado P, Saltelli A, Cheal R, Eguilior S, Mendes B, Tarantola S (1999). Scenario and parametric uncertainty in GESAMAC: A methodological study in nuclear waste disposal risk assessment. Computer Physics Communications, 117, 142-155. (Shows that scenario uncertainty, previously largely ignored in risk assessments in the field, accounts for more than 1/3 of the overall uncertainty in attempting to assess what may happen if underground repositories for nuclear waste are compromised.) (6 citations [most recent June 2003], in computer science, environmetrics, physics, and statistics, in journals published in Holland, the UK and the US)

42. (invited discussion) Draper D (1998). Discussion of ``Some algebra and geometry for hierarchical models, applied to diagnostics,'' by JS Hodges, Journal of the Royal Statistical Society Series B, 60, 527-528. (Argues for the use of predictive rather than inferential diagnostics for hierarchical models.) (Citation data on discussion unavailable; article under discussion cited 22 times)

41. (invited discussion) Draper D (1998). Bayesian analysis of finite-population survey data using Markov Chain Monte Carlo. Closing discussion, Half-Day Meeting on Design and Analysis of Complex Sample Surveys, Journal of the Royal Statistical Society Series B, 60, 96-98. (Lays out and illustrates a Bayesian theory of finite-population sampling based on MCMC imputation of the unsampled units as missing data.) (Citation data unavailable)

40. (monograph) Draper D, Pereira A, Prado P, Saltelli A, Cheal R, Eguilior S, Mendes B, Tarantola S (1998). GESAMAC: Conceptual and Computational Tools to Assess the Long-Term Risk from Nuclear Waste Disposal in the Geosphere. Brussels: European Commission, EUR 19113 EN (92 pages). (Report produced by team of physicists, statisticians, and risk analysts from England, Italy, Spain, and Sweden, offering novel methods for risk assessment in nuclear waste disposal. Full report not available at present; the PDF file here contains the chapter of which I was the main author.) (Citation data unavailable)

39. (invited discussion) Draper D (1998). Discussion of ``Model-based inference for categorical survey data subject to nonignorable nonresponse,'' by JJ Forster and PWF Smith, and ``Analysis of longitudinal binary data from multiphase sampling,'' by D Clayton, G Dunn, A Pickles, and D Spiegelhalter, Journal of the Royal Statistical Society Series B, 60, 94. (Advocates predictive validation as a method for tuning prior distributions, especially in situations in which strong prior assumptions are unchallenged by the data, and questions the role of estimated versus known sampling weights in survey methods based on inverse probability weighting.) (Citation data on discussion unavailable; articles under discussion cited 26 and 17 times, respectively)

38. (encyclopedia article) Greenland S, Draper D (1998). Exchangeability. Entry in Encyclopedia of Biostatistics. Armitage P, Colton T (eds). London: Wiley. (Defines unconditional and conditional exchangeability and discusses difficulties in applying de Finetti's representation theorems in practice.) (Citation data unavailable)

37. (discussion article) Draper D, Madigan D (1997). The scientific value of Bayesian statistical methods and outlook (with discussion). IEEE Expert, 12, 18-25. (Contrasts Bayesian and non-Bayesian definitions of probability; highlights the differences in methodology and outlook that arise from these definitions; examines how MCMC has revolutionized Bayesian applied statistics, and describes some success stories; suggests that there is no need to choose exclusively between Bayesian and non-Bayesian perspectives; and outlines some possible fusions of the best of both worlds.) (Citation data unavailable)

36. (invited discussion) Draper D (1997). Discussion of ``The EM algorithm: An old song sung to a fast new tune,'' by X-L Meng and D van Dyk, Journal of the Royal Statistical Society Series B, 59, 552-553. (Describes a strategy for general-purpose Metropolis sampling, and questions the future of EM in an MCMC world.) (Citation data on discussion unavailable; article under discussion cited 108 times)

35. (article) Draper D (1997). Model uncertainty in ``stochastic'' and ``deterministic'' systems. In Proceedings of the 12th International Workshop on Statistical Modeling, Minder C, Friedl H (eds.), Vienna: Schriftenreihe der Österreichischen Statistischen Gesellschaft, 5, 43-59. (Explores a categorization of sources of uncertainty - scenario, structural, parametric, and predictive - arising in predictive modeling, and argues that in reality there is no such thing as a ``deterministic'' model.) (Citation data unavailable)

34. (invited discussion) Draper D (1996). Discussion of ``Hierarchical generalized linear models,'' by Y Lee and JA Nelder, Journal of the Royal Statistical Society Series B, 58, 662-663. (Demonstrates that the authors' approach to hierarchical modeling is dominated by full-Bayes analyses using MCMC methods.) (Citation data on discussion unavailable; article under discussion cited 148 times)

33. (invited discussion) Draper D (1996). Discussion of ``League tables and their limitations: Statistical issues in comparisons of institutional performance,'' by H Goldstein and DJ Spiegelhalter, Journal of the Royal Statistical Society Series A, 159, 416-418. (Stresses (by presenting original results) the need for process evaluations of quality to supplement examination of system outputs adjusted for inputs, discusses hierarchical models and priors alternative to those proposed by the authors, and emphasizes the need for predictive validation in HM work.) (Citation data on discussion unavailable; article under discussion cited 185 times)

32. (invited discussion) Draper D (1996). Discussion of ``Accounting for model uncertainty in survival analysis improves predictive performance,'' by AE Raftery, D Madigan, and CT Volinsky. In Bayesian Statistics 5, JM Bernardo, J Berger, P Dawid, and AFM Smith, eds., Oxford: Oxford University Press, 341-342. (Discusses how to measure the effects of model uncertainty in more practically relevant ways, and advocates treating variable selection uncertainty continuously rather than discretely.) (Citation data unavailable)

31. (invited discussion) Draper D (1996). Discussion of ``Testing for mixtures: A Bayesian entropic approach,'' by KL Mengersen and CP Robert. In Bayesian Statistics 5, JM Bernardo, J Berger, P Dawid, and AFM Smith, eds., Oxford: Oxford University Press, 270. (Advocates the explicit quantification of utility, in preference to a reliance on generic loss functions, in problems of stochastic model choice.) (Citation data unavailable)

30. (article) Swezey RL, Draper D, Swezey AM (1996). Bone densitometry: Comparison of dual energy X-ray absorptiometry to radiographic absorptiometry. Journal of Rheumatology, 23, 1734-1738. (Lays the groundwork for a full cost-benefit analysis of three leading methods for detecting osteoporosis, and demonstrates that radiographic absorptiometry holds promise as a practical and inexpensive method for screening for this disease.) (5 citations [most recent December 2003], in osteoporosis research and radiology, in journals published in France, the Netherlands, the UK and the US)

29. (invited discussion) Draper D (1996). Utility, sensitivity analysis, and cross-validation in Bayesian model-checking. Statistica Sinica, 6, 760-767 (discussion of ``Posterior predictive assessment of model fitness via realized discrepancies,'' by A Gelman, X-L Meng, and H Stern). (Advocates graphical summaries of model deficiencies followed by hierarchical embedding of a candidate model in a richer family based on the nature of its shortcomings, and out-of-sample predictive calibration as a way to avoid using the data twice in model selection.) (5 citations [most recent February 2003], in machine learning, medicine, and statistics, in journals published in Canada, the UK and the US; article under discussion cited 171 times)

28. (article) Steiner A, Raube K, Stuck A, Aronow H, Draper D, Rubenstein L, Beck J (1996). Measuring psychosocial aspects of well-being in older community residents: performance of four short scales. Gerontologist, 36, 54-62. (Shows that well-chosen subsets of standard instruments for measuring well-being in the elderly can produce scales with good reliability and validity and reduced respondent burden.) (19 citations [most recent June 2005], in community health, epidemiology, geriatrics, gerontology, health policy, neurology, pharmacology, psychology, and social welfare, in journals published in Germany, Norway, the UK and the US)

27. (invited discussion) Draper D (1995). Discussion of ``Model uncertainty, data mining, and statistical inference,'' by C Chatfield, Journal of the Royal Statistical Society Series A, 158, 450-451. (Provides an example of model uncertainty arising from variable selection in regression in which jackknifing the modeling process improves predictive calibration.) (Citation data on discussion unavailable; article under discussion cited 160 times)

26. (invited discussion) Draper D (1995). Discussion of ``Fractional Bayes factors for model comparison,'' by A O'Hagan, Journal of the Royal Statistical Society Series B, 57, 124. (Remarks on the use and calculation of Bayes factors in prediction.) (Citation data on discussion unavailable; article under discussion cited 138 times)

25. (discussion article) Draper D (1995). Assessment and propagation of model uncertainty (with discussion and rejoinder). Journal of the Royal Statistical Society, Series B, 57, 45-97. (A methodology article documenting the failure of standard empirical model-building methods to capture the uncertainty in the modeling process itself, and demonstrating the success of a Bayesian approach - Bayesian model averaging - to solving the problem.) (238 citations [most recent September 2005], in acoustics, agriculture, artificial intelligence, biostatistics, business, chemometrics, climate studies, computer science, dairy science, data mining, demography, ecology, econometrics, economics, education, engineering reliability, environmetrics, epidemiology, fisheries management, forestry, genetics, geostatistics, machine learning, marketing, mechanical engineering, medicine, neural networks, nuclear waste disposal, operations research, pharmacokinetics, plant pathology, plasma physics, political science, psychology, quality assessment, queueing theory, risk analysis, sensitivity analysis, sociology, sports, statistics, toxicology, transport research, and water quality management, in journals published in Australia, Canada, China, France, the Netherlands, New Zealand, Sweden, the UK and the US)

24. (discussion article) Draper D (1995). Inference and hierarchical modeling in the social sciences (with discussion and rejoinder). Journal of Educational and Behavioral Statistics, 20, 115-147, 190-233. (Introduces a hierarchy of inferential validity in the social sciences as a function of the strength of the data-gathering activity, and notes some interpretational and technical problems with standard methods for hierarchical modeling in meta-analysis and education policy.) (31 citations [most recent August 2005], in computer science, criminology, education, epidemiology, geography, gerontology, management, medicine, operations research, psychology, public health, social work, sociology, and statistics, in journals published in Australia, China, the UK and the US)

23. (monograph) Draper D, Gaver D, Goel P, Greenhouse J, Hedges L, Morris C, Tucker J, Waternaux C (1993). Combining Information: Statistical Issues and Opportunities for Research. Contemporary Statistics Series, No. 1. Alexandria VA: American Statistical Association (234 pages). (Report produced by panel convened by U.S. National Research Council, 1990-1992, to survey the state of the art in meta-analysis, hierarchical modeling, and other methods for combining information from several sources to produce more informative summaries and better decisions than those possible based only on the separate information sources.) (Citation data unavailable)

22. (invited discussion) Draper D, Mallows C (1993). Discussion of ``Predictability and prediction,'' by ASC Ehrenberg and JA Bound, Journal of the Royal Statistical Society Series A, 156, 201-202. (A reply to comments made by Ehrenberg and Bound on Draper et al. (1993).) (Citation data on discussion unavailable; article under discussion cited 15 times)

21. (discussion article) Draper D, Hodges J, Mallows C, Pregibon D (1993). Exchangeability and data analysis (with discussion and rejoinder). Journal of the Royal Statistical Society Series A, 156, 9-37. (Provides a theoretical framework for making the sorts of similarity judgments central to empirical model-building in statistics.) (27 citations [most recent June 2005], in animal behavior, behavioral and brain science, economics, education, enzymology, epidemiology, evaluation methodology, health services research, medicine, political science, psychology, social science, sociology, and statistics, in journals published in the UK and the US)

20. (discussion article) Keeler E, Rubenstein L, Kahn K, Draper D, Harrison E, McGinty M, Rogers W, Brook R (1992). Hospital characteristics and quality of care (with discussion). Journal of the American Medical Association, 268, 1709-1714 (discussion, 269, 865-6). (Documents large differences in average quality of care in the U.S. as a function of hospital characteristics such as size, urbanicity, and teaching status.) (165 citations [most recent August 2005], in economics, epidemiology, geriatrics, health services research, medicine, law, nursing science, nutritional science, pharmacology, policy analysis, radiology, social science, and statistics, in journals published in Germany, Singapore, the UK, and the US)

19. (article) Hadorn D, Draper D, Rogers W, Keeler E, Brook R (1992). Cross-validation performance of patient mortality prediction models. Statistics in Medicine, 11, 475-489. (Shows that mortality prediction methods are accurate enough for research involving aggregation over many patients, but not accurate enough to serve as the basis of rationing of scarce health care resources.) (24 citations [most recent July 2004], in cardiology, diabetes research, ecology, epidemiology, geriatrics, health services research, internal medicine, law, oncology, operations research, pharmacology, and statistics, in journals published in the UK and the US)

18. (discussion article) Rogers W, Draper D, Kahn K, Keeler E, Rubenstein L, Kosecoff J, Brook R (1990). Quality of care before and after implementation of the DRG-based Prospective Payment System: A summary of effects (with discussion). Journal of the American Medical Association, 264, 1989-1994 (discussion, 1995-1997). (Summarizes the overall effects of the change in the reimbursement system on quality of care in the U.S. from 1981 to 1986.) (90 citations [most recent July 2005], in economics, epidemiology, family practice, geriatrics, health services research, internal medicine, neurology, pharmacology, social science, and statistics, in journals published in Germany, Singapore, the UK, and the US)

17. (discussion article) Kahn K, Keeler E, Sherwood M, Rogers W, Draper D, Bentow S, Reinisch E, Rubenstein L, Kosecoff J, Brook R (1990). Comparing outcomes of care before and after implementation of the DRG-based Prospective Payment System (with discussion). Journal of the American Medical Association, 264, 1984-1988 (discussion, 1995-1997). (Summarizes changes in outcomes associated with the change in reimbursement mechanism.) (142 citations [most recent September 2005], in economics, geriatrics, internal medicine, management science, nursing, operations research, public health, and statistics, in journals published in Germany, Singapore, the UK, and the US)

16. (discussion article) Kosecoff J, Kahn K, Rogers W, Reinisch E, Sherwood M, Rubenstein L, Draper D, Roth C, Chew C, Brook R (1990). The Prospective Payment System and impairment at discharge: The ``quicker-and-sicker'' story revisited (with discussion). Journal of the American Medical Association, 264, 1980-1983 (discussion, 1995-1997). (Documents an increase in sickness at discharge after the change in the government's method for reimbursing hospitals.) (133 citations [most recent July 2005], in economics, family and community health, geriatrics, health services research, internal medicine, management science, medical ethics, pharmacology, and statistics, in journals published in Germany, Singapore, the UK, and the US)

15. (discussion article) Rubenstein L, Kahn K, Reinisch E, Sherwood M, Rogers W, Kamberg C, Draper D, Brook R (1990). Changes in quality of care for five diseases as measured by implicit review, 1981 to 1986 (with discussion). Journal of the American Medical Association, 264, 1974-1979 (discussion, 1995-1997). (Shows that quality of care can also be measured accurately with expert physician judgment.) (113 citations [most recent July 2005], in biology, geriatrics, health policy, health services research, internal medicine, nursing, ophthalmology, pediatrics, psychiatry, and statistics, in journals published in Germany, Singapore, the UK, and the US)

14. (discussion article) Kahn K, Rogers W, Rubenstein L, Sherwood M, Reinisch E, Keeler E, Draper D, Kosecoff J, Brook R (1990). Measuring quality of care with explicit process criteria before and after implementation of the DRG-based Prospective Payment System (with discussion). Journal of the American Medical Association, 264, 1969-1973 (discussion, 1995-1997). (Developed what were at the time the most extensive process criteria to date for measuring quality of hospital care explicitly.) (160 citations [most recent July 2005], in biology, cardiology, geriatrics, health policy, health services research, internal medicine, oncology, pharmacology, public health, and statistics, in journals published in Germany, Singapore, the UK, and the US)

13. (discussion article) Keeler E, Kahn K, Draper D, Rogers W, Sherwood M, Rubenstein L, Reinisch E, Kosecoff J, Brook R (1990). Changes in sickness at admission following the introduction of the Prospective Payment System (with discussion). Journal of the American Medical Association, 264, 1962-1968 (discussion: 264, 1995-1997; 265, 1112-1113). (Developed what were at the time the most accurate measures of sickness at admission to date for elderly patients with high-prevalence diseases, for use in value-added quality assessment and other quality of care activities.) (122 citations [most recent September 2005], in econometrics, epidemiology, geriatrics, gerontology, health services research, internal medicine, obstetrics, operations research, orthopedics, and statistics, in journals published in Germany, Singapore, the UK, and the US)

12. (discussion article) Draper D, Kahn K, Reinisch E, Sherwood M, Carney M, Kosecoff J, Keeler E, Rogers W, Savitt H, Allen H, Wells K, Reboussin D, Brook R (1990). Studying the effects of the DRG-based Prospective Payment System on quality of care: Design, sampling, and fieldwork (with discussion). Journal of the American Medical Association, 264, 1956-1961 (discussion, 1995-1997). (I had major involvement in the design and analysis of the study and was responsible for the sampling plan, which involved multi-stage cluster sampling with stratification and yielded a nationally representative data set with approximately 17,000 patients from 297 hospitals.) (61 citations [most recent January 2004], in epidemiology, geriatrics, health policy, internal medicine, nursing, obstetrics, operations research, radiology, public health, and statistics, in journals published in Germany, Singapore, the UK, and the US)

11. (discussion article) Kahn K, Rubenstein L, Draper D, Kosecoff J, Rogers W, Keeler E, Brook R (1990). The effects of the DRG-based Prospective Payment System on quality of care for hospitalized Medicare patients: An introduction to the series (with discussion). Journal of the American Medical Association, 264, 1953-1955 (discussion, 1995-1997). (This article, together with the seven that follow, summarizes the results of a five-year $7.5 million study examining the quality of care offered to elderly patients by hospitals in the U.S. before and after a major change in governmental reimbursement method which was suspected of causing a decrease in quality of care.) (89 citations [most recent September 2005], in epidemiology, geriatrics, health policy, internal medicine, law, nursing, operations research, pharmacology, psychiatry, and statistics, in journals published in Germany, Singapore, the UK, and the US)

10. (letter) Bennett C, Draper D, Kanouse D, Greenfield S (1989). AIDS treatment center: is the concept premature? Journal of the American Medical Association, 262, 2537. (Discusses whether (in 1989) it was clinically appropriate for hospitals to create treatment centers dedicated solely to treating HIV and AIDS patients.) (Citation data unavailable)

9. (discussion article) Bennett C, Garfinkle J, Greenfield S, Draper D, Rogers W, Mathews C, Kanouse D (1989). The relation between hospital experience and in-hospital mortality for patients with AIDS-related PCP (with discussion). Journal of the American Medical Association, 261, 2975-2979 (discussion, 261, 3016-3017; 262, 2537). (Demonstrates that hospitals which treat a higher volume of AIDS patients have better outcomes.) (114 citations [most recent August 2005], in cardiology, family medicine, geriatrics, gerontology, internal medicine, pediatrics, and statistics, in journals published in Canada, the UK, and the US)

8. (article) Kahn K, Brook R, Draper D, Keeler E, Rubenstein L, Rogers W, Kosecoff J (1988). Interpreting hospital mortality data: How can we proceed? Journal of the American Medical Association, 260, 3625-3628. (Discusses alternative strategies for quality of care monitoring based on mortality rates.) (51 citations [most recent March 2005], in epidemiology, health services research, medical decision-making, operations research, public health, psychology, and statistics, in journals published in Germany, Singapore, the UK, and the US)

7. (article) Jencks S, Daley J, Draper D, Thomas N, Lenhart G, Walker J (1988). Interpreting hospital mortality data: The role of clinical risk adjustment. Journal of the American Medical Association, 260, 3611-3616. (Together with the previous article, examines a value-added strategy involving the use of hospital mortality rates adjusted for sickness at admission to identify high- and low-quality hospitals.) (141 citations [most recent August 2005], in cardiology, epidemiology, health services research, internal medicine, nursing, operations research, pediatrics, and statistics, in journals published in Australia, Canada, France, Germany, New Zealand, the UK and the US)

6. (article) Daley J, Jencks S, Draper D, Lenhart G, Thomas N, Walker J (1988). Predicting hospital-associated mortality for Medicare patients with stroke, pneumonia, acute myocardial infarction, and congestive heart failure. Journal of the American Medical Association, 260, 3617-3624. (Results of a two-year $1.5 million study showing how to measure sickness at admission and predict death status for elderly patients with high-mortality diseases.) (222 citations [most recent September 2005], in cardiology, epidemiology, geriatrics, health care financing, internal medicine, pharmacology, psychology, and statistics, in journals published in Germany, Singapore, the UK, and the US)

5. (discussion article) Draper D (1988). Rank-based robust analysis of linear models: Exposition and review (with discussion and rejoinder). Statistical Science, 3, 239-271. (Shows how estimators of location based on rank-tests can be used to produce robust inferences in regression and analysis of variance.) (28 citations [most recent December 2003], in econometrics, epidemiology, psychology, and statistics, in journals published in the UK and the US)

4. (letter) Dubois R, Rogers W, Draper D, Brook R (1988). Does hospital mortality predict quality? New England Journal of Medicine, 318, 1624. (A further exploration of the relationship between inpatient mortality and hospital quality.) (Citation data unavailable)

3. (monograph review) Draper D (1987). Review of Summing Up: The Science of Reviewing Research, by R Light and D Pillemer, Cambridge MA: Harvard University Press (1984). Journal of the American Statistical Association, 82, 349-350. (Critique of a leading book on meta-analysis in education and medicine.) (Citation data unavailable)

2. (invited discussion) Draper D (1987). On exchangeability judgments in predictive modeling, and the role of data in statistical research. Statistical Science, 2, 454-461 (discussion of ``Prediction of future observations in growth curve models,'' by CR Rao). (An indictment of the misuse of ``real data'' in methodological work in statistics, and a call for greater explicitness in laying out the similarity judgments at the heart of good predictive modeling.) (Citation data on discussion unavailable; article under discussion cited 19 times)

1. (discussion article) Dubois R, Rogers W, Moxley III J, Draper D, Brook R (1987). Hospital inpatient mortality: Is it a predictor of quality? (with discussion). New England Journal of Medicine, 317, 1674-1680 (discussion: 318, 1623-1624). (Demonstrates that inpatient mortality by itself is a poor marker for quality of hospital care.) (209 citations [most recent June 2005], in cardiology, economics, epidemiology, health policy, nursing, internal medicine, surgery, and statistics, in journals published in Australia, Canada, France, Germany, Spain, the UK, and the US)

II. Grant and Contract Support (Research and Teaching Proposals)

(Grant proposals I've been involved in since 1993.)

15. (pending) Escobar G, Draper D, et al. (2005). Sepsis and critical illness in babies >= 34 weeks gestation (39 pages). $1,670,000; January 2006-December 2008; pending at the National Institute of Child Health and Human Development (a branch of the National Institutes of Health). (Proposes novel clinical and statistical methods, involving dynamic linear modeling, to take advantage of the Kaiser hospital chain's soon-to-be-available electronic clinical data base to create dynamically-updated severity of illness scores for newborn babies in the first 72 hours of life. I am the lead statistical consultant on this project.)

14. (pending) Draper D, Gearhart C (2005). Bayesian statistical modeling of the relationship between air quality and mortality: In pursuit of accurate uncertainty bands and better environmental policy (2 pages). $120,000; April 2006-March 2008; pending at the University Research Program at Ford Motor Company. (Proposes a collaboration with environmental and quality control researchers at Ford, to use Bayesian model averaging to establish good methods for estimating the true uncertainty in epidemiological studies of the relationship between ambient air pollutant levels and mortality, which will serve as the basis of better environmental policy. I am the only statistician on this project.)

13. (awarded) Draper D (2004). Bayesian Modeling and Decision-Making in Industrial Process Control. $15,000; Nov 2004-Oct 2006; from the Statistics Group at Pratt & Whitney. (Provides funds for design and analysis work on improved risk assessment in engineering the manufacturing process for jet engines.)

12. (awarded) Draper D (2004). Bayesian Modeling and Inference for Improved Medical Processes and Outcomes. $50,000; Aug 2004-Aug 2006; from the Division of Research at Kaiser Permanente. (Provides funds for design and analysis work on a variety of projects (e.g., an innovative method for migrating methods from the intensive care unit to the general wards and emergency room to prevent unnecessary deaths from sepsis).)

11. (awarded) Towbin P, Draper D (2004). Mathematical and Statistical Models of Cooperation and Conflict in Environmental Resource Use. $162,000; July 2004-June 2008; from the Institute on Global Conflict and Cooperation at the University of California, San Diego. (Provides fellowship money for Peter Towbin's Ph.D. study on mathematical and statistical models of how to create incentives for renewable environmental resource use.)

10. (awarded) Draper D (2004). A Case-Study-Based Contemporary Calculus Course (7 pages). $8,088; July 2004-June 2005; from the Center for Teaching Excellence at UCSC. (Secures course relief funding to develop a case-study-based contemporary calculus course to meet the calculus needs of undergraduate majors in engineering and science in a manner that (a) captures the excitement and relevance of the ideas, (b) covers all appropriate theory rigorously, (c) teaches students how to use mathematics to model reality (i.e., to formulate real-world problems in mathematical terms), and (d) makes use of contemporary technological innovations by involving the students in computer-based symbolic and numerical calculations.)

9. (awarded) Draper D (2004). Understanding Variations in Death Rates in Veterans Administration Intensive Care Units. $15,000; Jul 2004-Dec 2005; from the Department of Veterans Affairs (Palo Alto Health Care System). (Provided funds to perform a hierarchical random-effects logistic regression analysis to explain variations in death rates in intensive care units in VA hospitals around the U.S.)

8. (awarded) Draper D, Krnjajic M (2003). Cluster Analysis Via Bayesian Nonparametric Density Estimation. $30,000; Feb-Sep 2004; from NASA Ames. (Provided funds to investigate Bayesian methods for classifying pixels in satellite images on a four-point ordered categorical scale of cloudiness, by applying mixture models with unknown numbers of components in the context of massive data sets.)

7. (awarded) Romano P, Draper D (2002). Perinatal Outcomes for Medical Mothers and Babies. $27,194; Apr 2003-Mar 2006; from the California Healthcare Foundation. (Provided funds to perform an empirical analysis of physician- and hospital-level effects as part of the Maternal Outcomes Reporting Initiative.)

6. (awarded) Draper D (2002). International Workshop on Bayesian Data Analysis. $88,367 from the National Science Foundation, UCSC, NASA Ames Research Laboratories, and CTB/McGraw-Hill. (Provided funds to run an international Workshop in Santa Cruz, 8-10 Aug 2003. This Workshop brought together approximately 160 researchers from 15 countries on 5 continents for 26 invited talks and 75 posters; electronic proceedings are available here.)

5. (awarded) Chambers R, Draper D, Jones T, Nordberg L, Skinner C (1998). Model Quality Reports in Business Statistics. $547,420; Jan-Dec 1998; from Eurostat. (Provided funds for a postdoctoral Research Officer, equipment, and travel, to advise the European Community on best-practice methodology in the design and analysis of complex sample surveys.)

4. (awarded) Draper D (1998). Bayesian Nonparametric Methods in Nuclear Waste Disposal Risk Assessment. $31,383; Nov 1997-Apr 1998; from AEA Technology plc, U.K. (Provided funds for a postdoctoral Research Officer, travel, and consulting fees to perform nonparametric Bayesian calculations in an effort to provide more stable risk estimates involving possible groundwater contamination from nuclear waste repositories under scenarios leading to low radiation doses with high probability and very high doses with very low probability.)

3. (awarded) Draper D, Pereira A, Prado P, Saltelli A (1996). GESAMAC: Conceptual and Computational Tools to Assess the Long-Term Risk from Nuclear Waste Disposal in the Geosphere. $558,327; Jan 1996-Dec 1998; from the European Commission. (Provided funds for a postdoctoral Research Officer, equipment, and travel, to perform model uncertainty and sensitivity analysis calculations in risk assessment studies of groundwater contamination from nuclear waste repositories.)

2. (awarded) Draper D, Parmigiani G, West M (1995). International Workshop on Model Uncertainty and Model Robustness. $41,416 from the EPSRC, the US National Science Foundation, and the University of Bath. (Provided funds to run an international workshop in Bath, June 30-July 2, 1995, which brought together 88 researchers from 15 countries for 18 invited talks, 9 invited discussions, and 46 posters.)

1. (awarded) Draper D (1993-2005). $189,900 from the UK Engineering and Physical Sciences Research Council (EPSRC), the European Commission, the UK Royal Society, the University of Bath, and a variety of organizing committees for international research conferences. (Provided funds for Ph.D. students (7), conference travel grants (17), and 1-year student training and research placements (4).)

III. Dissertations Supervised

(I was the sole supervisor of all of the work listed below except where otherwise noted, and all work is in statistics except where otherwise noted.)

16. (in progress) Towbin P. Mathematical and Statistical Models of Cooperation and Conflict in Environmental Resource Use. Ph.D. anticipated, University of California, Santa Cruz, 2008. (This Ph.D. thesis develops mathematical and statistical models that suggest how to implement policies which encourage sustainable environmental resource use.)

15. (in progress) Young R. Bayesian Estimation of Cytonuclear Disequilibria Under Models of Immigration and Epistatic Mating. Ph.D. (Ecology and Evolutionary Biology) anticipated, University of California, Santa Cruz, 2006. (Nuclear-cytoplasmic covariances are measures of non-random associations of nuclear alleles or genotypes with cytoplasmic alleles. Frequentist approaches have been developed to estimate these covariances and test for statistical significance, but these methods do not perform well in small samples. This Ph.D. thesis extends the existing likelihood methodology by developing a Bayesian algorithm that combines information on allele counts from previous studies with information gained from the current study. I am co-supervising this dissertation with R Vrijenhoek.)

14. (in progress) Wallerius J. Bayesian Nonparametric Modeling For Well-calibrated Location and Scale Inferences With Skewed and Long-tailed Data. M.S. anticipated, University of California, Santa Cruz, 2006. (This M.S. thesis uses Dirichlet process mixture models to create well-calibrated interval estimates of location and scale with data arising from highly skewed and long-tailed distributions.)

13. Krnjajic M (2005). Contributions to Bayesian Statistical Analysis: Model Specification and Nonparametric Inference (132 pages). Ph.D., University of California, Santa Cruz. (This Ph.D. thesis makes contributions in two areas: Bayesian model specification (developing and calibrating tools based on predictive accuracy for choosing between models and answering the question "Could the data have arisen from model M?") and Bayesian nonparametric analysis (exploring the ability of Bayesian parametric and nonparametric models to provide an adequate fit to count data, of the type that would routinely be modeled parametrically either through fixed-effects or random-effects Poisson regression, and presenting new Bayesian nonparametric methodology for quantile regression). Milovan now has a post-doctoral position at the Lawrence Livermore National Laboratories. I co-supervised this dissertation with A Kottas.)

12. Liu S (2003). Mirror-Jump Sampling: A Strategy for MCMC Acceleration (58 pages). M.S., University of California, Santa Cruz. (This M.S. thesis examines the performance of a method which induces negative serial correlation in the output of a Metropolis sampler, thereby increasing MCMC efficiency. Shufeng is now working toward a Ph.D. in the Department of Biostatistics at the University of Michigan.)

11. Mendes B (2003). Uncertainties in Modeling Groundwater Contamination (67 pages). Ph.D., University of Stockholm (Sweden). (This Ph.D. thesis combines functional data analytic methods and MCMC techniques to quantify the risks, under various scenarios for what might go wrong, of storing radioactive waste from nuclear power plants underground. Bruno is now a postdoctoral researcher working with me at UCSC.)

10. Mendes B (2002). Functional Data Analysis: Modeling of Groundwater Contamination (60 pages). M.Sc., University of Bath (U.K.). (The leading method for disposing of nuclear waste materials involves deep underground storage. If and when the storage vessel breaks, waterborne radionuclides would gradually diffuse through the porous media of the ground to the surface. This is typically modeled with systems of partial differential equations, solved numerically, which produce an estimated radiologic dose curve over time as a function of distance from the radioactive point source. This M.Sc. thesis uses methods from functional data analysis to relate the entire dose curve to predictor variables, rather than simply summarizing the curve via its maximum, which was the exclusive approach taken in the past. Bruno is now a postdoctoral researcher working with me at UCSC.)

9. Gittoes M (2001). Statistical analysis of performance indicators in UK higher education (191 pages). Ph.D., University of Bath (U.K.). (This thesis develops methods for judging whether rates of dichotomous outcomes such as dropout from university are appropriate at the institutional level given the intake characteristics of the students at the university. Mark is now a Member of the Technical Staff at the Higher Education Funding Council for England (HEFCE) in Bristol (U.K.).)

8. Fouskakis D (2000). Stochastic Optimization Methods for Cost-Effective Quality Assessment in Health (147 pages). Ph.D., University of Bath (U.K.). (This Ph.D. thesis - which was short-listed for the 1999 Ede and Ravenscroft Research Prize at the University of Bath and nominated for a Savage Award for best Bayesian dissertation worldwide in 2001 - compares the performance of three popular stochastic optimization methods (simulated annealing, genetic algorithms, and tabu search) in maximizing expected utility to solve a problem in variable selection in generalized linear models via Bayesian decision theory. Dimitris is currently a Lecturer (equivalent to an Assistant Professor in the US) in the Department of Mathematics at the National Technical University of Athens (Greece).)

7. Browne W (1999). Applying MCMC methods to multi-level models (211 pages). Ph.D., University of Bath (U.K.). (This Ph.D. thesis - which was nominated for a Savage Award for best Bayesian dissertation worldwide in 1999 - develops MCMC algorithms for efficient fitting of hierarchical (multilevel) models (variance components, random-effects logistic regression, and models with complex level-1 variation), and documents the calibration performance of various diffuse prior distributions in such models. Bill is currently a Lecturer (equivalent to an Assistant Professor in the US) in the Division of Statistics within the School of Mathematical Sciences at the University of Nottingham (U.K.).)

6. Kounali D (1998). Cardiac mortality and dietary risk factors: Survival analysis with time-varying covariates. M.Sc., University of Bath (U.K.). (This M.Sc. thesis conducts an epidemiological study of the relationship between dietary factors and heart disease among a cohort of men from Crete. Daphne is currently finishing a Ph.D. in statistics at the University of Southampton (U.K.).)

5. Cheal R (1997). Markov chain Monte Carlo methods for inference on family trees. Ph.D., University of Bath (U.K.). (This Ph.D. thesis develops methods for inference on family trees from pedigree data, with applications to breeding programs for Przewalski's horse that will encourage genetic diversity for these rare animals. Ryan is now a postdoc in the Statistics Group within the Department of Mathematical Sciences at the University of Bath (U.K.).)

4. McKail C (1997). Fixing the broken bootstrap. M.Sc., University of Bath (U.K.). (This M.Sc. thesis explores the use of Bayesian nonparametric methods to produce well-calibrated interval estimates for measures of spread based on data sampled from skewed and long-tailed distributions. Callum now works at a leading software company in the London area.)

3. Fouskakis D (1996). Variable selection via hierarchical modeling and utility. M.Sc., University of Bath (U.K.). (This M.Sc. thesis, which was awarded with distinction, demonstrates that standard variable-selection methods in generalized linear models are sub-optimal when the cost of data collection for the predictor variables must (as is often true) be taken into consideration. Dimitris is now a Lecturer in the Department of Mathematics at the National Technical University of Athens (Greece).)

2. Browne W (1995). Applications of hierarchical modeling (59 pages). M.Sc., University of Bath (U.K.). (This M.Sc. thesis - which won the James Duthie Prize for best M.Sc. Dissertation at the University of Bath in 1995 - explored the use of hierarchical modeling as a smooth alternative to discrete (all-or-nothing) variable selection methods in generalized linear models. Bill is now a Lecturer (equivalent to an Assistant Professor in the US) in the Division of Statistics within the School of Mathematical Sciences at the University of Nottingham (U.K.).)

1. Raube K (1991). Health and social support in the elderly. Ph.D., RAND Graduate School of Policy Studies. (This Ph.D. thesis was a quantitative examination of the relationship between the quality of the social support network an elderly person has and that person's health. Kristi is now Adjunct Professor and Executive Director of the Graduate Program in Health Management at the Haas School of Business in the University of California, Berkeley.)

IV. Teaching Contributions
1. The two courses I've taught most frequently at UCSC are AMS 206 (a graduate introduction to Bayesian modeling, inference and prediction, which I developed from scratch at UCSC) and AMS 5 (a lower-division undergraduate introduction to basic statistical thinking and methods, which I completely revamped using the case-study method described below).

AMS 206. Bayesian Statistics. Introduction to Bayesian statistical methods for inference and prediction; exchangeability; prior, likelihood, posterior, and predictive distributions; coherence and calibration; conjugate analysis; Markov Chain Monte Carlo methods for simulation-based computation; hierarchical modeling; Bayesian model diagnostics, model selection, and sensitivity analysis.

The web page for my most recent version of AMS 206, in the winter of 2005 (including 86 downloadable documents in PDF format, plus data sets and WinBUGS code for 3 detailed examples), is here.

AMS 5. Statistics. Introduction to statistical methods/reasoning, including descriptive methods, data-gathering (experimental design and sample surveys), probability, interval estimation, significance tests, one- and two-sample problems, correlation and regression. Emphasis on applications to the natural and social sciences.

The web page for my most recent version of AMS 5, in the spring of 2005 (including 65 downloadable documents in PDF format), is here.

2. I've thought carefully about how the disciplines of applied mathematics and statistics can be most successfully taught. It's my view that at all levels of material, from the very first lower-division course to the most advanced graduate seminar, the key is a case-study orientation, as in the following four-step paradigm:

(1) An interesting real-world science or engineering problem is introduced and described in sufficient contextual detail that the students fully understand its practical significance;

(2) Mathematical and/or statistical methods are developed/introduced to solve the problem in step (1);

(3) The real-world implications and limitations of the solution in step (2) are examined; and

(4) The general properties of the methods "invented" in step (2) are explored.
Step (1) in this four-step process serves to illustrate the wide applicability of mathematical and statistical thinking and to show that many good mathematical ideas or methods were in practice invented while trying to solve a real-world problem, and step (2) focuses on the crucial problem-formulation process.

If the class demonstrates its openness to a not-just-note-taking approach to learning (almost all classes do, typically quite eagerly), I undertake steps (2) and (4) in an interactive way, by asking the students to suggest ideas for how progress might be made in solving the problem in (1), developing the methods adaptively based on the suggestions they give me, and interactively exploring the general attributes of the methods we've "created."

When someone suggests an idea that's only partially successful, I lead us down the indicated path until we hit a brick wall, and then we figure out together how to climb over the wall. This reinforces an important fact about the mathematical discovery process: most good ideas and methods are arrived at through successive refinement of partially-flawed ideas and methods. (Many people are mystified about how a now-standard mathematical concept was developed until they see in action that the way to proceed is to scratch something down, figure out what's wrong with it, and then figure out how to fix the flaw.)

This approach has the added advantage that the instructor can weave many details of the history of mathematics/statistics (and of science) into the narrative; by this device the interest of the students is maintained, and they increase their appreciation for the crucial context in which ideas and methods arise.

In my 25 years of teaching I've used this approach successfully in classes ranging in size from 1 to 500, and at levels ranging from first-year undergraduate courses to the most advanced graduate seminars.

An abbreviated example of a case study in a calculus course can be found here, and my three books-in-progress (items 73, 74, and 75 in Section I.) all contain a number of examples of the use of case studies in statistics employing the four-step paradigm described here.

V. Administrative Contributions
(Among many that could be included, here are four recent examples of administrative written contributions I have authored or co-authored.)

4. (AMS revised academic plan for 2005-11) Draper D, Cortes J, Garaud P, Kottas A, Lee H, Mangel M, Prado R, Sanso B, Wang W (2006). AMS Revised Academic Plan For 2005-11 (21 pages). (AMS contribution to the UCSC academic planning exercise for 2005-2011.)

3. (annual report 2004-05) Draper D, Lee H (2005). Annual Report for the Academic Year 2004-05: Department of Applied Mathematics and Statistics (26 pages). (Summarizes many of the activities and accomplishments of the Department of Applied Mathematics and Statistics at UCSC in 2004-05.)

2. (graduate proposal) Draper D, Balmforth N, Cortes J, Garaud P, Kottas A, Lee H, Mangel M, Prado R, Sanso B, Wang W (2005). A Proposal for a Program of Graduate Studies Leading to M.S. and Ph.D. Degrees in Statistics and Stochastic Modeling (246 pages). (Sets out a vision for graduate degrees in statistics and stochastic modeling that involves a fusion of statistical and applied mathematical modeling and that encourages students to develop innovative theories and methodologies in the context of solving important problems in science and engineering. I wrote most of the proposal, based on an earlier draft by Marc Mangel. Currently under UCSC campus review.)

1. (annual report 2003-04) Draper D (2004). Annual Report for the Academic Year 2003-04: Department of Applied Mathematics and Statistics (24 pages). (Summarizes many of the activities and accomplishments of the Department of Applied Mathematics and Statistics at UCSC in 2003-04.)

(created 17 October 2005; last update 30 January 2006)