BAYESIAN STATISTICS
Course presented at Workshop 11/14/97
by
Contact: Dan Strom
Revised: July 21, 2000
View the slide show in a browser, or open the PowerPoint 97 file in a local
PowerPoint viewer (available as a free download),
or select from the following list of 161 slides:
BAYESIAN STATISTICS
Text for this course
These slides available on WWW
Advanced topics reference
Another advanced reference
Added attraction
COURSE OUTLINE
COURSE OUTLINE (cont’d)
COURSE OUTLINE (cont’d)
COURSE OUTLINE (cont’d)
CHAPTER 5: CONDITIONAL PROBABILITY AND BAYES’ RULE
Conditional probability and trees. Example 5.3: How many girls?
In contrast, condition on G2: How many girls?
Law of total probability
Bayes’ rule
Alternative versions of Bayes’ rule
Paternity testing: Ex. 5.12
Posterior probability of paternity
Posterior vs. prior probabilities, for PI = 5.56
Example: Screening for cancer
Screening for cancer (cont’d)
Mary Decker Slaney Case
My presentation in Mary Decker Slaney Case
Two databases, users and not (hypothetical distributions)
Specificity
Sensitivity
User probability given T/E ≥ 6 (Suppose Spec = 99%)
Better: Use exact observed value. Suppose T/E = 6. (Extra slide.)
Why formal theorem for learning?
P(urn 1 | data) = 40/41 ≈ 97.6%
Martingales and Bayes
Example: In 2-urn quiz, start with P(1)=1/2; suppose 5 obs.
Sampling and learning: Example 5.13. How many greens?
Updated probabilities if select a green chip
Probability next chip is green—WITHOUT replacement
Probability next chip is green—WITH replacement
Example 5.15: “Let’s Make a Deal”
“Let’s Make a Deal” Assumptions
“Let’s Make a Deal”; P(data)
“Let’s Make a Deal”; Bayes’ rule
INFERENCES CONCERNING PROPORTIONS—CHAPTERS 6-9
Exercise 7.28: Data for long-term effects of lead in childhood
Likelihood for population proportion p of “graduate”
Likelihoods; Discrete case
Bayes’ rule, Discrete case. Suppose uniform prior
Bayes’ rule, Continuous case. Suppose uniform prior
Beta density for a proportion
Updating rule for beta densities
Example with single observation—from Beta(4,2):
Predictive probabilities for beta densities
Predictive probabilities for uniform, Beta(1,1):
Predictive probabilities for one observation
Revisit graduation rate for children exposed to lead
Prediction for next 10
Treatment comparison: Example with pairs
Likelihood function of p:
If prior is uniform on (0, 1):
Laplace’s rule of succession
Predictive distribution
Best fitting binomial vs. predictive probabilities
Frequentist inferences—Comparisons with Bayesian
Frequentist hypothesis testing
Design (1): Observe 17 pairs
Design (2): Stop when both 4 A’s and 4 B’s
Design (3): Interim analysis at n=17, possible total is 44
Design (4): Stop when enough information
Frequentist conclusion depends on investigator’s intentions
Unplanned interim looks
Multiplicities in science
Frequentist vs. Bayesian—Six comparisons
Frequentist vs. Bayesian—Six comparisons
Frequentist vs. Bayesian—Six comparisons
Frequentist vs. Bayesian—Six comparisons
Two-sample comparison of proportions
Product of separate likelihoods
Probability of pN > pC given data
P(pN – pC > 0.6 | data)
PdALx = P(pN – pC > x) using Minitab
PdALx (Probability difference is At Least x) in picture form:
CHAPTERS 10-12: INFERENCES ABOUT MEANS
Two-sided P value is 0.05
Frequentist testing hypotheses
Frequentist confidence intervals
Bayesian approach
Bayesian: Consider alternatives. Discrete case
Alternatively, by symmetry of
Bayes’ rule calculations
Continuous case (m any value)
Prior, likelihood, posterior of m:
Posterior mean is weighted average of prior mean and
Example calculations from posterior distribution of m
Bayesian probability of frequentist confidence interval
INFERENCES FOR POISSON RATES
From prior to posterior
Assessing prior distribution for λ
If (a,b)=(10,5) and k=10 then updated (a,b) is (20,6)
If (a,b)=(4,2) and k=1 then updated (a,b) is (5,3)
Consider λ1/λ2 where priors are assessed to be:
Observe 10 events on first and 1 event on second:
Finding distribution of ratio, r = λ1/λ2, by simple simulation
Histogram of posterior of r = λ1/λ2; 10,000 simulations
r < 1 means λ1 < λ2; posterior probability is 0.6%
HIERARCHICAL MODELING
Analogy to selecting coins
Generic example: Unit is person or subgroup or treatment or study
If p1 = p2 = . . . = p9 = p (all 150 units exchangeable)
Assuming equal p’s, 95% CI for p: (0.63, 0.77)
Suppose ni independent observations on unit i
Bayesian view: G unknown means it has probability distribution
Beta(a,b) for a, b = 1, 2, 3, 4:
When G is Beta(a,b)
Bayesian questions:
Suppose uniform prior for a & b on integers 1, . . ., 10
Posterior probabilities for a & b
Calculating posterior distribution of G
Posterior mean of G (also predictive density for p)
Contrast with likelihood assuming all p’s equal
Bayes estimates
Bayes estimates are regressed or shrunk toward overall mean
Screening mammography for women in their 40s—Poisson example
U.S. Senate: Mammography WILL be effective!
Science by politics
Part of my presentation to NCAB
Characteristics of randomized trials
Breast cancer mortality reduction, by trial
Mortality per 100,000 life years
BC deaths, per 1000 women (Sweden only)
Mantel-Haenszel 95% confidence interval: 3% to 29% reduction
Bayesian hierarchical model allows for different trial effects
“The Canadian trial is a major outlier.”
Allowing for trial heterogeneity
Quantifying benefit—Assume 18% reduction in BC mortality
How to calculate? Recall BC deaths per 1000 women in Swedish trials.
Assuming 18% reduction—Hours of life expectancy gained per mammogram
Marcia Angell in NYTimes
The down side—Still assuming 18% reduction in BC mortality
Non-issues assuming 18% BC mortality reduction
Conclusions
Conclusions
DECISION ANALYSIS
Components of a typical decision problem
Example loss table, L(a,θ), discrete case
Bayes’ risk
Graphs of Bayes risks
What about a3? Suppose perfect test, at cost 2
Including a3
Continuous case, Loss function: L(a,θ)
Posterior (or current) distribution of θ given X
Not: P(θ > θ*)
Rather: Compare actions on basis of Bayes risk
Bayes risks of a1 and a2
Order of preference switches (to a2 over a1) if shift in mean or SD:
What about a3?
Expected value of perfect information (EVPI)
Expected value of sample information (EVSI)
Bayesian sample size and sampling strategy
COURSE OUTLINE
Addendum
Extensions of Little’s Model
ROC (Receiver Operating Characteristics)
Retesting Lowers ?
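
Worked note on the Poisson-rate slides listed above: two of the titles quote updates of (a,b), namely (10,5) with k=10 giving (20,6), and (4,2) with k=1 giving (5,3). A minimal sketch that reproduces this arithmetic follows. It assumes a conjugate gamma prior with shape a and rate b, and one unit of exposure per observation period. The slide titles do not state either assumption, although both are consistent with the quoted numbers. The function name `update_rate_prior` is hypothetical.

```python
def update_rate_prior(a, b, k, exposure=1):
    """Update (a, b) after observing k events over `exposure` units.

    Assumption (not stated in the slide titles): the prior on the
    Poisson rate is gamma with shape a and rate b, so the posterior
    is gamma with shape a + k and rate b + exposure.
    """
    return a + k, b + exposure


# The two updates quoted in the slide titles.
first = update_rate_prior(10, 5, 10)
second = update_rate_prior(4, 2, 1)
print(first, second)
```

Under these assumptions the posterior mean of the rate is a/b after updating, which could be compared across the two units before simulating their ratio.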