Christian Robert reviewed online a paper that was critical of non-informative priors. Among the points discussed by him and other contributors (e.g. Keith O’Rourke) was the issue of *induced* priors, i.e. priors that arise from a transformation of the original parameters, or of the observables. I found this exchange interesting because I did something similar when revisiting an old research project that had been collecting digital dust on my hard disk. The specific problem had to do with the analysis of a biomarker that was measured with a qualitative technique, yielding a binary classification of measurements as present or absent in two experimental conditions (call them A and B). Ignoring some technical aspects of the study design, the goal was to calculate the odds ratio of the biomarker being expressed in condition B vs. A (the reference state, signifying absence of disease).

When writing the programs for the analysis, I defaulted to the N(0.0,1.0E-6) prior that epitomizes *non-informativeness* in BUGS (which parameterizes the normal by its precision, so this is a normal with standard deviation 1000). However, one of my co-authors asked the “What the @#$%& does this prior mean?” question. And then we stopped … and reflected on what we were about to do. You see, before the experiment started we had absolutely no prior information about the behaviour of the biomarker in either experimental state, so we did not want to commit one way or another. In other words, Laplace’s original uniform (or Beta(1,1)) prior would have been reasonable if the expression data for A and B were to be analyzed separately. However, we wanted to analyze the data with a logistic regression model, so was the ubiquitous N(0.0,1.0E-6) the prior we were after?

The answer is a loud NO! According to Wikipedia, the mother of all knowledge, the logit of a uniform variate follows the logistic distribution with location zero and scale one. Hence, the prior on the intercept of the logistic regression (interpretable as the log-odds of the biomarker being expressed in state A) had to be a Logistic(0,1).
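To see just how strong the supposedly non-informative normal really is on the probability scale, here is a quick simulation sketch (my own illustration, not part of the original analysis); recall that BUGS’s 1.0E-6 is a precision, i.e. a standard deviation of 1000:

```r
## induced prior on Pr(expressed | A) under the BUGS-style
## N(0, precision = 1e-6) prior on the logistic intercept
set.seed(7)
b0 <- rnorm(100000, mean = 0, sd = 1 / sqrt(1e-6))  # sd = 1000
pA <- plogis(b0)                                    # back-transform to a probability
mean(pA < 0.01 | pA > 0.99)  # nearly all prior mass piles up near 0 and 1
```

So the “non-informative” prior on the log-odds scale is violently informative on the probability scale, insisting that the biomarker is almost always or almost never expressed.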

Surprisingly, the induced prior on the log odds ratio of B vs. A was found (after trial and error and method-of-moments considerations) to be very well approximated by a 1:1 mixture of a logistic and a Gaussian, which clearly departs from the N(0.0,1.0E-6) prior we (almost) used:
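Incidentally, the method-of-moments part is easy to verify by hand (a small check of my own): a Logistic(0, s) has variance s²π²/3, so both components of the 1:1 mixture are matched exactly to the variance of the difference of two standard logistics, 2π²/3.

```r
## the difference of two Logistic(0,1) variates has variance 2*pi^2/3;
## both approximating densities are chosen to match it exactly
v_exact    <- 2 * pi^2 / 3
v_logistic <- sqrt(2)^2 * pi^2 / 3    # Logistic(0, sqrt(2)): var = s^2 * pi^2 / 3
v_normal   <- (pi * sqrt(2 / 3))^2    # Normal with sd = pi * sqrt(2/3)
c(v_exact, v_logistic, v_normal)      # all three coincide
```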

**Bottom line**: Even informative (in the BUGS sense!) priors can be pretty non-informative in some intuitively appropriate parameterization. Conversely, one could start with a non-informative prior in a parameterization that is easier to reason about and look for an induced prior (using analytic considerations or even simulations) to convert it to a parameterization that is more appropriate to the analytic plan at hand.
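As a concrete sketch of the simulation route (a minimal example of my own, not the study code): put independent Uniform(0,1) priors on the two expression probabilities, push them through the logit, and inspect the induced prior on the log odds ratio.

```r
## induced prior on the log odds ratio under Uniform(0,1) priors
## on the two expression probabilities
set.seed(42)
pA <- runif(100000); pB <- runif(100000)
lor <- qlogis(pB) - qlogis(pA)     # qlogis() is the logit function
c(mean = mean(lor), sd = sd(lor))  # compare with pi * sqrt(2/3) ~ 2.57
```

Any candidate parametric approximation can then be checked against these draws, either graphically or by fitting.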

(R code for the plots and simulations is given below)

```r
## approximating uniforms
logit <- function(x) log(x / (1 - x))
set.seed(1234)
N <- 10000000
s <- runif(N, 0, 1); s2 <- runif(N, 0, 1)
y <- logit(s)
y2 <- logit(s2)
m <- mean(y)
s <- sd(y)
x <- seq(-10, 10, .1)

## logistic is logit of a uniform
hist(y, prob = TRUE, breaks = 50, main = "intercept", xlab = "logit(A)")
lines(x, dnorm(x, m, s), col = "red")
lines(x, dlogis(x, 0, 1), col = "blue")
legend(-15, 0.20, legend = c("Normal(0,1)", "Logistic(0,1)"),
       lty = 1, col = c("red", "blue"))  # colors now match the lines drawn above

## approximating the difference of two uniforms
hist(y - y2, prob = TRUE, ylim = c(0, .25), breaks = 200,
     xlim = c(-10, 10), main = "OR between two U(0,1)",
     xlab = "logit(B)-logit(A)")
## logistic approximation
lines(x, dlogis(x, 0, sqrt(2)), col = "blue", lwd = 2)
## normal approximation
lines(x, dnorm(x, 0, pi * sqrt(2 / 3)), col = "red", lwd = 2)
## mixture of a logistic and a normal approximation
lines(x, 0.5 * (dlogis(x, 0, sqrt(2)) + dnorm(x, 0, pi * sqrt(2 / 3))),
      col = "green", lwd = 2)
## legends
NL <- expression(paste("Normal(0,", pi * sqrt(2/3), ")"))
LL <- expression(paste("Logistic(0,", sqrt(2), ")"))
ML <- expression(paste("0.5 Normal(0,", pi * sqrt(2/3),
                       ") + 0.5 Logistic(0,", sqrt(2), ")"))
legend(-6.5, 0.25, legend = c(NL, LL, ML),
       lty = 1, col = c("red", "blue", "green"))  # colors match the lines drawn above

## does it extend to more general cases?
m1 <- -2; m2 <- 2; s1 <- 1; s2 <- 2.5
l1 <- rlogis(N, m1, s1)
l2 <- rlogis(N, m2, s2)
d <- l1 - l2
hist(d, prob = TRUE, ylim = c(0, 0.25), breaks = 200)
plot(density(d))
lines(x, dlogis(x, m1 - m2, sqrt(s1^2 + s2^2)), col = "green", lwd = 2)
lines(x, dnorm(x, m1 - m2, pi * sqrt((s1^2 + s2^2) / 3)), col = "red", lwd = 2)
lines(x, 0.5 * (dnorm(x, m1 - m2, pi * sqrt((s1^2 + s2^2) / 3)) +
                dlogis(x, m1 - m2, sqrt(s1^2 + s2^2))), col = "blue", lwd = 2)
```

**Edit (29/11/2013):** Updated the first image due to an accidental reversal of the distribution labels.

November 28, 2013 at 11:47 am

Really interesting post, thanks! I’m also getting a pretty close approximation using a t-distribution with dt(mu = 0, sigma=2.36, sigma = 13). This is nice as the t-distribution is easy to use in BUGS/JAGS 🙂

November 28, 2013 at 2:00 pm

Thanks for stopping by!

Do you mean df=130?

November 28, 2013 at 2:44 pm

Sorry, I meant dt(mu = 0, sigma=2.36, df = 13). Got this by finding the best fitting t-distribution by maximum likelihood:

```r
library(MASS)

## sum of two standard logistics; by symmetry this has the same
## distribution as their difference
y <- rlogis(39999) + rlogis(39999)

# Non-standardized Student's t-distribution
dst2 <- function(x, mu, sigma, nu) {
  if (sigma <= 0 | nu <= 0.1) {
    0
  } else {
    dt((x - mu) / sigma, df = nu) / sigma
  }
}

fitdistr(y, "normal")
fitdistr(y, "logistic")
fitdistr(y, dst2, start = list(mu = 0, sigma = 1, nu = 5))

x <- seq(-10, 10, .1)
hist(y, freq = FALSE, breaks = 80, xlim = c(-10, 10))
lines(x, dnorm(x, 0, 2.56), col = "red", lwd = 2)
lines(x, dlogis(x, 0, 1.44), col = "green", lwd = 2)
lines(x, dst2(x, 0, 2.35, 13), col = "blue", lwd = 2)
```