Archive for the ‘probability’ Category

Page Rev Bayes – we found statistical irregularities in a randomized controlled trial

November 9, 2013

The Bayesian counterpart to the frequentist analysis of the Randomized Controlled Trial is in many respects more straightforward than the frequentist analysis. One starts with a prior distribution over the probabilities of a patient being assigned to each of the three arms and combines it with the (multinomial) likelihood of observing a given assignment pattern in the 240 patients enrolled in the study. Bayes’ theorem then gives the posterior distribution quantifying our belief about the magnitudes of the unknown assignment probabilities. Note that testing strict equality is bound to lead us straight into the arms of the Lindley paradox, so a different approach is likely to be more fruitful. Specifically, we specify a maximum tolerable threshold for the difference between the maximum and the minimum probability of being assigned to the trial arms (let’s say 1-5%) and we directly calculate the posterior probability that this difference stays within the threshold, and hence its complement, the “probability of foul play”.

In the absence of prior evidence for (or against) foul play, we use a non-informative prior in which all possible values of the assignment probabilities are equally plausible. This (Dirichlet) prior corresponds to a prior state of knowledge in which three individuals were randomized and all three ended up in different treatment arms. Under this prior, the posterior distribution is itself a Dirichlet distribution, with each parameter equal to the number of individuals actually assigned to the corresponding arm plus one. The following R code may then be used to calculate the probability of foul play, as previously defined i.e.
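
A minimal Python sketch of this calculation is given here as a stand-in for, not a copy of, the R code mentioned above. The arm counts are hypothetical placeholders that merely sum to 240 (the per-arm counts are not stated in this text), so the output should not be expected to match the figure quoted in the post. The function names and the Monte Carlo approach (sampling the Dirichlet posterior by normalizing Gamma draws) are likewise assumptions made for illustration.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one sample from Dirichlet(alpha) by normalizing Gamma(alpha_i, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def prob_within_tolerance(counts, tol, n_draws=20000, seed=1):
    """Monte Carlo estimate of P(max(p) - min(p) <= tol | counts).

    Uses the uniform Dirichlet(1, ..., 1) prior, so the posterior is
    Dirichlet(counts + 1).
    """
    rng = random.Random(seed)
    alpha = [c + 1 for c in counts]
    hits = 0
    for _ in range(n_draws):
        p = sample_dirichlet(alpha, rng)
        if max(p) - min(p) <= tol:
            hits += 1
    return hits / n_draws


if __name__ == "__main__":
    # HYPOTHETICAL arm sizes summing to 240; not the trial's actual counts.
    hypothetical_counts = (100, 80, 60)
    for tol in (0.01, 0.05):
        within = prob_within_tolerance(hypothetical_counts, tol)
        print(f"tolerance {tol:.2f}: P(within) = {within:.4f}, "
              f"P(exceeds) = {1 - within:.4f}")
```

Because the same seed is reused for every tolerance, the estimates for different tolerances are computed on the same posterior draws, so widening the tolerance can only increase the estimated probability of staying within it.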


The probability that the difference stays within the tolerance comes down to 0.4%, which is numerically close to the frequentist answer yet has a more intuitive interpretation: given the observed arm sizes and the chosen tolerance for the difference in assignment probabilities, the odds for “foul play” are 249:1.
Increasing the tolerance will obviously decrease these odds, but in that case we would be willing to tolerate larger differences in assignment probabilities. Although these results are mathematically trivial (and non-controversial), the plot becomes more convoluted when one proceeds to use them to make a declaration of “foul play”. For in that case a decision needs to be made, one that has to consider not only the probabilities of the uncertain events (“foul play” vs. “not foul play”) but also the consequences for the journal, the study investigators and the scientific community at large. At this level one would need to decide whether odds of 249:1 are high enough for subsequent action to be taken. But this consideration would take us into the realm of decision theory (and it is already 11 pm).
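
The 249:1 figure is simply the 0.4% probability restated as odds for the complementary event. A small sketch of that arithmetic (the function name is an assumption made for illustration):

```python
def odds_against(p):
    """Odds of the complementary event, (1 - p) : p, for a probability 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return (1.0 - p) / p


# A 0.4% probability of staying within the tolerance, as quoted in the post.
print(f"{odds_against(0.004):.0f}:1")
```

A larger probability of staying within the tolerance (for instance after widening the tolerance) gives smaller odds, which is the trade-off described above.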


Robert Heinlein and the distinction between a scientist and an academician

October 27, 2013

Robert Heinlein, the author of Starship Troopers (no relationship to the movie, by the way), wrote an interesting paragraph in his 1939 short story Life-Line:

One can judge from experiment, or one can blindly accept authority. To the scientific mind, experimental proof is all important and theory is merely a convenience in description,
to be junked when it no longer fits. To the academic mind, authority is everything and facts are junked when they do not fit theory laid down by authority.

This short paragraph summarises the essence of the differences between Bayesian (scientific mind) and frequentist (academic mind) inference or at least their application in scientific discourse.

For objective Bayesians, models are only convenience instruments to summarise and describe possibly multi-dimensional data without having to carry the weight of paper, disks, USB sticks etc. containing the raw points. Parameters do the heavy lifting of models, and the parametric form of a given model may be derived in a rigorous manner using a number of mathematical procedures (e.g. maximum entropy). Given such a specification, one can use an empirical body of data D to calculate P(M|D), sequentially rejecting models that do not fit (a nice example is given in the second section of Jaynes’ entropy concentration paper).
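
As a rough illustration of calculating P(M|D) and updating it as data arrive, the sketch below compares a few fixed-parameter models for a binary outcome. The candidate models, their equal prior weights and the data batches are all hypothetical, and the function names are assumptions made for this example; it is not the example from Jaynes’ paper.

```python
import math


def log_binomial_likelihood(successes, failures, theta):
    """Log-likelihood of the data under success probability theta (0 < theta < 1).

    The binomial coefficient is omitted: it is identical for every model
    and cancels when the posterior is normalized.
    """
    return successes * math.log(theta) + failures * math.log(1.0 - theta)


def posterior_model_probs(models, priors, successes, failures):
    """Return P(M | D) for each named model via Bayes' theorem."""
    log_post = {}
    for name, theta in models.items():
        log_prior = math.log(priors[name]) if priors[name] > 0 else float("-inf")
        log_post[name] = log_prior + log_binomial_likelihood(successes, failures, theta)
    shift = max(log_post.values())  # subtract the maximum for numerical stability
    weights = {name: math.exp(v - shift) for name, v in log_post.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def update_sequentially(models, priors, batches):
    """Feed batches of (successes, failures); each posterior becomes the next prior."""
    current = dict(priors)
    for successes, failures in batches:
        current = posterior_model_probs(models, current, successes, failures)
    return current


if __name__ == "__main__":
    # HYPOTHETICAL candidate models and data, for illustration only.
    models = {"theta=0.3": 0.3, "theta=0.5": 0.5, "theta=0.7": 0.7}
    priors = {name: 1.0 / len(models) for name in models}
    batches = [(7, 3), (6, 4), (8, 2)]
    for name, prob in update_sequentially(models, priors, batches).items():
        print(f"{name}: {prob:.4f}")
```

Updating batch by batch gives the same posterior as processing all the data at once, which is what makes the sequential rejection of ill-fitting models coherent.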

Now consider the situation of the frequentist mind: even though one can (and most certainly will!) use a hypothesis test (aka reject the null) to falsify the model, the authoritarian can (and most certainly will!) hide behind the frequentist modus operandi and claim that only an unlikely body of data was obtained, not that an inadequate model was utilized.

This is seen (and in fact enforced) in the discussion of every single scientific paper under the standardized second-to-last paragraph about ‘alternative explanations’. This section is all about bowing down to authority, offering (often convoluted and mystical) explanations that the data obtained are at fault and that the model falsified in the results section is in fact true. Depending on the weight of the authoritative figures who will write the commentary about the aforementioned paper, we can (and most certainly will!) end up in the highly undesirable situation of falsifying data rather than models.

Compare this with the hypothetical structure of a Bayesian paper, in which the alternative hypotheses would be built as alternative models (or parameter values) and systematically compared, with the ill-fitting models (even those held to be true by academic figures of the highest authority) triaged to oblivion.

As a concluding statement, note that our systematic failure to respond to the financial crisis or even to advance science in the last 3-4 decades can be traced to the dominating influence of academicians over scientists. Rather than systematically evaluating evidence for or against particular models in specific domains, we seem to only judge models/explanations by the authority/status of their proponents, a situation not unlike the one in the 30s when Heinlein wrote the aforementioned piece.

Illusion of Effectiveness in the ‘definitive’ clinical trial

September 2, 2013

The believer’s attitude is one of unconditional trust in the results of the randomized clinical trial (RCT). The latter not only provides “unbiased” estimates of the relative efficacy of two or more therapies, but also furnishes numerical estimates of the absolute efficacy that translate more or less into the outcomes of real world clinical practice. The believer will thus view the results obtained in the clinic as interchangeable with the ones observed in the RCT, so that the mathematically consistent way to jointly examine them is to simply add together the corresponding successes and failures. This approach will work just fine if the underlying premise of equivalency between effectiveness and efficacy is true, yet it will backfire otherwise.

To see why, consider what happens in the hypothetical thought experiment previously outlined:
So for real world experience reflective of a single individual (or even a single practice), i.e. 20-100 patients, the magnitude of effectiveness will likely be overstated (since most therapies don’t work as well as advertised in papers). It will take a considerable number of patients (>1000 and likely 10000) to align the believer’s expectations with real world results.
Such a large number of patients with a single condition is unlikely to be encountered in a single individual’s professional lifetime (especially if the condition is rare), so that a believer is stuck in an “evidential black hole”. Being trapped by the large number of patients (the gravity) of the definitive clinical trial, he or she is forced to discount personal experience for results that are only partially relevant to the patients they actually treat!
Furthermore, the believer will substantially underestimate the uncertainty of the estimate; when asked to produce a value below which the effectiveness is expected to be found with a small probability, e.g. 5%, the following figure can be obtained:
Hence even if that physician sounds confident that the therapy works in 45-50% of patients, this is a gross overestimate and does not even bracket the “true” effectiveness unless the outcomes in a large number of real world patients are examined:
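
The believer’s pooling rule can be sketched in a few lines of Python. All numbers here are hypothetical (a 1000-patient trial with 50% efficacy and a real world effectiveness of 30%); they are not the values behind the post’s figures, and the function names are assumptions made for illustration. A uniform Beta(1, 1) prior is assumed, and the 5% lower bound is approximated by Monte Carlo.

```python
import random


def pooled_posterior(trial_successes, trial_failures,
                     clinic_successes, clinic_failures):
    """Beta(a, b) posterior after adding trial and clinic counts to a Beta(1, 1) prior."""
    a = 1 + trial_successes + clinic_successes
    b = 1 + trial_failures + clinic_failures
    return a, b


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


def beta_quantile(a, b, q, n_draws=20000, seed=1):
    """Monte Carlo approximation of the q-th quantile of Beta(a, b)."""
    rng = random.Random(seed)
    draws = sorted(rng.betavariate(a, b) for _ in range(n_draws))
    return draws[int(q * (n_draws - 1))]


if __name__ == "__main__":
    # HYPOTHETICAL inputs: trial efficacy 50%, real world effectiveness 30%.
    trial = (500, 500)
    true_effectiveness = 0.30
    for n_clinic in (20, 100, 1000, 10000):
        successes = round(n_clinic * true_effectiveness)
        a, b = pooled_posterior(*trial, successes, n_clinic - successes)
        print(f"{n_clinic:>6} clinic patients: pooled mean = {beta_mean(a, b):.3f}, "
              f"5% lower bound = {beta_quantile(a, b, 0.05):.3f}")
```

With these assumed inputs the pooled estimate stays close to the trial’s efficacy until the clinic patients outnumber the trial’s, which is the “gravity” effect described above.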

Four basic attitudes towards efficacy and its relation to effectiveness

September 1, 2013

I will continue this series of posts regarding the appraisal of efficacy (“how well a therapeutic intervention worked in a randomized experiment”) and its translation to statements about effectiveness (“how well the intervention worked in the real world”), by considering the attitudes that one can adopt towards these issues. The aim is to develop a sophisticated approach, or rather a vantage point that one would almost always want to adopt when considering the implications of having data about the efficacy (results of a trial) and effectiveness (success rate in real world practice). However, the vantage point will only become evident by considering a basic set of attitudes, which are described here: (more…)

The expectation of the ratio of two random variables

August 4, 2013

I was recently revising a paper concerning statistical simulations of hemodialysis trials, in which I examine the effects of different technical aspects of the dialysis prescription at the population level. I had used the reported figures from a number of recent high profile papers, when I noticed that while the results were right on average, there was a substantial number of outliers, i.e. “digital patients” who would actually not be among the living if they were to be dialyzed with these parameters in the real world. (more…)
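
The title’s point, that the expectation of a ratio is generally not the ratio of the expectations, can be checked exactly on a toy case. The two discrete variables below are hypothetical and unrelated to the dialysis simulations; they are assumed independent and uniform over the listed values.

```python
from fractions import Fraction


def expectation(values):
    """Mean of equally likely values, computed exactly with fractions."""
    return sum(values, Fraction(0)) / len(values)


# HYPOTHETICAL independent variables, each uniform over its listed values.
xs = [Fraction(v) for v in (2, 4, 6)]
ys = [Fraction(v) for v in (1, 2, 3)]

ratio_of_means = expectation(xs) / expectation(ys)
mean_of_ratios = expectation([x / y for x in xs for y in ys])

print("E[X] / E[Y] =", ratio_of_means)
print("E[X / Y]    =", mean_of_ratios)
```

Here E[X/Y] equals E[X] times E[1/Y], and E[1/Y] exceeds 1/E[Y] for a positive, non-constant Y, so plugging reported averages into a ratio can misstate what happens to individual simulated patients.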

Argumentation between people of different convictions: Practical Implications of the Lindley Paradox

July 9, 2013

The Lindley paradox can teach us some deep lessons about the conduct of our everyday business: one cannot (and in fact should not!) expect to win an argument when confronting a person with much stronger convictions, even if that person is wrong! Furthermore, when we witness such an encounter, we cannot (and in fact should not!) always attribute the outcome to other reasons, e.g. deference to authority, lack of determination or even outright bullying. Even in the absence of these factors, a rational exchange between two individuals of widely different convictions would lead to the one with the stronger convictions coming out of the debate a winner. This is not particularly troublesome if that person is right, but it turns out to be problematic when that person is outright wrong. (more…)

How to choose among alternative narratives

June 25, 2013

The arithmetic of choosing among alternative narratives, in the clinic and elsewhere, can be made rigorous by selecting a numerical scale for representing everything from the extreme of falseness/impossibility all the way to truthfulness/certainty. If the scale is selected to be a probability, ranging from 0 to 1, and certain rules for the manipulation of probabilities as mathematical objects are employed, then one arrives at a formal inferential system. This system, which allows for logically consistent reasoning in the face of uncertainty, is known as Bayesian probability theory, and its modern development can be found in the texts by Keynes, Cox and Jaynes. (more…)
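
For reference, the rules in question are usually stated as the sum and product rules, from which Bayes’ theorem follows. In the notation below (an assumption of this sketch, not taken from the post), C is the background information, D the data, and the narratives H_j are taken to be mutually exclusive and exhaustive:

```latex
\begin{align}
P(A \mid C) + P(\bar{A} \mid C) &= 1
  && \text{(sum rule)} \\
P(A, B \mid C) &= P(A \mid B, C)\, P(B \mid C)
  && \text{(product rule)} \\
P(H_i \mid D, C) &= \frac{P(D \mid H_i, C)\, P(H_i \mid C)}
                        {\sum_j P(D \mid H_j, C)\, P(H_j \mid C)}
  && \text{(Bayes' theorem)}
\end{align}
```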

Optimizing Survival Likelihoods With Poisson Models for Rate and Exposure

October 13, 2012

In a previous post I mentioned that Poisson models can be used to carry out survival analysis tasks e.g. estimation of survival curves or even relative risk modelling. Yet, I never showed how this can be done. So I will close the gap today and highlight how this works from a purely mathematical vantage point.
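
A minimal sketch of the standard result behind this connection, assuming the simplest case of a constant hazard: the survival log-likelihood and the Poisson log-likelihood with mean equal to rate times exposure differ only by a term that does not involve the rate, so both are maximized at events divided by total exposure. The follow-up times and event indicators are hypothetical, and the function names are assumptions made for illustration.

```python
import math


def survival_loglik(rate, times, events):
    """Constant-hazard (exponential) log-likelihood: sum of d*log(rate) - rate*t."""
    return sum(d * math.log(rate) - rate * t for t, d in zip(times, events))


def poisson_loglik(rate, times, events):
    """Poisson log-likelihood treating each event indicator d as Poisson(rate * t)."""
    return sum(
        d * math.log(rate * t) - rate * t - math.lgamma(d + 1)
        for t, d in zip(times, events)
    )


def hazard_mle(times, events):
    """Maximum likelihood estimate of the hazard: events per unit of exposure."""
    return sum(events) / sum(times)


if __name__ == "__main__":
    # HYPOTHETICAL follow-up times and event indicators (1 = event, 0 = censored).
    times = [2.0, 5.0, 1.5, 7.0, 3.0]
    events = [1, 0, 1, 1, 0]
    for rate in (0.05, 0.1, 0.5):
        gap = poisson_loglik(rate, times, events) - survival_loglik(rate, times, events)
        print(f"rate {rate}: Poisson minus survival log-likelihood = {gap:.6f}")
    print("hazard MLE =", hazard_mle(times, events))
```

The printed gap is the same for every rate (it is the sum of log exposure over the observed events), which is why Poisson regression software with a log-exposure offset can be used to fit such survival models.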


Survival Analysis via Hazard Based Modeling and Generalized Linear Models

October 5, 2012

The connection between survival analysis via hazard based modelling and generalized linear models was made very early, soon after the description of the proportional hazards model (PHM; Cox, 1972) and of generalized linear models (GLM; Nelder and Wedderburn, 1972). For example,

– Breslow (1974) considers the proportional hazard model as a discrete time logistic regression in which discrete probability masses are put on the (ordered) set of observed failure times (more…)