Illusion of Effectiveness in the ‘definitive’ clinical trial

The believer’s attitude is one of unconditional trust in the results of the randomized clinical trial (RCT). The latter not only provides “unbiased” estimates of the relative efficacy of two or more therapies, but also furnishes numerical estimates of the absolute efficacy that translate more or less directly into the outcomes of real world clinical practice. The believer will thus view the results obtained in the clinic as interchangeable with the ones observed in the RCT, so that the mathematically consistent way to jointly examine them is to simply add together the corresponding successes and failures. This approach will work just fine if the underlying premise of equivalence between effectiveness and efficacy is true, yet it will backfire otherwise.
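The pooling rule described above can be sketched in a few lines of Python. This is only an illustration under assumed numbers: the trial size (1000 patients, 50% success) and the real world effectiveness (30%) are hypothetical values chosen for the sketch, not figures from the text, and the function names are made up for this example.

```python
def pooled_success_rate(trial_successes, trial_failures,
                        clinic_successes, clinic_failures):
    """Believer's estimate: trial and clinic outcomes are simply added together."""
    successes = trial_successes + clinic_successes
    failures = trial_failures + clinic_failures
    return successes / (successes + failures)


# Hypothetical numbers, assumed purely for illustration.
TRIAL_SUCCESSES, TRIAL_FAILURES = 500, 500   # RCT: 1000 patients, 50% efficacy
TRUE_EFFECTIVENESS = 0.30                    # assumed real world success rate


def expected_pooled_rate(clinic_patients):
    """Pooled estimate when the clinic yields its expected number of successes."""
    clinic_successes = round(TRUE_EFFECTIVENESS * clinic_patients)
    clinic_failures = clinic_patients - clinic_successes
    return pooled_success_rate(TRIAL_SUCCESSES, TRIAL_FAILURES,
                               clinic_successes, clinic_failures)


if __name__ == "__main__":
    # Show how slowly the pooled estimate moves away from the trial result.
    for n in (20, 100, 1000, 10000):
        print(n, round(expected_pooled_rate(n), 3))
```

With these assumed values, 100 clinic patients leave the pooled estimate at about 0.48, and even 10000 clinic patients only bring it down to about 0.32, still above the assumed effectiveness of 0.30.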

To see why, consider what happens in the hypothetical thought experiment previously outlined:
So for real world experience reflective of a single individual (or even a single practice, i.e. 20-100 patients), the magnitude of effectiveness will likely be overstated (since most therapies don’t work as well as advertised in papers). It will take a considerable number of patients (>1000 and likely 10000) to align the believer’s expectations with real world results.
Such a large number of patients with a single condition is unlikely to be encountered in a single individual’s professional lifetime (especially if the condition is rare), so the believer is stuck in an “evidential black hole”. Trapped by the large number of patients (the gravity) of the definitive clinical trial, he or she is forced to discount personal experience in favor of results that are only partially relevant to the patients he or she actually treats!
Furthermore, the believer will substantially overstate the precision of the estimate. When asked to produce a lower bound, i.e. a value below which the effectiveness is expected to fall with only a small probability (e.g. 5%), the following figure can be obtained:
Hence even if that physician sounds confident that the therapy works in 45-50% of patients, this is a gross overestimate and does not even bracket the “true” effectiveness unless the outcomes in a large number of real world patients are examined:
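A rough way to reproduce this kind of lower bound is sketched below. It is not necessarily the calculation behind the original figures. It assumes a uniform prior, so that the pooled counts give a Beta(1 + successes, 1 + failures) distribution, and it replaces the exact Beta quantile with a normal approximation so that only the Python standard library is needed. The trial counts (500 successes, 500 failures) and the real world effectiveness (30%) are hypothetical, as are the function names.

```python
from math import sqrt
from statistics import NormalDist


def lower_bound(successes, failures, prob=0.05):
    """Approximate `prob` quantile of Beta(1 + successes, 1 + failures).

    Uses a normal approximation matched to the Beta mean and variance,
    which is reasonable when both counts are large.
    """
    a, b = 1 + successes, 1 + failures
    mean = a / (a + b)
    sd = sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))
    return NormalDist(mean, sd).inv_cdf(prob)


# Hypothetical numbers, assumed purely for illustration.
TRIAL_SUCCESSES, TRIAL_FAILURES = 500, 500
TRUE_EFFECTIVENESS = 0.30


def believer_lower_bound(clinic_patients, prob=0.05):
    """Lower bound after pooling trial counts with expected clinic counts."""
    clinic_successes = round(TRUE_EFFECTIVENESS * clinic_patients)
    clinic_failures = clinic_patients - clinic_successes
    return lower_bound(TRIAL_SUCCESSES + clinic_successes,
                       TRIAL_FAILURES + clinic_failures, prob)


if __name__ == "__main__":
    for n in (20, 100, 1000, 10000):
        print(n, round(believer_lower_bound(n), 3))
```

Under these assumptions the 5% lower bound after 100 clinic patients is roughly 0.46, well above the assumed effectiveness of 0.30, which is the sense in which the believer's interval fails to bracket the "true" value.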

