## An algebra guy’s take on the meta-analysis posts

As I was reading through the meta-analysis posts in order to correct various typos, the forgotten non-probability me woke up and raised the following question:

What if one were to treat the reported RR ($t$), 95% confidence interval ($t_L, t_U$) and p-value ($p_v$) as the exact, unrounded values, in essence ignoring the round-off error?

Could this lead to a (simpler?) solution that bypasses the need for Monte Carlo? What would this solution look like, and how would it differ (in implementation) from the Bayesian one? More importantly, how does it hold up against the Bayesian solution?

The solution involves solving algebraically for the various quantities; if more than one estimate can be obtained (e.g. the log-relative risk can be obtained from the confidence interval as well as the reported RR) then one averages over these estimates. So the solution looks something like this:

1. First we obtain an estimate for $t$ from the reported RR: $t_1 = \log(t)$
2. Secondly we transform the confidence interval to estimates for $t, se$:
    1. $t_2 = \log(\sqrt{t_U \times t_L})$
    2. $se_1 = \log(\sqrt{\frac{t_U}{t_L}}) \times (q_{0.975})^{-1}$
3. Averaging over $t_1, t_2$ yields $t=0.5 \times(t_1+t_2)$
4. A second estimate for the $se$ is obtained from the p-value using the quantile function $q_N(x)$ of the normal distribution: $se_2 = \vert{ \frac{t}{q_N(p_v)}}\vert$
5. Averaging over $se_1, se_2$ yields $se = 0.5\times(se_1+se_2)$
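
The five steps above can be sketched in a few lines of Python. This is only a minimal illustration, not the implementation used in the posts: the function name `algebraic_estimate`, its argument names and the `two_sided` flag are all made up for this example. By default the p-value is plugged into the normal quantile function exactly as written in step 4; the flag halves it first, which is an assumption about how a two-sided p-value might be handled.

```python
from math import log, sqrt
from statistics import NormalDist


def algebraic_estimate(rr, ci_low, ci_high, p_value, two_sided=False):
    """Return (t, se): the averaged log-relative risk and standard error.

    Hypothetical helper that mirrors steps 1-5 of the list above.
    """
    q_n = NormalDist().inv_cdf  # quantile function of the standard normal

    # Step 1: log-relative risk from the reported point estimate.
    t1 = log(rr)

    # Step 2: log-relative risk and standard error from the 95% CI.
    t2 = log(sqrt(ci_high * ci_low))
    se1 = log(sqrt(ci_high / ci_low)) / q_n(0.975)

    # Step 3: average the two estimates of the log-relative risk.
    t = 0.5 * (t1 + t2)

    # Step 4: standard error from the p-value.
    # Assumption: with two_sided=True the p-value is halved before the
    # quantile is taken; with the default it is used as written in step 4.
    # Note that p = 0.5 gives a zero quantile and therefore a division error.
    p = p_value / 2 if two_sided else p_value
    se2 = abs(t / q_n(p))

    # Step 5: average the two estimates of the standard error.
    se = 0.5 * (se1 + se2)
    return t, se


# Made-up inputs, purely to show the call.
t, se = algebraic_estimate(rr=0.80, ci_low=0.66, ci_high=0.97, p_value=0.02)
print(t, se)
```

Only the standard library is needed here, since `statistics.NormalDist` provides the normal quantile function.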

Of note, the second step is also used in the Monte Carlo algorithm when generating samples for the quantities $X,Y$ from the confidence interval, so the algebraic solution lives inside the Bayesian one. Furthermore, the algebraic solution uses a statistical concept (the mean) to arbitrate between alternative estimates of the same quantity.

As the algebraic solution does indeed look simple enough to be programmed in Excel, one may decide to use it rather than the complicated Monte Carlo one. So in a subsequent post we will pit these two solutions against each other in a “death match” to decide whether the extra complexity of the Bayesian solution is worth the extra programming effort.