What does 95% effective mean for a Covid-19 vaccine?

Pfizer/BioNTech’s vaccine was reported as being 95% effective. This number is not your chance of staying Covid-free after vaccination: rather, it estimates how much your risk of catching Covid-19 falls relative to not being vaccinated. In the trial, the chance of staying Covid-free with the vaccine was in fact 99.96%.

On 9 November 2020, the world woke to the uplifting news that a safe and effective vaccine for Covid-19 was within reach. Pfizer and BioNTech reported results of their clinical trial for the vaccine to justifiably great acclaim. The vaccine was said to be 95% effective, based on outcomes observed so far. This was followed by news about two other vaccines with similarly high effectiveness.

Many people, including journalists, have interpreted 95% effectiveness as meaning: ‘If I take the vaccine, then there is at most a 5% chance that I will catch Covid-19’. That is not what effectiveness means, as reported in the vaccine studies. Here, we explain precisely what ‘effectiveness’ means statistically, and clarify how sample sizes affect the calculation of this rate and its reliability.

Some media reports have recognised that participants in the Pfizer trial are not a random sample from the full population: participation was subject to many restrictions relating to pre-existing conditions and, of course, the trial’s subjects had to volunteer. As a result, some sub-groups of the population (for example, those in frail health) were excluded from the study, and so we do not know what the impact of the vaccine would be on such individuals.

Furthermore, it is not clear that all participants were regularly tested for infection. What we discuss below, however, is distinct from these issues, and concerns clarification of how to measure effectiveness itself and understand its reliability, allowing for the fact that the study represents a sub-group of the population.

What were the study’s design and reported results?

In the Pfizer study, a total of 21,999 people received two doses of the vaccine and an equal number received the placebo. There were 170 Covid-19 positive cases in the study: eight among those who had received the vaccine and 162 among those who had received the placebo. The following table summarises this information:

	Vaccine	Placebo
Positive	8	162
Non-positive	21,991	21,837
Total	21,999	21,999

The 95% efficacy that Pfizer reported can be obtained as the percentage reduction in positive cases in the vaccinated group compared with the placebo group, which equals 100 x (162-8)/162 = 95.06%. Alternatively, a similar figure can be obtained as the percentage of all positive cases that occurred in the placebo group, which equals 100 x 162/170 = 95.29%.
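
The two calculations above can be checked with a few lines of Python. This is a minimal sketch using only the case counts from the table; the variable names are illustrative and do not come from the study.

```python
# Case counts reported for the Pfizer trial (taken from the table above).
cases_vaccine = 8
cases_placebo = 162

# Percentage reduction in positive cases relative to the placebo group.
reduction_pct = 100 * (cases_placebo - cases_vaccine) / cases_placebo

# Share of all positive cases that occurred in the placebo group.
placebo_share_pct = 100 * cases_placebo / (cases_vaccine + cases_placebo)

print(f"{reduction_pct:.2f}%")      # 95.06%
print(f"{placebo_share_pct:.2f}%")  # 95.29%
```

Neither calculation uses the number of participants in each arm, only the number of positive cases.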

On the other hand, the chance of remaining Covid-free with the vaccine equals 100 x 21,991/21,999 = 99.96%, which is the number that most potential recipients of the vaccine want to know. Even without the vaccine, this chance is very high at 100 x 21,837/21,999 = 99.26% (although it is important to remember that the trial group in the study is not representative of the whole population).
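
As a quick check of these two percentages, the sketch below recomputes them from the table. The helper name `covid_free_pct` is made up for this illustration.

```python
def covid_free_pct(non_positive: int, total: int) -> float:
    """Percentage of an arm's participants who did not test positive."""
    return 100 * non_positive / total

# Rows "Non-positive" and "Total" from the table of trial results.
with_vaccine = covid_free_pct(21_991, 21_999)
with_placebo = covid_free_pct(21_837, 21_999)

print(f"{with_vaccine:.2f}%")  # 99.96%
print(f"{with_placebo:.2f}%")  # 99.26%
```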

What is being estimated here and how reliable is it?

The study was designed to allow one to extrapolate from the small number of participants to the much larger world population. One issue is that the total number of Covid-19 cases reported in the study is quite small compared with the 85 million cases so far reported worldwide and the even larger potential number of cases. This raises the question of the precision of these percentages – that is, what is the margin of error around the reported numbers as a measure of the true efficacy that would result when applied to the entire population?

Note that in general, one needs to account for the sample size in each arm of the trial to obtain efficacy rates. To see why, note that the measure reported above depends solely on the number of positive cases in each arm of the trial. So suppose instead that the table looked like this:

	Vaccine	Placebo
Positive	8	162
Non-positive	3,392	40,438
Total	3,400	40,600

This differs from the previous table in that we have artificially changed the number vaccinated versus the number receiving the placebo among the non-positives and left the number of positive cases unaltered. Then the effectiveness measures calculated above will not change.

But it is obvious that the vaccine is much less effective in this case, since the proportion positive in the vaccinated group has increased from 8/21,999 to 8/3,400, that is, from 36 per 100,000 to about six and a half times that, at 235 per 100,000. This issue does not cause problems in interpreting the Pfizer trial because it had an equal number of participants in the placebo and vaccine arms.
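
To make the contrast concrete, the sketch below compares the actual and the hypothetical tables. It assumes that efficacy is measured as the proportional reduction in the infection rate between arms; the function names are illustrative only.

```python
def rate_per_100k(cases: int, total: int) -> float:
    """Positive cases per 100,000 participants in one arm."""
    return 100_000 * cases / total

def efficacy_pct(cases_v: int, total_v: int, cases_p: int, total_p: int) -> float:
    """Percentage reduction in the infection rate, vaccine arm versus placebo arm."""
    return 100 * (1 - (cases_v / total_v) / (cases_p / total_p))

# Actual trial: equal arms of 21,999 participants.
actual = efficacy_pct(8, 21_999, 162, 21_999)
# Hypothetical table: same case counts, but arms of 3,400 and 40,600.
hypothetical = efficacy_pct(8, 3_400, 162, 40_600)

print(f"{rate_per_100k(8, 21_999):.0f} per 100,000")  # 36 per 100,000
print(f"{rate_per_100k(8, 3_400):.0f} per 100,000")   # 235 per 100,000
print(f"actual: {actual:.1f}%")                       # actual: 95.1%
print(f"hypothetical: {hypothetical:.1f}%")           # hypothetical: 41.0%
```

With the same 8 and 162 positive cases, the rate-based measure falls from about 95% to about 41% once the arm sizes differ, while the case-count calculations stay at 95%.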

How should one take sample size into account?

To see how to take sample sizes into account, note that one can view the incidence of Covid-19 as similar to a coin toss, except that the two outcomes need not be equally likely and their relative chances are unknown.

Suppose that the chance of an individual contracting Covid-19 is v when vaccinated, and p when not. The Pfizer trial gives us estimates of these unknown probabilities based on a sample from these respective populations; these estimates are: v^est = 8/21,999 = 0.00036 and p^est = 162/21,999 = 0.00736.

These numbers tell us that based on the trial, the estimated chance of getting Covid-19 is 36 in 100,000 with the vaccine and 736 in 100,000 without it. We can also calculate the degree of accuracy in these estimates using a statistical model known as the binomial distribution.
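
A minimal sketch of that calculation follows. It assumes each participant's outcome is an independent draw with the same probability within an arm, so that the usual binomial standard error applies; the function name is hypothetical.

```python
import math

def binomial_estimate(cases: int, n: int) -> tuple[float, float]:
    """Return the estimated probability and its binomial standard error."""
    p_hat = cases / n
    std_err = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, std_err

v_est, v_se = binomial_estimate(8, 21_999)    # vaccine arm
p_est, p_se = binomial_estimate(162, 21_999)  # placebo arm

for label, est, se in [("vaccine", v_est, v_se), ("placebo", p_est, p_se)]:
    # Express both the estimate and its standard error per 100,000 people.
    print(f"{label}: {1e5 * est:.0f} per 100,000 (standard error {1e5 * se:.0f})")
```

The standard error shrinks as the number of participants in an arm grows, which is why the arm sizes matter for reliability.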

It is clear that a summary measure of effectiveness should be based only on how large p is relative to v. Therefore, a reasonable measure of effectiveness in the population is the quantity E* = 1 – v/p, which tells us how much the vaccine reduces infections relative to the base rate.
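
Written out with the trial's numbers, the measure and its estimate look as follows. The hat notation for the estimate is introduced here for illustration; v^est and p^est are the estimates defined earlier.

```latex
\[
E^{*} = 1 - \frac{v}{p},
\qquad
\hat{E} = 1 - \frac{v^{\mathrm{est}}}{p^{\mathrm{est}}}
        = 1 - \frac{8/21{,}999}{162/21{,}999}
        = 1 - \frac{8}{162}
        \approx 0.9506 .
\]
```

Because both arms have 21,999 participants, the sample sizes cancel and only the case counts remain.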

The estimated value of this is 1 – (8/21,999)/(162/21,999) = 1 – 8/162 = 95.06%. The ‘confidence interval’ for E* (a calculation that also requires knowing the sample sizes on each arm of the trial) equals [94.94%, 95.17%] which, statistically speaking, is the range that contains the unknown E* with high probability.
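
The text does not say which method produced its interval. As an illustration only, the sketch below uses one common textbook approximation, a normal interval on the logarithm of the risk ratio v/p. The choice of method, the 95% level (z = 1.96) and the function name are all assumptions, so its output need not match the range quoted above.

```python
import math

def efficacy_interval(cases_v: int, n_v: int, cases_p: int, n_p: int,
                      z: float = 1.96) -> tuple[float, float, float]:
    """Return (estimate, lower, upper) for E = 1 - v/p.

    Assumed method: normal approximation on the log of the risk ratio.
    """
    risk_ratio = (cases_v / n_v) / (cases_p / n_p)
    # Approximate standard error of log(risk ratio); depends on both arm sizes.
    se_log_rr = math.sqrt(1 / cases_v - 1 / n_v + 1 / cases_p - 1 / n_p)
    rr_low = math.exp(math.log(risk_ratio) - z * se_log_rr)
    rr_high = math.exp(math.log(risk_ratio) + z * se_log_rr)
    # A high risk ratio means low efficacy, so the bounds swap.
    return 1 - risk_ratio, 1 - rr_high, 1 - rr_low

estimate, lower, upper = efficacy_interval(8, 21_999, 162, 21_999)
print(f"E = {100 * estimate:.2f}%, interval [{100 * lower:.2f}%, {100 * upper:.2f}%]")
```

The arm sizes enter through the standard error, which is where unequal arms would change the width of the interval.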

If the sample sizes in the treatment and placebo arms are equal, then this measure and the percentage reduction in positive cases calculated earlier give identical answers. But if they are not, as was the case in the Oxford study reported a few weeks later, then the calculations we suggest need to be applied to make the result correspond to the World Health Organization’s definition of effectiveness as ‘the proportional reduction of disease in the vaccine group’.