Statistical considerations for the design of randomized, controlled trials for probiotics and prebiotics

By Prof. Daniel Tancredi, UC Davis, USA

The best evidence for the efficacy of probiotics or prebiotics generally comes from randomized controlled trials. The proper design of such trials should strive to use the available resources to achieve the most informative results for stakeholders, while properly accounting for the consequences of correct and incorrect decisions. It is crucial to understand that even well-designed and well-executed studies cannot entirely eliminate uncertainty from statistical inferences. Those inferences could be incorrect, even though they were made rigorously and without any procedural or technical errors. By “incorrect”, I mean that the decisions made may not correspond to the truth about the unknown population parameters of interest. Those parameters involve the distribution of study variables in the entire population, but our inferences are inductive and based on just the fraction of the population that appeared in our sample, creating the possibility for discordance between those parameters and our inferences about them. Although rigorous statistical inference procedures can allow us to control the probabilities of certain kinds of incorrect decisions, they cannot eliminate them.

For example, consider a two-armed randomized controlled trial designed to address a typical null hypothesis, that the probability of successful treatment is the same for the experimental treatment as for the comparator. Depending on the analytical methods to be employed, that null hypothesis could also be phrased as saying that the difference in successful treatment probabilities between the two arms is zero or that the ratio of the successful treatment probabilities between the two groups is one. Suppose the study sponsor has two possible choices regarding the null hypothesis, either to reject it or fail to reject it. (The latter choice is colloquially called “accepting the null hypothesis”, but that is a bit of an overstatement, as the absence of evidence for an effect in a sample typically does not rise to the level of being convincing evidence for the absence of an effect in the population.)

With these two choices about the null hypothesis, there are two major types of “incorrect decisions” that can be made: the null hypothesis could be true for the population but the study data led to a decision to reject the null hypothesis, a result conventionally called a “Type-1” error. Or the null hypothesis could be false for the population but the study data led to a decision not to reject the null hypothesis, conventionally called a “Type-2” error. Conversely, there are also two potentially correct decisions. One could fail to reject the null hypothesis when the null hypothesis is true for the population, a so-called “true negative”, or one could reject the null hypothesis when the null hypothesis is not true, a so-called true positive.

The consequences of these four different decision classifications vary from one stakeholder to another, and thus it is unwise to rely solely and simply on commonly used error probabilities when planning studies. The wiser approach is to set the error probabilities so that they properly account for the relative gains and losses to a stakeholder that arise from correct and incorrect decisions, respectively. From long experience assessing the design of clinical trials for probiotics and prebiotics, I recommend that stakeholders in the design phase of studies give thought to the following three statistical considerations.

Pay attention to power

Power is the probability of avoiding a type-2 error. In other words, under the condition that an assumed true effect exists in a study population and that the type-1 error probability has been controlled at a given value, power is computed as the probability of avoiding the incorrect decision to fail to reject the null hypothesis. Standard practice is to set the type-1 error probability at 5% and to determine a sample size that achieves 80% power for an assumed alternative hypothesis, one stating that the true effect is of a specific given magnitude corresponding to a so-called meaningful effect size. That effect size is typically called a ‘minimum clinically significant difference’ (MCSD) or something similar, because ideally the assumed effect size would be the smallest of the values that would be clinically important. As a practical matter, though, the MCSDs used to power studies are often larger than some of the values that would also be clinically significant, because the higher the magnitude of the effect size, the lower the sample size requirements and thus the better the chance of the study being perceived as “affordable” by study sponsors. Nevertheless, let’s consider what it means for the sponsor to accept that the study should be powered at merely the conventional 80% level. Under the assumptions that the true effect in the population is the MCSD and that the study achieves its target sample size, a sponsor of a study that has only 80% power is taking a 1-in-5 chance that the sample results would not be statistically significant (and that the null hypothesis would fail to be rejected). Such an incorrect decision could have major adverse implications for the sponsor (and for potential beneficiaries of the intervention), particularly given the investments that have been made in the research program and the implications the incorrect decision could have for misinforming future decisions regarding the specific intervention and indeed related interventions.
A 20% risk may not be worth taking.
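
As a rough illustration of that 1-in-5 chance, the sketch below approximates power for a two-arm comparison of means using a normal approximation. The function name `approx_power`, the standardized effect `d` (mean difference in SD units), the equal arm sizes, and the example figures (d = 0.2 with 393 participants per arm) are assumptions chosen for illustration, not values taken from any particular trial.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(d, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    d is the assumed true standardized mean difference (in SD units).
    The negligible chance of rejecting in the wrong direction is ignored.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)       # critical value set by the type-1 error probability
    expected_stat = d * sqrt(n_per_arm / 2)  # expected test statistic under the alternative
    return z.cdf(expected_stat - z_crit)

power = approx_power(d=0.2, n_per_arm=393)
print(f"power = {power:.3f}, type-2 error risk = {1 - power:.3f}")
```

With these illustrative inputs the approximate power is close to 80%, so the type-2 error risk is close to 20%.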

All other considerations being equal, the risk of a type-2 error could be lowered by increasing the sample size. Under regular asymptotic assumptions that generally apply, increasing the target sample size by about one-third would cut a 20% type-2 error risk in half, to 10%. Increasing the target sample size by two-thirds reduces it all the way to 5%.
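
The one-third and two-thirds figures can be checked with a short calculation. This is a minimal sketch assuming a two-sided test at a 5% type-1 error probability and the usual normal approximation, under which the required sample size is proportional to the square of the sum of two standard normal quantiles. The function name `sample_size_multiplier` is hypothetical.

```python
from statistics import NormalDist

def sample_size_multiplier(new_power, old_power=0.80, alpha=0.05):
    """Factor by which n must grow to raise power, with the effect size held fixed.

    Under the normal approximation, required n is proportional to
    (z_{1-alpha/2} + z_{power})**2, so the multiplier is a ratio of those terms.
    """
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)
    return ((z_alpha + z(new_power)) / (z_alpha + z(old_power))) ** 2

for target in (0.90, 0.95):
    multiplier = sample_size_multiplier(target)
    print(f"80% -> {target:.0%} power: multiply n by {multiplier:.2f}")
```

The multipliers come out at roughly 1.34 and 1.66, consistent with increases of about one-third and two-thirds.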

Define the true minimum clinically significant effect size applicable to your study

Another important question is where to set the minimum clinically significant effect. Often that effect is based on prior studies without any adjustment—but this can neglect key considerations. Prior effects of an intervention are typically biased in a direction that overstates the benefits of the intervention, especially if the intervention emerged from smallish early-phase studies. More fundamentally, from the perspective of decision theory the estimated effects seen in prior studies do not specifically address what could truly be the minimum clinically meaningful effect when one considers the possible benefits, risks, and costs of the intervention. Probiotics and prebiotics are typically relatively benign interventions in terms of adverse events, so it could be that even more modest favorable impacts on health than were seen in prior studies are still worthwhile.

Powering your study based on what truly is a minimal clinically meaningful effect may lead to a better overall strategy for optimizing net gains, while giving the intervention an appropriately high chance of showing that it works. Although the smaller the assumed effect size, the larger the required sample size needed to detect it (all other factors being the same), a proper assessment of the relative risks and benefits of the intervention and, also, of correct and incorrect decisions about the intervention, may provide a strong basis for making that investment.

In addition, there is another important but often overlooked aspect when deciding on what is a worthwhile improvement. We frequently turn to clinicians to determine what would be a worthwhile improvement, and it is natural for a clinician to address that question by considering what would be a meaningful improvement for a patient who responds to the intervention. Keep in mind, though, that an intervention could be worthwhile for a population if it achieves what would be a worthwhile improvement for a single patient (say, a mean improvement of 0.2 SD on a quality-of-life scale) in only a fraction of the patients in the overall population, say 50%. There are many conditions for which having an intervention that works for only large subsets of the population could be valuable in improving the population’s overall health and wellness. In this example, where the worthwhile improvement for an individual is 0.2 SD and the worthwhile responder percentage is 50%, the worthwhile improvement that should be used to power the study would be 0.1 SD, which is equal to (0.2 SD × 50%) + (0 SD × 50%), with the latter product quantifying an assumed absence of a benefit in the non-responders. What should be gleaned from this example is that the minimum clinically important effect for a population is typically less than the minimum clinically important effect for an individual. The effect used to power the study should be the one that applies to the relevant population. Again, that effect should be chosen so that it balances benefits relative to the costs and harms of the intervention while accounting also for variation in whether and how much individuals in the population may respond. When study planners fail to account for this variation, the result is a study that is underpowered for detecting meaningful population-level effects.
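
The arithmetic in this example, and what it implies for sample size, can be sketched as follows. The function names `population_effect` and `n_per_arm` are hypothetical, and the sample-size formula is the standard normal approximation for a two-sided, two-arm comparison of means with equal allocation. It also assumes the outcome SD is unchanged by the mix of responders and non-responders, which is a simplification.

```python
from math import ceil
from statistics import NormalDist

def population_effect(individual_effect, responder_fraction):
    """Average effect when non-responders are assumed to gain nothing."""
    return individual_effect * responder_fraction + 0.0 * (1 - responder_fraction)

def n_per_arm(d, power=0.80, alpha=0.05):
    """Approximate per-arm sample size for a two-sided, two-sample z-test."""
    z = NormalDist().inv_cdf
    return ceil(2 * (z(1 - alpha / 2) + z(power)) ** 2 / d ** 2)

d_individual = 0.2  # worthwhile improvement for a responder, in SD units
d_population = population_effect(d_individual, responder_fraction=0.5)

print(f"population-level effect: {d_population:.2f} SD")
print(f"n per arm if powered for {d_individual} SD: {n_per_arm(d_individual)}")
print(f"n per arm if powered for {d_population:.1f} SD: {n_per_arm(d_population)}")
```

Under these assumptions, a study powered for the individual-level 0.2 SD would enroll only about a quarter of the participants needed to detect the 0.1 SD population-level effect with 80% power.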

Improving the signal-to-noise ratio

In general, effect sizes can be expressed analogously to a mean difference divided by a standard error, and thus can be thought of as a signal-to-noise ratio. Sample size requirements depend crucially on this signal-to-noise ratio. Typically, standard errors are proportional to outcome standard deviations and inversely proportional to the square root of the sample size. The latter is key because it means that if the expected signal were cut in half, the noise would also need to be cut in half to maintain the signal-to-noise ratio, which means that if you cannot alter the outcome standard deviation, then you would need to quadruple the sample size. Happily, this also applies in the opposite direction: if you can double the expected signal relative to the outcome standard deviation, you would need only one-fourth the sample size to achieve the desired power, all other things being equal.
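
The square-root relationship can be made concrete with a small sketch. It assumes a two-arm comparison of means with equal arm sizes and a common outcome SD, so that the standard error of the difference is SD × sqrt(2/n). The function names and the numbers used are illustrative only.

```python
from math import sqrt, isclose

def standard_error(sd, n_per_arm):
    """Standard error of a difference in means with equal arms and a common SD."""
    return sd * sqrt(2 / n_per_arm)

def signal_to_noise(mean_diff, sd, n_per_arm):
    """Expected mean difference divided by its standard error."""
    return mean_diff / standard_error(sd, n_per_arm)

baseline = signal_to_noise(mean_diff=0.4, sd=1.0, n_per_arm=100)
halved_signal = signal_to_noise(mean_diff=0.2, sd=1.0, n_per_arm=100)
quadrupled_n = signal_to_noise(mean_diff=0.2, sd=1.0, n_per_arm=400)
halved_sd = signal_to_noise(mean_diff=0.2, sd=0.5, n_per_arm=100)

print(f"baseline:             {baseline:.3f}")
print(f"signal halved:        {halved_signal:.3f}")
print(f"...with 4x the n:     {quadrupled_n:.3f}")
print(f"...or with half the SD: {halved_sd:.3f}")
```

Halving the signal halves the ratio. Either quadrupling the sample size or halving the outcome SD restores it to the baseline value.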

Signal-to-noise ratios can be optimized by designing a trial for a judiciously restricted target population (of potential responders) and by using high-quality outcome measurements for the trial to reduce noise. Although research programs may eventually aim to culminate in large pragmatic trials that show meaningful improvements associated with an intervention even in populations of individuals with wide variations in their likelihood and amount of potential response, it is generally wise up to that stage in a research program to focus trials so that they give accurate information as to whether the intervention works in populations targeted for being more apt to be responsive to an intervention. To do that, for example, the trial methods should include accurate assessments for whether potential recruits are currently experiencing, say, symptoms from whatever condition the intervention is intended to address and whether the recruit would be able to achieve the desired dose of whatever the trial assigns to them. For a truly beneficial intervention, it is easier to continue a research program advancing the development of that intervention if the intervention sustains a consecutive string of “true positive” results from when it began to undergo trials, avoiding a potentially fatal type-2 error (“false negative”).

Careful attention to the above considerations can lead to better trials, ones that combine rigor and transparency with a tailored consideration of the relative costs and benefits of potentially fallible statistical inferences, so that the resulting evidence is as informative as possible for stakeholder decision-making.

ISAPP provides guidance on use of probiotics and prebiotics in time of COVID-19

By ISAPP board of directors

Summary: No probiotics or prebiotics have been shown to prevent or treat COVID-19 or inhibit the growth of SARS-CoV-2. We recommend placebo-controlled trials be conducted, which have been undertaken by some research groups. If being used in clinical practice in advance of such evidence, we recommend a registry be organized to collect data on interventions and outcomes.

Many people active in the probiotic and prebiotic fields have been approached regarding their recommendations for using these interventions in an attempt to prevent or treat COVID-19. Here, the ISAPP board of directors provides some basic facts on this topic.

What is known. Some human trials have shown that specific probiotics can reduce the incidence and duration of common upper respiratory tract infections, especially in children (Hao et al. 2015; Luoto et al. 2014), but also with some evidence for adults (King et al. 2014) and nursing home residents (Van Puyenbroeck et al. 2012; Wang et al. 2018). However, not all evidence is of high quality and more trials are needed to confirm these findings, as well as determine the optimal strain(s), dosing regimens, time and duration of intervention. Further, we do not know how relevant these studies are for COVID-19, as the outcomes are for probiotic impact on upper respiratory tract infections, whereas COVID-19 is also a lower respiratory tract infection and inflammatory disease.

There is less information on the use of prebiotics for addressing respiratory issues than there is for probiotics, as they are used mainly to improve gut health. However, there is evidence supporting the use of galactans and fructans in infant formulae to reduce upper respiratory infections (Shahramian et al. 2018; Arslanoglu et al. 2008). A meta-analysis of synbiotics also showed promise in repressing respiratory infections (Chan et al. 2020).

Mechanistic underpinnings. Is there scientific evidence to suggest that probiotics or prebiotics could impact SARS-CoV-2? Data are very limited. Some laboratory studies have suggested that certain probiotics have anti-viral effects including against other forms of coronavirus (Chai et al. 2013). Other studies indicate the potential to interfere with the main host receptor of the SARS-CoV-2 virus, the angiotensin converting enzyme 2 (ACE2). For example, during milk fermentation, some lactobacilli have been shown to release peptides with high affinity for ACE (Li et al. 2019). Recently, Paenibacillus bacteria were shown to naturally produce carboxypeptidases homologous to ACE2 in structure and function (Minato et al. 2020). In mice, intranasal inoculation of Limosilactobacillus reuteri (formerly Lactobacillus reuteri) F275 (ATCC 23272) has been shown to have protective effects against lethal infection from a pneumonia virus of mice (PVM) (Garcia-Crespo et al. 2013). These data point towards immunomodulatory effects involving rapid, transient neutrophil recruitment in association with proinflammatory mediators but not Th1 cytokines. A recent study demonstrated that TLR4 signaling was crucial for the effects of preventive intranasal treatment with probiotic Lacticaseibacillus rhamnosus (formerly Lactobacillus rhamnosus) GG in a neonatal mouse model of influenza infection (Kumova et al., 2019). Whether these or other immunomodulatory effects, following local or oral administration, could be relevant to SARS-CoV-2 infections in humans is at present not known.

Our immune systems have evolved to respond to continual exposure to live microbes. Belkaid and Hand (2016) state: “The microbiota plays a fundamental role on the induction, training, and function of the host immune system. In return, the immune system has largely evolved as a means to maintain the symbiotic relationship of the host with these highly diverse and evolving microbes.” This suggests a mechanism whereby exposure to dietary microbes, including probiotics, could positively impact immune function (Sugimura et al. 2015; Jespersen et al. 2015).

The role of the gut in COVID-19. Many COVID-19 patients present with gastrointestinal symptoms and also suffer from sepsis that may originate in the gut. This could be an important element in the development and outcome of the disease. Though results from studies vary, it is evident that gastrointestinal symptoms, loss of taste, and diarrhea, in particular, can be features of the infection and may occur in the absence of overt respiratory symptoms. There is a suggestion that gastrointestinal symptoms are associated with a more severe disease course. Angiotensin converting enzyme 2 and virus nucleocapsid protein have been detected in gastrointestinal epithelial cells, and infectious virus particles have been isolated from feces. In some patients, viral RNA may be detectable in feces when nasopharyngeal samples are negative. The significance of these findings in terms of disease transmission is unknown but, in theory, they do provide an opportunity for microbiome-modulating interventions that may have anti-viral effects (Cheung et al. 2020; Tian et al. 2020; Han et al. 2020).

A preprint (not peer reviewed) has recently been released, titled ‘Gut microbiota may underlie the predisposition of healthy individuals to COVID-19’ (Gao et al. 2020), suggesting that this could be an interesting research direction and worthy of further discussion. A review of China National Health Commission and National Administration of Traditional Chinese Medicine guidelines also suggested probiotic use, although more work on specific strains is needed (Mak et al. 2020).

Are probiotics or prebiotics safe? Currently marketed probiotics and prebiotics are available primarily as foods and food/dietary supplements, not as drugs to treat or prevent disease. Assuming they are manufactured in a manner consistent with applicable regulations, they should be safe for the generally healthy population and can be consumed during this time.

Baud et al. (in press) presented a case for probiotics and prebiotics to be part of the management of COVID-19. Although not fully aligned with ISAPP’s official position, readers may find the points made and references cited of interest.

Conclusion. We reiterate, currently no probiotics or prebiotics have been shown to prevent or treat COVID-19 or inhibit the growth of SARS-CoV-2.