What are the lessons learned?

Lessons related to the benefit-risk methodology

Table 19 Assessment of appropriate frame for benefit-risk approaches through practical experience

Comments Proposed improvements and/or extensions
The BRAT and PrOACT-URL are comprehensive, provide similar approaches and essentially give the same guidelines for defining the decision context. Both can be considered fit for purpose. A convergence towards one standard framework would support global implementation of benefit-risk modelling.
The hardest issue to deal with was the time horizon for outcomes. A two-year horizon was chosen as this is the duration of the pivotal clinical trials. However, this may not capture rare events or events that take longer to manifest. Further research is needed on how to deal with outcomes on different time horizons, e.g. should discounting be used?
The availability of additional risk minimisation measures, such as the JCV antibody test recently developed by Biogen, would make a difference to the benefit-risk assessment. It was not considered in this exercise because of the paucity of information currently available in the public domain, and because it was not available at the time of the CHMP re-evaluation in 2010. Established benefit-risk models could well be used to re-evaluate the benefit-risk balance whenever a new risk minimisation activity becomes available.
Characterisation of very rare serious adverse events is a key motivation for benefit-risk modelling for marketed products, because such events are usually observed only once a broader population is exposed. This remains a challenge, in particular for characterising the event and for visualising its severity. Weighting is key to giving such events the appropriate emphasis in a benefit-risk model. Standard frequency information in the SPC was considered rather vague (PML: uncommon, between 1 in 100 and 1 in 1000).

Sources other than clinical trials or labelling information are required; in this case, more certainty was derived from observational studies.

A weighted benefit-risk model is necessary to allow appropriate visualisation.

The perspective taken was that of the regulator, assuming the regulator is a perfect agent for the patient. This was a particular example where regulators and patients initially had very different perspectives. Patient value elicitation should always be part of modelling complex benefit-risk decisions.
Both frameworks allow for comparisons. Differences in indications / trial patient populations between the comparators had to be considered, i.e. interferon beta-1a and glatiramer acetate are approved in less severe patients (first line) than natalizumab (second line). Unfortunately, the available sources do not report data for this subgroup on the comparators interferon beta-1a and glatiramer acetate. Benefit-risk models allow comparisons based on available data only. Regulators aim to improve the benefit-risk balance by restricting the use of a drug to a smaller patient population. When taking such decisions, attempts should be made to understand the benefit-risk balance of comparators in the same restricted population.

Table 20 Assessment of using meaningful reliable information for benefit-risk approaches through practical experience

Comments Proposed improvements and/or extensions
Consideration of other benefits, bias towards more complete risk listing. There was a general concern that the benefit-risk tree will provide visual imbalance that may transfer into numerical differences later if the group is too selective on either benefits or risks. If an outcome is missing from the tree then it is the same as giving that outcome no weight. Also there is a tendency to focus on risks instead of benefit. On the other hand, the risk of double counting must be avoided. Consider all benefit outcomes relevant for the patients. Focus on providing relevant risks only.
The use of odds ratios versus absolute risk differences was initially the subject of controversial discussion. For benefit-risk evaluation, absolute risk differences provide more transparent information than relative measures such as odds ratios or relative risks, particularly for the comparison of more than one item. Absolute numbers are also needed for the value elicitation.
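The contrast between relative and absolute measures can be illustrated with a minimal sketch. The counts below are hypothetical and are not taken from any trial; the function names are likewise assumptions made for illustration.

```python
def risk_difference(events_trt, n_trt, events_cmp, n_cmp):
    """Absolute risk difference: treatment risk minus comparator risk."""
    return events_trt / n_trt - events_cmp / n_cmp


def odds_ratio(events_trt, n_trt, events_cmp, n_cmp):
    """Odds ratio of the event on treatment versus comparator."""
    odds_trt = events_trt / (n_trt - events_trt)
    odds_cmp = events_cmp / (n_cmp - events_cmp)
    return odds_trt / odds_cmp


# Hypothetical counts, for illustration only (not taken from any trial).
common = dict(events_trt=200, n_trt=1000, events_cmp=400, n_cmp=1000)
rare = dict(events_trt=2, n_trt=1000, events_cmp=4, n_cmp=1000)

for label, counts in (("common outcome", common), ("rare outcome", rare)):
    rd = risk_difference(**counts)
    or_ = odds_ratio(**counts)
    print(f"{label}: odds ratio = {or_:.2f}, "
          f"risk difference per 1000 = {rd * 1000:.0f}")
```

In this hypothetical example the two odds ratios are of similar size, while the absolute differences (200 versus 2 fewer events per 1000 patients) differ by two orders of magnitude, which is the information a value elicitation needs.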

For the benefits, data were available as they were primary or secondary endpoints. However, uncertainty may not have been reported (only p-values).

For the risks, data were sometimes available in the literature. It was necessary to assess whether the definitions used were comparable.

Variability of source data and subsequent transformation: Different experts will select different data.

Original data plus modified data (e.g. allowing indirect comparison across trials) must be well documented and provided together with the result.
Time consuming to reach consensus in group meetings. Start with a proposal already prepared, then get consensus in a group – a consensus meeting of all functions is a prerequisite for finalisation of the value tree. Include a qualified statistician who knows the data and a qualified clinician who understands the disease area and the clinical implications of the findings.

Table 21 Assessment of the availability of clear values and trade-offs for benefit-risk approaches through practical experience

Comments Proposed improvements and/or extensions

BRAT allows, but does not insist on, preference values. PrOACT-URL more explicitly requires the use of preference values.

Judgements were obtained from patient representatives using bottom-up swing weights. Given the nature of the problem – balancing one serious and rare adverse event against several clinically relevant and frequent benefits – the team agreed that weighting from the patient perspective is necessary as a basis for decision making. Regulators usually decide on behalf of the patient, but since their risk tolerance may be much lower than that of a patient, carefully elicited patient input is required. Such preference elicitation / weighting needs a thorough methodology. Unguided questionnaires provide the same information to each patient and thereby might avoid bias, while interviews may be preferable to fully explain and discuss the disease and the risks involved.

The elicitation of reasonable preference weights most likely depends on the stakeholder (e.g. regulators, physicians, patients) involved. Therefore sensitivity analyses using preference weights from different groups of stakeholders are recommended to check the robustness of the decision.

As some of the weights provided were questioned, the team recommended a sensitivity analysis, in particular looking at a higher avoidance preference for the life-threatening risk of PML.
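A sensitivity analysis of this kind can be sketched as a one-way sweep over the weight given to the rare serious risk. The sketch below is illustrative only: the outcome names, differences and weights are hypothetical, and the simple weighted sum is an assumed form of the model rather than the one used in this case study.

```python
def net_score(weights, differences):
    """Weighted sum of outcome differences (positive favours treatment)."""
    return sum(weights[name] * differences[name] for name in weights)


# Hypothetical inputs: per-patient differences over the time horizon,
# signed so that positive values favour treatment.
differences = {
    "relapse_avoided": 0.30,
    "disability_progression_avoided": 0.10,
    "pml": -0.002,
}
# Hypothetical weights, expressed relative to avoiding one relapse.
base_weights = {
    "relapse_avoided": 1.0,
    "disability_progression_avoided": 2.0,
    "pml": 50.0,
}


def sweep_weight(outcome, candidate_weights):
    """Recompute the net score for each candidate weight on one outcome."""
    results = []
    for w in candidate_weights:
        weights = dict(base_weights, **{outcome: w})
        results.append((w, net_score(weights, differences)))
    return results


def break_even_weight(outcome):
    """Weight on `outcome` at which the net score is exactly zero."""
    others = sum(base_weights[k] * differences[k]
                 for k in base_weights if k != outcome)
    return -others / differences[outcome]


for weight, score in sweep_weight("pml", [50.0, 100.0, 250.0, 500.0]):
    print(f"PML weight {weight:>5.0f}: net score {score:+.3f}")
print(f"break-even PML weight: {break_even_weight('pml'):.0f}")
```

Reporting the break-even weight alongside the elicited weight shows how much stronger the avoidance preference for PML would have to be before the conclusion changes.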

Scale bias: there are cognitive biases in realistically understanding large numbers.

Patients tend to put more value on avoiding harm that they are experiencing, and may underestimate the harm that they could potentially experience.

The tools and research in this area need to be further analysed in the context of benefit-risk assessments by regulators.

It is important to have consistent definitions across the value tree, the objective data extracted and the subjective data elicited. A broad definition makes data extraction easier, but preference weighting harder. A narrow definition makes data extraction harder, but preference weighting easier. Clear and concise definitions must be established from the start, and must be used consistently throughout the process.

Table 22 Assessment of the logically correct reasoning for benefit-risk approaches through practical experience

Comments Proposed improvements and/or extensions

BRAT with w-NCB needs outcomes to be expressed as events.

PrOACT-URL with MCDA: can accept any form of data. Outcomes should be expressed on an absolute scale, as this is the scale on which preference values are expressed.

Continuous or ordered categorical data could be dealt with by definition of either one or a set of appropriate dichotomous variables (e.g. responder definition in case of continuous efficacy data).
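As a minimal sketch of such a dichotomisation, a continuous endpoint can be converted into a responder rate. The scores and the threshold below are hypothetical, and the responder definition (an improvement of at least one point, where lower is better) is an assumption made for illustration.

```python
def responder_rate(changes_from_baseline, threshold):
    """Proportion of patients whose change meets the responder threshold.

    A patient counts as a responder when the change is at or below
    `threshold` (lower is better in this hypothetical score).
    """
    responders = sum(1 for change in changes_from_baseline
                     if change <= threshold)
    return responders / len(changes_from_baseline)


# Hypothetical change-from-baseline scores for two arms (not real data).
treatment = [-2.0, -1.5, -0.5, 0.0, -1.0, 0.5, -3.0, -1.2]
comparator = [-0.5, 0.0, 0.5, -1.0, 1.0, 0.2, -0.1, -1.5]

threshold = -1.0  # assumed responder definition: improvement of >= 1 point
rate_trt = responder_rate(treatment, threshold)
rate_cmp = responder_rate(comparator, threshold)
print(f"responders: treatment {rate_trt:.3f}, comparator {rate_cmp:.3f}")
```

The resulting proportions can then be handled like any other event-type outcome, at the cost of discarding information about the size of the change within each category.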

Consideration of uncertainty in all models: BRAT with w-NCB: the risk difference is presented with a CI. In principle, accounting for deterministic and stochastic uncertainty is allowed in the method, but was not implemented.

PrOACT-URL with MCDA: Uncertainty is an explicit step in the process. Accounting for deterministic and stochastic uncertainty is possible, and one-way deterministic sensitivity analysis on the weights and measures was applied.

Uncertainty of the weighted NCB approach can be assessed by exploring the variability using, for example, approximation techniques. Further sensitivity analysis can be performed to show at which points the decision made would be changed.
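One possible approximation is a normal (Wald) approximation to the variance of a weighted sum of risk differences. The sketch below assumes that form of the weighted NCB, assumes the outcomes are independent, and uses hypothetical counts and weights; it is not the calculation performed in this case study. The Wald approximation is known to behave poorly for very rare events, so it should be read as an illustration of the idea only.

```python
import math


def weighted_ncb(outcomes):
    """Point estimate and approximate standard error of a weighted NCB.

    Each outcome is (weight, events_trt, n_trt, events_cmp, n_cmp), with
    all outcomes expressed as undesirable events and weights as severity.
    Assumes outcomes are independent, which real data may violate.
    """
    estimate = 0.0
    variance = 0.0
    for weight, e_t, n_t, e_c, n_c in outcomes:
        p_t, p_c = e_t / n_t, e_c / n_c
        # Comparator minus treatment risk = events avoided on treatment.
        estimate += weight * (p_c - p_t)
        # Binomial variance of the risk difference (Wald approximation).
        variance += weight ** 2 * (p_t * (1 - p_t) / n_t
                                   + p_c * (1 - p_c) / n_c)
    return estimate, math.sqrt(variance)


# Hypothetical data: (weight, events_trt, n_trt, events_cmp, n_cmp).
outcomes = [
    (1.0, 200, 1000, 400, 1000),  # relapse
    (50.0, 2, 1000, 0, 1000),     # PML, weighted as 50 relapses
]

estimate, se = weighted_ncb(outcomes)
lower, upper = estimate - 1.96 * se, estimate + 1.96 * se
print(f"weighted NCB = {estimate:.3f} (approx. 95% CI {lower:.3f} to {upper:.3f})")
```

In this hypothetical example the rare but heavily weighted event contributes most of the variance, which illustrates why the uncertainty around rare serious events tends to dominate the uncertainty of the overall result.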
There are cognitive biases in realistically understanding large numbers. These induce some potential problems with the hierarchical weighting process with respect to scale bias. Patients also tend to put more value on avoiding harm they are experiencing, and underestimate harm they could potentially experience. More careful considerations and strategy to deal with cognitive biases due to different magnitudes of events are needed when facing such situations.

Lessons related to the visual representation of benefit-risk assessment results

Visualisations in BRAT framework

The visual displays deliver benefit-risk information for decision making without drawing final conclusions or synthesising to one number. They are considered acceptable and easily interpretable by decision makers making qualitative judgements. The benefit-risk value tree, absolute numbers in tabular form and relative risks as a forest plot are recommended. Other suitable visualisations can easily be introduced into the framework at any point in time, when necessary. An issue with the proposed BRAT visual representations is that rare outcomes shown alongside common outcomes can be misleading. The rare events may appear not to have occurred, as they are beyond the resolution of the plot. In addition, the seriousness of the event is not captured by the plot, and this could also be misleading.

Visualisations in PrOACT-URL framework

The PrOACT-URL does not prescribe specific plots, but does emphasise the importance of communication of results and sensitivity analysis, and the plots selected are in this spirit. As many of the risks did not occur, the preference value was 1 for many of the outcomes. As a result, the difference between natalizumab and placebo was lost in some places where there was no value difference. We found it best to display the incremental value between treatments. Benefit-risk analysis has a lot of moving parts, so it is helpful not just to visualise the overall benefit-risk but also to break it down into the contributions from each outcome. This can be achieved through the use of a horizontal bar chart, a vertical stacked bar chart and/or a waterfall plot. Sensitivity analysis is crucial in assessing the robustness of the quantitative benefit-risk results and can be represented as one-way (tornado) and two-way (line or frontier graphs) deterministic sensitivity analysis plots. There were issues with how to represent numbers on very different scales on the same plot. Very low weights or rare outcomes may not be resolved on a plot.
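The breakdown of the overall result into incremental contributions per outcome can be sketched as follows. The criteria, weights, scale ranges and measurements are hypothetical, and the linear value function is an assumption made for illustration; the numbers produced are the quantities that a horizontal bar chart or waterfall plot would display.

```python
def linear_value(x, worst, best):
    """Map a measurement onto a 0-1 preference value (linear value function)."""
    return (x - worst) / (best - worst)


# Hypothetical criteria: (weight, worst, best, treatment, comparator).
# Measurements are event probabilities, so lower is better and best < worst.
criteria = {
    "relapse": (0.50, 1.0, 0.0, 0.20, 0.40),
    "disability progression": (0.30, 1.0, 0.0, 0.10, 0.20),
    "PML": (0.20, 0.01, 0.0, 0.002, 0.0),
}


def incremental_contributions(criteria):
    """Weighted value difference (treatment minus comparator) per criterion."""
    contributions = {}
    for name, (weight, worst, best, trt, cmp_) in criteria.items():
        diff = linear_value(trt, worst, best) - linear_value(cmp_, worst, best)
        contributions[name] = weight * diff
    return contributions


contributions = incremental_contributions(criteria)
overall = sum(contributions.values())
ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
for name, value in ranked:
    print(f"{name:>24}: {value:+.3f}")
print(f"{'overall':>24}: {overall:+.3f}")
```

Here the comparator has no PML events and therefore a preference value of 1 on that criterion. Displaying the incremental value keeps the treatment difference visible, although a contribution that is small relative to the others may still be hard to resolve on a shared axis.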