Decision curve analysis

Decision curve analysis evaluates a predictor for an event, where values of the predictor beyond a threshold indicate that some intervention or treatment should be performed. As an example, a doctor suspecting prostate cancer in a patient may consider performing a biopsy (the intervention/treatment) if a blood test result exceeds some level, or if a prediction model gives a risk above some value (threshold). The doctor wants to perform biopsies in the right group of patients, i.e. those in whom it will reveal cancer, while avoiding a painful biopsy in patients who are not sick.

The purpose of decision curve analysis is to evaluate whether use of the predictor will, on average, bring more benefit than harm. A threshold probability is used as a measure of the harm of performing unnecessary treatment (false positives) compared to the benefit of treatment in those who need it (true positives). The net benefit is calculated as the weighted difference between the rates of true and false positives.

The decision curve is typically shown as a graphical plot of the net benefit from using the predictor, shown as a function of the threshold probability. Two default strategies are also plotted: Treat all and treat none. "Treat all" is a relevant strategy for cases where the harm of the treatment procedure is considered low, e.g. a screening test. "Treat none" is a relevant strategy if no specific knowledge is available. The predictor will only be of value in a range where its net benefit is above both the “treat all” curve and the “treat none” curve.

Threshold probability is defined as the minimum probability of an event at which a decision-maker would take a given action, for instance, the probability of cancer at which a doctor would order a biopsy. A lower threshold probability implies a greater concern about the event (e.g. a patient worried about cancer), while a higher threshold implies greater concern about the action to be taken (e.g. a patient averse to the biopsy procedure).

Decision curve analysis does not indicate which threshold probability should be used. Instead, the researcher evaluating the predictor should specify a range of relevant threshold probabilities and focus on the plot within this range.

The predictor evaluated by decision curve analysis could be a binary classifier (yes/no), or a percentage risk from a prediction model. In the latter case, treatment is indicated if the percentage risk is higher than the threshold probability.

General theory

The use of threshold probability to weight true and false positives derives from decision theory, in which the expected value of a decision can be calculated from the utilities and probabilities associated with decision outcomes. In the case of predicting an event, there are four possible outcomes: true positive, true negative, false positive and false negative. This means that to conduct a decision analysis, the analyst must specify four different utilities, which is often challenging.

In decision curve analysis, the strategy of treating no patients (considering all observations as negative) is defined as having a value of zero. This means that only true positives (event identified and appropriately managed) and false positives (unnecessary action) are considered.^[1]

The threshold probability is used to measure the harm of unnecessary treatment (false positives) weighed against the benefit of relevant treatment (true positives). It is easily shown that the ratio of the harm of a false positive to the benefit of a true positive equals the odds at the threshold probability.^[2] For instance, a doctor whose threshold probability to order a biopsy for cancer is 10% (odds of 1:9) believes that the utility of finding cancer early is 9 times greater than that of avoiding the harm of an unnecessary biopsy. As in the calculation of expected value, weighting false positive outcomes by the odds at the threshold probability yields an estimate of net benefit that incorporates decision consequences and preferences.^[3]
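
The link between the threshold and the utilities can be sketched with a short derivation. The symbols $B$ (the benefit of treating a patient who has the event) and $H$ (the harm of treating a patient who does not) are introduced here for illustration only and are not part of the notation used elsewhere in this article. A decision-maker is indifferent between treating and not treating at the risk $p_{t}$ where the expected benefit equals the expected harm:

$$p_{t}\,B = (1-p_{t})\,H \quad\Longrightarrow\quad \frac{p_{t}}{1-p_{t}} = \frac{H}{B}$$

With $p_{t}=0.1$ this gives $H/B = 1/9$, i.e. the benefit is valued at 9 times the harm, which agrees with the biopsy example.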

Net benefit

Theory

Net benefit is calculated as a weighted combination of true and false positives, where $p_{t}$ is the threshold probability for treatment (intervention), true and false positives are count variables, and $N$ is the total number of observations:^[2]

$$\text{Net Benefit} = \frac{\text{True positives}}{N} - \frac{\text{False positives}}{N} \times \frac{p_{t}}{1-p_{t}}$$
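
As a minimal sketch of this formula, the function below counts true and false positives for a given threshold and combines them as in the equation. The function name `net_benefit`, its arguments, and the ten-patient data set are hypothetical and chosen only for illustration; a patient is counted as treated when the predicted risk is strictly higher than the threshold.

```python
def net_benefit(outcomes, risks, threshold):
    """Net benefit of treating every patient whose predicted risk exceeds `threshold`.

    outcomes: 1 if the patient has the event, 0 otherwise (illustrative encoding).
    risks: predicted probabilities of the event, in the same order.
    """
    if not 0 <= threshold < 1:
        raise ValueError("threshold must be in [0, 1)")
    n = len(outcomes)
    # Treated patients with the event are true positives; treated patients without it are false positives.
    true_pos = sum(1 for y, r in zip(outcomes, risks) if r > threshold and y == 1)
    false_pos = sum(1 for y, r in zip(outcomes, risks) if r > threshold and y == 0)
    odds = threshold / (1 - threshold)  # weight applied to false positives
    return true_pos / n - false_pos / n * odds


# Made-up data: 10 patients, 3 of whom have the event.
outcomes = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
risks = [0.80, 0.30, 0.05, 0.60, 0.15, 0.02, 0.40, 0.08, 0.10, 0.03]

nb = net_benefit(outcomes, risks, 0.10)
print(f"Net benefit at p_t = 0.10: {nb:.4f}")
```

At a threshold of 0.10, five patients in this made-up data set are treated (2 true positives, 3 false positives), so the net benefit is 2/10 minus 3/10 weighted by the odds 1/9.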

The theoretical maximal value of the net benefit equals the prevalence of the disease in the population considered.^[4]

$$\text{Prevalence} = \frac{\text{Total number of sick}}{N} = \frac{\text{Maximal number of true positives}}{N}$$

If the threshold probability is set to zero, $p_{t}=0$, then a "treat all" approach will achieve this maximal value.

There is no lower bound to the value of the net benefit; it can be infinitely negative.^[2] The closer $p_{t}$ comes to one, the larger the factor on the false positive rate becomes.
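
Both bounds can be illustrated with the "treat all" strategy, for which the true positive rate equals the prevalence and the false positive rate equals one minus the prevalence. The sketch below assumes a hypothetical prevalence of 20%; the function name `net_benefit_treat_all` is illustrative only.

```python
def net_benefit_treat_all(prevalence, threshold):
    """Net benefit of treating every patient, given the event prevalence."""
    if not 0 <= threshold < 1:
        raise ValueError("threshold must be in [0, 1)")
    # With everyone treated, TP/N equals the prevalence and FP/N equals 1 - prevalence.
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


prevalence = 0.20  # assumed value for illustration
for pt in (0.0, 0.2, 0.5, 0.9, 0.99):
    value = net_benefit_treat_all(prevalence, pt)
    print(f"p_t = {pt:4.2f}  net benefit = {value:8.3f}")
```

At $p_{t}=0$ the result equals the prevalence, and as $p_{t}$ approaches 1 the odds factor grows without limit, so the net benefit becomes arbitrarily negative.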

Interpretation

The threshold $p_{t}$ is the probability that should be exceeded before the patient is treated. "Treated" can in this context be any kind of intervention, including performing a test that involves risk and/or patient discomfort, e.g. biopsy.^[4]

Net benefit is expressed in units of true positives. For instance, a net benefit of 0.07 is the same as finding 7 true positives per 100 patients in the target population.^[4] A negative net benefit means that performing the treatment will on average do more harm than good.

The net benefit can be compared to net profit in a trade: net profit = income – expenditures.^[4] In this comparison,

the true positive rate corresponds to the income of the sale (e.g. in dollars),
the false positive rate corresponds to what must be put into the sale (e.g. in units of goods, or in units of euros),
and the factor corresponds to the exchange rate (cost in dollars per unit of goods, or dollars per euro).

Thus, the factor ${p_{t} \over 1-p_{t}}$ is the ‘price’ of one false positive: the harm (in a broad sense) of one false positive, expressed as a fraction of the benefit of finding one true positive. The factor represents the odds corresponding to the probability $p_{t}$. As an example, the threshold probability $p_{t}=0.1=10\%$ corresponds to odds 1:9 = 1/9, so finding 1 true positive is worth the cost of 9 false positives. In the biopsy example, $p_{t}=10\%$ corresponds to saying that finding 1 positive biopsy is worth the harm of performing a biopsy in 9 persons who turn out to be healthy.^[4]
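
The conversion from threshold probability to the 'price' of a false positive can be checked with exact fractions. This is a small sketch; the function name `false_positive_price` is hypothetical.

```python
from fractions import Fraction


def false_positive_price(threshold):
    """Odds p_t / (1 - p_t): the cost of one false positive, in units of true positives."""
    p = Fraction(threshold)
    if not 0 <= p < 1:
        raise ValueError("threshold must be in [0, 1)")
    return p / (1 - p)


price = false_positive_price(Fraction(1, 10))
print(price)      # 1/9: one false positive costs one ninth of a true positive
print(1 / price)  # 9: one true positive is worth 9 false positives
```

Exact fractions are used here only to avoid floating-point rounding in the printed odds.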

Setting $p_{t}=0$ gives false positives a 'price' of zero, i.e. 1 true positive is in that case worth any number (infinitely many) of false positives. As a probability, $p_{t}=0$ sets the threshold so low that all patients should be treated. Conversely, setting $p_{t}=1=100\%$ puts an infinitely high price on a false positive, so even if there is a 99.999...% chance that the patient is (truly) positive, treatment should not be performed.

Decision curve interpretation

A decision curve analysis graph is drawn by plotting threshold probability on the x-axis and net benefit on the y-axis, illustrating the trade-offs between benefit (true positives) and harm (false positives) as the threshold probability (preference) is varied across a range of reasonable threshold probabilities.^[2]
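
The points of such a graph could be computed along the following lines. This is a sketch under stated assumptions: the function names, the ten-patient data set, and the 5% to 25% threshold grid are all hypothetical, and "treat all" is modelled by giving every patient a risk of 1 so that each one exceeds any threshold below 1.

```python
def net_benefit(outcomes, risks, threshold):
    """Net benefit of treating patients whose risk is strictly above `threshold`."""
    n = len(outcomes)
    tp = sum(1 for y, r in zip(outcomes, risks) if r > threshold and y == 1)
    fp = sum(1 for y, r in zip(outcomes, risks) if r > threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def decision_curve(outcomes, risks, thresholds):
    """Return (threshold, model, treat_all, treat_none) tuples, one per threshold."""
    treat_all_risks = [1.0] * len(outcomes)  # a risk of 1 exceeds every threshold below 1
    rows = []
    for pt in thresholds:
        model = net_benefit(outcomes, risks, pt)
        treat_all = net_benefit(outcomes, treat_all_risks, pt)
        rows.append((pt, model, treat_all, 0.0))  # "treat none" is 0 by definition
    return rows


# Made-up data: 10 patients, 3 of whom have the event.
outcomes = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
risks = [0.80, 0.30, 0.05, 0.60, 0.15, 0.02, 0.40, 0.08, 0.10, 0.03]
thresholds = [0.05, 0.10, 0.15, 0.20, 0.25]

curve = decision_curve(outcomes, risks, thresholds)
print("p_t    model   treat all   treat none")
for pt, model, treat_all, treat_none in curve:
    print(f"{pt:4.2f}  {model:6.3f}  {treat_all:10.3f}  {treat_none:11.3f}")
```

Each tuple holds one x-value (the threshold) and the three y-values to plot, so the rows can be passed to any plotting library.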

The figure gives a hypothetical example of biopsy for cancer. Given the relative benefits and harms of early cancer detection and avoidable biopsy, we would consider it unreasonable to opt for a biopsy if the risk of cancer was less than 5% or, alternatively, to refuse biopsy if given a risk of more than 25%. Hence the best strategy is the one with the highest net benefit across the range of threshold probabilities between 5% and 25%, in this case model A. If no strategy has the highest net benefit across the full range, that is, if the decision curves cross, then the decision curve analysis is equivocal.^[4]

The default strategies of assuming that all or no observations are positive are often interpreted as “Treat all” (or “Intervention for all”) and “Treat none” (or “Intervention for none”) respectively. The curve for “Treat none” is fixed at a net benefit of 0. The curve for “Treat all” meets the y-axis at a net benefit equal to the event prevalence, and crosses the “Treat none” curve at a threshold probability equal to the event prevalence.^[2]
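
Why the prevalence marks the “Treat all” curve on both axes can be seen from a short derivation. The symbol $\pi$ for the event prevalence is introduced here for illustration only. When everyone is treated, the true positive rate is $\pi$ and the false positive rate is $1-\pi$, so

$$\text{NB}_{\text{treat all}}(p_{t}) = \pi - (1-\pi)\,\frac{p_{t}}{1-p_{t}}$$

Setting $p_{t}=0$ gives a net benefit of $\pi$, the value on the y-axis. Setting the net benefit to zero gives $\pi(1-p_{t}) = (1-\pi)\,p_{t}$, which simplifies to $p_{t}=\pi$, the threshold at which the curve meets “Treat none”.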

Net benefit on the y-axis is expressed in units of true positives per person.^[4] For instance, a difference in net benefit of 0.025 at a given threshold probability between two predictors of cancer, Model A and Model B, could be interpreted as “using Model A instead of Model B to order biopsies increases the number of cancers detected by 25 per 1000 patients, without changing the number of unnecessary biopsies.”

References

[1] Baker, Stuart G.; Cook, Nancy R.; Vickers, Andrew; Kramer, Barnett S. (2009-10-01). "Using relative utility curves to evaluate risk prediction". Journal of the Royal Statistical Society. Series A, (Statistics in Society). 172 (4): 729–748. doi:10.1111/j.1467-985X.2009.00592.x. ISSN 0964-1998. PMC 2804257. PMID 20069131.
[2] Vickers, Andrew J.; Elkin, Elena B. (November 2006). "Decision curve analysis: a novel method for evaluating prediction models". Medical Decision Making. 26 (6): 565–574. doi:10.1177/0272989X06295361. ISSN 0272-989X. PMC 2577036. PMID 17099194.
[3] van Calster, Ben; Wynants, Laure; Verbeek, Jan F.M.; Verbakel, Jan Y.; Christodoulou, Evangelia; Vickers, Andrew J.; Roobol, Monique J.; Steyerberg, Ewout W. (December 2018). "Reporting and Interpreting Decision Curve Analysis: A Guide for Investigators". European Urology. 74 (6): 796–804. doi:10.1016/j.eururo.2018.08.038. PMC 6261531. PMID 30241973.
[4] Vickers, Andrew J.; van Calster, Ben; Steyerberg, Ewout W. (2019-10-04). "A simple, step-by-step guide to interpreting decision curve analysis". Diagnostic and Prognostic Research. 3: 18. doi:10.1186/s41512-019-0064-7. ISSN 2397-7523. PMC 6777022. PMID 31592444.
