Talk:Positive and negative predictive values
Table and edits
See Talk:Sensitivity (tests) re the past wish list for a simpler description, setting out what it is before launching into mathematical jargon. I have also added a table, and in Sensitivity (tests) added a worked example. The table is now consistent in Sensitivity, Specificity, PPV & NPV, with the relevant row or column for calculation highlighted. David Ruben Talk 02:45, 11 October 2006 (UTC)
"Physician's Gold Standard" Remove?
"Physician's gold standard" seems to be an unhelpful phrase as it is used in this article.
My experience has been that when "gold standard" is used in this context it refers to the reference test against which the accuracy of a test is measured. As we all know, sensitivity, specificity, PPV, etc., require a "gold standard" test for reference -- otherwise we don't have a basis for claims about % true positives and % true negatives.
Here it seems that "physician's gold standard" means something like "it is the statistical property of a test that is most useful to physicians".
It seems that either the author was confused about the use of "gold standard" in biostatistics or there's another (unfortunate) use of the phrase that I'm not familiar with. Since I don't know which, I'm not editing the page. If others agree, perhaps this phrase should be replaced.
--will 02:19, 24 July 2007 (UTC)
the need for an unequivocal definition of positive predictive value
Let's consider the following table (Grant Innes, 2006, CJEM, 'Clinical utility of novel cardiac markers: let the buyer beware.')
Table 3. Diagnostic performance of ischemia modified albumin (IMA) in a low (5%) prevalence population.
           ACS: Yes   ACS: No   Total
IMA +            35       722     757
IMA –            15       228     243
Total            50       950    1000

Sensitivity (true-positive rate) = 35/50 = 70%
Specificity (true-negative rate) = 228/950 = 24%
Positive predictive value = 35/757 = 4.6%
Negative predictive value = 228/243 = 94%
The positive predictive value is smaller than the prevalence. We must conclude that a positive test result decreases the probability of disease, or in other words that the post-test probability of disease, given a positive result, is smaller than the pre-test probability (prevalence): a very strange and unusual conclusion.
From a statistical point of view this very strange conclusion can be avoided by interchanging the rows of the table: IMA– then becomes the positive test result. This operation results in a positive predictive value of 6.17% (15/243). The conclusion is then that a positive test result, if the test is of any value at all, increases the post-test probability as it is expected to do, and in no case decreases it.
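For concreteness, these figures can be checked numerically. Below is a minimal Python sketch (the variable names are mine, chosen to match the table above):

```python
# 2x2 table from Grant Innes (2006): rows = IMA result, columns = ACS yes/no
tp, fp = 35, 722    # IMA+ row
fn, tn = 15, 228    # IMA- row
n = tp + fp + fn + tn                    # 1000

sensitivity = tp / (tp + fn)             # 35/50   = 0.70
specificity = tn / (fp + tn)             # 228/950 = 0.24
ppv = tp / (tp + fp)                     # 35/757  = 0.046
npv = tn / (fn + tn)                     # 228/243 = 0.94
prevalence = (tp + fn) / n               # 50/1000 = 0.05

print(ppv < prevalence)                  # True: the "strange" result above

# Interchanging the rows, so that IMA- counts as the positive result:
ppv_swapped = fn / (fn + tn)             # 15/243 = 0.0617, i.e. 6.17%
print(ppv_swapped > prevalence)          # True
```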
This example illustrates the need for an unequivocal definition of a positive test result. If a positive test result is unequivocally defined, the positive predictive value is mathematically unequivocally defined. A text providing such an unequivocal definition was removed by someone who called it 'garble'. I intend to put the text back, any objections? —Preceding unsigned comment added by Michel soete (talk • contribs) 18:57, 22 September 2007 (UTC)
Yes - makes no sense, 'garble' indeed. I've removed it and placed it here on the talk page where we can work on this.
And, alternatively, too:
PPV = PR * LR+ / (PR * (LR+ - 1) + 1)
wherein PR = the prevalence (pre-test probability) of the disease, * = the multiplication sign and LR+ = the positive likelihood ratio, with LR+ = sensitivity / (1 - specificity). The prevalence, the sensitivity and the specificity must be expressed as proportions (per one), not as percentages or per mille and so on. The frequency of the True Positives must be the frequency that equals or exceeds its expected value, mathematically expressed: TP >= (TP + FP)(TP + FN) / N, wherein N = TP + FP + TN + FN. If this condition is not met and the sensitivity differs from .50 (50%), then two different results are possible after the calculation of sensitivity, since the rows of a two-by-two table can be interchanged, so that a former positive result can be called a negative and a former negative result a positive (Michel Soete, Wikipedia, Dutch version, Sensitiviteit en Specificiteit, 2006, December 16th).
As a start, let's use the same terminology as the rest of the article, i.e. call PR just Prevalence; no need to explain maths symbols. If LR+ is "sensitivity / (1 - specificity)", then I get:
      Prevalence * (sensitivity / (1 - specificity))
PPV = -------------------------------------------------------
      Prevalence * ((sensitivity / (1 - specificity)) - 1) + 1
Let's multiply through by (1 - specificity):
                        Prevalence * sensitivity
PPV = -----------------------------------------------------------------
      Prevalence * (sensitivity - (1 - specificity)) + (1 - specificity)
Which is:
                                Prevalence * sensitivity
PPV = -----------------------------------------------------------------------------------
      Prevalence * sensitivity - Prevalence + specificity * Prevalence + 1 - specificity
and so to:
                     Prevalence * sensitivity
PPV = ---------------------------------------------------------------
      Prevalence * sensitivity + (1 - specificity) * (1 - Prevalence)
i.e. exactly the same as the last formula already given in the article! It therefore adds no new insight into its derivation or meaning.
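(A quick numeric spot-check of that equivalence, as a Python sketch with the prevalence, sensitivity and specificity taken from the Innes table above:)

```python
prevalence, sensitivity, specificity = 0.05, 0.70, 0.24

# Likelihood-ratio form proposed above
lr_pos = sensitivity / (1 - specificity)
ppv_lr = prevalence * lr_pos / (prevalence * (lr_pos - 1) + 1)

# Standard form already given in the article
ppv_std = (prevalence * sensitivity
           / (prevalence * sensitivity + (1 - specificity) * (1 - prevalence)))

assert abs(ppv_lr - ppv_std) < 1e-12
print(ppv_lr)   # 0.0462..., matching 35/757 from the table above
```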
As for "The frequency of the True Positives must be this frequency that exceeds or equals the expected value, mathematically expressed: True Positives >= (True positives + False Positives) (True Positives + False Negatives) / N wherein N = True Positives + False Positives + True Negatives + False Negatives. If this condition is not met and if the sensitivity differs from .50 (50%) then two different results after the calculation of sensitivity are possible since the rows of two by two tables can be interchanged and then a former positive result can be called a negative, a former negative result can be called a positive" - sorry can't even begin to get my head around this.
- Why must TP be larger than the expected value?
- The conditional formula you seek is equivalent to requiring PPV >= prevalence, but what is this expressing in everyday words?
- How can there be two different results possible ?
- Surely it is just needless convolution to start supposing what happens if switching rows about? Might as well say switching a "test result that excluded a disease" to a "test result that confirmed a disease" - one can't start switching values. One defines at the start what a positive or negative result means (i.e. what the null hypothesis is) and then should stick to it throughout the analysis. David Ruben Talk 11:46, 27 September 2007 (UTC)
allowing ambiguity
My mother tongue is Dutch. Initially I did not understand quite well what garble is, but now I think it means the same as nonsense.
- Not quite the meaning I meant; more that it was so convoluted/mixed up/unclear as to lose the intended meaning. David Ruben Talk 15:00, 29 September 2007 (UTC)
Let us assume that allowing ambiguity is a good option. The following tables can then be constructed:
           D+      D-
blue (P)   99 (a)   1 (b)
red (N)     1 (c)  99 (d)

           D+      D-
red (P)     1      99
blue (N)   99       1
Constructing these tables I followed some conventions: the frequencies of diseased people are in the first column, the frequencies of the positives in the first row, the frequency of the true positives in cell a, and so on.
Now we can write that sensitivity is a / (a + c). For those for whom blue is positive the sensitivity is 99%; for those for whom red is positive the sensitivity is 1%. The positive predictive value (a / (a + b)) is likewise 99% (blue positive) or 1% (red positive).
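A minimal Python sketch of the two labellings (cell names as in the tables above):

```python
# Cells of the 2x2 table when blue is taken as the positive result
a, b, c, d = 99, 1, 1, 99

sens_blue = a / (a + c)   # 0.99
ppv_blue = a / (a + b)    # 0.99

# Interchange the rows: red is now taken as the positive result
a, b, c, d = c, d, a, b

sens_red = a / (a + c)    # 0.01
ppv_red = a / (a + b)     # 0.01
```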
- I now understand where you see the alternative way of looking at the data (indeed one could go switching sensitivity for specificity), but this is precisely my point about needing to be very clear from the outset about the meaning of the test (the null hypothesis) and what a positive or negative result means. To start talking about how well a test result confirms a disease and then start considering how the same test might be viewed as a marker of no disease (i.e. is a positive result one for picking up disease, or one for identifying the normal) is to dither between positive & negative results, PPV & NPV, specificity & sensitivity. One should define what the test indicates and then, staying with that, interpret the results - there can only be a single PPV, a single NPV, a single specificity, a single sensitivity for any given set of data. David Ruben Talk 15:00, 29 September 2007 (UTC)
Such a possibility for ambiguity is not in line with traditional medical thinking, and therefore it leads to (at least seemingly) contradictory statements and therefore to confusion.
Megan Davidson writes (2002, The interpretation of diagnostic tests: A primer for physiotherapists): 'Where sensitivity or specificity is extremely high (98-100%), interpretation of test results is simple. If the sensitivity is extremely high, we can be sure that a negative test result will rule the disease out.' If ambiguity is allowed we have to add 'or extremely low (0-2%)' and 'If the sensitivity is extremely low, we can be sure that a positive test result will rule disease out'. Moreover, the relatively new concepts SpPIn and SnNOut are described in the article. These are acronyms. A SpPIn is a test with such an extremely high Specificity that if a test result is Positive the disease can be ruled In. A SnNOut is a test with such an extremely high Sensitivity that if the test result is Negative the disease can be ruled Out.
Thus our demand that a exceed the expected value in cell a is a solid basis for these concepts and their names, and for the classical ideas that they incorporate. Also the strongly held idea that a positive test result always points to disease finds a firm basis in this demand.
I hope that the argumentation above was convincing enough and that the removed text will be put back by the person who removed it.
81.244.101.52 12:07, 29 September 2007 (UTC)
- Have to disagree with "If the sensitivity is extremely high, we can be sure that a negative test result will rule the disease out" - where sensitivity is high, this means only that with a positive result we can be reasonably sure that the disease is identified. Sensitivity has no direct bearing on the truly healthy, only on those with disease. Consider:
       Disease   Healthy
+ve        980        10
-ve         20        10
- This has a sensitivity of 98% (980/(980+20)), yet it can hardly be said that "a negative test result will rule the disease out" - quite the opposite: of those with a negative result, two thirds will have the disease (20/(20+10)).
- Now for the second claim, 'If the sensitivity is extremely low, we can be sure that a positive test result will rule disease out', equally untrue:
       Disease   Healthy
+ve         20         1
-ve        980        99
- This test has a low sensitivity of 2% (20/(20+980)) but a high specificity of 99% (99/(1+99)), yet a positive result is far from reassuring: it suggests over a 95% chance of really being ill (20/(20+1)). Of course this test is so poor that it fails to meaningfully help distinguish disease from healthy, given that in this example 91% of subjects had the disease!
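Both counterexamples can be verified with a short Python sketch (the helper function below is illustrative, not from any cited source):

```python
def stats(tp, fp, fn, tn):
    """Compute the four standard measures from a 2x2 table."""
    return {"sens": tp / (tp + fn), "spec": tn / (fp + tn),
            "ppv": tp / (tp + fp), "npv": tn / (fn + tn)}

# First table: sensitivity 0.98, yet npv is only 10/30, about 0.33,
# so two thirds of those testing negative have the disease.
print(stats(tp=980, fp=10, fn=20, tn=10))

# Second table: sensitivity 0.02 and specificity 0.99,
# yet ppv is 20/21, about 0.95, so a positive result suggests disease.
print(stats(tp=20, fp=1, fn=980, tn=99))
```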
- I think your textbook for physiotherapists is being simplistic in its outlook and attempts to guide the reader - would have been better if its author had stuck to describing the standard terms, rather than trying to create new "rules-of-thumb".David Ruben Talk 15:00, 29 September 2007 (UTC)
unequivocal definition of statistical measures
Hi Davidruben,
I did not claim that a test with a high sensitivity, given a negative test result, rules disease out; it was Megan Davidson. Nor was it Megan Davidson who claimed that a test with a low sensitivity, given a positive test result, rules disease out; it was I who stated that this should be added to her article if ambiguity were allowed. Davidson did not write a textbook but an excellent article of six pages on the subject.
You disagree with those claims; I disagree too, but Davidson and Grant Innes are examples of classical thinking about the subject. Moreover, they have strong arguments for their point of view. Davidson writes: 'Unfortunately the predictive values only apply when the clinical prevalence is identical to that reported in the study. Prevalence changes dramatically depending on where the test is being performed.'
Grant Innes writes: 'In reality, predictive value is less a measure of test performance than it is a reflection of disease prevalence in the population being tested.'(op.cit.)
His illustrating examples are good. But I disagree with both of them. Your examples are good, and the following example is too:
       D+   D-
+ve    99   99
-ve     1    1
No further comment on this table is needed, I suppose.
I consider their point of view an expression of what was generally believed in the previous century and, I suppose, by many if not most physicians today.
I stress the point that allowing ambiguity in defining a positive result does not result in ambiguity of the conclusion for the testee. Blue remains in any case the colour that ends up in the conclusion D+. For the patient it is of no importance whether the sensitivity is called 90% or 10%, or whether this conclusion is the result of what is called a positive or a negative test result. For him, blue is disease.
The null hypothesis on its own does not say which test result is positive. For two-by-two tables the null hypothesis says that the experimental data will not deviate (significantly) from the table of expected values. My demand is decisive for what must be considered a positive result (a must be higher than the expected value in cell a). It results in a situation wherein only one sensitivity, and so on, is possible. A positive result is then not the outcome of a decision but of a calculation.
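A minimal Python sketch of that calculation, as I understand the proposed rule (the function name and layout are my own):

```python
def orient(a, b, c, d):
    """Orient a 2x2 table so that the first row is the 'positive' result
    under the proposed rule: the observed true positives in cell a must
    equal or exceed their expected value, (a + b)(a + c) / N."""
    n = a + b + c + d
    expected_a = (a + b) * (a + c) / n
    if a >= expected_a:
        return a, b, c, d
    return c, d, a, b          # interchange the rows

# Grant Innes' table: cell a (35) is below its expected value (37.85),
# so the rule makes IMA- the positive row and PPV = 15/243 = 6.17%.
print(orient(35, 722, 15, 228))   # (15, 228, 35, 722)
```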
By the way, in my opinion, a cheap, harmless, poor test may have very good utility. A potentially good test is a test where the test result shows association with disease. The utility of a test depends on decisions, and is not only a characteristic of the test, provided there is association between test results and disease. Let us assume that the physician or the patient is satisfied with a probability of 97% before deciding on a dangerous treatment; then a very poor, cheap test increasing the post-test probability from 93% to 97% is potentially a very useful test.
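As a rough illustration of how weak such a test may be (my own back-calculation of the implied likelihood ratio; the figures 93% and 97% are from the paragraph above):

```python
# What positive likelihood ratio moves a pre-test probability of 0.93
# to a post-test probability of 0.97?
pre, post = 0.93, 0.97
pre_odds = pre / (1 - pre)        # about 13.3
post_odds = post / (1 - post)     # about 32.3
lr_needed = post_odds / pre_odds  # about 2.4: a very modest LR+ suffices
print(lr_needed)
```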
So, I hope that you will be convinced that it is preferable to put my 'confusing' text back.
Michel soete 20:05, 29 September 2007 (UTC)
- I hope my phrasing was clear that I was criticising the textbook and certainly not yourself. I think their points probably try to refer to situations where the disease is quite rare, and hence my previous very skewed examples would not apply. Hence their observations have a prerequisite (and unstated) assumption that the disease occurs in a small fraction of the total population. The test examples I gave would still be reasonable, though, if the "disease" were not living past 80 (most people do not), and the "test" were, say, a mad scientist's claim that enquiring whether people ate more than 2 apples a day could predict those who would live healthy long lives (one can conceive that this would be a useless test). I accept your comments immediately above demonstrate understanding of the situation, but the points are I'm sure too esoteric (?pedantic or convoluted better adjectives?) to help explain these parameters in what is just a general encyclopaedia (i.e. they might be appropriate in a full textbook on statistics, but that is not Wikipedia's role). So, whilst I've appreciated discussing this with you (it certainly made me review my own understanding of several issues), I still do not personally feel the paragraph should be included - sorry :-) Would be interesting to hear if other editors have any thoughts on this... David Ruben Talk 23:14, 29 September 2007 (UTC)
Hi David Ruben
I think I can understand quite well your hesitation. I suppose that nobody would hesitate to prefer a sensitivity of 99% in the last example I gave, and therefore my new example was not convincing (but perhaps somewhat shocking). It is logical for a measure such as sensitivity that everyone desires it to be high, and for these tables there is no good reason to prefer a very low sensitivity. Applying my requirement, the conclusion is also that the sensitivity is 99% and not 1%. But the problem of the table of Grant Innes is not thereby solved, and this is not an esoteric, pedantic problem. It is a real-life problem.
I looked on the website of wynneconsult.com and there I found the following (translated from the Dutch): 'The probability of a positive test result, given that the patient has the disease, is called sensitivity. The sensitivity has to be as high as possible.' This is quite reasonable, I believe. They also write: 'The probability of a negative test result in the absence of the disease is called specificity, and it must also be as high as possible.' This too is quite reasonable, I think, but it is a pity that, reconsidering the table of Grant Innes, both requirements cannot be met at the same time. Indeed, both should be at least 50%. We must make a choice, and on what basis? So the requirements are not of general value, and it is for that reason that I proposed a new requirement: it gives an objective basis for making this choice.
Moreover, if the test variable is a numerical variable, sensitivity and specificity can be manipulated by changing the cut-off point. If the positivity rate decreases, the sensitivity will decrease too, and there will be a cut-off point extreme enough to cause a sensitivity lower than 50%. What then? Switch positive results into negative results to meet the requirement of as high a sensitivity as possible again? I don't like the idea.
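A small Python sketch of the cut-off effect, using synthetic values and assuming (my convention, not the source's) that high test values count as positive:

```python
# Synthetic test values for diseased and healthy subjects
diseased = [3, 5, 6, 7, 8, 9]
healthy = [1, 2, 3, 4, 5, 6]

for cutoff in (2, 5, 8):
    sens = sum(x >= cutoff for x in diseased) / len(diseased)
    spec = sum(x < cutoff for x in healthy) / len(healthy)
    print(f"cut-off {cutoff}: sensitivity {sens:.2f}, specificity {spec:.2f}")

# Moving the cut-off far enough (here, to 8) pushes sensitivity below 50%.
```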
For all those reasons, and yet a few others, I proposed my requirement, which solves those problems. There is even no loss: if the sensitivity in some cases is lower than 50%, it will be to the benefit of the specificity, and it will be justified.
I thank you for your efforts to answer.