Talk:Conditional probability

I've struck out the sentence about decision trees. There is certainly no sense in which conditional probability calculations are generally easier with decision trees. Decision trees can indeed be interpreted as conditional probability models (or not), but in any event, they are a very, very small part of the world of conditional probability, and making an unwarranted assertion about a minor topic is out of place. Wile E. Heresiarch 17:13, 1 Feb 2004 (UTC)

Wrong?

Conditional probability is the probability of some event A, given that some other event, B, has already occurred

...

In these definitions, note that there need not be a causal or temporal relation between A and B. A may precede B, or vice versa, or they may happen at the same time.

This statement is totally confusing - if event B has already occurred, there has to be a temporal relation between A and B (i.e. B happens before A). --Abdull 12:50, 25 February 2006 (UTC)

I've reworded it. --Zundark 14:32, 25 February 2006 (UTC)
Great, thank you! --Abdull 11:24, 26 February 2006 (UTC)

Since the subject of the article is completely formal, I dislike the references to time, expressions like "temporal relation" or one event "preceding" another, because I find them informal in this context. In the framework of the probability space where we are working, time is not formally introduced: at what "time" does an event take place? In fact, when we specifically want to represent or model how our knowledge of the world (represented by random variables) grows as time passes, we can do it by means of filtrations. And I feel the same goes for the "causal relation"; in the article such a notion is not defined formally. --zeycus 15:22, 23 February 2007 (UTC)

The purpose of this paragraph is to dispel the common misconception that conditional probability has something to do with temporal relationships or causality. The paragraph is necessarily informal, as a probability space does not even have such concepts. (By the way, contrary to your suggestion on my Talk page, this paragraph was added by Wile E. Heresiarch on 10 February 2004. The rewording I mentioned above did not touch this paragraph, it simply removed incorrect suggestions of temporal relationships elsewhere in the article. All this can be seen from the edit history.) --Zundark 08:36, 24 February 2007 (UTC)
I apologize for attributing the paragraph to you. I understand what you mean, but I think it is important to separate formal notions from informal ones. So I will add a short comment afterwards. --zeycus 9:42, 24 February 2007 (UTC)

Undefined or Indeterminate?

In the Other Considerations section, the statement "If P(B) = 0, then P(A|B) is left undefined." seems incorrect. Is it not more correct to say that P(A|B) is indeterminate?

Since A∩B is a subset of B, P(A∩B) = 0 as well, so P(A|B) = P(A∩B)/P(B) = 0/0 regardless of A.

Bob Badour 04:36, 11 June 2006 (UTC)

It's undefined. If you think it's not undefined, then what do you think its definition is? --Zundark 08:54, 11 June 2006 (UTC)
Indeterminate, as I said; the definition of which one would paraphrase as incalculable or unknown. However, an indeterminate form can be undefined, and the consensus in the literature is to call the conditional undefined in the above-mentioned case. There are probably reasons for treating it as undefined that I am unaware of, and changing the text in the article would be OR. Thank you for your comments, and I apologize for taking your time. -- Bob Badour 00:07, 12 June 2006 (UTC)
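A minimal sketch of the point at issue, assuming the defining formula P(A|B) = P(A∩B)/P(B) and purely illustrative zero values:

```python
# If P(B) = 0 then P(A ∩ B) = 0 as well, so the defining quotient is 0/0.
p_B = 0.0
p_A_and_B = 0.0   # A ∩ B is a subset of B, so its probability is also 0

try:
    p_A_given_B = p_A_and_B / p_B
except ZeroDivisionError:
    print("P(A|B) = 0/0 has no value; the article leaves it undefined")
```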

Something about this is bothering me. Suppose X is standard normal. I am considering A = {X = 0} and B = {X = 0 or X = 5}, for example. Clearly P(B) = 0. However, I feel that P(A|B) should be defined, and in fact equal to f(0)/(f(0)+f(5)), where f is the density function of X. In order to informally justify this, I would define A_ε = {|X| < ε} and B_ε = {|X| < ε or |X - 5| < ε} for any ε > 0. Then, if I am not wrong, lim_{ε→0} P(A_ε|B_ε) = f(0)/(f(0)+f(5)).

Suppose someone tells me that a number has been obtained from a standard normal variable, that it is 0 or 5, and that I have a chance for a double-or-nothing bet trying to guess which one of them it was. Shouldn't I bet on the 0? And how can I argue for it, if not with the calculations above? Opinions are most welcome. What do you think? -- zeycus 18:36, 22 February 2007 (UTC)
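A quick numeric sketch of this limiting argument, assuming symmetric intervals of half-width ε around 0 and 5 (the helper names phi and pdf are only illustrative):

```python
# For a standard normal X, compare P(A_eps | B_eps) with f(0) / (f(0) + f(5))
# as eps shrinks, where A_eps = {|X| < eps} and B_eps = {|X| < eps or |X-5| < eps}.
import math

def phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def pdf(x):
    """Standard normal density."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

limit = pdf(0.0) / (pdf(0.0) + pdf(5.0))  # the conjectured value of P(A|B)

for eps in (1.0, 0.1, 0.01, 0.001):
    p_a_eps = phi(eps) - phi(-eps)                   # P(|X| < eps)
    p_b_eps = p_a_eps + phi(5 + eps) - phi(5 - eps)  # the two intervals are disjoint for eps < 2.5
    print(f"eps={eps:<6}  P(A_eps|B_eps)={p_a_eps / p_b_eps:.7f}  conjecture={limit:.7f}")
```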

I think you are absolutely right. However, the theory needed to obtain this is a lot more complicated than the theory needed to understand conditional probability as such. Should the article state clearly from the start that we are dealing with discrete distributions only, and then perhaps have a last section dealing with generalization to continuous distributions?--Niels Ø (noe) 19:33, 22 February 2007 (UTC)

Use of 'modulus signs' and set theory

Are the modulus signs in the "Definition" section intended to refer to the cardinality of the respective sets? It's not clear from the current content of the page. I think the set theory background to probability is a little tricky, so perhaps more explanation could go into this section?

I absolutely agree.--Niels Ø (noe) 14:13, 29 January 2007 (UTC)

I may be wrong, but it seems to me that the definition is not just unfortunate, but simply incorrect. Consider for example the probability space with Ω = {a, b, c}, the set of events 2^Ω and probabilities P({a}) = 0.5, P({b}) = 0.3 and P({c}) = 0.2. Let A = {a} and B = {a, b}. Then |A∩B|/|B| = 1/2. However, P(A∩B)/P(B) = 0.5/0.8 = 0.625. --zeycus 4:46, 24 February 2007 (UTC)

The text talks about elements randomly chosen from a set. The author's intent clearly is that this implies symmetry.--Niels Ø (noe) 08:29, 24 February 2007 (UTC)
Yes, you are absolutely right. But then, why define conditional probability only in that particular case, when it makes sense and is usually defined for any probability space with the same formula P(A|B) = P(A∩B)/P(B)? --zeycus 8:43, 24 February 2007 (UTC)
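A short sketch of the discrepancy, with illustrative weights 0.5, 0.3, 0.2 (any non-uniform assignment behaves the same way):

```python
# Compare the cardinality-based formula |A ∩ B| / |B| with the general
# definition P(A ∩ B) / P(B) on a small non-uniform space.
p = {"a": 0.5, "b": 0.3, "c": 0.2}   # probabilities of the elementary outcomes

def prob(event):
    return sum(p[x] for x in event)

A = {"a"}
B = {"a", "b"}

print(len(A & B) / len(B))      # 0.5   -- counting elements
print(prob(A & B) / prob(B))    # 0.625 -- weighting by probability
```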

Valid for continuous distributions?

Two events A and B are mutually exclusive if and only if P(A∩B) = 0...

Let X be a continuous random variable, e.g. normally distributed with mean 0 and standard deviation 1. Let A be the event that X >= 0, and B the event that X <= 0. Then, A∩B is the event X=0, which has probability 0, but which is not impossible. I don't think A and B should be called exclusive in this case. So, either the context of the statement from the article I quote above should be made clear (For discrete distributions,...), or the statement itself should be modified.

Would it in all cases be correct to say that A and B are exclusive if and only if A∩B = Ø ? Suppose U={0,1,2,3,4,5,6}, P(X=0)=0 and P(X=x)=1/6 for x=1,2,3,4,5,6 (i.e. a silly but not incorrect model of a die). Are A={X even}={0,2,4,6} and B={X<2}={0,1} mutually exclusive or not?--Niels Ø (noe) 14:13, 29 January 2007 (UTC)
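A small sketch of that die model, using the outcome probabilities given above; the intersection is non-empty as a set, yet it carries probability zero:

```python
# "Silly die": outcome 0 has probability 0, outcomes 1..6 have probability 1/6.
p = {0: 0.0, 1: 1/6, 2: 1/6, 3: 1/6, 4: 1/6, 5: 1/6, 6: 1/6}

A = {0, 2, 4, 6}   # "X is even"
B = {0, 1}         # "X < 2"

intersection = A & B
print(intersection)                      # {0}  -- not the empty set
print(sum(p[x] for x in intersection))   # 0.0  -- but P(A ∩ B) = 0
```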

An example?

Here's an example involving conditional probabilities that makes sense to me, and usually also to the students to whom I teach this stuff. It clearly shows the difference between P(A|B) and P(B|A).

In order to identify individuals having a serious disease in an early curable form, one may consider screening a large group of people. While the benefits are obvious, an argument against such screenings is the disturbance caused by false positive screening results: If a person not having the disease is incorrectly found to have it by the initial test, they will most likely be quite distressed till a more careful test hopefully shows that they do not have the disease. Even after being told they are well, their lives may be affected negatively.

The magnitude of this problem is best understood in terms of conditional probabilities.

Suppose 1% of the group suffer from the disease D. Choosing an individual at random, P(D)=1%=0.01 and P(W)=99%, where W=D' means the person is well. Suppose that when the screening test is applied to a person not having the disease, there is a 1% chance of getting a false positive result, i.e. P(P|W)=1%, and P(N|W)=99%, where P means positive result, and N=P' means negative result. Finally, suppose that when the test is applied to a person having the disease, there is a 1% chance of a false negative result, i.e. P(N|D)=1% and P(P|D)=99%.

Now, calculation shows that:

P(W∩N) = P(W)·P(N|W) = 99% × 99% = 98.01% is the fraction of the whole group being well and testing negative.
P(D∩P) = P(D)·P(P|D) = 1% × 99% = 0.99% is the fraction of the whole group being ill and testing positive.
P(W∩P) = P(W)·P(P|W) = 99% × 1% = 0.99% is the fraction of the whole group having false positive results.
P(D∩N) = P(D)·P(N|D) = 1% × 1% = 0.01% is the fraction of the whole group having false negative results.

Furthermore,

P(P) = P(W∩P) + P(D∩P) = 0.99% + 0.99% = 1.98% is the fraction of the whole group testing positive.
P(D|P) = P(D∩P)/P(P) = 0.99%/1.98% = 50% is the probability that you actually have the disease if you tested positive.

In this example, it should be easy to relate to the difference between P(P|D)=99% and P(D|P)=50%: The first is the conditional probability that you test positive if you have the disease; the second is the conditional probability that you have the disease if you test positive. With the numbers chosen here, the last result is likely to be deemed unacceptable: Half the people testing positive are actually false positives.

So that's my example. Do you like it? Should I include it in the article? Can you perhaps help me improve on it first?--Niels Ø (noe) 11:48, 13 February 2007 (UTC)
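For reference, the same numbers worked through as a short script (a sketch assuming, as above, 1% prevalence, a 1% false positive rate and a 1% false negative rate):

```python
# Screening example: P(D|P) via Bayes' theorem with the rates quoted above.
p_D = 0.01                     # P(D): prevalence of the disease
p_W = 1.0 - p_D                # P(W): probability of being well
p_P_given_W = 0.01             # P(P|W): false positive rate
p_N_given_D = 0.01             # P(N|D): false negative rate
p_P_given_D = 1.0 - p_N_given_D

p_D_and_P = p_D * p_P_given_D  # ill and testing positive:  0.0099
p_W_and_P = p_W * p_P_given_W  # well and testing positive: 0.0099
p_P = p_D_and_P + p_W_and_P    # testing positive at all:   0.0198

print(p_P_given_D)             # 0.99 -- P(P|D)
print(p_D_and_P / p_P)         # 0.5  -- P(D|P): only half of the positives are ill
```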

No replies for 10 days. I don't know how to interpret that, but I'll now be bold and add my example to the article.--Niels Ø (noe) 09:25, 23 February 2007 (UTC)