Talk:Statistical hypothesis test
This is the talk page for discussing improvements to the Statistical hypothesis test article. This is not a forum for general discussion of the article's subject. |
Article policies
|
Find sources: Google (books · news · scholar · free images · WP refs) · FENS · JSTOR · TWL |
Archives: 1, 2 | Auto-archiving period: 3 months |
This article has not yet been rated on Wikipedia's content assessment scale. It is of interest to multiple WikiProjects.
Please add the quality rating to the {{WikiProject banner shell}} template instead of this project banner. See WP:PIQA for details.
Common test statistics
I corrected the erroneous last test ("regression t-test"), replacing it with a correct F-test. Harald Lang, 2015-11-29.
relationships
Does a man's financial responsibility only start when the couple gets married?
The p-value doesn't have to be strictly lower than the significance level to reject the null hypothesis.
The significance level "alpha" is defined as the risk of rejecting a true null hypothesis (the risk of a type 1 error, or false positive). The p-value is defined as the probability of obtaining a test statistic at least as extreme as the one observed, under the null hypothesis. The page says one should reject the null hypothesis when the p-value is less than alpha. This rule appears to contradict the two definitions: if we reject H0 only when a sample yields a p-value that is strictly lower than alpha, the rejection rate of a true H0 can be lower than alpha, whereas it should equal alpha by definition.
To illustrate: H0 is "this coin is fair" and H1 is "there is a probability >1/2 of getting a head" (one-sided test). We toss the coin 10 times. Our test statistic X is the number of heads observed in 10 trials. X follows Bi(10, 1/2) under H0. We get 5 heads. The p-value is P(X ≥ 5) = 0.6230469. You can check this in R with binom.test(5, 10, 1/2, alternative = "greater").
If we choose alpha = P(X ≥ 5) = 0.6230469 and decide to reject H0 when the p-value is strictly lower than alpha, we would reject H0 only if there are 6 heads or more, because if we get 5 heads, the p-value equals alpha. Getting 6 heads or more under H0 has probability P(X ≥ 6) = 0.3769531. This is the rate at which we would reject the true H0, and as you can see, it does not equal alpha.
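The two tail probabilities above can be checked without R. Here is a sketch in Python using only the standard library (the function name p_value is my own; it just sums the upper tail of Bi(10, 1/2), the same quantity binom.test reports):

```python
from math import comb

def p_value(heads, n=10, p=0.5):
    """One-sided binomial p-value: P(X >= heads) under H0 that X ~ Bi(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(heads, n + 1))

alpha = p_value(5)            # p-value when 5 heads are observed: 638/1024 = 0.623046875
# Rejecting only when p-value < alpha means rejecting only at 6 or more heads:
strict_rate = p_value(6)      # P(X >= 6) = 386/1024 = 0.376953125, not equal to alpha
# Rejecting when p-value <= alpha restores a rejection rate of exactly alpha:
leq_rate = p_value(5)         # P(X >= 5) = alpha

print(alpha, strict_rate, leq_rate)
```

So with the strict inequality the true H0 is rejected at rate 0.377 rather than 0.623, exactly the discrepancy described above.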
If I’m right, the wiki page is wrong. Jpeccoud (talk) 05:41, 29 August 2019 (UTC)jpeccoud
- I agree, it should be "less than or equal to". This makes no difference for continuous test statistics; the issue arises only for discrete distributions. I'm changing the phrasing. Tal Galili (talk) 14:21, 29 August 2019 (UTC)
- Thanks. This implies many corrections to the Statistical significance article, which I'd rather not make myself (being neither a statistician nor a native English speaker) — Preceding unsigned comment added by Jpeccoud (talk • contribs) 08:02, 30 August 2019 (UTC)
- Thanks for the heads up. I've now made modifications to Statistical significance based on this. Tal Galili (talk) 07:05, 1 September 2019 (UTC)
Criticism
- When used to detect whether a difference exists between groups, a paradox arises. As improvements are made to experimental design (e.g. increased precision of measurement and sample size), the test becomes more lenient. Unless one accepts the absurd assumption that all sources of noise in the data cancel out completely, the chance of finding statistical significance in either direction approaches 100%. However, this absurd assumption that the mean difference between two groups cannot be zero implies that the data cannot be independent and identically distributed (i.i.d.) because the expected difference between any two subgroups of i.i.d. random variates is zero; therefore, the i.i.d. assumption is also absurd.
This train of thought is unclear and does not make much sense to me. The contributor confuses two different assumptions, in addition to labeling them as "absurd" without any justification. 2607:FEA8:11E0:6C57:E9A6:1B95:A479:CA89 (talk) 16:49, 2 April 2020 (UTC)