- Hypothesis Tests
- Type I and Type II Errors
- Critical Value Approach
- P-Value Approach
- Hypothesis Tests When the Variance Is Unknown
- The Wilcoxon Signed-Rank Test
Hypothesis testing is an essential procedure in statistics. The idea is really simple: the researcher has some theory about the world and wants to determine whether the data actually support that theory. The details, however, are messy, and drawing a conclusion is not easy at all. In contrast to the research hypothesis, we focus on a statistical hypothesis, which must be mathematically precise and must correspond to a specific claim about a characteristic of the population.
The first step is to state the so-called null hypothesis. The null hypothesis is a general statement or default position that there is no relationship between two measured phenomena, or no association among groups. The null hypothesis is generally assumed to be true until evidence indicates otherwise. In statistics, it is often denoted H0 (read "H-nought", "H-null", "H-oh", or "H-zero").
The concept of a null hypothesis is used differently in two approaches to statistical inference. In the orthodox significance testing approach of Sir Ronald Fisher (1890--1962), a null hypothesis is rejected if the observed data would be significantly unlikely to have occurred if the null hypothesis were true. In this case the null hypothesis is rejected because the data provide evidence against it. According to Fisher, if the null hypothesis is rejected, you don't have any other hypothesis to compare it to: there is no way of accepting the alternative, because you don't necessarily have an explicitly stated alternative. Because Fisher was a real person, his opinion changed over time, and at no point did he offer a definitive statement.
In the hypothesis testing approach of Jerzy Neyman and Egon Pearson, a null hypothesis is contrasted with an alternative hypothesis, and the two hypotheses are distinguished on the basis of data, with controlled error rates. Jerzy Neyman (1894--1981) thought of hypothesis testing as a guide to action, and his approach was somewhat more formal than Fisher's. His view was that there are multiple things you could do, and the point of the test is to tell you which one the data support. From his perspective, it is critical to specify the alternative hypothesis (which we denote Ha) properly: for example, when testing a population mean μ against a reference value μ0, one might set H0: μ = μ0 against Ha: μ ≠ μ0. If you don't know what the alternative hypothesis is, then you don't know how powerful the test is, or even which action makes sense.
Jerzy Neyman was born into a Polish family in Bendery, in the Bessarabia Governorate of the Russian Empire. He graduated from the Kamieniec Podolski gubernial gymnasium for boys in 1909 under the name Yuri Cheslavovich Neyman. He began studies at Kharkov University in 1912, where he was taught by the Russian probabilist Sergei Natanovich Bernstein (1880--1968). In 1938 Neyman moved to Berkeley, California, where he worked for the rest of his life. Thirty-nine students received their Ph.D.s under his supervision.
Egon Sharpe Pearson (1895--1980) was a leading British statistician who is best known for the development of the Neyman–Pearson lemma of statistical hypothesis testing.
The best way to think about hypothesis testing is to imagine that a hypothesis test is a criminal trial, an analogy drawn from adversarial legal systems such as those of the UK, the US, and Australia. The null hypothesis is the defendant, the researcher is the prosecutor, and the statistical test itself is the judge. Similarly to a criminal trial, there is a presumption of innocence: the null hypothesis is deemed to be true unless you, the researcher, can prove beyond a reasonable doubt that it is false.
Example: Consider a person who has been indicted for committing a crime and is being tried in court. Based on the available evidence, the judge or jury will make one of two possible decisions:
- The person is not guilty.
- The person is guilty.
Example: Calcium is the most abundant mineral in the human body and has several important functions. Most body calcium is stored in the bones and teeth, where it functions to support their structure.
A point estimate is a single value given as the estimate of a population parameter that is of interest, for example, the mean of some quantity. An interval estimate specifies instead a range within which the parameter is estimated to lie. Interval estimates can be contrasted with point estimates. Confidence intervals are commonly reported in tables or graphs along with point estimates of the same parameters, to show the reliability of the estimates.
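To make the distinction concrete, here is a minimal Python sketch (the sample values are made up for illustration, and SciPy is assumed to be available) that reports both a point estimate and an interval estimate for a population mean:

```python
import numpy as np
from scipy import stats

# Hypothetical measurements (values made up for illustration)
sample = np.array([4.8, 5.1, 5.3, 4.9, 5.6, 5.0, 5.2, 4.7])

# Point estimate: a single value, here the sample mean
point_estimate = sample.mean()

# Interval estimate: a 95% confidence interval from the t-distribution,
# since the population standard deviation is unknown
low, high = stats.t.interval(0.95, df=len(sample) - 1,
                             loc=point_estimate, scale=stats.sem(sample))

print(f"point estimate: {point_estimate:.3f}")
print(f"95% confidence interval: ({low:.3f}, {high:.3f})")
```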
| | retain H0 | reject H0 |
|---|---|---|
| H0 is true | correct decision | error (type I) |
| H0 is false | error (type II) | correct decision |
The probability of a type I error is conventionally denoted α, and the probability of a type II error is denoted β. It is much more common to refer to the power of the test, which is the probability of rejecting the null hypothesis when it really is false, namely 1-β.
| | retain H0 | reject H0 |
|---|---|---|
| H0 is true | 1-α (probability of correct retention) | α (type I error rate) |
| H0 is false | β (type II error rate) | 1-β (power of the test) |
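These rates can be checked empirically. The following Monte Carlo sketch (the sample size, effect size, and α = 0.05 are arbitrary illustrative choices, and SciPy is assumed to be available) runs a one-sample t-test many times, first on data generated under H0 and then on data generated under a specific alternative:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
alpha, n, n_sims = 0.05, 30, 10_000  # illustrative choices

type_i = 0   # rejections when H0 is true
correct = 0  # rejections when H0 is false
for _ in range(n_sims):
    # H0 is true: the population mean really is 0
    x0 = rng.normal(loc=0.0, scale=1.0, size=n)
    if stats.ttest_1samp(x0, popmean=0.0).pvalue < alpha:
        type_i += 1
    # H0 is false: the population mean is actually 0.5
    x1 = rng.normal(loc=0.5, scale=1.0, size=n)
    if stats.ttest_1samp(x1, popmean=0.0).pvalue < alpha:
        correct += 1

print(f"estimated type I error rate: {type_i / n_sims:.3f}")
print(f"estimated power (1 - beta):  {correct / n_sims:.3f}")
```

The first estimate should land near α, while the second approximates the power 1-β against that particular alternative.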
Up to this point, we have presented two methods for performing a hypothesis test for a population mean. If the population standard deviation is known, we can use the z-test; if it is unknown, we can use the t-test.
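As a hedged sketch of both options (the sample values, the hypothesized mean μ0 = 5.0, and the "known" σ = 0.3 are made up for illustration; SciPy is assumed to be available):

```python
import numpy as np
from scipy import stats

# Hypothetical sample and hypothesized population mean (values made up)
sample = np.array([5.3, 4.9, 5.6, 5.1, 4.8, 5.4, 5.2, 5.0])
mu0 = 5.0

# z-test: the population standard deviation is assumed known (sigma = 0.3)
sigma = 0.3
z = (sample.mean() - mu0) / (sigma / np.sqrt(len(sample)))
p_z = 2 * stats.norm.sf(abs(z))  # two-sided p-value

# t-test: sigma unknown, so it is estimated from the sample
t_stat, p_t = stats.ttest_1samp(sample, popmean=mu0)

print(f"z = {z:.3f}, p = {p_z:.3f}")
print(f"t = {t_stat:.3f}, p = {p_t:.3f}")
```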