H 0 always has a symbol with an equal in it. H a never has a symbol with an equal in it. The choice of symbol depends on the wording of the hypothesis test. However, be aware that many researchers (including one of the co-authors in research work) use = in the null hypothesis, even with > or < as the symbol in the alternative hypothesis. This practice is acceptable because we only make the decision to reject or not reject the null hypothesis.
H 0 : No more than 30% of the registered voters in Santa Clara County voted in the primary election. p ≤ 0.30
H a : More than 30% of the registered voters in Santa Clara County voted in the primary election. p > 0.30
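As a quick sketch of how a one-sided test on a proportion like this could be evaluated, the snippet below computes a z statistic and right-tail p-value for H0: p ≤ 0.30 versus Ha: p > 0.30. The sample counts are hypothetical (invented for illustration, not from the text), and the normal CDF is built from the standard library's error function rather than any particular statistics package:

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

# Hypothetical sample (not from the text): 330 of 1,000 sampled
# registered voters say they voted in the primary.
n, x = 1000, 330
p0 = 0.30                       # null value from H0: p <= 0.30
p_hat = x / n                   # sample proportion, 0.33

se = sqrt(p0 * (1.0 - p0) / n)  # standard error assuming H0 is true
z = (p_hat - p0) / se
p_value = 1.0 - phi(z)          # right tail, matching Ha: p > 0.30
```

A small p-value here would count as evidence against H0 in favor of the claim that more than 30% voted.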
A medical trial is conducted to test whether or not a new medicine reduces cholesterol by 25%. State the null and alternative hypotheses.
H 0 : The drug reduces cholesterol by 25%. p = 0.25
H a : The drug does not reduce cholesterol by 25%. p ≠ 0.25
We want to test whether the mean GPA of students in American colleges is different from 2.0 (out of 4.0). The null and alternative hypotheses are:
H 0 : μ = 2.0
H a : μ ≠ 2.0
We want to test whether the mean height of eighth graders is 66 inches. State the null and alternative hypotheses. Fill in the correct symbol (=, ≠, ≥, <, ≤, >) for the null and alternative hypotheses. H 0 : μ __ 66 H a : μ __ 66
We want to test if college students take less than five years to graduate from college, on the average. The null and alternative hypotheses are:
H 0 : μ ≥ 5
H a : μ < 5
We want to test if it takes fewer than 45 minutes to teach a lesson plan. State the null and alternative hypotheses. Fill in the correct symbol ( =, ≠, ≥, <, ≤, >) for the null and alternative hypotheses. H 0 : μ __ 45 H a : μ __ 45
In an issue of U.S. News and World Report , an article on school standards stated that about half of all students in France, Germany, and Israel take advanced placement exams and a third pass. The same article stated that 6.6% of U.S. students take advanced placement exams and 4.4% pass. Test if the percentage of U.S. students who take advanced placement exams is more than 6.6%. State the null and alternative hypotheses.
H 0 : p ≤ 0.066
H a : p > 0.066
On a state driver’s test, about 40% pass the test on the first try. We want to test if more than 40% pass on the first try. Fill in the correct symbol (=, ≠, ≥, <, ≤, >) for the null and alternative hypotheses. H 0 : p __ 0.40 H a : p __ 0.40
In a hypothesis test, sample data are evaluated in order to arrive at a decision about some type of claim. If certain conditions about the sample are satisfied, then the claim can be evaluated for a population. In a hypothesis test, we:

- Evaluate the null hypothesis, typically denoted H 0 . The null is not rejected unless the hypothesis test shows otherwise. The null statement must always contain some form of equality (=, ≤, or ≥).
- Always write the alternative hypothesis, typically denoted H a or H 1 , using a less-than, greater-than, or not-equal symbol (≠, >, or <).
- If we reject the null hypothesis, then we can assume there is enough evidence to support the alternative hypothesis.
- Never state that a claim is proven true or false. Keep in mind that hypothesis testing is based on probability laws, so we can speak only in terms of non-absolute certainties.
H 0 and H a are contradictory.
The actual test begins by considering two hypotheses. They are called the null hypothesis and the alternative hypothesis. These hypotheses contain opposing viewpoints.
Since the null and alternative hypotheses are contradictory, you must examine evidence to decide if you have enough evidence to reject the null hypothesis or not. The evidence is in the form of sample data. After you have determined which hypothesis the sample supports, you make a decision. There are two options for a decision. They are "reject \(H_{0}\)" if the sample information favors the alternative hypothesis or "do not reject \(H_{0}\)" or "decline to reject \(H_{0}\)" if the sample information is insufficient to reject the null hypothesis.
| \(H_{0}\) | \(H_{a}\) |
|---|---|
| equal (=) | not equal \((\neq)\), greater than (>), or less than (<) |
| greater than or equal to \((\geq)\) | less than (<) |
| less than or equal to \((\leq)\) | more than (>) |
COLLABORATIVE EXERCISE
Bring to class a newspaper, some news magazines, and some Internet articles . In groups, find articles from which your group can write null and alternative hypotheses. Discuss your hypotheses with the rest of the class.
| \(H_{0}\) has: | equal \((=)\) | greater than or equal to \((\geq)\) | less than or equal to \((\leq)\) |
|---|---|---|---|
| \(H_{a}\) has: | not equal \((\neq)\), greater than \((>)\), or less than \((<)\) | less than \((<)\) | greater than \((>)\) |
\(\alpha\) is preconceived. Its value is set before the hypothesis test starts. The \(p\)-value is calculated from the data.

References
Barbara Illowsky and Susan Dean (De Anza College) with many other contributing authors. Content produced by OpenStax College is licensed under a Creative Commons Attribution License 4.0 license. Download for free at http://cnx.org/contents/[email protected] .
The alternative hypothesis states that there is a statistically significant relationship between two variables, whereas the null hypothesis states that there is no statistical relationship between them. In statistics, we come across various kinds of hypotheses. A statistical hypothesis is a working statement assumed to be consistent with the given data; the hypothesis itself is neither considered true nor false at the outset.
The alternative hypothesis is a statement used in a statistical inference experiment. It contradicts the null hypothesis and is denoted H a or H 1 ; we can also say that it is simply an alternative to the null. In hypothesis testing, the alternative hypothesis is the statement the researcher is actually testing: the researcher believes it to be true and seeks evidence strong enough to reject the null in its favor. Under the alternative hypothesis, the researcher predicts a difference between two or more variables, such that the pattern of data observed in the test is not due to chance.
For example, suppose researchers observe the water quality of a river over one year. The null hypothesis states that there is no change in water quality between the first half of the year and the second half; the alternative hypothesis states that the water quality is worse in the second half.
| Null hypothesis | Alternative hypothesis |
|---|---|
| Denotes that there is no relationship between two measured phenomena. | Holds that some non-random cause influences the observed data or sample. |
| Represented by H 0 . | Represented by H a or H 1 . |
| Example: Rohan will win at least Rs. 100000 in the lucky draw. | Example: Rohan will win less than Rs. 100000 in the lucky draw. |
There are basically three types of alternative hypothesis:
Left-Tailed : Here, it is expected that the sample proportion (π) is less than a specified value which is denoted by π 0 , such that;
H 1 : π < π 0
Right-Tailed: It represents that the sample proportion (π) is greater than some value, denoted by π 0 .
H 1 : π > π 0
Two-Tailed: According to this hypothesis, the sample proportion (denoted by π) is not equal to a specific value which is represented by π 0 .
H 1 : π ≠ π 0
Note: The null hypothesis corresponding to all three alternative hypotheses would be H 0 : π = π 0 .
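The three alternatives above differ only in which tail (or tails) of the sampling distribution counts as evidence against H0. As an illustrative sketch (the function and its `alternative` labels are our own naming, not from the text), a p-value for a z statistic can be computed for each case with only the standard library:

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def p_value(z, alternative):
    """p-value for a z statistic under the three alternatives:
    'left'  -> H1: pi < pi0   (left tail)
    'right' -> H1: pi > pi0   (right tail)
    'two'   -> H1: pi != pi0  (both tails)"""
    if alternative == "left":
        return phi(z)
    if alternative == "right":
        return 1.0 - phi(z)
    return 2.0 * (1.0 - phi(abs(z)))  # two-tailed
```

Note that for the same z, the two-tailed p-value is twice the corresponding one-tailed area.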
Hypothesis testing involves the careful construction of two statements: the null hypothesis and the alternative hypothesis. These hypotheses can look very similar but are actually different.
How do we know which hypothesis is the null and which one is the alternative? We will see that there are a few ways to tell the difference.
The null hypothesis reflects that there will be no observed effect in our experiment. In a mathematical formulation of the null hypothesis, there will typically be an equal sign. This hypothesis is denoted by H 0 .
The null hypothesis is what we attempt to find evidence against in our hypothesis test. We hope to obtain a small enough p-value that it is lower than our level of significance alpha and we are justified in rejecting the null hypothesis. If our p-value is greater than alpha, then we fail to reject the null hypothesis.
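The comparison of the p-value with the significance level described above amounts to a one-line decision rule. A minimal sketch (the function name is ours; some texts use ≤ rather than < at the boundary):

```python
def decide(p_value, alpha=0.05):
    """Decision rule from the paragraph above: reject H0 only when the
    p-value falls below the significance level alpha. Note the
    asymmetric language: we 'fail to reject' H0, we never 'accept' it."""
    return "reject H0" if p_value < alpha else "fail to reject H0"
```

For example, `decide(0.007)` rejects the null, while `decide(0.21)` does not.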
If the null hypothesis is not rejected, then we must be careful to say what this means. The thinking on this is similar to a legal verdict. Just because a person has been declared "not guilty", it does not mean that he is innocent. In the same way, just because we failed to reject a null hypothesis it does not mean that the statement is true.
For example, we may want to investigate the claim that despite what convention has told us, the mean adult body temperature is not the accepted value of 98.6 degrees Fahrenheit . The null hypothesis for an experiment to investigate this is “The mean adult body temperature for healthy individuals is 98.6 degrees Fahrenheit.” If we fail to reject the null hypothesis, then our working hypothesis remains that the average adult who is healthy has a temperature of 98.6 degrees. We do not prove that this is true.
If we are studying a new treatment, the null hypothesis is that our treatment will not change our subjects in any meaningful way. In other words, the treatment will not produce any effect in our subjects.
The alternative or experimental hypothesis reflects that there will be an observed effect for our experiment. In a mathematical formulation of the alternative hypothesis, there will typically be an inequality, or not equal to symbol. This hypothesis is denoted by either H a or by H 1 .
The alternative hypothesis is what we are attempting to demonstrate in an indirect way by the use of our hypothesis test. If the null hypothesis is rejected, then we accept the alternative hypothesis. If the null hypothesis is not rejected, then we do not accept the alternative hypothesis. Going back to the above example of mean human body temperature, the alternative hypothesis is “The average adult human body temperature is not 98.6 degrees Fahrenheit.”
If we are studying a new treatment, then the alternative hypothesis is that our treatment does, in fact, change our subjects in a meaningful and measurable way.
The following set of negations may help when you are forming your null and alternative hypotheses. Most technical papers rely on just the first formulation, even though you may see some of the others in a statistics textbook.
Is the typical US runner getting faster or slower over time? We consider this question in the context of the Cherry Blossom Run, comparing runners in 2006 and 2012. Technological advances in shoes, training, and diet might suggest runners would be faster in 2012. An opposing viewpoint might say that with the average body mass index on the rise, people tend to run slower. In fact, all of these components might be influencing run time.
In addition to considering run times in this section, we consider a topic near and dear to most students: sleep. A recent study found that college students average about 7 hours of sleep per night.15 However, researchers at a rural college are interested in showing that their students sleep longer than seven hours on average. We investigate this topic in Section 4.3.4.
The average time for all runners who finished the Cherry Blossom Run in 2006 was 93.29 minutes (93 minutes and about 17 seconds). We want to determine if the run10Samp data set provides strong evidence that the participants in 2012 were faster or slower than those runners in 2006, versus the other possibility that there has been no change. 16 We simplify these three options into two competing hypotheses:

H 0 : The average 10-mile run time was the same for 2006 and 2012.
H A : The average 10-mile run time for 2012 was different from that of 2006.
We call H 0 the null hypothesis and H A the alternative hypothesis.
Null and alternative hypotheses
15 theloquitur.com/?p=1161
16 While we could answer this question by examining the entire population data (run10), we only consider the sample data (run10Samp), which is more realistic since we rarely have access to population data.
The null hypothesis often represents a skeptical position or a perspective of no difference. The alternative hypothesis often represents a new perspective, such as the possibility that there has been a change.
Hypothesis testing framework
The skeptic will not reject the null hypothesis (H 0 ), unless the evidence in favor of the alternative hypothesis (H A ) is so strong that she rejects H 0 in favor of H A .
The hypothesis testing framework is a very general tool, and we often use it without a second thought. If a person makes a somewhat unbelievable claim, we are initially skeptical. However, if there is sufficient evidence that supports the claim, we set aside our skepticism and reject the null hypothesis in favor of the alternative. The hallmarks of hypothesis testing are also found in the US court system.
Exercise \(\PageIndex{1}\)
A US court considers two possible claims about a defendant: she is either innocent or guilty. If we set these claims up in a hypothesis framework, which would be the null hypothesis and which the alternative? 17
Jurors examine the evidence to see whether it convincingly shows a defendant is guilty. Even if the jurors leave unconvinced of guilt beyond a reasonable doubt, this does not mean they believe the defendant is innocent. This is also the case with hypothesis testing: even if we fail to reject the null hypothesis, we typically do not accept the null hypothesis as true. Failing to find strong evidence for the alternative hypothesis is not equivalent to accepting the null hypothesis.
18 H 0 : The average cost is $650 per month, \(\mu\) = $650.
In the example with the Cherry Blossom Run, the null hypothesis represents no difference in the average time from 2006 to 2012. The alternative hypothesis represents something new or more interesting: there was a difference, either an increase or a decrease. These hypotheses can be described in mathematical notation using \(\mu_{12}\) as the average run time for 2012:

\[H_{0}: \mu_{12} = 93.29 \qquad H_{A}: \mu_{12} \neq 93.29\]
where 93.29 minutes (93 minutes and about 17 seconds) is the average 10 mile time for all runners in the 2006 Cherry Blossom Run. Using this mathematical notation, the hypotheses can now be evaluated using statistical tools. We call 93.29 the null value since it represents the value of the parameter if the null hypothesis is true. We will use the run10Samp data set to evaluate the hypothesis test.
We can start the evaluation of the hypothesis setup by comparing 2006 and 2012 run times using a point estimate from the 2012 sample: \(\bar {x}_{12} = 95.61\) minutes. This estimate suggests the average time is actually longer than the 2006 time, 93.29 minutes. However, to evaluate whether this provides strong evidence that there has been a change, we must consider the uncertainty associated with \(\bar {x}_{12}\).
17 The jury considers whether the evidence is so convincing (strong) that there is no reasonable doubt regarding the person's guilt; in such a case, the jury rejects innocence (the null hypothesis) and concludes the defendant is guilty (alternative hypothesis).
We learned in Section 4.1 that there is fluctuation from one sample to another, and it is very unlikely that the sample mean will be exactly equal to our parameter; we should not expect \(\bar {x}_{12}\) to exactly equal \(\mu_{12}\). Given that \(\bar {x}_{12} = 95.61\), it might still be possible that the population average in 2012 has remained unchanged from 2006. The difference between \(\bar {x}_{12}\) and 93.29 could be due to sampling variation, i.e. the variability associated with the point estimate when we take a random sample.
In Section 4.2, confidence intervals were introduced as a way to find a range of plausible values for the population mean. Based on run10Samp, a 95% confidence interval for the 2012 population mean, \(\mu_{12}\), was calculated as
\[(92.45, 98.77)\]
Because the 2006 mean, 93.29, falls in the range of plausible values, we cannot say the null hypothesis is implausible. That is, we failed to reject the null hypothesis, H 0 .
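The decision rule used here, failing to reject H0 whenever the null value falls inside the confidence interval, can be sketched as a tiny helper (the function name is ours, for illustration):

```python
def ci_test(null_value, ci_lower, ci_upper):
    """Evaluate H0 by checking whether the null value is among the
    plausible values, i.e. falls inside the confidence interval."""
    if ci_lower <= null_value <= ci_upper:
        return "fail to reject H0"
    return "reject H0"

# The 2006 mean, 93.29, versus the 95% CI (92.45, 98.77) for 2012:
result = ci_test(93.29, 92.45, 98.77)
```

Here `result` is "fail to reject H0", matching the conclusion above.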
Double negatives can sometimes be used in statistics
In many statistical explanations, we use double negatives. For instance, we might say that the null hypothesis is not implausible or we failed to reject the null hypothesis. Double negatives are used to communicate that while we are not rejecting a position, we are also not saying it is correct.
Example \(\PageIndex{1}\)
Next consider whether there is strong evidence that the average age of runners has changed from 2006 to 2012 in the Cherry Blossom Run. In 2006, the average age was 36.13 years, and in the 2012 run10Samp data set, the average was 35.05 years with a standard deviation of 8.97 years for 100 runners.
First, set up the hypotheses:

\[H_{0}: \mu_{age} = 36.13 \qquad H_{A}: \mu_{age} \neq 36.13\]
We have previously verified conditions for this data set. The normal model may be applied to \(\bar {y}\) and the estimate of SE should be very accurate. Using the sample mean and standard error, we can construct a 95% confidence interval for \(\mu _{age}\) to determine if there is sufficient evidence to reject H 0 :
\[\bar{y} \pm 1.96 \times \dfrac {s}{\sqrt {100}} \rightarrow 35.05 \pm 1.96 \times 0.90 \rightarrow (33.29, 36.81)\]
This confidence interval contains the null value, 36.13. Because 36.13 is not implausible, we cannot reject the null hypothesis. We have not found strong evidence that the average age is different from 36.13 years.
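The interval above can be reproduced in a few lines, using the numbers from the example and 1.96 as the usual 95% normal critical value:

```python
from math import sqrt

# From the example: n = 100 runners, mean age 35.05, sd 8.97
n, ybar, s = 100, 35.05, 8.97
se = s / sqrt(n)            # 0.897, about 0.90
lower = ybar - 1.96 * se
upper = ybar + 1.96 * se    # interval roughly (33.29, 36.81)

# The null value 36.13 lies inside the interval: fail to reject H0
contains_null = lower <= 36.13 <= upper
```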
Exercise \(\PageIndex{2}\)
Colleges frequently provide estimates of student expenses such as housing. A consultant hired by a community college claimed that the average student housing expense was $650 per month. What are the null and alternative hypotheses to test whether this claim is accurate? 18
H A : The average cost is different than $650 per month, \(\mu \ne\) $650.
19 Applying the normal model requires that certain conditions are met. Because the data are a simple random sample and the sample (presumably) represents no more than 10% of all students at the college, the observations are independent. The sample size is also sufficiently large (n = 75) and the data exhibit only moderate skew. Thus, the normal model may be applied to the sample mean.
Exercise \(\PageIndex{3}\)
The community college decides to collect data to evaluate the $650 per month claim. They take a random sample of 75 students at their school and obtain the data represented in Figure 4.11. Can we apply the normal model to the sample mean?
If the court makes a Type 1 Error, this means the defendant is innocent (H 0 true) but wrongly convicted. A Type 2 Error means the court failed to reject H 0 (i.e. failed to convict the person) when she was in fact guilty (H A true).
Example \(\PageIndex{2}\)
The sample mean for student housing is $611.63 and the sample standard deviation is $132.85. Construct a 95% confidence interval for the population mean and evaluate the hypotheses of Exercise 4.22.
The standard error associated with the mean may be estimated using the sample standard deviation divided by the square root of the sample size. Recall that n = 75 students were sampled.
\[ SE = \dfrac {s}{\sqrt {n}} = \dfrac {132.85}{\sqrt {75}} = 15.34\]
You showed in Exercise 4.23 that the normal model may be applied to the sample mean. This ensures a 95% confidence interval may be accurately constructed:
\[\bar {x} \pm z^{\star} \times SE \rightarrow 611.63 \pm 1.96 \times 15.34 \rightarrow (581.56, 641.70)\]
Because the null value $650 is not in the confidence interval, a true mean of $650 is implausible and we reject the null hypothesis. The data provide statistically significant evidence that the actual average housing expense is less than $650 per month.
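The whole calculation in this example, from standard error to decision, fits in a few lines (numbers taken from the example; 1.96 is the 95% normal critical value):

```python
from math import sqrt

# From the example: n = 75 students, mean $611.63, sd $132.85
n, xbar, s = 75, 611.63, 132.85
se = s / sqrt(n)                         # about 15.34
lower = xbar - 1.96 * se                 # about 581.56
upper = xbar + 1.96 * se                 # about 641.70

# The null value $650 falls outside the interval, so we reject H0
rejected = not (lower <= 650 <= upper)
```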
Hypothesis tests are not flawless. Just think of the court system: innocent people are sometimes wrongly convicted and the guilty sometimes walk free. Similarly, we can make a wrong decision in statistical hypothesis tests. However, the difference is that we have the tools necessary to quantify how often we make such errors.
There are two competing hypotheses: the null and the alternative. In a hypothesis test, we make a statement about which one might be true, but we might choose incorrectly. There are four possible scenarios in a hypothesis test, which are summarized in Table 4.12.
| | Test conclusion: do not reject \(H_0\) | Test conclusion: reject \(H_0\) in favor of \(H_A\) |
|---|---|---|
| \(H_0\) true | okay | Type 1 Error |
| \(H_A\) true | Type 2 Error | okay |
A Type 1 Error is rejecting the null hypothesis when H0 is actually true. A Type 2 Error is failing to reject the null hypothesis when the alternative is actually true.
Exercise 4.25
In a US court, the defendant is either innocent (H 0 ) or guilty (H A ). What does a Type 1 Error represent in this context? What does a Type 2 Error represent? Table 4.12 may be useful.
To lower the Type 1 Error rate, we might raise our standard for conviction from "beyond a reasonable doubt" to "beyond a conceivable doubt" so fewer people would be wrongly convicted. However, this would also make it more difficult to convict the people who are actually guilty, so we would make more Type 2 Errors.
Exercise 4.26
How could we reduce the Type 1 Error rate in US courts? What influence would this have on the Type 2 Error rate?
To lower the Type 2 Error rate, we want to convict more guilty people. We could lower the standards for conviction from "beyond a reasonable doubt" to "beyond a little doubt". Lowering the bar for guilt will also result in more wrongful convictions, raising the Type 1 Error rate.
Exercise 4.27
How could we reduce the Type 2 Error rate in US courts? What influence would this have on the Type 1 Error rate?
A skeptic would have no reason to believe that sleep patterns at this school are different than the sleep patterns at another school.
Exercises 4.25-4.27 provide an important lesson:
If we reduce how often we make one type of error, we generally make more of the other type.
Hypothesis testing is built around rejecting or failing to reject the null hypothesis. That is, we do not reject H 0 unless we have strong evidence. But what precisely does strong evidence mean? As a general rule of thumb, for those cases where the null hypothesis is actually true, we do not want to incorrectly reject H 0 more than 5% of the time. This corresponds to a significance level of 0.05. We often write the significance level using \(\alpha\) (the Greek letter alpha): \(\alpha = 0.05.\) We discuss the appropriateness of different significance levels in Section 4.3.6.
If we use a 95% confidence interval to test a hypothesis where the null hypothesis is true, we will make an error whenever the point estimate is at least 1.96 standard errors away from the population parameter. This happens about 5% of the time (2.5% in each tail). Similarly, using a 99% confidence interval to evaluate a hypothesis is equivalent to a significance level of \(\alpha = 0.01\).
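The correspondence between confidence level and significance level runs through the normal critical value \(z^{\star}\). As a sketch using only the standard library (the function name is ours):

```python
from statistics import NormalDist

def z_star(confidence):
    """Critical value z* such that the middle `confidence` fraction of
    the standard normal distribution lies within +/- z*. The leftover
    alpha = 1 - confidence is split between the two tails."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)
```

A 95% interval corresponds to \(\alpha = 0.05\) and \(z^{\star} \approx 1.96\); a 99% interval corresponds to \(\alpha = 0.01\) and \(z^{\star} \approx 2.58\).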
A confidence interval is, in one sense, simplistic in the world of hypothesis tests. Consider the following two scenarios:
In Section 4.3.4, we introduce a tool called the p-value that will be helpful in these cases. The p-value method also extends to hypothesis tests where confidence intervals cannot be easily constructed or applied.
The p-value is a way of quantifying the strength of the evidence against the null hypothesis and in favor of the alternative. Formally the p-value is a conditional probability.
definition: p-value
The p-value is the probability of observing data at least as favorable to the alternative hypothesis as our current data set, if the null hypothesis is true. We typically use a summary statistic of the data, in this chapter the sample mean, to help compute the p-value and evaluate the hypotheses.
A poll by the National Sleep Foundation found that college students average about 7 hours of sleep per night. Researchers at a rural school are interested in showing that students at their school sleep longer than seven hours on average, and they would like to demonstrate this using a sample of students. What would be an appropriate skeptical position for this research?
This is entirely based on the interests of the researchers. Had they been only interested in the opposite case - showing that their students were actually averaging fewer than seven hours of sleep but not interested in showing more than 7 hours - then our setup would have set the alternative as \(\mu < 7\).
We can set up the null hypothesis for this test as a skeptical perspective: the students at this school average 7 hours of sleep per night. The alternative hypothesis takes a new form reflecting the interests of the research: the students average more than 7 hours of sleep. We can write these hypotheses as

\[H_{0}: \mu = 7 \qquad H_{A}: \mu > 7\]
Using \(\mu\) > 7 as the alternative is an example of a one-sided hypothesis test. In this investigation, there is no apparent interest in learning whether the mean is less than 7 hours. (The standard error can be estimated from the sample standard deviation and the sample size: \(SE_{\bar {x}} = \dfrac {s_x}{\sqrt {n}} = \dfrac {1.75}{\sqrt {110}} = 0.17\)). Earlier we encountered a two-sided hypothesis where we looked for any clear difference, greater than or less than the null value.
Always use a two-sided test unless it was made clear prior to data collection that the test should be one-sided. Switching a two-sided test to a one-sided test after observing the data is dangerous because it can inflate the Type 1 Error rate.
TIP: One-sided and two-sided tests
If the researchers are only interested in showing an increase or a decrease, but not both, use a one-sided test. If the researchers would be interested in any difference from the null value - an increase or decrease - then the test should be two-sided.
TIP: Always write the null hypothesis as an equality
We will find it most useful if we always list the null hypothesis as an equality (e.g. \(\mu = 7\)) while the alternative always uses an inequality (e.g. \(\mu \ne 7\), \(\mu > 7\), or \(\mu < 7\)).
The researchers at the rural school conducted a simple random sample of n = 110 students on campus. They found that these students averaged 7.42 hours of sleep and the standard deviation of the amount of sleep for the students was 1.75 hours. A histogram of the sample is shown in Figure 4.14.
Before we can use a normal model for the sample mean or compute the standard error of the sample mean, we must verify conditions. (1) Because this is a simple random sample from less than 10% of the student body, the observations are independent. (2) The sample size in the sleep study is sufficiently large since it is greater than 30. (3) The data show moderate skew in Figure 4.14 and the presence of a couple of outliers. This skew and the outliers (which are not too extreme) are acceptable for a sample size of n = 110. With these conditions verified, the normal model can be safely applied to \(\bar {x}\) and the estimated standard error will be very accurate.
What is the standard deviation associated with \(\bar {x}\)? That is, estimate the standard error of \(\bar {x}\). 25
25 The standard error can be estimated from the sample standard deviation and the sample size: \(SE_{\bar {x}} = \dfrac {s_x}{\sqrt {n}} = \dfrac {1.75}{\sqrt {110}} = 0.17\).
The hypothesis test will be evaluated using a significance level of \(\alpha = 0.05\). We want to consider the data under the scenario that the null hypothesis is true. In this case, the sample mean is from a distribution that is nearly normal and has mean 7 and standard deviation of about 0.17. Such a distribution is shown in Figure 4.15.
The shaded tail in Figure 4.15 represents the chance of observing such a large mean, conditional on the null hypothesis being true. That is, the shaded tail represents the p-value. We shade all means larger than our sample mean, \(\bar {x} = 7.42\), because they are more favorable to the alternative hypothesis than the observed mean.
We compute the p-value by finding the tail area of this normal distribution, which we learned to do in Section 3.1. First compute the Z score of the sample mean, \(\bar {x} = 7.42\):
\[Z = \dfrac {\bar {x} - \text {null value}}{SE_{\bar {x}}} = \dfrac {7.42 - 7}{0.17} = 2.47\]
Using the normal probability table, the lower unshaded area is found to be 0.993. Thus the shaded area is 1 - 0.993 = 0.007. If the null hypothesis is true, the probability of observing such a large sample mean for a sample of 110 students is only 0.007. That is, if the null hypothesis is true, we would not often see such a large mean.
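The calculation above can be reproduced in a few lines of Python. This is just a sketch using the standard library; it assumes the rounded standard error of 0.17 from the text.

```python
from statistics import NormalDist

# Sleep study values from the text: x-bar = 7.42, null value 7, SE = 0.17 (rounded)
x_bar, mu_0, se = 7.42, 7.0, 0.17

z = (x_bar - mu_0) / se            # Z score, about 2.47
p_value = 1 - NormalDist().cdf(z)  # upper-tail area, about 0.007
print(round(z, 2), round(p_value, 3))
```

Using the unrounded standard error gives a slightly different Z score, which is why hand calculations and software output sometimes disagree in the last digit.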
We evaluate the hypotheses by comparing the p-value to the significance level. Because the p-value is less than the significance level \((p-value = 0.007 < 0.05 = \alpha)\), we reject the null hypothesis. What we observed is so unusual with respect to the null hypothesis that it casts serious doubt on H 0 and provides strong evidence favoring H A .
p-value as a tool in hypothesis testing
The p-value quantifies how strongly the data favor H A over H 0 . A small p-value (usually < 0.05) corresponds to sufficient evidence to reject H 0 in favor of H A .
TIP: First draw a picture to find the p-value
It is useful to draw a picture of the distribution of \(\bar {x}\) as though H 0 was true (i.e. \(\mu\) equals the null value), and shade the region (or regions) of sample means that are at least as favorable to the alternative hypothesis. These shaded regions represent the p-value.
The ideas below review the process of evaluating hypothesis tests with p-values:
The p-value is constructed in such a way that we can directly compare it to the significance level ( \(\alpha\)) to determine whether or not to reject H 0 . This method ensures that the Type 1 Error rate does not exceed the significance level standard.
If the null hypothesis is true, how often should the p-value be less than 0.05?
About 5% of the time. If the null hypothesis is true, then the data only has a 5% chance of being in the 5% of data most favorable to H A .
Exercise 4.31
Suppose we had used a significance level of 0.01 in the sleep study. Would the evidence have been strong enough to reject the null hypothesis? (The p-value was 0.007.) What if the significance level was \(\alpha = 0.001\)? 27
27 We reject the null hypothesis whenever p-value < \(\alpha\). Thus, we would still reject the null hypothesis if \(\alpha = 0.01\) but not if the significance level had been \(\alpha = 0.001\).
Exercise 4.32
Ebay might be interested in showing that buyers on its site tend to pay less than they would for the corresponding new item on Amazon. We'll research this topic for one particular product: a video game called Mario Kart for the Nintendo Wii. During early October 2009, Amazon sold this game for $46.99. Set up an appropriate (one-sided!) hypothesis test to check the claim that Ebay buyers pay less during auctions at this same time. 28
28 The skeptic would say the average is the same on Ebay, and we are interested in showing the average price is lower.
Exercise 4.33
During early October, 2009, 52 Ebay auctions were recorded for Mario Kart.29 The total prices for the auctions are presented using a histogram in Figure 4.17, and we may like to apply the normal model to the sample mean. Check the three conditions required for applying the normal model: (1) independence, (2) at least 30 observations, and (3) the data are not strongly skewed. 30
30 (1) The independence condition is unclear. We will make the assumption that the observations are independent, which we should report with any final results. (2) The sample size is sufficiently large: \(n = 52 \ge 30\). (3) The data distribution is not strongly skewed; it is approximately symmetric.
H 0 : The average auction price on Ebay is equal to (or more than) the price on Amazon. We write only the equality in the statistical notation: \(\mu_{ebay} = 46.99\).
H A : The average price on Ebay is less than the price on Amazon, \(\mu _{ebay} < 46.99\).
29 These data were collected by OpenIntro staff.
Example 4.34
The average sale price of the 52 Ebay auctions for Wii Mario Kart was $44.17 with a standard deviation of $4.15. Does this provide sufficient evidence to reject the null hypothesis in Exercise 4.32? Use a significance level of \(\alpha = 0.01\).
The hypotheses were set up and the conditions were checked in Exercises 4.32 and 4.33. The next step is to find the standard error of the sample mean and produce a sketch to help find the p-value.
The standard error of the sample mean is \(SE_{\bar {x}} = \dfrac {s}{\sqrt {n}} = \dfrac {4.15}{\sqrt {52}} = 0.5755\). Because the alternative hypothesis says we are looking for a smaller mean, we shade the lower tail. We find this shaded area by using the Z score and normal probability table: \(Z = \dfrac {44.17 - 46.99}{0.5755} = -4.90\), which has area less than 0.0002. The area is so small we cannot really see it on the picture. This lower tail area corresponds to the p-value.
Because the p-value is so small - specifically, smaller than \(\alpha = 0.01\) - this provides sufficiently strong evidence to reject the null hypothesis in favor of the alternative. The data provide statistically significant evidence that the average price on Ebay is lower than Amazon's asking price.
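As a quick check, the standard error, Z score, and tail area for this example can be computed directly. This is a sketch using the sample values quoted in the text:

```python
from math import sqrt
from statistics import NormalDist

# Mario Kart auction data from the text: n = 52, x-bar = 44.17, s = 4.15
se = 4.15 / sqrt(52)           # standard error, about 0.5755
z = (44.17 - 46.99) / se       # Z score, about -4.90
p_value = NormalDist().cdf(z)  # lower-tail area, far below 0.0002
```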
We now consider how to compute a p-value for a two-sided test. In one-sided tests, we shade the single tail in the direction of the alternative hypothesis. For example, when the alternative had the form \(\mu\) > 7, then the p-value was represented by the upper tail (Figure 4.16). When the alternative was \(\mu\) < 46.99, the p-value was the lower tail (Exercise 4.32). In a two-sided test, we shade two tails since evidence in either direction is favorable to H A .
Exercise 4.35 Earlier we talked about a research group investigating whether the students at their school slept longer than 7 hours each night. Let's consider a second group of researchers who want to evaluate whether the students at their college differ from the norm of 7 hours. Write the null and alternative hypotheses for this investigation. 31
Example 4.36 The second college randomly samples 72 students and finds a mean of \(\bar {x} = 6.83\) hours and a standard deviation of s = 1.8 hours. Does this provide strong evidence against H 0 in Exercise 4.35? Use a significance level of \(\alpha = 0.05\).
First, we must verify assumptions. (1) A simple random sample of less than 10% of the student body means the observations are independent. (2) The sample size is 72, which is greater than 30. (3) Based on the earlier distribution and what we already know about college student sleep habits, the distribution is probably not strongly skewed.
Next we can compute the standard error \((SE_{\bar {x}} = \dfrac {s}{\sqrt {n}} = 0.21)\) of the estimate and create a picture to represent the p-value, shown in Figure 4.18. Both tails are shaded.
31 Because the researchers are interested in any difference, they should use a two-sided setup: H 0 : \(\mu\) = 7, H A : \(\mu \ne 7.\)
An estimate of 7.17 or more provides at least as strong evidence against the null hypothesis and in favor of the alternative as the observed estimate, \(\bar {x} = 6.83\).
We can calculate the tail areas by first finding the lower tail corresponding to \(\bar {x}\):
\[Z = \dfrac {6.83 - 7.00}{0.21} = -0.81 \xrightarrow {table} \text {left tail} = 0.2090\]
Because the normal model is symmetric, the right tail will have the same area as the left tail. The p-value is found as the sum of the two shaded tails:
\[ \text {p-value} = \text {left tail} + \text {right tail} = 2 \times \text {(left tail)} = 0.4180\]
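This two-sided calculation can be sketched in Python, again assuming the rounded standard error of 0.21 from the text:

```python
from statistics import NormalDist

# Second sleep study: x-bar = 6.83, null value 7, SE = 0.21 (rounded)
z = (6.83 - 7.00) / 0.21         # Z score, about -0.81
left_tail = NormalDist().cdf(z)  # about 0.209
p_value = 2 * left_tail          # both tails, about 0.418
```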
This p-value is relatively large (larger than \(\alpha = 0.05\)), so we should not reject H 0 . That is, if H 0 is true, it would not be very unusual to see a sample mean this far from 7 hours simply due to sampling variation. Thus, we do not have sufficient evidence to conclude that the mean is different than 7 hours.
Example 4.37 It is never okay to change two-sided tests to one-sided tests after observing the data. In this example we explore the consequences of ignoring this advice. Using \(\alpha = 0.05\), we show that freely switching from two-sided tests to one-sided tests will cause us to make twice as many Type 1 Errors as intended.
Suppose the sample mean was larger than the null value, \(\mu_0\) (e.g. \(\mu_0\) would represent 7 if H 0 : \(\mu\) = 7). Then if we can flip to a one-sided test, we would use H A : \(\mu > \mu_0\). Now if we obtain any observation with a Z score greater than 1.65, we would reject H 0 . If the null hypothesis is true, we incorrectly reject the null hypothesis about 5% of the time when the sample mean is above the null value, as shown in Figure 4.19.
Suppose the sample mean was smaller than the null value. Then if we change to a one-sided test, we would use H A : \(\mu < \mu_0\). If \(\bar {x}\) had a Z score smaller than -1.65, we would reject H 0 . If the null hypothesis is true, then we would observe such a case about 5% of the time.
By examining these two scenarios, we can determine that we will make a Type 1 Error 5% + 5% = 10% of the time if we are allowed to swap to the "best" one-sided test for the data. This is twice the error rate we prescribed with our significance level: \(\alpha = 0.05\) (!).
Caution: One-sided hypotheses are allowed only before seeing data
After observing data, it is tempting to turn a two-sided test into a one-sided test. Avoid this temptation. Hypotheses must be set up before observing the data. If they are not, the test must be two-sided.
Choosing a significance level for a test is important in many contexts, and the traditional level is 0.05. However, it is often helpful to adjust the significance level based on the application. We may select a level that is smaller or larger than 0.05 depending on the consequences of any conclusions reached from the test.
Significance levels should reflect consequences of errors
The significance level selected for a test should reflect the consequences associated with Type 1 and Type 2 Errors.
Example 4.38
A car manufacturer is considering a higher quality but more expensive supplier for window parts in its vehicles. They sample a number of parts from their current supplier and also parts from the new supplier. They decide that if the high quality parts will last more than 12% longer, it makes financial sense to switch to this more expensive supplier. Is there good reason to modify the significance level in such a hypothesis test?
The null hypothesis is that the more expensive parts last no more than 12% longer while the alternative is that they do last more than 12% longer. This decision is just one of the many regular factors that have a marginal impact on the car and company. A significance level of 0.05 seems reasonable since neither a Type 1 nor a Type 2 Error should be dangerous or (relatively) much more expensive.
Example 4.39
The same car manufacturer is considering a slightly more expensive supplier for parts related to safety, not windows. If the durability of these safety components is shown to be better than the current supplier, they will switch manufacturers. Is there good reason to modify the significance level in such an evaluation?
The null hypothesis would be that the suppliers' parts are equally reliable. Because safety is involved, the car company should be eager to switch to the slightly more expensive manufacturer (reject H 0 ) even if the evidence of increased safety is only moderately strong. A slightly larger significance level, such as \(\alpha = 0.10\), might be appropriate.
Exercise 4.40
A part inside of a machine is very expensive to replace. However, the machine usually functions properly even if this part is broken, so the part is replaced only if we are extremely certain it is broken based on a series of measurements. Identify appropriate hypotheses for this test (in plain language) and suggest an appropriate significance level. 32
When we do testing we end up with two outcomes.
1) We reject null hypothesis
2) We fail to reject null hypothesis.
We do not talk about accepting alternative hypotheses. If we do not talk about accepting alternative hypothesis, why do we need to have alternative hypothesis at all?
Here is an update. Could somebody give me two examples:
1) a case where rejecting the null hypothesis is equivalent to accepting the alternative hypothesis
2) a case where rejecting the null hypothesis is not equivalent to accepting the alternative hypothesis
There was, historically, disagreement about whether an alternative hypothesis was necessary. Let me explain this point of disagreement by considering the opinions of Fisher and Neyman, within the context of frequentist statistics, and a Bayesian answer.
Fisher - We do not need an alternative hypothesis; we can simply test a null hypothesis using a goodness-of-fit test. The outcome is a $p$ -value, providing a measure of evidence against the null hypothesis.
Neyman - We must perform a hypothesis test between a null and an alternative. The test is such that it would result in type-1 errors at a fixed, pre-specified rate, $\alpha$ . The outcome is a decision - to reject or not reject the null hypothesis at the level $\alpha$ .
We need an alternative from a decision theoretic perspective - we are making a choice between two courses of action - and because we should report the power of the test $$ 1 - p\left(\textrm{Accept $H_0$} \, \middle|\, H_1\right) $$ We should seek the most powerful tests possible to have the best chance of rejecting $H_0$ when the alternative is true.
To satisfy both these points, the alternative hypothesis cannot be the vague 'not $H_0$ ' one.
Bayesian - We must consider at least two models and update their relative plausibility with data. With only a single model, we simply have $$ p(H_0) = 1 $$ no matter what data we collect. To make calculations in this framework, the alternative hypothesis (or model as it would be known in this context) cannot be the ill-defined 'not $H_0$ ' one. I call it ill-defined since we cannot write the model $p(\text{data}|\text{not }H_0)$ .
I will focus on "If we do not talk about accepting alternative hypothesis, why do we need to have alternative hypothesis at all?"
Because it helps us to choose a meaningful test statistic and design our study to have high power---a high chance of rejecting the null when the alternative is true. Without an alternative, we have no concept of power.
Imagine we only have a null hypothesis and no alternative. Then there's no guidance on how to choose a test statistic that will have high power. All we can say is, "Reject the null whenever you observe a test statistic whose value is unlikely under the null." We can pick something arbitrary: we could draw Uniform(0,1) random numbers and reject the null when they are below 0.05. This happens under the null "rarely," no more than 5% of the time---yet it's also just as rare when the null is false. So this is technically a statistical test, but it's meaningless as evidence for or against anything.
Instead, usually we have some scientifically-plausible alternative hypothesis ("There is a positive difference in outcomes between the treatment and control groups in my experiment"). We'd like to defend it against potential critics who would bring up the null hypothesis as devil's advocates ("I'm not convinced yet---maybe your treatment actually hurts, or has no effect at all , and any apparent difference in the data is due only to sampling variation").
With these two hypotheses in mind, now we can set up a powerful test, by choosing a test statistic whose typical values under the alternative are unlikely under the null. (A positive 2-sample t-statistic far from 0 would be unsurprising if the alternative is true, but surprising if the null is true.) Then we figure out the test statistic's sampling distribution under the null, so we can calculate p-values---and interpret them. When we observe a test statistic that's unlikely under the null, especially if the study design, sample size, etc. were chosen to have high power, this provides some evidence for the alternative.
So, why don't we talk about "accepting" the alternative hypothesis? Because even a high-powered study doesn't provide completely rigorous proof that the null is wrong. It's still a kind of evidence, but weaker than some other kinds of evidence.
I am not 100% sure if this is a formal requirement, but typically the null hypothesis and alternative hypothesis are: 1) mutually exclusive and 2) exhaustive. That is: 1) they cannot both be true at the same time; 2) if one is not true, the other must be true.
Consider a simple test of heights between girls and boys. A typical null hypothesis in this case is that $height_{boys} = height_{girls}$ . An alternative hypothesis would be $height_{boys} \ne height_{girls}$ . So if the null is not true, the alternative must be true.
Why do we need to have alternative hypothesis at all?
In a classical hypothesis test, the only mathematical role played by the alternative hypothesis is that it affects the ordering of the evidence through the chosen test statistic. The alternative hypothesis is used to determine the appropriate test statistic for the test, which is equivalent to setting an ordinal ranking of all possible data outcomes from those most conducive to the null hypothesis (against the stated alternative) to those least conducive to the null hypothesis (against the stated alternative). Once you have formed this ordinal ranking of the possible data outcomes, the alternative hypothesis plays no further mathematical role in the test.
You can find a related answer on this question here which gives a schematic diagram of the classical hypothesis test and how the alternative hypothesis enters into the test. This is a useful supplement to the present answer.
Formal explanation: In any classical hypothesis test with $n$ observable data values $\mathbf{x} = (x_1,...,x_n)$ you have some test statistic $T: \mathbb{R}^n \rightarrow \mathbb{R}$ that maps every possible outcome of the data onto an ordinal scale that measures whether it is more conducive to the null or alternative hypothesis. (Without loss of generality we will assume that lower values are more conducive to the null hypothesis and higher values are more conducive to the alternative hypothesis. We sometimes say that higher values of the test statistic are "more extreme" insofar as they constitute more extreme evidence for the alternative hypothesis.) The p-value of the test is then given by:
$$p(\mathbf{x}) \equiv p_T(\mathbf{x}) \equiv \mathbb{P}( T(\mathbf{X}) \geqslant T(\mathbf{x}) | H_0).$$
This p-value function fully determines the evidence in the test for any data vector. When combined with a chosen significance level, it determines the outcome of the test for any data vector. (We have described this for a fixed number of data points $n$ but this can easily be extended to allow for arbitrary $n$ .) It is important to note that the p-value is affected by the test statistic only through the ordinal scale it induces , so if you apply a monotonically increasing transformation to the test statistics, this makes no difference to the hypothesis test (i.e., it is the same test). This mathematical property merely reflects the fact that the sole purpose of the test statistic is to induce an ordinal scale on the space of all possible data vectors, to show which are more conducive to the null/alternative.
The alternative hypothesis affects this measurement only through the function $T$ , which is chosen based on the stated null and alternative hypotheses within the overall model. Hence, we can regard the test statistic function as being a function $T \equiv g (\mathcal{M}, H_0, H_A)$ of the overall model $\mathcal{M}$ and the two hypotheses. For example, for a likelihood-ratio-test the test statistic is formed by taking a ratio (or logarithm of a ratio) of supremums of the likelihood function over parameter ranges relating to the null and alternative hypotheses.
What does this mean if we compare tests with different alternatives? Suppose you have a fixed model $\mathcal{M}$ and you want to do two different hypothesis tests comparing the same null hypothesis $H_0$ against two different alternatives $H_A$ and $H_A'$ . In this case you will have two different test statistic functions:
$$T = g (\mathcal{M}, H_0, H_A) \quad \quad \quad \quad \quad T' = g (\mathcal{M}, H_0, H_A'),$$
leading to the corresponding p-value functions:
$$p(\mathbf{x}) = \mathbb{P}( T(\mathbf{X}) \geqslant T(\mathbf{x}) | H_0) \quad \quad \quad \quad \quad p'(\mathbf{x}) = \mathbb{P}( T'(\mathbf{X}) \geqslant T'(\mathbf{x}) | H_0).$$
It is important to note that if $T$ and $T'$ are monotonic increasing transformations of one another then the p-value functions $p$ and $p'$ are identical, so both tests are the same test. If the functions $T$ and $T'$ are not monotonic increasing transformations of one another then we have two genuinely different hypothesis tests.
The reason I wouldn't think of accepting the alternative hypothesis is because that's not what we are testing. Null hypothesis significance testing (NHST) calculates the probability of observing data as extreme as observed (or more) given that the null hypothesis is true; in other words, NHST calculates a probability value that is conditioned on the null hypothesis being true, $P(data|H_0)$ . So it is the probability of the data assuming that the null hypothesis is true. It never uses or gives the probability of a hypothesis (neither null nor alternative). Therefore when you observe a small p-value, all you know is that the data you observed appear to be unlikely under $H_0$ , so you are collecting evidence against the null and in favour of whatever your alternative explanation is.
Before you run the experiment, you can decide on a cut-off level ( $\alpha$ ) that deems your result significant, meaning if your p-value falls below that level, you conclude that the evidence against the null is so overwhelmingly high that the data must have originated from some other data generating process, and you reject the null hypothesis based on that evidence. If the p-value is above that level, you fail to reject the null hypothesis since your evidence is not substantial enough to believe that your sample came from a different data generating process.
The reason why you formulate an alternative hypothesis is because you likely had an experiment in mind before you started sampling. Formulating an alternative hypothesis can also determine whether you use a one-tailed or two-tailed test, hence giving you more statistical power (in the one-tailed scenario). But technically, in order to run the test you don't need to formulate an alternative hypothesis; you just need data.
In this post, you’ll learn how to perform t-tests in Python using the popular SciPy library . T-tests are used to test for statistical significance and can be hugely advantageous when working with smaller sample sizes.
By the end of this tutorial, you’ll have learned the following:
The t-test, often referred to as Student's t-test, dates back to the early 20th century. William Sealy Gosset, an English statistician working for the Guinness Brewery in Dublin, introduced the concept. Because the brewery was working with small sample sizes and was under strict orders of confidentiality, Gosset published his findings under the pseudonym "Student". His seminal work, "The Probable Error of a Mean," laid the groundwork for what we now know as Student's t-test.
This leads us to one of the primary benefits of the t-test: the t-test is able to make reliable inferences about a population using a small sample size . Let’s explore how this works by discussing the theory behind the t-test in the following section.
Statistical tests are used to make assumptions about some population parameters. For example, it lets us test whether or not the average test score for any given group of students is 70%. The T-Test works in two different ways:
Let’s explore these in a little more depth.
The one-sample t-test is used to test the null hypothesis that the population mean inferred from a sample is equal to some given value. It can be described as below:
There are actually three different alternative hypotheses:
We can use the following formula to calculate our test statistic: \(t = \dfrac{\bar{x} - \mu_0}{s / \sqrt{n}}\), where \(\bar{x}\) is the sample mean, \(\mu_0\) is the hypothesized population mean, \(s\) is the sample standard deviation, and \(n\) is the sample size.
We then need to calculate the p-value using degrees of freedom equal to n – 1. If the p-value is less than your chosen significance level, we can reject the null hypothesis and say that the means differ.
The two-sample t-test is used to test whether two population means are equal (or if they differ in a significant way). In this case, the null hypothesis assumes that the two population means are equal.
When we sample two different groups, we are almost guaranteed that their sample means will differ. But the t-test allows us to test whether or not this difference is different in a statistically significant way.
Similar to the one-sample t-test, there are three different alternative hypotheses:
The formula for the two-sample t-test (assuming equal variances, which is SciPy's default) can be written as: \(t = \dfrac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}\), where \(s_p = \sqrt{\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}\) is the pooled standard deviation.
We then need to calculate the p-value using degrees of freedom equal to \(n_1 + n_2 - 2\). If the p-value is less than your chosen significance level, we can reject the null hypothesis and say that the means differ.
Both types of t-tests follow a key set of assumptions, including:
It’s easy to test for these assumptions using Python (and I have included links to tutorials covering how to do this). Let’s take a look at example walkthroughs of how to conduct both of these tests in Python.
In this section, you’ll learn how to conduct a one-sample t-test in Python. Suppose you are a teacher and have just given a test. You know that the population mean for this test is 85% and you want to see whether the score of the class is significantly different from this population mean.
Let’s start by importing our required function, ttest_1samp() from SciPy and defining our data:
In the code block above, we first imported our required library. We then defined our sample as a list of values and defined our population mean as its own variable.
We can now pass these values into the function, as shown below:
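A minimal sketch, continuing with the hypothetical scores from above:

```python
from scipy.stats import ttest_1samp

# Hypothetical exam scores (made up for illustration)
scores = [88, 92, 79, 85, 90, 83, 87, 81, 86, 84]
population_mean = 85

t_stat, p_value = ttest_1samp(scores, population_mean)
print(f"t-statistic: {t_stat:.3f}")
print(f"p-value: {p_value:.3f}")
```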
The function returns a test statistic and the corresponding p-value. We can print these values out using f-strings to simplify the labeling , as shown above.
Finally, we can write a simple if-else statement to evaluate whether or not our sample mean is significantly different from the population mean:
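A sketch of that decision logic, again with the hypothetical scores:

```python
from scipy.stats import ttest_1samp

scores = [88, 92, 79, 85, 90, 83, 87, 81, 86, 84]  # hypothetical data
t_stat, p_value = ttest_1samp(scores, 85)

alpha = 0.05
if p_value < alpha:
    print("The sample mean is significantly different from the population mean.")
else:
    print("There is no significant difference from the population mean.")
```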
By running this if-else statement, we can see that our test indicates there is no significant difference in the exam scores.
In order to calculate the different one-sample t-test alternative hypotheses, we can use the alternative= parameter:
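For example (a sketch; the alternative= parameter requires SciPy 1.6 or newer):

```python
from scipy.stats import ttest_1samp

scores = [88, 92, 79, 85, 90, 83, 87, 81, 86, 84]  # hypothetical data

# 'two-sided' is the default; 'less' and 'greater' give one-sided tests
res_two_sided = ttest_1samp(scores, 85, alternative='two-sided')
res_less = ttest_1samp(scores, 85, alternative='less')        # mean < 85?
res_greater = ttest_1samp(scores, 85, alternative='greater')  # mean > 85?
```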
Now that you have a strong understanding of how to perform a one-sample t-test, let’s dive into the exciting world of two-sample t-tests!
A two-sample t-test is used to test whether the means of two samples are equal. The test requires that both samples be normally distributed, have similar variances, and be independent of one another.
Imagine that we want to compare the test scores of two different classes. This is the perfect example of when to use a t-test. Let’s begin by running a two-tailed test, which only evaluates whether or not the two means are equal. It begins with the null hypothesis, which states that the two means are equal.
Let’s take a look at how we can run a two-tailed t-test in Python:
We can see that the ttest_ind() function returns both a test statistic and a p-value. We can run a simple if-else statement to check whether or not we can reject or fail to reject the null hypothesis:
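Continuing with the hypothetical class scores:

```python
from scipy.stats import ttest_ind

class_1 = [72, 75, 78, 70, 74, 76, 73, 71, 77, 75]  # hypothetical data
class_2 = [85, 88, 83, 86, 89, 84, 87, 90, 82, 86]  # hypothetical data
t_stat, p_value = ttest_ind(class_1, class_2)

if p_value < 0.05:
    print("There is a significant difference between the two classes.")
else:
    print("There is no significant difference between the two classes.")
```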
We can see that there is a significant difference between the two sets of scores. However, the two-tailed test doesn’t tell us in which direction.
In order to do this, we need to use a right- or left-tailed two-sample t-test. To do this in SciPy, we use the alternative= parameter. By default, this is set to 'two-sided' . However, we can modify this to either 'less' or 'greater' , if we want to evaluate whether or not the mean for one sample is less than or greater than another.
Let’s see how we can check if the mean of class 2 is significantly higher than that of class 1:
Because our p-value is less than our defined value of 0.05, we can say that the mean of class 2 is higher with statistical significance.
In conclusion, this comprehensive guide has equipped you with the knowledge and practical skills to perform t-tests in Python using the SciPy library. T-tests are invaluable tools for assessing statistical significance, particularly when working with smaller sample sizes.
Throughout this tutorial, you’ve gained insights into:
Remember that t-tests come with certain assumptions, and it’s crucial to validate them before applying these tests to your data. Python provides tools to check these assumptions, ensuring the robustness and reliability of your statistical analyses.
To learn more about these functions, check out the official documentation for the one-sample t-test and for the two-sample t-test in SciPy.
Nik is the author of datagy.io and has over a decade of experience working with data analytics, data science, and Python. He specializes in teaching developers how to use Python for data science using hands-on tutorials.
A scientific hypothesis is a tentative, testable explanation for a phenomenon in the natural world. It's the initial building block in the scientific method . Many describe it as an "educated guess" based on prior knowledge and observation. While this is true, a hypothesis is more informed than a guess. While an "educated guess" suggests a random prediction based on a person's expertise, developing a hypothesis requires active observation and background research.
The basic idea of a hypothesis is that there is no predetermined outcome. For a solution to be termed a scientific hypothesis, it has to be an idea that can be supported or refuted through carefully crafted experimentation or observation. This concept, called falsifiability and testability, was advanced in the mid-20th century by Austrian-British philosopher Karl Popper in his famous book "The Logic of Scientific Discovery" (Routledge, 1959).
A key function of a hypothesis is to derive predictions about the results of future experiments and then perform those experiments to see whether they support the predictions.
A hypothesis is usually written in the form of an if-then statement, which gives a possibility (if) and explains what may happen because of the possibility (then). The statement could also include "may," according to California State University, Bakersfield .
A classic example of such a statement: "If a plant receives more sunlight, then it will grow taller."
A useful hypothesis should be testable and falsifiable. That means that it should be possible to prove it wrong. A theory that can't be proved wrong is nonscientific, according to Karl Popper's 1963 book " Conjectures and Refutations ."
An example of an untestable statement is, "Dogs are better than cats." That's because the definition of "better" is vague and subjective. However, an untestable statement can be reworded to make it testable. For example, the previous statement could be changed to this: "Owning a dog is associated with higher levels of physical fitness than owning a cat." With this statement, the researcher can take measures of physical fitness from dog and cat owners and compare the two.
In an experiment, researchers generally state their hypotheses in two ways. The null hypothesis predicts that there will be no relationship between the variables tested, or no difference between the experimental groups. The alternative hypothesis predicts the opposite: that there will be a difference between the experimental groups. This is usually the hypothesis scientists are most interested in, according to the University of Miami .
For example, a null hypothesis might state, "There will be no difference in the rate of muscle growth between people who take a protein supplement and people who don't." The alternative hypothesis would state, "There will be a difference in the rate of muscle growth between people who take a protein supplement and people who don't."
If the results of the experiment show a relationship between the variables, then the null hypothesis has been rejected in favor of the alternative hypothesis, according to the book " Research Methods in Psychology " (BCcampus, 2015).
There are other ways to describe an alternative hypothesis. The alternative hypothesis above does not specify a direction of the effect, only that there will be a difference between the two groups. That type of prediction is called a two-tailed hypothesis. If a hypothesis specifies a certain direction — for example, that people who take a protein supplement will gain more muscle than people who don't — it is called a one-tailed hypothesis, according to William M. K. Trochim , a professor of Policy Analysis and Management at Cornell University.
Sometimes, errors take place during an experiment. These errors can happen in one of two ways. A type I error is when the null hypothesis is rejected when it is true. This is also known as a false positive. A type II error occurs when the null hypothesis is not rejected when it is false. This is also known as a false negative, according to the University of California, Berkeley .
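The Type I error rate can be made concrete with a short simulation (an illustrative sketch, not taken from the article). When the null hypothesis is true, a test run at significance level 0.05 should falsely reject it about 5% of the time:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha = 0.05
n_trials = 2000
rejections = 0

for _ in range(n_trials):
    # Both groups come from the SAME distribution, so the null is true
    # and every rejection below is a Type I error (false positive).
    a = rng.normal(loc=0.0, scale=1.0, size=40)
    b = rng.normal(loc=0.0, scale=1.0, size=40)
    _, p = stats.ttest_ind(a, b)
    if p < alpha:
        rejections += 1

print(f"Observed Type I error rate: {rejections / n_trials:.3f}")  # near 0.05
```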
A hypothesis can be rejected or modified, but it can never be proved correct 100% of the time. For example, a scientist can form a hypothesis stating that if a certain type of tomato has a gene for red pigment, that type of tomato will be red. During research, the scientist then finds that each tomato of this type is red. Though the findings support the hypothesis, there may be a tomato of that type somewhere in the world that isn't red. Thus, the hypothesis is supported by the evidence, but it can never be proved with absolute certainty.
The best hypotheses are simple. They deal with a relatively narrow set of phenomena. But theories are broader; they generally combine multiple hypotheses into a general explanation for a wide range of phenomena, according to the University of California, Berkeley . For example, a hypothesis might state, "If animals adapt to suit their environments, then birds that live on islands with lots of seeds to eat will have differently shaped beaks than birds that live on islands with lots of insects to eat." After testing many hypotheses like these, Charles Darwin formulated an overarching theory: the theory of evolution by natural selection.
"Theories are the ways that we make sense of what we observe in the natural world," Tanner said. "Theories are structures of ideas that explain and interpret facts."
Encyclopedia Britannica, "Scientific Hypothesis," Jan. 13, 2022. https://www.britannica.com/science/scientific-hypothesis
Karl Popper, "The Logic of Scientific Discovery," Routledge, 1959.
California State University, Bakersfield, "Formatting a testable hypothesis." https://www.csub.edu/~ddodenhoff/Bio100/Bio100sp04/formattingahypothesis.htm
Karl Popper, "Conjectures and Refutations," Routledge, 1963.
Price, P., Jhangiani, R., & Chiang, I., "Research Methods in Psychology — 2nd Canadian Edition," BCcampus, 2015.
University of Miami, "The Scientific Method." http://www.bio.miami.edu/dana/161/evolution/161app1_scimethod.pdf
William M. K. Trochim, "Research Methods Knowledge Base." https://conjointly.com/kb/hypotheses-explained/
University of California, Berkeley, "Multiple Hypothesis Testing and False Discovery Rate." https://www.stat.berkeley.edu/~hhuang/STAT141/Lecture-FDR.pdf
University of California, Berkeley, "Science at multiple levels." https://undsci.berkeley.edu/article/0_0_0/howscienceworks_19
Title: Null hypothesis Bayes factor estimates can be biased in (some) common factorial designs: A simulation study
Abstract: Bayes factor null hypothesis tests provide a viable alternative to frequentist measures of evidence quantification. Bayes factors for realistic interesting models cannot be calculated exactly, but have to be estimated, which involves approximations to complex integrals. Crucially, the accuracy of these estimates, i.e., whether an estimated Bayes factor corresponds to the true Bayes factor, is unknown, and may depend on data, prior, and likelihood. We have recently developed a novel statistical procedure, namely simulation-based calibration (SBC) for Bayes factors, to test for a given analysis, whether the computed Bayes factors are accurate. Here, we use SBC for Bayes factors to test for some common cognitive designs, whether Bayes factors are estimated accurately. We use the bridgesampling/brms packages as well as the BayesFactor package in R. We find that Bayes factor estimates are accurate and exhibit only little bias in Latin square designs with (a) random effects for subjects only and (b) for crossed random effects for subjects and items, but a single fixed-factor. However, Bayes factor estimates turn out biased and liberal in a 2x2 design with crossed random effects for subjects and items. These results suggest that researchers should test for their individual analysis, whether Bayes factor estimates are accurate. Moreover, future research is needed to determine the boundary conditions under which Bayes factor estimates are accurate or biased, as well as software development to improve estimation accuracy.
Subjects: Methodology (stat.ME)
In order to test the different one-sample t-test alternative hypotheses, we can use the alternative= parameter: alternative='two-sided' is the default value, testing a two-sided alternative hypothesis; alternative='less' tests whether the population mean is less than the hypothesized value; and alternative='greater' tests whether it is greater.
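A minimal sketch of the three options on made-up data (the sample and the hypothesized mean of 5.0 are purely illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
sample = rng.normal(loc=4.6, scale=1.0, size=40)

# Two-sided: is the population mean different from 5.0?
_, p_two = stats.ttest_1samp(sample, popmean=5.0, alternative='two-sided')
# One-sided: is the population mean less than 5.0?
_, p_less = stats.ttest_1samp(sample, popmean=5.0, alternative='less')
# One-sided: is the population mean greater than 5.0?
_, p_greater = stats.ttest_1samp(sample, popmean=5.0, alternative='greater')

print(f"two-sided: {p_two:.4f}  less: {p_less:.4f}  greater: {p_greater:.4f}")
```

Note that the two one-sided p-values sum to one, and the two-sided p-value is twice the smaller of them.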
Formula sheet for Chapter 10, Statistical Inference Concerning Two Populations (BSTAT 3321). Test Statistics, Part 1-1: hypothesis test for the difference between two means when the population variances (σ₁² and σ₂²) are known (z-Test: Two-Sample for Means in Excel):

z = ((x̄₁ − x̄₂) − d₀) / √(σ₁²/n₁ + σ₂²/n₂)
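The statistic is straightforward to compute directly. The sketch below (input numbers are made up for illustration) mirrors the z-statistic for two means with known population variances term by term:

```python
import math
from scipy import stats

def two_sample_z(x1_bar, x2_bar, d0, sigma1, sigma2, n1, n2):
    """z-test for the difference of two means with known population variances."""
    se = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)  # standard error of the difference
    z = ((x1_bar - x2_bar) - d0) / se
    p = 2 * stats.norm.sf(abs(z))  # two-sided p-value from the standard normal
    return z, p

# Hypothetical inputs: sample means 5.2 and 4.8, testing d0 = 0 (no difference)
z, p = two_sample_z(x1_bar=5.2, x2_bar=4.8, d0=0.0,
                    sigma1=1.0, sigma2=1.2, n1=50, n2=60)
print(f"z = {z:.3f}, p = {p:.4f}")  # z ≈ 1.907
```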