
Statistics By Jim

Making statistics intuitive

One-Tailed and Two-Tailed Hypothesis Tests Explained

By Jim Frost

Choosing whether to perform a one-tailed or a two-tailed hypothesis test is one of the methodology decisions you might need to make for your statistical analysis. This choice can have critical implications for the types of effects it can detect, the statistical power of the test, and potential errors.

In this post, you’ll learn about the differences between one-tailed and two-tailed hypothesis tests and their advantages and disadvantages. I include examples of both types of statistical tests. In my next post, I cover the decision between one and two-tailed tests in more detail.

What Are Tails in a Hypothesis Test?

First, we need to cover some background material to understand the tails in a test. Typically, hypothesis tests take all of the sample data and convert it to a single value, which is known as a test statistic. You’re probably already familiar with some test statistics. For example, t-tests calculate t-values. F-tests, such as ANOVA, generate F-values. The chi-square test of independence and some distribution tests produce chi-square values. All of these values are test statistics. For more information, read my post about Test Statistics.

These test statistics follow a sampling distribution. Probability distribution plots display the probabilities of obtaining test statistic values when the null hypothesis is correct. On a probability distribution plot, the shaded area under the curve represents the probability that a value will fall within that range.

The graph below displays a sampling distribution for t-values. The two shaded regions cover the two tails of the distribution.

Plot that displays critical regions in the two tails of the distribution.

Keep in mind that this t-distribution assumes that the null hypothesis is correct for the population. Consequently, the peak (most likely value) of the distribution occurs at t=0, which represents the null hypothesis in a t-test. Typically, the null hypothesis states that there is no effect. As t-values move further away from zero, they represent larger effect sizes. When the null hypothesis is true for the population, obtaining samples that exhibit a large apparent effect becomes less likely, which is why the probabilities taper off for t-values further from zero.
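
To make this tapering concrete, here is a short illustrative sketch of my own (not from the post) that uses SciPy's t-distribution; the 20 degrees of freedom are an arbitrary assumption chosen only for illustration.

```python
from scipy import stats

df = 20  # degrees of freedom; an assumed value for illustration
for t in [0.0, 1.0, 2.0, 3.0]:
    # Probability of a t-value at least this far from zero, in either
    # direction, when the null hypothesis is true
    p = 2 * stats.t.sf(t, df)
    print(f"P(|T| >= {t}) = {p:.3f}")
```

The probabilities shrink quickly as the t-value moves away from zero, which is exactly the tapering shown in the plot.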

Related posts: How t-Tests Work and Understanding Probability Distributions

Critical Regions in a Hypothesis Test

In hypothesis tests, critical regions are ranges of the distributions where the values represent statistically significant results. Analysts define the size and location of the critical regions by specifying both the significance level (alpha) and whether the test is one-tailed or two-tailed.

Consider the following two facts:

  • The significance level is the probability of rejecting a null hypothesis that is correct.
  • The sampling distribution for a test statistic assumes that the null hypothesis is correct.

Consequently, to represent the critical regions on the distribution for a test statistic, you merely shade the appropriate percentage of the distribution. For the common significance level of 0.05, you shade 5% of the distribution.

Related posts: Significance Levels and P-values and T-Distribution Table of Critical Values

Two-Tailed Hypothesis Tests

Two-tailed hypothesis tests are also known as nondirectional and two-sided tests because you can test for effects in both directions. When you perform a two-tailed test, you split the significance level percentage between both tails of the distribution. In the example below, I use an alpha of 5% and the distribution has two shaded regions of 2.5% (2 * 2.5% = 5%).

When a test statistic falls in either critical region, your sample data are sufficiently incompatible with the null hypothesis that you can reject it for the population.
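
As an illustration of how the shaded percentages translate into critical values, here is a minimal sketch (my addition, not the post's code) that computes the critical region boundaries with SciPy; the 20 degrees of freedom are an assumed value.

```python
from scipy import stats

alpha, df = 0.05, 20  # df is an assumed value for illustration

# Two-tailed test: alpha is split between the tails (2.5% in each)
two_tailed = stats.t.ppf([alpha / 2, 1 - alpha / 2], df)
print(f"Two-tailed critical values: {two_tailed[0]:.3f}, {two_tailed[1]:.3f}")

# One-tailed (right-tailed) test: all 5% sits in the upper tail
one_tailed = stats.t.ppf(1 - alpha, df)
print(f"Right-tailed critical value: {one_tailed:.3f}")
```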

In a two-tailed test, the generic null and alternative hypotheses are the following:

  • Null : The effect equals zero.
  • Alternative :  The effect does not equal zero.

The specifics of the hypotheses depend on the type of test you perform because you might be assessing means, proportions, or rates.

Example of a two-tailed 1-sample t-test

Suppose we perform a two-sided 1-sample t-test where we compare the mean strength (4.1) of parts from a supplier to a target value (5). We use a two-tailed test because we care whether the mean is greater than or less than the target value.

To interpret the results, simply compare the p-value to your significance level. If the p-value is less than the significance level, you know that the test statistic fell into one of the critical regions, but which one? Just look at the estimated effect. In the output below, the t-value is negative, so we know that the test statistic fell in the critical region in the left tail of the distribution, indicating the mean is less than the target value. Now we know this difference is statistically significant.

Statistical output from a two-tailed 1-sample t-test.

We can conclude that the population mean for part strength is less than the target value. However, the test had the capacity to detect a positive difference as well. You can also assess the confidence interval. With a two-tailed hypothesis test, you’ll obtain a two-sided confidence interval. The confidence interval tells us that the population mean is likely to fall between 3.372 and 4.828. This range excludes the target value (5), which is another indicator of significance.
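
For readers who want to try this kind of analysis in software, here is a minimal Python sketch of a two-tailed 1-sample t-test. The strength values are hypothetical stand-ins (the post's raw data are not provided), so the output will not reproduce the exact numbers above.

```python
import numpy as np
from scipy import stats

target = 5
# Hypothetical strength measurements, not the post's actual data set
strength = np.array([4.3, 3.9, 4.5, 3.6, 4.0, 4.2, 3.8, 4.4, 4.1, 4.2])

t_stat, p_two_sided = stats.ttest_1samp(strength, popmean=target)
print(f"t = {t_stat:.3f}, two-sided p = {p_two_sided:.4f}")

# Two-sided 95% confidence interval for the population mean
mean, sem = strength.mean(), stats.sem(strength)
ci_low, ci_high = stats.t.interval(0.95, len(strength) - 1, loc=mean, scale=sem)
print(f"95% CI: ({ci_low:.3f}, {ci_high:.3f})")
```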

Advantages of two-tailed hypothesis tests

You can detect both positive and negative effects. Two-tailed tests are standard in scientific research where discovering any type of effect is usually of interest to researchers.

One-Tailed Hypothesis Tests

One-tailed hypothesis tests are also known as directional and one-sided tests because you can test for effects in only one direction. When you perform a one-tailed test, the entire significance level percentage goes into the extreme end of one tail of the distribution.

In the examples below, I use an alpha of 5%. Each distribution has one shaded region of 5%. When you perform a one-tailed test, you must determine whether the critical region is in the left tail or the right tail. The test can detect an effect only in the direction that has the critical region. It has absolutely no capacity to detect an effect in the other direction.

In a one-tailed test, you have two options for the null and alternative hypotheses, which correspond to where you place the critical region.

You can choose either of the following sets of generic hypotheses:

  • Null : The effect is less than or equal to zero.
  • Alternative : The effect is greater than zero.

Plot that displays a single critical region for a one-tailed test.

  • Null : The effect is greater than or equal to zero.
  • Alternative : The effect is less than zero.

Plot that displays a single critical region in the left tail for a one-tailed test.

Again, the specifics of the hypotheses depend on the type of test you perform.

Notice how, for both possible null hypotheses, the test can’t distinguish between zero and an effect in a particular direction. For example, in the set of hypotheses directly above, the null lumps “zero” and “greater than zero” together into a single category. That test can’t differentiate between zero and greater than zero.

Example of a one-tailed 1-sample t-test

Suppose we perform a one-tailed 1-sample t-test. We’ll use a similar scenario as before where we compare the mean strength of parts from a supplier (102) to a target value (100). Imagine that we are considering a new parts supplier. We will use them only if the mean strength of their parts is greater than our target value. There is no need for us to differentiate between whether their parts are equally strong or less strong than the target value—either way we’d just stick with our current supplier.

Consequently, we’ll choose the alternative hypothesis that states the mean difference is greater than zero (Population mean – Target value > 0). The null hypothesis states that the difference between the population mean and target value is less than or equal to zero.

Statistical output for a one-tailed 1-sample t-test.

To interpret the results, compare the p-value to your significance level. If the p-value is less than the significance level, you know that the test statistic fell into the critical region. For this study, the statistically significant result supports the notion that the population mean is greater than the target value of 100.

Confidence intervals for a one-tailed test are similarly one-sided. You’ll obtain either an upper bound or a lower bound. In this case, we get a lower bound, which indicates that the population mean is likely to be greater than or equal to 100.631. There is no upper limit to this range.

A lower bound matches our goal of determining whether the new parts are stronger than our target value. The fact that the lower bound (100.631) is higher than the target value (100) indicates that these results are statistically significant.

This test is unable to detect a negative difference even when the sample mean represents a very negative effect.
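
Here is a comparable sketch for the one-tailed (right-tailed) version, again with hypothetical strength values rather than the post's actual data; the `alternative` argument requires SciPy 1.6 or later.

```python
import numpy as np
from scipy import stats

target = 100
# Hypothetical strength measurements for illustration only
strength = np.array([103.2, 101.5, 100.8, 104.1, 102.6,
                     101.9, 103.5, 100.4, 102.2, 101.8])

t_stat, p_one_sided = stats.ttest_1samp(strength, popmean=target,
                                        alternative="greater")
print(f"t = {t_stat:.3f}, one-sided p = {p_one_sided:.4f}")

# One-sided 95% lower confidence bound for the population mean:
# mean - t(0.95, df) * standard error, with no upper limit.
mean, sem, df = strength.mean(), stats.sem(strength), len(strength) - 1
lower_bound = mean - stats.t.ppf(0.95, df) * sem
print(f"95% lower bound: {lower_bound:.3f}")
```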

Advantages and disadvantages of one-tailed hypothesis tests

One-tailed tests have more statistical power to detect an effect in one direction than a two-tailed test with the same design and significance level. One-tailed tests occur most frequently for studies where one of the following is true:

  • Effects can exist in only one direction.
  • Effects can exist in both directions but the researchers only care about an effect in one direction. There is no drawback to failing to detect an effect in the other direction. (Not recommended.)

The disadvantage of one-tailed tests is that they have no statistical power to detect an effect in the other direction.

As part of your pre-study planning process, determine whether you’ll use the one- or two-tailed version of a hypothesis test. To learn more about this planning process, read 5 Steps for Conducting Scientific Studies with Statistical Analyses.

This post explains the differences between one-tailed and two-tailed statistical hypothesis tests. How these forms of hypothesis tests function is clear and based on mathematics. However, there is some debate about when you can use one-tailed tests. My next post explores this decision in much more depth and explains the different schools of thought and my opinion on the matter: When Can I Use One-Tailed Hypothesis Tests.

If you’re learning about hypothesis testing and like the approach I use in my blog, check out my Hypothesis Testing book! You can find it at Amazon and other retailers.

Cover image of my Hypothesis Testing: An Intuitive Guide ebook.


Reader Interactions

June 26, 2022 at 12:14 pm

Hi, can you help me with figuring out the null and alternative hypothesis for the following statement? Some claimed that the real average expenditure on beverages by people in general is at least $10.

February 19, 2022 at 6:02 am

Thank you for the thorough explanation. I’m still struggling to wrap my mind around the t-table and the relation between the alpha values for one- or two-tail probability and the confidence levels on the bottom (I’m understanding it so wrongly that for me it should be the opposite, like one tail 0.05 should correspond to a 95% CI and two-tailed 0.025 should correspond to 95% because then you get 2.5% on each side). In my mind, if I picture the one-tail diagram with an alpha of 0.05, I see the remaining 95% inside the diagram, but for one tail I only see a 90% CI paired with a 5% alpha… where did the other 5% go? I tried to understand when you said we should just double the alpha for a one-tail probability in order to find the CI, but I still can’t picture it. I have been trying to understand this. Like if you only have one tail and there is 0.05, shouldn’t the rest be on the other side? Why is it then 90%? I know I’m missing a point and I can’t figure it out and it’s so frustrating…

February 23, 2022 at 10:01 pm

The alpha is the total shaded area. So, if the alpha = 0.05, you know that 5% of the distribution is shaded. The number of tails tells you how to divide the shaded areas. Is it all in one region (1-tailed) or do you split the shaded regions in two (2-tailed)?

So, for a one-tailed test with an alpha of 0.05, the 5% shading is all in one tail. If alpha = 0.10, then it’s 10% on one side. If it’s two-tailed, then you need to split that 10% in two: 5% in each tail. Hence, the 5% in a one-tailed test is the same as a two-tailed test with an alpha of 0.10 because that test has the same 5% on one side (but there’s another 5% in the other tail).

It’s similar for CIs. However, for CIs, you shade the middle rather than the extremities. I write about that in one of my articles about hypothesis testing and confidence intervals.

I’m not sure if I’m answering your question or not.

February 17, 2022 at 1:46 pm

I ran a post hoc Dunnett’s test alpha=0.05 after a significant Anova test in Proc Mixed using SAS. I want to determine if the means for treatment (t1, t2, t3) is significantly less than the means for control (p=pathogen). The code for the dunnett’s test is – LSmeans trt / diff=controll (‘P’) adjust=dunnett CL plot=control; I think the lower bound one tailed test is the correct test to run but I’m not 100% sure. I’m finding conflicting information online. In the output table for the dunnett’s test the mean difference between the control and the treatments is t1=9.8, t2=64.2, and t3=56.5. The control mean estimate is 90.5. The adjusted p-value by treatment is t1(p=0.5734), t2 (p=.0154) and t3(p=.0245). The adjusted lower bound confidence limit in order from t1-t3 is -38.8, 13.4, and 7.9. The adjusted upper bound for all test is infinity. The graphical output for the dunnett’s test in SAS is difficult to understand for those of us who are beginner SAS users. All treatments appear as a vertical line below the the horizontal line for control at 90.5 with t2 and t3 in the shaded area. For treatment 1 the shaded area is above the line for control. Looking at just the output table I would say that t2 and t3 are significantly lower than the control. I guess I would like to know if my interpretation of the outputs is correct that treatments 2 and 3 are statistically significantly lower than the control? Should I have used an upper bound one tailed test instead?

November 10, 2021 at 1:00 am

Thanks Jim. Please help me understand how a two tailed testing can be used to minimize errors in research

July 1, 2021 at 9:19 am

Hi Jim, Thanks for posting such a thorough and well-written explanation. It was extremely useful to clear up some doubts.

May 7, 2021 at 4:27 pm

Hi Jim, I followed your instructions for the Excel add-in. Thank you. I am very new to statistics and sort of enjoy it as I enter week number two in my class. I am to select whether three scenarios call for a one- or two-tailed test and why. The problem is stated:

30% of mole biopsies are unnecessary. Last month at his clinic, 210 out of 634 had benign biopsy results. Is there enough evidence to reject the dermatologist’s claim?

Part two, the wording changes to “more than of 30% of biopsies,” and part three, the wording changes to “less than 30% of biopsies…”

I am not asking for the problem to be solved for me, but I cannot seem to find the direction needed. I know the elements I am dealing with are =30%, greater than 30%, and less than 30%, along with 210 and 634. I just don’t know what to do with the information. I can’t seem to find an example of a similar problem to work with.

May 9, 2021 at 9:22 pm

As I detail in this post, a two-tailed test tells you whether an effect exists in either direction. Or, is it different from the null value in either direction. For the first example, the wording suggests you’d need a two-tailed test to determine whether the population proportion is ≠ 30%. Whenever you just need to know ≠, it suggests a two-tailed test because you’re covering both directions.

For part two, because it’s in one direction (greater than), you need a one-tailed test. Same for part three but it’s less than. Look in this blog post to see how you’d construct the null and alternative hypotheses for these cases. Note that you’re working with a proportion rather than the mean, but the principles are the same! Just plug your scenario and the concept of proportion into the wording I use for the hypotheses.

I hope that helps!

April 11, 2021 at 9:30 am

Hello Jim, great website! I am using a statistics program (SPSS) that does NOT compute one-tailed t-tests. I am trying to compare two independent groups and have justifiable reasons why I only care about one direction. Can I do the following? Use SPSS for two-tailed tests to calculate the t & p values. Then report the p-value as p/2 when it is in the predicted direction (e.g , SPSS says p = .04, so I report p = .02), and report the p-value as 1 – (p/2) when it is in the opposite direction (e.g., SPSS says p = .04, so I report p = .98)? If that is incorrect, what do you suggest (hopefully besides changing statistics programs)? Also, if I want to report confidence intervals, I realize that I would only have an upper or lower bound, but can I use the CI’s from SPSS to compute that? Thank you very much!

April 11, 2021 at 5:42 pm

Yes, for p-values, that’s absolutely correct for both cases.

For confidence intervals, if you take one endpoint of a two-sided CI, it becomes a one-sided bound with half the confidence level.

Consequently, to obtain a one-sided bound with your desired confidence level, you need to take your desired significance level (e.g., 0.05) and double it. Then subtract it from 1. So, if you’re using a significance level of 0.05, double that to 0.10 and then subtract from 1 (1 – 0.10 = 0.90). 90% is the confidence level you want to use for a two-sided test. After obtaining the two-sided CI, use one of the endpoints depending on the direction of your hypothesis (i.e., upper or lower bound). That produces the one-sided bound with the confidence level that you want. For our example, we calculated a 95% one-sided bound.
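
For anyone who wants to apply this workaround programmatically, here is a small sketch of the procedure described above, using made-up sample values rather than any real data set.

```python
import numpy as np
from scipy import stats

sample = np.array([12.1, 9.8, 11.4, 10.6, 12.9, 11.1, 10.2, 11.7])  # hypothetical data
mean, sem, df = sample.mean(), stats.sem(sample), len(sample) - 1

# Desired one-sided confidence level: 95%. Double alpha (0.05 -> 0.10) and
# build the corresponding two-sided 90% confidence interval.
low, high = stats.t.interval(0.90, df, loc=mean, scale=sem)

# For a "greater than" alternative, keep only the lower endpoint; the other
# side is unbounded.
print(f"95% one-sided lower bound: {low:.3f}")
```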

March 3, 2021 at 8:27 am

Hi Jim. I used the one-tailed(right) statistical test to determine an anomaly in the below problem statement: On a daily basis, I calculate the (mapped_%) in a common field between two tables.

The way I used the t-test is: On any particular day, I calculate the sample_mean, S.D. and sample_count (n=30) for the last 30 days including the current day. My null hypothesis is H0 (pop. mean) = 95 and H1 > 95 (alternate hypothesis). So, I calculate the t-stat based on the sample_mean, pop. mean, sample S.D. and n. I then choose the t-crit value for 0.05 from my t-distribution table for dof (n-1). On the current day, if my abs(t-stat) > t-crit, then I reject the null hypothesis and I say the mapped_pct on that day has passed the t-test.

I get some weird results here, where if my mapped_pct is as low as 6%-8% in all the past 30 days, the t-test still gets a “pass” result. Could you help on this? If my hypothesis needs to be changed.

I would basically look for the mapped_pct >95, if it worked on a static trigger. How can I use the t-test effectively in this problem statement?

December 18, 2020 at 8:23 pm

Hello Dr. Jim, I am wondering if there is evidence in one of your books or other source you could provide, which supports that it is OK not to divide alpha level by 2 in one-tailed hypotheses. I need the source for supporting evidence in a Portfolio exercise and couldn’t find one.

I am grateful for your reply and for your statistics knowledge sharing!

November 27, 2020 at 10:31 pm

If I did a one-directional F-test ANOVA (one tail) and wanted to calculate a confidence interval for each individual group’s mean (3 groups), would I use a one-tailed or two-tailed t within my confidence interval?

November 29, 2020 at 2:36 am

Hi Bashiru,

F-tests for ANOVA will always be one-tailed for the reasons I discuss in this post. To learn more, read my post about F-tests in ANOVA.

For the differences between my groups, I would not use t-tests because the family-wise error rate quickly grows out of hand. To learn more about how to compare group means while controlling the familywise error rate, read my post about using post hoc tests with ANOVA. Typically, these are two-sided intervals, but you’d be able to use one-sided versions.

November 26, 2020 at 10:51 am

Hi Jim, I had a question about the formulation of the hypotheses. When you want to test if a beta = 1 or a beta = 0, what will be the null hypothesis? I’m having trouble finding out, because in most cases beta = 0 is the null hypothesis, but in this case you want to test if beta = 0. So I’m having my doubts: can it in this case be the alternative hypothesis, or is it still the null hypothesis?

Kind regards, Noa

November 27, 2020 at 1:21 am

Typically, the null hypothesis represents no effect or no relationship. As an analyst, you’re hoping that your data have enough evidence to reject the null and favor the alternative.

Assuming you’re referring to beta as in regression coefficients, zero represents no relationship. Consequently, beta = 0 is the null hypothesis.

You might hope that beta = 1, but you don’t usually include that in your alternative hypotheses. The alternative hypothesis usually just states that the effect does not equal zero. In other words, there is an effect, but it doesn’t state what it is.

There are some exceptions to the above but I’m writing about the standard case.

November 22, 2020 at 8:46 am

Your articles are a help to intro to econometrics students. Keep up the good work! More power to you!

November 6, 2020 at 11:25 pm

Hello Jim. Can you help me with these please?

Write the null and alternative hypothesis using a 1-tailed and 2-tailed test for each problem. (In paragraph and symbols)

A teacher wants to know if there is a significant difference in the performance in MAT C313 between her morning and afternoon classes.

It is known that in our university canteen, the average waiting time for a customer to receive and pay for his/her order is 20 minutes. Additional personnel has been added and now the management wants to know if the average waiting time had been reduced.

November 8, 2020 at 12:29 am

I cover how to write the hypotheses for the different types of tests in this post. So, you just need to figure which type of test you need to use. In your case, you want to determine whether the mean waiting time is less than the target value of 20 minutes. That’s a 1-sample t-test because you’re comparing a mean to a target value (20 minutes). You specifically want to determine whether the mean is less than the target value. So, that’s a one-tailed test. And, you’re looking for a mean that is “less than” the target.

So, go to the one-tailed section in the post and look for the hypotheses for the effect being less than. That’s the one with the critical region on the left side of the curve.

Now, you need to include your own information. In your case, you’re comparing the sample estimate to a population mean of 20. The 20 minutes is your null hypothesis value. Use the symbol mu (μ) to represent the population mean.

You put all that together and you get the following:

Null: μ ≥ 20
Alternative: μ < 20

Feel free to use H 0 to denote the null hypothesis and H 1 or H A to denote the alternative hypothesis if that’s what you’ve been using in class.

October 17, 2020 at 12:11 pm

I was just wondering if you could please help with clarifying what the hypotheses would be for, say, income of gamblers and age of gamblers. I am struggling to find which means would be compared.

October 17, 2020 at 7:05 pm

Those are both continuous variables, so you’d use either correlation or regression for them. For both of those analyses, the hypotheses are the following:

Null: The correlation or regression coefficient equals zero (i.e., there is no relationship between the variables).
Alternative: The coefficient does not equal zero (i.e., there is a relationship between the variables).

When the p-value is less than your significance level, you reject the null and conclude that a relationship exists.

October 17, 2020 at 3:05 am

I was asked to choose between a one-tailed and a two-tailed test for dummy variables and justify the reason. How do I do that, and what does it mean?

October 17, 2020 at 7:11 pm

I don’t have enough information to answer your question. A dummy variable is also known as an indicator variable, which is a binary variable that indicates the presence or absence of a condition or characteristic. If you’re using this variable in a hypothesis test, I’d presume that you’re using a proportions test, which is based on the binomial distribution for binary data.

Choosing between a one-tailed or two-tailed test depends on subject area issues and, possibly, your research objectives. Typically, use a two-tailed test unless you have a very good reason to use a one-tailed test. To understand when you might use a one-tailed test, read my post about when to use a one-tailed hypothesis test .

October 16, 2020 at 2:07 pm

In your one-tailed example, Minitab describes the hypotheses as “Test of mu = 100 vs > 100”. Any idea why Minitab says the null is “=” rather than “= or less than”? No ASCII character for it?

October 16, 2020 at 4:20 pm

I’m not entirely sure even though I used to work there! I know we had some discussions about how to represent that hypothesis but I don’t recall the exact reasoning. I suspect that it has to do with the conclusions that you can draw. Let’s focus on failing to reject the null hypothesis. If the test statistic falls in that region (i.e., it is not significant), you fail to reject the null. In this case, all you know is that you have insufficient evidence to say it is different than 100. I’m pretty sure that’s why they use the equal sign, because it might as well be one.

Mathematically, I think using ≤ is more accurate, which you can really see when you look at the distribution plots. That’s why I phrase the hypotheses using ≤ or ≥ as needed. However, in terms of the interpretation, the “less than” portion doesn’t really add anything of importance. You can conclude that it’s equal to 100 or greater than 100, but not less than 100.

October 15, 2020 at 5:46 am

Thank you so much for your timely feedback. It helps a lot

October 14, 2020 at 10:47 am

How can i use one tailed test at 5% alpha on this problem?

A manufacturer of cellular phone batteries claims that when fully charged, the mean life of his product lasts for 26 hours with a standard deviation of 5 hours. Mr X, a regular distributor, randomly picked and tested 35 of the batteries. His test showed that the average life of his sample is 25.5 hours. Is there a significant difference between the average life of all the manufacturer’s batteries and the average battery life of his sample?

October 14, 2020 at 8:22 pm

I don’t think you’d want to use a one-tailed test. The goal is to determine whether the sample is significantly different than the manufacturer’s population average. You’re not saying significantly greater than or less than, which would be a one-tailed test. As phrased, you want a two-tailed test because it can detect a difference in either direction.

It sounds like you need to use a 1-sample t-test to test the mean. During this test, enter 26 as the test mean. The procedure will tell you if the sample mean of 25.5 hours is a significantly different from that test mean. Similarly, you’d need a one variance test to determine whether the sample standard deviation is significantly different from the test value of 5 hours.

For both of these tests, compare the p-value to your alpha of 0.05. If the p-value is less than this value, your results are statistically significant.

September 22, 2020 at 4:16 am

Hi Jim, I didn’t get an idea that when to use two tail test and one tail test. Will you please explain?

September 22, 2020 at 10:05 pm

I have a complete article dedicated to that: When Can I Use One-Tailed Tests .

Basically, start with the assumption that you’ll use a two-tailed test but then consider scenarios where a one-tailed test can be appropriate. I talk about all of that in the article.

If you have questions after reading that, please don’t hesitate to ask!

July 31, 2020 at 12:33 pm

Thank you so so much for this webpage.

I have two scenarios that I need some clarification. I will really appreciate it if you can take a look:

So I have several materials for which I know when they are tested after production. My hypothesis is that the earlier they are tested after production, the higher the mean value I should expect. At the same time, the later they are tested after production, the lower the mean value. Since this is more like a “greater or lesser” situation, I should use one tail. Is that the correct approach?

On the other hand, I have several mix of materials that I don’t know when they are tested after production. I only know the mean values of the test. And I only want to know whether one mean value is truly higher or lower than the other, I guess I want to know if they are only significantly different. Should I use two tail for this? If they are not significantly different, I can judge based on the mean values of test alone. And if they are significantly different, then I will need to do other type of analysis. Also, when I get my P-value for two tail, should I compare it to 0.025 or 0.05 if my confidence level is 0.05?

Thank you so much again.

July 31, 2020 at 11:19 pm

For your first, if you absolutely know that the mean must be lower the later the material is tested, that it cannot be higher, that would be a situation where you can use a one-tailed test. However, if that’s not a certainty, you’re just guessing, use a two-tail test. If you’re measuring different items at the different times, use the independent 2-sample t-test. However, if you’re measuring the same items at two time points, use the paired t-test. If it’s appropriate, using the paired t-test will give you more statistical power because it accounts for the variability between items. For more information, see my post about when it’s ok to use a one-tailed test .

For the mix of materials, use a two-tailed test because the effect truly can go either direction.

Always compare the p-value to your full significance level regardless of whether it’s a one or two-tailed test. Don’t divide the significance level in half.

June 17, 2020 at 2:56 pm

Is it possible that we reach opposite conclusions if we use the critical value method and the p-value method? Secondly, if we perform a one-tail test and use the p-value method to evaluate our H0, do we need to convert the significance value of a two-tailed test into the significance value of a one-tailed test? Can that be done just by dividing it by 2?

June 18, 2020 at 5:17 pm

The p-value method and critical value method will always agree as long as you’re not changing anything about the methodology.

If you’re using statistical software, you don’t need to make any adjustments. The software will do that for you.

However, if you calculating it by hand, you’ll need to take your significance level and then look in the table for your test statistic for a one-tailed test. For example, you’ll want to look up 5% for a one-tailed test rather than a two-tailed test. That’s not as simple as dividing by two. In this article, I show examples of one-tailed and two-tailed tests for the same degrees of freedom. The t critical value for the two-tailed test is +/- 2.086 while for the one-sided test it is 1.725. It is true that probability associated with those critical values doubles for the one-tailed test (2.5% -> 5%), but the critical value itself is not half (2.086 -> 1.725). Study the first several graphs in this article to see why that is true.

For the p-value, you can take a two-tailed p-value and divide by 2 to determine the one-sided p-value. However, if you’re using statistical software, it does that for you.
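
As a quick numerical check of the values quoted above (they correspond to 20 degrees of freedom, which I inferred from those critical values), the following SciPy snippet shows that the one-tailed critical value is not half the two-tailed one, while the tail probability it cuts off does double.

```python
from scipy import stats

df = 20  # inferred from the critical values quoted above

two_tailed_crit = stats.t.ppf(0.975, df)  # about 2.086
one_tailed_crit = stats.t.ppf(0.95, df)   # about 1.725

print(f"two-tailed critical value: +/-{two_tailed_crit:.3f}")
print(f"one-tailed critical value: {one_tailed_crit:.3f}")
# The upper-tail probability beyond each critical value: 0.025 vs 0.050
print(f"tail area beyond {two_tailed_crit:.3f}: {stats.t.sf(two_tailed_crit, df):.3f}")
print(f"tail area beyond {one_tailed_crit:.3f}: {stats.t.sf(one_tailed_crit, df):.3f}")
```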

June 11, 2020 at 3:46 pm

Hello Jim, if you have the time I’d be grateful if you could shed some clarity on this scenario:

“A researcher believes that aromatherapy can relieve stress but wants to determine whether it can also enhance focus. To test this, the researcher selected a random sample of students to take an exam in which the average score in the general population is 77. Prior to the exam, these students studied individually in a small library room where a lavender scent was present. If students in this group scored significantly above the average score in general population [is this one-tailed or two-tailed hypothesis?], then this was taken as evidence that the lavender scent enhanced focus.”

Thank you for your time if you do decide to respond.

June 11, 2020 at 4:00 pm

It’s unclear from the information provided whether the researchers used a one-tailed or two-tailed test. It could be either. A two-tailed test can detect effects in both directions, so it could definitely detect an average group score above the population score. However, you could also detect that effect using a one-tailed test if it was set up correctly. So, there’s not enough information in what you provided to know for sure. It could be either.

However, that’s irrelevant to answering the question. The tricky part, as I see it, is that you’re not entirely sure about why the scores are higher. Are they higher because the lavender scent increased concentration or are they higher because the subjects have lower stress from the lavender? Or, maybe it’s not even related to the scent but some other characteristic of the room or testing conditions in which they took the test. You just know the scores are higher but not necessarily why they’re higher.

I’d say that, no, it’s not necessarily evidence that the lavender scent enhanced focus. There are competing explanations for why the scores are higher. Also, it would be best to do this as an experiment with a control and treatment group where subjects are randomly assigned to either group. That process helps establish causality rather than just correlation and helps rule out competing explanations for why the scores are higher.

By the way, I spend a lot of time on these issues in my Introduction to Statistics ebook .

June 9, 2020 at 1:47 pm

If a left-tail test has an alpha value of 0.05, how will you find the value in the table?

April 19, 2020 at 10:35 am

Hi Jim, my question is in regards to the results in the table in your example of the one-sample t (two-tailed) test above. What about the p-value? The p-value listed is .018. I’m assuming that is compared to an alpha of 0.025, correct?

In regression analysis, when I get a test statistic for the predictive variable of -2.099 and a p-value of 0.039. Am I comparing the p-value to an alpha of 0.025 or 0.05? Now if I run a Bootstrap for coefficients analysis, the results say the sig (2-tail) is 0.098. What are the critical values and alpha in this case? I’m trying to reconcile what I am seeing in both tables.

Thanks for your help.

April 20, 2020 at 3:24 am

Hi Marvalisa,

For one-tailed tests, you don’t need to divide alpha in half. If you can tell your software to perform a one-tailed test, it’ll do all the calculations necessary so you don’t need to adjust anything. So, if you’re using an alpha of 0.05 for a one-tailed test and your p-value is 0.04, it is significant. The procedures adjust the p-values automatically and it all works out. So, whether you’re using a one-tailed or two-tailed test, you always compare the p-value to the alpha with no need to adjust anything. The procedure does that for you!

The exception would be if for some reason your software doesn’t allow you to specify that you want to use a one-tailed test instead of a two-tailed test. Then, you divide the p-value from a two-tailed test in half to get the p-value for a one-tailed test. You’d still compare it to your original alpha.

For regression, the same thing applies. If you want to use a one-tailed test for a coefficient, just divide the p-value in half if you can’t tell the software that you want a one-tailed test. The default is two-tailed. If your software has the option for one-tailed tests for any procedure, including regression, it’ll adjust the p-value for you. So, in the normal course of things, you won’t need to adjust anything.
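
Here is a hypothetical sketch of that regression workflow in Python (simulated data; the statsmodels package is an assumption, not something the reply mentions). The only manual "adjustment" is halving the default two-tailed p-value when the coefficient has the hypothesized sign.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
x = rng.normal(size=50)
y = 0.5 * x + rng.normal(size=50)   # simulated data for illustration

model = sm.OLS(y, sm.add_constant(x)).fit()
t_slope = model.tvalues[1]   # t-value for the slope coefficient
p_two = model.pvalues[1]     # default two-tailed p-value

# One-tailed p-value for the alternative "slope > 0"
p_one = p_two / 2 if t_slope > 0 else 1 - p_two / 2
print(f"t = {t_slope:.3f}, two-tailed p = {p_two:.4f}, one-tailed p = {p_one:.4f}")
```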

March 26, 2020 at 12:00 pm

Hey Jim, for a one-tailed hypothesis test with a .05 confidence level, should I use a 95% confidence interval or a 90% confidence interval? Thanks

March 26, 2020 at 5:05 pm

You should use a one-sided 95% confidence interval. One-sided CIs have either an upper OR lower bound but remain unbounded on the other side.

March 16, 2020 at 4:30 pm

This is not applicable to the subject but… When performing tests of equivalence, we look at the confidence interval of the difference between two groups, and we perform two one-sided t-tests for equivalence..

March 15, 2020 at 7:51 am

Thanks for this illustrative blogpost. I had a question on one of your points though.

By definition of H1 and H0, a two-sided alternate hypothesis is that there is a difference in means between the test and control. Not that anything is ‘better’ or ‘worse’.

Just because we observed a negative result in your example, does not mean we can conclude it’s necessarily worse, but instead just ‘different’.

Therefore while it enables us to spot the fact that there may be differences between test and control, we cannot make claims about directional effects. So I struggle to see why they actually need to be used instead of one-sided tests.

What’s your take on this?

March 16, 2020 at 3:02 am

Hi Dominic,

If you’ll notice, I carefully avoid stating better or worse because in a general sense you’re right. However, given the context of a specific experiment, you can conclude whether a negative value is better or worse. As always in statistics, you have to use your subject-area knowledge to help interpret the results. In some cases, a negative value is a bad result. In other cases, it’s not. Use your subject-area knowledge!

I’m not sure why you think that you can’t make claims about directional effects? Of course you can!

As for why you shouldn’t use one-tailed tests for most cases, read my post When Can I Use One-Tailed Tests . That should answer your questions.

May 10, 2019 at 12:36 pm

Your website is absolutely amazing Jim, you seem like the nicest guy for doing this and I like how there’s no ulterior motive, (I wasn’t automatically signed up for emails or anything when leaving this comment). I study economics and found econometrics really difficult at first, but your website explains it so clearly its been a big asset to my studies, keep up the good work!

May 10, 2019 at 2:12 pm

Thank you so much, Jack. Your kind words mean a lot!

April 26, 2019 at 5:05 am

Hy Jim I really need your help now pls

One-tailed and two- tailed hypothesis, is it the same or twice, half or unrelated pls

April 26, 2019 at 11:41 am

Hi Anthony,

I describe how the hypotheses are different in this post. You’ll find your answers.

February 8, 2019 at 8:00 am

Thank you for your blog Jim, I have a Statistics exam soon and your articles let me understand a lot!

February 8, 2019 at 10:52 am

You’re very welcome! I’m happy to hear that it’s been helpful. Best of luck on your exam!

January 12, 2019 at 7:06 am

Hi Jim, when you say the target value is 5, do you mean the population mean is 5 and we are trying to validate it with the help of the sample mean 4.1 using hypothesis tests? If it is so, how can we measure a population parameter as 5 when it is almost impossible to measure a population parameter? Please clarify.

January 12, 2019 at 6:57 pm

When you set a target for a one-sample test, it’s based on a value that is important to you. It’s not a population parameter or anything like that. The example in this post uses a case where we need parts that are stronger on average than a value of 5. We derive the value of 5 by using our subject area knowledge about what is required for a situation. Given our product knowledge for the hypothetical example, we know it should be 5 or higher. So, we use that in the hypothesis test and determine whether the population mean is greater than that target value.

When you perform a one-sample test, a target value is optional. If you don’t supply a target value, you simply obtain a confidence interval for the range of values that the parameter is likely to fall within. But, sometimes there is meaningful number that you want to test for specifically.

I hope that clarifies the rationale behind the target value!

November 15, 2018 at 8:08 am

I understand that in psychology a one-tailed hypothesis is preferred. Is that so?

November 15, 2018 at 11:30 am

No, there’s no overall preference for one-tailed hypothesis tests in statistics. That would be a study-by-study decision based on the types of possible effects. For more information about this decision, read my post: When Can I Use One-Tailed Tests?

November 6, 2018 at 1:14 am

I’m grateful to you for the explanations on One tail and Two tail hypothesis test. This opens my knowledge horizon beyond what an average statistics textbook can offer. Please include more examples in future posts. Thanks

November 5, 2018 at 10:20 am

Thank you. I will search it as well.

Stan Alekman

November 4, 2018 at 8:48 pm

Jim, what is the difference between the central and non-central t-distributions w/respect to hypothesis testing?

November 5, 2018 at 10:12 am

Hi Stan, this is something I will need to look into. I know central t-distribution is the common Student t-distribution, but I don’t have experience using non-central t-distributions. There might well be a blog post in that–after I learn more!

November 4, 2018 at 7:42 pm

this is awesome.


Statology

Statistics Made Easy

One-Tailed Hypothesis Tests: 3 Example Problems

In statistics, we use hypothesis tests to determine whether some claim about a population parameter is true or not.

Whenever we perform a hypothesis test, we always write a null hypothesis and an alternative hypothesis, which take the following forms:

H 0 (Null Hypothesis): Population parameter =, ≤, or ≥ some value

H A (Alternative Hypothesis): Population parameter <, >, or ≠ some value

There are two types of hypothesis tests:

  • Two-tailed test : Alternative hypothesis contains the ≠ sign
  • One-tailed test : Alternative hypothesis contains either < or > sign

In a one-tailed test , the alternative hypothesis contains the less than (“<“) or greater than (“>”) sign. This indicates that we’re testing whether or not there is a positive or negative effect.

Check out the following example problems to gain a better understanding of one-tailed tests.

Example 1: Factory Widgets

Suppose it’s assumed that the average weight of a certain widget produced at a factory is 20 grams. However, one engineer believes that a new method produces widgets that weigh less than 20 grams.

To test this, he can perform a one-tailed hypothesis test with the following null and alternative hypotheses:

  • H 0 (Null Hypothesis): μ ≥ 20 grams
  • H A (Alternative Hypothesis): μ < 20 grams

Note : We can tell this is a one-tailed test because the alternative hypothesis contains the less than ( < ) sign. Specifically, we would call this a left-tailed test because we’re testing if some population parameter is less than a specific value.

To test this, he uses the new method to produce 20 widgets and obtains the following information:

  • n = 20 widgets
  • x̄ = 19.8 grams
  • s = 3.1 grams

Plugging these values into the One Sample t-test Calculator , we obtain the following results:

  • t-test statistic: -0.288525
  • one-tailed p-value: 0.388

Since the p-value is not less than .05, the engineer fails to reject the null hypothesis.

He does not have sufficient evidence to say that the true mean weight of widgets produced by the new method is less than 20 grams.
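
Since only summary statistics are given, here is a short sketch (my addition, not part of the tutorial) showing how the same left-tailed test can be computed with SciPy from those numbers.

```python
from math import sqrt
from scipy import stats

n, xbar, s, mu0 = 20, 19.8, 3.1, 20

t_stat = (xbar - mu0) / (s / sqrt(n))
p_left = stats.t.cdf(t_stat, n - 1)  # left-tailed p-value

print(f"t = {t_stat:.6f}")  # about -0.2885
print(f"p = {p_left:.3f}")  # about 0.388
```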

Example 2: Plant Growth

Suppose a standard fertilizer has been shown to cause a species of plants to grow by an average of 10 inches. However, one botanist believes a new fertilizer can cause this species of plants to grow by an average of greater than 10 inches.

To test this, she can perform a one-tailed hypothesis test with the following null and alternative hypotheses:

  • H 0 (Null Hypothesis): μ ≤ 10 inches
  • H A (Alternative Hypothesis): μ > 10 inches

Note : We can tell this is a one-tailed test because the alternative hypothesis contains the greater than ( > ) sign. Specifically, we would call this a right-tailed test because we’re testing if some population parameter is greater than a specific value.

To test this claim, she applies the new fertilizer to a simple random sample of 15 plants and obtains the following information:

  • n = 15 plants
  • x̄ = 11.4 inches
  • s = 2.5 inches
  • t-test statistic: 2.1689
  • one-tailed p-value: 0.0239

Since the p-value is less than .05, the botanist rejects the null hypothesis.

She has sufficient evidence to conclude that the new fertilizer causes an average increase of greater than 10 inches.
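
The right-tailed version works the same way (again, a sketch of mine from the stated summary statistics), except the p-value comes from the upper tail, which SciPy exposes as the survival function; Example 3 below follows the same pattern.

```python
from math import sqrt
from scipy import stats

n, xbar, s, mu0 = 15, 11.4, 2.5, 10

t_stat = (xbar - mu0) / (s / sqrt(n))
p_right = stats.t.sf(t_stat, n - 1)  # right-tailed p-value (upper tail)

print(f"t = {t_stat:.4f}")   # about 2.1689
print(f"p = {p_right:.4f}")  # about 0.0239
```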

Example 3: Studying Method

A professor currently teaches students to use a studying method that results in an average exam score of 82. However, he believes a new studying method can produce exam scores with an average value greater than 82.

To test this, he can perform a one-tailed hypothesis test with the following null and alternative hypotheses:

  • H 0 (Null Hypothesis): μ ≤ 82
  • H A (Alternative Hypothesis): μ > 82

To test this claim, the professor has 25 students use the new studying method and then take the exam. He collects the following data on the exam scores for this sample of students:

  • t-test statistic: 3.6586
  • one-tailed p-value: 0.0006

Since the p-value is less than .05, the professor rejects the null hypothesis.

He has sufficient evidence to conclude that the new studying method produces exam scores with an average score greater than 82.

Additional Resources

The following tutorials provide additional information about hypothesis testing:

  • Introduction to Hypothesis Testing
  • What is a Directional Hypothesis?
  • When Do You Reject the Null Hypothesis?




One-Tailed Test Explained: Definition and Example


A one-tailed test is a statistical test in which the critical area of a distribution is one-sided so that it is either greater than or less than a certain value, but not both. If the sample being tested falls into the one-sided critical area, the null hypothesis is rejected in favor of the alternative hypothesis.

Financial analysts use the one-tailed test to test an investment or portfolio hypothesis.

Key Takeaways

  • A one-tailed test is a statistical hypothesis test set up to show that the sample mean would be higher or lower than the population mean, but not both.
  • When using a one-tailed test, the analyst is testing for the possibility of the relationship in one direction of interest and completely disregarding the possibility of a relationship in another direction.
  • Before running a one-tailed test, the analyst must set up a null and alternative hypothesis and establish a probability value (p-value).

A basic concept in inferential statistics is hypothesis testing. Hypothesis testing is run to determine whether a claim is true or not, given a population parameter. A test that is conducted to show whether the mean of the sample is significantly greater than or significantly less than the mean of a population is considered a two-tailed test. When the testing is set up to show that the sample mean would be higher or lower than the population mean, it is referred to as a one-tailed test. The one-tailed test gets its name from testing the area under one of the tails (sides) of a normal distribution, although the test can be used in other non-normal distributions.

Before the one-tailed test can be performed, null and alternative hypotheses must be established. A null hypothesis is a claim that the researcher hopes to reject. An alternative hypothesis is the claim supported by rejecting the null hypothesis.

A one-tailed test is also known as a directional hypothesis or directional test.

Example of the One-Tailed Test

Let's say an analyst wants to prove that a portfolio manager outperformed the S&P 500 index in a given year in which the index returned 16.91%. They may set up the null (H 0 ) and alternative (H a ) hypotheses as:

H 0 : μ ≤ 16.91

H a : μ > 16.91

The null hypothesis is the measurement that the analyst hopes to reject. The alternative hypothesis is the claim made by the analyst that the portfolio manager performed better than the S&P 500. If the outcome of the one-tailed test results in rejecting the null, the alternative hypothesis will be supported. On the other hand, if the outcome of the test fails to reject the null, the analyst may carry out further analysis and investigation into the portfolio manager’s performance.

The region of rejection is on only one side of the sampling distribution in a one-tailed test. To determine how the portfolio’s return on investment compares to the market index, the analyst must run an upper-tailed significance test in which extreme values fall in the upper tail (right side) of the normal distribution curve. The one-tailed test conducted in the upper or right tail area of the curve will show the analyst how much higher the portfolio return is than the index return and whether the difference is significant.

1%, 5% or 10%

The most common significance levels (alpha values) used in a one-tailed test.

Determining Significance in a One-Tailed Test

To determine how significant the difference in returns is, a significance level must be specified. The significance level is the probability of incorrectly concluding that the null hypothesis is false; it is commonly denoted by alpha, while p denotes the probability value calculated from the sample data. The significance level used in a one-tailed test is typically 1%, 5%, or 10%, although any other probability can be used at the discretion of the analyst or statistician. The p-value is calculated with the assumption that the null hypothesis is true. The lower the p-value, the stronger the evidence that the null hypothesis is false.

If the resulting p-value is less than 5%, the difference between both observations is statistically significant, and the null hypothesis is rejected. Following our example above, if the p-value = 0.03, or 3%, the result is significant at the 5% level, so the analyst rejects H 0 and supports the claim that the portfolio manager outperformed the index. The probability calculated in only one tail of a distribution is half the probability of a two-tailed test if the same data were analyzed with both hypothesis testing tools.

When using a one-tailed test, the analyst is testing for the possibility of the relationship in one direction of interest and completely disregarding the possibility of a relationship in another direction. Using our example above, the analyst is interested in whether a portfolio’s return is greater than the market’s. In this case, they do not need to statistically account for a situation in which the portfolio manager underperformed the S&P 500 index. For this reason, a one-tailed test is only appropriate when it is not important to test the outcome at the other end of a distribution.
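
As a hypothetical illustration of this upper-tailed setup in code: the return figures below are invented for the sketch, the 16.91% benchmark comes from the example above, and the `alternative="greater"` argument requires SciPy 1.6 or later.

```python
import numpy as np
from scipy import stats

benchmark = 16.91  # the index return used as the null hypothesis value
# Hypothetical portfolio returns over several periods (illustration only)
portfolio_returns = np.array([18.2, 15.9, 21.4, 17.8, 19.6, 16.3, 20.1, 18.9])

t_stat, p_value = stats.ttest_1samp(portfolio_returns, popmean=benchmark,
                                    alternative="greater")
print(f"t = {t_stat:.3f}, one-tailed p = {p_value:.4f}")
print("Reject H0" if p_value < 0.05 else "Fail to reject H0")
```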

How Do You Determine If It Is a One-Tailed or Two-Tailed Test?

A one-tailed test looks for an increase or decrease in a parameter. A two-tailed test looks for change, which could be a decrease or an increase.

What Is a One-Tailed T Test Used for?

A one-tailed t-test checks for the possibility of a relationship in one direction and does not consider a relationship in the other direction.

When Should a Two-Tailed Test Be Used?

You would use a two-tailed test when you want to test your hypothesis in both directions.



Statistical Inference

4.4 One-sided and two-sided tests

In the preceding section, you may have had some trouble when you were determining whether a research hypothesis is a null hypothesis or an alternative hypothesis. The research hypothesis stating that average media literacy is below 5.5 in the population, for example, represents the alternative hypothesis because it does not fix the hypothesized population value to one number. The accompanying null hypothesis must cover all other options, so it must state that the population mean is 5.5 or higher. But this null hypothesis does not specify one value as it should, right?

This null hypothesis is slightly different from the ones we have encountered so far, which equated the population value to a single value. If the null hypothesis equates a parameter to a single value, the null hypothesis can be rejected if the sample statistic is either too high or too low. There are two ways of rejecting the null hypothesis, so this type of hypothesis and test are called two-sided or two-tailed.

By contrast, the null hypothesis stating that the population mean is 5.5 or higher is a one-sided or one-tailed hypothesis. It can only be rejected if the sample statistic is at one side of the spectrum: only below (left-sided) or only above (right-sided) the hypothesized population value. In the media literacy example, the null hypothesis is only rejected if the sample mean is well below the hypothesized population value. A test of a one-sided null hypothesis is called a one-sided test.

Figure 4.3: One-sided and two-sided tests of a null hypothesis.

In a left-sided test of the media literacy hypothesis, the researcher is not interested in demonstrating that average media literacy among children can be larger than 5.5. She only wants to test if it is below 5.5, perhaps because an average score below 5.5 is alarming and requires an intervention, or because prior knowledge about the world has convinced her that average media literacy among children can only be lower than 5.5 on average in the population.

If it is deemed important to note values well over 5.5 as well as values well below 5.5, the research and null hypotheses should be two-sided. Then, a sample average well above 5.5 would also have resulted in a rejection of the null hypothesis. In a left-sided test, however, a high sample outcome cannot reject the null hypothesis.

4.4.1 Boundary value as hypothesized population value

Figure 4.4: Sampling distribution of average media literacy.

You may wonder how a one-sided null hypothesis equates the parameter of interest with one value as it should. The special value here is 5.5. If we can reject the null hypothesis stating that the population mean is 5.5 because our sample mean is sufficiently lower than 5.5, we can also reject any hypothesis involving population means higher than 5.5.

In other words, if you want to know if the value is not 5.5 or more, it is enough to find that it is less than 5.5. If it’s less than 5.5, then you know it’s also less than any number above 5.5. Therefore, we use the boundary value of a one-sided null hypothesis as the hypothesized value for the population in a one-sided test.
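The idea can be illustrated with a short sketch (my own, not part of the text): a left-sided one-sample t-test in SciPy uses the boundary value 5.5 as the hypothesized population mean. The sample scores below are made up for illustration.

```python
# A minimal sketch of a left-sided test of H0: mu >= 5.5 against H1: mu < 5.5.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
scores = rng.normal(loc=5.1, scale=1.2, size=87)   # hypothetical media literacy scores

# The p-value is computed as if mu were exactly 5.5 (the boundary of the null);
# rejecting that boundary case also rejects every null value above 5.5.
res = stats.ttest_1samp(scores, popmean=5.5, alternative="less")
print(f"t = {res.statistic:.3f}, one-sided p = {res.pvalue:.4f}")
```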

4.4.2 One-sided – two-sided distinction is not always relevant

Note that the difference between one-sided and two-sided tests is only useful if we test a statistic against one particular value or if we test the difference between two groups.

In the first situation, for example, if we test the null hypothesis that average media literacy is 5.5 in the population, we may only be interested in showing that the population value is lower than the hypothesized value. Another example is a test on a regression coefficient or correlation coefficient. According to the null hypothesis, the coefficient is zero in the population. If we only want to use a brand advertisement if exposure to the advertisement increases brand awareness among consumers, we apply a right-sided test to the coefficient for the effect of exposure on brand awareness, because we are only interested in a positive effect (larger than zero).

In the second situation, we compare the scores of two groups on a dependent variable. If we compare average media literacy after an intervention to media literacy before the intervention (paired-samples t test), we must demonstrate an increase in media literacy before we are going to use the intervention on a large scale. Again, a one-sided test can be applied.
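As a hedged illustration of such a right-sided paired test (my own sketch, with invented before/after scores), SciPy's paired-samples t-test accepts an alternative argument:

```python
# H0: mean(after - before) <= 0    H1: mean(after - before) > 0
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
before = rng.normal(loc=5.3, scale=1.0, size=40)          # hypothetical pre-intervention scores
after = before + rng.normal(loc=0.4, scale=0.8, size=40)  # hypothetical post-intervention scores

res = stats.ttest_rel(after, before, alternative="greater")
print(f"t = {res.statistic:.3f}, right-sided p = {res.pvalue:.4f}")
```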

In contrast, we cannot meaningfully formulate a one-sided null hypothesis if we are comparing three groups or more. Even if we expect that Group A can only score higher than Group B and Group C, what about the difference between Group B and Group C? If we can’t have meaningful one-sided null hypotheses, we cannot meaningfully distinguish between one-sided and two-sided null hypotheses.

4.4.3 From one-sided to two-sided p values and back again

Statistical software like SPSS usually reports either one-sided or two-sided p values. What if a one-sided p value is reported but you need a two-sided p value or the other way around?

In Figure 4.5 , the sample mean is 3.9 and we have .015 probability of finding a sample mean of 3.9 or less if the null hypothesis is true. This probability is the surface under the curve to the left of the red line representing the sample mean. It is the one-sided p value that we obtain if we only take into account the possibility that the population mean can be smaller than the hypothesized value. We are only interested in the left tail of the sampling distribution.

Figure 4.5: Halve a two-sided p value to obtain a one-sided p value, double a one-sided p value to obtain a two-sided p value.

In a two-sided test, we have to take into account two different types of outcomes. Our sample outcome can be smaller or larger than the hypothesized population value. As a consequence, the p value must cover samples at opposite sides of the sampling distribution. We should not only take into account sample means that are smaller than 5.5 but also sample means that are just as much larger than the hypothesized population value. So our two-sided p value must include both the probability of .015 for the left tail and for the right tail of the distribution in Figure 4.5 . We must double the one-sided p value to obtain the two-sided p value.

In contrast, if our statistical software tells us the two-sided p value and we want to have the one-sided p value, we can simply halve the two-sided p value. The two-sided p value is divided equally between the left and right tails. If we are interested in just one tail, we can ignore the half of the p value that is situated in the other tail. Of course, this only makes sense if a one-sided test makes sense.

Be careful if you divide a two-sided p value to obtain a one-sided p value. If your left-sided test hypothesizes that average media literacy is below 5.5 but your sample mean is well above 5.5, the two-sided p value can be below .05. But your left-sided test can never be significant because a sample mean above 5.5 is fully in line with the null hypothesis. Check that the sample outcome is at the correct side of the hypothesized population value.
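A minimal sketch of this bookkeeping, assuming a symmetric test statistic; the helper function and its inputs are my own illustration, not part of the text:

```python
def one_sided_p(two_sided_p, effect, hypothesized_direction):
    """Convert a two-sided p value to a left- or right-sided p value.

    effect: sample outcome minus hypothesized value (e.g. sample mean - 5.5)
    hypothesized_direction: 'less' or 'greater' (the direction of H1)
    """
    in_predicted_direction = (effect < 0) if hypothesized_direction == "less" else (effect > 0)
    if in_predicted_direction:
        return two_sided_p / 2       # sample lies in the tail we test
    return 1 - two_sided_p / 2       # sample lies in the other tail: never significant

# Media literacy example: two-sided p = .030, hypothesized value 5.5
print(one_sided_p(0.030, effect=3.9 - 5.5, hypothesized_direction="less"))   # 0.015
print(one_sided_p(0.030, effect=6.9 - 5.5, hypothesized_direction="less"))   # 0.985
```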


Institute for Digital Research and Education

FAQ: What are the differences between one-tailed and two-tailed tests?

When you conduct a test of statistical significance, whether it is from a correlation, an ANOVA, a regression or some other kind of test, you are given a p-value somewhere in the output.  If your test statistic is symmetrically distributed, you can select one of three alternative hypotheses. Two of these correspond to one-tailed tests and one corresponds to a two-tailed test.  However, the p-value presented is (almost always) for a two-tailed test.  But how do you choose which test?  Is the p-value appropriate for your test? And, if it is not, how can you calculate the correct p-value for your test given the p-value in your output?  

What is a two-tailed test?

First let’s start with the meaning of a two-tailed test. If you are using a significance level of 0.05, a two-tailed test allots half of your alpha to testing the statistical significance in one direction and half of your alpha to testing statistical significance in the other direction. This means that .025 is in each tail of the distribution of your test statistic. When using a two-tailed test, regardless of the direction of the relationship you hypothesize, you are testing for the possibility of the relationship in both directions. For example, we may wish to compare the mean of a sample to a given value x using a t-test. Our null hypothesis is that the mean is equal to x. A two-tailed test will test both if the mean is significantly greater than x and if the mean is significantly less than x. The mean is considered significantly different from x if the test statistic is in the top 2.5% or bottom 2.5% of its probability distribution, resulting in a p-value less than 0.05.

What is a one-tailed test?

Next, let’s discuss the meaning of a one-tailed test.  If you are using a significance level of .05, a one-tailed test allots all of your alpha to testing the statistical significance in the one direction of interest.  This means that .05 is in one tail of the distribution of your test statistic. When using a one-tailed test, you are testing for the possibility of the relationship in one direction and completely disregarding the possibility of a relationship in the other direction.  Let’s return to our example comparing the mean of a sample to a given value x using a t-test.  Our null hypothesis is that the mean is equal to x . A one-tailed test will test either if the mean is significantly greater than x or if the mean is significantly less than x , but not both. Then, depending on the chosen tail, the mean is significantly greater than or less than x if the test statistic is in the top 5% of its probability distribution or bottom 5% of its probability distribution, resulting in a p-value less than 0.05.  The one-tailed test provides more power to detect an effect in one direction by not testing the effect in the other direction. A discussion of when this is an appropriate option follows.   
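A small sketch of both versions of this t-test in SciPy, using made-up data and a made-up hypothesized value x; the alternative argument selects the tail(s):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
sample = rng.normal(loc=52, scale=10, size=60)   # hypothetical scores
x = 50                                           # hypothesized mean

two_sided = stats.ttest_1samp(sample, popmean=x)                         # H1: mean != x
greater   = stats.ttest_1samp(sample, popmean=x, alternative="greater")  # H1: mean > x
less      = stats.ttest_1samp(sample, popmean=x, alternative="less")     # H1: mean < x

print(f"two-sided p = {two_sided.pvalue:.4f}")
print(f"one-sided (greater) p = {greater.pvalue:.4f}")  # half the two-sided p when t > 0
print(f"one-sided (less) p = {less.pvalue:.4f}")        # 1 - two-sided/2 when t > 0
```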

When is a one-tailed test appropriate?

Because the one-tailed test provides more power to detect an effect, you may be tempted to use a one-tailed test whenever you have a hypothesis about the direction of an effect. Before doing so, consider the consequences of missing an effect in the other direction.  Imagine you have developed a new drug that you believe is an improvement over an existing drug.  You wish to maximize your ability to detect the improvement, so you opt for a one-tailed test. In doing so, you fail to test for the possibility that the new drug is less effective than the existing drug.  The consequences in this example are extreme, but they illustrate a danger of inappropriate use of a one-tailed test.

So when is a one-tailed test appropriate? If you consider the consequences of missing an effect in the untested direction and conclude that they are negligible and in no way irresponsible or unethical, then you can proceed with a one-tailed test. For example, imagine again that you have developed a new drug. It is cheaper than the existing drug and, you believe, no less effective.  In testing this drug, you are only interested in testing if it less effective than the existing drug.  You do not care if it is significantly more effective.  You only wish to show that it is not less effective. In this scenario, a one-tailed test would be appropriate. 

When is a one-tailed test NOT appropriate?

Choosing a one-tailed test for the sole purpose of attaining significance is not appropriate.  Choosing a one-tailed test after running a two-tailed test that failed to reject the null hypothesis is not appropriate, no matter how "close" to significant the two-tailed test was.  Using statistical tests inappropriately can lead to invalid results that are not replicable and highly questionable–a steep price to pay for a significance star in your results table!   

Deriving a one-tailed test from two-tailed output

The default among statistical packages performing tests is to report two-tailed p-values.  Because the most commonly used test statistic distributions (standard normal, Student’s t) are symmetric about zero, most one-tailed p-values can be derived from the two-tailed p-values.   

Below, we have the output from a two-sample t-test in Stata.  The test is comparing the mean male score to the mean female score.  The null hypothesis is that the difference in means is zero.  The two-sided alternative is that the difference in means is not zero.  There are two one-sided alternatives that one could opt to test instead: that the male score is higher than the female score (diff  > 0) or that the female score is higher than the male score (diff < 0).  In this instance, Stata presents results for all three alternatives.  Under the headings Ha: diff < 0 and Ha: diff > 0 are the results for the one-tailed tests. In the middle, under the heading Ha: diff != 0 (which means that the difference is not equal to 0), are the results for the two-tailed test. 

Two-sample t test with equal variances
------------------------------------------------------------------------------
   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
    male |      91    50.12088    1.080274    10.30516    47.97473    52.26703
  female |     109    54.99083    .7790686    8.133715    53.44658    56.53507
---------+--------------------------------------------------------------------
combined |     200      52.775    .6702372    9.478586    51.45332    54.09668
---------+--------------------------------------------------------------------
    diff |           -4.869947    1.304191               -7.441835   -2.298059
------------------------------------------------------------------------------
Degrees of freedom: 198

                 Ho: mean(male) - mean(female) = diff = 0

     Ha: diff < 0              Ha: diff != 0              Ha: diff > 0
       t =  -3.7341              t =  -3.7341               t =  -3.7341
   P < t =   0.0001         P > |t| =   0.0002           P > t =   0.9999

Note that the test statistic, -3.7341, is the same for all of these tests. The two-tailed p-value is P > |t|. This can be rewritten as P(>3.7341) + P(< -3.7341). Because the t-distribution is symmetric about zero, these two probabilities are equal: P > |t| = 2 * P(< -3.7341). Thus, we can see that the two-tailed p-value is twice the one-tailed p-value for the alternative hypothesis that (diff < 0). The other one-tailed alternative hypothesis has a p-value of P(> -3.7341) = 1 - P(< -3.7341) = 1 - 0.0001 = 0.9999. So, depending on the direction of the one-tailed hypothesis, its p-value is either 0.5*(two-tailed p-value) or 1 - 0.5*(two-tailed p-value) if the test statistic is symmetrically distributed about zero.

In this example, the two-tailed p-value suggests rejecting the null hypothesis of no difference. Had we opted for the one-tailed test of (diff > 0), we would fail to reject the null because of our choice of tails. 
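The relationships among these three p-values can be checked outside Stata; the following sketch (mine) reproduces them from the reported t statistic and degrees of freedom using SciPy's t distribution:

```python
from scipy import stats

t, df = -3.7341, 198
p_less    = stats.t.cdf(t, df)          # Ha: diff < 0   -> approx. 0.0001
p_two     = 2 * stats.t.sf(abs(t), df)  # Ha: diff != 0  -> approx. 0.0002
p_greater = stats.t.sf(t, df)           # Ha: diff > 0   -> approx. 0.9999
print(round(p_less, 4), round(p_two, 4), round(p_greater, 4))
```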

The output below is from a regression analysis in Stata.  Unlike the example above, only the two-sided p-values are presented in this output.

      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  2,   197) =   46.58
       Model |  7363.62077     2  3681.81039           Prob > F      =  0.0000
    Residual |  15572.5742   197  79.0486001           R-squared     =  0.3210
-------------+------------------------------           Adj R-squared =  0.3142
       Total |   22936.195   199  115.257261           Root MSE      =  8.8909

------------------------------------------------------------------------------
       socst |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     science |   .2191144   .0820323     2.67   0.008     .0573403    .3808885
        math |   .4778911   .0866945     5.51   0.000     .3069228    .6488594
       _cons |   15.88534   3.850786     4.13   0.000     8.291287    23.47939
------------------------------------------------------------------------------

For each regression coefficient, the tested null hypothesis is that the coefficient is equal to zero.  Thus, the one-tailed alternatives are that the coefficient is greater than zero and that the coefficient is less than zero. To get the p-value for the one-tailed test of the variable science having a coefficient greater than zero, you would divide the .008 by 2, yielding .004 because the effect is going in the predicted direction. This is P(>2.67). If you had made your prediction in the other direction (the opposite direction of the model effect), the p-value would have been 1 – .004 = .996.  This is P(<2.67). For all three p-values, the test statistic is 2.67. 
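The same arithmetic can be sketched with SciPy for the science coefficient (t = 2.67 with 197 residual degrees of freedom, as in the output above):

```python
from scipy import stats

t, df = 2.67, 197
p_two = 2 * stats.t.sf(t, df)   # approx. 0.008, the reported two-tailed p value
p_pos = stats.t.sf(t, df)       # H1: coefficient > 0 -> approx. 0.004
p_neg = stats.t.cdf(t, df)      # H1: coefficient < 0 -> approx. 0.996
print(round(p_two, 3), round(p_pos, 3), round(p_neg, 3))
```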



An Introduction to Statistics: Understanding Hypothesis Testing and Statistical Errors

Priya Ranganathan

1 Department of Anesthesiology, Critical Care and Pain, Tata Memorial Hospital, Mumbai, Maharashtra, India

2 Department of Surgical Oncology, Tata Memorial Centre, Mumbai, Maharashtra, India

The second article in this series on biostatistics covers the concepts of sample, population, research hypotheses and statistical errors.

How to cite this article

Ranganathan P, Pramesh CS. An Introduction to Statistics: Understanding Hypothesis Testing and Statistical Errors. Indian J Crit Care Med 2019;23(Suppl 3):S230–S231.

Two papers quoted in this issue of the Indian Journal of Critical Care Medicine report the results of studies that aim to prove that a new intervention is better than (superior to) an existing treatment. In the ABLE study, the investigators wanted to show that transfusion of fresh red blood cells would be superior to standard-issue red cells in reducing 90-day mortality in ICU patients. 1 The PROPPR study was designed to prove that transfusion of a lower ratio of plasma and platelets to red cells would be superior to a higher ratio in decreasing 24-hour and 30-day mortality in critically ill patients. 2 These studies are known as superiority studies (as opposed to noninferiority or equivalence studies which will be discussed in a subsequent article).

SAMPLE VERSUS POPULATION

A sample represents a group of participants selected from the entire population. Since studies cannot be carried out on entire populations, researchers choose samples, which are representative of the population. This is similar to walking into a grocery store and examining a few grains of rice or wheat before purchasing an entire bag; we assume that the few grains that we select (the sample) are representative of the entire sack of grains (the population).

The results of the study are then extrapolated to generate inferences about the population. We do this using a process known as hypothesis testing. This means that the results of the study may not always be identical to the results we would expect to find in the population; i.e., there is the possibility that the study results may be erroneous.

HYPOTHESIS TESTING

A clinical trial begins with an assumption or belief, and then proceeds to either prove or disprove this assumption. In statistical terms, this belief or assumption is known as a hypothesis. Counterintuitively, what the researcher believes in (or is trying to prove) is called the “alternate” hypothesis, and the opposite is called the “null” hypothesis; every study has a null hypothesis and an alternate hypothesis. For superiority studies, the alternate hypothesis states that one treatment (usually the new or experimental treatment) is superior to the other; the null hypothesis states that there is no difference between the treatments (the treatments are equal). For example, in the ABLE study, we start by stating the null hypothesis—there is no difference in mortality between groups receiving fresh RBCs and standard-issue RBCs. We then state the alternate hypothesis—there is a difference between groups receiving fresh RBCs and standard-issue RBCs. It is important to note that we have stated that the groups are different, without specifying which group will be better than the other. This is known as a two-tailed hypothesis and it allows us to test for superiority on either side (using a two-sided test). This is because, when we start a study, we are not 100% certain that the new treatment can only be better than the standard treatment—it could be worse, and if it is so, the study should pick it up as well. A one-tailed hypothesis and one-sided statistical testing are used for noninferiority studies, which will be discussed in a subsequent paper in this series.

STATISTICAL ERRORS

There are two possibilities to consider when interpreting the results of a superiority study. The first possibility is that there is truly no difference between the treatments but the study finds that they are different. This is called a Type-1 error or false-positive error or alpha error. This means falsely rejecting the null hypothesis.

The second possibility is that there is a difference between the treatments and the study does not pick up this difference. This is called a Type 2 error or false-negative error or beta error. This means falsely accepting the null hypothesis.

The power of the study is the ability to detect a difference between groups and is the converse of the beta error; i.e., power = 1-beta error. Alpha and beta errors are finalized when the protocol is written and form the basis for sample size calculation for the study. In an ideal world, we would not like any error in the results of our study; however, we would need to do the study in the entire population (infinite sample size) to be able to get a 0% alpha and beta error. These two errors enable us to do studies with realistic sample sizes, with the compromise that there is a small possibility that the results may not always reflect the truth. The basis for this will be discussed in a subsequent paper in this series dealing with sample size calculation.

Conventionally, type 1 or alpha error is set at 5%. This means, that at the end of the study, if there is a difference between groups, we want to be 95% certain that this is a true difference and allow only a 5% probability that this difference has occurred by chance (false positive). Type 2 or beta error is usually set between 10% and 20%; therefore, the power of the study is 90% or 80%. This means that if there is a difference between groups, we want to be 80% (or 90%) certain that the study will detect that difference. For example, in the ABLE study, sample size was calculated with a type 1 error of 5% (two-sided) and power of 90% (type 2 error of 10%) (1).

Table 1 gives a summary of the two types of statistical errors with an example.

Table 1: Statistical errors

(a) Types of statistical errors

  • The null hypothesis is actually true and the study concludes it is true: correct result.
  • The null hypothesis is actually true but the study concludes it is false: falsely rejecting the null hypothesis (Type I error).
  • The null hypothesis is actually false but the study concludes it is true: falsely accepting the null hypothesis (Type II error).
  • The null hypothesis is actually false and the study concludes it is false: correct result.

(b) Possible statistical errors in the ABLE trial

  • Truth: there is no difference in mortality between groups receiving fresh RBCs and standard-issue RBCs; the study finds no difference: correct result.
  • Truth: there is no difference in mortality between the groups; the study finds a difference: falsely rejecting the null hypothesis (Type I error).
  • Truth: there is a difference in mortality between the groups; the study finds no difference: falsely accepting the null hypothesis (Type II error).
  • Truth: there is a difference in mortality between the groups; the study finds a difference: correct result.

In the next article in this series, we will look at the meaning and interpretation of ‘ p ’ value and confidence intervals for hypothesis testing.

Source of support: Nil

Conflict of interest: None


Data analysis: hypothesis testing


4.3 One-sided tests

As well as non-directional hypotheses, you will also encounter hypotheses that are stated with a less than or equal to (≤) or a greater than (>) sign (as you saw in Activity 3). This is called a directional hypothesis. A directional hypothesis is a type of research hypothesis that aims to predict the direction of the relationship or difference between two variables. Essentially, it specifies the anticipated outcome of a study prior to the collection of data.

For example, a directional hypothesis might propose that a marketing campaign will increase product sales, predicting the direction of the relationship (i.e. the marketing campaign will lead to an increase in product sales). In contrast, a non-directional hypothesis simply states that there is a relationship between two variables without specifying the direction of that relationship, such as: ‘There is a relationship between the marketing campaign and product sales.’

Directional hypotheses are often preferred in scientific research because they provide a more precise and focused prediction than non-directional hypotheses. In business management, a directional hypothesis can also be a useful tool. For example, a company may use a directional hypothesis to design a study that examines the effectiveness of a marketing campaign in enhancing sales. This approach provides a clearer understanding of the impact of the campaign and enables the company to make more informed decisions about future marketing strategies.

A one-tailed test is a statistical test employed to evaluate a directional hypothesis, which predicts the direction of the difference or association between two variables. Its objective is to ascertain if the data supports the anticipated direction.

To illustrate, consider the hypotheses from Activity 3:

H 0 : µ ≤ 15 hours of studies

H a : µ > 15 hours of studies

The null hypothesis (H 0 ) posits that the population mean (µ) is less than or equal to 15 hours of studies, while the alternative hypothesis (H a ) predicts that the population mean is greater than 15 hours of studies.

To conduct a one-tailed test, a critical value must be established to determine whether the null hypothesis should be rejected or retained. Typically, a significance level (α) is set for this purpose. For instance, assuming α = 0.05, the z-score for a one-tailed test with α = 0.05 in a normal distribution is 1.645. Consequently, the null hypothesis would be rejected if the z-score exceeds 1.645. In other words, only the upper tail region of the distribution serves as the rejection region for this one-tailed test. Additionally, you employ a distinct z-score since, in contrast to a two-tailed test, the alpha level does not need to be divided by two. In a normal distribution, the area in the tail above z = +1.645 represents 0.05 (5%) of the distribution. This portion of the distribution is far from the centre of the bell curve at 0. Consequently, the null hypothesis would be rejected if the z-score exceeds 1.645 (as depicted in Figure 8).

Figure 8: A standard normal distribution centred on the null value (hours of study = 15), with the rejection region z > 1.645 (α = 0.05) shaded in the upper tail.

In summary, a one-tailed test is used to assess a directional hypothesis in which the direction of the difference or association between two variables is predicted. The critical value for a one-tailed test is determined by the selected significance level (α), and the test is conducted to ascertain whether the data supports the predicted direction.

In addition, the one-tailed test is not limited to a single direction (greater than) but can also be employed in the opposite direction (less than). An example can be used to illustrate this type of hypothesis testing. Consider a situation where the management team believes that the average amount spent by customers during their visits to a department store is £65. However, the service manager observes that customers spend less than that amount during their visits. In this case, you can formulate the following set of hypotheses:

H 0 : µ ≥ £65

H a : µ < £65

To test this directional hypothesis, a one-tailed test must be conducted. The alternative hypothesis states that the value of µ will be lower than the value specified in the null hypothesis, so the rejection region lies in the lower tail of the normal distribution. More specifically, at an alpha level of 0.05 the rejection region of this one-tailed test covers z-scores lower than -1.645; any sample outcome that falls in this region leads to rejection of the null hypothesis. The graph in Figure 9 illustrates this.

Figure 9: A standard normal distribution centred on the null value (customer spending = £65), with the rejection region z < -1.645 (α = 0.05) shaded in the lower tail.

In conclusion, the one-tailed test is not restricted to a specific direction and can be used in either direction, depending on the research question and the hypothesis being tested. The test is used to determine if the data supports a directional hypothesis, and a critical value is established based on the significance level chosen for the test.
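A brief sketch of both directional decisions described in this section; the critical values come from SciPy's normal distribution, and the sample z-scores are invented for illustration:

```python
from scipy import stats

alpha = 0.05
upper_crit = stats.norm.ppf(1 - alpha)   # approx. +1.645, for H1: mu > 15 hours
lower_crit = stats.norm.ppf(alpha)       # approx. -1.645, for H1: mu < £65

z_study, z_spending = 2.10, -1.20        # hypothetical sample z-scores
print(z_study > upper_crit)              # True: reject H0: mu <= 15 hours
print(z_spending < lower_crit)           # False: retain H0: mu >= £65
```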


One-sided statistical tests are just as accurate as two-sided tests

Author: Georgi Z. Georgiev, Published: Aug 6, 2018

Since there are a lot of misconceptions and "bad press" about one-sided tests of significance and one-sided confidence intervals (for examples see "The paradox of one-sided vs. two-sided tests of significance" and "Examples of negative portrayal of one-sided significance tests" ) I want to set the record straight in this brief article. Namely, I will demonstrate that a one-sided test maintains its type I error guarantees just as well as a two-sided test and therefore refute claims about one-sided tests being biased, leading to more false positives or having more assumptions than two-sided tests.

What is a one-sided test?

The p-value from a one-sided test of significance or the bound of an equivalent confidence interval is calculated under a null hypothesis that includes zero and one side of the distribution of possible outcomes of the measurement of interest. The alternative hypothesis covers just one side of the distribution as well. The above is the classical definition while in many practical scenarios a one-sided null hypothesis may span any proportion of the possible values of the outcome variable.

Figure: two-sided vs. one-sided hypotheses.

Since the null of a one-sided test is broader it requires less data to reject it with the same level of uncertainty: in the classical scenario and a symmetrical error distribution we eliminate half of the possible outcomes from the pool of outcomes that can reject the null. This means that if for a two-sided hypothesis an error probability of 0.05 is maintained by setting the critical boundaries cα = Z2 and -cα = -Z2, for a one-sided test the same error probability is maintained by the single critical boundary cα = Z1, where Z1 < |Z2|.

Translated into the meaning of a p-value, observing a p-value of 0.01 from both a one-sided and a two-sided test means the same thing in terms of error probabilities: that were the null hypothesis true, we would observe such an extreme outcome, or a more extreme one, with probability 0.01. For a 99% one-sided ([l_lower; +∞) or (-∞; l_upper]) or two-sided ([l_lower; l_upper]) interval it means that the true value will be within 99% of such intervals.

It should be noted that due to the composite nature of the null hypothesis, a one-sided test actually offers a conservative error guarantee, a maximum bound on the type I error. That is, if we have a null spanning from -∞ to 0, the reported p-value is calculated against the worst possible case: 0. If the true value is in fact less than 0, then the true type I error can approach 0 for values far in the negative direction. This is exactly what we want if a claim is to withstand even the most critical examination.

There is nothing wrong in rejecting the null with outcomes in one direction only

I’ve seen the imprecise statement that by doing a one-sided test "we are looking in one direction only". Since a directional claim frames the alternative hypothesis as a one-sided one, in answering it we are limiting our rejection of the null hypothesis to outcomes in just one direction. For the purposes of rejecting a claim of no difference or of a negative (or positive) effect, it is the only correct thing to do (see "Directional claims require directional hypotheses").

Making a claim of a positive (or negative) effect does not require one to predict, expect, or hope for an outcome in that direction. Furthermore, predicting, expecting, and hoping as experienced by the researcher have no effect on the true effect or the sampling space (the possible experiment outcomes) and do not affect the data-generating procedure in any way. No p-value adjustments or other precautions are necessary given a directional claim.

If one wants to examine the other direction they are free to do so. If one wants to entertain a more precise point null and calculate a two-sided p-value or confidence interval they are also free to do so, but it has no relevance to the research question that corresponds to a one-sided hypothesis.

Explanation through a scale metaphor


I find this weight measurement metaphor useful in illustrating the validity of a one-sided test versus a two-sided one. Say we have a scale and we know that 68.27% of the time it results in a measurement that is no more than ±1 kg off the true weight. Thus, assuming a normal distribution of error, the scale’s standard deviation is 1 kg. Suppose the allowed risk for any weight claim we make is 5%, corresponding to 1.644 kg critical value for a one-sided claim and 1.96 kg critical value for a two-sided claim.

We measure John’s weight and the scale shows 82 kg. Following this measurement, one can reject the claim "John weighs less than or equal to 80 kg" with a p-value of 0.0227 and the claim "John weighs exactly 80 kg" with a p-value of 0.0455.

Why is that?

Because under the second claim a scale indication of, say, 78 kg, which has a chance of happening due to random error even if John truly weighs exactly 80 kg, would also count against the null of "John weighs exactly 80 kg". The same 78 kg measurement would not count as evidence against the one-sided null of "John weighs less than or equal to 80 kg". In order for the p-value calculated relative to the "exactly 80 kg" null to reflect the same error probability as the "greater than 80 kg" claim, it has to be adjusted to take into account those possible rejections that do not exist for the one-sided claim.

Note that we can just as easily state that we fail to reject the claim "John weighs less than 80 kg" with a p-value of 0.9773 (the one-sided test in the opposite direction) without any compromise of the integrity or validity of the results.

The scale is obviously the statistical test of our choice and it remains uninterested in what claims any of us predicts, expects or wants to make after the weighing. The rest of the metaphor should be obvious.
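The p-values quoted in the metaphor can be checked directly; this is a quick sketch assuming the stated normal measurement error with a 1 kg standard deviation:

```python
from scipy import stats

z = (82 - 80) / 1.0                    # measured 82 kg against the claimed 80 kg, SD 1 kg
p_one_sided = stats.norm.sf(z)         # against "John weighs less than or equal to 80 kg"
p_two_sided = 2 * stats.norm.sf(z)     # against "John weighs exactly 80 kg"
p_other_direction = stats.norm.cdf(z)  # one-sided test in the opposite direction
print(round(p_one_sided, 4), round(p_two_sided, 4), round(p_other_direction, 4))
# roughly 0.023, 0.046, 0.977 - matching the values quoted above up to rounding
```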

One- and two-sided hypothesis testing explained through statistical power

I always had an intuitive understanding of one-sided tests, but the details only dawned on me once I examined the power functions of one-sided and two-sided tests of significance. It might not be as intuitive if you do not have a good grasp of the concept of power.

This is the graph I was looking at:

Figure: power functions of two-sided versus one-sided tests.

Knowing that statistical power is the probability of rejecting the null at a given significance level when a given point alternative is true, and that a power function plots the power over a set of such alternatives, we can understand a two-sided test with type I error 2α as two one-sided tests with type I error α, back to back. In fact, it appears that is how Fisher, Neyman & Pearson saw it, as I discovered in writing "Fisher, Neyman & Pearson - advocates for one-sided tests and confidence intervals". The above assumes a symmetrical error distribution.

A statistical hypothesis that matches a directional research hypothesis is one in which we give up the ability to count extreme results in one direction as rejections of the null, based on the inquiry or claim of interest. In conducting a one-sided test matching such a hypothesis we give up power against all possible outcomes in the null direction and have zero power against many of them. To the extent that we give up power against values under the null, we can decrease the reported probability of rejecting the null (the p-value).

On the contrary, when doing a two-sided test, we add power by now accepting extreme results in the other direction as a basis for rejecting the new, more precise point null. The power of the test under the null is doubled, so to maintain the same probability of a false rejection of the null we need to double the reported error (p-value) versus a test for a one-sided null bounded by the same value.
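A short sketch of these power functions under the article's symmetric-normal assumption (my own code, not the author's): a one-sided z-test at level α and a two-sided z-test at level 2α share the same right-hand boundary, and only the two-sided test has power against negative effects:

```python
from scipy import stats

alpha = 0.025
crit = stats.norm.ppf(1 - alpha)                 # approx. 1.96; right boundary for both tests

def power_one_sided(delta):                      # level-alpha test, H1: effect > 0
    return stats.norm.sf(crit - delta)

def power_two_sided(delta):                      # level-2*alpha test, H1: effect != 0
    return stats.norm.sf(crit - delta) + stats.norm.cdf(-crit - delta)

for delta in [-2, -1, 0, 1, 2, 3]:               # true standardized effect sizes
    print(delta, round(power_one_sided(delta), 3), round(power_two_sided(delta), 3))
# At delta = 0 the one-sided test rejects with probability alpha and the two-sided
# test with probability 2*alpha; for negative deltas the one-sided test has
# essentially zero power, as described in the text.
```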

Nothing works better in understanding statistics than a proper simulation

If you want to really learn statistics, do simulations. The easiest way to settle any disputes about the appropriateness and accuracy of a one-sided p-value is to see it through simulations.
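In that spirit, here is a minimal simulation sketch (mine): a one-sided z-test at α = 0.05, with data generated at the boundary of the null, rejects about 5% of the time, as claimed:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha, n, n_sims = 0.05, 30, 100_000
crit = stats.norm.ppf(1 - alpha)

samples = rng.normal(loc=0.0, scale=1.0, size=(n_sims, n))  # null (boundary) is true
z = samples.mean(axis=1) * np.sqrt(n)                       # z statistics with known sigma = 1
print((z > crit).mean())   # approx. 0.05: the one-sided test keeps its error guarantee
```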


Cite this article:

If you'd like to cite this online article you can use the following citation: Georgiev G.Z., "One-sided statistical tests are just as accurate as two-sided tests" , [online] Available at: https://www.onesided.org/articles/one-sided-statistical-tests-just-as-accurate-as-two-sided-tests.php URL [Accessed Date: 27 Jun, 2024].



Null hypothesis for a one sided test

Rapid response to:

One sided and two sided hypothesis tests


Rapid Response:

I enjoy the statistical questions in Endgames, and sometimes use them on my students. However I was surprised that the answer to this week's statement about one-sided tests 'Null hypothesis: in the total population the rate of miscarriage for HPV vaccine is equal to that for control.' was deemed false, whereas the statement 'Null hypothesis: in the total population the rate of miscarriage for HPV vaccine is equal to, or less than, that for control.' was deemed true. To my mind the null hypothesis concerns a point estimate, such as the hypothesis that the difference in population rates, d, is zero. This is the hypothesis from which the p-value given in the problem, 0.16, was calculated. You could not work out the p-value if the hypothesis was d<=0 (unless you adopted some Bayesian prior distribution). If we fail to reject the null hypothesis for a one-sided test, all we can say is that we have failed to show that the population mean difference is greater than 0. The decision to use a one-sided test has already deemed that P(d<0)=0. This is why one-sided tests are difficult to deal with when the 'impossible' appears to have happened, and a result 'in the wrong direction' occurs. It may be of no medical concern when this happens, but that is not a reason for doing a one-sided test, which in general is best avoided.

Competing interests: No competing interests


One Sided Null Hypothesis - 2 Interpretations

I've been reading around about hypothesis testing. I don't understand why the following one sided tests are equivalent:

$H_0:\mu \leq \mu_0$; $H_a:\mu > \mu_0$

$H_0:\mu = \mu_0$; $H_a:\mu > \mu_0$

Any thoughts?

Edit: I think I understand why they are equivalent. Anything that's rejected by the second hypothesis will be also rejected by the first (at least for Z tests and T tests you learn about in a first course in stats). Maybe a better question to ask is -- are there scenarios where these two inferences are not equivalent?

  • hypothesis-testing


  • It might help to recall the definition of a p-value for a composite null hypothesis as the maximum p-value over all constituent simple nulls: see "Is the p-value still uniformly distributed when the null hypothesis is composite?". So you have to give the null its best shot, and in some cases - including the usual z tests and t-tests for the mean of normally distributed observations - how to do so is quite clear. – Scortchi - Reinstate Monica ♦ (Apr 8, 2016)
  • Sorry, I don't have enough reputation, thus a useful link as answer: quantdec.com/envstats/notes/class_13/tests.htm See also here. – Christoph (May 24, 2019)

I do not think those are equivalent, and in fact I believe one of them, H0: μ = μ0; Ha: μ > μ0, is incorrect.

Philosophically, the 'rules' for forming the H0 and the Ha are that they be (a) mutually exclusive and (b) exhaustive, and so I think technically that form of the null is incorrect because it is not exhaustive (e.g. it omits the result in which, using your single sample example, the obtained mean is actually significantly lower).

Pragmatically, you are correct that there aren't any cases where anything that's rejected by the second version of the hypothesis won't also be rejected by the first version, because the critical value for the rejection region goes in the tail corresponding to the alternative hypothesis, leaving the entire other part of the distribution in the zone of the null. But the fact that the practical implication is invariant doesn't make the expression of the hypothesis correct (for the reason stated above, that it fails one of the rules of hypothesis formation).


  • Why not break the rule? – Scortchi - Reinstate Monica ♦ (Apr 8, 2016)
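A small sketch (not from the thread) of the pragmatic point made in the answer: for a one-sample t-test, the p-value reported against the point null μ = μ0 with alternative='greater' is the worst case over the whole composite null μ ≤ μ0, so both formulations lead to the same decision. The data are made up.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
sample = rng.normal(loc=0.4, scale=1.0, size=25)   # made-up data
mu0 = 0.0

# p-value against the point null mu = mu0, alternative mu > mu0
p_boundary = stats.ttest_1samp(sample, popmean=mu0, alternative="greater").pvalue

# Any null value below mu0 yields a smaller p-value, so the boundary is the
# supremum over the composite null mu <= mu0:
p_below = stats.ttest_1samp(sample, popmean=mu0 - 0.5, alternative="greater").pvalue
print(p_boundary >= p_below)   # True: rejecting at mu0 also rejects every mu < mu0
```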


scipy.stats.spearmanr

Calculate a Spearman correlation coefficient with associated p-value.

The Spearman rank-order correlation coefficient is a nonparametric measure of the monotonicity of the relationship between two datasets. Like other correlation coefficients, this one varies between -1 and +1 with 0 implying no correlation. Correlations of -1 or +1 imply an exact monotonic relationship. Positive correlations imply that as x increases, so does y. Negative correlations imply that as x increases, y decreases.

The p-value roughly indicates the probability of an uncorrelated system producing datasets that have a Spearman correlation at least as extreme as the one computed from these datasets. Although calculation of the p-value does not make strong assumptions about the distributions underlying the samples, it is only accurate for very large samples (>500 observations). For smaller sample sizes, consider a permutation test (see Examples section below).

Parameters:

a, b: One or two 1-D or 2-D arrays containing multiple variables and observations. When these are 1-D, each represents a vector of observations of a single variable. For the behavior in the 2-D case, see under axis, below. Both arrays need to have the same length in the axis dimension.

axis: If axis=0 (default), then each column represents a variable, with observations in the rows. If axis=1, the relationship is transposed: each row represents a variable, while the columns contain observations. If axis=None, then both arrays will be raveled.

nan_policy: Defines how to handle when input contains nan. The following options are available (default is 'propagate'):

'propagate': returns nan

'raise': throws an error

'omit': performs the calculations ignoring nan values

alternative: Defines the alternative hypothesis. Default is 'two-sided'. The following options are available:

'two-sided': the correlation is nonzero

'less': the correlation is negative (less than zero)

'greater': the correlation is positive (greater than zero)

Added in version 1.7.0.

Returns: An object containing attributes:

statistic: Spearman correlation matrix or correlation coefficient (if only 2 variables are given as parameters). Correlation matrix is square with length equal to total number of variables (columns or rows) in a and b combined.

pvalue: The p-value for a hypothesis test whose null hypothesis is that two samples have no ordinal correlation. See alternative above for alternative hypotheses. pvalue has the same shape as statistic.

Warns ConstantInputWarning: Raised if an input is a constant array. The correlation coefficient is not defined in this case, so np.nan is returned.

Zwillinger, D. and Kokoska, S. (2000). CRC Standard Probability and Statistics Tables and Formulae. Chapman & Hall: New York. 2000. Section 14.7

Kendall, M. G. and Stuart, A. (1973). The Advanced Theory of Statistics, Volume 2: Inference and Relationship. Griffin. 1973. Section 31.18

Kershenobich, D., Fierro, F. J., & Rojkind, M. (1970). The relationship between the free pool of proline and collagen content in human liver cirrhosis. The Journal of Clinical Investigation, 49(12), 2246-2249.

Hollander, M., Wolfe, D. A., & Chicken, E. (2013). Nonparametric statistical methods. John Wiley & Sons.

B. Phipson and G. K. Smyth. “Permutation P-values Should Never Be Zero: Calculating Exact P-values When Permutations Are Randomly Drawn.” Statistical Applications in Genetics and Molecular Biology 9.1 (2010).

Ludbrook, J., & Dudley, H. (1998). Why permutation tests are superior to t and F tests in biomedical research. The American Statistician, 52(2), 127-132.

Consider the following data from [3] , which studied the relationship between free proline (an amino acid) and total collagen (a protein often found in connective tissue) in unhealthy human livers.

The x and y arrays below record measurements of the two compounds. The observations are paired: each free proline measurement was taken from the same liver as the total collagen measurement at the same index.

These data were analyzed in [4] using Spearman’s correlation coefficient, a statistic sensitive to monotonic correlation between the samples.

The value of this statistic tends to be high (close to 1) for samples with a strongly positive ordinal correlation, low (close to -1) for samples with a strongly negative ordinal correlation, and small in magnitude (close to zero) for samples with weak ordinal correlation.

The test is performed by comparing the observed value of the statistic against the null distribution: the distribution of statistic values derived under the null hypothesis that total collagen and free proline measurements are independent.

For this test, the statistic can be transformed such that the null distribution for large samples is Student’s t distribution with len(x) - 2 degrees of freedom.
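A minimal sketch of that transformation (my own, assuming a sample Spearman statistic rs computed from n paired observations):

```python
import numpy as np
from scipy import stats

rs, n = 0.7, 7                          # hypothetical statistic and sample size
dof = n - 2
t = rs * np.sqrt(dof / (1.0 - rs**2))   # transformed statistic
p_one_sided = stats.t.sf(t, dof)        # upper-tail p-value under the t null distribution
print(t, p_one_sided)
```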


The comparison is quantified by the p-value: the proportion of values in the null distribution as extreme or more extreme than the observed value of the statistic. In a two-sided test in which the statistic is positive, elements of the null distribution greater than the transformed statistic and elements of the null distribution less than the negative of the observed statistic are both considered “more extreme”.


If the p-value is “small” - that is, if there is a low probability of sampling data from independent distributions that produces such an extreme value of the statistic - this may be taken as evidence against the null hypothesis in favor of the alternative: the distribution of total collagen and free proline are not independent. Note that:

The inverse is not true; that is, the test is not used to provide evidence for the null hypothesis.

The threshold for values that will be considered “small” is a choice that should be made before the data is analyzed [5] with consideration of the risks of both false positives (incorrectly rejecting the null hypothesis) and false negatives (failure to reject a false null hypothesis).

Small p-values are not evidence for a large effect; rather, they can only provide evidence for a “significant” effect, meaning that they are unlikely to have occurred under the null hypothesis.

Suppose that before performing the experiment, the authors had reason to predict a positive correlation between the total collagen and free proline measurements, and that they had chosen to assess the plausibility of the null hypothesis against a one-sided alternative: free proline has a positive ordinal correlation with total collagen. In this case, only those values in the null distribution that are as great or greater than the observed statistic are considered to be more extreme.


Note that the t-distribution provides an asymptotic approximation of the null distribution; it is only accurate for samples with many observations. For small samples, it may be more appropriate to perform a permutation test: under the null hypothesis that total collagen and free proline are independent, each of the free proline measurements was equally likely to have been observed with any of the total collagen measurements. Therefore, we can form an exact null distribution by calculating the statistic under each possible pairing of elements between x and y.
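A usage sketch along those lines; the arrays here are made-up placeholders, not the collagen/proline data referenced above, and the permutation test follows the paired-samples pattern SciPy documents:

```python
import numpy as np
from scipy import stats

x = np.array([7.1, 7.9, 8.3, 9.4, 10.1, 10.8, 11.4])   # hypothetical "total collagen"
y = np.array([2.8, 2.9, 2.8, 3.4, 3.6, 4.2, 4.9])      # hypothetical "free proline"

# Asymptotic one-sided test: H1 is a positive ordinal correlation.
res = stats.spearmanr(x, y, alternative="greater")
print(res.statistic, res.pvalue)

# Exact permutation test for a small sample: permute the pairings of y with x.
def statistic(y_permuted):
    return stats.spearmanr(x, y_permuted).statistic

perm = stats.permutation_test((y,), statistic, permutation_type="pairings",
                              alternative="greater")
print(perm.pvalue)
```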


IMAGES

  1. What are one-sided and two-sided tests?

    null hypothesis one sided test

  2. PPT

    null hypothesis one sided test

  3. One sided or one-tailed tests

    null hypothesis one sided test

  4. Hypothesis Testing

    null hypothesis one sided test

  5. SOLVED:A one-sided test of the null hypothesis μ=20 versus the

    null hypothesis one sided test

  6. ONE

    null hypothesis one sided test

VIDEO

  1. Large Sample Hypothesis Tests Part 2

  2. AP Statistics: Chapter 9, Video #4

  3. Lec-1, Statistical Methods||Test of Hypothesis|| one Sample T-test #statistics #mathematics

  4. Stating Hypotheses & Defining Parameters

  5. 2102203 Statistics 9 (Lecture on Other Topics in Statistical Hypothesis Test for Normal Mean)

  6. Testing of Hypothesis

COMMENTS

  1. One-Tailed and Two-Tailed Hypothesis Tests Explained

    Consequently, the peak (most likely value) of the distribution occurs at t=0, which represents the null hypothesis in a t-test. Typically, the null hypothesis states that there is no effect. ... The t critical value for the two-tailed test is +/- 2.086 while for the one-sided test it is 1.725. It is true that probability associated with those ...

  2. One-Tailed Hypothesis Tests: 3 Example Problems

    To test this, he can perform a one-tailed hypothesis test with the following null and alternative hypotheses: H 0 (Null Hypothesis): μ ≥ 20 grams; H A (Alternative Hypothesis): μ < 20 grams; Note: We can tell this is a one-tailed test because the alternative hypothesis contains the less than (<) sign. Specifically, we would call this a left ...

  3. Hypothesis testing: Null Hypothesis for one-sided tests

    With a one sided test, we might want to assess if a sample mean is greater than some theoretical mean (or the other way round): HA:μS > μT H A: μ S > μ T. What confuses me is that even for one-sided test the Null-hypothesis is described as equality between the means, i.e.: H0:μS = μT H 0: μ S = μ T . Why is that?

  4. One-Tailed Test Explained: Definition and Example

    One-Tailed Test: A one-tailed test is a statistical test in which the critical area of a distribution is one-sided so that it is either greater than or less than a certain value, but not both. If ...

  5. One-tailed and two-tailed tests (video)

    To decide if a one-tailed test can be used, one has to have some extra information about the experiment to know the direction from the mean (H1: drug lowers the response time). If the direction of the effect is unknown, a two tailed test has to be used, and the H1 must be stated in a way where the direction of the effect is left uncertain (H1 ...

  6. Null hypothesis

    A one-tailed hypothesis (tested using a one-sided test) is an inexact hypothesis in which the value of a parameter is ... A one-tailed hypothesis is said to have directionality. Fisher's original (lady tasting tea) example was a one-tailed test. The null hypothesis was asymmetric. The probability of guessing all cups correctly was the same as ...

  7. 4.4: Hypothesis Testing

    Two-sided hypothesis testing with p-values. We now consider how to compute a p-value for a two-sided test. In one-sided tests, we shade the single tail in the direction of the alternative hypothesis. For example, when the alternative had the form \(\mu\) > 7, then the p-value was represented by the upper tail (Figure 4.16).

  8. One- and two-tailed tests

    In coin flipping, the null hypothesis is a sequence of Bernoulli trials with probability 0.5, yielding a random variable X which is 1 for heads and 0 for tails, and a common test statistic is the sample mean (of the number of heads) ¯. If testing for whether the coin is biased towards heads, a one-tailed test would be used - only large numbers of heads would be significant.

  9. Hypothesis testing: One-tailed and two-tailed tests

    For example, our null hypothesis would state that there's no difference in the mean blood pressure for people that take the placebo compared to people that take the medication. On the other hand, the alternate hypothesis for a t-test can be either one-sided or two-sided, and this has to be determined at the beginning of the study.

  10. 4.4 One-Sided and Two-Sided Tests

    A test of a one-sided null hypothesis is called a one-sided test. Figure 4.3: One-sided and two-sided tests of a null hypothesis. In a left-sided test of the media literacy hypothesis, the researcher is not interested in demonstrating that average media literacy among children can be larger than 5.5. She only wants to test if it is below 5.5 ...

  11. The p-value and rejecting the null (for one- and two-tail tests)

    The p-value (or the observed level of significance) is the smallest level of significance at which you can reject the null hypothesis, assuming the null hypothesis is true. You can also think about the p-value as the total area of the region of rejection. Remember that in a one-tailed test, the regi

  12. FAQ: What are the differences between one-tailed and two-tailed tests?

    The null hypothesis is that the difference in means is zero. The two-sided alternative is that the difference in means is not zero. There are two one-sided alternatives that one could opt to test instead: that the male score is higher than the female score (diff > 0) or that the female score is higher than the male score (diff < 0).

  13. An Introduction to Statistics: Understanding Hypothesis Testing and

    This is known as a two-tailed hypothesis and it allows us to test for superiority on either side (using a two-sided test). This is because, when we start a study, we are not 100% certain that the new treatment can only be better than the standard treatment—it could be worse, and if it is so, the study should pick it up as well.

  14. Data analysis: hypothesis testing: 4.3 One-sided tests

    To conduct a one-tailed test, a critical value must be established to determine whether the null hypothesis should be rejected or retained. Typically, a significance level (α) is set for this purpose. For instance, assuming α = 0.05, the critical z-score for a one-tailed test in a normal distribution is 1.645.
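
    That 1.645 comes straight from the normal distribution's inverse CDF, for example:

        from scipy import stats

        alpha = 0.05

        # One-tailed (upper) critical value: all 5% sits in one tail.
        z_one_tailed = stats.norm.ppf(1 - alpha)        # about 1.645

        # Two-tailed critical value for comparison: 2.5% in each tail.
        z_two_tailed = stats.norm.ppf(1 - alpha / 2)    # about 1.960

        print(z_one_tailed, z_two_tailed)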

  15. 13.6: One Sided Tests

    When introducing the theory of null hypothesis tests, I mentioned that there are some situations in which it's appropriate to specify a one-sided test (see Section 11.4.3). So far, all of the t-tests have been two-sided tests. For instance, when we specified a one-sample t-test for the grades in Dr Zeppo's class, the null hypothesis was that ...

  16. One-sided test

    A one-sided test, also known as a one-tailed test or directional test, is a statistical hypothesis test in which the null hypothesis is tested against a specific alternative hypothesis stating that the population parameter is either greater than or less than a certain value, but not both. The direction of the alternative hypothesis, whether it is greater than or less than a certain value ...

  17. 6.2: Null and Alternative Hypotheses

    The actual test begins by considering two hypotheses. They are called the null hypothesis and the alternative hypothesis. These hypotheses contain opposing viewpoints. \(H_0\): The null hypothesis: It is a statement of no difference between the variables—they are not related. This can often be considered the status quo and, as a result, if you cannot accept the null it requires some action.

  18. When is a one-sided hypothesis required?

    Once the data are gathered and t-values are calculated using the observed direction of the effect, if one is to state, for example, that "each one-inch increase in height results in a $789 increase in yearly income", then the corresponding p-value (\(p_1\)) should be one-tailed, with a null hypothesis that increases in height either have no relation or an ...
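
    One way to see the mechanics (the data below are entirely made up and are not from the article being quoted): most regression routines report a two-sided p-value for a slope, and when the estimated slope has the hypothesized sign, the one-tailed p-value is simply half of it.

        from scipy import stats

        heights = [64, 66, 68, 70, 72, 74, 76]                       # inches (hypothetical)
        incomes = [48000, 50500, 51000, 54500, 55000, 58500, 60000]  # dollars (hypothetical)

        res = stats.linregress(heights, incomes)   # res.pvalue is two-sided for the slope

        # If the slope is positive, as the directional claim asserts, the one-tailed
        # p-value for H1: slope > 0 is half the two-sided p-value.
        p_one_tailed = res.pvalue / 2 if res.slope > 0 else 1 - res.pvalue / 2

        print(res.slope, res.pvalue, p_one_tailed)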

  19. One-sided statistical tests are just as accurate as two-sided tests

    The alternative hypothesis covers just one side of the distribution as well. The above is the classical definition, while in many practical scenarios a one-sided null hypothesis may span any proportion of the possible values of the outcome variable. Since the null of a one-sided test is broader, it requires less data to reject it with the same ...
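
    A back-of-the-envelope way to see the "less data" claim, using a standard normal-approximation sample-size formula with a made-up effect size and standard deviation (not taken from the source being quoted):

        from scipy import stats

        alpha, power = 0.05, 0.80
        effect, sigma = 2.0, 6.0          # hypothetical mean difference and standard deviation

        z_beta = stats.norm.ppf(power)

        # Approximate n for a one-sample z-test detecting `effect`:
        n_one_sided = ((stats.norm.ppf(1 - alpha)     + z_beta) * sigma / effect) ** 2
        n_two_sided = ((stats.norm.ppf(1 - alpha / 2) + z_beta) * sigma / effect) ** 2

        print(round(n_one_sided), round(n_two_sided))  # the one-sided test needs fewer observations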

  20. What's the null hypothesis in a one-sided Kolmogorov-Smirnov test?

    I think most of the tables providing p-values for the K-S statistic are based on a two-sided test. The null hypothesis assumed by the values in the table is that the two samples are drawn from the same distribution (i.e., that \(C_x = C_y\)). So really the table is only concerned with the absolute value of the difference between \(C_x\) and ...
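
    scipy's two-sample K-S test does accept one-sided alternatives; the sketch below uses simulated data, and the exact directional convention for 'less' and 'greater' is spelled out in the scipy docs rather than assumed here:

        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(0)
        x = rng.normal(loc=0.0, scale=1.0, size=200)   # simulated sample 1
        y = rng.normal(loc=0.3, scale=1.0, size=200)   # simulated sample 2, shifted

        # Two-sided: are the two empirical distributions different at all?
        two_sided = stats.ks_2samp(x, y, alternative='two-sided')

        # One-sided: a directional alternative about how the two empirical CDFs
        # are ordered (check the scipy docs for which ordering 'greater' means).
        one_sided = stats.ks_2samp(x, y, alternative='greater')

        print(two_sided.pvalue, one_sided.pvalue)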

  21. Why does the one-sided T-test reject the Null Hypothesis while the two

    Obviously, the first case is the one-sided test, and the second is the two-sided test. So, for the same test statistic, the p-value (the risk of wrongly rejecting a true null hypothesis) is larger for the two-sided test than for the one-sided test.
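
    A borderline case makes this concrete (the t-value and degrees of freedom below are invented):

        from scipy import stats

        t_stat, df, alpha = 1.80, 30, 0.05   # hypothetical borderline result

        p_one_sided = stats.t.sf(t_stat, df)            # about 0.041 -> rejects at 0.05
        p_two_sided = 2 * stats.t.sf(abs(t_stat), df)   # about 0.082 -> fails to reject

        print(p_one_sided < alpha, p_two_sided < alpha)  # True, False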

  22. Null hypothesis for a one sided test

    This is the hypothesis from which the p-value given in the problem, 0.16, was calculated. You could not work out the p-value if the hypothesis were \(d \le 0\) (unless you adopted some Bayesian prior distribution). If we fail to reject the null hypothesis for a one-sided test, all we can say is that we have failed to show that the ...

  23. 10.6: One-Sided Tests

    If so, our null hypothesis would be that the true mean is 67.5% or less, and the alternative hypothesis would be that the true mean is greater than 67.5%. ... For an independent samples t-test, you could have a one-sided test if you're only interested in testing to see if group A has higher scores than group B, but have no interest in finding ...

  24. 9.2: Null and Alternative Hypotheses

    The actual test begins by considering two hypotheses. They are called the null hypothesis and the alternative hypothesis. These hypotheses contain opposing viewpoints. \(H_0\): The null hypothesis: It is a statement of no difference between a sample mean or proportion and a population mean or proportion. In other words, the difference equals 0.

  25. One Sided Null Hypothesis

    I do not think those are equivalent, and in fact I believe one of them, \(H_0: \mu = \mu_0\); \(H_a: \mu > \mu_0\), is incorrect. Philosophically, the 'rules' for forming the \(H_0\) and the \(H_a\) are that they be (a) mutually exclusive and (b) exhaustive, and so I think technically that form of the null is incorrect because it's not exhaustive (e.g., it omits the result in ...
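
    Spelled out, the formulation that comment prefers pairs a composite (one-sided) null with a one-sided alternative so that the two together cover every possible value of \(\mu\):

        \[ H_0: \mu \le \mu_0 \quad \text{versus} \quad H_a: \mu > \mu_0 \]

    whereas \(H_0: \mu = \mu_0\) versus \(H_a: \mu > \mu_0\) leaves the case \(\mu < \mu_0\) unassigned.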

  26. spearmanr

    The p-value for a hypothesis test whose null hypothesis is that two samples have no ordinal correlation. See alternative above for alternative hypotheses. pvalue has the same shape as statistic. ...
        >>> res.pvalue
        0.03995834515444954  # one-sided p-value; half of the two-sided p-value
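
    A minimal sketch of that call with toy data (the `alternative` keyword assumes a scipy version recent enough to support it):

        from scipy import stats

        x = [1, 2, 3, 4, 5, 6, 7, 8]          # toy data
        y = [2, 1, 4, 3, 6, 5, 8, 7]          # toy data, loosely increasing with x

        rho, p_one_sided = stats.spearmanr(x, y, alternative='greater')   # H1: positive correlation
        _,   p_two_sided = stats.spearmanr(x, y, alternative='two-sided')

        print(rho, p_one_sided, p_two_sided)   # here the one-sided p-value is half the two-sided one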

  27. 9.9: Practice

    Which distribution do you use when the standard deviation is not known and you are testing one population mean? Assume a normal distribution, with \(n \geq 30\). ... Assume the null hypothesis states that the mean is at most 12. Is this a left-tailed, right-tailed, or two-tailed test? ...