
What you can generate and how

Most things should be easy to generate and everything should be possible.

To support this principle Hypothesis provides strategies for most built-in types with arguments to constrain or adjust the output, as well as higher-order strategies that can be composed to generate more complex types.

This document is a guide to what strategies are available for generating data and how to build them. Strategies have a variety of other important internal features, such as how they simplify, but the data they can generate is the only public part of their API.

Core strategies

Functions for building strategies are all available in the hypothesis.strategies module. The salient functions from it are as follows:

Generates bytes.

The generated bytes will have a length of at least min_size and at most max_size. If max_size is None there is no upper limit.

Examples from this strategy shrink towards smaller strings and lower byte values.

Returns a strategy which generates instances of bool.

Examples from this strategy will shrink towards False (i.e. shrinking will replace True with False where possible).

Generates values by drawing from args and kwargs and passing them to the callable (provided as the first positional argument) in the appropriate argument position.

e.g. builds(target, integers(), flag=booleans()) would draw an integer i and a boolean b and call target(i, flag=b).

If the callable has type annotations, they will be used to infer a strategy for required arguments that were not passed to builds. You can also tell builds to infer a strategy for an optional argument by passing ... (Ellipsis) as a keyword argument to builds, instead of a strategy for that argument to the callable.

If the callable is a class defined with attrs, missing required arguments will be inferred from the attribute on a best-effort basis, e.g. by checking attrs standard validators. Dataclasses are handled natively by the inference from type hints.

Examples from this strategy shrink by shrinking the argument values to the callable.
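As a sketch of both forms (the Order dataclass here is purely illustrative, not part of Hypothesis):

```python
from dataclasses import dataclass

from hypothesis import strategies as st


@dataclass
class Order:  # illustrative class, not part of Hypothesis
    item_id: int
    urgent: bool


# Pass strategies explicitly, positionally or by keyword:
explicit = st.builds(Order, st.integers(), urgent=st.booleans())

# With type annotations present, builds() infers the missing arguments:
inferred = st.builds(Order)
```

Both strategies generate Order instances; the second relies entirely on the annotations.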

Generates characters, i.e. length-one str instances, following the specified filtering rules.

When no filtering rules are specified, any character can be produced.

If min_codepoint or max_codepoint is specified, then only characters having a codepoint in that range will be produced.

If categories is specified, then only characters from those Unicode categories will be produced. This is a further restriction: characters must also satisfy min_codepoint and max_codepoint.

If exclude_categories is specified, then any character from those categories will not be produced. You must not pass both categories and exclude_categories; these arguments are alternative ways to specify exactly the same thing.

If include_characters is specified, then any additional characters in that list will also be produced.

If exclude_characters is specified, then any characters in that list will not be produced. Any overlap between include_characters and exclude_characters will raise an exception.

If codec is specified, only characters in the specified codec encodings will be produced.

The _codepoint arguments must be integers between zero and sys.maxunicode. The _characters arguments must be collections of length-one unicode strings, such as a unicode string.

The _categories arguments must be used to specify either the one-letter Unicode major category or the two-letter Unicode general category . For example, ('Nd', 'Lu') signifies “Number, decimal digit” and “Letter, uppercase”. A single letter (‘major category’) can be given to match all corresponding categories, for example 'P' for characters in any punctuation category.

We allow codecs from the codecs module and their aliases, platform specific and user-registered codecs if they are available, and python-specific text encodings (but not text or binary transforms). include_characters which cannot be encoded using this codec will raise an exception. If non-encodable codepoints or categories are explicitly allowed, the codec argument will exclude them without raising an exception.

Examples from this strategy shrink towards the codepoint for '0' , or the first allowable codepoint after it if '0' is excluded.

Returns a strategy that generates complex numbers.

This strategy draws complex numbers with constrained magnitudes. The min_magnitude and max_magnitude parameters should be non-negative Real numbers; a value of None corresponds to an infinite upper bound.

If min_magnitude is nonzero or max_magnitude is finite, it is an error to enable allow_nan . If max_magnitude is finite, it is an error to enable allow_infinity .

allow_infinity , allow_nan , and allow_subnormal are applied to each part of the complex number separately, as for floats() .

The magnitude constraints are respected up to a relative error of (around) floating-point epsilon, due to implementation via the system sqrt function.

The width argument specifies the maximum number of bits of precision required to represent the entire generated complex number. Valid values are 32, 64 or 128, which correspond to the real and imaginary components each having width 16, 32 or 64, respectively. Passing width=64 will still use the builtin 128-bit complex class, but always for values which can be exactly represented as two 32-bit floats.

Examples from this strategy shrink by shrinking their real and imaginary parts, as floats() .

If you need to generate complex numbers with particular real and imaginary parts or relationships between parts, consider using builds(complex, ...) or @composite respectively.

Defines a strategy that is built out of potentially arbitrarily many other strategies.

This is intended to be used as a decorator. See the full documentation for more details about how to use this function.

Examples from this strategy shrink by shrinking the output of each draw call.

This isn’t really a normal strategy, but instead gives you an object which can be used to draw data interactively from other strategies.

See the rest of the documentation for more complete information.

Examples from this strategy do not shrink (because there is only one), but the results of each data.draw() call shrink as they normally would.

This type only exists so that you can write type hints for tests using the data() strategy. Do not use it directly!

A strategy for dates between min_value and max_value .

Examples from this strategy shrink towards January 1st 2000.

A strategy for generating datetimes, which may be timezone-aware.

This strategy works by drawing a naive datetime between min_value and max_value , which must both be naive (have no timezone).

timezones must be a strategy that generates either None , for naive datetimes, or tzinfo objects for ‘aware’ datetimes. You can construct your own, though we recommend using one of these built-in strategies:

with Python 3.9 or newer or backports.zoneinfo: hypothesis.strategies.timezones();

with dateutil: hypothesis.extra.dateutil.timezones(); or

with pytz: hypothesis.extra.pytz.timezones().

You may pass allow_imaginary=False to filter out “imaginary” datetimes which did not (or will not) occur due to daylight savings, leap seconds, timezone and calendar adjustments, etc. Imaginary datetimes are allowed by default, because malformed timestamps are a common source of bugs.

Examples from this strategy shrink towards midnight on January 1st 2000, local time.
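A short sketch of both forms, assuming Python 3.9+ so that the built-in timezones() strategy is available:

```python
from datetime import datetime

from hypothesis import strategies as st

# Naive datetimes in a bounded range (both bounds must be naive):
naive = st.datetimes(
    min_value=datetime(2000, 1, 1),
    max_value=datetime(2030, 1, 1),
)

# Timezone-aware datetimes; allow_imaginary=False skips times that
# never occurred due to DST transitions and similar adjustments:
aware = st.datetimes(timezones=st.timezones(), allow_imaginary=False)
```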

Generates instances of decimal.Decimal , which may be:

A finite rational number, between min_value and max_value .

Not a Number, if allow_nan is True. None means “allow NaN, unless min_value and max_value are not None”.

Positive or negative infinity, if max_value and min_value respectively are None, and allow_infinity is not False. None means “allow infinity, unless excluded by the min and max values”.

Note that where floats have one NaN value, Decimals have four: signed, and either quiet or signalling . See the decimal module docs for more information on special values.

If places is not None, all finite values drawn from the strategy will have that number of digits after the decimal place.

Examples from this strategy do not have a well defined shrink order but try to maximize human readability when shrinking.

A deferred strategy allows you to write a strategy that references other strategies that have not yet been defined. This allows for the easy definition of recursive and mutually recursive strategies.

The definition argument should be a zero-argument function that returns a strategy. It will be evaluated the first time the strategy is used to produce an example.

Example usage:
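A minimal sketch of the self-referential case (using the st alias for hypothesis.strategies):

```python
from hypothesis import strategies as st

# x refers to itself inside the lambda - deferred() makes this legal
# by delaying evaluation until the first example is drawn:
x = st.deferred(lambda: st.booleans() | st.tuples(x, x))
```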

Mutual recursion also works fine:
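For instance, a sketch of two strategies defined in terms of each other:

```python
from hypothesis import strategies as st

# a and b each reference the other; neither lambda runs until an
# example is actually drawn, so the forward reference is fine:
a = st.deferred(lambda: st.booleans() | b)
b = st.deferred(lambda: st.tuples(a, a))
```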

Examples from this strategy shrink as they normally would from the strategy returned by the definition.

Generates dictionaries of type dict_class with keys drawn from the keys argument and values drawn from the values argument.

The size parameters have the same interpretation as for lists() .

Examples from this strategy shrink by trying to remove keys from the generated dictionary, and by shrinking each generated key and value.

This type only exists so that you can write type hints for functions decorated with @composite .

A strategy for generating email addresses as unicode strings. The address format is specified in RFC 5322#section-3.4.1 . Values shrink towards shorter local-parts and host domains.

If domains is given then it must be a strategy that generates domain names for the emails, defaulting to domains() .

This strategy is useful for generating “user data” for tests, as mishandling of email addresses is a common source of bugs.

Generates a dictionary of the same type as mapping with a fixed set of keys mapping to strategies. mapping must be a dict subclass.

Generated values have all keys present in mapping, in iteration order, with the corresponding values drawn from mapping[key].

If optional is passed, the generated value may or may not contain each key from optional and a value drawn from the corresponding strategy. Generated values may contain optional keys in an arbitrary order.

Examples from this strategy shrink by shrinking each individual value in the generated dictionary, and omitting optional key-value pairs.

Returns a strategy which generates floats.

If min_value is not None, all values will be >= min_value (or > min_value if exclude_min ).

If max_value is not None, all values will be <= max_value (or < max_value if exclude_max ).

If min_value or max_value is not None, it is an error to enable allow_nan.

If both min_value and max_value are not None, it is an error to enable allow_infinity.

If the inferred value range does not include subnormal values, it is an error to enable allow_subnormal.

Where not explicitly ruled out by the bounds, subnormals , infinities, and NaNs are possible values generated by this strategy.

The width argument specifies the maximum number of bits of precision required to represent the generated float. Valid values are 16, 32, or 64. Passing width=32 will still use the builtin 64-bit float class, but always for values which can be exactly represented as a 32-bit float.

The exclude_min and exclude_max arguments can be used to generate numbers from open or half-open intervals, by excluding the respective endpoints. Excluding either signed zero will also exclude the other. Attempting to exclude an endpoint which is None will raise an error; use allow_infinity=False to generate finite floats. You can however use e.g. min_value=-math.inf, exclude_min=True to exclude only one infinite endpoint.

Examples from this strategy have a complicated and hard to explain shrinking behaviour, but it tries to improve “human readability”. Finite numbers will be preferred to infinity and infinity will be preferred to NaN.
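As a sketch of the endpoint-exclusion behaviour (the name is illustrative):

```python
import math

from hypothesis import strategies as st

# The open interval (0, inf) becomes "finite positive floats" once
# infinity is also excluded; NaN is ruled out by the bound itself:
positive = st.floats(min_value=0, exclude_min=True, allow_infinity=False)
```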

Returns a strategy which generates Fractions.

If min_value is not None then all generated values are no less than min_value . If max_value is not None then all generated values are no greater than max_value . min_value and max_value may be anything accepted by the Fraction constructor.

If max_denominator is not None then the denominator of any generated values is no greater than max_denominator . Note that max_denominator must be None or a positive integer.

Examples from this strategy shrink towards smaller denominators, then closer to zero.

Generates strings that contain a match for the given regex (i.e. ones for which re.search() will return a non-None result).

regex may be a pattern or compiled regex . Both byte-strings and unicode strings are supported, and will generate examples of the same type.

You can use regex flags such as re.IGNORECASE or re.DOTALL to control generation. Flags can be passed either in compiled regex or inside the pattern with a (?iLmsux) group.

Some regular expressions are only partly supported - the underlying strategy checks local matching and relies on filtering to resolve context-dependent expressions. Using too many of these constructs may cause health-check errors as too many examples are filtered out. This mainly includes (positive or negative) lookahead and lookbehind groups.

If you want the generated string to match the whole regex you should use boundary markers. So e.g. r"\A.\Z" will return a single character string, while "." will return any string, and r"\A.$" will return a single character optionally followed by a "\n" . Alternatively, passing fullmatch=True will ensure that the whole string is a match, as if you had used the \A and \Z markers.

The alphabet= argument constrains the characters in the generated string, as for text() , and is only supported for unicode strings.

Examples from this strategy shrink towards shorter strings and lower character values, with exact behaviour that may depend on the pattern.

Looks up the appropriate search strategy for the given type.

from_type is used internally to fill in missing arguments to builds() and can be used interactively to explore what strategies are available or to debug type resolution.

You can use register_type_strategy() to handle your custom types, or to globally redefine certain strategies - for example excluding NaN from floats, or use timezone-aware instead of naive time and datetime strategies.

The resolution logic may be changed in a future version, but currently tries the following options:

If thing is in the default lookup mapping or user-registered lookup, return the corresponding strategy. The default lookup covers all types with Hypothesis strategies, including extras where possible.

If thing is from the typing module, return the corresponding strategy (special logic).

If thing has one or more subtypes in the merged lookup, return the union of the strategies for those types that are not subtypes of other elements in the lookup.

Finally, if thing has type annotations for all required arguments, and is not an abstract class, it is resolved via builds() .

Because abstract types cannot be instantiated, we treat abstract types as the union of their concrete subclasses. Note that this lookup works via inheritance but not via register , so you may still need to use register_type_strategy() .

There is a valuable recipe for leveraging from_type() to generate “everything except” values from a specified type.
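The recipe looks roughly like this:

```python
from hypothesis import strategies as st


def everything_except(excluded_types):
    # Draw a type, resolve it to a strategy for instances of that
    # type, then reject anything of the excluded type(s):
    return (
        st.from_type(type)
        .flatmap(st.from_type)
        .filter(lambda x: not isinstance(x, excluded_types))
    )
```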

For example, everything_except(int) returns a strategy that can generate anything that from_type() can ever generate, except for instances of int , and excluding instances of types added via register_type_strategy() .

This is useful when writing tests which check that invalid input is rejected in a certain way.

This is identical to the sets function but instead returns frozensets.

A strategy for functions, which can be used in callbacks.

The generated functions will mimic the interface of like , which must be a callable (including a class, method, or function). The return value for the function is drawn from the returns argument, which must be a strategy. If returns is not passed, we attempt to infer a strategy from the return-type annotation if present, falling back to none() .

If pure=True , all arguments passed to the generated function must be hashable, and if passed identical arguments the original return value will be returned again - not regenerated, so beware mutable values.

If pure=False , generated functions do not validate their arguments, and may return a different value if called again with the same arguments.

Generated functions can only be called within the scope of the @given which created them. This strategy does not support .example() .

Returns a strategy which generates integers.

If min_value is not None then all values will be >= min_value. If max_value is not None then all values will be <= max_value.

Examples from this strategy will shrink towards zero, and negative values will also shrink towards positive (i.e. -n may be replaced by +n).

Generate IP addresses - v=4 for IPv4Addresses, v=6 for IPv6Addresses, or leave unspecified to allow both versions.

network may be an IPv4Network or IPv6Network , or a string representing a network such as "127.0.0.0/24" or "2001:db8::/32" . As well as generating addresses within a particular routable network, this can be used to generate addresses from a reserved range listed in the IANA registries .

If you pass both v and network , they must be for the same version.

This has the same behaviour as lists, but returns iterables instead.

Some iterables cannot be indexed (e.g. sets) and some do not have a fixed length (e.g. generators). This strategy produces iterators, which cannot be indexed and do not have a fixed length. This ensures that you do not accidentally depend on sequence behaviour.

Return a strategy which only generates value .

Note: value is not copied. Be wary of using mutable values.

If value is the result of a callable, you can use builds(callable) instead of just(callable()) to get a fresh value each time.

Examples from this strategy do not shrink (because there is only one).

Returns a list containing values drawn from elements with length in the interval [min_size, max_size] (no bounds in that direction if these are None). If max_size is 0, only the empty list will be drawn.

If unique is True (or something that evaluates to True), we compare direct object equality, as if unique_by was lambda x: x . This comparison only works for hashable types.

If unique_by is not None it must be a callable or tuple of callables returning a hashable type when given a value drawn from elements. The resulting list will satisfy the condition that for i != j , unique_by(result[i]) != unique_by(result[j]) .

If unique_by is a tuple of callables, uniqueness is enforced separately for each callable.

For example, the following will produce lists of two-element tuples whose first elements are all distinct and whose second elements are all distinct.
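A sketch of that, with per-callable uniqueness over the tuple elements:

```python
from hypothesis import strategies as st

# First elements are pairwise distinct, and so are second elements:
table = st.lists(
    st.tuples(st.integers(), st.integers()),
    unique_by=(lambda x: x[0], lambda x: x[1]),
)
```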

Examples from this strategy shrink by trying to remove elements from the list, and by shrinking each individual element of the list.

Return a strategy which only generates None.

The nothing() strategy, by contrast, never successfully draws a value and will always reject on an attempt to draw.

Examples from this strategy do not shrink (because there are none).

Return a strategy which generates values from any of the argument strategies.

This may be called with one iterable argument instead of multiple strategy arguments, in which case one_of(x) and one_of(*x) are equivalent.

Examples from this strategy will generally shrink to ones that come from strategies earlier in the list, then shrink according to behaviour of the strategy that produced them. In order to get good shrinking behaviour, try to put simpler strategies first. e.g. one_of(none(), text()) is better than one_of(text(), none()) .

This is especially important when using recursive strategies. e.g. x = st.deferred(lambda: st.none() | st.tuples(x, x)) will shrink well, but x = st.deferred(lambda: st.tuples(x, x) | st.none()) will shrink very badly indeed.

Return a strategy which returns permutations of the ordered collection values.

Examples from this strategy shrink by trying to become closer to the original order of values.

Hypothesis always seeds global PRNGs before running a test, and restores the previous state afterwards.

If having a fixed seed would unacceptably weaken your tests, and you cannot use a random.Random instance provided by randoms() , this strategy calls random.seed() with an arbitrary integer and passes you an opaque object whose repr displays the seed value for debugging. If numpy.random is available, that state is also managed, as is anything managed by hypothesis.register_random() .

Examples from this strategy shrink towards seeds closer to zero.

Generates instances of random.Random . The generated Random instances are of a special HypothesisRandom subclass.

If note_method_calls is set to True , Hypothesis will print the randomly drawn values in any falsifying test case. This can be helpful for debugging the behaviour of randomized algorithms.

If use_true_random is set to True then values will be drawn from their usual distribution, otherwise they will actually be Hypothesis generated values (and will be shrunk accordingly for any failing test case). Setting use_true_random=False will tend to expose bugs that would occur with very low probability when it is set to True, and this flag should only be set to True when your code relies on the distribution of values for correctness.

For managing global state, see the random_module() strategy and register_random() function.

base: A strategy to start from.

extend: A function which takes a strategy and returns a new strategy.

max_leaves: The maximum number of elements to be drawn from base on a given run.

This returns a strategy S such that S = extend(base | S) . That is, values may be drawn from base, or from any strategy reachable by mixing applications of | and extend.

An example may clarify: recursive(booleans(), lists) would return a strategy that may return arbitrarily nested and mixed lists of booleans. So e.g. False , [True] , [False, []] , and [[[[True]]]] are all valid values to be drawn from that strategy.

Examples from this strategy shrink by trying to reduce the amount of recursion and by shrinking according to the shrinking behaviour of base and the result of extend.

Add an entry to the global type-to-strategy lookup.

This lookup is used in builds() and @given .

builds() will be used automatically for classes with type annotations on __init__ , so you only need to register a strategy if one or more arguments need to be more tightly defined than their type-based default, or if you want to supply a strategy for an argument with a default value.

strategy may be a search strategy, or a function that takes a type and returns a strategy (useful for generic types). The function may return NotImplemented to conditionally not provide a strategy for the type (the type will still be resolved by other methods, if possible, as if the function was not registered).

Note that you may not register a parametrised generic type (such as MyCollection[int] ) directly, because the resolution logic does not handle this case correctly. Instead, you may register a function for MyCollection and inspect the type parameters within that function.

A strategy for getting “the current test runner”, whatever that may be. The exact meaning depends on the entry point, but it will usually be the associated ‘self’ value for it.

If you are using this in a rule for stateful testing, this strategy will return the instance of the RuleBasedStateMachine that the rule is running for.

If there is no current test runner and a default is provided, that default is returned. If no default is provided, InvalidArgument is raised.

Returns a strategy which generates any value present in elements .

Note that as with just() , values will not be copied and thus you should be careful of using mutable data.

sampled_from supports ordered collections, as well as Enum objects. Flag objects may also generate any combination of their members.

Examples from this strategy shrink by replacing them with values earlier in the list. So e.g. sampled_from([10, 1]) will shrink by trying to replace 1 values with 10, and sampled_from([1, 10]) will shrink by trying to replace 10 values with 1.

It is an error to sample from an empty sequence, because returning nothing() makes it too easy to silently drop parts of compound strategies. If you need that behaviour, use sampled_from(seq) if seq else nothing() .

This has the same behaviour as lists, but returns sets instead.

Note that Hypothesis cannot tell whether values drawn from elements are hashable until running the test, so you can define a strategy for sets of an unhashable type but it will fail at test time.

Examples from this strategy shrink by trying to remove elements from the set, and by shrinking each individual element of the set.

Returns a strategy that draws a single shared value per run, drawn from base. Any two shared instances with the same key will share the same value, otherwise the identity of this strategy will be used. That is:
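Without a key, two shared() wrappers are keyed by identity and so are independent (a minimal sketch):

```python
from hypothesis import strategies as st

s = st.integers()

# Distinct shared() instances with no explicit key: each is keyed by
# its own identity, so x and y may draw different values:
x = st.shared(s)
y = st.shared(s)
```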

In the above x and y may draw different (or potentially the same) values. In the following they will always draw the same:
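With a common key, the draw is shared:

```python
from hypothesis import strategies as st

s = st.integers()

# The same key means the same underlying draw, so x and y always
# yield identical values within a single test case:
x = st.shared(s, key="hi")
y = st.shared(s, key="hi")
```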

Examples from this strategy shrink as per their base strategy.

Generates slices that will select indices up to the supplied size.

Generated slices will have start and stop indices that range from -size to size - 1 and will step in the appropriate direction. Slices should only produce an empty selection if the start and end are the same.

Examples from this strategy shrink toward 0 and smaller values.

Generates strings with characters drawn from alphabet , which should be a collection of length one strings or a strategy generating such strings.

The default alphabet strategy can generate the full unicode range but excludes surrogate characters because they are invalid in the UTF-8 encoding. You can use characters() without arguments to find surrogate-related bugs such as bpo-34454 .

min_size and max_size have the usual interpretations. Note that Python measures string length by counting codepoints: U+00C5 Å is a single character, while U+0041 U+030A Å is two - the A , and a combining ring above.

Examples from this strategy shrink towards shorter strings, and with the characters in the text shrinking as per the alphabet strategy. This strategy does not normalize() examples, so generated strings may be in any or none of the ‘normal forms’.

A strategy for timedeltas between min_value and max_value .

Examples from this strategy shrink towards zero.

A strategy for times between min_value and max_value .

The timezones argument is handled as for datetimes() .

Examples from this strategy shrink towards midnight, with the timezone component shrinking as for the strategy that provided it.

A strategy for IANA timezone names .

As well as timezone names like "UTC" , "Australia/Sydney" , or "America/New_York" , this strategy can generate:

Aliases such as "Antarctica/McMurdo" , which links to "Pacific/Auckland" .

Deprecated names such as "Antarctica/South_Pole" , which also links to "Pacific/Auckland" . Note that most but not all deprecated timezone names are also aliases.

Timezone names with the "posix/" or "right/" prefixes, unless allow_prefix=False .

These strings are provided separately from Tzinfo objects - such as ZoneInfo instances from the timezones() strategy - to facilitate testing of timezone logic without needing workarounds to access non-canonical names.

The zoneinfo module is new in Python 3.9, so you will need to install the backports.zoneinfo module on earlier versions.

On Windows, you will also need to install the tzdata package .

pip install hypothesis[zoneinfo] will install these conditional dependencies if and only if they are needed.

On Windows, you may need to access IANA timezone data via the tzdata package. For non-IANA timezones, such as Windows-native names or GNU TZ strings, we recommend using sampled_from() with the dateutil package, e.g. dateutil.tz.tzwin.list() .

A strategy for zoneinfo.ZoneInfo objects.

If no_cache=True , the generated instances are constructed using ZoneInfo.no_cache instead of the usual constructor. This may change the semantics of your datetimes in surprising ways, so only use it if you know that you need to!

Return a strategy which generates a tuple of the same length as args by generating the value at index i from args[i].

e.g. tuples(integers(), integers()) would generate a tuple of length two with both values an integer.

Examples from this strategy shrink by shrinking their component parts.

Returns a strategy that generates UUIDs .

If the optional version argument is given, value is passed through to UUID and only UUIDs of that version will be generated.

If allow_nil is True, generate the nil UUID much more often. Otherwise, all returned values from this will be unique, so e.g. if you do lists(uuids()) the resulting list will never contain duplicates.

Examples from this strategy don’t have any meaningful shrink order.

Provisional strategies

This module contains various provisional APIs and strategies.

It is intended for internal use, to ease code reuse, and is not stable. Point releases may move or break the contents at any time!

Internet strategies should conform to RFC 3986 or the authoritative definitions it links to. If not, report the bug!

Generate RFC 1035 compliant fully qualified domain names.

A strategy for RFC 3986 , generating http/https URLs.

Shrinking

When using strategies it is worth thinking about how the data shrinks . Shrinking is the process by which Hypothesis tries to produce human readable examples when it finds a failure - it takes a complex example and turns it into a simpler one.

Each strategy defines an order in which it shrinks - you won’t usually need to care about this much, but it can be worth being aware of, as it can affect the best way to write your own strategies.

The exact shrinking behaviour is not a guaranteed part of the API, but it doesn’t change that often and when it does it’s usually because we think the new way produces nicer examples.

Possibly the most important one to be aware of is one_of() , which has a preference for values produced by strategies earlier in its argument list. Most of the others should largely “do the right thing” without you having to think about it.

Adapting strategies

Often it is the case that a strategy doesn’t produce exactly what you want it to and you need to adapt it. Sometimes you can do this in the test, but this hurts reuse because you then have to repeat the adaptation in every test.

Hypothesis gives you ways to build strategies from other strategies given functions for transforming the data.

map is probably the easiest and most useful of these to use. If you have a strategy s and a function f , then an example s.map(f).example() is f(s.example()) , i.e. we draw an example from s and then apply f to it.
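For example, a minimal sketch of a strategy for even integers:

```python
from hypothesis import strategies as st

# Doubling any integer yields an even integer:
even = st.integers().map(lambda x: x * 2)
```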

Note that many things that you might use mapping for can also be done with builds() , and if you find yourself indexing into a tuple within .map() it’s probably time to use that instead.

Filtering

filter lets you reject some examples. s.filter(f).example() is some example of s such that f(example) is truthy.

It’s important to note that filter isn’t magic: if your condition is too hard to satisfy, Hypothesis will eventually give up and raise an error rather than keep retrying forever.
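A light-touch filter works fine; the comment shows the kind of condition that cannot work (names are illustrative):

```python
from hypothesis import strategies as st

# Most integers are nonzero, so almost every draw passes the filter:
nonzero = st.integers().filter(lambda x: x != 0)

# By contrast, .filter(lambda x: False) can never succeed: every draw
# is rejected, and Hypothesis will eventually raise an error.
```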

In general you should try to use filter only to avoid corner cases that you don’t want rather than attempting to cut out a large chunk of the search space.

A technique that often works well here is to use map to first transform the data and then use filter to remove things that didn’t work out. So for example if you wanted pairs of integers (x,y) such that x < y you could do the following:
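A sketch of that map-then-filter pattern:

```python
from hypothesis import strategies as st

# Sorting first means the filter only has to reject exact ties
# (note that sorted() turns the tuple into a list):
ordered_pairs = (
    st.tuples(st.integers(), st.integers())
    .map(sorted)
    .filter(lambda x: x[0] < x[1])
)
```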

Chaining strategies together

Finally there is flatmap . flatmap draws an example, then turns that example into a strategy, then draws an example from that strategy.

It may not be obvious why you want this at first, but it turns out to be quite useful because it lets you generate different types of data with relationships to each other.

For example suppose we wanted to generate a list of lists of the same length:
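One way to write this (a sketch; the bound of 10 on the length is an arbitrary choice):

```python
from hypothesis import strategies as st

# First draw a length n, then build a strategy for lists whose inner
# lists all have exactly n elements.
rectangles = st.integers(min_value=0, max_value=10).flatmap(
    lambda n: st.lists(st.lists(st.integers(), min_size=n, max_size=n))
)
```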

In this example we first choose a length, then build a strategy which generates lists containing lists of precisely that length.

Most of the time you probably don’t want flatmap , but unlike filter and map which are just conveniences for things you could just do in your tests, flatmap allows genuinely new data generation that you wouldn’t otherwise be able to easily do.

(If you know Haskell: Yes, this is more or less a monadic bind. If you don’t know Haskell, ignore everything in these parentheses. You do not need to understand anything about monads to use this, or anything else in Hypothesis).

Recursive data ¶

Sometimes the data you want to generate has a recursive definition. e.g. if you wanted to generate JSON data, valid JSON is:

Any float, any boolean, any unicode string.

Any list of valid JSON data

Any dictionary mapping unicode strings to valid JSON data.

The problem is that you cannot call a strategy recursively and expect it to not just blow up and eat all your memory. The other problem here is that not all unicode strings display consistently on different machines, so we’ll restrict them in our doctest.

The way Hypothesis handles this is with the recursive() strategy which you pass in a base case and a function that, given a strategy for your data type, returns a new strategy for it. So for example:
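A sketch of a JSON strategy along these lines (unrestricted text here, unlike the doctest-safe version the text mentions):

```python
from hypothesis import strategies as st

# Base case: the JSON leaf values.  Extension: lists and dictionaries
# of whatever we can already generate as JSON.
json_values = st.recursive(
    st.none() | st.booleans() | st.floats() | st.text(),
    lambda children: st.lists(children)
    | st.dictionaries(st.text(), children),
)
```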

That is, we start with our leaf data and then we augment it by allowing lists and dictionaries of anything we can generate as JSON data.

The size control of this works by limiting the maximum number of values that can be drawn from the base strategy. So for example if we wanted to only generate really small JSON we could do this as:
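For example, with the `max_leaves` argument (the value 5 is an illustrative choice):

```python
from hypothesis import strategies as st

# max_leaves caps how many base-strategy values a single example may
# contain, which keeps the generated JSON small.
small_json = st.recursive(
    st.none() | st.booleans() | st.floats() | st.text(),
    lambda children: st.lists(children)
    | st.dictionaries(st.text(), children),
    max_leaves=5,
)
```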

Composite strategies ¶

The @composite decorator lets you combine other strategies in more or less arbitrary ways. It’s probably the main thing you’ll want to use for complicated custom strategies.

The composite decorator works by converting a function that returns one example into a function that returns a strategy that produces such examples - which you can pass to @given , modify with .map or .filter , and generally use like any other strategy.

It does this by giving you a special function draw as the first argument, which can be used just like the corresponding method of the data() strategy within a test. In fact, the implementation is almost the same - but defining a strategy with @composite makes code reuse easier, and usually improves the display of failing examples.

For example, the following gives you a list and an index into it:
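A sketch of such a strategy:

```python
from hypothesis import strategies as st

@st.composite
def list_and_index(draw, elements=st.integers()):
    # Draw a non-empty list first, then an index that is valid for it.
    xs = draw(st.lists(elements, min_size=1))
    i = draw(st.integers(min_value=0, max_value=len(xs) - 1))
    return (xs, i)
```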

draw(s) is a function that should be thought of as returning s.example() , except that the result is reproducible and will minimize correctly. The decorated function has the initial argument removed from the list, but will accept all the others in the expected order. Defaults are preserved.

Note that the repr will work exactly like it does for all the built-in strategies: it will be a function that you can call to get the strategy in question, with values provided only if they do not match the defaults.

You can use assume inside composite functions:
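For example, a sketch generating two distinct strings where the second only uses characters from the first:

```python
from hypothesis import assume, strategies as st

@st.composite
def distinct_strings_with_common_characters(draw):
    x = draw(st.text(min_size=1))
    y = draw(st.text(alphabet=x))  # y uses only characters from x
    assume(x != y)                 # reject draws where they coincide
    return (x, y)
```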

This works as assume normally would, filtering out any examples for which the passed in argument is falsey.

Take care that your function can cope with adversarial draws, or explicitly rejects them using the .filter() method or assume() - our mutation and shrinking logic can do some strange things, and a naive implementation might lead to serious performance problems. For example:
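A hypothetical illustration of the problem (not from the docs): redrawing in a loop until a condition holds can spin for a long time under adversarial draws and shrinks poorly, whereas a transformation that cannot fail avoids the issue entirely.

```python
from hypothesis import strategies as st

# Naive: redraw until the value is even.  Adversarial draws can make
# this loop many times, and the retries interact badly with shrinking.
@st.composite
def even_integers_naive(draw):
    n = draw(st.integers())
    while n % 2 != 0:
        n = draw(st.integers())
    return n

# Robust: a transformation that can never fail.
even_integers = st.integers().map(lambda x: x * 2)
```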

If @composite is used to decorate a method or classmethod, the draw argument must come before self or cls . While we therefore recommend writing strategies as standalone functions and using the register_type_strategy() function to associate them with a class, methods are supported and the @composite decorator may be applied either before or after @classmethod or @staticmethod . See issue #2578 and pull request #2634 for more details.

Drawing interactively in tests ¶

There is also the data() strategy, which gives you a means of using strategies interactively. Rather than having to specify everything up front in @given you can draw from strategies in the body of your test.

This is similar to @composite , but even more powerful as it allows you to mix test code with example generation. The downside of this power is that data() is incompatible with explicit @example(...) s - and the mixed code is often harder to debug when something goes wrong.

If you need values that are affected by previous draws but which don’t depend on the execution of your test, stick to the simpler @composite .

If the test fails, each draw will be printed with the falsifying example. e.g. the above is wrong (it has a boundary condition error), so will print:

As you can see, data drawn this way is simplified as usual.

Optionally, you can provide a label to identify values generated by each call to data.draw() . These labels can be used to identify values in the output of a falsifying example.

For instance:

will label each draw in the falsifying example, e.g. `Draw 1 (First number): 0` and `Draw 2 (Second number): 0`.


13 Different Types of Hypothesis


Chris Drew (PhD)

Dr. Chris Drew is the founder of the Helpful Professor. He holds a PhD in education and has published over 20 articles in scholarly journals. He is the former editor of the Journal of Learning Development in Higher Education.



There are 13 different types of hypothesis. These include simple, complex, null, alternative, composite, directional, non-directional, logical, empirical, statistical, associative, causal, and exact vs. inexact hypotheses.

A hypothesis can be categorized into one or more of these types. However, some are mutually exclusive opposites: simple and complex hypotheses are mutually exclusive, as are directional and non-directional, and null and alternative hypotheses.

Below I explain each hypothesis in simple terms for absolute beginners. These definitions may be too simple for some, but they’re designed to be clear introductions to the terms to help people wrap their heads around the concepts early on in their education about research methods .


Before you Proceed: Dependent vs Independent Variables

A research study and its hypotheses generally examine the relationships between independent and dependent variables – so you need to know these two concepts:

  • The independent variable is the variable that is causing a change.
  • The dependent variable is the variable that is affected by the change. This is the variable being tested.

Read my full article on dependent vs independent variables for more examples.

Example: Eating carrots (independent variable) improves eyesight (dependent variable).

1. Simple Hypothesis

A simple hypothesis is a hypothesis that predicts a correlation between two test variables: an independent and a dependent variable.

This is the easiest and most straightforward type of hypothesis. You simply need to state an expected correlation between the dependent variable and the independent variable.

You do not need to predict causation (see: directional hypothesis). All you would need to do is prove that the two variables are linked.

Simple Hypothesis Examples

Question | Simple Hypothesis
Do people over 50 like Coca-Cola more than people under 50? | On average, people over 50 like Coca-Cola more than people under 50.
According to national registries of car accident data, are Canadians better drivers than Americans? | Canadians are better drivers than Americans.
Are carpenters more liberal than plumbers? | Carpenters are more liberal than plumbers.
Do guitarists live longer than pianists? | Guitarists do live longer than pianists.
Do dogs eat more in summer than winter? | Dogs do eat more in summer than winter.

2. Complex Hypothesis

A complex hypothesis is a hypothesis that contains multiple variables, making the hypothesis more specific but also harder to prove.

You can have multiple independent and dependent variables in this hypothesis.

Complex Hypothesis Example

Question | Complex Hypothesis
Do (1) age and (2) weight affect chances of getting (3) diabetes and (4) heart disease? | (1) Age and (2) weight increase your chances of getting (3) diabetes and (4) heart disease.

In the above example, we have multiple independent and dependent variables:

  • Independent variables: Age and weight.
  • Dependent variables: diabetes and heart disease.

Because there are multiple variables, this study is a lot more complex than a simple hypothesis. It quickly gets much more difficult to prove these hypotheses. This is why undergraduate and first-time researchers are usually encouraged to use simple hypotheses.

3. Null Hypothesis

A null hypothesis will predict that there will be no significant relationship between the two test variables.

For example, you can say that “The study will show that there is no correlation between marriage and happiness.”

A good way to think about a null hypothesis is to think of it in the same way as “innocent until proven guilty”[1]. Unless you can come up with evidence otherwise, your null hypothesis will stand.

A null hypothesis may also highlight that a correlation will be inconclusive . This means that you can predict that the study will not be able to confirm your results one way or the other. For example, you can say “It is predicted that the study will be unable to confirm a correlation between the two variables due to foreseeable interference by a third variable .”

Beware that an inconclusive null hypothesis may be questioned by your teacher. Why would you conduct a test that you predict will not provide a clear result? Perhaps you should take a closer look at your methodology and re-examine it. Nevertheless, inconclusive null hypotheses can sometimes have merit.

Null Hypothesis Examples

Question | Null Hypothesis (H0)
Do people over 50 like Coca-Cola more than people under 50? | Age has no effect on preference for Coca-Cola.
Are Canadians better drivers than Americans? | Nationality has no effect on driving ability.
Are carpenters more liberal than plumbers? | There is no statistically significant difference in political views between carpenters and plumbers.
Do guitarists live longer than pianists? | There is no statistically significant difference in life expectancy between guitarists and pianists.
Do dogs eat more in summer than winter? | Time of year has no effect on dogs’ appetites.

4. Alternative Hypothesis

An alternative hypothesis is a hypothesis that is anything other than the null hypothesis. It will disprove the null hypothesis.

We use the symbol HA or H1 to denote an alternative hypothesis.

The null and alternative hypotheses are usually used together. We will say the null hypothesis is the case where a relationship between two variables is non-existent. The alternative hypothesis is the case where there is a relationship between those two variables.

The following statement is always true: H0 ≠ HA.

Let’s take the example of the hypothesis: “Does eating oatmeal before an exam impact test scores?”

We can have two hypotheses here:

  • Null hypothesis (H0): “Eating oatmeal before an exam does not impact test scores.”
  • Alternative hypothesis (HA): “Eating oatmeal before an exam does impact test scores.”

All we have to do is disprove the null hypothesis for the alternative hypothesis to be true. We do not need an exact prediction of how much oatmeal will impact the test scores, or even whether the impact is positive or negative. So long as the null hypothesis is proven false, the alternative hypothesis is proven true.

5. Composite Hypothesis

A composite hypothesis is a hypothesis that does not predict the exact parameters, distribution, or range of the dependent variable.

Often, we would predict an exact outcome. For example: “23 year old men are on average 189cm tall.” Here, we are giving an exact parameter. So, the hypothesis is not composite.

But, often, we cannot exactly hypothesize something. We assume that something will happen, but we’re not exactly sure what. In these cases, we might say: “23 year old men are not on average 189cm tall.”

We haven’t set a distribution range or exact parameters of the average height of 23 year old men. So, we’ve introduced a composite hypothesis as opposed to an exact hypothesis.

Generally, an alternative hypothesis (discussed above) is composite because it is defined as anything except the null hypothesis. This ‘anything except’ does not define parameters or distribution, and therefore it’s an example of a composite hypothesis.

6. Directional Hypothesis

A directional hypothesis makes a prediction about the positivity or negativity of the effect of an intervention prior to the test being conducted.

Instead of being agnostic about whether the effect will be positive or negative, it nominates the effect’s directionality.

We often call this a one-tailed hypothesis (in contrast to a two-tailed or non-directional hypothesis) because, looking at a distribution graph, we’re hypothesizing that the results will lean toward one particular tail on the graph – either the positive or negative.

Directional Hypothesis Examples

Question | Directional Hypothesis
Does adding a 10c charge to plastic bags at grocery stores lead to changes in uptake of reusable bags? | Adding a 10c charge to plastic bags in grocery stores will increase uptake of reusable bags.
Does a Universal Basic Income influence retail worker wages? | Universal Basic Income increases retail worker wages.
Does rainy weather impact the amount of moderate to high intensity exercise people do per week in the city of Vancouver? | Rainy weather decreases the amount of moderate to high intensity exercise people do per week in the city of Vancouver.
Does introducing fluoride to the water system in the city of Austin impact number of dental visits per capita per year? | Introducing fluoride to the water system in the city of Austin decreases the number of dental visits per capita per year.
Does giving children chocolate rewards during study time for positive answers impact standardized test scores? | Giving children chocolate rewards during study time for positive answers improves standardized test scores.

7. Non-Directional Hypothesis

A non-directional hypothesis does not specify the predicted direction (e.g. positivity or negativity) of the effect of the independent variable on the dependent variable.

These hypotheses predict an effect, but stop short of saying what that effect will be.

A non-directional hypothesis is similar to composite and alternative hypotheses. All three types of hypothesis tend to make predictions without defining a direction. In a composite hypothesis, a specific prediction is not made (although a general direction may be indicated, so the overlap is not complete). For an alternative hypothesis, you often predict that the effect will be anything but the null hypothesis, which means it could be more or less than H0 (in other words, non-directional).

Let’s turn the above directional hypotheses into non-directional hypotheses.

Non-Directional Hypothesis Examples

Question | Non-Directional Hypothesis
Does adding a 10c charge to plastic bags at grocery stores lead to changes in uptake of reusable bags? | Adding a 10c charge to plastic bags in grocery stores will lead to a change in uptake of reusable bags.
Does a Universal Basic Income influence retail worker wages? | Universal Basic Income will affect retail worker wages.
Does rainy weather impact the amount of moderate to high intensity exercise people do per week in the city of Vancouver? | Rainy weather will affect the amount of moderate to high intensity exercise people do per week in the city of Vancouver.
Does introducing fluoride to the water system in the city of Austin impact number of dental visits per capita per year? | Introducing fluoride to the water system in the city of Austin will affect the number of dental visits per capita per year.
Does giving children chocolate rewards during study time for positive answers impact standardized test scores? | Giving children chocolate rewards during study time for positive answers will affect standardized test scores.

8. Logical Hypothesis

A logical hypothesis is a hypothesis that cannot be tested, but has some logical basis underpinning our assumptions.

These are most commonly used in philosophy because philosophical questions are often untestable and therefore we must rely on our logic to formulate logical theories.

Usually, we would want to turn a logical hypothesis into an empirical one through testing if we got the chance. Unfortunately, we don’t always have this opportunity because the test is too complex, expensive, or simply unrealistic.

Here are some examples:

  • Before the 1980s, it was hypothesized that the Titanic came to its resting place at 41° N and 49° W, based on the time the ship sank and the ship’s presumed path across the Atlantic Ocean. However, due to the depth of the ocean, it was impossible to test. Thus, the hypothesis was simply a logical hypothesis.
  • Dinosaurs closely related to alligators probably had green scales because alligators have green scales. However, as they are all extinct, we can only rely on logic and not empirical data.

9. Empirical Hypothesis

An empirical hypothesis is the opposite of a logical hypothesis. It is a hypothesis that is currently being tested using scientific analysis. We can also call this a ‘working hypothesis’.

We can separate research into two types: theoretical and empirical. Theoretical research relies on logic and thought experiments. Empirical research relies on tests that can be verified by observation and measurement.

So, an empirical hypothesis is a hypothesis that can and will be tested.

  • Raising the wage of restaurant servers increases staff retention.
  • Adding 1 lb of corn per day to cows’ diets decreases their lifespan.
  • Mushrooms grow faster at 22 degrees Celsius than 27 degrees Celsius.

Each of the above hypotheses can be tested, making them empirical rather than just logical (aka theoretical).

10. Statistical Hypothesis

A statistical hypothesis utilizes representative statistical models to draw conclusions about broader populations.

It requires the use of datasets or carefully selected representative samples so that statistical inference can be drawn across a larger dataset.

This type of research is necessary when it is impossible to assess every single possible case. Imagine, for example, if you wanted to determine if men are taller than women. You would be unable to measure the height of every man and woman on the planet. But, by conducting sufficient random samples, you would be able to predict with high probability that the results of your study would remain stable across the whole population.

You would be right in guessing that almost all quantitative research studies conducted in academic settings today involve statistical hypotheses.

Statistical Hypothesis Examples

  • Human Sex Ratio. The most famous statistical hypothesis example is John Arbuthnot’s sex-at-birth case study in 1710. Arbuthnot used birth data to determine with high statistical probability that there are more male births than female births. He called this divine providence, and to this day his finding remains true: more boys are born than girls.
  • Lady Tasting Tea. A 1935 study by Ronald Fisher involved testing a woman who believed she could tell whether milk was added to a cup before or after the tea. Fisher gave her 8 cups in random order, 4 of which had the milk poured first, and asked her to identify them. She was correct each time. Since there are 70 ways of choosing 4 cups out of 8, she had only a 1 in 70 chance of getting them all right by guessing, which is a statistically significant result.
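The arithmetic behind Fisher’s significance claim can be checked directly:

```python
from math import comb

# The lady tasting tea: 8 cups, 4 with milk poured first.  Guessing
# which 4 means picking one of C(8, 4) equally likely selections.
ways = comb(8, 4)     # 70
p_value = 1 / ways    # chance of a perfect score by luck alone
```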

11. Associative Hypothesis

An associative hypothesis predicts that two variables are linked but does not explore whether one variable directly impacts upon the other variable.

We commonly refer to this as “ correlation does not mean causation ”. Just because there are a lot of sick people in a hospital, it doesn’t mean that the hospital made the people sick. There is something going on there that’s causing the issue (sick people are flocking to the hospital).

So, in an associative hypothesis, you note correlation between an independent and dependent variable but do not make a prediction about how the two interact. You stop short of saying one thing causes another thing.

Associative Hypothesis Examples

  • Sick people in hospital. You could conduct a study hypothesizing that hospitals have more sick people in them than other institutions in society. However, you don’t hypothesize that the hospitals caused the sickness.
  • Lice make you healthy. In the Middle Ages, it was observed that sick people didn’t tend to have lice in their hair. The inaccurate conclusion was that lice were not only a sign of health, but that they made people healthy. In reality, there was an association here, but not causation: lice are sensitive to body temperature and flee bodies that have fevers.

12. Causal Hypothesis

A causal hypothesis predicts that two variables are not only associated, but that changes in one variable will cause changes in another.

A causal hypothesis is harder to prove than an associative hypothesis because the cause needs to be definitively proven. This will often require repeating tests in controlled environments with the researchers making manipulations to the independent variable, or the use of control groups and placebo effects .

If we were to take the above example of lice in the hair of sick people, researchers would have to put lice in sick people’s hair and see if it made those people healthier. Researchers would likely observe that the lice would flee the hair, but the sickness would remain, leading to a finding of association but not causation.

Causal Hypothesis Examples

Question | Causation Hypothesis | Correlation Hypothesis
Does marriage cause baldness among men? | Marriage causes stress, which leads to hair loss. | Marriage occurs at an age when men naturally start balding.
What is the relationship between recreational drugs and psychosis? | Recreational drugs cause psychosis. | People with psychosis take drugs to self-medicate.
Do ice cream sales lead to increased drownings? | Ice cream sales cause increased drownings. | Ice cream sales peak during summer, when more people are swimming and therefore more drownings occur.

13. Exact vs. Inexact Hypothesis

For brevity’s sake, I have paired these two hypotheses into one point. In reality, we’ve already seen both of these types of hypotheses at play.

An exact hypothesis (also known as a point hypothesis) makes a specific prediction, whereas an inexact hypothesis assumes a range of possible values without giving an exact outcome. As Helwig [2] argues:

“An “exact” hypothesis specifies the exact value(s) of the parameter(s) of interest, whereas an “inexact” hypothesis specifies a range of possible values for the parameter(s) of interest.”

Generally, a null hypothesis is an exact hypothesis whereas alternative, composite, directional, and non-directional hypotheses are all inexact.

See Next: 15 Hypothesis Examples

This is introductory information that is basic and indeed quite simplified for absolute beginners. It’s worth doing further independent research to get deeper knowledge of research methods and how to conduct an effective research study. And if you’re in education studies, don’t miss out on my list of the best education studies dissertation ideas .

[1] https://jnnp.bmj.com/content/91/6/571.abstract

[2] http://users.stat.umn.edu/~helwig/notes/SignificanceTesting.pdf



eMathZone

Simple Hypothesis and Composite Hypothesis

A simple hypothesis is one in which all parameters of the distribution are specified. For example, suppose the heights of college students are normally distributed with $${\sigma ^2} = 4$$, and we hypothesize that the mean $$\mu $$ is, say, $$62''$$; that is, $${H_o}:\mu = 62$$. We have stated a simple hypothesis, as the mean and variance together specify a normal distribution completely. A simple hypothesis, in general, states that $$\theta = {\theta _o}$$, where $${\theta _o}$$ is a specified value of the parameter $$\theta $$ ($$\theta $$ may represent $$\mu ,p,{\mu _1} - {\mu _2}$$, etc.).

A hypothesis which is not simple (i.e. in which not all of the parameters are specified) is called a composite hypothesis. For instance, if we hypothesize that $${H_o}:\mu > 62$$ (with $${\sigma ^2} = 4$$), or that $${H_o}:\mu = 62$$ and $${\sigma ^2} < 4$$, the hypothesis is composite because we cannot know the exact distribution of the population in either case. The parameters $$\mu > 62$$ and $${\sigma ^2} < 4$$ take more than one value, and no single value is being assigned. The general form of a composite hypothesis is $$\theta \leqslant {\theta _o}$$ or $$\theta \geqslant {\theta _o}$$; that is, the parameter $$\theta $$ does not exceed, or does not fall short of, a specified value $${\theta _o}$$. The concept of simple and composite hypotheses applies to both the null hypothesis and the alternative hypothesis.

Hypotheses may also be classified as exact and inexact. A hypothesis is said to be exact if it specifies a unique value for the parameter, such as $${H_o}:\mu = 62$$ or $${H_o}:p = 0.5$$. A hypothesis is called inexact when it indicates more than one possible value for the parameter, such as $${H_o}:\mu \ne 62$$ or $${H_o}:p > 0.5$$. A simple hypothesis must be exact, while an exact hypothesis is not necessarily simple. An inexact hypothesis is a composite hypothesis.


Composite Hypothesis

From class: Theoretical Statistics

A composite hypothesis is a type of statistical hypothesis that includes a range of possible values for a parameter, rather than specifying a single value. This concept is crucial when dealing with null and alternative hypotheses, as it allows researchers to consider multiple scenarios or conditions under which the data may be analyzed, providing a more flexible approach to hypothesis testing.


5 Must Know Facts For Your Next Test

  • Composite hypotheses can test multiple values for a parameter, which allows for broader applicability in real-world scenarios.
  • Unlike simple hypotheses that specify a single value, composite hypotheses acknowledge uncertainty in parameter estimation.
  • In hypothesis testing, rejecting the null hypothesis often implies accepting a composite alternative hypothesis rather than pinpointing a specific value.
  • The formulation of composite hypotheses is critical in experiments where parameters are not fixed and can vary based on different conditions.
  • Composite hypotheses are common in many statistical tests, including t-tests and ANOVA, where researchers evaluate the effects across groups rather than focusing on individual outcomes.

Review Questions

  • A composite hypothesis differs from a simple hypothesis by encompassing a range of values for the parameter being tested rather than just one specific value. While simple hypotheses make concrete assertions that can be clearly accepted or rejected, composite hypotheses allow researchers to explore multiple possibilities within their data. This flexibility is particularly useful in real-world applications where parameters are not easily defined.
  • Using composite hypotheses in research studies has significant implications for how results are interpreted and understood. Since these hypotheses consider various potential outcomes, they provide a more comprehensive view of the data and its variability. This approach helps researchers avoid oversimplification and encourages deeper exploration of how different factors may influence results, ultimately leading to more robust conclusions.
  • Employing composite hypotheses offers several advantages, such as greater flexibility and the ability to encompass a wider range of scenarios that can occur in real-life situations. They facilitate a better understanding of variability within data sets. However, this complexity can also introduce challenges; for example, it may complicate the interpretation of results and increase the likelihood of Type I and Type II errors due to multiple comparisons. Balancing these factors is essential when deciding which type of hypothesis to use in statistical analyses.

Related terms

null hypothesis : A statement that assumes there is no effect or no difference in a given situation, serving as the starting point for statistical testing.

alternative hypothesis : The hypothesis that proposes an effect or a difference exists, challenging the null hypothesis in statistical testing.

parameter : A numerical characteristic or measure of a population, such as the mean or standard deviation, that is estimated through statistical methods.


Composite Hypothesis Test


What is a Composite Hypothesis Test?

A composite hypothesis test contains more than one parameter and more than one model. In a simple hypothesis test, the probability density functions for both the null hypothesis (H0) and the alternative hypothesis (H1) are known. In academic and hypothetical situations, the simple hypothesis test works for most cases. However, in real life it’s much more challenging to specify all of the PDFs for a particular situation.

Approaches to Composite Hypothesis Testing


  • Bayesian approach : the unknown parameter is assigned a prior PDF.
  • Generalized likelihood ratio test approach : the unknown parameter is estimated and placed into a likelihood ratio test.
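As a rough numerical sketch of the second approach (the sample values, σ, and μ0 are all invented for illustration), the unknown mean is replaced by its maximum-likelihood estimate and plugged into the likelihood ratio:

```python
import math

# Hypothetical normal sample with known sigma; test H0: mu = mu0 against an
# unrestricted alternative using the generalized likelihood ratio.
sample = [101.2, 99.5, 103.1, 98.7, 102.4, 100.9]
sigma = 2.0
mu0 = 100.0

mu_hat = sum(sample) / len(sample)  # MLE of the unknown parameter

def log_lik(mu):
    return sum(-0.5 * math.log(2 * math.pi * sigma ** 2)
               - (x - mu) ** 2 / (2 * sigma ** 2) for x in sample)

# GLRT statistic: twice the log of the maximized likelihood ratio; under H0 it
# is asymptotically chi-square with 1 degree of freedom.
lam = 2 * (log_lik(mu_hat) - log_lik(mu0))
print(round(mu_hat, 2), round(lam, 2))
```

For this normal/known-σ case the statistic reduces algebraically to n(μ̂ − μ0)²/σ², which is a convenient check on the computation.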

Composite Null Hypothesis

In real life, null hypotheses are usually composite unless the problem is very simple. An example of a composite hypothesis, which has multiple possible values, is: H 0 : μ ≥ 100
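A minimal sketch of testing such a composite null (the data and σ are hypothetical): the test statistic is evaluated at the boundary value μ = 100, the least favorable point of H0: μ ≥ 100, and H0 is rejected for sufficiently small values.

```python
import math

# Hypothetical data: sample of n measurements (values assumed for illustration).
sample = [92.1, 97.4, 88.0, 95.2, 90.3, 93.8, 89.5, 96.0]
n = len(sample)
xbar = sum(sample) / n

# Assume a known population standard deviation for a z-test sketch.
sigma = 4.0

# For the composite null H0: mu >= 100, evaluate the statistic at the
# boundary value mu0 = 100.
mu0 = 100.0
z = (xbar - mu0) / (sigma / math.sqrt(n))

# Reject H0 at the 5% level if z falls below the lower-tail critical value.
z_crit = -1.645  # standard normal 5th percentile (one-sided)
reject = z < z_crit
print(round(z, 2), reject)
```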



Definition: Simple and composite hypothesis

Definition: Let $H$ be a statistical hypothesis. Then,

$H$ is called a simple hypothesis, if it completely specifies the population distribution; in this case, the sampling distribution of the test statistic is a function of sample size alone.

$H$ is called a composite hypothesis, if it does not completely specify the population distribution; for example, the hypothesis may only specify one parameter of the distribution and leave others unspecified.
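As a concrete illustration of the definition (the particular numbers are invented), a hypothesis about a normally distributed population is simple only when it pins down every parameter:

```latex
% Simple: the population distribution is completely specified
H_0 : X \sim \mathcal{N}(100, 15^2)

% Composite: the mean is fixed but the variance is left unspecified
H_1 : X \sim \mathcal{N}(100, \sigma^2), \quad \sigma^2 > 0 \text{ unknown}

% Composite: only a range is specified for the mean
H_2 : \mu \geq 100
```

Under $H_0$ the sampling distribution of a test statistic depends on the sample size alone; under $H_1$ or $H_2$ it also depends on the unspecified parameter.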


Composite Hypothesis

In subject area: Mathematics

Classically, composite hypotheses are used to determine if a point null is statistically distinguishable from the best alternative, or to determine if the best supported alternative lies on a specified side of the point null.

From: Philosophy of Statistics , 2011

Chapters and Articles


Special Kinds of Theorems

Antonella Cupillari , in The Nuts and Bolts of Proofs (Fourth Edition) , 2013

Composite Statements

The hypothesis and/or the conclusion of a theorem might be composite statements that include the words “and,” “or.” Because of the more complicated structure of this kind of statement, we have to pay even more attention to details. After analyzing a composite statement, we can check whether it is possible to break it down into simpler parts, which can then be proved by using any of the principles and techniques already seen. Other times we will replace the original statement with another logically equivalent statement that is easier to handle.

Multiple Hypotheses

Multiple hypotheses statements are statements whose hypotheses are composite statements, such as “If A and B , then C ,” and “If A or B , then C .” Let us start by examining statements of the form “If A and B , then C .”

Proving that such a statement is true does not require any special technique, and some of these statements have already been included in previous sections. The main characteristic of this kind of statement is that the composite statement “ A and B ” contains several pieces of information, and we need to make sure that we use all of them during the construction of the proof. If we do not, we are proving a statement different from the original. Always remember to consider possible implicit hypotheses.

If b is a multiple of 2 and of 5, then b is a multiple of 10.

Hypothesis A: The number b is a multiple of 2.

Hypothesis B: The number b is a multiple of 5. (Implicit hypothesis: The number b is an integer, and all properties and operations of integer numbers can be used.)

Conclusion: The number b is a multiple of 10.

By hypothesis A , the number b is a multiple of 2. So, b = 2 n for some integer n . The other hypothesis, B , states that b is a multiple of 5. Therefore, b = 5 k for some integer k . Thus, 5 k = 2 n .

Because 5 k is even, and 5 is odd, we conclude that k is even. Thus, k = 2 t for some integer number t . This implies that b = 5 k = 5(2 t ) = 10 t for some integer number t . Therefore the number b is a multiple of 10.
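The claim proved above can also be checked exhaustively over a finite range, as a sanity check rather than a proof:

```python
# Every integer in the range that is a multiple of both 2 and 5 must also be a
# multiple of 10; the assertion fails if a counterexample is found.
for b in range(-1000, 1001):
    if b % 2 == 0 and b % 5 == 0:
        assert b % 10 == 0, f"counterexample: {b}"
print("verified for all |b| <= 1000")
```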

The proof of a statement of the form “If A and B , then C ” can be constructed using its contrapositive, which is “If ‘not C ,’ then either ‘not A ’ or ‘not B .’” (You might want to review the truth tables for constructing the negation of a composite statement introduced in the section “The Negation of a Statement.”) This is a statement with multiple conclusions, which is part of the next topic presented.

Let us construct another proof for Example 3.24 using the contrapositive of the original statement, just to become more familiar with this kind of statement. “If the number b is not a multiple of 10, then either b is not a multiple of 2 or b is not a multiple of 5.”

The two parts of the conclusion are “ b is not a multiple of 2” and “ b is not a multiple of 5.” To prove that the conclusion is true, it is enough to prove that at least one of the two parts is true. (Keep reading for more details regarding this kind of statement.)

Assume that the number b is not a multiple of 10. Then, by the division algorithm, b = 10 q + r with q and r integers and 1 ≤ r ≤ 9.

If r is an even number (i.e., 2, 4, 6, 8), then we can write r = 2 t , with t a positive integer, and 1 ≤ t ≤ 4. So b = 10 q + 2 t = 2(5 q + t ).

The number 5 q + t is an integer, so the number b is divisible by 2. But b is not divisible by 5 because r is not divisible by 5. Thus, in this case the conclusion is true because its second part is true. If r is an odd number (i.e., 3, 5, 7, 9), then b is not divisible by 2. In this case the conclusion is true as well because its first part is true.

The statement in Example 3.24 can also be proved using contradiction. In this case we start by assuming that b is a multiple of 2 and 5 and it is not a multiple of 10.

We will now consider statements of the form “If A or B , then C .” In this kind of statement we know that the hypothesis “ A or B ” is true. This can possibly mean that:

Part A of the statement is true.

Part B of the statement is true.

Both parts A and B are true.

Because we do not know which one of the three cases to consider, we must examine all of them. It is important to notice that it is sufficient to concentrate on the first two cases, because the third case is a special (weaker) case that combines the first two. Therefore, the proof of a statement of the form “If A or B , then C ” has two parts (two cases):

Case 1. “If A , then C .”

Case 2. “If B , then C .”

To be sure that the logic of this procedure is indeed correct, we can construct the truth tables that prove that the statements “If A or B , then C ” and “(If A , then C) and (If B , then C )” are logically equivalent.

Thus, to prove the statement “If A or B , then C ,” one has to prove that the two simpler statements “If A , then C ” and “If B , then C ” are both true.

Let x , y , and z be counting numbers. If x is a multiple of z or y is a multiple of z , then their product xy is a multiple of z .

Case 1. Let x be a multiple of z . Then, x = kz , with k an integer (positive since x > 0, z > 0). Therefore, x y = ( k z ) y = ( k y ) z .

The number ky is a positive integer because k and y are positive integers. So xy is a multiple of z .

Case 2. Let y be a multiple of z . Then, y = nz with n an integer (positive since y > 0, z > 0). Therefore, x y = x ( n z ) = ( x n ) z .

The number xn is a positive integer because x and n are positive integers. So xy is a multiple of z .

The proof of a statement of the form “If A or B , then C ” can be constructed using its contrapositive, which is “If ‘not C ,’ then ‘not A ’ and ‘not B .’” (You might want to review the truth tables for constructing the negation of a composite statement introduced in the section “The Negation of a Statement.”) This is again a statement with multiple conclusions, which is part of the next topic.

The contrapositive of the original statement in Example 3.25 is the statement “Let x , y , and z be counting numbers. If the product xy is not a multiple of z , then x is not a multiple of z and y is not a multiple of z .”

Multiple Conclusions

The most common kinds of multiple conclusion statements are

If A , then B and C .

If A , then B or C .

We will consider these statements in some detail.

The proof of this kind of statement has two parts:

If A , then B .

If A , then C .

Indeed, we need to prove that each one of the possible conclusions is true, because we want all of them to hold. If we have already completed the proof that one of the two (or more) implications is true, we can use that result to prove the remaining ones (if needed).

The lines y = 2 x + 1 and y = −3 x + 2 are not perpendicular and they intersect in exactly one point.

The two lines have equations y = 2 x + 1 and y = −3 x + 2. (Implicit hypothesis: All the properties and relations between lines can be used.)

Conclusions

The lines are not perpendicular.

The lines intersect in exactly one point.

Part 1. If A , then B .

Two lines are perpendicular if their slopes, m and m 1 , satisfy the equation m = −1/ m 1 , unless one of them is horizontal and the other vertical, in which case one slope is equal to zero and the other is undefined. The first line has slope 2, the second has slope −3. So the lines are neither horizontal nor vertical. In addition −3 ≠ −1/2. Thus, the lines are not perpendicular.

Part 2. If A , then C .

This second part is an existence and uniqueness statement: There is one and only one point belonging to both lines.

The given lines are distinct and nonparallel (since they have different slopes); therefore they have only one point in common.

We can find the coordinates of the point(s) in common by solving the system y = 2 x + 1, y = −3 x + 2.

By substitution we have 2 x + 1 = − 3 x + 2 .

The only solution of this equation is x = 1/5. The corresponding value of the y variable is y = 2(1/5) + 1 = 7/5. Therefore, the lines have in common the point with coordinates (1/5, 7/5). This point is unique because its coordinates represent the only solution of the system formed by the equations of the two lines.
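The substitution step can be mirrored in a few lines of code with exact rational arithmetic, confirming the single intersection point:

```python
from fractions import Fraction

# Substitution: 2x + 1 = -3x + 2  =>  5x = 1  =>  x = 1/5.
x = Fraction(1, 5)
y1 = 2 * x + 1   # y on the first line
y2 = -3 * x + 2  # y on the second line
assert y1 == y2 == Fraction(7, 5)  # unique common point (1/5, 7/5)

# The slopes 2 and -3 are not negative reciprocals (2 != -1/(-3)),
# so the lines are not perpendicular.
assert Fraction(-1, -3) != 2
print(x, y1)
```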

If a number is even, then its second power is divisible by 4 and its sixth power is divisible by 64.

The number n is even. (Implicit hypothesis: All the properties and operations of integer numbers can be used.)

The number n 2 is divisible by 4.

The number n 6 is divisible by 64.

By hypothesis the number n is even. Therefore, n = 2 t for some integer number t . This implies that n 2 = 4 t 2 .

As the number t 2 is an integer, it is true that n 2 is divisible by 4.

This implication can be proved in two ways.

Method 1: By hypothesis, the number n is even. Therefore n = 2 t for some integer number t . This implies that n 6 = 64 t 6 .

As the number t 6 is an integer, it is true that n 6 is divisible by 64.

Method 2: We can use the result established in part 1, since that part of the proof is indeed complete. When n is even, then n 2 = 4 k for some integer number k . Then we have n 6 = ( n 2 ) 3 = ( 4 k ) 3 = 64 k 3 .

As k 3 is an integer, it is true that n 6 is divisible by 64.

In this case we need to show that given A , then either B or C is true (not necessarily both). This means that we need to prove that at least one of the possible conclusions is true; that is, if one of the two conclusions is false, the other must be true. Thus, the best way to prove this kind of statement is to use the following one, which is logically equivalent to it: “If A and (not B ), then C .”

It might be useful to consider the truth tables for the two statements “If A , then B or C ” and “If A and (not B ), then C .”

Similarly one can prove that the statements “If A , then B or C ” and “If A and (not C ), then B ” are logically equivalent.

Let n be a composite number larger than 1. Then n has at least one nontrivial factor smaller than or equal to √n.

The number n is a composite number larger than 1. Thus, n = pq with 1 < p < n and 1 < q < n . (Implicit hypothesis: We can use all of the properties of counting and prime and nonprime numbers, divisibility, and square roots.)

Then either p ≤ √n or q ≤ √n.

We will start by assuming that n = p q with 1 < p < n and 1 < q < n , and p > √n. Multiplying this last inequality by q yields q p > q √n, that is, n > q √n.

This implies √n > q . Thus it is true that √n ≥ q . (Note that if p = q , then p = q = √n.)

The result stated in Example 3.28 is used to improve the speed of the search for possible prime factors of numbers.
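A sketch of that speed-up: trial division needs to scan candidate factors only up to √n, since by the result above any composite n is guaranteed a nontrivial factor in that range.

```python
import math

def smallest_nontrivial_factor(n):
    """Return the smallest factor of n in [2, isqrt(n)], or None if n is prime."""
    for p in range(2, math.isqrt(n) + 1):
        if n % p == 0:
            return p
    return None

# 91 = 7 * 13, and 7 <= sqrt(91); 97 is prime, so no factor is found.
assert smallest_nontrivial_factor(91) == 7
assert smallest_nontrivial_factor(97) is None
```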

If x is a rational number and y is an irrational number, their sum x + y is an irrational number.

Hypothesis A: The sum x + y is a rational number. (Implicit hypothesis: As rational and irrational numbers are real numbers, we can use all the properties of real numbers and their operations.) The fact that the numbers are called x and y is irrelevant. We can use any two symbols. We will keep using x and y to be consistent with the original statement.

Conclusion B: The number x is irrational.

Conclusion C: The number y is rational.

Therefore, we plan to prove the equivalent statement “If A and (not B) , then C .”

Assume that the number x + y is rational and so is the number x . Therefore, using the definition of rational numbers we can write x + y = n / p , with p ≠ 0, and n and p integer numbers.

As x is rational, we can write x = a / b with b ≠ 0, and a and b integer numbers. Thus, we have a / b + y = n / p .

If we solve this equation for y , we obtain y = n / p − a / b = ( n b − a p ) / p b with pb ≠ 0 since p ≠ 0 and b ≠ 0. The numbers nb − ap and pb are integers because n , p , a , and b are integer numbers. This information allows us to conclude that y is a rational number. As we have proved the contrapositive of the original statement to be true, the original statement is also true.

Let a be an even number, with | a | >16. Then, either a ≥ 18 or a ≤ −18.

The number a is even. (Implicit hypothesis: We can use the properties and operations of integer numbers.) The fact that the number is called a is irrelevant.

| a | > 16.

Moreover, B is a composite statement. Indeed, B can be written as “B 1 or B 2 ,” with B 1 : a > 16 and B 2 : a < −16.

Thus the original statement can be rewritten as:

If ( a is even and a > 16) or ( a is even and a < −16), then either a ≥ 18 or a ≤ −18.

The presence of an “or” in the hypothesis suggests the construction of a proof by cases.

Case 1. We will prove the statement: “If a is an even number and a > 16, then either a ≥ 18 or a ≤ −18.” As a is even and larger than 16, then it must be at least 18. Thus, a ≥ 18, and the conclusion is true.

Case 2. We will prove the statement: “If a is an even number and a < −16, then either a ≥ 18 or a ≤ −18.” As a is even and smaller than −16, then it cannot be −17, so it must be at most −18. Therefore, a ≤ −18, and the conclusion is true.

Prove the following statements.

If x 2 = y 2 and x ≥ 0, y ≥ 0, then x = y .

If a function f is even and odd, then f ( x ) = 0 for all x in the domain of the function. (See “Some Facts and Properties of Functions” at the front of the book for the definitions of even and odd functions.)

If n is a positive multiple of 3, then either n is odd or it is a multiple of 6.

If x and y are two real numbers such that x 4 = y 4 , then either x = y or x = − y .

Let a and b be two nonzero numbers. If a divides b and b divides a , then a = ± b .

Let n be an integer number. Then, either n 2 or n 2 − 1 is divisible by 4.

Let a , b , and m be three positive numbers. If either a divides m or b divides m , then d = GCD ( a , b ) divides m .

Let a , b , and m be three positive numbers. If a divides m and b divides m , then L = lcm ( a , b ) divides m .

Let a , b , and c be a Pythagorean triple of integers, i.e., a 2 + b 2 = c 2 . Then the product of the three numbers, abc , is even.

Fill in all the details and outline the following proof of the Rational Zero Theorem:

Let z be a rational zero of the polynomial P ( x ) = a_n x^n + a_{n−1} x^{n−1} + … + a_0 , which has all integer coefficients with a_n ≠ 0 and a_0 ≠ 0, and n ≥ 1. Let z = p / q be written in its lowest terms, with q ≠ 0. Then q divides a_n and p divides a_0 .

By hypothesis P ( z ) = 0. So a_n ( p / q )^n + a_{n−1} ( p / q )^{n−1} + … + a_1 ( p / q ) + a_0 = 0.

Therefore (why?): a_n p^n + a_{n−1} p^{n−1} q + … + a_1 p q^{n−1} + a_0 q^n = 0. (*)

Thus, a_n p^n = −q ( a_{n−1} p^{n−1} + … + a_1 p q^{n−2} + a_0 q^{n−1} ) (why?).

This can be rewritten as a_n p^n = −q t , where t = a_{n−1} p^{n−1} + … + a_1 p q^{n−2} + a_0 q^{n−1} is an integer (why?). This implies that q divides a_n p^n . As p and q have no common factors (why?), q divides a_n . We can use equation (*) to obtain (why?): a_0 q^n = −p ( a_n p^{n−1} + a_{n−1} p^{n−2} q + … + a_1 q^{n−1} ).

This can be rewritten as a_0 q^n = −p s , where s = a_n p^{n−1} + a_{n−1} p^{n−2} q + … + a_1 q^{n−1} is an integer (why?). Thus, p divides a_0 q^n . Because p and q have no common factors (why?), p must divide a_0 .
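A small sketch of the theorem in use (the example polynomial is invented for illustration): every rational zero must appear among the fractions p/q with p dividing a_0 and q dividing a_n, so the candidates can simply be enumerated and tested.

```python
from fractions import Fraction

def rational_zero_candidates(coeffs):
    """Candidates p/q for rational zeros of a_n x^n + ... + a_0 (integer coeffs).

    By the Rational Zero Theorem, p must divide a_0 and q must divide a_n.
    coeffs are given from a_n down to a_0.
    """
    a_n, a_0 = coeffs[0], coeffs[-1]
    divisors = lambda m: [d for d in range(1, abs(m) + 1) if m % d == 0]
    return {Fraction(s * p, q)
            for p in divisors(a_0) for q in divisors(a_n) for s in (1, -1)}

def evaluate(coeffs, x):
    # Horner's rule, exact arithmetic with Fractions.
    result = Fraction(0)
    for c in coeffs:
        result = result * x + c
    return result

# P(x) = 2x^3 - 3x^2 - 3x + 2: rational zeros lie among p/q with p | 2, q | 2.
coeffs = [2, -3, -3, 2]
zeros = sorted(x for x in rational_zero_candidates(coeffs)
               if evaluate(coeffs, x) == 0)
print(zeros)
```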

Antonella Cupillari , in The Nuts and Bolts of Proofs (Third Edition) , 2005

COMPOSITE STATEMENTS

The hypothesis and conclusion of a theorem might be composite statements. Because of the more complicated structure of this kind of statement, we have to pay very close attention. After analyzing a composite statement, we can check if it is possible to break it down into simpler parts, which can then be proved by using any of the principles and techniques already seen. Other times we will replace the original statement with another logically equivalent to it, but easier to handle.

MULTIPLE HYPOTHESES

Multiple hypotheses statements are statements for which the hypotheses are composite statements, such as “If A and B , then C ” and “If A or B , then C .”

Let us start by examining statements of the form “If A and B , then C .”

Proving that such a statement is true does not require any special technique, and some of these statements have already been included in previous sections. The main characteristic of this kind of statement is that the composite statement “ A and B ” contains several pieces of information, and we need to make sure that we use all of them during construction of the proof. If we do not, we are proving a statement different from the original. Always remember to consider possible implicit hypotheses.

The number b is a multiple of 5.

( Implicit hypothesis: All the properties and operations of integer numbers can be used.)

By hypothesis A , the number b is a multiple of 2. So, b = 2 n for some integer n . The other hypothesis, B , states that b is a multiple of 5. Therefore, b = 5 k for some integer k . Thus, 2 n = 5 k .

Because 2 n is divisible by 5, and 2 is not divisible by 5, we conclude that n is divisible by 5. Thus, n = 5 t for some integer number t . This implies that: b = 2 n = 2 ( 5 t ) = 10 t

for some integer number t . Therefore, the number b is a multiple of 10.

The proof of a statement of the form “If A and B , then C ” can be constructed using its contrapositive, which is “If ‘not C ,’ then either ‘not A ’ or ‘not B .’” (You might want to review the truth tables for constructing the negation of a composite statement introduced in the How To Construct the Negation of a Statement section.) This is a statement with multiple conclusions which is part of the next topic presented.

“If the number b is not a multiple of 10, then either b is not a multiple of 2 or b is not a multiple of 5”.

Assume that the number b is not a multiple of 10. Then, by the division algorithm, b = 10 q + r

where q and r are integers and 1 ≤ r ≤ 9.

If r is an even number ( i.e. , 2, 4, 6, 8), then we can write r = 2 t , with t a positive integer, and 1 ≤ t ≤ 4. So, b = 10 q + 2 t = 2(5 q + t ).

The number 5 q + t is an integer, so the number b is divisible by 2. But b is not divisible by 5 because r is not divisible by 5.

Thus, in this case the conclusion is true because its second part is true.

If r is an odd number ( i.e. , 3, 5, 7, 9), then b is not divisible by 2. In this case, the conclusion is true as well because its first part is true.

We will now consider statements of the form “If A or B , then C .”

In this kind of statement, we know that the hypothesis “ A or B ” is true. This can possibly mean that:

Part A of the statement is true,

Part B of the statement is true,

Because we do not know which one of the three cases to consider, we must examine all of them. It is important to notice that it is sufficient to concentrate on the first two cases, because the third case is a stronger case that combines the first two. Therefore, the proof of a statement of the form “If A or B , then C ” has two parts (two cases):

Let x, y , and z be counting numbers. If x is a multiple of z or y is a multiple of z , then their product xy is a multiple of z .

Case 1. Let x be a multiple of z . Then x = kz with k integer (positive because x > 0, z > 0). Therefore, x y = ( k z ) y = ( k y ) z .

The number ky is a positive integer because k and y are positive integers. So xy is a multiple of z .

Case 2. Let y be a multiple of z . Then y = nz with n integer (positive because y > 0, z > 0). Therefore, x y = x ( n z ) = ( x n ) z .

The number xn is a positive integer because x and n are positive integers. So xy is a multiple of z .

The proof of a statement of the form “If A or B , then C ” can be constructed using its contrapositive, which is “If ‘not C ,’ then ‘not A ’ and ‘not B .’” (You might want to review the truth tables for constructing the negation of a composite statement introduced in the How To Construct the Negation of a Statement section.) This is again a statement with multiple conclusions which is part of the next topic.

The contrapositive of the original statement in Example 2 is the statement:

“Let x, y , and z be counting numbers. If the product xy is not a multiple of z , then x is not a multiple of z and y is not a multiple of z ”

MULTIPLE CONCLUSIONS

The most common kinds of multiple conclusion statements are:

If A , then B and C ;

If A , then B or C .

Indeed, we need to prove that each one of the possible conclusions is true, because we want all of them to hold. If we have already completed the proof that one of the two (or more) implications is true, we can use it to prove the remaining ones (if needed).

The lines y = 2 x + 1 and y = −3 x + 2 are not perpendicular, and they intersect in exactly one point.

The two lines have equations y = 2 x + 1 and y = −3 x + 2.

( Implicit hypothesis: All the properties and relations between lines can be used.)

Two lines are perpendicular if their slopes, m and m 1 , satisfy the equation m = −1/ m 1 , unless one of them is horizontal and the other vertical, in which case one slope is equal to zero and the other is undefined.

The first line has slope 2 and the second has slope −3, so the lines are neither horizontal nor vertical. In addition, −3 ≠ −1/2. Thus, the lines are not perpendicular.

The given lines are distinct and nonparallel (as they have different slopes); therefore, they have only one point in common.

We can find the coordinates of the point(s) in common by solving the system y = 2 x + 1, y = −3 x + 2.

By substitution we have: 2 x + 1 = − 3 x + 2.

The only solution of this equation is x = 1/5.

The corresponding value of the y variable is y = 2(1/5) + 1 = 7/5.

Therefore, the lines have in common the point with coordinates (1/5, 7/5).

This point is unique because its coordinates represent the only solution of the system formed by the equations of the two lines.

The number n is even.

By hypothesis the number n is even. Therefore, n = 2 t for some integer number t . This implies that: n 2 = 4 t 2 .

By hypothesis the number n is even. Therefore, n = 2 t for some integer number t . This implies that: n 6 = 64 t 6 .

We can use the result established in Part 1, as that part of the proof is indeed complete. When n is even, then n 2 = 4 k for some integer number k . Then we have: n 6 = ( n 2 ) 3 = ( 4 k ) 3 = 64 k 3 .

Because k 3 is an integer, it is true that n 6 is divisible by 64.

In this case, we need to show that given A , then either B or C is true (not necessarily both). This means that we need to prove that at least one of the possible conclusions is true; that is, if one of the two conclusions is false, then the other must be true. Thus, the best way to prove this kind of statement is to use the following one, which is logically equivalent to it: “If A and (not B ), then C .”

A  B  C  |  B or C  |  If A, then B or C
T  T  T  |    T     |         T
T  T  F  |    T     |         T
T  F  T  |    T     |         T
T  F  F  |    F     |         F
F  T  T  |    T     |         T
F  T  F  |    T     |         T
F  F  T  |    T     |         T
F  F  F  |    F     |         T

A  B  C  |  not B  |  A and (not B)  |  If A and (not B), then C
T  T  T  |    F    |        F        |             T
T  T  F  |    F    |        F        |             T
T  F  T  |    T    |        T        |             T
T  F  F  |    T    |        T        |             F
F  T  T  |    F    |        F        |             T
F  T  F  |    F    |        F        |             T
F  F  T  |    T    |        F        |             T
F  F  F  |    T    |        F        |             T

The number n is a composite number larger than 1.

Thus, n = pq with 1 < p < n and 1 < q < n .

( Implicit hypothesis: We can use all properties of counting and prime, non-prime numbers, divisibility, and the properties of square roots.)

Then either p ≤ √n or q ≤ √n.

We will start by assuming that: n = p q

where 1 < p < n and 1 < q < n , and p > √n.

Multiplying this last inequality by q yields: q p > q √n,

that is, n > q √n.

This implies √n > q . Thus, it is true that √n ≥ q .

(Note that if p = q , then p = q = √n.)

The result stated in Example 5 is used to improve the speed of the search for possible prime factors of numbers.

If x is a rational number and y is an irrational number, their sum, x + y , is an irrational number.

The sum x + y is a rational number.

( Implicit hypothesis: As rational and irrational numbers are real numbers, we can use all the properties of real numbers and their operations.)

The fact that the numbers are called x and y is irrelevant. We can use any two symbols. We will keep using x and y to be consistent with the original statement.

Therefore, we plan to prove the equivalent statement “If A and ‘not B ,’ then C .”

Assume that the number x + y is rational and so is the number x .

Therefore, using the definition of rational numbers, we can write: x + y = n / p

where p ≠ 0, and n and p are integer numbers.

As x is rational, we can write x = a/b with b ≠ 0, where a and b are integer numbers. Thus, we have: a / b + y = n / p

If we solve this equation for y , we obtain: y = n / p − a / b = ( n b − a p ) / p b

where pb ≠ 0 because p ≠ 0 and b ≠ 0.

The numbers nb − ap and pb are integers because n, p, a , and b are integer numbers.

This information allows us to conclude that y is indeed a rational number.

As we have proved the contrapositive of the original statement to be true, the original statement is also true.

Let a be an even number, with | a | > 16. Then either a ≥ 18 or a ≤ −18.

The number a is even.

( Implicit hypothesis: We can use properties and operations of integer numbers.)

The fact that the number is called a is irrelevant.

Moreover, B is a composite statement. Indeed, B can be written as B 1 or B 2 , with B 1   :   a > 16   and B 2   :   a < − 16.

If (a is even and a > 16) or (a is even and a < −16),

then either a ≥ 18 or a ≤ −18.

If a is an even number and a > 16,

As a is even and larger than 16, then it must be at least 18. Thus, a ≥ 18, and the conclusion is true.

If a is an even number and a < −16,

As a is even and smaller than −16, then it cannot be −17, so it must be at most −18. Therefore, a ≤ −18, and the conclusion is true.

Significance Testing

Michael Dickson , Davis Baird , in Philosophy of Statistics , 2011

2.2.3 Composite Hypotheses and Independence

Thus far, we have considered so-called ‘simple’ hypotheses, i.e., hypotheses that propose a specific distribution for the variable of interest. Often one is interested, instead, in ‘composite’ hypotheses, which do not propose a specific distribution, but a range of distributions. For example, one common hypothesis is that two variables are probabilistically independent of one another. Consider, for example, a case of sampling two binomial variables, X ∈ {0,1} and Y ∈ {0,1}. Each sample produces one of four possible results: 〈0,0〉, 〈0,1〉, 〈1,0〉, 〈1,1〉. The problem may thus apparently be treated as sampling a single variable with four possible results. The hypothesis of independence, however, does not specify a single distribution for this 4-valued variable, but rather a family of distributions, in each of which the variables X and Y are probabilistically independent. (Of course, similar remarks hold for the multinomial case.)

It is far from clear what the relationship should be between significance tests for each of many simple hypotheses (to which the χ 2 -test most clearly applies) and inferences regarding the composite hypothesis comprised of these simple hypotheses. One may, on the one hand, wish to suppose that if each of the individual simple hypotheses should be rejected, then so should the composite hypothesis — after all, it is merely a disjunction of the individually rejected simple hypotheses. On the other hand, recall that rejection here is a probabilistic judgement: we reject an hypothesis because it makes the observed data very unlikely. Does the fact that each of several hypotheses makes the observed data unlikely entail that the group as a whole ought to be rejected? It is far from clear that the answer is ‘yes’ — that answer lies perilously close to the lottery paradox.

In some cases — including the important case of hypotheses of independence — this issue can be, at least for the moment, sidestepped. One may use observed data to make estimates of specific probabilities, in accord with the composite hypothesis (for example, the hypothesis of independence). The result is a simple hypothesis that can be tested in the usual way by a χ 2 -test.

A case of historical, theoretical, and practical importance is the ‘fourfold contingency table’ (nowadays called a ‘2 × 2’ table), where we wish to judge the independence of the two variables X and Y , as described above. We begin with the observed frequencies, described abstractly in Table 1 .

Table 1 . Observed data for two binomial variables. The variables a , b , c , d are arbitrary natural numbers.

We may use this data to generate estimates of the four possible results, but in so doing, we must also respect the hypothesis of independence. Hence, for example, the joint probability p (〈0,0〉) must be the product of marginals: p (〈0,0〉) = p_X (0) · p_Y (0).

The result is a simple hypothesis about the underlying (population) distribution, shown in Table 2 . (Note that it is sufficient to calculate just one cell from Table 2 , as the others can be determined from the result of that calculation together with the marginal totals.)

Table 2 . Expected (predicted) data for two independent binomial variables.

A standard χ 2 -test may then be applied to the observed and hypothesized data in Tables 1 and 2 . We discuss this case further below ( Section 2.4 ).
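A minimal sketch of that computation for a 2 × 2 table, with invented counts standing in for a, b, c, d: the expected cells come from the product-of-marginals rule, and the statistic is compared with the 5% critical value of χ² with 1 degree of freedom rather than a full p-value.

```python
# Sketch of the chi-square test of independence for a 2x2 table, using
# hypothetical observed counts a, b, c, d (values invented for illustration).
a, b, c, d = 30, 10, 20, 40          # observed counts for (X, Y) in {0,1}^2
n = a + b + c + d

# Expected counts under independence: n times the product of marginals,
# e.g. E[<0,0>] = n * ((a+b)/n) * ((a+c)/n).
observed = [[a, b], [c, d]]
row = [a + b, c + d]
col = [a + c, b + d]
expected = [[row[i] * col[j] / n for j in range(2)] for i in range(2)]

chi2 = sum((observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
           for i in range(2) for j in range(2))

# A 2x2 table has 1 degree of freedom; 3.841 is the 95th percentile of chi2(1).
print(round(chi2, 2), chi2 > 3.841)
```

Note that, as the text observes, computing one expected cell would suffice; the rest follow from the marginal totals.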

Evidence, Evidence Functions, and Error Probabilities

Mark L. Taper , Subhash R. Lele , in Philosophy of Statistics , 2011

7 Selecting between Composite Hypotheses

We suggest that, under the evidential paradigm, the composite hypothesis problem be recast as a model selection problem among models with different numbers of free parameters. In the simple example given above H 0 is a model with no free parameters while H 1 is a family of models indexed by the free parameter μ. Model selection using information criteria 22 compares models by estimating from data their relative Kulback-Leibler distance to truth ( Burnham and Anderson 2002 ). This is a reasonable evidence function. With multiple models, all models are compared to the model with the lowest estimated KL distance to truth. The model selection procedures are blind to whether the suite of candidate models is partitioned into composite hypotheses. One can consider that the hypothesis that contains the best supported model is the hypothesis best supported by the data. No longer comparing all points in one hypothesis to all points in another, but in effect, comparing the best to the best. Where best is defined as the model with the lowest information criterion value. This solution is neither ad hoc (to the particular case) nor post hoc (after the fact/data). The comparison of composite hypotheses using information criteria is not a toy procedure, and can do real scientific work. Taper and Gogan [2002] in their study of the population dynamics of the Yellowstone Park northern elk herd were interested in discerning whether population growth was density dependent or density independent. They fitted 15 population dynamic models to the data and selected amongst them using the Schwarz information criterion (SIC). The best model by this criterion was a density dependent population growth model and difference between the SIC value for this model and that of the best density independent model was more than 5, a very highly significant difference [ Burnham and Anderson, 2002 ]. 
There were a number of statistically indistinguishable density dependent models that all fit the data well, making identifying the best model difficult. Nevertheless, it is clear that the best model is in the composite hypothesis of density dependence, not the composite hypothesis of density independence.
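The information-criterion comparison described above can be sketched in Python. This is an illustrative simulation, not the elk data: the Gompertz-type model form, the simulation settings, and the least-squares form of the SIC are our assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate log abundances from a density-dependent model:
# x[t+1] = a + b * x[t] + noise, with b < 1 giving density dependence.
n = 50
x = np.empty(n)
x[0] = 4.0
for t in range(n - 1):
    x[t + 1] = 2.8 + 0.3 * x[t] + rng.normal(0, 0.2)

growth = x[1:] - x[:-1]                             # observed log growth rates
X_dd = np.column_stack([np.ones(n - 1), x[:-1]])    # density dependent: growth depends on x
X_di = np.ones((n - 1, 1))                          # density independent: constant mean growth

def sic(X, y):
    """Schwarz information criterion for a least-squares fit (Gaussian errors,
    additive constants dropped): m * ln(RSS/m) + k * ln(m)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ beta) ** 2)
    m = len(y)
    k = X.shape[1] + 1  # regression coefficients plus the error variance
    return m * np.log(rss / m) + k * np.log(m)

sic_dd, sic_di = sic(X_dd, growth), sic(X_di, growth)
print(sic_dd, sic_di)  # the lower SIC identifies the better supported model
```

Because the SIC penalizes the extra free parameter, the density independent model would be preferred only if the data showed no real density dependence.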

Linear Regression Models

Milan Meloun , Jiří Militký , in Statistical Data Analysis , 2011

Problem 6.13 Validation of a new laboratory method by a simultaneous test of a composite hypothesis

Try to test a composite hypothesis H 0 : β 2 = 0 and β 1 = 1 in Problem 6.7 against the alternative H A : β 2 ≠ 0 and β 1 ≠ 1.

◯ Data: from Problem 6.7

Solution: From the results of Problem 6.7 , we have RSC = 3440, and when we set β 1,0 = 1 and β 2,0 = 0, we obtain RSC 1 = 8221. On substitution into Eq. (6.50) , we find F 1 = [(8221 − 3440) × 22] / (3440 × 2) = 15.28, which is greater than the quantile of the Fisher-Snedecor F -distribution F 0.95 (2, 22) = 3.44, so the null hypothesis H 0 cannot be accepted. This conclusion is also in agreement with the partial t -tests and confidence intervals of the two parameters. Figure 6.17 shows the regression straight line y ^ P = x with experimental points and a graphical analysis of residuals.
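A quick check of this arithmetic with scipy (the variable names are ours):

```python
from scipy import stats

# Numbers quoted in the solution above: RSC = 3440, RSC1 = 8221,
# q = 2 restrictions, n - m = 22 residual degrees of freedom.
rsc, rsc1, q, df = 3440.0, 8221.0, 2, 22
F1 = ((rsc1 - rsc) / q) / (rsc / df)
crit = stats.f.ppf(0.95, q, df)
print(round(F1, 2), round(crit, 2))  # F1 ≈ 15.29, critical value ≈ 3.44, so H0 is rejected
```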

Figure 6.17 . (a) Linear regression model of validation of a new laboratory method y ^ P = x , and (b) dependence of the residuals on x .

Conclusion: A simultaneous test of the composite hypothesis ( H 0 : β 1,0 = 1 and β 2,0 = 0) confirmed that a new laboratory method is not in agreement with the results of a standard one.

6 Evidence and Composite Hypotheses

The evidential approach has been criticized (e.g. [ Mayo and Spanos, 2006 ]) as a toy approach because the LR can't compare composite hypotheses. 21 This criticism is simultaneously true, a straw man, a good thing, and false. It is true because one can only strictly rank composite hypotheses if every member of one set is greater than every member of the other [ Royall, 1997; Blume, 2002; Forster and Sober, 2004 ]. But the statement is also a straw man because it implies that the evidential paradigm isn't able to do the statistical and scientific work done using composite hypotheses, which is patently false. Classically, composite hypotheses are used to determine if a point null is statistically distinguishable from the best alternative, or to determine if the best supported alternative lies on a specified side of the point null. Royall [ 1997 , chapter 6] gives a number of sophisticated examples of doing real scientific work using the tools of the support curve, the likelihood ratio, and the support interval. Further, the inability of the LR to compare composite hypotheses is a good thing because Royall is correct in that the composite H can lead to some ridiculous situations. Consider the comparison of hypotheses regarding the mean of a normal distribution with a known standard deviation of 2, as in [ Mayo and Cox, 2006 ]: H 0 : μ ≤ 12 vs. H 1 : μ > 12. A μ of 15 and a μ of 10,000 are both in H 1 . But, if 15 is the true mean, a model with μ = 0 (an element of H 0 ) will describe data generated by the true model much better than will μ = 10,000 (an element of H 1 ). This contradiction will require some awkward circumlocution by Neyman/Pearson adherents. Finally, the statement is false if under the evidence function concept discussed above we expand the evidential paradigm to include model selection using information criteria. Comparing composite hypotheses using information criteria is discussed in more detail in the next section.
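The point about ridiculous comparisons can be made concrete with a short likelihood computation (simulated data; the sample size and seed are arbitrary choices of ours):

```python
from scipy import stats

# Draw data from the true model: mean 15, known standard deviation 2.
x = stats.norm.rvs(loc=15, scale=2, size=20, random_state=0)

def loglik(mu):
    """Gaussian log-likelihood of the sample at mean mu, sigma = 2."""
    return stats.norm.logpdf(x, loc=mu, scale=2).sum()

# mu = 0 lies in H0 (mu <= 12); mu = 10,000 lies in H1 (mu > 12).
print(loglik(0) > loglik(10_000))  # True: the H0 member describes the data far better
```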

Preliminaries

Jaroslav Hájek , ... Pranab K. Sen , in Theory of Rank Tests (Second Edition) , 1999

2.3.1 Statement of the problem.

In statistics a hypothesis is represented by a family of probability distributions of the set of observations X. Since we shall permanently assume that P (·) is determined by a density p ( x ), the hypothesis may preferably be represented by the corresponding family of densities. We shall, however, use both interpretations simultaneously, i.e. regard the hypothesis sometimes as a family of densities, and other times, if need be, as a family of corresponding probability distributions, without changing the notation. Thus we attach to expressions of the types ‘∫ Ψ p dx, p ∈ H ’ and ‘∫ Ψ d P , P ∈ H ’ the same meaning. Where convenient, we shall use the word ‘hypothesis’ also for the conjecture or statement that p belongs to a certain family.

We shall distinguish the null hypothesis H = { p }, or equivalently H = { P }, and alternative hypothesis K = { q } or equivalently K = { Q }. The members of the null hypothesis will be denoted by p with or without affixes, and the members of the alternative hypothesis will be denoted by q with or without affixes. We shall also occasionally speak simply about the hypothesis and alternative instead of the null hypothesis and alternative hypothesis, respectively. If H or K contains only one member, it will be called simple and denoted by p or q , respectively. Otherwise it will be called composite.

Although formally equivalent, the two hypotheses induce different attitudes in the researcher's mind. The null hypothesis is usually based on the assumptions of equality, symmetry or independence and expresses a reserved position. The alternative hypothesis is based on the assumptions of the presence of differences, asymmetries or dependences, which the researcher is hoping to prove or support by the results of the experiment. There are a few basic null hypotheses in contrast to the abundance of conceivable alternative hypotheses. Thus the tendency to adhere to null hypotheses as long as it is reasonable is fully justified by the economy of thought and the limited capacity of the human brain. In theory, the difference between the two hypotheses shows up in that the distribution and other problems are much easier to deal with under the null hypothesis than under the alternative.

A test of the hypothesis H against the alternative K consists in deciding whether or not the hypothesis is true (or might be true). The decision is based on the observed value x of X in the following manner: The space X is divided into two disjoint parts, the critical region (region of rejection) A K , and the region of acceptance A H , A K ∪ A H = X , A K ∈ A ; whenever x falls into A K , the null hypothesis H is rejected, and in the contrary case x ∈ A H the null hypothesis H is accepted. The choice of A K should be made before carrying out the experiment. When performing a test one may arrive at the correct decision, or one may commit one of the two errors: rejecting H when it is true (the error of the first kind) or accepting it when it is false (the error of the second kind).

First of all, due to the special role of H , one wants to keep the probability of the error of the first kind low. Therefore, one chooses a number α, lying between 0 and 1, and imposes the condition

P ( X ∈ A K ) ≤ α for all P ∈ H .

The number α is called the level of significance.

In certain cases (it is the usual situation in rank tests), having prescribed α in advance, this upper bound of P ( X ∈ A K ) is attained for no P ∈ H . Thus it is convenient to introduce a special term for the number

sup P ∈ H P ( X ∈ A K );

it will be called the size of the test or critical region A K .

As a rule, the test is based on a proper statistic t ( x ), called the test statistic. The correspondence between A K and t ( x ) is usually one of the following three types:

A K = { x : t ( x ) ≥ c u },

A K = { x : t ( x ) ≤ c l },

A K = { x : t ( x ) ≥ c u or t ( x ) ≤ c l }.

In the first two cases we speak of one-sided tests based on t , in the last case of the two-sided test based on t . The numbers c u and c l are called the upper critical value and the lower critical value, respectively. Admitting the infinite values c l = −∞ and c u = ∞, the first two regions turn into special cases of the third one.
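The three critical regions can be written down directly as predicates on the test statistic (a minimal sketch; the function names are ours):

```python
import math

# One-sided test rejecting for large values of the test statistic t.
def upper_tailed(t, c_u):
    return t >= c_u

# One-sided test rejecting for small values of t.
def lower_tailed(t, c_l):
    return t <= c_l

# Two-sided test rejecting in either tail.
def two_sided(t, c_l, c_u):
    return t <= c_l or t >= c_u

# With c_l = -infinity, the two-sided region reduces to the upper-tailed one.
print(two_sided(3.1, -math.inf, 2.0) == upper_tailed(3.1, 2.0))  # True
```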

As a generalization we introduce the notion of a randomized test. A randomized test is defined by a measurable function Ψ( x ), such that 0 ≤ Ψ( x ) ≤ 1 for all x , and the hypothesis H is rejected with probability Ψ( x ) if X = x . The function Ψ( x ) is called the critical function of the test. Since the correspondence between randomized tests and critical functions is one-to-one, we shall occasionally use the abbreviated expression ‘the test Ψ’ instead of ‘the test with the critical function Ψ’. The size of the test Ψ is defined as ∫ Ψ d P , or as sup P ∈ H ∫ Ψ d P for a composite hypothesis. If we base a test on the test statistic t ( x ), the critical function Ψ( x ) is usually defined so that

Ψ( x ) = 1 for x ∈ A K and Ψ( x ) = 0 for x ∈ A H ,

and the intermediate values 0 < Ψ( x ) < 1, if any, are chosen only for x such that t ( x ) = c u or t ( x ) = c l .

The theory is considerably simplified by introducing randomized tests. With randomized tests we are always able to ensure equality between the size and the significance level, and moreover the set of all possible tests becomes convex in the sense that for any two critical functions Ψ 1 and Ψ 2 of respective sizes ≤ α 1 and ≤ α 2 , the function

Ψ = λΨ 1 + (1 − λ)Ψ 2

is again a critical function of size not exceeding α = λα 1 + (1 − λ)α 2 for any 0 ≤ λ ≤ 1. Thus it is convenient to develop the theory for randomized tests even if we may feel reluctant to use the randomization in practice.

If dealing with a simple alternative q , we call

∫ Ψ d Q

the power of the respective test. Obviously, the power is the complement to 1 of the probability of the error of the second kind. When the alternative K = { q } is composite, the power is defined by

inf Q ∈ K ∫ Ψ d Q ,

i.e. by the greatest lower bound of the powers for individual alternatives q from K . Frequently we are interested in a parametric set of simple alternatives { Q Δ } indexed by a real parameter Δ, and then

β(Δ) = ∫ Ψ d Q Δ

is called the power function of Ψ. The main purpose of the theory of testing hypotheses is to provide tests with the largest power for a given level of significance.

In the opinion of some statisticians, there is no need to prescribe a fixed significance level, and furthermore there is no reasonable rule for its choice. They do not regard testing hypotheses as a decision procedure leading to an irreversible decision, but rather as an intellectual procedure in the mind of a researcher, whose attitude to various hypotheses comes to be more or less changed on the basis of the experimental evidence. Such a point of view may lead to preferring the so-called level actually attained by a test statistic t to the fixed significance level. The level actually attained is defined as follows: Assuming the test is based on the right tail of the distribution of a test statistic t ( x ), and that this distribution is the same for all p ∈ H , we put

l ( x ) = P ( t ( X ) ≥ t ( x ))

and call the random variable l ( X ) the level actually attained. The smaller the l ( x ) we observe, the stronger the evidence against the null hypothesis H provided by the set of observations X = x . The results presented in this book might be utilized with this approach to testing hypotheses, too, but we shall prefer the more clear-cut framework of the Neyman-Pearson theory.
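The level actually attained is what is now commonly called a p-value. A hedged sketch for a right-tailed test whose statistic is standard normal under the null hypothesis, with a hypothetical observed value:

```python
from scipy import stats

t_obs = 2.1                         # hypothetical observed value of t(x)
l_x = 1 - stats.norm.cdf(t_obs)     # l(x) = P(t(X) >= t_obs) under H0
print(round(l_x, 4))                # ≈ 0.0179: fairly strong evidence against H0
```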

In conclusion, let us mention that in proving optimum properties of rank tests no advantage is gained by considering randomized rank tests. Actually, the optimum properties are either of local character, and then the optimum tests are either non-randomized (for proper values of α) or they depend on the derivatives of second and higher orders, and, consequently, are too difficult to be dealt with; alternatively the optimum properties are of asymptotic character, and then we may ensure the asymptotic equality between the size and the significance level with non-randomized rank tests, too.

6.3.2.3 Simultaneous Test of a Composite Hypothesis

The likelihood ratio test ( Section 8.6.2 ) may be used for testing general parametric hypotheses. In a case where the null hypothesis H 0 : β 2 = 0 is to be tested against the alternative H A : β 2 ≠ 0, where β 2 represents the last q elements of the vector β , the regression model is expressed in the divided form

where X 1 is the matrix of dimension [ n × ( m −   q )] containing those controllable variables whose regression coefficients are not included in the test vector β 2 . Similarly, X 2 is the matrix of dimension ( n × q ) containing those controllable variables whose regression coefficients are included in the test vector β 2 . When the hypothesis H 0 is valid, it is evident that y ^ P,1 = X 1 b 1 , where b 1 = ( X 1 T X 1 ) −1 X 1 T y , and the corresponding residual sum of squares is RSC 1 = ( y − y ^ P,1 ) T ( y − y ^ P,1 ). When the hypothesis H A is valid, we have y ^ P = X b , where b = ( X T X ) −1 X T y , and the corresponding residual sum of squares is RSC = ( y − y ^ P ) T ( y − y ^ P ). The difference ( RSC 1 − RSC ) corresponds to the increase in the residual sum of squares caused by validity of the null hypothesis H 0 . The test criterion has the form

F 1 = [( RSC 1 − RSC ) / q ] / [ RSC / ( n − m )]

which, if the hypothesis H 0 is valid, has the Fisher-Snedecor F -distribution with q and ( n − m ) degrees of freedom.
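A minimal least-squares sketch of this partitioned F test (the data are simulated, not from the book; the dimensions and coefficients are our assumptions, chosen so that β 2 is clearly nonzero):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, m, q = 30, 3, 1                                        # m columns in X, of which the last q are tested
X1 = np.column_stack([np.ones(n), rng.normal(size=n)])    # (n x (m - q)) untested part
X2 = rng.normal(size=(n, q))                              # (n x q) tested part
X = np.hstack([X1, X2])
y = X @ np.array([1.0, 2.0, 5.0]) + rng.normal(size=n)    # true beta2 = 5, far from 0

def rsc(M, y):
    """Residual sum of squares of the least-squares fit of y on M."""
    b, *_ = np.linalg.lstsq(M, y, rcond=None)
    r = y - M @ b
    return r @ r

rsc_full, rsc_restricted = rsc(X, y), rsc(X1, y)
F1 = ((rsc_restricted - rsc_full) / q) / (rsc_full / (n - m))
print(F1 > stats.f.ppf(0.95, q, n - m))  # True: H0 (beta2 = 0) is rejected
```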

A mistake often made in the application of linear regression in laboratories is a false approach to the choice of test criteria. Instead of the test criterion F 1 , the individual test statistics T j from Eq. (6.48) are calculated, and on their basis the significance of a composite hypothesis H 0 : β 2 = β 2,0 against H A : β 2 ≠ β 2,0 is tested. Here β 2,0 is the vector of known parameters.

For tests of composite hypotheses, the test statistic F 1 should be used, where RSC 1 is the residual sum of squares for the model y ^ P,1 = X 1 b 1 + X 2 β 2,0 , in which b 1 is the estimate of the parameters β 1 on the assumption that the restriction β 2 = β 2,0 is valid.

Problem 6.12 Simultaneous test of a composite hypothesis for a Lambert-Beer law model

For the data from Problem 6.10 , test the composite null hypothesis H 0 : β 2 = 0, β 1 = 0.148 against H A : β 2 ≠ 0, β 1 ≠ 0.148. The false approach would be two separate tests of two null hypotheses, H 0 : β 2 = 0 and H 0 : β 1 = 0.148.

Solution: On substitution into Eq. (6.48) , we obtain

Because T 1 and T 2 are less than the quantile of the Student t -distribution, t 0.975 (4) = 2.7764, both tests lead to a conclusion that H 0 : β 2 = 0, β 1 = 0.148 should be accepted. This conclusion is, however, false .

The more rigorous approach uses a simultaneous test of the composite hypothesis H 0 : β 2 = 0 and β 1 = 0.148.

The procedure starts with a calculation of RSC = 5.12 × 10 −5 for estimates b 1 = 0.1459 and b 2 = 1.461 × 10 −4 . Then, RSC 1 = 5.3476 × 10 −4 for parameters β 2,0 =   0 and β 1,0 = 0.148 is calculated. From Eq. (6.50) , the test criterion F 1 is

F 1 = [(5.3476 × 10 −4 − 5.12 × 10 −5 ) / 2] / [5.12 × 10 −5 / 4] = 18.89.

Because F 1 exceeds the quantile of the Fisher-Snedecor F -distribution, F 0.95 (2, 4) = 6.944, the null hypothesis H 0 : β 2 = 0 and β 1 = 0.148 cannot be accepted. The result of this F -test is not in agreement with the conclusion of the previous t -tests. Figure 6.16 shows the 95% confidence ellipse of the parameters β 1 and β 2 , and the point β 1,0 = 0.148, β 2,0 = 0 marked by a cross. This point lies outside the 95% confidence ellipse of the two parameters.

Figure 6.16 . The 95% confidence ellipse for the parameters β 1 and β 2 . The point β 1 = 0.148, β 2 = 0 is marked by a cross.

Conclusion: It may be concluded that a simultaneous test of the composite hypothesis cannot be replaced by tests of two separate hypotheses. Thus, testing of individual parameters in a vector β 0 can lead to quite false conclusions.

Hypothesis testing

Kandethody M. Ramachandran , Chris P. Tsokos , in Mathematical Statistics with Applications in R (Third Edition) , 2021

6.1 Introduction

Statistics plays an important role in decision-making. In statistics, one utilizes random samples to make inferences about the population from which the samples were obtained. Statistical inference regarding population parameters takes two forms: estimation and hypothesis testing, although both may be viewed as different aspects of the same general problem of arriving at decisions on the basis of observed data. We have already seen several estimation procedures in earlier chapters. Hypothesis testing is the subject of this chapter. It has an important role in the application of statistics to real-life problems. Here we utilize sampled data to make decisions concerning the unknown distribution of a population or its parameters. Pioneering work on the explicit formulation as well as the fundamental concepts of the theory of hypothesis testing is due to J. Neyman and E.S. Pearson.

A statistical hypothesis is a statement concerning the probability distribution of a random variable or population parameters that are inherent in a probability distribution. The following example illustrates the concept of hypothesis testing. An important industrial problem is that of accepting or rejecting lots of manufactured products. Before releasing each lot for the consumer, the manufacturer usually performs some tests to determine whether the lot conforms to acceptable standards. Let us say that both the manufacturer and the consumer agree that if the proportion of defectives in a lot is less than or equal to a certain number p , the lot will be released. Very often, instead of testing every item in the lot, we may test only a few at random from the lot and make decisions about the proportion of defectives in the lot; that is, we make decisions about the population on the basis of sample information. Such decisions are called statistical decisions . In attempting to reach decisions, it is useful to make some initial conjectures about the population involved. Such conjectures are called statistical hypotheses . Sometimes the results from the sample may be markedly different from those expected under the hypothesis. Then we can say that the observed differences are significant and we would be inclined to reject the initial hypothesis. The procedures that enable us to decide whether to reject hypotheses or to determine whether observed samples differ significantly from expected results are called tests of hypotheses , tests of significance , or rules of decision.

In any hypothesis-testing problem, we formulate a null hypothesis and an alternative hypothesis such that if we reject the null, then we have to accept the alternative. The null hypothesis usually is a statement of the “status quo” or “no effect” or a “belief.” A guideline for selecting a null hypothesis is that when the objective of an experiment is to establish a claim, the nullification of the claim should be taken as the null hypothesis. The experiment is often performed to determine whether the null hypothesis is false. For example, suppose the prosecution wants to establish that a certain person is guilty. The null hypothesis would be that the person is innocent and the alternative would be that the person is guilty. Thus, the claim itself becomes the alternative hypothesis. Customarily, the alternative hypothesis is the statement that the experimenter believes to be true. For example, the alternative hypothesis is the reason a person is arrested (police suspect the person is not innocent). Once the hypotheses have been stated, appropriate statistical procedures are used to determine whether to reject the null hypothesis. For the testing procedure, one begins with the assumption that the null hypothesis is true. If the information furnished by the sampled data strongly contradicts (beyond a reasonable doubt) the null hypothesis, then we reject it in favor of the alternative hypothesis. If we do not reject the null, then we automatically reject the alternative. Note that we always make a decision with respect to the null hypothesis. Failure to reject the null hypothesis does not necessarily mean that the null hypothesis is true. For example, a person being judged “not guilty” does not mean the person is innocent. This basically means that there is not enough evidence to reject the null hypothesis (presumption of innocence) beyond “a reasonable doubt.”

We summarize the elements of a statistical hypothesis in the following.

The elements of a statistical hypothesis

The null hypothesis , denoted by H 0 , is usually the nullification of a claim. Unless evidence from the data indicates otherwise, the null hypothesis is assumed to be true.

The alternative hypothesis , denoted by H a (or sometimes denoted by H 1 ), is customarily the claim itself.

The test statistic , denoted by TS, is a function of the sample measurements upon which the statistical decision, to reject or not to reject the null hypothesis, will be based.

A rejection region (or a critical region ) is the region (denoted by RR) that specifies the values of the observed TS for which the null hypothesis will be rejected. This is the range of values of the TS that corresponds to the rejection of H 0 at some fixed level of significance, α , which will be explained later.

Conclusion : If the value of the observed TS falls in the RR, the null hypothesis is rejected and we will conclude that there is enough evidence to decide that the alternative hypothesis is true. If the TS does not fall in the RR, we conclude that we cannot reject the null hypothesis.

In practice one may have hypotheses such as H 0 : μ   =   μ 0 against one of the following alternatives:

H a : μ   <   μ 0 (a lower-tailed alternative),

H a : μ   >   μ 0 (an upper-tailed alternative),

H a : μ   ≠   μ 0 (a two-tailed alternative).

A test with a lower- or upper-tailed alternative is called a one-tailed test. One of the issues in hypothesis testing is the choice of the form of the alternative hypothesis. Note that, as discussed earlier, the null hypothesis is always concerned with the question posed: the claim. The alternative hypothesis must reflect the aim of the claim, because when we reject the claim we want to know why we rejected it. For example, suppose that a pharmaceutical company claims that medication A is 80% effective (that is, p = 0.8 ). We conduct an experiment, clinical trials, to test this claim. Thus, the null hypothesis is that the claim is true. Now if we do not reject the null hypothesis, there is no problem, but if we reject the null hypothesis, we want to know why. Thus, the alternative must be one-tailed, p < 0.8 , that is, the claim is not true. If we were to use a two-tailed test, we would not know whether the rejection was because p < 0.8 or p > 0.8 ; in this case, p > 0.8 is actually part of the null hypothesis. It is important to note that when using a one-tailed test in a certain direction, if the consequence of missing an effect in the other direction is not negligible, it is better to use a two-tailed test. Also, choosing a one-tailed test after doing a two-tailed test that failed to reject the null hypothesis is not appropriate. Therefore, the choice of the alternative is based on what happens if we reject the null hypothesis. In an applied hypothesis-testing problem, we can use the following general steps.

General method for hypothesis testing

From the (word) problem, determine the appropriate null hypothesis, H 0 , and the alternative, H a .

Identify the appropriate TSs and calculate the observed TS from the data.

Find the RR by looking up the critical value in the appropriate table.

Draw the conclusion: reject or fail to reject the null hypothesis, H 0 , based on a given level of significance α .

Interpret the results: state in words what the conclusion means to the problem we started with.

It is always necessary to state a null and an alternative hypothesis for every statistical test performed. All possible outcomes should be accounted for by the two hypotheses. Note that a critical value is the value that a TS must surpass for the null hypothesis to be rejected, and is derived from the level of significance α of the test. Thus, the critical values are the boundaries of the RR. It is important to observe that both null and alternative hypotheses are stated in terms of parameters, not in terms of statistics.

In Example 6.1.1, we test whether a coin is fair; the null hypothesis is H 0 : The coin is fair ( p   =   1/2), and the alternative may take one of the following forms.

H a : The coin is not fair ( p   ≠   1/2). This is a two-tailed alternative.

H a : The coin is biased in favor of heads ( p   >   1/2). This is an upper-tailed alternative.

H a : The coin is biased in favor of tails ( p   <   1/2). This is a lower-tailed alternative.

It is important to observe that the TS is a function of a random sample. Thus, the TS itself is a random variable whose distribution is known under the null hypothesis. The value of a TS when specific sample values are substituted is called the observed test statistic or simply test statistic .

For example, consider the hypothesis H 0 : μ   =   μ 0 versus H a : μ   ≠   μ 0 , where μ 0 is known. Assume that the population is normal, with a known variance σ 2 . Consider X ¯ , an unbiased estimator of μ based on the random sample X 1 , … , X n . Then Z = ( X ¯ − μ 0 ) / ( σ / √ n ) is a function of the random sample X 1 , … , X n , and has a known distribution, say a standard normal, under H 0 . If x 1 ,   x 2 , … , x n are specific sample values, then z = ( x ¯ − μ 0 ) / ( σ / √ n ) is called the observed sample statistic or simply sample statistic.
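With hypothetical numbers, the observed statistic is computed as follows ( mu0 , sigma , n , and the sample mean are made up for illustration):

```python
import math
from scipy import stats

mu0, sigma, n = 15.0, 3.0, 36     # hypothesized mean, known sigma, sample size
x_bar = 16.2                      # observed sample mean
z = (x_bar - mu0) / (sigma / math.sqrt(n))
p_value = 2 * (1 - stats.norm.cdf(abs(z)))  # two-sided, since Ha: mu != mu0
print(round(z, 2), round(p_value, 4))       # z = 2.4, p ≈ 0.0164
```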

Definition 6.1.1

A hypothesis is said to be a simple hypothesis if that hypothesis uniquely specifies the distribution from which the sample is taken. Any hypothesis that is not simple is called a composite hypothesis .

Refer to Example 6.1.1 . The null hypothesis p   =   1/2 is simple, because the hypothesis completely specifies the distribution, which in this case will be a binomial with p   =   1/2 and with n being the number of tosses. The alternative hypothesis p   ≠   1/2 is composite because the distribution now is not completely specified (we do not know the exact value of p ).

Because the decision is based on the sample information, we are prone to commit errors. In a statistical test, it is impossible to establish the truth of a hypothesis with 100% certainty. There are two possible types of errors. On one hand, one can make an error by rejecting H 0 when in fact it is true. On the other hand, one can also make an error by failing to reject the null hypothesis when in fact it is false. Because the errors arise as a result of wrong decisions, and the decisions themselves are based on random samples, it follows that the errors have probabilities associated with them. We now have the following definitions.

Definition 6.1.2

(a) A type I error is made if H 0 is rejected when in fact H 0 is true. The probability of a type I error is denoted by α. That is, P (rejecting H 0 | H 0 is true) = α.

Table 6.1 . Statistical Decision and Error Probabilities.

Statistical decision      H 0 true              H 0 false
Do not reject H 0         Correct decision      Type II error (β)
Reject H 0                Type I error (α)      Correct decision

The probability of type I error, α, is called the level of significance.

(b) A type II error is made if H 0 is accepted when in fact H a is true. The probability of a type II error is denoted by β. That is, P (not rejecting H 0 | H 0 is false) = β.

It is desirable that a test should have α   =   β   =   0 (this can be achieved only in trivial cases), or at least we prefer to use a test that minimizes both types of errors. Unfortunately, it so happens that for a fixed sample size, as α decreases, β tends to increase and vice versa. There are no hard and fast rules that can be used to make the choice of α and β . This decision must be made for each problem based on quality and economic considerations. However, in many situations it is possible to determine which of the two errors is more serious. It should be noted that a type II error is only an error in the sense that a chance to correctly reject the null hypothesis was lost. It is not an error in the sense that an incorrect conclusion was drawn, because no conclusion is made when the null hypothesis is not rejected. In the case of a type I error, a conclusion is drawn that the null hypothesis is false when, in fact, it is true. Therefore, type I errors are generally considered more serious than type II errors. For example, it is mostly agreed that finding an innocent person guilty is a more serious error than finding a guilty person innocent. Here, the null hypothesis is that the person is innocent, and the alternative hypothesis is that the person is guilty. “Not rejecting the null hypothesis” is equivalent to acquitting a defendant. It does not prove that the null hypothesis is true, or that the defendant is innocent. In statistical testing, the significance level α is the probability of wrongly rejecting the null hypothesis when it is true (that is, the risk of finding an innocent person guilty). Here the type II risk is acquitting a guilty defendant. The usual approach to hypothesis testing is to find a test procedure that limits α , the probability of type I error, to an acceptable level while trying to lower β as much as possible.

The consequences of different types of errors are, in general, very different. For example, if a doctor tests for the presence of a certain illness, incorrectly diagnosing the presence of the disease (type I error) will cause a waste of resources, not to mention the mental agony to the patient. On the other hand, failure to determine the presence of the disease (type II error) can lead to a serious health risk.

To formulate a hypothesis-testing problem, consider the following situation. Suppose a toy store chain claims that at least 80% of girls under 8   years of age prefer dolls over other types of toys. We feel that this claim is inflated. In an attempt to dispose of this claim, we observe the buying pattern of 20 randomly selected girls under 8   years of age, and we observe X , the number of girls under 8   years of age who buy stuffed toys or dolls. Now the question is, how can we use X to confirm or reject the store's claim? Let p be the probability that a girl under 8 chosen at random prefers stuffed toys or dolls. The question now can be reformulated as a hypothesis-testing problem. Is p   ≥   0.8 or p   <   0.8? Because we would like to reject the store's claim only if we are highly certain of our decision, we should choose the null hypothesis to be H 0 : p   ≥   0.8, the rejection of which is considered to be more serious. The null hypothesis should be H 0 : p   ≥   0.8, and the alternative H a : p   <   0.8. To make the null hypothesis simple, we will use H 0 : p   =   0.8, which is the boundary value, with the understanding that it really represents H 0 : p   ≥   0.8. We note that X , the number of girls under 8 years of age who prefer stuffed toys or dolls, is a binomial random variable. Clearly a large sample value of X would favor H 0 . Suppose we arbitrarily choose to accept the null hypothesis if X   >   12. Because our decision is based on only a sample of 20 girls under 8, there is always a possibility of making errors whether we accept or reject the store chain's claim. In the following example, we will now formally state this problem and calculate the error probabilities based on our decision rule.

Find β for p   =   0.6.

Find β for p   =   0.4.

Find the RR of the form { X   ≤   K } so that (i) α   =   0.01; (ii) α   =   0.05.

For the alternative H a : p   =   0.6, find β for the values of α in (d).

For p   =   0.8, the probability of type I error is α = P { X   ≤   12 | p   =   0.8}   =   0.0321.

Here, p   =   0.6. The probability of type II error is β = P { X   >   12 | p   =   0.6}   =   1 − P { X   ≤   12 | p   =   0.6}   =   0.416.

If p   =   0.4, then:

To find K such that α = P { X ≤ K | p = 0.8 } = 0.01 , from the binomial table, K   =   11. Hence, the RR is reject H 0 if { X   ≤   11}.

To find K such that α = P { X ≤ K | p = 0.8 } = 0.05 , from the binomial table, α   =   0.05 falls between K   =   12 and K   =   13. However, for K   =   13, the value for α is 0.087, exceeding 0.05. If we want to limit α to be no more than 0.05, we will have to take K   =   12. That is, we reject the null hypothesis if X   ≤   12, yielding an α   =   0.0321 as shown in (a).

When α   =   0.01, from (d), the RR is of the form { X   ≤   11}. For p   =   0.6, β = P { X   >   11 | p   =   0.6}   =   1 − P { X   ≤   11 | p   =   0.6}   =   0.596.

From (a)  and (b)  for testing the hypothesis H 0 : p   =   0.8 against H a : p   <   0.8 with n   =   20 , we see that when α is 0.0321, β is 0.416. From (d) (i)  and (e) (i)  for the same hypothesis, we see that when α is 0.01, β is 0.596. This holds in general. Thus, we observe that for fixed n as α decreases, β increases, and vice versa.
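The error probabilities quoted above can be reproduced directly from the binomial distribution:

```python
from scipy import stats

# n = 20; reject H0: p = 0.8 when X <= 12; alternative value p = 0.6.
n = 20
alpha = stats.binom.cdf(12, n, 0.8)      # P(X <= 12 | p = 0.8)
beta = 1 - stats.binom.cdf(12, n, 0.6)   # P(X > 12  | p = 0.6)
print(round(alpha, 4), round(beta, 3))   # ≈ 0.0321 and ≈ 0.416
```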

In the next example, we explore what happens to β as the sample size increases, with α fixed.

Let X be a binomial random variable. We wish to test the hypothesis H 0 : p   =   0.8 against H a : p   =   0.6. Let α   =   0.03 be fixed. Find β for n   =   10 and n   =   20.

For n = 10, using the binomial tables, we obtain P(X ≤ 5 | p = 0.8) ≅ 0.03. Hence, the RR for testing H0: p = 0.8 versus Ha: p = 0.6 is: reject H0 if X ≤ 5. The probability of type II error is:

β = P(accept H0 | p = 0.6) = P(X > 5 | p = 0.6) = 1 − P(X ≤ 5 | p = 0.6) = 1 − 0.367 = 0.633.

For n = 20, as shown in Example 6.1.3, if we reject H0 for X ≤ 12, we obtain P(X ≤ 12 | p = 0.8) ≅ 0.03 and

β = P(X > 12 | p = 0.6) = 1 − P(X ≤ 12 | p = 0.6) = 0.416.

We see that for a fixed α , as n increases β decreases and vice versa. It can be shown that this result holds in general.
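The effect of increasing n at (approximately) fixed α can be checked the same way; this sketch assumes the rejection regions {X ≤ 5} for n = 10 and {X ≤ 12} for n = 20, as in the text, and again uses a hand-rolled binomial CDF.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

# n = 10, reject H0 if X <= 5:
alpha_10 = binom_cdf(5, 10, 0.8)        # ~0.033
beta_10 = 1 - binom_cdf(5, 10, 0.6)     # ~0.633
# n = 20, reject H0 if X <= 12:
alpha_20 = binom_cdf(12, 20, 0.8)       # ~0.032
beta_20 = 1 - binom_cdf(12, 20, 0.6)    # ~0.416
```

With α held near 0.03, doubling n from 10 to 20 drops β from about 0.633 to about 0.416.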

For us to compute the value of β, the alternative hypothesis must be simple. We now give a three-step procedure to calculate β.

Steps to calculate β

1. Decide on an appropriate test statistic (TS); usually this is a sufficient statistic or an estimator for the unknown parameter, whose distribution is known under H0.

2. Determine the RR using the given α and the distribution of the TS.

3. Find the probability that the observed TS does not fall in the RR, assuming Ha is true. This gives β. That is, β = P(TS falls in the complement of the RR | Ha is true).

A random sample of size 36 from a population with known variance σ² = 9 yields a sample mean of x̄ = 17. For the hypothesis H0: μ = 15 versus Ha: μ > 15, find β when μ = 16. Assume α = 0.05.

Here, n = 36, x̄ = 17, and σ² = 9. In general, to test H0: μ = μ0 versus Ha: μ > μ0, we proceed as follows. An unbiased estimator of μ is X̄. Intuitively, we would reject H0 if X̄ is large, say X̄ > c. Now, using α = 0.05, we determine the RR. By the definition of α, we have:

P(X̄ > c | μ = μ0) = 0.05, or P((X̄ − μ0)/(σ/√n) > (c − μ0)/(σ/√n) | μ = μ0) = 0.05.

But if μ = μ0, because the sample size n ≥ 30, (X̄ − μ0)/(σ/√n) ~ N(0, 1). Therefore, P((X̄ − μ0)/(σ/√n) > (c − μ0)/(σ/√n)) = 0.05 is equivalent to P(Z > (c − μ0)/(σ/√n)) = 0.05. From the standard normal table, we obtain P(Z > 1.645) = 0.05. Hence, (c − μ0)/(σ/√n) = 1.645, or c = μ0 + 1.645(σ/√n).

Therefore, the RR is the set of all sample means x̄ such that x̄ > μ0 + 1.645(σ/√n).

Substituting μ0 = 15 and σ = 3, we obtain μ0 + 1.645(σ/√n) = 15 + 1.645(3/√36) = 15.8225.

The RR is the set of x̄ such that x̄ > 15.8225.

Then, by definition, β = P(X̄ ≤ 15.8225 when μ = 16).

Consequently, for μ = 16,

β = P((X̄ − 16)/(σ/√n) ≤ (15.8225 − 16)/(3/√36)) = P(Z ≤ −0.36) = 0.3594.

That is, under the given information, there is a 35.94% chance of not rejecting a false null hypothesis.
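The three-step procedure above can be reproduced for this example with the standard normal CDF; the sketch below builds Φ from math.erf (the helper name `phi` is ours). The small difference from the text's 0.3594 comes from the text rounding the z-score to −0.36.

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

mu0, mu_a, sigma, n = 15, 16, 3, 36
se = sigma / sqrt(n)          # standard error = 0.5
c = mu0 + 1.645 * se          # step 2: rejection region is x-bar > c = 15.8225
z = (c - mu_a) / se           # step 3: z-score of c under mu = 16, about -0.355
beta = phi(z)                 # ~0.361 (0.3594 in the text, with z rounded to -0.36)
```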

6.1.1 Sample size

It is clear from the preceding example that once we are given the sample size n , an α , a simple alternative H a , and a TS, we have no control over β . Hence, for a given sample size and the TS, any effort to lower β will lead to an increase in α and vice versa. This means that for a test with fixed sample size it is not possible to simultaneously reduce both α and β . We also notice from Example 6.1.4 that by increasing the sample size n , we can decrease β (for the same α ) to an acceptable level. The following discussion illustrates that it may be possible to determine the sample size for a given α and β .

Suppose we want to test H0: μ = μ0 versus Ha: μ > μ0. Given α and β, we want to find n, the sample size, and K, the point at which the rejection region begins. We know that:

α = P(X̄ > K | μ = μ0), (6.1)

and for some particular value μ = μa > μ0,

β = P(X̄ ≤ K | μ = μa). (6.2)

From Eqs. (6.1) and (6.2),

(K − μ0)/(σ/√n) = zα and (μa − K)/(σ/√n) = zβ.

This gives us two equations in the two unknowns K and n, and we can proceed to solve them. Eliminating K, we get:

μ0 + zα(σ/√n) = μa − zβ(σ/√n).

From this we can derive:

√n = (zα + zβ)σ/(μa − μ0).

Thus, the sample size for an upper-tail alternative hypothesis is:

n = (zα + zβ)²σ²/(μa − μ0)².

The sample size increases with the square of the standard deviation and decreases with the square of the difference between the mean value of the alternative hypothesis and the mean value under the null hypothesis. Note that in real-world problems, care should be taken in the choice of the value of μ a for the alternative hypothesis. It may be tempting for a researcher to take a large value of μ a to reduce the required sample size. This will seriously affect the accuracy (power) of the test. This alternative value must be realistic within the experiment under study. Care should also be taken in the choice of the standard deviation σ . Using an underestimated value of the standard deviation to reduce the sample size will result in inaccurate conclusions similar to overestimating the difference of means. Usually, the value of σ is estimated using a similar study conducted earlier. The problem could be that the previous study may be old and may not represent the new reality. When accuracy is important, it may be necessary to conduct a pilot study only to get some idea of the estimate of σ . Once we determine the necessary sample size, we must devise a procedure by which the appropriate data can be randomly obtained. This aspect of the design of experiments is discussed in Chapter 8 .

Let σ   =   3.1 be the true standard deviation of the population from which a random sample is chosen. How large should the sample size be for testing H 0 : μ   =   5 versus H a : μ   =   5.5 so that α   =   0.01 and β   =   0.05?

We are given μ0 = 5 and μa = 5.5. Also, zα = z0.01 = 2.33 and zβ = z0.05 = 1.645. Hence, the sample size is:

n = (zα + zβ)²σ²/(μa − μ0)² = (2.33 + 1.645)²(3.1)²/(0.5)² ≈ 607.38.

So n = 608 will provide the desired levels. That is, to test the foregoing hypothesis, we must randomly select 608 observations from the given population.
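The sample-size formula can be evaluated directly; a minimal sketch (the function name `sample_size` is ours):

```python
from math import ceil

def sample_size(z_alpha, z_beta, sigma, mu0, mu_a):
    """n = (z_alpha + z_beta)^2 * sigma^2 / (mu_a - mu0)^2, rounded up."""
    n_exact = ((z_alpha + z_beta) ** 2 * sigma ** 2) / (mu_a - mu0) ** 2
    return n_exact, ceil(n_exact)

n_exact, n = sample_size(2.33, 1.645, 3.1, 5.0, 5.5)
# n_exact ~ 607.38, so round up to n = 608
```

Rounding up (rather than to the nearest integer) guarantees that both error-rate targets are met.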

From a practical standpoint, the researcher typically chooses α and the sample size, and β is ignored. Because a trade-off exists between α and β, choosing a very small value of α will tend to increase β in a serious way. A general rule of thumb is to pick reasonable values of α, usually in the 0.05 to 0.10 range, so that β remains reasonably small.

Truth, Possibility and Probability

In North-Holland Mathematics Studies , 1991

4 Hypotheses tests

In hypothesis tests, we test a hypothesis H0, called the null or working hypothesis, while considering at the same time an alternative hypothesis H1. The working hypothesis H0 and the alternative hypothesis H1 serve to partition the parameter space Ω: under H0, θ lies in a subspace Ω′; under H1, θ lies in the complementary subspace Ω − Ω′.

The purpose of a hypothesis test is to determine whether H0 or H1 is consistent with the data. Thus, accepting H0 means simply that we are not in a position to reject H0, i.e., that H0 is consistent with the data. Similarly, rejecting H0, and hence accepting H1, means that H0 is inconsistent with the data; with respect to H1, it only means that H1 is consistent with the data. Thus, hypothesis tests involve rejection and acceptance at the same time, but not on the same level of importance.

I now begin with the justification of hypothesis tests, according to the view espoused in this book. Although I consider hypothesis tests to be valid, my justification is different from that of the developers of these tests, Neyman and Pearson. In the next chapter, Section 1, we shall discuss the Neyman-Pearson justification for these tests.

Many of the notions introduced for significance tests are also used in hypothesis tests, and the same conditions that apply to significance tests apply here. The most important, in my view, is that the test statistic must be a function of a discriminating experiment. The notion of a discriminating experiment must be modified to take into account that we are only considering two alternative hypotheses, H0: θ ∈ Ω′ and H1: θ ∈ Ω − Ω′. For simplicity, we write Ω0 = Ω′ and Ω1 = Ω − Ω′. We assume that E = 〈En ‖ 1 ≤ n ≤ ν〉, for an infinite ν, is a sequence of random variables on each of the models in Hi, i = 0, 1. We begin with the notion of evidentially equivalent results, which is the same as the old notion in Chapter IV, Section 2, and in Section 2 of this chapter, adapted to the new situation. Let r and s be possible results of En. Then r ~n s if there is a c > 0 such that for every θ ∈ Ω

As before, [r] is the equivalence class of r.

We say that r ≼ θ n s , for θ ∈ Ω i , if

and there is a θ′ ∈ Ωj, with j ≠ i, such that the inequality is reversed.

The rejection set for θ, with θ ∈ Ω, determined by the result r of E n is then, as before

The probability Prθ[En ∈ Rnθ r] is the p-value of the test for θ with result r.

Definition XVIII.3 D. E. For Hypotheses Tests

We say that the system E = 〈 E n ‖ 1 ≤ n ≤ ν 〉 is a discriminating experiment ( d.e .) for H 0 against H 1 if

(1) The sequence En is an internal sequence of random variables over the n-product space of the spaces in H0 and H1.

(2) For each θ ∈ Ω1 there is an internal set A of sequences of results such that E is almost surely eventually in A according to θ, and E is almost surely eventually never in A according to θ′, for θ′ ∈ Ω0, and such that for any 〈rn ‖ 1 ≤ n ≤ ν〉 in A and any α ≫ 0 there is a finite n such that for every θ′ ∈ Ω0 and every m ≥ n, Prθ′[Em ∈ Rmθ′ rm] ≤ α. That is, the p-value of the test can be made as small as one wishes.

(3) For each θ ∈ Ω0 there is an internal set B of sequences of results such that E is a.s. eventually in B according to θ, and a.s. eventually not in B according to θ′, for θ′ ∈ Ω1, and such that for any {rn}, 1 ≤ n ≤ ν, in B, any θ′ ∈ Ω1, and any α ≫ 0 there is a finite n such that for every m ≥ n, Prθ′[Em ∈ Rmθ′ rm] ≤ α.

The main differences between this definition and Definitions IV.3 and XVIII.2 are:

(1) In the definition of r ≼θn s, i.e., of r being at least as bad as s for θ, we require that the inequality be reversed for a θ′ in the hypothesis which is alternative to the one containing θ.

(2) H0 and H1 are asymmetric in the sense that we require that sequences in the set A work for each α with the same n for all θ ∈ Ω0, while for sequences in B, the n may be different for each θ′ ∈ Ω1.

The dialectical rule of rejection for this case is now:

Rule XVIII.1 (Dialectical Rule of Rejection of H 0 against H 1 )

Let Ω be a set of parameters of possible probability models for a setup, and suppose that we are considering the null hypothesis H0: θ ∈ Ω′ against H1: θ ∈ Ω − Ω′. We say that H0 should be provisionally rejected at level of significance α if there is a discriminating experiment 〈En ‖ 1 ≤ n ≤ ν〉 for H0 against H1 such that

(1) [En = a] obtains, for some a and n ∈ ℕ.

(2) Prθ[En ∈ Rnθ a] ≤ α for every θ ∈ Ω′, i.e., the p-value must be small for every θ ∈ Ω′.

Since in the definition of a discriminating experiment for H0 against H1 we require that the n be uniform for all elements of Ω0, the old proof of Theorem XVIII.1 does not work, because we do not know whether convergence to the normal distribution is uniform. If all sampling distributions for θ ∈ Ω0 have the property that Rnθ a is an interval or a union of intervals, then the theorem remains true:

Theorem XVIII.2

The proof, which I omit, is similar to that of Theorem XVIII.1 . We shall now discuss some examples:

Example XVIII.1

Suppose that we are testing nearly normal distributions with known variance σ² and unknown mean μ. We are testing H0: μ = μ0 against H1: μ ≠ μ0. Then Theorem XVIII.1 (and also Theorem XVIII.2) applies, and hence the sample mean X̄n is a discriminating experiment. The usual test works. For one-sided tests, say H0: μ ≤ μ0 against H1: μ > μ0, however, we must apply Theorem XVIII.2. This last theorem applies, since the rejection set determined by a result r in this case is the interval [X̄n > r].

Example XVIII.2

As a second example, assume that the alternative hypotheses are nearly normal distributions with mean μ ∈ Ω and unknown variance. The experiment should now be discriminating for H0: μ = μ0 against H1: μ ≠ μ0, where μ0 is a fixed number. We need, for this case, an experiment M with two results: the sample mean

and the sample variance

Thus, Mn is the pair 〈X̄n, Sn²〉. The null hypothesis H0 is then a composite hypothesis including all distributions with mean μ0 and different variances. Here we cannot apply our theorems directly. We need some preliminary work to obtain the sets A and B of the definition of a discriminating experiment. Besides the sample mean and variance, we need the following function

where μ is the real mean.

Extend the sequence Xn to *ℕ. Let ν ≈ ∞. Assume that μ is the true mean and σ² the true variance. We have that X̄ν ≈ μ, a.s. Also, a.s., X̂ν ≈ σ², by the law of large numbers, Theorem XI.7, because (X1 − μ)², (X2 − μ)², … are independent and identically distributed random variables with Eμ,σ²(Xi − μ)² = σ². Thus

Therefore, if μ0 and σ² are the true mean and variance, then the set Aσ of pairs of infinite sequences {rn} and {sn}, such that 〈{rn}, {sn}〉 ∈ Aσ if and only if rn nearly converges to μ and sn² nearly converges to σ², has the property that M is a.s. in Aσ. On the other hand, if the true mean or variance is different, then M is a.s. eventually not in Aσ. We shall prove that this Aσ has the required properties of the definition of a discriminating experiment.

By Theorem X.13

has nearly the t distribution with n − 1 degrees of freedom, i.e., it is Tn−1, if μ is the true mean. Let 〈m, s〉 and 〈m′, s′〉 be possible results of Mn. Then, because of the t distribution, 〈m, s〉 ≼n,μ 〈m′, s′〉 if

Let ‖μ − μ0‖ = ε. Then, since μ and μ0 are real, ε ≫ 0. Assume that 〈m, s〉 ∈ A. Then mn S-converges to μ and sn² S-converges to σ². Let n1 be such that for all k ≥ n1

Then, for k ≥ n 1

As is usual we define

where Z is a unit nearly normal random variable, and

where Tn is a nearly Student t random variable with n degrees of freedom.

Now, let n2 ≥ n1 be such that

and let n 3 ≥ n 1 + 4 be a finite number such that

Therefore, if k ≥ n 2

The choice, for each μ ≠ μ0, of the set B of the definition of a discriminating experiment is similar. This shows that the experiment is discriminating.

Example XVIII.3

Suppose now, as another example, that we have as alternative hypotheses nearly normal distributions with a known and fixed mean μ and different variances. We would like to test the hypothesis H0: σ² ≥ σ0² against H1: σ² < σ0², for a fixed σ0, where σ² is the variance. We use as our statistic

We have that (X1 − μ)², (X2 − μ)², …, (Xν − μ)² represent independent repetitions of a random variable (X − μ)². Also

By Theorem X.12 , the distribution of

is χn², if the true variance is σ². Then the rejection sets are intervals. Using Theorem XVIII.2, we get that X̂n is a discriminating experiment for H0 against H1. The rejection rule, then, is the same as the usual one.

Example XVIII.4

As an example of a possible statistic that does not work, we have the following. I do not think that anybody has suggested this test, but in order to make the point, let us assume that we are in the same situation as before, i.e., testing H0: σ² ≥ σ0² against H1: σ² < σ0², but that we choose as our test statistic

Since, if σ 2 is the true variance

has a normal distribution, the rejection sets have the appropriate form. However, X̄ − μ tends to 0 for every σ², and hence there is no discriminating set of sequences with probability approximating one. Thus, this possible statistic does not satisfy the requirements for a discriminating experiment.

Example XVIII.5

Suppose, as a final example, that we are testing the same hypotheses as above, H0: σ² ≥ σ0² against H1: σ² < σ0², but that now we do not know the mean μ. We can take as our statistic, as usual, the sample variance

Here, the variable

is χ²n−1 when σ² is the true variance. Hence, the rejection sets have the right form for the application of Theorem XVIII.2. We cannot, however, obtain the sets A and B of the definition of a discriminating experiment directly from the strong law of large numbers. We have, as above,

for σ 1 ≠ σ.

Thus, by an argument similar to that in the proof of Theorem XVIII.1, using (1) and (2), we can show that Sn² is a discriminating experiment for this case.

A few remarks about the differences between significance tests and hypothesis tests are in order. As we have seen, both types of tests require the consideration of a set of possible alternative hypotheses, which does not consist of all logically possible hypotheses. In significance tests, the widest reasonable set is entertained, but the alternative hypotheses are only considered in order to eliminate some tests that do not make sense. The only role of alternatives, for significance tests, is to ensure that the definition of worse results is the correct one.

In hypothesis tests, the alternative hypothesis may be accepted. I think it is clearly better to have an alternative hypothesis that is to be accepted in case the main hypothesis is rejected. The other advantage of hypothesis tests is that, by restricting the class of possible hypotheses, the properties of the test are easier to study and more powerful tests are possible. These two extra advantages require the restriction of the class Ω of possible distributions. For decision-theoretic purposes, this may pose no problems. But for strictly inferential purposes it may be better to have the widest possible class of alternative distributions.

If there are sufficient reasons to limit the possible alternatives, however, hypothesis tests are more flexible, since they allow, among other things, for the minimization of what are called type II errors, 4 for all or some of the alternatives, as follows:

Let S 0 and S 1 be the critical and noncritical regions for the test and let

A test is better, for θ ∈ Ω − Ω′, if β(θ) is small. Although there may be no test that makes β(θ) small for all θ ∈ Ω − Ω′, in hypothesis tests we may privilege some of the θ ∈ Ω − Ω′ and require that β(θ) be at some specific level β for a certain θ ∈ Ω − Ω′. This may involve us in another dialectical process: for each assigned β, a certain sample size n may be required. So if we are asked for β(θ) = β, for a certain θ, we must perform a test with at least this sample size.



What is a Hypothesis – Types, Examples and Writing Guide


What is a Hypothesis

Definition:

A hypothesis is an educated guess or proposed explanation for a phenomenon, based on some initial observations or data. It is a tentative statement that can be tested and potentially proven or disproven through further investigation and experimentation.

Hypotheses are often used in scientific research to guide the design of experiments and the collection and analysis of data. A hypothesis is an essential element of the scientific method, as it allows researchers to make predictions about the outcome of their experiments and to test those predictions to determine their accuracy.

Types of Hypothesis

Types of Hypothesis are as follows:

Research Hypothesis

A research hypothesis is a statement that predicts a relationship between variables. It is usually formulated as a specific statement that can be tested through research, and it is often used in scientific research to guide the design of experiments.

Null Hypothesis

The null hypothesis is a statement that assumes there is no significant difference or relationship between variables. It is often used as a starting point for testing the research hypothesis, and if the results of the study reject the null hypothesis, it suggests that there is a significant difference or relationship between variables.

Alternative Hypothesis

An alternative hypothesis is a statement that assumes there is a significant difference or relationship between variables. It is often used as an alternative to the null hypothesis and is tested against the null hypothesis to determine which statement is more accurate.

Directional Hypothesis

A directional hypothesis is a statement that predicts the direction of the relationship between variables. For example, a researcher might predict that increasing the amount of exercise will result in a decrease in body weight.

Non-directional Hypothesis

A non-directional hypothesis is a statement that predicts the relationship between variables but does not specify the direction. For example, a researcher might predict that there is a relationship between the amount of exercise and body weight, but they do not specify whether increasing or decreasing exercise will affect body weight.

Statistical Hypothesis

A statistical hypothesis is a statement that assumes a particular statistical model or distribution for the data. It is often used in statistical analysis to test the significance of a particular result.

Composite Hypothesis

A composite hypothesis is a statement that assumes more than one condition or outcome. It can be divided into several sub-hypotheses, each of which represents a different possible outcome.

Empirical Hypothesis

An empirical hypothesis is a statement that is based on observed phenomena or data. It is often used in scientific research to develop theories or models that explain the observed phenomena.

Simple Hypothesis

A simple hypothesis is a statement that assumes only one outcome or condition. It is often used in scientific research to test a single variable or factor.

Complex Hypothesis

A complex hypothesis is a statement that assumes multiple outcomes or conditions. It is often used in scientific research to test the effects of multiple variables or factors on a particular outcome.

Applications of Hypothesis

Hypotheses are used in various fields to guide research and make predictions about the outcomes of experiments or observations. Here are some examples of how hypotheses are applied in different fields:

  • Science : In scientific research, hypotheses are used to test the validity of theories and models that explain natural phenomena. For example, a hypothesis might be formulated to test the effects of a particular variable on a natural system, such as the effects of climate change on an ecosystem.
  • Medicine : In medical research, hypotheses are used to test the effectiveness of treatments and therapies for specific conditions. For example, a hypothesis might be formulated to test the effects of a new drug on a particular disease.
  • Psychology : In psychology, hypotheses are used to test theories and models of human behavior and cognition. For example, a hypothesis might be formulated to test the effects of a particular stimulus on the brain or behavior.
  • Sociology : In sociology, hypotheses are used to test theories and models of social phenomena, such as the effects of social structures or institutions on human behavior. For example, a hypothesis might be formulated to test the effects of income inequality on crime rates.
  • Business : In business research, hypotheses are used to test the validity of theories and models that explain business phenomena, such as consumer behavior or market trends. For example, a hypothesis might be formulated to test the effects of a new marketing campaign on consumer buying behavior.
  • Engineering : In engineering, hypotheses are used to test the effectiveness of new technologies or designs. For example, a hypothesis might be formulated to test the efficiency of a new solar panel design.

How to write a Hypothesis

Here are the steps to follow when writing a hypothesis:

Identify the Research Question

The first step is to identify the research question that you want to answer through your study. This question should be clear, specific, and focused. It should be something that can be investigated empirically and that has some relevance or significance in the field.

Conduct a Literature Review

Before writing your hypothesis, it’s essential to conduct a thorough literature review to understand what is already known about the topic. This will help you to identify the research gap and formulate a hypothesis that builds on existing knowledge.

Determine the Variables

The next step is to identify the variables involved in the research question. A variable is any characteristic or factor that can vary or change. There are two types of variables: independent and dependent. The independent variable is the one that is manipulated or changed by the researcher, while the dependent variable is the one that is measured or observed as a result of the independent variable.

Formulate the Hypothesis

Based on the research question and the variables involved, you can now formulate your hypothesis. A hypothesis should be a clear and concise statement that predicts the relationship between the variables. It should be testable through empirical research and based on existing theory or evidence.

Write the Null Hypothesis

The null hypothesis is the opposite of the alternative hypothesis, which is the hypothesis that you are testing. The null hypothesis states that there is no significant difference or relationship between the variables. It is important to write the null hypothesis because it allows you to compare your results with what would be expected by chance.

Refine the Hypothesis

After formulating the hypothesis, it’s important to refine it and make it more precise. This may involve clarifying the variables, specifying the direction of the relationship, or making the hypothesis more testable.

Examples of Hypothesis

Here are a few examples of hypotheses in different fields:

  • Psychology : “Increased exposure to violent video games leads to increased aggressive behavior in adolescents.”
  • Biology : “Higher levels of carbon dioxide in the atmosphere will lead to increased plant growth.”
  • Sociology : “Individuals who grow up in households with higher socioeconomic status will have higher levels of education and income as adults.”
  • Education : “Implementing a new teaching method will result in higher student achievement scores.”
  • Marketing : “Customers who receive a personalized email will be more likely to make a purchase than those who receive a generic email.”
  • Physics : “An increase in temperature will cause an increase in the volume of a gas, assuming all other variables remain constant.”
  • Medicine : “Consuming a diet high in saturated fats will increase the risk of developing heart disease.”

Purpose of Hypothesis

The purpose of a hypothesis is to provide a testable explanation for an observed phenomenon or a prediction of a future outcome based on existing knowledge or theories. A hypothesis is an essential part of the scientific method and helps to guide the research process by providing a clear focus for investigation. It enables scientists to design experiments or studies to gather evidence and data that can support or refute the proposed explanation or prediction.

The formulation of a hypothesis is based on existing knowledge, observations, and theories, and it should be specific, testable, and falsifiable. A specific hypothesis helps to define the research question, which is important in the research process as it guides the selection of an appropriate research design and methodology. Testability of the hypothesis means that it can be proven or disproven through empirical data collection and analysis. Falsifiability means that the hypothesis should be formulated in such a way that it can be proven wrong if it is incorrect.

In addition to guiding the research process, the testing of hypotheses can lead to new discoveries and advancements in scientific knowledge. When a hypothesis is supported by the data, it can be used to develop new theories or models to explain the observed phenomenon. When a hypothesis is not supported by the data, it can help to refine existing theories or prompt the development of new hypotheses to explain the phenomenon.

When to use Hypothesis

Here are some common situations in which hypotheses are used:

  • In scientific research , hypotheses are used to guide the design of experiments and to help researchers make predictions about the outcomes of those experiments.
  • In social science research , hypotheses are used to test theories about human behavior, social relationships, and other phenomena.
  • In business , hypotheses can be used to guide decisions about marketing, product development, and other areas. For example, a hypothesis might be that a new product will sell well in a particular market, and this hypothesis can be tested through market research.

Characteristics of Hypothesis

Here are some common characteristics of a hypothesis:

  • Testable : A hypothesis must be able to be tested through observation or experimentation. This means that it must be possible to collect data that will either support or refute the hypothesis.
  • Falsifiable : A hypothesis must be able to be proven false if it is not supported by the data. If a hypothesis cannot be falsified, then it is not a scientific hypothesis.
  • Clear and concise : A hypothesis should be stated in a clear and concise manner so that it can be easily understood and tested.
  • Based on existing knowledge : A hypothesis should be based on existing knowledge and research in the field. It should not be based on personal beliefs or opinions.
  • Specific : A hypothesis should be specific in terms of the variables being tested and the predicted outcome. This will help to ensure that the research is focused and well-designed.
  • Tentative: A hypothesis is a tentative statement or assumption that requires further testing and evidence to be confirmed or refuted. It is not a final conclusion or assertion.
  • Relevant : A hypothesis should be relevant to the research question or problem being studied. It should address a gap in knowledge or provide a new perspective on the issue.

Advantages of Hypothesis

Hypotheses have several advantages in scientific research and experimentation:

  • Guides research: A hypothesis provides a clear and specific direction for research. It helps to focus the research question, select appropriate methods and variables, and interpret the results.
  • Predictive power : A hypothesis makes predictions about the outcome of research, which can be tested through experimentation. This allows researchers to evaluate the validity of the hypothesis and make new discoveries.
  • Facilitates communication: A hypothesis provides a common language and framework for scientists to communicate with one another about their research. This helps to facilitate the exchange of ideas and promotes collaboration.
  • Efficient use of resources: A hypothesis helps researchers to use their time, resources, and funding efficiently by directing them towards specific research questions and methods that are most likely to yield results.
  • Provides a basis for further research: A hypothesis that is supported by data provides a basis for further research and exploration. It can lead to new hypotheses, theories, and discoveries.
  • Increases objectivity: A hypothesis can help to increase objectivity in research by providing a clear and specific framework for testing and interpreting results. This can reduce bias and increase the reliability of research findings.

Limitations of Hypothesis

Some limitations of hypotheses are as follows:

  • Limited to observable phenomena: Hypotheses are limited to observable phenomena and cannot account for unobservable or intangible factors. This means that some research questions may not be amenable to hypothesis testing.
  • May be inaccurate or incomplete: Hypotheses are based on existing knowledge and research, which may be incomplete or inaccurate. This can lead to flawed hypotheses and erroneous conclusions.
  • May be biased: Hypotheses may be biased by the researcher’s own beliefs, values, or assumptions. This can lead to selective interpretation of data and a lack of objectivity in research.
  • Cannot prove causation: A hypothesis can only show a correlation between variables, but it cannot prove causation. This requires further experimentation and analysis.
  • Limited to specific contexts: Hypotheses are limited to specific contexts and may not be generalizable to other situations or populations. This means that results may not be applicable in other contexts or may require further testing.
  • May be affected by chance: Hypotheses may be affected by chance or random variation, which can obscure or distort the true relationship between variables.

About the author


Muhammad Hassan

Researcher, Academic Writer, Web developer


Lesson 10 of 24 By Avijeet Biswal

What Is Hypothesis Testing in Statistics? Types and Examples


In today’s data-driven world, decisions are based on data all the time. Hypotheses play a crucial role in that process, whether in business decisions, the health sector, academia, or quality improvement. Without hypotheses and hypothesis tests, you risk drawing the wrong conclusions and making bad decisions. In this tutorial, you will look at hypothesis testing in statistics.

What Is Hypothesis Testing in Statistics?

Hypothesis Testing is a type of statistical analysis in which you put your assumptions about a population parameter to the test. It is used to estimate the relationship between two statistical variables.

Let's discuss a few examples of statistical hypotheses from real life:

  • A teacher assumes that 60% of his college's students come from lower-middle-class families.
  • A doctor believes that 3D (Diet, Dose, and Discipline) is 90% effective for diabetic patients.

Now that you know about hypothesis testing, look at the various types of hypothesis testing in statistics.


Importance of Hypothesis Testing in Data Analysis

Here is what makes hypothesis testing so important in data analysis and why it is key to making better decisions:

Avoiding Misleading Conclusions (Type I and Type II Errors)

One of the biggest benefits of hypothesis testing is that it helps you avoid jumping to the wrong conclusions. For instance, a Type I error could occur if a company launches a new product thinking it will be a hit, only to find out later that the data misled them. A Type II error might happen when a company overlooks a potentially successful product because their testing wasn’t thorough enough. By setting up the right significance level and carefully calculating the p-value, hypothesis testing minimizes the chances of these errors, leading to more accurate results.

Making Smarter Choices

Hypothesis testing is key to making smarter, evidence-based decisions. Let’s say a city planner wants to determine if building a new park will increase community engagement. By testing the hypothesis using data from similar projects, they can make an informed choice. Similarly, a teacher might use hypothesis testing to see if a new teaching method actually improves student performance. It’s about taking the guesswork out of decisions and relying on solid evidence instead.

Optimizing Business Tactics

In business, hypothesis testing is invaluable for testing new ideas and strategies before fully committing to them. For example, an e-commerce company might want to test whether offering free shipping increases sales. By using hypothesis testing, they can compare sales data from customers who received free shipping offers and those who didn’t. This allows them to base their business decisions on data, not hunches, reducing the risk of costly mistakes.

Hypothesis Testing Formula

Z = (x̅ − μ0) / (σ/√n)

  • Here, x̅ is the sample mean,
  • μ0 is the hypothesized population mean,
  • σ is the population standard deviation,
  • n is the sample size.
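The formula above can be sketched as a small Python helper. This is a minimal sketch; the function name and the example numbers are illustrative, not from the original text:

```python
import math

def z_statistic(sample_mean, pop_mean, pop_sd, n):
    """One-sample z statistic: how many standard errors the sample
    mean lies from the hypothesized population mean."""
    standard_error = pop_sd / math.sqrt(n)
    return (sample_mean - pop_mean) / standard_error

# Illustrative numbers: sample mean 52, hypothesized mean 50, sigma 10, n 100
z = z_statistic(52, 50, 10, 100)  # (52 - 50) / (10/10) = 2.0
```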

How Does Hypothesis Testing Work?

An analyst performs hypothesis testing on a statistical sample to assess the plausibility of the null hypothesis. Measurements and analyses are conducted on a random sample of the population to test a theory. Analysts use a random population sample to test two hypotheses: the null and alternative hypotheses.

The null hypothesis is typically an equality hypothesis between population parameters; for example, a null hypothesis may claim that the population mean return equals zero. The alternative hypothesis is essentially the inverse of the null hypothesis (e.g., the population mean return is not equal to zero). As a result, they are mutually exclusive, and only one can be correct. One of the two possibilities, however, will always be correct.


Null Hypothesis and Alternative Hypothesis

The Null Hypothesis is the assumption that the event will not occur. A null hypothesis has no bearing on the study's outcome unless it is rejected.

H0 is the symbol for it, and it is pronounced H-naught.

The Alternate Hypothesis is the logical opposite of the null hypothesis. The acceptance of the alternative hypothesis follows the rejection of the null hypothesis. H1 is the symbol for it.

Let's understand this with an example.

A sanitizer manufacturer claims that its product kills 95 percent of germs on average. 

To put this company's claim to the test, create a null and alternate hypothesis.

H0 (Null Hypothesis): Average = 95%.

Alternative Hypothesis (H1): The average is less than 95%.

Another straightforward example to understand this concept is determining whether or not a coin is fair and balanced. The null hypothesis states that the probability of heads is equal to the probability of tails. In contrast, the alternative hypothesis states that the probabilities of heads and tails would be very different.
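The coin example can be checked numerically with an exact binomial test in pure Python. This is a sketch; the 100-flip/60-heads numbers are illustrative assumptions, not from the text:

```python
from math import comb

def binom_pmf(k, n, p=0.5):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def two_sided_p_value(heads, flips):
    """Two-sided exact p-value for H0: the coin is fair (p = 0.5).
    Assumes heads >= flips/2; by symmetry of Binomial(n, 0.5),
    double the upper-tail probability (capped at 1)."""
    upper_tail = sum(binom_pmf(k, flips) for k in range(heads, flips + 1))
    return min(1.0, 2 * upper_tail)

# Illustrative: 60 heads in 100 flips gives p ~ 0.057, so a fair coin
# is not quite rejected at the 5% level
p = two_sided_p_value(60, 100)
```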


Hypothesis Testing Calculation With Examples

Let's consider a hypothesis test for the average height of women in the United States. Suppose our null hypothesis is that the average height is 5'4". We gather a sample of 100 women and determine their average height is 5'5". The population standard deviation is 2 inches.

To calculate the z-score, we would use the following formula:

z = ( x̅ – μ0 ) / (σ /√n)

z = (5'5" − 5'4") / (2"/√100)

z = 1 / 0.2 = 5

We will reject the null hypothesis, as the z-score of 5 is far beyond the two-tailed critical value of 1.96, and conclude that there is evidence to suggest that the average height of women in the US is greater than 5'4".
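The calculation above can be verified in Python with the heights converted to inches. This is a sketch of the arithmetic only:

```python
import math

# Heights in inches: H0 mean = 64 (5'4"), sample mean = 65 (5'5"),
# population standard deviation = 2, sample size n = 100
z = (65 - 64) / (2 / math.sqrt(100))

# |z| is about 5, far beyond the two-tailed 5% critical value of 1.96,
# so the null hypothesis is rejected
reject = abs(z) > 1.96
```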

Steps in Hypothesis Testing

Hypothesis testing is a statistical method to determine if there is enough evidence in a sample of data to infer that a certain condition is true for the entire population. Here’s a breakdown of the typical steps involved in hypothesis testing:

Formulate Hypotheses

  • Null Hypothesis (H0): This hypothesis states that there is no effect or difference, and it is the hypothesis you attempt to reject with your test.
  • Alternative Hypothesis (H1 or Ha): This hypothesis is what you might believe to be true or hope to prove true. It is usually considered the opposite of the null hypothesis.

Choose the Significance Level (α)

The significance level, often denoted by alpha (α), is the probability of rejecting the null hypothesis when it is true. Common choices for α are 0.05 (5%), 0.01 (1%), and 0.10 (10%).

Select the Appropriate Test

Choose a statistical test based on the type of data and the hypothesis. Common tests include t-tests, chi-square tests, ANOVA, and regression analysis. The selection depends on data type, distribution, sample size, and whether the hypothesis is one-tailed or two-tailed.

Collect Data

Gather the data that will be analyzed in the test. To infer conclusions accurately, this data should be representative of the population.

Calculate the Test Statistic

Based on the collected data and the chosen test, calculate a test statistic that reflects how much the observed data deviates from the null hypothesis.

Determine the p-value

The p-value is the probability of observing test results at least as extreme as the results observed, assuming the null hypothesis is correct. It helps determine the strength of the evidence against the null hypothesis.

Make a Decision

Compare the p-value to the chosen significance level:

  • If the p-value ≤ α: Reject the null hypothesis, suggesting sufficient evidence in the data supports the alternative hypothesis.
  • If the p-value > α: Do not reject the null hypothesis, suggesting insufficient evidence to support the alternative hypothesis.

Report the Results

Present the findings from the hypothesis test, including the test statistic, p-value, and the conclusion about the hypotheses.

Perform Post-hoc Analysis (if necessary)

Depending on the results and the study design, further analysis may be needed to explore the data more deeply or to address multiple comparisons if several hypotheses were tested simultaneously.
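The steps above can be sketched end-to-end for a two-tailed one-sample z-test in plain Python. This is a minimal sketch with illustrative numbers; the p-value uses the standard normal CDF built from math.erf:

```python
import math

def normal_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def one_sample_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-tailed one-sample z-test: returns the test statistic,
    the p-value, and the reject / fail-to-reject decision."""
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))   # calculate the statistic
    p_value = 2 * (1 - normal_cdf(abs(z)))             # determine the p-value
    return z, p_value, p_value <= alpha                # compare p-value to alpha

# Illustrative numbers: H0 mu = 50, observed mean 52, sigma 10, n = 100
z, p, reject = one_sample_z_test(52, 50, 10, 100)
```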

Types of Hypothesis Testing

1. Z-Test

To determine whether a discovery or relationship is statistically significant, hypothesis testing can use a z-test. It usually checks whether two means are the same (the null hypothesis). A z-test can be applied only when the population standard deviation is known and the sample size is 30 data points or more.

2. T-Test

A statistical test called a t-test is employed to compare the means of two groups. To determine whether two groups differ, or whether a procedure or treatment affects the population of interest, it is frequently used in hypothesis testing.

3. Chi-Square 

You utilize a Chi-square test for hypothesis testing concerning whether your data is as predicted. To determine if the expected and observed results are well-fitted, the Chi-square test analyzes the differences between categorical variables from a random sample. The test's fundamental premise is that the observed values in your data should be compared to the predicted values that would be present if the null hypothesis were true.

4. ANOVA

ANOVA, or Analysis of Variance, is a statistical method used to compare the means of three or more groups. It’s particularly useful when you want to see if there are significant differences between multiple groups. For instance, in business, a company might use ANOVA to analyze whether three different stores are performing differently in terms of sales. It’s also widely used in fields like medical research and social sciences, where comparing group differences can provide valuable insights.

Hypothesis Testing and Confidence Intervals

Both confidence intervals and hypothesis tests are inferential techniques that depend on approximating the sampling distribution. Data from a sample is used to estimate a population parameter with a confidence interval. Data from a sample is used in hypothesis testing to examine a given hypothesis. We must have a postulated parameter to conduct hypothesis testing.

Bootstrap distributions and randomization distributions are created using comparable simulation techniques. The observed sample statistic is the focal point of a bootstrap distribution, whereas the null hypothesis value is the focal point of a randomization distribution.

A confidence interval gives a range of plausible estimates for the population parameter. In this lesson, we created just two-tailed confidence intervals. There is a direct connection between these two-tailed confidence intervals and two-tailed hypothesis tests: they typically give the same result. In other words, a hypothesis test at the 0.05 level will virtually always fail to reject the null hypothesis if the 95% confidence interval contains the hypothesized value. A hypothesis test at the 0.05 level will nearly certainly reject the null hypothesis if the 95% confidence interval does not include the hypothesized parameter.
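The correspondence between a 95% confidence interval and a two-tailed test at the 0.05 level can be demonstrated numerically. The numbers below are illustrative assumptions; 1.96 is the standard normal critical value:

```python
import math

# Illustrative data: sample mean 52, hypothesized mean 50, sigma 10, n 100
x_bar, mu0, sigma, n = 52, 50, 10, 100
se = sigma / math.sqrt(n)

# 95% confidence interval for the population mean: x_bar +/- 1.96 * se
ci_low, ci_high = x_bar - 1.96 * se, x_bar + 1.96 * se

# Two-tailed z-test at the 0.05 level
z = (x_bar - mu0) / se
rejected = abs(z) > 1.96

# The test rejects H0 exactly when mu0 falls outside the interval
assert rejected == (mu0 < ci_low or mu0 > ci_high)
```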


Simple and Composite Hypothesis Testing

Depending on the population distribution, you can classify the statistical hypothesis into two types.

Simple Hypothesis: A simple hypothesis specifies an exact value for the parameter.

Composite Hypothesis: A composite hypothesis specifies a range of values.

A company is claiming that their average sales for this quarter are 1000 units. This is an example of a simple hypothesis.

Suppose the company claims that the sales are in the range of 900 to 1000 units. Then this is a case of a composite hypothesis.

One-Tailed and Two-Tailed Hypothesis Testing

The One-Tailed test, also called a directional test, considers a critical region of data that would result in the null hypothesis being rejected if the test sample falls into it, inevitably meaning the acceptance of the alternate hypothesis.

In a one-tailed test, the critical distribution area is one-sided, meaning the test sample is either greater or lesser than a specific value.

In a Two-Tailed test, the test sample is checked for being greater or less than a range of values, implying that the critical distribution area is two-sided.

If the sample falls within this range, the alternate hypothesis will be accepted, and the null hypothesis will be rejected.


Right Tailed Hypothesis Testing

If the greater-than (>) sign appears in your hypothesis statement, you are using a right-tailed test, also known as an upper test. To put it another way, the disparity is to the right. For instance, you can contrast the battery life before and after a change in production. Your hypothesis statements can be the following if you want to know whether the battery life is longer than the original (let's say 90 hours):

  • The null hypothesis: battery life is unchanged or lower (H0: μ ≤ 90).
  • The alternative hypothesis: battery life has risen (H1: μ > 90).

The crucial point in this situation is that the alternate hypothesis (H1), not the null hypothesis, decides whether you get a right-tailed test.

Left Tailed Hypothesis Testing

Alternative hypotheses that assert the true value of a parameter is lower than the null hypothesis value are tested with a left-tailed test; they are indicated by the less-than sign ("<").

Suppose H0: mean = 50 and H1: mean not equal to 50

According to the H1, the mean can be greater than or less than 50. This is an example of a Two-tailed test.

In a similar manner, if H0: mean >=50, then H1: mean <50

Here the alternative claims the mean is less than 50, so it is called a one-tailed (left-tailed) test.

Type 1 and Type 2 Error

A hypothesis test can result in two types of errors.

Type 1 Error: A Type-I error occurs when the sample results lead to rejecting the null hypothesis even though it is true.

Type 2 Error: A Type-II error occurs when the null hypothesis is not rejected when it is false, unlike a Type-I error.

Suppose a teacher evaluates the examination paper to decide whether a student passes or fails.

H0: Student has passed

H1: Student has failed

Type I error will be the teacher failing the student [rejects H0] although the student scored the passing marks [H0 was true]. 

Type II error will be the case where the teacher passes the student [do not reject H0] although the student did not score the passing marks [H1 is true].


Practice Problems on Hypothesis Testing

Here are the practice problems on hypothesis testing that will help you understand how to apply these concepts in real-world scenarios:

A telecom service provider claims that customers spend an average of ₹400 per month, with a standard deviation of ₹25. However, a random sample of 50 customer bills shows a mean of ₹250 and a standard deviation of ₹15. Does this sample data support the service provider’s claim?

Solution: Let’s break this down:

  • Null Hypothesis (H0): The average amount spent per month is ₹400.
  • Alternate Hypothesis (H1): The average amount spent per month is not ₹400.
  • Population Standard Deviation (σ): ₹25
  • Sample Size (n): 50
  • Sample Mean (x̄): ₹250

1. Calculate the z-value:

z = (250 − 400) / (25/√50) ≈ −42.43

2. Compare with critical z-values: For a 5% significance level, the critical z-values are −1.96 and +1.96. Since −42.43 is far outside this range, we reject the null hypothesis. The sample data suggests that the average amount spent is significantly different from ₹400.
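Problem 1's arithmetic can be checked in Python. This is a sketch of the calculation only:

```python
import math

# Telecom bill example: H0 mean = 400, sample mean = 250,
# population sd = 25, sample size n = 50
z = (250 - 400) / (25 / math.sqrt(50))

# z is roughly -42.4, far outside the +/-1.96 critical region,
# so the null hypothesis is rejected
reject = abs(z) > 1.96
```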

Out of 850 customers, 400 made online grocery purchases. Can we conclude that more than 50% of customers are moving towards online grocery shopping?

Solution: Here’s how to approach it:

  • Proportion of customers who shopped online (p): 400 / 850 = 0.47
  • Null Hypothesis (H0): The proportion of online shoppers is 50% or more.
  • Alternate Hypothesis (H1): The proportion of online shoppers is less than 50%.
  • Sample Size (n): 850
  • Significance Level (α): 5%

1. Calculate the z-value:

z = (p − P) / √(P(1 − P)/n)

z = (0.47 − 0.50) / √(0.50 × 0.50 / 850) ≈ −1.75

2. Compare with the critical z-value: For a 5% significance level (one-tailed test), the critical z-value is −1.645. Since −1.75 is less than −1.645, we reject the null hypothesis. This means the data does not support the idea that most customers are moving towards online grocery shopping.
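Problem 2's one-proportion z calculation, using the rounded proportion p = 0.47 from the solution, can be sketched as:

```python
import math

# One-proportion z-test: H0 P = 0.50, observed p = 0.47 (400/850), n = 850
P, p_hat, n = 0.50, 0.47, 850
z = (p_hat - P) / math.sqrt(P * (1 - P) / n)

# z is roughly -1.75, below the one-tailed 5% critical value of -1.645,
# so H0 is rejected in favor of "less than 50%"
reject = z < -1.645
```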

In a study of code quality, Team A has 250 errors in 1000 lines of code, and Team B has 300 errors in 800 lines of code. Can we say Team B performs worse than Team A?

Solution: Let’s analyze it:

  • Proportion of errors for Team A (pA): 250 / 1000 = 0.25
  • Proportion of errors for Team B (pB): 300 / 800 = 0.375
  • Null Hypothesis (H0): Team B’s error rate is less than or equal to Team A’s.
  • Alternate Hypothesis (H1): Team B’s error rate is greater than Team A’s.
  • Sample Size for Team A (nA): 1000
  • Sample Size for Team B (nB): 800

1. Calculate the pooled proportion and the z-value:

p = (nA × pA + nB × pB) / (nA + nB)

p = (1000 × 0.25 + 800 × 0.375) / (1000 + 800) ≈ 0.306

z = (pA − pB) / √(p(1 − p)(1/nA + 1/nB))

z = (0.25 − 0.375) / √(0.306 × 0.694 × (1/1000 + 1/800)) ≈ −5.72

2. Compare with the critical z-value: For a 5% significance level (one-tailed test), the critical z-value is −1.645. Since −5.72 is far less than −1.645, we reject the null hypothesis. The data indicates that Team B’s error rate is significantly worse than Team A’s.
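Problem 3's pooled two-proportion z-test can be sketched as:

```python
import math

# Two-proportion z-test with a pooled proportion
pA, nA = 250 / 1000, 1000   # Team A error rate and sample size
pB, nB = 300 / 800, 800     # Team B error rate and sample size

pooled = (nA * pA + nB * pB) / (nA + nB)                    # ~0.306
se = math.sqrt(pooled * (1 - pooled) * (1 / nA + 1 / nB))
z = (pA - pB) / se                                          # ~ -5.72

# Left-tailed test of H1: pB > pA, i.e. pA - pB < 0
reject = z < -1.645
```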


Applications of Hypothesis Testing

Apart from the practical problems, let's look at the real-world applications of hypothesis testing across various fields:

Medicine and Healthcare

In medicine, hypothesis testing plays a pivotal role in assessing the success of new treatments. For example, researchers may want to find out if a new exercise regimen improves heart health. By comparing data from patients who followed the program to those who didn’t, they can determine if the exercise significantly improves health outcomes. Such rigorous testing allows medical professionals to rely on proven methods rather than assumptions.

Quality Control and Manufacturing

In manufacturing, ensuring product quality is vital, and hypothesis testing helps maintain those standards. Suppose a beverage company introduces a new bottling process and wants to verify if it reduces contamination. By analyzing samples from the new and old processes, hypothesis testing can reveal whether the new method reduces the risk of contamination. This allows manufacturers to implement improvements that enhance product safety and quality confidently.

Education and Learning

In education and learning, hypothesis testing is a tool to evaluate the impact of innovative teaching techniques. Imagine a situation where teachers introduce project-based learning to boost critical thinking skills. By comparing the performance of students who engaged in project-based learning with those in traditional settings, educators can test their hypothesis. The results can help educators make informed choices about adopting new teaching strategies.

Environmental Science

Hypothesis testing is essential in environmental science for evaluating the effectiveness of conservation measures. For example, scientists might explore whether a new water management strategy improves river health. By collecting and comparing data on water quality before and after the implementation of the strategy, they can determine whether the intervention leads to positive changes. Such findings are crucial for guiding environmental decisions that have long-term impacts.

Marketing and Advertising

In marketing, businesses use hypothesis testing to refine their approaches. For instance, a clothing brand might test if offering limited-time discounts increases customer loyalty. By running campaigns with and without the discount and analyzing the outcomes, they can assess if the strategy boosts customer retention. Data-driven insights from hypothesis testing enable companies to design marketing strategies that resonate with their audience and drive growth.

Limitations of Hypothesis Testing

Hypothesis testing has some limitations that researchers should be aware of:

  • It cannot prove or establish the truth: Hypothesis testing provides evidence to support or reject a hypothesis, but it cannot confirm the absolute truth of the research question.
  • Results are sample-specific: Hypothesis testing is based on analyzing a sample from a population, and the conclusions drawn are specific to that particular sample.
  • Possible errors: During hypothesis testing, there is a chance of committing type I error (rejecting a true null hypothesis) or type II error (failing to reject a false null hypothesis).
  • Assumptions and requirements: Different tests have specific assumptions and requirements that must be met to accurately interpret results.


After reading this tutorial, you should have a much better understanding of hypothesis testing, one of the most important concepts in the field of Data Science. The majority of hypotheses are based on speculation about observed behavior, natural phenomena, or established theories.


1. What is hypothesis testing in statistics with example?

Hypothesis testing is a statistical method used to determine if there is enough evidence in sample data to draw conclusions about a population. It involves formulating two competing hypotheses, the null hypothesis (H0) and the alternative hypothesis (Ha), and then collecting data to assess the evidence. An example: testing if a new drug improves patient recovery (Ha) compared to the standard treatment (H0) based on collected patient data.

2. What is H0 and H1 in statistics?

In statistics, H0​ and H1​ represent the null and alternative hypotheses. The null hypothesis, H0​, is the default assumption that no effect or difference exists between groups or conditions. The alternative hypothesis, H1​, is the competing claim suggesting an effect or a difference. Statistical tests determine whether to reject the null hypothesis in favor of the alternative hypothesis based on the data.

3. What is a simple hypothesis with an example?

A simple hypothesis is a specific statement predicting a single relationship between two variables. It posits a direct and uncomplicated outcome. For example, a simple hypothesis might state, "Increased sunlight exposure increases the growth rate of sunflowers." Here, the hypothesis suggests a direct relationship between the amount of sunlight (independent variable) and the growth rate of sunflowers (dependent variable), with no additional variables considered.

4. What are the 3 major types of hypothesis?

The three major types of hypotheses are:

  • Null Hypothesis (H0): Represents the default assumption, stating that there is no significant effect or relationship in the data.
  • Alternative Hypothesis (Ha): Contradicts the null hypothesis and proposes a specific effect or relationship that researchers want to investigate.
  • Nondirectional Hypothesis: An alternative hypothesis that doesn't specify the direction of the effect, leaving it open for both positive and negative possibilities.

5. What software tools can assist with hypothesis testing?

Several software tools offering distinct features can help with hypothesis testing. R and RStudio are popular for their advanced statistical capabilities. The Python ecosystem, including libraries like SciPy and Statsmodels, also supports hypothesis testing. SAS and SPSS are well-established tools for comprehensive statistical analysis. For basic testing, Excel offers simple built-in functions.

6. How do I interpret the results of a hypothesis test?

Interpreting hypothesis test results involves comparing the p-value to the significance level (alpha). If the p-value is less than or equal to alpha, you can reject the null hypothesis, indicating statistical significance. This suggests that the observed effect is unlikely to have occurred by chance, validating your analysis findings.

7. Why is sample size important in hypothesis testing?

Sample size is crucial in hypothesis testing as it affects the test’s power. A larger sample size increases the likelihood of detecting a true effect, reducing the risk of Type II errors. Conversely, a small sample may lack the statistical power needed to identify differences, potentially leading to inaccurate conclusions.

8. Can hypothesis testing be used for non-numerical data?

Yes, hypothesis testing can be applied to non-numerical data through non-parametric tests. These tests are ideal when data doesn't meet parametric assumptions or when dealing with categorical data. Non-parametric tests, like the Chi-square or Mann-Whitney U test, provide robust methods for analyzing non-numerical data and drawing meaningful conclusions.

9. How do I choose the proper hypothesis test?

Selecting the right hypothesis test depends on several factors: the objective of your analysis, the type of data (numerical or categorical), and the sample size. Consider whether you're comparing means, proportions, or associations, and whether your data follows a normal distribution. The correct choice ensures accurate results tailored to your research question.


About the Author

Avijeet Biswal

Avijeet is a Senior Research Analyst at Simplilearn. Passionate about Data Analytics, Machine Learning, and Deep Learning, Avijeet is also interested in politics, cricket, and football.


Lesson 27: Likelihood Ratio Tests

In this lesson, we'll learn how to apply a method for developing a hypothesis test for situations in which both the null and alternative hypotheses are composite. That's not completely accurate. The method, called the likelihood ratio test , can be used even when the hypotheses are simple, but it is most commonly used when the alternative hypothesis is composite. Throughout the lesson, we'll continue to assume that we know the functional form of the probability density (or mass) function, but we don't know the value of one (or more) of its parameters. That is, we might know that the data come from a normal distribution, but we don't know the mean or variance of the distribution, and hence the interest in performing a hypothesis test about the unknown parameter(s).

27.1 - A Definition and Simple Example

The title of this page is a little risky, as there are few simple examples when it comes to likelihood ratio testing! But, we'll work to make the example as simple as possible, namely by assuming again, unrealistically, that we know the population variance, but not the population mean. Before we state the definition of a likelihood ratio test, and then investigate our simple, but unrealistic, example, we first need to define some notation that we'll use throughout the lesson.

We'll assume that the probability density (or mass) function of X is \(f(x;\theta)\) where \(\theta\) represents one or more unknown parameters. Then:

  • Let \(\Omega\) (greek letter "omega") denote the total possible parameter space of \(\theta\), that is, the set of all possible values of \(\theta\) as specified in totality in the null and alternative hypotheses.
  • Let \(H_0 : \theta \in \omega\) denote the null hypothesis where \(\omega\) (greek letter "omega") is a subset of the parameter space \(\Omega\).
  • Let \(H_A : \theta \in \omega'\) denote the alternative hypothesis where \(\omega '\) is the complement of \(\omega\) with respect to the parameter space \(\Omega\).

Let's make sure we are clear about that phrase "where \(\omega '\) is the complement of \(\omega\) with respect to the parameter space \(\Omega\)."

Example 27-1

If the total parameter space of the mean \(\mu\) is \(\Omega = \{\mu: -\infty < \mu < \infty\}\) and the null hypothesis is specified as \(H_0: \mu = 3\), how should we specify the alternative hypothesis so that the alternative parameter space is the complement of the null parameter space?

If the null parameter space is \(\omega = \{\mu: \mu = 3\}\), then the alternative parameter space is everything in \(\Omega = \{\mu: -\infty < \mu < \infty\}\) that is not in \(\omega\). That is, the alternative parameter space is \(\omega' = \{\mu: \mu \ne 3\}\). And so the alternative hypothesis is:

\(H_A : \mu \ne 3\)

In this case, we'd be interested in deriving a two-tailed test.

Example 27-2

If the alternative hypothesis is \(H_A: \mu > 3\), how should we (technically) specify the null hypothesis so that the null parameter space is the complement of the alternative parameter space?

If the alternative parameter space is \(\omega' = \{\mu: \mu > 3\}\), then the null parameter space is \(\omega = \{\mu: \mu \le 3\}\). And so the null hypothesis is:

\(H_0 : \mu \le 3\)

Now, the reality is that some authors do specify the null hypothesis as such, even when they mean \(H_0: \mu = 3\). Ours don't, and so we won't. (That's why I put that "technically" in parentheses up above.) At any rate, in this case, we'd be interested in deriving a one-tailed test.

Definition. Let:

\(L(\hat{\omega})\) denote the maximum of the likelihood function with respect to \(\theta\) when \(\theta\) is in the null parameter space \(\omega\).

\(L(\hat{\Omega})\) denote the maximum of the likelihood function with respect to \(\theta\) when \(\theta\) is in the entire parameter space \(\Omega\).

Then, the likelihood ratio is the quotient:

\(\lambda = \dfrac{L(\hat{\omega})}{L(\hat{\Omega})}\)

And, to test the null hypothesis \(H_0 : \theta \in \omega\) against the alternative hypothesis \(H_A : \theta \in \omega'\), the critical region for the likelihood ratio test is the set of sample points for which:

\(\lambda = \dfrac{L(\hat{\omega})}{L(\hat{\Omega})} \le k\)

where \(0 < k < 1\), and k is selected so that the test has a desired significance level \(\alpha\).

Example 27-3

A food processing company packages honey in small glass jars. Each jar is supposed to contain 10 fluid ounces of the sweet and gooey good stuff. Previous experience suggests that X , the volume in fluid ounces of a randomly selected jar of the company's honey, is normally distributed with a known variance of 2. Derive the likelihood ratio test for testing, at a significance level of \(\alpha = 0.05\), the null hypothesis \(H_0: \mu = 10\) against the alternative hypothesis \(H_A: \mu \ne 10\).

Because we are interested in testing the null hypothesis \(H_0: \mu = 10\) against the alternative hypothesis \(H_A: \mu ≠ 10\) for a normal mean, our total parameter space is:

\(\Omega =\left \{\mu : -\infty < \mu < \infty \right \}\)

and our null parameter space is:

\(\omega = \left \{10\right \}\)

Now, to find the likelihood ratio, as defined above, we first need to find \(L(\hat{\omega})\). Well, when the null hypothesis \(H_0: \mu = 10\) is true, the mean \(\mu\) can take on only one value, namely, \(\mu = 10\). Therefore:

\(L(\hat{\omega}) = L(10)\)

We also need to find \(L(\hat{\Omega})\) in order to define the likelihood ratio. To find it, we must find the value of \(\mu\) that maximizes \(L(\mu)\). Well, we did that back when we studied maximum likelihood as a method of estimation: we showed that \(\hat{\mu} = \bar{x}\) is the maximum likelihood estimate of \(\mu\). Therefore:

\(L(\hat{\Omega}) = L(\bar{x})\)

Now, putting it all together to form the likelihood ratio, we get:

\(\lambda = \dfrac{L(10)}{L(\bar{x})} = \dfrac{\left(\dfrac{1}{\sqrt{4\pi}}\right)^n \text{exp}\left[-\dfrac{1}{4}\sum_{i=1}^{n}(x_i - 10)^2\right]}{\left(\dfrac{1}{\sqrt{4\pi}}\right)^n \text{exp}\left[-\dfrac{1}{4}\sum_{i=1}^{n}(x_i - \bar{x})^2\right]}\)

which simplifies to:

\(\lambda = \text{exp}\left[-\dfrac{1}{4}\left(\sum_{i=1}^{n}(x_i - 10)^2 - \sum_{i=1}^{n}(x_i - \bar{x})^2\right)\right]\)

Now, let's step aside for a minute and focus just on the summation in the numerator. If we "add 0" in a special way to the quantity in parentheses:

\(\sum_{i=1}^{n}(x_i - 10)^2 = \sum_{i=1}^{n}\left((x_i - \bar{x}) + (\bar{x} - 10)\right)^2\)

we can show (because the cross-product term vanishes, as \(\sum_{i=1}^{n}(x_i - \bar{x}) = 0\)) that the summation can be written as:

\(\sum_{i=1}^{n}(x_i - 10)^2 = \sum_{i=1}^{n}(x_i - \bar{x})^2 + n(\bar{x} -10)^2 \)

Therefore, the likelihood ratio becomes:

\(\lambda = \text{exp}\left[-\dfrac{1}{4}\left(\sum_{i=1}^{n}(x_i - \bar{x})^2 + n(\bar{x} - 10)^2 - \sum_{i=1}^{n}(x_i - \bar{x})^2\right)\right]\)

which greatly simplifies to:

\(\lambda = exp \left [-\dfrac{n}{4}(\bar{x}-10)^2 \right ]\)
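Both the "add 0" decomposition and this final simplification are easy to verify numerically. The sketch below (plain Python; the sample of jar volumes is made up for illustration) computes \(\lambda\) directly from the two sums of squares and compares it with the closed form:

```python
import math

# Hypothetical sample of jar volumes (n = 5); the variance is known to be 2
x = [9.8, 10.4, 10.1, 9.5, 10.2]
n = len(x)
xbar = sum(x) / n

# Direct computation: lambda = exp(-(1/4) * (SS about 10 - SS about the mean))
ss_null = sum((xi - 10) ** 2 for xi in x)
ss_mean = sum((xi - xbar) ** 2 for xi in x)
lam_direct = math.exp(-(ss_null - ss_mean) / 4)

# Closed form obtained after the "add 0" decomposition
lam_closed = math.exp(-n * (xbar - 10) ** 2 / 4)

assert math.isclose(lam_direct, lam_closed)
```

The two values agree for any sample, since `ss_null - ss_mean` is exactly \(n(\bar{x}-10)^2\).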

Now, the likelihood ratio test tells us to reject the null hypothesis when the likelihood ratio \(\lambda\) is small, that is, when:

\(\lambda = exp\left[-\dfrac{n}{4}(\bar{x}-10)^2 \right] \le k\)

where k is chosen to ensure that, in this case, \(\alpha = 0.05\). Well, by taking the natural log of both sides of the inequality, we can show that \(\lambda \le k\) is equivalent to:

\( -\dfrac{n}{4}(\bar{x}-10)^2 \le \ln k \)

which, by multiplying through by \(-4/n\) (and thereby flipping the inequality), is equivalent to:

\((\bar{x}-10)^2 \ge -\dfrac{4}{n} \ln k \)

which is equivalent to:

\(\dfrac{|\bar{X}-10|}{\sigma / \sqrt{n}} \ge \dfrac{\sqrt{-(4/n)\ln k}}{\sigma / \sqrt{n}} = k^* \)

Aha! We should recognize the quantity on the left-hand side of the inequality! We know that:

\(Z = \dfrac{\bar{X}-10}{\sigma / \sqrt{n}} \)

follows a standard normal distribution when \(H_0: \mu = 10\). Therefore we can determine the appropriate \(k^*\) by using the standard normal table. We have shown that the likelihood ratio test tells us to reject the null hypothesis \(H_0: \mu = 10\) in favor of the alternative hypothesis \(H_A: \mu ≠ 10\) for all sample means for which the following holds:

\(\dfrac{|\bar{X}-10|}{ \sqrt{2} / \sqrt{n}} \ge z_{0.025} = 1.96 \)

Doing so will ensure that our probability of committing a Type I error is set to \(\alpha = 0.05\), as desired.
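The resulting decision rule can be sketched in a few lines of plain Python. The samples below are hypothetical; only the known variance of 2, the hypothesized mean of 10, and the critical value 1.96 come from the example:

```python
import math

SIGMA2 = 2.0   # known population variance from the example
MU0 = 10.0     # hypothesized mean under H0
Z_CRIT = 1.96  # z_{0.025}, for a two-sided test at alpha = 0.05

def reject_h0(sample):
    """Likelihood ratio (equivalently, z-) test of H0: mu = 10 vs H1: mu != 10."""
    n = len(sample)
    xbar = sum(sample) / n
    z = abs(xbar - MU0) / math.sqrt(SIGMA2 / n)
    return z >= Z_CRIT

# A hypothetical sample centered near 10 is not rejected...
print(reject_h0([9.9, 10.2, 10.0, 9.8, 10.1]))    # False
# ...while one far from 10 is rejected
print(reject_h0([11.5, 11.8, 12.0, 11.6, 11.9]))  # True
```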

27.2 - The T-Test For One Mean

Well, geez, now why would we be revisiting the t -test for a mean \(\mu\) when we have already studied it back in the hypothesis testing section? Well, the answer, it turns out, is that, as we'll soon see, the t -test for a mean \(\mu\) is the likelihood ratio test! Let's take a look!

Example 27-4

Suppose that a random sample \(X_1 , X_2 , \dots , X_n\) arises from a normal population with unknown mean \(\mu\) and unknown variance \(\sigma^2\). (Yes, back to the realistic situation, in which we don't know the population variance either.) Find the size \(\alpha\) likelihood ratio test for testing the null hypothesis \(H_0: \mu = \mu_0\) against the two-sided alternative hypothesis \(H_A: \mu ≠ \mu_0\) .

Our unrestricted parameter space is:

\( \Omega = \left\{ (\mu, \sigma^2) : -\infty < \mu < \infty, 0 < \sigma^2 < \infty \right\} \)

Under the null hypothesis, the mean \(\mu\) is the only parameter that is restricted. Therefore, our parameter space under the null hypothesis is:

\( \omega = \left\{(\mu, \sigma^2) : \mu =\mu_0, 0 < \sigma^2 < \infty \right\}\)

Now, first consider the case where the mean and variance are unrestricted. We showed back when we studied maximum likelihood estimation that the maximum likelihood estimates of \(\mu\) and \(\sigma^2\) are, respectively:

\(\hat{\mu} = \bar{x} \text{ and } \hat{\sigma}^2 = \dfrac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2 \)

Therefore, the maximum of the likelihood function for the unrestricted parameter space is:

\( L(\hat{\Omega})= \left[\dfrac{ne^{-1}}{2\pi \Sigma (x_i - \bar{x})^2} \right]^{n/2} \)

Now, under the null parameter space, the maximum likelihood estimates of \(\mu\)   and \(\sigma^2\) are, respectively:

\( \hat{\mu} = \mu_0 \text{ and } \hat{\sigma}^2 = \dfrac{1}{n}\sum_{i=1}^{n}(x_i - \mu_0)^2 \)

Therefore, the maximum of the likelihood function for the null parameter space is:

\( L(\hat{\omega})= \left[\dfrac{ne^{-1}}{2\pi \Sigma (x_i - \mu_0)^2} \right]^{n/2} \)

And now taking the ratio of the two likelihoods, we get:

\( \lambda = \dfrac{L(\hat{\omega})}{L(\hat{\Omega})} = \dfrac{\left[\dfrac{ne^{-1}}{2\pi \Sigma (x_i - \mu_0)^2}\right]^{n/2}}{\left[\dfrac{ne^{-1}}{2\pi \Sigma (x_i - \bar{x})^2}\right]^{n/2}} \)

which reduces to:

\( \lambda = \left[ \dfrac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{\sum_{i=1}^{n}(x_i - \mu_0)^2} \right] ^{n/2}\)

Focusing only on the denominator for a minute, let's do that trick again of "adding 0" in just the right way. Adding 0 to the quantity in the parentheses, we get:

\( \sum_{i=1}^{n}(x_i - \mu_0)^2 = \sum_{i=1}^{n}(x_i - \bar{x})^2 +n(\bar{x} - \mu_0)^2 \)

Then, our likelihood ratio \(\lambda\) becomes:

\( \lambda = \left[ \dfrac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{\sum_{i=1}^{n}(x_i - \mu_0)^2} \right] ^{n/2} = \left[ \dfrac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{ \sum_{i=1}^{n}(x_i - \bar{x})^2 +n(\bar{x} - \mu_0)^2} \right] ^{n/2} \)

which, upon dividing through the numerator and denominator by \( \sum_{i=1}^{n}(x_i - \bar{x})^2 \), simplifies to:

\( \lambda = \left[\dfrac{1}{1 + \dfrac{n(\bar{x} - \mu_0)^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}}\right]^{n/2} \)

Therefore, the likelihood ratio test's critical region, which is given by the inequality \(\lambda \le k\), is equivalent to:

\( \dfrac{n(\bar{x} - \mu_0)^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2} \ge k^{-2/n} - 1 \)

which, with some minor algebraic manipulation (using \( s^2 = \sum_{i=1}^{n}(x_i - \bar{x})^2/(n-1) \)), yields an equivalent rejection rule in terms of the sample variance.

So, in a nutshell, we've shown that the likelihood ratio test tells us that for this situation we should reject the null hypothesis \(H_0: \mu= \mu_0\) in favor of the alternative hypothesis \(H_A: \mu ≠ \mu_0\)   if:

\( \dfrac{(\bar{x}-\mu_0)^2 }{s^2 / n} \ge k^{*} \)

Well, okay, so I started out this page claiming that the t -test for a mean \(\mu\) is the likelihood ratio test. Is it? Well, the above critical region is equivalent to rejecting the null hypothesis if:

\( \dfrac{|\bar{x}-\mu_0| }{s / \sqrt{n}} \ge k^{**} \)

Does that look familiar? We previously learned that if \(X_1, X_2, \dots, X_n\) are normally distributed with mean \(\mu\) and variance \(\sigma^2\), then:

\( T = \dfrac{\bar{X}-\mu}{S / \sqrt{n}} \)

follows a T distribution with n − 1 degrees of freedom. So, this tells us that we should use the T distribution to choose \(k^{**}\) . That is, set:

\(k^{**} = t_{\alpha /2, n-1}\)

and we have our size \(\alpha\) t -test that ensures the probability of committing a Type I error is \(\alpha\).
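As a sketch of the resulting test in plain Python (standard library only): the sample below is made up, and the critical value \(t_{0.025,\,8} \approx 2.306\) is taken from a t table for the assumed sample size of 9.

```python
import math
import statistics

def t_statistic(sample, mu0):
    """|xbar - mu0| / (s / sqrt(n)), with s the usual (n - 1)-denominator sd."""
    n = len(sample)
    xbar = statistics.mean(sample)
    s = statistics.stdev(sample)  # divides by n - 1, matching s^2 above
    return abs(xbar - mu0) / (s / math.sqrt(n))

# Hypothetical sample of size n = 9, testing H0: mu = 10
sample = [10.2, 9.7, 10.5, 10.1, 9.9, 10.3, 10.0, 9.8, 10.4]
T_CRIT = 2.306  # t_{0.025, 8}, two-sided 5% critical value (from a t table)
print(t_statistic(sample, mu0=10) >= T_CRIT)  # False: do not reject H0
```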

It turns out... we didn't know it at the time... but every hypothesis test that we derived in the hypothesis testing section is a likelihood ratio test. Back then, we derived each test using distributional results of the relevant statistic(s), but we could have alternatively, and perhaps just as easily, derived the tests using the likelihood ratio testing method.

What does "Composite Hypothesis" mean?

Definition of Composite Hypothesis in the context of A/B testing (online controlled experiments).

What is a Composite Hypothesis?

In hypothesis testing, a composite hypothesis is a hypothesis which covers a set of values from the parameter space. For example, if the entire parameter space covers -∞ to +∞, a composite hypothesis could be μ ≤ 0. It could be any other number as well, such as 1, 2, or 3.1245. The alternative hypothesis is always a composite hypothesis : either a one-sided hypothesis if the null is composite, or a two-sided one if the null is a point null. The "composite" part means that such a hypothesis is the union of many simple point hypotheses.

In a Null Hypothesis Statistical Test, only the null hypothesis can be a point hypothesis. A composite hypothesis usually spans from -∞ to zero or some value of practical significance, or from such a value to +∞.

Testing a composite null is what is most often of interest in an A/B testing scenario, as we are usually interested in detecting and estimating effects in only one direction: an increase in conversion rate or average revenue per user, or a decrease in unsubscribe events, would be of interest, but not its opposite. In fact, running a test long enough to detect a statistically significant negative outcome can result in significant business harm.

Like this glossary entry? For an in-depth and comprehensive reading on A/B testing stats, check out the book "Statistical Methods in Online A/B Testing" by the author of this glossary, Georgi Georgiev.


How do I use composite strategies in hypothesis (hypothesis.errors.InvalidArgument: Expected SearchStrategy but got function)

This example is a variation of the one in the docs:

What am I doing wrong?

  • python-hypothesis

  • what's draw here? –  baxx Commented Sep 10, 2021 at 15:08

You need to call the composite functions. This is not explained in the docs, but there is an example in a 2016 blog post.

  • 3 There's also an example in the docs ( list_and_index ), but I agree this should be explicitly stated in the text. –  Zac Hatfield-Dodds Commented Jun 16, 2018 at 18:44
  • It is shown how to produce one example in the REPL. It is not clear to me that means that you have to use function calls in the given statements. Regular strategies are used without function calls. I think it would be helpful to write that composites are different :) –  The Unfun Cat Commented Jun 17, 2018 at 6:58
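To make the accepted answer concrete: a function decorated with `@st.composite` becomes a function that *returns* a strategy, so it must be called wherever a strategy is expected. The sketch below is hypothetical (the asker's original code is not shown above); it uses a `list_and_index` strategy like the one mentioned in the comments:

```python
from hypothesis import given, strategies as st
from hypothesis.strategies import SearchStrategy

@st.composite
def list_and_index(draw):
    """Draw a nonempty list of integers and a valid index into it."""
    xs = draw(st.lists(st.integers(), min_size=1))
    i = draw(st.integers(min_value=0, max_value=len(xs) - 1))
    return xs, i

# Calling the decorated function produces a strategy; passing the bare
# function to @given is what raises
# "Expected SearchStrategy but got function".
assert isinstance(list_and_index(), SearchStrategy)

@given(list_and_index())  # note the parentheses
def test_index_is_valid(pair):
    xs, i = pair
    assert 0 <= i < len(xs)
```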

Composite Hypothesis

Composite Hypothesis:

A statistical hypothesis which does not completely specify the distribution of a random variable is referred to as a composite hypothesis.
