
Hypothesis in Machine Learning


The concept of a hypothesis is fundamental in machine learning and data science. In machine learning, a hypothesis is the initial assumption that data scientists and ML professionals make when attempting to solve a problem. Machine learning involves conducting experiments based on past experience, and these hypotheses are crucial in formulating candidate solutions.

It’s important to note that in machine learning discussions, the terms “hypothesis” and “model” are sometimes used interchangeably. However, a hypothesis represents an assumption, while a model is a mathematical representation employed to test that hypothesis. This section on “Hypothesis in Machine Learning” explores key aspects related to hypotheses in machine learning and their significance.

Table of Content

  • How does a Hypothesis work?
  • Hypothesis Space and Representation in Machine Learning
  • Hypothesis in Statistics
  • FAQs on Hypothesis in Machine Learning

How does a Hypothesis work?

A hypothesis in machine learning is the model’s presumption about the relationship between the input features and the output. It is a representation of the mapping function that the algorithm is attempting to learn from the training set. During learning, the weights that parameterize the hypothesis are adjusted to minimize the discrepancy between the expected and actual outputs; a cost function is used to assess the hypothesis’s accuracy. The objective is to optimize the model’s parameters to achieve the best predictive performance on new, unseen data.

In most supervised machine learning algorithms, our main goal is to find, from the hypothesis space, a hypothesis that maps the inputs to the correct outputs. The following figure shows the common method for finding a possible hypothesis from the hypothesis space:

[Figure: selecting a possible hypothesis h from the hypothesis space H]

Hypothesis Space (H)

Hypothesis space is the set of all possible legal hypotheses. This is the set from which the machine learning algorithm determines the single best hypothesis that describes the target function or the outputs.

Hypothesis (h)

A hypothesis is a function that best describes the target in supervised machine learning. The hypothesis that an algorithm comes up with depends on the data and also on the restrictions and bias that we have imposed on the data.

For a simple linear model, the hypothesis can be represented as:

y = mx + b

Where:

  • y = predicted output
  • x = input feature
  • m = slope of the line
  • b = intercept

A short code sketch of this hypothesis appears below.
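To make this concrete, here is a minimal Python sketch (not from the original article; the data values are made up) that fits such a linear hypothesis to toy data with NumPy's least-squares polyfit:

    # Fit a linear hypothesis h(x) = m*x + b to made-up data.
    import numpy as np

    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([2.1, 4.0, 6.2, 7.9, 10.1])  # roughly y = 2x

    m, b = np.polyfit(x, y, deg=1)  # least-squares estimates of slope and intercept

    def h(x_new):
        # The learned hypothesis: maps an input to a predicted output.
        return m * x_new + b

    print(f"h(x) = {m:.2f}x + {b:.2f}")
    print(h(6.0))  # prediction for an unseen input

Minimizing the squared error here plays the role of the cost function mentioned above.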

To better understand the hypothesis space and hypothesis, consider the following coordinate plane showing the distribution of some data:

[Figure: scatter plot of labeled training data]

Suppose we have test data for which we have to determine the outputs or results. The test data is shown below:

[Figure: unlabeled test data points on the coordinate plane]

We can predict the outcomes by dividing the coordinate plane as shown below:

[Figure: one possible division of the coordinate plane]

So the test data would yield the following result:

[Figure: test data labeled according to the chosen division]

But note here that we could have divided the coordinate plane as:

[Figure: an alternative division of the coordinate plane]

The way in which the coordinate would be divided depends on the data, algorithm and constraints.

  • All the legal possible ways in which we can divide the coordinate plane to predict the outcome of the test data together compose the hypothesis space.
  • Each individual possible way is known as a hypothesis.

Hence, in this example, the hypothesis space consists of divisions such as:

[Figure: several possible divisions of the coordinate plane, each a distinct hypothesis]

The hypothesis space comprises all possible legal hypotheses that a machine learning algorithm can consider. Hypotheses are formulated based on various algorithms and techniques, including linear regression, decision trees, and neural networks. These hypotheses capture the mapping function transforming input data into predictions.

Hypothesis Formulation and Representation in Machine Learning

Hypotheses in machine learning are formulated based on various algorithms and techniques, each with its own representation. For example:

  • Linear Regression: h(X) = θ₀ + θ₁X₁ + θ₂X₂ + … + θₙXₙ
  • Decision Trees: h(X) = Tree(X)
  • Neural Networks: h(X) = NN(X)

In the case of complex models like neural networks, the hypothesis may involve multiple layers of interconnected nodes, each performing a specific computation.
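To make these representations concrete, here is a rough sketch (assuming scikit-learn is installed; the dataset is synthetic and the model settings are illustrative) that fits one hypothesis from each family to the same data:

    # Each fitted model is one hypothesis h(X) drawn from a different hypothesis space.
    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.tree import DecisionTreeRegressor
    from sklearn.neural_network import MLPRegressor

    rng = np.random.default_rng(0)
    X = rng.uniform(-3, 3, size=(200, 2))
    y = X[:, 0] ** 2 + X[:, 1] + rng.normal(scale=0.1, size=200)

    models = {
        "linear": LinearRegression(),                                # h(X) = theta_0 + theta . X
        "tree": DecisionTreeRegressor(max_depth=4, random_state=0),  # h(X) = Tree(X)
        "network": MLPRegressor(hidden_layer_sizes=(16,),
                                max_iter=2000, random_state=0),      # h(X) = NN(X)
    }

    for name, model in models.items():
        h = model.fit(X, y)            # search the hypothesis space for a good h
        print(name, h.predict(X[:2]))  # each h maps inputs to predictions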

Hypothesis Evaluation:

The process of machine learning involves not only formulating hypotheses but also evaluating their performance. This evaluation is typically done using a loss function or an evaluation metric that quantifies the disparity between predicted outputs and ground truth labels. Common evaluation metrics include mean squared error (MSE), accuracy, precision, recall, F1-score, and others. By comparing the predictions of the hypothesis with the actual outcomes on a validation or test dataset, one can assess the effectiveness of the model.
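For instance, MSE for regression and accuracy for classification can be computed directly; here is a small sketch with made-up predictions and ground-truth values:

    import numpy as np

    # Regression: mean squared error penalizes large errors quadratically.
    y_true = np.array([3.0, 5.0, 7.0, 9.0])
    y_pred = np.array([2.8, 5.3, 6.9, 9.4])
    mse = np.mean((y_true - y_pred) ** 2)
    print(f"MSE: {mse:.4f}")

    # Classification: accuracy is the fraction of labels predicted correctly.
    labels_true = np.array([1, 0, 1, 1, 0])
    labels_pred = np.array([1, 0, 0, 1, 0])
    accuracy = np.mean(labels_true == labels_pred)
    print(f"Accuracy: {accuracy:.2f}")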

Hypothesis Testing and Generalization:

Once a hypothesis is formulated and evaluated, the next step is to test its generalization capabilities. Generalization refers to the ability of a model to make accurate predictions on unseen data. A hypothesis that performs well on the training dataset but fails to generalize to new instances is said to suffer from overfitting. Conversely, a hypothesis that generalizes well to unseen data is deemed robust and reliable.

The process of hypothesis formulation, evaluation, testing, and generalization is often iterative in nature. It involves refining the hypothesis based on insights gained from model performance, feature importance, and domain knowledge. Techniques such as hyperparameter tuning, feature engineering, and model selection play a crucial role in this iterative refinement process.
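One common way to check generalization, sketched below with an illustrative scikit-learn model and synthetic data (the specific choices here are assumptions, not from the original text), is to compare training performance against performance on held-out data:

    # A hypothesis that scores much better on training data than on
    # held-out data is overfitting.
    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeRegressor

    rng = np.random.default_rng(1)
    X = rng.uniform(-3, 3, size=(300, 1))
    y = np.sin(X[:, 0]) + rng.normal(scale=0.3, size=300)

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=0)

    deep_tree = DecisionTreeRegressor(max_depth=None, random_state=0).fit(X_train, y_train)
    print("train R^2:", deep_tree.score(X_train, y_train))  # near 1.0: fits the noise
    print("test  R^2:", deep_tree.score(X_test, y_test))    # noticeably lower: overfitting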

Hypothesis in Statistics

In statistics, a hypothesis refers to a statement or assumption about a population parameter. It is a proposition or educated guess that helps guide statistical analyses. There are two types of hypotheses: the null hypothesis (H₀) and the alternative hypothesis (H₁ or Hₐ).

  • Null Hypothesis (H₀): This hypothesis suggests that there is no significant difference or effect, and any observed results are due to chance. It often represents the status quo or a baseline assumption.
  • Alternative Hypothesis (H₁ or Hₐ): This hypothesis contradicts the null hypothesis, proposing that there is a significant difference or effect in the population. It is what researchers aim to support with evidence; a brief test sketch follows below.
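As a brief illustration (not from the original article; the data and the hypothesized mean are made up), here is a one-sample t-test of H₀: μ = 50 in Python:

    # Test H0: the population mean equals 50, against H1: it does not.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(7)
    sample = rng.normal(loc=52.0, scale=4.0, size=40)  # synthetic measurements

    t_stat, p_value = stats.ttest_1samp(sample, popmean=50.0)
    print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
    # A small p-value is evidence against H0 and in favor of H1.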

FAQs on Hypothesis in Machine Learning

Q. How does the training process use the hypothesis?

The learning algorithm uses the hypothesis as a guide to minimise the discrepancy between expected and actual outputs by adjusting its parameters during training.

Q. How is the hypothesis’s accuracy assessed?

Usually, a cost function that measures the difference between expected and actual values is used to assess accuracy. The aim is to optimise the model to minimise this cost.

Q. What is Hypothesis testing?

Hypothesis testing is a statistical method for determining whether or not a hypothesis is correct. The hypothesis can be about two variables in a dataset, about an association between two groups, or about a situation.

Q. What distinguishes the null hypothesis from the alternative hypothesis in machine learning experiments?

The null hypothesis (H0) assumes no significant effect, while the alternative hypothesis (H1 or Ha) contradicts H0, suggesting a meaningful impact. Statistical testing is employed to decide between these hypotheses.


Best Guesses: Understanding The Hypothesis in Machine Learning

Stewart Kaplan

  • February 22, 2024
  • General, Supervised Learning, Unsupervised Learning

Machine learning is a vast and complex field that has inherited many terms from other places all over the mathematical domain.

It can sometimes be challenging to get your head around all the different terminologies, never mind trying to understand how everything comes together.

In this blog post, we will focus on one particular concept: the hypothesis.

While you may think this is simple, there is a little caveat regarding machine learning: the term has both a statistics side and a learning side.

Don’t worry; we’ll do a full breakdown below.

You’ll learn the following:

  • What is a hypothesis in machine learning?
  • Is this any different than the hypothesis in statistics?
  • What is the difference between the alternative hypothesis and the null?
  • Why do we restrict hypothesis space in artificial intelligence?
  • Example code performing hypothesis testing in machine learning

What Is a Hypothesis in Machine Learning?

In machine learning, the term ‘hypothesis’ can refer to two things.

First, it can refer to the hypothesis space, the set of all possible hypotheses (candidate functions) that could be used to predict or answer a new instance.

Second, it can refer to the traditional null and alternative hypotheses from statistics.

Since machine learning works so closely with statistics, 90% of the time, when someone is referencing the hypothesis, they’re referencing hypothesis tests from statistics.

Is This Any Different Than The Hypothesis In Statistics?

In statistics, the hypothesis is an assumption made about a population parameter.

The statistician’s goal is to find evidence that supports or refutes it.


This will take the form of two different hypotheses, one called the null, and one called the alternative.

Usually, you’ll establish your null hypothesis as the assumption that a population parameter equals some specific value.

For example, in Welch’s T-Test Of Unequal Variance, our null hypothesis is that the two population means we are testing are equal.

We run our statistical tests, and if our p-value is significant (very low), we reject the null hypothesis.

This would mean the population means of the two samples you are testing are unequal.

Usually, statisticians will use the significance level of .05 (a 5% risk of being wrong) when deciding what to use as the p-value cut-off.

What Is The Difference Between The Alternative Hypothesis And The Null?

The null hypothesis is our default assumption, which stands unless we find enough evidence to reject it.

The alternate hypothesis is usually the opposite of our null and is much broader in scope.

For most statistical tests, the null and alternative hypotheses are already defined.

You are then just trying to find “significant” evidence we can use to reject our null hypothesis.


These two hypotheses are easy to spot by their specific notation. The null hypothesis is usually denoted by H₀, while H₁ denotes the alternative hypothesis.

Example Code Performing Hypothesis Testing In Machine Learning

Since there are many different hypothesis tests in machine learning and data science, we will focus on one of my favorites.

This test is Welch’s T-Test Of Unequal Variance, where we are trying to determine if the population means of these two samples are different.

There are a couple of assumptions for this test, but we will ignore those for now and show the code.

You can read more about this here in our other post, Welch’s T-Test of Unequal Variance.
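Below is a sketch of how such a test can be run with SciPy; the sample data here is synthetic rather than the post's original data:

    # Welch's t-test: does not assume the two groups share a variance.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(42)
    sample_a = rng.normal(loc=10.0, scale=2.0, size=30)  # group A
    sample_b = rng.normal(loc=14.0, scale=5.0, size=40)  # group B: different mean and variance

    # equal_var=False selects Welch's t-test instead of Student's t-test.
    t_stat, p_value = stats.ttest_ind(sample_a, sample_b, equal_var=False)
    print(f"t = {t_stat:.3f}, p = {p_value:.4f}")

    if p_value < 0.05:
        print("Reject the null hypothesis: the population means differ.")
    else:
        print("Fail to reject the null hypothesis.")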

We see that our p-value is very low, and we reject the null hypothesis.


What Is The Difference Between The Biased And Unbiased Hypothesis Spaces?

The difference between the biased and unbiased hypothesis spaces is how many possible examples your algorithm can account for when making predictions. The unbiased space has all of them, and the biased space has only the training examples you’ve supplied.

Since neither of these is optimal (one is too small, one is much too big), your algorithm creates generalized rules (inductive learning) to be able to handle examples it hasn’t seen before.

Here’s an example of each:

Example of The Biased Hypothesis Space In Machine Learning

The Biased Hypothesis space in machine learning is a biased subspace where your algorithm does not consider all training examples to make predictions.

This is easiest to see with an example.

Let’s say you have the following data:

Happy  and  Sunny  and  Stomach Full  = True

Whenever your algorithm sees those three together in the biased hypothesis space, it’ll automatically default to true.

This means when your algorithm sees:

Sad  and  Sunny  And  Stomach Full  = False

It’ll automatically default to False since it didn’t appear in our subspace.

This is a greedy approach, but it has some practical applications.
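Here is a tiny sketch of this lookup-table behavior (the feature values are the blog's; the code framing is mine):

    # The learner memorizes only the supplied training examples and
    # defaults to False for any combination it has not seen.
    seen = {("Happy", "Sunny", "Stomach Full"): True}

    def predict(example):
        return seen.get(example, False)  # unseen combinations default to False

    print(predict(("Happy", "Sunny", "Stomach Full")))  # True: memorized
    print(predict(("Sad", "Sunny", "Stomach Full")))    # False: never seen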


Example of the Unbiased Hypothesis Space In Machine Learning

The unbiased hypothesis space is a space where all combinations are stored.

We can re-use our example above. This would start to break down as:

Happy  = True

Happy  and  Sunny  = True

Happy  and  Stomach Full  = True

Let’s say you have four options for each of the three choices.

This would mean our subspace would need 2^12 instances (4096) just for our little three-word problem.

This is practically impossible; the space would become huge.
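A quick sketch of that counting argument, assuming each of the four options for each of the three choices is encoded as its own binary feature (3 × 4 = 12 features):

    # 12 binary features give 2**12 = 4096 possible feature combinations.
    from itertools import product

    n_features = 3 * 4
    instances = list(product([0, 1], repeat=n_features))
    print(len(instances))  # 4096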


So while it would be highly accurate, this has no scalability.

More reading on this idea can be found in our post, Inductive Bias In Machine Learning .

Why Do We Restrict Hypothesis Space In Artificial Intelligence?

We have to restrict the hypothesis space in machine learning. Without any restrictions, our domain becomes much too large, and we lose any form of scalability.

This is why our algorithm creates rules to handle examples that are seen in production. 

This gives our algorithms a generalized approach that will be able to handle all new examples that are in the same format.


Hypothesis Space


  • Hendrik Blockeel  


Synonyms: Model space

The hypothesis space used by a machine learning system is the set of all hypotheses that might possibly be returned by it. It is typically defined by a Hypothesis Language, possibly in conjunction with a Language Bias.

Motivation and Background

Many machine learning algorithms rely on some kind of search procedure: given a set of observations and a space of all possible hypotheses that might be considered (the “hypothesis space”), they look in this space for those hypotheses that best fit the data (or are optimal with respect to some other quality criterion).

To describe the context of a learning system in more detail, we introduce the following terminology. The key terms have separate entries in this encyclopedia, and we refer to those entries for more detailed definitions.

A learner takes observations as inputs. The Observation Language is the language used to describe these observations.

The hypotheses that a learner may produce will be formulated in...



Artificial Intelligence: Foundations of Computational Agents (Poole & Mackworth, Cambridge University Press)

7.8.1 Version-Space Learning

Rather than enumerating all of the hypotheses, the set of elements of ℋ consistent with all of the examples can be found more efficiently by imposing some structure on the hypothesis space.

Hypothesis h₁ is a more general hypothesis than hypothesis h₂ if h₂ implies h₁. In this case, h₂ is a more specific hypothesis than h₁. Any hypothesis is both more general than itself and more specific than itself.

Example 7.29.

The hypothesis ¬academic ∧ music is more specific than music and is also more specific than ¬academic. Thus, music is more general than ¬academic ∧ music. The most general hypothesis is true. The most specific hypothesis is false.

The “more general than” relation forms a partial ordering over the hypothesis space. The version-space algorithm that follows exploits this partial ordering to search for hypotheses that are consistent with the training examples.

Given hypothesis space ℋ and examples E, the version space is the subset of ℋ that is consistent with the examples.

The general boundary of a version space, G, is the set of maximally general members of the version space (i.e., those members of the version space such that no other element of the version space is more general). The specific boundary of a version space, S, is the set of maximally specific members of the version space.

These concepts are useful because the general boundary and the specific boundary completely determine the version space, as shown by the following proposition.

Proposition 7.2.

The version space is the set of h ∈ ℋ such that h is more general than an element of S and more specific than an element of G.
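As a minimal sketch (not from the book), a conjunction of literals can be encoded as a set of literal strings, and "more general than" becomes a subset test, since h₂ implies h₁ exactly when h₁'s literals are a subset of h₂'s:

    # Fewer constraints means a more general hypothesis; the empty
    # conjunction (true) is the most general of all.
    def more_general(h1: frozenset, h2: frozenset) -> bool:
        return h1 <= h2  # h1's literals are a subset of h2's

    music = frozenset({"music"})
    specific = frozenset({"¬academic", "music"})
    true_hyp = frozenset()  # the empty conjunction

    print(more_general(music, specific))     # True: music is more general
    print(more_general(specific, music))     # False
    print(more_general(true_hyp, specific))  # True: true is more general than everything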

Candidate Elimination Learner

The candidate elimination learner incrementally builds the version space given a hypothesis space ℋ and a set E of examples. The examples are added one by one; each example possibly shrinks the version space by removing the hypotheses that are inconsistent with the example. The candidate elimination algorithm does this by updating the general and/or the specific boundary for each new example. This is described in Figure 7.21.

Example 7.30.

Consider how the candidate elimination algorithm handles Example 7.26, where ℋ is the set of conjunctions of literals.

Before it has seen any examples, G₀ = {true} – the user reads everything – and S₀ = {false} – the user reads nothing. Note that true is the empty conjunction and false is the conjunction of an atom and its negation.

After considering the first example, a₁, G₁ = {true} and S₁ contains the single conjunction that describes a₁ exactly.

Thus, the most general hypothesis is that the user reads everything, and the most specific hypothesis is that the user only reads articles exactly like this one.

After considering the first two examples, G₂ = {true} and S₂ is S₁ with the music literal removed.

Since a₁ and a₂ disagree on music, but have the same prediction, it can be concluded that music cannot be relevant.

After considering the first three examples, the general boundary becomes G₃ = {crime, ¬academic} and S₃ = S₂. Now there are two most general hypotheses; the first is that the user reads anything about crime, and the second is that the user reads anything non-academic.

After considering the first four examples, the general boundary shrinks again, and S₄ = S₃.

After considering all five examples, only two hypotheses remain in the version space. They differ only in their prediction on an example that has crime ∧ local true. If the target concept can be represented as a conjunction, only an example with crime ∧ local true will change G or S. This version space can make predictions about all other examples.

The Bias Involved in Version-Space Learning

Recall that a bias is necessary for any learning to generalize beyond the training data. There must have been a bias in Example 7.30 because, after observing only five of the 16 possible assignments to the input variables, an agent was able to make predictions about examples it had not seen.

The bias involved in version-space learning is called a language bias or a restriction bias because the bias is obtained by restricting the allowable hypotheses. For example, a new example with crime false and music true will be classified as false (the user will not read the article), even though no such example has been seen. The restriction that the hypothesis must be a conjunction of literals is enough to predict its value.

This bias should be contrasted with the bias involved in decision tree learning. A decision tree can represent any Boolean function. Decision tree learning involves a preference bias, in that some Boolean functions are preferred over others; those with smaller decision trees are preferred over those with larger decision trees. A decision tree learning algorithm that builds a single decision tree top-down also involves a search bias in that the decision tree returned depends on the search strategy used.

The candidate elimination algorithm is sometimes said to be an unbiased learning algorithm because the learning algorithm does not impose any bias beyond the language bias involved in choosing ℋ. It is easy for the version space to collapse to the empty set, for example, if the user reads an article with crime false and music true. This means that the target concept is not in ℋ. Version-space learning is not tolerant to noise; just one misclassified example can throw off the whole system.

The bias-free hypothesis space is where ℋ is the set of all Boolean functions. In this case, G always contains one concept: the concept which says that all negative examples have been seen and every other example is positive. Similarly, S contains the single concept which says that all unseen examples are negative. The version space is incapable of concluding anything about examples it has not seen; thus, it cannot generalize. Without a language bias or a preference bias, no generalization and, therefore, no learning will occur.


Genetic algorithm: Hypothesis space search

As already understood from our illustrative example, it is clear that genetic algorithms employ a randomized beam search method to seek maximally fit hypotheses. Compared with other hypothesis space search methods, such as the gradient descent search used in backpropagation, which moves smoothly from one hypothesis to another, the genetic algorithm search can move much more abruptly: it replaces parent hypotheses with offspring that can be very different from their parents. For this reason, genetic algorithm search is less likely to fall into the same kind of local minima that plague gradient descent methods.

One practical difficulty often encountered in genetic algorithms is crowding: the phenomenon in which individuals that are fitter than others reproduce quickly, so copies of them take over a large fraction of the population. Most of the strategies used to reduce crowding are inspired by biological evolution. One such strategy is fitness sharing, in which the measured fitness of an individual is decreased by the presence of other, similar individuals. Another is to restrict the kinds of individuals allowed to recombine to form offspring: by allowing only individuals of the same kind to recombine, clusters of similar individuals form, creating multiple subspecies in the population.

Another method would be to spatially distribute individuals and allow only nearby individuals to combine.

Population Evolution and the Schema Theorem

The schema theorem of Holland is used to mathematically characterize the evolution of the population over time. It is based on the concept of a schema: any string composed of 0s, 1s, and *s, where * is a wildcard that matches either bit, so the schema 0*10 matches both 0010 and 0110. The schema theorem characterizes the evolution within a genetic algorithm in terms of the number of instances representing each schema. Let m(s, t) denote the number of instances of schema s in the population at time t; the schema theorem describes the expected value of m(s, t+1) in terms of m(s, t) and the other parameters of the population, schema, and GA.
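Here is a small sketch (not from the original text) of schema matching and the count m(s, t):

    # '*' is a wildcard position; a schema matches a string if every
    # non-wildcard position agrees.
    def matches(schema, individual):
        return all(s in ('*', c) for s, c in zip(schema, individual))

    def m(schema, population):
        # Number of individuals in the population matching the schema.
        return sum(matches(schema, ind) for ind in population)

    population_t = ["0010", "0110", "1110", "0011"]
    print(m("0*10", population_t))  # 2: matches 0010 and 0110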

In a genetic algorithm, the evolution of the population depends on the selection step, the recombination step, and the mutation step. The schema theorem is one of the most widely used theorems for characterizing population evolution within a genetic algorithm. However, because it fails to consider the positive effects of crossover and mutation, it is in a way incomplete. Many more recent theoretical analyses have been proposed, several of them based on models such as Markov chain models and the statistical mechanics model.

The Platonic Representation Hypothesis

Minyoung Huh, Brian Cheung, Tongzhou Wang, Phillip Isola (arXiv:2405.07987)

Abstract: We argue that representations in AI models, particularly deep networks, are converging. First, we survey many examples of convergence in the literature: over time and across multiple domains, the ways by which different neural networks represent data are becoming more aligned. Next, we demonstrate convergence across data modalities: as vision models and language models get larger, they measure distance between datapoints in a more and more alike way. We hypothesize that this convergence is driving toward a shared statistical model of reality, akin to Plato's concept of an ideal reality. We term such a representation the platonic representation and discuss several possible selective pressures toward it. Finally, we discuss the implications of these trends, their limitations, and counterexamples to our analysis.

