Empirical Research: Definition, Methods, Types and Examples

What is Empirical Research

Content Index

  • Empirical research: Definition
  • Empirical research: Origin
  • Quantitative research methods
  • Qualitative research methods
  • Steps for conducting empirical research
  • Empirical research methodology cycle
  • Advantages of empirical research
  • Disadvantages of empirical research
  • Why is there a need for empirical research?

Empirical research is defined as any research in which the conclusions of the study are drawn strictly from concrete, and therefore verifiable, empirical evidence.

This empirical evidence can be gathered using quantitative market research and qualitative market research methods.

For example: a study is conducted to find out whether listening to happy music in the workplace promotes creativity. An experiment is set up using a music website survey in which one set of subjects is exposed to happy music and another set listens to no music at all, and both groups are then observed. The results of such a study will give empirical evidence of whether or not happy music promotes creativity.


You must have heard the quote "I will not believe it unless I see it". This came from the ancient empiricists, a fundamental understanding that powered the emergence of science during the Renaissance and laid the foundation of modern science as we know it today. The word itself has its roots in Greek: it is derived from the Greek word empeirikos, which means "experienced".

In today's world, the word empirical refers to data gathered through observation, experience, or calibrated scientific instruments. All of the above origins have one thing in common: a dependence on observation and experiment to collect data and test it to arrive at conclusions.


Types and methodologies of empirical research

Empirical research can be conducted and analysed using qualitative or quantitative methods.

  • Quantitative research: Quantitative research methods are used to gather information through numerical data. They are used to quantify opinions, behaviors, or other defined variables. The questions are predetermined and follow a structured format. Some of the commonly used methods are surveys, longitudinal studies, polls, etc.
  • Qualitative research: Qualitative research methods are used to gather non-numerical data. They are used to find the meanings, opinions, or underlying reasons behind subjects' behavior. These methods are unstructured or semi-structured. The sample size for such research is usually small, and the methods are conversational in nature so as to provide more insight or in-depth information about the problem. Some of the most popular methods are focus groups, experiments, interviews, etc.

Data collected from these methods needs to be analyzed. Empirical evidence can be analyzed either quantitatively or qualitatively. Using the analysis, the researcher can answer empirical questions, which have to be clearly defined and answerable with the findings obtained. The type of research design used will vary depending on the field in which the research is conducted. Many researchers choose to combine quantitative and qualitative methods to better answer questions that cannot be studied in a laboratory setting.


Quantitative research methods aid in analyzing the empirical evidence gathered. By using them, a researcher can find out whether the hypothesis is supported or not.

  • Survey research: Survey research generally involves a large audience in order to collect a large amount of data. It is a quantitative method with a predetermined set of closed-ended questions that are easy to answer. Because of this simplicity, high response rates are achieved. It is one of the most commonly used methods for all kinds of research in today's world.

Previously, surveys were conducted face to face, perhaps with a recorder. With advances in technology and for ease of use, new mediums such as email and social media have emerged.

For example: depletion of energy resources is a growing concern, hence there is a need for awareness about renewable energy. According to recent studies, fossil fuels still account for around 80% of energy consumption in the United States. Even though the use of green energy rises every year, certain factors keep the general population from opting for it. To understand why, a survey can be conducted to gather opinions about green energy and the factors that influence people's choice to switch to renewable energy. Such a survey can help institutions or governing bodies design appropriate awareness and incentive schemes to push the use of greener energy.


  • Experimental research: In experimental research, an experiment is set up and a hypothesis is tested by creating a situation in which one of the variables is manipulated. It is used to examine cause and effect: the experiment observes what happens to the dependent variable when the independent variable is altered or removed. The process for such a method is usually to propose a hypothesis, experiment on it, analyze the findings, and report the findings to understand whether they support the theory or not.

For example: a product company is trying to find out why it is unable to capture the market. The organization makes changes to each of its processes, such as manufacturing, marketing, sales, and operations. Through the experiment it learns that sales training directly impacts the market coverage of its product: if the sales staff are trained well, the product achieves better coverage.

  • Correlational research: Correlational research is used to find the relationship between two sets of variables. Regression analysis is generally used to predict outcomes with such a method. The correlation can be positive, negative, or zero.


For example: a study may find that highly educated individuals tend to get higher-paying jobs, i.e., level of education correlates positively with salary. Note that the correlation alone does not prove that education causes the higher pay.
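
To make this concrete, here is a minimal sketch of a correlation and regression analysis in Python using scipy; the education and salary figures are invented purely for illustration.

    # Hypothetical data: years of education and annual salary (in thousands)
    from scipy import stats

    education_years = [10, 12, 12, 14, 16, 16, 18, 20]
    salary = [28, 35, 33, 41, 50, 54, 62, 71]

    # Pearson's r ranges from -1 (perfect negative) to +1 (perfect positive)
    r, p_value = stats.pearsonr(education_years, salary)
    print(f"r = {r:.2f}, p = {p_value:.4f}")

    # Simple linear regression, as commonly used to predict outcomes
    fit = stats.linregress(education_years, salary)
    print(f"predicted salary = {fit.slope:.1f} * years + {fit.intercept:.1f}")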

  • Longitudinal study: A longitudinal study is used to understand the traits or behavior of a subject under observation by testing the subject repeatedly over a period of time. Data collected by such a method can be qualitative or quantitative in nature.

For example: a study to find out the benefits of exercise. The subjects are asked to exercise every day for a particular period of time, and the results show higher endurance, stamina, and muscle growth. This supports the claim that exercise benefits the body.

  • Cross-sectional study: A cross-sectional study is an observational method in which a set of subjects is observed at a single point in time. The subjects are chosen so that they are similar in all variables except the one being researched. This type does not let the researcher establish a cause-and-effect relationship, as the subjects are not observed over a continuous time period. It is used mainly in the healthcare sector and the retail industry.

For example: a medical study to find the prevalence of undernutrition disorders in children of a given population. This will involve looking at a wide range of parameters like age, ethnicity, location, income, and social background. If a significant number of children from poor families show undernutrition disorders, the researcher can investigate further. Usually a cross-sectional study is followed by a longitudinal study to find the exact cause.

  • Causal-comparative research: This method is based on comparison. It is mainly used to find cause-and-effect relationships between two or more variables.

For example: a researcher measures the productivity of employees in a company that gives breaks during work and compares it to that of employees in a company that gives no breaks at all.
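
A comparison like this is typically evaluated with a two-sample t-test. The sketch below uses invented productivity scores to show the general idea; it illustrates the statistical step, not a prescribed procedure.

    # Hypothetical productivity scores for the two companies
    from scipy import stats

    with_breaks = [78, 82, 85, 88, 75, 90, 84, 80]
    without_breaks = [70, 72, 68, 75, 74, 69, 73, 71]

    # Welch's two-sample t-test (does not assume equal variances)
    t_stat, p_value = stats.ttest_ind(with_breaks, without_breaks, equal_var=False)

    # A small p-value (conventionally < 0.05) suggests the difference in mean
    # productivity is unlikely to be due to chance alone
    print(f"t = {t_stat:.2f}, p = {p_value:.4f}")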


Some research questions need to be analyzed qualitatively, as quantitative methods are not applicable. In many cases in-depth information is needed, or the researcher needs to observe the target audience's behavior, so the results are needed in descriptive form. Qualitative research results are descriptive rather than predictive. They enable the researcher to build or support theories for future quantitative research. In such situations, qualitative research methods are used to derive a conclusion that supports the theory or hypothesis being studied.


  • Case study: The case study method is used to find more information by carefully analyzing existing cases. It is very often used in business research or to gather empirical evidence for investigation purposes. It is a method for investigating a problem within its real-life context through existing cases. The researcher has to analyze carefully, making sure the parameters and variables in the existing case are the same as in the case being investigated. Using the findings from the case study, conclusions can be drawn about the topic being studied.

For example: a report describing the solution provided by a company to its client, the challenges faced during initiation and deployment, the findings of the case, and the solutions offered for the problems. Most companies use such case studies because they form empirical evidence the company can promote in order to win more business.

  • Observational method: The observational method is a process for observing and gathering data from a target. Since it is a qualitative method, it is time-consuming and very personal. It can be said that the observational method is part of ethnographic research, which is also used to gather empirical evidence. It is usually a qualitative form of research, although in some cases it can be quantitative as well, depending on what is being studied.

For example: setting up a study to observe a particular animal in the rainforests of the Amazon. Such research usually takes a lot of time, as observation has to be done over a set period to study the patterns or behavior of the subject. Another example widely used nowadays is observing people shopping in a mall to figure out consumers' buying behavior.

  • One-on-one interview: This method is purely qualitative and one of the most widely used, because it enables a researcher to gather precise, meaningful data if the right questions are asked. It is a conversational method in which in-depth data can be gathered depending on where the conversation leads.

For example: a one-on-one interview with the finance minister to gather data on the country's financial policies and their implications for the public.

  • Focus groups: Focus groups are used when a researcher wants to find answers to why, what, and how questions. A small group is generally chosen for such a method, and it is not necessary to interact with the group in person; a moderator is generally needed when the group is addressed in person. Focus groups are widely used by product companies to collect data about their brands and products.

For example: a mobile phone manufacturer wants feedback on the dimensions of one of its models that is yet to be launched. Such studies help the company meet customer demand and position the model appropriately in the market.

  • Text analysis: The text analysis method is relatively new compared with the other types. It is used to analyze social life by going through the images or words an individual uses. In today's world, with social media playing a major part in everyone's life, this method enables the researcher to follow patterns relevant to the study.

For example: many companies ask customers for detailed feedback describing how satisfied they are with the customer support team. Such data enables the company to make appropriate decisions to improve the team.

Sometimes a combination of methods is needed for questions that cannot be answered using only one type of method, especially when the researcher needs a complete understanding of a complex subject.


Steps for conducting empirical research

Since empirical research is based on observation and capturing experiences, it is important to plan the steps for conducting the experiment and how to analyze it. This will enable the researcher to resolve problems or obstacles that may occur during the experiment.

Step #1: Define the purpose of the research

This is the step where the researcher has to answer questions like: What exactly do I want to find out? What is the problem statement? Are there any issues in terms of the availability of knowledge, data, time, or resources? Will this research be more beneficial than what it will cost?

Before going ahead, a researcher has to clearly define his purpose for the research and set up a plan to carry out further tasks.

Step #2: Supporting theories and relevant literature

The researcher needs to find out whether there are theories that can be linked to the research problem, and to figure out whether any theory can help support the findings. Relevant literature of all kinds will help the researcher find out whether others have researched the problem before and what difficulties they faced. The researcher will also have to set up assumptions and find out whether there is any history related to the research problem.

Step #3: Creation of Hypothesis and measurement

Before beginning the actual research, the researcher needs to form a working hypothesis, a guess at what the probable result will be. The researcher has to set up the variables, decide on the environment for the research, and determine how the variables relate to each other.

The researcher will also need to define the units of measurement, the tolerable degree of error, and whether the chosen measurement will be accepted by others.

Step #4: Methodology, research design and data collection

In this step, the researcher has to define a strategy for conducting the research and set up experiments to collect the data that will allow the hypothesis to be tested. The researcher decides whether an experimental or non-experimental method is needed. The type of research design will vary depending on the field in which the research is being conducted. Last but not least, the researcher has to identify the parameters that will affect the validity of the research design. Data collection has to be done with appropriate samples chosen according to the research question, using one of the many available sampling techniques. Once data collection is complete, the researcher will have empirical data that needs to be analyzed.


Step #5: Data Analysis and result

Data analysis can be done in two ways: qualitatively and quantitatively. The researcher needs to decide whether a qualitative method, a quantitative method, or a combination of both is required. Depending on the analysis of the data, the researcher will know whether the hypothesis is supported or rejected. Analyzing this data is the most important part of supporting the hypothesis.

Step #6: Conclusion

A report needs to be made with the findings of the research. The researcher can cite the theories and literature that support the research and make suggestions or recommendations for further research on the topic.

Empirical research methodology cycle

A.D. de Groot, a famous Dutch psychologist and chess expert, conducted some of the most notable experiments using chess in the 1940s. During his study he came up with a consistent cycle that is now widely used to conduct empirical research. It consists of five phases, each as important as the next. The empirical cycle captures the process of coming up with hypotheses about how certain subjects work or behave and then testing these hypotheses against empirical data in a systematic and rigorous way. It can be said to characterize the hypothetico-deductive approach to science. The empirical cycle is as follows.

  • Observation: In this phase an idea is sparked for proposing a hypothesis, and empirical data is gathered through observation. For example: a particular species of flower blooms in a different color only during a specific season.
  • Induction: Inductive reasoning is then carried out to form a general conclusion from the data gathered through observation. For example: having observed that the species of flower blooms in a different color during a specific season, a researcher may ask, "Does the temperature in that season cause the color change in the flower?" The researcher can assume that this is the case; however, it is mere conjecture, so an experiment needs to be set up to test the hypothesis. The researcher therefore tags a set of flowers kept at a different temperature and observes whether they still change color.
  • Deduction: This phase helps the researcher deduce a conclusion from the experiment, based on logic and rationality, so as to arrive at specific, unbiased results. For example: if the tagged flowers in the different temperature environment do not change color, then it can be concluded that temperature plays a role in changing the color of the bloom.
  • Testing: In this phase the researcher returns to empirical methods to put the hypothesis to the test. The researcher now needs to make sense of the data and so uses statistical analysis to determine the relationship between temperature and bloom color. If most flowers bloom a different color when exposed to a certain temperature and do not when the temperature is different, the researcher has found support for the hypothesis. Note that this is support for the hypothesis, not proof.
  • Evaluation: This phase is often forgotten, but it is important for continuing to gain knowledge. In this phase the researcher puts forth the data collected, the supporting argument, and the conclusion, states the limitations of the experiment and the hypothesis, and suggests how others may continue more in-depth research on the topic in the future.


Advantages of empirical research

There is a reason why empirical research is one of the most widely used methods: it has several advantages. Following are a few of them.

  • It is used to authenticate traditional research through various experiments and observations.
  • This research methodology makes the research being conducted more competent and authentic.
  • It enables the researcher to understand the dynamic changes that can happen and adapt the strategy accordingly.
  • The level of control in such research is high, so the researcher can control multiple variables.
  • It plays a vital role in increasing internal validity.

Disadvantages of empirical research

Even though empirical research makes research more competent and authentic, it does have a few disadvantages. Following are a few of them.

  • Such research requires patience, as it can be very time-consuming. The researcher has to collect data from multiple sources, and quite a few parameters are involved, leading to time-consuming research.
  • Most of the time the researcher needs to conduct the research at different locations or in different environments, which can make it expensive.
  • There are rules governing how experiments can be performed, and hence permissions are needed. It is often very difficult to get the permissions required to carry out different methods of this research.
  • Collecting data can sometimes be a problem, as it has to be collected from a variety of sources through different methods.


Why is there a need for empirical research?

Empirical research is important in today's world because most people believe only in what they can see, hear, or experience. It is used to validate hypotheses, increase human knowledge, and keep advancing various fields.

For example: pharmaceutical companies use empirical research to try out a specific drug on randomized control and treatment groups to study cause and effect. In this way they gather evidence for the theories they proposed about the specific drug. Such research is very important, as it can sometimes lead to a cure for a disease that has existed for many years. It is useful in science and in many other fields like history, the social sciences, and business.


With the advancements in today's world, empirical research has become critical and the norm in many fields for supporting hypotheses and gaining knowledge. The methods mentioned above are very useful for carrying out such research. However, new methods will keep emerging as the nature of investigative questions changes.



Gavin Wright

What is empirical analysis?

Empirical analysis is an evidence-based approach to the study and interpretation of information. Empirical evidence is information that can be gathered from experience or through the five senses. In a scientific context, this is called empirical research.

Empirical analysis requires evidence to prove any theory. An empirical approach gathers observable data and sets out a repeatable process to produce verifiable results. Empirical analysis often requires statistical analysis to support a claim.

The word empirical comes from the ancient Greek word empeiria, meaning experience.

[Figure: empirical approaches in the real world and IT]

How to conduct empirical analysis

Empirical analysis is based on observable data and is mainly concerned with what can be experienced and directly observed. Well-conducted empirical analysis sets out what was initially observed, what it expects to observe during testing, and what was actually observed during testing. If the observed results do not match the expected results, then the hypothesis is not supported by the observed data. Empirical research is concerned only with what is observed, not with what makes sense or follows logically. It is closely related to the scientific method.
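
One common way to formalize the comparison between expected and observed results is a chi-square goodness-of-fit test. Below is a minimal sketch in Python; the counts are invented purely for illustration.

    # Hypothetical example: the hypothesis predicts four outcomes occur equally often
    from scipy import stats

    observed = [18, 22, 30, 30]   # counts recorded during testing
    expected = [25, 25, 25, 25]   # counts the hypothesis predicts

    chi2, p_value = stats.chisquare(f_obs=observed, f_exp=expected)

    # A small p-value means the observed data deviate from the expectation,
    # i.e. the hypothesis is not supported by the observed data
    print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")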

[Figure: using the scientific method to confirm a hypothesis]

Empiricism vs. rationalism

Empiricism is often contrasted with rationalism. Rationalism is a school of thought holding that truth can be determined by starting from simple truths, or axioms, and using logic and reasoning alone to build up to larger truths, without needing to verify those truths against reality. A strictly empirical approach is limited to what can be observed and can only produce results that support, refute, or are neutral to a theory.

Both empirical and rational approaches are needed to produce practical results. A purely rational approach can produce ideas that do not agree with observable reality, while relying on empirical data alone cannot produce new ideas and insights. Making good use of both is the cornerstone of the scientific method.

Quantitative and qualitative research in empirical analysis

Empirical analysis relies on gathering data through quantitative research, qualitative research or a mix of the two.

Quantitative research is related to things that can be quantified or assigned numbers. It deals with things that can be counted or measured. It may also use multiple-choice or closed-ended questions. In quantitative research, if two different people made the same measurements, they would get the same results.

Qualitative research is related to human perception. It deals with likes, dislikes, opinions, thoughts and behavior. It is often gathered in interviews, focus groups or open-ended surveys. Qualitative research can give excellent insight into data, but due to human nature and the difficulty of gathering large amounts of unstructured information, it may not always be reliable.

As an example of quantitative and qualitative research, imagine a firm wants to determine whether its new product is easier to use than its old one, so it observes people using the product. Examples of quantitative data it could gather are how many people successfully completed the task, how long each person took to finish, the age of each person, and a one-to-five survey rating of how difficult the person thought the task was. Examples of qualitative data are what an observer saw while the person was doing the task and an interview afterward.

[Figure: methods for collecting empirical evidence]

Empirical research cycle

In 1969, Dutch researcher A.D. de Groot published his five-step empirical research cycle. It has been widely adopted as a concise way to conduct empirical research. Each step must be conducted in sequence, and each is as important as the last:

  • Observation. Initial observations of a phenomenon are made. This sparks an idea or a line of inquiry. Initial empirical data can be gathered and existing information researched.
  • Induction. A probable explanation of the observed phenomenon is proposed. Inductive reasoning is used to take the specific example from step one and infer a generalized explanation for it.
  • Deduction. A testable hypothesis is proposed that can support the explanation. Deductive reasoning is used to take the generalized explanation and make a specific prediction that can be tested and observed.
  • Testing. Quantitative and qualitative empirical data are gathered. The data are examined, often with statistical analysis. The results can support, refute or be neutral to the hypothesis. Because of the limits of empirical data and human perception, the results are not said to prove or disprove the hypothesis, only to support or not support it.
  • Evaluation. The reasoning, methodology and findings of the experiment are written down, and the conclusions of the researcher are presented. Information relating to any difficulties, challenges and limits of the test is also included. It may also include further possible avenues of research.

As a simple example of the empirical research cycle, imagine you start sneezing when you visit your sister.

  • Observation. I do not sneeze at home; I do sneeze at my sister's home. My sister owns a cat, while I do not have a cat.
  • Induction. I may be allergic to cats.
  • Deduction. I hypothesize that, if I go to the pet store and pick up a cat, I will start sneezing.
  • Testing. I went to the pet store, and when I picked up the cat, I started sneezing.
  • Evaluation. My trip to the pet store supports the idea that I am allergic to cats. But it was a different type of cat, and it was the same season, so it may have been hay fever. If I wanted to gather more evidence, I should visit another person with a cat.

[Figure: common features of empirical research projects]

Empirical analysis in IT and business

Empirical analysis is highly effective in IT and in business. These areas can be highly complex, have interrelated factors, or delve into human behavior. Because of this, the behavior of systems, or why things happen, can be unclear, hard to discover, or even counterintuitive or seemingly irrational. The evidence-based approach of empirical analysis can help remove uncertainty from the decision-making process.

[Figure: A/B testing]

Data warehouses and data lakes can create vast amounts of empirical information. By applying empirical analysis methods to this data, new insights can be found. This can include information about customer behavior or business efficiencies. Data analytics falls in this category.

Using A/B testing is a common way to do empirical research on usability. Different users are presented with different designs, and by monitoring metrics such as click-through rate, the best design can be found.
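
As a rough sketch of how such a comparison might be evaluated, the following Python snippet runs a hand-rolled two-proportion z-test on invented click-through counts for two designs.

    # Hypothetical A/B test: clicks out of visitors for two page designs
    import math
    from scipy.stats import norm

    clicks_a, visitors_a = 120, 2400   # design A: 5.0% click-through
    clicks_b, visitors_b = 156, 2400   # design B: 6.5% click-through

    p_a = clicks_a / visitors_a
    p_b = clicks_b / visitors_b
    pooled = (clicks_a + clicks_b) / (visitors_a + visitors_b)

    # Standard error of the difference under the null hypothesis of no difference
    se = math.sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    z = (p_b - p_a) / se
    p_value = 2 * norm.sf(abs(z))   # two-sided p-value

    print(f"z = {z:.2f}, p = {p_value:.4f}")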



University of Memphis Libraries

Empirical Research: Defining, Identifying, & Finding

What is empirical research?

Calfee & Chambliss (2005)  (UofM login required) describe empirical research as a "systematic approach for answering certain types of questions."  Those questions are answered "[t]hrough the collection of evidence under carefully defined and replicable conditions" (p. 43). 

The evidence collected during empirical research is often referred to as "data." 

Characteristics of Empirical Research

Emerald Publishing's guide to conducting empirical research identifies a number of common elements of empirical research: 

  • A research question, which will determine research objectives.
  • A particular and planned design for the research, which will depend on the question and which will find ways of answering it with appropriate use of resources.
  • The gathering of primary data, which is then analysed.
  • A particular methodology for collecting and analysing the data, such as an experiment or survey.
  • The limitation of the data to a particular group, area or time scale, known as a sample [emphasis added]: for example, a specific number of employees of a particular company type, or all users of a library over a given time scale. The sample should be somehow representative of a wider population.
  • The ability to recreate the study and test the results. This is known as reliability.
  • The ability to generalize from the findings to a larger sample and to other situations.

If you see these elements in a research article, you can feel confident that you have found empirical research. Emerald's guide goes into more detail on each element. 

Quantitative or Qualitative?

Empirical research methodologies can be described as quantitative, qualitative, or a mix of both (usually called mixed methods).

Ruane (2016)  (UofM login required) gets at the basic differences in approach between quantitative and qualitative research:

  • Quantitative research  -- an approach to documenting reality that relies heavily on numbers both for the measurement of variables and for data analysis (p. 33).
  • Qualitative research  -- an approach to documenting reality that relies on words and images as the primary data source (p. 33).

Both quantitative and qualitative methods are empirical. If you can recognize that a research study is a quantitative or qualitative study, then you have also recognized that it is an empirical study.

Below is information on the characteristics of quantitative and qualitative research. The video from Scribbr cited below also offers a good overall introduction to the two approaches to research methodology.

Characteristics of Quantitative Research 

Researchers test hypotheses, or theories, based on assumptions about causality, i.e., we expect variable X to cause variable Y. Variables have to be controlled as much as possible to ensure validity. The results explain the relationship between the variables. Measures are based on pre-defined instruments.

Examples: experimental or quasi-experimental design, pretest and post-test, surveys or questionnaires with closed-ended questions. Studies that identify factors that influence an outcome, the utility of an intervention, or predictors of outcomes.

Characteristics of Qualitative Research

Researchers explore the “meaning individuals or groups ascribe to social or human problems” (Creswell & Creswell, 2018, p. 3). Questions and procedures emerge rather than being prescribed. Complexity, nuance, and individual meaning are valued. Research is both inductive and deductive. Data sources are multiple and varied, i.e., interviews, observations, documents, photographs, etc. The researcher is a key instrument and must reflect on how their background, culture, and experiences influence the research.

Examples: open question interviews and surveys, focus groups, case studies, grounded theory, ethnography, discourse analysis, narrative, phenomenology, participatory action research.

Calfee, R. C., & Chambliss, M. (2005). The design of empirical research. In J. Flood, D. Lapp, J. R. Squire, & J. Jensen (Eds.), Methods of research on teaching the English language arts: The methodology chapters from the handbook of research on teaching the English language arts (pp. 43-78). Routledge. http://ezproxy.memphis.edu/login?url=http://search.ebscohost.com/login.aspx?direct=true&db=nlebk&AN=125955&site=eds-live&scope=site

Creswell, J. W., & Creswell, J. D. (2018). Research design: Qualitative, quantitative, and mixed methods approaches (5th ed.). Thousand Oaks: Sage.

How to... conduct empirical research. (n.d.). Emerald Publishing. https://www.emeraldgrouppublishing.com/how-to/research-methods/conduct-empirical-research

Ruane, J. M. (2016). Introducing social research methods: Essentials for getting the edge. Wiley-Blackwell. http://ezproxy.memphis.edu/login?url=http://search.ebscohost.com/login.aspx?direct=true&db=nlebk&AN=1107215&site=eds-live&scope=site

Scribbr. (2019). Quantitative vs. qualitative: The differences explained [video]. YouTube. https://www.youtube.com/watch?v=a-XtVF7Bofg


What is Empirical Research? Definition, Methods, Examples

Appinio Research · 09.02.2024 · 36min read


Ever wondered how we gather the facts, unveil hidden truths, and make informed decisions in a world filled with questions? Empirical research holds the key.

In this guide, we'll delve deep into the art and science of empirical research, unraveling its methods, mysteries, and manifold applications. From defining the core principles to mastering data analysis and reporting findings, we're here to equip you with the knowledge and tools to navigate the empirical landscape.

What is Empirical Research?

Empirical research is the cornerstone of scientific inquiry, providing a systematic and structured approach to investigating the world around us. It is the process of gathering and analyzing empirical or observable data to test hypotheses, answer research questions, or gain insights into various phenomena. This form of research relies on evidence derived from direct observation or experimentation, allowing researchers to draw conclusions based on real-world data rather than purely theoretical or speculative reasoning.

Characteristics of Empirical Research

Empirical research is characterized by several key features:

  • Observation and Measurement: It involves the systematic observation or measurement of variables, events, or behaviors.
  • Data Collection: Researchers collect data through various methods, such as surveys, experiments, observations, or interviews.
  • Testable Hypotheses: Empirical research often starts with testable hypotheses that are evaluated using collected data.
  • Quantitative or Qualitative Data: Data can be quantitative (numerical) or qualitative (non-numerical), depending on the research design.
  • Statistical Analysis: Quantitative data often undergo statistical analysis to determine patterns, relationships, or significance.
  • Objectivity and Replicability: Empirical research strives for objectivity, minimizing researcher bias. It should be replicable, allowing other researchers to conduct the same study to verify results.
  • Conclusions and Generalizations: Empirical research generates findings based on data and aims to make generalizations about larger populations or phenomena.

Importance of Empirical Research

Empirical research plays a pivotal role in advancing knowledge across various disciplines. Its importance extends to academia, industry, and society as a whole. Here are several reasons why empirical research is essential:

  • Evidence-Based Knowledge: Empirical research provides a solid foundation of evidence-based knowledge. It enables us to test hypotheses, confirm or refute theories, and build a robust understanding of the world.
  • Scientific Progress: In the scientific community, empirical research fuels progress by expanding the boundaries of existing knowledge. It contributes to the development of theories and the formulation of new research questions.
  • Problem Solving: Empirical research is instrumental in addressing real-world problems and challenges. It offers insights and data-driven solutions to complex issues in fields like healthcare, economics, and environmental science.
  • Informed Decision-Making: In policymaking, business, and healthcare, empirical research informs decision-makers by providing data-driven insights. It guides strategies, investments, and policies for optimal outcomes.
  • Quality Assurance: Empirical research is essential for quality assurance and validation in various industries, including pharmaceuticals, manufacturing, and technology. It ensures that products and processes meet established standards.
  • Continuous Improvement: Businesses and organizations use empirical research to evaluate performance, customer satisfaction, and product effectiveness. This data-driven approach fosters continuous improvement and innovation.
  • Human Advancement: Empirical research in fields like medicine and psychology contributes to the betterment of human health and well-being. It leads to medical breakthroughs, improved therapies, and enhanced psychological interventions.
  • Critical Thinking and Problem Solving: Engaging in empirical research fosters critical thinking skills, problem-solving abilities, and a deep appreciation for evidence-based decision-making.

Empirical research empowers us to explore, understand, and improve the world around us. It forms the bedrock of scientific inquiry and drives progress in countless domains, shaping our understanding of both the natural and social sciences.

How to Conduct Empirical Research?

So, you've decided to dive into the world of empirical research. Let's begin by exploring the crucial steps involved in getting started with your research project.

1. Select a Research Topic

Selecting the right research topic is the cornerstone of a successful empirical study. It's essential to choose a topic that not only piques your interest but also aligns with your research goals and objectives. Here's how to go about it:

  • Identify Your Interests: Start by reflecting on your passions and interests. What topics fascinate you the most? Your enthusiasm will be your driving force throughout the research process.
  • Brainstorm Ideas: Engage in brainstorming sessions to generate potential research topics. Consider the questions you've always wanted to answer or the issues that intrigue you.
  • Relevance and Significance: Assess the relevance and significance of your chosen topic. Does it contribute to existing knowledge? Is it a pressing issue in your field of study or the broader community?
  • Feasibility: Evaluate the feasibility of your research topic. Do you have access to the necessary resources, data, and participants (if applicable)?

2. Formulate Research Questions

Once you've narrowed down your research topic, the next step is to formulate clear and precise research questions. These questions will guide your entire research process and shape your study's direction. To create effective research questions:

  • Specificity: Ensure that your research questions are specific and focused. Vague or overly broad questions can lead to inconclusive results.
  • Relevance: Your research questions should directly relate to your chosen topic. They should address gaps in knowledge or contribute to solving a particular problem.
  • Testability: Ensure that your questions are testable through empirical methods. You should be able to gather data and analyze it to answer these questions.
  • Avoid Bias: Craft your questions in a way that avoids leading or biased language. Maintain neutrality to uphold the integrity of your research.

3. Review Existing Literature

Before you embark on your empirical research journey, it's essential to immerse yourself in the existing body of literature related to your chosen topic. This step, often referred to as a literature review, serves several purposes:

  • Contextualization: Understand the historical context and current state of research in your field. What have previous studies found, and what questions remain unanswered?
  • Identifying Gaps: Identify gaps or areas where existing research falls short. These gaps will help you formulate meaningful research questions and hypotheses.
  • Theory Development: If your study is theoretical, consider how existing theories apply to your topic. If it's empirical, understand how previous studies have approached data collection and analysis.
  • Methodological Insights: Learn from the methodologies employed in previous research. What methods were successful, and what challenges did researchers face?

4. Define Variables

Variables are fundamental components of empirical research. They are the factors or characteristics that can change or be manipulated during your study. Properly defining and categorizing variables is crucial for the clarity and validity of your research. Here's what you need to know:

  • Independent Variables: These are the variables that you, as the researcher, manipulate or control. They are the "cause" in cause-and-effect relationships.
  • Dependent Variables: Dependent variables are the outcomes or responses that you measure or observe. They are the "effect" influenced by changes in independent variables.
  • Operational Definitions: To ensure consistency and clarity, provide operational definitions for your variables. Specify how you will measure or manipulate each variable.
  • Control Variables: In some studies, controlling for other variables that may influence your dependent variable is essential. These are known as control variables.

Understanding these foundational aspects of empirical research will set a solid foundation for the rest of your journey. Now that you've grasped the essentials of getting started, let's delve deeper into the intricacies of research design.

Empirical Research Design

Now that you've selected your research topic, formulated research questions, and defined your variables, it's time to delve into the heart of your empirical research journey: research design. This pivotal step determines how you will collect data and what methods you'll employ to answer your research questions. Let's explore the various facets of research design in detail.

Types of Empirical Research

Empirical research can take on several forms, each with its own unique approach and methodologies. Understanding the different types of empirical research will help you choose the most suitable design for your study. Here are some common types:

  • Experimental Research: In this type, researchers manipulate one or more independent variables to observe their impact on dependent variables. It's highly controlled and often conducted in a laboratory setting.
  • Observational Research: Observational research involves the systematic observation of subjects or phenomena without intervention. Researchers are passive observers, documenting behaviors, events, or patterns.
  • Survey Research: Surveys are used to collect data through structured questionnaires or interviews. This method is efficient for gathering information from a large number of participants.
  • Case Study Research: Case studies focus on in-depth exploration of one or a few cases. Researchers gather detailed information through various sources such as interviews, documents, and observations.
  • Qualitative Research: Qualitative research aims to understand behaviors, experiences, and opinions in depth. It often involves open-ended questions, interviews, and thematic analysis.
  • Quantitative Research: Quantitative research collects numerical data and relies on statistical analysis to draw conclusions. It involves structured questionnaires, experiments, and surveys.

Your choice of research type should align with your research questions and objectives. Experimental research, for example, is ideal for testing cause-and-effect relationships, while qualitative research is more suitable for exploring complex phenomena.

Experimental Design

Experimental research is a systematic approach to studying causal relationships. It's characterized by the manipulation of one or more independent variables while controlling for other factors. Here are some key aspects of experimental design:

  • Control and Experimental Groups: Participants are randomly assigned to either a control group or an experimental group. The independent variable is manipulated for the experimental group but not for the control group.
  • Randomization: Randomization is crucial to eliminate bias in group assignment. It ensures that each participant has an equal chance of being in either group (see the sketch after this list).
  • Hypothesis Testing: Experimental research often involves hypothesis testing. Researchers formulate hypotheses about the expected effects of the independent variable and use statistical analysis to test these hypotheses.
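
To make the randomization step concrete, here is a minimal sketch of random group assignment in Python; the participant IDs are hypothetical.

    # Randomly assign hypothetical participants to control and experimental groups
    import random

    participants = [f"P{i:02d}" for i in range(1, 21)]   # 20 made-up participant IDs
    random.shuffle(participants)                         # every ordering equally likely

    half = len(participants) // 2
    control_group = participants[:half]
    experimental_group = participants[half:]

    print("Control:", control_group)
    print("Experimental:", experimental_group)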

Observational Design

Observational research entails careful and systematic observation of subjects or phenomena. It's advantageous when you want to understand natural behaviors or events. Key aspects of observational design include:

  • Participant Observation: Researchers immerse themselves in the environment they are studying. They become part of the group being observed, allowing for a deep understanding of behaviors.
  • Non-Participant Observation: In non-participant observation, researchers remain separate from the subjects. They observe and document behaviors without direct involvement.
  • Data Collection Methods: Observational research can involve various data collection methods, such as field notes, video recordings, photographs, or coding of observed behaviors.

Survey Design

Surveys are a popular choice for collecting data from a large number of participants. Effective survey design is essential to ensure the validity and reliability of your data. Consider the following:

  • Questionnaire Design: Create clear and concise questions that are easy for participants to understand. Avoid leading or biased questions.
  • Sampling Methods: Decide on the appropriate sampling method for your study, whether it's random, stratified, or convenience sampling.
  • Data Collection Tools: Choose the right tools for data collection, whether it's paper surveys, online questionnaires, or face-to-face interviews.

Case Study Design

Case studies are an in-depth exploration of one or a few cases to gain a deep understanding of a particular phenomenon. Key aspects of case study design include:

  • Single Case vs. Multiple Case Studies: Decide whether you'll focus on a single case or multiple cases. Single case studies are intensive and allow for detailed examination, while multiple case studies provide comparative insights.
  • Data Collection Methods: Gather data through interviews, observations, document analysis, or a combination of these methods.

Qualitative vs. Quantitative Research

In empirical research, you'll often encounter the distinction between qualitative and quantitative research. Here's a closer look at these two approaches:

  • Qualitative Research: Qualitative research seeks an in-depth understanding of human behavior, experiences, and perspectives. It involves open-ended questions, interviews, and the analysis of textual or narrative data. Qualitative research is exploratory and often used when the research question is complex and requires a nuanced understanding.
  • Quantitative Research: Quantitative research collects numerical data and employs statistical analysis to draw conclusions. It involves structured questionnaires, experiments, and surveys. Quantitative research is ideal for testing hypotheses and establishing cause-and-effect relationships.

Understanding the various research design options is crucial in determining the most appropriate approach for your study. Your choice should align with your research questions, objectives, and the nature of the phenomenon you're investigating.

Data Collection for Empirical Research

Now that you've established your research design, it's time to roll up your sleeves and collect the data that will fuel your empirical research. Effective data collection is essential for obtaining accurate and reliable results.

Sampling Methods

Sampling methods are critical in empirical research, as they determine the subset of individuals or elements from your target population that you will study. Here are some standard sampling methods:

  • Random Sampling: Random sampling ensures that every member of the population has an equal chance of being selected. It minimizes bias and is often used in quantitative research.
  • Stratified Sampling: Stratified sampling involves dividing the population into subgroups or strata based on specific characteristics (e.g., age, gender, location). Samples are then randomly selected from each stratum, ensuring representation of all subgroups.
  • Convenience Sampling: Convenience sampling involves selecting participants who are readily available or easily accessible. While it's convenient, it may introduce bias and limit the generalizability of results.
  • Snowball Sampling: Snowball sampling is instrumental when studying hard-to-reach or hidden populations. One participant leads you to another, creating a "snowball" effect. This method is common in qualitative research.
  • Purposive Sampling: In purposive sampling, researchers deliberately select participants who meet specific criteria relevant to their research questions. It's often used in qualitative studies to gather in-depth information.

The choice of sampling method depends on the nature of your research, available resources, and the degree of precision required. It's crucial to carefully consider your sampling strategy to ensure that your sample accurately represents your target population.
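
To illustrate two of the sampling methods above, here is a minimal Python sketch of simple random and stratified sampling over an invented population.

    # Simple random vs. stratified sampling over an invented population
    import random

    # Each person is (id, age_group); the age groups serve as strata
    population = [(i, random.choice(["18-30", "31-50", "51+"])) for i in range(1000)]

    # Simple random sampling: every member has an equal chance of selection
    simple_sample = random.sample(population, 100)

    # Stratified sampling: draw 10% from each stratum so every group is represented
    stratified_sample = []
    for stratum in ["18-30", "31-50", "51+"]:
        members = [p for p in population if p[1] == stratum]
        stratified_sample.extend(random.sample(members, len(members) // 10))

    print(len(simple_sample), len(stratified_sample))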

Data Collection Instruments

Data collection instruments are the tools you use to gather information from your participants or sources. These instruments should be designed to capture the data you need accurately. Here are some popular data collection instruments:

  • Questionnaires: Questionnaires consist of structured questions with predefined response options. When designing questionnaires, consider the clarity of questions, the order of questions, and the response format (e.g., Likert scale, multiple-choice).
  • Interviews: Interviews involve direct communication between the researcher and participants. They can be structured (with predetermined questions) or unstructured (open-ended). Effective interviews require active listening and probing for deeper insights.
  • Observations: Observations entail systematically and objectively recording behaviors, events, or phenomena. Researchers must establish clear criteria for what to observe, how to record observations, and when to observe.
  • Surveys: Surveys are a common data collection instrument for quantitative research. They can be administered through various means, including online surveys, paper surveys, and telephone surveys.
  • Documents and Archives: In some cases, data may be collected from existing documents, records, or archives. Ensure that the sources are reliable, relevant, and properly documented.

To streamline your process and gather insights with precision and efficiency, consider leveraging innovative tools like Appinio. With Appinio's intuitive platform, you can harness the power of real-time consumer data to inform your research decisions effectively. Whether you're conducting surveys, interviews, or observations, Appinio empowers you to define your target audience, collect data from diverse demographics, and analyze results seamlessly.

By incorporating Appinio into your data collection toolkit, you can unlock a world of possibilities and elevate the impact of your empirical research. Ready to revolutionize your approach to data collection?


Data Collection Procedures

Data collection procedures outline the step-by-step process for gathering data. These procedures should be meticulously planned and executed to maintain the integrity of your research.

  • Training : If you have a research team, ensure that they are trained in data collection methods and protocols. Consistency in data collection is crucial.
  • Pilot Testing : Before launching your data collection, conduct a pilot test with a small group to identify any potential problems with your instruments or procedures. Make necessary adjustments based on feedback.
  • Data Recording : Establish a systematic method for recording data. This may include timestamps, codes, or identifiers for each data point (see the sketch after this list).
  • Data Security : Safeguard the confidentiality and security of collected data. Ensure that only authorized individuals have access to the data.
  • Data Storage : Properly organize and store your data in a secure location, whether in physical or digital form. Back up data to prevent loss.
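As an illustration of the data recording step above, here is a minimal sketch, assuming Python's standard library. The file name, field layout, and identifiers are hypothetical; the point is that every observation carries a timestamp and traceable IDs.

```python
import csv
from datetime import datetime, timezone

# Hypothetical record layout: a UTC timestamp and identifiers accompany every
# data point so observations can be audited and traced back to their session.
FIELDS = ["recorded_at", "participant_id", "item_code", "response"]

def record_response(path, participant_id, item_code, response):
    """Append one observation to the study log with a UTC timestamp."""
    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if f.tell() == 0:  # write the header only when the file is new
            writer.writeheader()
        writer.writerow({
            "recorded_at": datetime.now(timezone.utc).isoformat(),
            "participant_id": participant_id,
            "item_code": item_code,
            "response": response,
        })

record_response("responses.csv", "P001", "Q07", 4)
```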

Ethical Considerations

Ethical considerations are paramount in empirical research, as they ensure the well-being and rights of participants are protected.

  • Informed Consent : Obtain informed consent from participants, providing clear information about the research purpose, procedures, risks, and their right to withdraw at any time.
  • Privacy and Confidentiality : Protect the privacy and confidentiality of participants. Ensure that data is anonymized and sensitive information is kept confidential.
  • Beneficence : Ensure that your research benefits participants and society while minimizing harm. Consider the potential risks and benefits of your study.
  • Honesty and Integrity : Conduct research with honesty and integrity. Report findings accurately and transparently, even if they are not what you expected.
  • Respect for Participants : Treat participants with respect, dignity, and sensitivity to cultural differences. Avoid any form of coercion or manipulation.
  • Institutional Review Board (IRB) : If required, seek approval from an IRB or ethics committee before conducting your research, particularly when working with human participants.

Adhering to ethical guidelines is not only essential for the ethical conduct of research but also crucial for the credibility and validity of your study. Ethical research practices build trust between researchers and participants and contribute to the advancement of knowledge with integrity.

With a solid understanding of data collection, including sampling methods, instruments, procedures, and ethical considerations, you are now well-equipped to gather the data needed to answer your research questions.

Empirical Research Data Analysis

Now comes the exciting phase of data analysis, where the raw data you've diligently collected starts to yield insights and answers to your research questions. We will explore the various aspects of data analysis, from preparing your data to drawing meaningful conclusions through statistics and visualization.

Data Preparation

Data preparation is the crucial first step in data analysis. It involves cleaning, organizing, and transforming your raw data into a format that is ready for analysis. Effective data preparation ensures the accuracy and reliability of your results.

  • Data Cleaning : Identify and rectify errors, missing values, and inconsistencies in your dataset. This may involve correcting typos, removing outliers, and imputing missing data.
  • Data Coding : Assign numerical values or codes to categorical variables to make them suitable for statistical analysis. For example, converting "Yes" and "No" to 1 and 0.
  • Data Transformation : Transform variables as needed to meet the assumptions of the statistical tests you plan to use. Common transformations include logarithmic or square root transformations.
  • Data Integration : If your data comes from multiple sources, integrate it into a unified dataset, ensuring that variables match and align.
  • Data Documentation : Maintain clear documentation of all data preparation steps, as well as the rationale behind each decision. This transparency is essential for replicability.

Effective data preparation lays the foundation for accurate and meaningful analysis. It allows you to trust the results that will follow in the subsequent stages.
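The following is a minimal sketch of the cleaning, coding, and transformation steps, assuming Python with pandas and NumPy. The dataset, column names, and imputation choice are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical raw survey export with typical problems: missing values,
# inconsistent labels, and a skewed numeric variable.
raw = pd.DataFrame({
    "satisfied": ["Yes", "no", "YES", None, "No"],
    "income":    [32000, 41000, None, 250000, 38000],
})

clean = raw.copy()

# Data cleaning: normalize labels and impute the missing income with the median.
clean["satisfied"] = clean["satisfied"].str.strip().str.lower()
clean["income"] = clean["income"].fillna(clean["income"].median())

# Data coding: map "yes"/"no" to 1/0 for statistical analysis.
clean["satisfied_code"] = clean["satisfied"].map({"yes": 1, "no": 0})

# Data transformation: log-transform the skewed income variable.
clean["log_income"] = np.log(clean["income"])

print(clean)
```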

Descriptive Statistics

Descriptive statistics help you summarize and make sense of your data by providing a clear overview of its key characteristics. These statistics are essential for understanding the central tendencies, variability, and distribution of your variables. Descriptive statistics include:

  • Measures of Central Tendency : These include the mean (average), median (middle value), and mode (most frequent value). They help you understand the typical or central value of your data.
  • Measures of Dispersion : Measures like the range, variance, and standard deviation provide insights into the spread or variability of your data points.
  • Frequency Distributions : Creating frequency distributions or histograms allows you to visualize the distribution of your data across different values or categories.

Descriptive statistics provide the initial insights needed to understand your data's basic characteristics, which can inform further analysis.
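A quick sketch of these measures using Python's built-in statistics module; the test scores are hypothetical sample data.

```python
import statistics

# Hypothetical sample: test scores from 12 participants.
scores = [72, 85, 90, 64, 85, 78, 92, 70, 85, 88, 76, 81]

print("mean:  ", statistics.mean(scores))    # central tendency: average
print("median:", statistics.median(scores))  # central tendency: middle value
print("mode:  ", statistics.mode(scores))    # central tendency: most frequent
print("range: ", max(scores) - min(scores))  # dispersion: spread of values
print("stdev: ", statistics.stdev(scores))   # dispersion: sample standard deviation
```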

Inferential Statistics

Inferential statistics take your analysis to the next level by allowing you to make inferences or predictions about a larger population based on your sample data. These methods help you test hypotheses and draw meaningful conclusions. Key concepts in inferential statistics include:

  • Hypothesis Testing : Hypothesis tests (e.g., t-tests, chi-squared tests) help you determine whether observed differences or associations in your data are statistically significant or occurred by chance.
  • Confidence Intervals : Confidence intervals provide a range within which population parameters (e.g., population mean) are likely to fall based on your sample data.
  • Regression Analysis : Regression models (linear, logistic, etc.) help you explore relationships between variables and make predictions.
  • Analysis of Variance (ANOVA) : ANOVA tests are used to compare means between multiple groups, allowing you to assess whether differences are statistically significant.

Inferential statistics are powerful tools for drawing conclusions from your data and assessing the generalizability of your findings to the broader population.
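Here is a minimal sketch of three of these tests, assuming Python with SciPy. All group data and the contingency table are hypothetical; in practice you would check each test's assumptions before relying on the p-values.

```python
from scipy import stats

# Hypothetical outcome scores for two independent groups.
group_a = [23, 25, 28, 31, 22, 27, 30, 26]
group_b = [19, 21, 24, 20, 23, 18, 22, 25]

# Independent-samples t-test for a difference in means.
t_stat, p_value = stats.ttest_ind(group_a, group_b)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")  # small p -> significant difference

# Chi-squared test of independence on a hypothetical 2x2 contingency table.
table = [[30, 10], [20, 25]]
chi2, p, dof, expected = stats.chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")

# One-way ANOVA comparing means across three hypothetical groups.
group_c = [28, 30, 33, 29, 31, 34, 27, 32]
f_stat, p_anova = stats.f_oneway(group_a, group_b, group_c)
print(f"F = {f_stat:.2f}, p = {p_anova:.4f}")
```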

Qualitative Data Analysis

Qualitative data analysis is employed when working with non-numerical data, such as text, interviews, or open-ended survey responses. It focuses on understanding the underlying themes, patterns, and meanings within qualitative data. Qualitative analysis techniques include:

  • Thematic Analysis : Identifying and analyzing recurring themes or patterns within textual data.
  • Content Analysis : Categorizing and coding qualitative data to extract meaningful insights.
  • Grounded Theory : Developing theories or frameworks based on emergent themes from the data.
  • Narrative Analysis : Examining the structure and content of narratives to uncover meaning.

Qualitative data analysis provides a rich and nuanced understanding of complex phenomena and human experiences.
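Qualitative analysis is interpretive work rather than computation, but a rough sketch of keyword-based coding can illustrate how recurring themes might be tallied across open-ended responses. This assumes Python; the responses and the codebook mapping keywords to themes are hypothetical, and real thematic analysis would involve far more careful human judgment.

```python
from collections import Counter

# Hypothetical open-ended responses and a simple codebook for coding them.
responses = [
    "The commute is exhausting and I feel stressed every morning",
    "Working from home lets me focus without interruptions",
    "I miss chatting with colleagues but I save time on travel",
]
codebook = {
    "stress": ["stressed", "exhausting", "pressure"],
    "focus": ["focus", "interruptions", "concentrate"],
    "social": ["colleagues", "chatting", "isolation"],
}

theme_counts = Counter()
for text in responses:
    lowered = text.lower()
    for theme, keywords in codebook.items():
        if any(word in lowered for word in keywords):
            theme_counts[theme] += 1

print(theme_counts)  # how often each theme appears across responses
```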

Data Visualization

Data visualization is the art of representing data graphically to make complex information more understandable and accessible. Effective data visualization can reveal patterns, trends, and outliers in your data. Common types of data visualization include:

  • Bar Charts and Histograms : Used to display the distribution of categorical data or discrete data.
  • Line Charts : Ideal for showing trends and changes in data over time.
  • Scatter Plots : Visualize relationships and correlations between two variables.
  • Pie Charts : Display the composition of a whole in terms of its parts.
  • Heatmaps : Depict patterns and relationships in multidimensional data through color-coding.
  • Box Plots : Provide a summary of the data distribution, including outliers.
  • Interactive Dashboards : Create dynamic visualizations that allow users to explore data interactively.

Data visualization not only enhances your understanding of the data but also serves as a powerful communication tool to convey your findings to others.
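A minimal sketch of two of these chart types, assuming Python with Matplotlib; all plotted values are hypothetical.

```python
import matplotlib.pyplot as plt

# Hypothetical data for two common chart types.
months = ["Jan", "Feb", "Mar", "Apr", "May"]
sales = [120, 135, 128, 150, 162]
ages = [22, 25, 25, 31, 34, 34, 34, 41, 45, 52, 58, 61]

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# Line chart: trends over time.
ax1.plot(months, sales, marker="o")
ax1.set_title("Monthly Sales (trend)")
ax1.set_ylabel("Units sold")

# Histogram: distribution of a continuous variable.
ax2.hist(ages, bins=5, edgecolor="black")
ax2.set_title("Respondent Ages (distribution)")
ax2.set_xlabel("Age")

plt.tight_layout()
plt.show()
```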

As you embark on the data analysis phase of your empirical research, remember that the specific methods and techniques you choose will depend on your research questions, data type, and objectives. Effective data analysis transforms raw data into valuable insights, bringing you closer to the answers you seek.

How to Report Empirical Research Results?

At this stage, you get to share your empirical research findings with the world. Effective reporting and presentation of your results are crucial for communicating your research's impact and insights.

1. Write the Research Paper

Writing a research paper is the culmination of your empirical research journey. It's where you synthesize your findings, provide context, and contribute to the body of knowledge in your field.

  • Title and Abstract : Craft a clear and concise title that reflects your research's essence. The abstract should provide a brief summary of your research objectives, methods, findings, and implications.
  • Introduction : In the introduction, introduce your research topic, state your research questions or hypotheses, and explain the significance of your study. Provide context by discussing relevant literature.
  • Methods : Describe your research design, data collection methods, and sampling procedures. Be precise and transparent, allowing readers to understand how you conducted your study.
  • Results : Present your findings in a clear and organized manner. Use tables, graphs, and statistical analyses to support your results. Avoid interpreting your findings in this section; focus on the presentation of raw data.
  • Discussion : Interpret your findings and discuss their implications. Relate your results to your research questions and the existing literature. Address any limitations of your study and suggest avenues for future research.
  • Conclusion : Summarize the key points of your research and its significance. Restate your main findings and their implications.
  • References : Cite all sources used in your research following a specific citation style (e.g., APA, MLA, Chicago). Ensure accuracy and consistency in your citations.
  • Appendices : Include any supplementary material, such as questionnaires, data coding sheets, or additional analyses, in the appendices.

Writing a research paper is a skill that improves with practice. Ensure clarity, coherence, and conciseness in your writing to make your research accessible to a broader audience.

2. Create Visuals and Tables

Visuals and tables are powerful tools for presenting complex data in an accessible and understandable manner.

  • Clarity : Ensure that your visuals and tables are clear and easy to interpret. Use descriptive titles and labels.
  • Consistency : Maintain consistency in formatting, such as font size and style, across all visuals and tables.
  • Appropriateness : Choose the most suitable visual representation for your data. Bar charts, line graphs, and scatter plots work well for different types of data.
  • Simplicity : Avoid clutter and unnecessary details. Focus on conveying the main points.
  • Accessibility : Make sure your visuals and tables are accessible to a broad audience, including those with visual impairments.
  • Captions : Include informative captions that explain the significance of each visual or table.

Compelling visuals and tables enhance the reader's understanding of your research and can be the key to conveying complex information efficiently.

3. Interpret Findings

Interpreting your findings is where you bridge the gap between data and meaning. It's your opportunity to provide context, discuss implications, and offer insights. When interpreting your findings:

  • Relate to Research Questions : Discuss how your findings directly address your research questions or hypotheses.
  • Compare with Literature : Analyze how your results align with or deviate from previous research in your field. What insights can you draw from these comparisons?
  • Discuss Limitations : Be transparent about the limitations of your study. Address any constraints, biases, or potential sources of error.
  • Practical Implications : Explore the real-world implications of your findings. How can they be applied or inform decision-making?
  • Future Research Directions : Suggest areas for future research based on the gaps or unanswered questions that emerged from your study.

Interpreting findings goes beyond simply presenting data; it's about weaving a narrative that helps readers grasp the significance of your research in the broader context.

With your research paper written, structured, and enriched with visuals, and your findings expertly interpreted, you are now prepared to communicate your research effectively. Sharing your insights and contributing to the body of knowledge in your field is a significant accomplishment in empirical research.

Examples of Empirical Research

To solidify your understanding of empirical research, let's delve into some real-world examples across different fields. These examples will illustrate how empirical research is applied to gather data, analyze findings, and draw conclusions.

Social Sciences

In the realm of social sciences, consider a sociological study exploring the impact of socioeconomic status on educational attainment. Researchers gather data from a diverse group of individuals, including their family backgrounds, income levels, and academic achievements.

Through statistical analysis, they can identify correlations and trends, revealing whether individuals from lower socioeconomic backgrounds are less likely to attain higher levels of education. This empirical research helps shed light on societal inequalities and informs policymakers on potential interventions to address disparities in educational access.

Environmental Science

Environmental scientists often employ empirical research to assess the effects of environmental changes. For instance, researchers studying the impact of climate change on wildlife might collect data on animal populations, weather patterns, and habitat conditions over an extended period.

By analyzing this empirical data, they can identify correlations between climate fluctuations and changes in wildlife behavior, migration patterns, or population sizes. This empirical research is crucial for understanding the ecological consequences of climate change and informing conservation efforts.

Business and Economics

In the business world, empirical research is essential for making data-driven decisions. Consider a market research study conducted by a business seeking to launch a new product. They collect data through surveys, focus groups, and consumer behavior analysis.

By examining this empirical data, the company can gauge consumer preferences, demand, and potential market size. Empirical research in business helps guide product development, pricing strategies, and marketing campaigns, increasing the likelihood of a successful product launch.

Psychology

Psychological studies frequently rely on empirical research to understand human behavior and cognition. For instance, a psychologist interested in examining the impact of stress on memory might design an experiment. Participants are exposed to stress-inducing situations, and their memory performance is assessed through various tasks.

By analyzing the data collected, the psychologist can determine whether stress has a significant effect on memory recall. This empirical research contributes to our understanding of the complex interplay between psychological factors and cognitive processes.

These examples highlight the versatility and applicability of empirical research across diverse fields. Whether in medicine, social sciences, environmental science, business, or psychology, empirical research serves as a fundamental tool for gaining insights, testing hypotheses, and driving advancements in knowledge and practice.

Conclusion for Empirical Research

Empirical research is a powerful tool for gaining insights, testing hypotheses, and making informed decisions. By following the steps outlined in this guide, you've learned how to select research topics, collect data, analyze findings, and effectively communicate your research to the world. Remember, empirical research is a journey of discovery, and each step you take brings you closer to a deeper understanding of the world around you. Whether you're a scientist, a student, or someone curious about the process, the principles of empirical research empower you to explore, learn, and contribute to the ever-expanding realm of knowledge.

How to Collect Data for Empirical Research?

Introducing Appinio , the real-time market research platform revolutionizing how companies gather consumer insights for their empirical research endeavors. With Appinio, you can conduct your own market research in minutes, gaining valuable data to fuel your data-driven decisions.

Appinio is more than just a market research platform; it's a catalyst for transforming the way you approach empirical research, making it exciting, intuitive, and seamlessly integrated into your decision-making process.

Here's why Appinio is the go-to solution for empirical research:

  • From Questions to Insights in Minutes : With Appinio's streamlined process, you can go from formulating your research questions to obtaining actionable insights in a matter of minutes, saving you time and effort.
  • Intuitive Platform for Everyone : No need for a PhD in research; Appinio's platform is designed to be intuitive and user-friendly, ensuring that anyone can navigate and utilize it effectively.
  • Rapid Response Times : With an average field time of under 23 minutes for 1,000 respondents, Appinio delivers rapid results, allowing you to gather data swiftly and efficiently.
  • Global Reach with Targeted Precision : With access to over 90 countries and the ability to define target groups based on 1200+ characteristics, Appinio empowers you to reach your desired audience with precision and ease.


Empirical Research: A Comprehensive Guide for Academics 


Empirical research relies on gathering and studying real, observable data. The term "empirical" comes from the Greek word "empeirikos," meaning "experienced" or "based on experience." So, what is empirical research? Instead of using theories or opinions, empirical research depends on real data obtained through direct observation or experimentation.

Why Empirical Research?

Empirical research plays a key role in checking or improving current theories, providing a systematic way to grow knowledge across different areas. By focusing on objectivity, it makes research findings more trustworthy, which is critical in research fields like medicine, psychology, economics, and public policy. In the end, the strengths of empirical research lie in deepening our awareness of the world and improving our capacity to tackle problems wisely. 1,2  

Qualitative and Quantitative Methods

There are two main types of empirical research methods – qualitative and quantitative. 3,4 Qualitative research delves into intricate phenomena using non-numerical data, such as interviews or observations, to offer in-depth insights into human experiences. In contrast, quantitative research analyzes numerical data to spot patterns and relationships, aiming for objectivity and the ability to apply findings to a wider context. 

Steps for Conducting Empirical Research

When it comes to conducting research, there are some simple steps that researchers can follow. 5,6  

  • Create Research Hypothesis:  Clearly state the specific question you want to answer or the hypothesis you want to explore in your study. 
  • Examine Existing Research:  Read and study existing research on your topic. Understand what’s already known, identify existing gaps in knowledge, and create a framework for your own study based on what you learn. 
  • Plan Your Study:  Decide how you’ll conduct your research—whether through qualitative methods, quantitative methods, or a mix of both. Choose suitable techniques like surveys, experiments, interviews, or observations based on your research question. 
  • Develop Research Instruments:  Create reliable research collection tools, such as surveys or questionnaires, to help you collate data. Ensure these tools are well-designed and effective. 
  • Collect Data:  Systematically gather the information you need for your research according to your study design and protocols using the chosen research methods. 
  • Data Analysis:  Analyze the collected data using suitable statistical or qualitative methods that align with your research question and objectives. 
  • Interpret Results:  Understand and explain the significance of your analysis results in the context of your research question or hypothesis. 
  • Draw Conclusions:  Summarize your findings and draw conclusions based on the evidence. Acknowledge any study limitations and propose areas for future research. 

Advantages of Empirical Research

Empirical research is valuable because it stays objective by relying on observable data, lessening the impact of personal biases. This objectivity boosts the trustworthiness of research findings. Also, using precise quantitative methods helps in accurate measurement and statistical analysis. This precision ensures researchers can draw reliable conclusions from numerical data, strengthening our understanding of the studied phenomena. 4  

Disadvantages of Empirical Research

While empirical research has notable strengths, researchers must also be aware of its limitations when deciding on the right research method for their study. 4 One significant drawback of empirical research is the risk of oversimplifying complex phenomena, especially when relying solely on quantitative methods. These methods may struggle to capture the richness and nuances present in certain social, cultural, or psychological contexts. Another challenge is the potential for confounding variables or biases during data collection, impacting result accuracy.

Tips for Empirical Writing

In empirical research, the writing is usually done in research papers, articles, or reports. Empirical writing follows a set structure, and each section has a specific role. Here are some tips for your empirical writing. 7

  • Define Your Objectives:  When you write about your research, start by making your goals clear. Explain what you want to find out or prove in a simple and direct way. This helps guide your research and lets others know what you have set out to achieve. 
  • Be Specific in Your Literature Review:  In the part where you talk about what others have studied before you, focus on research that directly relates to your research question. Keep it short and pick studies that help explain why your research is important. This part sets the stage for your work. 
  • Explain Your Methods Clearly : When you talk about how you did your research (Methods), explain it in detail. Be clear about your research plan, who took part, and what you did; this helps others understand and trust your study. Also, be honest about any rules you follow to make sure your study is ethical and reproducible. 
  • Share Your Results Clearly : After doing your empirical research, share what you found in a simple way. Use tables or graphs to make it easier for your audience to understand your research. Also, talk about any numbers you found and clearly state if they are important or not. Ensure that others can see why your research findings matter. 
  • Talk About What Your Findings Mean:  In the part where you discuss your research results, explain what they mean. Discuss why your findings are important and if they connect to what others have found before. Be honest about any problems with your study and suggest ideas for more research in the future. 
  • Wrap It Up Clearly:  Finally, end your empirical research paper by summarizing what you found and why it’s important. Remind everyone why your study matters. Keep your writing clear and fix any mistakes before you share it. Ask someone you trust to read it and give you feedback before you finish. 

References:  

  • Empirical Research in the Social Sciences and Education, Penn State University Libraries. Available online at  https://guides.libraries.psu.edu/emp  
  • How to conduct empirical research, Emerald Publishing. Available online at  https://www.emeraldgrouppublishing.com/how-to/research-methods/conduct-empirical-research  
  • Empirical Research: Quantitative & Qualitative, Arrendale Library, Piedmont University. Available online at  https://library.piedmont.edu/empirical-research  
  • Bouchrika, I.  What Is Empirical Research? Definition, Types & Samples  in 2024. Research.com, January 2024. Available online at  https://research.com/research/what-is-empirical-research  
  • Quantitative and Empirical Research vs. Other Types of Research. California State University, April 2023. Available online at  https://libguides.csusb.edu/quantitative  
  • Empirical Research, Definitions, Methods, Types and Examples, Studocu.com website. Available online at  https://www.studocu.com/row/document/uganda-christian-university/it-research-methods/emperical-research-definitions-methods-types-and-examples/55333816  
  • Writing an Empirical Paper in APA Style. Psychology Writing Center, University of Washington. Available online at  https://psych.uw.edu/storage/writing_center/APApaper.pdf  

Paperpal is an AI writing assistant that helps academics write better and faster with real-time suggestions for in-depth language and grammar correction. Trained on millions of research manuscripts enhanced by professional academic editors, Paperpal delivers human precision at machine speed.

Try it for free or upgrade to  Paperpal Prime , which unlocks unlimited access to premium features like academic translation, paraphrasing, contextual synonyms, consistency checks and more. It’s like always having a professional academic editor by your side! Go beyond limitations and experience the future of academic writing.  Get Paperpal Prime now at just US$19 a month!  



MAY 16, 2024

What Is Empirical Research? Definition, Types & Samples in 2024

by Imed Bouchrika, PhD

Co-Founder and Chief Data Scientist

How was the world formed? Are there parallel universes? Why does time move forward but never in reverse? These are longstanding questions that have yet to receive definitive answers up to now.

In research, these are called empirical questions, which ask about how the world is, how the world works, etc. Such questions are addressed by a corresponding type of study—called empirical research or the empirical method—which is concerned with actual events and phenomena.

What is an empirical study? Research is empirical if it seeks to find a general story or explanation, one that applies to various cases and across time. The empirical approach functions to create new knowledge about the way the world actually works. This article discusses the empirical research definition, concepts, types, processes, and other important aspects of this method. It also tackles the importance of identifying evidence in research.

I. What is Empirical Research?

A. Definitions

What is empirical evidence? Empirical research is defined as any study whose conclusions are exclusively derived from concrete, verifiable evidence. The term empirical basically means that it is guided by scientific experimentation and/or evidence. Likewise, a study is empirical when it uses real-world evidence in investigating its assertions.

This research type is founded on the view that direct observation of phenomena is a proper way to measure reality and generate truth about the world (Bhattacharya, 2008). And by its name, it is a methodology in research that observes the rules of empiricism and uses quantitative and qualitative methods for gathering evidence.

For instance, a study is being conducted to determine if working from home helps in reducing stress from highly demanding jobs. An experiment is conducted using two groups of employees, one working at their homes, the other working at the office. Each group is observed. The outcomes derived from this research will provide empirical evidence on whether working from home does help reduce stress.

It was the ancient Greek medical practitioners who originated the term empirical (empeirikos, which means "experienced") when they began to deviate from the long-observed dogmatic principles to start depending on observed phenomena. Later on, empiricism pertained to a theory of knowledge in philosophy, which follows the belief that knowledge comes from evidence and experience derived particularly using the senses.

What ancient philosophers considered empirical research pertained to the reliance on observable data to design and test theories and reach conclusions. As such, empirical research is used to produce knowledge that is based on experience. At present, the word "empirical" pertains to the gathering of data using evidence that is derived through experience or observation or by using calibrated scientific tools.

Most of today’s outstanding empirical research outputs are published in prestigious journals. These scientific publications are considered high-impact journals because they publish research articles that tend to be the most cited in their fields.

II. Types and Methodologies of Empirical Research

Empirical research is done using either qualitative or quantitative methods.

Qualitative research. Qualitative research methods are utilized for gathering non-numerical data. It is used to determine the underlying reasons, views, or meanings from study participants or subjects. Under the qualitative research design, empirical studies have evolved to test the conventional concepts of evidence and truth while still observing the fundamental principle of recognizing the subjects being studied as empirical (Powner, 2015).

This method can be semi-structured or unstructured. Results from this research type are more descriptive than predictive. It allows the researcher to write a conclusion to support the hypothesis or theory being examined.

Due to realities like time and resources, the sample size of qualitative research is typically small. It is designed to offer in-depth information or more insight regarding the problem. Some of the most popular forms of methods are interviews, experiments, and focus groups.

Quantitative research. Quantitative research methods are used for gathering information via numerical data. This type is used to measure behavior, personal views, preferences, and other variables. Quantitative studies are in a more structured format, while the variables used are predetermined.

Data gathered from quantitative studies is analyzed to address the empirical questions. Some of the commonly used quantitative methods are polls, surveys, and longitudinal or cohort studies.

There are situations when using a single research method is not enough to adequately answer the questions being studied. In such cases, a combination of both qualitative and quantitative methods is necessary. Papers can also make use of both primary and secondary research methods.


III. Qualitative Empirical Research Methods

Some research questions need data that is gathered and analyzed qualitatively or quantitatively, depending on the nature of the study. These methods not only supply answers to empirical questions but also outline one's scope of work. Here are the general types of qualitative research methods.

Observational Method

This involves observing and gathering data from study subjects. As a qualitative approach, observation is quite personal and time-intensive. It is often used in ethnographic studies to obtain empirical evidence.

The observational method is a part of the ethnographic research design, e.g., archival research, survey, etc. However, while it is commonly used for qualitative purposes, observation is also utilized for quantitative research, such as when observing measurable variables like weight, age, scale, etc.

One remarkable observational research was conducted by Abbott et al. (2016), a team of physicists from the Advanced Laser Interferometer Gravitational-Wave Observatory who examined the very first direct observation of gravitational waves. According to Google Scholar’s (2019) Metrics ranking, this study is among the most highly cited articles from the world’s most influential journals (Crew, 2019).

Interview Method

This method is exclusively qualitative and is one of the most widely used (Jamshed, 2014). Its popularity is mainly due to its ability to allow researchers to obtain precise, relevant information if the correct questions are asked.

This method is a form of a conversational approach, where in-depth data can be obtained. Interviews are commonly used in the social sciences and humanities, such as for interviewing resource persons.

Case Study Method

This method is used to identify extensive information through an in-depth analysis of existing cases. It is typically used to obtain empirical evidence for investigating problems or business studies.

When conducting case studies, the researcher must carefully perform the empirical analysis, ensuring the variables and parameters in the current case are similar to the case being examined. From the findings of a case study, conclusions can be deduced about the topic being investigated.

Case studies are commonly used in studying the experience of organizations, groups of persons, geographic locations, etc.

Textual Analysis

This primarily involves the process of describing, interpreting, and understanding textual content. It typically seeks to connect the text to a broader artistic, cultural, political, or social context (Fairclough, 2003).

A relatively new research method, textual analysis is often used nowadays to elaborate on the trends and patterns of media content, especially social media. Data obtained from this approach are primarily used to determine customer buying habits and preferences, informing product development and the design of marketing campaigns.

Focus Groups

A focus group is a thoroughly planned discussion guided by a moderator and conducted to derive opinions on a designated topic. Essentially a group interview or collective conversation, this method offers a notably meaningful approach to think through particular issues or concerns (Kamberelis & Dimitriadis, 2011).

This research method is used when a researcher wants to know the answers to "how," "what," and "why" questions. Nowadays, focus groups are among the most widely used methods by consumer product producers for designing and/or improving products that people prefer.

IV. Quantitative Empirical Research Methods

Quantitative methods primarily help researchers to better analyze the gathered evidence. Here are the most common types of quantitative research techniques:

Experimental research

A research hypothesis is commonly tested using an experiment, which involves the creation of a controlled environment where the variables are maneuvered. Aside from determining cause and effect, this method helps reveal how outcomes change when variables are altered or removed.

Traditionally, experimental, laboratory-based research is used to advance knowledge in the physical and life sciences, including psychology. In recent decades, more and more social scientists are also adopting lab experiments (Falk & Heckman, 2009).

Survey research

Survey research is designed to generate statistical data about a target audience (Fowler, 2014). Surveys can involve large, medium, or small populations and can either be a one-time event or a continuing process.

Governments across the world are among the heavy users of continuing surveys, such as for census of populations or labor force surveys. This is a quantitative method that uses predetermined sets of closed questions that are easy to answer, thus enabling the gathering and analysis of large data sets.

In the past, surveys used to be expensive and time-consuming. But with the advancement in technology, new survey tools like social media and emails have made this research method easier and cheaper.

Causal-Comparative research

This method leverages the strength of comparison. It is primarily utilized to determine the cause and effect relationship among variables (Schenker & Rumrill, 2004).

For instance, a causal-comparative study measured the productivity of employees in an organization that allows remote work setup and compared that to the staff of another organization that does not offer work from home arrangements.

Cross-sectional research

While the observation method considers study subjects at a given point in time, cross-sectional research focuses on the similarity in all variables except the one being studied. 

This type does not allow for the determination of cause-effect relationships since subjects are not observed continuously. A cross-sectional study is often followed by longitudinal research to determine the precise causes. It is used mainly by pharmaceutical firms and retailers.

Longitudinal study

A longitudinal method of research is used for understanding the traits or behavior of a subject under observation after repeatedly testing the subject over a certain period of time. Data collected using this method can be qualitative or quantitative in nature. 

A commonly-used form of longitudinal research is the cohort study. For instance, in 1951, a cohort study called the British Doctors Study (Doll et al., 2004) was initiated, which compared smokers and non-smokers in the UK. The study continued through 2001. As early as 1956, the study gave undeniable proof of the direct link between smoking and the incidence of lung cancer.

Correlational research

This method is used to determine the relationships and prevalence among variables (Curtis et al., 2016). It commonly employs regression as the statistical treatment for predicting the study’s outcomes, which can only be a negative, neutral, or positive correlation.

A classic example of correlational research is studying whether higher education helps in obtaining better-paying jobs. If outcomes indicate that higher education does allow individuals to have high-salaried jobs, then it follows that people with less education tend to have lower-paying jobs.
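As a rough sketch of this kind of analysis, assuming Python with SciPy, the snippet below computes a Pearson correlation and fits a simple regression on hypothetical education and salary data; the values are illustrative only.

```python
from scipy import stats

# Hypothetical paired data: years of education and annual salary (in $1,000s).
education_years = [10, 12, 12, 14, 16, 16, 18, 20]
salary = [32, 38, 41, 47, 55, 58, 66, 74]

# Pearson correlation: strength and direction of the linear relationship.
r, p_value = stats.pearsonr(education_years, salary)
print(f"r = {r:.2f}, p = {p_value:.4f}")  # r near +1 -> strong positive correlation

# Simple linear regression for prediction, as the section describes.
result = stats.linregress(education_years, salary)
predicted = result.slope * 15 + result.intercept
print(f"predicted salary at 15 years of education: {predicted:.1f}k")
```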


V. Steps for Conducting Empirical Research

Since empirical research is based on observation and capturing experiences, it is important to plan the steps to conduct the experiment and how to analyze it. This will enable the researcher to resolve problems or obstacles, which can occur during the experiment.

Step #1: Establishing the research objective

In this initial step, the researcher must be clear about what he or she precisely wants to do in the study. He or she should likewise frame the problem statement, plans of action, and determine any potential issues with the available resources, schedule, etc. for the research.

Most importantly, the researcher must be able to ascertain whether the study will be more beneficial than the cost it will incur.

Step #2: Reviewing relevant literature and supporting theories

The researcher must determine theories or models relevant to his or her research problem. If there are any such theories or models, he or she must understand how they can help in supporting the study outcomes.

Relevant literature must also be consulted. The researcher must be able to identify previous studies that examined similar problems or subjects, as well as determine the issues encountered.

Step #3: Framing the hypothesis and measurement

The researcher must frame an initial hypothesis or educated guess that could be the likely outcome. Variables must be established, along with the research context.

Units of measurements should also be defined, including the allowable margin of errors. The researcher must determine if the selected measures will be accepted by other scholars.

Step #4: Defining the research design, methodology, and data collection techniques

Before proceeding with the study, the researcher must establish an appropriate approach for the research. He or she must organize experiments to gather data that will allow him or her to frame the hypothesis.

The researcher should also decide whether he or she will use a nonexperimental or experimental technique to perform the study. Likewise, the  type of research design will depend on the type of study being conducted.

Finally, the researcher must determine the parameters that will influence the validity of the research design. Data gathering must be performed by selecting suitable samples based on the research question. After gathering the empirical data, the analysis follows.

Step #5: Conducting data analysis and framing the results

Data analysis is done either quantitatively or qualitatively. Depending on the nature of the study, the researcher must determine which method of data analysis is the appropriate one, or whether a combination of the two is suitable.

The outcomes of this step determine if the hypothesis is supported or rejected. This is why data analysis is considered as one of the most crucial steps in any research undertaking.

Step #6: Making conclusions

A report must be prepared that presents the findings and the entire research proceedings. If the researcher intends to disseminate his or her findings to a wider audience, the report will be converted into an article for publication. Aside from including the typical parts, from the introduction and literature review up to the methods, analysis, and conclusions, the researcher should also make recommendations for further research on his or her topic.

To ensure the originality and credibility of the report or research, it is essential to employ a plagiarism checker. A reliable checker lets the researcher verify the uniqueness of their work, avoid unintentional instances of plagiarism, and maintain the integrity of the research, ensuring that the recommendations for further study rest on the researcher's own original insights. Educators can also check the originality of their students' research by utilizing a free plagiarism checker for teachers.

VI. Empirical Research Cycle

The empirical research cycle is composed of five phases, with each one considered as important as the next phase (de Groot, 1969). This rigorous and systematic method can consistently capture the process of framing hypotheses on how certain subjects behave or function and then testing them versus empirical data. It is considered to typify the deductive approach to science.

These are the five phases of the empirical research cycle:

1. Observation

During this initial phase, an idea is triggered for presenting a hypothesis. It involves the use of observation to gather empirical data. For example:

Consumers tend to consult their smartphones first before buying something in-store.

2. Induction

Inductive reasoning is then conducted to frame a general conclusion from the data gathered through observation. For example:

As mentioned earlier, most consumers tend to consult their smartphones first before buying something in-store.

A researcher may pose the question, "Does the tendency to use a smartphone indicate that today's consumers need to be informed before making purchasing decisions?" The researcher can assume that is the case. Nonetheless, since it is still just a supposition, an experiment must be conducted to support or reject this hypothesis.

The researcher decided to conduct an online survey to inquire about the buying habits of a certain sample population of buyers at brick-and-mortar stores. This is to determine whether people always look at their smartphones first before making a purchase.

3. Deduction

This phase enables the researcher to figure out a conclusion out of the experiment. This must be based on rationality and logic in order to arrive at particular, unbiased outcomes. For example:

In the experiment, if a shopper consults his or her smartphone first before buying in-store, then it can be concluded that the shopper needs information to help him or her make informed buying decisions.

4. Testing

This phase involves the researcher going back to the empirical research steps to test the hypothesis. There is now the need to analyze and validate the data using appropriate statistical methods.

If the researcher confirms that in-store shoppers do consult their smartphones for product information before making a purchase, the researcher has found support for the hypothesis. However, it should be noted that this is just support of the hypothesis, not proof of a reality.

5. Evaluation

This phase is often neglected by many but is actually a crucial step to help keep expanding knowledge. During this stage, the researcher presents the gathered data, the supporting contention/s, and conclusions.

The researcher likewise puts forth the limitations of the study and his hypothesis. In addition, the researcher makes recommendations for further studies on the same topic with expanded variables.


VII. Advantages and Disadvantages of Empirical Research

Advantages

Since the time of the ancient Greeks, empirical research has been providing the world with numerous benefits. The following are a few of them:

  • Empirical research is used to validate previous research findings and frameworks.
  • It assumes a critical role in enhancing internal validity.
  • The degree of control is high, which enables the researcher to manage numerous variables.
  • It allows a researcher to comprehend the progressive changes that can occur, and thus enables him to modify an approach when needed.
  • Being based on facts and experience makes a research project more authentic and competent.

Disadvantages

Despite the many benefits it brings, empirical research is far from perfect. The following are some of its drawbacks:

  • Because it is evidence-based, data collection is a common problem, especially when the research involves different sources and multiple methods.
  • It can be time-consuming, especially for longitudinal research.
  • Requesting permission to perform certain methods can be difficult, especially when a study involves human subjects.
  • Conducting research in multiple locations can be very expensive.
  • Even seasoned researchers are prone to incorrectly interpreting statistical significance. For instance, Amrhein et al. (2019) analyzed 791 articles from five journals and found that half incorrectly interpreted non-significance as indicating zero effect.

VIII. Samples of Empirical Research

There are many types of empirical research, and they can take many forms, from basic research to action research like community project efforts. Here are some notable empirical research examples:

Professional Research

  • Research on Information Technology
  • Research on Infectious Diseases
  • Research on Occupational Health Psychology
  • Research on Infection Control
  • Research on Cancer
  • Research on Mathematical Science
  • Research on Environmental Science
  • Research on Genetics
  • Research on Climate Change
  • Research on Economics

Student Research

  • Thesis for B.S. in Computer Science & Engineering  
  • Thesis for B.S. in Geography
  • Thesis for B.S. in Architecture
  • Thesis for Master of Science in Electrical Engineering & Computer Science
  • Thesis for Master of Science in Artificial Intelligence
  • Thesis for Master of Science in Food Science and Nutrition
  • Dissertation for Ph.D. in Marketing  
  • Dissertation for Ph.D. in Social Work
  • Dissertation for Ph.D. in Urban Planning

From ancient times until today, empirical research has remained one of the most useful tools in humanity's collective endeavor to unlock life's mysteries. Using meaningful experience and observable evidence, this type of research will continue helping validate myriad hypotheses, test theoretical models, and advance various fields of specialization.

With new forms of deadly diseases and other problems continuing to plague human existence, finding effective medical interventions and relevant solutions has never been more important. This is among the reasons why empirical research has assumed a more prominent role in today's society.

This article discussed the different empirical research methods, the steps for conducting empirical research, the empirical research cycle, and notable examples. All of these contribute to supporting the larger societal cause of helping understand how the world really works and making it a better place. Furthermore, being factually accurate is a big part of the criteria of good research, and it serves as the heart of empirical research.

Key Insights

  • Definition of Empirical Research: Empirical research is based on verifiable evidence derived from observation and experimentation, aiming to understand how the world works.
  • Origins: The concept of empirical research dates back to ancient Greek medical practitioners who relied on observed phenomena rather than dogmatic principles.
  • Types and Methods: Empirical research can be qualitative (e.g., interviews, case studies) or quantitative (e.g., surveys, experiments), depending on the nature of the data collected and the research question.
  • Empirical Research Cycle: Consists of observation, induction, deduction, testing, and evaluation, forming a systematic approach to generating and testing hypotheses.
  • Steps in Conducting Empirical Research: Includes establishing objectives, reviewing literature, framing hypotheses, designing methodology, collecting data, analyzing data, and making conclusions.
  • Advantages: Empirical research validates previous findings, enhances internal validity, allows for high control over variables, and is fact-based, making it authentic and competent.
  • Disadvantages: Data collection can be challenging and time-consuming, especially in longitudinal studies, and interpreting statistical significance can be problematic.
  • Applications: Used across various fields such as IT, infectious diseases, occupational health, environmental science, and economics. It is also prevalent in student research for theses and dissertations.
Frequently Asked Questions

  • What is the primary goal of empirical research? The primary goal of empirical research is to generate knowledge about how the world works by relying on verifiable evidence obtained through observation and experimentation.
  • How does empirical research differ from theoretical research? Empirical research is based on observable and measurable evidence, while theoretical research involves abstract ideas and concepts without necessarily relying on real-world data.
  • What are the main types of empirical research methods? The main types of empirical research methods are qualitative (e.g., interviews, case studies, focus groups) and quantitative (e.g., surveys, experiments, cross-sectional studies).
  • Why is the empirical research cycle important? The empirical research cycle is important because it provides a structured and systematic approach to generating and testing hypotheses, ensuring that the research is thorough and reliable.
  • What are the steps involved in conducting empirical research? The steps involved in conducting empirical research include establishing the research objective, reviewing relevant literature, framing hypotheses, defining research design and methodology, collecting data, analyzing data, and making conclusions.
  • What are the advantages of empirical research? The advantages of empirical research include validating previous findings, enhancing internal validity, allowing for high control over variables, and being based on facts and experiences, making the research authentic and competent.
  • What are some common challenges in conducting empirical research? Common challenges in conducting empirical research include difficulties in data collection, time-consuming processes, obtaining permissions for certain methods, high costs, and potential misinterpretation of statistical significance.
  • In which fields is empirical research commonly used? Empirical research is commonly used in fields such as information technology, infectious diseases, occupational health, environmental science, economics, and various academic disciplines for student theses and dissertations.
  • Can empirical research use both qualitative and quantitative methods? Yes, empirical research can use both qualitative and quantitative methods, often combining them to provide a comprehensive understanding of the research problem.
  • What role does empirical research play in modern society? Empirical research plays a crucial role in modern society by validating hypotheses, testing theoretical models, and advancing knowledge across various fields, ultimately contributing to solving complex problems and improving the quality of life.
References

  • Abbott, B., Abbott, R., Abbott, T., Abernathy, M., & Acernese, F. (2016). Observation of Gravitational Waves from a Binary Black Hole Merger. Physical Review Letters, 116 (6), 061102. https://doi.org/10.1103/PhysRevLett.116.061102
  • Akpinar, E. (2014). Consumer Information Sharing: Understanding Psychological Drivers of Social Transmission . (Unpublished Ph.D. dissertation). Erasmus University Rotterdam, Rotterdam, The Netherlands.  http://hdl.handle.net/1765/1
  • Altmetric (2020). The 2019 Altmetric top 100. Altmetric .
  • Amrhein, V., Greenland, S., & McShane, B. (2019). Scientists rise up against statistical significance. Nature, 567 , 305-307.  https://doi.org/10.1038/d41586-019-00857-9
  • Amrhein, V., Trafimow, D., & Greenland, S. (2019). Inferential statistics as descriptive statistics: There is no replication crisis if we don’t expect replication. The American Statistician, 73 , 262-270. https://doi.org/10.1080/00031305.2018.1543137
  • Arute, F., Arya, K., Babbush, R. et al. (2019). Quantum supremacy using a programmable superconducting processor. Nature, 574 , 505510. https://doi.org/10.1038/s41586-019-1666-5
  • Bhattacharya, H. (2008). Empirical Research. In L. M. Given (ed.), The SAGE Encyclopedia of Qualitative Research Methods . Thousand Oaks, CA: Sage, 254-255.  https://dx.doi.org/10.4135/9781412963909.n133
  • Cohn, A., Maréchal, M., Tannenbaum, D., & Zund, C. (2019). Civic honesty around the globe. Science, 365 (6448), 70-73. https://doi.org/10.1126/science.aau8712
  • Corbin, J., & Strauss, A. (2015). Basics of Qualitative Research: Techniques and Procedures for Developing Grounded Theory, 4th ed . Thousand Oaks, CA: Sage. ISBN 978-1-4129-9746-1
  • Crew, B. (2019, August 2). Google Scholar reveals its most influential papers for 2019. Nature Index .
  • Curtis, E., Comiskey, C., & Dempsey, O. (2016). Importance and use of correlational research. Nurse Researcher, 23 (6), 20-25. https://doi.org/10.7748/nr.2016.e1382
  • Dashti, H., Jones, S., Wood, A., Lane, J., & van Hees, V., et al. (2019). Genome-wide association study identifies genetic loci for self-reported habitual sleep duration supported by accelerometer-derived estimates. Nature Communications, 10 (1).  https://doi.org/10.1038/s41467-019-08917-4
  • de Groot, A.D. (1969). Methodology: foundations of inference and research in the behavioral sciences. In  Psychological Studies, 6 . The Hague & Paris: Mouton & Co. Google Books
  • Doll, R., Peto, R., Boreham, J., & Isabelle Sutherland, I. (2004). Mortality in relation to smoking: 50 years’ observations on male British doctors. BMJ, 328  (7455), 1519-33. https://doi.org/10.1136/bmj.38142.554479.AE
  • Fairclough, N. (2003). Analyzing Discourse: Textual Analysis for Social Research . Abingdon-on-Thames: Routledge. Google Books
  • Falk, A., & Heckman, J. (2009). Lab experiments are a major source of knowledge in the social sciences. Science, 326 (5952), pp. 535-538. https://doi.org/10.1126/science.1168244
  • Fowler, F.J. (2014). Survey Research Methods, 5th ed . Thousand Oaks, CA: Sage. WorldCat
  • Gabriel, A., Manalo, M., Feliciano, R., Garcia, N., Dollete, U., & Paler J. (2018). A Candida parapsilosis inactivation-based UV-C process for calamansi (Citrus microcarpa) juice frink. LWT Food Science and Technology, 90 , 157-163. https://doi.org/10.1016/j.lwt.2017.12.020
  • Gallus, S., Bosetti, C., Negri, E., Talamini, R., Montella, M., et al. (2003). Does pizza protect against cancer? International Journal of Cancer, 107 (2), pp. 283-284. https://doi.org/10.1002/ijc.11382
  • Ganna, A., Verweij, K., Nivard, M., Maier, R., & Wedow, R. (2019). Large-scale GWAS reveals insights into the genetic architecture of same-sex sexual behavior. Science, 365 (6456). https://doi.org/10.1126/science.aat7693
  • Gedik, H., Voss, T., & Voss, A. (2013). Money and Transmission of Bacteria. Antimicrobial Resistance and Infection Control, 2 (2).  https://doi.org/10.1186/2047-2994-2-22
  • Gonzalez-Morales, M. G., Kernan, M. C., Becker, T. E., & Eisenberger, R. (2018). Defeating abusive supervision: Training supervisors to support subordinates. Journal of Occupational Health Psychology, 23  (2), 151-162. https://dx.doi.org/10.1037/ocp0000061
  • Google (2020). The 2019 Google Scholar Metrics Ranking . Google Scholar
  • Greenberg, D., Warrier, V., Allison, C., & Baron-Cohen, S. (2018). Testing the Empathizing-Systemising theory of sex differences and the Extreme Male Brain theory of autism in half a million people. PNAS, 115 (48), 12152-12157. https://doi.org/10.1073/pnas.1811032115
  • Grullon, D. (2019). Disentangling time constant and time-dependent hidden state in time series with variational Bayesian inference . (Unpublished master’s thesis). Massachusetts Institute of Technology, Cambridge, MA.  https://hdl.handle.net/1721.1/124572
  • He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) , 770-778. https://doi.org/10.1109/CVPR.2016.90
  • Hviid, A., Hansen, J., Frisch, M., & Melbye, M. (2019). Measles, mumps, rubella vaccination, and autism: A nationwide cohort study. Annals of Internal Medicine, 170 (8), 513-520. https://doi.org/10.7326/M18-2101
  • Jamshed, S. (2014). Qualitative research method-interviewing and observation. Journal of Basic and Clinical Pharmacy, 5 (4), 87-88. https://doi.org/10.4103/0976-0105.141942
  • Jamshidnejad, A. (2017). Efficient Predictive Model-Based and Fuzzy Control for Green Urban Mobility . (Unpublished Ph.D. dissertation). Delft University of Technology, Delft, Netherlands.  DUT
  • Kamberelis, G., & Dimitriadis, G. (2011). Focus groups: Contingent articulations of pedagogy, politics, and inquiry. In N. Denzin & Y. Lincoln (Eds.), The SAGE Handbook of Qualitative Research  (pp. 545-562). Thousand Oaks, CA: Sage. ISBN 978-1-4129-7417-2
  • Knowles-Smith, A. (2017). Refugees and theatre: an exploration of the basis of self-representation . (Unpublished undergraduate thesis). University College London, London, UK. UCL
  • Kulp, S.A., & Strauss, B.H. (2019). New elevation data triple estimates of global vulnerability to sea-level rise and coastal flooding. Nature Communications, 10 (4844), 1-12.  https://doi.org/10.1038/s41467-019-12808-z
  • LeCun, Y., Bengio, Y. & Hinton, G. (2015). Deep learning. Nature, 521 , 436444. https://doi.org/10.1038/nature14539
  • Levitt, H. M., Bamberg, M., Creswell, J. W., Frost, D. M., Josselson, R., & Suarez-Orozco, C. (2018). Journal article reporting standards for qualitative primary, qualitative meta-analytic, and mixed methods research in psychology: The APA Publications and Communications Board task force report.  American Psychologist, 73 (1), 26-46. https://doi.org/10.1037/amp0000151
  • Long, J., Shelhamer, E., & Darrell, T. (2015). Fully convolutional networks for semantic segmentation. 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) , 3431-3440. https://doi.org/10.1109/CVPR.2015.7298965
  • Martindell, N. (2014). DCDN: Distributed content delivery for the modern web . (Unpublished undergraduate thesis). University of Washington, Seattle, WA. CSE-UW
  • Mora, T. (2019). Transforming Parking Garages Into Affordable Housing . (Unpublished undergraduate thesis). University of Arkansas-Fayetteville, Fayetteville, AK. UARK
  • Ng, M., Fleming, T., Robinson, M., Thomson, B., & Graetz, N. (2014). Global, regional, and national prevalence of overweight and obesity in children and adults during 19802013: a systematic analysis for the Global Burden of Disease Study 2013. The Lancet, 384  (9945), 766-781. https://doi.org/10.1016/S0140-6736(14)60460-8
  • Ogden, C., Carroll, M., Kit, B., & Flegal, K. (2014). Prevalence of Childhood and Adult Obesity in the United States, 2011-2012. JAMA, 311 (8), 806-14. https://doi.org/10.1001/jama.2014.732
  • Powner, L. (2015). Empirical Research and Writing: A Political Science Student’s Practical Guide . Thousand Oaks, CA: Sage, 1-19.  https://dx.doi.org/10.4135/9781483395906
  • Ripple, W., Wolf, C., Newsome, T., Barnard, P., & Moomaw, W. (2020). World scientists’ warning of a climate emergency. BioScience, 70 (1), 8-12. https://doi.org/10.1093/biosci/biz088
  • Schenker, J., & Rumrill, P. (2004). Causal-comparative research designs. Journal of Vocational Rehabilitation, 21 (3), 117-121.
  • Shereen, M., Khan, S., Kazmi, A., Bashir, N., & Siddique, R. (2020). COVID-19 infection: Origin, transmission, and characteristics of human coronaviruses. Journal of Advanced Research, 24 , 91-98.  https://doi.org/10.1016/j.jare.2020.03.005
  • Sipola, C. (2017). Summarizing electricity usage with a neural network . (Unpublished master’s thesis). University of Edinburgh, Edinburgh, Scotland. Project-Archive
  • Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., & Rabinovich, A. (2015). Going deeper with convolutions. 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) , 1-9. https://doi.org/10.1109/CVPR.2015.7298594
  • Taylor, S. (2017). Effacing and Obscuring Autonomy: the Effects of Structural Violence on the Transition to Adulthood of Street Involved Youth . (Unpublished Ph.D. dissertation). University of Ottawa, Ottawa, Canada. UOttawa
  • Vosoughi, S., Roy, D., & Aral, S. (2018). The spread of true and false news online. Science, 359 (6380), 1146-1151. https://doi.org/10.1126/science.aap9559


Research Guide

Chapter 6: The Empirical Analysis

Any quantitative research project in economics centers on the analysis we perform on the data we have collected. This is the most crucial part of the paper and will determine whether our work succeeds (which is, of course, also linked to having a good research question and a plausible hypothesis).

In this section, I provide guidelines on some of the elements to keep in mind when conducting quantitative research. This material is not exhaustive, as there are many elements to take into account, but it should give you some structure regarding the issues to keep in mind.

6.1 The Data

There are two different types of data. Experimental data is collected when an experiment or study is conducted to examine the effects of a given policy or intervention. One example is a study of whether vaccination increases when incentives are provided: one group receives no incentive, another receives a monetary incentive, and a third receives an in-kind incentive. The study is designed so that all arms have a similar configuration; that way, when the study is conducted, we can verify that the true effects come from the treatment (the incentives) and not from some other factor affecting the composition of the sample.

The most popular sort of data, however, is observational data. This information is collected by administrative sources (think of the U.S. Census or the World Bank) through surveys or historical records. It is sometimes hard to use this data for econometric analysis because there is no random assignment of a treatment, which makes it harder to elicit the true effect. However, there are multiple tools that we can use to deal with these issues and estimate causal effects.

6.1.1 Data Configuration

6.1.1.1 Cross-Sectional Data

Cross-sectional data includes data on different subjects (individuals, households, government units, countries) for a single time period. This means that we only have one level of analysis and one observation per subject (the i). This type of data allows us to learn about the relationships among different variables.

One example of this type of data is the survey of smallholder farmers collected in the Ivory Coast in 2015 by the World Bank, in which about 2,500 smallholder farmers were asked questions about farming practices, investment, and access to financial services.

6.1.1.2 Time-series Data

In this case, data for a single subject is collected during multiple time periods, so the main unit of analysis is time (the t).

The most common type of data used for this kind of analysis is macroeconomic data (GDP, unemployment, etc.), and it is widely used for forecasting.

6.1.1.3 Panel Data

Panel, or longitudinal, data includes multiple observations for each subject. Typically, data is collected for the same subject during multiple time periods, so for the same i we will have data for multiple t's.

This type of data is widely used in econometrics. One example is the number of violent crimes per county (the i) for the period between 2000 and 2020 (the t).

It is extremely important to understand the configuration of your data, as this will define the type of econometric analysis that you can conduct.

6.1.2 Describing your Variables

After we have identified the configuration of our data, we need to think more carefully about the variables we will use in our analysis. It is crucial that you identify their characteristics as well as their distribution. This will help you evaluate whether you need to apply any transformation to your variables, and understand how to interpret the coefficients of your regressions. Here, I am including only the most relevant aspects of these steps, but you can read Nick Huntington-Klein's book for more details.

6.1.2.1 Types of Variables

  • Continuous variables: In theory, these variables can take any value, though they may be censored in some way (for instance, some variables cannot be negative). Income is one example of this type of variable.
  • Count variables: Most of the time, we treat these variables the same way as continuous variables, but they represent how many or how much there is of something (they count). When we plot them, it is clear that these variables are not continuous.
  • Categorical variables: Surveys often include questions with a pre-set number of values, or open answers that can be grouped into categories; examples include ethnicity, religion, and age group. Many times, these variables are, or can be transformed into, binary (or indicator) variables. Sex is a common binary example, and from a question on religion, a set of dichotomous variables can be created to identify whether a person identifies as Christian, Jewish, Muslim, and so forth.
  • Qualitative variables: Sometimes, responses require a more detailed explanation and therefore cannot be grouped into categories (at least not at first sight). For instance, the ACLED data, a source on conflict data, includes a variable that describes the details of a given conflict event.

6.1.3 Visualizing your Data

After you identify the type of variables you are using in your analysis, it is key that you understand their distribution. What are the different values that a variable can take? How often do these values occur?

This can be done in multiple ways. The easiest one is to generate a table for the variable. In Stata, this is done with:
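    * A minimal example; "region" is an illustrative variable name
    * ("tab" is a common abbreviation of tabulate)
    tabulate region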

To tabulate a variable in R, you can use:
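    # A minimal example; df and region are illustrative names
    table(df$region)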

You can also plot your variables to obtain a clear visualization of their distribution. You can use histograms for non-continuous variables, and density plots for continuous variables.
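For instance, a minimal sketch in R, where df, n_children, and income are illustrative names:

    # Histogram for a count (non-continuous) variable
    hist(df$n_children)
    # Density plot for a continuous variable
    plot(density(df$income, na.rm = TRUE))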

6.1.4 Distribution

Many times, it is important to know more about the different moments of the distribution of your variables: mean, variance (or standard deviation), skewness, and sometimes, the kurtosis.

Although a visual representation of your data is very useful in these cases, obtaining a table with this information may also be necessary, for example to see the range of your data as well as other important characteristics.

In Stata, you can obtain a set of descriptive statistics using:
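    * A minimal example; "income" is an illustrative variable name
    * The detail option adds percentiles, skewness, and kurtosis
    summarize income, detail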

In R, you can get a range of descriptive statistics using
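    # A minimal example; df and income are illustrative names
    summary(df$income)            # min, quartiles, median, mean, max
    sd(df$income, na.rm = TRUE)   # standard deviation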

Why is this important? Remember, we are trying to draw inferences from the sample we have and apply them to the whole population we are analyzing. Many times, we have some idea of the theoretical distribution of the variables we are interested in. In most cases, it is plausible to assume a normal distribution (remember the Central Limit Theorem), which is one of the reasons we prefer larger samples over smaller ones. In some cases, we may get a distribution that is skewed to the right with a very fat right tail, but that becomes normal once we take the natural logarithm; this is a log-normal distribution. As you proceed with the analysis and do hypothesis testing, remember that you are using a limited sample to learn more about a bigger population.
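As a quick illustration in R (df and income are again illustrative names):

    # A right-skewed, positive-valued variable often looks roughly normal
    # on the log scale
    hist(df$income)
    hist(log(df$income))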

6.2 Initial Description of a Relationship

Once we know how our specific variables are distributed, we may be interested in learning more about how they are linked. We want to see how our independent variable or variables are linked to the dependent variable.

The most straightforward way to do this is by using a scatterplot, where we plot the independent and dependent variable and see how they correlate.

We may also look at some conditional distributions and plot histograms and scatterplots, looking at a subsample of the data or plotting it for different groups.

In addition, we can obtain an initial picture of the relationship between X and Y by running a simple OLS regression (with no control variables). We may even plot the fitted OLS line.
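A minimal sketch in base R, assuming a data frame df with a dependent variable y and an independent variable x (all names illustrative):

    # Simple OLS regression of y on x, with no control variables
    fit <- lm(y ~ x, data = df)
    summary(fit)       # the coefficient on x summarizes the raw association

    # Scatterplot with the fitted OLS line overlaid
    plot(df$x, df$y)
    abline(fit)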

For more examples and a more detailed description, please check Nick Huntington-Klein's book.

6.3 Handouts

How to Interpret Coefficients?


Empirical & Non-Empirical Research


Introduction: What is Empirical Research?


Empirical research  is based on phenomena that can be observed and measured. Empirical research derives knowledge from actual experience rather than from theory or belief. 

Key characteristics of empirical research include:

  • Specific research questions to be answered;
  • Definitions of the population, behavior, or phenomena being studied;
  • Description of the methodology or research design used to study this population or phenomena, including selection criteria, controls, and testing instruments (such as surveys);
  • Two basic research processes or methods in empirical research: quantitative methods and qualitative methods (see the rest of the guide for more about these methods).

(based on the original from the Connelly Library of La Salle University)


Quantitative Research

A quantitative research project is characterized by having a population about which the researcher wants to draw conclusions, but it is not possible to collect data on the entire population.

  • For an observational study, it is necessary to select a proper, statistical random sample and to use methods of statistical inference to draw conclusions about the population. 
  • For an experimental study, it is necessary to have a random assignment of subjects to experimental and control groups in order to use methods of statistical inference.

Statistical methods are used in all three stages of a quantitative research project.

For observational studies, the data are collected using statistical sampling theory. Then, the sample data are analyzed using descriptive statistical analysis. Finally, generalizations are made from the sample data to the entire population using statistical inference.

For experimental studies, the subjects are allocated to experimental and control groups using randomization methods. Then, the experimental data are analyzed using descriptive statistical analysis. Finally, just as for observational data, generalizations are made to a larger population.

Iversen, G. (2004). Quantitative research . In M. Lewis-Beck, A. Bryman, & T. Liao (Eds.), Encyclopedia of social science research methods . (pp. 897-898). Thousand Oaks, CA: SAGE Publications, Inc.

Qualitative Research

What makes a work deserving of the label qualitative research is the demonstrable effort to produce richly and relevantly detailed descriptions and particularized interpretations of people and the social, linguistic, material, and other practices and events that shape and are shaped by them.

Qualitative research typically includes, but is not limited to, discerning the perspectives of these people, or what is often referred to as the actor’s point of view. Although both philosophically and methodologically a highly diverse entity, qualitative research is marked by certain defining imperatives that include its case (as opposed to its variable) orientation, sensitivity to cultural and historical context, and reflexivity. 

In its many guises, qualitative research is a form of empirical inquiry that typically entails some form of purposive sampling for information-rich cases; in-depth, open-ended interviews; lengthy participant/field observations and/or document or artifact study; and techniques for analysis and interpretation of data that move beyond the data generated and their surface appearances. 

Sandelowski, M. (2004).  Qualitative research . In M. Lewis-Beck, A. Bryman, & T. Liao (Eds.),  Encyclopedia of social science research methods . (pp. 893-894). Thousand Oaks, CA: SAGE Publications, Inc.


J Korean Med Sci. 2022 Apr 25; 37(16): e121.

A Practical Guide to Writing Quantitative and Qualitative Research Questions and Hypotheses in Scholarly Articles

Edward Barroga

1 Department of General Education, Graduate School of Nursing Science, St. Luke’s International University, Tokyo, Japan.

Glafera Janet Matanguihan

2 Department of Biological Sciences, Messiah University, Mechanicsburg, PA, USA.

The development of research questions and the subsequent hypotheses are prerequisites to defining the main research purpose and specific objectives of a study. Consequently, these objectives determine the study design and research outcome. The development of research questions is a process based on knowledge of current trends, cutting-edge studies, and technological advances in the research field. Excellent research questions are focused and require a comprehensive literature search and in-depth understanding of the problem being investigated. Initially, research questions may be written as descriptive questions which could be developed into inferential questions. These questions must be specific and concise to provide a clear foundation for developing hypotheses. Hypotheses are more formal predictions about the research outcomes. These specify the possible results that may or may not be expected regarding the relationship between groups. Thus, research questions and hypotheses clarify the main purpose and specific objectives of the study, which in turn dictate the design of the study, its direction, and outcome. Studies developed from good research questions and hypotheses will have trustworthy outcomes with wide-ranging social and health implications.

INTRODUCTION

Scientific research is usually initiated by posing evidence-based research questions which are then explicitly restated as hypotheses. 1 , 2 The hypotheses provide directions to guide the study, solutions, explanations, and expected results. 3 , 4 Both research questions and hypotheses are essentially formulated based on conventional theories and real-world processes, which allow the inception of novel studies and the ethical testing of ideas. 5 , 6

It is crucial to have knowledge of both quantitative and qualitative research 2 as both types of research involve writing research questions and hypotheses. 7 However, these crucial elements of research are sometimes overlooked; when they are not overlooked, they are often framed without the forethought and meticulous attention they need. Planning and careful consideration are needed when developing quantitative or qualitative research, particularly when conceptualizing research questions and hypotheses. 4

There is a continuing need to support researchers in the creation of innovative research questions and hypotheses, as well as for journal articles that carefully review these elements. 1 When research questions and hypotheses are not carefully thought of, unethical studies and poor outcomes usually ensue. Carefully formulated research questions and hypotheses define well-founded objectives, which in turn determine the appropriate design, course, and outcome of the study. This article then aims to discuss in detail the various aspects of crafting research questions and hypotheses, with the goal of guiding researchers as they develop their own. Examples from the authors and peer-reviewed scientific articles in the healthcare field are provided to illustrate key points.

DEFINITIONS AND RELATIONSHIP OF RESEARCH QUESTIONS AND HYPOTHESES

A research question is what a study aims to answer after data analysis and interpretation. The answer is written in length in the discussion section of the paper. Thus, the research question gives a preview of the different parts and variables of the study meant to address the problem posed in the research question. 1 An excellent research question clarifies the research writing while facilitating understanding of the research topic, objective, scope, and limitations of the study. 5

On the other hand, a research hypothesis is an educated statement of an expected outcome. This statement is based on background research and current knowledge. 8 , 9 The research hypothesis makes a specific prediction about a new phenomenon 10 or a formal statement on the expected relationship between an independent variable and a dependent variable. 3 , 11 It provides a tentative answer to the research question to be tested or explored. 4

Hypotheses employ reasoning to predict a theory-based outcome. 10 These can also be developed from theories by focusing on components of theories that have not yet been observed. 10 The validity of hypotheses is often based on the testability of the prediction made in a reproducible experiment. 8

Conversely, hypotheses can also be rephrased as research questions. Several hypotheses based on existing theories and knowledge may be needed to answer a research question. Developing ethical research questions and hypotheses creates a research design that has logical relationships among variables. These relationships serve as a solid foundation for the conduct of the study. 4 , 11 Haphazardly constructed research questions can result in poorly formulated hypotheses and improper study designs, leading to unreliable results. Thus, the formulations of relevant research questions and verifiable hypotheses are crucial when beginning research. 12

CHARACTERISTICS OF GOOD RESEARCH QUESTIONS AND HYPOTHESES

Excellent research questions are specific and focused. These integrate collective data and observations to confirm or refute the subsequent hypotheses. Well-constructed hypotheses are based on previous reports and verify the research context. These are realistic, in-depth, sufficiently complex, and reproducible. More importantly, these hypotheses can be addressed and tested. 13

There are several characteristics of well-developed hypotheses. Good hypotheses are 1) empirically testable 7 , 10 , 11 , 13 ; 2) backed by preliminary evidence 9 ; 3) testable by ethical research 7 , 9 ; 4) based on original ideas 9 ; 5) supported by evidence-based logical reasoning 10 ; and 6) can be predicted. 11 Good hypotheses can infer ethical and positive implications, indicating the presence of a relationship or effect relevant to the research theme. 7 , 11 These are initially developed from a general theory and branch into specific hypotheses by deductive reasoning. In the absence of a theory on which to base the hypotheses, inductive reasoning based on specific observations or findings forms more general hypotheses. 10

TYPES OF RESEARCH QUESTIONS AND HYPOTHESES

Research questions and hypotheses are developed according to the type of research, which can be broadly classified into quantitative and qualitative research. We provide a summary of the types of research questions and hypotheses under quantitative and qualitative research categories in Table 1 .

Quantitative research
  • Research questions: descriptive, comparative, and relationship research questions
  • Hypotheses: simple, complex, directional, non-directional, associative, causal, null, alternative, working, statistical, and logical hypotheses; hypothesis-testing research

Qualitative research
  • Research questions: contextual, descriptive, evaluation, explanatory, exploratory, generative, ideological, ethnographic, phenomenological, grounded theory, and qualitative case study questions
  • Hypotheses: hypothesis-generating research

Research questions in quantitative research

In quantitative research, research questions inquire about the relationships among variables being investigated and are usually framed at the start of the study. These are precise and typically linked to the subject population, dependent and independent variables, and research design. 1 Research questions may also attempt to describe the behavior of a population in relation to one or more variables, or describe the characteristics of variables to be measured ( descriptive research questions ). 1 , 5 , 14 These questions may also aim to discover differences between groups within the context of an outcome variable ( comparative research questions ), 1 , 5 , 14 or elucidate trends and interactions among variables ( relationship research questions ). 1 , 5 We provide examples of descriptive, comparative, and relationship research questions in quantitative research in Table 2 .

Quantitative research questions

Descriptive research question
- Measures responses of subjects to variables
- Presents variables to measure, analyze, or assess
Example: What is the proportion of resident doctors in the hospital who have mastered ultrasonography (response of subjects to a variable) as a diagnostic technique in their clinical training?

Comparative research question
- Clarifies the difference between one group with the outcome variable and another group without the outcome variable
Example: Is there a difference in the reduction of lung metastasis in osteosarcoma patients who received the vitamin D adjunctive therapy (group with outcome variable) compared with osteosarcoma patients who did not receive the vitamin D adjunctive therapy (group without outcome variable)?
- Compares the effects of variables
Example: How does the vitamin D analogue 22-Oxacalcitriol (variable 1) mimic the antiproliferative activity of 1,25-Dihydroxyvitamin D (variable 2) in osteosarcoma cells?

Relationship research question
- Defines trends, associations, relationships, or interactions between the dependent variable and the independent variable
Example: Is there a relationship between the number of medical student suicides (dependent variable) and the level of medical student stress (independent variable) in Japan during the first wave of the COVID-19 pandemic?

Hypotheses in quantitative research

In quantitative research, hypotheses predict the expected relationships among variables. 15 Relationships among variables that can be predicted include 1) between a single dependent variable and a single independent variable ( simple hypothesis ) or 2) between two or more independent and dependent variables ( complex hypothesis ). 4 , 11 Hypotheses may also specify the expected direction to be followed and imply an intellectual commitment to a particular outcome ( directional hypothesis ). 4 On the other hand, hypotheses may not predict the exact direction and are used in the absence of a theory, or when findings contradict previous studies ( non-directional hypothesis ). 4 In addition, hypotheses can 1) define interdependency between variables ( associative hypothesis ), 4 2) propose an effect on the dependent variable from manipulation of the independent variable ( causal hypothesis ), 4 3) state a negative relationship between two variables ( null hypothesis ), 4 , 11 , 15 4) replace the working hypothesis if rejected ( alternative hypothesis ), 15 5) explain the relationship of phenomena to possibly generate a theory ( working hypothesis ), 11 6) involve quantifiable variables that can be tested statistically ( statistical hypothesis ), 11 or 7) express a relationship whose interlinks can be verified logically ( logical hypothesis ). 11 We provide examples of simple, complex, directional, non-directional, associative, causal, null, alternative, working, statistical, and logical hypotheses in quantitative research, as well as the definition of quantitative hypothesis-testing research in Table 3 .

Quantitative research hypotheses

Simple hypothesis
- Predicts a relationship between a single dependent variable and a single independent variable
Example: If the dose of the new medication (single independent variable) is high, blood pressure (single dependent variable) is lowered.

Complex hypothesis
- Foretells a relationship between two or more independent and dependent variables
Example: The higher the use of anticancer drugs, radiation therapy, and adjunctive agents (3 independent variables), the higher would be the survival rate (1 dependent variable).

Directional hypothesis
- Identifies the study direction based on theory towards a particular outcome to clarify the relationship between variables
Example: Privately funded research projects will have a larger international scope (study direction) than publicly funded research projects.

Non-directional hypothesis
- The nature of the relationship between two variables or the exact study direction is not identified
- Does not involve a theory
Example: Women and men are different in terms of helpfulness. (Exact study direction is not identified.)

Associative hypothesis
- Describes variable interdependency
- Change in one variable causes change in another variable
Example: A larger number of people vaccinated against COVID-19 in the region (change in independent variable) will reduce the region’s incidence of COVID-19 infection (change in dependent variable).

Causal hypothesis
- An effect on the dependent variable is predicted from manipulation of the independent variable
Example: A change to a high-fiber diet (independent variable) will reduce the blood sugar level (dependent variable) of the patient.

Null hypothesis
- A negative statement indicating no relationship or difference between 2 variables
Example: There is no significant difference in the severity of pulmonary metastases between the new drug (variable 1) and the current drug (variable 2).

Alternative hypothesis
- Following a null hypothesis, an alternative hypothesis predicts a relationship between 2 study variables
Example: The new drug (variable 1) is better on average in reducing the level of pain from pulmonary metastasis than the current drug (variable 2).

Working hypothesis
- A hypothesis that is initially accepted for further research to produce a feasible theory
Example: Dairy cows fed with concentrates of different formulations will produce different amounts of milk.

Statistical hypothesis
- An assumption about the value of a population parameter or the relationship among several population characteristics
- Validity tested by a statistical experiment or analysis
Examples: The mean recovery rate from COVID-19 infection (value of population parameter) is not significantly different between population 1 and population 2. There is a positive correlation between the level of stress at the workplace and the number of suicides (population characteristics) among working people in Japan.

Logical hypothesis
- Offers or proposes an explanation with limited or no extensive evidence
Example: If healthcare workers provide more educational programs about contraception methods, the number of adolescent pregnancies will be less.

Hypothesis-testing (quantitative hypothesis-testing research)
- Quantitative research uses deductive reasoning.
- This involves the formation of a hypothesis, collection of data in the investigation of the problem, analysis and use of the data from the investigation, and drawing of conclusions to validate or nullify the hypotheses.
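For the null and alternative hypotheses above, it can help to see the conventional formal notation. As a minimal sketch, using the mean recovery rates of two populations from the statistical hypothesis example (the symbols are standard; the application to recovery rates is illustrative):

\[
H_0:\ \mu_1 = \mu_2 \qquad \text{versus} \qquad H_1:\ \mu_1 \neq \mu_2
\]

Here \(\mu_1\) and \(\mu_2\) are the population means; a statistical test then asks whether the sample data are compatible with \(H_0\).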

Research questions in qualitative research

Unlike research questions in quantitative research, research questions in qualitative research are usually continuously reviewed and reformulated. The central question and associated subquestions are stated more than the hypotheses. 15 The central question broadly explores a complex set of factors surrounding the central phenomenon, aiming to present the varied perspectives of participants. 15

There are varied goals for which qualitative research questions are developed. These questions can function in several ways, such as to 1) identify and describe existing conditions ( contextual research question s); 2) describe a phenomenon ( descriptive research questions ); 3) assess the effectiveness of existing methods, protocols, theories, or procedures ( evaluation research questions ); 4) examine a phenomenon or analyze the reasons or relationships between subjects or phenomena ( explanatory research questions ); or 5) focus on unknown aspects of a particular topic ( exploratory research questions ). 5 In addition, some qualitative research questions provide new ideas for the development of theories and actions ( generative research questions ) or advance specific ideologies of a position ( ideological research questions ). 1 Other qualitative research questions may build on a body of existing literature and become working guidelines ( ethnographic research questions ). Research questions may also be broadly stated without specific reference to the existing literature or a typology of questions ( phenomenological research questions ), may be directed towards generating a theory of some process ( grounded theory questions ), or may address a description of the case and the emerging themes ( qualitative case study questions ). 15 We provide examples of contextual, descriptive, evaluation, explanatory, exploratory, generative, ideological, ethnographic, phenomenological, grounded theory, and qualitative case study research questions in qualitative research in Table 4 , and the definition of qualitative hypothesis-generating research in Table 5 .

Qualitative research questions

Contextual research question
- Asks about the nature of what already exists
- Individuals or groups function to further clarify and understand the natural context of real-world problems
Example: What are the experiences of nurses working night shifts in healthcare during the COVID-19 pandemic? (natural context of real-world problems)

Descriptive research question
- Aims to describe a phenomenon
Example: What are the different forms of disrespect and abuse (phenomenon) experienced by Tanzanian women when giving birth in healthcare facilities?

Evaluation research question
- Examines the effectiveness of existing practice or accepted frameworks
Example: How effective are decision aids (effectiveness of existing practice) in helping decide whether to give birth at home or in a healthcare facility?

Explanatory research question
- Clarifies a previously studied phenomenon and explains why it occurs
Example: Why is there an increase in teenage pregnancy (phenomenon) in Tanzania?

Exploratory research question
- Explores areas that have not been fully investigated to gain a deeper understanding of the research problem
Example: What factors affect the mental health of medical students (areas that have not yet been fully investigated) during the COVID-19 pandemic?

Generative research question
- Develops an in-depth understanding of people’s behavior by asking ‘how would’ or ‘what if’ to identify problems and find solutions
Example: How would the extensive research experience of the behavior of new staff impact the success of the novel drug initiative?

Ideological research question
- Aims to advance specific ideas or ideologies of a position
Example: Are Japanese nurses who volunteer in remote African hospitals able to promote humanized care of patients (specific ideas or ideologies) in the areas of safe patient environment, respect of patient privacy, and provision of accurate information related to health and care?

Ethnographic research question
- Clarifies peoples’ nature, activities, their interactions, and the outcomes of their actions in specific settings
Example: What are the demographic characteristics, rehabilitative treatments, community interactions, and disease outcomes (nature, activities, their interactions, and the outcomes) of people in China who are suffering from pneumoconiosis?

Phenomenological research question
- Seeks to know more about the phenomena that have impacted an individual
Example: What are the lived experiences of parents who have been living with and caring for children with a diagnosis of autism? (phenomena that have impacted an individual)

Grounded theory question
- Focuses on social processes, asking what happens and how people interact, or uncovering social relationships and behaviors of groups
Example: What are the problems that pregnant adolescents face in terms of social and cultural norms (social processes), and how can these be addressed?

Qualitative case study question
- Assesses a phenomenon using different sources of data to answer “why” and “how” questions
- Considers how the phenomenon is influenced by its contextual situation
Example: How does quitting work and assuming the role of a full-time mother (phenomenon assessed) change the lives of women in Japan?

Qualitative research hypotheses

Hypothesis-generating (qualitative hypothesis-generating research)
- Qualitative research uses inductive reasoning.
- This involves data collection from study participants or the literature regarding a phenomenon of interest, using the collected data to develop a formal hypothesis, and using the formal hypothesis as a framework for testing the hypothesis.
- Qualitative exploratory studies explore areas deeper, clarifying subjective experience and allowing formulation of a formal hypothesis potentially testable in a future quantitative approach.

Qualitative studies usually pose at least one central research question and several subquestions starting with How or What . These research questions use exploratory verbs such as explore or describe . These also focus on one central phenomenon of interest, and may mention the participants and research site. 15

Hypotheses in qualitative research

Hypotheses in qualitative research are stated in the form of a clear statement concerning the problem to be investigated. Unlike in quantitative research where hypotheses are usually developed to be tested, qualitative research can lead to both hypothesis-testing and hypothesis-generating outcomes. 2 When studies require both quantitative and qualitative research questions, this suggests an integrative process between both research methods wherein a single mixed-methods research question can be developed. 1

FRAMEWORKS FOR DEVELOPING RESEARCH QUESTIONS AND HYPOTHESES

Research questions followed by hypotheses should be developed before the start of the study. 1 , 12 , 14 It is crucial to develop feasible research questions on a topic that is interesting to both the researcher and the scientific community. This can be achieved by a meticulous review of previous and current studies to establish a novel topic. Specific areas are subsequently focused on to generate ethical research questions. The relevance of the research questions is evaluated in terms of clarity of the resulting data, specificity of the methodology, objectivity of the outcome, depth of the research, and impact of the study. 1 , 5 These aspects constitute the FINER criteria (i.e., Feasible, Interesting, Novel, Ethical, and Relevant). 1 Clarity and effectiveness are achieved if research questions meet the FINER criteria. In addition to the FINER criteria, Ratan et al. described focus, complexity, novelty, feasibility, and measurability for evaluating the effectiveness of research questions. 14

The PICOT and PEO frameworks are also used when developing research questions. 1 The following elements are addressed in these frameworks, PICOT: P-population/patients/problem, I-intervention or indicator being studied, C-comparison group, O-outcome of interest, and T-timeframe of the study; PEO: P-population being studied, E-exposure to preexisting conditions, and O-outcome of interest. 1 Research questions are also considered good if these meet the “FINERMAPS” framework: Feasible, Interesting, Novel, Ethical, Relevant, Manageable, Appropriate, Potential value/publishable, and Systematic. 14

As we indicated earlier, research questions and hypotheses that are not carefully formulated result in unethical studies or poor outcomes. To illustrate this, we provide some examples of ambiguous research questions and hypotheses that result in unclear and weak research objectives in quantitative research ( Table 6 ) 16 and qualitative research ( Table 7 ) 17 , and show how to transform these ambiguous research questions and hypotheses into clear and good statements.

Research question
  • Unclear and weak statement (Statement 1): Which is more effective between smoke moxibustion and smokeless moxibustion?
  • Clear and good statement (Statement 2): “Moreover, regarding smoke moxibustion versus smokeless moxibustion, it remains unclear which is more effective, safe, and acceptable to pregnant women, and whether there is any difference in the amount of heat generated.”
  • Points to avoid: 1) vague and unfocused questions; 2) closed questions simply answerable by yes or no; 3) questions requiring a simple choice

Hypothesis
  • Unclear and weak statement (Statement 1): The smoke moxibustion group will have higher cephalic presentation.
  • Clear and good statement (Statement 2): “Hypothesis 1. The smoke moxibustion stick group (SM group) and smokeless moxibustion stick group (SLM group) will have higher rates of cephalic presentation after treatment than the control group. Hypothesis 2. The SM group and SLM group will have higher rates of cephalic presentation at birth than the control group. Hypothesis 3. There will be no significant differences in the well-being of the mother and child among the three groups in terms of the following outcomes: premature birth, premature rupture of membranes (PROM) at < 37 weeks, Apgar score < 7 at 5 min, umbilical cord blood pH < 7.1, admission to neonatal intensive care unit (NICU), and intrauterine fetal death.”
  • Points to avoid: 1) unverifiable hypotheses; 2) incompletely stated groups of comparison; 3) insufficiently described variables or outcomes

Research objective
  • Unclear and weak statement (Statement 1): To determine which is more effective between smoke moxibustion and smokeless moxibustion.
  • Clear and good statement (Statement 2): “The specific aims of this pilot study were (a) to compare the effects of smoke moxibustion and smokeless moxibustion treatments with the control group as a possible supplement to ECV for converting breech presentation to cephalic presentation and increasing adherence to the newly obtained cephalic position, and (b) to assess the effects of these treatments on the well-being of the mother and child.”
  • Points to avoid: 1) poor understanding of the research question and hypotheses; 2) insufficient description of population, variables, or study outcomes

a These statements were composed for comparison and illustrative purposes only.

b These statements are direct quotes from Higashihara and Horiuchi. 16

Research question
  • Unclear and weak statement (Statement 1): Does disrespect and abuse (D&A) occur in childbirth in Tanzania?
  • Clear and good statement (Statement 2): How does disrespect and abuse (D&A) occur, and what are the types of physical and psychological abuses observed in midwives’ actual care during facility-based childbirth in urban Tanzania?
  • Points to avoid: 1) ambiguous or oversimplistic questions; 2) questions unverifiable by data collection and analysis

Hypothesis
  • Unclear and weak statement (Statement 1): Disrespect and abuse (D&A) occur in childbirth in Tanzania.
  • Clear and good statement (Statement 2): Hypothesis 1: Several types of physical and psychological abuse by midwives in actual care occur during facility-based childbirth in urban Tanzania. Hypothesis 2: Weak nursing and midwifery management contribute to the D&A of women during facility-based childbirth in urban Tanzania.
  • Points to avoid: 1) statements simply expressing facts; 2) insufficiently described concepts or variables

Research objective
  • Unclear and weak statement (Statement 1): To describe disrespect and abuse (D&A) in childbirth in Tanzania.
  • Clear and good statement (Statement 2): “This study aimed to describe from actual observations the respectful and disrespectful care received by women from midwives during their labor period in two hospitals in urban Tanzania.” a
  • Points to avoid: 1) statements unrelated to the research question and hypotheses; 2) unattainable or unexplorable objectives

a This statement is a direct quote from Shimoda et al. 17

The other statements were composed for comparison and illustrative purposes only.

CONSTRUCTING RESEARCH QUESTIONS AND HYPOTHESES

To construct effective research questions and hypotheses, it is very important to 1) clarify the background and 2) identify the research problem at the outset of the research, within a specific timeframe. 9 Then, 3) review or conduct preliminary research to collect all available knowledge about the possible research questions by studying theories and previous studies. 18 Afterwards, 4) construct research questions to investigate the research problem. Identify variables to be accessed from the research questions 4 and make operational definitions of constructs from the research problem and questions. Thereafter, 5) construct specific deductive or inductive predictions in the form of hypotheses. 4 Finally, 6) state the study aims . This general flow for constructing effective research questions and hypotheses prior to conducting research is shown in Fig. 1 .

[Fig. 1. General flow for constructing research questions and hypotheses prior to conducting research (jkms-37-e121-g001.jpg).]

Research questions are used more frequently in qualitative research than objectives or hypotheses. 3 These questions seek to discover, understand, explore or describe experiences by asking “What” or “How.” The questions are open-ended to elicit a description rather than to relate variables or compare groups. The questions are continually reviewed, reformulated, and changed during the qualitative study. 3 Research questions are also used more frequently in survey projects than hypotheses in experiments in quantitative research to compare variables and their relationships.

Hypotheses are constructed based on the variables identified and as an if-then statement, following the template, ‘If a specific action is taken, then a certain outcome is expected.’ At this stage, some ideas regarding expectations from the research to be conducted must be drawn. 18 Then, the variables to be manipulated (independent) and influenced (dependent) are defined. 4 Thereafter, the hypothesis is stated and refined, and reproducible data tailored to the hypothesis are identified, collected, and analyzed. 4 The hypotheses must be testable and specific, 18 and should describe the variables and their relationships, the specific group being studied, and the predicted research outcome. 18 Hypothesis construction involves a testable proposition to be deduced from theory, with independent and dependent variables to be separated and measured separately. 3 Therefore, good hypotheses must be based on good research questions constructed at the start of a study or trial. 12

In summary, research questions are constructed after establishing the background of the study. Hypotheses are then developed based on the research questions. Thus, it is crucial to have excellent research questions to generate superior hypotheses. In turn, these would determine the research objectives and the design of the study, and ultimately, the outcome of the research. 12 Algorithms for building research questions and hypotheses are shown in Fig. 2 for quantitative research and in Fig. 3 for qualitative research.

[Fig. 2. Algorithm for building research questions and hypotheses for quantitative research (jkms-37-e121-g002.jpg).]

EXAMPLES OF RESEARCH QUESTIONS FROM PUBLISHED ARTICLES

  • EXAMPLE 1. Descriptive research question (quantitative research)
  • - Presents research variables to be assessed (distinct phenotypes and subphenotypes)
  • “BACKGROUND: Since COVID-19 was identified, its clinical and biological heterogeneity has been recognized. Identifying COVID-19 phenotypes might help guide basic, clinical, and translational research efforts.
  • RESEARCH QUESTION: Does the clinical spectrum of patients with COVID-19 contain distinct phenotypes and subphenotypes? ” 19
  • EXAMPLE 2. Relationship research question (quantitative research)
  • - Shows interactions between dependent variable (static postural control) and independent variable (peripheral visual field loss)
  • “Background: Integration of visual, vestibular, and proprioceptive sensations contributes to postural control. People with peripheral visual field loss have serious postural instability. However, the directional specificity of postural stability and sensory reweighting caused by gradual peripheral visual field loss remain unclear.
  • Research question: What are the effects of peripheral visual field loss on static postural control ?” 20
  • EXAMPLE 3. Comparative research question (quantitative research)
  • - Clarifies the difference among groups with an outcome variable (patients enrolled in COMPERA with moderate PH or severe PH in COPD) and another group without the outcome variable (patients with idiopathic pulmonary arterial hypertension (IPAH))
  • “BACKGROUND: Pulmonary hypertension (PH) in COPD is a poorly investigated clinical condition.
  • RESEARCH QUESTION: Which factors determine the outcome of PH in COPD?
  • STUDY DESIGN AND METHODS: We analyzed the characteristics and outcome of patients enrolled in the Comparative, Prospective Registry of Newly Initiated Therapies for Pulmonary Hypertension (COMPERA) with moderate or severe PH in COPD as defined during the 6th PH World Symposium who received medical therapy for PH and compared them with patients with idiopathic pulmonary arterial hypertension (IPAH) .” 21
  • EXAMPLE 4. Exploratory research question (qualitative research)
  • - Explores areas that have not been fully investigated (perspectives of families and children who receive care in clinic-based child obesity treatment) to have a deeper understanding of the research problem
  • “Problem: Interventions for children with obesity lead to only modest improvements in BMI and long-term outcomes, and data are limited on the perspectives of families of children with obesity in clinic-based treatment. This scoping review seeks to answer the question: What is known about the perspectives of families and children who receive care in clinic-based child obesity treatment? This review aims to explore the scope of perspectives reported by families of children with obesity who have received individualized outpatient clinic-based obesity treatment.” 22
  • EXAMPLE 5. Relationship research question (quantitative research)
  • - Defines interactions between dependent variable (use of ankle strategies) and independent variable (changes in muscle tone)
  • “Background: To maintain an upright standing posture against external disturbances, the human body mainly employs two types of postural control strategies: “ankle strategy” and “hip strategy.” While it has been reported that the magnitude of the disturbance alters the use of postural control strategies, it has not been elucidated how the level of muscle tone, one of the crucial parameters of bodily function, determines the use of each strategy. We have previously confirmed using forward dynamics simulations of human musculoskeletal models that an increased muscle tone promotes the use of ankle strategies. The objective of the present study was to experimentally evaluate a hypothesis: an increased muscle tone promotes the use of ankle strategies. Research question: Do changes in the muscle tone affect the use of ankle strategies ?” 23

EXAMPLES OF HYPOTHESES IN PUBLISHED ARTICLES

  • EXAMPLE 1. Working hypothesis (quantitative research)
  • - A hypothesis that is initially accepted for further research to produce a feasible theory
  • “As fever may have benefit in shortening the duration of viral illness, it is plausible to hypothesize that the antipyretic efficacy of ibuprofen may be hindering the benefits of a fever response when taken during the early stages of COVID-19 illness .” 24
  • “In conclusion, it is plausible to hypothesize that the antipyretic efficacy of ibuprofen may be hindering the benefits of a fever response . The difference in perceived safety of these agents in COVID-19 illness could be related to the more potent efficacy to reduce fever with ibuprofen compared to acetaminophen. Compelling data on the benefit of fever warrant further research and review to determine when to treat or withhold ibuprofen for early stage fever for COVID-19 and other related viral illnesses .” 24
  • EXAMPLE 2. Exploratory hypothesis (qualitative research)
  • - Explores particular areas deeper to clarify subjective experience and develop a formal hypothesis potentially testable in a future quantitative approach
  • “We hypothesized that when thinking about a past experience of help-seeking, a self distancing prompt would cause increased help-seeking intentions and more favorable help-seeking outcome expectations .” 25
  • “Conclusion
  • Although a priori hypotheses were not supported, further research is warranted as results indicate the potential for using self-distancing approaches to increasing help-seeking among some people with depressive symptomatology.” 25
  • EXAMPLE 3. Hypothesis-generating research to establish a framework for hypothesis testing (qualitative research)
  • “We hypothesize that compassionate care is beneficial for patients (better outcomes), healthcare systems and payers (lower costs), and healthcare providers (lower burnout). ” 26
  • Compassionomics is the branch of knowledge and scientific study of the effects of compassionate healthcare. Our main hypotheses are that compassionate healthcare is beneficial for (1) patients, by improving clinical outcomes, (2) healthcare systems and payers, by supporting financial sustainability, and (3) HCPs, by lowering burnout and promoting resilience and well-being. The purpose of this paper is to establish a scientific framework for testing the hypotheses above . If these hypotheses are confirmed through rigorous research, compassionomics will belong in the science of evidence-based medicine, with major implications for all healthcare domains.” 26
  • EXAMPLE 4. Statistical hypothesis (quantitative research)
  • - An assumption is made about the relationship among several population characteristics (gender differences in sociodemographic and clinical characteristics of adults with ADHD). Validity is tested by statistical experiment or analysis (chi-square test, Student's t-test, and logistic regression analysis)
  • “Our research investigated gender differences in sociodemographic and clinical characteristics of adults with ADHD in a Japanese clinical sample. Due to unique Japanese cultural ideals and expectations of women's behavior that are in opposition to ADHD symptoms, we hypothesized that women with ADHD experience more difficulties and present more dysfunctions than men . We tested the following hypotheses: first, women with ADHD have more comorbidities than men with ADHD; second, women with ADHD experience more social hardships than men, such as having less full-time employment and being more likely to be divorced.” 27
  • “Statistical Analysis
  • ( text omitted ) Between-gender comparisons were made using the chi-squared test for categorical variables and Student's t-test for continuous variables…( text omitted ). A logistic regression analysis was performed for employment status, marital status, and comorbidity to evaluate the independent effects of gender on these dependent variables.” 27

EXAMPLES OF HYPOTHESIS AS WRITTEN IN PUBLISHED ARTICLES IN RELATION TO OTHER PARTS

  • EXAMPLE 1. Background, hypotheses, and aims are provided
  • “Pregnant women need skilled care during pregnancy and childbirth, but that skilled care is often delayed in some countries …( text omitted ). The focused antenatal care (FANC) model of WHO recommends that nurses provide information or counseling to all pregnant women …( text omitted ). Job aids are visual support materials that provide the right kind of information using graphics and words in a simple and yet effective manner. When nurses are not highly trained or have many work details to attend to, these job aids can serve as a content reminder for the nurses and can be used for educating their patients (Jennings, Yebadokpo, Affo, & Agbogbe, 2010) ( text omitted ). Importantly, additional evidence is needed to confirm how job aids can further improve the quality of ANC counseling by health workers in maternal care …( text omitted )” 28
  • “ This has led us to hypothesize that the quality of ANC counseling would be better if supported by job aids. Consequently, a better quality of ANC counseling is expected to produce higher levels of awareness concerning the danger signs of pregnancy and a more favorable impression of the caring behavior of nurses .” 28
  • “This study aimed to examine the differences in the responses of pregnant women to a job aid-supported intervention during ANC visit in terms of 1) their understanding of the danger signs of pregnancy and 2) their impression of the caring behaviors of nurses to pregnant women in rural Tanzania.” 28
  • EXAMPLE 2. Background, hypotheses, and aims are provided
  • “We conducted a two-arm randomized controlled trial (RCT) to evaluate and compare changes in salivary cortisol and oxytocin levels of first-time pregnant women between experimental and control groups. The women in the experimental group touched and held an infant for 30 min (experimental intervention protocol), whereas those in the control group watched a DVD movie of an infant (control intervention protocol). The primary outcome was salivary cortisol level and the secondary outcome was salivary oxytocin level.” 29
  • “ We hypothesize that at 30 min after touching and holding an infant, the salivary cortisol level will significantly decrease and the salivary oxytocin level will increase in the experimental group compared with the control group .” 29
  • EXAMPLE 3. Background, aim, and hypothesis are provided
  • “In countries where the maternal mortality ratio remains high, antenatal education to increase Birth Preparedness and Complication Readiness (BPCR) is considered one of the top priorities [1]. BPCR includes birth plans during the antenatal period, such as the birthplace, birth attendant, transportation, health facility for complications, expenses, and birth materials, as well as family coordination to achieve such birth plans. In Tanzania, although increasing, only about half of all pregnant women attend an antenatal clinic more than four times [4]. Moreover, the information provided during antenatal care (ANC) is insufficient. In the resource-poor settings, antenatal group education is a potential approach because of the limited time for individual counseling at antenatal clinics.” 30
  • “This study aimed to evaluate an antenatal group education program among pregnant women and their families with respect to birth-preparedness and maternal and infant outcomes in rural villages of Tanzania.” 30
  • “ The study hypothesis was if Tanzanian pregnant women and their families received a family-oriented antenatal group education, they would (1) have a higher level of BPCR, (2) attend antenatal clinic four or more times, (3) give birth in a health facility, (4) have less complications of women at birth, and (5) have less complications and deaths of infants than those who did not receive the education .” 30

Research questions and hypotheses are crucial components to any type of research, whether quantitative or qualitative. These questions should be developed at the very beginning of the study. Excellent research questions lead to superior hypotheses, which, like a compass, set the direction of research, and can often determine the successful conduct of the study. Many research studies have floundered because the development of research questions and subsequent hypotheses was not given the thought and meticulous attention needed. The development of research questions and hypotheses is an iterative process based on extensive knowledge of the literature and insightful grasp of the knowledge gap. Focused, concise, and specific research questions provide a strong foundation for constructing hypotheses which serve as formal predictions about the research outcomes. Research questions and hypotheses are crucial elements of research that should not be overlooked. They should be carefully thought of and constructed when planning research. This avoids unethical studies and poor outcomes by defining well-founded objectives that determine the design, course, and outcome of the study.

Disclosure: The authors have no potential conflicts of interest to disclose.

Author Contributions:

  • Conceptualization: Barroga E, Matanguihan GJ.
  • Methodology: Barroga E, Matanguihan GJ.
  • Writing - original draft: Barroga E, Matanguihan GJ.
  • Writing - review & editing: Barroga E, Matanguihan GJ.


How to... Conduct empirical research


Empirical research is research that is based on observation and measurement of phenomena, as directly experienced by the researcher. The data thus gathered may be compared against a theory or hypothesis, but the results are still based on real life experience. The data gathered is all primary data, although secondary data from a literature review may form the theoretical background.

On this page

  • What is empirical research
  • The research question
  • The theoretical framework
  • Sampling techniques
  • Design of the research

  • Methods of empirical research
  • Techniques of data collection & analysis
  • Reporting the findings of empirical research
  • Further information

Typically, empirical research embodies the following elements:

  • A  research question , which will determine research objectives.
  • A particular and planned  design  for the research, which will depend on the question and which will find ways of answering it with appropriate use of resources.
  • The gathering of  primary data , which is then analysed.
  • A particular  methodology  for collecting and analysing the data, such as an experiment or survey.
  • The limitation of the data to a particular group, area or time scale, known as a sample: for example, a specific number of employees of a particular company type, or all users of a library over a given time scale. The sample should be somehow representative of a wider population.
  • The ability to  recreate  the study and test the results. This is known as  reliability .
  • The ability to  generalise  from the findings to a larger sample and to other situations.

The starting point for your research should be your research question. This should be a formulation of the issue at the heart of the area you are researching, and it should have the right degree of breadth and depth to make the research feasible within your resources. The following points are useful to remember when coming up with your research question, or RQ:

  • Ideas for RQs can come from a number of sources, for example:
    - your doctoral thesis;
    - reading the relevant literature in journals, especially literature reviews, which are good at giving an overview and at spotting interesting conceptual developments;
    - looking at the research priorities of funding bodies, professional institutes etc.;
    - going to conferences;
    - looking out for calls for papers;
    - developing a dialogue with other researchers in your area.
  • To narrow down your research topic, brainstorm ideas around it, possibly with your colleagues if you have decided to collaborate, noting all the questions down.
  • Come up with a "general focus" question; then develop some other more specific ones.
  • Check your questions to make sure that:
    - they are not too broad;
    - they are not so narrow as to yield uninteresting results;
    - the research entailed will be covered by your resources, i.e. you will have sufficient time and money;
    - there is sufficient background literature on the topic;
    - you can carry out appropriate field research;
    - you have stated your question in the simplest possible way.

Let's look at some examples:

Bisking et al. examine whether or not gender has an influence on disciplinary action in their article  Does the sex of the leader and subordinate influence a leader's disciplinary decisions?  ( Management Decision , Volume 41 Number 10) and come up with the following series of inter-related questions:

  • Given the same infraction, would a male leader impose the same disciplinary action on male and female subordinates?
  • Given the same infraction, would a female leader impose the same disciplinary action on male and female subordinates?
  • Given the same infraction, would a female leader impose the same disciplinary action on female subordinates as a male leader would on male subordinates?
  • Given the same infraction, would a female leader impose the same disciplinary action on male subordinates as a male leader would on female subordinates?
  • Given the same infraction, would a male and female leader impose the same disciplinary action on male subordinates?
  • Given the same infraction, would a male and female leader impose the same disciplinary action on female subordinates?
  • Do female and male leaders impose the same discipline on subordinates regardless of the type of infraction?
  • Is it possible to predict how female and male leaders will impose disciplinary actions based on their respective BSRI femininity and masculinity scores?

Motion et al. examined co-branding in  Equity in Corporate Co-branding  ( European Journal of Marketing , Volume 37 Number 7/8) and came up with the following RQs:

RQ1:  What objectives underpinned the corporate brand?

RQ2:  How were brand values deployed to establish the corporate co-brand within particular discourse contexts?

RQ3:  How was the desired rearticulation promoted to shareholders?

RQ4:  What are the sources of corporate co-brand equity?

Note, the above two examples state the RQs very explicitly; sometimes the RQ is implicit:

Qun G. Jiao and Anthony J. Onwuegbuzie are library researchers who examined the question "What is the relationship between library anxiety and social interdependence?" in a number of articles; see Dimensions of library anxiety and social interdependence: implications for library services ( Library Review , Volume 51 Number 2).

Or sometimes the RQ is stated as a general objective:

Ying Fan describes outsourcing in British companies in  Strategic outsourcing: evidence from British companies  ( Marketing Intelligence & Planning , Volume 18 Number 4) and states his research question as an objective:

The main objective of the research was to explore the two key areas in the outsourcing process, namely:

  • pre-outsourcing decision process; and
  • post-outsourcing supplier management.

or as a proposition:

Karin Klenke explores issues of gender in management decisions in  Gender influences in decision-making processes in top management teams   ( Management Decision , Volume 41 Number 10).

Given the exploratory nature of this research, no specific hypotheses were formulated. Instead, the following general propositions are postulated:

P1.  Female and male members of TMTs exercise different types of power in the strategic decision making process.

P2.  Female and male members of TMTs differ in the extent in which they employ political savvy in the strategic decision making process.

P3.  Male and female members of TMTs manage conflict in strategic decision making situations differently.

P4.  Female and male members of TMTs utilise different types of trust in the decision making process.

Sometimes, the theoretical underpinning (see next section) of the research leads you to formulate a hypothesis rather than a question:

Martin et al. explored the effect of fast-forwarding of ads (called zipping) in Remote control marketing: how ad fast-forwarding and ad repetition affect consumers ( Marketing Intelligence & Planning , Volume 20 Number 1); their research explores the following hypotheses:

The influence of zipping

H1. Individuals viewing advertisements played at normal speed will exhibit higher ad recall and recognition than those who view zipped advertisements.

Ad repetition effects

H2. Individuals viewing a repeated advertisement will exhibit higher ad recall and recognition than those who see an advertisement once.

Zipping and ad repetition

H3. Individuals viewing zipped, repeated advertisements will exhibit higher ad recall and recognition than those who see a normal speed advertisement that is played once.

Empirical research is not divorced from theoretical considerations; and a consideration of theory should form one of the starting points of your research. This applies particularly in the case of management research which by its very nature is practical and applied to the real world. The link between research and theory is symbiotic: theory should inform research, and the findings of research should inform theory.

There are a number of different theoretical perspectives; if you are unfamiliar with them, we suggest that you look at any good research methods textbook for a full account (see Further information), but this page will contain notes on the following:

Positivism

This is the approach of the natural sciences, emphasising total objectivity and independence on the part of the researcher, and a highly scientific methodology, with data being collected in a value-free manner and analysed using quantitative techniques with statistical measures. It assumes that there are 'independent facts' in the social world as in the natural world. The object is to generalise from what has been observed and hence add to the body of theory.

Empiricism

Empiricism is very similar to positivism in that it has a strong reliance on objectivity and quantitative methods of data collection, but with less of a reliance on theory. There is an emphasis on data and facts in their own right; they do not need to be linked to theory.

Interpretivism

This view criticises positivism as being inappropriate for the social world of business and management which is dominated by people rather than the laws of nature and hence has an inevitable subjective element as people will have different interpretations of situations and events. The business world can only be understood through people's interpretation. This view is more likely to emphasise qualitative methods such as participant observation, focus groups and semi-structured interviewing.

 
How quantitative and qualitative methods compare:

  • Quantitative methods typically use experiments and questionnaires; qualitative methods typically use interviews, focus groups and observation.
  • Quantitative methods involve the researcher as, ideally, an independent observer; qualitative methods require more involvement and interpretation on the part of the researcher.
  • Quantitative research may focus on cause and effect; qualitative research focuses on understanding of phenomena in their social, institutional, political and economic context.
  • Quantitative methods require a hypothesis; qualitative methods require a research question.
  • Quantitative methods have the disadvantage that they may force people into categories, and cannot go into much depth about subjects and issues; qualitative methods have the disadvantage that they focus on a few individuals, and may therefore be difficult to generalise.

Realism

While reality exists independently of human experience, people are not like objects in the natural world but are subject to social influences and processes. Like empiricism and positivism, realism emphasises the importance of explanation, but it is also concerned with the social world and with its underlying structures.

Inductive and deductive approaches

At what point in your research you bring in a theoretical perspective will depend on whether you choose an:

  • Inductive approach  – collect the data, then develop the theory.
  • Deductive approach  – assume a theoretical position then test it against the data.
The inductive approach:

  • is more usually linked with an interpretivist perspective;
  • is more likely to use qualitative methods, such as interviewing, observation etc., with a more flexible structure;
  • does not simply look at cause and effect, but at people's perceptions of events, and at the context of the research;
  • builds theory after collection of the data;
  • is more likely to use an in-depth study of a smaller sample;
  • is less likely to be concerned with generalisation (a danger is that no patterns emerge);
  • stresses the researcher's involvement.

The deductive approach:

  • is more usually linked with a positivist perspective;
  • is more likely to use quantitative methods, such as experiments, questionnaires etc., and a highly structured methodology with controls;
  • is the more scientific method, concerned with cause and effect, and the relationship between variables;
  • starts from a theoretical perspective, and develops a hypothesis which is tested against the data;
  • is more likely to use a larger sample;
  • is concerned with generalisation;
  • stresses the independence of the researcher.

It should be emphasised that none of the above approaches is mutually exclusive; they can be used in combination.

Sampling may be done in one of several ways (a short code sketch follows the list):

  • On a random basis – a given number is selected completely at random.
  • On a systematic basis – every nth element of the population is selected.
  • On a stratified random basis – the population is divided into segments; for example, in a university, you could divide the population into academics, administrators, and academic-related staff. A random sample is then drawn from each group.
  • On a cluster basis – a particular subgroup is chosen at random.
  • On a convenience basis – sampling those present at a particular time, e.g. at lunch in the canteen.
  • On a purposive basis – people are selected deliberately because their views are relevant to the issue concerned.
  • On a quota basis – the assumption is made that there are subgroups in the population, and a quota of respondents is chosen to reflect this diversity.
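To make these schemes concrete, here is a minimal Python sketch (pandas is assumed to be installed; the `population` DataFrame and its role groups are invented for illustration) drawing random, systematic, stratified and cluster samples:

```python
import pandas as pd

# Hypothetical population of 100 university staff in three roles.
population = pd.DataFrame({
    "person_id": range(1, 101),
    "role": ["academic"] * 50 + ["administrator"] * 30 + ["academic related"] * 20,
})

# Random sampling: a given number selected completely at random.
random_sample = population.sample(n=10, random_state=42)

# Systematic sampling: every nth element of the population.
n = 10
systematic_sample = population.iloc[::n]

# Stratified random sampling: a random fraction drawn from each segment.
stratified_sample = population.groupby("role").sample(frac=0.1, random_state=42)

# Cluster sampling: one subgroup chosen at random and used in full.
chosen_role = population["role"].sample(n=1, random_state=42).iloc[0]
cluster_sample = population[population["role"] == chosen_role]

print(len(random_sample), len(systematic_sample),
      len(stratified_sample), len(cluster_sample))
```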

Useful articles

Richard Laughlin in  Empirical research in accounting: alternative approaches and a case for "middle-range" thinking  provides an interesting general overview of the different perspectives on theory and methodology as applied to accounting. ( Accounting, Auditing & Accountability Journal,  Volume 8 Number 1).

D. Tranfield and K. Starkey in  The Nature, Social Organization and Promotion of Management Research: Towards Policy  look at the relationship between theory and practice in management research, and develop a number of analytical frameworks, including looking at Becher's conceptual schema for disciplines and Gibbons et al.'s taxonomy of knowledge production systems. ( British Journal of Management , vol. 9, no. 4 – abstract only).

Research design is about how you go about answering your question: what strategy you adopt, and what methods you use to achieve your results.



Data Analysis Techniques in Research – Methods, Tools & Examples


Varun Saharawat is a seasoned professional in the fields of SEO and content writing. With a profound knowledge of the intricate aspects of these disciplines, Varun has established himself as a valuable asset in the world of digital marketing and online content creation.

Data analysis techniques in research are essential because they allow researchers to derive meaningful insights from data sets to support their hypotheses or research objectives.


Data Analysis Techniques in Research: While various groups, institutions, and professionals may have diverse approaches to data analysis, a universal definition captures its essence. Data analysis involves refining, transforming, and interpreting raw data to derive actionable insights that guide informed decision-making for businesses.

A straightforward illustration of data analysis emerges when we make everyday decisions, basing our choices on past experiences or predictions of potential outcomes.

If you want to learn more about this topic and acquire valuable skills that will set you apart in today’s data-driven world, we highly recommend enrolling in the Data Analytics Course by Physics Wallah . And as a special offer for our readers, use the coupon code “READER” to get a discount on this course.


What is Data Analysis?

Data analysis is the systematic process of inspecting, cleaning, transforming, and interpreting data with the objective of discovering valuable insights and drawing meaningful conclusions. This process involves several steps:

  • Inspecting : Initial examination of data to understand its structure, quality, and completeness.
  • Cleaning : Removing errors, inconsistencies, or irrelevant information to ensure accurate analysis.
  • Transforming : Converting data into a format suitable for analysis, such as normalization or aggregation.
  • Interpreting : Analyzing the transformed data to identify patterns, trends, and relationships.
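As a rough illustration of these four steps, the sketch below (Python with pandas assumed; the survey responses are invented) walks a small dataset through inspection, cleaning, transformation and interpretation:

```python
import pandas as pd

# Hypothetical raw survey responses (a stand-in for a real data source).
df = pd.DataFrame({
    "age": [23, 23, 41, 35, None, 58, 62, 29],
    "satisfaction": [7, 7, 9, 6, 8, None, 9, 5],
})

# 1) Inspecting: structure, quality, and completeness.
df.info()
print(df.isna().sum())

# 2) Cleaning: remove duplicate rows and rows with missing values.
df = df.drop_duplicates().dropna(subset=["age", "satisfaction"])

# 3) Transforming: aggregate satisfaction scores by age band.
df["age_band"] = pd.cut(df["age"], bins=[0, 30, 50, 120],
                        labels=["<30", "30-50", "50+"])
summary = df.groupby("age_band", observed=True)["satisfaction"].mean()

# 4) Interpreting: read off the aggregated pattern.
print(summary)
```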

Types of Data Analysis Techniques in Research

Data analysis techniques in research are categorized into qualitative and quantitative methods, each with its specific approaches and tools. These techniques are instrumental in extracting meaningful insights, patterns, and relationships from data to support informed decision-making, validate hypotheses, and derive actionable recommendations. Below is an in-depth exploration of the various types of data analysis techniques commonly employed in research:

1) Qualitative Analysis:

Definition: Qualitative analysis focuses on understanding non-numerical data, such as opinions, concepts, or experiences, to derive insights into human behavior, attitudes, and perceptions.

  • Content Analysis: Examines textual data, such as interview transcripts, articles, or open-ended survey responses, to identify themes, patterns, or trends.
  • Narrative Analysis: Analyzes personal stories or narratives to understand individuals’ experiences, emotions, or perspectives.
  • Ethnographic Studies: Involves observing and analyzing cultural practices, behaviors, and norms within specific communities or settings.

2) Quantitative Analysis:

Quantitative analysis emphasizes numerical data and employs statistical methods to explore relationships, patterns, and trends. It encompasses several approaches:

Descriptive Analysis:

  • Frequency Distribution: Represents the number of occurrences of distinct values within a dataset.
  • Central Tendency: Measures such as mean, median, and mode provide insights into the central values of a dataset.
  • Dispersion: Techniques like variance and standard deviation indicate the spread or variability of data.
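These three families of descriptive measures can each be computed in a line or two; the sketch below uses Python with pandas on invented exam scores:

```python
import pandas as pd

scores = pd.Series([55, 62, 62, 70, 71, 75, 80, 84, 90, 95])

# Frequency distribution: occurrences of values, binned into grade bands.
print(pd.cut(scores, bins=[50, 60, 70, 80, 90, 100]).value_counts().sort_index())

# Central tendency: mean, median, and mode.
print(scores.mean(), scores.median(), scores.mode().tolist())

# Dispersion: variance and standard deviation.
print(scores.var(), scores.std())
```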

Diagnostic Analysis:

  • Regression Analysis: Assesses the relationship between dependent and independent variables, enabling prediction or understanding causality.
  • ANOVA (Analysis of Variance): Examines differences between groups to identify significant variations or effects.

Predictive Analysis:

  • Time Series Forecasting: Uses historical data points to predict future trends or outcomes.
  • Machine Learning Algorithms: Techniques like decision trees, random forests, and neural networks predict outcomes based on patterns in data.

Prescriptive Analysis:

  • Optimization Models: Utilizes linear programming, integer programming, or other optimization techniques to identify the best solutions or strategies.
  • Simulation: Mimics real-world scenarios to evaluate various strategies or decisions and determine optimal outcomes.

Specific Techniques:

  • Monte Carlo Simulation: Models probabilistic outcomes to assess risk and uncertainty.
  • Factor Analysis: Reduces the dimensionality of data by identifying underlying factors or components.
  • Cohort Analysis: Studies specific groups or cohorts over time to understand trends, behaviors, or patterns within these groups.
  • Cluster Analysis: Classifies objects or individuals into homogeneous groups or clusters based on similarities or attributes.
  • Sentiment Analysis: Uses natural language processing and machine learning techniques to determine sentiment, emotions, or opinions from textual data.
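To show what one of these techniques looks like in code, here is a minimal cluster analysis sketch using k-means (scikit-learn assumed installed; the customer features are invented for illustration, and k-means is only one of several clustering algorithms that could be applied):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical customer features: annual spend and visits per month.
X = np.array([[200, 2], [220, 3], [800, 10], [750, 12], [400, 5], [430, 6]])

# Standardize so both attributes contribute comparably to distances.
X_scaled = StandardScaler().fit_transform(X)

# Classify the customers into homogeneous clusters based on similarity.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X_scaled)
print(kmeans.labels_)
```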


Data Analysis Techniques in Research Examples

To provide a clearer understanding of how data analysis techniques are applied in research, let’s consider a hypothetical research study focused on evaluating the impact of online learning platforms on students’ academic performance.

Research Objective:

Determine if students using online learning platforms achieve higher academic performance compared to those relying solely on traditional classroom instruction.

Data Collection:

  • Quantitative Data: Academic scores (grades) of students using online platforms and those using traditional classroom methods.
  • Qualitative Data: Feedback from students regarding their learning experiences, challenges faced, and preferences.

Data Analysis Techniques Applied:

1) Descriptive Analysis:

  • Calculate the mean, median, and mode of academic scores for both groups.
  • Create frequency distributions to represent the distribution of grades in each group.

2) Diagnostic Analysis:

  • Conduct an Analysis of Variance (ANOVA) to determine if there’s a statistically significant difference in academic scores between the two groups.
  • Perform Regression Analysis to assess the relationship between the time spent on online platforms and academic performance.
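A hedged sketch of the ANOVA step above, using invented scores for the two groups (scipy assumed installed); with exactly two groups, a one-way ANOVA is equivalent to an independent-samples t-test, shown for comparison:

```python
from scipy import stats

# Hypothetical exam scores for the two groups in the example study.
online = [78, 85, 82, 90, 74, 88]
classroom = [72, 80, 75, 70, 79, 77]

# One-way ANOVA: is the difference in group means statistically significant?
f_stat, p_value = stats.f_oneway(online, classroom)
print(f"F = {f_stat:.2f}, p = {p_value:.3f}")

# With two groups, an independent-samples t-test gives the same p-value.
t_stat, p_t = stats.ttest_ind(online, classroom)
print(f"t = {t_stat:.2f}, p = {p_t:.3f}")
```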

3) Predictive Analysis:

  • Utilize Time Series Forecasting to predict future academic performance trends based on historical data.
  • Implement Machine Learning algorithms to develop a predictive model that identifies factors contributing to academic success on online platforms.

4) Prescriptive Analysis:

  • Apply Optimization Models to identify the optimal combination of online learning resources (e.g., video lectures, interactive quizzes) that maximize academic performance.
  • Use Simulation Techniques to evaluate different scenarios, such as varying student engagement levels with online resources, to determine the most effective strategies for improving learning outcomes.

5) Specific Techniques:

  • Conduct Factor Analysis on qualitative feedback to identify common themes or factors influencing students’ perceptions and experiences with online learning.
  • Perform Cluster Analysis to segment students based on their engagement levels, preferences, or academic outcomes, enabling targeted interventions or personalized learning strategies.
  • Apply Sentiment Analysis on textual feedback to categorize students’ sentiments as positive, negative, or neutral regarding online learning experiences.

By applying a combination of qualitative and quantitative data analysis techniques, this research example aims to provide comprehensive insights into the effectiveness of online learning platforms.


Data Analysis Techniques in Quantitative Research

Quantitative research involves collecting numerical data to examine relationships, test hypotheses, and make predictions. Various data analysis techniques are employed to interpret and draw conclusions from quantitative data. Here are some key data analysis techniques commonly used in quantitative research:

1) Descriptive Statistics:

  • Description: Descriptive statistics are used to summarize and describe the main aspects of a dataset, such as central tendency (mean, median, mode), variability (range, variance, standard deviation), and distribution (skewness, kurtosis).
  • Applications: Summarizing data, identifying patterns, and providing initial insights into the dataset.

2) Inferential Statistics:

  • Description: Inferential statistics involve making predictions or inferences about a population based on a sample of data. This technique includes hypothesis testing, confidence intervals, t-tests, chi-square tests, analysis of variance (ANOVA), regression analysis, and correlation analysis.
  • Applications: Testing hypotheses, making predictions, and generalizing findings from a sample to a larger population.
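As a minimal illustration of inferring from a sample to a population, the sketch below (scipy assumed installed; the measurements are invented) runs a one-sample t-test and builds a 95% confidence interval for the population mean:

```python
import numpy as np
from scipy import stats

sample = np.array([5.1, 4.9, 5.4, 5.0, 5.3, 4.8, 5.2, 5.5])

# Hypothesis test: H0 is that the population mean equals 5.0.
t_stat, p_value = stats.ttest_1samp(sample, popmean=5.0)

# 95% confidence interval for the population mean.
ci = stats.t.interval(0.95, df=len(sample) - 1,
                      loc=sample.mean(), scale=stats.sem(sample))
print(t_stat, p_value, ci)
```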

3) Regression Analysis:

  • Description: Regression analysis is a statistical technique used to model and examine the relationship between a dependent variable and one or more independent variables. Linear regression, multiple regression, logistic regression, and nonlinear regression are common types of regression analysis.
  • Applications: Predicting outcomes, identifying relationships between variables, and understanding the impact of independent variables on the dependent variable.
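A minimal linear regression sketch (scikit-learn assumed installed; the hours-studied and score data are invented) fitting a dependent variable against one independent variable and predicting an unseen case:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

hours = np.array([[1], [2], [3], [4], [5], [6]])  # independent variable
score = np.array([52, 57, 63, 68, 72, 78])        # dependent variable

model = LinearRegression().fit(hours, score)
print("slope:", model.coef_[0], "intercept:", model.intercept_)
print("predicted score after 7 hours:", model.predict([[7]])[0])
```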

4) Correlation Analysis:

  • Description: Correlation analysis is used to measure and assess the strength and direction of the relationship between two or more variables. The Pearson correlation coefficient, Spearman rank correlation coefficient, and Kendall’s tau are commonly used measures of correlation.
  • Applications: Identifying associations between variables and assessing the degree and nature of the relationship.
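All three coefficients named above can be computed directly with scipy; the paired observations below are invented for illustration:

```python
from scipy import stats

x = [2, 4, 6, 8, 10, 12]
y = [1.8, 4.3, 5.9, 8.4, 9.7, 12.5]

print(stats.pearsonr(x, y))    # Pearson r: strength of linear relationship
print(stats.spearmanr(x, y))   # Spearman rho: rank-based, monotonic
print(stats.kendalltau(x, y))  # Kendall's tau: rank concordance
```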

5) Factor Analysis:

  • Description: Factor analysis is a multivariate statistical technique used to identify and analyze underlying relationships or factors among a set of observed variables. It helps in reducing the dimensionality of data and identifying latent variables or constructs.
  • Applications: Identifying underlying factors or constructs, simplifying data structures, and understanding the underlying relationships among variables.

6) Time Series Analysis:

  • Description: Time series analysis involves analyzing data collected or recorded over a specific period at regular intervals to identify patterns, trends, and seasonality. Techniques such as moving averages, exponential smoothing, autoregressive integrated moving average (ARIMA), and Fourier analysis are used.
  • Applications: Forecasting future trends, analyzing seasonal patterns, and understanding time-dependent relationships in data.
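A small pandas sketch of two of the smoothing techniques named above, a moving average and exponential smoothing, applied to an invented monthly sales series:

```python
import pandas as pd

sales = pd.Series(
    [100, 104, 110, 108, 115, 123, 130, 128, 135, 142, 150, 148],
    index=pd.date_range("2023-01-01", periods=12, freq="MS"),
)

# Moving average: smooths short-term fluctuations over a 3-month window.
print(sales.rolling(window=3).mean().tail())

# Exponential smoothing: recent observations are weighted more heavily.
print(sales.ewm(alpha=0.5).mean().tail())
```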

7) ANOVA (Analysis of Variance):

  • Description: Analysis of variance (ANOVA) is a statistical technique used to analyze and compare the means of two or more groups or treatments to determine if they are statistically different from each other. One-way ANOVA, two-way ANOVA, and MANOVA (Multivariate Analysis of Variance) are common types of ANOVA.
  • Applications: Comparing group means, testing hypotheses, and determining the effects of categorical independent variables on a continuous dependent variable.

8) Chi-Square Tests:

  • Description: Chi-square tests are non-parametric statistical tests used to assess the association between categorical variables in a contingency table. The Chi-square test of independence, goodness-of-fit test, and test of homogeneity are common chi-square tests.
  • Applications: Testing relationships between categorical variables, assessing goodness-of-fit, and evaluating independence.
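A minimal sketch of the chi-square test of independence (scipy assumed installed) on an invented 2x2 contingency table of two categorical variables:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical counts: rows are two groups, columns are two preferences.
table = np.array([[30, 20],
                  [15, 35]])

chi2, p_value, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}, dof = {dof}")
```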

These quantitative data analysis techniques provide researchers with valuable tools and methods to analyze, interpret, and derive meaningful insights from numerical data. The selection of a specific technique often depends on the research objectives, the nature of the data, and the underlying assumptions of the statistical methods being used.


Data Analysis Methods

Data analysis methods refer to the techniques and procedures used to analyze, interpret, and draw conclusions from data. These methods are essential for transforming raw data into meaningful insights, facilitating decision-making processes, and driving strategies across various fields. Here are some common data analysis methods:

1) Descriptive Statistics:

  • Description: Descriptive statistics summarize and organize data to provide a clear and concise overview of the dataset. Measures such as mean, median, mode, range, variance, and standard deviation are commonly used.

2) Inferential Statistics:

  • Description: Inferential statistics involve making predictions or inferences about a population based on a sample of data. Techniques such as hypothesis testing, confidence intervals, and regression analysis are used.

3) Exploratory Data Analysis (EDA):

  • Description: EDA techniques involve visually exploring and analyzing data to discover patterns, relationships, anomalies, and insights. Methods such as scatter plots, histograms, box plots, and correlation matrices are utilized.
  • Applications: Identifying trends, patterns, outliers, and relationships within the dataset.
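A compact EDA sketch using the plot types listed above (pandas and matplotlib assumed installed; the dataset is synthetic):

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "age": rng.integers(18, 65, size=200),
    "income": rng.normal(50_000, 12_000, size=200),
})

fig, axes = plt.subplots(1, 3, figsize=(12, 3))
axes[0].hist(df["income"], bins=20)            # distribution
axes[1].scatter(df["age"], df["income"], s=8)  # relationship
axes[2].boxplot(df["income"])                  # outliers
print(df.corr())                               # correlation matrix
plt.show()
```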

4) Predictive Analytics:

  • Description: Predictive analytics use statistical algorithms and machine learning techniques to analyze historical data and make predictions about future events or outcomes. Techniques such as regression analysis, time series forecasting, and machine learning algorithms (e.g., decision trees, random forests, neural networks) are employed.
  • Applications: Forecasting future trends, predicting outcomes, and identifying potential risks or opportunities.

5) Prescriptive Analytics:

  • Description: Prescriptive analytics involve analyzing data to recommend actions or strategies that optimize specific objectives or outcomes. Optimization techniques, simulation models, and decision-making algorithms are utilized.
  • Applications: Recommending optimal strategies, decision-making support, and resource allocation.

6) Qualitative Data Analysis:

  • Description: Qualitative data analysis involves analyzing non-numerical data, such as text, images, videos, or audio, to identify themes, patterns, and insights. Methods such as content analysis, thematic analysis, and narrative analysis are used.
  • Applications: Understanding human behavior, attitudes, perceptions, and experiences.

7) Big Data Analytics:

  • Description: Big data analytics methods are designed to analyze large volumes of structured and unstructured data to extract valuable insights. Technologies such as Hadoop, Spark, and NoSQL databases are used to process and analyze big data.
  • Applications: Analyzing large datasets, identifying trends, patterns, and insights from big data sources.

8) Text Analytics:

  • Description: Text analytics methods involve analyzing textual data, such as customer reviews, social media posts, emails, and documents, to extract meaningful information and insights. Techniques such as sentiment analysis, text mining, and natural language processing (NLP) are used.
  • Applications: Analyzing customer feedback, monitoring brand reputation, and extracting insights from textual data sources.

These data analysis methods are instrumental in transforming data into actionable insights, informing decision-making processes, and driving organizational success across various sectors, including business, healthcare, finance, marketing, and research. The selection of a specific method often depends on the nature of the data, the research objectives, and the analytical requirements of the project or organization.


Data Analysis Tools

Data analysis tools are essential instruments that facilitate the process of examining, cleaning, transforming, and modeling data to uncover useful information, make informed decisions, and drive strategies. Here are some prominent data analysis tools widely used across various industries:

1) Microsoft Excel:

  • Description: A spreadsheet software that offers basic to advanced data analysis features, including pivot tables, data visualization tools, and statistical functions.
  • Applications: Data cleaning, basic statistical analysis, visualization, and reporting.

2) R Programming Language :

  • Description: An open-source programming language specifically designed for statistical computing and data visualization.
  • Applications: Advanced statistical analysis, data manipulation, visualization, and machine learning.

3) Python (with Libraries like Pandas, NumPy, Matplotlib, and Seaborn):

  • Description: A versatile programming language with libraries that support data manipulation, analysis, and visualization.
  • Applications: Data cleaning, statistical analysis, machine learning, and data visualization.

4) SPSS (Statistical Package for the Social Sciences):

  • Description: A comprehensive statistical software suite used for data analysis, data mining, and predictive analytics.
  • Applications: Descriptive statistics, hypothesis testing, regression analysis, and advanced analytics.

5) SAS (Statistical Analysis System):

  • Description: A software suite used for advanced analytics, multivariate analysis, and predictive modeling.
  • Applications: Data management, statistical analysis, predictive modeling, and business intelligence.

6) Tableau:

  • Description: A data visualization tool that allows users to create interactive and shareable dashboards and reports.
  • Applications: Data visualization , business intelligence , and interactive dashboard creation.

7) Power BI:

  • Description: A business analytics tool developed by Microsoft that provides interactive visualizations and business intelligence capabilities.
  • Applications: Data visualization, business intelligence, reporting, and dashboard creation.

8) SQL (Structured Query Language) Databases (e.g., MySQL, PostgreSQL, Microsoft SQL Server):

  • Description: Database management systems that support data storage, retrieval, and manipulation using SQL queries.
  • Applications: Data retrieval, data cleaning, data transformation, and database management.

9) Apache Spark:

  • Description: A fast and general-purpose distributed computing system designed for big data processing and analytics.
  • Applications: Big data processing, machine learning, data streaming, and real-time analytics.

10) IBM SPSS Modeler:

  • Description: A data mining software application used for building predictive models and conducting advanced analytics.
  • Applications: Predictive modeling, data mining, statistical analysis, and decision optimization.

These tools serve various purposes and cater to different data analysis needs, from basic statistical analysis and data visualization to advanced analytics, machine learning, and big data processing. The choice of a specific tool often depends on the nature of the data, the complexity of the analysis, and the specific requirements of the project or organization.


Importance of Data Analysis in Research

The importance of data analysis in research cannot be overstated; it serves as the backbone of any scientific investigation or study. Here are several key reasons why data analysis is crucial in the research process:

  • Data analysis helps ensure that the results obtained are valid and reliable. By systematically examining the data, researchers can identify any inconsistencies or anomalies that may affect the credibility of the findings.
  • Effective data analysis provides researchers with the necessary information to make informed decisions. By interpreting the collected data, researchers can draw conclusions, make predictions, or formulate recommendations based on evidence rather than intuition or guesswork.
  • Data analysis allows researchers to identify patterns, trends, and relationships within the data. This can lead to a deeper understanding of the research topic, enabling researchers to uncover insights that may not be immediately apparent.
  • In empirical research, data analysis plays a critical role in testing hypotheses. Researchers collect data to either support or refute their hypotheses, and data analysis provides the tools and techniques to evaluate these hypotheses rigorously.
  • Transparent and well-executed data analysis enhances the credibility of research findings. By clearly documenting the data analysis methods and procedures, researchers allow others to replicate the study, thereby contributing to the reproducibility of research findings.
  • In fields such as business or healthcare, data analysis helps organizations allocate resources more efficiently. By analyzing data on consumer behavior, market trends, or patient outcomes, organizations can make strategic decisions about resource allocation, budgeting, and planning.
  • In public policy and social sciences, data analysis is instrumental in developing and evaluating policies and interventions. By analyzing data on social, economic, or environmental factors, policymakers can assess the effectiveness of existing policies and inform the development of new ones.
  • Data analysis allows for continuous improvement in research methods and practices. By analyzing past research projects, identifying areas for improvement, and implementing changes based on data-driven insights, researchers can refine their approaches and enhance the quality of future research endeavors.

However, it is important to remember that mastering these techniques requires practice and continuous learning. That’s why we highly recommend the Data Analytics Course by Physics Wallah . Not only does it cover all the fundamentals of data analysis, but it also provides hands-on experience with various tools such as Excel, Python, and Tableau. Plus, if you use the “ READER ” coupon code at checkout, you can get a special discount on the course.


Data Analysis Techniques in Research FAQs

What are the 5 techniques for data analysis?

The five techniques for data analysis include: Descriptive Analysis, Diagnostic Analysis, Predictive Analysis, Prescriptive Analysis, and Qualitative Analysis.

What are techniques of data analysis in research?

Techniques of data analysis in research encompass both qualitative and quantitative methods. These techniques involve processes like summarizing raw data, investigating causes of events, forecasting future outcomes, offering recommendations based on predictions, and examining non-numerical data to understand concepts or experiences.

What are the 3 methods of data analysis?

The three primary methods of data analysis are: Qualitative Analysis, Quantitative Analysis, and Mixed-Methods Analysis.

What are the four types of data analysis techniques?

The four types of data analysis techniques are: Descriptive Analysis, Diagnostic Analysis, Predictive Analysis, and Prescriptive Analysis.


Purdue University


Research: Overview & Approaches


Introduction to Empirical Research

Examples of empirical research


  • Study on radiation transfer in human skin for cosmetics
  • Long-Term Mobile Phone Use and the Risk of Vestibular Schwannoma: A Danish Nationwide Cohort Study
  • Emissions Impacts and Benefits of Plug-In Hybrid Electric Vehicles and Vehicle-to-Grid Services
  • Review of design considerations and technological challenges for successful development and deployment of plug-in hybrid electric vehicles
  • Endocrine disrupters and human health: could oestrogenic chemicals in body care cosmetics adversely affect breast cancer incidence in women?



Comprehensive Review and Empirical Evaluation of Causal Discovery Algorithms for Numerical Data

Causal analysis has become an essential component in understanding the underlying causes of phenomena across various fields. Despite its significance, the existing literature on causal discovery algorithms is fragmented: methodologies are inconsistent, with no universal classification standard for existing methods, and comprehensive evaluations are lacking, as data characteristics are often ignored when benchmarking algorithms. This study addresses these gaps by conducting an exhaustive review and empirical evaluation of causal discovery methods on numerical data, aiming to provide a clearer and more structured understanding of the field. Our research begins with a comprehensive literature review spanning over two decades, analyzing over 200 academic articles and identifying more than 40 representative algorithms. This extensive analysis leads to the development of a structured taxonomy tailored to the complexities of causal discovery, categorizing methods into six main types. To address the lack of comprehensive evaluations, our study conducts an extensive empirical assessment of 29 causal discovery algorithms on multiple synthetic and real-world datasets. We categorize synthetic datasets based on size, linearity, and noise distribution, employ five evaluation metrics, and summarize top-3 algorithm recommendations, providing guidelines for users in various data scenarios. Our results highlight a significant impact of dataset characteristics on algorithm performance. Moreover, a metadata extraction strategy with an accuracy exceeding 80% is developed to assist users in algorithm selection on unknown datasets. Based on these insights, we offer professional and practical guidelines to help users choose the most suitable causal discovery methods for their specific dataset.

Keywords: Causal Discovery, Time Series, Independent and Identically Distributed (I.I.D.) Data, Algorithm Evaluation, Survey

1 Introduction

Causality, as a dynamically evolving interdisciplinary field, has been gaining increasing attention from both academia and industry (Nogueira et al., 2021 ; Menegozzo et al., 2021 ; Masson-Delmotte et al., 2021 ; Ganguly et al., 2023 ) . Causal analysis employs a systematic approach to uncovering the underlying causes of phenomena, primarily addressing the question of “Why” behind observed trends. Since Granger’s seminal work in 1969 (Granger, 1969 ) , which introduced a mathematical concept of causality, the field has expanded from philosophy to economics (Imbens, 2004 ) and other domains such as medicine (Mani and Cooper, 2000 ) , environmental science (Li et al., 2014 ) , and dynamics (Hu et al., 2015 ) . Recently, the rapid advancement of Artificial Intelligence (AI) has opened new avenues for causal analysis. The integration of machine learning has enhanced the precision and efficiency of data processing for causal inference, while causal learning has established a more reliable and trustworthy framework for machine learning (Scholkopf, 2019 ; Makhlouf et al., 2020 ) . These two disciplines mutually reinforce each other, driving significant advancements in scientific research.

Given that correlation does not imply causation, causal research necessitates a thorough investigation beyond simple association analysis. Pearl’s work ( 2000 ) provided a widely accepted framework, known as the “Ladder of Causation”, which delineates three stages: association , intervention , and counterfactual . The initial stage, association , involves observing relationships between variables, yet it is insufficient for identifying confounders or selection bias that may lead to spurious causation (Cheng et al., 2019 ) . The second stage, intervention , involves controlled experiments to quantify the causal impact of one variable on another. The final stage, counterfactual analysis, requires a deep understanding of the causal mechanisms underlying the phenomena. Figure 1 shows the causation ladder and corresponding analysis engine (Bareinboim and Pearl, 2016 ) .

[Figure 1: The Ladder of Causation and the corresponding analysis engines (Bareinboim and Pearl, 2016).]

This section elaborates on the fundamental concepts, evolution, and classifications of causal analysis, addresses the limitations of previous studies, and introduces the innovative contributions and specific objectives of the current work. Research on causal analysis is primarily categorized into two areas (Nogueira et al., 2022): causal inference and causal discovery. Causal inference typically progresses from cause to effect, focusing on the quantitative problems of the intervention stage within the causation ladder framework (Peters et al., 2017). The central concept of inference involves controlled trials (Yao et al., 2021), where both experimental and control groups are observed to determine the effects of interventions. When the causal graph is known, observational data can be used to predict the intervention effects of experimental tests. Common approaches for causal effect estimation include covariate adjustment (Pearl, 2009; Stekhoven et al., 2012; Maathuis and Colombo, 2015), optimal adjustment (Sekhon, 2008; Runge, 2021; Henckel et al., 2022), and the path method (Nandy et al., 2017). In contrast, causal discovery seeks to identify causal relationships from observed outcomes, emphasizing the qualitative problem of learning causal structures (Gelman, 2011). Upon uncovering causal mechanisms, it becomes possible to infer outcomes based on hypothetical scenarios that have not occurred. Despite their inverse logical relationship, causal discovery and causal inference differ significantly in their research methodologies, algorithms, and applications.

In practical scenarios, a significant challenge in causal analysis is managing diverse and complex data types. Standard data formats include time series (Eichler, 2012), cross-sectional, and panel data. Panel data, which has two dimensions (time and samples), combines elements of cross-sectional data, where all sample observations at a specific time point are included, and time series data, where observations of the same sample are recorded at different time points. Since time series data does not adhere to the assumption of independent and identically distributed (i.i.d.) random variables, it necessitates specialized research on data processing techniques and causal analysis algorithms. Furthermore, due to the importance and widespread use of time series in real-world applications, many researchers have focused extensively on this data type (Assaad et al., 2021; Biswas and Mukherjee, 2024). If a time series can be viewed as a "list", then an i.i.d. variable is a "set" without self-causes. Therefore, causal discovery for i.i.d. data, which differs from time-series causality, has also attracted considerable research attention (Xie et al., 2019).

However, the existing literature lacks a universally applicable algorithm for causal analysis (Edinburgh et al., 2021). Unlike causal inference, the precision of causal discovery heavily relies on the selection of an appropriate causality model. Hence, users often face challenges in selecting the appropriate causal discovery algorithm for unknown datasets, leading to unsatisfactory results or unnecessary expenditure of computational resources and time. The significance of this study includes providing guidelines to help users quickly identify the most suitable algorithm for unknown datasets, and assisting researchers in organizing and benchmarking numerous existing algorithms. We therefore concentrate on two pivotal elements of causal analysis: (1) causal discovery and (2) time series and i.i.d. data analysis. An exhaustive investigation was conducted to delve into the methodologies of causal discovery, encompassing principles, algorithmic strategies, and recent advancements in the field.

Extensive efforts have been made to review and reevaluate causal discovery algorithms in a unified, more extensive, and systematic way. To address the challenge of limited benchmark datasets for causal discovery, many researchers have focused on developing data simulators in fields such as industrial systems (Menegozzo et al., 2022), neurology (Tu et al., 2019), and biology (Ma et al., 2023). Some studies have conducted experiments to systematically evaluate a single type of causal discovery method rather than all existing types. For example, Sogawa et al. (2010) evaluated the identification accuracy and robustness of linear non-Gaussian methods and their variants. Raghu et al. (2018) compared the performance of four conditional independence-based algorithms on mixed data with latent variables. Ko et al. (2018) summarized estimation of distribution algorithms (EDAs) and compared their performance on four datasets to infer the best ones.

There have also been articles that survey various types of methods. Song et al. (2016) and Käding et al. (2021) compared causal discovery methods for bivariate data on real-world benchmark datasets. However, their research focused only on the bivariate case and did not investigate multivariate algorithms. Ombadi et al. (2020) evaluated four causal discovery algorithms on hydrometeorological data, aiming to guide researchers in determining which causal method is most appropriate based on the characteristics of the hydrological system. Assaad et al. (2022) not only systematically organized time-series methodologies but also performed thorough evaluations of representative algorithms based on distinct causal structures.

Despite these contributions, most existing literature primarily focuses on theoretical summaries. Even when experiments were conducted, they mainly assessed the impact of causal structures on model performance. Given the often ambiguous causal structure of observational data, it is vital to provide practical and reliable insights from the user’s perspective.

Unlike previous surveys, this paper adopts a data-oriented approach, categorizing causal relationships into four types: i.i.d. causality, time-delay causality, instantaneous causality, and causal pairs. To address the research gap that often overlooks the user’s perspective, this study performs a comprehensive analysis starting from the intrinsic characteristics of the data (data assumptions). Motivated by this approach, comparative experiments are conducted, treating data assumptions as experimental factors and employing various algorithms as experimental subjects. These comparisons aim to establish the relationship between algorithms and specific data features within each causality category. Specifically, our pursuit is dedicated to identifying the optimal algorithm, considering various factors, including data size, linearity, stationarity, and noise attributes. By leveraging the extracted data features, users can choose the most appropriate algorithm for their specific needs. The main contributions of this paper are summarized as follows:

Survey and Taxonomy: We comprehensively collect and categorize methods and algorithms for causal discovery, summarizing the characteristics and applications of these algorithms.

Benchmarking: We conduct extensive benchmarking of selected state-of-the-art algorithms across diverse datasets using multiple evaluation metrics to assess performance and applicability.

Practical Guidelines: We provide practical insights and recommendations on the optimal algorithm for specific datasets, offering decision-making suggestions in various application fields.

In light of this, our experiments focus on the analysis of algorithm performance and result effectiveness, with the aim of addressing the following research questions:

RQ 1 ( Comparison of algorithm performance ): Among the assessed algorithms, which one demonstrates superior effectiveness or efficiency under specific data characteristics?

This research question can be considered as a benchmark and baseline for answering other questions. By evaluating the impact of data features on algorithm performance, we establish a foundation that informs subsequent steps in our experimental analysis.

RQ 2 ( Real-world applicability ): Are the insights derived from the synthetic datasets consistent with those acquired from the real datasets?

This research question is crucial for determining the effectiveness of insights gained from RQ 1, as it connects the findings from synthetic data to real-world scenarios. This question ensures that our conclusions are not limited to controlled experimental conditions but are also valid in practical applications.

RQ 3 ( Generalization to unknown datasets ) : This RQ can be further divided into the following sub-research questions:

RQ 3.1 ( Metadata recognition for algorithm selection ): Is it feasible to precisely capture the representative attributes of unknown datasets using their metadata to ascertain the optimal algorithm based on our previous conclusions?

RQ 3.2 ( Practical recommendations for users ): Upon successfully identifying the optimal algorithm for an unknown dataset in RQ 3.1, what practical recommendations can we provide to users for selecting appropriate methods for their specific datasets?

This research question is more challenging to justify experimentally than the other RQs, as it involves extending the results of RQ 1 and RQ 2 to a broader range of applications. RQ 3.1 focuses on the feasibility of applying our findings to new datasets by analyzing their metadata, ensuring that our methods are robust and versatile. RQ 3.2 aims to translate these validated approaches into practical, user-friendly guidelines that assist practitioners in choosing the best algorithms for their unique datasets, thus bridging the gap between theoretical research and practical implementation.


The rest of this paper is organized as follows. Section 2 presents the methodology of conducting the literature review: Section 2.1 covers the fundamental preliminaries of the domain, Section 2.2 discusses assumptions for causality, and Section 2.3 reviews related algorithms and relevant surveys on causal analysis and identifies research gaps. Section 3 details the survey methodology, describing the data collection process in Section 3.1 and the analysis of the survey data in Section 3.2. Section 4 focuses on causal discovery algorithms, covering six categories of methodologies in Sections 4.1 to 4.6, respectively. Section 5 outlines the empirical study design, including the investigated datasets in Section 5.1, evaluation metrics in Section 5.2, algorithms in Section 5.3, and environment settings in Section 5.4. Section 6 presents the results analysis, answering the research questions. Potential threats to validity are discussed in Section 7. Section 8 concludes this study and presents future work directions.

2 Background and Related Work

Here we will analyze the preliminaries, related survey literature, and research gaps in causal discovery. The section aims to structure the knowledge body of this academic domain systematically.

2.1 Preliminaries

This section presents the fundamental definitions and corresponding notations associated with causal discovery. Note that matrices are denoted by uppercase bold letters, whereas vectors are indicated by lowercase bold letters. Consider the dataset denoted by $X$, which manifests as an $m \times n$ matrix. Here, $x^n$ designates the $n$th variable, while each variable comprises $m$ observations. This endeavour categorizes observational data into cross-sectional and time-series data, as defined below.

Definition 1 (Cross-sectional Data): Cross-sectional data is a set of observations collected from subjects at one time point.

Note that we mainly focus on a common type of cross-sectional data, namely independent and identically distributed (i.i.d.) data.

Definition 2 (Time Series): A time series is a sequence of data points arranged in temporal order. Given a time series $x^n$, an observation at a specific temporal point $t$ is represented as $x^n_t$.

The concept of time lag is introduced to discern the demarcation between time-delay and instantaneous causality.

Definition 3 (Time Lag): The time lag $\tau$ refers to the temporal interval between a cause and its effect.

In cases where $\tau > 0$, the cause occurs $\tau$ units of time prior to its effect. This phenomenon is referred to as "time-delay causality". Nonetheless, circumstances arising from sampling techniques or other factors might occasion an instance wherein $\tau = 0$. In such scenarios, the causal latency is deemed insignificant for observation, and this relationship is classified as "instantaneous causality".

To further depict the causal interdependencies among variables within the dataset, it becomes imperative to introduce the notion of causal graphs.

Definition 4 (Causal Graph): The causal graph $G$ is composed of two sets: a set of nodes $v$ and a set of edges $\epsilon$. If variable $x^i$ is a cause of variable $x^j$, denoted as $x^i \rightarrow x^j$, this relationship manifests as an edge from node $i$ to node $j$ in Directed Acyclic Graphs (DAGs) (Pearl, 1985).

However, when there are hidden variables in the dataset, Maximal Ancestral Graphs (MAGs) (Richardson and Spirtes, 2002 ) can represent causal relationships. The types of edges in MAGs are as follows:

  • $x^i \rightarrow x^j$: $x^i$ causes $x^j$;

  • $x^i \leftrightarrow x^j$: there is a hidden confounder between $x^i$ and $x^j$;

  • $x^i - x^j$: there is a hidden effect variable of both $x^i$ and $x^j$.

If we do not consider causal directions, a skeleton graph can signify the causal relations between variables. There are only undirected edges in the skeleton graph that represent causal links. For time series, a window graph is common for causal discovery, referring to the causal graph within the maximum time lag window (Assaad et al., 2023 ) . Figure 3 shows examples of these causal graphs.

[Figure 3: Examples of causal graphs: DAG, MAG, skeleton graph, and window graph.]

Building upon the preceding definitions, we can elucidate the tasks of causal discovery. Given a dataset $X$, the objectives of causal discovery are the deduction of the causal graph, the quantification of causal strength, and the determination of causal time lags (when the data is time-series). In essence, the goal is to reconstruct the causal mechanism interconnecting the variables. Our focus is centred upon exploring causal graphs, driven by the objective of identifying causal relations within the variable set $v$. To accomplish this, the outcomes of the causal discovery process are encoded using an adjacency matrix $A$.

Definition 5 (Adjacency Matrix): The adjacency matrix of causal discovery constitutes a square matrix of dimensions $n \times n$. Specifically, the row vector is set to signify the cause, while the column vector signifies the effect. If the value located at position $(i, j)$ in the matrix is 1, it denotes the presence of a causal link from variable $x^i$ to $x^j$.
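To make Definition 5 concrete, the following minimal sketch (illustrative code of our own, not from the paper) encodes the chain $x^1 \rightarrow x^2 \rightarrow x^3$ as an adjacency matrix, with rows as causes and columns as effects:

```python
# Illustrative only: encoding the chain x^1 -> x^2 -> x^3 as the
# n-by-n adjacency matrix A of Definition 5, where A[i, j] = 1 means
# a causal link from variable x^i to variable x^j.
import numpy as np

n = 3
A = np.zeros((n, n), dtype=int)
A[0, 1] = 1  # x^1 causes x^2
A[1, 2] = 1  # x^2 causes x^3

print(A)
# [[0 1 0]
#  [0 0 1]
#  [0 0 0]]
```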

2.2 Assumptions for Causality

When conducting causal discovery tasks, it is crucial to consider certain foundational assumptions. All the methods analyzed in this study are based on at least one of these assumptions. These assumptions help in relating causality to probability densities (Spirtes and Zhang, 2016 ) .

2.2.1 The Causal Markov Assumption

Spirtes and Zhang ( 2016 ) argue that in a dataset, all features are independent of their non-effects (nondescendants in the causal graph) conditional on their direct causes (parents in the causal graph). However, it is important to note that the Causal Markov Assumption can be an oversimplification. It assumes that causality is the sole reason for associations between all features, which is not always true, as other kinds of associations exist as well.

For instance, consider the case of ice cream sales and drowning incidents. According to the Causal Markov Assumption, if these two features were present in a dataset and shared an association, it should imply a causal link. However, we know this isn’t true. During summers, both ice cream sales and swimming activities increase. While there might be some conditional dependencies between the two due to common factors, confusing correlation with causality would lead to spurious results.
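A small simulation makes this concrete. In the hedged sketch below (our illustration, not the paper's code), a latent seasonal driver causes both ice cream sales and drowning incidents; the two correlate marginally but become approximately independent once the confounder is conditioned on via a simple partial correlation:

```python
# Illustrative simulation of the ice-cream/drowning example: both series are
# driven by a common cause ("summer"), so they correlate marginally but are
# approximately independent after conditioning on the confounder.
import numpy as np

rng = np.random.default_rng(1)
n = 5000
summer = rng.normal(size=n)                       # latent seasonal driver
ice_cream = 1.0 * summer + rng.normal(size=n)
drowning = 0.8 * summer + rng.normal(size=n)

def partial_corr(x, y, z):
    """Correlation of x and y after linearly regressing out z."""
    rx = x - np.polyval(np.polyfit(z, x, 1), z)
    ry = y - np.polyval(np.polyfit(z, y, 1), z)
    return np.corrcoef(rx, ry)[0, 1]

print(np.corrcoef(ice_cream, drowning)[0, 1])     # clearly nonzero
print(partial_corr(ice_cream, drowning, summer))  # close to zero
```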

2.2.2 The Causal Faithfulness Assumption

The Causal Faithfulness Assumption asserts that all and only the true causal links between features are represented in the observed data. This means there are no latent or unobserved confounding variables impacting the observed features and their underlying causal relationships (Spirtes and Zhang, 2016 ) .

For example, consider a dataset of patients in a lung cancer ward, with smoking habits as one of the features and lung cancer as the target variable. To determine whether smoking causes lung cancer, the causal faithfulness assumption holds if there are no unmeasured factors (like genetic vulnerability) linked to both smoking habits and lung cancer but are absent from the observed dataset. If the assumption does not hold, it indicates the presence of confounding factors. For instance, if the dataset lacks information about the patients’ medical history or genetic predisposition, the observed causal link between smoking and lung cancer may be spurious. In such a scenario, smoking habits may appear to cause lung cancer in the data, but the true causal relationship is distorted by latent confounders.

2.2.3 Markov Equivalence Classes (MEC)

Markov Equivalence Classes (MEC) are another important concept in causal discovery. MECs group statistically indistinguishable causal models that make the same predictions about the probability distributions of observed variables (Spirtes and Zhang, 2016 ) . In a given set of variables V, a Markov Equivalence Class represents an aggregation of DAGs that exhibit the same pattern of conditional independence associations among variables in V.

For example, consider two DAGs, G1 and G2. G1 and G2 lie in the same Markov Equivalence Class if, for each pair of variables v1 and v2 in V, v1 and v2 are conditionally independent given all other variables v3 in V in G1 if and only if they are conditionally independent given v3 in G2. Simply put, for G1 and G2 to lie in the same Markov Equivalence Class, they must imply the same set of conditional independence statements among the variables.

To illustrate this further, consider two plausible causal models, M1 and M2, and three variables P, Q, and R:

In M1, P causally influences Q, and Q causally influences R. In M2, R causally influences Q, and Q causally influences P. Despite the different causal directions, M1 and M2 lie in the same Markov Equivalence Class because the observed data generated from these models will exhibit the same trends and conditional independence associations (Spirtes and Zhang, 2016 ) .

2.2.4 The Causal Sufficiency Assumption

The Causal Sufficiency Assumption (Spirtes et al., 2001) states that the common causes between any two variables of the variable set $v$ are entirely contained within $v$ itself, thereby excluding the presence of latent confounders. This condition is considered a prerequisite for the efficacy of most causal discovery algorithms.

For example, consider one causal model M and three variables, P, L, and R:

If L is an unobserved variable, indicating that L is a hidden confounder of P and R, then the model M does not satisfy the causal sufficiency assumption. Under this assumption, a directed edge in a DAG represents causation from a cause to its effect.

2.3 Related Works

Table 1: Timeline of the development of causal discovery algorithms.

1969 | Pairwise Granger Causality (one of the first statistical models of causal discovery)
1982 | Multivariate Granger Causality (MVGC)
2001 | PC (turning observations into causal knowledge)
2002 | GES
2006 | ICALiNGAM (based on the structural equation model)
2008 | ANM; Kernel Granger Causality (KGC); FCI
2010 | tsFCI; VARLiNGAM
2011 | IOTA; DirectLiNGAM; ARMA-LiNGAM
2012 | CCM (dynamic systems); PNL; IGCI
2013 | ES; TiMINO
2014 | Copula Granger Causality (Copula GC); CMS; PAI
2015 | oCSE
2017 | PSDR-TE
2018 | NOTEARS; CGNN; SCDA; RECI
2019 | PCMCI; TCDF; GraNDAG; CDS
2020 | CD-NOD; DYNOTEARS; RCD; GOLEM; NonSENS
2021 | CAM-UV; DAG-GNN; CORL; NGC
2022 | GRaSP; ACD
2024 | NBCB, CBNB

In recent years, the topic of causal discovery has garnered significant interest as researchers endeavor to uncover causal relationships from observational and experimental data. To establish a solid foundation for understanding causal discovery, it is crucial to collect related works in this field. Pearl's work (Pearl, 1985; Pearl et al., 2000) on causal Bayesian networks and the introduction of causal graphical models significantly advanced the field's theoretical foundation. These seminal contributions serve as the basis for further research and methodologies. Pearl also introduced nonparametric Structural Causal Models (SCMs) (Pearl, 2009) as a formal and intelligible language for articulating causal knowledge and explaining causal notions used in scientific discourse. These include concepts like randomization, intervention, direct and indirect effects, confounding, counterfactuals, and attribution. The structural language's algebraic component corresponds to the potential-outcome framework (Rubin, 1974), while its graphical component incorporates Wright's method of path diagrams. The potential-outcome framework, which focuses on estimating potential outcomes to calculate treatment effects, is particularly applicable to A/B tests. It performs effectively in causal inference, even when the complete causal graph is unknown (Aliprantis, 2015). When combined, these components provide a robust approach for causal inference, addressing long-standing issues in empirical sciences, such as confounding control, policy evaluation, mediation analysis, and the algorithmization of counterfactuals.

Table 1 illustrates the timeline of the development of causal discovery algorithms. As shown in Table 1, Granger (Granger, 1969) proposed a statistical model to determine causal relationships between two variables, which became one of the oldest mathematical models in the history of causal discovery. Spirtes et al. introduced the assumptions and methods that laid the foundation for this field in their pioneering work (Spirtes et al., 2001), attempting to transform real-world observations into causal knowledge. Spirtes and Glymour (2001) developed the PC algorithm as a fundamental constraint-based algorithm. This algorithm starts with an undirected graph and recursively deletes edges based on conditional independence judgments. Since 2006, Shimizu et al. (2006) have designed the Linear Non-Gaussian Acyclic Model (LiNGAM) algorithm based on the Structural Equation Model (SEM), which was developed into many variants to address nonlinear relationships, time series, mixed data, latent confounders, and other data cases, forming a family of algorithms. LiNGAM-related methods are widely applied across diverse fields, including neuroscience (Ji et al., 2024; Chiyohara et al., 2023), economics (Jin and Xu, 2024), epidemiology (Barrera and Miljkovic, 2022; García-Velázquez et al., 2020), psychology (Mojtabai, 2024; Rosenström et al., 2023), chemistry (Luo et al., 2024), and others. However, the above methods cannot handle nonseparable, weakly connected dynamic systems. In response to this issue, Sugihara et al. (2012) proposed the Convergent Cross Mapping (CCM) algorithm based on the state space method under the assumption of nonlinear deterministic systems, which is suitable for dynamic research fields such as ecology. In addition, for large sample datasets, causality algorithms combined with constantly evolving deep learning techniques greatly improve the accuracy of causal discovery and have become a popular research direction.

In addition to the algorithms and methods mentioned above, we also need to investigate causal discovery from a more holistic perspective. Consequently, an exhaustive analysis must be conducted, combined with authoritative surveys over the past five years, to gain a comprehensive understanding of this field.

It is imperative to differentiate the objectives and conceptual frameworks associated with causal discovery and causal inference to further delve into these two notions, as elucidated by Guo et al. ( 2020 ). Guo et al. emphasised that causal inference involves tracing the causal path from cause to effect, aiming to understand the impact of manipulating specific variables on others. Within the realm of causal inference, Yao et al. ( 2021 ) have conducted an in-depth investigation into the concepts, methods, and applications of causal inference, contributing significantly to the field. Acknowledging the distinct nature of time series data, which differs from i.i.d. data, is essential. This distinction presents unique challenges and considerations in analysing causality.

Additionally, to supplement the understanding of these concepts, Nogueira et al. ( 2022 ) have analysed and compared causal discovery and inference using software tools, providing practical examples for testing. Their work contributed to the existing body of knowledge by exploring the practical application and evaluation of different approaches.

Focusing on time series, Moraffah et al. (2021) provided a comprehensive examination of causality for such data, offering insights into the generation of time series data, methodologies employed in causal analysis, and evaluation metrics used to assess causal relationships. Notably, the study enumerated these evaluation metrics' specific attributes and characteristics, providing valuable information for researchers in selecting appropriate metrics for their analyses. Regrettably, their research remained confined to theoretical realms and was not validated through practical trials.

Another article on evaluation metrics was proposed by Cheng et al. (2022). Their research thoroughly examines the evaluation methods employed in causal analysis. The analysis covered a wide range of considerations, including the availability and suitability of software packages, the effectiveness of algorithms, and the appropriateness of datasets for evaluating causal learning algorithms. By investigating these aspects, Cheng et al. provided researchers with valuable guidance for selecting appropriate evaluation methods in causal analysis studies.

Assaad et al. (2022) have made a notable contribution to the field of time-series causal discovery, and their work serves as a pivotal reference for the research project at hand. On the theoretical side, they presented a comprehensive framework comprising seven distinct categories for analyzing causal relationships. On the empirical side, they employed ten algorithms to assess these methods' performance across different causal structures.

More recently, Runge et al. (2023a) comprehensively summarized methods of causal discovery and proposed a Question-Assumptions-Data (QAD) template, embedding causal discovery into Pearl's causal ladder. They also designed a method selector to match the optimal algorithm to different graph assumptions. However, they did not further examine parametric assumptions about datasets through experimentation.

Hasan et al. (2023) summarized causal discovery methods for i.i.d. data and time series, and collected the source code of relevant algorithms. They also tested and compared the performance of 9 algorithms for i.i.d. data and 7 algorithms for time series on benchmark datasets. However, they did not further test and analyze the influence of data assumptions, such as dependency functions and noise distributions, on algorithm performance.

Table 2: The Question-Data-Method-Evaluation (QDME) template.

Question: Causal Direction; Causal Graph
Data: Pairwise; I.I.D. Data; Time Series
Method: Granger-Based; Condition Independence-Based; State Space Dynamic-Based; Structural Equation Modeling-Based; Deep Learning-Based; Hybrid Methods
Evaluation: Classification-Based Measures; Graph Distance-Based Measures

Drawing upon the analysis mentioned above, we have amalgamated and organized the principal research directions, which are visually represented in Table 2 . By employing the Question-Data-Method-Evaluation (QDME) template, the review aims to provide a clear and structured overview of the diverse areas and subtopics within the field of causal discovery research, enhancing the organization and coherence of the work.

Based on the aforementioned papers, while the algorithms and evaluations of causal discovery have become relatively comprehensive, several unresolved challenges persist, highlighting gaps in the existing body of knowledge:

  • Many articles employ outdated taxonomies and lack updates on the latest algorithms.

  • Most surveys emphasize theoretical analyses, often neglecting the systematic experiments necessary for quantitative assessment.

  • Although some articles have conducted experimental comparisons of algorithms, these studies primarily consider causal structures as their experimental factors, overlooking the characteristics of the data.

3 Survey Methodology

This section introduces how we collected relevant research resources, including literature, code, metrics, and datasets. The final subsection briefly explains the analytical techniques we used.

A quantitative research approach is employed to gather and analyze papers from databases systematically. This approach facilitates measuring and exploring trends, methods, datasets, and evaluations in the research domain. Google Scholar was chosen as the literature database due to its extensive coverage of scholarly articles. Information acquisition was initiated through a keyword search. First, Figure 4 demonstrates the research trend of causal discovery, underpinned by the number of articles published during the preceding two decades.

[Figure 4: Number of articles on causal discovery published annually over the past two decades.]

Figure 4 illustrates an increasing annual trend in research endeavours in causal discovery. This discernible growth can be attributed partly to the robust advancement of AI technologies in contemporary times, which has laid the groundwork for algorithmic developments facilitated by enhanced data processing capabilities.

Furthermore, to prepare the programming underpinnings, our endeavour extends beyond literature collection to encompass acquiring the requisite algorithmic source code. To this end, we searched for the source code associated with each algorithm on the GitHub platform, integrating and revising the original algorithm code as deemed necessary.

To facilitate a comprehensive and impartial investigative analysis, this study employs both comparative analysis and case study methodologies. Specifically, a series of comparative experiments involving diverse algorithms on distinct artificial datasets are conducted. By analyzing the magnitudes of evaluation metrics, a quantitative assessment of algorithmic performance is executed, yielding overarching insights. Ultimately, to affirm the robustness and effectiveness of our findings, the conclusions are corroborated through case studies on real datasets.

3.1 Data Collection

Here we executed a keyword-based search focusing on aspects within the tree graph above. The subsequent table illustrates the collection results derived from papers published from 2004 to 2023, as sourced from Google Scholar.

To prevent repetitive retrieval and collection of papers, "Causal Discovery" is used as the basic keyword, with previously retrieved terms excluded when searching for new supplementary terms. It should be noted that we also searched for keywords related to "causal inference" to ensure comprehensive collection of papers in overlapping fields. We identified the 220 most relevant articles from over 3000 related articles and summarized the datasets, algorithms, and evaluation metrics for causal discovery. The terms "Hits," "Title," and "Body" in Table 3 refer to the number of papers returned by the search, the number of papers whose titles are relevant to the desired content, and the number of papers that remain after title and body screening, respectively.

Table 3: Keyword search results (Google Scholar, 2004-2023).

Keywords | Hits | Title | Body
"Causal Discovery" + [Survey / Overview / Review] | 980 | 15 | 14
"Causal Discovery" + [Bench / Benchmark] | 347 | 7 | 7
"Causal Discovery" + Dataset | 619 | 16 | 16
"Causal Discovery" + [Evaluation / Comparison] | 240 | 18 | 16
"Causal Discovery" + [Method / Algorithm / Approach] | 410 | 138 | 137
"Causal Inference" + [Time Series / Cross-sectional / I.I.D.] | 995 | 89 | 30
Overall | - | 283 | 220

A selection of widely employed datasets for causal discovery has been identified by synthesising diverse review articles. These datasets encompass both real-world instances and artificial constructs. To streamline ensuing experimental processes, a classification framework has been devised to categorise these datasets into four types of causality relationships. Additionally, the sources of these datasets are also categorized into real-world and synthetic datasets. Specifically, CausalWorld (Ahmed et al., 2020 ) , SynTReN (Van den Bulcke et al., 2006 ) , LUCAS (Guyon et al., 2011 ) , ALARM (Lauritzen and Spiegelhalter, 1988 ) and ASIA (Lauritzen and Spiegelhalter, 1988 ) are synthetic datasets, while Tubingen (Mooij et al., 2016 ) , ADNI (Petersen et al., 2010 ) , AntiCD3/CD28 (Sachs et al., 2005 ) , Abalone (Asuncion and Newman, 2007 ) , fMRI (Smith et al., 2011 ) , Causality 4 Climate (Runge et al., 2020 ) , Traffic Prediction (Pan et al., 2018 ) , OHDNOAA (Jangyodsuk et al., 2014 ) , Temperature Ozone (Gong et al., 2017 ) , Sachs (Sachs et al., 2005 ) and CHILD (Spiegelhalter et al., 1993 ) are real-world datasets. This categorisation is illustrated in Table 4 .

Table 4: Datasets for causal discovery.

Causality Type | Dataset | Year | Area | Source
Pairwise | Cause Effect Pairs Challenge | 2013 | Various | Real-world
Pairwise | Tubingen | 2016 | Various | Real-world
Pairwise | CE-Gauss | 2016 | - | Synthetic
Pairwise | CE-Multi Net | 2018 | - | Synthetic
Instantaneous | SynTReN | 2006 | Biology | Synthetic
Instantaneous | FLUXNET | 2020 | Biogeoscience | Synthetic
Instantaneous | causaLens | 2021 | Various | Synthetic
Time-delay | fMRI | 2011 | Neuroscience | Real-world
Time-delay | FinanceCPT | 2012 | Economics | Synthetic
Time-delay | OHDNOAA | 2014 | Hydrologic | Real-world
Time-delay | Traffic Prediction | 2018 | Traffic | Real-world
Time-delay | Causality 4 Climate | 2020 | Climate | Real-world
I.I.D. Data | Sachs | 2005 | Biology | Real-world
I.I.D. Data | LUCAS | 2011 | Medical | Synthetic
I.I.D. Data | ALARM | 1989 | Belief Networks | Semi-synthetic
I.I.D. Data | CHILD | 1993 | Medical | Real-world
I.I.D. Data | ASIA | 1988 | Medical | Synthetic
I.I.D. Data | Auto-mpg | 2005 | Engineering | Real-world

Likewise, we have identified performance metrics for assessing causal discovery techniques. These metrics are divided into two overarching families: graph-based metrics (Peters and Bühlmann, 2015 ) and classification-based metrics, as expounded in Table 5 . It is pertinent to highlight that nearly all metric computations necessitate the availability of both estimated DAGs and ground truth DAGs. Therefore, the prudent selection of datasets with well-established ground truth becomes imperative.

Table 5: Evaluation metrics for causal discovery.

Measure Type | Metric | Notion
Graph distance-based | Structural Hamming Distance (SHD) | Calculates the difference between two (binary) adjacency matrices: each edge that is missing or extraneous in the target graph is counted as an error.
Graph distance-based | Frobenius Norm | Compares the similarity between the real matrix and the estimated matrix.
Graph distance-based | Structural Intervention Distance (SID) | Estimates the count of erroneously deduced intervention distributions.
Classification-based | Precision | The quotient of true positives (TP) divided by the sum of TP and false positives (FP).
Classification-based | Recall | The proportion of TP in relation to the sum of TP and false negatives (FN).
Classification-based | F1 Score | The harmonic mean of precision and recall of the learned structure compared to the true causal structure.
Classification-based | FPR | The ratio of edges present in the predicted graph but absent from the ground-truth graph.
Classification-based | TPR | The ratio of common edges between the ground-truth and predicted causal graphs over the number of edges in the ground-truth graph.
Classification-based | MSE | The sum of squared differences between the predicted and ground-truth causal graphs divided by the total number of nodes.
Classification-based | Area under ROC Curve (AUROC) | The area under the curve of recall versus FPR at different thresholds.
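As an illustration of the graph distance-based family, the sketch below implements a simple SHD variant (our assumption: a reversed edge counts as a single error, whereas some definitions count it as two):

```python
# A simple SHD variant between two binary adjacency matrices, following the
# notion in Table 5: missing, extra, and reversed edges each count as one error.
import numpy as np

def shd(A_true, A_est):
    diff = int(np.abs(A_true - A_est).sum())
    # A reversed edge (true i->j estimated as j->i) appears twice in `diff`;
    # subtract one so that each reversal is counted once.
    reversed_edges = int(((A_true == 1) & (A_est.T == 1) & (A_est == 0)).sum())
    return diff - reversed_edges

A_true = np.array([[0, 1, 0], [0, 0, 1], [0, 0, 0]])
A_est  = np.array([[0, 1, 0], [0, 0, 0], [0, 1, 0]])  # x2 -> x3 reversed
print(shd(A_true, A_est))  # 1
```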

Additionally, we collected several well-established packages for causal discovery algorithms, including bnlearn, pcalg, Tetrad, Causal Discovery Toolbox (CDT), CausalNex, gCastle, Tigramite, and causal-learn . bnlearn (Scutari, 2009 ) is an R package designed for Bayesian network learning and inference. It offers an open-source implementation of various structure learning algorithms, including constraint-based, score-based, and hybrid methods. Another R package, pcalg (Kalisch et al., 2012 ) , integrates graphical models and causal inference techniques. It provides implementations for several widely-used causal discovery algorithms, including PC, FCI, RFCI, GES, GIES, SIMY, ARGES, and LiNGAM. Tetrad (Ramsey et al., 2018 ) is a Java package designed for generating and simulating data, estimating parameters, testing hypotheses, predicting outcomes, and searching causal models. CDT (Kalainathan et al., 2020 ) primarily focuses on discovering causal relationships from observational data, ranging from determining pairwise causal directions to full graph modeling. CausalNex (Beaumont et al., 2021 ) is a Python library that integrates machine learning and domain expertise for causal inference using Bayesian networks. It enables users to discover structural relationships within data, analyze complex distributions, and evaluate the effects of potential interventions. gCastle (Zhang et al., 2021b ) is a causal structure learning toolchain developed by Huawei Noah’s Ark Laboratory, offering a Python library for mainstream algorithms and emerging gradient-based approaches. Tigramite (Runge et al., 2023b ) is a Python package designed for time series analysis based on the PCMCI framework. It reconstructs graphical models (conditional independence graphs) from discrete or continuous time series data and generates high-quality graphical representations. causal-learn (Zheng et al., 2024b ) is a Python library built upon the Java-based Tetrad causal discovery platform. The library provides modular code, enabling researchers to implement and extend their own algorithms efficiently.
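As a usage illustration of these toolchains, the hedged sketch below runs the PC algorithm from the causal-learn package on simulated i.i.d. data (the API follows causal-learn's documentation; the data-generating process is our own toy example):

```python
# Hedged sketch: running causal-learn's PC implementation on a toy
# i.i.d. dataset with ground-truth chain x0 -> x1 -> x2.
import numpy as np
from causallearn.search.ConstraintBased.PC import pc

rng = np.random.default_rng(0)
n = 1000
x0 = rng.normal(size=n)
x1 = 0.8 * x0 + 0.3 * rng.normal(size=n)
x2 = 0.5 * x1 + 0.3 * rng.normal(size=n)
data = np.column_stack([x0, x1, x2])

cg = pc(data, alpha=0.05)  # Fisher-z conditional independence tests by default
print(cg.G.graph)          # matrix encoding of the learned CPDAG
```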

3.2 Experimental Data Collection

Utilizing a comparative analysis framework, we systematically process the performance metrics. To enhance the clarity of the results, we select a reference algorithm exhibiting superior performance and then calculate the residuals of the other algorithms relative to this reference. The overall distribution of these residuals is visually represented through violin plots.

Nevertheless, two specific algorithms may exhibit insignificant disparities between their metric values. Hence, we introduce a significance assessment mechanism to treat these relationships more methodically. Given the constraint that each data size consists of merely five datasets, we opt for the non-parametric Mann-Whitney U test (Mann and Whitney, 1947). This approach circumvents the necessity of assuming normality in the data distribution. Significance is determined based on a p-value threshold of 0.05; when the p-value falls below this threshold, it signals a notable divergence in the performance of the two algorithms.
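The sketch below illustrates this significance check with scipy's mannwhitneyu (the metric values are made up for illustration):

```python
# Two-sided Mann-Whitney U test on two algorithms' metric values across
# five datasets; a p-value below 0.05 flags a significant difference.
from scipy.stats import mannwhitneyu

f1_algo_a = [0.71, 0.68, 0.74, 0.70, 0.69]  # hypothetical F1 scores
f1_algo_b = [0.62, 0.60, 0.66, 0.59, 0.64]

stat, p_value = mannwhitneyu(f1_algo_a, f1_algo_b, alternative="two-sided")
print(f"U={stat}, p={p_value:.4f}, significant={p_value < 0.05}")
```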

In addition, a ranking table is formulated to delineate the evaluation outcomes and enhance the discernibility of inter-algorithm performance disparities. This table is ordered in descending order of average metric values. Meanwhile, we calculate the standard deviation of each algorithm's metric values; a reduced standard deviation signifies less dependence of algorithmic performance on data size, which implies enhanced stability.
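A minimal sketch of such a ranking table, with made-up scores for three algorithms across five data sizes, might look as follows:

```python
# Rank algorithms by mean metric value and report the standard deviation
# across data sizes (all numbers are hypothetical).
import pandas as pd

scores = pd.DataFrame({
    "PC":           [0.71, 0.69, 0.74, 0.70, 0.68],
    "GES":          [0.66, 0.64, 0.70, 0.65, 0.63],
    "DirectLiNGAM": [0.75, 0.73, 0.77, 0.74, 0.72],
})  # rows: five dataset sizes; columns: algorithms

table = pd.DataFrame({"mean": scores.mean(), "std": scores.std()})
print(table.sort_values("mean", ascending=False))
```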

4 Causal Discovery Algorithms

Various research articles present diverse taxonomies for causal discovery, yet a universally accepted classification structure does not currently exist. Specifically, in recent years, the rapid advancements in this domain have rendered numerous surveys incomplete and obsolete. It is essential to collect and critically analyse the taxonomy methodologies proposed in papers to establish a systematic categorisation of existing methods. Simultaneously, the devised structure should encompass as many algorithms as possible. Thus, this project diligently compiles and summarises the prevailing methods, effectively partitioning causal discovery into six fundamental categories, as shown in Figure 5: Granger-Based, Conditional Independence-Based, State Space Dynamics-Based, Structural Equation Modelling-Based, Deep Learning-Based, and Hybrid Methods.

[Figure 5: Taxonomy for causal discovery.]

  • Granger-Based Methods: Multivariate Granger Analysis (Arize, 1993); Extended Granger Causality (Chen et al., 2004); Kernel Granger Method (Liao et al., 2009); Copula Granger Method (Hu and Liang, 2014)
  • Condition Independence-Based Methods:
    - Information Theoretic-Based: oCSE (Sun et al., 2015)
    - Causal Network-Based, Constraint-Based: Peter-Clark (PC) (Kalisch and Bühlman, 2007); CD-NOD (Huang et al., 2020); PCMCI (Runge et al., 2019); Fast Causal Inference (FCI) (Entner and Hoyer, 2010); tsFCI (Entner and Hoyer, 2010)
    - Causal Network-Based, Score-Based: Greedy Equivalence Search (GES) (Chickering, 2002a); ES (Yuan and Malone, 2013); GRaSP (Lam et al., 2022); DYNOTEARS (Pamfil et al., 2020)
  • State Space Dynamic-Based Methods: CCM (Sugihara et al., 2012); Cross Map Smoothness (CMS) (Ma et al., 2014); Inner Composition Alignment (IOTA) (Hempel et al., 2011); Pairwise Asymmetric Inference (PAI) (McCracken and Weigel, 2014)
  • Structural Equation Model-Based Methods:
    - LiNGAM-Based: ICA-LiNGAM (Shimizu et al., 2006); DirectLiNGAM (Shimizu et al., 2011); VARLiNGAM (Hyvärinen et al., 2010); RCD (Maeda and Shimizu, 2020); CAM-UV (Maeda and Shimizu, 2021)
    - Additive Noise Models (ANM) (Hoyer et al., 2008): TiMINo (Peters et al., 2013)
    - PNL (Zhang and Hyvarinen, 2012); DAGs with NO TEARS (Zheng et al., 2018); GOLEM (Ng et al., 2020)
  • Deep Learning-Based Methods: Causal Generative Neural Networks (CGNN) (Goudet et al., 2018); DAG-GNN (Yu et al., 2019); Temporal Causal Discovery Framework (TCDF) (Nauta et al., 2019a); Amortized Causal Discovery (ACD) (Löwe et al., 2022); Ordering-Based Causal Discovery with Reinforcement Learning (Wang et al., 2021); GraNDAG (Lachapelle et al., 2019)
  • Hybrid Methods: ARMA-LiNGAM (Kawahara et al., 2011); Scalable Causation Discovery Algorithm (SCDA) (Raghu et al., 2018); PSDR-TE (Mao and Shang, 2017); Information Geometric Causal Inference (IGCI) (Janzing et al., 2012); Non-linear SEM Estimation using Non-Stationarity (NonSENS) (Monti et al., 2020); Neural Granger Causality (NGC) (Tank et al., 2021b); NBCB and CBNB (Bystrova et al., 2024)

4.1 Granger Based Method

Granger causality (GC) is one of the pioneering measurement methods for analysing time-series data. Refined and evolved over several decades, it still maintains an irreplaceable position in the contemporary landscape of causal discovery. The core premise of Granger causality postulates that future events do not affect the present or past, while past events potentially impact both the present and the future. When including the historical information of variables $x$ and $y$ leads to better predictions of variable $y$ than predictions based solely on the information of $y$, variable $x$ is considered a Granger cause of variable $y$. In mathematical notation, this statement can be expressed as follows (McCracken, 2016):

$$P\left(y_{n+1} \in A \mid \Omega_n\right) \neq P\left(y_{n+1} \in A \mid \Omega_n \setminus \bm{x}\right) \qquad (1)$$

Equation (1) illustrates that $x$ Granger causes $y$, wherein the variables $x$ and $y$ represent two discrete time series, and the subscript $n$ corresponds to the time point $t$. The all-encompassing set of information available at all points $t \leq n$ is symbolically denoted as $\Omega_n$, and $\Omega_n \setminus \bm{x}$ denotes this set with the history of $x$ excluded. To ascertain the Granger relationship based on the aforementioned formula, the Vector Autoregressive (VAR) model (Lütkepohl, 2005) stands as the prevailing technique, built upon the premise of data stationarity and equipped to forecast variable values.
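In practice, the VAR-based test can be run with standard libraries. The hedged sketch below (our toy example, using statsmodels' grangercausalitytests, which checks whether the second column helps predict the first) simulates a lag-1 effect of $x$ on $y$:

```python
# Pairwise Granger causality via statsmodels' VAR-based test.
import numpy as np
from statsmodels.tsa.stattools import grangercausalitytests

rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
y = np.zeros(n)
for t in range(1, n):  # x at lag 1 drives y
    y[t] = 0.6 * y[t - 1] + 0.8 * x[t - 1] + 0.1 * rng.normal()

# Does x Granger-cause y?  Expect very small p-values at lag 1.
grangercausalitytests(np.column_stack([y, x]), maxlag=2)
```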

Our attention is directed towards several primary algorithms based on GC. Arize et al. ( 1993 ) put forth the Multivariate Granger Causality (MVGC) analysis method to overcome the limitations of Pairwise Granger Causality (PWGC), which can only deal with bivariate data. However, ensuring linearity in real-world datasets can present a significant challenge. To address this issue, Chen et al. ( 2004 ) introduced an approach known as Extended Granger Causality (EGC), specifically designed to handle nonlinear data. An alternative model catering to nonlinear time series is the Kernel Granger Causality (KGC) method (Marinazzo et al., 2008 ; Liao et al., 2009 ; Marinazzo et al., 2011 ) , showcasing notable attributes such as high accuracy and flexibility. When confronted with continuous time series data, Hu et al. ( 2014 ) proposed the Copula Granger method, which is capable of uncovering nonlinear and higher-order causal relationships (Kim et al., 2020 ; Jang et al., 2022 ) .

Although Granger-based causality methods have a long history of development, they still struggle to handle complex causal relationships. Their main drawback is the inability to identify latent confounders and instantaneous causal effects. Therefore, Granger-based methods are often combined with other methods to achieve mutual development.

4.2 Condition Independence Based Method

The conditional independence-based method exhibits a close association with probability. By quantifying the mutual information (Runge, 2018 ) between variables, this approach enables the determination of causal relations and causal strength. A fundamental concept in this context is the transfer entropy (Schreiber, 2000 ) , which is defined as follows:

$$T_{\bm{x} \rightarrow \bm{y}} = \sum p\left(y_{n+1},\, y_n^{(k)},\, x_n^{(l)}\right) \log \frac{p\left(y_{n+1} \mid y_n^{(k)},\, x_n^{(l)}\right)}{p\left(y_{n+1} \mid y_n^{(k)}\right)} \qquad (2)$$

In contrast to Shannon entropy, the transfer entropy is computed via the Kullback entropy (Kullback, 1997). In this equation, $x_n$ represents the value of variable $x$ at the $n$th time point, and likewise for the variable $y$, with the superscripts indicating the time-delay lengths. When $T_{\bm{x} \rightarrow \bm{y}} - T_{\bm{y} \rightarrow \bm{x}} > 0$, it can be inferred that variable $x$ is the cause of variable $y$; conversely, variable $y$ is the cause of variable $x$.
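For intuition, Equation (2) with lags $k = l = 1$ can be estimated from data by discretizing the series and counting joint occurrences. The following rough sketch (our illustration, not a production-grade estimator) does exactly that:

```python
# Histogram-based estimator of transfer entropy T_{x->y} with lag 1,
# following the definition in Eq. (2); crude, for illustration only.
import numpy as np

def transfer_entropy(x, y, bins=8):
    # Discretize both series so joint probabilities can be counted.
    xd = np.digitize(x, np.histogram_bin_edges(x, bins))
    yd = np.digitize(y, np.histogram_bin_edges(y, bins))
    trips = np.stack([yd[1:], yd[:-1], xd[:-1]], axis=1)  # (y_{n+1}, y_n, x_n)
    vals, counts = np.unique(trips, axis=0, return_counts=True)
    p = counts / counts.sum()
    te = 0.0
    for (y1, y0, x0), p_joint in zip(vals, p):
        p_y0x0 = p[(vals[:, 1] == y0) & (vals[:, 2] == x0)].sum()
        p_y0 = p[vals[:, 1] == y0].sum()
        p_y1y0 = p[(vals[:, 0] == y1) & (vals[:, 1] == y0)].sum()
        # p(y1|y0,x0) / p(y1|y0), weighted by the joint probability
        te += p_joint * np.log((p_joint / p_y0x0) / (p_y1y0 / p_y0))
    return te
```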

Causal discovery algorithms based on conditional independence can be categorized into two distinct groups. The first category is the information-theoretic-based approach, with Optimal Causation Entropy (oCSE) (Sun et al., 2014 , 2015 ) being a representative algorithm. oCSE is a two-step discovery algorithm explicitly designed for short time series data. The second approach is causal network-based, where the optimal causal graph is determined through statistical testing. The Peter-Clark (PC) algorithm (Kalisch and Bühlman, 2007 ) has gained widespread adoption and has proven effective in analyzing high-dimensional time series using causal graphs. Recognizing the potential interference of latent confounders in causal detection, corresponding approaches have been developed. A notable example is the Fast Causal Inference (FCI) algorithm (Zhang, 2008 ; Entner and Hoyer, 2010 ; Spirtes et al., 2013 ) , a classical method that explicitly accounts for unobserved confounders. To further enhance control over false positive rates, Runge et al. introduced the PCMCI method and its variants (Runge et al., 2019 ; Runge, 2020 ; Gerhardus and Runge, 2020 ) by incorporating the MCI test into the PC algorithm. PCMCI is an improvement of PC in time series, which can detect contemporaneous and time-delay effects.

Conditional Distribution Similarity Statistic (CDS) (Fonollosa, 2019 ) was proposed to detect the causal direction of bivariates. This method measures the statistical characteristics of the joint distribution of marginal variance data after conditioning the bins. This algorithm has been proven to be robust as it has a high AUC in ChaLearn causal pair challenges.

Constraint-based causal Discovery from heterogeneous/NOnstationary Data (CD-NOD) (Huang et al., 2020 ) is another framework designed to discover causal relationships in data where generating processes change over time or across domains. It detects changing local mechanisms, recovers causal structures, and estimates the driving force behind nonstationarity. This nonparametric method leverages data heterogeneity and connects nonstationarity with soft interventions, demonstrating efficacy on synthetic and real-world datasets like task-fMRI and stock market data.

Here we introduce classic score-based approaches. Compared to the PC algorithm, the Greedy Equivalence Search (GES) algorithm, proposed by Chickering (2002a), shows enhanced robustness when dealing with nonstationary data. Greedy Relaxations of the Sparsest Permutation (GRaSP) (Lam et al., 2022; Andrews et al., 2023) is designed to efficiently identify DAGs representing causal structures from observational data. It builds on permutation-based reasoning and introduces a novel operation called "tuck" to relax the assumptions required by previous methods like Triangle Sparsest Permutation (TSP) and Edge Sparsest Permutation (ESP). GRaSP consists of three tiers: GRaSP0, GRaSP1, and GRaSP2, each progressively weakening the assumptions and increasing the ability to recover sparser permutations. GRaSP2, the most relaxed form, outperforms several state-of-the-art algorithms in simulations, demonstrating scalability and accuracy for dense graphs and those with over 100 variables. To address causal discovery problems in temporal series, Pamfil et al. (2020) introduced DYNOTEARS, a novel Bayesian network learning algorithm that utilizes score constraints to ascertain the edges within the causal structure graph. DYNOTEARS is capable of effectively handling both instantaneous and delayed causality, making it a versatile tool for causal inference in time-series data.

Compared to Granger-based methods, conditional independence-based methods can handle more complex data scenarios, such as high-dimensional data, instantaneous causality, and latent variables. However, these methods generally require the faithfulness assumption and have limitations in determining causal direction, i.e., some causal links remain unoriented. Despite these drawbacks, they are well-suited for identifying causal skeleton graphs.

4.3 State Space Dynamic Based Method

The state space dynamics-based method can be regarded as a complementary approach to address a category of data not encompassed by GC. This method investigates the causality of variables within weakly coupled dynamic systems, significantly enhancing the causal discovery capability in ecological, dynamical, and other relevant domains. This method draws inspiration from the Takens theorem (Takens, 1981) and computes the bidirectional cross-correlation between two variables to establish a cross-mapping. To be specific, variable $x$ causes $y$ when $C_{xy} > C_{yx}$ is satisfied, wherein $C_{xy}$ and $C_{yx}$ represent the Convergent Cross-Mapping (CCM) correlations from $x$ to $y$ and from $y$ to $x$, respectively. The calculation formula for the CCM correlation is as follows, where $\rho$ denotes the Pearson correlation coefficient.

$$C_{xy} = \rho\big(y(t),\, \hat{y}(t)\,|\,M_x\big), \qquad C_{yx} = \rho\big(x(t),\, \hat{x}(t)\,|\,M_y\big) \tag{3}$$

We use a classic example to explain cross-mapping further, using variable $x$ to construct the shadow manifold $M_x$, and $y$ to construct $M_y$. If $x$ causes $y$, then the neighbouring points of a given point in $M_y$ should better identify the neighbouring points of the corresponding point in $M_x$. Supposing a delay of 1, the shadow manifold graphs from the two directions are shown in Figure 6 , where $x$ causes $y$.
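To make the cross-mapping concrete, the following is a minimal numpy sketch of CCM. The embedding dimension E, delay tau, simplex-style exponential weighting, and the assignment of the cross-map direction to $C_{xy}$ are illustrative assumptions rather than the settings of any particular implementation:

import numpy as np

def delay_embed(x, E=3, tau=1):
    # Shadow manifold M_x: row i is the delay vector (x_i, x_{i+tau}, ..., x_{i+(E-1)tau})
    n = len(x) - (E - 1) * tau
    return np.array([x[i : i + (E - 1) * tau + 1 : tau] for i in range(n)])

def ccm_corr(x, y, E=3, tau=1):
    # Skill of estimating y from the shadow manifold of x (taken here to play the role of C_xy)
    Mx = delay_embed(x, E, tau)
    y_t = y[(E - 1) * tau :]                  # align y with the rows of M_x
    preds = np.empty(len(Mx))
    for i in range(len(Mx)):
        d = np.linalg.norm(Mx - Mx[i], axis=1)
        d[i] = np.inf                         # exclude the query point itself
        nn = np.argsort(d)[: E + 1]           # E+1 nearest neighbours on the manifold
        w = np.exp(-d[nn] / max(d[nn][0], 1e-12))
        preds[i] = np.dot(w / w.sum(), y_t[nn])
    return np.corrcoef(y_t, preds)[0, 1]      # Pearson rho between observed and cross-mapped y

# Following the convention in the text: x is inferred to cause y if C_xy > C_yx.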


Sugihara et al. ( 2012 ) introduced the concept of CCM to infer causality. This method proves advantageous for non-separable and weakly connected dynamic systems. An essential feature of CCM is convergence, which means that the longer the time series used (with a larger sample size), the smaller the estimated error of the obtained cross-mapping. To overcome this limitation that necessitates long time series data, Ma et al. ( 2014 ) developed Cross Map Smoothness (CMS), specifically designed to handle varying data sizes, particularly short time series. Additionally, Inner Composition Alignment (IOTA) (Hempel et al., 2011 ; Wang et al., 2014 ) offers an alternative technique for short-time series analysis. McCracken et al. ( 2014 ) proposed Pairwise Asymmetric Inference (PAI) as an exploratory tool for analysing high-dimensional dynamic systems. PAI aids in uncovering causal relationships within complex systems by evaluating asymmetries in pairwise interactions.

The most significant advantage of state space dynamic based methods is their efficacy in deterministic systems, making them the preferred choice for such specialized data scenarios. However, their primary disadvantage lies in their limited applicability and difficulty in handling datasets with time-variant noises.

4.4 Structural Equation Model Based Method

The three methods above are well-suited for identifying time delay causality but may not necessarily be adept at detecting instantaneous causality. On the other hand, the Structural Equation Model (SEM) based method represents a significant advancement and revolution in the realm of instantaneous causal discovery. This method ascertains the edges of DAG by establishing a structural equation to solve the coefficient matrix. The most basic form of the structural equation (Shimizu et al., 2006 ) is as follows:

$$\mathbf{X} = \mathbf{B}\mathbf{X} + \mathbf{E} \tag{4}$$

The matrix $\mathbf{B}$ is referred to as the coefficient matrix, with its rows and columns representing the two dimensions of cause and effect. Once $\mathbf{B}$ is determined, the causal relations can be discerned. Additionally, the matrix $\mathbf{E}$ represents the noise matrix in the model, usually with non-Gaussian noise.
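As a worked instance of Equation (4), the hypothetical snippet below generates observations from a two-variable linear non-Gaussian SEM; since $\mathbf{X} = \mathbf{B}\mathbf{X} + \mathbf{E}$, the data are obtained as $\mathbf{X} = (\mathbf{I} - \mathbf{B})^{-1}\mathbf{E}$. The edge weight and noise law are arbitrary choices for illustration:

import numpy as np

rng = np.random.default_rng(0)
B = np.array([[0.0, 0.0],
              [0.8, 0.0]])                    # one edge: variable 1 -> variable 2, weight 0.8
n = 10000
E = rng.uniform(-1.0, 1.0, size=(n, 2))       # independent non-Gaussian (uniform) noise
X = E @ np.linalg.inv(np.eye(2) - B).T        # solves X = B X + E row-wise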

The Linear Non-Gaussian Acyclic Model (LiNGAM) provides a directed acyclic graph that reveals instantaneous causal relations between variables. LiNGAM assumes a linear data-generating process with non-Gaussian, independent noise, and solves it using Independent Component Analysis (ICA) (Lee and Lee, 1998 ; Naik and Kumar, 2011 ) . Since ICA implementations typically employ FastICA or other gradient-based algorithms, they may converge to local rather than global optima. To address this issue, Shimizu et al. ( 2011 ) proposed the DirectLiNGAM algorithm. Compared to ICA-LiNGAM, DirectLiNGAM produces more stable and reliable results, though it has some drawbacks: its computational efficiency is lower than that of ICA-LiNGAM, and its assumptions are relatively strict; in real-world scenarios, data generation mechanisms are often nonlinear and do not conform to them. Hyvärinen et al. ( 2013 ) proposed a measure for determining pairwise causal direction based on likelihood ratio tests. The objective of that study was to develop an SEM-based method that performs more effectively on real-world data with limited sample sizes. In simulations using brain imaging (fMRI) data, this method demonstrated significantly better performance than ICA-LiNGAM and other approaches.

To address the gap in the LiNGAM algorithm family concerning delayed causality, Hyvärinen et al. ( 2010 ) designed the VARLiNGAM algorithm. VARLiNGAM operates in two steps: first, it predicts time lag effects using a Vector Autoregression (VAR) model; second, it estimates instantaneous causality by applying LiNGAM.
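A brief usage sketch of this two-step procedure, assuming the open-source lingam Python package (class, parameter, and attribute names follow that package's documented API and may vary across versions):

import numpy as np
import lingam

X = np.random.randn(1000, 3)                  # placeholder multivariate time series
model = lingam.VARLiNGAM(lags=1)              # step 1: a VAR model captures lagged effects
model.fit(X)                                  # step 2: LiNGAM is applied to the VAR residuals
B0, B1 = model.adjacency_matrices_            # instantaneous (B0) and lag-1 (B1) coefficients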

Building upon the insights from LiNGAM, Hoyer et al. ( 2008 ) introduced Additive Noise Models (ANM) to detect nonlinear causal relations, emphasizing that nonlinearities offer valuable identification power. One shortcoming of this algorithm is its high computational cost, as it determines the direction between pairs of variables one at a time, making it a pairwise-causality method.

The Post-Nonlinear (PNL) causal model (Zhang and Hyvarinen, 2012 ; Zhang et al., 2015 ; Uemura et al., 2022 ) addresses the complexities of nonlinear effects, inner noise, and measurement distortions in observed variables for causal discovery. Representing each variable as a function of its direct causes, an independent disturbance, and a post-nonlinear distortion, PNL can distinguish between causes and effects, especially in non-Gaussian scenarios. The model’s identifiability has been extensively studied, revealing that it can generally identify causal directions except in specific conditions. Empirical results demonstrate its efficacy in various real-world data sets, making it a robust tool for causal inference in complex systems.

Peters et al. ( 2013 ) proposed Time Series Models with Independent Noise (TiMINo) to capture both lagged and instantaneous effects. This model is based on nonlinear independence tests and can perform well even when the dataset does not satisfy the causal sufficiency assumption.

DAGs with NO TEARS (Zheng et al., 2018 ) is a causal discovery method that uses continuous optimization to learn the structure of Directed Acyclic Graphs (DAGs) from observational data. The name “NO TEARS” stands for “Non-combinatorial Optimization via Trace Exponential and Augmented lagRangian for Structure learning,” emphasizing that it replaces combinatorial search over graphs with smooth optimization. The algorithm learns a linear Structural Equation Model (SEM) linking the features in a causal graph, encoding acyclicity as a smooth equality constraint and optimizing the data fit together with an l1 sparsity penalty using augmented-Lagrangian, gradient-based methods. A final thresholding step zeroes out low-weight edges to promote sparsity. The process iterates until convergence criteria are met and returns a DAG representing the inferred causal relationships. However, Kaiser and Sipos ( 2021 ) analyzed the lack of scale invariance in the NOTEARS algorithm and concluded that this limitation makes NOTEARS unsuitable for identifying true causal relationships from data.
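For illustration, the acyclicity characterization at the core of NOTEARS fits in a few lines; this sketches only the constraint, not the full augmented-Lagrangian solver:

import numpy as np
from scipy.linalg import expm

def notears_acyclicity(W):
    # h(W) = tr(exp(W * W)) - d, where W * W is the elementwise (Hadamard) square;
    # h(W) is zero if and only if the weighted graph W encodes a DAG
    return np.trace(expm(W * W)) - W.shape[0]

W_dag = np.array([[0.0, 1.5], [0.0, 0.0]])    # acyclic weighted graph: h is (numerically) 0
W_cyc = np.array([[0.0, 1.0], [1.0, 0.0]])    # two-cycle: h is strictly positive
print(notears_acyclicity(W_dag), notears_acyclicity(W_cyc))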

Regression Error based Causal Inference (RECI) (Blöbaum et al., 2018 ) addresses the problem of inferring the causal relationship between two variables by comparing the least-squares errors of predictions in both possible causal directions. Blöbaum et al. emphasize that RECI can have a significantly lower computational cost than ANM while delivering comparable or even superior results. Additionally, RECI is straightforward to implement and apply.

GOLEM, introduced by Ng et al. ( 2020 ), is a continuous likelihood-based method for causal discovery. It uses a score-based approach with soft sparsity and DAG constraints to maximize the likelihood of a linear Gaussian model. GOLEM employs two objective functions to account for noise variances and uses an l1 penalty to control complexity. It formulates an unconstrained optimization problem, ensuring the graph remains a DAG under reasonable assumptions. The algorithm utilizes gradient-based optimization methods, with the Adam optimizer and GPU acceleration. A post-processing step removes low-weight edges to enhance performance. GOLEM effectively recovers DAG structures while managing soft constraints.

Repetitive Causal Discovery (RCD) (Maeda and Shimizu, 2020 ; Maeda, 2022 ) is a method for identifying causal structures in data affected by latent confounders. It repeatedly infers causal directions between small sets of observed variables, determining if relationships are influenced by latent confounders. The resulting causal graph uses bi-directed arrows to indicate variables sharing the same latent confounders and directed arrows for causal directions between variables not affected by the same latent confounder. Experimental validation with simulated and real-world data shows that RCD effectively identifies latent confounders and causal directions.

Causal Additive Models with Unobserved Variables (CAM-UV) (Maeda and Shimizu, 2021 ) handle causal discovery in the presence of unobserved variables, particularly for nonlinear causal relationships. This model extends causal additive models by accounting for both unobserved common causes and intermediate variables. CAM-UV identifies all theoretically possible causal relationships without bias from unobserved variables, avoiding incorrect inferences. Empirical results from artificial and simulated fMRI data confirm CAM-UV’s effectiveness in inferring causal structures despite the presence of unobserved variables.

The advantage of Structural Equation Model based methods is their applicability to a wide range of data types. LiNGAM and its variant algorithms can handle i.i.d. data, time series, instantaneous causality, hidden confounders, mixed data, and other data types without requiring faithfulness assumptions. However, these methods predominantly rely on linear relationships, and only a few algorithms are capable of handling nonlinear relations.

4.5 Deep Learning Based Method

Deep learning-based methods have emerged as powerful tools in causal discovery, closely connected with machine learning. These methods offer significant technical advantages, particularly in processing vast amounts of data. Notably, deep learning-based causal algorithms can better infer hidden variables using network information.

For instance, Goudet et al. ( 2018 ) designed Causal Generative Neural Networks (CGNN) to address the challenges posed by latent variables in causal analysis. CGNN is an algorithm that infers the optimal causal direction on a causal skeleton diagram, which belongs to pairwise causality. Through testing on both artificial and real-world datasets, CGNN has demonstrated advanced performance in handling potential confounders.

DAG-GNN, developed by Yu et al. ( 2019 ), combines Graph Neural Networks (GNNs) with a score-based approach to learn DAGs from data. It uses GNNs for node embeddings to model feature dependencies and detect causal relationships. The method starts by embedding features using GNNs, then defines a score to evaluate the causal structure. It formulates an optimization problem to maximize this score using gradient-based techniques like stochastic gradient descent or Adam. Causal constraints ensure acyclicity. The data is split into training and validation sets to optimize the score and train the model. After achieving optimal performance on the validation set, the algorithm returns a DAG representing causal relationships.

Another noteworthy algorithm is the Temporal Causal Discovery Framework (TCDF) (Nauta et al., 2019a ) , which effectively handles both latent and instantaneous causal effects. TCDF adopts an attention mechanism in convolutional neural networks. The attention coefficients of different variables, learned by the network, can be interpreted as the degree of correlation between variables. If the attention coefficient is below a certain threshold, it indicates no causal relationship between the two variables.

GraNDAG (Lachapelle et al., 2019 ) is a novel score-based approach for learning DAGs from observational data. It adapts a recent continuous constrained optimization formulation to accommodate nonlinear relationships between variables using neural networks. This method effectively models complex interactions and avoids the combinatorial nature of the problem. By comparing GraNDAG to existing continuous optimization methods and nonlinear greedy search methods, it has been demonstrated that GraNDAG outperforms current continuous methods on most tasks and remains competitive with existing greedy search methods on important causal inference metrics.

Ordering-Based Causal Discovery with Reinforcement Learning (CORL), developed by Wang et al. ( 2021 ), combines ordering-based causal discovery with reinforcement learning techniques (Zhu et al., 2019 ) to learn causal relationships by generating and refining an ordering of variables. The algorithm treats the problem as a sequential decision-making task, where a reinforcement learning agent arranges variables to approximate true causal relationships. A reward function provides feedback, incentivizing correct orderings and penalizing incorrect ones. The task is formulated as a Markov Decision Process (MDP), with states representing the current ordering and actions selecting the next variable. The agent is trained using reinforcement learning algorithms like Q-learning or Proximal Policy Optimization to optimize long-term rewards. Balancing exploration and exploitation is crucial. A post-processing step, such as local search, refines the ordering to improve accuracy. The algorithm ultimately returns an optimal causal graph representing the relationships between features.

More recently, Löwe et al. ( 2022 ) introduced the Amortized Causal Discovery (ACD) algorithm for time series, which is effective with small data sample sizes and demonstrates efficacy in dynamic systems. This model utilizes shared information between dynamic system variables to identify confounders in additive noise datasets.

The advantage of deep learning based methods lies in their capacity to handle datasets with large sample sizes and numerous variables. However, their drawbacks include long running times, low efficiency, and suboptimal performance on short time series.

4.6 Hybrid Method

Hybrid methods combine two or more algorithms to complement and optimize each other, enhancing the ability to discover causality. These methods leverage the strengths of different approaches to address their individual limitations and improve overall performance.

One illustrative hybrid approach is the Autoregressive Moving Average - Linear Non-Gaussian Acyclic Model (ARMA-LiNGAM) (Kawahara et al., 2011 ) , which combines Granger Causality (GC) and Structural Equation Models (SEM). This composite method resolves both instantaneous and delayed causality by leveraging the attributes of LiNGAM and ARMA models. ARMA-LiNGAM’s integration allows for a more comprehensive analysis of time series data, accommodating both immediate and lagged effects. Incorporating deep learning techniques with GC models, Neural Granger Causality (NGC) (Tank et al., 2021b ; Wang et al., 2023 ) stands out. NGC utilizes the Causal Multilayer Perceptron (CMLP) model to train data, thereby enhancing the accuracy of causal inference tasks. By combining the predictive power of neural networks with the interpretability of Granger causality, NGC offers a robust framework for identifying causal relationships in complex datasets.

Janzing et al. ( 2012 ) introduced Information Geometric Causal Inference (IGCI) to address the nonlinear challenges encountered by the Additive Noise Model (ANM) algorithm. IGCI enhances causal inference by incorporating information entropy, providing a more effective method for dealing with nonlinear data structures. This approach allows for better differentiation between cause and effect in scenarios where traditional linear models fall short.

Mao et al. ( 2017 ) extended the application of the Convergent Cross Mapping (CCM) algorithm from bivariate to multivariate analysis by incorporating transfer entropy. This integration, named Phase State Delay Reconstruction - Transfer Entropy (PSDR-TE), effectively addresses the limitation of the CCM algorithm, which was originally designed for detecting bivariate relationships. PSDR-TE expands the capability of causal discovery to more complex multivariate systems, improving the detection of causal links across multiple variables.

Raghu et al. ( 2018 ) proposed the Scalable Causation Discovery Algorithm (SCDA), which combines structural equation model-based and conditional independence-based methods. SCDA provides a solution for mixed data containing both continuous and discrete sequences. By integrating these two methodologies, SCDA can handle a broader range of data types and improve the robustness of causal inference in heterogeneous datasets.

Monti et al. ( 2020 ) proposed an algorithm called Non-linear SEM Estimation using Non-Stationarity (NonSENS) for bivariate data. This approach employs a deep learning-based method, Time Contrastive Learning (TCL), within a SEM framework, allowing for arbitrary instantaneous nonlinear relationships without assuming additive noise. Notably, the direction of effect in arbitrary nonlinear SEMs is proved to be identifiable (Hyvärinen et al., 2024 ) .

Bystrova et al. ( 2024 ) developed two novel algorithms, NBCB and CBNB, which integrate SEM with a constraint-based approach to infer causal graphs from time series. Both approaches are capable of inferring various types of causal graphs including instantaneous and lagged relationships. These algorithms exhibit effectiveness and robustness across both synthetic and real-world datasets.

In summary, hybrid methods in causal discovery leverage the strengths of multiple algorithms to address their respective weaknesses. By combining techniques such as Granger causality, structural equation models, neural networks, and information entropy, these hybrid approaches offer powerful tools for uncovering causal relationships in diverse and complex datasets.

5 Empirical Study Design

The aim of this section is to design experiments to answer the three RQs in Section 1: RQ1 concerns the comparison of algorithm performance, RQ2 real-world applicability, and RQ3 generalization to unknown datasets.

In light of this, the experimental framework is structured across four distinct phases. The inaugural phase involves conducting a comparative assessment of algorithms applied to synthesised datasets with specific features while concurrently evaluating a range of performance metrics. Subsequently, the second phase encompasses presenting and analysing outcomes derived from the initial stage to extract meaningful insights. Transitioning to the third phase, real-world datasets are engaged for testing, utilising the insights garnered in the preceding phase to ascertain the optimal algorithm. This stage aims to verify whether the test results are consistent with the predicted optimal algorithm. The fourth and final phase entails deploying diverse data processing and testing methodologies to ascertain the metadata of the time series datasets. This, in turn, facilitates the extrapolation of insights from the second phase to previously unexplored datasets.

5.1 Datasets

This experimental inquiry necessitates two dataset categories: synthetic datasets designed to explore underlying patterns and real-world datasets serving the purpose of validation. The data generation structure of the artificial dataset is shown in Figure 7 .


Given the categorization of testing algorithms into three classes, a concomitant preparation of diverse composite datasets becomes imperative. The first category entails a data generator tasked with producing causal pairs, whereby the data generation tool within the causal discovery toolbox (Kalainathan et al., 2020 ) was employed. The second and third categories encompass instantaneous and time-delay causality, for which the data generator in Tigramite framework, posited by Runge et al. ( 2023a ), was harnessed for data synthesis. In the fourth category, characterized by i.i.d. data, we adopted the data generation model established within the gCastle package (Zhang et al., 2021b ) .

Subsequently, we considered the dataset sizes for both time series and i.i.d. data. For time series data, the experimental framework established distinct time series lengths ranging from 50 to 300 time points for small-scale datasets, 300 to 1000 for medium-scale datasets, and 1000 to 3000 for large-scale datasets. For i.i.d. data, the sizes were similarly categorized into small (50, 100, 150, 200, 250), medium (300, 440, 580, 720, 860), and large (1000, 1400, 1800, 2200, 2600).

The design of dataset attributes necessitates attention to causal relationships and noise distribution types. Causal relationships within this context are bifurcated into linear and nonlinear relationships. Linear relationships are generated through polynomial operations on dataset variables, while nonlinear relationships involve trigonometric operations. The noise distribution types encompass Gaussian noise with parameters defined by a mean of 0 and a standard deviation of 1, and uniform noise spanning the interval (0,1).
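As a hypothetical illustration of the i.i.d. generation step, the gCastle package exposes generators roughly as follows (class and parameter names are indicative of that package's API and may differ across versions; this is a sketch, not the exact generation script used here):

from castle.datasets import DAG, IIDSimulation

# Random ground-truth DAG, then linear-Gaussian i.i.d. samples drawn from it
W = DAG.erdos_renyi(n_nodes=10, n_edges=20, weight_range=(0.5, 2.0), seed=42)
sim = IIDSimulation(W=W, n=1000, method='linear', sem_type='gauss')
X, true_dag = sim.X, sim.B                    # samples and the binary ground-truth adjacency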

Building upon the framework above, five datasets are generated for each data size, each processed ten times to mitigate runtime-induced biases, resulting in a total of 180 distinct datasets for comprehensive algorithm evaluation. All generated datasets adhere to the causal sufficiency assumption and possess stability, prerequisites fundamental to the operation of the algorithmic processes.

Regarding authentic datasets, those featuring established ground truth are scarce, particularly within the domain of time series. Consequently, this study incorporates two verifiable datasets. The first is the “Tuebingen” dataset, which comprises 100 real cause-effect pairs. Additionally, the “fMRI” dataset, which investigates the Blood Oxygen Level Dependent (BOLD) signal across 28 distinct intrinsic brain networks, is also integrated into the study.

5.2 Evaluation Metrics

In devising this project’s evaluation criteria, we tried to encompass multiple aspects as comprehensively as possible. Recognizing that a singular indicator might introduce bias, we select five distinct indicators for our assessment.

The initial metric utilized is the F1 score, which serves as a prevalent evaluation criterion in causal discovery due to its capacity to assess the model’s overall performance. The F1 score is computed as follows:

$$\mathrm{F1} = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} = \frac{2\,\mathrm{TP}}{2\,\mathrm{TP} + \mathrm{FP} + \mathrm{FN}} \tag{5}$$

To augment the evaluation of the model’s robustness, specifically its capacity to capture the trade-off between the true positive rate (TPR) and false positive rate (FPR), we employed the Area Under the Receiver Operating Characteristic (AUROC) as a metric, which is obtained by computing the area under the ROC curve.

For the first two indicators, higher numerical values correspond to better performance, with the emphasis on assessing accurate inference. To assess the extent of false causality present in the model, we employ the False Positive Rate (FPR), computed as follows:

$$\mathrm{FPR} = \frac{\mathrm{FP}}{\mathrm{FP} + \mathrm{TN}} \tag{6}$$

Subsequently, we incorporate a graph-based causal discovery metric called Structural Hamming Distance (SHD). This metric directly reveals the number of incorrectly inferred edges by comparing the differences between the ground truth and the estimated causal graph.
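To make these graph metrics concrete, the following minimal sketch (an illustrative implementation, not the exact evaluation code of this study) computes F1, FPR, and a simplified SHD from binary adjacency matrices; full SHD implementations additionally count a reversed edge as a single error:

import numpy as np

def graph_metrics(true_adj, est_adj):
    t, e = true_adj.astype(bool), est_adj.astype(bool)
    tp = np.sum(t & e)                        # correctly inferred edges
    fp = np.sum(~t & e)                       # spurious edges
    fn = np.sum(t & ~e)                       # missed edges
    tn = np.sum(~t & ~e)                      # correctly absent edges
    f1 = 2 * tp / (2 * tp + fp + fn)
    fpr = fp / (fp + tn)
    shd = np.sum(t != e)                      # simplified: every mismatched entry counts once
    return f1, fpr, shd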

Lastly, we meticulously recorded the run time of each algorithm, given that the time cost was considered a significant aspect of this experiment. This particular indicator will serve as a crucial criterion for assessing the efficiency of the algorithms.

5.3 Experimental Algorithms

By collecting resources on GitHub, we introduce in this section the algorithms selected for the experiment, as well as their source code and packages. We mainly use four packages, Causal Discovery Toolbox (CDT) (2019), gCastle (2021a), Causal Discovery for Time Series (CD_TS) (2022), and causal-learn (2024a), to implement the testing algorithms.

As shown in Table 6 , we select MVGC and PWGC, implemented in CD_TS (2022), from the repertoire of GC methods as the subjects of experimentation. It is worth noting that MVGC exclusively addresses delayed causality while PWGC addresses pairwise causality, so their applicability has certain limitations.

In the context of the conditional independence-based method, Runge et al. ( 2023b ) developed PCMCI and its variant algorithms for time series. The oCSE, tsFCI, and DYNOTEARS algorithms are also included in the experiment for time series. Considering i.i.d. data, we choose PC, FCI, GES, GRaSP, ES, and CDS. These methods are chosen for their robust performance in different scenarios, providing a comprehensive evaluation of algorithmic capabilities.

Most algorithms rooted in the state space dynamic-based approach primarily concentrate on resolving the directionality of causal pairs. Among these, we examine the classical CCM and PAI algorithms (Javier, 2021 ) , categorizing them as instances of pairwise causality. Additionally, IGCI and ANM are incorporated as pairwise causality algorithms.

As for the structural equation model-based approach, our emphasis lies on exploring a variant of the LiNGAM algorithm (Ikeuchi et al., 2023 ) tailored for i.i.d. data and time series data, referred to as ICALiNGAM, DirectLiNGAM, RCD, and VARLiNGAM. The TiMINO algorithm is excluded from consideration due to its documented inferiority compared to PCMCI and TCDF (Nauta et al., 2019a ) .

The TCDF algorithm (2019b) frequently emerges in diverse surveys and holds a pivotal position within the field, thus making it a suitable choice as a representative algorithm for deep learning-based methods. For i.i.d. data, we include three algorithms from the gCastle package: DAG-GNN, CORL, and GraNDAG. Besides, we do not include CGNN in this experiment due to its prolonged running time, which could impede the efficiency of the overall analysis.

Within the hybrid method, we opt to employ both NeuralGC (Tank et al., 2021a ) and IGCI. NeuralGC possesses the ability to handle both delay and instantaneous causal relationships, while IGCI is specialized in resolving causal pairs.

To present a lucid exposition of the algorithms employed in this experiment, we mark the testing algorithms with * in Table 6 . It is essential to highlight that, in pursuit of optimal performance for each algorithm, a series of individual tests was conducted. Based on these tests, the default values of several algorithms were modified to ensure valid results.

Method Algorithm Time-series i.i.d. Faithfulness CMC Sufficiency Linear Software
Granger-based PWGC*
MVGC*
EGC
KGC
CopulaGC
Condition independence-based oCSE*
PC*
PCMCI*
FCI*
tsFCI*
GES*
GRaSP*
ES*
DYNOTEARS*
CDS*
State space dynamics-based CCM*
CMS
IOTA
PAI*
Structural equation model-based ICALiNGAM*
DirectLiNGAM*
VARLiNGAM*
RECI*
RCD*
CAM-UV
ANM*
TiMINO*
PNL
NOTEARS*
GOLEM*
Deep learning-based CGNN
DAG-GNN*
TCDF*
ACD
CORL*
GraNDAG*
Hybrid ARMA-LiNGAM
NeuralGC*
IGCI*
SCDA
PSDR-TE

5.4 Environment Settings

Here we will expound upon the environment’s configuration of the entire code architecture, encompassing domains such as software provisioning, hardware parameters, database integration, and related facets.

The instantiation of this project is grounded in the Python programming language, with compilation facilitated through the PyCharm (professional edition) software. Upon successful compilation, the resultant code is subsequently uploaded onto the designated server for operational deployment.

The particular details of the server infrastructure are outlined herewith: the server is established on the Ubuntu operating system, boasting four Graphics Processing Units (GPUs) operating in tandem. These GPUs are compatible with version 11.4 of the Compute Unified Device Architecture (CUDA). Notably, each GPU has a power rating of 350 Watts and 24,268 Megabytes of memory.

The source code’s comprehensive architecture encompasses five submodules, each fulfilling designated functions. The first submodule, termed “dataset”, encompasses the storage of datasets and ground-truth sets in CSV format, alongside the capacity to generate synthetic datasets. The second submodule, designated “examples”, serves the dual purpose of harmonizing the testing algorithms and facilitating the visual representation of the obtained outcomes. The third submodule, “save”, is dedicated to archiving the causal discovery results derived from algorithmic testing. The fourth submodule, designated “src”, integrates external configuration files, ensuring the seamless operational functionality of the codebase. The final submodule, labelled “doc”, elucidates practical usage instances of this code library and presents information regarding the versions of the installed packages.

6 Empirical Results and Analyses

Within this section, the research questions posited in the preceding section will be addressed systematically, resulting in generalizable findings according to experimental plots and tables.

6.1 Answer to RQ1: Comparison of Algorithm Performance

Through meticulous experimentation, we have derived comparative graphs and ranking tables for algorithms across four distinct causality types. Since the ranking tables only display the average of each metric across all data sizes, the violin plots in Appendix A supplement them by showing how the recommended algorithms change at specific sample lengths. This section will subsequently present an in-depth analysis of each category.

The causal relationships, whether linear or nonlinear, in conjunction with noise distributions (Gaussian or non-Gaussian), yield four distinct subgraphs, each representing different data modalities. Dataset time lengths are categorized as follows: small (50, 300), medium (300, 1000), and large (1000, 5000). Each dataset is processed ten times to compute mean values.

6.1.1 Pairwise causality

Figure A illustrates the experimental outcomes for the pairwise causal discovery algorithms. We identify an algorithm that consistently performs well as the reference algorithm. For instance, within linear relationships, the CCM algorithm serves as the benchmark, whereas the PAI algorithm is selected for nonlinear relationships. The deviations between the algorithms and their respective reference counterparts are depicted using violin plots in Appendix A .

Acknowledging the limitations of violin plots in conveying precise numerical results, we supplement our analysis with a ranking table. Table 7 quantifies and compares the performance disparities among the algorithms under consideration. The algorithms are ordered in descending sequence based on metric values in this table. Higher F1 scores and AUROC values indicate superior model performance, while lower FPR and SHD values reflect better algorithmic behavior.

Linear Gaussian Linear Non-Gaussian Non-Linear Gaussian Non-Linear Non-Gaussian
F1 CCM CCM PAI PAI
PAI PAI CDS CCM
PWGC PWGC CCM CDS
CDS CDS PWGC PWGC
IGCI IGCI ANM ANM
ANM ANM IGCI IGCI
RECI RECI RECI RECI
AUROC CCM CCM PAI PAI
PAI PAI CDS CCM
CDS CDS CCM CDS
PWGC PWGC PWGC PWGC
ANM IGCI ANM ANM
IGCI ANM IGCI IGCI
RECI RECI RECI RECI
FPR RECI RECI RECI RECI
IGCI IGCI IGCI IGCI
PWGC PWGC PWGC PWGC
PAI PAI CCM CCM
CDS CCM ANM CDS
CCM CDS CDS ANM
ANM ANM PAI PAI
SHD RECI RECI RECI RECI
ANM ANM IGCI IGCI
IGCI IGCI ANM ANM
CDS CDS PWGC PWGC
PWGC PWGC CCM CDS
PAI PAI CDS CCM
CCM CCM PAI PAI

Drawing insights from Figure A and Table 7 , we synthesize a comprehensive summary delineating the preeminent algorithmic selections under diverse application scenarios. In scenarios where the overall performance of the model is emphasized, the F1 score assumes paramount importance. For the linear dataset, CCM is advocated as the optimal choice, while the RECI algorithm, deemed the least effective, exhibits an F1 score approximately 50% lower than that of CCM. Across other datasets, the PAI algorithm consistently emerges as the pinnacle performer. Particularly in nonlinear and Gaussian datasets, PAI outperforms its peers significantly, with its F1 score surpassing that of the second-best algorithm by 20%. Note that the optimal algorithm for each data size is always among the top three algorithms in the average ranking, which means that the sample length has little impact on our recommendation for F1.

In the pursuit of heightened system robustness, AUROC is prioritized. The recommendations in this scenario align with those derived from the F1 score, reaffirming the advisability of employing either the CCM or PAI algorithms. However, the PAI algorithm is not competitive at small sample sizes on linear datasets and is only recommended at medium or large sample sizes ($L > 300$).

For systems sensitive to false causality, preference should be given to the FPR metric. In cases involving linear interrelationships among variables, ANM is the preferred choice, with the worst algorithm’s FPR being more than twice as high as that of ANM. CDS is more recommended in small size datasets. Conversely, when dealing with nonlinear relationships among variables, the PAI algorithm emerges as the optimal selection.

When dealing with situations characterized by limited error tolerance, SHD takes precedence. The distinction between the CCM and PAI algorithms is minimal within linear datasets. However, in nonlinear datasets, PAI demonstrates a noteworthy reduction in SHD, exceeding a minimum of 20% compared to other algorithms.

6.1.2 Instantaneous causality

Following this exposition, we elucidate the performance evaluation of instantaneous causal discovery algorithms , as visually depicted in Figure A and quantitatively delineated in Table 8 . Figure A records the performance of seven algorithms, with oCSE serving as the reference algorithm for Gaussian noise datasets and VARLiNGAM as the reference for non-Gaussian noise datasets.

Similar to the analysis of pairwise causality, we consider five scenarios when evaluating instantaneous causality algorithms. The top 3 best-performing algorithms under each metric can be summarized from Table 8 . We will present them in Table 12 and not elaborate here. It is necessary to supplement some insights based on data sample size in Figure A .

Firstly, when prioritizing F1 scores, for nonlinear Gaussian datasets, tsFCI ranks in the top three at small data sizes, while PCMCI performs better at large data sizes. For nonlinear non-Gaussian datasets, VARLiNGAM is only recommended when the sample length is small ($L < 1000$). The average ranking of algorithm performance under the AUROC metric is basically consistent with the ranking at each data size.

In the context of prioritizing FPR metrics, PCMCI is among the recommended algorithms for all four data types, indicating its superior performance under the FPR metric. For linear Gaussian datasets, tsFCI is only recommended when the sample size is small.

Under the SHD metric, oCSE, PCMCI, and VARLiNGAM are recommended for all four data types, showing that these three algorithms recover the true causal relations stably and are not heavily dependent on data features.

Linear Gaussian Linear Non-Gaussian Non-Linear Gaussian Non-Linear Non-Gaussian
F1 oCSE VARLiNGAM NeuralGC NeuralGC
PCMCI oCSE tsFCI VARLiNGAM
VARLiNGAM PCMCI PCMCI tsFCI
NeuralGC DYNOTEARS oCSE PCMCI
DYNOTEARS NeuralGC VARLiNGAM oCSE
tsFCI tsFCI DYNOTEARS DYNOTEARS
TCDF TCDF TCDF TCDF
AUROC oCSE VARLiNGAM NeuralGC NeuralGC
PCMCI oCSE oCSE tsFCI
NeuralGC PCMCI PCMCI oCSE
VARLiNGAM NeuralGC tsFCI VARLiNGAM
tsFCI DYNOTEARS VARLiNGAM PCMCI
DYNOTEARS tsFCI DYNOTEARS DYNOTEARS
TCDF TCDF TCDF TCDF
FPR NeuralGC NeuralGC NeuralGC NeuralGC
TCDF tsFCI TCDF TCDF
DYNOTEARS TCDF VARLiNGAM DYNOTEARS
VARLiNGAM DYNOTEARS DYNOTEARS VARLiNGAM
tsFCI PCMCI tsFCI tsFCI
PCMCI oCSE PCMCI PCMCI
oCSE VARLiNGAM oCSE oCSE
SHD tsFCI tsFCI tsFCI tsFCI
NeuralGC NeuralGC NeuralGC NeuralGC
TCDF TCDF DYNOTEARS TCDF
DYNOTEARS DYNOTEARS TCDF DYNOTEARS
VARLiNGAM PCMCI VARLiNGAM PCMCI
PCMCI oCSE oCSE VARLiNGAM
oCSE VARLiNGAM PCMCI oCSE

6.1.3 Time-delay Causality

Below, we will analyze the performance comparison of time-delay causal algorithms in detail. The top 3 best-performing algorithms under each metric can be summarized from Table 9 . We will present them in Table 12 and not elaborate here.

Linear Gaussian Linear Non-Gaussian Non-Linear Gaussian Non-Linear Non-Gaussian
F1 VARLiNGAM PCMCI PCMCI PCMCI
MVGC VARLiNGAM DYNOTEARS NeuralGC
DYNOTEARS DYNOTEARS VARLiNGAM DYNOTEARS
PCMCI MVGC NeuralGC oCSE
NeuralGC NeuralGC MVGC VARLiNGAM
oCSE oCSE oCSE MVGC
TCDF TCDF tsFCI tsFCI
tsFCI tsFCI TCDF TCDF
AUROC MVGC MVGC PCMCI PCMCI
VARLiNGAM VARLiNGAM MVGC MVGC
DYNOTEARS PCMCI DYNOTEARS NeuralGC
PCMCI DYNOTEARS VARLiNGAM oCSE
NeuralGC oCSE NeuralGC DYNOTEARS
oCSE NeuralGC oCSE VARLiNGAM
TCDF TCDF tsFCI tsFCI
tsFCI tsFCI TCDF TCDF
FPR tsFCI tsFCI NeuralGC NeuralGC
NeuralGC NeuralGC TCDF TCDF
oCSE oCSE tsFCI DYNOTEARS
DYNOTEARS DYNOTEARS DYNOTEARS tsFCI
PCMCI PCMCI PCMCI PCMCI
TCDF TCDF oCSE VARLiNGAM
VARLiNGAM VARLiNGAM VARLiNGAM oCSE
MVGC MVGC MVGC MVGC
SHD tsFCI tsFCI tsFCI tsFCI
TCDF TCDF TCDF TCDF
oCSE NeuralGC NeuralGC NeuralGC
NeuralGC oCSE oCSE VARLiNGAM
DYNOTEARS DYNOTEARS VARLiNGAM DYNOTEARS
PCMCI MVGC MVGC oCSE
MVGC PCMCI DYNOTEARS MVGC
VARLiNGAM VARLiNGAM PCMCI PCMCI

It is necessary to supplement some insights based on data sample size in Figure A . Firstly, we analyze the algorithms under the F1 metric. For linear Gaussian data, VARLiNGAM performs better at larger sample sizes; conversely, MVGC is more suitable for small-sample data. The top three ranked algorithms perform relatively stably on linear non-Gaussian and nonlinear Gaussian data, and their rankings do not change significantly with sample size. For nonlinear non-Gaussian data, DYNOTEARS is recommended only when the sample size is small, since it is not ranked in the top three at large sample sizes.

The ordering of algorithms according to the AUROC metric exhibits a relatively stable pattern. Specifically, the top three algorithms in the average ranking table reflect the recommended algorithms for AUROC across all data sizes.

In evaluating performance utilizing the FPR metric, for linear data, TCDF ranks in the top three only when the sample length exceeds 1000 ($L > 1000$), so it is not a recommended algorithm when the sample size is small. For nonlinear data, VARLiNGAM is not outstanding when the sample length is below 300 ($L < 300$) and is only recommended at large data sizes. Note that although the oCSE algorithm is not always in the top three of the mean ranking, it is consistently the best algorithm at small sample sizes ($L < 300$).

Considering SHD, for the linear type, VARLiNGAM is more suitable for large-sample data, while PCMCI is more suitable for small-sample data. For the nonlinear type, MVGC is more suitable for small-sample data. On nonlinear non-Gaussian data, although NeuralGC is not among the top three algorithms, it is the best algorithm when the sample length exceeds 1000 ($L > 1000$) and can be recommended as a supplementary algorithm.

After comparing the performance of the time-delay causal discovery algorithms, it can be concluded that tsFCI is not recommended in any scenario, as it is not competitive under any metric.

6.1.4 i.i.d. Causality

Below, we will analyze the performance comparison of i.i.d. data causal algorithms in detail.

Linear Gaussian Linear Non-Gaussian Non-Linear Gaussian Non-linear Non-Gaussian
F1 GOLEM DirectLiNGAM GraNDAG CORL
CORL CORL CORL GOLEM
DAG-GNN GOLEM GES GraNDAG
NOTEARS ICALiNGAM GOLEM ICALiNGAM
ES DAG-GNN RCD DAG-GNN
ICALiNGAM RCD FCI RCD
GRaSP NOTEARS PC GES
PC GRaSP ICALiNGAM PC
FCI PC ES NOTEARS
GES FCI NOTEARS FCI
DirectLiNGAM ES DAG-GNN ES
GraNDAG GES GRaSP GRaSP
RCD GraNDAG DirectLiNGAM DirectLiNGAM
AUROC GOLEM DirectLiNGAM CORL CORL
CORL CORL NOTEARS GOLEM
DAG-GNN GOLEM GraNDAG DAG-GNN
NOTEARS ICALiNGAM ICALiNGAM NOTEARS
ES DAG-GNN DAG-GNN ICALiNGAM
ICALiNGAM RCD GOLEM GraNDAG
GRaSP NOTEARS RCD RCD
FCI GRaSP GES GES
PC FCI FCI GRaSP
RCD PC GRaSP PC
GES ES PC FCI
GraNDAG GES ES ES
DirectLiNGAM GraNDAG DirectLiNGAM DirectLiNGAM
FPR RCD ES ES ES
GES GES GraNDAG GraNDAG
DirectLiNGAM PC PC PC
ICALiNGAM GRaSP GES GES
PC GraNDAG RCD RCD
GRaSP RCD FCI FCI
GraNDAG DAG-GNN GOLEM GRaSP
ES FCI GRaSP GOLEM
FCI NOTEARS DirectLiNGAM DirectLiNGAM
DAG-GNN GOLEM CORL CORL
NOTEARS ICALiNGAM ICALiNGAM ICALiNGAM
CORL CORL DAG-GNN DAG-GNN
GOLEM DirectLiNGAM NOTEARS NOTEARS
SHD RCD GES RCD RCD
GES GraNDAG ES GraNDAG
DirectLiNGAM ES DirectLiNGAM ES
GraNDAG FCI FCI GES
FCI PC GraNDAG PC
PC GRaSP GES FCI
GRaSP RCD GRaSP GRaSP
ICALiNGAM NOTEARS PC DirectLiNGAM
ES DAG-GNN DAG-GNN NOTEARS
NOTEARS GOLEM GOLEM DAG-GNN
DAG-GNN ICALiNGAM NOTEARS ICALiNGAM
CORL CORL ICALiNGAM GOLEM
GOLEM DirectLiNGAM CORL CORL

Based on Figure A and Table 10 , we can derive insights into the performance of the recommended algorithms across different i.i.d. data types. Analyzing the algorithms using the F1 metric, Table 10 reveals that the optimal algorithms vary according to specific data characteristics, with the CORL algorithm consistently ranking among the top three performers.

When considering the AUROC metric, the CORL algorithm shows superior performance on nonlinear datasets, although the improvement over the second-best algorithm is less than 10%. In contrast, GOLEM or DirectLiNGAM exhibit a slight advantage over CORL when applied to linear datasets.

Assessing performance using the FPR metric, NOTEARS emerges as the most proficient algorithm for nonlinear datasets. The GraNDAG algorithm performs best when the sample size is large, so we also include it in Table 12 . For linear datasets, GOLEM performs best under a Gaussian noise distribution, while DirectLiNGAM excels with non-Gaussian noise. Note that the DAG-GNN algorithm should also be supplemented for large datasets, although it performs poorly on small-sized datasets.

Evaluated using the SHD metric, CORL demonstrates optimal performance on nonlinear datasets. Conversely, GOLEM achieves the lowest SHD values on linear datasets with Gaussian noise, while DirectLiNGAM performs best with non-Gaussian noise.

6.1.5 Discussion on Algorithm Efficiency

It is known that effectiveness does not equal efficiency: even if some algorithms achieve good causal discovery performance, they may not suit users because of long running times. Here we therefore discuss the running times of the algorithms tested in the first four subsections, as shown in Table 11 , to help users trade off effectiveness against efficiency.

Linear Gaussian Linear Non-Gaussian Non-Linear Gaussian Non-Linear Non-Gaussian
Pairwise ANM ANM ANM ANM
CCM CCM CCM CCM
PAI PAI PAI PAI
CDS CDS CDS CDS
IGCI IGCI IGCI IGCI
PWGC PWGC PWGC PWGC
RECI RECI RECI RECI
Instantaneous NeuralGC NeuralGC NeuralGC NeuralGC
TCDF TCDF TCDF TCDF
tsFCI tsFCI tsFCI tsFCI
oCSE oCSE oCSE oCSE
DYNOTEARS DYNOTEARS DYNOTEARS DYNOTEARS
PCMCI PCMCI PCMCI PCMCI
VARLiNGAM VARLiNGAM VARLiNGAM VARLiNGAM
Time-delay NeuralGC NeuralGC NeuralGC NeuralGC
TCDF TCDF TCDF TCDF
MVGC MVGC MVGC MVGC
tsFCI tsFCI tsFCI tsFCI
oCSE oCSE oCSE oCSE
PCMCI PCMCI PCMCI PCMCI
VARLiNGAM VARLiNGAM DYNOTEARS DYNOTEARS
DYNOTEARS DYNOTEARS VARLiNGAM VARLiNGAM
i.i.d GraNDAG GraNDAG CORL CORL
CORL CORL GraNDAG GraNDAG
DAG-GNN RCD DAG-GNN DAG-GNN
GOLEM DAG-GNN GOLEM GOLEM
NOTEARS GOLEM RCD RCD
ES NOTEARS NOTEARS NOTEARS
GES ES GES GES
ICALiNGAM GES ES ES
DirectLiNGAM DirectLiNGAM ICALiNGAM ICALiNGAM
GRaSP GRaSP DirectLiNGAM DirectLiNGAM
PC PC PC PC
RCD FCI FCI FCI
FCI ICALiNGAM GRaSP GRaSP

Considering time cost of pairwise algorithms, RECI, PWGC, and IGCI stand out as the exemplars of efficiency across all datasets, showcasing a runtime nearly one order of magnitude lower than that of other algorithms.

For instantaneous and time-delay causality, the DYNOTEARS, PCMCI, and VARLiNGAM algorithms are preferred. In contrast, the NeuralGC algorithm exhibits the most prolonged computational execution time, exceeding that of the best algorithm by nearly three orders of magnitude.

When considering i.i.d. data, FCI, ICALiNGAM, and GRaSP are the preferred choices. In contrast, the least efficient algorithms, GraNDAG and CORL, have runtimes that are thousands of times longer than the most efficient ones.

6.1.6 Recommendation Algorithms

It is important to note that the ranking table for each data category is obtained by averaging the results run on 15 datasets. To answer RQ1, by sorting through the experimental findings of these causal discovery algorithms, we select the top three with the best average value under each evaluation metric as our recommended algorithms, as shown in Table 12 .

F1 score AUROC FPR SHD Run Time (s)
Pairwise linear CCM CCM ANM CCM RECI
PAI PAI (L) CCM PAI PWGC
PWGC CDS CDS (S) PWGC IGCI
nonlinear PAI PAI PAI PAI RECI
CDS CDS CDS CDS PWGC
CCM CCM ANM CCM IGCI
Instantaneous linear + Gaussian oCSE oCSE oCSE oCSE VARLiNGAM
PCMCI PCMCI PCMCI PCMCI PCMCI
VARLiNGAM NeuralGC tsFCI (S) VARLiNGAM (S) DYNOTEARS
linear + non-Gaussian VARLiNGAM VARLiNGAM VARLiNGAM VARLiNGAM VARLiNGAM
oCSE oCSE oCSE oCSE PCMCI
PCMCI PCMCI PCMCI PCMCI DYNOTEARS
nonlinear + Gaussian NeuralGC NeuralGC oCSE PCMCI VARLiNGAM
tsFCI (S) oCSE PCMCI oCSE PCMCI
PCMCI (L) PCMCI tsFCI VARLiNGAM (S) DYNOTEARS
nonlinear + non-Gaussian NeuralGC NeuralGC oCSE oCSE VARLiNGAM
VARLiNGAM (S) tsFCI (L) PCMCI VARLiNGAM PCMCI
tsFCI oCSE tsFCI PCMCI DYNOTEARS
Time-delay linear + Gaussian VARLiNGAM (L) MVGC MVGC VARLiNGAM (L) DYNOTEARS
MVGC (S) VARLiNGAM VARLiNGAM MVGC VARLiNGAM
DYNOTEARS DYNOTEARS TCDF (L) PCMCI (S) PCMCI
oCSE (S)
linear + non-Gaussian PCMCI MVGC MVGC VARLiNGAM DYNOTEARS
VARLiNGAM VARLiNGAM VARLiNGAM PCMCI VARLiNGAM
DYNOTEARS PCMCI TCDF (L) MVGC (S) PCMCI
oCSE (S)
nonlinear + Gaussian PCMCI PCMCI MVGC PCMCI VARLiNGAM
DYNOTEARS MVGC VARLiNGAM (L) DYNOTEARS DYNOTEARS
VARLiNGAM DYNOTEARS oCSE (S) MVGC (S) PCMCI
nonlinear + non-Gaussian PCMCI PCMCI MVGC PCMCI VARLiNGAM
NeuralGC MVGC oCSE (S) MVGC DYNOTEARS
DYNOTEARS (S) NeuralGC VARLiNGAM (L) oCSE PCMCI
NeuralGC (L)
i.i.d. linear + Gaussian GOLEM GOLEM GOLEM GOLEM FCI
CORL CORL CORL CORL RCD
DAG-GNN DAG-GNN NOTEARS DAG-GNN PC
DAG-GNN (L) DAG-GNN (L)
linear + non-Gaussian DirectLiNGAM DirectLiNGAM DirectLiNGAM DirectLiNGAM ICALiNGAM
CORL CORL CORL CORL FCI
GOLEM GOLEM ICALiNGAM ICALiNGAM PC
nonlinear + Gaussian GraNDAG (L) CORL NOTEARS CORL GRaSP
CORL NOTEARS DAG-GNN ICALiNGAM FCI
GES GraNDAG (L) ICALiNGAM NOTEARS PC
GraNDAG (L) GraNDAG (L)
nonlinear + non-Gaussian CORL CORL NOTEARS CORL GRaSP
GOLEM GOLEM DAG-GNN GOLEM FCI
GraNDAG DAG-GNN ICALiNGAM ICALiNGAM PC

Note that if the optimal algorithm for a specific data size is not among the top three in the average ranking, we supplement it in gray font to ensure that Table 12 covers all possible scenarios as far as possible. “S” indicates that the algorithm is more suitable at small sample lengths ($L < 1000$), while “L” indicates that the algorithm is more suitable at large sample lengths ($L > 1000$).

6.2 Answer to RQ2: Real-World Applicability

We first tested the real-world dataset, Tuebingen, which comprises 100 pairs of causal relationships within a stationary time series framework characterized by nonlinearity and non-Gaussian attributes. The time length spans from 94 to 16,382 time points. Leveraging the insights posited in Section 6.1 , it is deduced that the PAI, CDS, and CCM algorithms are the preeminent choices under the F1 score, AUROC, and SHD metrics, while PAI, CDS, and ANM are recommended for the FPR metric. Additionally, the RECI, PWGC, and IGCI algorithms are identified as the most efficient. The dataset consists of one hundred instances of causal pairs, which we divided based on their temporal extent: those exceeding 1000 time points were classified as “large datasets” and those below 1000 time points as “small datasets”. Subsequently, an exhaustive execution of all algorithms on this real dataset was conducted, resulting in Figure 8 .


Observation of the graph reveals a clear pattern: using the PAI algorithm as the benchmark, except for CDS, the average F1 and AUROC metrics of the other algorithms consistently reside beneath the horizontal baseline, while the metrics of FPR and SHD exhibit values surpassing those of PAI. This collective trend signifies that PAI demonstrates superiority as the optimal algorithm across these four evaluative metrics on “small datasets”, while CDS performs better on “large datasets”.

Moreover, when temporal considerations are factored in, the violin plots corresponding to the RECI, IGCI, PWGC, and CCM algorithms are prominently positioned beneath the horizontal reference line. This distinctive placement underscores that RECI holds the lowest time complexity.

These findings align with the algorithmic recommendations derived from experiments on the authentic dataset and corroborate the deductions drawn from the outcomes expounded in Section 6.1 . This congruence augments our confidence in extending the theoretical framework to real-world datasets, thereby validating our theoretical assertions and demonstrating their practical applicability.

The second real dataset pertains to functional Magnetic Resonance Imaging (fMRI), comprising 28 sets of multivariate time series. A subset of data that failed to meet the criteria associated with causal sufficiency was omitted, resulting in the examination of 27 datasets. This dataset is characterized by nonlinearity and Gaussian attributes, emblematic of time-delay causality with $lag = 1$. Among the dataset constituents, 21 sets comprise fewer than 1000 data points, while the remaining six sets exceed this threshold. Each dataset includes 5, 10, or 15 time series variables. Guided by these salient attributes, we predict that one of the PCMCI, DYNOTEARS, and VARLiNGAM algorithms will exhibit optimal performance under the F1 score, whereas the PCMCI, MVGC, and DYNOTEARS algorithms will attain primacy in terms of AUROC. For FPR, MVGC is the most recommended algorithm since it performs well on datasets of all sizes, while VARLiNGAM is only recommended at large sample sizes and oCSE only at small sample sizes. Considering SHD, PCMCI, DYNOTEARS, and MVGC are the recommended algorithms, with MVGC being more suitable for small sample sizes. Taking the time cost into account, we recommend the PCMCI, VARLiNGAM, and DYNOTEARS algorithms.


To ensure a comprehensive assessment of algorithmic efficacy, we evaluated both instantaneous and time-delay causal discovery algorithms on the fMRI dataset, comparing a total of nine distinct algorithms. Analysis of Figure 9 reveals that the reference algorithm, PCMCI, achieves the highest values under the F1 metric and the lowest run time. Regarding AUROC, FPR, and SHD, the graph clearly shows that MVGC is the best performer, which is included in our recommendations; this indicates that MVGC has good stability and a low error rate. This alignment with our earlier assessments further bolsters the robustness of our conclusions.

To answer RQ2, the optimal algorithm identified through experimental testing on the two datasets aligns with the one determined according to the insights in Section 6.1 . This consistency underscores the reliability of the theory derived from both synthetic and real-world datasets.

6.3 Answer to RQ3: Generalization to Unknown Datasets

In Section 6.3.1 , a metadata detection program was designed to extract data features. Subsequently, in Section 6.3.2 , our recommendation program was tested on various datasets to verify its consistency with the algorithm test results.

6.3.1 Answer to RQ3.1: Metadata Recognition for Algorithm Selection

Given that our prior analyses focused on causality types, linearity among series, and noise distribution, capturing these pivotal attributes within an unknown dataset is crucial for the project’s universality and practical applicability.

The first task of metadata detection is identifying temporal lags within variables. We employ the Time Lag Cross Correlation (TLCC) technique to accomplish this. TLCC is measured by gradually shifting a time series vector and repeatedly calculating the correlation between two signals. Identifying correlation maxima facilitates the ascertainment of inter-variable temporal lag. Specifically, a zero lag denotes an instantaneous causal association, whereas a non-zero lag signifies a time-delay causality. If no lag is detected, the dataset is classified as i.i.d. data.
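A minimal sketch of this TLCC step, assuming two aligned one-dimensional series (an illustrative implementation, not the exact detection program used here); the maximal-correlation lag of 0 is read as instantaneous causality, a positive lag as time-delay causality, and the absence of any notable correlation as i.i.d. data:

import numpy as np

def best_lag(x, y, max_lag=10):
    # Shift y against x and return the lag with the maximal absolute Pearson correlation
    corrs = [abs(np.corrcoef(x, y)[0, 1])]
    for lag in range(1, max_lag + 1):
        corrs.append(abs(np.corrcoef(x[:-lag], y[lag:])[0, 1]))
    return int(np.argmax(corrs))              # 0: instantaneous; > 0: time-delay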

Subsequently, identifying the noise distribution is requisite. We employ concurrent evaluative methodologies encompassing the Shapiro-Wilk, Kolmogorov-Smirnov, and Anderson-Darling tests. These tests collectively serve to discern the presence of Gaussian noise in the data. The following criteria serve as benchmarks:

1. The Shapiro-Wilk test’s computed p-value surpasses the significance threshold of 0.05.

2. The p-value resulting from the Kolmogorov-Smirnov test exceeds 0.05.

3. The Anderson-Darling test statistic remains below its critical value.

Fulfillment of these conditions collectively allows for the inference of Gaussian noise as the prevailing noise type characterizing the dataset.
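Assuming the standard scipy.stats interfaces, the three criteria can be combined into a single check, as in the sketch below (thresholds follow the text; this is illustrative rather than our exact program):

import numpy as np
from scipy import stats

def is_gaussian(residuals, alpha=0.05):
    x = np.asarray(residuals)
    sw_ok = stats.shapiro(x).pvalue > alpha                                   # criterion 1
    ks_ok = stats.kstest(x, 'norm', args=(x.mean(), x.std())).pvalue > alpha  # criterion 2
    ad = stats.anderson(x, dist='norm')
    crit_5 = ad.critical_values[list(ad.significance_level).index(5.0)]
    ad_ok = ad.statistic < crit_5                                             # criterion 3
    return sw_ok and ks_ok and ad_ok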

Lastly, a crucial inquiry involves ascertaining potential linear interdependence among variables. To address this, a linear regression framework is applied to every pair of variables. Subsequently, the coefficient of determination (R-squared) is derived to gauge the efficacy of the model fit. A predetermined threshold of 0.5 is set for assessment. If the computed R-squared value surpasses this threshold, it signifies the presence of a discernible linear relationship between the variables. Conversely, an R-squared value below the threshold implies suboptimal alignment with the linear regression framework, indicating the absence of a linear relationship between the implicated variables.
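The pairwise linearity check reduces to a one-line regression fit; a sketch with scipy, using the 0.5 threshold stated above:

from scipy import stats

def is_linear_pair(x, y, r2_threshold=0.5):
    # Fit y ~ a*x + b and compare the coefficient of determination to the threshold
    result = stats.linregress(x, y)
    return result.rvalue ** 2 > r2_threshold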

To comprehensively appraise the previously delineated feature extraction procedures, we conducted metadata detection experiments on 100 datasets, with causality types, linear relations, noise distributions, and time lengths randomly generated. This evaluation was accomplished by computing the accuracy of judgments for the three metadata categories. We conducted ten trials, yielding comprehensive average and standard deviation metrics, as detailed in Table 13 .

Causality Type Linear Relation Gaussian Noise
Accuracy

To answer RQ3.1, Table 13 shows that these three types of metadata can be extracted precisely, with accuracies over 75% and standard deviations of no more than 0.1, indicating that the judgment program is also relatively stable.

6.3.2 Answer to RQ3.2: Practical Recommendations for Users

LUCAS and Sachs were selected as the test datasets. The extracted metadata, based on the program described in the previous section, along with the corresponding recommendation algorithms provided by Table 12 , are listed in Table 14 .

LUCAS Sachs
Metadata Data size (500, 11) (7466, 11)
Dependency funcs nonlinear linear
Noises distributions Gaussian non-Gaussian
Recommendation algorithms F1 GraNDAG; CORL; GES DirectLiNGAM; CORL; GOLEM
AUROC CORL; NOTEARS; GraNDAG DirectLiNGAM; CORL; GOLEM
FPR NOTEARS; DAG-GNN; ICALiNGAM DirectLiNGAM; CORL; ICALiNGAM
SHD CORL; ICALiNGAM; NOTEARS DirectLiNGAM; CORL; ICALiNGAM
Runtime GRaSP; FCI; PC ICALiNGAM; FCI; PC

13 algorithms for i.i.d. data were tested on the LUCAS dataset, and 12 algorithms were tested on Sachs. The RCD algorithm is not considered for Sachs due to its long running time (over 15 minutes). The results, presented in Table 15 , illustrate that the optimal algorithm for each metric lies within the previously recommended range, which verifies the effectiveness of our recommendation program. For clarity, the algorithms we recommend are underlined, and the optimal algorithm obtained from horizontal testing is highlighted in bold.

Table 15: Test results on LUCAS and Sachs; within each column, algorithms are ranked by the metric value shown in parentheses.

F1 AUROC FPR SHD Runtime
LUCAS GraNDAG (0.70) NOTEARS (0.75) RCD (0.47) RCD (30.0) CORL (257.80)
NOTEARS (0.70) GraNDAG (0.75) DirectLiNGAM (0.33) DirectLiNGAM (16.0) GraNDAG (233.52)
GES (0.69) GES (0.72) ES (0.26) CORL (13.0) DAG-GNN (44.29)
DAG-GNN (0.69) DAG-GNN (0.72) GOLEM (0.23) ES (13.0) GOLEM (32.61)
FCI (0.67) FCI (0.68) CORL (0.23) GOLEM (11.0) RCD (12.19)
GRaSP (0.67) GRaSP (0.68) PC (0.21) PC (11.0) NOTEARS (0.83)
GOLEM (0.63) GOLEM (0.67) GES (0.16) ICALiNGAM (10.0) ES (0.71)
ICALiNGAM (0.59) ICALiNGAM (0.62) ICALiNGAM (0.16) GES (8.0) GES (0.42)
ES (0.56) ES (0.61) DAG-GNN (0.16) GRaSP (8.0) DirectLiNGAM (0.10)
CORL (0.53) CORL (0.57) GRaSP (0.09) DAG-GNN (8.0) FCI (0.05)
PC (0.48) RCD (0.52) FCI (0.09) FCI (7.0) ICALiNGAM (0.05)
RCD (0.31) PC (0.52) GraNDAG (0.02) GraNDAG (6.0) GRaSP (0.05)
DirectLiNGAM (0.21) DirectLiNGAM (0.25) NOTEARS (0.02) NOTEARS (5.0) PC (0.05)
Sachs CORL (0.30) CORL (0.35) ES (0.84) ES (37.0) DAG-GNN (814.25)
NOTEARS (0.26) DirectLiNGAM (0.32) DAG-GNN (0.70) DAG-GNN (32.0) GraNDAG (236.29)
GRaSP (0.25) GRaSP (0.32) GRaSP (0.62) FCI (31.0) CORL (95.95)
DAG-GNN (0.24) NOTEARS (0.32) PC (0.57) PC (31.0) GOLEM (68.19)
GOLEM (0.24) DAG-GNN (0.31) FCI (0.49) GRaSP (30.0) NOTEARS (25.67)
ICALiNGAM (0.23) GOLEM (0.29) CORL (0.43) CORL (25.0) ES (4.76)
PC (0.22) ICALiNGAM (0.29) NOTEARS (0.41) GES (25.0) GES (1.14)
FCI (0.19) PC (0.29) ICALiNGAM (0.35) NOTEARS (25.0) PC (1.10)
ES (0.19) ES (0.26) GOLEM (0.32) GOLEM (24.0) FCI (0.68)
DirectLiNGAM (0.17) FCI (0.26) GES (0.19) ICALiNGAM (24.0) DirectLiNGAM (0.38)
GES (0.16) GES (0.22) GraNDAG (0.14) GraNDAG (20.0) GRaSP (0.35)
GraNDAG (0.08) GraNDAG (0.18) DirectLiNGAM (0.08) DirectLiNGAM (17.0) ICALiNGAM (0.09)

To answer RQ 3.2, the case study of these two real datasets indicates that the optimal algorithms obtained in the experiment are included among our recommended algorithms. Users can therefore quickly find the most suitable algorithm based on the extracted metadata and save computing power.

7 Threats to Validity

7.1 Threats to Internal Validity

7.1.1 Correctness of the Codes

A potential threat to the study revolves around the accuracy of the employed codebase. Despite rigorous testing and validation procedures, the complexity of algorithmic implementations and potential oversights during the coding process may give rise to errors. Variations in the code may inadvertently influence the outcomes, thereby introducing a threat to the study’s internal validity.

To address this, the study highlights the ongoing commitment to code review and validation, emphasizing the significance of code quality in ensuring the reliability of the study’s outcomes. Specifically, we conducted validation tests on each algorithm to ensure its correctness.

7.1.2 Implementation of Control Variables

Given the comparative methodology employed in this project's experiments, precise control of variables is paramount, as it directly influences the resulting output. If the algorithms under test were run on disparate datasets, inherent dataset dissimilarities could materially affect the results, undermining the cogency of the study's conclusions.

To mitigate this, the distinct algorithms were all executed on a single dataset, establishing a controlled environment in which the dataset variable remains consistent. Notably, our program was executed entirely on a single server infrastructure, ensuring uniform computing capacity and thereby rendering runtime comparisons feasible. To uphold the validity of the comparative analysis, synthesized datasets of a given time length were all derived from one DAG structure. This alignment eliminated ambiguities stemming from disparate ground truths, enhancing the reliability and interpretability of the results.

7.2 Threats to External Validity

7.2.1 Metadata Selection Bias

The external validity of this study could be threatened by metadata selection bias. Because our methodology constructs testing data solely from established causal linkages, noise distribution characteristics, and dataset dimensions, crucial time series metadata may not be exhaustively covered. Consequently, more influential data attributes may exist, which could threaten the comprehensiveness and external validity of the experimental outcomes.

To mitigate the potential influence of this factor on experimental validity, two real-world datasets were examined, and the empirical findings obtained from them were found to be congruent with those derived from the synthetic datasets. This correspondence serves as an indirect validation of the metadata selection, affirming its representativeness.

7.2.2 Temporal Validity

Considering the incessant evolution characterizing algorithms, it is plausible that our research might not encompass forthcoming optimizations, thereby engendering temporal constraints on the experiment’s conclusions. This scenario presents an external threat to the generalizability of research endeavours.

To attenuate this influence as much as possible, algorithms are encoded following a standardized input-output paradigm, creating an extensible code library. This design allows forthcoming algorithmic enhancements to be integrated seamlessly into the repository and included within the testing framework, offering good operational feasibility and assimilation capacity.

8 Conclusions

This section will summarize the entire project and analyze potential future work from three aspects.

8.1 Conclusions

This work conducted comprehensive research on causal discovery in time series, introducing the research topic's background, importance, and literature system. We introduced six types of methods and organized over 20 algorithms.

In terms of experiments, our task was to explore the optimal algorithms in different application scenarios. We summarized recommendation algorithms corresponding to 16 data types through comparative experiments and verified the applicability of these insights by testing on two real-world datasets. Furthermore, we extended the practicality of this discovery to unknown datasets through metadata extraction technology, with testing accuracy up to 80%.

Finally, we discussed the experimental results, related them to the existing literature, and listed the threats that could affect the validity of the results. These discussions make the findings of this project more reliable and better contextualized.

Given the above, this project has systematically completed all research objectives, thoroughly analyzing causal discovery for time series from theoretical and practical aspects. We provided practical and effective algorithm recommendations from the users' perspective, filling research gaps left by previous articles and providing a concrete reference for future algorithm research.

8.2 Discussion on Future Research Directions

Opportunities for enhancement and progress persist in this study. Based on the discussion section, a trajectory for future investigations can be charted across three principal dimensions to improve the algorithm guideline system. The initial facet involves considering a broader range of data types, such as high-dimensional datasets or time series incorporating latent variables.

The second avenue of progression entails the extension of algorithmic testing. Due to temporal constraints in this project, certain time-intensive algorithms were omitted from the assessment. It is advisable to incorporate these additional algorithms in future endeavors to achieve a more exhaustive and comprehensive comparative analysis of algorithmic efficacy.

The final and pivotal facet pertains to enhancing the precision of metadata identification. To achieve this objective, the adoption of more sophisticated data processing techniques is warranted. Notably, the integration of machine learning methodologies should be considered. This approach would yield more precise determinations of unknown data characteristics, particularly in distinguishing between time-delay and instantaneous causality.

Upon the successful execution of the proposed endeavors, users will be furnished with a meticulous causal discovery algorithm recommendation service. This service would facilitate the expedient identification of optimal algorithms for arbitrary datasets, thereby markedly curtailing the duration of trial-and-error procedures and minimizing computational resource consumption. This work effectively addresses the intricate conundrum associated with algorithm selection for causal discovery, which holds significant research and practical value.

In addition, it is important to discuss research directions related to existing research gaps in the field, focusing on three aspects:

Datasets and algorithms: One significant issue is the limited availability of real datasets with ground truth, which restricts algorithm experiments to only a few real datasets. Developing a broader range of real datasets is therefore crucial for benchmarking causal discovery algorithms. Additionally, in the development of new algorithms, two-step approaches show significant potential. For instance, integrating causal discovery methods with neural networks at different stages could overcome the limitations of relying on a single method.

Algorithm selector: A critical research direction is the joint analysis of data features and causal graph structures. This effort aims to develop a comprehensive algorithm selector (Runge et al., 2023a) capable of addressing all possible data scenarios, thereby assisting users in handling various complex situations.

Causality or Association? Understanding the fundamental differences between causality and regression is crucial. Conducting empirical research to clarify the quantitative differences between causal algorithms and association-based methods will provide valuable insights for advancing causal research and statistical techniques. This understanding is also essential for evaluating the impact that causal learning could bring to the field of explainable AI (Montavon et al., 2018).

Additionally, future work could explore the application of causal discovery methods to verify the results of data augmentation (Gao et al., 2023). For example, these methods could be used to ascertain whether newly generated data retains the causality of the original data. This verification process would enhance the reliability and validity of augmented datasets, ensuring that the fundamental causal relationships are preserved.

  • Ahmed et al. (2020) Ossama Ahmed, Frederik Träuble, Anirudh Goyal, Alexander Neitz, Yoshua Bengio, Bernhard Schölkopf, Manuel Wüthrich, and Stefan Bauer. Causalworld: A robotic manipulation benchmark for causal structure and transfer learning. arXiv preprint arXiv:2010.04296 , 2020.
  • Aliprantis (2015) Dionissi Aliprantis. A distinction between causal effects in structural and Rubin causal models. SSRN Electronic Journal , 15-05, 2015.
  • Andrews et al. (2023) Bryan Andrews, Joseph Ramsey, and Ruben Sanchez Romero. Fast scalable and accurate discovery of dags using the best order score search and grow shrink trees. Advances in Neural Information Processing Systems , 36:63945–63956, 2023.
  • Arize (1993) Augustine C Arize. Determinants of income velocity in the united kingdom: multivariate granger causality. The American Economist , 37(2):40–45, 1993.
  • Assaad (2022) Charles K Assaad. causal_discovery_for_time_series, 2022. URL https://github.com/ckassaad/causal_discovery_for_time_series.git .
  • Assaad et al. (2022) Charles K Assaad, Emilie Devijver, and Eric Gaussier. Survey and evaluation of causal discovery methods for time series. Journal of Artificial Intelligence Research , 73:767–819, 2022.
  • Assaad et al. (2023) Charles K Assaad, Imad Ez-Zejjari, and Lei Zan. Root cause identification for collective anomalies in time series given an acyclic summary causal graph with loops. In International Conference on Artificial Intelligence and Statistics , pages 8395–8404. PMLR, 2023.
  • Assaad et al. (2021) Karim Assaad, Emilie Devijver, Eric Gaussier, and Ali Ait-Bachir. A mixed noise and constraint-based approach to causal inference in time series. In Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part I 21 , pages 453–468. Springer, 2021.
  • Asuncion and Newman (2007) Arthur Asuncion and David Newman. Uci machine learning repository. Irvine, CA, USA , 2007.
  • Bareinboim and Pearl (2016) Elias Bareinboim and Judea Pearl. Causal inference and the data-fusion problem. Proceedings of the National Academy of Sciences , 113(27):7345–7352, 2016.
  • Barrera and Miljkovic (2022) Emiliano Lopez Barrera and Dragan Miljkovic. The link between the two epidemics provides an opportunity to remedy obesity while dealing with covid-19. Journal of Policy Modeling , 44(2):280–297, 2022.
  • Barrett et al. (2010) Adam B Barrett, Lionel Barnett, and Anil K Seth. Multivariate granger causality and generalized variance. Physical Review E—Statistical, Nonlinear, and Soft Matter Physics , 81(4):041907, 2010.
  • Beaumont et al. (2021) Paul Beaumont, Ben Horsburgh, Philip Pilgerstorfer, Angel Droth, Richard Oentaryo, Steven Ler, Hiep Nguyen, Gabriel Azevedo Ferreira, Zain Patel, and Wesley Leong. CausalNex (Version 0.12.1), October 2021. URL https://github.com/quantumblacklabs/causalnex .
  • Beinlich et al. (1989) Ingo A Beinlich, Henri Jacques Suermondt, R Martin Chavez, and Gregory F Cooper. The alarm monitoring system: A case study with two probabilistic inference techniques for belief networks. In AIME 89: Second European Conference on Artificial Intelligence in Medicine, London, August 29th–31st 1989. Proceedings , pages 247–256. Springer, 1989.
  • Biswas and Mukherjee (2024) Rahul Biswas and Somabha Mukherjee. Consistent causal inference from time series with pc algorithm and its time-aware extension. Statistics and Computing , 34(1):14, 2024.
  • Blöbaum et al. (2018) Patrick Blöbaum, Dominik Janzing, Takashi Washio, Shohei Shimizu, and Bernhard Schölkopf. Cause-effect inference by comparing regression errors. In International Conference on Artificial Intelligence and Statistics , pages 900–909. PMLR, 2018.
  • Bystrova et al. (2024) Daria Bystrova, Charles Assaad, Julyan Arbel, Emilie Devijver, Éric Gaussier, and Wilfried Thuiller. Causal discovery from time series with hybrids of constraint-based and noise-based algorithms. Transactions on Machine Learning Research Journal , 2024.
  • Chen et al. (2004) Yonghong Chen, Govindan Rangarajan, Jianfeng Feng, and Mingzhou Ding. Analyzing multiple nonlinear time series with extended granger causality. Physics letters A , 324(1):26–35, 2004.
  • Cheng et al. (2019) Lu Cheng, Ruocheng Guo, and Huan Liu. Robust cyberbullying detection with causal interpretation. In Companion Proceedings of The 2019 World Wide Web Conference , pages 169–175, 2019.
  • Cheng et al. (2022) Lu Cheng, Ruocheng Guo, Raha Moraffah, Paras Sheth, K Selçuk Candan, and Huan Liu. Evaluation methods and measures for causal learning algorithms. IEEE Transactions on Artificial Intelligence , 3(6):924–943, 2022.
  • Chickering (2002a) David Maxwell Chickering. Learning equivalence classes of bayesian-network structures. The Journal of Machine Learning Research , 2:445–498, 2002a.
  • Chickering (2002b) David Maxwell Chickering. Optimal structure identification with greedy search. Journal of machine learning research , 3(Nov):507–554, 2002b.
  • Chickering (2020) Max Chickering. Statistically efficient greedy equivalence search. In Conference on Uncertainty in Artificial Intelligence , pages 241–249. PMLR, 2020.
  • Chiyohara et al. (2023) Shinya Chiyohara, Jun-ichiro Furukawa, Tomoyuki Noda, Jun Morimoto, and Hiroshi Imamizu. Proprioceptive short-term memory in passive motor learning. Scientific Reports , 13(1):20826, 2023.
  • Colombo et al. (2012) Diego Colombo, Marloes H Maathuis, Markus Kalisch, and Thomas S Richardson. Learning high-dimensional directed acyclic graphs with latent and selection variables. The Annals of Statistics , pages 294–321, 2012.
  • Colombo et al. (2014) Diego Colombo, Marloes H Maathuis, et al. Order-independent constraint-based causal structure learning. J. Mach. Learn. Res. , 15(1):3741–3782, 2014.
  • Driessens and Džeroski (2005) Kurt Driessens and Sašo Džeroski. Combining model-based and instance-based learning for first order regression. In Proceedings of the 22nd international conference on Machine learning , pages 193–200, 2005.
  • Edinburgh et al. (2021) Tom Edinburgh, Stephen J. Eglen, and Ari Ercole. Causality indices for bivariate time series data: A comparative review of performance. Chaos: An Interdisciplinary Journal of Nonlinear Science , 31(8):083111, 08 2021. ISSN 1054-1500. doi: 10.1063/5.0053519 . URL https://doi.org/10.1063/5.0053519 .
  • Eichler (2012) Michael Eichler. Causal inference in time series analysis. Causality: Statistical perspectives and applications , pages 327–354, 2012.
  • Entner and Hoyer (2010) Doris Entner and Patrik O Hoyer. On causal discovery from time series data using fci. Probabilistic graphical models , 16, 2010.
  • Fang et al. (2023) Zhuangyan Fang, Shengyu Zhu, Jiji Zhang, Yue Liu, Zhitang Chen, and Yangbo He. On low-rank directed acyclic graphs and causal structure learning. IEEE Transactions on Neural Networks and Learning Systems , 35(4):4924–4937, 2023.
  • Fonollosa (2019) José AR Fonollosa. Conditional distribution variability measures for causality detection. Cause Effect Pairs in Machine Learning , pages 339–347, 2019.
  • Ganguly et al. (2023) Niloy Ganguly, Dren Fazlija, Maryam Badar, Marco Fisichella, Sandipan Sikdar, Johanna Schrader, Jonas Wallat, Koustav Rudra, Manolis Koubarakis, Gourab K Patro, et al. A review of the role of causality in developing trustworthy ai systems. arXiv preprint arXiv:2302.06975 , 2023.
  • Gao et al. (2023) Zijun Gao, Lingbo Li, and Tianhua Xu. Data augmentation for time-series classification: An extensive empirical study and comprehensive survey. arXiv preprint arXiv:2310.10060 , 2023.
  • García-Velázquez et al. (2020) Regina García-Velázquez, Markus Jokela, and Tom Henrik Rosenström. Direction of dependence between specific symptoms of depression: A non-gaussian approach. Clinical Psychological Science , 8(2):240–251, 2020.
  • Gelman (2011) Andrew Gelman. Causality and statistical learning, 2011.
  • Gerhardus and Runge (2020) Andreas Gerhardus and Jakob Runge. High-recall causal discovery for autocorrelated time series with latent confounders. Advances in Neural Information Processing Systems , 33:12615–12625, 2020.
  • Geweke (1982) John Geweke. Measurement of linear dependence and feedback between multiple time series. Journal of the American statistical association , 77(378):304–313, 1982.
  • Gong et al. (2017) Mingming Gong, Kun Zhang, Bernhard Schölkopf, Clark Glymour, and Dacheng Tao. Causal discovery from temporally aggregated time series. In Uncertainty in artificial intelligence: proceedings of the… conference. Conference on Uncertainty in Artificial Intelligence , volume 2017. NIH Public Access, 2017.
  • Goudet et al. (2018) Olivier Goudet, Diviyan Kalainathan, and Philippe Caillou. Learning functional causal models with generative neural networks. Explainable and interpretable models in computer vision and machine learning , pages 39–80, 2018.
  • Granger (1969) Clive WJ Granger. Investigating causal relations by econometric models and cross-spectral methods. Econometrica: journal of the Econometric Society , pages 424–438, 1969.
  • Guo et al. (2020) Ruocheng Guo, Lu Cheng, Jundong Li, P. Richard Hahn, and Huan Liu. A survey of learning causality with data: Problems and methods. ACM Comput. Surv. , 53(4), jul 2020. ISSN 0360-0300. doi: 10.1145/3397269 . URL https://doi.org/10.1145/3397269 .
  • Guyon et al. (2011) Isabelle Guyon, Constantin Aliferis, Gregory Cooper, André Elisseeff, Jean Philippe Pellet, Peter Spirtes, and Alexander Statnikov. Causality workbench. In Causality in the sciences . Oxford University Press, 2011.
  • Hasan et al. (2023) Uzma Hasan, Emam Hossain, and Md Osman Gani. A survey on causal discovery methods for iid and time series data. Transactions on Machine Learning Research , 2023.
  • Hempel et al. (2011) Stefan Hempel, Aneta Koseska, Jürgen Kurths, and Zora Nikoloski. Inner composition alignment for inferring directed networks from short time series. Physical review letters , 107(5):054101, 2011.
  • Henckel et al. (2022) Leonard Henckel, Emilija Perković, and Marloes H Maathuis. Graphical criteria for efficient total effect estimation via adjustment in causal linear models. Journal of the Royal Statistical Society Series B: Statistical Methodology , 84(2):579–599, 2022.
  • Hoyer et al. (2008) Patrik Hoyer, Dominik Janzing, Joris M Mooij, Jonas Peters, and Bernhard Schölkopf. Nonlinear causal discovery with additive noise models. Advances in neural information processing systems , 21, 2008.
  • Hoyer et al. (2012) Patrik O Hoyer, Aapo Hyvarinen, Richard Scheines, Peter L Spirtes, Joseph Ramsey, Gustavo Lacerda, and Shohei Shimizu. Causal discovery of linear acyclic models with arbitrary distributions. arXiv preprint arXiv:1206.3260 , 2012.
  • Hu and Liang (2014) Meng Hu and Hualou Liang. A copula approach to assessing granger causality. NeuroImage , 100:125–134, 2014.
  • Hu et al. (2015) Sanqing Hu, Hui Wang, Jianhai Zhang, Wanzeng Kong, Yu Cao, and Robert Kozma. Comparison analysis: Granger causality and new causality and their applications to motor imagery. IEEE transactions on neural networks and learning systems , 27(7):1429–1444, 2015.
  • Huang et al. (2020) Biwei Huang, Kun Zhang, Jiji Zhang, Joseph Ramsey, Ruben Sanchez-Romero, Clark Glymour, and Bernhard Schlkopf. Causal discovery from heterogeneous/nonstationary data. Journal of Machine Learning Research , 21(89):1–53, 2020. URL http://jmlr.org/papers/v21/19-232.html .
  • Hyvärinen and Smith (2013) Aapo Hyvärinen and Stephen M Smith. Pairwise likelihood ratios for estimation of non-gaussian structural equation models. The Journal of Machine Learning Research , 14(1):111–152, 2013.
  • Hyvärinen et al. (2010) Aapo Hyvärinen, Kun Zhang, Shohei Shimizu, and Patrik O. Hoyer. Estimation of a structural vector autoregression model using non-gaussianity. J. Mach. Learn. Res. , 11:1709–1731, aug 2010. ISSN 1532-4435.
  • Hyvärinen et al. (2024) Aapo Hyvärinen, Ilyes Khemakhem, and Ricardo Monti. Identifiability of latent-variable and structural-equation models: from linear to nonlinear. Annals of the Institute of Statistical Mathematics , 76(1):1–33, 2024.
  • Ikeuchi et al. (2023) Takashi Ikeuchi, Mayumi Ide, Yan Zeng, Takashi Nicholas Maeda, and Shohei Shimizu. LiNGAM - Discovery of non-gaussian linear causal models (Version 1.9.0), 2023. URL https://github.com/cdt15/lingam.git .
  • Imbens (2004) Guido W Imbens. Nonparametric estimation of average treatment effects under exogeneity: A review. Review of Economics and statistics , 86(1):4–29, 2004.
  • Jang et al. (2022) Hyuna Jang, Jong-Min Kim, and Hohsuk Noh. Vine copula granger causality in mean. Economic Modelling , 109:105798, 2022.
  • Jangyodsuk et al. (2014) Piraporn Jangyodsuk, Dong-Jun Seo, and Jean Gao. Causal graph discovery for hydrological time series knowledge discovery. International Conference on Hydroinformatics , 2014.
  • Janzing et al. (2012) Dominik Janzing, Joris Mooij, Kun Zhang, Jan Lemeire, Jakob Zscheischler, Povilas Daniušis, Bastian Steudel, and Bernhard Schölkopf. Information-geometric approach to inferring causal directions. Artificial Intelligence , 182:1–31, 2012.
  • Javier (2021) Prince Joseph Erneszer Javier. causal-ccm a Python implementation of Convergent Cross Mapping (Version 0.3.3), June 2021. URL https://github.com/PrinceJavier/causal_ccm.git .
  • Ji et al. (2024) Junzhong Ji, Zuozhen Zhang, Lu Han, and Jinduo Liu. Metacae: Causal autoencoder with meta-knowledge transfer for brain effective connectivity estimation. Computers in Biology and Medicine , 170:107940, 2024.
  • Jin and Xu (2024) Bingzi Jin and Xiaojie Xu. Contemporaneous causality among price indices of ten major steel products. Ironmaking & Steelmaking , page 03019233241249361, 2024.
  • Käding and Runge (2021) Christoph Käding and Jakob Runge. A benchmark for bivariate causal discovery methods. In EGU General Assembly Conference Abstracts , pages EGU21–8584, 2021.
  • Kaiser and Sipos (2021) Marcus Kaiser and Maksim Sipos. Unsuitability of notears for causal graph discovery. arXiv preprint arXiv:2104.05441 , 2021.
  • Kalainathan and Goudet (2019) Diviyan Kalainathan and Olivier Goudet. CausalDiscoveryToolbox (Version 0.6.0), 2019. URL https://github.com/FenTechSolutions/CausalDiscoveryToolbox.git .
  • Kalainathan et al. (2020) Diviyan Kalainathan, Olivier Goudet, and Ritik Dutta. Causal discovery toolbox: Uncovering causal relationships in python. Journal of Machine Learning Research , 21(37):1–5, 2020.
  • Kalisch and Bühlman (2007) Markus Kalisch and Peter Bühlman. Estimating high-dimensional directed acyclic graphs with the pc-algorithm. Journal of Machine Learning Research , 8(3), 2007.
  • Kalisch et al. (2012) Markus Kalisch, Martin Mächler, Diego Colombo, Marloes H Maathuis, and Peter Bühlmann. Causal inference using graphical models with the r package pcalg. Journal of statistical software , 47:1–26, 2012.
  • Kawahara et al. (2011) Yoshinobu Kawahara, Shohei Shimizu, and Takashi Washio. Analyzing relationships among arma processes based on non-gaussianity of external influences. Neurocomputing , 74(12-13):2212–2221, 2011.
  • Kim et al. (2020) Jong-Min Kim, Namgil Lee, and Sun Young Hwang. A copula nonlinear granger causality. Economic Modelling , 88:420–430, 2020.
  • Kleinberg (2013) Samantha Kleinberg. Causality, probability, and time . Cambridge University Press, 2013.
  • Ko et al. (2018) Song Ko, Hyunki Lim, Hoon Ko, and Dae-Won Kim. Experimental comparisons with respect to the usage of the promising relations in eda-based causal discovery. Annals of Operations Research , 265:241–255, 2018.
  • Kullback (1997) Solomon Kullback. Information theory and statistics. Courier Corporation , 1997.
  • Lachapelle et al. (2019) Sébastien Lachapelle, Philippe Brouillard, Tristan Deleu, and Simon Lacoste-Julien. Gradient-based neural dag learning. arXiv preprint arXiv:1906.02226 , 2019.
  • Lam et al. (2022) Wai-Yin Lam, Bryan Andrews, and Joseph Ramsey. Greedy relaxations of the sparsest permutation algorithm. In Uncertainty in Artificial Intelligence , pages 1052–1062. PMLR, 2022.
  • Lauritzen and Spiegelhalter (1988) Steffen L Lauritzen and David J Spiegelhalter. Local computations with probabilities on graphical structures and their application to expert systems. Journal of the Royal Statistical Society: Series B (Methodological) , 50(2):157–194, 1988.
  • Lawrence et al. (2020) Andrew R. Lawrence, Marcus Kaiser, Rui Sampaio, and Maksim Sipos. Data generating process to evaluate causal discovery techniques for time series data. Causal Discovery & Causality-Inspired Machine Learning Workshop at Neural Information Processing Systems , 2020.
  • Lee (1998) Te-Won Lee. Independent component analysis . Springer, 1998.
  • Li et al. (2014) Jundong Li, Osmar R Zaïane, and Alvaro Osornio-Vargas. Discovering statistically significant co-location rules in datasets with extended spatial objects. In Data Warehousing and Knowledge Discovery: 16th International Conference, DaWaK 2014, Munich, Germany, September 2-4, 2014. Proceedings 16 , pages 124–135. Springer, 2014.
  • Liao et al. (2009) Wei Liao, Daniele Marinazzo, Zhengyong Pan, Qiyong Gong, and Huafu Chen. Kernel granger causality mapping effective connectivity on fmri data. IEEE transactions on medical imaging , 28(11):1825–1835, 2009.
  • Löwe et al. (2022) Sindy Löwe, David Madras, Richard Zemel, and Max Welling. Amortized causal discovery: Learning to infer causal graphs from time-series data. In Conference on Causal Learning and Reasoning , pages 509–525. PMLR, 2022.
  • Luo et al. (2024) Jiaojiao Luo, Zhehao Jin, Heping Jin, Qian Li, Xu Ji, and Yiyang Dai. Causal temporal graph attention network for fault diagnosis of chemical processes. Chinese Journal of Chemical Engineering , 70:20–32, 2024.
  • Lütkepohl (2005) Helmut Lütkepohl. New introduction to multiple time series analysis . Springer Science & Business Media, 2005.
  • Ma et al. (2014) Huanfei Ma, Kazuyuki Aihara, and Luonan Chen. Detecting causality from nonlinear dynamics with short-term time series. Scientific reports , 4(1):7464, 2014.
  • Ma et al. (2023) Sisi Ma, Jinhua Wang, Cameron Bieganek, Roshan Tourani, and Constantin Aliferis. Local causal pathway discovery for single-cell rna sequencing count data: a benchmark study. Journal of Translational Genetics and Genomics , 7(1):50–65, 2023.
  • Maathuis and Colombo (2015) Marloes H Maathuis and Diego Colombo. A generalized back-door criterion. The Annals of Statistics , 43(3):1060–1088, 2015.
  • Maeda (2022) Takashi Nicholas Maeda. I-rcd: an improved algorithm of repetitive causal discovery from data with latent confounders. Behaviormetrika , 49(2):329–341, 2022.
  • Maeda and Shimizu (2020) Takashi Nicholas Maeda and Shohei Shimizu. Rcd: Repetitive causal discovery of linear non-gaussian acyclic models with latent confounders. In International Conference on Artificial Intelligence and Statistics , 2020. URL https://api.semanticscholar.org/CorpusID:210164913 .
  • Maeda and Shimizu (2021) Takashi Nicholas Maeda and Shohei Shimizu. Causal additive models with unobserved variables. In Conference on Uncertainty in Artificial Intelligence , 2021. URL https://api.semanticscholar.org/CorpusID:237511555 .
  • Makhlouf et al. (2020) Karima Makhlouf, Sami Zhioua, and Catuscia Palamidessi. Survey on causal-based machine learning fairness notions. arXiv preprint arXiv:2010.09553 , 2020.
  • Mani and Cooper (2000) Subramani Mani and Gregory F Cooper. Causal discovery from medical textual data. In Proceedings of the AMIA Symposium , page 542. American Medical Informatics Association, 2000.
  • Mann and Whitney (1947) Henry B Mann and Donald R Whitney. On a test of whether one of two random variables is stochastically larger than the other. The annals of mathematical statistics , pages 50–60, 1947.
  • Mao and Shang (2017) Xuegeng Mao and Pengjian Shang. Transfer entropy between multivariate time series. Communications in Nonlinear Science and Numerical Simulation , 47:338–347, 2017.
  • Marinazzo et al. (2021) D. Marinazzo, M. Pellicoro, and S. Stramaglia. KernelGrangerCausality, 2021. URL https://github.com/danielemarinazzo/KernelGrangerCausality.git .
  • Marinazzo et al. (2008) Daniele Marinazzo, Mario Pellicoro, and Sebastiano Stramaglia. Kernel-granger causality and the analysis of dynamical networks. Physical Review E—Statistical, Nonlinear, and Soft Matter Physics , 77(5):056215, 2008.
  • Marinazzo et al. (2011) Daniele Marinazzo, Wei Liao, Huafu Chen, and Sebastiano Stramaglia. Nonlinear connectivity by granger causality. Neuroimage , 58(2):330–338, 2011.
  • Masson-Delmotte et al. (2021) Valérie Masson-Delmotte, Panmao Zhai, Anna Pirani, Sarah L Connors, Clotilde Péan, Sophie Berger, Nada Caud, Y Chen, L Goldfarb, MI Gomis, et al. Climate change 2021: the physical science basis. Contribution of working group I to the sixth assessment report of the intergovernmental panel on climate change , 2(1):2391, 2021.
  • McCracken (2016) James M McCracken. Exploratory causal analysis with time series data. Springer International Publishing , 2016. doi: 10.1007/978-3-031-01909-8_3 . URL https://doi.org/10.1007/978-3-031-01909-8_3 .
  • McCracken and Weigel (2014) James M McCracken and Robert S Weigel. Convergent cross-mapping and pairwise asymmetric inference. Physical Review E , 90(6):062903, 2014.
  • Menegozzo et al. (2021) Giovanni Menegozzo, Diego Dall’Alba, and Paolo Fiorini. Industrial time series modeling with causal precursors and separable temporal convolutions. IEEE Robotics and Automation Letters , 6(4):6939–6946, 2021.
  • Menegozzo et al. (2022) Giovanni Menegozzo, Diego Dall’Alba, and Paolo Fiorini. Cipcad-bench: Continuous industrial process datasets for benchmarking causal discovery methods. In 2022 IEEE 18th International Conference on Automation Science and Engineering (CASE) , pages 2124–2131. IEEE, 2022.
  • Mojtabai (2024) Ramin Mojtabai. Problematic social media use and psychological symptoms in adolescents. Social psychiatry and psychiatric epidemiology , pages 1–8, 2024.
  • Montavon et al. (2018) Grégoire Montavon, Wojciech Samek, and Klaus-Robert Müller. Methods for interpreting and understanding deep neural networks. Digital signal processing , 73:1–15, 2018.
  • Monti et al. (2020) Ricardo Pio Monti, Kun Zhang, and Aapo Hyvärinen. Causal discovery with general non-linear relationships using non-linear ica. In Uncertainty in artificial intelligence , pages 186–195. PMLR, 2020.
  • Mooij et al. (2016) Joris M Mooij, Jonas Peters, Dominik Janzing, Jakob Zscheischler, and Bernhard Schölkopf. Distinguishing cause from effect using observational data: methods and benchmarks. The Journal of Machine Learning Research , 17(1):1103–1204, 2016.
  • Moraffah et al. (2021) Raha Moraffah, Paras Sheth, Mansooreh Karami, Anchit Bhattacharya, Qianru Wang, Anique Tahir, Adrienne Raglin, and Huan Liu. Causal inference for time series analysis: Problems, methods and evaluation. Knowledge and Information Systems , 63:3041–3085, 2021.
  • Naik and Kumar (2011) Ganesh R Naik and Dinesh K Kumar. An overview of independent component analysis and its applications. Informatica , 35(1), 2011.
  • Nandy et al. (2017) Preetam Nandy, Marloes H Maathuis, and Thomas S Richardson. Estimating the effect of joint interventions from observational data in sparse high-dimensional settings. The Annals of Statistics , 45(2):647–674, 2017.
  • Nauta et al. (2019a) Meike Nauta, Doina Bucur, and Christin Seifert. Causal discovery with attention-based convolutional neural networks. Machine Learning and Knowledge Extraction , 1(1):19, 2019a.
  • Nauta et al. (2019b) Meike Nauta, Doina Bucur, and Christin Seifert. TCDF-Temporal Causal Discovery Framework (PyTorch), 2019b. URL https://github.com/M-Nauta/TCDF.git .
  • Ng et al. (2020) Ignavier Ng, AmirEmad Ghassami, and Kun Zhang. On the role of sparsity and dag constraints for learning linear dags. Advances in Neural Information Processing Systems , 33:17943–17954, 2020.
  • Nogueira et al. (2021) Ana Rita Nogueira, João Gama, and Carlos Abreu Ferreira. Causal discovery in machine learning: Theories and applications. Journal of Dynamics & Games , 8(3), 2021.
  • Nogueira et al. (2022) Ana Rita Nogueira, Andrea Pugnana, Salvatore Ruggieri, Dino Pedreschi, and João Gama. Methods and tools for causal discovery and causal inference. WIREs Data Mining and Knowledge Discovery , 12(2):e1449, 2022. doi: https://doi.org/10.1002/widm.1449 . URL https://wires.onlinelibrary.wiley.com/doi/abs/10.1002/widm.1449 .
  • Ombadi et al. (2020) Mohammed Ombadi, Phu Nguyen, Soroosh Sorooshian, and Kuo-lin Hsu. Evaluation of methods for causal discovery in hydrometeorological systems. Water Resources Research , 56(7):e2020WR027251, 2020.
  • Pamfil et al. (2020) Roxana Pamfil, Nisara Sriwattanaworachai, Shaan Desai, Philip Pilgerstorfer, Konstantinos Georgatzis, Paul Beaumont, and Bryon Aragam. Dynotears: Structure learning from time-series data. PMLR , pages 1595–1605, 2020.
  • Pan et al. (2018) Zheyi Pan, Yuxuan Liang, Junbo Zhang, Xiuwen Yi, Yong Yu, and Yu Zheng. Hyperst-net: Hypernetworks for spatio-temporal forecasting. arXiv preprint arXiv:1809.10889 , 2018.
  • Pastorello et al. (2020) Gilberto Pastorello, Carlo Trotta, and Canfora. The fluxnet2015 dataset and the oneflux processing pipeline for eddy covariance data. Scientific data , 7(1):225, 2020.
  • Pearl (1985) Judea Pearl. Bayesian networks: A model of self-activated memory for evidential reasoning. In Proceedings of the 7th conference of the Cognitive Science Society, University of California, Irvine, CA, USA , pages 15–17, 1985.
  • Pearl (2009) Judea Pearl. Causality . Cambridge university press, 2009.
  • Pearl et al. (2000) Judea Pearl et al. Models, reasoning and inference. Cambridge, UK: Cambridge University Press , 19(2):3, 2000.
  • Peters and Bühlmann (2015) Jonas Peters and Peter Bühlmann. Structural intervention distance for evaluating causal graphs. Neural computation , 27(3):771–799, 2015.
  • Peters et al. (2013) Jonas Peters, Dominik Janzing, and Bernhard Schölkopf. Causal inference on time series using restricted structural equation models. Advances in neural information processing systems , 26, 2013.
  • Peters et al. (2017) Jonas Peters, Dominik Janzing, and Bernhard Schölkopf. Elements of causal inference: foundations and learning algorithms . The MIT Press, 2017.
  • Petersen et al. (2010) Ronald Carl Petersen, Paul S Aisen, Laurel A Beckett, Michael C Donohue, Anthony Collins Gamst, Danielle J Harvey, Clifford R Jack, William J Jagust, Leslie M Shaw, Arthur W Toga, et al. Alzheimer’s disease neuroimaging initiative (adni): clinical characterization. Neurology , 74(3):201–209, 2010.
  • Raghu et al. (2018) Vineet Raghu, Joseph Ramsey, Alison Morris, Dimitris Manatakis, Peter Sprites, Panos Chrysanthis, Clark Glymour, and Panayiotis Benos. Comparison of strategies for scalable causal discovery of latent variable models from mixed data. International Journal of Data Science and Analytics , 6, 08 2018. doi: 10.1007/s41060-018-0104-3 .
  • Ramsey et al. (2018) Joseph D Ramsey, Kun Zhang, Madelyn Glymour, Ruben Sanchez Romero, Biwei Huang, Imme Ebert-Uphoff, Savini Samarasinghe, Elizabeth A Barnes, and Clark Glymour. Tetrad—a toolbox for causal discovery. In 8th international workshop on climate informatics , pages 1–4, 2018.
  • Richardson and Spirtes (2002) Thomas Richardson and Peter Spirtes. Ancestral graph markov models. The Annals of Statistics , 30(4):962–1030, 2002.
  • Rosenström et al. (2023) Tom H Rosenström, Nikolai O Czajkowski, Ole André Solbakken, and Suoma E Saarni. Direction of dependence analysis for pre-post assessments using non-gaussian methods: a tutorial. Psychotherapy Research , 33(8):1058–1075, 2023.
  • Rubin (1974) Donald B Rubin. Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of educational Psychology , 66(5):688, 1974.
  • Runge (2018) Jakob Runge. Conditional independence testing based on a nearest-neighbor estimator of conditional mutual information. In International Conference on Artificial Intelligence and Statistics , pages 938–947. PMLR, 2018.
  • Runge (2020) Jakob Runge. Discovering contemporaneous and lagged causal relations in autocorrelated nonlinear time series datasets. In Conference on Uncertainty in Artificial Intelligence , pages 1388–1397. PMLR, 2020.
  • Runge (2021) Jakob Runge. Necessary and sufficient graphical conditions for optimal adjustment sets in causal graphical models with hidden variables. Advances in Neural Information Processing Systems , 34:15762–15773, 2021.
  • Runge et al. (2019) Jakob Runge, Peer Nowack, Marlene Kretschmer, Seth Flaxman, and Dino Sejdinovic. Detecting and quantifying causal associations in large nonlinear time series datasets. Science advances , 5(11):eaau4996, 2019.
  • Runge et al. (2020) Jakob Runge, Xavier-Andoni Tibau, Matthias Bruhns, Jordi Muñoz-Marí, and Gustau Camps-Valls. The causality for climate competition. In NeurIPS 2019 Competition and Demonstration Track , pages 110–120. PMLR, 2020.
  • Runge et al. (2023a) Jakob Runge, Andreas Gerhardus, Gherardo Varando, Veronika Eyring, and Gustau Camps-Valls. Causal inference for time series. Nature Reviews Earth & Environment , pages 1–19, 2023a.
  • Runge et al. (2023b) Jakob Runge, Andreas Gerhardus, Gherardo Varando, Veronika Eyring, and Gustau Camps-Valls. tigramite (Version 5.2), 2023b. URL https://github.com/jakobrunge/tigramite.git .
  • Sachs et al. (2005) Karen Sachs, Omar Perez, Dana Pe’er, Douglas A Lauffenburger, and Garry P Nolan. Causal protein-signaling networks derived from multiparameter single-cell data. Science , 308(5721):523–529, 2005.
  • Scholkopf (2019) Bernhard Scholkopf. Causality for machine learning. Probabilistic and Causal Inference , 2019. URL https://api.semanticscholar.org/CorpusID:208267600 .
  • Schreiber (2000) Thomas Schreiber. Measuring information transfer. Physical review letters , 85(2):461, 2000.
  • Scutari (2009) Marco Scutari. Learning bayesian networks with the bnlearn r package. arXiv preprint arXiv:0908.3817 , 2009.
  • Sekhon (2008) Jasjeet S Sekhon. Multivariate and propensity score matching software with automated balance optimization: the matching package for r. Journal of Statistical Software, Forthcoming , 2008.
  • Shimizu et al. (2006) Shohei Shimizu, Patrik O Hoyer, Aapo Hyvärinen, Antti Kerminen, and Michael Jordan. A linear non-gaussian acyclic model for causal discovery. Journal of Machine Learning Research , 7(10), 2006.
  • Shimizu et al. (2011) Shohei Shimizu, Takanori Inazumi, Yasuhiro Sogawa, Aapo Hyvärinen, Yoshinobu Kawahara, Takashi Washio, Patrik O. Hoyer, and Kenneth Bollen. Directlingam: A direct method for learning a linear non-gaussian structural equation model. J. Mach. Learn. Res. , 12(null):1225–1248, jul 2011. ISSN 1532-4435.
  • Smith et al. (2011) Stephen M Smith, Karla L Miller, Gholamreza Salimi-Khorshidi, Matthew Webster, Christian F Beckmann, Thomas E Nichols, Joseph D Ramsey, and Mark W Woolrich. Network modelling methods for fmri. Neuroimage , 54(2):875–891, 2011.
  • Sogawa et al. (2010) Yasuhiro Sogawa, Shohei Shimizu, Yoshinobu Kawahara, and Takashi Washio. An experimental comparison of linear non-gaussian causal discovery methods and their variants. In The 2010 International Joint Conference on Neural Networks (IJCNN) , pages 1–8. IEEE, 2010.
  • Song et al. (2016) Jing Song, Satoshi Oyama, Haruhiko Sato, and Masahito Kurihara. Evaluation of causal discovery models in bivariate case using real world data. In Proceedings of the International MultiConference of Engineers and Computer Scientists , volume 1, 2016.
  • Spiegelhalter et al. (1993) David J Spiegelhalter, A Philip Dawid, Steffen L Lauritzen, and Robert G Cowell. Bayesian analysis in expert systems. Statistical science , pages 219–247, 1993.
  • Spirtes and Zhang (2016) Peter Spirtes and Kun Zhang. Causal discovery and inference: concepts and recent methodological advances. In Applied informatics , volume 3, pages 1–28. Springer, 2016.
  • Spirtes et al. (2001) Peter Spirtes, Clark Glymour, and Richard Scheines. Causation, prediction, and search . MIT press, 2001.
  • Spirtes et al. (2013) Peter L Spirtes, Christopher Meek, and Thomas S Richardson. Causal inference in the presence of latent variables and selection bias. arXiv preprint arXiv:1302.4983 , 2013.
  • Statnikov et al. (2013) Alexander Statnikov, Ben Hamner, Hugo Jair Escalante, Isabelle, and Mehreen Saeed. Cause-effect pairs, 2013. URL https://kaggle.com/competitions/cause-effect-pairs .
  • Stekhoven et al. (2012) Daniel J Stekhoven, Izabel Moraes, Gardar Sveinbjörnsson, Lars Hennig, Marloes H Maathuis, and Peter Bühlmann. Causal stability ranking. Bioinformatics , 28(21):2819–2823, 2012.
  • Sugihara et al. (2012) George Sugihara, Robert May, Hao Ye, Chih-hao Hsieh, Ethan Deyle, Michael Fogarty, and Stephan Munch. Detecting causality in complex ecosystems. science , 338(6106):496–500, 2012.
  • Sun et al. (2014) Jie Sun, Carlo Cafaro, and Erik M Bollt. Identifying the coupling structure in complex systems through the optimal causation entropy principle. Entropy , 16(6):3416–3433, 2014.
  • Sun et al. (2015) Jie Sun, Dane Taylor, and Erik M Bollt. Causal network inference by optimal causation entropy. SIAM Journal on Applied Dynamical Systems , 14(1):73–106, 2015.
  • Takens (1981) Floris Takens. Dynamical systems and turbulence. Warwick, 1980 , pages 366–381, 1981.
  • Tank et al. (2021a) Alex Tank, Ian Covert, Nicholas Foti, Ali Shojaie, and Emily Fox. Neural-GC, 2021a. URL https://github.com/iancovert/Neural-GC.git .
  • Tank et al. (2021b) Alex Tank, Ian Covert, Nicholas Foti, Ali Shojaie, and Emily B Fox. Neural granger causality. IEEE Transactions on Pattern Analysis and Machine Intelligence , pages 1–1, 2021b. doi: 10.1109/tpami.2021.3065601 . URL https://doi.org/10.1109%2Ftpami.2021.3065601 .
  • Tu et al. (2019) Ruibo Tu, Kun Zhang, Bo Bertilson, Hedvig Kjellstrom, and Cheng Zhang. Neuropathic pain diagnosis simulator for causal discovery algorithm evaluation. Advances in Neural Information Processing Systems , 32, 2019.
  • Uemura et al. (2022) Kento Uemura, Takuya Takagi, Kambayashi Takayuki, Hiroyuki Yoshida, and Shohei Shimizu. A multivariate causal discovery based on post-nonlinear model. In Conference on Causal Learning and Reasoning , pages 826–839. PMLR, 2022.
  • Van den Bulcke et al. (2006) Tim Van den Bulcke, Koenraad Van Leemput, Bart Naudts, Piet van Remortel, Hongwu Ma, Alain Verschoren, Bart De Moor, and Kathleen Marchal. Syntren: a generator of synthetic gene expression data for design and analysis of structure learning algorithms. BMC bioinformatics , 7:1–12, 2006.
  • Wang et al. (2014) Jing Wang, Pengjian Shang, Aijin Lin, and Yuechen Chen. Segmented inner composition alignment to detect coupling of different subsystems. Nonlinear Dynamics , 76:1821–1828, 2014.
  • Wang et al. (2023) Lu Wang, Hang Ruan, Yanran Hong, and Keyu Luo. Detecting the hidden asymmetric relationship between crude oil and the us dollar: A novel neural granger causality method. Research in International Business and Finance , 64:101899, 2023.
  • Wang et al. (2021) Xiaoqiang Wang, Yali Du, Shengyu Zhu, Liangjun Ke, Zhitang Chen, Jianye Hao, and Jun Wang. Ordering-based causal discovery with reinforcement learning. arXiv preprint arXiv:2105.06631 , 2021.
  • Xie et al. (2019) Feng Xie, Ruichu Cai, Yan Zeng, Jiantao Gao, and Zhifeng Hao. An efficient entropy-based causal discovery method for linear structural equation models with iid noise variables. IEEE transactions on neural networks and learning systems , 31(5):1667–1680, 2019.
  • Yao et al. (2021) Liuyi Yao, Zhixuan Chu, Sheng Li, Yaliang Li, Jing Gao, and Aidong Zhang. A survey on causal inference. ACM Trans. Knowl. Discov. Data , 15(5), may 2021. ISSN 1556-4681. doi: 10.1145/3444944 . URL https://doi.org/10.1145/3444944 .
  • Ye et al. (2015) Hao Ye, Ethan R Deyle, Luis J Gilarranz, and George Sugihara. Distinguishing time-delayed causal interactions using convergent cross mapping. Scientific reports , 5(1):14750, 2015.
  • Yu et al. (2019) Yue Yu, Jie Chen, Tian Gao, and Mo Yu. Dag-gnn: Dag structure learning with graph neural networks. In International conference on machine learning , pages 7154–7163. PMLR, 2019.
  • Yuan and Malone (2013) Changhe Yuan and Brandon Malone. Learning optimal bayesian networks: a shortest path perspective. J. Artif. Int. Res. , 48(1):23–65, oct 2013. ISSN 1076-9757.
  • Zhang (2008) Jiji Zhang. On the completeness of orientation rules for causal discovery in the presence of latent confounders and selection bias. Artificial Intelligence , 172(16-17):1873–1896, 2008.
  • Zhang et al. (2021a) Keli Zhang, Shengyu Zhu, Marcus Kalander, Ignavier Ng, Junjian Ye, Zhitang Chen, and Lujia Pan. gCastle (Version 1.0.4), 2021a. URL https://github.com/huawei-noah/trustworthyAI.git .
  • Zhang et al. (2021b) Keli Zhang, Shengyu Zhu, Marcus Kalander, Ignavier Ng, Junjian Ye, Zhitang Chen, and Lujia Pan. gcastle: A python toolbox for causal discovery. arXiv preprint arXiv:2111.15155 , 2021b.
  • Zhang and Hyvarinen (2012) Kun Zhang and Aapo Hyvarinen. On the identifiability of the post-nonlinear causal model. arXiv preprint arXiv:1205.2599 , 2012.
  • Zhang et al. (2015) Kun Zhang, Zhikun Wang, Jiji Zhang, and Bernhard Schölkopf. On estimation of functional causal models: general results and application to the post-nonlinear causal model. ACM Transactions on Intelligent Systems and Technology (TIST) , 7(2):1–22, 2015.
  • Zheng et al. (2018) Xun Zheng, Bryon Aragam, Pradeep K Ravikumar, and Eric P Xing. Dags with no tears: Continuous optimization for structure learning. Advances in neural information processing systems , 31, 2018.
  • Zheng et al. (2020) Xun Zheng, Chen Dan, Bryon Aragam, Pradeep Ravikumar, and Eric Xing. Learning sparse nonparametric dags. In International Conference on Artificial Intelligence and Statistics , pages 3414–3425. PMLR, 2020.
  • Zheng et al. (2024a) Yujia Zheng, Biwei Huang, Wei Chen, Joseph Ramsey, Mingming Gong, Ruichu Cai, Shohei Shimizu, Peter Spirtes, and Kun Zhang. causal-learn (Version 0.1.3.8), 2024a. URL https://github.com/py-why/causal-learn.git .
  • Zheng et al. (2024b) Yujia Zheng, Biwei Huang, Wei Chen, Joseph Ramsey, Mingming Gong, Ruichu Cai, Shohei Shimizu, Peter Spirtes, and Kun Zhang. Causal-learn: Causal discovery in python. Journal of Machine Learning Research , 25(60):1–8, 2024b.
  • Zhu et al. (2019) Shengyu Zhu, Ignavier Ng, and Zhitang Chen. Causal discovery with reinforcement learning. arXiv preprint arXiv:1906.04477 , 2019.

Appendix A Residual Plot

[Figure: residual plot]

Information

  • Author Services

Initiatives

You are accessing a machine-readable page. In order to be human-readable, please install an RSS reader.

All articles published by MDPI are made immediately available worldwide under an open access license. No special permission is required to reuse all or part of the article published by MDPI, including figures and tables. For articles published under an open access Creative Common CC BY license, any part of the article may be reused without permission provided that the original article is clearly cited. For more information, please refer to https://www.mdpi.com/openaccess .

Feature papers represent the most advanced research with significant potential for high impact in the field. A Feature Paper should be a substantial original Article that involves several techniques or approaches, provides an outlook for future research directions and describes possible research applications.

Feature papers are submitted upon individual invitation or recommendation by the scientific editors and must receive positive feedback from the reviewers.

Editor’s Choice articles are based on recommendations by the scientific editors of MDPI journals from around the world. Editors select a small number of articles recently published in the journal that they believe will be particularly interesting to readers, or important in the respective research area. The aim is to provide a snapshot of some of the most exciting work published in the various research areas of the journal.

Original Submission Date Received: .

  • Active Journals
  • Find a Journal
  • Proceedings Series
  • For Authors
  • For Reviewers
  • For Editors
  • For Librarians
  • For Publishers
  • For Societies
  • For Conference Organizers
  • Open Access Policy
  • Institutional Open Access Program
  • Special Issues Guidelines
  • Editorial Process
  • Research and Publication Ethics
  • Article Processing Charges
  • Testimonials
  • Preprints.org
  • SciProfiles
  • Encyclopedia

applsci-logo

Article Menu

empirical research data analysis

  • Subscribe SciFeed
  • Recommended Articles
  • Google Scholar
  • on Google Scholar
  • Table of Contents

Find support for a specific problem in the support section of our website.

Please let us know what you think of our products and services.

Visit our dedicated information section to learn more about MDPI.

JSmol Viewer

High-resolution monitored data analysis of ev public charging stations for modelled grid impact validation.

empirical research data analysis

Share and Cite

Estrada Poggio, A.; Rotondo, G.; Prina, M.G.; Zubaryeva, A.; Sparber, W. High-Resolution Monitored Data Analysis of EV Public Charging Stations for Modelled Grid Impact Validation. Appl. Sci. 2024 , 14 , 8133. https://doi.org/10.3390/app14188133

Estrada Poggio A, Rotondo G, Prina MG, Zubaryeva A, Sparber W. High-Resolution Monitored Data Analysis of EV Public Charging Stations for Modelled Grid Impact Validation. Applied Sciences . 2024; 14(18):8133. https://doi.org/10.3390/app14188133

Estrada Poggio, Aaron, Giuseppe Rotondo, Matteo Giacomo Prina, Alyona Zubaryeva, and Wolfram Sparber. 2024. "High-Resolution Monitored Data Analysis of EV Public Charging Stations for Modelled Grid Impact Validation" Applied Sciences 14, no. 18: 8133. https://doi.org/10.3390/app14188133

Article Metrics

Article access statistics, further information, mdpi initiatives, follow mdpi.

MDPI

Subscribe to receive issue release notifications and newsletters from MDPI journals

Thank you for visiting nature.com. You are using a browser version with limited support for CSS. To obtain the best experience, we recommend you use a more up to date browser (or turn off compatibility mode in Internet Explorer). In the meantime, to ensure continued support, we are displaying the site without styles and JavaScript.

  • View all journals
  • Explore content
  • About the journal
  • Publish with us
  • Sign up for alerts
  • Open access
  • Published: 10 September 2024

Spatial heterogeneity analysis of biased land resource supply policies on housing prices and innovation efficiency

  • Jinsi Liu   ORCID: orcid.org/0000-0001-9181-3936 1 , 2 ,
  • Hu Xiang 3 ,
  • Shengjiao Zhu 1 , 2 &
  • Shixiang Chen 1 , 2  

Humanities and Social Sciences Communications volume  11 , Article number:  1180 ( 2024 ) Cite this article

Metrics details

  • Development studies
  • Environmental studies
  • Social policy

To rationally allocate land resources, the government has formulated biased land resource supply policies. However, this policy will promote the fluctuation of regional housing prices, thereby affecting regional innovation efficiency. To elucidate the intrinsic logical relationship among these factors, this study investigates the influence of biased land resource supply policies on housing prices and innovation across 31 Chinese provinces over a 16-year period. It places particular emphasis on probing into the spatial disparities in this impact. The innovation of this paper lies in the development of a theoretical analysis model, which creatively utilizes the regional land supply status as an instrumental variable to address the endogeneity problem. At the same time, a double robustness test was conducted by substituting the explanatory variables and employing the fixed effects model. The results of the TSLS regression indicate: (1) The land resource supply policy, which prioritizes the central and western regions, has resulted in an increase in housing prices across the majority of regions in China, with the eastern region experiencing a more rapid rise compared to the central and western regions. (2) The accelerated growth of housing prices in the eastern region is likely to exert a more pronounced inhibitory effect on innovation compared to the central and western regions. In summary, China’s land resource supply policy, which favors the central and western regions, has exacerbated the rise in housing prices in the eastern region, and rising housing prices will inhibit the innovation efficiency of the region. Hence, it is imperative for the government to continuously optimize and adjust land resource supply policies, ensuring their alignment with population migration patterns, stabilizing the real estate market, while also considering the innovation efficiency of regional development.

Introduction

Land resource supply, as a fundamental social and economic endeavor, involves the government allocating land resources to various entities based on specific regulations. This process encompasses two primary facets: land acquisition and land distribution, with the latter exerting a series of influences on the regional economy and society. Of course, land tenure and management systems vary from country to country. In Egypt, the Egyptian government influences land tenure, delivery and supply systems, and land prices through urban land policies (El Araby, 2003 ). The Israeli government engages in the real estate market by regulating the supply of land resources through land bidding. (Rubin and Felsenstein, 2017 ). The land resource supply planning system in the UK imposes significant constraints on urban space, directly influencing the escalation and fluctuation of housing prices (Cheshire, 2004 ). In China, in 2016, the National People’s Congress issued the “Outline of the Thirteenth Five-Year Plan for National Economic and Social Development of the People’s Republic of China.” The document emphasized prioritizing the provision of land resources for major projects outlined in this plan. In mid-December 2016, the Central Economic Work Conference first proposed the concept that ‘housing is for living in, not for speculation.’ This assertion clarified the residential nature of housing, influenced the formulation of land supply policies, and fostered a path toward sustainable development for real estate enterprises (Li et al., 2021 ). In 2021, the State Council issued the “Notice on Issuing the Tourism Development Plan for the 14th Five-Year Plan”, stipulating that the supply of land resources for tourism public service facilities must be ensured in compliance with the law. The Land Management Law of the People’s Republic of China underwent revisions and enhancements in 1998, 2004, and 2019. This document mandates that governments at all levels organize the formulation of land resource utilization plans in accordance with national economic and social development plans. Footnote 1 Indeed, around 2003, the Chinese government implemented a series of significant measures to regulate the supply of land resources. A notable aspect among them is the spatial distribution of land resource supply. In pursuit of balanced regional development, the nation commenced implementing land resource supply policies that favored the central and western regions. (Liang et al., 2016 ). There is evidence indicating that after 2003, the proportion of land resource supply in the central and western regions of the country experienced a significant upward trend (Yang et al., 2022 ). From this perspective, the government engages in macro-control through land control policies to achieve the purpose of rational distribution of land resources in space and time, and thereby promote the development of the national economy and society (Li et al., 2020 ).

However, the supply of land resources profoundly influences housing prices. Numerous scholars have investigated the intrinsic relationship between land resource supply policies and regional housing prices, and it is widely acknowledged that shifts in regional land resource supply have had a substantial impact on fluctuations in regional housing prices. In China, the supply of regional land resources is primarily shaped by macro land control policies (Fan et al., 2021). The issue of high housing prices has gradually become a focal point of social concern, drawing the attention of many scholars to its impact on innovation (Yu and Cai, 2021). Current research shows that the rapid increase in housing prices raises the costs of labor, land, and other factors. Consequently, this triggers an over-concentration of resources within the real estate industry, crowding out innovative human resources, R&D investment, and other inputs. This phenomenon represents the inhibitory impact of high housing prices on innovation (Miao and Wang, 2014). China has entered a new phase of high-quality development, in which innovation, as the primary driver, is increasingly becoming the decisive factor. The competitive landscape among local governments has shifted from the initial "race for growth" to a focus on "competition for innovation" (He et al., 2018). As outlined in China's 14th Five-Year Plan, there remains a gap between China's innovation capabilities and the requirements for high-quality development, and numerous obstacles hinder the innovation-driven development strategy, such as an unfavorable innovation environment, dispersed innovation resources, and insufficient vitality among innovation entities (Niu, Zhang, et al., 2023). So, will the land resource supply policy that favors the central and western regions influence housing prices and innovation (Zhang et al., 2024)? And is this influence spatially heterogeneous? To answer these questions, building on the existing literature on "land supply and housing prices" (Liang et al., 2016; Fan et al., 2021) and "housing prices and innovation" (Yu and Cai, 2021), this paper integrates the analysis of land resource supply, housing prices, and innovation within a single framework.

The innovative aspects of this paper are as follows: (1) Whereas most studies focus solely on either the impact of land supply on housing prices (Yii et al., 2022) or the influence of housing prices on regional innovation (Li et al., 2023), this paper constructs a theoretical analysis model and utilizes instrumental variables to integrate land resource supply, housing prices, and innovation within a unified framework (Li and Li, 2024). (2) This paper documents the temporal and spatial patterns of land supply, housing prices, and regional innovation, while also measuring and analyzing regional innovation efficiency. (3) Furthermore, this paper conducts a double robustness test by substituting the explained variable and employing a fixed effects model (Li et al., 2022). In conclusion, the research presented in this paper offers insights that can inform the development of government land supply policies worldwide.

The research framework of this paper unfolds as follows. Initially, a theoretical model is formulated and corresponding hypotheses are posited; leveraging the regional land resource supply status as an instrumental variable addresses the endogeneity concern. Subsequently, the dynamics of land supply, housing prices, and regional innovation are examined across both temporal and spatial dimensions, and regional innovation efficiency is assessed and analyzed. Finally, TSLS regression is employed to examine the impact of land resource supply on housing prices and, in turn, the influence of housing prices on innovation efficiency. Additionally, the paper furnishes recommendations for refining national land resource supply policies. The central finding is that China's land resource supply policy, which favors the central and western regions, has exacerbated the surge in housing prices in the eastern region, and the escalating housing costs impede the region's innovation efficiency. The research framework of this paper is outlined below (Fig. 1).

Figure 1: Research framework diagram.

Literature review

Pathways of land resource supply policy impacting housing prices

Currently, many countries or regions are enhancing regional land resource supply through intensified land expropriation (Xu et al., 2022), which constitutes a pivotal aspect of land resource provision (Magliocca et al., 2020). Unscientific, forced land acquisition, however, produces substantial negative impacts (Larbi et al., 2004). Land distribution, as the second step in the supply of land resources, has an even more significant impact on the economy and society of the entire region (Monk et al., 1996). China's distinctive foundational land conditions and systems mean that land resource supply policies have a significant impact on the housing market (He, 2022). The land resource supply policy encompasses various aspects, including land supply quantity, structure, method, and pricing; among these, the impacts of land supply quantity have been studied most extensively (Huang et al., 2015). The impact of land supply quantity on housing prices can be summarized into four pathways. The first is the production function channel: the amount of land supplied influences the quantity of housing supply, thereby affecting housing prices (Tian and Ma, 2009). In general, an increase in land supply can effectively facilitate real estate destocking, although disparities exist between land supply and housing stock across regions and periods (Shen et al., 2018). The second is the anticipatory channel: modifications of land resource supply policy shape market participants' expectations of future real estate trends, which alter the supply and demand dynamics of the housing market and ultimately housing prices (Hu and Qian, 2017). The third is supply scale: disparities in supply scale affect housing prices by influencing the elasticity of housing supply (El Araby, 2003). The fourth is the structure of land resource supply: different land supply structures yield distinct effects on housing prices. Owing to historical and geographical factors, notable differences exist in the impact of governments' existing land planning policies on housing prices (Meen and Nygaard, 2011). For instance, the advancement of high-speed railways directly catalyzes shifts in land use structure, subsequently exerting varying impacts on housing prices across different regions (Chen et al., 2021). Land use planning also varies significantly among countries. The United Kingdom, for example, grapples with considerable uncertainty in land planning, coupled with a lack of control in the housing land market, resulting in heightened speculative risk-taking. House prices in France are subject to tighter controls, with speculation being more limited. Sweden, meanwhile, uses its land resource reserves to mitigate increases in housing prices (Barlow, 1993).

Researching the link between land resource supply and housing prices

In the case of China, since around 2003 the Chinese government has tightly regulated the supply of urban residential land. This has led to a decrease in residential land supply and supply elasticity, consequently placing downward pressure on new housing availability (Yan et al., 2014). Average land prices for residential, industrial, and commercial plots were, respectively, 57%, 24%, and 41% higher in 2012 than in 2007 (Qin et al., 2016). It is evident that, in recent decades, escalating housing prices have gained widespread acknowledgment. The majority of scholars hold the view that augmenting the supply of land resources can effectively curb the surge in housing prices. In particular, China has adopted a policy that creates a spatial and structural mismatch in land resource supply: the supply of land resources increases in the central and western regions while decreasing in the eastern region. This imbalance causes land prices in the eastern region to rise (Grimes and Aitken, 2010), subsequently leading to a rapid increase in housing prices (Fan et al., 2021). Correspondingly, as housing prices rise, wages in the area also rise (Liang et al., 2016). Furthermore, local governments can assume a more significant role in land supply: by increasing the supply of residential land, they can mitigate housing prices, and by employing dynamic procyclical land resource supply policies, they can dampen macroeconomic fluctuations (He, 2023). For this reason, some scholars suggest that policymakers should prioritize increasing the supply of land to mitigate the rise in housing prices (Yii et al., 2022). On the contrary, some scholars hold differing views, suggesting that the influence of land supply on housing prices is minimal or that there is no direct causal relationship between land resource availability and housing price fluctuations (Tse, 1998). Research on Melbourne's urban growth boundary has found no conclusive evidence that the escalation in land prices is solely attributable to land resource supply policies (Buxton and Taylor, 2011). Likewise, certain scholars posit a positive correlation between land supply and housing prices, particularly evident in regions characterized by higher housing costs (Wang et al., 2023).

Exploring the relationship between housing prices and regional innovation

In academic discourse, the prevailing view is that the escalation of housing prices tends to stifle regional innovation. Mechanistically, this inhibition stems from three primary channels. The first is the crowding-out effect of rising housing prices on innovation funding. The second is the crowding-out effect of rising housing prices on innovative talent. The third is the blocking effect of rising housing prices on the spirit and vitality of innovation and entrepreneurship. Naturally, the influence of housing prices on innovation varies over time. Generally, housing prices exhibit an inverted U-shaped relationship with innovation: beyond a certain point, continued increases in housing prices diminish regional innovative vigor. This study focuses on China, where housing prices in many regions have evidently surpassed this critical juncture (Yu and Cai, 2021), thereby stifling regional innovation.

Crowding-out effect: investigating the impact of rising housing prices on innovative funding

When it comes to innovation funding, particularly for enterprises engaged in any form of innovation activity, sustained long-term investment from internal financing channels is crucial (Brown et al., 2012). The rapid surge in housing prices has spurred greater investment in real estate, which inevitably crowds out the long-term funds enterprises need for innovative R&D activities, consequently discouraging further innovation efforts. While there is no direct evidence linking housing prices to the high savings rate of Chinese households, it is undeniable that a high savings rate translates to reduced investment in innovation (Wang and Wen, 2012). Drawing on theoretical models, scholars have further substantiated the crowding-out impact of rapid growth in real estate investment on long-term investment in other sectors (Bleck and Liu, 2018). In first-tier urban agglomerations, housing prices exert significant spillover effects on urban innovation, influencing the innovative spirit of surrounding cities as well (Wang and Hu, 2023). The adverse effects of high housing prices on regional innovation manifest across innovation input, output, and the transformation of innovation achievements (Li et al., 2023).

Crowding-out effect: examining how rising housing prices impact innovative talent

In terms of the impact on innovative talent, as China's housing prices continue to rise, entrepreneurial activities among urban adults are generally hindered (Li and Wu, 2014). Excessively high housing prices draw funds into the real estate market, thereby constraining residents' consumption and impeding the entry of innovative talent to some extent (Ding et al., 2023). Hence, in the current urban development context, it is crucial to examine how housing costs influence the employment and residential choices of talent (Ling et al., 2023). Of course, the impact of rising housing prices on talent attraction should be considered by stage. In the initial phase of urban housing price increases, there is indeed a certain attraction for talent. In recent years, however, the escalation of housing prices has turned into a bubble, and during this period high housing prices exert a significant crowding-out effect on innovative talent and talent attraction (Lin et al., 2021).

Blocking effects: researching how rising housing prices impact innovation and entrepreneurship

With regard to the vitality of entrepreneurial spirit, when housing prices surpass a certain threshold, they inevitably hamper the innovative endeavors of businesses, particularly those with limited enthusiasm for innovation (Chu et al., 2023). Using labor force dynamics data, scholars have demonstrated that exorbitant housing prices substantially reduce the inclination of urban individuals to pursue demand-driven entrepreneurship (Hu and Qian, 2022). Elevated housing prices significantly impede innovation within manufacturing companies: as housing prices appreciate, companies divert funds toward real estate investments, thereby diminishing their innovation drive (Rong et al., 2016). To this end, scholars have suggested that decreasing housing price fluctuations and increasing liquidity in the real estate market can encourage families to embrace entrepreneurial pursuits (Oh et al., 2021). Conversely, other scholars argue that there is a positive correlation between regional innovation quality and housing price appreciation; specifically, the enhancement of innovation quality can stimulate housing price escalation (Beracha et al., 2022). Some studies have also indicated that housing prices can bolster the innovation and entrepreneurial vitality of both local cities and their neighbors (Fan et al., 2023). From this, it is evident that the impact of increasing housing prices on regional innovation varies significantly across periods (Hu et al., 2019). Certainly, rising housing prices also yield other ancillary effects (Fang and Lv, 2023), such as enhancing the quality of green innovation within enterprises (Yan et al., 2024). Nonetheless, the current surge in housing prices across China has outpaced both public and business expectations, potentially exerting a detrimental effect on regional innovation. Consequently, it is imperative for the government to prioritize the impact of housing macro-prudential policies (Chen et al., 2023).

A review of the existing literature reveals that, although academic research on land resource supply, housing prices, and innovation is relatively abundant, most of it examines the relationship between only two of these factors, with few studies addressing the interplay among all three. In light of this gap, this paper integrates land resource supply, housing prices, and innovation within a unified analytical framework, drawing upon existing research findings. It investigates how land supply influences housing prices and subsequently impacts innovation. Employing panel data from 31 Chinese provinces and regions, this paper empirically examines the relationship between land resource supply, housing prices, and innovation, focusing on whether biases in land resource supply policies exert a spatially heterogeneous impact on regional housing prices and innovation efficiency.

Research design

Theoretical model construction and hypothesis development

Before the empirical analysis, this paper draws on the paradigms of Helpman (1995) and Ottaviano et al. (2011) to explore spatial agglomeration efficiency within the framework of the new economic geography model; Krugman's new economic geography theory also holds significant reference value (Krugman, 1991). Building upon these paradigms and theories, this paper examines how land resource supply policies across regions influence housing prices and thereby impact innovation.

Assume two regions between which labor can flow freely: the central and western region a and the eastern region b (Table 1). Consumers optimize their utility by allocating their income between tradable industrial products and non-tradable housing. The consumer utility function for region \(q(q=a,b)\) is given in formula (1).

In formula (1), \(C_{qm}\) and \(C_{qh}\) respectively represent the quantities of industrial products and housing consumed in region q. \(P_{qm}\) and \(P_{qh}\) denote the prices of industrial products and housing, respectively. \(W_{q}\) represents the income of consumers in region q, and μ denotes the proportion of consumer expenditure allocated to industrial products, 0 < μ < 1.

After optimization, the consumer's indirect utility function, formula (2), can be derived.

Taking the logarithm of Eq. (2) and performing a monotonic transformation yields formula (3).

In formula (3), \(v_{q}\) represents the monotonic transformation of the indirect utility function \(V_{q}\) after taking the logarithm, \(w_{q}\) signifies logarithmic income, \(\theta=1-\mu\) denotes the share of housing consumption in consumption expenditure, and \(p_{qh}\) stands for the logarithmic housing price. This paper normalizes the price of tradable industrial goods to 1 and does not consider the investment function of housing; house prices are therefore relative to tradable industrial goods.
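Under the standard Cobb-Douglas assumption consistent with these definitions (the functional forms here are a reconstruction, not a verbatim reproduction), formulas (1)-(3) take the form \(U_{q}=C_{qm}^{\mu}C_{qh}^{1-\mu}\) subject to \(P_{qm}C_{qm}+P_{qh}C_{qh}=W_{q}\) (1); \(V_{q}=\mu^{\mu}(1-\mu)^{1-\mu}W_{q}P_{qm}^{-\mu}P_{qh}^{-(1-\mu)}\) (2); and, with \(P_{qm}\) normalized to 1, \(v_{q}=w_{q}-\theta p_{qh}\) up to a constant (3).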

First, to analyze the distribution of labor across regions, we adopt Moretti & Perloff's methodology and incorporate labor's individual preferences for regions (Moretti and Perloff, 2002), denoted \(e_{qj}\), into Eq. (3), yielding Eq. (4).
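A form consistent with this description is \(v_{qj}=w_{q}-\theta p_{qh}+e_{qj}=v_{q}+e_{qj}\) (4), so that worker j compares \(v_{aj}\) and \(v_{bj}\) when choosing a location.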

In Eq. (4), \(e_{qj}\) represents labor force j's personal preference for region q, with \(e_{qj}\sim U\left(-r,r\right)\). The greater the value of r, the more significant personal preference becomes in the choice of residence.

Given the free flow of labor, equilibrium dictates that the utility levels attained in different regions are equal, implying \(v_{a}=v_{b}\), as derived from Eq. (4).

Furthermore, the distribution of the labor force population in regions a and b at equilibrium is given by formulas (6) and (7).

In formulas (6) and (7), \(l_{q}\) represents the logarithm of the labor force population in region \(q(q=a,b)\), and \(l=l_{a}+l_{b}\).

Second, from the perspective of producers, we follow Kline & Moretti's (2014) analysis based on the constant-returns-to-scale production function model. Under the assumption of unrestricted capital mobility, solving the first-order conditions for maximum corporate profit yields the relationship in formula (8).

In formula (8), α represents the elasticity of labor output and β the elasticity of capital output, where \(\alpha+\beta < 1\), and C is a constant.

Third, from the perspective of the housing market, if each unit of labor consumes one unit of housing, and \(\rho_{q}\) parameterizes the housing supply elasticity of region q (Saiz, 2010), then the relationship in formula (9) holds.
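One standard log-linear reconstruction consistent with this description, in which \(\rho_{q}\) acts as the inverse housing supply elasticity, is \(p_{qh}=\rho_{q}l_{q}+\text{const}\) (9): housing demand equals the labor population, and prices respond more steeply to population where supply is less elastic.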

The larger \(\rho_{q}\) is, the smaller the housing supply elasticity.

Combining Eqs. (6) to (9) yields the equilibrium solution, formula (10).

In Eq. (10), \(\varphi_{q}=l\theta\rho_{q}\left(1-\beta\right)+l\left(1-\alpha-\beta\right)+r(1-\beta)\).

To analyze the impact of biased land supply policies on housing prices in the eastern, central, and western regions, Eq. (10) is further used to analyze the effect of \(\rho_{b}\), yielding formulas (11) to (13).

From formulas (11) to (13), it is evident that the land resource supply policy favoring the central and western regions reduces the elasticity of housing supply in the eastern region (\(\rho_{b}\) becomes larger) (Fan et al., 2021). On the one hand, housing prices \(p_{bh}\) in the eastern region and \(p_{ah}\) in the central and western regions both rise as \(\rho_{b}\) increases. On the other hand, housing prices \(p_{bh}\) in the eastern region rise faster than housing prices \(p_{ah}\) in the central and western regions. On this basis, research hypothesis 1 is put forward:

Research Hypothesis 1: The land resource supply policy favoring the central-western regions has caused housing prices to rise in both the eastern and central and western regions, with the eastern region experiencing a faster rate of increase compared to the central-western regions.
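To illustrate the comparative statics behind Hypothesis 1, the following minimal numerical sketch (our illustration, not the paper's calibration; the wage values, parameter choices, and log-linear supply form \(p_{q}=\rho_{q}l_{q}\) are assumptions) solves the two-region location equilibrium and shows that tightening eastern land supply raises prices in both regions, with the eastern price rising faster:

```python
import numpy as np

# Illustrative two-region comparative statics for Hypothesis 1. Assumptions
# (ours, not the paper's): fixed log wages, log-linear housing supply
# p_q = rho_q * l_q (rho_q is the inverse supply elasticity), and location
# tastes e_qj ~ U(-r, r) as in the text.

THETA, R, L_TOTAL = 0.3, 1.0, 1.0      # housing share, taste dispersion, total labor
W = {"a": 1.0, "b": 1.2}               # fixed log wages; b is the eastern region

def tri_cdf(x, r):
    """CDF of d = e_a - e_b, a triangular variable on [-2r, 2r]."""
    x = np.clip(x, -2 * r, 2 * r)
    return np.where(x <= 0,
                    (x + 2 * r) ** 2 / (8 * r ** 2),
                    1 - (2 * r - x) ** 2 / (8 * r ** 2))

def equilibrium(rho_a, rho_b, n_iter=2000, damp=0.05):
    """Damped fixed-point iteration on the labor allocation."""
    l_a = 0.5 * L_TOTAL
    for _ in range(n_iter):
        l_b = L_TOTAL - l_a
        p_a, p_b = rho_a * l_a, rho_b * l_b            # log housing prices
        v_a, v_b = W["a"] - THETA * p_a, W["b"] - THETA * p_b
        share_a = 1 - tri_cdf(v_b - v_a, R)            # P(worker chooses region a)
        l_a = (1 - damp) * l_a + damp * share_a * L_TOTAL
    return p_a, p_b

base = equilibrium(rho_a=1.0, rho_b=1.0)
tight = equilibrium(rho_a=1.0, rho_b=1.5)  # land supply biased away from region b
print(f"p_a: {base[0]:.3f} -> {tight[0]:.3f}, p_b: {base[1]:.3f} -> {tight[1]:.3f}")
# Both prices rise, and the eastern price p_b rises by more, as in Hypothesis 1.
```

In this sketch, raising \(\rho_{b}\) lowers utility in region b, pushes workers toward region a, and raises both equilibrium prices, with \(p_{b}\) rising several times more than \(p_{a}\).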

From the preceding mechanism analysis, it is evident that increasing housing prices impede regional innovation through three channels: the crowding-out effect on innovation funds, the crowding-out effect on innovative talent, and the blocking effect on the spirit and vitality of innovation and entrepreneurship. Building on research hypothesis 1, research hypothesis 2 is formulated:

Research Hypothesis 2: Faster housing price growth in the eastern region will lead to a stronger inhibitory effect on innovation than in the central and western regions.

Regional innovation efficiency measurement model setting

This paper builds on the Stochastic Frontier Analysis (SFA) model of Battese & Coelli (1995). This model serves primarily as a tool for assessing the efficiency of the production process (Sartori et al., 2024); it proficiently estimates technical efficiency as well as the influence of random errors that may arise during production (Danelon and Kumbhakar, 2023). This paper therefore opts for the Stochastic Frontier Analysis model (Zhan et al., 2022), which surpasses the traditional logarithmic production function form, to assess regional innovation efficiency. The specific model setting is given in formula (14).
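Consistent with the definitions that follow, formula (14) can be written in the standard Battese-Coelli form \(\ln(Inn\_product_{it})=f(Inn\_labor_{it},Inn\_capital_{it};\beta)+v_{it}-\mu_{it}\) (14), where f( ) is specified as a translog function of the logged inputs.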

In the above equation, i and t respectively index provincial regions and years. \(Inn\_product\) represents actual innovation output, and f( ) represents frontier innovation output. \(x_{it}\) denotes the set of variables influencing regional innovation output, comprising innovative labor input \(Inn\_labor\) and innovative capital input \(Inn\_capital\). \(v_{it}-\mu_{it}\) is the composite error term, in which the random error term \(v_{it}\) obeys \(N(0,\sigma_{v}^{2})\) and is independent of the technical inefficiency term \(\mu_{it}\); \(\mu_{it}\) obeys the truncated normal distribution \(N^{+}(\mu,\sigma_{\mu}^{2})\), with \(\mu_{it}=\mu_{i}\exp\left[-\eta(t-T)\right]\), where η is the time-varying parameter.

Regional innovation efficiency is defined as the ratio of actual innovation output to frontier innovation output.
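In the Battese-Coelli framework, this ratio takes the form \(Inn\_efficiency_{it}=Inn\_product_{it}/\left[f(x_{it})\exp(v_{it})\right]=\exp(-\mu_{it})\) (15).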

The range of values for \({Inn\_}{{efficiency}}_{{it}}\) is [0, 1]. If \({Inn\_}{{efficiency}}_{{it}}=1\) , it indicates full technological effectiveness. If \({Inn\_}{{efficiency}}_{{it}} < 1\) , it suggests technological inefficiency.
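The paper estimates this model with Frontier4.1 (see below). As a hedged illustration of the estimation logic only, the following Python sketch fits a simpler pooled half-normal frontier to simulated data by maximum likelihood and recovers efficiency scores via the Jondrow et al. (JLMS) estimator; the Battese-Coelli time-varying panel specification used in the paper is more elaborate:

```python
import numpy as np
from scipy import optimize, stats

# Pooled half-normal stochastic frontier on simulated data (illustration only).
rng = np.random.default_rng(0)
n = 500
lnL, lnK = rng.normal(size=n), rng.normal(size=n)        # logged inputs
u = np.abs(rng.normal(0.0, 0.4, n))                      # half-normal inefficiency
y = 1.0 + 0.5 * lnL + 0.4 * lnK + rng.normal(0.0, 0.2, n) - u
X = np.column_stack([np.ones(n), lnL, lnK])

def negloglik(params):
    # Log-parametrized standard deviations keep sigma_v, sigma_u positive.
    beta, sv, su = params[:3], np.exp(params[3]), np.exp(params[4])
    sigma, lam = np.hypot(sv, su), su / sv
    eps = y - X @ beta
    ll = (np.log(2) - np.log(sigma) + stats.norm.logpdf(eps / sigma)
          + stats.norm.logcdf(-eps * lam / sigma))
    return -ll.sum()

res = optimize.minimize(negloglik, x0=np.zeros(5), method="BFGS")
beta_hat = res.x[:3]
sv, su = np.exp(res.x[3]), np.exp(res.x[4])

# JLMS point estimates of inefficiency, then TE = exp(-E[u | eps]) in (0, 1].
eps = y - X @ beta_hat
s2 = su ** 2 + sv ** 2
s_star = su * sv / np.sqrt(s2)
z = (-eps * su ** 2 / s2) / s_star
te = np.exp(-s_star * (stats.norm.pdf(z) / stats.norm.cdf(z) + z))
print(beta_hat, te.mean())
```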

Basic econometric model settings

The empirical model primarily examines how the elasticity of land supply, shaped by biased land resource supply policies, impacts housing prices and consequently innovation. The fundamental econometric model is the equation depicting the influence of housing prices on innovation, formula (16).
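Based on the variable definitions that follow, formula (16) takes the form \(Inn\_efficiency_{it}=\alpha_{0}+\alpha_{1}Hou\_price_{it}+\alpha_{2}Z_{it}+\delta_{i}+\omega_{t}+\varepsilon_{it}\) (16).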

In this equation, i and t respectively index provincial regions and years. \(Inn\_efficiency\) is the dependent variable measuring regional innovation efficiency, and the core explanatory variable \(Hou\_price\) represents housing prices across provincial regions. Z denotes the control variables, \(\delta_{i}\) and \(\omega_{t}\) respectively stand for province and year fixed effects dummies, \(\alpha_{0}\), \(\alpha_{1}\), and \(\alpha_{2}\) are the parameters to be estimated, and ε is the random error term. Table 1 explains the variables appearing in the models of the preceding three sections.

Endogeneity and instrumental variables

There may be endogeneity issues between housing prices and innovation within a region. Primarily, there is an endogeneity problem caused by reverse causality between housing prices and innovation: the higher the level of innovation in a region, the stronger its capacity to sustain economic development, and the demand for real estate increases through the expansion of enterprise production and rising residents' incomes, thereby driving up housing prices (Beracha et al., 2022). Secondly, the influence of housing prices on innovation also faces endogeneity challenges due to omitted variables (Hu et al., 2019). For instance, each province implements a range of distinct innovation encouragement policies tailored to its resource endowments and stage of economic development. These policies not only influence innovation endeavors but may also affect the investment choices of local residents and businesses, and consequently local housing prices. To address omitted variables, this paper controls for a range of provincial-level characteristic variables associated with the level of innovation for which information is accessible; theoretically, however, there may still be omitted variables that cannot be effectively controlled for.

A viable approach to the aforementioned endogeneity issue is the instrumental variable method, which involves identifying a suitable instrument for housing prices (Han and Kung, 2015). Prior studies have extensively explored potential instrumental variables for housing prices, with regional land supply status being among the most commonly utilized (He, 2022). Following the fundamental principle of the instrumental variable method, it is imperative to identify an exogenous variable that affects the endogenous variable (house prices) but does not directly affect the explained variable (innovation) (Shen et al., 2024). This paper opts for the previous year's state-owned construction land transfer area in each province and region as the instrumental variable for housing prices.

The rationale for this choice is as follows. (1) Regional land resource supply conditions have a direct impact on housing prices. Existing research suggests four primary pathways through which land supply influences housing prices: first, through its effect on the quantity of housing supply via the quantity of land supplied, subsequently impacting housing prices (Tian and Ma, 2009); second, through modifications to land resource supply policies that shape market entities' expectations of future housing prices, thereby impacting the supply and demand dynamics of the real estate market (He, 2023); third, through variations in the scale of land resource supply, which alter the elasticity of housing supply (El Araby, 2003); and fourth, through differing structures of land resource supply, which exert different impacts on housing prices (Barlow, 1993). (2) Regional land supply conditions affect housing prices without directly affecting innovation. China enforces a rigorous land use control system and farmland protection regulations under public land ownership, so land supply in a region is tightly regulated by both the central and provincial governments. In 1999, the Ministry of Land and Resources introduced the "Annual Land Use Plan Management Measures," subsequently revised in 2004, 2006, and 2016. These regulations stipulate that the Ministry of Natural Resources drafts a national annual land use plan based on two sets of recommendations: proposals for the overall control indicators of the national annual land use plan, on the one hand, and planning indicator suggestions from the provinces, autonomous regions, and municipalities directly under the Central Government, on the other. Following deliberation and approval by the State Council and the National People's Congress, the plan is officially implemented alongside the draft national economic and social development plan. Within this institutional framework, the land supply of each province must align with the annual land use plan. Because the plan is determined primarily by the central government, the land resource supply situation in each province influences housing prices without directly impacting innovation. (3) Utilizing the previous year's state-owned construction land transfer area mitigates the potential reverse channel by which changes in housing prices affect the land transfer area.
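A sketch of this identification strategy using the linearmodels package is shown below; the file and column names are hypothetical placeholders, while the structure (last year's land transfer area instrumenting house prices, with province and year dummies in the role of \(\delta_{i}\) and \(\omega_{t}\)) mirrors the setup described above:

```python
import pandas as pd
from linearmodels.iv import IV2SLS

# Hypothetical 31-province, 2003-2018 panel; column names are placeholders.
df = pd.read_csv("province_panel.csv")
df["land_supply_lag"] = df.groupby("province")["land_supply"].shift(1)

formula = ("inn_efficiency ~ 1 + per_gdp + hum_capital + ind_structure"
           " + imp_volume + exp_volume + tax_revenue"
           " + C(province) + C(year) + [hou_price ~ land_supply_lag]")
res = IV2SLS.from_formula(formula, data=df.dropna()).fit(cov_type="robust")
print(res.first_stage)   # first-stage diagnostics, cf. the F values in Table 5
print(res.summary)
```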

Variable selection

Explained variables

Regional innovation efficiency (\(Inn\_efficiency\)) is calculated using stochastic frontier analysis; the parameters \(\sigma^{2}\) and γ are significantly positive, and the one-sided LR test values pass the significance tests, demonstrating the appropriateness of the stochastic frontier model. Innovation output (\(Inn\_product\)) is quantified by the number of invention patent authorizations. Innovation investment encompasses innovation labor input (\(Inn\_labor\)) and innovation capital investment (\(Inn\_capital\)): labor input is gauged by the full-time equivalent of R&D personnel, and capital investment by internal expenditure on R&D funds.

Explanatory variable

House price (\(Hou\_price\)) is measured by the average sales price of commercial housing in each province and region.

Instrumental variable

Land supply (\(Land\_supply\)) is quantified by the previous year's state-owned construction land transfer area in each province and region.

Control variables

① The level of regional economic development is gauged by GDP per capita (\(Per\_gdp\)). ② Human capital level (\(Hum\_capital\)) is assessed by the density of universities in each province and region. ③ Industrial structure level (\(Ind\_structure\)) is measured by the ratio of the output value of the secondary industry to that of the tertiary industry. ④ The degree of openness to the outside world is measured by the import volume (\(Imp\_volume\)) and export volume (\(Exp\_volume\)) of each province and region. ⑤ Tax level is evaluated based on the tax revenue (\(Tax\_revenue\)) of each province and region.

Moreover, recognizing that certain control variables in the econometric equation may be subject to reverse causality with innovation, this study employs lagged control variables to mitigate it. Acknowledging that the correlation between housing prices and innovation could be influenced by unobservable factors, the econometric equation incorporates dummy variables for province and year fixed effects.

Data sources and spatial analysis

Owing to data availability constraints, this study compiles panel data from 31 provinces and regions across the nation (excluding Hong Kong, Macao, and Taiwan) spanning the 16-year period from 2003 to 2018. These 31 provinces and regions are categorized into two primary groups: the eastern region and the central and western regions (Table 2). The regional classification follows the delineation established in the "Seventh Five-Year Plan" ratified by the Fourth Session of the Sixth National People's Congress in 1986. Broadly speaking, the eastern region comprises coastal areas with relatively advanced economic development, while the central and western regions encompass inland territories with comparatively lower levels of economic development.

Data on the number of invention patents granted, the full-time equivalent of R&D personnel, and internal expenditure on R&D funds are sourced from successive editions of the "China Science and Technology Statistical Yearbook". The average sales price of commercial housing is sourced from the "China Real Estate Statistical Yearbook". Data on the transfer area of state-owned construction land are obtained from the "China Land and Resources Yearbook". The remaining data are derived from the "China Statistical Yearbook", the statistical yearbooks of the provinces and regions, and the National Economic and Social Development Statistical Bulletins. Descriptive statistics of the relevant variables are presented below (Table 3).

After completing the descriptive statistics, to illustrate the strength and direction of the linear relationships between variables, this paper generates a Pearson correlation coefficient matrix for the relevant variables (Fig. 2). The correlation coefficient ranges between −1 and 1, where 0 indicates no linear relationship, 1 a perfect positive correlation, and −1 a perfect negative correlation. The correlation coefficient between land supply and housing prices is −0.0527, indicating a negative correlation: a decrease in land supply is associated with an uptick in housing prices. This observation is consistent with the tilt of the land supply policy toward the central and western regions, which diminishes land availability in the eastern region and consequently pushes up housing prices there.
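A correlation matrix of this kind can be reproduced with a few lines of pandas and seaborn (a sketch with hypothetical file and column names):

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

df = pd.read_csv("province_panel.csv")   # hypothetical panel file
cols = ["land_supply", "hou_price", "inn_efficiency", "per_gdp", "tax_revenue"]
corr = df[cols].corr(method="pearson")   # coefficients lie in [-1, 1]
sns.heatmap(corr, annot=True, fmt=".2f", vmin=-1, vmax=1, cmap="vlag")
plt.title("Pearson correlation matrix (cf. Fig. 2)")
plt.tight_layout()
plt.show()
```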

Figure 2: Pearson correlation coefficient matrix plot.

To illustrate the spatial and temporal distribution of land supply, housing prices, and innovation, this paper focuses on three variables: construction land transfer area, average sales price of commercial housing, and the comprehensive utility value of regional innovation capabilities, with comparison years set at 2003, 2008, 2013, and 2018. Specifically, Beijing and Shanghai are selected from the eastern region, while Hubei and Chongqing represent the central and western regions.

Judging from the area of construction land transferred in China (Fig. 3), it is evident that around 2013 the government boosted land supply to the central and western provinces. The quantity of land supplied in these regions notably exceeds that in the eastern provinces, indicating a clear bias in China's land supply policy.

Figure 3: Regional distribution of construction land transfer area in China.

Based on the average sales price of commercial housing in China (Fig. 4), housing prices in the eastern, central, and western regions are generally on the rise. However, owing to the biased land supply policies, prices in the eastern region have risen faster than the comparatively slower increases observed in the central and western provinces.

Figure 4: Regional distribution of average sales price of commercial housing in China.

Analyzing the comprehensive utility values of China's regional innovation capabilities (Fig. 5), the regional innovation capabilities of most provinces in the eastern region exhibit a downward trend, whereas those of most provinces in the central and western regions have shown a slight increase. This pattern can be attributed to the land supply policy favoring the central and western regions, which has, to some extent, impeded the regional innovation capabilities of the eastern region. To verify the relationship between land supply, housing prices, and regional innovation more rigorously, further analysis is provided below.

Figure 5: Distribution of comprehensive utility value of China's regional innovation capabilities.

Analysis of empirical results

Analysis of regional innovation efficiency measurement results

This paper utilizes Frontier4.1 to estimate the parameters of the Stochastic Frontier Analysis model of innovation efficiency in each province and region (Table 4). The parameters \(\sigma^{2}\) and γ are both significantly positive, and the one-sided LR test passes the significance test, indicating the suitability of the Stochastic Frontier Analysis method for gauging China's regional innovation efficiency. Further calculation shows that the generalized likelihood ratio statistic \(L_{R}=-2\left\{\ln\left[L(H_{0})/L(H_{1})\right]\right\}\) equals 35.3615, passing the 1% significance level test. This indicates an excellent fit between the model and the data, affirming that the translog stochastic frontier model is suitable and adequate for the studied data.

After estimating the parameters of the Stochastic Frontier Analysis model for innovation efficiency in each province and region, innovation efficiency across the eastern region, the central and western regions, and the nation as a whole was calculated for each year (Fig. 6).

Figure 6: Evolution process of regional innovation efficiency in China.

At the national level, regional innovation efficiency shows an upward trend year by year, but the overall level remains low. From a regional perspective, both the eastern region and the central and western regions trend upward year by year, yet a pronounced regional imbalance remains. The innovation efficiency of the eastern region holds an absolute advantage over the central and western regions, and the gap is likely to widen further. The eastern region benefits from clear geographical advantages, a greater concentration of innovation factors, and a superior innovation and entrepreneurship environment; these factors likely contribute significantly to the spatial heterogeneity observed in regional innovation efficiency.

Benchmark regression results analysis

TSLS regression first-stage results

In the first-stage results of the Two-Stage Least Squares (TSLS) regression, column 1 reports the regression using panel data from all 31 provinces and regions, while columns 2 and 3 report the regressions using panel data from the 9 eastern provinces and regions and the 22 central and western provinces and regions, respectively (Table 5). In the instrumental variable estimation, the first-stage F values are 772.74, 391.20, and 320.00, respectively, all exceeding the critical value of 16.38 at the 10% bias level proposed by Stock & Yogo (2002). Hence, there is no weak instrumental variable problem, affirming the appropriateness of the instrumental variable chosen in this paper.

Analyzing the impact of regional land resource supply on housing prices, the regression results in column 1 show that, with the relevant influencing factors controlled at the national level, the transfer area of state-owned construction land in the previous year exhibits a significantly negative effect on current housing prices, with an impact coefficient of −0.1300. In other words, smaller areas of state-owned construction land transferred in the previous year correlate with higher current housing prices. This finding suggests that the nationwide bias of land supply toward the central and western regions has contributed to the escalation of housing prices. To probe the spatial heterogeneity of this influence, the regression outcomes in columns 2 and 3 show that the previous year's transfer area of state-owned construction land also exhibits a significantly negative effect on current housing prices in both the eastern and central-western regions. Moreover, the coefficient in the eastern region (−0.1476) is more negative than that in the central-western regions (−0.0792). This indicates that the bias of land supply toward the central and western regions pushes up housing prices not only in those areas but also in the eastern region; furthermore, the effect on housing prices in the eastern region surpasses that in the central and western regions, resulting in a swifter rise in housing prices in the east. These empirical findings substantiate research hypothesis 1: the land resource supply policy favoring the central-western regions has caused housing prices to rise in both the eastern and the central and western regions, with the eastern region experiencing a faster rate of increase.

TSLS regression second stage results

The second-stage results of the TSLS regression examine the influence of housing prices on innovation efficiency (Table 6). The regression outcomes in column 1 show that, with the relevant influencing factors controlled at the national level, the effect of housing prices on innovation efficiency is significantly negative, with a coefficient of −0.0757. This indicates that, across the nation, the swift escalation of housing prices has significantly impeded the enhancement of innovation levels.

We further investigate the spatial heterogeneity of this effect. The regression outcomes in columns 2 and 3 reveal that, even after accounting for various influencing factors, the effect of housing prices on innovation efficiency remains significantly negative in both the eastern region and the central and western regions. The regression coefficient in the eastern region is −0.0924, more negative than the coefficient of −0.0160 in the central and western regions. This indicates that the considerable increase in housing prices has notably hindered the enhancement of regional innovation efficiency, with the inhibitory effect being more pronounced in the eastern region. The inhibitory impact of housing prices on innovation efficiency is predominantly driven by the eastern region, characterized by constrained land supply and rapid housing price escalation. These results substantiate research hypothesis 2: faster housing price growth in the eastern region leads to a stronger inhibitory effect on innovation than in the central and western regions.

Robustness test with replacement of the explained variable

To ensure the reliability of the above conclusions, this paper employs a robustness test that substitutes the explained variable (Zhan et al., 2022). This approach offers a clearer view of how the variables influence outcomes, facilitating a more precise assessment of both the direction and magnitude of the causal relationships (Xie et al., 2023). We use data from the "China Regional Innovation and Entrepreneurship Index", released by the Peking University Open Research Data Platform, to reconstruct the measure of regional innovation efficiency (Footnote 2; Table 7).

In the second-stage TSLS results after substituting the explained variable, the coefficients of housing prices on innovation efficiency remain significantly negative across all samples, at −0.2622, −0.3290, and −0.2563, respectively. The inhibitory effect of housing prices on innovation efficiency in the eastern region is again stronger than in the central and western regions, confirming that this effect stems primarily from the eastern region, characterized by tighter land supply and faster-rising housing prices. The observed inhibitory effect of housing prices on innovation thus remains stable and significant under this alternative specification; the alignment between the benchmark regression outcomes and the robustness test underscores the reliability of the benchmark results.

Fixed effects model robustness test

After the parameter estimation of the Stochastic Frontier Analysis model and the benchmark regression analysis of regional innovation efficiency, this paper proceeds to a fixed effects model. The fixed effects model provides a more comprehensive view of the impact of the relevant factors on regional innovation efficiency and further validates the robustness of the preceding conclusions. The 31 provinces are categorized into the eastern region and the central and western regions, and the model incorporates province fixed effects to account for heterogeneity among provinces (Fig. 7).

Figure 7: Fixed effects model results.

The coefficient of house price (\(Hou\_price\)) is −0.0002424 and statistically significant (P < 0.05), suggesting that regional innovation efficiency decreases slightly as house prices rise. This implies that high housing prices exert a negative influence on regional innovation efficiency. The rationale is that elevated housing prices translate into heightened housing expenses, raising companies' costs of recruitment and employee welfare. Furthermore, high housing prices hinder population mobility, making it difficult for skilled individuals to relocate to regions with vibrant innovation scenes, thereby diminishing innovation efficiency in those areas. The coefficient for land supply (\(Land\_supply\)) is 0.0000707; although its statistical significance is marginal (P = 0.048), it indicates that land resource supply has a positive impact on regional innovation efficiency. An expansion in land resource supply can stimulate enterprise production and investment, consequently enhancing innovation efficiency, and can also draw more businesses and talent to the region, further fostering economic development and innovation capabilities. As a direct consequence, land resource supply policies favoring the central and western regions have increased land supply in those areas while reducing it in the eastern region, thereby inhibiting the improvement of regional innovation efficiency in the east.
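A fixed effects specification of this kind can be estimated with linearmodels' PanelOLS; the sketch below (hypothetical file and column names, as in the earlier sketch) absorbs province effects and clusters standard errors by province:

```python
import pandas as pd
from linearmodels.panel import PanelOLS

df = pd.read_csv("province_panel.csv")       # hypothetical panel file, as above
panel = df.set_index(["province", "year"])   # entity-time MultiIndex for PanelOLS
fe = PanelOLS.from_formula(
    "inn_efficiency ~ 1 + hou_price + land_supply + per_gdp + EntityEffects",
    data=panel,
).fit(cov_type="clustered", cluster_entity=True)
print(fe.summary)   # cf. the coefficients reported in Fig. 7
```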

Conclusions and recommendations

Research conclusions

After theoretical analysis and empirical testing, this paper examines the impact of biased land resource supply policies on housing prices and their subsequent influence on innovation, with particular attention to the spatial heterogeneity of this effect. The findings can be summarized as follows: (1) Policies favoring land resource supply in the central and western regions have contributed to the increase in housing prices across the eastern, central, and western regions. Notably, the eastern region has experienced a more rapid escalation than the central and western regions. Specifically, the impact of the transfer area of state-owned construction land on housing prices is significantly negative at both national and regional levels, with influence coefficients of −0.1300 nationally, −0.1476 in the eastern region, and −0.0792 in the central and western regions; the coefficient in the eastern region is more negative than that in the central and western regions. (2) The accelerated growth of housing prices in the eastern region exerts a more pronounced inhibitory impact on innovation than in the central and western regions. The effect of housing prices on innovation efficiency is significantly negative nationwide, in the eastern region, and in the central and western regions, with influence coefficients of −0.0757, −0.0924, and −0.0160, respectively; the coefficient in the eastern region is markedly more negative than that in the central and western regions. In the context of the innovation-driven development strategy, these findings provide policy guidance for stabilizing the real estate market and optimizing land policies.

Based on the above analysis, after 2003 the rapid rise in housing prices in the eastern region significantly hindered regional innovation. The underlying reason lies in the central government's land resource supply policies favoring the central and western regions. While intended to support the development of underdeveloped areas, these policies produced a rapid increase in housing prices in the eastern region, ultimately stifling innovation there. Indeed, the policy facilitated economic growth in underdeveloped regions for a time, but it also led to deteriorating efficiency and local debt problems, posing significant risks to the sustainable development of local economies. Relying solely on administrative measures for land resource allocation, while neglecting market forces and geographic factors, resulted in spatial misallocation of resources and adverse effects on high-quality economic development. At its core, this spatial misallocation stems from a divergence between government-led resource allocation and market-driven population movements. During this period, the eastern region continued to attract population inflows; the influx of people led to soaring housing demand, yet local government land fiscal policies resulted in insufficient land supply, exacerbating the already tense land supply-demand contradiction. The inadequate land supply thus crowded out human capital for innovation and R&D investment, drove up the costs of labor, capital, and other factors, and ultimately significantly stifled regional innovation.

Certainly, this study has limitations, but it serves as a foundation for further exploration. First, it focuses on the relationship between land supply, housing prices, and innovation in the eastern region versus the central and western regions, overlooking the interconnections among these variables across individual provinces; this warrants further investigation. Second, although this study shows that biased land supply policies affect regional housing prices and innovation differently, it does not delve into the specific reasons behind these differences. Future work could conduct a deeper comparative analysis of how factors such as economic development and policy disparities across regions shape housing prices and innovation. Finally, in addition to instrumental variable methods, comparative case studies could be employed to investigate how biased land supply policies influence regional housing prices and subsequently affect regional innovation.

Research recommendations

Research on the factors influencing regional housing prices and innovation is not only relevant to the implementation of innovation-driven development strategies and the construction of an innovative nation, but also crucial for the high-quality development of the national economy and the realization of the great “Chinese Dream”. Therefore, based on the findings of this study, the following recommendations are proposed:

Enhance land resource supply policies. The government can implement tailored land supply policies, finely adjusting regulations based on factors such as regional economic development (Niu, Chen, et al., 2023) and land resource utilization (Shen et al., 2023), thereby ensuring stability and innovation vitality in real estate markets across different areas. The "people, land, and money" linkage policy proposed during the 2016 Two Sessions links urban construction land quotas to the number of resettled agricultural migrants.

Strengthen land use planning and management. The government can adjust land supply structures and strategically allocate various land types through scientifically sound land use planning to meet the development needs of different regions (Feng et al., 2023). China's 14th Five-Year Plan emphasizes aligning the allocation of construction land indicators with population migration trends and maximizing the market's decisive role in resource allocation.

Foster policy coordination and interagency collaboration. The government should enhance communication and coordination among departments to establish a cohesive framework for land resource supply policies (Liu et al., 2024). Local governments should prioritize regional coordination and comprehensive planning in the formulation and execution of land supply policies, fostering a collaborative effort to promote balanced regional economic development (Feng et al., 2024).

Data availability

The online version of this work includes supplementary material available at https://doi.org/10.6084/m9.figshare.26539789.v1 . The data for this study originate from: (1) the National Bureau of Statistics of China and (2) the China Science and Technology Statistical Yearbook.

Footnote 1. Data source: https://www.pkulaw.com/ .

Footnote 2. Data source: https://doi.org/10.18170/DVN/PEFDAS .

Barlow J (1993) Controlling the housing land market—some examples from Europe. Urban Stud 30(7):1129–1149. https://doi.org/10.1080/00420989320081091


Battese GE, Coelli TJ (1995) A model for technical inefficiency effects in a stochastic frontier production function for panel data. Empir Econ 20:325–332


Beracha E, He Z, Wintoki MB, Xi Y (2022) On the relation between innovation and housing prices—a metro level analysis of the US market. J Real Estate Financ Econ 65(4):622–648. https://doi.org/10.1007/s11146-021-09852-2

Bleck A, Liu X (2018) Credit expansion and credit misallocation. J Monetary Econ 94:27–40. https://doi.org/10.1016/j.jmoneco.2017.09.012

Brown JR, Martinsson G, Petersen BC (2012) Do financing constraints matter for R&D?. Eur Econ Rev 56(8):1512–1529. https://doi.org/10.1016/j.euroecorev.2012.07.007

Buxton M, Taylor E (2011) Urban land supply, governance and the pricing of land. Urban Policy Res 29(1):5–22. https://doi.org/10.1080/08111146.2011.537605

Chen M, Zhu H, Sun Y, Jin R (2023) The impact of housing macroprudential policy on firm innovation: empirical evidence from China. Hum Soc Sci Commun 10(1):498. https://doi.org/10.1057/s41599-023-02010-4

Chen Z, Zhou Y, Haynes KE (2021) Change in land use structure in urban China: does the development of high-speed rail make a difference. Land Use Policy 111:104962. https://doi.org/10.1016/j.landusepol.2020.104962

Cheshire P (2004) The British housing market: contained and exploding. Urban Policy Res 22(1):13–22

Chu Z, Chen X, Cheng M, Zhao X, Wang Z (2023) Booming house prices: friend or foe of innovative firms? J Technol Transfer. https://doi.org/10.1007/s10961-023-10005-1

Danelon AF, Kumbhakar SC (2023) Estimating COVID-19 under-reporting through stochastic frontier analysis and official statistics: a case study of São Paulo State, Brazil. Socio-Econ Plan Sci 90:101753. https://doi.org/10.1016/j.seps.2023.101753

Ding Y, Chin L, Li F, Deng P, Cong S (2023) How do housing prices affect a city’s innovation capacity? The case of China. Technol Econ Dev Econ 29(5):1382–1404. https://doi.org/10.3846/tede.2023.18899

El Araby MM (2003) The role of the state in managing urban land supply and prices in Egypt. Habitat Int 27(3):429–458. https://doi.org/10.1016/s0197-3975(02)00068-1

Fan J, Liu D, Hu M, Zang Y (2023) How do housing prices affect innovation and entrepreneurship? Evidence from China. PLoS One 18(7):e0288199. https://doi.org/10.1371/journal.pone.0288199

Fan JS, Zhou L, Yu XF, Zhang YJ (2021) Impact of land quota and land supply structure on China’s housing prices: Quasi-natural experiment based on land quota policy adjustment. Land Use Policy 106:105452. https://doi.org/10.1016/j.landusepol.2021.105452

Fang X, Lv Y (2023) Housing prices and green innovation: evidence from Chinese enterprises. Manag Decis 61(11):3519–3544. https://doi.org/10.1108/md-03-2023-0368

Feng Y, Gao Y, Xia X, Shi K, Zhang C, Yang L, Cifuentes-Faura J (2024) Identifying the path choice of digital economy to crack the “resource curse” in China from the perspective of configuration. Resour Policy 91:104912. https://doi.org/10.1016/j.resourpol.2024.104912

Feng Y, Hu J, Afshan S, Irfan M, Hu M, Abbas S (2023) Bridging resource disparities for sustainable development: a comparative analysis of resource-rich and resource-scarce countries. Resour Policy 85:103981. https://doi.org/10.1016/j.resourpol.2023.103981

Grimes A, Aitken A (2010) Housing supply, land costs and price adjustment. Real Estate Econ 38(2):325–353. https://doi.org/10.1111/j.1540-6229.2010.00269.x

Han L, Kung JK-S (2015) Fiscal incentives and policy choices of local governments: evidence from China. J Dev Econ 116:89–104. https://doi.org/10.1016/j.jdeveco.2015.04.003

He B, Wang J, Wang J, Wang K (2018) The impact of government competition on regional R&D efficiency: does legal environment matter in China’s innovation system?. Sustainability 10(12):4401. https://doi.org/10.3390/su10124401

He Y (2022) Endogenous land supply policy, economic fluctuations and social welfare analysis in China. Land 11(9):1542. https://doi.org/10.3390/land11091542

He Y (2023) The optimal land supply policy of Chinese local government. Singap Econ Rev 68(05):1731–1750. https://doi.org/10.1142/s0217590819500644

Helpman E (1995) The size of regions. Foerder Institute for Economic Research

Hu FZY, Qian J (2017) Land-based finance, fiscal autonomy and land supply for affordable housing in urban China: A prefecture-level analysis. Land Use Policy 69:454–460. https://doi.org/10.1016/j.landusepol.2017.09.050

Hu FZY, Qian J (2022) The impact of housing price on entrepreneurship in Chinese cities: Does the start-up motivation matter? Cities 131:104045. https://doi.org/10.1016/j.cities.2022.104045

Hu M, Su Y, Ye W (2019) Promoting or inhibiting: The role of housing price in entrepreneurship. Technol Forecast Soc Change 148:119732. https://doi.org/10.1016/j.techfore.2019.119732

Huang J, Shen GQ, Zheng HW (2015) Is insufficient land supply the root cause of housing shortage? Empirical evidence from Hong Kong. Habitat Int 49:538–546. https://doi.org/10.1016/j.habitatint.2015.07.006

Kline P, Moretti E (2014) Local economic development, agglomeration economies, and the big push: 100 years of evidence from the Tennessee Valley Authority. Q J Econ 129(1):275–331

Krugman P (1991) Increasing returns and economic geography. J Political Econ 99(3):483–499

Larbi WO, Antwi A, Olomolaiye P (2004) Compulsory land acquisition in Ghana—policy and praxis. Land Use Policy 21(2):115–127. https://doi.org/10.1016/j.landusepol.2003.09.004

Li B, Li RYM, Wareewanich T (2021) Factors influencing large real estate companies’ competitiveness: a sustainable development perspective. Land 10(11):1239. https://doi.org/10.3390/land10111239

Li J, Lyu P, Jin C (2023) The impact of housing prices on regional innovation capacity: evidence from China. Sustainability 15(15):11868. https://doi.org/10.3390/su151511868

Li L, Bao HXH, Robinson GM (2020) The return of state control and its impact on land market efficiency in urban China. Land Use Policy 99:104878. https://doi.org/10.1016/j.landusepol.2020.104878

Li L, Wu X (2014) Housing prices and entrepreneurship in China. J Comp Econ 42(2):436–449. https://doi.org/10.1016/j.jce.2013.09.001

Li N, Li RYM (2024) A bibliometric analysis of six decades of academic research on housing prices. Int J Hous Mark Anal 17(2):307–328. https://doi.org/10.1108/ijhma-05-2022-0080

Li N, Li RYM, Nuttapong J (2022) Factors affect the housing prices in China: a systematic review of papers indexed in Chinese Science Citation Database. Prop Manag 40(5):780–796. https://doi.org/10.1108/pm-11-2020-0078

Liang W, Lu M, Zhang H (2016) Housing prices raise wages: estimating the unexpected effects of land supply regulation in China. J Hous Econ 33:70–81. https://doi.org/10.1016/j.jhe.2016.07.002

Lin X, Ren T, Wu H, Xiao Y (2021) Housing price, talent movement, and innovation output: evidence from Chinese cities. Rev Dev Econ 25(1):76–103. https://doi.org/10.1111/rode.12705

Ling X, Luo Z, Feng Y, Liu X, Gao Y (2023) How does digital transformation relieve the employment pressure in China? Empirical evidence from the national smart city pilot policy. Hum Soc Sci Commun 10(1):617. https://doi.org/10.1057/s41599-023-02131-w

Liu F, Liu G, Wang X, Feng Y (2024) Whether the construction of digital government alleviate resource curse? Empirical evidence from Chinese cities. Resour Policy 90:104811. https://doi.org/10.1016/j.resourpol.2024.104811

Magliocca NR, Khuc QV, de Bremond A, Ellicott EA (2020) Direct and indirect land-use change caused by large-scale land acquisitions in Cambodia. Environ Res Lett 15(2):024010. https://doi.org/10.1088/1748-9326/ab6397

Meen G, Nygaard C (2011) Local housing supply and the impact of history and geography. Urban Stud 48(14):3107–3124. https://doi.org/10.1177/0042098010394689

Miao J, Wang P (2014) Sectoral bubbles, misallocation, and endogenous growth. J Math Econ 53:153–163. https://doi.org/10.1016/j.jmateco.2013.12.003

Monk S, Pearce BJ, Whitehead CME (1996) Land-use planning, land supply, and house prices. Environ Plan A 28(3):495–511. https://doi.org/10.1068/a280495

Moretti E, Perloff JM (2002) Efficiency wages, deferred payments, and direct incentives in agriculture. Am J Agric Econ 84(4):1144–1155. https://doi.org/10.1111/1467-8276.00060

Niu S, Chen Y, Zhang R, Luo R, Feng Y (2023) Identifying and assessing the global causality among energy poverty, educational development, and public health from a novel perspective of natural resource policy optimization. Resour Policy 83:103770. https://doi.org/10.1016/j.resourpol.2023.103770

Niu S, Zhang J, Luo R, Feng Y (2023) How does climate policy uncertainty affect green technology innovation at the corporate level? New evidence from China. Environ Res 237:117003. https://doi.org/10.1016/j.envres.2023.117003

Oh S, Lee J, Choi B-G (2021) The collateral channel: Dynamic effects of housing market on entrepreneurship?. Econ Lett 198:109661. https://doi.org/10.1016/j.econlet.2020.109661

Ottaviano G, Robert-Nicoud F, Baldwin R, Forslid R, Martin P (2011) Economic geography and public policy. Princeton University Press

Qin Y, Zhu H, Zhu R (2016) Changes in the distribution of land prices in urban China during 2007–2012. Regional Sci Urban Econ 57:77–90. https://doi.org/10.1016/j.regsciurbeco.2016.02.002

Rong Z, Wang W, Gong Q (2016) Housing price appreciation, investment opportunity, and firm innovation: evidence from China. J Hous Econ 33:34–58. https://doi.org/10.1016/j.jhe.2016.04.002

Rubin Z, Felsenstein D (2017) Supply side constraints in the Israeli housing market: the impact of state-owned land. Land Use Policy 65:266–276. https://doi.org/10.1016/j.landusepol.2017.04.002

Saiz A (2010) The geographic determinants of housing supply. Q J Econ 125(3):1253–1296

Sartori PJ, Schons SZ, Barrett S (2024) A stochastic production frontier analysis of factors that affect productivity and efficiency of logging businesses in Virginia. J Forestry. https://doi.org/10.1093/jofore/fvae006

Shen Q, Pan Y, Meng X, Ling X, Hu S, Feng Y (2023) How does the transition policy of mineral resource-exhausted cities affect the process of industrial upgrading? New empirical evidence from China. Resour Policy 86:104226. https://doi.org/10.1016/j.resourpol.2023.104226

Shen Q, Wu R, Pan Y, Feng Y (2024) Explaining and modeling the impacts of inclusive finance on CO2 emissions in China integrated the intermediary role of energy poverty. Hum Soc Sci Commun 11(1):82. https://doi.org/10.1057/s41599-023-02595-w

Shen X, Huang X, Li H, Li Y, Zhao X (2018) Exploring the relationship between urban land supply and housing stock: evidence from 35 cities in China. Habitat Int 77:80–89. https://doi.org/10.1016/j.habitatint.2018.01.005

Stock JH, Yogo M (2002) Testing for weak instruments in linear IV regression. National Bureau of Economic Research, Cambridge, MA, USA

Tian L, Ma W (2009) Government intervention in city development of China: a tool of land supply. Land Use Policy 26(3):599–609. https://doi.org/10.1016/j.landusepol.2008.08.012

Tse RYC (1998) Housing price, land supply and revenue from land sales. Urban Stud 35(8):1377–1392. https://doi.org/10.1080/0042098984411

Wang Q, Zhu H, Sun P (2023) Land supply and house prices in China. Appl Econ Lett. https://doi.org/10.1080/13504851.2023.2187026

Wang X, Wen Y (2012) Housing prices and the high Chinese saving rate puzzle. China Econ Rev 23(2):265–283. https://doi.org/10.1016/j.chieco.2011.11.003

Wang Y, Hu FZY (2023) Housing market booms in Chinese cities: boon or bane for urban entrepreneurship? J Asian Public Policy 16(2):199–220. https://doi.org/10.1080/17516234.2021.1976984

Xie J, Gu RR, Lei TY, Yang S, Yu RA (2023) Chairman’s Communist Party of China member status and targeted poverty alleviation: evidence from China. PLoS One 18(6):e0284692. https://doi.org/10.1371/journal.pone.0284692

Xu L, Chen S, Tian S (2022) The mechanism of land registration program on land transfer in rural China: considering the effects of livelihood security and agricultural management incentives. Land 11(8):1347. https://doi.org/10.3390/land11081347

Yan S, Ge XJ, Wu Q (2014) Government intervention in land market and its impacts on land supply and new housing supply: evidence from major Chinese markets. Habitat Int 44:517–527. https://doi.org/10.1016/j.habitatint.2014.10.009

Yan Z, Yu Y, Du K, Zhang N (2024) How does environmental regulation promote green technology innovation? Evidence from China’s total emission control policy. Ecol Econ 219:108137. https://doi.org/10.1016/j.ecolecon.2024.108137

Yang L, Wang J, Feng Y, Wu Q (2022) The impact of the regional differentiation of land supply on total factor productivity in China: from the perspective of total factor productivity decomposition. Land 11(10):1859. https://doi.org/10.3390/land11101859

Yii K-J, Tan C-T, Ho W-K, Kwan X-H, Nerissa F-TS, Tan Y-Y, Wong K-H (2022) Land availability and housing price in China: empirical evidence from nonlinear autoregressive distributed lag (NARDL). Land Use Policy 113:105888. https://doi.org/10.1016/j.landusepol.2021.105888

Yu L, Cai Y (2021) Do rising housing prices restrict urban innovation vitality? Evidence from 288 cities in China. Econ Anal Policy 72:276–288. https://doi.org/10.1016/j.eap.2021.08.012

Zhan XG, Li RYM, Liu XY, He F, Wang MT, Qin Y, Liao WYY (2022) Fiscal decentralisation and green total factor productivity in China: SBM-GML and IV model approaches. Front Environ Sci 10:989194. https://doi.org/10.3389/fenvs.2022.989194

Zhang J, Cheng C, Feng Y (2024) The heterogeneous drivers of CO2 emissions in China’s two major economic belts: new evidence from spatio-temporal analysis. Environ Dev Sustain 26(4):10653–10679. https://doi.org/10.1007/s10668-023-03169-1

Acknowledgements

We thank all participants enrolled in this study for their collaboration. All authors have read and agreed to the published version of the manuscript.

Author information

Authors and Affiliations

School of Political Science and Public Administration, Wuhan University, Wuhan, China

Jinsi Liu, Shengjiao Zhu & Shixiang Chen

Local Government Public Service Innovation Research Center, Wuhan University, Wuhan, China

College of Public Administration, Huazhong Agricultural University, Wuhan, China

Contributions

Conceptualization, methodology, writing—original draft preparation, writing—review and editing: JL and HX; software: SZ; validation: JL and SC; formal analysis: JL and SZ; investigation, resources, project administration, funding acquisition, visualization, supervision: SC; data curation: JL.

Corresponding author

Correspondence to Shixiang Chen.

Ethics declarations

Competing interests

The authors declare no competing interests.

Ethical approval

Ethical approval was not required for this study, as it did not involve human participants. No primary surveys were conducted; all data were obtained from official reports and databases.

Informed consent

Informed consent was not required as the study did not involve human participants.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/ .

About this article

Cite this article

Liu, J., Xiang, H., Zhu, S. et al. Spatial heterogeneity analysis of biased land resource supply policies on housing prices and innovation efficiency. Humanit Soc Sci Commun 11, 1180 (2024). https://doi.org/10.1057/s41599-024-03702-1

Received: 12 March 2024

Accepted: 20 August 2024

Published: 10 September 2024

DOI: https://doi.org/10.1057/s41599-024-03702-1

Empirical Research

  • First Online: 08 May 2024

  • Claes Wohlin,
  • Per Runeson,
  • Martin Höst,
  • Magnus C. Ohlsson,
  • Björn Regnell &
  • Anders Wesslén

This chapter presents a decision-making structure for determining an appropriate research design for a specific study. A selection of research approaches is introduced to help illustrate the decision-making structure. The research approaches are described briefly to provide a basic understanding of different options. Moreover, the chapter discusses how different research approaches may be used in a research project or when, for example, pursuing PhD studies.
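
To make the idea of a decision-making structure concrete, the sketch below encodes a hypothetical, highly simplified set of decision points as code. The decision points, their ordering, and the suggested research approaches are invented for illustration only; they are not the chapter's actual structure.

```python
# A deliberately simplified, hypothetical decision structure for choosing a
# research approach. Illustrative only; it does not reproduce the chapter's
# actual decision points.
from dataclasses import dataclass

@dataclass
class StudyContext:
    aggregates_prior_studies: bool  # is the goal to synthesize existing evidence?
    can_control_variables: bool     # can the researcher manipulate the studied factors?
    needs_realistic_setting: bool   # must the study run in a real project context?

def suggest_approach(ctx: StudyContext) -> str:
    """Walk the (hypothetical) decision points and return a suggested approach."""
    if ctx.aggregates_prior_studies:
        return "systematic literature review"
    if ctx.can_control_variables and not ctx.needs_realistic_setting:
        return "controlled experiment"
    if ctx.needs_realistic_setting:
        return "case study"
    return "survey"

# Example: studying a phenomenon in its real-life industrial context.
print(suggest_approach(StudyContext(False, False, True)))  # -> case study
```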

The term “investigation” is used as a more general term than a specific study.

It is sometimes also referred to as a review or a code review, if reviewing code. However, we have chosen to use the term “inspection” to avoid mixing it up with a systematic literature review.

In vitro: Latin for “in the glass”; refers to chemical experiments in a test tube.

In vivo: Latin for “in life”; refers to experiments in a real environment.

Author information

Authors and Affiliations

Blekinge Institute of Technology, Karlskrona, Sweden

Claes Wohlin

Department of Computer Science, Lund University, Lund, Sweden

Per Runeson & Björn Regnell

Faculty of Technology and Society, Malmö University, Malmö, Sweden

Martin Höst

System Verification Sweden AB, Malmö, Sweden

Magnus C. Ohlsson

Ericsson AB, Lund, Sweden

Anders Wesslén

Copyright information

© 2024 The Author(s), under exclusive license to Springer-Verlag GmbH, DE, part of Springer Nature

About this chapter

Wohlin, C., Runeson, P., Höst, M., Ohlsson, M.C., Regnell, B., Wesslén, A. (2024). Empirical Research. In: Experimentation in Software Engineering. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-69306-3_2

DOI: https://doi.org/10.1007/978-3-662-69306-3_2

Published: 08 May 2024

Publisher Name: Springer, Berlin, Heidelberg

Print ISBN: 978-3-662-69305-6

Online ISBN: 978-3-662-69306-3

eBook Packages: Computer Science, Computer Science (R0)

