Empirical Research: Definition, Methods, Types and Examples

Empirical research: Definition

Empirical research is defined as any research in which the conclusions of the study are drawn strictly from concrete empirical evidence, that is, from “verifiable” evidence.

This empirical evidence can be gathered using quantitative market research and qualitative market research methods.

For example: Research is being conducted to find out whether listening to happy music in the workplace promotes creativity. An experiment is conducted using a music website survey on one group that is exposed to happy music and another group that does not listen to music at all, and the subjects are then observed. The results of such research provide empirical evidence of whether happy music promotes creativity or not.

You must have heard the quote “I will not believe it unless I see it”. This idea came from the ancient empiricists; it is a fundamental understanding that powered the emergence of science during the Renaissance period and laid the foundation of modern science as we know it today. The word itself has its roots in Greek: it is derived from the Greek word empeirikos, which means “experienced”.

In today’s world, the word empirical refers to the collection of data using evidence gathered through observation or experience, or by using calibrated scientific instruments. All of the above origins have one thing in common: a dependence on observation and experiments to collect data and test it in order to reach conclusions.

Types and methodologies of empirical research

Empirical research can be conducted and analysed using qualitative or quantitative methods.

  • Quantitative research: Quantitative research methods are used to gather information through numerical data. They are used to quantify opinions, behaviors or other defined variables. These methods are predetermined and follow a more structured format. Some of the commonly used methods are surveys, longitudinal studies, polls, etc.
  • Qualitative research: Qualitative research methods are used to gather non-numerical data. They are used to find meanings, opinions, or the underlying reasons from the subjects. These methods are unstructured or semi-structured. The sample size for such research is usually small, and the method is conversational in order to provide more insight or in-depth information about the problem. Some of the most popular methods are focus groups, experiments, interviews, etc.

Data collected from these methods will need to be analysed. Empirical evidence can be analysed either quantitatively or qualitatively. Using this analysis, the researcher can answer empirical questions, which have to be clearly defined and answerable with the findings obtained. The type of research design used will vary depending on the field in which it is going to be used. Many researchers choose a combined approach involving both quantitative and qualitative methods to better answer questions which cannot be studied in a laboratory setting.

Quantitative research methods aid in analyzing the empirical evidence gathered. By using these a researcher can find out if his hypothesis is supported or not.

  • Survey research: Survey research generally involves a large audience in order to collect a large amount of data. It is a quantitative method with a predetermined set of closed questions which are easy to answer. Because of the simplicity of this method, high response rates are achieved. It is one of the most commonly used methods for all kinds of research in today’s world.

Previously, surveys were conducted only face to face, perhaps with a recorder. However, with advances in technology and for ease of use, new mediums such as email and social media have emerged.

For example: Depletion of energy resources is a growing concern and hence there is a need for awareness about renewable energy. According to recent studies, fossil fuels still account for around 80% of energy consumption in the United States. Even though the use of green energy rises every year, certain factors still keep the general population from opting for it. In order to understand why, a survey can be conducted to gather the opinions of the general population about green energy and the factors that influence their choice of switching to renewable energy. Such a survey can help institutions or governing bodies to promote appropriate awareness and incentive schemes to push the use of greener energy.

  • Experimental research: In experimental research, an experiment is set up and a hypothesis is tested by creating a situation in which one of the variables is manipulated. This method is also used to check cause and effect. The researcher observes what happens to the dependent variable when the independent variable is removed or altered. The process for such a method is usually proposing a hypothesis, experimenting on it, analyzing the findings and reporting them to understand whether they support the theory or not.

For example: A product company is trying to find out why it has not been able to capture the market. So the organisation makes changes in each of its processes, such as manufacturing, marketing, sales and operations. Through the experiment it learns that sales training directly impacts the market coverage of its product. If the salesperson is trained well, then the product will have better coverage.

  • Correlational research: Correlational research is used to find the relationship between two sets of variables. Regression analysis is generally used to predict outcomes with this method. The correlation can be positive, negative or zero (no correlation).

For example: Individuals with more education tend to get higher-paying jobs, and individuals with less education tend to get lower-paying jobs. This is a positive correlation between education level and pay; on its own, it does not establish that education causes the higher pay.
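To make the education and pay example concrete, the following is a minimal sketch of how a correlation coefficient and a simple regression line could be computed. The data values and the function names pearson_r and fit_line are hypothetical and are used only for illustration; they do not come from any real study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: years of education and annual salary (in thousands).
years_of_education = [10, 12, 14, 16, 18]
salary_thousands = [30, 38, 45, 58, 64]

r = pearson_r(years_of_education, salary_thousands)
slope, intercept = fit_line(years_of_education, salary_thousands)

print(f"correlation r = {r:.3f}")
print(f"predicted salary = {slope:.2f} * years + ({intercept:.2f})")
```

A value of r close to +1 indicates a strong positive correlation, a value close to -1 a strong negative one, and a value near 0 little or no linear relationship. The fitted line can be used for prediction, but it describes an association only.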

  • Longitudinal study: A longitudinal study is used to understand the traits or behavior of a subject under observation by repeatedly testing the subject over a period of time. Data collected with such a method can be qualitative or quantitative in nature.

For example: Research to find out the benefits of exercise. The participants are asked to exercise every day for a particular period of time, and the results show higher endurance, stamina, and muscle growth. This supports the claim that exercise benefits an individual’s body.

  • Cross sectional: A cross sectional study is an observational method in which a set of people is observed at a given point in time. The people are chosen so that they are similar in all variables except the one being researched. This type of study does not enable the researcher to establish a cause and effect relationship, as the subjects are not observed over a continuous time period. It is mainly used in the healthcare sector and the retail industry.

For example: A medical study to find the prevalence of under-nutrition disorders in kids of a given population. This will involve looking at a wide range of parameters like age, ethnicity, location, income and social background. If a significant number of kids coming from poor families show under-nutrition disorders, the researcher can investigate further. Usually a cross sectional study is followed by a longitudinal study to find out the exact reason.

  • Causal-Comparative research: This method is based on comparison. It is mainly used to find out the cause-effect relationship between two or more variables.

For example: A researcher measured the productivity of employees in a company which gave breaks to the employees during work and compared that to the employees of the company which did not give breaks at all.
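A comparison like the one above usually comes down to comparing the averages of two groups. The following is a minimal sketch, assuming made-up productivity scores for five employees in each company; the function name welch_t and all of the numbers are hypothetical. It computes the difference in means and Welch's t statistic, which does not assume that both groups have the same variance.

```python
import math
import statistics


def welch_t(group_a, group_b):
    """Welch's t statistic for the difference between two group means."""
    mean_a = statistics.mean(group_a)
    mean_b = statistics.mean(group_b)
    # statistics.variance returns the sample variance (n - 1 denominator).
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    standard_error = math.sqrt(var_a / len(group_a) + var_b / len(group_b))
    return (mean_a - mean_b) / standard_error


# Hypothetical productivity scores (units completed per day).
with_breaks = [52, 55, 58, 60, 55]
without_breaks = [48, 50, 47, 53, 52]

mean_difference = statistics.mean(with_breaks) - statistics.mean(without_breaks)
t_statistic = welch_t(with_breaks, without_breaks)

print(f"difference in means = {mean_difference}")
print(f"Welch t statistic = {t_statistic:.2f}")
```

A large t statistic suggests the difference is unlikely to be due to chance alone. Because the groups were not randomly assigned, other differences between the two companies could still explain the result.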

Some research questions need to be analysed qualitatively, as quantitative methods are not applicable there. In many cases, in-depth information is needed or a researcher may need to observe the behavior of a target audience, so the results take the form of a descriptive analysis. Qualitative research results will be descriptive rather than predictive. They enable the researcher to build or support theories for potential future quantitative research. In such a situation, qualitative research methods are used to derive a conclusion to support the theory or hypothesis being studied.

  • Case study: The case study method is used to find more information by carefully analyzing existing cases. It is very often used for business research or to gather empirical evidence for investigation purposes. It is a method of investigating a problem within its real-life context through existing cases. The researcher has to analyse carefully, making sure the parameters and variables in the existing case are the same as in the case being investigated. Using the findings from the case study, conclusions can be drawn regarding the topic being studied.

For example: A report describing the solution a company provided to its client, the challenges faced during initiation and deployment, the findings of the case and the solutions offered for the problems. Such case studies are used by most companies, as they form empirical evidence the company can promote in order to get more business.

  • Observational method: The observational method is a process of observing and gathering data from its target. Since it is a qualitative method, it is time consuming and very personal. It can be said that the observational research method is a part of ethnographic research, which is also used to gather empirical evidence. This is usually a qualitative form of research; however, in some cases it can be quantitative as well, depending on what is being studied.

For example: Setting up research to observe a particular animal in the rainforests of the Amazon. Such research usually takes a lot of time, as observation has to be done for a set amount of time to study patterns or behavior of the subject. Another example used widely nowadays is observing people shopping in a mall to figure out the buying behavior of consumers.

  • One-on-one interview: This method is purely qualitative and one of the most widely used. The reason is that it enables a researcher to get precise, meaningful data if the right questions are asked. It is a conversational method where in-depth data can be gathered depending on where the conversation leads.

For example: A one-on-one interview with the finance minister to gather data on the financial policies of the country and their implications for the public.

  • Focus groups: Focus groups are used when a researcher wants to find answers to why, what and how questions. A small group is generally chosen for such a method and it is not necessary to interact with the group in person. A moderator is generally needed in case the group is being addressed in person. This is widely used by product companies to collect data about their brands and the product.

For example: A mobile phone manufacturer wants feedback on the dimensions of one of its models which is yet to be launched. Such studies help the company meet the demand of the customer and position the model appropriately in the market.

  • Text analysis: The text analysis method is relatively new compared to the other types. It is used to analyse social life by going through the images or words used by an individual. In today’s world, with social media playing a major part in everyone’s life, such a method enables the researcher to follow the patterns that relate to the study.

For example: A lot of companies ask customers for detailed feedback on how satisfied they are with the customer support team. Such data enables the researcher to take appropriate decisions to make the support team better.
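One very simple form of text analysis is counting which words appear most often in customer feedback. The following is a minimal sketch under stated assumptions: the feedback comments, the small stop-word list and the function name term_frequencies are all hypothetical, and real text analysis tools do considerably more than this.

```python
import re
from collections import Counter

# A tiny, hypothetical stop-word list; real lists are much longer.
STOP_WORDS = {"the", "was", "to", "but", "and", "a"}


def term_frequencies(comments):
    """Count how often each non-stop-word appears across all comments."""
    counts = Counter()
    for comment in comments:
        words = re.findall(r"[a-z']+", comment.lower())
        counts.update(word for word in words if word not in STOP_WORDS)
    return counts


# Hypothetical customer feedback about a support team.
feedback = [
    "The support team was slow to respond.",
    "Slow response, but the agent was friendly.",
    "Friendly support and a quick fix.",
]

frequencies = term_frequencies(feedback)
for word, count in frequencies.most_common(3):
    print(word, count)
```

Words that recur across many comments, such as "slow" or "friendly" in this made-up data, point the researcher to themes worth reading in more detail.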

Sometimes a combination of methods is needed for questions that cannot be answered using only one type of method, especially when a researcher needs to gain a complete understanding of complex subject matter.

Since empirical research is based on observation and capturing experiences, it is important to plan the steps to conduct the experiment and how to analyse it. This will enable the researcher to resolve problems or obstacles which can occur during the experiment.

Step #1: Define the purpose of the research

This is the step where the researcher has to answer questions like: What exactly do I want to find out? What is the problem statement? Are there any issues in terms of the availability of knowledge, data, time or resources? Will the benefits of this research outweigh its costs?

Before going ahead, a researcher has to clearly define his purpose for the research and set up a plan to carry out further tasks.

Step #2 : Supporting theories and relevant literature

The researcher needs to find out if there are theories which can be linked to his research problem. He has to figure out if any theory can help him support his findings. All kinds of relevant literature will help the researcher to find out whether others have researched this before and what problems they faced during their research. The researcher will also have to set up assumptions and find out if there is any history regarding his research problem.

Step #3: Creation of Hypothesis and measurement

Before beginning the actual research, the researcher needs to formulate a working hypothesis, that is, a guess at the probable result. The researcher has to set up variables, decide the environment for the research and find out how the variables relate to each other.

The researcher will also need to define the units of measurement and the tolerable degree of error, and find out if the measurement chosen will be acceptable to others.

Step #4: Methodology, research design and data collection

In this step, the researcher has to define a strategy for conducting the research. He has to set up experiments to collect data which will enable him to test the hypothesis. The researcher will decide whether an experimental or non-experimental method is needed for conducting the research. The type of research design will vary depending on the field in which the research is being conducted. Last but not least, the researcher will have to identify the parameters that will affect the validity of the research design. Data collection will need to be done by choosing appropriate samples depending on the research question. To carry out the research, he can use one of the many sampling techniques. Once data collection is complete, the researcher will have empirical data which needs to be analysed.

Step #5: Data Analysis and result

Data analysis can be done in two ways: qualitatively and quantitatively. The researcher will need to decide which qualitative or quantitative method is needed, or whether a combination of both is required. Depending on the analysis of the data, he will know whether his hypothesis is supported or rejected. Analyzing this data is the most important step in supporting or rejecting the hypothesis.

Step #6: Conclusion

A report will need to be made with the findings of the research. The researcher can give the theories and literature that support his research. He can make suggestions or recommendations for further research on his topic.

Empirical research methodology cycle

A.D. de Groot, a famous Dutch psychologist and chess expert, conducted some of the most notable experiments using chess in the 1940s. During his study, he came up with a cycle which is consistent and is now widely used to conduct empirical research. It consists of 5 phases, each as important as the next. The empirical cycle captures the process of coming up with hypotheses about how certain subjects work or behave and then testing these hypotheses against empirical data in a systematic and rigorous way. It can be said that it characterizes the deductive approach to science. The empirical cycle is as follows.

  • Observation: In this phase an idea is sparked for proposing a hypothesis, and empirical data is gathered using observation. For example: A particular species of flower blooms in a different color only during a specific season.
  • Induction: Inductive reasoning is then carried out to form a general conclusion from the data gathered through observation. For example: As stated above, it is observed that the species of flower blooms in a different color during a specific season. A researcher may ask the question “does the temperature in the season cause the color change in the flower?” He can assume that is the case; however, it is a mere conjecture, and hence an experiment needs to be set up to support this hypothesis. So he tags a few sets of flowers, keeps them at a different temperature and observes whether they still change color.
  • Deduction: This phase helps the researcher to deduce a conclusion from his experiment. It has to be based on logic and rationality to arrive at specific, unbiased results. For example: In the experiment, if the tagged flowers in a different temperature environment do not change color, then it can be concluded that temperature plays a role in changing the color of the bloom.
  • Testing: In this phase the researcher returns to empirical methods to put his hypothesis to the test. The researcher now needs to make sense of his data and hence needs to use statistical analysis plans to determine the relationship between temperature and bloom color. If the researcher finds that most flowers bloom in a different color when exposed to a certain temperature and do not when the temperature is different, he has found support for his hypothesis. Please note that this is not proof but only support for his hypothesis.
  • Evaluation: This phase is generally forgotten by most but is an important one for continuing to gain knowledge. During this phase the researcher puts forth the data he has collected, the supporting argument and his conclusion. The researcher also states the limitations of the experiment and his hypothesis, and offers suggestions so that others can pick it up and continue with more in-depth research in the future.
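The testing phase of the flower example can be sketched with a small calculation. The following is a minimal illustration, assuming made-up counts of flowers that changed color in two temperature groups; the function name two_proportion_z and all of the numbers are hypothetical. It computes a two-proportion z statistic, one common way to check whether two observed proportions differ by more than chance would suggest.

```python
import math


def two_proportion_z(changed_a, total_a, changed_b, total_b):
    """z statistic for the difference between two proportions (pooled)."""
    p_a = changed_a / total_a
    p_b = changed_b / total_b
    pooled = (changed_a + changed_b) / (total_a + total_b)
    standard_error = math.sqrt(
        pooled * (1 - pooled) * (1 / total_a + 1 / total_b)
    )
    return (p_a - p_b) / standard_error


# Hypothetical counts: flowers that changed color out of those observed.
seasonal_changed, seasonal_total = 18, 20  # kept at the seasonal temperature
altered_changed, altered_total = 5, 20     # kept at a different temperature

z = two_proportion_z(
    seasonal_changed, seasonal_total, altered_changed, altered_total
)
print(f"z statistic = {z:.2f}")
```

A z statistic far from zero supports the hypothesis that temperature is related to the color change. As the testing phase itself notes, this is support for the hypothesis and not proof.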

There is a reason why empirical research is one of the most widely used methods. There are a few advantages associated with it. Following are a few of them.

  • It is used to authenticate traditional research through various experiments and observations.
  • This research methodology makes the research being conducted more competent and authentic.
  • It enables a researcher to understand the dynamic changes that can happen and change his strategy accordingly.
  • The level of control in such research is high, so the researcher can control multiple variables.
  • It plays a vital role in increasing internal validity.

Even though empirical research makes the research more competent and authentic, it does have a few disadvantages. Following are a few of them.

  • Such research needs patience as it can be very time consuming. The researcher has to collect data from multiple sources and quite a few parameters are involved, which makes the research time consuming.
  • Most of the time, a researcher will need to conduct research at different locations or in different environments, which can make it expensive.
  • Experiments can be performed only under certain rules, and hence permissions are needed. Many times, it is very difficult to get the permissions needed to carry out certain methods of this research.
  • Collection of data can sometimes be a problem, as it has to be collected from a variety of sources through different methods.

Empirical research is important in today’s world because most people believe only in something that they can see, hear or experience. It is used to validate multiple hypotheses and to increase human knowledge, and it continues to do so to keep advancing various fields.

For example: Pharmaceutical companies use empirical research to try out a specific drug on controlled groups or random groups to study cause and effect. This way, they find support for certain theories they had proposed for the specific drug. Such research is very important, as sometimes it can lead to finding a cure for a disease that has existed for many years. It is useful in science and in many other fields like history, social sciences, business, etc.

With the advancement in today’s world, empirical research has become critical and a norm in many fields to support their hypothesis and gain more knowledge. The methods mentioned above are very useful for carrying out such research. However, a number of new methods will keep coming up as the nature of new investigative questions keeps getting unique or changing.

Purdue University

Research: Overview & Approaches

Introduction to Empirical Research

  • Introductory Video This video covers what empirical research is, what kinds of questions and methods empirical researchers use, and some tips for finding empirical research articles in your discipline.

Video Tutorial

  • Guided Search: Finding Empirical Research Articles This is a hands-on tutorial that will allow you to use your own search terms to find resources.

Google Scholar Search

  • Study on radiation transfer in human skin for cosmetics
  • Long-Term Mobile Phone Use and the Risk of Vestibular Schwannoma: A Danish Nationwide Cohort Study
  • Emissions Impacts and Benefits of Plug-In Hybrid Electric Vehicles and Vehicle-to-Grid Services
  • Review of design considerations and technological challenges for successful development and deployment of plug-in hybrid electric vehicles
  • Endocrine disrupters and human health: could oestrogenic chemicals in body care cosmetics adversely affect breast cancer incidence in women?

  • Last Updated: May 29, 2024 3:30 PM
  • URL: https://guides.lib.purdue.edu/research_approaches

Penn State University Libraries

Empirical research in the social sciences and education.

Contact the Librarian at your campus for more help!

Ellysa Cahoy

Introduction: What is Empirical Research?

Empirical research is based on observed and measured phenomena and derives knowledge from actual experience rather than from theory or belief. 

How do you know if a study is empirical? Read the subheadings within the article, book, or report and look for a description of the research "methodology."  Ask yourself: Could I recreate this study and test these results?

Key characteristics to look for:

  • Specific research questions to be answered
  • Definition of the population, behavior, or phenomena being studied
  • Description of the process used to study this population or phenomena, including selection criteria, controls, and testing instruments (such as surveys)

Another hint: some scholarly journals use a specific layout, called the "IMRaD" format, to communicate empirical research findings. Such articles typically have 4 components:

  • Introduction: sometimes called "literature review" -- what is currently known about the topic -- usually includes a theoretical framework and/or discussion of previous studies
  • Methodology: sometimes called "research design" -- how to recreate the study -- usually describes the population, research process, and analytical tools used in the present study
  • Results: sometimes called "findings" -- what was learned through the study -- usually appears as statistical data or as substantial quotations from research participants
  • Discussion: sometimes called "conclusion" or "implications" -- why the study is important -- usually describes how the research results influence professional practices or future studies

Reading and Evaluating Scholarly Materials

Reading research can be a challenge. However, the tutorials and videos below can help. They explain what scholarly articles look like, how to read them, and how to evaluate them:

  • CRAAP Checklist A frequently-used checklist that helps you examine the currency, relevance, authority, accuracy, and purpose of an information source.
  • IF I APPLY A newer model of evaluating sources which encourages you to think about your own biases as a reader, as well as concerns about the item you are reading.
  • Credo Video: How to Read Scholarly Materials (4 min.)
  • Credo Tutorial: How to Read Scholarly Materials
  • Credo Tutorial: Evaluating Information
  • Credo Video: Evaluating Statistics (4 min.)
  • Last Updated: Feb 18, 2024 8:33 PM
  • URL: https://guides.libraries.psu.edu/emp

  • University of Memphis Libraries
Empirical Research: Defining, Identifying, & Finding

Calfee & Chambliss (2005) describe empirical research as a "systematic approach for answering certain types of questions." Those questions are answered "[t]hrough the collection of evidence under carefully defined and replicable conditions" (p. 43).

The evidence collected during empirical research is often referred to as "data."

Characteristics of Empirical Research

Emerald Publishing's guide to conducting empirical research identifies a number of common elements to empirical research: 

  • A research question, which will determine research objectives.
  • A particular and planned design for the research, which will depend on the question and which will find ways of answering it with appropriate use of resources.
  • The gathering of primary data, which is then analysed.
  • A particular methodology for collecting and analysing the data, such as an experiment or survey.
  • The limitation of the data to a particular group, area or time scale, known as a sample [emphasis added]: for example, a specific number of employees of a particular company type, or all users of a library over a given time scale. The sample should be somehow representative of a wider population.
  • The ability to recreate the study and test the results. This is known as reliability.
  • The ability to generalize from the findings to a larger sample and to other situations.

If you see these elements in a research article, you can feel confident that you have found empirical research. Emerald's guide goes into more detail on each element. 

Empirical research methodologies can be described as quantitative, qualitative, or a mix of both (usually called mixed-methods).

Ruane (2016) gets at the basic differences in approach between quantitative and qualitative research:

  • Quantitative research -- an approach to documenting reality that relies heavily on numbers both for the measurement of variables and for data analysis (p. 33).
  • Qualitative research -- an approach to documenting reality that relies on words and images as the primary data source (p. 33).

Both quantitative and qualitative methods are empirical. If you can recognize that a research study is a quantitative or qualitative study, then you have also recognized that it is an empirical study.

Below is information on the characteristics of quantitative and qualitative research. This video from Scribbr also offers a good overall introduction to the two approaches to research methodology:

Characteristics of Quantitative Research 

Researchers test hypotheses, or theories, based in assumptions about causality, i.e. we expect variable X to cause variable Y. Variables have to be controlled as much as possible to ensure validity. The results explain the relationship between the variables. Measures are based in pre-defined instruments.

Examples: experimental or quasi-experimental design, pretest & post-test, survey or questionnaire with closed-ended questions. Studies that identify factors that influence an outcome, assess the utility of an intervention, or seek to understand predictors of outcomes.

Characteristics of Qualitative Research

Researchers explore “meaning individuals or groups ascribe to social or human problems” (Creswell & Creswell, 2018, p. 3). Questions and procedures emerge rather than being prescribed. Complexity, nuance, and individual meaning are valued. Research is both inductive and deductive. Data sources are multiple and varied, e.g., interviews, observations, documents, photographs, etc. The researcher is a key instrument and must reflect on how their background, culture, and experiences influence the research.

Examples: open question interviews and surveys, focus groups, case studies, grounded theory, ethnography, discourse analysis, narrative, phenomenology, participatory action research.

Calfee, R. C. & Chambliss, M. (2005). The design of empirical research. In J. Flood, D. Lapp, J. R. Squire, & J. Jensen (Eds.),  Methods of research on teaching the English language arts: The methodology chapters from the handbook of research on teaching the English language arts (pp. 43-78). Routledge.  http://ezproxy.memphis.edu/login?url=http://search.ebscohost.com/login.aspx?direct=true&db=nlebk&AN=125955&site=eds-live&scope=site .

Creswell, J. W., & Creswell, J. D. (2018).  Research design: Qualitative, quantitative, and mixed methods approaches  (5th ed.). Thousand Oaks: Sage.

How to... conduct empirical research . (n.d.). Emerald Publishing.  https://www.emeraldgrouppublishing.com/how-to/research-methods/conduct-empirical-research .

Scribbr. (2019). Quantitative vs. qualitative: The differences explained  [video]. YouTube.  https://www.youtube.com/watch?v=a-XtVF7Bofg .

Ruane, J. M. (2016).  Introducing social research methods : Essentials for getting the edge . Wiley-Blackwell.  http://ezproxy.memphis.edu/login?url=http://search.ebscohost.com/login.aspx?direct=true&db=nlebk&AN=1107215&site=eds-live&scope=site .  

  • Last Updated: Apr 2, 2024 11:25 AM
  • URL: https://libguides.memphis.edu/empirical-research

Organizing Your Social Sciences Research Paper

A research problem is a definite or clear expression [statement] about an area of concern, a condition to be improved upon, a difficulty to be eliminated, or a troubling question that exists in scholarly literature, in theory, or within existing practice that points to a need for meaningful understanding and deliberate investigation. A research problem does not state how to do something, offer a vague or broad proposition, or present a value question. In the social and behavioral sciences, studies are most often framed around examining a problem that needs to be understood and resolved in order to improve society and the human condition.

Bryman, Alan. “The Research Question in Social Research: What is its Role?” International Journal of Social Research Methodology 10 (2007): 5-20; Guba, Egon G., and Yvonna S. Lincoln. “Competing Paradigms in Qualitative Research.” In Handbook of Qualitative Research. Norman K. Denzin and Yvonna S. Lincoln, editors. (Thousand Oaks, CA: Sage, 1994), pp. 105-117; Pardede, Parlindungan. “Identifying and Formulating the Research Problem.” Research in ELT: Module 4 (October 2018): 1-13; Li, Yanmei, and Sumei Zhang. “Identifying the Research Problem.” In Applied Research Methods in Urban and Regional Planning. (Cham, Switzerland: Springer International Publishing, 2022), pp. 13-21.

Importance of...

The purpose of a problem statement is to:

  • Introduce the reader to the importance of the topic being studied . The reader is oriented to the significance of the study.
  • Anchor the research questions, hypotheses, or assumptions to follow. It offers a concise statement about the purpose of your paper.
  • Place the topic into a particular context that defines the parameters of what is to be investigated.
  • Provide the framework for reporting the results and indicate what is probably necessary to conduct the study and explain how the findings will present this information.

In the social sciences, the research problem establishes the means by which you must answer the "So What?" question. This declarative question refers to a research problem surviving the relevancy test [the quality of a measurement procedure that provides repeatability and accuracy]. Note that answering the "So What?" question requires a commitment on your part to not only show that you have reviewed the literature, but that you have thoroughly considered the significance of the research problem and its implications applied to creating new knowledge and understanding or informing practice.

To survive the "So What" question, problem statements should possess the following attributes:

  • Clarity and precision [a well-written statement does not make sweeping generalizations and irresponsible pronouncements; it also does not include unspecific determinates like "very" or "giant"],
  • Demonstration of a researchable topic or issue [i.e., feasibility of conducting the study is based upon access to information that can be effectively acquired, gathered, interpreted, synthesized, and understood],
  • Identification of what would be studied, while avoiding the use of value-laden words and terms,
  • Identification of an overarching question or small set of questions accompanied by key factors or variables,
  • Identification of key concepts and terms,
  • Articulation of the study's conceptual boundaries or parameters or limitations,
  • Some generalizability in regard to applicability and bringing results into general use,
  • Conveyance of the study's importance, benefits, and justification [i.e., regardless of the type of research, it is important to demonstrate that the research is not trivial],
  • Absence of unnecessary jargon or overly complex sentence constructions; and,
  • Conveyance of more than the mere gathering of descriptive data providing only a snapshot of the issue or phenomenon under investigation.

Bryman, Alan. “The Research Question in Social Research: What is its Role?” International Journal of Social Research Methodology 10 (2007): 5-20; Brown, Perry J., Allen Dyer, and Ross S. Whaley. "Recreation Research—So What?" Journal of Leisure Research 5 (1973): 16-24; Castellanos, Susie. Critical Writing and Thinking. The Writing Center. Dean of the College. Brown University; Ellis, Timothy J. and Yair Levy Nova. "Framework of Problem-Based Research: A Guide for Novice Researchers on the Development of a Research-Worthy Problem." Informing Science: the International Journal of an Emerging Transdiscipline 11 (2008); Thesis and Purpose Statements. The Writer’s Handbook. Writing Center. University of Wisconsin, Madison; Thesis Statements. The Writing Center. University of North Carolina; Tips and Examples for Writing Thesis Statements. The Writing Lab and The OWL. Purdue University; Selwyn, Neil. "‘So What?’…A Question that Every Journal Article Needs to Answer." Learning, Media, and Technology 39 (2014): 1-5; Shoket, Mohd. "Research Problem: Identification and Formulation." International Journal of Research 1 (May 2014): 512-518.

Structure and Writing Style

I.  Types and Content

There are four general conceptualizations of a research problem in the social sciences:

  • Casuist Research Problem -- this type of problem relates to the determination of right and wrong in questions of conduct or conscience by analyzing moral dilemmas through the application of general rules and the careful distinction of special cases.
  • Difference Research Problem -- typically asks the question, “Is there a difference between two or more groups or treatments?” This type of problem statement is used when the researcher compares or contrasts two or more phenomena. This is a common approach to defining a problem in the clinical social sciences or behavioral sciences.
  • Descriptive Research Problem -- typically asks the question, "what is...?" with the underlying purpose to describe the significance of a situation, state, or existence of a specific phenomenon. This problem is often associated with revealing hidden or understudied issues.
  • Relational Research Problem -- suggests a relationship of some sort between two or more variables to be investigated. The underlying purpose is to investigate specific qualities or characteristics that may be connected in some way.
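The contrast between a difference problem and a relational problem can be sketched with two small calculations: a difference in group means and a Pearson correlation between two variables. All data and function names below are hypothetical and serve only to illustrate the distinction.

```python
import math
import statistics


def mean_difference(group_a, group_b):
    """Difference problem: how far apart are the two group means?"""
    return statistics.mean(group_a) - statistics.mean(group_b)


def pearson_r(xs, ys):
    """Relational problem: strength of the linear relationship between xs and ys."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / (spread_x * spread_y)


# Hypothetical well-being scores for a treatment group and a control group.
treatment = [7, 8, 6, 9]
control = [5, 6, 5, 6]
print("difference in means:", mean_difference(treatment, control))

# Hypothetical paired observations: hours studied and exam score.
hours_studied = [1, 2, 3, 4, 5]
exam_score = [52, 55, 61, 64, 70]
print(f"correlation: {pearson_r(hours_studied, exam_score):.3f}")
```

Neither number on its own answers the research problem; each only quantifies the comparison or relationship that the problem statement asks about.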

A problem statement in the social sciences should contain :

  • A lead-in that helps ensure the reader will maintain interest over the study,
  • A declaration of originality [e.g., mentioning a knowledge void or a lack of clarity about a topic that will be revealed in the literature review of prior research],
  • An indication of the central focus of the study [establishing the boundaries of analysis], and
  • An explanation of the study's significance or the benefits to be derived from investigating the research problem.

NOTE:   A statement describing the research problem of your paper should not be viewed as a thesis statement that you may be familiar with from high school. Given the content listed above, a description of the research problem is usually a short paragraph in length.

II.  Sources of Problems for Investigation

The identification of a problem to study can be challenging, not because there's a lack of issues that could be investigated, but due to the challenge of formulating an academically relevant and researchable problem which is unique and does not simply duplicate the work of others. To facilitate how you might select a problem from which to build a research study, consider these sources of inspiration:

Deductions from Theory This relates to deductions made from social philosophy or generalizations embodied in life and in society that the researcher is familiar with. These deductions from human behavior are then placed within an empirical frame of reference through research. From a theory, the researcher can formulate a research problem or hypothesis stating the expected findings in certain empirical situations. The research asks the question: “What relationship between variables will be observed if theory aptly summarizes the state of affairs?” One can then design and carry out a systematic investigation to assess whether empirical data confirm or reject the hypothesis, and hence, the theory.

Interdisciplinary Perspectives Identifying a problem that forms the basis for a research study can come from academic movements and scholarship originating in disciplines outside of your primary area of study. This can be an intellectually stimulating exercise. A review of pertinent literature should include examining research from related disciplines that can reveal new avenues of exploration and analysis. An interdisciplinary approach to selecting a research problem offers an opportunity to construct a more comprehensive understanding of a very complex issue than any single discipline may be able to provide.

Interviewing Practitioners The identification of research problems about particular topics can arise from formal interviews or informal discussions with practitioners who provide insight into new directions for future research and how to make research findings more relevant to practice. Discussions with experts in the field, such as teachers, social workers, health care providers, lawyers, business leaders, etc., offer the chance to identify practical, “real world” problems that may be understudied or ignored within academic circles. This approach also provides some practical knowledge which may help in the process of designing and conducting your study.

Personal Experience Don't undervalue your everyday experiences or encounters as worthwhile problems for investigation. Think critically about your own experiences and/or frustrations with an issue facing society or related to your community, your neighborhood, your family, or your personal life. This can be derived, for example, from deliberate observations of certain relationships for which there is no clear explanation or witnessing an event that appears harmful to a person or group or that is out of the ordinary.

Relevant Literature The selection of a research problem can be derived from a thorough review of pertinent research associated with your overall area of interest. This may reveal where gaps exist in understanding a topic or where an issue has been understudied. Research may be conducted to: 1) fill such gaps in knowledge; 2) evaluate if the methodologies employed in prior studies can be adapted to solve other problems; or, 3) determine if a similar study could be conducted in a different subject area or applied in a different context or to a different study sample [i.e., different setting or different group of people]. Also, authors frequently conclude their studies by noting implications for further research; read the conclusion of pertinent studies because statements about further research can be a valuable source for identifying new problems to investigate. The fact that a researcher has identified a topic worthy of further exploration validates the fact it is worth pursuing.

III.  What Makes a Good Research Statement?

A good problem statement begins by introducing the broad area in which your research is centered, gradually leading the reader to the more specific issues you are investigating. The statement need not be lengthy, but a good research problem should incorporate the following features:

1.  Compelling Topic The problem chosen should be one that motivates you to address it, but simple curiosity is not a good enough reason to pursue a research study because this does not indicate significance. The problem that you choose to explore must be important to you, but it must also be viewed as important by your readers and by the larger academic and/or social community that could be impacted by the results of your study.

2.  Supports Multiple Perspectives The problem must be phrased in a way that avoids dichotomies and instead supports the generation and exploration of multiple perspectives. A general rule of thumb in the social sciences is that a good research problem is one that would generate a variety of viewpoints from a composite audience made up of reasonable people.

3.  Researchability This isn't a real word but it represents an important aspect of creating a good research statement. It seems a bit obvious, but you don't want to find yourself in the midst of investigating a complex research project and realize that you don't have enough prior research to draw from for your analysis. There's nothing inherently wrong with original research, but you must choose research problems that can be supported, in some way, by the resources available to you. If you are not sure if something is researchable, don't assume that it isn't if you don't find information right away; seek help from a librarian!

NOTE:   Do not confuse a research problem with a research topic. A topic is something to read and obtain information about, whereas a problem is something to be solved or framed as a question raised for inquiry, consideration, or solution, or explained as a source of perplexity, distress, or vexation. In short, a research topic is something to be understood; a research problem is something that needs to be investigated.

IV.  Asking Analytical Questions about the Research Problem

Research problems in the social and behavioral sciences are often analyzed around critical questions that must be investigated. These questions can be explicitly listed in the introduction [i.e., "This study addresses three research questions about women's psychological recovery from domestic abuse in multi-generational home settings..."], or, the questions are implied in the text as specific areas of study related to the research problem. Explicitly listing your research questions at the end of your introduction can help in designing a clear roadmap of what you plan to address in your study, whereas, implicitly integrating them into the text of the introduction allows you to create a more compelling narrative around the key issues under investigation. Either approach is appropriate.

The number of questions you attempt to address should be based on the complexity of the problem you are investigating and what areas of inquiry you find most critical to study. Practical considerations, such as the length of the paper you are writing or the availability of resources to analyze the issue, can also factor into how many questions to ask. In general, however, there should be no more than four research questions underpinning a single research problem.

Given this, a well-developed analytical question does one or more of the following:

  • Highlights a genuine dilemma, area of ambiguity, or point of confusion about a topic open to interpretation by your readers;
  • Yields an answer that is unexpected and not obvious rather than inevitable and self-evident;
  • Provokes meaningful thought or discussion;
  • Raises the visibility of the key ideas or concepts that may be understudied or hidden;
  • Suggests the need for complex analysis or argument rather than a basic description or summary; and,
  • Offers a specific path of inquiry that avoids eliciting generalizations about the problem.

NOTE:   Questions of how and why concerning a research problem often require more analysis than questions about who, what, where, and when. You should still ask yourself these latter questions, however. Thinking introspectively about the who, what, where, and when of a research problem can help ensure that you have thoroughly considered all aspects of the problem under investigation and helps define the scope of the study in relation to the problem.

V.  Mistakes to Avoid

Beware of circular reasoning! Do not state the research problem as simply the absence of the thing you are suggesting. For example, if you propose the following, "The problem in this community is that there is no hospital," this only leads to a research problem where:

  • The need is for a hospital
  • The objective is to create a hospital
  • The method is to plan for building a hospital, and
  • The evaluation is to measure if there is a hospital or not.

This is an example of a research problem that fails the "So What?" test. In this example, the problem does not reveal the relevance of why you are investigating the fact there is no hospital in the community [e.g., perhaps there's a hospital in the community ten miles away]; it does not elucidate the significance of why one should study the fact there is no hospital in the community [e.g., that hospital in the community ten miles away has no emergency room]; the research problem does not offer an intellectual pathway towards adding new knowledge or clarifying prior knowledge [e.g., the county in which there is no hospital already conducted a study about the need for a hospital, but it was conducted ten years ago]; and, the problem does not offer meaningful outcomes that lead to recommendations that can be generalized for other situations or that could suggest areas for further research [e.g., the challenges of building a new hospital serve as a case study for other communities].

Alvesson, Mats and Jörgen Sandberg. “Generating Research Questions Through Problematization.” Academy of Management Review 36 (April 2011): 247-271; Choosing and Refining Topics. Writing@CSU. Colorado State University; D'Souza, Victor S. "Use of Induction and Deduction in Research in Social Sciences: An Illustration." Journal of the Indian Law Institute 24 (1982): 655-661; Ellis, Timothy J. and Yair Levy Nova. "Framework of Problem-Based Research: A Guide for Novice Researchers on the Development of a Research-Worthy Problem." Informing Science: the International Journal of an Emerging Transdiscipline 11 (2008); How to Write a Research Question. The Writing Center. George Mason University; Invention: Developing a Thesis Statement. The Reading/Writing Center. Hunter College; Problem Statements PowerPoint Presentation. The Writing Lab and The OWL. Purdue University; Procter, Margaret. Using Thesis Statements. University College Writing Centre. University of Toronto; Shoket, Mohd. "Research Problem: Identification and Formulation." International Journal of Research 1 (May 2014): 512-518; Trochim, William M.K. Problem Formulation. Research Methods Knowledge Base. 2006; Thesis and Purpose Statements. The Writer’s Handbook. Writing Center. University of Wisconsin, Madison; Thesis Statements. The Writing Center. University of North Carolina; Tips and Examples for Writing Thesis Statements. The Writing Lab and The OWL. Purdue University; Pardede, Parlindungan. “Identifying and Formulating the Research Problem.” Research in ELT: Module 4 (October 2018): 1-13; Walk, Kerry. Asking an Analytical Question. [Class handout or worksheet]. Princeton University; White, Patrick. Developing Research Questions: A Guide for Social Scientists. New York: Palgrave Macmillan, 2009; Li, Yanmei, and Sumei Zhang. "Identifying the Research Problem." In Applied Research Methods in Urban and Regional Planning. (Cham, Switzerland: Springer International Publishing, 2022), pp. 13-21.

  • Last Updated: May 30, 2024 9:38 AM
  • URL: https://libguides.usc.edu/writingguide

Empirical Research: A Comprehensive Guide for Academics 

Empirical research relies on gathering and studying real, observable data. The term ‘empirical’ comes from the Greek word ‘empeirikos,’ meaning ‘experienced’ or ‘based on experience.’ So, what is empirical research? Instead of using theories or opinions, empirical research depends on real data obtained through direct observation or experimentation.

Why Empirical Research?

Empirical research plays a key role in checking or improving current theories, providing a systematic way to grow knowledge across different areas. By focusing on objectivity, it makes research findings more trustworthy, which is critical in research fields like medicine, psychology, economics, and public policy. In the end, the strengths of empirical research lie in deepening our awareness of the world and improving our capacity to tackle problems wisely. 1,2  

Qualitative and Quantitative Methods

There are two main types of empirical research methods – qualitative and quantitative. 3,4 Qualitative research delves into intricate phenomena using non-numerical data, such as interviews or observations, to offer in-depth insights into human experiences. In contrast, quantitative research analyzes numerical data to spot patterns and relationships, aiming for objectivity and the ability to apply findings to a wider context. 

Steps for Conducting Empirical Research

When it comes to conducting research, there are some simple steps that researchers can follow. 5,6  

  • Create Research Hypothesis:  Clearly state the specific question you want to answer or the hypothesis you want to explore in your study. 
  • Examine Existing Research:  Read and study existing research on your topic. Understand what’s already known, identify existing gaps in knowledge, and create a framework for your own study based on what you learn. 
  • Plan Your Study:  Decide how you’ll conduct your research—whether through qualitative methods, quantitative methods, or a mix of both. Choose suitable techniques like surveys, experiments, interviews, or observations based on your research question. 
  • Develop Research Instruments:  Create reliable data collection tools, such as surveys or questionnaires, to help you gather data. Ensure these tools are well-designed and effective.
  • Collect Data:  Systematically gather the information you need for your research according to your study design and protocols using the chosen research methods. 
  • Data Analysis:  Analyze the collected data using suitable statistical or qualitative methods that align with your research question and objectives. 
  • Interpret Results:  Understand and explain the significance of your analysis results in the context of your research question or hypothesis. 
  • Draw Conclusions:  Summarize your findings and draw conclusions based on the evidence. Acknowledge any study limitations and propose areas for future research. 
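As a rough sketch of how the later steps (collect data, analyze, interpret) might look for a single closed-ended survey item, the example below summarizes made-up responses on a 1-5 agreement scale. The question, the responses, the midpoint comparison, and the `summarize` helper are all assumptions for illustration, not a prescribed procedure.

```python
from collections import Counter
import statistics

# Hypothetical research question: do respondents tend to agree with the item,
# i.e. is the typical response above the scale midpoint of 3?
SCALE_MIDPOINT = 3

# Collect data: made-up responses on a 1-5 agreement scale.
responses = [4, 5, 3, 4, 2, 5, 4, 3, 4, 5, 1, 4]


def summarize(scores):
    """Analyze: return sample size, median, and a count for each scale point."""
    counts = Counter(scores)
    return {
        "n": len(scores),
        "median": statistics.median(scores),
        "distribution": {point: counts.get(point, 0) for point in range(1, 6)},
    }


summary = summarize(responses)

# Interpret: compare the typical response with the midpoint.
tends_to_agree = summary["median"] > SCALE_MIDPOINT

print(summary)
print("typical response above midpoint:", tends_to_agree)
```

A descriptive summary like this supports only a cautious conclusion about this sample; the limitations and any inferential testing would still need to be addressed in the final step.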

Advantages of Empirical Research

Empirical research is valuable because it stays objective by relying on observable data, lessening the impact of personal biases. This objectivity boosts the trustworthiness of research findings. Also, using precise quantitative methods helps in accurate measurement and statistical analysis. This precision ensures researchers can draw reliable conclusions from numerical data, strengthening our understanding of the studied phenomena. 4  

Disadvantages of Empirical Research

While empirical research has notable strengths, researchers must also be aware of its limitations when deciding on the right research method for their study. 4 One significant drawback of empirical research is the risk of oversimplifying complex phenomena, especially when relying solely on quantitative methods. These methods may struggle to capture the richness and nuances present in certain social, cultural, or psychological contexts. Another challenge is the potential for confounding variables or biases during data collection, impacting result accuracy.

Tips for Empirical Writing

In empirical research, the writing is usually done in research papers, articles, or reports. The empirical writing follows a set structure, and each section has a specific role. Here are some tips for your empirical writing. 7   

  • Define Your Objectives:  When you write about your research, start by making your goals clear. Explain what you want to find out or prove in a simple and direct way. This helps guide your research and lets others know what you have set out to achieve. 
  • Be Specific in Your Literature Review:  In the part where you talk about what others have studied before you, focus on research that directly relates to your research question. Keep it short and pick studies that help explain why your research is important. This part sets the stage for your work. 
  • Explain Your Methods Clearly:  When you talk about how you did your research (Methods), explain it in detail. Be clear about your research plan, who took part, and what you did; this helps others understand and trust your study. Also, be honest about any rules you follow to make sure your study is ethical and reproducible.
  • Share Your Results Clearly:  After doing your empirical research, share what you found in a simple way. Use tables or graphs to make it easier for your audience to understand your research. Also, talk about any numbers you found and clearly state if they are important or not. Ensure that others can see why your research findings matter.
  • Talk About What Your Findings Mean:  In the part where you discuss your research results, explain what they mean. Discuss why your findings are important and if they connect to what others have found before. Be honest about any problems with your study and suggest ideas for more research in the future. 
  • Wrap It Up Clearly:  Finally, end your empirical research paper by summarizing what you found and why it’s important. Remind everyone why your study matters. Keep your writing clear and fix any mistakes before you share it. Ask someone you trust to read it and give you feedback before you finish. 

References:  

  1. Empirical Research in the Social Sciences and Education, Penn State University Libraries. Available online at  https://guides.libraries.psu.edu/emp
  2. How to conduct empirical research, Emerald Publishing. Available online at  https://www.emeraldgrouppublishing.com/how-to/research-methods/conduct-empirical-research
  3. Empirical Research: Quantitative & Qualitative, Arrendale Library, Piedmont University. Available online at  https://library.piedmont.edu/empirical-research
  4. Bouchrika, I.  What Is Empirical Research? Definition, Types & Samples  in 2024. Research.com, January 2024. Available online at  https://research.com/research/what-is-empirical-research
  5. Quantitative and Empirical Research vs. Other Types of Research. California State University, April 2023. Available online at  https://libguides.csusb.edu/quantitative
  6. Empirical Research, Definitions, Methods, Types and Examples, Studocu.com website. Available online at  https://www.studocu.com/row/document/uganda-christian-university/it-research-methods/emperical-research-definitions-methods-types-and-examples/55333816
  7. Writing an Empirical Paper in APA Style. Psychology Writing Center, University of Washington. Available online at  https://psych.uw.edu/storage/writing_center/APApaper.pdf

What is Empirical Research? Definition, Methods, Examples

Appinio Research · 09.02.2024 · 36min read

Ever wondered how we gather the facts, unveil hidden truths, and make informed decisions in a world filled with questions? Empirical research holds the key.

In this guide, we'll delve deep into the art and science of empirical research, unraveling its methods, mysteries, and manifold applications. From defining the core principles to mastering data analysis and reporting findings, we're here to equip you with the knowledge and tools to navigate the empirical landscape.

What is Empirical Research?

Empirical research is the cornerstone of scientific inquiry, providing a systematic and structured approach to investigating the world around us. It is the process of gathering and analyzing empirical or observable data to test hypotheses, answer research questions, or gain insights into various phenomena. This form of research relies on evidence derived from direct observation or experimentation, allowing researchers to draw conclusions based on real-world data rather than purely theoretical or speculative reasoning.

Characteristics of Empirical Research

Empirical research is characterized by several key features:

  • Observation and Measurement : It involves the systematic observation or measurement of variables, events, or behaviors.
  • Data Collection : Researchers collect data through various methods, such as surveys, experiments, observations, or interviews.
  • Testable Hypotheses : Empirical research often starts with testable hypotheses that are evaluated using collected data.
  • Quantitative or Qualitative Data : Data can be quantitative (numerical) or qualitative (non-numerical), depending on the research design.
  • Statistical Analysis : Quantitative data often undergo statistical analysis to determine patterns, relationships, or significance.
  • Objectivity and Replicability : Empirical research strives for objectivity, minimizing researcher bias. It should be replicable, allowing other researchers to conduct the same study to verify results.
  • Conclusions and Generalizations : Empirical research generates findings based on data and aims to make generalizations about larger populations or phenomena.

Importance of Empirical Research

Empirical research plays a pivotal role in advancing knowledge across various disciplines. Its importance extends to academia, industry, and society as a whole. Here are several reasons why empirical research is essential:

  • Evidence-Based Knowledge : Empirical research provides a solid foundation of evidence-based knowledge. It enables us to test hypotheses, confirm or refute theories, and build a robust understanding of the world.
  • Scientific Progress : In the scientific community, empirical research fuels progress by expanding the boundaries of existing knowledge. It contributes to the development of theories and the formulation of new research questions.
  • Problem Solving : Empirical research is instrumental in addressing real-world problems and challenges. It offers insights and data-driven solutions to complex issues in fields like healthcare, economics, and environmental science.
  • Informed Decision-Making : In policymaking, business, and healthcare, empirical research informs decision-makers by providing data-driven insights. It guides strategies, investments, and policies for optimal outcomes.
  • Quality Assurance : Empirical research is essential for quality assurance and validation in various industries, including pharmaceuticals, manufacturing, and technology. It ensures that products and processes meet established standards.
  • Continuous Improvement : Businesses and organizations use empirical research to evaluate performance, customer satisfaction, and product effectiveness. This data-driven approach fosters continuous improvement and innovation.
  • Human Advancement : Empirical research in fields like medicine and psychology contributes to the betterment of human health and well-being. It leads to medical breakthroughs, improved therapies, and enhanced psychological interventions.
  • Critical Thinking and Problem Solving : Engaging in empirical research fosters critical thinking skills, problem-solving abilities, and a deep appreciation for evidence-based decision-making.

Empirical research empowers us to explore, understand, and improve the world around us. It forms the bedrock of scientific inquiry and drives progress in countless domains, shaping our understanding of both the natural and social sciences.

How to Conduct Empirical Research?

So, you've decided to dive into the world of empirical research. Let's begin by exploring the crucial steps involved in getting started with your research project.

1. Select a Research Topic

Selecting the right research topic is the cornerstone of a successful empirical study. It's essential to choose a topic that not only piques your interest but also aligns with your research goals and objectives. Here's how to go about it:

  • Identify Your Interests : Start by reflecting on your passions and interests. What topics fascinate you the most? Your enthusiasm will be your driving force throughout the research process.
  • Brainstorm Ideas : Engage in brainstorming sessions to generate potential research topics. Consider the questions you've always wanted to answer or the issues that intrigue you.
  • Relevance and Significance : Assess the relevance and significance of your chosen topic. Does it contribute to existing knowledge? Is it a pressing issue in your field of study or the broader community?
  • Feasibility : Evaluate the feasibility of your research topic. Do you have access to the necessary resources, data, and participants (if applicable)?

2. Formulate Research Questions

Once you've narrowed down your research topic, the next step is to formulate clear and precise research questions. These questions will guide your entire research process and shape your study's direction. To create effective research questions:

  • Specificity : Ensure that your research questions are specific and focused. Vague or overly broad questions can lead to inconclusive results.
  • Relevance : Your research questions should directly relate to your chosen topic. They should address gaps in knowledge or contribute to solving a particular problem.
  • Testability : Ensure that your questions are testable through empirical methods. You should be able to gather data and analyze it to answer these questions.
  • Avoid Bias : Craft your questions in a way that avoids leading or biased language. Maintain neutrality to uphold the integrity of your research.

3. Review Existing Literature

Before you embark on your empirical research journey, it's essential to immerse yourself in the existing body of literature related to your chosen topic. This step, often referred to as a literature review, serves several purposes:

  • Contextualization : Understand the historical context and current state of research in your field. What have previous studies found, and what questions remain unanswered?
  • Identifying Gaps : Identify gaps or areas where existing research falls short. These gaps will help you formulate meaningful research questions and hypotheses.
  • Theory Development : If your study is theoretical, consider how existing theories apply to your topic. If it's empirical, understand how previous studies have approached data collection and analysis.
  • Methodological Insights : Learn from the methodologies employed in previous research. What methods were successful, and what challenges did researchers face?

4. Define Variables

Variables are fundamental components of empirical research. They are the factors or characteristics that can change or be manipulated during your study. Properly defining and categorizing variables is crucial for the clarity and validity of your research. Here's what you need to know:

  • Independent Variables : These are the variables that you, as the researcher, manipulate or control. They are the "cause" in cause-and-effect relationships.
  • Dependent Variables : Dependent variables are the outcomes or responses that you measure or observe. They are the "effect" influenced by changes in independent variables.
  • Operational Definitions : To ensure consistency and clarity, provide operational definitions for your variables. Specify how you will measure or manipulate each variable.
  • Control Variables : In some studies, controlling for other variables that may influence your dependent variable is essential. These are known as control variables.

Understanding these foundational aspects of empirical research will give you a solid footing for the rest of your journey. Now that you've grasped the essentials of getting started, let's delve deeper into the intricacies of research design.

Empirical Research Design

Now that you've selected your research topic, formulated research questions, and defined your variables, it's time to delve into the heart of your empirical research journey: research design. This pivotal step determines how you will collect data and what methods you'll employ to answer your research questions. Let's explore the various facets of research design in detail.

Types of Empirical Research

Empirical research can take on several forms, each with its own unique approach and methodologies. Understanding the different types of empirical research will help you choose the most suitable design for your study. Here are some common types:

  • Experimental Research : In this type, researchers manipulate one or more independent variables to observe their impact on dependent variables. It's highly controlled and often conducted in a laboratory setting.
  • Observational Research : Observational research involves the systematic observation of subjects or phenomena without intervention. Researchers are passive observers, documenting behaviors, events, or patterns.
  • Survey Research : Surveys are used to collect data through structured questionnaires or interviews. This method is efficient for gathering information from a large number of participants.
  • Case Study Research : Case studies focus on in-depth exploration of one or a few cases. Researchers gather detailed information through various sources such as interviews, documents, and observations.
  • Qualitative Research : Qualitative research aims to understand behaviors, experiences, and opinions in depth. It often involves open-ended questions, interviews, and thematic analysis.
  • Quantitative Research : Quantitative research collects numerical data and relies on statistical analysis to draw conclusions. It involves structured questionnaires, experiments, and surveys.

Your choice of research type should align with your research questions and objectives. Experimental research, for example, is ideal for testing cause-and-effect relationships, while qualitative research is more suitable for exploring complex phenomena.

Experimental Design

Experimental research is a systematic approach to studying causal relationships. It's characterized by the manipulation of one or more independent variables while controlling for other factors. Here are some key aspects of experimental design:

  • Control and Experimental Groups : Participants are randomly assigned to either a control group or an experimental group. The independent variable is manipulated for the experimental group but not for the control group.
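  • Randomization : Randomization guards against selection bias in group assignment. It ensures that each participant has an equal chance of being placed in either group, so that differences between the groups arise by chance rather than from the researcher's choices.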
  • Hypothesis Testing : Experimental research often involves hypothesis testing. Researchers formulate hypotheses about the expected effects of the independent variable and use statistical analysis to test these hypotheses.
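
The random assignment described above can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed procedure: the function name `assign_groups`, the participant IDs, and the fixed seed are all assumptions made for the example.

```python
import random

def assign_groups(participant_ids, seed=None):
    """Randomly split participants into a control and an experimental group."""
    rng = random.Random(seed)         # a seeded generator makes the assignment reproducible
    shuffled = list(participant_ids)  # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"control": shuffled[:half], "experimental": shuffled[half:]}

# Hypothetical participant IDs, invented for illustration
participants = [f"P{i:02d}" for i in range(1, 21)]
groups = assign_groups(participants, seed=42)
print(len(groups["control"]), len(groups["experimental"]))
```

Recording the seed alongside the study documentation lets others reproduce the exact assignment, which supports the replicability goal mentioned earlier.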

Observational Design

Observational research entails careful and systematic observation of subjects or phenomena. It's advantageous when you want to understand natural behaviors or events. Key aspects of observational design include:

  • Participant Observation : Researchers immerse themselves in the environment they are studying. They become part of the group being observed, allowing for a deep understanding of behaviors.
  • Non-Participant Observation : In non-participant observation, researchers remain separate from the subjects. They observe and document behaviors without direct involvement.
  • Data Collection Methods : Observational research can involve various data collection methods, such as field notes, video recordings, photographs, or coding of observed behaviors.

Survey Design

Surveys are a popular choice for collecting data from a large number of participants. Effective survey design is essential to ensure the validity and reliability of your data. Consider the following:

  • Questionnaire Design : Create clear and concise questions that are easy for participants to understand. Avoid leading or biased questions.
  • Sampling Methods : Decide on the appropriate sampling method for your study, whether it's random, stratified, or convenience sampling.
  • Data Collection Tools : Choose the right tools for data collection, whether it's paper surveys, online questionnaires, or face-to-face interviews.

Case Study Design

Case studies are an in-depth exploration of one or a few cases to gain a deep understanding of a particular phenomenon. Key aspects of case study design include:

  • Single Case vs. Multiple Case Studies : Decide whether you'll focus on a single case or multiple cases. Single case studies are intensive and allow for detailed examination, while multiple case studies provide comparative insights.
  • Data Collection Methods : Gather data through interviews, observations, document analysis, or a combination of these methods.

Qualitative vs. Quantitative Research

In empirical research, you'll often encounter the distinction between qualitative and quantitative research. Here's a closer look at these two approaches:

  • Qualitative Research : Qualitative research seeks an in-depth understanding of human behavior, experiences, and perspectives. It involves open-ended questions, interviews, and the analysis of textual or narrative data. Qualitative research is exploratory and often used when the research question is complex and requires a nuanced understanding.
  • Quantitative Research : Quantitative research collects numerical data and employs statistical analysis to draw conclusions. It involves structured questionnaires, experiments, and surveys. Quantitative research is well suited to testing hypotheses and, when paired with an experimental design, to establishing cause-and-effect relationships.

Understanding the various research design options is crucial in determining the most appropriate approach for your study. Your choice should align with your research questions, objectives, and the nature of the phenomenon you're investigating.

Data Collection for Empirical Research

Now that you've established your research design, it's time to roll up your sleeves and collect the data that will fuel your empirical research. Effective data collection is essential for obtaining accurate and reliable results.

Sampling Methods

Sampling methods are critical in empirical research, as they determine the subset of individuals or elements from your target population that you will study. Here are some standard sampling methods:

  • Random Sampling : Random sampling ensures that every member of the population has an equal chance of being selected. It minimizes bias and is often used in quantitative research.
  • Stratified Sampling : Stratified sampling involves dividing the population into subgroups or strata based on specific characteristics (e.g., age, gender, location). Samples are then randomly selected from each stratum, ensuring representation of all subgroups.
  • Convenience Sampling : Convenience sampling involves selecting participants who are readily available or easily accessible. While it's convenient, it may introduce bias and limit the generalizability of results.
  • Snowball Sampling : Snowball sampling is especially useful when studying hard-to-reach or hidden populations. One participant leads you to another, creating a "snowball" effect. This method is common in qualitative research.
  • Purposive Sampling : In purposive sampling, researchers deliberately select participants who meet specific criteria relevant to their research questions. It's often used in qualitative studies to gather in-depth information.

The choice of sampling method depends on the nature of your research, available resources, and the degree of precision required. It's crucial to carefully consider your sampling strategy to ensure that your sample accurately represents your target population.
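
To make the difference between random and stratified sampling concrete, here is a small sketch using only Python's standard library. The population, the `age_group` attribute, and the helper names `simple_random_sample` and `stratified_sample` are hypothetical and chosen purely for illustration.

```python
import random
from collections import defaultdict

def simple_random_sample(population, n, seed=None):
    """Every unit has the same chance of being selected."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def stratified_sample(population, stratum_of, fraction, seed=None):
    """Draw the same fraction at random from every stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in population:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))  # keep at least one unit per stratum
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population: 60 people under 30 and 40 people aged 30 or over
population = [
    {"id": i, "age_group": "under_30" if i < 60 else "30_plus"}
    for i in range(100)
]

srs = simple_random_sample(population, 10, seed=1)
strat = stratified_sample(population, lambda p: p["age_group"], 0.1, seed=1)
print(len(srs), len(strat))
```

With a 10% fraction, the stratified sample always contains six people under 30 and four aged 30 or over, whereas the simple random sample only matches those proportions on average.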

Data Collection Instruments

Data collection instruments are the tools you use to gather information from your participants or sources. These instruments should be designed to capture the data you need accurately. Here are some popular data collection instruments:

  • Questionnaires : Questionnaires consist of structured questions with predefined response options. When designing questionnaires, consider the clarity of questions, the order of questions, and the response format (e.g., Likert scale, multiple-choice).
  • Interviews : Interviews involve direct communication between the researcher and participants. They can be structured (with predetermined questions) or unstructured (open-ended). Effective interviews require active listening and probing for deeper insights.
  • Observations : Observations entail systematically and objectively recording behaviors, events, or phenomena. Researchers must establish clear criteria for what to observe, how to record observations, and when to observe.
  • Surveys : Surveys are a common data collection instrument for quantitative research. They can be administered through various means, including online surveys, paper surveys, and telephone surveys.
  • Documents and Archives : In some cases, data may be collected from existing documents, records, or archives. Ensure that the sources are reliable, relevant, and properly documented.

To streamline your process and gather insights with precision and efficiency, consider leveraging innovative tools like Appinio. With Appinio's intuitive platform, you can harness the power of real-time consumer data to inform your research decisions effectively. Whether you're conducting surveys, interviews, or observations, Appinio empowers you to define your target audience, collect data from diverse demographics, and analyze results seamlessly.

By incorporating Appinio into your data collection toolkit, you can unlock a world of possibilities and elevate the impact of your empirical research. Ready to revolutionize your approach to data collection?

Data Collection Procedures

Data collection procedures outline the step-by-step process for gathering data. These procedures should be meticulously planned and executed to maintain the integrity of your research.

  • Training : If you have a research team, ensure that they are trained in data collection methods and protocols. Consistency in data collection is crucial.
  • Pilot Testing : Before launching your data collection, conduct a pilot test with a small group to identify any potential problems with your instruments or procedures. Make necessary adjustments based on feedback.
  • Data Recording : Establish a systematic method for recording data. This may include timestamps, codes, or identifiers for each data point.
  • Data Security : Safeguard the confidentiality and security of collected data. Ensure that only authorized individuals have access to the data.
  • Data Storage : Properly organize and store your data in a secure location, whether in physical or digital form. Back up data to prevent loss.

Ethical Considerations

Ethical considerations are paramount in empirical research, as they ensure the well-being and rights of participants are protected.

  • Informed Consent : Obtain informed consent from participants, providing clear information about the research purpose, procedures, risks, and their right to withdraw at any time.
  • Privacy and Confidentiality : Protect the privacy and confidentiality of participants. Ensure that data is anonymized and sensitive information is kept confidential.
  • Beneficence : Ensure that your research benefits participants and society while minimizing harm. Consider the potential risks and benefits of your study.
  • Honesty and Integrity : Conduct research with honesty and integrity. Report findings accurately and transparently, even if they are not what you expected.
  • Respect for Participants : Treat participants with respect, dignity, and sensitivity to cultural differences. Avoid any form of coercion or manipulation.
  • Institutional Review Board (IRB) : If required, seek approval from an IRB or ethics committee before conducting your research, particularly when working with human participants.

Adhering to ethical guidelines is not only essential for the ethical conduct of research but also crucial for the credibility and validity of your study. Ethical research practices build trust between researchers and participants and contribute to the advancement of knowledge with integrity.
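
One common way to protect confidentiality in a dataset is to replace direct identifiers with keyed pseudonyms before analysis. The sketch below assumes identifiers are strings and that a secret key is stored separately from the data; the function name `pseudonymize`, the placeholder key, and the sample records are all hypothetical. Note that this is pseudonymization rather than full anonymization, since other fields in a record may still allow re-identification.

```python
import hashlib
import hmac

def pseudonymize(participant_id: str, secret_key: bytes) -> str:
    """Return a stable pseudonym that cannot be reversed without the key."""
    digest = hmac.new(secret_key, participant_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]  # shortened for readability

# Placeholder key for illustration only; a real key should be random and kept apart from the data
secret_key = b"replace-with-a-randomly-generated-key"

# Invented records for illustration
records = [
    {"participant": "alice@example.com", "score": 7},
    {"participant": "bob@example.com", "score": 4},
]

safe_records = [
    {"participant": pseudonymize(r["participant"], secret_key), "score": r["score"]}
    for r in records
]
print(safe_records)
```

Because the same identifier always maps to the same pseudonym, responses from one participant can still be linked across data files without exposing who that participant is.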

With a solid understanding of data collection, including sampling methods, instruments, procedures, and ethical considerations, you are now well-equipped to gather the data needed to answer your research questions.

Empirical Research Data Analysis

Now comes the exciting phase of data analysis, where the raw data you've diligently collected starts to yield insights and answers to your research questions. We will explore the various aspects of data analysis, from preparing your data to drawing meaningful conclusions through statistics and visualization.

Data Preparation

Data preparation is the crucial first step in data analysis. It involves cleaning, organizing, and transforming your raw data into a format that is ready for analysis. Effective data preparation ensures the accuracy and reliability of your results.

  • Data Cleaning : Identify and rectify errors, missing values, and inconsistencies in your dataset. This may involve correcting typos, investigating outliers (and removing them only when they are clearly erroneous), and imputing missing data.
  • Data Coding : Assign numerical values or codes to categorical variables to make them suitable for statistical analysis. For example, converting "Yes" and "No" to 1 and 0.
  • Data Transformation : Transform variables as needed to meet the assumptions of the statistical tests you plan to use. Common transformations include logarithmic or square root transformations.
  • Data Integration : If your data comes from multiple sources, integrate it into a unified dataset, ensuring that variables match and align.
  • Data Documentation : Maintain clear documentation of all data preparation steps, as well as the rationale behind each decision. This transparency is essential for replicability.

Effective data preparation lays the foundation for accurate and meaningful analysis. It allows you to trust the results that will follow in the subsequent stages.
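
As a rough illustration of the cleaning, coding, and transformation steps above, the following sketch works on a tiny invented dataset. The column names, the Yes/No coding scheme, and the choice of median imputation are assumptions for the example, not recommendations for every study.

```python
import math
import statistics

# Invented raw survey rows with inconsistent casing, stray spaces, and a missing value
raw_rows = [
    {"id": "1", "consent": "Yes", "income": "42000"},
    {"id": "2", "consent": "no ", "income": ""},
    {"id": "3", "consent": "YES", "income": "58000"},
    {"id": "4", "consent": "No", "income": "35000"},
]

YES_NO_CODES = {"yes": 1, "no": 0}  # data coding: categorical answers become numbers

def prepare(rows):
    cleaned = []
    for row in rows:
        income_text = row["income"].strip()
        cleaned.append({
            "id": int(row["id"]),
            # data cleaning: normalise whitespace and casing before coding
            "consent": YES_NO_CODES[row["consent"].strip().lower()],
            "income": float(income_text) if income_text else None,
        })
    # imputation: fill missing incomes with the median of the observed values
    observed = [r["income"] for r in cleaned if r["income"] is not None]
    median_income = statistics.median(observed)
    for r in cleaned:
        r["income_imputed"] = r["income"] is None  # document which values were filled in
        if r["income"] is None:
            r["income"] = median_income
        r["log_income"] = math.log(r["income"])    # data transformation
    return cleaned

prepared = prepare(raw_rows)
for row in prepared:
    print(row)
```

Keeping a flag such as `income_imputed` in the dataset is one simple way to document preparation decisions so that later analyses can be rerun with and without the imputed values.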

Descriptive Statistics

Descriptive statistics help you summarize and make sense of your data by providing a clear overview of its key characteristics. These statistics are essential for understanding the central tendencies, variability, and distribution of your variables. Descriptive statistics include:

  • Measures of Central Tendency : These include the mean (average), median (middle value), and mode (most frequent value). They help you understand the typical or central value of your data.
  • Measures of Dispersion : Measures like the range, variance, and standard deviation provide insights into the spread or variability of your data points.
  • Frequency Distributions : Creating frequency distributions or histograms allows you to visualize the distribution of your data across different values or categories.

Descriptive statistics provide the initial insights needed to understand your data's basic characteristics, which can inform further analysis.
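
The measures listed above can all be computed with Python's built-in `statistics` module. The ratings below are invented for illustration, and the text-based frequency display is just one simple way to inspect a distribution.

```python
import statistics
from collections import Counter

# Hypothetical satisfaction ratings on a 1-5 scale
scores = [4, 5, 3, 4, 2, 5, 4, 3, 4, 5]

summary = {
    "mean": statistics.mean(scores),          # central tendency
    "median": statistics.median(scores),
    "mode": statistics.mode(scores),
    "range": max(scores) - min(scores),       # dispersion
    "variance": statistics.variance(scores),  # sample variance (n - 1 denominator)
    "std_dev": statistics.stdev(scores),      # sample standard deviation
}

frequencies = Counter(scores)                 # frequency distribution

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
for value in sorted(frequencies):
    print(value, "#" * frequencies[value])
```

`statistics.variance` and `statistics.stdev` treat the data as a sample; if the data cover the whole population, `statistics.pvariance` and `statistics.pstdev` are the matching alternatives.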

Inferential Statistics

Inferential statistics take your analysis to the next level by allowing you to make inferences or predictions about a larger population based on your sample data. These methods help you test hypotheses and draw meaningful conclusions. Key concepts in inferential statistics include:

  • Hypothesis Testing : Hypothesis tests (e.g., t-tests, chi-squared tests) help you judge whether observed differences or associations in your data are statistically significant, that is, unlikely to have arisen by chance alone if there were no real effect.
  • Confidence Intervals : Confidence intervals provide a range of plausible values for a population parameter (e.g., the population mean), estimated from your sample data at a stated confidence level such as 95%.
  • Regression Analysis : Regression models (linear, logistic, etc.) help you explore relationships between variables and make predictions.
  • Analysis of Variance (ANOVA) : ANOVA tests are used to compare means between multiple groups, allowing you to assess whether differences are statistically significant.

Inferential statistics are powerful tools for drawing conclusions from your data and assessing the generalizability of your findings to the broader population.
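
To show what "occurred by chance" means in practice, the sketch below uses a permutation test, a simple alternative to the t-test that needs no statistical library, together with a normal-approximation confidence interval for a mean. The data, function names, and number of permutations are assumptions for illustration; for small samples, a t-based interval would be somewhat wider than the one computed here.

```python
import random
import statistics
from statistics import NormalDist

def mean(values):
    return sum(values) / len(values)

def permutation_test(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided permutation test for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # relabel participants at random
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # add-one correction so the estimated p-value is never exactly zero
    return (extreme + 1) / (n_permutations + 1)

def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for the population mean."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * statistics.stdev(sample) / len(sample) ** 0.5
    centre = mean(sample)
    return centre - margin, centre + margin

# Invented scores for a control group and a treatment group
control = [12, 15, 11, 14, 13, 12, 16, 14]
treatment = [17, 19, 16, 18, 20, 17, 19, 18]

p_value = permutation_test(control, treatment)
low, high = mean_confidence_interval(treatment)
print(f"p-value: {p_value:.4f}")
print(f"95% CI for treatment mean: ({low:.2f}, {high:.2f})")
```

The p-value estimates how often a difference at least as large as the observed one would appear if group labels were assigned purely at random.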

Qualitative Data Analysis

Qualitative data analysis is employed when working with non-numerical data, such as text, interviews, or open-ended survey responses. It focuses on understanding the underlying themes, patterns, and meanings within qualitative data. Qualitative analysis techniques include:

  • Thematic Analysis : Identifying and analyzing recurring themes or patterns within textual data.
  • Content Analysis : Categorizing and coding qualitative data to extract meaningful insights.
  • Grounded Theory : Developing theories or frameworks based on emergent themes from the data.
  • Narrative Analysis : Examining the structure and content of narratives to uncover meaning.

Qualitative data analysis provides a rich and nuanced understanding of complex phenomena and human experiences.
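
As a very rough illustration of content analysis, the sketch below assigns codes to open-ended responses by keyword matching and counts how often each code appears. The codebook, keywords, and responses are invented, and keyword matching is only a crude stand-in for the human judgment that real qualitative coding relies on.

```python
import re
from collections import Counter

# Hypothetical codebook: each code is triggered by a small set of keywords
CODEBOOK = {
    "price": {"price", "expensive", "cheap", "cost"},
    "usability": {"easy", "confusing", "intuitive", "difficult"},
    "support": {"support", "help", "service"},
}

def code_response(text):
    """Return the set of codes whose keywords appear in the response."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return {code for code, keywords in CODEBOOK.items() if words & keywords}

# Invented open-ended survey responses
responses = [
    "The app is easy to use but too expensive.",
    "Customer service was slow to help me.",
    "Setup was confusing and the price keeps going up.",
]

coded = [code_response(r) for r in responses]
theme_counts = Counter(code for codes in coded for code in codes)
print(theme_counts.most_common())
```

In practice, a first pass like this might help surface candidate themes, which researchers would then review, refine, and apply by hand.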

Data Visualization

Data visualization is the art of representing data graphically to make complex information more understandable and accessible. Effective data visualization can reveal patterns, trends, and outliers in your data. Common types of data visualization include:

  • Bar Charts and Histograms : Bar charts display the distribution of categorical or discrete data, while histograms show the distribution of continuous numerical data grouped into bins.
  • Line Charts : Ideal for showing trends and changes in data over time.
  • Scatter Plots : Visualize relationships and correlations between two variables.
  • Pie Charts : Display the composition of a whole in terms of its parts.
  • Heatmaps : Depict patterns and relationships in multidimensional data through color-coding.
  • Box Plots : Provide a summary of the data distribution, including outliers.
  • Interactive Dashboards : Create dynamic visualizations that allow users to explore data interactively.

Data visualization not only enhances your understanding of the data but also serves as a powerful communication tool to convey your findings to others.

As you embark on the data analysis phase of your empirical research, remember that the specific methods and techniques you choose will depend on your research questions, data type, and objectives. Effective data analysis transforms raw data into valuable insights, bringing you closer to the answers you seek.

How to Report Empirical Research Results?

At this stage, you get to share your empirical research findings with the world. Effective reporting and presentation of your results are crucial for communicating your research's impact and insights.

1. Write the Research Paper

Writing a research paper is the culmination of your empirical research journey. It's where you synthesize your findings, provide context, and contribute to the body of knowledge in your field.

  • Title and Abstract : Craft a clear and concise title that reflects your research's essence. The abstract should provide a brief summary of your research objectives, methods, findings, and implications.
  • Introduction : In the introduction, introduce your research topic, state your research questions or hypotheses, and explain the significance of your study. Provide context by discussing relevant literature.
  • Methods : Describe your research design, data collection methods, and sampling procedures. Be precise and transparent, allowing readers to understand how you conducted your study.
  • Results : Present your findings in a clear and organized manner. Use tables, graphs, and statistical analyses to support your results. Avoid interpreting your findings in this section; focus on reporting the data and the outcomes of your analyses.
  • Discussion : Interpret your findings and discuss their implications. Relate your results to your research questions and the existing literature. Address any limitations of your study and suggest avenues for future research.
  • Conclusion : Summarize the key points of your research and its significance. Restate your main findings and their implications.
  • References : Cite all sources used in your research following a specific citation style (e.g., APA, MLA, Chicago). Ensure accuracy and consistency in your citations.
  • Appendices : Include any supplementary material, such as questionnaires, data coding sheets, or additional analyses, in the appendices.

Writing a research paper is a skill that improves with practice. Ensure clarity, coherence, and conciseness in your writing to make your research accessible to a broader audience.

2. Create Visuals and Tables

Visuals and tables are powerful tools for presenting complex data in an accessible and understandable manner.

  • Clarity : Ensure that your visuals and tables are clear and easy to interpret. Use descriptive titles and labels.
  • Consistency : Maintain consistency in formatting, such as font size and style, across all visuals and tables.
  • Appropriateness : Choose the most suitable visual representation for your data. Bar charts, line graphs, and scatter plots work well for different types of data.
  • Simplicity : Avoid clutter and unnecessary details. Focus on conveying the main points.
  • Accessibility : Make sure your visuals and tables are accessible to a broad audience, including those with visual impairments.
  • Captions : Include informative captions that explain the significance of each visual or table.

Compelling visuals and tables enhance the reader's understanding of your research and can be the key to conveying complex information efficiently.

3. Interpret Findings

Interpreting your findings is where you bridge the gap between data and meaning. It's your opportunity to provide context, discuss implications, and offer insights. When interpreting your findings:

  • Relate to Research Questions : Discuss how your findings directly address your research questions or hypotheses.
  • Compare with Literature : Analyze how your results align with or deviate from previous research in your field. What insights can you draw from these comparisons?
  • Discuss Limitations : Be transparent about the limitations of your study. Address any constraints, biases, or potential sources of error.
  • Practical Implications : Explore the real-world implications of your findings. How can they be applied or inform decision-making?
  • Future Research Directions : Suggest areas for future research based on the gaps or unanswered questions that emerged from your study.

Interpreting findings goes beyond simply presenting data; it's about weaving a narrative that helps readers grasp the significance of your research in the broader context.

With your research paper written, structured, and enriched with visuals, and your findings expertly interpreted, you are now prepared to communicate your research effectively. Sharing your insights and contributing to the body of knowledge in your field is a significant accomplishment in empirical research.

Examples of Empirical Research

To solidify your understanding of empirical research, let's delve into some real-world examples across different fields. These examples will illustrate how empirical research is applied to gather data, analyze findings, and draw conclusions.

Social Sciences

In the realm of social sciences, consider a sociological study exploring the impact of socioeconomic status on educational attainment. Researchers gather data from a diverse group of individuals, including their family backgrounds, income levels, and academic achievements.

Through statistical analysis, they can identify correlations and trends, revealing whether individuals from lower socioeconomic backgrounds are less likely to attain higher levels of education. This empirical research helps shed light on societal inequalities and informs policymakers on potential interventions to address disparities in educational access.
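As a minimal sketch of the statistical step described above, the following Python snippet computes a Pearson correlation coefficient between household income and years of education. The figures are invented purely for illustration and the function name `pearson_r` is hypothetical; a real study would use a far larger sample and account for confounding variables.

```python
import math

def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)

# Illustrative, made-up data: annual household income (in thousands)
# and years of completed education for seven respondents.
income = [18, 25, 32, 40, 55, 70, 90]
education_years = [10, 11, 12, 12, 14, 16, 17]

r = pearson_r(income, education_years)
print(f"Pearson r = {r:.2f}")
```

A coefficient close to +1 or -1 indicates a strong linear association, but it does not by itself show that one variable causes the other.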

Environmental Science

Environmental scientists often employ empirical research to assess the effects of environmental changes. For instance, researchers studying the impact of climate change on wildlife might collect data on animal populations, weather patterns, and habitat conditions over an extended period.

By analyzing this empirical data, they can identify correlations between climate fluctuations and changes in wildlife behavior, migration patterns, or population sizes. This empirical research is crucial for understanding the ecological consequences of climate change and informing conservation efforts.

Business and Economics

In the business world, empirical research is essential for making data-driven decisions. Consider a market research study conducted by a business seeking to launch a new product. They collect data through surveys, focus groups, and consumer behavior analysis.

By examining this empirical data, the company can gauge consumer preferences, demand, and potential market size. Empirical research in business helps guide product development, pricing strategies, and marketing campaigns, increasing the likelihood of a successful product launch.

Psychology

Psychological studies frequently rely on empirical research to understand human behavior and cognition. For instance, a psychologist interested in examining the impact of stress on memory might design an experiment. Participants are exposed to stress-inducing situations, and their memory performance is assessed through various tasks.

By analyzing the data collected, the psychologist can determine whether stress has a significant effect on memory recall. This empirical research contributes to our understanding of the complex interplay between psychological factors and cognitive processes.
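One way to check whether such a difference is larger than chance would produce is a permutation test, sketched below in Python. This is an illustrative assumption rather than the method any particular study used: the recall scores are made up, and the function name `permutation_test` and its parameters are hypothetical.

```python
import random

def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test on the difference in group means."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n_a = len(group_a)
    observed = abs(sum(group_a) / n_a - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)
    n_b = len(pooled) - n_a
    extreme = 0
    for _ in range(n_permutations):
        # Reassign scores to the two groups at random and recompute the gap.
        rng.shuffle(pooled)
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return (extreme + 1) / (n_permutations + 1)

# Made-up recall scores (words remembered out of 20).
stress_group = [9, 11, 8, 10, 7, 9, 10, 8]
control_group = [13, 12, 14, 11, 15, 12, 13, 14]

p_value = permutation_test(stress_group, control_group)
print(f"p = {p_value:.4f}")
```

A small p-value suggests the gap between the two groups would rarely arise if stress had no effect; the threshold for "significant" (often 0.05) is a convention the researcher should fix before collecting data.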

These examples highlight the versatility and applicability of empirical research across diverse fields. Whether in medicine, social sciences, environmental science, business, or psychology, empirical research serves as a fundamental tool for gaining insights, testing hypotheses, and driving advancements in knowledge and practice.

Conclusion for Empirical Research

Empirical research is a powerful tool for gaining insights, testing hypotheses, and making informed decisions. By following the steps outlined in this guide, you've learned how to select research topics, collect data, analyze findings, and effectively communicate your research to the world. Remember, empirical research is a journey of discovery, and each step you take brings you closer to a deeper understanding of the world around you. Whether you're a scientist, a student, or someone curious about the process, the principles of empirical research empower you to explore, learn, and contribute to the ever-expanding realm of knowledge.

How to Collect Data for Empirical Research?

Introducing Appinio, the real-time market research platform revolutionizing how companies gather consumer insights for their empirical research endeavors. With Appinio, you can conduct your own market research in minutes, gaining valuable data to fuel your data-driven decisions.

Appinio is more than just a market research platform; it's a catalyst for transforming the way you approach empirical research, making it exciting, intuitive, and seamlessly integrated into your decision-making process.

Here's why Appinio is the go-to solution for empirical research:

  • From Questions to Insights in Minutes : With Appinio's streamlined process, you can go from formulating your research questions to obtaining actionable insights in a matter of minutes, saving you time and effort.
  • Intuitive Platform for Everyone : No need for a PhD in research; Appinio's platform is designed to be intuitive and user-friendly, ensuring that anyone can navigate and utilize it effectively.
  • Rapid Response Times : With an average field time of under 23 minutes for 1,000 respondents, Appinio delivers rapid results, allowing you to gather data swiftly and efficiently.
  • Global Reach with Targeted Precision : With access to over 90 countries and the ability to define target groups based on 1200+ characteristics, Appinio empowers you to reach your desired audience with precision and ease.


Philosophy Institute

Understanding the Empirical Method in Research Methodology


Have you ever wondered how scientists gather evidence to support their theories? Or what steps researchers take to ensure that their findings are reliable and not just based on speculation? The answer lies in a cornerstone of scientific investigation known as the empirical method. This approach to research is all about collecting data and observing the world to form solid, evidence-based conclusions. Let’s dive into the empirical method’s fascinating world and understand why it’s so critical in research methodology.

What is the empirical method?

The empirical method is a way of gaining knowledge by means of direct and indirect observation or experience. It’s fundamentally based on the idea that knowledge comes from sensory experience and can be acquired through observation and experimentation. This method stands in contrast to approaches that rely solely on theoretical or logical means.

The role of observation in the empirical method

Observation is at the heart of the empirical method. It involves using your senses to gather information about the world. This could be as simple as noting the color of a flower or as complex as using advanced technology to observe the behavior of microscopic organisms. The key is that the observations must be systematic and replicable, providing reliable data that can be used to draw conclusions.

Data collection: qualitative and quantitative

Different types of data can be collected using the empirical method:

  • Qualitative data – This data type is descriptive and conceptual, often collected through interviews, observations, and case studies.
  • Quantitative data – This involves numerical data collected through methods like surveys, experiments, and statistical analysis.

Empirical vs. experimental methods

While the empirical method is often associated with experimentation, it’s important to distinguish between the two. Experimental methods involve controlled tests where the researcher manipulates one variable to observe the effect on another. In contrast, the empirical method doesn’t necessarily involve manipulation. Instead, it focuses on observing and collecting data in natural settings, offering a broader understanding of phenomena as they occur in real life.

Why the distinction matters

Understanding the difference between empirical and experimental methods is crucial because it affects how research is conducted and how results are interpreted. Empirical research can provide a more naturalistic view of the subject matter, whereas experimental research can offer more control over variables and potentially more precise outcomes.

The significance of experiential learning

The empirical method has deep roots in experiential learning, which emphasizes learning through experience. This connection is vital because it underlines the importance of engaging with the subject matter at a practical level, rather than just theoretically. It’s a hands-on approach to knowledge that has been valued since the time of Aristotle.

Developing theories from empirical research

One of the most significant aspects of the empirical method is its role in theory development. Researchers collect and analyze data, and from these findings, they can formulate or refine theories. Theories that are supported by empirical evidence tend to be more robust and widely accepted in the scientific community.

Applying the empirical method in various fields

The empirical method is not limited to the natural sciences. It’s used across a range of disciplines, from social sciences to humanities, to understand different aspects of the world. For instance:

  • In psychology, researchers might use the empirical method to observe and record behaviors to understand the underlying mental processes.
  • In sociology, it could involve studying social interactions to draw conclusions about societal structures.
  • In economics, empirical data might be used to test the validity of economic theories or to measure market trends.

Challenges and limitations

Despite its importance, the empirical method has its challenges and limitations. One major challenge is ensuring that observations and data collection are unbiased. Additionally, not all phenomena are easily observable, and some may require more complex or abstract approaches.

The empirical method is a fundamental aspect of research methodology that has stood the test of time. By relying on observation and data collection, it allows researchers to ground their theories in reality, providing a solid foundation for knowledge. Whether it’s used in the hard sciences, social sciences, or humanities, the empirical method continues to be a critical tool for understanding our complex world.

How do you think the empirical method affects the credibility of research findings? And can you think of a situation where empirical methods might be difficult to apply but still necessary for advancing knowledge? Let’s discuss these thought-provoking questions and consider the breadth of the empirical method’s impact on the pursuit of understanding.



What is Empirical Research Study? [Examples & Method]

busayo.longe

The bulk of human decisions relies on evidence, that is, what can be measured or proven as valid. In choosing between plausible alternatives, individuals are more likely to tilt towards the option that is proven to work, and this is the same approach adopted in empirical research. 

In empirical research, the researcher arrives at outcomes by testing empirical evidence using qualitative or quantitative methods of observation, as determined by the nature of the research. An empirical research study is set apart from other research approaches by its methodology and features; hence, it is important for every researcher to know what constitutes this investigation method.

What is Empirical Research? 

Empirical research is a type of research methodology that makes use of verifiable evidence in order to arrive at research outcomes. In other words, this type of research relies solely on evidence obtained through observation or scientific data collection methods.

Empirical research can be carried out using qualitative or quantitative observation methods, depending on the data sample, that is, quantifiable data or non-numerical data. Unlike theoretical research, which depends on preconceived notions about the research variables, empirical research carries out a scientific investigation to measure the experimental probability of the research variables.

Characteristics of Empirical Research

  • Research Questions

An empirical research study begins with a set of research questions that guide the investigation. In many cases, these research questions constitute the research hypothesis, which is tested using qualitative and quantitative methods as dictated by the nature of the research.

In an empirical research study, the research questions are built around the core of the research, that is, the central issue which the research seeks to resolve. They also determine the course of the research by highlighting the specific objectives and aims of the systematic investigation. 

  • Definition of the Research Variables

The research variables are clearly defined in terms of their population, types, characteristics, and behaviors. In other words, the data sample is clearly delimited and placed within the context of the research. 

  • Description of the Research Methodology

An empirical research study also clearly outlines the methods adopted in the systematic investigation. Here, the research process is described in detail, including the selection criteria for the data sample, the qualitative or quantitative research methods, and the testing instruments.

An empirical research report is usually divided into 4 parts: the introduction, methodology, findings, and discussions. The introduction provides a background of the empirical study while the methodology describes the research design, processes, and tools for the systematic investigation.

The findings refer to the research outcomes and they can be outlined as statistical data or in the form of information obtained through the qualitative observation of research variables. The discussions highlight the significance of the study and its contributions to knowledge. 

Uses of Empirical Research

Without any doubt, empirical research is one of the most useful methods of systematic investigation. It can be used for validating multiple research hypotheses in different fields including Law, Medicine, and Anthropology. 

  • Empirical Research in Law : In Law, empirical research is used to study institutions, rules, procedures, and personnel of the law, with a view to understanding how they operate and what effects they have. It makes use of direct methods rather than secondary sources, and this helps you to arrive at more valid conclusions.
  • Empirical Research in Medicine : In medicine, empirical research is used to test and validate multiple hypotheses and increase human knowledge.
  • Empirical Research in Anthropology : In anthropology, empirical research is used as an evidence-based systematic method of inquiry into patterns of human behaviors and cultures. This helps to validate and advance human knowledge.

The Empirical Research Cycle

The empirical research cycle is a 5-phase cycle that outlines the systematic processes for conducting an empirical research study. It was developed by Dutch psychologist A.D. de Groot in the 1940s, and it aligns 5 important stages that can be viewed as deductive approaches to empirical research.

In the empirical research methodological cycle, all processes are interconnected and none of the processes is more important than the other. This cycle clearly outlines the different phases involved in generating the research hypotheses and testing these hypotheses systematically using the empirical data. 

  • Observation: This is the process of gathering empirical data for the research. At this stage, the researcher gathers relevant empirical data using qualitative or quantitative observation methods, and this goes ahead to inform the research hypotheses.
  • Induction: At this stage, the researcher makes use of inductive reasoning in order to arrive at a general probable research conclusion based on his or her observation. The researcher generates a general assumption that attempts to explain the empirical data and s/he goes on to observe the empirical data in line with this assumption.
  • Deduction: This is the deductive reasoning stage. This is where the researcher generates hypotheses by applying logic and rationality to his or her observation.
  • Testing: Here, the researcher puts the hypotheses to test using qualitative or quantitative research methods. In the testing stage, the researcher combines relevant instruments of systematic investigation with empirical methods in order to arrive at objective results that support or negate the research hypotheses.
  • Evaluation: Evaluation is the final stage in an empirical research study. Here, the researcher outlines the empirical data, the research findings, and the supporting arguments, plus any challenges encountered during the research process.

This information is useful for further research. 


Examples of Empirical Research 

  • An empirical research study can be carried out to determine if listening to happy music improves the mood of individuals. The researcher may need to conduct an experiment that involves exposing individuals to happy music to see if this improves their moods.

The findings from such an experiment will provide empirical evidence that confirms or refutes the hypotheses. 

  • An empirical research study can also be carried out to determine the effects of a new drug on specific groups of people. The researcher may expose the research subjects to controlled quantities of the drug and observe the effects over a specific period of time to gather empirical data.
  • Another example of empirical research is measuring the levels of noise pollution found in an urban area to determine the average levels of sound exposure experienced by its inhabitants. Here, the researcher may have to administer questionnaires or carry out a survey in order to gather relevant data based on the experiences of the research subjects.
  • Empirical research can also be carried out to determine the relationship between seasonal migration and the body mass of flying birds. A researcher may need to observe the birds and carry out necessary observation and experimentation in order to arrive at objective outcomes that answer the research question.
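To make the noise pollution example concrete, here is a hedged Python sketch of how average sound exposure could be computed if the researcher had direct sound-level readings in decibels rather than questionnaire responses (an assumption not made in the text above). Because the decibel scale is logarithmic, readings are usually averaged on the energy scale, giving the equivalent continuous level 10·log10(mean(10^(L/10))). The readings and the function name `equivalent_sound_level` are hypothetical.

```python
import math

def equivalent_sound_level(readings_db):
    """Energy-average a list of sound levels given in decibels."""
    if not readings_db:
        raise ValueError("at least one reading is required")
    # Convert each level to relative energy, average, then convert back to dB.
    mean_energy = sum(10 ** (level / 10) for level in readings_db) / len(readings_db)
    return 10 * math.log10(mean_energy)

# Hypothetical readings (dB) taken at two times of day on the same street.
readings = [60.0, 70.0]

leq = equivalent_sound_level(readings)
arithmetic_mean = sum(readings) / len(readings)
print(f"energy-averaged level: {leq:.1f} dB")
print(f"arithmetic mean:       {arithmetic_mean:.1f} dB")
```

The energy-averaged figure sits closer to the louder reading than the plain arithmetic mean does, which is why the choice of averaging method should be stated in the methodology.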

Empirical Research Data Collection Methods

Empirical data can be gathered using qualitative and quantitative data collection methods. Quantitative data collection methods are used for numerical data gathering while qualitative data collection processes are used to gather empirical data that cannot be quantified, that is, non-numerical data. 

The following are common methods of gathering data in empirical research

  • Survey/ Questionnaire

A survey is a method of data gathering that is typically employed by researchers to gather large sets of data from a specific number of respondents with regard to a research subject. This method of data gathering is often used for quantitative data collection, although it can also be deployed during qualitative research.

A survey contains a set of questions that can range from close-ended to open-ended questions together with other question types that revolve around the research subject. A survey can be administered physically or with the use of online data-gathering platforms like Formplus. 

  • Experiment

Empirical data can also be collected by carrying out an experiment. An experiment is a controlled simulation in which one or more of the research variables is manipulated using a set of interconnected processes in order to confirm or refute the research hypotheses.

An experiment is a useful method of measuring causality, that is, cause and effect between dependent and independent variables in a research environment. It is an integral data gathering method in an empirical research study because it involves testing calculated assumptions in order to arrive at the most valid data and research outcomes.

  • Case Study

The case study method is another common data gathering method in an empirical research study. It involves sifting through and analyzing relevant cases and real-life experiences about the research subject or research variables in order to discover in-depth information that can serve as empirical data.

  • Observation

The observational method is a method of qualitative data gathering that requires the researcher to study the behaviors of research variables in their natural environments in order to gather relevant information that can serve as empirical data.

How to collect Empirical Research Data with Questionnaire

With Formplus, you can create a survey or questionnaire for collecting empirical data from your research subjects. Formplus also offers multiple form sharing options so that you can share your empirical research survey to research subjects via a variety of methods.

Here is a step-by-step guide of how to collect empirical data using Formplus:

Sign in to Formplus


In the Formplus builder, you can easily create your empirical research survey by dragging and dropping preferred fields into your form. To access the Formplus builder, you will need to create an account on Formplus. 

Once you do this, sign in to your account and click on “Create Form” to begin.


Edit Form Title

Click on the field provided to input your form title, for example, “Empirical Research Survey”.


Edit Form  

  • Click on the edit button to edit the form.
  • Add Fields: Drag and drop preferred form fields into your form in the Formplus builder inputs column. There are several field input options for survey forms in the Formplus builder.
  • Edit fields
  • Click on “Save”
  • Preview form.


Customize Form

Formplus allows you to add unique features to your empirical research survey form. You can personalize your survey using various customization options. Here, you can add background images, your organization’s logo, and use other styling options. You can also change the display theme of your form. 


Share your Form Link with Respondents

Formplus offers multiple form sharing options that enable you to easily share your empirical research survey form with respondents. You can use the direct social media sharing buttons to share your form link to your organization’s social media pages.

You can send out your survey form as email invitations to your research subjects too. If you wish, you can share your form’s QR code or embed it on your organization’s website for easy access. 


Empirical vs Non-Empirical Research

Empirical and non-empirical research are common methods of systematic investigation employed by researchers. Unlike empirical research that tests hypotheses in order to arrive at valid research outcomes, non-empirical research theorizes the logical assumptions of research variables. 

Definition: Empirical research is a research approach that makes use of evidence-based data while non-empirical research is a research approach that makes use of theoretical data. 

Method: In empirical research, the researcher arrives at valid outcomes by mainly observing research variables, creating a hypothesis and experimenting on research variables to confirm or refute the hypothesis. In non-empirical research, the researcher relies on inductive and deductive reasoning to theorize logical assumptions about the research subjects.

The major difference between the research methodology of empirical and non-empirical research is while the assumptions are tested in empirical research, they are entirely theorized in non-empirical research. 

Data Sample: Empirical research makes use of empirical data while non-empirical research does not make use of empirical data. Empirical data refers to information that is gathered through experience or observation. 

Unlike empirical research, theoretical or non-empirical research does not rely on data gathered through evidence. Rather, it works with logical assumptions and beliefs about the research subject. 

Data Collection Methods: Empirical research makes use of quantitative and qualitative data gathering methods which may include surveys, experiments, and methods of observation. This helps the researcher to gather empirical data, that is, data backed by evidence.

Non-empirical research, on the other hand, does not make use of qualitative or quantitative methods of data collection. Instead, the researcher gathers relevant data through critical studies, systematic review and meta-analysis.

Advantages of Empirical Research 

  • Empirical research is flexible. In this type of systematic investigation, the researcher can adjust the research methodology including the data sample size, data gathering methods plus the data analysis methods as necessitated by the research process.
  • It helps the researcher to understand how the research outcomes can be influenced by different research environments.
  • Empirical research study helps the researcher to develop relevant analytical and observation skills that can be useful in dynamic research contexts.
  • This type of research approach allows the researcher to control multiple research variables in order to arrive at the most relevant research outcomes.
  • Empirical research is widely considered as one of the most authentic and competent research designs.
  • It improves the internal validity of traditional research using a variety of experiments and research observation methods.

Disadvantages of Empirical Research 

  • An empirical research study is time-consuming because the researcher needs to gather the empirical data from multiple resources which typically takes a lot of time.
  • It is not a cost-effective research approach. Usually, this method of research incurs a lot of cost because of the monetary demands of the field research.
  • It may be difficult to gather the needed empirical data sample because of the multiple data gathering methods employed in an empirical research study.
  • It may be difficult to gain access to some communities and firms during the data gathering process and this can affect the validity of the research.
  • The report from an empirical research study is intensive and can be very lengthy in nature.

Conclusion 

Empirical research is an important method of systematic investigation because it gives the researcher the opportunity to test the validity of different assumptions, in the form of hypotheses, before arriving at any findings. Hence, it is a more research approach. 

Different quantitative and qualitative methods of data gathering are employed during an empirical research study, depending on the purpose of the research. These include surveys, experiments, and various observational methods. Surveys are one of the most common methods of empirical data collection, and they can be administered online or physically.

You can use Formplus to create and administer your online empirical research survey. Formplus allows you to create survey forms that you can share with target respondents in order to obtain valuable feedback about your research context, question or subject. 

In the form builder, you can add different fields to your survey form and you can also modify these form fields to suit your research process. Sign up to Formplus to access the form builder and start creating powerful online empirical research survey forms. 

  • busayo.longe

Research Problems and Hypotheses in Empirical Research

Empirical Research in Education

Assumptions and Problems

In the previous chapters we have reviewed the history of social science research and introduced some of the basic principles on which empirical research, or LPE, in education is based. In this chapter we turn our attention toward identifying how the principles of logical positivism, when applied to education research, are ineffective for strengthening the discipline.

In the following sections, we address some of the most problematic assumptions involved in carrying out empirical research in education and grapple with several related problems. Some of the major assumptions in social science research that promote positivistic or scientific principles in educational research include the following claims that we deconstruct within the course of our discussion:

Educational researchers, like physical scientists, are detached from their objects of study in that their personal preferences and biases are excluded from their subject matter, observations, and attending analyses.

Investigations of educational phenomena can be conducted in a value-neutral fashion, with the researcher eliminating all personal bias and preconceptions and employing language that expresses objectivity. In other words, there is objectivity and conceptual clarity in describing the studied phenomena within genuine scientific inquiry.

Educational research, like the physical sciences, is nomothetic – that is, it is possible to extrapolate from educational research data laws that apply generally across numerous classroom and schooling contexts. In education, this assumption is particularly crucial since the search for the holy grail of some universal, but of course entirely illusive, instructional design drives much of the empirical investigation within the field. Two researchers working in different contexts who employ the same experimental method ought to arrive at the same conclusion. As we demonstrate in this chapter, within education this outcome is simply not the case.

We will demonstrate that each of these scientific principles, or assumptions, is fundamentally flawed when applied to educational research. Hence, education research is once again unable to meet the minimal standards of meaningful scientific inquiry. Later in this chapter we will also discuss the conceptual confusions that impact negatively on education. Finally, we examine how an implicit commitment to the direct reference theory of language, and the related search for conceptual certainty, leads to ontological errors about certain education concepts and how these errors affect student academic experience.

(2007). Empirical Research in Education. In: Scientism and Education. Springer, Dordrecht. https://doi.org/10.1007/978-1-4020-6678-8_4

Print ISBN : 978-1-4020-6677-1

Online ISBN : 978-1-4020-6678-8

eBook Packages : Humanities, Social Sciences and Law Education (R0)

  • Perspective
  • Published: 05 June 2024

Misunderstanding the harms of online misinformation

  • Ceren Budak (ORCID: orcid.org/0000-0002-7767-3217)
  • Brendan Nyhan (ORCID: orcid.org/0000-0001-7497-1799)
  • David M. Rothschild (ORCID: orcid.org/0000-0002-7792-1989)
  • Emily Thorson (ORCID: orcid.org/0000-0002-6514-801X)
  • Duncan J. Watts (ORCID: orcid.org/0000-0001-5005-4961)

Nature volume 630, pages 45–53 (2024)

The controversy over online misinformation and social media has opened a gap between public discourse and scientific research. Public intellectuals and journalists frequently make sweeping claims about the effects of exposure to false content online that are inconsistent with much of the current empirical evidence. Here we identify three common misperceptions: that average exposure to problematic content is high, that algorithms are largely responsible for this exposure and that social media is a primary cause of broader social problems such as polarization. In our review of behavioural science research on online misinformation, we document a pattern of low exposure to false and inflammatory content that is concentrated among a narrow fringe with strong motivations to seek out such information. In response, we recommend holding platforms accountable for facilitating exposure to false and extreme content in the tails of the distribution, where consumption is highest and the risk of real-world harm is greatest. We also call for increased platform transparency, including collaborations with outside researchers, to better evaluate the effects of online misinformation and the most effective responses to it. Taking these steps is especially important outside the USA and Western Europe, where research and data are scant and harms may be more severe.
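
The distinction the abstract draws between low average exposure and heavy consumption in the tails can be made concrete with a little arithmetic. The sketch below is only an illustration under stated assumptions: the population of 1,000 users and their exposure counts are invented for the example and do not come from the article, and `tail_share` is a hypothetical helper name.

```python
def tail_share(exposures, top_fraction):
    """Share of total exposure accounted for by the top `top_fraction` of users."""
    ranked = sorted(exposures, reverse=True)
    k = max(1, round(len(ranked) * top_fraction))
    total = sum(ranked)
    return sum(ranked[:k]) / total if total else 0.0


# Hypothetical data (made up for illustration, not taken from the article):
# 950 users see no problematic items, 40 see one item each,
# and 10 heavy consumers see 100 items each.
exposures = [0] * 950 + [1] * 40 + [100] * 10

mean_exposure = sum(exposures) / len(exposures)           # average per user
median_exposure = sorted(exposures)[len(exposures) // 2]  # typical user
top_1pct_share = tail_share(exposures, 0.01)              # concentration in the tail

print(f"mean exposure per user: {mean_exposure:.2f}")
print(f"median exposure per user: {median_exposure}")
print(f"share of exposure from top 1% of users: {top_1pct_share:.1%}")
```

In this made-up population the typical user sees nothing, while a handful of heavy consumers account for nearly all exposure. That is the kind of pattern a headline total or a simple average would hide.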

Myers, S. L. How social media amplifies misinformation more than information. The New York Times , https://www.nytimes.com/2022/10/13/technology/misinformation-integrity-institute-report.html (13 October 2022).

Haidt, J. Why the past 10 years of American life have been uniquely stupid. The Atlantic , https://www.theatlantic.com/magazine/archive/2022/05/social-media-democracy-trust-babel/629369/ (11 April 2022).

Haidt, J. Yes, social media really is undermining democracy. The Atlantic , https://www.theatlantic.com/ideas/archive/2022/07/social-media-harm-facebook-meta-response/670975/ (28 July 2022).

Tufekci, Z. YouTube, the great radicalizer. The New York Times , https://www.nytimes.com/2018/03/10/opinion/sunday/youtube-politics-radical.html (10 March 2018).

Romer, P. A tax that could fix big tech. The New York Times , https://www.nytimes.com/2019/05/06/opinion/tax-facebook-google.html (6 May 2019).

Schnell, M. Clyburn blames polarization on the advent of social media. The Hill , https://thehill.com/homenews/sunday-talk-shows/580440-clyburn-says-polarization-is-at-its-worst-because-the-advent-of/ (7 November 2021).

Robert F. Kennedy Human Rights/AP-NORC Poll (AP/NORC, 2023).

Goeas, E. & Nienaber, B. Battleground Poll 65: Civility in Politics: Frustration Driven by Perception (Tarrance Group, 2019).

Murray, M. Poll: Nearly two-thirds of Americans say social media platforms are tearing us apart. NBC News , https://www.nbcnews.com/politics/meet-the-press/poll-nearly-two-thirds-americans-say-social-media-platforms-are-n1266773 (2021).

Auxier, B. 64% of Americans say social media have a mostly negative effect on the way things are going in the U.S. today. Pew Research Center (2020).

Koomey, J. G. et al. Sorry, wrong number: the use and misuse of numerical facts in analysis and media reporting of energy issues. Annu. Rev. Energy Env. 27 , 119–158 (2002).

Gonon, F., Bezard, E. & Boraud, T. Misrepresentation of neuroscience data might give rise to misleading conclusions in the media: the case of attention deficit hyperactivity disorder. PLoS ONE 6 , e14618 (2011).

Copenhaver, A., Mitrofan, O. & Ferguson, C. J. For video games, bad news is good news: news reporting of violent video game studies. Cyberpsychol. Behav. Soc. Netw. 20 , 735–739 (2017).

Bratton, L. et al. The association between exaggeration in health-related science news and academic press releases: a replication study. Wellcome Open Res. 4 , 148 (2019).

Allcott, H., Braghieri, L., Eichmeyer, S. & Gentzkow, M. The welfare effects of social media. Am. Econ. Rev. 110 , 629–676 (2020).

Braghieri, L., Levy, R. & Makarin, A. Social media and mental health. Am. Econ. Rev. 112 , 3660–3693 (2022).

Guess, A. M., Barberá, P., Munzert, S. & Yang, J. The consequences of online partisan media. Proc. Natl Acad. Sci. USA 118 , e2013464118 (2021).

Sabatini, F. & Sarracino, F. Online social networks and trust. Soc. Indic. Res. 142 , 229–260 (2019).

Lorenz-Spreen, P., Lewandowsky, S., Sunstein, C. R. & Hertwig, R. How behavioural sciences can promote truth, autonomy and democratic discourse online. Nat. Hum. Behav. 4 , 1102–1109 (2020). This paper provides a review of possible harms from social media .

Lapowsky, I. The mainstream media melted down as fake news festered. Wired , https://www.wired.com/2016/12/2016-mainstream-media-melted-fake-news-festered/ (26 December 2016).

Lalani, F. & Li, C. Why So Much Harmful Content Has Proliferated Online—and What We Can Do about It Technical Report (World Economic Forum, 2020).

Stewart, E. America’s growing fake news problem, in one chart. Vox , https://www.vox.com/policy-and-politics/2020/12/22/22195488/fake-news-social-media-2020 (22 December 2020).

Sanchez, G. R., Middlemass, K. & Rodriguez, A. Misinformation Is Eroding the Public’s Confidence in Democracy (Brookings Institution, 2022).

Bond, S. False Information Is Everywhere. ‘Pre-bunking’ Tries to Head It off Early. NPR , https://www.npr.org/2022/10/28/1132021770/false-information-is-everywhere-pre-bunking-tries-to-head-it-off-ear (National Public Radio, 2022).

Tufekci, Z. Algorithmic harms beyond Facebook and Google: emergent challenges of computational agency. Colo. Tech. Law J. 13 , 203 (2015).

Cohen, J. N. Exploring echo-systems: how algorithms shape immersive media environments. J. Media Lit. Educ. 10 , 139–151 (2018).

Shin, J. & Valente, T. Algorithms and health misinformation: a case study of vaccine books on Amazon. J. Health Commun. 25 , 394–401 (2020).

Ceylan, G., Anderson, I. A. & Wood, W. Sharing of misinformation is habitual, not just lazy or biased. Proc. Natl Acad. Sci. USA 120 , e2216614120 (2023).

Pauwels, L., Brion, F. & De Ruyver, B. Explaining and Understanding the Role of Exposure to New Social Media on Violent Extremism. An Integrative Quantitative and Qualitative Approach (Belgian Science Policy, 2014).

McHugh, B. C., Wisniewski, P., Rosson, M. B. & Carroll, J. M. When social media traumatizes teens: the roles of online risk exposure, coping, and post-traumatic stress. Internet Res. 28 , 1169–1188 (2018).

Soral, W., Liu, J. & Bilewicz, M. Media of contempt: social media consumption predicts normative acceptance of anti-Muslim hate speech and Islamo-prejudice. Int. J. Conf. Violence 14 , 1–13 (2020).

Many believe misinformation is increasing extreme political views and behaviors. AP-NORC https://apnorc.org/projects/many-believe-misinformation-is-increasing-extreme-political-views-an (2022).

Fandos, N., Kang, C. & Isaac, M. Tech executives are contrite about election meddling, but make few promises on Capitol Hill. The New York Times , https://www.nytimes.com/2017/10/31/us/politics/facebook-twitter-google-hearings-congress.html (31 October 2017).

Eady, G., Paskhalis, T., Zilinsky, J., Bonneau, R., Nagler, J. & Tucker, J. A. Exposure to the Russian Internet Research Agency foreign influence campaign on Twitter in the 2016 US election and its relationship to attitudes and voting behavior. Nat. Commun. 14 , 62 (2023). This paper shows that exposure to Russian misinformation on social media in 2016 was a small portion of people’s news diets and not associated with shifting attitudes.

Badawy, A., Addawood, A., Lerman, K. & Ferrara, E. Characterizing the 2016 Russian IRA influence campaign. Soc. Netw. Anal. Min. 9 , 31 (2019). This paper shows that exposure to and amplification of Russian misinformation on social media in 2016 was concentrated among Republicans (who would have been predisposed to support Donald Trump regardless) .

Hosseinmardi, H., Ghasemian, A., Clauset, A., Mobius, M., Rothschild, D. M. & Watts, D. J. Examining the consumption of radical content on YouTube. Proc. Natl Acad. Sci. USA 118 , e2101967118 (2021). This paper shows that extreme content is consumed on YouTube by a small portion of the population who tend to consume similar content elsewhere online and that consumption is largely driven by demand, not algorithms .

Chen, A. Y., Nyhan, B., Reifler, J., Robertson, R. E. & Wilson, C. Subscriptions and external links help drive resentful users to alternative and extremist YouTube channels. Sci. Adv. 9 , eadd8080 (2023). This paper shows that people who consume extremist content on YouTube have highly resentful attitudes and typically find the content through subscriptions and external links, not algorithmic recommendations to non-subscribers .

Munger, K. & Phillips, J. Right-wing YouTube: a supply and demand perspective. Int. J. Press Polit. 27 , 186–219 (2022).

Lasser, J., Aroyehun, S. T., Simchon, A., Carrella, F., Garcia, D. & Lewandowsky, S. Social media sharing of low-quality news sources by political elites. PNAS Nexus 1 , pgac186 (2022).

Muddiman, A., Budak, C., Murray, C., Kim, Y. & Stroud, N. J. Indexing theory during an emerging health crisis: how U.S. TV news indexed elite perspectives and amplified COVID-19 misinformation. Ann. Int. Commun. Assoc. 46 , 174–204 (2022). This paper shows how mainstream media also spreads misinformation through amplification of misleading statements from elites .

Pereira, F. B. et al. Detecting misinformation: identifying false news spread by political leaders in the Global South. Preprint at OSF , https://doi.org/10.31235/osf.io/hu4qr (2022).

Horwitz, J. & Seetharaman, D. Facebook executives shut down efforts to make the site less divisive. Wall Street Journal , https://www.wsj.com/articles/facebook-knows-it-encourages-division-top-executives-nixed-solutions-11590507499 (26 May 2020).

Hosseinmardi, H., Ghasemian, A., Rivera-Lanas, M., Horta Ribeiro, M., West, R. & Watts, D. J. Causally estimating the effect of YouTube’s recommender system using counterfactual bots. Proc. Natl Acad. Sci. USA 121 , e2313377121 (2024).

Nyhan, B. et al. Like-minded sources on Facebook are prevalent but not polarizing. Nature 620 , 137–144 (2023).

Guess, A. M. et al. How do social media feed algorithms affect attitudes and behavior in an election campaign? Science 381 , 398–404 (2023). This paper shows that algorithms supply less untrustworthy content than reverse chronological feeds .

Asimovic, N., Nagler, J., Bonneau, R. & Tucker, J. A. Testing the effects of Facebook usage in an ethnically polarized setting. Proc. Natl Acad. Sci. USA 118 , e2022819118 (2021).

Allen, J., Mobius, M., Rothschild, D. M. & Watts, D. J. Research note: Examining potential bias in large-scale censored data. Harv. Kennedy Sch. Misinformation Rev. 2 , https://doi.org/10.37016/mr-2020-74 (2021). This paper shows that engagement metrics such as clicks and shares that are regularly used in popular and academic research do not take into account the fact that fake news is clicked and shared at a higher rate relative to exposure and viewing than non-fake news .

Scheuerman, M. K., Jiang, J. A., Fiesler, C. & Brubaker, J. R. A framework of severity for harmful content online. Proc. ACM Hum. Comput. Interact. 5 , 1–33 (2021).

Vosoughi, S., Roy, D. & Aral, S. The spread of true and false news online. Science 359 , 1146–1151 (2018).

Roy, D. Happy to see the extensive coverage of our science paper on spread of true and false news online, but over-interpretations of the scope of our study prompted me to diagram actual scope (caution, not to scale!). Twitter , https://twitter.com/dkroy/status/974251282071474177 (15 March 2018).

Greenemeier, L. You can’t handle the truth—at least on Twitter. Scientific American , https://www.scientificamerican.com/article/you-cant-handle-the-truth-at-least-on-twitter/ (8 March 2018).

Frankel, S. Deceptively edited video of Biden proliferates on social media. The New York Times , https://www.nytimes.com/2020/11/02/technology/biden-video-edited.html (2 November 2020).

Pu, J. et al. Deepfake videos in the wild: analysis and detection. In Proc. Web Conference 2021 981–992 (International World Wide Web Conference Committee, 2021).

Widely Viewed Content Report: What People See on Facebook: Q1 2023 Report (Facebook, 2023).

Mayer, J. How Russia helped swing the election for Trump. The New Yorker , https://www.newyorker.com/magazine/2018/10/01/how-russia-helped-to-swing-the-election-for-trump (24 September 2018).

Jamieson, K. H. Cyberwar: How Russian Hackers and Trolls Helped Elect A President: What We Don’t, Can’t, and Do Know (Oxford Univ. Press, 2020).

Solon, O. & Siddiqui, S. Russia-backed Facebook posts ‘reached 126m Americans’ during US election. The Guardian , https://www.theguardian.com/technology/2017/oct/30/facebook-russia-fake-accounts-126-million (30 October 2017).

Watts, D. J. & Rothschild, D. M. Don’t blame the election on fake news. Blame it on the media. Columbia J. Rev. 5 , https://www.cjr.org/analysis/fake-news-media-election-trump.php (2017). This paper explores how seemingly large exposure levels to problematic content actually represent a small proportion of total news exposure .

Jie, Y. Frequency or total number? A comparison of different presentation formats on risk perception during COVID-19. Judgm. Decis. Mak. 17 , 215–236 (2022).

Reyna, V. F. & Brainerd, C. J. Numeracy, ratio bias, and denominator neglect in judgments of risk and probability. Learn. Individ. Differ. 18 , 89–107 (2008). This paper details research into how salient numbers can lead to confusion in judgements of risk and probability, such as denominator neglect in which people fixate on a large numerator and do not consider the appropriate denominator .

Jones, J. Americans: much misinformation, bias, inaccuracy in news. Gallup , https://news.gallup.com/opinion/gallup/235796/americans-misinformation-bias-inaccuracy-news.aspx (2018).

Grinberg, N., Joseph, K., Friedland, L., Swire-Thompson, B. & Lazer, D. Fake news on Twitter during the 2016 US presidential election. Science 363 , 374–378 (2019).

Guess, A. M., Nyhan, B. & Reifler, J. Exposure to untrustworthy websites in the 2016 US election. Nat. Hum. Behav. 4 , 472–480 (2020). This paper shows untrustworthy news exposure was relatively rare in US citizens’ web browsing in 2016 .

Altay, S., Nielsen, R. K. & Fletcher, R. Quantifying the “infodemic”: people turned to trustworthy news outlets during the 2020 coronavirus pandemic. J. Quant. Descr. Digit. Media 2 , 1–30 (2022).

Allen, J., Howland, B., Mobius, M., Rothschild, D. & Watts, D. J. Evaluating the fake news problem at the scale of the information ecosystem. Sci. Adv. 6 , eaay3539 (2020). This paper shows that exposure to fake news is a vanishingly small part of people’s overall news diets when you take television into account .

Guess, A. M., Nyhan, B., O’Keeffe, Z. & Reifler, J. The sources and correlates of exposure to vaccine-related (mis)information online. Vaccine 38 , 7799–7805 (2020). This paper shows how a small portion of the population accounts for the vast majority of exposure to vaccine-sceptical content .

Chong, D. & Druckman, J. N. Framing public opinion in competitive democracies. Am. Polit. Sci. Rev. 101 , 637–655 (2007).

Arendt, F. Toward a dose-response account of media priming. Commun. Res. 42 , 1089–1115 (2015). This paper shows that people may need repeated exposure to information for it to affect their attitudes .

Arceneaux, K., Johnson, M. & Murphy, C. Polarized political communication, oppositional media hostility, and selective exposure. J. Polit. 74 , 174–186 (2012).

Feldman, L. & Hart, P. Broadening exposure to climate change news? How framing and political orientation interact to influence selective exposure. J. Commun. 68 , 503–524 (2018).

Druckman, J. N. Political preference formation: competition, deliberation, and the (ir)relevance of framing effects. Am. Polit. Sci. Rev. 98 , 671–686 (2004).

Bakshy, E., Messing, S. & Adamic, L. A. Exposure to ideologically diverse news and opinion on Facebook. Science 348 , 1130–1132 (2015).

Bozarth, L., Saraf, A. & Budak, C. Higher ground? How groundtruth labeling impacts our understanding of fake news about the 2016 U.S. presidential nominees. In Proc. International AAAI Conference on Web and Social Media Vol. 14, 48–59 (Association for the Advancement of Artificial Intelligence, 2020).

Gerber, A. S., Gimpel, J. G., Green, D. P. & Shaw, D. R. How large and long-lasting are the persuasive effects of televised campaign ads? Results from a randomized field experiment. Am. Polit. Sci. Rev. 105 , 135–150 (2011). This paper shows that the effect of news decays rapidly; news needs repeated exposure for long-term impact .

Hill, S. J., Lo, J., Vavreck, L. & Zaller, J. How quickly we forget: the duration of persuasion effects from mass communication. Polit. Commun. 30 , 521–547 (2013). This paper shows that the effect of persuasive advertising decays rapidly, necessitating repeated exposure for lasting effect .

Larsen, M. V. & Olsen, A. L. Reducing bias in citizens’ perception of crime rates: evidence from a field experiment on burglary prevalence. J. Polit. 82 , 747–752 (2020).

Roose, K. What if Facebook is the real ‘silent majority’? The New York Times , https://www.nytimes.com/2020/08/28/us/elections/what-if-facebook-is-the-real-silent-majority.html (27 August 2020).

Breland, A. A new report shows how Trump keeps buying Facebook ads. Mother Jones , https://www.motherjones.com/politics/2021/07/real-facebook-oversight-board/ (28 July 2021).

Marchal, N., Kollanyi, B., Neudert, L.-M. & Howard, P. N. Junk News during the EU Parliamentary Elections: Lessons from A Seven-language Study of Twitter and Facebook (Univ. Oxford, 2019).

Ellison, N. B., Trieu, P., Schoenebeck, S., Brewer, R. & Israni, A. Why we don’t click: interrogating the relationship between viewing and clicking in social media contexts by exploring the “non-click”. J. Comput. Mediat. Commun. 25 , 402–426 (2020).

Pennycook, G., Epstein, Z., Mosleh, M., Arechar, A. A., Eckles, D. & Rand, D. G. Shifting attention to accuracy can reduce misinformation online. Nature 592 , 590–595 (2021).

Ghezae, I. et al. Partisans neither expect nor receive reputational rewards for sharing falsehoods over truth online. Open Science Framework https://osf.io/5jwgd/ (2023).

Guess, A. M. et al. Reshares on social media amplify political news but do not detectably affect beliefs or opinions. Science 381 , 404–408 (2023).

Godel, W. et al. Moderating with the mob: evaluating the efficacy of real-time crowdsourced fact-checking. J. Online Trust Saf. 1 , https://doi.org/10.54501/jots.v1i1.15 (2021).

Rogers, K. Facebook’s algorithm is broken. We collected some suggestions on how to fix it. FiveThirtyEight , https://fivethirtyeight.com/features/facebooks-algorithm-is-broken-we-collected-some-spicy-suggestions-on-how-to-fix-it/ (16 November 2021).

Roose, K. The making of a YouTube radical. The New York Times , https://www.nytimes.com/interactive/2019/06/08/technology/youtube-radical.html (8 June 2019).

Eslami, M. et al. First I “like” it, then I hide it: folk theories of social feeds. In Proc. 2016 CHI Conference on Human Factors in Computing Systems 2371–2382 (Association for Computing Machinery, 2016).

Silva, D. E., Chen, C. & Zhu, Y. Facets of algorithmic literacy: information, experience, and individual factors predict attitudes toward algorithmic systems. New Media Soc. https://doi.org/10.1177/14614448221098042 (2022).

Eckles, D. Algorithmic Transparency and Assessing Effects of Algorithmic Ranking. Testimony before the Senate Subcommittee on Communications, Media, and Broadband , https://www.commerce.senate.gov/services/files/62102355-DC26-4909-BF90-8FB068145F18 (U.S. Senate Committee on Commerce, Science, and Transportation, 2021).

Kantrowitz, A. Facebook removed the news feed algorithm in an experiment. Then it gave up. OneZero , https://onezero.medium.com/facebook-removed-the-news-feed-algorithm-in-an-experiment-then-it-gave-up-25c8cb0a35a3 (25 October 2021).

Ribeiro, M. H., Hosseinmardi, H., West, R. & Watts, D. J. Deplatforming did not decrease Parler users’ activity on fringe social media. PNAS Nexus 2 , pgad035 (2023). This paper shows that shutting down Parler just displaced user activity to other fringe social media websites .

Alfano, M., Fard, A. E., Carter, J. A., Clutton, P. & Klein, C. Technologically scaffolded atypical cognition: the case of YouTube’s recommender system. Synthese 199 , 835–858 (2021).

Huszár, F. et al. Algorithmic amplification of politics on Twitter. Proc. Natl Acad. Sci. USA 119 , e2025334119 (2022).

Levy, R. Social media, news consumption, and polarization: evidence from a field experiment. Am. Econ. Rev. 111 , 831–870 (2021).

Cho, J., Ahmed, S., Hilbert, M., Liu, B. & Luu, J. Do search algorithms endanger democracy? An experimental investigation of algorithm effects on political polarization. J. Broadcast. Electron. Media 64 , 150–172 (2020).

Lewandowsky, S., Robertson, R. E. & DiResta, R. Challenges in understanding human-algorithm entanglement during online information consumption. Perspect. Psychol. Sci. https://doi.org/10.1177/17456916231180809 (2023).

Narayanan, A. Understanding Social Media Recommendation Algorithms (Knight First Amendment Institute at Columbia University, 2023).

Finkel, E. J. et al. Political sectarianism in America. Science 370 , 533–536 (2020).

Auxier, B. & Anderson, M. Social Media Use in 2021 (Pew Research Center, 2021).

Frimer, J. A. et al. Incivility is rising among American politicians on Twitter. Soc. Psychol. Personal. Sci. 14 , 259–269 (2023).

Broderick, R. & Darmanin, J. The “yellow vest” riots in France are what happens when Facebook gets involved with local news. Buzzfeed News , https://www.buzzfeednews.com/article/ryanhatesthis/france-paris-yellow-jackets-facebook (2018).

Salzberg, S. De-platform the disinformation dozen. Forbes , https://www.forbes.com/sites/stevensalzberg/2021/07/19/de-platform-the-disinformation-dozen/ (2021).

Karell, D., Linke, A., Holland, E. & Hendrickson, E. “Born for a storm”: hard-right social media and civil unrest. Am. Soc. Rev. 88 , 322–349 (2023).

Smith, N. & Graham, T. Mapping the anti-vaccination movement on Facebook. Inf. Commun. Soc. 22 , 1310–1327 (2019).

Brady, W. J., McLoughlin, K., Doan, T. N. & Crockett, M. J. How social learning amplifies moral outrage expression in online social networks. Sci. Adv. 7 , eabe5641 (2021).

Suhay, E., Bello-Pardo, E. & Maurer, B. The polarizing effects of online partisan criticism: evidence from two experiments. Int. J. Press Polit. 23 , 95–115 (2018).

Aruguete, N., Calvo, E. & Ventura, T. Network activated frames: content sharing and perceived polarization in social media. J. Commun. 73 , 14–24 (2023).

Nordbrandt, M. Affective polarization in the digital age: testing the direction of the relationship between social media and users’ feelings for out-group parties. New Media Soc. 25 , 3392–3411 (2023). This paper shows that affective polarization predicts media use, not the other way around .

AFP. Street protests, a French tradition par excellence. The Local https://www.thelocal.fr/20181205/revolutionary-tradition-the-story-behind-frances-street-protests (2018).

Spier, R. E. Perception of risk of vaccine adverse events: a historical perspective. Vaccine 20 , S78–S84 (2001). This article documents the history of untrustworthy information about vaccines, which long predates social media .

Bryant, L. V. The YouTube algorithm and the alt-right filter bubble. Open Inf. Sci. 4 , 85–90 (2020).

Sismeiro, C. & Mahmood, A. Competitive vs. complementary effects in online social networks and news consumption: a natural experiment. Manage. Sci. 64 , 5014–5037 (2018).

Fergusson, L. & Molina, C. Facebook Causes Protests Documento CEDE No. 41 , https://doi.org/10.2139/ssrn.3553514 (2019).

Lu, Y., Wu, J., Tan, Y. & Chen, J. Microblogging replies and opinion polarization: a natural experiment. MIS Q. 46 , 1901–1936 (2022).

Porter, E. & Wood, T. J. The global effectiveness of fact-checking: evidence from simultaneous experiments in Argentina, Nigeria, South Africa, and the United Kingdom. Proc. Natl Acad. Sci. USA 118 , e2104235118 (2021).

Arechar, A. A. et al. Understanding and combatting misinformation across 16 countries on six continents. Nat. Hum. Behav. 7 , 1502–1513 (2023).

Blair, R. A. et al. Interventions to Counter Misinformation: Lessons from the Global North and Applications to the Global South (USAID Development Experience Clearinghouse, 2023).

Haque, M. M. et al. Combating misinformation in Bangladesh: roles and responsibilities as perceived by journalists, fact-checkers, and users. Proc. ACM Hum. Comput. Interact. 4 , 1–32 (2020).

Humprecht, E., Esser, F. & Van Aelst, P. Resilience to online disinformation: a framework for cross-national comparative research. Int. J. Press Polit. 25 , 493–516 (2020).

Gillum, J. & Elliott, J. Sheryl Sandberg and top Facebook execs silenced an enemy of Turkey to prevent a hit to the company’s business. ProPublica , https://www.propublica.org/article/sheryl-sandberg-and-top-facebook-execs-silenced-an-enemy-of-turkey-to-prevent-a-hit-to-their-business (24 February 2021).

Nord, M. et al. Democracy Report 2024: Democracy Winning and Losing at the Ballot V-Dem Report (Univ. Gothenburg V-Dem Institute, 2024).

Alba, D. How Duterte used Facebook to fuel the Philippine drug war. Buzzfeed , https://www.buzzfeednews.com/article/daveyalba/facebook-philippines-dutertes-drug-war (4 September 2018).

Zakrzewski, C., De Vynck, G., Masih, N. & Mahtani, S. How Facebook neglected the rest of the world, fueling hate speech and violence in India. Washington Post , https://www.washingtonpost.com/technology/2021/10/24/india-facebook-misinformation-hate-speech/ (24 October 2021).

Simonite, T. Facebook is everywhere; its moderation is nowhere close. Wired , https://www.wired.com/story/facebooks-global-reach-exceeds-linguistic-grasp/ (21 October 2021).

Cruz, J. C. B. & Cheng, C. Establishing baselines for text classification in low-resource languages. Preprint at https://arxiv.org/abs/2005.02068 (2020). This paper shows one of the challenges that makes content moderation costlier in less resourced countries .

Müller, K. & Schwarz, C. Fanning the flames of hate: social media and hate crime. J. Eur. Econ. Assoc. 19 , 2131–2167 (2021).

Bursztyn, L., Egorov, G., Enikolopov, R. & Petrova, M. Social Media and Xenophobia: Evidence from Russia (National Bureau of Economic Research, 2019).

Lewandowsky, S., Jetter, M. & Ecker, U. K. H. Using the President’s tweets to understand political diversion in the age of social media. Nat. Commun. 11 , 5764 (2020).

Bursztyn, L., Rao, A., Roth, C. P. & Yanagizawa-Drott, D. H. Misinformation During a Pandemic (National Bureau of Economic Research, 2020).

Motta, M. & Stecula, D. Quantifying the effect of Wakefield et al. (1998) on skepticism about MMR vaccine safety in the US. PLoS ONE 16 , e0256395 (2021).

Sanderson, Z., Brown, M. A., Bonneau, R., Nagler, J. & Tucker, J. A. Twitter flagged Donald Trump’s tweets with election misinformation: they continued to spread both on and off the platform. Harv. Kennedy Sch. Misinformation Rev. 2 , https://doi.org/10.37016/mr-2020-77 (2021).

Anhalt-Depies, C., Stenglein, J. L., Zuckerberg, B., Townsend, P. A. & Rissman, A. R. Tradeoffs and tools for data quality, privacy, transparency, and trust in citizen science. Biol. Conserv. 238 , 108195 (2019).

Gerber, N., Gerber, P. & Volkamer, M. Explaining the privacy paradox: a systematic review of literature investigating privacy attitude and behavior. Comput. Secur. 77 , 226–261 (2018). This paper explores the trade-offs between privacy and research .

Isaak, J. & Hanna, M. J. User data privacy: Facebook, Cambridge Analytica, and privacy protection. Computer 51 , 56–59 (2018).

Vogus, C. Independent Researcher Access to Social Media Data: Comparing Legislative Proposals (Center for Democracy and Technology, 2022).

Xie, Y. “Undemocracy”: inequalities in science. Science 344 , 809–810 (2014).

Nielsen, M. W. & Andersen, J. P. Global citation inequality is on the rise. Proc. Natl Acad. Sci. USA 118 , e2012208118 (2021).

King, D. A. The scientific impact of nations. Nature 430 , 311–316 (2004).

Zaugg, I. A., Hossain, A. & Molloy, B. Digitally-disadvantaged languages. Internet Policy Rev. 11 , 1–11 (2022).

Zaugg, I. A. in Digital Inequalities in the Global South (eds Ragnedda, M. & Gladkova, A.) 247–267 (Springer, 2020).

Sablosky, J. Dangerous organizations: Facebook’s content moderation decisions and ethnic visibility in Myanmar. Media Cult. Soc. 43 , 1017–1042 (2021). This paper highlights the challenges of content moderation in the Global South .

Warofka, A. An independent assessment of the human rights impact of Facebook in Myanmar. Facebook Newsroom , https://about.fb.com/news/2018/11/myanmar-hria/ (2018).

Fick, M. & Dave, P. Facebook’s flood of languages leave it struggling to monitor content. Reuters , https://www.reuters.com/article/idUSKCN1RZ0DL/ (23 April 2019).

Newman, N. Executive Summary and Key Findings of the 2020 Report (Reuters Institute for the Study of Journalism, 2020).

Hilbert, M. The bad news is that the digital access divide is here to stay: domestically installed bandwidths among 172 countries for 1986–2014. Telecommun. Policy 40 , 567–581 (2016).

Traynor, I. Internet governance too US-centric, says European commission. The Guardian , https://www.theguardian.com/technology/2014/feb/12/internet-governance-us-european-commission (12 February 2014).

Pennycook, G., Cannon, T. D. & Rand, D. G. Prior exposure increases perceived accuracy of fake news. J. Exp. Psychol. Gen. 147 , 1865–1880 (2018).

Guess, A. M. et al. “Fake news” may have limited effects beyond increasing beliefs in false claims. Kennedy Sch. Misinformation Rev. 1 , https://doi.org/10.37016/mr-2020-004 (2020).

Loomba, S., de Figueiredo, A., Piatek, S. J., de Graaf, K. & Larson, H. J. Measuring the impact of COVID-19 vaccine misinformation on vaccination intent in the UK and USA. Nat. Hum. Behav. 5 , 337–348 (2021).

Lorenz-Spreen, P., Oswald, L., Lewandowsky, S. & Hertwig, R. Digital media and democracy: a systematic review of causal and correlational evidence worldwide. Nat. Hum. Behav. 7 , 74–101 (2023). This paper provides a review of evidence on social media effects .

Donato, K. M., Singh, L., Arab, A., Jacobs, E. & Post, D. Misinformation about COVID-19 and Venezuelan migration: trends in Twitter conversation during a pandemic. Harvard Data Sci. Rev. 4 , https://doi.org/10.1162/99608f92.a4d9a7c7 (2022).

Wieczner, J. Big lies vs. big lawsuits: why Dominion Voting is suing Fox News and a host of Trump allies. Fortune , https://fortune.com/longform/dominion-voting-lawsuits-fox-news-trump-allies-2020-election-libel-conspiracy-theories/ (2 April 2021).

Calma, J. Twitter just closed the book on academic research. The Verge https://www.theverge.com/2023/5/31/23739084/twitter-elon-musk-api-policy-chilling-academic-research (2023).

Edelson, L., Graef, I. & Lancieri, F. Access to Data and Algorithms: for an Effective DMA and DSA Implementation (Centre on Regulation in Europe, 2023).


Author information

Authors and affiliations.

University of Michigan School of Information, Ann Arbor, MI, USA

Ceren Budak

Department of Government, Dartmouth College, Hanover, NH, USA

Brendan Nyhan

Microsoft Research, New York, NY, USA

David M. Rothschild

Maxwell School of Citizenship and Public Affairs, Syracuse University, Syracuse, NY, USA

Emily Thorson

Department of Computer and Information Science, Annenberg School of Communication, and Operations, Information, and Decisions Department, University of Pennsylvania, Philadelphia, PA, USA

Duncan J. Watts


Contributions

C.B., B.N., D.M.R., E.T. and D.J.W. wrote and revised the paper. D.M.R. collected the data and prepared Fig. 1 .

Corresponding author

Correspondence to David M. Rothschild .

Ethics declarations

Competing interests.

The authors declare no competing interests, but provide the following information in the interests of transparency and full disclosure. C.B. and D.J.W. previously worked for Microsoft Research and D.M.R. currently works for Microsoft Research. B.N. has received grant funding from Meta. B.N. and E.T. are participants in the US 2020 Facebook and Instagram Election Study as independent academic researchers. D.J.W. has received funding from Google Research. D.M.R. and D.J.W. both previously worked at Yahoo!.

Peer review

Peer review information.

Nature thanks Stephan Lewandowsky, David Rand, Emma Spiro and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article

Cite this article.

Budak, C., Nyhan, B., Rothschild, D.M. et al. Misunderstanding the harms of online misinformation. Nature 630 , 45–53 (2024). https://doi.org/10.1038/s41586-024-07417-w


Received : 13 October 2021

Accepted : 11 April 2024

Published : 05 June 2024

Issue Date : 06 June 2024

DOI : https://doi.org/10.1038/s41586-024-07417-w


  • Open access
  • Published: 03 June 2024

High-dimensional mediation analysis for continuous outcome with confounders using overlap weighting method in observational epigenetic study

  • Weiwei Hu 1 ,
  • Shiyu Chen 1 ,
  • Jiaxin Cai 1 ,
  • Yuhui Yang 1 ,
  • Hong Yan 1 &
  • Fangyao Chen 1 , 2  

BMC Medical Research Methodology volume 24, Article number: 125 (2024)


Background

Mediation analysis is a powerful tool for identifying factors that mediate the causal pathway from exposure to health outcomes, and it has been extended to settings with a large number of potential mediators in high-dimensional data. Confounding is inevitable in observational studies, so adjusting for potential confounders is an essential part of high-dimensional mediation analysis (HDMA). Propensity score (PS) based methods such as propensity score regression adjustment (PSR) and inverse probability weighting (IPW) have been proposed to tackle this problem, but an extreme propensity score distribution can bias the estimates of these methods.

Methods

In this article, we integrated the overlap weighting (OW) technique into the HDMA workflow and proposed a concise and powerful high-dimensional mediation analysis procedure consisting of OW confounding adjustment, sure independence screening (SIS), de-biased Lasso penalization, and joint-significance testing under the mixture null distribution. We compared the proposed method with the existing method consisting of PS-based confounding adjustment, SIS, minimax concave penalty (MCP) variable selection, and classical joint-significance testing.

Results

Simulation studies demonstrate that the proposed procedure has the best performance in mediator selection and estimation: it yielded the highest true positive rate, an acceptable false discovery proportion, and a lower mean square error. In the empirical study based on the GSE117859 dataset in the Gene Expression Omnibus database, the proposed method suggested that smoking history may reduce the estimated natural killer (NK) cell level through the mediation effect of some methylation markers, mainly the methylation sites cg13917614 in the CNP gene and cg16893868 in the LILRA2 gene.

Conclusions

The proposed method has higher power, sufficient false discovery rate control, and precise mediation effect estimation, and it remains feasible to implement in the presence of confounders. Hence, our method is worth considering in HDMA studies.


The analysis of the mediating effect was first proposed by Baron and Kenny (1986) [ 1 ] and has been broadly applied in many scientific fields, such as psychological, sociological, and biomedical studies [ 2 , 3 , 4 ]. Mediation analysis has become a powerful tool to investigate the underlying mechanism of environmental exposures on health outcomes and to identify the factors mediating the effect of exposures on outcomes [ 5 ]. Currently, analytical methods including the single mediator model [ 6 , 7 ], the multiple-mediators model [ 8 ], and the high-dimensional mediation model [ 9 ] have been proposed and are available to researchers in many scientific fields.

With the development of advanced data collection techniques, high-dimensional data has become common in biomedical research. For example, in epigenetic studies, the Illumina Infinium HumanMethylation450 BeadChip array platform measures the DNA methylation levels of roughly 480 K probes [ 10 ] and thus generates high-dimensional data. As a practical example, smoking affects lung function, and some DNA methylation sites may mediate the effect of smoking on lung function [ 11 , 12 ]. To identify the significant mediators (CpG sites) between smoking and lung function, we can conduct mediation analysis on the collected high-dimensional data [ 9 , 13 , 14 ]. The same approach can be used to identify methylation sites mediating the association between other environmental factors and other health outcomes, including physical signs and diseases.

However, high-dimensional mediation analysis (HDMA) also faces several issues, such as the curse of dimensionality, the false positive rate inflation caused by multiplicity, and the confounding present in observational research. To overcome these issues, scholars have proposed a series of statistical methods. Zhang et al. [ 9 ] proposed the HIMA model, which consists of variable screening based on sure independence screening (SIS), variable selection based on minimax concave penalty (MCP) estimation, and a joint significance test. HIMA extends the multiple mediator framework to the high-dimensional setting by incorporating variable screening and variable selection techniques into multiple mediation analysis. Subsequent high-dimensional mediation analysis methods also follow this generic procedure [ 13 , 14 , 15 , 16 ], which reduces the dimensionality from a high to a moderate or low scale and then conducts multiple mediation tests. For example, the HIMA2 procedure proposed by Perera et al. [ 17 ] employs an SIS step based on the indirect effect of every single mediator, applies the de-biased Lasso to obtain more accurate estimates, and then uses the multiple-testing procedure proposed by James et al. [ 18 ] to control the false discovery rate. Moreover, to adjust for confounders in observational epigenetic studies, researchers have integrated the propensity score (PS) into the high-dimensional mediation model, either by weighting or by including it as a covariate [ 14 , 16 ], as alternatives to classic regression adjustment.

Although much work has been done to tackle these problems, issues remain in dimensionality reduction and adjustment for confounders. Previous high-dimensional mediation methods such as HDMA, HIMA, and HIMA2 [ 5 , 9 , 17 ] do not handle confounders specifically and simply include them as covariates [ 15 , 19 ]. However, a multivariable model cannot adequately account for confounding effects in the presence of a large number of confounders [ 20 ]. If confounding is controlled only during the mediation test but not in the dimension reduction stage, a biased variable selection result may be obtained [ 14 ]. Thus, it is necessary to adjust for confounders to improve the performance of variable selection.

To address this issue, researchers have adopted PS-based methods, including PS regression adjustment (termed PSR) and classical PS weighting (also called inverse probability weighting, IPW), to adjust for confounding during both stages [ 14 ]. However, IPW adjustment still faces the issue of extreme weights caused by an extreme PS distribution [ 21 , 22 ]. To address this, Li et al. [ 23 ] proposed the overlap weighting (OW) method, which emphasizes individuals with the most overlap in their observed characteristics and helps provide a consistent estimator of the effect of exposure on outcome in the presence of extreme PS tails. OW is a PS-based weighting method for confounding adjustment and is gaining popularity because of its excellent statistical properties [ 24 , 25 ]. However, OW has so far been applied only in traditional epidemiological analysis and needs to be extended to mediation analysis and the high-dimensional data setting. Besides, most existing methods assume independence between potential mediators, which is hard to ensure in high-dimensional epigenetic data analysis [ 5 , 9 , 13 , 14 , 15 , 18 ].

In this article, we incorporated the OW method into the HIMA [ 9 ] and HIMA2 [ 17 ] models, respectively. To improve the accuracy of screening potential mediators, we modified the variable screening framework of the original HIMA2 procedure. Eventually, we proposed the OW-based modified HIMA2 (mHIMA2) procedure for HDMA. We evaluated the performance of the proposed procedure and the existing models through simulation studies and a real data application.

The rest of the article is structured as follows. In the next section, we introduce the notation, assumptions, models, and the procedure for confounder adjustment in the high-dimensional mediation analysis model. Then, we conduct a Monte Carlo simulation study to evaluate the performance of various confounding adjustment methods and two different mediation test approaches. Additionally, we apply the proposed method to the dataset GSE117859 in the Gene Expression Omnibus (GEO) database and identify some DNA methylation markers that mediate the effect of smoking on the estimated natural killer (NK) cell level. Finally, we summarize the advantages and limitations of this study.

Model definitions

Our high-dimensional mediation model is shown in Fig. 1. Let \(X\) be the exposure variable, where \(X=1\) represents the exposed group and \(X=0\) represents the control group. Denote the outcome as \(Y\); here we mainly focus on a continuous outcome. Let \(M={\left({M}_{1},{M}_{2},\cdots ,{M}_{p}\right)}^{T}\) be the set of \(p\)-dimensional potential mediators, where \(p\gg n\) and \(n\) is the sample size. Let \(C={\left({C}_{1},{C}_{2},\cdots ,{C}_{q}\right)}^{T}\) be the \(q\)-dimensional baseline confounders, which influence the exposure-mediator, mediator-outcome, and exposure-outcome relations. For individual \(i\), \(i=1,2,\cdots ,n\), we have the high-dimensional mediation models as follows:

Fig. 1. Causal diagram. High-dimensional mediation model with confounders between exposure, mediator and outcome

where \(\alpha ={\left({\alpha }_{1},{\alpha }_{2},\cdots ,{\alpha }_{p}\right)}^{T}\) is the coefficient vector relating the exposure to the mediators, \(\beta ={\left({\beta }_{1},{\beta }_{2},\cdots ,{\beta }_{p}\right)}^{T}\) represents the effect of the mediators on the outcome, \({\alpha }_{k}{\beta }_{k}\) corresponds to the mediation effect of \({M}_{k}\) according to the product-of-coefficients definition, and \(\left[p\right]\) denotes the set \(\left\{1,2,\cdots ,p\right\}\). Whether \({M}_{k}\) is a statistically significant mediator can be assessed by testing the null hypothesis \({H}_{0}:{\alpha }_{k}{\beta }_{k}=0\). \({\phi }_{k}\) and \(\eta\) are the effects of \(C\) on \(M\) and of \(C\) on \(Y\), respectively. \({a}_{k}\) and \(a\) are the intercept terms in Eqs. 1 and 2, respectively. Likewise, \({e}_{k}\) and \(\epsilon\) are the corresponding error terms. We will compare the different variable selection strategies and methods of adjusting for confounders.
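The display equations referred to as Eqs. 1 and 2 did not survive in this copy of the text. A form consistent with the symbols defined above can be sketched as follows; it is a reconstruction under assumptions, not the authors' typeset equations. In particular, \(\gamma\) is assumed here to denote the direct effect of \(X\) on \(Y\), in line with the later use of \({\gamma }_{ipw}\) and \({\gamma }_{ow}\).

```latex
\begin{align}
M_{k} &= a_{k} + \alpha_{k} X + \phi_{k}^{T} C + e_{k}, \qquad k \in [p], \tag{1} \\
Y &= a + \gamma X + \beta^{T} M + \eta^{T} C + \epsilon. \tag{2}
\end{align}
```

Under this form, the product \({\alpha }_{k}{\beta }_{k}\) is the effect transmitted along the path \(X\to {M}_{k}\to Y\).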

Assumptions

To ensure the identification of path-specific mediating effects, the assumptions below need to hold. They follow the necessary conditions for high-dimensional mediation analysis suggested in published studies [ 8 , 15 , 17 , 19 , 26 , 27 ]:

A1: There is no causal association between mediators. This means the proposed model contains only parallel mediators.

A2: Sequential ignorability. That consists of four assumptions listed below:

(A2.1) There are no unmeasured confounders between the exposure and the outcome;

(A2.2) There are no unmeasured confounders between the mediators and the outcome;

(A2.3) There are no unmeasured confounders between the exposure and the mediators;

(A2.4) There is no exposure-induced confounding between the mediators and the outcome.

A3: Stable unit treatment value assumption (SUTVA) [ 28 , 29 ] for both the mediators and the outcome. That is to say, there is no interference between individuals.

A4: Consistency for the mediators and the outcome. That is to say, there are no measurement errors in the mediators.

A5: Positivity assumption [ 30 ]. Every individual has some positive probability of being exposed to the factor of interest.

Proposed Procedure

We improved the HIMA procedure proposed by Zhang et al. (2016) [ 9 ] and the HIMA2 procedure proposed by Perera et al. (2022) [ 17 ] under the condition of adjusting confounders in observational data.

In this study, we developed two processes to conduct the confounding-controlled high-dimensional mediation analysis. The detailed procedure is described in the following text.

Step 1: PS-based methods for adjusting confounders

Since there are always some baseline confounders in observational data, we integrate the propensity score (PS) into the mediator (and/or outcome) models to reduce selection bias and to obtain estimates of the mediation effect that are as accurate as possible. Because PS approaches allow a large number of confounders to be included, the PS is widely used in observational research.

PS is defined as the conditional probability that a study individual with baseline covariates \(C=\left({C}_{1},{C}_{2},\cdots ,{C}_{l}\right)\) would be exposed to certain study factors of interest [ 31 ]:

PS can be estimated by classic multivariable statistical methods such as logistic regression [ 32 ] or by machine learning methods such as random forest (RF) and generalized boosted model (GBM) [ 33 , 34 ]. In practice, logistic regression is the most commonly used. The PS of \(i\) th individual \({\pi }_{i}=P\left({X}_{i}=1|{C}_{1i},\cdots ,{C}_{li}\right)\) can be expressed as follows:

where \(\theta ={\left({\theta }_{1},{\theta }_{2},\cdots ,{\theta }_{l}\right)}^{T}\) represents the effect of the confounders on the exposure. Then we can adopt PS-based techniques to adjust for confounders, such as matching [ 35 ], stratification [ 36 ], regression [ 31 ], and weighting [ 37 ]. Here, we focus on the PS regression (PSR) and PS weighting [ 14 ] (PSW, also called IPW, short for inverse probability weighting) techniques to adjust for potential confounders between exposure, mediators and outcome.
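As a minimal sketch of the logistic-regression form of the PS described above, the function below maps one individual's confounder values to a probability of exposure. The intercept term, the function name and the numeric coefficients are assumptions made for illustration; in practice \(\theta\) would be estimated from the data.

```python
import math


def propensity_score(confounders, theta, intercept=0.0):
    """Logistic-model PS, P(X = 1 | C), for a single individual.

    `theta` holds one coefficient per confounder; `intercept` is an
    assumed extra term that the text does not spell out.
    """
    linear = intercept + sum(t * c for t, c in zip(theta, confounders))
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical coefficients and confounder values, for illustration only.
print(propensity_score([0.5, -1.2], theta=[0.8, 0.3], intercept=-0.1))
```

Whatever the coefficients, the result lies strictly between 0 and 1, which is what the weighting schemes in this step rely on.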

The PSR approach incorporates the PS as a covariate into the original regression model to adjust for the probability of being exposed to the study factors and to reduce confounding [ 32 ]. This is similar to taking all confounders as covariates in a classical regression approach, which usually uses the linear regression model for continuous outcomes and the logistic regression model for binary outcomes [ 38 ]. For the PSR approach, we can estimate the effect through the models below:

The PSW approach constructs inverse probability weights by taking the reciprocal of the PS. For binary exposure, the weight in the exposed group \(X=1\) is \(\frac{1}{PS}\), and that in the control group \(X=0\) is \(\frac{1}{1-PS}\). For the \(i\) th individual:

Then, we can estimate the coefficients of X in pathways \(X\to M\) and \(M\to Y\) by weighted estimation:

where \({\alpha }_{k,ipw}\) and \({\gamma }_{ipw}\) are the weighted estimates obtained with the \(ipw\) weight vector. However, IPW often suffers from extreme PSs, which may lead to extreme weights and result in biased estimates and excessive variance [ 23 , 24 ].
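The weight definition above can be illustrated with a short sketch. The function name and the example propensity scores are hypothetical; the point is only to show how a PS near 0 or 1 inflates the inverse probability weight.

```python
def ipw_weight(x, ps):
    """Inverse probability weight: 1/PS if exposed (x == 1), 1/(1 - PS) otherwise."""
    if not 0.0 < ps < 1.0:
        raise ValueError("propensity score must lie strictly between 0 and 1")
    return 1.0 / ps if x == 1 else 1.0 / (1.0 - ps)


# Hypothetical individuals: two with moderate PS, two with extreme PS.
exposures = [1, 0, 1, 0]
scores = [0.50, 0.50, 0.02, 0.98]
weights = [ipw_weight(x, ps) for x, ps in zip(exposures, scores)]
print(weights)
```

In this toy example the two individuals with extreme scores receive weights roughly 25 times larger than those with a PS of 0.5, so a handful of such observations can dominate a weighted estimate.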

The overlap weighting (OW) approach was proposed to address the issue of extreme PSs [ 23 ]. The overlap weight is given as \(1-PS\) for the group \(X=1\) and \(PS\) for the group \(X=0\). Note that individuals with a \(PS\) of 0.5 make the largest contribution to the effect estimate, and individuals with a \(PS\) close to 0 or 1 make the smallest contribution. OW is likely to be beneficial in the presence of extreme tail weights [ 23 , 39 ]. For individual \(i\):

Then, the effect estimation of OW is similar to that of the PSW procedure:

In the same way, \({\alpha }_{k,ow}\) and \({\gamma }_{ow}\) are the weighted estimation using \(ow\) weight vector.
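A minimal sketch of overlap weighting for one mediator follows. It relies on a standard fact: with a binary exposure and an intercept, the weighted least-squares coefficient of \(X\) equals the difference between the weighted group means. The function names and toy data are assumptions for illustration and do not reproduce the authors' implementation.

```python
def overlap_weight(x, ps):
    """Overlap weight: 1 - PS if exposed (x == 1), PS otherwise."""
    return 1.0 - ps if x == 1 else ps


def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def ow_exposure_effect(values, exposures, scores):
    """Overlap-weighted mean of `values` in X = 1 minus that in X = 0."""
    weights = [overlap_weight(x, ps) for x, ps in zip(exposures, scores)]
    exposed = [(v, w) for v, x, w in zip(values, exposures, weights) if x == 1]
    control = [(v, w) for v, x, w in zip(values, exposures, weights) if x == 0]
    mean_exposed = weighted_mean(*zip(*exposed))
    mean_control = weighted_mean(*zip(*control))
    return mean_exposed - mean_control


# Toy mediator values with hypothetical propensity scores.
mediator = [2.0, 4.0, 1.0, 3.0]
exposure = [1, 1, 0, 0]
ps = [0.5, 0.9, 0.5, 0.1]
print(ow_exposure_effect(mediator, exposure, ps))
```

Unlike inverse probability weights, overlap weights are bounded between 0 and 1, so no single individual can receive an arbitrarily large weight.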

Step 2: Confounding-controlled SIS approach for dimensionality reduction

The SIS procedure is a general technique for reducing a high dimensionality to below the sample size [ 40 ]. We adopt the SIS method to reduce the dimension \(p\) from an ultra-high scale to a moderate scale \(d=\left[\frac{2n}{\text{log}\left(n\right)}\right]\) [ 9 , 15 ].
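To give a sense of scale, the target dimension can be computed directly from the sample size. The sketch below reads the square brackets as the floor function and the logarithm as the natural logarithm; both readings are assumptions, since the text does not state the rounding rule.

```python
import math


def sis_dimension(n):
    """Target dimension d = [2n / log(n)], with [.] read as floor (an assumption)."""
    return math.floor(2 * n / math.log(n))


print(sis_dimension(300))
```

For a sample of 300 individuals this keeps about a hundred candidate mediators, well below the sample size.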

In this study, we considered two preliminary screening strategies as described in HIMA [ 9 ] and HIMA2 [ 17 ], based on the effects of \(M\) on \(Y\) ( \({\beta }_{k}\) ) and the indirect effect \(\left|{\alpha }_{k}{\beta }_{k}\right|\) respectively. Because the indirect effects can be both positive and negative effects, to address the influence of the signs of the estimated indirect effects, the HIMA2 approach uses the absolute values of the indirect effect to obtain the size of the effect estimate regardless of the direction. This approach ensures that mediators with large effect size can be selected.

Due to the lack of screening accuracy in SIS based on indirect effects in the presence of confounders, we conducted the SIS screening based on the effects on the path \(M\to Y\) controlling confounding effects using the OW approach.

In simulation, we found that it is hard to select the true mediators based on \(\left|{\alpha }_{k}{\beta }_{k}\right|\), as applied in the original HIMA2 approach, in the presence of confounding factors. We therefore modified the framework of the HIMA2 method so that both procedures adopt SIS based on the effects \({\beta }_{k}\) on the path \(M\to Y\) in the preliminary screening, selecting the subset of potential mediators \({M}_{SIS}=\left\{{M}_{k}:{M}_{k}\text{ is among the top }d\text{ largest effects of }{\beta }_{k}\right\}\).
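The selection of \({M}_{SIS}\) amounts to ranking the mediators and keeping the top \(d\). The sketch below assumes that the marginal estimates of \({\beta }_{k}\) have already been computed and that "largest effect" means largest absolute value; the function name and the numbers are hypothetical.

```python
def screen_top_d(beta_estimates, d):
    """Return the indices of the d mediators with the largest |beta_k|."""
    ranked = sorted(
        range(len(beta_estimates)),
        key=lambda k: abs(beta_estimates[k]),
        reverse=True,
    )
    return sorted(ranked[:d])


# Hypothetical marginal estimates for four mediators.
print(screen_top_d([0.10, -0.90, 0.05, 0.40], d=2))
```

Ranking on absolute values keeps mediators with strong negative effects, which a ranking on signed values would discard.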

Note that a two-step weighting method [ 14 ] is needed to estimate \({\beta }_{k}\) for the PSW and OW methods.

First, \({\gamma }_{k,w}\) can be obtained from the following sub-model:

where \({\widehat{\gamma }}_{k,w}\) is the estimator of \({\gamma }_{k,ow}\) or \({\gamma }_{k,ipw}\) for each \({M}_{k}\) . In addition, the residual \({\widehat{e}}_{k}\) can be derived:

Then \({\beta }_{k}\) can be estimated by regressing \({\widehat{e}}_{k}\) on \({M}_{k}\) without weighting. Through the above SIS procedure, we can identify the important mediators and achieve the goal of dimensionality reduction.

Step 3: Penalized estimation

According to the HIMA procedure, after the preliminary selection of candidate mediators, further variable selection can be accomplished by the penalized estimation method. Here, we adopt the MCP [ 41 ] rather than other penalty functions, since the MCP approach has the oracle property which can select the correct model with probability tending to 1 as \(n\to \infty\) [ 15 , 41 , 42 ].

For the \(d\)-dimensional subset \({M}_{SIS}\), we employed MCP-penalized estimation to further select the set of significant mediators \({M}_{MCP}=\left\{{M}_{k}:{\beta }_{k}\ne 0,{M}_{k}\in {M}_{SIS}\right\}\). The MCP penalty function is defined as below:

where \(\lambda >0\) is the regularization parameter, which can be selected by AIC or BIC, and \(\delta >0\) is the tuning parameter that determines the concavity of MCP. The MCP procedure can be implemented through the R package ncvreg [ 43 ]. By combining the SIS and MCP procedures, we filtered out the mediators whose effects were too weak and retained a small number of mediators to be tested, which helps to obtain more accurate effect estimates.
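The penalty formula itself is missing from this copy of the text. As a hedged illustration, the function below implements the standard MCP form from the literature, \(\lambda |\beta |-{\beta }^{2}/(2\delta )\) for \(|\beta |\le \delta \lambda\) and \(\delta {\lambda }^{2}/2\) otherwise, using the symbols \(\lambda\) and \(\delta\) from the paragraph above. It is an assumption that the authors' equation takes exactly this form.

```python
def mcp_penalty(beta, lam, delta):
    """Standard minimax concave penalty evaluated at a single coefficient."""
    b = abs(beta)
    if b <= delta * lam:
        # Lasso-like near zero, with the penalization rate tapering off.
        return lam * b - b * b / (2.0 * delta)
    # Constant beyond delta * lam: large coefficients incur no extra shrinkage.
    return 0.5 * delta * lam * lam


for value in (0.0, 1.0, 3.0, 10.0):
    print(value, mcp_penalty(value, lam=1.0, delta=3.0))
```

The flat region for large coefficients is what distinguishes MCP from the Lasso, whose penalty keeps growing linearly with the coefficient.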

Following the original HIMA2 procedure, the penalized estimation adopts the de-biased Lasso method to get the estimator \({\widehat{\beta }}_{k}\) and standard error \({\widehat{\sigma }}_{{\beta }_{k}}\) . The sub-model of the de-biased Lasso method can be described below:

where \({\beta }_{SIS}\) denotes the effects of \({M}_{k}\in {M}_{SIS}\) on \(Y\). The corresponding P-values \({P}_{{\beta }_{k}}\) are given as:

where \({\Phi }\left(.\right)\) is the cumulative distribution function of standard normal distribution \(N\left(\text{0,1}\right)\) . The de-biased Lasso method can be implemented with the R package hdi .
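The P-value formula is not reproduced in this copy. A common choice for an estimator with a normal limiting distribution is the two-sided Wald P-value \(2\left(1-{\Phi }\left(|{\widehat{\beta }}_{k}|/{\widehat{\sigma }}_{{\beta }_{k}}\right)\right)\); the sketch below assumes that form, and the function names are illustrative.

```python
import math


def normal_cdf(z):
    """Cumulative distribution function of the standard normal N(0, 1)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def two_sided_p_value(beta_hat, sigma_hat):
    """Two-sided Wald P-value for H0: beta = 0 (assumed form)."""
    z = abs(beta_hat) / sigma_hat
    return 2.0 * (1.0 - normal_cdf(z))


# Hypothetical estimate and standard error.
print(two_sided_p_value(beta_hat=0.42, sigma_hat=0.15))
```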

Step 4: PS-based multiple mediation test

After MCP-based penalized estimation, we use the joint significance test [ 3 , 44 ] (termed JS-uniform) to test the mediation effect of \({M}_{k}\in {M}_{MCP}\). The joint significance test considers \({M}_{k}\) a true mediator when \({\alpha }_{k}\) and \({\beta }_{k}\) are both significant. Here, \({\alpha }_{k}\) can be estimated through the different confounding adjustment methods shown in Eqs. 1 , 3 , 4 , and 5 . \({\beta }_{k}\) can be obtained using linear regression with either all confounders or only the PS (a summary of all confounders) included as covariates.

In other words, the test is based on the P-values for testing the path-specific null hypotheses \({H}_{0}:{\alpha }_{k}=0\) and \({H}_{0}:{\beta }_{k}=0\). The raw P-value for the joint significance test [ 3 ] is defined below:

\({P}_{raw,k}=\text{max}\left({P}_{raw,{\alpha }_{k}},{P}_{raw,{\beta }_{k}}\right),\)

where \({P}_{raw,{\alpha }_{k}}\) and \({P}_{raw,{\beta }_{k}}\) are the P-values for testing \({H}_{0}:{\alpha }_{k}=0\) and \({H}_{0}:{\beta }_{k}=0\). \({P}_{raw,{\alpha }_{k}}\) and \({P}_{raw,{\beta }_{k}}\) can be obtained from the mediator model (e.g. Eqs. 1 , 3 , 4 , and 5 ) and the outcome model (Eq. 2 ), respectively.

For the multiplicity (Type I error inflation) issue in multiple mediation testing, we adopted the Benjamini–Hochberg (BH) method [ 45 , 46 ] to acquire the adjusted \(p\) -values as below,

where \(q\) is the number of potential mediators in the set \({M}_{MCP}\), and \({r}_{k}\) is the rank of \({P}_{raw,k}\) when all the P-values \({P}_{raw,k}\) are sorted in ascending order.
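A minimal Python sketch of the standard BH step-up adjustment is given here for orientation. It scales each raw P-value by \(q/{r}_{k}\) and then enforces monotonicity from the largest P-value downwards, which is what R's `p.adjust(method = "BH")` does; whether the authors apply the monotonicity step is an assumption, and the function name is illustrative.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    q = len(p_values)
    # Indices of the P-values sorted in ascending order (rank 1 = smallest).
    order = sorted(range(q), key=lambda i: p_values[i])
    adjusted = [0.0] * q
    running_min = 1.0
    # Walk from the largest P-value down so adjusted values stay monotone.
    for rank in range(q, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * q / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw joint-significance P-values for four mediators.
p_adj = bh_adjust([0.01, 0.04, 0.03, 0.005])
```

Mediators whose adjusted P-value falls below the chosen level (0.05 in this study) would be declared significant.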

However, the joint significance test assumes that \({P}_{raw,k}\) follows a uniform null distribution. Although \({P}_{{\alpha }_{k}}\) and \({P}_{{\beta }_{k}}\) are each uniformly distributed, their maximum may not be. Therefore, the joint significance test results in a valid but overly conservative test with lower power [ 13 , 17 , 47 ].
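The conservativeness can be checked numerically. The Python sketch below assumes that both paths are null and that the two P-values are independent and uniform (assumptions made for illustration); in that case the probability that their maximum falls below a threshold \(t\) is \({t}^{2}\) rather than \(t\).

```python
import random


def max_p_rejection_rate(t, n_draws=200_000, seed=0):
    """Monte Carlo estimate of Pr(max(U1, U2) <= t) for independent uniforms."""
    rng = random.Random(seed)
    hits = sum(
        1 for _ in range(n_draws) if max(rng.random(), rng.random()) <= t
    )
    return hits / n_draws


# At t = 0.05 the estimate should be close to 0.05 ** 2 = 0.0025, not 0.05.
rate = max_p_rejection_rate(0.05)
```

Treating the maximum as uniform therefore overstates the null rejection probability for this component by a factor of roughly \(1/t\), which is the source of the power loss.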

Hence, we adopt the PS-based joint significance test with mixture null distribution [ 18 ] (termed JS-mixture) to conduct the multiple mediation test after de-biased Lasso penalized estimation [ 17 , 48 ], following the classical HIMA2 procedure. The PS-based JS-mixture approach adopts a 3-component mixture distribution as below:

The estimated pointwise FDR for testing mediation can be computed as:

where \(t\in \left[0,1\right]\), \({V}_{00}\left(t\right),{V}_{01}\left(t\right),{V}_{10}\left(t\right)\) denote the numbers of the three types of false positives, and \(R\left(t\right)={V}_{00}\left(t\right)+{V}_{01}\left(t\right)+{V}_{10}\left(t\right)+{V}_{11}\left(t\right)\). The quantities \({V}_{00}\left(t\right),{V}_{01}\left(t\right),{V}_{10}\left(t\right)\) and \(\widehat{FDR}\left(t\right)\) can be obtained using the R package HDMT.
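To convey the idea of the mixture-null FDR, the following Python sketch replaces the three false-positive counts with simple upper bounds: a mediator with both paths null passes the threshold with probability \({t}^{2}\), and a mediator with exactly one null path passes with probability at most \(t\). The null proportions `pi00`, `pi01` and `pi10` are hypothetical inputs here (HDMT estimates them from the data), and this approximation is not the HDMT implementation.

```python
def approx_mixture_fdr(p_max, pi00, pi01, pi10, t):
    """Rough FDR estimate at threshold t under a 3-component composite null.

    p_max : list of max-P statistics, one per candidate mediator
    pi00  : assumed proportion with alpha = 0 and beta = 0
    pi01, pi10 : assumed proportions with exactly one null path
    """
    m = len(p_max)
    rejections = sum(1 for p in p_max if p <= t)
    # Upper bound on the expected number of false positives.
    expected_false = m * (pi00 * t * t + (pi01 + pi10) * t)
    return min(1.0, expected_false / max(rejections, 1))


# Hypothetical example: 100 mediators, 5 of which have very small max-P values.
p_max = [0.001] * 5 + [0.5] * 95
fdr_mixture = approx_mixture_fdr(p_max, pi00=0.90, pi01=0.03, pi10=0.03, t=0.01)
fdr_uniform = min(1.0, len(p_max) * 0.01 / 5)  # treats every null as uniform
```

In this example the mixture-based estimate is far smaller than the uniform-null estimate at the same threshold, which illustrates why JS-mixture can reject more hypotheses at a given FDR level.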

We set the significance level at 0.05 for all tests. The detailed procedure of the proposed method is summarized in Fig. 2.

Fig. 2 The overall workflow for high-dimensional mediation analysis with adjustment for confounders

Simulation studies

Simulation design

In this section, we conducted simulation studies to evaluate the performance of the proposed method. The simulation was implemented in R (version 4.3.0, R Foundation for Statistical Computing, Vienna, Austria) and RStudio (version 2023.9.0.463, RStudio: Integrated Development Environment for R, Boston, MA). The simulation parameters were set based on published studies [ 9 , 14 , 16 ]. The number of replications was set to 500 for each combination of parameter settings, following the number of replications used in published methodological studies [ 9 , 14 , 15 , 16 , 17 , 19 , 49 ].

The model structure is shown in Fig. 1. We consider 8 confounders \(C=\left({C}_{1},{C}_{2},\cdots ,{C}_{8}\right)\) affecting the relationship among \(X\), \(M\), and \(Y\), in which the continuous confounders \({C}_{1}-{C}_{4}\) follow a multivariate normal distribution \(N\left(\mu ,{\Sigma }\right)\) with mean vector \(\mu ={\left(0,0,0,0\right)}^{T}\) and covariance matrix \({\Sigma }\):

The last four binary confounders \({C}_{5}-{C}_{8}\) are independently generated from the Binary distribution \(B\left(n,0.3\right)\) , where \(n\) is the sample size.

Then exposure \(X\) can be generated from the Binary distribution \(B\left(n,{P}_{c}\right)\), where \(n\) is the sample size, \({P}_{c}=1/\left(1+{e}^{-\left({\theta }^{T}C\right)}\right)\), and \({\theta }^{T}=\left({\theta }_{1},{\theta }_{2},\cdots ,{\theta }_{8}\right)=\left(0.2,0.2,0.3,0.3,0.2,0.2,0.3,0.3\right)\).
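The confounder and exposure generation can be sketched in Python as follows. Two points are assumptions made for illustration: the continuous confounders are drawn as independent \(N\left(0,1\right)\) variables in place of the multivariate normal \(N\left(\mu ,{\Sigma }\right)\), and \(B\left(n,\cdot \right)\) is read as \(n\) independent 0/1 draws, one per subject. The function name is hypothetical, and the authors' own implementation is in R.

```python
import math
import random

THETA = [0.2, 0.2, 0.3, 0.3, 0.2, 0.2, 0.3, 0.3]


def simulate_confounders_and_exposure(n, seed=0):
    """Return a list of (confounders, propensity, exposure) tuples for n subjects."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        # ASSUMPTION: independent N(0, 1) stands in for N(mu, Sigma).
        continuous = [rng.gauss(0.0, 1.0) for _ in range(4)]
        # Binary confounders C5-C8, each equal to 1 with probability 0.3.
        binary = [1 if rng.random() < 0.3 else 0 for _ in range(4)]
        c = continuous + binary
        # Propensity P_c = 1 / (1 + exp(-theta^T C)).
        linear = sum(t * v for t, v in zip(THETA, c))
        p_c = 1.0 / (1.0 + math.exp(-linear))
        x = 1 if rng.random() < p_c else 0
        rows.append((c, p_c, x))
    return rows


rows = simulate_confounders_and_exposure(n=300, seed=1)
```

Because the exposure probability depends on all eight confounders, exposed and unexposed subjects differ systematically in \(C\), which is the confounding that the adjustment methods are meant to remove.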

Mediators \(M\) and the outcome variable \(Y\) are generated according to Eqs. 1 and 2, respectively. For simplicity, we set all the effects of \(C\) on \(M\) to be the same. Let \({\phi }_{k}={\left({\phi }_{k1},\cdots ,{\phi }_{k8}\right)}^{T}={\left(0.2,0.2,0.3,0.3,0.2,0.2,0.3,0.3\right)}^{T}\) represent the effects of \(C\) on \(M\). Let \(\eta ={\left({\eta }_{1},{\eta }_{2},\cdots ,{\eta }_{8}\right)}^{T}={\left(0.2,0.2,0.3,0.3,0.2,0.2,0.3,0.3\right)}^{T}\) denote the effects of \(C\) on \(Y\).

We set the first four potential mediators \({M}_{1}-{M}_{4}\) as the true significant mediators in this study. Let \(\alpha ={\left({\alpha }_{1},{\alpha }_{2},\cdots ,{\alpha }_{p}\right)}^{T}={\left(0.4,0.4,0.5,0.5,0.5,0.5,0,0,\cdots ,0\right)}^{T}\) and \(\beta ={\left({\beta }_{1},{\beta }_{2},\cdots ,{\beta }_{p}\right)}^{T}={\left(0.4,0.5,0.5,0.6,0,0,0.5,0.5,0,\cdots ,0\right)}^{T}\). The elements of both \(\alpha\) and \(\beta\) are zero except among the first eight, and only the first four mediators have both \({\alpha }_{k}\) and \({\beta }_{k}\) nonzero. The mediation effect sizes of the true mediators \({M}_{1}-{M}_{4}\) are \({\alpha }_{k}{\beta }_{k}=\left(0.16,0.2,0.25,0.3\right)\) for \(k=1,\cdots ,4\).
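The mediation effect sizes follow directly from the element-wise products \({\alpha }_{k}{\beta }_{k}\). A short Python check over the first eight coefficients (all later ones are zero) is sketched here; the variable names are illustrative.

```python
# First eight elements of alpha and beta from the simulation design.
ALPHA = [0.4, 0.4, 0.5, 0.5, 0.5, 0.5, 0.0, 0.0]
BETA = [0.4, 0.5, 0.5, 0.6, 0.0, 0.0, 0.5, 0.5]

# Product-of-coefficients mediation effect for each mediator
# (rounded to suppress floating-point noise such as 0.16000000000000003).
effects = [round(a * b, 10) for a, b in zip(ALPHA, BETA)]

# A mediator is a true mediator only if both paths are nonzero.
true_mediators = [k + 1 for k, e in enumerate(effects) if e != 0.0]
```

Mediators \({M}_{5}-{M}_{8}\) each have one nonzero path, so they act as decoys: they can pass a marginal screen on one path but carry no mediation effect.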

Let \(\gamma =0.5\); \(a=0.5\); \(a_k\sim U(0,1)\), \(\epsilon\sim N(0,1)\). The error terms \({e}_{k}\) are generated from \(N\left(0,1.2\right)\), and the correlation between mediators mostly falls between 0.15 and 0.35.

To evaluate the impact of the sample size and of the dimension of potential mediators, we set two sample sizes, \(n=300, 500\), and two dimensions, \(p=1000, 10000\).

In addition, we take the correlation between mediators into account for dimension \(p=1000\). We simulate strong correlation between mediators by generating the error terms \({e}_{k}\) from \(N\left(0,{{\Sigma }}_{e}\right)\), where \({{\Sigma }}_{e}={\left({\rho }^{\left|k-{k}^{{\prime }}\right|}\right)}_{k,{k}^{{\prime }}}\). This means that the correlation between two mediators decreases as the absolute difference between their subscripts, \(\left|k-{k}^{{\prime }}\right|\), increases. We set four correlation levels \(\rho =0, 0.25, 0.5, 0.75\) with dimension \(p=1000\) and sample size \(n=300, 500\). Under these settings, the corresponding Pearson correlation coefficients between two adjacent mediators are around 0.4, 0.5, 0.7, and 0.8, respectively. We evaluated the performance of mHIMA2 and PS-based HIMA by conducting 500 replications of simulated data sets for each scenario [ 9 , 14 , 15 , 16 , 17 , 19 , 49 ].
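The structure of \({{\Sigma }}_{e}\) can be made concrete with a small Python sketch that builds the matrix entry by entry. The function name is illustrative, and a small \(p\) is used only so the result is easy to inspect.

```python
def ar1_covariance(p, rho):
    """Matrix with entry (k, k') equal to rho ** |k - k'|."""
    return [[rho ** abs(k - j) for j in range(p)] for k in range(p)]


# Example with p = 4 and rho = 0.5.
sigma_e = ar1_covariance(4, 0.5)
first_row = sigma_e[0]  # [1.0, 0.5, 0.25, 0.125]
```

The entries decay geometrically away from the diagonal, so adjacent error terms are the most strongly correlated, and \(\rho =0\) reduces the matrix to the identity.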

Simulation results

Simulation results are presented in Tables 1 and 2. Table 1 evaluates the mediator selection performance of the proposed approach using the true positive rate (TPR) and false discovery proportion (FDP) of selection after the significance test for mediation effects. The mediators have higher TPR as the indirect effect increases (i.e., the larger the mediation effect, the higher the detection rate).
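For a single simulated data set, the two selection metrics can be computed as in the Python sketch below. The definitions used (TPR as the share of true mediators selected, FDP as the share of selected mediators that are false, with FDP set to 0 when nothing is selected) are the common conventions and are assumed here; the function name and example indices are hypothetical.

```python
def tpr_fdp(selected, true_mediators):
    """Return (TPR, FDP) for one replication."""
    selected, truth = set(selected), set(true_mediators)
    true_pos = len(selected & truth)
    tpr = true_pos / len(truth) if truth else 0.0
    fdp = (len(selected) - true_pos) / len(selected) if selected else 0.0
    return tpr, fdp


# Hypothetical replication: mediators 1, 2, 3 and 7 selected; 1-4 are true.
tpr, fdp = tpr_fdp(selected=[1, 2, 3, 7], true_mediators=[1, 2, 3, 4])
```

Averaging these per-replication values over the 500 replications gives summary figures of the kind reported in the tables.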

As presented in Table 1, under most settings the mHIMA2 mediation test approach has a higher TPR than PS-based HIMA, but also a higher FDP. Overall, mHIMA2 is more powerful than PS-based HIMA and is less conservative in selecting significant mediators.

As shown in Table 1 , for the mHIMA2 mediation test approach, TPR is ranked as OW > IPW > PSR > RA, and FDP is not more than 0.1 and gradually decreases to close to 0.05 as the sample size increases. Among all models, the mHIMA2 mediation test approach with OW adjustment has the highest power and acceptable false positive level. When using the PS-based HIMA mediation test approach, TPR is ranked consistently as RA > PSR > OW > IPW, and all four models also keep FDP at an extremely low level.

Table 2 presents the estimates of the mediation effects with their mean and mean square error (MSE). The estimators approach the true values as the mediation effect increases. All models tend to be more accurate as \(n\) gets larger and \(p\) gets smaller. Overall, the mHIMA2 mediation test approach has a smaller MSE than the PS-based HIMA approach in most cases. RA adjustment has a higher MSE than the other adjustment methods, especially for large mediation effects, whereas OW adjustment has the lowest MSE among the four adjustment methods.
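The two summary statistics can be computed across replications as in this Python sketch, where MSE is taken as the average squared deviation of the estimates from the true mediation effect (the usual definition, assumed here). The estimates shown are made up for illustration.

```python
def mean_and_mse(estimates, true_value):
    """Mean of the estimates and their mean square error around true_value."""
    n = len(estimates)
    if n == 0:
        raise ValueError("need at least one estimate")
    mean = sum(estimates) / n
    mse = sum((e - true_value) ** 2 for e in estimates) / n
    return mean, mse


# Hypothetical estimates of alpha_1 * beta_1 (true value 0.16) from 4 replications.
mean_est, mse_est = mean_and_mse([0.14, 0.17, 0.15, 0.18], true_value=0.16)
```

MSE combines bias and variance, so a method can have a mean close to the truth and still show a large MSE if its estimates vary widely between replications.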

As shown in Table 2, the mHIMA2 approach with OW adjustment similarly has the smallest MSE among all models. Moreover, similar results can be seen under the different strong-correlation settings in Tables S1–S8 in the supplementary file. The mHIMA2 methods have lower MSE (i.e., more precise estimation) and clearly higher TPR. This suggests that the de-biased Lasso technique in the mHIMA2 methods performs better when handling moderate correlation between mediators. However, the FDP of all models increases slightly as the correlation between mediators increases. When the correlation among the mediators is strong (for example, \(r>0.7\)), all models suffer from increased MSE.

Data application

Smoking is an important environmental factor affecting the immune system and blood cell composition [ 50 , 51 ]. Previous studies have demonstrated that smokers have lower natural killer (NK) cell counts and activity [ 50 , 51 ]. Smoking has also been found to be associated with DNA methylation levels [ 52 ]. Meanwhile, DNA methylation levels have been found to be associated with human NK cell activation [ 53 , 54 ]. Therefore, DNA methylation may mediate the association between smoking and NK cell level. We therefore applied the proposed high-dimensional mediation analysis methods to identify the specific functional CpG sites that may mediate the relationship between smoking and the estimated NK cell level.

Here we apply our method to the GSE117859 dataset obtained from the Gene Expression Omnibus (GEO) database. The aim of the study in which GSE117859 was originally measured was to explore the smoking-associated DNA methylation features linked to AIDS outcomes in the HIV-positive population [ 55 ]. Blood samples from the Veteran Aging Cohort Study (VACS) were collected in that study. The HumanMethylation450 BeadChip platform was used to measure the DNA methylation levels.

In total, 608 samples and 485,577 probes were included in the dataset. Clinical information such as age, sex, race, smoking history, adherence to antiretroviral therapy (ART), estimated CD4 T cells, estimated CD8 T cells, and estimated NK cells was collected. The estimated CD4/CD8/NK levels were obtained using the methylation-based cell type deconvolution algorithm proposed by Houseman et al. [ 56 ]. To some extent, the estimated CD4 and CD8 levels can represent AIDS severity.

Smoking status was collected based on self-report. All included patients were classified into the smoker and the non-smoker groups according to their reported smoking history. After removing the individuals without available clinical information and DNAm sites with missing values, a total of 587 samples and 485,503 probes were included in the analysis.

We adjusted for potential confounders including age, race, adherence to antiretroviral therapy, estimated CD4 T cells, and estimated CD8 T cells. Demographic and clinical variables included in our analysis are presented in Table 3.

The analysis results using the proposed mHIMA2 method are presented in Table 4. Here, we mainly present the CpG mediators whose proportion of the total effect is greater than 5%. Because of space limitations, we do not present the full summary results of the PS-based HIMA method here; they can be found in Table S9 in the supplementary file.

As shown in Table 4, most of the mHIMA2-based methods identified two methylation sites, cg13917614 in the CNP gene and cg16893868 in the LILRA2 gene. Similar results can be seen in Table S9 in the supplementary file. Existing studies have already demonstrated that the site cg13917614 is associated with smoking [ 52 , 57 ]. Although we did not find direct evidence in the existing literature that the CNP gene is associated with immune function, relevant studies have shown a link between CNP and inflammatory responses, the mechanism of which requires further study [ 58 , 59 ].

The encoded protein of the LILRA2 gene can suppress innate immune response [ 60 , 61 ]. The results reveal that smoking will promote the demethylation of cg16893868, leading to an increase in gene LILRA2 expression and ultimately reducing the estimated NK cell level. It has been found that the remaining CpG sites cg20460771, cg03164561, cg03605454, cg09529165, and cg01500140 are all associated with smoking [ 11 , 52 , 62 , 63 , 64 ]. Further insights into the discovered CpG mediators in genome-wide epigenetic studies will be meaningful.

The causal relationship obtained in high-dimensional mediation analysis usually depends on the no-confounding assumption. However, confounding is almost inevitable in observational studies owing to the lack of randomization of the baseline covariates in practice. Previous studies have used PS methods such as PS adjustment and IPW in high-dimensional mediation analysis, but these face the issue of extreme PS distributions.

In this article, we integrated the OW approach into the high-dimensional mediation model, which can address extreme PS distributions and better adjust for confounding. Finally, we developed a high-dimensional mediation analysis workflow consisting of OW confounding adjustment, SIS, de-biased Lasso penalization for potential mediator screening, and the high-dimensional mediation test based on the mixture null distribution of P-values.
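Why overlap weights are less sensitive to extreme propensity scores can be seen from the weight formulas themselves. The Python sketch below uses the standard definitions (IPW: \(1/e\) for exposed and \(1/(1-e)\) for unexposed subjects; OW: \(1-e\) for exposed and \(e\) for unexposed subjects, where \(e\) is the PS); the function names and example scores are illustrative.

```python
def ipw_weight(exposed, ps):
    """Inverse probability weight: unbounded as ps approaches 0 or 1."""
    return 1.0 / ps if exposed else 1.0 / (1.0 - ps)


def overlap_weight(exposed, ps):
    """Overlap weight: always between 0 and 1."""
    return 1.0 - ps if exposed else ps


# An exposed subject with a very small propensity score.
w_ipw = ipw_weight(True, 0.01)      # about 100: one subject dominates
w_ow = overlap_weight(True, 0.01)   # 0.99: bounded
```

IPW lets a single subject with an extreme score carry a weight of about 100, whereas overlap weights stay within \(\left[0,1\right]\) and give the greatest combined weight to subjects with scores near 0.5, where the two exposure groups overlap most.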

Simulation results indicate that the mHIMA2 with OW approach presented in this study performs best among all the compared models with the highest TPR, acceptable FDP level, and the smallest MSE in mediating effect estimation. In addition, the mHIMA2 embedded de-biased Lasso method performs better when moderate correlations between mediators exist.

The simulation study also suggested that the proposed method performs better as the sample size increases. This result suggests that a sufficient sample size should be ensured when the proposed method is used to analyze mediating effects in real data. This feature is consistent with other existing methods [ 5 , 9 , 14 , 17 , 19 , 49 ]. Furthermore, the dimensionality of potential mediators has little effect on the performance of the proposed method.

Most previous studies [ 5 , 9 , 13 , 17 ] did not take confounding adjustment into account in the SIS process. In contrast, we adopted the PS-based method to adjust for confounding, thus improving the accuracy of mediator screening. Moreover, mediators have been assumed to be linearly independent of each other, but such an assumption is often not strictly valid in real data. Violation of the independence assumption often affects the accuracy of mediator selection and the precision of mediating effect estimation. The proposed method can tolerate correlation between the mediators and thus preserves the robustness of mediator selection, multiple mediation testing, and mediating effect estimation.

Similar to other two-step approaches, errors from the first step may be introduced into and accumulate in the second step, because the first step cannot guarantee 100% correctness. To mitigate this, following the application recommendation of the SIS approach, we set a relatively loose screening criterion with \(d=2n/\log (n)\) and select the \(d\) mediators with the largest effects [ 15 , 16 , 17 , 49 ] in the first step, which controls false negatives while avoiding an increase in false positives. Although the errors cannot be totally avoided, this can reduce the error in the preliminary screening of mediators and, to some extent, prevent serious error accumulation in the second step. As shown in the simulation, the proposed two-step model performed well. In addition, previously published studies have demonstrated that the error accumulation issue in two-step models can be controlled well in a similar way and will not cause serious bias in the final results [ 14 , 65 , 66 , 67 , 68 , 69 , 70 ].
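To give a sense of how loose this screening criterion is, the Python sketch below evaluates \(d=2n/\log (n)\) for the two simulated sample sizes. The natural logarithm and rounding up to an integer are assumptions made for illustration, and the function name is hypothetical.

```python
import math


def sis_keep_count(n):
    """Number of mediators kept by screening, d = 2n / log(n), rounded up."""
    if n < 2:
        raise ValueError("n must be at least 2 so that log(n) > 0")
    return math.ceil(2 * n / math.log(n))


d_300 = sis_keep_count(300)  # 2 * 300 / log(300) is about 105.2
d_500 = sis_keep_count(500)  # 2 * 500 / log(500) is about 160.9
```

With only four true mediators in the simulation, keeping more than a hundred candidates leaves a wide margin against dropping a true mediator at the screening step.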

Meanwhile, we applied the proposed method to the dataset GSE117859 obtained from the GEO database and identified several significant DNAm mediators, including the sites cg13917614, cg16893868, cg20460771, cg03164561, cg03605454, cg09529165, and cg01500140. Among them, the site cg16893868 in the LILRA2 gene has been demonstrated to be associated with smoking and immune function [ 60 , 61 ]. This indicates that the proposed method can identify reliable mediators in empirical data analysis.

The presence of confounding in observational studies is always a major challenge to establishing causal relationships. Currently, most genetic studies are based on observational research without randomization of baseline characteristics. In particular, high-dimensional mediation analysis faces issues such as the accuracy of high-dimensional mediator selection and the low power of multiple mediation tests [ 13 , 14 , 17 , 18 ]. Although PSR and IPW offer a solution for confounding adjustment in the classical HDMA workflow, they still face the issue of extreme PS distributions.

The proposed OW-based method can provide more precise and stable mediating effect estimation. However, misspecification of the outcome model and the PS model cannot be avoided in practice. Hence, doubly robust methods may be desirable in the HDMA workflow in future studies. Although the JS-mixture method was proposed to improve the power of multiple mediation testing, other more powerful test methods remain appealing in large-scale genome-wide epigenetic studies [ 13 , 18 ]. Further simulation and methodological studies comparing different powerful test methods may provide a useful reference for future studies. It should also be noted that unmeasured confounding is outside the scope of this paper. Previously published studies have provided several applicable methods to deal with this issue [ 49 , 71 ].

Overall, mHIMA2 with OW adjustment has sufficient power in selecting potential true mediators and obtains precise estimates of mediation effects. It can be recommended for practical high-dimensional mediation analysis, especially in epigenetic studies.

Availability of data and materials

The dataset GSE117859 obtained from the GEO database in our real data analysis can be accessed without restriction at https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE117859. Our procedure is implemented in R. The corresponding R code can be found at https://github.com/huww1998/CONF_mHIMA2.

Abbreviations

FDP: False discovery proportion

GEO: Gene Expression Omnibus

HDMA: High-dimensional mediation analysis

IPW: Inverse probability weighting

JS-uniform: Joint significance test with uniform distribution

JS-mixture: Joint significance test with mixture null distribution

MSE: Mean square error

mHIMA2: Modified HIMA2 model

NK: Natural killer cell

OW: Overlapping weighting

PSR: Propensity score regression adjustment

PS: Propensity score

SIS: Sure independence screening

TPR: True positive rate

Baron RM, Kenny DA. The moderator-mediator variable distinction in social psychological research: conceptual, strategic, and statistical considerations. J Personal Soc Psychol. 1986;51(6):1173–82. https://doi.org/10.1037//0022-3514.51.6.1173.

Huan T, Joehanes R, Schurmann C, Schramm K, Pilling LC, Peters MJ, et al. A whole-blood transcriptome meta-analysis identifies gene expression signatures of cigarette smoking. Hum Mol Genet. 2016;25(21):4611–23. https://doi.org/10.1093/hmg/ddw288 .

MacKinnon DP, Lockwood CM, Hoffman JM, West SG, Sheets V. A comparison of methods to test mediation and other intervening variable effects. Psychol Methods. 2002;7(1):83–104. https://doi.org/10.1037/1082-989x.7.1.83 .

Biesanz JC, Falk CF, Savalei V. Assessing Mediational models: testing and interval estimation for Indirect effects. Multivar Behav Res. 2010;45(4):661–701. https://doi.org/10.1080/00273171.2010.498292 .

Gao Y, Yang H, Fang R, Zhang Y, Goode EL, Cui Y. Testing mediation effects in high-dimensional epigenetic studies. Front Genet. 2019;10:1195. https://doi.org/10.3389/fgene.2019.01195 .

Taylor AB, MacKinnon DP. Four applications of permutation methods to testing a single-mediator model. Behav Res Methods. 2012;44(3):806–44. https://doi.org/10.3758/s13428-011-0181-x .

VanderWeele TJ. Marginal structural models for the estimation of direct and indirect effects. Epidemiol (Cambridge Mass). 2009;20(1):18–26. https://doi.org/10.1097/EDE.0b013e31818f69ce .

VanderWeele TJ, Vansteelandt S. Mediation analysis with multiple mediators. Epidemiol Methods. 2014;2(1):95–115. https://doi.org/10.1515/em-2012-0010 .

Zhang H, Zheng Y, Zhang Z, Gao T, Joyce B, Yoon G, et al. Estimating and testing high-dimensional mediation effects in epigenetic studies. Bioinf (Oxford England). 2016;32(20):3150–4. https://doi.org/10.1093/bioinformatics/btw351 .

Bibikova M, Barnes B, Tsan C, Ho V, Klotzle B, Le JM, et al. High density DNA methylation array with single CpG site resolution. Genomics. 2011;98(4):288–95. https://doi.org/10.1016/j.ygeno.2011.07.007 .

Harlid S, Xu Z, Panduri V, Sandler DP, Taylor JA. CpG sites associated with cigarette smoking: analysis of epigenome-wide data from the Sister Study. Environ Health Perspect. 2014;122(7):673–8. https://doi.org/10.1289/ehp.1307480 .

Toyooka S, Maruyama R, Toyooka KO, McLerran D, Feng Z, Fukuyama Y, et al. Smoke exposure, histologic type and geography-related differences in the methylation profiles of non-small cell lung cancer. Int J Cancer. 2003;103(2):153–60. https://doi.org/10.1002/ijc.10787 .

Liu Z, Shen J, Barfield R, Schwartz J, Baccarelli AA, Lin X. Large-scale hypothesis testing for Causal Mediation effects with Applications in Genome-wide epigenetic studies. J Am Stat Assoc. 2022;117(537):67–81. https://doi.org/10.1080/01621459.2021.1914634 .

Luo L, Yan Y, Cui Y, Yuan X, Yu Z. Linear high-dimensional mediation models adjusting for confounders using propensity score method. Front Genet. 2022;13:961148. https://doi.org/10.3389/fgene.2022.961148 .

Luo C, Fa B, Yan Y, Wang Y, Zhou Y, Zhang Y, et al. High-dimensional mediation analysis in survival models. PLoS Comput Biol. 2020;16(4):e1007768. https://doi.org/10.1371/journal.pcbi.1007768 .

Yu Z, Cui Y, Wei T, Ma Y, Luo C. High-dimensional mediation analysis with confounders in Survival models. Front Genet. 2021;12:688871. https://doi.org/10.3389/fgene.2021.688871 .

Perera C, Zhang H, Zheng Y, Hou L, Qu A, Zheng C, et al. HIMA2: high-dimensional mediation analysis and its application in epigenome-wide DNA methylation data. BMC Bioinformatics. 2022;23(1):296. https://doi.org/10.1186/s12859-022-04748-1 .

Dai JY, Stanford JL, LeBlanc M. A multiple-testing procedure for high-dimensional mediation hypotheses. J Am Stat Assoc. 2022;117(537):198–213. https://doi.org/10.1080/01621459.2020.1765785 .

Zhang H, Zheng Y, Hou L, Zheng C, Liu L. Mediation analysis for survival data with high-dimensional mediators. Bioinf (Oxford England). 2021;37(21):3815–21. https://doi.org/10.1093/bioinformatics/btab564 .

Heinze G, Jüni P. An overview of the objectives of and the approaches to propensity score analyses. Eur Heart J. 2011;32(14):1704–8. https://doi.org/10.1093/eurheartj/ehr031 .

Stuart EA. Matching methods for causal inference: a review and a look forward. Stat Science: Rev J Inst Math Stat. 2010;25(1):1–21. https://doi.org/10.1214/09-STS313 .

Lee BK, Lessler J, Stuart EA. Weight trimming and propensity score weighting. PLoS One. 2011;6(3):e18174. https://doi.org/10.1371/journal.pone.0018174 .

Li F, Thomas LE, Li F. Addressing Extreme Propensity scores via the Overlap weights. Am J Epidemiol. 2019;188(1):250–7. https://doi.org/10.1093/aje/kwy201 .

Thomas LE, Li F, Pencina MJ. Overlap weighting: a propensity score method that mimics attributes of a Randomized Clinical Trial. JAMA. 2020;323(23):2417. https://doi.org/10.1001/jama.2020.7819 .

Mlcoch T, Hrnciarova T, Tuzil J, Zadak J, Marian M, Dolezal T. Propensity score weighting using overlap weights: a new method applied to regorafenib clinical data and a cost-effectiveness analysis. Value in Health. 2019;22(12):1370–7. https://doi.org/10.1016/j.jval.2019.06.010 .

Vanderweele TJ, Vansteelandt S, Robins JM. Effect decomposition in the presence of an exposure-induced mediator-outcome confounder. Epidemiol (Cambridge Mass). 2014;25(2):300–6. https://doi.org/10.1097/EDE.0000000000000034 .

Perera C, Zhang H, Zheng Y, Hou L, Qu A, Zheng C, et al. HIMA2: high-dimensional mediation analysis and its application in epigenome-wide DNA methylation data. BMC Bioinformatics. 2022;23(1).  https://doi.org/10.1186/s12859-022-04748-1 .

Basu D. Randomization analysis of Experimental Data: the Fisher randomization test. J Am Stat Assoc. 1980;75(371):575–82. https://doi.org/10.1080/01621459.1980.10477512 .

Rubin DB. Comment. J American Statis Assoc. 1986;81(396):961–2. https://doi.org/10.1080/01621459.1986.10478355 .

Imbens GW, Rubin DB. Causal inference for statistics, Social, and Biomedical sciences: an introduction. Cambridge: Cambridge University Press; 2015.

Rosenbaum PR, Rubin DB. The central role of the propensity score in observational studies for causal effects. Biometrika. 1983;70(1):41–55. https://doi.org/10.1093/biomet/70.1.41 .

Haukoos JS, Lewis RJ. The Propensity score. JAMA. 2015;314(15):1637–8. https://doi.org/10.1001/jama.2015.13480 .

Lee BK, Lessler J, Stuart EA. Improving propensity score weighting using machine learning. Stat Med. 2010;29(3):337–46. https://doi.org/10.1002/sim.3782 .

Abdia Y, Kulasekera KB, Datta S, Boakye M, Kong M. Propensity scores based methods for estimating average treatment effect and average treatment effect among treated: a comparative study. Biometrical J Biometrische Z. 2017;59(5):967–85. https://doi.org/10.1002/bimj.201600094 .

Rosenbaum PR, Rubin DB. Constructing a Control Group Using Multivariate Matched Sampling Methods That Incorporate the Propensity Score. Am Stat. 1985;39(1):33–8. https://doi.org/10.1080/00031305.1985.10479383 .

Rosenbaum PR, Rubin DB. Reducing Bias in Observational studies using subclassification on the Propensity score. J Am Stat Assoc. 1984;79(387):516–24. https://doi.org/10.1080/01621459.1984.10478078 .

Robins JM, Rotnitzky A, Zhao LP. Analysis of semiparametric regression models for repeated outcomes in the presence of missing data. J Am Stat Assoc. 1995;90(429):106–21. https://doi.org/10.1080/01621459.1995.10476493 .

Austin PC. An introduction to propensity score methods for reducing the effects of confounding in observational studies. Multivar Behav Res. 2011;46(3):399–424. https://doi.org/10.1080/00273171.2011.568786 .

Li F, Morgan KL, Zaslavsky AM. Balancing covariates via Propensity score weighting. J Am Stat Assoc. 2018;113(521):390–400. https://doi.org/10.1080/01621459.2016.1260466 .

Fan J, Lv J. Sure Independence Screening for Ultrahigh Dimensional Feature Space. J Royal Stat Soc Ser B: Stat Methodol. 2008;70(5):849–911. https://doi.org/10.1111/j.1467-9868.2008.00674.x .

Zhang C-H. Nearly unbiased variable selection under minimax concave penalty. Annals Stat. 2010;38(2):894–942. https://doi.org/10.1214/09-AOS729 .

Maity AK, Basu S. Highest posterior model computation and variable selection via simulated annealing. The New England J Statis Data Sci. 2023;1(2):200–7. https://doi.org/10.51387/23-NEJSDS40 .

Breheny P, Huang J. Coordinate Descent algorithms for Nonconvex Penalized Regression, with applications to Biological feature selection. Annals Appl Stat. 2011;5(1):232–53. https://doi.org/10.1214/10-AOAS388 .

Huang Y-T, Pan W-C. Hypothesis test of mediation effect in causal mediation model with high-dimensional continuous mediators. Biometrics. 2016;72(2):402–13. https://doi.org/10.1111/biom.12421 .

Benjamini Y, Hochberg Y. Controlling the false Discovery rate: a practical and powerful Approach to multiple testing. J Roy Stat Soc: Ser B (Methodol). 1995;57(1):289–300. https://doi.org/10.1111/j.2517-6161.1995.tb02031.x .

Hochberg Y. A sharper Bonferroni procedure for multiple tests of significance. Biometrika. 1988;75(4):800–2. https://doi.org/10.1093/biomet/75.4.800 .

Huang Y-T. Joint significance tests for mediation effects of socioeconomic adversity on adiposity via epigenetics. Annals Appl Stat. 2018;12(3):1535–57. https://doi.org/10.1214/17-AOAS1120 .

Fang EX, Ning Y, Liu H. Testing and confidence intervals for high Dimensional Proportional hazards Model. J Royal Stat Soc Ser B Stat Methodol. 2017;79(5):1415–37. https://doi.org/10.1111/rssb.12224 .

Chen F, Hu W, Cai J, Chen S, Si A, Zhang Y, et al. Instrumental variable-based high-dimensional mediation analysis with unmeasured confounders for survival data in the observational epigenetic study. Front Genet. 2023;14:1092489. https://doi.org/10.3389/fgene.2023.1092489 .

Qiu F, Liang C-L, Liu H, Zeng Y-Q, Hou S, Huang S, et al. Impacts of cigarette smoking on immune responsiveness: up and down or upside down? Oncotarget. 2017;8(1):268–84. https://doi.org/10.18632/oncotarget.13613 .

Elisia I, Lam V, Cho B, Hay M, Li MY, Yeung M, et al. The effect of smoking on chronic inflammation, immune function and blood cell composition. Sci Rep. 2020;10:19480. https://doi.org/10.1038/s41598-020-76556-7 .

Breitling LP, Yang R, Korn B, Burwinkel B, Brenner H. Tobacco-smoking-related differential DNA methylation: 27K discovery and replication. Am J Hum Genet. 2011;88(4):450–7. https://doi.org/10.1016/j.ajhg.2011.03.003 .

Wiencke JK, Butler R, Hsuang G, Eliot M, Kim S, Sepulveda MA, et al. The DNA methylation profile of activated human natural killer cells. Epigenetics. 2016;11(5):363–80. https://doi.org/10.1080/15592294.2016.1163454 .

Gao X, Jia M, Zhang Y, Breitling LP, Brenner H. DNA methylation changes of whole blood cells in response to active smoking exposure in adults: a systematic review of DNA methylation studies. Clin Epigenetics. 2015;7:113. https://doi.org/10.1186/s13148-015-0148-3 .

Zhang X, Hu Y, Aouizerat BE, Peng G, Marconi VC, Corley MJ, et al. Machine learning selected smoking-associated DNA methylation signatures that predict HIV prognosis and mortality. Clin Epigenetics. 2018;10(1):155. https://doi.org/10.1186/s13148-018-0591-z .

Houseman EA, Accomando WP, Koestler DC, Christensen BC, Marsit CJ, Nelson HH, et al. DNA methylation arrays as surrogate measures of cell mixture distribution. BMC Bioinformatics. 2012;13(1):86. https://doi.org/10.1186/1471-2105-13-86 .

Wan ES, Qiu W, Baccarelli A, Carey VJ, Bacherman H, Rennard SI, et al. Cigarette smoking behaviors and time since quitting are associated with differential DNA methylation across the human genome. Hum Mol Genet. 2012;21(13):3073–82. https://doi.org/10.1093/hmg/dds135 .

Bao Q, Zhang B, Zhou L, Yang Q, Mu X, Liu X, et al. CNP Ameliorates Macrophage Inflammatory Response and Atherosclerosis. Circ Res. 0(0). https://doi.org/10.1161/CIRCRESAHA.123.324086 .

Bae C-R, Hino J, Hosoda H, Arai Y, Son C, Makino H, et al. Overexpression of C-type natriuretic peptide in endothelial cells protects against Insulin Resistance and inflammation during Diet-induced obesity. Sci Rep. 2017;7(1):9807. https://doi.org/10.1038/s41598-017-10240-1 .

Lu HK, Mitchell A, Endoh Y, Hampartzoumian T, Huynh O, Borges L, et al. LILRA2 selectively modulates LPS-mediated cytokine production and inhibits phagocytosis by monocytes. PLoS ONE. 2012;7(3):e33478. https://doi.org/10.1371/journal.pone.0033478 .

Lewis Marffy AL, McCarthy AJ. Leukocyte Immunoglobulin-Like receptors (LILRs) on human neutrophils: modulators of infection and immunity. Front Immunol. 2020;11:857. https://doi.org/10.3389/fimmu.2020.00857 .

Sikdar S, Joehanes R, Joubert BR, Xu C-J, Vives-Usano M, Rezwan FI, et al. Comparison of smoking-related DNA methylation between newborns from prenatal exposure and adults from personal smoking. Epigenomics. 2019;11(13):1487–500. https://doi.org/10.2217/epi-2019-0066 .

Joehanes R, Just AC, Marioni RE, Pilling LC, Reynolds LM, Mandaviya PR, et al. Epigenetic signatures of cigarette smoking. Circulation Cardiovasc Genet. 2016;9(5):436–47. https://doi.org/10.1161/CIRCGENETICS.116.001506 .

Sun YV, Smith AK, Conneely KN, Chang Q, Li W, Lazarus A, et al. Epigenomic association analysis identifies smoking-related DNA methylation sites in African americans. Hum Genet. 2013;132(9):1027–37. https://doi.org/10.1007/s00439-013-1311-6 .

Luo C, Wang G, Hu F. Two-Step Gene Feature Selection Algorithm Based on Permutation Test. In: Yao J, Yang Y, Słowiński R, Greco S, Li H, Mitra S, et al., editors. Springer; 2012. p. 249–58. https://doi.org/10.1111/ppe.12382 .

Liu D, Yeung EH, McLain AC, Xie Y, Buck Louis GM, Sundaram R. A two-step Approach for Analysis of Nonignorable Missing outcomes in Longitudinal Regression: an application to Upstate KIDS Study. Paediatr Perinat Epidemiol. 2017;31(5):468–78. https://doi.org/10.1111/ppe.12382 .

Newcombe PJ, Connolly S, Seaman S, Richardson S, Sharp SJ. A two-step method for variable selection in the analysis of a case-cohort study. Int J Epidemiol. 2018;47(2):597–604. https://doi.org/10.1093/ije/dyx224 .

Song J, Shin SJ. A two-step approach for variable selection in linear regression with measurement error. Commun Stat Appl Methods. 2019;26(1):47–55. https://doi.org/10.29220/CSAM.2019.26.1.047 .

Liu Y, Qin SJ. A Novel two-step sparse Learning Approach for Variable Selection and Optimal Predictive modeling. IFAC-PapersOnLine. 2022;55(7):57–64. https://doi.org/10.1016/j.ifacol.2022.07.422 .

Chamlal H, Benzmane A, Ouaderhman T. A Two-Step Feature Selection Procedure to Handle High-Dimensional Data in Regression Problems. 2023 International Conference on Decision Aid Sciences and Applications (DASA). 2023. p. 592–6.

Wickramarachchi DS, Lim LHM, Sun B. Mediation analysis with multiple mediators under unmeasured mediator-outcome confounding. Stat Med. 2023;42(4):422–32. https://doi.org/10.1002/sim.9624 .

Acknowledgements

We acknowledge the GEO database for providing its platform and the contributors for uploading their datasets.

This work was supported by the National Social Science Fund of China (21CTJ009) and the National Natural Science Foundation of China (81703325).

Author information

Authors and affiliations.

Department of Epidemiology and Biostatistics, School of Public Health, Xi’an Jiaotong University, Xi’an, 710061, Shaanxi, China

Weiwei Hu, Shiyu Chen, Jiaxin Cai, Yuhui Yang, Hong Yan & Fangyao Chen

Department of Radiology, First Affiliated Hospital of Xi’an Jiaotong University, Xi’an, 710061, Shaanxi, China

Fangyao Chen

Contributions

F.C., H.Y., and W.H. led the conception and design of the work. W.H. and S.C. conducted the simulation study. W.H., S.C., and J.C. performed the data cleaning and carried out the real-data application. W.H. wrote the original draft. F.C., J.C., and Y.Y. guided the analyses and provided advice. F.C. critically reviewed and edited the manuscript.

Corresponding author

Correspondence to Fangyao Chen.

Ethics declarations

Ethics approval and consent to participate.

Not applicable.

Consent for publication

Competing interests.

The authors declare no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

12874_2024_2254_moesm1_esm.docx.

Additional file 1: High-dimensional mediation analysis for continuous outcome with confounders using overlap weighting method in observational epigenetic study. Simulation results for correlation levels \(\rho\) = 0, 0.25, 0.5, 0.75 with dimension \(p\) = 1000 and sample sizes \(n\) = 300, 500 are presented in Tables S1–S8. Analysis results using the PS-based HIMA methods are shown in Table S9.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article.

Hu, W., Chen, S., Cai, J. et al. High-dimensional mediation analysis for continuous outcome with confounders using overlap weighting method in observational epigenetic study. BMC Med Res Methodol 24, 125 (2024). https://doi.org/10.1186/s12874-024-02254-x

Received: 15 March 2024

Accepted: 22 May 2024

Published: 03 June 2024

DOI: https://doi.org/10.1186/s12874-024-02254-x

  • High-dimensional mediation model
  • Overlap weighting
  • Joint significant test
  • Composite null hypothesis

BMC Medical Research Methodology

ISSN: 1471-2288

COMMENTS

  1. Research Problems and Hypotheses in Empirical Research

    Briefly About Criteria for Final Conclusions, Problems, and Hypotheses. The fundamental purpose of substantive research may be considered that of producing valid and needed knowledge in specified domains, or as formulated by Polit and Beck (2004, p. 3): "The ultimate goal of research is to develop, refine, and expand a body of knowledge."

  2. What Is Empirical Research? Definition, Types & Samples in 2024

    Empirical research is defined as any study whose conclusions are exclusively derived from concrete, verifiable evidence. The term empirical basically means that it is guided by scientific experimentation and/or evidence. Likewise, a study is empirical when it uses real-world evidence in investigating its assertions.

  3. Empirical Research: Definition, Methods, Types and Examples

    Steps for conducting empirical research. Since empirical research is based on observation and capturing experiences, it is important to plan the steps to conduct the experiment and how to analyse it. This will enable the researcher to resolve problems or obstacles which can occur during the experiment. Step #1: Define the purpose of the research

  4. What is a Research Problem? Characteristics, Types, and Examples

    Characteristics, Types, and Examples. August 22, 2023 Sunaina Singh. Knowing the basics of defining a research problem is instrumental in formulating a research inquiry. A research problem is a gap in existing knowledge, a contradiction in an established theory, or a real-world challenge that a researcher aims to address in their research.

  5. Empirical Research

    Strategies for Empirical Research in Writing is a particularly accessible approach to both qualitative and quantitative empirical research methods, helping novices appreciate the value of empirical research in writing while easing their fears about the research process. ... Solving Everyday Problems with the Scientific Method: Thinking Like a ...

  6. Empirical Research in the Social Sciences and Education

    Another hint: some scholarly journals use a specific layout, called the "IMRaD" format, to communicate empirical research findings. Such articles typically have 4 components: Introduction : sometimes called "literature review" -- what is currently known about the topic -- usually includes a theoretical framework and/or discussion of previous ...

  7. Empirical research

    Empirical research is research using empirical evidence. It is also a way of gaining knowledge by means of direct and indirect observation or experience. Empiricism values some research more than other kinds. Empirical evidence (the record of one's direct observations or experiences) can be analyzed quantitatively or qualitatively.

  8. Empirical Research: Defining, Identifying, & Finding

    Empirical research methodologies can be described as quantitative, qualitative, or a mix of both (usually called mixed-methods). Ruane (2016) (UofM login required) gets at the basic differences in approach between quantitative and qualitative research: Quantitative research -- an approach to documenting reality that relies heavily on numbers both for the measurement of variables and for data ...

  9. Empirical Research

    Empirical research, in other words, involves the process of employing working hypotheses that are tested through experimentation or observation. Hence, empirical research is a method of uncovering empirical evidence. ...

  10. Data, measurement and empirical methods in the science of science

    The discovery of empirical regularities in science has had a key role in driving conceptual developments and the directions of future research. By observing empirical patterns at scale ...

  11. Empirical research on problem solving and problem posing: a ...

    Although the empirical research and standards documents discussed above were quite seminal and influential for the line of research on problem posing in the United States, researchers in other countries were also forging new paths in this domain (e.g., Brink, 1987; Cai, 1998; English, 1998).

  12. The Research Problem/Question

    From a theory, the researcher can formulate a research problem or hypothesis stating the expected findings in certain empirical situations. The research asks the question: "What relationship between variables will be observed if theory aptly summarizes the state of affairs?" ... Research problems in the social and behavioral sciences are ...

  13. Empirical Research: A Comprehensive Guide for Academics

    Tips for Empirical Writing. In empirical research, the writing is usually done in research papers, articles, or reports. The empirical writing follows a set structure, and each section has a specific role. Here are some tips for your empirical writing. 7. Define Your Objectives: When you write about your research, start by making your goals clear.

  14. What is Empirical Research? Definition, Methods, Examples

    Empirical research is characterized by several key features: Observation and Measurement: It involves the systematic observation or measurement of variables, events, or behaviors. Data Collection: Researchers collect data through various methods, such as surveys, experiments, observations, or interviews.

  15. Understanding the Nature of and Identifying and Formulating "Research

    While the first explicit attempts to integrate quantitative and qualitative methods to address research problems in the social sciences were made in the late 19 th century (Maxwell, 2016), it has only been in recent decades that mixed methods research (MMR) has become an established research methodology for examining complex phenomena in the social, behavioral, health, and interdisciplinary ...

  16. Empirical evidence

    Empirical evidence, information gathered directly or indirectly through observation or experimentation that may be used to confirm or disconfirm a scientific theory or to help justify, or establish as reasonable, a person's belief in a given proposition. A belief may be said to be justified if there is sufficient ...

  17. Understanding the Empirical Method in Research Methodology

    The empirical method is a fundamental aspect of research methodology that has stood the test of time. By relying on observation and data collection, it allows researchers to ground their theories in reality, providing a solid foundation for knowledge. Whether it's used in the hard sciences, social sciences, or humanities, the empirical method ...

  18. What is Empirical Research Study? [Examples & Method]

    Empirical research is a type of research methodology that makes use of verifiable evidence in order to arrive at research outcomes. In other words, this type of research relies solely on evidence obtained through observation or scientific data collection methods. Empirical research can be carried out using qualitative or quantitative ...

  19. PDF Problems of Empirical Research

    Summary. Replacement of research methods with statistics and replacement of the dynamic research process with static research are the two main problems of current empirical research practice. Because of this, empirical research is considered an art by many and is labeled "Garbage in, Garbage out" by some. And these problems have made empirical ...

  20. PDF Research Problems and Hypotheses in Empirical Research

    The account above concerns produced knowledge claims. One central task in empirical research is how to select research problems and hypotheses. Choice of research problems should be based on three ...

  21. Research Problems and Hypotheses in Empirical Research

    Abstract Criteria are briefly proposed for final conclusions, research problems, and research hypotheses in quantitative research. Moreover, based on a proposed definition of applied and basic/general research, it is argued that (1) in applied quantitative research, while research problems are necessary, research hypotheses are unjustified, and that (2) in basic/general quantitative hypothesis ...

  22. Empirical Research in Education

    Empirical Research in Education. Assumptions and Problems. In the previous chapters we have reviewed the history of social science research and introduced some of the basic principles on which empirical research, or LPE, in education is based. In this chapter we turn our attention toward identifying how the principles of logical positivism ...

  23. Problematic research practices in psychology: Misconceptions about data

    Arocha (2021) rightfully highlights epistemological problems of positivism, especially empiricism, operationism, and neglected theory development. Contrary to empiricists' beliefs, facts and observations cannot be pure elements of truth because it is scientists who decide what constitutes facts and what to observe in their field—and this presupposes theory (Weber, 1949).

  24. The double empathy problem: A derivation chain analysis and cautionary

    Work on the "double empathy problem" (DEP) is rapidly growing in academic and applied settings (e.g., clinical practice). It is most popular in research on conditions, like autism, which are characterized by social cognitive difficulties. Drawing from this literature, we propose that, while research on the DEP has the potential to improve understanding of both typical and atypical social ...

  25. Misunderstanding the harms of online misinformation

    The controversy over online misinformation and social media has opened a gap between public discourse and scientific research. Public intellectuals and journalists frequently make sweeping claims ...

  26. High-dimensional mediation analysis for continuous outcome with

    This indicates that the proposed method can identify reliable mediators in empirical data analysis. The presence of confounding in observational studies is always a major challenge to obtaining causal relationships. Currently, most genetic studies are based on observational research without randomization of baseline characteristics.