Research Paper Statistical Treatment of Data: A Primer

We can all agree that analyzing and presenting data effectively in a research paper is critical, yet often challenging.

This primer on statistical treatment of data will equip you with the key concepts and procedures to accurately analyze and clearly convey research findings.

You'll discover the fundamentals of statistical analysis and data management, the common quantitative and qualitative techniques, how to visually represent data, and best practices for writing the results - all framed specifically for research papers.

If you are curious about how AI can help you with statistical analysis for research, check out Hepta AI.

Introduction to Statistical Treatment in Research

Statistical analysis is a crucial component of both quantitative and qualitative research. Properly treating data enables researchers to draw valid conclusions from their studies. This primer provides an introductory guide to fundamental statistical concepts and methods for manuscripts.

Understanding the Importance of Statistical Treatment

Careful statistical treatment demonstrates the reliability of results and ensures findings are grounded in robust quantitative evidence. From determining appropriate sample sizes to selecting accurate analytical tests, statistical rigor adds credibility. Both quantitative and qualitative papers benefit from precise data handling.

Objectives of the Primer

This primer aims to equip researchers with best practices for:

Statistical tools to apply during different research phases

Techniques to manage, analyze, and present data

Methods to demonstrate the validity and reliability of measurements

By covering fundamental concepts ranging from descriptive statistics to measurement validity, it enables both novice and experienced researchers to incorporate proper statistical treatment.

Navigating the Primer: Key Topics and Audience

The primer spans introductory topics including:

Research planning and design

Data collection, management, analysis

Result presentation and interpretation

While useful for researchers at any career stage, earlier-career scientists with limited statistical exposure will find it particularly valuable as they prepare manuscripts.

How do you write a statistical method in a research paper?

Statistical methods are a critical component of research papers, allowing you to analyze, interpret, and draw conclusions from your study data. When writing the statistical methods section, you need to provide enough detail so readers can evaluate the appropriateness of the methods you used.

Here are some key things to include when describing statistical methods in a research paper:

Type of Statistical Tests Used

Specify the types of statistical tests performed on the data, including:

Parametric vs nonparametric tests

Descriptive statistics (means, standard deviations)

Inferential statistics (t-tests, ANOVA, regression, etc.)

Statistical significance level (often p < 0.05)

For example: We used t-tests and one-way ANOVA to compare means across groups, with statistical significance set at p < 0.05.
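To make this concrete, here is a minimal sketch of such a comparison in Python with SciPy (the library choice and the data are illustrative assumptions, not part of the original example; the paper's own analysis might use SPSS or R):

```python
# Sketch: t-test and one-way ANOVA with significance set at p < 0.05 (illustrative data).
from scipy import stats

group_a = [78, 74, 81, 69, 77, 83]
group_b = [71, 68, 74, 66, 70, 72]
group_c = [65, 70, 62, 68, 66, 71]

alpha = 0.05  # significance level stated in the methods

# Independent-samples t-test comparing two group means
t_stat, p_ttest = stats.ttest_ind(group_a, group_b)

# One-way ANOVA comparing means across all three groups
f_stat, p_anova = stats.f_oneway(group_a, group_b, group_c)

print(f"t-test: t = {t_stat:.2f}, p = {p_ttest:.3f}, significant: {p_ttest < alpha}")
print(f"ANOVA:  F = {f_stat:.2f}, p = {p_anova:.3f}, significant: {p_anova < alpha}")
```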

Analysis of Subgroups

If you examined subgroups or additional variables, describe the methods used for these analyses.

For example: We stratified data by gender and used chi-square tests to analyze differences between subgroups.

Software and Versions

List any statistical software packages used for analysis, including version numbers. Common programs include SPSS, SAS, R, and Stata.

For example: Data were analyzed using SPSS version 25 (IBM Corp, Armonk, NY).

The key is to give readers enough detail to assess the rigor and appropriateness of your statistical methods. The methods should align with your research aims and design. Keep explanations clear and concise using consistent terminology throughout the paper.

What are the 5 statistical treatments in research?

The five most common statistical treatments used in academic research papers include:

Mean

The mean, or average, is used to describe the central tendency of a dataset. It provides a singular value that represents the middle of a distribution of numbers. Calculating means allows researchers to characterize typical observations within a sample.

Standard Deviation

Standard deviation measures the amount of variability in a dataset. A low standard deviation indicates observations are clustered closely around the mean, while a high standard deviation signifies the data is more spread out. Reporting standard deviations helps readers contextualize means.
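As a quick illustration of these first two treatments, a mean and sample standard deviation can be computed in a few lines of Python with NumPy (the data below are made up for the example):

```python
# Sketch: mean (central tendency) and standard deviation (variability) for a small sample.
import numpy as np

scores = np.array([72, 75, 78, 71, 74, 80, 69, 77])

mean = scores.mean()
sd = scores.std(ddof=1)  # ddof=1 gives the sample standard deviation

print(f"M = {mean:.2f}, SD = {sd:.2f}")
```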

Regression Analysis

Regression analysis models the relationship between independent and dependent variables. It generates an equation that predicts changes in the dependent variable based on changes in the independents. Regressions are useful for hypothesizing causal connections between variables.
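For instance, a simple linear regression relating a predictor to an outcome can be fitted with SciPy; the variables and values here are hypothetical:

```python
# Sketch: simple linear regression predicting exam scores from hours studied (made-up data).
from scipy import stats

hours_studied = [2, 4, 5, 7, 8, 10, 12]
exam_scores = [55, 60, 62, 70, 74, 81, 88]

result = stats.linregress(hours_studied, exam_scores)

# The fitted equation predicts the dependent variable from the independent variable
print(f"score = {result.intercept:.1f} + {result.slope:.2f} * hours")
print(f"R^2 = {result.rvalue ** 2:.3f}, p = {result.pvalue:.4f}")
```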

Hypothesis Testing

Hypothesis testing evaluates assumptions about population parameters based on statistics calculated from a sample. Common hypothesis tests include t-tests, ANOVA, and chi-squared. These quantify the likelihood of observed differences being due to chance.

Sample Size Determination

Sample size calculations identify the minimum number of observations needed to detect effects of a given size at a desired statistical power. Appropriate sampling ensures studies can uncover true relationships within the constraints of resource limitations.
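As one way to run such a calculation, the statsmodels package (an assumption; G*Power and similar tools are equally common) can solve for the required group size given an expected effect size, alpha, and power:

```python
# Sketch: a priori sample-size calculation for a two-group comparison (illustrative inputs).
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
n_per_group = analysis.solve_power(
    effect_size=0.5,  # expected Cohen's d (assumed medium effect)
    alpha=0.05,       # significance level
    power=0.80,       # desired statistical power
)
print(f"Required sample size per group: {n_per_group:.0f}")  # roughly 64 here
```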

These five statistical analysis methods form the backbone of most quantitative research processes. Correct application allows researchers to characterize data trends, model predictive relationships, and make probabilistic inferences regarding broader populations. Expertise in these techniques is fundamental for producing valid, reliable, and publishable academic studies.

How do you know what statistical treatment to use in research?

The selection of appropriate statistical methods for the treatment of data in a research paper depends on three key factors:

The Aim and Objective of the Study

The aim and objectives that the study seeks to achieve will determine the type of statistical analysis required.

Descriptive research presenting characteristics of the data may only require descriptive statistics like measures of central tendency (mean, median, mode) and dispersion (range, standard deviation).

Studies aiming to establish relationships or differences between variables need inferential statistics like correlation, t-tests, ANOVA, regression etc.

Predictive modeling research requires methods like regression, discriminant analysis, logistic regression etc.

Thus, clearly identifying the research purpose and objectives is the first step in planning appropriate statistical treatment.

Type and Distribution of Data

The type of data (categorical, numerical) and its distribution (normal, skewed) also guide the choice of statistical techniques.

Parametric tests have assumptions related to normality and homogeneity of variance.

Non-parametric methods are distribution-free and better suited for non-normal or categorical data.

Testing data distribution and characteristics is therefore vital.
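One common way to test these characteristics is a normality check before committing to a parametric test; the sketch below uses SciPy's Shapiro-Wilk test on made-up data:

```python
# Sketch: checking normality to help decide between parametric and non-parametric tests.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
sample = rng.normal(loc=50, scale=10, size=40)  # illustrative measurements

stat, p = stats.shapiro(sample)
if p > 0.05:
    print(f"Shapiro-Wilk p = {p:.3f}: no evidence against normality; parametric tests are plausible")
else:
    print(f"Shapiro-Wilk p = {p:.3f}: data look non-normal; consider non-parametric alternatives")
```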

Nature of Observations

Statistical methods also differ based on whether the observations are paired or unpaired.

Analyzing changes within one group requires paired tests like paired t-test, Wilcoxon signed-rank test etc.

Comparing between two or more independent groups needs unpaired tests like independent t-test, ANOVA, Kruskal-Wallis test etc.

Thus the nature of observations is pivotal in selecting suitable statistical analyses.
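The distinction is easy to see in code; the sketch below (illustrative data, SciPy assumed) runs a paired test on one group measured twice and an unpaired test on two independent groups:

```python
# Sketch: paired vs unpaired t-tests.
from scipy import stats

# Same participants measured before and after an intervention -> paired test
before = [70, 68, 75, 72, 66, 71]
after = [74, 71, 79, 75, 69, 76]
t_paired, p_paired = stats.ttest_rel(before, after)

# Two independent groups -> unpaired (independent-samples) test
group_1 = [70, 68, 75, 72, 66, 71]
group_2 = [64, 69, 67, 73, 62, 68]
t_unpaired, p_unpaired = stats.ttest_ind(group_1, group_2)

print(f"Paired:   t = {t_paired:.2f}, p = {p_paired:.3f}")
print(f"Unpaired: t = {t_unpaired:.2f}, p = {p_unpaired:.3f}")
```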

In summary, clearly defining the research objectives, testing the collected data, and understanding the observational units guides proper statistical treatment and interpretation.

What are statistical techniques in a research paper?

Statistical methods are essential tools in scientific research papers. They allow researchers to summarize, analyze, interpret and present data in meaningful ways.

Some key statistical techniques used in research papers include:

Descriptive statistics: These provide simple summaries of the sample and the measures. Common examples include measures of central tendency (mean, median, mode), measures of variability (range, standard deviation) and graphs (histograms, pie charts).

Inferential statistics: These help make inferences and predictions about a population from a sample. Common techniques include estimation of parameters, hypothesis testing, correlation and regression analysis.

Analysis of variance (ANOVA): This technique allows researchers to compare means across multiple groups and determine statistical significance.

Factor analysis: This technique identifies underlying relationships between variables and latent constructs. It allows reducing a large set of variables into fewer factors.

Structural equation modeling: This technique estimates causal relationships using both latent and observed factors. It is widely used for testing theoretical models in social sciences.

Proper statistical treatment and presentation of data are crucial for the integrity of any quantitative research paper. Statistical techniques help establish validity, account for errors, test hypotheses, build models and derive meaningful insights from the research.

Fundamental Concepts and Data Management

Exploring basic statistical terms.

Understanding key statistical concepts is essential for effective research design and data analysis. This includes defining key terms like:

Statistics: The science of collecting, organizing, analyzing, and interpreting numerical data to draw conclusions or make predictions.

Variables: Characteristics or attributes of the study participants that can take on different values.

Measurement: The process of assigning numbers to variables based on a set of rules.

Sampling: Selecting a subset of a larger population to estimate characteristics of the whole population.

Data types: Quantitative (numerical) or qualitative (categorical) data.

Descriptive vs. inferential statistics: Descriptive statistics summarize data while inferential statistics allow making conclusions from the sample to the larger population.

Ensuring Validity and Reliability in Measurement

When selecting measurement instruments, it is critical they demonstrate:

Validity: The extent to which the instrument measures what it intends to measure.

Reliability: The consistency of measurement over time and across raters.

Researchers should choose instruments aligned to their research questions and study methodology.

Data Management Essentials

Proper data management requires:

Ethical collection procedures respecting autonomy, justice, beneficence and non-maleficence.

Handling missing data through deletion, imputation or modeling procedures.

Data cleaning by identifying and fixing errors, inconsistencies and duplicates.

Data screening via visual inspection and statistical methods to detect anomalies.
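A minimal sketch of these handling steps with pandas (the data frame and the chosen imputation rule are assumptions for illustration only):

```python
# Sketch: removing duplicates, imputing a missing value, and screening out-of-range entries.
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "participant": [1, 2, 2, 3, 4],
    "age": [34, 29, 29, np.nan, 41],
    "score": [78, 85, 85, 90, 300],  # 300 is outside the instrument's 0-100 range
})

df = df.drop_duplicates()                          # remove duplicate records
df["age"] = df["age"].fillna(df["age"].median())   # one simple imputation option among several
df = df[df["score"].between(0, 100)]               # screen out implausible values

print(df)
```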

Data Management Techniques and Ethical Considerations

Ethical data management includes:

Obtaining informed consent from all participants.

Anonymization and encryption to protect privacy.

Secure data storage and transfer procedures.

Responsible use of statistical tools free from manipulation or misrepresentation.

Adhering to ethical guidelines preserves public trust in the integrity of research.

Statistical Methods and Procedures

This section provides an introduction to key quantitative analysis techniques and guidance on when to apply them to different types of research questions and data.

Descriptive Statistics and Data Summarization

Descriptive statistics summarize and organize data characteristics such as central tendency, variability, and distributions. Common descriptive statistical methods include:

Measures of central tendency (mean, median, mode)

Measures of variability (range, interquartile range, standard deviation)

Graphical representations (histograms, box plots, scatter plots)

Frequency distributions and percentages

These methods help describe and summarize the sample data so researchers can spot patterns and trends.

Inferential Statistics for Generalizing Findings

While descriptive statistics summarize sample data, inferential statistics help generalize findings to the larger population. Common techniques include:

Hypothesis testing with t-tests, ANOVA

Correlation and regression analysis

Nonparametric tests

These methods allow researchers to draw conclusions and make predictions about the broader population based on the sample data.

Selecting the Right Statistical Tools

Choosing the appropriate analyses involves assessing:

The research design and questions asked

Type of data (categorical, continuous)

Data distributions

Statistical assumptions required

Matching the correct statistical tests to these elements helps ensure accurate results.

Statistical Treatment of Data for Quantitative Research

For quantitative research, common statistical data treatments include:

Testing data reliability and validity

Checking assumptions of statistical tests

Transforming non-normal data

Identifying and handling outliers

Applying appropriate analyses for the research questions and data type

Examples and case studies help demonstrate correct application of statistical tests.
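As one small illustration of the middle steps, the sketch below log-transforms a positively skewed variable and flags outliers with the 1.5 × IQR rule (the data and the rule are illustrative, not prescriptive):

```python
# Sketch: transforming skewed data and flagging outliers.
import numpy as np

data = np.array([1.2, 1.5, 1.7, 2.0, 2.3, 2.4, 2.8, 3.1, 9.5])  # right-skewed, one extreme value

log_data = np.log(data)  # a common transformation for positive, right-skewed data

q1, q3 = np.percentile(data, [25, 75])
iqr = q3 - q1
outliers = data[(data < q1 - 1.5 * iqr) | (data > q3 + 1.5 * iqr)]

print("Outliers flagged:", outliers)  # [9.5] with these numbers
```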

Approaches to Qualitative Data Analysis

Qualitative data is analyzed through methods like:

Thematic analysis

Content analysis

Discourse analysis

Grounded theory

These help researchers discover concepts and patterns within non-numerical data to derive rich insights.

Data Presentation and Research Method

Crafting effective visuals for data presentation.

When presenting analyzed results and statistics in a research paper, well-designed tables, graphs, and charts are key for clearly showcasing patterns in the data to readers. Adhering to formatting standards like APA helps ensure professional data presentation. Consider these best practices:

Choose the appropriate visual type based on the type of data and relationship being depicted. For example, bar charts for comparing categorical data, line graphs to show trends over time.

Label the x-axis, y-axis, legends clearly. Include informative captions.

Use consistent, readable fonts and sizing. Avoid clutter with unnecessary elements. White space can aid readability.

Order data logically, such as largest to smallest values or chronologically.

Include clear statistical notations, like error bars, where applicable.

Following academic standards for visuals lends credibility while making interpretation intuitive for readers.
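For example, a labelled bar chart with error bars can be produced with matplotlib (an assumption; the values are invented for illustration):

```python
# Sketch: a bar chart comparing group means, with error bars and axis labels.
import matplotlib.pyplot as plt

groups = ["Control", "Intervention"]
means = [71, 78]
sds = [4.1, 3.2]

fig, ax = plt.subplots()
ax.bar(groups, means, yerr=sds, capsize=5, color="steelblue")
ax.set_xlabel("Group")
ax.set_ylabel("Mean test score")
ax.set_title("Test scores by group (error bars show 1 SD)")
plt.tight_layout()
plt.savefig("group_means.png", dpi=300)
```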

Writing the Results Section with Clarity

When writing the quantitative Results section, aim for clarity by balancing statistical reporting with interpretation of findings. Consider this structure:

Open with an overview of the analysis approach and measurements used.

Break down results by logical subsections for each hypothesis, construct measured etc.

Report exact statistics first, followed by interpretation of their meaning. For example, “Participants exposed to the intervention had significantly higher average scores (M=78, SD=3.2) compared to controls (M=71, SD=4.1), t(115)=3.42, p = 0.001. This suggests the intervention was highly effective for increasing scores.”

Use present verb tense and scientific, formal language.

Include tables/figures where they aid understanding or visualization.

Writing results clearly gives readers deeper context around statistical findings.

Highlighting Research Method and Design

With a results section full of statistics, it's vital to communicate key aspects of the research method and design. Consider including:

Brief overview of study variables, materials, apparatus used. Helps reproducibility.

Descriptions of study sampling techniques, data collection procedures. Supports transparency.

Explanations around approaches to measurement, data analysis performed. Bolsters methodological rigor.

Noting control variables, attempts to limit biases etc. Demonstrates awareness of limitations.

Covering these methodological details shows readers the care taken in designing the study and analyzing the results obtained.

Acknowledging Limitations and Addressing Biases

Honestly recognizing methodological weaknesses and limitations goes a long way in establishing credibility within the published discussion section. Consider transparently noting:

Measurement errors and biases that may have impacted findings.

Limitations around sampling methods that constrain generalizability.

Caveats related to statistical assumptions, analysis techniques applied.

Attempts made to control/account for biases and directions for future research.

Rather than detracting value, acknowledging limitations demonstrates academic integrity regarding the research performed. It also gives readers deeper insight into interpreting the reported results and findings.

Conclusion: Synthesizing Statistical Treatment Insights

Recap of statistical treatment fundamentals.

Statistical treatment of data is a crucial component of high-quality quantitative research. Proper application of statistical methods and analysis principles enables valid interpretations and inferences from study data. Key fundamentals covered include:

Descriptive statistics to summarize and describe the basic features of study data

Inferential statistics to make judgments about probability and significance based on the data

Using appropriate statistical tools aligned to the research design and objectives

Following established practices for measurement techniques, data collection, and reporting

Adhering to these core tenets ensures research integrity and allows findings to withstand scientific scrutiny.

Key Takeaways for Research Paper Success

When incorporating statistical treatment into a research paper, keep these best practices in mind:

Clearly state the research hypothesis and variables under examination

Select reliable and valid quantitative measures for assessment

Determine appropriate sample size to achieve statistical power

Apply correct analytical methods suited to the data type and distribution

Comprehensively report methodology procedures and statistical outputs

Interpret results in context of the study limitations and scope

Following these guidelines will bolster confidence in the statistical treatment and strengthen the research quality overall.

Encouraging Continued Learning and Application

As statistical techniques continue advancing, it is imperative for researchers to actively further their statistical literacy. Regularly reviewing new methodological developments and learning advanced tools will augment analytical capabilities. Persistently putting enhanced statistical knowledge into practice through research projects and manuscript preparations will cement competencies. Statistical treatment mastery is a journey requiring persistent effort, but one that pays dividends in research proficiency.


Statistical Treatment of Data – Explained & Example


Statistical Treatment of Data in Research

‘Statistical treatment’ is when you apply a statistical method to a data set to draw meaning from it. Statistical treatment can be either descriptive statistics, which describes the relationship between variables in a population, or inferential statistics, which tests a hypothesis by making inferences from the collected data.

Introduction to Statistical Treatment in Research

Every research student, regardless of whether they are a biologist, computer scientist or psychologist, must have a basic understanding of statistical treatment if their study is to be reliable.

This is because designing experiments and collecting data are only a small part of conducting research. The other components, which are often not so well understood by new researchers, are the analysis, interpretation and presentation of the data. This is just as important, if not more important, as this is where meaning is extracted from the study.

What is Statistical Treatment of Data?

Statistical treatment of data is when you apply some form of statistical method to a data set to transform it from a group of meaningless numbers into meaningful output.

Statistical treatment of data involves the use of statistical methods such as:

  • regression,
  • conditional probability,
  • standard deviation and
  • distribution range.

These statistical methods allow us to investigate the statistical relationships between the data and identify possible errors in the study.

In addition to being able to identify trends, statistical treatment also allows us to organise and process our data in the first place. This is because when carrying out statistical analysis of our data, it is generally more useful to draw several conclusions for each subgroup within our population than to draw a single, more general conclusion for the whole population. However, to do this, we need to be able to classify the population into different subgroups so that we can later break down our data in the same way before analysing it.

Statistical Treatment Example – Quantitative Research

Statistical Treatment of Data Example

For a statistical treatment of data example, consider a medical study that is investigating the effect of a drug on the human population. As the drug can affect different people in different ways based on parameters such as gender, age and race, the researchers would want to group the data into different subgroups based on these parameters to determine how each one affects the effectiveness of the drug. Categorising the data in this way is an example of performing basic statistical treatment.
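In code, that categorisation is often just a grouped summary; here is a minimal pandas sketch with invented trial data:

```python
# Sketch: summarising drug response within gender/age subgroups.
import pandas as pd

trial = pd.DataFrame({
    "gender": ["F", "M", "F", "M", "F", "M"],
    "age_group": ["18-35", "18-35", "36-60", "36-60", "61+", "61+"],
    "improvement": [12.1, 9.8, 10.4, 8.9, 7.2, 6.5],
})

# Mean improvement within each gender/age subgroup
print(trial.groupby(["gender", "age_group"])["improvement"].mean())
```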

Type of Errors

A fundamental part of statistical treatment is using statistical methods to identify possible outliers and errors. No matter how careful we are, all experiments are subject to inaccuracies resulting from two types of errors: systematic errors and random errors.

Systematic errors are errors associated with either the equipment being used to collect the data or with the method in which they are used. Random errors are errors that occur unknowingly or unpredictably in the experimental configuration, such as internal deformations within specimens or small voltage fluctuations in measurement testing instruments.

These experimental errors, in turn, can lead to two types of conclusion errors: type I errors and type II errors. A type I error is a false positive which occurs when a researcher rejects a true null hypothesis. On the other hand, a type II error is a false negative which occurs when a researcher fails to reject a false null hypothesis.


Statistical Treatment


What is Statistical Treatment?

Statistical treatment can mean a few different things:

  • In Data Analysis: Applying any statistical method — like regression or calculating a mean — to data.
  • In Factor Analysis: Any combination of factor levels is called a treatment.
  • In a Thesis or Experiment: A summary of the procedure, including statistical methods used.

1. Statistical Treatment in Data Analysis

The term “statistical treatment” is a catch-all term which means to apply any statistical method to your data. Treatments are divided into two groups: descriptive statistics, which summarize your data as a graph or summary statistic, and inferential statistics, which make predictions and test hypotheses about your data. Treatments could include:

  • Finding standard deviations and sample standard errors
  • Finding t-scores or z-scores
  • Calculating correlation coefficients

2. Treatments in Factor Analysis

In factor analysis and designed experiments, each combination of factor levels assigned to subjects is called a treatment.

3. Treatments in a Thesis or Experiment

Sometimes you might be asked to include a treatment as part of a thesis. This is asking you to summarize the data and analysis portion of your experiment, including measurements and formulas used. For example, the following experimental summary is from “Statistical Treatment” in Acta Physiologica Scandinavica:

“Each of the test solutions was injected twice in each subject…30-42 values were obtained for the intensity, and a like number for the duration, of the pain induced by the solution. The pain values reported in the following are arithmetical means for these 30-42 injections.”

The author goes on to provide formulas for the mean, the standard deviation and the standard error of the mean.

Vogt, W.P. (2005). Dictionary of Statistics & Methodology: A Nontechnical Guide for the Social Sciences. SAGE.

Wheelan, C. (2014). Naked Statistics. W. W. Norton & Company.

Unknown author (1961). Chapter 3: Statistical Treatment. Acta Physiologica Scandinavica, 51(s179), 16–20.


Statistical Treatment Of Data

Statistical treatment of data is essential in order to make use of the data in the right form. Raw data collection is only one aspect of any experiment; the organization of data is equally important so that appropriate conclusions can be drawn. This is what statistical treatment of data is all about.


There are many techniques involved in statistics that treat data in the required manner. Statistical treatment of data is essential in all experiments, whether social, scientific or any other form. Statistical treatment of data greatly depends on the kind of experiment and the desired result from the experiment.

For example, in a survey regarding the election of a Mayor, parameters like age, gender, occupation, etc. would be important in influencing the person's decision to vote for a particular candidate. Therefore the data needs to be treated in these reference frames.

An important aspect of statistical treatment of data is the handling of errors. All experiments invariably produce errors and noise. Both systematic and random errors need to be taken into consideration.

Depending on the type of experiment being performed, Type-I and Type-II errors also need to be handled. These are the cases of false positives and false negatives that are important to understand and eliminate in order to make sense from the result of the experiment.

research statistical treatment example

Treatment of Data and Distribution

Trying to classify data into commonly known patterns is a tremendous help and is intricately related to statistical treatment of data. This is because distributions such as the normal probability distribution occur so commonly in nature that they are the underlying distributions in most medical, social and physical experiments.

Therefore, if a given sample is known to be normally distributed, the statistical treatment of data is made easier for the researcher, who already has a large body of supporting theory to draw on. Care should always be taken, however, not to assume all data are normally distributed; normality should always be confirmed with appropriate testing.

Statistical treatment of data also involves describing the data. The best way to do this is through measures of central tendency like the mean, median and mode. These help the researcher explain briefly how the data are concentrated. Range, uncertainty and standard deviation help to understand the spread of the data. Two distributions with the same mean can have wildly different standard deviations, which shows how well the data points are concentrated around the mean.

Statistical treatment of data is an important aspect of all experimentation today and a thorough understanding is necessary to conduct the right experiments with the right inferences from the data obtained.



Statistical Treatment of Data

Many times during the course of the Chemistry 115 laboratory you will be asked to report an average, relative deviation, and a standard deviation. You may also have to analyze multiple trials to decide whether or not a certain piece of data should be discarded. This section describes these procedures.

Average and Standard Deviation

The average or mean of the data set, \(\bar{x}\), is defined by:

\(\bar{x} = \dfrac{\sum_{i=1}^N x_i}{N}\)

where \(x_i\) is the result of the \(i\)-th measurement, \(i = 1, \ldots, N\). The standard deviation, σ, measures how closely values are clustered about the mean. The standard deviation for small samples is defined by:

\( \sigma = \sqrt{\dfrac{\sum_{i=1}^N (x_i-\bar{x})^2}{N}} \)

The smaller the value of σ, the more closely packed the data are about the mean, and we say that the measurements are precise. In contrast, a high accuracy of the measurements occurs if the mean is close to the real result (presuming we know that information). It is easy to tell if your measurements are precise, but it is often difficult to tell if they are accurate.

Relative Deviation

The relative average deviation, d, like the standard deviation, is useful to determine how data are clustered about a mean. The advantage of a relative deviation is that it incorporates the relative numerical magnitude of the average. The relative average deviation, d, is calculated in the following way.
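The formula itself did not survive extraction here; a commonly used definition (an assumption, not necessarily the exact expression from the original page) divides the average absolute deviation by the mean and reports the result as a percentage:

\[ d = \frac{\tfrac{1}{N}\sum_{i=1}^{N} \lvert x_i - \bar{x} \rvert}{\bar{x}} \times 100\% \]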

Analysis of Poor Data

Other important concepts and procedures.

  • Normal error curve: a histogram of an infinitely large number of good measurements usually follows a Gaussian distribution
  • Confidence limit (95%)
  • Linear least squares fit
  • Residual sum of squares
  • Correlation coefficient


Statistical Treatment of Data for Survey: The Right Approach


Statistical treatment of data is a process used to convert raw data into something interpretable. This process is essential because it allows businesses to make better decisions based on customer feedback. This blog post will give a short overview of the statistical treatment of data and how it can be used to improve your business.

What exactly is Statistical Treatment?

In its simplest form, statistical treatment of data is taking raw data and turning it into something that can be interpreted and used to make decisions. This process is important for businesses because it allows them to take customer feedback and turn it into actionable insights.

There are many different statistical data treatment methods, but the most common are surveys and polls. Surveys are a great way to collect large amounts of customer data, but they can be time-consuming and expensive to administer. Polls are a quicker and more efficient way to collect data, but they typically have a smaller sample size.

Statistical methods for surveys

Here are some statistical treatments of survey data and how they can help improve your customer feedback program.

Descriptive Statistics

Descriptive statistics are used to describe the overall characteristics of a dataset. This includes measures of central tendency (mean, median, mode) and dispersion (range, variance, standard deviation). Descriptive statistics can be used to generate summary reports of survey data. These reports can be used to understand the distribution of responses and identify outliers.


Inferential Statistics

Inferential statistics are used to make predictions or inferences about a population based on a sample. This is done by using estimation methods (point estimates and confidence intervals) and testing methods (hypothesis testing). Inferential statistics can be used to understand how likely it is that a particular population characteristic is true. For example, inferential statistics can be used to calculate the probability that a customer will be satisfied with a product or service.

Once you have collected your data, the next step is to choose a statistical analysis method. The most common methods are regression, correlation, and factor analysis.

Regression Analysis

Regression analysis is a method used to identify the relationships between different variables. For example, you could use regression analysis to understand how customer satisfaction ratings change based on the number of support tickets they open.

Correlation analysis

Correlation analysis is a method used to understand how two variables relate. For example, you could use correlation analysis to understand how customer satisfaction ratings change based on the number of support tickets they open.
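A minimal sketch of that example with SciPy (the figures are invented):

```python
# Sketch: correlating support tickets opened with satisfaction ratings.
from scipy import stats

tickets_opened = [0, 1, 1, 2, 3, 5, 8]
satisfaction = [9.1, 8.7, 8.9, 8.2, 7.5, 6.8, 5.9]

r, p = stats.pearsonr(tickets_opened, satisfaction)
print(f"r = {r:.2f}, p = {p:.4f}")  # a strongly negative r suggests satisfaction falls as tickets rise
```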

Factor analysis

Factor analysis is a method used to identify which variables impact a particular outcome most. For example, you could use factor analysis to determine which factors influence customer satisfaction ratings most.

Once you have chosen a method of statistical treatment of data, the next step is to apply it to your dataset. This step can be done using Excel or another similar program. Once you have applied your chosen method, you can interpret the results and use them to make decisions about your business.

Wrapping up…

Statistical treatment of data for surveys is a necessary process that allows businesses to take customer feedback and turn it into actionable insights. There are many different statistical data treatment methods, but the most common are surveys and polls. Online survey tools like SurveySparrow can help you with the same. Once you have collected your data, you must choose a form of statistical analysis and apply it to your dataset. You will then be able to interpret the results and use them to make decisions about your business. Thanks for reading!

The Beginner's Guide to Statistical Analysis | 5 Steps & Examples

Statistical analysis means investigating trends, patterns, and relationships using quantitative data. It is an important research tool used by scientists, governments, businesses, and other organisations.

To draw valid conclusions, statistical analysis requires careful planning from the very start of the research process. You need to specify your hypotheses and make decisions about your research design, sample size, and sampling procedure.

After collecting data from your sample, you can organise and summarise the data using descriptive statistics. Then, you can use inferential statistics to formally test hypotheses and make estimates about the population. Finally, you can interpret and generalise your findings.

This article is a practical introduction to statistical analysis for students and researchers. We’ll walk you through the steps using two research examples. The first investigates a potential cause-and-effect relationship, while the second investigates a potential correlation between variables.

Table of contents

  • Step 1: Write your hypotheses and plan your research design
  • Step 2: Collect data from a sample
  • Step 3: Summarise your data with descriptive statistics
  • Step 4: Test hypotheses or make estimates with inferential statistics
  • Step 5: Interpret your results
  • Frequently asked questions about statistics

To collect valid data for statistical analysis, you first need to specify your hypotheses and plan out your research design.

Writing statistical hypotheses

The goal of research is often to investigate a relationship between variables within a population. You start with a prediction, and use statistical analysis to test that prediction.

A statistical hypothesis is a formal way of writing a prediction about a population. Every research prediction is rephrased into null and alternative hypotheses that can be tested using sample data.

While the null hypothesis always predicts no effect or no relationship between variables, the alternative hypothesis states your research prediction of an effect or relationship.

  • Null hypothesis: A 5-minute meditation exercise will have no effect on math test scores in teenagers.
  • Alternative hypothesis: A 5-minute meditation exercise will improve math test scores in teenagers.
  • Null hypothesis: Parental income and GPA have no relationship with each other in college students.
  • Alternative hypothesis: Parental income and GPA are positively correlated in college students.

Planning your research design

A research design is your overall strategy for data collection and analysis. It determines the statistical tests you can use to test your hypothesis later on.

First, decide whether your research will use a descriptive, correlational, or experimental design. Experiments directly influence variables, whereas descriptive and correlational studies only measure variables.

  • In an experimental design, you can assess a cause-and-effect relationship (e.g., the effect of meditation on test scores) using statistical tests of comparison or regression.
  • In a correlational design, you can explore relationships between variables (e.g., parental income and GPA) without any assumption of causality using correlation coefficients and significance tests.
  • In a descriptive design, you can study the characteristics of a population or phenomenon (e.g., the prevalence of anxiety in U.S. college students) using statistical tests to draw inferences from sample data.

Your research design also concerns whether you’ll compare participants at the group level or individual level, or both.

  • In a between-subjects design, you compare the group-level outcomes of participants who have been exposed to different treatments (e.g., those who performed a meditation exercise vs those who didn’t).
  • In a within-subjects design, you compare repeated measures from participants who have participated in all treatments of a study (e.g., scores from before and after performing a meditation exercise).
  • In a mixed (factorial) design, one variable is altered between subjects and another is altered within subjects (e.g., pretest and posttest scores from participants who either did or didn’t do a meditation exercise).

Example: Experimental research design
First, you’ll take baseline test scores from participants. Then, your participants will undergo a 5-minute meditation exercise. Finally, you’ll record participants’ scores from a second math test.

In this experiment, the independent variable is the 5-minute meditation exercise, and the dependent variable is the math test score from before and after the intervention.

Example: Correlational research design
In a correlational study, you test whether there is a relationship between parental income and GPA in graduating college students. To collect your data, you will ask participants to fill in a survey and self-report their parents’ incomes and their own GPA.

Measuring variables

When planning a research design, you should operationalise your variables and decide exactly how you will measure them.

For statistical analysis, it’s important to consider the level of measurement of your variables, which tells you what kind of data they contain:

  • Categorical data represents groupings. These may be nominal (e.g., gender) or ordinal (e.g. level of language ability).
  • Quantitative data represents amounts. These may be on an interval scale (e.g. test score) or a ratio scale (e.g. age).

Many variables can be measured at different levels of precision. For example, age data can be quantitative (8 years old) or categorical (young). If a variable is coded numerically (e.g., level of agreement from 1–5), it doesn’t automatically mean that it’s quantitative instead of categorical.

Identifying the measurement level is important for choosing appropriate statistics and hypothesis tests. For example, you can calculate a mean score with quantitative data, but not with categorical data.

In a research study, along with measures of your variables of interest, you’ll often collect data on relevant participant characteristics.

Variable | Type of data
Age | Quantitative (ratio)
Gender | Categorical (nominal)
Race or ethnicity | Categorical (nominal)
Baseline test scores | Quantitative (interval)
Final test scores | Quantitative (interval)
Parental income | Quantitative (ratio)
GPA | Quantitative (interval)

Population vs sample

In most cases, it’s too difficult or expensive to collect data from every member of the population you’re interested in studying. Instead, you’ll collect data from a sample.

Statistical analysis allows you to apply your findings beyond your own sample as long as you use appropriate sampling procedures. You should aim for a sample that is representative of the population.

Sampling for statistical analysis

There are two main approaches to selecting a sample.

  • Probability sampling: every member of the population has a chance of being selected for the study through random selection.
  • Non-probability sampling: some members of the population are more likely than others to be selected for the study because of criteria such as convenience or voluntary self-selection.

In theory, for highly generalisable findings, you should use a probability sampling method. Random selection reduces sampling bias and ensures that data from your sample is actually typical of the population. Parametric tests can be used to make strong statistical inferences when data are collected using probability sampling.

But in practice, it’s rarely possible to gather the ideal sample. While non-probability samples are more likely to be biased, they are much easier to recruit and collect data from. Non-parametric tests are more appropriate for non-probability samples, but they result in weaker inferences about the population.

If you want to use parametric tests for non-probability samples, you have to make the case that:

  • your sample is representative of the population you’re generalising your findings to.
  • your sample lacks systematic bias.

Keep in mind that external validity means that you can only generalise your conclusions to others who share the characteristics of your sample. For instance, results from Western, Educated, Industrialised, Rich and Democratic samples (e.g., college students in the US) aren’t automatically applicable to all non-WEIRD populations.

If you apply parametric tests to data from non-probability samples, be sure to elaborate on the limitations of how far your results can be generalised in your discussion section.

Create an appropriate sampling procedure

Based on the resources available for your research, decide on how you’ll recruit participants.

  • Will you have resources to advertise your study widely, including outside of your university setting?
  • Will you have the means to recruit a diverse sample that represents a broad population?
  • Do you have time to contact and follow up with members of hard-to-reach groups?

Example: Sampling (experimental study)
Your participants are self-selected by their schools. Although you’re using a non-probability sample, you aim for a diverse and representative sample.

Example: Sampling (correlational study)
Your main population of interest is male college students in the US. Using social media advertising, you recruit senior-year male college students from a smaller subpopulation: seven universities in the Boston area.

Calculate sufficient sample size

Before recruiting participants, decide on your sample size either by looking at other studies in your field or using statistics. A sample that’s too small may be unrepresentative of the population, while a sample that’s too large will be more costly than necessary.

There are many sample size calculators online. Different formulas are used depending on whether you have subgroups or how rigorous your study should be (e.g., in clinical research). As a rule of thumb, a minimum of 30 units or more per subgroup is necessary.

To use these calculators, you have to understand and input these key components:

  • Significance level (alpha): the risk of rejecting a true null hypothesis that you are willing to take, usually set at 5%.
  • Statistical power: the probability of your study detecting an effect of a certain size if there is one, usually 80% or higher.
  • Expected effect size: a standardised indication of how large the expected result of your study will be, usually based on other similar studies.
  • Population standard deviation: an estimate of the population parameter based on a previous study or a pilot study of your own.
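Putting those components together, a textbook two-sample formula (one of several approaches; the numbers below are illustrative) looks like this in Python:

```python
# Sketch: approximate per-group sample size for comparing two means.
from scipy import stats

alpha = 0.05    # significance level
power = 0.80    # desired statistical power
effect = 5.0    # smallest mean difference worth detecting
sigma = 10.0    # population standard deviation, e.g. from a pilot study

z_alpha = stats.norm.ppf(1 - alpha / 2)
z_beta = stats.norm.ppf(power)

n_per_group = 2 * ((z_alpha + z_beta) * sigma / effect) ** 2
print(f"About {n_per_group:.0f} participants per group")  # ~63 with these inputs
```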

Once you’ve collected all of your data, you can inspect them and calculate descriptive statistics that summarise them.

Inspect your data

There are various ways to inspect your data, including the following:

  • Organising data from each variable in frequency distribution tables.
  • Displaying data from a key variable in a bar chart to view the distribution of responses.
  • Visualising the relationship between two variables using a scatter plot.

By visualising your data in tables and graphs, you can assess whether your data follow a skewed or normal distribution and whether there are any outliers or missing data.

A normal distribution means that your data are symmetrically distributed around a center where most values lie, with the values tapering off at the tail ends.

Mean, median, mode, and standard deviation in a normal distribution

In contrast, a skewed distribution is asymmetric and has more values on one end than the other. The shape of the distribution is important to keep in mind because only some descriptive statistics should be used with skewed distributions.

Extreme outliers can also produce misleading statistics, so you may need a systematic approach to dealing with these values.

Calculate measures of central tendency

Measures of central tendency describe where most of the values in a data set lie. Three main measures of central tendency are often reported:

  • Mode: the most popular response or value in the data set.
  • Median: the value in the exact middle of the data set when ordered from low to high.
  • Mean: the sum of all values divided by the number of values.

However, depending on the shape of the distribution and level of measurement, only one or two of these measures may be appropriate. For example, many demographic characteristics can only be described using the mode or proportions, while a variable like reaction time may not have a mode at all.

Calculate measures of variability

Measures of variability tell you how spread out the values in a data set are. Four main measures of variability are often reported:

  • Range: the highest value minus the lowest value of the data set.
  • Interquartile range: the range of the middle half of the data set.
  • Standard deviation: the average distance between each value in your data set and the mean.
  • Variance: the square of the standard deviation.

Once again, the shape of the distribution and level of measurement should guide your choice of variability statistics. The interquartile range is the best measure for skewed distributions, while standard deviation and variance provide the best information for normal distributions.
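All four measures are one-liners in NumPy; the data below are invented for illustration:

```python
# Sketch: range, interquartile range, standard deviation, and variance.
import numpy as np

data = np.array([12, 15, 17, 18, 21, 24, 25, 29, 40])

data_range = data.max() - data.min()
q1, q3 = np.percentile(data, [25, 75])
iqr = q3 - q1
sd = data.std(ddof=1)
variance = data.var(ddof=1)

print(f"Range = {data_range}, IQR = {iqr}, SD = {sd:.2f}, Variance = {variance:.2f}")
```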

Using your table, you should check whether the units of the descriptive statistics are comparable for pretest and posttest scores. For example, are the variance levels similar across the groups? Are there any extreme values? If there are, you may need to identify and remove extreme outliers in your data set or transform your data before performing a statistical test.

Statistic | Pretest scores | Posttest scores
Mean | 68.44 | 75.25
Standard deviation | 9.43 | 9.88
Variance | 88.96 | 97.96
Range | 36.25 | 45.12
Sample size (n) | 30

From this table, we can see that the mean score increased after the meditation exercise, and the variances of the two scores are comparable. Next, we can perform a statistical test to find out if this improvement in test scores is statistically significant in the population.

Example: Descriptive statistics (correlational study)
After collecting data from 653 students, you tabulate descriptive statistics for annual parental income and GPA.

It’s important to check whether you have a broad range of data points. If you don’t, your data may be skewed towards some groups more than others (e.g., high academic achievers), and only limited inferences can be made about a relationship.

Statistic | Parental income (USD) | GPA
Mean | 62,100 | 3.12
Standard deviation | 15,000 | 0.45
Variance | 225,000,000 | 0.16
Range | 8,000–378,000 | 2.64–4.00
Sample size (n) | 653

A number that describes a sample is called a statistic, while a number describing a population is called a parameter. Using inferential statistics, you can make conclusions about population parameters based on sample statistics.

Researchers often use two main methods (simultaneously) to make inferences in statistics.

  • Estimation: calculating population parameters based on sample statistics.
  • Hypothesis testing: a formal process for testing research predictions about the population using samples.

You can make two types of estimates of population parameters from sample statistics:

  • A point estimate : a value that represents your best guess of the exact parameter.
  • An interval estimate : a range of values that represent your best guess of where the parameter lies.

If your aim is to infer and report population characteristics from sample data, it’s best to use both point and interval estimates in your paper.

You can consider a sample statistic a point estimate for the population parameter when you have a representative sample (e.g., in a wide public opinion poll, the proportion of a sample that supports the current government is taken as the population proportion of government supporters).

There’s always error involved in estimation, so you should also provide a confidence interval as an interval estimate to show the variability around a point estimate.

A confidence interval uses the standard error and the z score from the standard normal distribution to convey where you’d generally expect to find the population parameter most of the time.
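As a minimal sketch of that calculation, a 95% confidence interval for a mean can be built from the sample mean, the standard error, and the z score 1.96 from the standard normal distribution. The sample values below are hypothetical; for small samples, a t critical value is typically used instead of z.

```python
import math
import statistics

# Hypothetical sample (not data from the examples in this guide)
sample = [68, 72, 75, 70, 74, 69, 77, 73, 71, 76]

mean = statistics.mean(sample)
standard_error = statistics.stdev(sample) / math.sqrt(len(sample))

z = 1.96  # z score for a 95% confidence level on the standard normal distribution
lower = mean - z * standard_error
upper = mean + z * standard_error

print(f"Point estimate: {mean:.2f}")
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```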

Hypothesis testing

Using data from a sample, you can test hypotheses about relationships between variables in the population. Hypothesis testing starts with the assumption that the null hypothesis is true in the population, and you use statistical tests to assess whether the null hypothesis can be rejected or not.

Statistical tests determine where your sample data would lie on an expected distribution of sample data if the null hypothesis were true. These tests give two main outputs:

  • A test statistic tells you how much your data differ from the null hypothesis of the test.
  • A p value tells you the likelihood of obtaining your results if the null hypothesis is actually true in the population.

Statistical tests come in three main varieties:

  • Comparison tests assess group differences in outcomes.
  • Regression tests assess cause-and-effect relationships between variables.
  • Correlation tests assess relationships between variables without assuming causation.

Your choice of statistical test depends on your research questions, research design, sampling method, and data characteristics.

Parametric tests

Parametric tests make powerful inferences about the population based on sample data. But to use them, some assumptions must be met, and only some types of variables can be used. If your data violate these assumptions, you can perform appropriate data transformations or use alternative non-parametric tests instead.

A regression models the extent to which changes in a predictor variable result in changes in an outcome variable (or variables).

  • A simple linear regression includes one predictor variable and one outcome variable.
  • A multiple linear regression includes two or more predictor variables and one outcome variable.

Comparison tests usually compare the means of groups. These may be the means of different groups within a sample (e.g., a treatment and control group), the means of one sample group taken at different times (e.g., pretest and posttest scores), or a sample mean and a population mean.

  • A t test is for exactly 1 or 2 groups when the sample is small (30 or fewer).
  • A z test is for exactly 1 or 2 groups when the sample is large.
  • An ANOVA is for 3 or more groups.

The z and t tests have subtypes based on the number and types of samples and the hypotheses:

  • If you have only one sample that you want to compare to a population mean, use a one-sample test .
  • If you have paired measurements (within-subjects design), use a dependent (paired) samples test .
  • If you have completely separate measurements from two unmatched groups (between-subjects design), use an independent (unpaired) samples test .
  • If you expect a difference between groups in a specific direction, use a one-tailed test .
  • If you don’t have any expectations for the direction of a difference between groups, use a two-tailed test .

The only parametric correlation test is Pearson’s r . The correlation coefficient ( r ) tells you the strength of a linear relationship between two quantitative variables.

However, to test whether the correlation in the sample is strong enough to be important in the population, you also need to perform a significance test of the correlation coefficient, usually a t test, to obtain a p value. This test uses your sample size to calculate how much the correlation coefficient differs from zero in the population.

You use a dependent-samples, one-tailed t test to assess whether the meditation exercise significantly improved math test scores. The test gives you:

  • a t value (test statistic) of 3.00
  • a p value of 0.0028
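A dependent (paired) samples, one-tailed t test of this kind could be run in Python with SciPy, as sketched below. The pretest and posttest arrays are hypothetical stand-ins rather than the study data, so the output will not reproduce the values above; the alternative='greater' argument (available in recent SciPy versions) makes the test one-tailed in the expected direction.

```python
from scipy import stats

# Hypothetical paired scores for the same participants (not the study data above)
pretest = [65, 70, 62, 68, 74, 71, 66, 69]
posttest = [72, 76, 66, 75, 80, 73, 70, 77]

# Dependent (paired) samples t test, one-tailed in the direction posttest > pretest
result = stats.ttest_rel(posttest, pretest, alternative="greater")
print(f"t = {result.statistic:.2f}, p = {result.pvalue:.4f}")
```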

Although Pearson’s r is a test statistic, it doesn’t tell you anything about how significant the correlation is in the population. You also need to test whether this sample correlation coefficient is large enough to demonstrate a correlation in the population.

A t test can also determine how significantly a correlation coefficient differs from zero based on sample size. Since you expect a positive correlation between parental income and GPA, you use a one-sample, one-tailed t test. The t test gives you:

  • a t value of 3.08
  • a p value of 0.001
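The t statistic for a correlation coefficient can also be computed directly from r and the sample size using t = r * sqrt(n - 2) / sqrt(1 - r^2), with the p value read from a t distribution with n - 2 degrees of freedom. The r and n below are hypothetical placeholders, not the study's figures.

```python
import math
from scipy import stats

# Hypothetical values (not the study's actual figures): sample correlation and sample size
r = 0.25
n = 100

# t statistic testing whether the correlation differs from zero
t = r * math.sqrt(n - 2) / math.sqrt(1 - r**2)

# One-tailed p value (expecting a positive correlation), n - 2 degrees of freedom
p = stats.t.sf(t, df=n - 2)
print(f"t = {t:.2f}, one-tailed p = {p:.4f}")
```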

The final step of statistical analysis is interpreting your results.

Statistical significance

In hypothesis testing, statistical significance is the main criterion for forming conclusions. You compare your p value to a set significance level (usually 0.05) to decide whether your results are statistically significant or non-significant.

Statistically significant results are considered unlikely to have arisen solely due to chance. There is only a very low chance of such a result occurring if the null hypothesis is true in the population.

Example: Interpret your results (experimental study)

You compare your p value of 0.0028 to your significance threshold of 0.05. Because the p value falls below this threshold, you can reject the null hypothesis. This means that you believe the meditation intervention, rather than random factors, directly caused the increase in test scores.

Example: Interpret your results (correlational study)

You compare your p value of 0.001 to your significance threshold of 0.05. With a p value under this threshold, you can reject the null hypothesis. This indicates a statistically significant correlation between parental income and GPA in male college students.

Note that correlation doesn’t always mean causation, because there are often many underlying factors contributing to a complex variable like GPA. Even if one variable is related to another, this may be because of a third variable influencing both of them, or indirect links between the two variables.

Effect size

A statistically significant result doesn’t necessarily mean that there are important real life applications or clinical outcomes for a finding.

In contrast, the effect size indicates the practical significance of your results. It’s important to report effect sizes along with your inferential statistics for a complete picture of your results. You should also report interval estimates of effect sizes if you’re writing an APA style paper .

Example: Effect size (experimental study)

With a Cohen’s d of 0.72, there’s medium to high practical significance to your finding that the meditation exercise improved test scores.

Example: Effect size (correlational study)

To determine the effect size of the correlation coefficient, you compare your Pearson’s r value to Cohen’s effect size criteria.
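One common way to compute Cohen's d for two sets of scores is to divide the difference in means by a pooled standard deviation, as sketched below on hypothetical data (for paired designs, d is sometimes computed instead from the mean and standard deviation of the difference scores).

```python
import statistics

def cohens_d(group1, group2):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    m1, m2 = statistics.mean(group1), statistics.mean(group2)
    v1, v2 = statistics.variance(group1), statistics.variance(group2)
    pooled_sd = (((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)) ** 0.5
    return (m1 - m2) / pooled_sd

# Hypothetical posttest and pretest scores (not the study data above)
posttest = [72, 76, 66, 75, 80, 73, 70, 77]
pretest = [65, 70, 62, 68, 74, 71, 66, 69]
print(f"Cohen's d = {cohens_d(posttest, pretest):.2f}")
```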

Decision errors

Type I and Type II errors are mistakes made in research conclusions. A Type I error means rejecting the null hypothesis when it’s actually true, while a Type II error means failing to reject the null hypothesis when it’s false.

You can aim to minimise the risk of these errors by selecting an optimal significance level and ensuring high power . However, there’s a trade-off between the two errors, so a fine balance is necessary.

Frequentist versus Bayesian statistics

Traditionally, frequentist statistics emphasises null hypothesis significance testing and always starts with the assumption of a true null hypothesis.

However, Bayesian statistics has grown in popularity as an alternative approach in the last few decades. In this approach, you use previous research to continually update your hypotheses based on your expectations and observations.

A Bayes factor compares the relative strength of evidence for the null versus the alternative hypothesis, rather than producing a conclusion about whether to reject the null hypothesis.

Hypothesis testing is a formal procedure for investigating our ideas about the world using statistics. It is used by scientists to test specific predictions, called hypotheses , by calculating how likely it is that a pattern or relationship between variables could have arisen by chance.

The research methods you use depend on the type of data you need to answer your research question .

  • If you want to measure something or test a hypothesis , use quantitative methods . If you want to explore ideas, thoughts, and meanings, use qualitative methods .
  • If you want to analyse a large amount of readily available data, use secondary data. If you want data specific to your purposes with control over how they are generated, collect primary data.
  • If you want to establish cause-and-effect relationships between variables , use experimental methods. If you want to understand the characteristics of a research subject, use descriptive methods.

Statistical analysis is the main method for analyzing quantitative research data . It uses probabilities and models to test predictions about a population from sample data.


Enago Academy

Effective Use of Statistics in Research – Methods and Tools for Data Analysis


Remember that sinking feeling you get when you are asked to analyze your data? Now that you have all the required raw data, you need to test your hypothesis statistically. Presenting your numerical data well as part of the statistics in your research will also help break the stereotype of the biology student who can’t do math.

Statistical methods are essential for scientific research. In fact, statistical methods pervade scientific research: they span planning, design, data collection, analysis, meaningful interpretation, and reporting of research findings. Furthermore, the results acquired from a research project are meaningless raw data unless analyzed with statistical tools. Careful statistical treatment is therefore necessary to justify research findings. In this article, we will discuss how statistical methods can help draw meaningful conclusions from biological studies.


Role of Statistics in Biological Research

Statistics is a branch of science that deals with the collection, organization, and analysis of data, from the sample to the whole population. It also helps in designing a study more meticulously and provides logical grounds for drawing conclusions about the hypothesis. Biology focuses on living organisms and their complex pathways, which are highly dynamic and cannot be fully explained by reasoning alone. Statistics, in turn, defines and explains patterns in a study based on the samples used; in short, it reveals the trends in the data a study has collected.

Biological researchers often disregard statistics during research planning and mainly apply statistical tools at the end of their experiment. This gives rise to a complicated set of results that are not easily analyzed with statistical tools. Statistics in research can instead help a researcher approach the study in a stepwise manner, in which the statistical analysis follows:

1. Establishing a Sample Size

Usually, a biological experiment starts with choosing samples and selecting the right number of replicates. Statistics in research rests on basic principles such as randomization and the use of sufficiently large samples. It teaches how choosing a sample of the right size from a large random pool helps extrapolate findings and reduce experimental bias and error.

2. Testing of Hypothesis

When conducting a statistical study with a large sample pool, biological researchers must make sure that a conclusion is statistically significant. To achieve this, a researcher must formulate a hypothesis before examining the distribution of the data. Statistics in research then helps interpret whether the data cluster near the mean or spread across the distribution; these trends help characterize the sample and test the hypothesis.

3. Data Interpretation Through Analysis

When dealing with large data sets, statistics in research assists in data analysis and helps researchers draw sound conclusions from their experiments and observations. Concluding a study manually or from visual observation alone may give erroneous results; thorough statistical analysis instead takes into account all the relevant statistical measures and the variance in the sample to provide a detailed interpretation of the data. In this way, researchers produce detailed, meaningful results to support their conclusions.

Types of Statistical Research Methods That Aid in Data Analysis


Statistical analysis is the process of examining samples of data for patterns or trends that help researchers anticipate situations and draw appropriate research conclusions. Based on the type of data, statistical analyses are of the following types:

1. Descriptive Analysis

Descriptive statistical analysis organizes and summarizes large data sets into graphs and tables. It involves processes such as tabulation, measures of central tendency, measures of dispersion or variance, and measures of skewness.

2. Inferential Analysis

Inferential statistical analysis allows researchers to extrapolate from data acquired from a small sample to the complete population. It helps draw conclusions and make decisions about the whole population on the basis of sample data, and it is the recommended approach for research projects that work with smaller samples but aim to generalize conclusions to a larger population.

3. Predictive Analysis

Predictive analysis is used to make predictions about future events. It is widely used by marketing companies, insurance organizations, online service providers, data-driven marketers, and financial corporations.

4. Prescriptive Analysis

Prescriptive analysis examines data to determine what should be done next. It is widely used in business analysis to find the best possible outcome for a situation. It is closely related to descriptive and predictive analysis, but prescriptive analysis focuses on recommending the most appropriate option among the available choices.

5. Exploratory Data Analysis

EDA is generally the first step of the data analysis process, conducted before any other statistical analysis technique. It focuses on exploring patterns in the data to recognize potential relationships, discover unknown associations, inspect missing values in the collected data, and extract maximum insight.

6. Causal Analysis

Causal analysis assists in understanding and determining why things happen the way they do. It helps identify the root cause of failures, or simply the underlying reason why something happens. For example, causal analysis can be used to understand what will happen to a given variable if another variable changes.

7. Mechanistic Analysis

This is the least common type of statistical analysis. Mechanistic analysis is used in big data analytics and the biological sciences. It aims to understand how individual changes in one variable cause corresponding changes in other variables, while excluding external influences.

Important Statistical Tools In Research

Researchers in the biological field often find statistical analysis the most daunting aspect of completing a research project. However, statistical tools can help researchers understand what to do with their data and how to interpret the results, making the process as easy as possible.

1. Statistical Package for Social Science (SPSS)

It is a widely used software package for human behavior research. SPSS can compile descriptive statistics as well as graphical depictions of results. Moreover, it includes the option to create scripts that automate analysis or carry out more advanced statistical processing.

2. R Foundation for Statistical Computing

This software package is used in human behavior research and many other fields. R is a powerful tool but has a steep learning curve and requires a certain level of coding. It also comes with an active community engaged in building and enhancing the software and its associated packages.

3. MATLAB (The Mathworks)

It is an analytical platform and a programming language. Researchers and engineers use it to write their own code to answer their research questions. While MATLAB can be difficult for novices to use, it offers great flexibility in terms of what the researcher needs.

4. Microsoft Excel

Although not the best solution for statistical analysis in research, MS Excel offers a wide variety of tools for data visualization and simple statistics. It makes it easy to generate summary metrics and customizable graphs and figures, and it is the most accessible option for those just starting out with statistics.

5. Statistical Analysis Software (SAS)

It is a statistical platform used in business, healthcare, and human behavior research alike. It can carry out advanced analyses and produce publication-worthy figures, tables, and charts.

6. GraphPad Prism

It is a premium software package used primarily by biology researchers, but it is versatile enough to be used in various other fields. Similar to SPSS, GraphPad offers scripting options to automate analyses and carry out complex statistical calculations.

7. Minitab

This software offers basic as well as advanced statistical tools for data analysis. Similar to GraphPad and SPSS, Minitab requires some command of coding and can offer automated analyses.

Use of Statistical Tools In Research and Data Analysis

Statistical tools are built to manage large amounts of data. Many biological studies rely on large data sets to analyze trends and patterns, so using statistical tools becomes essential: they handle large data sets and make data processing more convenient.

Following the steps above will help biological researchers present the statistics in their research in detail, develop accurate hypotheses, and use the correct tools for them.

A range of statistical tools can help researchers manage their research data and improve the outcome of their research through better interpretation of the data. Using statistics in research effectively requires understanding the research question, a working knowledge of statistics, and some personal experience with coding.

Have you faced challenges while using statistics in research? How did you manage it? Did you use any of the statistical tools to help you with your research data? Do write to us or comment below!





Statistical considerations for outcomes in clinical research: A review of common data types and methodology

Associated data

Supplemental material for “Statistical considerations for outcomes in clinical research: A review of common data types and methodology” by Matthew P Smeltzer and Meredith A Ray, Experimental Biology and Medicine (sj-docx-1-ebm-10.1177_15353702221085710).

With the increasing number and variety of clinical trials and observational data analyses, producers and consumers of clinical research must have a working knowledge of an array of statistical methods. Our goal with this body of work is to highlight common types of data and analyses in clinical research. We provide a brief, yet comprehensive overview of common data types in clinical research and appropriate statistical methods for analyses. These include continuous data, binary data, count data, multinomial data, and time-to-event data. We include references for further studies and real-world examples of the application of these methods. In summary, we review common continuous and discrete data, summary statistics for said data, common hypothesis tests and appropriate statistical tests, and underlying assumption for the statistical tests. This information is summarized in tabular format, for additional accessibility.

Impact Statement

Particularly in the clinical field, a larger variety of statistical analyses are conducted and results are utilized by a wide range of researchers, with some having more in-depth statistical training than others. Thus, we set out to summarize and outline appropriate statistical analyses for the most common data found in clinical research. We aimed to make this body of work comprehensive, yet brief and such that anyone working in clinical or public health research could gain a basic understanding of the different types of data and analyses.

Introduction

Clinical research is vitally important for translating basic scientific discoveries into improved medical and public health practice through research involving human subjects. 1 The goal is to generate high-quality evidence to inform standards of care or practice. At its later stages, clinical research, in the form of clinical trials or observational studies, often focuses on comparing health outcomes between groups of persons who differ based on a treatment received or some other external exposure. 2

The scientific method dictates that we test falsifiable hypotheses with quantitative data. 3 Evidence for or against a treatment or exposure must be evaluated statistically to determine if any observed differences are likely to represent true differences or are likely to have occurred by chance. Statistical methods are used to conduct hypothesis testing to this end. 4 In addition, statistical methods are employed to summarize the results of a study and to estimate the observed effect of the treatment or exposure on the outcome of interest. 4

All clinical trials and many observational studies have a designated primary outcome of interest, which is the quantitative metric used to determine the effect of the treatment or exposure. The statistical properties, such as its probability distribution, of the outcome variable and quantifying changes in said variable due to the exposure are of primary importance in determining the choice of statistical methodology. 4 , 5 Here, we review some of the most common types of outcome variables in comparative research and common statistical methods used for analysis.

In this summary, we review standard statistical methodology used for data analysis in clinical research. We identify five common types of outcome data and provide an overview of the typical methods of analysis, effect estimates derived, and graphical presentation. We aim to provide a resource for the clinical researcher who is not a practicing statistician. Methods discussed can be reviewed in more detail in graduate-level textbooks on applied statistics, which are referenced throughout our summary. We also provide references for real-world clinical research projects that have employed each core method. In addition, the procedures available in standard statistical software for data analysis in each of these scenarios are provided in Supplemental Tables 1 and 2 .

At the core, there are generally two categories of outcome data: discrete and continuous. By definition, discrete data, also called categorical data, are the data that have natural, mutually exclusive, non-overlapping groups. 6 Two examples would be severity, defined as mild, moderate, or severe, and intervention exposure groups, such as those receiving the intervention and those not receiving the intervention. Such categories may be ordinal (having an inherent order) or nominal (no inherent order) and can range from two groups or more. 6 The categories may represent qualitative groups (such as the previous examples) or quantitative data, that is, age groupings, such as 18–35, 36–55, 56 years and above.

Continuous data have more flexibility and can be defined as a variable that “can assume any values within a specified relevant interval of values.” 6 , 7 More concrete examples include a person’s age, height, weight, or blood pressure. While we may round for convenience, that is, round to the nearest integer (year) for age, there are no theoretical gaps between two continuous values. Addressing perhaps an obvious question, there are unique situations where data may skirt between discrete and continuous. For example, when does a quantitative ordinal discrete variable have enough categories to be considered a continuous variable? These decisions are often made on a situation-by-situation basis, a priori, before the onset of the study.

In addition to the type of data, the sample size may also influence the method used for calculating test statistics and p -values for statistical inference. When sample sizes are sufficiently large, we typically use a class of statistics called asymptotic statistics that rely on a result known as the central limit theorem. 8 These often rely on a Z statistic, chi-square statistic, or F statistic. When sample sizes are more limited, we typically use non-parametric or exact statistical methods that do not rely on these large sample assumptions. Most of the statistical methods that we review here rely on asymptotic statistics in their basic form, but often have an analogous method relying on exact and or non-parametric methods. 9 When a researcher encounters small sample sizes, it is important to consider these alternative methods.

In addition to identifying appropriate statistical methodology for testing hypotheses given the study’s outcome data, there are a number of additional influences that should be considered, such as effect modification and confounding. Additional factors can alter the association of the exposure and outcome and thus are critical to consider when analyzing biological associations. Effect modification, by definition, occurs when a third factor alters the effect of the exposure on the outcome. 10 Specifically, the magnitude of this alteration changes across the values of this third factor. A separate phenomenon, known as confounding, occurs when an imbalance in the distribution of a third factor in the data distorts the observed effect (association) of the exposure on the outcome. 10 To meet the criteria for a confounder, this third factor must be associated with the exposure and with the outcome but not lie on the causal pathway. If all of these conditions hold, this third factor is a confounder and introduces bias when not properly controlled. While effect modification is a biologic phenomenon in which an exposure impacts the outcome differently for different groups of individuals, confounding is a phenomenon caused by the imbalance of the data itself and may not have biologic significance.

An important consideration is that the effect-modifying or confounding factor is not on the causal pathway from the exposure to the outcome. The causal pathway is the primary biological pathway in which the exposure influences the outcome. For example, if Variable A causes Z, and Z causes Y, then Variable Z is on the causal pathway from A to Y. In this case, controlling for Z as either a confounder or effect modifier while estimating the effect of A on Y will induce bias in the estimate. Investigators should also avoid controlling for common effects of A and Y, which can induce “collider bias.” We will discuss how to assess for effect modification and confounding later.

Methods for common variable types

Continuous data

Continuous data, as described above, are quantitative data with no theoretical gaps between values, where the range (minimum and maximum) of values is dependent on what is being measured. For example, the natural range for age is (0, ~100) while the natural range for temperature measured in degrees Fahrenheit is (−459.67, 134). These types of data are often summarized with a central measurement and a spread measurement. 7 The most common central measurements are the mean or median and represent the “center” of the observed data. The spread measurement aims to quantify how much variation is in the data or how much of the data deviates from the central measurement. Thus, if a mean is presented as the central metric, the variance or standard deviation is typically presented as the spread measurement. If the median is presented as the central metric, the interquartile range (IQR: 25th and 75th percentile) and range (minimum and maximum) are reported as the spread measurement. Understandably, the next question is: which metric to use and when?

This leads to the topic of data distributions. If our continuous data follow what we call a normal probability distribution, this is a symmetric distribution around the mean and therefore, our mean and median will be approximately the same value. 7 While it is statistically appropriate to report either the mean (with variance or standard deviation) or the median (with IQR and range), if our data follow a normal distribution, the most common practice is to report the mean. If the data are skewed and do not follow a normal distribution, it is appropriate to report the median. If the data are skewed, the mean is pulled toward the more extreme values and no longer a true central measurement, while the median is not influenced by skewness. 7

A normal distribution is a statistical probability distribution, defined by a mean and variance, that illustrates the probability of observing a specific value or values from the data. It has convenient statistical properties, such as a pre-specified probability density function and cumulative distribution function, which are the functions that calculate said probabilities. 7 In addition to the normal distribution, other distributions exist for continuous data and discrete data. Other continuous distributions include, but are not limited to, the exponential, chi-square, F, T, gamma, and beta distributions. 8 Discrete distributions include, but are not limited to, the Bernoulli, binomial, Poisson, negative binomial, and hypergeometric distributions. 8 Each distribution is defined by one or more parameters which control the average, standard deviation, and other aspects of the distribution. If the data follow one of these known distributions, calculating the likelihoods of occurrence, such as for hypothesis testing, becomes straightforward.

How we determine whether data follow one of these distributions varies for each type of distribution. For the scope of this body of work, we will only cover how to assess whether a continuous variable follows a normal distribution. There are three ways in which one can assess normality; each has its strengths and weaknesses, so we encourage considering all three approaches. Normality can be assessed visually with quantile–quantile (QQ) plots, visually with histograms, or by statistical test (Shapiro–Wilk test, Kolmogorov–Smirnov test, Cramer–von Mises test, or Anderson–Darling test). 11 , 12 Other tests exist, but these are the most commonly available in statistical software. The normality tests tend to be very strict, and the smallest deviations will lead to a non-normal conclusion. 11 , 12 The visual assessments, such as QQ plots and histograms, are more subject to the researcher’s judgment, hence it is useful to consider both visual and statistical approaches.
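As a sketch of these checks in Python, a Shapiro–Wilk test, a histogram, and a QQ plot could be produced with SciPy and Matplotlib; the data below are simulated for illustration only.

```python
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

rng = np.random.default_rng(seed=1)
values = rng.normal(loc=50, scale=10, size=200)  # simulated, roughly normal data

# Statistical test: a small p value suggests a departure from normality
w_stat, p_value = stats.shapiro(values)
print(f"Shapiro-Wilk W = {w_stat:.3f}, p = {p_value:.3f}")

# Visual checks: histogram and quantile-quantile (QQ) plot against a normal distribution
fig, axes = plt.subplots(1, 2, figsize=(8, 3))
axes[0].hist(values, bins=20)
axes[0].set_title("Histogram")
stats.probplot(values, dist="norm", plot=axes[1])
plt.tight_layout()
plt.show()
```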

When our outcome variable is normally distributed, there are several factors that must be considered for selecting the appropriate statistical method to test the hypothesis, such as number of samples, independence, and so on. These analyses have been summarized in Table 1 . Note this table is not comprehensive but a generalized summary of common analyses and assumptions. When the continuous outcome violates normality, or the sample size is small, non-parametric approaches can instead be used. Non-parametric approaches are analyses that do not make any assumptions about the type of distribution; they can analyze normal and non-normally distributed data. However, if data are normal, parametric approaches are more appropriate to implement.

Table 1. Summary of continuous data analyses and assumptions (all observations are independent).

  • One variable, normal distribution. Assumptions: normality. Point estimate: mean. Effect estimate: mean. Common method: one-sample t-test.
  • One variable, no distributional assumption. Point estimate: median. Effect estimate: median. Common method: sign test or signed-rank test.
  • Two variables (groups), normal distribution. Assumptions: (1) normality, (2) the two groups are independent, (3) group variances are equal. Point estimate: mean. Effect estimate: difference of means or Cohen’s d. Common method: two-sample t-test.
  • Two variables (groups), no distributional assumption (H1: M1 ≠ M2). Assumptions: (1) the two groups are independent, (2) both groups have the same distribution shape. Point estimate: median. Effect estimate: U statistic. Common method: Mann–Whitney U test.
  • Three or more groups, normal distribution. Assumptions: (1) normality, (2) all groups are independent, (3) group variances are equal. Point estimate: mean. Effect estimate: Cohen’s f. Common method: ANOVA.
  • Three or more groups, no distributional assumption. Assumptions: all groups are independent. Point estimate: median. Common method: Kruskal–Wallis test.

Association analyses (modeling the outcome as a function of one or more explanatory variables):

  • One continuous outcome variable, normal distribution. Assumptions: (1) linear association between explanatory variables and outcome, (2) independent explanatory variables (if more than one), (3) normally distributed error terms, (4) equal variances. Point estimate: none. Effect estimate: Cohen’s f or R². Common method: linear regression (overall F-test or partial t-tests).

ANOVA: Analysis of variance.

If our aim is to quantify the association between an outcome and an exposure, we can apply linear regression (assuming all assumptions are met; see Table 1). As outlined earlier, we need to consider possible effect modifiers and confounders. To assess for effect modification, we can introduce an interaction term in the model. As a simple example, the model would contain the exposure variable, the possible effect modifier, and a multiplication term between the exposure and the possible effect modifier (termed the interaction term). If the interaction term is statistically significant, we would conclude that effect modification is present. If a variable is not an effect modifier, it is then checked for confounding. Different approaches exist for assessing confounding, but the most widely used is the 10% rule. This rule states that a variable is a confounder if the regression coefficient for the exposure variable changes by more than 10% with the inclusion of the possible confounder in the model. A nice example of this can be seen in Ray et al. (2020). 16
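A sketch of both checks using the statsmodels formula interface is shown below. The variable names (outcome, exposure, factor) and the file name are hypothetical placeholders, not from the studies cited here, and the exposure is assumed to be numerically coded.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical dataset with an outcome, an exposure, and a candidate third factor
df = pd.read_csv("study_data.csv")  # placeholder file name

# 1. Effect modification: "exposure * factor" expands to exposure + factor + exposure:factor;
#    a statistically significant interaction term suggests effect modification
interaction_model = smf.ols("outcome ~ exposure * factor", data=df).fit()
print(interaction_model.summary())

# 2. Confounding (10% rule): compare the exposure coefficient with and without the factor
crude = smf.ols("outcome ~ exposure", data=df).fit()
adjusted = smf.ols("outcome ~ exposure + factor", data=df).fit()
pct_change = abs(adjusted.params["exposure"] - crude.params["exposure"]) / abs(crude.params["exposure"])
print(f"Change in exposure coefficient: {pct_change:.1%}")  # > 10% suggests confounding
```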

Counts and rates

Count data are the number of times a particular event occurs for each individual, taking non-negative integer values. In biomedical science, we most often look at count data over a period of time, creating an event rate (event count / period of time). The simplest analysis of these data involves calculating events per patient-year of follow-up. When conducting patient-year analyses in large populations, it is often acceptable to look at this statistic in aggregate (sum of total events in the population / sum of total patient-years at risk in the population). Confidence intervals can be calculated by assuming a Poisson distribution.

Statistical modeling of count data or event rates is commonly done with a Poisson model. These models can adjust for confounding by other variables and incorporate interaction terms for effect modification. When a binary treatment variable is used with event rate as the outcome, incidence rate ratios (with confidence intervals) can be estimated from these models. The model can be extended to a zero-inflated Poisson (ZIP) model or a negative binomial model when the standard Poisson model does not fit the data well. Population-level analyses often look at disease incidence rates and ratios using these methods. 17 , 18 Recently, this type of statistical modeling was at the core of the methods used to calculate vaccine efficacy against COVID-19 in a highly impactful randomized trial. 19
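As a sketch, a Poisson model for event rates with log person-time as an offset could be fit with statsmodels as below; the column names and file name are hypothetical, the treatment is assumed to be coded 0/1, and the incidence rate ratio comes from exponentiating the treatment coefficient.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Hypothetical data: one row per subject with an event count, follow-up time, and treatment (0/1)
df = pd.read_csv("events.csv")  # placeholder file name

# Poisson regression of event counts with log(person-years) as an offset
model = smf.glm(
    "event_count ~ treatment",
    data=df,
    family=sm.families.Poisson(),
    offset=np.log(df["person_years"]),
).fit()

# Exponentiated coefficient = incidence rate ratio (IRR), with a 95% confidence interval
irr = np.exp(model.params["treatment"])
ci_lower, ci_upper = np.exp(model.conf_int().loc["treatment"])
print(f"IRR = {irr:.2f} (95% CI {ci_lower:.2f} to {ci_upper:.2f})")
```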

Binary data

Arguably, the simplest form of an outcome variable in clinical research is the binary variable for which every observation is classified in one of two groups (disease versus no disease, response versus no response, etc.). 20 We typically assume a binomial statistical distribution for this type of data. When the treatment variable is also binary, results can be analyzed by the simple analysis of the classic 2 × 2 table. From this table, we can estimate the proportion of responses, odds of response, or risk of response/disease within each treatment group. We then compare these estimates between treatment groups using differences or ratio measures. These include the difference in proportions, risk difference, odds ratios, and risk ratios (relative risk). Hypothesis testing around these estimates may utilize the chi-square test to assess the general association between the two variables, large sample asymptotic tests relying on normality under the central limit theorem, or exact tests that do not assume a specific statistical distribution.

Statistical models for binary outcomes can be constructed using logistic regression. In this way, the effect estimates (typically the odds ratio) can be adjusted for confounding by measured variables. These models typically rely on asymptotic normality for hypothesis testing but exact statistics are also available. The models can also assess effect modification through statistical interaction terms. An example of the classical 2 × 2 table can be referenced in Khan et al. 21 A typical application of logistic regression can be seen in Ray et al. 22 We have summarized methods for categorical data in Table 2 .
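The sketch below illustrates both approaches in Python: a chi-square test and crude odds ratio from a hypothetical 2 × 2 table with SciPy, followed by a logistic regression with statsmodels on hypothetical subject-level data (binary outcome and exposure coded 0/1); the counts, column names, and file name are illustrative placeholders.

```python
import numpy as np
import pandas as pd
from scipy import stats
import statsmodels.formula.api as smf

# Hypothetical 2 x 2 table: rows = treatment groups, columns = response / no response
table = np.array([[30, 70],
                  [18, 82]])

chi2, p, dof, expected = stats.chi2_contingency(table)
odds_ratio = (table[0, 0] * table[1, 1]) / (table[0, 1] * table[1, 0])
print(f"Chi-square = {chi2:.2f}, p = {p:.3f}, crude odds ratio = {odds_ratio:.2f}")

# Logistic regression on subject-level data; adding covariates to the formula
# yields an adjusted odds ratio
df = pd.read_csv("subjects.csv")  # placeholder file name
logit_model = smf.logit("response ~ treatment", data=df).fit()
print(np.exp(logit_model.params["treatment"]))  # odds ratio for the treatment effect
```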

Table 2. Summary of discrete data analyses and assumptions (all observations are independent).

  • One binary variable, binomial distribution. Assumptions: one binary variable. Point estimate: proportion. Effect estimate: proportion. Common method: Z-test or binomial exact test.
  • Two binary variables, binomial distribution. Assumptions: (1) one binary metric measured on two different samples, (2) the two samples are independent. Point estimate: proportions. Effect estimate: difference in proportions or Cohen’s h. Common method: Z-test.
  • Two binary variables, binomial distribution (H1: OR ≠ 1). Assumptions: (1) two binary variables measured on the same sample, (2) one variable measuring the outcome, (3) one variable measuring the exposure. Point estimate: odds. Effect estimate: odds ratio. Common method: logistic regression.
  • Two binary variables, binomial distribution (H1: RR ≠ 1). Assumptions: as in the previous row. Point estimate: risk. Effect estimate: risk ratio. Common method: logistic, Poisson, or negative binomial regression.
  • Two discrete variables, no distributional assumption. Assumptions: (1) two variables measured on the same sample, (2) each variable measuring a different metric. Point estimate: none. Effect estimate: Cramer’s V or Phi. Common method: chi-squared test, Fisher’s exact test (small sample sizes), or logistic regression.

Association analyses (modeling the outcome as a function of one or more explanatory variables):

  • One binary outcome variable, binomial distribution (H1: ORi ≠ 1). Assumptions: (1) the outcome variable is binary, (2) explanatory variables are independent, (3) explanatory variables are linearly associated with the log odds. Point estimate: odds. Effect estimate: odds ratio. Common method: logistic regression.
  • One discrete outcome variable with more than two levels, multinomial distribution (ordered or unordered). Assumptions: if the outcome is nominal, the same as for binomial logistic regression; if the outcome is ordinal, the proportional odds assumption must also be met. Point estimate: odds. Effect estimate: odds ratio. Common method: multinomial logistic regression (generalized logit link for unordered, cumulative logit link for ordered outcomes).
  • Counts and events per follow-up, Poisson or negative binomial distribution (H1: IRR ≠ 1). Assumptions: the outcome is positive integer counts following a Poisson or negative binomial distribution. Point estimate: incidence rate. Effect estimate: incidence rate ratio. Common method: Poisson or negative binomial regression.
  • Time-to-event, no distribution assumed. Assumptions: (1) a single discrete explanatory variable (with categories), (2) censoring is not related to the explanatory variables. Point estimate: 5-year survival. Effect estimate: difference in 5-year survival. Common method: Kaplan–Meier (log-rank test).
  • Time-to-event, no distribution assumed (H1: HR ≠ 1). Assumptions: (1) the hazard remains constant over time (proportional hazards assumption), (2) explanatory variables are independent, (3) explanatory variables are linearly associated with the log hazard. Point estimate: none. Effect estimate: hazard ratio. Common method: Cox proportional hazards model.

Multinomial data

Multinomial data are a natural extension of binary data: a discrete variable with more than two levels. It follows that extensions of logistic regression can be applied to estimate effects and adjust for effect modification and confounding. However, multinomial data can be nominal or ordinal. For nominal data, the order is of no importance and, therefore, the models use a generalized logit link. 23 This selects one category as the referent category and then performs a set of logistic regression models, each comparing one non-referent level to the referent level. For example, Kane et al. 24 applied a multinomial logistic regression to model type of treatment (five categories) as a function of education level and other covariates. They selected watchful waiting as the referent treatment. The analysis thus had four logistic regressions to report, one for each of the other treatment categories compared to watchful waiting.

If the multinomial data are ordinal, we use a cumulative logit link in the regression model. This link models the categories cumulatively and sequentially. 23 For example, suppose our outcome has three levels (1, 2, and 3) representing the number of treatments. The cumulative logit will conduct two logistic regressions: first, modeling Category 1 versus Categories 2 and 3 (combined), and then Categories 1 and 2 (combined) versus Category 3. Because of the combining of categories, this assumes that the odds are proportional across categories; this assumption must be checked and satisfied before applying the model. Depending on the outcome, only one of the logistic models may be needed, such as in Bostwick et al. , 25 where the outcome was palliative performance status (low, moderate, and high) and the effect of cancer/non-cancer status was examined. There, they only reported high performance status versus moderate and low combined as their outcome.
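A sketch of a nominal (generalized logit) multinomial model with statsmodels is shown below; the outcome, covariates, and file name are hypothetical placeholders, and the outcome is converted to integer category codes, with the lowest code acting as the referent. Recent versions of statsmodels also provide an ordinal (cumulative link) model for ordered outcomes, which should only be used once the proportional odds assumption has been checked.

```python
import pandas as pd
import statsmodels.api as sm

# Hypothetical data: a nominal treatment-type outcome and two covariates
df = pd.read_csv("treatments.csv")  # placeholder file name

# Code the nominal outcome as integer categories; category 0 acts as the referent
y = pd.Categorical(df["treatment_type"]).codes
X = sm.add_constant(df[["education", "age"]])

# Generalized logit (nominal) multinomial logistic regression: one set of
# coefficients is estimated for each non-referent outcome category
model = sm.MNLogit(y, X).fit()
print(model.summary())
```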

Time-to-event

Time-to-event data, often called survival data, compare the time from a baseline point to the potential occurrence of an outcome between groups. 26 These data are unique as a statistical outcome because they involve a binary component (event occurred or event did not occur) and the time to event occurrence or last follow-up. Both the occurrence of event and the time it took to occur are of interest. These outcomes are most frequently analyzed with two common statistical methodologies, the Kaplan–Meier method and the Cox proportional hazards model. 26

The Kaplan–Meier method allows for the estimation of a survival distribution of observed data in the presence of censored observations and does not assume any statistical distribution for the data. 26 , 27 In this way, knowledge that an individual did not experience an event up to a certain time point, but is still at risk, is incorporated into the estimates. For example, knowing an individual survived 2 months after a therapy and was censored is less information than knowing an individual survived 2 years after a therapy and was censored. The method assumes that the occurrence of censoring is not associated with the exposure variable. In addition to estimating the entire curve over time, the Kaplan–Meier plot allows for the estimation of the survival probability to a certain point in time, such as “5-year” survival. Survival curves are typically estimated for each group of interest (if exposure is discrete), shown together on a plot. The log-rank test is often used to test for a statistically significant difference in two or more survival curves. 26 An analogous method, known as Cumulative Incidence, takes a similar approach to the non-parametric Kaplan–Meier method, but starts from zero and counts events as they occur, with estimates increasing with time (rather than decreasing). 26 Cumulative Incidence analyses can also be adjusted for competing risks, which occur when subjects experience a different event during the follow-up time that precludes them from experiencing the event of primary interest. In the presence of competing risks, Cumulative Incidence curves can be compared using Gray’s test. 26

Time-to-event data can also be analyzed using statistical models. The most common statistical model is the Cox proportional hazards model. 28 From this model, we can estimate hazard ratios with confidence intervals for comparing the risk of the event occurring between two groups. 26 Multiple-variable models can be fit to incorporate interaction terms or can adjust for confounding (the 10% rule can be applied to the hazard ratio estimate). Although the Cox model does not assume a statistical distribution for the outcome variable, it does assume that the ratio of effect between two treatment groups is constant across time (i.e., proportional hazards). Therefore, one hazard ratio estimate applies to all time points in the study. Extensions of this model are available to allow for more flexibility, with additional complexity in interpretation. Examples of standard applications of the Kaplan–Meier method and Cox proportional hazards models can be seen in recent papers by Mok et al. 29 and Aparicio et al. 30
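A sketch of both methods using the lifelines package (one commonly used Python option, assumed to be installed) is shown below; the column names and file name are hypothetical placeholders.

```python
import pandas as pd
from lifelines import KaplanMeierFitter, CoxPHFitter
from lifelines.statistics import logrank_test

# Hypothetical data: follow-up time, event indicator (1 = event, 0 = censored), and group (0/1)
df = pd.read_csv("survival.csv")  # placeholder file name
treated = df[df["group"] == 1]
control = df[df["group"] == 0]

# Kaplan-Meier survival curves for each group
kmf = KaplanMeierFitter()
kmf.fit(treated["time"], event_observed=treated["event"], label="treated")
ax = kmf.plot_survival_function()
kmf.fit(control["time"], event_observed=control["event"], label="control")
kmf.plot_survival_function(ax=ax)

# Log-rank test for a difference between the two survival curves
result = logrank_test(treated["time"], control["time"],
                      event_observed_A=treated["event"], event_observed_B=control["event"])
print(f"Log-rank p = {result.p_value:.3f}")

# Cox proportional hazards model; exponentiated coefficients are hazard ratios
cph = CoxPHFitter()
cph.fit(df[["time", "event", "group"]], duration_col="time", event_col="event")
cph.print_summary()
```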

Generalized linear models

With the exception of time-to-event data, all of the statistical modeling techniques described above can be classified as some form of generalized linear model (GLM). 20 Modern statistical methods utilize GLMs as a broader class of statistical model. In the GLM, the outcome variable can take on different forms (continuous, categorical, multinomial, count, etc) and it is mathematically transformed using a link function. In fact, the statistical modeling methods we have discussed here are each a special case of a GLM. The GLM can accommodate multiple covariates that could be either continuous or categorical. The GLM framework is often a useful tool for understanding the interconnectedness of common statistical methods. For the interested reader, an elegant description of the most common GLMs and how they interrelate is given in Chapter 5 of Categorical Data Analysis by Alan Agresti. 20

Concerns of bias and validity

While statistical significance is necessary to demonstrate that an observed result is unlikely to have occurred by chance alone, it is not sufficient to ensure a valid result. Bias can arise in clinical research from many causes, including misclassification of the exposure, misclassification of the outcome, confounding, missing data, and selection of the study cohort. 10,31 Care should be taken at the study design phase to reduce potential bias as much as possible. To this end, application of proper research methodology is essential. Confounding can sometimes be corrected through statistical adjustment after the data are collected, if the confounding factor is properly measured in the study. 10,31 All of these issues are outside the scope of basic statistics and of this summary. However, good clinical research studies should consider both statistical methodology and potential threats to validity from bias. 10,31

In this review, we have discussed five of the most common types of outcome data in clinical studies, including continuous, count, binary, multinomial, and time-to-event data. Each data type requires specific statistical methodology, specific assumptions, and consideration of other important factors in data analysis. However, most fall within the overarching GLM framework. In addition, the study design is an important factor in the selection of the appropriate method. Statistical methods can be applied for effect estimation, hypothesis testing, and confidence interval estimation. All of the methods discussed here can be applied using commonly available statistical analysis software without excessive customized programming.

In addition to the common types of data discussed here, other statistical methods are sometimes necessary. We have not discussed in detail situations where data are correlated or clustered; these scenarios typically violate the independence assumption required by many methods. Common examples include longitudinal analyses, with multiple observations collected across time, and time series data, both of which require specialized techniques. We have also not covered situations where outcome data are multidimensional, as is the case in genetics research. The analysis of large amounts of genetic information often relies on the basic methods discussed here, but special considerations and adapted methodology are needed to account for the large numbers of hypothesis tests conducted. One consideration is multiple comparisons: when a single sample is tested more than once, the chance of making a type I or type II error increases, 32 meaning we incorrectly reject, or fail to reject, the null hypothesis given the truth at the population level. Because of this increased likelihood of error, the significance level must be adjusted. These adjustments are not discussed in detail here. Moreover, this overview is not comprehensive, and many additional statistical methodologies are available for specific situations.
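
Although a full treatment of these adjustments is beyond this overview, they are straightforward to apply in software; the sketch below (assuming statsmodels) applies Bonferroni and Benjamini-Hochberg corrections to a set of hypothetical p-values.

```python
# Sketch: adjusting p-values for multiple comparisons (hypothetical p-values).
from statsmodels.stats.multitest import multipletests

p_values = [0.001, 0.012, 0.035, 0.048, 0.20]

# Bonferroni: each p-value is multiplied by the number of tests (capped at 1).
reject, p_adj, _, _ = multipletests(p_values, alpha=0.05, method="bonferroni")
print(p_adj, reject)

# Benjamini-Hochberg false discovery rate control is a common, less conservative alternative.
reject, p_adj, _, _ = multipletests(p_values, alpha=0.05, method="fdr_bh")
print(p_adj, reject)
```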

In this work, we have focused our discussion on statistical analysis. Another key element in clinical research is the a priori statistical design of trials. Appropriate selection of the trial design, including both epidemiologic and statistical design, allows data to be collected in a way that permits valid statistical comparisons. Power and sample size calculations are key design elements that rely on many of the statistical principles discussed above. Investigators are encouraged to work with experienced statisticians early in the trial design phase to ensure appropriate statistical considerations are made.
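
As a simple illustration of such a design calculation, the sketch below (assuming statsmodels) computes the approximate per-group sample size for a two-sample comparison of means; the effect size, significance level, and power are hypothetical design choices.

```python
# Sketch: sample size for a two-group comparison of means (hypothetical design inputs).
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
n_per_group = analysis.solve_power(effect_size=0.5,  # standardized difference (Cohen's d)
                                   alpha=0.05,       # two-sided significance level
                                   power=0.80)       # desired power
print(round(n_per_group))  # approximate number of subjects needed per group (~64)
```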

In summary, statistical methods play a critical role in clinical research. A vast array of statistical methods is currently available to handle a breadth of data scenarios. Proper application of these techniques requires intimate knowledge of the study design and the data collected. A working knowledge of common statistical methodologies, and of their similarities and differences, is vital for both producers and consumers of clinical research.



What is the Mean and How to Find It: Definition & Formula

By Jim Frost

What is the Mean?

The mean in math and statistics summarizes an entire dataset with a single number representing the data’s center point or typical value. It is also known as the arithmetic mean, and it is the most common measure of central tendency. It is frequently called the “average.”

Learn how to find the mean and know when it is and is not a good statistic to use!

How to Find the Mean

Finding the mean is very simple. Just add all the values and divide by the number of observations. The mean formula is below:

Mean = (sum of the values) / (number of values)

For example, suppose the heights of five people are 48, 51, 52, 54, and 56 inches. Here’s how to find the mean:

(48 + 51 + 52 + 54 + 56) / 5 = 261 / 5 = 52.2

Their average height is 52.2 inches.
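
The same calculation can be checked in a couple of lines of Python, using the example heights above.

```python
# Quick check of the worked example using Python's standard library.
from statistics import mean

heights = [48, 51, 52, 54, 56]
print(sum(heights) / len(heights))  # 52.2 -- add the values, divide by the count
print(mean(heights))                # 52.2 -- same result via statistics.mean
```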

Mean Formula

There are two versions of the mean formula in math—the sample and population formulas. In each case, the process for how to find the mean mathematically does not change. Add the values and divide by the number of values. However, the formula notation differs between the two types.

Sample Mean Formula

The sample mean formula is the following:

x̄ = (∑x) / n

  • x̄ is the sample average of variable x.
  • ∑x = the sum of the n values.
  • n = the number of values in the sample.

Typically, the sample formula notation uses lowercase letters.

Population Mean Formula

The population mean formula is the following:

µ = (∑X) / N

  • µ is the population average.
  • ∑X = the sum of the N values.
  • N = the number of values in the population.

Typically, the population mean formula notation uses Greek and uppercase letters.

Learn more in depth about Sample Mean vs. Population Mean .

When Do You Use the Average?

Ideally, the mean in math (aka the average) indicates the region where most values in a distribution fall. Statisticians refer to it as the central location of a distribution. You can think of it as the tendency of data to cluster around a middle value. The histogram below illustrates the average accurately finding the center of the data’s distribution.

Histogram of a symmetric distribution that shows the mean (aka the average) in the center.

However, the average does not always find the center of the data. It is sensitive to skewed data and extreme values. For example, when the data are skewed, it can miss the mark. In the histogram below, the average is outside the area with the most common values.

Histogram of a skewed distribution showing the average falling away from the most common values.

This problem occurs because outliers have a substantial impact on the mean. Extreme values in an extended tail pull it away from the center. As the distribution becomes more skewed, the average is drawn further away from the center.

In these cases, the average can be misleading because it might not be near the most common values. Consequently, it’s best to use the average to measure the central tendency when you have a symmetric distribution.

For skewed distributions , it’s often better to use the median or trimmed mean , which use different methods to find the central location. Note that the average provides no information about the variability present in a distribution. To evaluate that characteristic, assess the standard deviation .
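
To see this numerically, the short sketch below (using NumPy and SciPy) compares the mean, median, and a 10% trimmed mean on a small, made-up dataset with one extreme value.

```python
# Sketch: mean vs median vs trimmed mean on right-skewed data (made-up values).
import numpy as np
from scipy import stats

values = np.array([30, 32, 35, 36, 38, 40, 41, 45, 50, 400])  # one extreme value

print(np.mean(values))               # 74.7 -- pulled toward the long tail
print(np.median(values))             # 39.0 -- closer to the typical value
print(stats.trim_mean(values, 0.1))  # drops the top and bottom 10% before averaging
print(np.std(values, ddof=1))        # standard deviation describes the spread
```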

Related post: Measures of Central Tendency

Using Sample Means to Estimate Population Means

In statistics, analysts often use a sample average to estimate a population mean. For small samples, the sample can differ greatly from the population. However, as the sample size grows, the law of large numbers states that the sample average is likely to be close to the population value.

Hypothesis tests, such as t-tests and ANOVA , use samples to determine whether population means are different. Statisticians refer to this process of using samples to estimate the properties of entire populations as inferential statistics .
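
As a small illustration, tests like these take only a few lines with SciPy; the group measurements below are hypothetical.

```python
# Sketch: comparing group means with a t-test and one-way ANOVA (hypothetical data).
from scipy import stats

group_a = [5.1, 4.9, 5.4, 5.0, 5.2]
group_b = [5.8, 6.0, 5.7, 6.1, 5.9]
group_c = [5.3, 5.5, 5.6, 5.4, 5.2]

# Two-sample t-test comparing two group means.
t_stat, p_value = stats.ttest_ind(group_a, group_b)
print(t_stat, p_value)

# One-way ANOVA comparing three or more group means.
f_stat, p_value = stats.f_oneway(group_a, group_b, group_c)
print(f_stat, p_value)
```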

Related post : Descriptive Statistics Vs. Inferential Statistics

In statistics, we usually use the arithmetic average, which is the type I focus on in this post. However, there are other types of averages, including the geometric version. Read my post about the geometric mean to learn more. There is also a weighted mean.

Now that you know about statistical mean, learn about regression to the mean . That’s the tendency for extreme events to be followed by more typical occurrences.



Types of Variables in Research & Statistics | Examples

Published on September 19, 2022 by Rebecca Bevans. Revised on June 21, 2023.

In statistical research , a variable is defined as an attribute of an object of study. Choosing which variables to measure is central to good experimental design .

If you want to test whether some plant species are more salt-tolerant than others, some key variables you might measure include the amount of salt you add to the water, the species of plants being studied, and variables related to plant health like growth and wilting .

You need to know which types of variables you are working with in order to choose appropriate statistical tests and interpret the results of your study.

You can usually identify the type of variable by asking two questions:

  • What type of data does the variable contain?
  • What part of the experiment does the variable represent?

Table of contents

  • Types of data: quantitative vs categorical variables
  • Parts of the experiment: independent vs dependent variables
  • Other common types of variables
  • Other interesting articles
  • Frequently asked questions about variables

Data is a specific measurement of a variable – it is the value you record in your data sheet. Data is generally divided into two categories:

  • Quantitative data represents amounts
  • Categorical data represents groupings

A variable that contains quantitative data is a quantitative variable ; a variable that contains categorical data is a categorical variable . Each of these types of variables can be broken down into further types.

Quantitative variables

When you collect quantitative data, the numbers you record represent real amounts that can be added, subtracted, divided, etc. There are two types of quantitative variables: discrete and continuous .

Discrete vs continuous variables

  • Discrete variables (aka integer variables): counts of individual items or values.
  • Continuous variables (aka ratio variables): measurements of continuous or non-finite values.

Categorical variables

Categorical variables represent groupings of some kind. They are sometimes recorded as numbers, but the numbers represent categories rather than actual amounts of things.

There are three types of categorical variables: binary , nominal , and ordinal variables .

Binary vs nominal vs ordinal variables

  • Binary variables (aka dichotomous variables): yes or no outcomes.
  • Nominal variables: groups with no rank or order between them.
  • Ordinal variables: groups that are ranked in a specific order.*

*Note that sometimes a variable can work as more than one type! An ordinal variable can also be used as a quantitative variable if the scale is numeric and doesn’t need to be kept as discrete integers. For example, star ratings on product reviews are ordinal (1 to 5 stars), but the average star rating is quantitative.

Example data sheet

To keep track of your salt-tolerance experiment, you make a data sheet where you record information about the variables in the experiment, like salt addition and plant health.

To gather information about plant responses over time, you can fill out the same data sheet every few days until the end of the experiment. This example sheet is color-coded according to the type of variable: nominal , continuous , ordinal , and binary .

Example data sheet showing types of variables in a plant salt tolerance experiment


Experiments are usually designed to find out what effect one variable has on another – in our example, the effect of salt addition on plant growth.

You manipulate the independent variable (the one you think might be the cause ) and then measure the dependent variable (the one you think might be the effect ) to find out what this effect might be.

You will probably also have variables that you hold constant ( control variables ) in order to focus on your experimental treatment.

Independent vs dependent vs control variables

  • Independent variables (aka treatment variables): variables you manipulate in order to affect the outcome of an experiment. Example: the amount of salt added to each plant’s water.
  • Dependent variables (aka response variables): variables that represent the outcome of the experiment. Example: any measurement of plant health and growth; in this case, plant height and wilting.
  • Control variables: variables that are held constant throughout the experiment. Example: the temperature and light in the room the plants are kept in, and the volume of water given to each plant.

In this experiment, we have one independent and three dependent variables.

The other variables in the sheet can’t be classified as independent or dependent, but they do contain data that you will need in order to interpret your dependent and independent variables.

Example of a data sheet showing dependent and independent variables for a plant salt tolerance experiment.

What about correlational research?

When you do correlational research , the terms “dependent” and “independent” don’t apply, because you are not trying to establish a cause and effect relationship ( causation ).

However, there might be cases where one variable clearly precedes the other (for example, rainfall leads to mud, rather than the other way around). In these cases you may call the preceding variable (i.e., the rainfall) the predictor variable and the following variable (i.e. the mud) the outcome variable .

Once you have defined your independent and dependent variables and determined whether they are categorical or quantitative, you will be able to choose the correct statistical test .

But there are many other ways of describing variables that help with interpreting your results. Some useful types of variables are listed below.

  • Confounding variables: a variable that hides the true effect of another variable in your experiment. This can happen when another variable is closely related to a variable you are interested in, but you haven’t controlled it in your experiment. Be careful with these, because confounding variables can introduce bias into your results. Example: pot size and soil type might affect plant survival as much as or more than salt additions; in an experiment you would control these potential confounders by holding them constant.
  • Latent variables: a variable that can’t be directly measured, but that you represent via a proxy. Example: salt tolerance in plants cannot be measured directly, but can be inferred from measurements of plant health in our salt-addition experiment.
  • Composite variables: a variable that is made by combining multiple variables in an experiment. These variables are created when you analyze data, not when you measure it. Example: the three plant health variables could be combined into a single plant-health score to make it easier to present your findings.

If you want to know more about statistics , methodology , or research bias , make sure to check out some of our other articles with explanations and examples.

  • Student’s  t -distribution
  • Normal distribution
  • Null and Alternative Hypotheses
  • Chi square tests
  • Confidence interval
  • Cluster sampling
  • Stratified sampling
  • Data cleansing
  • Reproducibility vs Replicability
  • Peer review
  • Likert scale

Research bias

  • Implicit bias
  • Framing effect
  • Cognitive bias
  • Placebo effect
  • Hawthorne effect
  • Hindsight bias
  • Affect heuristic


You can think of independent and dependent variables in terms of cause and effect: an independent variable is the variable you think is the cause , while a dependent variable is the effect .

In an experiment, you manipulate the independent variable and measure the outcome in the dependent variable. For example, in an experiment about the effect of nutrients on crop growth:

  • The  independent variable  is the amount of nutrients added to the crop field.
  • The  dependent variable is the biomass of the crops at harvest time.

Defining your variables, and deciding how you will manipulate and measure them, is an important part of experimental design .

A confounding variable , also called a confounder or confounding factor, is a third variable in a study examining a potential cause-and-effect relationship.

A confounding variable is related to both the supposed cause and the supposed effect of the study. It can be difficult to separate the true effect of the independent variable from the effect of the confounding variable.

In your research design , it’s important to identify potential confounding variables and plan how you will reduce their impact.

Quantitative variables are any variables where the data represent amounts (e.g. height, weight, or age).

Categorical variables are any variables where the data represent groups. This includes rankings (e.g. finishing places in a race), classifications (e.g. brands of cereal), and binary outcomes (e.g. coin flips).

You need to know what type of variables you are working with to choose the right statistical test for your data and interpret your results .

Discrete and continuous variables are two types of quantitative variables :

  • Discrete variables represent counts (e.g. the number of objects in a collection).
  • Continuous variables represent measurable amounts (e.g. water volume or weight).

Cite this Scribbr article

If you want to cite this source, you can copy and paste the citation below.

Bevans, R. (2023, June 21). Types of Variables in Research & Statistics | Examples. Scribbr. Retrieved June 10, 2024, from https://www.scribbr.com/methodology/types-of-variables/


Common Comorbidities with Substance Use Disorders Research Report Part 1: The Connection Between Substance Use Disorders and Mental Illness

Many individuals who develop substance use disorders (SUD) are also diagnosed with mental disorders, and vice versa. 2,3 Although there are fewer studies on comorbidity among youth, research suggests that adolescents with substance use disorders also have high rates of co-occurring mental illness; over 60 percent of adolescents in community-based substance use disorder treatment programs also meet diagnostic criteria for another mental illness. 4

Data show high rates of comorbid substance use disorders and anxiety disorders—which include generalized anxiety disorder, panic disorder, and post-traumatic stress disorder. 5–9 Substance use disorders also co-occur at high prevalence with mental disorders, such as depression and bipolar disorder, 6,9–11 attention-deficit hyperactivity disorder (ADHD), 12,13 psychotic illness, 14,15 borderline personality disorder, 16 and antisocial personality disorder. 10,15 Patients with schizophrenia have higher rates of alcohol, tobacco, and drug use disorders than the general population. 17 As Figure 1 shows, the overlap is especially pronounced with serious mental illness (SMI). Serious mental illness among people ages 18 and older is defined at the federal level as having, at any time during the past year, a diagnosable mental, behavioral, or emotional disorder that causes serious functional impairment that substantially interferes with or limits one or more major life activities. Serious mental illnesses include major depression, schizophrenia, and bipolar disorder, and other mental disorders that cause serious impairment. 18 Around 1 in 4 individuals with SMI also have an SUD.

Data from a large nationally representative sample suggested that people with mental, personality, and substance use disorders were at increased risk for nonmedical use of prescription opioids. 19 Research indicates that 43 percent of people in SUD treatment for nonmedical use of prescription painkillers have a diagnosis or symptoms of mental health disorders, particularly depression and anxiety. 20

Youth—A Vulnerable Time

Although drug use and addiction can happen at any time during a person’s life, drug use typically starts in adolescence, a period when the first signs of mental illness commonly appear. Comorbid disorders can also be seen among youth. 21–23 During the transition to young adulthood (age 18 to 25 years), people with comorbid disorders need coordinated support to help them navigate potentially stressful changes in education, work, and relationships. 21

Drug Use and Mental Health Disorders in Childhood or Adolescence Increases Later Risk

The brain continues to develop through adolescence. Circuits that control executive functions such as decision making and impulse control are among the last to mature, which enhances vulnerability to drug use and the development of a substance use disorder. 3,24 Early drug use is a strong risk factor for later development of substance use disorders, 24 and it may also be a risk factor for the later occurrence of other mental illnesses. 25,26 However, this link is not necessarily causative and may reflect shared risk factors including genetic vulnerability, psychosocial experiences, and/or general environmental influences. For example, frequent marijuana use during adolescence can increase the risk of psychosis in adulthood, specifically in individuals who carry a particular gene variant. 26,27

It is also true that having a mental disorder in childhood or adolescence can increase the risk of later drug use and the development of a substance use disorder. Some research has found that mental illness may precede a substance use disorder, suggesting that better diagnosis of youth mental illness may help reduce comorbidity. One study found that adolescent-onset bipolar disorder confers a greater risk of subsequent substance use disorder compared to adult-onset bipolar disorder. 28 Similarly, other research suggests that youth develop internalizing disorders, including depression and anxiety, prior to developing substance use disorders. 29

Untreated Childhood ADHD Can Increase Later Risk of Drug Problems

Numerous studies have documented an increased risk for substance use disorders in youth with untreated ADHD, 13,30 although some studies suggest that only those with comorbid conduct disorders have greater odds of later developing a substance use disorder. 30,31 Given this linkage, it is important to determine whether effective treatment of ADHD could prevent subsequent drug use and addiction. Treatment of childhood ADHD with stimulant medications such as methylphenidate or amphetamine reduces the impulsive behavior, fidgeting, and  inability to concentrate that characterize ADHD. 32

That risk presents a challenge when treating children with ADHD, since effective treatment often involves prescribing stimulant medications with addictive potential. Although the research is not yet conclusive, many studies suggest that ADHD medications do not increase the risk of substance use disorder among children with this condition. 31,32 It is important to combine stimulant medication for ADHD with appropriate family and child education and behavioral interventions, including counseling on the chronic nature of ADHD and risk for substance use disorder. 13,32

Overweight & Obesity Statistics

On this page:

  • Defining Overweight and Obesity
  • Prevalence of Overweight and Obesity
  • Trends in Obesity Among Adults and Youth in the United States

A person whose weight is higher than what is considered to be a normal weight for a given height is described as being overweight or having obesity. 1

According to 2017–2018 data from the National Health and Nutrition Examination Survey (NHANES)

  • Nearly 1 in 3 adults (30.7%) are overweight. 2
  • More than 2 in 5 adults (42.4%) have obesity. 2
  • About 1 in 11 adults (9.2%) have severe obesity. 2

According to 2017–2018 NHANES data

  • About 1 in 6 children and adolescents ages 2 to 19 (16.1%) are overweight. 3
  • Almost 1 in 5 children and adolescents ages 2 to 19 (19.3%) have obesity. 3
  • About 1 in 16 children and adolescents ages 2 to 19 (6.1%) have severe obesity. 3

Using Body Mass Index (BMI) to Estimate Overweight and Obesity

BMI is a tool to estimate and screen for overweight and obesity in adults and children. BMI is defined as weight in kilograms divided by height in meters squared. BMI is related to the amount of fat in the body. A high amount of fat can raise the risk of many health problems. A health care professional can determine if a person’s health may be at risk because of his or her weight.
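
As a minimal sketch of that calculation, the snippet below applies the BMI formula in Python; the weight and height values are hypothetical.

```python
# Sketch: BMI = weight in kilograms divided by height in meters squared.
def bmi(weight_kg: float, height_m: float) -> float:
    return weight_kg / height_m ** 2

print(round(bmi(85, 1.75), 1))  # 27.8 -- falls in the 25 to 29.9 "overweight" range
```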

The table below shows BMI ranges for overweight and obesity in adults 20 and older.

BMI of Adults Ages 20 and Older

BMI             Classification
18.5 to 24.9    Normal, or healthy, weight
25 to 29.9      Overweight
30+             Obesity (including severe obesity)
40+             Severe obesity

Use this online tool from the Centers for Disease Control and Prevention (CDC) to gauge BMI for adults.

Children and Teens

A child’s body composition changes during growth from infancy into adulthood, and it differs by sex. Therefore, a young person’s weight status is calculated based on a comparison with other same-age and same-sex children or teens using CDC’s age- and sex-specific growth charts. The comparison results in a percentile placement. For example, a boy whose weight in relation to his height is greater than 75% of other same-aged boys places in the 75th percentile for BMI and is considered to be of normal or healthy weight.

Children grow at different rates at different times, so it is not always easy to tell if a child is overweight. A child’s health care professional should evaluate the child’s BMI, growth, and potential health risks due to excess body weight.

BMI for Children and Teens

Weight Status Category       Percentile Range
Underweight                  Less than the 5th percentile
Normal or healthy weight     5th percentile to less than the 85th percentile
Overweight                   85th to less than the 95th percentile
Obesity                      95th percentile or greater
Severe obesity               120% of the 95th percentile or greater

Use this online tool from the CDC to calculate BMI and the corresponding BMI-for-age percentile based on CDC growth charts, for children and teens.

Causes and Health Consequences of Overweight and Obesity

Factors that may contribute to excess weight gain among adults and youth include genetics; types and amounts of food and drinks consumed; level of physical activity; degree of time spent on sedentary behaviors, such as watching TV, engaging with a computer, or talking and texting on the phone; sleep habits; medical conditions or medicines; and where and how people live, including their access to and ability to afford healthy foods and safe places to be active. 4,5

Overweight and obesity increase the risk for many health problems, such as type 2 diabetes, high blood pressure, heart disease, stroke, joint problems, liver disease, gallstones, some types of cancer, and sleep and breathing problems, among other conditions. 5,6  Learn more about the causes and health consequences of overweight and obesity .

Age-adjusted percentage of US adults with overweight, obesity, and severe obesity by sex, 2017–2018 NHANES Data 2

                                       All (Men and Women)   Men     Women
Overweight                                   30.7%          34.1%    27.5%
Obesity (including severe obesity)           42.4%          43.0%    41.9%
Severe obesity                                9.2%           6.9%    11.5%

As shown in the above table

  • Nearly 1 in 3 adults (30.7%) are overweight.
  • More than 1 in 3 men (34.1%) and more than 1 in 4 women (27.5%) are overweight.
  • More than 2 in 5 adults (42.4%) have obesity (including severe obesity).
  • About 1 in 11 adults (9.2%) have severe obesity.
  • The percentage of men who are overweight (34.1%) is higher than the percentage of women who are overweight (27.5%).
  • The percentage of women who have severe obesity (11.5%) is higher than the percentage of men who have severe obesity (6.9%).

Age-adjusted prevalence of obesity among adults ages 20 and over, by sex and age: United States, 2017–2018 7

Bar chart: age-adjusted prevalence of obesity among adults ages 20 and over, by sex and age, United States, 2017–2018. Among all adults, the prevalence was 42.4% (ages 20–39: 40%; ages 40–59: 44.8%; ages 60 and over: 42.8%). Among men it was 43.0% overall (20–39: 40.3%; 40–59: 46.4%; 60 and over: 42.2%), and among women it was 41.9% overall (20–39: 39.7%; 40–59: 43.3%; 60 and over: 43.3%).

As shown in the above bar graph

  • Among adults ages 20 and over, there are no significant differences in prevalence of obesity by sex or age group

Age-adjusted prevalence of obesity among adults ages 20 and over, by sex, race, and Hispanic origin: United States, 2017–2018 7

Bar chart: age-adjusted prevalence of obesity among adults ages 20 and over, by sex, race, and Hispanic origin, United States, 2017–2018; the key figures are listed below.

  • More than 2 in 5 non-Hispanic white adults (42.2%) have obesity.
  • Nearly 1 in 2 non-Hispanic Black adults (49.6%) have obesity.
  • More than 1 in 6 non-Hispanic Asian adults (17.4%) have obesity.
  • Nearly 1 in 2 Hispanic adults (44.8%) have obesity.
  • Obesity affects more than 2 in 5 non-Hispanic white men (44.7%), more than 2 in 5 non-Hispanic Black men (41.1%), more than 1 in 6 non-Hispanic Asian men (17.5%), and more than 2 in 5 Hispanic men (45.7%).
  • Nearly 2 in 5 non-Hispanic white women (39.8%), more than half of non-Hispanic Black women (56.9%), more than 1 in 6 non-Hispanic Asian women (17.2%), and more than 2 in 5 Hispanic women (43.7%), have obesity.

Age-adjusted prevalence of severe obesity among adults ages 20 and over, by sex, age, and race and Hispanic origin: United States, 2017–2018 7

Bar chart: age-adjusted prevalence of severe obesity among adults ages 20 and over, by sex, age, and race and Hispanic origin, United States, 2017–2018; the key figures are listed below.

  • More women (11.5%) than men (6.9%) have severe obesity.
  • Severe obesity was highest among people ages 40 to 59 (11.5%), followed by people ages 20 to 39 (9.1%) and people ages 60 and older (5.8%).
  • About 1 in 11 non-Hispanic white adults (9.3%) have severe obesity.
  • More than 1 in 8 non-Hispanic Black adults (13.8%) have severe obesity.
  • About 1 in 50 non-Hispanic Asian adults (2.0%) have severe obesity.
  • About 1 in 13 Hispanic adults (7.9%) have severe obesity.
  • Severe obesity was highest among non-Hispanic Black adults (13.8%), followed by non-Hispanic white adults (9.3%), Hispanic adults (7.9%), and non-Hispanic Asian adults (2.0%).

Prevalence of overweight, obesity, and severe obesity among children and adolescents ages 2 to 19 years: United States, 2017–2018 NHANES data 3

Bar chart: prevalence of overweight (16.1%), obesity (19.3%), and severe obesity (6.1%) among children and adolescents ages 2 to 19 years, United States, 2017–2018.

  • Among children and adolescents ages 2 to 19, about 1 in 6 (16.1%) are overweight, more than 1 in 6 (19.3%) have obesity, and about 1 in 18 (6.1%) have severe obesity.

Prevalence of obesity among children and adolescents ages 2 to 19 years: United States, 2017–2018 NHANES data 3

Bar chart: prevalence of obesity among children and adolescents ages 2 to 19 years, United States, 2017–2018: 13.4% among children ages 2 to 5, 20.3% among children ages 6 to 11, and 21.2% among adolescents ages 12 to 19.

  • Among children ages 2 to 5, more than 1 in 8 (13.4%) have obesity.
  • Among children and youth ages 6 to 11, more than 1 in 5 (20.3%) have obesity.
  • Among adolescents ages 12 to 19, more than 1 in 5 (21.2%) have obesity.

Prevalence of obesity among children and adolescents ages 2 to 19 years, by sex and race and Hispanic origin: United States, 2017–2018 NHANES data 3

Bar chart: prevalence of obesity among children and adolescents ages 2 to 19 years, by sex and race and Hispanic origin, United States, 2017–2018; the key figures are listed below.

  • More than 1 in 6 non-Hispanic white boys (17.4%) have obesity and more than 1 in 7 non-Hispanic white girls (14.8%) have obesity.
  • Nearly 1 in 5 non-Hispanic Black boys (19.4%) and more than 2 in 7 non-Hispanic Black girls (29.1%) have obesity.
  • About 1 in 8 non-Hispanic Asian boys (12.4%) and about 1 in 20 non-Hispanic Asian girls (5.1%*) have obesity.
  • About 2 in 7 Hispanic boys (28.1%) and nearly 1 in 4 Hispanic girls (23.0%) have obesity.
  • More than 2 in 7 Mexican American boys (29.2%) and 1 in 4 of Mexican American girls (24.9%) have obesity.

* See asterisked note in the figure above.

Trends in age-adjusted obesity and severe obesity prevalence among adults ages 20 and over: United States, 1999–2000 through 2017–2018 7

Line graph: trends in age-adjusted obesity and severe obesity prevalence among adults ages 20 and over, United States, 1999–2000 through 2017–2018. The age-adjusted prevalence of obesity rose steadily from 30.5% in 1999–2000 to 42.4% in 2017–2018, and the prevalence of severe obesity rose from 4.7% to 9.2% over the same period.

  • The prevalence of obesity and severe obesity increased significantly among adult men and women between 1999–2000 and 2017–2018.

Trends in obesity among children and adolescents ages 2–19 years, by age: United States, 1963–1965 through 2017–2018 3

Line graph: trends in obesity among children and adolescents ages 2 to 19 years, by age, United States, 1963–1965 through 2017–2018. Across all ages 2 to 19, the prevalence of obesity rose from about 4% in the 1960s to about 20% in 2017–2018, with increases in every age group (ages 2 to 5: from about 5% to nearly 15%; ages 6 to 11: from about 4% to about 20%; ages 12 to 19: from about 4% to over 20%).

As shown in the above line graph

  • The prevalence of obesity among children and adolescents ages 2 to 19 years roughly doubled between 1988–1994 and 2017–2018.
  • Among children ages 2 to 5, the prevalence of obesity increased between 1988–1994 and 2003–2004, decreased between 2003–2004 and 2011–2012, and then increased again.
  • Among children ages 6 to 11, the prevalence of obesity increased between 1988–1994 and 2003–2004, fluctuated over the next several years, and most recently (2013–2014 to 2017–2018) increased.
  • Among adolescents, ages 12 to 19, the prevalence of obesity has increased between 1988–1994 and 2017–2018.


