Research Paper Statistical Treatment of Data: A Primer
We can all agree that analyzing and presenting data effectively in a research paper is critical, yet often challenging.
This primer on statistical treatment of data will equip you with the key concepts and procedures to accurately analyze and clearly convey research findings.
You'll discover the fundamentals of statistical analysis and data management, the common quantitative and qualitative techniques, how to visually represent data, and best practices for writing the results - all framed specifically for research papers.
If you are curious about how AI can help you with statistical analysis for research, check out Hepta AI.
Introduction to Statistical Treatment in Research
Statistical analysis is a crucial component of both quantitative and qualitative research. Properly treating data enables researchers to draw valid conclusions from their studies. This primer provides an introductory guide to fundamental statistical concepts and methods for manuscripts.
Understanding the Importance of Statistical Treatment
Careful statistical treatment demonstrates the reliability of results and ensures findings are grounded in robust quantitative evidence. From determining appropriate sample sizes to selecting accurate analytical tests, statistical rigor adds credibility. Both quantitative and qualitative papers benefit from precise data handling.
Objectives of the Primer
This primer aims to equip researchers with best practices for:
Statistical tools to apply during different research phases
Techniques to manage, analyze, and present data
Methods to demonstrate the validity and reliability of measurements
By covering fundamental concepts ranging from descriptive statistics to measurement validity, it enables both novice and experienced researchers to incorporate proper statistical treatment.
Navigating the Primer: Key Topics and Audience
The primer spans introductory topics including:
Research planning and design
Data collection, management, analysis
Result presentation and interpretation
While useful for researchers at any career stage, earlier-career scientists with limited statistical exposure will find it particularly valuable as they prepare manuscripts.
How do you write a statistical method in a research paper?
Statistical methods are a critical component of research papers, allowing you to analyze, interpret, and draw conclusions from your study data. When writing the statistical methods section, you need to provide enough detail so readers can evaluate the appropriateness of the methods you used.
Here are some key things to include when describing statistical methods in a research paper:
Type of Statistical Tests Used
Specify the types of statistical tests performed on the data, including:
Parametric vs nonparametric tests
Descriptive statistics (means, standard deviations)
Inferential statistics (t-tests, ANOVA, regression, etc.)
Statistical significance level (often p < 0.05)
For example: We used t-tests and one-way ANOVA to compare means across groups, with statistical significance set at p < 0.05.
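For illustration, here is a minimal Python sketch of those tests using SciPy; the group scores are invented placeholders, not data from any particular study.

```python
# Hedged sketch: an independent-samples t-test and a one-way ANOVA,
# with significance judged at p < 0.05. All values are hypothetical.
from scipy import stats

group_a = [78, 74, 81, 69, 77, 80]   # hypothetical scores, group A
group_b = [71, 68, 73, 70, 66, 72]   # hypothetical scores, group B
group_c = [65, 70, 64, 68, 66, 69]   # hypothetical scores, group C

# Independent-samples t-test comparing two groups
t_stat, p_val = stats.ttest_ind(group_a, group_b)
print(f"t = {t_stat:.2f}, p = {p_val:.4f}, significant: {p_val < 0.05}")

# One-way ANOVA comparing means across all three groups
f_stat, p_anova = stats.f_oneway(group_a, group_b, group_c)
print(f"F = {f_stat:.2f}, p = {p_anova:.4f}, significant: {p_anova < 0.05}")
```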
Analysis of Subgroups
If you examined subgroups or additional variables, describe the methods used for these analyses.
For example: We stratified data by gender and used chi-square tests to analyze differences between subgroups.
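A hedged sketch of such a subgroup comparison, using SciPy's chi-square test of independence on an invented 2 x 2 contingency table:

```python
# Chi-square test of independence on a hypothetical gender-by-outcome table.
# The counts are made up for illustration only.
from scipy.stats import chi2_contingency

#                outcome A  outcome B
table = [[30, 20],            # e.g., one subgroup
         [25, 35]]            # e.g., the other subgroup

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p:.4f}")
```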
Software and Versions
List any statistical software packages used for analysis, including version numbers. Common programs include SPSS, SAS, R, and Stata.
For example: Data were analyzed using SPSS version 25 (IBM Corp, Armonk, NY).
The key is to give readers enough detail to assess the rigor and appropriateness of your statistical methods. The methods should align with your research aims and design. Keep explanations clear and concise using consistent terminology throughout the paper.
What are the 5 statistical treatments in research?
The five most common statistical treatments used in academic research papers include:
Mean
The mean, or average, is used to describe the central tendency of a dataset. It provides a singular value that represents the middle of a distribution of numbers. Calculating means allows researchers to characterize typical observations within a sample.
Standard Deviation
Standard deviation measures the amount of variability in a dataset. A low standard deviation indicates observations are clustered closely around the mean, while a high standard deviation signifies the data is more spread out. Reporting standard deviations helps readers contextualize means.
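As a simple illustration, the sketch below computes a mean and a sample standard deviation with NumPy on made-up values.

```python
# Illustrative sketch: mean and sample standard deviation of a small,
# hypothetical sample. ddof=1 gives the (n-1) standard deviation
# usually reported in papers.
import numpy as np

scores = np.array([12.1, 14.3, 13.8, 11.9, 15.2, 13.4])  # hypothetical measurements
mean = scores.mean()
sd = scores.std(ddof=1)
print(f"M = {mean:.2f}, SD = {sd:.2f}")
```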
Regression Analysis
Regression analysis models the relationship between independent and dependent variables. It generates an equation that predicts changes in the dependent variable based on changes in the independents. Regressions are useful for hypothesizing causal connections between variables.
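A minimal sketch of a simple linear regression with SciPy, using invented x and y values:

```python
# Simple linear regression: one predictor, one outcome. Values are placeholders.
from scipy import stats

x = [1, 2, 3, 4, 5, 6, 7, 8]                      # hypothetical predictor
y = [2.1, 2.9, 3.8, 5.2, 5.9, 7.1, 8.2, 8.8]      # hypothetical outcome

result = stats.linregress(x, y)
print(f"slope = {result.slope:.2f}, intercept = {result.intercept:.2f}")
print(f"R^2 = {result.rvalue**2:.3f}, p = {result.pvalue:.4f}")
```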
Hypothesis Testing
Hypothesis testing evaluates assumptions about population parameters based on statistics calculated from a sample. Common hypothesis tests include t-tests, ANOVA, and chi-squared. These quantify the likelihood of observed differences being due to chance.
Sample Size Determination
Sample size calculations identify the minimum number of observations needed to detect effects of a given size at a desired statistical power. Appropriate sampling ensures studies can uncover true relationships within the constraints of resource limitations.
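As an illustration, the sketch below uses statsmodels to estimate the minimum group size for an independent-samples t-test, assuming a medium effect size (Cohen's d = 0.5), alpha = 0.05, and 80% power; all inputs are assumptions chosen for the example.

```python
# A priori sample-size sketch for a two-group t-test. The effect size,
# alpha, and power values here are assumed, not taken from any study.
from statsmodels.stats.power import TTestIndPower

n_per_group = TTestIndPower().solve_power(effect_size=0.5, alpha=0.05, power=0.8)
print(f"Required sample size per group: {n_per_group:.0f}")  # roughly 64
```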
These five statistical analysis methods form the backbone of most quantitative research processes. Correct application allows researchers to characterize data trends, model predictive relationships, and make probabilistic inferences regarding broader populations. Expertise in these techniques is fundamental for producing valid, reliable, and publishable academic studies.
How do you know what statistical treatment to use in research?
The selection of appropriate statistical methods for the treatment of data in a research paper depends on three key factors:
The Aim and Objective of the Study
The aim and objectives that the study seeks to achieve will determine the type of statistical analysis required.
Descriptive research presenting characteristics of the data may only require descriptive statistics like measures of central tendency (mean, median, mode) and dispersion (range, standard deviation).
Studies aiming to establish relationships or differences between variables need inferential statistics like correlation, t-tests, ANOVA, regression etc.
Predictive modeling research requires methods like regression, discriminant analysis, logistic regression etc.
Thus, clearly identifying the research purpose and objectives is the first step in planning appropriate statistical treatment.
Type and Distribution of Data
The type of data (categorical, numerical) and its distribution (normal, skewed) also guide the choice of statistical techniques.
Parametric tests have assumptions related to normality and homogeneity of variance.
Non-parametric methods are distribution-free and better suited for non-normal or categorical data.
Testing data distribution and characteristics is therefore vital.
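One common way to test this is a normality check; the sketch below applies the Shapiro-Wilk test to a deliberately skewed, made-up sample.

```python
# Distribution check before choosing a test. A p value below 0.05 suggests
# the data deviate from normality, pointing toward non-parametric methods.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
sample = rng.exponential(scale=2.0, size=50)  # deliberately skewed, hypothetical data

w_stat, p_val = stats.shapiro(sample)
print(f"Shapiro-Wilk W = {w_stat:.3f}, p = {p_val:.4f}")
if p_val < 0.05:
    print("Data look non-normal: consider a non-parametric test.")
else:
    print("No evidence against normality: a parametric test may be appropriate.")
```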
Nature of Observations
Statistical methods also differ based on whether the observations are paired or unpaired.
Analyzing changes within one group requires paired tests like paired t-test, Wilcoxon signed-rank test etc.
Comparing between two or more independent groups needs unpaired tests like independent t-test, ANOVA, Kruskal-Wallis test etc.
Thus the nature of observations is pivotal in selecting suitable statistical analyses.
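A brief sketch contrasting the two situations on invented data, using a paired t-test for before/after readings and an independent t-test for two separate groups:

```python
# Paired vs unpaired comparisons on hypothetical values.
from scipy import stats

before = [120, 132, 128, 119, 140, 135]   # hypothetical pre-treatment values
after = [112, 125, 121, 115, 131, 127]    # same participants, post-treatment

t_paired, p_paired = stats.ttest_rel(before, after)
print(f"Paired t-test: t = {t_paired:.2f}, p = {p_paired:.4f}")

group_1 = [120, 132, 128, 119, 140, 135]  # hypothetical independent group 1
group_2 = [110, 118, 122, 109, 125, 121]  # hypothetical independent group 2

t_ind, p_ind = stats.ttest_ind(group_1, group_2)
print(f"Independent t-test: t = {t_ind:.2f}, p = {p_ind:.4f}")
```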
In summary, clearly defining the research objectives, testing the collected data, and understanding the observational units guides proper statistical treatment and interpretation.
What are statistical techniques in a research paper?
Statistical methods are essential tools in scientific research papers. They allow researchers to summarize, analyze, interpret and present data in meaningful ways.
Some key statistical techniques used in research papers include:
Descriptive statistics: These provide simple summaries of the sample and the measures. Common examples include measures of central tendency (mean, median, mode), measures of variability (range, standard deviation) and graphs (histograms, pie charts).
Inferential statistics: These help make inferences and predictions about a population from a sample. Common techniques include estimation of parameters, hypothesis testing, correlation and regression analysis.
Analysis of variance (ANOVA): This technique allows researchers to compare means across multiple groups and determine statistical significance.
Factor analysis: This technique identifies underlying relationships between variables and latent constructs. It allows reducing a large set of variables into fewer factors.
Structural equation modeling: This technique estimates causal relationships using both latent and observed factors. It is widely used for testing theoretical models in social sciences.
Proper statistical treatment and presentation of data are crucial for the integrity of any quantitative research paper. Statistical techniques help establish validity, account for errors, test hypotheses, build models and derive meaningful insights from the research.
Fundamental Concepts and Data Management
Exploring basic statistical terms.
Understanding key statistical concepts is essential for effective research design and data analysis. This includes defining key terms like:
Statistics: The science of collecting, organizing, analyzing, and interpreting numerical data to draw conclusions or make predictions.
Variables: Characteristics or attributes of the study participants that can take on different values.
Measurement: The process of assigning numbers to variables based on a set of rules.
Sampling: Selecting a subset of a larger population to estimate characteristics of the whole population.
Data types: Quantitative (numerical) or qualitative (categorical) data.
Descriptive vs. inferential statistics: Descriptive statistics summarize data, while inferential statistics allow researchers to draw conclusions about the larger population from the sample.
Ensuring Validity and Reliability in Measurement
When selecting measurement instruments, it is critical they demonstrate:
Validity: The extent to which the instrument measures what it intends to measure.
Reliability: The consistency of measurement over time and across raters.
Researchers should choose instruments aligned to their research questions and study methodology.
Data Management Essentials
Proper data management requires:
Ethical collection procedures respecting autonomy, justice, beneficence and non-maleficence.
Handling missing data through deletion, imputation or modeling procedures.
Data cleaning by identifying and fixing errors, inconsistencies and duplicates.
Data screening via visual inspection and statistical methods to detect anomalies.
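As a hedged illustration of these steps, the sketch below uses pandas on a toy data frame with hypothetical column names to count missing values, drop duplicates, screen implausible entries, and apply one simple imputation option.

```python
# Routine data-management sketch on made-up data: missing values,
# duplicates, and an obviously out-of-range entry.
import pandas as pd

df = pd.DataFrame({
    "participant_id": [1, 2, 2, 3, 4],
    "age": [34, 29, 29, None, 410],      # one missing value, one impossible value
    "score": [88, 92, 92, 75, 81],
})

print(df.isna().sum())                                        # missing values per column
df = df.drop_duplicates()                                     # remove exact duplicate rows
df = df[df["age"].isna() | df["age"].between(18, 100)].copy() # screen out implausible ages
df["age"] = df["age"].fillna(df["age"].median())              # simple imputation (one option)
print(df)
```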
Data Management Techniques and Ethical Considerations
Ethical data management includes:
Obtaining informed consent from all participants.
Anonymization and encryption to protect privacy.
Secure data storage and transfer procedures.
Responsible use of statistical tools free from manipulation or misrepresentation.
Adhering to ethical guidelines preserves public trust in the integrity of research.
Statistical Methods and Procedures
This section provides an introduction to key quantitative analysis techniques and guidance on when to apply them to different types of research questions and data.
Descriptive Statistics and Data Summarization
Descriptive statistics summarize and organize data characteristics such as central tendency, variability, and distributions. Common descriptive statistical methods include:
Measures of central tendency (mean, median, mode)
Measures of variability (range, interquartile range, standard deviation)
Graphical representations (histograms, box plots, scatter plots)
Frequency distributions and percentages
These methods help describe and summarize the sample data so researchers can spot patterns and trends.
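For example, a short pandas sketch on a made-up data frame can produce most of these summaries at once.

```python
# Descriptive summaries: central tendency, variability, and frequencies.
# The data frame and its columns are hypothetical.
import pandas as pd

df = pd.DataFrame({
    "score": [67, 72, 75, 71, 80, 68, 74, 77],
    "group": ["control", "treatment", "treatment", "control",
              "treatment", "control", "treatment", "control"],
})

print(df["score"].describe())                     # count, mean, std, min, quartiles, max
print(df["group"].value_counts(normalize=True))   # frequencies as proportions
print(df.groupby("group")["score"].agg(["mean", "median", "std"]))
```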
Inferential Statistics for Generalizing Findings
While descriptive statistics summarize sample data, inferential statistics help generalize findings to the larger population. Common techniques include:
Hypothesis testing with t-tests, ANOVA
Correlation and regression analysis
Nonparametric tests
These methods allow researchers to draw conclusions and make predictions about the broader population based on the sample data.
Selecting the Right Statistical Tools
Choosing the appropriate analyses involves assessing:
The research design and questions asked
Type of data (categorical, continuous)
Data distributions
Statistical assumptions required
Matching the correct statistical tests to these elements helps ensure accurate results.
Statistical Treatment of Data for Quantitative Research
For quantitative research, common statistical data treatments include:
Testing data reliability and validity
Checking assumptions of statistical tests
Transforming non-normal data
Identifying and handling outliers
Applying appropriate analyses for the research questions and data type
Examples and case studies help demonstrate correct application of statistical tests.
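As an illustration of two of these treatments, the sketch below flags outliers with the 1.5 x IQR rule and log-transforms a right-skewed, invented variable.

```python
# Outlier screening and a log transform on hypothetical values.
import numpy as np

values = np.array([12, 15, 14, 13, 16, 15, 14, 95])  # 95 is a likely outlier

q1, q3 = np.percentile(values, [25, 75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = values[(values < lower) | (values > upper)]
print("Flagged outliers:", outliers)

log_values = np.log(values)   # log transform can reduce right skew before parametric tests
print("Log-transformed data:", np.round(log_values, 2))
```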
Approaches to Qualitative Data Analysis
Qualitative data is analyzed through methods like:
Thematic analysis
Content analysis
Discourse analysis
Grounded theory
These help researchers discover concepts and patterns within non-numerical data to derive rich insights.
Data Presentation and Research Method
Crafting effective visuals for data presentation.
When presenting analyzed results and statistics in a research paper, well-designed tables, graphs, and charts are key for clearly showcasing patterns in the data to readers. Adhering to formatting standards like APA helps ensure professional data presentation. Consider these best practices:
Choose the appropriate visual type based on the type of data and relationship being depicted. For example, bar charts for comparing categorical data, line graphs to show trends over time.
Label the x-axis, y-axis, legends clearly. Include informative captions.
Use consistent, readable fonts and sizing. Avoid clutter with unnecessary elements. White space can aid readability.
Order data logically, for example from largest to smallest values or chronologically.
Include clear statistical notations, like error bars, where applicable.
Following academic standards for visuals lends credibility while making interpretation intuitive for readers.
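As a hedged example, the matplotlib sketch below draws a labelled bar chart with standard-deviation error bars; the group names, means, and SDs are placeholders echoing the worked results example later in this section.

```python
# Simple, clearly labelled bar chart with error bars. Values are placeholders.
import matplotlib.pyplot as plt

groups = ["Control", "Intervention"]
means = [71, 78]       # hypothetical group means
sds = [4.1, 3.2]       # hypothetical standard deviations shown as error bars

fig, ax = plt.subplots(figsize=(4, 3))
ax.bar(groups, means, yerr=sds, capsize=5, color=["#999999", "#4477aa"])
ax.set_xlabel("Group")
ax.set_ylabel("Mean test score")
ax.set_title("Test scores by group (error bars: SD)")
fig.tight_layout()
fig.savefig("scores_by_group.png", dpi=300)   # save for inclusion in the manuscript
```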
Writing the Results Section with Clarity
When writing the quantitative Results section, aim for clarity by balancing statistical reporting with interpretation of findings. Consider this structure:
Open with an overview of the analysis approach and measurements used.
Break down results by logical subsections for each hypothesis, construct measured etc.
Report exact statistics first, followed by interpretation of their meaning. For example, “Participants exposed to the intervention had significantly higher average scores (M=78, SD=3.2) compared to controls (M=71, SD=4.1), t(115)=3.42, p = 0.001. This suggests the intervention was highly effective for increasing scores.”
Use present verb tense and scientific, formal language.
Include tables/figures where they aid understanding or visualization.
Writing results clearly gives readers deeper context around statistical findings.
Highlighting Research Method and Design
With a results section full of statistics, it's vital to communicate key aspects of the research method and design. Consider including:
Brief overview of study variables, materials, apparatus used. Helps reproducibility.
Descriptions of study sampling techniques, data collection procedures. Supports transparency.
Explanations around approaches to measurement, data analysis performed. Bolsters methodological rigor.
Noting control variables, attempts to limit biases etc. Demonstrates awareness of limitations.
Covering these methodological details shows readers the care taken in designing the study and analyzing the results obtained.
Acknowledging Limitations and Addressing Biases
Honestly recognizing methodological weaknesses and limitations goes a long way in establishing credibility within the published discussion section. Consider transparently noting:
Measurement errors and biases that may have impacted findings.
Limitations around sampling methods that constrain generalizability.
Caveats related to statistical assumptions, analysis techniques applied.
Attempts made to control/account for biases and directions for future research.
Rather than detracting from the work's value, acknowledging limitations demonstrates academic integrity regarding the research performed. It also gives readers deeper insight into interpreting the reported results and findings.
Conclusion: Synthesizing Statistical Treatment Insights
Recap of statistical treatment fundamentals.
Statistical treatment of data is a crucial component of high-quality quantitative research. Proper application of statistical methods and analysis principles enables valid interpretations and inferences from study data. Key fundamentals covered include:
Descriptive statistics to summarize and describe the basic features of study data
Inferential statistics to make judgments about probability and significance based on the data
Using appropriate statistical tools aligned to the research design and objectives
Following established practices for measurement techniques, data collection, and reporting
Adhering to these core tenets ensures research integrity and allows findings to withstand scientific scrutiny.
Key Takeaways for Research Paper Success
When incorporating statistical treatment into a research paper, keep these best practices in mind:
Clearly state the research hypothesis and variables under examination
Select reliable and valid quantitative measures for assessment
Determine appropriate sample size to achieve statistical power
Apply correct analytical methods suited to the data type and distribution
Comprehensively report methodology procedures and statistical outputs
Interpret results in context of the study limitations and scope
Following these guidelines will bolster confidence in the statistical treatment and strengthen the research quality overall.
Encouraging Continued Learning and Application
As statistical techniques continue advancing, it is imperative for researchers to actively further their statistical literacy. Regularly reviewing new methodological developments and learning advanced tools will augment analytical capabilities. Persistently putting enhanced statistical knowledge into practice through research projects and manuscript preparations will cement competencies. Statistical treatment mastery is a journey requiring persistent effort, but one that pays dividends in research proficiency.
Statistical Treatment of Data – Explained & Example
- By DiscoverPhDs
- September 8, 2020
‘Statistical treatment’ is when you apply a statistical method to a data set to draw meaning from it. Statistical treatment can be either descriptive statistics, which describes the relationship between variables in a population, or inferential statistics, which tests a hypothesis by making inferences from the collected data.
Introduction to Statistical Treatment in Research
Every research student, regardless of whether they are a biologist, computer scientist or psychologist, must have a basic understanding of statistical treatment if their study is to be reliable.
This is because designing experiments and collecting data are only a small part of conducting research. The other components, which are often not so well understood by new researchers, are the analysis, interpretation and presentation of the data. This is just as important, if not more important, as this is where meaning is extracted from the study .
What is Statistical Treatment of Data?
Statistical treatment of data is when you apply some form of statistical method to a data set to transform it from a group of meaningless numbers into meaningful output.
Statistical treatment of data involves the use of statistical methods such as:
- regression,
- conditional probability,
- standard deviation and
- distribution range.
These statistical methods allow us to investigate the statistical relationships between the data and identify possible errors in the study.
In addition to being able to identify trends, statistical treatment also allows us to organise and process our data in the first place. This is because when carrying out statistical analysis of our data, it is generally more useful to draw several conclusions for each subgroup within our population than to draw a single, more general conclusion for the whole population. However, to do this, we need to be able to classify the population into different subgroups so that we can later break down our data in the same way before analysing it.
Statistical Treatment Example – Quantitative Research
For a statistical treatment of data example, consider a medical study that is investigating the effect of a drug on the human population. As the drug can affect different people in different ways based on parameters such as gender, age and race, the researchers would want to group the data into different subgroups based on these parameters to determine how each one affects the effectiveness of the drug. Categorising the data in this way is an example of performing basic statistical treatment.
Types of Errors
A fundamental part of statistical treatment is using statistical methods to identify possible outliers and errors. No matter how careful we are, all experiments are subject to inaccuracies resulting from two types of errors: systematic errors and random errors.
Systematic errors are errors associated with either the equipment being used to collect the data or with the method in which they are used. Random errors are errors that occur unknowingly or unpredictably in the experimental configuration, such as internal deformations within specimens or small voltage fluctuations in measurement testing instruments.
These experimental errors, in turn, can lead to two types of conclusion errors: type I errors and type II errors . A type I error is a false positive which occurs when a researcher rejects a true null hypothesis. On the other hand, a type II error is a false negative which occurs when a researcher fails to reject a false null hypothesis.
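To make the idea of a type I error concrete, the sketch below simulates many experiments in which the null hypothesis is true; at alpha = 0.05, roughly 5% of them produce a false positive. The simulation parameters are arbitrary choices for illustration.

```python
# Simulating the Type I error rate: both samples come from the same
# distribution, so any "significant" difference is a false positive.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha, n_sims, false_positives = 0.05, 2000, 0

for _ in range(n_sims):
    a = rng.normal(loc=100, scale=15, size=30)
    b = rng.normal(loc=100, scale=15, size=30)
    _, p = stats.ttest_ind(a, b)
    false_positives += p < alpha

print(f"Observed Type I error rate: {false_positives / n_sims:.3f}")  # close to 0.05
```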
The Beginner's Guide to Statistical Analysis | 5 Steps & Examples
Statistical analysis means investigating trends, patterns, and relationships using quantitative data . It is an important research tool used by scientists, governments, businesses, and other organizations.
To draw valid conclusions, statistical analysis requires careful planning from the very start of the research process . You need to specify your hypotheses and make decisions about your research design, sample size, and sampling procedure.
After collecting data from your sample, you can organize and summarize the data using descriptive statistics . Then, you can use inferential statistics to formally test hypotheses and make estimates about the population. Finally, you can interpret and generalize your findings.
This article is a practical introduction to statistical analysis for students and researchers. We’ll walk you through the steps using two research examples. The first investigates a potential cause-and-effect relationship, while the second investigates a potential correlation between variables.
Table of contents
- Step 1: Write your hypotheses and plan your research design
- Step 2: Collect data from a sample
- Step 3: Summarize your data with descriptive statistics
- Step 4: Test hypotheses or make estimates with inferential statistics
- Step 5: Interpret your results
Step 1: Write your hypotheses and plan your research design
To collect valid data for statistical analysis, you first need to specify your hypotheses and plan out your research design.
Writing statistical hypotheses
The goal of research is often to investigate a relationship between variables within a population . You start with a prediction, and use statistical analysis to test that prediction.
A statistical hypothesis is a formal way of writing a prediction about a population. Every research prediction is rephrased into null and alternative hypotheses that can be tested using sample data.
While the null hypothesis always predicts no effect or no relationship between variables, the alternative hypothesis states your research prediction of an effect or relationship.
- Null hypothesis: A 5-minute meditation exercise will have no effect on math test scores in teenagers.
- Alternative hypothesis: A 5-minute meditation exercise will improve math test scores in teenagers.
- Null hypothesis: Parental income and GPA have no relationship with each other in college students.
- Alternative hypothesis: Parental income and GPA are positively correlated in college students.
Planning your research design
A research design is your overall strategy for data collection and analysis. It determines the statistical tests you can use to test your hypothesis later on.
First, decide whether your research will use a descriptive, correlational, or experimental design. Experiments directly influence variables, whereas descriptive and correlational studies only measure variables.
- In an experimental design , you can assess a cause-and-effect relationship (e.g., the effect of meditation on test scores) using statistical tests of comparison or regression.
- In a correlational design , you can explore relationships between variables (e.g., parental income and GPA) without any assumption of causality using correlation coefficients and significance tests.
- In a descriptive design , you can study the characteristics of a population or phenomenon (e.g., the prevalence of anxiety in U.S. college students) using statistical tests to draw inferences from sample data.
Your research design also concerns whether you’ll compare participants at the group level or individual level, or both.
- In a between-subjects design , you compare the group-level outcomes of participants who have been exposed to different treatments (e.g., those who performed a meditation exercise vs those who didn’t).
- In a within-subjects design , you compare repeated measures from participants who have participated in all treatments of a study (e.g., scores from before and after performing a meditation exercise).
- In a mixed (factorial) design , one variable is altered between subjects and another is altered within subjects (e.g., pretest and posttest scores from participants who either did or didn’t do a meditation exercise).
Example: Experimental research design
First, you'll take baseline test scores from participants. Then, your participants will undergo a 5-minute meditation exercise. Finally, you'll record participants' scores from a second math test.
In this experiment, the independent variable is the 5-minute meditation exercise, and the dependent variable is the math test score from before and after the intervention.
Example: Correlational research design
In a correlational study, you test whether there is a relationship between parental income and GPA in graduating college students. To collect your data, you will ask participants to fill in a survey and self-report their parents' incomes and their own GPA.
Measuring variables
When planning a research design, you should operationalize your variables and decide exactly how you will measure them.
For statistical analysis, it’s important to consider the level of measurement of your variables, which tells you what kind of data they contain:
- Categorical data represents groupings. These may be nominal (e.g., gender) or ordinal (e.g. level of language ability).
- Quantitative data represents amounts. These may be on an interval scale (e.g. test score) or a ratio scale (e.g. age).
Many variables can be measured at different levels of precision. For example, age data can be quantitative (8 years old) or categorical (young). If a variable is coded numerically (e.g., level of agreement from 1–5), it doesn’t automatically mean that it’s quantitative instead of categorical.
Identifying the measurement level is important for choosing appropriate statistics and hypothesis tests. For example, you can calculate a mean score with quantitative data, but not with categorical data.
In a research study, along with measures of your variables of interest, you’ll often collect data on relevant participant characteristics.
| Variable | Type of data |
|---|---|
| Age | Quantitative (ratio) |
| Gender | Categorical (nominal) |
| Race or ethnicity | Categorical (nominal) |
| Baseline test scores | Quantitative (interval) |
| Final test scores | Quantitative (interval) |

| Variable | Type of data |
|---|---|
| Parental income | Quantitative (ratio) |
| GPA | Quantitative (interval) |
Step 2: Collect data from a sample
In most cases, it's too difficult or expensive to collect data from every member of the population you're interested in studying. Instead, you'll collect data from a sample.
Statistical analysis allows you to apply your findings beyond your own sample as long as you use appropriate sampling procedures . You should aim for a sample that is representative of the population.
Sampling for statistical analysis
There are two main approaches to selecting a sample.
- Probability sampling: every member of the population has a chance of being selected for the study through random selection.
- Non-probability sampling: some members of the population are more likely than others to be selected for the study because of criteria such as convenience or voluntary self-selection.
In theory, for highly generalizable findings, you should use a probability sampling method. Random selection reduces several types of research bias , like sampling bias , and ensures that data from your sample is actually typical of the population. Parametric tests can be used to make strong statistical inferences when data are collected using probability sampling.
But in practice, it's rarely possible to gather the ideal sample. While non-probability samples are more at risk for biases like self-selection bias, they are much easier to recruit and collect data from. Non-parametric tests are more appropriate for non-probability samples, but they result in weaker inferences about the population.
If you want to use parametric tests for non-probability samples, you have to make the case that:
- your sample is representative of the population you’re generalizing your findings to.
- your sample lacks systematic bias.
Keep in mind that external validity means that you can only generalize your conclusions to others who share the characteristics of your sample. For instance, results from Western, Educated, Industrialized, Rich and Democratic samples (e.g., college students in the US) aren’t automatically applicable to all non-WEIRD populations.
If you apply parametric tests to data from non-probability samples, be sure to elaborate on the limitations of how far your results can be generalized in your discussion section .
Create an appropriate sampling procedure
Based on the resources available for your research, decide on how you’ll recruit participants.
- Will you have resources to advertise your study widely, including outside of your university setting?
- Will you have the means to recruit a diverse sample that represents a broad population?
- Do you have time to contact and follow up with members of hard-to-reach groups?
Example: Sampling (experimental study)
Your participants are self-selected by their schools. Although you're using a non-probability sample, you aim for a diverse and representative sample.
Example: Sampling (correlational study)
Your main population of interest is male college students in the US. Using social media advertising, you recruit senior-year male college students from a smaller subpopulation: seven universities in the Boston area.
Calculate sufficient sample size
Before recruiting participants, decide on your sample size either by looking at other studies in your field or using statistics. A sample that's too small may be unrepresentative of the population, while a sample that's too large will be more costly than necessary.
There are many sample size calculators online. Different formulas are used depending on whether you have subgroups or how rigorous your study should be (e.g., in clinical research). As a rule of thumb, a minimum of 30 units or more per subgroup is necessary.
To use these calculators, you have to understand and input these key components:
- Significance level (alpha): the risk of rejecting a true null hypothesis that you are willing to take, usually set at 5%.
- Statistical power : the probability of your study detecting an effect of a certain size if there is one, usually 80% or higher.
- Expected effect size : a standardized indication of how large the expected result of your study will be, usually based on other similar studies.
- Population standard deviation: an estimate of the population parameter based on a previous study or a pilot study of your own.
Step 3: Summarize your data with descriptive statistics
Once you've collected all of your data, you can inspect them and calculate descriptive statistics that summarize them.
Inspect your data
There are various ways to inspect your data, including the following:
- Organizing data from each variable in frequency distribution tables .
- Displaying data from a key variable in a bar chart to view the distribution of responses.
- Visualizing the relationship between two variables using a scatter plot .
By visualizing your data in tables and graphs, you can assess whether your data follow a skewed or normal distribution and whether there are any outliers or missing data.
A normal distribution means that your data are symmetrically distributed around a center where most values lie, with the values tapering off at the tail ends.
In contrast, a skewed distribution is asymmetric and has more values on one end than the other. The shape of the distribution is important to keep in mind because only some descriptive statistics should be used with skewed distributions.
Extreme outliers can also produce misleading statistics, so you may need a systematic approach to dealing with these values.
Calculate measures of central tendency
Measures of central tendency describe where most of the values in a data set lie. Three main measures of central tendency are often reported:
- Mode : the most popular response or value in the data set.
- Median : the value in the exact middle of the data set when ordered from low to high.
- Mean : the sum of all values divided by the number of values.
However, depending on the shape of the distribution and level of measurement, only one or two of these measures may be appropriate. For example, many demographic characteristics can only be described using the mode or proportions, while a variable like reaction time may not have a mode at all.
Calculate measures of variability
Measures of variability tell you how spread out the values in a data set are. Four main measures of variability are often reported:
- Range : the highest value minus the lowest value of the data set.
- Interquartile range : the range of the middle half of the data set.
- Standard deviation : the average distance between each value in your data set and the mean.
- Variance : the square of the standard deviation.
Once again, the shape of the distribution and level of measurement should guide your choice of variability statistics. The interquartile range is the best measure for skewed distributions, while standard deviation and variance provide the best information for normal distributions.
Using your table, you should check whether the units of the descriptive statistics are comparable for pretest and posttest scores. For example, are the variance levels similar across the groups? Are there any extreme values? If there are, you may need to identify and remove extreme outliers in your data set or transform your data before performing a statistical test.
| | Pretest scores | Posttest scores |
|---|---|---|
| Mean | 68.44 | 75.25 |
| Standard deviation | 9.43 | 9.88 |
| Variance | 88.96 | 97.96 |
| Range | 36.25 | 45.12 |
| n | 30 | 30 |
From this table, we can see that the mean score increased after the meditation exercise, and the variances of the two scores are comparable. Next, we can perform a statistical test to find out if this improvement in test scores is statistically significant in the population.
Example: Descriptive statistics (correlational study)
After collecting data from 653 students, you tabulate descriptive statistics for annual parental income and GPA.
It’s important to check whether you have a broad range of data points. If you don’t, your data may be skewed towards some groups more than others (e.g., high academic achievers), and only limited inferences can be made about a relationship.
| | Parental income (USD) | GPA |
|---|---|---|
| Mean | 62,100 | 3.12 |
| Standard deviation | 15,000 | 0.45 |
| Variance | 225,000,000 | 0.16 |
| Range | 8,000–378,000 | 2.64–4.00 |
| n | 653 | 653 |
Step 4: Test hypotheses or make estimates with inferential statistics
A number that describes a sample is called a statistic, while a number describing a population is called a parameter. Using inferential statistics, you can make conclusions about population parameters based on sample statistics.
Researchers often use two main methods (simultaneously) to make inferences in statistics.
- Estimation: calculating population parameters based on sample statistics.
- Hypothesis testing: a formal process for testing research predictions about the population using samples.
You can make two types of estimates of population parameters from sample statistics:
- A point estimate : a value that represents your best guess of the exact parameter.
- An interval estimate : a range of values that represent your best guess of where the parameter lies.
If your aim is to infer and report population characteristics from sample data, it’s best to use both point and interval estimates in your paper.
You can consider a sample statistic a point estimate for the population parameter when you have a representative sample (e.g., in a wide public opinion poll, the proportion of a sample that supports the current government is taken as the population proportion of government supporters).
There’s always error involved in estimation, so you should also provide a confidence interval as an interval estimate to show the variability around a point estimate.
A confidence interval uses the standard error and the z score from the standard normal distribution to convey where you’d generally expect to find the population parameter most of the time.
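A minimal sketch of such an interval estimate, computing a 95% confidence interval for a mean from the standard error and the 1.96 z score on made-up data:

```python
# Point estimate plus a 95% confidence interval built from the standard error.
import numpy as np

sample = np.array([68, 75, 71, 80, 66, 73, 77, 70, 74, 72])  # hypothetical scores
mean = sample.mean()
se = sample.std(ddof=1) / np.sqrt(len(sample))   # standard error of the mean
z = 1.96                                         # z score for 95% confidence

ci_low, ci_high = mean - z * se, mean + z * se
print(f"Point estimate: {mean:.1f}, 95% CI [{ci_low:.1f}, {ci_high:.1f}]")
```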
Hypothesis testing
Using data from a sample, you can test hypotheses about relationships between variables in the population. Hypothesis testing starts with the assumption that the null hypothesis is true in the population, and you use statistical tests to assess whether the null hypothesis can be rejected or not.
Statistical tests determine where your sample data would lie on an expected distribution of sample data if the null hypothesis were true. These tests give two main outputs:
- A test statistic tells you how much your data differs from the null hypothesis of the test.
- A p value tells you the likelihood of obtaining your results if the null hypothesis is actually true in the population.
Statistical tests come in three main varieties:
- Comparison tests assess group differences in outcomes.
- Regression tests assess cause-and-effect relationships between variables.
- Correlation tests assess relationships between variables without assuming causation.
Your choice of statistical test depends on your research questions, research design, sampling method, and data characteristics.
Parametric tests
Parametric tests make powerful inferences about the population based on sample data. But to use them, some assumptions must be met, and only some types of variables can be used. If your data violate these assumptions, you can perform appropriate data transformations or use alternative non-parametric tests instead.
A regression models the extent to which changes in a predictor variable result in changes in the outcome variable(s).
- A simple linear regression includes one predictor variable and one outcome variable.
- A multiple linear regression includes two or more predictor variables and one outcome variable.
Comparison tests usually compare the means of groups. These may be the means of different groups within a sample (e.g., a treatment and control group), the means of one sample group taken at different times (e.g., pretest and posttest scores), or a sample mean and a population mean.
- A t test is for exactly 1 or 2 groups when the sample is small (30 or less).
- A z test is for exactly 1 or 2 groups when the sample is large.
- An ANOVA is for 3 or more groups.
The z and t tests have subtypes based on the number and types of samples and the hypotheses:
- If you have only one sample that you want to compare to a population mean, use a one-sample test .
- If you have paired measurements (within-subjects design), use a dependent (paired) samples test .
- If you have completely separate measurements from two unmatched groups (between-subjects design), use an independent (unpaired) samples test .
- If you expect a difference between groups in a specific direction, use a one-tailed test .
- If you don’t have any expectations for the direction of a difference between groups, use a two-tailed test .
The only parametric correlation test is Pearson’s r . The correlation coefficient ( r ) tells you the strength of a linear relationship between two quantitative variables.
However, to test whether the correlation in the sample is strong enough to be important in the population, you also need to perform a significance test of the correlation coefficient, usually a t test, to obtain a p value. This test uses your sample size to calculate how much the correlation coefficient differs from zero in the population.
You use a dependent-samples, one-tailed t test to assess whether the meditation exercise significantly improved math test scores. The test gives you:
- a t value (test statistic) of 3.00
- a p value of 0.0028
Although Pearson’s r is a test statistic, it doesn’t tell you anything about how significant the correlation is in the population. You also need to test whether this sample correlation coefficient is large enough to demonstrate a correlation in the population.
A t test can also determine how significantly a correlation coefficient differs from zero based on sample size. Since you expect a positive correlation between parental income and GPA, you use a one-sample, one-tailed t test. The t test gives you:
- a t value of 3.08
- a p value of 0.001
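For illustration, SciPy's pearsonr returns both the correlation coefficient and its p value in one call; the income and GPA values below are invented stand-ins, not the study data.

```python
# Pearson's r with its significance test. Values are hypothetical placeholders.
from scipy import stats

parental_income = [32, 45, 51, 60, 62, 70, 78, 85, 90, 110]  # in thousands of USD
gpa = [2.7, 2.9, 3.0, 3.1, 3.2, 3.1, 3.4, 3.5, 3.6, 3.8]

r, p = stats.pearsonr(parental_income, gpa)
print(f"Pearson's r = {r:.2f}, p = {p:.4f}")
# For a one-tailed test, pass alternative="greater" (available in SciPy >= 1.9).
```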
Step 5: Interpret your results
The final step of statistical analysis is interpreting your results.
Statistical significance
In hypothesis testing, statistical significance is the main criterion for forming conclusions. You compare your p value to a set significance level (usually 0.05) to decide whether your results are statistically significant or non-significant.
Statistically significant results are considered unlikely to have arisen solely due to chance. There is only a very low chance of such a result occurring if the null hypothesis is true in the population.
This means that you believe the meditation intervention, rather than random factors, directly caused the increase in test scores.
Example: Interpret your results (correlational study)
You compare your p value of 0.001 to your significance threshold of 0.05. With a p value under this threshold, you can reject the null hypothesis. This indicates a statistically significant correlation between parental income and GPA in male college students.
Note that correlation doesn’t always mean causation, because there are often many underlying factors contributing to a complex variable like GPA. Even if one variable is related to another, this may be because of a third variable influencing both of them, or indirect links between the two variables.
Effect size
A statistically significant result doesn’t necessarily mean that there are important real life applications or clinical outcomes for a finding.
In contrast, the effect size indicates the practical significance of your results. It’s important to report effect sizes along with your inferential statistics for a complete picture of your results. You should also report interval estimates of effect sizes if you’re writing an APA style paper .
With a Cohen's d of 0.72, there's medium to high practical significance to your finding that the meditation exercise improved test scores.
Example: Effect size (correlational study)
To determine the effect size of the correlation coefficient, you compare your Pearson's r value to Cohen's effect size criteria.
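As a hedged sketch of where such an effect size comes from, the code below computes Cohen's d from two invented groups using the pooled standard deviation.

```python
# Cohen's d from two hypothetical groups, using the pooled standard deviation.
import numpy as np

group_1 = np.array([75, 78, 74, 80, 77, 76, 79, 73])  # hypothetical scores
group_2 = np.array([70, 72, 69, 74, 71, 68, 73, 70])  # hypothetical scores

n1, n2 = len(group_1), len(group_2)
pooled_var = ((n1 - 1) * group_1.var(ddof=1) + (n2 - 1) * group_2.var(ddof=1)) / (n1 + n2 - 2)
d = (group_1.mean() - group_2.mean()) / np.sqrt(pooled_var)
print(f"Cohen's d = {d:.2f}")   # roughly 0.2 small, 0.5 medium, 0.8 large (Cohen's benchmarks)
```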
Decision errors
Type I and Type II errors are mistakes made in research conclusions. A Type I error means rejecting the null hypothesis when it’s actually true, while a Type II error means failing to reject the null hypothesis when it’s false.
You can aim to minimize the risk of these errors by selecting an optimal significance level and ensuring high power . However, there’s a trade-off between the two errors, so a fine balance is necessary.
Frequentist versus Bayesian statistics
Traditionally, frequentist statistics emphasizes null hypothesis significance testing and always starts with the assumption of a true null hypothesis.
However, Bayesian statistics has grown in popularity as an alternative approach in the last few decades. In this approach, you use previous research to continually update your hypotheses based on your expectations and observations.
Bayes factor compares the relative strength of evidence for the null versus the alternative hypothesis rather than making a conclusion about rejecting the null hypothesis or not.
An Introduction to Statistics: Choosing the Correct Statistical Test
Priya Ranganathan
1 Department of Anaesthesiology, Critical Care and Pain, Tata Memorial Centre, Homi Bhabha National Institute, Mumbai, Maharashtra, India
The choice of statistical test used for analysis of data from a research study is crucial in interpreting the results of the study. This article gives an overview of the various factors that determine the selection of a statistical test and lists some statistical tests used in common practice.
How to cite this article: Ranganathan P. An Introduction to Statistics: Choosing the Correct Statistical Test. Indian J Crit Care Med 2021;25(Suppl 2):S184–S186.
In a previous article in this series, we looked at different types of data and ways to summarise them. 1 At the end of the research study, statistical analyses are performed to test the hypothesis and either prove or disprove it. The choice of statistical test needs to be carefully performed since the use of incorrect tests could lead to misleading conclusions. Some key questions help us to decide the type of statistical test to be used for analysis of study data. 2
What is the Research Hypothesis?
Sometimes, a study may just describe the characteristics of the sample, e.g., a prevalence study. Here, the statistical analysis involves only descriptive statistics . For example, Sridharan et al. aimed to analyze the clinical profile, species distribution, and susceptibility pattern of patients with invasive candidiasis. 3 They used descriptive statistics to express the characteristics of their study sample, including mean (and standard deviation) for normally distributed data, median (with interquartile range) for skewed data, and percentages for categorical data.
Studies may be conducted to test a hypothesis and derive inferences from the sample results to the population. This is known as inferential statistics . The goal of inferential statistics may be to assess differences between groups (comparison), establish an association between two variables (correlation), predict one variable from another (regression), or look for agreement between measurements (agreement). Studies may also look at time to a particular event, analyzed using survival analysis.
Are the Comparisons Matched (Paired) or Unmatched (Unpaired)?
Observations made on the same individual (before–after or comparing two sides of the body) are usually matched or paired . Comparisons made between individuals are usually unpaired or unmatched . Data are considered paired if the values in one set of data are likely to be influenced by the other set (as can happen in before and after readings from the same individual). Examples of paired data include serial measurements of procalcitonin in critically ill patients or comparison of pain relief during sequential administration of different analgesics in a patient with osteoarthritis.
What is the Type of Data Being Measured?
The test chosen to analyze data will depend on whether the data are categorical (and whether nominal or ordinal) or numerical (and whether skewed or normally distributed). Tests used to analyze normally distributed data are known as parametric tests and have a nonparametric counterpart that is used for data, which is distribution-free. 4 Parametric tests assume that the sample data are normally distributed and have the same characteristics as the population; nonparametric tests make no such assumptions. Parametric tests are more powerful and have a greater ability to pick up differences between groups (where they exist); in contrast, nonparametric tests are less efficient at identifying significant differences. Time-to-event data requires a special type of analysis, known as survival analysis.
How Many Measurements are Being Compared?
The choice of the test differs depending on whether two or more than two measurements are being compared. This includes more than two groups (unmatched data) or more than two measurements in a group (matched data).
Tests for Comparison
Table 1 lists the tests commonly used for comparing unpaired data, depending on the number of groups and type of data. As an example, Megahed and colleagues evaluated the role of early bronchoscopy in mechanically ventilated patients with aspiration pneumonitis. 5 Patients were randomized to receive either early bronchoscopy or conventional treatment. Between groups, comparisons were made using the unpaired t-test for normally distributed continuous variables, the Mann–Whitney U-test for non-normal continuous variables, and the chi-square test for categorical variables. Chowhan et al. compared the efficacy of left ventricular outflow tract velocity time integral (LVOTVTI) and carotid artery velocity time integral (CAVTI) as predictors of fluid responsiveness in patients with sepsis and septic shock. 6 Patients were divided into three groups: sepsis, septic shock, and controls. Since there were three groups, comparisons of numerical variables were done using analysis of variance (for normally distributed data) or the Kruskal–Wallis test (for skewed data).
Tests for comparison of unpaired data

| Type of data | Two groups | More than two groups |
| --- | --- | --- |
| Nominal | Chi-square test or Fisher's exact test | Chi-square test |
| Ordinal or skewed | Mann–Whitney U-test (Wilcoxon rank sum test) | Kruskal–Wallis test |
| Normally distributed | Unpaired t-test | Analysis of variance (ANOVA) |
A common error is to use multiple unpaired t-tests to compare more than two groups; i.e., for a study with three treatment groups A, B, and C, it would be incorrect to run unpaired t-tests for group A vs B, B vs C, and C vs A. The correct approach is to run ANOVA first and, if it yields a significant result, use post hoc tests to determine which group differs from the others.
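A minimal sketch of that workflow on simulated data follows; here `pairwise_tukeyhsd` (Tukey's HSD from statsmodels) stands in for whichever post hoc test is appropriate for a given study.

```python
# Sketch: one-way ANOVA across three groups, then a post hoc test only if ANOVA is significant.
import numpy as np
from scipy import stats
from statsmodels.stats.multicomp import pairwise_tukeyhsd

rng = np.random.default_rng(1)
group_a = rng.normal(100, 15, 20)
group_b = rng.normal(110, 15, 20)
group_c = rng.normal(100, 15, 20)

f_stat, p_value = stats.f_oneway(group_a, group_b, group_c)
print(f"One-way ANOVA: F = {f_stat:.2f}, p = {p_value:.3f}")

if p_value < 0.05:
    values = np.concatenate([group_a, group_b, group_c])
    labels = ["A"] * 20 + ["B"] * 20 + ["C"] * 20
    print(pairwise_tukeyhsd(values, labels))   # which pairs of groups differ?
```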
Table 2 lists the tests commonly used for comparing paired data, depending on the number of groups and type of data. As discussed above, it would be incorrect to use multiple paired t-tests to compare more than two measurements within a group. In the study by Chowhan, each parameter (LVOTVTI and CAVTI) was measured in the supine position and following passive leg raise. These represented paired readings from the same individual, and the pre- and post-readings were compared using the paired t-test. 6 Verma et al. evaluated the role of physiotherapy on oxygen requirements and physiological parameters in patients with COVID-19. 7 Each patient had pretreatment and post-treatment data for heart rate and oxygen supplementation recorded on day 1 and day 14. Since the data did not follow a normal distribution, they used Wilcoxon's matched-pairs test to compare the pre- and post-values of heart rate (a numerical variable). McNemar's test was used to compare pre- and post-intervention supplemental oxygen status, expressed as dichotomous (yes/no) data. In the study by Megahed, patients had various parameters, such as the sepsis-related organ failure assessment score, lung injury score, and clinical pulmonary infection score (CPIS), measured at baseline and on days 3 and 7. 5 Within groups, comparisons were made using repeated measures ANOVA for normally distributed data and Friedman's test for skewed data.
Tests for comparison of paired data

| Type of data | Two measurements | More than two measurements |
| --- | --- | --- |
| Nominal | McNemar's test | Cochran's Q test |
| Ordinal or skewed | Wilcoxon signed-rank test | Friedman test |
| Normally distributed | Paired t-test | Repeated measures ANOVA |
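The hedged sketch below illustrates the paired tests in Table 2 on invented before/after values; it is not drawn from any of the cited studies.

```python
# Illustrative sketch of paired comparisons: paired t-test (normal data),
# Wilcoxon signed-rank (skewed data), and McNemar's test (paired yes/no data).
import numpy as np
from scipy import stats
from statsmodels.stats.contingency_tables import mcnemar

before = np.array([88, 92, 75, 80, 95, 70, 85, 90])
after = np.array([82, 85, 73, 78, 88, 69, 80, 84])

t_stat, p = stats.ttest_rel(before, after)          # paired t-test
print(f"Paired t-test: t = {t_stat:.2f}, p = {p:.3f}")

w_stat, p = stats.wilcoxon(before, after)           # Wilcoxon signed-rank test
print(f"Wilcoxon signed-rank: W = {w_stat:.2f}, p = {p:.3f}")

# McNemar's test: 2x2 table of paired pre/post oxygen requirement (yes/no), invented counts
table = [[20, 5],    # yes before & yes after, yes before & no after
         [15, 10]]   # no before & yes after,  no before & no after
print(mcnemar(table, exact=True))
```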
Tests for Association between Variables
Table 3 lists the tests used to determine the association between variables. Correlation determines the strength of the relationship between two variables; regression allows the prediction of one variable from another. Tyagi examined the correlation between ETCO2 and PaCO2 in mechanically ventilated patients with an acute exacerbation of chronic obstructive pulmonary disease. 8 Since these were normally distributed variables, the linear correlation between ETCO2 and PaCO2 was determined by Pearson's correlation coefficient. Parajuli et al. compared the Acute Physiology and Chronic Health Evaluation II (APACHE II) and Acute Physiology and Chronic Health Evaluation IV (APACHE IV) scores to predict intensive care unit mortality, both of which were ordinal data. Correlation between the APACHE II and APACHE IV scores was tested using Spearman's coefficient. 9 A study by Roshan et al. identified risk factors for the development of aspiration pneumonia following rapid sequence intubation. 10 Since the outcome was categorical binary data (aspiration pneumonia: yes/no), they performed a bivariate analysis to derive unadjusted odds ratios, followed by a multivariable logistic regression analysis to calculate adjusted odds ratios for risk factors associated with aspiration pneumonia.
Tests for assessing the association between variables

| Type of data or outcome | Test |
| --- | --- |
| Both variables normally distributed | Pearson's correlation coefficient |
| One or both variables ordinal or skewed | Spearman's or Kendall's correlation coefficient |
| Nominal data | Chi-square test; odds ratio or relative risk (for binary outcomes) |
| Continuous outcome (prediction) | Linear regression analysis |
| Categorical (binary) outcome (prediction) | Logistic regression analysis |
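A small sketch of these association analyses on simulated data follows; the variable names (etco2, paco2) are placeholders and are not the cited authors' datasets.

```python
# Sketch: Pearson's r, Spearman's rho, and logistic regression on simulated data.
import numpy as np
from scipy import stats
import statsmodels.api as sm

rng = np.random.default_rng(2)
etco2 = rng.normal(40, 6, 50)
paco2 = etco2 + rng.normal(5, 3, 50)

r, p = stats.pearsonr(etco2, paco2)                 # linear correlation (normal data)
rho, p_s = stats.spearmanr(etco2, paco2)            # rank correlation (ordinal/skewed data)
print(f"Pearson r = {r:.2f} (p = {p:.3f}); Spearman rho = {rho:.2f}")

# Logistic regression: binary outcome predicted from two covariates
outcome = (paco2 + rng.normal(0, 5, 50) > 48).astype(int)
X = sm.add_constant(np.column_stack([etco2, paco2]))
model = sm.Logit(outcome, X).fit(disp=0)
print(np.exp(model.params))                         # adjusted odds ratios
```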
Tests for Agreement between Measurements
Table 4 outlines the tests used for assessing agreement between measurements. Gunalan evaluated concordance between the National Healthcare Safety Network surveillance criteria and CPIS for the diagnosis of ventilator-associated pneumonia. 11 Since both scores are examples of ordinal data, kappa statistics were calculated to assess the concordance between the two methods. In the previously quoted study by Tyagi, the agreement between ETCO2 and PaCO2 (both numerical variables) was represented using the Bland–Altman method. 8
Tests for assessing agreement between measurements

| Type of data | Test |
| --- | --- |
| Categorical data | Cohen's kappa |
| Numerical data | Intraclass correlation coefficient (numerical) and Bland–Altman plot (graphical display) |
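The following sketch shows, on invented data, Cohen's kappa for two categorical ratings and the numerical pieces of a Bland–Altman analysis (bias and limits of agreement); a real analysis would also draw the Bland–Altman plot.

```python
# Hedged sketch: agreement between two raters (kappa) and two measurement methods (Bland-Altman).
import numpy as np
from sklearn.metrics import cohen_kappa_score

rater_1 = ["VAP", "no VAP", "VAP", "VAP", "no VAP", "VAP", "no VAP", "no VAP"]
rater_2 = ["VAP", "no VAP", "no VAP", "VAP", "no VAP", "VAP", "VAP", "no VAP"]
print(f"Cohen's kappa = {cohen_kappa_score(rater_1, rater_2):.2f}")

rng = np.random.default_rng(3)
method_a = rng.normal(40, 6, 30)
method_b = method_a + rng.normal(1, 2, 30)

diff = method_a - method_b
bias = diff.mean()
limits = (bias - 1.96 * diff.std(ddof=1), bias + 1.96 * diff.std(ddof=1))
print(f"Bland-Altman bias = {bias:.2f}, "
      f"95% limits of agreement = ({limits[0]:.2f}, {limits[1]:.2f})")
```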
Tests for Time-to-Event Data (Survival Analysis)
Time-to-event data represent a unique type of data in which some participants have not experienced the outcome of interest at the time of analysis. Such participants are considered to be “censored” but are allowed to contribute to the analysis for the period of their follow-up. A detailed discussion of the analysis of time-to-event data is beyond the scope of this article. Briefly, time-to-event data are analyzed using survival analysis (the Kaplan–Meier method), and groups are compared using the log-rank test. The risk of experiencing the event is expressed as a hazard ratio, and the Cox proportional hazards regression model is used to identify risk factors that are significantly associated with the event.
Hasanzadeh evaluated the impact of zinc supplementation on the development of ventilator-associated pneumonia (VAP) in adult mechanically ventilated trauma patients. 12 Survival analysis (Kaplan–Meier technique) was used to calculate the median time to development of VAP after ICU admission. The Cox proportional hazards regression model was used to calculate hazard ratios to identify factors significantly associated with the development of VAP.
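For readers who want to run this kind of analysis themselves, the sketch below uses the third-party lifelines package (an assumption: it is not part of the standard library and would need `pip install lifelines`) on simulated time-to-VAP data, not the cited study's dataset.

```python
# Sketch: Kaplan-Meier estimate, log-rank test, and Cox regression on simulated data.
import numpy as np
import pandas as pd
from lifelines import KaplanMeierFitter, CoxPHFitter
from lifelines.statistics import logrank_test

rng = np.random.default_rng(4)
n = 60
df = pd.DataFrame({
    "time_days": rng.exponential(10, n).round(1) + 0.1,   # time to VAP or censoring
    "event": rng.integers(0, 2, n),                       # 1 = VAP occurred, 0 = censored
    "zinc": rng.integers(0, 2, n),                        # 1 = zinc supplementation
})

kmf = KaplanMeierFitter().fit(df["time_days"], df["event"])
print(f"Median time to event: {kmf.median_survival_time_} days")

zinc, control = df[df["zinc"] == 1], df[df["zinc"] == 0]
result = logrank_test(zinc["time_days"], control["time_days"],
                      zinc["event"], control["event"])
print(f"Log-rank p = {result.p_value:.3f}")

cph = CoxPHFitter().fit(df, duration_col="time_days", event_col="event")
print(cph.summary[["exp(coef)", "p"]])                    # hazard ratios and p-values
```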
The choice of statistical test used to analyze research data depends on the study hypothesis, the type of data, the number of measurements, and whether the data are paired or unpaired. Reviews of articles published in medical specialties such as family medicine, cytopathology, and pain have found several errors related to the use of descriptive and inferential statistics. 12–15 The statistical technique should be carefully chosen and specified in the protocol before the study begins, so that the conclusions of the study are valid. This article has outlined the principles for selecting a statistical test, along with a list of commonly used tests. Researchers should seek help from statisticians while writing the study protocol to formulate the plan for statistical analysis.
Priya Ranganathan https://orcid.org/0000-0003-1004-5264
Source of support: Nil
Conflict of interest: None
Statistical Treatment
What is Statistical Treatment?
Statistical treatment can mean a few different things:
- In Data Analysis: Applying any statistical method — like regression or calculating a mean — to data.
- In Factor Analysis: Any combination of factor levels is called a treatment.
- In a Thesis or Experiment: A summary of the procedure, including the statistical methods used.
1. Statistical Treatment in Data Analysis
The term “statistical treatment” is a catch-all term meaning the application of any statistical method to your data. Treatments fall into two groups: descriptive statistics, which summarize your data as a graph or summary statistic, and inferential statistics, which make predictions and test hypotheses about your data. Treatments could include the following (a short code sketch follows the list):
- Finding standard deviations and sample standard errors,
- Finding T-scores or Z-scores,
- Calculating correlation coefficients.
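A short sketch of these treatments applied to an arbitrary, made-up sample:

```python
# Minimal sketch: standard deviation, standard error, z-scores, and a correlation coefficient.
import numpy as np
from scipy import stats

sample = np.array([12.1, 13.4, 11.8, 14.2, 12.9, 13.7, 12.5, 15.0])

sd = sample.std(ddof=1)                       # sample standard deviation
se = sd / np.sqrt(len(sample))                # standard error of the mean
z_scores = (sample - sample.mean()) / sd      # z-scores

other = np.array([22.3, 24.1, 21.9, 25.0, 23.2, 24.6, 22.8, 26.1])
r, p = stats.pearsonr(sample, other)          # correlation coefficient

print(f"SD = {sd:.2f}, SE = {se:.2f}, r = {r:.2f} (p = {p:.3f})")
print("z-scores:", np.round(z_scores, 2))
```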
2. Treatments in Factor Analysis
3. Treatments in a Thesis or Experiment
Sometimes you might be asked to include a treatment as part of a thesis. This is asking you to summarize the data and analysis portion of your experiment, including measurements and formulas used. For example, the following experimental summary is from “Statistical Treatment” in Acta Physiologica Scandinavica:
“Each of the test solutions was injected twice in each subject…30-42 values were obtained for the intensity, and a like number for the duration, of the pain induced by the solution. The pain values reported in the following are arithmetical means for these 30-42 injections.”
The author goes on to provide formulas for the mean, the standard deviation and the standard error of the mean.
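For reference, the standard formulas for those three quantities are shown below; these are the textbook definitions, not reproduced from the 1961 paper.

```latex
% Sample mean, sample standard deviation, and standard error of the mean
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2}, \qquad
\mathrm{SE}_{\bar{x}} = \frac{s}{\sqrt{n}}
```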
- Vogt, W.P. (2005). Dictionary of Statistics & Methodology: A Nontechnical Guide for the Social Sciences. SAGE.
- Wheelan, C. (2014). Naked Statistics. W. W. Norton & Company.
- Unknown author (1961). Chapter 3: Statistical Treatment. Acta Physiologica Scandinavica, Volume 51, Issue s179, December, Pages 16–20.
Ten Simple Rules for Effective Statistical Practice
Authors: Robert E. Kass, Brian S. Caffo, Marie Davidian, Xiao-Li Meng, Bin Yu, Nancy Reid

Affiliations: Department of Statistics, Machine Learning Department, and Center for the Neural Basis of Cognition, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America; Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, Maryland, United States of America; Department of Statistics, North Carolina State University, Raleigh, North Carolina, United States of America; Department of Statistics, Harvard University, Cambridge, Massachusetts, United States of America; Department of Statistics and Department of Electrical Engineering and Computer Science, University of California Berkeley, Berkeley, California, United States of America; Department of Statistical Sciences, University of Toronto, Toronto, Ontario, Canada

* E-mail: [email protected]

Published: June 9, 2016

https://doi.org/10.1371/journal.pcbi.1004961
Citation: Kass RE, Caffo BS, Davidian M, Meng X-L, Yu B, Reid N (2016) Ten Simple Rules for Effective Statistical Practice. PLoS Comput Biol 12(6): e1004961. https://doi.org/10.1371/journal.pcbi.1004961
Editor: Fran Lewitter, Whitehead Institute, UNITED STATES
Copyright: © 2016 Kass et al. This is an open access article distributed under the terms of the Creative Commons Attribution License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: BSC's research is partially supported by the National Institutes of Health grant EB012547: www.nibib.nih.gov . MD's research is partially supported by the National Institutes of Health grant NIH P01 CA142538: www.nih.gov . REK's research is partially supported by the National Institute of Mental Health R01 MH064537: www.nimh.nih.gov . NR's research is partially supported by the Natural Sciences and Engineering Council of Canada grant RGPIN-2015-06390: www.nserc.ca . BY's research is partially supported by the National Science Foundation grant CCF-0939370: www.nsf.gov . XLM's research is partially supported by the National Science Foundation ( www.nsf.gov ) DMS 1513492. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Introduction
Several months ago, Phil Bourne, the initiator and frequent author of the wildly successful and incredibly useful “Ten Simple Rules” series, suggested that some statisticians put together a Ten Simple Rules article related to statistics. (One of the rules for writing a PLOS Ten Simple Rules article is to be Phil Bourne [ 1 ]. In lieu of that, we hope effusive praise for Phil will suffice.)
Implicit in the guidelines for writing Ten Simple Rules [ 1 ] is “know your audience.” We developed our list of rules with researchers in mind: researchers having some knowledge of statistics, possibly with one or more statisticians available in their building, or possibly with a healthy do-it-yourself attitude and a handful of statistical packages on their laptops. We drew on our experience in both collaborative research and teaching, and, it must be said, from our frustration at being asked, more than once, to “take a quick look at my student’s thesis/my grant application/my referee’s report: it needs some input on the stats, but it should be pretty straightforward.”
There are some outstanding resources available that explain many of these concepts clearly and in much more detail than we have been able to do here: among our favorites are Cox and Donnelly [ 2 ], Leek [ 3 ], Peng [ 4 ], Kass et al. [ 5 ], Tukey [ 6 ], and Yu [ 7 ].
Every article on statistics requires at least one caveat. Here is ours: we refer in this article to “science” as a convenient shorthand for investigations using data to study questions of interest. This includes social science, engineering, digital humanities, finance, and so on. Statisticians are not shy about reminding administrators that statistical science has an impact on nearly every part of almost all organizations.
Rule 1: Statistical Methods Should Enable Data to Answer Scientific Questions
A big difference between inexperienced users of statistics and expert statisticians appears as soon as they contemplate the uses of some data. While it is obvious that experiments generate data to answer scientific questions, inexperienced users of statistics tend to take for granted the link between data and scientific issues and, as a result, may jump directly to a technique based on data structure rather than scientific goal. For example, if the data were in a table, as for microarray gene expression data, they might look for a method by asking, “Which test should I use?” while a more experienced person would, instead, start with the underlying question, such as, “Where are the differentiated genes?” and, from there, would consider multiple ways the data might provide answers. Perhaps a formal statistical test would be useful, but other approaches might be applied as alternatives, such as heat maps or clustering techniques. Similarly, in neuroimaging, understanding brain activity under various experimental conditions is the main goal; illustrating this with nice images is secondary. This shift in perspective from statistical technique to scientific question may change the way one approaches data collection and analysis. After learning about the questions, statistical experts discuss with their scientific collaborators the ways that data might answer these questions and, thus, what kinds of studies might be most useful. Together, they try to identify potential sources of variability and what hidden realities could break the hypothesized links between data and scientific inferences; only then do they develop analytic goals and strategies. This is a major reason why collaborating with statisticians can be helpful, and also why the collaborative process works best when initiated early in an investigation. See Rule 3 .
Rule 2: Signals Always Come with Noise
Grappling with variability is central to the discipline of statistics. Variability comes in many forms. In some cases variability is good, because we need variability in predictors to explain variability in outcomes. For example, to determine if smoking is associated with lung cancer, we need variability in smoking habits; to find genetic associations with diseases, we need genetic variation. Other times variability may be annoying, such as when we get three different numbers when measuring the same thing three times. This latter variability is usually called “noise,” in the sense that it is either not understood or thought to be irrelevant. Statistical analyses aim to assess the signal provided by the data, the interesting variability, in the presence of noise, or irrelevant variability.
A starting point for many statistical procedures is to introduce a mathematical abstraction: outcomes, such as patients being diagnosed with specific diseases or receiving numerical scores on diagnostic tests, will vary across the set of individuals being studied, and statistical formalism describes such variation using probability distributions. Thus, for example, a data histogram might be replaced, in theory, by a probability distribution, thereby shifting attention from the raw data to the numerical parameters that determine the precise features of the probability distribution, such as its shape, its spread, or the location of its center. Probability distributions are used in statistical models, with the model specifying the way signal and noise get combined in producing the data we observe, or would like to observe. This fundamental step makes statistical inferences possible. Without it, every data value would be considered unique, and we would be left trying to figure out all the detailed processes that might cause an instrument to give different values when measuring the same thing several times. Conceptualizing signal and noise in terms of probability within statistical models has proven to be an extremely effective simplification, allowing us to capture the variability in data in order to express uncertainty about quantities we are trying to understand. The formalism can also help by directing us to look for likely sources of systematic error, known as bias.
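A toy example of this abstraction, with invented numbers: the "signal" is a single unknown mean, the "noise" is measurement error, and the statistical model lets us report both an estimate of the signal and our uncertainty about it.

```python
# Toy signal-plus-noise sketch: repeated noisy measurements of one underlying quantity.
import numpy as np

rng = np.random.default_rng(5)
true_signal = 3.0                               # the quantity we care about
noise_sd = 1.5                                  # irrelevant measurement variability

measurements = true_signal + rng.normal(0, noise_sd, size=25)

estimate = measurements.mean()
standard_error = measurements.std(ddof=1) / np.sqrt(len(measurements))
print(f"Estimated signal: {estimate:.2f} +/- {standard_error:.2f}")
```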
Big data makes these issues more important, not less. For example, Google Flu Trends debuted to great excitement in 2008, but turned out to overestimate the prevalence of influenza by nearly 50%, largely due to bias caused by the way the data were collected; see Harford [ 8 ], for example.
Rule 3: Plan Ahead, Really Ahead
When substantial effort will be involved in collecting data, statistical issues may not be captured in an isolated statistical question such as, “What should my n be?” As we suggested in Rule 1, rather than focusing on a specific detail in the design of the experiment, someone with a lot of statistical experience is likely to step back and consider many aspects of data collection in the context of overall goals and may start by asking, “What would be the ideal outcome of your experiment, and how would you interpret it?” In trying to determine whether observations of X and Y tend to vary together, as opposed to independently, key issues would involve the way X and Y are measured, the extent to which the measurements represent the underlying conceptual meanings of X and Y, the many factors that could affect the measurements, the ability to control those factors, and whether some of those factors might introduce systematic errors (bias).
In Rule 2 we pointed out that statistical models help link data to goals by shifting attention to theoretical quantities of interest. For example, in making electrophysiological measurements from a pair of neurons, a neurobiologist may take for granted a particular measurement methodology along with the supposition that these two neurons will represent a whole class of similar neurons under similar experimental conditions. On the other hand, a statistician will immediately wonder how the specific measurements get at the issue of co-variation; what the major influences on the measurements are, and whether some of them can be eliminated by clever experimental design; what causes variation among repeated measurements, and how quantitative knowledge about sources of variation might influence data collection; and whether these neurons may be considered to be sampled from a well-defined population, and how the process of picking that pair could influence subsequent statistical analyses. A conversation that covers such basic issues may reveal possibilities an experimenter has not yet considered.
Asking questions at the design stage can save headaches at the analysis stage: careful data collection can greatly simplify analysis and make it more rigorous. Or, as Sir Ronald Fisher put it: “To consult the statistician after an experiment is finished is often merely to ask him to conduct a post mortem examination. He can perhaps say what the experiment died of” [ 9 ]. As a good starting point for reading on planning of investigations, see Chapters 1 through 4 of [ 2 ].
Rule 4: Worry about Data Quality
Well-trained experimenters understand instinctively that, when it comes to data analysis, “garbage in produces garbage out.” However, the complexity of modern data collection requires many assumptions about the function of technology, often including data pre-processing technology. It is highly advisable to approach pre-processing with care, as it can have profound effects that easily go unnoticed.
Even with pre-processed data, further considerable effort may be needed prior to analysis; this is variously called “data cleaning,” “data munging,” or “data carpentry.” Hands-on experience can be extremely useful, as data cleaning often reveals important concerns about data quality, in the best case confirming that what was measured is indeed what was intended to be measured and, in the worst case, ensuring that losses are cut early.
Units of measurement should be understood and recorded consistently. It is important that missing data values can be recognized as such by relevant software. For example, 999 may signify the number 999, or it could be code for “we have no clue.” There should be a defensible rule for handling situations such as “non-detects,” and data should be scanned for anomalies such as variable 27 having half its values equal to 0.00027. Try to understand as much as you can how these data arrived at your desk or disk. Why are some data missing or incomplete? Did they get lost through some substantively relevant mechanism? Understanding such mechanisms can help to avoid some seriously misleading results. For example, in a developmental imaging study of attention deficit hyperactivity disorder, might some data have been lost from children with the most severe hyperactivity because they could not sit still in the MR scanner?
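A minimal data-cleaning sketch along these lines, assuming a hypothetical dataset in which 999 was used as a missing-value code and one variable looks suspicious:

```python
# Sketch: make missingness explicit and scan for anomalies before any analysis.
import pandas as pd
import numpy as np

df = pd.DataFrame({
    "age": [34, 41, 999, 38, 52],                 # 999 = "we have no clue"
    "var27": [0.00027, 0.00027, 0.31, 0.28, 0.35],
})

df["age"] = df["age"].replace(999, np.nan)        # recode the missing-value sentinel

# Scan for anomalies, e.g. a variable with many identical, implausible values
print(df["var27"].value_counts().head())
print(df.isna().sum())                            # how much is missing, and where?
```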
Once the data have been wrestled into a convenient format, have a look! Tinkering around with the data, also known as exploratory data analysis, is often the most informative part of the analysis. Exploratory plots can reveal data quality issues and outliers. Simple summaries, such as means, standard deviations, and quantiles, can help refine thinking and offer face validity checks for hypotheses. Many studies, especially when going in completely new scientific directions, are exploratory by design; the area may be too novel to include clear a priori hypotheses. Working with the data informally can help generate new hypotheses and ideas. However, it is also important to acknowledge the specific ways data are selected prior to formal analyses and to consider how such selection might affect conclusions. And it is important to remember that using a single set of data to both generate and test hypotheses is problematic. See Rule 9 .
Rule 5: Statistical Analysis Is More Than a Set of Computations
Statistical software provides tools to assist analyses, not define them. The scientific context is critical, and the key to principled statistical analysis is to bring analytic methods into close correspondence with scientific questions. See Rule 1 . While it can be helpful to include references to a specific algorithm or piece of software in the Methods section of a paper, this should not be a substitute for an explanation of the choice of statistical method in answering a question. A reader will likely want to consider the fundamental issue of whether the analytic technique is appropriately linked to the substantive questions being answered. Don’t make the reader puzzle over this: spell it out clearly.
At the same time, a structured algorithmic approach to the steps in your analysis can be very helpful in making this analysis reproducible by yourself at a later time, or by others with the same or similar data. See Rule 10 .
Rule 6: Keep it Simple
All else being equal, simplicity trumps complexity. This rule has been rediscovered and enshrined in operating procedures across many domains and variously described as “Occam’s razor,” “KISS,” “less is more,” and “simplicity is the ultimate sophistication.” The principle of parsimony can be a trusted guide: start with simple approaches and only add complexity as needed, and then only add as little as seems essential.
Having said this, scientific data have detailed structure, and simple models can’t always accommodate important intricacies. The common assumption of independence is often incorrect and nearly always needs careful examination. See Rule 8 . Large numbers of measurements, interactions among explanatory variables, nonlinear mechanisms of action, missing data, confounding, sampling biases, and so on, can all require an increase in model complexity.
Keep in mind that good design, implemented well, can often allow simple methods of analysis to produce strong results. See Rule 3 . Simple models help us to create order out of complex phenomena, and simple models are well suited for communication to our colleagues and the wider world.
Rule 7: Provide Assessments of Variability
Nearly all biological measurements, when repeated, exhibit substantial variation, and this creates uncertainty in the result of every calculation based on the data. A basic purpose of statistical analysis is to help assess uncertainty, often in the form of a standard error or confidence interval, and one of the great successes of statistical modeling and inference is that it can provide estimates of standard errors from the same data that produce estimates of the quantity of interest. When reporting results, it is essential to supply some notion of statistical uncertainty. A common mistake is to calculate standard errors without taking into account the dependencies among data or variables, which usually means a substantial underestimate of the real uncertainty. See Rule 8 .
Remember that every number obtained from the data by some computation would change somewhat, even if the measurements were repeated on the same biological material. If you are using new material, you can add to the measurement variability an increase due to the natural variability among samples. If you are collecting data on a different day, in a different lab, or under a slightly changed protocol, there are now three more potential sources of variability to be accounted for. In microarray analysis, batch effects are well known to introduce extra variability, and several methods are available to filter these. Extra variability means extra uncertainty in the conclusions, and this uncertainty needs to be reported. Such reporting is invaluable for planning the next investigation.
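One generic way to supply such an assessment is sketched below: a bootstrap standard error and 95% confidence interval for a mean, computed on simulated data. It is only one option among many, and it still assumes the observations are independent (see Rule 8).

```python
# Hedged sketch: bootstrap standard error and percentile confidence interval for a mean.
import numpy as np

rng = np.random.default_rng(6)
data = rng.normal(10, 2, size=40)               # stand-in for repeated measurements

boot_means = np.array([
    rng.choice(data, size=len(data), replace=True).mean()
    for _ in range(5000)
])

print(f"Estimate: {data.mean():.2f}")
print(f"Bootstrap SE: {boot_means.std(ddof=1):.2f}")
print(f"95% CI: ({np.percentile(boot_means, 2.5):.2f}, "
      f"{np.percentile(boot_means, 97.5):.2f})")
```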
It is a very common feature of big data that uncertainty assessments tend to be overly optimistic (Cox [ 10 ], Meng [ 11 ]). For an instructive, and beguilingly simple, quantitative analysis most relevant to surveys, see the “data defect” section of [ 11 ]. Big data is not always as big as it looks: a large number of measurements on a small number of samples requires very careful estimation of the standard error, not least because these measurements are quite likely to be dependent.
Rule 8: Check Your Assumptions
Every statistical inference involves assumptions, which are based on substantive knowledge and some probabilistic representation of data variation—this is what we call a statistical model. Even the so-called “model-free” techniques require assumptions, albeit less restrictive assumptions, so this terminology is somewhat misleading.
The most common statistical methods involve an assumption of linear relationships. For example, the ordinary correlation coefficient, also called the Pearson correlation, is a measure of linear association. Linearity often works well as a first approximation or as a depiction of a general trend, especially when the amount of noise in the data makes it difficult to distinguish between linear and nonlinear relationships. However, for any given set of data, the appropriateness of the linear model is an empirical issue and should be investigated.
In many ways, a more worrisome, and very common, assumption in statistical analysis is that multiple observations in the data are statistically independent. This is worrisome because relatively small deviations from this assumption can have drastic effects. When measurements are made across time, for example, the temporal sequencing may be important; if it is, specialized methods appropriate for time series need to be considered.
In addition to nonlinearity and statistical dependence, missing data, systematic biases in measurements, and a variety of other factors can cause violations of statistical modeling assumptions, even in the best experiments. Widely available statistical software makes it easy to perform analyses without careful attention to inherent assumptions, and this risks inaccurate, or even misleading, results. It is therefore important to understand the assumptions embodied in the methods you are using and to do whatever you can to understand and assess those assumptions. At a minimum, you will want to check how well your statistical model fits the data. Visual displays and plots of data and residuals from fitting are helpful for evaluating the relevance of assumptions and the fit of the model, and some basic techniques for assessing model fit are available in most statistical software. Remember, though, that several models can “pass the fit test” on the same data. See Rule 1 and Rule 6 .
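As a hedged illustration, the sketch below fits a simple linear model to simulated data and runs two basic checks on the residuals; in practice these numbers would be accompanied by plots of residuals against fitted values.

```python
# Sketch: fit a linear model, then check residual normality and serial independence.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson
from scipy import stats

rng = np.random.default_rng(7)
x = rng.uniform(0, 10, 80)
y = 2 + 0.5 * x + rng.normal(0, 1, 80)          # data that happen to be linear

model = sm.OLS(y, sm.add_constant(x)).fit()
residuals = model.resid

_, p_resid = stats.shapiro(residuals)           # normality of residuals
print(f"R-squared: {model.rsquared:.2f}")
print(f"Shapiro-Wilk on residuals: p = {p_resid:.3f}")
print(f"Durbin-Watson (independence check): {durbin_watson(residuals):.2f}")
```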
Rule 9: When Possible, Replicate!
Every good analyst examines the data at great length, looking for patterns of many types and searching for predicted and unpredicted results. This process often involves dozens of procedures, including many alternative visualizations and a host of numerical slices through the data. Eventually, some particular features of the data are deemed interesting and important, and these are often the results reported in the resulting publication.
When statistical inferences, such as p-values, follow extensive looks at the data, they no longer have their usual interpretation. Ignoring this reality is dishonest: it is like painting a bull’s eye around the landing spot of your arrow. This is known in some circles as p-hacking, and much has been written about its perils and pitfalls: see, for example, [ 12 ] and [ 13 ].
Recently there has been a great deal of criticism of the use of p-values in science, largely related to the misperception that results can’t be worthy of publication unless “p is less than 0.05.” The recent statement from the American Statistical Association (ASA) [ 14 ] presents a detailed view of the merits and limitations of the p-value.
Statisticians tend to be aware of the most obvious kinds of data snooping, such as choosing particular variables for a reported analysis, and there are methods that can help adjust results in these cases; the False Discovery Rate method of Benjamini and Hochberg [ 15 ] is the basis for several of these.
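A brief sketch of the Benjamini–Hochberg adjustment applied to a hypothetical list of p-values from many exploratory comparisons:

```python
# Sketch: false discovery rate control with the Benjamini-Hochberg procedure.
from statsmodels.stats.multitest import multipletests

p_values = [0.001, 0.008, 0.020, 0.041, 0.049, 0.120, 0.380, 0.740]

reject, p_adjusted, _, _ = multipletests(p_values, alpha=0.05, method="fdr_bh")

for p, p_adj, keep in zip(p_values, p_adjusted, reject):
    label = "discovery" if keep else "not significant"
    print(f"p = {p:.3f} -> adjusted p = {p_adj:.3f} ({label})")
```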
For some analyses, there may be a case that some kinds of preliminary data manipulation are likely to be innocuous. In other situations, analysts may build into their work an informal check by trusting only extremely small p -values. For example, in high energy physics, the requirement of a “5-sigma” result is at least partly an approximate correction for what is called the “look-elsewhere effect.”
The only truly reliable solution to the problem posed by data snooping is to record the statistical inference procedures that produced the key results, together with the features of the data to which they were applied, and then to replicate the same analysis using new data. Independent replications of this type often go a step further by introducing modifications to the experimental protocol, so that the replication will also provide some degree of robustness to experimental details.
Ideally, replication is performed by an independent investigator. The scientific results that stand the test of time are those that get confirmed across a variety of different, but closely related, situations. In the absence of experimental replications, appropriate forms of data perturbation can be helpful (Yu [ 16 ]). In many contexts, complete replication is very difficult or impossible, as in large-scale experiments such as multi-center clinical trials. In such cases, a minimum standard would be to follow Rule 10.
Rule 10: Make Your Analysis Reproducible
In our current framework for publication of scientific results, the independent replication discussed in Rule 9 is not practical for most investigators. A different standard, which is easier to achieve, is reproducibility: given the same set of data, together with a complete description of the analysis, it should be possible to reproduce the tables, figures, and statistical inferences. However, even this lower standard can face multiple barriers, such as different computing architectures, software versions, and settings.
One can dramatically improve the ability to reproduce findings by being very systematic about the steps in the analysis (see Rule 5 ), by sharing the data and code used to produce the results, and by following Goodman et al. [ 17 ]. Modern reproducible research tools like Sweave [ 18 ], knitr [ 19 ], and iPython [ 20 ] notebooks take this a step further and combine the research report with the code. Reproducible research is itself an ongoing area of research and a very important area that we all need to pay attention to.
Mark Twain popularized the saying, “There are three kinds of lies: lies, damned lies, and statistics.” It is true that data are frequently used selectively to give arguments a false sense of support. Knowingly misusing data or concealing important information about the way data and data summaries have been obtained is, of course, highly unethical. More insidious, however, are the widespread instances of claims made about scientific hypotheses based on well-intentioned yet faulty statistical reasoning. One of our chief aims here has been to emphasize succinctly many of the origins of such problems and ways to avoid the pitfalls.
A central and common task for us as research investigators is to decipher what our data are able to say about the problems we are trying to solve. Statistics is a language constructed to assist this process, with probability as its grammar. While rudimentary conversations are possible without good command of the language (and are conducted routinely), principled statistical analysis is critical in grappling with many subtle phenomena to ensure that nothing serious will be lost in translation and to increase the likelihood that your research findings will stand the test of time. To achieve full fluency in this mathematically sophisticated language requires years of training and practice, but we hope the Ten Simple Rules laid out here will provide some essential guidelines.
Among the many articles reporting on the ASA’s statement on p-values, we particularly liked a quote from biostatistician Andrew Vickers in [ 21 ]: “Treat statistics as a science, not a recipe.” This is a great candidate for Rule 0.
Acknowledgments
We consulted many colleagues informally about this article, but the opinions expressed here are unique to our small committee of authors. We’d like to give a shout out to xkcd.com for conveying statistical ideas with humor, to the Simply Statistics blog as a reliable source for thoughtful commentary, to FiveThirtyEight for bringing statistics to the world (or at least to the media), to Phil Bourne for suggesting that we put together this article, and to Steve Pierson of the American Statistical Association for getting the effort started.
References
- 2. Cox DR, Donnelly CA (2011) Principles of Applied Statistics. Cambridge: Cambridge University Press.
- 3. Leek JT (2015) The Elements of Data Analytic Style. Leanpub, https://leanpub.com/artofdatascience .
- 4. Peng R (2014) The Art of Data Science. Leanpub, https://leanpub.com/artofdatascience .
- 5. Kass RE, Eden UT, Brown EN (2014) Analysis of Neural Data. New York: Springer.
- 11. Meng XL (2014) A trio of inference problems that could win you a Nobel prize in statistics (if you help fund it). In: Lin X, Genest C, Banks DL, Molenberghs G, Scott DW, Wang J-L, editors. Past, Present, and Future of Statistical Science. Boca Raton: CRC Press. pp. 537–562.
- 13. Aschwanden C (2015) Science isn't broken. August 11, 2015. http://fivethirtyeight.com/features/science-isnt-broken/
- 16. Yu B (2015) Data wisdom for data science. April 13, 2015. http://www.odbms.org/2015/04/data-wisdom-for-data-science/
- 18. Leisch F (2002) Sweave: Dynamic generation of statistical reports using data analysis. In: Härdle W, Rönz H, editors. Compstat: Proceedings in Computational Statistics. Heidelberg: Springer-Verlag. pp. 575–580.
- 19. Xie Y (2014) Dynamic Documents with R and knitr. Boca Raton: CRC Press.
On Being a Scientist: A Guide to Responsible Conduct in Research: Third Edition (2009)
Chapter: The Treatment of Data
The Treatment of Data

In order to conduct research responsibly, graduate students need to understand how to treat data correctly. In 2002, the editors of the Journal of Cell Biology began to test the images in all accepted manuscripts to see if they had been altered in ways that violated the journal's guidelines. About a quarter of the papers had images that showed evidence of inappropriate manipulation. The editors requested the original data for these papers, compared the original data with the submitted images, and required that figures be remade to accord with the guidelines. In about 1 percent of the papers, the editors found evidence for what they termed "fraudulent manipulation" that affected conclusions drawn in the paper, resulting in the papers' rejection.

Researchers who manipulate their data in ways that deceive others, even if the manipulation seems insignificant at the time, are violating both the basic values and widely accepted professional standards of science. Researchers draw conclusions based on their observations of nature. If data are altered to present a case that is stronger than the data warrant, researchers fail to fulfill all three of the obligations described at the beginning of this guide. They mislead their colleagues and potentially impede progress in their field or research. They undermine their own authority and trustworthiness as researchers. And they introduce information into the scientific record that could cause harm to the broader society, as when the dangers of a medical treatment are understated.

This is particularly important in an age in which the Internet allows for an almost uncontrollably fast and extensive spread of information to an increasingly broad audience. Misleading or inaccurate data can thus have far-reaching and unpredictable consequences of a magnitude not known before the Internet and other modern communication technologies.

Misleading data can arise from poor experimental design or careless measurements as well as from improper manipulation. Over time, researchers have developed and have continually improved methods and tools designed to maintain the integrity of research. Some of these methods and tools are used within specific fields of research, such as statistical tests of significance, double-blind trials, and proper phrasing of questions on surveys. Others apply across all research fields, such as describing to others what one has done so that research data and results can be verified and extended.

Because of the critical importance of methods, scientific papers must include a description of the procedures used to produce the data, sufficient to permit reviewers and readers of a scientific paper to evaluate not only the validity of the data but also the reliability of the methods used to derive those data. If this information is not available, other researchers may be less likely to accept the data and the conclusions drawn from them. They also may be unable to reproduce accurately the conditions under which the data were derived.

The best methods will count for little if data are recorded incorrectly or haphazardly. The requirements for data collection differ among disciplines and research groups, but researchers have a fundamental obligation to create and maintain an accurate, accessible, and permanent record of what they have done in sufficient detail for others to check and replicate their work. Depending on the field, this obligation may require entering data into bound notebooks with sequentially numbered pages using permanent ink, using a computer application with secure data entry fields, identifying when and where work was done, and retaining data for specified lengths of time. In much industrial research and in some academic research, data notebooks need to be signed and dated by a witness on a daily basis.

Unfortunately, beginning researchers often receive little or no formal training in recording, analyzing, storing, or sharing data. Regularly scheduled meetings to discuss data issues and policies maintained by research groups and institutions can establish clear expectations and responsibilities.
The Selection of Data

Deborah, a third-year graduate student, and Kamala, a postdoctoral fellow, have made a series of measurements on a new experimental semiconductor material using an expensive neutron test at a national laboratory. When they return to their own laboratory and examine the data, a newly proposed mathematical explanation of the semiconductor's behavior predicts results indicated by a curve.

During the measurements at the national laboratory, Deborah and Kamala observed electrical power fluctuations that they could not control or predict were affecting their detector. They suspect the fluctuations affected some of their measurements, but they don't know which ones.

When Deborah and Kamala begin to write up their results to present at a lab meeting, which they know will be the first step in preparing a publication, Kamala suggests dropping two anomalous data points near the horizontal axis from the graph they are preparing. She says that due to their deviation from the theoretical curve, the low data points were obviously caused by the power fluctuations. Furthermore, the deviations were outside the expected error bars calculated for the remaining data points. Deborah is concerned that dropping the two points could be seen as manipulating the data. She and Kamala could not be sure which of their data points, if any, were affected by the power fluctuations. They also did not know if the theoretical prediction was valid. She wants to do a separate analysis that includes the points and discuss the issue in the lab meeting. But Kamala says that if they include the data points in their talk, others will think the issue important enough to discuss in a draft paper, which will make it harder to get the paper published. Instead, she and Deborah should use their professional judgment to drop the points now.

1. What factors should Kamala and Deborah take into account in deciding how to present the data from their experiment?
2. Should the new explanation predicting the results affect their deliberations?
3. Should a draft paper be prepared at this point?
4. If Deborah and Kamala can't agree on how the data should be presented, should one of them consider not being an author of the paper?
Most researchers are not required to share data with others as soon as the data are generated, although a few disciplines have adopted this standard to speed the pace of research. A period of confidentiality allows researchers to check the accuracy of their data and draw conclusions.

However, when a scientific paper or book is published, other researchers must have access to the data and research materials needed to support the conclusions stated in the publication if they are to verify and build on that research. Many research institutions, funding agencies, and scientific journals have policies that require the sharing of data and unique research materials. Given the expectation that data will be accessible, researchers who refuse to share the evidentiary basis behind their conclusions, or the materials needed to replicate published experiments, fail to maintain the standards of science.

In some cases, research data or materials may be too voluminous, unwieldy, or costly to share quickly and without expense. Nevertheless, researchers have a responsibility to devise ways to share their data and materials in the best ways possible. For example, centralized facilities or collaborative efforts can provide a cost-effective way of providing research materials or information from large databases. Examples include repositories established to maintain and distribute astronomical images, protein sequences, archaeological data, cell lines, reagents, and transgenic animals.

New issues in the treatment and sharing of data continue to arise as scientific disciplines evolve and new technologies appear. Some forms of data undergo extensive analysis before being recorded; consequently, sharing those data can require sharing the software and sometimes the hardware used to analyze them. Because digital technologies are rapidly changing, some data stored electronically may be inaccessible in a few years unless provisions are made to transport the data from one platform to another. New forms of publication are challenging traditional practices associated with publication and the evaluation of scholarly work.
The scientific research enterprise is built on a foundation of trust. Scientists trust that the results reported by others are valid. Society trusts that the results of research reflect an honest attempt by scientists to describe the world accurately and without bias. But this trust will endure only if the scientific community devotes itself to exemplifying and transmitting the values associated with ethical scientific conduct.
On Being a Scientist was designed to supplement the informal lessons in ethics provided by research supervisors and mentors. The book describes the ethical foundations of scientific practices and some of the personal and professional issues that researchers encounter in their work. It applies to all forms of research, whether in academic, industrial, or governmental settings, and to all scientific disciplines.
This third edition of On Being a Scientist reflects developments since the publication of the original edition in 1989 and a second edition in 1995. A continuing feature of this edition is the inclusion of a number of hypothetical scenarios, along with guidance for thinking about and discussing them.
On Being a Scientist is aimed primarily at graduate students and beginning researchers, but its lessons apply to all scientists at all stages of their scientific careers.