Ecological Indicators

A review of water quality index models and their use for assessing surface water quality.

  • Twenty-one different WQI models were identified and reviewed.
  • Rivers are by far the most common application of WQI models.
  • Most models comprised four key components, the specifics of which varied significantly.
  • Uncertainty and eclipsing problems are key issues affecting model accuracy.

Water Quality Research Journal


Statistical tools for water quality assessment and monitoring in river ecosystems – a scoping review and recommendations for data analysis


Stefan G. Schreiber, Sanja Schreiber, Rajiv N. Tanna, David R. Roberts, Tim J. Arciszewski; Statistical tools for water quality assessment and monitoring in river ecosystems – a scoping review and recommendations for data analysis. Water Quality Research Journal 1 February 2022; 57 (1): 40–57. doi: https://doi.org/10.2166/wqrj.2022.028


Robust scientific inference is crucial to ensure evidence-based decision making. Accordingly, the selection of appropriate statistical tools and experimental designs is integral to achieving accuracy in data analysis. Environmental monitoring of water quality has become increasingly common and widespread as a result of technological advances, leading to an abundance of datasets. We conducted a scoping review of the water quality literature and found that correlation and linear regression are by far the most used statistical tools. However, the accuracy of inferences drawn from ordinary least squares (OLS) techniques depends on a set of assumptions, most prominently: (a) independence among observations, (b) normally distributed errors, (c) equal variances of errors, and (d) balanced designs. Environmental data, however, often exhibit temporal and spatial dependencies as well as unbalanced designs, which makes OLS techniques unsuitable for providing valid statistical inferences. Generalized least squares (GLS), linear mixed-effect models (LMMs), and generalized linear mixed-effect models (GLMMs), as well as Bayesian data analyses, have been developed to better tackle these problems. Recent progress in the development of statistical software has made these approaches more accessible and user-friendly. We provide a high-level summary and practical guidance for those statistical techniques.

Correlation and linear regression are commonly used to assess water quality data.

Environmental data, however, are often characterized by temporal and spatial dependency structures, which makes ordinary least squares techniques inappropriate.

Generalized least squares, linear mixed, and generalized linear mixed-effect models, as well as Bayesian techniques, may be more suitable for such data.

INTRODUCTION

Anthropogenic activity influences the biological, chemical, and physical components of the environment. Environmental monitoring systematically measures these components over time to determine if changes have occurred or are occurring (Yoccoz et al. 2001). Among the many measurement outcomes used in monitoring, such as atmospheric deposition (Horb et al. in press) or wildlife health (Roberts et al. in press), ambient water quality monitoring is intended to characterize broader chemical changes and enable the identification of potential risks to water resources, including their ability to support aquatic life or their suitability for consumption, recreation, or other uses (Eckner 1998). While any identified changes can provide context for the evaluation of more localized monitoring activities such as end-of-pipe compliance monitoring (Walker et al. 2002) or help evaluate the influence of diffuse exposure pathways (Arciszewski et al. in press), the incorporation of water quality into a monitoring program also has other advantages. Water samples can easily be obtained, and an increasing number of reliable analysis methods enable a wide range of chemical analytes to be quantified. This has led, in some cases, to the accumulation of large and often publicly available datasets (e.g., Open Government 2017). These datasets are typically evaluated via statistical analyses and comparison with relevant jurisdictional water quality guidelines including, but not limited to, local watershed or sub-watershed management framework targets and limits, provincial limits, and federal limits (e.g., Glozier et al. 2018).

While some purpose-driven data collection and analysis have clear techniques and criteria for assigning significance, such as measurements of E. coli in surface waters (Eckner 1998), water quality analyses can also include undirected analyses to detect unknown, changing, or unusual conditions (Wintle et al. 2010). Similarly, data may be routinely collected in water quality programs without a specific statistical evaluation framework developed a priori. Additionally, water quality programs can continue for decades, increasing the likelihood of changes in sampling methods, frequency, locations, and analytical precision, which may limit the use of statistical analyses (Helsel et al. 2020). As with any data, analyzing water quality data requires careful consideration of both what the investigators hope to learn from the data and the state of the dataset itself.

Despite the potential challenges, most water quality research primarily utilizes conventional statistical tools based on the method of ordinary least squares (OLS). Techniques using OLS are popular because of their statistical power, familiarity, analytical solvability, and computational simplicity (McElreath 2020). OLS techniques, however, can also be quite restrictive because of their statistical assumptions, which are often challenging to satisfy in environmental and field-based datasets. Those assumptions are (a) independence among observations (independence assumption), (b) normally distributed errors (normality assumption), (c) equal variances of errors (homoscedasticity assumption), and (d) a balanced design (identical sample size among factor levels) in the case of Analysis of Variance (ANOVA). When these assumptions cannot easily be satisfied, applying more appropriate methodologies, including mixed-effect models, can facilitate more effective decision-making by minimizing potential uncertainty in the results of environmental data analyses.

In this work, we quantified the usage of specific statistical methods for the analysis of water quality data via a scoping review of literature. This review informs subsequent discussion around statistical assumptions and the selection of appropriate methods where those assumptions are violated. We further provide a high-level summary and practical guidance for those statistical techniques using a real-world water quality dataset. Analyses were carried out in the R Language for Statistical Computing (R Core Team 2021) and all scripts are provided in Supplementary Material S1.

SCOPING REVIEW

In this study, we conducted a scoping review of literature to quantify the number of statistical techniques used in water quality (WQ) research and identify patterns over time. In contrast to a full systematic review, a scoping review identifies the available literature on a topic in a systematic way but does not attempt to appraise the quality of the identified studies and their methodology (Arksey & O'Malley 2005; Grant & Booth 2009; Peters et al. 2020).

The research question for the scoping review was ‘What statistical methods have been commonly used in analysis of water quality data?’, and conversely, ‘What suitable methods have not been used?’. Primary studies that focused on investigating or monitoring water quality in ecosystems were identified by conducting a search using a pre-established list of statistical techniques and keywords. Eighteen statistical methods served as the basis for the initial identification of studies: Analysis of Variance (ANOVA), Bayesian Analysis, Cluster Analysis, Control Charts, Correlation, Correspondence Analysis, Factor Analysis, Kruskal–Wallis, Machine Learning, Mann–Kendall, Mann–Whitney, Generalized Linear Mixed-Effect Models (GLMMs), Linear Mixed-Effect Models (LMMs), Principal Component Analysis (PCA), Non-metric Multidimensional Scaling (NMDS), Regression, Simulation and Forecasting, and t-test. WQ studies that did not employ any of these methods were reviewed for additional or alternate methods, which were the basis for separate follow-up searches of the water quality literature. The Web of Science Core Collection database and Google Scholar were used to conduct this review, covering a time frame from Jan 1, 1990 to Sep 27, 2020.

Frequencies of each statistical approach were plotted to determine the most common techniques for assessment of water quality in river ecosystems (Figure 1(a)). Many articles used more than one statistical approach (e.g., Pearson correlation in conjunction with linear regression analysis), and, in these cases, a single paper is counted in multiple categories. Of 24,819 identified references that assessed WQ in water ecosystems, 11,024 (44.4%) mentioned using at least one statistical approach. To address any potential bias, we randomly selected 580 (∼4%) of the 13,795 remaining articles, reviewed their full texts, and identified new statistical methods (Figure 1(b)). All references were reviewed by one reviewer and verified for inclusion by a second independent reviewer.
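
The screening counts above can be cross-checked with a few lines of arithmetic. This is only a consistency check on the reported figures; the variable names are ours and are not part of the original analysis scripts.

```python
# Counts reported in the scoping review
total_refs = 24_819       # references that assessed water quality
with_method = 11_024      # mentioned at least one statistical approach
sampled = 580             # randomly selected for full-text review

remaining = total_refs - with_method
share_with_method = 100 * with_method / total_refs
share_sampled = 100 * sampled / remaining

print(remaining)                    # 13795
print(round(share_with_method, 1))  # 44.4
print(round(share_sampled, 1))      # 4.2
```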

Figure 1. Number of papers identified by the scoping review, grouped by statistical techniques. (a) Classified papers through title and abstract search; (b) full-text search of a random sample of 580 papers that did not mention statistical techniques in the title and abstract.

The four most used statistical approaches were various simulation and forecasting methods (N = 5,374, 34.4%), correlation (N = 3,701, 23.7%), linear regression (N = 1,727, 11.1%), and PCA (N = 1,258, 8.1%), which together accounted for 77.3% of the entire classified literature (Figure 1). Of the simulation techniques, the most commonly used method was the Soil and Water Assessment Tool (SWAT), a watershed-scale model with a continuous daily time step (Neitsch et al. 2011; Shawul et al. 2013; Worku et al. 2017), which accounted for 19.8% of all identified simulation-specific papers.

Multivariate ordination techniques, including correspondence analysis, factor analysis, and cluster analysis, were used in 10.1% (N = 1,577) of publications (Figure 1). Our review recorded the first use of Bayesian methods to analyze WQ in 1993 (Varis et al. 1993) in a study considering different analytical techniques for a WQ forecasting system. In the 20 years that followed, only 83 papers mentioned using Bayesian analysis. In the 8 years since then, 197 papers were identified, indicating that Bayesian approaches have become more frequently used, with the number of published papers incorporating this family of methods steadily increasing (Figure 2).

Figure 2. Number of papers identified by technique and year over the total number of papers recorded for the year. Techniques grouped by analytical categories.

Multilevel models are another group of statistical methods whose popularity has steadily been increasing in published WQ studies, particularly in recent years (Figure 2). This family of analyses includes popular methods such as linear mixed-effect models (LMMs) and generalized linear mixed-effect models (GLMMs). Multilevel models were first introduced by Laird & Ware (1982) to address the incorrect analysis of datasets with common statistical procedures such as the t-test or ANOVA when certain statistical assumptions, such as statistical independence among data points, are not met. Hawkins et al. (2010) described this problem and noted that, while ‘replication is a critical component in any type of assessment, its use in ecological assessment and monitoring has differed in important ways from its use in classic controlled experiments’. In ecological field experiments, replicate samples taken within a site are often regarded as independent measures, but they are in fact pseudoreplicates (Hawkins et al. 2010). Consequently, analyzing such data using a t-test or similar approach inflates the likelihood of observing a statistically significant difference in means (increasing the probability of a Type I error). This phenomenon of pseudoreplication was defined by Hurlbert (1984) as ‘the use of inferential statistics to test for treatment effects with data from experiments where either treatments are not replicated or experimental units are not statistically independent’. By comparison, multilevel models avoid this problem because they do not require balanced data and they allow explicit modeling and analysis of between- and within-individual variation.
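
The Type I error inflation caused by pseudoreplication can be illustrated with a small simulation. The sketch below is ours and rests on several assumptions: two groups with no true difference, four sites per group, five replicates per site, and hypothetical standard deviations for site and replicate effects. It compares a naive t-test on all replicates with a t-test on site means, which is used here only as the simplest remedy for illustration (multilevel models, as noted above, do not require such averaging). The critical values are standard two-sided 5% values of the t distribution.

```python
import random
import statistics

def t_statistic(a, b):
    """Two-sample pooled-variance t statistic."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * statistics.variance(a) +
              (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    return (statistics.mean(a) - statistics.mean(b)) / (pooled * (1 / na + 1 / nb)) ** 0.5

def simulate_group(rng, n_sites=4, n_reps=5, site_sd=1.0, rep_sd=0.5):
    """Return replicate values grouped by site; the true group mean is 0."""
    sites = []
    for _ in range(n_sites):
        site_effect = rng.gauss(0.0, site_sd)
        sites.append([site_effect + rng.gauss(0.0, rep_sd) for _ in range(n_reps)])
    return sites

rng = random.Random(1)
n_sim = 2000
T_CRIT_DF38 = 2.024  # two-sided 5% critical value, 38 df (20 + 20 - 2)
T_CRIT_DF6 = 2.447   # two-sided 5% critical value, 6 df (4 + 4 - 2)

naive_hits = site_mean_hits = 0
for _ in range(n_sim):
    g1, g2 = simulate_group(rng), simulate_group(rng)
    # Naive analysis: treat all 20 replicates per group as independent
    flat1 = [v for site in g1 for v in site]
    flat2 = [v for site in g2 for v in site]
    naive_hits += abs(t_statistic(flat1, flat2)) > T_CRIT_DF38
    # Site means: one value per independent sampling unit
    m1 = [statistics.mean(site) for site in g1]
    m2 = [statistics.mean(site) for site in g2]
    site_mean_hits += abs(t_statistic(m1, m2)) > T_CRIT_DF6

naive_rate = naive_hits / n_sim
site_mean_rate = site_mean_hits / n_sim
print(f"naive: {naive_rate:.2f}, site means: {site_mean_rate:.2f}")
```

Because both groups share the same true mean, every rejection is a false positive. Under these assumed settings the naive analysis rejects far more often than the nominal 5%, whereas the analysis of site means stays close to it.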

The first use of linear multilevel models (LMMs) for water quality analysis in our paper sample was by Tate et al. (2003), who looked at the effect of cattle feces distributions on water quality. Similar to the use of Bayesian approaches, multilevel models began to notably increase in popularity from 2012 onward, with only 10 WQ studies using the method between 2003 and 2012.

STATISTICAL METHODS AND APPROACHES

An understanding of how to select the most appropriate statistical methods is critical if practitioners are to make informed decisions. When inappropriate or suboptimal methods are applied to even the most robust datasets, the consequences may include drawing false conclusions or missing environmentally critical changes. To facilitate an improved understanding by practitioners, we briefly describe the difference between traditional or frequentist statistics and Bayesian statistics (Section 3.1). We then highlight key statistical assumptions within the traditional statistical framework and what other tools are available if these assumptions are not sufficiently met (Section 3.2). We further provide resources in the form of required functions and references to run those methods in the R Language for Statistical Computing (R Core Team 2021). Lastly, we use a publicly available water quality dataset to demonstrate the use case for more flexible statistical approaches such as multilevel models over traditional tools such as t-tests, ANOVA, and simple linear regression techniques (Section 4).

Frequentist vs. Bayesian statistics

In contrast to frequentist statistics, the goal of Bayesian statistics is to estimate a probability distribution of unknown parameters given the observed data and some prior knowledge about those parameters. Incorporating prior knowledge into the calculation of the distribution of parameters is the defining feature of Bayesian statistics, and a key argument for why this approach could produce more realistic evaluations of data (i.e., because our prior knowledge is seldom zero, as is assumed in frequentist analysis). In Bayesian statistics, parameters are considered random variables and can therefore be modeled probabilistically and displayed in a distribution function, unlike in frequentist statistics, where parameters are assumed to be fixed. Bayesian statistics can answer questions such as ‘What is the probability that dissolved metal concentrations fall below a certain threshold level in a river ecosystem?’ Written probabilistically, we are asking ‘What is the probability of our hypothesis given the data, or P(hypothesis|data)?’ This is not possible in frequentist statistics, where we can only make statements about the probability of the observed data given a hypothesis, i.e., P(data|hypothesis). In most cases, attaching a probability to a hypothesis, such as ‘there is a 70% probability that metal concentrations are above threshold levels’, feels more intuitive than attaching a probability to the observed data, such as ‘there is a 5% probability or less of observing our data if we assume metal concentrations are within threshold levels’ (frequentist interpretation of probability). The fact that the Bayesian interpretation of probability feels more intuitive may also explain the often-documented misinterpretation of frequentist p-values (Kruschke & Liddell 2018). Whether researchers choose frequentist or Bayesian analysis should depend on the research context and questions and not simply on preference.
For an in-depth discussion about Bayesian data analyses using R, see Kruschke (2015) and McElreath (2020). We also provide a conceptual explanation of the Bayesian analysis approach with a simple dataset in Supplementary Material S2.
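
As a minimal sketch of how P(hypothesis|data) can be obtained, the example below uses a grid approximation for a binomial exceedance rate. It is not taken from Supplementary Material S2; the data (3 exceedances in 10 samples), the flat prior, and all variable names are hypothetical choices made for illustration.

```python
# Hypothetical data: 3 of 10 samples exceeded a guideline value
exceed, n = 3, 10

grid = [i / 1000 for i in range(1001)]            # candidate exceedance rates
prior = [1.0] * len(grid)                         # flat prior (an assumption)
# Binomial likelihood up to a constant factor, which cancels on normalization
likelihood = [p**exceed * (1 - p)**(n - exceed) for p in grid]
unnormalized = [l * pr for l, pr in zip(likelihood, prior)]
total = sum(unnormalized)
posterior = [u / total for u in unnormalized]     # P(rate | data) on the grid

post_mean = sum(p * w for p, w in zip(grid, posterior))
prob_below_half = sum(w for p, w in zip(grid, posterior) if p < 0.5)
print(round(post_mean, 3), round(prob_below_half, 3))
```

The quantity `prob_below_half` is a direct probability statement about the parameter given the data, which is the kind of statement described above. Replacing the flat prior with an informative one changes only the `prior` line.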

Implications of statistical assumptions on method selection

Linear regression is a well-known statistical technique routinely used to analyze data. Linear regression models a mean outcome or response variable as a linear function of, or conditioned on, one or more predictor variables using a straight line of best fit. While many methods are available to describe best-fit lines, the most common is the OLS method. The mathematical foundation of OLS regression is the minimization of the sum of squared differences (or residuals) between the observed data and the regression line. Similar to other statistical techniques, OLS regression relies on assumptions. The method of least squares provides unbiased parameter estimates and valid inference if the errors are independent (uncorrelated) and normally distributed around the regression line with a mean of zero and constant variance. While diagnostic methods to evaluate statistical assumptions can be numerical, graphical methods are generally considered more versatile and informative (Faraway 2016). Graphical diagnosis is usually done by inspecting the residuals for any obvious patterns, which may reflect uncaptured systematic bias (Zuur et al. 2010; Zuur & Ieno 2016).
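
For a single predictor, the least squares solution has a closed form. The following sketch computes it on a small hypothetical dataset; the data and variable names are ours, and the example is meant only to make the minimized quantity (the residual sum of squares) concrete.

```python
# Hypothetical data for illustration
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Closed-form OLS estimates for a single predictor
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
rss = sum(r ** 2 for r in residuals)   # the quantity OLS minimizes

print(round(slope, 2), round(intercept, 2))  # 1.99 0.05
```

Plotting `residuals` against the fitted values is the graphical check referred to above: any visible pattern suggests that an assumption is violated.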

While OLS models are constrained by assumptions that must be tested, they can be powerful tools when assumptions are met and can be easily calculated. For environmental scientists, however, field-collected data often limit the use of simple linear regression techniques. Field data often fail to meet these assumptions because measurements could not be taken, because measurements taken in close proximity to each other are not independent, because the response variable does not follow a normal distribution or is not linearly related to the predictors, or because of any combination of the three. In such cases, using linear regression techniques could lead to biased parameter estimates and conclusions. The implications of violating the assumptions of constant variance, normality, linearity, and independence are discussed in Sections 3.2.1 through 3.2.4, respectively, together with more appropriate techniques for analyzing water quality data.

Violation of the constant variance assumption

Whenever the assumption of constant variance (homoscedasticity) is violated, but the other assumptions hold, generalized least squares (GLS) techniques represent a more appropriate alternative. Constant variance is assumed across all values/levels of the predictor variable(s). Lack of homoscedasticity may result in biased standard errors and p-values. GLS allows errors to be correlated and/or to have unequal variances by explicitly modeling the within-group correlation and/or heteroscedasticity structure, and therefore results in more accurate p-values. While it is possible to use variance-stabilizing transformations to satisfy the constant variance assumption in OLS models, this is generally not recommended since potentially important characteristics of the sampled data are removed (Bolker et al. 2009; Zuur et al. 2009; Feng et al. 2014). Instead, choosing a statistical tool suited to the data is preferable. In R, GLS models can be run using the gls() function of the nlme package (Pinheiro et al. 2020).
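
In matrix form, the difference between OLS and GLS can be summarized as follows. The notation is ours and is not taken from the original article: X is the design matrix, β the coefficient vector, and Ω a matrix describing the unequal variances and/or correlations among errors, which in practice has to be estimated from the data.

```latex
\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}, \qquad
\operatorname{Var}(\boldsymbol{\varepsilon}) = \sigma^{2}\boldsymbol{\Omega}

\hat{\boldsymbol{\beta}}_{\mathrm{OLS}}
  = (\mathbf{X}^{\top}\mathbf{X})^{-1}\mathbf{X}^{\top}\mathbf{y}
\qquad
\hat{\boldsymbol{\beta}}_{\mathrm{GLS}}
  = (\mathbf{X}^{\top}\boldsymbol{\Omega}^{-1}\mathbf{X})^{-1}
    \mathbf{X}^{\top}\boldsymbol{\Omega}^{-1}\mathbf{y}
```

When Ω is the identity matrix (independent errors with equal variances), the GLS estimator reduces to the OLS estimator.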

Violation of the assumption of normality

Quite often in environmental research, the collected sample data do not appear to be normally distributed at each value or at each level of the predictor variable. In fact, we often have only one observation at each value of a continuous predictor variable and cannot say anything about the normality assumption. Hence, in order to assess whether the data meet this assumption, we need to fit a model first and then inspect the distribution of the pooled residuals. Quite frequently those distributions look left or right skewed, or show other ‘non-normal’ attributes. Sometimes transforming the data (i.e., performing a mathematical operation on the data, such as taking the logarithm) can force data to comply with OLS model assumptions; however, by altering the original scale of the data, important information about the sample may be removed and any subsequent statistical tests are only valid on the transformed scale (Feng et al. 2014). It can also make the interpretation of results cumbersome and unintuitive (e.g., reporting significant differences in ‘log transformed metal concentration’). In some cases, data transformations may even be incalculable, such as when many zeros are part of the data (Bolker et al. 2009).
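
Two of the drawbacks of log transformation mentioned above can be shown in a few lines. The concentrations below are hypothetical; the point is only that a mean computed on the log scale back-transforms to the geometric mean, not the arithmetic mean, and that a zero value cannot be log-transformed at all.

```python
import math

conc = [0.1, 0.2, 0.4, 3.2]   # hypothetical concentrations

arithmetic_mean = sum(conc) / len(conc)
# Back-transforming the mean of the logs gives the geometric mean
geometric_mean = math.exp(sum(math.log(c) for c in conc) / len(conc))
print(round(arithmetic_mean, 3), round(geometric_mean, 3))  # 0.975 0.4

# A single zero makes the log transformation undefined
try:
    math.log(0.0)
    log_of_zero_defined = True
except ValueError:
    log_of_zero_defined = False
print(log_of_zero_defined)  # False
```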

Generalized linear models (GLMs) offer an alternative as they accommodate response variables with non-normal error distributions (Nelder & Wedderburn 1972; McCullagh & Nelder 1989; Faraway 2016; Dobson & Barnett 2018). The functional differences between OLS models and GLMs are (a) the response variable in GLMs can take on any distribution that is a member of the exponential family of probability distributions and (b) GLMs use a so-called link function, which describes how the mean of that response distribution and a linear combination of the predictors are related (Faraway 2016).

Two very common GLMs are known as logistic regression, when observed data are binary, and Poisson regression, when observed data represent counts. Another very useful GLM, especially when data are continuous with no upper bound but truncated at zero, such as in concentrations, heights, and weights, is the gamma GLM. The gamma distribution is a very flexible continuous probability distribution that accommodates data with a wide variety of shapes. In R, GLMs can be run using the glm() function.
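
The role of the link function can be sketched as follows. The logit and log links shown here are the standard links for logistic and Poisson regression, and the log link is also a common (though not the canonical) choice for gamma GLMs. The coefficients and the predictor value are hypothetical.

```python
import math

def logit(mu):
    """Link for logistic regression: maps a probability to the real line."""
    return math.log(mu / (1 - mu))

def inv_logit(eta):
    return 1 / (1 + math.exp(-eta))

def log_link(mu):
    """Link for Poisson regression; also a common choice for gamma GLMs."""
    return math.log(mu)

def inv_log(eta):
    return math.exp(eta)

# Hypothetical linear predictor: eta = b0 + b1 * x
b0, b1, x = -1.0, 0.5, 3.0
eta = b0 + b1 * x

print(round(inv_logit(eta), 3))  # mean on the probability scale, in (0, 1)
print(round(inv_log(eta), 3))    # mean on the count/concentration scale, > 0
```

The linear predictor can take any real value, while the inverse link guarantees that the modeled mean stays within the range that is valid for the response distribution.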

Lastly, another very useful technique is called beta regression. Beta regression can accommodate datasets containing proportions and percentages and as such provides support for data that naturally fall between zero and one. In R, beta regression can be run using the betareg() function of the betareg package (Cribari-Neto & Zeileis 2010).

Violation of the assumption of linearity

It is common in nature for observations to not follow linear trends. In these cases, neither OLS models nor GLMs are appropriate analytical tools, and non-linear least squares (Baty et al. 2015) or generalized additive modeling represent suitable alternatives (Zuur & Ieno 2016; Stasinopoulos et al. 2017; Wood 2017). Generalized additive models (GAMs) are GLMs in which the linear predictor includes smooth functions of one or more covariates. Smoothing functions allow relationships within GAMs to be sinuous or wiggly, and thus permit great flexibility for modeling non-linear relationships. The risk when using non-linear or generalized additive models, however, comes in the form of overfitting, i.e., when too many smoothing functions are added into the model. The general problem with overfitting is that the model becomes less and less useful in generalizing the results and predicting future observations. In R, GAMs can be run, for example, using the mgcv package (Wood 2017).
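
Written out, the structure of a GAM differs from that of a GLM only in the predictor. The notation below is ours: g is the link function, and f_1, f_2, ... are smooth functions estimated from the data.

```latex
\text{GLM:}\quad g\big(\mathrm{E}[y_i]\big) = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \dots

\text{GAM:}\quad g\big(\mathrm{E}[y_i]\big) = \beta_0 + f_1(x_{1i}) + f_2(x_{2i}) + \dots
```

Overfitting corresponds to smooth functions that are allowed to become too wiggly, so that they follow noise in the sample rather than the underlying relationship.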

Violation of the assumption of independence

Very often in field-based or environmental research the assumption of independence cannot be satisfied. In this case, multilevel models are the most appropriate choice as they allow us to account for temporally and spatially repeated measurements as well as for unbalanced datasets, i.e., when sample sizes differ among groups or factor levels. Furthermore, they enable assessments of variation among plants, animals, ecosystems, and other groupings or clusters, and do not require data averaging before conducting the analysis. Multilevel regression models are arguably among the most useful models in science. In fact, McElreath (2020) argues that ‘[…] multilevel regression deserves to be the default form of regression. Papers that do not use multilevel models should have to justify not using a multilevel approach’ (p. 15). Multilevel models such as LMMs and GLMMs require the inclusion of random effects and optionally fixed effects (Pinheiro & Bates 2000; Bolker et al. 2009). Fixed effects are those variables for which parameters such as the mean or the slope are estimated, while random effects are grouping terms that account for the dependency structure in the data and estimate the variation associated with those groups. In R, such models can be run, for example, using the lme4 package (Bates et al. 2015) with the functions lmer() and glmer(), or the glmmTMB package (Brooks et al. 2017) with the glmmTMB() function. The glmmTMB() function also allows us to specify various covariance structures to deal with autocorrelation (https://cran.r-project.org/web/packages/glmmTMB/vignettes/covstruct.html).
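
The simplest multilevel model, a linear mixed-effect model with a random intercept, can be written as below. The notation is ours: i indexes observations, j indexes groups (for example, sites or sampling dates), β_0 and β_1 are fixed effects, and u_j is the random effect of group j.

```latex
y_{ij} = \beta_0 + \beta_1 x_{ij} + u_j + \varepsilon_{ij}, \qquad
u_j \sim \mathcal{N}(0, \sigma_u^{2}), \qquad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^{2})
```

Observations from the same group share the same u_j, which is how the model represents their dependence. The share of total variance attributable to groups is σ_u² / (σ_u² + σ²).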

Other challenges in the analyses of water quality data

While solutions to violations of statistical assumptions are available, other challenges can also affect the availability of techniques. A common problem in water quality analyses is the occurrence of censored observations and multiple censoring limits (Helsel 2006, 2012). Some approaches to overcome these limits include non-parametric techniques, but robust approaches that can address all possible constraints common in water quality datasets are not widely available, although they can be combined with other techniques to identify changes in environmental indicators (Arciszewski et al. 2018). Similar to the selection of techniques described above, the use of techniques for censored data is based on the attributes of the data and the objectives of the investigators.

PRACTICAL EXAMPLE ON SURFACE WATER QUALITY IN THE ATHABASCA OIL SANDS

To demonstrate the issues with respect to statistical assumptions when analyzing environmental data, we chose a multi-year, multi-station surface water quality monitoring project along the Athabasca River in northeastern Alberta. The goal of this monitoring project was to assess surface water quality (including spatial and temporal trends) in relation to potential impacts of oil-sands mining activities (Glozier et al. 2018). Multiple water samples were taken at six sampling stations along the Athabasca River (Figure 3) over an 8-year period. The experimental design could be described as a crossed-nested multilevel design with six sampling stations along the Athabasca River sampled over a time period from 2011 to 2018 (i.e., crossed) and sampling dates (month-day) nested in years. Our hypothetical research question was whether there are significant differences in vanadium (V) concentrations along the sampling stations and among years. In the Oil Sands Region, the mobilization of V is associated with both industrial development and natural exposures (Glozier et al. 2018; Gopalapillai et al. 2019), which makes V a suitable and useful example for illustrating the approaches advocated here.

Figure 3. Locations of water quality sampling stations (M2–M7) monitored in the Mainstem Athabasca Water Quality Program, as part of the joint Canada-Alberta governments’ Oil Sands Monitoring program.

The data are publicly available (https://data.ec.gc.ca/data/substances/monitor/surface-water-quality-oil-sands-region/mainstem-water-quality-oil-sands-region). From there we downloaded the following file: ‘MainstemWaterQuality-LTWQM-M2-M7-Metals45-EPA-2011-2018-v2’ and prepared it for data analysis in R (R Core Team 2021). All R scripts for data preparation, descriptive statistics, and the statistical analyses can be found in Supplementary Material S1. All analyses were conducted in R version 4.1.2 (2021-11-01) – ‘Bird Hippie’ with packages updated on Jan 4, 2022.

The first step in our example is to visualize the raw data (Figure 4). Visualizing the raw data can help determine which statistical analyses might be most appropriate and whether there are any underlying problems in the data. Figure 4(a) shows the data points across years and grouped by sampling stations. To quickly check whether the data show any signs of seasonality, we colored the points by quarter, i.e., Jan to Mar, Apr to Jun, Jul to Sep, and Oct to Dec. This information could be helpful at the data analysis stage, for example by identifying additional model terms that potentially explain variation due to seasonality and hence improve the overall fit of the model. In the context of water quality in rivers, such variables could be water displacement and water depth. These data suggest that a generic variable, such as season or quarter, can be used to account for effects of flow where no stream gauges are present.

Figure 4. Raw data visualization of vanadium concentrations in the Athabasca River.

Figure 4(b) visualizes the data using boxplots. Boxplots are an effective way to check the variation and shape of the data. At this point, some expected patterns emerge from the boxplots, such as increases in concentrations between M2 and M7 in 2016, but it also becomes clear that this dataset has various problems such as missing data, skewed data, and unbalanced data (see also Table 1). Figure 4(c) shows yet another visualization of the raw data, known as density curves. Density curves can help identify skewness, multimodality, and other issues with respect to the data distribution and hence can help choose the appropriate distribution for the final statistical model.

Table 1. Number of sampling dates by year and sampling station

Year    M2    M3    M4    M5    M6    M7    Totals (years)
2011 
2012 10 28 
2013 32 
2014 37 
2015 31 
2016 21 
2017 10 31 
2018 
Totals (location)    19    44    35    33    19    41

Note: Within each sampling date, multiple transect measurements were taken.

Lastly, since the data show nesting structures (e.g., sampling dates nested in years) as well as multiple measurements along a transect (a potential violation of the assumption of independent sampling), it becomes clear that analyzing this dataset using an OLS approach, e.g., ANOVA or linear regression, is not advisable. Instead, multilevel models should be employed (Zuur et al. 2009). For our example, we decided to use a GLMM since it can account (a) for the nesting structure, (b) for the dependency structure, and (c) for the non-normal distribution of the data. For this model, we used the gamma distribution since it is naturally truncated at zero, which is appropriate for concentration data that cannot be negative. One frequently observed problem when using OLS techniques on data that cannot be negative, e.g., biomass, height, or concentrations, is that confidence intervals can extend into the negative. This is not possible when using the gamma distribution. Another benefit of the gamma distribution is its ability to model data ranging from nearly symmetric to highly right-skewed.
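
The problem of confidence intervals extending below zero is easy to reproduce. The sketch below uses hypothetical right-skewed concentrations and a standard t-based interval for the mean. For contrast, it also computes an interval on the log scale and back-transforms it. This is only a stand-in for a positivity-preserving approach (it describes the geometric mean) and is not the gamma GLMM used in this example.

```python
import math
import statistics

conc = [0.05, 0.1, 0.1, 0.2, 1.5]   # hypothetical, right-skewed concentrations
T_CRIT_DF4 = 2.776                  # two-sided 5% critical value, 4 df

def t_interval(values):
    """Normal-theory 95% confidence interval for the mean."""
    m = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(len(values))
    return m - T_CRIT_DF4 * se, m + T_CRIT_DF4 * se

# Interval on the raw scale: nothing prevents a negative lower limit
raw_lower, raw_upper = t_interval(conc)

# Interval on the log scale, back-transformed: always positive
log_lower, log_upper = t_interval([math.log(c) for c in conc])
pos_lower, pos_upper = math.exp(log_lower), math.exp(log_upper)

print(raw_lower < 0)   # True
print(pos_lower > 0)   # True
```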

Model output for the generalized linear mixed-effect model using the glmmTMB() function

Concentration
Predictors  Estimates  CI  p
(Intercept) 0.14 0.07–0.29  .  
station id [M3] 1.06 0.82–1.37 0.65 
station id [M4] 1.14 0.85–1.53 0.375 
station id [M5] 1.05 0.78–1.42 0.744 
station id [M6] 1.13 0.83–1.55 0.434 
station id [M7] 1.02 0.78–1.32 0.899 
yr [2012] 1.59 0.82–3.08 0.168 
yr [2013] 2.16 1.12–4.18  .  
yr [2014] 1.88 0.97–3.62 0.061 
yr [2015] 1.8 0.92–3.51 0.086 
yr [2016] 2.77 1.40–5.47  .  
yr [2017] 2.07 1.05–4.06  .  
yr [2018] 1.37 0.66–2.82 0.399 
 
 0.21 
N  191 
Observations 588 

CI = 95% confidence interval; p = p-value.

Significant p-values are indicated in bold (α = 0.05).
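As a reading aid for the table above, the following sketch shows how the estimates could be combined into a predicted mean concentration. It assumes, which the text does not state explicitly, that the estimates are reported on the response scale of a log-link model, so that the intercept is the expected concentration at the reference levels (station M2, year 2011) and every other estimate is a multiplicative ratio. The function name `predicted_mean` is hypothetical.

```python
# Estimates copied from the model output table above.
# ASSUMPTION: values are on the response scale of a log-link model, so
# effects multiply; the reference levels (M2, 2011) have an implicit ratio of 1.
INTERCEPT = 0.14

STATION_RATIO = {
    "M2": 1.0, "M3": 1.06, "M4": 1.14, "M5": 1.05, "M6": 1.13, "M7": 1.02,
}

YEAR_RATIO = {
    2011: 1.0, 2012: 1.59, 2013: 2.16, 2014: 1.88,
    2015: 1.80, 2016: 2.77, 2017: 2.07, 2018: 1.37,
}


def predicted_mean(station, year):
    """Predicted mean concentration for one station and year (fixed effects only)."""
    return INTERCEPT * STATION_RATIO[station] * YEAR_RATIO[year]


for station, year in [("M2", 2011), ("M4", 2016), ("M7", 2018)]:
    print(station, year, round(predicted_mean(station, year), 3))
```

Under this reading, a year estimate of 2.77 means that the expected concentration in 2016 is roughly 2.8 times that of the reference year at the same station.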

The model results were significant for Year but not for Station (Table 3). The next step is to check the fit and usefulness of the model given the underlying data. This assessment was done using the DHARMa package (Hartig 2021), which uses a simulation-based approach to create readily interpretable scaled (quantile) residuals for fitted (generalized) linear mixed models. Plotting the output of the simulateResiduals() function produces two panels. The left panel is a QQ plot to detect overall deviations from the expected distribution; if the chosen distribution provided a perfect fit, all data points would fall on the red line. The right panel plots the residuals against the predicted values. If no residual patterns are detected, the model captures the systematic variation in the data and leaves only random error. If residual patterns are present (indicated by red wiggly lines), removing or adding model terms (fixed effects and/or random effects) can help capture unexplained systematic variation. Figure 5 shows the residual plot for our model. Both the QQ plot and the plot of residuals vs. predicted values showed significant deviations from what would be expected if the model adequately described the data. This needs to be kept in mind when judging the usefulness of the model and when interpreting significant differences and p-values.
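The idea behind simulation-based scaled residuals can be sketched in a few lines. This is a simplified illustration of the general principle, not the DHARMa implementation: for each observation, new responses are simulated from the fitted model, and the residual is the fraction of simulated values that fall below the observed value. If the model is adequate, these residuals are approximately uniform between 0 and 1. The gamma parameters, the `simulate_gamma` helper, and the data are hypothetical.

```python
import random


def scaled_residuals(observed, simulate, n_sim=250, rng=None):
    """Return one quantile residual per observation.

    `simulate(index, rng)` must draw one new response for observation `index`
    from the fitted model. The residual is the share of simulated responses
    that are smaller than the observed response.
    """
    rng = rng or random.Random(0)
    residuals = []
    for index, y in enumerate(observed):
        simulated = [simulate(index, rng) for _ in range(n_sim)]
        below = sum(s < y for s in simulated)
        residuals.append(below / n_sim)
    return residuals


# Hypothetical "fitted model": gamma responses with a common mean and shape.
SHAPE, MEAN = 2.0, 0.3


def simulate_gamma(_index, rng):
    # random.gammavariate takes (shape, scale); the mean is shape * scale.
    return rng.gammavariate(SHAPE, MEAN / SHAPE)


# Data generated from the same model, so the residuals should look uniform.
data_rng = random.Random(42)
observed = [data_rng.gammavariate(SHAPE, MEAN / SHAPE) for _ in range(500)]

residuals = scaled_residuals(observed, simulate_gamma, rng=random.Random(1))
print("mean residual:", round(sum(residuals) / len(residuals), 3))
```

A QQ plot of such residuals against the uniform distribution corresponds to the left diagnostic panel; systematic departures from the diagonal indicate that the assumed distribution or model structure does not match the data.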

Analysis of deviance table with Chi-Square score, degrees of freedom and p -value for the fixed effect terms, station_id and year (yr)

Analysis of Deviance Table (Type II Wald Chi-Square tests)
Response: concentration
Chisq  df  Pr(>Chisq)
station_id 1.613 0.8997 
yr 28.098  .  

Residual diagnostic plots. Please refer to the online version of this paper to see this figure in colour: https://doi.org/10.2166/wqrj.2022.028.


The next step was to calculate mean values based on the fitted model using estimated marginal means, which are also known as lsmeans (least squares means) in other statistical software packages. In R, estimated marginal means were calculated using the emmeans() function of the emmeans package (Lenth 2020). Letters were then added to the mean values using the cld() function of the multcomp package to indicate significant differences for the factor Year (Figure 6).

Output of the generalized linear model fit using the glmmTMB() function. Estimated marginal means for Station (a) and Year (b). Error bars represent 95% confidence intervals.


Next, we also fitted the same model as a Bayesian model using the brm() function of the brms R package (Bürkner 2017, 2018; Table 4 and Figure 7). The brms package provides straightforward tools for fitting Bayesian models, following the same model specification syntax as lme4 (Bates et al. 2015) or glmmTMB (Brooks et al. 2017). Since this is an illustrative example only, we used flat (uninformative) priors for this analysis; hence both models resulted in almost identical outcomes. Table 5 shows the default priors used by the brm() function given the model specification, which are flat for the regression coefficients. This information can be requested using the get_prior() function and modified in accordance with the investigators' prior beliefs. Furthermore, choosing minimally informative priors over flat priors is generally recommended when running a Bayesian analysis (Lemoine 2019). Model validation in a Bayesian context is done by inspecting the posterior distributions and MCMC (Markov chain Monte Carlo) chains as well as by conducting posterior predictive checks. Functions for these model checks, such as plot() and pp_check(), are available in the brms package.

Model output for the Bayesian multilevel model using the brm() function

Concentration
Predictors  Estimates  CI (95%)
Intercept 0.14 0.07–0.30 
station_id: M3 1.05 0.81–1.41 
station_id: M4 1.14 0.82–1.58 
station_id: M5 1.05 0.76–1.43 
station_id: M6 1.14 0.82–1.66 
station_id: M7 1.01 0.76–1.31 
yr: yr2012 1.59 0.81–3.14 
yr: yr2013 2.17 1.12–4.31 
yr: yr2014 1.89 0.96–3.71 
yr: yr2015 1.81 0.94–3.62 
yr: yr2016 2.69 1.38–5.73 
yr: yr2017 2.06 1.04–4.16 
yr: yr2018 1.38 0.65–3.03 
 
 0.02 
 0.01 
ICC 0.76 
N  191 
Observations 588 
Marginal /Conditional  0.177/0.906 

CI=95% credible interval.

Default priors of the fitted model

Prior  class  coef  Group  source
(flat)   default 
(flat) station_idM3  (vectorized) 
(flat) station_idM4  (vectorized) 
(flat) station_idM5  (vectorized) 
(flat) station_idM6  (vectorized) 
(flat) station_idM7  (vectorized) 
(flat) yr2012  (vectorized) 
(flat) yr2013  (vectorized) 
(flat) yr2014  (vectorized) 
(flat) yr2015  (vectorized) 
(flat) yr2016  (vectorized) 
(flat) yr2017  (vectorized) 
(flat) yr2018  (vectorized) 
student_t(3, −1.3, 2.5) Intercept   default 
student_t(3, 0, 2.5) sd   default 
student_t(3, 0, 2.5) sd  unique_id (vectorized) 
student_t(3, 0, 2.5) sd Intercept unique_id (vectorized) 
gamma(0.01, 0.01) shape   default 

The brms package does not require priors to be specified. If none are given, it uses default priors for the specified model. This output can be generated using the get_prior() function.

Output of the Bayesian generalized linear model fit using the brm() function. Posterior predictive means for Station ID (a) and Year (b). Error bars represent the 95% credible intervals. Note: credible intervals are a Bayesian version of the frequentist confidence intervals but are not interpreted in the same way.


In both analyses, we found similar point estimates and 95% confidence/credible intervals for vanadium concentration for sampling station and year (Figures 6 and 7). The frequentist approach suggested that there are significant differences between sampling years but not between sampling stations (Figure 6). The Bayesian analysis does not provide frequentist p-values or significance tests. Instead, it generates posterior probability distributions for all model terms, which describe the plausible vanadium concentrations given the data and prior information. This allows the researcher to make probabilistic statements about hypotheses given the data, such as ‘what is the probability of observing a specific metal concentration for a given sampling station?’. This is in contrast to frequentist statistics, where statements are made about the probability of observing the collected data given a null hypothesis (see Section 3.1). It should also be noted that confidence and credible intervals are interpreted differently. In a frequentist setting, a single confidence interval either contains the true population parameter or it does not. A 95% confidence level means that, on average, 95 out of 100 confidence intervals calculated from identically repeated experiments will contain the true population parameter (see simulation in Supplementary Material S3). The Bayesian 95% credible interval, on the other hand, contains the vanadium concentration at a given sampling station with 95% probability, given the data, the model, and the prior. This is possible since Bayesian analysis treats parameters as random variables and computes posterior probability distributions, which include all possible values of that parameter given the data. To learn more about how to report Bayesian data analyses in scientific publications, please refer to the Bayesian analysis reporting guidelines by Kruschke (2021).
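The repeated-experiment interpretation of a confidence interval can be checked with a small simulation. The sketch below is not the simulation from Supplementary Material S3; it is a minimal stand-in that assumes normally distributed data with a known standard deviation, and the parameter values and the `coverage` function name are hypothetical.

```python
import math
import random
import statistics


def coverage(true_mean=1.0, sigma=0.5, n=30, repeats=2000, seed=1):
    """Share of 95% confidence intervals that contain the true mean.

    Each repeat draws a fresh sample of size n, builds a z-interval around
    the sample mean (sigma is treated as known), and checks whether the
    interval covers true_mean.
    """
    rng = random.Random(seed)
    z = statistics.NormalDist().inv_cdf(0.975)
    half_width = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(repeats):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        sample_mean = statistics.fmean(sample)
        if sample_mean - half_width <= true_mean <= sample_mean + half_width:
            hits += 1
    return hits / repeats


print("empirical coverage:", coverage())
```

The empirical coverage should land close to 0.95. Any single interval from one of the repeats either covers the true mean or misses it, which is why the 95% figure describes the procedure rather than one particular interval.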

Lastly, we would also note that there may be cases where datasets simply cannot be meaningfully analyzed even using the most sophisticated inferential statistical techniques due to various problems with the underlying data collection. In such instances, a thorough descriptive statistical approach, including raw data visualizations, may still be very useful in extracting important information.

In this study, we conducted a scoping review of literature to quantify the number of statistical techniques used in water quality (WQ) research and identify patterns over time. We further built on the results and highlighted useful statistical techniques for water quality assessment with simple guidance for practitioners.

The most frequently used statistical approaches included simulation and forecasting methods, correlation, linear regression, and PCA. Simulation techniques do not provide direct information on water quality, but rather provide results and trends in the context of water processes and the functioning of water systems. Similarly, PCA provides descriptive rather than inferential information and is most commonly used in exploratory data analysis and as a tool for dimensionality reduction. Correlation and regression may appear to be more appealing techniques for inferring water quality; however, field data are typically neither normally distributed nor independent. As a result, the estimates and conclusions may be biased.

Consequently, multilevel models, which do not require balanced data and allow explicit modelling of between- and within-individual variation, are more appropriate for datasets in which traditional statistical assumptions are not met, a situation frequently observed in ecological research. Moreover, Bayesian inference may be suitable for decision making, especially when prior information is available on the datasets being analyzed.

When inappropriate or suboptimal methods are applied to even the most robust datasets, consequences may include drawing false conclusions, including missing environmentally critical triggers or changes. In agreement with recommendations of McElreath (2020) , we suggest that multilevel models in ecological datasets, including water quality analyses, should be the default statistical approach.

The authors acknowledge the financial support of the Oil Sands Monitoring Program (OSM). While this work was funded by OSM, it does not necessarily reflect the position of the Program. We thank Eleanor Stern for abstract and full-text screening for the scoping review. We also want to thank Dr Florian Hartig for comments on an earlier version of this manuscript.

All relevant data are included in the paper or its Supplementary Information.


An Introduction to water quality analysis

Ritabrata Roy

ESSENCE – International Journal for Environmental Rehabilitation and Conservation

Related papers

Onuoha Ejikeme Destiny, 2022

Water, as a universal solvent, can dissolve many substances, including organic and inorganic compounds. Because of this property, water is practically never found in its pure form. Water quality generally refers to the components of water being present at optimum levels for the growth of plants and animals. Aquatic organisms need a healthy environment and adequate nutrients to grow, and productivity depends on the physicochemical characteristics of the water body. Maximum productivity can be obtained only when the physical and chemical parameters are at optimum levels. Water for human consumption must be free from organisms and chemical substances at concentrations that may affect health. Water pollution has increased with human population growth, industrialization, the use of fertilizers in agriculture, and other human activities. Parameters such as temperature, turbidity, nutrients, hardness, alkalinity, and dissolved oxygen are among the important factors that determine the growth of living organisms in a water body. Hence, water quality assessment involves the analysis of physico-chemical, biological, and microbiological parameters that reflect the biotic and abiotic status of the ecosystem.

At present, river water has effectively become wastewater because of the waste discharged by the cities through which it flows. Most existing wastewater treatment plants are overloaded because of unexpectedly rapid urbanization and changes in lifestyle. Water quality needs to be analyzed at regular intervals because contaminated water gives rise to water-borne diseases that affect human health and aquatic life. Parameters that may be tested include temperature, pH, DO, turbidity, COD, BOD, and heavy metals. Heavy metals such as Pb, Cr, and Hg produce chronic poisoning in aquatic life. Heavy metals naturally occur in water only in very small amounts.

Even though most of the earth's surface is covered by water, only a very small amount is usable, which makes the resource limited. This precious and limited resource must therefore be used with care. The study was carried out by taking water samples from the Kulfo River, Hare River, Abaya Lake, and Chamo Lake. The experiment analyzed selected physico-chemical parameters, namely potassium, chloride, acidity, TSS, TDS, TS, pH, alkalinity, and salinity, using different approaches. According to the study, the Abaya Lake sampling site was characterized by pH 8.5, alkalinity of 278 mg/L, and acidity of 177 mg/L. The Chamo Lake sampling site was characterized by pH 9.1, alkalinity of 471 mg/L, and acidity of 130 mg/L. The Kulfo River sampling site was characterized by pH 8.4, alkalinity of 188 mg/L, and acidity of 117 mg/L. The Hare River sampling site was characterized by pH 7.8, alkalinity of 105 mg/L, and acidity of 94 mg/L. The analysis of TS, TSS, and TDS has employed usin...

Water is a liquid at ambient conditions and it often co-exists on Earth with its solid state ice & gaseous state water vapors or steam. Water covers 70.9% of the earth’s surface and is vital for all forms of life. Water on earth moves continually through a cycle of evaporation or transpiration, precipitation and runoff, ultimately reaching the sea. Water is an essential requisite for life for human beings, animals or plants. The present research work is to analyze the health of the water bodies of the two districts of West Bengal (Murshidabad and Birbhum) in order to make the people aware that consumption of polluted water may cause morbidity and mortality not only to them but to the ecosystem as a whole.

People around the globe are under tremendous threat from undesired changes in the characteristics of air, water, and soil. Water is one of the vital needs of all living beings. It covers about 70% of the earth's surface and provides essential elements, but when polluted it may become dangerous to human health. Many infectious diseases are transmitted by water through the fecal-oral route. Humans need water for many daily activities such as drinking, washing, bathing, and cooking. The quality of water is usually described according to its physical, chemical, and biological characteristics. Chemical and physical pollution of water is important, but the deadliest pollutants in drinking water are of biological origin. Due to increased human population, industrialization, use of fertilizers, and other human activities, water is highly polluted with various harmful contaminants. Water quality should be checked at regular intervals because the use of contaminated water causes a variety of water-borne diseases. The availability of good-quality water is indispensable for preventing disease and improving quality of life.

The current analysis deals with the Water Quality Index (WQI) assessment of the Gostani Velpur stream, a tributary of the river Godavari. The WQI expresses the overall quality of water based on a number of water quality parameters. As part of the analysis, eleven parameters, viz., pH, electrical conductivity, total dissolved solids, total alkalinity, total hardness, calcium, magnesium, chlorides, nitrates, dissolved oxygen, and biological oxygen demand, were analysed and used to determine the WQI. The Brown WQI method was used to find the overall WQI. The WQI and the physicochemical parameters were found to be very high, exceeding the permissible range and indicating that the water is unsuitable for drinking. Constant monitoring is recommended to maintain the quality of the water supplies along the source canal, which is a major source of water for domestic and other purposes.

  • Open access
  • Published: 16 December 2020

Water quality assessment based on multivariate statistics and water quality index of a strategic river in the Brazilian Atlantic Forest

  • David de Andrade Costa 1 , 2 ,
  • José Paulo Soares de Azevedo 1 ,
  • Marco Aurélio dos Santos 1 &
  • Rafaela dos Santos Facchetti Vinhaes Assumpção 3  

Scientific Reports volume  10 , Article number:  22038 ( 2020 ) Cite this article


  • Environmental sciences

Fifty-four water samples were collected between July and December 2019 at nine monitoring stations and fifteen parameters were analysed to provide an updated diagnosis of the Piabanha River water quality. Further, forty years of monitoring were analysed, including government data and previous research projects. A georeferenced database was also built containing water management data. The Water Quality Index from the National Sanitation Foundation (WQI NSF) was calculated using two datasets and showed an improvement in overall water quality, although systematic violations of Brazilian standards persist. Principal components analysis (PCA) showed the parameters that contribute most to water quality and enabled their association with the main pollution sources identified in the geodatabase. PCA showed that sewage discharge is still the main pollution source. The cluster analysis (CA) made it possible to recommend an optimization of the monitoring network, thereby enabling the expansion of the monitoring to other rivers. Finally, the diagnosis provided by this research establishes the first step towards the Framing of water resources according to their intended uses, as established by the Brazilian National Water Resources Policy.


Introduction

Aquatic systems have been significantly affected by human activities causing water quality deterioration, decreasing water availability and reducing the carrying capacity of aquatic life 1 , 2 , 3 , 4 . Water quality deterioration still persists in developed countries, while it is a major problem in developing countries in which a substantial amount of sewage is discharged directly into rivers 5 , 6 , 7 , 8 . Moreover, according to UNEP 9 , water pollution has worsened since the 1990s in the majority of rivers in Latin America. The global concern with water availability and its quality has been growing, and it is estimated that the demand for water will increase between 20 and 30% by 2050 10 , 11 . In addition, spatial and temporal variations in the hydrological cycle and their uncertainties related to climate change may worsen this scenario 12 , 13 , 14 , 15 , 16 .

Monitoring water quality in order to assess its spatial and temporal variations is essential for water management and pollution control 17 . On the other hand, monitoring programs generate large data sets that require interpretation techniques 18 . There are a number of methods for water quality assessment, including single-factor, multi-index, fuzzy mathematics, grey system evaluation, artificial neural network, multi-criteria analysis, geographical interpolation and multivariate statistical approach 18 , 19 , 20 , 21 , 22 , 23 , 24 , 25 . Among them, the most used are the Water Quality Indexes (WQI) that transform a complex set of data into a single value indicative of water quality 26 , 27 and reflect its suitability for different uses 28 . Multivariate statistics is another widely used approach 29 , 30 , mainly with Principal Components Analysis (PCA) and Cluster Analysis (CA), helping to achieve a better understanding of the spatial and temporal dynamics of water quality.

A comparison of seven methods for assessing water quality indicated WQI as one of the best 20 . The assessment of Poyang Lake 28 , China and the upper Selenga River 31 , Mongolia showed that WQIs are suitable for the assessment of both interannual trends and seasonal variations 28 . Multivariate statistical techniques associated with WQI have been used for numerous water bodies world-wide, including the Nag River 30 , India, the Paraíba do Sul River 32 , Brazil, and the aforementioned Selenga River 31 . CA grouped the monitoring stations according to their similarities, while the PCA highlighted components that were related to their pollution sources 30 , 31 , 32 .

In order to ensure water quantity and quality, the Brazilian National Water Resources Policy 33 has established a management tool called Framework, according to the main intended uses of water. It has also created participatory management committees, the so-called Basin Committees, which, together with its technical agency, are responsible for the Framework establishment. Unfortunately, even after two decades, Brazil has had very few successful experiences on the subject 34 .

Brazil has a gigantic and complex hydrographic network present in many different ecosystems 34 . The Brazilian Atlantic Forest is one of the most biodiverse biomes on the planet 35 , 36 , extending along the Brazilian coast and currently covering only 11.4% of its original territory 37 under constant threats 38 , 39 , 40 . The hydrographic basin of the Paraíba do Sul river is located in this environment, which is the integration axis of the most industrialized Brazilian states, São Paulo, Rio de Janeiro and Minas Gerais, and home to around 6.2 million people 41 . A water transfer system regularly supplies another 9 million people in the metropolitan region of Rio de Janeiro, through the Guandu system. Another water transfer system connects the Paraíba do Sul river to the Cantareira system, complementing with 5 m³/s the water supply to over 9 million people in the metropolitan region of São Paulo 41 . These systems went through an intense water scarcity between 2014 and 2016 with severe impacts on water quality and availability 32 .

Our study is focused on the Piabanha River watershed, a strategic sub-basin of the Paraíba do Sul river, combining urban, industrial, rural characteristics, and large preserved fragments of Atlantic Forest 36 , 42 . The Piabanha Basin has been monitored for over 10 years with the Studies in Experimental and Representative Watersheds (EIBEX) project, a partnership between universities and government agencies 42 , 43 , 44 . The State Environmental Agency of Rio de Janeiro (INEA) has been monitoring the basin since 1980. Other studies in the region include the analysis of contamination by pesticides 45 , energy generation 46 and dispersion of pollutants 47 . The Piabanha Basin received international attention in Nature's article on biodiversity 36 . But in addition to forest preservation, can the Piabanha River support biodiversity? How is its water quality today? In this way, the Piabanha Basin Committee defined the Framework as a priority in its management plan (2018–2020) and to accomplish this goal, established water monitoring as a strategic action 48 .

Our study covers 40 years of monitoring, including government data, our research projects and, currently, a monitoring program that is being conducted with funding from the Piabanha Basin Committee. The main objectives were: (1) to carry out an updated diagnosis of water quality using multivariate techniques and WQI; (2) to examine the parameters that most influence water quality, and (3) to identify river stretches with similar water quality. Our study provides an extensive understanding of the Piabanha River and supports its Steering Committee in the application of public policies. This is a pilot project that can be a reference for other Framework programs for improving water quality in Brazil.

We have requested and received from INEA two water user databases of the Piabanha Basin. The first set corresponds to raw data from the National Water Resources Register (CNARH), with all the registrations until December 2017 and with 1549 registered interferences (water abstraction or effluent discharge). The second one is the registration validated by INEA until August 2018 by the Águas do Rio project comprising a total of 669 validated interferences. With these data, it was possible to build a georeferenced base. By so doing, it was possible to list the main effluent discharges by type for each monitoring station.

In the validated database, of the 669 interferences, 84% are water abstractions and 16% are effluent discharges. Water abstraction accounts for 425 m³ day⁻¹, with 75% from wells and 25% from rivers. Effluent discharges, on the other hand, amount to 89 m³ day⁻¹. The largest volume of effluents comes from the sanitation sector, with 57% of the total, whereas industries account for 33%, aquaculture for 4% and mining for 3% of discharges.

When comparing the two databases, it is clear that the universe of registered users is much larger than the universe of validated users, that is, those whose data were checked by the state environmental agency and who therefore received a license. For example, the validated database has only six interferences related to agriculture, in contrast to 789 interferences awaiting validation. This is a serious obstacle for water resources management in the region, which threatens the sustainability of water resources.

Short time monitoring and water quality index

In order to assess and compare the water quality of the Piabanha River, we calculated the Water Quality Index from the National Sanitation Foundation (WQI NSF) using two datasets, the first from 2012 and the second from 2019 (Tables 1 and 2). The 2012 results (Fig. 1A) oscillated between the bad and medium categories, with medium quality overall (50.5 ± 10.3). In 2019 (Fig. 1B), the results ranged between the medium and good categories, again with medium quality overall (61.6 ± 10.8).
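The aggregation step of a weighted water quality index can be sketched as follows. This is an illustration under stated assumptions, not the authors' calculation: the weights are the values commonly cited for the nine NSF parameters, the additive (weighted sum) form is assumed, the sub-index scores on a 0 to 100 scale are taken as given instead of being derived from rating curves, and the category boundaries are assumed values chosen to be consistent with the labels used in the text. All names in the code are hypothetical.

```python
# ASSUMPTION: commonly cited NSF parameter weights (they sum to 1.0).
WEIGHTS = {
    "dissolved_oxygen": 0.17,
    "fecal_coliforms": 0.16,
    "ph": 0.11,
    "bod": 0.11,
    "temperature_change": 0.10,
    "total_phosphate": 0.10,
    "nitrate": 0.10,
    "turbidity": 0.08,
    "total_solids": 0.07,
}

# ASSUMPTION: upper bounds of the quality categories on the 0-100 scale.
CATEGORIES = [(25, "very bad"), (50, "bad"), (70, "medium"), (90, "good"), (100, "excellent")]


def wqi_additive(sub_indices):
    """Weighted sum of sub-index scores (each score on a 0-100 scale)."""
    missing = set(WEIGHTS) - set(sub_indices)
    if missing:
        raise ValueError(f"missing sub-indices: {sorted(missing)}")
    return sum(WEIGHTS[name] * sub_indices[name] for name in WEIGHTS)


def category(wqi):
    """Map an index value to its quality category."""
    for upper, label in CATEGORIES:
        if wqi <= upper:
            return label
    raise ValueError("index value must not exceed 100")


# Hypothetical sub-index scores for one sample.
sample = {
    "dissolved_oxygen": 70, "fecal_coliforms": 20, "ph": 90, "bod": 40,
    "temperature_change": 85, "total_phosphate": 55, "nitrate": 80,
    "turbidity": 65, "total_solids": 75,
}
value = wqi_additive(sample)
print(round(value, 1), category(value))
```

Because fecal coliforms and BOD carry two of the larger weights, low scores on these two sub-indices pull the overall index down noticeably even when the remaining parameters score well.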

Figure 1. WQI NSF spatial variation over each station from July to December ( A ) 2012 and ( B ) 2019. WQI NSF seasonal variation over the entire length of the river ( C ) 2012 and ( D ) 2019. The entire dataset can be found online as Supplementary Tables S1 and S2, for 2012 and 2019 respectively.

Both datasets show significant seasonal behavior (p < 0.05) (Fig. 1C,D) between the end of the dry period (Jul, Aug, Sep) and the beginning of the rainy period (Oct, Nov, Dec) for the parameters DO, WT, pH, nitrate, phosphate and turbidity, while no significant seasonal difference (p > 0.05) was found for E. coli , BOD and TDS. The parameters with the greatest impact on the WQI NSF were coliforms and BOD. Ammonia and total phosphorus are not included in the WQI NSF calculation, but their concentrations violated Brazilian legislation, and their influence can be better understood through PCA.

Principal components and clusters analysis

The 2019 dataset (n = 48), comprising six monitoring campaigns at the eight monitoring stations along the Piabanha River with 15 parameters analysed, was grouped by the average value of each parameter at each station (n = 8). Pearson's correlation matrix is presented in Table 3; most parameters show a strong correlation (r > 0.5) at a confidence level of 95% (α = 0.05). The KMO measures of sampling adequacy (n = 8) were close to 0.5, and the significance level of the test of sphericity was less than 0.001, indicating that the correlation matrix is not an identity matrix, that the variables are significantly related, and that the data were therefore suitable for PCA. The Shapiro test confirmed data normality (p > 0.01) for all parameters except E. coli .

PCA was applied to identify groups of parameters that influence water quality. PC1, PC2 and PC3 account for 72% (eigenvalue 10.74), 14% (eigenvalue 13.94) and 5% (eigenvalue 0.8), respectively, of the data variance. Components with eigenvalues greater than one were selected, so the first two components, which together account for 86% of the total variance, were retained. The loadings that compose the first two components are presented in Table 4, and the stations that most influence the results are represented in Fig. 2A.

Figure 2. Multivariate techniques. (A) PCA plot with station scores and parameter loadings. (B) Hierarchical clustering by Ward linkage with Euclidean distance. The entire dataset can be found as Supplementary Table S2 online.

PC1 was substantially correlated with practically all parameters. Stations 1 to 4 loaded positively (loadings > 0.7) on PC1 with the parameters TDS, alkalinity, ammonia, total nitrogen, phosphate, total phosphorus, BOD, COD and E. coli , while stations 5 to 8 loaded negatively (loadings < −0.7) with nitrate, turbidity, SS, pH and WT. PC2 was most influenced by stations in the urban area, notably station 1, and showed a positive correlation (loadings > 0.5) with DO, COD and BOD, and a weaker one with SS (loading = 0.33). On the other hand, it was negatively correlated with E. coli (loading = −0.66), with a large influence of station 3.

The sampling stations were grouped into three statistically significant clusters with 75% similarity by agglomerative hierarchical clustering based on Ward linkage with Euclidean distance (Fig. 2B): cluster 1 (Stations 1, 4, 5 and 6), cluster 2 (Stations 2 and 3) and cluster 3 (Stations 7 and 8).

Long-term monitoring assessment based on Mann–Kendall rank test and Fourier transform

In a complementary way, to evaluate a possible trend in water quality and to detect the seasonal behavior of the basin, we used a time series spanning 40 years of monitoring. Since dissolved oxygen can be used as a surrogate variable for the general health of aquatic ecosystems 49, 50, 51, it was selected for the Mann–Kendall rank test of randomness at the most upstream and most downstream stations of the Piabanha River, PB002 and PB011, respectively. The upstream station showed a statistically significant increasing trend (n = 166, S = 1507, Z = 2.10, p < 0.03), whereas the downstream station showed no statistically significant trend (n = 198, S = 1179, Z = 1.27, p = 0.20). The entire datasets can be found as Supplementary Tables S3 and S4.

To detect the seasonal behavior, we applied a Fourier transform algorithm to the 1980–2019 time series from station PB011 (Fig. 3A), which does not display a trend and can be considered representative of the entire basin because it is the most downstream station. The data were organized as quarterly averages of DO. The two most powerful signals correspond to frequencies of approximately 0.25 and 0.45 (Fig. 3B), that is, to periods of about 12 and 6 months, respectively. Taking this seasonality into account, we confirmed that our 2019 field campaigns are representative of the seasonal cycle, since they cover the final half of the dry season and the initial half of the rainy season.

Figure 3. (A) Temporal distribution of dissolved oxygen from 1980 to 2019 at station PB011 (n = 160). (B) Periodogram. The entire dataset can be found in Supplementary Table S5.

Water quality assessment

The Piabanha River had better water quality in 2019 than in 2012, according to the WQI NSF results (Fig. 1). The improvement was substantial over the first 40 km, which were rated as “bad” in most campaigns in 2012 but as “medium” in most campaigns in 2019, owing to the expansion of the sewage collection and treatment system. Since 2012, Petrópolis has built 50 km of sewage collection network and 7 new sewage treatment units 52. These plants produce secondary-level effluents through biological treatment, and their flow capacity reaches about 800 L s⁻¹. They use different technologies, such as submerged aerated biofilters, anaerobic upflow reactors, moving bed biofilm reactors and upflow anaerobic sludge blanket reactors. In addition, biosystems are used in some plants 53. Water quality improved in the stretches beyond 40 km owing to self-purification processes and the contribution of clean tributaries. This is in line with findings from other rivers worldwide 31, 54, 55.

Dry seasons, in general, presented better water quality indexes than rainy seasons. Other studies 28, 56, 57 have shown similar seasonal behavior, in which water quality worsens in the rainy season because of the input of sediments and pollutants carried by the rain. In addition, most of the sewage network is the same network that collects rainwater. Thus, during rain events, sewage is no longer treated and is discharged directly into rivers.

Although the WQI NSF had a medium rating in 2019, BOD and coliforms were substantially above the maximum allowed by Brazilian regulation. In addition, the index is limited to the parameters used in its calculation 58. This is the case for ammonium, which presented concentrations up to three times higher than allowed by Brazilian regulation, whereas only nitrate is used in the WQI NSF . The same occurs with total phosphorus: only phosphate is considered in the index, although phosphate has no maximum value established by Brazilian federal regulation. In what follows, we analyse these parameters in more detail.

Biochemical Oxygen Demand (BOD) is one of the most widely used criteria for water quality assessment. It provides information on the readily biodegradable fraction of the organic load in water 59. High BOD concentrations reduce oxygen availability, mainly through microbiological activity 60. BOD ranged from 2.00 to 45 mg L⁻¹ (average 7.69 ± 7.52) over the entire dataset, and most of the time it was substantially above the maximum allowed by Brazilian regulation (5 mg L⁻¹). Escherichia coli is naturally present in the intestinal tracts of warm-blooded animals and is widely used as an indicator of fecal contamination 61, 62. Villas-Boas 42 pointed to fecal coliforms as the most relevant water quality parameter in the urban area of Petrópolis, mainly related to pollution caused by untreated domestic sewage.

Phosphorus is an essential nutrient for all forms of life 63. Its availability can be related to atmospheric deposition 64, to anthropic uses of products such as detergents 65 and to agricultural activities 66. Orthophosphates are the most relevant form in the aquatic environment, as they are the main form of phosphate assimilated by aquatic plants 67. Previous studies 42, 68, 69 in the Piabanha Basin found phosphate values in agreement with ours. Alvim 68 points out that the main source of phosphorus for the Piabanha River is sewage discharge and that the highest concentrations are found during the rainy season.

Nitrate is a very common constituent of surface water, since it is the end product of the aerobic decomposition of organic nitrogenous compounds 70, 71. Its sources are related to landscape composition, being influenced by both agricultural and urban uses 72. Villas-Boas 42 found high concentrations of nitrate and ammonium in the urban region of the Piabanha River, in agreement with this study. Alvim 68 reports that domestic sewage discharged into the Piabanha River accounts for 43% of the nitrogen load, the atmospheric contribution for 31% and farming activity for 15%.

The major contributors to water quality and stretches of river with similar water quality

The first two components together account for 86% of the total variance, indicating the high explanatory power of the method. This is higher than in similar studies around the world 29, 30, 71, 73, 74, 75. PC1 predominantly accounts for urban sewage pollution. This is shown by the fact that stations 1 to 4, located in the urban area of Petrópolis, loaded positively on PC1 with organic matter (BOD and COD), TDS and nutrients such as phosphorus and nitrogenous constituents, especially ammonia, indicating recent pollution. In addition, stations 5 to 8 loaded negatively with nitrate, reflecting the degradation of nitrogen compounds in the stretches downstream of the urban area. On the other hand, the increase in nitrate concentrations, in association with the increase in turbidity at stations outside the urban area, may also be associated with land use, especially agriculture.

PC2 is dominated by dissolved oxygen and by other parameters that indicate the health of the river, such as organic load and coliforms. It is explained by water pollution by organic matter and by biological activity, and it reinforces the result of PC1. In the study region, sanitation is still a challenge to be faced by the government, especially in the first urban stretch, after 40 km from the source of the Piabanha River, where 26% of the sewage remains untreated 53.

Cluster analysis was used to group sampling stations into similarity classes, indicating the stretches of river with similar water quality. As pointed out by Singh 29, this implies that a single site in each cluster may serve as well for the spatial assessment of water quality as the whole cluster. The number of sampling sites, and hence the cost, can therefore be reduced without losing any significance of the outcome. On the other hand, this interpretation should be made with caution, since trends in different stretches can be very different, so that future changes may become significant. Therefore, great care must be taken when reducing the number of monitoring stations.

It is important to note that the first cluster (S1, S6 and S4, S5) groups station 1 with station 6. Station 1 corresponds to the urban area of Petrópolis, whose pollution stems from sewage and industrial effluents. Likewise, station 6 is located after the confluence with the Preto-Paquequer River, which crosses Teresópolis, the second largest city in the hydrographic basin, also with economic and industrial activities. Sand mining is the predominant activity near stations 4 and 5, which together receive the impact of five mining companies. Similarly, station 6, after the Preto River, receives the impact of seven sand mines. In fact, this group brings together economic activities whose impact on water quality is similar. Station 5 could be removed from the monitoring network in order to reduce costs.

The second cluster (S2 and S3) refers to the most urbanized section of the basin. When the quality parameters of these stations are checked individually, one can conclude that they differ only by the diluting effect caused by the contribution of the Araras River, on the left bank, and of the Poço do Ferreira River, on the right bank, which receives its waters from the Bonfim River after its source in the Serra dos Órgãos National Park, an important federal conservation unit. Station 3 was introduced precisely to detect this diluting effect, but since the cluster analysis showed that it was not significant, it is recommended to remove this station.

The stations in the third cluster (S7 and S8) behave very similarly: station 8 is just before the Piabanha River mouth, and station 7 is located less than 10 km upstream of the mouth. In addition, only three interferences registered as discharges occur on this stretch. Thus, it is recommended to remove station 7, considering the importance of maintaining a station close to the river mouth.

Trend analysis and seasonal variation

Although it still presents systematic violations of Brazilian standards 76, the water quality of the Piabanha River has, in general, improved over the past 40 years (Fig. 3A,B). This statement is supported by the Mann–Kendall rank test of randomness, which indicates a significant (p = 0.03) increasing trend in dissolved oxygen at station PB002, located in the urban area of Petrópolis, which is highly impacted by effluent discharges despite the fact that this region has municipal sewage treatment. PB011 has presented high levels of DO since the beginning of the time series, with an almost constant behavior over time, and thus shows no trend. The high DO levels are due to both the river's reoxygenation process and the contribution of clean waters from its tributaries, such as the Fagundes River.

A strong annual and semi-annual seasonality was indicated by the power spectral density, which can be seen in the periodogram (Fig.  3 B) resulting from the Fast Fourier Transform. The results are in accordance with the literature 77 indicating that more than 90% of the total variance of dissolved oxygen is accounted for by the annual periodicity and the next four higher harmonics (semi-annual; tri-annual, etc.). Seasonality follows the rainfall regime with a dry period from April to September, and a wet period from October to March, according to Araújo's 78 study carried out in the Piabanha River basin.

Water quality at point PB002 started to improve in 2000, when the first sewage treatment plant in the city of Petrópolis came into operation. Currently, 95% of the population has access to drinking water, and the coverage of treated urban sewage is 85%. The municipality has 26 sewage treatment units, responsible for the treatment of 56.2 million liters per day. In relation to the other municipalities in the basin, according to the National Sanitation Information System 79 (SNIS), the municipality of Três Rios treats 2.97% of its sewage, while the other municipalities, Teresópolis, Areal, São José do Vale do Rio Preto, Paty do Alferes and Paraíba do Sul did not report their data to SNIS, potentially indicating that they do not perform sewage treatment. In other words, about 50% of the population has no formal access to sewage treatment services.

The diagnosis provided by this research establishes the first step towards the Framing of water resources according to their intended uses, as established by the Brazilian National Water Resources Policy. In addition to the diagnosis, a georeferenced database was built. There are few cases of Framework in Brazil and none in the studied watershed, which makes this study relevant to Brazilian water resources management. The considerable number of users awaiting regularization by the State Environmental Institute is a limitation for implementing the Framework and requires a joint effort of the watershed committee.

Answering our initial question, Piabanha River water quality is medium according to the WQI NSF and certainly is not able to support high levels of biodiversity. Some river stretches have quality compatible with class 4 according to the Brazilian regulation for the coliforms, BOD and TP parameters; hence, they cannot be used for irrigation, human or animal consumption, not even after treatment. On the other hand, the Framework must be carried out according to intended uses. Therefore, we recommend that the Piabanha Committee, in partnership with the State Public Ministry, lead actions to reduce the concentrations of these parameters, mainly in the sanitation sector.

It is recommended that the monitoring program be continued and expanded to stretches where conflicts between water uses occur, in order to implement the Framework and enforce the improvement of water quality. It is also important to point out that this study was financed with public resources from the Piabanha water resources fund and that the present analysis made it possible to recommend the exclusion of three of the eight existing stations, thereby enabling the expansion of the monitoring to other tributaries of the Piabanha River under the influence of large populations with practically no sanitation, notably the Rio Preto/Paquequer sub-basin.

This work describes a methodological approach that can be useful for other research in environmental science and management. We applied an integrated approach using data from different sources, combined with data analysis based on WQI, PCA, CA, frequency analysis and trend analysis, which were used in a complementary way to understand a research problem.

Materials and methods

The Piabanha Basin is located in southeastern Brazil, in the mountainous region of the State of Rio de Janeiro, with an area of 2050 km² (Fig. 4). The Piabanha River has its source at an altitude of 1150 m and runs for 80 km until it flows into the Paraíba do Sul River at an altitude of 260 m. The upper portion of the basin has a humid tropical climate and steep slopes, and annual rainfall exceeds 2000 mm. The lower portion of the basin has a sub-humid climate, and the average rainfall decreases to 1300 mm. The seasons are well defined throughout the basin, and the rainfall regime is symmetrically distributed between the periods from January to June and from July to December 78. The territory was home to 535 thousand people in 2018 80. The two largest cities in the region, Petrópolis and Teresópolis, are located in the headwaters of the basin and give rise to the Piabanha and Preto rivers, respectively. Additionally, because sewage treatment is limited and river flows are low, high constituent concentrations are observed (e.g., fecal coliforms, nitrate and BOD), especially in urban areas 42.

Figure 4. Study area, sample stations and interference points (water abstraction or effluent discharge). This map was generated in the open source software QGIS version 3.14.15 ( https://qgis.org/ ).

Three sets of monitoring data have been used in this research (Fig. 4). The first and main one results from a monitoring program conducted by the Piabanha Watershed Committee, from which data from July to December 2019 have been analysed and are described in more detail in the next section. The second comes from six campaigns carried out in 2012 by the HIDROECO project 44, also with financial resources from the Piabanha Committee, and is used as a baseline for comparison. The third comprises two stations of the basic monitoring network of the Rio de Janeiro Environmental Institute, with data from 1980 to the present, except for periods of data gaps.

A georeferenced database containing water management data was also built. The Brazilian National Water Agency (ANA) has developed the National Water Resources Users Register (CNARH) for any bulk water user that changes the regime, quantity or quality of a water body. It is a federal platform, but it can be managed by each state. Registration is a prerequisite for the other stages of the regularization of uses.

Monitoring campaigns and analytical procedures

Physical–chemical parameters were measured in situ using a multiparameter probe (YSI model 556) and a portable turbidimeter (HANNA model HI 98703-0), both previously calibrated and later verified. The samples were placed in specific containers for each analysis; where required by the parameter, they were preserved with H₂SO₄ and kept at a temperature below 4 °C. Laboratory analyses (Table 1) were performed according to the Standard Methods for the Examination of Water and Wastewater (SMWW) 81. The laboratory has an accreditation certificate issued by the State Environmental Agency (INEA CCL No. IN044710) and also complies with ISO/IEC 17025 (CRL 1035).

Water Quality Index

A Water Quality Index (WQI) is an empirical expression which integrates significant physical, chemical and microbiological parameters of water quality into a single number 82 . It can be a powerful communication tool to simplify a complex set of parameters, whose individual interpretation can be difficult, into a single index representing the general water quality. A water quality index was initially proposed by Horton 26 and further developed by Brown 27 , 83 resulting in the National (USA) Sanitation Foundation Water Quality Index (WQI NSF ).

The original version of the WQI NSF used an additive expression 27; however, field data analysis suggested that the additive WQI lacked sensitivity in adequately reflecting the effect of a single low-value parameter on the overall water quality. As a result, a multiplicative form of WQI was proposed 82, 83 :

$$\mathrm{WQI}_{\mathrm{NSF}} = \prod_{i=1}^{n} q_i^{\,w_i}$$

where q_i is the quality score for the i-th variable, a number between 0 and 100, obtained from the respective average quality variation curve 82 as a function of the concentration of that variable; w_i is the relative weight of the i-th variable, a number between 0 and 1, assigned according to the importance of the variable for overall quality; and n is the number of variables. WQI NSF is the National Sanitation Foundation Water Quality Index, a number between 0 and 100, rated as "excellent" (100 ≥ WQI ≥ 90), "good" (90 > WQI ≥ 70), "medium" (70 > WQI ≥ 50), "bad" (50 > WQI ≥ 25) or "very bad" (25 > WQI ≥ 0).

The WQI NSF and its many adaptations have been widely used 84, 85. However, its use is not uniform, and parameters are sometimes replaced without the necessary adaptation of the respective quality curve. In Brazil, the WQI NSF has been used since 1975 by CETESB (Environmental Company of the State of São Paulo). In the following decades, other Brazilian states adopted this index with minor adaptations, and today it is the most widely used in the country. In the present study, the weights (w i ) follow the methodology established by INEA (Environmental Institute of the State of Rio de Janeiro): DO (0.17); fecal coliforms (0.16); pH and BOD (0.11 each); nitrate, phosphate and temperature (0.10 each); turbidity (0.08); and TDS (0.07), the latter used in place of total solids.

Replacing total solids with total dissolved solids may cause an average variation of 0.2% in the final WQI NSF result, based on our estimates (n = 48, 2019 data). Regarding microbiology, E. coli was used instead of fecal coliforms, applying a correction factor 86 of 1.25 to the E. coli result.
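The calculation described above can be illustrated with a minimal sketch. It assumes that the quality scores q_i have already been read from the respective quality curves (the curves themselves are not reproduced here), that the 1.25 correction factor is applied as a simple multiplier, and it uses hypothetical scores; the function and variable names are illustrative and are not part of the original methodology.

```python
import math

# Weights w_i quoted in the text for the INEA methodology (they sum to 1.00).
WEIGHTS = {
    "DO": 0.17,
    "fecal_coliforms": 0.16,
    "pH": 0.11,
    "BOD": 0.11,
    "nitrate": 0.10,
    "phosphate": 0.10,
    "temperature": 0.10,
    "turbidity": 0.08,
    "TDS": 0.07,
}

ECOLI_TO_FECAL_FACTOR = 1.25  # correction factor quoted in the text


def fecal_coliforms_from_ecoli(ecoli_count):
    """Assumption: the correction factor is applied as a plain multiplier."""
    return ECOLI_TO_FECAL_FACTOR * ecoli_count


def wqi_multiplicative(scores):
    """Weighted geometric mean: product of q_i ** w_i over all parameters."""
    if set(scores) != set(WEIGHTS):
        raise ValueError("one quality score is required per weighted parameter")
    return math.prod(scores[name] ** weight for name, weight in WEIGHTS.items())


def wqi_category(wqi):
    """Map a WQI value onto the rating bands listed in the text."""
    if wqi >= 90:
        return "excellent"
    if wqi >= 70:
        return "good"
    if wqi >= 50:
        return "medium"
    if wqi >= 25:
        return "bad"
    return "very bad"


# Hypothetical quality scores (0-100) for a single sample, not measured values.
example_scores = {
    "DO": 80, "fecal_coliforms": 20, "pH": 90, "BOD": 40, "nitrate": 95,
    "phosphate": 70, "temperature": 90, "turbidity": 75, "TDS": 85,
}
example_wqi = wqi_multiplicative(example_scores)
print(round(example_wqi, 1), wqi_category(example_wqi))
```

Because the index is a weighted geometric mean, a single very low score (here the hypothetical coliform and BOD scores) pulls the result down more strongly than it would in an additive index, which is the sensitivity argument made for the multiplicative form.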

Principal component analysis and cluster analysis

Principal component analysis (PCA), as defined by Hotelling 87, is a multivariate technique of covariance modeling that reduces the dimensionality of a dataset of correlated variables with the lowest possible loss of information. A new set of orthogonal, uncorrelated variables, which are weighted linear combinations of the original variables, is formed from the dataset of correlated variables 30.

The PCA technique extracts the eigenvalues and eigenvectors from the covariance matrix of the original variables. The PCs are obtained by multiplying the original correlated variables by the eigenvectors, which are lists of coefficients frequently called “loadings” 29, 30, 88, 89. A widely accepted and simple qualitative rule proposes that loadings greater than 0.30 or less than −0.30 are significant; loadings greater than 0.40 or less than −0.40 are more important; and loadings greater than 0.50 or less than −0.50 are very significant 90. The suitability of the data for PCA was evaluated by the Kaiser–Meyer–Olkin 91, 92 (KMO) measure of sampling adequacy and Bartlett's test of sphericity 93. The Shapiro test was used to verify data normality (α = 0.01).
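The two selection rules used in this study, retaining components whose eigenvalue is greater than one and grading loadings by their absolute value, can be written down as a short sketch. The eigenvalues below are hypothetical and serve only to illustrate the rule; the function names are illustrative as well.

```python
def explained_variance(eigenvalues):
    """Fraction of the total variance carried by each component."""
    total = sum(eigenvalues)
    return [value / total for value in eigenvalues]


def retained_components(eigenvalues):
    """1-based indices of the components with eigenvalue greater than one."""
    return [index for index, value in enumerate(eigenvalues, start=1) if value > 1.0]


def loading_strength(loading):
    """Qualitative rule quoted in the text, applied to the absolute loading."""
    magnitude = abs(loading)
    if magnitude > 0.50:
        return "very significant"
    if magnitude > 0.40:
        return "more important"
    if magnitude > 0.30:
        return "significant"
    return "not significant"


# Hypothetical eigenvalues of a three-variable correlation matrix.
eigenvalues = [1.8, 1.1, 0.1]
print(retained_components(eigenvalues))
print([round(share, 2) for share in explained_variance(eigenvalues)])

# Two loadings reported for PC2 in this study: SS (0.33) and E. coli (-0.66).
print(loading_strength(0.33), "|", loading_strength(-0.66))
```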

Cluster analysis reveals the latent behavior of a dataset to categorize the objects into groups or clusters on the basis of similarities 30 , 88 , 89 . Hierarchical agglomerative cluster analysis (CA) classifies objects by first putting each object in a separate cluster, and then joins the clusters together stepwise until a single cluster remains 29 .
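The agglomerative procedure can be sketched in a few lines for the Ward linkage with Euclidean distance used in this study. At each step, the sketch merges the two clusters whose union gives the smallest increase in the within-cluster sum of squares. The station coordinates are hypothetical, the function name is illustrative, and the naive search shown here is only practical for small datasets such as a handful of stations.

```python
def ward_clusters(points, n_clusters):
    """Agglomerative clustering with the Ward criterion (naive search).

    points: list of equal-length numeric tuples (one per object).
    Returns a list of clusters, each a sorted list of point indices.
    """
    clusters = [[index] for index in range(len(points))]
    dimensions = len(points[0])

    def centroid(members):
        return [
            sum(points[i][d] for i in members) / len(members)
            for d in range(dimensions)
        ]

    def merge_cost(a, b):
        # Increase in within-cluster sum of squares if a and b are merged.
        squared_distance = sum(
            (u - v) ** 2 for u, v in zip(centroid(a), centroid(b))
        )
        return len(a) * len(b) / (len(a) + len(b)) * squared_distance

    while len(clusters) > n_clusters:
        _, i, j = min(
            (merge_cost(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]

    return [sorted(members) for members in clusters]


# Hypothetical, already standardized station averages for two parameters.
stations = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0), (10.0, 0.2)]
print(ward_clusters(stations, 3))
```

In practice, the parameters would be standardized before clustering so that variables measured on larger scales do not dominate the Euclidean distance; that step is assumed here rather than shown.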

Time series analysis and trend detection

The Mann–Kendall trend test is a nonparametric test used to identify a trend in a series, first proposed by Mann 94 and further improved by Kendall 95 and Hirsch 96. The null hypothesis (H 0 ) is that there is no trend in the series. The test is based on Kendall's tau measure of association between two samples, which is itself based on the ranks within the samples. Each observation is compared with every earlier observation, and the sign of each difference is recorded. The number of pairs with negative differences is subtracted from the number of pairs with positive differences to give the statistic S. A positive value of S indicates an upward trend, and a negative value of S a downward trend. For n > 10, a normal approximation is used to calculate the Z statistic, from which the p-value is obtained 96.
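The test statistic can be sketched as follows. This is a minimal version that assumes no correction for tied values and no adjustment for seasonality or serial correlation, so it is not necessarily the exact variant used to produce the results reported here; the function names are illustrative. The normal approximation is applied regardless of n, although it is intended for n > 10.

```python
import math


def z_from_s(s, n):
    """Normal approximation of the Mann-Kendall statistic (no tie correction)."""
    variance = n * (n - 1) * (2 * n + 5) / 18
    if s > 0:
        return (s - 1) / math.sqrt(variance)  # continuity correction
    if s < 0:
        return (s + 1) / math.sqrt(variance)
    return 0.0


def mann_kendall(series):
    """Return (S, Z, two-sided p-value) for a time-ordered series."""
    n = len(series)
    s = 0
    for earlier in range(n - 1):
        for later in range(earlier + 1, n):
            difference = series[later] - series[earlier]
            s += (difference > 0) - (difference < 0)  # sign of the difference
    z = z_from_s(s, n)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return s, z, p_value


# Hypothetical short series with an upward tendency.
print(mann_kendall([6.1, 6.4, 6.2, 6.8, 7.0, 6.9, 7.3, 7.5, 7.4, 7.9, 8.1]))

# Z recomputed from the S and n reported for the upstream station.
print(round(z_from_s(1507, 166), 2))
```

Under these assumptions, the S and n reported for the upstream station (S = 1507, n = 166) give a Z of approximately 2.10, in line with the value reported in the results.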

Fourier decomposition is a technique that separates the frequency components of a data series with seasonal behavior within a complex water quality dataset 97. Spectral analysis performed using a Fast Fourier Transform (FFT) algorithm is widely used in environmental studies because it reveals the dominant influences and their scales 50. The power spectral density (PSD), obtained from the FFT and represented by periodograms, is a recommended procedure to detect seasonality 98, 99.
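A periodogram of a quarterly series can be illustrated with the sketch below. For brevity it evaluates the discrete Fourier transform directly rather than through an FFT algorithm, which yields the same spectrum at a higher computational cost. The synthetic series, its amplitudes and the function name are assumptions made for illustration. With one value per quarter, a frequency f in cycles per sample corresponds to a period of 3/f months.

```python
import cmath
import math


def periodogram(values, months_per_sample=3.0):
    """Return a list of (frequency, period_in_months, power) tuples.

    frequency is in cycles per sample; the mean is removed first so that
    the zero-frequency term does not dominate the spectrum.
    """
    n = len(values)
    mean = sum(values) / n
    centered = [value - mean for value in values]
    spectrum = []
    for k in range(1, n // 2 + 1):
        coefficient = sum(
            value * cmath.exp(-2j * math.pi * k * t / n)
            for t, value in enumerate(centered)
        )
        frequency = k / n
        power = abs(coefficient) ** 2 / n
        spectrum.append((frequency, months_per_sample / frequency, power))
    return spectrum


# Synthetic quarterly DO series over 40 years (160 values): an annual cycle
# plus a weaker semi-annual cycle around a hypothetical mean of 7 mg/L.
series = [
    7.0 + 1.0 * math.cos(2 * math.pi * t / 4) + 0.3 * math.cos(math.pi * t)
    for t in range(160)
]
strongest = sorted(periodogram(series), key=lambda row: row[2], reverse=True)[:2]
for frequency, period, power in strongest:
    print(frequency, period, round(power, 1))
```

In this synthetic case the two strongest peaks fall at 0.25 and 0.5 cycles per quarter, that is, at periods of 12 and 6 months.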

Brazilian legal regulation

Brazilian fresh waters are divided into five classes, depending on the intended use 76. The Special Class is intended mainly for the preservation of the natural balance of aquatic communities in fully protected conservation areas. Class 1 is intended for human consumption after simplified treatment, for the protection of aquatic communities and for primary contact recreation. Class 2 requires conventional treatment for human consumption. Class 3 requires conventional or advanced treatment for human consumption and can be used to water animals and irrigate some crops. Class 4 is intended only for navigation and landscape harmony. It is important to note that the Framework refers to the water quality target required according to water uses. The river basin committees are responsible for implementing the Framework, in accordance with the Brazilian National Water Resources Policy 33. As long as the Framework is not established by the basin committee, fresh waters are considered class 2 (Art. 42 CONAMA 357/2005) 76.

Data availability

All data generated or analysed during this study are included in this published article and its Supplementary Information files.

Martin, V. M. & Joel, A. T. History of the Urban Environment (University of Pittsburgh Press, Pittsburgh, 2012).

Wang, J., Liu, X. D. & Lu, J. Urban river pollution control and remediation. Procedia Environ. Sci. 13 , 1856–1862 (2012).

Zhang, X., Wu, Y. & Gu, B. Urban rivers as hotspots of regional nitrogen pollution. Environ. Pollut. 205 , 139–144 (2015).

Harding, L. W. et al. Long-term trends, current status, and transitions of water quality in Chesapeake Bay. Sci. Rep. 9 , 1–19 (2019).

John, V., Jain, P., Rahate, M. & Labhasetwar, P. Assessment of deterioration in water quality from source to household storage in semi-urban settings of developing countries. Environ. Monit. Assess. 186 , 725–734 (2014).

Mishra, B. K. et al. Assessment of Bagmati river pollution in Kathmandu Valley: scenario-based modeling and analysis for sustainable urban development. Sustain. Water Qual. Ecol. https://doi.org/10.1016/j.swaqe.2017.06.001 (2017).

Xu, Z. et al. Urban river pollution control in developing countries. Nat. Sustain. 2 , 158–160 (2019).

UN-Water. Sustainable Development Goal 6 Synthesis Report on Water and Sanitation 2018. Un (2018). https://doi.org/10.1126/science.278.5339.827 .

UNEP. A Snapshot of the World’s Water Quality: Towards a global assessment . (United Nations Environment Programme, 2016).

Wada, Y. et al. Modeling global water use for the 21st century: the Water Futures and Solutions (WFaS) initiative and its approaches. Geosci. Model Dev. https://doi.org/10.5194/gmd-9-175-2016 (2016).

WWAP. The United Nations World Water Development Report 2019: Leaving No One Behind (2019).

Fan, M. & Shibata, H. Simulation of watershed hydrology and stream water quality under land use and climate change scenarios in Teshio River watershed, northern Japan. Ecol. Indic. https://doi.org/10.1016/j.ecolind.2014.11.003 (2015).

Putro, B., Kjeldsen, T. R., Hutchins, M. G. & Miller, J. An empirical investigation of climate and land-use effects on water quantity and quality in two urbanising catchments in the southern United Kingdom. Sci. Total Environ. https://doi.org/10.1016/j.scitotenv.2015.12.132 (2016).

Li, B., Rodell, M., Sheffield, J., Wood, E. & Sutanudjaja, E. Long-term, non-anthropogenic groundwater storage changes simulated by three global-scale hydrological models. Sci. Rep. 9 , 10746 (2019).

Jaeger, W. K. et al. Scope and limitations of drought management within complex human–natural systems. Nat. Sustain. https://doi.org/10.1038/s41893-019-0326-y (2019).

Pastor, A. V. et al. The global nexus of food–trade–water sustaining environmental flows by 2050. Nat. Sustain. https://doi.org/10.1038/s41893-019-0287-1 (2019).

Melo, D. C. D. et al. The big picture of field hydrology studies in Brazil. Hydrol. Sci. J. 65 , 1262–1280 (2020).

Dixon, W. & Chiswell, B. Review of aquatic monitoring program design. Water Res. 30 , 1935–1948 (1996).

Wang, Y., Xiang, C., Zhao, P., Mao, G. & Du, H. A bibliometric analysis for the research on river water quality assessment and simulation during 2000–2014. Scientometrics 108 , 1333–1346 (2016).

Ji, X., Dahlgren, R. A. & Zhang, M. Comparison of seven water quality assessment methods for the characterization and management of highly impaired river systems. Environ. Monit. Assess. 188 , 15 (2016).

Deng, W. & Wang, G. A novel water quality data analysis framework based on time-series data mining. J. Environ. Manag. 196 , 365–375 (2017).

Singh, S. et al. Development of indices for surface and ground water quality assessment and characterization for Indian conditions. Environ. Monit. Assess. 191 , 182 (2019).

Mladenović-Ranisavljević, I. I., Takić, L. & Nikolić, D. Water quality assessment based on combined multi-criteria decision-making method with index method. Water Resour. Manag. 32 , 2261–2276 (2018).

Chen, S. K., Jang, C. S. & Chou, C. Y. Assessment of spatiotemporal variations in river water quality for sustainable environmental and recreational management in the highly urbanized Danshui River basin. Environ. Monit. Assess. 191 , 100 (2019).

Rakotondrabe, F. et al. Water quality assessment in the Bétaré-Oya gold mining area (East-Cameroon): multivariate statistical analysis approach. Sci. Total Environ. 610–611 , 831–844 (2018).

Article   ADS   PubMed   CAS   Google Scholar  

Horton, R. K. An Index Number System for Rating Water Quality. J. Water Pollut. Control Fed. (1965).

Brown, R. M., McClelland, N. I., Deininger, R. A. & Tozer, R. G. A water quality index—do we dare?. Water Sew. Work 117 , 339–343 (1970).

Wu, Z. et al. Water quality assessment based on the water quality index method in Lake Poyang: the largest freshwater lake in China. Sci. Rep. 7 , 1–10 (2017).

Article   ADS   CAS   Google Scholar  

Singh, K. P., Malik, A., Mohan, D. & Sinha, S. Multivariate statistical techniques for the evaluation of spatial and temporal variations in water quality of Gomti River (India)—a case study. Water Res. 38 , 3980–3992 (2004).

Dutta, S., Dwivedi, A. & Suresh Kumar, M. Use of water quality index and multivariate statistical techniques for the assessment of spatial variations in water quality of a small river. Environ. Monit. Assess. 190 , 718 (2018).

Malsy, M., Flörke, M. & Borchardt, D. What drives the water quality changes in the Selenga Basin: climate change or socio-economic development?. Reg. Environ. Change 17 , 1977–1989 (2017).

Pacheco, F. S. et al. Water quality longitudinal profile of the Paraíba do Sul River, Brazil during an extreme drought event. Limnol. Oceanogr. 62 , S131–S146 (2017).

Brazilian National Congress. Brazilian National Water Resources Policy. Federal Law n. 9433. (1997).

ANA. Brazilian Water Resources Report—2017 (National Water Agency (Brazil), 2018).

Myers, N., Mittermeier, R. A., Mittermeier, C. G., da Fonseca, G. A. B. & Kent, J. Biodiversity hotspots for conservation priorities. Nature 403 , 853–858 (2000).

Article   ADS   CAS   PubMed   Google Scholar  

Russo, G. Biodiversity: biodiversity’s bright spot. Nature 462 , 266–269 (2009).

Ribeiro, M. C., Metzger, J. P., Martensen, A. C., Ponzoni, F. J. & Hirota, M. M. The Brazilian Atlantic Forest: How much is left, and how is the remaining forest distributed? Implications for conservation. Biol. Conserv. https://doi.org/10.1016/j.biocon.2009.02.021 (2009).

Tabarelli, M., Aguiar, A. V., Ribeiro, M. C., Metzger, J. P. & Peres, C. A. Prospects for biodiversity conservation in the Atlantic Forest: lessons from aging human-modified landscapes. Biol. Conserv. 143 , 2328–2340 (2010).

Bogoni, J. A., Pires, J. S. R., Graipel, M. E., Peroni, N. & Peres, C. A. Wish you were here: how defaunated is the Atlantic Forest biome of its medium- to large-bodied mammal fauna?. PLoS ONE 13 , e0204515 (2018).

Article   PubMed   PubMed Central   CAS   Google Scholar  

Rezende, C. L. et al. From hotspot to hopespot: an opportunity for the Brazilian Atlantic Forest. Perspect. Ecol. Conserv. 16 , 208–214 (2018).

CEIVAP & PROFILL. Plano de Bacia: Consolidação do diagnóstico . http://54.94.199.16:8080/publicacoesArquivos/ceivap/arq_pubMidia_Processo_030-2018_AGVP_PS_PIRH-Atualizacao_TOMO_I_R03_2.pdf (2018).

Villas-Boas, M. D., Olivera, F. & de Azevedo, J. P. S. Assessment of the water quality monitoring network of the Piabanha River experimental watersheds in Rio de Janeiro, Brazil, using autoassociative neural networks. Environ. Monit. Assess. https://doi.org/10.1007/s10661-017-6134-9 (2017).

Morais, A., Villas-Boas, M., Bastos, A., Monteiro, A. & Araújo, L. Estudos para um diagnóstico quali-quantitativo em bacias experimentais – Estudo de Caso: Bacia do rio Piabanha. In II Seminário de Recursos Hídricos da Bacia Hidrográfica do Paraíba do Sul: Recuperação de Áreas Degradadas, Serviços Ambientais e Sustentabilidade 173–180 (2009). https://doi.org/10.4136/serhidro.23 .

Azevedo, J. P. S. de. Relatório final do projeto HIDROECO/Piabanha: Metodologia para Determinação de Vazões Ambientais na Região Serrana do RJ Integrando Aspectos Hidrometeorológicos, Ecológicos e Socioeconômico. Volume 1: Informações Quali-quantitativas (2017).

de Mello, F. V. et al. Current state of contamination by persistent organic pollutants and trace elements on Piabanha River Basin—Rio de Janeiro, Brazil. Orbital Electron. J. Chem. 10 , 327–336 (2018).

Chiappori, D., Hora, M. & Azevedo, J. Interface between hydropower generation and other water uses in the Piabanha River Basin in Brazil. Br. J. Appl. Sci. Technol. https://doi.org/10.9734/bjast/2016/23935 (2016).

da Silva, P. V. R. M., Pecly, J. O. G. & de Azevedo, J. P. S. Uso de traçadores fluorescentes para determinar características de transporte e dispersão no Rio Piabanha (RJ) para a modelagem quali-quantitativa pelo HEC-RAS. Eng. Sanit. Ambient. https://doi.org/10.1590/s1413-41522017150187 (2017).

de Costa, D. A., dos Assumpção, R. S. F. V., de Azevedo, J. P. S. & dos Santos, M. A. On water resources management instruments—Framing—as a tool for river rehabilitation. Saúde em Debate 43 , 35–50 (2019).

Abdul-Aziz, O. I., Wilson, B. N. & Gulliver, J. S. An extended stochastic harmonic analysis algorithm: application for dissolved oxygen. Water Resour. Res. 43 , W08417 (2007).

Rajwa-Kuligiewicz, A., Bialik, R. J. & Rowiński, P. M. Dissolved oxygen and water temperature dynamics in lowland rivers over various timescales. J. Hydrol. Hydromechanics 63 , 353–363 (2015).

United States Environmental Protection Agency. Quality criteria for water . https://nepis.epa.gov/Exe/ZyPDF.cgi/00001MGA.PDF?Dockey=00001MGA.PDF (1986).

Imperador, Á. do. Our history. https://www.grupoaguasdobrasil.com.br/aguas-imperador/en/ (2020).

ANA. Atlas Esgotos—Despoluição de bacias hidrográficas . (Brazilian National Water Agency, 2017).

Karthe, D., Lin, P.-Y. & Westphal, K. Instream coliform gradients in the Holtemme, a small headwater stream in the Elbe River Basin, Northern Germany. Front. Earth Sci. 11 , 544–553 (2017).

Article   ADS   Google Scholar  

von Sperling, M. & von Sperling, E. Challenges for bathing in rivers in terms of compliance with coliform standards. Case study in a large urbanized basin (das Velhas River, Brazil). Water Sci. Technol. 67 , 2534–2542 (2013).

Bae, H. Changes of river’s water quality responded to rainfall events. Environ. Ecol. Res. 1 , 21–25 (2013).

Yu, S., Xu, Z., Wu, W. & Zuo, D. Effect of land use types on stream water quality under seasonal variation and topographic characteristics in the Wei River basin, China. Ecol. Indic. 60 , 202–212 (2016).

Lumb, A., Sharma, T. C. & Bibeault, J.-F. A review of genesis and evolution of Water Quality Index (WQI) and some future directions. Water Qual. Exposure Health 3 , 11–24 (2011).

Jouanneau, S. et al. Methods for assessing biochemical oxygen demand (BOD): a review. Water Res. 49 , 62–82 (2014).

Vigiak, O. et al. Predicting biochemical oxygen demand in European freshwater bodies. Sci. Total Environ. 666 , 1089–1105 (2019).

Article   ADS   CAS   PubMed   PubMed Central   Google Scholar  

Ishii, S. & Sadowsky, M. J. Escherichia coli in the environment: implications for water quality and human health. Microbes Environ. 23 , 101–108 (2008).

Odonkor, S. T. & Ampofo, J. K. Escherichia coli as an indicator of bacteriological quality of water: an overview. Microbiol. Res. (Pavia) 4 , 2 (2013).

Schlesinger, W. H. & Bernhardt, E. S. Biogeochemistry. Biogeochemistry: an analysis of global change 3rd edn. (Elsevier, Amsterdam , 2013). https://doi.org/10.1016/C2010-0-66291-2 .

Book   Google Scholar  

Tipping, E. et al. Atmospheric deposition of phosphorus to land and freshwater. Environ. Sci. Process. Impacts https://doi.org/10.1039/c3em00641g (2014).

Withers, P. J. A. & Jarvie, H. P. Delivery and cycling of phosphorus in rivers: a review. Sci. Total Environ. https://doi.org/10.1016/j.scitotenv.2008.08.002 (2008).

Sharpley, A. Agricultural phosphorus, water quality, and poultry production: are they compatible?. Poult. Sci. https://doi.org/10.1093/ps/78.5.660 (1999).

House, W. A. & Denison, F. H. Exchange of inorganic phosphate between river waters and bed-sediments. Environ. Sci. Technol. https://doi.org/10.1021/es020039z (2002).

Alvim, B. R. Dinâmica do nitrogênio e fósforo em águas fluviais de uma bacia hidrográfica com diferentes usos do solo no Sudeste do Brasil (Universidade Federal Fluminense, 2016).

Molinari, B. S. Modelagem espacial da qualidade de água na bacia do rio Piabanha/RJ (Universidade Federal do Rio de Janeiro, 2015).

Jaji, M. O., Bamgbose, O., Odukoya, O. O. & Arowolo, T. A. Water quality assessment of Ogun river, South West Nigeria. Environ. Monit. Assess. 133 , 473–482 (2007).

Mitra, S. et al. Water quality assessment of the ecologically stressed Hooghly River Estuary, India: a multivariate approach. Mar. Pollut. Bull. 126 , 592–599 (2018).

Guo, H. Y., Wang, X. R. & Zhu, J. G. Quantification and index of non-point source pollution in Taihu Lake region with GIS. Environ. Geochem. Health https://doi.org/10.1023/B:EGAH.0000039577.67508.76 (2004).

Khuhawar, M. Y., Zaman Brohi, R. O., Jahangir, T. M. & Lanjwani, M. F. Water quality assessment of Ramser site, Indus Delta, Sindh, Pakistan. Environ. Monit. Assess. 190 , 492 (2018).

Alves, R. I. S. et al. Water quality assessment of the Pardo River Basin, Brazil: a multivariate approach using limnological parameters, metal concentrations and indicator bacteria. Arch. Environ. Contam. Toxicol. 75 , 199–212 (2018).

Liang, B. et al. Distribution, sources, and water quality assessment of dissolved heavy metals in the Jiulongjiang River water, Southeast China. Int. J. Environ. Res. Public Health 15 , 2752 (2018).

Article   CAS   PubMed Central   Google Scholar  

Brazil. Brazilian National Environment Council ( CONAMA ) Resolution n. 357. Provides the classification of water bodies and environmental guidelines for their framework, as well as establishes the conditions and standards for effluents discharge 1–27 (2005).

Thomann, R. V. Time-series analyses of water-quality data. J. Sanit. Eng. Div. 93 , 1–24 (1967).

Araújo, L. M. N. de. Identification of precipitation and soil moisture hydrological patterns at Piabanha river basin. Ph.D. thesis. (Federal University of Rio de Janiero, 2016).

Brazil. Brazilian National Sanitation Information System (SNIS). http://www.snis.gov.br/ (2020).

CEIVAP/PROFILL. Integrated plan for water resources in the watershed of the Paraíba do Sul river . http://sigaceivap.org.br:8080/publicacoesArquivos/ceivap/arq_pubMidia_AGVP_PS_PIRH_PP-06_REV03_FINAL.pdf (2020).

American Public Health Association. Standard Method for Examination of Water and Wastewater . (American Public Health Association, 2012).

McClelland, N. I. Water Quality Index Application in the Kansas River Basin (1974).

Brown, R.M., McClelland, N.I., Deininger, R.A., Landwehr, J. M. Validating the WQI . National meeting of American Society of Civil Engineers on water resources engineering (1973).

Noori, R., Berndtsson, R., Hosseinzadeh, M., Adamowski, J. F. & Abyaneh, M. R. A critical review on the application of the National Sanitation Foundation Water Quality Index. Environ. Pollut. 244 , 575–587 (2019).

Kachroud, M., Trolard, F., Kefi, M., Jebari, S. & Bourrié, G. Water quality indices: challenges and application limits in the literature. Water (Switzerland) 11 , 1–26 (2019).

CETESB, Companhia Ambiental do Estado de São Paulo. Qualidade das Águas Interiores no Estado de São Paulo - Apêndice D—Índices de Qualidade das Águas . https://cetesb.sp.gov.br/aguas-interiores/publicacoes-e-relatorios/ (2019).

Hotelling, H. Analysis of a complex of statistical variables into principal components. J. Educ. Psychol. 24 , 417–441 (1933).

Article   MATH   Google Scholar  

Helena, B. et al. Temporal evolution of groundwater composition in an alluvial aquifer (Pisuerga River, Spain) by principal component analysis. Water Res. https://doi.org/10.1016/S0043-1354(99)00225-0 (2000).

Vega, M., Pardo, R., Barrado, E. & Debán, L. Assessment of seasonal and polluting effects on the quality of river water by exploratory data analysis. Water Res. 32 , 3581–3592 (1998).

Sergeant, C. J., Starkey, E. N., Bartz, K. K., Wilson, M. H. & Mueter, F. J. A practitioner’s guide for exploring water quality patterns using principal components analysis and procrustes. Environ. Monit. Assess. 188 , 249 (2016).

Cerny, B. A. & Kaiser, H. F. A study of a measure of sampling adequacy for factor-analytic correlation matrices. Multivariate Behav. Res. https://doi.org/10.1207/s15327906mbr1201_3 (1977).

Kaiser, H. F. An index of factorial simplicity. Psychometrika https://doi.org/10.1007/BF02291575 (1974).

Arsham, H. & Lovric, M. Bartlett’s Test. In International encyclopedia of statistical science (ed. Lovric, M.) 87–88 (Springer, Berlin, 2011). https://doi.org/10.1007/978-3-642-04898-2_132 .

Chapter   MATH   Google Scholar  

Mann, H. B. Nonparametric tests against trend. Econometrica 13 , 245 (1945).

Article   MathSciNet   MATH   Google Scholar  

Kendall, M. G. Rank correlation methods (Oxford University Press, Oxford, 1975).

Hirsch, R. M., Slack, J. R. & Smith, R. A. Techniques of trend analysis for monthly water quality data. Water Resour. Res. 18 , 107–121 (1982).

Whitfield, P. H. Identification and characterization of transient water quality events by Fourier analysis. Environ. Int. 21 , 571–575 (1995).

Harris, J., Loftis, J. C. & Montgomery, R. H. Statistical methods for characterizing ground-water quality. Ground Water 25 , 185–193 (1987).

Hipel, K. W. & McLeod, A. I. Time series modelling of water resources and environmental systems. Time Ser. Model. Water Resour. Environ. Syst. https://doi.org/10.1016/0022-1694(95)90010-1 (1994).

Download references

Acknowledgements

We thank the Piabanha Committee for financially supporting our research. We also thank Juliana Pereira Dias for helping with the statistical analysis, and Renata Demori Costa and Jamie Sweeney for the English-language review.

Author information

Authors and affiliations

Federal University of Rio de Janeiro (UFRJ), Alberto Luiz Coimbra Institute for Graduate Studies and Engineering Research (COPPE), Centro Tecnológico, Cidade Universitária, Rio de Janeiro, RJ, Brazil

David de Andrade Costa, José Paulo Soares de Azevedo & Marco Aurélio dos Santos

Federal Fluminense Institute, São João da Barra Advanced Campus, BR 356, KM 181, São João da Barra, RJ, Brazil

David de Andrade Costa

Oswaldo Cruz Foundation (Fiocruz), National School of Public Health Sergio Arouca (ENSP), Rua Leopoldo Bulhões, 1.480, Manguinhos, Rio de Janeiro, RJ, Brazil

Rafaela dos Santos Facchetti Vinhaes Assumpção

Contributions

D.A.C. compiled the manuscript, performed the analysis, and generated the figures and tables in the main text. J.P.S.A. contributed to the discussions and carefully reviewed the manuscript. M.A.S., R.S.F.V.A. and J.P.S.A. made substantial contributions to the conception and design of the research. All the authors reviewed the manuscript.

Corresponding authors

Correspondence to David de Andrade Costa or José Paulo Soares de Azevedo.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

  • Supplementary Table S1
  • Supplementary Table S2
  • Supplementary Table S3
  • Supplementary Table S4
  • Supplementary Table S5

Rights and permissions

Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

de Andrade Costa, D., Soares de Azevedo, J.P., dos Santos, M.A. et al. Water quality assessment based on multivariate statistics and water quality index of a strategic river in the Brazilian Atlantic Forest. Sci Rep 10, 22038 (2020). https://doi.org/10.1038/s41598-020-78563-0

Received: 02 May 2020

Accepted: 13 November 2020

Published: 16 December 2020

DOI: https://doi.org/10.1038/s41598-020-78563-0

Surface Integrity of Austenitic Manganese Alloys Hard Layers after Cavitation Erosion

1. Introduction

2. Materials and Experimental Details

  • Installed power: 500 W
  • Vibration frequency: 20 kHz
  • Vibration amplitude: 50 μm ± 5%
  • Supply voltage: 220 V/50 Hz
  • Working fluid: tap water

3. Discussion of the Experimental Results

3.1. Cavitation Curves

3.2. Microstructural Observations. Degree of Dilution

4. Conclusions

Author Contributions

Data Availability Statement

Conflicts of Interest

The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

Mitelea, I.; Bordeașu, I.; Mutașcu, D.; Crăciunescu, C.M.; Uțu, I.D. Surface Integrity of Austenitic Manganese Alloys Hard Layers after Cavitation Erosion. Lubricants 2024, 12, 330. https://doi.org/10.3390/lubricants12100330

Assessment of groundwater quality

  • October 2017
  • Publisher: e-PG Pathshala, UGC, MHRD, Govt. of India

Shashank Shekhar at University of Delhi

Abstract and Figures

Trilinear diagram for representation of major ion chemistry of groundwater (After Piper 1944)

IMAGES

  1. (PDF) Research Paper on Analysing impact of Various Parameters on Water

  2. (PDF) Analysis of Ground Water Quality and its Management

  3. (PDF) A Review Paper on Water Resource Management

  4. How to Interpret a Water Analysis Report

  5. (PDF) Time series analysis of water quality parameters

  6. (PDF) Evaluation of Irrigation Water Quality by Data Envelopment

VIDEO

  1. Study Electrolysis of Water

  2. Data Analysis and Interpretation of Spring Water Quality

  3. Lecture 56: Case Study

  4. How to Write a Scientific Research Paper

  5. Lec-5

  6. How to conduct Meta-analysis

COMMENTS

  1. (PDF) An Introduction to Water Quality Analysis

    Water quality analysis is required mainly for monitoring purposes. Some important uses of such assessment include: (i) To check whether the water quality is in compliance with the standards, and ...

  2. (PDF) Water Quality Parameters

    The physical, chemical, and biological parameters of water quality are reviewed in terms of definition, sources, impacts, effects, and measuring methods. The classification of water ac ...

  3. Evaluating Drinking Water Quality Using Water Quality Parameters and

    Water is a vital natural resource for human survival as well as an efficient tool of economic development. Drinking water quality is a global issue, with contaminated unimproved water sources and inadequate sanitation practices causing human diseases (Gorchev & Ozolins, 1984; Prüss-Ustün et al., 2019). Approximately 2 billion people consume water that has been tainted with feces.

  4. 76814 PDFs

    Explore the latest full-text research PDFs, articles, conference papers, preprints and more on WATER QUALITY ANALYSIS. ... references or conduct a literature review on WATER QUALITY ANALYSIS ...

  5. A critical analysis of parameter choices in water quality assessment

    The water quality index (WQI) is a crucial tool in environmental monitoring, offering a comprehensive evaluation of water quality. This index transforms a variety of parameters into a single numerical value, thereby facilitating the classification of water samples into distinct safety levels (Tasneem and Abbasi, 2012, Sutadian et al., 2016 ...

  6. Water Analysis: Emerging Contaminants and Current Issues

  7. Groundwater quality assessment using water quality index and principal

    Principal component analysis (PCA): Principal component analysis is a well-reported statistical approach in groundwater research. The data clarification is obtained along the hidden factor ...

  8. PDF Assessment of Water Quality Parameters: A Review

    Measurement of pH: The pH is an important parameter of water, which determines its suitability for various purposes such as drinking, bathing, cooking, washing, and agriculture. The desirable pH range for water is 6.5 to 8.5, as specified by the BIS. Pure water is said to be neutral, with a pH of 7.

  9. PDF Water Quality Analysis of Narmada River with Reference to ...

    International Journal of Science and Research (IJSR), ISSN: 2319-7064; ResearchGate Impact Factor (2018): 0.28; SJIF (2018): 7.426; Volume 9, Issue 1, January 2020; www.ijsr.net; licensed under Creative Commons Attribution CC BY. Water Quality Analysis of Narmada River with Reference to Physico-Chemical Parameters at Hoshangabad City, M.P., India

  10. Reliable water quality prediction and parametric analysis using

    Despite all the implementation and analysis from an engineering perspective, the involvement of an environmental scientist in any aspect of water research would contribute towards the enhancement ...

  11. A review of water quality index models and their use for assessing

    Water is a crucial component of the environment, but surface water and groundwater quality have long been deteriorating due to both natural and human-related activities. Natural factors that influence water quality are hydrological, atmospheric, climatic, topographical and lithological factors (Magesh et al., 2013, Uddin et al., 2018).

  12. Statistical tools for water quality assessment and monitoring in river

    The first use of linear multilevel models (LMM) for water quality analysis in our paper sample was by Tate et al., who looked at the effect of cattle feces distributions on water quality. Similar to the use of Bayesian approaches, multilevel models began to notably increase in popularity from 2012 onward, with only 10 WQ studies using the ...

  13. PDF An assessment of water quality index of Godavari river water ...

    In this study, the water quality of the Godavari river in Nashik city was evaluated using 10 sampling stations and 8 selected parameters. The physico-chemical analysis of the water samples indicates that the river water has alkaline properties.

  14. (PDF) Bacteriological water analysis

    Bacteriological water analysis is a method of analysing water to estimate the numbers of bacteria present and, if needed, to find out what sort of...

  15. Evaluation of water quality index and geochemical characteristics of

    In this research, the WQI values calculated for sites 13, 14, 17, 18, and 25 were 24.58, 21.22, 23.96, 23.07, and 25.80, respectively. ... as PCA and correlation analysis were used in this paper to ...

  16. An Introduction to water quality analysis

    The general flow of procedures for water quality analysis is mentioned in Chart-1. Water quality can be defined as the chemical, physical and biological characteristics of water, usually with respect to its suitability for a designated use. Water can be used for recreation, drinking, fisheries, agriculture or industry.

  17. PDF Methods for Collection and Analysis of Water Samples

    Water-supply paper 1454) Includes bibliographies. 1. Water Analysis. I. Thatcher, Leland Lincoln, 1923 ... tion of results of water-quality research and investigations for publication. This manual covers only the data-collection segment ... revolutionized water analysis in recent decades. Further advances are on the horizon. Better methods for ...

  18. Qualitative and Quantitative Analysis of Water

    The main purpose of this chapter is to spread knowledge of the quality of the available water. In this chapter, different wastewater parameters and their estimation methods are discussed in detail. Qualitative and quantitative measurements are also discussed, as a means to constantly monitor the quality of water from the various sources of supply.

  19. (PDF) Water Quality Assessment with Water Quality Indices

    Water Quality Assessment. Water quality is determined by assessing three classes of attributes: biological, chemical, and physical. There are standards of water quality set for each of these ...

  20. Water quality assessment based on multivariate statistics and water

    A bibliometric analysis for the research on river water quality assessment and simulation during 2000-2014. Scientometrics 108, 1333-1346 (2016).

  21. PDF WATER ANALYSIS

    This User-friendly Field/Laboratory Manual for water analysis is a major contribution made by the scientists who are actively engaged in water-related research at the Department of Environmental Sciences, Institute of Fundamental Studies, Kandy. Successful effort has been made by them to compile cost-effective techniques for analysis ...

  22. Mapping the Terrain: A Systematic Review of Drama Research Using Data

    This paper maps the terrain of drama research, referring to scholarly work focusing on television and online dramas through a systematic review using data mining techniques. We chose drama as our unit of analysis because we are interested in how this medium pervades and reflects daily life and society.

  23. (Pdf) Water Purification: a Brief Review on Tools and Techniques Used

    This paper includes brief information about some of them, namely physico-chemical water analysis (PCWA), adsorption, metal pollution index (MPI), water quality index (WQI), water quality ...

  24. Effects of locally available burlap fiber on the mechanical and

    The average water absorption of the concrete samples (at 28 days) for different levels of burlap fiber is listed in Table 6. As shown in Table 2, the water absorption for the different fiber contents varied from 3.68% to 5.57%, with an overall mean of 4.81% and a coefficient of variation of 13.56%. It can be noted that higher ...

  25. Surface Integrity of Austenitic Manganese Alloys Hard Layers after

  26. (PDF) Assessment of groundwater quality

    salinity, sodium adsorption ratio and specific ion toxicity. For holistic assessment of groundwater quality data, graphical plots like the Trilinear diagram, Durov's diagram and C-S diagram can be ...