ExceedanceScreen

Water Quality Trend Analysis: A Mann-Kendall Worked Example for Compliance Monitoring

You’ve got eight quarters of trichloroethene (TCE) results from one monitoring well at a former dry cleaner site. The concentrations look like they’re creeping up — from 12 to 25 µg/L over two years — but the regulator wants a statistical determination, not a visual one. Is there a real upward trend, or are you reading a pattern into eight noisy points? This is how to set up statistical analysis for water quality trends in a way that holds up to a state reviewer.

The Mann-Kendall trend test answers exactly this question, and it is a standard test in EPA guidance for groundwater trend determinations. This worked example takes 8 quarterly TCE measurements through the S statistic, its variance, the Z-score, and a defensible conclusion.

When Mann-Kendall Is the Right Statistical Analysis for Water Quality Trends

Mann-Kendall is a non-parametric rank-based test. It evaluates whether a time series shows a monotonic upward, downward, or no trend. Three properties make it the workhorse for groundwater and surface water trend analysis:

  • Distribution-free. No assumption of normality. Environmental concentration data is typically right-skewed and lognormal; you don’t need to log-transform first.
  • Robust to non-detects. Censored data (U-flagged results below the reporting limit) can be handled by ranking conventions documented in EPA Technical Notes 6 (2013).
  • Invariant to order-preserving transformations. Whether you analyze raw concentrations or log-concentrations, the test result is the same, because only the relative ordering of the values enters the statistic.

It does not tell you the magnitude of the trend — just whether one exists and in which direction. Pair it with the Theil-Sen slope estimator (median of all pairwise slopes) when you need a slope estimate that’s consistent with the rank-based test.
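As a rough illustration of the Theil-Sen idea, here is a minimal Python sketch that takes the median of all pairwise slopes. The function name `theil_sen_slope` is made up for this example, the eight values are the MW-3 results tabulated later in this article, and time is counted in whole quarters on the assumption that the sampling events are evenly spaced.

```python
from itertools import combinations
from statistics import median

def theil_sen_slope(times, values):
    """Median of the slopes between every pair of observations."""
    slopes = [
        (values[j] - values[i]) / (times[j] - times[i])
        for i, j in combinations(range(len(values)), 2)
        if times[j] != times[i]  # skip pairs sampled at the same time
    ]
    return median(slopes)

# Assumption: time is counted in quarters (0, 1, 2, ...), treating the
# sampling events as evenly spaced.
quarters = list(range(8))
tce = [12, 15, 13, 18, 16, 22, 20, 25]  # MW-3 results, µg/L

slope = theil_sen_slope(quarters, tce)
print(f"Theil-Sen slope: {slope:.2f} ug/L per quarter")
```

If the sampling dates are unevenly spaced, passing elapsed days or decimal years as `times` would give a slope in those units instead.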

The Formula

For a time series \(x_1, x_2, \ldots, x_n\), the Mann-Kendall S statistic is:

$$S = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \text{sgn}(x_j - x_i)$$

where \(\text{sgn}(d) = +1\) if \(d > 0\), \(-1\) if \(d < 0\), and \(0\) if \(d = 0\) (a tie).
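The double sum translates almost directly into code. The sketch below is one way to write it in Python; the helper names `sgn` and `mann_kendall_s` and the four-point series are illustrative, not taken from any particular library.

```python
from itertools import combinations

def sgn(d):
    """Sign of a difference: +1, -1, or 0 for a tie."""
    return (d > 0) - (d < 0)

def mann_kendall_s(values):
    """Sum sgn(x_j - x_i) over all pairs with j later than i."""
    # combinations() preserves input order, so the first item of each
    # pair is always the earlier observation.
    return sum(sgn(later - earlier) for earlier, later in combinations(values, 2))

# Hypothetical four-point series containing one tie (the two 7s):
# three increases, two decreases, one tie.
example = [5, 7, 7, 6]
print(mann_kendall_s(example))
```

The values must be in time order before they are passed in; the function has no notion of dates.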

The variance of S, with no ties, is:

$$\text{Var}(S) = \frac{n(n-1)(2n+5)}{18}$$
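When the record does contain ties (repeated values, or non-detects ranked at a common value), the usual correction subtracts one term per group of tied values. The symbols \(g\) (number of tied groups) and \(t_p\) (number of observations in group \(p\)) are introduced here for this formula only:

$$\text{Var}(S) = \frac{n(n-1)(2n+5) - \sum_{p=1}^{g} t_p(t_p-1)(2t_p+5)}{18}$$

With no ties the sum is empty and the expression reduces to the untied formula.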

For \(n > 10\), the standardized test statistic Z is approximately standard normal:

$$Z = \begin{cases} \dfrac{S-1}{\sqrt{\text{Var}(S)}} & \text{if } S > 0 \\[4pt] 0 & \text{if } S = 0 \\[4pt] \dfrac{S+1}{\sqrt{\text{Var}(S)}} & \text{if } S < 0 \end{cases}$$

The continuity correction (\(\pm 1\)) is what most environmental references use. For \(n \leq 10\), use the exact distribution table from EPA Technical Notes 6 instead of the normal approximation.

The Data

Eight quarters of TCE results from monitoring well MW-3 at a state-cleanup site:

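| Quarter | Sample Date | TCE (µg/L) | Qualifier |
|---------|-------------|------------|-----------|
| Q1 | 2024-02-15 | 12 | |
| Q2 | 2024-05-20 | 15 | |
| Q3 | 2024-08-12 | 13 | |
| Q4 | 2024-11-18 | 18 | |
| Q5 | 2025-02-14 | 16 | |
| Q6 | 2025-05-19 | 22 | |
| Q7 | 2025-08-11 | 20 | |
| Q8 | 2025-11-17 | 25 | |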

The federal MCL for TCE is 5 µg/L; every result is an exceedance. The trend question matters because it determines whether the plume is stable, attenuating, or expanding — which drives the next-phase decision (continued monitoring, source-zone re-investigation, or remedial action).

Computing S

With \(n = 8\), there are \(\binom{8}{2} = 28\) pairs to evaluate. Compare each later quarter to each earlier quarter and sum the signs:

Worked Example: S Calculation

Pairs starting from Q1 (\(x_1 = 12\)): Q2−Q1 = +3 (+), Q3−Q1 = +1 (+), Q4−Q1 = +6 (+), Q5−Q1 = +4 (+), Q6−Q1 = +10 (+), Q7−Q1 = +8 (+), Q8−Q1 = +13 (+). Subtotal: +7.

Pairs from Q2 (\(x_2 = 15\)): Q3−Q2 = −2 (−), Q4−Q2 = +3 (+), Q5−Q2 = +1 (+), Q6−Q2 = +7 (+), Q7−Q2 = +5 (+), Q8−Q2 = +10 (+). Subtotal: +4.

Pairs from Q3 (13): all five later quarters are larger. Subtotal: +5.

Pairs from Q4 (18): Q5−Q4 = −2 (−), Q6 (+), Q7 (+), Q8 (+). Subtotal: +2.

Pairs from Q5 (16): Q6, Q7, Q8 all larger. Subtotal: +3.

Pairs from Q6 (22): Q7−Q6 = −2 (−), Q8−Q6 = +3 (+). Subtotal: 0.

Pairs from Q7 (20): Q8−Q7 = +5 (+). Subtotal: +1.

Total: \(S = 7 + 4 + 5 + 2 + 3 + 0 + 1 = 22\).

Of 28 pairs, 25 were positive (later value higher), 3 were negative, 0 were tied. \(S = 25 - 3 = 22\), consistent with the subtotal sum.
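If you want to check the hand tally, a short script can reproduce the per-quarter subtotals and the pair counts. This is a verification sketch only; the variable names are arbitrary.

```python
tce = [12, 15, 13, 18, 16, 22, 20, 25]  # MW-3 results, Q1..Q8, in µg/L

def sgn(d):
    """Sign of a difference: +1, -1, or 0 for a tie."""
    return (d > 0) - (d < 0)

# One subtotal per starting quarter: compare it with every later quarter.
subtotals = [
    sum(sgn(later - tce[i]) for later in tce[i + 1:])
    for i in range(len(tce) - 1)
]

# Sign of every (earlier, later) pair, used to count the three outcomes.
signs = [
    sgn(tce[j] - tce[i])
    for i in range(len(tce))
    for j in range(i + 1, len(tce))
]
positive = signs.count(1)
negative = signs.count(-1)
tied = signs.count(0)

print("Subtotals by starting quarter:", subtotals)
print("Positive, negative, tied pairs:", positive, negative, tied)
print("S =", sum(subtotals))
```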

Variance and Z-Score

With no ties, the variance simplifies to:

$$\text{Var}(S) = \frac{8 \times 7 \times (2 \times 8 + 5)}{18} = \frac{8 \times 7 \times 21}{18} = \frac{1176}{18} \approx 65.33$$

The standard deviation is \(\sqrt{65.33} \approx 8.08\). Applying the continuity correction since \(S > 0\):

$$Z = \frac{S - 1}{\sqrt{\text{Var}(S)}} = \frac{22 - 1}{8.08} = \frac{21}{8.08} \approx 2.60$$
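The same arithmetic in Python, as a sketch under the no-ties assumption. The function names `mk_variance_no_ties` and `mk_z` are illustrative.

```python
import math

def mk_variance_no_ties(n):
    """Var(S) for n observations with no tied values."""
    return n * (n - 1) * (2 * n + 5) / 18

def mk_z(s, var_s):
    """Standardized statistic with the +/-1 continuity correction."""
    if s > 0:
        return (s - 1) / math.sqrt(var_s)
    if s < 0:
        return (s + 1) / math.sqrt(var_s)
    return 0.0

var_s = mk_variance_no_ties(8)
z = mk_z(22, var_s)
print(f"Var(S) = {var_s:.2f}, Z = {z:.2f}")
```

Keeping the three-way branch explicit makes it harder to forget that the correction always moves S toward zero.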

Significance and Conclusion

For a two-tailed test against the null hypothesis of no trend, compare \(|Z|\) to the standard normal critical values:

  • \(\alpha = 0.10\): critical \(|Z| = 1.645\)
  • \(\alpha = 0.05\): critical \(|Z| = 1.960\)
  • \(\alpha = 0.01\): critical \(|Z| = 2.576\)

Computed \(|Z| = 2.60 > 2.576\), so reject the null hypothesis at \(\alpha = 0.01\). The two-tailed p-value is approximately 0.0094. The MW-3 TCE record shows a statistically significant increasing trend at the 99% confidence level over the 8-quarter period.
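The p-value lookup can be done without a table. A minimal sketch using only the standard library follows; it relies on the identity that the two-tailed normal p-value equals erfc(|Z|/√2), and `two_tailed_p` is a name chosen for this example.

```python
import math

def two_tailed_p(z):
    """Two-tailed p-value for a standard normal statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

# S = 22, n = 8, no ties, with the continuity correction applied.
z = (22 - 1) / math.sqrt(8 * 7 * 21 / 18)
p = two_tailed_p(z)

for alpha in (0.10, 0.05, 0.01):
    verdict = "reject" if p < alpha else "fail to reject"
    print(f"alpha = {alpha:.2f}: {verdict} the no-trend hypothesis")
print(f"p = {p:.4f}")
```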

Kendall’s tau, a measure of trend strength, is:

$$\tau = \frac{S}{n(n-1)/2} = \frac{22}{28} \approx 0.79$$

Tau ranges from −1 (perfect decreasing) to +1 (perfect increasing). 0.79 is a strong positive trend — consistent with the visual impression but now defensible.

Sanity Checks

  1. Direction matches the data. Q1 = 12, Q8 = 25; the trend is upward and \(S\) is positive. Sign check passes.
  2. n is adequate. The normal approximation is recommended for \(n > 10\); at \(n = 8\) we’re technically in “use the exact table” territory. Cross-check against the exact distribution: for \(n=8\) with no ties, the probability of \(S \geq 22\) under the null is 111/40320 ≈ 0.0028 one-tailed, or about 0.0055 two-tailed. The normal approximation (p ≈ 0.0094) is the more conservative of the two here, and both results are significant at \(\alpha = 0.01\).
  3. No seasonal cycle. Visual inspection: August results (Q3, Q7) are not systematically higher or lower than February results (Q1, Q5). If they were, you’d need the seasonal Kendall test instead, which blocks the data by season before computing S.
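The exact cross-check in item 2 does not require a printed table. Under the null hypothesis with no ties, every ordering of the n values is equally likely, so the tail probability is a count of permutations. The sketch below counts them by number of discordant pairs; `exact_upper_tail` is a name invented for this example, and the approach assumes untied data.

```python
from math import factorial

def exact_upper_tail(n, s_obs):
    """P(S >= s_obs) under the null for n untied observations.

    A permutation with d discordant pairs (inversions) has
    S = n(n-1)/2 - 2d, so S >= s_obs means d <= (pairs - s_obs) / 2.
    """
    # counts[d] = number of orderings of the first m values with d inversions
    counts = [1]
    for m in range(2, n + 1):
        new = [0] * (len(counts) + m - 1)
        for d, c in enumerate(counts):
            for extra in range(m):  # placing the m-th value adds 0..m-1 inversions
                new[d + extra] += c
        counts = new
    pairs = n * (n - 1) // 2
    max_discordant = (pairs - s_obs) // 2
    favorable = sum(counts[: max_discordant + 1])
    return favorable / factorial(n)

one_tailed = exact_upper_tail(8, 22)
print(f"one-tailed: {one_tailed:.4f}, two-tailed: {2 * one_tailed:.4f}")
```

Doubling the one-tailed value is valid because the null distribution of S is symmetric about zero.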

Where the Test Breaks

  • Strong seasonality. If concentrations cycle annually (common for nitrate, often for chloride from road salt), the standard MK test conflates seasonal cycles with trends. Use the seasonal Kendall test, which computes S within each season and pools.
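  • Heavy non-detect censoring. If >50% of results are non-detect, the rank-based result becomes dominated by ties at the reporting limit and loses power. Use a trend method designed for left-censored data, such as a censored-data version of Kendall’s tau (the Akritas-Theil-Sen approach). Tarone-Ware and generalized Wilcoxon tests also handle censoring, but they compare groups rather than test for a trend over time.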
  • Step changes (not monotonic). If a remediation system started up midway through the record, you have two regimes, not a monotonic trend. Mann-Kendall will report “decreasing” but the underlying pattern is a step. Run the test on each regime separately.
  • Serial autocorrelation. Quarterly groundwater data is often autocorrelated, and positive lag-1 correlation inflates the apparent significance. For autocorrelated series, apply a variance correction (Hamed and Rao, or Yue and Wang) or pre-whiten the series before computing Z.
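To make the seasonal Kendall idea from the first bullet concrete, here is a minimal sketch that computes S and its variance within each season and sums them before standardizing. It assumes evenly spaced samples with no gaps, no ties, and independence between seasons; the function name `seasonal_kendall` is illustrative.

```python
import math
from itertools import combinations

def sgn(d):
    """Sign of a difference: +1, -1, or 0 for a tie."""
    return (d > 0) - (d < 0)

def seasonal_kendall(values, period):
    """Sum S and Var(S) season by season, then standardize the total.

    Assumes evenly spaced sampling with no gaps, no ties, and no
    correlation between seasons.
    """
    s_total, var_total = 0, 0.0
    for season in range(period):
        block = values[season::period]  # same season, successive years
        n = len(block)
        s_total += sum(sgn(b - a) for a, b in combinations(block, 2))
        var_total += n * (n - 1) * (2 * n + 5) / 18
    if s_total > 0:
        z = (s_total - 1) / math.sqrt(var_total)
    elif s_total < 0:
        z = (s_total + 1) / math.sqrt(var_total)
    else:
        z = 0.0
    return s_total, var_total, z

tce = [12, 15, 13, 18, 16, 22, 20, 25]  # MW-3, four samples per year
s, var_s, z = seasonal_kendall(tce, period=4)
print(s, var_s, z)
```

On the MW-3 record each season contributes only one year-over-year pair, so the pooled statistic is S = 4 with Z = 1.5. That shows the mechanics, but two years per season is far too short for the seasonal test to have useful power.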

Tools That Automate This

EPA’s ProUCL software implements Mann-Kendall, seasonal Kendall, and Theil-Sen estimators with appropriate handling for non-detects. It’s free and EPA-vetted — the right choice when you need defensibility but don’t want to defend hand calculations to a reviewer. The R package Kendall and the Python pymannkendall library implement the same tests if you’re scripting an analysis pipeline.

For groundwater data already organized in your monitoring database, the workflow is: pull the location-analyte time series, filter to the relevant period, handle non-detects per the project’s data validation conventions (covered in handling non-detects in environmental data gaps), run the test, and document the result with the S statistic, n, Z, and p-value as part of the trend narrative. The full data-management context for this is covered in groundwater monitoring data management from field to submission.
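A scripted version of that workflow might look like the sketch below. Everything about the record layout is hypothetical: the tuple fields, the extra MW-1 and 2023 rows, and the function names are made up for illustration, and the non-detect step is deliberately left as a stop rather than a guess at your project’s convention. The Z and p values use the normal approximation, so for \(n \leq 10\) treat them as a screening result and confirm against the exact distribution.

```python
import math
from collections import Counter
from datetime import date
from itertools import combinations

# Hypothetical rows: (location, analyte, sample date, result in µg/L, qualifier)
raw = [
    ("MW-1", "TCE", date(2024, 2, 15), 3.0, ""),   # other well, filtered out
    ("MW-3", "TCE", date(2023, 11, 14), 11.0, ""),  # before the period, filtered out
    ("MW-3", "TCE", date(2024, 2, 15), 12.0, ""),
    ("MW-3", "TCE", date(2024, 5, 20), 15.0, ""),
    ("MW-3", "TCE", date(2024, 8, 12), 13.0, ""),
    ("MW-3", "TCE", date(2024, 11, 18), 18.0, ""),
    ("MW-3", "TCE", date(2025, 2, 14), 16.0, ""),
    ("MW-3", "TCE", date(2025, 5, 19), 22.0, ""),
    ("MW-3", "TCE", date(2025, 8, 11), 20.0, ""),
    ("MW-3", "TCE", date(2025, 11, 17), 25.0, ""),
]

def select_series(rows, location, analyte, start, end):
    """Filter to one location-analyte pair and period, ordered by sample date."""
    picked = [r for r in rows
              if r[0] == location and r[1] == analyte and start <= r[2] <= end]
    return sorted(picked, key=lambda r: r[2])

def mann_kendall(values):
    """Return S, tie-corrected Var(S), continuity-corrected Z, two-tailed p."""
    n = len(values)
    s = sum((b > a) - (b < a) for a, b in combinations(values, 2))
    ties = sum(t * (t - 1) * (2 * t + 5) for t in Counter(values).values())
    var_s = (n * (n - 1) * (2 * n + 5) - ties) / 18
    z = 0.0 if s == 0 else (s - math.copysign(1, s)) / math.sqrt(var_s)
    p = math.erfc(abs(z) / math.sqrt(2))
    return s, var_s, z, p

series = select_series(raw, "MW-3", "TCE", date(2024, 1, 1), date(2025, 12, 31))
if any(r[4] == "U" for r in series):
    # Placeholder: substitute the project's documented non-detect convention.
    raise NotImplementedError("apply the project's non-detect convention first")

values = [r[3] for r in series]
s, var_s, z, p = mann_kendall(values)
print(f"MW-3 TCE: n = {len(values)}, S = {s}, Z = {z:.2f}, p = {p:.4f}")
```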

The trend result feeds the same compliance narrative that an exceedance table or a TMDL load calculation does. For pollutant-load context where trend results often appear, see calculating pollutant loading for TMDL reports.