
Sample Size for Process Capability Studies: Minimum Requirements for Reliable Cpk

The auditor asks for Cpk on a key characteristic. You ran a process capability study with 30 measured parts from the qualification run. You compute Cpk = 1.45, write it on the PPAP form, and submit. The customer reads 1.45 as a true value. The actual 95% confidence interval, with n = 30, is roughly [1.07, 1.83]—wide enough that the underlying capability could plausibly be below the 1.33 minimum or comfortably above the 1.67 automotive threshold. The point estimate told the auditor what they wanted to hear and obscured what they actually needed to know.

Sample size for a capability study is the question of how much certainty you owe the next reader of the Cpk number. This guide gives the formula behind the certainty, three standards-based recommendations (AIAG, ISO 22514, Bosch), worked numbers for the most common targets, and a pragmatic section for shops where the textbook minimums are not achievable.

What Sample Size Is Actually Buying You

The point estimate of Cpk is reasonably stable past about 30 measurements. What changes with sample size is the confidence interval around that estimate. A Cpk of 1.45 from 30 parts and a Cpk of 1.45 from 200 parts are not the same number when reported to a customer—the second one is a much tighter claim about the process.

So the right question for setting sample size is not “what’s the minimum?” It is: “how tight does the CI on Cpk need to be for this report?”

Process Capability Study Sample Size: The Underlying Formula

The Bissell (1990) asymptotic approximation gives the standard error of the Cpk estimate as:

$$SE(\hat{C}_{pk}) = \hat{C}_{pk} \cdot \sqrt{\frac{1}{9n\hat{C}_{pk}^2} + \frac{1}{2(n-1)}}$$

For a 95% confidence interval, the half-width is approximately:

$$w \approx 1.96 \cdot SE(\hat{C}_{pk})$$

where n is the total number of measurements and \(\hat{C}_{pk}\) is the point estimate. The formula assumes the data is approximately normal and the process is in statistical control; both are prerequisites to check before the calculation has any meaning. Gage R&R must also be acceptable; otherwise part of the variation being measured is measurement noise rather than process variation.

Plain-English reading: the first term inside the square root captures uncertainty about how off-center the process is; the second captures uncertainty about the process standard deviation. Both shrink with larger n, but at practical capability levels the σ term dominates (at Cpk = 1.33 it is roughly eight times the centering term), so uncertainty about σ is what sets the CI width.
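The calculation is easy to script. A minimal sketch in Python, standard library only; `cpk_ci` is a name invented here, not from any package:

```python
import math

def cpk_ci(cpk_hat: float, n: int, z: float = 1.96) -> tuple[float, float, float]:
    """Bissell approximation: return (SE, CI lower, CI upper) for an estimated Cpk."""
    # First term under the root: centering uncertainty; second term: sigma uncertainty
    se = cpk_hat * math.sqrt(1.0 / (9.0 * n * cpk_hat**2) + 1.0 / (2.0 * (n - 1)))
    return se, cpk_hat - z * se, cpk_hat + z * se

# Reproduces the worked table in the next section at Cpk-hat = 1.33
for n in (30, 50, 100, 125, 200):
    se, lo, hi = cpk_ci(1.33, n)
    print(f"n={n:3d}  SE={se:.3f}  95% CI=[{lo:.2f}, {hi:.2f}]")
```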

Worked Numbers at Cpk = 1.33

Plugging \(\hat{C}_{pk}\) = 1.33 (the universal minimum) into the formula at four common sample sizes:

| n | SE | 95% CI half-width | 95% CI | Practical implication |
|---|----|-------------------|--------|-----------------------|
| 30 | 0.185 | ±0.36 | [0.97, 1.69] | Lower bound below the 1.33 minimum; cannot conclude the process is capable |
| 50 | 0.142 | ±0.28 | [1.05, 1.61] | Lower bound still below 1.33; capability not demonstrated |
| 100 | 0.100 | ±0.20 | [1.13, 1.53] | Lower bound below 1.33 but close; commonly accepted by automotive customers |
| 125 | 0.090 | ±0.18 | [1.15, 1.51] | Bosch-style process capability sample size (25 subgroups of 5) |
| 200 | 0.071 | ±0.14 | [1.19, 1.47] | Tight enough to claim 1.33 capability with good margin |

The 30-piece study, common in PPAP and quick qualification runs, produces an estimate but does not statistically demonstrate Cpk ≥ 1.33 at 95% confidence. Customers know this; they accept 30-piece studies because the alternative (waiting for 100+ parts) is operationally expensive, not because the math has changed. Read 30-piece Cpk values as point estimates, not as guarantees.
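Running the question in reverse, picking the CI width first and finding the sample size, is a short numeric search. A sketch building on the `cpk_ci` helper above; `n_for_half_width` is likewise an invented name:

```python
def n_for_half_width(cpk_hat: float, target_w: float, z: float = 1.96) -> int:
    """Smallest n whose CI half-width on Cpk is at or below target_w."""
    n = 3
    while z * cpk_ci(cpk_hat, n, z)[0] > target_w:
        n += 1
    return n

print(n_for_half_width(1.33, 0.20))  # 97 -- consistent with the ~100-part guideline
print(n_for_half_width(1.33, 0.14))  # 197 -- about 200 for a +/-0.14 interval
```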

What the Standards Recommend

Three reference documents converge on similar numbers from different angles:

| Source | Short-term / machine capability | Long-term / process capability | Rationale |
|--------|---------------------------------|--------------------------------|-----------|
| AIAG SPC Reference Manual (3rd ed.) | 30+ parts (initial study) | 100+ parts across multiple subgroups | 30 establishes a working point estimate; 100 across subgroups captures between-subgroup variation |
| ISO 22514-2 (process performance) | 50–100 parts depending on Cpk target | 125+ parts | Sample size tied to confidence-interval width; explicit treatment of the CI |
| Bosch Machine and Process Capability booklet | 50 parts (machine capability, Cmk ≥ 1.67) | 125 parts across 25 subgroups of 5 (process capability, Cpk ≥ 1.33) | Long-standing automotive convention; the 25 × 5 structure spreads sampling over time to capture between-subgroup variation |

The pattern: 30 parts is a floor for an initial estimate, 100–125 parts is the working target for a defensible long-term claim, and beyond 200 the returns diminish unless process drift is the main concern (in which case additional time-spaced subgroups matter more than total n). Customer-specific requirements (Ford Q1, GM BIQS) sometimes specify 300+ parts for safety-critical characteristics.

Subgroup Size vs Total n: Which Lever to Pull

For a long-term Pp/Ppk study, you collect k subgroups of n measurements each, total = k · n. The two levers trade off differently:

  • Increasing n (subgroup size) tightens the within-subgroup σ estimate. Useful when within-subgroup variation is the main process noise (e.g., machining within one cycle).
  • Increasing k (number of subgroups) tightens the between-subgroup variation estimate. Useful when shifts and drifts (between operator, between shift, between material lot) are the dominant source of variation, which is the usual case for long-term Pp/Ppk.

The Bosch convention of 25 × 5 (25 subgroups of 5) reflects this: most of the budget goes into subgroup count to capture between-subgroup variation, with a modest within-subgroup sample. A 5 × 25 study (5 subgroups of 25) of the same total n = 125 would underestimate process variation because the 5 subgroups don’t span enough time. X-bar and R chart construction covers the rational subgrouping logic in more depth.
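The difference the two levers make shows up directly in the two sigma estimates. A simulated sketch with NumPy; the drift magnitudes, spec limits, and seed are all invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
# 25 subgroups of 5, with the subgroup means drifting between subgroups
subgroup_means = rng.normal(10.0, 0.05, size=25)
data = rng.normal(subgroup_means[:, None], 0.10, size=(25, 5))

d2 = 2.326                                           # control-chart constant for n = 5
sigma_within = np.ptp(data, axis=1).mean() / d2      # R-bar / d2: short-term sigma
sigma_overall = np.std(data, ddof=1)                 # all 125 points: long-term sigma

usl, lsl = 10.5, 9.5                                 # hypothetical spec limits
m = data.mean()
cpk = min(usl - m, m - lsl) / (3 * sigma_within)     # capability (within-subgroup sigma)
ppk = min(usl - m, m - lsl) / (3 * sigma_overall)    # performance (total variation)
print(f"sigma_within={sigma_within:.4f}  sigma_overall={sigma_overall:.4f}")
print(f"Cpk={cpk:.2f}  Ppk={ppk:.2f}")               # Ppk < Cpk when subgroups drift
```

More subgroups (larger k) sharpen the `sigma_overall` estimate behind Ppk; a wider time span, not a bigger within-subgroup n, is what exposes the drift term.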

When the Data Is Not Normal

The Cpk formula and its CI assume the underlying distribution is approximately normal. For skewed or bounded data (concentricity, surface finish, time-to-event measurements), the standard formula overstates capability on the heavy-tailed side (the true 99.865th percentile sits farther out than mean + 3σ) and understates it on the light-tailed side.

Three options when normality fails:

  1. Box-Cox or Johnson transformation, then compute Cpk on the transformed data. The standard approach in AIAG-aligned QMS programs.
  2. Percentile-based capability (the ISO 22514-2 method): use the 0.135 and 99.865 percentiles directly from the data instead of mean ± 3σ. Requires more data—n = 200+ for stable percentile estimates.
  3. Bootstrap CI: resample to get the CI empirically. Distribution-free but computationally heavier; supported in R (qcc, boot packages) and most statistical software.

For non-normal data, the percentile or bootstrap approaches need roughly double the sample size of the parametric formula above to reach equivalent CI width. A 100-part study under normality assumptions becomes a 200-part study under bootstrap. Plan accordingly when the process produces non-normal output.
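A sketch of the bootstrap option (3) with NumPy; the skewed sample, spec limits, and function names are invented for illustration:

```python
import numpy as np

def cpk_point(x: np.ndarray, lsl: float, usl: float) -> float:
    """Conventional Cpk from the sample mean and sample standard deviation."""
    m, s = x.mean(), x.std(ddof=1)
    return min(usl - m, m - lsl) / (3.0 * s)

def cpk_bootstrap_ci(x, lsl, usl, n_boot=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap: resample with replacement, recompute Cpk, take percentiles."""
    rng = np.random.default_rng(seed)
    boots = [cpk_point(rng.choice(x, size=x.size, replace=True), lsl, usl)
             for _ in range(n_boot)]
    return np.percentile(boots, [100 * alpha / 2, 100 * (1 - alpha / 2)])

# Invented skewed characteristic: gamma-distributed, n = 200
x = np.random.default_rng(1).gamma(shape=9.0, scale=0.1, size=200)
lo, hi = cpk_bootstrap_ci(x, lsl=0.3, usl=1.5)
print(f"bootstrap 95% CI on Cpk: [{lo:.2f}, {hi:.2f}]")
```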

What If You Only Have 20 Parts

Low-volume and high-mix shops can’t hit 100-piece samples for every characteristic on every part. Practical guidance for small-n capability:

  • Report the CI alongside the point estimate. If Cpk = 1.50 from n = 20 with a CI of [1.00, 2.00], say so (see the snippet after this list). The customer’s decision changes when they see the CI.
  • Pool data across runs of the same part. If you make 20 of the same part five times a year, the cumulative 100 parts can support a Pp/Ppk study even if no single run had 100. Document the pooling assumption (same setup, same tooling, same material).
  • Use short-run SPC techniques. Standardized variables (Z-charts) and target Cpk indices let you combine measurements from different parts that share a process. AIAG’s short-run SPC chapter is the starting reference.
  • Negotiate the sample size with the customer. Some PPAP-equivalent submissions accept smaller samples on a documented basis (engineering judgment, low part complexity). Ford Q1 and GM BIQS both have provisions for this.
  • Use control charting to demonstrate stability over time. A stable I-MR chart over 50 parts spread across multiple shifts can substitute for a 100-part Pp/Ppk in some customer contexts. Stability is part of what large samples are demonstrating.
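For the first bullet, the `cpk_ci` sketch from the formula section is all the spreadsheet math required; this reproduces the n = 20 numbers above:

```python
se, lo, hi = cpk_ci(1.50, 20)
print(f"Cpk = 1.50 (n = 20), 95% CI = [{lo:.2f}, {hi:.2f}]")  # [1.00, 2.00]
```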

Decision Summary

  • 30 parts: a point estimate. Useful for go/no-go decisions on PPAP if the customer accepts it. Not a defensible claim of capability.
  • 50–75 parts: tighter point estimate, still wide CI at Cpk ≈ 1.33. Reasonable for internal process monitoring.
  • 100 parts: working minimum for a long-term Pp/Ppk claim per AIAG. CI width around ±0.20 at the 1.33 threshold.
  • 125 parts (25 subgroups of 5): Bosch-style process capability standard; tight CI; widely accepted in automotive.
  • 200+ parts: needed for non-normal data treated parametrically, or when the customer requires high confidence at the 1.67 threshold.

The SPC control chart calculator on this site computes Cpk and Ppk from CSV data and reports the 95% CI alongside the point estimate using the Bissell approximation above. Knowing the CI is half the practitioner’s job; the tool makes it visible without spreadsheet math.

For deeper reading on the standards: ISO 22514-2:2017 (process performance) and the AIAG SPC Reference Manual are the primary sources. The Bosch Machine and Process Capability booklet (publicly available as Bosch Booklet No. 9) is the most accessible practitioner reference for the 25×5 sampling convention.