Need a Suitable Test Strategy for High Quality Products? Graduate from FDFD to RCCA

Network connections to unit under test

Testing electro-mechanical-optical products is pretty simple, right? There’s an engineering spec for how the unit is supposed to work, so in test you run it, and if it doesn’t do what it’s supposed to, it fails. Then you either retest it to see whether it will pass the next time, or send it to the rework area for someone to fix. Testing is often treated as Non-Value-Add (NVA), since the unit of product is not transformed (no “value” is added) by the act of testing it. If testing is NVA, Lean manufacturing says we should quit doing it, right? And if no test is performed in the process, no defective units are screened out, and voilà, everything can be shipped – the manufacturer’s dream come true!

     There’s a lot of defective thinking to unpack in the previous paragraph, so let’s get to it. In this post we examine manufacturing test from the perspective of a plant management team that wants to know, first of all, that what comes off the manufacturing line is acceptable to customers and, if the team has embraced the organizational learning and process improvement philosophy of Lean Six Sigma, that it is better today than it was yesterday.

Background

Any test is fundamentally a measurement of something, accompanied by a comparison of that measurement against an expectation or standard. We do production testing for several reasons: to identify defects in final product or subassemblies; to provide data on certain parameters suspected to be leading indicators of downstream defects; and to provide data believed to be useful for controlling and improving the process. Whether we are concerned about any of these reasons for testing depends greatly on the nature of the product, market attributes, and customer expectations.

Let’s start with a couple of definitions.

Defect: The operational definition often used is “A defect is any discrepancy which, if we know of its existence in a unit of product, will cause us to remove that unit from the production lot so that it will not be shipped to a customer.”

Failure: A test result that does not meet the expectations for the product.

Test Strategy: A description of the testing approach to be used for a product, based on the product’s characteristics, the anticipated customers’ expectations for product performance, and the quality, cost, and reliability targets for the product.

Stage 1 Test Strategy – Find Defect, Fix Defect (FDFD)

This most basic test strategy does what the introductory paragraph suggests – run the product briefly in its normal configuration, make one or a few measurements or observations, and disposition each unit as either Pass or Fail. In certain product and business environments, with low customer expectations and low cost being the primary concern, the FDFD test strategy may be perfectly acceptable.

     In my backyard, I have a few novelty ornaments consisting of a 5-inch by 10-inch painted metal sculpture with a mounting post and an attached enclosure containing a solar cell, a small circuit board, and a battery. (One is a dog driving a pickup truck with LED-lighted wheels – sounds tacky, but my favorite 2-year-old thinks it’s wonderful!) Each one cost me about $15.00; it cost the manufacturer probably 7 or 8 dollars including their marketing overhead and the shipping costs. There are only a few possible defects, such as – solar cell doesn’t charge battery in daylight; battery doesn’t light the LEDs at night; weld holding electronic enclosure onto post breaks; not much else.

I would surmise the novelty ornament manufacturer does nothing more than FDFD, and probably spends only one cycle on fixing; if the first attempt to fix doesn’t make it work, they scrap it. Or maybe ship it anyway, for all I know. I have one that arrived non-functional, and I didn’t bother to send it back. For this manufacturer and product line, any test strategy more elaborate than FDFD would be wasteful and inappropriate.

Novelty backyard ornament, electronic solar lighted metal sculpture

Stage 2 Test Strategy – Track Daily Production Yield

The second stage test strategy addresses many consumer products, for which sufficient value is invested in materials and assembly labor that the manufacturer wants to prevent defects from escaping the production process, and still produce at a profit. The marketplace and customers have an expectation that the product will work when they receive it, and not require repair for a nominal lifetime of a few years. A Stage 2 test strategy identifies defective units of product and also reports passing and failing test results using some type of database – probably using an electronic factory control network if they expect to stay in business for many years.

     Collecting test data in this manner allows monitoring test yield (a rudimentary quality metric), usually daily, but possibly hourly or weekly, based on production rate. Simply stated, yield is the number of units that pass the required tests divided by the number of units tested in a time period, expressed as a percent. If you test 652 units and 611 units pass, the yield is 611 / 652 = 93.7%.
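
The yield arithmetic can be sketched in a few lines of Python. This is only an illustration of the definition above; the function name `yield_percent` is an assumption made for the example, not part of any particular factory system.

```python
def yield_percent(passed: int, tested: int) -> float:
    """Return test yield: units passed divided by units tested, as a percent."""
    if tested <= 0:
        raise ValueError("tested must be a positive count")
    if not 0 <= passed <= tested:
        raise ValueError("passed must be between 0 and tested")
    return 100.0 * passed / tested


# The example from the text: 652 units tested, 611 passed.
daily_yield = yield_percent(611, 652)
print(f"Yield = {daily_yield:.1f}%")  # prints: Yield = 93.7%
```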

     Tracking the daily yield on a line graph will give a visual indication of the variability in the process – consistent day-to-day or getting better or getting worse. If there are multiple products running through the assembly and test process during the course of a month, it is beneficial to track performance of each product separately. Product identification in the test record allows the database to be sorted for individual products. Production unit serialization allows monitoring first-run tests separately from second or later tests of first-time failures, to check on rework effectiveness, for example.
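
As a rough sketch of how product identification and serialization support this, the Python below separates first-run results from retests for each product. The record layout, serial numbers, and product names are all hypothetical, and the sketch assumes the records arrive in chronological order.

```python
from collections import defaultdict

# Hypothetical test records in chronological order:
# (serial number, product, passed?)
records = [
    ("SN001", "Widget-A", True),
    ("SN002", "Widget-A", False),
    ("SN002", "Widget-A", True),   # retest after rework
    ("SN003", "Widget-B", False),
    ("SN003", "Widget-B", False),  # retest failed again
    ("SN004", "Widget-B", True),
]


def yields_by_product(records):
    """Return {product: {"first_run": percent, "retest": percent or None}}."""
    seen = set()  # serial numbers that have already been tested once
    tally = defaultdict(lambda: {"first_run": [0, 0], "retest": [0, 0]})
    for serial, product, passed in records:
        kind = "retest" if serial in seen else "first_run"
        seen.add(serial)
        counts = tally[product][kind]  # [passed, tested]
        counts[0] += int(passed)
        counts[1] += 1

    def pct(passed, tested):
        return round(100.0 * passed / tested, 1) if tested else None

    return {
        product: {kind: pct(*counts) for kind, counts in kinds.items()}
        for product, kinds in tally.items()
    }


summary = yields_by_product(records)
for product, result in summary.items():
    print(product, result)
```

A low retest yield for one product, compared with its first-run yield, is the kind of signal that points at rework effectiveness rather than at the assembly process itself.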

Stage 3 Test Strategy – Track Daily Test Results by Failure Description

As the business gets comfortable with the Stage 2 test strategy, the next evolutionary step is to identify, separate, and monitor test failures based on unique failure descriptions contained in the test records. In Stage 2, we were only concerned whether each unit passed or failed. In Stage 3, our additional concern is to understand the reasons for test failures.

     A production test system may actually run multiple test cases on each unit of product, with each test case checking performance of a specific subset of the product. If a unit fails test, the test system reports whether the test failed when running case A, case B, case C, or … case Z, and provides some details such as the measured parameter value that failed and its associated test limit. The test database includes that additional information, and the production response to failure may vary based on which test case identified the failure. A well-structured test system also reports the values of critical measurements for units which pass all tests, so that a comparison of passing and failing units is possible.
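
One way to picture such a test record is sketched below in Python. The field names, parameter names, and limits are assumptions made for illustration; a real test system will define its own schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Measurement:
    test_case: str      # e.g. "A", "B", ... "Z"
    parameter: str      # name of the measured parameter
    value: float        # measured value
    low_limit: float    # lower test limit
    high_limit: float   # upper test limit

    @property
    def passed(self) -> bool:
        return self.low_limit <= self.value <= self.high_limit


@dataclass
class UnitRecord:
    serial: str
    product: str
    measurements: List[Measurement] = field(default_factory=list)

    @property
    def passed(self) -> bool:
        # A unit passes only if every recorded measurement is within limits.
        return all(m.passed for m in self.measurements)

    def first_failure(self) -> Optional[Measurement]:
        # The first out-of-limit measurement, or None if the unit passed.
        return next((m for m in self.measurements if not m.passed), None)


# Hypothetical unit: passes case A, fails case B on the upper limit.
record = UnitRecord("SN002", "Widget-A", [
    Measurement("A", "supply_voltage_V", 5.02, 4.75, 5.25),
    Measurement("B", "led_current_mA", 23.4, 10.0, 20.0),
])
failure = record.first_failure()
print(record.passed, failure.test_case, failure.value, failure.high_limit)
```

Because measured values are stored for passing cases as well as failing ones, the same records support the pass-versus-fail comparison described above.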

     With failure data on each production unit, it is possible to augment daily yield reporting by including the number of failed units, the percentage of failure against total tests, and percentage of failures for each identified descriptor. Continuing the example in Stage 2, and assuming 18 units failed for test case A:

     Yield = 611 / 652 = 93.7%
     Failure rate overall = 41 / 652 = 6.3%
     Failure rate for case A (total tested) = 18 / 652 = 2.8% of tested units
     Failure rate for case A (failures only) = 18 / 41 = 43.9% of failures

     Tracking failure data in this manner for cases B through Z provides a more complete picture of product and process performance than simply first-pass or overall yield, and is a starting point for process improvement actions.
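
The same breakdown can be computed for every test case at once. The sketch below is illustrative only: the 18 failures for case A come from the example above, while the counts for cases B through E are invented so that the total matches the 41 failed units. It also assumes each failed unit is attributed to exactly one test case.

```python
def failure_breakdown(tested, failures_by_case):
    """Summarize yield, overall failure rate, and per-case failure rates."""
    failed = sum(failures_by_case.values())
    summary = {
        "yield_pct": round(100.0 * (tested - failed) / tested, 1),
        "failure_pct": round(100.0 * failed / tested, 1),
        "by_case": {},
    }
    for case, count in failures_by_case.items():
        summary["by_case"][case] = {
            "pct_of_tested": round(100.0 * count / tested, 1),
            "pct_of_failures": round(100.0 * count / failed, 1) if failed else 0.0,
        }
    return summary


# Case A is from the text; cases B-E are hypothetical and sum with A to 41.
failures = {"A": 18, "B": 10, "C": 7, "D": 4, "E": 2}
report = failure_breakdown(652, failures)

print(report["yield_pct"], report["failure_pct"])  # prints: 93.7 6.3
print(report["by_case"]["A"])
# prints: {'pct_of_tested': 2.8, 'pct_of_failures': 43.9}
```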

Stage 4 Test Strategy – Pareto Analysis Guides Root Cause and Corrective Action (RCCA)

At this point we need to discuss the relationship between failures and defects. Simply stated, a Failure is a test process expectation that was not met for the unit under test. A Defect is the underlying or root cause of the Failure, and is usually the result of one or more components whose variation exceeds expectations, or a process operation improperly performed, or a test measurement anomaly.

     Although some improvement is possible by addressing only the failure, in order to significantly reduce or eliminate failures, it is necessary to understand and remove the cause of the underlying defects. An analogy: If you fall while playing basketball on your lunch hour, and then you have a sore arm, you can take a pain reliever to alleviate the soreness (the failure). But if the pain persists, you will need to address the underlying cause, which may actually be a fractured bone (the defect) requiring proper setting and a cast to heal.

Technicians testing a complex server

Complex products have many possible failures, and each failure may have many defects as potential root causes. This presents a golden opportunity to learn how to employ Pareto analysis at two levels. Pareto diagrams are bar charts of attribute (category) data organized from highest to lowest rate of occurrence, along with a line graph of the cumulative rate from 0 to 100%. (See the example nearby.)

Reviewing a Pareto of test failures for a period of time (day, week, month) is a good first step toward process improvement, since it shows the failure ranking much more clearly than a table of numbers or a running record of test results. Focus attention on the highest-ranked failure first. Further analysis of the units that exhibit that failure will be required, possibly including diagnostics that reveal more detailed data on parameters not directly tested in production. Ultimately, a second-level Pareto diagram is needed, showing the defects associated with that failure. Those defects are then analyzed further to determine probable root causes for their occurrence. Knowing the root causes, it is possible to implement corrective actions to eliminate, or at least reduce, those defects.
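
The data behind a Pareto diagram is simple to build: sort the categories by count and accumulate the percentages. The Python sketch below shows one way to do it. Only the count of 18 for case A comes from the earlier example; the other counts are hypothetical, chosen to total 41 failures.

```python
def pareto_table(counts):
    """Return rows of (category, count, percent, cumulative percent),
    ranked from highest to lowest count."""
    total = sum(counts.values())
    if total == 0:
        return []
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    rows = []
    running = 0
    for category, count in ranked:
        running += count
        rows.append((
            category,
            count,
            round(100.0 * count / total, 1),
            round(100.0 * running / total, 1),
        ))
    return rows


# Failures by test case; A is from the text, the rest are hypothetical.
failure_counts = {"C": 7, "A": 18, "E": 2, "B": 10, "D": 4}
rows = pareto_table(failure_counts)
for category, count, pct, cum_pct in rows:
    print(f"{category:<4}{count:>4}{pct:>7.1f}%{cum_pct:>8.1f}%")
```

The same function serves at the second level: feed it the counts of defects found among the units that exhibited the top-ranked failure, and the result ranks the defects to investigate for root cause.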

Sound difficult? It is definitely not as easy as guessing at a “fix” for the failure on the assumption that the failure is the defect, just as taking Tylenol to “fix” your arm pain is easier than going to a doctor for a full diagnosis that discovers the bone fracture.

Failure Pareto diagram.

This level of analysis takes dedication and practice, and requires learning and applying problem-solving and continuous improvement skills across the realm of the product family and the process. Test developers, test operations support engineers, process engineers, and production technicians all have contributions to make in the Test Strategy Stage 4 environment.

The Value of Test

This post is intended to provide a high-level map for moving to a comprehensive test strategy consistent with high value, high quality products. Test is more than sorting defective product and pointing the way to a repair action to fix individual unit failures. Properly implemented, production test provides valuable feedback to the assembly process and product engineers, leading the way to refinements in current products and major improvements to future product designs.

The VP of Engineering at my primary test equipment supplier taught me long ago:

“The true value of test is realized by developing people’s ability to turn data into actionable information, and thereby drive ongoing process improvement.”

Dann Gustavson, PMP®, Lean Six-Sigma Black Belt, helps Program Managers and their teams achieve superior results through high-impact program execution. Prepare, structure, and run successful programs in product engineering, manufacturing operations (including outsourcing), and cross-functional change initiatives.

Contact Dann@Lean6SigmaPM.com.