Profits up when dead boards go down

07 May 2006

A major concern for both OEMs and contract manufacturers is excessive board scrap, which cuts directly into their profits. Completely 'dead' boards and difficult-to-trace faults are the primary causes, although a low-skilled and mobile labour force also contributes. A cost-effective solution to this problem is needed, as Billy Fenton reports.

A 'dead' board is usually defined as one that shows no signs of life apart from power, but has passed all its structural tests (e.g. boundary scan, ICT, MDA). 'Life' could mean a flashing sequence of LEDs, or a console output. Figure 1 shows an example. In this case, a console output is the first sign of life, so a board with no console output is defined as 'dead'.

Figure 1 also makes clear that over 50% of the board (the parts shown in red) must be operational before the console springs to life. Many types of defect can therefore leave the board 'dead', and once it is 'dead' there is no simple means of diagnosing it.

Difficult to Trace Faults
Boards are increasing in complexity, but adequate diagnostic skills are either not available, or frequent retraining is required because of staff turnover. In addition, most functional tests give little guidance on the cause of a failure, so it can be difficult for the average technician to locate the exact fault efficiently. Simply changing parts until the board passes ('shotgunning') is often the method of choice, and this can lead to future quality problems due to excessive rework.

Diagnosing a 'Dead' Board
This is the most difficult type of problem to diagnose, as there is no indicator of the fault cause. A processor board has a hierarchical structure, in that it boots from the processor downwards. What is needed is a tool that begins analysis and diagnostics from the processor, and then successively checks one device at a time in the same sequence as the board's hierarchical structure. In the case of Figure 1, the test sequence would be:
1. Power
2. CPU
3. Main CPU Bus
4. Memory Controller ASIC
5. SDRAM
6. Flash
7. PCI Bridge
8. PCI Bus
9. Serial UART
10. Serial UART Loopback

A JTAG-based CPU emulator provides such a capability. It connects to the CPU's JTAG port and stops the CPU at reset, which prevents the CPU from 'hanging' as it performs its initial external accesses. Once the emulator is in control, it can execute a sequence of tests written in real CPU code via the JTAG port. It is important to note that this is not normal boundary scan vector testing, but the execution of real CPU software via JTAG.

For the board in Figure 1, the emulator runs tests 1-10 above in order. On a 'dead' board one of these tests will fail, and the emulator returns detailed diagnostics that pinpoint the cause of the failure. Examples of error messages returned by an emulator for a 'dead' board failure include:

Memory Fail @ Address 0
D31 - D16: ++++++++++++
D15 - D00: ++++++++++++1

LAN Loopback Failure
D31 - D16: +++++++++++1111
D15 - D00: +++++++++++++++

FPGA Access Fail @ Address A0000000
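
As an illustration of how a per-data-line report of this kind might be produced, the following Python sketch runs a walking-ones test over a 32-bit data bus and prints one row per 16 lines. It is only a sketch under stated assumptions: the article does not explain the notation in the messages above, so the sketch uses its own markers ('+' for a line that behaved, 'X' for one that did not), and the memory is simulated in software. The names walking_ones_faults, make_stuck_high_memory and format_report are hypothetical and do not come from any emulator's API.

```python
WIDTH = 32  # assumed data bus width, matching the D31..D00 labels above


def walking_ones_faults(write_read, width=WIDTH):
    """Write a single 1 on each data line in turn and compare the read-back.

    `write_read` is any callable that writes a value to one address and
    returns what is read back. Returns the set of misbehaving line indices.
    """
    bad = set()
    for bit in range(width):
        pattern = 1 << bit
        diff = write_read(pattern) ^ pattern  # bits that differ are suspect
        for line in range(width):
            if diff >> line & 1:
                bad.add(line)
    return bad


def make_stuck_high_memory(stuck_bit):
    """Simulated memory location whose data line `stuck_bit` always reads 1."""
    def write_read(value):
        return (value | (1 << stuck_bit)) & 0xFFFFFFFF
    return write_read


def format_report(bad, width=WIDTH):
    """One row per 16 data lines, most significant line first.

    '+' = line behaved, 'X' = line misbehaved (this sketch's own notation).
    """
    rows = []
    for hi in range(width - 1, -1, -16):
        lo = hi - 15
        marks = "".join("X" if b in bad else "+" for b in range(hi, lo - 1, -1))
        rows.append(f"D{hi:02d} - D{lo:02d}: {marks}")
    return "\n".join(rows)


faults = walking_ones_faults(make_stuck_high_memory(stuck_bit=0))
print("Memory Fail @ Address 0")
print(format_report(faults))
```

On real hardware the write_read callable would be replaced by a write and a read issued through the emulator; how that is done depends on the tool and is not covered here.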

An emulator provides the best solution for 'dead' board faultfinding for a number of reasons:
1. Testing starts at the CPU.
2. It stops the CPU at reset, so external CPU accesses do not 'hang' the board.
3. The test sequence follows the board hierarchy, so the failing item is logically isolated.
4. Full diagnostics are returned via the emulator, so identifying the defect cause is simplified.
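
The hierarchical, stop-at-first-failure sequence described above can be sketched in a few lines of Python. This is a minimal illustration rather than any vendor's implementation: the step names are taken from the numbered test sequence for Figure 1, but each test is a simulated placeholder, and run_sequence and make_simulated_tests are hypothetical names. On a real board each placeholder would be a routine executed on the CPU through the emulator.

```python
# Test order copied from the numbered sequence for the Figure 1 board.
SEQUENCE = [
    "Power", "CPU", "Main CPU Bus", "Memory Controller ASIC", "SDRAM",
    "Flash", "PCI Bridge", "PCI Bus", "Serial UART", "Serial UART Loopback",
]


def run_sequence(tests):
    """Run (name, test) pairs in order and stop at the first failure.

    Each test is a callable returning (ok, detail).
    Returns (names_that_passed, first_failed_name_or_None, detail).
    """
    passed = []
    for name, test in tests:
        ok, detail = test()
        if not ok:
            # Everything earlier in the hierarchy passed, so the fault is
            # isolated to this step.
            return passed, name, detail
        passed.append(name)
    return passed, None, ""


def make_simulated_tests(faulty=None):
    """Build placeholder tests; the step named `faulty` is made to fail."""
    def make(name):
        def test():
            if name == faulty:
                return False, f"{name} test failed (simulated)"
            return True, ""
        return test
    return [(name, make(name)) for name in SEQUENCE]


passed, failed, detail = run_sequence(make_simulated_tests(faulty="SDRAM"))
print("Passed:", ", ".join(passed))
print("First failure:", failed, "-", detail)
```

Stopping at the first failure is the point of the design: a later step such as the UART depends on the earlier ones, so results after the first failing step would say little about where the defect is.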

De-skilling Complex Diagnosis
The task of diagnosing modern PCBAs is made harder by:
1. Increasing board complexity.
2. Declining technician skill levels, particularly as production moves to low-cost economies.
3. Frequent staff turnover, leading to lost experience and a continuous need for retraining.

To address these problems, an automated diagnostic solution is desirable, but difficult to implement in practice. Solutions have included troubleshooting manuals and expert systems, but such approaches are difficult and time-consuming to create and maintain.

A newer approach is the use of a JTAG-based CPU emulator with a user interface that runs diagnostics in the board's hierarchical sequence, and presents the results in an easy-to-understand format. Figure 3 shows an output from such an emulator: it is presented as a block diagram of the board, and highlights the components with the greatest probability of causing the failure. All the operator has to do is connect the emulator to the board's JTAG port, run the tests, and then follow the repair guidance as shown in Figure 3, so little skill is needed. Admittedly, some skill is needed to configure the emulator for each board type, but this task is normally carried out by test engineers. Modern emulators simplify even this task, as they come supplied with utilities such as Automatic Test Generation (ATG), links to bills of materials, etc.

Return On Investment
Using an emulator is often cost-effective because it de-skills diagnosis, giving savings on both scrap and labour. For example, if a board costs US$500, then repairing as few as 20 boards from your scrap pile ('bonepile') gives a complete return on investment.
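
The arithmetic behind that example can be made explicit with a short Python sketch. The board value of US$500 and the figure of 20 boards come from the example above; the emulator outlay of US$10,000 is not stated in the article and is simply the amount those two figures imply (20 x 500). The sketch also ignores the labour and parts spent on each repair, so it should be read as a rough break-even estimate only. The function name break_even_boards is hypothetical.

```python
import math


def break_even_boards(tool_cost, board_value):
    """Number of recovered boards needed to cover the cost of the tool."""
    if board_value <= 0:
        raise ValueError("board_value must be positive")
    return math.ceil(tool_cost / board_value)


BOARD_VALUE = 500           # US$ per board, from the example above
ASSUMED_TOOL_COST = 10_000  # US$, assumption implied by 20 boards x US$500

print(break_even_boards(ASSUMED_TOOL_COST, BOARD_VALUE))  # 20
```

Substituting your own board value and tool cost gives the size of bonepile at which the investment pays for itself.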

Conclusion
CPU emulation is the most efficient method for creating a Guided Fault Isolation solution for complex PCBAs. It can help eliminate board scrap and reduce labour costs, and the return on investment is measured in months, not years.

