Engineering Horror Stories #2 | Bench Talk
 


Bench Talk for Design Engineers | The Official Blog of Mouser Electronics


Engineering Horror Stories #2
Lynnette Reese

In the purely scientific interest of allowing others to learn from mistakes that I have not made myself but have either witnessed or heard about, let’s hear Story #2. (OK, that’s purposely tongue-in-cheek; do you really think I would tell you about stuff I have broken? Usually it’s something in my house that I’m trying to do myself.)

When I worked at a major semiconductor-company-who-shall-not-be-named, we had a remote software crew located in China. These were very nice guys, but some of them were greenhorns.

The incident in question began when we found that someone had accidentally mixed up the pilot line chips from a family of chips with different memory sizes. Pilot line chips are usually the first silicon chips that get packaged, and they may or may not have sufficient identifying markings. In this case there was an MCU that had 256KB and another that had 512KB, and yet this was indiscernible from the markings that the product engineer had bestowed on the packages!

So we decided that the way to tell was to boot them all up and look at the memory on the flash screen. The flash screen is that message that comes across the monitor upon boot-up, much like the screen you might see when you boot a PC and glance at the stats of your processor in DOS-like text. At one point, it states the memory of the chip you’re booting. We would put each chip into the socket on the evaluation test board, and slowly sort out the 512KB parts.

After we had booted 9 of the 10 chips, however, they were all showing up as 256KB. Much questioning ensued. Had the product engineer made a mistake? No, he insisted. OK, well, why was this happening? We scratched our heads and couldn’t figure it out, until someone decided to look at the boot code itself. No one expects the boot code; it was booting just fine…or was it?

Turns out that the firmware engineer, reading the spec for the code to list the amount of memory of the chip on the flash screen at boot-up, had interpreted this to mean “read the data sheet and hard code text with that amount of memory to flash on the screen”, rather than, “use the peek command to peek at the memory and dynamically tell us via the flash screen how much memory the chip has.” 
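To make the difference between the two readings of the spec concrete, here is a minimal Python sketch. It is not the firmware from the story: `SimulatedChip`, `hard_coded_banner`, `probe_memory_kb`, and `dynamic_banner` are hypothetical names, and the probe assumes writable memory in which addresses past the end wrap around to the start, which is only one of several ways a real part might reveal its size.

```python
class SimulatedChip:
    """Hypothetical stand-in for one MCU sitting in the evaluation-board socket."""

    def __init__(self, size_kb):
        self._size_bytes = size_kb * 1024
        self._cells = {}  # sparse model of memory: address -> byte

    def poke(self, address, value):
        # Assumption: addresses past the end of memory wrap around (alias).
        self._cells[address % self._size_bytes] = value & 0xFF

    def peek(self, address):
        # Unwritten cells read back as 0xFF in this model.
        return self._cells.get(address % self._size_bytes, 0xFF)


def hard_coded_banner():
    # The bug from the story: the size was typed in from the data sheet.
    return "Memory: 256KB"


def probe_memory_kb(chip, candidates_kb=(256, 512)):
    """Return the smallest candidate size at which address 0 gets aliased."""
    chip.poke(0, 0xA5)
    for size_kb in sorted(candidates_kb):
        chip.poke(size_kb * 1024, 0x5A)  # lands on address 0 if memory ends here
        if chip.peek(0) == 0x5A:
            return size_kb
    return None  # larger than every candidate we tried


def dynamic_banner(chip):
    # What the spec intended: ask the chip, then report what it says.
    size_kb = probe_memory_kb(chip)
    return f"Memory: {size_kb}KB" if size_kb is not None else "Memory: unknown"


if __name__ == "__main__":
    for fitted_kb in (256, 512):
        chip = SimulatedChip(fitted_kb)
        print(fitted_kb, "|", hard_coded_banner(), "|", dynamic_banner(chip))
```

Running the hard-coded version against both simulated parts gives the same text every time, which is exactly why sorting chips by the flash screen could never have worked.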

Sigh. We can’t all be perfect, but this is a perfect example of how common sense is one of the more valuable tools in an engineer’s tool box. And the other lesson is to think of every possible cause, not just the obvious one: The product engineer did indeed give us several 512KB chips. Even though the development board was working well, and had been for months, it was still possible to find a bug in code that had never been tested. No one thought to test that part of the boot code in QA before they released it to the rest of the world.

On the other hand, we all thought that the programming date thing, the flip from 1999 to 2000*, was going to bring down the financial world, when in fact it was the financiers themselves who did a terrific job of wiping out Lehman Brothers and the rest of Wall Street. The last vicarious lesson here is that a good engineer suspects everything as a possible cause, starting with the most obvious one.

*The “date thing” anticipated that using just 2 digits for 1999, as in “99”, if left alone, would cause programs to date vital processes and data back to 1900 when we hit the year 2000. In reality, nothing much happened except that loads of COBOL programmers worked overtime for years leading up to the year 2000. If the crash of 2008 had occurred in January 2000, there would have been a very convenient scapegoat, indeed.
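For anyone who missed the excitement, the footnote's two-digit problem can be sketched in a few lines of Python. The function names and the pivot value of 50 are hypothetical, chosen only for illustration; the second function shows "windowing", one commonly used style of fix, and is not a claim about what any particular program did.

```python
def expand_year_naive(two_digit_year):
    # The feared behavior: every two-digit year is assumed to be 19xx.
    return 1900 + two_digit_year


def expand_year_windowed(two_digit_year, pivot=50):
    # Windowing: values at or above the pivot are 19xx, the rest are 20xx.
    # The pivot of 50 is an arbitrary illustrative choice.
    if two_digit_year >= pivot:
        return 1900 + two_digit_year
    return 2000 + two_digit_year


def years_between(start_yy, end_yy, expand):
    # Elapsed years between two stored two-digit years.
    return expand(end_yy) - expand(start_yy)


if __name__ == "__main__":
    # Something dated "99" and checked again in "00".
    print(years_between(99, 0, expand_year_naive))
    print(years_between(99, 0, expand_year_windowed))
```

With the naive expansion, one year of elapsed time comes out as minus 99 years, which is the sort of arithmetic that kept those programmers busy.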





Lynnette Reese holds a B.S.E.E. from Louisiana State University in Baton Rouge. Lynnette has worked at Mouser Electronics, Texas Instruments, Freescale (now NXP), and Cypress Semiconductor. Lynnette has three kids and occasionally runs benign experiments on them. She is currently saving for the kids’ college and eventual therapy once they find out that cauliflower isn’t a rare albino broccoli (and other white lies).

