They Screwed Up, They Followed Our Specification

Written by Sean Murphy. Posted in skmurphy

A story from when I was working at Cisco about how a lack of trust slowed down the solution to a problem.

We had been shipping a successful new midrange router for about a year when we were suddenly faced with a line stop: the boxes were failing final test. The product team, which was busy working on new versions, now faced a fire drill to find and fix a problem on a box already in volume production.

A quick investigation showed that a cost reduction effort had phased in a new memory part from a different supplier. The design engineers, in the thick of a new design, were frustrated that they had to go back and undo what they viewed as a shortsighted decision by the supply base team. “It’s typical, they try and shave a few cents off a part and end up buying one that doesn’t meet spec.”

I was the acting manager for the component engineering team, and we immediately requested some of the new parts so we could test them and start a root cause analysis. A few days later David, the senior component engineer, called me: “Murphy, these parts meet spec.” David was a very careful engineer, but I knew that the design engineers were not going to believe him. I said, “Pull your notes together and let’s sit down this afternoon with Chuck and Tom in the lab.”

We met and walked them through the testing we had done. They were not satisfied, so we ran some more tests on new parts, which also passed. Then David replaced a memory part in an existing box with a part from the new supplier, and the box, which had been working, now failed the bootup diagnostic.

So we had a part that met the spec but caused the design to fail. Tom said, “They screwed up, they followed our specification.”

Based on the nature of the system failure, David suspected the problem had to do with one key performance parameter for the memory. He went back and tested several of the older parts and found they were well inside the margin for the specification of this parameter.

At this point the design team went back and studied their system timing and realized that they had a problem both in the old design and in the next-generation designs they were working on. They made a change to the production design to allow it to work with parts that actually satisfied the specification, and they incorporated this change into the new designs.

I think that because the supply base folks had been less involved in the product team up front, they were viewed as less trustworthy and more likely to make a mistake. They thought less like engineers and more like cost accountants, and they were measured differently than the design engineers. Most of the conversations between the two groups involved either criticism of the design engineers’ sourcing decisions or requests for work on additional sources.

Three take-aways

  • Be methodical in your troubleshooting, especially when there is a lot of pressure to find an answer. Document the steps so that others can follow your tests and reproduce them (this is less work than proving you reached the right answer).
  • Always seek common ground, especially when there are others on the team with different skill sets and frames of reference. Communication problems, especially those due to a lack of shared context, cause more problems than errors do.
  • Have a plan for how you will troubleshoot your contribution to the design. If the problem is serious, consider starting to work this plan before you are presented with strong evidence that you need to reverify your work.
