In some sense, extremely short missions seem simpler than other types, just by virtue of their duration. The mean time to failure of alternate-grade EEE parts may far exceed the mission duration itself. On the other hand, these missions can be very challenging because the stakes are so high in the event of an anomaly. Therefore, prudent approaches to mitigating a mission-ending event might be to employ overly conservative redundancy schemes; selectively upgrade to higher-reliability devices for mission-critical functions; perform independent testing on a statistically significant sample of parts identical to those in the mission; or leverage heritage flight data, if it exists. However, these options incur costs which, in aggregate, must be weighed against the cost savings of procuring the alternate-grade part.
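The comparison of mission duration with mean time to failure, and the size of a "statistically significant" test sample, can be roughed out numerically. The following is a minimal sketch, assuming a constant failure rate (exponential model) and a zero-failure (success-run) test plan. Neither assumption comes from this guidance, and the function names and numbers are hypothetical.

```python
import math


def survival_probability(mission_hours: float, mttf_hours: float) -> float:
    """Probability a part survives the mission, assuming a constant failure rate."""
    return math.exp(-mission_hours / mttf_hours)


def zero_failure_sample_size(reliability: float, confidence: float) -> int:
    """Smallest number of parts that must all pass a test to demonstrate
    the given reliability at the given confidence (success-run formula)."""
    return math.ceil(math.log(1.0 - confidence) / math.log(reliability))


# Hypothetical values: a 2-hour mission and a part with a 50,000-hour MTTF.
p_survive = survival_probability(mission_hours=2.0, mttf_hours=50_000.0)

# Hypothetical target: demonstrate 99% reliability at 90% confidence.
n_parts = zero_failure_sample_size(reliability=0.99, confidence=0.90)

print(f"Survival probability over the mission: {p_survive:.6f}")
print(f"Parts to test with zero failures allowed: {n_parts}")
```

Even when the survival probability is very close to 1, the sample size needed to demonstrate it by test can be large, which is one reason the testing cost has to be weighed against the savings from the alternate-grade part.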
There may be occasions where a part not intended for space use offers functionality, performance, cost and/or availability benefits attractive enough to make it a candidate for a particular space mission. This may or may not present significant risk. Independent testing under conditions representative of the actual mission, very conservative redundancy schemes, or worst-case, combined-effects (e.g., electrical, thermal, radiation, vibration) analyses based on measured data may be options that can increase confidence in the parts.
In simple terms, an FMEA process is a structured way of asking the question “What could go wrong?”, even if the device, circuit or product meets requirements. The FMEA process seeks to identify potential impacts that can affect the customer or mission at various levels, and works to put measures in place to mitigate or eliminate undesired performance or unexpected failure. A key benefit is that it continuously improves the development of new products, processes and missions, and saves cost. Conducting an FMEA evaluation, including inputs from diverse project members familiar with the mission, as early as the analysis-of-alternatives stage, can yield significant benefits later. In some cases, customers request evidence that an FMEA was conducted at some level.
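One common way to organize FMEA results is a worksheet that scores each failure mode and ranks the list so that mitigation effort goes to the highest-risk items first. The sketch below assumes the widely used risk priority number (severity × occurrence × detection on 1 to 10 scales). This guidance does not prescribe a scoring scheme, and the items and scores shown are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    item: str
    mode: str
    severity: int    # 1 (negligible) to 10 (mission-ending); assumed scale
    occurrence: int  # 1 (remote) to 10 (frequent); assumed scale
    detection: int   # 1 (certain to be caught) to 10 (undetectable); assumed scale

    @property
    def rpn(self) -> int:
        """Risk priority number: product of the three scores."""
        return self.severity * self.occurrence * self.detection


def rank_by_rpn(modes):
    """Return failure modes ordered from highest to lowest RPN."""
    return sorted(modes, key=lambda m: m.rpn, reverse=True)


# Hypothetical worksheet entries for illustration only.
worksheet = [
    FailureMode("voltage regulator", "output latch-up", 9, 3, 4),
    FailureMode("connector", "intermittent contact", 6, 4, 7),
    FailureMode("memory", "single-bit upset", 4, 6, 2),
]

for fm in rank_by_rpn(worksheet):
    print(f"{fm.rpn:4d}  {fm.item}: {fm.mode}")
```

A ranking like this is only a prioritization aid. A high-severity item may still warrant mitigation even when its overall score is modest.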
Eliminating detected nonconformance (corrective action) and preventing nonconformance occurrence (preventive action) are proven ways of reducing cost over-runs and schedule delays, and improving production effectiveness and quality. They also demonstrate the rigor and discipline of the supplier, which will be helpful for you and your customer in establishing confidence in the parts.
Surprisingly, similar problems (e.g., design issues, discrepant material, anomalous test results, changes to or misinterpretation of requirements) occur on multiple projects and missions. Some are even repeated on the same project over time as staff changes. Timely communication of the problem, its impact, and its resolution through a Lessons Learned database or similar system creates awareness and reduces the risk of a repeat. However, even when such databases do exist, they are not always consulted. One way to ensure they are is to include a step in a design review or project milestone, for example, that shows the alert system or database has been reviewed for any relevant items. This may also enhance the likelihood that the database will be kept up to date.
In a single-point failure or mission-critical part application, a few options are available to prevent premature end of the mission. The designers could significantly derate the part, consider selective use of a higher-quality part, add selective redundancy or invest in independent testing of identical samples.
Also, as part of your architecture design and risk trade studies, it is important to evaluate the option of a single string of highly reliable units versus the use of lower-reliability, redundant units.
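A first-order version of this trade can be computed directly. The sketch below assumes independent unit failures and ideal fault detection and switchover, which real redundant designs rarely achieve (common-cause failures and switching logic reduce the benefit). The reliability values are hypothetical.

```python
def series_reliability(unit_reliabilities):
    """Reliability of a single string: every unit must work."""
    result = 1.0
    for r in unit_reliabilities:
        result *= r
    return result


def parallel_reliability(unit_reliability: float, n_units: int) -> float:
    """Reliability of n identical redundant units: at least one must work.
    Assumes independent failures and perfect switchover."""
    return 1.0 - (1.0 - unit_reliability) ** n_units


# Hypothetical comparison for one function over the mission.
single_high_rel = series_reliability([0.99])      # one high-reliability unit
dual_lower_rel = parallel_reliability(0.95, 2)    # two lower-reliability units

print(f"Single high-reliability unit: {single_high_rel:.4f}")
print(f"Two redundant lower-reliability units: {dual_lower_rel:.4f}")
```

Under these idealized assumptions the redundant pair comes out ahead, but the result should be revisited once mass, power, cost, and common-cause failure modes are included in the trade study.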
Defects can play a key role in premature or unexpected failure. Therefore, understanding the physics of failure, oftentimes driven by defects, is critical to the success of a design. Anomalies in the electrical device itself or in its packaging may give rise to defects.
These AEC documents provide general, statistically-based methods for removing outliers from populations of microcircuits and semiconductors supplied per AEC-Q100 and AEC-Q101, respectively. History has shown that such parts with abnormal characteristics significantly contribute to quality and reliability problems. The AEC recommends that use of the Part Average Testing and Statistical Yield Analysis techniques described “will also flag process shifts and provide a source of rapid feedback that should prevent quality accidents”. Knowing that parts you procure come from such lots may provide much of the rationale and evidence needed to build your and/or your customer’s confidence in their selection for your mission. Note that the AEC provides these documents as guidelines, not requirements, and permits alternate approaches as long as there is good justification.
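To illustrate the idea of statistical outlier removal, the sketch below screens a set of parametric readings against limits derived from the population itself. It assumes limits of the median plus or minus six robust sigma, with robust sigma estimated as the interquartile range divided by 1.35. This rule is an assumption made for illustration and is not quoted from this guidance, so the AEC guidelines themselves should be consulted for the actual procedure. The readings are hypothetical.

```python
import statistics


def robust_limits(readings, n_sigma: float = 6.0):
    """Return (lower, upper) screening limits from the median and a
    robust sigma based on the interquartile range (assumed rule)."""
    q1, _, q3 = statistics.quantiles(readings, n=4, method="inclusive")
    center = statistics.median(readings)
    robust_sigma = (q3 - q1) / 1.35
    return center - n_sigma * robust_sigma, center + n_sigma * robust_sigma


def find_outliers(readings, n_sigma: float = 6.0):
    """Return readings that fall outside the robust screening limits."""
    lower, upper = robust_limits(readings, n_sigma)
    return [x for x in readings if x < lower or x > upper]


# Hypothetical measurements of one parameter across a lot (arbitrary units).
lot = [1.00, 1.02, 0.98, 1.01, 0.99, 1.03, 0.97, 1.00, 1.01, 0.99, 1.45]

lower, upper = robust_limits(lot)
print(f"Screening limits: {lower:.3f} to {upper:.3f}")
print(f"Flagged parts: {find_outliers(lot)}")
```

The flagged part may still be within its datasheet limits. The point of this kind of screen is that a part behaving unlike the rest of its population is treated as suspect.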
Having knowledge of the end-to-end production, life cycle and chain of custody of your parts can help expedite resolution when anomalies arise, lend insight into part quality and consistency, and assess supply chain risk.
Thoroughly understanding the way(s) a part will fail is essential to mission success. This knowledge enables designers to complete credible worst case circuit analyses and build in sufficient performance margin to ensure success of the mission.
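As a small illustration of a worst-case circuit analysis, the sketch below performs an extreme value analysis on a resistive voltage divider by evaluating the output at every combination of minimum and maximum parameter values. The circuit, tolerances, and requirement window are hypothetical. Evaluating only the corners is valid here because the output is monotonic in each parameter, which is not true of every circuit.

```python
from itertools import product


def divider_output(v_in: float, r_top: float, r_bottom: float) -> float:
    """Output voltage of an unloaded resistive divider."""
    return v_in * r_bottom / (r_top + r_bottom)


def extreme_value_analysis(func, nominals, tolerances):
    """Evaluate func at every min/max corner and return (lowest, highest)."""
    corners = [(nom * (1 - tol), nom * (1 + tol))
               for nom, tol in zip(nominals, tolerances)]
    outputs = [func(*combo) for combo in product(*corners)]
    return min(outputs), max(outputs)


# Hypothetical end-of-life tolerances (initial tolerance plus drift from
# temperature, aging, and radiation, as established from measured data).
nominals = [5.0, 10_000.0, 10_000.0]   # v_in (V), r_top (ohm), r_bottom (ohm)
tolerances = [0.02, 0.05, 0.05]        # fractional, for each parameter

v_min, v_max = extreme_value_analysis(divider_output, nominals, tolerances)

# Hypothetical requirement window for the downstream circuit.
REQ_MIN, REQ_MAX = 2.30, 2.70
margin_ok = REQ_MIN <= v_min and v_max <= REQ_MAX

print(f"Worst-case output: {v_min:.4f} V to {v_max:.4f} V")
print(f"Meets requirement with margin: {margin_ok}")
```

The quality of such an analysis depends on how well the tolerances reflect the ways the part actually degrades and fails, which is why failure-mode knowledge comes first.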
AEC suppliers typically verify product families to Stress Test Qualification Standards every two years. This may or may not be frequently enough for your or your customer’s needs. Note that AEC recommends, but does not require, such periodic verification. If verifying product on a specific periodic basis is important to you or your customers, then it is best to ask about it early.
When you need to use a part because of functionality, cost, or schedule but, for example, no mission-relevant data is available for a specific parameter, there may be options to help you make an assessment. Among these are joining a collaborative working group; consulting with academic colleagues working in the area; hiring an experienced consultant with expertise in the specific discipline; or searching open database web sites such as https://radcentral.jpl.nasa.gov/ or https://radhome.gsfc.nasa.gov/radhome/RadDataBase/RadDataBase.html
The EEE parts supply chain is far-flung and chain of custody can vary wildly. Receiving counterfeit and/or tampered devices is a real possibility when procuring alternate-grade parts from the global supply chain. These products may not perform anything like the authentic ones, even though they may physically look identical. Depending on the part type, these products may fail to meet your requirements, or even present a danger to the circuit or the mission. The best minimum mitigation is to always procure parts from a reputable, authorized distributor, who is likely to have corporate procedures in place for some level of protection against risk of counterfeit and/or tampering.
When selecting a supplier, it is prudent to ask about their problem or anomaly resolution process, and determine how much support you can expect when issues arise. This may be important to your customer as well. Similarly, requesting advance notice of when a product will become obsolete or no longer supported is important in helping you determine how soon a design will need to be changed or re-analyzed to accommodate a replacement or new part.