BATTERIES GONE WRONG – ASSESSMENT, MITIGATION, AND EXPECTATIONS

A Review of Options to Improve Lithium Battery Safety Performance

In the world of product safety, there are two basic approaches to risk mitigation, proactive and reactive, with proactive being the preferred choice. Most would agree with the adage that “an ounce of prevention is worth a pound of cure,” but in truth, this oversimplifies the reality in which product manufacturers operate. As with most things in life, the choices are rarely black and white but rather a continuous spectrum of shades of gray, and every commercial product venture must balance many competing demands. Could you make a product that was fully reliable under all conditions? Perhaps, but the odds are that it would be a commercial failure, as it would take an inordinate amount of time to produce and be prohibitively expensive. In today’s market, the traditional characteristics of safety, time to market, quality, cost, reliability, manufacturability, testability, and usability (to name a few) still apply. But these have been further augmented by more modern concerns of environmental impact, sustainability, social responsibility, and others. We mention these not to offer any judgment but only to note that the expectation that a product will perform flawlessly over its lifecycle is a difficult proposition given the myriad of competing needs.
The battery industry is no different when it comes to satisfying market requirements. As batteries have become ubiquitous in our daily lives and the world has migrated toward portable everything, the challenge for providers of these products has increased. With the advent of high-energy, rechargeable lithium-ion chemistries, battery performance has dramatically increased, but so have the risks. No longer are battery packs simple devices. In most modern electronic products, they are better characterized as complex components of an integrated system with one key difference: most other components of such systems rarely have the ability to spontaneously overheat and burn (i.e., go into “thermal runaway”) with little to no warning, potentially resulting in personal injury, product damage, and the associated legal and market liabilities.

HOW DO WE ASSESS BATTERY SAFETY RISKS?
In focusing on the safety risks, what are the options for risk mitigation in the battery space? Ideally, these begin early in the design phase. Clearly, there is no substitute for a good design using high-quality components. In the world of batteries, safety-critical components such as the cell, safety circuit, and passive protective devices such as fuses, positive temperature coefficient (PTC) devices, and other thermal devices are the initial focus. Mechanical considerations also come into play to help ensure that the cell is accommodated within its specified limits including levels of protection against reasonably foreseeable external use conditions. To ensure that such efforts are yielding the desired result, testing of both the components and the battery pack assembly is key, covering the aspects of safety as well as long-term reliability and performance. This testing should be initiated early in the product development process so that, if issues are uncovered, there is the time and flexibility to adjust the design, followed by retesting to verify the efficacy of the changes and to ensure that other problems were not inadvertently introduced. As the development process progresses, production samples should be built and evaluated to understand if manufacturing variations can create unanticipated safety risks. In many cases, this design-build-test-adjust process is performed by the component and battery pack manufacturers and is sometimes augmented by external testing laboratory resources. For more complex systems, the end-device manufacturer may also be involved early in the process to ensure system aspects do not negatively impact battery safety.

TESTING BATTERIES FOR REGULATORY APPROVAL
As the design stabilizes, regulatory approval at the battery pack level is usually the next layer of risk mitigation. A key input to this process is the approval of the component cell as it represents the greatest single safety risk. Regulatory testing typically involves small sample sizes. It is not meant to provide a statistically significant sample for finding outliers in a large population but rather to find gross issues such as design or process defects that have escaped detection in the early stages of product development. Common testing protocols involve a combination of electrical, mechanical, and thermal overstress. Some involve the application of faults to better assess the inherent safety robustness of the battery pack. Other tests attempt to evaluate the product for stresses that might be common to a specific industry or use case. At a minimum, battery packs will be tested to the transportation requirements found in UN 38.3. Testing to one of the 62133-2 series of standards (IEC, EN, UL) is also commonly performed and is required for regulatory approval in many global markets. Testing to such standards is usually conducted by accredited third-party testing laboratories with the end result being the authorized application of the testing lab’s mark to the product. This approval facilitates regulatory acceptance by government authorities and may also be a prerequisite for commercial entities such as retailers and distributors to offer the product for sale. Some approvals also require periodic post-market inspection of production facilities to ensure the design is still being manufactured as originally qualified. Infrequently, a testing laboratory or regulatory agency may mandate retesting when significant changes to the relevant test standards are implemented.
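To illustrate why a small regulatory sample can be expected to expose gross defects but not rare outliers, here is a minimal sketch. It assumes each unit is independently defective with a fixed probability, which is a simplification. The sample size and the three defect rates are hypothetical values chosen for illustration and are not taken from UN 38.3, the 62133-2 standards, or any other standard.

```python
def detection_probability(defect_rate: float, sample_size: int) -> float:
    """Chance that at least one defective unit appears in the sample.

    Assumes each unit is independently defective with probability defect_rate.
    """
    return 1.0 - (1.0 - defect_rate) ** sample_size


# Hypothetical scenarios, chosen only for illustration.
scenarios = {
    "gross design defect (every unit affected)": 1.0,
    "process defect (1 in 10 units)": 0.1,
    "rare outlier (1 in 1,000,000 units)": 1e-6,
}

SAMPLE_SIZE = 20  # assumed sample count, not a figure from any standard

for label, rate in scenarios.items():
    chance = detection_probability(rate, SAMPLE_SIZE)
    print(f"{label}: {chance:.4%} chance of appearing in {SAMPLE_SIZE} samples")
```

Under these assumptions, a defect present in every unit is certain to show up, while a one-in-a-million outlier almost never does, which is consistent with the purpose of regulatory testing described above.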

THE CHALLENGES OF BATTERIES AS END-PRODUCT COMPONENTS
The discussion up to this point is intended as background for what is typically done in a normal battery pack product development cycle. The level to which these actions are implemented directly correlates to a base level of risk mitigation for safety events once the product is released into the market. This does not mean that there are any guarantees that there won’t be field problems, but the level of exposure is certainly reduced as more product safety information is proactively discerned and addressed.
What if the battery pack is simply a purchased component and the purchaser was not involved in the design process and may not even have any visibility into the production of the battery pack? Similarly, what if the purchaser is procuring an end device that has an embedded battery pack? These are both very common situations for retailers and distributors who typically have very limited internal engineering resources. Certainly, buying such products from reputable sources and checking for the presence of the requisite safety marks is a good start, but is it sufficient? Modern supply chains are global, so discerning where a product was manufactured and by whom can be a challenge in itself. Yet regardless of the actual manufacturer’s liability, a retailer’s or distributor’s brand can be put in jeopardy by a single video posted on social media that quickly goes viral.
How can product risk be mitigated in this situation? The general answer is to work backward beginning with production samples. A product teardown of new product samples by a knowledgeable third party can aid in assessing what risks exist with purchased products where the detailed design knowledge is not available. Although every product is different, an evaluation of a product from a portable energy safety perspective might include such items as:
• Verification of any regulatory marks on the product. Was the testing actually done and is the regulatory status current?
• Evaluation of insulating methods including their integrity and consistency
• Evaluation of conductor sizing
• Review of manufacturing quality indicators that might equate to latent defects
• Review of the safety circuit or other protective devices for proper operation under abnormal conditions such as over-voltage, over-current, short-circuit, and under-voltage
• Review of the charging circuit design. Does it subject the battery or cell to improper conditions?
• Determination of the cell manufacturer and type. This also includes an assessment of whether the cell might be counterfeit
• Cell examination (radiographs and/or CT scans), teardown, and construction analysis
• Review of the mechanical design of the product in terms of its ability to protect the safety-critical components
• End-user instructions and safety warnings

WHAT ABOUT BATTERY PERFORMANCE ISSUES?
In addition to a review of safety concerns, performance relative to competing market options should be evaluated through benchmarking. This is typically done in parallel with the safety review and focuses on how a user is likely to employ the product in its expected use cases. Competing samples are drawn from the market within the same price tier so that the comparisons are valid. A custom evaluation plan is drafted and might involve visual inspections, functional checks, and even comparisons of long-term electrical or mechanical reliability. Many times, the criteria are drawn from marketing assertions as shown on the products’ packaging. Examples might include the number of hours that the device will operate in a given mode before needing to be recharged and how long that recharge might take. The evaluation can also go much further, perhaps considering the relative drop performance from a given height or the number of charge-discharge cycles before a loss of function is detected. As a general rule, safety concerns tend towards the absolute given the nature of such risks to people and property. Conversely, performance concerns lend themselves towards a more relative evaluation against other competing market options.

ANTICIPATING THERMAL RUNAWAY RISKS
Given the above processes for minimizing risks through proper design or post-production design evaluations, are there other proactive risk mitigation actions that warrant consideration from a product safety perspective? Consider this: even if all of the above steps are followed with the best of intentions, what happens if things still go wrong? More specifically, what is the effect on the end product and nearby users if a cell goes into thermal runaway when the device is in use? Second, what happens if a cell goes into thermal runaway during the transportation and shipping process? Most designers can only guess, as what actually happens is rarely investigated directly.
To answer these questions, there are two general methodologies. Simulation is an option but requires very advanced electrochemical and thermal modeling. Our experience is that this tends to be cost-prohibitive for most organizations and thus is only seen in relatively large companies where such expertise is available in-house. What about direct testing? Like simulation, it has barriers to implementation as well, the most obvious being concerns related to personnel safety and expertise, as well as having the appropriate facilities to contain high-energy events while documenting their effects.
With the right facilities and expertise available, a determination must be made about how to force the cell or battery into thermal runaway. Overcharging and surface heating are two common methods, although the design of the product and the chemistry of the cells will guide what method is most appropriate. Other considerations for such testing involve what data is to be collected and how. Video evidence is considered by most clients to be the most useful. It should be further supported by appropriate logging of relevant temperatures and possibly other product parameters, as well as forensic documentation of the actual effects on the end product. Once again, the goal is to use the information obtained to determine if design improvements should be made to minimize the chances of personal injury or property damage during a thermal runaway event.
Although the above is presented in a relatively clinical fashion, the danger of injury and property damage is very real. Depending on the energy level of the particular sample, an exploding cell can produce temperatures above 1200 °C (2192 °F) and deadly shrapnel, particularly in the case of large-format cells with metal cans. Readers are strongly cautioned not to attempt such testing without the proper expertise and containment equipment.
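As one way to picture how the temperature logging mentioned above might be summarized after a test, here is a minimal sketch. The function name, the sample readings, and the choice to report peak temperature and maximum rate of rise are all illustrative assumptions. They do not represent a prescribed test method or data from a real event.

```python
def summarize_log(samples: list[tuple[float, float]]) -> dict[str, float]:
    """Summarize a time-ordered log of (seconds, degrees C) readings."""
    peak_time, peak_temp = max(samples, key=lambda reading: reading[1])
    # Steepest temperature rise between consecutive readings, in deg C per second.
    max_rate = max(
        (temp_b - temp_a) / (time_b - time_a)
        for (time_a, temp_a), (time_b, temp_b) in zip(samples, samples[1:])
    )
    return {
        "peak_temp_c": peak_temp,
        "peak_time_s": peak_time,
        "max_rate_c_per_s": max_rate,
    }


# Hypothetical readings from a single thermocouple, for illustration only.
log = [(0, 25.0), (60, 80.0), (120, 150.0), (125, 600.0), (130, 750.0), (190, 400.0)]

summary = summarize_log(log)
print(summary)
```

A summary like this would only supplement, not replace, the video evidence and forensic documentation described above.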

THE IMPORTANCE OF FAILURE ANALYSIS
Designing and testing cells and batteries properly from a safety perspective, including understanding the impacts should a thermal runaway event occur, are the best risk mitigation tools that we have at our disposal. Even with those best proactive efforts, things will still go wrong. The real question is how often. True failure rates for cells and batteries are not publicly available as companies keep such information confidential. But anecdotally, high-quality lithium-ion cells have a rough order of magnitude (ROM) failure rate somewhere around 1 in 10 million, while lesser quality cells are likely to have poorer field performance. With over eight billion cells being produced globally every year, the math is inescapable that bad things will happen. These factors make clear the importance of using retrospective methods to gain insights into what happened, how it happened, and why it happened. These methods collectively fall under the heading of lithium battery failure analysis. Failures in the field can happen at any point in the battery’s life cycle and can vary significantly in severity and frequency. Responses to such issues also vary accordingly, ranging from simply replacing a product under warranty to retrieval of the product for a full forensic evaluation. For minor issues, it may be determined that a product change is not warranted. Conversely, safety issues may mandate a full product recall and rework of the design. In the end, failure analysis actions provide after-the-fact knowledge for organizations from which to make decisions that will impact future risk.
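As a rough illustration of that math, the sketch below combines the two figures quoted above. It assumes, optimistically, that every cell performs at the anecdotal high-quality rate. The second rate (1 in 1 million) is a hypothetical stand-in for lesser-quality cells and is not a figure from this article.

```python
CELLS_PER_YEAR = 8_000_000_000  # "over eight billion cells" quoted in the text


def expected_failures(cells_produced: int, one_failure_per: int) -> float:
    """Expected failure count if one cell in `one_failure_per` fails."""
    return cells_produced / one_failure_per


high_quality = expected_failures(CELLS_PER_YEAR, 10_000_000)   # rate quoted in the text
lesser_quality = expected_failures(CELLS_PER_YEAR, 1_000_000)  # hypothetical rate

print(f"At 1 in 10 million: about {high_quality:.0f} failures per year")
print(f"At 1 in 1 million (hypothetical): about {lesser_quality:.0f} failures per year")
```

Even under the optimistic assumption, the expected count is in the hundreds of events per year, which is why retrospective analysis remains necessary.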

THE VALUE OF THIRD-PARTY EXPERTISE
Like thermal runaway testing, cell and battery failure analysis involves expertise, processes, and tools that may not be readily available to most organizations. Because of the uniqueness and the infrequency of need, expertise tends to be primarily resident in third-party test labs that specialize in portable energy. Conducting cell and battery failure analysis through an expert third party offers a number of benefits, including:
• Reduction of personal bias: A third-party test lab has no vested interest in the outcome of the analysis, nor do they have intimate knowledge of the product or company’s history.
• Independent verification: A third-party lab can help to independently verify the findings of an internal team or a supplier.
• Resource utilization: As noted previously, field safety events are generally an infrequent occurrence. Having an internal team staffed with the proper expertise and equipment to respond to such a rare event is generally not possible or even desirable.
• Diligence: In the most severe of cases such as potential product recalls, it may be valuable for the company to have an independent party involved to minimize negative perceptions regarding objectivity.
• Focus: Having failure analysis conducted by an external party may permit the company’s internal teams to remain focused on the day-to-day operations of their mainline business.
• Process rigor: An external testing lab will have already developed the processes and methods for orderly evaluation and documentation of field failures, with specific expertise in evidence preservation.
• Breadth of experience: Because of their focus on failure analysis spread across multiple clients over time, a third-party testing lab will generally have a wider range of technical experience when it comes to what constitutes typical versus atypical findings.

WORKING WITH A THIRD-PARTY EXPERT
When working with a third-party failure analysis provider, you will be asked to provide more than the failed unit to facilitate the investigation. It is important to be as open and honest as possible. Your provider should be accustomed to handling confidential materials and should be willing to work under a non-disclosure agreement (NDA) to protect all proprietary information.
In terms of the supplemental information, basic product information is the starting point. This might include specifications and similar documents to support the work along with any relevant details regarding product history. These will not be used to prematurely assume conclusions, but rather to supplement the physical evidence and help prioritize the investigatory efforts.
Information on the specific unit, along with incident details, is also very important to piecing together what happened. How was the unit configured? Was it operating in a particular mode? Did the unit demonstrate anything unusual prior to the event? It is best to provide all of the information that is available and let the failure analysis team draw their own conclusions regarding relevance. It is important to realize that as the investigation moves forward, the relevance of such information may change as more is learned.
The actual failed units will need to be delivered to the laboratory. In this situation, more is better. It is possible that there may be multiple failure modes at play and having additional samples may help to isolate these. It is also important to preserve the evidence as much as possible by limiting unnecessary handling, examining, or actual tampering which might further damage the unit and lead to erroneous findings.
Proper packaging is a must. It is best if all components of the reported system can be provided, i.e., the failed cell or battery, the end-device if applicable, charging devices and cables, etc., as it is possible that the root cause of the failure may have been external to the cell or battery that failed. Samples should be marked or segregated so that it is clear which components go together. In addition to the failed systems, it is also good if a fully functional new system can be provided for purposes of comparison.
What should you expect from your third-party expert? Every investigation is unique, and your provider should work with you to generate a project scope that meets your needs, and they should limit their efforts to that scope. Considerations include specific concerns, communication frequency, deliverables, and budget.
Be aware that the actual work of failure analysis involves a mix of analytical tools such as fault tree analysis (FTA) combined with empirical methods such as x-ray imaging, CT scanning, optical microscopy, product dissection (battery pack and cell teardowns), quantitative measurement, circuit testing, and replication testing. Not every tool is appropriate for every situation. Your provider will offer guidance on these technical aspects. In the end, your provider should deliver to your team a clear, unbiased analysis report that details the investigation and its associated findings.
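To give a sense of how a fault tree can organize an investigation, here is a minimal qualitative sketch: candidate causes are arranged in a tree, branches are marked as ruled out when the physical evidence excludes them, and the remaining open causes are listed. The class, the function, and every cause name are hypothetical and serve only as illustration. A real FTA would be built from the specific product and incident.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Cause:
    name: str
    ruled_out: bool = False  # set to True when evidence excludes this branch
    children: list[Cause] = field(default_factory=list)


def open_causes(node: Cause) -> list[str]:
    """Return the leaf causes that the evidence has not yet excluded."""
    if node.ruled_out:
        return []
    if not node.children:
        return [node.name]
    remaining: list[str] = []
    for child in node.children:
        remaining.extend(open_causes(child))
    return remaining


# Hypothetical tree for illustration only.
tree = Cause("thermal runaway event", children=[
    Cause("cause external to the cell", children=[
        Cause("charger over-voltage", ruled_out=True),
        Cause("protection circuit failure"),
    ]),
    Cause("cause internal to the cell", children=[
        Cause("manufacturing defect"),
        Cause("mechanical damage", ruled_out=True),
    ]),
])

print(open_causes(tree))
```

In this sketch the empirical methods listed above (imaging, teardown, circuit testing) are what would justify setting `ruled_out` on a given branch.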
What should you not expect from your provider? First, don’t expect speculation. This is a “just the facts” activity. If the evidence doesn’t support it, your provider shouldn’t be offering it up. Second, keep in mind that not every investigation yields the root cause or even the true failure mode. Depending upon the condition of the evidence and nature of the incident, it simply may not be feasible to reach this level of understanding. In such cases, the effort may instead seek to eliminate likely root causes, thus narrowing the possibilities.
Third, don’t expect your provider to tell you if this issue will repeat in the future. A risk analysis to predict the likelihood of future failures requires a different set of information, although data from the failure analysis investigation may serve as key inputs into that analysis. Finally, don’t expect your provider to tell you what actions to take, although the root cause data from your provider may serve as a basis for your team to make those decisions.

FINAL THOUGHTS
In conclusion, there is a wide array of proactive and reactive steps that can be taken to minimize and mitigate product risks associated with modern lithium-ion cells and battery packs. On the front end, these include the proper design for safety, use of high-quality cells and components, thorough testing from the component to the system level to include thermal runaway evaluations, and third-party certifications where appropriate. When problems do occur in the field, consider the engagement of a reputable third-party failure analysis organization that specializes in cells and batteries. Their team of experts can help to assess what happened, how it happened, and possibly even why the incident occurred. In turn, your organization can use this information to objectively determine appropriate responses, both immediate and longer-term, to mitigate risk to your customers, your product, and your brand.
