The Potential Pitfalls of “Free” Software: A Firmware Engineer’s Tale

 

As a seasoned firmware engineer, I’ve encountered my fair share of perplexing bugs. But few have been as challenging and enlightening as an insidious SDRAM initialization bug I stumbled upon in the free software provided by a prominent chip manufacturer. In this blog post, I’ll take you through the journey of how this bug was discovered, the process of unraveling its mysteries, and the eventual triumph of fixing it.  

The Discovery 

I was tasked with starting to develop for a new MPU, so I bought three identical evaluation kits. The kits didn’t come with a display, but they did have a connector so a display could be added, which I did. I downloaded the MPU manufacturer’s suite of software for bare metal since we were not interested in running Linux on this particular MPU. The free suite of software included startup code, many examples using the individual peripherals found on the MPU, as well as drivers for those peripherals. This evaluation kit had external SDRAM and included software to initialize the specific SDRAM used on the kit. Everything seemed straightforward, and I got our software running on the MPU quickly thanks to the included suite of software. Everything seemed to be working well, so I was getting ready to hand off one of the three evaluation kits to another developer.  

As a quick sanity check, I loaded the same software that was working fine on my evaluation kit onto the second kit and, to my dismay, the system behaved erratically. The LCD sometimes showed garbage (seldom in the same locations), there were lockups at random times, and the kit generally exhibited unpredictable behavior. The bad behavior didn’t show up immediately but would usually happen within a minute of powering up.

The Investigation 

Since I had a third development kit, I put the same code on it and saw behavior similar to the second kit. It appeared that I had simply been lucky in which kit I picked up first. I then ran a longer-term test on that first kit just to confirm that it didn’t have the same issues. But regardless of how long it ran, this first kit worked perfectly every time.

I went back to the second and third kits and confirmed they consistently showed bad behavior at random times, usually within the first minute of powering the system up. Sometimes, the hardest part of debugging a problem is being able to consistently get it to exhibit the problem, so in some respects, I was in a good position to start debugging. Unfortunately, even though the kits were consistently exhibiting bad behavior, it was seldom the same error at the same time.  

When starting to debug, before making any code changes, I usually create a feature branch in git so I can always return to a state where the error was known to exist. The next step I take is to observe as much as I can about when and how the bug appears. Before blindly debugging, I like to get as much information about the bug as I can get. I spent several minutes trying to find a pattern to the bad behavior, but again, it appeared to be very random in both when and where it would occur. 

Once I have observed the bug and gotten a feel for where and what might be causing it, I will usually pull out my JTAG debugger and start setting some breakpoints. In this case, since the failures were seemingly so random, I wasn’t sure where to set breakpoints, so I instead just let the code run under the debugger and hoped that once it failed, I could look at the memory and get a clue as to what the culprit was. I did manage to halt the MPU after a failure, but this didn’t provide any “aha” moments. Instead, it solidified my feeling that the issue I was seeing wasn’t really related to my software, but rather had something to do with the SDRAM on the kit, which is where my code was running.

I started looking at the components on the development kits to ensure that they were all the same from kit to kit. From what I could see, all three kits were identical, including the SDRAM. I then started looking into the provided code that was used to initialize the SDRAM. SDRAM initialization is a delicate process, involving precise timing and configuration to ensure the memory is ready for access, and luckily the manufacturer provided us with the exact timing for the specific SDRAM that was on the development kit. Upon cursory inspection, everything looked fine.  

Another useful debug approach is to search various support forums to see if anyone else has run into a similar issue. In this case, since it was a relatively new development kit, the chip manufacturer’s customer support forum seemed like the best place to go for help.  Unfortunately, the little information that was there was regarding the Linux code as opposed to the bare metal code that we were using.  Being an introvert, I hated to admit that it was time to make a call to the manufacturer itself for some assistance. 

The Debugging 

I contacted a local FAE from the manufacturer, and he came to my office to witness the problem firsthand. He was sympathetic and quickly realized this issue was beyond his expertise, so he helped me contact one of their design engineers via the manufacturer’s help desk. Unfortunately, that’s where things went south, because not only was the turnaround time in answering my questions very long, but the design engineer also became very defensive and basically said that it had to be something in my software, even though I explained to him that my software was working perfectly fine on one of the three development kits. I had assumed the manufacturer would want to get to the bottom of this issue, but between their finger pointing and the painfully long delays between their help desk questions and answers, I realized that they weren’t going to help me, and I had to go back to the job of debugging this myself. 

I went back to the SDRAM initialization code and started going through it with a fine-tooth comb. I also needed the datasheet for the SDRAM on the development kits to check that the timing met its specifications. The initialization code itself looked fine. Looking at the timing specs, the minimum values specified in the SDRAM datasheet were the same numbers used by the SDRAM initialization code, at least as far as the comments were concerned.

 

I was getting a bit discouraged but luckily one of my coworkers pointed me to the SDRAM initialization in the Linux codebase to compare it to the bare metal codebase that we were using.  

The Linux codebase looked completely different, and it took a while to compare apples to apples, because the Linux timing parameters used absolute values, whereas the bare metal code computed its values through a macro. Although the comments declared the same minimum timing values as those in the bare metal code, once I looked at the actual values being used, I noticed that three timing parameters were larger in the Linux codebase by one: tRCD (row-activate-to-column-command delay), tRP (row precharge time), and tRFC (refresh cycle time).

I hadn’t tested the Linux codebase on these three new development kits, so I didn’t know for sure that it would work correctly, but since most of the users were using Linux and no one else had complained about this combination of MPU and SDRAM, I figured it was an easy test to change those three timing parameters in the bare metal code and then run it on one of the “bad” development kits.   

The Fix  

To my relief, with the new timing values for tRCD, tRP, and tRFC, both of the “bad” kits started working just as solidly and reliably as the first “good” kit. Just to confirm, I put the new timing code on the “good” kit, and it continued to work well, so I was convinced that these SDRAM timing values were the cause of the intermittent random failures I was seeing on the “bad” kits. I also tried changing only one or two of the timings and empirically determined that all three values needed to be changed for these kits to work perfectly over time.
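Judging “works perfectly over time” by watching the LCD is slow and subjective. A repeatable memory pattern test can give a clearer pass/fail signal when trying each combination of timing changes. The sketch below is only an illustration of that idea: `pattern_for` and `sdram_pattern_test` are hypothetical names, not part of the manufacturer’s software suite, and this is not the test that was actually run on these kits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: derive a test word from the word index and a
 * per-pass seed, so neighbouring words differ and every pass (new seed)
 * writes different data. */
static uint32_t pattern_for(size_t index, uint32_t seed)
{
    uint32_t x = ((uint32_t)index * 2654435761u) ^ seed;
    return x ^ (x >> 16);
}

/* Hypothetical soak-test pass: fill the region, read it back, and
 * return the number of mismatching words (0 means the pass was clean).
 * On real hardware the region would be uncached or the data cache
 * flushed between the two loops; otherwise the read-back may never
 * reach the SDRAM. That step is target-specific and omitted here. */
static size_t sdram_pattern_test(volatile uint32_t *base, size_t words,
                                 uint32_t seed)
{
    size_t errors = 0;

    assert(base != NULL);

    for (size_t i = 0; i < words; i++) {
        base[i] = pattern_for(i, seed);
    }
    for (size_t i = 0; i < words; i++) {
        if (base[i] != pattern_for(i, seed)) {
            errors++;
        }
    }
    return errors;
}
```

Calling `sdram_pattern_test` in a loop with a changing seed, and logging any nonzero return, would turn “fails somewhere within a minute” into a count that can be compared across kits and timing settings.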

The unfortunate part about the ultimate fix was that I never really understood why the bare metal code had a problem to begin with. As mentioned, the values used in the macro to determine the actual timing values looked to be correct. Honestly, I’m not sure I ever would have changed the timing parameters to their new values without having access to the Linux codebase, which had slightly different values for those three timing parameters, because there was nothing that pointed to them having a problem, other than the empirical “bad” behavior on two of the three development kits. 
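One way a computed timing can come out one step lower than a hand-entered value, even when the nanosecond figure fed into the macro is correct, is integer truncation when converting time to clock cycles. Whether that is what happened in the manufacturer’s macro was never established; the sketch below only illustrates the mechanism, and it assumes the “larger by one” difference was in units of clock cycles. The function names, the 18 ns minimum, and the 133 MHz clock are all made up for the example and are not taken from the actual code or datasheet.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical conversion helpers (not the vendor's macro).
 * Times are in picoseconds to keep the arithmetic in integers. */

/* Truncating conversion: rounds DOWN, so the programmed delay can end
 * up shorter than the datasheet minimum. */
static uint32_t ps_to_cycles_floor(uint32_t t_ps, uint32_t clk_period_ps)
{
    assert(clk_period_ps > 0u);
    return t_ps / clk_period_ps;
}

/* Ceiling conversion: rounds UP, so the programmed delay is never
 * shorter than the requested minimum. */
static uint32_t ps_to_cycles_ceil(uint32_t t_ps, uint32_t clk_period_ps)
{
    assert(clk_period_ps > 0u);
    return (t_ps + clk_period_ps - 1u) / clk_period_ps;
}

/* Illustrative values only: an 18 ns minimum and a ~133 MHz clock. */
#define EXAMPLE_T_MIN_PS      18000u
#define EXAMPLE_CLK_PERIOD_PS 7519u
```

With these illustrative numbers, `ps_to_cycles_floor(EXAMPLE_T_MIN_PS, EXAMPLE_CLK_PERIOD_PS)` yields 2 cycles (about 15 ns, short of the 18 ns minimum), while `ps_to_cycles_ceil` yields 3. A comment stating “18 ns” would be correct in both cases, which is why checking the values actually written to the controller matters more than checking the comments.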

The Reflection 

It is only an assumption, but based on my observations, the timing parameters were right on the hairy edge and were good enough on some MPU/SDRAM/board combinations but were apparently not good enough for all potential combinations. This experience was a stark reminder of the importance of attention to detail in firmware engineering. It also highlighted the value of a thorough understanding of the hardware you’re working with. 

The Takeaway 

For you, the aspiring firmware engineer, or the seasoned veteran, this tale serves as a lesson in perseverance and the importance of a methodical approach to problem-solving. Bugs like these are not just obstacles; they are opportunities to learn and grow. Ultimately, there are so many peripherals on modern MPUs that it is almost a necessity to use already-written code to initialize and use them. In many cases, that means using either a Linux codebase or some bare metal code provided by the MPU manufacturer, and that can save a lot of time and effort. But, as with most things, especially free things, you should cautiously trust but vigorously verify. Don’t assume that code is free of bugs just because the manufacturer provided it.

Conclusion   

This SDRAM initialization bug was a formidable foe, but with a combination of technical acumen and a systematic approach, it was conquered. As you embark on your own firmware engineering journey, remember that every bug tells a story, and within that story lies the potential for personal and professional development. 

I hope this recount of my encounter with this SDRAM initialization bug has provided you with insights into the world of firmware engineering. Stay curious, stay diligent, and happy coding! 

Author Bio: 
Jim Weber is a senior firmware engineer at Amulet Technologies. With over three decades of experience in design and development, he takes pride in his work and hates bugs, unless of course they’re a feature. In his spare time, Jim has recently become addicted to pickleball and can often be found dinking and driving around his local courts.