Solving the Open Source Dependency Detection Gap in Embedded Systems

Jun 23, 2025

Many embedded systems rely on C, C++, or assembly language to power everything from automotive electronic control units to medical devices and industrial control systems. These environments were not designed for modern dependency management, and they often lack the structural metadata (manifest files, dependency trees, and so on) that software composition analysis (SCA) tools depend on.

The demand for visibility, compliance, and accountability in embedded software is rapidly increasing. Regulatory frameworks in the US, EU, China, Japan, and other industrialized countries are expanding their scope. Customer audits and cybersecurity directives, particularly in the US and the EU, now reach into sectors that previously operated under the radar. Much of this pressure focuses on securing the software supply chain, with explicit attention to open source software. As a result, it’s no longer acceptable for embedded systems to function as black boxes of proprietary logic. It’s time to take action.

At FossID, we recognized this gap early on. Many SCA tool vendors view embedded codebases as uniquely difficult to analyze. But analyzing embedded systems to produce accurate and complete audits is possible and straightforward. We’ve been doing it successfully for years.

The Embedded Challenge: Code Without Clues

C and C++ are the dominant languages in embedded systems because they provide direct access to hardware and predictable real-time performance. These environments often consist of handcrafted code, unmanaged dependencies, and legacy software components that have evolved over years or even decades. What these codebases rarely include are things like:

Dependency manifests
Version metadata or declared license files
Standardized directory structures

Instead, open source code is often integrated manually. A developer may have copied and customized an open source driver in 2007. That code is still there, untraceable by SCA tools that expect cleanly declared packages and repositories. Such situations create real problems:

Legal and compliance teams can’t determine license obligations.
Security leads can’t evaluate known vulnerabilities.
Engineers can’t confidently answer: “What open source are we using?”

For embedded system teams preparing for product certification, customer audits, or M&A due diligence, these blind spots are a liability. You might expect SCA tools to uncover them; however, not all SCA tools are equal.

Why Most SCA Tools Struggle in Embedded Environments

Traditional SCA tools are built for modern software ecosystems. They work by parsing declared dependency trees, analyzing metadata, and using language-specific parsers to map what’s included and where it came from. That’s highly effective for Node.js, Java, Python, or containerized cloud-native applications. Embedded environments pose a different challenge:

There are no package managers
Open source code is often deeply integrated, modified, or stripped down
Dependency metadata is either absent or nonstandard

As a result, these SCA tools miss much of the open source that is actually present (false negatives) or, worse, report false positives that leave teams chasing ghosts through their codebases.
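To make the gap concrete, here is a minimal Python sketch of manifest-based discovery, the first step a declared-dependency scanner typically takes. The manifest list, the helper names (find_manifests, make_tree), and the sample trees are assumptions made for illustration; no specific SCA product works exactly this way.

```python
import os
import tempfile

# Manifest names that declared-dependency scanners commonly look for.
# Illustrative only, not an exhaustive list.
MANIFEST_NAMES = {"package.json", "pom.xml", "requirements.txt", "Cargo.toml", "go.mod"}


def find_manifests(root):
    """Return sorted relative paths of known manifest files under root."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name in MANIFEST_NAMES:
                found.append(os.path.relpath(os.path.join(dirpath, name), root))
    return sorted(found)


def make_tree(root, files):
    """Create a throwaway source tree (hypothetical helper for the demo)."""
    for rel, content in files.items():
        path = os.path.join(root, rel)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w") as handle:
            handle.write(content)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        # A made-up firmware layout: copied-in C sources, no manifest anywhere.
        make_tree(root, {
            os.path.join("drivers", "compress", "inflate.c"): "/* copied in 2007 */\n",
            "Makefile": "all:\n",
        })
        print(find_manifests(root))
```

On a tree like this the function finds nothing to parse, so a scanner that starts from manifests has no dependency list to work with, even though third-party code is sitting in the drivers directory.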

The Power of the FossID Knowledge Base

At the core of FossID’s detection capabilities is our knowledge base. It enables us to identify open source code even in the most obscure, stripped-down, or heavily modified codebases. This isn’t just a list of popular open source projects. It’s a deeply curated, continuously updated index capturing the global open source ecosystem.

What makes it different?

Coverage: Our knowledge base is maintained and curated by a dedicated research team. It covers over 200 million software components from dozens of public sources and user contribution sites, including historical and forked versions, small utilities, and hard-to-track libraries often reused in embedded systems.
Source code snippets: Unlike many tools that rely on full-file or declared package detection, FossID breaks code down to the snippet level. This approach allows us to identify open source code even if it’s been renamed or modified.
License lineage: Each match is paired with accurate, traceable license information, even for multi-licensed or dual-licensed components. The knowledge base is constantly updated and holds over 2,500 different known licenses today.

This extensive knowledge base allows us to see what other tools miss. Whether it’s a forgotten fork of zlib, an old BSD-licensed library, or a snippet of GPL code from an old driver that was merged into your firmware, FossID can identify it, trace its origin, and report the associated licensing information and possible security vulnerabilities.

It doesn’t stop here, though. Our knowledge base is one aspect of the capabilities needed in embedded environments. The second aspect is our snippet-level detection. Let’s explore that next.

Snippet-Level Detection Without Metadata

FossID takes a fundamentally different approach. Instead of depending on metadata, we focus on the code itself. Our engine performs deep, snippet-level scanning, analyzing the source code for known open source components, regardless of how they were integrated.

Here’s how it works:

As described in the previous section, we maintain one of the world’s most comprehensive knowledge bases of open source code, covering tens of thousands of projects and billions of snippets.
When scanning a codebase, whether a clean Node.js app, legacy C++ firmware, or a C driver from 1995, we look for matches at the code level, even if the files were renamed, reformatted, or partially rewritten.
Our engine maps those snippets to their origin, including license data, project metadata, known obligations, and security vulnerabilities.

This method allows FossID to detect reused code in environments with no formal dependency structure, addressing the exact challenge presented by embedded software and environments.
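The general idea of matching code rather than metadata can be sketched in a few lines of Python. This is a generic k-gram fingerprinting illustration, not FossID's engine or algorithm. The function names, the window size k, and the two C fragments are all assumptions made for the example. The sketch tolerates reformatting, comment changes, and surrounding code, but not renamed identifiers, which would require token-level normalization.

```python
import hashlib
import re


def normalize(source):
    """Strip C comments and all whitespace so formatting changes do not matter.

    Simplification: comment markers inside string literals are not handled.
    """
    source = re.sub(r"/\*.*?\*/", "", source, flags=re.DOTALL)
    source = re.sub(r"//[^\n]*", "", source)
    return re.sub(r"\s+", "", source)


def fingerprints(source, k=20):
    """Hash every k-character window of the normalized text."""
    text = normalize(source)
    return {
        hashlib.sha1(text[i:i + k].encode("utf-8")).hexdigest()[:12]
        for i in range(len(text) - k + 1)
    }


def overlap(candidate, reference, k=20):
    """Fraction of the reference's fingerprints that also occur in the candidate."""
    ref = fingerprints(reference, k)
    if not ref:
        return 0.0
    return len(ref & fingerprints(candidate, k)) / len(ref)


# Hypothetical "known open source" function, as it might be indexed.
REFERENCE = """
/* checksum update step */
unsigned long update(unsigned long a, const unsigned char *buf, int len) {
    while (len--) {
        a = (a + *buf++) % 65521;
    }
    return a;
}
"""

# The same function pasted into a firmware file, reformatted and re-commented.
CANDIDATE = """
static int init_board(void) { return 0; }
unsigned long update(unsigned long a,
                     const unsigned char *buf, int len)
{
    while (len--) { a = (a + *buf++) % 65521; }  // copied in 2007
    return a;
}
"""

if __name__ == "__main__":
    print(overlap(CANDIDATE, REFERENCE))
    print(overlap("int main(void) { return 0; }", REFERENCE))
```

Because the comparison runs on normalized content, the pasted function is recognized without any file name, manifest, or version string. A production engine would add an index over a very large corpus and more robust normalization, which this sketch does not attempt.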

Benefits for Embedded Development and Compliance

Combining the breadth of our knowledge base with the granularity of our snippet detection unlocks several benefits:

Precision in legacy codebases: We identify open source even if it was manually pasted, heavily modified, or lacks attribution.
Actionable compliance: We uncover associated open source licenses and guide compliance teams.
Preparedness for audits: We generate detailed open source reports and SBOMs, even from metadata-poor environments.
Confidence in safety-critical contexts: We validate that you’re not unknowingly shipping non-compliant or security-sensitive code.

This level of visibility is no longer optional: it’s expected and required.
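As a rough illustration of how detection results can feed an SBOM, the Python sketch below turns a list of matches into a minimal CycloneDX-style JSON document. The match records, the to_sbom helper, and the component shown are hypothetical, and the output is not validated against the CycloneDX schema; it does not represent FossID's report format.

```python
import json


def to_sbom(matches):
    """Build a minimal CycloneDX-style document from a list of match dicts."""
    components = []
    for match in matches:
        components.append({
            "type": "library",
            "name": match["name"],
            "version": match["version"],
            # SPDX license identifier attached to the detected component.
            "licenses": [{"license": {"id": match["license"]}}],
        })
    return {
        "bomFormat": "CycloneDX",
        "specVersion": "1.5",
        "version": 1,
        "components": components,
    }


if __name__ == "__main__":
    # Hypothetical detection result for a copied-in compression library.
    matches = [{"name": "zlib", "version": "1.2.3", "license": "Zlib"}]
    print(json.dumps(to_sbom(matches), indent=2))
```

The point is that once code-level matches carry a component name, version, and license, an SBOM can be produced even for a tree that never declared its dependencies.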

Example Embedded Use Cases Where FossID Adds Value

Our solution shines across a range of embedded scenarios:

Compliance: Legacy projects often contain untracked open source. We help teams establish a clean baseline for licensing and security.
Certification: We help you generate an accurate SBOM when preparing for product certifications.
Due diligence: M&A teams require a clean inventory of open source software before acquiring companies with embedded IP. We invented Blind Audits, which make audits faster and simpler while keeping them detailed.
Long-tail product support: Many embedded systems stay in production for over a decade (automotive is a great example). Managing open source over that lifecycle demands robust tracking and periodic reassessment.

Why This Matters Now

Open source software is foundational across all domains, and it now finds new avenues into codebases through AI coding assistants. As regulatory scrutiny increases, driven by mandates like the U.S. Executive Order on Cybersecurity, the EU Cyber Resilience Act, and evolving SBOM requirements, embedded teams must modernize their approach to compliance and visibility.

The good news? Modernization doesn’t mean overhauling your entire stack. It means adopting tools that understand your stack.

At FossID, we believe open source compliance should be seamless regardless of the programming language and development environment. Yes, analyzing embedded codebases is challenging. But with the right tools, you can uncover hidden open source code, fulfill your compliance obligations, and prepare your systems for a future defined by transparency and accountability.

If you’re building embedded systems and wondering what’s inside your code, we’re here to help. Book a call with one of our experts to discuss your business needs and how our tools and services can help.