Interview with Ibrahim Haddad on Software Composition Analysis Tools

Ibrahim Haddad is a well-known figure in the global open source community. His career started in the late nineties as a software developer focusing on open source software. Over two decades later, he is now Vice President of Strategic Programs at the Linux Foundation, where he leads the LF AI Foundation. He is highly passionate about open source and, more importantly, about its collaboration methodology, and he has focused much of his energy at the Linux Foundation on facilitating vendor-neutral environments for advancing open source innovation.

We reached out to him as the author of several publications on open source topics, specifically open source compliance. More recently, he revisited his “Metrics to compare software composition analysis tools”, a live document he hosts on Google Drive, now with contributions from Thomas Steenbergen (HERE Technologies), Gilles Gravier (Wipro), and Jeff Luszcz (Palamida). The document, originally Chapter 12 of his ebook “Open Source Compliance in the Enterprise (2nd Edition)”, has been updated to better reflect today’s use of SCA tools and the metrics one should consider when shopping for a tool that meets one’s requirements.

Hi Ibrahim, what made you revisit this document?

Thank you Fredrik for contacting me and for the opportunity to shed light on this ongoing effort of documenting key metrics to consider when evaluating SCA tools. I created the initial version of this document about two and a half years ago just before I released the second edition of Open Source Compliance in the Enterprise. Back then, every conversation I had around open source compliance involved the challenge of evaluating compliance tools available in the market both as commercial and open source solutions. My idea was very simple – let’s come up with a set of metrics that we can standardize and use for evaluating and comparing tools in that space. I put together the initial set of core metrics and published it with the e-book. Two plus years later, these tools have evolved in different ways; there are several new entrants into the market; many tools now support the detection of security vulnerabilities and offer a wide range of new and improved features. In parallel, as a consumer of these tools, I always felt the pain of going through these different demos and trying to distill information that will help guide my decision on what tool to use and for what purpose.

In the past couple of months, it became essential to me to revisit the chapter as an independent paper, update it, gather feedback from professionals on what they consider when they evaluate and compare these tools, and document all of that following an open and transparent process.

What are the most recent metrics added to the document?

This is a hard question to address. I think the most appropriate answer would be – that depends on your specific requirements. The document captures 10 different categories to evaluate:

  • Knowledge base
  • Detection capabilities
  • Ease of use
  • Operational capabilities
  • Integration capabilities
  • Security vulnerability database
  • Advanced discovery methods
  • Cost
  • Deployment models
  • Reporting capabilities

There might be other categories that we’re missing and for that I would invite anyone reading this Q&A to contribute to the document.

I would encourage anyone using this document to evaluate SCA tools to build their evaluation criteria from the requirements they care about most, and then proceed with the evaluation, which includes rating the tools against the metrics set within each of these categories.
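One way to picture this kind of evaluation is a simple weighted scoring sheet. The sketch below is only an illustration: the weights, the 0 to 5 rating scale, and the tool names are hypothetical and do not come from the document, which leaves the choice of criteria and scoring to each evaluator.

```python
# The ten top-level categories listed in the document.
CATEGORIES = [
    "Knowledge base",
    "Detection capabilities",
    "Ease of use",
    "Operational capabilities",
    "Integration capabilities",
    "Security vulnerability database",
    "Advanced discovery methods",
    "Cost",
    "Deployment models",
    "Reporting capabilities",
]


def weighted_score(ratings, weights):
    """Return the weighted average rating over the weighted categories.

    `weights` maps a category to its importance (hypothetical values);
    `ratings` maps a category to the tool's rating on a 0-5 scale.
    A category that is weighted but not rated raises KeyError.
    """
    unknown = set(weights) - set(CATEGORIES)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    total_weight = sum(weights.values())
    if total_weight == 0:
        raise ValueError("at least one category needs a non-zero weight")
    return sum(ratings[c] * w for c, w in weights.items()) / total_weight


# Hypothetical requirements: detection matters most, cost least.
weights = {
    "Detection capabilities": 3,
    "Security vulnerability database": 2,
    "Cost": 1,
}

# Hypothetical ratings collected during vendor demos.
tool_a = {"Detection capabilities": 4, "Security vulnerability database": 3, "Cost": 2}
tool_b = {"Detection capabilities": 2, "Security vulnerability database": 5, "Cost": 5}

scores = {
    "Tool A": weighted_score(tool_a, weights),
    "Tool B": weighted_score(tool_b, weights),
}
for name, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {score:.2f}")
```

A single number hides a lot, so in practice the per-category ratings are worth keeping next to the total, since two tools with similar totals can differ sharply on the one requirement that matters most to you.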

What’s new in this updated second revision? We have reorganized the top-level categories and significantly improved the description of each of them. More importantly, we’ve also added a large number of sub-criteria within each category, making it easier to formulate the questions you should ask the tool vendor to learn more about specific features.

Why is snippet detection important in someone’s compliance/security endeavor?

I personally believe that support for snippet detection and identification is key in any license compliance effort. My hypothesis is very simple. Open source software use in commercial products and services is ubiquitous. Software developers incorporate open source code in proprietary code bases in two forms:

  1. As whole components – for example, using the zlib library as is, or
  2. As snippets – small pieces or listings of code, in this context copied from a software component licensed under an open source license into other software components, which are either proprietary or open source

If we ignore the support for discovering snippets and identifying their original source or origin and their license, we are pretty much ignoring half of our use cases and proceeding with the assumption that developers only use open source software as a whole component. This is a completely unrealistic approach. In my humble opinion, 20 years after I started with open source compliance, it is not really worth the effort to ensure compliance if you ignore the use case of snippets.

Further to my argument, snippet support is even more critical in relation to security vulnerability detection. Let’s take the example of a development team that decided to use a specific open source library in its entirety. If they run the scanner on their code base, the SCA engine will discover the whole component and flag that there are 2 open security vulnerabilities in relation to that specific component. However, what happens if the development team copies 300 lines out of that library into their code base? Without snippet support, the SCA engine will not discover the copied snippets and will not notify the developers that the copied snippets contain a security vulnerability.
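To make the difference between whole-component and snippet matching concrete, here is a minimal sketch of one possible approach: hashing sliding windows of whitespace-normalized lines and looking them up in an index built from known open source code. This is an assumption for illustration only. The interview does not describe how any particular SCA engine works, and the names `examplelib`, `build_index`, and `find_snippet_origins` are hypothetical.

```python
import hashlib


def normalize(line):
    """Drop all whitespace so that re-indented copies still match."""
    return "".join(line.split())


def window_hashes(source, window=3):
    """Hash every run of `window` consecutive non-empty normalized lines."""
    lines = [normalize(line) for line in source.splitlines()]
    lines = [line for line in lines if line]
    hashes = set()
    for i in range(len(lines) - window + 1):
        chunk = "\n".join(lines[i:i + window])
        hashes.add(hashlib.sha256(chunk.encode("utf-8")).hexdigest())
    return hashes


def build_index(components, window=3):
    """Map each window hash to the names of the components containing it."""
    index = {}
    for name, source in components.items():
        for digest in window_hashes(source, window):
            index.setdefault(digest, set()).add(name)
    return index


def find_snippet_origins(source, index, window=3):
    """Return the known components that share at least one window with `source`."""
    origins = set()
    for digest in window_hashes(source, window):
        origins |= index.get(digest, set())
    return origins


# Hypothetical open source component held in the knowledge base.
EXAMPLELIB = """\
def checksum(data):
    total = 0
    for byte in data:
        total = (total + byte) % 65521
    return total
"""

# Proprietary file that copied three lines with different indentation.
PROPRIETARY = """\
def handle(data):
        total = 0
        for byte in data:
            total = (total + byte) % 65521
        print(total)
"""

index = build_index({"examplelib": EXAMPLELIB})
print(find_snippet_origins(PROPRIETARY, index))
```

Once a snippet's origin is identified, the same license and vulnerability lookups that apply to a whole component can be applied to it. A real engine would need far more robust normalization and a much larger knowledge base than this toy index.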

This is why I believe that snippet support is critical at any level. I believe advanced compliance practitioners understand the situation very well, appreciate snippet support, and opt to align with a vendor who provides it.

One thing to note is that code comes into an organization from various sources, and most of the time it is in snippet form, i.e. not whole components that can be compiled, built and run or linked to. If we ignore this fact, ensuring compliance shifts from being a practice of discovering all open source code, its original source and license, and fulfilling the license obligations, to merely discovering and identifying the open source code we bring into the organization as complete or whole components.

Can you share some insight in how the community is tackling compliance at the moment?

Yes! I would like to make sure that people reading this Q&A are aware of the various efforts at the Linux Foundation and our available free resources to help with open source compliance. We host several community-driven projects focusing on collaborative approaches to managing licensing and compliance. These range from development of best practices, to specifications for inter-organizational exchanges of information, to the software tools needed to automate those exchanges. In particular, I would like to mention the following:

  • Open Compliance Program: The Open Compliance Program website is a starting point for developers and lawyers, particularly those who are new to open source compliance considerations, to learn more about the tools and best practices that can make compliance easier.
  • ACT (Automating Compliance Tooling) seeks to improve software tooling for detecting and complying with open source licenses. Its goal is to improve the interoperability of open source compliance tools to enable compliance workflows that can be optimized for each company’s unique build and release process.
  • OpenChain defines the key requirements for an organization’s open source compliance program. It establishes a conformance program where companies can self-certify to these requirements, with the goal of improving transparency and communication of compliance information across supply chains.
  • SPDX (Software Package Data Exchange) is a specification for communicating Software Bill of Materials information in a standardized, human- and machine-readable format. It enables better communication of information, including license and copyright details, between organizations and interoperability between compliance tools.

We also provide free online training on open source licensing and compliance – Open Source Licensing Basics for Software Developers – and many other resources such as white papers and ebooks, all of which are accessible via our website.

Thank you Ibrahim, it was a pleasure talking to you!

Thank you Fredrik.
