In more generative AI news, Google announced the release of AlphaCode 2, the next version of the natural language AI coder from its DeepMind lab, now powered by the Gemini AI model. TechCrunch reported that AlphaCode 2 greatly improved on its predecessor, performing better than 85% of human competitors in a programming contest, compared to just 50% about a year ago. That’s an impressive pace of improvement.
OpenAI continues to push its natural-language coding project, Codex, which also powers GitHub Copilot. All this innovation in AI-powered software coding assistants has developers excited about the opportunity to spend more time architecting creative solutions and less time banging out repetitive code line-by-line.
For its part, AlphaCode 2 has shown the ability to understand complex programming challenges and break them down into simpler components. However, practical roadblocks remain: AlphaCode 2 is expensive to operate at scale and involves significant trial and error.
But with this rapid pace of innovation, it won’t be long until we can expect to see AI-generated code formally introduced into DevOps teams and the software development lifecycle (SDLC).
What AI-Generated Code Means for Business Risk
On the flip side of this shiny coin is an increase in risk to businesses that produce proprietary, commercial software. With an influx of AI-generated code comes an increase in potential security vulnerabilities and legal compliance headaches. Both risks, security and compliance, are significant concerns for executive leadership, but let’s take a closer look at just the legal compliance issue.
Jon Aldama, Chief Product Officer at FossID, points out that AI models are “trained on open source and generate suggestions that match open source code without any regard to license or copyright ownership”.
This training data enables the AI system to learn coding patterns, structures, and styles. However, the generated code may include code copied verbatim from open source repositories. As software developers increasingly pluck code from tools like AlphaCode 2, GitHub Copilot and Codex, more third-party code will find its way into their companies’ codebases.
Currently, many dev teams use Software Composition Analysis (SCA) tools to scan their code to detect all instances of free and open source software (FOSS) used. Most SCA tools reliably identify complete open source software packages and components. The problem, however, is that finding a code snippet is significantly more difficult than finding open source software packages used in their entirety. These code snippets are just as protected by license and copyright law as the entire package.
Finding these AI-generated snippets of open source code is difficult because not all SCA tools can do so. Many SCA solutions only scan managed open source by inspecting package manifest files (detecting only direct and transitive dependencies). Detecting these code snippets requires a more advanced SCA tool that can detect unmanaged open source. Unmanaged code includes all the open source that is not declared in your package manifests – namely snippets of code, binaries, scripts, etc.
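To make the managed versus unmanaged distinction concrete, here is a minimal sketch in Python. It assumes a `package.json`-style manifest and a `node_modules/` dependency directory purely for illustration; the function names `declared_dependencies` and `unmanaged_files` are hypothetical and do not describe how any particular SCA product works.

```python
import json


def declared_dependencies(manifest_text: str) -> set[str]:
    """Return package names declared in a package.json-style manifest.

    This is roughly all that a manifest-only scan gets to see.
    """
    manifest = json.loads(manifest_text)
    names: set[str] = set()
    for section in ("dependencies", "devDependencies"):
        names.update(manifest.get(section, {}))
    return names


def unmanaged_files(
    all_files: list[str],
    managed_dirs: tuple[str, ...] = ("node_modules/",),  # assumed layout
) -> list[str]:
    """Return files that live outside package-manager directories.

    Any open source pasted into these files is invisible to a scan
    that only reads the manifest.
    """
    return [
        path
        for path in all_files
        if not path.startswith(managed_dirs) and path != "package.json"
    ]
```

In this sketch, a file such as `src/vendor/copied_util.js` shows up in `unmanaged_files` but never in `declared_dependencies`, which is exactly the gap a manifest-only scan leaves open.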
Embrace AI-Generated Code and Mitigate Business Risk with SCA
These risks can be mitigated to allow developers to fully leverage AI models to accelerate and innovate. By implementing Software Composition Analysis (SCA) tools that have these two critical functions, compliance officers and executive teams can feel confident that their intellectual property is protected from legal liability.
- Code-snippet detection: Granular inspection of code to identify partial matches to open source software.
- Unmanaged code scanning: Scan your entire codebase, not only managed code.
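As a rough illustration of what code-snippet detection involves, the sketch below fingerprints overlapping windows of normalized lines and checks how many of a file's fingerprints also occur in a known open source file. This is a simplified, assumed approach chosen for clarity; the names `fingerprints` and `snippet_overlap` are hypothetical, and production SCA tools use far more robust matching and large reference databases.

```python
import hashlib


def normalize(line: str) -> str:
    """Remove all whitespace so indentation changes do not hide a match."""
    return "".join(line.split())


def fingerprints(source: str, window: int = 3) -> set[str]:
    """Hash every run of `window` consecutive non-empty normalized lines."""
    lines = [normalize(line) for line in source.splitlines()]
    lines = [line for line in lines if line]
    return {
        hashlib.sha1("\n".join(lines[i:i + window]).encode("utf-8")).hexdigest()
        for i in range(len(lines) - window + 1)
    }


def snippet_overlap(candidate: str, known: str, window: int = 3) -> float:
    """Fraction of the candidate's fingerprints that also occur in `known`.

    A value above 0.0 means at least one run of `window` lines was
    found in the known open source file (a partial match).
    """
    candidate_prints = fingerprints(candidate, window)
    if not candidate_prints:
        return 0.0
    shared = candidate_prints & fingerprints(known, window)
    return len(shared) / len(candidate_prints)
```

A nonzero score would only flag a file for review. Deciding whether the matched lines actually carry license obligations remains a job for the compliance team.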
While AlphaCode 2 may not yet be ready for integration into your SDLC, software engineers are anticipating the day highly capable AI models will solve their complex challenges and allow them to focus on innovation. With a capable SCA tool that includes code-snippet detection and scans unmanaged code as well, executive leadership can embrace the generative-AI movement.
The FossID Team byline indicates this article reflects the collective work of the FossID team. With nearly a decade of expertise delivering open source auditing services, FossID is a pioneer in the critical field of software auditing and compliance. FossID’s Software Composition Analysis (SCA) tool, Workbench, and professional services are designed to ensure comprehensive open source compliance and security in software development.