Lightweight Reachability System: GitLab Knowledge Graph + AI Agents

Sat, 04 Oct 2025 00:00:00 +0000

The Problem: Not Every SCA Vulnerability is Exploitable

When CVE-2025-29927 dropped for the Next.js library, many appsec teams worldwide immediately started working on a fix. Dependency scanners generated a vast number of security findings across company codebases that used a vulnerable version of the package.

Many of those alerts were false positives, because most traditional dependency scanners miss one technical point: Vulnerability != Exploitability.

The Next.js CVE perfectly illustrates this gap. From an exploitability perspective, this vulnerability is only relevant to applications that match these conditions:

Self-hosted deployments using next start with output: standalone
Middleware file (middleware.js/middleware.ts) exists and is actively used
Middleware performs critical security operations like authentication or authorization checks
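The first two conditions above can be pre-screened from the repository itself. The following is a minimal sketch, not part of any scanner: the helper names are hypothetical, the config check is a plain text heuristic, and whether the app is self-hosted has to be supplied by the caller because it cannot be read from the code. The third condition (what the middleware actually does) is left to a human or an agent.

```python
import re
from pathlib import Path

# Hypothetical helper names. File locations follow the Next.js convention of
# placing the middleware file at the project root or under src/.
MIDDLEWARE_NAMES = ("middleware.js", "middleware.ts")
CONFIG_NAMES = ("next.config.js", "next.config.mjs", "next.config.ts")
STANDALONE_RE = re.compile(r"output\s*:\s*['\"]standalone['\"]")


def has_middleware(repo: Path) -> bool:
    """True if a middleware file exists at the repo root or under src/."""
    return any((base / name).is_file()
               for base in (repo, repo / "src")
               for name in MIDDLEWARE_NAMES)


def uses_standalone_output(repo: Path) -> bool:
    """Text heuristic: look for output: 'standalone' in a Next.js config file."""
    for name in CONFIG_NAMES:
        config = repo / name
        if config.is_file() and STANDALONE_RE.search(config.read_text(encoding="utf-8")):
            return True
    return False


def needs_review(repo: Path, self_hosted: bool) -> bool:
    """Flag the repo for deeper review only when every precondition may hold."""
    return self_hosted and has_middleware(repo) and uses_standalone_output(repo)
```

A repository that fails this pre-screen can be deprioritized; one that passes still needs someone to confirm that the middleware performs authentication or authorization.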

Yet every traditional dependency scanner will flag any application using affected Next.js versions, regardless of deployment context or middleware usage. This creates a massive false positive problem that burns security & engineering hours and creates alert fatigue.

Why Reachability Analysis Matters

Traditional SCA (Software Composition Analysis) tools follow a simple pipeline: extract the SBOM ➡️ compare versions against a vulnerability database ➡️ flag any match as a risk.

This simplistic approach results in a high rate of false positives, as it doesn’t consider whether the vulnerable code is actually reachable or exploitable in your specific context. Real exploitation requires all of the following:
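To make that pipeline concrete, here is a minimal sketch of version-only matching. The `Advisory` class, the `naive_sca_scan` function, and the `flask` entry are illustrative assumptions; the fixed version 2.32.4 comes from the public advisory for CVE-2024-47081, not from a real scanner's database format.

```python
from dataclasses import dataclass


def parse_version(text: str) -> tuple[int, ...]:
    """Parse a plain dotted version such as '2.32.3' (no pre-release tags)."""
    return tuple(int(part) for part in text.split("."))


@dataclass(frozen=True)
class Advisory:
    cve_id: str
    package: str
    fixed_in: str  # first version that is no longer affected


def naive_sca_scan(sbom: dict[str, str], advisories: list[Advisory]) -> list[str]:
    """Flag every package whose installed version is below the fixed version."""
    findings = []
    for adv in advisories:
        installed = sbom.get(adv.package)
        if installed and parse_version(installed) < parse_version(adv.fixed_in):
            findings.append(f"{adv.cve_id}: {adv.package}@{installed}")
    return findings


# Illustrative SBOM: package name -> installed version.
sbom = {"requests": "2.32.3", "flask": "3.0.0"}
advisories = [Advisory("CVE-2024-47081", "requests", "2.32.4")]
print(naive_sca_scan(sbom, advisories))
```

Nothing in this logic looks at how the package is used, which is exactly why every match becomes a finding.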

Vulnerable code must be imported - Is the vulnerable function/module actually imported?
Reachable execution path - Does the call graph show that application flow can reach the vulnerable code?
Client controlled input - Can external input influence the vulnerable code path?
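The three conditions above combine into a simple gate: a finding is only worth escalating when all of them hold. A minimal sketch, with hypothetical names (`ReachabilityEvidence`, `verdict`) chosen for illustration:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReachabilityEvidence:
    imported: bool          # vulnerable module or function is imported
    path_reachable: bool    # call graph reaches the vulnerable code
    input_controlled: bool  # external input can influence that code path


def verdict(evidence: ReachabilityEvidence) -> str:
    """Return REACHABLE only when all three conditions hold."""
    if evidence.imported and evidence.path_reachable and evidence.input_controlled:
        return "REACHABLE"
    return "UNREACHABLE"


# A package that is imported and called, but only with fixed inputs.
print(verdict(ReachabilityEvidence(imported=True, path_reachable=True,
                                   input_controlled=False)))
```

The hard part is gathering the evidence for each field, not combining it, and that is where a queryable view of the codebase helps.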

GitLab’s Knowledge Graph: Tool Overview

GitLab recently released a Knowledge Graph tool that transforms your codebase into a queryable graph database. It understands:

Function call hierarchies
Import dependencies
Variable flow
Cross-file relationships

What sets this tool apart is that it comes with a local MCP setup, so it integrates easily with your AI agents & LLMs and lets them query your codebase for code insights.

While this tool is primarily designed to assist engineers during development, I realized its capabilities could be redirected to help AppSec engineers investigate SCA vulnerabilities.

Quick Example: CVE-2024-47081 Analysis

To demonstrate this approach, I analyzed a sample repository for SCA findings. The initial traditional scan flagged the requests library at version 2.32.3, which is affected by CVE-2024-47081, a credential leakage vulnerability in Python’s requests library. The vulnerability can leak .netrc credentials when requests processes a malicious URL. Our analysis of the repository, however, revealed:

3 files using the requests library: github_collector.py, npm_collector.py, and smithery_collector.py
All requests calls use hardcoded URL templates with validated string formatting
No arbitrary URL injection vectors found in the codebase

This analysis took under 30 seconds using GitLab’s Knowledge Graph + Claude, compared with the far longer time a manual code review of the same files would take. The system correctly identified that while the vulnerable package was present, the specific conditions for exploitation were not met.

Automating the Process

Taking this a step further, we can automate the process to speed up remediation: a job takes the SCA findings from our traditional scanners, has an AI agent review each one against the knowledge graph, and automatically generates a reachability report for each repository.
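The job itself can be small. Below is a minimal sketch of the loop under stated assumptions: the finding shape (`repo`, `cve`, `package`), the repository name, and the function names are hypothetical, and the LLM + Knowledge Graph call is replaced by a stub so that the orchestration is visible without inventing an API.

```python
import json
from typing import Callable

# Hypothetical shape of one scanner finding; real scanner output will differ.
Finding = dict[str, str]


def build_report(findings: list[Finding],
                 analyze: Callable[[Finding], str]) -> dict[str, list[Finding]]:
    """Group findings by repository and attach the agent's verdict to each."""
    report: dict[str, list[Finding]] = {}
    for finding in findings:
        reviewed = {**finding, "verdict": analyze(finding)}
        report.setdefault(finding["repo"], []).append(reviewed)
    return report


def stub_agent(finding: Finding) -> str:
    """Stand-in for the LLM + Knowledge Graph review, which is not modeled here."""
    return "UNREACHABLE"


findings = [{"repo": "collector-service", "cve": "CVE-2024-47081",
             "package": "requests@2.32.3"}]
print(json.dumps(build_report(findings, stub_agent), indent=2))
```

Passing the agent in as a callable keeps the job testable: the same loop runs against a stub in CI and against the real agent in production.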

Architecture Overview

graph LR
    A[SCA Scanner Findings] --> B[LLM + Knowledge Graph]
    B --> C[Call Graph Analysis]
    C --> D[Reachability Report]
    E[Your Codebase] --> B

Example: The Reachability Analysis Prompt

Here’s the prompt template that powers the analysis:

You are a security researcher analyzing CVE reachability.
Given CVE: [CVE-ID]
Vulnerable component: [package@version]
Vulnerability details: [description]
Target Project: [project]

Using the GitLab Knowledge Graph, determine:
1. Is the vulnerable package imported anywhere?
2. What are the call paths to vulnerable functions?
3. Are these paths reachable from external entry points?
4. What conditions must be met for exploitation?

Provide a reachability verdict: REACHABLE | UNREACHABLE
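In an automated job, the bracketed placeholders have to be filled per finding and the verdict has to be read back from free-form text. A minimal sketch of both steps follows; the function names and the `UNKNOWN` fallback are my additions rather than part of the original prompt.

```python
import re

PROMPT_TEMPLATE = """You are a security researcher analyzing CVE reachability.
Given CVE: {cve_id}
Vulnerable component: {component}
Vulnerability details: {details}
Target Project: {project}

Using the GitLab Knowledge Graph, determine:
1. Is the vulnerable package imported anywhere?
2. What are the call paths to vulnerable functions?
3. Are these paths reachable from external entry points?
4. What conditions must be met for exploitation?

Provide a reachability verdict: REACHABLE | UNREACHABLE
"""

# Word boundaries keep REACHABLE from matching inside UNREACHABLE.
VERDICT_RE = re.compile(r"\b(UNREACHABLE|REACHABLE)\b")


def build_prompt(cve_id: str, component: str, details: str, project: str) -> str:
    """Fill the template for one SCA finding."""
    return PROMPT_TEMPLATE.format(cve_id=cve_id, component=component,
                                  details=details, project=project)


def parse_verdict(response: str) -> str:
    """Return the last verdict keyword in the response, or UNKNOWN if absent."""
    matches = VERDICT_RE.findall(response)
    return matches[-1] if matches else "UNKNOWN"
```

Taking the last keyword is a pragmatic choice for responses that restate both options before concluding; an `UNKNOWN` result should go to a human rather than be treated as safe.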

Beyond CVEs: Advanced Security Applications

The knowledge graph isn’t just for investigating CVEs. It also opens up new possibilities for answering complex security questions that traditional SAST and SCA tools can’t address. For example:

Finding Authentication Bypasses

"Show me all code flows that reach database queries without passing through an authentication check"
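Underneath, a question like this is a path search with a forbidden set of nodes. Here is a toy sketch on a hand-written flow graph, where an edge means "execution continues to". It does not use the Knowledge Graph's schema or query language, and all node names are made up for illustration.

```python
def unauthenticated_paths(flow: dict[str, list[str]],
                          entry_points: list[str],
                          auth_checks: set[str],
                          sinks: set[str]) -> list[list[str]]:
    """Return paths from an entry point to a sink that skip every auth check."""
    paths: list[list[str]] = []

    def walk(node: str, path: list[str]) -> None:
        if node in auth_checks:
            return  # this branch is guarded, stop exploring it
        if node in sinks:
            paths.append(path)
            return
        for nxt in flow.get(node, []):
            if nxt not in path:  # avoid looping on cycles
                walk(nxt, path + [nxt])

    for entry in entry_points:
        walk(entry, [entry])
    return paths


# Toy flow graph: one guarded route and one unguarded route to the database.
flow = {
    "GET /orders": ["require_login"],
    "require_login": ["list_orders"],
    "list_orders": ["db.query"],
    "GET /export": ["export_orders"],
    "export_orders": ["db.query"],
}
print(unauthenticated_paths(flow, ["GET /orders", "GET /export"],
                            {"require_login"}, {"db.query"}))
```

Here only the `GET /export` route is reported. Real code needs a richer model, since an auth check is often a sibling call or a decorator rather than a node on the path.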

Tracking PII Flow

"Trace all paths where PII data flows from API input to external services"

Detecting SSRF Patterns

"Find all places where user input can influence URL parameters in HTTP requests"

The combination of GitLab’s Knowledge Graph and AI agents enables deep analysis of SCA vulnerabilities in a budget-friendly manner, since the Knowledge Graph is an open source tool. This approach saves time, reduces alert fatigue, and opens up new possibilities for automated security analysis without the need for expensive commercial tools.

References:

CVE-2025-29927 - Next.js Authorization Bypass Vulnerability | National Vulnerability Database
CVE-2025-29927 Deep Dive | JFrog Security Research
GitLab Knowledge Graph MCP Server | Official Documentation
CVE-2024-47081 - Python Requests Credential Leakage | GitHub Security Advisory

LLM | Ben Benhemo