BCDA: The AI Detective Separating Real Bugs from False Alarms

🎯 From Potential Sink to Actionable Intelligence

BCDA (Bug Candidate Detection Agent) exists to solve the fundamental problem of lightweight sink analysis: distinguishing real vulnerabilities from false-positive noise. When MCGA, our cartographer, flags a function containing a potentially vulnerable “sink” (such as a function that executes system commands), BCDA takes over.

Its job isn’t just to say “yes” or “no.” BCDA performs a deep, multi-stage investigation powered by LLMs to produce a Bug Inducing Thing (BIT). A BIT is a high-fidelity, structured report detailing a confirmed vulnerability candidate. It includes the exact location, the specific trigger conditions (like if-else branches), and a detailed analysis generated by LLMs. This report becomes a detailed guide for our demolition expert, BGA, and the fuzzing stages.

To get there, BCDA follows a rigorous, four-step investigative process.


Step 1: 🔍 Path Expansion and Pruning (Gathering the Evidence)

A simple call stack is like a list of locations a suspect has visited: it shows where they’ve been, but not what they did there or who else was involved. It often misses crucial context from functions outside the direct call chain, such as utility or validation functions.

BCDA starts by expanding the execution path. It receives the call path from the source (typically the harness) to the sink (flagged by MCGA) and uses Tree-sitter-based parsing to identify every single function call made along that path. This expanded set of functions is then provided to an LLM.

But more data isn’t always better. To avoid analyzing irrelevant code, BCDA immediately applies LLM-powered pruning. Since MCGA already hints at the type of vulnerability (e.g., Command Injection, XXE, Path Traversal), BCDA asks the LLM: “Of all these functions, which ones are truly relevant to this vulnerability type?”

The LLM responds with only the necessary functions, leaving BCDA with a rich but focused context for its investigation.
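As a rough illustration of the expand-then-prune flow described above, here is a minimal Java sketch. It is not BCDA’s actual implementation: the callee map stands in for the Tree-sitter parsing results, the predicate stands in for the LLM’s relevance answer, and the names PathExpansionSketch, expand, prune, and logUsage are hypothetical. It also assumes that functions on the original call path are always kept.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

/** Illustrative sketch only; names and data shapes are assumptions. */
final class PathExpansionSketch {

    /**
     * Adds every function called directly by a function on the call path.
     * The callee map stands in for the output of Tree-sitter-based parsing.
     */
    static Set<String> expand(List<String> callPath, Map<String, List<String>> calleesOf) {
        Set<String> expanded = new LinkedHashSet<>(callPath);
        for (String function : callPath) {
            expanded.addAll(calleesOf.getOrDefault(function, List.of()));
        }
        return expanded;
    }

    /**
     * Keeps on-path functions unconditionally (an assumption) and drops
     * off-path functions that the relevance check rejects. The predicate
     * stands in for the LLM query about the hinted vulnerability type.
     */
    static List<String> prune(Set<String> expanded, List<String> callPath,
                              Predicate<String> isRelevant) {
        List<String> kept = new ArrayList<>();
        for (String function : expanded) {
            if (callPath.contains(function) || isRelevant.test(function)) {
                kept.add(function);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> path = List.of(
            "fuzzerTestOneInput", "fuzz", "testAuthAction", "authenticateAsAdmin");
        Map<String, List<String>> callees = Map.of(
            "fuzz", List.of("testAuthAction", "logUsage"), // logUsage is made up
            "testAuthAction", List.of("getRemainingAsString", "authenticateAsAdmin"),
            "authenticateAsAdmin", List.of("isAdmin"));
        Set<String> expanded = expand(path, callees);
        // Hypothetical LLM verdict for an LDAP injection hint.
        Set<String> relevant = Set.of("isAdmin", "getRemainingAsString");
        System.out.println(prune(expanded, path, relevant::contains));
    }
}
```

In this toy run the unrelated logUsage helper is dropped while isAdmin, which sits outside the direct call chain, is retained for later analysis.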

Step 2: 🕵️ Vulnerability Classification (The Interrogation)

With the relevant code path established, BCDA begins its interrogation. BCDA constructs a sanitizer-specific prompt for the LLM. This isn’t a generic question; it’s a highly detailed briefing that includes:

  • A clear explanation of the suspected vulnerability type.
  • Common coding patterns and anti-patterns associated with that vulnerability.
  • Effective strategies for detecting it in source code.

For example, when analyzing LDAP injection, BCDA provides this specific guidance:

<sanitizer>
  <type>LDAPInjection</type>
  <description>
    LDAP queries constructed by concatenating unescaped user input.

    Find: String concatenation building LDAP DN or filters, including multi-valued RDNs.
    ```java
    String username = request.getParameter("user");
    String dn = "cn=" + username + ",dc=example,dc=com";  // BUG: unescaped
    dirContext.search(dn, attrs);

    // Filter context
    String filter = "(uid=" + username + ")";  // BUG: unescaped filter
    dirContext.search(baseDN, filter, controls);
    ```
  </description>
</sanitizer>

Armed with this domain knowledge and the expanded-and-pruned code path, the LLM makes a definitive judgment: does this path contain the suspected vulnerability, or is it a false alarm?
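To make “unescaped” in the guidance above concrete, the sketch below contrasts the flagged concatenation pattern with a value escaped according to RFC 4515, the filter string syntax. This is our own illustration, not part of BCDA or its prompts; the class and method names are hypothetical, and it covers only the filter context (DN values follow different rules, defined in RFC 4514).

```java
/** Illustrative sketch; the class and method names are hypothetical. */
final class LdapFilterEscapeSketch {

    /** Escapes the characters RFC 4515 treats as special inside a filter value. */
    static String escapeFilterValue(String value) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '\\': out.append("\\5c"); break;
                case '*':  out.append("\\2a"); break;
                case '(':  out.append("\\28"); break;
                case ')':  out.append("\\29"); break;
                case '\0': out.append("\\00"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String username = "*)(uid=*";
        // Pattern flagged in the guidance: input changes the filter structure.
        System.out.println("(uid=" + username + ")");
        // Escaped value: input stays inside a single assertion value.
        System.out.println("(uid=" + escapeFilterValue(username) + ")");
    }
}
```

A path where user input passes through a routine like this before reaching the search call is the kind of case a classifier would be expected to treat as a false alarm.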

For more details on how we prepare and implement context engineering for vulnerability detection, check out our Domain Knowledge Integration technique.

Step 3: 🔑 Key Condition and Trigger Path Extraction (Reconstructing the Crime)

If the LLM confirms a vulnerability, BCDA’s most critical task begins: identifying the key conditions required to trigger it. A bug is useless if you can’t reach it. BCDA needs to find the exact sequence of if statements, try-catch blocks, and other conditional logic that unlocks the path to the vulnerable code.

Instead of overwhelming the LLM with the entire path at once, BCDA analyzes it transition by transition. It focuses on one function call at a time, asking the LLM: “To get from function A to function B, what conditions must be true?”

This granular, step-by-step approach allows the LLM to focus its analysis and accurately identify critical decision points. For example, it might determine that an if (user.isAdmin()) check must evaluate to true or that control must flow into a catch (Exception e) block to reach the sink.
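The transition-by-transition loop can be pictured roughly as follows. This is a minimal sketch under our own assumptions: the oracle function stands in for the per-transition LLM query, and TransitionSketch, extract, and the "A -> B" key format are hypothetical names chosen for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

/** Illustrative sketch only; not BCDA's real data model. */
final class TransitionSketch {

    /**
     * Walks the call path one adjacent pair at a time and records the
     * conditions reported for each caller-to-callee transition. The oracle
     * stands in for asking: "to get from A to B, what must be true?"
     */
    static Map<String, List<String>> extract(
            List<String> callPath,
            BiFunction<String, String, List<String>> oracle) {
        Map<String, List<String>> conditions = new LinkedHashMap<>();
        for (int i = 0; i + 1 < callPath.size(); i++) {
            String caller = callPath.get(i);
            String callee = callPath.get(i + 1);
            conditions.put(caller + " -> " + callee, oracle.apply(caller, callee));
        }
        return conditions;
    }

    public static void main(String[] args) {
        List<String> path = List.of("fuzz", "testAuthAction", "authenticateAsAdmin");
        // Canned answers in place of real LLM responses.
        Map<String, List<String>> canned = Map.of(
            "fuzz", List.of("buf.remaining() >= 4", "picker == 190"),
            "testAuthAction", List.of("parts.length == 4"));
        extract(path, (caller, callee) -> canned.getOrDefault(caller, List.of()))
            .forEach((step, conds) -> System.out.println(step + ": " + conds));
    }
}
```

Keeping each query scoped to one pair of functions means the oracle only ever sees a single caller body at a time, which is the point of the granular approach.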

Step 4: 📝 BIT Generation (Filing the Case Report)

The investigation concludes with the creation of a Bug Inducing Thing (BIT). This isn’t just a simple alert; it’s a structured data object, optimized for the next stages of our pipeline.

Each BIT contains everything an exploitation agent needs to know:

  • Vulnerability Type: The class of the bug (e.g., COMMAND_INJECTION).
  • Location: The exact file path and line numbers.
  • Key Conditions: A list of all branch conditions that must be satisfied.
  • Analysis Messages: A log of the LLM’s reasoning from each step.
  • Priority Level: A score based on factors like how recently the code was changed.

This structured report is then passed to BGA (Blob Generation Agent) and our fuzzers, giving them a confirmed, detailed blueprint for crafting a precise and effective exploit.
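For readers who think in types, the fields listed above could be modeled along these lines. The field names, types, priority scale, and sample values below are our assumptions for illustration; the real BIT schema may differ.

```java
import java.util.List;

/** Illustrative sketch of a BIT-like object; all names are assumptions. */
final class BitSketch {
    final String vulnerabilityType;      // class of the bug, e.g. COMMAND_INJECTION
    final String filePath;               // location: file
    final int startLine;                 // location: first line
    final int endLine;                   // location: last line
    final List<String> keyConditions;    // branch conditions that must be satisfied
    final List<String> analysisMessages; // LLM reasoning collected from each step
    final int priority;                  // assumed scale: higher means look sooner

    BitSketch(String vulnerabilityType, String filePath, int startLine, int endLine,
              List<String> keyConditions, List<String> analysisMessages, int priority) {
        this.vulnerabilityType = vulnerabilityType;
        this.filePath = filePath;
        this.startLine = startLine;
        this.endLine = endLine;
        this.keyConditions = List.copyOf(keyConditions);       // defensive copies
        this.analysisMessages = List.copyOf(analysisMessages);
        this.priority = priority;
    }

    /** One-line description, handy for logs. */
    String summary() {
        return vulnerabilityType + " at " + filePath + ":" + startLine + "-" + endLine
            + " (" + keyConditions.size() + " key conditions)";
    }

    public static void main(String[] args) {
        // File path, line numbers, and type label are placeholders.
        BitSketch bit = new BitSketch(
            "LDAP_INJECTION", "path/to/Action.java", 100, 101,
            List.of("picker == 190", "parts.length == 4", "isAdmin(...) returns true"),
            List.of("Filter is built by concatenating request parameters."),
            1);
        System.out.println(bit.summary());
    }
}
```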


📋 What BCDA Actually Analyzes: A Real Example

Here’s the actual source code BCDA analyzes when investigating an LDAP injection vulnerability in Jenkins from our benchmark. We use the same annotation system described in our context engineering source code annotations:

  • /* @KEY_CONDITION */ marks conditions that must be satisfied to reach the vulnerability
  • /* @BUG_HERE */ identifies the exact location of the vulnerable code

Entry Point (Harness):

// fuzzerTestOneInput - where the fuzz data enters
public static void fuzzerTestOneInput(byte[] data) throws Exception {
    BugDetectors.allowNetworkConnections((host, port) -> host.equals("localhost"));
    new JenkinsThree().fuzz(data);
}

Routing Logic with Key Conditions:

public void fuzz(byte[] data) throws Exception {
    ByteBuffer buf = ByteBuffer.wrap(data);
    if (buf.remaining() < 4) { /* @KEY_CONDITION */
        return;
    }
    
    int picker = buf.getInt();
    switch (picker) { /* @KEY_CONDITION */
        case 190: /* @KEY_CONDITION */
            testAuthAction(buf);
            break;
        // ... other cases
    }
}

Input Processing:

void testAuthAction(ByteBuffer buf) {
    String[] parts = getRemainingAsString(buf).split("\0");
    if (parts.length != 4) { /* @KEY_CONDITION */
        return;
    }
    // Mock request with user-controlled parameters
    when(innerReq.getParameter(parts[0])).thenReturn(parts[1]);
    when(innerReq.getParameter(parts[2])).thenReturn(parts[3]);
    // ...
    action.authenticateAsAdmin(req, rsp);
}

The Vulnerable Sink:

public void authenticateAsAdmin(StaplerRequest request, StaplerResponse response) 
        throws IOException, NamingException {
    if (!request.hasParameter("username") || !request.hasParameter("key")) { /* @KEY_CONDITION */
        response.sendError(HttpServletResponse.SC_BAD_REQUEST);
        return;
    }
    
    String username = request.getParameter("username");
    String key = request.getParameter("key");
    
    if (!isAdmin(dirContext, controls, username)) { /* @KEY_CONDITION */
        writer.print("{\"status\": \"failure\"}");
        return;
    }
    
    // THE VULNERABILITY: Unsanitized user input in LDAP filter
    String searchFilter = "(&(objectClass=inetOrgPerson)(cn=" + username + ")(userPassword=" + key + "))"; /* @BUG_HERE */
    NamingEnumeration<SearchResult> results = dirContext.search("ou=users,dc=example,dc=com", searchFilter, controls);
}

BCDA Goes Deeper: Analyzing the isAdmin Gate

But BCDA doesn’t stop at identifying isAdmin() as a key condition. It follows the execution path deeper to understand what conditions must be satisfied inside that function:

private boolean isAdmin(DirContext dirContext, SearchControls controls, String name) throws NamingException {
    String searchFilter = "(&(objectClass=inetOrgPerson)(description=admin))";
    try {
        NamingEnumeration<SearchResult> results = dirContext.search("ou=users,dc=example,dc=com", searchFilter, controls);
        while (results.hasMore()) {
            SearchResult result = results.next();
            if (result.getAttributes().get("cn").get().equals(name)) { /* @KEY_CONDITION */
                return true;
            }
        }
    } catch (Exception e) {
    }
    return false;
}

BCDA discovers that for isAdmin() to return true, two nested conditions must be met:

  1. The LDAP search must return at least one admin user (results.hasMore())
  2. One of those admin users must have a cn attribute matching the input username

This depth of analysis is crucial. Many vulnerability detection tools would stop at the high-level isAdmin() check, missing the specific LDAP query conditions needed to bypass it.

This is exactly the kind of complex, multi-condition vulnerability path that BCDA excels at analyzing. Notice how the bug is buried behind multiple conditional checks, switch statements, and function calls. Traditional static analysis tools would struggle to map this complete execution path and identify the precise conditions needed to reach the vulnerability.
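The key conditions collected above translate fairly directly into the shape of an input. The sketch below builds a blob that would pass the routing and parsing checks visible in the excerpts: a 4-byte picker of 190 followed by four NUL-separated fields. Several things are assumed here: that getRemainingAsString decodes the remaining bytes as an ASCII-compatible string, that the elided mock setup adds no further constraints, and that "admin" is the cn of an admin entry (the directory contents are not shown). Because isAdmin() compares the username for exact equality, the field left free to carry a payload in this sketch is key.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Illustrative sketch; values such as "admin" and "*" are hypothetical. */
final class FuzzBlobSketch {

    /**
     * Builds a blob shaped to satisfy the visible key conditions.
     * keyPayload must be non-empty and NUL-free: String.split drops trailing
     * empty strings, so an empty last field would yield only three parts.
     */
    static byte[] build(String adminCn, String keyPayload) {
        // Four NUL-separated fields: name1, value1, name2, value2.
        String body = String.join("\0", "username", adminCn, "key", keyPayload);
        byte[] bodyBytes = body.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(4 + bodyBytes.length);
        buf.putInt(190); // big-endian by default, matching buf.getInt() in fuzz()
        buf.put(bodyBytes);
        return buf.array();
    }

    public static void main(String[] args) {
        byte[] blob = build("admin", "*");
        // Decode the blob the same way the harness excerpts do.
        ByteBuffer buf = ByteBuffer.wrap(blob);
        int picker = buf.getInt();
        byte[] rest = new byte[buf.remaining()];
        buf.get(rest);
        String[] parts = new String(rest, StandardCharsets.UTF_8).split("\0");
        System.out.println("picker=" + picker + ", parts=" + parts.length);
    }
}
```

Whether a given payload in key actually trips a sanitizer depends on details outside these excerpts, so this only demonstrates reaching the sink’s preconditions, not a working exploit.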


✨ The BCDA Difference: From Guesswork to Certainty

BCDA transforms the system from a heavy, resource-draining carpet-bombing approach into a ‘strategic vulnerability discovery platform’ that precisely targets high-probability vectors and concentrates resources where they matter most.

MCGA (Finds Leads) ➔ BCDA (Verifies & Details Leads) ➔ BGA (Exploits Leads)

By filtering out false positives and enriching real vulnerabilities with precise trigger conditions, BCDA ensures that our most powerful and computationally expensive agent, BGA, focuses its efforts only on confirmed, high-value targets. It is the crucial link that turns MCGA’s broad surveillance into BGA’s surgical strike, saving time and improving the quality of discovered bugs.

BCDA demonstrates that in automated security analysis, the goal isn’t just to find more potential bugs. It’s to find the right ones, armed with the intelligence needed to act on them.

🔄 Forward vs. Backward Analysis: A Design Choice

BCDA currently employs a forward analysis approach, following the software analyst’s mindset: starting from the entry point (harness code), following the call trace, and then finding the buggy point. This mirrors how traditional program analysis works: trace execution paths from known entry points to discover what might go wrong.

But there’s another perspective: the hacker’s approach. Security researchers often work backward. They first identify potentially vulnerable sinks (like system() calls, SQL query construction, or memory-corrupting calls such as memcpy() and strcpy()), then trace backward through references to determine whether these sinks are reachable from user-controlled entry points.

This backward analysis has compelling advantages:

  • Efficiency: Focus immediately on high-risk code patterns
  • Coverage: Find vulnerabilities that might not be reachable through obvious execution paths
  • Hacker mindset: Mirror how real attackers analyze code for weaknesses

We experimented with implementing this backward tracing approach during the competition, starting from vulnerable sinks and following references back to entry points. However, we couldn’t complete this implementation within the competition timeframe.

The backward analysis remains an interesting direction for future research. Combining both forward and backward approaches could provide more comprehensive vulnerability coverage while maintaining BCDA’s precision in identifying exploitable conditions.
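To give a feel for the backward direction, here is a minimal sketch of sink-to-entry reachability over a reversed call graph. It is a generic breadth-first search written for this post, not the unfinished competition prototype; the class name, method names, and the caller-to-callees map format are assumptions.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative sketch of backward tracing; all names are hypothetical. */
final class BackwardTraceSketch {

    /** Inverts a caller -> callees map into a callee -> callers map. */
    static Map<String, List<String>> invert(Map<String, List<String>> calleesOf) {
        Map<String, List<String>> callersOf = new HashMap<>();
        calleesOf.forEach((caller, callees) -> {
            for (String callee : callees) {
                callersOf.computeIfAbsent(callee, k -> new ArrayList<>()).add(caller);
            }
        });
        return callersOf;
    }

    /** Returns true if walking callers backward from the sink hits an entry point. */
    static boolean reachableFromEntry(String sink, Set<String> entryPoints,
                                      Map<String, List<String>> calleesOf) {
        Map<String, List<String>> callersOf = invert(calleesOf);
        Deque<String> queue = new ArrayDeque<>(List.of(sink));
        Set<String> seen = new HashSet<>(queue);
        while (!queue.isEmpty()) {
            String current = queue.poll();
            if (entryPoints.contains(current)) {
                return true;
            }
            for (String caller : callersOf.getOrDefault(current, List.of())) {
                if (seen.add(caller)) { // skip callers already visited (handles cycles)
                    queue.add(caller);
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, List<String>> graph = Map.of(
            "fuzzerTestOneInput", List.of("fuzz"),
            "fuzz", List.of("testAuthAction"),
            "testAuthAction", List.of("authenticateAsAdmin"),
            "unusedHelper", List.of("otherSink")); // made-up unreachable code
        Set<String> entries = Set.of("fuzzerTestOneInput");
        System.out.println(reachableFromEntry("authenticateAsAdmin", entries, graph));
        System.out.println(reachableFromEntry("otherSink", entries, graph));
    }
}
```

A combined design could run this kind of cheap reachability filter first and hand only the reachable sinks to the forward, condition-extracting analysis.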


Interested in seeing how BCDA’s results are used for self-evolving exploits? Check out 🛠️ BGA: Self-Evolving Exploits Through Multi-Agent AI.

Dive Deeper

This was a look into our AI detective, BCDA. To see how MLLA components work together, check out our other deep dives.

