CI/CD · DevSecOps · AI Security · Automation

AI Agent Security Testing in CI/CD: Automating Adversarial Testing in Your Pipeline

How to integrate AI agent security testing into CI/CD pipelines — gating deployments on adversarial AI testing results, running FortifAI in GitHub Actions, and building continuous AI security coverage.

FortifAI · 6 min read


Security testing for AI agents is often treated as a periodic exercise — a red team engagement every quarter, a pre-launch audit before a major release. This approach has a fundamental problem: AI agents change constantly. Model updates, new tools, prompt revisions, RAG corpus changes — each is a potential new vulnerability. Quarterly testing catches a tiny fraction of the risk surface.

The solution is continuous adversarial AI testing integrated directly into CI/CD pipelines — the same way you'd integrate unit tests or SAST scanning.

This guide covers how to do it.


Why CI/CD Integration Is Non-Negotiable for AI Agent Security

Traditional software security can rely heavily on static analysis — examining code for patterns that correlate with vulnerability classes. AI agent security cannot. The vulnerabilities are behavioral and emergent. They only manifest when the agent runs.

This means:

  1. Every change is potentially a new vulnerability — a prompt edit that makes the agent more helpful may also make it more susceptible to goal hijacking. A new tool that adds capability expands the tool abuse surface.
  2. Model updates introduce regressions — when your LLM provider updates the underlying model, your agent's behavior changes. Defense mechanisms that worked against GPT-4o may need calibration against its successor.
  3. Manual testing doesn't scale — running a full adversarial test suite manually before every deployment is infeasible. CI/CD integration makes it automatic.
  4. Compliance requires continuous evidence — NIST AI RMF and emerging AI security frameworks expect documented, ongoing security testing — not a once-a-year report.

The CI/CD Integration Architecture

┌─────────────────────────────────────────────────────┐
│  Developer pushes code / prompt / tool change        │
└────────────────────────┬────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────┐
│  CI Pipeline: Build + Unit Tests + Type Check        │
└────────────────────────┬────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────┐
│  Deploy to staging environment                       │
└────────────────────────┬────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────┐
│  FortifAI Adversarial Scan                           │
│  → 150+ payloads against staging agent endpoint     │
│  → OWASP Agentic Top 10 coverage                    │
│  → Structured JSON results                           │
└────────────────────────┬────────────────────────────┘
                         │
              ┌──────────┴──────────┐
              │                     │
      Critical/High findings    No Critical/High
              │                     │
              ▼                     ▼
     ┌─────────────────┐   ┌────────────────────┐
     │  Block deploy   │   │  Deploy to prod     │
     │  Create ticket  │   │  Security badge ✓   │
     └─────────────────┘   └────────────────────┘

Step 1: Set Up Your FortifAI Configuration

Create a fortifai.config.ts in your project root that points to your staging environment endpoint:

// fortifai.config.ts
export default {
  // Your AI agent's HTTP endpoint
  target: process.env.AGENT_STAGING_URL ?? "http://localhost:3000/api/chat",

  // HTTP method your agent uses
  method: "POST" as const,

  // Request headers (auth token, content type)
  headers: {
    "Content-Type": "application/json",
    "Authorization": `Bearer ${process.env.AGENT_INTERNAL_TOKEN}`,
  },

  // Request body shape — {{FORTIFAI_PAYLOAD}} is replaced with each test payload
  requestBody: {
    messages: [
      {
        role: "user",
        content: "{{FORTIFAI_PAYLOAD}}"
      }
    ]
  },

  // Where in the response to find the agent's reply
  responseExtractor: "choices[0].message.content",

  // Which OWASP categories to run (default: all)
  categories: ["AA1", "AA2", "AA3", "AA4", "AA5", "AA6"],

  // Fail the scan if any of these severities are found
  failOn: ["critical", "high"],
}
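The responseExtractor string is a dot/bracket path into your agent's JSON response. As a mental model of how such a path resolves (illustrative only, not FortifAI's internal code), the logic looks like this:

```typescript
// Illustrative only: resolve a path like "choices[0].message.content"
// against a parsed JSON response. Not FortifAI's internal implementation.
function extractByPath(obj: any, path: string): any {
  // "choices[0].message.content" -> ["choices", "0", "message", "content"]
  const keys = path.replace(/\[(\d+)\]/g, ".$1").split(".");
  return keys.reduce((acc, key) => acc?.[key], obj);
}

const response = {
  choices: [{ message: { role: "assistant", content: "I can't help with that." } }],
};

console.log(extractByPath(response, "choices[0].message.content"));
// → "I can't help with that."
```

If your agent returns a different shape (e.g. a bare `{ reply: "..." }`), adjust responseExtractor accordingly.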

Step 2: GitHub Actions Integration

# .github/workflows/ai-security.yml
name: AI Agent Security Scan

on:
  push:
    branches: [main, staging]
  pull_request:
    branches: [main]

jobs:
  ai-security-scan:
    name: Adversarial AI Security Scan
    runs-on: ubuntu-latest
    needs: [build, deploy-staging]  # Run after staging deployment

    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: "20"

      - name: Run FortifAI Adversarial Scan
        run: npx fortifai scan
        env:
          FORTIFAI_API_KEY: ${{ secrets.FORTIFAI_API_KEY }}
          AGENT_STAGING_URL: ${{ secrets.AGENT_STAGING_URL }}
          AGENT_INTERNAL_TOKEN: ${{ secrets.AGENT_INTERNAL_TOKEN }}

      - name: Upload Security Report
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: fortifai-security-report
          path: fortifai-report.json
          retention-days: 90

      - name: Post PR Comment with Findings
        if: github.event_name == 'pull_request' && failure()
        uses: actions/github-script@v7
        with:
          script: |
            const fs = require('fs');
            const report = JSON.parse(fs.readFileSync('fortifai-report.json', 'utf8'));
            const criticalFindings = report.findings.filter(f => f.severity === 'critical');
            const body = `## ⚠️ AI Security Scan Failed\n\n` +
              `**Critical findings:** ${criticalFindings.length}\n\n` +
              criticalFindings.map(f => `- **${f.category}**: ${f.title}`).join('\n');
            github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body
            });
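The PR-comment script above assumes fortifai-report.json exposes a findings array with severity, category, and title fields (inferred from the script, not a documented schema). The body-building logic, runnable against a sample report:

```typescript
// Report shape assumed from the workflow script above, not a documented schema.
interface Finding {
  severity: "critical" | "high" | "medium" | "low";
  category: string; // e.g. "AA1"
  title: string;
}

const report: { findings: Finding[] } = {
  findings: [
    { severity: "critical", category: "AA1", title: "Goal hijack via indirect prompt injection" },
    { severity: "medium", category: "AA4", title: "Verbose tool-error leakage" },
  ],
};

const criticalFindings = report.findings.filter((f) => f.severity === "critical");
const body =
  `## ⚠️ AI Security Scan Failed\n\n` +
  `**Critical findings:** ${criticalFindings.length}\n\n` +
  criticalFindings.map((f) => `- **${f.category}**: ${f.title}`).join("\n");

console.log(body);
```

Only the one Critical finding makes it into the comment; the Medium finding is left to the warning path.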

Step 3: GitLab CI Integration

# .gitlab-ci.yml (AI security stage)
ai-security-scan:
  stage: security
  image: node:20-alpine
  needs: [deploy-staging]
  script:
    - npx fortifai scan
  artifacts:
    when: always
    paths:
      - fortifai-report.json
    expire_in: 90 days
  variables:
    FORTIFAI_API_KEY: $FORTIFAI_API_KEY
    AGENT_STAGING_URL: $AGENT_STAGING_URL
    AGENT_INTERNAL_TOKEN: $AGENT_INTERNAL_TOKEN
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"

Step 4: Configure Severity Gates

The most important decision is what severity level blocks a deployment. Recommended defaults:

| Severity | Finding Type | Gate Policy |
| --- | --- | --- |
| Critical | Agent fully complied with adversarial instruction | Block deployment |
| High | Agent showed significant vulnerability (partial compliance, data leakage) | Block deployment |
| Medium | Agent showed minor vulnerability or defense gap | Warn + create ticket |
| Low | Informational security improvement | Create ticket |

Configure your gates in fortifai.config.ts:

failOn: ["critical", "high"],  // Exit code 1 on these severities
warnOn: ["medium"],            // Exit code 0, but print warnings
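Conceptually, the gate reduces to: print a warning for every finding whose severity is in warnOn, and exit non-zero if any finding's severity is in failOn. A sketch of that evaluation (the failOn/warnOn names come from the config above; the function itself is illustrative):

```typescript
type Severity = "critical" | "high" | "medium" | "low";

interface GateFinding {
  severity: Severity;
  title: string;
}

// Illustrative gate evaluation mirroring the failOn/warnOn semantics above.
// Not FortifAI's internal code.
function evaluateGate(
  findings: GateFinding[],
  failOn: Severity[],
  warnOn: Severity[]
): number {
  for (const f of findings.filter((f) => warnOn.includes(f.severity))) {
    console.warn(`WARN [${f.severity}] ${f.title}`);
  }
  // Exit code 1 blocks the pipeline stage; 0 lets it proceed.
  return findings.some((f) => failOn.includes(f.severity)) ? 1 : 0;
}

const exitCode = evaluateGate(
  [{ severity: "high", title: "Partial compliance with exfiltration prompt" }],
  ["critical", "high"],
  ["medium"]
);
console.log(exitCode); // → 1: a High finding blocks the deploy
```

A Medium-only run would print a warning and still return 0, matching the "Warn + create ticket" policy.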

Step 5: Handle False Positives

Adversarial AI testing on probabilistic systems will produce some false positives — the model may behave non-deterministically, producing a "vulnerable" response on one run and a clean response on the next.

Managing this:

// fortifai.config.ts
retries: 3,           // Re-run payloads that produce findings to confirm
confirmThreshold: 2,  // Finding must appear in 2 of 3 runs to be reported

With a confirmThreshold: 2 setting, a finding only blocks deployment if it reproduces reliably — reducing false-positive-caused deployment blocks while maintaining security signal.
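The confirmation rule amounts to counting how many of the re-runs reproduce the finding. A minimal sketch, assuming one boolean result per run (illustrative, not FortifAI's internal code):

```typescript
// Illustrative: report a finding only if it reproduces in at least
// `confirmThreshold` of the retry runs. Not FortifAI's internal code.
function isConfirmed(runResults: boolean[], confirmThreshold: number): boolean {
  const reproductions = runResults.filter(Boolean).length;
  return reproductions >= confirmThreshold;
}

console.log(isConfirmed([true, false, true], 2));  // → true: reliable finding, reported
console.log(isConfirmed([true, false, false], 2)); // → false: treated as a flake
```

The trade-off: raising confirmThreshold cuts flaky blocks but risks suppressing intermittent real vulnerabilities, which for probabilistic systems are still vulnerabilities.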


Step 6: Shift Left — Run in Local Dev

Developers don't have to wait for CI to catch AI security regressions. FortifAI runs locally:

# Run against local dev server
AGENT_STAGING_URL=http://localhost:3000/api/chat npx fortifai scan

# Quick scan — run only Critical-severity payload categories
npx fortifai scan --categories AA1,AA2,AA6

# Watch mode — re-scan on file changes (for prompt/config development)
npx fortifai scan --watch

Adding a pre-commit hook:

# .husky/pre-commit
npx fortifai scan --quick --fail-on critical

This catches prompt injection regressions before they're even committed.


Step 7: Track Security Posture Over Time

The FortifAI dashboard aggregates scan results across deployments, showing:

  • Security posture trend over time (are Critical findings increasing or decreasing?)
  • Finding distribution by OWASP category
  • Regression tracking (which changes introduced which findings)
  • Coverage confirmation (were all OWASP categories scanned?)

Use this data to:

  • Identify OWASP categories where your agent consistently struggles
  • Detect when model updates cause security regressions
  • Generate compliance evidence for NIST AI RMF or internal audit requirements
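If you archive the fortifai-report.json artifacts from each pipeline run (as the workflows above do), the Critical-finding trend can also be computed locally. A sketch, with the report shape (scannedAt, findings[].severity) assumed rather than taken from a documented schema:

```typescript
// Illustrative trend computation over archived scan reports.
// The report shape here is assumed, not a documented FortifAI schema.
interface ScanReport {
  scannedAt: string; // ISO date of the scan
  findings: { severity: string; category: string }[];
}

// Count Critical findings per scan, oldest first, for posture charting.
function criticalTrend(reports: ScanReport[]): { scannedAt: string; criticals: number }[] {
  return [...reports]
    .sort((a, b) => a.scannedAt.localeCompare(b.scannedAt))
    .map((r) => ({
      scannedAt: r.scannedAt,
      criticals: r.findings.filter((f) => f.severity === "critical").length,
    }));
}

const trend = criticalTrend([
  { scannedAt: "2024-06-01", findings: [{ severity: "critical", category: "AA1" }] },
  { scannedAt: "2024-05-01", findings: [] },
]);
console.log(trend); // oldest scan first: 0 criticals in May, 1 in June
```

The same fold over findings[].category gives the per-OWASP-category distribution.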

Sample Security Gate Policy for Teams

Pull Request → Staging → AI Security Scan → Production

Gate 1 (Staging gate): Block on Critical or High
Gate 2 (Production gate): Block on Critical; require CISO sign-off on High

Exemptions: High findings may be accepted with documented risk decision
from security team lead, expiring after 30 days.

Reporting: Monthly security posture report generated from scan history.

FortifAI integrates with any CI/CD system via the npx fortifai scan CLI. Set up your first pipeline scan → | Read the CLI docs →

Add Runtime Security To Your Agent Stack

FortifAI provides OWASP Agentic Top 10 coverage for modern agent pipelines.