AIBounty #000: 100 Days to PWN AI — The 2026 Reset

Follow the 100-day journey and learn how high-impact AI vulnerabilities are researched, validated, and responsibly disclosed.

Jul 01, 2026

On July 1, 2026, I am restarting my “100 Days to PWN AI” challenge.

The window is simple:

July 1, 2026 to October 8, 2026.

The public version of this challenge focuses on discipline, research process, and responsible disclosure. I will not publish working exploits, active vulnerable endpoints, private data, or reproduction steps for unresolved vulnerabilities. Until the issues are properly addressed, any information that could facilitate real-world attacks will remain strictly confidential.

What’s Changed?

The first iteration of the challenge was about habit building and momentum. For this reset, the objective is much more targeted: to validate a critical research thesis—that the most high-value AI security vulnerabilities are shifting from the model layer to product logic, permission boundaries, and agent tool execution layers.

While the 2026 financial benchmark of achieving $100,000 in bug bounties remains the metric of validation, the strategic focus must shift.

Minor CORS misconfigurations or informational leaks only create a false sense of progress and will not prove the thesis. I must focus entirely on a small subset of high-impact logic vulnerabilities:

Cross-tenant data exposure
Broken Object-Level Authorization (BOLA)
Knowledge base or retrieval isolation bugs
Permission failures in agents, tools, connectors, or MCPs (Model Context Protocol)
Logic flaws in authentication, billing, or execution boundaries

The strategy shifts from merely “finding bugs” to “proving that the AI product’s security model has fundamentally broken at critical junctures.”

Targets and Core Direction

Over the next 100 days, I am prioritizing scoped AI products and programs from frontier AI ecosystems, including OpenAI, Anthropic, and Google’s Gemini / AI ecosystem where permitted by their published rules of engagement. To maintain depth, I will have no more than two active main targets at any given time.

All testing will be conducted only within authorized scopes, published program rules, and safe-harbor boundaries. If a target, workflow, or test case falls outside an approved scope, it will be excluded from the public challenge.

Modern AI products are built on highly complex trust boundaries (spanning organizations, projects, vector stores, tools, and billing). Uncovering vulnerabilities here requires deep, methodical differential testing.

My core thesis is this: The most valuable AI security vulnerabilities in 2026 will stem from product logic flaws surrounding data, identity, tools, and execution contexts—not from model internals or generic infrastructure.

I will focus on validating the following scenarios:

Unauthorized references to files, memory, source code, or vector databases across workspaces.
Agents executing sensitive tools without proper authorization.
Cross-tenant data leakage via shared objects, forks, or caching mechanisms.
Retrieval-Augmented Generation (RAG) systems leaking private contexts due to disconnected identity validation.
Privilege escalation and boundary bypasses in billing, organization, or admin logic.

The Execution Plan

The 100-day window is divided into four key phases, designed to iterate quickly and secure high-quality submissions before the end of September:

July 1 - July 16 | Reset and Setup: Clean up pending high-value reports, configure a local multi-account testing workflow based on Burp Suite, and debug/optimize local AI-assisted testing tools to clear all friction.
July 17 - August 16 | Attack Surface Mapping & Hypothesis Building: Map out trust boundaries for OpenAI (GPTs/Memory/API), Gemini (AI Studio/Vertex/Workspace), and Anthropic (Tools/MCP). Combine AI-driven reconnaissance with manual analysis to produce a concise registry of testable vulnerability hypotheses.
August 17 - September 16 | Vulnerability Validation & Early Submission: Enter the deep execution phase. Combine deep manual testing with AI-assisted automation, aiming to submit at least three high-quality reports by September 16. We will enforce a brutal filtering rule: if a hypothesis yields no meaningful progress within a week, demote or dump it immediately.
September 17 - October 8 | Wrap-up and Q4 Prep: Complete the 100-day challenge review (with cleanup extending to October 16). Focus heavily on triage communication and bounty negotiation to set the stage for Q4 goals.

What I Will Share

I plan to share a series of sanitized technical notes and weekly field updates during these 100 days, but with strict guardrails.

What I will share:

Research processes and target selection logic
Anonymized methodologies and AI-assisted bug hunting techniques (e.g., prompt engineering tactics and collaborative agent workflows)
Insights on building and tuning security tools and scanners (e.g., custom AI-assisted testing scripts)
Failed vulnerability hypotheses
Reflections on triage dynamics and bounty economics
High-level architectural patterns once vulnerabilities are resolved or safe to disclose

What I will not share:

Active vulnerable endpoints
Fully functional exploit chains
Specific details of private programs
Customer data or screenshots revealing sensitive states
Exact reproduction steps for unresolved issues

The Scoreboard and Self-Discipline

In security research, it is incredibly easy to feel busy by collecting domains and running passive scanners. But high-tier bounties only reward those who choose the right attack surfaces, ask the sharpest questions, and demonstrate concrete business impact.

As such, I won’t be tracking the volume of reports. The only scoreboard that matters is:

3 to 5 High or Critical vulnerability candidates;
At least 6 high-quality submissions during the first 90 days of the challenge;
At least one target entering deep technical or triage discussion, validating the real-world impact and financial viability of logic-based AI safety research.

100 days. Select frontier AI ecosystems. One objective: turn real AI product security failures into high-value, responsibly disclosed bug bounty reports.

About the AIPwn Newsletter

If you’re interested in AI security, the frontier of AI bug hunting, developing custom security tooling, or designing autonomous scanning agents, subscribe to our newsletter. We’ll be sharing sanitized research methodologies and weekly field notes as we go.

AIPwn

Discussion about this post

Ready for more?