Microsoft's AI System Uncovers 16 Windows Flaws in Patch Tuesday Release

MDASH “orchestrates more than 100 specialized AI agents across an ensemble of frontier and distilled models to discover, debate, and prove exploitable bugs end-to-end,” Microsoft says, and in early testing the system independently surfaced 16 flaws fixed in this month’s Patch Tuesday.

MDASH’s multi-agent design

Microsoft describes MDASH (multi-model agentic scanning harness) as a model-agnostic, agentic system that combines many specialized AI agents into a structured pipeline. Taesoo Kim, vice president of agentic security at Microsoft, framed the architecture as a departure from single-model approaches: auditors, debaters and provers play distinct roles, and each is tuned with its own prompt regime, tools and stop criteria. The company says the harness runs "more than 100" bespoke agents, each focused on different classes of vulnerabilities, and that it uses a configurable panel of models: state-of-the-art (SOTA) models for reasoning, distilled models for high-volume validation passes, and a second, separate SOTA model for independent counterpoint.
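
As a rough illustration of what a configurable model panel could look like, the Python sketch below maps the three roles Microsoft describes (reasoning, high-volume validation and independent counterpoint) to model identifiers. The class name, field names and model identifiers are hypothetical placeholders; Microsoft has not published MDASH's configuration format.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ModelPanel:
    # Hypothetical fields, one per role named in Microsoft's description.
    reasoning: str     # state-of-the-art model used for reasoning
    validation: str    # distilled model used for high-volume validation
    counterpoint: str  # second, separate model used for counterpoint

    def __post_init__(self):
        # The counterpoint is described as a separate model, so reject a
        # panel that reuses the reasoning model for that role.
        if self.counterpoint == self.reasoning:
            raise ValueError("counterpoint model must differ from reasoning model")


# Placeholder identifiers, not real model names.
panel = ModelPanel(
    reasoning="frontier-a",
    validation="distilled-a",
    counterpoint="frontier-b",
)

# Moving to a newer model generation changes one field and nothing else.
upgraded = replace(panel, reasoning="frontier-a-next")
```

Keeping the roles separate from the model identifiers is one plausible reading of the portability claim: the agents stay the same while the models behind them are swapped.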

How Microsoft says MDASH finds and proves bugs

According to Microsoft, MDASH ingests a codebase and constructs a threat model and attack surface, then runs specialized auditor agents over candidate code paths to flag issues. A second set of debater agents attempts to validate or refute those flags, semantically equivalent findings are grouped, and a final stage attempts to prove that each vulnerability exists. Microsoft emphasizes that disagreement between models is informative: "when an auditor flags something as suspect and the debater can't refute it, that finding’s posterior credibility goes up," the company said. The firm also reported that the agents were built from past common vulnerabilities and exposures (CVEs) and their patches, and that the architecture allows portability across model generations.
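
The grouping and credibility steps can be sketched in a few lines of Python. This is a toy model, not Microsoft's implementation: the Finding fields, the exact-match grouping key and the probabilities are invented for illustration, and the credibility update is plain Bayes' rule applied to the intuition in the quote.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    # All fields are hypothetical; MDASH's real schema is not public.
    location: str   # e.g. a file and function name
    bug_class: str  # e.g. "double-free"
    agent: str      # which auditor raised the flag


def group_equivalent(findings):
    """Group findings that point at the same location and bug class.

    A real system would need a semantic comparison; an exact-match key
    is an assumption made to keep the sketch short.
    """
    groups = defaultdict(list)
    for finding in findings:
        groups[(finding.location, finding.bug_class)].append(finding)
    return dict(groups)


def posterior_after_debate(prior, p_survive_if_real, p_survive_if_false):
    """Bayes' rule: P(real | the debater failed to refute the finding)."""
    real = p_survive_if_real * prior
    false = p_survive_if_false * (1.0 - prior)
    return real / (real + false)


findings = [
    Finding("parser.c:read_header", "double-free", "auditor-1"),
    Finding("parser.c:read_header", "double-free", "auditor-2"),
    Finding("net.c:on_packet", "race-condition", "auditor-3"),
]
groups = group_equivalent(findings)

# Illustrative numbers only: a 20% prior, and a debater that fails to refute
# 90% of real bugs but only 30% of false alarms.
posterior = posterior_after_debate(0.2, 0.9, 0.3)
```

With these made-up numbers, the three raw flags collapse into two distinct findings, and a finding that survives one debate round moves from a 20% prior to roughly 43% credibility.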

The 16 Patch Tuesday findings, including two remote-code-execution flaws

Microsoft has already exercised MDASH in a limited private preview with some customers and reports that the system found 16 of the vulnerabilities fixed in the most recent Patch Tuesday release. The flaws span the Windows networking and authentication stack and include two that Microsoft classified as allowing remote code execution:

CVE-2026-33824 (CVSS score: 9.8): described as a double-free vulnerability in "ikeext.dll" that could allow an unauthenticated attacker to send specially crafted packets to a Windows machine with Internet Key Exchange (IKE) version 2 enabled, leading to remote code execution.
CVE-2026-33827 (CVSS score: 8.1): described as a race condition vulnerability in Windows TCP/IP ("tcpip.sys") that could allow an unauthorized attacker to send a specially crafted IPv6 packet to a Windows node where IPSec is enabled, leading to remote code execution.
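
For readers mapping those scores to severity labels, the short Python helper below applies the qualitative rating bands from the CVSS v3.1 specification. The article does not say which CVSS version produced these scores, so using the v3.1 bands is an assumption, and the function name is illustrative.

```python
def cvss_v3_severity(score: float) -> str:
    """Return the CVSS v3.1 qualitative rating for a base score."""
    if not 0.0 <= score <= 10.0:
        raise ValueError("CVSS scores range from 0.0 to 10.0")
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"       # 0.1 to 3.9
    if score < 7.0:
        return "Medium"    # 4.0 to 6.9
    if score < 9.0:
        return "High"      # 7.0 to 8.9
    return "Critical"      # 9.0 to 10.0


# The two scores reported for the remote-code-execution flaws.
ratings = {
    "CVE-2026-33824": cvss_v3_severity(9.8),
    "CVE-2026-33827": cvss_v3_severity(8.1),
}
```

Under that assumption, the 9.8 score falls in the Critical band and the 8.1 score in the High band.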

Microsoft’s account ties MDASH’s practical value to its ability not only to flag candidate defects but to validate and prove them end-to-end within complex codebases like Windows.

MDASH alongside other AI vulnerability projects

Microsoft’s announcement comes as other vendors unveil their own AI-focused security initiatives. The company noted Project Glasswing from Anthropic and OpenAI Daybreak as contemporaneous efforts aimed at accelerating vulnerability discovery, validation and remediation before defects are found by “bad actors.” Kim summed up the strategic shift: "The strategic implication is clear: AI vulnerability discovery has crossed from research curiosity into production-grade defense at enterprise scale, and the durable advantage lies in the agentic system around the model rather than any single model itself."

What this means for technologists, enterprises, and adversaries

Technologists and security teams: Teams tasked with code review and patching will watch MDASH-style pipelines for how they change triage workflows — particularly the promise of automated validation and provable exploit chains that can reduce false positives and prioritize fixes.
Affected enterprises and procurement leaders: Organizations buying security tools will see a new emphasis on multi-model, agentic architectures and may look for solutions that demonstrate portability across model generations and the capacity to produce validated, provable findings rather than raw alerts.
Adversaries and threat actors: Microsoft framed the work as defensive and preemptive; the company and peers are racing to surface and remediate defects before they are weaponized. The public disclosure that MDASH rediscovered high-severity CVEs underscores the potential for AI systems to match human-led discovery in some scenarios.

Microsoft’s MDASH is now in limited private preview with some customers, with a stated architecture built around specialized agents, model ensembles and an emphasis on validation. The system’s early return (16 issues fixed in Patch Tuesday, including two remote-code-execution flaws with CVSS scores of 9.8 and 8.1) offers a data point for the company's claim that agentic, multi-model tooling can operate at enterprise scale. Whether that capability scales broadly, how it integrates into existing toolchains, and how rivals’ systems compare in practice are questions for future tests and deployments rather than this initial disclosure.

Original story at The Hacker News