“We must understand not just how to use AI systems but how to question and override them when necessary.” That admonition from Brigadier General Michael Miller captures the awkward truth managers are discovering: giving machines more autonomy does not automatically make organizations smarter or faster. In practice, teams asked to adopt agentic AI (systems that can take initiative and coordinate other systems) too often end up with bots that misunderstand instructions, return trivial answers, or spend time on chores that older tools could have handled more reliably.
That mismatch is not merely a technical hiccup. It reflects a deeper gap between how engineers build agentic systems and how human organizations actually work. The promising path forward is not to force-fit AI into existing team habits, but to redesign the social, operational, and governance environments so humans and agents can truly collaborate.
Background: agentic AI and the rising expectations
Agentic AI moves beyond passive models that answer queries; these systems pursue objectives, make decisions within set bounds, and coordinate with other software or hardware. Policymakers and defense planners have highlighted the potential: agentic capabilities could accelerate decision cycles, automate repetitive triage, and act as force multipliers for scarce human expertise. At the same time, officials warn that autonomy must be paired with accountability. As Deputy Secretary of Defense Dr. Kathleen Hicks has emphasized, “Our commitment to responsible AI use means maintaining rigorous human control and accountability.”
Current situation: high expectations, uneven outcomes
Across firms and agencies, leaders press teams to “adopt agents” to cut costs and free human attention for higher-order work. Yet pilot projects often fail to yield benefits. Common complaints include agents that:
- misinterpret context and ignore constraints;
- produce obvious or low-value outputs that waste reviewer time;
- spin on open-ended tasks that simpler scripts or scheduled processes could complete faster;
- create new operational fragility when telemetry, provenance, and monitoring are insufficient.
Technologists getting the most value treat agentic systems less like magic shortcuts and more like teammates whose behavior must be engineered and governed. Successful adopters focus on layered governance, observability, and carefully limited autonomy during rollout. These lessons emerge from deployments that emphasize integration across people, process, and infrastructure rather than model capability alone.
Why this matters: efficiency, trust, and risk
Organizations that fail to align AI with operational realities risk three outcomes: wasted investment, degraded trust among staff, and new attack surfaces for adversaries. When agents behave unpredictably, human supervisors either re-assert control (defeating the point of autonomy) or defer to the machine (risking opaque, unexamined decisions). Both extremes are costly. Moreover, adversaries — whether criminals or hostile states — view AI as both a tool and a target, raising the stakes on security and resilience.
What works: essential strategies for effective collaboration
From field pilots and summit recommendations, a coherent playbook has begun to emerge. The following strategies are practical, immediately actionable, and grounded in experience.
- Design for hybrid roles and translation. Build teams that pair domain experts with AI engineers and product managers so that mission goals, inputs, and tradeoffs are transparent. Summit participants urged career ladders and rotation programs that cultivate staff who can bridge technical and business perspectives.
- Limit autonomy with scaffolding. Treat agentic behavior as a layered system: policy defines acceptable uses and escalation paths; process embeds continuous testing, red teams, and lifecycle management; tooling supplies logging, versioning, and explainability to support audits. This “scaffolding” lets experiments proceed without exceeding legal, ethical, or operational boundaries.
- Start small: pilot in low-risk workflows. Map mission-critical workflows and identify where agents can add value with constrained scope, such as automating routine enrichment or surfacing likely matches, before expanding responsibility. Real-world pilots show faster learning and fewer governance headaches when autonomy grows in stages.
- Invest in observability and provenance. Make every agent decision auditable: collect context, inputs, intermediate reasoning artifacts, and outputs so humans can diagnose failures and identify adversarial manipulation. This is essential for accountability and remediation.
- Prioritize workforce resilience and training. Retrain staff to ask the right questions of agents, override unsafe actions, and maintain institutional knowledge. Cross-training reduces single points of failure and produces operators who can treat AI behavior as something to be interrogated, not implicitly trusted.
- Embed security and adversarial testing. Assume models will be probed. Layered cyber defenses, strict supply-chain vetting, continuous adversarial testing, and contractual security requirements for vendors make deployments more robust. As one briefing noted, adversaries will attempt to weaponize models and subvert data flows; defenses must be built in.
- Balance centralization and edge autonomy. Central cloud models ease oversight but concentrate risk; edge deployments reduce latency and boost autonomy but complicate governance. Pragmatic architectures combine redundancy, hardened edge sites, and orchestration tools that preserve consistency without creating single points of failure.
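The scaffolding and staged-rollout strategies above can be made concrete with a small guardrail check. The following is a minimal sketch, not a reference implementation: the names `Policy`, `Decision`, and `review`, and the action names `enrich_record`, `close_case`, and `delete_record`, are hypothetical and chosen only for illustration. It assumes policy can be expressed as two sets of actions, those an agent may take alone and those that require human sign-off, with everything else refused by default.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"        # agent may proceed on its own
    ESCALATE = "escalate"  # a human must approve first
    DENY = "deny"          # outside the policy entirely


@dataclass(frozen=True)
class Policy:
    # Actions the agent may take autonomously at the current rollout stage.
    autonomous_actions: frozenset
    # Actions permitted only after a human approves them.
    escalation_actions: frozenset


def review(policy: Policy, action: str) -> Decision:
    """Return what the guardrail does with a proposed agent action."""
    if action in policy.autonomous_actions:
        return Decision.ALLOW
    if action in policy.escalation_actions:
        return Decision.ESCALATE
    # Anything the policy does not name is refused by default.
    return Decision.DENY


# Hypothetical pilot-stage policy: routine enrichment is autonomous,
# closing a case needs a human, and nothing else is allowed.
pilot = Policy(
    autonomous_actions=frozenset({"enrich_record"}),
    escalation_actions=frozenset({"close_case"}),
)

for proposed in ("enrich_record", "close_case", "delete_record"):
    print(proposed, "->", review(pilot, proposed).value)
```

Growing autonomy in stages then amounts to moving an action from the escalation set to the autonomous set once a pilot has earned it, which keeps each expansion of authority an explicit, reviewable change.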
Different perspectives: technologists, policymakers, users, adversaries
Technologists see agentic systems as a new control plane for automation that can free human time for creative work — provided the systems are constrained and instrumented. Policymakers emphasize adaptive governance frameworks (like NIST’s AI Risk Management Framework) that can be updated as technology evolves, warning against overly prescriptive rules that date quickly.
End users, the staff who must live with agents day to day, value transparency and predictability above raw capability. If an agent saves ten minutes once but then introduces daily uncertainty, users will reject it. That’s why user-centered design, participatory development, and clear remediation paths are as critical as model accuracy.
Adversaries complicate every success story. Whether probing for data leaks, crafting adversarial inputs, or weaponizing automation in cyberattacks, hostile actors force defenders to make security and resilience non-negotiable. Practitioners therefore recommend defense-in-depth and continuous monitoring as core elements of any deployment.
Case lessons: where success looks like human-centered engineering
Successful programs highlight a common theme: effectiveness depends less on model power and more on alignment across layers. City-scale traffic-control pilots and hospital triage trials show measurable improvements when infrastructure, governance, and human oversight are intentionally integrated. Conversely, attempts that focus only on agentic novelty without process and training produce disappointing, sometimes harmful outcomes.
Risks and trade-offs
Centralization simplifies control but amplifies concentration risk; decentralization improves autonomy but fragments oversight. Overly lax regimes invite harm; overly rigid regulation stifles innovation. The pragmatic answer is iterative governance — adaptive standards, routine audits, and certifications that evolve with systems and use cases.
Getting practical: an adoption checklist
- Map workflows and pick low-risk pilot areas.
- Form hybrid teams (domain + ML + product).
- Define policy guardrails and escalation paths.
- Instrument agents with logging, explainability, and provenance.
- Run adversarial and red-team tests before scale.
- Train staff to question, override, and audit agent actions.
- Embed security clauses and documentation requirements in procurement.
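The instrumentation item in the checklist can be illustrated with a tamper-evident audit log. This is a minimal sketch under stated assumptions: the function names `append_record` and `verify`, the record fields, and the sample agent name are all hypothetical, and hash chaining is one possible technique rather than something the article prescribes. Each record stores the hash of the record before it, so a later edit to any entry breaks the chain and can be detected. Timestamps and intermediate reasoning artifacts are omitted to keep the example short.

```python
import hashlib
import json


def _digest(body):
    """Hash a record body deterministically (keys sorted before encoding)."""
    encoded = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_record(log, agent, inputs, output):
    """Append an audit record whose hash also covers the previous record's hash."""
    prev_hash = log[-1]["hash"] if log else ""
    body = {"agent": agent, "inputs": inputs, "output": output, "prev": prev_hash}
    record = dict(body, hash=_digest(body))
    log.append(record)
    return record


def verify(log):
    """Return True only if every record is unmodified and correctly chained."""
    prev_hash = ""
    for record in log:
        body = {key: value for key, value in record.items() if key != "hash"}
        if record["prev"] != prev_hash or record["hash"] != _digest(body):
            return False
        prev_hash = record["hash"]
    return True


# Hypothetical usage: two decisions by an illustrative triage agent.
audit_log = []
append_record(audit_log, "triage-agent", {"ticket": 101}, "route_to_billing")
append_record(audit_log, "triage-agent", {"ticket": 102}, "escalate_to_human")
print(verify(audit_log))
```

A chain like this only shows that the log was altered after the fact; it does not prevent alteration, so a real deployment would still need access controls and off-system copies of the log.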
Conclusion
The lesson is modest but profound: AI that collaborates well with humans is not purely a software problem; it is an organizational design challenge. Leaders who expect instantaneous gains from agentic systems without rethinking roles, governance, and infrastructure will be disappointed. Those who treat agents as new kinds of teammates, designing their behaviors, constraining their authority, and giving humans the tools and training to supervise them, will realize steady, durable benefits.
In an era where machines can behave remarkably like humans, will we build systems that augment human judgment or ones that quietly erode it? The answer will depend less on the next model release than on the choices we make now about rules, oversight, and the responsibilities we retain.
Source: https://www.schneier.com/blog/archives/2026/01/ai-humans-making-the-relationship-work.html




