"This is the holy grail of surveillance," said a European official whose country uses the technology on its cities.
What the Financial Times reported from Israel, Iran, and Russia
The Financial Times published reporting showing how artificial intelligence is changing what video surveillance can do, drawing on information from Israel, Iran and Russia. The FT reporting, as summarized in the source, links those country-level examples to a broader technical shift: AIs can now accept natural-language questions about stored or live video and return answers.
Natural-language video queries: a new capability
The central technical change described is straightforward and stark: users can pose natural-language questions about footage to AIs, and the AIs can answer them. That capability expands the ways operators interact with video archives and continuous streams, moving away from terse, pre-programmed queries and toward conversational search.
From a few dozen preset searches to "an almost unlimited range" of enquiries
The change is not merely incremental. Older surveillance tools are described as being "restricted to a few dozen preset searches." By contrast, the new generation of tools enables "an almost unlimited range of enquiries" because they accept language-based search terms. The effect, according to the source material, is to let people hunt through massive volumes of footage using simple descriptive phrases rather than selecting from a fixed menu of object-detection filters.
Concrete examples operators can now ask for
- Two men handing a bag to each other.
- A person who has changed their appearance, or has changed clothes multiple times in a day.
- A vehicle that has recently been painted over, or has driven past the same spot several times in a short period.
Those examples, supplied in the source material, illustrate the shift from searching for static objects to searching for behaviour patterns or events described in plain language.
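The contrast between a fixed menu and language-based search can be made concrete with a toy sketch. The source does not describe how these tools work internally, so everything here is an assumption for illustration: the clip index, the captions, the tags, and the function names `preset_search` and `free_text_search` are all hypothetical, and plain keyword overlap stands in for whatever models real systems use.

```python
import re

# Hypothetical preset menu: the only searches an older tool would offer.
PRESET_FILTERS = {"person", "vehicle", "bag"}

# Hypothetical clip index. Tags and captions are invented for illustration;
# the source does not say how real systems describe or index footage.
CLIPS = [
    {"id": "clip-001", "tags": {"person", "bag"},
     "caption": "two men handing a bag to each other"},
    {"id": "clip-002", "tags": {"vehicle"},
     "caption": "a vehicle driving past the same spot several times"},
    {"id": "clip-003", "tags": {"person"},
     "caption": "a person walking alone along a street"},
]

# Small words ignored when comparing a query with a caption.
STOPWORDS = {"a", "an", "the", "to", "of", "each", "other"}


def preset_search(filter_name):
    """Menu-style search: only names in PRESET_FILTERS are accepted."""
    if filter_name not in PRESET_FILTERS:
        raise ValueError(f"unsupported preset filter: {filter_name!r}")
    return [clip["id"] for clip in CLIPS if filter_name in clip["tags"]]


def tokenize(text):
    """Lowercase the text and return its words, minus stopwords, as a set."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS


def free_text_search(query, min_overlap=2):
    """Free-text search: rank clips by words shared with the query.

    Keyword overlap is a deliberately crude stand-in (an assumption),
    not a claim about how deployed tools match language to video.
    """
    query_words = tokenize(query)
    scored = []
    for clip in CLIPS:
        overlap = len(query_words & tokenize(clip["caption"]))
        if overlap >= min_overlap:
            scored.append((overlap, clip["id"]))
    # Highest overlap first; ties broken by clip id for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [clip_id for _, clip_id in scored]
```

In this sketch, `preset_search("bag")` can only return every clip tagged with a bag, and a phrase such as "handing a bag" is rejected outright because it is not on the menu. `free_text_search("two men handing a bag")` accepts the phrase as written and returns the clip whose description matches the event, which is the interface shift the examples above describe.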
How intelligence officers and European city officials are using it
The source specifically says intelligence officers can "hunt through massive streams of videos" with simple search terms, and it presents that operational profile as an explicit use case for the technology. It also notes that at least one European country is deploying the tools on its cities. That detail comes from the European official quoted at the top of this piece, who described the technology's adoption in urban contexts.
Why the author frames this as mass spying, revisiting a past argument
The writer of the source material points to a continuity in how technology expands surveillance. He wrote previously that AI enables mass spying "in the way that computers and networks enabled mass surveillance." The Financial Times reporting, and the examples cited above, are presented as evidence that AI has translated those theoretical concerns into practical new searching capabilities: behaviour-focused, language-driven, and scalable across large volumes of video.
The immediate, concrete takeaway is simple: systems that once required operators to select from a limited set of preset searches now accept natural-language descriptions of behaviours and events, and return answers about the footage. A European official called that "the holy grail of surveillance." Video monitoring moves from identifying objects to identifying actions, a shift the source describes as having "created a world of new possibilities."
That observation leaves a clear open question anchored in the facts reported here: as cities and intelligence agencies deploy AI tools that answer natural-language questions about footage, how will the uses, controls, and public visibility of those deployments evolve? The source material documents the capability and its uptake; it does not say how oversight or policy will respond.




