
Eric Washabaugh, Former CIA Targeting & Technology Manager
Eric Washabaugh served as a targeting and technology manager at the CIA from 2006 to 2019, leading multiple inter-agency and multi-disciplinary targeting teams focused on al-Qa’ida, ISIS, and al-Shabaab at CIA’s Counterterrorism Center (CTC). He is currently the Vice President of Mission Success at Anno.Ai, where he oversees multiple machine learning-focused development efforts across the government space.
PERSPECTIVE — As the U.S. competes with Beijing and addresses a host of national security needs, U.S. defense will require more speed, not less, against more data than ever before. The current system cannot support the future. Without robots, we’re going to fail.
News articles in recent years detailing the rise of China’s technology sector have highlighted the country’s increased focus on advanced computing, artificial intelligence (AI), and communication technologies. China’s five-year plans increasingly call for meeting and exceeding Western standards while building reliable internal supply chains and research and development for AI. Key drivers of this advancement are Beijing’s defense and intelligence goals.
Beijing’s deployment of surveillance in its cities, its online spaces, and its financial systems has been well documented. There should be little doubt that many of these implementations are being mined for direct or analogous uses in the intelligence and defense spaces. Beijing has been vacuuming up domestic data, mining the commercial deployment of its technology abroad, and collecting vast amounts of information on Americans, especially those in the national security space.
The goal behind this collection? The development, training, and retraining of machine learning models to enhance Beijing’s intelligence collection, disrupt U.S. collection, and identify weak points in U.S. defenses. Recent reports reflect the scale and focus of this effort, including the physical relocation of national security personnel and resources to Chinese datacenters to mine massive holdings for ways to disrupt U.S. intelligence collection. Far and away, the Chinese exceed all other U.S. adversaries in this effort.
As the new administration begins to shape its policies and goals, we’re seeing the typical media focus on political appointees, priority lists, and overall philosophical approaches. But if the U.S. is to remain competitive and counter this rising threat, what we need is an intense focus on the intersection of data collection and artificial intelligence.
The Emergent System After 9/11: Data Isn’t a Problem, Using It Is
In the wake of the 9/11 attacks, the U.S. intelligence community and the Department of Defense poured billions into intelligence collection. Data was collected from around the world in a variety of forms to prevent new terrorist attacks against the U.S. homeland. Every conceivable detail that could prevent an attack or help hunt down those responsible for attack plotting was collected. Simply put, the United States does not suffer from a lack of data. The emerging capability gap between Beijing and Washington is in processing this data – identifying the details and patterns that are relevant to America’s national security needs.
Historically, the intersection of data collection, analysis, and national defense was a cadre of people in the intelligence community and the Department of Defense known as analysts. A bottom-up evolution that started after 9/11 has revolutionized how analysis is done and to what end. As data supplies grew and new demands for analysis emerged, the cadre began to cleave. The traditional cadre remained focused on strategic needs: warning policymakers and informing them of the plans and intentions of America’s adversaries. The new demands were more detailed and tactical, and the focus was on enabling operations, not informing the President. Who, specifically, should the U.S. focus its collection against? Which member of a terrorist group should the U.S. military target, where does he live, and what time does he drive to meet his buddies? A new, distinct cadre of professionals rose to meet this demand – they became known as targeters.
The targeter is a detective who pieces together the life of a subject or network in excruciating detail: their schedule, their family, their social contacts, their interests, their possessions, their behavior, and so on. The targeter does all of this to understand the subject so well that they can assess the subject’s importance within their organization and predict their behavior and motivation. Targeters also make reasoned, supported arguments for where to place additional intelligence collection resources against a target to better understand them and their network, or for what actions the USG or its allies should take to diminish the target’s ability to do harm.
The day-to-day responsibilities of a targeter include combing through intelligence collection, be it reporting from a spy in the ranks of al-Qa’ida, a drug cartel, or a foreign government (HUMINT); collection of enemy communications (SIGINT); images of a suspicious location or object (IMINT); review of social media, publications, news reports, and the like (OSINT); or materials captured by U.S. military or partner country forces during raids against a specific target, location, or network member (DOCEX). Using all of the information available, the targeter looks for specific details that will help assess the subject or network and predict behaviors.
As more and more of the cadre cleaved into the targeter role, agencies began to formalize their roles and responsibilities. Data piled up and more targeters were needed. As this emergent system was being formalized into the bureaucracy, it quickly became overwhelmed by the volume of data. Too few tools existed to exploit the datasets. Antiquated security orthodoxy surrounding how data is stored and accessed disrupted the targeter’s ability to find links. The bottom-up innovation stalled. Even within the most sophisticated and well-supported targeting environments in the U.S. Government, the problem has persisted and is growing worse. Without attention and resolution, these issues may make the system obsolete.
The Threat of the Status Quo
Two practical issues loom over the future of targeting and effective, focused U.S. national security actions: data overload and targeter enablement.
The New Stovepipes
Since the 9/11 Commission Report, intelligence “stovepipes” have been part of the American lexicon, shorthand for bureaucratic turf wars and politics. Information that could have increased the probability of detecting and preventing the attack wasn’t shared between agencies. Today, volumes of information are shared between agencies; exponentially more is collected and shared each month than in the months before 9/11. Ten years ago, a targeter pursuing a high value target (HVT) – say, the leader of a terrorist group – couldn’t find, let alone analyze, all of the information of potential value to the manhunt. Too much poorly organized data means the targeter cannot possibly conduct a thorough analysis at the speed the mission demands. Details are missed, opportunities lost, patterns misidentified, mistakes made. The disorganization and walling off of data for security purposes means new stovepipes have appeared – not between agencies, but between datasets, often within the same agency. As the data volume grows, these challenges grow with it.
Authors have been writing about data overload in the national security space for years. Unfortunately, progress in managing the issue or offering workable solutions has been modest at best. Data of many types and formats, structured and unstructured, flows into USG repositories every hour, 24/7/365, and the volume grows exponentially every year. In the very near future, there should be little doubt, the USG will collect against foreign 5G, IoT, advanced satellite internet, and adversary databases at terabyte, petabyte, exabyte, or larger scales. The ingestion, processing, parsing, and sensemaking challenges of these data loads will be like nothing anyone has faced before.
Let’s illustrate the issue with a notional comparison.
In 2008, the U.S. military raided an al-Qa’ida safehouse in Iraq and recovered a laptop with a 1GB hard drive. The data on the hard drive was passed to a targeter for analysis. It contained a variety of documents, photos, and video. It took several hours and the help of a linguist, but the targeter was able to identify several leads and items of interest that would advance the fight against al-Qa’ida.
In 2017, the Afghan Government raided an al-Qa’ida media house and recovered over 40TB of data. The data on the hard drives was passed to a targeter for analysis. It contained a variety of documents, photos, and video. Let’s be nice to our targeter and say only a quarter of the 40TB is video – that’s still as much as 5,000 hours, or 208 days of around-the-clock video review, and she still hasn’t reviewed the documents, audio, or photos. Obviously, this workload is impossible given the pace of her mission, so she’s not going to do it. She and her team will look only for a handful of specific documents and largely discard the rest.
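The back-of-the-envelope arithmetic is easy to verify. Here is a minimal sketch, assuming roughly 2GB per hour of compressed video – an assumption on my part, since real bitrates vary widely:

```python
# Rough estimate of manual review time for a captured 40TB media trove.
# ASSUMPTION: ~2 GB per hour of compressed video; real bitrates vary widely.
TOTAL_TB = 40
VIDEO_FRACTION = 0.25   # be "nice" and assume only a quarter is video
GB_PER_HOUR = 2.0       # assumed average video bitrate

video_gb = TOTAL_TB * 1000 * VIDEO_FRACTION
review_hours = video_gb / GB_PER_HOUR
review_days = review_hours / 24

print(f"{video_gb:,.0f} GB of video ~= {review_hours:,.0f} hours "
      f"~= {review_days:,.0f} days of nonstop review")
# 10,000 GB of video ~= 5,000 hours ~= 208 days
```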
Let’s say the National Security Agency in 2025 collected 1.4 petabytes of leaked Chinese Government emails and attachments. Our targeter and all of her teammates could easily spend the rest of their careers reviewing the data using current methods and tools.
In real life, the 2011 raid on Usama Bin Ladin’s compound produced over 250GB of material, and it took an interagency task force many months to manually comb through the data and identify material of interest. These examples shed light on only a subset of the data overload problem. Keep in mind, this DOCEX is only one source our targeter has to review to get a full picture of her target and network. She’s also looking through all of the potentially relevant HUMINT, SIGINT, IMINT, OSINT, etc. that could be related to her target. That’s many more datasets, often stovepipes within stovepipes, with the same outmoded tools and methods.
This leads us to our second problem: targeter enablement.
The Collapsing Emergent System
Much of our targeter’s workday is spent on information extraction and organization, the vast majority of which is, well, robot work. She’ll be repeating manual tasks for most of the day. She knows what she needs to investigate today to continue building her target or network profile. Today it’s a name and a phone number. She has a time-consuming, tedious, and potentially error-prone effort ahead of her – a “swivel chair process” – tracking down the name and phone number in multiple stovepiped databases using a variety of outmoded software tools. She’ll map what she’s found in a network analysis tool, in an electronic document, or <*wince*> a pen-and-paper notebook. Now…finally…she will begin to use her brain. She’ll look for patterns, she’ll analyze the data temporally, she’ll find new associations and correlations, and she’ll challenge her assumptions and come to new conclusions. Too bad she spent 80% of her time doing robot work.
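This lookup step is exactly the kind of repetition software handles well. Below is a minimal sketch of what automating the fan-out might look like; the repository names, URLs, and query interface are hypothetical stand-ins for illustration, not real systems or APIs:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stovepiped repositories; each URL is a stand-in for a real
# data store with its own access path and query syntax.
DATASTORES = {
    "humint_reports":   "https://repo.example/humint/search",
    "sigint_selectors": "https://repo.example/sigint/search",
    "docex_archive":    "https://repo.example/docex/search",
}

def search_store(store: str, url: str, selector: str) -> dict:
    """Query one repository for a selector (a name, a phone number, etc.).
    Stubbed out here; a real version would make an authenticated API call."""
    return {"store": store, "selector": selector, "hits": []}

def fan_out(selector: str) -> list[dict]:
    """Run the same lookup across every stovepipe in parallel and merge
    the results into one record the targeter can analyze immediately."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(search_store, s, u, selector)
                   for s, u in DATASTORES.items()]
        return [f.result() for f in futures]

print(fan_out("555-0101"))  # one query instead of a morning of swivel-chair work
```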
This is the problem as it stands today. The targeter is overwhelmed with too much unstructured and stovepiped information and does not have access to the tools required to clean, sift, sort, and process massive amounts of data. And remember, the system she operates in is about to receive exponentially more data. Absent change, the failures described above – missed details, lost opportunities, misidentified patterns, mistakes – are all but certain to multiply.
This may seem banal or weedy, but it should be very concerning. This system – how the United States processes the information it collects to identify and prevent threats – will not work in the very near future. The data stovepipes of the 2020s could produce a surprise or catastrophe just as the institutional stovepipes of the 1990s did; it won’t be a black swan. As the U.S. competes with Beijing, its national defense will require more speed, not less, against more data than ever before. It will require evaluating data and making connections and correlations faster than a human can. It will require effective processing of this mass of data to identify precision solutions that reduce the scope of intervention needed to achieve our goals while minimizing harm. Our current and future national defense needs our targeter to be motivated, enabled, and effective.
Innovating the System
To overcome the exponential growth in data and the stovepiping that follows it, the IC doesn’t need to hire armies of 20-somethings to do around-the-clock analysis in warehouses all over northern Virginia. It needs to modernize its security approach to connect these datasets and apply a vast suite of machine learning models and other analytics to help targeters start innovating. Now. These technological innovations are also likely to produce more engaged, productive, and energized targeters who spend their time applying their creativity and problem-solving skills rather than doing robot work. We can’t afford to lose any more trained and experienced targeters to this rapidly fatiguing system.
The current system, as discussed, is one of unvalidated data collection and mass storage, manual loading, mostly manual review, and robotic swivel-chair processes for analysis.
The system of the future breaks down data stovepipes and eliminates the manual and swivel-chair robot processes of the past. It automates data triage, so users can readily identify datasets of interest for deep manual research. It automates data processing, cleaning, correlation, and target profiling – clustering information around a potential identity. It helps targeters identify patterns and suggests areas for future research.
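To make “clustering information around a potential identity” concrete, here is a minimal record-linkage sketch. The records, the matching rule (a shared phone number or a near-identical name), and the 0.8 similarity threshold are all illustrative assumptions, not a description of any fielded system:

```python
from difflib import SequenceMatcher

# Toy records pulled from notionally separate sources.
records = [
    {"source": "sigint", "name": "Abu Hamza",  "phone": "555-0101"},
    {"source": "docex",  "name": "Abu Hamzah", "phone": "555-0101"},
    {"source": "osint",  "name": "A. Hamza",   "phone": None},
    {"source": "humint", "name": "Karim",      "phone": "555-0199"},
]

def likely_same_identity(a: dict, b: dict) -> bool:
    """Link two records if they share a phone or have very similar names."""
    if a["phone"] and a["phone"] == b["phone"]:
        return True
    return SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio() > 0.8

# Naive single-link clustering: merge records pairwise into identity groups.
clusters: list[list[dict]] = []
for rec in records:
    home = next((c for c in clusters
                 if any(likely_same_identity(rec, member) for member in c)), None)
    if home:
        home.append(rec)
    else:
        clusters.append([rec])

for i, cluster in enumerate(clusters):
    print(f"identity {i}: {[r['source'] + ':' + r['name'] for r in cluster]}")
# The three Hamza variants collapse into one identity; Karim stands alone.
```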
How do current and emerging analytic and ML techniques bring us to the system of the future and better enable our targeter? Automated triage, correlation, and pattern identification are the places to start; one illustration follows below.
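As one illustration of automated triage (my example, not a description of any deployed capability): unsupervised topic clustering can sort a document dump before anyone reads a page. A minimal sketch using scikit-learn, with a placeholder corpus standing in for translated DOCEX:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

# Placeholder corpus standing in for a translated DOCEX collection.
docs = [
    "ledger of payments to couriers in Kunar province",
    "courier payment records and travel reimbursements",
    "draft propaganda statement on recent operations",
    "media release praising recent attacks",
    "family letter about a wedding next month",
]

# Convert documents to TF-IDF vectors, then group them into rough topics.
vectors = TfidfVectorizer(stop_words="english").fit_transform(docs)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(vectors)

for label, doc in sorted(zip(labels, docs)):
    print(label, "|", doc)
# A targeter can now dive into the finance cluster first and
# deprioritize the personal correspondence.
```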
Let’s be clear: we’re far from the Laplace’s demon of HBO’s “Westworld” or FX’s “Devs” – there is no super machine that will replace the talented and dedicated people who make up the targeting cadre. Targeters will remain critical to evaluating and validating these results, doing deep research, and applying their human creativity and problem solving. The national security space hires brilliant and highly educated personnel to tackle these problems; let’s challenge and inspire them, not relegate them to the swivel-chair processes of the past.
We need a new system to handle the data avalanche and support the next generation. Advanced computing, analytics, and applied machine learning will be critical to efficient data collection, successful data exploitation, and automated triage, correlation, and pattern identification. It’s time for a new chapter in how we ingest, process, and evaluate intelligence information. Let’s move forward.
Read more expert-driven national security insights, perspective and analysis in The Cipher Brief