Data is “Ammo in AI Warfare,” and the Pentagon May Be Running Low

OPINION — “During the daily battle rhythm, the DoD [Department of Defense] creates more than 22 terabytes [equal to 22 trillion bytes, with a byte being eight bits long] of digital data daily, and because of their [DoD] outdated data retention and management policies, warfighters, analysts, and operators are unable to tap into its full potential because it is not AI-ready [Artificial Intelligence-ready]. These potential insights are wasted.”

That was Alexandr Wang, founder and Chief Executive Officer of ScaleAI, testifying last Tuesday before the House Armed Services Subcommittee on Cyber, Information Technologies, and Innovation on “Man and Machine: Artificial Intelligence on the Battlefield.”


I’m an amateur when it comes to AI, but I found Wang’s prepared statement and his oral testimony much easier to follow than other presentations I have read or heard on this complex subject.

First, a bit about Wang. He is 26 years old, a dropout from MIT who at 19 established ScaleAI, which has grown to have a $7 billion-plus valuation. Among other things, ScaleAI builds computer infrastructure for AI programs and then trains individuals to take data and make it production-ready for those AI programs. Wang’s company has created AI data infrastructure for autonomous vehicle programs at General Motors and Toyota; worked with leading technology companies such as OpenAI, Meta and Microsoft; and has had contracts with the DoD's Chief Digital and Artificial Intelligence Office (CDAO), and the U.S. Army and Air Force.




Called by Time Magazine the youngest self-made billionaire in the world, Wang proudly told House members last week, “Supporting the U.S. government and the national security mission is deeply personal for me. I grew up in Los Alamos, New Mexico, where my parents were physicists at Los Alamos National Laboratory, the birthplace of a technology that defined the last era of warfare - the atomic bomb.”

Because of that heritage, Wang said, “I was keenly aware that an emerging technology, like AI, could completely change global politics and the nature of war.”

Wang told the House members to view AI data as “the Ammunition in AI Warfare…[which] will feature algorithm-fueled military planning, targeting, command and control, and autonomous platforms.”

Wang stressed to the subcommittee, “AI always boils down to data. All of the advancements in commercial AI technologies, such as ChatGPT [which generates human-like texts] have come from using mass troves of data.”

However, right now, there is no common database for all the data collected in the Defense Department. Instead, there are hundreds of different databases stored all over the department, the majority of which do not have common coding that would allow AI interfacing.

The foundational first step for AI is the collection of all that raw DoD data. That data then has to be curated for things like accuracy; then evaluated to be better aligned to needs of the project; and then annotated, the process of attributing, tagging, or labeling data to help machine-learning algorithms understand and classify the information they process.
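
The curate, evaluate and annotate steps described above can be pictured with a small sketch. This is only an illustration under stated assumptions: the `Record` class, the function names, the keyword matching and the sample records are all hypothetical, and nothing here reflects how ScaleAI or the DoD actually prepares data.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """One piece of raw collected data (hypothetical structure)."""
    text: str
    labels: list[str] = field(default_factory=list)


def curate(records):
    """Drop empty and duplicate records, a stand-in for accuracy checks."""
    seen = set()
    kept = []
    for r in records:
        key = r.text.strip().lower()
        if key and key not in seen:
            seen.add(key)
            kept.append(r)
    return kept


def evaluate(records, keywords):
    """Keep only records relevant to the project, judged by keyword match."""
    return [r for r in records if any(k in r.text.lower() for k in keywords)]


def annotate(records, label_rules):
    """Tag each record so a learning algorithm can classify it."""
    for r in records:
        r.labels = [label for word, label in label_rules.items()
                    if word in r.text.lower()]
    return records


# Invented sample data, for illustration only.
raw = [
    Record("Vessel sighted near port"),
    Record("vessel sighted near port"),   # duplicate, removed by curate()
    Record(""),                           # empty, removed by curate()
    Record("Cafeteria menu updated"),     # irrelevant, removed by evaluate()
    Record("Aircraft maintenance log"),
]
label_rules = {"vessel": "maritime", "aircraft": "aviation"}
ready = annotate(evaluate(curate(raw), ["vessel", "aircraft"]), label_rules)
```

In this toy run, five raw records shrink to two labeled ones, which is the point of the passage: raw volume is not the same as data a model can use.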

Wang said his company, through Scale Data Engine, “is working across government agencies to annotate and prepare vast troves of data into a high-quality resource that can be used to train AI models.”

For example, Wang told the subcommittee of the May 2023 launch of Scale Donovan, which he described as “our AI-powered decision-making platform, the first LLM [Large Language Model] deployed on Department of Defense classified networks.”

Back on May 10, in a letter to his employees, Wang said much more about Donovan, which I see in effect as the brains of this AI LLM [or like HAL in the movie 2001: A Space Odyssey].




Wang said, “Donovan ingests and understands vast amounts of structured and unstructured data to enable operators and analysts to make sense of any aspect of the real world in minutes.” He also said Donovan was “on a classified network for the [Army’s] XVIII Airborne Corps — ingesting over 100,000 pages of live data to enable actionable insights across the battlefield. The sheer volume of inputs from various formats is a key challenge facing analysts and operators across the federal space. We cannot let the proliferation of information and manual processing be our Achilles heel.”

A presentation on the ScaleAI website says Donovan has three layers:

  • The first takes in data from any on-network source and makes it available to the AI model. That includes unstructured text data, structured databases, geoint imagery, and traditional data like PowerPoint files, emails, and PDFs [Portable Document Format]. It also integrates with other record systems.
  • Second is the management layer where users customize and audit model behavior, manage access and ensure secure, understandable and expected results. It is also where Donovan connects the data layer to a network-authorized LLM.
  • Third is the action layer where operators and analysts use chat to interact with the AI model by asking questions, giving and getting instructions, visualizing maps, authoring reports and tasking external systems, such as surveillance or weaponry platforms.
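
To make the three layers concrete, here is a minimal sketch of how such a separation could look in code. Everything in it is an assumption made for illustration: the class names, the keyword search that stands in for a real LLM, and the sample documents are hypothetical and do not describe Donovan's actual design.

```python
class DataLayer:
    """Takes in documents from any source and makes them searchable."""

    def __init__(self):
        self.documents = []

    def ingest(self, source, text):
        self.documents.append({"source": source, "text": text})

    def search(self, term):
        return [d for d in self.documents if term.lower() in d["text"].lower()]


class ManagementLayer:
    """Controls access and keeps an audit trail of every request."""

    def __init__(self, data, authorized_users):
        self.data = data
        self.authorized_users = set(authorized_users)
        self.audit_log = []

    def query(self, user, term):
        allowed = user in self.authorized_users
        self.audit_log.append((user, term, allowed))
        if not allowed:
            raise PermissionError(f"{user} is not authorized")
        return self.data.search(term)


class ActionLayer:
    """Chat-style front end; a keyword search stands in for the model."""

    def __init__(self, management):
        self.management = management

    def ask(self, user, term):
        hits = self.management.query(user, term)
        sources = ", ".join(d["source"] for d in hits) or "no sources"
        return f"{len(hits)} document(s) mention '{term}': {sources}"


# Invented sample data, for illustration only.
data = DataLayer()
data.ingest("email", "Vessel delayed at port")
data.ingest("pdf", "Quarterly budget summary")
management = ManagementLayer(data, authorized_users=["analyst"])
chat = ActionLayer(management)
answer = chat.ask("analyst", "vessel")
```

The design choice worth noticing is that the chat front end never touches the data directly; every request passes through the management layer, so access checks and auditing happen in one place.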

The ScaleAI website walks through a hypothetical exercise to illustrate how a U.S. Navy operations officer in the Indo-Pacific Command area could use Donovan to research suspicious foreign vessel movements.

The officer would ask Donovan to put any foreign vessel movements over the past 48 hours on the map and flag any out-of-pattern activity. That turns up one questionable vessel. The officer then asks Donovan for options for taking a closer look at that vessel in the next 48 hours. Within moments Donovan produces U.S. satellite surveillance patterns and the officer requests coverage the next time the right satellite passes over the vessel. With the imagery and remote sensing collection, which shows radiation from the vessel, the officer can then send this intelligence to the National Geospatial-Intelligence Agency with a request for additional imagery, and to other intelligence community agencies for what they might have on the ship’s unusual pathway and emanations. Using Donovan’s data, an analyst detects a similar pattern from one specific country’s ships and within minutes the original Navy officer has enough to create an intelligence report for Indo-Pac Command to either gather more intelligence or consider some kind of kinetic or non-kinetic action. For the latter, Donovan in moments could provide current rules of engagement and weapons system availability.

All this could take place in minutes, whereas in the past it could have taken weeks.

Along with its other AI activities, Wang’s company is providing aid to relief organizations in Ukraine. Using streams of commercial satellite imagery, ScaleAI created its Automated Damage Identification Service, a machine learning tool that trained AI to automatically detect new damage to buildings and structures. ScaleAI’s geo-tagged links to attack reports are helping humanitarian organizations target areas of need. 

At last week’s hearing, Wang said, “Unless we start to prioritize investment in both AI systems and the underlying data infrastructure to power it, we risk falling behind China and doing too little too late.” He said he was pleased to see the Biden budget had $1.8 billion, up from $1.3 billion this year, for DoD AI. But, Wang added, “AI systems are only as good as the data that they are trained on, and leading the world in developing AI-ready data is an absolute requirement to maintain our strategic advantage in the era of AI warfare.”

Wang was pleased that Congress had mandated a centralized DoD data repository, but he pointed out that implementation has been challenging because DoD lacks the proper data retention and management systems to make it fully operational. “Within the DoD, much of our key AI asset — our data — is being wasted every day. This concept is critical to enabling AI platforms of all kinds, but it relies on AI-ready data to succeed, and one of our most critical AI resources is not being used,” he said.

One day after Wang’s testimony, Deputy Secretary of Defense Kathleen H. Hicks marked the one-year anniversary of CDAO, the Pentagon’s chief AI office, saying, “Our intent was to accelerate DoD's adoption of data, analytics and AI from boardroom to battlefield, because of how essential these technologies are to staying ahead in the strategic competition for the 21st century.”

Among the accomplishments she cited were “Cleaner datasets are fueling AI models and machine learning algorithms to more rapidly detect specific maritime targets and acoustic signatures,” and “military commanders are collaborating with teams of data scientists that we've deployed at every Combatant Command to work on problems that our war fighters find most vexing, while integrating data across applications, systems, and users.”

Hicks concluded, “Our investments in data and AI are yielding returns much faster than most new defense capabilities … They're producing and delivering for the war-fighter in the here and now, in matters of months, weeks, and even days.”

However, I doubt that DoD AI is developing fast enough for the young Mr. Wang.

The Cipher Brief is committed to publishing a range of perspectives on national security issues submitted by deeply experienced national security professionals. 

Opinions expressed are those of the author and do not represent the views or opinions of The Cipher Brief.
