With news of nation-states allegedly attacking companies, political institutions, and world governments, it is important to know how attribution works in cybersecurity. For the unfamiliar, attribution is the process investigators and intelligence workers use to tie responsibility for an event or action to a person, group, or country.
Unless there is physical forensic evidence showing that an individual or a group of individuals was on a computer at the exact time an organization was compromised, you cannot definitively attribute the attack. This is one of the reasons that “attacking or hacking back” at the corporate level does not fly in the industry.
When attempting to attribute any criminal activity, analysts work from inherent assumptions, analysis of the event itself, analysis of similar historical events, and the state of the environment. In the United States and in most other countries’ criminal courts, a defendant must be proven guilty “beyond a reasonable doubt.” Cyber attribution is very similar: analysts weigh the digital forensic evidence, the overarching situation, historical data, and potential motives or intent in order to come to a conclusion “beyond a reasonable doubt.”
There are broadly six dimensions analysts use to attribute attacks. Keep in mind that no single dimension is conclusive; they should all be considered as a whole.
Analyzing the toolcraft
By analyzing the tools, scripts, and programs an adversary is using, analysts can glean critical data points such as compile time, the localized language of the compiler, the programming language, spelling mistakes in the code, libraries utilized, patterns and ordering of execution events, and more. Perhaps, for example, the adversary consistently misspells a certain word, and the mistake appears across various iterations of the attacker’s software, or maybe the piece of malware being analyzed was written on a machine with a Cyrillic keyboard layout.
Analyzing the adversary’s TTPs
Adversaries sometimes use unique tactics, techniques, and procedures (TTPs). Identifying these can often shed light on the perpetrator of an attack. Examples of TTPs include the method of delivering the attack, such as social engineering; the types of malware deployed; behavior within the network, such as the steps taken to find what they are looking for and accomplish their mission; and the actions adversaries take to cloak themselves, such as disguised traffic or false flags used to cover their tracks.
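One simple way to picture TTP comparison is as set overlap: the techniques observed in an intrusion are compared against profiles built from earlier investigations. The sketch below uses Jaccard similarity for this. The scoring method, the actor names, and the technique labels are all assumptions made for illustration and are not drawn from any real case or from a method described in this article.

```python
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_known_actors(
    observed: Set[str], profiles: Dict[str, Set[str]]
) -> List[Tuple[str, float]]:
    """Rank known actor profiles by TTP overlap with the observed intrusion."""
    scores = {name: jaccard(observed, ttps) for name, ttps in profiles.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical profiles built from earlier investigations.
profiles = {
    "actor-a": {"spearphishing", "custom-backdoor", "dns-tunneling", "log-wiping"},
    "actor-b": {"watering-hole", "commodity-rat", "log-wiping"},
}
observed = {"spearphishing", "dns-tunneling", "log-wiping"}

ranked = rank_known_actors(observed, profiles)
print(ranked)
```

A high score only says the behavior resembles a known profile. Because TTPs can be copied or deliberately imitated as a false flag, the result is a lead to weigh against the other dimensions.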
Analyzing the “source” data
Attacks have origins, and often they must communicate with nodes outside the targeted network, either for command and control or for exfiltrating data from their victims. Associated metadata (the when, where, and how) can be collected, analyzed, and put into the attribution case to consider as a whole. This metadata could include source IP addresses, domain names, domain name registration information (registrant’s name, location, email address, and nameservers used), third-party cross-reference data coming from sources like Crowdsource or VirusTotal, email addresses, hashes, and even hosting platforms (for example, if they are using Bitcoin-paid virtual private servers). These data points expose the public information behind the source(s) of the attack but can easily be faked. Even so, by analyzing them across a series of attacks targeting various, and perhaps geopolitically linked, organizations, certain assumptions and assertions can be made based upon the recurrence of the same false data. For instance, an anonymous email address found in an attack can be linked back to an actor when it is tied to domain names previously identified as that actor’s command and control points.
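The cross-attack analysis described above can be sketched as an inverted index from each indicator to the investigations in which it appeared. This is a minimal illustration, not a description of any specific tool: the function name `shared_indicators` and the case identifiers are hypothetical, and the domains and IP addresses come from ranges reserved for documentation.

```python
from collections import defaultdict
from typing import Dict, List, Set


def shared_indicators(incidents: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Map each indicator seen in two or more incidents to those incidents."""
    seen = defaultdict(set)
    for incident_id, indicators in incidents.items():
        for indicator in indicators:
            # Normalize case and whitespace so "C2.example.net" matches "c2.example.net".
            seen[indicator.strip().lower()].add(incident_id)
    return {ind: sorted(ids) for ind, ids in seen.items() if len(ids) > 1}


# Hypothetical indicator sets from three separate investigations.
incidents = {
    "case-001": {"c2.example.net", "192.0.2.10", "reg@example.org"},
    "case-002": {"C2.example.net", "198.51.100.7"},
    "case-003": {"203.0.113.5", "reg@example.org"},
}

overlaps = shared_indicators(incidents)
for indicator, cases in sorted(overlaps.items()):
    print(indicator, cases)
```

Here the reused domain links the first two cases and the reused registrant email links the first and third, even though each value on its own could be fake.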
Adversary goals
Not every attacker is motivated by money. Certainly, cybercriminals exist in this world, but most adversaries have specific goals in mind. Through an analysis of their behavior in a network, we can collect valuable intelligence as to what they were after, which can directly affect the attribution process. Did they just persist and spy over a long period of time? Were they looking for specific data during their intrusions?
Business drivers
Every industry in the world encounters cyber attacks in some form or another. Some industries go through phases or cycles that attract adversaries. For instance, when oil prices go up as a result of reduced supply, companies spend more on exploration, which in turn puts those organizations at higher risk of geospatial data theft. Another example is when technology firms roll out innovative devices to the market, resulting in those companies being targeted for intellectual property acquisition and data theft. Having an idea of what is going on in a company’s industry is key to understanding the mind of the adversary and therefore directly contributes to attribution.
Geopolitics
Nations have disagreements over defense, trade, foreign policy, territory, economics, and various other matters. These situations play a vital role in attribution. For instance, is a certain nation-state going through an energy crisis and therefore trying to infiltrate oil and gas companies to find exploration data? Such geopolitical analysis attempts to determine an actor’s identity by placing their actions under the lens of current events, tying a variety of assumptions about stakeholder motivations to the technical forensics of a cyber attack.
Conclusion
Each of these six dimensions plays a part in the overall attribution process, though with varying degrees of importance. Without careful consideration and analysis of each dimension, both within a single investigation and across a set of investigations at multiple organizations, it would be impossible to come to a conclusion “beyond a reasonable doubt” when attributing an attack to a specific actor.
Throughout the attribution process, analysts are essentially assembling a virtual who, what, why, when, where, and how. On a micro-level, incident response teams are collecting evidence to answer these questions on an investigation-by-investigation basis. This is important not only to explain the order of events for the client, but also to apply the right expulsion and remediation processes. On a macro-level, threat intelligence teams are receiving evidence, assumptions, and analysis from multiple investigations, pulling it all together, and identifying patterns from around the globe. If there are patterns across a multitude of investigations, the attacks can be categorized and assumptions can be made for attribution based upon the global visibility we have within industries and across them.
Many breaches and incidents, along with their associated threat intelligence, are shared with law enforcement agencies and parts of the federal government. The FBI analyzes this incoming data and joins it with the data it has in its purview, which could come from other commercial investigations, government investigations, and the U.S. intelligence community.
Attribution is not an exact science. We will never live in a world where we can name a responsible party for an attack with 100 percent certainty, particularly without physical evidence. However, by analyzing the trends, tools, and TTPs, along with a dash of situational awareness, we can come close to attribution “beyond a reasonable doubt.”