Name: Behind the Shield: Decoding MITRE ATT&CK® Evaluations — What It Is and Why It Matters
Uploaded: 2025-12-18T11:16:06.127Z
Duration: 1 h 8 s
Description: Behind the Shield: Decoding MITRE ATT&CK® Evaluations — What It Is and Why It Matters

Transcript: Alright. Let's give it just a minute for more people to join in, and then we'll get started. In the meantime, thank you to everyone who's already joined today's webinar. Good morning and good afternoon, wherever you are. Today, we are going to deep dive into the results of the recent MITRE ATT&CK Evaluation for Enterprise 2025. We are going to record the session, so if you want to refer to this content again, we will send it over by email. During the session, if you have any questions for our speakers, feel free to pop them into the Q&A panel, and we will try to answer them live, in the Q&A panel, or as a follow-up to this session. As a reminder, this session is being recorded. With that, over to you, Paul and Rafe, and thank you all for joining.
Thank you, Deviani. So, welcome. Good morning, good afternoon, good evening, depending on where you're joining us from today. As Deviani just said, we're going to be diving into the most recent MITRE ATT&CK Evaluation for Enterprise, the Enterprise 2025 evaluation. Your host today is myself, Paul Murray. I'm senior product marketing director here at Sophos, and I'm joined by Rafe Pilling, director of threat intelligence from the Sophos Counter Threat Unit. We've got some great content for you today, so let's dive in.
So what are we gonna be covering? Firstly, a little bit of orientation. For those that may not be familiar with MITRE ATT&CK Evaluations, I'm going to provide an introduction. What are these evaluations all about? How do they work? Rafe is going to spend some time on the emulated threat groups that MITRE used in this evaluation and the attack scenarios that they undertook as part of this test.
Then we will go into the results and, importantly, how you can use those results if you're looking at an XDR or an EDR solution. So let's dive in.
So what are MITRE ATT&CK Evaluations? Firstly, these evaluations are among the world's most respected independent security tests, and that's due in part to the emulation of real-world attack scenarios and to the transparency of the results that are provided. They're also among the most rigorous independent testing evaluations. They emulate the tactics, techniques, and procedures, the TTPs, leveraged by real-world adversarial groups. And the evaluation assesses each participating vendor's ability not only to detect and analyze threats, but also the way in which they describe those threats, with the output of those detections aligned to the language and the structure of the MITRE ATT&CK framework, which I'm sure many of you are familiar with.
Now importantly, unlike many other evaluations, there's actually no single way to interpret the results of ATT&CK Evaluations. They're not intended by MITRE to be competitive analyses. As a result, they don't produce a winner or a leader. They simply show what the evaluation observed.
So this was round seven of the ATT&CK Evaluations for Enterprise. This is MITRE's product-focused evaluation, not to be confused with their managed services evaluation. MITRE conducts these evaluations both for products, which is essentially EDR and XDR solutions, and for managed services, which is essentially MDR solutions. Sophos participates in both of those sets of evaluations. This one is focused on the products, i.e., the XDR solution. We have participated in this evaluation for the last five rounds. I mentioned it was one of the most rigorous evaluations available. This one was the most complex and demanding evaluation to date, mainly because in addition to endpoints and servers, for the first time, MITRE included cloud infrastructure and AWS in the evaluation.
Now in this evaluation, MITRE emulated two distinct adversary groups, Scattered Spider and Mustang Panda. And Rafe's gonna spend some time on these in a few minutes, talking about what we know about these two threat groups. The evaluation comprised 90 adversary activities, or substeps, to use the language of MITRE, across two distinct attack scenarios. So that's essentially 90 adversary activities where MITRE is looking to see, a, whether vendors can identify those activities as suspicious or malicious, and, b, how effectively those vendors can present the details of those detections in a way that's actionable for a security team.
So why does Sophos participate? We participate in these evaluations alongside some of the best security vendors in the industry. The threat landscape is ever evolving, so as a community of security vendors, we are united against that common enemy, the adversary. Moreover, these evaluations help to make us better, both individually as a vendor and collectively, for the benefit of the organizations that we defend. Sophos uses MITRE ATT&CK Evaluations not as a marketing checkbox, but as a catalyst to drive deeper innovation in our detection pipeline, our AI models, and our analyst workflows. MITRE gives us a clear, unbiased measurement of our technology, and we use it as a driver for continuous improvement.
On this slide, you can see who participated in this year's evaluation, including Sophos there. There are some regular participants here, including Trend Micro, Cybereason, ESA, and others. CrowdStrike dropped out last year, in the 2024 evaluation, but decided to rejoin for the 2025 evaluation. And there are a couple of new participants in this year's evaluation. Acronis and Cyberani decided to join for the first time. However, several vendors decided not to stand up and be counted in this round, as you can see on the right-hand side.
Examples include Microsoft, SentinelOne, Palo Alto, and others. Now these vendors have provided little to no public commentary on why they declined to participate, and we won't speculate. However, vendors have to make a calculated decision about whether they want their detection engines tested in a transparent, third-party environment. And MITRE's Enterprise evaluation has become significantly more rigorous, technically demanding, and resource intensive. So for vendors that may be facing platform rewrites, acquisition integrations, or roadmap gaps, the risk of participating may outweigh the reward. But Sophos continues to participate because transparency matters. Customers deserve real proof about whether a vendor can detect, analyze, and respond to advanced threats.
So there's our quick introduction, our quick orientation. I'm now gonna hand over to Rafe Pilling, who's gonna talk to you about the threat groups and the attack scenarios that MITRE emulated in this evaluation. Rafe, over to you.
Excellent. Thank you, Paul. So, yeah, my name is Rafe Pilling. I am the director of threat intelligence for the Sophos Counter Threat Unit. I joined Sophos earlier this year, and I'm gonna tell you a little bit about that. Sophos acquired Secureworks in February 2025, and with that came the CTU, or the Counter Threat Unit, a dedicated all-source threat intelligence team tracking complex cybercrime and sophisticated state-sponsored threat actors. CTU then became part of Sophos X-Ops, an advanced threat response joint task force. CTU brings strategic threat intelligence and adversary tracking capability, with over fifteen years of threat intelligence data and reporting. And it complements an existing team of seasoned experts delivering critical cybersecurity services to protect customers. So it's really great to be part of that X-Ops family. Next slide, please, Paul.
So who were the groups that were emulated as part of this evaluation?
So there are two groups: Scattered Spider, which we track as GOLD HARVEST, and Mustang Panda, which we track as BRONZE PRESIDENT. And I should say something about that. Obviously, different vendors have different names and different naming schemes for these threat actors and threat groups. I will talk in terms of both the threat group names used in the evaluation and our equivalents, so you can reference those online and look up some of the reporting and other threat group profiles that are published online.
These groups are well known to us, with ongoing tracking of both sets and the groups adjacent to them. In fact, there are over 245 threat groups that have been tracked or are being tracked by the CTU. So while we didn't know the groups in advance, our focus on real-world versions of these threat groups meant that Sophos had both protections in place and the intelligence at our fingertips to detect malicious activity throughout the emulated scenarios and identify the emulated groups.
Scattered Spider had targeted individuals, seeking to steal cryptocurrency, and in the last couple of years has moved into partnering with Russian-speaking ransomware groups to cause chaos in multiple countries, particularly in the US and the UK. They lean heavily on social engineering and techniques like SMS phishing and SIM swapping to conduct their attacks. In contrast, BRONZE PRESIDENT has been a long-running source of espionage and data theft in the service of Chinese intelligence. This pair was a great choice, as together they encompass a broad range of attacker TTPs, or tactics, techniques, and procedures, that are employed by multiple threat groups, fully exercising the capabilities that Sophos has. Next slide, please.
So today, I'll walk you through the emulation of Scattered Spider, the threat group behind several high-profile breaches, including against sectors like retail, aviation, and gaming.
This scenario illustrates the complete attack chain, with critical elements that made it especially dangerous. The attacker moves from an on-premises environment into the cloud, which is a realistic element that we see in real-world attacks.
The attack started with the targeting of an employee using a phishing email impersonating internal IT staff. Again, a very common scenario, particularly for this threat group. The message claims a security update was completed and urges the user to quickly reauthenticate. When the user clicks the link, they're taken to a convincing login page that captures both credentials and an active single sign-on session token.
Once inside, the attacker slows down and begins exploring the environment. Again, a very common process in the overall attack chain, as the attacker has to orient themselves and figure out what they have access to, where they are in the environment, and where they need to get to. So they use normal administrative tools to understand what systems exist, what protections are in place, and where valuable access might be. Our role here is to extract that malicious signal from the noise of the enterprise environment. In this scenario, they also quietly add access to shared resources and set up mail rules to reduce the chance of being noticed.
Using the stolen single sign-on session, the attacker accesses the organization's AWS environment without being challenged for multi-factor authentication. Almost immediately, they begin exploring cloud services and permissions, effectively moving from a single compromised user into the cloud control plane. From there, the attacker focuses on persistence. They create new cloud identities with administrative access and deploy new cloud infrastructure that blends in with legitimate resources. This allows them to move laterally and maintain access even if the original account is discovered or deactivated.
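The persistence step described above, a new cloud identity that is then given administrative access, is the kind of signal that can be picked out of AWS CloudTrail records. Below is a minimal Python sketch of that idea. It assumes simplified CloudTrail-style events that are already sorted by time; the sample records, the user names, and the function name `find_new_admin_users` are hypothetical illustrations, and this is not a description of how Sophos XDR or the MITRE evaluation detects the activity.

```python
# AWS-managed policy that grants full administrative access.
ADMIN_POLICY_ARN = "arn:aws:iam::aws:policy/AdministratorAccess"


def find_new_admin_users(events):
    """Return IAM user names that were created and then given admin access.

    `events` is assumed to be a chronologically ordered list of
    CloudTrail-style dicts with `eventName` and `requestParameters` keys.
    """
    created = set()
    flagged = []
    for event in events:
        name = event.get("eventName")
        params = event.get("requestParameters") or {}
        user = params.get("userName")
        if name == "CreateUser" and user:
            created.add(user)
        elif (
            name == "AttachUserPolicy"
            and user in created
            and params.get("policyArn") == ADMIN_POLICY_ARN
        ):
            flagged.append(user)
    return flagged


# Hypothetical sample data for illustration only.
sample_events = [
    {"eventName": "ConsoleLogin", "requestParameters": None},
    {"eventName": "CreateUser", "requestParameters": {"userName": "svc-backup"}},
    {
        "eventName": "AttachUserPolicy",
        "requestParameters": {"userName": "svc-backup", "policyArn": ADMIN_POLICY_ARN},
    },
    {
        # Existing user, not created in this window, so not flagged here.
        "eventName": "AttachUserPolicy",
        "requestParameters": {"userName": "alice", "policyArn": ADMIN_POLICY_ARN},
    },
]

print(find_new_admin_users(sample_events))
```

A real pipeline would also need to consider roles, groups, inline policies, and access keys; the sketch only shows the create-then-elevate pattern on user identities.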
In the final phase, the attacker collects data from internal systems, stages it in cloud storage, and then exfiltrates it to attacker-controlled infrastructure. Throughout this process, alerts and notifications are being quietly suppressed to delay detection. This emulation was really interesting. It highlighted how quickly a simple phishing email can escalate into a full cloud compromise when identity and cloud access are abused. Next slide, please, Paul.
The second scenario that I'll walk you through is an emulation of Mustang Panda or, as I said, BRONZE PRESIDENT to us. It's a long-running, China-based espionage group that has targeted governments and non-governmental organizations for over a decade. This scenario included two attack paths, or sub-scenarios, one called Orpheus and one called Perseus, each showing a different way the same threat actor gains access to and steals data.
The first path, Orpheus, begins with a document-based lure. The user receives a legitimate-looking Office document themed around geopolitical analysis. Inside the document is a link that leads to a compressed file containing what appears to be normal software. But behind the scenes, the attacker abuses trusted, signed programs to quietly load malicious code, allowing them to execute without triggering alarms. Once active, the attacker keeps a low profile. They run only a small number of basic checks to understand the network and then move laterally, quickly reaching a domain controller and obtaining high-level access. This is always a critical element of an attack, and a target that threat actors will go after early so they can gain high-level access and own the domain. To maintain communication, they establish a secondary command-and-control channel that blends in with normal developer traffic.
With control of Active Directory, the attacker extracts credential data and sets up a scheduled task to ensure continued access. From there, sensitive files are gathered from shared network locations, compressed, and quietly exfiltrated using common data transfer tools. All of these are APT tactics that we see groups using in order to try to stay under the radar and blend in. Often now, we see groups using living-off-the-land tactics, living-off-the-land binaries, and legitimate tools in order to work around detections that might occur for malware, etcetera.
The second path, known as Perseus, demonstrates a different entry technique. In this case, the user clicks a link that leads to a web page that builds the malicious installer directly in the browser, bypassing traditional network defenses. That installer deploys a well-known Mustang Panda backdoor, which displays a decoy document to the user. Persistence is established, files of interest are collected, and data is exfiltrated using standard utilities. Once completed, the malware cleans up after itself, removing evidence to delay investigation.
Together, these two scenarios show how Mustang Panda operates: slow, deliberate, and focused on long-term access and intelligence collection. They're not in a hurry. They don't have to get to that final target as quickly as possible. They have to remain undetected for as long as they can, or at least that's what they aim to do. Like many threat actors, they rely heavily on legitimate tools and trusted processes, making that activity difficult to distinguish from normal administrative behavior. Now, Paul, over to you to have a look at the results.
Great. Thank you, Rafe. Actually, before we move on to the results, just to point out that the Sophos X-Ops team have just published a very detailed, step-by-step write-up of those two attack scenarios on the Sophos News website.
So if you search for Sophos News and find our latest blog article from Sophos X-Ops, you can find out more about what Rafe has just described in terms of those two attack scenarios, and the Sophos X-Ops deep dive. That's really good. Yeah.
So, how did Sophos perform in this evaluation? Well, I'm just gonna ask you to hold for a moment, because I think it's really important to spend a minute on how the MITRE ratings work for this evaluation. Before we talk about numbers, how is MITRE actually assessing the performance of participating vendors? As I mentioned a moment ago, each of the adversary activities evaluated in these emulations is called a substep, and there were 90 of them in total across those two scenarios. Each of those substeps receives one of the ratings shown on this slide, which indicates the solution's ability to detect, analyze, and describe the adversary activity, with its output aligned to the language and structure of the MITRE ATT&CK framework. So I'm just gonna walk through each of these ratings so it's clear, when we talk about the results, what each of them means.
I'm gonna go from right to left. On the right-hand side, not assessed. As it suggests, that particular substep was not assessed for a particular vendor. Usually, that happens because a vendor has opted out of a portion of the evaluation, perhaps because their solution lacks coverage in that particular area, or it might be because of technical limitations. So any substeps that couldn't be assessed are just classed as not assessed.
The second rating is called none. Really consider this as a miss. The vendor's solution may have collected telemetry, and there may be some level of visibility in the solution, but the solution didn't generate an alert, so it failed to identify the activity as potentially suspicious or malicious.
So it didn't raise a detection that could be usable by security analysts.
Now the top three ratings, general, tactic, and technique, are where we really consider those to be detections. The solution actually detected the adversary activity and generated an alert for the security practitioner. A general detection is a detection, but with limited context. The solution generated an alert and identified the adversary activity as potentially suspicious or malicious, but it didn't associate it with a MITRE tactic or technique, so it's missing some of that context.
The second rating here is a tactic detection. This is considered to be a partial detection. The solution generated an alert and identified the adversary activity at the tactic level pertaining to that behavior. So the alert provided details on the execution, the impact, and the adversary behavior.
The highest-fidelity detections are rated as technique-level detections. This is where the solution generated an alert that identified the adversary activity at the ATT&CK technique or sub-technique level. The alert provides clear insight into the who, what, when, where, how, and why. And this is the highest possible score that MITRE will assign to any substep in the evaluation.
As I mentioned a moment ago, those top three classifications, general, tactic, and technique, are grouped under a definition of analytic coverage, which essentially measures each solution's ability to convert telemetry into actionable threat detections.
So, drum roll. How did Sophos do? Sophos achieved 100% detection coverage in the ATT&CK Enterprise 2025 evaluation, which is our strongest detection coverage performance to date. So we're absolutely thrilled with the result here. Now if we dig a little bit deeper into that, what does that mean?
It means that we successfully identified and raised alerts and detections for all 90 adversary activities, or substeps, across those two attack scenarios that Rafe described. So 100% detection coverage, zero misses. Not only that, though: for 86 out of those 90 substeps, we scored the highest possible technique-level rating, which I described a moment ago. And that means that Sophos XDR provided the highest-fidelity detections possible for almost every substep in this MITRE evaluation.
So let's break that down into the two scenarios that Rafe described. On the left-hand side, Scattered Spider: again, 100% detection coverage. We didn't miss any substeps in that portion of the evaluation, and we achieved that highest possible technique-level detection rating for 61 out of 62 substeps in that scenario. And in the second scenario, Mustang Panda: again, 100% detection coverage. We detected and raised alerts on all 28 out of 28 substeps in that scenario, and for 25 of those substeps, we raised that highest possible technique-level detection. So, overall, an extremely strong performance from Sophos in this evaluation.
Now Simon Reed, our chief research and scientific officer, summed up our results nicely. He said, achieving full detection coverage against both scenarios validates the accuracy and depth of Sophos' analytics. It demonstrates how Sophos' AI-native XDR platform converts complex telemetry into clear, actionable intelligence, helping security teams detect, understand, and stop advanced attacks with confidence. And I think Simon summed that up very nicely with that quote.
Now we mentioned earlier that this year's evaluation was more demanding than ever, and a big part of that was because, for the first time, the evaluation extended beyond endpoints and servers and this year included cloud infrastructure and AWS.
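The per-scenario figures quoted above can be checked against the overall totals with a short tally. The Python sketch below uses only the counts stated in the webinar (62 and 28 substeps, 61 and 25 technique-level detections); the `ScenarioResult` class and its field names are illustrative assumptions, not a MITRE or Sophos data format.

```python
from dataclasses import dataclass


@dataclass
class ScenarioResult:
    """Counts for one emulated scenario (illustrative structure only)."""

    name: str
    substeps: int         # substeps evaluated in the scenario
    detected: int         # substeps rated general, tactic, or technique
    technique_level: int  # substeps rated at the technique level

    @property
    def analytic_coverage(self) -> float:
        return self.detected / self.substeps

    @property
    def technique_coverage(self) -> float:
        return self.technique_level / self.substeps


# Figures as stated in the webinar.
results = [
    ScenarioResult("Scattered Spider", substeps=62, detected=62, technique_level=61),
    ScenarioResult("Mustang Panda", substeps=28, detected=28, technique_level=25),
]

total_substeps = sum(r.substeps for r in results)
total_detected = sum(r.detected for r in results)
total_technique = sum(r.technique_level for r in results)

for r in results:
    print(f"{r.name}: analytic {r.analytic_coverage:.1%}, technique {r.technique_coverage:.1%}")

print(f"Overall: {total_detected}/{total_substeps} detected, "
      f"{total_technique}/{total_substeps} technique level "
      f"({total_technique / total_substeps:.1%})")
```

The totals come out to 90 substeps, 90 detected, and 86 at the technique level, which matches the overall numbers given in the session.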
Now the Sophos XDR platform extends visibility across the entire IT environment, and that's thanks to an extensive ecosystem of technology integrations across endpoint, identity, email, backup, firewall, network, productivity tools, and cloud solutions. We include dozens of integrations out of the box with Sophos XDR, and that enables customers to plug in their existing solutions without needing to rip and replace. But, of course, we also provide a broad portfolio of Sophos-native solutions that, as you'd expect, also integrate seamlessly with the XDR platform.
So now let's take a look at how Sophos' performance compared to others in the evaluation. Just to stress and reiterate, there is no single way to interpret the results of these evaluations, and MITRE does not rank or rate participants. So this is Sophos' view of the results and how we performed. How we like to look at it is around detection quality. The quality of detections and alerts is key to giving analysts the insights they need to investigate and respond to threats quickly. So we find that one of the most valuable ways to interpret the results of ATT&CK Evaluations is to review the number of substeps that produced rich, detailed detections, so analytic coverage, along the x-axis of the chart we're showing you here, and compare that against those substeps that achieved the highest-fidelity, technique-level coverage.
As you can see on this visualization, once again, Sophos delivered an exceptional performance in this evaluation. Sophos detected all 90 substeps, provided analytic coverage alerts for all 90 substeps, and provided the highest-fidelity alerts for almost every substep, 86 out of 90 in the evaluation.
If you want to dig deeper into the results of this evaluation, we've created a range of materials to enable you to do so. The easiest way to do this is to simply go to sophos.com/mitre.
Here, you can read more about the evaluation. You can see those attack scenarios, and you can download our evaluation brief to read more. And as I mentioned a moment ago, we've also published a technical write-up of the evaluation from the Sophos X-Ops team, and you'll find this on the Sophos News blog.
So whilst we're thrilled with the superb results that Sophos managed to achieve this year, it's also really important to consider these results in a broader context. We'd encourage you, first, to look beyond these numbers. Think about whether the XDR tool evaluated will really meet the needs of your own team. For example, does the tool present the information in a way that your team would like? Are disparate events correlated automatically, or is that something you would need to do manually when using these tools? Can the XDR tool integrate with other technologies that you're using within your environment? And importantly, are you planning to use the tool by yourself, or will you have the support of an MDR, a managed detection and response partner? Sophos XDR is the technology that underpins the Sophos MDR service, which is now protecting over 35,000 organizations worldwide. It's the largest and most trusted MDR service worldwide.
It's also important to take a broader view of the MITRE ATT&CK Evaluation results alongside other reputable independent proof points. These include things like verified customer reviews and analyst evaluations. So, a few examples here of recent recognitions for Sophos XDR. Sophos is a Leader in the 2025 Gartner Magic Quadrant for Endpoint Protection Platforms for the sixteenth consecutive time this year. This evaluation includes not just endpoint protection, but also EDR and XDR capabilities. And, again, we are a Leader in this MQ and have been since 2007, in 16 consecutive reports.
We're also a Leader in the 2025 IDC MarketScape for worldwide extended detection and response software, recently published. Also published very recently are G2's Winter 2026 reports, where customers once again rated Sophos as a top security vendor. Sophos is ranked the number one overall solution in endpoint protection platforms, managed detection and response, firewall software, and extended detection and response, or XDR, which is what we're talking about today. Also based on verified customer reviews, Sophos is a 2025 Gartner Peer Insights Customers' Choice vendor for XDR. In the 2025 Voice of the Customer report from Gartner, Sophos XDR is the highest-rated and most-reviewed solution in that report. To access that report and others, simply visit sophos.com/why. There you can access analyst evaluations and independent testing results, among other resources.
This brings us to the end of our content. You've seen the results. We'd love you to see the solution that underpins them. Following this webinar, we'll be sending you an email which will include a number of links to videos that show Sophos XDR in action. If you can't wait, simply visit sophos.com/xdr to learn more about the solution today. And when you're ready, you can give the solution a try. You can activate a free thirty-day trial of Sophos XDR. There's no obligation. Just go to sophos.com/xdr, sign up for the free trial, and you'll have instant activation to test the solution in your own environment. And that concludes our content for today. So, Deviani, I'll pass back to you.
Awesome. Thank you so much, Paul. Thank you so much, Rafe. I hope we all learned something today and can take back some actionable insights. We will follow up this webinar with the recording by email, and also all the useful links that Paul just mentioned.
As a reminder, please fill out the survey that now appears on your screen. We'd love to hear your feedback, and you can also connect with our teams using this survey. With that, thank you so much for attending. Thank you to our presenters for the content today, and we'll see you in the next one. Thank you. Thank you.