This dataset captures detailed forensic evidence and system behavior generated from a simulated APT29 cyberattack scenario. The attack emulation was based on multiple Cyber Threat Intelligence (CTI) reports describing real-world tactics and techniques attributed to the APT29 threat actor group. The goal is to provide a comprehensive view of host-level and network-level activity across a full attack chain in a controlled cyber range.
Contents
1. Provenance Graphs
.dot files representing system-level events and causal dependencies.
- Two variants:
  - *-provenance-graph.dot: Linux-based systems.
  - *-sysmon-provenance-graph.dot: Windows-based systems.
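Because the Windows file name pattern also ends with the Linux pattern, a plain glob on *-provenance-graph.dot picks up both variants. The following is a minimal sketch of one way to tell them apart by file name alone; the function name and the example host names are hypothetical and not part of the dataset.

```python
from pathlib import PurePath

SYSMON_SUFFIX = "-sysmon-provenance-graph.dot"  # Windows variant
LINUX_SUFFIX = "-provenance-graph.dot"          # Linux variant


def classify_graph(path: str) -> str:
    """Return 'windows', 'linux', or 'unknown' based on the file name only."""
    name = PurePath(path).name
    # Test the Sysmon suffix first: it also ends with the Linux suffix,
    # so the order of these checks matters.
    if name.endswith(SYSMON_SUFFIX):
        return "windows"
    if name.endswith(LINUX_SUFFIX):
        return "linux"
    return "unknown"


# Hypothetical file names, used only to illustrate the naming scheme.
examples = [
    "hostA-provenance-graph.dot",
    "hostB-sysmon-provenance-graph.dot",
    "notes.txt",
]
labels = {name: classify_graph(name) for name in examples}
```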
2. System Logs
Linux Hosts
- audit.log from /var/log/audit/
- sysdig.scap (zipped in /var/log/sysdig.zip) from Sysdig monitoring
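As a starting point for working with audit.log, the sketch below splits the standard auditd record prefix (type=... msg=audit(epoch:serial):) from the rest of the line. It assumes the log uses the default raw auditd format; the sample line is invented for illustration and does not come from the dataset.

```python
import re
from datetime import datetime, timezone

# Default auditd record prefix: type=<TYPE> msg=audit(<epoch>.<ms>:<serial>):
AUDIT_HEADER = re.compile(
    r"^type=(?P<type>\S+) msg=audit\((?P<epoch>\d+\.\d+):(?P<serial>\d+)\):\s*(?P<body>.*)$"
)


def parse_audit_line(line: str):
    """Parse one raw audit.log line; return None if the prefix is not recognised."""
    match = AUDIT_HEADER.match(line.strip())
    if match is None:
        return None
    return {
        "type": match.group("type"),
        "time": datetime.fromtimestamp(float(match.group("epoch")), tz=timezone.utc),
        "serial": int(match.group("serial")),  # records of one event share a serial
        "body": match.group("body"),           # remaining key=value fields, left raw
    }


# Invented sample line (not taken from the dataset).
sample = 'type=SYSCALL msg=audit(1700000000.123:42): arch=c000003e syscall=59 success=yes exe="/usr/bin/bash"'
record = parse_audit_line(sample)
```

The body is deliberately left unparsed: values can be quoted or hex-encoded, so a full key=value parser would need more care than this sketch shows.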
Windows Hosts
- Microsoft-Windows-Sysmon_Operational.evtx from Sysmon logs
3. Attack Annotation
annotation-attack.csv: Semicolon-delimited file that maps events to attack steps.
- Includes fields such as timestamp, system, executable, and high-level activity label.
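Since the annotation file is semicolon-delimited rather than comma-delimited, the delimiter has to be set explicitly when loading it. Here is a minimal sketch using Python's csv module. It assumes the file starts with a header row; the column names and the sample row are hypothetical placeholders, so check the actual header before relying on them.

```python
import csv
import io
from collections import Counter


def load_annotations(fileobj):
    """Read a semicolon-delimited annotation file into a list of dicts."""
    return list(csv.DictReader(fileobj, delimiter=";"))


# Hypothetical header and row, standing in for annotation-attack.csv.
sample = io.StringIO(
    "timestamp;system;executable;activity\n"
    "2025-01-01T00:00:00;hostA;/usr/bin/curl;example-step\n"
)
rows = load_annotations(sample)

# Count annotated events per high-level activity label.
per_activity = Counter(row["activity"] for row in rows)
```

For the real file, replace the StringIO object with open("annotation-attack.csv", newline="", encoding="utf-8").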
4. Network Capture
.pcap files capturing network traffic from both Linux (tcpdump) and Windows (Pktmon).
5. Memory Dumps
- Raw memory images (*.elf) collected during the simulation are available upon request for in-depth analysis.
Cyber Threat Intelligence (CTI) Description
APT29 is a threat group that has been attributed to Russia's Foreign Intelligence Service (SVR). They have operated since at least 2008, often targeting government networks in Europe and NATO member countries, research institutes, and think tanks. APT29 reportedly compromised the Democratic National Committee starting in the summer of 2015. In April 2021, the US and UK governments attributed the SolarWinds supply chain compromise cyber operation to the SVR; public statements included citations to APT29, Cozy Bear, and The Dukes. Victims of this campaign included government, consulting, technology, telecom, and other organizations in North America, Europe, Asia, and the Middle East. Industry reporting referred to the actors involved in this campaign as UNC2452, NOBELIUM, StellarParticle, and Dark Halo. In July 2020, APT29 used WinRM via PowerShell to execute commands and payloads on remote hosts. APT29 used 7-Zip to decode its Raindrop malware. APT29 has acquired C2 domains, sometimes through resellers, and has registered algorithmically generated Twitter handles that are used for C2 by malware such as HAMMERTOSS. APT29 has used Rundll32.exe to execute payloads that call out to remote network infrastructure using the domain avsvmcloud.com. Observed malicious instances of SolarWinds.Orion.Core.BusinessLayer.dll have hashes such as e0b9eda35f01c1540134aba9195e7e6393286dde3e001fce36fb661cc346b91d and a58d02465e26bdd3a839fd90e4b317eece431d28cab203bbdde569e11247d9e2.
CTI Report: link
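The two hashes quoted in the description are 64 hexadecimal characters long, which is consistent with SHA-256. As a rough sketch under that assumption, a file can be checked against them as follows; the function names are illustrative, and nothing here implies that the dataset itself contains a matching file.

```python
import hashlib

# The two hashes quoted in the CTI description, assumed to be SHA-256 digests.
KNOWN_BAD_SHA256 = {
    "e0b9eda35f01c1540134aba9195e7e6393286dde3e001fce36fb661cc346b91d",
    "a58d02465e26bdd3a839fd90e4b317eece431d28cab203bbdde569e11247d9e2",
}


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_known_bad(path):
    """Return True if the file's digest matches one of the quoted hashes."""
    return sha256_of(path) in KNOWN_BAD_SHA256
```

Reading in chunks keeps memory use flat, which matters if the same helper is pointed at large artifacts such as memory images.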
License
This dataset is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0) license.
You are free to share, modify, and build upon the data for any purpose, including commercial use, as long as proper credit is given to the original authors.
Citation
If you use this dataset in your work, please cite the following publication:
@inproceedings{provcon25,
title = {From {Observations} to {Insights}: {Constructing} {Effective} {Cyberattack} {Provenance} {With} {PROVCON}},
language = {en},
booktitle = {Workshop on {SOC} {Operations} and {Construction} ({WOSOC}) 2025},
author = {Yusof, Anis and Li, Shaofei and Kawatra, Arshdeep Singh and
Li, Ding and Chang, Ee-Chien and Liang, Zhenkai},
year = {2025},
isbn = {9798991927604},
doi = {https://dx.doi.org/10.14722/wosoc.2025.23008},
}