As you can imagine, we look at a lot of phishing emails every day, which means we also analyze a large number of malicious files.
We use the open source Cuckoo sandbox as one of our methods for dynamic analysis of these malware samples. However, because Cuckoo sandbox reports can generate a large volume of data, we needed a programmatic way to purge old data from all Cuckoo data sources while keeping a set maximum number of days of sandbox reports available for real-time hunting and malware analysis. To that end, we created a method that lets us tailor data purging to our needs and timelines.
Figuring we are not the only ones who have run into this data volume issue, we decided to open up our Cuckoo Purge script and make it available for others to use. It can be found here: https://github.com/CofenseLabs/cuckoo-purge.
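To give a feel for the retention idea described above, here is a minimal Python sketch. It is not the code from the cuckoo-purge repository, and it covers only files on disk, not the other Cuckoo data sources. It assumes each analysis lives in its own subdirectory under a single root folder and that the directory's modification time is a fair proxy for the report's age. The function names, the `dry_run` flag, and the example path are all hypothetical.

```python
import shutil
import time
from pathlib import Path
from typing import List, Optional

SECONDS_PER_DAY = 86400


def find_expired(root: Path, max_age_days: int,
                 now: Optional[float] = None) -> List[Path]:
    """Return subdirectories of root last modified more than max_age_days ago."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    return sorted(
        p for p in root.iterdir()
        if p.is_dir() and p.stat().st_mtime < cutoff
    )


def purge(root: Path, max_age_days: int, dry_run: bool = True,
          now: Optional[float] = None) -> List[Path]:
    """List expired analysis directories; delete them only if dry_run is False."""
    expired = find_expired(root, max_age_days, now)
    if not dry_run:
        for path in expired:
            shutil.rmtree(path)  # removes the directory and everything in it
    return expired
```

Calling something like `purge(Path("/path/to/analyses"), max_age_days=30)` would only report what is older than 30 days. Defaulting to a dry run is a deliberate choice in this sketch, so nothing is deleted until `dry_run=False` is passed explicitly.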