10. 01. 2023 Francesco Pavanello Blue Team, SEC4U

Spam Trap Box – A Powerful Method to Detect Phishing Attempts

It’s more and more common to receive emails asking for credentials. They usually say that there’s some kind of issue that can only be solved by accessing the involved service using the link inside the message text. In most cases these emails are malicious, intended to steal users’ or employees’ credentials and gain access to personal or corporate areas. 

This scenario is commonly known as phishing, and it’s now among the most common causes of corporate data breaches, according to the IBM report “Cost of a Data Breach Full Report 2022”. The attacker uses social engineering techniques to exploit human vulnerabilities like fear, concern or carelessness to obtain what would otherwise be difficult to achieve.

Although an expert can easily recognize such attempts, automating their detection is harder, because attackers have various techniques for evading systematic checks. Nevertheless, we, the SEC4U team at Würth Phoenix, are working to improve our cyber defenses against any possible threat.

We’re currently developing a program that can analyze all email delivered to a dedicated email server that deliberately applies no filters to incoming traffic, which is why we call it a spam-trap-box. It’s configured with accounts registered for domains owned by failed companies that used to operate in the same industry as Würth Phoenix clients. This way it’s more likely we’ll be able to analyze traffic similar to what you’d see in a real-world scenario.

The first phase consisted of searching for domains that once belonged to a real company that is now either bankrupt or has changed its name, buying the old domain name, and using the email addresses previously registered under it to configure the spam-trap-box. To find candidate domains, we scraped the FALLCO portal, where all the documents and information about judicial liquidations and corporate bankruptcies are stored. Of course, the more email addresses a domain has connected to it, the more worthwhile it is to acquire. To verify which addresses existed, we used Phonebook.cz, the search engine by Intelligence X, and our own collection of data breaches.

We implemented two approaches to analyze the data extracted from the emails. One consists of comparing header fields with each other, or looking at the value of one field by itself, in order to detect inconsistencies, suspicious information or structures that are not compliant with the RFC specifications. The other uses OSINT sources to compare what we find in the email itself with existing indicators of compromise, and to create new indicators providing additional information or evidence.

We extract most of the interesting data from a message using meioc (Mail Extractor IoC), which is an open source project published on GitHub by Andrea Draghetti, an Italian cyber security researcher especially known for his work on phishing.

Calibrating how much weight each check should carry, based on the final outcomes, requires a lot of time and attention, since the aim is to limit false positives as much as possible. For now we’ve sketched out a temporary scoring system that still needs to be reviewed.

The first check is performed on the display name in the “FROM” email header field. RFC 5322 only requires this header to contain a list of one or more mailboxes and makes the display name optional, but we still treat its presence as slightly suspicious, because email clients show the name prominently, often in place of the email address itself. The verification therefore checks whether the name is contained in the user name (the part before the @) of any email address in this header field. If it is, the name is considered acceptable. Otherwise, the score representing how malicious the email is considered is increased.
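The display-name comparison described above could be sketched roughly as follows. This is a minimal illustration, not our production code: the function name and the normalization step (lower-casing and dropping non-alphanumeric characters, so that "John Doe" matches "john.doe") are assumptions made for the example.

```python
from email.utils import getaddresses


def _alnum(text: str) -> str:
    """Lower-case the text and keep only letters and digits."""
    return "".join(ch for ch in text.lower() if ch.isalnum())


def display_name_mismatches(from_header: str) -> list[str]:
    """Return the display names that do not appear in any user name
    (local part) of the addresses listed in the From header."""
    pairs = getaddresses([from_header])
    local_parts = [_alnum(addr.split("@", 1)[0]) for _, addr in pairs if addr]
    suspicious = []
    for name, _ in pairs:
        if not name:
            continue  # no display name, nothing to compare
        # Assumed normalization: compare on alphanumerics only.
        if not any(_alnum(name) in local for local in local_parts):
            suspicious.append(name)
    return suspicious
```

A non-empty result would then raise the score by whatever weight is assigned to this check.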

Then the “SENDER” header field value is taken into consideration. This header is often not included in an email, since usually the address specified in the “FROM” header is also the real sender of the email, so adding the “SENDER” header as well would be redundant. If it is present, the check verifies whether its value complies with the RFC 5322 specification. If it doesn’t, the maliciousness score is raised, but only slightly. On its own this check is of little use for identifying phishing emails, but its contribution grows when it’s considered along with other factors.

The moment at which the email was received is also a factor to be evaluated. An attacker wants to contact victims when they are most vulnerable, which for phishing campaigns that target a company means when they are isolated and not, for example, in the office. Thus email messages can be considered suspicious if received during the weekend or outside office hours. When this is the case, we also increase the score.
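As a rough sketch of this timing check, assuming the timestamp is available as an RFC 5322 date string (for example from a Received or Date header). The office hours and the fixed UTC+1 offset are illustrative assumptions; a real deployment would probably use a proper time zone database to handle daylight saving time.

```python
from datetime import time, timedelta, timezone
from email.utils import parsedate_to_datetime

# Assumed values for the example, not taken from the actual tool.
OFFICE_START = time(8, 0)
OFFICE_END = time(18, 0)
LOCAL_TZ = timezone(timedelta(hours=1))  # fixed offset, ignores DST


def received_outside_office_hours(date_header: str) -> bool:
    """Return True if the timestamp falls on a weekend or outside
    the assumed office hours, in the assumed local time zone."""
    received = parsedate_to_datetime(date_header)
    if received.tzinfo is None:
        # The header carried no usable offset: treat it as UTC.
        received = received.replace(tzinfo=timezone.utc)
    local = received.astimezone(LOCAL_TZ)
    if local.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        return True
    return not (OFFICE_START <= local.time() < OFFICE_END)
```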

The last check that does not involve any OSINT sources is the one performed against spoofing. The program evaluates the DMARC, DKIM and SPF results for the message, giving more importance to the DMARC evaluation, since it is the strongest check and builds on the DKIM and SPF verification. In any event, each failure increases the score, although by different amounts.
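One way to picture the weighting is the sketch below, which reads the verdicts from an Authentication-Results header added by the receiving server. The weights are hypothetical, and treating a missing or non-"pass" verdict as a failure is an assumption of this example. Only the first verdict per method is considered.

```python
import re

# Hypothetical weights: DMARC counts more than DKIM or SPF alone.
AUTH_WEIGHTS = {"dmarc": 3, "dkim": 1, "spf": 1}


def auth_failure_score(auth_results: str) -> int:
    """Sum the weights of every method whose verdict is not 'pass'."""
    score = 0
    for method, weight in AUTH_WEIGHTS.items():
        match = re.search(rf"\b{method}=(\w+)", auth_results, re.IGNORECASE)
        # Assumption: a missing verdict is treated like a failure.
        verdict = match.group(1).lower() if match else "none"
        if verdict != "pass":
            score += weight
    return score
```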

The first check performed using the information gathered by OpenCTI (the platform integrated in our SOC that collects threat intelligence insights) is on the sender IP address. It verifies that the IP address has not already been reported in any publicly available blacklist indexed in OpenCTI. The main sources are AlienVault and AbuseIPDB. In addition, the VirusTotal API is called to check whether the IP has been flagged there. Any match influences the score.

Subsequently, all email addresses in the “FROM”, “SENDER”, “REPLY TO” and “RETURN PATH” email headers are taken into consideration. If an address’s domain does not match the company’s, its user name is first compared to the list of employees found by SATAYO, since a match may suggest an impersonation attempt. In the same case, regardless of the result of the user name verification, the domain is searched among the ones indexed by OpenCTI and SATAYO. The former collects domains that have previously been blacklisted by other publicly available sources; the primary source for this list is Phishing Army. The latter instead stores domains that are similar to the monitored one, for example because of typosquatting. As before, any match raises the score.
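The logic of these address checks can be sketched as below. The three sets are stand-ins for the data that SATAYO and OpenCTI provide in our setup; their names, the function name and the finding labels are all hypothetical.

```python
def check_sender_address(
    address: str,
    company_domain: str,
    employee_usernames: set[str],   # stand-in for the SATAYO employee list
    blacklisted_domains: set[str],  # stand-in for domains indexed by OpenCTI
    lookalike_domains: set[str],    # stand-in for similar domains from SATAYO
) -> list[str]:
    """Return the findings raised by one address taken from the
    From, Sender, Reply-To or Return-Path header."""
    user, _, domain = address.lower().rpartition("@")
    findings: list[str] = []
    if domain == company_domain.lower():
        return findings  # these checks only apply to external domains
    if user in employee_usernames:
        findings.append("impersonation")
    if domain in blacklisted_domains:
        findings.append("blacklisted_domain")
    if domain in lookalike_domains:
        findings.append("lookalike_domain")
    return findings
```

Each finding would then contribute to the score with its own weight.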

The value of the “SUBJECT” header is analyzed using a list of words that are often included in that header when the email is a phishing attempt. Each word is searched for inside the header value, which is commonly a short sentence. In a phishing scenario, the sentence invites the reader to open the email and download the attachment or click on the link written in the message, usually so as not to miss an opportunity or to check their bank account. Again, in case of any match, the score is increased.
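A minimal sketch of the subject check follows. The word list is a made-up sample, not the list we actually use, and matching on whole words rather than substrings is a choice made for this example to reduce accidental matches. The header is first decoded in case it uses RFC 2047 encoded words.

```python
import re
from email.header import decode_header, make_header

# Hypothetical sample; the real list is longer and curated.
PHISHING_WORDS = {"urgent", "verify", "password", "invoice", "suspended"}


def subject_matches(subject: str) -> set[str]:
    """Return the suspicious words found in the Subject header value."""
    decoded = str(make_header(decode_header(subject)))
    tokens = set(re.findall(r"[a-z]+", decoded.lower()))
    return tokens & PHISHING_WORDS
```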

Last but not least, any attachment is validated using VirusTotal. Even though the spam-trap-box is built using domains that no longer belong to real businesses, an attachment might still be legitimate and contain sensitive information, so it’s good practice to submit only the file’s hash to the platform rather than uploading the whole file. Indeed, any user who buys a VirusTotal subscription can access any document uploaded. Using the hash, it’s possible to check if it matches that of any malware known by the platform. As in the cases above, a positive result affects the score.
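Computing the hashes locally might look like the sketch below. The choice of SHA-256 and the function name are assumptions of this example; the resulting digests can then be looked up on VirusTotal without the file content ever leaving our infrastructure.

```python
import hashlib
from email import message_from_bytes, policy


def attachment_hashes(raw_email: bytes) -> dict[str, str]:
    """Map each attachment file name to the SHA-256 digest of its content."""
    msg = message_from_bytes(raw_email, policy=policy.default)
    hashes: dict[str, str] = {}
    for part in msg.iter_attachments():
        # decode=True undoes the transfer encoding (e.g. base64).
        payload = part.get_payload(decode=True) or b""
        name = part.get_filename() or "unnamed"
        hashes[name] = hashlib.sha256(payload).hexdigest()
    return hashes
```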

Once all checks have been completed, the score is evaluated. If it’s higher than a threshold calculated on the basis of the fields that it was possible to verify, the email is considered malicious and an IoC is created. The IoC is added to a JSON file in which each element is a different IoC that reports the IP address, the domain of the sender email address, the value of the “SUBJECT” header field, and the hash of any attachment. OpenCTI then retrieves this JSON through a connector in order to ingest the new evidence. Because the findings are relevant to our clients’ businesses, the SOC will have more complete details to use while analyzing their email traffic, and can block any further emails intended for Würth Phoenix customers that have that IP or domain in the header fields reporting the sender information.
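To make the final step more concrete, here is a rough sketch of a threshold that depends on the checks that could actually be run, followed by the IoC export. The 50% ratio, the JSON key names and all function names are hypothetical; the article’s scoring system is still temporary and the real format may differ.

```python
import json
from pathlib import Path


def is_malicious(results: dict[str, tuple[int, int]], ratio: float = 0.5) -> bool:
    """results maps a check name to (points scored, maximum points).
    Checks that could not be run are simply left out, so the threshold
    scales with the fields that were actually verified."""
    score = sum(scored for scored, _ in results.values())
    max_score = sum(maximum for _, maximum in results.values())
    return max_score > 0 and score > ratio * max_score


def build_ioc(ip: str, domain: str, subject: str, hashes: list[str]) -> dict:
    """Collect the fields reported for each IoC (key names are assumed)."""
    return {"ip": ip, "domain": domain, "subject": subject,
            "attachment_hashes": hashes}


def append_ioc(path: Path, ioc: dict) -> None:
    """Append one IoC to the JSON list stored at path."""
    iocs = json.loads(path.read_text()) if path.exists() else []
    iocs.append(ioc)
    path.write_text(json.dumps(iocs, indent=2))
```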

These Solutions are Engineered by Humans

Did you learn from this article? Perhaps you’re already familiar with some of the techniques above? If you find security issues interesting, maybe you could start in a cybersecurity or similar position here at Würth Phoenix.

Francesco Pavanello

Hi, I'm Francesco, and I'm currently working as a technical consultant at Würth Phoenix. Here I mainly develop the Cyber Threat Intelligence platform SATAYO, my "little child", even if it's not so little anymore, but I also analyze the evidence found and help the customers to understand and mitigate them.

