07. 06. 2023 Federico Corona Red Team, SEC4U

Cracking the Code: Unveiling Data Breach Secrets through OSINT-driven Scripts

Welcome! Today’s post is dedicated to analyzing data breaches and evaluating their reliability. In an increasingly data-centric digital landscape, it’s crucial to understand the complexities of data breaches and to develop effective methods for determining the trustworthiness of the information they contain. In this post, we’ll explore a professional approach to data breach analysis using some advanced scripts.

A data breach is a security incident in which personal or sensitive information belonging to individuals or organizations is compromised and disclosed. Our custom scripts were developed specifically to handle and analyze data breaches downloaded from VirusTotal, a service that analyzes suspicious files and URLs with a wide range of antivirus engines.

By applying analysis algorithms to the data obtained from VirusTotal, our scripts conduct a thorough investigation to identify the distinguishing characteristics of reliable data breaches. Through this sophisticated methodology, we will gain a comprehensive understanding of the origin of compromised data and be able to detect any suspicious patterns or behaviors that may suggest the presence of an inauthentic data breach.

Get ready to explore the intricate workings of data breaches, and to acquire valuable tools and knowledge to protect yourself and your organization from the constant risk of data breaches. By utilizing VirusTotal and our advanced scripts, we’ll be able to obtain a reliable assessment of data breaches, providing a solid foundation for safeguarding sensitive information and mitigating digital threats.

Welcome to the world of data breach analysis.

Let’s begin with an explanation of the origin of the breaches under analysis. These breaches are artifacts obtained from VirusTotal and processed by an initial script called VirusTotal.py. The primary objective of this script is twofold:

  • Verify whether the breach contains credentials belonging to our clients
  • Check if the client’s data is already present in any of our existing breaches

If the breach contains client credentials and at least some of them are not already present in our existing breaches, the VirusTotal item is marked as True and passed to a second script for analysis. Otherwise, it is marked as False.
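The two checks can be sketched roughly as follows. This is not the actual VirusTotal.py code, which is not shown in this post: the function name `triage`, the `email:password` line format, and the rule that an item is kept only when at least one client credential is new (our reading of the later "Check if the breach is fake" step) are all assumptions made for illustration.

```python
from __future__ import annotations

from typing import Iterable, NamedTuple


class Finding(NamedTuple):
    email: str
    password: str
    is_new: bool  # True if the pair is not yet in the known set


def triage(lines: Iterable[str], client_domains: set[str],
           known: set[tuple[str, str]]) -> tuple[bool, list[Finding]]:
    """Return (marked, findings) for a breach given as 'email:password' lines."""
    findings = []
    for line in lines:
        email, sep, password = line.strip().partition(":")
        if not sep or "@" not in email:
            continue  # skip lines that do not match the assumed format
        email = email.lower()
        domain = email.rsplit("@", 1)[1]
        if domain in client_domains:
            findings.append(Finding(email, password, (email, password) not in known))
    # Assumed rule: keep the item only if at least one client credential is new.
    marked = any(f.is_new for f in findings)
    return marked, findings
```

The `is_new` flag plays the role of the green/red label described for the first script's output.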

The first script provides various pieces of information, including:

  • The sha256 hash of the VirusTotal file
  • The title of the file
  • The file’s format
  • An example line to assist in understanding the data format
  • A list of credentials associated with our clients’ accounts found within the breach, where each entry is labeled in green or red to indicate whether it is a new or a pre-existing data point
  • For data already present in our database, the place where it was found

Following the initial step, the VirusTotal item undergoes analysis by a second script, which focuses on assessing the breach’s reliability. Reliability, in this context, refers to the probability of the breach containing both authentic and unique data.

Upon executing the “dbchecker.py” script, the following information is displayed:

  • The sha256 hash of the VirusTotal item
  • The name of the file uploaded to VirusTotal
  • The file’s format as uploaded to VirusTotal
  • The publication date of the breach
  • The number of rows in the breach
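As a rough illustration, the hash, file name, format, and row count can all be derived from a local copy of the file. The helper name `summarize` is hypothetical, and the publication date is left out because, as described later, it comes from the forum rather than from the file itself.

```python
import hashlib
from pathlib import Path


def summarize(path: str) -> dict:
    """Collect basic facts about a downloaded breach file."""
    file_path = Path(path)
    data = file_path.read_bytes()
    return {
        "sha256": hashlib.sha256(data).hexdigest(),
        "name": file_path.name,
        "format": file_path.suffix.lstrip(".").lower(),
        # Count non-empty lines only; blank lines are not treated as rows.
        "rows": sum(1 for line in data.splitlines() if line.strip()),
    }
```

Reading the whole file into memory keeps the sketch short; for very large combo lists a chunked read would be a better fit.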

The script is divided into six sections, but only the relevant sections are shown based on the file’s characteristics:

  1. Analyze file name and format: identify a possible subject behind the publication of the breach, through an OSINT analysis of the file name uploaded to VirusTotal
  2. Check user table: identify a possible origin of the DB (the first rows are usually filled with entries created to test the infrastructure when the table was set up)
  3. Analyze the number of rows and domains: identify whether this file is a combo list or not based on the total number of lines in the file, and check the most frequent domains in order to retrieve a possible target of the breach
  4. Understand the hash format: try to understand the hash format of the passwords in the breach, so as to understand the severity of the data exposure
  5. Check if breach is fake: identify how reliable this breach is, with a value calculated during code execution
  6. Set Acknowledge: decide whether the breach is reliable and communicate this to the DB
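For the hash format step, a common heuristic is to look at the length, alphabet, and prefix of a sample password value. The sketch below is only an illustration of that idea, not the logic used in dbchecker.py; the function name `guess_hash_format` and the short list of formats are assumptions, and the labels say "like" because length alone cannot prove which algorithm was used.

```python
import re

# Ordered (pattern, label) pairs for a few widely used formats.
# This list is illustrative, not the script's own.
_PATTERNS = [
    (re.compile(r"^\$2[aby]\$\d{2}\$[./A-Za-z0-9]{53}$"), "bcrypt"),
    (re.compile(r"^[0-9a-fA-F]{32}$"), "MD5-like (32 hex chars)"),
    (re.compile(r"^[0-9a-fA-F]{40}$"), "SHA-1-like (40 hex chars)"),
    (re.compile(r"^[0-9a-fA-F]{64}$"), "SHA-256-like (64 hex chars)"),
]


def guess_hash_format(value: str) -> str:
    """Return a best-effort label for a password field taken from a breach."""
    for pattern, label in _PATTERNS:
        if pattern.match(value):
            return label
    return "plaintext or unknown"
```

A fast unsalted digest is far easier to crack than bcrypt, which is why the format matters for judging the severity of the exposure.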

In the example used in this post, the file is in .txt format. As a result, point 2 (Check user table) will not be shown, and point 4 (Understand the hash format) will also be omitted since the passwords in the breach are stored in plaintext.
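The selection of sections could look something like the sketch below. The function name `sections_to_run` and the set of formats that trigger the user table check are assumptions; the post only tells us that a .txt file with plaintext passwords skips points 2 and 4.

```python
from __future__ import annotations

# Assumption: only database-style dumps carry a user table worth inspecting.
TABLE_FORMATS = {"sql"}


def sections_to_run(file_format: str, passwords_hashed: bool) -> list[str]:
    """Return the section titles that apply to a given breach file."""
    sections = ["Analyze file name and format"]
    if file_format.lower().lstrip(".") in TABLE_FORMATS:
        sections.append("Check user table")
    sections.append("Analyze the number of rows and domains")
    if passwords_hashed:
        sections.append("Understand the hash format")
    sections += ["Check if breach is fake", "Set Acknowledge"]
    return sections
```

For the example in this post, `sections_to_run(".txt", passwords_hashed=False)` yields the four sections that are walked through in the rest of the article.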

Subsequently, screenshots with accompanying explanations will be presented to illustrate the analysis of the mentioned breach using the dbchecker.py script.

[1] Analyze file name and format:

This step is of utmost importance in analyzing the breach as it focuses on reconstructing the breach’s publication prior to its upload to VirusTotal. It also aims to uncover the identity of the uploader.

During the execution of step 1, the script generates pre-compiled links that are invaluable for conducting thorough OSINT (Open Source Intelligence) investigations:
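The script's actual link templates are not shown in this post. As a minimal sketch of the idea, pre-compiled links can be built by URL-encoding the file title into search queries, optionally restricted to specific sites. The function name `build_search_links` and the use of Google search URLs are assumptions made for illustration.

```python
from __future__ import annotations

from urllib.parse import quote_plus

SEARCH_URL = "https://www.google.com/search?q="


def build_search_links(file_name: str, sites: list[str]) -> list[str]:
    """Build exact-phrase search links for a breach title."""
    title = file_name.rsplit(".", 1)[0]  # drop the extension, if any
    phrase = f'"{title}"'                # quotes request an exact-phrase match
    links = [SEARCH_URL + quote_plus(phrase)]
    for site in sites:
        # One extra link per site, using the "site:" search operator.
        links.append(SEARCH_URL + quote_plus(f"{phrase} site:{site}"))
    return links
```

Encoding the query with `quote_plus` keeps spaces, quotes, and colons in the title from breaking the URL.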

In this case, by using the URL generated by the script, we identified the corresponding page on craxpro[.]io. The user “yxxngstxr” had uploaded a data breach with the same title as the VirusTotal file.

Along with the threat actor’s nickname obtained from the forum, we also record the precise publication date of the breach, which is likewise sourced from the forum.

The following step will output a URL that helps to identify any Telegram groups where “yxxngstxr” may be posting additional breaches as part of his activities.

Yxxngstxr doesn’t post anything on Telegram (at least not under this username), but we have managed to track down a Telegram channel where this breach has been re-posted.

We therefore enter the name of the Telegram group when the script asks for it. The script also checks whether this group is already being monitored through the deepdarkCTI project on GitHub [https://github.com/fastfire/deepdarkCTI].
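How the script performs this lookup is not shown here. One simple way to do it, assuming the repository's Telegram listing has already been downloaded as text and contains t.me links, is to search that text for the channel handle. The function name `is_monitored` and the listing layout are assumptions.

```python
import re


def is_monitored(channel: str, listing_text: str) -> bool:
    """Return True if a t.me link for `channel` appears in `listing_text`."""
    handle = channel.strip().lstrip("@")
    if not handle:
        return False
    # Match the handle as a whole name, so "example" does not match
    # "examplechannel".
    pattern = rf"t\.me/{re.escape(handle)}(?![A-Za-z0-9_])"
    return re.search(pattern, listing_text, flags=re.IGNORECASE) is not None
```

Keeping the download separate from the matching makes the check easy to run against a cached copy of the listing.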

In addition to checking on deepdarkCTI, the script provides us with other useful links to investigate “yxxngstxr”. Meanwhile Maigret, a powerful OSINT tool that has been imported and adapted into the script, loads the list of services where the user “yxxngstxr” is registered with this nickname. Once the loading is complete, we will be prompted to choose the format of the result (let’s select “advanced” for a more in-depth investigation).

The screenshot above shows only part of the output, which is too long to display in full.

Upon thorough examination of the output, we successfully uncovered the true identity of Yxxngstxr by examining his Steam accounts. Steam is a renowned gaming platform widely recognized for its role as a game launcher and its extensive collection of games, multiplayer features, and community engagement.

By correlating the nickname with the real name, we discovered a Facebook account that is likely associated with Yxxngstxr. This account bears the name XXXXXXX and prominently displays Itachi Uchiha, a fictional character, as the background image. It is noteworthy that Yxxngstxr also uses an image of Madara Uchiha, a character closely linked to Itachi Uchiha, as the background on craxpro[.]io.

On Facebook, Yxxngstxr adopts the nickname “Youngster,” suggesting that Yxxngstxr is a modified version used for illicit activities, while “Youngster” serves as the everyday alias.

Here is a list of other breaches that have been published by Yxxngstxr on craxpro[.]io:

Further investigations have revealed that Yxxngstxr is involved with other illegal services, such as underground markets like Odin, and more. These services are often used for illicit activities, including the sale and trade of stolen data, hacking tools, and other illegal goods and services.

By noting the password he uses and searching for accounts with the username “Youngster” (a simple Ctrl+F is enough), we were able to compile a list of services he accessed with the same password he used for “Yxxngstxr”.

Having acquired ample information, it’s evident that Yxxngstxr is no novice in the realm of cybercrime. Consequently, we can now proceed to step 2.

[2] Analyze the number of rows and domains:

The output consists of two rankings: the first includes all the domains of the email addresses contained in the breach, while the second includes the same domains except those of well-known providers such as gmail, yahoo, outlook, etc. The purpose of the second ranking is to provide a more focused analysis of the breach’s target.
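The two rankings can be reproduced with a simple counter. This is a sketch under assumptions, not the code from dbchecker.py: the function name `rank_domains`, the `email:password` line format, and the short provider list are all illustrative.

```python
from collections import Counter

# Illustrative, not exhaustive: domains of well-known mail providers.
KNOWN_PROVIDERS = {"gmail.com", "yahoo.com", "outlook.com", "hotmail.com"}


def rank_domains(lines, top=10):
    """Return (all_domains, without_providers) as lists of (domain, count)."""
    counts = Counter()
    for line in lines:
        email = line.strip().split(":", 1)[0]
        if "@" in email:
            counts[email.rsplit("@", 1)[1].lower()] += 1
    filtered = Counter(
        {domain: n for domain, n in counts.items() if domain not in KNOWN_PROVIDERS}
    )
    return counts.most_common(top), filtered.most_common(top)
```

Dropping the big providers from the second ranking lets smaller, country- or company-specific domains rise to the top, which is what hints at the target of the breach.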

Based on the domain analysis, it is clear that this breach comprises a collection of Spanish domains. In this case the title already said so, but that will not always be true.

We can now move on to the third step.

[3] Check if the breach is fake:

This step aims to verify whether the user accounts of interest were already included in other breaches that we have in our database. If some of the accounts are found to already be present in our database, it suggests a higher level of reliability.

Obviously, in a situation where there is only one user account associated with our clients in a breach, it is impossible for that account to already exist in other breaches we have acquired. Otherwise, we would not have imported the VirusTotal item, as previously explained.
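The post does not say how the reliability value is calculated. One plausible ingredient, shown here purely as an assumption, is the share of client accounts in the new breach that already appear in breaches we hold. The function name `overlap_score` is hypothetical.

```python
from typing import Iterable


def overlap_score(client_accounts: Iterable[str], known_accounts: Iterable[str]) -> float:
    """Fraction of client accounts in this breach that we have seen before."""
    accounts = {a.lower() for a in client_accounts}
    if not accounts:
        return 0.0
    known = {a.lower() for a in known_accounts}
    return len(accounts & known) / len(accounts)
```

With a single client account the score is necessarily 0.0, which matches the observation above: that account cannot already be known, or the item would not have been imported.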

We can now proceed to the final step of the analysis.

[0] Set Acknowledge:

Based on the analysis conducted earlier, we conclude that the breach is reliable. It’s crucial to inform the client about the compromised account so that immediate action can be taken to secure their company’s services. Promptly asking the affected employee to change their password is a necessary step to mitigate the risks associated with the breach.

“There are only two types of organizations: Those that have been hacked and those that don’t know it yet!”

John Chambers

