The title of this article probably sounds like a meme caption. In fact, it describes a real problem that the engineers at GitGuardian had to solve while designing their new HasMySecretLeaked service. Their goal was to help developers find out whether their private keys, cryptographic certificates, passwords, API credentials, or other secrets had been exposed in public GitHub repositories. How can someone search a large collection of secrets found in public GitHub repositories and their histories, and compare them against yours, without requiring you to hand over any sensitive information? This post explains how.

First, some perspective. If we set the mass of one bit equal to that of one electron, a literal ton of data would be around 121.9 quadrillion petabytes at standard Earth gravity, or 39.2 billion billion billion US dollars in MacBook Pro storage upgrades (more than all the money in the world). So when this article says that GitGuardian scanned a “ton” of GitHub public commit data, it is speaking figuratively rather than literally.
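The petabyte figure can be sanity-checked with a few lines of arithmetic. The sketch below assumes a metric ton (1,000 kg), the CODATA value for the electron mass, and binary petabytes (2^50 bytes). None of these assumptions is stated in the article, but under them the result lands close to the quoted number. The dollar figure is not reproduced, since it depends on a storage price the article does not give.

```python
# Assumptions (not from the article): metric ton, CODATA electron mass,
# and binary petabytes (pebibytes, 2**50 bytes).
ELECTRON_MASS_KG = 9.1093837015e-31
METRIC_TON_KG = 1000.0
BYTES_PER_PEBIBYTE = 2 ** 50


def ton_of_data_in_pebibytes() -> float:
    """Return how many pebibytes weigh one metric ton at one electron per bit."""
    bits = METRIC_TON_KG / ELECTRON_MASS_KG
    total_bytes = bits / 8
    return total_bytes / BYTES_PER_PEBIBYTE


quadrillions = ton_of_data_in_pebibytes() / 1e15
print(f"{quadrillions:.1f} quadrillion petabytes")
```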

Indeed, after scanning a “ton” of public commits and gists on GitHub, including their commit history, they discovered millions of secrets: passwords, private keys, API credentials, cryptographic certificates, and more. And “millions” is not a metaphor. In 2022, they found more than 10 million.

How could GitGuardian let developers and their employers find out whether their valid, current secrets were among those 10+ million without publishing the secrets themselves? Publishing them would make them easier for threat actors to locate and harvest, and would let a lot of genies out of a lot of bottles. The answer, in a nutshell, is fingerprinting.

After careful study and evaluation, they created a secret-fingerprinting technique that encrypts and hashes the secret; only a portion of the hash is then shared with GitGuardian. This lets them narrow the set of possible matches to a manageable number without knowing enough of the hash to reverse and decode it. To increase security further, they placed the tooling that hashes and encrypts the secret on the client side.
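The partial-hash idea can be illustrated with a minimal sketch. This is not GitGuardian’s actual protocol: the choice of SHA-256, the prefix length, and every function name below are assumptions made for illustration, and the encryption step mentioned above is left out. The “server” is simulated with an in-memory dictionary.

```python
import hashlib

# Hypothetical prefix length: how many hex characters of the hash are shared.
PREFIX_LEN = 5


def fingerprint(secret: str) -> str:
    """Hash the secret locally; the plain secret never leaves the client."""
    return hashlib.sha256(secret.encode("utf-8")).hexdigest()


def build_index(leaked_secrets: list[str]) -> dict[str, set[str]]:
    """Simulated server side: group full hashes of known leaks by prefix."""
    index: dict[str, set[str]] = {}
    for leaked in leaked_secrets:
        full_hash = fingerprint(leaked)
        index.setdefault(full_hash[:PREFIX_LEN], set()).add(full_hash)
    return index


def lookup(index: dict[str, set[str]], prefix: str) -> set[str]:
    """Simulated server query: it sees only the prefix, never the full hash."""
    return index.get(prefix, set())


def has_leaked(secret: str, index: dict[str, set[str]]) -> bool:
    """Client side: send the prefix, then compare full hashes locally."""
    full_hash = fingerprint(secret)
    candidates = lookup(index, full_hash[:PREFIX_LEN])
    return full_hash in candidates


index = build_index(["hunter2", "example-api-key-123"])
print(has_leaked("hunter2", index))
print(has_leaked("a-secret-that-was-never-committed", index))
```

The design point is that the prefix narrows the search to a small bucket of candidates while revealing too little for the server to recover the full hash, and the final comparison happens on the client.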

If you’re using the HasMySecretLeaked web interface, you can generate the hash locally by copying a Python script, running it, and pasting only the result into the browser. You can easily verify that its 21 lines of code send nothing outside the terminal session in which you run the script, and you never need to put the secret itself anywhere the browser could transmit it. If that’s still not enough, you can monitor what the web interface sends upstream by opening the developer tools (F12) in Chrome or another browser and selecting the “Network” panel.

If you’re using the open-source ggshield CLI, you can read its code to see exactly what happens when you run the hmsl command. Need even more reassurance? Use a traffic inspector such as Wireshark or Fiddler to see the data being sent.

The engineers at GitGuardian knew that even users who trusted them would be wary of typing a secret password or API key into a box on a website. For everyone’s safety and peace of mind, they decided to be as open and transparent as possible and to give users as much control over the process as possible. This extends to the ggshield documentation for the hmsl command, not just their promotional materials.

GitGuardian went above and beyond to ensure that users of the HasMySecretLeaked checker would not need to divulge their actual secrets in order to find out whether they had leaked. The effort has paid off: in the first few weeks after launch, over 9,000 secrets were checked.

It’s better to know that your secrets have been made public than to be left in the dark. Even if they haven’t been exploited yet, they probably will be at some point. With the HasMySecretLeaked online checker, you can run up to five checks a day for free, and you can check even more often with the ggshield CLI. Even if you’re not looking to check whether your own secrets have leaked, GitGuardian’s code and procedures are worth studying for ideas on how to let your customers work with sensitive information without actually disclosing it.