Why We Fail at Keeping Git Secrets

08 Sep 2019

Your secrets are showing!

Over several years, I’ve seen thousands of developers make the same set of mistakes, all resulting in the same outcome: company secrets or credentials in public, visible to everyone. This is not a rare problem. Seriously, ask your coworkers, I bet they’ve done one of these things. These mistakes follow a few common patterns:

Assuming everything is private
Assuming private things can be public
Blacklisting when you should be whitelisting
Not knowing how git works

Assuming everything is private

This happens when a developer pushes code to a project without realizing it is public. For instance, a few years ago, a research branch of a government agency published credentials for a PostgreSQL database to a public GitHub repo. Since I was scanning GitHub’s public feed, I got a notification for this. After investigating the repo, it became clear that no attempt had been made to hide secrets, credentials, or internal notes. This is a strong indicator that the repo was never meant to be public, and that the developer simply worked as if they were pushing code to a secure, private location. This seems to be more common in older or larger organizations that are used to operating in an environment like a private version control server or an SFTP server.

Assuming private things can be public

I see this mistake a lot. A developer assumes a file or folder is benign when it actually contains secrets or credentials. A good example of this is Firefox’s .mozilla folder. Right now there are a few hundred .mozilla folders on GitHub containing sensitive info. What most people fail to realize about these folders is that they contain two files, key3.db and logins.json, which are usually enough to recover a user’s entire list of passwords using a tool like firefox_decrypt (https://github.com/unode/firefox_decrypt). There are many more files like this, such as IDE configuration files. An easy way to prevent this mistake is to check your files for credentials before committing them, and to commit only the files you actually need. A file named “logins.json” is a pretty good sign that you shouldn’t be committing it, and a quick check before pushing will save a LOT of headache.
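A quick check like this can be automated. The following is a minimal sketch, not a complete scanner: the file names, directory names, and regular expressions are illustrative assumptions that you would extend for your own projects, and the function names are hypothetical.

```python
import re
from pathlib import PurePosixPath
from typing import List

# Hypothetical lists for illustration only; they are far from exhaustive.
RISKY_NAMES = {"logins.json", "key3.db", ".env", "id_rsa"}
RISKY_DIRS = {".mozilla", ".ssh", ".aws"}
CONTENT_PATTERNS = [
    re.compile(r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"),
    re.compile(r"(?i)\b(?:password|passwd|secret|api[_-]?key|token)\b\s*[:=]\s*\S+"),
]


def is_risky_path(path: str) -> bool:
    """Return True if the file name or any parent folder looks sensitive."""
    parts = PurePosixPath(path).parts
    if not parts:
        return False
    return parts[-1] in RISKY_NAMES or any(p in RISKY_DIRS for p in parts[:-1])


def find_secret_lines(text: str) -> List[int]:
    """Return 1-based line numbers that match a credential-like pattern."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if any(pattern.search(line) for pattern in CONTENT_PATTERNS)
    ]
```

One way to use this is to run both functions from a pre-commit hook over the paths reported by git diff --cached --name-only, and refuse the commit if anything is flagged. Expect false positives; the goal is to force a second look, not to replace one.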

Blacklisting when you should be whitelisting

In security, it’s often said that whitelisting is better than blacklisting. By allowing only the cases we know are acceptable, we automatically cover the edge cases we did not anticipate. For developers, this is not necessarily common knowledge. One of the most powerful tools for keeping secrets safe in git projects is the gitignore file. Unfortunately, nearly every post about using gitignore, and even the name “gitignore” itself, strongly suggests that you should be specifying which files to exclude, rather than specifying which files to keep.

Thankfully, there’s another way. Let’s say your directory structure looks like this:

/secrets
  /secret.json
/public
  /index.php
  /db_client
    /db_client.php

A standard gitignore file might look something like this:

secrets/
secret.json

However, this does not account for future work. Let’s say a second developer does work in the db_client folder, and adds a file called db_creds.json. That file will not be ignored by default. However, if the gitignore file looked like this:

*
# Allow .gitignore
!.gitignore
# Allow directories
!*/
# Allow any PHP file under db_client
!public/db_client/*.php
# Allow any PHP file under public
!public/**/*.php

Any non-PHP files added to db_client would be ignored. If specific files need to be included, they can be whitelisted manually in the gitignore. This creates an environment where developers need to think carefully about which files they include, and makes it more difficult to make mistakes.
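To see why db_creds.json ends up ignored, it helps to remember that git applies the patterns in order and lets the last matching pattern decide. The sketch below models that rule in Python. It is a simplification, not git’s real matcher: the regular expressions are hand-translated from the whitelist patterns above, it only handles file paths, and the names RULES and is_ignored are hypothetical.

```python
import re

# Hand-translated regex equivalents of the whitelist patterns (an
# approximation for illustration; git's matcher handles many more cases).
# Each rule is (pattern text, negated?, regex matched against a file path).
RULES = [
    ("*", False, r"(?:.*/)?[^/]*"),
    ("!.gitignore", True, r"(?:.*/)?\.gitignore"),
    # "!*/" only re-includes directories, so it is skipped for file paths.
    ("!public/db_client/*.php", True, r"public/db_client/[^/]*\.php"),
    ("!public/**/*.php", True, r"public/(?:.*/)?[^/]*\.php"),
]


def is_ignored(path: str) -> bool:
    """Apply the rules in order; the last rule that matches decides."""
    ignored = False
    for _pattern, negated, regex in RULES:
        if re.fullmatch(regex, path):
            ignored = not negated
    return ignored


if __name__ == "__main__":
    for path in [
        ".gitignore",
        "public/index.php",
        "public/db_client/db_client.php",
        "public/db_client/db_creds.json",
        "secrets/secret.json",
    ]:
        print(path, "ignored" if is_ignored(path) else "tracked")
```

The new db_creds.json only matches the leading * pattern, so it stays ignored until someone deliberately adds a rule for it.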

Not knowing how git works

This is more of a mistake in how we respond to leaked secrets than in leaking the secrets in the first place. Git is essentially a changelog of every insertion and deletion that occurs. Because of this, deleting a file, or deleting the secret from a file, does not erase it: the secret is still in the commit that added it, and it now also shows up in the diff of the commit that removed it. That makes it more likely, not less, that someone will see the secret you removed. Because this is counter-intuitive, it’s easy to make this mistake. To properly respond to leaked secrets, you should do the following:

Assume the secret is public, and has already been stolen. This means you should rotate the secret as quickly as possible (but don’t commit it back to the repo yet…)
Remove the secret from git history. This could be as simple as a git rebase (if the secret was just committed), or may require more complicated cleanup. For either case, I recommend reading GitHub’s guide on removing sensitive data here: https://help.github.com/en/articles/removing-sensitive-data-from-a-repository
Find an alternative way to store the secret. Whether that means ignoring the file, setting up your gitignore to whitelist, or pulling the secret from an environment variable, make sure you don’t make the same mistake again.
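For the last step, pulling the secret from an environment variable can be as small as the sketch below. The variable name DB_PASSWORD and the helper load_secret are assumptions made for illustration; use whatever naming your deployment already has.

```python
import os
from typing import Mapping


def load_secret(name: str, environ: Mapping[str, str] = os.environ) -> str:
    """Read a secret from the environment and fail loudly if it is missing."""
    value = environ.get(name)
    if not value:
        # Failing early beats silently falling back to a hard-coded default.
        raise RuntimeError(
            f"Set the {name} environment variable; do not hard-code it."
        )
    return value


if __name__ == "__main__":
    # DB_PASSWORD is a hypothetical variable name.
    try:
        password = load_secret("DB_PASSWORD")
        print("Loaded a secret of length", len(password))
    except RuntimeError as error:
        print(error)
```

With this approach the value lives in the shell, the CI system, or the deployment configuration, so there is nothing secret left in the working tree to commit by accident.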

I hope that developers can use the methods above to help reduce the number of secrets leaking. This is a problem that impacts thousands of companies and businesses across the world, and can have catastrophic consequences if not carefully considered. If you’re interested in some of the types of secrets that are leaking, there’s an excellent list that’s been compiled here: https://github.com/techgaun/github-dorks, along with a tool for checking repositories. Also, please let me know if you have any other ideas on how we can reduce the leaking of tokens on git. I’d love to hear them!