Hi there, I built this library after reading some InfoSec SE posts about which sensitive files (and information) should be gitignored or left out of a git repo entirely.
How this library works: sniffgit starts from the root of your git working directory and checks whether any sensitive files (id_rsa, *.cert, etc.) are exposed, i.e. files that haven't been gitignored or files that shouldn't be in a repo at all.
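To make the file check concrete, here is a rough sketch of the idea, not sniffgit's actual code. The pattern list, the function names, and the `is_ignored` callback are all made up for illustration; a real version could back `is_ignored` with something like `git check-ignore`.

```python
import fnmatch
import os

# Hypothetical pattern list for illustration; sniffgit's real list may differ.
SENSITIVE_PATTERNS = ["id_rsa", "id_dsa", "*.cert", "*.pem", "*.key"]


def is_sensitive_name(path):
    """Return True if the file name matches one of the sensitive patterns."""
    name = os.path.basename(path)
    return any(fnmatch.fnmatch(name, pattern) for pattern in SENSITIVE_PATTERNS)


def find_exposed(root, is_ignored=lambda path: False):
    """Walk root and list sensitive-looking files that are not ignored.

    is_ignored is an assumed callback that reports whether a path is
    gitignored; by default nothing is treated as ignored.
    """
    exposed = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune .git in place so os.walk never descends into it.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            if is_sensitive_name(path) and not is_ignored(path):
                exposed.append(os.path.relpath(path, root))
    return sorted(exposed)
```

Calling `find_exposed(".")` from the repo root would return the relative paths of matching files. Matching on the base name only is a simplification, so a pattern like `*.cert` catches `conf/server.cert` as well.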
This library also checks text files for sensitive information, such as AWS_SECRET_ACCESS_KEY, email, password, etc. Some files and directories are not read at all, though (e.g. binary files, .git, yarn.lock).
Currently, the “sensitive info / line analysis” produces a lot of false positives on larger projects. The reason is that it only checks each line of a text file for keywords such as “password”, “API_KEY”, “email”, etc.
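A simplified sketch of that kind of per-line keyword check shows where the false positives come from. The keyword list and the `scan_lines` name are assumptions for illustration, not the library's real implementation.

```python
import re

# Hypothetical keyword list; the real tool's list may differ.
KEYWORDS = ["password", "api_key", "aws_secret_access_key", "email"]
KEYWORD_RE = re.compile("|".join(re.escape(k) for k in KEYWORDS), re.IGNORECASE)


def scan_lines(text):
    """Return (line_number, line) pairs for lines mentioning any keyword."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if KEYWORD_RE.search(line):
            hits.append((lineno, line.strip()))
    return hits


sample = (
    'AWS_SECRET_ACCESS_KEY = "abc123"\n'
    "# ask the user for a password\n"
    'print("hello")\n'
)
hits = scan_lines(sample)
```

Here `hits` contains lines 1 and 2. Line 1 is a real leak, while line 2 is only a comment that happens to contain the word "password". A match on the keyword alone cannot tell the two apart, which is why larger projects get noisy results.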
This is my first ever open-source project. Feedback is truly appreciated, particularly about OSS best practices :).
Interesting project! Perhaps you could set the exit code depending on whether results were found (using sys.exit or something like that) so it can be integrated into CI pipelines.
Thank you for the suggestion! I will add that feature today. I believe that the project will be more useful if it can be easily integrated into CI pipelines!
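For reference, one possible shape for that exit-code behaviour, as a sketch only. The `findings` list and both function names are hypothetical and not part of sniffgit.

```python
import sys


def exit_code_for(findings):
    """Return 1 when anything was found, 0 otherwise."""
    return 1 if findings else 0


def main(findings):
    """Print each finding to stderr, then exit with the matching status."""
    for finding in findings:
        print(finding, file=sys.stderr)
    sys.exit(exit_code_for(findings))
```

Most CI systems treat a non-zero exit status as a failed step, so a run that reports any finding would fail the build without extra glue.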
The following article was also a motivation for me to start the project, “Dev put AWS keys on Github. Then BAD THINGS happened”: https://www.theregister.co.uk/2015/01/06/dev_blunder_shows_g...