Niles

Reputation: 141

Git: How to be sure untracked config files aren't silently deleted

My situation is, I suspect, pretty typical. I'm working on a (young) project with some others using git for version control. Our project is a web app which requires local configurations for certain paths and keys spread across a few different config files. We thought a good way to handle this was to make a template version of these config files and track it in the repository, but not track our individual config files (recommended, e.g., here and here). To make sure no one accidentally commits their config file, we added it to the gitignore list.

But we didn't realize this at the beginning of our project -- only after one person had started it and others joined. So one of the config files was tracked early in the commit history. Our solution: remove it from the index, of course!

But this creates a nasty gotcha! Here's a simplified scenario: You have a branch where the config file is tracked.

git init # new repository
echo 'file a' > a.txt
git add a.txt
git commit -m 'initial commit'

Then you realize this is a problem, so you create a new branch to fix it. On the new branch you remove the config file from the repository index (and add a template version, which you do want to track), then gitignore the original file.

git checkout -b testbranch
cp a.txt a.template.txt
echo 'a.txt' > .gitignore  # ignore a.txt
git add .gitignore
git add a.template.txt
git rm --cached a.txt
git commit -m 'make template file for a'
ls  # shows that a.txt and a.template.txt are still in working tree
git status  # shows that working directory is clean

And of course make a critical update to the config file.

echo 'super-critical config setting' >> a.txt

Then switch branches, merge, and BOOM!!

The config file is really gone, and the changes you made are not tracked on any branch.

git checkout master
ls # shows a.txt, not a.template.txt
git checkout testbranch
ls # a.txt is gone!!

git checkout master
git merge testbranch  # a.txt is gone forever!!

Having a.txt in the gitignore file masks the warning git would otherwise give. Once a file is removed from the index it becomes an untracked file, and git normally refuses to overwrite or delete untracked files when you switch branches. Ignored files, however, are treated as expendable, so they are overwritten without complaint. If you carry out the steps above, except for the ones gitignoring a.txt, you won't be allowed to switch away from testbranch without moving or deleting a.txt. If you move it to a different untracked file (a-copy.txt), checkout master and then checkout testbranch again, you'll see that a.txt is gone, just as you asked it to be, but a-copy.txt is still there.
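For comparison, here is a minimal sketch of the same scenario with the gitignore steps left out. The temporary directory, the user.name/user.email values, and the base/result variables are assumptions added so the snippet can run on its own; the git commands themselves are the ones used above.

```shell
# Sketch: same steps as above, but a.txt is never added to .gitignore.
repo=$(mktemp -d)                 # throwaway repository (hypothetical location)
cd "$repo"
git init -q
git config user.email "you@example.com"   # placeholder identity for the commits
git config user.name "Example"
echo 'file a' > a.txt
git add a.txt
git commit -q -m 'initial commit'
base=$(git rev-parse --abbrev-ref HEAD)   # master or main, depending on git defaults

git checkout -q -b testbranch
cp a.txt a.template.txt
git add a.template.txt
git rm -q --cached a.txt          # a.txt is now untracked, but NOT ignored
git commit -q -m 'make template file for a'
echo 'super-critical config setting' >> a.txt

# Switching back would have to overwrite the untracked a.txt,
# so git should abort instead of silently replacing it.
if git checkout "$base" 2>/dev/null; then
    result="switched"
else
    result="refused"
fi
echo "checkout $result"
cat a.txt
```

If the checkout is refused as expected, a.txt still contains the appended line, which is exactly the protection that the .gitignore entry takes away.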


That's the part I (might) understand. Here's what I don't understand: what else might cause trouble with this system? Since git doesn't track individual files, but chunks of content, is there some way the super-critical config settings could be lost even if a particular file(name) is never tracked in the repository (and in particular never deleted from the index)? Is there a way to be absolutely certain that untracked (gitignored) data in a repository is never silently deleted?

And, for the record, here are other options for dealing with local config that I've come across. The first three seem like hacks prone to forgetting/error, and the next two require some other file(s) to be configured locally and potentially lost. The last one seems like overkill, but maybe it's not. If you're sure that one of these is the best way to deal with config files, please explain why. If you know of something even better, that would be great!

  1. git stash
  2. git update-index --assume-unchanged
  3. separate branches for local settings, private to each separate developer
  4. git attribute filter driver (smudge/clean scripts)
  5. "deployment" script, again separate and private for each developer
  6. each developer maintains a separate repository for tracking their config files, completely independently of the main code repo

Upvotes: 3

Views: 1144

Answers (1)

VonC

Reputation: 1324937

The general rule of thumb, for sensitive information, is:

don't put sensitive information in git

No matter what policy you are following (a special branch for sensitive stuff, or some "git update-index --assume-unchanged" tactic), you always risk pushing something that you shouldn't.

Slaven Rezic mentions symlinks:

ln -s config.yaml.$username config.yaml

But that requires every user to have a proper config file, with their sensitive data in it.
If that file needs to evolve (new settings, changed structure), it is hard to propagate those changes to each user's own (symlinked) config file.

The other option is to use a content filter driver.


It will, on checkout:

  • read a template config file
  • fetch the sensitive data from a source outside the git repo (you define your own policy here)
  • generate a (private, as in "not versioned") config file, with the value placeholder replaced with the right data.
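The steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: the filter name "localconfig", the @API_KEY@ placeholder, the ~/.myapp-secrets file, and the smudge/clean function names are all hypothetical, and the sed substitution would break on values containing "|" or other characters special to sed.

```shell
# Hypothetical secrets store kept OUTSIDE the repository (one value per file here).
secrets_file="${SECRETS_FILE:-$HOME/.myapp-secrets}"

# smudge: git runs this on checkout. It reads the versioned template on
# stdin and writes the working-tree content with the placeholder filled in.
smudge() {
    api_key=$(cat "$secrets_file")
    sed "s|@API_KEY@|$api_key|g"
}

# clean: git runs this on add. It puts the placeholder back, so the real
# value is not what gets staged.
clean() {
    api_key=$(cat "$secrets_file")
    sed "s|$api_key|@API_KEY@|g"
}

# Wiring (assumed script paths; each function would live in its own script):
#   git config filter.localconfig.smudge '/path/to/smudge-script'
#   git config filter.localconfig.clean  '/path/to/clean-script'
#   echo 'config.yaml filter=localconfig' >> .gitattributes
```

The filter definition lives in each developer's local git config rather than in the repository, so everyone has to set it up once per clone; only the .gitattributes line is shared.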

Upvotes: 1
