[linux-elitists] Fun with Git repository copying

Greg KH greg at kroah.com
Sat Apr 13 08:07:32 PDT 2013

On Sat, Apr 13, 2013 at 07:45:39AM -0700, Don Marti wrote:
> What happens when you're doing a copy of a Git
> repository that's in the process of being pushed to
> or garbage collected?
>   http://joeyh.name/blog/entry/difficulties_in_backing_up_live_git_repositories/
>   http://marc.info/?l=git&m=136422341014631&w=2
> Sometimes, bad things.

Sometimes?  It's more common than you might think, which is why the
kernel.org admin created grokmirror to handle mirroring of git repos.
Mirroring has the same problem as backing up or copying a repo on a
live system.

> Here's a hypothetical game.
> Let's say that programmer A has the job of
> implementing POSIX cp(1), but has decided to do it
> in a way that will pass the "cp" test suite but order
> the file copying to maximize the chances of breaking
> copies of Git repositories that are being changed
> during the copy.  (For example, "evil cp" might see
> if there are any subdirectories named
> "objects", copy their contents first, then pause,
> then copy the rest.)
> Programmer B has decided to extend Git to defend
> against "evil cp" so that the copy is usable, even if
> "evil cp" and a large push and repack happened at
> the same time.
> A has full access to the Git source code and mailing
> list.  B is aware of the existence of "evil cp"
> but not the details of what it does.
> Who wins?

B, because git doesn't use 'cp' but rather the syscalls directly, so the
user of the git repo itself will be just fine.  Who knows about the user
of the copied repo, though; an "evil" cp could just not copy all of the
files.

Again, don't just use rsync or cp on a live git repo.  You wouldn't do
that on a database, would you?
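
For illustration, here is a rough sketch of copying a repo through git
itself instead of file by file.  This is not grokmirror and not how
kernel.org does it; it only assumes a git binary on PATH, and the
helper names git() and mirror_backup() are made up for this example.
The idea is that git decides which objects and refs belong in the copy,
and fsck then checks that the copy is complete:

```python
import subprocess
from pathlib import Path


def git(*args, cwd=None):
    """Run a git command and return its stdout; raises if git exits non-zero."""
    result = subprocess.run(
        ["git", *args], cwd=cwd, check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


def mirror_backup(source, dest):
    """Copy a repository via git's own transport (hypothetical helper)."""
    if not Path(dest).exists():
        # First run: create a bare mirror holding every ref from the source.
        git("clone", "--mirror", str(source), str(dest))
    else:
        # Later runs: fetch what changed and drop refs deleted upstream.
        git("remote", "update", "--prune", cwd=dest)
    # Check object integrity and connectivity of the copy, not the source.
    git("fsck", cwd=dest)
    return dest
```

Calling mirror_backup("/srv/git/project.git", "/backup/project.git")
(both paths are placeholders) from cron would keep the copy current.
If fsck complains, the exception tells you the backup is bad before you
need it, which a plain cp never will.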

greg k-h
