Share a Development Machine

One of the requirements is to reduce costs by not duplicating original media or other large data needlessly. With multiple users working on the same data, we want as few copies of that data as possible spread among users and repositories. To save space, DVC allows us to set up a shared cache [2][5].

All users on the machine can point their repositories to the shared cache.
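As a minimal sketch of what this can look like, assume a hypothetical cache directory `/home/shared/dvc-cache` that every user on the machine can read and write. Running `dvc cache dir /home/shared/dvc-cache` inside a repository records the location in that repository's `.dvc/config`, and `dvc config cache.shared group` asks DVC to create new cache files with group write permissions. The resulting section would look roughly like this:

```ini
[cache]
    # Hypothetical path; use a directory all users of the machine can access.
    dir = /home/shared/dvc-cache
    # Create new cache files and directories with group write permissions.
    shared = group
```

Each repository on the machine needs the same `dir` value for the data to be stored only once.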

But how does this help to save space? Instead of having copies of the same data in the local repository, the shared cache, and all the other repositories on the machine, DVC can use links. Links are a feature of operating systems.

If we have a file, like an image, then we can create a link to that file. The link looks just like another file on the system but doesn't contain a second copy of the data. It only refers to the actual file somewhere else on the system, like a shortcut. There are several types of links, such as reflinks, symlinks, and hardlinks, and each has different properties. DVC will try to use reflinks by default, but they're not available on all computers because they depend on file system support. If reflinks aren't supported, DVC falls back to creating copies.
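To make the difference between a link and a copy concrete, here is a small sketch that uses only Python's standard library, not DVC itself. The file names and the `demo_links` helper are hypothetical. It creates a stand-in data file, then a hardlink, a symlink, and a copy, and checks which of them still refer to the original file. Reflinks are left out because the standard library has no portable call for them, and on Windows creating symlinks may require extra privileges.

```python
import os
import shutil
import tempfile


def demo_links(workdir):
    """Create a data file plus a hardlink, a symlink, and a copy of it.

    Returns a dict that says, for each variant, whether it refers to the
    same underlying file as the original.
    """
    original = os.path.join(workdir, "image.bin")
    with open(original, "wb") as f:
        f.write(b"\0" * 1024 * 1024)  # 1 MiB stand-in for a large media file

    hardlink = os.path.join(workdir, "hardlink.bin")
    symlink = os.path.join(workdir, "symlink.bin")
    copy = os.path.join(workdir, "copy.bin")

    os.link(original, hardlink)     # second directory entry for the same data
    os.symlink(original, symlink)   # small file that stores the target path
    shutil.copy(original, copy)     # independent duplicate of the data

    variants = [("hardlink", hardlink), ("symlink", symlink), ("copy", copy)]
    # samefile() follows symlinks and compares the underlying file identity.
    return {name: os.path.samefile(original, path) for name, path in variants}


with tempfile.TemporaryDirectory() as workdir:
    shares_data = demo_links(workdir)
    print(shares_data)
```

Only the copy is a separate file that takes up additional space for the data. Both links resolve to the original, which is why a link to a shared cache is so much cheaper than a duplicate.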
