Everything.

A blog about nothing. By Don Kuntz.

Let's not kill the traditional file system

This is a response to Let’s at least start to consider killing the plain old file system.

Hey Mark,

I read your blog post, and have discussed this topic with you in person too, but I disagree: it’s not time to consider killing the plain old file system. It does exactly what it needs to do, doesn’t force users to consume additional space for things they probably don’t need or want, and almost all professional tools have some capacity for tracking revision history in their files.

You argue that version control systems are an acknowledgment that plain file systems don’t work for programmers. Following from that, and because disk space is relatively cheap compared to historical prices, you argue that the file system should track all changes to files. This is where I think your argument falls apart (I know, right there at the beginning…): tracking revision changes is optimizing for a small subset of computer users.

Similarly, your claim of version control being proof that file systems aren’t perfect is like claiming that smart phones are proof that traditional computers aren’t perfect. For some use cases traditional computers aren’t perfect, but there are many where they’re preferable to a smart phone (including writing this, and likely writing your own post). Convergence isn’t always a good thing, and optimizing for a subset of use cases isn’t either. Specifically relating to version control, it’s not an acknowledgment that traditional file systems don’t work; it’s an acknowledgment that they work quite well, and in a very specific case an extension to them is really useful.

Your next point is that files evolve over time, and capturing those changes with traditional version control systems is a hassle to those users. Pro software agrees with you, and even some consumer software does too, which is why they have revision tracking built in. While by default Photoshop clears an image’s history when the image is closed, you can change Photoshop’s settings to store that history in the file. Likewise, Microsoft Word, Google Drive, and most other word processors and office programs either track revisions by default or can be set to.

So, valid point. The solution, however, changing the file system, is silly. Non-programmers don’t really use plain text; they use whatever odd, binary file type their programs create. Tracking changes between saves of a binary file is mostly useless, because you’re just going to have one big blob of data and another big blob of data. The programs that have revision history built in have optimized how they track their changes for both file size and usage; shoehorning them all into a standardized revision mechanism loses those optimizations. While it might make it easier to play with the changes, it doesn’t make it better.

While not exactly claiming “think of the children”, your point on storing changes for future generations to learn from is similar enough. The main problems with arguing for future generations are that they’re not, and that not every change made to a file is interesting. Even Storyteller, which you mention as a pseudo-solution, acknowledges that not every change is interesting, and has facilities to ignore uninteresting changes built in. Is it interesting that my music jukebox’s library file incremented the play count field for the song I just listened to?

Your last argument is basically that “Google can, and does, do it, so why shouldn’t everyone else?” In short, because everyone else isn’t Google. We don’t buy cheap, failure-prone disks in bulk, and we don’t have the facilities to host those disks. Google does what it wants, but just because they do something doesn’t mean that it’s best for everyone.

Your position completely ignores large binary files. Here’s something that probably happens fairly often: I want to change the artist field in an entire album’s ID3 tags. I’ve randomly selected Albert Hammond Jr’s Yours to Keep, 12 tracks adding up to 73 megabytes. Changing every file’s field once means I now have two copies of the album stored on my drive. Every time I want to make any kind of change, the same thing is true. Maybe I added subtitles to a DVD-quality movie; that’s ~700 megabytes that’s now stored twice.
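
To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It assumes the worst case described above, a file system that keeps a complete copy of a file for every saved version; the function name is made up for illustration.

```python
def full_copy_storage_mb(size_mb, edits):
    """Space used when every edit keeps a complete copy of the previous version.

    Assumption: whole-file copies, no delta compression between versions.
    """
    return size_mb * (edits + 1)


album_mb = 73   # Yours to Keep, 12 tracks
movie_mb = 700  # rough size of the DVD-quality movie

print(full_copy_storage_mb(album_mb, 1))  # one retag of the album: 146
print(full_copy_storage_mb(movie_mb, 1))  # one subtitle edit: 1400
print(full_copy_storage_mb(album_mb, 5))  # five retags of the album: 438
```

The cost grows with the size of the file and the number of saves, not with the size of the change.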

While maybe you want to track those changes, doing so on the file system level is silly. Having a special container file that keeps track of the changes means that instead of having two copies of the audio on my system, I have two copies of the artist string on my system, in the container file. The professional tools way of doing this is better than assuming that everyone is using their computers solely for work. And that’s what version control is: a professional tool, a container file of sorts. Even Storyteller is a container file, and that’s better than replacing the whole file system as an optimization for a small subset of computer users, for a small fraction of the time they spend using computers.
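
A minimal sketch of the container idea, in Python. This is not how any particular tool stores its history; the file names, the replacement artist string, and the record layout are all hypothetical. It only shows that a log of field changes stays tiny next to the audio it describes.

```python
import json


def record_change(log, path, field, old, new):
    """Append one field-level change to the container's change log."""
    log.append({"path": path, "field": field, "old": old, "new": new})


log = []
for track in range(1, 13):  # the 12 tracks of the album
    record_change(
        log,
        f"track{track:02d}.mp3",  # hypothetical file name
        "artist",
        "Albert Hammond Jr",
        "Albert Hammond, Jr.",    # hypothetical corrected string
    )

log_bytes = len(json.dumps(log).encode("utf-8"))
album_bytes = 73 * 1000 * 1000  # the 73 megabyte album

print(f"change log: {log_bytes} bytes, album: {album_bytes} bytes")
```

The log holds the old and new artist strings for each track and nothing else, so undoing the edit never requires a second copy of the audio.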