March 2008 – FIE LLC Historic Twenties

GlobalStorage solves a lot of problems I’ve been pondering for a long time. Here are some of those problems.

You’re not really supposed to work directly on the file server in most production environments.

It clogs the network for everyone.
Network fabric is slower than internal data buses like SATA. You get better performance on a local drive.
Redundancy can save you. If you break your local copy, you can pull the original from the server to fix it.

But there are advantages to working on the server

Organizationally, it’s easier to work on a common filesystem. You see changes others make to the filesystem immediately. You and your fellow artists won’t miss each other’s changes, because they live in the same place.
You don’t have to track which files you have "changed" and which you have not when publishing back to the server, since changes are saved in place immediately and… that’s it.
Absolute filepaths that are part of the files you are working on (references from one file to another) are not an issue if everyone maps the shared directory to the same place. Even the renderfarm can work directly with the filesystem this way. Otherwise, you have to manage absolute paths and artists mess this up all the time.
You’re less likely to "forget" to publish new files that exist only on your local hard drive, since your first instinct will be to save directly to the server.

There are things that neither solution fixes

Without actually locking the files you are working on when they are loaded into memory, you risk two artists changing the same file and overwriting each other’s changes when they publish (or write) them back.
When files are eventually published to the server, they are overwritten and the old version is lost. So if you mess something up, you’re out of luck.
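
To make the overwrite problem concrete, here is a minimal sketch of an advisory lock on a shared file. It is not how GlobalStorage or any particular SCM does it. The `.lock` suffix and the function names are assumptions made up for illustration, and it only works if every tool agrees to check for the lock file before writing. Whether exclusive file creation is reliably atomic on a given network share depends on the filesystem and should be verified before relying on it.

```python
import os


def acquire_lock(path, owner):
    """Try to claim `path` by creating `path + '.lock'` exclusively.

    Returns True if we now hold the lock, False if someone else does.
    The '.lock' naming is a hypothetical convention for this sketch.
    """
    lock_path = path + ".lock"
    try:
        # O_EXCL makes the call fail if the lock file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as handle:
        handle.write(owner)
    return True


def lock_owner(path):
    """Return the name stored in the lock file, or None if unlocked."""
    try:
        with open(path + ".lock") as handle:
            return handle.read()
    except FileNotFoundError:
        return None


def release_lock(path, owner):
    """Remove the lock, but only if `owner` is the one holding it."""
    if lock_owner(path) != owner:
        return False
    os.remove(path + ".lock")
    return True
```

An artist's tool would call `acquire_lock` before opening a scene for editing and `release_lock` after publishing. A second artist's call returns False, and `lock_owner` tells them who to go talk to.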

An experienced artist will look at my list of issues and immediately start listing application features, workflows and tools to deal with the problems one by one. And let’s not be unreasonable here. Most medium-sized facilities have, at the very least, solutions, standards and practices that mitigate a lot of these problems to varying degrees. Here are a few:

Alienbrain
Perforce
Versioning files manually

Never create myfile.txt
Always make myfile_v01_01.txt

Use "incremental save" features in your 3d app
Keep your whole project in a Subversion or CVS repository
Make artists responsible for individual assets, reducing the number of people who may be working on a file.
Use the verbal Checkout/Checkin system (i.e. "I’m Checking Out SC_02!")
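
The manual versioning convention above is easy to get wrong by hand, so it is the kind of thing a small script can take over. The sketch below assumes the two numbers in `myfile_v01_01.txt` are a major and a minor counter. That reading, the regular expression, and the `next_version` name are my assumptions for illustration, not an established standard.

```python
import os
import re

# Assumed pattern: <stem>_v<major>_<minor><optional extension>
VERSIONED = re.compile(
    r"^(?P<stem>.+)_v(?P<major>\d+)_(?P<minor>\d+)(?P<ext>\.[^.]+)?$"
)


def next_version(filename, bump_major=False):
    """Return the next versioned filename for `filename`.

    Unversioned names get '_v01_01' appended to the stem. Versioned
    names have the minor counter incremented, or the major counter
    incremented and the minor reset to 1 when `bump_major` is True.
    """
    match = VERSIONED.match(filename)
    if match is None:
        stem, ext = os.path.splitext(filename)
        return f"{stem}_v01_01{ext}"

    major = int(match["major"])
    minor = int(match["minor"])
    if bump_major:
        major, minor = major + 1, 1
    else:
        minor += 1

    # Keep the zero padding width used by the existing name.
    major_width = len(match["major"])
    minor_width = len(match["minor"])
    ext = match["ext"] or ""
    return (
        f"{match['stem']}_v{major:0{major_width}d}"
        f"_{minor:0{minor_width}d}{ext}"
    )
```

For example, `next_version("myfile.txt")` gives `myfile_v01_01.txt`, and calling it again on that result gives `myfile_v01_02.txt`.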

The better solutions listed here fall into the category of Source Control Management (SCM) systems. SCMs solve a lot of problems. They were created many years ago to manage the first real digital assets: computer source code. Modern SCMs manage locking and versioning of complex directory trees. They can manage collisions down to the file level, and if your files are text based, they can often manage them down to the line level.

Perforce and Alienbrain have been optimized to work with digital media assets (which are usually big binary files rather than text files). They are, however, proprietary and expensive. If you choose one of these solutions (and many digital media production facilities do), you will be stuck licensing each artist seat or buying a rather hefty site license. They are also closed source, and as much as they can provide plugin APIs, anything that’s closed source is more difficult to customize than an open source solution.

Subversion, on the other hand, is open source and has a large volume of support. Subversion has been my favorite SCM for years and I’ve used it in production a number of times.

However, Subversion is not the end solution to the problem. All SCMs I’ve worked with, including Subversion, have a few problems.

You don’t necessarily want to version every file in your tree. Some files are meant to be replaced, especially large files that are generated from small files. It’s probably good enough to generate them and push them to the server. There’s no need to track their every incarnation over time. It’s wasteful of space and processing power.
If you accidentally commit large amounts of data to the server, it’s often quite hard to get rid of it for good. It’s part of the history, and SCM systems are kind of built NOT to lose historical data.
Archiving granularity is an issue. You can create a repository per project, but then the projects are separate. Or you can keep everything under one tree, but it becomes hard to delete a project after archiving it. Also, when archiving a project, you may want to keep versioning information for some parts but only the latest version for others. This is even more complex, if not impossible.

Anyhow, what I’ve built and have running in Alpha right now is what I’m terming GlobalStorage. It’s a suite of tools that use Subversion to implement a more robust SCM that’s better tailored to production. Basically, it’s a system built on top of Subversion and more common filesystem tools, to act as the single storage solution for a digital production studio.

Here are some features in no particular order:

Generic storage solution. Even the production accountant can use it as his/her data store, regardless of his/her completely different tools and workload. Producers can use it for their storage needs. It’s not 3d or video specific in nature.
Written mostly in Python and therefore able to integrate directly and easily into leading digital production packages.
Assets can be SVN backed or FLAT. So they are either under full historical version control, or just a flat copy on the server, depending on the appropriate storage for the asset.
Assets can show up multiple times in the directory tree. A single asset (say, HDRI Skies) can be in the textures directory for an XSI project, Maya project, and a central asset library, all without making redundant copies of the asset.
Dependencies. Assets can be set to be dependent on other assets. Dependencies can optionally be updated and committed in lockstep with one another from a single call on the top level asset.
Disconnected Mode. When the system is disconnected from its server, it can create and work with new assets locally, as if they were on a server. When you reconnect to the server, these assets can then be transferred to the server.
Assets can have their history deleted when it’s time to save space.
Assets can be filtered at the path level, allowing the permanent deletion of parts of an asset’s history without affecting the history of the rest of the asset.
Assets are easily copied and moved from server to server for archival purposes.
Assets are stored via hashcode and will never collide at the storage level. The entire history of your production at the company will be able to live on a single storage system if it’s big enough. Historical projects can be brought back into an existing server without worry of data loss.
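
To illustrate the hashcode idea from the list above, here is a rough sketch of how a storage key could be derived so that two assets with the same name never land in the same place. I'm not describing GlobalStorage's actual scheme here. Hashing a random UUID together with the asset name, the two-character directory sharding, and the function names are all assumptions chosen for the example.

```python
import hashlib
import uuid
from pathlib import PurePosixPath


def new_asset_key(name, unique_id=None):
    """Derive a storage key for a newly created asset.

    Assumption for this sketch: the key is the SHA-256 of a random
    UUID plus the asset name, so the same name created twice (or on
    two different servers) still yields two different keys.
    """
    if unique_id is None:
        unique_id = uuid.uuid4()
    digest = hashlib.sha256()
    digest.update(unique_id.bytes)
    digest.update(name.encode("utf-8"))
    return digest.hexdigest()


def storage_path(root, key):
    """Map a key to a location under `root`.

    The first two hex characters become a subdirectory so no single
    directory ends up holding every asset.
    """
    return PurePosixPath(root) / key[:2] / key[2:]
```

Because the on-disk location depends only on the key, an archived project from another server can be copied into an existing store without clobbering an unrelated asset that happens to share its name. The user-facing directory tree is a separate mapping from paths to keys.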

The 900 pound gorilla in the room has the word "Scalability" shaved into his chest hair. This is of course a serious issue and the cause of many growing pains.

There are a few ways to deal with this.

Firstly, I’m going to add a "round robin" load balancing system into GlobalStorage, where a newly created asset is put on a randomly selected server from a pool. Assets will also be able to be created on specific servers at the user’s request, and moved to specific servers at the user’s request. GlobalStorage will magically merge the assets into a directory structure on the user’s machine when they are checked out. Their location on the network is irrelevant as long as the user’s machine has access to the repositories.
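
As a rough sketch of the placement logic described above (this feature isn't built yet, so nothing here is GlobalStorage code), a pool could look something like this. The `ServerPool` class and the server names are hypothetical. The sketch hands out servers in strict rotation. Swapping `next(self._rotation)` for `random.choice(self._servers)` would give the randomly selected variant instead.

```python
import itertools


class ServerPool:
    """Tracks which server each asset lives on (hypothetical sketch)."""

    def __init__(self, servers):
        self._servers = list(servers)
        if not self._servers:
            raise ValueError("the pool needs at least one server")
        # Endless rotation over the pool: s1, s2, ..., sN, s1, ...
        self._rotation = itertools.cycle(self._servers)
        self.location = {}  # asset name -> server name

    def _check(self, server):
        if server not in self._servers:
            raise ValueError(f"unknown server: {server}")

    def create(self, asset, server=None):
        """Place a new asset, on a requested server or the next in line."""
        if server is None:
            server = next(self._rotation)
        else:
            self._check(server)
        self.location[asset] = server
        return server

    def move(self, asset, server):
        """Record that an existing asset now lives on `server`."""
        if asset not in self.location:
            raise KeyError(asset)
        self._check(server)
        self.location[asset] = server
```

With a pool of `["gs01", "gs02", "gs03"]`, four plain `create` calls land on gs01, gs02, gs03 and gs01 again, while `create("sky", server="gs03")` honors the explicit request without disturbing the rotation. At checkout time the `location` table is what a client would consult to find each asset before merging everything into one local directory tree.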

The round robin solution is pretty powerful and will probably meet the needs of a large number of facilities. With the application of minimal brain power by the artists, assets will move to unencumbered servers every once in a while.

However, what you’d really want is what’s known as a clustered file system, where it appears that a single server does all the work and runs really fast. In reality, a cluster of servers is moving data around and load balancing in a logical, data-driven manner. You would also want redundancy at every step to avoid a single point of failure, keeping your uptime in the 99.9% range even when you have a bad week and 4 drives and a network switch fail on you.

Clustered file systems are a pain to set up and usually quite expensive to license. However, one of my goals over the next few months is to put together a set of virtual machines and infrastructure to make the deployment of a clustered filesystem based on commodity hardware and open source software a simple matter.

GlobalStorage is designed to work just as well in a clustered environment as in a round robin environment. But there’s no doubt that at some size, you’ll really want to put a storage cluster into your facility rather than maintain many individual servers.


Introducing GlobalStorage