I've been spending a lot of time with our Maya render farm lately. The old one was set up with three 8-core Macs and Apple's built-in render manager, and it just wasn't cutting it anymore. We are rendering animations that take upwards of 30 minutes a frame (thank you, Mental Ray), and we needed a more flexible and self-sustaining solution.

The whole topic of batch rendering Maya files is an esoteric subject, at best. Add to that the need to batch render across a distributed farm of computers and you are entering into an area whose true secrets are known only by the ILMs and Pixars of the world. 

Maya's built-in documentation, while normally quite thorough, fails completely when you are trying to troubleshoot command-line rendering errors. And manually setting up Mental Ray Satellite or Mental Ray Batch or Mental Ray Standalone (or honestly even trying to figure out conceptually what the differences between them are) is challenging.

I found that stepping away from the integrated Maya tools was the only way for me to wrap my head around rebuilding our farm, and it forced me to consider in more general terms the features that I was really looking for in our setup.

There were five features I came to realize I needed.  

  1. Readable error messages
    Seems like a simple thing, but few tools provide readable error messages, and by readable I mean by an actual human of average to above average intelligence. I realize that error codes are an easier way to abbreviate the problem, but the general lack of documentation on what those codes actually mean makes using them all but impossible. If I am having permissions issues with writing files to the shared drive that I am rendering to, just tell me that. "Process ended with error 102" means nothing. 
  2. Frame error handling
    Anyone I know who works with distributed batch rendering will tell you that despite best efforts, frames will sometimes fail for one reason or another: for example, when the CPU you have assigned a frame to becomes occupied by a higher-priority process, or when network traffic gets in the way. It happens. And the best way to deal with this eventuality is to have the render manager automatically detect the failure and re-queue the frame to another available CPU. This is a must. 
  3. Priority control
    Priority control is useful for a number of reasons, at both the job and machine level. You may want uneven distribution of frames to certain machines (say, if certain machines are more powerful, or are being used heavily on another task), or you may want to give priority to a more important job so that any frames to be rendered for this job supersede other frames in the queue. You may also want to use priority to determine what happens to frames when they fail. 
  4. Multiple "master" controller machines
    Each machine on our farm has the ability to send jobs to the farm and act as a controller of the farm. I find this to be the best scenario for us as it lets my animators submit and manage their own jobs without having to get up and send them from a centralized machine, and (with the right software) provides for an amount of redundancy on the farm.
  5. Redundancy 
    As I mention above, frame failures happen. And sometimes the manager itself experiences problems that cause it to fail too. The most useful software allows multiple managers on the same farm to act as backups for one another. So if I start a render from one machine but that manager crashes for some reason, another manager can pick up the job (because they all monitor all of the traffic on the network) and continue rendering.  
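
To make items 1 through 3 a little more concrete, here is a minimal Python sketch of a frame queue that hands out the highest-priority frame first, re-queues a frame when it fails, and records a plain-language reason once it gives up. This is only an illustration of the idea, not how Smedge or any other manager actually works; the class name, method names, and the retry limit are all hypothetical.

```python
import heapq
import itertools


class FrameQueue:
    """Toy frame queue (hypothetical). Lower priority number = rendered sooner."""

    def __init__(self, max_attempts=3):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker: keeps submission order
        self.max_attempts = max_attempts   # assumed retry limit per frame
        self.failed = []                   # human-readable failure messages

    def submit(self, job, frame, priority, attempts=0):
        """Add a frame to the queue."""
        entry = (priority, next(self._counter), job, frame, attempts)
        heapq.heappush(self._heap, entry)

    def next_frame(self):
        """Return (job, frame, priority, attempts) or None if the queue is empty."""
        if not self._heap:
            return None
        priority, _, job, frame, attempts = heapq.heappop(self._heap)
        return job, frame, priority, attempts

    def report_failure(self, job, frame, priority, attempts, reason):
        """Re-queue a failed frame, or give up with a readable message."""
        attempts += 1
        if attempts >= self.max_attempts:
            self.failed.append(
                f"{job} frame {frame} gave up after {attempts} attempts: {reason}"
            )
        else:
            self.submit(job, frame, priority, attempts)
```

A worker would call `next_frame()`, try to render, and call `report_failure(...)` with a real explanation (say, "cannot write to shared drive: permission denied") rather than a bare error code. A failed frame goes back in at its original priority here; a real manager might instead bump or lower it, which is one of the things priority control lets you decide.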

In Conclusion

For anyone dealing with their own in-house render farm, I hope this quick list of desirables is helpful as a guide, not only for understanding the important elements of a render farm but also for providing criteria by which you can evaluate hardware and software solutions.

On the hardware side we upgraded our three Macs to 16-core machines (48 cores total on the farm), with plans to add at least three more machines soon. On the software side we went with Smedge, a very capable render farm manager that offers a great deal of flexibility and fault tolerance at an attractive price.

Author: Joshua David