Animation Industries and HPC

Animation rendering is a computationally intensive process that involves large files and a great deal of creative work. The requirements of an HPC cluster for animation differ from those of a cluster built for engineering workloads. The machines used to render these animations are called renderfarms. Below are some notes highlighting points to consider when designing solutions for a renderfarm.

Industry Overview

Animation or film editing is divided into various stages, each one going deeper into the finer details of a frame. A general breakdown can be organized into

  1. Movie Project
  2. Acts
  3. Scenes
  4. Shots
  5. Cuts
  6. Takes
  7. Reports for Daily progress

Takes also give the animators and script writers a view of the daily progress of the overall movie. When working on a commercial, the sequence is the same. However, as a commercial is usually shorter, it is divided mainly into

  1. Commercial Project
  2. Shots
  3. Cuts
  4. Takes
  5. Reports for Daily progress

On a very high level, the main components of an animation studio will consist of

  • Content Creation and Editing workstations
  • Renderfarm
  • Central Storage
  • Fast scalable Network

Because this industry is so user-oriented, knowing which applications will be used and which operating systems support them is important. This drives the need for a cross-platform infrastructure. The architecture of the file storage system is also critical, as it must support the various functions in the workflow. Sometimes the network must also connect to remote facilities.

Production Stages

The stages of production in the film industry can be divided into

  1. Shot Layout
  2. Animation
  3. Lighting
  4. Rendering
  5. Compositing
  6. Painting
  7. NLE or Non Linear Editing
  8. Film transfer

For all these stages, shots are typically used to keep track of what is happening throughout the production. Each shot can last from a few seconds to a few minutes. The resolution of each shot can be 4996×3112 or above, in 32- or 64-bit colour. Each frame can also be rendered in multiple passes for better quality; 150 passes on a single frame is not uncommon. A single frame can thus take anything from 48 to 72 hours to complete, and each shot can amount to a terabyte or more of data.

Considerations

Renderfarm

The requirements of the renderfarm depend on the animators' applications as well as the render engine that the studio chooses to use. Platform support is of utmost importance. The render manager is also an important component; products such as Alfred or EnFuzion can be used.

The best way to begin designing the renderfarm is to list the applications that the studio would like to use and identify the platforms each one supports. Note that the CPU architecture used in the renderfarm must be kept consistent. If possible, the same architecture should also be used on the workstations to ensure consistent colour and output in the resulting images.
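The listing exercise described above can be sketched in a few lines of Python. This is only an illustration: the application names and platform labels below are hypothetical placeholders, and the function name `common_platforms` is an assumption introduced here, not part of any render manager.

```python
def common_platforms(app_support: dict[str, set[str]]) -> set[str]:
    """Return the platforms supported by every application in the list."""
    if not app_support:
        return set()
    # Intersect the platform sets of all applications.
    return set.intersection(*app_support.values())


if __name__ == "__main__":
    # Hypothetical applications and platform labels, for illustration only.
    studio_apps = {
        "modelling_tool": {"linux-x86_64", "windows-x86_64", "macos-arm64"},
        "render_engine": {"linux-x86_64", "windows-x86_64"},
        "render_manager": {"linux-x86_64"},
    }
    print(common_platforms(studio_apps))
```

An empty result would indicate that no single platform supports every chosen application, which is the case where a cross-platform infrastructure becomes unavoidable.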

Storage

Three forms of storage are possible: NAS, SAN, or other advanced file systems such as parallel filesystems. For price-performance reasons, a combination of SAN FS and NAS is common. NFS is also commonly used for performance reasons.

It is also important to note that NLE workstations need to work on several streams of video at one time. This means that sustained throughput is more important than peak throughput. The speed of the I/O channel as well as the speed of the disks is critical for these operations to be well supported. Direct-attached SAN or GPFS-like filesystems appear well suited for this purpose.

By comparison, render nodes and animators' workstations use the network in a burstier manner. For these users, it is acceptable as long as the data is delivered in a timely way.

Because the workstations' requirements differ, it is common to separate the storage pool for video files used by NLE from the pool holding the files required for rendering. This is equivalent to designing a high-speed storage pool alongside a general storage pool.

Sizing

There are various hardware components in an animation environment. Workstations form one main bulk of it, and the servers and storage for the infrastructure and renderfarm form the other.

Workstation Requirements

Each artist will definitely require their own workstation. Dual-screen displays are also preferred to give the visual perspective. RAM requirements can be estimated as the required output file sizes plus the input file sizes. A good rule of thumb is approximately 2GB per CPU.

Renderfarm Requirements

The number of nodes and the specification of each node must match the baseline expectation of the number of scenes to render, i.e. the amount of time it would take for a movie to complete its render. The render time required for a frame is estimated at between 3 and 72 hours. The total can be calculated by:-

  • FPS\times ShotDuration\times ShotsPerScene=TotalNumberofFrames(F)
  • SerialTimetoRenderMovie(S)=F\times RenderTimePerFrame(T)
  • MovieRenderTime=max(T,\frac{S}{NumberofNodes})
  • NumberofNodes*=F
  • MovieRenderTime*=T

Note that * denotes the optimal value.
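A minimal sketch of this estimate in Python follows. It takes the larger of T and S/NumberofNodes, since no number of nodes can finish a single frame faster than T. The function and parameter names are hypothetical, and the model assumes every frame takes the same time and that frames are rendered independently, one frame per node at a time.

```python
import math


def total_frames(fps: float, shot_duration_s: float, shots: int) -> int:
    """F = FPS x ShotDuration x number of shots, rounded up to whole frames."""
    return math.ceil(fps * shot_duration_s * shots)


def movie_render_time_hours(frames: int, hours_per_frame: float, nodes: int) -> float:
    """Estimated wall-clock render time in hours.

    S = F x T is the serial time. Spreading S over the nodes gives S / nodes,
    but the result can never drop below T, the time for one frame.
    """
    if frames <= 0 or nodes <= 0:
        raise ValueError("frames and nodes must be positive")
    serial_hours = frames * hours_per_frame
    return max(hours_per_frame, serial_hours / nodes)


if __name__ == "__main__":
    # Illustrative values only: 24 fps, 5-second shots, 10 shots, 3 hours per frame.
    f = total_frames(24, 5, 10)
    for n in (100, f, 5000):
        print(n, "nodes:", movie_render_time_hours(f, 3, n), "hours")
```

Under this model, adding nodes beyond F brings no further reduction, which is why F is the optimal node count.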

Storage Requirements

Storage requirements can be calculated from the size per frame and the length of the movie. The I/O bandwidth required will be approximately 5MB/s per node. It is also typical to assume that the final output is only 10% of the required input. To estimate the amount of storage required, we can calculate using the following:-

  • ResolutionX\times ResolutionY\times F\times bitdepth\times3
  • FrameSizeinMB\times LengthofMovie\times FPS
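The two expressions above can be sketched in Python as follows. Several points are assumptions made for illustration rather than statements from the text: the bit depth is taken as bits per colour channel, the factor 3 as the number of colour channels, the result is divided by 8 to convert bits to bytes, the movie length is in seconds, and frames are stored uncompressed. All function names are hypothetical.

```python
def frame_bytes(res_x: int, res_y: int, bits_per_channel: int, channels: int = 3) -> int:
    """Uncompressed size of one frame in bytes (assumes no compression)."""
    return res_x * res_y * bits_per_channel * channels // 8


def storage_bytes(res_x: int, res_y: int, frames: int,
                  bits_per_channel: int, channels: int = 3) -> int:
    """ResolutionX x ResolutionY x F x bitdepth x 3, converted to bytes."""
    return frame_bytes(res_x, res_y, bits_per_channel, channels) * frames


def storage_mb_from_frame_size(frame_size_mb: float, movie_length_s: float, fps: float) -> float:
    """FrameSizeinMB x LengthofMovie x FPS, with the length in seconds."""
    return frame_size_mb * movie_length_s * fps


if __name__ == "__main__":
    # Illustrative values only: 2048x1080, 16 bits per channel, 90 minutes at 24 fps.
    frames = 24 * 90 * 60
    total = storage_bytes(2048, 1080, frames, 16)
    print(f"{total / 1e12:.2f} TB for {frames} frames")
```

Both routes give the same figure when the frame size passed to the second function matches the one computed by the first.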

Because the bandwidth requirements are very tight, the Fibre Channel fabric must be able to support the aggregate I/O throughput required, both sustained and peak. Note also that the bandwidth and I/O throughput of the network fabric is limited, especially where storage nodes are used. Storage nodes must not be part of the renderfarm.

NLEs typically also require multiple streams of sustained throughput, which gives them a much higher requirement. Latency is also critical for those workstations. It is therefore always recommended to separate the NLE I/O bandwidth from that of the main renderfarm and workstations. The storage pool should also be separated to improve disk performance. It is typical to assume that NLEs require anything from 5 to 20MB/s. Given these requirements, NLE stations should consider separate NICs to keep file I/O traffic apart from video streaming traffic.

So far, the bandwidth requirements have been at the fabric layer. Storage pool bandwidth must also be sized to support the required bandwidth. It is safe to assume that the sustained bandwidth of each disk drive is 5MB/s; real values will vary with the disk technology used. It is essential that all fabric-layer requirements translate into storage pool I/O requirements. If this is not done, a bottleneck will result.
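As a rough sketch of how fabric-layer demand translates into a disk count, the Python below sizes each storage pool separately using the per-client and per-disk figures quoted in this section. The function names are hypothetical, and the model deliberately ignores RAID overhead, caching, and controller limits, so it gives only a lower bound.

```python
import math


def pool_bandwidth_mb_s(clients: int, per_client_mb_s: float) -> float:
    """Aggregate sustained bandwidth a pool must deliver to its clients."""
    return clients * per_client_mb_s


def min_disks(required_mb_s: float, per_disk_mb_s: float = 5.0) -> int:
    """Minimum number of drives whose combined sustained rate meets the demand."""
    if per_disk_mb_s <= 0:
        raise ValueError("per_disk_mb_s must be positive")
    return math.ceil(required_mb_s / per_disk_mb_s)


if __name__ == "__main__":
    # Illustrative client counts; the per-client rates come from the text above.
    render_pool = pool_bandwidth_mb_s(clients=100, per_client_mb_s=5.0)
    nle_pool = pool_bandwidth_mb_s(clients=4, per_client_mb_s=20.0)
    print("render pool:", render_pool, "MB/s,", min_disks(render_pool), "disks")
    print("NLE pool:", nle_pool, "MB/s,", min_disks(nle_pool), "disks")
```

Keeping the two pools as separate calls mirrors the recommendation to isolate NLE storage from the general renderfarm pool.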

Complexities

Cost:-

  • Half a day of animator time costs 1500 USD

Computational:-

  • 10 minutes of animation uses 50 data files
  • Fur rendering: a small model had 1.2 million pixels, with layered rendering and sub-processes required between computers. A basic fur render takes 20 minutes at 1000×800 resolution, 30fps.
  • RIB files are huge, and transmitting in this format is OK. Not enough data to reconstruct.
  • RIB files are to be recombined at a central location for compositing.

Data:-

  • Centralized or distributed data? If centralized, the concern is the bottleneck. If distributed, the concern is the sync and cache mechanism.

Process:-

  • During production, allow changes with large interruptions
  • Autoupdate of models. For those in process as well.
  • Break down of models into objects.
  • Animator requires immediate feedback. Interactiveness is very important
  • How to layer for more efficient rendering
  • Maya renderer is not as good as Renderman
  • The due date is the most important consideration
  • Who does what is also just a guideline

Possible Solutions

Some of the possible customized solutions include:-

  • building cluster of workstations to enable workstation load sharing
  • designing custom algorithms to reduce vertices so as to achieve faster and better rendering

– cloth simulation

  • deep integration with the user interface of the animation package

The benefits of the solutions should always drive towards lower cost and faster turnaround in changes and rendering. They should always bring to the users a result that is as true to real life as possible. In terms of workflow, straight-through processing should be adopted to give the animators the best possible feedback during the production process.