Parsing and converting animations to glTF

by Anirudh on 15 Aug '21

TypeScript
Electron
Animation
3D Graphics

As I've mentioned in an earlier post, I spend some of my time working on tools to help with the development of private servers for an old MMORPG called Tales of Pirates.

This time, I dove into the proprietary file format (.lab) that the game uses to store its character skeleton and animation data, and wrote a tool that converts it to and from glTF, an "open" file format used to store data about 3D scenes and models.

Sample Model

This is one of the character models from the game, which I'll be using to walk through the process.

File Format

The game stores animation and skeleton data for its characters in a .lab file, a format that only exists for games built on the same proprietary 3D engine (called MindPower). The file is a binary blob whose values are stored in little-endian byte order.

The file data is sectioned into the following logical information chunks:

Header
Bones
Inverse Matrices
Dummies
Key Sequences

The header section contains metadata about the rest of the file. It contains 4 main u32 values -

Number of bones
Number of frames
Number of dummies
The Key type being used (more on this later)

I used the binary-parser library to easily parse binary data into a format that was consumable.

For the header, the parser format was fairly simple -

private getHeaderParser(): Parser {
  return new Parser()
    .endianess('little')
    .uint32('boneNum')
    .uint32('frameNum')
    .uint32('dummyNum')
    .uint32('keyType');
}

This sets up the base to then extract the rest of the data from the file.
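To make the header layout concrete, here is a rough, library-free sketch that reads the same four fields with a DataView. It assumes the four u32 values sit back to back at the start of the buffer; `LabHeader`, `readHeader`, and the sample values are hypothetical names and made-up data for illustration, not part of the tool.

```typescript
// Hypothetical shape mirroring the four header fields described above.
interface LabHeader {
  boneNum: number;
  frameNum: number;
  dummyNum: number;
  keyType: number;
}

// Read four consecutive little-endian u32 values.
// Assumption: the header starts at byte 0 of the given bytes.
function readHeader(bytes: Uint8Array): LabHeader {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  return {
    boneNum: view.getUint32(0, true), // `true` selects little-endian
    frameNum: view.getUint32(4, true),
    dummyNum: view.getUint32(8, true),
    keyType: view.getUint32(12, true),
  };
}

// Build a 16-byte sample header in memory (made-up values) and read it back.
const sampleHeaderBytes = new Uint8Array(16);
const sampleHeaderView = new DataView(sampleHeaderBytes.buffer);
[2, 30, 1, 3].forEach((value, i) => sampleHeaderView.setUint32(i * 4, value, true));

const sampleHeader = readHeader(sampleHeaderBytes);
console.log(sampleHeader);
```

The counts read here are what size every later section: the number of bones, dummies, and frames determine how many records follow.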

Bones

Bones and joints are a universal concept in 3D animation software. Analogous to the human skeleton, joints are designated points on a 3D model that form a hierarchical structure and are used to move the model in 3D space.

(Source for the image)

Each sphere in the above image is a joint and the conical shapes between them are the bones. The model of a character is then rigged to these joints, and if the joint is moved, the model moves according to how the rigging was done.

Bones in the game form a strict hierarchy: a single root bone branches out into the rest of the skeleton.

The structure of the bone data is fairly straightforward. In the file, all the data about the bone hierarchy is stored sequentially in the following format -

{
  boneName: string,
  boneId: number,
  parentBoneId: number
}

where the boneName is an ASCII formatted, NULL-terminated string extracted from 64 bytes of data. The next two numbers are simple 4-byte u32 numbers.
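A bone record as described above is 72 bytes long (64 for the name, then two u32 values). The following is a minimal sketch of reading such records without any library, assuming the fields appear in the order listed; `BoneRecord`, `readFixedAscii`, `readBones`, and the sample bone are hypothetical and only for illustration.

```typescript
// Hypothetical record type matching the fields described above.
interface BoneRecord {
  boneName: string;
  boneId: number;
  parentBoneId: number;
}

const BONE_NAME_BYTES = 64;
const BONE_RECORD_BYTES = BONE_NAME_BYTES + 4 + 4; // name + id + parent id

// Decode a fixed-width ASCII field, stopping at the first NULL byte.
function readFixedAscii(bytes: Uint8Array, start: number, width: number): string {
  const field = bytes.subarray(start, start + width);
  const nul = field.indexOf(0);
  const used = nul === -1 ? field : field.subarray(0, nul);
  return String.fromCharCode(...Array.from(used));
}

// Read `boneNum` sequential records starting at byte offset `start`.
function readBones(bytes: Uint8Array, start: number, boneNum: number): BoneRecord[] {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const bones: BoneRecord[] = [];
  for (let i = 0; i < boneNum; i++) {
    const base = start + i * BONE_RECORD_BYTES;
    bones.push({
      boneName: readFixedAscii(bytes, base, BONE_NAME_BYTES),
      boneId: view.getUint32(base + BONE_NAME_BYTES, true),
      parentBoneId: view.getUint32(base + BONE_NAME_BYTES + 4, true),
    });
  }
  return bones;
}

// Encode one made-up bone ("spine", id 1, parent 0) and parse it back.
const sampleBoneBytes = new Uint8Array(BONE_RECORD_BYTES);
'spine'.split('').forEach((ch, i) => { sampleBoneBytes[i] = ch.charCodeAt(0); });
const sampleBoneView = new DataView(sampleBoneBytes.buffer);
sampleBoneView.setUint32(64, 1, true);
sampleBoneView.setUint32(68, 0, true);

const parsedBones = readBones(sampleBoneBytes, 0, 1);
console.log(parsedBones);
```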

Inverse Matrices

Matrices are heavily used in computer graphics. They let you perform operations (like translation, rotation, and scaling) on the joints of your 3D model, and a lot more.

This translation/rotation/scaling information about joints is stored relative to the parent joint instead of relative to the global space/context. This is done to make interpolation easier.
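To illustrate what "relative to the parent joint" means in practice, here is a small sketch that turns parent-relative transforms into global ones by multiplying down the hierarchy. It uses plain arrays in column-major order (the same element layout three.js uses for Matrix4) and assumes parents are listed before their children; `Joint`, `worldTransforms`, and the sample chain are hypothetical.

```typescript
type Mat4 = number[]; // 16 numbers, column-major

// Multiply two column-major 4x4 matrices: result = a * b.
function multiplyMat4(a: Mat4, b: Mat4): Mat4 {
  const out: number[] = new Array(16).fill(0);
  for (let col = 0; col < 4; col++) {
    for (let row = 0; row < 4; row++) {
      let sum = 0;
      for (let k = 0; k < 4; k++) {
        sum += a[k * 4 + row] * b[col * 4 + k];
      }
      out[col * 4 + row] = sum;
    }
  }
  return out;
}

// Pure translation matrix; the offset lives in elements 12, 13 and 14.
function translationMat4(x: number, y: number, z: number): Mat4 {
  return [1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, x, y, z, 1];
}

interface Joint {
  parent: number | null; // index of the parent joint, null for the root
  local: Mat4;           // transform relative to the parent
}

// Global transform = parent's global transform * own local transform.
// Assumption: every parent appears earlier in the array than its children.
function worldTransforms(joints: Joint[]): Mat4[] {
  const world: Mat4[] = [];
  joints.forEach((joint, i) => {
    world[i] = joint.parent === null
      ? joint.local
      : multiplyMat4(world[joint.parent], joint.local);
  });
  return world;
}

// A made-up three-joint chain: each offset is relative to the joint before it.
const sampleChain: Joint[] = [
  { parent: null, local: translationMat4(0, 1, 0) },
  { parent: 0, local: translationMat4(0, 2, 0) },
  { parent: 1, local: translationMat4(1, 0, 0) },
];

const sampleWorld = worldTransforms(sampleChain);
console.log(sampleWorld[2].slice(12, 15)); // global position of the last joint
```

Moving the root in this chain moves every descendant for free, which is the practical benefit of storing transforms relative to the parent.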

The inverse matrices (also known as the inverse bind pose matrices) capture the translation and rotation of each joint in the "rest" pose (i.e. the base pose, before any animation has been applied), stored in inverted form. These matrices are then combined with the per-frame vectors and quaternions to create the transformations for each frame of the animation.

The following is the parser code to extract the inverse matrices -

  private getInverseMatrixParser(): Parser {
    return new Parser()
    .endianess('little')
    .array('', {
      type: 'uint8',
      length: 64,
      formatter: function(arr: number[]) {
        const uint8Array = Uint8Array.from(arr);
        const floatArr = new Float32Array(uint8Array.buffer);
        
        const matrix = new Matrix4();
        matrix.fromArray(floatArr);
        
        return matrix;
      },
    });
  }

Each transformation is represented by a 4x4 matrix of floating point numbers. We parse 64 uint8 values (16 numbers of 4 bytes each), reinterpret them as 16 Float32 numbers, and create a 4x4 matrix out of them. I'm using the math data structures provided by three.js to store and operate on this data.
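One detail worth knowing about the byte-to-float step: a Float32Array laid over a buffer reads in the platform's byte order, which happens to be little-endian on most machines. As an alternative sketch, a DataView can decode the same 64 bytes with the byte order spelled out explicitly. `bytesToMatrixElements` is a hypothetical helper, not the code used in the tool.

```typescript
// Decode 64 raw bytes as 16 little-endian float32 values.
function bytesToMatrixElements(bytes: Uint8Array, start: number = 0): number[] {
  const view = new DataView(bytes.buffer, bytes.byteOffset + start, 64);
  const elements: number[] = [];
  for (let i = 0; i < 16; i++) {
    elements.push(view.getFloat32(i * 4, true)); // `true` selects little-endian
  }
  return elements;
}

// Round trip: write an identity matrix as bytes, then decode it again.
const identityElements = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1];
const rawMatrixBytes = new Uint8Array(64);
const rawMatrixView = new DataView(rawMatrixBytes.buffer);
identityElements.forEach((value, i) => rawMatrixView.setFloat32(i * 4, value, true));

const decodedElements = bytesToMatrixElements(rawMatrixBytes);
console.log(decodedElements);
```

The resulting 16-element array can be handed to `Matrix4.fromArray` in the same way as the Float32Array above.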

Dummies

Dummy objects are used in animation to make the motion of 3D models easier to control. They are "invisible" objects that act as parents for some joints. Instead of transforming each of those joints for every keyframe, we can translate/scale/rotate the dummy object, and its child joints transform along with it, which is a lot easier.

Dummy objects in this game are used to "hook" the animation model at a particular point in the game world, so that interaction with the character is based on the position of the dummy object. The dummy data was also fairly simple to parse. The structure of the data was -

{
  id: number,
  parentBoneId: number,
  matrix: Matrix4
}

where id is the ID of the dummy, parentBoneId is the bone that the dummy is linked to, and matrix is a 4x4 matrix containing the transforms of the dummy object.

Key Sequences

Keyframes (or key sequences, as the game files call them) are points in time where a particular transformation has been applied to the base skeleton of a character model. Each keyframe contains the transformation that was applied, and the frames between keyframes are usually interpolated by the 3D renderer. This means we only need to define the pose of a model at particular points in time, and the renderer fills in the transformations that happen between two keyframes.
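As a rough sketch of what that interpolation looks like for positions, the snippet below samples a track of keyframe positions at a fractional frame index using linear interpolation. This is a generic illustration rather than what any particular renderer does (rotations, for instance, are typically interpolated with a spherical method instead); `Vec3`, `lerpVec3`, `samplePosition`, and the sample track are hypothetical.

```typescript
interface Vec3 {
  x: number;
  y: number;
  z: number;
}

// Linear interpolation between two positions; t runs from 0 (a) to 1 (b).
function lerpVec3(a: Vec3, b: Vec3, t: number): Vec3 {
  return {
    x: a.x + (b.x - a.x) * t,
    y: a.y + (b.y - a.y) * t,
    z: a.z + (b.z - a.z) * t,
  };
}

// Sample a position track at a fractional frame index,
// e.g. 1.5 is halfway between keyframe 1 and keyframe 2.
function samplePosition(track: Vec3[], frame: number): Vec3 {
  const clamped = Math.min(Math.max(frame, 0), track.length - 1);
  const lower = Math.floor(clamped);
  const upper = Math.min(lower + 1, track.length - 1);
  return lerpVec3(track[lower], track[upper], clamped - lower);
}

// Made-up track with three keyframes.
const sampleTrack: Vec3[] = [
  { x: 0, y: 0, z: 0 },
  { x: 2, y: 4, z: 6 },
  { x: 2, y: 4, z: 8 },
];

const halfwayFirst = samplePosition(sampleTrack, 0.5);
const halfwaySecond = samplePosition(sampleTrack, 1.5);
const pastTheEnd = samplePosition(sampleTrack, 10); // clamps to the last keyframe
console.log(halfwayFirst, halfwaySecond, pastTheEnd);
```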

Each keyframe in the game contains data about the type of transformation that is happening on the model. This is where the keyType value from the header comes into play. Each animation file can only contain keyframes of a particular type. The types that are supported by the game are -

4x4 Matrices where the last row is a unit vector
4x4 Matrices
A combination of a Quaternion and a 3-D Vector

The end result is always a 4x4 matrix built from the data parsed from the file; the difference is in how that data is stored. (I'm assuming the formats reflect the different amounts of detail required in the animations, with Quaternions coming into play later, when that level of control was needed.)
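For the Quaternion and Vector case, the 4x4 matrix can be built with the standard quaternion-to-rotation-matrix formula, with the vector placed in the translation slots. The sketch below shows that construction on plain arrays in column-major order, assuming a unit quaternion and no scaling; `composeMatrix` is a hypothetical helper. In three.js, `Matrix4.compose` performs the same job.

```typescript
// Build a column-major 4x4 matrix from a position and a unit quaternion.
// Assumption: the quaternion is normalized and there is no scale component.
function composeMatrix(
  px: number, py: number, pz: number,
  qx: number, qy: number, qz: number, qw: number,
): number[] {
  const x2 = qx + qx, y2 = qy + qy, z2 = qz + qz;
  const xx = qx * x2, xy = qx * y2, xz = qx * z2;
  const yy = qy * y2, yz = qy * z2, zz = qz * z2;
  const wx = qw * x2, wy = qw * y2, wz = qw * z2;

  return [
    1 - (yy + zz), xy + wz, xz - wy, 0, // first column: rotated x axis
    xy - wz, 1 - (xx + zz), yz + wx, 0, // second column: rotated y axis
    xz + wy, yz - wx, 1 - (xx + yy), 0, // third column: rotated z axis
    px, py, pz, 1,                      // fourth column: translation
  ];
}

// Identity rotation with a made-up translation.
const translatedOnly = composeMatrix(1, 2, 3, 0, 0, 0, 1);

// A half turn (180 degrees) around the z axis, no translation.
const halfTurnZ = composeMatrix(0, 0, 0, 0, 0, 1, 0);

console.log(translatedOnly, halfTurnZ);
```

A half turn around z flips the x and y axes and leaves z alone, which is what the first three diagonal entries of `halfTurnZ` encode.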

Each keyType resulted in me defining a parser that could extract the data relevant for that key type. Following is the parser that was written for the Quaternion keyType -

private getQuaternionParser(): Parser {
    return new Parser()
      .endianess('little')
      .array('', {
        length: this.header.frameNum * ( 12 + 16 ),
        type: 'uint8',
        formatter: (arr: number[]) => {
          const positions: Vector3[] = [];
          const rotations: Quaternion[] = [];

          const u8Array = Uint8Array.from(arr);

          let offset = 0;

          for(let i = 0; i < this.header.frameNum; i++) {
            const floatArray = new Float32Array(u8Array.buffer.slice(offset, offset + 12));
            offset += 12;
            const vector = new Vector3(floatArray[0], floatArray[1], floatArray[2]);
            
            positions.push(vector);
          }

          for(let i = 0; i < this.header.frameNum; i++) {
            const floatArray = new Float32Array(u8Array.buffer.slice(offset, offset+16));
            offset += 16;
            const quaternion = new Quaternion(floatArray[0], floatArray[1], floatArray[2], floatArray[3]);

            rotations.push(quaternion);
          }

          return {
            positions,
            rotations,
          };
        },
      });
  }

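We parse 28 bytes for every frame (12 bytes for the 3D Vector, and 16 bytes for the Quaternion). Note that the data is not interleaved: the positions for all frames come first, followed by the rotations for all frames. Each slice is then converted into a Float32 array and used to build the Vector3 and Quaternion data structures from three.js.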

This completes the parsing of the animation file. The next step is to convert the parsed data into the glTF format, which I'll talk about in the second part of this post.