NECromancer:

Breathing Life Into Skeletons Via BVH Animation

Xu Mingxi¹, Qi WANG¹, Zhengyu Wen¹, Phong Dao Thien¹, Zhengyu Li¹, Ning Zhang¹,
Xiaoyu He¹, Wei Zhao¹, Kehong Gong², Mingyuan Zhang¹,†

¹Huawei Central Media Technology Institute ²Huawei Technologies Co., Ltd.

†Corresponding author

Abstract

Motion tokenization is fundamental to the development of generalizable motion models, yet existing approaches remain restricted to species-specific skeletons, such as humans, which limits their applicability across diverse morphologies. We present NECromancer (NEC), a universal motion tokenizer designed to operate on arbitrary BVH skeletons. NEC is built upon three core components: (1) an Ontology-aWare Skeletal Graph EncOder (OwO), which uses graph neural networks to encode structural priors extracted from BVH files (joint-name semantics, rest-pose offsets, and skeletal topology) into robust skeletal embeddings; (2) a Topology-Agnostic Tokenizer (TAT), which compresses motion sequences into a universal, topology-invariant latent representation, thereby decoupling motion dynamics from morphology; and (3) the Unified BVH Universe (UvU), a large-scale dataset that consolidates BVH motions across heterogeneous skeletons (humans, quadrupeds, and other species), enabling systematic training and evaluation under diverse morphologies. Experimental results demonstrate that NEC achieves high-fidelity motion reconstruction with substantial compression, while effectively disentangling motion from skeletal structure. This capability supports a broad range of downstream tasks, including cross-species motion transfer, motion composition, denoising, generation (plug-and-play with any token-based generator, e.g., MoMask), and motion-text retrieval (via an OwO-based CLIP variant). By grounding motion representation in BVH animation while removing species-specific constraints, NEC establishes a principled framework for universal motion analysis and synthesis across varied morphologies.

Method

Overview of NECromancer (NEC). The NEC model consists of two main components: (a) the Ontology-aware Skeletal Graph Encoder (OwO), which encodes static skeletal information (topology, joint names, rest pose) into structured graph-based joint features; and (b) the Topology-Agnostic Tokenizer (TAT), comprising a Spatio-Temporal Encoder and Decoder, which maps motion sequences into a unified feature space, appends virtual joints, and converts the result into discrete motion tokens.
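
To make the static skeletal inputs concrete, the sketch below reads the HIERARCHY section of a BVH file and recovers the three quantities listed above: joint names, rest-pose offsets, and topology (as parent indices). It is a minimal illustration that relies only on the standard BVH text layout, and it is not the authors' implementation. The names Skeleton, parse_bvh_hierarchy, skeleton_edges, and SAMPLE_BVH are hypothetical, and End Site offsets are skipped here for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Skeleton:
    """Static skeletal information read from a BVH HIERARCHY section (hypothetical container)."""
    names: list = field(default_factory=list)    # joint names, in file order
    offsets: list = field(default_factory=list)  # rest-pose offset of each joint from its parent
    parents: list = field(default_factory=list)  # parent joint index, -1 for the root


def parse_bvh_hierarchy(text: str) -> Skeleton:
    """Parse joint names, rest-pose offsets, and parent indices from BVH text."""
    skel = Skeleton()
    stack = []      # one entry per open brace: a joint index, or None for an End Site
    current = None  # the entry that the next "{" will open
    for line in text.splitlines():
        tokens = line.split()
        if not tokens or tokens[0] == "HIERARCHY":
            continue
        key = tokens[0]
        if key == "MOTION":
            break  # per-frame channel data follows; not needed for the static skeleton
        if key in ("ROOT", "JOINT"):
            skel.names.append(" ".join(tokens[1:]))
            skel.parents.append(stack[-1] if stack else -1)
            skel.offsets.append((0.0, 0.0, 0.0))
            current = len(skel.names) - 1
        elif key == "End":
            current = None  # "End Site" block: a leaf marker, not an animated joint
        elif key == "{":
            stack.append(current)
        elif key == "}":
            stack.pop()
        elif key == "OFFSET" and stack and stack[-1] is not None:
            skel.offsets[stack[-1]] = tuple(float(t) for t in tokens[1:4])
    return skel


def skeleton_edges(skel: Skeleton) -> list:
    """Return (parent, child) index pairs, i.e. the edge list of the skeletal graph."""
    return [(p, i) for i, p in enumerate(skel.parents) if p >= 0]


# A tiny hand-written skeleton used only for illustration.
SAMPLE_BVH = """HIERARCHY
ROOT Hips
{
  OFFSET 0.0 0.0 0.0
  CHANNELS 6 Xposition Yposition Zposition Zrotation Xrotation Yrotation
  JOINT Spine
  {
    OFFSET 0.0 5.0 0.0
    CHANNELS 3 Zrotation Xrotation Yrotation
    End Site
    {
      OFFSET 0.0 3.0 0.0
    }
  }
  JOINT LeftLeg
  {
    OFFSET 1.0 -4.0 0.0
    CHANNELS 3 Zrotation Xrotation Yrotation
    End Site
    {
      OFFSET 0.0 -4.0 0.0
    }
  }
}
MOTION
Frames: 0
Frame Time: 0.0333
"""

sample_skeleton = parse_bvh_hierarchy(SAMPLE_BVH)
sample_edges = skeleton_edges(sample_skeleton)
```

The edge list, together with per-joint names and offsets, is the kind of input a graph encoder could consume. How OwO actually embeds joint names or builds its graph features is not specified in this overview.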

Results


Reconstruction Comparison with Other Methods
