Motion tokenization is fundamental to the development of generalizable motion models, yet existing approaches remain restricted to species-specific skeletons, such as humans, which limits their applicability across diverse morphologies. We present NECromancer (NEC), a universal motion tokenizer designed to operate on arbitrary BVH skeletons. NEC is built upon three core components: (1) an Ontology-aWare Skeletal Graph EncOder (OwO), which uses graph neural networks to encode structural priors extracted from BVH files (joint-name semantics, rest-pose offsets, and skeletal topology) into robust skeletal embeddings; (2) a Topology-Agnostic Tokenizer (TAT), which compresses motion sequences into a universal, topology-invariant latent representation, thereby decoupling motion dynamics from morphology; and (3) the Unified BVH Universe (UvU), a large-scale dataset that consolidates BVH motions across heterogeneous skeletons (humans, quadrupeds, and other species), enabling systematic training and evaluation under diverse morphologies. Experimental results demonstrate that NEC achieves high-fidelity motion reconstruction with substantial compression while effectively disentangling motion from skeletal structure. This capability supports a broad range of downstream tasks, including cross-species motion transfer, motion composition, denoising, generation (plug-and-play with any token-based generator, e.g., MoMask), and motion-text retrieval (via an OwO-based CLIP variant). By grounding motion representation in BVH animation while removing species-specific constraints, NEC establishes a principled framework for universal motion analysis and synthesis across varied morphologies.
Overview of NECromancer (NEC). NEC consists of two main components: (a) the Ontology-aware Skeletal Graph Encoder (OwO), which encodes static skeletal information (topology, joint names, rest pose) into structured graph-based joint features; and (b) the Topology-Agnostic Tokenizer (TAT), comprising a Spatio-Temporal Encoder and Decoder, which maps motion sequences into a unified feature space, appends virtual joints, and converts them into discrete motion tokens.
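To make the static inputs to OwO concrete, the following is a minimal sketch of how joint names, rest-pose offsets, and topology could be read from the HIERARCHY section of a BVH file. It is not the NEC implementation: the function name `parse_bvh_hierarchy`, the `DEMO_BVH` skeleton, and the choice to skip End Site offsets are assumptions made for illustration, and the parser assumes the common layout in which each brace sits on its own line.

```python
# Illustrative sketch only; not the authors' code.
# Assumes standard BVH layout: "ROOT"/"JOINT" lines, braces on their own lines.

DEMO_BVH = """HIERARCHY
ROOT Hips
{
  OFFSET 0.0 0.0 0.0
  CHANNELS 6 Xposition Yposition Zposition Zrotation Xrotation Yrotation
  JOINT Spine
  {
    OFFSET 0.0 10.0 0.0
    CHANNELS 3 Zrotation Xrotation Yrotation
    End Site
    {
      OFFSET 0.0 5.0 0.0
    }
  }
  JOINT LeftLeg
  {
    OFFSET 3.0 -8.0 0.0
    CHANNELS 3 Zrotation Xrotation Yrotation
    End Site
    {
      OFFSET 0.0 -8.0 0.0
    }
  }
}
MOTION
"""


def parse_bvh_hierarchy(text):
    """Return (names, parents, offsets, edges) for the joints in a BVH hierarchy.

    parents[i] is the index of joint i's parent (-1 for the root);
    offsets[i] is the rest-pose offset of joint i from its parent;
    edges lists (parent, child) index pairs, i.e. the skeletal topology.
    """
    names, parents, offsets = [], [], []
    stack = []           # indices of joints whose braces are currently open
    pending = None       # joint declared but not yet opened with "{"
    in_end_site = False  # End Site blocks are skipped in this sketch

    for raw in text.splitlines():
        tokens = raw.split()
        if not tokens:
            continue
        key = tokens[0]
        if key == "MOTION":
            break  # the hierarchy section is over
        if key in ("ROOT", "JOINT"):
            names.append(tokens[1])
            parents.append(stack[-1] if stack else -1)
            offsets.append([0.0, 0.0, 0.0])
            pending = len(names) - 1
        elif key == "End":
            in_end_site = True
        elif key == "{":
            if not in_end_site:
                stack.append(pending)
        elif key == "}":
            if in_end_site:
                in_end_site = False
            else:
                stack.pop()
        elif key == "OFFSET" and not in_end_site:
            offsets[stack[-1]] = [float(v) for v in tokens[1:4]]

    edges = [(p, i) for i, p in enumerate(parents) if p >= 0]
    return names, parents, offsets, edges


names, parents, offsets, edges = parse_bvh_hierarchy(DEMO_BVH)
```

The three returned structures correspond to the three priors named in the overview: `names` for joint-name semantics, `offsets` for the rest pose, and `edges` for topology. A graph encoder would typically consume them as node features and an edge list.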