3DiMo: 3D-Aware Implicit Motion Control for
View-Adaptive Human Video Generation

Zhixue Fang1,* Xu He1,2,* Songlin Tang1,* Haoxian Zhang1,†,✉ Qingfeng Li1,3 Xiaoqiang Liu1 Pengfei Wan1 Kun Gai1
1Kling Team, Kuaishou Technology    2Tsinghua University    3Chinese Academy of Sciences
* Equal contribution    † Project leader    ✉ Corresponding author
TL;DR: 3DiMo faithfully transfers genuine 3D motion from a driving video to a reference character while enabling flexible free-view camera control.

Method Overview

Overview Figure

Additional Results: Motion Transfer with Text-Guided Camera Control

Left: Source Motion | Mid-L: "Handheld camera." | Mid-R: "Camera arcs to the left." | Right: "Camera moves upward and tilts down."
Left: Source Motion | Mid-L: "Static camera." | Mid-M: "Camera arcs to the left." | Mid-R: "Camera tilts upward." | Right: "Camera moves upward and tilts down."
Left: Source Motion | Mid-L: "Camera moves left." | Mid-R: "Camera tilts up." | Right: "Camera arcs to the left."
Left: Source Motion | Mid-L: "Camera moves right." | Mid-R: "Camera arcs to the right." | Right: "Static camera."
Left: Source Motion | Middle: "Camera arcs to the right." | Right: "Camera arcs to the left."
Left: Source Motion | Middle: "Handheld camera." | Right: "Camera tilts up and down."
Left: Source Motion | Mid-L: "Camera arcs to the right." | Mid-R: "Camera remains static." | Right: "Camera arcs to the left."
Left: Source Motion | Mid-L: "Camera suddenly zooms in." | Mid-R: "Camera arcs left." | Right: "Camera moves upward while circling right."
Left: Source Motion | Middle: "Camera moves backward, then arcs left." | Right: "Camera tilts down and arcs right."
Left: Source Motion | Mid-L: "Camera remains static." | Mid-R: "Camera slowly moves backward." | Right: "Camera slowly zooms in."
Left: Source Motion | Middle: "Camera arcs to the left." | Right: "Camera moves upward and tilts down slowly."
Left: Source Motion | Middle: "Camera slowly zooms in." | Right: "Camera moves upward and tilts down."
Left: Source Motion | Mid-L: "Camera arcs right slowly." | Mid-R: "Handheld camera." | Right: "Camera moves upward and slowly tilts down."
Left: Source Motion | Mid-L: "Camera moves upward and slowly tilts down." | Mid-R: "Camera moves downward." | Right: "Handheld camera."
Left: Source Motion | Middle: "Handheld camera." | Right: "Camera moves downward and slowly tilts up."
Left: Source Motion | Middle: "Camera moves downward and slowly tilts up." | Right: "Camera moves upward and slowly tilts down."

Broader Applications

Automatic Motion-Image Alignment

Human Novel View Synthesis from a Single Image

Video Stabilization

Ethical Considerations

All image and video materials presented on this page are either publicly available or synthetically generated. They are used solely for academic research to illustrate the technical scope of character animation. The materials are non-commercial and provided only for demonstration under fair use. No identity, likeness, or content ownership beyond research illustration is implied.

BibTeX

@misc{fang20263dawareimplicitmotioncontrol,
  title={3D-Aware Implicit Motion Control for View-Adaptive Human Video Generation},
  author={Zhixue Fang and Xu He and Songlin Tang and Haoxian Zhang and Qingfeng Li and Xiaoqiang Liu and Pengfei Wan and Kun Gai},
  year={2026},
  eprint={2602.03796},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2602.03796},
}