StereoPIFu: Depth Aware Clothed Human Digitization via Stereo Vision

CVPR 2021

Yang Hong¹, Juyong Zhang¹, Boyi Jiang¹, Yudong Guo¹, Ligang Liu¹, Hujun Bao²
¹University of Science and Technology of China, ²Zhejiang University

Paper

Code

In this work, we propose StereoPIFu, which integrates the geometric constraints of stereo vision with the implicit function representation of PIFu to recover the 3D shape of a clothed human from a pair of low-cost rectified images. First, we introduce voxel-aligned features from a stereo vision-based network to enable depth-aware reconstruction. Moreover, a novel relative z-offset is employed to associate the predicted high-fidelity human depth with occupancy inference, which helps restore fine-level surface details. Second, we design a network structure that fully utilizes the geometry information from the stereo images to improve the quality of human body reconstruction. Consequently, StereoPIFu can naturally infer the human body's spatial location in camera space and maintain the correct relative position of different body parts, which enables our method to capture human performance. Compared to previous works, StereoPIFu significantly improves the robustness, completeness, and accuracy of clothed human reconstruction, as demonstrated by extensive experimental results.

Method

Overview of our StereoPIFu pipeline. Given a stereo pair and a query point P, we construct P's pixel-aligned feature, voxel-aligned features, and relative z-offset. These features encode whether P lies inside the underlying surface, and an MLP uses them to infer the occupancy of P.
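
To make the per-point feature construction more concrete, the following is a minimal NumPy sketch and not the released implementation. It assumes a pinhole camera with hypothetical intrinsics `f`, `cx`, `cy`, a hypothetical stereo `baseline`, and precomputed feature and depth maps. The sign convention of the z-offset (query depth minus predicted depth) is an assumption made for illustration, and the voxel-aligned features and the MLP are left out for brevity.

```python
import numpy as np

def project(P, f, cx, cy):
    """Pinhole projection of a camera-space point P = (x, y, z) to pixel (u, v)."""
    x, y, z = P
    return f * x / z + cx, f * y / z + cy

def bilinear(grid, u, v):
    """Bilinearly sample an (H, W) or (H, W, C) array at continuous pixel (u, v)."""
    H, W = grid.shape[:2]
    u = float(np.clip(u, 0, W - 1))
    v = float(np.clip(v, 0, H - 1))
    u0, v0 = int(np.floor(u)), int(np.floor(v))
    u1, v1 = min(u0 + 1, W - 1), min(v0 + 1, H - 1)
    a, b = u - u0, v - v0
    top = (1 - a) * grid[v0, u0] + a * grid[v0, u1]
    bottom = (1 - a) * grid[v1, u0] + a * grid[v1, u1]
    return (1 - b) * top + b * bottom

def disparity_to_depth(disparity, f, baseline):
    """Standard rectified-stereo relation: depth = f * baseline / disparity."""
    return f * baseline / np.maximum(disparity, 1e-6)

def query_features(P, feat_map, depth_map, f, cx, cy):
    """Build a feature vector for query point P (voxel-aligned features omitted).

    The last entry is the relative z-offset, assumed here to be the query
    point's z minus the predicted depth at its projected pixel.
    """
    u, v = project(P, f, cx, cy)
    pixel_feat = np.atleast_1d(bilinear(feat_map, u, v))  # pixel-aligned feature
    z_offset = P[2] - bilinear(depth_map, u, v)           # relative z-offset (assumed sign)
    return np.concatenate([pixel_feat, [z_offset]])
```

Under these assumptions, a point lying behind the predicted visible surface along its viewing ray gets a positive z-offset and a point in front of it gets a negative one. In the pipeline described above, a vector like this, together with the voxel-aligned features, would be the input from which the MLP infers occupancy.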