Computer graphics are steadily becoming more realistic in both movies and video games, but Disney’s Brave and Rockstar Games’ L.A. Noire will one day look like amateurs’ work. Microsoft researchers have taken 3D modeling a step further with a new technique that leverages both motion capture and 3D scanning. It creates high-fidelity 3D images of the human face that depict not only large-scale features and expressions, but also the accompanying movement of human skin (such as wrinkling).
The researchers start by recording 3D facial performances made by an actor using a marker-based motion capture system (100 reflective dots are applied to the actor’s face). This recorded data then undergoes facial analysis to determine the minimal set of face scans required to accurately reconstruct the actor’s facial features. The scientists then use a laser scanner to capture those high-fidelity facial scans. Finally, they combine the motion capture data with the minimal set of face scans.
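To make the “minimal set of face scans” idea concrete, here is a rough Python sketch of how such a selection step could work: greedily adding the mocap frame that the already-selected frames approximate worst, until every recorded frame is well covered. The function, the least-squares fit, and the error threshold are illustrative assumptions, not the paper’s actual facial analysis algorithm.

```python
import numpy as np

def select_key_frames(marker_frames, max_error=1.0):
    """Greedily pick mocap frames to scan so that every recorded frame is
    well approximated by a combination of the selected ones.

    marker_frames: (num_frames, num_markers * 3) array of marker positions.
    Hypothetical simplification of the paper's facial analysis step.
    """
    selected = [0]  # always include the first (e.g. neutral) expression
    while True:
        basis = marker_frames[selected]                       # (k, d)
        # Least-squares fit of every frame as a combination of the selected frames
        coeffs, *_ = np.linalg.lstsq(basis.T, marker_frames.T, rcond=None)
        recon = (basis.T @ coeffs).T                          # (num_frames, d)
        errors = np.linalg.norm(marker_frames - recon, axis=1)
        worst = int(np.argmax(errors))
        if errors[worst] <= max_error or worst in selected:
            return selected                                   # indices of frames to scan
        selected.append(worst)                                # scan this expression as well
```

In this sketch, the returned indices would tell the team which expressions the actor needs to hold for the laser scanner; the real system decides this with a more principled analysis of the captured data.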
Xin Tong of Microsoft Research Asia is leading the research, with help from Jinxiang Chai, a Texas A&M professor, as well as Haoda Huang and Hsiang-Tao Wu, both also from Microsoft Research Asia. Tong says realistic face animation is the “holy grail” of computer graphics – the human face is powered by 52 muscles and is capable of so many facial expressions that current technology often makes the resulting animations look fake.
“We are very familiar with facial expressions, but also very sensitive in seeing any type of errors,” Tong said. “That means we need to capture facial expressions with a high level of detail and also capture very subtle facial details with high temporal resolution.”
The most obvious place Microsoft can use this technology is Avatar Kinect, which unsurprisingly also came out of Microsoft Research. If you were to wink, for example, Kinect could detect the facial expression and have your avatar on the screen do the same. Not every Microsoft Research project turns into a final product, but this is the most likely candidate.
“The character would be virtual, but the expressions real,” Tong said. “For teleconference applications, that could be very useful, for example, in a business meeting, where people are very sensitive to expressions and use them to know what people are thinking.”
Tong admits that there’s still work to be done. His team’s technology doesn’t yet capture synchronized eye or lip movements, for example. It also takes a lot of computing power and time – this is expected, but of course the researchers would like to reduce both (Tong wants the process to occur in real time).
Chai, Huang, Tong, and Wu will present their paper, titled “Leveraging Motion Capture and 3D Scanning for High-fidelity Facial Performance Acquisition” (PDF), at the SIGGRAPH 2011 computer graphics conference (August 7 to August 11, 2011) in Vancouver, British Columbia. If you want more technical details, but don’t want to read the whole 10-page paper, here’s the abstract:
This paper introduces a new approach for acquiring high-fidelity 3D facial performances with realistic dynamic wrinkles and fine-scale facial details. Our approach leverages state-of-the-art motion capture technology and advanced 3D scanning technology for facial performance acquisition. We start the process by recording 3D facial performances of an actor using a marker-based motion capture system and perform facial analysis on the captured data, thereby determining a minimal set of face scans required for accurate facial reconstruction. We introduce a two-step registration process to efficiently build dense consistent surface correspondences across all the face scans. We reconstruct high-fidelity 3D facial performances by combining motion capture data with the minimal set of face scans in the blendshape interpolation framework. We have evaluated the performance of our system on both real and synthetic data. Our results show that the system can capture facial performances that match both the spatial resolution of static face scans and the acquisition speed of motion capture systems.
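The “blendshape interpolation framework” mentioned in the abstract can be sketched roughly as follows: for each mocap frame, solve for blend weights that make the scans’ marker positions match the captured markers, then apply those same weights to the dense scan geometry. All names and the plain least-squares formulation below are simplifying assumptions; the actual method adds the two-step registration and further constraints that this sketch omits.

```python
import numpy as np

def blendshape_frame(marker_frame, scan_markers, scan_meshes):
    """Reconstruct one high-resolution frame from the minimal scan set.

    marker_frame: (m,) flattened marker positions captured for this frame.
    scan_markers: (k, m) marker positions corresponding to each face scan.
    scan_meshes:  (k, v) flattened high-resolution vertex positions per scan.
    Illustrative only, not the paper's exact formulation.
    """
    # Solve for blend weights so the weighted scans best match the captured markers
    weights, *_ = np.linalg.lstsq(scan_markers.T, marker_frame, rcond=None)
    # Crude normalisation toward affine weights (assumes the weights do not sum to zero)
    weights /= weights.sum()
    # Apply the same weights to the dense scan geometry
    return weights @ scan_meshes        # (v,) reconstructed high-res vertex positions
```

Run per frame, a loop like this would turn sparse marker trajectories plus a handful of detailed scans into a full-resolution animated face, which is essentially the trade-off the abstract describes: scan-level spatial detail at motion-capture speed.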