no code implementations • 26 Mar 2024 • Xingchao Yang, Takafumi Taketomi, Yuki Endo, Yoshihiro Kanamori
Although there is a trade-off between the two models, both are applicable to 3D facial makeup estimation and related applications.
1 code implementation • 5 Jan 2024 • Yuta Okuyama, Yuki Endo, Yoshihiro Kanamori
Because this initial textured body model has artifacts due to occlusion and an inaccurate body shape, the rendered image undergoes diffusion-based refinement, in which too strong noise destroys the body structure and identity, whereas too weak noise fails to remove the artifacts.
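This noise-strength trade-off is the same one that governs SDEdit-style image-to-image refinement. A minimal sketch using the Hugging Face diffusers img2img pipeline, where `strength` sets how much of the diffusion schedule is re-noised (the model ID, prompt, file names, and strength value are illustrative placeholders, not the paper's actual setup):

```python
# Sketch of diffusion-based refinement of a rendered body image.
# `strength` controls the injected noise: too high destroys body structure
# and identity, too low leaves the rendering artifacts intact.
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

rendered = Image.open("initial_textured_body_render.png").convert("RGB")

refined = pipe(
    prompt="a photo of a person, full body",  # placeholder prompt
    image=rendered,
    strength=0.4,        # fraction of the schedule to re-noise (moderate)
    guidance_scale=7.5,
).images[0]
refined.save("refined.png")
```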
no code implementations • 26 May 2023 • Takato Yoshikawa, Yuki Endo, Yoshihiro Kanamori
We propose a framework for text-guided full-body human image synthesis via an attention-based latent code mapper, which enables more disentangled control of StyleGAN than existing mappers.
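As a rough illustration only (the paper's actual mapper architecture is not reproduced here), an attention-based latent code mapper can be sketched as a cross-attention module in which the layer-wise StyleGAN W+ codes attend to text-token embeddings and predict per-layer offsets, so a text edit can act on some layers more strongly than others; all dimensions and module names below are assumptions:

```python
# Hypothetical attention-based latent code mapper for text-guided StyleGAN editing.
import torch
import torch.nn as nn

class AttentionLatentMapper(nn.Module):
    def __init__(self, w_dim=512, text_dim=512, num_heads=8):
        super().__init__()
        self.to_query = nn.Linear(w_dim, w_dim)    # queries from W+ layer codes
        self.to_kv = nn.Linear(text_dim, w_dim)    # keys/values from text tokens
        self.attn = nn.MultiheadAttention(w_dim, num_heads, batch_first=True)
        self.to_offset = nn.Linear(w_dim, w_dim)

    def forward(self, w_plus, text_tokens):
        # w_plus: (B, 18, 512) layer-wise latents; text_tokens: (B, T, 512)
        q = self.to_query(w_plus)
        kv = self.to_kv(text_tokens)
        attended, _ = self.attn(q, kv, kv)         # each layer attends to the text
        return w_plus + self.to_offset(attended)   # per-layer edited latents

# Usage: w_edit = mapper(w_plus, clip_text_embeddings), then feed w_edit to StyleGAN.
```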
no code implementations • 26 Feb 2023 • Xingchao Yang, Takafumi Taketomi, Yoshihiro Kanamori
The extracted makeup is well-aligned in the UV space, from which we build a large-scale makeup dataset and a parametric makeup model for 3D faces.
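To give a sense of what a parametric model over UV-aligned makeup textures can look like, here is a hedged sketch of a simple PCA-based linear model in the spirit of a morphable model; the dataset file, texture layout, and component count are placeholders, and the paper's actual model may differ:

```python
# Sketch: build a linear parametric makeup model by PCA over UV-aligned textures.
import numpy as np
from sklearn.decomposition import PCA

# N extracted makeup textures, each flattened from an (H, W, 4) UV map
# (RGB + alpha), all sharing the same UV parameterization (hypothetical file).
textures = np.load("uv_makeup_dataset.npy")       # shape (N, H*W*4)

pca = PCA(n_components=50)
coeffs = pca.fit_transform(textures)              # per-sample makeup parameters

# Synthesize a new makeup texture from low-dimensional parameters.
params = np.random.randn(50) * np.sqrt(pca.explained_variance_)
new_makeup = pca.mean_ + pca.components_.T @ params
```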
1 code implementation • 25 Jun 2021 • Yuki Endo, Yoshihiro Kanamori
To handle the individual factors that determine object styles, we propose a class- and layer-wise extension to the variational autoencoder (VAE) framework that allows flexible control over each object class, from local to global levels, by learning multiple latent spaces.
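A hypothetical sketch of the class-wise half of this idea: instead of one global latent, keep a separate latent space per semantic class, so each class's style can be sampled or swapped independently (the layer-wise, local-to-global codes are omitted for brevity, and all dimensions are illustrative):

```python
# Sketch: one VAE latent space per semantic class, pooled over class masks.
import torch
import torch.nn as nn

class ClassWiseVAEHead(nn.Module):
    def __init__(self, num_classes, feat_dim=256, z_dim=64):
        super().__init__()
        # One (mu, logvar) projection per semantic class.
        self.to_mu = nn.ModuleList([nn.Linear(feat_dim, z_dim) for _ in range(num_classes)])
        self.to_logvar = nn.ModuleList([nn.Linear(feat_dim, z_dim) for _ in range(num_classes)])

    def forward(self, feats, masks):
        # feats: (B, C, H, W) image features; masks: (B, K, H, W) one-hot class masks
        zs = []
        for k, (fmu, flv) in enumerate(zip(self.to_mu, self.to_logvar)):
            m = masks[:, k:k+1]                                    # (B, 1, H, W)
            pooled = (feats * m).sum(dim=(2, 3)) / m.sum(dim=(2, 3)).clamp(min=1e-6)
            mu, logvar = fmu(pooled), flv(pooled)
            zs.append(mu + torch.randn_like(mu) * (0.5 * logvar).exp())  # reparameterize
        return torch.stack(zs, dim=1)                              # (B, K, z_dim)
```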
1 code implementation • 27 Mar 2021 • Yuki Endo, Yoshihiro Kanamori
This paper tackles the challenging problem of generating photorealistic images from semantic layouts in few-shot scenarios, where annotated training pairs are hardly available because pixel-wise annotation is quite costly.
1 code implementation • 16 Oct 2019 • Yuki Endo, Yoshihiro Kanamori, Shigeru Kuriyama
Automatic generation of a high-quality video from a single image remains a challenging task despite the recent advances in deep generative models.
no code implementations • 7 Aug 2019 • Yoshihiro Kanamori, Yuki Endo
Based on supervised learning using convolutional neural networks (CNNs), we infer not only an albedo map and illumination but also a light transport map that encodes occlusion as nine spherical harmonics (SH) coefficients per pixel.
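The nine coefficients correspond to second-order SH bands, so given the three inferred maps, re-rendering reduces to a per-pixel dot product between transport and lighting, scaled by albedo. A minimal sketch with placeholder arrays (the real maps would come from the CNNs):

```python
# Sketch: reconstruct shading and the image from albedo, 9-coefficient
# per-pixel light transport, and 9-coefficient SH illumination.
import numpy as np

H, W = 256, 256
albedo = np.random.rand(H, W, 3)      # inferred albedo map (placeholder)
transport = np.random.rand(H, W, 9)   # inferred light transport: 9 SH coeffs per pixel
light = np.random.rand(9, 3)          # inferred illumination: 9 SH coeffs per RGB channel

shading = np.einsum("hwk,kc->hwc", transport, light)  # per-pixel, per-channel shading
rendered = albedo * shading                           # Lambertian re-rendering
```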