no code implementations • 3 Jan 2024 • Hexiang Hu, Kelvin C. K. Chan, Yu-Chuan Su, Wenhu Chen, Yandong Li, Kihyuk Sohn, Yang Zhao, Xue Ben, Boqing Gong, William Cohen, Ming-Wei Chang, Xuhui Jia
We introduce *multi-modal instruction* for image generation, a task representation articulating a range of generation intents with precision.
no code implementations • 5 Dec 2023 • Hsin-Ping Huang, Yu-Chuan Su, Deqing Sun, Lu Jiang, Xuhui Jia, Yukun Zhu, Ming-Hsuan Yang
To achieve detailed control, we propose a unified framework to jointly inject control signals into the existing text-to-video model.
no code implementations • ICCV 2023 • Yang Zhao, Tingbo Hou, Yu-Chuan Su, Xuhui Jia, Yandong Li, Matthias Grundmann
Authentic face restoration systems are increasingly in demand in many computer vision applications, e.g., image enhancement, video communication, and portrait capture.
no code implementations • 27 Apr 2023 • Kangning Liu, Yu-Chuan Su, Wei Hong, Ruijin Cang, Xuhui Jia
The one-shot talking-head synthesis task aims to animate a source image to another pose and expression, which is dictated by a driving frame.
no code implementations • 15 Apr 2023 • Hsin-Ping Huang, Yu-Chuan Su, Ming-Hsuan Yang
We tackle the long video generation problem, i.e., generating videos beyond the output length of video generation models.
no code implementations • 14 Apr 2023 • Yu-Chuan Su, Kelvin C. K. Chan, Yandong Li, Yang Zhao, Han Zhang, Boqing Gong, Huisheng Wang, Xuhui Jia
Our approach greatly reduces the overhead for personalized image generation and is more applicable in many potential applications.
no code implementations • 5 Apr 2023 • Xuhui Jia, Yang Zhao, Kelvin C. K. Chan, Yandong Li, Han Zhang, Boqing Gong, Tingbo Hou, Huisheng Wang, Yu-Chuan Su
This paper proposes a method for generating images of customized objects specified by users.
no code implementations • CVPR 2022 • Yang Zhao, Yu-Chuan Su, Chun-Te Chu, Yandong Li, Marius Renn, Yukun Zhu, Changyou Chen, Xuhui Jia
While existing approaches to face restoration make significant progress in generating high-quality faces, they often fail to preserve facial features and cannot reconstruct faces authentically.
1 code implementation • 26 Apr 2021 • Yu-Chuan Su, Soravit Changpinyo, Xiangning Chen, Sathish Thoppay, Cho-Jui Hsieh, Lior Shapira, Radu Soricut, Hartwig Adam, Matthew Brown, Ming-Hsuan Yang, Boqing Gong
To enable progress on this task, we create a new dataset consisting of 220K human-annotated 2.5D relationships among 512K objects from 11K images.
no code implementations • 15 Apr 2021 • Yu-Chuan Su, Raviteja Vemulapalli, Ben Weiss, Chun-Te Chu, Philip Andrew Mansfield, Lior Shapira, Colvin Pitts
To address this issue, we propose a deep learning-based approach that provides suggestions to the photographer on how to adjust the camera view before capturing.
no code implementations • CVPR 2019 • Yu-Chuan Su, Kristen Grauman
KTNs efficiently transfer convolution kernels from perspective images to the equirectangular projection of 360° images.
no code implementations • CVPR 2018 • Yu-Chuan Su, Kristen Grauman
Standard video encoders developed for conventional narrow field-of-view video are widely applied to 360° video as well, with reasonable results.
no code implementations • 12 Dec 2017 • Yu-Chuan Su, Kristen Grauman
Standard video encoders developed for conventional narrow field-of-view video are widely applied to 360° video as well, with reasonable results.
no code implementations • NeurIPS 2017 • Yu-Chuan Su, Kristen Grauman
While 360° cameras offer tremendous new possibilities in vision, graphics, and augmented reality, the spherical images they produce make core feature extraction non-trivial.
no code implementations • CVPR 2017 • Yu-Chuan Su, Kristen Grauman
360° video requires human viewers to actively control "where" to look while watching the video.
no code implementations • 1 Mar 2017 • Yu-Chuan Su, Kristen Grauman
360° video requires human viewers to actively control "where" to look while watching the video.
no code implementations • 7 Dec 2016 • Yu-Chuan Su, Dinesh Jayaraman, Kristen Grauman
AutoCam leverages NFOV web video to discriminatively identify space-time "glimpses" of interest at each time instant, and then uses dynamic programming to select optimal human-like camera trajectories.
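The trajectory-selection step can be illustrated with a generic Viterbi-style dynamic program (a sketch, not the paper's implementation: the `scores`, `positions`, and quadratic smoothness penalty below are illustrative assumptions — one glimpse is chosen per frame to maximize total interest score while penalizing abrupt camera motion):

```python
import numpy as np

def select_trajectory(scores, positions, smoothness=1.0):
    """Viterbi-style DP: pick one glimpse per frame, maximizing total
    glimpse score minus a smoothness penalty on camera motion.

    scores:    (T, K) array, interest score of candidate glimpse k at frame t
    positions: (K,) array, e.g. azimuth angle of each candidate glimpse
    """
    T, K = scores.shape
    # Transition penalty: squared angular distance between consecutive glimpses.
    penalty = smoothness * (positions[None, :] - positions[:, None]) ** 2

    best = scores[0].copy()              # best total score of a path ending at each glimpse
    back = np.zeros((T, K), dtype=int)   # backpointers for path recovery
    for t in range(1, T):
        cand = best[:, None] - penalty               # cand[prev, cur]
        back[t] = np.argmax(cand, axis=0)            # best predecessor for each glimpse
        best = cand[back[t], np.arange(K)] + scores[t]

    # Backtrack the optimal camera trajectory.
    path = [int(np.argmax(best))]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]
```

For example, with two candidate directions and a small smoothness weight, the program switches views only when the gain in glimpse score outweighs the motion penalty.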
no code implementations • 4 Apr 2016 • Yu-Chuan Su, Kristen Grauman
In a wearable camera video, we see what the camera wearer sees.
no code implementations • 1 Apr 2016 • Yu-Chuan Su, Kristen Grauman
Current approaches for activity recognition often ignore constraints on computational resources: 1) they rely on extensive feature computation to obtain rich descriptors on all frames, and 2) they assume batch-mode access to the entire test video at once.
no code implementations • 15 Sep 2014 • Yu-Chuan Su, Tzu-Hsuan Chiu, Chun-Yen Yeh, Hsin-Fu Huang, Winston H. Hsu
The same lack-of-training-sample problem limits the use of deep models on a wide range of computer vision problems where obtaining training data is difficult.