Search Results for author: Yuxuan Zhang

Found 22 papers, 8 papers with code

Solution for Point Tracking Task of ICCV 1st Perception Test Challenge 2023

no code implementations • 26 Mar 2024 • Hongpeng Pan, Yang Yang, Zhongtian Fu, Yuxuan Zhang, Shian Du, Yi Xu, Xiangyang Ji

To address this issue, we propose a simple yet effective approach called TAP with confident static points (TAPIR+), which focuses on rectifying the tracking of the static point in the videos shot by a static camera.

Motion Detection • Point Tracking • +2

Fast Personalized Text-to-Image Syntheses With Attention Injection

no code implementations • 17 Mar 2024 • Yuxuan Zhang, Yiren Song, Jinpeng Yu, Han Pan, Zhongliang Jing

Currently, personalized image generation methods mostly require considerable time to finetune and often overfit the concept, resulting in generated images that are similar to custom concepts but difficult to edit by prompts.

Text-to-Image Generation

Stable-Makeup: When Real-World Makeup Transfer Meets Diffusion Model

no code implementations • 12 Mar 2024 • Yuxuan Zhang, Lifu Wei, Qing Zhang, Yiren Song, Jiaming Liu, Huaxia Li, Xu Tang, Yao Hu, Haibo Zhao

Current makeup transfer methods are limited to simple makeup styles, making them difficult to apply in real-world scenarios.

Text-to-Image Generation

SSR-Encoder: Encoding Selective Subject Representation for Subject-Driven Generation

1 code implementation • 26 Dec 2023 • Yuxuan Zhang, Yiren Song, Jiaming Liu, Rui Wang, Jinpeng Yu, Hao Tang, Huaxia Li, Xu Tang, Yao Hu, Han Pan, Zhongliang Jing

Recent advancements in subject-driven image generation have led to zero-shot generation, yet precise selection and focus on crucial subject representations remain challenging.

Image Generation

CogAgent: A Visual Language Model for GUI Agents

1 code implementation • 14 Dec 2023 • Wenyi Hong, Weihan Wang, Qingsong Lv, Jiazheng Xu, Wenmeng Yu, Junhui Ji, Yan Wang, Zihan Wang, Yuxuan Zhang, Juanzi Li, Bin Xu, Yuxiao Dong, Ming Ding, Jie Tang

People are spending an enormous amount of time on digital devices through graphical user interfaces (GUIs), e.g., computer or smartphone screens.

Ranked #15 on Visual Question Answering on MM-Vet

Language Modelling • Visual Question Answering

5,161

FGNet: Towards Filling the Intra-class and Inter-class Gaps for Few-shot Segmentation

no code implementations • Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence Main Track 2023 • Yuxuan Zhang, Wei Yang, Shaowei Wang

In this paper, we propose a uniform network to fill both the gaps, termed FGNet.

Few-Shot Semantic Segmentation

DPP-based Client Selection for Federated Learning with Non-IID Data

no code implementations • 30 Mar 2023 • Yuxuan Zhang, Chao Xu, Howard H. Yang, Xijun Wang, Tony Q. S. Quek

This paper proposes a client selection (CS) method to tackle the communication bottleneck of federated learning (FL) while concurrently coping with FL's data heterogeneity issue.

Federated Learning

Video4MRI: An Empirical Study on Brain Magnetic Resonance Image Analytics with CNN-based Video Classification Frameworks

no code implementations • 24 Feb 2023 • Yuxuan Zhang, Qingzhong Wang, Jiang Bian, Yi Liu, Yanwu Xu, Dejing Dou, Haoyi Xiong

Due to the high similarity between MRI data and videos, we conduct extensive empirical studies on video recognition techniques for MRI classification to answer the questions: (1) can we directly use video recognition models for MRI classification, (2) which model is more appropriate for MRI, (3) are the common tricks like data augmentation in video recognition still useful for MRI classification?

Classification • Data Augmentation • +3

Shakes on a Plane: Unsupervised Depth Estimation from Unstabilized Photography

no code implementations • CVPR 2023 • Ilya Chugunov, Yuxuan Zhang, Felix Heide

Modern mobile burst photography pipelines capture and merge a short sequence of frames to recover an enhanced image, but often disregard the 3D nature of the scene they capture, treating pixel motion between images as a 2D aggregation problem.

Depth And Camera Motion • Pose Estimation

Neural Volume Super-Resolution

no code implementations • 9 Dec 2022 • Yuval Bahat, Yuxuan Zhang, Hendrik Sommerhoff, Andreas Kolb, Felix Heide

This allows us to super-resolve the 3D scene representation by applying 2D convolutional networks on the 2D feature planes.

Super-Resolution

An Attention-based Multi-Scale Feature Learning Network for Multimodal Medical Image Fusion

1 code implementation • 9 Dec 2022 • Meng Zhou, Xiaolan Xu, Yuxuan Zhang

Furthermore, we propose a novel fixed fusion strategy termed Softmax-based weighted strategy based on the Softmax weights and matrix nuclear norm.

An Edge Alignment-based Orientation Selection Method for Neutron Tomography

no code implementations • 1 Dec 2022 • Diyu Yang, Shimin Tang, Singanallur V. Venkatakrishnan, Mohammad S. N. Chowdhury, Yuxuan Zhang, Hassina Z. Bilheux, Gregery T. Buzzard, Charles A. Bouman

Neutron computed tomography (nCT) is a 3D characterization technique used to image the internal morphology or chemical composition of samples in biology and materials sciences.

Three-dimensional Microstructural Image Synthesis from 2D Backscattered Electron Image of Cement Paste

no code implementations • 4 Apr 2022 • Xin Zhao, Xu Wu, Lin Wang, Pengkun Hou, Qinfei Li, Yuxuan Zhang, Bo Yang

In experiments, the method is verified on actual 3D Micro-CT images and 2D BSE images.

Image Generation • Texture Synthesis

All You Need is RAW: Defending Against Adversarial Attacks with Camera Image Pipelines

1 code implementation • 16 Dec 2021 • Yuxuan Zhang, Bo Dong, Felix Heide

Various defense methods have proposed image-to-image mapping methods, either including these perturbations in the training process or removing them in a preprocessing denoising step.

Adversarial Defense • Denoising • +3

The Implicit Values of A Good Hand Shake: Handheld Multi-Frame Neural Depth Refinement

1 code implementation • CVPR 2022 • Ilya Chugunov, Yuxuan Zhang, Zhihao Xia, Xuaner Zhang, Jiawen Chen, Felix Heide

Modern smartphones can continuously stream multi-megapixel RGB images at 60Hz, synchronized with high-quality 3D pose information and low-resolution LiDAR-driven depth estimates.

CelebHair: A New Large-Scale Dataset for Hairstyle Recommendation based on CelebA

no code implementations • 14 Apr 2021 • Yutao Chen, Yuxuan Zhang, Zhongrui Huang, Zhenyao Luo, Jinpeng Chen

In this paper, we present a new large-scale dataset for hairstyle recommendation, CelebHair, based on the celebrity facial attributes dataset, CelebA.

Facial Landmark Detection

DatasetGAN: Efficient Labeled Data Factory with Minimal Human Effort

2 code implementations • CVPR 2021 • Yuxuan Zhang, Huan Ling, Jun Gao, Kangxue Yin, Jean-Francois Lafleche, Adela Barriuso, Antonio Torralba, Sanja Fidler

To showcase the power of our approach, we generated datasets for 7 image segmentation tasks which include pixel-level labels for 34 human face parts, and 32 car parts.

Decoder • Image Segmentation • +1

332

Adaptive Radar Detection and Classification Algorithms for Multiple Coherent Signals

no code implementations • 23 Dec 2020 • Sudan Han, Linjie Yan, Yuxuan Zhang, Pia Addabbo, Chengpeng Hao, Danilo Orlando

In this paper, we address the problem of target detection in the presence of coherent (or fully correlated) signals, which can be due to multipath propagation effects or electronic attacks by smart jammers.

General Classification

A prognostic dynamic model applicable to infectious diseases providing easily visualized guides -- A case study of COVID-19 in the UK

1 code implementation • 14 Dec 2020 • Yuxuan Zhang, Chen Gong, Dawei Li, Zhi-Wei Wang, Shengda D Pu, Alex W Robertson, Hong Yu, John Parrington

A reasonable prediction of infectious diseases transmission process under different disease control strategies is an important reference point for policy makers.

Image GANs meet Differentiable Rendering for Inverse Graphics and Interpretable 3D Neural Rendering

no code implementations • ICLR 2021 • Yuxuan Zhang, Wenzheng Chen, Huan Ling, Jun Gao, Yinan Zhang, Antonio Torralba, Sanja Fidler

Key to our approach is to exploit GANs as a multi-view data generator to train an inverse graphics network using an off-the-shelf differentiable renderer, and the trained inverse graphics network as a teacher to disentangle the GAN's latent code into interpretable 3D properties.

Neural Rendering

A Combined Data-driven and Physics-driven Method for Steady Heat Conduction Prediction using Deep Convolutional Neural Networks

no code implementations • 16 May 2020 • Hao Ma, Xiangyu Hu, Yuxuan Zhang, Nils Thuerey, Oskar J. Haidn

For the data-driven based method, the introduction of physical equation not only is able to speed up the convergence, but also produces physically more consistent solutions.

Deep Neural Network Fingerprinting by Conferrable Adversarial Examples

1 code implementation • ICLR 2021 • Nils Lukas, Yuxuan Zhang, Florian Kerschbaum

We propose a fingerprinting method for deep neural network classifiers that extracts a set of inputs from the source model so that only surrogates agree with the source model on the classification of such inputs.

Model extraction • Transfer Learning
