no code implementations • 4 Apr 2024 • Rinon Gal, Or Lichter, Elad Richardson, Or Patashnik, Amit H. Bermano, Gal Chechik, Daniel Cohen-Or
In this work, we explore the potential of using such shortcut-mechanisms to guide the personalization of text-to-image models to specific facial identities.
no code implementations • 21 Nov 2023 • Rinon Gal, Yael Vinker, Yuval Alaluf, Amit H. Bermano, Daniel Cohen-Or, Ariel Shamir, Gal Chechik
A sketch is one of the most intuitive and versatile tools humans use to convey their ideas visually.
no code implementations • 23 Oct 2023 • Roy Kapon, Guy Tevet, Daniel Cohen-Or, Amit H. Bermano
We introduce Multi-view Ancestral Sampling (MAS), a method for 3D motion generation, using 2D diffusion models that were trained on motions obtained from in-the-wild videos.
no code implementations • 11 Oct 2023 • Ryan Po, Wang Yifan, Vladislav Golyanik, Kfir Aberman, Jonathan T. Barron, Amit H. Bermano, Eric Ryan Chan, Tali Dekel, Aleksander Holynski, Angjoo Kanazawa, C. Karen Liu, Lingjie Liu, Ben Mildenhall, Matthias Nießner, Björn Ommer, Christian Theobalt, Peter Wonka, Gordon Wetzstein
The field of visual computing is rapidly advancing due to the emergence of generative artificial intelligence (AI), which unlocks unprecedented capabilities for the generation, editing, and reconstruction of images, videos, and 3D scenes.
no code implementations • 5 Oct 2023 • Ofir Bar Tal, Adi Haviv, Amit H. Bermano
Evasion Attacks (EA) are used to test the robustness of trained neural networks by perturbing input data to mislead the model into incorrect classifications.
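As an illustration of the evasion-attack setting (a standard baseline, not necessarily this paper's method), the classic Fast Gradient Sign Method perturbs an input along the sign of the loss gradient; a minimal numpy sketch with a toy gradient:

```python
import numpy as np

def fgsm_attack(x, grad, eps):
    """Fast Gradient Sign Method: nudge the input x in the direction
    that increases the loss, bounded by eps in the L-infinity norm."""
    return x + eps * np.sign(grad)

# Toy example: for a linear model, the loss gradient w.r.t. the input
# is proportional to the weight vector.
w = np.array([0.5, -1.0, 2.0])   # stand-in for the loss gradient
x = np.array([1.0, 1.0, 1.0])
x_adv = fgsm_attack(x, grad=w, eps=0.1)
```

In a real attack the gradient would come from backpropagating the classifier's loss to the input, and the perturbed example would be fed back to the model to test whether its prediction flips.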
no code implementations • 21 Sep 2023 • Ben Maman, Johannes Zeitler, Meinard Müller, Amit H. Bermano
Building on state-of-the-art diffusion-based generative models for music, we introduce performance conditioning - a simple tool that instructs the generative model to synthesize music with the style and timbre of specific instruments taken from specific performances.
no code implementations • 13 Jul 2023 • Moab Arar, Rinon Gal, Yuval Atzmon, Gal Chechik, Daniel Cohen-Or, Ariel Shamir, Amit H. Bermano
Text-to-image (T2I) personalization allows users to guide the creative image generation process by combining their own visual concepts in natural language prompts.
no code implementations • 11 Jun 2023 • Yotam Erel, Daisuke Iwai, Amit H. Bermano
We introduce a high-resolution, spatially adaptive light source, i.e., a projector, into a neural reflectance field, allowing both projector calibration and photorealistic light editing.
2 code implementations • 2 Mar 2023 • Yonatan Shafir, Guy Tevet, Roy Kapon, Amit H. Bermano
We evaluate the composition methods using an off-the-shelf motion diffusion model, and further compare the results to dedicated models trained for these specific tasks.
Ranked #4 on Motion Synthesis on InterHuman
no code implementations • 23 Feb 2023 • Rinon Gal, Moab Arar, Yuval Atzmon, Amit H. Bermano, Gal Chechik, Daniel Cohen-Or
Specifically, we employ two components: first, an encoder that takes as input a single image of a target concept from a given domain, e.g., a specific face, and learns to map it into a word embedding representing the concept.
1 code implementation • 12 Feb 2023 • Sigal Raab, Inbal Leibovitch, Guy Tevet, Moab Arar, Amit H. Bermano, Daniel Cohen-Or
We harness the power of diffusion models and present a denoising network explicitly designed for the task of learning from a single input motion.
1 code implementation • CVPR 2023 • Haim Sawdayee, Amir Vaxman, Amit H. Bermano
A modest neural network is trained on the input planes to return an inside/outside estimate for a given 3D coordinate, yielding a powerful prior that induces smoothness and self-similarities.
1 code implementation • 29 Sep 2022 • Guy Tevet, Sigal Raab, Brian Gordon, Yonatan Shafir, Daniel Cohen-Or, Amit H. Bermano
In this paper, we introduce Motion Diffusion Model (MDM), a carefully adapted classifier-free diffusion-based generative model for the human motion domain.
Ranked #1 on Motion Synthesis on HumanAct12
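Classifier-free guidance, which MDM adapts to the motion domain, combines conditional and unconditional noise predictions at sampling time; a minimal sketch of the guidance formula (the arrays are placeholders for the model's two predictions):

```python
import numpy as np

def cfg_noise(eps_uncond, eps_cond, scale):
    """Classifier-free guidance: extrapolate from the unconditional
    noise prediction toward the conditioned one by a guidance scale."""
    return eps_uncond + scale * (eps_cond - eps_uncond)

# scale=1 recovers the conditional prediction; scale>1 strengthens
# the condition (e.g., the text prompt) at some cost in diversity.
guided = cfg_noise(np.zeros(3), np.ones(3), scale=2.5)
```

Training alternates between conditioned and condition-dropped samples so a single network can produce both predictions.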
7 code implementations • 2 Aug 2022 • Rinon Gal, Yuval Alaluf, Yuval Atzmon, Or Patashnik, Amit H. Bermano, Gal Chechik, Daniel Cohen-Or
Yet, it is unclear how such freedom can be exercised to generate images of specific unique concepts, modify their appearance, or compose them in new roles and novel scenes.
1 code implementation • 28 Apr 2022 • Ben Maman, Amit H. Bermano
In order to overcome data collection barriers, previous AMT approaches attempt to employ musical scores in the form of a digitized version of the same song or piece.
1 code implementation • 15 Mar 2022 • Guy Tevet, Brian Gordon, Amir Hertz, Amit H. Bermano, Daniel Cohen-Or
MotionCLIP gains its unique power by aligning its latent space with that of the Contrastive Language-Image Pre-training (CLIP) model.
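The alignment described above can be illustrated with a cosine-similarity objective between a motion embedding and the CLIP embedding of its text label (a sketch of the general idea, not MotionCLIP's exact loss):

```python
import numpy as np

def cosine_alignment_loss(motion_emb, clip_emb):
    """1 minus cosine similarity; minimizing it pulls the motion
    encoder's output toward the CLIP embedding of the text label,
    aligning the two latent spaces."""
    m = motion_emb / np.linalg.norm(motion_emb)
    c = clip_emb / np.linalg.norm(clip_emb)
    return 1.0 - float(m @ c)
```

Because CLIP's space is already semantically organized, pulling motion embeddings into it transfers that structure to the motion domain.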
no code implementations • 28 Feb 2022 • Amit H. Bermano, Rinon Gal, Yuval Alaluf, Ron Mokady, Yotam Nitzan, Omer Tov, Or Patashnik, Daniel Cohen-Or
Of these, StyleGAN offers a fascinating case study, owing to its remarkable visual quality and an ability to support a large array of downstream tasks.
1 code implementation • 8 Feb 2022 • Yunzhe Liu, Rinon Gal, Amit H. Bermano, Baoquan Chen, Daniel Cohen-Or
We compare our models to a wide range of latent editing methods, and show that by alleviating the bias they achieve finer semantic control and better identity preservation through a wider range of transformations.
1 code implementation • 20 Jan 2022 • Rotem Tzaban, Ron Mokady, Rinon Gal, Amit H. Bermano, Daniel Cohen-Or
The ability of Generative Adversarial Networks to encode rich semantics within their latent space has been widely adopted for facial image editing.
no code implementations • 30 Dec 2021 • Dvir Yerushalmi, Dov Danon, Amit H. Bermano
In addition, we propose training a semantic segmentation network alongside the translation task, and leveraging its output as a loss term that improves robustness.
1 code implementation • CVPR 2022 • Moab Arar, Ariel Shamir, Amit H. Bermano
Vision Transformers (ViT) serve as powerful vision models.
Ranked #365 on Image Classification on ImageNet
1 code implementation • CVPR 2022 • Yuval Alaluf, Omer Tov, Ron Mokady, Rinon Gal, Amit H. Bermano
In this work, we introduce this approach into the realm of encoder-based inversion.
4 code implementations • 18 Nov 2021 • Ron Mokady, Amir Hertz, Amit H. Bermano
Image captioning is a fundamental task in vision-language understanding, where the model predicts an informative textual caption for a given input image.
Ranked #1 on Image Captioning on Conceptual Captions
1 code implementation • 17 Jun 2021 • Ron Mokady, Rotem Tzaban, Sagie Benaim, Amit H. Bermano, Daniel Cohen-Or
To alleviate this problem, we introduce JOKR - a JOint Keypoint Representation that captures the motion common to both the source and target videos, without requiring any object prior or data collection.
3 code implementations • 10 Jun 2021 • Daniel Roich, Ron Mokady, Amit H. Bermano, Daniel Cohen-Or
The key idea is pivotal tuning - a brief training process that preserves the editing quality of an in-domain latent region, while changing its portrayed identity and appearance.
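A toy numpy sketch of the two-stage pivotal-tuning idea - invert to a pivot latent with the generator frozen, then briefly fine-tune the generator weights with the pivot held fixed (a linear "generator" stands in for StyleGAN; all names and sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 2))      # frozen pretrained "generator" weights
target = rng.normal(size=4)      # "image" to invert

# Stage 1: inversion - find the pivot latent for the frozen generator.
w_pivot, *_ = np.linalg.lstsq(A, target, rcond=None)

# Stage 2: pivotal tuning - keep w_pivot fixed and briefly fine-tune
# the generator weights to close the remaining reconstruction gap.
A_tuned = A.copy()
lr = 0.1 / (1.0 + w_pivot @ w_pivot)   # step size kept safely small
for _ in range(200):
    residual = A_tuned @ w_pivot - target
    A_tuned -= lr * np.outer(residual, w_pivot)  # descent on ||g(w) - x||^2
```

The point of the pivot is that editing directions around an in-domain latent remain meaningful, while the brief tuning closes the identity gap left by inversion alone.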
no code implementations • 27 May 2021 • Amir Barda, Yotam Erel, Amit H. Bermano
Mesh-based learning is currently one of the most popular approaches for learning shapes.
no code implementations • 26 Mar 2019 • Felix Petersen, Amit H. Bermano, Oliver Deussen, Daniel Cohen-Or
The long-coveted task of reconstructing 3D geometry from images remains an open problem.