Search Results for author: Iro Laina

Found 33 papers, 13 papers with code

Invisible Stitch: Generating Smooth 3D Scenes with Depth Inpainting

no code implementations • 30 Apr 2024 • Paul Engstler, Andrea Vedaldi, Iro Laina, Christian Rupprecht

These works often depend on pre-trained monocular depth estimators to lift the generated images into 3D, fusing them with the existing scene representation.

Benchmarking Depth Completion +2

DGE: Direct Gaussian 3D Editing by Consistent Multi-view Editing

no code implementations • 29 Apr 2024 • Minghao Chen, Iro Laina, Andrea Vedaldi

However, this is often slow as it requires updating a computationally expensive 3D representation such as a neural radiance field, and doing so using contradictory guidance from a 2D model which is inherently not multi-view consistent.

N2F2: Hierarchical Scene Understanding with Nested Neural Feature Fields

no code implementations • 16 Mar 2024 • Yash Bhalgat, Iro Laina, João F. Henriques, Andrew Zisserman, Andrea Vedaldi

To address this, we introduce Nested Neural Feature Fields (N2F2), a novel approach that employs hierarchical supervision to learn a single feature field, wherein different dimensions within the same high-dimensional feature encode scene properties at varying granularities.

Scene Understanding

IM-3D: Iterative Multiview Diffusion and Reconstruction for High-Quality 3D Generation

no code implementations • 13 Feb 2024 • Luke Melas-Kyriazi, Iro Laina, Christian Rupprecht, Natalia Neverova, Andrea Vedaldi, Oran Gafni, Filippos Kokkinos

A mitigation is to fine-tune the 2D generator to be multi-view aware, which can help distillation or can be combined with reconstruction networks to output 3D objects directly.

3D Generation 3D Reconstruction +1

SHAP-EDITOR: Instruction-guided Latent 3D Editing in Seconds

no code implementations • 14 Dec 2023 • Minghao Chen, Junyu Xie, Iro Laina, Andrea Vedaldi

In particular, we hypothesise that editing can be greatly simplified by first encoding 3D objects in a suitable latent space.

Understanding Self-Supervised Features for Learning Unsupervised Instance Segmentation

no code implementations • 24 Nov 2023 • Paul Engstler, Luke Melas-Kyriazi, Christian Rupprecht, Iro Laina

Self-supervised learning (SSL) can be used to solve complex visual tasks without human labels.

Ranked #2 on Unsupervised Instance Segmentation on COCO val2017

Instance Segmentation Segmentation +3

Diffusion Models for Zero-Shot Open-Vocabulary Segmentation

no code implementations • 15 Jun 2023 • Laurynas Karazija, Iro Laina, Andrea Vedaldi, Christian Rupprecht

This provides a distribution of appearances for a given text, circumventing the ambiguity problem.

EPIC Fields: Marrying 3D Geometry and Video Understanding

1 code implementation • NeurIPS 2023 • Vadim Tschernezki, Ahmad Darkhalil, Zhifan Zhu, David Fouhey, Iro Laina, Diane Larlus, Dima Damen, Andrea Vedaldi

Compared to other neural rendering datasets, EPIC Fields is better tailored to video understanding because it is paired with labelled action segments and the recent VISOR segment annotations.

Neural Rendering Video Understanding

Contrastive Lift: 3D Object Instance Segmentation by Slow-Fast Contrastive Fusion

1 code implementation • NeurIPS 2023 • Yash Bhalgat, Iro Laina, João F. Henriques, Andrew Zisserman, Andrea Vedaldi

Our approach outperforms the state-of-the-art on challenging scenes from the ScanNet, Hypersim, and Replica datasets, as well as on our newly created Messy Rooms dataset, demonstrating the effectiveness and scalability of our slow-fast clustering method.

Clustering Instance Segmentation +2

Training-Free Layout Control with Cross-Attention Guidance

1 code implementation • 6 Apr 2023 • Minghao Chen, Iro Laina, Andrea Vedaldi

We thoroughly evaluate our approach on three benchmarks and provide several qualitative examples and a comparative analysis of the two strategies that demonstrate the superiority of backward guidance compared to forward guidance, as well as prior work.

RealFusion: 360° Reconstruction of Any Object from a Single Image

3 code implementations • 21 Feb 2023 • Luke Melas-Kyriazi, Christian Rupprecht, Iro Laina, Andrea Vedaldi

We consider the problem of reconstructing a full 360° photographic model of an object from a single image of it.

3D Reconstruction Object

RealFusion: 360° Reconstruction of Any Object From a Single Image

no code implementations • CVPR 2023 • Luke Melas-Kyriazi, Iro Laina, Christian Rupprecht, Andrea Vedaldi

We consider the problem of reconstructing a full 360° photographic model of an object from a single image of it.

3D Reconstruction Object

Unsupervised Multi-object Segmentation by Predicting Probable Motion Patterns

no code implementations • 21 Oct 2022 • Laurynas Karazija, Subhabrata Choudhury, Iro Laina, Christian Rupprecht, Andrea Vedaldi

We propose a new approach to learn to segment multiple image objects without manual supervision.

Image Reconstruction Object +2

Neural Feature Fusion Fields: 3D Distillation of Self-Supervised 2D Image Representations

no code implementations • 7 Sep 2022 • Vadim Tschernezki, Iro Laina, Diane Larlus, Andrea Vedaldi

We present Neural Feature Fusion Fields (N3F), a method that improves dense 2D image feature extractors when the latter are applied to the analysis of multiple images reconstructible as a 3D scene.

Neural Rendering Retrieval

Measuring the Interpretability of Unsupervised Representations via Quantized Reverse Probing

no code implementations • 7 Sep 2022 • Iro Laina, Yuki M. Asano, Andrea Vedaldi

Self-supervised visual representation learning has recently attracted significant research interest.

Representation Learning

Guess What Moves: Unsupervised Video and Image Segmentation by Anticipating Motion

no code implementations • 16 May 2022 • Subhabrata Choudhury, Laurynas Karazija, Iro Laina, Andrea Vedaldi, Christian Rupprecht

Motion, measured via optical flow, provides a powerful cue to discover and learn objects in images and videos.

Ranked #4 on Unsupervised Object Segmentation on SegTrack-v2

Image Segmentation Optical Flow Estimation +6

Deep Spectral Methods: A Surprisingly Strong Baseline for Unsupervised Semantic Segmentation and Localization

1 code implementation • CVPR 2022 • Luke Melas-Kyriazi, Christian Rupprecht, Iro Laina, Andrea Vedaldi

We find that these eigenvectors already decompose an image into meaningful segments, and can be readily used to localize objects in a scene.

graph partitioning Segmentation +1

ClevrTex: A Texture-Rich Benchmark for Unsupervised Multi-Object Segmentation

1 code implementation • 19 Nov 2021 • Laurynas Karazija, Iro Laina, Christian Rupprecht

We benchmark a large set of recent unsupervised multi-object segmentation models on ClevrTex and find all state-of-the-art approaches fail to learn good representations in the textured setting, despite impressive performance on simpler data.

Ranked #3 on Unsupervised Object Segmentation on ClevrTex

Segmentation Semantic Segmentation +1

Unsupervised Part Discovery from Contrastive Reconstruction

1 code implementation • NeurIPS 2021 • Subhabrata Choudhury, Iro Laina, Christian Rupprecht, Andrea Vedaldi

First, we construct a proxy task through a set of objectives that encourages the model to learn a meaningful decomposition of the image into its parts.

Ranked #1 on Unsupervised Keypoint Estimation on CUB

Representation Learning Unsupervised Keypoint Estimation

The Curious Layperson: Fine-Grained Image Recognition without Expert Labels

1 code implementation • 5 Nov 2021 • Subhabrata Choudhury, Iro Laina, Christian Rupprecht, Andrea Vedaldi

We then train a fine-grained textual similarity model that matches image descriptions with documents on a sentence-level basis.

Cross-Modal Retrieval Fine-Grained Image Recognition +2

Measuring the Interpretability of Unsupervised Representations via Quantized Reversed Probing

no code implementations • ICLR 2022 • Iro Laina, Yuki M Asano, Andrea Vedaldi

Self-supervised visual representation learning has attracted significant research interest.

Ranked #90 on Image Classification on ObjectNet (using extra training data)

Image Classification Representation Learning

Finding an Unsupervised Image Segmenter in Each of Your Deep Generative Models

1 code implementation • ICLR 2022 • Luke Melas-Kyriazi, Christian Rupprecht, Iro Laina, Andrea Vedaldi

Recent research has shown that numerous human-interpretable directions exist in the latent space of GANs.

Image Segmentation Segmentation +1

Quantifying Learnability and Describability of Visual Concepts Emerging in Representation Learning

no code implementations • NeurIPS 2020 • Iro Laina, Ruth C. Fong, Andrea Vedaldi

The increasing impact of black box models, and particularly of unsupervised ones, comes with an increasing interest in tools to understand and interpret them.

Clustering Representation Learning

Semantic Image Manipulation Using Scene Graphs

1 code implementation • CVPR 2020 • Helisa Dhamo, Azade Farshad, Iro Laina, Nassir Navab, Gregory D. Hager, Federico Tombari, Christian Rupprecht

In our work, we address the novel problem of image manipulation from scene graphs, in which a user can edit images by merely applying changes in the nodes or edges of a semantic graph that is generated from the image.

Image Inpainting Image Manipulation +1

Towards Unsupervised Image Captioning with Shared Multimodal Embeddings

no code implementations • ICCV 2019 • Iro Laina, Christian Rupprecht, Nassir Navab

The core component of our approach is a shared latent space that is structured by visual concepts.

Image Captioning Language Modelling +3

2017 Robotic Instrument Segmentation Challenge

3 code implementations • 18 Feb 2019 • Max Allan, Alex Shvets, Thomas Kurmann, Zichen Zhang, Rahul Duggal, Yun-Hsuan Su, Nicola Rieke, Iro Laina, Niveditha Kalavakonda, Sebastian Bodenstedt, Luis Herrera, Wenqi Li, Vladimir Iglovikov, Huoling Luo, Jian Yang, Danail Stoyanov, Lena Maier-Hein, Stefanie Speidel, Mahdi Azizian

In mainstream computer vision and machine learning, public datasets such as ImageNet, COCO and KITTI have helped drive enormous improvements by enabling researchers to understand the strengths and limitations of different algorithms via performance comparison.

Benchmarking Person Re-Identification +2

Dealing with Ambiguity in Robotic Grasping via Multiple Predictions

no code implementations • 2 Nov 2018 • Ghazal Ghazaei, Iro Laina, Christian Rupprecht, Federico Tombari, Nassir Navab, Kianoush Nazarpour

Further, we reformulate the problem of robotic grasping by replacing conventional grasp rectangles with grasp belief maps, which hold more precise location information than a rectangle and account for the uncertainty inherent to the task.

Robotic Grasping

Peeking Behind Objects: Layered Depth Prediction from a Single Image

no code implementations • 23 Jul 2018 • Helisa Dhamo, Keisuke Tateno, Iro Laina, Nassir Navab, Federico Tombari

While conventional depth estimation can infer the geometry of a scene from a single RGB image, it fails to estimate scene regions that are occluded by foreground objects.

Depth Estimation Depth Prediction

Guide Me: Interacting with Deep Networks

no code implementations • CVPR 2018 • Christian Rupprecht, Iro Laina, Nassir Navab, Gregory D. Hager, Federico Tombari

Interaction and collaboration between humans and intelligent machines have become increasingly important as machine learning methods move into real-world applications that involve end users.

Image Captioning Image Generation

CNN-SLAM: Real-time dense monocular SLAM with learned depth prediction

1 code implementation • CVPR 2017 • Keisuke Tateno, Federico Tombari, Iro Laina, Nassir Navab

Given the recent advances in depth prediction from Convolutional Neural Networks (CNNs), this paper investigates how predicted depth maps from a deep neural network can be deployed for accurate and dense monocular reconstruction.

Depth Estimation Depth Prediction +1