Search Results for author: Michael Sapienza

Found 11 papers, 5 papers with code

SUTRA: Scalable Multilingual Language Model Architecture

no code implementations • 7 May 2024 • Abhijit Bendale, Michael Sapienza, Steven Ripplinger, Simon Gibbs, Jaewon Lee, Pranav Mistry

In this paper, we introduce SUTRA, a multilingual Large Language Model architecture capable of understanding, reasoning, and generating text in over 50 languages.

Computational Efficiency Hallucination +2

Straight to Shapes++: Real-time Instance Segmentation Made More Accurate

1 code implementation • 27 May 2019 • Laurynas Miksys, Saumya Jetley, Michael Sapienza, Stuart Golodetz, Philip H. S. Torr

The STS model can run at 35 FPS on a high-end desktop, but its accuracy is significantly worse than that of offline state-of-the-art methods.

Autonomous Driving Data Augmentation +5

InfiniTAM v3: A Framework for Large-Scale 3D Reconstruction with Loop Closure

1 code implementation • 2 Aug 2017 • Victor Adrian Prisacariu, Olaf Kähler, Stuart Golodetz, Michael Sapienza, Tommaso Cavallari, Philip H. S. Torr, David W. Murray

Representing the reconstruction volumetrically as a TSDF leads to most of the simplicity and efficiency that can be achieved with GPU implementations of these systems.

3D Reconstruction Simultaneous Localization and Mapping

Spatio-temporal Human Action Localisation and Instance Segmentation in Temporally Untrimmed Videos

no code implementations • 22 Jul 2017 • Suman Saha, Gurkirt Singh, Michael Sapienza, Philip H. S. Torr, Fabio Cuzzolin

Current state-of-the-art human action recognition is focused on the classification of temporally trimmed videos in which only one action occurs per frame.

Action Recognition Instance Segmentation +2

Incremental Tube Construction for Human Action Detection

1 code implementation • 5 Apr 2017 • Harkirat Singh Behl, Michael Sapienza, Gurkirt Singh, Suman Saha, Fabio Cuzzolin, Philip H. S. Torr

In this work, we introduce a real-time and online joint-labelling and association algorithm for action detection that can incrementally construct space-time action tubes on the most challenging action videos in which different action categories occur concurrently.

Action Detection

Online Real-time Multiple Spatiotemporal Action Localisation and Prediction

4 code implementations • ICCV 2017 • Gurkirt Singh, Suman Saha, Michael Sapienza, Philip Torr, Fabio Cuzzolin

To the best of our knowledge, ours is the first real-time (up to 40fps) system able to perform online S/T action localisation and early action prediction on the untrimmed videos of UCF101-24.

Early Action Prediction

Straight to Shapes: Real-time Detection of Encoded Shapes

1 code implementation • CVPR 2017 • Saumya Jetley, Michael Sapienza, Stuart Golodetz, Philip H. S. Torr

To achieve this, we use a denoising convolutional auto-encoder to establish an embedding space, and place the decoder after a fast end-to-end network trained to regress directly to the encoded shape vectors.

Ranked #5 on Semantic Contour Prediction on Sbd val

Decoder Denoising +2

Deep Learning for Detecting Multiple Space-Time Action Tubes in Videos

no code implementations • 4 Aug 2016 • Suman Saha, Gurkirt Singh, Michael Sapienza, Philip H. S. Torr, Fabio Cuzzolin

In stage 2, the appearance network detections are boosted by combining them with the motion detection scores, in proportion to their respective spatial overlap.

Action Detection Motion Detection +1

Joint Object-Material Category Segmentation from Audio-Visual Cues

no code implementations • 10 Jan 2016 • Anurag Arnab, Michael Sapienza, Stuart Golodetz, Julien Valentin, Ondrej Miksik, Shahram Izadi, Philip Torr

It is not always possible to recognise objects and infer material properties for a scene from visual cues alone, since objects can look visually similar whilst being made of very different materials.

Object

SemanticPaint: A Framework for the Interactive Segmentation of 3D Scenes

no code implementations • 13 Oct 2015 • Stuart Golodetz, Michael Sapienza, Julien P. C. Valentin, Vibhav Vineet, Ming-Ming Cheng, Anurag Arnab, Victor A. Prisacariu, Olaf Kähler, Carl Yuheng Ren, David W. Murray, Shahram Izadi, Philip H. S. Torr

We present an open-source, real-time implementation of SemanticPaint, a system for geometric reconstruction, object-class segmentation and learning of 3D scenes.

Interactive Segmentation Segmentation

Feature sampling and partitioning for visual vocabulary generation on large action classification datasets

no code implementations • 29 May 2014 • Michael Sapienza, Fabio Cuzzolin, Philip H. S. Torr

The recent trend in action recognition is towards larger datasets, an increasing number of action classes and larger visual vocabularies.

Action Classification Action Recognition +2
