no code implementations • 7 May 2024 • Abhijit Bendale, Michael Sapienza, Steven Ripplinger, Simon Gibbs, Jaewon Lee, Pranav Mistry
In this paper, we introduce SUTRA, multilingual Large Language Model architecture capable of understanding, reasoning, and generating text in over 50 languages.
1 code implementation • 27 May 2019 • Laurynas Miksys, Saumya Jetley, Michael Sapienza, Stuart Golodetz, Philip H. S. Torr
The STS model can run at 35 FPS on a high-end desktop, but its accuracy is significantly worse than that of offline state-of-the-art methods.
1 code implementation • 2 Aug 2017 • Victor Adrian Prisacariu, Olaf Kähler, Stuart Golodetz, Michael Sapienza, Tommaso Cavallari, Philip H. S. Torr, David W. Murray
Representing the reconstruction volumetrically as a TSDF leads to most of the simplicity and efficiency that can be achieved with GPU implementations of these systems.
no code implementations • 22 Jul 2017 • Suman Saha, Gurkirt Singh, Michael Sapienza, Philip H. S. Torr, Fabio Cuzzolin
Current state-of-the-art human action recognition is focused on the classification of temporally trimmed videos in which only one action occurs per frame.
1 code implementation • 5 Apr 2017 • Harkirat Singh Behl, Michael Sapienza, Gurkirt Singh, Suman Saha, Fabio Cuzzolin, Philip H. S. Torr
In this work, we introduce a real-time and online joint-labelling and association algorithm for action detection that can incrementally construct space-time action tubes on the most challenging action videos in which different action categories occur concurrently.
4 code implementations • ICCV 2017 • Gurkirt Singh, Suman Saha, Michael Sapienza, Philip Torr, Fabio Cuzzolin
To the best of our knowledge, ours is the first real-time (up to 40fps) system able to perform online S/T action localisation and early action prediction on the untrimmed videos of UCF101-24.
1 code implementation • CVPR 2017 • Saumya Jetley, Michael Sapienza, Stuart Golodetz, Philip H. S. Torr
To achieve this, we use a denoising convolutional auto-encoder to establish an embedding space, and place the decoder after a fast end-to-end network trained to regress directly to the encoded shape vectors.
Ranked #5 on Semantic Contour Prediction on Sbd val
no code implementations • 4 Aug 2016 • Suman Saha, Gurkirt Singh, Michael Sapienza, Philip H. S. Torr, Fabio Cuzzolin
In stage 2, the appearance network detections are boosted by combining them with the motion detection scores, in proportion to their respective spatial overlap.
no code implementations • 10 Jan 2016 • Anurag Arnab, Michael Sapienza, Stuart Golodetz, Julien Valentin, Ondrej Miksik, Shahram Izadi, Philip Torr
It is not always possible to recognise objects and infer material properties for a scene from visual cues alone, since objects can look visually similar whilst being made of very different materials.
no code implementations • 13 Oct 2015 • Stuart Golodetz, Michael Sapienza, Julien P. C. Valentin, Vibhav Vineet, Ming-Ming Cheng, Anurag Arnab, Victor A. Prisacariu, Olaf Kähler, Carl Yuheng Ren, David W. Murray, Shahram Izadi, Philip H. S. Torr
We present an open-source, real-time implementation of SemanticPaint, a system for geometric reconstruction, object-class segmentation and learning of 3D scenes.
no code implementations • 29 May 2014 • Michael Sapienza, Fabio Cuzzolin, Philip H. S. Torr
The recent trend in action recognition is towards larger datasets, an increasing number of action classes and larger visual vocabularies.