1 code implementation • 29 Sep 2020 • Orestis Zachariadis, Nitin Satpute, Juan Gómez-Luna, Joaquín Olivares
The key idea of our sparse general matrix-matrix multiplication (spGEMM) algorithm, tSparse, is to multiply sparse rectangular blocks using the mixed-precision mode of Tensor Core Units (TCUs).
Mathematical Software • Distributed, Parallel, and Cluster Computing • Performance
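The block-wise idea in the tSparse entry above can be illustrated on the CPU. The following is a minimal sketch, not the authors' implementation: it assumes square 16x16 tiles (a simplification of the rectangular blocks mentioned), emulates mixed precision by storing tiles in float16 and accumulating in float32, and all function names (`to_tiles`, `tiled_spgemm`, `from_tiles`) are hypothetical.

```python
import numpy as np

TILE = 16  # assumed tile edge, chosen only for illustration


def to_tiles(dense, tile=TILE):
    """Keep only the nonzero tiles of a dense matrix, stored in float16."""
    rows, cols = dense.shape
    assert rows % tile == 0 and cols % tile == 0
    tiles = {}
    for bi in range(rows // tile):
        for bj in range(cols // tile):
            block = dense[bi * tile:(bi + 1) * tile, bj * tile:(bj + 1) * tile]
            if np.any(block):
                tiles[(bi, bj)] = block.astype(np.float16)
    return tiles


def tiled_spgemm(a_tiles, b_tiles, tile=TILE):
    """Multiply two tiled sparse matrices; only matching nonzero tiles are touched."""
    # Index the tiles of B by block row so matches for each A tile are found quickly.
    b_by_row = {}
    for (bk, bj), b_blk in b_tiles.items():
        b_by_row.setdefault(bk, []).append((bj, b_blk))

    c_tiles = {}
    for (bi, bk), a_blk in a_tiles.items():
        for bj, b_blk in b_by_row.get(bk, []):
            # Half-precision inputs, single-precision product and accumulation.
            prod = a_blk.astype(np.float32) @ b_blk.astype(np.float32)
            acc = c_tiles.setdefault((bi, bj), np.zeros((tile, tile), np.float32))
            acc += prod
    return c_tiles


def from_tiles(tiles, shape, tile=TILE):
    """Reassemble a dense float32 matrix from its tiles."""
    out = np.zeros(shape, dtype=np.float32)
    for (bi, bj), blk in tiles.items():
        out[bi * tile:(bi + 1) * tile, bj * tile:(bj + 1) * tile] = blk
    return out
```

On a GPU, each tile product would map to a tensor-core matrix multiply-accumulate. Here the dictionary of tiles only stands in for whatever sparse block format a real implementation uses.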
1 code implementation • 13 Apr 2020 • Orestis Zachariadis, Andrea Teatini, Nitin Satpute, Juan Gómez-Luna, Onur Mutlu, Ole Jakob Elle, Joaquín Olivares
In this paper, we introduce a novel GPU implementation of B-spline interpolation (BSI) to accelerate the calculation of the deformation field in non-rigid image registration algorithms.
Distributed, Parallel, and Cluster Computing • Image and Video Processing
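As a rough illustration of what B-spline interpolation (BSI) of a deformation field computes, here is a minimal one-dimensional CPU sketch using the standard uniform cubic B-spline basis. It is not the GPU implementation from the entry above. The function names, the 1D setting, and the convention that each tile of `spacing` samples is governed by four consecutive control points are assumptions made for illustration.

```python
import math


def bspline_weights(u):
    """Uniform cubic B-spline basis weights for a fractional offset u in [0, 1)."""
    return (
        (1 - u) ** 3 / 6,
        (3 * u ** 3 - 6 * u ** 2 + 4) / 6,
        (-3 * u ** 3 + 3 * u ** 2 + 3 * u + 1) / 6,
        u ** 3 / 6,
    )


def deformation_1d(control, spacing, n_samples):
    """Interpolate a dense 1D displacement field from a coarse control-point grid.

    control must hold at least floor((n_samples - 1) / spacing) + 4 values.
    """
    field = []
    for x in range(n_samples):
        t = x / spacing
        tile = math.floor(t)   # index of the tile containing sample x
        u = t - tile           # position inside that tile
        w = bspline_weights(u)
        # Each sample blends the four control points that surround its tile.
        field.append(sum(w[l] * control[tile + l] for l in range(4)))
    return field
```

The four weights always sum to one, so a constant control grid yields a constant displacement. In 3D the same weights are applied separably along each axis, giving 64 control points per voxel, and that per-voxel work is what a GPU implementation parallelizes.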