no code implementations • 9 Apr 2024 • Masato Fujitake
In this paper, we present a method for enhancing the accuracy of scene text recognition tasks by judging whether the image and text match each other.
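The idea of judging whether an image and a text string match can be illustrated with a toy similarity check. The sketch below is purely illustrative (the embedding functions, names, and threshold are invented assumptions, not the paper's method): it scores a candidate transcription against an image embedding and accepts it only above a similarity threshold.

```python
import numpy as np

# Toy sketch of image-text match verification. All names and the
# threshold are illustrative assumptions, not the paper's actual method.

def embed_text(text: str, dim: int = 8) -> np.ndarray:
    """Deterministic toy text embedding (character-histogram based)."""
    v = np.zeros(dim)
    for ch in text:
        v[ord(ch) % dim] += 1.0
    return v / (np.linalg.norm(v) + 1e-8)

def match_score(image_emb: np.ndarray, text: str) -> float:
    """Cosine similarity between an image embedding and a text embedding."""
    t = embed_text(text, dim=image_emb.shape[0])
    denom = np.linalg.norm(image_emb) * np.linalg.norm(t) + 1e-8
    return float(image_emb @ t / denom)

def is_match(image_emb: np.ndarray, text: str, threshold: float = 0.9) -> bool:
    """Accept the transcription only if it scores above the threshold."""
    return match_score(image_emb, text) >= threshold

# Pretend the image's embedding equals the embedding of its true label:
img = embed_text("stop")
print(is_match(img, "stop"), is_match(img, "exit"))  # True False
```

A real system would learn both embeddings jointly; the point here is only the verification step, which filters out recognition hypotheses that disagree with the image.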
no code implementations • 21 Mar 2024 • Masato Fujitake
By leveraging the strengths of existing research in document image understanding and the superior language understanding capabilities of LLMs, the proposed model, fine-tuned on multimodal instruction datasets, handles document image understanding within a single model.
no code implementations • 28 Dec 2023 • Masato Fujitake
Therefore, we propose a deep reinforcement learning localization method for logo recognition (RL-LOGO).
no code implementations • 31 Oct 2023 • Yuki Okumura, Masato Fujitake
The FA team participated in the Table Data Extraction (TDE) and Text-to-Table Relationship Extraction (TTRE) tasks of the NTCIR-17 Understanding of Non-Financial Objects in Financial Reports (UFO).
no code implementations • 30 Aug 2023 • Masato Fujitake
Typical text recognition methods rely on an encoder-decoder structure, in which the encoder extracts features from an image, and the decoder produces recognized text from these features.
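The encoder-decoder pattern described above can be sketched minimally. This is a deliberately simplified illustration (the feature extraction, character set, and decoding rule are all invented assumptions, not any paper's architecture): an "encoder" collapses an image into a feature sequence, and a "decoder" maps those features to characters until an end token.

```python
import numpy as np

# Minimal toy of the encoder-decoder structure for text recognition.
# Real encoders are CNNs/Transformers and real decoders use attention;
# everything here is an invented illustration of the data flow only.

CHARSET = list("abc") + ["<eos>"]

def encode(image: np.ndarray) -> np.ndarray:
    """'Encoder': collapse the image into a per-column feature sequence."""
    return image.mean(axis=0)  # shape: (width,)

def decode(features: np.ndarray, max_len: int = 10) -> str:
    """'Decoder': map each feature to a character, stopping at <eos>."""
    out = []
    for f in features[:max_len]:
        idx = min(int(f * len(CHARSET)), len(CHARSET) - 1)
        ch = CHARSET[idx]
        if ch == "<eos>":
            break
        out.append(ch)
    return "".join(out)

image = np.array([[0.1, 0.4, 0.7, 0.95],
                  [0.1, 0.4, 0.7, 0.95]])
print(decode(encode(image)))  # abc
```

The separation matters because the two halves can be improved independently: a stronger encoder yields better features, while the decoder's stopping rule controls output length.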
no code implementations • 29 Jun 2023 • Masato Fujitake
This paper presents Diffusion Model for Scene Text Recognition (DiffusionSTR), an end-to-end text recognition framework using diffusion models for recognizing text in the wild.
Ranked #12 on Scene Text Recognition on IIIT5k
no code implementations • 21 Feb 2023 • Masato Fujitake
Scene text spotting is the task of simultaneously detecting text regions in natural scene images and recognizing their characters.
Ranked #1 on Text Spotting on SCUT-CTW1500
1 code implementation • IEEE Access 2022 • Masato Fujitake, Akihiro Sugimoto
In this paper, we propose Video Sparse Transformer with Attention-guided Memory (VSTAM), which enhances features element-wise before object candidate region detection.
Ranked #1 on Object Detection on UA-DETRAC
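The general idea of element-wise feature enhancement with a memory of past frame features can be sketched as follows. This is a hedged illustration of that idea only (the blending weights and fusion rule are invented assumptions, not VSTAM's actual attention-guided memory):

```python
import numpy as np

# Toy element-wise feature enhancement with a frame-feature memory.
# The affinity and fusion rules are invented for illustration and do
# not reproduce VSTAM's actual mechanism.

def enhance(current: np.ndarray, memory: list) -> np.ndarray:
    """Blend the current frame's feature map with remembered features,
    weighting each memory entry per element by its similarity."""
    if not memory:
        return current
    weights = [np.exp(-np.abs(current - m)) for m in memory]  # per-element affinity
    total = sum(weights)
    aggregated = sum(w * m for w, m in zip(weights, memory)) / total
    return 0.5 * current + 0.5 * aggregated  # element-wise fusion

feat_t = np.ones((2, 2))                       # current frame's features
mem = [np.ones((2, 2)) * 0.8, np.ones((2, 2)) * 1.2]  # remembered features
out = enhance(feat_t, mem)
print(out.shape)  # (2, 2)
```

Because the fusion happens per element rather than per frame, regions that agree with the memory are reinforced while disagreeing regions are only mildly adjusted, which is the intuition behind enhancing features before candidate region detection.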