Search Results for author: Tristan Kenneweg

Found 3 papers, 3 papers with code

Improving Line Search Methods for Large Scale Neural Network Training

1 code implementation • 27 Mar 2024 • Philip Kenneweg, Tristan Kenneweg, Barbara Hammer

In recent studies, line search methods have shown significant improvements in the performance of traditional stochastic gradient descent techniques, eliminating the need for a specific learning rate schedule.

Faster Convergence for Transformer Fine-tuning with Line Search Methods

1 code implementation • 27 Mar 2024 • Philip Kenneweg, Leonardo Galli, Tristan Kenneweg, Barbara Hammer

Recent works have shown that line search methods greatly increase the performance of traditional stochastic gradient descent methods on a variety of datasets and architectures [1], [2].

Retrieval Augmented Generation Systems: Automatic Dataset Creation, Evaluation and Boolean Agent Setup

1 code implementation • 26 Feb 2024 • Tristan Kenneweg, Philip Kenneweg, Barbara Hammer

We use a dataset created this way for the development and evaluation of a boolean agent RAG setup: a system in which an LLM can decide whether to query a vector database or not, thus saving tokens on questions that can be answered with internal knowledge.

Language Modelling, Large Language Model, +1
