1 code implementation • 31 May 2024 • Martina Contisciani, Marius Hobbhahn, Eleanor A. Power, Philipp Hennig, Caterina De Bacco
In this paper, we develop a probabilistic generative model to perform inference in multilayer networks with arbitrary types of information.
1 code implementation • 17 May 2024 • Lucius Bushnaq, Stefan Heimersheim, Nicholas Goldowsky-Dill, Dan Braun, Jake Mendel, Kaarel Hänni, Avery Griffin, Jörn Stöhler, Magdalena Wache, Marius Hobbhahn
We present a novel interpretability method that aims to overcome this limitation by transforming the activations of the network into a new basis - the Local Interaction Basis (LIB).
no code implementations • 17 May 2024 • Lucius Bushnaq, Jake Mendel, Stefan Heimersheim, Dan Braun, Nicholas Goldowsky-Dill, Kaarel Hänni, Cindy Wu, Marius Hobbhahn
We propose that a representation of a neural network that is invariant to the reparameterizations exploiting these degeneracies is likely to be more interpretable, and we provide some evidence that such a representation tends to have sparser interactions.
no code implementations • 25 Jan 2024 • Stephen Casper, Carson Ezell, Charlotte Siegmann, Noam Kolt, Taylor Lynn Curtis, Benjamin Bucknall, Andreas Haupt, Kevin Wei, Jérémy Scheurer, Marius Hobbhahn, Lee Sharkey, Satyapriya Krishna, Marvin Von Hagen, Silas Alberti, Alan Chan, Qinyi Sun, Michael Gerovitch, David Bau, Max Tegmark, David Krueger, Dylan Hadfield-Menell
External audits of AI systems are increasingly recognized as a key mechanism for AI governance.
1 code implementation • 9 Nov 2023 • Jérémy Scheurer, Mikita Balesni, Marius Hobbhahn
We demonstrate a situation in which Large Language Models, trained to be helpful, harmless, and honest, can display misaligned behavior and strategically deceive their users about this behavior without being instructed to do so.
1 code implementation • 26 Oct 2022 • Pablo Villalobos, Anson Ho, Jaime Sevilla, Tamay Besiroglu, Lennart Heim, Marius Hobbhahn
We investigate the potential constraints on LLM scaling posed by the availability of public human-generated text data.
no code implementations • 5 Jul 2022 • Pablo Villalobos, Jaime Sevilla, Tamay Besiroglu, Lennart Heim, Anson Ho, Marius Hobbhahn
From 1950 to 2018, model size in language models increased steadily by seven orders of magnitude.
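A seven-order-of-magnitude increase over 68 years pins down the average growth rate. The following illustrative arithmetic (not from the paper itself) makes the implied annual factor explicit:

```python
# Illustrative arithmetic, not taken from the paper: the average annual
# growth factor implied by a 10^7x increase in model size over 1950-2018.
years = 2018 - 1950               # 68 years
orders_of_magnitude = 7
annual_factor = 10 ** (orders_of_magnitude / years)
print(f"~{annual_factor:.2f}x per year")  # roughly 1.27x, i.e. ~27% annual growth
```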
1 code implementation • 11 Feb 2022 • Jaime Sevilla, Lennart Heim, Anson Ho, Tamay Besiroglu, Marius Hobbhahn, Pablo Villalobos
Since the advent of Deep Learning in the early 2010s, the scaling of training compute has accelerated, doubling approximately every 6 months.
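A doubling time converts directly into an annual growth factor; this quick sketch (my own illustration, not the paper's code) shows that a ~6-month doubling time corresponds to roughly 4x growth per year:

```python
# Hypothetical illustration: converting the reported ~6-month compute
# doubling time into an annual growth factor.
doubling_time_years = 0.5
annual_factor = 2 ** (1 / doubling_time_years)
print(annual_factor)  # 4.0: training compute grows ~4x per year
```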
1 code implementation • 7 May 2021 • Marius Hobbhahn, Philipp Hennig
The method can be thought of as a pre-processing step that can be implemented in fewer than five lines of code and runs in under a second.
1 code implementation • 2 Mar 2020 • Marius Hobbhahn, Agustinus Kristiadi, Philipp Hennig
We revisit older work (the Laplace Bridge) to construct a Dirichlet approximation of this softmax output distribution, yielding an analytic map between Gaussian distributions in logit space and Dirichlet distributions (the conjugate prior of the Categorical distribution) in output space.
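The forward direction of this map can be sketched in a few lines. The formula below is reproduced from memory of the Laplace Bridge literature and should be verified against the paper before use; `mu` and `sigma2` denote the mean and diagonal variances of the Gaussian over the K logits:

```python
import numpy as np

# Sketch of the forward Laplace Bridge map: given a diagonal Gaussian
# N(mu, diag(sigma2)) over the K logits, return Dirichlet concentration
# parameters alpha analytically (no sampling required).
# Formula reproduced from memory; check against the paper before relying on it.
def laplace_bridge(mu, sigma2):
    K = mu.shape[0]
    s = np.sum(np.exp(-mu))
    alpha = (1.0 / sigma2) * (1.0 - 2.0 / K + np.exp(mu) * s / K**2)
    return alpha

mu = np.zeros(3)       # symmetric logits
sigma2 = np.ones(3)
alpha = laplace_bridge(mu, sigma2)
print(alpha)           # equal concentrations, reflecting the symmetry of the input
```

For zero-mean, unit-variance logits the resulting concentrations are equal across classes, as the symmetry of the input Gaussian demands.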