Search Results for author: Eric J. Michaud

Found 11 papers, 9 with code

Survival of the Fittest Representation: A Case Study with Modular Addition

no code implementations · 27 May 2024 · Xiaoman Delores Ding, Zifan Carl Guo, Eric J. Michaud, Ziming Liu, Max Tegmark

To investigate this Survival of the Fittest hypothesis, we conduct a case study on neural networks performing modular addition, and find that these networks' multiple circular representations at different Fourier frequencies undergo such competitive dynamics, with only a few circles surviving at the end.
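
The competition can be made concrete by measuring which Fourier frequencies carry power in an embedding matrix. Below is a minimal sketch with a synthetic embedding standing in for a trained one (the surviving frequencies 3 and 17 are assumptions for illustration; the paper measures this on networks actually trained on modular addition).

```python
import numpy as np

p, d = 59, 128                       # modulus and embedding width (toy values)
rng = np.random.default_rng(0)
tokens = np.arange(p)

# Synthetic stand-in for a trained embedding: two surviving "circles" at
# frequencies 3 and 17, plus noise. In the paper this matrix comes from training.
E = rng.normal(scale=0.05, size=(p, d))
for k, (c0, c1) in [(3, (0, 1)), (17, (2, 3))]:
    E[:, c0] += np.cos(2 * np.pi * k * tokens / p)
    E[:, c1] += np.sin(2 * np.pi * k * tokens / p)

# Fourier power per frequency: squared magnitude of the DFT of each column.
F = np.fft.rfft(E, axis=0)                   # shape (p // 2 + 1, d)
power = (np.abs(F) ** 2).sum(axis=1)         # total power at each frequency
power[0] = 0.0                               # ignore the DC component
print("dominant frequencies:", np.argsort(power)[::-1][:4])  # expect 3 and 17
```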

Not All Language Model Features Are Linear

1 code implementation · 23 May 2024 · Joshua Engels, Isaac Liao, Eric J. Michaud, Wes Gurnee, Max Tegmark

Recent work has proposed the linear representation hypothesis: that language models perform computation by manipulating one-dimensional representations of concepts ("features") in activation space.

Language Modelling
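
As a quick illustration of the hypothesis under test (a sketch, not the paper's code): a linear feature is read off by projecting activations onto one direction, whereas the paper's counterexamples, such as circular features for days of the week, need a multi-dimensional subspace.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 512
v = rng.normal(size=d)
v /= np.linalg.norm(v)                 # hypothetical one-dimensional feature direction

acts = rng.normal(size=(1000, d))      # stand-in activations
strength = acts @ v                    # linear feature: one scalar per example

# A multi-dimensional feature in the paper's sense: days of the week laid out
# on a circle, requiring a 2-D (cos, sin) subspace rather than one direction.
theta = 2 * np.pi * np.arange(7) / 7
week_circle = np.stack([np.cos(theta), np.sin(theta)], axis=1)
print(strength.shape, week_circle.shape)   # (1000,) (7, 2)
```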

Opening the AI black box: program synthesis via mechanistic interpretability

1 code implementation · 7 Feb 2024 · Eric J. Michaud, Isaac Liao, Vedang Lad, Ziming Liu, Anish Mudide, Chloe Loughridge, Zifan Carl Guo, Tara Rezaei Kheirkhah, Mateja Vukelić, Max Tegmark

We present MIPS, a novel method for program synthesis based on automated mechanistic interpretability of neural networks trained to perform the desired task, auto-distilling the learned algorithm into Python code.

Program Synthesis · Symbolic Regression
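
A toy stand-in for the distillation step (illustrative only; MIPS itself recovers the algorithm by analyzing the network's internal mechanism rather than by tabulating it): treat a small trained network as a black box over a finite input space and emit equivalent Python source.

```python
from itertools import product

def trained_net(a: int, b: int) -> int:
    """Hypothetical stand-in for a trained network (here it computes XOR)."""
    return a ^ b

# Tabulate the learned function over its (small) input space.
table = {(a, b): trained_net(a, b) for a, b in product((0, 1), repeat=2)}

# Emit Python source that reproduces the behavior exactly.
src = "def distilled(a, b):\n    return {\n"
src += "".join(f"        ({a}, {b}): {out},\n" for (a, b), out in table.items())
src += "    }[(a, b)]\n"
exec(src)  # defines distilled(...) in the current namespace

assert all(distilled(a, b) == trained_net(a, b) for (a, b) in table)
print(src)
```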

The Quantization Model of Neural Scaling

1 code implementation · NeurIPS 2023 · Eric J. Michaud, Ziming Liu, Uzay Girit, Max Tegmark

We tentatively find that the frequency at which these quanta are used in the training distribution roughly follows a power law whose exponent matches the empirical scaling exponent for language models, as our theory predicts.

Language Modelling · Quantization
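
The scaling argument can be sanity-checked numerically. A minimal sketch with toy numbers (not the paper's fits): if quanta are used with Zipfian frequency p_k ∝ k^-(α+1) and a model has learned the first n quanta, the residual loss from the unlearned tail scales as n^-α.

```python
import numpy as np

alpha = 0.5
K = 1_000_000                               # toy cutoff on the number of quanta
k = np.arange(1, K + 1)
p = k ** -(alpha + 1.0)
p /= p.sum()                                # Zipf-like usage frequencies

ns = np.array([10, 100, 1000, 10000])
loss = np.array([p[n:].sum() for n in ns])  # loss from not-yet-learned quanta

# Fit the scaling exponent from the log-log slope; expect roughly -alpha.
slope = np.polyfit(np.log(ns), np.log(loss), 1)[0]
print(f"fitted scaling exponent: {slope:.3f} (theory predicts {-alpha})")
```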

Precision Machine Learning

1 code implementation · 24 Oct 2022 · Eric J. Michaud, Ziming Liu, Max Tegmark

We explore unique considerations involved in fitting ML models to data with very high precision, as is often required for science applications.
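
One such consideration, in a minimal sketch (a Chebyshev polynomial fit as a hypothetical stand-in for an ML model): float32 data puts a floor on achievable fit error several orders of magnitude above what float64 allows.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

x = np.linspace(-1, 1, 200)
y = np.sin(np.pi * x)                        # target function, exact in float64

for dtype in (np.float32, np.float64):
    xd, yd = x.astype(dtype), y.astype(dtype)
    coeffs = C.chebfit(xd, yd, deg=15)       # fit the stand-in "model"
    rmse = np.sqrt(np.mean((C.chebval(xd, coeffs) - yd) ** 2))
    # float32 rounding of the data caps the RMSE near 1e-8; float64 reaches
    # the truncation error of the degree-15 fit, around 1e-11.
    print(f"{np.dtype(dtype).name}: RMSE = {rmse:.1e}")
```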

Omnigrok: Grokking Beyond Algorithmic Data

1 code implementation · 3 Oct 2022 · Ziming Liu, Eric J. Michaud, Max Tegmark

Grokking, the unusual phenomenon for algorithmic datasets where generalization happens long after overfitting the training data, has remained elusive.

Attribute · Representation Learning
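
A compressed training sketch of the phenomenon (assumed setup: a small MLP with AdamW and strong weight decay, half the modular-addition table held out; real grokking runs often need far more steps, so the step counts are illustrative).

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
p = 23
pairs = torch.cartesian_prod(torch.arange(p), torch.arange(p))
labels = (pairs[:, 0] + pairs[:, 1]) % p
perm = torch.randperm(len(pairs))
train, test = perm[: len(perm) // 2], perm[len(perm) // 2 :]

model = nn.Sequential(nn.Embedding(p, 32), nn.Flatten(), nn.Linear(64, 128),
                      nn.ReLU(), nn.Linear(128, p))
opt = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1.0)
loss_fn = nn.CrossEntropyLoss()

def accuracy(idx):
    with torch.no_grad():
        return (model(pairs[idx]).argmax(-1) == labels[idx]).float().mean().item()

for step in range(5001):
    opt.zero_grad()
    loss = loss_fn(model(pairs[train]), labels[train])
    loss.backward()
    opt.step()
    if step % 500 == 0:
        # Grokking signature: train accuracy saturates long before test accuracy.
        print(f"step {step:5d}  train={accuracy(train):.2f}  test={accuracy(test):.2f}")
```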

Understanding Learned Reward Functions

1 code implementation · 10 Dec 2020 · Eric J. Michaud, Adam Gleave, Stuart Russell

Current techniques for reward learning may fail to produce reward functions that accurately reflect user preferences.
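
In the paper's spirit, a hypothetical minimal probe (not the authors' code): check which observation features a learned reward is actually sensitive to, here via input gradients.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
obs_dim = 8
reward_net = nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh(), nn.Linear(64, 1))
# (In practice reward_net would be learned from demonstrations or preferences.)

obs = torch.randn(1, obs_dim, requires_grad=True)
reward_net(obs).sum().backward()             # d(reward) / d(observation)
saliency = obs.grad.abs().squeeze()
print("per-feature sensitivity:", saliency)
# Strong sensitivity to task-irrelevant features would suggest the learned
# reward does not actually reflect the intended preferences.
```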

Examining the causal structures of deep neural networks using information theory

1 code implementation · 26 Oct 2020 · Simon Mattsson, Eric J. Michaud, Erik Hoel

We introduce the effective information (EI) of a feedforward DNN, which is the mutual information between layer input and output following a maximum-entropy perturbation.
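
A minimal sketch of the estimator for one input unit (assumed binned mutual-information estimate and a toy layer; the paper's procedure is more careful): feed maximum-entropy uniform inputs through the layer and estimate the information shared between input and output.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(1, 4))                   # toy 4 -> 1 "layer"

x = rng.uniform(0.0, 1.0, size=(100_000, 4))  # maximum-entropy input perturbation
y = np.tanh(x @ W.T)                          # layer output

# Binned estimate of I(x_0; y): discretize both variables and sum
# p(x, y) * log2( p(x, y) / (p(x) p(y)) ) over the joint histogram.
bins = 16
joint, _, _ = np.histogram2d(x[:, 0], y[:, 0], bins=bins)
pxy = joint / joint.sum()
px = pxy.sum(axis=1, keepdims=True)
py = pxy.sum(axis=0, keepdims=True)
with np.errstate(divide="ignore", invalid="ignore"):
    mi = np.nansum(pxy * np.log2(pxy / (px * py)))
print(f"estimated mutual information: {mi:.3f} bits")
```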
