no code implementations • 30 Mar 2021 • Siyu Zhou, Lucas Mentch
Due to their long-standing reputation as excellent off-the-shelf predictors, random forests remain a go-to model of choice for applied statisticians and data scientists.
no code implementations • 23 Feb 2021 • Lucas Mentch, Giles Hooker
In 2001, Leo Breiman wrote of a divide between "data modeling" and "algorithmic modeling" cultures.
1 code implementation • ACL 2020 • Taehee Jung, Dongyeop Kang, Hua Cheng, Lucas Mentch, Thomas Schaaf
Here we propose an end-to-end training procedure called posterior calibrated (PosCal) training that directly optimizes the task objective while minimizing the difference between the predicted and empirical posterior probabilities. We show that PosCal not only reduces calibration error but also improves task performance by penalizing drops in performance on both objectives.
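The snippet above describes a loss that combines the task objective with a penalty on the gap between predicted and empirical posteriors. Below is a minimal sketch of that idea for binary classification, assuming a binned calibration penalty added to cross-entropy; the function names and binning scheme are illustrative, not the paper's actual PosCal implementation.

```python
import numpy as np

def calibration_penalty(probs, labels, n_bins=10):
    """Bin-weighted squared gap between predicted and empirical positive rates.

    A simplified stand-in for a calibration penalty: partition predictions
    into confidence bins and compare the mean predicted probability with
    the empirical frequency of the positive class in each bin.
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    penalty = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        if hi < 1.0:
            mask = (probs >= lo) & (probs < hi)
        else:  # include the right endpoint in the last bin
            mask = (probs >= lo) & (probs <= hi)
        if mask.any():
            penalty += mask.mean() * (probs[mask].mean() - labels[mask].mean()) ** 2
    return penalty

def poscal_style_loss(probs, labels, lam=1.0):
    """Task loss (binary cross-entropy) plus the calibration penalty."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=float)
    eps = 1e-12
    ce = -np.mean(labels * np.log(probs + eps)
                  + (1 - labels) * np.log(1 - probs + eps))
    return ce + lam * calibration_penalty(probs, labels)
```

Perfectly calibrated predictions incur zero penalty, so the combined loss reduces to the task objective; overconfident predictions are pushed back toward their empirical frequencies.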
no code implementations • 7 Mar 2020 • Lucas Mentch, Siyu Zhou
As the size, complexity, and availability of data continues to grow, scientists are increasingly relying upon black-box learning algorithms that can often provide accurate predictions with minimal a priori model specifications.
1 code implementation • 2 Dec 2019 • Zhengze Zhou, Lucas Mentch, Giles Hooker
This paper develops a general framework for analyzing asymptotics of $V$-statistics.
1 code implementation • 1 Nov 2019 • Lucas Mentch, Siyu Zhou
Random forests remain among the most popular off-the-shelf supervised machine learning tools with a well-established track record of predictive accuracy in both regression and classification settings.
1 code implementation • IJCNLP 2019 • Taehee Jung, Dongyeop Kang, Lucas Mentch, Eduard Hovy
We find that while position exhibits substantial bias in news articles, this is not the case, for example, with academic papers and meeting minutes.
1 code implementation • 27 Aug 2019 • Tim Coleman, Kimberly Kaufeld, Mary Frances Dorn, Lucas Mentch
To estimate these ratios from an unlabeled test set, we make the covariate shift assumption, under which the training and test distributions differ only in the marginal distribution of the covariates (Shimodaira, 2000).
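Under the covariate shift assumption, one common way to estimate such density ratios is to train a probabilistic classifier to distinguish training from test covariates and convert its odds into a ratio. The sketch below shows that standard trick; it is an illustration of the general technique, not necessarily the estimator used in the paper.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def estimate_density_ratio(X_train, X_test):
    """Estimate q(x)/p(x) at the training points via a train-vs-test classifier.

    Label training covariates 0 and test covariates 1, fit P(test | x),
    then convert the classifier odds into a density ratio, correcting
    for unequal sample sizes.
    """
    X = np.vstack([X_train, X_test])
    z = np.concatenate([np.zeros(len(X_train)), np.ones(len(X_test))])
    clf = LogisticRegression(max_iter=1000).fit(X, z)
    p_test = clf.predict_proba(X_train)[:, 1]  # P(test | x) at training points
    # odds times n_train/n_test recovers the density ratio q(x)/p(x)
    return (p_test / (1.0 - p_test)) * (len(X_train) / len(X_test))
```

When the two samples come from the same distribution, the classifier finds no signal and the estimated ratios concentrate near 1.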
no code implementations • 25 May 2019 • Wei Peng, Tim Coleman, Lucas Mentch
Random forests remain among the most popular off-the-shelf supervised learning algorithms.
1 code implementation • 1 May 2019 • Giles Hooker, Lucas Mentch, Siyu Zhou
This paper reviews and advocates against the use of permute-and-predict (PaP) methods for interpreting black box functions.
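For context on what is being critiqued: a permute-and-predict (PaP) importance score shuffles one feature column and measures the resulting drop in predictive performance. A minimal sketch of the generic recipe, with illustrative names, is below; note the paper argues against relying on this procedure.

```python
import numpy as np

def pap_importance(model, X, y, metric, rng=None):
    """Permute-and-predict variable importance.

    For each column, shuffle it to break its association with the
    response and record the increase in the error metric relative
    to the unpermuted baseline.
    """
    rng = np.random.default_rng(rng)
    base = metric(y, model.predict(X))
    scores = []
    for j in range(X.shape[1]):
        Xp = X.copy()
        Xp[:, j] = rng.permutation(Xp[:, j])  # break column j's association
        scores.append(metric(y, model.predict(Xp)) - base)
    return np.array(scores)
```

Features the model relies on produce large error increases when shuffled; the paper's concern is that, with correlated features, these permuted points fall far from the training data and the scores can badly mislead.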
2 code implementations • 16 Apr 2019 • Tim Coleman, Wei Peng, Lucas Mentch
Throughout the last decade, random forests have established themselves as among the most accurate and popular supervised learning methods.
no code implementations • 29 Jun 2016 • Duy Hoang Thai, Lucas Mentch
Segmentation remains an important problem in image processing.
no code implementations • 1 Jun 2015 • Giles Hooker, Lucas Mentch
This paper examines the use of a residual bootstrap for bias correction in machine learning regression methods.
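One simple form of the idea: resample residuals around the fitted values to build synthetic responses, refit the learner, and use the average shift of the refitted predictions as a pointwise bias estimate. The sketch below follows that generic scheme with illustrative names; it is not claimed to be the paper's exact procedure.

```python
import numpy as np

def residual_bootstrap_bias(fit_predict, X, y, n_boot=50, rng=None):
    """Estimate pointwise prediction bias via a residual bootstrap.

    fit_predict(X, y) should train a fresh model and return its
    predictions at X. Synthetic responses are built by resampling
    residuals around the original fitted values; the average shift
    of the refitted predictions estimates the bias, which can then
    be subtracted from the original predictions.
    """
    rng = np.random.default_rng(rng)
    preds = fit_predict(X, y)
    resid = y - preds
    boot = np.zeros_like(preds)
    for _ in range(n_boot):
        y_star = preds + rng.choice(resid, size=len(resid), replace=True)
        boot += fit_predict(X, y_star)
    return boot / n_boot - preds  # estimated bias at each point
```

For an (approximately) unbiased learner such as ordinary least squares on well-specified data, the estimated bias is near zero, so the correction leaves the fit essentially unchanged.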
no code implementations • 7 Jun 2014 • Lucas Mentch, Giles Hooker
While statistical learning methods have proved powerful tools for predictive modeling, the black-box nature of the models they produce can severely limit their interpretability and the ability to conduct formal inference.
no code implementations • 25 Apr 2014 • Lucas Mentch, Giles Hooker
Instead of aggregating full bootstrap samples, we consider predicting by averaging over trees built on subsamples of the training set and demonstrate that the resulting estimator takes the form of a U-statistic.
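The construction described above — averaging trees built on subsamples drawn without replacement rather than bootstrap samples — can be sketched as follows. This is a minimal illustration of the estimator's structure (a U-statistic with the tree as its kernel), not the authors' implementation; function names and defaults are assumptions.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def subsampled_forest_predict(X, y, X_new, n_trees=100, subsample=0.5, rng=None):
    """Average predictions of trees grown on subsamples drawn WITHOUT
    replacement, so the ensemble prediction has the form of a
    U-statistic with the individual tree as its kernel."""
    rng = np.random.default_rng(rng)
    n = len(X)
    k = max(2, int(subsample * n))
    preds = np.zeros(len(X_new))
    for _ in range(n_trees):
        idx = rng.choice(n, size=k, replace=False)  # subsample, not bootstrap
        tree = DecisionTreeRegressor(random_state=0).fit(X[idx], y[idx])
        preds += tree.predict(X_new)
    return preds / n_trees
```

The practical payoff of the U-statistic form is that classical asymptotic theory applies, so the variance of these averaged predictions can be estimated and used for formal inference.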