Selection bias

102 papers with code • 0 benchmarks • 2 datasets

Most implemented papers

PanNuke Dataset Extension, Insights and Baselines

TIA-Lab/PanNuke-metrics 24 Mar 2020

The emerging area of computational pathology (CPath) is ripe ground for the application of deep learning (DL) methods to healthcare due to the sheer volume of raw pixel data in whole-slide images (WSIs) of cancerous tissue slides.

Entire Space Multi-Task Model: An Effective Approach for Estimating Post-Click Conversion Rate

shenweichen/DeepCTR 21 Apr 2018

To the best of our knowledge, this is the first public dataset which contains samples with sequential dependence of click and conversion labels for CVR modeling.

Active Structure Learning of Causal DAGs via Directed Clique Tree

csquires/dct-policy 1 Nov 2020

Most existing works focus on worst-case or average-case lower bounds for the number of interventions required to orient a DAG.

A Debiased MDI Feature Importance Measure for Random Forests

shifwang/paper-debiased-feature-importance NeurIPS 2019

Based on the original definition of MDI by Breiman et al. for a single tree, we derive a tight non-asymptotic bound on the expected bias of MDI importance of noisy features, showing that deep trees have higher (expected) feature selection bias than shallow ones.

Adversarial Balancing-based Representation Learning for Causal Effect Inference with Observational Data

octeufer/Adversarial-Balancing-based-representation-learning-for-Causal-Effect-Inference 30 Apr 2019

The challenges for this problem are two-fold: on the one hand, we have to derive a causal estimator to estimate the causal quantity from observational data, where there exists confounding bias; on the other hand, we have to deal with the identification of CATE when the distribution of covariates in the treatment and control groups is imbalanced.

Selection Bias Explorations and Debias Methods for Natural Language Sentence Matching Datasets

arthua196/Leakage-Neutral-Learning-for-QuoraQP ACL 2019

Natural Language Sentence Matching (NLSM) has gained substantial attention from both academia and industry, and rich public datasets have contributed a lot to this progress.

To Model or to Intervene: A Comparison of Counterfactual and Online Learning to Rank from User Interactions

HarrieO/OnlineLearningToRank 15 Jul 2019

At the moment, two methodologies for dealing with bias prevail in the field of LTR: counterfactual methods that learn from historical data and model user behavior to deal with biases; and online methods that perform interventions to deal with bias but use no explicit user models.

Automated Dependence Plots

davidinouye/adp 2 Dec 2019

To address these drawbacks, we formalize a method for automating the selection of interesting PDPs and extend PDPs beyond showing single features to show the model response along arbitrary directions, for example in raw feature space or a latent space arising from some generative model.

Algorithm as Experiment: Machine Learning, Market Design, and Policy Eligibility Rules

rfgong/IVaps 26 Apr 2021

Algorithms make a growing portion of policy and business decisions.

Underspecification in Language Modeling Tasks: A Causality-Informed Study of Gendered Pronoun Resolution

2dot71mily/uspec 30 Sep 2022

Modern language modeling tasks are often underspecified: for a given token prediction, many words may satisfy the user's intent of producing natural language at inference time; however, only one word will minimize the task's loss function at training time.