Feature Selection With Maximal Relevance and Minimal Supervised Redundancy

Feature selection (FS) for classification is crucial when applying machine learning to large-scale image and bio-microarray data. Selecting informative features from high-dimensional data is challenging because such data generally contain many irrelevant and redundant features, which often impede classifier performance and misdirect classification tasks. In this article, we present an efficient FS algorithm that improves classification accuracy by taking into account both the relevance of the features and the pairwise feature correlation with respect to the class labels. Based on conditional mutual information and entropy, a new supervised similarity measure is proposed. The supervised similarity measure is used to evaluate feature redundancy minimization and is then combined with the evaluation of feature relevance maximization. A new criterion, max-relevance and min-supervised-redundancy (MRMSR), is introduced and theoretically proved for FS. The proposed MRMSR-based method is compared to seven existing FS approaches on several frequently studied public benchmark datasets. Experimental results demonstrate that the proposal is more effective at selecting informative features and yields more competitive classification performance.
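The general idea of trading relevance against class-conditional redundancy can be illustrated with a minimal sketch. It is not the paper's MRMSR criterion: the abstract does not give the exact formula, so the normalised conditional mutual information in `supervised_similarity`, the "relevance minus mean redundancy" score, and all function names and toy data below are assumptions made for illustration only. Features are assumed to be discrete.

```python
from collections import Counter
from math import log2


def entropy(xs):
    """Shannon entropy (in bits) of a discrete sequence."""
    n = len(xs)
    return -sum((c / n) * log2(c / n) for c in Counter(xs).values())


def mutual_info(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def cond_entropy(xs, zs):
    """H(X|Z) = H(X,Z) - H(Z)."""
    return entropy(list(zip(xs, zs))) - entropy(zs)


def cond_mutual_info(xs, ys, zs):
    """I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)."""
    return (entropy(list(zip(xs, zs))) + entropy(list(zip(ys, zs)))
            - entropy(list(zip(xs, ys, zs))) - entropy(zs))


def supervised_similarity(xi, xj, labels):
    """Hypothetical stand-in for a supervised similarity measure:
    I(Xi;Xj|Y) divided by min(H(Xi|Y), H(Xj|Y)), so it lies in [0, 1]."""
    denom = min(cond_entropy(xi, labels), cond_entropy(xj, labels))
    if denom < 1e-12:  # one feature is fully determined by the class
        return 0.0
    return cond_mutual_info(xi, xj, labels) / denom


def greedy_select(features, labels, k):
    """Greedily pick k feature names, scoring each candidate as
    relevance I(X;Y) minus its mean similarity to the selected set
    (an assumed scoring rule, not the published one)."""
    relevance = {name: mutual_info(col, labels) for name, col in features.items()}
    selected = []
    remaining = list(features)
    while remaining and len(selected) < k:
        def score(name):
            if not selected:
                return relevance[name]
            redundancy = sum(
                supervised_similarity(features[name], features[s], labels)
                for s in selected
            ) / len(selected)
            return relevance[name] - redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (made up): f_dup duplicates f_a, f_noise ignores the class.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "f_a":     [0, 0, 0, 0, 1, 1, 1, 0],
    "f_dup":   [0, 0, 0, 0, 1, 1, 1, 0],
    "f_b":     [0, 0, 1, 0, 1, 1, 0, 1],
    "f_noise": [0, 1, 0, 1, 0, 1, 0, 1],
}
picked = greedy_select(features, labels, 2)
print(picked)
```

On this toy data the sketch picks `f_a` first and then prefers `f_b` over the exact duplicate `f_dup`, even though `f_dup` has higher relevance, because the duplicate is fully redundant given the class. Note that the score subtracts a normalised ratio from a quantity in bits; a real criterion would need to reconcile the two scales.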
