Abstract
Gene expression data present significant challenges because of their high dimensionality, so effective gene selection methods are needed for accurate analysis and biomarker discovery. In this paper, we conducted a comprehensive comparative study of nine filter-based gene selection techniques: Information Gain, Mutual Information, Correlation-based Feature Selection (CFS), Relief-F, the t-test, the Wilcoxon test, Chi-square (Chi2), Pearson correlation, and the Gini index. A breast cancer microarray dataset was used to evaluate these methods on classification accuracy, computational efficiency, and the stability of the selected gene subsets. Most methods achieve high predictive accuracy and perfect stability but differ in their computational costs. This study aims to provide practical insights for choosing an appropriate filtering method based on the balance between performance and efficiency in gene expression analysis.
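
To make the evaluation criteria concrete, the following minimal sketch shows how one filter method (a t-test ranking) and a subset-stability score could be computed. It is an illustration under stated assumptions, not the pipeline used in this study: the data are synthetic rather than the breast cancer microarray dataset, stability is measured as the mean pairwise Jaccard index over stratified subsamples (the abstract does not specify the stability metric), and the names `t_scores`, `top_k`, and `jaccard_stability` are hypothetical.

```python
import itertools
import numpy as np


def t_scores(X, y):
    """Absolute Welch t-statistic per gene for a two-class problem."""
    a, b = X[y == 0], X[y == 1]
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    return np.abs(a.mean(axis=0) - b.mean(axis=0)) / (se + 1e-12)


def top_k(X, y, k):
    """Indices of the k highest-scoring genes, as a set."""
    order = np.argsort(t_scores(X, y))[::-1]
    return set(order[:k].tolist())


def jaccard_stability(subsets):
    """Mean pairwise Jaccard index between selected gene subsets."""
    pairs = list(itertools.combinations(subsets, 2))
    return sum(len(a & b) / len(a | b) for a, b in pairs) / len(pairs)


# Synthetic stand-in for a microarray matrix (assumption): 60 samples,
# 500 genes, of which the first 10 are made informative.
rng = np.random.default_rng(0)
n, p, k = 60, 500, 10
y = np.repeat([0, 1], n // 2)
X = rng.normal(size=(n, p))
X[y == 1, :k] += 3.0

# Repeat the selection on stratified 80% subsamples and compare the subsets.
subsets = []
for _ in range(5):
    idx = np.concatenate([
        rng.choice(np.flatnonzero(y == c), size=24, replace=False)
        for c in (0, 1)
    ])
    subsets.append(top_k(X[idx], y[idx], k))

stability = jaccard_stability(subsets)
print(f"Jaccard stability over {len(subsets)} subsamples: {stability:.2f}")
```

A stability of 1.0 means that every subsample yielded the same gene subset. Computational cost could be compared in the same loop by timing each scoring function, for example with `time.perf_counter`.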
