

Journal of Emerging Trends in Engineering and Applied Sciences (JETEAS)


Article Title: Effect of Missing Values on Data Classification
by Tapas Ranjan Baitharu and Subhendu Kumar Pani

Data classification is an important task in the knowledge discovery in databases (KDD) process, with many potential applications. The performance of a classifier depends strongly on the data set used for learning. In practice, a data set may contain noisy or redundant items and a large number of features, many of which are irrelevant to the objective function at hand. Such noisy data can degrade the accuracy and performance of classification models. Dealing with missing values during data pre-processing is therefore an important step in building an effective and efficient classifier. In this step, missing values are replaced by suitable values according to an objective function, or the noisy data are filtered out. This improves the predictive or descriptive accuracy of the classification models, reduces the computing time needed to build them because they learn faster, and makes the models easier to understand. This paper studies the effect of missing values on data classification. A comparative analysis of classification accuracy in different scenarios is presented. Several search techniques for feature selection are considered and applied to pre-process the data set. The predictive performance of popular classifiers is compared quantitatively. Based on the experimental results, the paper establishes that replacing missing values generally improves classification accuracy. The purpose of this research is to maintain the highest possible classification accuracy in the presence of missing values.
Keywords: data mining, feature selection, missing values, knowledge discovery in databases.
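
As a rough illustration of the replacement step the abstract describes, the sketch below fills each missing value with a column-level substitute before any classifier would see the data. The abstract does not say which replacement rule the authors use, so the rule here (column mean for numeric features, most frequent value for categorical ones) is an assumption, as are the names `impute` and `rows` and the toy data.

```python
from collections import Counter
from statistics import mean


def impute(rows):
    """Return a copy of rows in which every None is replaced column by column.

    Assumed rule (not taken from the paper): numeric columns get the mean of
    their observed values, all other columns get their most frequent value.
    Each column is assumed to have at least one observed value.
    """
    fills = []
    for col in zip(*rows):  # iterate over columns
        observed = [v for v in col if v is not None]
        if all(isinstance(v, (int, float)) for v in observed):
            fills.append(mean(observed))
        else:
            fills.append(Counter(observed).most_common(1)[0][0])
    # Build new rows so the original data set is left untouched.
    return [
        [fills[j] if v is None else v for j, v in enumerate(row)]
        for row in rows
    ]


# Hypothetical toy data set: one numeric and one categorical feature.
rows = [
    [5.0, "red"],
    [None, "red"],
    [7.0, None],
    [6.0, "blue"],
]
filled = impute(rows)
```

A comparison in the spirit of the paper would then train the same classifier on the original and the filled data and compare the resulting accuracies.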

ISSN: 2141-7016

Editor in Chief

Prof. Gui Yun Tian
Professor of Sensor Technologies
School of Electrical, Electronic and Computer Engineering
University of Newcastle
United Kingdom



Copyright © Journal of Emerging Trends in Engineering and Applied Sciences 2010