Feature Subset Selection Using Meta Heuristic Approaches

Khan, Ayesha

Please use this identifier to cite or link to this item: http://localhost:80/xmlui/handle/123456789/5216

Title:	Feature Subset Selection Using Meta Heuristic Approaches
Authors:	Khan, Ayesha
Keywords:	Computer Science
Issue Date:	2019
Publisher:	National University of Computer and Emerging Sciences, Islamabad
Abstract:	Data in the real world is growing so rapidly that accumulating and processing it has become a huge task. This growth is exponential, and applying Data Mining (DM) tools to such enormous data makes the algorithms time-consuming and expensive. One of the most important DM tools for analyzing data is classification. Classification is a DM function that predicts the class of a sample by building a classifier, or prediction model, from already collected samples with known classes. The dataset used for classification is supervised data with different features, or attributes. During classification, some features can be of great significance while others can be irrelevant or redundant. Feature selection reduces the learning and prediction time of classification algorithms, because no time is spent on the features that are not selected. Feature selection also provides insight into the nature of the problem to be solved. There is therefore a vital need to remove irrelevant and redundant features before building a classifier. This research addresses the problem of feature subset selection (FSS), which chooses the features/attributes that are of significant value to the classifier to be built. Selecting these significant features reduces the data, which will eventually help to improve the accuracy and reliability of big data analytics. The reduction of data would in turn increase the accuracy and reliability of decision support systems, especially critical health-related decision support systems. Other application areas include sentiment analysis, opinion mining, drug discovery, tumor detection, stroke detection and many others.
The first phase of this research has the novelty of treating the FSS problem as a multi-objective problem and solving it with two metaheuristic techniques: Non-dominated Sorting Genetic Algorithm II (NSGA-II) and Multi-objective Particle Swarm Optimization altered to solve FSS as a binary problem (BMOPSO). The experimental results show the importance of treating FSS as a multi-objective problem, as this approach outperforms current FSS techniques not only in classifier accuracy but also in the reduction in the number of features. The second phase of this research explores Ant Colony Optimization (ACO), another metaheuristic technique, for FSS. To further refine the search, the significance of each feature is measured using the minimum Redundancy Maximum Relevance (mRMR) technique before applying ACO. The results show that the proposed technique performs better than other existing biologically inspired algorithms for FSS. Both phases of this research use different real-world datasets taken from the UCI Machine Learning Repository, and k-fold cross-validation is used to further validate the results of the proposed techniques. Feature subset selection primarily deals with data representation for the classification process; it reduces computational complexity and improves prediction accuracy.
Gov't Doc #:	17922
URI:	http://142.54.178.187:9060/xmlui/handle/123456789/5216
Appears in Collections:	Thesis

Files in This Item:

File	Description	Size	Format
10740.htm		121 B	HTML
