Please use this identifier to cite or link to this item:
http://localhost:80/xmlui/handle/123456789/5147
Title: | Feature Selection Using Rough Set Based Heuristic Dependency Calculation |
Authors: | Raza, Muhammad Summair |
Keywords: | Software Engineering |
Issue Date: | 2018 |
Publisher: | National University of Science & Technology, Islamabad (NUST) |
Abstract: | The amount of data to be processed is increasing significantly day by day. The increase in data size is due not only to a larger number of records but also to the substantial number of attributes added to the feature space. This growth leads to the problem known as the curse of dimensionality, i.e. datasets with a very large number of attributes. The ideal approach is to reduce the number of dimensions such that the resulting reduced set contains the same information as the entire set of attributes. There are various approaches to this task of dimensionality reduction. Recently, rough set-based approaches, which use attribute dependency to carry out feature selection, have been prominent. However, this dependency measure requires the calculation of the positive region, which is a computationally expensive task. In this research, we propose a new concept called “Dependency Classes”, which calculates the attribute dependency without using the positive region. Dependency classes define the change in attribute dependency as we move from one record to another. By avoiding the positive region, they can be an ideal replacement for the conventional dependency measure in feature selection algorithms, especially for large datasets. A comparison framework was devised to measure the efficiency and effectiveness of the proposed measure. Experiments on various publicly available datasets show that the proposed approaches provide a significant improvement in computational performance with the same accuracy as the conventional approach. We have also recommended seven feature selection algorithms using this measure. The experimental results show that algorithms using the dependency classes were more effective than their counterparts using the positive region-based approach in terms of accuracy, execution time and required runtime memory. |
Gov't Doc #: | 18021 |
URI: | http://142.54.178.187:9060/xmlui/handle/123456789/5147 |
Appears in Collections: | Thesis |
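For context, the conventional measure that the abstract contrasts against is the standard rough set dependency, gamma(C, D) = |POS_C(D)| / |U|, where the positive region POS_C(D) collects every record whose equivalence class under the conditional attributes C maps to a single decision value. The following is a minimal sketch of that conventional calculation only, not of the thesis's dependency classes, which the abstract does not specify. The function name `dependency` and the toy decision table are hypothetical and chosen purely for illustration.

```python
from collections import defaultdict


def dependency(records, cond_idx, dec_idx):
    """Conventional rough set dependency: |POS_C(D)| / |U|.

    records  -- list of tuples, one per record (hypothetical layout)
    cond_idx -- column indices of the conditional attributes C
    dec_idx  -- column index of the decision attribute D
    """
    if not records:
        return 0.0
    # Group records into equivalence classes by their values on C.
    classes = defaultdict(list)
    for row in records:
        key = tuple(row[i] for i in cond_idx)
        classes[key].append(row[dec_idx])
    # A class lies in the positive region when all of its records
    # share one decision value.
    positive = sum(len(d) for d in classes.values() if len(set(d)) == 1)
    return positive / len(records)


# Hypothetical toy decision table: columns are (a, b, decision).
table = [
    ("x", 0, "yes"),
    ("x", 0, "no"),
    ("y", 0, "yes"),
    ("y", 1, "no"),
]
gamma_a = dependency(table, [0], 2)      # attribute a alone
gamma_ab = dependency(table, [0, 1], 2)  # attributes a and b together
```

A dependency-guided feature selection algorithm typically evaluates this measure for many candidate attribute subsets, and each evaluation regroups the whole dataset. That repeated cost is the expense the abstract attributes to the positive region on large datasets.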
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.