EVALUATION OF HIDDEN MARKOV MODEL FOR MALWARE BEHAVIORAL CLASSIFICATION

Imran, Mohammad

Please use this identifier to cite or link to this item: http://localhost:80/xmlui/handle/123456789/4896

Title:	EVALUATION OF HIDDEN MARKOV MODEL FOR MALWARE BEHAVIORAL CLASSIFICATION
Authors:	Imran, Mohammad
Keywords:	Computer science, information & general works
Issue Date:	2016
Publisher:	CAPITAL UNIVERSITY OF SCIENCE & TECHNOLOGY ISLAMABAD
Abstract:	Malware is a growing threat to computer systems and networks around the world. Ever since the malware construction kits and metamorphic virus generators became easily available, creating and spreading obfuscated malware has become a simple matter. The cyber-security vendors receive thousands of new malware samples everyday for analysis. It has become a challenging task for the malware analysts to identify if a given malware sample is a variant of a known malware or belongs to a new breed altogether. Since making an accurate decision about the nature of an unknown malware sample is crucial for updating of signature databases and propagation of the update to their customers, therefore vendors of cyber-security products need accurate malware classi cation techniques for this purpose. The research community has been active for providing a solution to the above problem, and a number of diverse avenues have been explored such as machine learning, graph theory, nite state machines, etc. Furthermore, many syntactic and semantic aspects of computer programs have been tried out in search of the best aspect that could be used to distinguish between harmful and harmless computer programs, and to di erentiate malware belonging to di erent families. All the proposed approaches have merits and demerits of their own, and the search for a solution that maximizes the classi cation accuracy with minimal computational costs is continued. This dissertation formulates malware classi cation as a sequence classi cation problem, and evaluates a widely used sequence classi cation tool, Hidden Markov Model (HMM), for the task of malware classi cation. HMM has been a method of choice for a broad range of sequential pattern matching applications such as speech analysis, behavior modeling and handwriting recognition to name a few. The dissertation rst proposes and evaluates novel methods of malware classi cation by combining HMM and malware behavioral features, which are attributes frequently used to distinguish between normal and malicious programs and to di erentiate x among malware families. As an another major contribution, the dissertation lls a signi cant research gap by studying the role of an important HMM parameter, the number of hidden states, in malware classi cation applications. Based on observations from comprehensive experiments conducted on a large and diverse dataset consisting of malware behavioral reports, the dissertation concludes that although HMM shows encouraging results when used for malware classi cation tasks, its potential from a practical standpoint is fairly limited. The dissertation makes the third contribution by proposing to replace the HMM component of malware classi cation method with Markov Chain Model (MCM), and performing comparative evaluation between the two models. Results of the comparison prove that classi cation performance achieved by HMM can be attained much more e - ciently by MCM, and therefore MCM should be preferred over HMM for malware classi cation applications.
URI:	http://142.54.178.187:9060/xmlui/handle/123456789/4896
Appears in Collections:	Thesis

Files in This Item:

File	Description	Size	Format
7605.htm		128 B	HTML	View/Open

Show full item record

DSpace JSPUI

DSpace preserves and enables easy and open access to all types of digital content including text, images, moving images, mpegs and data sets