
Please use this identifier to cite or link to this item: http://142.54.178.187:9060/xmlui/handle/123456789/10358
Title: ROBUST AND BOOTSTRAP PROCEDURES IN REGRESSION ANALYSIS AND OUTLIERS DETECTION TESTS
Keywords: Natural Sciences
Issue Date: 2013
Publisher: UNIVERSITY OF PESHAWAR
Abstract: It is evident from the comments of Bernoulli (1777) that the problem of outliers is an old one: "discarding discordant observations" was already a common practice some 200 years ago. In the investigator's opinion, outliers are observations that deviate from the bulk of the data and require proper treatment, because statistical analyses of all types of data sets are strongly influenced by their presence. Many attempts have been made to cope with such observations and to provide protection against outliers. Robust statistics and robust regression techniques have been developed over time to handle outliers and to minimize their effect, and work continues on refining existing techniques and on introducing more advanced and improved ones.

The present study has three main parts. The first part compares several tests proposed in the literature for identifying one or more outliers in the single-sample case. We also propose some univariate tests for detecting outliers when sampling is from a heavy-tailed symmetric distribution, namely the Cauchy distribution. We conduct detailed simulation studies to compute critical values, for various sample sizes, both for the tests available in the literature and for the proposed tests under Cauchy sampling. We also compute simulated powers, based on 10,000 simulations, to compare these tests for sample sizes up to 30 in the presence of 1 to 5 outliers. Three examples with artificial data sets generated from the Cauchy distribution, containing 1, 2 and 3 outliers respectively, are used to investigate the performance of all the tests under consideration.

The second part of the thesis is mainly concerned with robust regression. Several researchers have proposed M-estimators and redescending M-estimators that handle outliers by assigning them smaller weights in order to minimize their effect. We propose a new and efficient redescending M-estimator, called the Alamgir Redescending M-Estimator (ALARM). We investigate its asymptotic efficiency for various sample sizes and numbers of predictors, and determine the optimum value of its tuning constant. Simulation studies comparing ALARM with several other redescending M-estimators in the literature show that it performs better than the competing estimators in the majority of simulated scenarios, particularly in the presence of higher percentages of outliers in the data. Examples based on real data sets illustrate its performance: the proposed estimator fits data sets containing different percentages of outliers well and detects as many outliers as any of the competing estimators, providing protection against outliers while remaining highly efficient.
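For illustration only, the following minimal R sketch shows how critical values and powers of an outlier test can be approximated by Monte Carlo simulation under Cauchy sampling, in the spirit of the simulation studies described above. The statistic used here (largest absolute deviation from the median, scaled by the MAD) is a stand-in and not necessarily one of the tests studied in the thesis; the sample size, contamination shift and seed are arbitrary choices.

    ## Simulated 5% critical value of an illustrative outlier statistic
    ## under Cauchy sampling, followed by its simulated power.
    set.seed(1)
    n     <- 20                      # sample size (the thesis tabulates sizes up to 30)
    nsim  <- 10000                   # Monte Carlo replications, as in the thesis
    Tstat <- function(x) max(abs(x - median(x))) / mad(x)

    null_dist <- replicate(nsim, Tstat(rcauchy(n)))   # statistic with no outlier planted
    crit <- quantile(null_dist, 0.95)                 # simulated upper 5% critical value

    power <- mean(replicate(nsim, {
      x <- rcauchy(n)
      x[1] <- x[1] + 50              # plant one gross outlier (illustrative shift)
      Tstat(x) > crit
    }))

The abstract does not give the functional form of the ALARM estimator, so the short iteratively reweighted least squares (IRLS) sketch below uses Tukey's bisquare purely as a stand-in redescending weight function, to illustrate how redescending M-estimators down-weight, and eventually ignore, gross outliers.

    ## Generic IRLS for a redescending M-estimator of regression coefficients.
    ## bisq_w is a stand-in weight function, not the ALARM weight function.
    bisq_w <- function(u, c = 4.685) ifelse(abs(u) <= c, (1 - (u / c)^2)^2, 0)

    irls_m <- function(X, y, maxit = 50, tol = 1e-8) {
      beta <- coef(lm(y ~ X - 1))            # OLS start; X includes an intercept column
      for (i in seq_len(maxit)) {
        r <- as.vector(y - X %*% beta)
        w <- bisq_w(r / mad(r))              # redescending weights: extreme residuals get 0
        new_beta <- coef(lm(y ~ X - 1, weights = w))
        if (max(abs(new_beta - beta)) < tol) break
        beta <- new_beta
      }
      beta
    }

    ## Contaminated toy example: two gross outliers in y
    X <- cbind(1, 1:30)
    y <- 2 + 0.5 * (1:30) + rnorm(30)
    y[c(3, 17)] <- y[c(3, 17)] + 25
    irls_m(X, y)

Because the weights redescend to zero, sufficiently extreme residuals receive no influence at all, which is the kind of protection against gross outliers described above.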
In the last part of the thesis, we propose a new bootstrap procedure, called the Split Sample Bootstrap (SSB), a very robust alternative to classical and recently developed bootstrap procedures that provides maximum protection against outliers; the proposed procedure has a high breakdown point. We conducted simulation studies to examine the performance of SSB and to compare it with two other bootstrap procedures under various simulation scenarios, judging the procedures by the bootstrap estimate of the bias, the bootstrap standard error (SE) and the length of the bootstrap confidence interval. The results for the proposed procedure are very promising with respect to bias, SE and interval length: compared with the other two procedures it yields numerically stable and highly efficient estimates, and it produces the shortest confidence intervals for the parameter estimates for all sample sizes, for different numbers of predictor variables in the regression model and at all levels of contamination, particularly in the presence of higher percentages of outliers. Two real-data examples give results consistent with the simulation findings. The computer programming for the simulation studies was done in R (version 2.14.1).
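As a supplement, the sketch below shows in R (the language used for the thesis's computations) how the three comparison criteria named above, namely bootstrap bias, bootstrap standard error and confidence-interval length, can be computed from bootstrap replicates of a regression slope. A plain case-resampling (paired) bootstrap is used only as a stand-in; the Split Sample Bootstrap itself is not reproduced here, since its resampling scheme is not described in this abstract, and all data and constants are illustrative.

    ## Bootstrap bias, SE and percentile-interval length for a regression slope,
    ## using an ordinary case-resampling bootstrap as a stand-in procedure.
    set.seed(1)
    n <- 30
    x <- rnorm(n)
    y <- 1 + 2 * x + rnorm(n)
    y[1:3] <- y[1:3] + 15                       # mild contamination, for illustration

    slope_hat <- coef(lm(y ~ x))[2]             # slope from the full sample
    B <- 2000
    boot_slopes <- replicate(B, {
      idx <- sample(n, replace = TRUE)          # resample (x, y) pairs with replacement
      coef(lm(y[idx] ~ x[idx]))[2]
    })

    bias   <- mean(boot_slopes) - slope_hat           # bootstrap estimate of bias
    se     <- sd(boot_slopes)                         # bootstrap standard error
    ci     <- quantile(boot_slopes, c(0.025, 0.975))  # 95% percentile interval
    ci_len <- diff(ci)                                # the interval-length criterion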
URI: http://142.54.178.187:9060/xmlui/handle/123456789/10358
Appears in Collections: Thesis

Files in This Item:
File: 885.htm (127 B, HTML)


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.