DSpace logo

Please use this identifier to cite or link to this item: http://142.54.178.187:9060/xmlui/handle/123456789/2437
Full metadata record
DC FieldValueLanguage
dc.contributor.authorShahnaz, Rukhsana-
dc.date.accessioned2017-12-04T06:50:06Z-
dc.date.accessioned2020-04-09T16:31:09Z-
dc.date.available2020-04-09T16:31:09Z-
dc.date.issued2010-
dc.identifier.urihttp://142.54.178.187:9060/xmlui/handle/123456789/2437-
dc.description.abstractThe efficient parallelization of sparse matrix-vector product (SMVP) is of prime importance in scientific computing. To achieve this on a distributed memory computers, we concentrate on minimizing the inter-processor communication, achieving a good balance of workload, overlapping communication with computation along with optimizing single processor performance. The thesis consists of two parts presenting the optimization and improvement of sparse matrix-vector multiplication performance on single as well as multi processors. For the performance improvement of SMVP on a single scalar processor, we propose two sparse storage formats, namely the grouped compressed row storage with permutation (GCRSP) and the blocked compressed row storage with permutation (BCRSP). The proposed formats are designed to efficiently exploit the benefits of blocking such as reduced indirect addressing, increased spatial and temporal locality along with eliminating the corresponding overheads. For the good load balancing and low communication cost, reordering of sparse matrices according to their sparsity structure is highly important. For this purpose we proposed reordering based partitioning strategies that tend to exploit sparsity of input matrix presenting the balanced load distribution along with the reduced communication cost. It has been observed that GCRSP improves the performance over simple compressed row storage (CRS) and compressed row storage with permutation (CRSP) with an average of 16% and 25%, respectively. Moreover, due to blocking in BCRSP, the performance improvements of an average of 32%, 41% and 20% are observed over CRS, CRSP and GCRSP respectively. Likewise, the proposed partitioning models permuted row column matrix produce an average of 49% better load balancing and 14% better communication than the corresponding naïve row/column and checker board models. Moreover, they produce same level of balanced load and an average of 78% better communication than the corresponding balanced naïve partitioning i.e. row/column block and balanced checker board (BCH) models. On the whole an average of 30% performance gain for parallel SMVP is achieved by using BCRSP format along with permuted row partitioning over the implementation using CRS format with naïve row partitioning using cluster of eight processors.en_US
dc.description.sponsorshipHigher Education Commission, Pakistanen_US
dc.language.isoenen_US
dc.publisherPakistan Institute of Engineering and Applied Sciences Islamabad, Pakistanen_US
dc.subjectApplied Sciencesen_US
dc.titlePerformance Improvement of Parallel Sparse Matrix-Vector Product on PC Clusteren_US
dc.typeThesisen_US
Appears in Collections:Thesis

Files in This Item:
File Description SizeFormat 
1192.htm128 BHTMLView/Open


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.