A split execution model for SpTRSV

dc.authorid: Buse Yılmaz / 0000-0001-5529-7188
dc.authorscopusid: Buse Yılmaz / 36959745100
dc.authorwosid: Buse Yılmaz / AAG-3125-2021
dc.contributor.author: Ahmad, Najeeb
dc.contributor.author: Yılmaz, Buse
dc.contributor.author: Unat, Didem
dc.date.accessioned: 2021-05-10T12:24:15Z
dc.date.available: 2021-05-10T12:24:15Z
dc.date.issued: 2021 [en_US]
dc.department: Istinye University, Faculty of Engineering and Natural Sciences, Department of Software Engineering [en_US]
dc.description.abstract: Sparse Triangular Solve (SpTRSV) is an important and extensively used kernel in scientific computing. Parallelism within SpTRSV depends on the matrix sparsity pattern and, in many cases, is non-uniform from one computational step to the next. When the SpTRSV computational steps have contrasting parallelism characteristics (some steps are more parallel, others more sequential in nature), the performance of an SpTRSV algorithm may be limited by this contrast. In this work, we propose a split-execution model for SpTRSV that automatically divides an SpTRSV computation into two sub-SpTRSV systems and an SpMV, such that one of the sub-SpTRSVs has more parallelism than the other. Each sub-SpTRSV is then computed using a different SpTRSV algorithm and possibly executed on a different platform (CPU or GPU). By analyzing the SpTRSV Directed Acyclic Graph (DAG) and matrix sparsity features, we use a heuristics-based approach to (i) automatically determine the suitability of an SpTRSV for split execution, (ii) find the appropriate split point, and (iii) execute the SpTRSV in a split fashion using two SpTRSV algorithms while managing any required inter-platform communication. Experimental evaluation of the execution model on two CPU-GPU machines with a dataset of 327 matrices from the SuiteSparse Matrix Collection shows that our approach correctly selects the fastest SpTRSV method (split or unsplit) for 88% of matrices on the Intel Xeon Gold (6148) + NVIDIA Tesla V100 platform and 83% on the Intel Core I7 + NVIDIA G1080 Ti platform, achieving speedups in the range of 1.01 to 10 and 1.03 to 6.36, respectively. [en_US]
dc.identifier.citation: Ahmad, N., Yılmaz, B., & Unat, D. (2021). A Split Execution Model for SpTRSV. IEEE Transactions on Parallel and Distributed Systems. [en_US]
dc.identifier.doi: 10.1109/TPDS.2021.3074501 [en_US]
dc.identifier.issn: 1045-9219 [en_US]
dc.identifier.scopus: 2-s2.0-85104653673 [en_US]
dc.identifier.scopusquality: Q1 [en_US]
dc.identifier.uri: https://www.doi.org/10.1109/TPDS.2021.3074501
dc.identifier.uri: https://hdl.handle.net/20.500.12713/1730
dc.identifier.wos: WOS:000655244100005 [en_US]
dc.identifier.wosquality: Q2 [en_US]
dc.indekslendigikaynak: Web of Science [en_US]
dc.indekslendigikaynak: Scopus [en_US]
dc.institutionauthor: Yılmaz, Buse
dc.language.iso: en [en_US]
dc.publisher: IEEE Computer Society [en_US]
dc.relation.ispartof: IEEE Transactions on Parallel and Distributed Systems [en_US]
dc.relation.publicationcategory: Article - International Refereed Journal - Institutional Faculty Member [en_US]
dc.rights: info:eu-repo/semantics/closedAccess [en_US]
dc.subject: Computational Modeling [en_US]
dc.subject: CPU-GPU Computing [en_US]
dc.subject: Fats [en_US]
dc.subject: Graphics Processing Units [en_US]
dc.subject: Heterogeneous Computing [en_US]
dc.subject: Kernel [en_US]
dc.subject: Parallel Algorithms [en_US]
dc.subject: Phased Arrays [en_US]
dc.subject: Sparse Linear Systems [en_US]
dc.subject: Sparse Matrices [en_US]
dc.subject: Sparse Triangular Solve [en_US]
dc.subject: SpTRSV [en_US]
dc.subject: SpTS [en_US]
dc.title: A split execution model for SpTRSV [en_US]
dc.type: Article [en_US]
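
Illustrative note on the abstract: the split it describes (two sub-SpTRSV systems joined by an SpMV) can be pictured with the standard block decomposition of a lower-triangular system, where the top-left block is solved first, the off-diagonal block updates the remaining right-hand side, and the bottom-right block is solved last. The Python sketch below is a minimal sequential illustration under that assumption, not the authors' implementation. The names `lower_solve` and `split_solve`, the row-list storage format, and the hand-picked split point `k` are all hypothetical; the paper's heuristics for choosing the split point and its CPU/GPU scheduling are not reproduced here.

```python
def lower_solve(rows, b):
    """Forward substitution for a sparse lower-triangular system L x = b.

    rows[i] is a list of (col, value) pairs for row i, with col <= i and
    the diagonal entry present (hypothetical storage format for this sketch).
    """
    x = [0.0] * len(b)
    for i, row in enumerate(rows):
        acc = b[i]
        diag = None
        for j, v in row:
            if j == i:
                diag = v
            else:
                acc -= v * x[j]  # x[j] is already known because j < i
        x[i] = acc / diag
    return x


def split_solve(rows, b, k):
    """Solve L x = b by splitting L at row/column k (assumed split point).

    L = [[L11, 0], [L21, L22]] gives:
      1. sub-SpTRSV:  L11 x1 = b1
      2. SpMV:        b2' = b2 - L21 x1
      3. sub-SpTRSV:  L22 x2 = b2'
    """
    # Step 1: rows 0..k-1 only reference columns < k, so they form L11.
    x1 = lower_solve(rows[:k], b[:k])

    # Step 2: apply L21 to x1 and collect L22 with re-based column indices.
    b2 = list(b[k:])
    l22 = []
    for i in range(k, len(rows)):
        local = []
        for j, v in rows[i]:
            if j < k:
                b2[i - k] -= v * x1[j]    # contribution of L21
            else:
                local.append((j - k, v))  # entry belongs to L22
        l22.append(local)

    # Step 3: solve the second, independent triangular system.
    x2 = lower_solve(l22, b2)
    return x1 + x2


# Small 4x4 example; the split point k=2 is chosen by hand.
rows = [
    [(0, 2.0)],
    [(0, 1.0), (1, 4.0)],
    [(1, -1.0), (2, 1.0)],
    [(0, 3.0), (2, 2.0), (3, 5.0)],
]
b = [2.0, 5.0, 1.0, 12.0]
x_unsplit = lower_solve(rows, b)
x_split = split_solve(rows, b, 2)
```

In this sketch both sub-systems reuse the same sequential solver. In a split-execution setting as the abstract describes it, steps 1 and 3 would each be handed to a different SpTRSV algorithm, possibly on different devices, with step 2 bridging them.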
