Print Material
Author Thomas, Stephen (Mathematician), author.

Title Threaded multi-core GEMM with MoA and cache-blocking : preprint / Stephen Thomas and [three others].

Publication Info. Golden, CO : National Renewable Energy Laboratory, 2022.


Description 1 online resource (10 pages) : color illustrations.
text txt rdacontent
computer c rdamedia
online resource cr rdacarrier
Series NREL/CP ; 2C00-80530
Conference paper (National Renewable Energy Laboratory (U.S.)) ; 2C00-80530.
Note "March 2022."
"Presented at the 2021 World Congress in Computer Science, Computer Engineering, & Applied Computing (CSCE'21), Las Vegas, Nevada, July 26-29, 2021"--Cover.
Bibliography Includes bibliographical references (page 9).
Funding U.S. Department of Energy DE-AC36-08GO28308
Note Description based on online resource; title from PDF title page (NREL, viewed May 26, 2022).
Summary A threaded multi-core implementation of the high-performance dense linear algebra matrix-matrix multiply (GEMM) kernel is described. This kernel is widely implemented by vendors in the Basic Linear Algebra Subprograms (BLAS) library. The mathematics of arrays (MoA) paradigm due to Mullin (1988) results in contiguous memory accesses by employing outer-product forms. Our performance studies demonstrate that the MoA implementation of double-precision DGEMM, combined with optimal cache-blocking strategies, results in at least a 25% performance gain on the Intel Xeon Skylake processor over the vendor-supplied Intel MKL basic linear algebra libraries. Results are presented for the NREL Eagle supercomputer. The multi-core DGEMM achieves over 100 GFLOP/s with eight OpenMP threads.
Subject Array processors.
Computer science -- Mathematics.
Algebras, Linear.
Processeurs de tableaux.
Informatique -- Mathématiques.
Algèbre linéaire.
Indexed Term cache-blocking
contiguous memory
mathematics of arrays
shared-memory multi-threading
Added Author National Renewable Energy Laboratory (U.S.), issuing body.
Standard No. 1848079 OSTI ID
Gpo Item No. 0430-P-04 (online)
Sudoc No. E 9.17:NREL/CP-2 C 00-80530

 
    