Bibliographic record · Lookup and access
Journal article

ISA-Based GEMM Acceleration: Maximizing Computational Intensity for Scalable Performance

Casio P. Krebs et al. · IEEE · 2026

Open access available
Serial publication

3PS-RAN: A Real-Time Framework for Securing the O-RAN RACH Against DDoS Attacks Toward NextG

This serial publication contains 172 related items.

Resource access

Resource identified as open access; whether it links directly to the full text has not been confirmed automatically.

Abstract

The growing demand for intensive artificial intelligence applications, such as deep neural networks and Large Language Models (LLMs), has driven the creation of architecture extensions and accelerators for matrix multiplication, a fundamental operation in machine learning. Several modern Instruction Set Architectures (ISAs) have introduced matrix extensions, each employing distinct approaches to data storage and execution. This work examines the scalability of computational intensity (CI) across different matrix data layout models and ISA-level processing, focusing on proposals centered on the RISC-V architecture. We create an analytical model that associates CI with the submatrix format and the available storage budget. Our results reveal that the outer product model achieves CI similar to that of architectures that perform full-block multiplication, but with much lower storage costs. We also demonstrate an upper bound for the computational intensity of outer product approaches under storage constraints. Finally, we suggest a technique that reuses existing RISC-V vector registers for matrix computation, organizing the data into two-dimensional grids to optimize CI. The proposed approach achieves up to 99.6% of the upper bound with 256-bit registers and 99.2% with 2048-bit registers. We conclude that it is possible to achieve high efficiency without increasing the architectural area, allowing scalability and compatibility with already standardized vector extensions.
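The relationship the abstract describes between tile shape, storage budget, and computational intensity can be illustrated with a small sketch. This is not the paper's analytical model: it assumes a textbook simplification in which one outer-product step loads m + n operand elements and performs 2·m·n floating-point operations on an m × n accumulator tile, and in which storage is counted in elements rather than bits. The function names and the budget of 80 elements are hypothetical.

```python
def outer_product_ci(m: int, n: int) -> float:
    """FLOPs per operand element loaded for one rank-1 update of an m x n tile.

    Assumption: 2*m*n FLOPs (multiply + add) per step, m + n elements loaded.
    """
    return 2 * m * n / (m + n)


def storage_outer(m: int, n: int) -> int:
    """Elements held by an outer-product scheme: accumulator tile plus two vectors."""
    return m * n + m + n


def storage_block(m: int, n: int, k: int) -> int:
    """Elements held when full A (m x k), B (k x n), and C (m x n) tiles are stored."""
    return m * n + m * k + k * n


def best_tile(budget: int):
    """Exhaustively search for the (ci, m, n) with highest CI within the budget."""
    best = None
    for m in range(1, budget + 1):
        if storage_outer(m, m) > budget:
            break  # even the smallest tile with n >= m no longer fits
        for n in range(m, budget + 1):
            if storage_outer(m, n) > budget:
                break
            ci = outer_product_ci(m, n)
            if best is None or ci > best[0]:
                best = (ci, m, n)
    return best


if __name__ == "__main__":
    budget = 80  # hypothetical storage budget, in elements
    ci, m, n = best_tile(budget)
    print(f"best tile {m}x{n}: CI = {ci}, storage = {storage_outer(m, n)}")
    print(f"full-block storage for the same CI (k = {m}): {storage_block(m, n, m)}")
    print(f"sqrt(budget) bound: {budget ** 0.5:.3f}")
```

Under these assumptions the full-block scheme has the same intensity, 2·m·n·k / (m·k + k·n) = 2·m·n / (m + n), but stores the extra operand tiles. Because 2·m·n / (m + n) is the harmonic mean of m and n, it cannot exceed √(m·n), so in this simplified model the intensity stays below the square root of the storage budget and square tiles come closest to it.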

How to cite

APA 7

Krebs, C. P., et al. (2026). ISA-Based GEMM Acceleration: Maximizing Computational Intensity for Scalable Performance. https://doi.org/10.1109/ACCESS.2026.3685979

MLA

Krebs, Casio P., et al. "ISA-Based GEMM Acceleration: Maximizing Computational Intensity for Scalable Performance." 2026. https://doi.org/10.1109/ACCESS.2026.3685979.

Chicago

Krebs, Casio P., et al. 2026. "ISA-Based GEMM Acceleration: Maximizing Computational Intensity for Scalable Performance." https://doi.org/10.1109/ACCESS.2026.3685979.

Harvard

Krebs, C. P. et al. 2026, ISA-Based GEMM Acceleration: Maximizing Computational Intensity for Scalable Performance, IEEE, available at: https://doi.org/10.1109/ACCESS.2026.3685979 [Accessed 29 Jun. 2026].

Resource details

Title
ISA-Based GEMM Acceleration: Maximizing Computational Intensity for Scalable Performance
Author / contributors
Casio P. Krebs et al.
Publisher
IEEE
Publication year
2026
ISSN
2169-3536
Language
English (eng)
