GitHub - trevorpogue/algebraic-nnhw: AI acceleration using matrix multiplication with half the multiplications

Main Points

FFIP Algorithm and Architecture

The repository presents a new algorithm, FFIP, and a matching hardware architecture that improve the compute efficiency of ML accelerators by cutting the number of multiplications needed for matrix multiplication nearly in half.

Applicability and Performance of FFIP

The FFIP algorithm applies to ML model layers that mainly decompose into matrix multiplication, and the authors report higher throughput and compute efficiency than prior solutions on the same type of compute platform.

Comprehensive Source Code for Implementation

The source code covers the full implementation flow: a compiler, RTL descriptions, simulation scripts, and testbenches.

Insights

Introduction of a novel algorithm and architecture

We introduce a new algorithm called the Free-pipeline Fast Inner Product (FFIP) and its hardware architecture that improve an under-explored fast inner-product algorithm (FIP) proposed by Winograd in 1968.
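To make the idea behind FIP concrete, here is a minimal pure-Python sketch of Winograd's 1968 inner-product identity applied to matrix multiplication. It is an illustration only, not code from the repository: the function names `fip_matmul` and `naive_matmul` are hypothetical, and the sketch covers plain FIP rather than the FFIP reformulation or its systolic-array mapping. The identity pairs up adjacent terms so that each product `(a0 + b1) * (a1 + b0)` covers two terms of the ordinary inner product, and then subtracts correction terms that depend only on one row of `A` or one column of `B`.

```python
def naive_matmul(A, B):
    """Reference product: one multiplication per (i, j, k) triple."""
    M, K, N = len(A), len(B), len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(K)) for j in range(N)]
            for i in range(M)]


def fip_matmul(A, B):
    """Multiply A (M x K) by B (K x N) with Winograd's 1968 inner-product identity.

    Illustrative sketch only; assumes K is even.
    """
    M, K, N = len(A), len(B), len(B[0])
    if K % 2 != 0:
        raise ValueError("K must be even (pad with a zero column/row otherwise)")
    half = K // 2

    # Correction terms: one per row of A and one per column of B.
    # Each is computed once and reused for every output that shares it.
    alpha = [sum(A[i][2 * k] * A[i][2 * k + 1] for k in range(half))
             for i in range(M)]
    beta = [sum(B[2 * k][j] * B[2 * k + 1][j] for k in range(half))
            for j in range(N)]

    C = [[0] * N for _ in range(M)]
    for i in range(M):
        for j in range(N):
            acc = 0
            for k in range(half):
                # A single multiplication yields a0*b0 + a1*b1 plus the two
                # cross terms a0*a1 and b0*b1, which alpha and beta remove.
                acc += (A[i][2 * k] + B[2 * k + 1][j]) * (A[i][2 * k + 1] + B[2 * k][j])
            C[i][j] = acc - alpha[i] - beta[j]
    return C
```

Calling `fip_matmul(A, B)` on integer matrices with an even inner dimension should return the same result as `naive_matmul(A, B)`, while the inner loop runs only `K / 2` multiplications per output element.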

Potential impact on ML accelerators

FFIP can be seamlessly incorporated into traditional fixed-point systolic array ML accelerators to achieve the same throughput with half the number of multiply-accumulate (MAC) units, or it can double the maximum systolic array size that can fit onto devices with a fixed hardware budget.
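The "half the multiplications" figure can be checked with a back-of-the-envelope count of scalar multiplications. The sketch below is an assumption-laden illustration, not a model of the repository's hardware: the function names are hypothetical, and it counts multiplications for the Winograd-style inner-product identity on an `M x K` by `K x N` product with even `K`, where the per-row and per-column correction terms are computed once and reused.

```python
def baseline_mults(M, K, N):
    """Multiplications in an ordinary M x K by K x N matrix product."""
    return M * K * N


def fip_mults(M, K, N):
    """Multiplications when each output uses K/2 paired products.

    Adds the one-off correction terms: K/2 products per row of A
    and K/2 products per column of B. Assumes K is even.
    """
    if K % 2 != 0:
        raise ValueError("K must be even")
    main = M * N * (K // 2)          # paired products for every output
    corrections = (M + N) * (K // 2)  # row and column correction terms
    return main + corrections


def mult_ratio(M, K, N):
    """Fraction of baseline multiplications that remain."""
    return fip_mults(M, K, N) / baseline_mults(M, K, N)
```

Under these assumptions, a 64 x 64 by 64 x 64 product needs 135168 multiplications instead of 262144, a ratio of about 0.516. The overhead from the correction terms shrinks as `M` and `N` grow, which is why the saving approaches one half.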

Technical approach and implementation

The repository contains source code for ML hardware architectures that require nearly half the number of multiplier units to achieve the same performance by executing alternative inner-product algorithms. It includes a compiler for parsing Python model descriptions into accelerator instructions, synthesizable SystemVerilog RTL for the baseline, FIP, and FFIP systolic array architectures, and additional utilities for development.

URL

https://github.com/trevorpogue/algebraic-nnhw

Tags