Green Computing Benchmarks

This repository contains a tool for benchmarking different implementations of the BLAS and LAPACK libraries.

Installation

This tool has the following dependencies:

meson, version 1.3.2 or newer.
ninja
A Fortran compiler
- The build has been tested with gfortran 11.4.0 and ifx 2025.3.0. The project does not currently build with flang.
FlexiBLAS
- All development and testing was done with version 3.4.5.
CMake (for finding dependencies)

To install the tool:

Clone this repository
Navigate to the green-computing-benchmarks directory
Run the following commands:

meson setup build --optimization=3
meson compile -C build

To specify the compiler you want to use, replace the first command with:

FC=compiler meson setup build --optimization=3

where compiler is your chosen Fortran compiler executable (for example, gfortran or ifx).

Building for GPUs

The tool has support for running on GPUs using CUDA and HIP.

CUDA

To enable CUDA benchmarks, use the meson option -Dcuda=true. For example:

meson setup build -Dcuda=true --optimization=3

You must have an Nvidia GPU, with the CUDA and cuBLAS libraries installed.

HIP

To enable HIP benchmarks, use the meson option -Dhip=true. For example:

meson setup build -Dhip=true --optimization=3

You must have hipfort, HIP and hipBLAS installed. hipfort must be on your CMAKE_PREFIX_PATH, e.g. CMAKE_PREFIX_PATH=/opt/hipfort/lib/fortran/f95/cmake/. Currently this tool only supports HIP on AMD GPUs.

Running

You specify the benchmarks to run in a TOML configuration file. Add one [[benchmarks]] entry for each function you want to benchmark, using the syntax:

[[benchmarks]]
name = "NAME"
m-sizes = [10, 20, 30]
n-sizes = [10, 50, 30]
k-sizes = [10, 50, 30]

Here "NAME" is the function you wish to benchmark. The list of available functions is below; unknown names are ignored. m-sizes, n-sizes and k-sizes give the dimensions of the matrices and/or vectors the function is benchmarked with. Benchmarks run on the Cartesian product of these arrays, so the config above runs $3 \times 3 \times 3 = 27$ problem sizes. Problem data is randomly generated.

example.toml, at the top level of this repository, is a complete example config.
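To make the size sweep concrete, here is a minimal Python sketch of how the Cartesian product of the three arrays yields the problem list. It only illustrates the counting; the variable names are illustrative and the order in which the tool itself visits the sizes is not specified here.

```python
from itertools import product

# Size arrays copied from the [[benchmarks]] entry above.
m_sizes = [10, 20, 30]
n_sizes = [10, 50, 30]
k_sizes = [10, 50, 30]

# Every (m, n, k) combination is one benchmark problem.
problems = list(product(m_sizes, n_sizes, k_sizes))

print(len(problems))  # 27, i.e. 3 * 3 * 3
print(problems[0])    # (10, 10, 10)
```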

To run the benchmarks, run build/benchmark_blas config.toml, replacing config.toml with the path to your own configuration file. A separate CSV file is created for each routine in the directory you run the tool from.

Alternatively, run the run_blas_benchmarks.sh script, which runs your chosen benchmarks with every BLAS implementation available to FlexiBLAS and collects all results files in a timestamped directory. We do not recommend this script for GPU runs or for the NAIVE benchmark: changing the BLAS implementation has no effect on them, but the runs are still repeated for every implementation.

To choose the BLAS backend manually, set the FLEXIBLAS environment variable when running the tool:

FLEXIBLAS="YOUR_BLAS" ./build/benchmark_blas config.toml

Available BLAS backends are shown with flexiblas list.
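If you want to sweep a hand-picked subset of backends yourself, the idea can be sketched in Python as below. This is an illustration under stated assumptions, not a description of what run_blas_benchmarks.sh does: the helper names backend_runs and run_all are hypothetical, and the backend names are placeholders that you should replace with entries reported by flexiblas list on your system.

```python
import os
import subprocess

def backend_runs(backends, config_path, binary="./build/benchmark_blas"):
    """Return one (environment, argv) pair per FlexiBLAS backend name."""
    runs = []
    for backend in backends:
        env = dict(os.environ)
        env["FLEXIBLAS"] = backend  # same variable as in the command above
        runs.append((env, [binary, config_path]))
    return runs

def run_all(backends, config_path):
    """Run the benchmark once per backend, stopping at the first failure."""
    for env, argv in backend_runs(backends, config_path):
        subprocess.run(argv, env=env, check=True)

# Placeholder backend names: show what would be executed without running it.
for env, argv in backend_runs(["NETLIB", "OPENBLAS"], "config.toml"):
    print(f'FLEXIBLAS={env["FLEXIBLAS"]}', *argv)
```

Calling run_all with the same arguments would execute the runs for real, so it needs the tool to be built first.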

Available functions

Level 1 (vector operations):

DAXPY
- Double precision $\alpha x + y$
- Required options:
  - n-sizes (array)
DASUM
- Double precision sum of the absolute values of a vector
- Required options:
  - n-sizes (array)
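
For readers unfamiliar with the Level 1 routines, the two operations can be written out in plain Python. This is a reference sketch of the mathematics only, with hypothetical function names; it returns new lists, whereas the BLAS DAXPY routine overwrites y in place.

```python
def axpy(alpha, x, y):
    """Return alpha * x + y element-wise for two vectors of length n."""
    return [alpha * xi + yi for xi, yi in zip(x, y)]

def asum(x):
    """Return the sum of the absolute values of the entries of x."""
    return sum(abs(xi) for xi in x)

x = [1.0, -2.0, 3.0]
y = [0.5, 0.5, 0.5]
print(axpy(2.0, x, y))  # [2.5, -3.5, 6.5]
print(asum(x))          # 6.0
```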

Level 2 (matrix-vector operations):

DGEMV
- Double precision $\alpha A x + \beta y$
- Required options:
  - m-sizes (array)
  - n-sizes (array)
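
The matrix-vector update can be sketched the same way. The function name gemv is hypothetical and the sketch assumes the non-transposed form, where A is m-by-n (stored as a list of rows), x has length n and y has length m.

```python
def gemv(alpha, A, x, beta, y):
    """Return alpha * A x + beta * y for an m-by-n matrix A."""
    return [
        alpha * sum(a_ij * x_j for a_ij, x_j in zip(row, x)) + beta * y_i
        for row, y_i in zip(A, y)
    ]

A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]   # m = 2, n = 3
x = [1.0, 0.0, -1.0]    # length n
y = [10.0, 20.0]        # length m
print(gemv(2.0, A, x, 0.5, y))  # [1.0, 6.0]
```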

Level 3 (matrix-matrix operations):

DGEMM
- Double precision $\alpha A B + \beta C$
- Required options:
  - m-sizes (array)
  - n-sizes (array)
  - k-sizes (array)
CUBLAS_DGEMM (CUDA only)
- Double precision $\alpha A B + \beta C$ on Nvidia GPU
- Required options:
  - m-sizes (array)
  - n-sizes (array)
  - k-sizes (array)
HIPBLAS_DGEMM (HIP only)
- Double precision $\alpha A B + \beta C$ on AMD GPU
- Required options:
  - m-sizes (array)
  - n-sizes (array)
  - k-sizes (array)
DSYRK
- Double precision symmetric rank-k update $\alpha A A^T + \beta C$
- Required options:
  - n-sizes (array)
  - k-sizes (array)
DSY2RK
- Double precision symmetric rank-2k update $\alpha A B^T + \alpha B A^T + \beta C$
- Required options:
  - n-sizes (array)
  - k-sizes (array)
NAIVE
- Naive non-BLAS matrix multiply $\alpha A B + \beta C$
- Required options:
  - m-sizes (array)
  - n-sizes (array)
  - k-sizes (array)
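
The Level 3 formulas above can be written out as plain Python for small inputs. This is a reference sketch, not the tool's code: all function names are hypothetical, and it assumes the non-transposed forms in which A is m-by-k and B is k-by-n for the matrix product, and A and B are n-by-k for the rank-k and rank-2k updates. The BLAS routines update C in place, and the symmetric updates only touch one triangle of C; the sketch returns full new matrices instead.

```python
def matmul(A, B):
    """Naive triple-loop product of an m-by-k and a k-by-n matrix."""
    m, k, n = len(A), len(B), len(B[0])
    C = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            for p in range(k):
                C[i][j] += A[i][p] * B[p][j]
    return C

def transpose(A):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]

def scaled_sum(alpha, X, beta, Y):
    """Return alpha * X + beta * Y for two equally sized matrices."""
    return [[alpha * x + beta * y for x, y in zip(rx, ry)]
            for rx, ry in zip(X, Y)]

def gemm(alpha, A, B, beta, C):
    """alpha * A B + beta * C"""
    return scaled_sum(alpha, matmul(A, B), beta, C)

def rank_k_update(alpha, A, beta, C):
    """alpha * A A^T + beta * C"""
    return scaled_sum(alpha, matmul(A, transpose(A)), beta, C)

def rank_2k_update(alpha, A, B, beta, C):
    """alpha * A B^T + alpha * B A^T + beta * C"""
    S = scaled_sum(1.0, matmul(A, transpose(B)), 1.0, matmul(B, transpose(A)))
    return scaled_sum(alpha, S, beta, C)

A = [[1.0, 2.0], [3.0, 4.0]]
B = [[0.0, 1.0], [1.0, 0.0]]
C = [[1.0, 0.0], [0.0, 1.0]]
print(gemm(1.0, A, B, 2.0, C))            # [[4.0, 1.0], [4.0, 5.0]]
print(rank_k_update(1.0, A, 0.0, C))      # [[5.0, 11.0], [11.0, 25.0]]
print(rank_2k_update(1.0, A, B, 0.0, C))  # [[4.0, 5.0], [5.0, 6.0]]
```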

LAPACK Linear solvers

DGESV
- Double precision solution to $Ax = B$
- Required options:
  - n-sizes (array)
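
As a rough illustration of the problem DGESV solves, the sketch below solves $A X = B$ for a small square system using Gaussian elimination with partial pivoting. It is a teaching sketch with a hypothetical function name, not LAPACK's implementation, and it makes no attempt at performance.

```python
def solve(A, B):
    """Solve A X = B for a square matrix A; B holds one or more columns."""
    n = len(A)
    nrhs = len(B[0])
    # Work on an augmented copy [A | B] so the inputs stay untouched.
    M = [[float(v) for v in ra] + [float(v) for v in rb]
         for ra, rb in zip(A, B)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if M[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        # Eliminate the entries below the pivot.
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + nrhs):
                M[r][c] -= factor * M[col][c]
    # Back substitution, one right-hand-side column at a time.
    X = [[0.0] * nrhs for _ in range(n)]
    for j in range(nrhs):
        for i in reversed(range(n)):
            s = sum(M[i][c] * X[c][j] for c in range(i + 1, n))
            X[i][j] = (M[i][n + j] - s) / M[i][i]
    return X

A = [[2.0, 1.0], [1.0, 3.0]]
B = [[4.0], [7.0]]
print(solve(A, B))  # [[1.0], [2.0]]
```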
