v0.6.0

@nathanielsimard nathanielsimard released this 18 Jul 16:12
· 36 commits to main since this release

Summary

CubeCL 0.6.0 improves performance, functionality, and compatibility across backends. Key features include n-dimensional convolution, multi-stage matrix multiplication (matmul), and dynamic shared memory support for CUDA. Performance optimizations include a reworked into_contiguous and double buffering. New functionality includes random number generation, fp8/fp6 support, and recursive profiling.
Bug fixes address issues in backends (Metal, HIP, Vulkan, WASM), memory alignment, and deadlocks.

What's New

Features

Performance Improvements

Bug Fixes

Refactorings

Documentation & Testing

Dependencies & Maintenance


Thank you to all contributors for making CubeCL 0.6.0 possible!