ReneEnjilian
Contributor

This PR starts the implementation of custom Java–CUDA bindings for SystemDS, called cujava. The long-term goal is to replace JCuda altogether. The project is split into a Java layer and a C++/JNI layer. In this PR I’ve focused on the runtime and laid the foundation for the remaining modules.

Project structure

src/main/java/org/apache/sysds/cujava:

cujava/
├─ cublas/
├─ cudnn/
├─ cusolver/
├─ cusparse/
├─ driver/
├─ interop/
└─ runtime/
   ├─ CudaDeviceProp
   ├─ CudaError
   ├─ CudaMemcpyKind
   └─ CuJava
CudaDataType
CudaException
CuJavaLibLoader
NativePointerObject
Pointer
Sizeof

Each directory contains the corresponding Java-side implementation. So far, I have focused on the runtime package.

src/main/cpp/jni:

jni/
├─ common/
├─ cublas/
├─ cudnn/
├─ cusolver/
├─ cusparse/
├─ driver/
└─ runtime/
build_cujava_libs.sh
CMakeLists.txt

These directories hold the C++/JNI implementations that mirror the Java side. Each directory has its own CMakeLists.txt to produce a dedicated shared library. The libraries are emitted under src/main/cpp/lib. In this PR I implemented the initial parts of runtime and common; the rest will follow in the next PRs.
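As a rough illustration of how the Java side might locate one of these per-module libraries, here is a minimal sketch. The library base name `cujava_runtime` and the class `LibPathSketch` are hypothetical, and the actual CuJavaLibLoader may work quite differently; only `System.mapLibraryName`, `Paths.get`, and `System.load` are standard JDK calls.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical sketch; NOT the actual CuJavaLibLoader from this PR.
class LibPathSketch {
    // Directory where, per the description above, the shared libraries are emitted.
    static final String LIB_DIR = "src/main/cpp/lib";

    // Resolve the platform-specific file for a module's library,
    // e.g. "cujava_runtime" maps to libcujava_runtime.so on Linux.
    static Path resolve(String baseName) {
        String fileName = System.mapLibraryName(baseName);
        return Paths.get(LIB_DIR, fileName).toAbsolutePath();
    }

    // Load the library; System.load requires an absolute path.
    static void load(String baseName) {
        System.load(resolve(baseName).toString());
    }
}
```

Keeping one library per module means a caller that only needs the runtime would only have to load that one file, assuming the modules do not depend on each other at link time.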

Example usage

Writing code with cuJava is reminiscent of JCuda—only the imports change:

import org.apache.sysds.cujava.runtime.CuJava;
CuJava.cudaMalloc(...);
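For context, JCuda-style runtime calls take a pointer holder plus a size and return an integer status code. The sketch below imitates that convention with a stub so it runs without a GPU. `DevPtr`, `FakeRuntime`, and `check` are hypothetical stand-ins, not cuJava classes; only the numeric values of `cudaSuccess` (0) and `cudaErrorMemoryAllocation` (2) are taken from the CUDA runtime API.

```java
// Hypothetical stand-ins that imitate the JCuda-style calling convention.
// These are NOT the cuJava classes and make no native calls.
class DevPtr {
    long address; // would hold the device address after allocation
}

class FakeRuntime {
    static final int cudaSuccess = 0;               // value from the CUDA runtime API
    static final int cudaErrorMemoryAllocation = 2; // value from the CUDA runtime API

    private static final long CAPACITY = 1L << 30;  // arbitrary limit for the stub
    private static long next = 0x1000;              // fake address counter

    // Mimics the shape of cudaMalloc: fills the pointer, returns a status code.
    static int cudaMalloc(DevPtr ptr, long bytes) {
        if (bytes < 0 || bytes > CAPACITY)
            return cudaErrorMemoryAllocation;
        ptr.address = next;
        next += bytes;
        return cudaSuccess;
    }

    // Turns a non-success status code into an exception.
    static void check(int status) {
        if (status != cudaSuccess)
            throw new IllegalStateException("CUDA call failed with status " + status);
    }
}
```

A caller would write `FakeRuntime.check(FakeRuntime.cudaMalloc(ptr, 1024))`. With a real binding, the same line would presumably differ only in the class names and imports.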

Notes

  • Parallel to JCuda: cuJava currently exists alongside JCuda and is not integrated into the SystemDS GPU backend yet. The existing GPU backend remains unchanged and continues to use JCuda. The newly added cuJava runtime methods have been tested successfully.
  • API scope: cuJava intentionally implements only the CUDA/NVIDIA library calls that SystemDS actually uses. This reduced scope, in contrast to JCuda’s full coverage, makes upgrades to newer CUDA versions faster: we only need to update the deprecated APIs we rely on, whereas JCuda must update every deprecated method, which is cumbersome and wasteful.

@mboehm7

codecov bot commented Aug 20, 2025

Codecov Report

❌ Patch coverage is 0% with 270 lines in your changes missing coverage. Please review.
✅ Project coverage is 72.44%. Comparing base (b1c5d64) to head (ec268c7).
⚠️ Report is 5 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...ava/org/apache/sysds/cujava/runtime/CudaError.java | 0.00% | 136 Missing ⚠️ |
| src/main/java/org/apache/sysds/cujava/Pointer.java | 0.00% | 52 Missing ⚠️ |
| .../java/org/apache/sysds/cujava/CuJavaLibLoader.java | 0.00% | 21 Missing ⚠️ |
| ...rg/apache/sysds/cujava/runtime/CudaDeviceProp.java | 0.00% | 20 Missing ⚠️ |
| ...a/org/apache/sysds/cujava/NativePointerObject.java | 0.00% | 10 Missing ⚠️ |
| .../org/apache/sysds/cujava/interop/JCudaAdapter.java | 0.00% | 10 Missing ⚠️ |
| ...n/java/org/apache/sysds/cujava/runtime/CuJava.java | 0.00% | 8 Missing ⚠️ |
| ...in/java/org/apache/sysds/cujava/CudaException.java | 0.00% | 4 Missing ⚠️ |
| ...java/org/apache/sysds/cujava/driver/CUcontext.java | 0.00% | 1 Missing ⚠️ |
| .../java/org/apache/sysds/cujava/driver/CUdevice.java | 0.00% | 1 Missing ⚠️ |

... and 7 more
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #2311      +/-   ##
============================================
- Coverage     72.54%   72.44%   -0.11%     
- Complexity    46621    46629       +8     
============================================
  Files          1492     1509      +17     
  Lines        176098   176368     +270     
  Branches      34587    34589       +2     
============================================
+ Hits         127742   127761      +19     
- Misses        38701    38959     +258     
+ Partials       9655     9648       -7     


@ReneEnjilian
Contributor Author

The errors resulted from a mistake in the Javadoc. Apparently, one can't use a bare < character there. I have fixed that now.
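For reference, a bare `<` in a Javadoc comment is read as the start of an HTML tag, so the Javadoc lint check can reject it as malformed HTML. Two standard ways to write the character are the `{@code ...}` inline tag and the `&lt;` entity. The class and method below are made up for illustration and are not taken from this PR.

```java
// Illustrative class; not part of this PR.
class JavadocEscapeSketch {
    /**
     * Returns {@code true} if {@code a < b}.
     * Equivalent entity form: a &lt; b.
     */
    static boolean isLess(int a, int b) {
        return a < b;
    }
}
```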

@ReneEnjilian
Contributor Author

The two failing tests are unrelated to my changes. The first failing check stems from the federated backend and the second is in the Python layer of SystemDS. cujava touches neither of them and is in fact isolated from the rest of the codebase. I guess this is a CI/CD issue?

@mboehm7
Contributor

mboehm7 commented Aug 21, 2025

Yes, there are some flaky tests as well as timeouts. I just kicked off these failed jobs again (you can also do this by going to the failed jobs and hitting the "Re-run failed jobs" button).

@ReneEnjilian
Contributor Author

Okay, all tests pass now. After this PR is merged, I will open a follow-up PR that completes the cujava runtime package. I will (hopefully) finish that PR either tomorrow or on Saturday.

@ReneEnjilian changed the title from "cujava: Custom java-cuda bindings for SystemDS" to "[WIP] cujava: Custom java-cuda bindings for SystemDS" on Aug 23, 2025