
Add top-level mutex class #3739


Draft · d-torrance wants to merge 6 commits into development

Conversation

@d-torrance (Member) commented Apr 18, 2025

This week, @LukeOeding inquired on Zulip about the possibility of a parallelSum function. In order to make something like this work, we'd need some kind of mutex object to make sure that multiple threads aren't trying to modify the same thing at the same time.

We're already using a wrapper around pthread_mutex_t in the interpreter, so we export this to the top level as a new Mutex class. In addition to the constructor, there are three methods:

  • lock, which blocks until it can lock the mutex
  • tryLock, which tries to lock the mutex and raises an error if it can't
  • unlock, which unlocks the mutex
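
As a quick illustration of the interface (just a sketch; no session output shown):

m = new Mutex;
lock m;     -- blocks until the mutex is available, then locks it
unlock m;   -- releases it
tryLock m;  -- succeeds here, since the mutex is currently free
unlock m;   -- a second tryLock before this unlock would raise an error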

For example, here's a possible implementation of parallelSum using this new class:

i2 : parallelSum = (X, f, result) -> (
         mutex := new Mutex;
         T := apply(X, x -> schedule(() -> (      -- one task per element of X
                     y := f x;                    -- computed in parallel
                     lock mutex;                  -- serialize updates to result
                     result += y;
                     unlock mutex)));
         while not all(T, isReady) do null;       -- busy-wait until every task finishes
         result)

o2 = parallelSum

o2 : FunctionClosure

i3 : R = QQ[a..z]

o3 = R

o3 : PolynomialRing

i4 : parallelSum(gens R, x -> x^2, 0_R)

      2    2    2    2    2    2    2    2    2    2    2    2    2    2    2  
o4 = a  + b  + c  + d  + e  + f  + g  + h  + i  + j  + k  + l  + m  + n  + o  +
     --------------------------------------------------------------------------
      2    2    2    2    2    2    2    2    2    2    2
     p  + q  + r  + s  + t  + u  + v  + w  + x  + y  + z

o4 : R

@jkyang92 - Would you be willing to review this when you get a chance?

d-torrance requested a review from jkyang92 on April 18, 2025
@mahrud (Member) commented Apr 18, 2025

Is this actually faster than a sequential sum? The while not all(T, isReady) do null; makes me suspect not.

@d-torrance (Member, Author)

Yeah -- the while loop just waits until each of the tasks is done, but they're still running in parallel.

Here's a silly example:

i2 : sleepIdentity = x -> (sleep 1; x)

o2 = sleepIdentity

o2 : FunctionClosure

i3 : elapsedTime sum(1..10, sleepIdentity)
 -- 10.0012s elapsed

o3 = 55

i4 : elapsedTime parallelSum(1..10, sleepIdentity, 0)
 -- 2.36233s elapsed

o4 = 55

@mahrud (Member) commented Apr 18, 2025

Could you try a real example?

@d-torrance (Member, Author)

Here's a more interesting example:

i2 : R = QQ[x_0..x_8];

i3 : I = ideal random(R^4, R^{-2});

o3 : Ideal of R

i4 : J = ideal gens R;

o4 : Ideal of R

i5 : elapsedTime sum(gens R, f -> saturate(ideal I_*, f));
 -- 3.96602s elapsed

o5 : Ideal of R

i6 : elapsedTime parallelSum(gens R, f -> saturate(ideal I_*, f), ideal 0_R);
 -- 2.80849s elapsed

o6 : Ideal of R

@mahrud (Member) commented Apr 19, 2025

No, really -- you parallelized computing the entries, but the summing part is even slower:

i1 : parallelSum = (X, f, result) -> (
         mutex := new Mutex;
         T := apply(X, x -> schedule(() -> (
                     y := f x;
                     lock mutex;
                     result += y;
                     unlock mutex)));
         while not all(T, isReady) do null;
         result)

o1 = parallelSum

o1 : FunctionClosure

i2 : R = QQ[x_0..x_8];

i3 : I = ideal random(R^4, R^{-2});

o3 : Ideal of R

i4 : J = ideal gens R;

o4 : Ideal of R

i5 : L = apply(gens R, f -> saturate(ideal I_*, f));

i6 : benchmark "sum L" -- 0.007

o6 = .00718855184482758

o6 : RR (of precision 53)

i7 : benchmark "parallelSum(L, identity, ideal 0_R)"

o7 = .01879672231067965

o7 : RR (of precision 53)

And even in parallelizing the individual saturations, the mutexes and the while loop are hurting more than they're helping. For instance, using the await/async pattern without mutexes is faster:

i35 : elapsedTime apply(10, i -> sum await apply(gens R, async(f -> saturate(ideal I_*, f))));
 -- 26.623s elapsed

i36 : elapsedTime apply(10, i -> parallelSum(gens R, f -> saturate(ideal I_*, f), ideal 0_R));
 -- 30.9081s elapsed

@d-torrance (Member, Author)

Of course the summing part is slower -- there's the added overhead of dealing with the mutexes.

I'm not proposing that we add this particular parallelSum function to Core or anything, and maybe this was a bad example, since we can always compute the summands concurrently and then add them sequentially at the end without mutexes, as in your example.

But mutexes would be useful in parallel algorithms where the threads share some mutable data structure -- say, a queue in a breadth-first search.
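
For instance, here's a rough sketch of a worker pool draining a shared work queue, using the class from this PR (the pool size of 4 and the name parallelApply are just for illustration):

parallelApply = (X, f) -> (
    mutex := new Mutex;
    next := 0;                                  -- shared index into the work queue
    results := new MutableList from toList(#X : null);
    worker := () -> while true do (
        lock mutex;                             -- guard the shared counter
        i := next;
        next = next + 1;
        unlock mutex;
        if i >= #X then break;
        results#i = f X#i);                     -- each slot is written by exactly one task
    T := apply(4, j -> schedule worker);        -- fixed pool of 4 worker tasks
    while not all(T, isReady) do null;
    toList results)

Each worker pops the next index from the queue under the lock, so no two tasks ever process the same item, regardless of how the work is balanced.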

@mahrud (Member) commented Apr 19, 2025

Sure, I'm not opposed to adding mutexes, and in theory they could be very useful in things like capture to parallelize running examples and tests. But I want to see an example that demonstrates a realistic advantage in Macaulay2 first; otherwise it's hard to tell whether there are bugs or leaks under the surface.

@mahrud (Member) commented Apr 19, 2025

More specifically, I am worried about what happens when there is an error, or the code is interrupted, while a mutex is locked. Language-wise, too, I wonder whether what we want is the low-level pthread_mutex wrapper proposed here, or something higher level, like a keyword that prevents a piece of code from being executed by multiple threads simultaneously and automatically unlocks if the code is interrupted.

Here is an example of what I would rather have:

n = 0
f = x -> threadLock ( n += x )

where threadLock is a keyword that asks the parser to bind the resulting Code object to a mutex that is locked while the code is being evaluated and unlocked upon completion or interruption. The tricky part here is the interruption, because the only thing worse than sequentially computing something very time-consuming that should be parallel is crashing M2 while running it in parallel!
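
In terms of the low-level class proposed here, f = x -> threadLock ( n += x ) would roughly desugar to the following sketch, using try to release the lock on error (nMutex is a hypothetical name, and try does not handle interrupts -- which is exactly the hard part):

n = 0
nMutex = new Mutex
f = x -> (
    lock nMutex;
    try (n += x; unlock nMutex) else (
        unlock nMutex;                  -- release the lock even on error
        error "error while updating n"))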

@d-torrance (Member, Author)

Ooh, that's a cool idea! The current proposed implementation definitely has its downsides -- e.g., lock m; lock m will create an uninterruptible loop -- and a higher-level threadLock would help avoid that.

Converting to a draft for now.

d-torrance marked this pull request as draft on April 19, 2025
@d-torrance (Member, Author)

Closing this -- I have a working draft of a threadLock keyword. I still need to work out a few kinks (e.g., what happens when we cancel a task that currently holds the lock?), but I should have a PR soon.

@mahrud (Member) commented Jun 26, 2025

I frequently wish I could write blah = M -> threadLock M.cache.blah ??= (...) in such a way that, if blah M is called by many threads at the same time, only one thread runs the computation and caches the value, while the rest instantly return the cached value.
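
With the Mutex class from this PR, one could approximate this today with a double-checked caching sketch along these lines (blahMutex and expensiveComputation are hypothetical):

blahMutex = new Mutex
blah = M -> (
    if M.cache#?"blah" then M.cache#"blah"       -- fast path: value already cached
    else (
        lock blahMutex;
        r := if M.cache#?"blah"                  -- another thread may have won the race
            then M.cache#"blah"
            else M.cache#"blah" = expensiveComputation M;
        unlock blahMutex;
        r))

The second check under the lock is what guarantees the computation runs only once, while the unlocked first check keeps the common cached case cheap.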

@mahrud (Member) commented Jun 26, 2025

Actually, I think having top-level mutexes might be a pretty good solution for #3895 ...

d-torrance reopened this on Jul 1, 2025