Add top-level mutex class #3739
base: development
Is this actually faster than sequential sum? The |
Yeah -- the while loop just waits until each of the tasks is done, but they're still running in parallel. Here's a silly example:
i2 : sleepIdentity = x -> (sleep 1; x)
o2 = sleepIdentity
o2 : FunctionClosure
i3 : elapsedTime sum(1..10, sleepIdentity)
-- 10.0012s elapsed
o3 = 55
i4 : elapsedTime parallelSum(1..10, sleepIdentity, 0)
-- 2.36233s elapsed
o4 = 55 |
Could you try a real example? |
Here's a more interesting example:
i2 : R = QQ[x_0..x_8];
i3 : I = ideal random(R^4, R^{-2});
o3 : Ideal of R
i4 : J = ideal gens R;
o4 : Ideal of R
i5 : elapsedTime sum(gens R, f -> saturate(ideal I_*, f));
-- 3.96602s elapsed
o5 : Ideal of R
i6 : elapsedTime parallelSum(gens R, f -> saturate(ideal I_*, f), ideal 0_R);
-- 2.80849s elapsed
o6 : Ideal of R |
No really, you parallelized computing the entries, but the summing part is even slower:
i1 : parallelSum = (X, f, result) -> (
         mutex := new Mutex;
         T := apply(X, x -> schedule(() -> (
                     y := f x;
                     lock mutex;
                     result += y;
                     unlock mutex)));
         while not all(T, isReady) do null;
         result)
o1 = parallelSum
o1 : FunctionClosure
i2 : R = QQ[x_0..x_8];
i3 : I = ideal random(R^4, R^{-2});
o3 : Ideal of R
i4 : J = ideal gens R;
o4 : Ideal of R
i5 : L = apply(gens R, f -> saturate(ideal I_*, f));
i6 : benchmark "sum L" -- 0.007
o6 = .00718855184482758
o6 : RR (of precision 53)
i7 : benchmark "parallelSum(L, identity, ideal 0_R)"
o7 = .01879672231067965
o7 : RR (of precision 53)
And even in parallelizing the individual saturations, the mutexes and the while loop are hurting more than they're helping. For instance, using the await/async pattern is faster without mutexes:
i35 : elapsedTime apply(10, i -> sum await apply(gens R, async(f -> saturate(ideal I_*, f))));
-- 26.623s elapsed
i36 : elapsedTime apply(10, i -> parallelSum(gens R, f -> saturate(ideal I_*, f), ideal 0_R));
-- 30.9081s elapsed |
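The two patterns compared in the timings above (a running total updated under a mutex versus collecting the results and summing them afterwards) can be sketched outside Macaulay2. The following is a minimal Python illustration, not the PR's implementation: `slow_square`, `sum_with_lock`, and `sum_after_gather` are hypothetical names, and Python's `threading.Lock` only stands in for the proposed `Mutex` class.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def slow_square(x):
    """Hypothetical stand-in for an expensive per-item computation."""
    return x * x


def sum_with_lock(items, f):
    """Each task adds its own result to a shared total under a lock."""
    total = 0
    lock = threading.Lock()

    def task(x):
        nonlocal total
        y = f(x)      # the expensive part runs outside the lock
        with lock:    # only the shared update is serialized
            total += y

    with ThreadPoolExecutor() as pool:
        # list(...) waits for every task and re-raises any task's exception
        list(pool.map(task, items))
    return total


def sum_after_gather(items, f):
    """Tasks only compute; the caller sums the collected results."""
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(f, items))
    return sum(results)
```

Both functions return the same total. The second needs no lock because no task touches shared state, which is the design point made in the comment above.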
Of course the summing part is slower -- there's the added overhead of dealing with the mutexes. I'm not proposing that we add this particular But mutexes would be useful in certain parallel algorithms where there's some mutable shared data structure between the different threads like, say, a queue in a breadth-first search. |
Sure, I'm not opposed to adding mutexes, and in theory it could be very useful in things like |
More specifically, I am worried about what happens when there is an error or code is interrupted while a mutex is locked. Language-wise, too, I wonder if what we want is the low-level pthread_mutex proposed here, or something higher level, like a keyword that prevents one piece of code from being executed by multiple threads and automatically unlocks if the code is interrupted. Here is an example of what I would rather have:
n = 0
f = x -> threadLock ( n += x )
where |
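The behavior asked for above, a critical section that is released automatically when the guarded code fails, resembles scoped locking in other languages. Here is a minimal Python sketch of that idea, assuming Python's `with` statement as an analogue of the suggested `threadLock` keyword; `SharedCounter` and `demo` are hypothetical names and nothing here reflects how Macaulay2 would implement it.

```python
import threading


class SharedCounter:
    """Hypothetical shared counter whose updates are guarded by a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def add(self, x):
        # The with-statement releases the lock on normal exit and on error.
        with self._lock:
            if x < 0:
                raise ValueError("negative increment")
            self.value += x

    def is_locked(self):
        return self._lock.locked()


def demo():
    counter = SharedCounter()
    try:
        counter.add(-1)  # raises inside the critical section
    except ValueError:
        pass
    still_locked = counter.is_locked()  # lock was released despite the error
    threads = [threading.Thread(target=counter.add, args=(i,))
               for i in range(1, 11)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return still_locked, counter.value
```

With explicit `lock`/`unlock` calls, the error in `add(-1)` would have left the lock held and every later caller blocked. Tying the release to the scope avoids that case.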
Ooh, that's a cool idea! The current proposed implementation definitely has its downsides, e.g., Converting to a draft for now. |
Closing this -- I have a working draft of a |
I frequently wish I could write |
Actually, I think having top-level mutexes might be a pretty good solution for #3895 ... |
This week, @LukeOeding inquired on Zulip about the possibility of a parallelSum function. In order to make something like this work, we'd need some kind of mutex object to make sure that multiple threads aren't trying to modify the same thing at the same time.

We're already using a wrapper around pthread_mutex_t in the interpreter, so we export this to top level as a new Mutex class. In addition to the constructor method, there are three methods:

- lock, which blocks until it can lock the mutex
- tryLock, which tries to lock the mutex and raises an error if it can't
- unlock, which unlocks the mutex

For example, here's a possible implementation of parallelSum using this new class:

@jkyang92 - Would you be willing to review this when you get a chance?