@@ -61,36 +61,41 @@ identifiers.
61
61
A statement using ` cilk_spawn ` is the start of a potentially parallel
62
62
region of code.
63
63
64
- The ` cilk_spawn ` keyword should appear at the start of an expression
65
- statement or after the ` = ` sign of an assignment (or ` += ` , ` -= ` ,
66
- etc.).
64
+ The ` cilk_spawn ` keyword should appear before an expression statement,
65
+ before a block statement, after the ` = ` sign of a variable
66
+ initialization, or after the ` = ` of an assignment that is the entire
67
+ body of an expression statement.
67
68
68
69
``` cilkc
69
70
int x = cilk_spawn f(i++);
71
+ x = cilk_spawn f(i++);
70
72
cilk_spawn y[j++] = f(i++);
71
73
cilk_spawn { z[j++] = f(i++); }
72
74
```
73
75
74
- Although the compiler accepts ` cilk_spawn ` before almost any
76
+ A future version of OpenCilk may limit use of ` cilk_spawn ` to
77
+ these four contexts.
78
+
79
+ OpenCilk 2.0 allows other statements, except declarations, to be
80
+ spawned. Although the compiler accepts ` cilk_spawn ` before almost any
75
81
expression, spawns inside of expressions are unlikely to have the
76
- expected semantics. A future version of the language may explicitly
77
- limit ` cilk_spawn ` to the contexts above, at or near the top of the
78
- parse tree of a statement.
82
+ expected semantics.
79
83
80
- A declaration may not begin with ` cilk_spawn ` .
84
+ OpenCilk 2.0 will also accept ` cilk_spawn; ` as a statement with no
85
+ effect.
81
86
82
87
### Sync
83
88
84
89
A sync statement, ` cilk_sync; ` , ends a region of potentially parallel
85
- execution. It takes no arguments.
86
-
87
- [ Find a real example with a conditional sync. Or have some spawns
88
- to be synced. Matteo Frigo's all pairs shortest path code has
89
- conditional sync, says TB.]
90
+ execution. It takes no arguments. It may be conditional and has
91
+ no effect if not executed.
90
92
91
93
``` cilkc
92
- if (time_to_sync)
93
- cilk_sync;
94
+ for (int i = 0; i < n; i++) {
95
+ cilk_spawn f(i);
96
+ if (i % 4 == 3)
97
+ cilk_sync;
98
+ }
94
99
```
95
100
96
101
### Scope
@@ -113,6 +118,8 @@ cilk_scope {
113
118
// x, y, and z are usable here because of the implicit sync
114
119
```
115
120
121
+ The compiler also accepts ` cilk_scope; ` as a statement with no effect.
122
+
116
123
### For
117
124
118
125
A loop written using ` cilk_for ` executes each iteration of its body in
@@ -126,27 +133,36 @@ cilk_for (int i = 0; i < n; ++i)
126
133
```
127
134
128
135
The syntax of a ` cilk_for ` statement is very similar to a C ` for `
129
- statement. It is followed by three expressions, the first of which
130
- may declare variables. Unlike in C all three expressions are
131
- mandatory. Parallel C++ range ` for ` constructs are not supported
132
- in OpenCilk 2.0.
133
-
134
- For the loop to be parallelized, several conditions must be met:
135
-
136
- * The first expression must declare a variable (the "loop variable").
137
- * The second expression must compare the loop variable using one of the
138
- relational operators ` <= ` , ` < ` , ` != ` , ` > ` , and ` >= ` .
139
- * The value compared to must be [ ...]
140
- * The third expression must modify the loop variable using ` ++ ` ,
136
+ statement except that none of the three items in parentheses may be
137
+ omitted. C++ "range for" is not supported with ` cilk_for ` in
138
+ OpenCilk 2.0.
139
+
140
+ The first statement inside parentheses must declare at least one
141
+ variable.
142
+
143
+ While the following constraints are not required by syntax, the compiler
144
+ may not be able to parallelize the loop if they are not satisfied.
145
+
146
+ * The first expression must declare one variable, the _ control variable_ .
147
+ * In C the control variable must be an integer no larger than 64 bits or
148
+ a pointer to a complete type. In C++ it may be any random access iterator.
149
+ Among other things, this implies that the difference between starting and
150
+ ending values must be an integer computable by subtraction or ` operator- ` .
151
+ * The second expression must compare the control variable using one of the
152
+ relational operators ` <= ` , ` < ` , ` != ` , ` > ` , and ` >= ` . The value to which
153
+ it is compared is the _ loop bound_ . (See below for the
154
+ interpretation of this value.)
155
+ * The third expression must modify the control variable using ` ++ ` ,
141
156
` -- ` , ` += ` , or ` -= ` .
142
157
158
+ The compiler will emit a warning if the loop cannot be unrolled,
159
+ eliminated, or parallelized.
160
+
143
161
Because loop iterations may execute out of order there is no way to
144
162
predictably stop the loop in the middle. The ` break ` statement may
145
163
not be used to exit the body of a ` cilk_for ` loop. An exception
146
164
thrown out of a loop body is only guaranteed to terminate the current
147
- iteration. (Also, any later iteration of the same grain; see below.)
148
- The effect on other iterations is unpredictable; they may run to
149
- completion or not run at all.
165
+ iteration.
150
166
151
167
#### Grain size
152
168
@@ -167,38 +183,31 @@ can be manually overridden with a pragma:
167
183
The pragma in the example tells the compiler that each group of 128
168
184
consecutive iterations should be executed as a serial loop. If there
169
185
are 1024 loop iterations in total, there are only 8 parallel tasks.
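
The arithmetic behind the "8 parallel tasks" figure can be sketched in plain C. This is an idealized model, not the runtime's actual splitting algorithm: it assumes groups are cut at multiples of the grain size starting from the first iteration, and the helper names `group_count`, `group_begin`, and `group_end` are hypothetical.

```c
#include <assert.h>

/* Number of serial groups when n iterations are cut into chunks of
 * at most `grainsize` consecutive iterations (idealized model). */
long group_count(long n, long grainsize) {
    assert(n >= 0 && grainsize >= 1);
    return (n + grainsize - 1) / grainsize;  /* ceiling division */
}

/* Index of the first iteration in group g (groups numbered from 0). */
long group_begin(long g, long grainsize) {
    return g * grainsize;
}

/* One past the last iteration in group g; the final group may be short. */
long group_end(long g, long n, long grainsize) {
    long end = (g + 1) * grainsize;
    return end < n ? end : n;
}
```

With `n = 1024` and a grain size of 128 this model gives 8 groups, matching the text. A real scheduler may place group boundaries differently, for example by halving the range recursively.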
170
- There is guaranteed to be no spawn or sync between the iterations
171
- for ` i=0 ` and ` i=1 ` (assuming ` n ` is at least 2, otherwise there
172
- will be no second iteration).
173
186
174
187
In OpenCilk 2.0 the argument to the grain size pragma must be an
175
188
integer constant in the range 1..2<sup>31</sup>-1.
176
189
177
190
Without an explicit grainsize the runtime will choose a value from 1
178
191
to 2048.
179
192
180
- Whether the grain size is static or dynamic, an exception thrown from
181
- the loop body will abort the remainder of the group of iterations.
182
- The scope of ` cilk_sync ` will also include all iterations in the
183
- group.
184
-
185
-
186
193
### Reducers
187
194
188
- A type may be suffixed with ` cilk_reducer ` . Syntactically it appears
189
- where ` * ` may be used to declare a pointer type. The type to the left
190
- of ` cilk_reducer ` is the _ view type_ .
195
+ A type may be suffixed with ` cilk_reducer ` . Syntactically this
196
+ keyword appears where ` * ` may be used to declare a pointer type. The
197
+ type to the left of ` cilk_reducer ` is the _ view type_ .
191
198
192
- Two values appear in parentheses after ` cilk_reducer ` . Both emust be
193
- functions returning ` void ` or pointers to functions returning void.
194
- The first, the _ identity callback_ , takes one argument of type ` void* ` .
195
- The second, the _ reduce callback_ , takes two arguments of type ` void * ` .
199
+ Two values appear in parentheses after ` cilk_reducer ` , separated by a
200
+ comma. Both must be functions returning ` void ` or pointers to
201
+ functions returning void. The first, the _ identity callback_ , takes
202
+ one argument of type ` void * ` . The second, the _ reduce callback_ ,
203
+ takes two arguments of type ` void * ` .
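
As an illustration of the two callback signatures, here is a minimal sketch of callbacks for a sum over `long` in plain C. The names `zero_long`, `add_long`, and `merge_two_views` are hypothetical, and the sketch assumes the reduce callback leaves its result in the left view. With these callbacks a reducer declaration would presumably look like `long cilk_reducer(zero_long, add_long) sum;`.

```c
#include <assert.h>
#include <stddef.h>

/* Identity callback: initialize a fresh view to the identity of addition. */
void zero_long(void *view) {
    assert(view != NULL);
    *(long *)view = 0;
}

/* Reduce callback: fold the right view into the left view.
 * Assumption: the combined result is stored in the left view. */
void add_long(void *left, void *right) {
    assert(left != NULL && right != NULL);
    *(long *)left += *(long *)right;
}

/* Serial stand-in for what a runtime might do with two views:
 * create a second view, let it accumulate some work, then merge. */
long merge_two_views(long left_value, long right_work) {
    long left = left_value;
    long right;
    zero_long(&right);       /* new view starts at the identity */
    right += right_work;     /* work accumulated on the second view */
    add_long(&left, &right); /* merge back into the left view */
    return left;
}
```

Both callbacks receive the views as `void *` and cast to the view type themselves, which is why the view type appears only to the left of `cilk_reducer`.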
196
204
197
205
Two reducer types are the same if their view types are the same and
198
- their callbacks are the same function mentioned by name. Otherwise
199
- two reducer types are different and not compatible. This rule arises
200
- from the impossibility of proving that two different functions are
201
- identical.
206
+ their corresponding callbacks are the same function mentioned by name.
207
+ Otherwise two reducer types are different and not compatible. The
208
+ requirement that the corresponding arguments be manifestly the same
209
+ function is dictated by the impossibility of proving that two
210
+ different expressions are equivalent.
202
211
203
212
```
204
213
extern void identity(void *), reduce(void *, void *);
@@ -209,15 +218,15 @@ identical.
209
218
int cilk_reducer(idp, reduce) type4; // not the same as type3
210
219
```
211
220
212
- In the current version of OpenCilk the callbacks may be omitted in
213
- contexts other than definition of a variable. This behavior may be
214
- removed in a future version of OpenCilk.
221
+ In OpenCilk 2.0 the callbacks may be omitted in contexts other
222
+ than definition of a variable. This behavior may be removed in a
223
+ future version of OpenCilk.
215
224
216
- In the current version of OpenCilk the arguments to ` cilk_reducer `
217
- are evaluated each time a reducer is created. This behavior may
218
- change in a future version of OpenCilk. For compatibility and
219
- predictable behavior the arguments to ` cilk_reducer ` should not
220
- have side effects.
225
+ In the current version of OpenCilk the arguments to ` cilk_reducer ` are
226
+ evaluated each time a reducer is created but not when a reducer is
227
+ accessed. This behavior may change in a future version of OpenCilk.
228
+ For compatibility and predictable behavior the arguments to
229
+ ` cilk_reducer ` should not have side effects.
221
230
222
231
## Execution of an OpenCilk program
223
232
@@ -240,8 +249,8 @@ thread.
240
249
In some cases it is necessary to specify exactly where the spawn point
241
250
is in a spawn statement. All code up to the point of spawning
242
251
executes in series. The code that follows the spawn point is called
243
- the _ continuation_ of the spawn. It potentially executes in parallel
244
- with the following statements of the program .
252
+ the _ continuation_ of the spawn. The spawn potentially executes in
253
+ parallel with the continuation, up to the next sync.
245
254
246
255
Contrary to the syntax, the spawn itself should be considered as
247
256
having a ` void ` value. The return value of spawn is like a promise or
@@ -306,7 +315,7 @@ continuing.
306
315
307
316
##### Explicit sync
308
317
309
- An explicit sync is a statement using ` cilk_sync ` . This form normally
318
+ An explicit sync is the statement ` cilk_sync; ` . This form normally
310
319
has function scope, meaning it waits for all spawns in the same
311
320
function. A sync inside the body of a ` cilk_for ` or ` cilk_scope ` only
312
321
waits for spawns inside the same construct.
@@ -317,8 +326,8 @@ cilk_spawn cilk_scope { cilk_spawn ... }
317
326
...
318
327
cilk_sync;
319
328
```
320
- a ` cilk_sync ` at top level waits for the outer spawn to complete, and
321
- the outer spawn waits for the inner spawn to complete.
329
+ a ` cilk_sync ` at top level waits for the top level spawn to complete, and
330
+ the top level spawn waits for everything spawned inside it to complete.
322
331
323
332
##### Implicit syncs
324
333
@@ -327,24 +336,22 @@ before exit from some scopes:
327
336
328
337
* Before returning from a function, after calculating the value to
329
338
be returned. This sync has function scope.
330
- * On exit from the body of a ` try ` block. This sync has function scope.
331
- [ Is this true? Is it true only if the try block spawns?]
332
339
* On exit from a ` cilk_scope ` statement. This sync has the scope of the
333
340
` cilk_scope ` statement.
334
341
* On exit from the body of a ` cilk_for ` , i.e. once per iteration of the loop.
335
342
This sync has scope equal to the loop body.
336
- If a grain size is specified, the sync affects the entire group of
337
- iterations in which it is executes. [ TODO: Test grain size. ]
338
-
339
- [ What about on entry to a try block? ]
343
+ * Before entering a ` catch ` block. This sync has the same scope as
344
+ the ` try .. catch ` construct as a whole: the smallest enclosing
345
+ function, ` cilk_scope ` , or ` cilk_for ` body. [ No, it applies to
346
+ the try block, which gets its own sync region. Test this. ]
340
347
341
348
When exiting from a block scope, destructors for block scope variables
342
349
are run after the implicit sync.
343
350
344
351
## Differences between C++ and OpenCilk
345
352
346
- In C++ code there are two exceptions to the rule that serial and
347
- parallel programs are the same.
353
+ There are three exceptions to the rule that serial and parallel
354
+ programs are the same.
348
355
349
356
### Exceptions
350
357
@@ -356,32 +363,69 @@ observable if the continuation has side effects.
356
363
When the parent function executes an implicit or explicit ` cilk_sync `
357
364
the runtime checks whether the spawned child threw an exception. If
358
365
it did, any exception thrown by the parent is discarded and the
359
- exception thrown by the child is handled at the sync. The compiler
360
- inserts an implicit sync at the end of a try block if the try contains
361
- a spawn. [ Make sure this is consistent with the wording in the implicit
362
- syncs section.]
366
+ exception thrown by the child is handled as if thrown at the sync.
367
+ The compiler inserts an implicit sync at the end of a try block if the
368
+ try contains a spawn.
363
369
364
370
If an exception is thrown from the body of a ` cilk_for ` statement the
365
371
current loop iteration is aborted, consistent with the semantics of
366
- ` throw ` . If the ` grainsize ` pragma is used, later iterations in the
367
- current grain do not execute. No guarantee is made about which other
368
- loop iterations execute, except that a grain in progress is not
369
- affected by an exception thrown from outside the grain.
372
+ ` throw ` . Other loop iterations may or may not execute, depending on
373
+ scheduling. An exception thrown by one iteration of the loop will not
374
+ prematurely terminate another iteration.
375
+
376
+ If more than one exception reaches a sync the earliest in serial
377
+ order is thrown by the sync. The other exceptions are destructed.
370
378
371
379
### Left hand side side effects
372
380
373
381
When a function call is spawned in OpenCilk and the result is
374
382
assigned, the compiler evaluates the address of the left hand side of
375
383
the assignment before calling the function. This conflicts with
376
- recent versions of C++, which require evaluation of the left hand
377
- side to follow return from the function.
384
+ C++17, which requires evaluation of the left hand side to follow
385
+ return from the function.
378
386
379
387
``` cilkcpp
380
388
extern int global;
381
389
a[global++] = cilk_spawn f(); // f() sees incremented value of global
382
390
```
383
391
384
- [ TB needs to confirm or deny this]
392
+ Occurring before the spawn, the evaluation of the left hand side is in
393
+ series with the continuation of the spawn.
394
+
395
+ ### Loops
396
+
397
+ Parallel for loops are implemented by looping over an integer range.
398
+ This transformation requires that the loop count be known before
399
+ the loop begins execution and that the control variable be calculable
400
+ by adding an integer to the starting value.
401
+
402
+ The observable differences are
403
+
404
+ * The loop bound expression may be executed fewer times, likely only
405
+ once.
406
+
407
+ * In C++, ` operator- ` may be called to subtract the start value from
408
+ the loop bound (if the increment is positive) or the loop bound from
409
+ the initial value (if the increment is negative).
410
+
411
+ * The loop increment expression may not be executed or may be executed
412
+ fewer times than in the serial program.
413
+
414
+ * In C++, ` operator+ ` may be called to add an integer to the starting
415
+ value of the control variable.
416
+
417
+ The program is not guaranteed to call ` operator+ ` or ` operator- ` and
418
+ these operators should not have side effects. If the loop is not
419
+ parallelized it may be executed as written. For example, the compiler
420
+ may decide to unroll the loop instead. If a ` cilk_for ` loop is
421
+ compiled to a serial loop the compiler will emit a warning.
422
+
423
+ The control variable must not wrap around, even if the control
424
+ variable is an unsigned integer with well defined semantics. As
425
+ a consequence of this rule, if the loop condition uses ` != ` the
426
+ difference between start and end must be an exact multiple of the
427
+ increment. This can also be expressed as a requirement that the
428
+ difference between start and end fit in a signed integer.
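
The integer-range transformation described in this section can be sketched in plain C. This is a hedged model of the arithmetic only, not the compiler's implementation; the helper names are hypothetical, and the sketch assumes values small enough that the intermediate sums do not overflow `int64_t`.

```c
#include <assert.h>
#include <stdint.h>

/* Trip count of: for (i = start; i < end; i += step), with step > 0.
 * Computed once, before the loop runs. */
int64_t trip_count_less(int64_t start, int64_t end, int64_t step) {
    assert(step > 0);
    if (start >= end)
        return 0;
    return (end - start + step - 1) / step;  /* ceiling division */
}

/* Trip count of: for (i = start; i != end; i += step).
 * Returns -1 when the loop breaks the no-wrap rule: the difference is
 * not an exact multiple of step, or step moves away from the bound. */
int64_t trip_count_not_equal(int64_t start, int64_t end, int64_t step) {
    assert(step != 0);
    int64_t span = end - start;
    if (span % step != 0 || span / step < 0)
        return -1;
    return span / step;
}

/* Value of the control variable in iteration k (counting from 0),
 * obtained by adding an integer to the starting value. */
int64_t control_value(int64_t start, int64_t step, int64_t k) {
    return start + k * step;
}
```

Because each iteration's control value is computed from `start` and `k`, the increment expression written in the source need not be executed, which is one of the observable differences listed above.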
385
429
386
430
## Races and reducers
387
431