Commit 0eed8f5

More updates to language reference
1 parent 3d3fea9 commit 0eed8f5

1 file changed

+126
-82
lines changed

src/doc/reference/opencilk-language-reference.md

@@ -61,36 +61,41 @@ identifiers.
6161
A statement using `cilk_spawn` is the start of a potentially parallel
6262
region of code.
6363

64-
The `cilk_spawn` keyword should appear at the start of an expression
65-
statement or after the `=` sign of an assignment (or `+=`, `-=`,
66-
etc.).
64+
The `cilk_spawn` keyword should appear before an expression statement,
65+
before a block statement, after the `=` sign of a variable
66+
initialization, or after the `=` of an assignment that is the entire
67+
body of an expression statement.
6768

6869
```cilkc
6970
int x = cilk_spawn f(i++);
71+
x = cilk_spawn f(i++);
7072
cilk_spawn y[j++] = f(i++);
7173
cilk_spawn { z[j++] = f(i++); }
7274
```
7375

74-
Although the compiler accepts `cilk_spawn` before almost any
76+
A future version of OpenCilk may limit use of `cilk_spawn` to
77+
these four contexts.
78+
79+
OpenCilk 2.0 allows other statements, except declarations, to be
80+
spawned. Although the compiler accepts `cilk_spawn` before almost any
7581
expression, spawns inside of expressions are unlikely to have the
76-
expected semantics. A future version of the language may explicitly
77-
limit `cilk_spawn` to the contexts above, at or near the top of the
78-
parse tree of a statement.
82+
expected semantics.
7983

80-
A declaration may not begin with `cilk_spawn`.
84+
OpenCilk 2.0 will also accept `cilk_spawn;` as a statement with no
85+
effect.
8186

8287
### Sync
8388

8489
A sync statement, `cilk_sync;`, ends a region of potentially parallel
85-
execution. It takes no arguments.
86-
87-
[Find a real example with a conditional sync. Or have some spawns
88-
to be synced. Matteo Frigo's all pairs shortest path code has
89-
conditional sync, says TB.]
90+
execution. It takes no arguments. It may be conditional and has
91+
no effect if not executed.
9092

9193
```cilkc
92-
if (time_to_sync)
93-
cilk_sync;
94+
for (int i = 0; i < n; i++) {
95+
cilk_spawn f(i);
96+
if (i % 4 == 3)
97+
cilk_sync;
98+
}
9499
```
95100

96101
### Scope
@@ -113,6 +118,8 @@ cilk_scope {
113118
// x, y, and z are usable here because of the implicit sync
114119
```
115120

121+
The compiler also accepts `cilk_scope;` as a statement with no effect.
122+
116123
### For
117124

118125
A loop written using `cilk_for` executes each iteration of its body in
@@ -126,27 +133,36 @@ cilk_for (int i = 0; i < n; ++i)
126133
```
127134

128135
The syntax of a `cilk_for` statement is very similar to a C `for`
129-
statement. It is followed by three expressions, the first of which
130-
may declare variables. Unlike in C all three expressions are
131-
mandatory. Parallel C++ range `for` constructs are not supported
132-
in OpenCilk 2.0.
133-
134-
For the loop to be parallelized, several conditions must be met:
135-
136-
* The first expression must declare a variable (the "loop variable").
137-
* The second expression must compare the loop variable using one of the
138-
relational operators `<=`, `<`, `!=`, `>`, and `>=`.
139-
* The value compared to must be [...]
140-
* The third expression must modify the loop variable using `++`,
136+
statement except that none of the three items in parentheses may be
137+
omitted. C++ "range for" is not supported with `cilk_for` in
138+
OpenCilk 2.0.
139+
140+
The first statement inside parentheses must declare at least one
141+
variable.
142+
143+
While the following constraints are not required by the syntax, the compiler
144+
may not be able to parallelize the loop if they are not satisfied.
145+
146+
* The first expression must declare one variable, the _control variable_.
147+
* In C the control variable must be an integer no larger than 64 bits or
148+
a pointer to a complete type. In C++ it may be any random access iterator.
149+
Among other things, this implies that the difference between starting and
150+
ending values must be an integer computable by subtraction or `operator-`.
151+
* The second expression must compare the control variable using one of the
152+
relational operators `<=`, `<`, `!=`, `>`, and `>=`. The value to which
153+
it is compared is the _loop bound_. (See below for the
154+
interpretation of this value.)
155+
* The third expression must modify the control variable using `++`,
141156
`--`, `+=`, or `-=`.
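
A minimal sketch of a loop that satisfies the constraints listed above: one integer control variable, a `<` comparison against a loop bound, and a `++` increment. The function name `square_all` and the macro `USE_OPENCILK` are hypothetical, introduced only for this illustration. The fallback branch applies the serial elision so the sketch also builds with an ordinary C compiler.

```c
#ifdef USE_OPENCILK            /* hypothetical macro: define it when building with OpenCilk */
#include <cilk/cilk.h>
#else
#define cilk_for for           /* serial fallback: the loop runs as an ordinary for loop */
#endif

/* Writes in[i] * in[i] to out[i] for every i in [0, n).
 * Each iteration touches a distinct element, so iterations do not race. */
void square_all(const int *in, int *out, int n)
{
    /* Control variable i is an int, compared with `<`, modified with `++`. */
    cilk_for (int i = 0; i < n; ++i)
        out[i] = in[i] * in[i];
}
```

Because the body neither uses `break` nor depends on iteration order, the serial and parallel versions should produce the same output array.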
142157

158+
The compiler will emit a warning if the loop cannot be unrolled,
159+
eliminated, or parallelized.
160+
143161
Because loop iterations may execute out of order there is no way to
144162
predictably stop the loop in the middle. The `break` statement may
145163
not be used to exit the body of a `cilk_for` loop. An exception
146164
thrown out of a loop body is only guaranteed to terminate the current
147-
iteration. (Also, any later iteration of the same grain; see below.)
148-
The effect on other iterations is unpredictable; they may run to
149-
completion or not run at all.
165+
iteration.
150166

151167
#### Grain size
152168

@@ -167,38 +183,31 @@ can be manually overridden with a pragma:
167183
The pragma in the example tells the compiler that each group of 128
168184
consecutive iterations should be executed as a serial loop. If there
169185
are 1024 loop iterations in total, there are only 8 parallel tasks.
170-
There is guaranteed to be no spawn or sync between the iterations
171-
for `i=0` and `i=1` (assuming `n` is at least 2, otherwise there
172-
will be no second iteration).
173186

174187
In OpenCilk 2.0 the argument to the grain size pragma must be an
175188
integer constant in the range 1..2<sup>31</sup>-1.
176189

177190
Without an explicit grainsize the runtime will choose a value from 1
178191
to 2048.
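
The relationship between grain size and the number of parallel tasks can be sketched with a small helper. It assumes, beyond what the text states, that a final partial group counts as one more task (ceiling division); the name `grain_groups` is hypothetical.

```c
/* Number of serial groups produced by n iterations with grain size g.
 * Assumption: a trailing group shorter than g still forms its own task.
 * Requires n >= 0 and g >= 1. */
long grain_groups(long n, long g)
{
    return (n + g - 1) / g;
}
```

With 1024 iterations and a grain size of 128 this gives the 8 tasks mentioned above.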
179192

180-
Whether the grain size is static or dynamic, an exception thrown from
181-
the loop body will abort the remainder of the group of iterations.
182-
The scope of `cilk_sync` will also include all iterations in the
183-
group.
184-
185-
186193
### Reducers
187194

188-
A type may be suffixed with `cilk_reducer`. Syntactically it appears
189-
where `*` may be used to declare a pointer type. The type to the left
190-
of `cilk_reducer` is the _view type_.
195+
A type may be suffixed with `cilk_reducer`. Syntactically this
196+
keyword appears where `*` may be used to declare a pointer type. The
197+
type to the left of `cilk_reducer` is the _view type_.
191198

192-
Two values appear in parentheses after `cilk_reducer`. Both emust be
193-
functions returning `void` or pointers to functions returning void.
194-
The first, the _identity callback_, takes one argument of type `void*`.
195-
The second, the _reduce callback_, takes two arguments of type `void *`.
199+
Two values appear in parentheses after `cilk_reducer`, separated by a
200+
comma. Both must be functions returning `void` or pointers to
201+
functions returning `void`. The first, the _identity callback_, takes
202+
one argument of type `void *`. The second, the _reduce callback_,
203+
takes two arguments of type `void *`.
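
As an illustration of the two callback signatures, here is a hedged sketch of an integer sum reducer. The names `zero_int`, `add_int`, and `sum_array` and the macro `USE_OPENCILK` are hypothetical. The sketch assumes the reduce callback folds the right view into the left one. In the fallback branch `cilk_reducer(...)` expands to nothing, so the reducer becomes a plain `int`.

```c
#ifdef USE_OPENCILK                      /* hypothetical macro: define it when building with OpenCilk */
#include <cilk/cilk.h>
#else
#define cilk_for for                     /* serial fallback */
#define cilk_reducer(identity, reduce)   /* serial fallback: an ordinary variable */
#endif

/* Identity callback: one void * argument, sets a fresh view to 0. */
void zero_int(void *view)
{
    *(int *)view = 0;
}

/* Reduce callback: two void * arguments.
 * Assumption: the result is accumulated into the left view. */
void add_int(void *left, void *right)
{
    *(int *)left += *(int *)right;
}

/* Sums a[0..n) using a reducer whose view type is int. */
int sum_array(const int *a, int n)
{
    int cilk_reducer(zero_int, add_int) sum = 0;
    cilk_for (int i = 0; i < n; ++i)
        sum += a[i];
    return sum;
}
```

Both callbacks are named functions without side effects beyond the views they are given, in line with the advice that the arguments to `cilk_reducer` should not have side effects.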
196204

197205
Two reducer types are the same if their view types are the same and
198-
their callbacks are the same function mentioned by name. Otherwise
199-
two reducer types are different and not compatible. This rule arises
200-
from the impossibility of proving that two different functions are
201-
identical.
206+
their corresponding callbacks are the same function mentioned by name.
207+
Otherwise two reducer types are different and not compatible. The
208+
requirement that the corresponding arguments be manifestly the same
209+
function is dictated by the impossibility of proving that two
210+
different expressions are equivalent.
202211

203212
```
204213
extern void identity(void *), reduce(void *, void *);
@@ -209,15 +218,15 @@ identical.
209218
int cilk_reducer(idp, reduce) type4; // not the same as type3
210219
```
211220

212-
In the current version of OpenCilk the callbacks may be omitted in
213-
contexts other than definition of a variable. This behavior may be
214-
removed in a future version of OpenCilk.
221+
In OpenCilk 2.0 the callbacks may be omitted in contexts other
222+
than definition of a variable. This behavior may be removed in a
223+
future version of OpenCilk.
215224

216-
In the current version of OpenCilk the arguments to `cilk_reducer`
217-
are evaluated each time a reducer is created. This behavior may
218-
change in a future version of OpenCilk. For compatibility and
219-
predictable behavior the arguments to `cilk_reducer` should not
220-
have side effects.
225+
In the current version of OpenCilk the arguments to `cilk_reducer` are
226+
evaluated each time a reducer is created but not when a reducer is
227+
accessed. This behavior may change in a future version of OpenCilk.
228+
For compatibility and predictable behavior the arguments to
229+
`cilk_reducer` should not have side effects.
221230

222231
## Execution of an OpenCilk program
223232

@@ -240,8 +249,8 @@ thread.
240249
In some cases it is necessary to specify exactly where the spawn point
241250
is in a spawn statement. All code up to the point of spawning
242251
executes in series. The code that follows the spawn point is called
243-
the _continuation_ of the spawn. It potentially executes in parallel
244-
with the following statements of the program.
252+
the _continuation_ of the spawn. The spawn potentially executes in
253+
parallel with the continuation, up to the next sync.
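
The familiar recursive Fibonacci function is a compact way to see the spawn, its continuation, and the sync that ends the parallel region. This is a sketch, not code from the reference; `USE_OPENCILK` is a hypothetical macro, and the fallback branch erases the keywords so the function also builds as plain C.

```c
#ifdef USE_OPENCILK      /* hypothetical macro: define it when building with OpenCilk */
#include <cilk/cilk.h>
#else
#define cilk_spawn       /* serial fallback: the spawn becomes an ordinary call */
#define cilk_sync        /* serial fallback: the sync becomes an empty statement */
#endif

int fib(int n)
{
    if (n < 2)
        return n;
    int x = cilk_spawn fib(n - 1);  /* spawned call */
    int y = fib(n - 2);             /* continuation: may run in parallel with the spawn */
    cilk_sync;                      /* x is safe to read only after the sync */
    return x + y;
}
```

Reading `x` before the `cilk_sync` would race with the spawned call, which is why the addition is placed after the sync.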
245254

246255
Contrary to the syntax, the spawn itself should be considered as
247256
having a `void` value. The return value of spawn is like a promise or
@@ -306,7 +315,7 @@ continuing.
306315

307316
##### Explicit sync
308317

309-
An explicit sync is a statement using `cilk_sync`. This form normally
318+
An explicit sync is the statement `cilk_sync;`. This form normally
310319
has function scope, meaning it waits for all spawns in the same
311320
function. A sync inside the body of a `cilk_for` or `cilk_scope` only
312321
waits for spawns inside the same construct.
@@ -317,8 +326,8 @@ cilk_spawn cilk_scope { cilk_spawn ... }
317326
...
318327
cilk_sync;
319328
```
320-
a `cilk_sync` at top level waits for the outer spawn to complete, and
321-
the outer spawn waits for the inner spawn to complete.
329+
a `cilk_sync` at top level waits for the top level spawn to complete, and
330+
the top level spawn waits for everything spawned inside it to complete.
322331

323332
##### Implicit syncs
324333

@@ -327,24 +336,22 @@ before exit from some scopes:
327336

328337
* Before returning from a function, after calculating the value to
329338
be returned. This sync has function scope.
330-
* On exit from the body of a `try` block. This sync has function scope.
331-
[Is this true? Is it true only if the try block spawns?]
332339
* On exit from a `cilk_scope` statement. This sync has the scope of the
333340
`cilk_scope` statement.
334341
* On exit from the body of a `cilk_for`, i.e. once per iteration of the loop.
335342
This sync has scope equal to the loop body.
336-
If a grain size is specified, the sync affects the entire group of
337-
iterations in which it is executes. [TODO: Test grain size.]
338-
339-
[What about on entry to a try block?]
343+
* Before entering a `catch` block. This sync has the same scope as
344+
the `try .. catch` construct as a whole: the smallest enclosing
345+
function, `cilk_scope`, or `cilk_for` body. [No, it applies to
346+
the try block, which gets its own sync region. Test this.]
340347

341348
When exiting from a block scope, destructors for block scope variables
342349
are run after the implicit sync.
343350

344351
## Differences between C++ and OpenCilk
345352

346-
In C++ code there are two exceptions to the rule that serial and
347-
parallel programs are the same.
353+
There are three exceptions to the rule that serial and parallel
354+
programs are the same.
348355

349356
### Exceptions
350357

@@ -356,32 +363,69 @@ observable if the continuation has side effects.
356363
When the parent function executes an implicit or explicit `cilk_sync`
357364
the runtime checks whether the spawned child threw an exception. If
358365
it did, any exception thrown by the parent is discarded and the
359-
exception thrown by the child is handled at the sync. The compiler
360-
inserts an implicit sync at the end of a try block if the try contains
361-
a spawn. [Make sure this is consistent with the wording in the implicit
362-
syncs section.]
366+
exception thrown by the child is handled as if thrown at the sync.
367+
The compiler inserts an implicit sync at the end of a try block if the
368+
try contains a spawn.
363369

364370
If an exception is thrown from the body of a `cilk_for` statement the
365371
current loop iteration is aborted, consistent with the semantics of
366-
`throw`. If the `grainsize` pragma is used, later iterations in the
367-
current grain do not execute. No guarantee is made about which other
368-
loop iterations execute, except that a grain in progress is not
369-
affected by an exception thrown from outside the grain.
372+
`throw`. Other loop iterations may or may not execute, depending on
373+
scheduling. An exception thrown by one iteration of the loop will not
374+
prematurely terminate another iteration.
375+
376+
If more than one exception reaches a sync the earliest in serial
377+
order is thrown by the sync. The other exceptions are destructed.
370378

371379
### Left hand side side effects
372380

373381
When a function call is spawned in OpenCilk and the result is
374382
assigned, the compiler evaluates the address of the left hand side of
375383
the assignment before calling the function. This conflicts with
376-
recent versions of C++, which require evaluation of the left hand
377-
side to follow return from the function.
384+
C++17, which requires evaluation of the left hand side to follow
385+
return from the function.
378386

379387
```cilkcpp
380388
extern int global;
381389
a[global++] = cilk_spawn f(); // f() sees incremented value of global
382390
```
383391

384-
[TB needs to confirm or deny this]
392+
Occurring before the spawn, the evaluation of the left hand side is in
393+
series with the continuation of the spawn.
394+
395+
### Loops
396+
397+
Parallel for loops are implemented by looping over an integer range.
398+
This transformation requires that the loop count be known before
399+
the loop begins execution and that the control variable be calculable
400+
by adding an integer to the starting value.
401+
402+
The observable differences are
403+
404+
* The loop bound expression may be executed fewer times, likely only
405+
once.
406+
407+
* In C++, `operator-` may be called to subtract the start value from
408+
the loop bound (if the increment is positive) or the loop bound from
409+
the initial value (if the increment is negative).
410+
411+
* The loop increment expression may not be executed or may be executed
412+
fewer times than in the serial program.
413+
414+
* In C++, `operator+` may be called to add an integer to the starting
415+
value of the control variable.
416+
417+
The program is not guaranteed to call `operator+` or `operator-` and
418+
these operators should not have side effects. If the loop is not
419+
parallelized it may be executed as written. For example, the compiler
420+
may decide to unroll the loop instead. If a `cilk_for` loop is
421+
compiled to a serial loop the compiler will emit a warning.
422+
423+
The control variable must not wrap around, even if the control
424+
variable is an unsigned integer with well defined semantics. As a
425+
consequence of this rule, if the loop condition uses `!=` the
426+
difference between start and end must be an exact multiple of the
427+
increment. This can also be expressed as a requirement that the
428+
difference between start and end fit in a signed integer.
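
The integer-range transformation described in this section can be sketched as two small helpers. This is an illustration only, not the compiler's actual output. It assumes a positive increment and a `<` comparison, and the names `trip_count` and `control_value` are hypothetical.

```c
#include <stdint.h>

/* Loop count for: cilk_for (i = start; i < bound; i += step), with step > 0.
 * Computed once, before the loop begins execution. */
int64_t trip_count(int64_t start, int64_t bound, int64_t step)
{
    if (bound <= start)
        return 0;
    return (bound - start + step - 1) / step;
}

/* Value of the control variable in iteration k (counting from 0),
 * obtained by adding an integer to the starting value. */
int64_t control_value(int64_t start, int64_t step, int64_t k)
{
    return start + k * step;
}
```

For a loop from 0 to 10 in steps of 3 the count is 4 and the control variable takes the values 0, 3, 6, and 9. If the same loop were written with `!=` instead of `<`, the control variable would never equal 10, which is the situation the exact-multiple rule excludes.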
385429

386430
## Races and reducers
387431
