1
0
Fork 0
arangodb/js/node/node_modules/iced-coffee-script/iced.md

710 lines
23 KiB
Markdown

# What Is IcedCoffeeScript?
IcedCoffeeScript (ICS) is a system for handling callbacks in event-based code.
There were two existing implementations, one in [the sfslite library for
C++](https://github.com/maxtaco/sfslite), and another in the [tamejs translator
for JavaScript](https://github.com/maxtaco/tamejs). This extension to
CoffeeScript is a third implementation. The code and translation techniques
are derived from experience with JS, but with some new Coffee-style
flavoring.
This document first presents a "Iced" tutorial (adapted from the JavaScript
version), and then discusses the specifics of the CoffeeScript implementation.
# Installing and Running ICS
ICS is available as an npm package:
npm install -g iced-coffee-script
You can alternatively checkout ICS and install from source:
git clone https://github.com/maxtaco/coffee-script
./bin/cake install
This will give you libraries under `iced-coffee-script` and
the binaries `iced` and `icake`, which are replacements
for `coffee` and `cake` respectively. In almost all cases,
`iced` should serve as a drop-in replacement for `coffee`,
since the ICS language is a superset of CoffeeScript.
For more information about CS and ICS, you can also see
our <a href="http://maxtaco.github.com/coffee-script">brochure page</a>.
# Quick Tutorial and Examples
Here is a simple example that prints "hello" 10 times, with 100ms
delay slots in between:
```coffeescript
# A basic serial loop
for i in [0..10]
await setTimeout(defer(), 100)
console.log "hello"
```
There is one new language addition here, the `await ... ` block (or
expression), and also one new primitive function, `defer`. The two of
them work in concert. A function must "wait" at the close of a
`await` block until all `defer`rals made in that `await` block are
fulfilled. The function `defer` returns a callback, and a callee in
an `await` block can fulfill a deferral by simply calling the callback
it was given. In the code above, there is only one deferral produced
in each iteration of the loop, so after it's fulfilled by `setTimer`
in 100ms, control continues past the `await` block, onto the log line,
and back to the next iteration of the loop. The code looks and feels
like threaded code, but is still in the asynchronous idiom (if you
look at the rewritten code output by the *coffee* compiler).
This next example does the same, while showcasing power of the
`await..` language addition. In the example below, the two timers
are fired in parallel, and only when both have fulfilled their deferrals
(after 100ms), does progress continue...
```coffeescript
for i in [0..10]
await
setTimeout defer(), 100
setTimeout defer(), 10
console.log ("hello");
```
Now for something more useful. Here is a parallel DNS resolver that
will exit as soon as the last of your resolutions completes:
```coffeescript
dns = require("dns");
do_one = (cb, host) ->
await dns.resolve host, "A", defer(err, ip)
msg = if err then "ERROR! #{err}" else "#{host} -> #{ip}"
console.log msg
cb()
do_all = (lst) ->
await
for h in lst
do_one defer(), h
do_all process.argv[2...]
```
You can run this on the command line like so:
iced examples/iced/dns.coffee yahoo.com google.com nytimes.com okcupid.com tinyurl.com
And you will get a response:
yahoo.com -> 72.30.2.43,98.137.149.56,209.191.122.70,67.195.160.76,69.147.125.65
google.com -> 74.125.93.105,74.125.93.99,74.125.93.104,74.125.93.147,74.125.93.106,74.125.93.103
nytimes.com -> 199.239.136.200
okcupid.com -> 66.59.66.6
tinyurl.com -> 195.66.135.140,195.66.135.139
If you want to run these DNS resolutions in serial (rather than
parallel), then the change from above is trivial: just switch the
order of the `await` and `for` statements above:
```coffeescript
do_all = (lst) ->
for h in lst
await
do_one defer(), h
```
### Slightly More Advanced Example
We've shown parallel and serial work flows, what about something in
between? For instance, we might want to make progress in parallel on
our DNS lookups, but not smash the server all at once. A compromise is
windowing, which can be achieved in IcedCoffeeScript conveniently in a
number of different ways. The [2007 academic paper on
tame](http://pdos.csail.mit.edu/~max/docs/tame.pdf) suggests a
technique called a *rendezvous*. A rendezvous is implemented in
CoffeeScript as a pure CS construct (no rewriting involved), which
allows a program to continue as soon as the first deferral is
fulfilled (rather than the last):
```coffeescript
do_all = (lst, windowsz) ->
rv = new iced.Rendezvous
nsent = 0
nrecv = 0
while nrecv < lst.length
if nsent - nrecv < windowsz and nsent < n
do_one rv.id(nsent).defer(), lst[nsent]
nsent++
else
await rv.wait defer evid
console.log "got back lookup nsent=#{evid}"
nrecv++
```
This code maintains two counters: the number of requests sent, and the
number received. It keeps looping until the last lookup is received.
Inside the loop, if there is room in the window and there are more to
send, then send; otherwise, wait and harvest. `Rendezvous.defer`
makes a deferral much like the `defer` primitive, but it can be
labeled with an identifier. This way, the waiter can know which
deferral has fulfilled. In this case we use the variable `nsent` as the
defer ID --- it's the ID of this deferral in launch order. When we
harvest the deferral, `rv.wait` fires its callback with the ID of the
deferral that's harvested.
Note that with windowing, the arrival order might not be the same as
the issue order. In this example, a slower DNS lookup might arrive
after faster ones, even if issued before them.
### Composing Serial And Parallel Patterns
In IcedCoffeeScript, arbitrary composition of serial and parallel control flows is
possible with just normal functional decomposition. Therefore, we
don't allow direct `await` nesting. With inline anonymous CoffeeScript
functions, you can concisely achieve interesting patterns. The code
below launches 10 parallel computations, each of which must complete
two serial actions before finishing:
```coffeescript
f = (n,cb) ->
await
for i in [0..n]
((cb) ->
await setTimeout defer(), 5 * Math.random()
await setTimeout defer(), 4 * Math.random()
cb()
)(defer())
cb()
```
### autocb
Most of the time, an iced function will call its callback and return
at the same time. To get this behavior "for free", you can simply
name this callback `autocb` and it will fire whenever your iced function
returns. For instance, the above example could be equivalently written as:
```coffeescript
f = (n,autocb) ->
await
for i in [0..n]
((autocb) ->
setTimeout defer(), 5 * Math.random()
setTimeout defer(), 4 * Math.random()
)(defer())
```
In the first example, recall, you call `cb()` explicitly. In this
example, because the callback is named `autocb`, it's fired
automatically when the iced function returns.
If your callback needs to fulfill with a value, then you can pass
that value via `return`. Consider the following function, that waits
for a random number of seconds between 0 and 4. After waiting, it
then fulfills its callback `cb` with the amount of time it waited:
```coffeescript
rand_wait = (cb) ->
time = Math.floor Math.random() * 5
if time is 0
cb(0)
return
await setTimeout defer(), time
cb(time) # return here, implicitly.....
```
This function can written equivalently with `autocb` as:
```coffeescript
rand_wait = (autocb) ->
time = Math.floor Math.random() * 5
return 0 if time is 0
await setTimeout defer(), time
return time
```
Implicitly, `return 0;` is mapped by the CoffeeScript compiler to `autocb(0); return`.
## Language Design Considerations
In sum, the iced additions to CoffeeScript consist of three new keywords:
* **await**, marking off a block or a single statement.
* **defer**, which is quite similar to a normal function call, but is compiled specially
to accommodate argument passing.
Finally, `autocb` isn't a bona-fide keyword, but the compiler searches
for it in parameters to CoffeeScript functions, and updates the
behavior of the `Code` block accordingly.
These keywords represent the potential for these iced additions to
break existing CoffeeScript code --- any preexisting use of these
keywords as regular function, variable or class names will cause
headaches.
### Debugging and Stack Traces -- Now Greatly Improved!
An oft-cited problem with async-style programming, with ICS or
hand-rolled, is that stack traces are often incomplete or
incomprehensible. If an exception is caught in a Iced function, the
stack trace will only show the "bottom half" of the call stack, or all
of those functions that are descendents of the main event loop. The
"top half" of the call stack, telling you "who _really_ called this
function," is probably long gone.
ICS has a workaround to this problem. When an iced function is
entered, the runtime will find the first argument to the function that
was output by `defer()`. Such callbacks are annotated to contain the
file, line and function where they were created. They also are
annotated to hold a refernce to `defer()`-generated callback passed to
the function in which they were created. This chaining creates an
implicit stack that can be walked when an exception is thrown.
Consider this example:
```coffeescript
iced.catchExceptions()
foo = (y) ->
await setTimeout defer(), 10
throw new Error "oh no!"
y(10)
bar = (x) ->
await foo defer()
x()
baz = () ->
await bar defer()
baz()
```
The function `iced.catchExceptions` sets the `uncaughtException`
handler in Node to print out the standard callstack, and also the Iced
"callstack", and then to exit. The callback generated by `defer()`
in the function `bar` holds a reference to `x`. Similarly,
the callback generated in `foo` holds a reference to `y`.
Here's what happens when this program is run:
```
Error: oh no!
at Deferrals.continuation (/Users/max/src/coffee-script/prog.iced:24:13)
at Deferrals._call (/Users/max/src/coffee-script/lib/coffee-script/iced.js:86:19)
at Deferrals._fulfill (/Users/max/src/coffee-script/lib/coffee-script/iced.js:97:23)
at Object._onTimeout (/Users/max/src/coffee-script/lib/coffee-script/iced.js:53:18)
at Timer.ontimeout (timers.js:84:39)
Iced 'stack' trace (w/ real line numbers):
at foo (prog.iced:4)
at bar (prog.iced:9)
at baz (prog.iced:13)
```
The first stack trace is the standard Node stacktrace. It is
inscrutable, since it mainly covers node internals, and has line
numbering relative to the translated file (I still haven't fixed this
bug, sorry). The second stack trace is much better. It tells the
sequence of Iced calls the lead to this exception. Line numbers are
relative to the original input file.
The relavant API is as follows:
#### iced.stackWalk cb
Start from the given `cb`, or use the currently active callback
if none was given, and walk up the Iced-generated stack. Return
a list of call site descriptions. You can call this from your
own exception-handling code.
#### iced.catchExceptions()
Tell the runtime to catch uncaught exceptions, and to print
a Iced-aware stack dump as above.
### The Lowdown on defer
The implementation of `defer` is interesting --- it's trying to
emulate ``call by reference'' in languages like C++ or Java. Here is an
example that shows off the four different cases required to make this
happen:
```coffeescript
cb = defer x, obj.field, arr[i], rest...
```
And here is the output from the iced `coffee` compiler:
```javascript
cb = __iced_deferrals.defer({
assign_fn: (function(__slot_1, __slot_2, __slot_3) {
return function() {
x = arguments[0];
__slot_1.field = arguments[1];
__slot_2[__slot_3] = arguments[2];
return rest = __slice.call(arguments, 3);
};
})(obj, arr, i)
});
```
The `__iced_deferrals` object is an internal object of type `Deferrals`
that's collecting all calls to `defer` in the current `await` block.
The one in question should fulfill with 3 or more values. When it does,
it will call into the innermost anonymous function to perform the
appropriate assignments in the original scope. The four cases are:
1. **Simple assignment** --- seen in `x = arguments[0]`. Here, the
`x` variable is in the scope of the original `defer` call.
1. **Object slot assignment** --- seen in `__slot_1.field = arguments[1]`.
Here, the reference `obj` must be captured at the time of the `defer` call,
and `obj.field` is filled in later.
1. **Array cell assignment** --- seen in `__slot_2[__slot_3] = arguments[2]`.
This of course will work on an array or an object. Here, the reference
to the array, and the value of the index must be captured when `defer`
is called, and the cell is assigned later.
1. **Splat assignment** --- seen in `res = __slice.call(arguments,3)`.
This is much like a simple assignment, but allows a ``splat'' meaning
assignment of multiple values at once, accessed as an array.
These specifics are also detailed in the code in the `Defer` class,
file `nodes.coffee`.
### Awaits No Longer Work as Expressions
The following do not work and will generate syntax errors at compile time:
```coffeescript
y = (await foo defer x)
```
```coffeescript
x = if true
await foo defer y
y
else 10
```
```coffescript
my_func 10, (
await foo defer y
y
)
```
That is, you can't treat `await` statements as expressions.
And recursively speaking, you can't treat any blocks that
contain `await` statements as expressions. Previous versions of
IcedCoffeeScript supported this arcane feature, but it was extremely
difficult to implement properly, and unnecessarily obscured the
control flow of iced programs.
## Translation Technique
The IcedCoffeeScript addition uses a similar continuation-passing translation
to *tamejs*, but it's been refined to generate cleaner code, and to translate
only when necessary. Here are the general steps involved:
* **1** Run the standard CoffeeScript lexer, rewriter, and parser, with a
few small additions (for `await` and `defer`), yielding
a standard CoffeeScript-style abstract syntax tree (AST).
* **2** Apply *iced annotations*:
* **2.1** Find all `await` nodes in the AST. Mark these nodes and their
ancestors with an **A** flag.
* **2.2** Find all `for`, `while`, `until`, or `loop` nodes marked with
**A**. Flood them and their descendants with an **L** flag. Stop
flooding when the first loop without an **A** flag is hit.
* **2.3** Find all `continue` or `break` nodes marked with an **L** flag.
Mark them and their descendants with a **P** flag.
* **3** ``Rotate'' all those nodes marked with **A** or **P**:
* **3.1** For each `Block` node _b_ in the `AST` marked **A** or **P**:
* **3.1.1** Find _b_'s first child _c_ marked with **A** or **P**.
* **3.1.2** Cut _b_'s list of expressions after _c_, and move those
expressions on the right of the cut into a new block, called
_d_. This block is _c_'s continuation block and becomes _c_'s
child in the AST. This is the actual ``rotation.''
* **3.1.3** Call the rotation recursively on the child block _d_.
* **3.1.4** Add an additional code to _c_'s body, which is to call the
continuation represented by _d_. For `if` statements this means
calling the continuation in both branches; for `switch`
statements, this means calling the continuation from all
branches; for loops, this means calling `continue` at the end of
the loop body; for blocks, this means just calling the
continuation as the last statement in the block. See
`callContinuation` in `nodes.coffee.`
* **4** Output preamble/boilerplate; for the case of JavaScript output to
browsers, inline the small class `Deferrals` needed during runtime;
for node-based server-side JavaScript, a `require` statement suffices
here. Only do this if the source file has a `defer` statement
in it.
* **5** Compile as normal. The effect of the above is to mutate the original
CoffeeScript AST into another valid CoffeeScript AST. This AST is then
compiled with the normal rules.
Translation Example
------------------
For an example translation, consider the following block of code:
```coffeescript
while x1
f1()
while x2
if y
f2()
continue
f3()
await
f4(defer())
if z
f5()
break
f6()
while x3
f7()
```
* Here is schematic diagram for this AST:
<img src="/media/rotate1.png" width=650 />
* After Step 2.1, nodes in blue are marked with **A**. Recall, Step 2.1 traces
upwards from all `await` blocks.
<img src="/media/rotate2.png" width=650 />
* After Step 2.2, nodes in purple are marked with **L**. Recall, Step 2.2 floods
downwards from any any loops marked with **A**.
<img src="/media/rotate3.png" width=650 />
* After Step 2.3, nodes in yellow are marked with **P**. Recall, Step 2.3
traces upwards from any jumps marked with **L**.
<img src="/media/rotate4.png" width=650 />
* The green nodes are those marked with **A** or **P**. They are "marked"
for rotations in the next step.
<img src="/media/rotate5.png" width=650 />
* In Step 3, rotate all marked nodes AST nodes. This rotation
introduces the new orange `block` nodes in the graph, and attaches
them to pivot nodes as _continuation_ blocks.
<img src="/media/post-rotate.png" width=650 />
* In translated code, the general format of a _pivot_ node is:
```javascript
(function (k) {
// the body
k();
})(function () {
// the continuation block.
}
```
To see how pivots and continuations are output in our example, look
at this portion of the AST, introduced after Step 3:
![detail](/media/detail.png)
Here is the translated output (slightly hand-edited for clarity):
```javascript
(function() {
// await block f4()
(function(k) {
var __deferrals = new iced.Deferrals(k);
f4(__deferrals.defer({}));
__deferrals._fulfill();
})(function() {
// The continuation block, starting at 'if z'
(function(k) {
if (z) {
f5();
(function(k) {
// 'break' throws away the current continuation 'k'
// and just calls _break()
_break();
})(function() {
// A continuation block, after 'break', up to 'f6()'
// This code will never be reached
f6();
return k();
});
} else {
return k();
}
})(function() {
// end of the loop, call _continue() to start at the top
return _continue();
});
});
});
```
## API and Library Documentation
### iced.Rendezvous
The `Rendezvous` is a not a core feature, meaning it's written as a
straight-ahead CoffeeScript library. It's quite useful for more advanced
control flows, so we've included it in the main runtime library.
The `Rendezvous` is similar to a blocking condition variable (or a
"Hoare style monitor") in threaded programming.
#### iced.Rendezvous.id(i,[multi]).defer slots...
Associate a new deferral with the given Rendezvous, whose deferral ID
is `i`, and whose callbacks slots are supplied as `slots`. Those
slots can take the two forms of `defer` return as above. As with
standard `defer`, the return value of the `Rendezvous`'s `defer` is
fed to a function expecting a callback. As soon as that callback
fires (and the deferral is fulfilled), the provided slots will be
filled with the arguments to that callback.
Also, note the optional boolean flag `multi`. By default, a function
generated by `defer` can be called only once, and will generate an
error on subsequent calls. Only with the `multi` flag set to `true`
(and only in the case of a `Rendezvous`), can this restriction be
relaxed.
#### iced.Rendezvous.defer slots...
You don't need to explicitly assign an ID to a deferral generated from a
Rendezvous. If you don't, one will automatically be assigned, in
ascending order starting from `0`.
#### iced.Rendezvous.wait cb
Wait until the next deferral on this rendezvous is fulfilled. When it
is, callback `cb` with the ID of the fulfilled deferral. If an
unclaimed deferral fulfilled before `wait` was called, then `cb` is fired
immediately.
Though `wait` would work with any hand-rolled JS function expecting
a callback, it's meant to work particularly well with *tamejs*'s
`await` function.
#### Example
Here is an example that shows off the different inputs and
outputs of a `Rendezvous`. It does two parallel DNS lookups,
and reports only when the first returns:
```coffeescript
hosts = [ "okcupid.com", "google.com" ];
ips = errs = []
rv = new iced.Rendezvous
for h,i in hosts
dns.resolve hosts[i], rv.id(i).defer errs[i], ips[i]
await rv.wait defer which
console.log "#{hosts[which]} -> #{ips[which]}"
```
### connectors
A *connector* is a function that takes as input
a callback, and outputs another callback. The best example
is a `timeout`, given here:
#### iced.timeout(cb, time, res = [])
Timeout an arbitrary async operation.
Given a callback `cb`, a time to wait `time`, and an array to output a
result `res`, return another callback. This connector will set up a
race between the callback returned to the caller, and the timer that
fires after `time` milliseconds. If the callback returned to the
caller fires first, then fill `res[0] = true;`. If the timer won
(i.e., if there was a timeout), then fill `res[0] = false;`.
In the following example, we timeout a DNS lookup after 100ms:
```coffeescript
{timeout} = require 'icedlib'
info = [];
host = "pirateWarezSite.ru";
await dns.lookup host, timeout(defer(err, ip), 100, info)
if not info[0]
console.log "#{host}: timed out!"
else if (err)
console.log "#{host}: error: #{err}"
else
console.log "#{host} -> #{ip}"
```
### The Pipeliner library
There's another way to do the windowed DNS lookups we saw earlier ---
you can use the control flow library called `Pipeliner`, which
manages the common pattern of having "m calls total, with only
n of them in flight at once, where m > n."
The Pipeliner class is available in the `icedlib` library:
```coffeescript
{Pipeliner} = require 'icedlib'
pipeliner = new Pipeliner w,s
```
Using the pipeliner, we can rewrite our earlier windowed DNS lookups
as follows:
```coffescript
do_all = (lst, windowsz) ->
pipeliner = new Pipeliner windowsz
for x in list
await pipeliner.waitInQueue defer()
do_one pipeliner.defer(), x
await pipeliner.flush defer()
```
The API is as follows:
#### new Pipeliner w, s
Create a new Pipeliner controller, with a window of at most `w` calls
out at once, and waiting `s` seconds before launching each call. The
default values are `w = 10` and `s = 0`.
#### Pipeliner.waitInQueue c
Wait in a queue until there's room in the window to launch a new call.
The callback `c` will be fulfilled when there is room.
#### Pipeliner.defer args...
Create a new `defer`al for this pipeline, and pass it to whatever
function is doing the actual work. When the work completes, fulfill
this `defer`al --- that will update the accounting in the pipeliner
class, allowing queued actions to proceed.
#### Pipeliner.flush c
Wait for the pipeline to clear out. Fulfills the callback `c`
when the last action in the pipeline is done.