9 Pointers with Restricted Aliasing
9.1 Introduction
The main benefit of the regions described thus far is also their
drawback: to free data you must free an entire region. This implies
that to amortize the cost of creating a region, one needs to allocate
into it many times. Furthermore, the objects allocated in a region
should be mostly in use until the region is freed, or else memory will
be wasted in the region that is unused by the program.
In an attempt to alleviate each of these problems Cyclone provides a
mechanism by which individual objects in a region may be freed prior
to freeing the entire region. Pointers to objects that may be freed
early must obey an aliasing discipline to prevent dangling pointers. A
static analysis ensures that such objects can only ever be accessed
through one pointer at any time. At the time it is freed, this pointer
is invalidated, thus preventing all future accesses to the object.
Pointer types in Cyclone can be qualified by their aliasability. As of
now, there are four alias qualifiers. Each of these qualifiers must
appear as arguments to the @aqual(...) pointer qualifier
mentioned in section 3. The four alias qualifiers are:
-
Aliasable
- Pointer types are by default qualified as ALIASABLE. These may be freely aliased, but can never be freed. For
instance, int @@aqual(ALIASABLE) is an aliasable non-null
pointer to an int.
- Unique
- The UNIQUE qualifier on a pointer allows the
object pointed to by that pointer to be deallocated individually,
using the function rufree. For freeing objects to be safe,
all accesses to such objects must be made through a single UNIQUE pointer. That is, only a single pointer may be used to
access the object at any given time; this trivially guarantees that
if the object is freed through its unique pointer, no other access
to the object beyond that point is possible. Objects that become
unreachable but are not freed manually will be freed by the garbage
collector (assuming it's not removed with -nogc). For
instance, int ?@aqual(UNIQUE) is a unique fat pointer to an
int.
- Reference-counted
- The REFCNT qualifier reference-counted
qualifier also permits freeing individual objects. Unlike the UNIQUE qualifier, multiple pointers to a single object are
permitted, the number of which is tracked dynamically via a hidden
reference count stored with the object. Additional pointers to an
object are created explicitly via a call to alias_refptr,
which increases the reference count. Individual pointers are
removed via a call to rdrop_refptr; when the last pointer
is removed (i.e. the reference count is 0), the object is freed.
Like the unique region, objects that become unreachable will be
freed by the garbage collector. For instance, int
*@aqual(REFCNT) is a nullable pointer to a reference counted int.
- Restricted
- All the alias qualifiers are arranged in a subtyping
hierarchy. The RESTRICTED qualifier is a super-type of all the
other qualifiers. A RESTRICTED pointer may not be freed nor
can any aliases of the pointer be created. For instance, int
*@aqual(RESTRICTED) is a restricted nullable pointer to an int. The subtyping hierarchy is as below:
RESTRICTED
/ | \
/ | \
/ | \
ALIASABLE UNIQUE REFCNT
9.2 Allocation and Freeing
Unique and reference-counted pointers can be allocated in any
region. However, to support freeing individual objects from a region
we use a special allocator (see Section 9.10) that maintains
additional metadata. We use the term ``reap'' to refer to regions from
which individual objects can be deleted. A reap is created in a manner
similar to normal regions. Unique and reference-counted pointers that
refer to objects in the heap can also be freed. For a lexical
reap we use the following form:
reap identifier; statement
As with lexical regions, this declares and allocates a new lexical
reap, named identifier, and executes statement.
After statement finishes executing, the reap is
deallocated. Within statement, objects can be added to the
region. As we will explain below, these objects may also be deleted
individually prior to the deallocation of the entire reap.
The allocation mechanism for unique and reference-counted pointers is
similar to that used for region allocation described in the previous
section. The allocation functions rely on handles for alias qualifiers
which are of type aqual_t<`q::Q>. Here `q is a type
variable of alias qualifier kind and may be instantiated with the
alias qualifiers above. The Cyclone libraries provide the following
global variables:
aqual_t<ALIASABLE> Core::aliasable_qual
aqual_t<UNIQUE> Core::unique_qual}
aqual_t<REFCNT> Core::refcnt_qual}
which are handles for the aliasable, unique and reference-counted
qualifiers respectively. Note that there is no handle for the
restricted qualifier.
The following expressions are used for allocation
-
qnew(aq-identifier) expr
- qnew(aq-identifier) array-initializer
The qnew functions allocate in the `H region, i.e. the
heap. They expect a alias qualifier handle. For instance, to allocate
a unique integer in the heap, one might use
int @@aqual(UNIQUE) a = qnew(Core::unique_qual) 0;
- rnew(rgn-identifier, aq-identifier)expr
- rnew(rgn-identifier, aq-identifier)array-initializer
To allocate an alias-free object in a region other than the heap the
rnew functions can be used. For instance,
{
reap<`r> r;
int @@aqual(UNIQUE) `r a = rnew(r, Core::unique_qual) 0;
}
If aq-identifier is not specified it defaults to the ALIASABLE
handle.
- qmalloc(aq-identifier,sizeof(type))
- qcalloc(aq-identifier,n,sizeof(type))
Similar to qnew these functions allocate in the heap.
- rmalloc(rgn-identifier, aq-identifier,sizeof(type))
- rcalloc(rgn-identifier,
aq-identifier,n,sizeof(type))
Similar to rnew these functions can be used to allocate alias
restricted pointers in the specified region. If aq-identifier is
omitted, it defaults to ALIASABLE.
Since there is no handle for the restricted qualifier, RESTRICTED qualified pointers are purely abstract. As with normal
region allocation, we use very simple type inference to ease the
burden of writing qualifiers on types. For instance,
int *a = rnew(r, Core::unique_qual) 0;
suffices to allocate a unique integer in the region r.
The Cyclone library provides the following functions to free
alias-free pointers.
-
Core::ufree(ptr)
To free a heap directed unique pointer ufree is used.
- Core::rufree(rgn-identifier, ptr) To free a
unique pointer from a reap rufree is used.
- Core::drop_refptr(ptr)
To reduce the reference count on a heap resident reference counted
object drop_refptr is used. The ptr can no longer be
dereferenced following this call. If the reference count on the object
reaches zero then the object is freed automatically.
- Core::rdrop_refptr(rgn-identifier, ptr) To reduce
the reference count on a reference counted object in an arbitrary
region rdrop_refptr is used.
9.3 Unique Pointers
To ease programming with unique pointers and allow reuse of library
code, unique pointers can be aliased temporarily within a designated
lexical scope using a special alias pattern. If this kind of
aliasing is not sufficient, users can choose to allocate
reference-counted objects; this idea is explained in the next
subsection. We also define syntax a :=: b to
allow two unique pointers a and b to be atomically swapped.
Careful use of the swap operator allows us to store unique pointers in
objects that are not themselves uniquely pointed to. We also
introduce bounded polymorphism over alias qualifiers and add an
additional kind to specify type variables over these
qualifiers. In practice, all of these mechanisms are necessary for
writing useful and reusable code.
9.3.1 Simple Unique Pointers
Having a unique pointer ensures the object pointed to is not reachable by
any other means. When pointers are first allocated, e.g. using
malloc, they are unique. Such pointers are allowed to be
read through (that is, dereferenced or indexed) but not copied, as
the following example shows:
char @fat @aqual(UNIQUE) buf = malloc(3*sizeof(char));
buf[0] = 'h';
buf[1] = 'i';
buf[2] = '\0';
...
ufree(buf);
This piece of code allocates a unique buffer, and then indexes that buffer
three times to initialize it. Because the process of storing to the
buffer does not copy its unique pointer, it can be safely freed.
When a unique pointer is copied, e.g. when passed as a parameter to a
function or stored in a datastructure, we say it has been consumed.
We ensure that consumed pointers are not read through or copied via a
dataflow analysis. When a consumed pointer is assigned to, very often it
can be unconsumed, making it accessible again. Here is a simple
example that initializes a datastructure with unique pointers:
1 struct pair { int *@aqual(UNIQUE) x; int *@aqual(UNIQUE) y; } p;
2 int *@aqual(UNIQUE) x = new 1; // initializes x
3 p.x = x; // consumes x
4 x = new 2; // unconsumes x
5 p.y = x; // consumes x
If an attempt was made to read through or copy x between lines 3
and 4 or after line 5, the flow analysis would reject the code, as in
int *@aqual(UNIQUE) x = new 1; // initializes x
p.x = x; // consumes x
p.y = x; // rejected! x has been consumed already
When a multi-word data structure is assigned to another one, all of the
unique pointers it contains are consumed. For example:
1 struct pair { int *@aqual(UNIQUE) x; int *@aqual(UNIQUE) y; } p, q;
2 p.x = new 1; p.y = new 2;
3 q = p; // consumes p.x and p.y
By default, when a unique pointer is passed to a function, we expect
that the function will not free the pointer, store it in a data
structure, or otherwise make it unavailable to the caller. If a
function violates any of thse assumptions its type must be augmented
with the attribute consume, which indicates that a particular
argument is no longer available to the caller after the call. For
example:
void foo(int *@aqual(UNIQUE) x) __attribute__((consume(1))) {
ufree(x);
}
void bar() {
int *@aqual(UNIQUE) x = malloc(sizeof(int));
foo(x);
*x;//<--- this dereference is not allowed
}
Here, the consume(1) attribute in the definition of
foo indicates that the first argument is consumed within the
function body.
Note that if you fail to free a unique pointer, it will eventually be
garbage collected.
Unique pointers have some restrictions. In particular:
-
No pointer arithmetic is allowed on unique pointers. This ensures
that all unique pointers point to the beginning of the object, so that the
allocator is not confused when a pointer is passed to ufree.
- Take the address of a unique pointer is not allowed. This is because
doing so effectively creates an alias to the original pointer that cannot
be easily tracked statically.
- Unique pointers cannot be stored within datatypes (though they can be
stored in tagged unions). This is a shortcoming of the current flow
analysis.
9.3.2 Nested Unique Pointers
Directly reading a unique pointer is only allowed along a unique
path. A unique path u has the form
u ::= x | u.m | u->m | *u
where x is a local variable, and u is a unique pointer. To
appreciate the unique-path restriction, consider this incorrect code:
int f(int *@aqual(UNIQUE) *`r x) {
int *@aqual(UNIQUE) *`r y = x; //x and y are aliases
int *@aqual(UNIQUE) z = *y;
ufree(z);
return **x; //accesses deallocated storage!
}
Here, x is a pointer into a conventional region `r and
thus its value can be freely copied to y. We then extract a unique
pointer pointed to by y and free it. Then we attempt to access the
deallocated storage through x.
If a unique pointer is not accessible via a unique path, it must be
swapped out atomically to be used; in Cyclone this is expressed with
syntax :=:. In particular, the code a :=:b
will swap the contents of a and b. We can use this to swap out
a nested unique pointer, and replace it with a different one; we will often
swap in NULL, since this is a unique pointer that is always unconsumed. For
example, in the code below, we define a queue type for queues that contain
unique pointers, and a function take for removing the first element
from the queue.
struct Queue<`a,`r> {
list_t<`a *@aqual(UNIQUE),`r> front;
list_t<`a *@aqual(UNIQUE),`r> rear;
};
typedef struct Queue<`a,`r> *`r queue_t<`a,`r>;
`a *@aqual(UNIQUE) take(queue_t<`a> q) {
if (q->front == NULL)
throw &Empty_val; // exception: def not shown
else {
let elem = NULL;
elem :=: q->front->hd;
q->front = q->front->tl;
if (q->front == NULL) q->rear = NULL;
return elem;
}
}
Here, in order to extract the element stored in the queue (the hd
portion of the underlying list), we need to use swap, because
q->front is a non-unique pointer, and therefore
q->front->hd is not a unique path.
Note that this code is not as polymorphic as it could be. In particular,
the above queue definition requires its elements to be nullable unique
pointers, when they could just as easily be non-unique pointers, or even
reference-counted pointers (illustrated later), and the code for
take would still work. This problem can be addressed, and its
solution is described in Section 9.5.
9.3.3 Pattern Matching on Unique Pointers
As described in Section 5, Cyclone supports pattern
matching on structured data with let declarations and
switch statements. Unique pointers, or structures containing
unique pointers, can be matched against, while still ensuring that only one
legal pointer to a unique object exists at any given time.
In the simplest case, when a unique pointer to a structure is matched
against, the matching operation is treated just like a dereference.
Therefore, the pointer itself is not consumed. For example:
struct pair { int x; int y; };
void foo() {
struct pair @@aqual(UNIQUE) p = new pair(1,2);
let &pair{.x=x, .y=y} = p;
ufree(p);
}
Here, we match against the unique pointer p's two fields
x and y. Because we don't make a copy of p, but
rather only of its fields, p is not consumed. Therefore,
p can be safely freed.
Because each of the fields matched against is assigned to the pattern
variables, unique paths through the original pointer are consumed by virtue
of being assigned. At the conclusion of the scope of the pattern, we can
unconsume any location whose pattern variable has not been consumed
or assigned to, as long as the parent pointer has not been consumed or
assigned to. Here's an example:
struct Foo { int *@aqual(UNIQUE) x; int *@aqual(UNIQUE) y; };
void foo(struct Foo *@aqual(UNIQUE) p) {
{ let &Foo{.x=x, .y=y} = p; // consumes p->x and p->y
ufree(x); // consumes x
} // p->y is unconsumed
ufree(p->y); // p->y consumed
ufree(p); // p consumed
}
The initial match against p consumes p->x and
p->y, whose contents are copied to x and y,
respectively. At the conclusion of the block, p->y is unconsumed
because it did not change, whereas p->x is not, because x
was freed within the block.
Note that the following code is illegal:
void foo(struct Foo *`H p) {
let &Foo{.x=x, .y=y} = p; // non-unique path!
...
}
To see why, notice that this is equivalent to
void foo(struct Foo *`H p) {
let x = p->x;
let y = p->y;
...
}
This code is illegal because neither p->x nor p->y is a
unique path. We also do not allow * patterns to create aliases of
the original unique pointer, for the same reason we forbid &e
when e is a unique pointer. Unfortunately, this means we don't provide a
way to assign to matched-against fields. However, in the case of the
matched-against struct, we can just do this with regular paths. In the
above example pattern block, we could do p->y = new 1 or something
like that (even within the scope of the pattern).
Matching against tagged unions is essentially like matching against
structures, as just described. Since we do not allow unique pointers to be
stored within datatypes, there is no change to how datatypes are
matched.
9.4 Reference-counted Pointers
Cyclone also supports reference-counted pointers, which are treated
quite similarly to unique pointers. Reference-counted objects may be
allocated in any region. We define the constant
Core::refcnt_qual, having type aqual_t<REFCNT>, for
creating reference-counted pointers. The caveat here is that when you
allocate something in this region, an extra word will be prepended to
your data, which contains the reference count, initialized to 1.
As with unique pointers, no pointer arithmetic is allowed, for similar
reasons: it can occlude where the "head" of the object is, and thus make
it impossible to find the hidden reference count. The reference count
can be accessed via the routine Core::refptr_count:
int refptr_count(`a::TA ?@aqual(REFCNT) ptr);
The constant NULL is allowed to have type
`a::A ?@aqual(REFCNT)
, and its reference count is always 0.
Like unique pointers, implicit aliasing is not allowed. Aliases are
created explicitly using the routine Core::alias_refptr:
`a ?@aqual(REFCNT) alias_refptr(`a::TA ?@aqual(REFCNT) ptr);
This routine returns a copy of its argument, which is itself not consumed.
Furthermore, the reference count will be incremented by one. Reference
counts are reduced explicitly by the routine drop_refptr:
void drop_refptr(`a::TA ?@aqual(REFCNT) ptr) __attribute((consume(1)));
In the case that the provided object's reference count is 1 (and is thus
dropped to zero), the provided pointer is freed. The flow analysis will
consume the passed pointer (as is always the case for function arguments),
so you won't be able to use it afterwards. Just like unique pointers, you
can ``forget'' reference-counted pointers without decrementing the count;
this just means you'll never be able to free the pointer explicitly, but the
GC will get it once it becomes unreachable.
Just like unique pointers, reference-counted pointers can be stored in
normal, aliasable datastructures, and accessed using swap (e.g. x
:=: y). Because NULL is a `a::TA ?@aqual(REFCNT)
pointer, we can always
cheaply construct a pointer to swap in.
A good example of the use of unique pointers and reference-counted
pointers is in the Cyclone distribution's tests
directory---the file streambuff.cyc. This is an
implementation of a packet manipulation library with a representation
for packets (called streambuff_t's) that is similar to
Linux's skbuff_t's. It uses a combination of unique header
structures and reference-counted data structures.
9.5 Qualifier Polymorphism
To allow the writing of reusable code we support both subtyping and
bounded polymorphism over alias qualifiers. Type variables that range
over the set of alias qualifiers are of kind Q are used in
addition to the other kinds.
The list data structure in the Cyclone libraries illustrates many
features of qualifier polymorphism. It has the following declaration:
struct List<`a::B,`r::R,`q::Q>{
: RESTRICTED >= aquals(`a), RESTRICTED >= `q
`a hd;
struct List<`a,`r,`q> *@aqual(`q) `r tl;
};
typedef struct List<`a,`r,`q> *@aqual(`q) `r list_t<`a,`r,`q>;
Here, the structure is parameterized by three type variables. The
first, of boxed-kind, admits instantiation by pointer types. Since
pointer types may be qualified according to their aliasability, we
require a bound on this aliasability. For this purpose, we use the
construct aquals(`a). Similar to the regions(`a)
construct, aquals(`a) evaluates to the top-level alias qualifier
of the type that instantiates the variable `a. For instance,aquals(int *@aqual(ALIASABLE) *@aqual(UNIQUE)) = UNIQUE. The bound in
the list declaration RESTRICTED >= aquals(`a) states that `a can be instantiated with a boxed kind with an aliasability that is
a subtype of RESTRICTED. Since RESTRICTED is at the top of
the subtyping hierarchy, this is the most general bound on the
aliasability of the type.
The second type variable is of region kind. These types may not be
qualified by aliasability and thus do not appear in the bounds at
all.
Finally, we have the type variable `q::Q, of alias qualifier
kind. Thus it can be instantiated with any type that reduces to one of
the alias qualifiers. That is, one of ALIASABLE, UNIQUE, REFCNT,
RESTRICTED or aquals(`a). The bound on `q also is the
most general bound.
A list that uses unique pointers on the ``spine'' with reference
counted elements might be constructed as follows:
int *rc_int = qnew(Core::refcnt_qual) 0;
int *rc_int2 = qnew(Core::refcnt_qual) 1;
list_t<int*@aqual(REFCNT),Core::heap_region,Core::unique_qual> l =
rnew(Core::heap_region, Core::unique_qual)
List{rc_int,
rnew(Core::heap_region, Core::unique_qual)
List{rc_int2, NULL}};
We can also quantify over alias qualifiers in function types. For
instance, a function that copies a list can be defined as follows.
list_t<`a,`r,`q> rqcopy(region_t<`r> r,aqual_t<`q> q,
list_t<`a,`r2,`p> l
: RESTRICTED >= `q,
ALIASABLE >= aquals(`a),
RESTRICTED >= `p) {
if(l == NULL)
return NULL;
_ tl = NULL;
tl :=: l->tl;
list_t<`a,`r,`q> result = rnew(r,q) List{l->hd, rqcopy(r,q,tl)};
l->tl :=: tl;
return result;
}
This function copies a list allocated in a region r2 into a
region r. As previously, the function is polymorphic in both the
source and destination regions. Furthermore, the aliasability of the
new list is specified by the handle q which has type aqual_t<`q>. As previously, since `q is of alias qualifier
kind, a bound can be specified for it -- this appears after the
argument list along with any effects for this function. Note that the
list that is being copied may also have a spine that is RESTRICTED as specified by the RESTRICTED >= `p bound. This
bound requires that we use the swap operator (:=:) to ensure
that no aliases are manufactured. Finally, since we are copying the
list, the elements of the list itself must be aliasable -- this is
specified by the ALIASABLE >= aquals(`a) bound.
9.6 Aliasing Unique Pointers
Programmers often write code that aliases values temporarily, e.g. by
storing them in loop iterator variables or by passing them to functions.
Such reasonable uses would be severely hampered by ``no alias'' restrictions
on unique pointers. To address this problem, we introduce a special
alias pattern variable that permits temporary aliasing of a unique
pointer. Here is a simple example:
char *@fat@aqual(UNIQUE) dst, *@fat@aqual(UNIQUE) src = ...
{ let alias <`r>char *@fat`r x = src; // consumes src
memcpy(dst,x,numelts(x)); }
// src unconsumed
...
ufree(src);
In general, an alias pattern has form alias <`r>t x, where
`r is a fresh region variable, and t is the type of x,
which may mention `r. The alias pattern introduces a
region `r, copies src to x which is treated as
having the designated type char *@fat`r. Because `r is
non-unique, x can be freely aliased. As such, we can pass
x to the memcpy function. The matching operation consumes
src during the block, and unconsumes it upon exit, so that
src can be ultimately freed.
Alias pattern variables are similar to regular pattern variables. Like
regular pattern variables, the matched-against expression (i.e. src
in the above example) must be a unique path, and is consumed as a result of
the match. As well, this expression can be unconsumed at the conclusion of
the surrounding block as long as it hasn't been overwritten. However, in
the case of regular pattern variables, unconsumption also requires that the
pattern variable itself (i.e. x in the above example) hasn't
changed within the block, while this requirement is unnecessary for alias
patterns.
Intuitively, alias pattern variables are sound because we cast a unique
pointer to instead point into a fresh region, for which there is no
possibility of either creating new values or storing existing values into
escaping data structures. As such we cannot create aliases that persist
beyond the surrounding scope. However, we must take care when aliasing data
having recursive type. For example, the following code is unsound:
void foo(list_t<`a, `r1, UNIQUE> l) {
let alias <`r> x = (list_t<`a, `r1, UNIQUE>)l;
x->tl = x; // unsound: creates alias!
}
In this case, the alias effectively created many values in the
fresh region `r: one for each element of the list. This allows
storing an alias in an element reachable from the original expression
l, so that when the block is exited, this alias escapes.
For improved programmer convenience, the Cyclone typechecker
optimistically inserts alias blocks around function-call
arguments that are unique pointers when the formal-parameter type is
polymorphic in the pointer's region. If this modified call does not
type-check, we remove the inserted alias. For example, the
alias pattern in the foo function above could be
inferred, so we could instead write:
int foo() {
list_t<int,`H,UNIQUE> l = new List(1,new List(2,NULL));
return length(l);
}
Right now, alias inference in Cyclone is fairly primitive,
but could be extended to more contexts. We plan to improve this
feature in future releases.
9.7 Dynamic Regions
Dynamic regions combine reference-counted or unique pointers and lexical
regions together to essentially create reference-counted or unique
regions; that is, the region is completely first class, and can be
created or freed at conceptually any program point. This is done by
representing a dynamic region as a unique (or reference-counted) pointer to
an abstract struct DynamicRegion (which internally just contains
the handle to a lexical region). The unique (or ref-counted) pointer is
called the key. The key serves as a run-time capability that
grants access to the region. At run-time, a key can be presented to a
special open primitive, described later, that grants lexical access
to the region.
The operation new_ukey() creates a fresh dynamic region and
returns a unique key for the region; new_rckey() creates a
fresh dynamic region and returns a ref-counted key for the region.
The operations free_ukey() and free_rckey() are
used to destroy unique and ref-counted keys respectively. The
free_ukey() operation reclaims the key's region, as well as
the storage for the key. The free_rckey() operation
decrements the reference count, and if it's zero, reclaims the key's
region as well as the storage for the key. Because ref-counted keys
are pointers, you can use alias_refptr to make a copy of a
ref-counted key. (Obviously, you can't make a copy of a unique key.)
By the same token, you can pass a ref-counted key to
drop_refptr (and you can pass a unique key to
ufree), but doing so won't actually deallocate the region,
but rather only the key. To obtain a dynamic reap (i.e. a
dynamic region that supports the deletion of individual objects) use
new_reap_ukey and new_reap_rckey instead.
Given a key k, a user can access the contents of its region by temporarily
`opening the region' within a lexical scope. This is done with the syntax
region
r = open
k. That is, within the remainder of the
current scope, the region handle r can be used to access k's region.
The key k is temporarily consumed throughout the scope, and then
unconsumed at its conclusion. This prevents you from opening up the dynamic
region, and then freeing it while it's still in use. Note that
open is very similar to alias in this way.
Here is a small example of the use of dynamic regions.
int main() {
// Create a new dynamic region
let NewDynamicRegion{<`r> key} = new_ukey();
// At this point, we refer to the region `r to
// specify types, but we cannot actually access
// `r (i.e. it's not in our "static capability,"
// a concept explained later)
list_t<int,`r> x = NULL;
// We can access x by opening the region, which
// temporarily consumes the key
{ region h = open(key);
x = rnew(h) List(3,x);
}
// Now we can access the key again, but not x.
// So we have to open the region to increment
// its contents
{ region h = open(key);
int i = x->hd + 1;
x = rnew (h) List(i,x);
}
// Finally, destroy the key and the region
free_ukey(key);
}
First, we allocate a new unique key and open it up, to reveal the name of
the key's region (`r), and the key itself. Because `r is
now in scope, we can declare a variable x that refers to it.
However, because the key key must be opened before `r
becomes accessible, we cannot actually do anything with x yet (like
dereference it).
Next, we open up the region using key, assigning its handle to the
vairable h. Now, key is inaccessible (consumed) in the
surrounding block, which prevents us from doing anything that might cause it
to be freed while it's in use. We can use h to allocate into
`r, so we allocate a list element and store it in x.
At the conclusion of the block, the region `r becomes
inaccessible again, so once again we cannot dereference x.
However, key can now be accessed again, so we can open it again in
the following block, to add a new list cell to x. At the
conclusion of this block, key is unconsumed once again, so we
legally call free_ukey. This frees the key and the region
`r.
You can "share" a dynamic region key by placing it in some shared data
structure, like a global variable. Of course, you'll then have to swap with
NULL to get it in and out of the shared data structure, as the following
code demonstrates:
struct MyContainer { <`r>
uregion_key_t<`r> key;
list_t<int,`r> data;
} *\U `H global = NULL;
int main() {
// allocate a dynamic region, and create a list
let NewDynamicRegion{<`r> key} = new_ukey();
list_t<int,`r> x = NULL;
{ region h = open(key);
x = rnew(h) List(3,x);
}
// Stick the key and list in a global data
// structure. We've now lost direct access to
// the key and x.
global = new MyContainer{key,x};
// But we can regain it by swapping for the
// container.
struct MyContainer *@aqual(UNIQUE) p = NULL;
global :=: p;
// Now we can use it as above
let MyContainer{<`r2> key2, data2} = *p;
list_t<int,`r2> d = data2;
{ region h = open(key2);
int i = d->hd + 1;
d = rnew (h) List(i,d);
}
}
Here, we define a global variable having type MyContainer, which
consists of a key and some data into that key's region. The main function
allocates a unique as before, and allocates some data into its region. Then
we create a container for that key and data, and store it into the global
variable; this consumes key, making it inaccessible, and
effectively preventing access of x as well.
But we can then get the container back out of the global variable by
swapping its contents with NULL. Then we can open up the container, and use
the key and data as before. This way, a single dynamic region can be used
by many different functions in the program. They simply swap out the global
when they need it, operate on it, and then swap in the result.
One problem with using this technique with unique keys arises when you need
to open the same region multiple times. The problem, of course, is that if
you swap in NULL, then whoever tries to swap it out will fail. In other
words, you can't really do recursive opens with UNIQUE keys. However, you can
do this with REFCNT keys! Swap out the key, make a copy of it, swap it back
in, and use the copy for the open (making sure to destroy the copy after the
open).
One disadvantage of dynamic regions, which is inherited from unique and
reference-counted pointers, is that if you put a key in some shared storage
in a region `r, then it is not the case that when `r is deallocated that the
key will be destroyed automatically. It's up to you to do the right thing
or let the GC eventually collect it. In the long run, the right thing to do
is add a finalizer interface for regions so that we can register a routine
to deallocate a dynamic region whenever we put it in a shared data
structure. The same goes for any unique pointer --- we ought to have a way
to register a finalizer. This is on our To-do list.
9.8 Defaults and Shorthands
As described so far, the notation for alias qualifiers is extremely
verbose. The default qualifier annotations and bounds are intended to
capture the most common cases and reduce the burden on the
programmer. Where the defaults do not suffice, shorthand versions
allow explicit types to specified in a more compact manner.
The shorthand notation is as follows:
-
Alias Qualifier Specifier
- The strings
\A, \U, \RC, \T
can be used as substitutes for @aqual(ALIASABLE),
@aqual(UNIQUE), @aqual(REFCNT), and @aqual(RESTRICTED)
respectively. For example
void cons(int *\T a, int *\U b)
is equivalent to
void cmp(int *@aqual(RESTRICTED) a, int *@aqual(UNIQUE) b)
- Type Variable Bound
- The bound on a type variable, as shown
previously, generally appears together with the effects in a
function type, or with the outlives relations for an aggregate
type. However, when the strings
\A, \U, \RC, \T
succeed a
type variable they are interpreted as a bound on the type
variable. This bound is only legal within an struct declaration or a
function type; i.e. bounds may not appear within a local variable
declaration, typedefs etc. If the type variable is of alias
qualifier kind (kind Q) then this is interpreted as a qualifier
bound. If the variable is of boxed kind then it is interpreted as an
aquals bound. For instance, the function rqcopy above can be
written as
list_t<`a,`r,`q> rqcopy(region_t<`r> r,aqual_t<`q\T> q,
list_t<`a\A,`r2,`p\T> l);
The same convention applies to aggregate types. For instance, we
could define List as follows
struct List<`a::B\T,`r::R,`q::Q\T>{
`a hd;
struct List<`a,`r,`q> *@aqual(`q) `r tl;
};
- Heap Pointers
- For backward compatibility with previous versions
of Cyclone where unique and reference counted pointers always
pointed to the heap, we support the following shorthand.
T *@aqual(UNIQUE) `H
can be written more compactly as
T *`U
; the type T *@aqual(REFCNT) `H
can be written
as T *`RC
. This usage is deprecated.
The default qualifier bounds are as follows:
-
Function Types
- The aliasability of all parameters and return
types in a function type is ALIASABLE by default. If a formal
parameter is declared as
T *@aqual(\T)
then all subtypes of
T *@aqual(RESTRICTED) may be passed as an argument. For
parametricity T *@aqual(`q\T)
should be used instead.
- Struct Types
- For pointers within a struct the default
aliasability is ALIASABLE. For qualifier variables `q::Q
the default bound is RESTRICTED. The aquals bounds for type
variables is RESTRICTED by default. Thus the following
declarations are equivalent
struct PtrList<`a::A, `q> {
`a *hd;
struct PtrList<`a,`q> *@aqual(`q) tl;
};
struct PtrList<`a::A\T,`q\T> {
`a *@aqual(\A) hd;
struct PtrList<`a,`q> *@aqual(`q) tl;
};
- Type Instantiations
- When a type variable is instantiated with a
pointer type we have to decide the aliasability of pointer. When
instantiated in a function type or an aggregate type the
aliasability is by default ALIASABLE. For instance, the
following are equivalent:
void int_list(list_t<int*> l);
void int_list(list_t<int *@aqual(\A)> l);
struct Wrapper {
list_t<int*> l;
}
struct Wrapper {
list_t<int*@aqual(ALIASABLE)> l;
}
In variable declarations the default is RESTRICTED to allow
for more aggressive unification. For instance
void int_list(list_t<int*@aqual(\T)> l) {
list_t<int*> cp = l;
}
In the local variable declaration the bound defaults to RESTRICTED so that unification with the formal succeeds.
9.9 The Outlives Relation for Reaps
Recall from Section 8.4.3 that the subtyping for region
qualifiers is defined in terms of the lifetimes of the regions. We
say that a region `r outlives a region `s if
the lifetime of `r is strictly greater than the lifetime of
`s. In such a case, it is permissible to use a pointer of
type T *`r where a T *`s is expected.
A consequence of the outlives relation is that a pointer declared to
be of type T *`r may be aliased both by pointers declared to
be of type T *`a, where `a outlives `r, as
well as by pointers declared to be of type T *`b, where
`b is outlived by `r. This relationship is
illustrated by the program in Figure 1, where, at
some point during the execution of the function, a,
b, and c may all be aliases of each other.
int region_subtype(uregion_key_t<`dyn> k, int *`H c) {
region<`dyn> dyn = open(k);
int *`dyn a;
{ region<`r> r;
int *`r b;
if(*c) a = c; //`H outlives `dyn
else a = rmalloc(dyn, 0);
b = a; //`dyn outlives `r
}
}
Figure 1: An illustration of region subtyping.
The outlives relation poses a problem when it is applied to reaps. The
implementation of rufree requires both a region handle and
the pointer to the object to be deallocated to be passed as
arguments. We further require that the object to be deallocated reside
in the region referred to by the handle. The outlives relation allows
this constraint to be violated, as in the example below.
region<`r> r;
int *\qU `H p = rmalloc(heap_region, unique_qual, 0);
rufree(r, p); // BAD!
Since the heap outlives all other regions, the region subtyping rules
permit p to be used in place of a int *\qU `r
pointer. Clearly, the object to be deallocated in the call to
rfree does not reside in the region `r.
There are a number of solutions to this problem. One option is a
dynamic check to ensure that pointer passed to rfree actually
points to a location within the specified region. To avoid the run
time overhead, we opted to refine the type system so as to
statically guarantee that the pointer passed to rfree
points within the appropriate region. This is achieved by limiting the
use of the region subtyping for reaps.
The type of rfree is:
void rufree(region_t<`r>, `a *\qU `r : single(`r)) consume(2);
This type states that rfree expects a region handle and a
pointer of unique aliasability as arguments. The construct
single(`r) specifies that the region `r may
not be outlived by any other region. Finally, the attribute
consume(2) is a post-condition on the function which states
that the pointer passed in the second argument is no longer live,
i.e., the referent has been deallocated.
The heap region is clearly satisfies this constraint:
single(`H) is always true since the heap is not outlived by
any other region. For the other cases, recall that we know which
regions may have individually-deallocatable objects: they were
declared with the form reap<`r> r, new_reap_ukey or
new_reap_rckey.
This declaration introduces an assertion of the form
single(`r) into the type-checking environment, which prevents
the use of the outlived-by relation for any pointer to `r. It
also allows the single constraint in the type of
rfree to be proved.
9.10 Reap Allocator Implementation
To support deallocation of objects within a region the allocator must
maintain some meta-data associated with each object. The current
implementation uses a version of the bget allocator adapted for use
with reaps. When deallocation is rare, bget behaves much like a
simple pointer bumping allocator and we expect performance to be
competitive. However, the metadata does consume two additional header
words for each object allocated.
For normal regions (i.e. those declared as region r, or those
created using new_ukey, or new_rckey) the simple
pointer-bumping allocator is used. Thus, if your application does not
use reaps at all, then the overhead due to bget therefore does not
apply.