C3»Blog
Christoffer Lernö
Last post about arithmetics I listed the following problems related to overflow and arithmetics:
  1. Overflow traps on unsigned integers makes the mixed signedness case very hard to get right.
  2. Overflow traps cause unexpected and hard-to-spot non commutative and associative behaviour in expressions that otherwise would have been fine.
  3. Wrapping behaviour enables buffer overflows and similar exploits.
  4. In most languages it doesn't matter if the left hand sign can contain the number, what matters are the types on the right hand side, which isn't always intuitive or desired.

Let's look at some attempts to fix these problems, starting with handling mixed signedness.

Mixed signedness



As we saw in the previous post, mixed signedness is not trivial. The choices made by C are actually fairly reasonable if the idea is to make most situations "just work" as the user intended. However whatever the solution, they all have their own set of problems. Looking at these first will allow us to understand ramifications of other choices later on.

1. "Widen and narrow"


This method relies on widening the operands to a common super type. Let us look at our old example:

1
2
3
unsigned x = 10;
int change = -1;
x = x + change;


The idea is that we promote all operands to the biggest type that can contain both operands, in this case it's likely a long long and perform the operation using that type. The trap is then in an implicit cast back to the original size.

The code above would implicitly be turned into:
1
2
3
4
5
unsigned x = 10;
int change = -1;
long long _temp = (long long)x + (long long)change;
if (_temp < 0 || _temp > UINT_MAX) panic();
x = (unsigned)_temp;


This strategy works, with the caveat that we need to always be able to find a wider type in order to do this widening.

In a language like Zig, which allows arbitrarily wide operands, this is not a problem but if there are a fixed number of types – what do we do? In C# the solution is basically to say that this works up to 32 bits, but signed and unsigned 64 bit ints cannot be combined, even though unsigned 64 bit integers should be fairly common.

If we insist that all integer types need to have an unsigned counterpart then we either need to stop at some arbitrary place - like C# does – or provide arbitrarily wide integers. Another solution is to provide a signed "maxint" that has no unsigned counterpart.

Pros:
- Conceptually simple.
- unsigned + signed has a definite type.

Cons:
- Widening may have a performance cost.
- The listed strategies to pick a wider integer (C# / Zig / maxint) all have drawbacks.

2. "Prefer signed"


This method relies on always converting to the signed type, given two equal types:

The code above would implicitly be turned into:

1
2
3
4
5
6
7
unsigned x = 10;
int change = -1;
// Possibly insert trap here:
// if (x > INT_MAX) panic();
int _temp = (int)x + change;
if (_temp < 0) panic();
x = (unsigned)_temp;


If signed addition overflow is a trapping error we need to have the signed int overflow check on x or otherwise it would be possible to add a large unsigned to a positive signed and get a negative result which would be inconsistent. However, if signed overflow is no error then the unsigned check isn't needed.

Unfortunately, with the overflow check we essentially forbid the unsigned value from having the top bit set which both adds add gotcha as well as making it unusable in some cases.

Pros:
- Conceptually simple.
- Definite result type.
- Efficient.

Cons:
- Broken if signed overflow is an error.

3. "Prefer unsigned"


This is what C does. With wrapping unsigned arithmetics this often yields the expected result, but there are counter examples such as:

1
2
3
4
unsigned x = 2;
int y = -100;
unsigned z = 10
if (z < x + y) unexpectedCall();


In the above code, the right hand side becomes a large number as y is converted into a large unsigned number. Division has a similar issue as unsigned and signed division works differently.

If conversion of the signed number instead traps on negative numbers, we're forced to cast, but even that won't help if unsigned overflow is an error.

Pros:
- Fairly simple.
- Efficient.

Cons:
- Does not work if unsigned overflow is an error.
- The unsigned result may give unexpected results when it is used for comparison or division.
- The resulting type might not always be obvious.

4. "Use a function"


This solution instead provides functions for various combinations, it might look like this:

1
2
3
unsigned x = 10;
int change = -1;
x = addSignedToUnsignedWithTrap(x, change);


How the function works doesn't matter to us here, we just know it will trap if the addition didn't work and give us the correct answer otherwise.

Pros:
- Well defined and easy to understand.
- Definite result type.

Cons:
- Clunky.
- Programmers might look for easier but "wrong" solutions.

5. "Require explicit casts"


A common solution is to require explicit casts, but as we see above in the "Use unsigned" and "Use signed" variants this doesn't necessarily work if there are overflows traps. Worse, it might potentially obscure issues:

1
2
3
unsigned x = 10;
int y = -50;
unsigned z = x + (unsigned)y; // No unsigned overflow, so ok?


Here the problem isn't that we're casting a negative value to a unsigned one – this is actually in many cases what we want. It's that we can't determine when the addition underflows and when it doesn't.

Since explicit casts encompass all of the strategies 1-3, we don't need to tie us to a particular (possibly inferior) solution. On the other hand it may take quite a bit of effort and deep understanding of the possible issues to do "the right thing". The whole idea behind making more overflow trapping is to make the default more safe. If that is the intention, then forcing the programmer to pick just the right set of casts to avoid bugs and potential attack vectors seem inconsistent.

Pros:
- Well defined and easy to understand.
- Definite result type.
- Free to pick the optimal strategy.

Cons:
- Does not work well when overflow is an error.
- May implicity remove safeguards.
- Picking the optimal set of casts is not always obvious.

5. "Mixed signedness type"


A more utopian idea is for unsigned/signed mixes to have no definite type, instead being inferred or creating multiple code paths depending on the result. The complexity and unpredictability, in particular in non-trivial examples makes it hard to view it as a reasonable solution:

1
2
3
unsigned x = ...;
int y = ...;
z = (x + y) / (x - y); // Signed or unsigned division here?




The forgotten left hand side



As an example I mentioned that Zig currently uses the binary operand types when resolving arithmetics, so for example this will overflow:

1
2
3
var num1 : i8 = 16;
var num2 : i32 = 3;
var res : i32 = num2 + num1 * num1; // Boom!


It might feel odd that Zig chose this behaviour, but it's not strange. The reason we don't have this is because C makes an implicit widening to int for all operands first. This is why adding two ints and assigning it to a long long in C cannot result in a value larger than INT_MAX, even though the long long might be able to contain several billion times as big.

The only difference between C and Zig is that we run into this problem more often in the latter. Rust also allows i8 * i8, but there is no implicit widening, so it's more obvious as to what's happening.

Left hand side widening


A simple algorithm for widening using the left hand side, given that all operands have the same signedness is the following:
  1. Store the left hand side type, and assign this as the target type.
  2. Depth first, look at the first leaf operand, which is any operand that already has a definite type (e.g. variable, explicit cast, call). If the type is larger than the target type this is a type error. If it is smaller, promote it to the target type.
  3. Repeat on all leaf nodes.
  4. Derive types of the branch nodes.
  5. At this point all branch nodes should have the same type as the target type already (or there was a type error).


This algorithm can be modified to handle implicit conversion of mixed signedness as well.

If implicit widening is used, then it is probably a good idea to widen using the left hand side regardless of whether the overall behaviour is trapping or not.


Reviewing the options


If we go back and look at overflow in general, it's instructive to look at what some other languages do:

C
- Implicit casts between all integer types.
- Implicit widening to int before arithmetics.
- Unsigned arithmetic wraps.
- Signed overflow is undefined behaviour.

Rust
- No implicit casts.
- Trap in debug on unsigned overflow, wrap in release.
- Trap in debug on signed overflow, wrap in release.

Go
- No implicit casts.
- Wrapping unsigned overflow.
- Wrapping signed overflow.

Zig
- Only safe, widening casts.
- Trap in debug on unsigned overflow, UB in release.
- Trap in debug on signed overflow, UB in release.

Swift
- No implicit casts.
- Trap on unsigned overflow.
- Trap on signed overflow.

Downsides


The problems with C's system is well known: the lack of trapping overflow together with its lack of bounds checking enables buffer overflows and other attacks. This is made worse by the undefined behaviour on signed integers. While C compilers often give wrapping semantics in unoptimized builds, optimized builds may assume overflows never happen, giving different behaviour with optimization on. Implicit casts means that potentially harmful truncating casts are extremely difficult to spot. With UB sanitation it's easy to turn such UB into traps, giving similar to checks as Rust or Zig, but only for signed overflow.

On the other hand, C mostly preserves associativity and commutativity, and has few problems working with mixed signedness. The implicit widening means that overflow rarely happen in normal cases. Usually C "just works" – except when it doesn't.

With languages that trap, such as Rust, Swift and Zig we are much more likely to run into trapping behaviour for intermediate values. For Zig in particular this is problematic, since there is implicit widening and it's not obvious that the computation on the right hand side might use a much smaller bit width than the left hand side would indicate. In Rust and Swift at least it's obvious where the widening occurs. On the other hand, it can be argued that requiring explicit widening where it is safe is fairly noisy. If Zig would use the left hand side to widen all operands before calculation, the problem with implicit widening would largely go away.

Both Rust and Zig restricts overflow trapping to debug versions. While fuzz testing and regular testing is likely to use the debug version, they're not protected where it is the most sensitive: in production. Zig's choice of UB is daring – the behaviour of overflow bugs will be hard to predict.

Go with its wrapping behaviour is lacking the tools to check overflow with normal operators, but at the same time this means that like Swift the code is guaranteed to work the same regardless of debug or release mode. We can also safely use casts to use signed and unsigned numbers together, something that isn't easy in the other languages.

A brief summary


Wrapping semantics offers the least protection against exploits, but works well with types of mixed signeness. Both Rust and Zig has some checks, but against undiscovered exploits in the wild they leave the protection off, trading safety for speed. Swift is the most consistent in preventing overflows, but like all the languages except for Go and C works poorly with mixed signedness in operands.

Ideally we would like both good safety and good mixed signedness use, but none of the languages above is able to provide this combination. Also, trapping is not always desired, especially for unsigned integers as this sometimes give unexpected traps when intermediate values underflow even though the expression as a whole is valid. Triggering such traps can in themselves be used to trigger DoS attacks in languages like Swift, while worse exploits are potentially possible in languages with undefined behaviour.

Looking for solutions



Trapping overflow is the last line of defense against exploits. Since the result of a trap often is a panic, the best one can achieve is a crash. In test, this is useful for finding bugs, but when used "in production", we only gain something if the overflow is leading to something worse than crashing. As mentioned above, overflow in unsigned trapping arithmetics might actually be false errors, creating unnecessary possibilities of crashing an application.

In an ideal situation without any constraints, we would prefer to use infinitely ranged integers to perform our calculations, and then only trap if the the result does not fit in the target type. But unfortunate it is not practical to do so in most cases.

One mitigating strategy against intermediate underflow could be to promote all unsigned operands to a signed integer twice the bit width. For example an u32 could be evaluated as i64. This would also allow us to safely mix signed and unsigned arithmetics in a natural way, working like the "Widen and narrow" strategy discussed for mixed arithmetics. As we recall the downside is the need for a signed maximum type and doing arithmetics with a wider type.

With addition and subtraction we can be sure that reordering won't affect the result.

1
2
3
4
5
6
7
unsigned a, b, c, d, e, f;
...
// All the following would then behave
// identical regardless of values of a, b and c.
d = a + b - c;
e = a + (b - c);
f = (a + b) - c;


The actual trap here happens during the implicit narrowing before the assignment.

If we desire trapping in release builds, such can be prohibitive, even when it is just a single "branch on overflow" jump. Here the "As-if Infinitely Ranged Integer" Model offers a way for cheap release build traps. The underlying idea is that overflows do not immediately need to be detected. For example, in d = a + b + c we only need to check the overflow flag before the assignment. In this case the number of checks are cut in half, but even greater savings are possible. The paper reports an approximate slowdown around 6% for a simple implementation.

What if all integers wrapped? There are a non-trivial amount of errors in C that are due to developers erronously assuming wrapping. While wrapping allows under-checked parameters to cause follow up errors, the behaviour is well understood and matches the underlying hardware. To simplify certain need-to-be-safe expressions, without having to use functions it's possible to introduce trapping operators e.g. a ~+ b, or checked expressions e.g. checked(a + b). Operators in particular avoid the problem where unexpected trapping happens for intermediate results.

Adding implicit conversions


Zig and C both have implicit conversions. While C has pervasive implicit conversions, Zig only allows "safe" widenings that preserves all values. Other languages use explicit casts.

If we decide to use a solution that sometimes do implicit widenings, such as the idea of to widen unsigned types to a signed before arithmetics, then it is fairly natural to combine this with implicit widening and even possibly implicit narrowing with tests.

Because the sizes matter for trapping purposes, it is likely better to always require explicit widenings unless all operands are promoted to the same type.

1
2
3
4
char a = 16;
// Surprising if a * a would trap on
// 8 multiplication overflow
int b = a * a;


It is possible however to use the left hand side for implicit operand widening rather than just considering the other operand. Zig could likely gain a lot by adopting such a scheme.

If instead integers wrap, then converting between unsigned and signed is safe, and having implicit widenings are largely safe. Even though wrapping integers allow lossless conversion between signed and unsigned, it will be useful to clearly indicate what sort of types are used in expressions like (a + b) / (c + d) where types are mixed. A (a + (unsigned)b) / (c + (unsigned)d) would clearly show the intent, unlike some solution that implicitly picked signed or unsigned by default.

Summing it up


There are various ways to go about overflow trapping and mixed type arithmetics. In general we can split clearly between "wrap by default" and "trap by default". Even within each subgroup there are a wide range of solutions, each with its own set of advantages and drawbacks. Picking a strategy is by necessity a trade-off, not only in runtime cost but also in what sort of vulnerabilities they remain susceptible to.

Hopefully this article has provided a rough overview of the problem space as well as some novel ideas that can be explored further, such as the maxint promotion.
Christoffer Lernö
The defer statement is going mainstream. Go has it's own special defer which only fires on function end, otherwise defer has consistent "execute at scope end" semantics. Swift, Zig, Jai, Nim and Odin all use defer in this manner.

The problems with implementing defer is similar to implementing destructors for stack allocated objects in C++, although the presence of virtual functions complicates things.

I couldn't find anyone describing how defer is done in other compilers so when working on a version of it for C2 I had to make it up as I went along.

For posterity's sake I thought it might be interesting to do a writeup on how defer was implemented.

Setting the rules



First up there are lots of different possible rules to adapt defer to. The original draft of this article would handle goto across defers. C2 retains goto, and for a long time, so did C3 – so this was important to make the article complete. However goto adds much complexity to defer which made the article both much longer and harder to follow.

For that reason we'll limit ourselves to return, continue, break plus labelled versions of the latter. If there is interest I can go into details on how to add defer for goto in another article.

Handling early exit



The first issue in defer is the early exit:

1
2
3
4
5
6
void test()
{
  defer printf("A");
  if (rand() % 2 == 0) return;
  printf("B");
}


Every return needs to inline the defer at the end, so this is lowered to:


 1
 2
 3
 4
 5
 6
 7
 8
 9
10
void test()
{
  if (rand() % 2 == 0) 
  {
    printf("A");
    return;
  }
  printf("B");
  printf("A");
}


For break and continue this is handled similar to return but only part of the defers may be inlined at the point of the break:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
void test()
{
  defer printf("A");
  while (true)
  {
    defer printf("B");
    {
      defer printf("C");
      if (rand() % 2 == 0) break;
    }
  }
}


And the inlined version:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
void test()
{
  while (true)
  {
    {
      if (rand() % 2 == 0) 
      {
        printf("C");
        printf("B");
        break;
      }
      printf("C");
    }
    printf("B");
  }
  printf("A");
}


We also have the labelled version of break. (I'll stick to the java-style labelled break syntax here, even though C3 has a different variant)

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
void test()
{
  defer printf("A");
  FOO: while (true)
  {
    defer printf("B");
    while (true)
    {
      defer printf("C");
      if (rand() % 2 == 0) break FOO;
    }
  }
}


This is again lowered to:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
void test()
{
  FOO: while (true)
  {
    while (true)
    {
      if (rand() % 2 == 0) 
      {
        printf("C");
        printf("B");
        break FOO;
      }
      printf("C");
    }
    printf("B");
  }
  printf("A");
}


So as we see it's sufficient to keep a list of the defers and then inline the defer statements in reverse order where we encounter a break, continue or return.

Putting it all together



So now we've listed all the things we need to solve. How do we put it together? Here's the algorithm I used:

  1. For each defer, keep a pointer back to the previously active defer.
  2. For each dynamic scope, keep track of the current defer.
  3. On break/continue/return AST nodes, store 2 AST nodes.
  4. On each scoping AST node (while, for, compound statement etc) store 2 AST nodes.


Algorithm


  1. Set current_scope->current_defer = NULL
  2. Traverse the AST-tree.
  3. When pushing a new dynamic scope
    1
    current_scope->current_defer = prev_scope->current_defer;
    

  4. When encountering a defer:
    1
    2
    defer->prev_defer = current_scope->current_defer;
    current_scope->current_defer = defer;
    

  5. When encountering a scoped statement:
    1
    2
    3
    4
    5
    scoped_stmt->start_defer = current_scope->current_defer;
    push_new_current_scope();
    ... recursively process nodes ...
    scoped_stmt->end_defer = current_scope->current_defer;
    pop_current_scope();
    

  6. When encountering a break or continue:
    1
    2
    3
    Ast* target_ast = find_target(break_stmt);
    break_stmt->end_defer = current_scope->current_defer;
    break_stmt->start_defer = target_ast->start_defer;
    
  7. When encountering a return:
    1
    return_stmt->defer = current_scope->current_defer;
    



This results in us being able to use each defer as the top of a linked list:

current_defer

current_defer->prev_defer

current_defer->prev_defer->prev_defer

NULL

Codegen is now easy.

We introduce a helper function to inline defers:

1
2
3
4
5
6
7
8
void codegen_defers(Defer *current, Defer *last)
{
  while (current_defer != last)
  {
    codegen(current_defer);
    current_defer = current_defer->prev_defer;
  }
}


  1. When doing codegen for a scoped statement:
    1
    2
    codegen(scoped_stmt->inner_stmt);
    codegen_defers(scoped_stmt->end_defer, scoped_stmt->start_defer);
    
  2. When doing codegen for break or continue:
    1
    2
    codegen_defers(break_stmt->end_defer, break_stmt->start_defer);
    codegen_break(break_stmt);
    
  3. Codegen for return
    1
    2
    codegen_defers(return_stmt->defer, NULL);
    codegen_return(return_stmt);
    



Going further



Ok, so now we're done? Not quite, if we want to go beyond C syntax. We can imagine something looking a bit like this:

1
if (File *f = getFile(), defer close(f)) { ... }


In this case we actually have two scopes: one inner scope (between {}) and the outer one that starts in the conditional.

The principle is the same so we can reuse the same solution as above, but it's worth taking note of this case.

Defer after if



We have other questions to answer as well. What does this code do:

1
if (x == 0) defer printf("x was 0\n");


Some people have suggested that this should be treated as:

1
2
3
4
defer
{
  if (x == 0) printf("x was 0\n");
}


I am strongly against that idea, as it would mean that compound statements suddenly have a different meaning than regular statements.

Defer as part of the function


Another interesting thing one can do with defer is the idea that a function may contain an implicit `defer` that is added to the scope which invokes it. Odin has that feature using "deferred attributes" (see further down from this link). This is simple to tie into the defer machinery.

Defers & goto


Handling goto with defers is a bit more complicated as one need to conditionally invoke defers:

1
2
3
4
5
6
7
8
void test(int x)
{
  if (x > 0) goto FOO;
  // When is this called?
  defer printf("A");
  FOO:
  printf("B");
}


The lowered code needs to look like this:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
void test(int x)
{
  bool _defer_1 = false;
  if (x > 0) goto FOO;
  _defer_1 = true;
  FOO:
  printf("B");
  if (_defer_1) 
  {
    printf("A");
  }
}


Since B can jump into scopes as well as out of scopes, this adds another dimension to the analysis. The solution is not hard but definitely not as straightforward as the structured jumps of break and continue

Non-local jumps of setjmp are not possible to handle at all.

Go style defers


Go has a different style of defer. Go's defers actually store the defer code like a closure that is queued and invoked at function end rather than at scope end. This means a defer actually needs to allocate memory for itself. A loop like this:

1
2
3
4
5
for() {
  ...
  defer ...
  ...
}


Would queue up all the defers generated in the loop in a long list and release them at function end. If the `defer` is releasing something limited like db connections then this is a bad idea. For various "gotchas" in Go due to this style of defer, see this.

While Go defers work nicely with exceptions and `goto`, it has quite a bit of quirks as well as the need to reserve memory to store the defers.

Defers and errors


Sometimes one would prefer for defers to only occur on error:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
File *getAndCheckFile()
{
   File *f = getFile();
   if (!f) return NULL;
   // We want to close if we return with error.
   defer close(f);
   if (!fileIsValid(f)) return NuLL;
   if (readHeader(f) != 0xdeadbeef) return NULL;
   // oops, we will be closing f!
   return f;
}


For this reason Zig introduces errdefer, and C3 has defer catch / defer try statements.

Being able to cancel defers


As an alternative (and complement) to special forms of defer is being able to cancel defers. So far I've only seen this functionality on defer implemented as RAII. Theoretically it could look something like:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
File *getAndCheckFile()
{
   File *f = getFile();
   if (!f) return NULL;
   FOO: defer close(f);
   if (!fileIsValid(f)) return NuLL;
   if (readHeader(f) != 0xdeadbeef) return NULL;
   undefer FOO;
   return f;
}


Summary



Defer is useful functionality for languages that lack both finally and RAII. With structured jumps it is straightforward to implement with zero overhead.