Christoffer Lernö

I recently removed untyped literals, and that complexity is now leaving me free to actually add code where there's a tangible benefit to it.

It might not be immediately obvious that having untyped literals adds complexity to a language. For C3, with its C-like implicit conversions and untyped macros, it adds more complexity than in languages like Go, where explicit conversions are enforced.

The sheer amount of complexity I had added just to do this surprised even me. For example, every expression needed to pass down a "target type" in case the analysis encountered an untyped literal to give a type to. In some cases this was far from trivial, and analysis had to be done in a certain order to prevent unnecessary error situations, where the incorrect order would only be obvious when the expressions were combined in some special way.

As I removed the code I also discovered I could change how the "failables" (error unions) were handled, which further reduced complexity. This in turn allowed me to do simplifications in tracking constant and "pure" expressions.

Lots of code that I had left half unfinished – with special, problematic, cases to solve another day – would after refactoring easily cover all cases and be smaller.

The code now seems so much easier to grasp, and it seems crazy I didn't remove this feature – which at most saved a little typing here and there – earlier. But the problem was that it was a gentle creep. I added the untyped literals early, so when I added more complex features I didn't have a comparison with how they would look with untyped literals removed.

When I look at the code now, I see something that can more easily accommodate added features and behaviours. Before, the semantic analysis had by necessity been much more coupled, due to type flow going both bottom up and top down.

There's a lesson here – which is that a seemingly simple and "nice to have" feature might by accident end up making a code base more than twice as complex to read and reason about. And that complexity cost takes away the resources the compiler (and the language) has for other features – features that may actually have a cost that is equal to its benefits.

In removing this small feature with little practical benefit, I am now free to actually add a little complexity where it really helps the users.

Christoffer Lernö

When we last left off we had our "attempt 3" which seemed like a promising candidate to research. We expected some problems and here are two:

1. What is the behaviour when there are both "top down" casts and peer casts? This was never discussed. For example: x_u64 = a_i32 + b_u32. As we will see, making the wrong choice here will create overflow bugs.
2. When are constants folded? This matters since we only recursively applied casts when the right-hand side is a binary expression. But does that check happen before or after constant folding? The proposal seems to imply it's after, but that would give weird semantics.

### Top down casts & peer casts

Our "test" here is this expression:

int32_t a = ...;
uint32_t b = ...;
uint64_t c = a + b;

// Peer -> top down:
uint64_t c = (uint64_t)a + (uint64_t)(int32_t)(b)


For some surprising results, set b = 0x80000000U and a = 1: peer → top down gives c = 0xffffffff80000001ULL, while top down → peer gives us our expected c = 0x80000001ULL.

It's very important that the resolution works in this order:

1. Check that a and b are valid widening safe expressions (see previous article for definition).
2. Evaluate a and b in isolation, before actually evaluating the binary expression.
3. Cast both to uint64_t (note that this differs from peer resolution which would preserve the signedness)
4. Semantically check the binary expression.

This gives us uint64_t c = (uint64_t)a + (uint64_t)b which presumably is what we want.

Another thing we need to note, though, is that only integer widening + signedness changes happen this way. For example, something like ptrdiff_t x = ptr1 - ptr2 shouldn't start adding casts to the pointers!

### When constant folding happens

Folded constants are either constant variables or literals (assume int is 32 bits and long is 64 bits):

const int FOO = 1;
const int BAR = 2;
int y = ...;
int i = ...;
int j = ...;
long a1 = y + (i + j); // Error, needs explicit cast.
long a2 = y + (FOO + BAR); // Error?
long a3 = y + (1 + 2); // Error?


Just arguing for consistency would seem to require that the semantics are the same for variables and constants, and by extension also for literals. This might be a little disappointing, but consider whether one would expect a to be -1 below, which would be the case if we folded constants first:

long a = INT_MAX * 2 + 1;


With late folding the result becomes:

long a = INT_MAX * 2 + 1; // Error, needs explicit cast.
long a2 = INT_MAX * 2; // => 0xfffffffe
long a3 = INT_MAX + 1; // => 0x80000000


Which is probably as good as it gets.

Another case is when the left-hand side is unsigned. In this case we do not actually perform the top down widening, but instead have a general cast on the result:

uint b = INT_MAX * 2 + 1; // => 0xffffffff


## Conclusion

So far so good, there is no showstopper here, but this serves as a reminder and example of how easy it is to forget implications.

Christoffer Lernö

As a reminder, in the last article I listed the following strategies:

#### Strategies for widening

1. Early fold, late cast (deep traversal).
2. Left hand side widening & error on peer widening.
3. No implicit widening.
4. Peer resolution widening (C style).
5. Re-evaluation.
6. Top casting but error on subexpressions.
7. Common integer promotion (C style).
8. Pre-traversal evaluation.

#### Strategies for narrowing

1. Always allow (C style).
2. Narrowing on direct literal assignment only.
3. Original type is smaller or same as narrowed type. Literals always ok.
4. As (3) but literals are size checked.
5. No implicit narrowing.

#### Strategies for conversion of signed/unsigned

1. Prefer unsigned (C style).
2. Prefer signed.
3. Disallow completely.

To begin with, we'll try the "Early fold, late cast" for widening. This is essentially a modification of the peer resolution widening.

## Attempt 1 - "Early fold, late cast"

Here we first fold all constants, then "peer promote" the types recursively downwards. How this differs from C is illustrated below:

int32_t y = foo();
int64_t x = bar();
double w = x + (y * 2);
// C semantics:
// double w = (double)(x + (int64_t)(y * 2));
// Early fold, late cast:
// double w = (double)(x + ((int64_t)y * (int64_t)2));


So far this looks pretty nice, but unfortunately the "early fold" part has less positive consequences:

int32_t y = foo();
int64_t x = bar();
int32_t max = INT32_MAX;
double w = x + (y + (INT32_MAX + 1));
double w2 = x + (y + (max + 1));
// Early fold, late cast:
// double w = (double)(x + ((int64_t)y + (int64_t)-2147483648));
// double w2 = (double)(x + ((int64_t)y
//             + ((int64_t)max + (int64_t)1)));


The good thing about this wart is that it is at least locally observable - as long as it's clear from the source code what will be constant folded.

All in all, while this slightly improves the semantics, we can't claim to have solved all the problems with peer resolution, and the semantics got more complicated.

## Attempt 2 "Re-evaluation"

This strategy is more theoretical than anything. With this strategy we attempt to first evaluate the maximum type, then re-evaluate everything by casting to this maximum type. It's possible to do this in multiple ways: deep copying the expression, two types of evaluation passes etc.

Our problematic example from attempt 1 works:

int32_t y = foo();
int64_t x = bar();
int32_t max = INT32_MAX;
double w = x + (y + (INT32_MAX + 1));
// => first pass type analysis: int64_t + (int32_t + (int + int))
// => max type in sub expression is int64_t
// => double w = (double)(x + ((int64_t)y
//               + ((int64_t)INT_MAX + (int64_t)1)));


However, we somewhat surprisingly get code that is hard to reason about locally even in this case:

// elsewhere int64_t x = ... or int32_t x = ...
int32_t y = foo();
double w = x + (y << 16);
// If x is int64_t:
// double w = (double)(x + ((int64_t)y << 16));
// If x is int32_t
// double w = (double)(x + (y << 16));


Here it's clear that we get different behaviour: in the case of int32_t the bits are shifted out, whereas for int64_t they are retained. There are better (but more complicated) examples to illustrate this, but it proves that even with "ideal" semantics, we lose locality.

## Rethinking "perfect"

Let's go back to the example we had with Java in the last article:

char d = 1;
char c = 1 + d; // Error
char c = (char)(1 + d); // Ok


In this case the first assignment (char d = 1) is ok, but the second isn't. The special case takes care of the most common use, but for anything non-trivial an explicit cast is needed.

What if we would say that this is okay:

int64_t x = foo();
int32_t y = bar();
double w = x + y;


But this isn't:

int64_t x = foo();
int32_t y = bar();
double w = x + (y + y);
//             ^^^^^^^
// Error, can't implicitly promote complex expression
// of type int32_t to int64_t.


Basically we give up widening except for where we don't need to recurse deep into the sub-expressions. Let's try it out:

## Attempt 3: Hybrid peer resolution and no implicit cast

We look at our first example.

int32_t y = foo();
int64_t x = bar();
double w = x + (y * 2); // Error


This is an error, but we can do two different things in order to make it work, depending on what behaviour we want:

// A
double w = x + (y * 2LL); // Promote y
// B
double w = x + (int64_t)(y * 2); // Promote y * 2


Yes, this is some extra work, but also note that suddenly we have much better control over what the actual types are compared to the "perfect" promotion. Also note that changing the type of x cannot silently change the semantics here.

The second example:

int32_t y = foo();
int64_t x = bar();
int32_t max = INT32_MAX;
double w = x + (y + (INT32_MAX + 1)); // Error


Again we get an error, and again there are multiple ways to remove this ambiguity. To promote the internal expression to use int64_t, we just have to nudge it a little bit:

double w = x + (y + (INT32_MAX + 1LL)); // Ok!


And of course if we want C behaviour, we simply put the cast outside:

double w = x + (int64_t)(y + (INT32_MAX + 1)); // Ok!


Note here how effective the type of the literal is in nudging the semantics in the correct direction!

## Adding narrowing to "attempt 3"

For narrowing we have multiple options, the simplest is following the Java model, although it could be considered to be unnecessarily conservative.

The current C3 model uses an "original type / active type" model. This makes it hard to do any checks, like accepting char c1 = c + 127 but rejecting char c1 = c + 256 on account of 256 overflowing char.

If this type checking system is altered, it should be possible to instead pass a "required max type" downwards. In practice the difference only shows in whether literals that exceed the left-hand side type are rejected.

For most other purposes they work, and seem fairly safe as they are a restricted variant of C semantics. There is already an implementation of this in the C3 compiler.

## Handling unsigned conversions

The current model in C3 allows free implicit conversions between signed and unsigned types of the same size. However, the result is the signed type rather than the unsigned.

However, naively promoting all types to signed doesn't work well:

ushort c = 2;
uint d = 0x80000000;
uint e = d / 2; // => c0000000 ????
// If it was e = (uint)((int)d / (int)2);


Consequently C3 tries to promote to the signedness of the underlying type:

uint e = d / 2;
// e = d / (uint)2;
// => 0x40000000


Given the lack of languages with this sort of promotion, there might very well be serious issues with it, but in such cases we could limit unsigned conversion to the cases where it's fairly safe. For now though, let's keep this behaviour until we have more solid examples of it creating serious issues.

### Summarizing attempt 3:

1. We define a receiving type, which is the parameter type the expression is passed to as a value, or the variable type in the case of an assignment. This receiving type may also be empty. Example: in short a = foo + (b / c), the receiving type is short.
2. We define a widening safe expression as any expression except: add, sub, mult, div, rem, left/right shift, bit negate, negate, ternary, bit or/xor/and.
3. We define a widening allowed expression as a binary add, sub, mult, div where the sub expressions are widening safe expressions.
4. Doing a depth first traversal on sub expressions, the following occurs in order:
1. The type is checked.
2. If the type is an integer and has a smaller bit width than the minimum arithmetic integer width, then it is cast to the corresponding minimum integer with the same signedness.
5. If the expression is a binary add/sub/div/mod/mult/bit or/bit xor/bit and, then the following are applied in order.
1. If both sub expressions are integers but of different size, check the smaller sub expression. If the smaller sub expression is a widening safe expression insert a widening cast to the same size, with retained signedness. Otherwise, if the smaller sub expression is a widening allowed expression, insert a widening cast to the same size with retained signedness on the smaller expression's two sub expressions. Otherwise this is a type error.
2. If one sub expression is an unsigned integer, and the other is signed, and the signed is non-negative constant literal, then the literal sub expression is cast to type of the unsigned sub expression.
3. If one sub expression is unsigned and the other is signed, the unsigned is cast to the signed type.
4. If one sub expression is a bool or integer, and the other is a floating point type, then the integer sub expression is cast to the floating point type.
6. Using the receiving type, ignoring inserted implicit casts, recursively check sub expressions:
1. If the type is wider than the receiving type, and the sub expression is a literal which does not fit in the receiving type, this is a literal out of range error.
2. If the sub expression is not a literal, and the sub expression's type is wider than the receiving type, then this is an error.
3. If an explicit cast is encountered, set receiving type for sub expression checking to empty.

Whew! That was a lot and fairly complex. We can sum it up a bit more simply:

1. Implicit narrowing only works if all sub expressions would fit in the target type.
2. Implicit widening will only be done on non-arithmetic expressions or simple +-/%* binary expressions. In the second case the widening is done on the sub expressions rather than the binary expression as a whole.
3. Integer literals can be implicitly cast to any type as long as it fits in the resulting type.

## Conclusion

We tried three different semantics for implicit conversions using typed literals. A limited hybrid solution (our attempt 3) doesn't immediately break and consequently warrants some further investigation, but we're extremely far from saying "this is good". Quite the opposite: we should treat it as likely broken in multiple ways, where our task is to find out where.

We've already attempted various solutions, only to find corner cases with fairly awful semantics, even leaving aside implementation issues.

So it should be fairly clear that there are no perfect solutions, but only trade-offs to various degrees. (That said, some solutions are clearly worse than others)

It's also very clear that a healthy skepticism is needed towards one's designs and ideas. What looks excellent and clear today may tomorrow turn out to have fatal flaws. Backtracking and fixing flaws should be part of the design process and I believe it's important to keep an open mind, because as you research you might find that feature X which you thought was really good / really bad, is in fact the opposite.

When doing language design it's probably good to be mentally prepared to be wrong a lot.

(Also see the follow up, where we patch some of the problems!)

Christoffer Lernö

This is continuing the previous article on literals.

C3 is pervasively written to assume untyped literals, so what would the effect be if it changed to typed literals again?

Let's remind ourselves of the rules for C3 with untyped literals:

1. All operands are widened to the minimum arithmetic type (typically a 32 bit int) if needed.
2. If the left-hand side is instead an integer of a larger width, all operands are instead widened to this type.
3. The end result may be implicitly narrowed only if all the operands were as small as the type to narrow to.

To exemplify:

short b = ...
char c = ...

// 1. Widening
$typeof(b + c) => int //$typeof((int)(b) + (int)(c))

// 2. Left hand side widening
long z = b + c;
// => long z = (long)(b) + (long)(c);

// 3. Implicit narrowing.
short w = b + c;
// => short w = (short)((int)(b) + (int)(c))


## Simple assignments

Let's start with simple assignments, where the lack of implicit narrowing tends to be a problem.

func void foo(short s) { ... }
func void ufoo(ushort us) { ... }
...

short s = 1234; // Allowed??
foo(1234); // Allowed??
ushort us = 65535; // Allowed??
ufoo(65535); // Allowed??


This is certainly something we would like to allow, so some additional rule is needed on top of just the literal types.

## Binary expressions

We next turn to binary expressions:

short s = foo();
s = s + 1234; // Allowed??


This example is subtly different, because looking at the types there is no direct conversion. Keep in mind rule (3), which permits narrowing. It requires us to keep track of both the original and the current type. The types are something like this: short = (int was short) + (int was int). If we try to unify the binary expression, should the resulting type be "int was short" or "int was int"?

If we had s = s + 65535 we would strictly want "int was int" – which causes a type error, and then preferably "int was short" in the case with 1234. Is this possible?

In short: yes, although it does add a special check for merging constants on every binary node. That said, it has some very far-reaching consequences.

## Peer resolution woes

In Zig, which only has implicit widening, this occurs in binary expressions. So if you had int + short the right-hand side would be promoted to int. For a single binary expression this makes sense, but what if you have this:

char c = bar();
char d = bar2();
short s = foo();
int a = c + d + s;


If we use the "peer resolution" approach, this would be resolved in this manner: int a = (int)((short)(c + d) + s). But if we reorder the terms, like int a = s + c + d, we get int a = (int)((s + (short)c) + (short)d) (assuming additions are done using the resulting bit width of the sub expressions).

This means that in the original ordering we have mod-2^8 addition (or even undefined behaviour!) if the sum of c and d can't fit in a char, but with the reordered version there is no problem. Or in other words: we just lost associativity for addition!

That said, C has the same problem, but only when adding int to expressions which have types larger than int.

This example exhibits the same issue in C:

int32_t i = bar();
int32_t j = bar2();
int64_t k = foo();
int64_t a = i + j + k;
// Evaluated as
// int64_t a = (int64_t)(i + j) + k;


Prior to the 64-bit generation, the int was generally register sized in C, so doing larger-than-register-sized additions was rare and consequently less of a problem.

This prompted me to investigate a different approach for C3, using left side widening. On 64 bits, this would still be default 32-bit widening, but if the assigned type was 64-bit, all operands would be widened to 64 bits instead, as if it was written int64_t a = (int64_t)i + (int64_t)j + k. This is achieved by "pushing down" the left-hand type during semantic analysis (see rule 2).

The reason it must be pushed down during the pass, rather than doing it after evaluation, is that the type affects untyped literal evaluation and constant folding. Here is an example:

int32_t b = 1;
int64_t a = b + (b + ~0);


If we don't push down the type, then we need to use the closest thing, which is either b – which has type int32_t – or just the default arithmetic promotion type, which is "int" in C.

// Use int to constant fold ~0 and
// cast all to int64_t afterwards.
int64_t a = (int64_t)b + ((int64_t)b + (int64_t)(~(int)0));

// Use int64_t on 0 before constant folding:
int64_t a = (int64_t)b + ((int64_t)b + (~(int64_t)0));


We still have problems though, like in this case:

double d = 1.0 + (~0 + a);


If we do the analysis in this order:

1. Analyse 1.0
2. Analyse 0
3. Analyse ~0
4. Analyse a
5. Analyse a + ~0
6. Analyse 1.0 + (a + ~0)
7. Analyse d = 1.0 + (a + ~0)

Then we see that even though we know a to be int64_t, we can't use that fact to type 0, because a is analysed after 0. And worse: a might actually be an arbitrarily complex expression – something you can't tell until you've analysed it!

For that reason, even with untyped values, 0 is forced to default to int, with this result: double d = 1.0 + (double)((int64_t)~(int)0 + a)

Another example is shown below.

int64_t x = 0x7FFF_FFFF * 4;
int64_t y = x - 0x7FFF_FFFF * 2;
// y = 4294967294
double d = x - 0x7FFF_FFFF * 2;
// d = 8589934590.0


Here, due to no widening hint from the "double", the two expressions behave in a completely different manner.

## Setting realistic goals

Ideally we would like semantics that eliminate unnecessary casts, and require casts where cast semantics are intended. Since the compiler can't read our mind, we will need to sacrifice at least a little of one of those goals. In C, elimination of unnecessary casts is prioritized, although these days, with the common set of warnings, that is somewhat less true.

In addition to this, we would also like to make sure arithmetic happens with the bit width the user intended, and that code can be interpreted locally without having to know the details of variables defined far away – this is particularly important if the compiler does not detect mismatches.

Even more important is a simple programmer mental model for semantics: easy to understand semantics that are less convenient trumps convenient but complex semantics.

## Strategies

In order to get started, let's list a bunch of possible strategies for widening and narrowing semantics.

#### Strategies for widening

1. Early fold, late cast (deep traversal).
2. Left hand side widening & error on peer widening.
3. No implicit widening.
4. Peer resolution widening (C style).
5. Re-evaluation.
6. Top casting but error on subexpressions.
7. Common integer promotion (C style).
8. Pre-traversal evaluation.

#### Strategies for narrowing

1. Always allow (C style).
2. Narrowing on direct literal assignment only.
3. Original type is smaller or same as narrowed type. Literals always ok.
4. As (3) but literals are size checked.
5. No implicit narrowing.

#### Strategies for conversion of signed/unsigned

1. Prefer unsigned (C style).
2. Prefer signed.
3. Disallow completely.

We should also note that there is no need to limit ourselves to a single strategy. For example, in Java char c = 1 + 1; is valid, but this isn't:

char d = 1;
char c = 1 + d;


Here the simple literal assignment is considered a special case where implicit conversion is allowed. In other words, we're seeing a combination of strategies.

### Case study 1: "C rules"

C in general works well, except for the unsafe narrowings – until we reach "larger than int" sizes. In this case we get peer resolution widening, which produces bad results for sub expressions:

int a = 0x7FFFFFFF;
int64_t x = a + 1;
// => x = -2147483648


In general, working with a combination of int and larger-than-int types works poorly in C.

### Case study 2: "C + left side widening"

This is using the current C3 idea of using the left side type to increase the type implicitly widened to. Our C example works:

int a = 0x7FFFFFFF;
int64_t x = a + 1;
// => x = 2147483648 with left side widening


As previously mentioned, this breaks down when there is no left side type, which then defaults to C peer resolution widening.

int a = 0x7FFFFFFF;
int64_t b = 0;
double d = (double)((a + 1) + b);
// => d = -2147483648.0


In the next article I will walk through various new ideas for promotion semantics with analysis.

Christoffer Lernö

When one deviates from established language semantics, one sometimes accidentally breaks well-understood behaviour. One of the worst design mistakes I made when working on C3 was to accidentally break associativity for addition.

Here is some code:

// File foo.h
typedef struct {
unsigned short a;
unsigned short b;
unsigned short c;
} Foo;

// File bar.c
unsigned int fooIt(Foo *foo)
{
unsigned int x = foo->a + foo->b + foo->c;
return x;
}

// File baz.c
int calculate()
{
Foo foo = { 200, 200, 200 };
assert(fooIt(&foo) == foo.b + foo.c + foo.a);
return fooIt(&foo);
}


I've written this with pure C syntax, but we're going to imagine deviating from C semantics.

In particular we're going to say:

1. If two operands in a binary expression (e.g. a + b) are of the same bit width n, the operation will be performed with wrapping semantics. So if we have two variables that are unsigned char and we add them, the maximum value is 255. Similarly, adding two unsigned short values yields a maximum of 65535.
2. Implicit widening casts will be performed as long as they do not affect signedness.

In our example above fooIt(foo) will return 600 regardless of whether we are using C or this new language with different semantics.

But let's say someone found this code to be memory inefficient. a and b should never hold values over 255 (for one reason or another). They alter the file foo.h in the following way, which passes compilation:

typedef struct {
unsigned char a;
unsigned char b;
unsigned short c;
} Foo;


You go to work, make changes, and discover that suddenly your assert is trapping. You look at calculate and find no changes to that code. The same goes for bar.c with fooIt. You find that fooIt(foo) now returns 344, which makes no sense to you.

Finally, the only candidate left is the change to Foo, but the data in Foo is the same, and your assert is doing the same calculation as fooIt... or is it?

It turns out that with the non-C semantics above, the computer will calculate unsigned int x = a + b + c in the following way:

1. a + b mod 2^8 => 144
2. 144 + c mod 2^16 => 344

In your assert on the other hand, we swapped the order:

1. b + c mod 2^16 => 400
2. 400 + a mod 2^16 => 600

The new semantics silently broke associativity, and the compiler didn't warn us one bit. This is spooky action at a distance, which you definitely don't want. Neither the writer of Foo, nor of fooIt, nor you could know that this would be a problem; it only breaks when the parts come together.

But "Wait!", you say, "There are many languages allowing this 'smaller than int size adds' addition by default, surely they can't all be broken?" – and you'd be right.

So what is the difference between our semantics and non-broken languages like Rust? If your guess is "implicit widening", then you're right.

And doesn't this seem strange? I mean it's not related to why the associativity breaks, but it's still the culprit. Because what happens if we don't have the widening?

Well, fooIt would stop compiling, for one:

unsigned int fooIt(Foo *foo)
{
unsigned int x = foo->a + foo->b + foo->c;
//               ^^^^^^^^^^^^^^^^^^^^^^^^
// Error: cannot add expression of type unsigned char
// to expression of type unsigned short
return x;
}


And of course it would be known that changing Foo would be a possibly breaking change.

## So what can be learned?

Designing new language semantics isn't trivial. Few consequences are easily recognizable at the beginning. One needs to be ready to drop semantics if they later turn out to have issues one didn't count on, even if they "work in most cases".