Cx (pedagogical C-language variants)
The book Programming Language Concepts defines variants of C called C1, C2, C3, C4, and C5 purely for instructional purposes. The examples here are taken from the book. SIMPLESEM listings are shown with line numbers for reference; the numbers are not part of the actual code. A stray ?, +, or * has the same meaning as in Perl regular expressions.
C1
A lexical variant of a subset of C.
Features
- Simple types
  - No dynamic memory
  - All primitive types
  - Fixed-size arrays
  - Structures
- Simple statements
  - get(...) is a builtin to read stdin (e.g., get(i,j))
  - print(...) is a builtin to write stdout
- No functions
  - Entire program is within main()
Sample
    main()
    {
        int i, j;
        get(i, j);
        while (i != j)
            if (i > j)
                i -= j;
            else
                j -= i;
        print(i);
    }
Equivalent SIMPLESEM
    0: set 0, read
    1: set 1, read
    2: jumpt 8, D[0] = D[1]
    3: jumpt 6, D[0] <= D[1]
    4: set 0, D[0] - D[1]
    5: jump 7
    6: set 1, D[1] - D[0]
    7: jump 2
    8: set write, D[0]
    9: halt
where D[0] and D[1] are i and j, respectively.
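To make the control flow of the listing easier to follow, here is a minimal sketch in ordinary C that steps through the same ten instructions with an explicit instruction pointer. The function name run_gcd is hypothetical, and two simplifications are assumed: the inputs are passed as parameters instead of being obtained through read, and the value given to write is returned instead of printed.

```c
/* Hypothetical helper, not from the book: simulates the ten-instruction
   SIMPLESEM listing. in_i and in_j stand in for the two values obtained
   through "read"; the return value stands in for the value sent to "write".
   Both inputs are assumed to be positive, otherwise the loop never ends. */
int run_gcd(int in_i, int in_j)
{
    int D[2];       /* data memory: D[0] is i, D[1] is j */
    int ip = 0;     /* instruction pointer into the listing */
    int out = 0;    /* last value written */

    for (;;) {
        switch (ip) {
        case 0: D[0] = in_i;             ip = 1; break;  /* set 0, read */
        case 1: D[1] = in_j;             ip = 2; break;  /* set 1, read */
        case 2: ip = (D[0] == D[1]) ? 8 : 3;     break;  /* jumpt 8, D[0] = D[1] */
        case 3: ip = (D[0] <= D[1]) ? 6 : 4;     break;  /* jumpt 6, D[0] <= D[1] */
        case 4: D[0] = D[0] - D[1];      ip = 5; break;  /* set 0, D[0] - D[1] */
        case 5: ip = 7;                          break;  /* jump 7 */
        case 6: D[1] = D[1] - D[0];      ip = 7; break;  /* set 1, D[1] - D[0] */
        case 7: ip = 2;                          break;  /* jump 2 */
        case 8: out = D[0];              ip = 9; break;  /* set write, D[0] */
        default: return out;                             /* 9: halt */
        }
    }
}
```

Calling run_gcd(12, 18) walks the same path as the C1 program given the inputs 12 and 18.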
C2
C1, but with simple routines. Note that, unlike in actual C, function definitions (except main()) are suffixed with a ;.
New features
- Simple routines
  - Routines can declare their own data
  - May access local and (unredeclared) global variables
  - No parameters
  - No return values
- Routines may call other routines
  - Recursion not allowed
    - A routine may not call itself (direct recursion)
    - A routine may not call (a routine that calls)+ it (indirect recursion)
- Nesting not allowed
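The recursion rule above is equivalent to requiring that no routine can reach itself through one or more calls. As a rough sketch in ordinary C (not C2), the check could look like the following. The adjacency-matrix representation, the MAX_ROUTINES limit, and the names reaches and has_recursion are all assumptions made for illustration; the book does not prescribe how a compiler enforces the rule.

```c
#include <stdbool.h>

#define MAX_ROUTINES 8   /* assumed upper bound, for illustration only */

/* calls[a][b] is true when routine a contains a call to routine b.
   Returns true if "target" is reachable from "from" in one or more calls. */
static bool reaches(bool calls[MAX_ROUTINES][MAX_ROUTINES], int n,
                    int from, int target, bool seen[MAX_ROUTINES])
{
    for (int next = 0; next < n; next++) {
        if (!calls[from][next])
            continue;
        if (next == target)
            return true;              /* found a call chain back to target */
        if (!seen[next]) {
            seen[next] = true;        /* visit each routine at most once */
            if (reaches(calls, n, next, target, seen))
                return true;
        }
    }
    return false;
}

/* True if any of the n routines calls itself directly or indirectly. */
bool has_recursion(bool calls[MAX_ROUTINES][MAX_ROUTINES], int n)
{
    for (int r = 0; r < n; r++) {
        bool seen[MAX_ROUTINES] = { false };
        if (reaches(calls, n, r, r, seen))
            return true;
    }
    return false;
}
```

A chain in which one routine calls a second, which calls a third, passes the check; adding a call from the third back to the first makes has_recursion report a violation.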
Compiling into SIMPLESEM
The restrictions in C1 and C2 allow the size of each unit's activation record (AR), that is, the layout of D for each routine in SIMPLESEM, to be determined at compile time. In the absence of nesting, recursion, and dynamic variables, every variable can be mapped permanently to an index of D at compile time, so all variables are statically allocated. The effect is exactly as if every local variable were declared with the static keyword.
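The comparison with the static keyword can be illustrated in ordinary C (not C2, which has no return values or pointers in this form). In the sketch below, the function name counter_slot is hypothetical; it returns the address of its local only so that the fixed location can be observed from outside. The variable is deliberately left without an initializer, since whether a C2 initializer is re-executed on each call is not stated here.

```c
/* Ordinary C, for illustration only. The static local occupies one fixed
   cell for the whole run of the program, much like a variable that has
   been assigned a permanent index of D. */
int *counter_slot(void)
{
    static int calls;   /* zero-initialized once; never reallocated */
    calls += 1;         /* the value survives from one call to the next */
    return &calls;      /* the same address on every call */
}
```

Two successive calls return the same address, and the cell holds 2 after the second call, because no new storage is created when the routine is entered.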
The first location in each AR (except that of main) is reserved for that routine's return pointer. Before jumping to a routine, the caller stores in this location the address of the instruction immediately after the jump:
    14: set 6, 16
    15: jump 100
Note that in C2 the return pointer is set to a constant (16) rather than to the seemingly more sensible meta-location (ip+1).
The called routine then ends by jumping back through the return pointer:
    jump D[6]
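The call and return sequence can be traced with a small simulation in ordinary C. The addresses 14, 15, 16, and 100 and the return-pointer cell D[6] come from the fragments above; everything else is an assumption made for illustration: the name run_call, the placeholder routine body at address 100 (which just increments D[7]), the placement of the final jump at address 101, and a halt at address 16.

```c
/* Hypothetical simulation of the call/return fragments. Records the
   address of each executed instruction in trace[] (up to max_trace
   entries) and returns how many instructions were executed. */
int run_call(int trace[], int max_trace)
{
    int D[8] = { 0 };   /* data memory; D[6] is the callee's return pointer */
    int ip = 14;        /* start at the call site */
    int n = 0;

    for (;;) {
        if (n < max_trace)
            trace[n] = ip;
        n++;
        switch (ip) {
        case 14:  D[6] = 16;    ip = 15;  break;  /* set 6, 16 */
        case 15:  ip = 100;               break;  /* jump 100 */
        case 16:  return n;                       /* assumed: halt */
        case 100: D[7] += 1;    ip = 101; break;  /* assumed routine body */
        case 101: ip = D[6];              break;  /* jump D[6] */
        default:  return -1;                      /* unexpected address */
        }
    }
}
```

The recorded trace visits 14, 15, 100, 101, and then 16, showing that control resumes at the instruction after the jump even though the callee knows nothing about its caller.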
Layout
- Global declaration*
- (Routine definition|routine declaration)*
- main(), which is a routine that is automatically executed and cannot be explicitly called
Sample
    int i = 1, j = 2, k = 3;

    alpha()
    {
        int i = 4, l = 5;
        ...
        i += k + l;
        ...
    };

    beta()
    {
        int k = 6;
        ...
        i = j + k;
        alpha();
        ...
    };

    main()
    {
        ...
        beta();
        ...
    }
C2'
C2, but with support for incremental/separate compilation. The extern keyword is used to declare externals:
    extern beta();
    extern int k;
Because program files are compiled individually, the compiler emits addresses relative to the start of the unit's code block (in the code memory C) or of its AR (in the data memory D); a linker later resolves them to absolute addresses.
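The resolution step can be pictured as adding a base address to each relative address once the linker has decided where the unit's code and AR will live. The sketch below, in ordinary C, shows only that arithmetic. The struct operand record, its fields, and the relocate function are hypothetical, and the sketch does not attempt to model how extern names are matched to their definitions.

```c
#include <stddef.h>

/* Hypothetical record for one address operand in a compiled unit. */
struct operand {
    int value;     /* address relative to the start of the unit */
    int in_code;   /* nonzero: refers to code memory C; zero: data memory D */
};

/* Turns relative addresses into absolute ones, given the positions the
   linker chose for this unit's code block and for its AR. */
void relocate(struct operand ops[], size_t count, int code_base, int data_base)
{
    for (size_t k = 0; k < count; k++)
        ops[k].value += ops[k].in_code ? code_base : data_base;
}
```

For example, if a unit's code is placed at address 100 and its AR at index 6, a jump to relative instruction 2 becomes a jump to 102, and a reference to relative data cell 0 becomes a reference to D[6].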