Cx (pedantic C-language variants)

From HalfgeekKB

The book Programming Language Concepts defines variants of C, called C1, C2, C3, C4, and C5, purely for instructional purposes. The examples here are from the book. SIMPLESEM listings are shown with line numbers for reference; the numbers are not part of the code and would be omitted in a real listing. A stray ?, +, or * in the text carries its Perl regular-expression meaning (zero or one, one or more, zero or more).

C1

A lexical variant of a subset of C.

Features

  • Simple types
    • No dynamic memory
    • All primitive types
    • Fixed-size arrays
    • Structures
  • Simple statements
    • get(...) is a builtin to read stdin (e.g., get(i,j))
    • print(...) is a builtin to write stdout
  • No functions
    • Entire program is within main()

Sample

main()
{
  int i, j;
  get(i, j);
  while(i != j)
    if(i > j)
      i -= j;
    else
      j -= i;
  print(i);
}

Equivalent SIMPLESEM

0: set 0, read
1: set 1, read
2: jumpt 8, D[0] = D[1]
3: jumpt 6, D[0] <= D[1]
4: set 0, D[0] - D[1]
5: jump 7
6: set 1, D[1] - D[0]
7: jump 2
8: set write, D[0]
9: halt

where D[0] and D[1] are i and j, respectively.
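
To make the control flow of the listing easier to follow, here is a minimal sketch in standard C that steps through the same ten instructions. The function name run_listing, its two parameters (standing in for the two read operations), and the returned value (standing in for the write) are assumptions made for illustration; they are not part of SIMPLESEM or the book.

```c
/* Hypothetical helper: executes the ten-instruction listing above on a
   two-cell data memory D. The arguments stand in for the two values
   that "read" would supply, and the return value stands in for the
   value that instruction 8 would write. */
int run_listing(int first, int second)
{
    int D[2] = {0, 0};
    int ip = 0;          /* instruction pointer */
    int written = 0;

    for (;;) {
        switch (ip) {
        case 0: D[0] = first;  ip = 1; break;        /* set 0, read */
        case 1: D[1] = second; ip = 2; break;        /* set 1, read */
        case 2: ip = (D[0] == D[1]) ? 8 : 3; break;  /* jumpt 8, D[0] = D[1] */
        case 3: ip = (D[0] <= D[1]) ? 6 : 4; break;  /* jumpt 6, D[0] <= D[1] */
        case 4: D[0] = D[0] - D[1]; ip = 5; break;   /* set 0, D[0] - D[1] */
        case 5: ip = 7; break;                       /* jump 7 */
        case 6: D[1] = D[1] - D[0]; ip = 7; break;   /* set 1, D[1] - D[0] */
        case 7: ip = 2; break;                       /* jump 2 */
        case 8: written = D[0]; ip = 9; break;       /* set write, D[0] */
        default: return written;                     /* halt */
        }
    }
}
```

Like the C1 sample itself, the sketch only terminates when both inputs are positive. Note how the if/else becomes a conditional jump on the negated condition (D[0] <= D[1]) at instruction 3.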

C2

C1, but with simple routines. Note that, unlike in actual C, function definitions (except main()) are suffixed with a ;.

New features

  • Simple routines
    • Routines can declare their own data
    • May access local and (unredeclared) global variables
    • No parameters
    • No return values
    • Routines may call other routines
    • Recursion not allowed
      • A routine may not call itself (direct recursion)
      • A routine may not call (a routine that calls)+ it (indirect recursion)
    • Nesting not allowed

Compiling into SIMPLESEM

The restrictions in C1 and C2 allow the size of each unit's activation record (AR), that is, the layout of D for each routine in SIMPLESEM, to be determined at compile time. Variables can also be statically allocated: in the absence of nesting, recursion, and dynamic variables, every variable maps permanently to an index of D at compile time. The rules of static allocation are exactly as if all local variables were declared with the static keyword.
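
The comparison with the static keyword can be observed in standard C. The following is a minimal sketch, assuming a hypothetical routine alpha_local_address whose single local stands in for a C2 local; it illustrates standard C behaviour and does not settle how C2 treats initializers on repeated calls.

```c
/* Hypothetical routine: its local is declared static, so it occupies one
   fixed storage location for the whole run, much as a C2 local maps to
   one fixed index of D. */
int *alpha_local_address(void)
{
    static int l = 5;   /* initialized once, before the first call */
    l += 1;             /* the update survives between calls */
    return &l;          /* valid: static storage outlives the call */
}
```

Calling the routine twice returns the same address both times, and the second call sees the value left by the first. This is also why recursion has to be excluded: two simultaneous activations of the same routine would share the one location.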

The first location in each AR (except that of main) is reserved for the routine's return pointer. Before jumping to a routine, the caller stores the address of the instruction immediately after the jump in that routine's return pointer:

14: set 6, 16
15: jump 100

Note that in C2, the return pointer is set to a constant (16) rather than to the seemingly more natural meta-location (ip+1).

The called routine then ends by jumping back through its return pointer:

jump D[6]
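
The call and return sequence can be traced with a small sketch in standard C. The function run_call, the TRACE_MAX limit, the empty routine body at address 100, and the placement of the return jump at address 101 are all assumptions made for illustration; the source only gives instructions 14 and 15 and the final jump D[6].

```c
#define TRACE_MAX 8

/* Hypothetical helper: simulates the call sequence and records every
   instruction address visited. Returns the number of entries stored. */
int run_call(int trace[TRACE_MAX])
{
    int D[8] = {0};   /* D[6] is the callee's return pointer */
    int ip = 14;
    int n = 0;

    while (n < TRACE_MAX) {
        trace[n++] = ip;
        switch (ip) {
        case 14:  D[6] = 16; ip = 15; break;  /* set 6, 16 */
        case 15:  ip = 100; break;            /* jump 100 */
        case 100: ip = 101; break;            /* routine body (assumed empty) */
        case 101: ip = D[6]; break;           /* jump D[6] (address assumed) */
        default:  return n;                   /* back in the caller at 16: stop */
        }
    }
    return n;
}
```

The recorded trace is 14, 15, 100, 101, 16: control reaches the routine, then resumes in the caller at the address that was stored in D[6].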

Layout

  • Global declaration*
  • (Routine definition|routine declaration)*
  • main(), which is a routine that is automatically executed and cannot be explicitly called

Sample

int i = 1, j = 2, k = 3;
alpha()
{
  int i = 4, l = 5;
  ...
  i += k + l;
  ...
};
beta()
{
  int k = 6;
  ...
  i = j + k;
  alpha();
  ...
};
main()
{
  ...
  beta();
  ...
}

C2'

C2, but with support for incremental/separate compilation. The extern keyword is used to declare externals:

extern beta();
extern int k;

Because program files are compiled individually, addresses are given relative to the beginning of the code block (in code memory C) or of the AR (in data memory D), and are resolved later by a linker.
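
The source does not describe how the linker resolves these addresses. As a rough sketch of the general idea only, the following standard C code assumes the simplest possible scheme: each relative address becomes absolute by adding the base at which its code block or AR is finally placed. The struct reloc type and the relocate function are hypothetical names.

```c
#include <stddef.h>

/* Hypothetical record for one address operand that was emitted relative
   to the start of its own code block or AR. */
struct reloc {
    int relative;   /* offset produced by the compiler */
    int absolute;   /* final address, filled in at link time */
};

/* Assumed linking step: add the chosen base to every relative address. */
void relocate(struct reloc *entries, size_t count, int base)
{
    for (size_t i = 0; i < count; i++)
        entries[i].absolute = base + entries[i].relative;
}
```

For example, if a unit's code block is placed at address 100, an instruction that was compiled at relative address 5 ends up at absolute address 105.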
