Cifl/Metacifl

From HalfgeekKB
Jump to: navigation, search

Metacifl refers to a minimal name and identifier mapping system for statically typed macros that could be used to implement Cifl.

Primitives

  • An identifier is a sequence of bytes (perhaps limited to text strings, depending, but preferably unlimited).
  • A package is a sort of meta-namespace that consists of these mappings:
    • The packages mapping maps an identifier to a package.
    • The types mapping maps a tuple of (identifier, ctparams) to a type.
    • The operators mapping maps a tuple of (identifier, ctparams, params) to an operator.
  • A type is a specification of the storage requirements of a variable/member/element and an indicator of which operators in scope apply to the value.
  • An operator is a specification of input parameters and types that refers to a macro.
  • A list of compile-time parameters, or ctparams, qualifies types and operators with values that are known at compile time and can be used e.g. to affect the structure of a data type for template generics.
  • A list of run-time parameters, or params, qualifies operators with values that are determined at run time and used as part of the computation implemented by the operator.

Built-in symbols

Before even a host context is defined, there are symbols defined with some meaning in Metacifl itself. Some are defined in the namespace `-meta`, which has an ugly name to discourage abuse.

Built-in types indicate the types of symbols and compile-time expressions in Metacifl itself.

package `-meta`

cttype Name // unqualified name
cttype Package // a package by identity, not necessarily by name
cttype QName // a pair of an package and an identifier
cttype Type // a data type
cttype Operator // an operator

ctop QName[P:Package,N:Name] : QName
ctop rootPackage : Package

Annotations

  • @`-meta`.pure : This operator always returns the same value for a given set of parameters
  • @`-meta`.v : This operator represents a variable/const. On a getter, it suggests that the result value of this operator will not change between calls to the corresponding setter (unless the variable is also volatile). (A getter has zero run-time arguments.)

More to come.

Types

// Opaque
SomeInt[]         -> #define Type_SomeInt() int
// Template generic
Some2D[T:Type]    -> #define Type_Some2D(T) struct { T x; T y; }
// Erasure generic
SomeArray[T:Type] -> #define Type_SomeArray(T) struct { const size_t length; void * const data }

In Metacifl, a type specifically refers to a value type. A type does not carry notations of mutability or volatility because these are not properties of values. In particular,

  • Mutability is modeled by the presence or absence of a mutation operator (a "setter") by which to alter a value.
  • Volatility is outside the scope of Metacifl itself, but can be expressed using language-specific annotations.

On the other hand, these properties might need to be encoded as part of a pointer type, which is effectively an address (an rvalue) to a variable (an lvalue). These notations would not apply to the value itself but to the variable to which it points.

Operators

// On opaque
`+`(SomeInt[],SomeInt[])          -> #define Op_plus_SomeInt_SomeInt(a,b) (a)+(b)
// Erasure operator on template type
`.x`[T:Type](Some2D[T])           -> #define Op_dotx(T,s) (s).x // T isn't used
// Template operator on erasure type
`()`[T:Type](SomeArray[T],size_t) -> #define Op_call(T,a,i) ((T*)((a).data))[i] // T is used
// Typecast
SomeInt[](SomeOtherIntType[])     -> #define Op_SomeInt_SOIT(other) ((int)other)

In principle, an operator is very similar to a C macro, but with some extra type checking at the high level.

By convention, variables, functions, and type casts are generalized to operators.

  • A function implementation maps fairly closely to an operator implementation. A C macro expansion is already syntactically similar to a C function call.
  • A variable is implemented in terms of a get operator (nullary, returning the variable type) and a set operator (unary, accepting the variable type, returning void).
    • A constant simply omits the set operator.
  • A type cast is implemented by naming a unary operator with the same name as the type.

Note that in general, a template generic type will use erasure generic operators and vice versa.

Appendix

Idioms

Scalar type

C:

typedef int myint;

Model:

// Declare a type
type myint

// Declare operations on that type
op `+`(myint, myint) : myint
op `-`(myint, myint) : myint
// ...

Struct type

C:

typedef struct {
  int memint;
  double memdouble;
  const char memconstchar;
} mystruct;

Model:

type mystruct

// Initializer
op mystruct(int, double, char) : mystruct
// Accessors
op memint : int
op memint(int) : void
op memdouble : double
op memdouble(double) : void
op memconstchar : char
// no setter for memconstchar

A similar model could be used for a plain C union; the implementation would simply differ somewhat.

Enumerated type

C:

typedef enum {
  CLUBS,
  DIAMONDS,
  HEARTS,
  SPADES
} cardsuit;

Model:

package somepackage
type cardsuit
op cardsuit[Dummy] : Enum[cardsuit] // Optional
package somepackage.cardsuit
import somepackage.cardsuit // imports the type
op CLUBS : cardsuit
op DIAMONDS : cardsuit
op HEARTS : cardsuit
op SPADES : cardsuit

In this model, one could import "somepackage.cardsuit" and refer to the constants as e.g. "cardsuit.SPADES". Or, one could import "somepackage.cardsuit._" and refer to the constants as e.g. "SPADES".

The "cardsuit" operator with the hypothetical unit ctparam "Dummy" would allow the extraction of additional information without polluting the associated package. (Something like cardsuit.info could be done, but then it wouldn't be possible to define an enum constant called "info".) The ctparam is a dummy argument; it's only there to specify that you're addressing the operator and not the namespace.

Ctparam'ed types

C:

// Generic via macro
#define DIV_TYPE(T) struct { T quot; T rem; }
typedef DIV_TYPE(int) div_t;
typedef DIV_TYPE(long) ldiv_t;
typedef DIV_TYPE(long long) lldiv_t;

Model:

// In a C implementation, the type is a templated type, meaning that the
// generic behavior is implemented by a distinct type per parameter list.
// Contrast erasure type, where behavior is implemented by the same type
// regardless of parameters.
// Example implementation would be like
// #define anydiv_t(T) struct { T quot; T rem; }
type anydiv_t[T]

op anydiv_t[T](T,T) : anydiv_t[T]
op quot[T](anydiv_t[T]) : T
op quot[T](anydiv_t[T], T) : void
op rem[T](anydiv_t[T]) : T
op rem[T](anydiv_t[T], T) : void

Function types

typedef int (*Comparator)(const void *, const void *);

Consider an operator with the following format:

op somecomp[T](ConstRef[T], ConstRef[T]) : int

A function in general might have the type

type Function[R:Type, P:Type...]

noting that there are no ctparams possible on the function itself. The type parameter reappears in the result type of some expression yielding such a function.

op getComparator[T] : Function[int, ConstRef[T], ConstRef[T]]

Selected symbols consistent with Java

package java.lang // not technically true, but semantics are correct

// Types
type byte
type short
type char
type int
type long
type float
type double
type boolean

// Constants are coded as nullary ops
op true : boolean
op false : boolean

// Numeric casts
op byte(int) : byte
op short(int) : short
op char(int) : char
op int(int) : int // self
op long(int) : long
op float(int) : float
op double(int) : double

// Unary ops
op `+_`(int) : int // self
op `-_`(int) : int
op `~_`(int) : int

// Binary ops
op `<<`(int,int) : int
op `>>>`(int,int) : int
op `>>`(int,int) : int

// Compares
op `==`(int,int) : boolean
op `!=`(int,int) : boolean
op `<`(int,int) : boolean

// Operators involving conversion/promotion are codified explicitly
op `|`(int,int) : int
op `|`(int,long) : long
op `+`(int,int) : int
op `+`(int,long) : long
op `+`(int,double) : double

package java.lang.`-ext` // not accessed directly but by special syntax

type Array[T]

op Array[T](int) : Array[T] // ctor, but modeled as a factory
op Array[T](int,int) : Array[Array[T]]
op Array[T](int,int,int) : Array[Array[Array[T]]]
// ... and so on ...
op length(Array[_]) : int

Selected symbols consistent with C

package stdc.meta // Made up

import stdc.lang._

cttype AnyInteger

@implicit op _(int) : AnyInteger
@implicit op _(`unsigned int`) : AnyInteger
//...

// Function pointer type. Note that it's variadic.
type FunctionRef[ReturnType, ParameterTypes...]

// An example of using non-type ctparams.
// cttype defines a type that does not exist in the runtime environment
// but can be converted into another type that does. In this example, a
// "Refable" is what standard C calls an "lvalue". An lvalue expression
// differs from a typical value expression in that it refers indirectly
// to a memory location, which can be retrieved using the unary &
// operator, then used as its original form with the unary *. If the
// lvalue is not marked constant, the value can be assigned to.
// The unary & yields a Ref[T,C,V], which corresponds, in C, to T *
// (with const and volatile set by C and V, respectively). C
// distinguishes const and volatile expressions on addresses (pointers).
// There is no data type that corresponds strictly with an lvalue, but
// we can encode one for compile-time use with a cttype. A Refable can
// be converted to its Ref by &, or can be reverted to its rvalue
cttype Refable[T:Type, C:Flag, V:Flag]
type Ref[T:Type, C:Flag, V:Flag]

op `&_`[T,C:Flag,V:Flag](Refable[T,C,V]) : Ref[T,C,V]

op `*_`[T,C:Flag,V:Flag](Ref[T,C,V]) : Refable[T,C,V]
op `()`[T,C:Flag,V:Flag](Ref[T,C,V], AnyInteger) : Refable[T,C,V]

// Auto-convert to const and volatile.
// The non-implicit casts are better implemented in C itself on an
// ad-hoc basis.
@implicit op _[T](Ref[T,False,_]) : Ref[T,False,True]
@implicit op _[T](Ref[T,_,False]) : Ref[T,True,False]
@implicit op _[T](Ref[T,_,_]) : Ref[T,True,True]

// If an lvalue is not requested, this should revert the expression to
// an rvalue.
@implicit op _[T](Refable[T,_,_]) : T

// Allow non-const to be assigned
op `=`[T](Refable[T,False,_], T) : T

cttype Array[T:Type, C:Flag, V:Flag]

// An array degenerates to its head pointer.
@implicit op _[T,C:Flag,V:Flag](Array[T,C,V]) : Ref[T,C,V]

// Unlike the pointer, an actual stack array can be sized.
op sizeof[a : Array[_,_,_]]


package stdc.lang // Made up

import stdc.meta._

type _Bool
type char
type `unsigned char`
type `signed char`
type short
type `unsigned short`
type int
type `unsigned int`
type long
type `unsigned long`
type `long long`
type `unsigned long long`
@unit type void

type `size_t`

op `+`(char,int) : int
op `+`(long,int) : long

// How a struct is defined using type and ops
type div_t
@member op quot(div_t) : int
@member op quot(div_t, int) : void
@member op rem(div_t) : int
@member op rem(div_t, int) : void

op `/%`(int,int) : div_t // implement on div()
op `/%`(long,long) : ldiv_t // implement on ldiv()
op `/%`(`long long`,`long long`) : lldiv_t // implement on lldiv()

op strcpy(Ref[char],ConstRef[char]) : Ref[char]
// qsort as a typed generic!
// Note that the size-per-element parameter is omitted. The implementation
// will use a value derived from T for that parameter.
op qsort[T](Ref[T],size_t,FunctionRef[int,ConstRef[T],ConstRef[T]]) : void