TL language
version 1.5

TL is a description language for creating new Tcl commands. Its compiler generates the Tcl command interpreters in ANSI C. It can also generate the documentation automatically.

1) What is TL

1.1) Introduction

1.2) Why do we need such a language

1.3) Tcl command interpreter structure

1.4) TL philosophy

2) TL definition

2.1) Syntax

2.1.1) Introduction

2.1.2) Reserved words

2.1.3) Formal syntax

2.1.4) Simplified syntax

2.2) Types

2.2.1) Intype types

2.2.2) Outtype types

2.2.3) Structures

2.2.4) Other types definition

3) Compiler

4) History and future plans

4.1) History

4.2) Future plans

5) Examples

1) What is TL

1.1) Introduction

Writing new Tcl commands can be a hard job. The interpreter needed for a command can be very simple for a typeless Tcl command, but becomes quite complicated as soon as the command uses even slightly complex types. Moreover, sub-commands and command options complicate such an interpreter even further.

To simplify the writing of new Tcl commands, a new language, named TL, was created.

1.2) Why do we need such a language

Once upon a time (October 1996) I had to write a new Tcl command to access a C library. The proposed syntax for the command "vme" is shown below:

vme open ?$name?
vme map $cid $address $size $am
vme close $cid
vme read $cid $offset ?D8/D16/D32?
vme write $cid $offset $data ?D8/D16/D32?

As the proposed syntax suggests, the command should have been quite simple to implement. However, I discovered this was not the case once I started the implementation. There were no big conceptual problems, but there was a lot of code to write. Many switch statements were needed to distinguish between the different cases, and error checking also took a lot of coding. Moreover, by the end of all this code writing, one could easily forget what the command was supposed to do. The biggest part of the code was the interpreter!

1.3) Tcl command interpreter structure

After writing a few more or less simple Tcl commands, I found that the interpreters were always more or less the same. The string constants were different every time, but the skeleton did not change.

Interpreters for Tcl commands are based on argument evaluation. Argument zero, i.e. the command name, is handled by the main Tcl interpreter, so the only thing to do is to declare it. All the other arguments must be handled by the command-specific interpreter itself.

Three different types of arguments can be mixed in a command:

sub-command specifiers
parameter arguments
option specifiers

Arguments of the first type, sub-command specifiers, normally appear immediately after the command name and define sub-commands exactly as the command name defines the complete command. When an argument of this type is found, the most obvious thing to do is to call an interpreter for the sub-command. In this way, sub-commands do not differ from complete commands from the interpreter's point of view.
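The dispatch on a sub-command specifier can be sketched in plain C as a table lookup. This is only an illustration of the idea, not the code produced by the TL compiler; the handler signature, the table and the stub handlers are hypothetical, and the real Tcl C API is deliberately left out.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical handler signature: argv[0] is the sub-command name. */
typedef int (*subcmd_fn)(int argc, const char *argv[]);

struct subcmd {
    const char *name;    /* sub-command specifier, e.g. "open" */
    subcmd_fn   handler; /* interpreter for that sub-command   */
};

/* Stub handlers standing in for real sub-command interpreters.
 * They just report how many arguments they received. */
static int vme_open(int argc, const char *argv[])  { (void)argv; return argc; }
static int vme_close(int argc, const char *argv[]) { (void)argv; return argc; }

static const struct subcmd vme_table[] = {
    { "open",  vme_open  },
    { "close", vme_close },
};

/* Look up argv[1] and hand the remaining arguments to its interpreter.
 * Returns the handler's result, or -1 for a missing or unknown
 * sub-command. */
int dispatch(int argc, const char *argv[])
{
    size_t i;

    if (argc < 2)
        return -1;
    for (i = 0; i < sizeof vme_table / sizeof vme_table[0]; i++) {
        if (strcmp(argv[1], vme_table[i].name) == 0)
            return vme_table[i].handler(argc - 1, argv + 1);
    }
    return -1;
}
```

The handler sees the sub-command name as its own argument zero, which is why a sub-command interpreter can be written exactly like a complete command interpreter.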

Arguments of the second type, parameter arguments, are the most frequent ones. Usually they are position dependent and have a well-known type. The main task of the interpreter, when an argument of this type is found, is to convert the string into the appropriate C type, signaling any error.
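As an illustration of such a conversion, here is a minimal sketch that turns a decimal string into a C int and signals any error through its return value. The function name and the error convention are assumptions made for this example; they are not taken from the code generated by TL.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Convert a decimal string to int. Returns 0 on success and stores the
 * value in *out; returns -1 if the string is empty, has trailing
 * characters, or does not fit in an int. */
int parse_int_arg(const char *s, int *out)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(s, &end, 10);
    if (end == s || *end != '\0')
        return -1;              /* not a number, or junk after it */
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return -1;              /* out of range for an int */
    *out = (int)v;
    return 0;
}
```

Every parameter type needs a conversion of this kind, which is a large part of the repetitive code mentioned above.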

Arguments of the third type, option specifiers, are the most difficult to handle when the interpreter is written by hand. They are position independent and can be present any number of times. Moreover, they are frequently followed by a parameter argument. The usual way to handle them is to assume that they come after the last argument of the previous types, and to evaluate the remaining arguments one by one, trying to identify the option each time.
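The option scan described above can be sketched as a simple loop over the trailing arguments. The option names "-verbose" and "-width" and the struct are hypothetical and chosen only for this example; a repeated option simply overwrites the earlier value, which is one possible policy among several.

```c
#include <stdlib.h>
#include <string.h>

struct opts {
    int verbose;  /* set by the hypothetical flag "-verbose"     */
    int width;    /* set by the hypothetical option "-width N"   */
};

/* Scan argv[first..argc-1] for options. Options may appear in any order
 * and may repeat; the last occurrence wins. Returns 0 on success, -1 on
 * an unknown option or a missing or invalid option parameter. */
int parse_options(int argc, const char *argv[], int first, struct opts *o)
{
    int i;

    for (i = first; i < argc; i++) {
        if (strcmp(argv[i], "-verbose") == 0) {
            o->verbose = 1;
        } else if (strcmp(argv[i], "-width") == 0) {
            char *end;
            long v;

            if (i + 1 >= argc)
                return -1;          /* parameter missing */
            v = strtol(argv[++i], &end, 10);
            if (end == argv[i] || *end != '\0')
                return -1;          /* parameter is not an integer */
            o->width = (int)v;
        } else {
            return -1;              /* unknown option */
        }
    }
    return 0;
}
```

The caller passes in `first` the index of the first argument after the sub-command specifiers and the parameter arguments.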

After all the arguments have been evaluated, the command (sub-command) specific code is executed. This code receives as input the correctly typed parameters supplied by the interpreter and produces as output another parameter of a given type. The output parameter is then converted into a string and returned as the result of the command.

It may look odd that a command (sub-command) always returns the same type of output. This is not the case for generic programs. But the result of a Tcl command is normally used as input for other Tcl commands, so its type must be known in advance or the result cannot be used.

After this short introduction it should be evident that Tcl interpreters are very modular. Moreover, they are also very uniform within a single argument type. These are ideal conditions for automatic code generation.

1.4) TL philosophy

The TL language is a description language; the programmer has only to describe the Tcl command he wants to implement, and the interpreter will be automatically created by the TL compiler. The only part that must be implemented in C is the command specific code. Obviously, not every possible Tcl command can be implemented using this language, but an equivalent one, using a slightly different syntax, can be.

TL is a strictly typed language. This derives mainly from the target language, C. But there are other reasons too: when the types are specified at definition level, the syntax of the final command is much clearer. Moreover, this is good programming style.

The other interesting property of TL is its simplicity. Excluding the types, which are very rich, there are essentially only seven (7) commands present in TL. Nevertheless, these few commands cover all the spectrum of Tcl commands.

Let's now see the types:

TL has several types. The basic distinction is between the so-called intypes and outtypes. As their names suggest, intypes are the types of the arguments, while outtypes are the types used to return a result. Two different kinds of types are necessary because some details that are obvious when an argument is decoded are not obvious when a C value must be encoded, and vice versa. An example makes the problem clear:

Imagine you have a boolean value. You can represent it in different ways: as a 0/1, a T/F or a True/False pair. From the input point of view, all these forms are correct, so a single intype, boolean, is enough to represent a boolean argument. From the output point of view things are not so simple: the three pairs are totally different, so three different outtypes must be present in the system to let the user decide which form the output will have.

Although the two kinds of types are semantically two different objects, syntactically they are very similar. Their logical structure is also very similar, so from now on they will be treated as if they were two variants of the same type.
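The asymmetry between decoding and encoding a boolean can be sketched in C as follows. The names are hypothetical and do not correspond to the TL type names, and accepting the input forms regardless of case is an assumption of this example.

```c
#include <ctype.h>

/* Compare two strings ignoring ASCII case. Returns 1 if equal. */
static int equal_nocase(const char *a, const char *b)
{
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == '\0' && *b == '\0';
}

/* One input type: accepts 0/1, T/F and True/False.
 * Returns 0 on success, -1 if the string is not a boolean. */
int bool_from_string(const char *s, int *out)
{
    if (equal_nocase(s, "1") || equal_nocase(s, "t") || equal_nocase(s, "true")) {
        *out = 1;
        return 0;
    }
    if (equal_nocase(s, "0") || equal_nocase(s, "f") || equal_nocase(s, "false")) {
        *out = 0;
        return 0;
    }
    return -1;
}

/* Three output types: the caller must choose the textual form. */
enum bool_style { BOOL_NUMERIC, BOOL_LETTER, BOOL_WORD };

const char *bool_to_string(int value, enum bool_style style)
{
    static const char *const names[3][2] = {
        { "0", "1" }, { "F", "T" }, { "False", "True" }
    };
    return names[style][value != 0];
}
```

The decoder needs no extra information, while the encoder cannot work without the `style` argument; this is the reason for keeping input and output types apart.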

In addition, the TL types can be divided into two categories:

simple types
compound types

Simple types basically represent numerical values (integer and floating point), booleans, pointers and strings. They are the simplest types one can use inside a TL program.

Compound types are a bit more complicated. They are record, case and list types whose elements are other types. These types are nearly impossible to handle by hand, but an automatically generated interpreter handles them easily.

2) TL definition

2.1) Syntax

2.1.1) Introduction

TL syntax is case insensitive, but identifier names are stored exactly as declared. Case-sensitive internal storage is required because the target language, C, is case sensitive.

2.1.2) Reserved words

As in most languages, TL parsing is based on reserved words. This is the complete list:

array
binary
bool
boolean
breal
call
case
cbool
cboolean
char
command
comment
compact
const
crange
data
else
end
freal
hex
insensitive
int
integer
intreal
intype
irange
irrange
iset
list
nat
natural
nbool
nboolean
octal
of
option
optparam
outtype
param
pointer
program
real
record
rrange
select
sensitive
set
step
string
unixdata
user

2.1.3) Formal syntax

Click here to see the formal syntax using YACC syntax.

2.1.4) Simplified syntax

This description is not formally perfect nor is it exhaustive. It is intended for fast learning only.

Click here to see the simplified syntax.

2.2) Types

2.2.1) Intype types

Click here to see the intype types.

2.2.2) Outtype types

Click here to see the outtype types.

2.2.3) Structures

These structures can be used as intype or outtype types. All the elements of a structure must be of the same kind, either all intypes or all outtypes, never a mix of the two.

Click here to see the structures.

2.2.4) Other types definition

Binary numbers

Binary numbers are strings of 0s and 1s. They are written in base 2.

Octal numbers

Octal numbers are strings of digits from 0 to 7. They are written in base 8.

Hex numbers

Hex numbers are strings of digits from 0 to 9 and letters from a to f (case insensitive), where a==10,...,f==15. They are written in base 16.

Compact numbers

Compact numbers are strings of the following chars (case sensitive):

Char	Code	Char	Code	Char	Code	Char	Code
`0`	0	`g`	16	`w`	32	`M`	48
`1`	1	`h`	17	`x`	33	`N`	49
`2`	2	`i`	18	`y`	34	`O`	50
`3`	3	`j`	19	`z`	35	`P`	51
`4`	4	`k`	20	`A`	36	`Q`	52
`5`	5	`l`	21	`B`	37	`R`	53
`6`	6	`m`	22	`C`	38	`S`	54
`7`	7	`n`	23	`D`	39	`T`	55
`8`	8	`o`	24	`E`	40	`U`	56
`9`	9	`p`	25	`F`	41	`V`	57
`a`	10	`q`	26	`G`	42	`W`	58
`b`	11	`r`	27	`H`	43	`X`	59
`c`	12	`s`	28	`I`	44	`Y`	60
`d`	13	`t`	29	`J`	45	`Z`	61
`e`	14	`u`	30	`K`	46	`_`	62
`f`	15	`v`	31	`L`	47	`^`	63

They are written in base 64.
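Decoding a compact number follows directly from the table: the code of each char is its position in the digit string below. This is a minimal sketch; the function name is hypothetical, and it assumes that the most significant digit is written first, as in the other bases.

```c
#include <limits.h>
#include <string.h>

/* Digits in code order: the index in this string is the code in the table. */
static const char compact_digits[] =
    "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ_^";

/* Decode a compact (base 64) string, most significant digit first
 * (assumed). Returns 0 on success, -1 on an empty string, an invalid
 * character, or overflow of unsigned long. */
int compact_to_ulong(const char *s, unsigned long *out)
{
    unsigned long value = 0;

    if (*s == '\0')
        return -1;
    for (; *s != '\0'; s++) {
        const char *p = strchr(compact_digits, *s);
        unsigned long digit;

        if (p == NULL)
            return -1;                      /* not a compact digit */
        digit = (unsigned long)(p - compact_digits);
        if (value > (ULONG_MAX - digit) / 64)
            return -1;                      /* would overflow */
        value = value * 64 + digit;
    }
    *out = value;
    return 0;
}
```

Under this assumption the string "10" decodes to 64, while "a" and "A" decode to 10 and 36 respectively, which shows why compact numbers must be case sensitive.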

Floating point in binary format

Floating point numbers in binary format are strings of 0s and 1s plus an optional exponent. The binary part represents the fractional part. The leading 1 is omitted (see IEEE floating point numbers).

Attention:
0.0 cannot be represented using floating point in binary format.
TL uses 0 for 0.0 and 0f... otherwise.

3) Compiler

The current compiler compiles TL programs into C code for use in Tcl interpreters.

The syntax is:

tlc [options] file.tl [docdir] [outdir] [indir]

where options are:
-n    :     do not create the html file
-n-   :     create the html file (default)
-l    :     set links to the userNAME_c.html generated by c2html
-l-   :     do not link (default)
-h, -?, ? : this help

It creates four files (three when the html file is disabled with -n):

tclNAME.h
tclNAME.c
userNAME.h
NAME_tl.html

where NAME is the name of the TL program.

tclNAME.h contains the header of the Tcl command interpreter, while tclNAME.c contains the command interpreter itself.

userNAME.h contains the type declarations and the function prototypes. The functions themselves must be supplied by the user/programmer.

NAME_tl.html is the HTML rendering of the TL file. It is placed in the docdir. If the -l option is used, be sure to generate the corresponding userNAME_c.html using c2html.

4) History and future plans

4.1) History

Version 1.0

Definition of the TL language.

First working compiler.

Version 1.1

Constant treatment added.

Some bugs fixed.

Version 1.5

Generation of HTML from TL implemented.

A new variant of the constant definition added.

Comments have been extended.

4.2) Future plans

In the near future I want to implement multi-name options.

I have no other ideas for the moment.
If you find a useful way to extend TL, send me an e-mail. Maybe I will implement it.

5) Examples

Example 1: Conversion tool

This conversion tool simplifies the conversion between the different types supported by TL.

The input files are:

The generated code files are:

The generated documentation is:

conv_tl.html

The following command has been used:

tlc -l conv.tl $DOC

Example 2: The ROCK library

This is an example of the documentation of a real interface to a C library. Since it uses a real library, some links are not available outside LNF.