THREADED CODE
Bell (1) and Dewar (2) have described the concepts of Threaded Code (also called Direct Threaded Code, or DTC), and an improvement called Indirect Threaded Code, or ITC. DTC was used to implement Fortran IV for the PDP-11, and ITC was used for a machine-independent version of Spitbol (a fast form of the string-processing language Snobol). Forth is a form of ITC, but different from the scheme presented in (2).
In DTC, a program consists of a list of addresses of routines. orc is fast; in fact, only a single PDP-11 instruction execution is required to link from one routine to the next (the instruction is 'JMP @(R)+', where 'R' ts one of the general registers). Overall, DTC was found to be about three percent slower than straight code using frequent subroutine jumps and returns, and to require 10-20 percent less memory. But one problem is that for each variable the compiler had to generate two short routines to push and pop that variable on the internal run-time stack.
In ITC, a program is a list of addresses of addresses of routines to he executed. As used in (2), each variable had pointers to push and pop routines, followed by its value. The major advantage over DTC is that the compiler does not have to generate separate push and pop routines for each variable; instead these were standard library routines. The compiler did not generate any routines, only addresses, so it was more machine independent. In practice, ITC was found to run faster than DTC despite the extra level of indirection. Tt also used less memory.
Forth is a form of ITC, with additional features.
Forth operations are lists of addresses pointing into dictionary entries. Each dictionary entry contains:
(A) Ascii operation name, length of the name, and precedence bit; these are used only at compile time and will not be discussed further.
(B) A link pointer to the previous dictionary entry. (This is used only at compile time.)
(C) A pointer called the code address,
which always points to executable machine
code.
(D) A parameter field, which can contain machine instructions, or Forth address lists, or variable values or pointers or other information depending on the variable type.
By the way, virtually everything in
Forth is part of a dictionary entry: the
compiler, the run-time routines, the
operating system, and your programs. In
most versions, only a few bytes of code are
outside of the dictionary.
The code address is crucial; this is the 'indirect' part of ITC. Every dictionary entry contains exactly one code address. If the dictionary entry is for a "primitive" (one of the 40 or so operations defined in machine language), the code address points two bytes beyond itself, to the parameter field, which contains the machine-language routine.
If the dictionary entry is for a Forth
higher-level operation (a colon definition),
the code address points to a special "code
routine" for colon definitions. This short
routine (e.g. 3 PDP-11 instructions) nests
one level of Forth execution, pushing the
current 'I' register (the Forth "instruction
counter") onto a return-address stack, then
beginning Forth execution of the
address-list in the new operation's
parameter field.
If the dictionary entry is for a variable, then the code address points to a code routine unique to that variable's type. The parameter field of a variable may contain the variable's value - or pointers if re-entrant, pure-code Forth is desired.
Results of Forth-type ITC include:
(A) Execution is fast, e.g. two PDP-11 instruction executions to transfer between primitives, about ten to nest and un-nest a higher-level definition. (Because of the pyramidal tree-structure of execution, the higher-level nesting is done less often.) Yet the language is fully interactive.
(B) Forth operation names (addresses)
are used exactly the same regardless of
whether they represent primitives or
higher-level definitions (nested to any
depth). Not even the compiler knows the
difference. In case run-speed optimization
is desired, critical higher-level operations
(such as inner loops) can be re-coded as
primitives, running at full machine speed,
and nothing else need by changed.
(C) Forth code is very compact. The language implements an entire operating system which can run stand-alone, including the Forth compiler, optional assembler, editor, and run-time system, in about 6k bytes. (Forth can also run as a task under
a conventional operating system, which sees