Originally published in Forth Dimensions XVIII/2, 30
I started a personal project two and a half years ago that had been on my mind for quite a long time: making Forth widespread in Korea. Postfix is natural to Korean people since the verb comes after the object in the Korean language. Also, Forth does not restrict a programmer to alphanumeric characters. A Korean Forth programmer can easily express his ideas in comfortable Korean words rather than being forced to think in English. As one might expect, there was an earlier effort toward a Korean Forth. Dr. Chong-Hong Pyun and Mr. Jin-Mook Park built a Korean version of fig-Forth for the Apple II computer in the mid-eighties. Long-time FD readers may remember Dr. Pyun's letter in Forth Dimensions X/6, 8. Unfortunately, the Korean computer community swiftly moved to the IBM PC while Dr. Pyun was writing articles about their work in popular programming and science magazines. It became somewhat obsolete before becoming widely known. Despite this and other efforts, Forth has remained virtually unknown to most Koreans. Two and a half years ago I decided to restart the effort and looked for a vehicle for the purpose. I found that there was no small ANS Forth system for the IBM PC, so I decided to build one. In the course of ANSifying eForth I replaced every line of the eForth source and felt that the result deserved its own name. I knew that there were Forth systems named bForth, cForth, eForth, gForth, iForth, Jforth and KForth. I picked h since it seemed not yet to be used by anyone and also because Han means Korean in the Korean language.
eForth, which was written by Mr. Bill Muench and Dr. C. H. Ting in 1990, seemed to be a good place to start. I studied the eForth source and Dr. Ting's article in Forth Dimensions XIII/1, 15 and set the following goals:
Most of them are adapted from eForth. I emphasize extensive error
handling since some well-known Forth systems cannot manage as simple a
situation as divide-by-zero. In hForth almost all ambiguous conditions
specified in the ANS Forth document issue THROW and are captured by
CATCH, either in a user-defined word or by the hForth system.
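The THROW and CATCH pattern described above can be sketched in Standard Forth as follows. This is only an illustration, not hForth source: checked/ and try-divide are hypothetical names, and the zero test is written out explicitly so that the sketch behaves the same on systems that do not trap division by zero themselves.

```forth
\ Hypothetical word: divide, but report division by zero through THROW.
\ -10 is the ANS Forth THROW code for "division by zero".
: checked/ ( n1 n2 -- quotient )
   DUP 0= IF -10 THROW THEN
   / ;

\ Hypothetical word: run checked/ under CATCH and print the outcome.
: try-divide ( n1 n2 -- )
   ['] checked/ CATCH        \ 0 on success, otherwise the THROW code
   ?DUP IF
      ." error " .           \ print the THROW code
      2DROP                  \ stack depth is restored, contents are not
   ELSE
      ." quotient " .
   THEN ;

12 4 try-divide CR           \ prints: quotient 3
12 0 try-divide CR           \ prints: error -10
```

A user-defined word that wraps risky code in CATCH this way keeps control after an error; an uncaught THROW would instead fall through to the system's own handler.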
The hForth ROM model is designed especially for a minimal development system for embedded applications, one that uses non-volatile RAM or a ROM emulator in place of ROM. The content of the ROM address space can be changed during the development phase and is copied later to real ROM for the production system. The hForth ROM model checks whether or not the ROM address space is alterable when it starts. New definitions go into the ROM address space if it is alterable. Otherwise they go into the RAM address space.
Alterable ROM address space         Unalterable ROM address space
===============================     ===============================
                                    name space of new definitions
                                    -------------------------------
RAM address space                   RAM address space
-------------------------------     -------------------------------
                                    data space / code space
data space                              of new definitions
===============================     ===============================
name space of old definitions       name space of old definitions
-------------------------------     -------------------------------
name space of new definitions
-------------------------------
ROM address space                   ROM address space
-------------------------------     -------------------------------
data space / code space
    of new definitions              data space
-------------------------------     -------------------------------
code space of old definitions       code space of old definitions
===============================     ===============================
Data space can be allocated either in ROM address space for tables of
constants or in RAM address space for arrays of variables. ROM and RAM,
recommended in the Appendix of the Standard document, are used to switch
data space between RAM and ROM address space. Name space may be excluded
from the final system if an application does not require the Forth text
interpreter. The 8086 hForth ROM model occupies a little more than 6 KB
of code space for all Core word set words and requires at least 1 KB of
RAM address space for stacks and system variables.
The assembly source is arranged so that more implementation-dependent
words come earlier. System-dependent words come first, CPU-dependent
words come after, then come all the other high level words. Colon
definitions of all high level words are given as comments in the assembly
source. One needs to redefine only system-dependent words to port the
hForth ROM model from the current MS-DOS machine to an 8086 single board
computer, without changing any CPU-dependent words. Standard words come
after essential non-Standard words in each system-dependent,
CPU-dependent, and portable part. All Standard Core word set words are
included to make hForth an ANS Forth system. High level Standard words in
the last part of the assembly source are not used for the implementation
of hForth and can be omitted to make a minimal system. Current 8086
hForth ROM model for MS-DOS has 59 kernel words: 13 system-dependent
words, 21 CPU-dependent non-Standard words and 25 CPU-dependent Standard
words. System-dependent words include input/output words and other words
for file input through keyboard redirection of MS-DOS. For five of the
kernel words, including (search-wordlist) and ALIGNED, CPU-dependent
definitions are used instead of high level definitions for faster
execution.
System initialization and input/output operations are performed
through the following execution vectors: 'boot, 'init-i/o, 'ekey?,
'ekey, 'emit?, 'emit, and 'prompt. Appropriate actions can be taken by
redirecting these execution vectors. 'init-i/o is executed in THROW and
when the system starts, while 'boot is executed only once when the
system starts. One has a better chance of not losing control by
restoring the i/o vectors through 'init-i/o whenever an exception
condition occurs. For example, a serial communication link need not be
broken by an accidental change of communication parameters. 'boot may be
redirected to an appropriate application word instead of the default
word in a finished application. The traditional 'ok<end-of-line>' prompt
(which is not really a prompt) may be replaced by redirecting 'prompt.
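Redirecting an execution vector can be sketched in Standard Forth as below. The vector 'start and the words default-start, app-start and start are hypothetical stand-ins modelled on the 'boot idea; they are not hForth words, and the sketch assumes only that a vector is a variable holding an xt.

```forth
\ Hypothetical vector and words, modelled on the 'boot idea above.
VARIABLE 'start                       \ holds the xt to run at start-up

: default-start ( -- n ) 1 ;          \ stand-in for the default word
: app-start     ( -- n ) 2 ;          \ stand-in for an application word
: start         ( -- n ) 'start @ EXECUTE ;   \ call through the vector

' default-start 'start !              \ initial setting
start .                               \ prints 1
' app-start 'start !                  \ redirect the vector
start .                               \ prints 2
```

Because every caller goes through the vector, storing a new xt changes the behavior everywhere at once without recompiling anything.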
Control structure matching is rigorously checked for different control-flow stack items. The control-flow stack is implemented on the data stack. Each control-flow stack item is represented by two data stack items, as shown below:
Control-flow stack item     Representation (parameter and type)
-----------------------     -------------------------------------
dest                        control-flow destination      0
orig                        control-flow origin           1
of-sys                      OF origin                     2
case-sys                    x (any value)                 3
do-sys                      ?DO origin                    DO destination
colon-sys                   xt of current definition     -1
hForth can easily detect the nonsense clause "BEGIN IF AGAIN THEN".
CS-ROLL and CS-PICK can be applied to a list of dests and origs only.
This can be verified by checking whether the ORed type of the items is
0 or 1. I cannot think of a control-structure mismatch that the current
hForth cannot catch.
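The ORed-type check can be sketched in Standard Forth as follows. The word orig/dest-only? is a hypothetical name and not taken from the hForth source; the sketch assumes the two-cell representation given in the table above, with the type code on top of its parameter.

```forth
\ Hypothetical check: are control-flow items 0..u all dest (0) or orig (1)?
\ Each item occupies two cells: parameter below, type code on top.
: orig/dest-only? ( x_u t_u ... x_0 t_0 u -- x_u t_u ... x_0 t_0 flag )
   0 SWAP 1+ 0 DO          \ accumulator starts at 0; visit items 0..u
      I 2* 1+ PICK OR      \ OR in the type code of item I
   LOOP
   2 U< ;                  \ true only if the ORed type is 0 or 1

111 0 222 1  1 orig/dest-only? .   \ a dest and an orig: prints -1
2DROP 2DROP
111 2 222 1  1 orig/dest-only? .   \ an of-sys in the list: prints 0
2DROP 2DROP
```

An unsigned comparison is used so that the colon-sys type code -1, which has every bit set, is rejected as well.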
The number of words grows substantially as a Forth system is extended.
Dictionary search can be time-consuming unless hashing or other means
are employed. Currently hForth uses no special search mechanism;
however, it maintains reasonable compilation speed by keeping the search
depth shallow in addition to using an optimized (search-wordlist).
Initially two wordlists are in the search order stack: FORTH-WORDLIST
and NONSTANDARD-WORDLIST. FORTH-WORDLIST contains all the Standard words
and NONSTANDARD-WORDLIST contains all the other words. Upon extending
hForth, optional Standard words will go in FORTH-WORDLIST and the
lower-level non-Standard words that implement them will be kept in
separate wordlists which are usually not in the search order stack. Only
a small number of non-Standard words to be used by a user will be added
to NONSTANDARD-WORDLIST.
The hForth package consists of three models: ROM, RAM and EXE. The
hForth RAM model is for a RAM-only system where name, code and data
spaces are all combined. The hForth EXE model is for a system in which
code space is completely separated from data space and an execution
token (xt) may not be a valid address in data space. The 8086 hForth EXE
model uses two full 64 KB memory segments: one for code space and the
other for name and data spaces. The EXE model might be extended for an
embedded system where name space resides in the host computer and code
and data spaces are in the target computer. A few kernel words are added
to the ROM model to derive the RAM and EXE models, and only several high
level words such as HERE and CREATE are redefined.
ROM and RAM models are probably too slow for many practical applications, as is the original eForth. However, the 8086 hForth EXE model is more competitive. High-level colon definitions of all frequently used words are replaced with 8086 assembly code definitions in the hForth EXE model. A comparison with other 8086 Forth systems can be found in Mr. Borasky's article "Forth in the HP100LX", Forth Dimensions XVII/4, 6.
hForth models are highly extensible. Optional word set words as well
as an assembler can be added on top of the basic hForth system. Complete
Tools, Search Order and Search Order Ext word set words and other
optional Standard words are defined in OPTIONAL.F, included in the 8086
hForth package. An 8086 Forth assembler is provided in ASM8086.F. Many
of the Core Ext word set words are provided in OPTIONAL.F and all the
other Core Ext words except obsolescent ones and [COMPILE] (for which
POSTPONE should be used) are provided in COREEXT.F. Complete Double and
Double Ext word set words are provided in DOUBLE.F. High level
definitions in these files should work in hForth for other CPUs. These
files are loaded into 8086 hForth for MS-DOS machines through the
keyboard redirection function of MS-DOS. Complete Block, Block Ext, File
and File Ext word set words are provided in MSDOS.F using MS-DOS file
handle functions. Other utilities are also included in the 8086 hForth
package. LOG.F captures screen output to an MS-DOS text file, which can
be edited to make Forth text source. DOSEXEC.F calls MS-DOS executables
from within the hForth system. A user can call a familiar text editor,
edit Forth text source, exit the editor, load the source and debug
without leaving the hForth environment. This process can be repeated
without saturating the address spaces if a MARKER word is defined at the
beginning of the Forth text source and called before reloading the
source.
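The edit and reload cycle with MARKER can be sketched as below. MARKER is the Standard Core Ext word; the file name MYAPP.F, the marker name ~myapp and the word square are hypothetical, and the three steps are shown in one listing only to keep the sketch self-contained.

```forth
\ --- top of a hypothetical source file, MYAPP.F ---
MARKER ~myapp                 \ remembers the dictionary state at this point
: square ( n -- n*n ) DUP * ;

\ --- later, at the keyboard, before loading the edited file again ---
~myapp                        \ forgets square and ~myapp itself

\ --- the reloaded file starts the same way ---
MARKER ~myapp
: square ( n -- n*n ) DUP * ;
3 square .                    \ prints 9
```

Since executing the marker also removes the marker itself, the reloaded file can define it again without piling up old copies of the definitions.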
I had a chance to look at Mr. Muench's eForth 2.4.2. The multitasker is the most elegant one among those that I have seen. It does task switching through only two high-level words. I immediately adapted it to hForth. Mr. Muench's multitasker is now included in P21Forth for MuP21 processor.
In a Forth multitasker each task has its own context: data stack,
return stack and its own variables (traditionally called user
variables). The contexts must be stored and restored properly when tasks
are suspended and resumed. In Mr. Muench's multitasker PAUSE saves the
current task's context and wake restores the next task's context. PAUSE
saves the return stack pointer on the data stack and the data stack
pointer into a user variable stackTop, then jumps to the next task's
status, whose address is held in the current task's user variable
follower. It is defined as:
: PAUSE rp@ sp@ stackTop ! follower @ >R ; COMPILE-ONLY
Advanced Forth users already know that '>R EXIT' causes a high level
jump on the traditional Forth virtual machine. Each task's user variable
status holds wake and is immediately followed by the user variable
follower. Initially hForth has only one task, SystemTask. Its user
variables status and follower hold:
SystemTask's
 status    follower
+------+  +-----------------------------------------+
| wake |  | absolute address of SystemTask's status |
+------+  +-----------------------------------------+
If FooTask is added, status and follower of the two tasks now hold:
SystemTask's
 status    follower
+------+  +-----------------------------------------+
| wake |  | absolute address of FooTask's status    |
+------+  +-----------------------------------------+

FooTask's
 status    follower
+------+  +-----------------------------------------+
| wake |  | absolute address of SystemTask's status |
+------+  +-----------------------------------------+
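The way the follower cells form a ring can be modelled in Standard Forth as below. This is a simplified sketch, not the hForth task-creation code: the words task-record, >follower and link-after are hypothetical, the status cell holds a placeholder 0 instead of the xt of wake, and the records are named with a Demo suffix to make clear they are not real tasks.

```forth
\ Hypothetical model: a record is a status cell followed by a follower cell.
: >follower ( status-addr -- follower-addr ) CELL+ ;

\ A new record starts as a ring of one: its follower points at its own status.
: task-record ( "name" -- )
   CREATE  HERE  0 ,  , ;      \ status = 0 (placeholder), follower = own status

\ Insert record NEW right after record OLD in the ring.
: link-after ( new old -- )
   2DUP >follower @  SWAP >follower !    \ new's follower := old's follower
   >follower ! ;                         \ old's follower := new

task-record SystemDemo
task-record FooDemo
FooDemo SystemDemo link-after

SystemDemo >follower @ FooDemo = .       \ prints -1
FooDemo >follower @ SystemDemo = .       \ prints -1
```

Following the follower cells from any record now visits every record and returns to the start, which is the round-robin order in which PAUSE hands over control.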
Effectively the current task's PAUSE jumps to the next task's wake. At
this point user variables and stacks are not switched yet. wake assigns
the return stack item (the address following status, i.e. the address
of follower) to the global variable userP, which is used to calculate
the absolute addresses of user variables. All user variables cluster in
front of follower. Now user variables are switched. Then wake restores
the data stack pointer stored in the user variable stackTop (now the
data stack is switched) and restores the return stack pointer saved on
top of the data stack (now the return stack is switched). wake is
defined as:
: wake R> userP ! stackTop @ sp! rp! ; COMPILE-ONLY
What is clever here is that one item on the return stack, left by
PAUSE and consumed by wake, is used to transfer control as well as
information for context switching. This multitasker is highly portable.
Not a line of multitasker code was touched when the hForth 8086 RAM
model was moved to the Z80 processor. This was also verified by Neal
Crook when porting hForth to the ARM processor. I believe that it should
be possible to port this multitasker to a subroutine-threaded or
native-code Forth by redefining these words in machine code.
I used this multitasker to update graphics screen and make cursor
blink in HIOMULTI.F. Console output is redirected to graphics
screen to display Korean and English characters for VGA and Hercules
Graphics Adapters. EMIT
fills characters into a buffer and a
background task displays them on graphics screen when hForth is waiting
for keyboard input. Scrolling text on graphics screen is as fast as on
text screen. I also used the multitasker for serial communication in
SIO.F. Main routine fetches characters from input buffer and
stores characters in output buffer while background task does actual
hardware control.
I applied all the best ideas and tricks I know to hForth. Most of them came from other people, while I added a few of my own. I believe that some of them are worth mentioning.
The hForth text interpreter uses a vector table to determine what to do with a parsed string after searching for it in the Forth dictionary. The dictionary search leaves on the data stack the string and 0 (for an unknown word), an xt and -1 (for a non-immediate word), or an xt and 1 (for an immediate word). The hForth text interpreter chooses the next action with the following code:
1+ 2* STATE @ 1+ + CELLS 'doWord + @ EXECUTE
The 'doWord table consists of six vectors.
                               compilation state    interpretation state
                               (STATE returns -1)   (STATE returns 0)
                               ------------------   --------------------
non-immediate word (TOS = -1)  optiCOMPILE,         EXECUTE
unknown word       (TOS = 0)   doubleAlso,          doubleAlso
immediate word     (TOS = 1)   EXECUTE              EXECUTE

TOS = top-of-stack
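The index arithmetic can be tried out with the Standard Forth sketch below. It is a model only: demo-doWord, choose and the h- handlers are hypothetical stand-ins that return a number instead of compiling or executing anything, and the state is passed as an argument rather than read from STATE.

```forth
\ Stand-in handlers: each just returns a code so the choice is visible.
: h-compile  ( -- n ) 1 ;   \ stands for optiCOMPILE,
: h-execute  ( -- n ) 2 ;   \ stands for EXECUTE
: h-number,  ( -- n ) 3 ;   \ stands for doubleAlso,
: h-number   ( -- n ) 4 ;   \ stands for doubleAlso

CREATE demo-doWord
   ' h-compile , ' h-execute ,     \ non-immediate word (flag -1)
   ' h-number, , ' h-number ,      \ unknown word       (flag  0)
   ' h-execute , ' h-execute ,     \ immediate word     (flag  1)

\ flag is the search result (-1, 0 or 1); state is -1 or 0
: choose ( flag state -- n )
   SWAP 1+ 2*  SWAP 1+ +  CELLS demo-doWord + @ EXECUTE ;

-1 -1 choose .    \ non-immediate word while compiling:  prints 1
 0  0 choose .    \ unknown word while interpreting:     prints 4
 1 -1 choose .    \ immediate word while compiling:      prints 2
```

The flag selects a pair of table entries and the state selects one entry within the pair, so no IF is needed in the interpreter loop.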
The behavior of the hForth text interpreter can be interactively
changed by replacing these vectors. For example, one can make the hForth
interpreter accept only single-cell numbers by replacing doubleAlso,
and doubleAlso with singleOnly, and singleOnly respectively.
optiCOMPILE, does the same thing as the Standard word COMPILE, except
that it removes one level of EXIT if possible. optiCOMPILE, does not
compile the null definition CHARS into the current definition. Also it
compiles 2* instead of CELLS if CELLS is defined as ": CELLS 2* ;".
Compiling words created by CONSTANT, VARIABLE, and CREATE as literal
values can increase execution speed, especially for native-code Forth
compilers. A solution is implemented in the hForth EXE model to provide
a special compilation action for default compilation semantics. Words
created by CONSTANT, VARIABLE, and CREATE have a special mark and an xt
for the special compilation action. The hForth compiler executes the xt
if it sees the mark. (POSTPONE must also find this special compilation
action and compile it.) A new data structure with a special compilation
action can be built with CREATE and only two non-Standard words:
implementation-dependent doCompiles> and implementation-independent
compiles>. doCompiles> verifies whether the last definition is ready
for a special compilation action, takes an xt on the data stack and
assigns it as the special compilation action of the last definition.
compiles> is defined as:
: compiles> ( xt -- )
POSTPONE LITERAL POSTPONE doCompiles> ; IMMEDIATE
For example, 2CONSTANT can be defined as:
:NONAME EXECUTE POSTPONE 2LITERAL ;
: 2CONSTANT
CREATE SWAP , , compiles> DOES> DUP @ SWAP CELL+ @ ;
It is the user's responsibility to match special compilation action with the default compilation semantics. I believe that this solution is general enough to be applied to other Forth systems.
'100 FORWARD' instead of prefix LOGO command 'FORWARD 100'. No
floating-point math is used at all. Integers are used to represent
angles in degrees rather than in radians and a look-up table is used to
evaluate trigonometric functions. Only a few words are defined in
machine code for line drawing and trigonometric function evaluation. The
turtle moves swiftly on a 286 machine. The Forth source and MS-DOS
executables, TURTLE.F, ETURTLE.EXE (using English commands) and
HTURTLE.EXE (using Korean commands), are included.
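Integer trigonometry with a look-up table can be sketched in Standard Forth as below. This is not the code in TURTLE.F: the table sin-table, the word sin10, the 10-degree step and the scale factor of 10000 are all assumptions chosen to keep the example short.

```forth
\ sin(0), sin(10), ... sin(90) scaled by 10000 (assumed step and scale)
CREATE sin-table
   0 , 1736 , 3420 , 5000 , 6428 , 7660 , 8660 , 9397 , 9848 , 10000 ,

\ Hypothetical word: sine of an angle in whole degrees, multiple of 10.
: sin10 ( degrees -- sin*10000 )
   360 MOD  DUP 0< IF 360 + THEN             \ normalise to 0..359
   DUP 180 < IF 1 ELSE 180 - -1 THEN  SWAP   \ ( sign angle ) angle in 0..179
   DUP 90 > IF 180 SWAP - THEN               \ reflect into 0..90
   10 / CELLS sin-table + @  * ;             \ look up and apply the sign

 30 sin10 .     \ prints 5000
150 sin10 .     \ prints 5000
210 sin10 .     \ prints -5000
```

Only the first quadrant is stored; the other three follow from symmetry, so the table stays small and no floating-point operation is needed.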
hForth is a small ANS Forth system based on eForth. It is especially designed for small embedded systems. The basic ROM and RAM models are designed for portability, but they can easily be optimized for a specific CPU to build a competitive system, as shown by the 8086 EXE model. hForth packages for 8086 and Z80 can be found at http://www.taygeta.com/forthcomp.html or ftp://ftp.taygeta.com/pub/Forth/Reviewed/. hForth has also been ported to the H8 processor by Mr. Bernie Mentink and to the ARM processor by Neal Crook. I hope that hForth will be useful to many people.