Dr. Robert A. van Engelen, Copyright 2021
- Forth500
- Quick Forth tutorial
- Stack effects
- Stack manipulation
- Integer constants
- Arithmetic
- Numeric output
- String constants
- String operations
- Keyboard input
- Printing
- Serial port IO
- Screen and cursor operations
- Graphics
- Sound
- The return stack
- Defining new words
- Control flow
- Compile-time immediate words
- Source input and parsing
- Files
- Exceptions
- Environmental queries
- Dictionary structure
- Vocabulary structure
- Examples
- Troubleshooting
- Further reading
- Links to additional resources
Forth500 is a Standard Forth system for the SHARP PC-E500(S) pocket computer. This pocket computer sports a 2.304MHz 8-bit CPU with a 20-bit address space of up to 1MB. This pocket computer includes 256KB system ROM and 32KB to 256KB RAM. The RAM card slot offers additional storage up to 256KB RAM. Forth500 is small enough to fit in an unexpanded 32KB machine.
Forth is unlike any other mainstream programming language. It has an unconventional syntax and unique program execution characteristics. Because of this, programs run very efficiently and do not require much memory to run. On the other hand, this makes learning Forth a bit more challenging for beginners and experienced programmers alike.
It is perhaps best to think of Forth as a language positioned between assembly coding and C in terms of power and complexity. Like assembly and C, you have a lot of freedom and power at your fingertips. But this comes with a healthy dose of responsibility to do things right: errors can lead to crashes.
Fortunately, Forth500 protects its data and floating point stacks against under- and overflows. It also guards against dictionary overflow. Accidental infinite loops can be terminated by pressing BREAK to abort execution, which is also checked in loops and when calling secondaries (Forth subroutines).
If necessary, the PC-E500(S) recovers from a crash with a soft reset. You can
recover your work if you previously saved the Forth500 program state to the E: or F:
RAM disk with the SAVE
word (see the "SAVE" example to define):
SAVE F:FORTH500.BIN
Then load it back into memory later from BASIC in RUN mode:
> LOADM "F:FORTH500.BIN"
> CALL &Bx000
where Forth500 starts at &Bx000
(&B0000
on an expanded machine with 128KB
memory or more and &B9000
on an unexpanded 32KB machine). Note that a hard
reset requires allocating memory again for Forth500 before you can load it back
into memory, see the Forth500 installation instructions.
This section is intentionally kept simple by avoiding technical jargon and unnecessary excess. Familiarity with the concepts of stacks and dictionaries is assumed. Experience with C makes it easier to follow the use of addresses (pointers) to integer values, strings and other data in Forth.
A Forth system is a dictionary of words. A word can be any sequence of characters but excludes space, tab, newline, and other control characters. Words can be entered simply by typing them in as long as they are separated by spacing. Words are defined for subroutines, for named constants and for global variables and data. Some words may execute at compile time to compile the body of subroutine definitions and to implement control flow such as conditional branches and loops. As such, the syntax blends compile-time and runtime behaviors that are distinguishable by naming and naming conventions of words.
You can enter Forth words at the Forth500 interactive prompt. The following special keys can be used to enter a line of Forth code of up to 255 characters long:
key | comment |
---|---|
INS | switch to insertion mode or back to replace mode |
DEL | delete the character under the cursor |
BS | backspace |
ENTER | execute the line of input |
LEFT/RIGHT | before typing input "replays" the last line of input to edit |
CURSOR KEYS | move cursor up/down/left/right on the line |
C/CE | clears the line |
The BUSY annunciator lights up during the execution of a Forth word or program. The RUN annunciator lights up (BUSY turns off) to accept your input to execute. The PRO annunciator lights up to accept your input to compile, for example when entering a colon definition.
To exit Forth500 and return to BASIC, enter bye
. This saves the Forth500
state in memory. To reenter Forth500 from BASIC, CALL &Bxx00
again where
xx
is the high-order address of Forth500.
Forth500 is case insensitive. Words may be typed in upper or lower case or in mixed case. In this manual all built-in words are shown in UPPER CASE. User-defined words in the examples are shown in lower case.
To list the words stored in the Forth dictionary, type (↲ is ENTER):
WORDS ↲
Hit any key to continue. Hit C/CE or BREAK to stop. BREAK generally terminates the execution of a Forth500 subroutine associated with a word.
To list words that fully and partially match a given name, type:
WORDS NAME ↲
For example, WORDS DUP
lists all words with names that contain the part DUP
(the WORDS
search is case sensitive). Press C/CE or BREAK to stop listing
more words.
Words like DUP
operate on the stack. DUP
duplicates the top value,
generally called TOS "Top Of Stack". All computations in Forth occur on the
stack. Words may take values from the stack, by popping them, and push return
values on the stack. Besides words, you can also enter literal integer values
to push them onto the stack:
TRUE 123 DUP .S ↲
-1 123 123 OK[3]
where TRUE
pushes -1, 123
pushes 123, DUP
duplicates the TOS and .S
shows the stack values. It helps to use .S
to see what's currently on the
stack when debugging. OK[3]
indicates that currently there are three values
on the stack. This may also show as OK[3 1]
when there is one floating point
value on the floating point stack, which is indicated by the second number.
You can spread the code over multiple lines. It makes no difference whether you type all words on one line and hit ENTER at the end, or hit ENTER after each of several shorter lines.
When an error occurs, an error message is shown and a blinking cursor appears:
1 0 / 2 + . ↲
1 0 / <Error -10
Press the left or right cursor key to edit the last line. For a description of the standard Forth error codes, see exceptions.
To clear the stacks, type CLEAR
:
CLEAR ↲
OK[0]
Like traditional Forth systems, Forth500 integers are 16-bit (single) and 32-bit (double) signed or unsigned integers. Decimal, hexadecimal and binary number systems are supported:
input | TOS | comment |
---|---|---|
TRUE |
-1 | Boolean true is -1 |
FALSE |
0 | Boolean false is 0 |
123 |
123 | decimal number (if the current base is DECIMAL ) |
-1 |
-1 | |
0 |
0 | |
$FF |
255 | hexadecimal number |
#-12 |
-12 | decimal number (regardless of the current base) |
%1000 |
8 | binary number |
'A |
65 | ASCII code of letter A |
Words for arithmetic like +
pop the TOS and 2OS "Second on Stack" to return
the sum. The .
("dot") word can then be used to print the TOS:
1 2 + . ↲
3 OK[0]
Two single stack integers can be combined to form a 32-bit signed or unsigned
integer. A double integer number is pushed (as two single integers) when the
number is written with a .
anywhere among the digits, but we prefer the .
at the end for clarity:
123. 456. D+ D. ↲
579 OK[0]
The D+
word adds two double integers and the D.
word prints a signed double
integer and pops it from the stack. Words that operate on two integers or
doubles are typically identified by Dxxx
and 2xxx
. See also double
arithmetic and numeric output.
The use of .
to mark double integers is unfortunate, because the number is
not a floating point number! The .
is traditional in Forth and still part of
Standard Forth.
Floating point numbers require an exponent E
or D
for double precision,
even when the exponent is zero, as for example in 1.23e+0
(the E
and D
are case insensitive). Floating point values are stored on a separate floating
point stack and have their own words for floating point arithmetic:
12.3e0 45.6e0 F+ FDUP F. FS. ↲
57.9 5.7900000000000000000E1 OK[0]
where F.
displays the value in fixed-point notation without exponent and
FS.
displays the value in scientific notation with 20 digits precision. See
also floating point stack manipulation,
floating point arithmetic and numeric
output.
Words that execute subroutines are defined with a :
("colon") and end with
a ;
("semicolon"):
: hello ." Hello, World!" CR ; ↲
This defines the word hello
that displays the obligatory "Hello, World!"
message. Separating the word from its definition with tab spacing makes word
definitions easier to identify. After all, a Forth program
is stored in a dictionary of words with their definitions. A clean
visual presentation helps a lot when perusing Forth programs.
The ."
word parses a sequence of characters until "
. These characters are
displayed on screen. Note that ."
is a normal word and must therefore be
followed by a space. The CR
word starts a new line by printing a carriage
return and newline.
Let's try it out:
hello ↲
Hello, World! OK[0]
Some words like ."
and ;
are compile-time only, which means that they can
only be used in colon definitions. Two other compile-time words are DO
and
LOOP
for loops:
: greetings 10 0 DO hello LOOP ; ↲
greetings ↲
This displays 10 lines with "Hello, World!". Let's add a word that takes a
number as an argument, then displays that many hello
lines:
: hellos 0 DO hello LOOP ; ↲
2 hellos ↲
Hello, World!
Hello, World! OK[0]
Something interesting has happened here, that is typical Forth: hellos
is the
same as greetings
but without the 10
loop limit. We just specify the loop
limit on the stack as an argument to hellos
. Therefore, we can refactor
greetings
to use hellos
by simply replacing 0 DO hello LOOP
by hellos
:
: greetings 10 hellos ; ↲
It is good practice to define words with short definitions! It makes your programs much easier to understand, maintain and reuse. Because words operate on the stack, pretty much any sequence of words can be moved from a definition into a new word to replace the sequence with a single descriptive word. This keeps definitions short and understandable. Also testing words by executing them from the prompt is a good way to verify their definitions to find potential problems before running a new program.
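As a small sketch of this factoring style, the words `square` and `cube` below are hypothetical examples, not part of Forth500. Each definition is short enough to test from the prompt on its own. Text between parentheses and text after `\` is a comment.

```forth
: square ( n -- n*n )   DUP * ;          \ multiply the TOS by itself
: cube   ( n -- n*n*n ) DUP square * ;   \ reuse square instead of repeating DUP *

3 square .   \ displays 9
3 cube .     \ displays 27
```

Because `cube` is built from `square`, testing `square` first narrows down where a mistake can hide.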
What if we want to change the message of hellos
? Forth allows you to
redefine words at any time, but this does not change the behavior of
existing words that were compiled with the old definition:
: hello ." Hola, Mundo!" CR ; ↲
2 hellos ↲
Hello, World!
Hello, World! OK[0]
Only new words that we add after this will use our new hello
definition.
Basically, the Forth dictionary is searched from the most recently defined
word to the oldest defined word. A definition of a word is no longer
searchable when a word with the same name is defined. There is a way around
this by deferring words, as we will see later.
A definition can be deleted, together with everything defined after it, by forgetting it:
FORGET hello ↲
Because we defined two hello
words, we should forget hello
twice to delete
the new and the old hello
. Forgetting means that everything after the
specified word is deleted from the dictionary, including our greetings
and
hellos
definitions. Another way to delete definitions is to define a
MARKER
with ANEW
for a section of code, see markers. Executing
a marker deletes it and everything after it.
To create a configurable hello
word that displays alternative messages, we
can use branching based on the value of a variable:
VARIABLE spanish ↲
VARIABLE
parses the next word in the input and adds the word to the
dictionary as a variable, in this case spanish
.
We rewrite our hello
as follows:
: hello ↲
spanish @ IF ↲
." Hola, Mundo!" ↲
ELSE ↲
." Hello, World!" ↲
THEN CR ; ↲
If you are new to Forth this may look strange with the IF
and THEN
out of
place. A THEN
closes the IF
(some Forths allow both ENDIF
and THEN
).
By comparison to C, spanish @ IF x ELSE y THEN
is similar to *spanish ? x : y
.
The variable spanish
places the address of its value on the stack. The value
is fetched (dereferenced) with the word @
("fetch"). If the value is nonzero
(true), then the statements after IF
are executed. Otherwise, the statements
after ELSE
are executed.
To set the spanish
variable to true:
TRUE spanish ! ↲
where the word !
("store") stores the 2OS value to the memory cell addressed
by the TOS, which is the variable spanish
in this example.
Observe this stack order carefully! Otherwise you will end up writing data to
arbitrary memory locations. The !
reminds you of this potential danger.
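The stack order for storing can be sketched with a throwaway variable. The name `tally` is hypothetical, and `+!` is the standard word that adds the 2OS to the cell addressed by the TOS. We assume Forth500 provides it as a Standard Forth system.

```forth
VARIABLE tally     \ hypothetical variable for this sketch
5 tally !          \ value first (2OS), address last (TOS)
tally @ .          \ displays 5
1 tally +!         \ add 1 to the cell addressed by tally
tally @ .          \ displays 6
```

Writing `tally 5 !` instead would attempt to store the address of `tally` at memory address 5, which is exactly the kind of stray write the warning above is about.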
For convenience, the words ON
and OFF
can be used:
spanish OFF ↲
spanish ? ↲
0 OK[0]
spanish ON ↲
spanish ? ↲
-1 OK[0]
The ?
word used in the example above is a shorthand for @ .
to display the
value of a variable:
: ? @ . ;
Like the built-in ?
word, a large portion of the Forth system is defined in
Forth itself. Also ON
and OFF
are defined in Forth:
: ON TRUE SWAP ! ;
: OFF FALSE SWAP ! ;
where SWAP
swaps the TOS and 2OS.
Instead of nesting multiple IF
-ELSE
-THEN
branches to cover additional
languages, we should use CASE
-OF
-ENDOF
-ENDCASE
and enumerate the
languages as follows:
0 CONSTANT #english ↲
1 CONSTANT #spanish ↲
2 CONSTANT #french ↲
VARIABLE language #english language ! ↲
: hello ↲
language @ CASE ↲
#english OF ." Hello, World!" ENDOF ↲
#spanish OF ." Hola, Mundo!" ENDOF ↲
#french OF ." Salut Mondial!" ENDOF ↲
." Unknown language" ↲
ENDCASE ↲
CR ; ↲
hello ↲
Hello, World!
Note that the default case is not really necessary, but can be inserted between
the last ENDOF
and ENDCASE
. In the default arm of a CASE
, the CASE
value is the TOS, which can be inspected, but should not be dropped before
ENDCASE
.
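The following sketch shows a default arm that computes a result while leaving the `CASE` value on top for `ENDCASE` to drop. The word `weight` and its numbers are hypothetical and serve only to illustrate the stack handling.

```forth
: weight ( n -- n' )
  CASE
    0 OF 10 ENDOF        \ a matching OF drops the case value
    1 OF 20 ENDOF
    DUP 100 * SWAP       \ default arm: put the result under the case value
  ENDCASE ;              \ ENDCASE drops the case value left on top

0 weight .   \ displays 10
1 weight .   \ displays 20
3 weight .   \ displays 300
```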
Unlike a VARIABLE
, a CONSTANT
word is initialized with the specified value
on the stack. When the word is executed it pushes its value on the stack. By
contrast, a word defined as a variable pushes the address of its value on the
stack, after which the value can be fetched with @
. A new value can be
stored with !
.
So-called Forth value words offer the advantage of implicit fetches like
constants. To illustrate value words, let's replace the VARIABLE language
with VALUE language
initialized to #english
:
#english VALUE language ↲
To change the value we use the TO
word followed by the name of the value:
#spanish TO language ↲
Now with language
as a VALUE
, hello
should be changed by removing the
@
after language
:
: hello ↲
language CASE ↲
...
Forth constants, variables and values contain data. Data words are added
to the dictionary with CREATE
followed by words to allocate space for the
data. The word created returns the address pointing to its data:
CREATE data ↲
data . ↲
<address> OK[0]
In this example data
has no data allocated or stored, it just returns the
address of the location where the data would reside. Because addresses of
words in the dictionary are unique, we can use this mechanism to create
"symbolic" enumerations to replace constants (and save some space):
CREATE english ↲
CREATE spanish ↲
CREATE french ↲
english TO language ↲
Working with CREATE
and DOES>
to create data types and data structures is a
more advanced topic. See CREATE and DOES> for details.
See also example enums for a more elaborate example.
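A created word becomes useful once data is stored after it. The sketch below assumes the standard words `,` ("comma", which stores a cell in the dictionary) and `CELLS` (which converts a cell count to a byte count); neither is introduced in this tutorial. The names `primes` and `prime@` are hypothetical.

```forth
CREATE primes  2 , 3 , 5 , 7 ,              \ four cells of data after the name

: prime@ ( index -- n )  CELLS primes + @ ; \ fetch the cell at the given index

0 prime@ .   \ displays 2
3 prime@ .   \ displays 7
```

Note that `prime@` does no bounds checking, so an index outside 0 to 3 fetches whatever follows in the dictionary.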
Earlier we saw the DO
-LOOP
. The loop iterates until its internal loop
counter when incremented equals the final value. For example, this loop
executes hello
10 times:
: greetings 10 0 DO hello LOOP ; ↲
Actually, DO
cannot be recommended because the loop body is always executed
at least once. When the initial value is the same as the final value we end
up executing the loop 65536 times! (Because integers wrap around.) We use ?DO
instead of DO
to avoid this problem:
: hellos 0 ?DO hello LOOP ; ↲
0 hellos ↲
OK[0]
This example has zero loop iterations and never executes the loop body hello
.
When we add more languages to hello
, the hello
definition code grows
substantially by the addition of lots of OF
-ENDOF
arms. We should keep
Forth definitions short and concise. To do so, we may want to reconsider
hello
and change it to a "deferred word". A deferred word can be assigned
another word, in this case to display a message in the selected language:
DEFER hello ↲
: hellos 0 ?DO hello LOOP ; ↲
The deferred hello
word is assigned hello-es
with IS
:
: hello-en ." Hello, World!" CR ; ↲
: hello-es ." Hola, Mundo!" CR ; ↲
: hello-fr ." Salut Mondial!" CR ; ↲
' hello-es IS hello ↲
2 hellos ↲
Hola, Mundo!
Hola, Mundo! OK[0]
The '
"tick" parses the next word and returns its "execution token" on the
stack, which is assigned by IS
to hello
. An execution token is the address
of the start of the code associated with a word. Basically, a deferred word is
a variable that holds the execution token of another word. When the deferred
hello
executes, it takes this execution token and executes it with EXECUTE
.
Think of execution tokens as function pointers in C and as call addresses in
assembly. You can pass them around and store them in variables and tables to
be invoked later with EXECUTE
.
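As a sketch of storing execution tokens in a table, the hypothetical words below build a two-entry table and invoke an entry by index. It assumes the standard words `,` and `CELLS`, and that an execution token fits in one cell.

```forth
: double ( n -- 2n ) 2 * ;
: triple ( n -- 3n ) 3 * ;

CREATE ops  ' double , ' triple ,     \ table holding two execution tokens

: apply ( n index -- n' )  CELLS ops + @ EXECUTE ;

5 0 apply .   \ displays 10
5 1 apply .   \ displays 15
```

This is the same idea as an array of function pointers in C: `apply` fetches the token at the index and hands it to `EXECUTE`.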
We saw the use of a ?DO
-LOOP
earlier. To change the step size or direction
of the loop, we use +LOOP
. The word I
returns the loop counter value:
: evens 10 0 ?DO I . 2 +LOOP ; ↲
evens ↲
0 2 4 6 8 OK[0]
The +LOOP
terminates if the updated counter equals or crosses the limit. The
increment may be negative to count down.
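A step that does not divide the range evenly shows the "crosses the limit" case. In this sketch the hypothetical word `thirds` stops when the counter would go from 9 to 12, past the limit of 10, and `thirds-sum` adds up the same counter values on the stack.

```forth
: thirds ( -- )  10 0 ?DO I . 3 +LOOP ;
thirds          \ displays 0 3 6 9

: thirds-sum ( -- n )  0  10 0 ?DO I + 3 +LOOP ;
thirds-sum .    \ displays 18
```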
A BEGIN
-WHILE
-REPEAT
is a logically-controlled loop with which we can do
the same as follows by pushing a 0
to use as a counter on top of the stack:
: evens ↲
0 ↲
BEGIN DUP 10 < WHILE ↲
DUP . ↲
2+ ↲
REPEAT ↲
DROP ; ↲
evens ↲
0 2 4 6 8 OK[0]
DUP 10 <
is used for the WHILE
test to check the TOS counter value is less
than 10. After the loop terminates, DROP
removes the TOS counter.
A BEGIN
-UNTIL
loop is similar, but executes the loop body at least once:
: evens ↲
0 ↲
BEGIN ↲
DUP . ↲
2+ ↲
DUP 10 < INVERT UNTIL ↲
DROP ; ↲
evens ↲
0 2 4 6 8 OK[0]
Forth has no built-in >=
, so we use < INVERT
. If you really want >=
,
then define:
: >= < INVERT ; ↲
Until now we haven't commented our code. Forth offers two words to comment
code, ( a comment goes here )
and \ a comment until the end of the line
:
: evens ( -- ) ↲
0 \ push counter 0 ↲
BEGIN ↲
DUP . \ display counter value ↲
2+ \ increment counter ↲
DUP 10 < INVERT UNTIL \ until counter >= 10 ↲
DROP ; \ drop counter ↲
Word definitions are typically annotated with their stack effects. In this
case there is no effect ( -- )
, see the next section on how this notation is
used in practice.
Forth source code is loaded from a file with INCLUDE
or with INCLUDED
:
INCLUDE FLOATEXT.FTH ↲
S" FLOATEXT.FTH" INCLUDED ↲
where S" FLOATEXT.FTH"
specifies a string constant with the file name. A
drive letter such as F: can be specified to load from a specific drive, which
becomes the current drive (the default drive is E:). Forth500 source files
must have LF or CRLF line endings.
To compile a Forth source code file transmitted to the PC-E500(S) via the serial interface:
INCLUDE COM: ↲
To make sure that a file is included at most once, use REQUIRE
or REQUIRED
instead of INCLUDE
and INCLUDED
, respectively:
REQUIRE FLOATEXT.FTH ↲
S" FLOATEXT.FTH" REQUIRED ↲
The name of the file will show up in the dictionary to record its presence, but with a space appended to the name to distinguish it from executable words.
To compile a Forth source code file transmitted by a CE-126P or CE-124 cassette interface to the PC-E500:
CLOAD ↲
The wav file transmitted from the host computer, such as a PC, should be
created with PocketTools from the
source file (e.g. FLOATEXT.FTH
) as follows:
$ bin2wav --pc=E500 --type=bin -dINV FLOATEXT.FTH
$ afplay FLOATEXT.wav
The afplay
command plays the wav file. Use maximum volume to play the wav
file or close to maximum to avoid distortion. If -dINV
does not transfer the
file, then try -dMAX
. Again, Forth500 source files must have LF or CRLF line
endings.
To list files on the current drive:
FILES ↲
You can also specify a drive with a glob pattern to list matching FILES
:
FILES F:*.FTH ↲
This lists all Forth .FTH source files on the F: drive and makes the F: drive the current drive. Forth source files commonly use extension FTH or FS. File names and extensions are case sensitive on the PC-E500(S). Drive names are not.
To send FILES
to a printer CE-126P, first check if a printer is connected
with PRINTER .
which displays the number of characters per line supported by
the printer or zero if no printer is connected. Then execute STDL TO TTY FILES.
Redirecting TTY
to devices connected to the COM: serial port requires opening
the port first, then setting TTY
to the port fileid. Don't forget to
close the port fileid afterwards.
Forth500 includes a small text editor TED
in a separate Forth file as an
addition. With the text editor you can write scripts and source files and read
them immediately to execute the commands and definitions:
TEDI MYWORK.FTH ↲
↲ \ start editing (press enter)
.( TEDI is great!) ↲ \ a line of Forth (press enter to save)
[CCE] \ end editing and read MYWORK.FTH
TEDI is great!
See TED.TXT for the TED manual.
This ends our introduction of the basics of Forth.
To make it easier to document words that manipulate the stack, we use the following Forth naming conventions to identify the types of values on the stack:
value | represents |
---|---|
flag | the Boolean value true (nonzero, typically -1) or false (zero) |
true | true flag (-1) |
false | false flag (0) |
n | a signed single integer -32768 to 32767 |
+n | a non-negative single integer 0 to 32767 |
u | an unsigned single integer 0 to 65535 |
x | an unspecified single integer |
d | a signed double integer -2147483648 to 2147483647 |
+d | a non-negative double integer 0 to 2147483647 |
ud | an unsigned double integer 0 to 4294967295 |
xd | an unspecified double integer (two unspecified single integers) |
r | a single or double precision floating point value on the floating point stack |
addr | a 16-bit address |
c-addr | a 16-bit address pointing to 8-bit character(s), usually a constant string |
f-addr | a 16-bit address pointing to a floating point value |
s-addr | a 16-bit address pointing to a file status structure (Forth500) |
fileid | a nonzero single integer file identifier |
ior | a single integer nonzero system-specific error code |
fam | a file access mode |
nt | a name token, address of the name of a word in the dictionary |
xt | an execution token, address of code of a word in the dictionary |
A single integer or address unit is also called a "cell" with a size of two bytes in Forth500.
With these naming conventions for stack values, words are described by their stack effect. Values on the left of the -- are on the stack before the word is executed and the values on the right of the -- are on the stack after the word is executed:
OVER
( x1 x2 -- x1 x2 x1 )
Words that create other words in the dictionary parse the name of a new word.
For example, CREATE
parses a word at "compile time". This word leaves the
address of its body (a data field) on the stack at "run time". This is denoted
by two effects separated by a semi-colon ;
, the first effect occurs at
compile time and the second effect occurs at run time:
CREATE
( "name" -- ; -- addr )
A quoted part such as "name" is parsed from the input and not taken from the stack.
Return stack effects are prefixed with R:
. For example:
>R
( x -- ; R: -- x )
R>
( R: x -- ; -- x )
The word >R
("to r") moves x from the stack to the so-called "return
stack". The word R>
("r from") moves x from the return stack to the stack.
The return stack is used to keep return addresses of words executed and to store temporary values. When using the return stack to store values temporarily in your code, it is very important to keep the return stack balanced. This prevents words from returning to an incorrect return address and crashing the system.
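A balanced use of the return stack can be sketched as follows. The word `third` is a hypothetical example that copies the third stack value to the top. Every `>R` is matched by an `R>` before the definition ends.

```forth
: third ( x1 x2 x3 -- x1 x2 x3 x1 )
  >R          \ park x3 on the return stack     ( x1 x2 ; R: x3 )
  OVER        \ copy x1 to the top              ( x1 x2 x1 )
  R>          \ bring x3 back, R: is balanced   ( x1 x2 x1 x3 )
  SWAP ;      \ restore the order               ( x1 x2 x3 x1 )

1 2 3 third . . . .   \ displays 1 3 2 1
```

The values are displayed from the TOS downward, so the copied `1` appears first.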
The following words manipulate values on the stack:
word | stack effect ( before -- after ) | comment |
---|---|---|
DUP |
( x -- x x ) | duplicate TOS |
?DUP |
( x -- x x ) or ( 0 -- 0 ) | duplicate TOS if nonzero |
DROP |
( x -- ) | drop the TOS |
SWAP |
( x1 x2 -- x2 x1 ) | swap TOS with 2OS |
OVER |
( x1 x2 -- x1 x2 x1 ) | duplicate 2OS to the top |
NIP |
( x1 x2 -- x2 ) | delete 2OS |
TUCK |
( x1 x2 -- x2 x1 x2 ) | tuck a copy of TOS under 2OS |
ROT |
( x1 x2 x3 -- x2 x3 x1 ) | rotate stack, 3OS goes to TOS |
-ROT |
( x1 x2 x3 -- x3 x1 x2 ) | rotate stack, TOS goes to 3OS |
Note that NIP
is the same as SWAP DROP
, TUCK
is the same as DUP -ROT
,
and -ROT
("not rot") is the same as ROT ROT
.
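These equivalences can be checked at the prompt with stand-in definitions. The `my-` names below are hypothetical. `my-tuck` uses `SWAP OVER`, another sequence with the same effect as `TUCK`, so that the sketch needs only the most basic words.

```forth
: my-nip   ( x1 x2 -- x2 )           SWAP DROP ;
: my-tuck  ( x1 x2 -- x2 x1 x2 )     SWAP OVER ;
: my-unrot ( x1 x2 x3 -- x3 x1 x2 )  ROT ROT ;

1 2 my-nip .            \ displays 2
1 2 my-tuck . . .       \ displays 2 1 2
1 2 3 my-unrot . . .    \ displays 2 1 3
```

Each `.` displays and removes the TOS, so the values appear in top-to-bottom order.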
There are also two words to reach deeper into the stack:
word | stack effect ( before -- after ) | comment |
---|---|---|
PICK |
( xk ... x0 k -- xk ... x0 xk ) | duplicate k'th value down to the top |
ROLL |
( xk ... x0 k -- xk-1 ... x0 xk ) | rotate the k'th value down to the top |
Note that 0 PICK
is the same as DUP
, 1 PICK
is the same as OVER
, 1 ROLL
is the same as SWAP
, 2 ROLL
is the same as ROT
and 0 ROLL
does
nothing.
PICK
and ROLL
take k mod 128 cells max as a precaution.
Note that some legacy Forth systems define PICK
and ROLL
arguments starting
at 1 instead of at 0. Forth500 PICK
and ROLL
follow the standard. Please
take note when porting legacy Forth programs to Forth500.
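A short sketch of the zero-based indexing: with three values on the stack, index 2 refers to the deepest one. Values are displayed from the TOS downward.

```forth
10 20 30 2 PICK . . . .   \ displays 10 30 20 10  (10 was copied to the top)
10 20 30 2 ROLL . . .     \ displays 10 30 20     (10 was moved to the top)
```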
The following words operate on two cells on the stack at once (a pair of single integers or one double integer):
word | stack effect ( before -- after ) |
---|---|
2DUP |
( x1 x2 -- x1 x2 x1 x2 ) |
2DROP |
( x1 x2 -- ) |
2SWAP |
( x1 x2 x3 x4 -- x3 x4 x1 x2 ) |
2OVER |
( x1 x2 x3 x4 -- x1 x2 x3 x4 x1 x2 ) |
2NIP |
( x1 x2 x3 x4 -- x3 x4 ) |
2TUCK |
( x1 x2 x3 x4 -- x3 x4 x1 x2 x3 x4 ) |
2ROT |
( x1 x2 x3 x4 x5 x6 -- x3 x4 x5 x6 x1 x2 ) |
Other words related to the stack:
word | stack effect | comment |
---|---|---|
CLEAR |
( ... -- ; F: ... -- ) | clears the stack and the floating point stack |
DEPTH |
( -- n ) | returns the current depth of the stack |
.S |
( -- ) | displays the stack contents |
N.S |
( n -- ) | displays the top n values on the stack |
SP@ |
( -- addr ) | returns the stack pointer, points to the TOS |
SP! |
( addr -- ) | assigns the stack pointer (danger!) |
DEPTH
returns the current depth of the stack, which is the number of cells on
the stack not counting the depth value returned on the stack. The maximum
stack depth in Forth500 is 128 cells or 64 double cells.
The following words manipulate values on the floating point stack:
word | stack effect ( before -- after ) | comment |
---|---|---|
FDUP |
( F: r -- r r ) | duplicate FP TOS |
FDROP |
( F: r -- ) | drop the FP TOS |
FSWAP |
( F: r1 r2 -- r2 r1 ) | swap FP TOS with FP 2OS |
FOVER |
( F: r1 r2 -- r1 r2 r1 ) | duplicate FP 2OS to the top |
FROT |
( F: r1 r2 r3 -- r2 r3 r1 ) | rotate stack, FP 3OS goes to FP TOS |
CLEAR |
( ... -- ; F: ... -- ) | clears the stack and the floating point stack |
FDEPTH |
( -- n ) | returns the current depth of the floating point stack |
FP@ |
( -- addr ) | returns the floating point stack pointer, points to the FP TOS |
FP! |
( addr -- ) | assigns the floating point stack pointer (danger!) |
FDEPTH
returns the current depth of the floating point stack, which is the
number of floats on the stack. The maximum floating point stack depth in
Forth500 is 120 bytes or 10 floating point values.
Integer values when parsed from the input are directly pushed on the stack.
The current BASE
is used for conversion:
word | comment |
---|---|
BASE |
a VARIABLE holding the current base, valid values range from 2 to 36 |
DECIMAL |
set BASE to 10 |
HEX |
set BASE to 16 |
#d ...d |
a decimal single integer, ignores current BASE |
#-d ...d |
a negative decimal single integer, ignores current BASE |
$h ...h |
a hex single integer, ignores current BASE |
$-h ...h |
a negative hex single integer, ignores current BASE |
%b ...b |
a binary single integer, ignores current BASE |
%-b ...b |
a negative binary single integer, ignores current BASE |
Valid single integer constant values range from -32768 to 65535. The unsigned integer range 32768 to 65535 has the same bit patterns as the signed integer range -32768 to -1.
The signedness of an integer only applies to the way the integer value is used
by a word. For example, -1 U.
displays 65535, because U.
takes an unsigned
integer to display and -1 is the same as 65535 (two's complement).
Double integer values have a .
(dot) placed anywhere among the digits. For
example, -1.
is a double integer pushed on the stack, occupying the top pair of
consecutive cells on the stack, i.e. the TOS and 2OS with TOS holding the
16 high-order bits and 2OS holding the 16 low-order bits. The .
(dot) is
typically placed at the end of the digits.
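The cell layout of a double can be inspected by displaying its two cells separately. This sketch assumes the standard word `S>D`, which converts a single integer to a double by sign extension. Values are displayed from the TOS downward, so the high-order cell appears first.

```forth
1. . .        \ displays 0 1    (TOS = high-order cell 0, 2OS = low-order cell 1)
-1. . .       \ displays -1 -1  (both cells have all bits set)
5 S>D D.      \ displays 5
-5 S>D . .    \ displays -1 -5  (high-order cell -1, low-order cell -5)
```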
A word defined in the dictionary with a name that matches a number will be evaluated instead of the number. Therefore, it makes sense to avoid defining words with numeric names.
When the current BASE
is not decimal, such as HEX
, words in the dictionary
may match instead of the integer constant specified. For example, F.
is a
valid double integer value in HEX
but the F.
word will output a float
instead.
The ASCII value of a single character is pushed on the stack with 'char'
.
The quoted form 'char
can be used interactively and to compile a literal:
word | comment |
---|---|
'A' |
ASCII code 65 of letter A |
'B |
ASCII code 66 of letter B, the closing quote may be omitted |
CHAR C |
ASCII code 67 of letter C, use this word interactively |
[CHAR] D |
ASCII code 68 of letter D, use this word to compile a literal |
The quoted form is essentially the same as CHAR
or [CHAR]
depending on the
current STATE
.
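The difference between `CHAR` and `[CHAR]` can be sketched as follows. The words `star` and `stars` are hypothetical, and `EMIT` is the standard word that displays the character with the given code.

```forth
CHAR A .                             \ displays 65, interpreted at the prompt
: star  ( -- )    [CHAR] * EMIT ;    \ [CHAR] compiles the code of * as a literal
: stars ( n -- )  0 ?DO star LOOP ;
3 stars                              \ displays ***
```

Inside a colon definition `CHAR` would parse at run time instead, which is why `[CHAR]` is used there.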
The following words define common constants regardless of the current BASE
:
word | comment |
---|---|
BL |
the space character, ASCII 32 |
FALSE |
Boolean false, same as 0 |
TRUE |
Boolean true, same as -1 |
Floating point values are parsed in base 10. Floating point values are not
parsed if the BASE
is anything other than DECIMAL
. Exception -13 will be
thrown instead, because the unrecognized word is not found in the dictionary.
Floating point values when parsed from the input are directly pushed on the
floating point stack. Floating point values must include an E
or D
exponent. An E
exponent marks a single precision floating point value (see
note below). A D
exponent marks a double precision floating point value with
up to 20 significant digits. The E
and D
exponent ranges from -99 to +99.
word | comment |
---|---|
3.141592654e+0 |
single precision pi |
3.1415926535897932385d+0 |
double precision pi |
3.1415926535897932385e+0 |
double precision pi (exceeds 10 digits) |
9.9999999999999999999d99 |
maximum double precision value |
-1.234e-10 |
single precision -0.0000000001234 |
1e0 |
single precision 1 |
0e |
single precision 0 |
0d |
double precision 0 |
Note that exponent e+0
may be abbreviated to e0
or just e
. A floating
point value may not start with a decimal point .
. The formal syntax is:
<float> := <significand> [ <exponent> ]
<significand> := [ <sign> ] <digit>+ [ . <digit>* ]
<exponent> := {E|e|D|d} [ <sign> ] [ <digit> ] [ <digit> ]
<sign> := {+|-}
<digit> := {0|1|2|3|4|5|6|7|8|9}
If the number of significant digits exceeds 10, then the floating point value
is stored in double precision format even when marked with an e
. Digits are
considered significant after removing all leading zeros, including zeros to the
right of the decimal point. For example, 0.001234567890e
is a single
precision value because it has 10 significant digits (this differs from the
PC-E500(S) BASIC where zero digits after the decimal point are considered
significant) and 0.0012345678900e
is a double precision value because it has
11 significant digits.
Forth500 floating point operations are performed on both single and double precision floating point values. A double precision value is returned if one of the operands is a double precision value.
The 0e+0 word is predefined. This word takes only 2 bytes of code space instead of the 14 bytes needed to store a floating point literal in code (2 bytes of code plus 12 bytes for the float). To save memory, you can also use S>F and D>F to push small whole numbers on the floating point stack, which require only 6 bytes and 8 bytes of code space, respectively.
A floating point value requires 12 bytes of storage for the sign, exponent and the binary-coded decimal mantissa with 10 or 20 digits:
(sign)(exp)(BCD0)(BCD1)(BCD2)(BCD3)(BCD4)(BCD5)(BCD6)(BCD7)(BCD8)(BCD9)
- the (sign) byte bit 0 is set to mark double precision values
- the (sign) byte bit 3 is set to mark negative values
- the (exp) byte is a 2s-complement integer in the range [-99,99]
- a single precision floating point value uses (BCD0) to (BCD4) and may use (BCD5) and (BCD6) to store so-called guard digits. A double precision floating point value uses (BCD0) to (BCD9) to store 20 significant digits
To view the internal format of a floating point value on the stack:
FP@ 12 DUMP ↲
All digits are stored, including the 2 to 4 guard digits of a single precision value, and passed on to subsequent floating point operations.
The maximum depth of the floating point stack in Forth500 is 120 bytes to hold up to 10 floating point values.
The following words perform single integer (one cell) arithmetic operations. Words involving division and modulo may throw exception -10 "Division by zero":
word | stack effect ( before -- after ) |
---|---|
+ | ( x1 x2 -- (x1+x2) ) |
- | ( x1 x2 -- (x1-x2) ) |
* | ( n1 n2 -- (n1*n2) ) |
/ | ( n1 n2 -- (n1/n2) ) |
MOD | ( n1 n2 -- (n1%n2) ) |
/MOD | ( n1 n2 -- (n1%n2) (n1/n2) ) |
*/ | ( n1 n2 n3 -- (n1*n2/n3) ) |
*/MOD | ( n1 n2 n3 -- (n1*n2%n3) (n1*n2/n3) ) |
MAX | ( n1 n2 -- n1 ) if n1>n2 otherwise ( n1 n2 -- n2 ) |
UMAX | ( u1 u2 -- u1 ) if u1>u2 unsigned otherwise ( u1 u2 -- u2 ) |
MIN | ( n1 n2 -- n1 ) if n1<n2 otherwise ( n1 n2 -- n2 ) |
UMIN | ( u1 u2 -- u1 ) if u1<u2 unsigned otherwise ( u1 u2 -- u2 ) |
AND | ( x1 x2 -- (x1&x2) ) |
OR | ( x1 x2 -- (x1\|x2) ) |
XOR | ( x1 x2 -- (x1^x2) ) |
ABS | ( n -- +n ) |
NEGATE | ( n -- (-n) ) |
INVERT | ( x -- (~x) ) |
1+ | ( x -- (x+1) ) |
2+ | ( x -- (x+2) ) |
1- | ( x -- (x-1) ) |
2- | ( x -- (x-2) ) |
2* | ( n -- (n*2) ) |
2/ | ( n -- (n/2) ) |
LSHIFT | ( u +n -- (u<<+n) ) |
RSHIFT | ( u +n -- (u>>+n) ) |
The after stack effects in the table indicate the result computed with operations % (mod), & (bitwise and), | (bitwise or), ^ (bitwise xor), ~ (bitwise not/invert), << (bitshift left) and >> (bitshift right).
Integer overflow and underflow do not throw exceptions. In case of integer addition and subtraction, values wrap around. For all other integer operations, overflow and underflow produce undefined values.
The MOD, /MOD, and */MOD words return a remainder on the stack. The quotient q and remainder r satisfy q = trunc(a / b) such that a = b * q + r, where trunc rounds towards zero. MOD is symmetric, i.e. 10 7 MOD and -10 7 MOD return 3 and -3, respectively. See also FM/MOD and SM/REM in mixed arithmetic.
The */ and */MOD words produce an intermediate double integer product to avoid intermediate overflow. Therefore, */ is not a shorthand for the two words * /, which would truncate an overflowing product to a single integer. For example, radius 355 * 113 / with 355/113 to approximate pi overflows when radius exceeds 92, but radius 355 113 */ gives the correct result.
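The difference between */ and * / can be sketched with a small helper. The name pi-scale is hypothetical and not part of Forth500; it only serves to illustrate that the intermediate product is kept as a double integer:

```forth
\ pi-scale is a hypothetical helper, not a built-in Forth500 word.
\ */ keeps the intermediate product n*355 as a double integer,
\ so the product may exceed the single cell range.
: pi-scale ( n -- n*355/113 )  355 113 */ ;

100 pi-scale .   \ displays 314
```

Here the intermediate product 35500 does not fit in a signed 16-bit cell, so the same computation written as 100 355 * 113 / would not be reliable.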
To perform unsigned *, / and MOD operations, the words UM* and UM/MOD can be used, see mixed arithmetic. To perform unsigned 2* and 2/, 1 LSHIFT and 1 RSHIFT can be used.
A logical NOT word is not available. Either INVERT should be used to invert bits or 0= should be used, which returns TRUE for 0 and FALSE otherwise.
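If a logical NOT is wanted anyway, it can be defined in terms of 0=. The sketch below uses the hypothetical name not and contrasts it with INVERT, which flips bits rather than truth values:

```forth
\ not is a hypothetical name chosen for illustration.
: not ( x -- flag )  0= ;

0 not .      \ displays -1 (TRUE)
5 not .      \ displays 0 (FALSE)
5 INVERT .   \ displays -6: all bits flipped, still nonzero and thus "true"
```

The last line shows why INVERT is only a safe logical negation when applied to well-formed flags (0 or -1).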
The following words perform double integer arithmetic operations. Words involving division and modulo may throw exception -10 "Division by zero":
word | stack effect ( before -- after ) |
---|---|
D+ | ( d1 d2 -- (d1+d2) ) |
D- | ( d1 d2 -- (d1-d2) ) |
D* | ( d1 d2 -- (d1*d2) ) |
D/ | ( d1 d2 -- (d1/d2) ) |
DMOD | ( d1 d2 -- (d1%d2) ) |
D/MOD | ( d1 d2 -- (d1%d2) (d1/d2) ) |
DMAX | ( d1 d2 -- d1 ) if d1>d2 otherwise ( d1 d2 -- d2 ) |
DMIN | ( d1 d2 -- d1 ) if d1<d2 otherwise ( d1 d2 -- d2 ) |
DABS | ( d -- +d ) |
DNEGATE | ( d -- (-d) ) |
D2* | ( d -- (d*2) ) |
D2/ | ( d -- (d/2) ) |
S>D | ( n -- d ) |
D>S | ( d -- n ) |
The S>D word converts a signed single to a double integer. To convert an unsigned single to an unsigned double integer, push a 0 on the stack.
The D>S word converts a signed double to a signed single integer, throwing exception -11 "Result out of range" if the double value cannot be converted. A double in the range -32768 to 65535 can be converted to a single.
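As a small illustration of the two ways to widen a single, assuming the 16-bit cells of Forth500, the same cell value can be extended as a signed or as an unsigned number:

```forth
\ Assumes 16-bit cells, as on Forth500.
-1 S>D D.   \ signed extension: displays -1
-1 0 D.     \ the same cell read as unsigned: displays 65535
```

Pushing 0 as the high cell treats the single as unsigned, whereas S>D copies the sign into the high cell.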
Integer overflow and underflow do not throw exceptions. In case of integer addition and subtraction, values simply wrap around. For all other integer operations, overflow and underflow produce undefined values.
The following words cover mixed single and double integer arithmetic operations. Words involving division may throw exception -10 "Division by zero".
word | stack effect ( before -- after ) | comment |
---|---|---|
M+ | ( d n -- (d+n) ) | add signed single to signed double |
M- | ( d n -- (d-n) ) | subtract signed single from signed double |
M* | ( n1 n2 -- (n1*n2) ) | multiply signed singles to return signed double |
UM* | ( u1 u2 -- (u1*u2) ) | multiply unsigned singles to return unsigned double |
UMD* | ( ud u -- (ud*u) ) | multiply unsigned double and single to return unsigned double |
M*/ | ( d n +n -- (d*n/+n) ) | multiply signed double with signed single then divide by positive single to return signed double |
UM/MOD | ( ud u -- (ud%u) (ud/u) ) | single remainder and single quotient of unsigned double dividend and unsigned single divisor |
FM/MOD | ( d n -- (d%n) (d/n) ) | floored single remainder and single quotient of signed double dividend and signed single divisor |
SM/REM | ( d n -- (d%n) (d/n) ) | symmetric single remainder and single quotient of signed double dividend and signed single divisor |
The UM/MOD, FM/MOD, and SM/REM words return a remainder on the stack. In all cases, the quotient q and remainder r satisfy a = b * q + r.
In case of FM/MOD, the quotient is a single signed integer rounded towards negative infinity, q = floor(a / b). For example, -10. 7 FM/MOD returns remainder 4 and quotient -2.
In case of SM/REM, the quotient is a single signed integer rounded towards zero (hence symmetric), q = trunc(a / b). For example, -10. 7 SM/REM returns remainder -3 and quotient -1. This behavior is identical to /MOD, but /MOD behavior is not standardized and may differ on other Forth systems.
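The two division flavors can be compared side by side. The word floored-/mod below is a hypothetical name, shown as one possible way to get floored single-cell division that does not depend on how a particular system implements /MOD:

```forth
-10. 7 FM/MOD . .   \ floored: displays -2 4 (quotient, then remainder)
-10. 7 SM/REM . .   \ symmetric: displays -1 -3

\ floored-/mod is a hypothetical portable alternative to /MOD:
\ sign-extend the dividend to a double, then divide with FM/MOD.
: floored-/mod ( n1 n2 -- rem quot )  >R S>D R> FM/MOD ;

-10 7 floored-/mod . .   \ displays -2 4
```

Replacing FM/MOD by SM/REM in the definition gives the symmetric variant instead.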
Fixed point offers an alternative to floating point if the range of values manipulated can be fixed to a few digits after the decimal point. Scaling of values must be applied when appropriate.
A classic example is to compute the circumference of a circle using a rational approximation of pi and a fixed point radius with a 2 digit fraction.
: pi* 355 113 M*/ ; ↲
12.00 2VALUE radius ↲
radius 2. D* pi* D. ↲
7539 OK[0]
This computes 12×2×pi = 75.39. Note that the placement of . in 12.00 has no meaning at all; it is just suggestive of a decimal value with a 2 digit fraction.
Multiplying the fixed point value radius by the double integer 2. does not require scaling of the result. Addition and subtraction with D+ and D- do not require scaling either. However, multiplying and dividing two fixed point numbers requires scaling the result, for example with a new word:
: *.00 D* 100. D/ ; ↲
radius radius *.00 pi* D. ↲
45238 OK[0]
There is a risk of overflowing the intermediate product when the multiplicands are large. If this is a potential hazard, it can be avoided by scaling the multiplicands instead of the result, with a small loss in precision of the result:
: 10./ 10. D/ ; ↲
: *.00 10./ 2SWAP 10./ D* ; ↲
Likewise, fixed point division requires scaling. One way to do this is by scaling the divisor down by 10 and the dividend up by 10 before dividing:
: /.00 10./ 2SWAP 10. D* 2SWAP D/ ;
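Fixed point results such as 7539 are easier to read when displayed with the decimal point in place. The following sketch defines a hypothetical word .00 (not part of Forth500) using standard pictured numeric output words:

```forth
\ .00 is a hypothetical word: display signed double d as a
\ fixed point number with a 2 digit fraction.
: .00 ( d -- )
  TUCK DABS              \ keep the high cell (it carries the sign), use |d|
  <# # # [CHAR] . HOLD   \ two fraction digits, then the decimal point
  #S ROT SIGN #>         \ integer digits, then a minus sign if d was negative
  TYPE SPACE ;

7539. .00    \ displays 75.39
-250. .00    \ displays -2.50
```

With this word, the circumference example can end in .00 instead of D. to display 75.39.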
The following words cover floating point arithmetic operations. The words accept single and double precision floating point numbers on the floating point stack (the F: stack effects):
word | stack effect ( before -- after ) |
---|---|
F+ | ( F: r1 r2 -- r1+r2 ) |
F- | ( F: r1 r2 -- r1-r2 ) |
F* | ( F: r1 r2 -- r1*r2 ) |
F/ | ( F: r1 r2 -- r1/r2 ) |
F** | ( F: r1 r2 -- r1**r2 ) |
FMAX | ( F: r1 r2 -- r1 ) if r1>r2 otherwise ( F: r1 r2 -- r2 ) |
FMIN | ( F: r1 r2 -- r1 ) if r1<r2 otherwise ( F: r1 r2 -- r2 ) |
FABS | ( F: r -- +r ) |
FSIGN | ( F: r -- 0e+0 ) if r=0 or ( F: r -- 1e+0 ) if r>0 otherwise ( F: r -- -1e+0 ) |
FNEGATE | ( F: r -- -r ) |
FLOOR | ( F: r -- ⌊r⌋ ) round towards negative infinity |
FROUND | ( F: r -- ⌊r+5e-1⌋ ) |
FTRUNC | ( F: r -- [r] ) round towards zero |
FSIN | ( F: r -- sin(r) ) |
FCOS | ( F: r -- cos(r) ) |
FTAN | ( F: r -- tan(r) ) |
FASIN | ( F: r -- arcsin(r) ) |
FACOS | ( F: r -- arccos(r) ) |
FATAN | ( F: r -- arctan(r) ) |
FLOG | ( F: r -- log10(r) ) |
FLN | ( F: r -- log(r) ) |
FEXP | ( F: r -- e**r ) |
FSQRT | ( F: r -- √ r ) |
FDEG | ( F: r1 -- r2 ) where r1 is in dd.mmss format and r2 is degrees |
FDMS | ( F: r1 -- r2 ) where r1 is degrees and r2 is in dd.mmss format |
FRAND | ( F: r1 -- r2 ) where r2 is a pseudo-random number, see below |
F>D | ( F: r -- ; -- d ) or ( F: r -- ; -- ud ) convert r to d or ud |
D>F | ( d -- ; F: -- r ) convert d to r |
F>S | ( F: r -- ; -- n ) or ( F: r -- ; -- u ) convert r to n or u |
S>F | ( n -- ; F: -- r ) convert n to r |
If any of the operands of an arithmetic operation are double precision, then the result of the operation is a double precision floating point value. For example, 0d F+ promotes a single precision value to a double precision value by adding a double precision zero.
Floating point operations in single precision are performed with 12 to 14 digits (10 + 2 to 4 guard digits). All digits are stored and passed on to subsequent floating point operations. The guard digits are not removed.
F** returns r1 to the power r2.
FLOOR returns r rounded towards negative infinity, for example -1.5e FLOOR returns -2e+0. FTRUNC returns r truncated towards zero, for example -1.5e+0 FTRUNC returns -1e+0.
FDMS returns the degrees (or hours) dd with the minutes mm and seconds ss as a fraction. FDEG performs the opposite. For example, 36.09055e0 FDMS returns 36.052598 or 36° 5' 25.98". The FDEG and FDMS words are also useful for time conversions.
FRAND returns a pseudo-random number in the open range r2 ∈ (0,1) if r1<1e and in the closed range r2 ∈ [1,r1] otherwise. A double precision pseudo-random number is returned when r1 is a double precision floating point value.
F>D throws an exception when the floating point value r is too large for an unsigned 32 bit integer, i.e. when |r|>4294967295. Likewise, F>S throws an exception when r is too large for an unsigned 16 bit integer, i.e. when |r|>65535.
Trigonometric functions are performed in the current angular unit (DEG, RAD or GRAD). You can use the BASIC interpreter to set the desired angular unit or define a word to scale degrees and radians to the current unit before applying a trigonometric function:
3.141592654e FCONSTANT PI
: ?>dbl FP@ FLOAT+ C@ 1 AND FP@ C@ OR FP@ C! ;
: deg> 90e F/ 0e+0 ?>dbl FACOS F* ;
: rad> FDUP F+ PI F/ 0e+0 ?>dbl FACOS F* ;
: >deg 0e+0 ?>dbl FACOS F/ 90e F* ;
: >rad 0e+0 ?>dbl FACOS FDUP F+ F/ PI F* ;
For example, 30e deg> FSIN ("30 degree from sine") and PI 6e F/ rad> FSIN both return 0.5e+0 on the floating point stack regardless of the current angular unit. Likewise 0.5E FASIN >deg ("half arcsine to degree") returns 30.0e+0 on the floating point stack regardless of the current angular unit.
The ?>dbl word promotes the FP TOS to a double if the FP 2OS is a double. This word allows angular unit conversion words to support both single and double precision floating point values. See floating point constants for the internal floating point format.
The following additional floating point extended word set definitions are not built into Forth500 but are defined in FLOATEXT.FTH. These words apply to both single and double floating point values. For these definitions we do not need ?>DBL:
: FSINCOS FDUP FSIN FSWAP FCOS ;
: FALOG 10e FSWAP F** ;
: FCOSH FEXP FDUP 1e FSWAP F/ F+ 2e F/ ;
: FSINH FEXP FDUP 1e FSWAP F/ F- 2e F/ ;
: FTANH FDUP F+ FEXP FDUP 1e F- FSWAP 1e F+ F/ ;
: FACOSH FDUP FDUP F* 1e F- FSQRT F+ FLN ;
: FASINH FDUP FDUP F* 1e F+ FSQRT F+ FLN ;
: FATANH FDUP 1e F+ FSWAP 1e FSWAP F- F/ FLN 2e F/ ;
: FATAN2 ( F: r1 r2 -- r3 )
FDUP F0> IF
F/ FATAN
ELSE FSWAP FDUP F0<> IF
FDUP FSIGN FASIN FROT FROT F/ FATAN F-
ELSE
FDROP F0< S>F FACOS THEN THEN ;
: F~ ( F: r1 r2 r3 -- ; -- flag )
FDUP F0= IF FDROP F= EXIT THEN
FDUP F0< IF FROT FROT FOVER FOVER F- FABS FROT FABS FROT FABS F+ FROT FABS F* F< EXIT THEN
FROT FROT F- FABS F> ;
The F~ word compares two floating point values with the specified precision. If r3 is zero, then flag is true if r1 and r2 are equal. If r3 is negative, then flag is true if the absolute value of (r1 minus r2) is less than the absolute value of r3 times the sum of the absolute values of r1 and r2. If r3 is positive, then flag is true if the absolute value of (r1 minus r2) is less than r3.
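A typical use of F~ is to compare a computed result against an expected value with an absolute tolerance. This sketch assumes that F~ has been loaded from FLOATEXT.FTH and behaves as described above; the tolerance 1e-9 is an arbitrary choice for illustration:

```forth
\ Assumes F~ is available, for example loaded from FLOATEXT.FTH.
1e 3e F/ 3e F*   \ 1/3*3, which may differ from 1 in the last digits
1e 1e-9 F~ .     \ positive r3 is an absolute tolerance: true (-1) expected
```

A negative r3 such as -1e-9 would instead request a comparison relative to the magnitude of the two operands.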
To check if a floating point value is a double precision value, define:
: DBL? FP@ C@ 1 AND NEGATE ;
DBL? returns true if the value is double precision.
To promote a single to a double on the floating point stack, just do 0d F+ or define:
: E>D FP@ C@ 1 OR FP@ C! ;
To demote a double to a single on the floating point stack by truncation:
: D>E FP@ C@ $fe AND FP@ C! FP@ 7 + 5 ERASE ;
The following words return true (-1) or false (0) on the stack by comparing integer values:
word | stack effect ( before -- after ) |
---|---|
< | ( n1 n2 -- true ) if n1<n2 otherwise ( n1 n2 -- false ) |
> | ( n1 n2 -- true ) if n1>n2 otherwise ( n1 n2 -- false ) |
= | ( x1 x2 -- true ) if x1=x2 otherwise ( x1 x2 -- false ) |
<> | ( x1 x2 -- true ) if x1<>x2 otherwise ( x1 x2 -- false ) |
U< | ( u1 u2 -- true ) if u1<u2 otherwise ( u1 u2 -- false ) |
U> | ( u1 u2 -- true ) if u1>u2 otherwise ( u1 u2 -- false ) |
D< | ( d1 d2 -- true ) if d1<d2 otherwise ( d1 d2 -- false ) |
D> | ( d1 d2 -- true ) if d1>d2 otherwise ( d1 d2 -- false ) |
D= | ( xd1 xd2 -- true ) if xd1=xd2 otherwise ( xd1 xd2 -- false ) |
D<> | ( xd1 xd2 -- true ) if xd1<>xd2 otherwise ( xd1 xd2 -- false ) |
DU< | ( ud1 ud2 -- true ) if ud1<ud2 otherwise ( ud1 ud2 -- false ) |
DU> | ( ud1 ud2 -- true ) if ud1>ud2 otherwise ( ud1 ud2 -- false ) |
0< | ( n -- true ) if n<0 otherwise ( n -- false ) |
0> | ( n -- true ) if n>0 otherwise ( n -- false ) |
0= | ( x -- true ) if x=0 otherwise ( x -- false ) |
0<> | ( x -- true ) if x<>0 otherwise ( x -- false ) |
D0< | ( d -- true ) if d<0 otherwise ( d -- false ) |
D0> | ( d -- true ) if d>0 otherwise ( d -- false ) |
D0= | ( xd -- true ) if xd=0 otherwise ( xd -- false ) |
D0<> | ( xd -- true ) if xd<>0 otherwise ( xd -- false ) |
WITHIN | ( n1\|u1 n2\|u2 n3\|u3 -- flag ) |
The WITHIN word applies to signed and unsigned single integers on the stack, represented by n|u. True is returned if the value n1|u1 is in the range n2|u2 inclusive to n3|u3 exclusive. For example:
5 -1 10 WITHIN . ↲
-1 OK[0]
5 6 10 WITHIN . ↲
0 OK[0]
5 -1 5 WITHIN . ↲
0 OK[0]
More specifically, WITHIN performs a comparison of a test value n1|u1 with an inclusive lower limit n2|u2 and an exclusive upper limit n3|u3, returning true if either (n2|u2 < n3|u3 and (n2|u2 <= n1|u1 and n1|u1 < n3|u3)) or (n2|u2 > n3|u3 and (n2|u2 <= n1|u1 or n1|u1 < n3|u3)) is true, returning false otherwise.
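Because the upper limit is exclusive, range checks usually add one to the highest accepted value. The word digit? below is a hypothetical example of such a check for decimal digit characters:

```forth
\ digit? is a hypothetical word: true if char is a decimal digit 0 to 9.
\ The upper limit of WITHIN is exclusive, hence the 1+.
: digit? ( char -- flag )  [CHAR] 0 [CHAR] 9 1+ WITHIN ;

CHAR 7 digit? .   \ displays -1
CHAR A digit? .   \ displays 0
```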
The following words return true (-1) or false (0) on the stack by comparing floating point values on the floating point stack:
word | stack effect ( before -- after ) |
---|---|
F< | ( F: r1 r2 -- ; -- true ) if r1<r2 otherwise ( F: r1 r2 -- ; -- false ) |
F> | ( F: r1 r2 -- ; -- true ) if r1>r2 otherwise ( F: r1 r2 -- ; -- false ) |
F= | ( F: r1 r2 -- ; -- true ) if r1=r2 otherwise ( F: r1 r2 -- ; -- false ) |
F<> | ( F: r1 r2 -- ; -- true ) if r1<>r2 otherwise ( F: r1 r2 -- ; -- false ) |
F0< | ( F: r -- ; -- true ) if r<0e otherwise ( F: r -- ; -- false ) |
F0> | ( F: r -- ; -- true ) if r>0e otherwise ( F: r -- ; -- false ) |
F0= | ( F: r -- ; -- true ) if r=0e otherwise ( F: r -- ; -- false ) |
F0<> | ( F: r -- ; -- true ) if r<>0e otherwise ( F: r -- ; -- false ) |
Floating point operations in single precision are performed with 12 to 14 digits (10 + 2 to 4 guard digits). All digits are stored, including the guard digits. Beware that comparisons for equality and inequality may fail even though the displayed numbers look equal, because fewer than 20 significant digits are displayed when PRECISION is less than 20 (see below). To compare for equality within a specified precision, use F~. See floating point arithmetic.
The following words display integer values:
word | stack effect | comment |
---|---|---|
. | ( n -- ) | display signed n in current BASE followed by a space |
.R | ( n1 n2 -- ) | display signed n1 in current BASE, right-justified to fit n2 characters |
U. | ( u -- ) | display unsigned u in current BASE followed by a space |
U.R | ( u n -- ) | display unsigned u in current BASE, right-justified to fit n characters |
D. | ( d -- ) | display signed double d in current BASE followed by a space |
D.R | ( d n -- ) | display signed double d in current BASE, right-justified to fit n characters |
BASE. | ( u1 u2 -- ) | display unsigned u1 in base u2 followed by a space |
BIN. | ( u -- ) | display unsigned u in binary followed by a space |
DEC. | ( u -- ) | display unsigned u in decimal followed by a space |
HEX. | ( u -- ) | display unsigned u in hexadecimal followed by a space |
Note that 0 .R may be used to display an integer without a trailing space.
Values are displayed with EMIT and TYPE, which may be redirected to a printer or to a file. See character output.
See also pictured numeric output.
The following words display floating point values:
word | stack effect | comment |
---|---|---|
F. | ( F: r -- ) | display r in fixed-point notation followed by a space |
FS. | ( F: r -- ) | display r in scientific notation followed by a space |
SET-PRECISION | ( n -- ) | set the VARIABLE PRECISION to n significant digits to display with F. and FS. |
Note that SET-PRECISION does not affect the precision of floating point operations.
The standard FE. word is defined in FLOATEXT.FTH and displays a floating point value in engineering format:
: FE. ( F: r -- )
HERE PRECISION 3 MAX REPRESENT DROP IF '- EMIT THEN
1- 3 /MOD SWAP DUP 0< IF 3 + SWAP 1- SWAP THEN 1+ HERE OVER TYPE '. EMIT
HERE OVER + SWAP PRECISION SWAP - 0 MAX TYPE 3 * 'E DBL + EMIT . ;
Formatted numeric output is produced with a sequence of "pictured numeric output" words. An internal "hold area" (a buffer of 40 bytes) is filled with digits and other characters in backward order (least significant digit goes in first):
word | stack effect | comment |
---|---|---|
<# | ( -- ) | initiates the hold area for conversion |
# | ( ud1 -- ud2 ) | adds one digit to the hold area in the current BASE, updates ud1 to ud2 |
#S | ( ud -- 0 0 ) | adds all remaining digits to the hold area in the current BASE |
HOLD | ( char -- ) | places char in the hold area |
HOLDS | ( c-addr u -- ) | places the string c-addr u in the hold area |
SIGN | ( n -- ) | places a minus in the hold area if n is negative |
#> | ( xd -- c-addr u ) | returns the hold area as a string |
For example:
: dollars <# # # '. HOLD #S '$ HOLD #> TYPE SPACE ; ↲
1.23 dollars ↲
$1.23 OK[0]
Note the reverse order in which the numeric output is composed. Also note that HOLD is used to add one character to the hold area. To hold a string we should use S" string" HOLDS.
In this example the value 1.23 appears to have a fraction, but the placement of the . in a double integer has no significance, i.e. it is merely "syntactic sugar".
To display signed double integers, it is necessary to tuck the high order cell with the sign under the double number, then make the number positive and convert using SIGN at the end to place the sign at the front of the number:
: dollars TUCK DABS <# # # '. HOLD #S '$ HOLD DROP OVER SIGN #> TYPE SPACE ; ↲
-1.23 dollars ↲
-$1.23 OK[0]
Pictured numeric words should not be used directly from the Forth prompt, because the hold area may be overwritten by other numeric outputs and by new words added to the dictionary.
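Because each # converts one digit in the current BASE, the base can be changed in the middle of a conversion. The classic sketch below displays a number of seconds as minutes and seconds; the word name .mm:ss is hypothetical, and the definition assumes that BASE is DECIMAL on entry:

```forth
\ .mm:ss is a hypothetical word: display u seconds as minutes:seconds.
\ Assumes BASE is DECIMAL on entry.
: .mm:ss ( u -- )
  0 <#             \ extend u to an unsigned double and start the conversion
  #                \ seconds, ones digit
  6 BASE ! #       \ seconds, tens digit (0 to 5) converted in base 6
  DECIMAL          \ back to decimal for the minutes
  [CHAR] : HOLD
  #S #>            \ all remaining digits are the minutes
  TYPE SPACE ;

125 .mm:ss   \ displays 2:05
```

As with the dollars example, the digits are produced from right to left, so the seconds are converted before the minutes.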
The following words store or display string constants:
word | stack effect | comment |
---|---|---|
S" ..." | ( -- c-addr u ) | returns the string c-addr u on the stack |
S\" ..." | ( -- c-addr u ) | same as S" with special character escapes (see below) |
C" ..." | ( -- c-addr ) | return a counted string on the stack, can only be used in colon definitions |
." ..." | ( -- ) | displays the string, can only be used in colon definitions |
.( ...) | ( -- ) | displays the string immediately, even when compiling, followed by a CR |
Strings contain 8-bit characters, including special characters.
The string constants created with S" and S\" are compiled to code when used in colon definitions. Otherwise, the string is stored in a temporary internal 256-byte string buffer returned by WHICH-POCKET. Two buffers are recycled. For example, the following "Forth500" definition contains the constant string "Forth500" permanently stored in code, whereas the interactive S" Forth500" is temporarily stored in an internal buffer and is not persistent:
: "Forth500" S" Forth500" ; ↲
"Forth500" S" Forth500" S= . ↲
-1 OK[0]
Note that most words require strings with a c-addr u pair of cells on the stack, such as TYPE to display a string.
A so-called "counted string" is compiled with C" to code in colon definitions. A counted string constant is a c-addr pointing to the length of the string followed by the string characters. The COUNT word takes a counted string c-addr from the stack to return a string address c-addr and length u on the stack. The maximum length of a counted string is 255 characters.
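A short sketch of a counted string and its conversion with COUNT follows. The word greeting is a hypothetical name used only for this illustration:

```forth
\ greeting is a hypothetical word returning a counted string.
: greeting ( -- c-addr )  C" Hello" ;

greeting COUNT TYPE   \ displays Hello
greeting C@ .         \ the first byte holds the length: displays 5
```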
The S\" word accepts the following special characters in the string when escaped with \:
escape | ASCII | character |
---|---|---|
\\ | 92 | \ |
\" | 34 | " |
\a | 7 | BEL; alert |
\b | 8 | BS; backspace |
\e | 27 | ESC; escape |
\f | 12 | FF; form feed |
\l | 10 | LF; line feed |
\m | 13 10 | CR and LF; carriage return and line feed |
\n | 13 10 | CR and LF; carriage return and line feed |
\q | 34 | " |
\r | 13 | CR; carriage return |
\t | 9 | HT; horizontal tab |
\v | 11 | VT; vertical tab |
\xhh | hh (hex) | |
\z | 0 | NUL |
The escape letters are case sensitive.
The following words allocate and accept user input into a string buffer:
word | stack effect ( before -- after ) | comment |
---|---|---|
PAD | ( -- c-addr ) | returns the fixed address of a 256 byte temporary buffer that is not used by any built-in Forth words |
BUFFER: | ( u "name" -- ; c-addr ) | creates an uninitialized string buffer of size u |
ACCEPT | ( c-addr +n1 -- +n2 ) | accepts user input into the buffer c-addr of max size +n1 and returns string size +n2 |
EDIT | ( c-addr +n1 n2 n3 n4 -- c-addr +n5 ) | edit string buffer c-addr of max size +n1 containing a string of length n2, placing the cursor at n3 and limiting cursor movement to n4 and after, returns string c-addr with updated size +n5 |
Note that BUFFER: only reserves space for the string, or any type of data that you want to store, but does not store the max size and the length of the actual string contained. To do so, we can use a CONSTANT and a VARIABLE:
40 CONSTANT name-max ↲
name-max BUFFER: name ↲
VARIABLE name-len ↲
name name-max ACCEPT name-len ! ↲
For example, to let the user edit the name:
name name-max name-len @ DUP 0 EDIT name-len ! DROP ↲
See also the strings example for an improved implementation of string buffers that hold both the maximum and actual string lengths.
The following words move and copy characters in and between string buffers:
word | stack effect | comment |
---|---|---|
CMOVE | ( c-addr1 c-addr2 u -- ) | copy u characters from c-addr1 to c-addr2, proceeding from low to high addresses |
CMOVE> | ( c-addr1 c-addr2 u -- ) | copy u characters from c-addr1 to c-addr2, proceeding from high to low addresses |
MOVE | ( c-addr1 c-addr2 u -- ) | copy u characters from c-addr1 to c-addr2, safe for overlapping ranges |
C! | ( char c-addr -- ) | store char in c-addr |
C@ | ( c-addr -- char ) | fetch char from c-addr |
A problem may arise when the source and target address ranges overlap, for example when copying string contents in place. In this case, CMOVE ("c move") correctly copies characters to lower memory c-addr1>c-addr2 and CMOVE> ("c move up") correctly copies characters to higher memory c-addr1<c-addr2. The MOVE word always correctly copies characters either way. Also, MOVE does nothing if c-addr1=c-addr2.
For example, to insert name= before the string in the name buffer by shifting the string to make room, updating the length and copying the prefix into the buffer:
name DUP 5 + name-len @ name-max 5 - MIN CMOVE> ↲
name-len @ 5 + name-max MIN name-len ! ↲
S" name=" name SWAP CMOVE ↲
The following words fill a string buffer with characters:
word | stack effect | comment |
---|---|---|
BLANK | ( c-addr u -- ) | fills u bytes at address c-addr with BL (space, ASCII 32) |
ERASE | ( c-addr u -- ) | fills u bytes at address c-addr with zeros |
FILL | ( c-addr u char -- ) | fills u bytes at address c-addr with char |
The following words update the string address c-addr and size u on the stack:
word | stack effect ( before -- after ) | comment |
---|---|---|
NEXT-CHAR | ( c-addr1 u1 -- c-addr2 u2 char ) | if u1>0 returns c-addr2=c-addr1+1, u2=u1-1 and char is the char at c-addr2, otherwise throw -24 |
/STRING | ( c-addr1 u1 n -- c-addr2 u2 ) | skip n characters c-addr2=c-addr1+n, u2=u1-n, n may be negative to revert |
-TRAILING | ( c-addr u1 -- c-addr u2 ) | returns string c-addr with adjusted size u2<=u1 to ignore trailing spaces |
-CHARS | ( c-addr u1 char -- c-addr u2 ) | returns string c-addr with adjusted size u2<=u1 to ignore trailing char |
Note that -TRAILING ("not trailing") is the same as BL -CHARS. For example, to remove trailing spaces from name to update name-len, then display the name without the name= prefix:
name name-len @ -TRAILING name-len ! DROP ↲
name name-len @ 5 /STRING TYPE ↲
Beware that /STRING ("slash string") does not check the adjustment n against the length of the string. The adjustment n may also be negative, i.e. /STRING can be used to undo a previous slash.
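The effect of -TRAILING and /STRING on a string can be followed step by step in a small definition. The word slash-demo is a hypothetical name; the string is compiled into the definition so that no temporary buffer is involved:

```forth
\ slash-demo is a hypothetical word for illustration only.
: slash-demo ( -- )
  S" hello world   "   \ a string with three trailing spaces
  -TRAILING            \ ignore the trailing spaces: 11 characters remain
  6 /STRING            \ skip the first 6 characters
  TYPE ;               \ displays world

slash-demo
```

Following 6 /STRING with -6 /STRING would restore the original address and length.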
The following words compare and search two strings:
word | stack effect ( before -- after ) | comment |
---|---|---|
S= | ( c-addr1 u1 c-addr2 u2 -- flag ) | returns TRUE if the two strings are equal |
COMPARE | ( c-addr1 u1 c-addr2 u2 -- -1\|0\|1 ) | returns -1\|0\|1 (less, equal, greater) comparison of the two strings |
SEARCH | ( c-addr1 u1 c-addr2 u2 -- c-addr3 u3 flag ) | returns TRUE if the second string was found in the first string at c-addr3 with u3 remaining chars, otherwise FALSE and c-addr3=c-addr1, u3=u1 |
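When only the flag of SEARCH matters, the returned string can be dropped. The following sketch defines a hypothetical word contains? and tries it, together with COMPARE, inside small test words so that all strings are compiled into code:

```forth
\ contains? is a hypothetical word: true if the second string
\ occurs somewhere in the first string.
: contains? ( c-addr1 u1 c-addr2 u2 -- flag )  SEARCH NIP NIP ;

: test1 ( -- flag )  S" Forth500 manual" S" 500" contains? ;
: test2 ( -- flag )  S" Forth500 manual" S" BASIC" contains? ;
: test3 ( -- n )     S" abc" S" abd" COMPARE ;

test1 .   \ displays -1
test2 .   \ displays 0
test3 .   \ displays -1, since "abc" sorts before "abd"
```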
To convert a string to a number:
word | stack effect ( before -- after ) | comment |
---|---|---|
>NUMBER | ( ud1 c-addr1 u1 -- ud2 c-addr2 u2 ) | convert the integer in string c-addr1 u1 to ud2 using the current BASE and ud1 as seed, returns the remaining non-convertible string c-addr2 u2 |
>DOUBLE | ( c-addr u -- d true ) or ( c-addr u -- false ) | convert the integer in string c-addr u to d using the current BASE, returns true if successful, otherwise returns false without d |
>FLOAT | ( c-addr u -- true ; F: -- r ) or ( c-addr u -- false ) | convert the floating point value in string c-addr u to r, returns true if successful, otherwise returns false without r |
For >NUMBER, the initial ud1 value is the "seed" that is normally zero. This value can also be a previously converted high-order part of the number.
>DOUBLE returns a double integer when successful. It also sets the VALUE flag DBL to true if the integer is a double with a dot (.) in the numeric string. To convert a string to a single signed integer, use D>S afterwards to convert.
>FLOAT returns a single or double float on the floating point stack when successful. It also sets the VALUE flag DBL to true if the float is a double. >FLOAT requires BASE to be DECIMAL.
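The way >NUMBER stops at the first character that is not a digit can be seen in a short sketch. The word parse-demo is a hypothetical name, and the example assumes that BASE is DECIMAL:

```forth
\ parse-demo is a hypothetical word for illustration only.
: parse-demo ( -- )
  0. S" 123abc" >NUMBER   \ seed 0, leaves ud=123 and the remaining "abc"
  TYPE SPACE              \ displays abc
  D. ;                    \ displays 123

DECIMAL parse-demo        \ displays abc 123
```

A remaining length of zero after >NUMBER indicates that the whole string was converted.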
The REPRESENT word can be used to convert a floating point value to a string saved to a string buffer:
word | stack effect ( before -- after ) | comment |
---|---|---|
REPRESENT | ( c-addr u -- n flag true ; F: r -- ) | save the string representation of the significand of r to c-addr of size u, returns exponent n and sign flag |
REPRESENT is used by the F., FE. and FS. words, which save the string to the hold area at HERE to display. The character string contains the u most significant digits of the significand of r represented as a decimal fraction with the implied decimal point to the left of the first digit, and the first digit zero only if all digits are zero. The significand is rounded to u digits following the "round to nearest" rule; n is adjusted, if necessary, to correspond to the rounded magnitude of the significand. If flag is true then r is negative. The VALUE flag DBL is true if the float is a double.
The F., FE. and FS. words are defined as follows:
: F. ( F: r -- )
HERE PRECISION REPRESENT DROP IF '- EMIT THEN
HERE PRECISION '0 -CHARS 1 UMAX NIP
OVER 0> INVERT IF
." 0." SWAP NEGATE ZEROS HERE SWAP TYPE
ELSE 2DUP < INVERT IF
HERE OVER TYPE - ZEROS '. EMIT
ELSE
SWAP HERE OVER TYPE '. EMIT HERE OVER + -ROT - TYPE
THEN THEN SPACE ;
: FE. ( F: r -- )
HERE PRECISION 3 MAX REPRESENT DROP IF '- EMIT THEN
1- 3 /MOD SWAP DUP 0< IF 3 + SWAP 1- SWAP THEN 1+ HERE OVER TYPE '. EMIT
HERE OVER + SWAP PRECISION SWAP - 0 MAX TYPE 3 * 'E DBL + EMIT . ;
: FS. ( F: r -- )
HERE PRECISION REPRESENT DROP IF '- EMIT THEN
HERE C@ EMIT '. HERE C! HERE PRECISION TYPE 'E DBL + EMIT 1- . ;
: ZEROS ( n -- ) 0 ?DO '0 EMIT LOOP ;
See also numeric output.
The following words return key presses and control the key buffer:
word | stack effect | comment |
---|---|---|
EKEY? | ( -- flag ) | if a keyboard event is available, return TRUE, otherwise return FALSE |
EKEY | ( -- x ) | display the cursor, wait for a keyboard event, return the event x |
EKEY>CHAR | ( x -- char flag ) | convert keyboard event x to a valid character and return TRUE, otherwise return FALSE |
KEY? | ( -- flag ) | if a character is available, return TRUE, otherwise return FALSE |
KEY | ( -- char ) | display the cursor, wait for a character and return it |
INKEY | ( -- char ) | check for a key press returning the key as char or 0 otherwise, clears the key buffer |
KEY-CLEAR | ( -- ) | empty the key buffer |
>KEY-BUFFER | ( c-addr u -- ) | fill the key buffer with the string of characters at address c-addr size u |
MS | ( u -- ) | stops execution for u milliseconds |
The KEY word returns a nonzero 7-bit ASCII code and ignores any special keys.
The EKEY word returns a PC-E500(S) key event code. A positive 7-bit ASCII code is returned or a negative code for special keys. A negative value corresponds to the PC-E500(S) second byte code but negated. See BASIC INPUT$ in the PC-E500 manual page 268 for the corresponding key code tables. For example, the ANS key code is $90 returned by EKEY as $-90 (or $FF70).
The INKEY word returns a value between 0 and 255. See BASIC INKEY$ in the PC-E500 manual page 265 for the corresponding key code table.
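A common use of KEY is a yes/no prompt. The sketch below defines a hypothetical word yes? that accepts both upper and lower case; the prompt text is an arbitrary choice:

```forth
\ yes? is a hypothetical word: wait for a key and return true for Y or y.
\ $DF AND clears bit 5, which maps lower case ASCII letters to upper case.
: yes? ( -- flag )
  ." Continue (Y/N)? "
  KEY DUP EMIT
  $DF AND [CHAR] Y = ;
```

For example, yes? IF ... THEN can guard an action inside another definition. Since KEY ignores special keys, EKEY would be needed to react to keys such as the cursor keys.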
The following words display characters and text:
word | stack effect | comment |
---|---|---|
EMIT | ( char -- ) | display character char |
TYPE | ( c-addr u -- ) | display string c-addr of size u |
REVERSE-TYPE | ( c-addr u -- ) | same as TYPE with reversed video |
PAUSE | ( c-addr u -- ) | display string c-addr of size u in reverse video and wait for a key press |
DUMP | ( addr u -- ) | dump u bytes at address addr in hexadecimal |
CR | ( -- ) | moves the cursor to a new line with CR-LF |
SPACE | ( -- ) | displays a single space |
SPACES | ( n -- ) | displays n spaces |
The PAUSE word checks if the current TTY output is to STDO, then displays the string in reverse video and waits for a key press. If the BREAK key or the C/CE key is pressed, then PAUSE executes ABORT. If the output is not to STDO, then PAUSE drops c-addr and u.
The following words can be used to control character output:
word | stack effect | comment |
---|---|---|
STDO |
( -- 1 ) | returns fileid=1 for standard output to the screen |
STDL |
( -- 3 ) | returns fileid=3 for standard output to the line printer |
TTY |
( -- fileid ) | a VALUE containing the fileid of a device or file to send character data to |
PRINTER |
( -- n ) | connects printer, returns the number of characters per line or zero if printer is off |
Normally TTY
is STDO
for screen output. The output can be redirected by
setting the TTY
value to a fileid of an open file with fileid TO TTY
.
When an exception occurs, including ABORT
, TTY
is set back to STDO
.
To print "This is printed" on a CE-126P:
PRINTER . ↲
24 OK[0]
STDL TO TTY .( This is printed) STDO TO TTY ↲
OK[0]
Printer: This is printed
Note that a final CR
may be needed to print the last line on the printer.
Note that the .(
output includes a final CR
.
To make printing easier, you can define two words to redirect all character output to a printer:
: print-on PRINTER IF STDL TO TTY THEN ;
: print-off STDO TO TTY ;
For example:
print-on FILES F:*.* print-off ↲
Communicating with devices connected to the COM: serial port requires opening the port and closing it afterwards. The appropriate COM: settings should be specified once and for all in BASIC beforehand. To do so, connect a serial cable and initialize the COM: port on the PC-E500(S):
> OPEN "9600,N,8,1,A,L,&H1A,N,N": CLOSE
Then return to Forth500 with CALL &B0000
or CALL &B9000
on an unexpanded
PC-E500(S).
To make COM: port sending and receiving easier, you can define a COM
value
to hold the open COM: fileid or zero when it is closed. For example:
0 VALUE COM
: open-com S" COM:" R/W OPEN-FILE THROW TO COM ;
: close-com COM ?DUP IF CLOSE-FILE DROP 0 TO COM THEN ;
Then open-com
followed by COM TO TTY
sends TTY output to the COM: port. Execute
STDO TO TTY
to set TTY
back to STDO
. For example:
: serial-on COM 0= IF open-com THEN COM TO TTY ;
: serial-off STDO TO TTY ;
Don't forget to close the COM: port when no longer in use. Otherwise the port cannot be opened in BASIC.
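As a usage sketch, assuming the open-com, close-com, serial-on and serial-off definitions above are loaded, a line of text could be sent to the COM: port as follows. The name serial-line is hypothetical.

```forth
\ send one line of text to the COM: port, then restore screen output
: serial-line ( c-addr u -- ) serial-on TYPE CR serial-off ;

S" hello" serial-line   \ transmit hello followed by a new line
close-com               \ release the port so BASIC can open it again
```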
The following words control the screen and cursor position:
word | stack effect | comment |
---|---|---|
AT-XY |
( n1 n2 -- ) | set cursor at column n1 and row n2 position |
AT-TYPE |
( n1 n2 c-addr u -- ) | display the string c-addr u at column n1 and row n2 |
AT-CLR |
( n1 n2 n3 -- ) | clear n3 characters at column n1 and row n2 |
PAGE |
( -- ) | clear the screen |
SCROLL |
( n -- ) | scroll the screen n lines up when n>0 or down when n<0 |
X@ |
( -- n ) | returns current cursor column position 0 to 39 |
X! |
( n -- ) | set cursor column position |
Y@ |
( -- n ) | returns current cursor row position 0 to 3, or 4 when the cursor passed the bottom of the window |
Y! |
( n -- ) | set cursor row position |
XMAX@ |
( -- n ) | returns cursor max columns 1 to 40 |
XMAX! |
( n -- ) | set max columns, restricts the display viewing window |
YMAX@ |
( -- n ) | returns cursor max rows 1 to 4 |
YMAX! |
( n -- ) | set max rows, restricts the display viewing window |
BUSY-ON |
( -- ) | turn on the busy annunciator |
BUSY-OFF |
( -- ) | turn off the busy annunciator, turn on RUN or PRO |
SET-CURSOR |
( u -- ) | set cursor: bit 5 = on, bit 3 = blink, bit 0 to 2 = shape |
The SET-CURSOR
argument is an 8-bit pattern formed by OR
-ing $20
to turn
the cursor on, OR
-ing with $8
to blink the cursor, and OR
-ing with one of
the following five possible cursor shapes:
value | cursor shape |
---|---|
$00 |
underline |
$01 |
double underline |
$02 |
solid box |
$03 |
space (to display a cursor "box" on reverse video text) |
$04 |
triangle |
To temporarily hide the cursor without turning it off, for example to avoid
showing a cursor with KEY
and EKEY
, use 0 4 AT-XY
to move the cursor to
row 4, which is invisible because it lies just below the four display rows 0 to 3.
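For instance, the SET-CURSOR argument can be composed from the bit values listed above. This is a minimal sketch that uses only the values from the table:

```forth
$20 $08 OR $02 OR SET-CURSOR   \ cursor on, blinking, solid box
$20 $00 OR SET-CURSOR          \ cursor on, not blinking, underline
```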
The graphics mode is set with GMODE!
. All graphics drawing commands use this
mode to set, reset or reverse pixels:
word | stack effect | comment |
---|---|---|
GMODE! |
( 0|1|2 -- ) | pixels are set (0), reset (1) or reversed (2), stores in VARIABLE GMODE |
GPOINT |
( n1 n2 -- ) | draw a pixel at x=n1 and y=n2 |
GPOINT? |
( n1 n2 -- 0|1|-1 ) | returns 1 if a pixel is set at x=n1 and y=n2 or 0 if unset or -1 when outside of the screen |
GLINE |
( n1 n2 n3 n4 u -- ) | draw a line from x=n1 and y=n2 to x=n3 and y=n4 with pattern u |
GBOX |
( n1 n2 n3 n4 u -- ) | draw a filled box from x=n1 and y=n2 to x=n3 and y=n4 with pattern u |
GDOTS |
( n1 n2 u -- ) | draw a row of 8 pixels u at x=n1 and y=n2 |
GDOTS? |
( n1 n2 -- u ) | returns the row of 8 pixels u at x=n1 and y=n2 |
GDRAW |
( n1 n2 c-addr u -- ) | draw rows of 8 pixels stored in string c-addr u at x=n1 and y=n2 |
GBLIT! |
( u addr -- ) | copy 240 bytes of screen data from row u (0 to 3) to address addr |
GBLIT@ |
( u addr -- ) | copy 240 bytes of screen data at address addr to row u (0 to 3) |
A pattern u is a 16-bit pixel pattern to draw dashed lines and boxes. The
pattern should be $ffff (-1 or TRUE
) for solid lines and boxes. For example,
to reverse the current screen:
2 GMODE! ↲
0 0 239 31 TRUE GBOX ↲
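A non-solid pattern gives dashed output. The following sketch assumes the 240 by 32 pixel screen implied by the GBOX example above. The name frame and the pattern $F0F0 are arbitrary choices for illustration.

```forth
0 GMODE!                     \ drawing sets pixels
0 16 239 16 $F0F0 GLINE      \ dashed horizontal line across the middle

: frame ( -- )               \ solid outline around the screen
  0 0 239 0 TRUE GLINE
  239 0 239 31 TRUE GLINE
  239 31 0 31 TRUE GLINE
  0 31 0 0 TRUE GLINE ;
frame
```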
The GDOTS
word takes an 8-bit pattern to draw a row of 8 pixels. The GDRAW
word draws a sequence of 8-bit patterns. For example, to display a smiley at
the upper left corner of the screen:
: smiley ( x y -- ) S\" \x3c\x42\x91\xa5\xa1\xa1\xa5\x91\x42\x3c" GDRAW ; ↲
0 GMODE! PAGE CR 0 0 smiley ↲
  XXXXXX
 X      X
X  X  X  X
X        X
X X    X X
X  XXXX  X
 X      X
  XXXXXX
In addition to S\"
escaped strings with hexadecimal codes, the 10 bytes can
also be specified in binary with the sprite rotated sideways so that the top of
the sprite is on the right:
: sprite CREATE C, DOES> ( x y addr -- ) COUNT GDRAW ; ↲
10 sprite smiley ↲
%00111100 C, ↲
%01000010 C, ↲
%10010001 C, ↲
%10100101 C, ↲
%10100001 C, ↲
%10100001 C, ↲
%10100101 C, ↲
%10010001 C, ↲
%01000010 C, ↲
%00111100 C, ↲
0 GMODE! PAGE CR 0 0 smiley ↲
Blitting moves screen data between buffers to update or restore the screen
content. The GBLIT!
word stores a row of screen data in a buffer and
GBLIT@
fetches a row of screen data to restore the screen. Each operation
moves 240 bytes of screen data for one of the four rows of 40 characters. For
example, to save and restore the top row in the 256 byte PAD
:
: save-top-row ( -- ) 0 PAD GBLIT! ;
: restore-top-row ( -- ) 0 PAD GBLIT@ ;
To blit the whole screen, a buffer of 4 times 240 bytes is required:
240 4 * BUFFER: blit
: save-screen ( -- ) 4 0 DO I DUP 240 * blit + GBLIT! LOOP ;
: restore-screen ( -- ) 4 0 DO I DUP 240 * blit + GBLIT@ LOOP ;
The BEEP
word emits sound with the specified duration and tone:
word | stack effect | comment |
---|---|---|
BEEP |
( u1 u2 -- ) | beeps with tone u1 for u2 milliseconds |
MS |
( u -- ) | stops execution for u milliseconds |
The return stack is used to call colon definitions by saving the return address on the return stack. The return stack is also used by do-loops to store the loop control values, see also loops.
The return stack may be used to temporarily store values by moving them from
the stack to the return stack with >R
("to r") and back with R>
("r from").
Care must be taken to prevent return stack imbalances when a colon definition
exits. The return stack pointer must be restored to the original state (when
the colon definition started) before the colon definition exits. For example,
a >R
must be followed by R>
or RDROP
(or R>DROP
).
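As a minimal sketch of balanced return stack use, the following hypothetical word third copies the third stack item to the top by parking the TOS on the return stack for a moment:

```forth
: third ( x1 x2 x3 -- x1 x2 x3 x1 )
  >R        \ park x3 on the return stack
  OVER      \ x1 x2 x1
  R>        \ x1 x2 x1 x3
  SWAP ;    \ x1 x2 x3 x1

1 2 3 third . . . .   \ displays 1 3 2 1
```

Every >R is matched by an R> before the terminating ; so the return address is intact when the word exits.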
The following words move cells between stacks:
word | stack effect ( before -- after ) | comment |
---|---|---|
>R |
( x -- ; R: -- x ) | move the TOS to the return stack |
DUP>R |
( x -- x ; R: -- x ) | copy the TOS to the return stack |
2>R |
( xd -- ; R: -- xd ) | move the double TOS to the return stack |
R@ |
( R: x -- x ; -- x ) | copy the return stack TOS to the stack |
2R@ |
( R: xd -- xd ; -- xd ) | copy the return stack double cell (TOS and 2OS) to the stack |
R'@ |
( R: x1 x2 -- x1 x2 ; -- x1 ) | copy the return stack 2OS to the stack |
R"@ |
( R: x1 x2 x3 -- x1 x2 x3 ; -- x1 ) | copy the return stack 3OS to the stack |
R> |
( R: x -- ; -- x ) | move the TOS from the return stack to the stack |
RDROP |
( R: x -- ; -- ) | drop the return stack TOS |
2R> |
( R: xd -- ; -- xd ) | move the double TOS from the return stack to the stack |
N>R |
( n*x +n -- ; R: -- n*x +n ) | move n cells to the return stack |
NR> |
( -- n*x +n ; R: n*x +n -- ) | move n cells from the return stack |
RDROP
is the same as R>DROP
, a combination of R>
and DROP
.
The N>R
and NR>
words move +n+1 cells, including the cell +n. For
example 2 N>R ... NR> DROP
moves 2+1 cells to the return stack and back,
then drops the restored 2. This is effectively the same as executing 2>R ... 2R>
.
Note: N>R
and NR>
move +n mod 128 cells max as a precaution.
Other words related to the return stack:
word | stack effect | comment |
---|---|---|
RP@ |
( -- addr ) | returns the return stack pointer, which points to the return stack TOS |
RP! |
( addr -- ) | assigns the return stack pointer (danger!) |
"Caller cancelling" is possible with RDROP
("r drop")
to remove a return address before exiting:
: bar ." bar " RDROP ;
: foo ." foo " bar ." rest of foo" ;
where RDROP
removes the return address to foo
. Therefore:
foo ↲
foo bar OK[0]
The maximum depth of the return stack in Forth500 is 256 bytes to hold up to 128 cells or 128 calls to secondaries (colon definitions of words constructed from existing Forth words). Return stack under- and overflow is not checked by Forth500. Recursing too deep will overwrite the data stack and may eventually lead to a crash.
The following words define constants, variables and values:
word | stack effect | comment |
---|---|---|
CONSTANT |
( x "name" -- ; -- x ) | define "name" to return x on the stack |
2CONSTANT |
( dx "name" -- ; -- dx ) | define "name" to return dx on the stack |
FCONSTANT |
( F: r -- ; "name" -- ; -- r ) | define "name" to return r on the floating point stack |
VARIABLE |
( "name" -- ; -- addr ) | define "name" to return addr of the variable's cell on the stack |
! |
( x addr -- ) | store x at addr of a VARIABLE |
+! |
( n addr -- ) | add n to the value at addr of a VARIABLE |
@ |
( addr -- x ) | fetch the value x from addr of a VARIABLE |
? |
( addr -- ) | fetch the value x from addr of a VARIABLE and display it with . |
ON |
( addr -- ) | store TRUE (-1) at addr of a VARIABLE |
OFF |
( addr -- ) | store FALSE (0) at addr of a VARIABLE |
2VARIABLE |
( "name" -- ; -- addr ) | define "name" to return addr of the variable's double cell on the stack |
2! |
( dx addr -- ) | store dx at addr of a 2VARIABLE |
D+! |
( d addr -- ) | add d to the value at addr of a 2VARIABLE |
2@ |
( addr -- dx ) | fetch the value dx from addr of a 2VARIABLE |
FVARIABLE |
( "name" -- ; -- addr ) | define "name" to return addr of the variable's floating point value on the stack |
F! |
( F: r -- ; addr -- ) | store floating point value r at addr of an FVARIABLE |
F@ |
( addr -- ; F: r ) | fetch the floating point value r from addr of an FVARIABLE |
VALUE |
( x1 "name" -- ; -- x2 ) | define "name" with initial value x1 to return its current value x2 on the stack |
TO |
( x "name" -- ) | assign "name" the value x, if "name" is a VALUE |
+TO |
( n "name" -- ) | add n to the value of "name", if "name" is a VALUE |
2VALUE |
( dx1 "name" -- ; -- dx2 ) | define "name" with initial value dx1 to return its current value dx2 on the stack |
TO |
( dx "name" -- ) | assign "name" the value dx, if "name" is a 2VALUE |
+TO |
( d "name" -- ) | add d to the value of "name", if "name" is a 2VALUE |
FVALUE |
( F: r1 "name" -- ; -- F: r2 ) | define "name" with initial value r1 to return its current value r2 on the floating point stack |
TO |
( F: r "name" -- ) | assign "name" the value r, if "name" is an FVALUE |
Double integers are stored big endian with the 16 high order bits stored first, followed by the 16 low order bits. The TOS of a double integer on the stack contains the 16 higher order bits (stacks grow downward).
A single integer is stored little endian with the 8 low order bits stored
first, followed by the 8 high order bits. For example, C@
fetches the low
order 8 bits of a single integer stored at the specified address.
Values defined with VALUE
and 2VALUE
are initialized with the specified
initial values and do not require fetch operations, exactly like constants.
Unlike constants, values can be updated with TO
and +TO
. Note that
the TO
and +TO
words are used to assign and update VALUE
and 2VALUE
words.
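The difference between variables and values can be sketched as follows. The names hits and lives are hypothetical:

```forth
VARIABLE hits          \ a variable returns its address
hits OFF               \ store 0
5 hits !  1 hits +!    \ store 5, then add 1
hits ?                 \ displays 6

3 VALUE lives          \ a value returns its contents
-1 +TO lives           \ add -1 to the value
lives .                \ displays 2
10 TO lives            \ assign a new value
```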
A deferred word executes another word assigned to it, essentially a variable that contains the execution token of another word to execute indirectly:
word | stack effect | comment |
---|---|---|
DEFER |
( "name" -- ) | defines a deferred word that is initially uninitialized |
' |
( "name" -- xt ) | tick returns the execution token of "name" on the stack |
['] |
( "name" -- ; -- xt ) | compiles "name" as an execution token literal xt |
IS |
( xt "name" -- ) | assign "name" the execution token xt of another word |
ACTION-OF |
( "name" -- xt ) | fetch the execution token xt assigned to "name" |
DEFER! |
( xt1 xt2 -- ) | assign xt1 to deferred word execution token xt2 |
DEFER@ |
( xt1 -- xt2 ) | fetch xt2 from deferred word execution token xt1 |
NOOP |
( -- ) | does nothing |
EXECUTE |
( ... xt -- ... ) | executes execution token xt |
A deferred word is defined with DEFER
and assigned with IS
:
DEFER greeting ↲
: hi ." hi" ; ↲
' hi IS greeting ↲
greeting ↲
hi OK[0]
' NOOP IS greeting ↲
greeting ↲
OK[0]
:NONAME ." hello" ; IS greeting ↲
greeting ↲
hello OK[0]
The tick '
word parses the name of a word in the dictionary and returns its
execution token on the stack. An execution token points to executable code in
the dictionary located directly after the name of a word. The EXECUTE
word
executes code pointed to by an execution token. Therefore, ' my-word EXECUTE
is the same as executing my-word
.
Executing an uninitialized deferred word throws exception -256 "execution of an
uninitialized deferred word". To make a deferred word do nothing, assign
NOOP
"no-operation" to the deferred word.
To assign one deferred word to another we use ACTION-OF
, for example:
DEFER foo ↲
' TRUE IS foo ↲
DEFER bar ↲
ACTION-OF foo IS bar ↲
The result is that bar
is assigned TRUE
to execute. By contrast, ' foo IS bar
assigns foo
to bar
so that bar
executes foo
and foo
executes
TRUE
. This means that changing foo
would also change bar
.
The current action of a deferred word can be compiled into a definition to produce a static binding:
: bar ... [ ACTION-OF foo COMPILE, ] ... ;
where COMPILE,
compiles the execution token on the stack into the current
definition. See also the [ and ] brackets. To
streamline this method, define the immediate word [ACTION-OF]
:
: [ACTION-OF] ' DEFER@ COMPILE, ; IMMEDIATE
which is used as follows:
: bar ... [ACTION-OF] foo ... ;
Some Forth implementations use DEFERS
to do the same.
WARNING: a deferred word referencing a word that is deleted with FORGET
or deleted with a MARKER
is no longer executable. Executing it will likely
lead to a crash, because the reference is deleted.
A nameless colon definition just stores code that cannot be referenced by name.
The :NONAME
word compiles a definition and returns its execution token on the
stack:
:NONAME ." this definition has no name" ; ↲
EXECUTE ↲
this definition has no name OK[0]
EXECUTE
runs the code, but there is no longer any way to reuse the code or
delete it from the dictionary. :NONAME
is typically used with deferred
words to store and execute the unnamed code:
DEFER lambda ↲
:NONAME ." this definition has no name" ; IS lambda ↲
lambda ↲
this definition has no name OK[0]
FORGET lambda ↲
OK[0]
The no-name equivalent of CREATE
is CREATE-NONAME
, see CREATE and
DOES>.
A recursive colon definition cannot refer to its own name, which is hidden
until the final ;
is parsed. There are two reasons for this: to avoid the
possible use of an incomplete colon definition that can crash the system when
executed and to allow redefining a word to call the old definition while
executing additional code in the redefinition.
A recursive colon definition should use RECURSE
to call itself, for example:
: factorial ( u -- ud ) \ u<=12
?DUP IF DUP 1- RECURSE ROT UMD* ELSE 1. THEN ;
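Another small sketch of RECURSE, here computing the greatest common divisor with Euclid's algorithm. The name gcd is hypothetical and the sketch assumes non-negative arguments:

```forth
: gcd ( u1 u2 -- u )
  ?DUP IF            \ u2 is nonzero: keep reducing
    SWAP OVER MOD    \ u1 u2 -- u2 u1 mod u2
    RECURSE
  THEN ;             \ u2 is zero: u1 is the result

12 18 gcd .   \ displays 6
```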
Mutual recursion can be accomplished with deferred words:
DEFER foo ↲
: bar ... foo ... ; ↲
:NONAME ... bar ... ; IS foo ↲
:NONAME
returns the execution token of an unnamed colon definition,
see also noname definitions.
Do not recurse too deep. The return stack supports up to 128 calls to secondaries, not counting other data stored on the return stack.
An immediate word is always interpreted and executed, even within colon
definitions. A colon definition word can be declared IMMEDIATE
after
the terminating ;
. For example, the following colon definition of RECURSE
compiles the execution token of the most recent colon definition (i.e. the
word we are defining) into the compiled code:
: RECURSE
?COMP \ error -14 if we are not compiling
LAST-XT \ the execution token of the word being defined
COMPILE, \ compile it into code
; IMMEDIATE
See also compile-time immediate words.
Data can be stored in the dictionary as words created with CREATE
. Like a
colon definition, the name of a word is parsed and added to the dictionary.
This word does nothing else but just return the address of the body of data
stored in the dictionary. To allocate and populate the data the following
words can be used:
word | stack effect | comment |
---|---|---|
CREATE |
( "name" -- ; -- addr ) | adds a new word entry for "name" to the dictionary, this word returns addr |
HERE |
( -- addr ) | the next free address in the dictionary |
CELL |
( -- 2 ) | the size of a cell (single integer) in bytes |
CELLS |
( u -- 2*u ) | convert u from cells to bytes |
CELL+ |
( addr -- addr+2 ) | increments addr by a cell width (2 bytes) |
CHARS |
( u -- u ) | convert u from characters to bytes (does nothing) |
CHAR+ |
( addr -- addr+1 ) | increments addr by a character width (1 byte) |
FLOATS |
( u -- 12*u ) | convert u from floats to bytes |
FLOAT+ |
( addr -- addr+12 ) | increments addr by a floating point value width (12 bytes) |
ALLOT |
( n -- ) | reserves n bytes in the dictionary starting HERE , adds n to HERE |
UNUSED |
( -- u ) | returns the number of unused bytes remaining in the dictionary |
, |
( x -- ) | stores x at HERE then increments HERE by CELL (by 2 bytes) |
2, |
( dx -- ) | stores dx at HERE then increments HERE by 2 CELLS (by 4 bytes) |
C, |
( char -- ) | stores char at HERE then increments HERE by 1 CHARS (by 1 byte) |
F, |
( F: r -- ) | stores floating point value r at HERE then increments HERE by 1 FLOATS (by 12 bytes) |
DOES> |
( -- ; -- addr ) | the following code will be compiled and executed by the word we CREATE |
@ |
( addr -- x) | fetches x stored at addr |
2@ |
( addr -- dx ) | fetches dx stored at addr |
C@ |
( addr -- char ) | fetches char stored at addr |
F@ |
( addr -- ; F: r ) | fetches r stored at addr |
Allocation is limited by the remaining free space in the dictionary returned by
the UNUSED
word. Note that the ALLOT
value may be negative to release
space. Make sure to release space only with ALLOT
after allocating space
with ALLOT
and when no new words were defined and added to the dictionary.
The CREATE
word adds an entry to the dictionary, typically followed by words
to allocate and store data associated with the new word. For example, we can
create a word foo
with a cell to hold a value that is initially zero:
CREATE foo 0 , ↲
3 foo ! ↲
foo ? ↲
3 OK[0]
In fact, this is exactly how the VARIABLE
word is defined in Forth:
: VARIABLE CREATE 0 , ; ↲
We can use CREATE
with "comma" words such as ,
to store values. For
example, a table of 10 primes:
CREATE primes 2 , 3 , 5 , 7 , 11 , 13 , 17 , 19 , 23 , 29 , ↲
The entire primes
table is displayed using address arithmetic as follows:
: primes? ( -- ) 10 0 DO primes I CELLS + ? LOOP ; ↲
where primes
returns the starting address of the table and primes I CELLS +
computes the address of the cell that holds the I
'th prime value. The
CELLS
word multiplies the TOS by two, since Forth500 cells are 2 bytes.
Uninitialized space is allocated with ALLOT
. For example, a buffer:
CREATE buf 256 ALLOT ↲
This creates a buffer buf
of 256 bytes. The buf
word returns the starting
address of this buffer. In fact, the built-in BUFFER:
word is defined as:
: BUFFER: CREATE ALLOT ; ↲
so that buf
can also be created with:
256 BUFFER: buf ↲
The DOES>
word compiles code until a terminating ;
. This code is executed
by the word we CREATE
. For example, CONSTANT
is defined in Forth as
follows:
: CONSTANT CREATE , DOES> @ ; ↲
A constant just fetches its data from the definition's body.
Note that >BODY
(see introspection) of an execution token
returns the same address as CREATE
returns and that DOES>
pushes on the
stack. For example, ' buf >BODY
and buf
return the same address.
DOES>
is valid only in a colon definition, because the DOES>
code is part
of the creating definition, not of the word we CREATE
. The word we
CREATE
executes the DOES>
code.
For example, address arithmetic can be added with DOES>
to automatically
fetch a prime number from the primes
table of constants:
: table: ( "name" -- ; index -- n ) CREATE DOES> SWAP CELLS + @ ;
table: primes 2 , 3 , 5 , 7 , 11 , 13 , 17 , 19 , 23 , 29 , ↲
3 primes . ↲
7 OK[0]
The SWAP CELLS + @
doubles the index with CELLS
then adds the address of
the primes
table to get to the address to fetch the value.
CREATE-NONAME
is similar to CREATE
, but does not add a new word to the
dictionary, returning the execution token of the code instead. The execution
token can be assigned to a DEFER
word for example, see also noname
definitions.
The following words define a structure and its fields:
word | stack effect ( before -- after ) | comment |
---|---|---|
BEGIN-STRUCTURE |
( "name" -- addr 0 ; -- u ) | define a structure type |
+FIELD |
( u n "name" -- u ; addr -- addr ) | define a field name with the specified size |
FIELD: |
( u "name" -- n ; addr -- addr ) | define a single cell field |
CFIELD: |
( u "name" -- n ; addr -- addr ) | define a character field |
2FIELD: |
( u "name" -- n ; addr -- addr ) | define a double cell field |
FFIELD: |
( u "name" -- n ; addr -- addr ) | define a floating point field |
END-STRUCTURE |
( addr u -- ) | end of structure type |
The FIELD:
word is the same as CELL +FIELD
, CFIELD:
is the same as 1 CHARS +FIELD
and 2FIELD:
is the same as 2 CELLS +FIELD
.
Space for structures can be allocated with BUFFER:
. Fields behave like
variables and can be assigned with !
, ON
and OFF
and fetched with @
.
For example:
BEGIN-STRUCTURE pair ↲
FIELD: pair.first ↲
FIELD: pair.second ↲
END-STRUCTURE ↲
: pair.init DUP pair.first OFF pair.second OFF ; ↲
pair BUFFER: xy ↲
xy pair.init ↲
xy pair.first @ xy pair.second @ AT-XY ↲
Structures can be nested. For example:
BEGIN-STRUCTURE pixel ↲
pair +FIELD pixel.xy ↲
FIELD: pixel.on ↲
END-STRUCTURE ↲
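Fields of a nested structure are reached by chaining the field words. The following sketch assumes the pair and pixel structures defined above. The buffer name px is hypothetical:

```forth
pixel BUFFER: px                 \ reserve space for one pixel structure
10 px pixel.xy pair.first !      \ x = 10
20 px pixel.xy pair.second !     \ y = 20
px pixel.on ON                   \ mark the pixel as on
px pixel.xy pair.first @ px pixel.xy pair.second @ GPOINT   \ draw it
```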
Space for arrays can be allocated with BUFFER:
, for example to store 10 prime
numbers:
10 CELLS BUFFER: primes ↲
This allocates space for 10 primes, but does nothing more. Adding a provision
to automatically index an array is done by creating a BUFFER:
with DOES>
code to return the address of a cell given the array and array index on the
stack:
: cell-array: CELLS BUFFER: DOES> SWAP CELLS + ; ↲
where CELLS BUFFER:
creates a new word and allocates the specified number of
cells for the named array, where BUFFER:
just calls CREATE ALLOT
to define
the name with the reserved space.
We can use cell-array:
to create an array of 10 uninitialized cells, then set
the first entry to 123
for example:
10 cell-array: values ↲
123 0 values ! ↲
A generic array:
word takes the number of elements and size of an element,
where the element size is stored as a cell using ,
followed by * ALLOT
to
reserve space for the array data:
: array: CREATE DUP , * ALLOT DOES> SWAP OVER @ * + CELL+ ; ↲
10 2 CELLS array: factorials ↲
1. 0 factorials 2! 1. 1 factorials 2! 2. 2 factorials 2! ↲
6. 3 factorials 2! 24. 4 factorials 2! 120. 5 factorials 2! ↲
720. 6 factorials 2! 5040. 7 factorials 2! 40320. 8 factorials 2! ↲
362880. 9 factorials 2! ↲
We can add "syntactic sugar" to enhance the readability of the code, for
example using {
and }
to demarcate the array index expression as follows:
: { ; IMMEDIATE \ Does nothing ↲
10 2 CELLS array: }factorials ↲
1. { 0 }factorials 2! 1. { 1 }factorials 2! 2. { 2 }factorials 2! ↲
By making {
immediate, it won't compile to a useless call to the {
execution
token. This array implementation has no array index bound checking.
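If bound checking is wanted, one possible variation stores the element count in front of the element size and tests the index before computing the address. This is a sketch only: the names checked-array: and facts are hypothetical, and the choice of ABORT" to report the error is an assumption.

```forth
: checked-array: ( n size "name" -- ; index -- addr )
  CREATE OVER , DUP , * ALLOT    \ body: count, element size, data
  DOES> ( index addr -- addr' )
    2DUP @ U< 0= ABORT" index out of range"
    CELL+                        \ skip the count
    SWAP OVER @ * + CELL+ ;      \ same address arithmetic as array:

10 2 CELLS checked-array: facts
1. 0 facts 2!     \ stores a double in element 0
1. 10 facts 2!    \ aborts with index out of range
```

The unsigned comparison U< also rejects negative indices, because they appear as large unsigned numbers.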
An array of structures is created in the same way as before, using the
structure as the size parameter for array:
:
BEGIN-STRUCTURE pair ↲
FIELD: pair.first ↲
FIELD: pair.second ↲
END-STRUCTURE ↲
: pair.init DUP pair.first OFF pair.second OFF ; ↲
10 pair array: }pairs ↲
:NONAME 10 0 DO { I }pairs pair.init LOOP ; EXECUTE ↲
This creates an array of 10 pairs and initializes them. To store the value 1
in array location 3 for pair.first
:
1 { 3 }pairs pair.first ! ↲
To store the value 2 in array location 3 for pair.second
:
2 { 3 }pairs pair.second ! ↲
A vocabulary defines a dictionary of words. The default vocabulary is FORTH
.
A new vocabulary is defined with VOCABULARY
. A new vocabulary inherits the
current vocabulary as the parent vocabulary. When a word is not found in the
dictionary of a vocabulary then its parent is searched, the parent of the
parent is searched and so on, lastly the pater familias FORTH
vocabulary.
For example, we can define a GAME
vocabulary with game-related words:
VOCABULARY GAME
The GAME
vocabulary is activated when the GAME
word is executed. This sets
the dictionary search order to find GAME
words first, then the words in its
parent vocabulary, in this case the FORTH
words. To define new GAME
words,
we use DEFINITIONS
after switching to the GAME
vocabulary:
GAME DEFINITIONS
: score ... ;
: play ... score ... ;
The score
and play
words are defined in the GAME
vocabulary and are not
visible to the outside. When we switch back to the FORTH
vocabulary we will
not be able to see the score
and play
words. However, we can define a
play-game
word that calls play
in the GAME
vocabulary as follows:
FORTH DEFINITIONS
: play-game [ GAME ] play [ FORTH ] ;
Note that [ GAME ]
executes GAME
immediately to switch the dictionary
search order to the GAME
vocabulary before compiling PLAY
. Bracketed words
are interpreted, see The [ and ] brackets.
The VOCABULARY
word is legacy Forth. It is obsoleted by wordlists in
Standard Forth. The CURRENT
and CONTEXT
words that are used by
VOCABULARY
and DEFINITIONS
are values in Forth500, not variables as in
legacy Forth. See also vocabulary structure.
WARNING: exercise caution when using FORGET
and MARKER
with
vocabularies. For details, please read the next two sections.
A so-called "marker word" is created with MARKER
. When the word is executed,
it deletes itself and all definitions after it. For example:
MARKER _program_ ↲
...
_program_ ↲
This marks _program_
as the start of our code indicated by the ...
. This
code is deleted by executing _program_
.
A source code file might start with the following code to delete its definitions when the file is parsed again, so that the old definitions are replaced with updated definitions:
[DEFINED] _program_ [IF] _program_ [THEN] ↲
MARKER _program_ ↲
ANEW
is shorter and does the same thing:
ANEW _program_ ↲
WARNING: a deferred word referencing a word that is deleted is no longer executable. Executing it will likely lead to a crash, because the reference is deleted.
WARNING: do not use DEFINITIONS
after MARKER
or after ANEW
, except
when defining a new VOCABULARY
with DEFINITIONS
that will be removed by the
marker. Beware that extending a previous vocabulary with DEFINITIONS
can
lead to a crash later when the marker is executed to delete all definitions:
FORTH DEFINITIONS
ANEW _game_
VOCABULARY GAME
GAME DEFINITIONS
... \ definitions added to GAME
FORTH DEFINITIONS \ problem
: music ... ; \ definition added to FORTH
_game_ \ problem
Executing _game_
or reloading the source code with ANEW _game_
deletes both
GAME
and the music
word, but the FORTH
vocabulary is corrupted afterwards
because music
is still considered part of FORTH
.
This is correct:
FORTH DEFINITIONS
ANEW _game_
: music ... ; \ definition added to FORTH
VOCABULARY GAME
GAME DEFINITIONS
... \ definitions added to GAME
_game_
Besides markers, FORGET name
can be used to remove name
and all
words defined thereafter. To protect the dictionary, forgetting is not
permitted past the address returned by FENCE
. FENCE
is a value that can be
assigned a new boundary in the dictionary to protect from FORGET
. For
example, HERE TO FENCE
protects all previously defined words.
WARNING: a deferred word referencing a word that is deleted is no longer executable. Executing it will likely lead to a crash, because the reference is deleted.
WARNING: exercise caution with FORGET
and vocabularies. Typical usage
should not pose any problems. However, extending a previous vocabulary with
additional DEFINITIONS
after a new vocabulary is defined can lead to a crash
when FORGET
is used to delete words from the new vocabulary:
VOCABULARY GAME
GAME DEFINITIONS
... \ definitions added to GAME
FORTH DEFINITIONS \ problem
: music ... ; \ definition added to FORTH
GAME FORGET GAME \ problem
This deletes both GAME
and music
, but the FORTH
vocabulary is corrupted
afterwards, because music
is still considered part of FORTH
. The music
word is dangling and will be overwritten with noisy data, leading to a crash.
Instead, we should say FORTH FORGET GAME
. This deletes both GAME
and
music
from the FORTH
vocabulary, which was our goal.
The following words can be used to inspect words and dictionary contents:
word | stack effect | comment |
---|---|---|
' |
( "name" -- xt ) | tick returns the execution token of "name" on the stack |
['] |
( "name" -- ; -- xt ) | compiles "name" as an execution token literal xt |
COLON? |
( xt -- flag ) | return TRUE if xt is a : definition |
DEFER? |
( xt -- flag ) | return TRUE if xt is a DEFER |
VALUE? |
( xt -- flag ) | return TRUE if xt is a VALUE |
2VALUE? |
( xt -- flag ) | return TRUE if xt is a 2VALUE |
FVALUE? |
( xt -- flag ) | return TRUE if xt is an FVALUE |
DOES>? |
( xt -- flag ) | return TRUE if xt is created by a word that uses CREATE with DOES> |
MARKER? |
( xt -- flag ) | return TRUE if xt is a MARKER |
>BODY |
( xt -- addr ) | return the addr of the body of execution token xt, usually data |
>NAME |
( xt -- nt ) | return the name token nt of the name of execution token xt |
NAME>STRING |
( nt -- c-addr u ) | return the string c-addr of size u of the name token nt |
NAME> |
( nt -- xt ) | return the execution token of the name token nt |
L>NAME |
( addr -- nt ) | return the name token of the dictionary entry at addr |
LAST-XT |
( -- xt ) | return the execution token of the last defined word |
WORDS |
( [ "name" ] -- ) | displays words in the dictionary, optionally matching part of "name" (case sensitive), when specified |
Words named xxx>yyy
are pronounced "xxx to yyy", words named >xxx
are
pronounced "to xxx" and words named xxx>
are pronounced "xxx from".
See the Standard Forth word list
with pronunciations of all standard words.
To send WORDS
to a printer, check if a printer is connected with PRINTER .
then STDL TO TTY WORDS.
See also the dictionary structure.
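A few of the introspection words in the table above can be tried on a freshly defined word. This is a minimal sketch and the word hi is only an example:

```forth
: hi ." hi" ;
' hi COLON? .                 \ displays -1 (TRUE): hi is a colon definition
' hi >NAME NAME>STRING TYPE   \ displays the name of hi
LAST-XT EXECUTE               \ runs hi, the last defined word
```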
A colon definition can be exited with EXIT
to return to the caller. To
recursively call the current word, use RECURSE
, see recursion.
See also exceptions to THROW
and CATCH
exceptions and to use
ABORT
or ABORT"
to abort and return control to the keyboard.
The next two sections introduce conditional branches and loops.
The immediate words IF
, ELSE
and THEN
execute a branch based on a single
condition:
test IF
executed if test is nonzero
THEN
test IF
executed if test is nonzero
ELSE
executed if test is zero
THEN
These words can only be used in colon definitions.
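As a small illustration of the two forms above, the following sketch defines two words. The names .sign and clamp0 are made up for this example and are not part of Forth500:

```forth
\ display whether n is negative or not (IF ... ELSE ... THEN)
: .sign ( n -- )
  0< IF
    ." negative"
  ELSE
    ." zero or positive"
  THEN ;

\ replace a negative n by zero (IF ... THEN without ELSE)
: clamp0 ( n1 -- n2 )
  DUP 0< IF
    DROP 0
  THEN ;
```

For example, -5 clamp0 leaves 0 on the stack and 7 clamp0 leaves 7.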
The immediate words CASE
, OF
, ENDOF
, ENDCASE
select a branch to
execute by comparing the TOS to the OF
values:
value CASE
case1 OF
executed if value=case1
ENDOF
case2 OF
executed if value=case2
ENDOF
...
caseN OF
executed if value=caseN
ENDOF
executed if no case matched (default branch)
ENDCASE
These words can only be used in colon definitions. The default branch has
value
as TOS, which may be inspected in the default branch, but should not be
dropped. It is common to use the >R
and R>
words to temporarily save the
TOS in the default branch:
ENDOF
>R ... R>
ENDCASE
The stack effects of ...
are transparent to the code that follows ENDCASE
.
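A minimal sketch of both uses of the default branch, assuming the hypothetical word names .digit and digit-name. The first word only inspects the value in the default branch, the second saves it with >R and R> so that a result can be placed below it:

```forth
\ display the name of a digit 0..2, other values take the default branch
: .digit ( n -- )
  CASE
    0 OF ." zero" ENDOF
    1 OF ." one"  ENDOF
    2 OF ." two"  ENDOF
    \ default branch: n is still the TOS and is left for ENDCASE to drop
    ." other: " DUP .
  ENDCASE ;

\ return the name of a digit as a string
: digit-name ( n -- c-addr u )
  CASE
    0 OF S" zero" ENDOF
    1 OF S" one"  ENDOF
    \ default branch: save n, push the result, restore n for ENDCASE
    >R S" other" R>
  ENDCASE ;
```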
Enumeration-controlled do-loops start with the word DO
or ?DO
and end with
the word LOOP
or +LOOP
:
limit start DO
loop body
LOOP
limit start ?DO
loop body
LOOP
limit start DO
loop body
step +LOOP
limit start ?DO
loop body
step +LOOP
These words can only be used in colon definitions. Do-loops iterate from start
up to, but not including, limit
. The DO
loop
iterates at least once, even when start
equals limit
. The ?DO
loop does
not iterate when start
equals limit
. The +LOOP
word increments the
internal loop counter by step
. The step
size may be negative. The +LOOP
terminates if the updated counter equals or crosses the limit.
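The loop forms above can be sketched as follows. The word names are hypothetical, and the word I used in the loop bodies returns the internal loop counter:

```forth
\ display 0 1 2 ... n-1, ?DO skips the loop body entirely when n is 0
: count-up ( n -- )
  0 ?DO I . LOOP ;

\ display the even numbers 0 2 4 ... below n using a step of 2
: evens ( n -- )
  0 ?DO I . 2 +LOOP ;

\ sum the integers 1 to n, the limit n+1 itself is excluded
: sum-to ( n -- sum )
  0 SWAP 1+ 1 ?DO I + LOOP ;
```

For example, 4 sum-to . displays 10.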
A do-loop body is exited prematurely with LEAVE
and ?LEAVE
. The ?LEAVE
word pops the TOS and when nonzero leaves the do-loop, which is a shorthand for
IF LEAVE THEN
.
When exiting from the current colon definition with EXIT
inside a do-loop,
first the UNLOOP
word must be used to remove the loop control values from the
return stack: UNLOOP EXIT
.
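A sketch of an early return from inside a do-loop, assuming a hypothetical word has-char? that searches a string for a character:

```forth
\ return TRUE if char occurs in the string c-addr u, FALSE otherwise
: has-char? ( c-addr u char -- flag )
  ROT ROT                   \ char c-addr u
  OVER + SWAP ?DO           \ loop over addresses c-addr to c-addr+u-1
    DUP I C@ = IF
      DROP TRUE             \ found: replace char by the flag
      UNLOOP EXIT           \ remove loop control values, then return
    THEN
  LOOP
  DROP FALSE ;              \ not found: replace char by FALSE
```

Without UNLOOP, the EXIT would return to whatever address the loop control values on the return stack happen to represent.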
The internal loop counter value can be used in the loop as I
. Likewise, the
second outer loop counter of a loop nest is J
and the third outer loop
counter is K
. These return undefined values when not used within do-loops.
WARNING: Loop control data is placed on the return stack. Therefore, any
values placed on the return stack with >R
invalidate I
, J
and K
.
WARNING: Return stack operations >R
, R@
and R>
cannot be used to pass
values on the return stack from outside a do-loop to the inside, because the
do-loop places control data on the return stack. For example, >R DO ... R@ ... LOOP R>
produces undefined values for R@
.
The words BEGIN
and AGAIN
form a loop that never ends:
BEGIN
loop body
AGAIN
There is no word like LEAVE
to exit a BEGIN
loop. Instead, UNTIL
or
WHILE
with REPEAT
should be used. An EXIT
in a loop will terminate the
loop and return control from the current colon definition.
The words BEGIN
and UNTIL
form a conditional loop which iterates at least
once until the condition test
is nonzero (is true):
BEGIN
loop body
test UNTIL
The words BEGIN
, WHILE
and REPEAT
form a conditional loop that iterates
while the condition test
is true (nonzero):
BEGIN
test WHILE
loop body
REPEAT
The BEGIN
, WHILE
, REPEAT
, UNTIL
and AGAIN
words can only be used in
colon definitions.
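Two small sketches of conditional loops, with hypothetical word names. The first uses WHILE and REPEAT to test before each iteration, the second uses UNTIL to test after each iteration:

```forth
\ greatest common divisor with BEGIN ... WHILE ... REPEAT
: gcd ( u1 u2 -- u3 )
  BEGIN
    DUP WHILE       \ continue while u2 is nonzero
    TUCK MOD        \ replace u1 u2 by u2 and u1 mod u2
  REPEAT
  DROP ;

\ display n n-1 ... 1 with BEGIN ... UNTIL, the body runs at least once
: countdown ( n -- )
  BEGIN
    DUP . 1-
    DUP 1 < UNTIL   \ stop when the counter drops below 1
  DROP ;
```

For example, 12 18 gcd . displays 6.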
A WHILE
loop can be enhanced with additional WHILE
tests to create a
multi-test conditional loop with optional ELSE
:
BEGIN
test1 WHILE
loop body1
test2 WHILE
loop body2
REPEAT
executed if test2 is false (zero)
ELSE
executed if test1 is false (zero)
THEN
The loop body1 and body2 are executed as long as test1 and test2 are
true (nonzero). REPEAT pairs with the nearest WHILE, so if test2 is false
(zero), then the loop exits to the code directly after REPEAT and the ELSE
branch is skipped. If test1 is false (zero), then the loop terminates in the
ELSE branch. Multiple WHILE and optional ELSE branches may be added. Each
additional WHILE requires a THEN after REPEAT.
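A sketch of a two-test loop without ELSE, using a hypothetical word index-of. Both loop exits continue at the same place after THEN:

```forth
\ return the index of the first char in c-addr u, or u when absent
: index-of ( c-addr u char -- index )
  >R 0                             \ c-addr u index, char on return stack
  BEGIN
    2DUP > WHILE                   \ test1: index < u, characters remain
    2 PICK OVER + C@ R@ <> WHILE   \ test2: this character is not char
    1+                             \ loop body: advance to the next index
  REPEAT
  THEN                             \ required by the additional WHILE
  R> DROP NIP NIP ;
```

Using R@ is safe here because a BEGIN loop places no control data on the return stack.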
To understand how and why this works, note that a WHILE
and REPEAT
combination is identical to BEGIN
with an IF
to conditionally execute
AGAIN
:
BEGIN \ BEGIN
test IF \ test WHILE
AGAIN THEN \ REPEAT
The interpretation versus compilation state variable is STATE
. When true
(nonzero), a colon definition is being compiled. When false (zero), the
system is interpreting. The [
and ]
may be used in colon definitions to
temporarily switch to interpret mode.
Some words are always interpreted and not compiled. These words are marked
IMMEDIATE
. The compiler executes an IMMEDIATE
word immediately instead of compiling it. In fact,
Forth control flow is implemented with immediate words that compile conditional
branches and loops.
The immediate [
word switches STATE
to FALSE
and ]
switches STATE
to
TRUE
. This means that [
and ]
can be used within a colon definition to
temporarily switch to interpret mode and execute words, rather than compiling
them. For example:
: my-word [ .( here=) HERE . ] ." executing my-word" CR ;
This example displays here=<CR><address>
when my-word
is compiled and displays
executing my-word
when my-word
is executed.
Note that the immediate .(
word is used to display messages at compile time
or at run time, so we can also write this as follows:
: my-word .( here=) [ HERE . ] ." executing my-word" CR ;
It is a good habit to define words to break up longer definitions, so we can
refactor this by introducing a new word "here"
:
: "here" ." here=" CR HERE . ;
: my-word [ "here" ] ." executing my-word" CR ;
The [
and ]
are not necessary if we make "here"
IMMEDIATE
to execute
immediately:
: [here] ." here=" CR HERE . ; IMMEDIATE
: my-word [here] ." executing my-word" CR ;
Using brackets with [here]
is another good habit as a reminder that we
execute an immediate word when it affects compilation.
This example illustrates how IMMEDIATE
is used. Because displaying
information while compiling is generally considered useful, the .(
word is
marked immediate to display text followed by a CR
during compilation:
: my-word .( compiling my-word) ." executing my-word" CR ;
All control flow words execute immediately to compile conditionals and loops.
To compile values on the stack into literal constants in the compiled code, we
use the LITERAL
, 2LITERAL
and SLITERAL
immediate words. For example,
we can create a variable and use its current value to create a literal
constant:
VARIABLE foo 123 foo ! ↲
: now-foo [ foo @ ] LITERAL ; ↲
456 foo ! ↲
now-foo . ↲
123 OK[0]
The example demonstrates how the current value of a variable is compiled into a
literal in now-foo
.
The 2LITERAL
word compiles double integers (two cells). The SLITERAL
word
compiles strings:
word | stack effect ( before -- after ) | comment |
---|---|---|
LITERAL |
( x -- ; -- x ) | compiles x as a literal |
2LITERAL |
( dx -- ; -- dx ) | compiles dx as a double literal |
FLITERAL |
( F: r -- ; F: -- r ) | compiles r as a floating point literal |
SLITERAL |
( c-addr1 u -- ; -- c-addr2 u ) | compiles the string c-addr1 of size u as a string literal |
[CHAR] |
( "name" -- ; -- char ) | compiles the first character of "name" as a literal |
['] |
( "name" -- ; -- xt ) | compiles "name" as an execution token literal xt |
The SLITERAL
word compiles the string address c-addr1 and size u by
copying the string to code. The copied string is returned at runtime as
c-addr2 u.
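A brief sketch of 2LITERAL and SLITERAL, with hypothetical word names. The values are produced in interpret mode between [ and ] and are then compiled into the definition:

```forth
\ compile a double integer literal
: million ( -- d ) [ 1000000. ] 2LITERAL ;

\ compile a copy of a string as a literal
: greeting ( -- c-addr u ) [ S" hello" ] SLITERAL ;
```

Here greeting behaves like a definition that contains S" hello" directly. SLITERAL is more useful when the string is computed at compile time.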
The [CHAR]
word parses a name and compiles the first character as a literal.
This is the compile-time equivalent of CHAR
. For example, [CHAR] $
is the
same as [ CHAR $ ] LITERAL
. Instead of [CHAR] $
, the short form '$
may
be used.
The [']
word parses the name of a word and compiles the word's execution
token as a literal. This is the compile-time equivalent of '
("tick"). For
example, ['] NOOP
is the same as [ ' NOOP ] LITERAL
.
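A sketch of both words in colon definitions. The names stars, apply-twice and quadruple are made up for this example:

```forth
\ display n asterisks using a character literal
: stars ( n -- ) 0 ?DO [CHAR] * EMIT LOOP ;

\ execute xt twice on x
: apply-twice ( x1 xt -- x2 ) DUP >R EXECUTE R> EXECUTE ;

\ pass the execution token of 2* as a literal
: quadruple ( n1 -- n2 ) ['] 2* apply-twice ;
```

For example, 3 quadruple . displays 12.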
Immediate words cannot be compiled, unless we postpone their execution with
POSTPONE
. The POSTPONE
word parses a name marked IMMEDIATE
and compiles
it to execute when the colon definition executes. If the name is not
immediate, then POSTPONE
compiles the word's execution token as a literal
followed by COMPILE,
, which means that this use of POSTPONE
in a colon
definition compiles code. Basically, POSTPONE
may be used to define words
that compile the postponed words into a definition, acting like macros.
An example of POSTPONE
to compile the immediate word THEN
to execute when
ENDIF
executes, making ENDIF
synonymous to THEN
:
: ENDIF POSTPONE THEN ; IMMEDIATE ↲
An example of POSTPONE
to compile a non-immediate word:
: [MAX] POSTPONE MAX ; IMMEDIATE ↲
: foo [MAX] ; ↲
the result of which is:
: foo MAX ;
Note that [MAX]
is IMMEDIATE
to compile MAX
in the definition of foo
.
Basically, [MAX]
acts like a macro that expands into MAX
. Macros are
useful as immediate words to perform specific operations to compile one or
more words into a definition.
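A macro may also compile more than one word. The following sketch assumes a hypothetical macro [SQUARE] that expands into DUP * in the definition being compiled:

```forth
\ macro that compiles DUP * into the current definition
: [SQUARE] POSTPONE DUP POSTPONE * ; IMMEDIATE

\ cube is compiled as : cube DUP DUP * * ;
: cube ( n1 -- n2 ) DUP [SQUARE] * ;
```

For example, 3 cube . displays 27.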
Forth source input is conditionally interpreted and compiled with [IF]
,
[ELSE]
and [THEN]
words. The [IF]
word jumps to a matching [ELSE]
or [THEN]
if the TOS is zero (i.e. false). When used in colon definitions,
the TOS value should be produced immediately with [
and ]
:
[ test ] [IF]
this source input is compiled if test is nonzero
[THEN]
[ test ] [IF]
this source input is compiled if test is nonzero
[ELSE]
this source input is compiled if test is zero
[THEN]
The [DEFINED]
and [UNDEFINED]
immediate words parse a name. [DEFINED]
returns TRUE
if the name is defined as a word and [UNDEFINED]
returns TRUE
if it is not defined. Otherwise they return FALSE
. For
example, to check if 2NIP
is defined before using it within a colon
definition:
: foo
...
[DEFINED] 2NIP [IF] 2NIP [ELSE] ROT DROP ROT DROP [THEN] ↲
... ;
Likewise, we can define 2NIP
if undefined:
[UNDEFINED] 2NIP [IF] ↲
: 2NIP ROT DROP ROT DROP ; ↲
[THEN] ↲
The interpreter and compiler parse input from two buffers, the TIB
(terminal
input buffer) and FIB
(file input buffer). Input from these two sources is
controlled by the following words:
word | stack effect | comment |
---|---|---|
TIB |
( -- c-addr ) | a 256 character terminal input buffer |
FIB |
( -- c-addr ) | a 256 character file input buffer |
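SOURCE-ID |
( -- 0\|-1\|fileid ) | identifies the input source: 0 for the keyboard, -1 for a string with EVALUATE, otherwise the fileid of the file being loaded |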
SOURCE |
( -- c-addr u ) | returns the current buffer (TIB or FIB ) and the number of characters stored in it |
>IN |
( -- addr ) | a VARIABLE holding the current input position in the SOURCE buffer to parse from |
REFILL |
( -- flag ) | refills the current input buffer from SOURCE-ID , returns true if successful |
The following words parse the current source of input:
word | stack effect ( before -- after ) | comment |
---|---|---|
PARSE |
( char "chars" -- c-addr u ) | parses "chars" up to a matching char, returns the parsed characters as string c-addr u |
\"-PARSE |
( char "chars" -- c-addr u ) | same as PARSE but also converts escapes to raw characters in c-addr u, see S\" in string constants |
PARSE-WORD |
( char "chars" -- c-addr u ) | same as PARSE but skips all leading matching char first |
PARSE-NAME |
( "name" -- c-addr u ) | parses a name delimited by blank space, returns the name as a string c-addr u |
WORD |
( char "chars" -- c-addr ) | an obsolete word to parse a word |
PARSE-NAME
is the same as BL PARSE-WORD
, where BL
is the space character.
The names of words in the dictionary are parsed with PARSE-NAME
. When BL
is used as delimiter, also the control characters, such as CR and LF, are
considered delimiters.
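A sketch of words that parse the input following them, with hypothetical names name-len and say{ chosen for this example:

```forth
\ parse the next blank-delimited name and return its length
: name-len ( "name" -- u ) PARSE-NAME NIP ;

\ parse text up to a closing brace and display it
: say{ ( "text}" -- ) [CHAR] } PARSE TYPE ;
```

For example, name-len hello . displays 5 and say{ hi there} displays hi there.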
The EVALUATE
word combines parsing and execution with a string as the source
input:
word | stack effect | comment |
---|---|---|
EVALUATE |
( ... c-addr u -- ... ) | redirects input to the string c-addr u to parse and execute |
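A minimal sketch of EVALUATE in a colon definition, using a hypothetical word seven. The string is interpreted when seven executes:

```forth
\ interpret the string at run time, leaving the result on the stack
: seven ( -- n ) S" 3 4 +" EVALUATE ;
```

Executing seven . displays 7.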
The following words are available to load Forth source code from files on the E: and F: drives and from the serial COM: port:
word | stack effect ( before -- after ) | comment |
---|---|---|
INCLUDE |
( "name" -- ) | load Forth source code file "name" |
INCLUDED |
( c-addr u -- ) | load Forth source code file named by the string c-addr u |
INCLUDE-FILE |
( fileid -- ) | load Forth source code from fileid |
REQUIRE |
( "name" -- ) | load Forth source code file "name" if not already loaded |
REQUIRED |
( c-addr u -- ) | load Forth source code file named by the string c-addr u if not already loaded |
Forth500 source files must have LF or CRLF line endings.
The following file-related words are available:
word | stack effect ( before -- after ) | comment |
---|---|---|
FILES |
( [ "glob" ] -- ) | lists files matching optional "glob" with wildcards * and ? |
DELETE-FILE |
( c-addr u -- ior ) | delete file with name c-addr u |
RENAME-FILE |
( c-addr1 u1 c-addr2 u2 -- ior ) | rename file with name c-addr1 u1 to c-addr2 u2 |
FILE-STATUS |
( c-addr u -- s-addr ior ) | if file with name c-addr u exists, return ior=0 |
R/O |
( -- fam ) | open file for read only |
W/O |
( -- fam ) | open file for write only |
R/W |
( -- fam ) | open file for reading and writing |
BIN |
( fam -- fam ) | update fam for "binary file" mode access (does nothing) |
CREATE-FILE |
( c-addr u fam -- fileid ior ) | create a new file named c-addr u with mode fam, or truncate an existing file to zero length, returns fileid and ior |
OPEN-FILE |
( c-addr u fam -- fileid ior ) | open existing file named c-addr u with mode fam, returns fileid and ior |
CLOSE-FILE |
( fileid -- ior ) | close file fileid |
READ-FILE |
( c-addr u1 fileid -- u2 ior ) | read up to u1 bytes from fileid into buffer c-addr, returning the number of bytes u2 read and ior |
READ-LINE |
( c-addr u1 fileid -- u2 flag ior ) | read a line into buffer c-addr of size u1 from fileid, returning the number of bytes u2 read, a flag = TRUE when a line was read or FALSE when EOF is reached, and ior |
READ-CHAR |
( fileid -- char ior ) | returns char read from fileid, ior = 257 when EOF is reached |
PEEK-CHAR |
( fileid -- char ior ) | returns the next char from fileid without reading it, ior = 257 when EOF is reached |
WRITE-FILE |
( c-addr u fileid -- ior ) | write buffer c-addr of size u to fileid |
WRITE-LINE |
( c-addr u fileid -- ior ) | write string c-addr of size u and CR LF to fileid |
WRITE-CHAR |
( char fileid -- ior ) | write char to fileid |
FILE-INFO |
( fileid -- ud1 ud2 u1 u2 ior ) | returns for file fileid the current position ud1, file size ud2, file attribute u1 and device attribute u2 |
FILE-SIZE |
( fileid -- ud ior ) | returns file fileid size ud |
FILE-POSITION |
( fileid -- ud ior ) | returns file fileid current position ud |
FILE-END? |
( fileid -- flag ior ) | returns TRUE if the current position in fileid is at the end |
SEEK-SET |
( -- 0 ) | to seek from the start of the file |
SEEK-CUR |
( -- 1 ) | to seek from the current position in the file |
SEEK-END |
( -- 2 ) | to seek from the end of the file |
SEEK-FILE |
( d fileid 0|1|2 -- ior ) | seek file offset d from the start, relative to the current position or from the end |
REPOSITION-FILE |
( ud fileid -- ior ) | seek file offset ud from the start |
RESIZE-FILE |
( ud fileid -- ior ) | resize fileid to ud bytes (cannot truncate files, only enlarge) |
DRIVE |
( -- c-addr ) | returns address c-addr of the current drive letter |
DSKF |
( c-addr u -- du ior ) | returns the free capacity of the drive specified in string c-addr u |
STDO |
( -- 1 ) | returns fileid=1 for standard output to the screen |
STDI |
( -- 2 ) | returns fileid=2 for standard input from the keyboard |
STDL |
( -- 3 ) | returns fileid=3 for standard output to the line printer |
>FILE |
( fileid -- s-addr ior ) | returns file s-addr data for fileid |
FILE>STRING |
( s-addr -- c-addr u ) | returns string c-addr u file name converted from file s-addr data |
Low-level file I/O words return ior to indicate success (zero) or failure with a nonzero file error code.
FILE-INFO
returns the current position ud1, file size ud2, file attribute u1
and device attribute u2. See the PC-E500 technical manual for details on
the attribute values.
If an exception occurs before a file is closed, then the file remains open and
cannot be opened again. Attempting to do so returns error ior=264. The fileid
values of opened files start at 4, which means that the first file opened but
not closed can be closed manually with 4 CLOSE-FILE .
which displays zero when successful.
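One way to avoid leaving a file open is to close it before throwing the ior of the failed operation. The following is a sketch under that assumption. The words save-line, file-length and demo, as well as the file name HELLO.TXT, are hypothetical:

```forth
\ write the string c-addr1 u1 as one line to a new file named c-addr2 u2
: save-line ( c-addr1 u1 c-addr2 u2 -- )
  W/O CREATE-FILE THROW >R   \ fileid is kept on the return stack
  R@ WRITE-LINE              \ leaves the ior of the write
  R> CLOSE-FILE DROP         \ always close the file first
  THROW ;                    \ then throw the write error, if any

\ return the size of the file named c-addr u
: file-length ( c-addr u -- ud )
  R/O OPEN-FILE THROW >R
  R@ FILE-SIZE               \ leaves ud and ior
  R> CLOSE-FILE DROP         \ close before checking for errors
  THROW ;

: demo ( -- )
  S" hello" S" HELLO.TXT" save-line
  S" HELLO.TXT" file-length D. ;
```

If CREATE-FILE or OPEN-FILE fails, no file was opened, so throwing immediately is safe in that case.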
Globs with wildcard *
and ?
can be used to list files on the E: or F:
drive with FILES
, for example:
FILES E:*.* \ list all E: files and change the current drive to E:
FILES \ list all files on the current drive
FILES *.FTH \ list all FTH files on the current drive
FILES PROGRAM.* \ list all PROGRAM files with any extension on the current drive
FILES PROGRAM.??? \ same as above
Up to one *
for the file name may be used and up to one *
for the file
extension may be used. Press BREAK or C/CE to stop the FILES
listing.
The FILES
word repeatedly calls FIND-FILE
that uses glob patterns to
provide information about files:
word | stack effect ( before -- after ) | comment |
---|---|---|
FIND-FILE |
( c-addr u1 u2 -- c-addr u1 u2 s-addr u3 ior ) | returns the s-addr of a file with directory index u3>=u2 that matches the string pattern c-addr u1 |
When a drive letter is used with a filename or glob pattern, the specified drive becomes the current drive. To change the current drive letter:
'F DRIVE C!
The PC-E500 drive names associated with devices are:
drive name | fam | comment |
---|---|---|
STDO: / SCRN: | W/O | LCD display |
STDI: / KYBD: | R/O | keyboard |
STDL: / PRN: | W/O | printer |
COM: | R/W | serial IO (SIO) |
CAS: | R/W | tape |
E: | R/W | internal RAM disk |
F: | R/W | external RAM disk |
G: | R/O | internal ROM disk |
X: | R/W | external FDD |
The first three devices are always accessible with the STDO
, STDI
and
STDL
words that return the corresponding fileid. STDL
is usable after
connecting and checking the status of the printer with PRINTER
, see also
printing.
The following words load raw binary data or text and Forth source code from "tape" (any audio device) via a SHARP CE-126P printer and cassette interface or a CE-124 cassette interface:
word | stack effect | comment |
---|---|---|
TAPE |
( -- addr u ior ) | load data (binary or text) from tape into free dictionary space, returning data addr of size u |
CLOAD |
( -- ) | read Forth source from tape |
TAPE
stores the raw tape data in free space located directly below the
floating point stack. When data was successfully loaded, zero is returned with
the address and size of the data. Otherwise a nonzero ior file
error code is returned with incomplete data. The data stored in
free space is not persistent and may be overwritten when the dictionary grows.
CLOAD
calls TAPE
THROW
and EVALUATE
. Because TAPE
saves the tape
data to the free space in the dictionary, the Forth source code should not
initially allocate large chunks of the dictionary with ALLOT
. Doing so may
overwrite the Forth source code stored in the remaining free space and may
cause strange compilation errors.
To transfer data and Forth source code from a host PC to the PC-E500(S), a wav file of the data or Forth source code should be created with the popular PocketTools and then "played" on the audio output to transmit the file to the PC-E500(S) via a cassette interface:
$ bin2wav --pc=E500 --type=bin -dINV FILE.FTH
$ afplay FILE.wav
Use maximum volume to play the wav file or close to maximum to avoid
distortion. If -dINV
does not transfer the file, then try -dMAX
. The
bin2wav
tool reports the "start address" and "end address", which are not
relevant and can be ignored.
The following tcopy
definition copies tape data to a new file on the E: or F:
drive:
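: tcopy ( "filename" -- )
PARSE-NAME W/O CREATE-FILE THROW >R \ create the file, fileid on the return stack
TAPE ?DUP IF \ load tape data: addr u ior
R> CLOSE-FILE DROP THROW \ tape error: close the file and throw ior
THEN
R@ WRITE-FILE \ write addr u to the file, leaves ior
R> CLOSE-FILE DROP \ always close the file
THROW ; \ throw the write error, if any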
Executing tcopy FLOATEXT.FTH
copies the Forth source transmitted from tape to
the new file FLOATEXT.FTH
on the current drive.
File operations return the following ior error codes, where ior=0 means no error:
code | error |
---|---|
256 | an error occurred in the device and aborted ($00) |
257 | the parameter is beyond the range ($01) |
258 | the specified file does not exist ($02) |
259 | the specified pass code does not exist ($03) |
260 | the number of files to be opened exceeds the limit ($04) |
261 | file processing is not permitted ($05) |
262 | ineffective file handle was attempted (invalid fileid argument) ($06) |
263 | processing is not specified by open statement ($07) |
264 | the file is already open ($08) |
265 | the file name is duplicated ($09) |
266 | the specified drive does not exist ($0a) |
267 | error in data verification ($0b) |
268 | processing of byte number has not been completed ($0c) |
510 | fatal low battery ($fe) |
511 | break key was pressed ($ff) |
The ior code is the FCS error code listed on page 5 of the PC-E500(S) technical manual plus 256.
word | stack effect ( before -- after ) | comment |
---|---|---|
ABORT |
( ... -- ... ) | unconditionally abort execution and throw -1 |
ABORT" |
( "string" -- ; ... x -- ... ) | if x is nonzero, display "string" message and throw -2 |
QUIT |
( ... -- ... ) | throw -56 |
THROW |
( ... x -- ... ) or ( 0 -- ) | if x is nonzero, throw x else drop the 0 |
CATCH |
( xt -- ... 0 ) or ( xt -- x ) | execute xt, if an exception x occurs then restore the stack and return x, otherwise return 0 |
' |
( "name" -- xt ) | tick returns the execution token of "name" on the stack |
['] |
( "name" -- ; -- xt ) | compiles "name" as an execution token literal xt |
ABORT
, ABORT"
and QUIT
return control to the keyboard to enter commands.
Note that test ABORT" test failed"
throws -2 if test
leaves a nonzero on
the stack. This construct can be used to check return values and perform
assertions on values in the code.
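A sketch of such an assertion, using a hypothetical word ?positive that passes its argument through unless it is zero or negative:

```forth
\ abort with a message unless n is positive, otherwise leave n unchanged
: ?positive ( n -- n )
  DUP 0> 0= ABORT" value must be positive" ;
```

For example, 5 ?positive leaves 5 on the stack, whereas -1 ?positive throws -2.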
The CATCH
word executes the execution token xt on the stack like EXECUTE
,
but catches exceptions thrown. If an exception x is thrown, then the stack
has the state before CATCH
with xt removed and the nonzero exception code
x as the new TOS. Otherwise, a zero is left on the stack. For example:
: try-divide ( dividend divisor -- )
['] / CATCH IF ↲
." cannot divide by zero" 2DROP \ remove dividend and divisor ↲
ELSE ↲
." result=" . ↲
THEN ; ↲
9 3 try-divide ↲
result=3 OK[0]
9 0 try-divide ↲
cannot divide by zero OK[0]
CATCH
restores the stack pointers when an exception is thrown, but the stack
values may be changed by the word executed and thus may or may not hold the
original values before CATCH
.
The following example opens a file read-only, reads it in blocks of 256 bytes
into the PAD to display on screen, and closes the file at the end. Errors are
thrown as exceptions. An exception thrown by read is caught to properly close
the file, after which the exception is re-thrown:
VARIABLE fh \ file handle, nonzero when file is open
: open ( c-addr u -- ) R/O OPEN-FILE THROW fh ! ;
: read ( -- len ) PAD 256 fh @ READ-FILE THROW ;
: close ( -- ) fh @ CLOSE-FILE fh OFF THROW ;
: more ( c-addr u -- )
open
BEGIN
['] read CATCH ?DUP IF
close
THROW \ rethrow exception thrown by read
THEN
?DUP WHILE
PAD SWAP TYPE
REPEAT
close CR ;
: test-more ( -- ) S" somefile.txt" ['] more CATCH ABORT" an error occurred" ;
The following Standard Forth exception codes may be thrown by built-in Forth500 words:
code | exception |
---|---|
-1 | ABORT |
-2 | ABORT" |
-3 | stack overflow |
-4 | stack underflow |
-5 | return stack overflow (not checked by Forth500) |
-6 | return stack underflow (not checked by Forth500) |
-7 | do-loops nested too deeply during execution (not checked by Forth500) |
-8 | dictionary overflow |
-9 | invalid memory address (N/A in Forth500) |
-10 | division by zero |
-11 | result out of range |
-12 | argument type mismatch |
-13 | undefined word |
-14 | interpreting a compile-only word |
-15 | invalid FORGET |
-16 | attempt to use zero-length string as a name |
-17 | pictured numeric output string overflow |
-18 | parsed string overflow |
-19 | definition name too long |
-20 | write to a read-only location (N/A in Forth500) |
-21 | unsupported operation (N/A in Forth500) |
-22 | control structure mismatch |
-23 | address alignment exception (N/A in Forth500) |
-24 | invalid numeric argument |
-25 | return stack imbalance (not checked in Forth500) |
-26 | loop parameters unavailable (not checked in Forth500) |
-27 | invalid recursion (not checked in Forth500) |
-28 | user interrupt (BREAK was pressed) |
-29 | compiler nesting (N/A in Forth500) |
-30 | obsolescent feature (N/A in Forth500) |
-31 | >BODY used on non-CREATEd definition (not checked in Forth500) |
-32 | invalid name argument (invalid TO name) |
-33 | block read exception (not supported in Forth500) |
-34 | block write exception (not supported in Forth500) |
-35 | invalid block number (not supported in Forth500) |
-36 | invalid file position (not thrown by Forth500, but throws File errors) |
-37 | file I/O exception (not thrown by Forth500, but throws File errors) |
-38 | non-existent file (not thrown by Forth500, but throws File errors) |
-39 | unexpected end of file (not thrown by Forth500, but throws File errors) |
-40 | invalid BASE for floating point conversion (not checked in Forth500) |
-41 | loss of precision (not checked in Forth500) |
-42 | floating-point divide by zero |
-43 | floating-point result out of range |
-44 | floating-point stack overflow |
-45 | floating-point stack underflow |
-46 | floating-point invalid argument |
-47 | compilation word list deleted (N/A in Forth500) |
-48 | invalid POSTPONE (not checked in Forth500) |
-49 | search-order overflow (N/A in Forth500) |
-50 | search-order underflow (N/A in Forth500) |
-51 | compilation word list changed (N/A in Forth500) |
-52 | control-flow stack overflow (N/A in Forth500) |
-53 | exception stack overflow (N/A in Forth500) |
-54 | floating-point underflow (N/A in Forth500) |
-55 | floating-point unidentified fault (N/A in Forth500) |
-56 | QUIT |
-57 | exception in sending or receiving a character |
-58 | [IF] , [ELSE] , or [THEN] exception |
-256 | execution of an uninitialized deferred word (Forth500) |
The ENVIRONMENT?
word takes a string to return system-specific information
about the Forth500 implementation as required by Standard
Forth ENVIRONMENT?
.
These queries return TRUE
with a value of the indicated type:
query string | type | meaning |
---|---|---|
/COUNTED-STRING |
n | maximum size of a counted string, in characters |
/HOLD |
n | size of the pictured numeric output string buffer, in characters |
/PAD |
n | size of the scratch area pointed to by PAD, in characters |
ADDRESS-UNIT-BITS |
n | size of one address unit, in bits |
FLOORED |
flag | true if floored division is the default |
MAX-CHAR |
u | maximum value of any character in the implementation-defined character set |
MAX-D |
d | largest usable signed double number |
MAX-N |
n | largest usable signed integer |
MAX-U |
u | largest usable unsigned integer |
MAX-UD |
ud | largest usable unsigned double number |
RETURN-STACK-CELLS |
n | maximum size of the return stack, in cells |
STACK-CELLS |
n | maximum size of the data stack, in cells |
FLOATING-STACK |
n | maximum size of the floating point stack, in floats |
MAX-FLOAT |
r | largest usable floating point number |
For example, S" MAX-N" ENVIRONMENT? . .
displays -1
(true) and 32767
.
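Because an unrecognized query returns only FALSE, the flag should be tested before the value is used. A sketch with a hypothetical word .stack-cells:

```forth
\ display the data stack size in cells if the query is recognized
: .stack-cells ( -- )
  S" STACK-CELLS" ENVIRONMENT? IF
    ." data stack cells: " .
  ELSE
    ." unknown query"
  THEN ;
```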
Non-implemented and obsolescent queries (according to the Forth standard)
return FALSE
. Obsolescent queries that return FALSE
but are in fact
available in Forth500:
query string | type | comment |
---|---|---|
CORE |
flag | available |
CORE-EXT |
flag | available |
DOUBLE |
flag | available |
DOUBLE-EXT |
flag | available |
EXCEPTION |
flag | available |
EXCEPTION-EXT |
flag | available |
FACILITY |
flag | available |
FACILITY-EXT |
flag | partly available |
FILE |
flag | available |
FILE-EXT |
flag | available |
FLOATING |
flag | available |
FLOATING-EXT |
flag | INCLUDE FLOATEXT.FTH to complete |
STRING |
flag | available |
TOOLS |
flag | partly available |
TOOLS-EXT |
flag | partly available |
Some public Forth libraries still test these queries. To use these libraries with Forth500, change the library Forth source code to successfully pass obsolescent queries.
The Forth500 dictionary is organized as follows:
low address in the 11th segment $Bxxxx
_________
+--->| $0000 | last link is zero (2 bytes)
^ |---------|
| | 3 | length of "(:)" (1 byte)
| |---------|
| | (:) | "(:)" word characters (3 bytes)
| |---------|
| | code | machine code
| |=========|
+<==>| link | link to previous entry (2 bytes)
^ |---------|
: : :
: : :
: : :
| |=========|
+<==>| link | link to previous entry (2 bytes)
^ |---------|
| | $80+5 | length of "aword" (1 byte) with IMMEDIATE bit set
| |---------|
| | aword | "aword" word characters (5 bytes)
| |---------|
| | code | Forth code and/or data
| |=========|
+<---| link |<--- last link to previous entry (2 bytes)
|---------|
| 7 | length of "my-word" (1 byte)
|---------|
| my-word | "my-word" word characters (7 bytes)
|---------|
| code |<--- LAST-XT Forth code and/or data
|=========|<--- HERE pointer
| hold | hold area for numerical output (40 bytes)
|---------|
| |
| free | unused dictionary space
| space |
| |
|=========|<--- dictionary limit
| |
| float | stack of 120 bytes (10 floats)
| stack | grows toward lower addresses
| |<--- FP stack pointer
|=========|
| |
| data | stack of 256 bytes (128 cells)
| stack | grows toward lower addresses
| |<--- SP stack pointer
|=========|
| |
| return | return stack of 256 bytes (128 cells/calls)
| stack | grows toward lower addresses
| |<--- RP return stack pointer
|_________|<--- $BFC00
high address
A link field points to the previous link field. The last link field at the lowest address of the dictionary is zero.
LAST-XT
returns the execution token of the last definition, which is the
location where the machine code of the last word starts.
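As a sketch, the name of the most recently defined word can be displayed by converting the execution token to a name token with >NAME and then to a string with NAME>STRING. The word .last is a hypothetical name:

```forth
\ display the name of the last defined word
: .last ( -- ) LAST-XT >NAME NAME>STRING TYPE ;
```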
Forth500 is a Direct Threaded Code Forth implementation. Code is either
machine code or starts with a jump or call machine code instruction of 3 bytes,
followed by Forth code (a sequence of execution tokens in a colon definition)
or data (constants, variables, values and other words created with CREATE
).
Immediate words are marked with the length byte high bit 7 set ($80). Hidden
words have the "smudge" bit 6 ($40) set. A word is hidden until successfully
compiled. HIDE
hides the last defined word by setting the smudge bit.
REVEAL
reveals it. Incomplete colon definitions with compilation errors
should never be revealed.
There are two words to search the dictionary:
word | stack effect ( before -- after ) |
---|---|
FIND |
( c-addr -- xt 1 ) if found and word is immediate, ( c-addr -- xt -1 ) if found and not immediate, otherwise ( c-addr -- c-addr 0 ) |
FIND-WORD |
( c-addr u -- xt 1 ) if found and word is immediate, ( c-addr u -- xt -1 ) if found and not immediate, otherwise ( c-addr u -- 0 0 ) |
FIND
takes a counted string c-addr whereas FIND-WORD
takes a string
c-addr of size u to search. The search is case insensitive. Hidden words
are marked "smudged" and not searchable.
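A sketch that uses FIND-WORD with the stack effects listed above to report on a name parsed from the input. The word what? is a hypothetical name:

```forth
\ report whether "name" is defined and whether it is immediate
: what? ( "name" -- )
  PARSE-NAME FIND-WORD       \ xt 1 | xt -1 | 0 0
  DUP 0= IF
    2DROP ." not found"
  ELSE
    0> IF ." immediate" ELSE ." not immediate" THEN
    DROP                     \ discard the xt
  THEN ;
```

For example, what? IF reports an immediate word and what? DUP reports a word that is not immediate.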
:NONAME
and CREATE-NONAME
code has no dictionary entry. The code is just
part of the dictionary space as a block of code without link and name header.
Both words return the execution token of the code.
UNUSED
gives the unused dictionary space size plus the hold area size.
The hold area is used as a temporary buffer for numerical output, such as .
,
U.
, D.
, F.
, and <#
... #>
. Also the OK[n]
prompt overwrites this
area to display the stack depth n
. Otherwise, this space is unused.
Forth500 adopts the Fig Forth vocabulary implementation. The CURRENT
and
CONTEXT
values point to the link cell of a VOCABULARY
word. The CURRENT
pointer is used to define new words in the dictionary of a vocabulary. The
CONTEXT
pointer is used to search words in the dictionary of a vocabulary. A
colon definition sets CONTEXT
to CURRENT
. The DEFINITIONS
word sets
CURRENT
to CONTEXT
.
Assuming the following vocabularies and words are defined:
FORTH DEFINITIONS
VOCABULARY VOCAB-1
VOCAB-1 DEFINITIONS
: ABC ... ;
VOCABULARY VOCAB-2
VOCAB-2 DEFINITIONS
: XYZ ... ;
then the resulting vocabulary tree structure is constructed in the dictionary structure as follows:
_________
| $0000 | last link is zero (2 bytes)
|---------|
| (:) | 3 + "(:)" word length + name defined in FORTH
|---------|
| ... | code
|=========|
: :
: : words with code defined in FORTH
: :
|=========|
(1) | | link to previous entry in FORTH
|---------|
| FORTH | 5 + "FORTH" word length + name defined in FORTH
|---------|
(2) | (4) |<--- CURRENT or CONTEXT = (2) to entry (4) in FORTH
|---------|
| $2041 | Fig Forth kludge (hidden blank name)
|=========|
(3) | (1) | link to entry (1)
|---------|
| ... | word with code defined in FORTH
|=========|
(4) | (3) | link to entry (3)
|---------|
| VOCAB-1 | 7 + "VOCAB-1" word length + name defined in FORTH
|---------|
(5) | (7) |<--- CURRENT or CONTEXT = (5) to entry (7) in VOCAB-1
|---------|
| $2041 | Fig Forth kludge (hidden blank name)
|=========|
(6) | (2) | link to entry (2)
|---------|
| ABC | 3 + "ABC" word length + name defined in VOCAB-1
|---------|
| ... | code
|=========|
(7) | (6) | link to entry (6)
|---------|
| VOCAB-2 | 7 + "VOCAB-2" word length + name defined in VOCAB-1
|---------|
(8) | (9) |<--- CURRENT or CONTEXT = (8) to entry (9) in VOCAB-2
|---------|
| $2041 | Fig Forth kludge (hidden blank name)
|=========|
(9) | (5) | link to entry (5)
|---------|
| XYZ | 3 + "XYZ" word length + name defined in VOCAB-2
|---------|
| ... | code
|=========|
: :
: :
For example, when CONTEXT is (8), a search for ABC starts by searching entry (9) in the VOCAB-2 dictionary, then entry (5), entry (7), and entry (6), where ABC is found as defined in VOCAB-1.
File operations like FILES F:*.* change the current drive letter to the specified drive letter. This changes the DRIVE variable.
We can also change the DRIVE letter by defining a new word that sets DRIVE with C! to the drive letter parsed from the input with PARSE-NAME (we can DROP the length, assuming that a letter was given so that the length is nonzero):
: CHDIR ( "letter" -- ) PARSE-NAME DROP C@ DRIVE C! ;
For example:
CHDIR F
FILES
The SAVE word saves the Forth500 image to a file. You can reload the image later with LOADM from BASIC.
: SAVE ( "name" -- )
PARSE-NAME W/O CREATE-FILE THROW >R
\ determine Forth500 start address and length up to HERE
['] (:) $ff00 AND HERE OVER -
\ create file header using HERE as a temporary buffer
HERE 16 ERASE
\ 255 0 6 1 16 SizeLow SizeHigh 0 StartLow StartHigh Segment 255 255 255 0 15
-1 HERE C!
262 HERE 2+ !
16 HERE 4 + C!
DUP HERE 5 + !
OVER HERE 8 + !
$b HERE 10 + C!
-1 HERE 11 + !
-1 HERE 13 + C!
15 HERE 15 + C!
\ write 16 byte header
HERE 16 R@ WRITE-FILE THROW
\ write Forth500 image from base address up to HERE
R@ WRITE-FILE THROW
\ close the file
R> CLOSE-FILE THROW
;
In Forth500 execute:
SAVE F:MYFORTH.BIN
In BASIC RUN mode execute (assuming memory for Forth500 is still allocated):
> LOADM "F:MYFORTH.BIN"
> CALL &B0000 ' or CALL &B9000 on an unexpanded machine
The greatest common divisor of two integers is computed with Euclid's algorithm in Forth as follows:
: gcd ( n1 n2 -- gcd ) BEGIN ?DUP WHILE TUCK MOD REPEAT ;
The double integer version:
: dgcd ( d1 d2 -- dgcd ) BEGIN 2DUP D0<> WHILE 2TUCK DMOD REPEAT 2DROP ;
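As a short sketch of how gcd can be reused, the least common multiple follows from lcm(n1,n2) = n1 × (n2 / gcd(n1,n2)). The word lcm is hypothetical and not part of Forth500. Non-negative inputs are assumed, and the product can overflow a 16-bit cell:

```forth
: gcd ( n1 n2 -- gcd ) BEGIN ?DUP WHILE TUCK MOD REPEAT ;

\ lcm is a hypothetical companion word, not part of Forth500
: lcm ( n1 n2 -- lcm )
  2DUP gcd      \ -- n1 n2 g
  ?DUP IF       \ g<>0?
    / *         \ -- n1*(n2/g), dividing first keeps the product small
  ELSE
    2DROP 0     \ both inputs are zero: return 0
  THEN ;

12 18 gcd .   \ displays 6
4 6 lcm .     \ displays 12
```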
A Forth500 version of the C rand() function to generate pseudo-random numbers between 0 and 32767:
2VARIABLE seed
: rand ( -- +n ) seed 2@ 1103515245. D* 12345. D+ TUCK seed 2! 32767 AND ;
: srand ( x -- ) S>D seed 2! ;
1 srand
Note: do not use rand for serious applications.
To draw a randomized "starry night" on the 240x32 pixel screen:
: starry-night PAGE 1000 0 DO rand 240 MOD rand 32 MOD GPOINT LOOP ;
Note that Forth500 includes an FRAND floating point random number generator, see floating point arithmetic.
As an example application of rand, let's simulate a Galton board with 400 balls and as many as 100 levels (!) of pegs:
400 VALUE balls \ number of balls to drop
100 VALUE levels \ levels of pegs on the board
120 VALUE middle \ starting point on the screen
Dropping a ball in the board means going left or right at each peg on the board, performing a random walk on the x-axis as it falls down:
: random-walk ( xpos steps -- xpos ) 0 DO rand 1 AND 2* 1- + LOOP ;
Balls accumulate at the bottom on top of each other:
: accumulate ( xpos -- ) 0 BEGIN 2DUP GPOINT? 0= WHILE 1+ REPEAT 1- GPOINT ;
Each ball drops from the middle, makes a random walk, and accumulates making a click sound:
: drop-ball middle levels random-walk accumulate 100 10 BEEP ;
The program repeats for all balls:
: Galton 0 GMODE! PAGE balls 0 DO drop-ball LOOP S" done" PAUSE ;
The square root of a number is approximated with Newton's method. Forth500 includes a floating point FSQRT word. This example shows how the Newton-Raphson method is used to efficiently compute the square root of an integer without FSQRT.
Given an initial guess x for f(x) = 0, an improved guess is x' = x - f(x)/f'(x). This is iterated with x=x' until convergence.
To compute sqrt(a), let f(x) = x^2 - a to find the answer x with Newton's method such that f(x) = 0. Therefore, x' = x - (x^2-a)/(2 x) = (x + a/x)/2.
Because we operate with integers, the convergence check should consider the previous two estimates of x to avoid oscillation. The outline of the algorithm is:
y = 1 \ estimate before the previous estimate
x = 1 \ previous estimate
begin
x' = (x+a/x)/2 \ improved estimate
while x'<>x and x'<>y \ convergence?
y = x \ update estimates
x = x'
again
This algorithm assumes that a is positive. Negative a are invalid and may raise a division by zero exception. If a is zero, then the first improved estimate is x' = 0 and the next division a/x raises a division by zero exception, which we avoid by returning zero right away if a is zero.
To implement the algorithm in Forth, we place a on the return stack, because we only need a to compute x'. We place y and x on the stack and compute the new estimate x' as the TOS above them:
: sqrt ( n -- sqrt )
DUP IF \ if a<>0
>R \ move a to the return stack
1 1 \ -- y x where y=1 and x=1 initially
BEGIN
R@ \ -- y x a
OVER / \ -- y x (a/x)
OVER + 2/ \ -- y x x' where x'=(a/x+x)/2
ROT \ -- x x' y
OVER <> WHILE \ while x'<>y
2DUP <> WHILE \ and also while x'<>x
REPEAT THEN
DROP \ -- x
RDROP \ drop a from the return stack
THEN ;
Note that the second WHILE requires a THEN after REPEAT. For an explanation of this multi-WHILE structure, see loops.
A minor issue is the potential integer overflow to a negative value in (a/x+x) before dividing by 2. This can lead to all sorts of problems, such as non-termination of the loop. This problem can be remedied by replacing 2/ with an unsigned division by 2, using 1 RSHIFT.
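The remedy can be sketched as follows. This is the same sqrt definition with 1 RSHIFT substituted for 2/ and nothing else changed, so it is an illustration of the suggested fix rather than a separately tuned implementation:

```forth
\ same algorithm as sqrt, with 1 RSHIFT in place of 2/
: sqrt ( n -- sqrt )
  DUP IF
    >R
    1 1
    BEGIN
      R@ OVER /
      OVER + 1 RSHIFT   \ unsigned halving of (a/x+x)
      ROT
    OVER <> WHILE
    2DUP <> WHILE
    REPEAT THEN
    DROP
    RDROP
  THEN ;

10 sqrt .   \ displays 3, estimates are 1, 5, 3, 3
16 sqrt .   \ displays 4, estimates are 1, 8, 5, 4, 4
0 sqrt .    \ displays 0
```

For non-negative sums 1 RSHIFT and 2/ give the same result, so small inputs behave exactly as before.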
The double integer square root implementation:
: dsqrt ( d -- dsqrt )
2DUP D0<> IF
2>R
1. 1.
BEGIN
2R@
2OVER D/
2OVER D+ 2. D/
2ROT
2OVER D<> WHILE
2OVER 2OVER D<> WHILE
REPEAT THEN
2DROP
RDROP RDROP
THEN ;
This example shows how a temporary array of cells is created and how various looping constructs are used to generate prime numbers.
To create a temporary array of cells we will use ALLOT, but only at runtime. Afterwards we will destroy the array and release memory back to the dictionary.
The address of our temporary array of cells is stored in value array:
0 VALUE array
Allocating and clearing the array is performed with calloc:
: calloc ( n -- ) HERE TO array 2* DUP ALLOT array SWAP ERASE ;
Note that we store HERE in array, which is the address where our temporary array starts. Destroying the temporary array is performed by destroy, which computes the negative memory size to release with ALLOT:
: destroy ( -- ) array HERE - ALLOT ;
Note that this is only safe if we destroy after calloc without defining any new words in between. Fetching and storing a cell value in the array is performed with the prime@ and prime! words, which take an array index i as a parameter:
: prime@ ( i -- x ) 2* array + @ ;
: prime! ( x i -- ) 2* array + ! ;
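A short usage sketch of these words, repeated here so the example stands on its own. It assumes 2-byte cells as in Forth500, which is why the index is scaled with 2*:

```forth
0 VALUE array
: calloc  ( n -- )   HERE TO array 2* DUP ALLOT array SWAP ERASE ;
: destroy ( -- )     array HERE - ALLOT ;
: prime@  ( i -- x ) 2* array + @ ;
: prime!  ( x i -- ) 2* array + ! ;

4 calloc      \ reserve and clear 4 cells at HERE
0 prime@ .    \ displays 0, the cells were cleared by ERASE
7 2 prime!    \ store 7 at index 2
2 prime@ .    \ displays 7
destroy       \ release the 4 cells again
```

There is no bounds checking, so an index outside 0 to 3 in this example reads or overwrites free dictionary space.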
With these definitions, we can write a prime number filter to produce n primes:
: filter ( n -- )
2 .
3 DUP 0 prime!
DUP .
SWAP \ -- 3 n
1 ?DO
BEGIN
2+ \ -- maybeprime
I 0 ?DO
I prime@ \ -- maybeprime prime
2DUP DUP * \ -- maybeprime prime maybeprime prime*prime
< IF
DROP TRUE LEAVE \ -- maybeprime true
ELSE
OVER SWAP \ -- maybeprime maybeprime prime
MOD 0= IF
FALSE LEAVE \ -- maybeprime false
THEN
THEN
LOOP \ -- maybeprime isprime
UNTIL
DUP .
DUP I prime!
LOOP
DROP ;
This word definition is a bit long, longer than we usually want in Forth. But the innermost DO-LOOP requires the I index of the outer DO-LOOP to run. The outer DO-LOOP counts the prime number index, from 1 to n-1. The inner BEGIN-UNTIL loop produces the next prime number by checking the current value (starting with 3 incremented to 5 to check first) against the previous prime numbers, as is done in the innermost DO-LOOP. The innermost DO-LOOP always terminates with a LEAVE that jumps out of the loop.
To display the first n primes (n > 1) we allocate the array, run the filter and destroy the array:
: primes ( n -- ) DUP calloc ['] filter CATCH destroy THROW ;
Note that CATCH destroy THROW ensures that we always destroy the array, even when an exception occurs, such as pressing BREAK.
The Sieve of Eratosthenes is a famous prime number generator that marks off all the multiples of primes from a "grid" such that only the prime numbers remain.
In Forth we mark bits in an array of cells. Each cell has 16 bits. A bit is marked with mark! and checked with marked:
: mark! ( n -- ) DUP 4 RSHIFT 2* array + SWAP 15 AND 1 SWAP LSHIFT OVER @ OR SWAP ! ;
: marked ( n -- x ) DUP 4 RSHIFT 2* array + SWAP 15 AND 1 SWAP LSHIFT SWAP @ AND ; \ x is nonzero if bit n is marked
Note that 4 RSHIFT computes the index into the cell array and 15 AND computes the bit number 0 to 15, followed by 1 SWAP LSHIFT to set a 1 at the corresponding bit position.
Sieving is performed as follows:
: do-sieve ( n -- )
2 .
DUP 3 ?DO
I marked 0= IF
I .
DUP I DO
I mark!
J +LOOP
THEN
2 +LOOP
DROP ;
The outer DO-LOOP iterates from 3 to n in steps of 2. If I is not marked, then it is a prime number and we mark I and all its multiples in the inner DO-LOOP that runs from I to n in steps of J, i.e. the outer loop index which is the prime number.
To display the prime numbers up to n (n > 3) we allocate the array, run the sieve and destroy the array:
: sieve ( n -- ) DUP 15 + 16 / calloc ['] do-sieve CATCH destroy THROW ;
where 15 + 16 / rounds n up to a whole number of 16-bit cells to allocate enough space for the array.
In this example we use Simpson's rule for numerical integration. Simpson's rule approximates the definite integral of a function f over a range [a,b] with 2n summation steps:
I = h/3 × [ f(a) + ∑ᵢ₌₁ⁿ ( 4 f(a + h × (2 i - 1)) + 2 f(a + 2 h i) ) - f(a + 2 h n) ]
where h = (b-a)/(2 n)
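As a hand-worked illustration of this formula, take the smallest case n = 1 (two steps) with f(x) = 1/(x² + 1) over [0,1]. The choice of function and bounds is only an example; the arithmetic follows the formula term by term:

```latex
\begin{aligned}
h &= \frac{b-a}{2n} = \frac{1-0}{2} = 0.5, \qquad
f(0) = 1, \quad f(0.5) = 0.8, \quad f(1) = 0.5 \\
I &= \frac{h}{3}\,\bigl[\, f(a) + 4 f(a+h) + 2 f(a+2h) - f(a+2h) \,\bigr] \\
  &= \frac{0.5}{3}\,\bigl[\, 1 + 3.2 + 1 - 0.5 \,\bigr]
   = \frac{0.5}{3} \times 4.7 \approx 0.7833
\end{aligned}
```

The exact value is π/4 ≈ 0.7854, so even two steps give about two correct digits. The subtracted term f(a + 2 h n) = f(b) corrects for the last 2 f(b) in the sum, leaving a weight of 1 for f(b).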
First we define the function to integrate as a deferred word in Forth, which means we can assign it later any given function y=f(x) to integrate:
DEFER integrand ( F: x -- y )
Next, we define three variables to hold x = a+h × (2 i - 1), h=(b-a)/(2 n) and the partial sum:
FVARIABLE x
FVARIABLE h
FVARIABLE sum
Note that Forth doesn't care if you redefine x later, because x and the other variables remain visible to integrate as a form of static scoping. Thus, x, h and sum are essentially local variables of integrate.
Variables x and h are initialized with a and (b-a)/(2 n), respectively, where a and b are on the floating point stack and n is on the regular stack:
: init ( F: a b -- ; n -- ) FOVER F- 2* S>F F/ h F! x F! 0e sum F! ;
In the following definition we advance x to its next value and return the updated value on the floating point stack to use right away:
: nextx ( F: -- x ) x F@ h F@ F+ FDUP x F! ;
To accumulate the sum, we multiply y=f(x) by the FP TOS (4e or 2e) and add it to sum:
: *sum+! ( F: y r -- ) F* sum F@ F+ sum F! ;
The integration proceeds by first dividing the number of steps by 2 to get n, then setting x to a and h to (b-a)/(2 n) with init, and setting the sum to f(a) before the summation loop:
: integrate ( F: a b -- I ; 2n -- )
2/ DUP init
x F@ integrand 1e *sum+!
0 ?DO
nextx integrand 4e *sum+!
nextx integrand 2e *sum+!
LOOP
x F@ integrand -1e *sum+!
sum F@ h F@ F* 3e F/ ;
Recall that all floating point values must be typed with an exponent e for single precision or d for double precision.
Because Forth500 internally switches to double precision if any of the operands of an arithmetic operation are double precision, the function to integrate or the integration bounds may use double precision to produce a double precision result. The double precision integration result is not affected by the use of the single precision weight values, such as 1e, 2e, 3e and 4e, in the integrate definition.
Let's integrate f(x)=1/(x² + 1) over [0,1] with 2 n = 10 steps:
6 SET-PRECISION ↲
:NONAME FDUP F* 1e F+ 1e FSWAP F/ ; IS integrand ↲
0e 1e 10 integrate F. ↲
0.785398 OK[0]
We set the precision to 6 digits to display the result with F. and we defined an anonymous function with :NONAME as the integrand to integrate.
With double precision floating point and 100 steps:
20 SET-PRECISION ↲
0d 1d 100 integrate F. ↲
0.7853981633974 OK[0]
This example demonstrates how easy it is to switch to double precision. But this is not very useful with Simpson's rule of integration. The precision of the result is determined by Simpson's approximation and the number of steps performed, rather than by the use of higher precision floating point values.
Because Forth500 internally operates with BCD (Binary-Coded Decimal) floating point values, the numerical result of this example differs slightly from implementations that internally use IEEE 754 floating point values.
This example is an implementation with string buffers residing in the dictionary, which is pretty standard practice in Forth. Each buffer includes the maximum length of the string as the first byte followed by the actual length of the string in the second byte. The string contents follow these two bytes. This implementation is safer than simpler implementations that do not store the maximum string buffer size and thus have no protections against buffer overflows.
In this example we keep our definitions short and concise by reusing words as much as possible to avoid unnecessary complexity.
We first define four auxiliary words to obtain the max length, the current length, the unused space and to set a new length limited by the max length:
: strmax ( string -- max ) 2- C@ ;
: strlen ( string -- len ) 1- C@ ;
: strunused ( string -- unused ) DUP strmax SWAP strlen - ;
: strupdate ( string len -- ) OVER strmax UMIN SWAP 1- C! ;
Note that we used UMIN to prevent negative string lengths (MIN is signed). A string value on the stack is an address that points right after the max and length bytes to the string contents stored in a string buffer.
The following string: word creates a string buffer given a maximum length:
: string: ( max "name" -- ; -- string len )
CREATE DUP C, 0 C, ALLOT
DOES> 2+ DUP strlen ;
Let's define a name to store up to 30 characters:
30 string: name ↲
The string name returns the string address of its first character and the length of the string. This makes it simpler to use our strings as the usual constant string arguments passed to standard Forth words, such as TYPE:
name TYPE ↲
OK[0]
This displays nothing because the string is initially empty.
To safely copy a (constant) string to a string buffer by limiting the number of characters copied to guard against overflowing the buffer:
: strcpy ( c-addr u string len -- )
DROP DUP ROT strupdate \ set the new length
DUP strlen CMOVE ;
For example:
S" John" name strcpy ↲
name TYPE ↲
John OK[0]
To safely concatenate a string to another by limiting the number of characters appended to guard against overflowing the buffer:
: strcat ( c-addr u string len -- )
>R \ save the old length
SWAP OVER strunused UMIN \ limit the added length
2DUP R@ + strupdate \ set the new length = old length + added
SWAP R> + SWAP CMOVE ;
For example:
S"  Doe" name strcat ↲
name TYPE ↲
John Doe OK[0]
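The same length bookkeeping can be used to append a single character. The sketch below repeats the auxiliary words so that it stands on its own; strputc is a hypothetical word that is not part of the original example, and silently ignoring a character when the buffer is full is a design choice made here:

```forth
: strmax    ( string -- max )    2- C@ ;
: strlen    ( string -- len )    1- C@ ;
: strunused ( string -- unused ) DUP strmax SWAP strlen - ;
: strupdate ( string len -- )    OVER strmax UMIN SWAP 1- C! ;
: string:   ( max "name" -- ; -- string len )
  CREATE DUP C, 0 C, ALLOT
  DOES> 2+ DUP strlen ;

\ strputc is a hypothetical word: append one char if there is room
: strputc ( char string len -- )
  OVER strunused 0= IF   \ is the buffer full?
    2DROP DROP EXIT      \ yes: ignore the char
  THEN
  2DUP 1+ strupdate      \ set the new length = len+1
  + C! ;                 \ store the char at string+len

3 string: tag
CHAR a tag strputc
CHAR b tag strputc
CHAR c tag strputc
CHAR d tag strputc   \ ignored, the buffer holds 3 characters max
tag TYPE             \ displays abc
```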
Forth words that work with constant strings, such as TYPE, SEARCH and S=, also work with our string buffers:
name S" Do" SEARCH . TYPE ↲
-1 Doe OK[0]
S" John" name 4 MIN S= . ↲
-1 OK[0]
We can also accept user input into a string:
: straccept ( string len -- ) DROP DUP DUP strmax ACCEPT strupdate ;
: stredit ( string len -- )
>R DUP strmax R> \ -- string max len
DUP \ place cursor at the end (=len)
0 \ allow edits to begin at position 0 (no prompt)
EDIT strupdate ;
For example:
name straccept ↲
John ↲
name stredit ↲
Doe ↲
name TYPE ↲
John Doe OK[0]
The NEXT-CHAR word slices off the first character of a string by incrementing the address and decrementing the length by one:
name NEXT-CHAR EMIT CR TYPE ↲
J
ohn Doe OK[0]
The /STRING ("slash string") word advances the string address and reduces the string length by the given amount:
name 5 /STRING TYPE ↲
Doe OK[0]
We can define a word to slice strings. Slicing a substring from a (constant) string returns the (constant) substring address and substring length:
: slice ( c-addr1 u1 pos len -- c-addr2 u2 )
>R \ save len
OVER UMIN \ -- c-addr u1 pos where pos is limited to u1
TUCK \ -- c-addr pos u1 pos
- R> UMIN \ -- c-addr pos len where pos+len is limited to u1
>R + R> ;
where pos and len take a slice from string c-addr1 u1 to return the substring c-addr2 u2 located in c-addr1 at position pos with length len. If pos exceeds the string length u1 then u2=0. If pos+len exceeds the string length u1 then u2<len.
For example:
name 5 3 slice TYPE ↲
Doe OK[0]
Note that we can take slices of slices:
name 4 4 slice 1 2 slice TYPE ↲
Do OK[0]
Slicing can be used to modify a string by copying or concatenating a slice of the string to itself:
name 5 3 slice name strcat ↲
name TYPE ↲
John DoeDoe OK[0]
name 0 8 slice name strcpy ↲
name TYPE ↲
John Doe OK[0]
Inserting and deleting characters can be done with slicing and a temporary buffer, such as the PAD of 256 bytes that can hold a string with up to 254 characters:
: strtmp ( -- string len ) 254 PAD C! PAD 2+ PAD 1+ C@ ;
For example, to copy "John" from name, insert " J." and append " Doe" from name into the string temporary:
name 0 4 slice strtmp strcpy ↲
S"  J." strtmp strcat ↲
name 4 4 slice strtmp strcat ↲
strtmp TYPE ↲
John J. Doe OK[0]
Additional words to convert characters and string buffers to upper and lower case:
: toupper ( char -- char ) DUP [CHAR] a [CHAR] { WITHIN IF $20 - THEN ;
: tolower ( char -- char ) DUP [CHAR] A [CHAR] [ WITHIN IF $20 + THEN ;
: strupper ( string len -- ) 0 ?DO DUP I + DUP C@ toupper SWAP C! LOOP DROP ;
: strlower ( string len -- ) 0 ?DO DUP I + DUP C@ tolower SWAP C! LOOP DROP ;
For example:
name strupper name TYPE ↲
JOHN DOE OK[0]
The following sfield: word adds a string member to a structure:
: sfield: ( u max "name" -- u ; addr -- string len )
CREATE
OVER , \ store current struct size u
DUP , \ store max
+ 2+ \ update struct size += max+2
DOES> ( struct-addr addr -- member-addr )
SWAP OVER @ + \ compute member address
DUP ROT \ -- member-addr member-addr addr
CELL+ @ SWAP C! \ make sure string max is set: -- member-addr
2+ DUP strlen ;
For example an address with a 30 max character street name:
BEGIN-STRUCTURE address ↲
30 sfield: address.street ↲
FIELD: address.number ↲
END-STRUCTURE ↲
: address: address BUFFER: ; ↲
address: home ↲
S" Pleasantville" home address.street strcpy ↲
555 home address.number ! ↲
home address.street TYPE SPACE home address.number ? ↲
Pleasantville 555
To create arrays of (uninitialized) strings:
: sarray: ( size max "name" -- ; index -- string len )
CREATE
DUP , 2+ * ALLOT \ save max and allocate space
DOES> ( index array-addr -- string len )
SWAP OVER @ \ -- addr index max
DUP>R \ save max
2+ * + CELL+ \ address in the array = (max+2)*index+addr+2
R> OVER C! \ make sure the string max is set
2+ \ skip max and len to get to string
DUP strlen ;
To initialize an array element, just strcpy a value to it. For example, to create an array of 10 strings of 16 characters max, then copy "John" into array item 5 (counting from 0):
10 16 sarray: names ↲
S" John" 5 names strcpy ↲
Large arrays of strings aren't very resource efficient, because each string element in the array reserves space. Best is to implement a heap to store strings and use compaction to keep the heap space efficiently used.
Enumerated values can be created with multiple CONSTANT definitions, one for each new enumeration value. We can automate the constant value assignments as follows:
: begin-enum ( -- n ) 0 ;
: enum ( n "name" -- n ) DUP CONSTANT 1+ ;
: end-enum ( n -- ) DROP ;
Such that:
begin-enum
enum red
enum white
enum blue
end-enum
will create the constants red, white and blue with values 0, 1 and 2, respectively. In a similar way we can define a bitmask word using 1 OVER LSHIFT to set the constants to 1, 2, 4, 8 and so on. Bitmasks can be manipulated with the bit operations AND, OR, XOR and INVERT.
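A sketch of such a bitmask word, reusing begin-enum and end-enum. The definition is one possible reading of the hint above, and the constant names bold, italic and underline are assumptions for the example:

```forth
: begin-enum ( -- n ) 0 ;
: end-enum   ( n -- ) DROP ;
\ bitmask defines a constant with bit n set, then moves to the next bit
: bitmask ( n "name" -- n+1 ) 1 OVER LSHIFT CONSTANT 1+ ;

begin-enum
  bitmask bold        \ 1
  bitmask italic      \ 2
  bitmask underline   \ 4
end-enum

bold underline OR .              \ displays 5
bold underline OR italic AND .   \ displays 0, italic is not set
```

With 16-bit cells at most 16 masks can be defined this way.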
If we don't care about the constants as long as they are unique, then another approach is to use the unique dictionary address of a word as the enumeration value. This always works when we never need the actual value of an enumeration. Consider for example an enumeration of colors:
CREATE red
CREATE white
CREATE blue
Each color word returns the address of its definition's body, which contains no data. Because in Forth500 the body of a word is located 3 bytes past the execution token, we can implement a word enum. to display the color name:
: body> ( addr -- xt ) 3 - ;
: enum. ( addr -- ) body> >NAME NAME>STRING TYPE ;
The body> ("body from") word converts the address of the body of a word to its execution token, >NAME converts the execution token to a name token and NAME>STRING returns the string of a name token on the stack. For example:
red enum. ↲
red OK[0]
Another way to implement enumerations is to use the address of a "counted string" as a unique enumeration value:
: red C" the color red" ;
: white C" the color white" ;
: blue C" the color blue" ;
The string of a color word is displayed with COUNT TYPE.
This example shows how Forth encourages a bit of creativity to come up with an approach that is best suited for an application.
"Slurping" a file into memory to process it is typically performed by storing the file's contents in the free dictionary space. The free dictionary space serves as our working area. We could pre-allocate memory with the file size, but in this example we assume that the file size is unknown (e.g. when reading standard input with piped input, keyboard input, etc.). Therefore, slurping a file is done incrementally by reading a chunk at a time.
First we need some variables:
VARIABLE fh \ file handle, nonzero if file is open
VARIABLE fp \ file content pointer, points to start of the slurped file
VARIABLE fz \ file content length
Next, we define slurp:
: slurp ( c-addr u -- c-addr u ) open read close ;
The slurp word takes the file name as a string and returns the file contents as a string. The file open and close words use OPEN-FILE and CLOSE-FILE, respectively, which return an I/O error code ior. We want to throw this error:
: open ( c-addr u -- ) R/O OPEN-FILE THROW fh ! ;
: close fh @ ?DUP IF CLOSE-FILE fh OFF THROW THEN ;
Note that close has a guard to close only open files (omitting the guard ?DUP IF ... THEN is fine too, it just throws an exception, because fileid=0 is invalid and cannot be closed).
Now we can read a file incrementally, "sipping" one block at a time until the last sip is empty:
: read start BEGIN sip 0= UNTIL done ;
To start, we just initialize fp to HERE to point to the free space:
: start HERE fp ! ;
A "sip" allocates and reads up to 100 bytes at a time from the file:
: sip ( -- n ) HERE 100 DUP ALLOT fh @ READ-FILE THROW DUP 100 - ALLOT ;
Note that the second ALLOT with a negative size (= number of bytes read - 100) releases unused space back to the dictionary. The word then returns the number of bytes read.
After repeatedly "sipping", we can compute the length by subtracting fp @ (the starting address) from HERE (the final address), store it in fz, and return the values of fp and fz:
: done ( -- c-addr u ) HERE fp @ - fz ! fp @ fz @ ;
For good measure, when we are all good and done with the file in memory, we should release memory back to the dictionary:
: release fp @ HERE - ALLOT ;
If we decide not to release, then the file remains in memory for later use. Before slurping another file, make sure to save fp and fz to retain access to the file's content stored in memory.
Let's recap and put things in order:
VARIABLE fh \ file handle, nonzero if file is open
VARIABLE fp \ file content pointer, points to start of the slurped file
VARIABLE fz \ file content length
: sip ( -- n ) HERE 100 DUP ALLOT fh @ READ-FILE THROW DUP 100 - ALLOT ;
: start HERE fp ! ;
: done ( -- c-addr u ) HERE fp @ - fz ! fp @ fz @ ;
: read start BEGIN sip 0= UNTIL done ;
: open ( c-addr u -- ) R/O OPEN-FILE THROW fh ! ;
: close fh @ ?DUP IF CLOSE-FILE fh OFF THROW THEN ;
: slurp ( c-addr u -- c-addr u ) open read close ;
: release fp @ HERE - ALLOT ;
Note that definitions without a ( stack effect ) comment have no stack effect.
For example, we can search a text file for string matches, say "TODO":
S" some.txt" slurp S" TODO" SEARCH release . ↲
0 OK[2]
This will display -1 (true) when found and leaves the address of the match with remaining length of the file on the stack, or 0 (false) when not found.
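Any word that takes a c-addr u string can process the slurped contents. As a sketch, the hypothetical word count-lines below counts line feed characters. It assumes that lines end with LF (character code 10), which also covers CR LF line endings:

```forth
\ count-lines is a hypothetical word; it assumes lines end with LF (10)
: count-lines ( c-addr u -- n )
  0 ROT ROT               \ -- n c-addr u with n=0
  0 ?DO                   \ for each character index I
    DUP I + C@ 10 = IF    \ is it a line feed?
      SWAP 1+ SWAP        \ yes: increment n
    THEN
  LOOP
  DROP ;                  \ -- n

\ try it on a small in-memory text: "a" LF "b" LF
CREATE sample 97 C, 10 C, 98 C, 10 C,
sample 4 count-lines .    \ displays 2
```

With the words of this example loaded, the same word can be applied to a file, for instance S" some.txt" slurp count-lines . release (the file name is only an example).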
If the free space in the dictionary is insufficient, then exception -8 will be thrown. If that happens, call close and release to close the file and release memory.
Slurping a file from the E: or F: drive is much simpler. Since the file size is known, we can pre-allocate memory space and gulp the whole file at once into this space. We can make the following changes accordingly:
: data ( -- c-addr u ) fp @ fz @ ;
: gulp fz @ ALLOT data fh @ READ-FILE THROW DROP ; \ drop the number of bytes read
: size fh @ FILE-SIZE THROW DROP fz ! ; \ drop the high cell of the double size
: read start gulp data ;
: slurp ( c-addr u -- c-addr u ) open size read close ;
where size assigns variable fz the file size, gulp reads the whole file at once and data returns the address and size of the file data (i.e. as c-addr u) for convenience.
Problem: file won't open or cannot INCLUDE a file from COM: E: or F:
When BREAK is pressed or an error occurs while files are still open, the file cannot be re-opened until it is closed. Therefore, always close files in your program (which may require an exception handler). On the other hand, you can manually close a file with fileid CLOSE-FILE where fileid is a positive integer between 4 and 16 (1 to 3 are associated with STDO, STDI and STDL, respectively). Therefore, you can try 4 CLOSE-FILE . then 5 CLOSE-FILE . and so on up to 16 to close all files that are still open.
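The manual steps can be wrapped in a loop. The word close-all below is a hypothetical convenience word, not part of Forth500, and it relies on the fileid range 4 to 16 stated above:

```forth
\ close-all is a hypothetical convenience word, not part of Forth500
: close-all ( -- )
  17 4 DO               \ fileid 4 up to and including 16
    I CLOSE-FILE DROP   \ ignore the ior, the file may not be open
  LOOP ;
```

The ior is dropped rather than thrown, because closing a fileid that is not open is expected to fail here.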
And so Forth... by Hans Bezemer
A Beginner's Guide to Forth by J.V. Noble
Thinking Forth by Leo Brodie
Moving Forth by Brad Rodriguez
Forth: The programming language that writes itself
Standard Forth alphabetic list of words
Forth Systems Comparisons by Guy M. Kelly
This document is Copyright Robert A. van Engelen (c) 2021