Squeezing Forth into 64 Bits

November 16, 2019

Brad Nelson / @flagxor

Caveats

  • Simplicity is in the eye of the beholder.
  • Wise and venerable practitioners have tried many methods.
    And some things that should not have been forgotten were lost.

Forthopedia

Motivation

  • Oct 7, 2019 - macOS 10.15 (Catalina) drops 32-bit support
  • Oct 2019 - Ubuntu 19.10 initially planned to drop 32-bit support
    • The Wine and gaming communities prompted a change to keep supporting some 32-bit packages for now

Motivation

Linus Torvalds <torvalds@linux-foundation.org>
Merge branch 'x86-nuke386-for-linus'
of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull "Nuke 386-DX/SX support" from Ingo Molnar:
 "This tree removes ancient-386-CPUs support and thus zaps
  quite a bit of complexity:

    24 files changed, 56 insertions(+), 425 deletions(-)

  ... which complexity has plagued us with extra work
  whenever we wanted to change SMP primitives, for years.

  Unfortunately there's a nostalgic cost: your old original
  386 DX33 system from early 1991 won't be able to boot modern
  Linux kernels anymore.  Sniff."

I'm not sentimental.  Good riddance.

Motivation

  • SwiftForth for Windows: Windows Vista or later
  • SwiftForth for macOS: Mac OS X 10.6 through macOS 10.14 (requires 32-bit library support)
  • SwiftForth for Linux: Linux 2.6 kernel or later with 32-bit library support

Lamentation of Complexity

  • 64-bit instruction sets are complex, grafted onto 32-bit ones.
  • 64-bit calling conventions are complex, involving alignment and "red zones".
  • 64-bit has become common, even though it is usually less memory efficient, so that a single complex configuration can prevail.
  • Software doesn't need to be this way, but it is, because complexity has accumulated.

What can be made Simpler with 64-bit?

  • Less need for double-length words.
  • 64 bits could encode a fairly complex single-instruction machine?
  • 64 bits is 8 bytes! 8 letters is a lot.

ColorForth

  • Source code encoded in 32-bit words
    • 4-bit tag for color
    • 28 bits of variable-length character codes

ColorForth Character Set

   0 000    0      10 000 s  8    1100 000 d 16
   0 001 r  1      10 001 m  9    1100 001 v 17
   0 010 t  2      10 010 c 10    1100 010 p 18
   0 011 o  3      10 011 y 11    1100 011 b 19
   0 100 e  4      10 100 l 12    1100 100 h 20
   0 101 a  5      10 101 g 13    1100 101 x 21
   0 110 n  6      10 110 f 14    1100 110 u 22
   0 111 i  7      10 111 w 15    1100 111 q 23

1101 000 0 24    1110 000 8 32    1111 000 ; 40
1101 001 1 25    1110 001 9 33    1111 001 ' 41
1101 010 2 26    1110 010 j 34    1111 010 ! 42
1101 011 3 27    1110 011 - 35    1111 011 + 43
1101 100 4 28    1110 100 k 36    1111 100 @ 44
1101 101 5 29    1110 101 . 37    1111 101 * 45
1101 110 6 30    1110 110 z 38    1111 110 , 46
1101 111 7 31    1110 111 / 39    1111 111 ? 47
http://www.greenarraychips.com/home/documents/greg/cf-characters.htm
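
The table above reads as a prefix code: 4-bit codes `0xxx` for the eight most common characters, 5-bit codes `10xxx` for the next eight, and 7-bit codes `11xxxxx` for the rest. Below is a minimal Python sketch of packing a short name into one 32-bit word. The function names, the left-justified packing, and the tag sitting in the low 4 bits are assumptions made for illustration, not details from the slides. Names that overflow 28 bits are simply rejected here.

```python
# Characters in table order: index 0 is the space, index 47 is '?'.
CHARS = " rtoeanismcylgfwdvpbhxuq0123456789j-k.z/;'!+@*,?"


def char_code(ch):
    """Return (code, width_in_bits) for one character of the table."""
    i = CHARS.index(ch)  # raises ValueError for characters outside the set
    if i < 8:
        return i, 4                      # 0xxx
    if i < 16:
        return 0b10000 | (i - 8), 5      # 10xxx
    return 0b1100000 | (i - 16), 7       # 11xxxxx


def pack_word(name, tag):
    """Pack name into the top 28 bits and tag into the low 4 bits (assumed layout)."""
    bits = 0
    nbits = 0
    for ch in name:
        code, width = char_code(ch)
        if nbits + width > 28:
            raise ValueError("name does not fit in 28 bits")
        bits = (bits << width) | code
        nbits += width
    bits <<= 28 - nbits                  # left-justify, pad with zero bits
    return (bits << 4) | (tag & 0xF)
```

Under these assumptions `pack_word("dup", 4)` gives `0xC19B1004`: three 7-bit characters use 21 of the 28 bits, and the tag value 4 is just an example.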

DSSP

  • Forth cousin for the Setun-70 (ternary computer)
  • Key philosophy: "one word of text - one word of machine code"
http://brokestream.com/daf.txt

DSSP

  DSSP            Forth
  [n] IF+ A       [n] 0> IF A THEN
  [n] IF0 A       [n] 0= IF A THEN
  [n] IF- A       [n] 0< IF A THEN
  [n] BR+ A B     [n] 0> IF A ELSE B THEN
  [n] BR- A B     [n] 0< IF A ELSE B THEN
  [n] BR0 A B     [n] 0= IF A ELSE B THEN
  [n] BRS X Y Z
  [n] BR c1 p1  c2 p2 ... cN pN else pN+1
http://brokestream.com/daf.txt

DSSP

  DSSP            Forth
  [ ] RP A        []  BEGIN  A 0 UNTIL
  [n] DO A        [n,0] DO A LOOP

    To leave the loop, 4 break operators can be used:

   DSSP   Forth
    EX    LEAVE
    EX-   0< IF LEAVE THEN
    EX0   0= IF LEAVE THEN
    EX+   0> IF LEAVE THEN
http://brokestream.com/daf.txt

Forth's Character Set

  • Forth uses <# # #> () [] @ DATE&TIME !
  • Why don't we tend to use ` ~ | {} ?
  • Why do some EForths AND with $5F ?
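
One plausible reading of the last question is case folding: AND with $5F clears bit 5 (and bit 7), which maps lower-case letters onto upper-case ones. The sketch below only illustrates that bit arithmetic. The helper name `fold` is hypothetical and is not taken from any eForth source.

```python
def fold(code):
    """Clear bits 5 and 7 of a character code, as AND with $5F does."""
    return code & 0x5F


# Lower case lands on upper case; the 0x20..0x3F range lands on 0x00..0x1F.
examples = {ch: fold(ord(ch)) for ch in "aA{0"}
```

After folding, the 64 codes from 0x20 to 0x5F stay distinct from one another, so only the upper/lower-case distinction is lost.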

Forth's Character Set

  0123456789ABCDEF
0x@ABCDEFGHIJKLMNO
1xPQRSTUVWXYZ[\]^_
2x !"#$%&'()*+,-./
3x0123456789:;<=>?
4x@ABCDEFGHIJKLMNO
5xPQRSTUVWXYZ[\]^_
6x`abcdefghijklmno
7xpqrstuvwxyz{|}~?
Ctrl -- Forth

Forth's Character Set

  • Basically a 6-bit character set!

Forth's Character Set (I lied)

  • F~ ( f-proximate )
  • LOCALS|
  • % ^ _ aren't used
  • $ is widely used for hex

Digit Conversion

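\ Convert a digit value to its character; values above 9 skip 7 codes to reach 'A'.
: DIGIT ( n -- c ) 9 OVER < 7 AND + [CHAR] 0 + ;
\ Convert a character to a digit value; the flag is true if it is valid in base.
: DIGIT? ( c base -- u t )
   >R [CHAR] 0 - 9 OVER <
   IF 7 - DUP 10 < OR THEN DUP R> U< ;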
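
For readers who want to trace the two words step by step, here is a rough Python transliteration. It is a sketch, not a replacement: the names `digit` and `digit_q` are hypothetical, and the 64-bit mask used to imitate the unsigned compare `U<` is an assumption about cell size.

```python
MASK = (1 << 64) - 1  # assumed cell width, used to imitate U<


def digit(n):
    """Digit value to character: skip the 7 codes between '9' and 'A'."""
    return chr(n + (7 if n > 9 else 0) + ord("0"))


def digit_q(c, base):
    """Character to (value, valid) in the given base."""
    u = ord(c) - ord("0")
    if u > 9:
        u -= 7
        if u < 10:
            u = -1  # codes between '9' and 'A' become -1 (all bits set)
    # Unsigned compare: negative values look huge, so they are rejected.
    return u, (u & MASK) < base
```

The 7-character gap in ASCII between `9` and `A` is what forces both the `7 AND +` in DIGIT and the extra range check in DIGIT?.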

Make Life Easier

 0123456789abcdefghijklmnopqrstuvwxyz!"#$%&'()*+,-./:;<=>?@[\]^_
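
With this ordering the digits and letters are contiguous, so digit conversion needs no gap handling. The Python sketch below assumes that a character's position in the line above is its 6-bit code, with the space as code 0. That reading, and the name `digit_value`, are assumptions made for illustration.

```python
# The 64-character set in the order shown on the slide; index = assumed 6-bit code.
CHARSET = " 0123456789abcdefghijklmnopqrstuvwxyz!\"#$%&'()*+,-./:;<=>?@[\\]^_"


def digit_value(ch, base):
    """Return the digit value of ch in base (up to 36), or None if invalid."""
    value = CHARSET.index(ch) - 1  # '0' is code 1, 'z' is code 36
    if 0 <= value < min(base, 36):
        return value
    return None
```

Under this assumption the whole conversion is one subtraction and one range check.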

Even Easier?

  • 6 bits * 10 = 60 bits
  • Use the remaining 4 bits as a tag: string(0), decimal(1), hex(2)
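
A minimal Python sketch of such a 64-bit cell follows. The slides give only the sizes and the tag values, so the bit layout (tag in the low 4 bits, first character just above it), the space-as-padding convention, and all function names are assumptions made for illustration.

```python
CHARSET = " 0123456789abcdefghijklmnopqrstuvwxyz!\"#$%&'()*+,-./:;<=>?@[\\]^_"
TAG_STRING, TAG_DECIMAL, TAG_HEX = 0, 1, 2  # tag values from the slide


def pack_name(text, tag=TAG_STRING):
    """Pack up to 10 characters (6 bits each) plus a 4-bit tag into 64 bits."""
    if len(text) > 10:
        raise ValueError("at most 10 characters fit in one cell")
    cell = tag & 0xF
    for i, ch in enumerate(text):
        cell |= CHARSET.index(ch) << (4 + 6 * i)
    return cell  # unused character slots stay 0, which is the space


def unpack_name(cell):
    """Return (tag, text), dropping the trailing padding spaces."""
    tag = cell & 0xF
    chars = "".join(CHARSET[(cell >> (4 + 6 * i)) & 0x3F] for i in range(10))
    return tag, chars.rstrip(" ")


def pack_number(value, tag=TAG_DECIMAL):
    """Store a 60-bit value with a tag saying how to display it (assumed use)."""
    return ((value & ((1 << 60) - 1)) << 4) | (tag & 0xF)
```

Ten characters at the highest code fill bits 4 through 63 exactly, so every packed cell fits in 64 bits.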

CircleForth

  • 84 circle.fs
  • 85 compound.fs
  • DEMO

ESPForth (AIBOT)

  • 815 espforth.c
  • DEMO

Interfacing with the OS

  • Kernels still use simple register calling conventions
    • But they assume you know complex struct layouts...
    • And hide complexity in binary blob graphics and modem drivers...
    • And they aren't technically a stable ABI
    • And much of the OS is really dynamic library code
  • dlopen / dlsym ?
  • Write your API interface code in C ?
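
To make the dlopen / dlsym option concrete, here is a small Python sketch using the standard `ctypes` module, which wraps the same mechanism: open a library handle, then look up a symbol by name. It assumes a POSIX system where `ctypes.CDLL(None)` opens the running process and its loaded libraries. The helper name `lookup` is hypothetical.

```python
import ctypes


def lookup(symbol_name):
    """Find a C function by name, in the style of dlopen(NULL) + dlsym."""
    process = ctypes.CDLL(None)  # handle for the current process (POSIX only)
    return getattr(process, symbol_name)


# The caller still has to know the C signature; nothing checks it.
strlen = lookup("strlen")
strlen.argtypes = [ctypes.c_char_p]
strlen.restype = ctypes.c_size_t
```

The lookup is one line. Knowing the argument types and struct layouts is where the complexity listed above comes back in.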

Matthew 22:21

Then saith he unto them, Render therefore unto Caesar the things which are Caesar's;
and unto God the things that are God's.

Lead me not into Complexity

  • If you use an OS facility, embrace its details.
  • Try to abstain from doing so!
  • If you want one capability rather than the whole "library", write a wrapper outside Forth.
    • Simplify a GUI to just rendering a bitmap in a window, if all you want is pixels on the screen.
    • Reduce Wi-Fi to running an HTTP server offering remote transport.
  • Running multiple programs can separate concerns.
  • Portability is a profound temptation.

Parser Tug and Pull

  • Conventional Forth mixes two models: interpreting drives parsing, and parsing drives interpreting.
  • Interpreter Driven: WORD, PARSE, CREATE
  • Parser Driven: STATE, Number parsing

ColorForth Solution

  • Mostly Parser Driven
  • Color call table decides how to respond.
  • Numbers are pre-parsed.
  • Downside: more statefulness
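
The parser-driven idea can be sketched as a loop that walks pre-tagged cells and dispatches each one through a table indexed by its tag. The Python below is a toy under stated assumptions: the tag names, the `(tag, payload)` cell shape, and the two-word dictionary are all hypothetical and are not ColorForth's actual tags.

```python
TAG_EXECUTE, TAG_COMPILE, TAG_NUMBER = 1, 2, 3  # hypothetical tag values


def run(cells):
    """Walk (tag, payload) cells; the tag alone decides what happens."""
    stack = []
    compiled = []
    words = {
        "dup": lambda: stack.append(stack[-1]),
        "+": lambda: stack.append(stack.pop() + stack.pop()),
    }
    table = {
        TAG_EXECUTE: lambda name: words[name](),  # run the word now
        TAG_COMPILE: compiled.append,             # record it for later
        TAG_NUMBER: stack.append,                 # numbers arrive pre-parsed
    }
    for tag, payload in cells:
        table[tag](payload)
    return stack, compiled
```

For example, `run([(TAG_NUMBER, 2), (TAG_NUMBER, 3), (TAG_EXECUTE, "+"), (TAG_COMPILE, "dup")])` returns `([5], ["dup"])`. No word ever asks the input stream for more text.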

Coroutines

  • Parsing words yield to the parser.
  • Loading / evaluating words run as the parsing coroutine.
  • [ and ] terminate and restart interpret / compile.
  • Downside: more complexity
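
One way to picture the coroutine arrangement is with a Python generator: the evaluating side feeds tokens in, and a parsing word gets its next token by yielding back to it. This is a toy sketch of the idea, not the design from the talk. The word `char` and the function names are chosen only for illustration.

```python
def interpreter(stack):
    """Coroutine: receives one token per send(); parsing words yield for more."""
    while True:
        token = yield
        if token == "char":
            following = yield              # parsing word: yield to the parser
            stack.append(ord(following[0]))
        elif token == "+":
            stack.append(stack.pop() + stack.pop())
        else:
            stack.append(int(token))


def evaluate(source):
    """Evaluating side: owns the input and drives the interpreter coroutine."""
    stack = []
    co = interpreter(stack)
    next(co)                               # advance to the first yield
    for token in source.split():
        co.send(token)
    return stack
```

Here `evaluate("char A 1 +")` returns `[66]`: `char` consumes `A` by yielding, without ever touching the input buffer itself.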

What I did

  • Similar to ColorForth, mostly parser driven.
  • WIP

slides at: github.com/flagxor

Thank you