Bootstrapping : color ] Forth
┄───────────────────────────┄
[ May 24, 2024
Why : color ] Forth ( ?
┄─────────────────────┄
• Dialect of Forth invented
by Chuck Moore circa 1999
• Many sub-variants,
all use color "tags" to dispatch
• Emphasis on making Forth even simpler!
Color Conventions
┄───────────────┄
: define [ execute
] compile { postpone
( comment ~ variable
% tag ^ gray
An Example
┄────────┄
: square ] dup * ;
: cube ] dup square * ;
[ 2 cube .
==> 8
High Level Plan
┄─────────────┄
• Keep it simple
◦ Really understand it
• Make sure it's usable as more than a toy
◦ Could I write an eww like browser in it?
• Write mostly Forth ASAP
• Names are hard : cf ( for now
Language Goals
┄────────────┄
• Runnable on Windows, but keep the complexity out!
• Decide register conventions in Forth
• Use x32 addressing (32-bit addresses, 64-bit data)
• Retain as many of Chuck's quirks as bearable:
◦ : - ] invert
◦ : or ] xor
◦ : if ] dup if
◦ : -if ] dup 0< if
◦ : push ] >r
◦ : pop ] r>
◦ @ ! etc. use word sized indexing
◦ Use A register, @+ !+
◦ Do literals on [ immediate ] to green
Chuck's Steps (I think)
┄─────────────────────┄
• Write editor + core in ASM
• Build up rest of the system
My Steps
┄──────┄
• Devise a text syntax (to allow git + vim)
• Write a converter
• Write a Vim syntax plugin
• Bootstrap from "minimal colorForth"
• Write the rest in colorForth
: color ] Forth "words"
┄─────────────────────┄
STRINGS:
wwww wwww wwww wwww wwww wwww wwww tttt
NUMBERS:
nnnn nnnn nnnn nnnn nnnn nnnn nnnH tttt
H = hex/decimal
t = tag
n = number bits, to be sign extended
w = word bits "shannon encoded"
Tag Syntax Element Color
┄─┄ ┄────────────┄ ┄───┄
15 Commented Number #White / $White
14 Display Macro % Blue
13 Compiler Feedback ^ Grey
12 Variable ~ Magenta
11 COMMENT (caps) (^^ (White) Obsolete
10 Comment (^ (White) Obsolete
9 comment (lower) White
8 Interpreted Number [ #Yellow $Yellow
7 Compile macro call { Cyan
6 Compile number ] #Green / $Green
5 Compile big number ] Green ( unsupported )
4 Compile forth word ] Green
3 Define forth word : Red
2 Interp big number [ Yellow ( unsupported )
1 Interp forth word [ Yellow
0 Word extension ( color of preceding word )
| 0 000 0 10 000 s 8 1100 000 d 16
| 0 001 r 1 10 001 m 9 1100 001 v 17
| 0 010 t 2 10 010 c 10 1100 010 p 18
| 0 011 o 3 10 011 y 11 1100 011 b 19
| 0 100 e 4 10 100 l 12 1100 100 h 20
| 0 101 a 5 10 101 g 13 1100 101 x 21
| 0 110 n 6 10 110 f 14 1100 110 u 22
| 0 111 i 7 10 111 w 15 1100 111 q 23
|
| 1101 000 0 24 1110 000 8 32 1111 000 ; 40
| 1101 001 1 25 1110 001 9 33 1111 001 ' 41
| 1101 010 2 26 1110 010 j 34 1111 010 ! 42
| 1101 011 3 27 1110 011 - 35 1111 011 + 43
| 1101 100 4 28 1110 100 k 36 1111 100 @ 44
| 1101 101 5 29 1110 101 . 37 1111 101 * 45
| 1101 110 6 30 1110 110 z 38 1111 110 , 46
| 1101 111 7 31 1110 111 / 39 1111 111 ? 47
Text Conventions
┄──────────────┄
: define [ execute
] compile { postpone
( comment ~ variable
% tag ^ gray
┄─────────────────────┄
#decimal $hex &screen
Separ`ate Long`ish words
Converter
┄───────┄
• 234 lines of Python
• More complex than I'd like
• Forward and backward
◦ To allow for check-in with Git
| usage: colorize [-h] [-d] [filename]
|
| Convert back and forth from colorForth encoding
|
| positional arguments:
| filename filename to process OR - for stdin
|
| optional arguments:
| -h, --help show this help message and exit
| -d, --decode decode instead of encode
VIM
┄─┄
• Vi-improved
• Developed by Bram Moolenaar
• Supports regular expression
based syntax highlighting
ftdetect/colorforth.vim
┄─────────────────────┄
| autocmd BufRead,BufNewFile *.cfs set filetype=colorforth
syntax/colorforth.vim
┄───────────────────┄
| syntax match Comment /[(][^]* [ `a-z0-9_\-./;'!+@*,?]\+/
| highlight Comment ctermfg=white guifg=#ffffff
|
| syntax match Variable /[~] [ `a-z0-9_\-./;'!+@*,?]\+/
| highlight Variable ctermfg=13 guifg=#ff00ff
|
| syntax match Define /: [ `a-z0-9_\-./;'!+@*,?]\+/
| highlight Define ctermfg=1 guifg=#ff0000
|
| syntax match Execute /[\[] [ #$`a-z0-9_\-./;'!+@*,?]\+/
| highlight Execute ctermfg=11 guifg=#ffff00
syntax/colorforth.vim (cont)
┄──────────────────────────┄
| syntax match Compile /[\]] [ #$`a-z0-9_\-./;'!+@*,?]\+/
| highlight Compile ctermfg=10 guifg=#00ff00
|
| syntax match Postpone /[{] [ `a-z0-9_\-./;'!+@*,?]\+/
| highlight Postpone ctermfg=6 guifg=#00ffff
|
| syntax match Blue /[<] [ `a-z0-9_\-./;'!+@*,?]\+/
| highlight Blue ctermfg=4 guifg=#0000ff
|
| syntax match Gray /[>] [ `a-z0-9_\-./;'!+@*,?]\+/
| highlight Gray ctermfg=8 guifg=#777777
|
| syntax match Block /[@] [0-9]\+/
| highlight Block ctermfg=7 guifg=#cccccc
Bootstrapping
┄───────────┄
• Start with C-interpreted [ execute ( words
• Make machine code [ execute ( words
work via : c2f ( thunk
• Make ] compiled ( words
work via : ,compile ( thunk
• Make ] compiled numbers ( words
work via : literal ( thunk
• Build up core words and macros
• Implement color words and dispatch
• Transition into machine code interpreter
Only executable bootstrap words:
[ 1, 2, 3, , load forth macro
[ Numbers
For Debugging:
[ here .
case 1: /* execute */
addr = find(name, FORTH_DICT);
if (addr) {
CALL(addr);
} else {
switch (name) {
case 0xd3f80000: /* 1, */ *HP++ = *SP++; break;
case 0xd5f80000: /* 2, */ *HP2++ = *SP++; break;
case 0xd7f80000: /* 3, */ *HP2++ = *SP; *HP++ = *SP++ >> 16; break;
case 0xa1ae0000: /* load */ if (load(*SP++)) return 1; break;
case 0xb1896400: /* forth */ DICT = &FORTH_DICT; break;
case 0x8ac84c00: /* macro */ DICT = &MACRO_DICT; break;
case 0xc8828000: /* here */ *--SP = (uint64_t) HP; break;
case 0xea000000: /* . */ PrintNumber(*SP++); break;
default: break;
}
}
break;
&0
( colorf`orth bootst`rappi`ng
% cr [ #1 load #2 load #3 load #4 load
% cr [ #5 load #6 load #7 load #8 load
% cr ( rax - tos
% cr ( rsi - stack pointe`r
% cr ( rdi - heap pointe`r
% cr ( r8 - interna`l vars - parked sp
% cr ( r8 +8 - parked hp
% cr ( r8 +16 - active dictio`nary
% cr ( r8 +24 - forth dictio`nary
% cr ( r8 +32 - macro dictio`nary
% cr ( r8 +40 - entryp`oint
% cr ( r8 +48 - last call
% cr ( rcx - temp
% cr ( rdx - temp
Memory Map
┄────────┄
$8000000 - BASE
$80000FF - Data Stack ↑
$8000100 (0) - Parked Data Stack Pointer
$8000108 (1) - Parked Heap Pointer
$8000110 (2) - Pointer to active vocabulary
$8000118 (3) - Pointer to end of FORTH dictionary
$8000120 (4) - Pointer to end of MACRO dictionary
$8000128 (5) - Entrypoint after bootstrap
$8000130 (6) - Source pointer (after bootstrap)
$8000138 (7) - NOPE (notfound) (after bootstrap)
$8000140 (8) - PrintNumber C-function
$8000148 (9) - RESERVED
$8000200 - Forth Dictionary
$8010000 - Macro Dictionary
$8020000 - Heap
$8100000 - Loaded Blocks
#define HEAP_SIZE (1024 * 1024 * 1024)
#define HEAP_BASE 0x8000000
#define VARS ((uint32_t **) (HEAP_BASE + 0x100))
#define SP (*(uint64_t **) &VARS[0])
#define HP (*(uint8_t **) &VARS[1])
#define HP2 (*(uint16_t **) &VARS[1])
#define DICT (*(uint32_t ***) &VARS[2])
#define FORTH_DICT (VARS[3])
#define MACRO_DICT (VARS[4])
#define ENTRYPOINT (VARS[5])
#define SET_HOOK(i, v) (VARS[i]) = (void *) v;
#define BLOCK_BASE ((int32_t *) (HEAP_BASE + 0x100000))
#define FORTH_START ((uint32_t *) (HEAP_BASE + 0x208))
#define MACRO_START ((uint32_t *) (HEAP_BASE + 0x10008))
#define SP_START ((uint64_t *) (HEAP_BASE + 0xf8))
#define HP_START ((uint8_t *) (HEAP_BASE + 0x20000))
#define C2F 0x96ab0000 /* c2f */
#define CALL(addr) *--SP = (uint64_t) (addr); \
((thunk_t) find(C2F, FORTH_DICT))(&SP);
&1
: c2f
% cr [ $fc 1, ( cld
% cr [ $56 1, ( push rsi
% cr [ $57 1, ( push rdi
% cr [ $c88949 3, ( mov rcx r8
% cr [ $308b49 3, ( mov @r8 rsi
% cr [ $788b49 3, $8 1, ( mov @r8 +8 rdi
% cr [ $ad48 2, ( lodsq
% cr [ $c28948 3, ( mov rax rdx
% cr [ $ad48 2, ( lodsq
% cr [ $d2ff 2, ( call rdx
% cr [ $768d48 3, $f8 1, ( lea @rsi -8 rsi
% cr [ $68948 3, ( mov rax @rsi
% cr [ $308949 3, ( mov rsi @r8
% cr [ $788949 3, $8 1, ( mov rdi @r8 +8
% cr [ $5f 1, ( pop rdi
% cr [ $5e 1, ( pop rsi
% cr [ $c3 1, ( ret
% cr ( now can execu`te words
#define LITERAL 0xa3920ad0 /* literal */
#define COMPILE 0xfd238e20 /* ,comp`ile */
#define VARIABLE 0xc2a2ea00 /* varia`ble */
#define CALL_FIND(name) CALL(find((name), FORTH_DICT));
#define CALL_COMPILE(val) *--SP = (uint64_t) (val); \
CALL_FIND(COMPILE);
#define CALL_LITERAL(val) *--SP = (uint64_t) (val); \
CALL_FIND(LITERAL);
Dictionary
┄────────┄
0, addr, word, addr, word, addr, word
↑
0, addr, word, addr, word MDICT
↑
FDICT
↑
DICT
static void *find(int32_t goal, uint32_t *start) {
while (*start) {
if (goal == (int32_t) (*start)) {
--start;
return (void *) (uint64_t) *start;
}
start -= 2;
}
return 0;
}
&2
: 1,,
% cr [ $50 1, ( push rax
% cr ( push rax , mov ?? al , stosb pop rax
% cr [ $b866 2, $b050 2, ( mov $b050 ax
% cr [ $ab66 2, ( stosw
% cr [ $58 1, ( pop rax
% cr [ $aa 1, ( stosb
% cr [ $b866 2, $58aa 2, ( mov $58aa ax
% cr [ $ab66 2, ( stosw
% cr [ $ad48 2, ( lodsq
% cr [ $c3 1, ( ret
% cr [ macro
: ;
% cr [ $49 1, $30783b 3, ( cmp @r8+`48 rdi
% cr [ $75 1, $4 1, ( jne +4
% cr [ $fb47c6 3, $e9 1, ( movb $e9 @rdi -5
% cr [ $c3 1,, ( ret [ $c3 1, ( ret
: , [ $ab 1,, ( stosd [ $48 1,, $ad 1,, ( lodsq ] ;
% cr [ forth
: ,comp`ile
% cr [ $e8 1,, ( call ?
% cr [ $f82948 3, ( sub rdi rax
% cr [ $e88348 3, $4 1, ( sub $4 rax ] ,
% cr [ $788949 3, $30 1, ( mov rdi @r8 +48 - mark call ] ;
% cr ( now can compi`le words
static int load(int32_t block) {
int32_t *ip = BLOCK_BASE + block * 256;
int32_t count = 256;
while (count) {
int32_t ir = *ip++;
--count;
int32_t tag = ir & 0xf;
int32_t name = ir & ~0xf;
void *addr;
switch (tag) {
case 1: /* execute */
/* NEXT SLIDE ... */
break;
case 3: /* define */ *++(*DICT) = (uint32_t) (uint64_t) HP;
*++(*DICT) = name; break;
case 4: /* compile */ addr = find(name, MACRO_DICT);
if (addr) { CALL(addr); }
else { CALL_COMPILE(find(name, FORTH_DICT)); } break;
case 6: /* compile# */ CALL_LITERAL(ir >> 5); break;
case 7: /* postpone */ CALL_COMPILE(find(name, MACRO_DICT)); break;
case 8: /* execute# */ *--SP = (ir >> 5); break;
case 12: /* variable */ *--SP = ir; CALL_FIND(VARIABLE); break;
default: break;
}
}
return 0;
}
&3
[ macro
: rexw [ $48 1,, ( rexw ] ;
: nipd`up { rexw [ $89 1,, $6 1,, ( mov rax @rsi ] ;
: dup { rexw [ $8d 1,, $76 1,, $f8 1,, ( lea @rsi -8 rsi { nipd`up ] ;
% cr [ forth
: literal { dup rexw [ $c7 1,, $c0 1,, ] , ;
% cr ( now can compi`le litera`ls
% cr ( ---
% cr [ macro
: drop { rexw [ $ad 1,, ] ;
: 1, [ $aa 1,, ( stosb { drop ] ;
: 2, ] $66 1, $ab 1, ( stosw { drop ] ;
: 3, { , rexw ] $cfff 2, ( dec rdi ] ;
: ,2 ] $ab48 2, ( stosq { drop ] ;
: here { dup ] $f88948 3, ( mov rdi rax ] ;
: allot ] $c70148 3, ( add rax rdi { drop ] ;
&4
[ macro
: nip { rexw ] $8768d 3, ( lea @rsi +8 rsi ] ;
: over { dup rexw ] $8468b 3, ( mov @rsi +8 rax ] ;
: aside ] $c18948 3, ( mov rax rcx ] ;
: swap { aside ] $68b48 3, ( mov @rsi rax ] $e8948 3, ( mov rcx @rsi ] ;
: push ] $50 1, ( push rax { drop ] ;
: pop { dup ] $58 1, ( pop rax ] ;
: @r { pop ] $50 1 ( push rax ] ;
: 0 { dup ] $c031 2, ( xor eax,e`ax ] ;
: @' ] $6348 3, ( movsx dword @rax rax ] ;
: 2@' ] $8b48 3, ( mov @rax rax ] ;
: c@ { rexw ] $b60f 3, ( movz`x byte @rax rax ] ;
: @ { rexw ] $850463 3, 0 , ( movsx dword @rax *4 rax ] ;
: 2@ { rexw ] $85048b 3, 0 , ( mov @rax *4 rax ] ;
: !' { aside drop ] $189 2, ( mov eax @rcx { drop ] ;
: 2!' { aside drop ] $18948 3, ( mov rax @rcx { drop ] ;
: c! { aside drop ] $188 2, ( mov al @rcx { drop ] ;
: ! { aside drop ] $8d0489 3, 0 , ( mov eax @rcx *4 { drop ] ;
: 2! { aside drop rexw ] $8d0489 3, 0 , ( mov rax @rcx *4 { drop ] ;
: +! { aside drop ] $8d0401 3, 0 , ( add rax @rcx *4 { drop ] ;
: a! ] $c28948 3, ( mov rax rdx { drop ] ;
: a { dup ] $d08948 3, ( mov rdx rax ] ;
: a+ ] $c2ff48 3, ( inc rdx ] ;
: @+ { dup rexw ] $950463 3, 0 , ( movsx dword @rdx *4 rax { a+ ] ;
: !+ ] $950489 3, 0 , ( mov eax @rdx *4 { a+ drop ] ;
&5
[ macro
: 1+ ] $c0ff48 3, ( inc rax ] ;
: 1- ] $c8ff48 3, ( dec rax ] ;
: 2* ] $e0d148 3, ( shl rax ] ;
: 2/ ] $f8d148 3, ( sar rax ] ;
: - ] $d0f748 3, ( not rax ] ;
: + ] $60348 3, ( add @rsi rax { nip ] ;
: and ] $62348 3, ( and @rsi rax { nip ] ;
: or ] $63348 3, ( xor @rsi rax { nip ] ;
: * { rexw ] $6af0f 3, ( imul @rsi rax { nip ] ;
: */ { aside drop ] $2ef748 3, ( imulq @rsi ] $39f748 3, ( idiv`q @rcx { nip ] ;
: shl { aside drop ] $e0d348 3, ( shl cl rax ] ;
: sar { aside drop ] $f8d348 3, ( sar cl rax ] ;
: shr { aside drop ] $e8d348 3, ( shr cl rax ] ;
: if { rexw ] $f883 3, $74 1, here 0 1, ;
: -if { rexw ] $f883 3, $79 1, here 0 1, ;
: then ] here over - + swap c! ;
: execu`te { aside drop ] $d1ff 2, ( call rcx ] ;
: sysva`r { dup ] $c0894c 3, ( mov r8 rax ] ;
: 2dup { over over ] ;
: 2drop { drop drop ] ;
: negate { - 1+ ] ;
&6
[ forth
: 1, ] 1, ; : , ] , ; : ,2 ] ,2 ;
: here ] here ; : allot ] allot ;
: sys ] sys ;
: nip ] nip ; : dup ] dup ;
: drop ] drop ; : over ] over ; : swap ] swap ;
: 0 ] 0 ; : @ ] @ ; : ! ] ! ;
: 1+ ] 1+ ; : 1- ] 1- ; : 2* ] 2* ; : 2/ ] 2/ ;
: 4* ] 2* 2* ; : 4/ ] 2/ 2/ ; : + ] + ;
: - ] - ; : and ] and ; : or ] or ; : * ] * ; : */ ] */ ;
: shl ] shl ; : shr ] shr ; : sar ] sar ;
: execu`te ] execu`te ;
: varia`ble ] ;
: ' ] pop dup #5 + push 1+ dup @' + #4 + ;
&7
: sys ] 2* sysva`r 4/ + ;
: dict ] #2 sys ;
: fdict ] #3 sys @ 4/ ;
: mdict ] #4 sys @ 4/ ;
: entryp`oint ] #5 sys ;
: thence ] #6 sys ;
: @nope ] #7 sys ;
: @tbl ] #16 + sys ;
: forth ] #3 sys 4* dict ! ;
: macro ] #4 sys 4* dict ! ;
: ?rex ] dup #8 and if $41 1, then drop #7 and ;
: ,psh ] ?rex $50 + 1, ;
: ,pop ] ?rex $58 + 1, ;
: ,mov ] dup #3 shr $48 + 1, $89 1, #7 and $c0 + 1, ;
% cr [ macro
: int3 ] $cc 1, ( int3 ] ;
: invoke ] #2 ,psh #7 ,psh #8 ,psh #9 ,psh #10 ,psh #11 ,psh
% cr ] #0 ,psh { drop ] #9 ,mov { drop ] #8 ,mov { drop ] #2 ,mov { drop ] #1 ,mov #0 ,pop
% cr ] #6 ,psh $d0ff 2, ( call rax ] #6 ,pop
% cr ] #11 ,pop #10 ,pop #9 ,pop #8 ,pop #7 ,pop #2 ,pop ;
% cr [ forth
% cr
: . ] #0 #0 #0 #8 sys @ invoke drop ;
: ? ] dup . ;
: yeep ] #999999 . ;
: go ] ;
: init ] ' go entryp`oint ! ' yeep @nope ! ; [ init
: fill ] push push a! pop pop swap
: fll ] if over !+ 1- fll ; then 2drop ;
&8
: nope ] @nope @ execu`te ;
: ,dict ] #4 dict @ 4/ +! dict @ 4/ @ 4/ ! ;
: nah? ( wd-w`df ] 2dup @ if or #15 - and ; then nip ;
: find ] nah? if drop 1- 1- find ; then drop nip 1- @ ;
: define ] here ,dict ,dict ;
: comp ] dup push mdict find if pop drop execu`te ; then
% cr ] pop fdict find if ,comp`ile ; then nope ;
: *comp ] #5 sar literal ;
: exec ] fdict find if execu`te ; then nope ;
: *exec ] #5 sar ;
: post ] mdict find if ,comp`ile ; then nope ;
: var ] define thence @ literal { ; ] ;
: none ] drop ;
: ,t ] over @tbl ! 1+ ;
: init ] #0
% cr ] ' none ,t ' exec ,t ' none ,t ' define ,t
% cr ] ' comp ,t ' none ,t ' *comp ,t ' post ,t
% cr ] ' *exec ,t ' none ,t ' none ,t ' none ,t
% cr ] ' var ,t ' none ,t ' none ,t ' none ,t
% cr ] drop ;
% cr [ init
: 2col ] #15 and @tbl @ ;
: 1eval ] thence @ push dup 2col execu`te pop thence ! ;
: eval ] if push dup push @ 1eval pop 1+ pop 1- eval ; then drop drop ;
: block ] #256 * #0 sys #-64 + $40000 + + ;
: load ] block #256 eval ;
: thru ] 2dup 1- or if drop push dup push load pop 1+ pop thru ; then 2drop ;
% cr [ #9 load
&9
( now in main interp`reter.
% cr [ #10 load
&10
: squar`e ] dup * ;
: cube ] dup squar`e * ;
: displ`ay ] $2400000 ;
: row ] #320 * displ`ay + ;
: red ] $ff0000 ;
: orange ] $ffcc00 ;
: yellow ] $ffff00 ;
: green ] $ff00 ;
: blue ] $ff ;
: test ] #5 row #100 red fill
% cr ] #10 row #120 orange fill
% cr ] #15 row #140 yellow fill
% cr ] #20 row #160 green fill
% cr ] #25 row #180 blue fill ;
% cr [ test
Size
┄──┄
• 256 lines of C
• 234 lines of Python (converter)
• 259 lines of colorForth
• 74 lines of Makefile
What's Left?
┄──────────┄
• Integrate with Win32 I/O Bridge
• Write an editor!
• Make it simpler
◦ Machine word indexing is nice,
but clashes with bootstraping from C
◦ Drop words I'm not using
◦ Should I have started in ASM?
DEMO
QUESTIONS❓
🙏
Thank you!