Recognizers in ESP32forth
π§
November 9, 2024
Motivation
ββββββββββ
β’ Forth is really good at creating
DSLs (Domain Specific Languages).
β’ Nearly everything can be redefined,
including numbers:
: 42 ." everything" ;
β’ But...
β¦ There's no easy way to add new kinds of numbers.
β¦ Strings in Forth have a weird extra space.
β¦ There's no way for words to implicitly exist
based on other words.
Forth Recognizers
βββββββββββββββββ
β’ Matthias Trute RfD in 2018
β’ Allow extension of the Forth interpreter
β¦ Change what happens while parsing + interpreting
β¦ Compose set of "recognizers" of different types
β¦ Extend literals, add strings, and more
β’ I learned about them from Leon Wagner's Forth Day talk
β’ Homologous to changing colorForth color dispatch
β’ Available, but not documented in ESP32forth
β’ Caveat: I'm ignorant of the standards process
Terminology
βββββββββββ
Recognizer = Word given the chance to handle parsing
( a n -- .. rectype )
Recognizer Sequence / Stack = Ordered set of recognizers
[ n xt0 xt1 xt2 ... ]
Recognizer Type = Dispatch table returned by a recognizer
[ xt0 xt1 xt2 ]
Recognizer Types
ββββββββββββββββ
RECTYPE: ( xt-interpret
xt-compile
xt-postpone
"name" -- )
| STATE @ 2+
|
0 | 1 | 2
ββββββββββΌββββββββββΌββββββββββ
POSTPONE | COMPILE | INTERPRET
: RECTYPE: ( xt-interpret xt-compile xt-postpone "name" -- ) CREATE , , , ;
: do-notfound ( a n -- ) -1 'notfound @ execute ;
( interpret ) ( compile ) ( postpone )
' do-notfound ' do-notfound ' do-notfound RECTYPE: RECTYPE-NONE
' execute ' , ' postpone, RECTYPE: RECTYPE-WORD
' execute ' execute ' , RECTYPE: RECTYPE-IMM
' drop ' execute ' execute RECTYPE: RECTYPE-NUM
: RECOGNIZE ( c-addr len addr1 -- i*x addr2 )
dup @ for aft
cell+ 3dup >r >r >r @ execute
dup RECTYPE-NONE <> if rdrop rdrop rdrop rdrop exit then
drop r> r> r>
then next
drop RECTYPE-NONE
;
Recognizer Stacks/Sequences
βββββββββββββββββββββββββββ
RECSTACK ( -- a )
+RECOGNIZER ( xt -- )
-RECOGNIZER ( -- )
GET-RECOGNIZERS ( -- xtn..xt1 n )
SET-RECOGNIZERS ( xtn..xt1 n -- )
Extra Challenge!
ββββββββββββββββ
β’ Implement recognizers during ESP32forth bootstrap
β’ Make 0, 1, -1 built-in words
β’ Make NL (10) and BL (32) built-in words
β’ Avoid other numbers until we have recognizers!
create RECSTACK 0 , BL 2/ ( 16 ) cells allot
: +RECOGNIZER ( xt -- )
1 RECSTACK +! RECSTACK dup @ cells + ! ;
: -RECOGNIZER ( -- ) -1 RECSTACK +! ;
: GET-RECOGNIZERS ( -- xtn..xt1 n )
RECSTACK @ for RECSTACK r@ cells + @ next ;
: SET-RECOGNIZERS ( xtn..xt1 n -- )
0 RECSTACK ! for aft +RECOGNIZER then next ;
: postpone ( "name" -- )
bl parse RECSTACK RECOGNIZE @ execute ; immediate
: +evaluate1
bl parse dup 0= if 2drop exit then
RECSTACK RECOGNIZE state @ 1+ 1+ cells + @ execute
;
: interpret0 begin +evaluate1 again ; interpret0
( Add regular words )
: REC-FIND ( c-addr len -- xt addr1 | addr2 )
find dup if
dup immediate? if RECTYPE-IMM else RECTYPE-WORD then
else
drop RECTYPE-NONE
then
;
' REC-FIND +RECOGNIZER
( Add integers )
: REC-NUM ( c-addr len -- n addr1 | addr2 )
s>number? if
['] aliteral RECTYPE-NUM
else
RECTYPE-NONE
then
;
' REC-NUM +RECOGNIZER
( Add floating point numbers, later )
: REC-FNUM ( c-addr len -- f addr1 | addr2 )
s>float? if
['] afliteral RECTYPE-NUM
else
RECTYPE-NONE
then
;
' REC-FNUM +RECOGNIZER
Effect on ESP32forth
ββββββββββββββββββββ
β’ PROs:
β¦ Allows removing int/float parsing from core
- In theory, can be done in Forth
β¦ Allows removing int/float parsing from core
β’ CONs:
β¦ Requires opcodes for 0, 1, -1, NL(10), BL(32) and 41=NL+BL-1
β¦ Somewhat complex
PARSE
FIND
CREATE
EVALUATE1
βββββββββββββββββ
S>FLOAT?
S>NUMBER?
βββBEFOREββββ
static cell_t *evaluate1(cell_t *rp) {
cell_t call = 0;
cell_t tos, *sp, *ip;
float *fp;
UNPARK;
cell_t name;
cell_t len = parse(' ', &name);
if (len == 0) { DUP; tos = 0; PARK; return rp; } // ignore empty
cell_t xt = find((const char *) name, len);
if (xt) {
if (g_sys->state && !(*TOFLAGS(xt) & IMMEDIATE)) {
COMMA(xt);
} else {
call = xt;
}
} else {
cell_t n;
if (convert((const char *) name, len, g_sys->base, &n)) {
if (g_sys->state) {
COMMA(g_sys->DOLIT_XT);
COMMA(n);
} else {
PUSH n;
}
} else {
float f;
if (fconvert((const char *) name, len, &f)) {
if (g_sys->state) {
COMMA(g_sys->DOFLIT_XT);
*(float *) g_sys->heap++ = f;
} else {
*++fp = f;
}
} else {
PUSH name;
PUSH len;
PUSH -1;
call = g_sys->notfound;
}
}
}
PUSH call;
PARK;
return rp;
}
βββAFTERββββ
static cell_t *evaluate1(cell_t *rp) {
cell_t call = 0;
cell_t tos, *sp, *ip;
float *fp;
UNPARK;
cell_t name;
cell_t len = parse(' ', &name);
if (len == 0) { DUP; tos = 0; PARK; return rp; } // ignore empty
cell_t xt = find((const char *) name, len);
if (xt) {
if (g_sys->state && !(*TOFLAGS(xt) & IMMEDIATE)) {
COMMA(xt);
} else {
call = xt;
}
} else {
return 0;
}
PUSH call;
PARK;
return rp;
}
Applications
ββββββββββββ
β’ New kinds of literals:
β¦ Complex Numbers - 12e9i-43
β¦ Rational Numbers - 22/7
β¦ Logarithmic Number Systems - ~123
β¦ Strings - "No need for a space"
β¦ Atoms - 'myname
β¦ New base prefixes - b$11010010
β’ General prefix/suffix:
β¦ Store to Values - 123 >myvar
β¦ Defining Syntax - square: dup * ;
Complex Numbers
βββββββββββββββ
β’ Add words that treat 2 floats as a complex number
β’ z+ z- z* z/ etc.
β’ Introduce literals [real]i[imaginary]
--> 123i100 2i0 z* z.
246.000000i200.000000 ok
--> 123i100 2i1 z+ z.
125.000000i101.000000 ok
--> 123i100 1/z z.
0.004894i-0.003979 ok
: z@ ( a -- z ) dup sf@ sfloat+ sf@ ;
: z! ( a -- z ) dup sfloat+ sf! sf! ;
: z, ( z -- ) fswap sf, sf, ;
: zconstant create z, does> z@ ;
: zvariable create 0i0 z, ;
: f>r r> fp@ ul@ fdrop >r >r ;
: r>f r> r> fdup fp@ l! >r ;
: -frot frot frot ;
: zdup fover fover ;
: zswap f>r fswap f>r fswap r>f r>f fswap f>r fswap r>f ;
: zover f>r f>r zdup r>f r>f zswap ;
: 2zdup zover zover ;
: z. ( z -- ) fswap <# #fs #> type ." i" <# #fs #> type space ;
: z+ ( z z -- z ) f>r fswap f>r f+ r>f r>f f+ ;
: z- ( z z -- z ) f>r fswap f>r f- r>f r>f f- ;
: z* ( z z -- z ) 2zdup -frot f* f>r f* r>f f+ f>r
frot f* f>r f* r>f f- r>f ;
: zlen ( z -- f ) fdup f* fswap fdup f* f+ ;
: 1/z ( z -- z ) zdup zlen fdup f>r fswap f>r
f/ r>f fnegate r>f f/ ;
: z/ ( z z -- z ) 1/z z* ;
: azliteral fswap afliteral afliteral ;
: find-char ( a n ch -- a )
swap for aft over c@ over = if drop rdrop exit then >r 1+ r> then next
2drop 0 ;
: iparts? { a n -- 0 | a n a n -1 }
a n [char] i find-char dup 0= if exit then { m }
m 1+ n 1- m a - -
a m a -
-1
;
: rec-z ( a n -- z addr1 | addr2 )
iparts? 0= if rectype-none exit then
2dup s>number? if s>f 2drop else s>float? 0= if rectype-none exit then then
2dup s>number? if s>f 2drop else s>float? 0= if rectype-none exit then then
['] azliteral rectype-num
;
' rec-z +recognizer
"Easy" Assignment
βββββββββββββββββ
β’ Add a syntax for assigning to values.
0 value foo
123 ->foo
β’ Make sure it works in definitions and postponed.
: bar 55 ->foo ;
: baz postpone ->foo ;
: ->ex execute ;
: ->, >r aliteral r> , ;
: ->post >r aliteral r> postpone aliteral postpone, ;
' ->ex ' ->, ' ->post rectype: rectype-to
: prefix? { a n b m -- 0 | a' n' -1 }
n m 1+ < if 0 exit then
m 0 do a i + c@ b i + c@ <> if 0 unloop exit then loop
a m + n m - -1
;
: rec-> ( a n -- )
s" ->" prefix? 0= if rectype-none exit then
find dup 0= if drop rectype-none exit then
>body ['] ! rectype-to
; ' rec-> +recognizer
: rec+-> ( a n -- )
s" +->" prefix? 0= if rectype-none exit then
find dup 0= if drop rectype-none exit then
>body ['] +! rectype-to
; ' rec+-> +recognizer
Logarithmic Number Systems
ββββββββββββββββββββββββββ
β’ Compute on fixed-point logarithms
β’ Use Gaussian Logarithms for add/subtract
log(x) + log(y) = log(x * y)
log(x) - log(y) = log(x / y)
log(x) * n = log(x ^ n)
Gaussian Logarithms
βββββββββββββββββββ
log2(x) + S(log2(y) - log2(x)) = log2(x + y)
log2(x) + D(log2(y) - log2(x)) = log2(|x - y|)
S(x) = log2(1 + 2^x)
D(x) = log2(|1 - 2^x|)
@summer.png
@differ.png
@both.png
log2(x) + S(log2(y) - log2(x))
= log2(x) + S(log2(y / x))
= log2(x) + log2(1 + 2^log2(y / x))
= log2(x) + log2(1 + y / x)
= log2(x + y)
LNS vs Floating Point vs Fixed
ββββββββββββββββββββββββββββββ
β’ LNS uniformly log scale vs tiered
β’ If you know your range, use fixed-point!
β’ LNS add/subtract is expensive
β¦ Table vs CORDIC
- 89 (4-bit), 2440 (8-bit)
β’ Most problems have an actual scale
β¦ But Audio, Light, Ritcher scale,
are logarithmic
LNS Syntax
ββββββββββ
~3.14159
~22 ~7 ~/ CONSTANT PI
~100 ~20 ~3 ~+ ~+ ~.
: ~* ( ~ ~ -- ~ ) + ;
: ~/ ( ~ ~ -- ~ ) - ;
: ~/1 ( ~ -- ~ ) negate ;
: ~>f ( ~ -- f ) 2e s>f ~precision s>f f/ f** ;
: f>~ ( f -- ~ ) fln 2e fln f/ ~precision s>f f* 0.5e f+ floor f>s ;
: ~. ( ~ -- ) ~>f f. ;
: ~summer ( ~ -- ~ ) ~>f 1e f+ f>~ ;
: ~differ ( ~ -- ~ ) 1e ~>f f- fabs f>~ ;
: ~order ( ~ ~ -- ~ ~ ) 2dup max >r min r> ;
0 value entries
: tabulate
begin
entries ~summer ,
entries ~differ ,
1 +to entries
entries ~summer entries =
entries ~differ entries = and if exit then
again
;
create table tabulate
: ~summer1 ( ~ -- ~ ) dup entries < if 2* cells table + @ then ;
: ~differ1 ( ~ -- ~ ) dup entries < if 2* 1+ cells table + @ then ;
: ~+ ( ~ ~ -- ~ ) ~order over - ~summer1 + ;
: ~- ( ~ ~ -- ~ ) ~order over - ~differ1 + ;
0 value result
0 value places
0 value fract
: !digit ( ch -- ) dup [char] 0 < over [char] 9 > or if -1 throw then ;
: =digit ( ch -- )
dup [char] . = if drop -1 to fract exit then
!digit
[char] 0 - to result ;
: +digit ( ch -- )
dup [char] . = if drop -1 to fract exit then
!digit
fract if 1 +to places then
[char] 0 - result 10 * + to result ;
: ~conv ( a n -- )
>r 1+ dup c@ =digit r>
2 - for aft 1+ dup c@ +digit then next
drop
result s>~ 10 places dig ~/
['] aliteral rectype-num
;
: rec-~ ( a n -- )
dup 2 < if 2drop rectype-none exit then
over c@ [char] ~ <> if 2drop rectype-none exit then
0 to fract
0 to result
0 to places
['] ~conv catch if 2drop rectype-none exit then
;
' rec-~ +recognizer
Closing Thoughts
ββββββββββββββββ
β’ These kinds of enhancements are nice,
but complex.
β’ Most highlight the weakness of
out-of-the box string handling.
DEMO &
QUESTIONSβ
π
Thank you!
ββββββββββββββββββββββββββ
http://eforth.appspot.com/