Infixing FORTH ══════════════ ══════════ ══════ ══ January 25, 2025 Forth & DSLs ════════════ Forth excels at Domain Specific Languages → Build a mini-language to fit the problem Examples: • Assemblers • Graphics • 3D Modeling • User Interfaces • Structures + Objects • Database Schemas @6502a.png@6502b.png
What about Infix Math? ══════════════════════ Why don't we see infix algebraic expressions? • Can Forth's superpower work for this? • Is it a good idea? • Would it be composable? Prior Art ═════════ • Brad Rodriguez https://www.bradrodriguez.com/papers/bnfparse.htm • Bob Armstrong https://cosy.com/4thCoSy/Code/CoSy/RecurInterp.f • Julian Noble: "The complete code for the FORmula TRANslator is too lengthy to print, hence it will be found on the included diskette." IMPORTANT! -- Book sold discounted, without diskette. https://github.com/Josefg/Scientific_FORTH Forth-y DSLs ════════════ • Balance ease of use against ease of implementation • Usually limited or no error checking • Typically involve an RPN-like syntax - RED LED ON - LEFT MOTOR OFF • Parsing tends to be limited to word name creation • Composition often arises naturally - But often relies on "carnal knowledge" Attempt #1 ══════════ • Support infix math with ONLY one word to the right EXPRESSION + <scan ahead one word> • Also support parentheses, by counting () depth • Rely on {} local variables variable pending : token ( -- a ) >in @ tib + ; : full? ( -- f ) >in @ #tib @ < ; : ( token 0 begin full? over 0< 0= and while token c@ [char] ( = if 1+ then token c@ [char] ) = if 1- then 1 >in +! repeat drop token over - 1- 0 max dup 0= pending ! evaluate ; immediate : scarf bl parse evaluate pending @ if postpone ( then ; : enact state @ if , else execute then ; : + scarf ['] + enact ; immediate : - scarf ['] - enact ; immediate : * scarf ['] * enact ; immediate : / scarf ['] / enact ; immediate : token ( -- a ) >in @ tib + ; : full? ( -- f ) >in @ #tib @ < ; : ( token 0 begin full? over 0< 0= and while token c@ [char] ( = if 1+ then token c@ [char] ) = if 1- then 1 >in +! repeat drop token over - 1- 0 max evaluate ; immediate : scarf +evaluate1 ; : enact state @ if , else execute then ; : + scarf ['] + enact ; immediate : - scarf ['] - enact ; immediate : * scarf ['] * enact ; immediate : / scarf ['] / enact ; immediate 2 + 3 + 4 . → 9 ok 2 + ( 3 * 4 ) . → 14 ok : square { x } x * x ; : pyth { a b } ( a square ) + ( b square ) ; 3 4 pyth . → 25 ok 2 + 3 * 4 . → 24 ok @hp12c.png
@hp12cp.png
Attempt #2 ══════════ • Use an actual grammar • Can we make it expandable? • Allow some more flexibility on spaces • Reuse {} local variables contextually Backus–Naur Form ════════════════ • Notation for a Context-Free grammar • Set of rules that map a non-terminal to a mix of terminal and non-terminals • Or | conjunction to allow alternatives ―――――――――――――――――――――――――――――――――――――――――――――――――――――― <full-name> ::= <first-name> <middle-name> <last-name> <expr> ::= <term> | <expr> <operation> <term> <integer> ::= <digit> | <integer> <digit> 0 \ BNF Parser (c) 1988 B. J. Rodriguez 1 0 VARIABLE SUCCESS 2 : <BNF SUCCESS @ IF R> IN @ >R DP @ >R >R 3 ELSE R> DROP THEN ; 4 : BNF> SUCCESS @ IF R> R> R> 2DROP >R 5 ELSE R> R> DP ! R> IN ! >R THEN ; 6 : | SUCCESS @ IF R> R> R> 2DROP DROP 7 ELSE R> R> R> 2DUP >R >R IN ! DP ! 1 SUCCESS ! >R THEN ; 8 : BNF: [COMPILE] : SMUDGE COMPILE <BNF ; IMMEDIATE 9 : ;BNF COMPILE BNF> SMUDGE [COMPILE] ; ; IMMEDIATE 10 11 : @TOKEN ( - n) IN @ TIB @ + C@ ; 12 : +TOKEN ( f) IF 1 IN +! THEN ; 13 : =TOKEN ( n) SUCCESS @ IF @TOKEN = DUP SUCCESS ! +TOKEN 14 ELSE DROP THEN ; 15 : TOKEN ( n) <BUILDS C, DOES> ( a) C@ =TOKEN ; ―――――――――――――――――――――――――――――――――――――――――――――― https://www.forth.org/literature/bnfparse.html 0 \ BNF Parser Example #1 - pattern recog. 18 9 88 bjr 19:41 1 \ from Aho & Ullman, Principles of Compiler Design, p.137 2 \ this grammar recognizes strings having balanced parentheses 3 4 HEX 28 TOKEN '(' 29 TOKEN ')' 0 TOKEN <EOL> 5 6 BNF: <CHAR> @TOKEN DUP 2A 7F WITHIN SWAP 1 27 WITHIN OR 7 DUP SUCCESS ! +TOKEN ;BNF 8 9 BNF: <S> '(' <S> ')' <S> | <CHAR> <S> | ;BNF 10 11 : PARSE 1 SUCCESS ! <S> <EOL> 12 CR SUCCESS @ IF ." Successful " ELSE ." Failed " THEN ; 13 14 15 ―――――――――――――――――――――――――――――――――――――――――――――― https://www.forth.org/literature/bnfparse.html Refinements ═══════════ • Make adding alternatives later easy • Use THROW / CATCH to preserve stacks - Tack on interpreter state too • Use regular parsing more • String / token matching immediate words Grammar Format ―――――――――――――― kind <FOO> {{ list of terminals / non-terminals / actions ... }} {{ alternate.. }} ' <BAR> :{{ out of order rule... }} The Target ―――――――――― def square { x } { x * x } def pyth { a b } { (a square) + (b square) } 3 4 pyth → 25 ok : kind ( "name" -- ) create 0 , does> @ begin dup while state @ >r here >r >in @ >r dup >r @ catch 0= if rdrop rdrop rdrop rdrop exit then r> cell+ @ r> >in ! r> here - allot r> state ! repeat -1 throw ; : :{{ >body :noname ; : {{ latestxt :{{ ; : }} postpone ; here >r , dup @ , r> swap ! ; immediate : @token ( -- ch ) >in @ tib + c@ ; : +token 1 >in +! ; : =token ( ch -- ) @token <> throw +token ; : within ( ch a b ) >r over <= swap r> <= and ; : []token ( a b -- ch ) @token -rot within 0= throw @token +token ; : t' postpone [char] postpone =token ; immediate : stoken ( a n -- ) 0 ?do dup c@ =token 1+ loop drop ; : t" postpone s" postpone stoken ; immediate What Grammar to Use? ════════════════════ • Crib from Pascal expressions • Call Forth code in []s : space? ( ch -- f ) dup bl = over nl = or over 10 = or swap 9 = or ; kind <SPACE> {{ }} {{ @token space? 0= throw +token <SPACE> }} : st' postpone <SPACE> postpone t' postpone <SPACE> ; immediate : st" postpone <SPACE> postpone t" postpone <SPACE> ; immediate kind <DIGIT> {{ [char] 0 [char] 9 []token [char] 0 - }} : letnum? ( ch -- f ) dup [char] 0 [char] 9 within over [char] A [char] Z within or over [char] a [char] z within or swap [char] _ = or ; kind <IDENTIFIER> {{ >in @ tib + 0 begin @token letnum? while 1+ +token repeat dup 0= throw }} kind <NUMBER'> {{ <DIGIT> 1 }} {{ <DIGIT> <NUMBER'> rot over 0 do 10 * loop -rot 1+ >r + r> }} kind <NUMBER> {{ <NUMBER'> drop aliteral }} kind <FORTH> {{ [char] ] parse evaluate }} kind <EXPRESSION'> kind <IDENTIFIERS> {{ }} {{ <SPACE> <IDENTIFIER> <SPACE> evaluate <IDENTIFIERS> }} kind <FACTOR> {{ <SPACE> <IDENTIFIER> <SPACE> evaluate <IDENTIFIERS> }} {{ <SPACE> <NUMBER> <SPACE> }} {{ st' ( <EXPRESSION'> st' ) }} {{ st' [ <FORTH> }} kind <TERM> {{ <FACTOR> }} {{ <FACTOR> st' * <TERM> postpone * }} {{ <FACTOR> st' / <TERM> postpone / }} {{ <FACTOR> st" mod" <TERM> postpone mod }} {{ <FACTOR> st" and" <TERM> postpone and }} kind <SIMPLE-EXPRESSION> {{ <TERM> }} {{ <TERM> st' + <SIMPLE-EXPRESSION> postpone + }} {{ <TERM> st' - <SIMPLE-EXPRESSION> postpone - }} {{ <TERM> st" or" <SIMPLE-EXPRESSION> postpone or }} {{ st' + <SIMPLE-EXPRESSION> }} {{ st' - <SIMPLE-EXPRESSION> postpone negate }} kind <EXPRESSION> {{ <SIMPLE-EXPRESSION> }} {{ <SIMPLE-EXPRESSION> st' = <EXPRESSION> postpone = }} {{ <SIMPLE-EXPRESSION> st" <>" <EXPRESSION> postpone <> }} {{ <SIMPLE-EXPRESSION> st' < <EXPRESSION> postpone < }} {{ <SIMPLE-EXPRESSION> st" <=" <EXPRESSION> postpone <= }} {{ <SIMPLE-EXPRESSION> st" >=" <EXPRESSION> postpone >= }} {{ <SIMPLE-EXPRESSION> st' > <EXPRESSION> postpone > }} ' <EXPRESSION'> :{{ <EXPRESSION> }} kind <STATEMENTS> {{ <EXPRESSION> <STATEMENTS> }} {{ st' } }} kind def {{ : st' { postpone { st' { <STATEMENTS> postpone ; }} kind on {{ ' :{{ st' { <STATEMENTS> postpone }} }} kind expr {{ :noname <EXPRESSION> postpone ; execute }} on <TERM> { [ <FACTOR> st' ^ <TERM> postpone xor ] } expr 1 ^ 2 ^ 4 → 7 ok Assessment ══════════ • Unconstrained BNF parsing is potentially slow • Error cases fail hard • Multiline fails due to REFILL, except in source • Probably rationalizable with effort Left Elimination? ═════════════════ • Make a larger grammar that doesn't need to backtrack Example regular expresion: βα* A → Aα | β ⇒ A → βA' A' → αA' | ε E → E + T | T T → T * F | F F → ( E ) | id ⇒ E → T E' E' → + T E' | ε T → F T' T' → * F T' | ε F → ( E ) | id Is it Forthy? Is it ~Good~? ═══════════════════════════ • You can add new words AND grammar • Root implementation is fairly simple - most complexity in the grammar • Backtracking behavior is weird - Reducing to left recursion makes it messy • The stack still, peeks through • It would allow more English DSLs • Needs a better way to deal with errors Would it Help Adoption? ═══════════════════════ • It matches people expectations more • But would be frustrating when it fails to meet them • The more it looks like C/C++/Java/..., the more it will make people expect only that DEMO & QUESTIONS❓ 🙏 Thank you!