Connecting Forth to the World

January 25, 2020

Brad Nelson / @flagxor

The Quest

  • Run flagxor.com on Forth
  • Keep existing content
  • Add dynamic content

The Old Solution

  • App Engine
  • Static Content
  • Python script + templates
  • Topics (categories)

App Engine

  • Google's opinionated scalable web solution
  • Runtimes for Python, Java, Go, Node.js
  • app.yaml
  • Free!
  • Now requires gcloud compilation :-(

Python Templates

  • Transform data into canned pages
  • Deploy pages via app engine as static content
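
The two bullets above can be sketched as follows. This is a minimal illustration only: the real templates in gen.py are not shown in the slides, so `PAGE`, `render_article`, and `slug` are hypothetical names, and the page layout is an assumption.

```python
from string import Template

# Hypothetical page template; the real gen.py templates are not shown.
PAGE = Template("""\
<!DOCTYPE html>
<title>$title</title>
<h1>$title</h1>
<h2>$date</h2>
$summary
<hr/>
$rest
""")


def render_article(article):
    """Return one static HTML page for an article dict."""
    return PAGE.substitute(
        title=article['title'],
        date=article['date'],
        summary=article['summary'],
        rest=article['rest'],
    )


def slug(title):
    """Derive a file name such as 'source-code.html' from a title."""
    return '-'.join(title.lower().split()) + '.html'
```

Each rendered page would then be written to disk under its slug and deployed as static content.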

gen.py

ARTICLES = [
  {
    'title': 'Source Code',
    'date': 'February 19, 2015',
    'topics': ['Forth', 'Programming Languages'],
    'summary': """\
<p>
Source code, while a common property of programming languages,
is by no means universal. Languages like Smalltalk, traditional
interactive BASICs, and some Forths often have "source of truth"
representations other than text files. Additionally, REPLs of all
sorts often retain program state that exists only for the
duration of an interactive session. What are the trade-offs
and why does text continue to be the dominant format?
</p>
""",
    'rest': """\
<p>
A key feature of early micro-computers was an integrated BASIC

Topics (categories)

  • Aspirationally lots of topics: Forth, Blogging, Forth Haiku, Programming Languages
  • In practice, not so much

The New Solution

  • Static & Dynamic Content
  • Minimalistic Markup with Smarter JavaScript
  • Google Cloud Server Hosting
  • Conventional Web Server with Gateway to Forth Server

Minimalistic Markup

<!DOCTYPE html>
<body><div id="main">
<script src="../../flagxor.js"></script>
<h1>Source Code</h1>
<h2>February 19, 2015</h2>
<p>
Source code, while a common property of programming languages,
is by no means universal. Languages like Smalltalk, traditional
interactive BASICs, and some Forths often have "source of truth"
representations other than text files. Additionally, REPLs of all
sorts often retain program state that exists only for the
duration of an interactive session. What are the trade-offs
and why does text continue to be the dominant format?
</p>
<hr/>
<p>
A key feature of early micro-computers was an integrated BASIC

Google Cloud Compute Engine

  • Run Virtual Machines in the Cloud
  • Wimpy Debian Server (1 vCPU + 0.6GB Mem + 10GB Disk) under free caps
  • Install Arbitrary Linux Stuff, including Gforth
  • Need to keep a credit card on file
  • Spend caps configurable

Nginx

  • Popular lightweight webserver
  • Easier to configure than Apache2

Dynamic Content

  • Web pages that compute or store server side
  • Applications: web counters, web apps, anything stateful

CGI

  • Common Gateway Interface
  • Starts a script once per page load
  • Standardized environment variables for decoded request

HTTP Request

GET /mypage/foo.html HTTP/1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36
    (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36
Host: flagxor.com:8080
Accept: image/gif, image/jpeg, */*

HTTP Reply

HTTP/1.0 200 OK
Content-type: text/html

<html>
<title>My Page</title>
...
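
To make the request and reply shapes concrete, here is a small sketch that splits a request head into its parts and builds a matching reply. `parse_request` and `build_reply` are hypothetical helper names, and the parser assumes one header per line (a header wrapped for display, like the User-Agent on the request slide, would need to be joined first).

```python
def parse_request(raw):
    """Split a raw HTTP/1.x request head into (method, path, version, headers)."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("latin-1")
    lines = head.split("\r\n")
    method, path, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    return method, path, version, headers


def build_reply(body, content_type="text/html"):
    """Build a minimal HTTP/1.0 reply: status line, headers, blank line, body."""
    body_bytes = body.encode("utf-8")
    head = ("HTTP/1.0 200 OK\r\n"
            f"Content-Type: {content_type}\r\n"
            f"Content-Length: {len(body_bytes)}\r\n"
            "\r\n")
    return head.encode("latin-1") + body_bytes
```

Even this toy version skips keep-alive, chunked bodies, and malformed input, which is part of why speaking raw HTTP from Forth is unattractive.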

CGI

  • REQUEST_METHOD="GET"
  • REMOTE_ADDR="127.0.0.1"
  • HTTP_HOST="foo.com"
  • REQUEST_URI="/cgi-bin/foo.pl?var1=1&var2=3"
  • SCRIPT_NAME="/cgi-bin/foo.pl"
  • QUERY_STRING="var1=1&var2=3"
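
A CGI script sees nothing but these variables plus stdin, and answers on stdout. The sketch below shows that contract; `handle_cgi` is a hypothetical name, and taking the environment as a parameter is a choice made here so the function can be exercised without a web server.

```python
import os
import sys
from urllib.parse import parse_qs


def handle_cgi(environ):
    """Build a CGI response (headers, blank line, body) from the environment."""
    method = environ.get("REQUEST_METHOD", "GET")
    query = parse_qs(environ.get("QUERY_STRING", ""))
    lines = [f"method={method}"]
    for name in sorted(query):
        lines.append(f"{name}={','.join(query[name])}")
    body = "\n".join(lines) + "\n"
    return "Content-Type: text/plain\r\n\r\n" + body


if __name__ == "__main__":
    # The web server starts this script once per request.
    sys.stdout.write(handle_cgi(os.environ))
```

The per-request process start is the cost that FastCGI and SCGI avoid by keeping a server running.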

FastCGI

  • Gateway over a named pipe or domain socket
  • Multiplexing
  • Multiple Requests per connection
  • Complex protocol

SCGI

  • Simple Common Gateway Interface
  • Simple 100 Line Spec
  • One connection per request

SCGI Request

"70:"
  "CONTENT_LENGTH" <00> "27" <00>
  "SCGI" <00> "1" <00>
  "REQUEST_METHOD" <00> "POST" <00>
  "REQUEST_URI" <00> "/deepthought" <00>
","
"What is the answer to life?"

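The request above is a netstring of NUL-terminated name/value pairs, a comma, then the body. A sketch of encoding and decoding that framing follows; `encode_scgi` and `parse_scgi` are hypothetical names, and the parser assumes the whole request is already in memory.

```python
def encode_scgi(headers, body):
    """Frame headers and body as an SCGI request."""
    block = b"".join(
        k.encode("latin-1") + b"\x00" + v.encode("latin-1") + b"\x00"
        for k, v in headers.items()
    )
    return str(len(block)).encode("ascii") + b":" + block + b"," + body


def parse_scgi(data):
    """Parse one SCGI request into (headers dict, body bytes)."""
    length, _, rest = data.partition(b":")
    size = int(length)
    block = rest[:size]
    if rest[size:size + 1] != b",":
        raise ValueError("netstring must end with a comma")
    fields = block.split(b"\x00")[:-1]  # drop the empty item after the final NUL
    headers = {
        fields[i].decode("latin-1"): fields[i + 1].decode("latin-1")
        for i in range(0, len(fields), 2)
    }
    start = size + 1
    body = rest[start:start + int(headers.get("CONTENT_LENGTH", "0"))]
    return headers, body
```

Encoding the four headers from the slide yields a 70-byte header block, which is where the leading "70:" comes from.
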
SCGI Reply

"Status: 200 OK" <0d 0a>
"Content-Type: text/plain" <0d 0a>
"" <0d 0a>
"42"

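The reply side is just CGI-style headers, a blank line, and the body. A short sketch, with `scgi_reply` and the `deep_thought` handler as hypothetical names:

```python
def scgi_reply(body, status="200 OK", content_type="text/plain"):
    """Build an SCGI reply: Status and Content-Type headers, blank line, body."""
    head = f"Status: {status}\r\nContent-Type: {content_type}\r\n\r\n"
    return head.encode("latin-1") + body


def deep_thought(headers, body):
    """Hypothetical handler: answer the one question from the slide."""
    if headers.get("REQUEST_URI") == "/deepthought":
        return scgi_reply(b"42")
    return scgi_reply(b"not found", status="404 Not Found")
```

The web server turns the Status header into the real HTTP status line, so the application never has to speak HTTP itself.
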
Why not a Proxy?

  • SCGI sanitizes inputs
  • HTTP Keep-alive
  • HTTP is complex
  • Somewhat of a pain to test

Let's Encrypt

  • Non-profit Certificate Authority
  • Part of a push to replace HTTP with HTTPS everywhere
  • Free certificates
  • Fairly short expiration

Certbot

  • Tool to Help with Certificate Renewal and Deployment
  • Easy to use
  • sudo certbot --nginx

Talking to POSIX

\c #include <unistd.h>
c-function fork fork -- n

Aside on Gforth + ABI-Code

  • gforth-0.7.3 is everywhere, but is NOT the latest!
  • gforth-0.7.9 can be built from source
  • ABI-CODE allows ABI portable asm

ABI-CODE

Cell *word(Cell *sp, Float **fp_pointer);

abi-code my+ ( n1 n2 -- n3 )
\ SP passed in rdi, returned in rax
lea rax,[rdi+8] \ new sp in result reg
mov rdx,[rdi] \ get old tos
add [rax],rdx \ add to new tos
ret \ return from my+
end-code

Aside on Ye Unix of Old

  • Early UNIX got by without non-blocking I/O
  • fork() + pipe() for everything
  • Careful patterns of multi-writer / multi-reader
  • Pipe writes of up to PIPE_BUF bytes (4096 on Linux) are atomic
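
The fork() + pipe() style can be sketched in a few lines. This is an illustration in Python rather than the talk's Forth, and `fan_in` is a hypothetical name. It relies on the POSIX guarantee that pipe writes no larger than PIPE_BUF are atomic, so short lines from several writers do not interleave.

```python
import os


def fan_in(messages):
    """Fork one child per message; each writes one line to a shared pipe."""
    read_fd, write_fd = os.pipe()
    pids = []
    for msg in messages:
        pid = os.fork()
        if pid == 0:                      # child: write one short line and exit
            os.close(read_fd)
            os.write(write_fd, msg.encode() + b"\n")
            os._exit(0)
        pids.append(pid)
    os.close(write_fd)                    # parent keeps only the read end
    data = b""
    while True:
        chunk = os.read(read_fd, 4096)    # blocking read, no polling needed
        if not chunk:                     # EOF once every writer has exited
            break
        data += chunk
    os.close(read_fd)
    for pid in pids:
        os.waitpid(pid, 0)
    return sorted(data.decode().splitlines())
```

Arrival order depends on scheduling, so the sketch sorts the lines before returning them.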

Threads vs Processes

  • Shared Address Space vs Copy-on-Write
  • Maximum performance vs Separation of Concerns
  • Newer API vs Simpler API

Sockets

  • File-like on Unix
  • Lean on \c and a little C for setup
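
"File-like" means that once a socket is set up, ordinary reads and writes on its descriptor are enough. A small sketch using a connected socket pair; `echo_upper` is a hypothetical name, and the pair stands in for an accepted connection.

```python
import os
import socket


def echo_upper():
    """Send a line through a socket pair and answer it using file-style I/O."""
    a, b = socket.socketpair()            # two connected Unix domain sockets
    with a, b:
        a.sendall(b"hello forth\n")
        reader = b.makefile("rb")         # wrap the socket in a file object
        line = reader.readline()
        reader.close()
        os.write(b.fileno(), line.upper())  # plain write() on the descriptor
        return a.recv(1024)
```

Only the setup calls (socket, bind, listen, accept) are socket-specific, which is why a little C glue covers them.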

Sharing Between Workers

  • mmap
  • semaphores

Simple Sharing Pattern

  • await -- Wait for a wakeup
  • awake -- Wake up all waiters
  • mmap some shared data
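
The mmap step can be sketched as below. This covers only the shared-data part, not await and awake, and `shared_counter_demo` is a hypothetical name. The assumption is a Unix system, where an anonymous mapping is shared across fork() by default.

```python
import mmap
import os
import struct


def shared_counter_demo(workers=3):
    """Let forked workers increment one counter stored in shared memory."""
    shared = mmap.mmap(-1, 8)             # anonymous mapping, shared across fork()
    shared[:8] = struct.pack("q", 0)
    for _ in range(workers):
        pid = os.fork()
        if pid == 0:                      # child: bump the shared counter
            n = struct.unpack("q", shared[:8])[0]
            shared[:8] = struct.pack("q", n + 1)
            os._exit(0)
        os.waitpid(pid, 0)                # one worker at a time, so no lock needed
    return struct.unpack("q", shared[:8])[0]
```

Workers running concurrently would need a semaphore around the read-modify-write, which is where the semaphores from the previous slide come in.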

Simple Sharing Pattern

: clock-pulse begin 1000 ms awake again ;

Aside on Stack Leaks

: gnd   depth throw ;

: clock-pulse begin 1000 ms awake gnd again ;

HTTP Long Polling

  • Delay responding to an HTTP request until something happens
  • Keep a counter to know if the client is up to date
  • Use await to delay, and awake when state changes
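
The three bullets map onto a version counter plus a wait/notify pair. The sketch below uses Python threads and a condition variable as a stand-in for the talk's worker processes with await and awake. `Board`, `update`, and `poll` are hypothetical names.

```python
import threading


class Board:
    """Shared state with a version counter, guarded by a condition variable."""

    def __init__(self):
        self._cond = threading.Condition()
        self.version = 0
        self.value = None

    def update(self, value):
        # Plays the role of "awake": change state and wake every waiting poller.
        with self._cond:
            self.value = value
            self.version += 1
            self._cond.notify_all()

    def poll(self, seen, timeout=30.0):
        # Plays the role of "await": block until the version differs from
        # what the client has already seen, or until the timeout expires.
        with self._cond:
            self._cond.wait_for(lambda: self.version != seen, timeout)
            return self.version, self.value
```

A client sends the last version it saw, gets an immediate answer if it is behind, and otherwise waits. After a timeout it simply polls again with the same version.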

Dynamic Toys

  • IP Map
  • Shared Board

Demo Time!

flagxor.com
slides
code

Thank you