Play Ping-Pong with Lunatic 🦀 UDP
As I recently threatened here, the Lunatic Rust 🦀 library now has a std::net::UdpSocket-like lunatic::net::UdpSocket. The final PR was merged into lunatic-rs main today, so it is ready to be played with now, if you pull the git main that is.
jtenner (GitHub) beat me to it earlier with the Lunatic AssemblyScript library (think TypeScript for WebAssembly == AssemblyScript).
Lunatic (GitHub) is an Erlang-inspired WebAssembly runtime, or a VM, made in Rust 🦀 with high-level libraries in Rust 🦀 and AssemblyScript, created by Lunatic Solutions (GitHub).
Bernard has described the motivation for creating Lunatic, and it ticks a lot of my boxes: you never have to write a single async/await on the guest side.
Back to our little game of Lunatic 🦀 Ping-Pong
I wrote a reasonably quick, ugly duckling of a test to see how the green-thread scheduler handles 15,000 “clients” and a server that plays Ping-Pong with them using 4-byte UDP packets.
Think of being a table tennis player, but with 15,000 opponents on the other side of the table.
It works by first spawning a “busy UDP server” process that listens on 127.0.0.1:8888 via Process::spawn on wait_ping.
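As a minimal sketch of what that server process can look like (not the exact code from my repo, and assuming the lunatic-rs socket mirrors the std::net::UdpSocket method names bind, recv_from and send_to):

```rust
use lunatic::{net::UdpSocket, Mailbox};

// Server process: bind once, then bounce every 4-byte ping straight back.
fn wait_ping(_: (), _: Mailbox<()>) {
    let socket = UdpSocket::bind("127.0.0.1:8888").expect("bind failed");
    let mut buf = [0u8; 4];
    loop {
        // Block (only this green thread) until a ping arrives.
        let (len, peer) = socket.recv_from(&mut buf).expect("recv failed");
        // Reply with the same payload: the "pong".
        socket.send_to(&buf[..len], peer).expect("send failed");
    }
}
```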
It then keeps adding Ping clients via Process::spawn on send_ping from the main loop every 100 ms, i.e. 10 new clients per second (sketched together with the pinger below).
Each pinger sends a ping every second, and if the scheduler is really, really good you should see a large number of “active” UDP socket flow pairs in a utility such as trafshow, which tracks “active” connections.
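The pinger and the spawning main loop could look roughly like this. Again, this is a sketch rather than the repo's exact code; it assumes a lunatic::sleep(Duration) helper and the Process::spawn(capture, entry) signature from lunatic-rs, and wait_ping is the server function from the sketch above:

```rust
use std::time::Duration;
use lunatic::{net::UdpSocket, sleep, Mailbox, Process};

// Client process: each pinger gets its own ephemeral port
// and plays one round of ping-pong per second.
fn send_ping(_: (), _: Mailbox<()>) {
    let socket = UdpSocket::bind("127.0.0.1:0").expect("bind failed");
    let mut buf = [0u8; 4];
    loop {
        socket.send_to(b"ping", "127.0.0.1:8888").expect("send failed");
        let _ = socket.recv_from(&mut buf); // wait for the pong
        sleep(Duration::from_millis(1_000));
    }
}

#[lunatic::main]
fn main(_: Mailbox<()>) {
    // Server first, then one new pinger every 100 ms up to 15,000.
    Process::spawn((), wait_ping);
    for _ in 0..15_000 {
        Process::spawn((), send_ping);
        sleep(Duration::from_millis(100));
    }
}
```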
I would not go over 15,000 concurrent processes (or green threads) right now, as virtual memory consumption is a known issue despite the 4-byte buffers re-allocated on each process. This is something we need to sort out with Wasmtime, which Lunatic builds on; until then you might hit an out-of-memory error.
The moral of the story is that this can work as a reasonable stress tester for any async or Lunatic-like runtime out there where you need to test how many concurrent “active” processes you can sustain…
… and the fact that Lunatic 🦀 now has UDP support 🍷
To run this yourself
Install lunatic-runtime
cargo install lunatic-runtime
Add wasm32-wasi target
rustup target add wasm32-wasi
Pull my UDP examples repo
git clone https://github.com/pinkforest/lunatic-udp-examples.git
Make sure the maximum number of open file handles is set high enough, e.g.
ulimit -n 100240
Run
cd lunatic-udp-examples; cargo run --bin udp_ping_pong
Observe: install e.g. trafshow and watch the ever-increasing number of concurrent flows:
sudo trafshow -i lo 'port 8888'
After 20 or so seconds you should see over 200 active flows…
Trafshow is a reasonable tool for basic observation since it shows all flows that have had activity within the last second; when every flow stays visible, the scheduler is working as intended and each ping goes out roughly every second.
It will also be crucial to ramp up and down in a graduated manner so we can weed out resource-use issues, e.g. memory leaks; a rough sketch of what I mean follows.
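Something along these lines, purely hypothetical and not in the repo yet: hold each spawn rate for a fixed window so resource usage can settle and be observed before the next stage (send_ping as in the earlier sketch):

```rust
// Hypothetical graduated ramp-up: hold each spawn rate for a fixed window
// so memory/CPU can settle and be measured before the next stage.
for (rate_per_sec, hold_secs) in [(10u64, 60u64), (20, 60), (40, 60)] {
    for _ in 0..rate_per_sec * hold_secs {
        Process::spawn((), send_ping);
        sleep(Duration::from_millis(1_000 / rate_per_sec));
    }
}
```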
I will be adding tooling for automated tests around this, using proper packet generators, probe observers and capture, as well as system performance probing, to see how well the scheduler performs versus the resources consumed; then we can track this in CI across commits over time.
The legacy POSIX/Linux-like API was just the beginning…
Just be aware that the legacy POSIX/Linux-like syscall/libc network support, which this effort to create a current std-lib-like experience reflects, was just the beginning.
You can read some early related rationale on WASI here:
https://github.com/bytecodealliance/wasmtime/blob/main/docs/WASI-rationale.md
The component model is the future… and the future is now…
I recommend looking into Fermyon Spin (GitHub, .dev) as another great pointer to where the component model stands today and how it will evolve around server-side (for now) WebAssembly: distributed and composable microservices with nice developer ergonomics to go with it.
I think we’ve come a long way from “no networking” to where we are now…
Legacy share-everything or complete isolation? Is there a middle ground?
What is notable in that Sleepy Ping-Pong experiment is that every individual pinger runs inside its own sandboxed Wasmtime instance; thus the virtual memory usage seems absurdly high ;)
However, complete per-request/green-thread isolation and the legacy share-everything model are two extremes, and they are not necessarily the only future paths in this space.
This is especially important with the component model, where you would otherwise have to do a lot of copying instead of the zero-copy we love.
Could we apply the borrow/ownership model to WebAssembly components?
Or… maybe… we could use the borrow/ownership model Rust 🦀 is famous for, just as modern interfaces like io_uring require “owned buffers” because buffers are shared between kernel and userspace.
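To illustrate the shape such an API could take (purely hypothetical: OwnedBuf and recv_owned are names I made up, not a real Lunatic or io_uring binding, though io_uring wrappers like tokio-uring use the same buffer-by-value idea):

```rust
use std::io;

// Hypothetical owned-buffer API sketch: the runtime takes ownership of the
// buffer for the duration of the operation, so nothing else can alias it
// while the kernel/host may still be writing into it.
struct OwnedBuf(Vec<u8>);

trait OwnedRecv {
    // `buf` is moved in; on completion it is moved back out together with
    // the number of bytes received, transferring ownership both ways.
    fn recv_owned(&self, buf: OwnedBuf) -> io::Result<(OwnedBuf, usize)>;
}
```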
Perhaps something I can write next! 🦄