Drive connected Win Apps in Docker

CC-BY-SA image borrowed from Wikipedia Container Ship

And perhaps contribute to your client’s green goals too. I have had clients whose IoT device vendors often have a sneaky Win32-only application to engage the “factory” setup, white-label process or recovery on their devices, and it is critical to retain the vendor support whilst optimising the workflow at scale.

Or you could even have the vendor itself wanting to semi-automate its factory line to initialise devices with a Win32 app the vendor has developed, and you suddenly find yourself having to “plug it in” to your automation magic.

Or you could be just a regular home user wanting to drive some Windows app automatically because you are simply time poor, or you really hate dialogs and popups and stuff, or using the app itself ..

My example use case and the associated problems revolve heavily around connected IoT devices which may need a proprietary or legacy Windows app, but the approach applies to quite a few other things: if for various reasons you can’t use the Windows UI automation tools already out there, or if you simply need to scale headless parallel processing with containers, without the overhead of heavy “headed” instances, like I do.

Ideal isn’t always possible ..

The real world often involves white-labelling and then some sort of activation. If you can keep things generic before activation, that would be best, and there are protocols like Broadband Forum TR-064 for just this kind of thing, but one cannot always avoid the bootloader friend.

The world is never perfect and we can’t expect it always to be, especially in transitional situations whilst addressing the risks.

.. but we can certainly help it be so

Not only that, but the vendor relationship has usually benefited from increased collaboration compared with the initial distrust, and at best it may help put your client’s business objectives at the top of the priority list.

One could even use the app purely for extra validation if one is really concerned about some identified risk, and return a faulty device to the vendor without bothering the customer, who would otherwise angrily return it and write a rant in a review. That hurts, especially for an early-stage hardware startup with a limited number of other reviews.

Finicky workflows and Apps

Leased IoT device churn

But it’s not as bad as it could be :)

And with today’s chip shortage this becomes even more important, as these valuable devices can be in short supply.

Numbers matter

(35 min * 100k) / 60 (hr) / 1840 (FTE p.a. typical)
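The arithmetic above can be worked through as a quick sketch. It assumes the 100k figure is the number of devices handled and that 1840 is the working hours in one FTE year; the function and constant names are only illustrative.

```python
# Rough effort estimate for manual device handling.
# Assumptions: 35 minutes per device, 100k devices, 1840 working hours per FTE year.
MINUTES_PER_DEVICE = 35
DEVICES = 100_000
HOURS_PER_FTE_YEAR = 1840


def fte_years(minutes_per_device, devices, hours_per_fte_year):
    """Convert per-device handling time into full-time-equivalent years."""
    total_hours = minutes_per_device * devices / 60
    return total_hours / hours_per_fte_year


total_hours = MINUTES_PER_DEVICE * DEVICES / 60
effort = fte_years(MINUTES_PER_DEVICE, DEVICES, HOURS_PER_FTE_YEAR)
print(f"{total_hours:,.0f} hours, roughly {effort:.1f} FTE years")
```

Under those assumptions the manual process costs a little under 32 person-years, which is the number worth putting in front of a client.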

As do the other numbers we sometimes forget

IoT can be risky business

What not to do

One could record the bootloader interaction that the app performs, but there are often a variety of reasons, beyond the support issue, why this isn’t such a good idea: for example, there might be multiple versions of the physical device, each needing its own ROM.

Don’t mess with the magic sauce

Like what may happen in the year 2038 “Epochalypse”, when the signed 32-bit integer used for some timestamps, which has been counting the seconds since the epoch, overflows.

IoT reliability is also its own topic.
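To make the 2038 problem concrete, here is a small sketch of where a signed 32-bit seconds counter runs out and what it wraps around to. The helper `wrap_int32` is illustrative only; it simulates two's-complement wraparound rather than any particular device's firmware.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
INT32_MAX = 2**31 - 1  # largest value a signed 32-bit integer can hold


def wrap_int32(value):
    """Simulate two's-complement wraparound of a signed 32-bit integer."""
    return (value + 2**31) % 2**32 - 2**31


# Last second that a signed 32-bit timestamp can represent.
last_moment = EPOCH + timedelta(seconds=INT32_MAX)

# One second later the counter wraps to a large negative number.
wrapped = wrap_int32(INT32_MAX + 1)
after_wrap = EPOCH + timedelta(seconds=wrapped)

print("last representable:", last_moment.isoformat())
print("one second later  :", after_wrap.isoformat())
```

A device that compares timestamps across that boundary suddenly believes it is 1901, which is exactly the kind of vendor-internal detail you do not want to be reimplementing yourself.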

The IoT Loader Problems

  • Physical power cycle to engage the bootloader at specific time
  • Finicky “Sun spots are not aligned” App/Device
  • High human error rate, with QA or for lack of it
  • Parallelisation problems e.g. with mgmt IPv4 collision

Let’s keep our client green

Parallelisation for scale

In the past I’ve just mapped “loading bays” to physical switch ports, each in a VLAN bundle delivered over a tunnel or similar to one or more virtual instances, whatever suits the client environment and use case.
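As a rough sketch of that mapping, each loading bay can be described by a record tying a switch port to its own VLAN ID and to the container that serves it. Every name here (`LoadingBay`, `plan_bays`, the port prefix, the base VLAN of 100) is hypothetical and would follow whatever the client's switches and naming rules dictate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoadingBay:
    bay: int          # physical bay number on the bench
    switch_port: str  # switch port the device plugs into (hypothetical naming)
    vlan_id: int      # dedicated VLAN so identical mgmt IPs never collide
    container: str    # container instance that drives this bay


def plan_bays(count, base_vlan=100, port_prefix="port"):
    """Give each bay its own VLAN and container name."""
    if count < 1 or base_vlan < 1 or base_vlan + count - 1 > 4094:
        raise ValueError("VLAN IDs must stay within 1..4094")
    return [
        LoadingBay(
            bay=n,
            switch_port=f"{port_prefix}{n}",
            vlan_id=base_vlan + n - 1,
            container=f"loader-bay-{n}",
        )
        for n in range(1, count + 1)
    ]


for bay in plan_bays(4):
    print(bay)
```

Because every device sits in its own VLAN, all of them can keep the same factory-default management address while being loaded in parallel.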

Docker/Wine/Xvfb stack to the rescue

  • Docker container, say using the minimal Alpine Linux image
  • Wine Win32 compatibility layer for *NIX etc.
  • X virtual framebuffer (Xvfb): a headless display server for X11, the common GUI framework for *NIX
  • Scripting language e.g. Perl (don’t hate me, it’s just ~always everywhere :)
  • X11 “Driver/Screenshotter” e.g. CPAN X11::GUITest
  • WeMo Power AC Switch or anything you can drive off/on with some RPC
  • Optical Character Recognition (OCR) e.g. GNU OCR (gocr)
  • Image Matching e.g. CPAN Imager::Search
  • Human interface (debug, monitor & control)
  • Script that drives the Win32 App over Wine
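A minimal sketch of how the first few layers of this stack fit together inside the container: start Xvfb on a private display, then start the Win32 app under Wine pointing at that display. It is written in Python rather than Perl purely for illustration. The display number, screen geometry, Wine prefix path and the one-second startup wait are assumptions, and `launch` needs Xvfb and Wine installed to do anything.

```python
import os
import subprocess
import time


def xvfb_command(display=":99", geometry="1024x768x24"):
    """Command line for a headless X server with one in-memory screen."""
    return ["Xvfb", display, "-screen", "0", geometry, "-nolisten", "tcp"]


def wine_command(exe_path):
    """Command line that runs the vendor's Win32 app under Wine."""
    return ["wine", exe_path]


def wine_environment(display=":99", prefix="/root/.wine"):
    """Environment that points Wine at the virtual display."""
    env = dict(os.environ)
    env.update({"DISPLAY": display, "WINEPREFIX": prefix, "WINEARCH": "win32"})
    return env


def launch(exe_path, display=":99"):
    """Start Xvfb, then the app; returns both process handles."""
    xvfb = subprocess.Popen(xvfb_command(display))
    time.sleep(1)  # crude wait for the display to come up (assumption)
    app = subprocess.Popen(wine_command(exe_path), env=wine_environment(display))
    return xvfb, app


print(" ".join(xvfb_command()))
print(" ".join(wine_command("/opt/vendor/loader.exe")))  # hypothetical path
```

The driving script, the screenshotter and the OCR all then talk to the same display number, so nothing ever needs a physical screen.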

To make it even better perhaps

  • Record/Learning script instead of hardcoded
  • API to debug/monitor/control the related instance(s)
  • “Human interface” Front-End towards API

So how do all these components play together?

  • Power control (WeMo)
  • Docker container
  • Sub-VLAN tag to separate overlapping IPs, each into its own virtual LAN

All of these would be connected in a bundle to, say, a virtual machine running the container instances. There the script drives the Win32 app over Wine and the X virtual framebuffer with the help of the X driver, along with associated things like the power cycle control: the WeMo accepts SOAP calls to turn it off or on.
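The power cycle step might look roughly like the sketch below. The control path, port 49153 and the `SetBinaryState` action are what is commonly reported for WeMo switches, but treat them as assumptions to verify against the actual device and firmware. The helper names and the example address are hypothetical.

```python
import time
import urllib.request

# Assumed WeMo UPnP details; verify against the device in use.
SERVICE = "urn:Belkin:service:basicevent:1"
CONTROL_PATH = "/upnp/control/basicevent1"


def build_switch_request(host, on, port=49153):
    """Build (but do not send) the SOAP request that sets the switch state."""
    state = 1 if on else 0
    body = (
        '<?xml version="1.0" encoding="utf-8"?>'
        '<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/" '
        's:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">'
        f'<s:Body><u:SetBinaryState xmlns:u="{SERVICE}">'
        f"<BinaryState>{state}</BinaryState>"
        "</u:SetBinaryState></s:Body></s:Envelope>"
    ).encode("utf-8")
    headers = {
        "Content-Type": 'text/xml; charset="utf-8"',
        "SOAPACTION": f'"{SERVICE}#SetBinaryState"',
    }
    url = f"http://{host}:{port}{CONTROL_PATH}"
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def power_cycle(host, send=urllib.request.urlopen, off_seconds=5):
    """Switch off, wait, switch on again; 'send' is injectable for testing."""
    send(build_switch_request(host, on=False))
    time.sleep(off_seconds)
    send(build_switch_request(host, on=True))


request = build_switch_request("192.0.2.10", on=False)  # documentation address
print(request.get_method(), request.full_url)
```

The driving script would call `power_cycle` at the moment the app starts waiting for the bootloader, which replaces the human pulling the plug at exactly the right time.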

OCR/Image Match used for triggering

It is important to cover all the error situations so that there is less debugging and restarting, which reduces the toil on the human at the monitoring and control interface.

Record/Replay is ideal for client self-maintenance

The recording would use combined OCR and image recognition/matching, as the GUI would typically never change under a virtual framebuffer, compared to, say, literal “screen scraping” from a CRT monitor, which introduces its own artefacts and the like.
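One way to sketch the triggering logic is a small rule table that turns whatever the OCR reads off the screenshot into the next action, with the error patterns checked first. The patterns and action names below are made up for illustration; real ones come from the vendor app's actual dialogs. `read_screen` stands in for the screenshot plus OCR step.

```python
import re

# Hypothetical rules: first matching pattern wins, errors are checked first.
RULES = [
    (re.compile(r"error|failed|timeout", re.I), "abort_and_recycle"),
    (re.compile(r"power\s*cycle|waiting for device", re.I), "power_cycle"),
    (re.compile(r"complete|success", re.I), "mark_done"),
]

TERMINAL_ACTIONS = {"abort_and_recycle", "mark_done"}


def decide(screen_text):
    """Map OCR text from a screenshot to the next action."""
    for pattern, action in RULES:
        if pattern.search(screen_text):
            return action
    return "wait"  # nothing recognised yet, poll again


def run(read_screen, max_polls=10):
    """Poll the screen until a terminal action or the poll budget runs out."""
    history = []
    for _ in range(max_polls):
        action = decide(read_screen())
        history.append(action)
        if action in TERMINAL_ACTIONS:
            break
    return history


frames = iter(["Loading...", "Waiting for device", "Writing 40%", "Upgrade complete"])
print(run(lambda: next(frames)))
```

Putting the error patterns at the top of the table means a dialog such as "Upload failed, waiting for device" recycles the container instead of power cycling the device again.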
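A recording could be as simple as a list of steps, each pairing what must be visible on screen with the action to take once it is. The sketch below assumes a JSON layout and field names (`expect`, `action`) invented for illustration. `screen_contains` and `perform` stand in for the OCR/image match and the X11 driver respectively.

```python
import json

# Hypothetical recording format: wait for 'expect', then perform 'action'.
RECORDING = json.loads("""
[
  {"expect": "Select device",   "action": {"type": "click", "x": 120, "y": 80}},
  {"expect": "Firmware file",   "action": {"type": "type",  "text": "fw.bin"}},
  {"expect": "Ready to upload", "action": {"type": "click", "x": 300, "y": 220}}
]
""")


def replay(steps, screen_contains, perform):
    """Replay recorded steps; returns how many steps completed."""
    completed = 0
    for step in steps:
        if not screen_contains(step["expect"]):
            break  # screen is not in the recorded state, stop for a human
        perform(step["action"])
        completed += 1
    return completed


performed = []
done = replay(RECORDING, screen_contains=lambda text: True, perform=performed.append)
print(f"{done} of {len(RECORDING)} steps replayed")
```

Keeping the recording as data rather than code is what lets the client re-record a changed workflow without touching the script that replays it.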

Human Interface can be API driven

  • Allow debugging (e.g. accessing screenshots/what is going on)
  • Recycle the ephemeral container driving the instance(s)
  • Loading bay(s) and QA status for parallel operation monitoring

