Threading in Glk Libraries

Abstract (a what?)

I lay out a model for implementing a two-threaded Glk library -- one thread for the user interface, one for the virtual machine (Glulx). I used this when implementing Glk for iOS.

This document does not cover the notion of running two Glk applications in the same process. The Glk API does not currently allow this, and yes, it's a nuisance. I will deal with it someday. Not today.

This document also does not cover the notion of a threaded IF virtual machine which can perform work while the user is typing input. This is possible, but the Glk API does not manage it. The VM would have to manage its own threading model. (The Glk API must always be called from the thread that glk_main() was invoked in.)

Introduction

The early Glk implementations were single-threaded. That is, the game interpreter and the Glk library all ran in the same thread. The game would call glk_select(), which would block and wait for input via some native API. (CheapGlk calls fgets() to accept input; GlkTerm calls the curses getch() function.)

This is a simple model, and it works tolerably well if the game's command processing is fast. However, a slow command will make the player's interface freeze for a moment, and if the game gets stuck in an infinite loop the interface does too. This is not ideal.

Furthermore, modern GUI systems don't want their event loops to be called from inside game code. (For example, MacOS Classic could be made to work this way, but MacOS X is not suited to it.)

The best plan for these systems is for the Glk library to operate two threads. The UI thread talks to the native UI and handles user interactions; the VM thread runs the game code.

(The familiar example of a Glk game is Glulx code executing in a Glulx interpreter. For the purposes of this document, I am lumping them together; I will refer to "the game code" and "the VM" interchangeably. Of course the Glk library doesn't care whether it is linked to an interpreter, a non-VM-model game, or any other program.)

Threading is always a minefield; we must be careful to keep a clean separation between the threads. This document describes the model that I used in the iOSGlk library. (It took me a few tries to work this out, so I figure I should document it for posterity.)

The Plan

The plan is to have two separate sets of data describing the interface (the Glk windows and input state). One set is managed by the UI thread, the other by the VM thread. We synchronize these at specific times: when entering and leaving glk_select(), for example.

To summarize:

VM thread:

The familiar Glk calls such as glk_select() are all made on this thread.
VM-side data structures store the windows, streams, etc. as the game sees them.
Glk calls directly affect the VM-side data structures. So glk_window_open() immediately creates a VM-side window structure. There is no need for thread-locking in most of these calls.
VM-side windows contain text data and styles as provided by the Glk API, but they do not do text layout (word wrap).

UI thread:

All calls to the native OS UI are made on this thread. (For example, calls to the UIKit API in iOSGlk.)
UI-side data structures store windows and input state as the player sees them. These data structures hold references to native OS objects (e.g. UIKit windows).
UI-side windows may contain text layout information (word wrap), if that is not handled by native OS objects.
There are no UI-side data structures for Glk streams or files. (These can be handled entirely on the VM side.)

The application life cycle:

Native application starts up. It builds the UI-side data structures, describing a blank screen (no windows, no input widgets).
Since there's no input, the player can't do anything at this point.
The application immediately launches the VM thread.
The VM thread calls glk_main(). The game performs its initial round of computation, creating windows and printing text to them. (These are VM-side structures.) This continues up until the game's first input -- its first glk_select() call.
When glk_select() begins, we synchronize the UI-side data structures to the VM-side ones. (Thus we create a UI-side window structure -- and a native OS window -- for every VM-side one.) It doesn't matter which thread does this, because we lock both sets of data structures during sync.
We are now waiting for user input. The VM thread is still suspended in its glk_select() call; the UI thread is handling user events normally.
When the user completes input (say, by hitting Enter) we synchronize this back to the VM thread. (Again, we lock both sets of data structures, but we're only copying back the input content and perhaps window sizes.) The VM thread wakes up, completes its glk_select() call, and begins a new round of computation.
For as long as the VM thread is computing, the UI thread does not accept input. (It can still handle UI operations like scrolling or selecting text.)
Eventually (soon!) the VM thread reaches its second glk_select() call. Synchronize again. (We might not create a new UI-side window, but we'd have to copy newly-printed text from VM-side windows to their UI-side counterparts.)
Repeat forever.
If the VM thread calls glk_exit() or returns from glk_main(), it is done. The library should then perform an operation like glk_select(), but waiting for no input -- this call will never complete. The UI will then be in "input mode", allowing the user to scroll and read text, but with no input fields. The application will stay in this state until shut down.
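
The handoff at the heart of this cycle can be sketched with a mutex and a condition variable. This is a minimal illustration assuming POSIX threads, not the iOSGlk code; event_slot_t, slot_init(), vm_wait_for_event(), and ui_post_event() are hypothetical names, and the integer event type and window tag stand in for the real Glk event structure.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical one-slot mailbox shared by the two threads. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  ready;
    bool has_event;   /* true once the UI thread has posted input */
    int  event_type;  /* stand-in for the Glk event type */
    int  window_tag;  /* which window the input came from */
} event_slot_t;

void slot_init(event_slot_t *slot) {
    pthread_mutex_init(&slot->lock, NULL);
    pthread_cond_init(&slot->ready, NULL);
    slot->has_event = false;
    slot->event_type = 0;
    slot->window_tag = 0;
}

/* VM thread: the blocking part of a glk_select()-like call. */
void vm_wait_for_event(event_slot_t *slot, int *type, int *tag) {
    pthread_mutex_lock(&slot->lock);
    /* Loop, because condition variables may wake spuriously. */
    while (!slot->has_event)
        pthread_cond_wait(&slot->ready, &slot->lock);
    *type = slot->event_type;
    *tag = slot->window_tag;
    slot->has_event = false;
    pthread_mutex_unlock(&slot->lock);
}

/* UI thread: called when the player completes input. */
void ui_post_event(event_slot_t *slot, int type, int tag) {
    pthread_mutex_lock(&slot->lock);
    /* The UI accepts no input while the VM is computing,
       so the slot should always be empty here. */
    assert(!slot->has_event);
    slot->event_type = type;
    slot->window_tag = tag;
    slot->has_event = true;
    pthread_cond_signal(&slot->ready);
    pthread_mutex_unlock(&slot->lock);
}
```

A single slot is enough in this sketch because the UI thread does not accept input while the VM thread is computing, so at most one event is ever in flight.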

Again, the application will spend nearly all of its time waiting for input (inside glk_select()). The VM thread normally computes only for short intervals after an input event.

The sync procedure (entering glk_select) goes roughly like this:

Lock data structures.
Close any UI windows whose VM windows have disappeared.
Create new UI windows for any VM windows that don't have them yet.
Recompute UI window sizes based on VM window split info.
Update the contents of each UI window to match the updates stored in the matching VM window.
Update the input state of each UI window to match the matching VM window.
Unlock data structures.
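
The steps above might look like this in C. This is a rough sketch under several assumptions: fixed-size window arrays, plain text without styles, and a single pthread mutex guarding both sets of structures. All type and function names here (library_state_t, sync_on_select(), and so on) are invented for the example, and the size recomputation step is left as a comment because it depends on the window split tree.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <string.h>

#define MAX_WINDOWS 8
#define PENDING_MAX 256
#define SHOWN_MAX 1024

/* VM-side window: what the game has asked for since the last sync. */
typedef struct {
    bool in_use;
    unsigned tag;              /* unique identifier, never reused */
    char pending[PENDING_MAX]; /* text printed during this cycle */
    bool wants_input;
} vm_window_t;

/* UI-side window: what the player currently sees. */
typedef struct {
    bool in_use;
    unsigned tag;
    char shown[SHOWN_MAX];     /* accumulated, scrollable text */
    bool accepts_input;
} ui_window_t;

typedef struct {
    pthread_mutex_t lock;      /* guards both arrays during sync */
    vm_window_t vm[MAX_WINDOWS];
    ui_window_t ui[MAX_WINDOWS];
} library_state_t;

void state_init(library_state_t *st) {
    memset(st->vm, 0, sizeof st->vm);
    memset(st->ui, 0, sizeof st->ui);
    pthread_mutex_init(&st->lock, NULL);
}

bool vm_has_tag(const library_state_t *st, unsigned tag) {
    for (int i = 0; i < MAX_WINDOWS; i++)
        if (st->vm[i].in_use && st->vm[i].tag == tag) return true;
    return false;
}

ui_window_t *find_ui(library_state_t *st, unsigned tag) {
    for (int i = 0; i < MAX_WINDOWS; i++)
        if (st->ui[i].in_use && st->ui[i].tag == tag) return &st->ui[i];
    return NULL;
}

ui_window_t *create_ui(library_state_t *st, unsigned tag) {
    for (int i = 0; i < MAX_WINDOWS; i++) {
        if (!st->ui[i].in_use) {
            memset(&st->ui[i], 0, sizeof st->ui[i]);
            st->ui[i].in_use = true;
            st->ui[i].tag = tag;
            return &st->ui[i];
        }
    }
    return NULL; /* no free slot */
}

void sync_on_select(library_state_t *st) {
    pthread_mutex_lock(&st->lock);

    /* Close UI windows whose VM windows have disappeared. */
    for (int i = 0; i < MAX_WINDOWS; i++)
        if (st->ui[i].in_use && !vm_has_tag(st, st->ui[i].tag))
            st->ui[i].in_use = false;

    for (int i = 0; i < MAX_WINDOWS; i++) {
        vm_window_t *vw = &st->vm[i];
        if (!vw->in_use) continue;

        /* Create a UI window if this VM window has none yet. */
        ui_window_t *uw = find_ui(st, vw->tag);
        if (!uw) uw = create_ui(st, vw->tag);
        /* Stale UI windows were closed above, so with unique tags
           a free slot should always exist. */
        assert(uw != NULL);

        /* (Size recomputation from split info would go here.) */

        /* Move this cycle's text across, then update input state. */
        strncat(uw->shown, vw->pending, SHOWN_MAX - strlen(uw->shown) - 1);
        vw->pending[0] = '\0';
        uw->accepts_input = vw->wants_input;
    }

    pthread_mutex_unlock(&st->lock);
}
```

In a real library the UI-side half of this work would also create and destroy native OS objects, which is why the sync has to be arranged so that those calls land on the UI thread.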

Additional Notes

It is important to have unique window identifiers for the sync process. (If the VM side closes one window and immediately opens another, the UI side will have to recognize that this is a different window.) A simple incrementing counter suffices. The game code will never see these identifiers.
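
To see why a slot index or pointer is not enough, consider this small sketch. The names (vm_windows_t, vm_window_open(), vm_window_close()) are hypothetical, and the fixed-size slot array is only an assumption to keep the example short. A window closed and reopened in the same cycle lands in the same slot, but the counter gives it a different tag.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_WINDOWS 4

typedef struct {
    bool in_use;
    unsigned tag;        /* 0 means "never opened" */
} vm_window_t;

typedef struct {
    vm_window_t slots[MAX_WINDOWS];
    unsigned last_tag;   /* only ever incremented */
} vm_windows_t;

/* Open a window in the first free slot; returns the slot index, or -1. */
int vm_window_open(vm_windows_t *w) {
    for (int i = 0; i < MAX_WINDOWS; i++) {
        if (!w->slots[i].in_use) {
            w->slots[i].in_use = true;
            w->slots[i].tag = ++w->last_tag;
            return i;
        }
    }
    return -1;
}

/* Close a window. The slot may be reused, but its old tag never is. */
void vm_window_close(vm_windows_t *w, int slot) {
    assert(slot >= 0 && slot < MAX_WINDOWS);
    w->slots[slot].in_use = false;
}
```

At sync time the UI side compares tags, not slots, so it closes the window with the old tag and creates a fresh one for the new tag.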

It is also important to have unique identifiers for line input events. (A window may keep a line input event active across several glk_select() cycles, or it may cancel and restart line input within a single cycle. The UI side must be able to distinguish these cases. So it can't just look at a boolean flag "is line input active?" at sync time.)
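
One way to picture this, as a sketch only: each line input request gets a fresh number from a counter, and at sync time the UI side compares the number it last saw with the VM side's current one. The enum, the function names, and the use of 0 for "no input pending" are all assumptions made for this example.

```c
#include <assert.h>

typedef struct {
    unsigned line_input_id;  /* 0 means no line input is pending */
} vm_window_t;

typedef enum {
    INPUT_NONE,     /* no field before, none now */
    INPUT_START,    /* create a fresh input field */
    INPUT_KEEP,     /* same request: keep the field and its typed text */
    INPUT_REPLACE,  /* different request: discard the field, make a new one */
    INPUT_REMOVE    /* request cancelled: remove the field */
} input_action_t;

/* VM side: called when the game requests line input in a window. */
void vm_request_line_input(vm_window_t *win, unsigned *counter) {
    /* Glk forbids a second request while one is pending. */
    assert(win->line_input_id == 0);
    win->line_input_id = ++*counter;
}

/* VM side: called when the game cancels line input. */
void vm_cancel_line_input(vm_window_t *win) {
    win->line_input_id = 0;
}

/* UI side, at sync time: decide what to do with the input field. */
input_action_t ui_sync_line_input(unsigned *ui_id, unsigned vm_id) {
    unsigned old_id = *ui_id;
    *ui_id = vm_id;
    if (vm_id == old_id) return old_id ? INPUT_KEEP : INPUT_NONE;
    if (vm_id == 0) return INPUT_REMOVE;
    return old_id ? INPUT_REPLACE : INPUT_START;
}
```

With a boolean flag, the "keep" and "replace" cases would look identical at sync time (input was active before, and is active now), and the UI would wrongly preserve half-typed text from the cancelled request.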

A VM-side text buffer window only needs to store the text printed in the current cycle. It can be emptied out at sync time, because the text gets appended to the UI-side buffer window and then lives there forever. (Or until the UI-side window is cleared, or trims its history.)

A VM-side text grid window should keep the entire current state of the grid, as printed by the VM. (This is not strictly necessary, but it's easiest to sync the entire grid every cycle.)
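
A small sketch of the grid case, with hypothetical names (vm_grid_t, grid_put_char(), grid_sync()) and a tiny fixed grid size chosen only for illustration. Unlike buffer text, nothing is emptied at sync time; the whole cell array is copied across.

```c
#include <assert.h>
#include <string.h>

#define GRID_W 8
#define GRID_H 2

typedef struct {
    char cells[GRID_H][GRID_W];  /* the whole grid, kept across cycles */
    int cur_x, cur_y;
} vm_grid_t;

void grid_clear(vm_grid_t *g) {
    memset(g->cells, ' ', sizeof g->cells);
    g->cur_x = g->cur_y = 0;
}

void grid_move_cursor(vm_grid_t *g, int x, int y) {
    assert(x >= 0 && y >= 0);
    g->cur_x = x;
    g->cur_y = y;
}

void grid_put_char(vm_grid_t *g, char ch) {
    /* Past the end of a line: wrap to the start of the next one. */
    if (g->cur_x >= GRID_W) { g->cur_x = 0; g->cur_y++; }
    /* Past the last line: printing has no effect. */
    if (g->cur_y >= GRID_H) return;
    if (ch == '\n') { g->cur_x = 0; g->cur_y++; return; }
    g->cells[g->cur_y][g->cur_x++] = ch;
}

/* At sync time: copy every cell to the UI side; the VM copy is kept. */
void grid_sync(char ui_cells[GRID_H][GRID_W], const vm_grid_t *vm) {
    memcpy(ui_cells, vm->cells, sizeof vm->cells);
}
```

Copying the full grid costs a little more than tracking changed cells, but grids are small and the UI side never has to reason about partial updates.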

If the UI application window is resized, this percolates into the VM as an "arrange" event. (The VM might respond to this by redrawing the status line.) Usually this happens when the app is in input mode (VM is blocked on glk_select()). This is the easy case; you hand off the arrange event to glk_select(), just like any other user input, and glk_select() ends.

However, you should consider that a window resize might occur while the VM is running. (This is unlikely, because the VM only runs in short bursts, but it is possible!) In this case you will have to stash the arrange event and deliver it the next time glk_select() occurs. It is sufficient to use a boolean flag here; if several window resizes occur in a row, you only need to deliver one arrange event.
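
The stashed flag could be as simple as the following sketch, which assumes C11 atomics so that the UI thread can set the flag without taking a lock. The function names are hypothetical. It covers only the case where the VM is running; when the VM is already blocked, the event is handed over directly as described above.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Set by the UI thread, consumed by the VM thread. Starts out false. */
static atomic_bool arrange_pending;

/* UI thread: the application window was resized while the VM was running. */
void ui_note_resize(void) {
    atomic_store(&arrange_pending, true);
}

/* VM thread: called on entry to a glk_select()-like call. Returns true
   if an arrange event should be delivered right away, and clears the flag. */
bool vm_take_pending_arrange(void) {
    return atomic_exchange(&arrange_pending, false);
}
```

Because the flag is a boolean and not a queue, any number of resizes during one burst of VM computation collapse into a single arrange event.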

The glk_fileref_create_by_prompt() call blocks and syncs, just like glk_select(). The UI should display a modal file dialog and wait for the user to select a file. Other UI events (line input, char input, arrange) cannot be accepted.

The experimental gidebug_pause() call also blocks and syncs. The UI should display a debug console and accept debug commands.

When RemGlk and GlkOte cooperate to connect a web server application to a web client, I use a similar two-process plan. In this case RemGlk (a single-threaded C library) acts as the VM "thread"; GlkOte (a JavaScript library running in a web browser) acts as the UI "thread". These are separate processes on separate machines, but the synchronization logic is as I have described. RemGlk keeps the VM-side data structures, GlkOte keeps the UI-side structures, and they synchronize by passing JSON messages back and forth.

Life gets more complicated in the mobile world. A mobile application has to be able to launch back to a previous state, seamlessly. Glk was not designed for this; it requires an extended Glk library that can serialize its entire state, including window contents. That's a topic for another time, however.

The difficulty for threading is that when the game ends, the mobile convention is to display a "Restart" button. (It is never appropriate for a mobile application to shut itself down.) Therefore, the VM thread must be prepared to restart glk_main() after glk_exit(). Glulxe has been updated to permit this, but other interpreters might not.
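
One possible shape for a restartable VM thread, offered purely as an illustration: the exit call unwinds to the top of the thread with setjmp()/longjmp(), and the thread loops back into the main function. This is an assumption about how it could be done, not a description of iOSGlk or Glulxe; sketch_glk_main() and sketch_glk_exit() are stand-ins for the real entry points, and the restarts counter simulates the player tapping "Restart".

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf game_over;   /* where sketch_glk_exit() unwinds to */
static int runs_started;    /* how many times the game was launched */

/* Stand-in for glk_exit(): never returns to its caller. */
void sketch_glk_exit(void) {
    longjmp(game_over, 1);
}

/* Stand-in for glk_main(): a real one would run the interpreter. */
void sketch_glk_main(void) {
    runs_started++;
    sketch_glk_exit();
}

/* Body of the VM thread. `restarts` simulates Restart-button taps. */
void vm_thread_body(int restarts) {
    assert(restarts >= 0);
    /* volatile: the value must survive the longjmp() back into this frame. */
    volatile int remaining = restarts;
    for (;;) {
        if (setjmp(game_over) == 0)
            sketch_glk_main();  /* ends by returning or by longjmp() */
        /* The game is over. A real library would sync here and block
           until the player asks for a restart. */
        if (remaining == 0) break;
        remaining--;
    }
}
```

The loop is the easy part. The interpreter also has to reset all of its own global state before the main function runs again, which is why this needs cooperation from the interpreter.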

Last updated May 31, 2014.
