Scripting, baby steps

07 Jun 2015

I’ve made a tentative start integrating a Lua interpreter into the game to enable scriptable gameplay, which is what this post will be about. Writing the post wasn’t easy, for two reasons. First, embedded scripting is a pretty broad topic, and I’d like to avoid losing myself in writing too much about ‘Lua, the scripting language’ as opposed to ‘Lua, as used by 2k14: The Game’. Second, I’m still figuring all this out, and I don’t have a clear picture yet of how things will end up. So I’ll try my best to stick to the relevant details I know at this point, and not drift off into a ‘Lua for Dummies’ guide.

Embedded scripting basics

When embedding a scripting language into a program written in a language from the ‘C family’ (C, C++, Objective-C), one of the first decisions to make is how to bind the functions and data structures (classes) of the native API so they can be used from and passed in & out of the scripting language. Every scripting language has its own ways to represent fundamental types, objects, functions, etc, and generally speaking, they will be different from their native counterparts. This means an interface layer is required to bind the two languages together: it translates data types, exposes (a subset of) the functions of the native API, handles memory management (object lifetime, shared references), etc.

Multiple options are available for implementing the interface layer that binds the native API and the scripting language. I’ll quickly go over the main options and their pros and cons. Not all binding options will be feasible depending on the native language the host program is written in, but I’ll list them anyway to point out the differences.

Static binding

Probably the most commonly used method to interface between native code and scripting languages is static binding by means of an interface wrapper generator. Wrapper generators take the source code of the native application as input, and generate binding code from it, usually with the help of a separate interface definition to translate native objects and data types to the scripting language and vice-versa, and to specify and/or (re-) map the native functions that should be made available for scripting. Examples are SWIG, a wrapper generator that binds C/C++ to a large number of scripting languages, and Luabind for binding C++ to Lua (strictly speaking a C++ template library rather than a generator, but the bindings are just as static).

Pros
- The most obvious benefit of static binding with an interface wrapper generator is that it will (hopefully) allow you to generate bindings without having to write and maintain any code besides (if needed) the interface definition. Reducing the amount of code to write also decreases the chances of introducing bugs in the scripting interface layer.
- Advanced wrapper generators such as SWIG support mapping almost any language feature of the native language (C++ in the case of SWIG), to multiple scripting languages. Especially for very large and complex C++ APIs or APIs that need to be accessible from multiple scripting languages, static binding is almost indispensable.
Cons
- Wrapper generators are typically strictly one-way, allowing the scripting language to call native code, but not the other way around. If two-way binding is necessary, manual bindings are required for the other direction. This may get complicated if the same data types that were generated by the wrapper generator, which are usually opaque and not easily created manually, need to be passed in both directions.
- The generated interface wrapper will directly mimic the native interface in the scripting language. In many cases, directly mapping the native API is exactly what you want, maybe hiding or adding a few functions and types to restrict or extend the scripting capabilities. Sometimes though, a direct translation of the native API would be clumsy to use in scripts, or the native API could be too complex to wrap directly.
- Even though it may initially seem like you are saving yourself a lot of work by having the interface wrapper generated directly from your native API, you will almost always run into problems somewhere down the road where the wrapper generator fails to generate wrapper code from your native API, for example when adding some abstract base class, a weird overloaded operator, template function, or a data type that cannot be automatically wrapped to the scripting language. My experience is that static wrapper generators typically do about 90% of the work for you, but leave you with the remaining 10%, which requires nasty workarounds or obscure binding voodoo to play nice with the generated wrapper code.
- An intermediate build step is required that generates the interface wrapper code from the C/C++ code, which complicates the build process and introduces its own set of ‘meta-problems’ where valid native code leads to invalid or incorrect wrapper code. These kinds of bugs or compile errors can be hard to debug, as it may not be directly obvious whether the problem is in the generated wrapper code, the interface definition (if there is any) or in the implementation of the native API itself. Additionally, for complex native APIs containing many C++ classes, the generated interface wrappers can grow absurdly large. I’ve seen interface wrappers generated by SWIG that ran over 120K SLOC, even for only moderately complex C++ APIs.
- Most wrapper generators I’m aware of only support C or C++ for the native API to wrap. I’m not aware of any static interface generators that can reliably generate interface wrappers from Objective-C code.

Runtime automatic binding

An alternative available in some situations where the native (host) language allows introspection/reflection, is runtime automatic binding. In this case, the interface wrapper is not generated at compile time. Instead, code is inserted into the scripting language, the native language, or both, to bridge the two at runtime. When calling the native API from a script, the bridge code will use introspection to find the function or method to call and its signature, convert scripting language types to native types, call the native function, and translate the results. The other direction (if supported by the binding layer) works in a similar fashion. In other words, the bridge code acts as a proxy between the host code and the scripting language.

Objective-C provides introspection and reflection facilities that allow runtime automatic binding, and there have been quite a few Objective-C to Lua bridges, such as LuaCocoa and iPhone Wax. Somehow none of these appear to be actively maintained though.
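To make the proxy idea a bit more concrete, here is a minimal sketch in plain C. C has no reflection, so a hand-written lookup table stands in for the introspection the Objective-C runtime would provide. All names (NativeEntry, registry, bridge_call, native_add, native_negate) are made up for illustration and have nothing to do with how LuaCocoa or Wax are actually implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical native functions we want to reach by name at runtime. */
static double native_add(const double *args) { return args[0] + args[1]; }
static double native_negate(const double *args) { return -args[0]; }

/* One registry entry: name, argument count, and the function to call.
   A real bridge would obtain this information from the language runtime. */
typedef struct {
  const char *name;
  int argc;
  double (*fn)(const double *args);
} NativeEntry;

static const NativeEntry registry[] = {
  { "add", 2, native_add },
  { "negate", 1, native_negate },
};

/* The 'bridge': look up a function by name, check the argument count,
   call it and store the result. Returns 0 on success, -1 on failure. */
static int bridge_call(const char *name, const double *args, int argc,
                       double *result)
{
  size_t count = sizeof registry / sizeof registry[0];

  assert(name != NULL && args != NULL && result != NULL);

  for (size_t i = 0; i < count; i++) {
    if (strcmp(registry[i].name, name) == 0) {
      if (registry[i].argc != argc) {
        return -1;
      }
      *result = registry[i].fn(args);
      return 0;
    }
  }
  return -1;
}
```

Every call through bridge_call pays for the name lookup and the argument check before the native function runs. That per-call cost is the kind of overhead to keep in mind when a script calls into native code many times per frame.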

Pros
- Runtime automatic binding has the same benefits as static binding, without most of its downsides. No additional step is required in the build process, and problems in the binding code will only occur at runtime and will typically be much easier to debug.
- Besides binding your own code so it can be used from scripts, dynamic binding libraries can often also expose system-level APIs and frameworks. For example using LuaCocoa or iPhone Wax it is possible to write complete iOS applications in Lua, using (almost) the full set of Apple Cocoa/UIKit frameworks.
- Because the binding is dynamic, interesting tricks such as hot-swapping, duck-typing and monkey patching classes and functions are possible.
Cons
- Dynamic binding introduces overhead on every call between script and native code, which can start to add up if the number of calls is high compared to the average runtime of the calls. This is particularly relevant for scripting game logic, which may have to execute up to 60 times a second.
- Some of the cons of static binding also apply to runtime dynamic binding, such as exposing an API to scripts that directly mimics the native API. This may not always be desirable.

Manual binding

The last alternative is to write all the binding code manually. This involves interfacing directly with the native binding API of the script language, writing your own type conversions, thinking about memory management if references to the same objects need to be shared between host code and script code, everything basically. Knowledge of nitty-gritty details of the scripting language is required, and you’ll have to figure out yourself how to map advanced features of your native API to something that makes sense in the scripting language.

Pros
- Full control over what part of your API is exposed, and how. If only a very small and well-defined fraction of the functionality of the native API has to be available for scripting, it doesn’t make sense to map complicated class interfaces onto the API for scripts. Instead, you can pick and choose what functionality needs to be scriptable and map it to whatever makes the most sense in the scripting language.
- The scripting API can be implemented to behave more like a service, hiding the underlying implementation, where the scripts themselves can be seen as ‘clients’ of the scripting API. If you decide to completely restructure the native implementation, introducing big changes to its class structure, the scripting API does not have to reflect this. In other words, manual binding makes it much easier to ensure backwards compatibility for existing scripts.
- No overhead introduced by inefficient generated wrapper code or runtime binding. More options available to optimize the interaction between host code and script code.
Cons
- Lots of work. Your native API changes? More work. The scripting language doesn’t support some feature or data structure of your native API? Even more work. You want to add support for a second scripting language? Do all the work all over again. You get the point…
- Intricate knowledge of the native binding API provided by the scripting language is needed. To say that native binding APIs for scripting languages aren’t typically the most enjoyable things to work with would be an understatement, at least for some scripting languages. I’ve experienced this first-hand when working on Tcl and Perl binding code: just looking at it makes your eyes bleed. Python is already a lot better, and Lua is actually not so bad, mostly because it takes such a bare-bones approach to more or less everything about the language.

Choosing a binding strategy

Considering all of the above in the context of the game we’re writing, I opted for manual binding. This may seem like a huge waste of time, but since the API surface of the scripting layer will be very small, and because Lua has such a simple and elegant way to implement two-way interaction between host code and script code, I’m convinced the additional work to get up and running will pay off in the long term. SWIG support for static binding of Objective-C code appears to be experimental at this point, which means I would end up writing a thin C/C++ layer to wrap anyway, and even that would still only give me one-way binding. Runtime dynamic binding using LuaCocoa or iPhone Wax does not seem very appealing either. I don’t like the performance hit they introduce, and I don’t like the clumsy way the Lua scripts would have to be written to use the runtime bindings. So even if I really hated the extra work required for manual binding (I don’t, it’s actually good fun in its own peculiar way), it would probably still be the best option.

Embedded Lua basics

This is the section where I have to watch out that this post doesn’t turn into a manual for embedding Lua, or a general introduction to Lua programming. I’ll stick to the bare minimum of details, and refer to Programming in Lua for more information. The online version is the first edition of this book, which was written when Lua 5.0 was the latest version. Today, we’re at Lua 5.3 and the language has evolved, apparently. As I’m a total Lua n00b myself, I don’t really know what changes between Lua 5.0 and 5.3 are relevant for our purposes, but if you want to learn about modern Lua programming, make sure to order the current edition of the book.

Lua is a scripting language fully aimed at embedded scripting, for extending a host program written in a different language with scripting capabilities. Lua scripts run inside a Lua virtual machine, which executes Lua bytecode compiled from Lua scripts. The host program can spawn one or more Lua VMs and load Lua code into them from a file, a string, or by directly injecting bytecode. The state of a Lua VM can be accessed, created and/or modified by the host program: global variables, functions, the script bytecode, virtually everything. Typically a Lua script does not run concurrently with the host program, but gets loaded into the Lua VM by the host program, which can then execute it top to bottom, call individual functions defined in the script, add functions to it, etc. After the script finishes or the function call returns, the Lua VM will be idle, but as long as the host program keeps it around its state will be preserved, and can be used for subsequent function calls. Lua does have a ‘thread’ type, which as far as I understand represents coroutines (cooperative multitasking) rather than real threads running ‘in the background’, but I’m not planning to use it, so let’s just pretend the feature doesn’t exist.

Communication between the host program and Lua script is implemented using a virtual stack, which is associated with the Lua VM. Whenever the host program wants to call a Lua function, it first needs to push a reference to the function onto the stack, followed by the arguments for the call. The function can then be called by Lua, which will pop the function and its arguments, execute the function, and push any return values back onto the stack for consumption by the host program. All of this happens by means of the Lua native C API, which contains a number of functions, most of which push, pop, read or set values on the virtual stack. A typical Lua call from a C/Objective-C/C++ program could look like this:

// The lua_State pointer references the Lua VM, which is assumed to be created and 
// initialized with a script that contains a global function 'add' here
lua_State *L = ...;

// Get reference to 'add' function from the VM using lua_getglobal, which
// will push the reference to the stack.
lua_getglobal(L, "add");

// Push the arguments for our function call to the stack.
lua_pushnumber(L, 1.0);
lua_pushnumber(L, 4.0);

// Call add(1.0, 4.0), which will pop 2 arguments and the function reference,
// and push one return value back onto the stack
lua_call(L, 2, 1);

// Get the result, which will be at the top of the stack (negative indices
// count down from the top, so index -1 is always the topmost value)
double result = lua_tonumber(L, -1);

// Pop the return value to keep the stack balanced. After the call, the
// stack will be left in the same state as at the start of this snippet
lua_pop(L, 1);
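Because the relative stack indices can be confusing at first, here is a small stand-alone sketch of how an index maps to a slot on the stack. It does not use the Lua library: toy_absindex and toy_depth_after_call are hypothetical helpers. The first mirrors what I understand lua_absindex to do for ordinary indices: positive indices count up from the bottom of the stack (1 is the first value pushed), negative indices count down from the top (-1 is the last value pushed).

```c
#include <assert.h>

/* Convert a stack index to an absolute (bottom-based) position, given the
   current number of values on the stack. Toy model, not the Lua API. */
static int toy_absindex(int top, int idx)
{
  assert(idx != 0 && idx <= top && idx >= -top);
  return idx > 0 ? idx : top + idx + 1;
}

/* Stack depth after a call: the function and its arguments are popped,
   and the requested number of results is pushed. */
static int toy_depth_after_call(int top, int nargs, int nresults)
{
  assert(top >= nargs + 1);
  return top - (nargs + 1) + nresults;
}
```

Applied to the add example: after pushing the function and two arguments the depth is 3, index -1 is the second argument (absolute position 3) and index -3 is the function (absolute position 1). After lua_call(L, 2, 1) the depth is 1, holding only the result.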

Calling back into the host program from the Lua script works in a similar fashion. First, the host program needs to provide the Lua script with a reference to a C-function with a particular signature, which takes a single argument (a pointer to the Lua VM), and returns a single integer (the number of return values the native function left on the stack). The function reference can be assigned to a global inside the Lua VM using lua_setglobal, just like any other Lua variable:

static int the_function(lua_State *L)
{
  // Push result, in this case the constant '123' and return
  lua_pushnumber(L, 123.0);

  return 1;
}

// ...

// Register the C function by pushing it onto the stack. The lua_setglobal
// function will pop the reference and assign it to a global 'get_a_number'
lua_pushcfunction(L, the_function);
lua_setglobal(L, "get_a_number");

// ...

After registering the function, the Lua script loaded into the VM can call it as if it were directly defined as a Lua function. As an interesting example, consider the following Lua script, which combines the Lua add function and the native get_a_number function. Called as add(1, 4) it returns the value 128 (1 + 4 + 123), going full circle from native, to Lua, to native, to Lua, and back:

function add(a, b)
  local n = get_a_number()
  return a + b + n
end

Lua language highlights

To provide some context for integrating Lua into the game, a bare minimum overview of Lua the language is unavoidable. I’ll itemize the bits and pieces that may be relevant here, and again refer to the Lua PIL for additional information.

- Lua only has a very small set of value types. Off the top of my head they are: nil, booleans, numbers, strings, tables and functions (there are also userdata and threads, which mostly matter on the C API side).
- Functions are first-class objects in Lua, in other words, you can assign them and pass them around just like any other value.
- The Lua number type represents a floating-point number. Up to Lua 5.2 there is no separate integer type; Lua 5.3 adds an integer subtype of number.
- Lua has limited type coercion between strings and numbers, but not between booleans and numbers or strings, in other words, false is not the same as 0 or "0". In fact, comparing a boolean for equality with a value of any other type will always return false.
- Tables are used for everything that is not a scalar, a string or a function. They are used to represent arrays, when only using numbers as table keys, as associative arrays, and as the basic building block to create class-like objects.
- By convention, indices start from 1, which is reflected in all the Lua C API functions, and all Lua standard libraries. Since there is no technical reason why you couldn’t use 0 as a table key, nothing is stopping you from using 0-based indices if you like though.
- Lua has no built-in support for classes or structures, but both can be implemented using tables. Because functions can be used as table values, a class can be defined as a table mapping function names to function values.

Lua has a mechanism called metatables, whose special fields are known as metamethods, which can be used to extend the ‘tables as classes’ idea even further, and separate instance variables from class variables and methods. It goes too far to explain all the intricacies of metatables here, but the basic idea is as follows: a table containing class variables and methods can be assigned as the metatable of another table that holds instance variables. This separates the class definition (defined by the metatable) from the instance itself (the regular table). Tables have no metatable by default, but once one is set, Lua consults its __index field whenever a non-existing key is requested from the table. By pointing the __index field of the metatable at the metatable itself, any time a class variable or method is referenced or called using the instance table, Lua will automatically get it from the metatable.
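The lookup rule behind this idiom is simple enough to model in a few lines of plain C. The sketch below is a toy, not how Lua is implemented: ToyTable, toy_rawget and toy_get are hypothetical names, values are plain integers, and index_fallback stands in for ‘the table that the __index field of my metatable points to’.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_FIELDS 4

/* A tiny string-keyed table with an optional fallback table. */
typedef struct ToyTable {
  const char *keys[TOY_MAX_FIELDS];
  int values[TOY_MAX_FIELDS];
  int count;
  const struct ToyTable *index_fallback;
} ToyTable;

/* Look a key up in this table only. Returns 1 and stores the value if found. */
static int toy_rawget(const ToyTable *t, const char *key, int *value)
{
  for (int i = 0; i < t->count; i++) {
    if (strcmp(t->keys[i], key) == 0) {
      *value = t->values[i];
      return 1;
    }
  }
  return 0;
}

/* Look a key up the way an instance with a 'class' behaves: try the table
   itself first, then follow the chain of fallback tables. */
static int toy_get(const ToyTable *t, const char *key, int *value)
{
  assert(key != NULL && value != NULL);

  for (; t != NULL; t = t->index_fallback) {
    if (toy_rawget(t, key, value)) {
      return 1;
    }
  }
  return 0;
}
```

With an instance table whose index_fallback points at a class table, a key stored on the instance shadows the class-level value, while a key that only exists on the class is still found through the instance. Giving the class table a fallback of its own yields inheritance-like behaviour from the same rule.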

To hide what is going on behind the scenes, Lua has some syntactic sugar to make tables with metatables appear very much like classes in regular languages. For example x:print(s) is syntactic sugar for x.print(x, s), which itself is syntactic sugar for x["print"](x, s). Inside the function definition function x:print(s), the first parameter is hidden from the signature and assigned to a local variable self, allowing access to the instance the print function was called on. The metatable/metamethod idiom may look a little weird at first but it works remarkably well, and allows most of the features you’ll find in ‘real’ object-oriented languages. It’s possible to implement inheritance, class methods, constructors and destructors, overloading, etc. Conspicuously missing here is data hiding: there is no such thing as private or protected scoping of variables or methods. The Lua philosophy is that if the interfaces of whatever it is you are writing in Lua really need to be rock-solid, safe, and cannot be abused, you may want to reconsider if it wouldn’t be better to use a different language.

Integrating Lua into the game

After this long treatise about embedded scripting in general, and Lua in particular, we’ve finally ended up at the part where I’ll show how this all applies to 2k14: The Game. At this point, I’m still just experimenting to get a feeling for what works and what doesn’t, so how things end up eventually could be completely different. I’ll just give the executive summary of the current messy state the Lua integration is in, and report back in a later post when things have crystallized out a little.

My initial idea was to start with support for scriptable planets, to be able to define new planets in Lua. I quickly abandoned this idea for two reasons. First of all, since planets are populated by entities, setting up a planet from Lua would depend on scriptable entities, and I wasn’t ready yet to decide how to integrate those. Second, we want all scriptable game state, from game, to player, to planet, to entities, to run inside the same Lua VM. I don’t know whether it is even possible to access state across different Lua VMs, but I imagine it would be quite painful. So I switched to a top-down approach and created a scriptable game class K14ScriptableGame instead.

The K14ScriptableGame class is derived from K14Game, and extends it with the following responsibilities:

- Creating a Lua VM which will be used for all game state: player, game, planet, entities. This VM will need to be accessible to future K14ScriptablePlanet and K14ScriptableEntity classes.
- Registering a Lua ‘Game’ class (table with metatable) inside the Lua VM, and binding it to a set of native wrapper functions that allow Lua to call back into the host code.
- Loading the game script into the VM so K14ScriptableGame can call out to it.
- Overriding the standard K14Game gameplay event handling function, to call a Game:processEvent method defined in the Lua script instead of the ‘normal’ event handler defined in native code. The way I set it up was to have the Lua function return a boolean indicating whether it processed (‘consumed’) the event passed to it or ignored it. If the event was ignored, the K14ScriptableGame class will pass it on to the normal K14Game event handler.
- Implementing a ‘wrap’ method, that will wrap the Objective-C class instance by creating a Lua Game instance and associating it with the Objective-C self pointer. When Lua has to call back into native code, it will pass this wrapped instance containing the pointer to the native class, so the call can be relayed to any one of its Objective-C selectors. Creating the Lua wrapper instance all happens using Lua C API functions that manipulate the stack. After calling wrap, the wrapped K14ScriptableGame instance is left on the stack so it can be used directly as an argument for calling a function defined in the Lua script.

At this point, this is basically all there is to it. Of interest here may be the functions that register the Lua ‘Game’ class, and the function that wraps a K14ScriptableGame instance and pushes it to the stack so it can be used as an argument for a Lua function call:

+(void) registerClasses: (lua_State *) L
{
  // local m = {}
  // m.__index = m
  lua_createtable(L, 0, 0);
  lua_pushvalue(L, -1);
  lua_setfield(L, -2, "__index");
  
  // m.setLives = wrap_set_lives
  lua_pushcfunction(L, wrap_set_lives);
  lua_setfield(L, -2, "setLives");
  
  // Game = m
  lua_setglobal(L, "Game");
}

-(void) wrap
{
  // game = {
  //    game = self,
  //    player = {
  //      lives = self.player.lives,
  //    }
  // }
  lua_createtable(L, 0, 0);
  
  lua_pushlightuserdata(L, (__bridge void *) self);
  lua_setfield(L, -2, "game");
  
  lua_createtable(L, 0, 0);
  
  lua_pushnumber(L, self.player.lives);
  lua_setfield(L, -2, "lives");
  
  lua_setfield(L, -2, "player");

  // return setmetatable(game, Game)
  lua_getglobal(L, "Game");
  lua_setmetatable(L, -2);
}

In the registerClasses function you can see the somewhat strange metatable dance used to define a table as a class, which is used inside the wrap function to wrap the K14ScriptableGame instance. Creating an instance amounts to setting up a table containing the instance variables, and then setting the Game table as its metatable. For demonstration purposes I’ve added an instance variable that holds player information (only number of lives in this case), and defined a Game:setLives method on the Game class that calls back into the native function wrap_set_lives.

Calling out to the Lua Game:processEvent method works like this. Note that right now only the event type code is passed to the Lua script, and not its parameters; it’s just a proof-of-concept example:

-(void) processEvent: (K14GameplayEvent *) event
{
  // Push Lua function to call
  lua_getglobal(L, "Game");
  lua_getfield(L, -1, "processEvent");
  
  // Push 'self', wrapped as the Lua 'Game' class
  [self wrap];
  
  // Push event type code
  lua_pushnumber(L, event.type);
  
  // Call Game:processEvent
  lua_call(L, 2, 1);
  
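  // Read the return value, and relay the original event to the
  // native handler if the Lua handler did not consume it
  BOOL consumed = lua_toboolean(L, -1);
  
  if (!consumed)
  {
    [super processEvent:event];
  }
  
  // Pop the return value and the 'Game' table that lua_getglobal left
  // underneath it, so the stack is balanced again
  lua_pop(L, 2);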
}

The wrapper function wrap_set_lives does nothing more than extract the K14ScriptableGame instance pointer from its first argument and the number of lives to set from the second argument, and use them to set the game.player.lives property:

static int wrap_set_lives(lua_State *L)
{
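  // Get function call arguments: 'self' is at index 1, 'lives' at index 2
  int lives = (int) lua_tointeger(L, 2);
  
  // Push self.game, the light userdata holding the native instance pointer
  lua_getfield(L, 1, "game");
  
  K14ScriptableGame *game = (__bridge K14ScriptableGame *) lua_touserdata(L, -1);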
  
  // Set player number of lives
  game.player.lives = lives;
  
  // Return number of return values for the wrapped function, which are 
  // none in this case
  return 0;
}

Putting all this together, we can now write a Lua script that implements custom game logic by simply implementing a Game:processEvent method. As an example, the following Lua script intercepts the K14GameplayEventEntityHit event, and increases the number of lives when it is received:

function Game:processEvent(event)

  -- Event code 0 corresponds with 'entity hit'
  if (event == 0) then 
    self:setLives(self.player.lives + 1)
  end
  
  return false
end

Things can’t get much simpler than this ;-). Obviously these are still, as the title of this post indicates, baby steps. Besides also wrapping the K14Planet and K14Entity classes, at the very minimum we also want to wrap some type definitions, constants, etc. so we could write things like if (event == Game.EntityHitEvent) ... instead of using magic numbers. So there’s enough work ahead, but it’s a start.
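One possible way to get rid of the magic numbers is to keep a single table of names and values on the native side, and register each pair as a field of the Game table at start-up. The sketch below only shows the table and a lookup in plain C, so it compiles without the Lua library. The registration loop itself (presumably a lua_pushinteger followed by a lua_setfield per entry) is left out. Only the pairing of EntityHitEvent with code 0 comes from this post; ScriptConstant, game_constants, find_constant and the second entry are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Name/value pairs to expose to scripts. */
typedef struct {
  const char *name;
  int value;
} ScriptConstant;

static const ScriptConstant game_constants[] = {
  { "EntityHitEvent", 0 },
  { "ExampleOtherEvent", 1 },   /* hypothetical placeholder */
};

static const size_t game_constant_count =
  sizeof game_constants / sizeof game_constants[0];

/* Find a constant by name. Returns 1 and stores the value if it exists. */
static int find_constant(const char *name, int *value)
{
  assert(name != NULL && value != NULL);

  for (size_t i = 0; i < game_constant_count; i++) {
    if (strcmp(game_constants[i].name, name) == 0) {
      *value = game_constants[i].value;
      return 1;
    }
  }
  return 0;
}
```

Keeping the names in one table means the native enum values and the names scripts see are defined in a single place, which should make it harder for the two to drift apart.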

Video

No video this time, as there aren’t really any visible changes since the last post. I could of course make a video where you can see the number of lives increase when an enemy is hit, but that would be a little pointless ;-)

Next steps

I’ll continue exploring and prototyping game scripting to extend it until it is possible to move the full game logic and planet setup to Lua scripts. After that, I’ll try to also move the entity classes to Lua. The end goal is to be able to completely do away with all native K14Entity subclasses, and draw a very strict line between the Objective-C side (the engine) and the Lua side (the logic) of the game.

Development scoreboard

Getting the proof-of-concept scripting functionality up and running to the point it is now took about 5 hours, for a total of ~180 hours. I did at least 2 hours of reading up on embedded Lua scripting, but since we’re only counting time spent behind the computer, I won’t add them to the development time. SLOC count only increased by 25, for a total of 2492 source lines of code.

2k14 : The Game