sunday, 20 february 2011

posted at 21:58

Like many games, Pioneer uses Lua for its content generation and dynamic world stuff. It doesn't expose enough of its lucky charms to the world this way though (in my opinion), which is why I'm currently doing a major overhaul of everything Lua-related.

From looking at the code, there are a couple of ways that Lua has been integrated over time. All the models in the game (ships, space stations, buildings) consist of static data (textures, object definitions) and a Lua script to hook it all together. The script can define a function that gets called as the model is rendered and that can actually modify the model. This is how flashing running lights, rotating radar dishes and other things are done. It's quite a clever system. The Lua style used is pretty much plain calls to the Lua C API, with some hand-coded Lua classes for matrices, vectors, and so on.

The other place Lua is used is in the modules. These are essentially plugins that add stuff to the game based on triggers. It's how missions and smugglers appear on the bulletin board and pirates appear in the system when you enter it, to name but two functions. This interface uses a combination of normal Lua and OOLua.

OOLua is a C++ binding for Lua, which provides lots of macro and template magic to make it easy to expose your C++ classes as Lua classes. It automatically handles unpacking the Lua stack and argument typing and whatnot, so that when you do something like o = SomeClass:new("abc", 1) and o:thing(23) in Lua, it'll arrange calls to SomeClass::SomeClass(const char* a, int b) followed by void SomeClass::thing(int a). I'll leave it to you to go and read the theory and code linked from the OOLua homepage. It's quite interesting, though it took me quite a bit of use and misuse before I really started to get my head around it.

My plan, which of course you read because I linked it above, is to expose pretty much everything that might be useful to people wanting to add to the Pioneer universe as Lua, since that's theoretically easier for non-programmers to get to grips with. A good start has already been made to getting OOLua hooked up, so I decided after a few experiments that my first steps should be to convert all remaining non-OOLua stuff to use it. The big one here is all the code in LmrModel.cpp, which is where all the model magic mentioned above happens (indeed, LMR stands for "Lua Model Renderer", which should give some idea of just how central Lua is to all of this).

The way a Lua model definition works is pretty straightforward. At boot Pioneer loads all the model Lua files, which typically contain calls to define_model(). The arguments to define_model() contain lots of calls to functions that define the model. Examples are cylinder(), texture() and so on. As you'd expect, Lua calls all of these functions to assemble the arguments before calling define_model() with the whole lot. LmrModel turns this inside out. When a call is made to e.g. cylinder(), it actually pushes a bunch of commands like "draw triangle", "draw billboard", "use light" and so on onto a stack. When the final call to define_model() is made, LmrModel attaches that stack to a global model definition. It's a bit unusual and can be prone to errors (e.g. currently if you call one of the element functions outside of define_model(), you'll usually get a segfault), but it also simplifies the code a great deal because it greatly reduces the amount of data that needs to be passed back and forth between C++ and Lua.
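To make the inside-out flow concrete, here's a minimal C++ sketch of the idea. It is not the code from LmrModel.cpp: the Op struct, the g_pending and g_models globals and the function bodies are all hypothetical stand-ins, and the real element functions take arguments and emit far richer commands.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a drawing command; not a type from LmrModel.cpp.
struct Op { std::string name; };

static std::vector<Op> g_pending;                        // commands queued by element functions
static std::map<std::string, std::vector<Op>> g_models;  // finished model definitions

// Element functions record commands as a side effect instead of returning data.
void cylinder() {
    g_pending.push_back({"draw triangle"});
    g_pending.push_back({"draw triangle"});
}

void use_light() { g_pending.push_back({"use light"}); }

// Runs last, because Lua evaluates the arguments before making the call.
// It simply adopts whatever the element functions queued up.
void define_model(const std::string& name) {
    g_models[name] = std::move(g_pending);
    g_pending.clear();  // leave the queue empty and ready for the next model
}

// Usage: the element calls happen first, then define_model() collects them.
void demo_define_model() {
    cylinder();
    use_light();
    define_model("radar");
    assert(g_models.at("radar").size() == 3);
}
```

Nothing but the model name crosses the boundary in define_model() here, which is the simplification described above. The price is the hidden global queue, and that is why calling an element function at the wrong time is dangerous.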

The difficult thing about converting all of this to OOLua is that all the element functions are static functions, not object methods. OOLua is really only built for proper classes, and has only the most minimal support for calling an unadorned function from Lua. That minimal support does do the stack and type signature handling that I described above, so I've built an entire layer on top of it to sanely handle calling static functions from Lua. It's still under heavy development (you can follow my lua-overhaul branch if you're interested) but it's already very functional. Here I want to describe a bit about how it works, because I'm quite proud of what I've been able to do. The details are at the bottom of LmrModel.cpp and in OOLuaStatic.h if you want to follow along.

OOLua requires that all static functions be registered against a Lua class (causing them to appear under its namespace). What that means is that we have to define a class, even if it's empty. Sucks, but let's do it.

Once there's a class in place, it's trivial to register functions against it. A typical call is:

OOLUA::register_class_static<pi_model>(l, "call_model", &static_model::call_model);

pi_model is the already-registered class we're hooking the function to. call_model is the name of the function that appears on the Lua side (so in this case we've just registered pi_model.call_model). The final arg is a pointer to the C function that will be called when the function is invoked.

The function is a standard Lua callback with the signature int func(lua_State*). If you like you can use this as-is, but OOLua provides some extra magic here to use this function as a thunk that does stack unpacking and type checking before passing the call on to a real handling function. A typical callback function for OOLua looks like:

int static_model::call_model(lua_State *l) {
    OOLUA_C_FUNCTION_5(void, call_model, const std::string&, const pi_vector&, const pi_vector&, const pi_vector&, float)
}

The arguments are straightforward - the return type, the name of the function to call, and the types of its arguments. This will result in a call to:

static void call_model(const std::string& obj_name, const pi_vector& pos, const pi_vector& _xaxis, const pi_vector& _yaxis, float scale)

If the types or number of arguments are wrong, then instead a Lua error will be generated.

So this is all very nice, but has some shortcomings. The simplest is the amount of boilerplate that needs to be written to set up a function. Some simple start/end macros to define the thunk function are all that's necessary.

The next thing I stumbled on is the need for a form of multiple dispatch. OOLua already does this for constructors, but not for method calls, which I find a little odd. What it meant is that I had to implement it myself. Since Lua is dynamically typed, there's really no way short of some educated guessing to choose based on argument types, but choosing the function based on the number of arguments is possible. This is expected in a few places in the existing model interface. For example, texture() in its simplest form requires the name of a texture only, but it's also possible to call it with extra args specifying position and transformation of the texture. So we now have two possible functions that we could call. OOLua can't support this directly, so I wrote some macros that, when used, expand to (slightly simplified):

int static_model::texture(lua_State *l) {
    const int n = lua_gettop(l)-1;
    if (n == 1) {
        OOLUA_C_FUNC_1(void, texture, const std::string&)
    }
    if (n == 4) {
        OOLUA_C_FUNC_4(void, texture, const std::string&, const pi_vector&, const pi_vector&, const pi_vector&)
    }
    return 0;
}

Using the macros, this gets written as:

    STATIC_FUNC_1(void, texture, const std::string&)        
    STATIC_FUNC_4(void, texture, const std::string&, const pi_vector&, const pi_vector&, const pi_vector&)
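Stripped of the macros and of OOLua itself, the dispatch-by-argument-count idea looks roughly like the sketch below. Everything in it is a hypothetical stand-in: Args plays the part of the Lua stack, the two handlers play the two texture() overloads, and the exception plays the Lua error.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for the values sitting on the Lua stack.
using Args = std::vector<std::string>;
using Handler = std::string (*)(const Args&);

// Two "overloads", distinguishable only by how many arguments they accept.
std::string texture_name_only(const Args& a) { return "texture(" + a[0] + ")"; }
std::string texture_with_transform(const Args& a) { return "texture(" + a[0] + ") with transform"; }

// The thunk: look at the argument count and forward to the matching handler.
std::string dispatch_texture(const Args& args) {
    static const std::map<std::size_t, Handler> by_arity = {
        {1, &texture_name_only},
        {4, &texture_with_transform},
    };
    const auto it = by_arity.find(args.size());
    if (it == by_arity.end()) {
        // The real thing would raise a Lua error at this point.
        throw std::invalid_argument("texture: no variant takes " +
                                    std::to_string(args.size()) + " arguments");
    }
    return it->second(args);
}

// Usage: the same entry point reaches different handlers.
void demo_dispatch() {
    const Args one = {"hull"};
    const Args four = {"hull", "pos", "uaxis", "vaxis"};
    assert(dispatch_texture(one) == "texture(hull)");
    assert(dispatch_texture(four) == "texture(hull) with transform");
}
```

A lookup table is used here only to keep the sketch short; the macro expansion shown earlier is a plain chain of if statements.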

That works well. Later I found another problem, which was a bit trickier to solve. The function extrusion() looks like this in Lua:

function extrusion (start, end, updir, radius, ...)

The first three args are vectors, the fourth is a number (float). That's not important though. What's important here is that following the required args comes an arbitrary number of vectors to define points for the shape to extrude. This posed a problem: it's easy to create a macro that says "expect 4 or more arguments", but OOLua's function call mechanism fails outright if the number of args on the Lua stack doesn't match the number of arguments called for.

The solution I settled on was to define a separate STATIC_FUNC_4_VA macro. When this appears in the thunk definition, it looks for extra arguments on the stack and puts them into a Lua table. It then pushes the table and the number of items in it onto the stack and calls the function with two extra arguments. All this gives the following:

    STATIC_FUNC_4_VA(void, extrusion, const pi_vector&, const pi_vector&, const pi_vector&, float)

static void extrusion(const pi_vector& start, const pi_vector& end, const pi_vector& updir, float radius, OOLUA::Lua_table t, int nt)

STATIC_FUNC_4_VA expands like so:

if (n >= 4) {
    // (the extra arguments are packed into a table first; see _static_varargs_table())
    OOLUA_C_FUNC_6(void, extrusion, const pi_vector&, const pi_vector&, const pi_vector&, float, OOLUA::Lua_table, int)
}

You can go read _static_varargs_table() too. It's interesting but not really relevant to this discussion.
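For a feel of the packing step without the Lua stack juggling, here's a small self-contained sketch. It is not _static_varargs_table(): Value, Packed and pack_varargs() are hypothetical names, and a std::vector stands in for both the Lua stack and the Lua table.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a Lua value (the real arguments are vectors and floats).
using Value = double;

struct Packed {
    std::vector<Value> fixed;  // the required arguments, passed through unchanged
    std::vector<Value> extra;  // everything after them, gathered into one "table"
    int nextra;                // item count, passed as the trailing int argument
};

// Split the incoming arguments into the required part and a packed remainder.
Packed pack_varargs(const std::vector<Value>& stack, std::size_t required) {
    if (stack.size() < required) {
        throw std::invalid_argument("too few arguments");
    }
    const auto split = stack.begin() + static_cast<std::ptrdiff_t>(required);
    Packed p;
    p.fixed.assign(stack.begin(), split);
    p.extra.assign(split, stack.end());
    p.nextra = static_cast<int>(p.extra.size());
    return p;
}

// Usage: 4 required arguments followed by 3 extras.
void demo_pack() {
    const std::vector<Value> stack = {1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0};
    const Packed p = pack_varargs(stack, 4);
    assert(p.fixed.size() == 4);
    assert(p.nextra == 3);
}
```

After packing, the callee always sees a fixed number of arguments (the required ones plus a table and a count), which is what lets a fixed-arity call mechanism cope with a variadic Lua function.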

So right now this is all working wonderfully well. I'm not quite finished refactoring all the functions, but it's only a short hop away. But there is one fatal flaw in all this which I'm really struggling with right now. The problem is that all calls made by OOLua via its method/function macros don't have ready access to the lua_State representing the interpreter, which means if at any point OOLua can't do something for you (which is often) and you need to drop back down to the standard Lua API, you're stuck.

In LmrModel this is not a problem as the state is held in a global. In LuaUtilFuncs however there's no such global and indeed, you wouldn't want one, as these functions are used by several different interpreter contexts throughout the codebase.

It's actually tricky to solve this one. Obviously when the registered function is called, it's called with the Lua context in the args, so we do know it. But we lose it as soon as we ask OOLua to call our function with its fancy typechecks and stuff. We can store it in a global just for the duration of that call, but then we aren't re-entrant, which could be a real problem down the track. I don't want to do that.
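For illustration only, the "global for the duration of the call" option could look something like the sketch below. The State struct, g_current_state and ScopedState are hypothetical names standing in for lua_State and whatever the thunk would use. The guard saves and restores the previous value, so nested calls on a single thread unwind correctly, but the state is still hidden and shared, and it does nothing for threads or anything else that interleaves calls.

```cpp
#include <cassert>

// Hypothetical stand-in for lua_State, so the sketch compiles without Lua.
struct State { int id; };

// The global a handler would read when it needs the raw Lua API.
static State* g_current_state = nullptr;

// RAII guard: installs a state on entry and puts the previous one back on
// exit, so a nested call from another interpreter can't leave a stale pointer.
class ScopedState {
public:
    explicit ScopedState(State* l) : prev_(g_current_state) { g_current_state = l; }
    ~ScopedState() { g_current_state = prev_; }
    ScopedState(const ScopedState&) = delete;
    ScopedState& operator=(const ScopedState&) = delete;
private:
    State* prev_;
};

// Usage: an outer call that triggers a nested call with a different state.
void demo_scoped_state() {
    State a{1};
    State b{2};
    ScopedState outer(&a);
    {
        ScopedState inner(&b);
        assert(g_current_state == &b);
    }
    assert(g_current_state == &a);  // restored after the nested call
}
```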

The only idea I have at this point is to push a pointer to the context onto the Lua stack so that OOLua can unpack it and pass it, but this seems rather heavy. It's not just a pointer either; due to the way OOLua does its type handling I have to push a full object instance. That's a slight lie; a few primitive types like int and float don't need an object, but I don't want to do crazy stuff like casting pointers to integers to make this work.

I will try that option, but I'm keen to find something else. A thought that occurred to me is that perhaps this is all wrong; perhaps it should always be the case that these functions are actually called as object methods. It makes a certain amount of sense. The model definitions could be built up in an object rather than in globals, which paves the way for object loading to be done in parallel in the future. The difficulty with this however is that the pseudo multiple-dispatch that I've implemented for functions is not available for method calls, so I'd be undoing a good amount of work.

I think at some point I'm going to need to take all this over to the OOLua author and discuss getting all this implemented properly. It's a fantastic system and I don't want to have to move away from it, but it's starting to run out of steam. The author has been very responsive and helpful so I expect there's lots that can be done, which is hopeful.

That'll do for now.