Monday, 3 October 2016

One Million Bytes

A slight aside today while I play with futuristic skyscraper shapes.  I'm not going to talk about cities, instead this post is about an interesting technical aspect of the synthesis engine; the use of the call stack.

The Stack

Typically the stack is something that most programs and programmers don't need to worry about, ever.  Programs tend to rarely have a call graph that goes even remotely near the stack depth limit.  Of course we have all hit stack-overflow situations, but these are usually due to either:
  1. A function accidentally calling itself (either directly or indirectly).
  2. A recursive algorithm not terminating properly.
Both of these are just bugs you would hit during development, fix, and move on.  Sometimes though you may be explicitly using recursion to solve a problem and need to be wary of how deep the recursion goes.  Usually in this case you would just tweak the code to stop it running out of stack space, or perhaps your data-set is of limited complexity and it just works, so you can ignore it and again just move on.  With Apparance however we have to be a bit more careful.

Tree Evaluation

The synthesis process in Apparance is implemented as an operator-tree evaluation.
Simple tree evaluation example
This means that starting at some root procedure we evaluate its output, triggering this sequence of events:
  1. The graph inside the procedure is instantiated and connected up.
  2. We find what the procedure output is wired to, and evaluate its output (recurse to step 1).
  3. At some point we reach a fundamental operator instead of a procedure.  Evaluating its output results in some code being executed to perform its intended function.  In order for it to be able to do this though it needs input data, which it requests (another function call) as required until it can return with the calculated output result.
  4. We find out what the operator input is wired up to (this happened in step 1 for its containing procedure).
    1. Usually this will be another output and we will evaluate that (step 2).
    2. If our input is actually wired to a procedure input, then in a similar way to 4 we have to see what that is wired up to and recursively follow it until we either reach another operator output (step 2 again), or we reach a constant value.
  5. On reaching a constant value we now have data to pass back up the call graph and we can return.  There are a couple of ways constant values can appear in a procedure:
    1. Embedded in a procedure itself as a constant input.
    2. As an input parameter supplied to the root procedure.
    3. Output-only operators that obtain the value from outside systems, e.g. mathematical constants like Pi, or useful synthesis state like the current block size.
The upshot of all this is that as we dig deeper into our procedure graph, and at every step towards some constant data value, we are making deeper and deeper function calls.

Data Driven Code

Normally the execution call graph is largely hard-coded into a program; conditional branching changes the exact shape of this graph, and recursion can cause chains of similar shaped sub-branches to push it deeper, but it is still largely predetermined in behaviour.  During development, the stack usage may increase, but this maximum will largely not change during the execution of the program.  As a consequence, for most programs it is impossible to blow the stack in every-day use.
For a data-driven situation however, you can no-longer make this assertion; we don't know how deep our procedures will end up going when we are building them in the editor.  It is very simple to accidentally create a procedure that recurses for-ever and instantly overflows the stack.
Simplest procedure that overflows the program stack.
This is a problem since these crashes happen at edit time, when the application is in the hands of the end user, and not during its development where a debugger and developer are at hand.

A Walk In The Dark

In some senses, every program is a ticking time-bomb.  Are you really aware of how close your program runs to the stack limit?  Do you just ignore the problem since you've never hit it?  Do you even know how much stack space is available to your program?  It's a bit like taking a night-time stroll on the cliff-tops, everything is fine, until it suddenly really isn't.
It's a bit of an odd situation because languages don't really acknowledge the existence of a stack limit and it seems a little talked about point.  There don't seem to be any ways to query the stack, or work out the current depth, using standard library calls.  This may be because there are ways to hand code this and it's such a rare use case that it's just not worth the effort supporting it, or alternatively it is acknowledging what should be entirely an implementation detail that the programmer should make no assumptions about.  Anyway, I need to know the limit and I need to monitor the depth as procedure evaluation is performed and so in the absence of any library support for this it's down to a hand-rolled approach. 

Get A Tape Measure

First, how deep are we?  A few searches online reveal what I suspected; that the simplest way to measure how deep you are in function calls is to take the address of a local variable (or function parameter) at the start of the program and compare it with the address of a local variable in your current function.  I created a macro to do this:
 #define CURRENT_STACK_LOCATION(a) ((uint64_t)(void*)&(a))
Used like this:
 int a;
 uint64_t stack_pos = CURRENT_STACK_LOCATION(a);
During synthesis this is evaluated each time we dig deeper, and compared to the initial stack position.  If we go too deep, we flag the error and abort the synthesis, unwinding the stack with returns triggered by an error check.

How High Can You Get?

The other important question is "how big is the stack?".  There doesn't appear to be a way to measure this in code and some investigation reveals that this is something you can set in your EXE project settings (for C++ project), or via some EXE fiddling (with EditBin.exe) for C#.  In both cases though, the default seems to be one megabyte.
Assuming this 1 MB is actually 1,048,576 bytes (2^20) I decided a 1,000,000 byte threshold would leave a nice safety margin after triggering an abort of the synthesis run.  This seems to work fine, and you can see the results of this with the simple test case (proposed earlier).
Results of simple stack overflow test procedure running.
We managed to get to an evaluation depth of 15,624 and it was a final stack frame of (at least) 96 bytes that pushed us over the million byte limit.  This is plenty of space for very complex procedure evaluation, but with the safety net of knowing you aren't going to crash.

Engineering

It's quite likely that I have missed some useful API call or technique that could have saved me some time or prevented various assumptions here, but I guess this is an engineer's approach to the problem; measure, try stuff, get something that works, and test it.  It seems to work, and until I find a platform/machine/OS/scenario where it doesn't (or a suggestion to improve it)  I will consider it solved for now.  If you have any suggestions, or concerns, please chip in below.

Thanks

A quick thank-you to those that came up for a chat and see me demo some of Apparance at last weeks Tuesday Night Gamedev Pub gathering in Guildford.  There were many good questions and interesting conversation, and I could see that the whole project was making people think, hard (either that or the beer was making it hard to think).  Your questions and chatter prompted the writing of this post.  I hope to demo again next time for those that missed it.

No comments:

Post a Comment