EQSDebugger saves the day! Kinda.

I wanted to share a short story of fixing a bug in EQS.

A colleague from the Paragon dev team IMed me asking what the EnvQueryManager is, and sent me this image:

The number marked in red is the single call cost in milliseconds. Yeah, it’s pretty high. It’s enormous, in fact! With EQS supporting time slicing I knew this had to be a bug, so I started digging. Looking at the code, my hunch was that the unexpected cost was related to the new-ish EQS profiling functionality, since the time it takes is not included in the loop’s time budget. So I added stats to the EQS debugger functions to confirm my hunch. Here’s what I got:

The last column is the call count within a frame. I’m pretty sure we’re not supposed to try to log over 600k queries a tick ;)

The core issue turned out to be that we were not accounting for EQS queries that had been aborted before the main EQS loop was triggered. Because of this, if we finished all the active queries while skipping the aborted ones, the loop “thought” there was still some more work to do, up until it ran out of time budget. It was only thanks to a minor bug in the EQS debugger’s usage (the fact that it can end up logging the same query multiple times after the query has finished), and the huge performance drop it caused, that the deeper, core EQS bug was found. So, in essence, the EQS Debugger uncovered and helped nail down the bug. Well done, EQSDebugger!
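For the curious, here’s roughly what the broken loop boiled down to, as a deliberately simplified Python sketch (made-up names, not the actual UE4 code):

```python
import time

class Query:
    """A stand-in for an EQS query: takes a few steps to finish unless aborted."""
    def __init__(self, steps, aborted=False):
        self.steps = steps
        self.aborted = aborted

    def step(self):
        self.steps -= 1
        return self.steps <= 0  # True once the query is finished

def run_queries(queries, time_budget=0.005):
    finished = 0
    start = time.perf_counter()
    # BUG: aborted queries are skipped but never counted as done, so once all
    # the active queries finish, `finished < len(queries)` stays True and the
    # loop keeps spinning (and logging!) until the time budget runs out.
    while finished < len(queries) and time.perf_counter() - start < time_budget:
        for query in queries:
            if query.aborted or query.steps <= 0:
                continue
            if query.step():
                finished += 1
    return finished
```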

Fixing EQS with math

(Warning, there’s some math involved later in the post! You’ve been warned ;) )

I’ve recently found a little inconsistency in the way EQS normalizes scores (if you have no idea what EQS is, see here; in short, it’s a spatial reasoning system in UE4). It’s not a bug per se: it doesn’t result in eliminating wrong items, nor does it influence items’ relative scores. But I did notice it, because it may result in not-totally-worthless items getting a score of 0. Let me give you a practical example.

I was looking into changing the way bots in Paragon pick their targets (which is done via EQS). I wanted to try some additional, non-EQS processing on enemies that scored more than 0.9 (90% of the best score). However, while looking through a vislog recording of a bot game, I noticed this:

Having an enemy score a total of 0 points caught my attention. So I started digging. Here are more query details from the log:

The Score column shows the total score of an item (an enemy actor). Each Test column first shows the normalized score, followed by the raw result of the given test. At first glance everything’s all right. The negative normalized scores come from some tests having a negative result multiplier to indicate that smaller values, closer to zero, are to be preferred. Something doesn’t feel right though – why is Muriel’s score 0 when she had the second best result in the last test?

Here’s how we come up with the total score of an item:

  1. calculate the sum of all normalized test results for a given item
  2. offset it by the lowest score (or 0, whichever is smaller)
  3. divide it by the score span

It’s all well and good until you have negative scores. If the lowest total score is \lt 0, then after offsetting and normalizing, the lowest scoring item will always get 0, regardless of how well it did on some of the tests. It can even lead to very misleading results! Imagine having three items with respective total scores of -0.2, -0.1 and 0 – the presented score normalization algorithm will result in final scores of 0, 0.5 and 1! Not at all representative of the relatively similar performance of the items being scored.
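Here’s that scheme in a few lines of Python (a throwaway sketch, not the engine code), reproducing the misleading result:

```python
def normalize_scores(totals):
    """Steps 2 and 3 from the list above: offset by min(lowest, 0), divide by the span."""
    offset = min(min(totals), 0.0)
    span = (max(totals) - offset) or 1.0  # guard against all-equal scores
    return [(total - offset) / span for total in totals]

# Three items with relatively similar total scores...
print(normalize_scores([-0.2, -0.1, 0.0]))  # -> [0.0, 0.5, 1.0]
```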

Not to mention the whole scoring falls apart if the best total score is also negative… Wow, looks like I’ve found an interesting bug :D

Fixing the issue is a pretty straightforward affair. While normalizing test results, if a user-configured weight is negative, then instead of using w \cdot v (where w \lt 0 is the weight and v \in [0,1] is the normalized test value of an item) we just use |w| \cdot (1 - v). This results in the expected reversing of scores (lower is better) and has the added benefit of making the result positive.
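Or, in quick sketch form (again, illustrative Python rather than the actual engine code):

```python
def t_old(w, v):
    return w * v

def t_fixed(w, v):
    # For a negative weight, score the "reversed" value instead: lower raw
    # values still win, but the result now lands in [0, abs(w)]
    # instead of [-abs(w), 0].
    return w * v if w >= 0 else abs(w) * (1.0 - v)

for v in (0.0, 0.5, 1.0):
    print(v, t_old(-1.0, v), t_fixed(-1.0, v))
# 0.0 -0.0 1.0
# 0.5 -0.5 0.5
# 1.0 -1.0 0.0
```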

“Why this post then? Just fix it, guy!” I hear you say. Well, normally I would. But the problem is that EQS is widely used both in Epic’s internal projects (most notably Paragon and Fortnite) and in a number of external projects (I can tell by the number of questions people ask ;) ). Messing up EQS scoring would make many developers unhappy/miserable/angry (pick at least one). On the other hand, I’m pretty sure the solution is good, but how do I test it? I can come up with a number of test cases, but that’s not enough – there’s always something that test cases won’t cover. If only there was a way to prove something beyond any finite test set… MATH!

Now, it has been many (many, many) years since I last proved anything mathematically, so what I’m about to present might be rough around the edges, but I’m pretty sure there are no holes in the reasoning (and if there are, please let me know!).

All right, so what do I need to prove? Obviously the suggested change will affect the final scores, that’s the whole point. But what is the property we don’t want to change? We want to make sure the final order of scored items is preserved. Or in other words (and here’s where the math starts):

\forall{x_{n},x_{m}\in X} \quad s(x_{n}) \leq s(x_{m}) \Longrightarrow s'(x_{n}) \leq s'(x_{m})

where X is the set of all items being scored, s(x) is the “old” scoring function that for k tests looks like this:

\begin{matrix} s(x) = \frac{1}{\beta} \sum_{i=0}^{k}{t_{i}(x)} \\ t_{i}(x_{n}) = w_{i}\cdot v_{i}(x_{n}) \end{matrix}

where v_{i}(x) is the i-th test’s normalized score of item x and \beta is the normalization term, the total score span. The “only” difference between s(x) and s'(x) is the t function. We define the new scoring function as follows:

t_{i}'(x_{n}) = \left\{\begin{matrix} w_{i}\cdot v_{i}(x_{n}) & w_{i} \geq 0 \\-w_{i}\cdot (1-v_{i}(x_{n})) & w_{i} \lt 0 \end{matrix}\right.  
We see that t and t' differ only when w_{i} \lt 0, and since

\frac{1}{\beta} \sum_{i=0}^{k}{t_{i}(x)} = \frac{1}{\beta}\left(\sum_{j\in K_{-}}{t_{j}(x)} + \sum_{l\in K_{+}}{t_{l}(x)}\right )

where K_{-} and K_{+} represent the sets of negatively and positively weighted tests, we can focus on the negatively weighted tests from this point on.

We’re almost there! All that’s left to show is that if:

\begin{matrix} s(x_{n}) \leq s(x_{m})\\ \frac{1}{\beta} \sum_{i=0}^{k}{t_{i}(x_{m})} - \frac{1}{\beta} \sum_{i=0}^{k}{t_{i}(x_{n})} \geq 0\\ \sum_{i=0}^{k}{t_{i}(x_{m}) - t_{i}(x_{n})} \geq 0 \\ \sum_{i=0}^{k}{w_{i}\cdot (v_i(x_{m}) - v_i(x_{n}))} \geq 0 \end{matrix}

then

\sum_{i=0}^{k}{t_{i}'(x_{m})-t_{i}'(x_{n})} \geq 0  

and since we’re considering w_{i} \lt 0, we can use the definition of t_{i}' to expand the inequality above into

\begin{matrix} \sum_{i=0}^{k}{-w_{i}\cdot (1-v_{i}(x_{m})) + w_{i}\cdot (1-v_{i}(x_{n}))} \geq 0 \\ \sum_{i=0}^{k}{-w_{i} + w_{i}\cdot v_{i}(x_{m}) + w_{i}-w_{i}\cdot v_{i}(x_{n})} \geq 0 \\ \sum_{i=0}^{k}{w_{i}\cdot (v_i(x_{m})-v_i(x_{n}))} \geq 0 \qquad q.e.d. \end{matrix}

<mic drop> ;)
If you find anything wrong with the proof please let me know!
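And since belt and suspenders never hurt, here’s a quick brute-force sanity check of the same order-preservation property (a throwaway Python script hammering both scoring functions with random data):

```python
import random

def t_old(w, v):
    return w * v

def t_new(w, v):
    return w * v if w >= 0 else -w * (1.0 - v)

def total(weights, values, t):
    return sum(t(w, v) for w, v in zip(weights, values))

# The relative order of two random items must be the same under both t's.
for _ in range(100_000):
    weights = [random.uniform(-2.0, 2.0) for _ in range(4)]
    x_n = [random.random() for _ in range(4)]
    x_m = [random.random() for _ in range(4)]
    old_order = total(weights, x_n, t_old) <= total(weights, x_m, t_old)
    new_order = total(weights, x_n, t_new) <= total(weights, x_m, t_new)
    assert old_order == new_order
print("order preserved in all sampled cases")
```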

HTN planner prototype

I’ve been working on Paragon bots for some time now, and up until now it was mostly about making things work, having bots play the game at all. Now, when the dust has settled and the bots do indeed play the game, it’s time to start thinking about bots that can use their abilities in an advanced manner, chaining their own abilities and coordinating ability triggering with their teammates. This calls for some kind of planning, and I’ve wanted to implement a generic Hierarchical Task Network (HTN) planner for UE4 for a long time now. Well, here’s my use case :D

Since I’ve never implemented an HTN before, I figured I’d put together a quick prototype to get that unique, intimate understanding you can only get by doing the thing yourself. This post is a kind of record of the process.

To refresh my memory on HTNs I’ve reached for an article I remembered had a really clear explanation of the concept (namely this one).

An HTN planner implementation consists of two parts: the planning domain and the planner itself. There’s also the “world state”, but I’m considering it a part of the domain since the domain used determines the validity of the world state representation.

Following the naming convention used in the article we have:

In short, an HTN Domain is built of Primitive and Compound Tasks. Compound Tasks are built of Methods. Methods are built of Primitive and Compound Tasks.
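The gist of those declarations looks something like this (simplified for the post; see the GitHub link at the end for the full version):

```python
class PrimitiveTask:
    def __init__(self, name, preconditions=(), effects=()):
        self.name = name
        self.preconditions = list(preconditions)  # callables: world_state -> bool
        self.effects = list(effects)              # callables mutating world_state

class Method:
    def __init__(self, preconditions=(), subtasks=()):
        self.preconditions = list(preconditions)
        self.subtasks = list(subtasks)  # a mix of Primitive and Compound Tasks

class CompoundTask:
    def __init__(self, name, methods=()):
        self.name = name
        self.methods = list(methods)  # candidate decompositions, tried in order
```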

The world state is a simple “facts” lookup with additional info on how to modify and check the stored facts:
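Something along these lines:

```python
class WorldState:
    def __init__(self, **facts):
        self.facts = dict(facts)

    def check(self, fact, value):
        return self.facts.get(fact) == value

    def apply(self, fact, value):
        self.facts[fact] = value

    def copy(self):
        # The planner works on a copy, so the real world state stays
        # untouched while a plan is being searched for.
        clone = WorldState()
        clone.facts = dict(self.facts)
        return clone
```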

As mentioned, the second part of an HTN planner implementation is the actual planner itself.

Up until now, I’ve listed only the properties of the declared classes. With these we can (almost) build a planning domain. I’ve just taken advantage of some Python syntactic sugar to make the planning domain look nicer once defined:
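Think tiny helpers in this spirit (a sketch of the flavor rather than the literal code), so that preconditions and effects read almost like declarations:

```python
def has(fact, value):
    """Precondition: the world state holds the given fact with the given value."""
    return lambda world_state: world_state.check(fact, value)

def set_fact(fact, value):
    """Effect: record the given fact in the world state."""
    return lambda world_state: world_state.apply(fact, value)
```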

With these we can build a test domain:
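A trimmed-down stand-in for such a test domain, built with the pieces above, could look like this:

```python
attack = PrimitiveTask('Attack',
                       preconditions=[has('at_enemy', True)],
                       effects=[set_fact('enemy_alive', False)])

move_to_enemy = PrimitiveTask('MoveToEnemy',
                              preconditions=[has('can_move', True)],
                              effects=[set_fact('at_enemy', True)])

engage = CompoundTask('Engage', methods=[
    # Already in range? Just attack.
    Method(preconditions=[has('at_enemy', True)], subtasks=[attack]),
    # Otherwise close the distance first.
    Method(preconditions=[has('can_move', True)],
           subtasks=[move_to_enemy, attack]),
])
```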

Of course, this wouldn’t compile. There’s a bunch of functions used here I haven’t defined (or even declared!) yet. This is partly how I work: I lay out the high-to-mid level code, and then just fill in the blanks. Let’s fill in the blanks now. First, the part I always, always, always build along with any system I implement, and so should you: the debugging/validation code:
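For a domain like the one above, the validation can be as simple as walking the structure and failing fast on anything obviously broken:

```python
def validate_domain(task, visited=None):
    """Walk the domain, asserting on obviously broken structure."""
    visited = visited if visited is not None else set()
    if isinstance(task, PrimitiveTask) or id(task) in visited:
        return
    visited.add(id(task))
    assert task.methods, "compound task '%s' has no methods" % task.name
    for method in task.methods:
        assert method.subtasks, "a method of '%s' has no subtasks" % task.name
        for subtask in method.subtasks:
            validate_domain(subtask, visited)

validate_domain(engage)
```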

And last but not least (by far) the code that does the actual work. Here comes the functional part of the classes declared above:
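The heart of it is the decomposition loop described in the article: a primitive task with satisfied preconditions goes straight into the plan, with its effects applied to a working copy of the world state, while a compound task gets replaced by the subtasks of its first applicable method. A simplified sketch of that loop (it skips the decomposition-history rollback the article also covers):

```python
def find_plan(world_state, root_task):
    working_state = world_state.copy()
    plan = []
    tasks = [root_task]
    while tasks:
        task = tasks.pop(0)
        if isinstance(task, PrimitiveTask):
            if not all(cond(working_state) for cond in task.preconditions):
                return None  # a full implementation would backtrack here
            for effect in task.effects:
                effect(working_state)
            plan.append(task)
        else:
            # Compound task: decompose using the first valid method.
            for method in task.methods:
                if all(cond(working_state) for cond in method.preconditions):
                    tasks = list(method.subtasks) + tasks
                    break
            else:
                return None  # no applicable method, and no backtracking
    return plan
```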

The code doesn’t require much explanation, provided you’ve read the article mentioned earlier. There’s not much I can add other than the actual implementation (which I do).

The whole code in proper order can be found on GitHub. And we can test it like so:
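Using the toy domain from earlier, a minimal version of such a test boils down to:

```python
state = WorldState(enemy_alive=True, at_enemy=False, can_move=True)
plan = find_plan(state, engage)
print([task.name for task in plan])  # -> ['MoveToEnemy', 'Attack']
```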

which generates the following output:

Of course, this is the most basic implementation of an HTN planner and it doesn’t handle the more exotic cases. But it’s enough for me to start a proper C++ implementation in UE4 sources.

Sharing toys

It’s kinda funny to think I’ve gone live with my blog and then not posted anything after the very first post. I’m working on an interesting one, I promise, but in the meantime I wanted to share a tiny toy I made while investigating new tech. The new tech is Processing.js, and I’m looking into it since I’ve used regular Processing a bit in the past; it’s an easy way to implement quick-and-dirty prototypes with simple 2D graphics and have them work in the browser without any changes to the code! I now suspect there are better alternatives, but this one’s good enough :D

The following toy is a dead simple avoidance implementation, written from scratch in Processing, and done solely for the purposes of refreshing my Processing skills. Had fun implementing it nonetheless.

Here’s a link to the sources of this toy on GitHub. Feel free to take a look, even though there’s a lot more educational value in trying to implement something like this from scratch than in reading someone else’s code (especially mine! ;) ).