I wanted to share a short story of fixing a bug in EQS.
A colleague for the Paragon dev team IMed me asking what the EnvQueryManager is and sent me this image:
The number marked in red is the single call cost in milliseconds. Yeah, it’s pretty high. It’s enormous in fact! With EQS supporting time slicing I knew this had to be a bug so I’ve started digging. By looking at the code my hunch was that the unexpected cost was related to the new-ish EQS profiling functionality since time it takes is not included in the loop’s time budget. So I’ve added stats to the EQS debugger functions to confirm my hunch. Here’s what I gotten:
The last column is the call count within a frame.. I’m pretty sure we’re not suppose to try to log over 600k queries a tick ;)
The core issue turned out to be us not accounting for EQS queries that have been aborted before the main EQS loop had been triggered. This way if we finished all active queries, and skipping the aborted ones, the loop “thought” there was still some more work to do, up until it run out of time budget. Only thanks to a minor bug in EQS debugger’s usage (the fact that it can end up logging the same query multiple times after the query has finished) and it causing such a huge performance drop, the core, deeper EQS bug has been found. So, in essence, EQS Debugger uncovered and helped nail down the bug. Well done, EQSDebugger!