7. Evaluation
In this chapter, we’ll go through our real-robot demonstrations to evaluate our hypotheses, which we restate here:
- Robot-local execution with synchronized UI state, concurrent action layering, reactive tree logic, and behavior-time semantic perception yield door behaviors that are faster and more reliable than prior IHMC baselines and competitive with reported reinforcement learning systems on overlapping door tasks.
- Runtime-editable behaviors and perception modules reduce the iteration loop required to diagnose failures, modify logic, and re-test on the robot, relative to redeploy, restart, or retrain workflows.
- Decomposing behaviors into reusable primitives, subtrees, and scene actions allows new door and loco-manipulation variants to be brought up by editing a small part of a working behavior rather than rebuilding it from scratch.
In Section 7.1, we’ll present evidence for our first hypothesis in the form of speed and reliability results for real-robot tasks. In Section 7.2, we’ll support our second hypothesis by showing how quickly we can create and modify behaviors, and in Section 7.3, our third hypothesis by showing that we can readily adapt existing behaviors to new tasks. In Section 7.4, we’ll compare our results against those reported in the literature. Finally, in Section 7.5, we’ll evaluate to what degree our hypotheses held.