7.4. Comparative Analysis
In this section, we compare our work with the literature along five dimensions: execution speed, capability, reliability, resilience, and the speed of creating and adapting behaviors. We compare against a handful of classical, model-based references and two learned references.
7.4.1. Speed and Capability
Figure 7.29 compares the durations of robot door traversals in this thesis and in the literature. It includes every timed door traversal that we are aware of, for both wheeled and legged robots. Our door traversals are dramatically faster than the classical model-based results, all of which take more than one minute to traverse a door. Recent learned systems, however, are quite fast. Zhang et al. [49], with their ANYmal quadruped robot, demonstrate traversing a door in just 10 seconds. DoorMan [75] averages 15.4 seconds per door traversal. As highlighted by the dotted line in the figure, one of our door traversals executed in 14 seconds, just beating DoorMan’s average. This firmly places our system in the “competitive with learned systems” regime.
We think one reason Zhang et al. achieve such a high speed is their robot’s kinematics. Their quadruped carries a single, long arm mounted at the front, with a large yaw range of motion that appears to be roughly 270 degrees or more. These design elements give the robot a very large reachable workspace, which allows it to open doors from a distance and swing the door panel all the way open in one sweeping motion.
There are two important caveats with the learned systems, however. Zhang et al. used external motion capture and fiducial markers rather than robot vision. Xue et al. (DoorMan) used an off-board computer to run their vision-based neural inference. In contrast, our system uses only two color cameras on the robot for vision and computes all autonomous functionality on board.

7.4.2. Reliability and Resilience
Figure 7.30 compares our results against all door traversal reliability measurements in the literature that we are aware of. Our premier reliability tests were not full traversals but loco-manipulation behaviors. We did not measure repeated full door traversals because our walk-through component was not reliable in the time we had to test it. Testing it repeatedly could have damaged the robot beyond repair, which would have prevented some of our more important results. Although this is unfortunate for the full reliability comparison, we feel that the repeated approach and opening task still shows that our behavior architecture, perception, and manipulation control are capable of very high reliability. The unreliability of the door walk-through had more to do with the walking controller’s performance than with the work presented in this thesis.
Most of the literature falls in the same performance envelope as our repeated-run reliability tests. Although we reported no failures, this does not mean that our system is 100% reliable; it is not. A finite number of failure-free runs can only bound the failure rate, not show that it is zero. The same holds for the results in the literature. These repeated-run measurements should therefore be taken with a grain of salt, but they do indicate that these systems are not brittle “one-off” demonstrations.
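To make the point about failure-free runs concrete, the following is a minimal sketch of the standard one-sided exact (Clopper-Pearson) lower bound on the success probability when every trial succeeds. The function name and the run counts are illustrative assumptions; they are not the trial counts reported in this thesis or in the cited works.

```python
def zero_failure_lower_bound(n_trials: int, confidence: float = 0.95) -> float:
    """Lower confidence bound on the success probability after
    n_trials successes and zero failures.

    With zero failures, the one-sided exact binomial bound solves
    p ** n_trials = 1 - confidence for p.
    """
    if n_trials <= 0:
        raise ValueError("n_trials must be a positive integer")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return (1.0 - confidence) ** (1.0 / n_trials)


if __name__ == "__main__":
    # Hypothetical run counts, chosen only to show how the bound tightens.
    for n in (10, 20, 50):
        bound = zero_failure_lower_bound(n)
        print(f"{n} runs, 0 failures: success rate >= {bound:.3f} at 95% confidence")
```

Under these assumptions, even a perfect record over a few dozen runs is compatible with a failure rate of several percent, which is why the repeated-run results are best read as evidence against brittleness rather than as a measured reliability.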
Among the learned systems, DoorMan [75] stands out for reporting “83%” reliability without giving the number of attempts or a summary of how that number was calculated. For that reason, we put “?/?” in the diagram and leave out the percentage. We do not consider it a reliable basis for comparison with this thesis and the rest of the literature.

7.4.3. Creation and Adaptability
This thesis is, to our knowledge, unique in its focus on reporting behavior authoring duration. We consider both the effort required to create new behaviors from scratch and the effort required to modify existing ones to adapt them to new tasks. We did not find any documented results in the literature on the authoring durations of loco-manipulation behaviors on robots.
For this section, we compare against an estimate for what we consider our most relevant competition: DoorMan [75]. Figure 7.31 gives a step-by-step process for creating a door behavior from scratch and for adapting an existing door behavior to a different door type. For our system, we use an actual measurement of creating the push door behavior from scratch: 6 steps in 2 hours. For DoorMan, we use the steps presented in the paper and estimate the durations.
Two main things stand out in this comparison. First, our system is estimated to be 50 to 100 times faster at creating and adapting behaviors. Second, when an adapted behavior is needed, the process for our system is dramatically shortened, whereas DoorMan still requires the same multi-day pipeline.
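As a rough sanity check on the scale of that ratio, the sketch below multiplies the measured 2-hour creation time by the quoted 50 to 100 times range. This is back-of-the-envelope arithmetic only: it assumes the ratio is applied to the from-scratch creation case, and the per-step DoorMan estimates from Figure 7.31 are not reproduced here. The names are illustrative.

```python
BASELINE_HOURS = 2.0            # measured: push door behavior created from scratch
SPEEDUP_RANGE = (50.0, 100.0)   # estimated ratio quoted in the text


def implied_hours(baseline_hours: float, speedup: float) -> float:
    """Duration implied for the slower pipeline by a given speedup ratio."""
    return baseline_hours * speedup


if __name__ == "__main__":
    for speedup in SPEEDUP_RANGE:
        hours = implied_hours(BASELINE_HOURS, speedup)
        print(f"{speedup:.0f}x -> {hours:.0f} hours (about {hours / 24:.1f} days of wall-clock time)")
```

Under these assumptions, the implied duration is on the order of 100 to 200 hours, which is consistent with describing the learned pipeline as multi-day.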
This stark contrast arises because our system is decomposed into reusable parts that have a high degree of independence from each other. For example, the difference between a pull door and a sliding door might amount to adjusting a few actions at runtime. DoorMan’s architecture, by contrast, requires a simulation refit, a PPO policy tune-up, and an overnight retraining.
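The idea of adapting a behavior by swapping a few independent actions can be illustrated with a small sketch. This is purely illustrative and is not the implementation used in this thesis: the `Action` and `Behavior` classes, the action names, and the parameter values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Mapping, Tuple


@dataclass(frozen=True)
class Action:
    """One reusable step of a behavior (hypothetical structure)."""
    name: str
    params: Tuple[Tuple[str, float], ...] = ()


@dataclass(frozen=True)
class Behavior:
    """An ordered sequence of independent actions."""
    name: str
    actions: Tuple[Action, ...]

    def adapted(self, new_name: str, overrides: Mapping[str, Action]) -> "Behavior":
        """Return a new behavior with the named actions swapped out
        and every other action reused unchanged."""
        known = {action.name for action in self.actions}
        unknown = set(overrides) - known
        if unknown:
            raise KeyError(f"no such actions to override: {sorted(unknown)}")
        return Behavior(
            new_name,
            tuple(overrides.get(action.name, action) for action in self.actions),
        )


# Hypothetical pull door behavior built from four actions.
pull_door = Behavior(
    "pull_door",
    (
        Action("approach"),
        Action("grasp_handle"),
        Action("open_panel", (("swing_deg", 90.0),)),
        Action("walk_through"),
    ),
)

# Adapting to a sliding door replaces only the panel-opening action.
sliding_door = pull_door.adapted(
    "sliding_door",
    {"open_panel": Action("slide_panel", (("travel_m", 0.8),))},
)

reused = [action for action in sliding_door.actions if action in pull_door.actions]

if __name__ == "__main__":
    print(f"{len(reused)} of {len(sliding_door.actions)} actions reused unchanged")
```

In a structure like this, the cost of adaptation scales with the number of actions that differ, whereas an end-to-end learned policy has no comparable unit that can be replaced in isolation.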

References cited on this page
[49] M. Zhang, Y. Ma, T. Miki, and M. Hutter, “Learning to open and traverse doors with a legged manipulator,” arXiv preprint arXiv:2409.04882, 2024. Available: https://arxiv.org/abs/2409.04882
[75] H. Xue et al., “Opening the sim-to-real door for humanoid pixel-to-action policy transfer,” arXiv preprint arXiv:2512.01061, 2025.