The standard contract in human-in-the-loop robotics is that the human is a fallback. The robot runs autonomously until it hits failure or uncertainty, at which point it taps the person for help; the rest of the time, the person watches. That framing treats humans, in the authors' blunt phrasing, "primarily as tools for improving robot performance." A new arXiv paper posted June 16, 2026, "Beyond Failure Recovery: An Engagement-Aware Human-in-the-loop Framework for Robotic Systems," by Jiaying Fang, Joyce Yang, Zhanxin Wu, Bohan Yang and Tapomayukh Bhattacharjee, argues that for an important class of robots this contract is backwards, and proposes a control formulation that inverts it.

The setting that motivates the inversion is physical caregiving, and the example is pointed: a user with mobility limitations being fed by a robot. Under the failure-driven paradigm, such a user might go long stretches as a passive observer, because they lack the ability to intervene in the moment and the robot only solicits input when something breaks. The authors note that "a user with mobility limitations may feel less engaged when being continuously and passively fed by a robot." The problem is not that the robot fails too often; it is that it succeeds quietly and autonomously in a way that writes the person out of their own care. But the opposite extreme is no better — "overly frequent interaction can be tiring and increase the user's workload." The design target is a genuine trade-off between two competing harms: disengagement on one side, fatigue on the other.

"Rather than requesting input only when difficulties arise during task execution, the robot proactively considers the user's preferred level of engagement throughout the task, balancing autonomy and interaction while ensuring task success." (from the arXiv preprint)

The proposed solution is Engagement-aware MPC, or E-MPC, and the choice of model predictive control as the substrate is apt. MPC plans over a horizon subject to constraints, which is exactly the shape of this problem: engagement is the objective, user workload is the constraint, and the planner solves for an interaction schedule over the course of the task that satisfies both, rather than reacting to failures as they come.
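One way to write such a planning problem down is as a generic constrained-MPC program. This is a sketch under stated assumptions, not the paper's exact formulation, and every symbol is introduced here for illustration: $e_k$ is the predicted engagement state, $u_k$ the interaction chosen at step $k$ (including "no interaction"), $f$ the engagement dynamics, $e^\star$ the user's preferred engagement level, $c(u_k)$ the workload cost of an interaction, $W$ the workload budget, and $H$ the planning horizon.

```latex
\begin{aligned}
\min_{u_0,\dots,u_{H-1}} \quad & \sum_{k=1}^{H} \left(e_k - e^\star\right)^2 \\
\text{s.t.} \quad & e_{k+1} = f(e_k, u_k), \qquad k = 0,\dots,H-1, \\
& \sum_{k=0}^{H-1} c(u_k) \le W, \\
& \text{task-success constraints hold along the plan.}
\end{aligned}
```

In the usual receding-horizon fashion, only $u_0$ would be executed before the problem is re-solved from the newly observed state. The quadratic tracking cost is one plausible choice among many; the authors' actual objective may differ.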

Modeling engagement as a dynamical quantity

The component that makes the predictive framing work is a user interaction dynamics model that captures how user engagement evolves as a function of both the frequency and type of interaction. This is the conceptual core, and it is what separates E-MPC from a heuristic "check in every N seconds" rule. By treating engagement as a state that evolves with a dynamics model, the controller can predict the engagement trajectory under different interaction plans and choose one that holds engagement in a desirable band without overshooting into fatigue. That both frequency and type of interaction feed the model matters: a brief confirmation and a substantive choice are not interchangeable units of engagement, and a model that distinguishes them can plan a varied interaction diet rather than nagging the user with identical prompts. Modeling engagement as a controllable dynamical quantity, with its own model that the planner reasons over, is the kind of formalization that turns a soft human-factors goal into something an optimizer can actually pursue.
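To make the idea concrete, here is a minimal Python sketch of an engagement model that depends on both the type and the frequency of interaction, plus a brute-force planner that searches interaction schedules under a workload budget. Everything in it is a hypothetical stand-in: the interaction types, the numeric constants, the decay and habituation terms, and the function names `step`, `plan_cost`, `workload`, and `best_plan` are assumptions chosen for illustration, not the model or solver from the paper.

```python
from itertools import product

# All numbers below are illustrative assumptions, not values from the paper.
# Each hypothetical interaction type maps to (engagement boost, workload cost).
INTERACTIONS = {
    "none": (0.0, 0.0),
    "confirm": (0.15, 1.0),  # brief yes/no confirmation
    "choice": (0.35, 2.5),   # substantive choice, e.g. which bite comes next
}
DECAY = 0.8        # assumed: engagement fades while the user stays passive
HABITUATION = 0.5  # assumed: how slowly the effect of recent prompts wears off


def step(engagement, recent, action):
    """Advance the assumed dynamics by one step.

    `recent` is a decaying count of recent prompts, so frequent prompting
    yields diminishing engagement gains.
    """
    boost, _ = INTERACTIONS[action]
    gain = boost / (1.0 + recent)
    next_engagement = min(1.0, DECAY * engagement + gain)
    next_recent = HABITUATION * recent + (0.0 if action == "none" else 1.0)
    return next_engagement, next_recent


def plan_cost(engagement, recent, plan, target):
    """Sum of squared deviations from the preferred engagement level."""
    cost = 0.0
    for action in plan:
        engagement, recent = step(engagement, recent, action)
        cost += (engagement - target) ** 2
    return cost


def workload(plan):
    """Total workload the plan demands of the user."""
    return sum(INTERACTIONS[action][1] for action in plan)


def best_plan(engagement, recent, target, horizon, budget):
    """Brute-force search over all schedules that fit the workload budget."""
    feasible = (p for p in product(INTERACTIONS, repeat=horizon)
                if workload(p) <= budget)
    return min(feasible, key=lambda p: plan_cost(engagement, recent, p, target))
```

Calling `best_plan(0.5, 0.0, 0.6, 4, 3.0)` returns a four-step schedule whose total workload stays within the budget of 3.0. Exhaustive enumeration is only practical for short horizons and a handful of interaction types; it is used here because it keeps the objective-versus-constraint structure easy to see, not because it reflects how E-MPC is solved.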

The framing also quietly reframes what "task success" means. In the failure-recovery paradigm, success is the robot completing the task; the human is instrumental. E-MPC keeps task success as a requirement but adds engagement as an objective the controller actively manages throughout. For caregiving, that is the ethically correct ordering: the goal is not merely to feed the person efficiently, but to feed them in a way that keeps them an agent in their own care. A control formulation that encodes the user's preferred level of engagement as something to be honored throughout the task, not just consulted at breakdowns, is making a value judgment legible to the math.

What the evaluation actually tested

The validation has two layers, and the second is the one that counts. The authors first evaluate E-MPC in simulation with several ablations and baseline comparisons, demonstrating effectiveness across diverse user personas — the standard way to show a method generalizes beyond a single hand-tuned case and to isolate which parts of the design drive the result. But for a method whose entire premise is about human experience, simulation can only go so far, and the authors know it. They also conduct a real-world user study with participants with emulated mobility limitations on a robot-assisted bite-acquisition system, reporting that E-MPC "improves user experience while maintaining task success."

That real-world study is the load-bearing evidence, and its design deserves credit: bite acquisition is a real, high-stakes caregiving task where the disengagement problem is concrete rather than hypothetical, and emulating mobility limitations in the participants approximates the population the method is built for. The honest reading of the result is also the modest one: "improves user experience while maintaining task success" claims that engagement-aware interaction did not come at the cost of getting the food to the person, and that subjective experience improved. It is not a claim of dramatic performance gain, because performance was never the point; the point was to improve the human's experience without degrading the task, and that is what the study supports.

From the standpoint of where defensible robotics capability is forming, E-MPC is a marker of an underweighted frontier. The field has poured effort into making robots more autonomous and into failure recovery, and comparatively little into the question of how a robot should ration its demands on a human to keep that human appropriately involved — especially when the human cannot easily insert themselves. Formalizing engagement as a state with its own dynamics, then planning interaction against it under a workload budget, is a transferable pattern that reaches well past feeding: any assistive, rehabilitative, or collaborative robot working with a person who has limited ability to intervene faces the same disengagement-versus-fatigue trade-off. The open questions are whether the interaction-dynamics model generalizes across users without per-person calibration, and how robustly the workload constraint reflects real fatigue rather than a proxy for it. But as a deliberate inversion of the failure-driven default — and one tested on a real caregiving task with the right population — the work is a notable contribution. The full preprint, including the user study and ablations, is available on arXiv.