Research

RL/IL for Multi-Agent Collaborative Construction

A current research effort I am part of at the CMU Biorobotics Lab, under Professor Howie Choset, is Multi-Agent (MA) Collaborative Construction: given a target structure, orchestrate a team of robot agents to build it. The approach I am working on uses Reinforcement Learning (RL) to learn a decentralized policy that every agent executes independently, conditioned only on the current environment state, the goal state, and the states of the other agents. In principle, this allows an arbitrary number of agents to work together without retraining the policy model, with runtime that scales linearly in team size. Current techniques [1] could potentially be augmented with Imitation Learning (IL): an optimization-based expert is first imitated to obtain a suboptimal but effective policy, which is then improved with RL [2].
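To make the decentralized setup concrete, below is a minimal sketch in PyTorch. The dimensions, network shape, and expert data are hypothetical stand-ins (the real observation encoding in [1] is a structured local view of the construction site); this only illustrates the shared-policy pattern and the IL warm start, not the actual trained model.

```python
import torch
import torch.nn as nn

# Hypothetical dimensions for illustration only:
OBS_DIM = 32      # agent's local view: environment state, goal, nearby agents
NUM_ACTIONS = 7   # e.g., moves, pick up block, place block, wait

class SharedPolicy(nn.Module):
    """One policy network shared by all agents; each agent queries it with
    only its own observation, so team size is never baked into the model."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(OBS_DIM, 64), nn.ReLU(),
            nn.Linear(64, NUM_ACTIONS),
        )

    def forward(self, obs):
        return self.net(obs)  # action logits

policy = SharedPolicy()

# Imitation-learning warm start: behavior cloning on (state, action) pairs
# from an optimization-based expert, before switching to an RL objective.
expert_obs = torch.randn(256, OBS_DIM)              # stand-in expert states
expert_act = torch.randint(0, NUM_ACTIONS, (256,))  # stand-in expert actions
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)
loss = nn.functional.cross_entropy(policy(expert_obs), expert_act)
optimizer.zero_grad()
loss.backward()
optimizer.step()

# Decentralized execution: the same weights run per agent, so runtime scales
# linearly with team size and no retraining is needed when agents are added.
observations = torch.randn(5, OBS_DIM)  # five agents' local observations
with torch.no_grad():
    actions = policy(observations).argmax(dim=-1)  # one action per agent
```

Because the network never conditions on the number of agents, adding or removing agents at deployment time requires no change to the model.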

Modular Robot Design Automation

Another recent effort I have joined is Modular Robot Design Automation: learning a mapping from a task space to a robot design space such that the output design performs optimally for the specified task under given constraints. Because the task and design spaces are generally large and high-dimensional, several machine-learning-based techniques [3, 4] have been developed to learn an approximate mapping whose inference is fast enough to deploy a robot in the field in real time (e.g., search and rescue, unmanned exploration). I am currently extending RoboGAN [4] to improve the generator network's capabilities and convergence time beyond the scope of the original work.
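As a rough illustration of the generative approach (not the actual RoboGAN architecture, whose task encoding and design parameterization differ), here is a minimal conditional GAN pair in PyTorch; all dimensions and class names are hypothetical:

```python
import torch
import torch.nn as nn

# Hypothetical dimensions for illustration only:
TASK_DIM = 16    # task descriptor (e.g., terrain and payload features)
NOISE_DIM = 8    # latent noise, so one task can yield many candidate designs
DESIGN_DIM = 12  # design parameters (e.g., module choices, link lengths)

class DesignGenerator(nn.Module):
    """Conditional generator: (task descriptor, noise) -> candidate design."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(TASK_DIM + NOISE_DIM, 64), nn.ReLU(),
            nn.Linear(64, DESIGN_DIM),
        )

    def forward(self, task, noise):
        return self.net(torch.cat([task, noise], dim=-1))

class DesignCritic(nn.Module):
    """Discriminator: scores whether a (task, design) pair looks like one
    drawn from a dataset of well-matched task-design examples."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(TASK_DIM + DESIGN_DIM, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, task, design):
        return self.net(torch.cat([task, design], dim=-1))

gen, critic = DesignGenerator(), DesignCritic()
task = torch.randn(1, TASK_DIM)                # stand-in for a real task encoding
design = gen(task, torch.randn(1, NOISE_DIM))  # one fast forward pass at deploy time
score = critic(task, design)                   # adversarial training signal
```

The point of the learned mapping is the deployment cost: once trained, producing a candidate design is a single forward pass rather than a per-task optimization.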

Memory Efficient Multi-Objective A*

In continuation of my Multi-Objective A* work with Professor Howie Choset, I worked on improving the memory efficiency of Multi-Objective A* (MOA*) algorithms, a class of heuristic search algorithms that extend A* to multiple, often conflicting, objectives. I developed a novel memory-efficiency improvement to a state-of-the-art MOA* algorithm [6] that leverages the notion of partial expansion from single-objective A* research [5] to reduce the space requirements of runtime-efficient multi-objective planners. I recently submitted this work, Enhanced Multi-Objective A* with Partial Expansion, to the 2023 International Conference on Automated Planning and Scheduling (ICAPS) as first author [7]. Check out the preprint here.
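The core idea is easiest to see in its single-objective form, sketched below after [5] (the actual PE-EMOA* defers successors against Pareto frontiers rather than scalar f-values, so this toy grid example is an illustration, not our algorithm):

```python
import heapq

def pea_star(start, goal, neighbors, h, c=0.0):
    """Partial Expansion A* (single-objective sketch, after [5]): on expansion,
    only successors with f-value within c of the node's stored F-value are
    generated; the node is re-queued with the smallest deferred f-value
    instead of being closed, which keeps the OPEN list small."""
    g = {start: 0.0}
    open_heap = [(h(start), start)]  # entries: (stored F-value, node)
    closed = set()
    while open_heap:
        F, n = heapq.heappop(open_heap)
        if n in closed:
            continue
        if n == goal:
            return g[n]
        deferred = []  # f-values of successors we chose not to generate yet
        for m, cost in neighbors(n):
            if m in closed:
                continue  # consistent heuristic: closed nodes need no reopening
            f_m = g[n] + cost + h(m)
            if f_m <= F + c:
                if m not in g or g[n] + cost < g[m]:
                    g[m] = g[n] + cost
                    heapq.heappush(open_heap, (f_m, m))
            else:
                deferred.append(f_m)
        if deferred:
            heapq.heappush(open_heap, (min(deferred), n))  # re-queue the parent
        else:
            closed.add(n)  # all successors generated; safe to close
    return None

# Toy 4-connected grid (hypothetical):
goal_cell = (3, 3)
def grid_neighbors(p):
    x, y = p
    return [((x + dx, y + dy), 1.0) for dx, dy in [(1, 0), (-1, 0), (0, 1), (0, -1)]
            if 0 <= x + dx < 4 and 0 <= y + dy < 4]
def manhattan(p):
    return abs(p[0] - goal_cell[0]) + abs(p[1] - goal_cell[1])

print(pea_star((0, 0), goal_cell, grid_neighbors, manhattan))  # -> 6.0
```

Deferring high-f successors trades a few extra re-expansions of the parent for a substantially smaller OPEN list, which is where the memory savings come from.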

I am currently working to further reduce the memory consumption of this algorithm (which we call PE-EMOA*) by leveraging depth-first search (DFS) techniques, trading some degree of runtime performance for additional memory savings.
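As a single-objective illustration of that trade-off (again, not the multi-objective algorithm under development), iterative-deepening A* replaces the OPEN list with a bounded depth-first search whose memory is linear in the path length, at the cost of re-expanding nodes across rounds:

```python
def ida_star(start, goal, neighbors, h):
    """Iterative-Deepening A*: depth-first search bounded by an f-threshold,
    restarted with a larger threshold each round. Memory use is linear in
    the path length, but nodes are re-expanded across rounds: runtime is
    traded for memory, the same trade DFS-style techniques offer."""
    def dfs(node, g, bound, path):
        f = g + h(node)
        if f > bound:
            return f, None  # prune; this f seeds the next threshold
        if node == goal:
            return f, list(path)
        next_bound = float("inf")
        for m, cost in neighbors(node):
            if m in path:  # avoid cycles on the current DFS branch
                continue
            path.append(m)
            t, found = dfs(m, g + cost, bound, path)
            path.pop()
            if found is not None:
                return t, found
            next_bound = min(next_bound, t)
        return next_bound, None

    bound = h(start)
    while bound != float("inf"):
        bound, found = dfs(start, 0.0, bound, [start])
        if found is not None:
            return found  # optimal path as a list of nodes
    return None
```

On the toy grid from the previous sketch, ida_star((0, 0), goal_cell, grid_neighbors, manhattan) returns an optimal seven-node path while storing only the current DFS branch.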

Anytime Multi-Objective A*

The worst-case runtime of Multi-Objective A* algorithms grows exponentially, which often makes them prohibitive for real-time robotics. To address this, I developed an "anytime" Multi-Objective A* (MOA*) algorithm that loosens the bound on solution optimality to improve response time in practical robotics applications, synthesizing single-objective anytime techniques [8] with an MOA* algorithm [9]. The resulting algorithm quickly returns a "Pareto-suboptimal" solution set, then iteratively improves it (with a suboptimality bound at each iteration) until the optimal solution set is returned. Experimentally, its overall runtime was only marginally higher than that of traditional NAMOA*, thanks to "repair" techniques that reuse results from previous iterations. Check out the GitHub here.
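The anytime pattern is easiest to see in its single-objective form, sketched below after ARA* [8] (the multi-objective version maintains Pareto frontiers instead of scalar g-values; the epsilon schedule and grid here are hypothetical):

```python
import heapq

def anytime_weighted_astar(start, goal, neighbors, h,
                           eps_schedule=(3.0, 2.0, 1.5, 1.0)):
    """ARA*-style anytime loop (single-objective sketch, after [8]): each
    round runs weighted A* with a smaller heuristic inflation eps, reusing
    g-values from earlier rounds so later searches repair rather than
    restart. Each yielded cost is within a factor eps of optimal."""
    g = {start: 0.0}
    for eps in eps_schedule:
        open_heap = [(eps * h(start), start)]
        closed = set()
        while open_heap:
            _, n = heapq.heappop(open_heap)
            if n in closed:
                continue
            closed.add(n)
            if n == goal:
                yield eps, g[n]  # bounded-suboptimal solution for this round
                break
            for m, cost in neighbors(n):
                if m not in g or g[n] + cost < g[m]:
                    g[m] = g[n] + cost  # g-values persist across rounds
                if m not in closed:
                    heapq.heappush(open_heap, (g[m] + eps * h(m), m))

# Toy 4-connected grid (hypothetical):
goal_cell = (3, 3)
def grid_neighbors(p):
    x, y = p
    return [((x + dx, y + dy), 1.0) for dx, dy in [(1, 0), (-1, 0), (0, 1), (0, -1)]
            if 0 <= x + dx < 4 and 0 <= y + dy < 4]
def manhattan(p):
    return abs(p[0] - goal_cell[0]) + abs(p[1] - goal_cell[1])

for eps, cost in anytime_weighted_astar((0, 0), goal_cell, grid_neighbors, manhattan):
    print(f"eps={eps}: cost {cost}")  # solutions never worsen; final round is optimal
```

Because the g-values carry over between rounds, each tightening of eps mostly repairs the previous search rather than starting over, which is what keeps the total runtime close to a single exact search.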

  1. Sartoretti, G., Wu, Y., Paivine, W., Kumar, T. S., Koenig, S., & Choset, H. (2019). Distributed reinforcement learning for multi-robot decentralized collective construction. In Distributed Autonomous Robotic Systems: The 14th International Symposium (pp. 35-49). Springer International Publishing.
  2. Sartoretti, G., Kerr, J., Shi, Y., Wagner, G., Kumar, T. S., Koenig, S., & Choset, H. (2019). PRIMAL: Pathfinding via reinforcement and imitation multi-agent learning. IEEE Robotics and Automation Letters, 4(3), 2378-2385.
  3. Hu, J. (2022). Composition Learning in “Modular” Robot Systems (Doctoral dissertation, Carnegie Mellon University, Pittsburgh).
  4. Hu, J., Whitman, J., Travers, M., & Choset, H. Modular Robot Design Optimization with Generative Adversarial Networks.
  5. Yoshizumi, T., Miura, T., & Ishida, T. (2000, July). A* with partial expansion for large branching factor problems. In AAAI/IAAI (pp. 923-929).
  6. Ren, Z., Zhan, R., Rathinam, S., Likhachev, M., & Choset, H. (2022, July). Enhanced multi-objective A* using balanced binary search trees. In Proceedings of the International Symposium on Combinatorial Search (Vol. 15, No. 1, pp. 162-170).
  7. Kothare, V., Ren, Z., Rathinam, S., & Choset, H. (2022). Enhanced Multi-Objective A* with Partial Expansion. arXiv preprint arXiv:2212.03712.
  8. Likhachev, M., Gordon, G. J., & Thrun, S. (2003). ARA*: Anytime A* with provable bounds on sub-optimality. Advances in neural information processing systems, 16.
  9. Mandow, L., & De la Cruz, J. P. (2005, July). A new approach to multiobjective A* search. In IJCAI (Vol. 8).