All Too Human

From Best-of-n to Ranking n: A New Approach to Collective Decision-Making in Robot Swarms

Background

As a Postdoctoral Research Associate (PDRA) at the University of Bristol, I explored how multi-agent systems can make collective decisions under greater limitations than typically considered in traditional approaches to the "best-of-n" problem in robot swarms. In our paper, published in Swarm Intelligence (2021), we introduce a novel model for collective preference learning that allows robot swarms to learn not just which option is best, but to rank all available options.

Beyond binary choices

Traditional approaches to the best-of-n problem often focus on identifying only the single best option from a set of alternatives, discarding information about the remaining n1 options. While this works for simple scenarios, real-world applications like search and rescue operations require more nuanced decision-making.

Consider a swarm of robots searching for casualties in a disaster scenario. The traditional approach would only identify the highest priority casualty, requiring the entire decision-making process to be repeated for each remaining casualty. This is inefficient and potentially costly in time-critical situations.

Our three-valued logic approach

Instead of using binary logic (option A is better/worse than option B), we've developed a model using three-valued logic where robots can hold beliefs about preference relations in three states:

This enables agents to maintain and refine a complete preference ordering over all options simultaneously, rather than simply identifying which option is best.

Key findings

Our experiments revealed several important insights:

  1. Robust to Noise: The model performs well even in noisy environments where evidence might be contradictory or incorrect. When evidence is sparse and noise levels are high, preserving the transitivity of agents' beliefs improves accuracy further.

  2. Scalable Solution: Our approach successfully scales to environments with up to 20 options before accuracy is significantly affected in noisy environments.

boolean_adopt_robust_kilobot_trajectory

High-noise scenario

three_valued_robust_kilobot_trajectory

Low-noise scenario

  1. Communication Efficiency: We developed a bandwidth-limited version allowing agents to communicate only a subset of their preference relations, which still performs well and sometimes outperforms the standard model.

  2. Speed-Accuracy Trade-off: There's a clear trade-off between increased accuracy (by applying transitive closure operations) and computational cost, giving system designers flexibility depending on application requirements.

Why this matters

This research represents a significant step toward deploying swarm robotics in complex real-world scenarios where ranking multiple options is crucial. For example:

By learning complete preference orderings rather than just identifying the best option, swarm robotic systems can make more informed decisions and adapt more effectively to complex environments. This approach also avoids the complete loss of information once the task of highest priority is completed, as the next task is already decided.

Future research directions

We're continuing to investigate how this approach performs with physical constraints like limited communication range and observation capabilities. We're also exploring how to adapt the model for dynamic environments where preferences might change over time.


More detailed information about this research can be found in our paper: "Collective preference learning in the best-of-n problem: From best-of-n to ranking n" published in Swarm Intelligence (2021).

#project