Unlike conventional optimization algorithms that aim to find the minimum or maximum of a function, novelty search (NS) focuses on discovering a diverse set of outcomes without regard to objective performance [1]. Originally developed to address deceptive optimization problems, where traditional methods often get trapped in local optima [2], NS has also proven valuable in objective-free discovery tasks such as materials design and drug development, where the goal is to uncover candidates with unusual or out-of-distribution properties that may be promising in unanticipated ways [3].
In many of these applications, the underlying system is a black box, where the relationship between inputs and outputs is unknown and evaluations (which could be simulations or experiments) are expensive. Classical NS algorithms rely on population-based metaheuristics such as genetic algorithms, which are poorly suited for such high-cost settings due to their sample inefficiency. To address this, the recently proposed BEACON algorithm integrates NS with Bayesian optimization (BO) by using Gaussian process (GP) surrogates and an acquisition function that balances predicted novelty and uncertainty [4]. While BEACON has demonstrated strong performance in low- to moderate-dimensional settings, it suffers in high-dimensional problems due to the inherent limitations of global GP models.
In this work, we propose TR-BEACON [5], a trust region-based extension of BEACON tailored for high-dimensional novelty search. Rather than modeling the entire search space globally, TR-BEACON constructs local GP models within a dynamically managed trust region. At each iteration, the trust region is centered around the point that yields the most novel outcome – defined by the maximum distance from all previously observed outcomes in the behavior space. The trust region size is adaptively tuned: it contracts when the model fails to generate novel outcomes (to improve local fidelity) and expands when novelty is consistently discovered (to encourage broader exploration).
We benchmark TR-BEACON on a challenging 20-dimensional synthetic function and a 24-dimensional reinforcement learning (RL) task. Across both domains, TR-BEACON outperforms state-of-the-art NS methods (including the original BEACON algorithm) by achieving broader coverage of the outcome space and significantly higher rewards in the RL setting. These results demonstrate TR-BEACON’s ability to scale novelty search to high-dimensional, expensive black-box systems.
Reference
[1] Lehman, J., & Stanley, K. O. (2011). Abandoning objectives: Evolution through the search for novelty alone. Evolutionary computation, 19(2), 189-223.
[2] Forrester, A., & Jones, D. (2008, September). Global optimization of deceptive functions with sparse sampling. In 12th AIAA/ISSMO multidisciplinary analysis and optimization conference (p. 5996).
[3] Terayama, K., Sumita, M., Tamura, R., Payne, D. T., Chahal, M. K., Ishihara, S., & Tsuda, K. (2020). Pushing property limits in materials discovery via boundless objective-free exploration. Chemical science, 11(23), 5959-5968.
[4] Tang, W. T., Chakrabarty, A., & Paulson, J. A. (2024). Beacon: A bayesian optimization strategy for novelty search in expensive black-box systems. arXiv preprint arXiv:2406.03616.
[5] Tang, W. T., Chakrabarty, A., & Paulson, J. TR-BEACON: Shedding Light on Efficient Behavior Discovery in High-Dimensional Spaces with Bayesian Novelty Search over Trust Regions. In NeurIPS 2024 Workshop on Bayesian Decision-making and Uncertainty.