
UCB (Upper Confidence Bound)
Upper Confidence Bound (UCB) is a strategy for sequential decision-making that balances exploring less-tried options against exploiting those already known to perform well. Each option receives a score made up of its observed average reward plus a bonus reflecting the uncertainty of that estimate; the bonus is larger for options that have been tried only a few times and shrinks as more observations accumulate. At every step the option with the highest score is chosen, so an option that might be better but isn’t yet well understood still gets tried, while options that keep performing well continue to be selected. Over time the choices concentrate on the best option without neglecting the others too early, which makes UCB a standard method in machine learning settings such as multi-armed bandits and other adaptive algorithms.
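
As a rough illustration of the scoring idea, here is a minimal sketch assuming the common UCB1 variant, where the score is the average reward plus sqrt(c * ln(t) / n), with t the total number of trials so far and n the number of times that option was tried. The function names `ucb1_select` and `run_bandit`, the constant `c`, and the simulated 0/1 rewards are illustrative assumptions, not part of the definition above.

```python
import math
import random


def ucb1_select(counts, totals, c=2.0):
    """Return the index of the option with the highest UCB1 score.

    counts[i] is how often option i was tried, totals[i] is its summed reward.
    The constant c (assumed default 2.0) scales the uncertainty bonus.
    """
    # Try every option once first, so each average is defined.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm

    t = sum(counts)  # total number of trials so far
    scores = [
        totals[arm] / counts[arm]                     # average reward
        + math.sqrt(c * math.log(t) / counts[arm])    # uncertainty bonus
        for arm in range(len(counts))
    ]
    return max(range(len(counts)), key=scores.__getitem__)


def run_bandit(probs, steps=2000, seed=0):
    """Simulate hypothetical options that pay 1 with probability probs[i]."""
    rng = random.Random(seed)
    counts = [0] * len(probs)
    totals = [0.0] * len(probs)
    for _ in range(steps):
        arm = ucb1_select(counts, totals)
        reward = 1.0 if rng.random() < probs[arm] else 0.0
        counts[arm] += 1
        totals[arm] += reward
    return counts


if __name__ == "__main__":
    # Option 1 pays off more often, so it should end up with most of the trials.
    print(run_bandit([0.2, 0.8]))
```

Note how an option tried only once can outscore one with a higher average: with counts `[100, 1]` and totals `[60.0, 0.5]`, the second option's bonus dominates and it is selected next. Other UCB variants use different bonus terms; this sketch only shows the general shape.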