
Upper Confidence Bound (UCB)
Upper Confidence Bound (UCB) is a strategy for decision-making problems in which you must balance exploring new options against exploiting the ones already known to work well. Imagine you're testing several options to find the best one, but you don't yet know which it is. UCB assigns each option a score: its average success rate so far plus an "uncertainty" bonus that is larger for options that have been tried fewer times. At each step you pick the option with the highest score, so an option gets chosen either because it has performed well or because it hasn't been tested much. As an option accumulates trials, its bonus shrinks, so UCB gradually shifts from exploring uncertain options to focusing on the most promising ones as more data becomes available.
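
To make the "average plus uncertainty bonus" idea concrete, here is a minimal Python sketch. It assumes one common variant, usually called UCB1, where the bonus is sqrt(c * ln(t) / n): t is the total number of trials so far and n is the number of times that option has been tried. The function names (`ucb1_select`, `run_bandit`), the constant `c = 2.0`, and the simulated success probabilities are illustrative assumptions, not part of the description above.

```python
import math
import random


def ucb1_select(counts, sums, t, c=2.0):
    """Return the index of the option with the highest UCB1 score.

    counts[i]: number of times option i has been tried
    sums[i]:   total reward collected from option i
    t:         total number of trials so far (sum of counts)
    c:         exploration constant (assumed value; 2.0 is the usual UCB1 choice)
    """
    # Try every option once first, so each average is defined.
    for i, n in enumerate(counts):
        if n == 0:
            return i
    # Score = average reward + uncertainty bonus that shrinks as n grows.
    scores = [
        sums[i] / counts[i] + math.sqrt(c * math.log(t) / counts[i])
        for i in range(len(counts))
    ]
    return max(range(len(counts)), key=scores.__getitem__)


def run_bandit(success_probs, steps=5000, seed=0):
    """Simulate options that succeed with the given (hypothetical) probabilities."""
    rng = random.Random(seed)
    k = len(success_probs)
    counts = [0] * k
    sums = [0.0] * k
    for t in range(steps):
        option = ucb1_select(counts, sums, t)
        reward = 1.0 if rng.random() < success_probs[option] else 0.0
        counts[option] += 1
        sums[option] += reward
    return counts


if __name__ == "__main__":
    # Three made-up options; the third has the highest true success rate.
    print(run_bandit([0.2, 0.5, 0.8]))
```

Running the script prints how many times each option was chosen. With these made-up probabilities, most of the trials should end up on the third option, while the weaker two still receive some trials early on, when their uncertainty bonuses are large.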