Seminar: Machine Learning Seminar
Quantifying the Value of Information in Decision-Making
In modern sequential decision-making systems, decision-aiding information is everywhere: navigation relies heavily on traffic information, data published in the news affects stock-market decisions, and more. However, to make full use of such information, we need to identify the types of information acquired by decision makers, quantify their value, and design algorithms that provably utilize such information. In this talk, we show how to accomplish this in two different case studies.
In the first case, we consider scheduling problems, where agents choose the order in which to evaluate tasks. In many such instances, tasks are accompanied by additional information indicating the type of computation. In particular, we assume that each task falls into one of K possible types, and that its type determines its expected duration. When the statistics of task durations are known, we analyze the potential benefits of obtaining this type information. We then show that even when the task distributions are unknown, learning algorithms can be employed to achieve optimal performance, up to sublinear error.
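As a rough illustration of why type information can help in scheduling, the sketch below compares the expected cost of a schedule with and without knowledge of each task's type. It rests on assumptions that the abstract does not state: a single machine, non-preemptive execution, and the sum of task completion times as the cost. The function names and the example durations are hypothetical and are not taken from the talk.

```python
from typing import Sequence


def expected_total_completion_time(mean_durations: Sequence[float]) -> float:
    """Expected sum of completion times when tasks run in the given order.

    Assumed cost model (not from the talk): one machine, no preemption.
    The task in position i delays itself and the n - i - 1 tasks after it,
    so its mean duration is counted n - i times.
    """
    n = len(mean_durations)
    return sum((n - i) * d for i, d in enumerate(mean_durations))


def cost_with_type_info(mean_durations: Sequence[float]) -> float:
    """Types are observed: run tasks in order of increasing expected duration."""
    return expected_total_completion_time(sorted(mean_durations))


def cost_without_type_info(mean_durations: Sequence[float]) -> float:
    """Types are hidden: tasks look identical, so the order is uniformly random.

    Each task is equally likely to land in any position, and the position
    weights n, n-1, ..., 1 average to (n + 1) / 2.
    """
    n = len(mean_durations)
    return (n + 1) / 2 * sum(mean_durations)


# Hypothetical instance with K = 2 types: "short" (mean 1) and "long" (mean 10).
type_means = {"short": 1.0, "long": 10.0}
tasks = ["long", "short", "short", "long"]
means = [type_means[t] for t in tasks]

informed = cost_with_type_info(means)
uninformed = cost_without_type_info(means)
print(f"with type info: {informed}, without: {uninformed}, gain: {uninformed - informed}")
```

In this toy instance, the gap between the two costs is one simple way to express the value of the type information. The analysis in the talk may use a different objective and a different baseline.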
In the second case, we explore reinforcement learning with lookahead information – i.e., where partial information on the near future is given to the agent. This form of information is natural in many applications. For instance, in transactions, prices are known before an exchange; in navigation, agents observe traffic data; and in goal-oriented problems, the goals are revealed well in advance. For known environments, we give a full characterization of the possible gains due to future reward information, as well as bounds on the gain due to transition information. We further present efficient learning algorithms that observe either immediate reward or transition information before acting, competing against a much stronger baseline that also utilizes similar information.
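The gain from reward lookahead can be seen in a deliberately simplified, single-step setting, sketched below. This is not the setting of the talk, which concerns sequential problems. It assumes a single state, independent rewards across actions, and discrete reward distributions given as hypothetical lists of (reward, probability) pairs. An agent without lookahead can at best pick the action with the highest mean reward, whereas an agent that observes the realized rewards before acting collects the expected maximum.

```python
from itertools import product
from typing import List, Tuple

# Hypothetical representation: one list of (reward, probability) pairs per action.
RewardDist = List[Tuple[float, float]]


def value_without_lookahead(reward_dists: List[RewardDist]) -> float:
    """Best achievable expected reward when acting before rewards are revealed."""
    return max(sum(p * r for r, p in dist) for dist in reward_dists)


def value_with_lookahead(reward_dists: List[RewardDist]) -> float:
    """Expected reward when all realized rewards are observed before acting.

    Enumerates every joint outcome, assuming rewards are independent
    across actions, and takes the best action for each outcome.
    """
    total = 0.0
    for outcome in product(*reward_dists):
        prob = 1.0
        for _, p in outcome:
            prob *= p
        total += prob * max(r for r, _ in outcome)
    return total


# Two actions, each paying 0 or 1 with equal probability.
dists = [[(0.0, 0.5), (1.0, 0.5)], [(0.0, 0.5), (1.0, 0.5)]]

no_lookahead = value_without_lookahead(dists)
lookahead = value_with_lookahead(dists)
print(f"without lookahead: {no_lookahead}, with lookahead: {lookahead}")
```

The difference between the two values is the value of the lookahead information in this toy problem. In multi-step environments the same comparison has to account for how the information changes future decisions as well.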

