Reinforcement Learning Techniques (Applied Informatics)
Type: Elective (student's choice)
Department: Discrete Analysis and Intelligent Systems
Lectures
Semester | Number of hours | Lecturer | Group(s) |
9 | 16 | Associate Professor Shcherbyna Y. М. |
Laboratory work
Semester | Number of hours | Group | Teacher(s) |
9 | 16 |
Course description
Purpose. The course studies the fundamental principles of the theory and methods of reinforcement learning. Solving a reinforcement learning problem enables an intelligent agent to succeed in an unknown environment using only its own percepts and the occasional rewards it receives.
Short description. The course covers Markov decision processes, the Bellman equation, the value iteration algorithm, the policy iteration algorithm for computing an optimal policy, adaptive dynamic programming, temporal-difference learning, and exploration of the environment.
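As an illustration of the value iteration topic listed above, the following is a minimal sketch, not course material. The three-state chain, its action names, the rewards, and the discount factor GAMMA are all assumptions made up for this example. It repeatedly applies the Bellman update V(s) = max_a sum_s' P(s'|s,a) [r + GAMMA * V(s')] until the values stop changing, then reads off a greedy policy.

```python
# Hypothetical 3-state chain MDP (an assumption for illustration only).
# TRANSITIONS[state][action] = list of (probability, next_state, reward)
GAMMA = 0.9  # assumed discount factor

TRANSITIONS = {
    0: {"stay": [(1.0, 0, 0.0)], "go": [(0.8, 1, 0.0), (0.2, 0, 0.0)]},
    1: {"stay": [(1.0, 1, 0.0)], "go": [(0.8, 2, 1.0), (0.2, 1, 0.0)]},
    2: {"stay": [(1.0, 2, 0.0)]},  # absorbing state with no further reward
}


def q_value(values, state, action):
    """Expected discounted return of taking `action` in `state`."""
    return sum(
        prob * (reward + GAMMA * values[next_state])
        for prob, next_state, reward in TRANSITIONS[state][action]
    )


def value_iteration(tolerance=1e-8):
    """Apply the Bellman update until the largest change is below tolerance."""
    values = {state: 0.0 for state in TRANSITIONS}
    while True:
        new_values = {
            state: max(q_value(values, state, action) for action in TRANSITIONS[state])
            for state in TRANSITIONS
        }
        delta = max(abs(new_values[s] - values[s]) for s in TRANSITIONS)
        values = new_values
        if delta < tolerance:
            return values


def greedy_policy(values):
    """Pick, in each state, the action with the highest Q-value."""
    return {
        state: max(TRANSITIONS[state], key=lambda a: q_value(values, state, a))
        for state in TRANSITIONS
    }


values = value_iteration()
policy = greedy_policy(values)
print(values)
print(policy)
```

In this toy model the agent is expected to prefer "go" in states 0 and 1, since the only reward lies on the transition into state 2. Policy iteration would reach the same policy by alternating policy evaluation and greedy improvement instead of updating values directly.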
Objectives. The main objective of the course is to introduce students to the basic concepts of reinforcement learning, a branch of machine learning, and to the main types of reinforcement learning methods.
Upon completing this course, the student
Must know
• The formulation of the basic principles of reinforcement learning;
• The technique of direct utility estimation;
• The technique of adaptive dynamic programming;
• The technique of temporal-difference learning.
Must be able to
• Use methods based on direct utility estimation;
• Use methods based on adaptive dynamic programming;
• Use temporal-difference learning methods.
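To give a flavour of the temporal-difference methods named in the list above, here is a minimal sketch of TD(0) evaluation of a fixed random policy. The five-state random walk, the step size alpha, the episode count, and the function name td0_random_walk are assumptions chosen for illustration, not part of the course. After each step the estimate of the current state is moved toward reward + gamma * V(next state).

```python
import random


def td0_random_walk(episodes=20000, alpha=0.01, gamma=1.0, seed=0):
    """TD(0) value estimates for a hypothetical 5-state random walk.

    States 0..4 are non-terminal. Stepping left of state 0 ends the episode
    with reward 0; stepping right of state 4 ends it with reward 1.
    """
    rng = random.Random(seed)
    n_states = 5
    values = [0.5] * n_states  # arbitrary initial guess
    for _ in range(episodes):
        state = n_states // 2  # every episode starts in the middle
        while True:
            next_state = state + rng.choice((-1, 1))
            if next_state < 0:
                reward, next_value, done = 0.0, 0.0, True
            elif next_state >= n_states:
                reward, next_value, done = 1.0, 0.0, True
            else:
                reward, next_value, done = 0.0, values[next_state], False
            # TD(0) update: move V(state) toward the one-step bootstrapped target.
            values[state] += alpha * (reward + gamma * next_value - values[state])
            if done:
                break
            state = next_state
    return values


estimates = td0_random_walk()
print([round(v, 2) for v in estimates])
```

For this walk the exact values are 1/6, 2/6, ..., 5/6, so the estimates should land near those numbers. Unlike adaptive dynamic programming, the update needs no transition model, only the observed step.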