Reinforcement Learning Techniques (Applied Informatics)

Type: Elective (student's choice)

Department: Discrete Analysis and Intelligent Systems

Curriculum

Semester | Credits | Reporting
9 | 4 | Credit test (pass/fail)

Lectures

Semester | Hours | Lecturer | Group(s)
9 | 18 | Associate Professor Yu. M. Shcherbyna | PMi-53m

Laboratory work

Semester | Hours | Group | Teacher(s)
9 | 18 | PMi-53m | Associate Professor Yu. M. Shcherbyna

Course description

Purpose. To study the fundamental principles of the theory and methods of reinforcement learning. Solving a reinforcement learning problem enables an intelligent agent to succeed in an unknown environment using only its own percepts and, occasionally, rewards.

Short description. The course covers Markov decision processes, the Bellman equation, the value iteration algorithm, the policy iteration algorithm for computing an optimal policy, adaptive dynamic programming, temporal-difference learning, and exploration of the environment.
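As an illustration of the value iteration algorithm named above, the following is a minimal sketch rather than course material. The two-state problem, the state and action names, and the functions `value_iteration` and `greedy_policy` are all hypothetical and were invented for this example. Each sweep applies the Bellman update, replacing a state's utility with the best expected value of the immediate reward plus the discounted utility of the successor state.

```python
# Hypothetical two-state decision process, invented purely for illustration.
# transitions[state][action] is a list of (probability, next_state, reward).
transitions = {
    "A": {"stay": [(1.0, "A", 0.0)], "go": [(1.0, "B", 1.0)]},
    "B": {"stay": [(1.0, "B", 2.0)], "go": [(1.0, "A", 0.0)]},
}


def expected_value(outcomes, values, gamma):
    """Expected reward plus discounted utility of the successor state."""
    return sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)


def value_iteration(transitions, gamma=0.9, tol=1e-8):
    """Repeat Bellman updates until the largest change falls below tol."""
    values = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for s, actions in transitions.items():
            best = max(expected_value(o, values, gamma) for o in actions.values())
            delta = max(delta, abs(best - values[s]))
            values[s] = best  # in-place update, reused later in the same sweep
        if delta < tol:
            return values


def greedy_policy(transitions, values, gamma=0.9):
    """Pick, in every state, the action with the highest expected value."""
    return {
        s: max(actions, key=lambda a: expected_value(actions[a], values, gamma))
        for s, actions in transitions.items()
    }


values = value_iteration(transitions)
policy = greedy_policy(transitions, values)
```

In this toy problem, staying in B earns a reward of 2 on every step, so its utility approaches 2 / (1 - 0.9) = 20, and the utility of A approaches 1 + 0.9 * 20 = 19 by moving to B first.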

Objectives. The main objective of the course is to acquaint students with the core concepts of reinforcement learning, a branch of machine learning, and to study the main types of reinforcement learning methods.

Upon completing this course, the student

Must know

• The basic principles of reinforcement learning;

• The technique of direct utility estimation;

• The technique of adaptive dynamic programming;

• The technique of temporal-difference learning.

Must be able to

• Use methods based on direct utility estimation;

• Use methods based on adaptive dynamic programming;

• Use temporal-difference learning methods.
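To give a sense of what a temporal-difference method looks like in practice, here is a minimal sketch of TD(0) utility estimation for a fixed policy. It is not taken from the course materials: the function name `td0_evaluate`, the transition format, and the three-state episode are assumptions made for illustration. After every observed transition, the estimate for the current state is moved a fraction `alpha` of the way toward the reward plus the discounted estimate of the next state.

```python
def td0_evaluate(episodes, alpha=0.1, gamma=1.0):
    """Estimate state utilities from observed episodes with TD(0).

    Each episode is a list of (state, reward, next_state) transitions;
    next_state is None when the transition ends the episode.
    """
    values = {}
    for episode in episodes:
        for state, reward, next_state in episode:
            v = values.get(state, 0.0)
            v_next = 0.0 if next_state is None else values.get(next_state, 0.0)
            # Move the estimate toward the one-step target.
            values[state] = v + alpha * (reward + gamma * v_next - v)
    return values


# Hypothetical deterministic episode: s0 -> s1 -> s2 -> end, reward 1 at the end.
episode = [("s0", 0.0, "s1"), ("s1", 0.0, "s2"), ("s2", 1.0, None)]

after_one = td0_evaluate([episode], alpha=0.5, gamma=0.9)
after_many = td0_evaluate([episode] * 200, alpha=0.5, gamma=0.9)
```

After a single episode only the last state has learned anything, because the reward has not yet propagated backward. With repeated experience the estimates approach 1.0, 0.9, and 0.81 for s2, s1, and s0. Unlike adaptive dynamic programming, this sketch never builds a model of the transition probabilities.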