{"id":15046,"date":"2019-09-25T16:14:57","date_gmt":"2019-09-25T16:14:57","guid":{"rendered":"https:\/\/www.techopedia.com\/definition\/q-learning\/"},"modified":"2019-09-25T16:14:57","modified_gmt":"2019-09-25T16:14:57","slug":"q-learning","status":"publish","type":"definition","link":"https:\/\/www.techopedia.com\/definition\/32882\/q-learning","title":{"rendered":"Q-learning"},"content":{"rendered":"
Q-learning is a model-free reinforcement learning algorithm. It learns an action-value function, Q(s, a), that estimates the expected cumulative discounted reward of taking action a in state s, and from those estimates it derives an optimal policy for a finite Markov decision process.<\/p>\n
The Q-learning algorithm involves an agent, a set of states and a set of actions available in each state. The agent moves from state to state by choosing actions and receives a reward for each transition.<\/p>\n
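The Q function is updated after every step: the current estimate for a state-action pair is moved toward the observed reward plus the discounted value of the best action available in the next state. A learning rate controls how far each update moves the estimate, and a discount factor between 0 and 1 controls how much future rewards count relative to immediate ones.<\/p>\n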
Although the idea is simple, Q-learning underlies many reinforcement learning methods, including deep reinforcement learning. One of the best-known examples is deep Q-learning, which has been used to train programs to play video games such as classic Atari titles. Here a convolutional neural network takes frames of game-play as input and approximates the Q function, so that the program learns to play the game better over time.<\/p>\n
Q-learning has abundant potential for helping to advance artificial intelligence and machine learning.<\/p>\n","protected":false},"excerpt":{"rendered":"
What Does Q-learning Mean? Q-learning is a term for an algorithm structure representing model-free reinforcement learning. By evaluating policy and using stochastic modeling, Q-learning finds the best path forward in a Markov decision process. Techopedia Explains Q-learning The technical makeup of the Q-learning algorithm involves an agent, a set of states and a set of […]<\/p>\n","protected":false},"author":7813,"featured_media":0,"comment_status":"open","ping_status":"closed","template":"","format":"standard","meta":{"_acf_changed":false,"_lmt_disableupdate":"","_lmt_disable":"","om_disable_all_campaigns":false,"footnotes":""},"definitioncat":[270,256,269],"class_list":["post-15046","definition","type-definition","status-publish","format-standard","hentry","definitioncat-data-science","definitioncat-emerging-technology","definitioncat-machine-learning"],"acf":[],"yoast_head":"\n