Spring 2026 Op Research - Computer Science E6529 section 001

Advanced Reinforcement Learning

Call Number	13322
Day & Time Location	MW 1:10pm-2:25pm 140 Uris Hall
Points	3
Grading Mode	Standard
Approvals Required	None
Instructor	Shipra Agrawal
Type	SEMINAR
Method of Instruction	In-Person
Course Description	Theory of Markov Decision Processes (MDP) and Dynamic Programming. Design and convergence properties of Reinforcement Learning (RL) algorithms including Q-learning and policy iteration methods. Function approximation and deep RL algorithms: DQN, policy gradient, actor-critic methods. Exploration-Exploitation and regret bounds in RL. Multi-agent RL. RL with Human Feedback (RLHF). RL and Monte Carlo Tree Search (MCTS) for Agentic Systems. Note: Only one of ORCS E4529 or 6529 may be taken for credit.
Web Site	Vergil
Department	Industrial Engineering and Operations Research
Enrollment	41 students (50 max) as of 1:06PM Thursday, April 2, 2026
Subject	Op Research - Computer Science
Number	E6529
Section	001
Division	School of Engineering and Applied Science: Graduate
Note	Students can only count either IEOR 4529 or 6529 towards the
Section key	20261ORCS6529E001