| Call Number | 11860 | 
|---|---|
| Day & Time / Location | MW 1:10pm-2:25pm, 303 Seeley W. Mudd Building |
| Points | 3 | 
| Grading Mode | Standard | 
| Approvals Required | None | 
| Instructor | Shipra Agrawal | 
| Type | LECTURE | 
| Method of Instruction | In-Person | 
| Course Description | Markov Decision Processes (MDPs) and Reinforcement Learning (RL) problems. Reinforcement learning algorithms, including Q-learning, policy gradient methods, and actor-critic methods. The exploration-exploitation dilemma in reinforcement learning and the multi-armed bandit problem. Monte Carlo Tree Search methods; distributional, multi-agent, and causal reinforcement learning. |
| Web Site | Vergil | 
| Department | Industrial Engineering and Operations Research | 
| Enrollment | 65 students (60 max) as of 11:06AM Friday, October 31, 2025 | 
| Status | Full | 
| Subject | Op Research - Computer Science | 
| Number | E4529 | 
| Section | 001 | 
| Division | School of Engineering and Applied Science: Graduate | 
| Section key | 20253ORCS4529E001 |