13. You’re a new data scientist hired by Netflix. Your first assignment is to improve the following app: a user is presented with a screen containing a film poster. The user can then choose to watch the film (a reward of +1) or choose “next screen” (a reward of -1) which presents an additional film. For each film, you have a fixed-length vector that represents it. you also have a fixed number of poster types for each movie (same number for all films). Your algorithms need to select both the film and the poster to show at each time step. To train your model, you are provided with a previously collected dataset of 100,000 user sessions. Which of the following DRL algorithms should you use: