DRL

Branching dueling Q-networks are only effective when the action space can be partitioned into sub-groups in advance.

true

4. [General] Which of the following statements is true about the problems associated with correlated states in DQN algorithms for reinforcement learning (multiple answers may apply)?


b) One of the challenges of curiosity-driven exploration is the need to differentiate between unexpected events caused by our agent’s action and those that occur randomly.

true

7. [Transformers in DRL] Which of the following statements is correct regarding the Decision Transformer (DT) and the Trajectory Transformer (TT)?


10. [Imitation Learning] Which of the following statements is true regarding the differences between DAgger and simpler forms of imitation learning?


b) Contextual bandits assume that the reward function is known and fixed for all actions and states.

true

13. [General] Which of the following statements is true about optimal policies and the Bellman equation in reinforcement learning (multiple answers may apply)?


a) In actor-critic algorithms, the actor and critic components interact with each other in a cyclic manner, with the actor updating the policy based on the critic's value estimates, and the critic updating its estimates based on the actor's policy changes.

true
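To make the cyclic interaction in statement a) concrete, here is a minimal sketch of a tabular one-step actor-critic. Everything in it is an assumption for illustration and not part of the question: the toy task (one state, two actions, rewards 0 and 1), the function name `train_actor_critic`, the learning rates, and the softmax-preference actor. Because each episode ends after one step, the TD error reduces to `reward - value`.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train_actor_critic(steps=2000, actor_lr=0.1, critic_lr=0.1, seed=0):
    rng = random.Random(seed)
    rewards = [0.0, 1.0]  # hypothetical toy task: action 1 pays more
    prefs = [0.0, 0.0]    # actor: softmax action preferences
    value = 0.0           # critic: value estimate of the single state

    for _ in range(steps):
        probs = softmax(prefs)
        action = 0 if rng.random() < probs[0] else 1
        reward = rewards[action]

        # Critic evaluates the actor's choice (one-step episode, so no
        # bootstrapped next-state term appears in the TD error).
        td_error = reward - value

        # Critic update: move the estimate toward the observed return,
        # which depends on the actor's current policy.
        value += critic_lr * td_error

        # Actor update: policy-gradient step scaled by the critic's TD error.
        for a in range(len(prefs)):
            grad_log_pi = (1.0 if a == action else 0.0) - probs[a]
            prefs[a] += actor_lr * td_error * grad_log_pi

    return softmax(prefs), value
```

Calling `probs, value = train_actor_critic()` returns the final policy and the critic's estimate. Under these assumptions the policy should come to favour action 1, and the critic's estimate should rise as the policy improves, which is the two-way dependence the statement describes.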

16. [Meta learning] Which of the following statements is true regarding Actor-Mimic Networks (AMN) (multiple answers may apply)?


1. [General] For Q-learning to converge, which of the following options needs to take place?

