Think, Solve, and Learn


Which of the following statements are correct regarding off-policy methods? (Multiple answers may apply.)

(20 pts) - Question #170517

The algorithm employs two Q-functions, each designed to function as the unbiased estimator of the other

(20 pts) - Question #170518

The two Q-functions are trained on the same experiences (i.e. samples), but vary in their parameters

(20 pts) - Question #170519

Which of the following distinguishes deep reinforcement learning from supervised learning?

(20 pts) - Question #170520

Apprenticeship learning is incapable of adapting to previously unseen circumstances because of the need to re-calculate the policy at each time step

(20 pts) - Question #170521