Talk:Policy gradient method

From Wikipedia, the free encyclopedia
This is the current revision of this page, as edited by Hector (talk | contribs) at 15:07, 4 February 2025 (REINFORCE algorithm: new section). The present address (URL) is a permanent link to this version.

REINFORCE algorithm

I would remove the index subscript in the expectation. Do you agree? Thanks! Hector (talk) 15:07, 4 February 2025 (UTC)