Related changes
Enter a page name to see changes on pages linked to or from that page. To see members of a category, enter Category:Name of category. Changes to pages on your Watchlist are shown in bold with a green bullet. See more at Help:Related changes.
List of abbreviations (help):
- D: Edit made at Wikidata
- r: Edit flagged by ORES
- N: New page
- m: Minor edit
- b: Bot edit
- (±123): Page byte size change
- Temporarily watched page
11 April 2025
- m AI safety 15:11 +1,272 Bunnypranav (Rollback edit(s) by 51.235.232.20 (talk): Unexplained content removal (RW 16.1)) Tags: RW Rollback
- AI safety 15:10 −1,272 51.235.232.20 (Small small near miss report making) Tags: Reverted references removed Visual edit Mobile edit Mobile web edit
10 April 2025
- Reinforcement learning from human feedback 20:48 −14 PopoDameron (Not an example but rather a clarification on how the formula simplifies in this common case)
- Artificial intelligence 19:03 +144 Alenoach (Added a citation) Tag: Visual edit
- Artificial intelligence 18:45 +3 Alenoach (copyedit of the added paragraph) Tag: Visual edit
- Artificial intelligence 15:11 +739 אלכסנדר סעודה (→Power needs and environmental impacts: Adding about the IEA report)
7 April 2025
- Wikipedia:Verifiability 23:37 +6 Springee (Undid revision 1284493685 by JavaHurricane (talk). Per BRD and NOCON this should be reverted absent a consensus for the change. Please use the talk page if you wish to make a case for the change.) Tag: Undo
- Wikipedia:Verifiability 22:57 −6 JavaHurricane (Reverted 1 edit by Newimpartial (talk): See https://dictionary.cambridge.org/grammar/british-grammar/prefer (Cambridge dictionary)) Tags: Twinkle Undo Reverted
6 April 2025
- Reinforcement learning from human feedback 18:22 −5 Alenoach (copyedit) Tag: Visual edit
- m Reinforcement learning from human feedback 15:52 0 Kooryan (→Kahneman-Tversky Optimization (KTO)) Tag: Visual edit
- Reinforcement learning from human feedback 15:52 +4,354 Kooryan (Added KTO, another important DAA) Tag: Visual edit
- Reinforcement learning from human feedback 15:39 −5 Alenoach (copyedit) Tag: Visual edit
- m Reinforcement learning from human feedback 15:36 0 Alenoach (sentence case) Tag: Visual edit
- Reinforcement learning from human feedback 15:34 +2 Alenoach (Moved and copyedited the "Direct Alignment Algorithms" section) Tag: Visual edit
- m Reinforcement learning from human feedback 14:58 +667 Hanpei (→Direct Alignment Algorithms)
- Wikipedia:Verifiability 14:10 +6 Newimpartial (Undid revision 1284253643 by JavaHurricane (talk) This isn't the grammar I know.) Tags: Undo Reverted Mobile edit Mobile app edit Android app edit App undo
- m Reinforcement learning from human feedback 13:30 −5 Arjayay (Duplicate word removed)
- Wikipedia:Verifiability 13:27 −6 JavaHurricane (→Non-English sources: grammar; it's preferred to, not preferred over) Tags: Reverted 2017 wikitext editor
- m Reinforcement learning from human feedback 06:24 +258 JWEEEEEEN (edit new section) Tag: Visual edit
- Reinforcement learning from human feedback 00:27 +1,175 Kooryan (New section for Direct Alignment Algorithms) Tag: Visual edit
- m Reinforcement learning from human feedback 00:22 −22 Kooryan (→(Identity Preference Optimization)) Tag: Visual edit
- Reinforcement learning from human feedback 00:19 +4,053 Kooryan (Added identity preference optimization) Tag: Visual edit
5 April 2025
- m Reinforcement learning from human feedback 21:44 +39 Kooryan (→Direct preference optimization) Tag: Visual edit
- m Reinforcement learning from human feedback 15:58 +160 Kooryan (→Reward model: Modified equation formatting. Modified incorrect definitions of the advantage estimation in the clipped surrogate objective. It is wrong to consider the KL-penalty because the clipped surrogate is a totally different objective. A more technical description is required actually for PPO in this section as well, as many details are incorrect or unclear.) Tag: Visual edit
- Wikipedia:Citing sources 10:23 −1 Augnablik (→Citation order: Fixed a grammatical error.) Tags: Mobile edit Mobile web edit