Module talk:Emoji

I feel like we can make this better

Hi everyone:

I have worked on Lua modules before, and after stumbling upon this one, I feel we can make it better and easier for Wikipedians to get emoji codes in user and talk pages. I am calling for users who have worked on this module before, specifically @RexxS, @Qzekrom and @Izno, but really anyone who wants to help. To anyone who wants to help, please reply and we can begin working on a plan!

Additionally, I started another module, Module:WEmoji, and may be able to use this module for WEmoji.

More technical details: we could make it require less typing by wrapping it in templates (see Template:Wikipedia ads for an example). For example, {{Emoji Smiley}}. Or, to use mappings, {{Emoji 1f603}}.
I will wait 2 months, and if I do not receive a reply, I will begin working on this by myself.
Urban Versis 32 (talk) 13:32, 22 May 2022 (UTC)

Where do these names come from?

If you look at U+1F507 🔇 SPEAKER WITH CANCELLATION STROKE (linked here:🔇), you will see it is called Muted Speaker. That name comes from Unicode's CLDR short name. However,

{{#invoke:emoji|emoname|1f507}} → mute

So where do these names come from? Please document. Dpleibovitz (talk) 04:20, 9 March 2024 (UTC)

Ok, I did some hunting around. The latest charts can be found at

https://www.unicode.org/cldr/charts/44/annotations/americas.html - I think this is the one we use

And it shows *muted speaker | mute | quiet | silent | speaker

So there can be many aliases for a name. Not sure that the module works with all of these. Will it always return the first alias (and never the full name)? Dpleibovitz (talk) 04:49, 9 March 2024 (UTC)

I don't know where that (rather limited) list came from. The editor who added it is no longer with us so we can't ask.

The annotations chart you link doesn't seem to me to be the definitive list of emoji. Perhaps a better list is https://unicode.org/Public/emoji/latest/emoji-test.txt (version 15.1 at this writing). This appears to list all current emoji with their proper names.

I have hacked together a Lua module in my sandbox that reads a local copy of the emoji-test.txt file to create a replacement for Module:Emoji/data. Some items in Module:Emoji/data are not in the sandbox list under the same name (wink, grin, and 8ball, from the examples in the module doc, are some).
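
For anyone following along, here is a minimal sketch of how such a conversion might look. It is not the sandbox code: the function names `parse_line` and `build_tables` and the shape of the two result tables are assumptions made for illustration. It relies only on the line layout of emoji-test.txt (code points, a semicolon, a status, then `#`, the emoji, an `E` version, and the name) and keeps only fully-qualified entries.

```lua
-- Parse one data line of emoji-test.txt.
-- Returns code points (lowercase hex, space-separated), status, and name,
-- or nil for comment lines, blank lines, and anything that does not match.
local function parse_line(line)
    local codes, status, name = line:match(
        "^(%x[%x ]-)%s*;%s*([%a%-]+)%s*#%s*%S+%s+E%d+%.%d+%s+(.-)%s*$")
    if not codes then
        return nil
    end
    return codes:lower(), status, name
end

-- Build two lookup tables (hypothetical shape): name -> code and code -> name.
-- Multi-code-point sequences are joined with hyphens, e.g. "263a-fe0f".
local function build_tables(text)
    local by_name, by_code = {}, {}
    for line in text:gmatch("[^\r\n]+") do
        local codes, status, name = parse_line(line)
        if codes and status == "fully-qualified" then
            local key = codes:gsub(" ", "-")
            by_name[name] = key
            by_code[key] = name
        end
    end
    return by_name, by_code
end

-- A few lines in the emoji-test.txt layout, used as sample input.
local sample = [[
# subgroup: face-smiling
1F600                                                  ; fully-qualified     # 😀 E1.0 grinning face
1F507                                                  ; fully-qualified     # 🔇 E1.0 muted speaker
263A FE0F                                              ; fully-qualified     # ☺️ E0.6 smiling face
263A                                                   ; unqualified         # ☺ E0.6 smiling face
]]

local by_name, by_code = build_tables(sample)
print(by_code["1f507"])        --> muted speaker
print(by_name["smiling face"]) --> 263a-fe0f
```

Names keep their spaces here; a real data module would have to decide whether to convert them to underscores to match the existing keys.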

Module:Emoji/data is only used in one mainspace article (Irony punctuation § Emoji and Emoticons) so replacing the data table with the new one from my sandbox requires only that article to be fixed (because the abbreviated names rolling_eyes, stuck_out_tongue, and upside_down should be face_with_rolling_eyes, face_with_tongue, and upside-down_face).

So, what to do? Nothing? Replace the data in Module:Emoji/data with data derived from emoji-test.txt? Or with data derived from some other source?

—Trappist the monk (talk) 15:32, 10 March 2024 (UTC)

So we seem to have two sources of data, but yours (which is newer and more extensive) has neither CLDR keywords nor locale information, while https://www.unicode.org/cldr/charts/44/annotations/americas.html does. For example, "trade mark" is the general short name (with "trademark" a keyword), but in the en-CA locale these two are reversed. As you say, this module is currently used in one article, but I have suggestions for using it in thousands more.

Before beginning, a quick background on me. I have my own merged fork of Wikipedia, Wiktionary, Wikiquote, etc. that starts out with zero content. As I add the pages I desire, I add all the categories, templates, and modules that are needed to support them. Typically none of this content is modified much. However, Wikipedia articles are merged with their disambiguations, as well as the Wiktionary articles of both initial cases. Many Wikipedia redirects are replaced by their Wiktionary articles. All Wiktionary templates, modules and categories have a "Wikt/" prefix that is stripped out via DISPLAYNAME to prevent clashes. I wish there were MediaWiki support for a USING keyword! If you want to discuss my work, let's do that on your or my talk page, or offline via email.
In Wikipedia, I have modified {{r from emoji}} (on my fork) to take all the CLDR keywords as extra parameters. The first parameter is the short name, which is currently specified manually but could be automated with this module. I have also added a |locale=en-CA parameter. In my fork, {{r from emoji}} is called twice for ™️. Unfortunately, these currently use historical Unicode values in title case. At some point, Unicode changed all names to be completely in lower case, so all invocations of {{r from emoji}} should be modified.

In my fork, I have categories for every keyword and short name, e.g., category:aKeyword (emoji keyword) and category:aShortName (emoji short name), so {{r from emoji}} both links its displayed output to these categories and categorizes the current page into them, sorted by short name. I have made the same modifications to wikt:Template emojibox (and added the first short name parameter), so I can use it for non-redirect entries such as trademark, trade mark and ™ (called twice, for the general and en-CA locales). Wikipedia could use this as well, perhaps simply disabling the output but retaining the categorizing. Perhaps nobody wants the categorizing at all, but it makes finding emojis (sorted with their short names as articles or redirects) much easier than a single long list for each alphabetic character. Perhaps that's what the emoji/Unicode articles are for, but they don't exist on Wiktionary.

Module:Emoji could automate everything, but neither the current implementation nor your improvements would help my purposes, so I'm still entering things manually (for the few emojis I'm interested in). Perhaps what I want, data-module-wise, is:
```
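shortname[codepoint] = {
   -- default (general) list: short name first, then the CLDR keywords
   {"short name", "keyword1", "keyword2", "etc."},
   -- per-locale lists, in the same order
   locale = {
      ["en-CA"] = {"short name", "keyword1", "keyword2", "etc."},
      ["en-NZ"] = {"short name", "keyword1", "keyword2", "etc."},
   },
}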
```
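
To make the idea concrete, a lookup over a table of roughly this shape might look like the sketch below. Everything in it is an assumption for illustration: the function names `short_name` and `keywords`, keying by lowercase hex code point, and grouping the per-locale lists under a `locale` key (done here so the table is valid Lua). The sample entry holds only the two values mentioned above for ™, not the full CLDR keyword list.

```lua
-- Hypothetical data: one entry per code point. In each list, index 1 is the
-- short name and the rest are keywords. Locale lists override the default.
local data = {
    ["2122"] = {
        { "trade mark", "trademark" },
        locale = {
            ["en-CA"] = { "trademark", "trade mark" },
        },
    },
}

-- Return the list for a code point, preferring the requested locale and
-- falling back to the default list when that locale has no entry.
local function names_for(codepoint, locale)
    local entry = data[codepoint:lower()]
    if not entry then
        return nil
    end
    return (locale and entry.locale and entry.locale[locale]) or entry[1]
end

-- Short name: the first item of the selected list.
local function short_name(codepoint, locale)
    local names = names_for(codepoint, locale)
    return names and names[1]
end

-- Keywords: everything after the short name.
local function keywords(codepoint, locale)
    local names = names_for(codepoint, locale)
    if not names then
        return nil
    end
    local out = {}
    for i = 2, #names do
        out[#out + 1] = names[i]
    end
    return out
end

print(short_name("2122"))           --> trade mark
print(short_name("2122", "en-CA"))  --> trademark
print(short_name("2122", "en-NZ"))  --> trade mark (falls back to the default)
print(keywords("2122", "en-CA")[1]) --> trade mark
```

With the default list as a fallback, a locale only needs an entry where it actually differs, which would keep the data table small.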

Currently, I don't know of a machine-readable source for both the locale and keyword data. But I would get the module to produce the outputs of {{r from emoji}} and wikt:Template emojibox, sometimes twice or more if needed. I could place my modifications to {{r from emoji}} on my user page if you like. They're fairly minor and don't yet use this module. PS. I think we should use spaces, as the standard suggests, and not underscores. Dpleibovitz (talk) 00:05, 11 March 2024 (UTC)