Talk:X86 Bit manipulation instruction set

This is the talk page for discussing improvements to the X86 Bit manipulation instruction set article.
This is not a forum for general discussion of the article's subject.


Computing C‑class Mid‑importance

	This article is within the scope of WikiProject Computing, a collaborative effort to improve the coverage of computers, computing, and information technology on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
C	This article has been rated as C-class on Wikipedia's content assessment scale.
Mid	This article has been rated as Mid-importance on the project's importance scale.

Unnamed section

I can't understand the wording of the description of item 1 in the table under BMI1. The reference uses the same wording, although it does not make sense. Matthias291999 (talk) 01:14, 23 January 2016 (UTC)

The description of ANDN? Or the comment about CPUID? Carewolf (talk) 11:45, 26 January 2016 (UTC)

Parallel bit deposit and extract examples

All these illustrate is simple right and left shifts, which is NOT what these instructions do. (Yes, in the reductio ad absurdum case they can do a shift, but then why do we need them?) It's actually misleading if the reader assumes this is all they can do. It's bad enough to just delete the examples. I don't think the editor who constructed them (that's why there's no citation) understood the operators. It's nominally WP:OR, though in the construction of self-evident examples we sometimes have to allow a little liberty. I'd feel better if the examples were taken from published material. Sbalfour (talk) 14:49, 25 October 2019 (UTC)

It was written that way so that people can look at the example and immediately tell what is going on. While your examples show more of the power of the instructions, they are not in themselves instructive as to what the instructions do (in my opinion). Carewolf (talk) 14:57, 26 October 2019 (UTC)

Your example confounds bits, nibbles and bytes, as well as actual numeric quantities with symbolic representations of them. My examples also had some of that. I'm not sure what RGBA8888 and RGBA4444 represent. Are 'R', etc. symbols or literals? Is the quantity to be interpreted as 0xRGBA8888, i.e. a hexadecimal 32-bit number? 'R' and 'G' aren't hexadecimal digits, but even if they're symbolic, PEXT does not translate 0xRGBA8888 to 0xRGBA4444. I presume we're in 32-bit mode (though my examples were 64-bit) as the text stands. You use the mask "111100001111000011110000" twice, a 24-bit quantity; I'll take that as a typo for 0b11110000111100001111000011110000, though any typo here means the operator won't work as described. I've actually run the code, using the ASCII byte codes of 'R', 'G', 'B', 'A' (0x52, 0x47, 0x42, 0x41) as stand-ins for your R_1..8G_1..8B_1..8A_1..8 source. Here is the result:

PEXT(0x52474241,0xf0f0f0f0) = 0x00005444

PDEP(0x00005444,0xf0f0f0f0) = 0x50404040

Even correcting the mask, these don't agree with your "Result" column. If everything could be fixed up, the problem remains that the reader may imagine that PEXT and PDEP are merely "nibble-packing/unpacking" operators and miss their generality. PEXT sequentially collates any number of arbitrary-size bitfields of the source, omitting the gaps between them, into the low portion of the destination; PDEP is the inverse, taking any number of arbitrary-size contiguous fields from the low portion of the source and distributing them sequentially over the destination, with the gaps between the fields being zeroed. I've therefore recast the examples (and run them) into something correct and readable, using hexadecimal numbers uniformly and putting the selector, source and destination fields of the table in the order in which the operands are specified to the operators in GAS, though I think assemblers differ. Sbalfour (talk) 22:19, 28 October 2019 (UTC)
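For readers who want to check the description above without BMI2 hardware, here is a minimal software sketch of the 32-bit PEXT and PDEP semantics in C. The function names pext32_soft and pdep32_soft are hypothetical and chosen only for illustration; this is a bit-at-a-time model of the behaviour described in this thread, not how the hardware or any particular library implements it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical software model of 32-bit PEXT: gather the bits of src that
   are selected by mask into the low bits of the result, in order. */
static uint32_t pext32_soft(uint32_t src, uint32_t mask)
{
    uint32_t result = 0;
    uint32_t out_bit = 1;                      /* next result bit to fill */
    for (; mask != 0; mask &= mask - 1) {      /* drop lowest set mask bit each pass */
        uint32_t lowest = mask & (0u - mask);  /* isolate lowest set mask bit */
        if (src & lowest)
            result |= out_bit;
        out_bit <<= 1;
    }
    return result;
}

/* Hypothetical software model of 32-bit PDEP: scatter the low bits of src
   to the positions selected by mask; all other result bits are zero. */
static uint32_t pdep32_soft(uint32_t src, uint32_t mask)
{
    uint32_t result = 0;
    uint32_t in_bit = 1;                       /* next source bit to consume */
    for (; mask != 0; mask &= mask - 1) {
        uint32_t lowest = mask & (0u - mask);
        if (src & in_bit)
            result |= lowest;
        in_bit <<= 1;
    }
    return result;
}
```

With these definitions, pext32_soft(0x52474241, 0xf0f0f0f0) gives 0x00005444 and pdep32_soft(0x00005444, 0xf0f0f0f0) gives 0x50404040, matching the two results quoted above. A mask with fields of unequal size, such as 0x00ff00f0, shows the generality: pext32_soft(0x12345678, 0x00ff00f0) gives 0x00000347.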

RGBA is the RGBA color space; the numbers after it refer to how many bits there are for each color channel. But your new examples are good too, and more compact. Carewolf (talk) 07:21, 29 October 2019 (UTC)

LZCNT

The text says: "LZCNT is almost identical to the Bit Scan Reverse (BSR) instruction..." [except for flags and zero operands]. That is NOT NOT NOT true, and don't you believe it. BSR returns the index (offset from bit 0) of the highest '1' bit; LZCNT does literally what it says. They're not even close; in fact they are almost inverses of each other. For example, LZCNT(0x80000000) = 0 but BSR(0x80000000) = 31, because the highest set bit is offset 31 from bit 0, i.e. index 31. Likewise LZCNT(0x00000001) = 31 but BSR(0x00000001) = 0, because the set bit is at offset zero from bit 0. If one checks gcc's __builtin_clz(x) on architectures without ABM or BMI, it compiles to 31^BSR(x) (which is the same as 31-BSR(x) in a 5-bit field).

And even worse, LZCNT executes as BSR on architectures that don't support LZCNT, and that can lead to some surprises because BSR returns a different result than LZCNT. Sbalfour (talk) 17:22, 13 November 2019 (UTC)
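The relationship described above can be sketched in portable C. The helpers bsr32_soft and lzcnt32_soft are hypothetical names for software models of the two results, not compiler intrinsics. One assumption to flag: on real hardware BSR leaves its destination undefined for a zero source, so the -1 returned here for zero is only a marker chosen for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of BSR's result: index of the highest set bit.
   Returns -1 for x == 0 purely as a marker for this sketch; the real
   instruction leaves the destination undefined in that case. */
static int bsr32_soft(uint32_t x)
{
    int index = -1;
    while (x != 0) {
        x >>= 1;
        index++;
    }
    return index;
}

/* Hypothetical model of LZCNT's result: number of zero bits above the
   highest set bit, and 32 when x == 0. */
static int lzcnt32_soft(uint32_t x)
{
    int count = 0;
    for (uint32_t probe = 0x80000000u; probe != 0 && (x & probe) == 0; probe >>= 1)
        count++;
    return count;
}
```

For any nonzero x, lzcnt32_soft(x) equals 31 - bsr32_soft(x), which is also (31 ^ bsr32_soft(x)) because the index fits in five bits. This is why code that assumes LZCNT semantics gets a wrong answer when the encoding is executed as BSR on an older processor.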

Similarly, TZCNT and BSF are not 'almost identical': there is an analogous inverse relationship between them. Sbalfour (talk) 17:30, 13 November 2019 (UTC)
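The TZCNT/BSF claim can be checked with the same kind of software model. The helper names bsf32_soft and tzcnt32_soft below are hypothetical, and the -1 returned by the BSF model for a zero input is only a marker for this sketch (the real instruction leaves the destination undefined there). Worth noting when running it: unlike the LZCNT/BSR pair, the two models return the same value for every nonzero input, because the number of trailing zero bits is exactly the index of the lowest set bit. On this model the pair differs only for a zero source (and, on hardware, in flag behaviour).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of BSF's result: index of the lowest set bit.
   Returns -1 for x == 0 purely as a marker for this sketch. */
static int bsf32_soft(uint32_t x)
{
    if (x == 0)
        return -1;
    int index = 0;
    while ((x & 1u) == 0) {
        x >>= 1;
        index++;
    }
    return index;
}

/* Hypothetical model of TZCNT's result: number of zero bits below the
   lowest set bit, and 32 when x == 0. */
static int tzcnt32_soft(uint32_t x)
{
    int count = 0;
    for (uint32_t probe = 1u; probe != 0 && (x & probe) == 0; probe <<= 1)
        count++;
    return count;
}
```

For example, both functions return 31 for 0x80000000 and 0 for 0x00000001; only the zero input separates them in this model (32 versus the marker value).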