Talk:Pentium FDIV bug

I am wondering, what "586 Pentium clone" did IBM have for sale in 1994? Crusadeonilliteracy 13:52, 12 Jan 2004 (UTC)

I'm probably the one that wrote the original text; I don't remember where I got that information. [1] claims that IBM introduced the "5x86" in 8/1995; I think that the whole FDIV flap did last the 10 months required for this to be relevant. Cwitty 19:34, 12 Jan 2004 (UTC)

It was actually Cyrix's chip design; IBM merely manufactured the chips for Cyrix and, under agreement, got to sell some under its own brand name. I don't think the IBM5x86C can be called a Pentium clone from a design point of view. And it went into Socket 3 motherboards. Crusadeonilliteracy 00:53, 13 Jan 2004 (UTC)

The idea of the text is to point out that IBM was a competitor of Intel's in terms of selling x86-compatible chips. For that purpose, I'm not sure that it matters who designed the chips, or what motherboard they go in. On the other hand, if you feel that the current text is bad, go ahead and edit it. Cwitty 23:27, 13 Jan 2004 (UTC)

I've edited it to remove any confusion (I hope). --Townmouse 22:42, 8 Jun 2005 (UTC)

more detail needed?

The article is good but could use some more detail about the precise nature of the bug and its potential consequences. I'm not sure the "of little importance" phrase is NPOV. There's a fair bit of detail in the German article de:Pentium-FDIV-Bug if anybody cares to translate it. --Mathew5000 06:15, 27 June 2006 (UTC)

Bad opening sentence?

The original Pentium chip is notorious in computing history, for being the only chip ever made capable of performing the mathematically impossible operation of dividing by zero - due to a bug in the FPU.

This looks like it might be vandalism, or at least badly misinformed. Division by zero in a floating point context has been a fully defined operation since IEEE 754.

Also, the fdiv bug caused incorrect but real results when given real parameters. -- Myria 07:52, 13 November 2006 (UTC)

Well spotted, Myria. This looks like vandalism to me, as even a casual read of the first paragraph gives an accurate overview of the flaw. I think we should just remove that sentence. -- taviso 16:45, 13 November 2006 (UTC)

I went ahead and killed it. -- taviso 16:48, 13 November 2006 (UTC)

Wikipedia articles need a lead section; inaccurate or not, it did introduce what the article is about. I consider it quite rude to remove it without replacing with a better one [and I'm not the author]. I slapped a {{cleanup}} for now. -- intgr 18:47, 13 November 2006 (UTC)

An inaccurate sentence is considered better than no sentence? Interesting. Sorry, I'll remember that in future. -- taviso 19:39, 13 November 2006 (UTC)

It was not strictly "inaccurate", merely misleading, and it served its purpose. -- intgr 13:47, 14 November 2006 (UTC)

I've written a new lead and removed the generic cleanup tag. However, I'm not sure about the overall structure. It seems to be largely chronological, which is fine, but perhaps it would be better to embrace that and have the play-by-play as a timeline of some sort. --Steven Fisher 22:20, 18 December 2006 (UTC)

Trivia or References

The Freakazoid article refers to this bug. —The preceding unsigned comment was added by Can Not (talk • contribs) 00:59, 3 December 2006 (UTC).

Wrong value?

My PC gives 4195835.0/3145727.0 = 1.333 820 449 136 241 002 etc., not 1.333 820 449 136 241 000. Is this a bug, or is the number in the text wrong? --Con^structor 21:17, 8 August 2007 (UTC)

Do you have an original Pentium? The expected incorrect answer is off by roughly one part in ten thousand. Your answer could be easily explained by a change in precision, or by a software division algorithm replacing the hardware division instruction. --Steven Fisher 04:50, 10 August 2007 (UTC)
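
For anyone who wants to check their own machine, here is a minimal Python sketch of the usual residual test for this operand pair. The flawed quotient used for comparison (1.333739068902037589) is the widely reported erroneous Pentium result and is an assumption here, not something stated in this thread; the variable names are only illustrative.

```python
# Residual test for the operand pair discussed above.
x = 4195835.0
y = 3145727.0

q = x / y               # quotient as computed by this machine
residual = x - q * y    # zero (or negligibly small) on a correct FPU

# Assumption: widely reported erroneous quotient from an affected Pentium.
flawed_q = 1.333739068902037589
flawed_residual = x - flawed_q * y        # roughly 256
relative_error = abs(q - flawed_q) / q    # about 6e-5

print(f"quotient        = {q!r}")
print(f"residual        = {residual!r}")
print(f"flawed residual = {flawed_residual:.3f}")
print(f"relative error  = {relative_error:.2e}")
```

A relative error of that size is far larger than anything a formatting or precision difference in the last few digits could produce.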

I have a dual-core AMD. I was still curious about that. --Con^structor 08:31, 26 August 2007 (UTC)

Short answer: both are correct results. Double-precision (64-bit) floating-point numbers cannot represent this many digits, so both of these results would be equal as doubles. The x86-specific 80-bit floating-point datatype is implementation-defined by design (although it's at least as precise as double-precision values).

What the value actually looks like in the end depends a lot on implementation details, e.g., whether the number formatter is rounding up or down, whether it's interpolating unrepresentable binary values or filling them with zeroes, whether it's using double-precision IEEE floating point numbers or the 80-bit x86 reals, etc. And this behavior might change depending on the application or standard C library version. -- intgr ^#%@! 23:03, 26 August 2007 (UTC)
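
The precision point can be made concrete with a short Python sketch. It assumes Python floats are IEEE 754 doubles (true on common platforms) and uses exact integer arithmetic to get reference digits; the names are illustrative only.

```python
import sys
from decimal import Decimal

# Reference digits of 4195835/3145727 via exact integer arithmetic (18 decimals).
scaled = 4195835 * 10**18 // 3145727
exact_digits = f"{scaled // 10**18}.{scaled % 10**18:018d}"

# Gap between the two decimal strings quoted in the question.
gap = Decimal("1.333820449136241002") - Decimal("1.333820449136241000")

# Spacing between adjacent doubles in [1, 2) is the machine epsilon, 2**-52.
spacing = Decimal(sys.float_info.epsilon)

print("exact digits   :", exact_digits)
print("gap            :", gap)
print("double spacing :", spacing)
print("gap < spacing  :", gap < spacing)
```

Since the gap between the two printed values is about a hundred times smaller than the spacing between neighbouring doubles, any digits past roughly the sixteenth come from the formatter or from a wider intermediate type, not from the stored double.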

Analysis of the defect.

In late 1994, the Intel Pentium FDIV bug played out on mailing lists and in the newsgroup comp.sys.intel. The posters were accomplished scientists and engineers from major companies. While Intel was claiming the bug was minor, the readers of these newsgroups found out how serious the defect was. (I followed the postings at the time and was amazed at their quality.)

Tim Coe, an FPU (floating point unit) designer at Vitesse Semiconductor, read the reports of the Pentium division errors and was able to reverse engineer the cause of the error. He wrote a C program to predict the errors. He did not own an Intel CPU, so he went to a local computer store to check his results. His error predictions were correct. He posted his results on the newsgroup, comp.sys.intel, on November 16, 1994.

Tim Coe (1994-11-16). "Re: Glaring FDIV bug in Pentium!". Newsgroup: comp.sys.intel. Retrieved 2008-03-24.

The original newsgroup posting can be found on Google Groups. Here is a web site that has a good copy of Tim Coe's posting and some other valid links. [2]

His work was reported in the technical press at the time and here is a report from the MathWorks newsletter.

Moler, Cleve (Winter 1995). "A Tale of Two Numbers" (PDF). The MathWorks News & Notes. The MathWorks. Retrieved 2008-03-24.

Tim Coe later wrote a paper in the peer-reviewed journal IEEE Computational Science & Engineering.

Coe, Tim (Spring 1995). "Computational aspects of the Pentium affair". Computational Science & Engineering, IEEE. 2 (1): 18–30. doi:10.1109/99.372929. "The Pentium affair has been widely publicized. It started with an obscure defect in the floating-point unit of Intel Corporation's flagship Pentium microprocessor. This is the story of how the Pentium floating-point division problem was discovered, and what you need to know about the maths and computer engineering involved before deciding whether to replace the chip, install the workaround provided here, or do nothing. The paper also discusses broader issues of computational correctness."

-- SWTPC6800 (talk) 03:02, 25 March 2008 (UTC)

If you're planning to add some of this new material to the article, I'd suggest that you not include the Usenet posting (since it's not considered a reliable source) but the other papers would be good to include as references. It would be especially interesting if you can identify any discovery made by Tim Coe that was different or in addition to what Thomas Nicely reported. The article is surprisingly thin on the technical details of the problem. Possibly a few more sentences might be added by someone who could fully digest the references. EdJohnston (talk) 02:52, 25 March 2008 (UTC)

I have asked about the newsgroup posting at WP:Reliable_sources/Noticeboard#A_reliable_newsgroup_posting. This newsgroup cost Intel millions of dollars. -- SWTPC6800 (talk) 03:02, 25 March 2008 (UTC)

It appears that Andy Grove, the Intel CEO, responded to this newsgroup. [3] -- SWTPC6800 (talk) 03:13, 25 March 2008 (UTC)

The two papers I was involved in concerning this bug were Coe et al. as cited above and

Pratt, V.R., "Anatomy of the Pentium Bug", Proc. Theory and Practice of Software (TAPSOFT'95), Springer-Verlag Lecture Notes in Computer Science, volume LNCS 915, 97-107, Aarhus, Denmark, May 1995, available online at various places as a PDF by googling for the title, or just click on the copy at my website.

The latter paper expands on Coe's analysis of the bug, modifying it to account for additional details of the bug and in the process exposing previously hidden architectural details of the floating point unit (so the bug works a little like a linear accelerator, which reveals the structure of the nucleus by smashing particles together). The bug was caused by a miscalculation of where to truncate the lookup table used with the SRT algorithm, resulting in a row of five 2's being cleared to zero. This row had an extremely low probability of being used during random testing (Intel estimated one error in 27,000 years), making it hard to detect. Intel's statistics postulate only random data; in my paper I show that if instead one starts out with only the number 1 and repeatedly combines it with itself using the four arithmetic operations chosen at random (for example 1+1 = 2, 1/2 = .5, .5+1 = 1.5, .5/1.5 = .333…, etc.) the probability of encountering the row rises dramatically, with the bug manifesting itself every few minutes. Another bad case is "bruised" integers less than a hundred, such as 23.999927 (as caused by rounding errors when the results were supposed to be exact integers), where the bug is encountered on average every 400 divisions! --Vaughan Pratt (talk) 17:50, 4 June 2008 (UTC)
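
The "start from 1 and combine at random" experiment described above might be sketched as follows in Python. This is only an illustration of how such operands could be generated, not the procedure from the paper: the function name `grow_pool`, the pool-based selection, the seed, and the step count are all assumptions, and the code does not model the bug itself.

```python
import random

def grow_pool(steps, seed=0):
    """Hypothetical sketch: start from 1.0 and repeatedly combine two
    previously generated values with a randomly chosen + - * / operation.
    Returns the pool of values and the (dividend, divisor) pairs used."""
    rng = random.Random(seed)
    pool = [1.0]
    divisions = []
    for _ in range(steps):
        a = rng.choice(pool)
        b = rng.choice(pool)
        op = rng.choice("+-*/")
        if op == "+":
            result = a + b
        elif op == "-":
            result = a - b
        elif op == "*":
            result = a * b
        else:
            if b == 0:
                continue  # skip division by zero (e.g. after 1 - 1)
            divisions.append((a, b))
            result = a / b
        pool.append(result)
    return pool, divisions

pool, divisions = grow_pool(50, seed=1)
print(len(pool), "values generated;", len(divisions), "divisions performed")
print("sample operand pairs:", divisions[:5])
```

On an affected chip one would feed each recorded operand pair to the hardware divide and compare against a trusted result; here the pairs are only collected.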

Link?

Should a link or reference be included to this subject from wiki's "Math Coprocessor"? —Preceding unsigned comment added by 68.107.184.54 (talk) 13:31, 24 June 2008 (UTC)