Talk:Third normal form

This article is within the scope of WikiProject Databases, a project which is currently considered to be inactive.DatabasesWikipedia:WikiProject DatabasesTemplate:WikiProject DatabasesDatabases

Archives: 1 2

Not Simple Enough

I have a BSc in Computer Science, although I shied away from databases, and I also have a British A-Level in Computing, where I was taught how to reduce data down to 3NF. This article is written using database algebra. I like the history of it, but I passed this as an article about how I designed a coherent naming scheme for about 1000 computers (consider the fields between the dots in a hostname to be atoms), and I didn't even understand the article myself, even though I know the method.

This is an encyclopaedia, not a scholarly textbook! — Preceding unsigned comment added by 93.97.48.95 (talk) 00:11, 17 September 2011 (UTC)[reply]

Agreed. I've added a plainer English version based on a definition at Technopedia (with citation) even though this may not be the best one MaryEFreeman (talk) 15:38, 24 April 2014 (UTC)[reply]

I believe that the definition from the Technopedia article is incorrect. For example, if R = (A,B,C) where AB is a primary key and A is dependent on C, then R is in 3NF (according to my professor and Codd's definition), even though A is not only dependent on the primary key. 142.58.43.110 (talk) 21:48, 1 December 2014 (UTC)[reply]

Example

I've added some text about why the candidate key is what it is. This is because we frequently get talk-page comments where beginners express confusion about the concept of a candidate key, which in turn leads to confusion about the examples in the NF articles. There is of course always a link to the "candidate key" article, but the candidate key article at present is too theoretical for beginners to understand. --Nabav (talk) 09:17, 1 February 2009 (UTC)[reply]

I think either something's wrong with the example or with the candidate key article. Why is "Winner Date of Birth" supposed to be a non-prime attribute? "Tournament, Winner Date of Birth" is a candidate key according to the definition in the candidate key article (there are no distinct tuples with the same values for Tournament and Winner Date of Birth (1) and (1) does not hold for either Tournament or Winner Date of Birth). -- But then "Winner Date of Birth" is contained in a candidate key and therefore prime by definition. Am I missing something? —Preceding unsigned comment added by 141.76.75.39 (talk) 16:42, 27 January 2010 (UTC)[reply]

I'm not surprised you're confused. There's an important point that isn't coming across clearly in the "candidate key" article. Namely: candidate keys are supposed to be unique identifiers not merely for rows that exist in the table right now, but for all rows that could ever exist in the table. Thus {Tournament, Winner} is a candidate key because it's guaranteed to identify rows uniquely no matter what. Whereas the fact that two people can share the same date of birth means that {Tournament, Winner Date of Birth} won't do the same job.--Nabav (talk) 08:40, 15 May 2010 (UTC)[reply]

The whole key

This post relates to this page as much as it does to BCNF: http://en.wikipedia.org/wiki/Talk:Boyce-Codd_normal_form#The_whole_key fogus (talk) 18:52, 18 April 2009 (UTC)[reply]

I've reworded the explanation of the BCNF version of the key-whole-key-and-nothing-but-the-key dictum in this article, in response to your criticism. --Nabav (talk) 21:01, 20 April 2009 (UTC)[reply]

X and A?

In the example "This definition states that a table is in 3NF if and only if, for each of its functional dependencies X → A, at least one of the following conditions holds:" I am totally unclear as to whether X and A are keys or not, or whether it matters or not. Can somebody clarify? —CobraA1 19:54, 9 November 2009 (UTC)[reply]

i think this is a total false: " A-X, the set difference between A and X is a prime attribute "

from what i know each att of A, Ai should be prime. (and about the response if x is superkey it can be a key) — Preceding unsigned comment added by 46.117.228.13 (talk) 10:41, 7 September 2011 (UTC)[reply]

That's correct. I've changed the text accordingly. However, if I recall correctly, Zaniolo's paper was written using the canonlical definition of FDs, where the determined object is not an attribute set but an attribute (not even a singleton set of attributes) and I didn't make that change since most of the normalisation stuff in wikipedia uses the for where the determined object is a set of attributes. Logically, of course, the single attribute vs set of attributes makes no difference. Michealt (talk) 19:26, 12 October 2012 (UTC)[reply]

Odd comment

'Normalization is a process of reducing data redundencies.'Bold text —Preceding unsigned comment added by 112.135.20.34 (talk) 06:48, 8 March 2011 (UTC)[reply]

Example has two composite keys

I was editing the article to remove the "clarification" tag at the end of the tournament example, but I can't come up with a clarification that I like. I initially wrote this:

"Update anomalies cannot occur in these tables, since the only informational column (the date of birth) is not repeated across rows. This assumes that the winner name cannot change or be subject to variation, for example Al being called Alfred in a different competition, a problem which can be avoided through use of a surrogate key".

I don't like it because the date of birth is not purely informational, since in the single-table representation it may also serve to distinguish between two people with the same name. I don't like getting sidetracked with the name thing, but I feel that it is necessary and cannot bring myself to implicitly recommend using a real name as a primary key by not pointing out the problem.

Perhaps the example needs to be amended to use a "Competitor ID" column like the exmaple in first normal form? That doesn't really work very well either, since the data is clearly aggregated from multiple sources and does not have a consistent business key that can be used. Any attempt to identify a unique person by matching name and date of birth is an unreliable assumption in such a context.

— PhilHibbs | talk 09:05, 25 June 2015 (UTC)[reply]

Zaniolo

The section does not deliver a proof. First, it does not show equivalency of the two definitions but only one direction. Second, it even fails to deliver that promise. it shows nothing. It is not even wrong. — Preceding unsigned comment added by 93.207.206.126 (talk) 23:30, 28 January 2016 (UTC)[reply]

(Non-transitively) dependent or Non-(transitively dependent)

When describing Codd's condition, it is written "Every non-prime attribute of R is non-transitively dependent on every key of R."

I originally read this to mean that every non-prime attribute is directly dependent on every key. Having read the rest of the article, I believe that it is saying every non-prime attribute is not dependent on any key -- not even transitively!

Could the text be edited to disambiguate this? — Preceding unsigned comment added by 92.28.127.164 (talk) 17:19, 6 June 2016 (UTC)[reply]

Courtroom phrasing

There must be a way to work in my favorite database joke about the courtroom phrasing: "I promise to use the key, the whole key, and nothing but the key, so help me Codd". But I haven't figured out where it fits yet :-). Amillar (talk) 06:19, 8 July 2016 (UTC)[reply]