Talk:B+ tree

Computer science Start‑class High‑importance

This article is within the scope of WikiProject Computer science, a collaborative effort to improve the coverage of Computer science related articles on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.Computer scienceWikipedia:WikiProject Computer scienceTemplate:WikiProject Computer scienceComputer science

Start

This article has been rated as Start-class on Wikipedia's content assessment scale.

High

This article has been rated as High-importance on the project's importance scale.

Things you can help WikiProject Computer science with:

Here are some tasks awaiting attention:

Article requests :
- Requested articles/Applied arts and sciences/Computer science, computing, and Internet
Cleanup :
- Computer science articles needing attention
- Computer science articles needing expert attention
Copyedit :
- Computing
Expand :
- Computer science
Infobox :
- Computer science articles without infoboxes
Maintain :
- Timeline of computing 2020–present
Photo :
- Find pictures for the biographies of computer scientists (see List of computer scientists)
- Computing articles needing images
Stubs :
- Computer science stubs
Unreferenced :
- WikiProject Computer science/Unreferenced BLPs
Project-related :
- Tag all relevant articles in Category:Computer science and sub-categories with {{WikiProject Computer science}}

inner insert breay, this diagram erringly splits nodes right when they reach capacity, thus precluding there ever being any "full" nodes. B-tree algorithms (and B+-tree) are supposed to split nodes only after capacity is reached and a new value is tried to be inserted in a full node. 132.198.12.98 21:30, 30 November 2007 (UTC)[reply]

The diagram is incorrect and has been removed. While there are some implementations of B+ trees that do split before the block is full (this allows you to perform the split after insertion is completed), this is not the "normal" procedure.78.91.39.182 15:13, 1 December 2007 (UTC)[reply]

Diagram

The diagram isn't very clear. It doesn't show that the data matches the keys, and it makes the 'linked list' pointers much different than the data pointers. —Preceding unsigned comment added by 24.26.131.178 (talk) 04:46, 4 December 2007 (UTC)[reply]

There should be step-by-step insertion/deletion diagrams showing a few common scenarios. --Aelon (talk) 10:32, 22 December 2007 (UTC)[reply]

The diagram contains a major mistake: in a B+ tree, right subtrees have keys greater than or equal to those in parent node. The root node must therefore be corrected to 4 - 6, and the issue is dealt with. Note: the very insertion algorithm described in this article ("Insert the new leaf's smallest key and address into parent") confirms the same. 26.09.2009

Merge suggested

It seems to me like the main difference between B-Trees and B+-Trees is a relatively minor one that has little effect on how operations proceed, which renders much of this article redundant with B-tree. I suggest this article be merged into B-tree and become a section of that article. Dcoetzee 03:58, 2 December 2008 (UTC)[reply]

Can't support this. The differences might not be obvious but lie in details. B+ trees have much more problems which are not discussed in the B-Tree article. For example how do you handle duplicate keys during deletion? Duplicate keys do not exist for regular B-Trees, therefore it's not an issue there. The differences affect all the algorithms, therefore it would not make sense to merge both articles.

Diagram

Am I not correct that the current diagram doesn't show a B+ Tree at all? First, nodes will have between N and 2N entries. Which integer N is 3 twice?

You are not correct. B+ trees are order B trees (a node must have between 1 and B children), and B doesn't have to be 2, it can take any integer value greater than 1.Michealt (talk) 11:47, 29 August 2011 (UTC)[reply]

Tree Order

Ramakrishnan's book on DBMS says that tree order of d means size of each internal node can range between d <= m <= 2d. However this article says it ranges between d/2 <= m <= d. I am not sure which one is correct. But majority of the CS Schools around take Ramakrishnan's book as reference. So this may be written wrong in the article. — Preceding unsigned comment added by 95.15.75.81 (talk) 18:18, 17 December 2011 (UTC)[reply]

Merge or Additional Clarity

The only sentence that separates this from the B-tree article is In a B+ tree, in contrast to a B-tree, all records are stored at the leaf level of the tree; only keys are stored in interior nodes.

Can someone add why you'd use a B+ tree instead of a B tree? There seem to be a lot of database solutions using B+ trees, so there has to be a reason, but without that reason, this article doesn't differ substantially from the B tree article. Dean (talk) 14:14, 22 July 2010 (UTC)[reply]

I've added some content to the implementation section describing at least one advantage of a B+-tree over a B-tree in the context of database systems. It seems to me that merging these two articles wouldn't be a great idea; while it's true that the articles are similar, that's largely due to a lack of detail on the insert and delete algorithms. There are subtle differences in how these work. If someone were to put in detailed pseudocode for these algorithms, they'd be somewhat different. That would make for quite a large merged article. While it's true that merging them now without that content might make some sense, it just means that someday, when someone wants to put in the detailed pseudocode, someone else would suggest splitting them. Maybe someone should just put in detailed code now? SeparateWays (talk) 19:02, 22 July 2010 (UTC)[reply]

Dean, as I understand it, a B-tree node takes up 50% more space (depending on data types, of course). This means that with the same block size a B+-tree fans out quicker, allowing you to traverse less tree levels to get to the leaf. — Preceding unsigned comment added by 82.139.81.0 (talk) 03:17, 18 May 2012 (UTC)[reply]

SQlite's use of B+ trees

At the very top of the article it is mentioned that SQlite uses B+ trees for its indices. The referenced link [[1]] says that B+ trees are only used to store data and not table indices. Should the reference to SQlite be dropped entirely or should there be an extra clause to say "SQlite just uses them for data!"? — Preceding unsigned comment added by 137.48.72.149 (talk) 13:54, 25 January 2012 (UTC)[reply]

I think it's fair to keep SQLite on that list as it is. The term "B+ tree" is only relevant when data records consist of a lookup field (e.g. table's primary key) and non-lookup data fields (rest of table data).

In the case of regular database indexes, the whole index record is the lookup key, therefore a B+ tree and B-tree would be equivalent anyway. I suspect that this observation also applies to other databases. -- intgr [talk] 13:40, 2 February 2012 (UTC)[reply]

Deletion of seperator value from index node

as seperator keys are copied from leaf and can be subsequently moved to higher index nodes, while deleting a key value we have to delete the key twice, once from the leaf and once from the index, right?

                     9
                  /     \
           2 5 - -       11    17        21     -
          / | \         /      |         |  \
              5 6 7 - 9 10 - - 11 12 13 -

like in this case if we want to delete 9, we first delete 9 from leaf; which will be prevented from underflowing by borrowing 11 from right sibling, adjusting their seperator entry in parent to 12 subsequently. The tree will then look like

                     9
                  /     \
           2 5 - -       12     17       21     -
          / | \         /       |        |  \
              5 6 7 - 10 11 - - 12 13 - -

but should we then delete the entry 9 from the root? which will need another deletion operation. Or we just keep that entry, it would take some space though and can acumulate to a larger space.

--Samik ganguly (talk) 10:30, 5 April 2012 (UTC)[reply]

Traditionally all keys correspond to actual data in leaves. If you don't delete the interior node value when the corresponding data is deleted, when will you? This is traditionally done by simply move the 10 to where the 9 is. A delete_min function typically does that using a pass-by reference parameter to the original data in the interior node's key value to be replaced.132.160.49.90 (talk) 04:02, 27 March 2016 (UTC)[reply]

Maximum possible number of keys in the root node as a leaf

Hello,

Quotation from the article: "(The root is also the single leaf, in this case.) This node is permitted to have as little as one key if necessary, and at most b."

Are you sure that the root node possible maximum number of keys as a leaf is b and not (b - 1) like any other "regular" leaf?

Thank you. — Preceding unsigned comment added by 217.128.106.129 (talk) 21:17, 10 November 2012 (UTC)[reply]

PostgreSQL's use of B+ trees

PostgreSQL is capable of using B+ trees: http://www.postgresql.org/about/ Ddugovic (talk) 16:59, 25 December 2012 (UTC)[reply]

That page only mentions B+ trees in relation to GiST, which is a framework for building advanced inverted index structures. The fact that it *can* be used for building B+ trees doesn't mean that anyone uses it that way. I think it would be very misleading to list it on this article because of GiST.

Normal ("btree" type) indexes in Postgres are not B+ trees. The distinction between B+ trees and B-trees is kind of nonsense for database indexes in the first place -- all the columns in the index itself *are* the lookup key, and they're the same on the leaf level as any other level. The *record* itself is generally stored in a separate structure -- in the case of Postgres, it's the table heap.

The distinction does make sense for databases which store whole table data in a tree, such as InnoDB (which uses a normal B-tree) or Oracle's index-organized tables (and others). Postgres always stores table data in a heap, not a tree, thus the B+ tree/B-tree classificaiton is not applicable. -- intgr [talk] 23:03, 25 December 2012 (UTC)[reply]

So, what is a B+tree?

I. Here it states, "A B+ tree can be viewed as a B-tree in which each node contains only keys (not key-value pairs)"

II. However, in the B tree article : "In the B+-tree, copies of the keys are stored in the internal nodes; the keys and records are stored in leaves"

Is is I, II, or neither?

A B+ tree is

A B+ tree is simply a B tree with data stored at the leaves instead of with the keys in the internal nodes.

Although it is possible to link the leaf nodes as a single or double link list, that is not a requirement. You could also do that with other trees, but it's just cleaner and more elegant to do with B+ trees than most others

There are misleading statements should be removed like the definition here that seems to be someone's inaccurate opinion coming from a typical variation. 132.160.49.90 (talk) 04:09, 27 March 2016 (UTC)[reply]

Search algorithm, insertion, and merging gibberish

First, a function just calls another function. What is k_0 or k_i? Why are there no brackets like in k_[i+1]? Where does p come from? etc .. etc... etc ...

"B-trees grow at the root and not at the leaves" Umm ....

Is there any reason whey the "merging" garbage should not be deleted?

Insertion: "B-trees grow at the root and not at the leaves."

I read the referenced "fundamentals of database systems" and after that I do not understand this saying either. If a leaf node is split the tree grows at its bottom as well, which is the common case. Maybe it would be better to say that **new levels** are added by splitting the root node and adding a new root.

External links modified

Hello fellow Wikipedians,

I have just added archive links to one external link on B+ tree. Please take a moment to review my edit. If necessary, add {{cbignore}} after the link to keep me from modifying it. Alternatively, you can add {{nobots|deny=InternetArchiveBot}} to keep me off the page altogether. I made the following changes:

Added archive https://web.archive.org/20090912082150/http://1978th.net:80/tokyocabinet/ to http://1978th.net/tokyocabinet/

When you have finished reviewing my changes, please set the checked parameter below to true to let others know.

Y An editor has reviewed this edit and fixed any errors that were found.

If you have discovered URLs which were erroneously considered dead by the bot, you can report them with this tool.
If you found an error with any archives or the URLs themselves, you can fix them with this tool.

Cheers.—^{cyberbot II}_{Talk to my owner:Online} 22:42, 9 January 2016 (UTC)[reply]

Terrible

I have more than twenty years experience as a developer, I have implemented AVL B-trees and coming to this article and reading it, I have absolutely *no idea* what a B+ tree is.

This is I have to say though my normal experience on the Wikipedia for anything scientific, mathematic or in computing; it is explained in a way that only someone who *already completely understands* can comprehend. In that light, this article is no worse and no better than all or almost all other such articles.

95.91.213.121 (talk) 11:02, 24 June 2017 (UTC)[reply]

I read your comment and it's terrible. After reading it, I am still absolutely no smarter about what's wrong with this article. If you're going to spend the time to write a comment, find something constructive to say. Or maybe even contribute to improving the articles. -- intgr [talk] 23:26, 28 June 2017 (UTC)[reply]

I have some computer experience also, and noticed that the diagram disagrees with the insertion pseudo-code. So, I came to this discussion page. Perhaps all that is needed to clarify this algorithm is an explanation of how to step through the pseudo code to search for a few keys, based on the tree shown in the diagram. (And, when that fails, replace the pseudo-code and/or the diagram. ;^)