If I have a table column with data and create an index on this column, will the index take same amount of disc space as the column itself?
I'm interested because I'm trying to understand if b-trees actually keep copies of column data in leaf nodes or they somehow point to it?
Sorry if this a "Will Java replace XML?" kind question.
UPDATE:
created a table without index with a single GUID column, added 1M rows - 26MB
same table with a primary key (clustered index) - 25MB (even less!), index size - 176KB
same table with a unique key (nonclustered index) - 26MB, index size - 27MB
So only nonclustered indexes take as much space as the data itself.
All measurements were done in SQL Server 2005
The B-Tree points to the row in the table, but the B-Tree itself still takes some space on disk.
Some database, have special table which embed the main index and the data. In Oracle, it's called IOT -- index-organized table.
Each row in a regular table can be identified by an internal ID (but it's database specific) which is used by the B-Tree to identify the row. In Oracle, it's called rowid
and looks like AAAAECAABAAAAgiAAA
:)
If I have a table column with data and create an index on this column, will the index take same amount of disc space as the column itself?
In a basic B-Tree, you have the same number of node as the number of item in the column.
Consider 1,2,3,4
:
1
/
2
\ 3
\ 4
The exact space can still be a bit different (the index is probably a bit bigger as it need to store links between nodes, it may not be balanced perfectly, etc.), and I guess database can use optimization to compress part of the index. But the order of magnitude between the index and the column data should be the same.