How to measure table space on disk in RedShift / ParAccel

diemacht picture diemacht · Oct 22, 2013 · Viewed 19.4k times · Source

I have a table in RedShift. How can I see how many disk-space it uses?

Answer

Tomasz Tybulewicz picture Tomasz Tybulewicz · Oct 22, 2013

Use queries from this presentation: http://www.slideshare.net/AmazonWebServices/amazon-redshift-best-practices

Analyze disk space usage for cluster:

select
    trim(pgdb.datname) as Database,
    trim(pgn.nspname) as Schema,
    trim(a.name) as Table,
    b.mbytes,
    a.rows
from (
    select db_id, id, name, sum(rows) as rows
    from stv_tbl_perm a
    group by db_id, id, name
) as a
join pg_class as pgc on pgc.oid = a.id
join pg_namespace as pgn on pgn.oid = pgc.relnamespace
join pg_database as pgdb on pgdb.oid = a.db_id
join (
    select tbl, count(*) as mbytes
    from stv_blocklist
    group by tbl
) b on a.id = b.tbl
order by mbytes desc, a.db_id, a.name; 

Analyze Table distribution between nodes:

select slice, col, num_values, minvalue, maxvalue
from svv_diskusage
where name = '__INSERT__TABLE__NAME__HERE__' and col = 0
order by slice, col;