I am using DB2 9.7 FP5 for LUW. I have a table with 2.5 million rows, and I want to delete about 1 million of them; the rows to delete are distributed across the table. I am deleting the data with 5 DELETE statements:
delete from tablename where tableky between range1 and range2
delete from tablename where tableky between range3 and range4
delete from tablename where tableky between range5 and range6
delete from tablename where tableky between range7 and range8
delete from tablename where tableky between range9 and range10
While doing this, the first 3 deletes work properly, but the 4th fails and DB2 hangs, doing nothing. Below is the process I followed; please help me with this:
1. Set the following profile registry parameters: DB2_SKIPINSERTED, DB2_USE_ALTERNATE_PAGE_CLEANING, DB2_EVALUNCOMMITTED, DB2_SKIPDELETED, DB2_PARALLEL_IO
2. Alter the buffer pools for automatic storage.
3. Turn off logging for the table (alter table tabname activate not logged initially) and delete the records.
4. Execute the script with the +c option (auto-commit off), so that the ALTER and the DELETE run in the same unit of work and the deletes stay unlogged (see the sketch below).
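For reference, a minimal sketch of steps 3-4, assuming the statements are in a script file del.sql run as db2 +c -tvf del.sql (NOT LOGGED INITIALLY is switched off again as soon as the unit of work commits):

alter table tablename activate not logged initially;
delete from tablename where tableky between range1 and range2;
-- warning: if a statement fails before this commit, the whole table
-- can be left inaccessible (SQL1477N) and may have to be restored
commit;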
What are the best practices to delete such a large amount of data? Why is it failing when it is deleting data from the same table and of the same nature?
This is always a tricky task. The size of a transaction (e.g. for a safe rollback) is limited by the size of the transaction log. The transaction log is filled not only by your SQL commands but also by the commands of other users using the db at the same moment.
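To confirm that the log really is what stops the 4th delete, you can watch its utilization while the deletes run; a minimal sketch using the SYSIBMADM.LOG_UTILIZATION administrative view (available in 9.7):

-- how full is the transaction log right now?
select log_utilization_percent, total_log_used_kb, total_log_available_kb
from sysibmadm.log_utilization;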
I would suggest using one of, or a combination of, the following methods:
1. Do commits often - in your case I would put one commit after each delete command, e.g. as sketched below.
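A sketch based on the five statements from the question:

delete from tablename where tableky between range1 and range2;
commit;
delete from tablename where tableky between range3 and range4;
commit;
-- ...and so on for the remaining ranges, one commit after each delete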
2. Increase the size of the transaction log. As I recall, the default DB2 transaction log is not very big. Its size should be calculated/tuned for each database individually; the LOGFILSIZ, LOGPRIMARY and LOGSECOND database configuration parameters control it.
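A sketch of how the log could be enlarged, assuming a database named mydb (the concrete values are placeholders and must be tuned to your workload and disk space):

-- larger log files, more primary and secondary log files
db2 update db cfg for mydb using LOGFILSIZ 10000
db2 update db cfg for mydb using LOGPRIMARY 20
db2 update db cfg for mydb using LOGSECOND 50
-- LOGFILSIZ and LOGPRIMARY only take effect after all connections close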
3. Write and call a stored procedure which does the deletes in blocks, e.g.:
-- USAGE - create: db2 -td@ -vf del_blocks.sql
-- USAGE - call:   db2 "call DEL_BLOCKS(4, ?)"
drop PROCEDURE DEL_BLOCKS@
CREATE PROCEDURE DEL_BLOCKS(IN PK_FROM INTEGER, IN PK_TO INTEGER)
LANGUAGE SQL
BEGIN
    declare v_CNT_BLOCK bigint;
    set v_CNT_BLOCK = 0;
    -- walk the key range with a held cursor so commits do not close it
    FOR r_cur as c_cur cursor with hold for
        select tableky from tablename
        where tableky between pk_from and pk_to
        for read only
    DO
        delete from tablename where tableky = r_cur.tableky;
        set v_CNT_BLOCK = v_CNT_BLOCK + 1;
        -- commit every 5000 deleted rows to keep the transaction log small
        if v_CNT_BLOCK >= 5000 then
            set v_CNT_BLOCK = 0;
            commit;
        end if;
    END FOR;
    commit;
END@
4. In some cases, when I needed to purge very big tables or keep only a small number of records (and there were no FK constraints), I used export + import (replace). The replace import option is very destructive - it purges the whole table before the import of the new records starts (see the documentation of the db2 import command), so be sure what you are doing and make a backup first. For such sensitive operations I create 3 scripts and run each separately: backup, export, import.
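A minimal sketch of the backup step, assuming a database named mydb and a backup target directory /backup/dir (both placeholders):

db2 backup db mydb to /backup/dir

Here is the script for export: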
echo '===================== export started ';
values current time;
export to tablename.del of del
select * from tablename where (tableky between 1 and 1000
or tableky between 2000 and 3000
or tableky between 5000 and 7000
) ;
echo '===================== export finished ';
values current time;
Here is the import script (note: the db2 import ALLOW WRITE ACCESS option is not compatible with REPLACE, so it must not be used here):
echo '===================== import started ';
values current time;
import from tablename.del of del commitcount 2000
-- !!!! the next line is an IMPORTANT and VERY, VERY destructive option
replace
into tablename ;
echo '===================== import finished ';
5. DB2 version 9.7 introduced the TRUNCATE statement, which:
deletes all of the rows from a table.
Basically:
TRUNCATE TABLE <tablename> IMMEDIATE
I have no experience with TRUNCATE in DB2, but in some other engines the command is very fast and does not use the transaction log (at least not in the usual manner). Please check all the details in the official documentation. Like method 4, this method too is very destructive - it purges the whole table - so be very careful before issuing the command, and ensure the previous state with a table/db backup first. Run it only when there are no other users on the db, or ensure this by locking the table; one way to do that is sketched below.
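One possible way to guarantee exclusive access, sketched with the quiesce command (assumes a database named mydb and SYSADM/DBADM authority; quiescing keeps other users out while the table is emptied):

-- keep everyone else out while the table is emptied
db2 connect to mydb
db2 quiesce database immediate force connections
db2 "truncate table tablename immediate"
db2 unquiesce database
db2 connect reset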
In a transactional db (like DB2), a rollback can restore the db state to the state when the transaction started. With methods 1, 3, 4 and 5 this can't be achieved, so if you need the "restore to the original state" feature, the only option which ensures it is method nr. 2 - increasing the transaction log.