I'm facing a trade-off issue between temp space & query performance.
My scenario is like this: I have a huge transaction table (20 billion records across 25 partitions) and a small metric lookup table with 7 records. I need to process each transaction record for each metric record. Essentially output will be 7 * 20 billion records. This output will have to be aggregated based on 5-6 columns.
I have considered two options:
Cross join these two tables and specify processing logic wrt metric using "case when" and do "group by" operation.
Have seven different queries for each metric id and "UNION ALL" the results.
#1 is consuming huge temp space of about 250 GB and #2 runs for around 230 mins.
Is there a way, I can optimize one of these options? I need to get this query completed in 60 mins.
Adding query
@david query is pasted below
WITH IDQ_LKP AS
(SELECT '703' AS METRIC_ID,'desc1' AS EN_METRIC_1_NM
FROM DUAL
UNION ALL
SELECT '702' AS METRIC_ID,'desc1' AS EN_METRIC_1_NM
FROM DUAL
UNION ALL
SELECT '704' AS METRIC_ID,'% desc1' AS EN_METRIC_1_NM
FROM DUAL
UNION ALL
SELECT '705' AS METRIC_ID,'desc2' AS EN_METRIC_1_NM
FROM DUAL
UNION ALL
SELECT '706' AS METRIC_ID,'desc3' AS EN_METRIC_1_NM
FROM DUAL
UNION ALL
SELECT '707' AS METRIC_ID,'desc5' AS EN_METRIC_1_NM
FROM DUAL
UNION ALL
SELECT '701' AS METRIC_ID,'desc4' AS EN_METRIC_1_NM
FROM DUAL)
SELECT /*+ parallel(16) USE_HASH_AGGREGATION */ col1 ,
col2 ,
'Monthly Snapshots' AS Time_Rollup , col3 , date_pk , colr , col5 , colr_DESC , col5_DESC , metric_id , EN_METRIC_1_NM , sum (
CASE
WHEN (metric_id='704'
AND record_identifier=17 )
OR (metric_id='705'
AND record_identifier=123)
OR (metric_id='706'
AND record_identifier=43)
OR (metric_id='707'
AND record_identifier=34) THEN
nvl ( record_count,0 )
WHEN metric_id NOT IN ('704','705','706','707') THEN
NULL
ELSE 0
END ) AS METRIC_1_CY , NULL AS METRIC_1_LY , sum (
CASE
WHEN (metric_id='703'
AND record_identifier=17)
OR (metric_id='705'
AND record_identifier=777 )
OR (metric_id='702'
AND record_identifier=123 )
OR (metric_id='704'
AND record_identifier=17 )
OR (metric_id='706'
AND record_identifier=99999997 )
OR (metric_id='707'
AND record_identifier=99999996) THEN
nvl ( record_count,0 )
WHEN metric_id NOT IN ('702','703','704','705','706','707') THEN
NULL
ELSE 0
END ) AS METRIC_2_CY , NULL AS METRIC_2_LY , NULL AS METRIC_3_CY , NULL AS METRIC_3_LY
FROM TXN,LKP
WHERE col1=2
GROUP BY col1 , col2 , col3 , date_pk , colr , col5 , colr_DESC , col5_DESC, metric_id , EN_METRIC_1_NM
Hi ,
I still see 252 GB need of temp space in explain plan...
-------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost | Pstart| Pstop | TQ |IN-OUT| PQ Distrib |
-------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | | 17M| | | | | |
| 1 | PX COORDINATOR | | | | | | | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10003 | 1894M| 217G| | 17M| | | Q1,03 | P->S | QC (RAND) |
| 3 | HASH GROUP BY | | 1894M| 217G| 262G| 17M| | | Q1,03 | PCWP | |
| 4 | PX RECEIVE | | 1894M| 217G| | 17M| | | Q1,03 | PCWP | |
| 5 | PX SEND HASH | :TQ10002 | 1894M| 217G| | 17M| | | Q1,02 | P->P | HASH |
| 6 | HASH GROUP BY | | 1894M| 217G| 262G| 17M| | | Q1,02 | PCWP | |
| 7 | MERGE JOIN CARTESIAN | | 1894M| 217G| | 149K| | | Q1,02 | PCWP | |
| 8 | BUFFER SORT | | | | | | | | Q1,02 | PCWC | |
| 9 | PX RECEIVE | | 7 | 154 | | 14 | | | Q1,02 | PCWP | |
| 10 | PX SEND BROADCAST | :TQ10000 | 7 | 154 | | 14 | | | | S->P | BROADCAST |
| 11 | VIEW | | 7 | 154 | | 14 | | | | | |
| 12 | UNION-ALL | | | | | | | | | | |
| 13 | FAST DUAL | | 1 | | | 2 | | | | | |
| 14 | FAST DUAL | | 1 | | | 2 | | | | | |
| 15 | FAST DUAL | | 1 | | | 2 | | | | | |
| 16 | FAST DUAL | | 1 | | | 2 | | | | | |
| 17 | FAST DUAL | | 1 | | | 2 | | | | | |
| 18 | FAST DUAL | | 1 | | | 2 | | | | | |
| 19 | FAST DUAL | | 1 | | | 2 | | | | | |
| 20 | BUFFER SORT | | 270M| 25G| | 17M| | | Q1,02 | PCWP | |
| 21 | VIEW | | 270M| 25G| | | | | Q1,02 | PCWP | |
| 22 | HASH GROUP BY | | 270M| 22G| 29G| 115K| | | Q1,02 | PCWP | |
| 23 | PX RECEIVE | | 270M| 22G| | 843 | | | Q1,02 | PCWP | |
| 24 | PX SEND HASH | :TQ10001 | 270M| 22G| | 843 | | | Q1,01 | P->P | HASH |
| 25 | PX BLOCK ITERATOR | | 270M| 22G| | 843 | 28 | 55 | Q1,01 | PCWC | |
|* 26 | TABLE ACCESS FULL| TXN | 270M| 22G| | 843 | 28 | 55 | Q1,01 | PCWP | |
---------------------------------------------------------------------------------------------------------
Its running now... I doubt it may get stuck into same problem....
Plan for "UNION ALL" approach..
-------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost | Pstart| Pstop | TQ |IN-OUT| PQ Distrib |
-------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | | 774K| | | | | |
| 1 | UNION-ALL | | | | | | | | | | |
| 2 | PX COORDINATOR | | | | | | | | | | |
| 3 | PX SEND QC (RANDOM) | :TQ10001 | 270M| 18G| | 100K| | | Q1,01 | P->S | QC (RAND) |
| 4 | HASH GROUP BY | | 270M| 18G| 24G| 100K| | | Q1,01 | PCWP | |
| 5 | PX RECEIVE | | 270M| 18G| | 843 | | | Q1,01 | PCWP | |
| 6 | PX SEND HASH | :TQ10000 | 270M| 18G| | 843 | | | Q1,00 | P->P | HASH |
| 7 | PX BLOCK ITERATOR | | 270M| 18G| | 843 | 28 | 55 | Q1,00 | PCWC | |
|* 8 | TABLE ACCESS FULL| TXN | 270M| 18G| | 843 | 28 | 55 | Q1,00 | PCWP | |
| 9 | PX COORDINATOR | | | | | | | | | | |
| 10 | PX SEND QC (RANDOM) | :TQ20001 | 270M| 21G| | 112K| | | Q2,01 | P->S | QC (RAND) |
| 11 | HASH GROUP BY | | 270M| 21G| 28G| 112K| | | Q2,01 | PCWP | |
| 12 | PX RECEIVE | | 270M| 21G| | 843 | | | Q2,01 | PCWP | |
| 13 | PX SEND HASH | :TQ20000 | 270M| 21G| | 843 | | | Q2,00 | P->P | HASH |
| 14 | PX BLOCK ITERATOR | | 270M| 21G| | 843 | 28 | 55 | Q2,00 | PCWC | |
|* 15 | TABLE ACCESS FULL| TXN | 270M| 21G| | 843 | 28 | 55 | Q2,00 | PCWP | |
| 16 | PX COORDINATOR | | | | | | | | | | |
| 17 | PX SEND QC (RANDOM) | :TQ30001 | 270M| 21G| | 112K| | | Q3,01 | P->S | QC (RAND) |
| 18 | HASH GROUP BY | | 270M| 21G| 28G| 112K| | | Q3,01 | PCWP | |
| 19 | PX RECEIVE | | 270M| 21G| | 843 | | | Q3,01 | PCWP | |
| 20 | PX SEND HASH | :TQ30000 | 270M| 21G| | 843 | | | Q3,00 | P->P | HASH |
| 21 | PX BLOCK ITERATOR | | 270M| 21G| | 843 | 28 | 55 | Q3,00 | PCWC | |
|* 22 | TABLE ACCESS FULL| TXN | 270M| 21G| | 843 | 28 | 55 | Q3,00 | PCWP | |
| 23 | PX COORDINATOR | | | | | | | | | | |
| 24 | PX SEND QC (RANDOM) | :TQ40001 | 270M| 21G| | 112K| | | Q4,01 | P->S | QC (RAND) |
| 25 | HASH GROUP BY | | 270M| 21G| 28G| 112K| | | Q4,01 | PCWP | |
| 26 | PX RECEIVE | | 270M| 21G| | 843 | | | Q4,01 | PCWP | |
| 27 | PX SEND HASH | :TQ40000 | 270M| 21G| | 843 | | | Q4,00 | P->P | HASH |
| 28 | PX BLOCK ITERATOR | | 270M| 21G| | 843 | 28 | 55 | Q4,00 | PCWC | |
|* 29 | TABLE ACCESS FULL| TXN | 270M| 21G| | 843 | 28 | 55 | Q4,00 | PCWP | |
| 30 | PX COORDINATOR | | | | | | | | | | |
| 31 | PX SEND QC (RANDOM) | :TQ50001 | 270M| 21G| | 112K| | | Q5,01 | P->S | QC (RAND) |
| 32 | HASH GROUP BY | | 270M| 21G| 28G| 112K| | | Q5,01 | PCWP | |
| 33 | PX RECEIVE | | 270M| 21G| | 843 | | | Q5,01 | PCWP | |
| 34 | PX SEND HASH | :TQ50000 | 270M| 21G| | 843 | | | Q5,00 | P->P | HASH |
| 35 | PX BLOCK ITERATOR | | 270M| 21G| | 843 | 28 | 55 | Q5,00 | PCWC | |
|* 36 | TABLE ACCESS FULL| TXN | 270M| 21G| | 843 | 28 | 55 | Q5,00 | PCWP | |
| 37 | PX COORDINATOR | | | | | | | | | | |
| 38 | PX SEND QC (RANDOM) | :TQ60001 | 270M| 21G| | 112K| | | Q6,01 | P->S | QC (RAND) |
| 39 | HASH GROUP BY | | 270M| 21G| 28G| 112K| | | Q6,01 | PCWP | |
| 40 | PX RECEIVE | | 270M| 21G| | 843 | | | Q6,01 | PCWP | |
| 41 | PX SEND HASH | :TQ60000 | 270M| 21G| | 843 | | | Q6,00 | P->P | HASH |
| 42 | PX BLOCK ITERATOR | | 270M| 21G| | 843 | 28 | 55 | Q6,00 | PCWC | |
|* 43 | TABLE ACCESS FULL| TXN | 270M| 21G| | 843 | 28 | 55 | Q6,00 | PCWP | |
| 44 | PX COORDINATOR | | | | | | | | | | |
| 45 | PX SEND QC (RANDOM) | :TQ70001 | 270M| 21G| | 112K| | | Q7,01 | P->S | QC (RAND) |
| 46 | SORT GROUP BY | | 270M| 21G| 28G| 112K| | | Q7,01 | PCWP | |
| 47 | PX RECEIVE | | 270M| 21G| | 843 | | | Q7,01 | PCWP | |
| 48 | PX SEND HASH | :TQ70000 | 270M| 21G| | 843 | | | Q7,00 | P->P | HASH |
| 49 | PX BLOCK ITERATOR | | 270M| 21G| | 843 | 28 | 55 | Q7,00 | PCWC | |
|* 50 | TABLE ACCESS FULL| TXN | 270M| 21G| | 843 | 28 | 55 | Q7,00 | PCWP | |
-------------------------------------------------------------------------------------------------------------------------------------------
Just looking at the query, one approach that I'd consider trying is to change from this method of implementation:
... to this ...
That ought to require a smaller temp data set.