I want to create a table where the entries of various columns are 'pivoted' to column headers. The table is for reporting purposes - my users want to query data via Excel (using Microsoft Query), and the problem is that doing the pivoting in Excel makes the files unpractically large and slow even for moderately-sized data sets (~100k data points).
Consider the following example:
CREATE TABLE tt
(
"COMMODITY" VARCHAR2(4000 BYTE),
"MARKET" VARCHAR2(4000 BYTE),
"BID_ASK" VARCHAR2(4000 BYTE),
"PRICE" NUMBER
);
INSERT INTO tt VALUES ('Gold','US','Ask',1.1);
INSERT INTO tt VALUES ('Gold','US','Bid',1);
INSERT INTO tt VALUES ('Gold','EU','Ask',1.2);
INSERT INTO tt VALUES ('Gold','EU','Bid',1.1);
INSERT INTO tt VALUES ('Oil','US','Ask',11);
INSERT INTO tt VALUES ('Oil','US','Bid',10);
INSERT INTO tt VALUES ('Oil','EU','Ask',12);
INSERT INTO tt VALUES ('Oil','EU','Bid',11);
The output that I want to achieve would be something like (the exact column headers don't matter much):
COMMODITY 'US_Bid' 'US_Ask' 'EU_Bid' 'EU_Ask'
Gold 1 1.1 1.1 1.2
Oil 10 11 11 12
Now it is straightforward to pivot a single column:
SELECT * FROM
(
SELECT * FROM tt
)
PIVOT
(
SUM(PRICE)
FOR MARKET IN ('US','EU')
)
Which gives:
COMMODITY BID_ASK 'US' 'EU'
Gold Bid 1 1.1
Oil Bid 10 11
Oil Ask 11 12
Gold Ask 1.1 1.2
According to my research there is no syntax for directly pivoting multiple columns. There are some related questions (here, here or here), but I could not find a direct answer to my problem there. So I came up with the following solution:
SELECT * FROM
(
SELECT COMMODITY, CONCAT(CONCAT(MARKET,'_'),BID_ASK) AS MARKET_BID_ASK, PRICE FROM tt
)
PIVOT
(
SUM(PRICE)
FOR MARKET_BID_ASK IN ('US_Bid','US_Ask','EU_Bid','EU_Ask')
)
This produces exactly the desired output. However, I do not consider it a practical solution as the number of variables that I have to enter grows way too fast (in my real data set, I want to pivot more fields at once, all of which have many different values). I know that there exist dynamic pivots, however I am not sure whether this will work with Excel, and I would also like to keep the syntax as simple as possible, because the users will define queries on their own (I just want to provide a template query that they can adapt). So I tried to query the field names in the IN-clause:
SELECT * FROM
(
SELECT COMMODITY, CONCAT(CONCAT(MARKET,'_'),BID_ASK) AS MARKET_BID_ASK, PRICE FROM tt
)
PIVOT
(
SUM(PRICE)
FOR MARKET_BID_ASK IN
(
SELECT DISTINCT CONCAT(CONCAT(MARKET,'_'),BID_ASK) AS MARKET_BID_ASK FROM tt
)
)
I think that such a solution could be practical, because one could still constrain the variables queried without having to list all the concatenated options using LIKE-conditions in the subquery. However, I get an "ORA-00936 - missing expression" error with this query, even though subqueries should be legal here according to the documentation that I found.
You can pivot on multiple columns by enclosing the columns, and the sets of values, in parentheses:
SELECT * FROM
(
SELECT * FROM tt
)
PIVOT
(
SUM(PRICE)
FOR (MARKET, BID_ASK)
IN (('US', 'Bid') us_bid, ('US', 'Ask') us_ask, ('EU', 'Bid') eu_bid, ('EU', 'Ask') eu_ask)
);
COMMODITY US_BID US_ASK EU_BID EU_ASK
---------- ---------- ---------- ---------- ----------
Gold 1 1.1 1.1 1.2
Oil 10 11 11 12
but the value pairs still have to be known when the query is parsed, and this doesn't scale well if you have a lot of combinations of values.
Your only alternative is dynamic SQL, as you suspected, unless you can get Excel to process the result of an XML pivot - which I don't think is possible. With dynamic SQL you could perhaps have a function that does the query and pivot and returns a ref cursor, if Excel finds that easier to handle than the pivot query.