Here is an example of the data for which I am trying to find all the orders with same quantities, ignoring the OrderID column
Product Location Customer OrderID Quantity
Eggs Chicago XYZ 2011 10
Eggs Chicago XYZ 2012 10
Eggs Chicago XYZ 2013 15
So, I used DENSE_RANK function in the SQL
Select Product,Location,Customer,OrderID,Quantity,
Ranking = DENSE_RANK() OVER (PARTITION BY Product,Location,Customer,Quantity
ORDER BY OrderID ASC)
FROM MyTable
to get the data below
Product Location Customer OrderID Quantity Ranking
Eggs Chicago XYZ 2011 10 1
Eggs Chicago XYZ 2012 10 2
Eggs Chicago XYZ 2013 15 1
So, based on the ranking I was able to filter out the records that have the same quantity across different orderIDs and treat them as one. So far everything is good and I am happy. But, one of another crazy requirement is this form of aggregation should be done only for the first change in quantity. For example, if the above data happens to be like one below
Product Location Customer OrderID Quantity
Eggs Chicago XYZ 2011 10
Eggs Chicago XYZ 2012 10
Eggs Chicago XYZ 2013 15
Eggs Chicago XYZ 2014 15
Eggs Chicago XYZ 2015 15
The same SQL would produce result
Product Location Customer OrderID Quantity Ranking
Eggs Chicago XYZ 2011 10 1
Eggs Chicago XYZ 2012 10 2
Eggs Chicago XYZ 2013 15 1
Eggs Chicago XYZ 2013 15 2
Eggs Chicago XYZ 2013 15 3
But, I would need the result to be
Product Location Customer OrderID Quantity Ranking
Eggs Chicago XYZ 2011 10 1
Eggs Chicago XYZ 2012 10 2
Eggs Chicago XYZ 2013 15 1
Eggs Chicago XYZ 2013 15 1
Eggs Chicago XYZ 2013 15 1
Please, note the ranking remains 1 for all the records after the first change in quantity.
Is it possible to tweak my SQL to get the above behavior?
Thanks for any suggestions.
If I understand you correctly, you want to use DENSE_RANK()
to eliminate duplicate rows in your data.
It seems you’ve already solved your problem. If you want to eliminate the duplicates, use the same SQL code you have above and delete any rows with Ranking > 1
. This will leave you with one copy of each row with the same unique key (e.g. Product
, Location
, Customer
, OrderID
).