I am just trying to confirm my understanding of the difference between zero reducers and the identity reducer.
Your understanding is correct. I would define it as follows:
If you do not need the map results sorted, you set the number of reducers to 0, and the job is called map-only.
If you need the map results sorted but do not need any aggregation, you choose the identity reducer.
And to complete the picture, there is a third case: we do need aggregation, and in this case we need a real reducer.
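
To make the three cases concrete, here is a rough single-process sketch in Python. It is only a toy model, not Hadoop code: `run_job`, `word_mapper`, `identity_reducer` and `sum_reducer` are names I made up for illustration, and a real cluster sorts within each reducer's partition rather than over the whole output. In Hadoop itself the map-only case corresponds to calling `job.setNumReduceTasks(0)`.

```python
from itertools import groupby
from operator import itemgetter


def run_job(records, mapper, reducer=None):
    """Toy model of a MapReduce job (hypothetical helper, not a Hadoop API)."""
    mapped = [pair for record in records for pair in mapper(record)]
    if reducer is None:
        # Zero reducers: no shuffle/sort, map output is emitted as-is.
        return mapped
    # Shuffle/sort phase: order by key, then hand each key's values to the reducer.
    mapped.sort(key=itemgetter(0))
    output = []
    for key, group in groupby(mapped, key=itemgetter(0)):
        values = [value for _, value in group]
        output.extend(reducer(key, values))
    return output


def word_mapper(line):
    # Emit (word, 1) for every word in the line.
    for word in line.split():
        yield (word, 1)


def identity_reducer(key, values):
    # Pass every (key, value) pair through unchanged.
    for value in values:
        yield (key, value)


def sum_reducer(key, values):
    # Aggregate: one output pair per key.
    yield (key, sum(values))


lines = ["b a", "a c"]
map_only = run_job(lines, word_mapper)                       # case 1: unsorted
sorted_only = run_job(lines, word_mapper, identity_reducer)  # case 2: sorted, not aggregated
aggregated = run_job(lines, word_mapper, sum_reducer)        # case 3: sorted and aggregated

print(map_only)
print(sorted_only)
print(aggregated)
```

The only difference between case 1 and case 2 is whether the sort step runs at all; the identity reducer itself does nothing to the data.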