In the J48 decision tree example, what is the difference between a pruned and an unpruned tree?
Unpruned trees are larger. The tree is first built according to the implemented algorithm; if pruning is enabled, an additional step then checks which nodes or branches can be removed without hurting performance too much.
The idea behind pruning is that, apart from making the tree easier to understand, it reduces the risk of overfitting to the training data. An overfitted tree classifies the training data (almost) perfectly but does poorly on anything else because, instead of learning the underlying concept, it has learned properties specific to the training data.
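To make this concrete, here is a minimal Python sketch of a pruning pass on a tiny hand-built tree. It is not J48's code: it uses reduced-error pruning against a held-out set because that is the simplest variant to show, whereas J48's default pruning relies on a confidence-based error estimate. The names `Leaf`, `Node`, `prune`, and the toy data are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    label: str


@dataclass
class Node:
    feature: int      # index of the feature tested at this node
    threshold: float  # go left if x[feature] <= threshold, else right
    left: "Tree"
    right: "Tree"
    majority: str     # majority training class among examples reaching this node


Tree = Union[Leaf, Node]


def predict(tree, x):
    while isinstance(tree, Node):
        tree = tree.left if x[tree.feature] <= tree.threshold else tree.right
    return tree.label


def size(tree):
    """Number of nodes, counting leaves."""
    if isinstance(tree, Leaf):
        return 1
    return 1 + size(tree.left) + size(tree.right)


def errors(tree, data):
    return sum(predict(tree, x) != y for x, y in data)


def prune(tree, data):
    """Reduced-error pruning (an assumption, not J48's default method).

    Works bottom-up: a subtree is replaced by a leaf predicting its majority
    class whenever the leaf makes no more errors on the held-out examples
    that reach it.
    """
    if isinstance(tree, Leaf):
        return tree
    left_data = [(x, y) for x, y in data if x[tree.feature] <= tree.threshold]
    right_data = [(x, y) for x, y in data if x[tree.feature] > tree.threshold]
    tree = Node(tree.feature, tree.threshold,
                prune(tree.left, left_data),
                prune(tree.right, right_data),
                tree.majority)
    leaf = Leaf(tree.majority)
    return leaf if errors(leaf, data) <= errors(tree, data) else tree


# Toy concept: class "a" if x <= 5, else "b". The training point 7.0 is
# mislabeled noise, and the unpruned tree has grown extra splits to fit it.
train = [([2.0], "a"), ([4.0], "a"), ([6.0], "b"),
         ([7.0], "a"), ([8.0], "b"), ([9.0], "b")]
held_out = [([1.0], "a"), ([3.0], "a"), ([6.2], "b"),
            ([7.2], "b"), ([8.5], "b")]

unpruned = Node(0, 5.0,
                Leaf("a"),
                Node(0, 6.5,
                     Leaf("b"),
                     Node(0, 7.5, Leaf("a"), Leaf("b"), majority="b"),
                     majority="b"),
                majority="a")

pruned = prune(unpruned, held_out)

print("size:", size(unpruned), "->", size(pruned))
print("training errors:", errors(unpruned, train), "->", errors(pruned, train))
print("held-out errors:", errors(unpruned, held_out), "->", errors(pruned, held_out))
```

In this toy setup the unpruned tree has 7 nodes and fits the training data without error, but it misclassifies one held-out example. The pruned tree has 3 nodes, gets the noisy training point wrong, and classifies all held-out examples correctly. That is the trade-off described above, on a very small scale.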