What is the difference between Principal Component Analysis (PCA) and Feature Selection in Machine Learning? Is PCA a means of feature selection?
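PCA is not feature selection in the strict sense, because it does not pick a subset of the original features. Instead it finds new axes (principal components), each a linear combination of the original features, ordered by how much of the variance in the data set they explain. Keeping only the first few components reduces the dimensionality of a large data set, which makes it more practical to apply machine learning where the original data are inherently high dimensional (e.g. image recognition). Feature selection, by contrast, keeps some of the original features unchanged and discards the rest.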
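To make the distinction concrete, here is a minimal sketch using scikit-learn. The choice of the iris data set, of two output dimensions, and of `SelectKBest` with `f_classif` as the selection method are all assumptions made for illustration; many other selection methods exist.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, f_classif

# Small example data set: 150 samples, 4 original features (an assumption
# for illustration only).
X, y = load_iris(return_X_y=True)

# PCA: builds 2 NEW features, each a weighted mix of all 4 originals.
# It never looks at the labels y.
pca = PCA(n_components=2).fit(X)
X_pca = pca.transform(X)
print("PCA weights per original feature:")
print(pca.components_)
print("Explained variance ratio:", pca.explained_variance_ratio_)

# Feature selection: keeps 2 of the ORIGINAL columns, untouched.
# This particular selector scores each feature against the labels y.
selector = SelectKBest(score_func=f_classif, k=2).fit(X, y)
mask = selector.get_support()
X_sel = selector.transform(X)
print("Original features kept:", np.flatnonzero(mask))
```

The columns of `X_sel` are identical to columns of `X`, so they keep their original meaning. The columns of `X_pca` are mixtures, which is why PCA output is usually harder to interpret.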
PCA has limitations, though. It only captures linear relationships between features, and it is often unclear before you start whether the relationships in your data are linear. Because it also discards directions that contribute little to the variance in the data, it can sometimes remove a small but significant differentiator that would have improved the performance of a machine learning model.
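The last point can be illustrated with synthetic data. In this sketch (all numbers are made up for the demonstration), one feature has a large variance but is unrelated to the class, and the other has a small variance but separates the classes almost perfectly. PCA is done by hand with NumPy's SVD, and `threshold_accuracy` is a hypothetical helper standing in for a real classifier.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
y = rng.integers(0, 2, size=n)

# x1: high variance, carries no information about the class.
x1 = rng.normal(0.0, 10.0, size=n)
# x2: low variance, but its sign tracks the class (the "small differentiator").
x2 = (2 * y - 1) * 0.5 + rng.normal(0.0, 0.1, size=n)
X = np.column_stack([x1, x2])

# PCA via SVD of the centered data; rows of Vt are the principal directions.
Xc = X - X.mean(axis=0)
_, s, Vt = np.linalg.svd(Xc, full_matrices=False)
explained = s**2 / np.sum(s**2)
pc1 = Xc @ Vt[0]  # data projected onto the first principal component

def threshold_accuracy(feature, labels):
    """Accuracy of splitting one feature at its mean (sign-agnostic)."""
    pred = (feature > feature.mean()).astype(int)
    acc = np.mean(pred == labels)
    return max(acc, 1.0 - acc)

acc_pc1 = threshold_accuracy(pc1, y)
acc_x2 = threshold_accuracy(x2, y)
print("Variance explained by PC1:", explained[0])
print("Accuracy using PC1 only:  ", acc_pc1)
print("Accuracy using x2 only:   ", acc_x2)
```

Here the first component explains nearly all of the variance yet is close to useless for predicting the class, while the feature it effectively drops is the one that matters. Standardizing the features before PCA would change the picture in this particular toy case, but the underlying issue remains: PCA ranks directions by variance, not by relevance to the target.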