ML — Data Standardization vs Normalization
Sep 2, 2022
Both Normalization and Standardization are preprocessing steps we take for two reasons:
1. To reduce the magnitude of the values. As we process data, many transformations are applied, and a series of multiplications and other operations can produce really large numbers.
If you do not rescale the values, computations may overflow or lose precision, and iterative training may take much longer to converge.
2. Some algorithms are really sensitive to unscaled data (for example, distance-based and gradient-based methods), so please make sure to scale the features before using them.
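To make the second point concrete, here is a minimal sketch with NumPy of how an unscaled feature can dominate a Euclidean distance. The features (age in years, income in dollars), their values, and the helper `euclidean` are made up for illustration.

```python
import numpy as np

# Hypothetical data: each row is (age in years, income in dollars).
X = np.array([
    [25.0, 50_000.0],
    [60.0, 51_000.0],
    [26.0, 58_000.0],
    [40.0, 90_000.0],
])

def euclidean(a, b):
    """Plain Euclidean distance between two feature vectors."""
    return float(np.sqrt(np.sum((a - b) ** 2)))

# On raw data the income column dominates the distance.
raw_d01 = euclidean(X[0], X[1])  # ages differ by 35 years, incomes by 1,000
raw_d02 = euclidean(X[0], X[2])  # ages differ by 1 year, incomes by 8,000

# Min-max scale each column to [0, 1] so both features contribute comparably.
X_scaled = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
scaled_d01 = euclidean(X_scaled[0], X_scaled[1])
scaled_d02 = euclidean(X_scaled[0], X_scaled[2])

print(raw_d01 < raw_d02)        # on raw data, row 1 looks closer to row 0
print(scaled_d01 > scaled_d02)  # after scaling, row 2 is the closer one
```

On the raw values, the 35-year age gap barely registers next to the income differences. After scaling, both features carry comparable weight, and the nearest neighbour of the first row changes.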
Should I choose Normalization or Standardization?
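Normalization (min-max scaling) rescales each feature to a fixed range, usually [0, 1]. It is useful when there are no outliers, because a single extreme value squeezes all the other values into a narrow part of that range.
Standardization rescales each feature to have a mean of 0 and a standard deviation of 1. It should be used in cases where the data follows a Gaussian (Normal) distribution.
Standardization is less affected by outliers because there is no predefined range for the transformed features, although outliers still shift the mean and the standard deviation.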
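The two transformations can be sketched in a few lines of NumPy. This is an illustration rather than a reference implementation: the function names `min_max_normalize` and `standardize` and the sample values are assumptions, and the standard deviation used is the population one (NumPy's default).

```python
import numpy as np

def min_max_normalize(x):
    """Rescale values to the range [0, 1]: (x - min) / (max - min)."""
    return (x - x.min()) / (x.max() - x.min())

def standardize(x):
    """Rescale values to mean 0 and standard deviation 1: (x - mean) / std."""
    return (x - x.mean()) / x.std()

# Hypothetical feature with one outlier (100.0).
x = np.array([1.0, 2.0, 3.0, 4.0, 100.0])

x_norm = min_max_normalize(x)
x_std = standardize(x)

print(x_norm)  # the four regular values are squeezed close to 0
print(x_std)   # no fixed range; mean is 0 and standard deviation is 1
```

In practice, scikit-learn's `MinMaxScaler` and `StandardScaler` apply the same formulas per column and remember the fitted statistics, so the same scaling can be reused on test data.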
Hands-on
For the hands-on part, please check the following notebook, available on GitHub: