Dataset Description
Here I have used a house pricing dataset. It contains information about the kitchen, bedrooms, house style, area, street, price, and so on.
Data Encoding
Data encoding is the transformation of categorical variables into binary or numerical counterparts, assigning a unique value to each category of a categorical attribute. There are two types of data encoding: (1) label encoding and (2) one-hot encoding.
(1) Label encoding
If the dataset has more than one category, we can use a label encoder to convert those categories into numerical features.
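To make the idea concrete, here is a minimal sketch of label encoding in plain Python. The function name `label_encode` and the sample house styles are hypothetical, chosen only for illustration. Numbering the categories in sorted order is an assumption of this sketch, and it matches how scikit-learn's `LabelEncoder` orders its classes.

```python
# Hypothetical sample values; the real dataset's categories may differ.
house_styles = ["1Story", "2Story", "1Story", "SLvl", "2Story"]


def label_encode(values):
    """Map each distinct category to an integer, numbered in sorted order."""
    classes = sorted(set(values))
    mapping = {category: index for index, category in enumerate(classes)}
    encoded = [mapping[value] for value in values]
    return encoded, mapping


encoded, mapping = label_encode(house_styles)
print(mapping)  # {'1Story': 0, '2Story': 1, 'SLvl': 2}
print(encoded)  # [0, 1, 0, 2, 1]
```

Note that the integers imply an order (0 < 1 < 2) that the categories may not actually have.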
(2) One-hot encoding
A one-hot encoder does the same job in a different way. A label encoder assigns a particular number to each category, whereas a one-hot encoder assigns a whole new column to each category, holding 1 where the row belongs to that category and 0 otherwise.
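The difference from label encoding can be sketched in plain Python as follows. The function name `one_hot_encode` and the sample street values are hypothetical. The sketch creates one 0/1 column per category, in sorted order.

```python
# Hypothetical sample values; the real dataset's categories may differ.
streets = ["Pave", "Grvl", "Pave"]


def one_hot_encode(values):
    """Return one 0/1 column per distinct category, plus the column names."""
    columns = sorted(set(values))
    rows = [[1 if value == column else 0 for column in columns] for value in values]
    return rows, columns


rows, columns = one_hot_encode(streets)
print(columns)  # ['Grvl', 'Pave']
for row in rows:
    print(row)
```

Each row contains exactly one 1, so no artificial order is introduced between the categories.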
Normalization
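Assuming normalization here means min-max rescaling of a numeric column to the range [0, 1], a minimal sketch could look like this. The function name `min_max_normalize` and the sample prices are hypothetical.

```python
def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # A constant column has no spread; map everything to 0.0.
        return [0.0 for _ in values]
    return [(value - lowest) / (highest - lowest) for value in values]


# Hypothetical sample prices.
prices = [100000, 150000, 200000]
normalized = min_max_normalize(prices)
print(normalized)  # [0.0, 0.5, 1.0]
```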
Standardization
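Assuming standardization here means rescaling a numeric column to zero mean and unit variance (the z-score), a minimal sketch could look like this. The function name `standardize` and the sample values are hypothetical. The sketch uses the population standard deviation, which is an assumption of this example.

```python
from statistics import mean, pstdev


def standardize(values):
    """Subtract the mean and divide by the population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column has no spread; map everything to 0.0.
        return [0.0 for _ in values]
    return [(value - mu) / sigma for value in values]


# Hypothetical sample values with mean 5 and population standard deviation 2.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
standardized = standardize(sample)
print(standardized)
```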
Imputing Missing Values
Missing data are values that are not recorded in a dataset. They can be a single value missing from a single cell or an entire missing observation (row). Missing data can occur both in a continuous variable (e.g. height of students) and in a categorical variable (e.g. gender of a population).
Simple Imputer
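A simple imputer replaces each missing value with a single statistic computed from the observed values of the same column. The sketch below shows mean imputation in plain Python. The function name `impute_mean`, the use of `None` to mark a missing value, and the sample areas are assumptions of this example. scikit-learn's `SimpleImputer` offers the same idea through its `strategy` parameter.

```python
from statistics import mean


def impute_mean(column):
    """Replace missing entries (None) with the mean of the observed entries."""
    observed = [value for value in column if value is not None]
    fill_value = mean(observed)
    return [fill_value if value is None else value for value in column]


# Hypothetical sample areas with one missing cell.
areas = [1200.0, None, 1500.0, 900.0]
imputed = impute_mean(areas)
print(imputed)  # [1200.0, 1200.0, 1500.0, 900.0]
```

For a categorical column the mean is not defined, so the most frequent value is a common choice instead.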
Discretization
Quantile Discretization Transform
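Quantile discretization splits a numeric column into bins that each hold roughly the same number of observations. The sketch below is a rank-based simplification in plain Python. The function name `quantile_bins` and the sample values are hypothetical, and unlike edge-based implementations it may place tied values in different bins.

```python
def quantile_bins(values, n_bins):
    """Assign each value a bin index so bins hold roughly equal counts."""
    # Indices of the values, ordered from smallest to largest value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, index in enumerate(order):
        bins[index] = rank * n_bins // len(values)
    return bins


# Hypothetical sample values with two large outliers.
values = [1, 2, 3, 4, 100, 200]
binned = quantile_bins(values, n_bins=3)
print(binned)  # [0, 0, 1, 1, 2, 2]
```

Because bins are based on rank rather than distance, the outliers do not stretch the bin boundaries.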
Uniform Discretization Transform
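Uniform discretization splits the range of a numeric column into bins of equal width. A minimal sketch in plain Python follows. The function name `uniform_bins` and the sample values are hypothetical.

```python
def uniform_bins(values, n_bins):
    """Assign each value to one of n_bins equal-width bins over [min, max]."""
    lowest, highest = min(values), max(values)
    width = (highest - lowest) / n_bins
    if width == 0:
        # A constant column falls entirely into the first bin.
        return [0] * len(values)
    # The maximum value would land in bin n_bins, so clamp it into the last bin.
    return [min(int((value - lowest) / width), n_bins - 1) for value in values]


# Hypothetical sample values.
values = [0, 2, 5, 8, 10]
binned = uniform_bins(values, n_bins=2)
print(binned)  # [0, 0, 1, 1, 1]
```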
KMeans Discretization Transform
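K-means discretization groups the values of a numeric column into clusters and uses the cluster index as the bin. The sketch below runs one-dimensional k-means (Lloyd's algorithm) in plain Python. The function name `kmeans_bins`, the starting centers (midpoints of equal-width bins), and the sample values are assumptions of this example.

```python
def kmeans_bins(values, n_bins, n_iter=100):
    """Cluster 1-D values with k-means; return bin labels and cluster centers."""
    lowest, highest = min(values), max(values)
    width = (highest - lowest) / n_bins
    # Assumption: start from the midpoints of equal-width bins.
    centers = [lowest + width * (k + 0.5) for k in range(n_bins)]
    labels = [0] * len(values)
    for _ in range(n_iter):
        # Assignment step: each value goes to its nearest center.
        labels = [
            min(range(n_bins), key=lambda k: abs(value - centers[k]))
            for value in values
        ]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for k in range(n_bins):
            members = [v for v, label in zip(values, labels) if label == k]
            new_centers.append(sum(members) / len(members) if members else centers[k])
        if new_centers == centers:
            break  # Converged: centers no longer move.
        centers = new_centers
    return labels, centers


# Hypothetical sample values forming three visible groups.
values = [1, 2, 3, 20, 21, 22, 50, 52]
labels, centers = kmeans_bins(values, n_bins=3)
print(labels)   # [0, 0, 0, 1, 1, 1, 2, 2]
print(centers)  # [2.0, 21.0, 51.0]
```

Unlike the uniform and quantile approaches, the bin boundaries here follow the natural groupings in the data.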