Now that I have introduced the various steps in feature engineering, that is, feature selection, creation, and transformation, I will dive into each one of these steps in this video. The very first step in feature engineering is feature selection. Here, you identify the appropriate data attributes, or features, to include in your training dataset, and you filter out any redundant and unusable features. You perform feature selection with the goal of reducing the dimensionality of the feature set, so that the reduced feature set helps you train your models much more quickly. So, how do you select features to include in your training dataset? One of the techniques you can use is the feature importance score. If you have completed Course 1 of this specialization, this chart should look very familiar to you. This is the feature importance chart that was generated on the product review dataset using Amazon SageMaker Data Wrangler. This chart visually indicates how important, or relevant, each one of the features is to the final model, as indicated by its importance score. In this chart, towards the right side, you can see features like class name, division name, and department name that are really not contributing anything to your final model, as indicated by an importance score of 0. So, here's an opportunity for you to filter these irrelevant features out of the training dataset. Similarly, on the left side, you'll see features like recommended indicator, review text, and review title that are contributing in a major way towards your final model. These are the features that you want to include in your training dataset. Keep in mind that using the feature importance score is just one of the techniques you can use to select appropriate features for your training dataset.

The next step in feature engineering is feature creation. Here, you can combine existing features into new features, or you can infer new attributes from existing attributes. In the product review dataset, you infer a new feature called sentiment based on an existing feature, rating. That is an example of inferring new attributes. Another possible example of creating new features is combining the features review text and review title. The idea here is that by creating and using these new features, you help your machine learning model produce more accurate predictions.

The final step involved in feature engineering is feature transformation. This could involve filling in missing feature values using techniques like imputation, scaling numerical feature values using techniques like standardization and normalization, and, finally, converting non-numerical features into numerical values so that the algorithm can make sense of them. Two non-numerical features from the product review dataset that you consider here are class name and review text. Class name is a categorical variable that can take values from a pre-determined, fixed set of values. This feature represents a product category and can have values like blouses, pants, dresses, and so on. This categorical feature can be converted into a numerical feature using techniques like one-hot encoding. How about the second one? The review text, as you can see here, is completely free-flowing text, and it can take any value that is appropriate for this particular feature.
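To make the feature selection step concrete, here is a minimal sketch in pandas. The file name and the column names (class_name, division_name, department_name) are hypothetical placeholders for the product review dataset, not the exact names used in the course materials:

```python
import pandas as pd

# Load the product review data (hypothetical file name).
df = pd.read_csv("womens_clothing_reviews.csv")

# Features with an importance score of 0 contribute nothing to the model,
# so filter them out to reduce the dimensionality of the feature set.
low_importance_features = ["class_name", "division_name", "department_name"]
df = df.drop(columns=low_importance_features)
```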
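The feature creation step can be sketched the same way. The rating-to-sentiment mapping below (1-2 negative, 3 neutral, 4-5 positive) and the column names are illustrative assumptions, not necessarily the exact rule used on the product review dataset:

```python
# Infer a new 'sentiment' feature from the existing 'rating' feature.
def rating_to_sentiment(rating):
    if rating >= 4:
        return 1   # positive
    if rating == 3:
        return 0   # neutral
    return -1      # negative

df["sentiment"] = df["rating"].apply(rating_to_sentiment)

# Combine two existing features, review title and review text, into one new feature.
df["review_full_text"] = df["review_title"].fillna("") + " " + df["review_text"].fillna("")
```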
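Finally, the feature transformation step could look something like the sketch below, using scikit-learn for imputation and standardization and pandas for one-hot encoding. This sketch is independent of the selection sketch above, so it assumes the DataFrame still contains the categorical class_name column, plus a hypothetical numerical column called age:

```python
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler

# Fill in missing values of a numerical feature with the column mean (imputation).
imputer = SimpleImputer(strategy="mean")
df[["age"]] = imputer.fit_transform(df[["age"]])

# Scale the numerical feature to zero mean and unit variance (standardization).
scaler = StandardScaler()
df[["age"]] = scaler.fit_transform(df[["age"]])

# Convert the categorical class_name feature (blouses, pants, dresses, ...)
# into numerical columns using one-hot encoding.
df = pd.get_dummies(df, columns=["class_name"])
```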
Now, what kind of techniques can you use to convert this free-flowing text into numerical features, so that the machine learning algorithm that you choose can actually understand what this feature represents? This week, you will learn how to convert this free-flowing review text into BERT vectors, or BERT embeddings. Before I end this video, one note: as part of the feature engineering discussion, I mentioned a few terms like normalization, standardization, imputation, and one-hot encoding. I will not be reviewing these basic feature engineering techniques in this course, but if you're interested in reading up on them, or brushing up, please review the reading material provided for this week. For the rest of the week, I will be focused on converting the raw review text into BERT vectors, or BERT embeddings, that the algorithm can readily use.
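As a preview of where the week is headed, here is a minimal sketch of one common way to turn a review text into a BERT embedding, using the Hugging Face transformers library with the bert-base-uncased model. The specific tokenizer, model, and maximum sequence length used later in the course may differ:

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

review = "Absolutely wonderful - silky and comfortable."  # hypothetical review text

# Tokenize the free-flowing text into a fixed-length sequence of token IDs.
inputs = tokenizer(review, padding="max_length", truncation=True,
                   max_length=128, return_tensors="pt")

# Run the tokens through BERT without computing gradients.
with torch.no_grad():
    outputs = model(**inputs)

# outputs.last_hidden_state has shape (1, 128, 768): one 768-dimensional
# vector per token. The vector at position 0 (the [CLS] token) is often
# used as a fixed-length embedding of the whole review.
cls_embedding = outputs.last_hidden_state[:, 0, :]
```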