As we think about modeling a real problem with machine learning, we first need to think about what input signals we can use to train that model. In this next section, let's use a common example: in real estate, can you predict the price of a property? As you think about that problem, you must first choose your features, that is, the data you'll be basing your predictions on. Why not try to build a model that predicts the price of a house or an apartment? Your features could be the square footage and the category: is it a house or an apartment? The square footage is numeric, and numbers can be fed directly into a neural network for training; we'll come back later to how that's done. The type of the property, though, is not numeric. This piece of information may be represented in a database by a string value, like "house" or "apartment", and strings need to be transformed into numbers before being fed into a neural network.

Remember, a feature column describes how the model should use raw input data from your features dictionary. In other words, a feature column provides methods for the input data to be properly transformed before it is sent to the model for training. Again, the model just wants to work with numbers; that's the tensors part. Here's how you implement this in code: use the feature column API to define the features, first a numeric column for the square footage, then a categorical column for the property type, with two possible categories in this very simple model, house or apartment. You probably noticed that the categorical column is called categorical_column_with_vocabulary_list. Use this when your inputs are in a string or integer format and you have an in-memory vocabulary mapping each value to an integer ID. By default, out-of-vocabulary values are ignored.
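To make the vocabulary-list idea concrete, here is a plain-Python sketch of what that transformation does under the hood: each string in an in-memory vocabulary is mapped to its integer ID, and anything outside the vocabulary is ignored. The `VOCAB` list and the `-1` sentinel here are illustrative choices, not the library's actual internals.

```python
# Plain-Python sketch of a vocabulary-list lookup, mimicking the behavior
# described above: each string maps to its integer ID in the vocabulary,
# and out-of-vocabulary values are ignored (represented here as -1).

VOCAB = ["house", "apartment"]  # in-memory vocabulary for the property type

def vocab_lookup(value, vocab=VOCAB):
    """Return the integer ID of `value`, or -1 if it is out of vocabulary."""
    try:
        return vocab.index(value)
    except ValueError:
        return -1  # default behavior: out-of-vocabulary values are ignored

if __name__ == "__main__":
    print(vocab_lookup("house"))      # -> 0
    print(vocab_lookup("apartment"))  # -> 1
    print(vocab_lookup("boat"))       # -> -1 (not in the vocabulary)
```

The integer IDs produced this way are what later get turned into vectors the model can consume.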
As a quick side note, there are other variations of this: categorical_column_with_vocabulary_file, used when inputs are in a string or integer format and there's a vocabulary file that maps each value to an integer ID; categorical_column_with_identity, used when inputs are integers in the range zero to the number of buckets, and you want to use the input value itself as the categorical ID; and finally categorical_column_with_hash_bucket, used when features are sparse, in string or integer format, and you want to distribute your inputs into a finite number of buckets by hashing them.

In this example, after the raw input is modified by feature column transformations, you can then instantiate a linear regressor to train on these features. A regressor is a model that outputs a number; in our example, the predicted sale price of the property is that number. But why do you need feature columns in the context of model building? Do you remember how they get used? Let's break it down for this model type. A linear regressor is a model that works on a vector of data: it computes a weighted sum of all input data elements and can be trained to adjust the weights for your problem. Here, we're predicting the sale price. But how can you pack your data into the single input vector that the linear regressor expects? Well, the answer is: in various ways, depending on what data you're packing, and that's where the feature columns API really comes in handy. The API implements various standard ways of helping you pack that data into those vector elements. Let's look at a few. Values in a numeric column are just numbers; they can be copied as they are into a single element of the input vector. On the other hand, categorical columns need to be one-hot encoded: you have two categories, house or apartment, so house will be 1, 0 and apartment will be 0, 1. A third category would be 0, 0, 1, and so on.
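The packing just described can be sketched in a few lines of plain Python: the numeric feature is copied as-is into the input vector, and the categorical feature is one-hot encoded and concatenated after it. The two-category list and the function names here are hypothetical, for illustration only.

```python
# Plain-Python sketch of packing features into the single input vector
# a linear regressor expects: numeric values are copied as they are,
# categorical values are one-hot encoded.

CATEGORIES = ["house", "apartment"]  # hypothetical two-category vocabulary

def one_hot(value, categories=CATEGORIES):
    """Encode a category as a one-hot vector: house -> [1, 0], apartment -> [0, 1]."""
    return [1 if value == c else 0 for c in categories]

def pack_input_vector(sq_footage, prop_type):
    """Concatenate the raw numeric feature with the one-hot categorical one."""
    return [float(sq_footage)] + one_hot(prop_type)

if __name__ == "__main__":
    print(pack_input_vector(1620, "house"))     # -> [1620.0, 1, 0]
    print(pack_input_vector(890, "apartment"))  # -> [890.0, 0, 1]
```

A third category would simply make the one-hot part three elements long, exactly as described above.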
Now, the linear regressor knows how to take those features that you care about, pack them into an input vector, and apply whatever a linear regressor does. Besides the categorical ones that we've seen, there are many more feature column types to choose from: columns for continuous values that you want to bucketize, word embeddings, column crosses, and so on. The transformations they apply are clearly described in the TensorFlow documentation, so you always have an idea of what's going on, and we're going to take a look at quite a few of them here in code.

A bucketized column helps with discretizing continuous feature values. In this example, if we were to consider the latitude and longitude, which are highly granular, of the house or apartment that we're training or predicting on, we wouldn't want to feed in the raw latitude and longitude values. Instead, we would create buckets that group ranges of values for latitude and longitude. It's kind of like zooming out to look at just a zip code. If you're thinking this sounds a lot like building a vocabulary list for categorical columns, you're absolutely right. Categorical columns are represented in TensorFlow as sparse tensors, so categorical columns are an example of something that's sparse. TensorFlow can do math operations on sparse tensors without having to convert them into dense values first, and this saves memory and optimizes compute time. But as the number of categories of a feature grows large, it becomes infeasible to train a neural network using those one-hot encodings; imagine a vector of a million zeros with a single 1. Instead, we can use an embedding column. Embeddings overcome this limitation: rather than representing the data as a one-hot vector of many dimensions, an embedding column represents that data at a lower dimension, as a dense vector in which each cell can contain any number, not just a 0 or a 1. We'll get back to the real estate example shortly.
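Both of these transformations can also be sketched in plain Python. The latitude boundaries and the tiny embedding table below are made-up illustrative values, not anything learned or taken from a real model; in practice the embedding values are trained along with the rest of the network.

```python
# Plain-Python sketch of two transformations described above:
# bucketizing a continuous value, and an embedding-table lookup.
import bisect

# Hypothetical latitude boundaries: anything below 33.0 is bucket 0,
# 33.0 up to 34.0 is bucket 1, and so on.
LAT_BOUNDARIES = [33.0, 34.0, 35.0, 36.0]

def bucketize(value, boundaries):
    """Discretize a continuous value into a bucket ID, like a bucketized
    column does. This is the 'zooming out' step: nearby values share a bucket."""
    return bisect.bisect_right(boundaries, value)

# A tiny made-up embedding table: each category ID maps to a dense
# 3-dimensional vector whose cells can hold any number, not just 0 or 1.
EMBEDDINGS = {
    0: [0.12, -0.40, 0.88],  # e.g. "house"
    1: [-0.75, 0.31, 0.05],  # e.g. "apartment"
}

def embed(category_id, table=EMBEDDINGS):
    """Replace a huge one-hot vector with a small dense-vector lookup."""
    return table[category_id]

if __name__ == "__main__":
    print(bucketize(33.7, LAT_BOUNDARIES))  # -> 1 (between 33.0 and 34.0)
    print(embed(0))                         # a dense 3-element vector
```

Notice that the embedding lookup returns a fixed, small number of dimensions no matter how many categories exist, which is exactly why it scales where one-hot encoding does not.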
But first let's take a quick detour into the wild world of embeddings.