[MUSIC] Hello everyone, welcome to the Materials Data Science and Informatics class. I'm Surya Kalidindi and I am the instructor for this class. Today's lesson is titled Structure-Property Linkages Using a Data Science Approach: Application, Part 1. The learning outcome for this lesson is to understand how to apply the data science approach we have been learning in this class, particularly through a case study. This particular case study involves an inclusion/steel composite system.

Okay, let's start with a basic introduction to the problem we're going to deal with. In this problem we are going to look at non-metallic inclusions in steel. This will be treated as a composite system where the matrix is essentially the steel and the inclusions are the second phase. These inclusions are typically impurities in the steel, usually carbides, nitrides, and so on. Some of these inclusions are unavoidable, so steel manufacturers have to deal with having some inclusions in their steels. The main issue with inclusions is that they have a significant effect on the macro-scale properties of the finished product. In particular, notice that when you have hard inclusions, you may form pores or other defects around them. And when the inclusions become soft at high temperature, the inclusion shape can become quite flat or pancaked, and the sharp edges at the corners can cause stress concentrations. So the size, shape, and properties of the inclusions have a strong effect on the performance of the steel that is produced.

From a data science point of view, what we are asking in this case study is: can we understand, establish, and quantify the linkages between the material structure, which in this case means the shape, size, and spatial distribution of the inclusions in the steel, and the effective properties of interest? These properties could be effective yield strengths, hardening rates, ductility, and so on.

Now, let's remind ourselves of the main steps involved in the data science approach for this homogenization problem. We went through these steps in previous lessons, so here I am only summarizing them as a reminder. Step 1 involves the generation of the datasets, and in this case it has two sub-tasks. The first sub-task, Step 1a, is to generate synthetic microstructures. Of course, if you have experimental microstructures you can use them, but in this case we are going to make the microstructures up on the computer. Step 1b is that, once we generate the microstructures, we evaluate a mechanical property of interest for each microstructure using a numerical model; in this case, an Abaqus model. After we generate the datasets, Step 2 is the reduced-order quantification of the microstructures. Again, in previous lessons we went into substantial detail about what this involves. Broadly speaking, it has two sub-steps: the first is to compute n-point correlations, in our case 2-point correlations, and the second is a Principal Component Analysis, so that we end up with a reduced-order quantification of the microstructure. The last step, Step 3, is to take the properties we generated in the dataset and the representation of the microstructure in terms of the principal component scores, and connect the two using regression models. Once we build these models, we want to validate them, and for that we will use leave-one-out cross-validation. Again, all of these individual steps were discussed in previous lessons; in this lesson we are going to focus on applying these concepts to a practical problem.
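To make Step 3 a little more concrete, here is a minimal sketch of how one might connect principal component scores to an effective property with a polynomial regression and evaluate it with leave-one-out cross-validation. This is not the exact code used in the case study; the array names (pc_scores, yield_strength), the stand-in data, and the choice of scikit-learn are assumptions made purely for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import LeaveOneOut, cross_val_predict

# Assumed (hypothetical) inputs:
#   pc_scores      : (n_samples, 3) array of the first three PC scores per microstructure
#   yield_strength : (n_samples,)   array of the effective property from the FE simulations
rng = np.random.default_rng(0)
pc_scores = rng.normal(size=(900, 3))                                         # stand-in for real PC scores
yield_strength = 200 + 50 * pc_scores[:, 0] + rng.normal(scale=5, size=900)   # stand-in property values

# Second-order polynomial regression linking PC scores to the effective property
model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())

# Leave-one-out cross-validation: each microstructure is predicted by a model
# trained on the remaining 899 microstructures
predictions = cross_val_predict(model, pc_scores, yield_strength, cv=LeaveOneOut())
errors = predictions - yield_strength
print("LOOCV mean absolute error:", np.abs(errors).mean())
```

The same pattern would be repeated for each property of interest, with the polynomial degree chosen by comparing the cross-validation errors.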
So let's start. We begin with the generation of microstructures. In this case we are generating a set of microstructures with different types, shapes, and distributions of inclusions in a steel matrix. We start with a library of possible inclusion shapes of interest; here is a small library of five inclusion shapes, and you could add more if you wanted. We also decided that the volume fractions of interest are between 0 and 20%. It might sound like 20% is a little high, but when you actually look at inclusions in steel, the local volume fractions where the inclusions cluster can be quite high, so this range was selected to cover the entire range of potential interest.

Once we select the inclusion shapes and decide on the volume fractions, we generate an ensemble of microstructures. In this case we decided to generate an ensemble of 900 microstructures, broadly distributed across four classes: Randomly Scattered, Vertical Bands, Horizontal Bands, and Clustered. Each class has a certain number of microstructures; for example, the Randomly Scattered class has 300 microstructures, and what you are seeing here is one of those 300. Each microstructure is generated by randomly selecting inclusions from the library, selecting a target volume fraction, and adding inclusions until that volume fraction is reached. The placement strategy differs between the classes. Depending on the strategy, we get a randomly scattered microstructure, or what we call vertical bands, where the inclusions are still selected randomly but are placed in vertical bands. Or we might place them in horizontal bands, or group them into clusters; within each cluster, the inclusions are again placed randomly. This is how we decided to generate the microstructures. There are a total of 900 microstructures, and that completes Step 1a.
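As a rough illustration of Step 1a, here is a minimal sketch of how one could build a two-phase microstructure by repeatedly dropping inclusion shapes from a small library into a matrix until a target volume fraction is reached. The grid size, the rectangular stand-in shapes, and the purely random placement are assumptions for illustration only; the actual case study used its own shape library and the four placement strategies described above.

```python
import numpy as np

def random_inclusion_microstructure(grid=(101, 101), target_vf=0.15, seed=0):
    """Place randomly chosen inclusion shapes until the target volume fraction is met.

    Returns an array with 0 = steel matrix, 1 = inclusion. This mimics only the
    'randomly scattered' class; banded and clustered placements would constrain
    where the shapes are allowed to land.
    """
    rng = np.random.default_rng(seed)
    m = np.zeros(grid, dtype=np.uint8)

    # A tiny, hypothetical shape library: small squares and rectangles
    shape_library = [np.ones((3, 3)), np.ones((2, 5)), np.ones((5, 2)),
                     np.ones((4, 4)), np.ones((1, 7))]

    while m.mean() < target_vf:
        shape = shape_library[rng.integers(len(shape_library))]
        h, w = shape.shape
        # Random upper-left corner such that the shape fits inside the grid
        i = rng.integers(grid[0] - h + 1)
        j = rng.integers(grid[1] - w + 1)
        m[i:i + h, j:j + w] = np.maximum(m[i:i + h, j:j + w], shape.astype(np.uint8))

    return m

micro = random_inclusion_microstructure(target_vf=0.15)
print("Inclusion volume fraction:", micro.mean())
```

The banded and clustered classes would follow the same loop, only restricting the admissible placement locations.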
After we generate the 900 microstructures, we do a finite element analysis for each one. In this case we decided to use finite element analysis for expediency; if you have experimental datasets, you could easily replace the digital datasets generated for this case study with real experimental datasets. When we do a finite element simulation, we discretize the volume into a grid, or mesh, and we apply periodic boundary conditions, in this case plane-strain compression with periodic boundary conditions. For these boundary conditions we evaluate the stress and strain fields, and again, we do this for every microstructure. Once the simulation is done, we extract certain properties of interest from it. The properties of interest for this study were selected as follows.

The first property of interest is the effective yield strength; this is the overall yield strength of the entire composite. The second property of interest is the effective strain hardening exponent. Again, "effective" here means it is a property of the entire composite, not of the individual phases. The third property of interest is the localization propensity, which asks: what is the likelihood of forming shear bands, or localized bands, in this composite? Notice that because the inclusion phase is different from the matrix, you will have some defects. In the top picture here, the precipitates or inclusions are hard, and therefore you see some debonding in the simulation. In the bottom picture, the inclusions are soft, so they change shape and flow with the material, but because they are soft they also cause localization. The localization propensity is simply defined as the area fraction of the matrix elements experiencing an equivalent strain greater than a prescribed, user-selected cut-off. These are the properties that were selected to be of interest in this case study. So that completes Step 1.

Now we are ready for Step 2. For each microstructure (the one shown here is just an example), we generate the 2-point statistics. Again, the mathematical theory behind the 2-point statistics was covered before; here is a reminder of the equations we used to compute them, and in fact we used the DFT methods discussed in previous lessons. The 2-point statistics for each microstructure look like this, and we do this for all 900 microstructures in this case study, so we have 900 sets of 2-point statistics. That is Step 2a. Once we have this ensemble of 900 sets of 2-point statistics, we perform a principal component analysis. The first three principal component scores are shown here, color coded so that you can see the different classes, and you can easily see that the microstructures automatically segregate into the different classes. Again, this is unsupervised: PCA itself does not know that there are four classes, but in the result of the PCA you can see the four classes separately. What you are seeing here are the average values of the principal components for each class, and looking at these, you can see that the classes are indeed separated. So now we have a low-dimensional representation of the microstructure in terms of these three principal component scores. The clustering can also be seen by comparing the points in the principal component space to the actual microstructures. Here is an example of a microstructure from the random placement class, here is an example from the vertical bands class, and here is an example from the clustered class. One can see that the PCA automatically separates these microstructures into their respective classes.
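Finally, here is a minimal sketch of Step 2: computing the 2-point autocorrelation of the inclusion phase with FFTs (assuming periodic microstructures, consistent with the DFT approach mentioned above) and feeding the stacked statistics into a principal component analysis. The function and variable names are my own, the input microstructures shown here are random stand-in data, and only the autocorrelation of the inclusion phase is computed, whereas a full analysis may use additional correlations.

```python
import numpy as np
from sklearn.decomposition import PCA

def two_point_autocorrelation(microstructure):
    """Periodic 2-point autocorrelation of the inclusion indicator (1 = inclusion)."""
    m = microstructure.astype(float)
    F = np.fft.fftn(m)
    # Autocorrelation via the convolution theorem, normalized by the number of voxels
    corr = np.fft.ifftn(F * np.conj(F)).real / m.size
    return np.fft.fftshift(corr)   # center the zero-vector term for convenience

# Assumed input: a list of binary microstructure arrays from Step 1a
# (random stand-in data here; the case study would use its 900 microstructures)
microstructures = [np.random.default_rng(k).integers(0, 2, size=(101, 101))
                   for k in range(10)]

stats = np.stack([two_point_autocorrelation(m).ravel() for m in microstructures])

# Principal component analysis on the ensemble of 2-point statistics
pca = PCA(n_components=3)
pc_scores = pca.fit_transform(stats)
print("PC scores shape:", pc_scores.shape)   # (n_microstructures, 3)
```

The resulting pc_scores array is the low-dimensional representation that would then be passed to the regression and cross-validation step sketched earlier. [MUSIC]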