In this lecture, we are going to explore Local Interpretable Model-Agnostic Explanations, also known as LIME. As we're going to see, LIME employs a local surrogate model. Surrogate models are usually simple enough to be inherently interpretable. Please note that LIME is different from the global surrogate method, which employs interpretable models trained on the input data and the predictions of a more complex model. Global surrogate methods provide limited explainability value since they offer little insight into how the original model works. We have already explored permutation feature importance, which is a global approach for providing explanations for any model that can be expressed over a set of tabular features, and we saw how to extend it to time series. This is a powerful method that permutes the values of each tabular feature to create a pseudorandom version of it, and in this way estimates the significance of that specific predictor across all samples. Global explanations cannot provide information about specific decisions. Therefore, the explanations may be an overly coarse simplification of the system that ignores local variations. Far more importantly, in certain applications, such as when a mortgage is issued or a decision is taken for a healthcare problem, local explanations are required to provide specific, personal feedback. LIME is a local explainability method that works via a surrogate model. In other words, it employs an interpretable model, such as a linear model or a decision tree, and provides local information about the solution space. Therefore, it explains the more complex black-box model via a set of local solutions of a much simpler approximation. To do this, it uses local perturbations of the sample data. In this slide, we see the intuition behind Local Interpretable Model-Agnostic Explanations. We see here a complex function, which is a black box; this complex function is unknown to LIME. The bright bold red cross is the instance being explained.
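As a reminder, the permutation feature importance procedure mentioned above can be sketched in a few lines. This is a minimal illustration assuming a scikit-learn style model; the synthetic dataset and variable names are chosen here purely for illustration.

```python
# Sketch of permutation feature importance: shuffle one feature at a time
# and measure how much the model's score drops.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

# Illustrative synthetic data: 5 tabular features, 2 of them informative.
X, y = make_classification(n_samples=500, n_features=5,
                           n_informative=2, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)
baseline = accuracy_score(y, model.predict(X))

rng = np.random.default_rng(0)
importances = []
for j in range(X.shape[1]):
    Xp = X.copy()
    Xp[:, j] = rng.permutation(Xp[:, j])  # break the link between feature j and y
    importances.append(baseline - accuracy_score(y, model.predict(Xp)))

print(importances)  # larger drop = more important feature, globally
```

Note that this yields one score per feature across all samples, which is exactly why it cannot explain an individual decision.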
LIME samples instances, gets predictions using f, and weights them by their proximity to the instance being explained. Here, if we zoom locally into the area where we try to approximate the black-box function, we see that the dashed line is the explanation, which is locally faithful. But globally it cannot really describe the model. An interpretable model may not be able to approximate a black-box model globally; however, approximating the black-box model in the neighborhood of an individual instance is, as we see, feasible. Potentially interpretable models used as surrogates include linear models, decision trees, and rule lists. Here we see the LIME formulation. f is the model we would like to explain. The LIME explanation is obtained by minimizing a function of the instance x. L here is a measure of how closely the simple function g approximates the function f locally; L is therefore a fidelity measure. Omega measures the complexity of the surrogate model g. With these, we aim for an approximation whose fidelity is high locally, whereas its complexity, Omega, is low enough for it to be interpretable by humans. L is estimated by generating samples based on perturbations of the input sample around x, producing the corresponding predictions with the black-box model, and weighting them according to a proximity measure. We can see that the trade-off between L and Omega controls the quality and simplicity of the interpretation. LIME's focus is on explaining individual predictions, and in this way it allows more accurate explanations while retaining model flexibility. If we benchmark an LSTM, a CNN, a multilayer perceptron, or even a classical support vector machine or a random forest, the cost required to switch from one model to another and use LIME as the explanation method is very small. Here we see an example of how LIME is applied to two different models, a logistic regression model and an LSTM model. They both predict whether the sentence "this is not too bad" has a positive or negative meaning.
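The formulation described here reads as: the explanation for x is the interpretable model g minimizing L(f, g, pi_x) + Omega(g), where pi_x is the proximity kernel around x. A minimal sketch of this procedure for one tabular instance might look as follows; the Gaussian perturbation scale, the exponential kernel width, and the ridge surrogate are illustrative choices, not LIME's exact defaults.

```python
# Sketch of the LIME idea for one tabular instance (all names illustrative).
import numpy as np
from sklearn.linear_model import Ridge

def lime_explain(f, x, n_samples=1000, kernel_width=0.75, seed=0):
    """Approximate black-box f around x with a locally weighted linear model g."""
    rng = np.random.default_rng(seed)
    # 1. Perturb the instance around x.
    Z = x + rng.normal(scale=0.5, size=(n_samples, x.size))
    # 2. Query the black box for its predictions on the perturbations.
    y = f(Z)
    # 3. Weight each perturbation by its proximity to x (exponential kernel pi_x).
    d = np.linalg.norm(Z - x, axis=1)
    w = np.exp(-(d ** 2) / kernel_width ** 2)
    # 4. Fit a simple, interpretable surrogate g with those weights
    #    (the ridge penalty plays the role of keeping g simple).
    g = Ridge(alpha=1.0).fit(Z, y, sample_weight=w)
    return g.coef_  # local feature attributions

# Toy black box: LIME never looks inside this function.
f = lambda Z: np.sin(Z[:, 0]) + 0.1 * Z[:, 1]
coefs = lime_explain(f, np.array([0.0, 0.0]))
print(coefs)
```

Near x = (0, 0) the first feature has a much steeper local slope than the second, and the surrogate's coefficients reflect exactly that local behavior, not the global shape of f.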
As we see, the explanations are very intuitive; for both types of models, they do not require expert knowledge to understand. For the logistic regression, we see that the LIME explanation is faithful to the logistic model, which predicts a negative meaning. On the other hand, for the LSTM model, the explanation assigns positive weight to both "not" and "bad". We know that only the combination can have a positive meaning, and this is captured by the LSTM model even though we have not explicitly modeled the interaction between these two words. What makes LIME a useful tool is that it uses a representation that can be understood by non-experts, and it can compare models regardless of the actual features used by each model. As we saw, LIME explanations are locally faithful to the model's behavior. Also, LIME is a model-agnostic approach, and thus its application to a number of different models does not require drastic changes to the models or to the data.
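The text example can be imitated with a small sketch. Here the classifier is a toy stand-in for the LSTM, with an invented scoring rule chosen only to reproduce the "not" + "bad" interaction; everything else follows the LIME recipe for text: represent the sentence as binary keep/drop word indicators, perturb them, query the model, and fit a locally weighted linear surrogate whose coefficients are the per-word explanation.

```python
# Sketch of LIME for text: words are the interpretable binary features.
import numpy as np
from sklearn.linear_model import Ridge

sentence = "this is not too bad".split()

def toy_sentiment(word_lists):
    # Toy stand-in for the LSTM: "bad" alone is negative,
    # but "not" and "bad" together flip the sentiment to positive.
    scores = []
    for words in word_lists:
        score = -1.0 if "bad" in words else 0.0
        if "not" in words and "bad" in words:
            score = 1.0
        scores.append(score)
    return np.array(scores)

rng = np.random.default_rng(0)
masks = rng.integers(0, 2, size=(500, len(sentence)))  # 1 = keep the word
texts = [[w for w, keep in zip(sentence, m) if keep] for m in masks]
preds = toy_sentiment(texts)

# Proximity: perturbations that drop fewer words count more.
n_removed = len(sentence) - masks.sum(axis=1)
weights = np.exp(-n_removed.astype(float))

surrogate = Ridge(alpha=1.0).fit(masks, preds, sample_weight=weights)
for word, coef in zip(sentence, surrogate.coef_):
    print(f"{word}: {coef:+.2f}")
```

As in the lecture example, both "not" and "bad" receive positive weight in the local explanation, because near the full sentence removing either word lowers the predicted sentiment, even though neither word is positive on its own.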