TFX encodes many of Google's best practices for machine learning. Let's explore some design patterns for incorporating DevOps best practices, such as continuous integration and continuous deployment, into your workflow, simplifying and speeding up your TFX pipeline development. From Course 1, recall the concepts of continuous integration, continuous deployment, and continuous training workflows for machine-learning pipeline development. To review: continuous integration is the development practice of working from a shared code repository for your pipeline code, with automated building, unit testing, and integration testing of every change. Continuous deployment is the process of automating the release of your tested TFX pipeline code to production. Continuous training is the process of monitoring, measuring, retraining, and serving the machine-learning models produced by your pipeline. TFX pipelines can be configured for continuous training with model evaluation and validation gates to ensure that only the best-performing models make it to production.

Through its AI pipeline solutions, Google Cloud provides a complete, enterprise-grade, easy-to-set-up, secure environment to build and deploy robust and reproducible TFX pipelines. This environment comes with monitoring, auditing, and version tracking built in. At the heart of this architecture is Cloud Build, a managed service that executes your container builds on Google Cloud infrastructure. Cloud Build can import TFX pipeline code from Cloud Source Repositories, GitHub, or GitLab, then execute a build to your specifications and produce artifacts such as containers or Python tar files. Cloud Build executes your build as a series of build steps defined in a YAML build configuration file, with each build step run in a Docker container. You can either use the supported build steps provided by Cloud Build or write your own.

The Cloud Build process, which performs the required CI/CD for your model system, can be executed either manually or through automated build triggers with Cloud Functions or GitHub Actions. Triggers execute your configured build steps whenever changes are pushed to the build's source code. You can set a build trigger to execute the build routine on any change to the source repository, or only when changes match certain criteria. In addition, you can have multiple build routines, Cloud Build configuration files, that are executed in response to different triggers. For example, you can set build routines to start when commits are merged into your TFX project's main branch from individual pipeline developers' branches, to promote collaboration.

Let's step through an example Cloud Build CI/CD workflow for a TFX pipeline. First, the TFX pipeline source code repository is copied to the Cloud Build runtime environment under the /workspace directory. Unit tests are then run; for TFX pipelines, these might include unit tests for each module, tests on data transformations in the Transform component, and integration tests between your components. This step might also include static code analysis of the pipeline code, for example with Pylint. If the tests pass, the container images for each custom user component are built, optionally tagged with the commit SHA. The container images are then uploaded to Container Registry, and the image URL in each component.yaml file is updated to reference the newly built and tagged container images.
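As a concrete illustration of the unit-testing step in this walkthrough, here is a minimal sketch of a pytest test that Cloud Build could run as a build step. The bucketing helper and its thresholds are hypothetical, standing in for your pipeline's real transform utilities.

```python
# Minimal sketch of a unit test that the Cloud Build test step could run
# with pytest. The helper function and its bucket boundaries are
# hypothetical; substitute your pipeline's real transform utilities.

def bucketize_age(age: int) -> str:
    """Map a raw age into the categorical bucket used as a model feature."""
    if age < 18:
        return "minor"
    if age < 65:
        return "adult"
    return "senior"


def test_bucketize_age_boundaries():
    # Boundary values exercise the edge of each bucket, where off-by-one
    # errors in feature engineering code typically hide.
    assert bucketize_age(17) == "minor"
    assert bucketize_age(18) == "adult"
    assert bucketize_age(64) == "adult"
    assert bucketize_age(65) == "senior"
```

Because this runs as an early build step, a failing transform test stops the workflow before any container images are built or pushed.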
Next, the TFX pipeline workflow is compiled to produce the workflow.tar.gz file, which is then uploaded to Cloud Storage. The compiled TFX pipeline, again tagged with its name and version, is deployed to a managed Kubeflow Pipelines instance. Pipeline runtime parameters are read from a settings.yaml file, and a Kubeflow experiment is created or reused. Optionally, the pipeline can be run with those parameter values as part of an integration test or a production execution. The executed pipeline ultimately deploys a model as an API on Cloud AI Platform.

Let's now step through some TFX pipeline design patterns to see how they connect to the levels of ML process development and automation introduced earlier in Course 1. TFX provides level-0 automation through notebooks. Notebooks can serve as the prototyping environment for interactive TFX pipeline development, and can also be used to trigger programmatic creation of pipelines. Notebooks use the same or similar programming abstractions as orchestrators during experimentation, so you can automatically or semi-automatically refactor experimentation code into a production pipeline. As I previously mentioned, there is a handy export-to-pipeline utility function to quickly go from your notebook prototype to a production orchestrator by specifying a runner such as Beam, Airflow, or Kubeflow Pipelines.

Next, TFX provides level-1 automation: continuous training operation through manual or automatic triggers. For experimentation and testing, you can use the TFX command-line utility or UI to manually trigger partial or full pipeline runs. A common use case is retraining your model with new hyperparameters found in the latest study; you don't necessarily need to rerun the data ingestion and processing parts of your pipeline, because your inputs haven't changed. To add further automation, you can leverage Google Cloud services such as Cloud Functions to build automatic triggers for full pipeline runs, based on new data, for example. Cloud Functions enables continuous training through automatic triggers on events such as new data being pushed to a Cloud Storage bucket, or pipeline code being integrated into the main branch from a team member's development branch (a sketch of such a trigger function appears below).

Level-1 automation was about automating the TFX pipeline's continuous training; level 2 brings automation to the pipeline development process itself, through continuous integration and delivery of the TFX pipeline code. In this CI/CD setup, you have a build pipeline that builds pipeline component images, runs unit tests, and uploads component images to Container Registry. You also have a QA or staging environment where your code is deployed first, which can run your TFX pipeline end to end and exercise any integration tests between components. As we discussed in the example Cloud Build workflow, Cloud Build can be configured to run the build pipeline, perform a QA pipeline run, and then deploy a new continuous training TFX pipeline to a production environment. To summarize the outputs of these CI/CD workflows: given a new implementation of the pipeline code, a successful CI/CD pipeline run deploys a new TFX continuous training pipeline; given new data, that continuous training pipeline will then deploy a new model prediction service.
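Here is a hedged sketch of such a Cloud Functions trigger: a background function that launches a full pipeline run whenever a new data file is finalized in a Cloud Storage bucket. The KFP endpoint, the packaged pipeline filename, and the data-root parameter name are assumptions for illustration, not values from the course.

```python
# Hedged sketch of a background Cloud Function that triggers a full
# Kubeflow Pipelines run when a file is finalized in a Cloud Storage
# bucket. KFP_HOST, the package filename, and the "data-root" runtime
# parameter are illustrative assumptions.
import kfp

KFP_HOST = "https://<your-kfp-endpoint>"  # assumed: your managed KFP instance
PIPELINE_PACKAGE = "workflow.tar.gz"      # compiled TFX pipeline, bundled
                                          # with the function's source


def trigger_pipeline(event, context):
    """Entry point for google.storage.object.finalize events."""
    data_uri = f"gs://{event['bucket']}/{event['name']}"
    client = kfp.Client(host=KFP_HOST)
    client.create_run_from_pipeline_package(
        pipeline_file=PIPELINE_PACKAGE,
        arguments={"data-root": data_uri},  # assumed pipeline parameter name
        run_name=f"retrain-on-{event['name']}",
        experiment_name="continuous-training",
    )
```

Deploying this function with a google.storage.object.finalize trigger on the data bucket closes the loop between data arrival and retraining.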
Let's now review the full TFX end-to-end MLOps workflow at present. There are several feedback loops in this MLOps automation that mirror the feedback loops of the ML project lifecycle diagram introduced earlier in Course 1. On Google Cloud, you can start development by prototyping a model and pipeline in an AI Platform Notebook. Next, you gradually refactor your pipeline into code and configurations that are checked into a source control repository on GitHub, GitLab, or Cloud Source Repositories. This pipeline code can then become part of a continuous integration and delivery feedback loop, using a build system like Cloud Build that outputs code artifacts, such as component and pipeline containers, to Container Registry. You can configure your TFX pipeline code to deploy automatically to different environments, including development, test, and staging, based on different triggers. Finally, you have a continuous training pipeline that regularly retrains, evaluates, and deploys machine-learning models while outputting metadata, artifacts, and models to serving environments.

What does the future state of TFX machine-learning pipeline development look like on Google Cloud? The answer is more managed services that simplify resource management, so you can focus on your machine-learning use case instead of the underlying pipeline and environment details. There are still TFX continuous training pipelines for continuous model training, evaluation, and deployment, and there are still CI/CD pipelines for pipeline code, metadata, and artifacts. However, this automation of pipeline training and code development opens up two new automation opportunities that further improve your machine-learning development velocity.

First, a feature store manages and serves machine-learning features, acting as a bridge between your models and your data. Feature stores provide scalable access to features and automated documentation of them, promoting feature discovery and reuse across machine-learning use cases. The key benefit a feature store brings to your TFX pipeline development workflow is that it decouples feature engineering from feature usage, allowing independent development and consumption of features. Features added to a feature store become immediately available for training and serving, and models in production can retrieve the same features used in training from a low-latency online store. This means that new TFX pipelines start with feature selection from a catalog instead of feature engineering from scratch.

Second, with the automated building blocks of models, code, and features, you can further expand automation over your entire TFX pipeline. You can spin up multiple pipelines, each with different runtime parameters such as data spans and splits, features, and model hyperparameters, to iterate toward an optimal pipeline for deployment (see the sketch at the end of this section). TFX pipelines are moving toward a path to full automation and tuning of the entire end-to-end pipeline, including data management and feature selection, feature engineering, model architecture search, and pipeline development and deployment. The key takeaway is that the future of TFX pipeline development on Google Cloud is more managed services and products, with scalability and best practices built in, to support your machine-learning development velocity.
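To make that fan-out pattern concrete, here is a hedged sketch that launches several runs of the same compiled TFX pipeline with different runtime parameters. The endpoint, package path, and parameter names are illustrative assumptions.

```python
# Hedged sketch: fan out several runs of the same compiled TFX pipeline,
# each with different runtime parameters, to search for the best
# end-to-end configuration. Endpoint, package path, and parameter names
# are illustrative assumptions.
import kfp

client = kfp.Client(host="https://<your-kfp-endpoint>")  # assumed endpoint

# Each candidate varies the data span and a model hyperparameter.
candidates = [
    {"data-span": "latest", "learning-rate": 1e-3},
    {"data-span": "latest", "learning-rate": 1e-4},
    {"data-span": "last-30-days", "learning-rate": 1e-3},
]

for i, params in enumerate(candidates):
    client.create_run_from_pipeline_package(
        pipeline_file="workflow.tar.gz",  # compiled pipeline package
        arguments=params,
        run_name=f"pipeline-search-{i}",
        experiment_name="pipeline-tuning",
    )
```

Comparing the evaluation metrics these runs record can then guide which configuration is promoted to the production continuous training pipeline.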