Hi there. In this video, I'm going to discuss an experience I had at Nordstrom when we were migrating some of our legacy products to the cloud and introducing the DevOps mindset. I've titled this video, 'Living in a Hybrid World' because that was a big piece of the reality and contexts in which we were operating at the time. I originally presented this talk at DevOps Days Seattle, alongside one of the principal engineers who was on that team that migrated to the cloud. First, let me set the context because I'm sure that there are a lot of people in similar situations dealing with complexity in their system. At Nordstrom at the time, we had our full price offerings that included our brick and mortar stores, our dot com, apps, and the website, and we also had acquired a company called Trunk Club. For our off-price channel, we had our Nordstrom rack stores and also Nordstromrack.com and we had also acquired HauteLook, which was a digital experience that was a flash sale site. So, we had a lot of different channels, but at the end of the day, we ultimately wanted our customers to see Nordstrom as one offering. Our technology was one of the key enablers to make that happen. We were looking for strategic flexibility. Our digital experience was convenient but it didn't have the same high-touch experience that our brick and mortar stores had. At the same time, while our stores were more high-touch, they were less convenient and we were all about speed and reliability. I'm going to start with our journey toward modernization and how we started down that path. I intend to demonstrate that not all teams move at the same pace. This journey could be characterized as a crawl, walk, and then run. In the crawl phase, there's a lot of investment in engineering thought leadership. It involves focusing on shipping product and looking at microservices and cloud strategy. In the walk phase, there was a lot of investment in Lean mindsets and practices adopting DevOps and the implementation of microservices and cloud, moving towards on-demand releases and establishing baseline metrics. In the run phase, it was about taking talent and spreading that across the organization, learning from each other, managing to the metrics that had been established, and then optimizing and extending across those dimensions. Now we'll share a case study in our web product page modernization. Our goal with the product page was to transform it. I mean, the product page for any organization is what we like to call the money page. That is where customers decide if they're going to make a purchase. Therefore, it is extremely important for that page to be compelling, but it also needs to perform. Prior to the modernization effort, we were in a very low trust environment with our customers. We were slow to respond to business needs. We had a tightly coupled monolithic codebase that had been developed over a decade before. We were in a five-week release process and often had sight failures during peak traffic events. We really wanted to move to being a trusted and reliable partner, having a more agile response to our business requirements, establishing microservices, investing in continuous integration and continuous delivery pipelines, having on-demand releases, and having a reliable and scalable environment. We looked at modernization across the entire stack. We decided to put in UX component models at the presentation layer, moving to microservices and having data that we can leverage in a speed devalue way, moving to cloud-based blue-green deployments so that we can de-risk those releases and move to on-demand and have fully automated test suites across the entire deployment pipeline. I'd like to share a quote by Christopher Alexander that was a way for us to think about how we were operating at the time. It says, "Unless the regions have the power to be self-governing, they will not be able to solve their own environmental problems. The arbitrary lines of states and countries, which often cut across natural regional boundaries, make it all but impossible for people to solve regional problems in a direct and humanly efficient way." This represents a way for us to use Conway's law to our advantage and optimize our organization to match what we were shipping as a product. At the start of our modernization journey, we had multiple Dev teams, centralized support, and centralized infrastructure. We wanted to move toward a distributed full-stack team and having a weekly support rotation that the team owned. What we expected to get from that was ownership, accountability, and an operational mindset. For our code and infrastructure, we had a centrally-managed build and source control platform. We were centrally managing infrastructure in physical data centers, and we had a five-week release process which was keeping us from delivering fast. We ended up moving to a distributed model, where the team owned the CI/CD pipeline, had on-demand releases to the cloud and started using infrastructure as code. The benefit was that we were able to release multiple times per day. We had a rollback capability, so if we got into a situation where we deployed code that introduced a problem, we could roll back immediately. We could react to major and minor traffic needs within six minutes and technology was no longer the constraint for delivering value. For logging and metrics, we had basic IIS logs and a DOM-ready page performance view. We moved to more of an extensible logging scheme and user-ready page performance. That got us to a point where the team had access to app-specific metrics. They could look at the business outcomes and understand how their product was performing. For incidence, prior to this, we had a centralized support team which resulted in very slow escalation and resolution. We moved to automated alerts, a modern incident platform with collaboration and the on-call rotation was amongst the product team. We were able to respond to critical incidents within two to three hours and we had fewer incidents due to that proactive monitoring. There's a famous quote from Jez Humble that says, "If it hurts, do it more frequently and bring the pain forward." That was a big lesson for us. Often when things are painful, it's easy to retreat. Instead we said, "No. We're going to continue to persist and do this more frequently so we can learn." Teams pushed themselves very hard. The developers were passionate about continuous improvement. We got our feature batch size down to something that was small and increased our frequency through new processes. We could split up new tests branches. We had an experimentation platform. We could do this within 5-10 minutes and it was accessible by multiple teams across the company. We were able to ideate and course-correct early. What else do we learn from this? Well, we no longer needed these big sign-offs. We didn't have big go or no-go decisions before we released. We had to continue making investments in the platform. It wasn't a one and done, so we needed to have our feature to infrastructure ratios established and incorporated into each release. It was very important for the team to have a focus on business outcomes. They rallied around that instead of the traditional metrics of just shipping code. There were still some challenges in a hybrid environment which will probably not be surprising. We still have a lot of cross team orchestration. Even though most of the product page team is autonomous, they still depend on other teams to get work done. They're only as fast as the slowest link. So, if there's a team that can't move at the speed of multiple deployments a day, then the product page team is restricted to the same slowness. If there were cross team requirements for operational work or incident management, that was challenging as well. There was still a lot of legacy code in the environment, so there was a need to focus on how we could modernize across the whole stack. Another thing we learned is that when you're in a hybrid environment, you must also take care of the legacy stack. Prior to shifting our focus to modernization, we had a five-week release cycle for everything else that was part of the website, but we really wanted to get away from that. We had a hardening sprint that was part of that five-week release cycle that was a two-week hardening sprint. We frequently had exceptions to that. Often the code was not ready when the hardening sprint started. So, we would have quality issues deep into the hardening sprint. Testing was manual. The releases took several days and required a lot of heroics. At one point, they were three days long and took 24 by seven shifts in order to accommodate the release. We had a target of reducing that hardening sprint by a week, reducing the exceptions by 75 percent, automating our tests and holding our feature teams accountable for quality. We aim to get our production releases down to being able to be finished in a single eight-hour day. This is what it looked like across our timeline in 2015. We had our release schedules established. We did a value stream mapping workshop so that we could understand what was contributing to the bottleneck across the five-week release cycle. We did some experiments that saved over 2,000 hours and remove 20 steps, and we really got our collaboration to be high. We put in quality gates to reduce those exceptions and had some additional experiments that were across multiple dimensions including integration, deployment, and performance testing. As we implemented those improvements, we started to see additional incremental reductions. By the end of 2015, we had taken five days out of that hardening sprint, and we had removed over 3,000 hours of waste. We took out the exception time that we expected and we were all on track to getting our releases down to being done within a single eight-hour day. This produced excellent outcomes. We saw behavior change within the teams. They got more excited about problem-solving and continuous improvement. Our leaders became very engaged. They needed to be in order to create the space for the problem-solving. Our trust went up across all teams and so did the team morale. People were excited about the fact that we were improving the. legacy system and not just focused on the modern stack. Teams got to learn new skills. They got to learn about Lean, DevOps, and they saw that as the organization continued to invest in their personal development. So, this is what it was like to live in a hybrid world. I think it's important to remember that most organizations will be in that situation. Even if you're on a journey to modernize your stack, you almost always have something within your environment that's considered legacy. It's important to implement DevOps practices and Lean, regardless of how modern the stack and capabilities are. So, when you find yourself in a hybrid environment, focus on modernization but don't neglect continuing to show your legacy system some love. When you integrate both, you'll serve all your customers well.