Patterns and practices for building resilient serverless applications

“Lambda gives you multi-AZ out of the box, but things can still go wrong in production. There are region-wide outages, and performance degradation in services your function depends on can cause it to time out or error. And what if you’re dealing with downstream systems that just aren’t as scalable and can’t handle the load you put on them?

The bottom line is that many things can go wrong, and they often do at the worst of times. The goal of building resilient systems is not to prevent failures, but to build systems that can withstand them. In this talk, we will look at a number of practices and architectural patterns that can help you build more resilient serverless applications, such as multi-region active-active deployments, dead-letter queues (DLQs), and surge queues. We’ll also see how we can use chaos experiments to help us identify failure modes before they manifest in production.”