Course description

In today's complex and distributed systems, failures are inevitable. Chaos Engineering is a discipline that helps organizations proactively test system resilience by intentionally injecting controlled failures. This course provides a hands-on approach to understanding and implementing Chaos Engineering techniques to build more robust and fault-tolerant systems.
Through a structured learning path, you will explore key concepts, best practices, and tools to identify weaknesses before they cause major disruptions. By the end of this course, you will have the skills to design, execute, and integrate chaos experiments into your software development lifecycle.


Course content:


  • Introduction to Chaos Engineering
    Understand the principles, history, and benefits of Chaos Engineering. Learn why failure testing is critical for modern applications.


  • Understanding System Weaknesses
    Identify common failure points in distributed systems, including dependencies, bottlenecks, and cascading failures.


  • Designing Chaos Experiments
    Learn how to define hypotheses, establish steady-state behavior, and develop structured chaos experiments.


  • Chaos Engineering Tools Overview
    Explore popular Chaos Engineering tools such as Chaos Monkey, LitmusChaos, Gremlin, and more.


  • Simulating Infrastructure Failures
    Conduct controlled experiments to test resilience against server crashes, resource exhaustion, and cloud failures.


  • Injecting Network Failures
    Understand how network delays, packet loss, and service disruptions affect system reliability and how to mitigate them.


  • Observability and Monitoring
    Learn how to leverage observability tools and metrics to analyze system performance during chaos experiments.


  • Integrating Chaos Engineering into CI/CD Pipelines
    Implement Chaos Engineering practices within CI/CD workflows to ensure reliability in production environments.


  • Advanced Chaos Scenarios
    Explore complex scenarios such as multi-region failures, security-related chaos, and scaling challenges.
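
To give a feel for the structure described under Designing Chaos Experiments (state a hypothesis, confirm steady state, inject a failure, compare), here is a minimal Python sketch. It is an illustration only, not course material: the names `run_experiment`, `make_service`, and `ExperimentResult`, the 5% error threshold, and the simulated service are all hypothetical, and a real experiment would target an actual system with a tool such as those listed above.

```python
import random
from dataclasses import dataclass
from typing import Callable


@dataclass
class ExperimentResult:
    hypothesis: str
    baseline_error_rate: float
    chaos_error_rate: float
    hypothesis_holds: bool


def error_rate(call: Callable[[], bool], requests: int) -> float:
    """Return the fraction of failed calls out of `requests` attempts."""
    failures = sum(1 for _ in range(requests) if not call())
    return failures / requests


def make_service(rng: random.Random, failure_probability: float,
                 retries: int) -> Callable[[], bool]:
    """Build a simulated (hypothetical) service call with client-side retries.

    Each attempt fails with `failure_probability`; the call succeeds
    if any of the `retries + 1` attempts succeeds.
    """
    def call() -> bool:
        return any(rng.random() >= failure_probability
                   for _ in range(retries + 1))
    return call


def run_experiment(hypothesis: str,
                   healthy_call: Callable[[], bool],
                   faulty_call: Callable[[], bool],
                   max_error_rate: float,
                   requests: int = 1000) -> ExperimentResult:
    # 1. Establish steady state before injecting anything.
    baseline = error_rate(healthy_call, requests)
    if baseline > max_error_rate:
        raise RuntimeError("system is not in steady state; aborting")

    # 2. Inject the failure and measure the same metric.
    under_chaos = error_rate(faulty_call, requests)

    # 3. The hypothesis holds if the metric stays within the threshold.
    return ExperimentResult(hypothesis, baseline, under_chaos,
                            under_chaos <= max_error_rate)


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so runs are repeatable
    result = run_experiment(
        hypothesis="Error rate stays under 5% when 30% of attempts fail",
        healthy_call=make_service(rng, failure_probability=0.0, retries=2),
        faulty_call=make_service(rng, failure_probability=0.3, retries=2),
        max_error_rate=0.05,
    )
    print(result)
```

The sketch measures the same metric before and during the injected failure, so the result reflects the failure rather than a system that was already unhealthy.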


Who Should Take This Course?

    • DevOps engineers, SREs, and cloud architects
    • Software engineers and QA professionals
    • IT professionals interested in system resilience and fault tolerance

What will I learn?

  • Understand Chaos Engineering Principles - Gain a deep understanding of Chaos Engineering concepts, why they matter, and how they help build resilient systems.
  • Design and Execute Chaos Experiments - Learn how to plan, implement, and analyze controlled failure experiments to identify system weaknesses.
  • Integrate Chaos Engineering into CI/CD Pipelines - Develop the skills to incorporate automated chaos testing into deployment workflows to enhance system reliability.

Requirements

  • Basic Understanding of Cloud and Distributed Systems – Familiarity with cloud computing concepts and distributed architectures will be helpful but not mandatory.
  • Knowledge of Linux and Command-Line Basics – Since many Chaos Engineering tools operate in Linux environments, basic command-line navigation and scripting knowledge are beneficial.
  • Access to a Testing or Cloud Environment – While not mandatory, having access to a test environment (AWS, Azure, GCP, or a local Kubernetes cluster) will enhance hands-on learning and experimentation.

Frequently asked questions

No, this course is designed for both beginners and experienced professionals. We start with the fundamentals and gradually move to advanced concepts, ensuring that you gain a solid understanding of Chaos Engineering.

While having access to a cloud or on-premises environment is beneficial, we will provide guided simulations and hands-on exercises using Chaos Engineering tools that can be run in test environments or locally.

This course will equip you with practical skills to proactively identify and mitigate system weaknesses, improving reliability and resilience in production environments. You will also learn how to integrate Chaos Engineering into CI/CD pipelines, a critical skill for DevOps and SRE professionals.

Akinola Ojuola

Cloud Solution Architect, DevOps Consultant & Trainer

Akinola Ojuola is a seasoned Cloud Solution Architect, DevOps Consultant, and technical trainer with over 20 years of industry experience. Throughout his career, he has worked with some of the world’s most prominent technology-driven organisations, including IBM, Fujitsu, Walmart, and MasterCard, delivering transformative solutions across various sectors. Akinola has trained and mentored more than 1,000 students across 18 countries on five continents. His commitment to practical, real-world learning has enabled hundreds of learners to launch successful careers in global tech companies, and his teaching approach blends deep technical knowledge with hands-on, enterprise-level experience. He holds multiple industry certifications and leads advanced projects in Cloud Architecture, DevOps, DevSecOps, and Artificial Intelligence for both private enterprises and public institutions. Whether you’re just starting or looking to advance your tech career, you’ll gain valuable, job-ready skills under his guidance.

$10

Lectures

9

Quizzes

9

Skill level

Beginner

Expiry period

1 Month

Certificate

Yes