Friday, March 15, 2024

AWS: Find Root Cause of Failure for CloudFormation Stacks

 When a CloudFormation stack fails, you have to scroll back trough the events to find the root cause of the failure. Recently, AWS even added a "Detect Root Cause" button to the Console to immediately scroll to the correct event. But how do you do it from a python script?

import boto3

def find_root_cause(stack_name):
    cf_client = boto3.client('cloudformation')

    next_values = "First Time"
    params = {
        "StackName": stack_name
    }
    root_cause = None

    while next_values:
        result = cf_client.describe_stack_events(**params)

        next_values = result.get("NextToken")
        params["NextToken"] = next_values

        for event in result["StackEvents"]:
            status = event.get("ResourceStatus", "")
            reason = event.get("ResourceStatusReason")

            # start of deployment
            if reason == "User Initiated":
                return root_cause
           
            if reason and "FAILED" in status:
                root_cause = reason

    return root_cause

You follow the same pattern as from the Console. You go back the events history, until you reach the oldest error message before the start of the deployment.

No comments:

Post a Comment