Wednesday, October 25, 2023

Moto: Alias Issue when Creating S3 Access Point is Fixed

Two days ago, I discovered a small bug in the moto library, which I use to unit test my lambdas on AWS. I needed to create an S3 Access Point with boto3, and while retrieving its alias, I got different results depending on whether I used the return value of create_access_point or get_access_point.

So I wrote a small unit test:

from moto import mock_s3control
import boto3

@mock_s3control
def test_access_point_alias():
    client = boto3.client("s3control")

    alias_from_create = client.create_access_point(
        AccountId="123456789012",
        Name="my-access-point",
        Bucket="MyBucket",
    )["Alias"]

    alias_from_get = client.get_access_point(
        AccountId="123456789012",
        Name="my-access-point",
    )["Alias"]

    assert alias_from_create == alias_from_get

I create an S3 Access Point and retrieve its alias in two ways: from the response of the create_access_point call, and from the get_access_point call. With moto 4.2.6, this test fails.

So I opened an issue on the project's repository. It was fixed and closed the same day. That's responsiveness!

Thursday, October 12, 2023

AWS: The Next Token Pattern

When developing for AWS, there is a pattern you use each time a call to a service may return a lot of data. You get back part of the data, together with a token that you can provide to fetch the next batch.

As an example, let's take the service that returns the list of events from a CloudFormation stack deployment. Here is how you would do it using Python and boto3:

cf_client = boto3.client("cloudformation")
response = cf_client.describe_stack_events(StackName="mystack")
# do something with the response

while response.get("NextToken"):
    response = cf_client.describe_stack_events(
        StackName="mystack",
        NextToken=response.get("NextToken")
    )
    # do again something with the response

However, there is one thing I do not like about this pattern: code repetition. You call the service at two different places in your code, with almost identical parameters. And you process the response in the same way, again in two places. If you have to fix something in this code, you have to remember to fix it in both places.

The approach I use to write the code only once is to take advantage of Python's ability to pass parameters as a dictionary. Here is my version of the pattern:

cf_client = boto3.client("cloudformation")
next_token = "FIRST TIME"
params = {"StackName": "mystack"}

while next_token:
    response = cf_client.describe_stack_events(**params)
    # do something with the response

    next_token = response.get("NextToken")
    params["NextToken"] = next_token

I store my parameters in a dictionary, and I initialize the next token with something that is not empty, so the first pass through the loop always runs. I can then call my service, without the token. After processing the response, I read the next token and store it in my parameter dictionary. The second time around, the call to the service includes my token.
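The same idea can be taken one step further and wrapped in a generator, so that calling code does not even see the loop. Here is a sketch; fetch_page and fake_describe are hypothetical stand-ins for a boto3 call like describe_stack_events, used only so the example can run without AWS:

```python
def paginate(fetch_page, **params):
    """Call fetch_page repeatedly, feeding NextToken back, and yield each response."""
    next_token = "FIRST TIME"  # sentinel so the first iteration always runs
    while next_token:
        response = fetch_page(**params)
        yield response
        next_token = response.get("NextToken")
        params["NextToken"] = next_token

# Hypothetical stand-in for a boto3 call: returns two pages,
# the first one carrying a NextToken.
def fake_describe(**params):
    if params.get("NextToken") is None:
        return {"StackEvents": ["event-1"], "NextToken": "page-2"}
    return {"StackEvents": ["event-2"]}

pages = list(paginate(fake_describe, StackName="mystack"))
# pages[0] holds the first batch of events, pages[1] the second
```

With the real boto3 client, you would pass cf_client.describe_stack_events as fetch_page and process each yielded response in the for loop of the caller.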

Something funny happened when I started using this pattern in our production code. We tried Bandit, a tool that analyzes your code and looks for security issues. It would systematically flag my pattern with this error:

[B105:hardcoded_password_string] Possible hardcoded password: 'FIRST TIME'

Well, I had to slightly modify my pattern to avoid using the word token...
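One way to keep the single-call structure without any token-like string constant is to loop forever and break once the token disappears. A sketch, again with a hypothetical fake_service standing in for the boto3 call so it can run anywhere:

```python
def paginate_no_sentinel(fetch_page, **params):
    """Same pattern, but no 'FIRST TIME' string for Bandit to flag."""
    while True:
        response = fetch_page(**params)
        yield response  # the caller processes each response
        next_token = response.get("NextToken")
        if not next_token:
            break
        params["NextToken"] = next_token

# Hypothetical two-page service, for illustration only.
def fake_service(**params):
    if params.get("NextToken") is None:
        return {"Items": [1, 2], "NextToken": "page-2"}
    return {"Items": [3]}

items = [i for page in paginate_no_sentinel(fake_service) for i in page["Items"]]
# items gathers 1, 2 and 3 across the two pages
```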

Wednesday, October 11, 2023

AWS: Simpler S3 File Deletes by Prefix

I came across this code that deletes files in an S3 bucket from a list of prefixes:

s3_client = boto3.client('s3')
for prefix in prefix_list:
    paginator = s3_client.get_paginator('list_objects_v2')
    file_list = paginator.paginate(
        Bucket=data_bucket,
        Prefix=prefix
    )
    for current_content in file_list:
        for current_file in current_content.get('Contents', []):
            current_key = current_file['Key']
            response = s3_client.delete_object(
                Bucket=data_bucket,
                Key=current_key
            )

The code creates an S3 client, and then, for each prefix in the list, creates a paginator. Paginators are great because they help you avoid exhausting your memory when the list of files is big. Using this paginator, the code retrieves the list of all the files matching the prefix, and deletes them one by one.

There is nothing wrong with this code; it works nicely. My only remark is that there exists a simpler way. Instead of using the S3 client, you can create an S3 bucket resource. From there, you can delete all files listed under a prefix using a simple filter:

s3_resource = boto3.resource('s3')
bucket = s3_resource.Bucket(data_bucket)
for prefix in prefix_list:
    bucket.objects.filter(Prefix=prefix).delete()

Simpler!

Friday, October 6, 2023

AWS: Automatic Subscription Confirmation from SQS Queue to SNS Topic

We have an architecture in AWS where events from different accounts need to be sent to one central SQS queue. Since the events cross both accounts and regions, one way to do it is to send them to a local SNS Topic.

The SQS queue has to subscribe to all those Topics, but we cannot do it from the SQS side, since it does not know each time someone spins up a new account. However, the problem with having the SNS Topics create the subscriptions is that they then wait for a confirmation from the SQS queue.

Since we already have a lambda waiting on the other side of the queue, handling all the events, we added a small piece of code to handle the subscription confirmation as well. Here it is:

import json
import urllib.request

def lambda_handler(event, context):
    for record in event["Records"]:
        # The SQS record body carries the raw SNS message
        body = json.loads(record["body"])

        if body.get("Type") == "SubscriptionConfirmation":
            handle_subscription_confirmation(body)

def handle_subscription_confirmation(message):
    # Visiting the SubscribeURL confirms the subscription on the SNS side
    url = message["SubscribeURL"]

    with urllib.request.urlopen(url) as response:
        print(response.read())
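The routing in the handler is easy to exercise locally by building a fake SQS event. In this sketch, the confirmation function simply records the URL instead of fetching it, and the URL itself is made up:

```python
import json

confirmed_urls = []

def handle_subscription_confirmation(message):
    # Stub: record the URL instead of calling urllib.request.urlopen
    confirmed_urls.append(message["SubscribeURL"])

def lambda_handler(event, context):
    for record in event["Records"]:
        body = json.loads(record["body"])
        if body.get("Type") == "SubscriptionConfirmation":
            handle_subscription_confirmation(body)

# A fake SQS event wrapping an SNS SubscriptionConfirmation message
fake_event = {
    "Records": [
        {
            "body": json.dumps(
                {
                    "Type": "SubscriptionConfirmation",
                    "SubscribeURL": "https://sns.example.test/confirm",
                }
            )
        }
    ]
}
lambda_handler(fake_event, None)
# confirmed_urls now contains the SubscribeURL from the fake message
```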

I find it strange that the CloudFormation template we use to create the subscription does not handle the confirmation as well. Or maybe that does not work cross-account?