Scratch Where It's Itching

Monday, June 13, 2022

WTF: Case for a Join

I found this code in our repository:

roles = ""
for i, id in enumerate(account_ids):
    if i > 0:
        roles = roles + ","
        if i == len(account_ids) - 1:
            roles = roles + "\"arn:aws:iam::*:role/" + entity_name + "-" + id + "-admin"
        else:
            roles = roles + "\"arn:aws:iam::*:role/" + entity_name + "-" + id + "-admin\""
    if i == 0:
        if i == len(account_ids) - 1:
            roles = roles + "arn:aws:iam::*:role/" + entity_name + "-" + id + "-admin"
        else:
            roles = roles + "arn:aws:iam::*:role/" + entity_name + "-" + id + "-admin\""

It takes some time to understand what is going on here, but basically, we are building a comma-separated list of AWS roles from a list of account IDs.

When you see a condition based on the index of a for loop, you should start to notice a very strong code smell.

First, the tests for the index being greater than zero, or equal to zero, are mainly there to handle the comma. But not only that. Nearly identical tests are then repeated to check whether we are on the last iteration, all this to decide whether a " character has to be added at the beginning or the end of our string.

At the end, it produces a string in the form: role1","role2","role3

The reason there are no quotation marks at the beginning or the end of the string is that it will ultimately be put in a template that is declared with the marks already there, in this form: "__ROLES__".

All this code is quite bad, and the comma and quotation mark handling can all be left to a call to the string join method:

roles = '","'.join([f"arn:aws:iam::*:role/{entity_name}-{id}-admin"
    for id in account_ids])

From 13 lines to 1, I kind of like it.
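To check that the joined string really fits the template, here is a minimal sketch. The entity name, the account IDs and the template are made up for the example; only the "__ROLES__" placeholder comes from the real code.

```python
import json

# Hypothetical inputs, for illustration only
entity_name = "acme"
account_ids = ["111111111111", "222222222222", "333333333333"]

# Hypothetical template: the placeholder already sits between quotation marks
template = '{"Resource": ["__ROLES__"]}'

# Join on '","' so that only the inner quotation marks are produced
roles = '","'.join(
    f"arn:aws:iam::*:role/{entity_name}-{account_id}-admin"
    for account_id in account_ids
)

policy = template.replace("__ROLES__", roles)

# The result parses as JSON and contains one role per account
resources = json.loads(policy)["Resource"]
print(resources)
```

If the template could be changed, building the list in Python and serializing it with json.dumps would probably be more robust than string replacement, but that is a bigger change than the one described here.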

Tuesday, June 7, 2022

AWS assume role one-liner

A couple of months ago, I wanted to simplify the way I assume a role in AWS, an operation I perform several times a day. Usually, you would run a command like this one with the AWS CLI:

aws sts assume-role --role-arn $MY_ROLE_ARN --role-session-name test

It would return you a JSON document like this one:

{
    "AssumedRoleUser": {
        "AssumedRoleId": "AROA3XFRBF535PLBIFPI4:s3-access-example",
        "Arn": "arn:aws:sts::123456789012:assumed-role/xaccounts3access/s3-access-example"
    },
    "Credentials": {
        "SecretAccessKey": "9drTJvcXLB89EXAMPLELB8923FB892xMFI",
        "SessionToken": "AQoXdzELDDY//////////wEaoAK1wvxJY12r2IrDFT2IvAzTCn3zHoZ7YNtpiQLF0MqZye/qwjzP2iEXAMPLEbw/m3hsj8VBTkPORGvr9jM5sgP+w9IZWZnU+LWhmg+a5fDi2oTGUYcdg9uexQ4mtCHIHfi4citgqZTgco40Yqr4lIlo4V2b2Dyauk0eYFNebHtYlFVgAUj+7Indz3LU0aTWk1WKIjHmmMCIoTkyYp/k7kUG7moeEYKSitwQIi6Gjn+nyzM+PtoA3685ixzv0R7i5rjQi0YE0lf1oeie3bDiNHncmzosRM6SFiPzSvp6h/32xQuZsjcypmwsPSDtTPYcs0+YN/8BRi2/IcrxSpnWEXAMPLEXSDFTAQAM6Dl9zR0tXoybnlrZIwMLlMi1Kcgo5OytwU=",
        "Expiration": "2016-03-15T00:05:07Z",
        "AccessKeyId": "ASIAJEXAMPLEXEG2JICEA"
    }
}

And then you would need to export environment variables for setting the access key, secret key and session token.

export AWS_ACCESS_KEY_ID="ASIAJEXAMPLEXEG2JICEA"
export AWS_SECRET_ACCESS_KEY="9drTJvcXLB89EXAMPLELB8923FB892xMFI"
export AWS_SESSION_TOKEN="..."

So I looked for a simpler solution and stumbled upon this Stack Overflow question: AWS sts assume role in one command.

Some suggestions use the very useful jq utility, which lets you retrieve information from JSON documents. But for AWS CLI commands it is normally not necessary, since they all accept the --query option, which supports JMESPath syntax. So, as one answer suggested, you can simply use JMESPath's built-in join function to construct your export command, and let the shell evaluate it. This is the solution I have been using for some months now:

eval $(aws sts assume-role \
 --role-arn $MY_ROLE_ARN \
 --role-session-name test \
 --query 'join(``, [`export AWS_ACCESS_KEY_ID=`, 
 Credentials.AccessKeyId, ` ; export AWS_SECRET_ACCESS_KEY=`,
 Credentials.SecretAccessKey, `; export AWS_SESSION_TOKEN=`,
 Credentials.SessionToken])' \
 --output text)

This command has been really practical, so I decided to add an entry to my blog about it, so that I will always remember where to look for it. When I returned to Stack Overflow, I found out that a simpler solution had been suggested since. It uses the shell's built-in printf command, which I had absolutely no idea existed:

export $(printf "AWS_ACCESS_KEY_ID=%s AWS_SECRET_ACCESS_KEY=%s AWS_SESSION_TOKEN=%s" \
 $(aws sts assume-role \
 --role-arn $MY_ROLE_ARN \
 --role-session-name test \
 --query "Credentials.[AccessKeyId,SecretAccessKey,SessionToken]" \
 --output text))

I guess you never stop learning.
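For the cases where the shell one-liner does not fit (inside a larger script, for instance), the same mapping from the Credentials block to the three environment variables can be sketched in Python. This is only an illustration: the export_commands function and the shortened sample document are my own, and the session token is a placeholder.

```python
import json
import shlex


def export_commands(assume_role_output: str) -> str:
    """Turn the JSON printed by 'aws sts assume-role' into export lines."""
    credentials = json.loads(assume_role_output)["Credentials"]
    variables = {
        "AWS_ACCESS_KEY_ID": credentials["AccessKeyId"],
        "AWS_SECRET_ACCESS_KEY": credentials["SecretAccessKey"],
        "AWS_SESSION_TOKEN": credentials["SessionToken"],
    }
    # shlex.quote protects values that contain characters special to the shell
    return "\n".join(
        f"export {name}={shlex.quote(value)}" for name, value in variables.items()
    )


# Shortened sample document; the token is a placeholder, not a real one
sample_output = json.dumps({
    "Credentials": {
        "AccessKeyId": "ASIAJEXAMPLEXEG2JICEA",
        "SecretAccessKey": "9drTJvcXLB89EXAMPLELB8923FB892xMFI",
        "SessionToken": "token/with+chars=",
    }
})

print(export_commands(sample_output))
```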

Monday, May 16, 2022

WTF: Python Dict two liner

When working with DynamoDB in AWS, you know that the values you put in there are not your usual JSON. Instead of your usual key/value pairs, you have key/type/value triples. Something like this:

{
    'key': {
        'type': 'value'
    }
}

Of course, you can let your API build this mess for you. Or you can do it yourself, as I found in our code base. Except it had this little gem in it. For creating a map, you have to apply this key/type/value pattern to all values. And then you have to insert the map using the 'M' type. The code ended with these two lines:

data = dict({})
data['M'] = my_map

First, dict({}) is like creating an empty dict from an empty dict. You either use dict() or {}, but not both. But since you can create your dict inline, I'd rather use this one-liner:

data = {'M': my_map}

Does it look simpler only to me?
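To make the key/type/value pattern more concrete, here is a small hand-rolled sketch that wraps plain Python values into it. The to_attribute function is hypothetical and only covers a few types; in real code, the TypeSerializer class from boto3.dynamodb.types does this job for you.

```python
def to_attribute(value):
    """Wrap a plain Python value into DynamoDB's type/value form (partial sketch)."""
    if value is None:
        return {"NULL": True}
    # bool must be tested before int, since bool is a subclass of int
    if isinstance(value, bool):
        return {"BOOL": value}
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, (int, float)):
        # DynamoDB transports numbers as strings
        return {"N": str(value)}
    if isinstance(value, list):
        return {"L": [to_attribute(item) for item in value]}
    if isinstance(value, dict):
        # The map case: every value gets the same treatment, then goes under 'M'
        return {"M": {key: to_attribute(item) for key, item in value.items()}}
    raise TypeError(f"Unsupported type: {type(value).__name__}")


data = to_attribute({"name": "example", "count": 3, "active": True})
print(data)
```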

Thursday, April 21, 2022

WTF: Oops, already in my list

I found an interesting loop pattern, used in several places in our code. It goes like this:

I create an empty list
I start a loop to fill it
Inside my loop, I create an object and add it to my list
Still in the loop, I need to modify the object I just added, so I retrieve it from the list

An example here:

templates = []
index = 0

json_templates = json.loads(templates_as_str)
for template_file in json_templates:
    templates.append(get_template(template_file))

    template = templates[index]
    # Do some stuff on my template

    index += 1

So yes, I could have created my template reference before storing it in the list. But it's too late! I need to use an index now...
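For the record, here is roughly what it looks like without the index. This is a sketch: get_template is a stand-in for the real function, and setting a flag stands in for "do some stuff".

```python
import json


def get_template(template_file):
    # Hypothetical stand-in for the real loader
    return {"file": template_file, "processed": False}


def load_templates(templates_as_str):
    templates = []
    for template_file in json.loads(templates_as_str):
        # Keep a reference to the object before storing it
        template = get_template(template_file)
        template["processed"] = True  # do some stuff on my template
        templates.append(template)
    return templates


print(load_templates('["a.json", "b.json"]'))
```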

Wednesday, April 6, 2022

Python: Over formatting

In the Python project I work on, one of my colleagues likes to use the string's format method. Sometimes a bit too much for my taste. For instance, I often see this kind of code:

my_string = "{}-{}".format(part1, part2)

Usually, I prefer to use an f-string:

my_string = f"{part1}-{part2}"

But it could be a matter of taste. However, in some cases, I would prefer to use another approach.

For instance, in the case of the print function, there is already an existing pattern. I often see this kind of code:

print("The value is {}".format(value))

I usually replace it with this code:

print("The value is", value)

A bit more annoying is the use of format inside logging. I often see this code:

logger.info("The value is {}".format(value))

In the case of logging, the pattern is there for a reason. If the logging level is set to WARNING, for instance, you want to avoid the string formatting, which takes some processing time. The preferred pattern is the following:

logger.info("The value is %s", value)
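The difference can be observed with a small experiment. In this sketch, the Expensive class and the logger name are made up; the class simply counts how many times it gets converted to a string.

```python
import logging


class Expensive:
    """Counts how often it is converted to a string."""

    def __init__(self):
        self.str_calls = 0

    def __str__(self):
        self.str_calls += 1
        return "expensive value"


logger = logging.getLogger("over_formatting_demo")
logger.setLevel(logging.WARNING)  # INFO messages are discarded

eager = Expensive()
# format() runs before info() is even called, so the string is always built
logger.info("The value is {}".format(eager))

lazy = Expensive()
# The logger only formats the message if the record is actually emitted
logger.info("The value is %s", lazy)

print(eager.str_calls, lazy.str_calls)  # prints: 1 0
```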

Finally, there are the cases where the use of format is completely insane. Here is an example:

my_string = "{}".format(value)

Maybe my knowledge of Python is too limited, but with value already being a string, is there any difference from this code:

my_string = value

Friday, April 1, 2022

AWS: Read Timeout when Invoking Lambda

I have an AWS Lambda that invokes another Lambda synchronously. Nothing was wrong with it until recently, when the other Lambda started performing more tasks and taking more time. Suddenly, the caller Lambda started failing with this weird message:

ReadTimeoutError: Read timeout on endpoint URL: "https://lambda.eu-central-1.amazonaws.com/2015-03-31/functions/MyOtherLambda/invocations"

Looking at the logs of my other Lambda, I could see that it ran fine, although it was executed several times. After some research on the Internet, I found this article from AWS support that explains that when invoking a Lambda synchronously, there is a default timeout of 60 seconds, and 3 retries. This can be configured when creating the client.

In Python using boto3 for instance, you have to use the Config object:

import boto3
from botocore.config import Config

lambda_client = boto3.client("lambda", config=Config(read_timeout=600))

Now, my invoked Lambda has 10 minutes to perform its tasks.

Monday, March 7, 2022

WTF: retry by recursion

Often, in an application, you'll want to retry some actions when they fail. If you start thinking: "why not use recursion?", stop right there. It's a bad idea. Filling the call stack is never a good idea. It might slow down your whole application. And using Stack Overflow errors to tell you that you should stop retrying is not so great.

Unfortunately for me, someone thought it would be a good idea to introduce this pattern into our production code. Almost everywhere, I find this type of code:

def myfunc(params, tries=0):
    sys.setrecursionlimit(500) #By default 1,000

    try:
        dosomestuff()
    except:
        if tries < sys.getrecursionlimit():
            return myfunc(params, tries+1)
        else:
            print("RECURSION LIMIT HIT for myfunc !")

Please use loops...
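Something like this, for instance. It is a minimal sketch, not a drop-in replacement: the retry helper, its parameters and the flaky function are all invented for the example.

```python
import time


def retry(action, max_tries=3, delay_seconds=0.0):
    """Call action() until it succeeds or max_tries attempts have failed."""
    last_error = None
    for attempt in range(1, max_tries + 1):
        try:
            return action()
        except Exception as error:  # narrow this down in real code
            last_error = error
            if attempt < max_tries:
                time.sleep(delay_seconds)
    # No recursion limit involved: we simply give up and say why
    raise RuntimeError(f"Gave up after {max_tries} tries") from last_error


calls = {"count": 0}


def flaky():
    # Hypothetical action that fails twice before succeeding
    calls["count"] += 1
    if calls["count"] < 3:
        raise ValueError("not yet")
    return "done"


print(retry(flaky))
```

The number of tries is an explicit parameter, the call stack stays flat, and the last error is kept as the cause of the final exception instead of being swallowed.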