Wednesday, November 22, 2023

Dynamic Class Loading

This article was originally posted on JRoller on June 30, 2005.

The other day, I wanted to write an Eclipse plugin (maybe more about that in a different post), in which I needed to read a selected class file from the project I am working on and execute a method in it. Since I cannot have my project on the plugin's classpath, I found that the only solution is to load the class dynamically. If there is a better solution in Eclipse, please somebody tell me.

Before starting to write my plugin, I decided to write a small test application, because I had never used class loading before. So here is the class I want to load:

package hello;

public class HelloWorld
{
  public void run()
  {
    System.out.println ("Hello World!");
  }
}

To load it and execute its run method, you can use the following lines of code:

        ClassLoader loader = new ClassLoader(getClass().getClassLoader())
        {
            public Class findClass(String name)
            {
                try
                {
                    String path = "C:\\mypath\\hello";
                    File file = new File(path, name + ".class");
                    RandomAccessFile raf = new RandomAccessFile(file, "r");
                    byte[] content = new byte[(int) file.length()];
                    raf.readFully(content);
                    raf.close();

                    return defineClass("hello." + name, content, 0, content.length);
                }
                catch (Exception e)
                {
                    e.printStackTrace();
                }
                
                return null;
            }
        };
        
        try
        {
            Class helloClass = loader.loadClass("HelloWorld");
            Object hello = helloClass.newInstance();
            Method m = helloClass.getMethod("run", new Class[0]);
            m.invoke(hello);
        }
        catch (Exception e)
        {
            e.printStackTrace();
        }

I did not try this code on more recent Java versions, but since the class-loading API has evolved since then, I guess there are other ways to do this nowadays. I tried asking ChatGPT to produce this code, and the result is quite similar, except that it used URLClassLoader, which handles reading the class file content for us.
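For comparison, the same idea is easy to express in Python (the language most of the newer posts here use). This is only a sketch: the module file, its path, and its content are made up for illustration.

```python
import importlib.util
import pathlib
import tempfile

# Write a tiny module to disk, standing in for the HelloWorld.class
# of the Java example above.
directory = tempfile.mkdtemp()
path = pathlib.Path(directory, "hello_world.py")
path.write_text(
    "class HelloWorld:\n"
    "    def run(self):\n"
    "        print('Hello World!')\n"
)

# Load the module dynamically from its file path.
spec = importlib.util.spec_from_file_location("hello", path)
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)

# Instantiate the class and call its method, like the Java reflection code.
module.HelloWorld().run()  # prints: Hello World!
```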

Wednesday, November 15, 2023

Python: ruamel.yaml lib has a problem handling comments

In our project, we use the ruamel.yaml library to read and write YAML files. The reason we are not using Python's basic yaml library is that ruamel follows the YAML standard more closely, keeps comments and formatting, and always dumps keys in the same order.

However, since version 0.18.3, we started seeing some strange behavior in our file dumps: some newlines were removed from some files. I opened ticket #492 with the following code that replicates the problem:

import ruamel.yaml

y = ruamel.yaml.YAML()
with open("organizational_units.yaml", "r") as file:
    ou = y.load(file)

with open("organizational_units.yaml", "r") as file:
    content = y.load(file)

content["organizational_units"] = ou["organizational_units"]

with open("test.yaml", "w") as file:
    y.dump(content, file)
with open("test.yaml", "w") as file:
    y.dump(content, file)

with open("test.yaml", "r") as file:
    y.load(file)

It is of course an oversimplified version of what we are doing in our project. We normally load several YAML files and combine them into one big model. Then, when we need to save changes to one file, we first reload it into memory in order to retrieve the original comments at the beginning of the file, before replacing the old content with the new one.

You can also see that we are saving the file twice. In reality, we first save into an in-memory string stream, so that the content can be logged (at least in debug mode), and only then do we save it to disk. Again, the code here is a simplification, just to demonstrate the problem.

The problem occurs on the second save. The first works fine. Using this file as an example:

# Organizational Unit Specification

organizational_units:

- Name: root
  Accounts:
  - FirstAccount
  - SecondAccount

After the second save, we have this result:

# Organizational Unit Specification

organizational_units: -
  Name: root
  Accounts:
  - FirstAccount
  - SecondAccount

Notice the missing newlines?

The last line of the code loads the resulting file back, just to show that it can no longer be parsed.

After opening the ticket, I got an answer on the same day (nice reactivity!) saying that it is in fact a duplicate of ticket #410. #410 is a bit different, because it duplicates the complete structure, while we only replace a part of it; maybe that is why our code was still working before. I think the change that broke it comes from this fix: "fix issue with spurious newline on first item after comment + nested block sequence".

As the developer explains, the issue comes from the way the library stores comments internally. Comments are stored in several places, sharing the same reference, and when they are dumped, some internal bookkeeping avoids saving them several times. By replacing the content under the top key, we broke some of those comment references.

As a workaround, I save and restore the comment references around the replacement of the top key:

comments = content["organizational_units"].ca.comment
content["organizational_units"] = ou["organizational_units"]
content["organizational_units"].ca.comment = comments

Worked for me...

Friday, November 3, 2023

JComboBox Editor Listening

This article was originally posted on JRoller on June 3, 2005.

To listen to edit events in the editor component of an editable JComboBox:

((JTextComponent)comboBox.getEditor().getEditorComponent()).getDocument().addDocumentListener(listener);


Wednesday, October 25, 2023

Moto: Alias Issue when Creating S3 Access Point is Fixed

Two days ago, I discovered a small bug in the moto library, which I use to unit test my lambdas on AWS. I needed to create an S3 Access Point with boto3, and while retrieving its alias, I got different results depending on whether I used the return value of create_access_point or get_access_point.

So I wrote a small unit test:

from moto import mock_s3control
import boto3

@mock_s3control
def test_access_point_alias():
    client = boto3.client("s3control")

    alias_from_create = client.create_access_point(
        AccountId="123456789012",
        Name="my-access-point",
        Bucket="MyBucket",
    )["Alias"]

    alias_from_get = client.get_access_point(
        AccountId="123456789012",
        Name="my-access-point",
    )["Alias"]

    assert alias_from_create == alias_from_get

I create an S3 Access Point and retrieve its alias in two ways: from the response of the create_access_point call, and from get_access_point. With moto 4.2.6, this test fails.

So I opened an issue on the project's repository. It was fixed and closed on the same day. That's reactivity!

Thursday, October 12, 2023

AWS: The Next Token Pattern

When developing for AWS, there is a pattern you use whenever a call to a service may return a lot of data: you get back part of the data, together with a token that you can provide to get the next batch.

As an example, let's take the service that returns the list of events from a CloudFormation stack deployment. Here is how you would do it using Python and boto3:

cf_client = boto3.client("cloudformation")
response = cf_client.describe_stack_events(StackName="mystack")
# do something with the response

while response.get("NextToken"):
    response = cf_client.describe_stack_events(
        StackName="mystack",
        NextToken=response.get("NextToken")
    )
    # do again something with the response

However, there is one thing I do not like about this pattern: code repetition. You call the service in two different places, with almost identical parameters, and you process the response in the same way, again in two places. If you have to fix something in this code, you have to remember to fix it in both places.

The approach I use to keep the code in one place takes advantage of Python's ability to pass parameters as a dictionary. Here is my version of the pattern:

cf_client = boto3.client("cloudformation")
next_token = "FIRST TIME"
params = {"StackName": "mystack"}

while next_token:
    response = cf_client.describe_stack_events(**params)
    # do something with the response

    next_token = response.get("NextToken")
    params["NextToken"] = next_token

I store my parameters in a dictionary, and I initialize the next token with something non-empty, so the loop always runs at least once. The first call goes out without a token; after processing the response, I read the next token and add it to the parameter dictionary, so the second time round the service is called with the token.
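To see the pattern run end to end without hitting AWS, here is a self-contained sketch where a stub function stands in for describe_stack_events (the pages and the token format of the stub are made up for illustration):

```python
# Stub standing in for the paginated describe_stack_events call
# (the page contents and token format here are made up for illustration).
PAGES = [
    {"StackEvents": ["e1", "e2"], "NextToken": "page-2"},
    {"StackEvents": ["e3"], "NextToken": "page-3"},
    {"StackEvents": ["e4"]},  # no NextToken: this is the last page
]

def describe_stack_events(StackName, NextToken=None):
    index = 0 if NextToken is None else int(NextToken.split("-")[1]) - 1
    return PAGES[index]

# The pattern itself: parameters in a dict, non-empty sentinel to enter the loop.
params = {"StackName": "mystack"}
next_token = "FIRST TIME"
events = []

while next_token:
    response = describe_stack_events(**params)
    events.extend(response["StackEvents"])  # "do something with the response"

    next_token = response.get("NextToken")
    params["NextToken"] = next_token

print(events)  # ['e1', 'e2', 'e3', 'e4']
```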

Something funny happened when I started using this pattern in our production code. We tried Bandit, a tool that analyzes your code looking for security issues, and it would systematically flag my pattern with this error:

[B105:hardcoded_password_string] Possible hardcoded password: 'FIRST TIME'

Well, I had to slightly modify my pattern to avoid using the word token...

Wednesday, October 11, 2023

AWS: Simpler S3 File Deletes by Prefix

I came across this code that deletes files in an S3 bucket from a list of prefixes:

s3_client = boto3.client('s3')
for prefix in prefix_list:
    paginator = s3_client.get_paginator('list_objects_v2')
    file_list = paginator.paginate(
        Bucket=data_bucket,
        Prefix=prefix
    )
    for current_content in file_list:
        for current_file in current_content.get('Contents', []):
            current_key = current_file['Key']
            response = s3_client.delete_object(
                Bucket=data_bucket,
                Key=current_key
            )

The code creates an S3 client and then, for each prefix in the list, creates a paginator. Paginators are great because they keep memory usage down when the list of files is big. Using this paginator, the code retrieves the list of all files matching the prefix and deletes them one by one.

There is nothing wrong with this code; it works nicely. My only remark is that there is a simpler way: instead of using the S3 client, you can create an S3 Bucket resource, and from there delete all files under a prefix with a simple filter:

s3_resource = boto3.resource('s3')
bucket = s3_resource.Bucket(data_bucket)
for prefix in prefix_list:
    bucket.objects.filter(Prefix=prefix).delete()

Simpler!

Friday, October 6, 2023

AWS: Automatic Subscription Confirmation from SQS Queue to SNS Topic

We have an architecture in AWS where events from different accounts need to be sent to one central SQS queue. Since the events cross both accounts and regions, one way to do it is to send them to a local SNS topic.

The SQS queue has to subscribe to all those topics, but we cannot create the subscriptions on the SQS side, since the queue does not know each time someone spins up a new account. The problem with having the SNS topics create the subscriptions, however, is that they then wait for a confirmation from the SQS queue.

Since we already have a lambda waiting on the other side of the queue, handling all the events, we added a small piece of code to handle the subscription confirmation as well. Here it is:

import json
import urllib.request

def lambda_handler(event, context):
    for record in event["Records"]:
        body = json.loads(record["body"])

        if body.get("Type") == "SubscriptionConfirmation":
            handle_subscription_confirmation(body)

def handle_subscription_confirmation(message):
    url = message["SubscribeURL"]

    with urllib.request.urlopen(url) as response:
        print(response.read())
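The handler can be exercised locally with a hand-built event and a mocked urlopen. The record shape below is an assumption based on how SQS delivers raw SNS messages, and the URL is made up:

```python
import json
import urllib.request
from unittest import mock

def handle_subscription_confirmation(message):
    # Visiting SubscribeURL confirms the pending SNS subscription.
    with urllib.request.urlopen(message["SubscribeURL"]) as response:
        print(response.read())

def lambda_handler(event, context):
    for record in event["Records"]:
        body = json.loads(record["body"])
        if body.get("Type") == "SubscriptionConfirmation":
            handle_subscription_confirmation(body)

# A hand-built SQS event wrapping an SNS SubscriptionConfirmation message
# (field names and URL are assumptions for this local test).
event = {
    "Records": [{
        "body": json.dumps({
            "Type": "SubscriptionConfirmation",
            "SubscribeURL": "https://sns.eu-west-1.amazonaws.com/confirm",
        })
    }]
}

# Mock urlopen so no real HTTP call is made.
with mock.patch("urllib.request.urlopen") as fake_urlopen:
    fake_urlopen.return_value.__enter__.return_value.read.return_value = b"<ok/>"
    lambda_handler(event, None)
    visited = fake_urlopen.call_args[0][0]

print(visited)  # the SubscribeURL that was fetched
```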

I find it strange that the CloudFormation template we use to create the subscription does not handle the confirmation as well. Or maybe it does not work cross-account?