Thursday, April 18, 2024

WTF: E-mail Validation

 E-mail validation is usually a hard task, but in our case, we had a simple regular expression that allowed us to accept a list of e-mails in a known format. Here is the regular expression that you could find in our code:

^$|^\s*[\w+.-]+@[a-zA-Z_-]+?\.[a-zA-Z]{2,3}(?:,[\w+.-]+@[a-zA-Z_-]+?\.[a-zA-Z]{2,3})*\s*$

There is a lot going on here, so let's break it down:

  • ^$|: we accept an empty string
  • ^\s*: we ignore leading white space characters
  • [\w+.-]+: the user name part of the e-mail. We accept all word characters, plus signs, dots and dashes.
  • @: the at sign
  • [a-zA-Z_-]+?: the domain name, which can contain any letter, dash or underscore. Note the use of the +? pattern, which is very strange. I had to google it: it is the lazy quantifier, which means take the minimum number of characters needed to satisfy the pattern. Completely useless here, since the character class cannot match a dot and we are looking for a dot character right afterward.
  • \.: the dot character between the domain name and the extension
  • [a-zA-Z]{2,3}: the extension, which can be 2 or 3 letters (like .fr or .com)
  • (?:, ... )*: we repeat the whole pattern here to say that we can have any number of other e-mails separated by commas. Note the strange use of the ?: pattern. I had to google that one too. This is the non-capturing group, which means that it is a group that you cannot retrieve later using group() functions. Useless here, since we never read the captured groups.
  • \s*$: we ignore all trailing space characters
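
The breakdown above can be checked with a few sample inputs. This is only a sketch in Python (the post does not say which regex engine the original code ran on, so that choice is an assumption), and the sample addresses are made up. It also compares the original pattern with a variant where the lazy quantifier and the non-capturing group are removed, to illustrate that they do not change which strings are accepted.

```python
import re

# The original pattern, copied verbatim from the post.
ORIGINAL = re.compile(
    r"^$|^\s*[\w+.-]+@[a-zA-Z_-]+?\.[a-zA-Z]{2,3}"
    r"(?:,[\w+.-]+@[a-zA-Z_-]+?\.[a-zA-Z]{2,3})*\s*$"
)

# Same pattern with "+?" replaced by "+" and "(?:" replaced by "(".
SIMPLIFIED = re.compile(
    r"^$|^\s*[\w+.-]+@[a-zA-Z_-]+\.[a-zA-Z]{2,3}"
    r"(,[\w+.-]+@[a-zA-Z_-]+\.[a-zA-Z]{2,3})*\s*$"
)

# Made-up inputs, chosen to exercise each part of the pattern.
SAMPLES = [
    "",                          # empty string
    "john.doe@example.com",      # single address
    "  a@b.fr,c+d@e-f.com  ",    # list with leading/trailing white space
    "a@b.com, c@d.com",          # space after the comma
    "taro@example.co.jp",        # two-part extension
    "a@b.info",                  # four-letter extension
]


def accepts(pattern, text):
    """Return True when the whole text is accepted by the pattern."""
    return pattern.match(text) is not None


for sample in SAMPLES:
    print(repr(sample), accepts(ORIGINAL, sample), accepts(SIMPLIFIED, sample))
```

On these samples both patterns give the same verdicts: the first three inputs are accepted, the last three are rejected.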

A bit complicated, but still ok. But then, somebody complained that it did not support e-mails from our Japanese branch, which have extensions of the form @domain.co.jp. So someone was assigned the task, and came up with the following regular expression:

^$|^\s*[\w+.-]+@[a-zA-Z_-]+?\.[a-zA-Z]{2,3}(?:,[\w+.-]+@[a-zA-Z_-]+?\.[a-zA-Z]{2,3}\.[a-zA-Z]{2,3})*\s*$

The only difference from the previous one is a new \.[a-zA-Z]{2,3} added inside the parentheses. This means that you can have Japanese-style e-mails, but only after the first e-mail of the list. Worse, from the second e-mail onward you can only have Japanese-style e-mails. I notified the person who committed the code, and he said that he would think about the problem. Of course, the code went to prod...
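
The problem is easy to reproduce. Here is a small sketch in Python (again an assumption about the engine, with made-up addresses) that runs this second pattern against a few lists:

```python
import re

# The "Japanese branch" attempt, copied verbatim from the post.
JAPAN_ATTEMPT = re.compile(
    r"^$|^\s*[\w+.-]+@[a-zA-Z_-]+?\.[a-zA-Z]{2,3}"
    r"(?:,[\w+.-]+@[a-zA-Z_-]+?\.[a-zA-Z]{2,3}\.[a-zA-Z]{2,3})*\s*$"
)


def accepts(text):
    """Return True when the whole text is accepted by the pattern."""
    return JAPAN_ATTEMPT.match(text) is not None


# Made-up inputs showing where the two-part extension is (not) allowed.
SAMPLES = [
    "john@example.com",                     # single classic address
    "taro@example.co.jp",                   # Japanese-style address in first position
    "john@example.com,taro@example.co.jp",  # Japanese-style address in second position
    "john@example.com,jane@example.com",    # classic address in second position
]

for sample in SAMPLES:
    print(sample, accepts(sample))
```

Only the first and third samples are accepted: a .co.jp address is refused in first position, and a plain .com address is refused anywhere after it.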

So I decided to make a quick fix. I removed all the strange patterns, and ended up with the following regular expression:

^$|^\s*[\w+.-]+@[a-zA-Z_-]+(\.[a-zA-Z]{2,3}){1,2}(,[\w+.-]+@[a-zA-Z_-]+(\.[a-zA-Z]{2,3}){1,2})*\s*$
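
As a sanity check, here is a sketch in Python (engine and sample addresses are my assumptions, not taken from the production code) that runs this pattern against both styles of address in any position:

```python
import re

# The quick fix, copied verbatim from the post.
FIXED = re.compile(
    r"^$|^\s*[\w+.-]+@[a-zA-Z_-]+(\.[a-zA-Z]{2,3}){1,2}"
    r"(,[\w+.-]+@[a-zA-Z_-]+(\.[a-zA-Z]{2,3}){1,2})*\s*$"
)


def accepts(text):
    """Return True when the whole text is accepted by the pattern."""
    return FIXED.match(text) is not None


# Made-up inputs: both styles, in both positions, plus two rejected cases.
SAMPLES = [
    "",
    "john@example.com",
    "taro@example.co.jp",
    "taro@example.co.jp,john@example.com",
    "john@example.com,taro@example.co.jp",
    "a@b.co.jp.uk",   # three extensions
    "a@b.info",       # four-letter extension
]

for sample in SAMPLES:
    print(repr(sample), accepts(sample))
```

The first five samples are accepted. The last two are still rejected, which is expected: the pattern only targets a known format, with at most two extensions of 2 or 3 letters each.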

The fix uses the {1,2} quantifier to say that we can have one or two extensions. Meanwhile, the guy who made the first change had also started working on a fix. Small communication problem here: he didn't notice that I had already assigned the bug to myself. But the funny thing is that he had a fix on a branch that was never merged. It looked like this:

^$|^\s*[\w+.-]+@(?:domain)+?(\.[a-zA-Z]{2,3}|\.[a-zA-Z]{2,3}\.[a-zA-Z]{2,3})(?:,[\w+.-]+@(?:domain)+?(\.[a-zA-Z]{2,3}|\.[a-zA-Z]{2,3}\.[a-zA-Z]{2,3}))(?:,[\w+.-]+@(?:domain)+?(\.[a-zA-Z]{2,3}|\.[a-zA-Z]{2,3}\.[a-zA-Z]{2,3}))*\s*$

I don't even want to know if it is correct... 
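
For the curious, here is a sketch in Python that takes the pattern exactly as shown above. Two assumptions: the engine, and the fact that I treat the word "domain" literally, although it may only be a placeholder in this post. The sample addresses are made up.

```python
import re

# The unmerged branch pattern, copied verbatim from the post.
EXT = r"(\.[a-zA-Z]{2,3}|\.[a-zA-Z]{2,3}\.[a-zA-Z]{2,3})"
BRANCH_ATTEMPT = re.compile(
    r"^$|^\s*[\w+.-]+@(?:domain)+?" + EXT
    + r"(?:,[\w+.-]+@(?:domain)+?" + EXT + r")"    # no quantifier: mandatory
    + r"(?:,[\w+.-]+@(?:domain)+?" + EXT + r")*"   # optional further addresses
    + r"\s*$"
)


def accepts(text):
    """Return True when the whole text is accepted by the pattern."""
    return BRANCH_ATTEMPT.match(text) is not None


# Made-up inputs, all using the literal domain name "domain".
SAMPLES = [
    "a@domain.com",
    "a@domain.com,b@domain.co.jp",
    "a@domain.co.jp,b@domain.com",
    "a@domain.com,b@domain.com,c@domain.fr",
]

for sample in SAMPLES:
    print(sample, accepts(sample))
```

Read this way, both styles are accepted in any position, but the second group has no quantifier, so a list with a single e-mail is rejected.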

Friday, March 15, 2024

AWS: Find Root Cause of Failure for CloudFormation Stacks

When a CloudFormation stack fails, you have to scroll back through the events to find the root cause of the failure. Recently, AWS even added a "Detect Root Cause" button to the Console to immediately scroll to the correct event. But how do you do it from a Python script?

import boto3

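def find_root_cause(stack_name):
    """Return the reason of the oldest failure of the latest deployment, or None."""
    cf_client = boto3.client('cloudformation')

    params = {
        "StackName": stack_name
    }
    root_cause = None

    while True:
        # events are returned in reverse chronological order (newest first)
        result = cf_client.describe_stack_events(**params)

        for event in result["StackEvents"]:
            status = event.get("ResourceStatus", "")
            reason = event.get("ResourceStatusReason")

            # start of deployment: older events belong to a previous deployment
            if reason == "User Initiated":
                return root_cause

            # keep overwriting, so that the oldest failure wins
            if reason and "FAILED" in status:
                root_cause = reason

        # fetch the next page of older events, if there is one
        next_token = result.get("NextToken")
        if not next_token:
            return root_cause
        params["NextToken"] = next_token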

You follow the same pattern as in the Console: you walk back through the event history and keep the oldest error message found before reaching the start of the deployment.
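
The same walk can also be written with the boto3 paginator for describe_stack_events, which takes care of NextToken. This is only a sketch: the split into two functions and the name root_cause_from_pages are my own choices, and the sample events are made up so that the logic can be tried without an AWS account.

```python
def root_cause_from_pages(pages):
    """Return the oldest failure reason found before the 'User Initiated' event.

    `pages` is any iterable of DescribeStackEvents-style responses,
    with the newest events first.
    """
    root_cause = None
    for page in pages:
        for event in page["StackEvents"]:
            status = event.get("ResourceStatus", "")
            reason = event.get("ResourceStatusReason")
            # start of deployment: stop here
            if reason == "User Initiated":
                return root_cause
            # keep overwriting, so that the oldest failure wins
            if reason and "FAILED" in status:
                root_cause = reason
    return root_cause


def find_root_cause(stack_name):
    # imported here so that the function above can be tried without boto3
    import boto3

    paginator = boto3.client("cloudformation").get_paginator("describe_stack_events")
    return root_cause_from_pages(paginator.paginate(StackName=stack_name))


# Made-up events, newest first, split over two pages.
SAMPLE_PAGES = [
    {"StackEvents": [
        {"ResourceStatus": "ROLLBACK_IN_PROGRESS",
         "ResourceStatusReason": "The following resource(s) failed to create: [Bucket]."},
        {"ResourceStatus": "CREATE_FAILED",
         "ResourceStatusReason": "Resource creation cancelled"},
    ]},
    {"StackEvents": [
        {"ResourceStatus": "CREATE_FAILED",
         "ResourceStatusReason": "bucket already exists"},
        {"ResourceStatus": "CREATE_IN_PROGRESS"},
        {"ResourceStatus": "CREATE_IN_PROGRESS",
         "ResourceStatusReason": "User Initiated"},
        {"ResourceStatus": "UPDATE_FAILED",
         "ResourceStatusReason": "failure from a previous deployment"},
    ]},
]

print(root_cause_from_pages(SAMPLE_PAGES))
```

With the sample events, the function skips the later "Resource creation cancelled" message and the failure from the previous deployment, and reports "bucket already exists".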

Sunday, March 3, 2024

JFileChooser and the Lost Folder Selection

This article was originally posted on JRoller on July 7, 2005

It might sound like an Indiana Jones movie title, but it is an interesting problem we came across. We have a third-party product which at some point displays a JFileChooser, in which you must select a directory. In old Java 1.4, this dialog box was working properly. Now that we have switched to the brand new 5.0, when we select a folder and click on Open, it does not come back with the folder as the selected value, but instead goes into the folder. The main difference in behavior is that in 1.4, when we selected a folder, its name was shown in the selected file text field, and now it is not.

The colleague who had to solve the problem tried running the program with the 1.4 version of JFileChooser copied into the bootclasspath. It did not help, so I suggested that he try with the UI class instead. And oh surprise, it worked as in the old days. So he started to compare the source code of both versions, and in the ListSelectionListener, he found an interesting difference. A property which was always true before is now set to false by default. So to solve the problem, he inserted the following line in the main method:

UIManager.put("FileChooser.usesSingleFilePane", Boolean.TRUE);

I wonder if these properties are documented somewhere. There seems to be so many of them...

I checked in my more recent version of Java. This parameter still exists, and still does not seem to be documented.