Getting Started with AWS Step Functions Part I

AWS came out with Step Functions a few years ago, and up until recently, I had not had the opportunity to dive in and give them a try. Yes, I could build my own pipeline or state machine, but the idea behind Step Functions is that the service does most of the heavy lifting for you. That, and it ties into other AWS services. So I decided to get started, and looked at the demo options and walkthroughs that were available. None of them met my needs, so I rolled my own.

The idea is to see how I can create a Step Function that will run multiple loops, and call a Lambda function multiple times. What I wanted to test was the following:

  • Pass Variables into the Step Function and see how they are handled
  • Call a Lambda function multiple times
  • Create a loop using the Step Function DSL
  • Test output from Lambda and make a decision based upon it
  • Figure out any gotchas and how to trigger Step Functions

Let’s dive in. What follows is the final result. It took me a few iterations to actually get to this point. Smarter people than I might be able to get it done in one go, but not I.

Lambda Code

I came up with a simple Lambda function written in Python 3.6. All that I wanted to do was to perform a loop with Step Functions, and then output the values. Simple. It could be streamlined, but it was quick and easy to write.

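def lambda_handler(event, context):
    # Log the incoming values (they must be strings for the concatenation)
    print('value1 = ' + event['key1'])
    print('value2 = ' + event['key2'])
    print('value3 = ' + event['key3'])

    # taskresult is only present after the first pass through the loop
    taskresult = event.get('taskresult', None)
    if taskresult is None:
        count = 0
    else:
        count = taskresult.get('count', None)

    # Increment the counter (the first invocation therefore returns 1)
    if count is None:
        count = 0
    else:
        count = count + 1

    if count < 5:
        # Not done yet: return a value that keeps the loop going
        output = {
            'count': count,
            'value1': 'ThereIsNoSpoon',
        }
    else:
        # Done: echo the input values so the Choice state can match on them
        output = {
            'value1': event['key1'],
            'value2': event['key2'],
            'count': count,
        }
    return output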

Now we need to move on to the body of what we are working on, and that would be the Step Function. Step Functions have their own domain specific language (DSL) that is used to define the state machine. I wanted more than just a “Hello World” example. The idea was to loop through a Step Function, make sure that I could call the Lambda multiple times, and then go to either a success or a failed state.

AWS Step Function Code

{
  "Comment": "A Retry example of the Amazon States Language using an AWS Lambda Function",
  "StartAt": "LambdaFunction",
  "States": {
    "LambdaFunction": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:aws-serverless-repository-hello-w-helloworldpython-1JQ8TEEDUAHCE",
      "ResultPath": "$.taskresult",
      "Retry": [
        {
          "ErrorEquals": ["CustomError"],
          "IntervalSeconds": 1,
          "MaxAttempts": 2,
          "BackoffRate": 2.0
        },
        {
          "ErrorEquals": ["States.TaskFailed"],
          "IntervalSeconds": 30,
          "MaxAttempts": 2,
          "BackoffRate": 2.0
        },
        {
          "ErrorEquals": ["States.ALL"],
          "IntervalSeconds": 5,
          "MaxAttempts": 5,
          "BackoffRate": 2.0
        }
      ],
      "Next": "ChoiceState"
    },
    "ChoiceState": {
      "Type": "Choice",
      "Choices": [
        {
          "Variable": "$.taskresult.value1",
          "StringEquals": "value1",
          "Next": "SuccessState"
        },
        {
          "Variable": "$.taskresult.count",
          "NumericLessThan": 5,
          "Next": "LambdaFunction"
        }
      ],
      "Default": "FailState"
    },
    "SuccessState": {
      "Type": "Succeed"
    },
    "FailState": {
      "Type": "Fail",
      "Cause": "Invalid response.",
      "Error": "ErrorA"
    }
  }
}
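
A quick side note on the three Retry entries above. As I understand the States Language, IntervalSeconds is the wait before the first retry, BackoffRate multiplies that wait for each later retry, and MaxAttempts caps the number of retries. The Python below is just that arithmetic as a sketch; retry_waits is a helper name I made up, and it ignores how long the task itself takes to run.

```python
def retry_waits(interval_seconds, max_attempts, backoff_rate):
    """Seconds waited before each retry for a single Retry entry."""
    return [interval_seconds * backoff_rate ** attempt
            for attempt in range(max_attempts)]


# (IntervalSeconds, MaxAttempts, BackoffRate) copied from the definition above
retriers = {
    'CustomError': (1, 2, 2.0),
    'States.TaskFailed': (30, 2, 2.0),
    'States.ALL': (5, 5, 2.0),
}

for error_name, settings in retriers.items():
    waits = retry_waits(*settings)
    print(error_name, waits, 'total seconds:', sum(waits))
```

So the States.ALL entry would wait 5, 10, 20, 40 and 80 seconds across its five retries before giving up.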

The way that the Step Function definition works is as follows. Everything revolves around States, and you have to move from State to State. This is a key concept when it comes to Step Functions. Here, the LambdaFunction Task state runs the Lambda and stores its result under taskresult, and ChoiceState then decides whether to succeed, loop back, or fail. There are multiple State types, but I am not going to go into them now. The key point is that the state machine keeps looping back to the Lambda until the expected value is returned. Looking at it now, it looks like a bunch of gobbledygook. I am going to have to come back and write up how this works later.
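
To make the loop a bit more concrete, here is a rough local simulation in Python. It is only a sketch under a few assumptions: run_state_machine is a made-up helper and not part of any AWS SDK, it only mimics ResultPath and the two Choice rules, and it ignores Retry entirely. The handler is a condensed copy of the Lambda above without the print calls.

```python
def lambda_handler(event, context):
    """Condensed copy of the Lambda's counting logic (no print calls)."""
    taskresult = event.get('taskresult')
    count = 0 if taskresult is None else taskresult.get('count')
    count = 0 if count is None else count + 1
    if count < 5:
        return {'count': count, 'value1': 'ThereIsNoSpoon'}
    return {'value1': event['key1'], 'value2': event['key2'], 'count': count}


def run_state_machine(state_input, max_steps=20):
    """Hypothetical local stand-in for the LambdaFunction -> ChoiceState loop."""
    state = dict(state_input)
    trace = []
    for _ in range(max_steps):
        # Task state: "ResultPath": "$.taskresult" stores the Lambda's return
        # value under taskresult and keeps the rest of the input untouched.
        state['taskresult'] = lambda_handler(state, None)
        result = state['taskresult']
        trace.append(result['count'])
        # Choice state: rules are checked in order and the first match wins.
        if result['value1'] == 'value1':
            return 'SuccessState', state, trace
        if result['count'] < 5:
            continue  # "Next": "LambdaFunction"
        return 'FailState', state, trace  # "Default": "FailState"
    raise RuntimeError('loop did not finish within max_steps')


outcome, final_state, trace = run_state_machine(
    {'key1': 'value1', 'key2': 'value2', 'key3': 'value3'})
print(outcome, trace)
print(final_state)
```

With the three-key input used in this post, the sketch ends in SuccessState after five Lambda calls. Change key1 to anything other than value1 and it ends in FailState instead, because the first Choice rule never matches and the count is no longer below 5.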

This is what the visual representation looks like when viewed in the AWS Step Functions page. There is a defined ‘Start’ and ‘Stop’. The other states match the names used in the definition above. The diagram gives you a model of the code that you can follow.

The cool thing about AWS Step Functions is that they guarantee a run, which is what you want in a situation where you need to ensure that the code is run. That guarantee is largely what you are paying for. Running a Lambda that triggers on SQS would be cheaper, but the run would not be as easy to ensure.

Back to our stuff now. We need to execute the AWS Step Function. Right now, I am not going to go into the logic around passing variables around. Needless to say, you will need to understand that when writing your own, and I am going to have to revisit it.

Execution. A couple of items to note.

  • Each execution has to have a unique name.
    • Note, this will bite you when you are testing, and think about this when executing it via automation.
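  • It takes its input as JSON, just like Lambda.
  • Making a small change in the inputs can cause madness, because both the Lambda and the Choice state depend on the exact keys and values.

This is the input that I used:
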
{
  "key1": "value1",
  "key2": "value2",
  "key3": "value3"
}
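
Since the unique name is the part that bites, here is a small Python sketch of one way to generate it when starting executions from automation. Treat it as an assumption-heavy example: build_execution_request, the demo prefix and the ARN are all made up for illustration, and the naming scheme is just one option. The only real API used is boto3's start_execution on the stepfunctions client, which takes stateMachineArn, name and input.

```python
import json
import uuid
from datetime import datetime, timezone


def build_execution_request(state_machine_arn, payload, prefix='demo'):
    """Build keyword arguments for the Step Functions StartExecution call."""
    stamp = datetime.now(timezone.utc).strftime('%Y%m%dT%H%M%SZ')
    # Timestamp plus a random suffix keeps names unique across repeated runs.
    name = f'{prefix}-{stamp}-{uuid.uuid4().hex[:8]}'
    return {
        'stateMachineArn': state_machine_arn,
        'name': name,
        'input': json.dumps(payload),  # the input is sent as a JSON string
    }


def start_execution(request):
    """Send the request with boto3 (needs boto3 and AWS credentials)."""
    import boto3  # imported here so the builder above works without boto3
    client = boto3.client('stepfunctions')
    return client.start_execution(**request)


# Placeholder ARN for illustration only, not a real state machine.
EXAMPLE_ARN = 'arn:aws:states:us-east-1:123456789012:stateMachine:ExampleLoop'
request = build_execution_request(
    EXAMPLE_ARN, {'key1': 'value1', 'key2': 'value2', 'key3': 'value3'})
print(request['name'])
```

Calling start_execution(request) would then kick off the run. Building the request separately makes it easy to log or inspect the name before anything is sent to AWS.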

The output will also be in JSON, and you can see the results in the visual display.

{
  "key1": "value1",
  "key2": "value2",
  "key3": "value3",
  "taskresult": {
    "value1": "value1",
    "value2": "value2",
    "count": 5
  }
}

This is what the output looks like.

I will go into more detail on the breakdown of the Step Function in the next post. There is a lot to be covered, and this just scratches the surface.