How to configure an environment for a Python Poetry based project.

How do I get started?

I recently switched over to Poetry as the package manager for my project CfnMason. (CfnMason is a tool related to CloudFormation stack management, but you can read about that in the Readme as it is updated.) The question is, how do you use Poetry when you are working on a project across multiple machines and operating systems? This post is my attempt to answer that.

So, you want to join a project, or work on a project that is using the Poetry dependency management tool. Great! But how do you get the requirements set up for the project so that you can start working on it? How do you know which version of Python to use, which packages to install, how to build the project, or how to run the test suite?

This is an issue that I was facing, not with someone else's project, but with my own as I was switching between machines. Now, I do know some of the answers to the questions above, but I was still stuck on how to set up the project on another machine. As such, I decided to walk through the process of coming onto a new project and determining how to work with it. The project used for this walk-through is CfnMason, a tool that manages some aspects of building and deploying CloudFormation stacks on AWS.

Hopefully the project that you are working on has a Readme file. Though, to be fair, documentation is hard, and is often the last thing that is added to a project. If the Readme has setup instructions, you should be able to follow them, but if they are not provided, then the following steps are the way that I would go about working on a project that uses Poetry for dependency management. Oh, and as a note: I am assuming that you already have Poetry installed.

Steps to work on a Poetry based project

  1. Determine that the project is using Poetry
  2. Check the version of Python that is needed
    1. Validate local Python version
    2. Install if missing
  3. Create a virtual environment for building the application
    1. venv - for packaging and validation
    2. venv-dev - for building the package
  4. Install project dependencies
  5. Build the project
  6. Run tests

Determine the project is using Poetry.

If the Readme does not tell you that the project is using Poetry, then there is a quick way to find out.

  • Look for the file pyproject.toml in the base of the project.
  • Open the file and look for the following line
    • [tool.poetry]

Provided you are able to find this file, and line, then the project is using Poetry.
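If you would rather script that check, here is a minimal sketch in Python. It scans pyproject.toml for the [tool.poetry] line, just like the manual steps above. The function name uses_poetry is my own invention and is not part of Poetry, and a plain line scan is an assumption that the table header sits on its own line.

```python
from pathlib import Path


def uses_poetry(project_dir: str = ".") -> bool:
    """Return True if pyproject.toml in project_dir contains a [tool.poetry] line."""
    pyproject = Path(project_dir) / "pyproject.toml"
    if not pyproject.is_file():
        return False
    # A plain line scan mirrors the manual check; it is not a full TOML parse.
    lines = pyproject.read_text(encoding="utf-8").splitlines()
    return any(line.strip() == "[tool.poetry]" for line in lines)


if __name__ == "__main__":
    print("Poetry project" if uses_poetry() else "No [tool.poetry] section found")
```

Run it from the base of the project, or pass the project directory as the argument.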

Determine the version of Python

One nice thing about Poetry is that it has a defined location that identifies the supported version of Python. I am a big fan of this, as the differences between versions can cause major problems. Take, for example, async and await, which became reserved keywords in Python 3.7, so code that used them as variable names in 3.6 no longer parses. The steps are as follows.

  • Open the project file pyproject.toml
  • Find the [tool.poetry.dependencies] section
  • Find the line starting with python, which lists the supported versions.
    • Ex. python = "^3.6" (3.6 or later, but below 4.0)
  • If you don’t have that version of Python on your system, install it.
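To check whether the Python you are running fits a constraint like the one above, something along these lines should do. This is only a sketch: caret_allows is a hypothetical helper of mine, and it handles just the plain "^X.Y" or "^X.Y.Z" form with a non-zero major version. Poetry accepts many other constraint forms that are ignored here.

```python
import sys


def caret_allows(constraint: str, version: tuple) -> bool:
    """Check a version tuple against a simple caret constraint such as "^3.6".

    "^3.6" is read as: at least 3.6, but below the next major version (4.0).
    """
    if not constraint.startswith("^"):
        raise ValueError(f"unsupported constraint: {constraint!r}")
    lower = tuple(int(part) for part in constraint[1:].split("."))
    if lower[0] == 0:
        raise ValueError("caret rules for 0.x versions are not handled here")
    upper = (lower[0] + 1,)
    # Tuples compare element by element, which matches version ordering.
    return lower <= tuple(version) < upper


if __name__ == "__main__":
    current = tuple(sys.version_info[:3])
    print(current, "satisfies ^3.6:", caret_allows("^3.6", current))
```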

Setting up a local Virtual Env

I am a huge proponent of using a virtual environment for each application that I am working on. In some cases, I will have two: one that includes all the development modules, and one with just the modules needed for the application to run. Since Python 2 is pretty much EOL, I am not going to spend any time on how to set up a virtual environment for Python 2. Instead, this is all dedicated to Python 3. And I can only guarantee this on Python 3.6 or later.

foo$ python -m venv venv-dev
foo$ python -m venv venv
foo$ ls -ld venv*
drwxr-xr-x 1 foobar 197610 0 Nov 24 17:02 venv/
drwxr-xr-x 1 foobar 197610 0 Dec 25 14:19 venv-dev/
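Since I bounce between Windows and Linux, it can be handy to create both environments from Python itself rather than from the shell. The sketch below uses the standard library venv module; the helper name create_environments and its defaults are my own assumptions, chosen to match the directory names above.

```python
import venv
from pathlib import Path


def create_environments(project_dir, names=("venv", "venv-dev"), with_pip=True):
    """Create one virtual environment per name inside project_dir."""
    created = []
    for name in names:
        target = Path(project_dir) / name
        # Roughly what "python -m venv <name>" does from the command line.
        venv.create(target, with_pip=with_pip)
        created.append(target)
    return created
```

Calling create_environments(".") from the project root should leave you with the same two directories as the commands above.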

Install project dependencies

For this last part, you need to activate one of the Python virtual environments and then run the install command from there. This is only if you really want to install it both ways. If not, then you can just create a single virtual env directory and install all the dependencies.

Install all the dependencies, even the ones needed for development.

foo$ poetry install
Installing dependencies from lock file


Package operations: 10 installs, 0 updates, 0 removals

  - Installing more-itertools (7.2.0)
  - Installing zipp (0.6.0)
.....
  - Installing cfnmason (0.1.0)

The other option is to install just the libraries needed to execute and run the module. I would almost prefer if it defaulted to the method below, but it works. (Poetry 1.2 and later deprecate --no-dev in favor of the --only main and --without dev options.)

foo$ poetry install --no-dev
Installing dependencies from lock file

Nothing to install or update

  - Installing cfnmason (0.1.0)
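If you want to script the activate-then-install step, here is a rough sketch. It leans on my understanding that Poetry installs into whichever virtual environment is currently active, which it detects through the VIRTUAL_ENV variable. The helper names activated_env and poetry_install are hypothetical, and the --no-dev flag is simply the one used above.

```python
import os
import subprocess
from pathlib import Path


def activated_env(venv_dir: str) -> dict:
    """Return a copy of os.environ that mimics an activated virtual environment."""
    venv_path = Path(venv_dir).resolve()
    bin_dir = venv_path / ("Scripts" if os.name == "nt" else "bin")
    env = dict(os.environ)
    # These are the same changes the activate script makes to the shell.
    env["VIRTUAL_ENV"] = str(venv_path)
    env["PATH"] = str(bin_dir) + os.pathsep + env.get("PATH", "")
    env.pop("PYTHONHOME", None)
    return env


def poetry_install(venv_dir: str, no_dev: bool = False) -> None:
    """Run "poetry install" as if venv_dir had been activated first."""
    command = ["poetry", "install"]
    if no_dev:
        command.append("--no-dev")  # flag as used in this post; newer Poetry has replacements
    subprocess.run(command, env=activated_env(venv_dir), check=True)
```

With the two environments from earlier, that would be poetry_install("venv-dev") for everything and poetry_install("venv", no_dev=True) for the runtime libraries only.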

Start working on the Project

That is it. You should be up and running, at least to the point where you can get started with the project. Moving forward from this point will rely a lot upon how the project is set up, and how well it is documented. But the big factor is that you can now start working on it while using Poetry, and you have the foundation to work on a project across multiple machines.

Setting up the cfnmason project for Python

Cfnmason is yet another tool that can be used to manipulate CloudFormation stacks. It is not designed to be a replacement for CloudFormation, the way Terraform is, but a means of making stacks easier to build and manage. I had written a version of this ages ago in Ruby, but with most of my work now being in Python, I am creating a new version in Python. As I have never created an exportable Python package, this will track the process of building and releasing a new PyPI package.

Oh, and to keep things interesting, I am doing this on a mix of Windows 10 and Linux.

Creating the default layout using Poetry

The first thing that we need to do is to create the base package. I could do it by hand, but I want to try out the Poetry package and see if I will hate the decision later.

c:\dev> poetry new cfnmason
...
c:\dev\cfnmason> tree /f /a

|   pyproject.toml
|   README.rst
|
+---cfnmason
|       __init__.py
|
\---tests
        test_cfnmason.py
        __init__.py

Init a new Git repo and add a .gitignore file

These are more civilized times. As such, I almost always create a git repository when I am working on a project, even small ones. They may or may not be public, and the platform I use to host my code can vary. This time I am opting for GitHub, and have uploaded the code.

I have been using gitignore.io to generate base .gitignore files for ages. Type in a few operating systems, the language you are coding for, any IDEs, etc, and you can have a useful .gitignore file right out of the gate. Sure you can do it by hand, but this is quick and easy, and it can always be edited later.

Modifying the Readme file

Poetry starts you with a README.rst file. I don’t know about you, but I have been working with Markdown for ages. It is common on a number of platforms and there is support for it in a number of editors. I understand that RST files are really designed for technical documentation, but I can cross that bridge later. For now, I need a decent starting point.

We could have started by writing out the template by hand. This would have been long and tedious. Instead, I am using a template. By using a template, I am up and running quickly. I can add and remove parts that I do or do not need, and hopefully, I will not forget a major part. As this is the first pass, I am not going to update the entire thing, but at least get the project name in and the fact that I am working on it.

The license file

The last thing needed before I start working on the code is the license file. Which license you choose is up to you. Personally, I like to use the MIT license. It lets people do what they want. As this is an open source project, I feel that people should be able to use it as much or as little as they want.

The easiest way to do this is through GitHub. Just add a new file via the web interface and name it LICENSE or LICENSE.md in all caps. Then you will have the option to choose from a list of known licenses.

That is it and next steps

Hmmm. This took longer to write than the work to get the base package up and running. And that is to be expected. But it should also show some of the thoughts and considerations that are needed when creating a new project, especially if it is going to be open to the world.

With that I will leave you to it. In the next couple of days, I am going to start adding in the libraries that are needed to begin working on the project. As I have written this in Ruby before, I have a rough layout in my head of how I want the application to work. And, I know what libraries are needed to meet the core functionality.

The question that I have for myself is if I should take the time to ensure that I update the design docs. Having a design doc is a great way to ensure that you are not missing any features. I am going to have to sleep on it.

How do I set up a new Python project?

Python prides itself on there only being one best way of doing things. However, if you have ever had the desire to create a package that can be distributed in Python, then you may have run into some frustration. Out of the box, there does not appear to be a single clear, concise way of creating a new project. After hearing over and over again that the correct way in Python should be obvious, it seems that in this area that is definitely not the case.

There should be one-- and preferably only one --obvious way to do it.

— Zen of Python

However, it does seem that there are a couple of solutions that have been put forward to solve this problem, and none of them have officially been endorsed by the people that manage Python. You can do it all by hand, and I give you credit if you do, but this is more work than I want to keep up with. I started to walk through the documentation to use setuptools to configure a package, and almost wanted to cry. Not really, but when you cannot give a clean, concise way to build and distribute a Python package, there is a problem.

The only thing that is clear is that you should have a setup.py file. Other than that, you are on your own. As I said before, the documentation from setuptools is laughable. Maybe if I did not mind reading pages upon pages to get a simple package started I would be OK, but I don’t have the patience for it. I thought that it would make sense to look at existing packages, but once again, everybody seems to set up their projects differently. Don’t get me wrong, there is nothing wrong with developers taking their own routes to set up a package, especially after seeing how little guidance there is. Maybe packaging was an afterthought for Python.

Having used Ruby in the past, I was sure that there had to be packages out there to assist in setting up the initial package structure and environment. Now, I do not expect it to write the code for me, but the basics of what is in the package, file layout, author information, etc. And after a bit of searching and head scratching I finally found what I was looking for. Well, at least I got a bit further down the rabbit hole. I had honestly thought that by this point I would be working on migrating my app from Ruby to Python, not trying to figure out how to create a package.

What tools are there?

It seems that upon first glance, the top three tools for creating Python packages are Poetry, Pipenv, and Hatch. And when I say creating, I mean creation and management of the packages. There is always doing it by hand, and maybe I will end up there, but that goes against the grain of automation.

I have spent a bit of time looking at these three options, plus managing it by hand. After looking at it for a bit, I think that I am going to throw Pipenv out the window. The lack of proper documentation is a stopping point for me. Also, it seems like there is a fork that is now responsible for the actual development, and not the original source itself.

That leaves me with Hatch and Poetry. Decisions, decisions, decisions. I am not sure which route I am going to go down.

Which one do you choose?

I think that I am going to start by creating my project using both methods. Early on, it should be simple enough to create the source and copy it between projects. The real question will be which one makes management easier in the long run. That means, in my next post, I am going to start building, or more accurately, rebuilding cfnmason.

Within a week or two I should be able to make my decision. But, I am going to build them out from scratch both ways, and record the process. Heck, that will probably be harder than the coding itself. Documenting this stuff is not easy.

See you in a week.

How do I get rid of the bell in Vim on PyCharm on Windows?

One of the most infuriating things in the world is when the bell goes off continuously when using the Vim plugin in PyCharm. Actually, this holds true for all of the JetBrains IDEs, but I have been working mostly with Python recently, and as such, it is PyCharm that is on the top of my mind.

While there are a number of plugins that you can get for the JetBrains products, the one that I always end up installing is Vim. Call me old school, but back in the day, I had to work on remote systems. It was either Vim or Pico, and I chose Vim. Now that I have modern IDEs, I still end up using Vim for all my editing needs. The muscle memory is built in, and I don’t even think about it anymore.

But what about the Bell?

While I have used Linux or Mac as my primary development environment for the last 10 years, I recently decided to see how well I can survive while coding on Windows. That being what it is, I installed PyCharm, got the Vim plugin installed, and went to get to work. And then it happened, the audible bell of hell. I added my regular vimrc file into my home directory, and no dice.

The bell was still there.

I double checked my settings, but to no avail. The stupid bell would not go away. It was driving me crazy. I turned off the volume on my computer while I was working on the issue. And then, after digging through the JetBrains documentation, I found the answer. The Vim plugin for the JetBrains IDEs does not look at the .vimrc file.

The fix….

The Vim plugin for PyCharm and all the JetBrains products uses the .ideavimrc file instead. Create that file in your home directory and add the following:

set noeb
set novb
set belloff=all

Then, all you have to do is save and exit. Restart the IDE, and you are done.

No more bell. Well, at least on Windows. This should not have been that hard, but then again, I suspect that with some of the settings that are built into the various IDEs, they did not want to pick up everything in a normal .vimrc file. I may dig more into that in the future, but the next step is to see how well Python plays on Windows compared with Linux/Mac.

An Introduction to AWS Step Functions – Part 2

In Part 1, I went through a whirlwind tour and left example code on how to create a Lambda function and a Step Function. The thing is, I did not explain how the code works. That is what this article is about. I am going to be using the same code from the previous post, but will be going through the code and explaining it. We shall see how well the formatting works.

Jumping right into it. This is the entire code for the Step Function from the previous post. As of right now, this code has to be JSON. I did not find any information about supporting YAML. Here is the documentation provided by AWS. It is pretty thorough, but it could be a bit clearer. My explanation could be better or worse.

{
  "Comment": "A Retry example of the Amazon States Language using an AWS Lambda Function",
  "StartAt": "LambdaFunction",
  "States": {
    "LambdaFunction": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:aws-serverless-repository-hello-w-helloworldpython-1JQ8TEEDUAHCE",
      "ResultPath": "$.taskresult",
      "Retry": [
        {
          "ErrorEquals": ["CustomError"],
          "IntervalSeconds": 1,
          "MaxAttempts": 2,
          "BackoffRate": 2.0
        },
        {
          "ErrorEquals": ["States.TaskFailed"],
          "IntervalSeconds": 30,
          "MaxAttempts": 2,
          "BackoffRate": 2.0
        },
        {
          "ErrorEquals": ["States.ALL"],
          "IntervalSeconds": 5,
          "MaxAttempts": 5,
          "BackoffRate": 2.0
        }
      ],
      "Next": "ChoiceState"
    },
    "ChoiceState": {
      "Type": "Choice",
      "Choices": [
        {
          "Variable": "$.taskresult.value1",
          "StringEquals": "value1",
          "Next": "SuccessState"
        },
        {
          "Variable": "$.taskresult.count",
          "NumericLessThan": 5,
          "Next": "LambdaFunction"
        }
      ],
      "Default": "FailState"
    },
    "SuccessState": {
      "Type": "Succeed"
    },
    "FailState": {
      "Type": "Fail",
      "Cause": "Invalid response.",
      "Error": "ErrorA"
    }
  }
}

Looking at this for the first time, it all seems a bit overwhelming. However, after breaking it down, you will see that while you have to be precise with it, Step Functions are fairly easy to read. (Writing them is still a PITA, but that is just what it is.) When you get down to it, the Step Function configuration file only contains 3 sections: 'Comment', 'StartAt', and 'States'. That is it.

Let us take a quick look at the first two mentioned above, Comment and StartAt. The former is just a comment so you know what the purpose of the Step Function is, though you could use it for whatever comment you like. Then, there is StartAt. StartAt actually links to a State: its value is the name of one of the States, and that is where execution begins. The States section is what contains all the real meat of the Step Function.
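To make that link concrete, here is a small Python sketch that loads a trimmed copy of the definition above and confirms that StartAt names one of the States. The helper start_state is my own illustration and is not part of any AWS library, and the trimmed definition keeps only the fields needed for the check.

```python
import json

# Trimmed copy of the state machine above; only the fields needed here are kept.
DEFINITION = json.loads("""
{
  "Comment": "A Retry example of the Amazon States Language using an AWS Lambda Function",
  "StartAt": "LambdaFunction",
  "States": {
    "LambdaFunction": {"Type": "Task", "Next": "ChoiceState"},
    "ChoiceState": {"Type": "Choice", "Default": "FailState"},
    "SuccessState": {"Type": "Succeed"},
    "FailState": {"Type": "Fail"}
  }
}
""")


def start_state(definition: dict) -> dict:
    """Return the State that StartAt points at, or fail if it does not exist."""
    name = definition["StartAt"]
    states = definition["States"]
    if name not in states:
        raise ValueError(f"StartAt points at unknown State {name!r}")
    return states[name]


print(start_state(DEFINITION)["Type"])  # prints: Task
```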

States

States are where everything really happens when it comes to Step Functions. They are the individual units that, when strung together, make the magic happen. When you look at it from this perspective, it sounds ridiculously easy. But, just like everything else, the devil is in the details. The key to understanding States is to know the various components that make up a State, and the types of States that are available.

Since this is an introduction, and based off of my code here, I am not going to go through all the options. However, there are docs from Amazon on it. The thing is, it is not simple. Based on the docs, the only required field is Type. Great, but what are the various types? How do I link them together to make a cohesive whole? What are these other options, and how do I pass variables from one section to the next and use them over and over again in a loop? This information is available, but it is scattered all over the documentation and you have to piece it all together to figure out what is available.

What are the State types?

  • Pass – used to take input and pass it to output. Basically, used to debug Step Functions, so you know what the heck is happening between States.
  • Task – this is where the work gets done. It can be a Lambda function or any other supported service. This will take your input and from there provide output to another State to be consumed.
  • Choice – handles branching in your Step Function by looking at the output from the previous State and sending the Step Function on to the next decided upon State in the workflow.
  • Wait – allows you to put a pause in your Step Function to wait for an action to finish, like deploying CloudFormation or some other long running task.
  • Succeed – the state machine has finished with success. This one is actually simple and does just what it says.
  • Fail – the end state where all has not gone according to plan. This is the opposite of Succeed, and allows for some reporting on what went wrong.
  • Parallel – run States at the same time. I have not yet worked with this, so I do not have as much information as I would like on this type of State.

In the example above, I use only 4 of the State types listed: Task, Choice, Succeed, and Fail. Mainly because I wanted to touch on a good bit of the basics without going too far down the rabbit hole. For my example, my Task was a Lambda function. To me this is the most logical way to use Step Functions, but there are probably many more that I have not considered. I also wanted to employ a loop, because I wanted to test flow control and passing variables to and from Lambda functions. I could have strung the Lambda functions together in a straight line, but I wanted to test out branching as well.

The big factor to remember is that the order of the States does not matter. What matters is linking which State comes next. You could use a single Choice State to manage the flow through the entire Step Function, and based upon the input move onto any number of other States or back to a previous one.
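Here is a quick way to convince yourself that order does not matter. The sketch below lists the same four States in a scrambled order and then follows the Next, Default, and Choices links starting from StartAt. The function names are mine, and Retry and Catch blocks are ignored to keep it short.

```python
from collections import deque

# Same States as the example above, deliberately listed in a scrambled order.
DEFINITION = {
    "StartAt": "LambdaFunction",
    "States": {
        "FailState": {"Type": "Fail"},
        "ChoiceState": {
            "Type": "Choice",
            "Choices": [
                {"Variable": "$.taskresult.value1", "StringEquals": "value1", "Next": "SuccessState"},
                {"Variable": "$.taskresult.count", "NumericLessThan": 5, "Next": "LambdaFunction"},
            ],
            "Default": "FailState",
        },
        "SuccessState": {"Type": "Succeed"},
        "LambdaFunction": {"Type": "Task", "Next": "ChoiceState"},
    },
}


def next_states(state: dict) -> list:
    """Collect every State name this State can hand off to."""
    targets = [state[key] for key in ("Next", "Default") if key in state]
    targets += [rule["Next"] for rule in state.get("Choices", [])]
    return targets


def reachable(definition: dict) -> set:
    """Walk the links from StartAt and return the names of all States visited."""
    states = definition["States"]
    seen = set()
    queue = deque([definition["StartAt"]])
    while queue:
        name = queue.popleft()
        if name in seen:
            continue  # the loop back to LambdaFunction lands here
        if name not in states:
            raise ValueError(f"transition to unknown State {name!r}")
        seen.add(name)
        queue.extend(next_states(states[name]))
    return seen


print(sorted(reachable(DEFINITION)))
# prints: ['ChoiceState', 'FailState', 'LambdaFunction', 'SuccessState']
```

All four States are still reached, because only the names in the links decide what runs next.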

[code]
      ],
      "Next": "ChoiceState"
    },
[/code]

This little block of code is a great example. The Next keyword is used, and the value that follows it is the name of the next State block that is going to be executed. The name of each State is the key that opens its JSON block. In the example above, I am not going to a State of type Choice, but have created a State that is called ChoiceState. In retrospect, much of this could have been done more cleanly, but it was an example that I put together rather quickly. This is the declaration:

[code]
    },
    "ChoiceState": {
      "Type": "Choice",
      "Choices": [
[/code]
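To show how that Choice block ends up picking the next State, here is a rough Python simulation of it. This is only my sketch of the behavior, not AWS code: it covers just the two comparisons used in my example (StringEquals and NumericLessThan), it only understands simple "$.a.b" paths, and it assumes the Lambda output always contains the fields being tested.

```python
CHOICE_STATE = {
    "Type": "Choice",
    "Choices": [
        {"Variable": "$.taskresult.value1", "StringEquals": "value1", "Next": "SuccessState"},
        {"Variable": "$.taskresult.count", "NumericLessThan": 5, "Next": "LambdaFunction"},
    ],
    "Default": "FailState",
}


def lookup(path: str, data: dict):
    """Resolve a simple "$.a.b" path against the State input."""
    if not path.startswith("$."):
        raise ValueError(f"unsupported path: {path!r}")
    value = data
    for key in path[2:].split("."):
        value = value[key]
    return value


def matches(rule: dict, data: dict) -> bool:
    """Evaluate one choice rule; a value of the wrong type never matches."""
    value = lookup(rule["Variable"], data)
    if "StringEquals" in rule:
        return isinstance(value, str) and value == rule["StringEquals"]
    if "NumericLessThan" in rule:
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        return is_number and value < rule["NumericLessThan"]
    raise ValueError("comparison not covered by this sketch")


def choose_next(choice_state: dict, data: dict) -> str:
    """Rules are checked in order; the first match wins, otherwise Default."""
    for rule in choice_state["Choices"]:
        if matches(rule, data):
            return rule["Next"]
    return choice_state["Default"]


print(choose_next(CHOICE_STATE, {"taskresult": {"value1": "other", "count": 2}}))
# prints: LambdaFunction
```

With a count of 5 or more and no matching value1, the same call falls through to FailState, which is how the loop in my example eventually stops.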

I am going to end here for today. The next section is going to be about passing data from the Step Function to Lambda, and how to reuse it. But, I want to work on a smaller block of code and focus on just the one thing.