A planet of blogs from our members...

Caktus Group: How to Do Wagtail Data Migrations

Wagtail is a fantastic content management system that does a great job of making it easy for developers to get a new website up and running quickly and painlessly. It’s no wonder that Wagtail has grown to become the leading Django-based CMS. As one of the creators of Wagtail recently said, it makes the initial experience of getting a website set up and running very good. At Caktus, Wagtail is our go-to framework when we need a content management system.

Wagtail StreamFields are a particularly great idea: Rather than editing a big blob of Rich Text, you create a StreamField with the content blocks that you want — paragraphs, images, videos — and let editors create a stream of these blocks containing the content on the page. For example, with a video block, editors don’t have to fiddle with video embed codes, they just insert the video URL or shortcode, and the block handles all of the templating. This has the advantage of making it possible to have complex page content, without requiring the site editors to know how to model a complex design in HTML directly. It makes for a much better, more structured, and smoother editing experience. (StreamFields are such a great idea that WordPress has recently launched a similar feature — inspired by Wagtail?)

But… there are some pain points for developers who work on large Wagtail projects. One of those is data migrations, particularly those that involve StreamFields. My informal survey of fellow developers yielded the following helpful comments:

  • “Wagtail migrations are evil.” —Eddie
  • “As a rule, avoid data migrations on streamfields at all costs.” —Neil
  • “It feels like we’re fighting the framework when we’re working programmatically with StreamField data, which is a real head-scratcher.” —name withheld

FWIW, I don’t think we’re exactly fighting the framework, so much as trying to do something that the framework hasn’t yet been optimized for. Wagtail has clearly been optimized to create a fantastic onboarding experience. And it’s really great. But it hasn’t yet been optimized for maintaining page data in an environment of shifting requirements. And so it’s currently really hard to do a data migration correctly.

The Caktus team was recently working on an existing Wagtail installation in which we were asked to migrate a page model from non-StreamFields to use a StreamField, giving editors greater flexibility and normalizing the page data. We were also asked, if possible, to migrate the existing pages’ data into the StreamField. That’s a pretty straightforward use case, and one that would seem to be a fairly common need: People start out their page models with regular ol’ fields, then they decide later (after building and publishing a bunch of pages!) that they want those pages to use StreamFields instead.

Considering all of this a worthy challenge, I rolled up my sleeves, dug in, and created a robust data migration for the project. It worked well, migrated all of the page and revision data successfully, and taught me a lot about Wagtail StreamFields.

At PyCon 2019, I hosted an open session on making Wagtail better for developers, and one of the things we talked about was data migrations (read more in my overview of PyCon 2019: “Be Quick or Eat Potatoes: A Newbie’s Guide to PyCon”). A couple of Wagtail core developers came to the session. I was pleased to learn that the method I used is essentially the same method that the Wagtail team has landed on as the best way to migrate StreamField data. So while this method isn’t yet officially supported in Wagtail, you heard it here first: This is currently the best way to do it.

The following section provides a worked example of the method I used. A repository containing all of the code in this example is available on GitHub @sharrisoncaktus/wagtail-data-migration-example.

Let’s dig in, shall we?

How I migrated a Wagtail page model with a StreamField

Start with an Existing Page Model

To illustrate the method I used, I’ll set up a simple page model with

  • a title (as always)
  • a placed image
  • a body
  • a list of documents (that can be displayed as a grid, for example)

The code for this page model looks like this (omitting all the scaffolding of imports etc.):

# First version of the model.

class ExamplePage(Page):
    image = models.ForeignKey(
        'wagtailimages.Image',
        null=True,
        blank=True,
        on_delete=models.SET_NULL,
        related_name='+'
    )
    body = RichTextField()
    docs = StreamField([
        ('doc', DocumentChooserBlock()),
    ])
    content_panels = Page.content_panels + [
        ImageChooserPanel('image'),
        FieldPanel('body'),
        StreamFieldPanel('docs'),
    ]


Example Page: Starting Point


Situation 1: Add New Fields to the Model without Moving or Renaming Anything

Now let’s suppose the customer wants to add pages to the docs block — they want to be able to display a link to a page in the grid alongside downloadable documents.

Here’s what the model looks like after adding a 'page' block to the 'docs' StreamField:

# Second version of the model: Added page block to the docs StreamField

class ExamplePage(Page):
    image = models.ForeignKey(
        'wagtailimages.Image',
        null=True,
        blank=True,
        on_delete=models.SET_NULL,
        related_name='+')
    body = RichTextField()
    docs = StreamField([
        ('doc', DocumentChooserBlock()),
        ('page', PageChooserBlock()),
    ])

    content_panels = Page.content_panels + [
        ImageChooserPanel('image'),
        FieldPanel('body'),
        StreamFieldPanel('docs'),
    ]

You can create and run this migration, no problem and no worries, because you haven’t moved or changed any existing data.

Rule 1: You can add fields to the model, and new blocks to StreamFields, with impunity — as long as you don’t move or rename anything.

Situation 2: Create Data Migrations to Move Existing Data

Some time later, the customer / site owner / editors have written and published a hundred pages using this model. Then the fateful day arrives: The customer / site owner / editors have enjoyed working with the docs field, and now want to move all the page content into a StreamField so that they can have a lot more flexibility about how they structure the content.

Does this sound familiar?

It’s not hard to write the new model definition.

# End result: The model after content has been migrated to a StreamField:

class ExamplePage(Page):
    content = StreamField([
        ('image', ImageChooserBlock()),
        ('text', RichTextBlock()),
        ('docs', StreamBlock([
            ('doc', DocumentChooserBlock()),
            ('page', PageChooserBlock()),
        ])),
    ])

    content_panels = Page.content_panels + [
        StreamFieldPanel('content'),
    ]

Now, it goes almost without saying: Do not create and run this migration. If you do, you will have a VERY angry customer, because you will have deleted all of their content data.

Instead, you need to break up your migration into several steps.

Rule 2: Split the migration into several steps and verify each before doing the next.

You’ll notice that I chose a different name for the new field — I didn’t, for example, name it “body,” which currently exists as a RichTextField. You want to avoid renaming fields, and you want to do things in an orderly way.

So, here are the steps of a Wagtail data migration.

Step 1: Add fields to the model without moving or renaming anything.

Here’s the non-destructive next version of the model.

# Data Migration Step 1: The model with the `content` StreamField added.

class ExamplePage(Page):
    # new content StreamField
    content = StreamField([
        ('image', ImageChooserBlock()),
        ('text', RichTextBlock()),
        ('docs', StreamBlock([
            ('doc', DocumentChooserBlock()),
            ('page', PageChooserBlock()),
        ])),
    ], blank=True, null=True)

    # old fields retained for now
    image = models.ForeignKey(
        'wagtailimages.Image',
        null=True,
        blank=True,
        on_delete=models.SET_NULL,
        related_name='+')
    body = RichTextField()
    docs = StreamField([
        ('doc', DocumentChooserBlock()),
        ('page', PageChooserBlock()),
    ])

    content_panels = Page.content_panels + [
        StreamFieldPanel('content'),

        # old panels retained for now
        ImageChooserPanel('image'),
        FieldPanel('body'),
        StreamFieldPanel('docs'),
    ]

The content field has to allow null values (null=True), because it’s going to be empty for all existing pages and revisions until we migrate the data.

Step 2: Create a data migration that maps / copies all the data from the old fields to the new fields, without modifying the existing fields. (Treat existing data as immutable at this point.)

This is the hard part, the fateful day, the prospect of which makes Wagtail devs run away screaming.

I’m here to encourage you: You can do it. Although this procedure is not well-documented or supported by Wagtail, it works reliably and well.

So, let’s do this. First, you’ll create an empty migration:

$ python manage.py makemigrations APPNAME -n migrate_content_to_streamfield --empty

You’ll end up with an empty migration. For the “forward” migration, you’ll add a RunPython operation that copies all the content data from the existing fields to the new StreamField.

You can also create a “reverse” operation that undoes the changes, but I usually prevent reverse migrations — life is hard enough as it is. However, it’s up to you, and the same kind of procedure can work in reverse.

Here’s what things will look like so far:

def copy_page_data_to_content_streamfield(apps, schema_editor):
    raise NotImplementedError("TODO")

def prevent_reverse_migration(apps, schema_editor):
    raise NotImplementedError(
        "This migration cannot be reversed without"
        + " inordinate expenditure of time. You can"
        + " `--fake` it if you know what you're doing,"
        + " and are a migration ninja."
    )

class Migration(migrations.Migration):
    dependencies = [
        ('home', '0005_add_content_streamfield'),
    ]
    operations = [
        migrations.RunPython(
            copy_page_data_to_content_streamfield,
            prevent_reverse_migration,
        ),
    ]

The copy_page_data_to_content_streamfield(…) function will copy all page and revision data from the existing fields to the new content StreamField. Here’s what it looks like:

import json
from importlib import import_module

from django.core.serializers.json import DjangoJSONEncoder


def copy_page_data_to_content_streamfield(apps, schema_editor):
    """Copy all page and revision data into the new content StreamField."""
    # If the ExamplePage model no longer exists, there is nothing to migrate.
    try:
        ExamplePage = import_module('home.models').ExamplePage
    except (ImportError, AttributeError):
        return
    for page in ExamplePage.objects.all():
        page_data = json.loads(page.to_json())
        content_data = page_data_to_content_streamfield_data(page_data)
        if content_data != page.content.stream_data:
            page.content.stream_data = content_data
            page.save()
        for revision in page.revisions.all():
            revision_data = json.loads(revision.content_json)
            content_data = page_data_to_content_streamfield_data(revision_data)
            if content_data != revision_data.get('content'):
                # StreamField data is stored in revision.content_json
                # as a JSON string inside the revision's JSON blob.
                revision_data['content'] = json.dumps(content_data)
                revision.content_json = json.dumps(
                    revision_data, cls=DjangoJSONEncoder)
                revision.save()

There are several things to notice here:

  • We’re importing the ExamplePage definition from home.models rather than via apps.get_model(). This allows us to use the ExamplePage.to_json() method. We have to import the model during the migration using importlib so that future model changes don’t break the migration. (Never import from the app’s models at the module-level of a migration.) We also need to put the import into a try/except block, in case the model is deleted in the future.
  • Using page.to_json() puts the page_data into the same form as the page revision data, which makes it much easier to do a data migration (one function for both page data and revision data)
  • We’re using regular Python data structures – dicts, lists, etc. This turns out to be a lot easier than trying to build StreamValues directly.
  • We’re using the same helper function, page_data_to_content_streamfield_data(…) (which we haven’t yet created) for both the page data and all revisions data. (We’ll develop this function next.) We can use the same helper function for page data and revisions data because the data structures are the same when represented using Python data structures.
  • The content data in revisions is stored in a JSON string. No problem. We just use json.loads() and json.dumps() with the DjangoJSONEncoder (DjangoJSONEncoder is not entirely necessary here because we don’t have any date or datetime fields in this model, but it’s a good practice to use it in Django projects).
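To make that last point concrete, here is a minimal, stdlib-only sketch (with made-up field values) of the double encoding: the StreamField's data is a JSON string nested inside the revision's own JSON blob, so it takes two `json.loads()` calls to reach it.

```python
import json

# Hypothetical revision payload: the StreamField ('docs') is stored as a
# JSON *string* inside the revision's own JSON blob.
revision_json = json.dumps({
    "title": "Example page",
    "docs": json.dumps([{"type": "doc", "value": 1, "id": "abc-123"}]),
})

revision_data = json.loads(revision_json)      # first decode: the revision
docs_data = json.loads(revision_data["docs"])  # second decode: the StreamField

assert isinstance(revision_data["docs"], str)
assert docs_data[0]["value"] == 1
```

This is why the migration function above decodes the revision, transforms the data, and then re-encodes the StreamField portion with `json.dumps()` before saving.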

Next, we need to implement the page_data_to_content_streamfield_data() function. This function takes a Python dict as its only argument, representing either the page data or a revision’s data, and returns a Python list, representing the data to be placed in the new content StreamField. It’s a pure function, with no side-effects, and that means it doesn’t mutate the page or revision data (which is only a copy anyway).

To build this function, it’s helpful to start with the definition of the content StreamField, and use it to build a Python data structure that contains the existing data. Here is the content StreamField definition again:

content = StreamField(
    [
        ('image', ImageChooserBlock()),
        ('text', RichTextBlock()),
        ('docs',
         StreamBlock([
             ('doc', DocumentChooserBlock()),
             ('page', PageChooserBlock()),
         ])),
    ],
    blank=True,
    null=True,
)

StreamField definitions use a list of tuples, but the stream_data that we’re building uses a list of dicts, which will look like this:

def page_data_to_content_streamfield_data(page_data):
    """With the given page field data, build and return content stream_data:
    * Copy existing page data into new stream_data.
    * Handle either the main page data or any revision data.
    * page_data is unchanged! (treated as immutable).
    """
    content_data = [
        {'type': 'image', 'value': ...},
        {'type': 'text', 'value': ...},
        {'type': 'docs', 'value': [...]},
    ]
    return content_data

We need to fill in the values for each 'value' field. The 'image' and 'text' are easy: We just need to copy in the 'image' and the 'body' values from the page_data.

The 'docs' value is going to be a little harder — but not much! All we have to do is take the stream_data from the existing ‘docs’ field. Since ‘docs’ is a StreamField, it is stored as a string in the page_data that comes from JSON. When loaded, here’s what that field looks like — a typical stream_data value:

[
  {
    "type": "doc",
    "value": 1,
    "id": "91229f94-7aab-4711-ab47-c07cd71461a7"
  },
  {
    "type": "page",
    "value": 3,
    "id": "cc1d77e3-bc35-4f74-97f0-e00645692004"
  }
]

We’re simply going to load the json value and copy over the data, filtering out the ‘id’ fields (Wagtail will assign new ones for us).

Here’s what the final version of the function looks like:

def page_data_to_content_streamfield_data(page_data):
    """With the given page field data, build and return content stream_data:
    * Copy existing page data into new stream_data.
    * Handle either the main page data or any revision data.
    * page_data is unchanged! (treated as immutable).
    """
    return [
        {'type': 'image', 'value': page_data['image']},
        {'type': 'text', 'value': page_data['body']},
        {'type': 'docs', 'value': [
            {key: block_data[key] for key in ['type', 'value']}  # no 'id'
            for block_data in json.loads(page_data['docs'])
        ]},
    ]

That’s it! We’re just mapping the existing data to the new data structure, without changing any of the existing data. It reminds me a little bit of using XSLT to declare a transformation from one data schema to another.
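Because the function is pure and needs only the json module, it’s easy to exercise outside of Django. Here’s a quick sanity check using hypothetical sample data in the shape produced by page.to_json() (the field values are made up for illustration):

```python
import json

def page_data_to_content_streamfield_data(page_data):
    """Map the old page fields to the new content stream_data (pure function)."""
    return [
        {'type': 'image', 'value': page_data['image']},
        {'type': 'text', 'value': page_data['body']},
        {'type': 'docs', 'value': [
            {key: block_data[key] for key in ['type', 'value']}  # drop 'id'
            for block_data in json.loads(page_data['docs'])
        ]},
    ]

# Hypothetical page data, roughly the shape page.to_json() produces.
sample_page_data = {
    'image': 7,
    'body': '<p>Hello, world</p>',
    'docs': json.dumps([
        {'type': 'doc', 'value': 1, 'id': '91229f94-7aab-4711-ab47-c07cd71461a7'},
        {'type': 'page', 'value': 3, 'id': 'cc1d77e3-bc35-4f74-97f0-e00645692004'},
    ]),
}

content_data = page_data_to_content_streamfield_data(sample_page_data)
assert content_data[0] == {'type': 'image', 'value': 7}
assert content_data[2]['value'] == [
    {'type': 'doc', 'value': 1},
    {'type': 'page', 'value': 3},
]
```

A check like this can live in your test suite, so the transformation is verified before the migration ever touches production data.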

The complete migrations can be seen on GitHub.

Now we can run the data migration! When we do so, we see all of the existing page data populating the page content field.

Example Page with Added Fields and Data Migration Applied (Steps 1 & 2)


Step 3: Deploy the migration and let editors review everything, making sure that all the data was correctly copied.

Step 4: Switch the site templates / API to the new fields. By making this a separate step before deleting the old data, we make sure that we haven’t missed anything before we pass the point of no return. (As our CEO and Co-founder Tobias McNulty pointed out while reviewing this post: “Extra reviews never hurt — plus, you’ll have no way to revert if the new templates introduce some non-trivial breaking changes and you’ve already deleted your model fields.”)

It’s a good idea not to delete any production data until the customer / site owner / editors are satisfied. So we deploy the site at this point and wait for them to be satisfied that the old data has migrated to the new fields, and that the site templates / API are correctly using the new fields.

Step 5: Create final migration that deletes the old data, and deploy it with updated templates that use the new fields. This is the point of no return.

Now your model can finally look like the “end result” above.

# End result: The model after content has been migrated to a StreamField:

class ExamplePage(Page):
    content = StreamField([
        ('image', ImageChooserBlock()),
        ('text', RichTextBlock()),
        ('docs', StreamBlock([
            ('doc', DocumentChooserBlock()),
            ('page', PageChooserBlock()),
        ])),
    ])

    content_panels = Page.content_panels + [
        StreamFieldPanel('content'),
    ]

Creating this final migration is super easy: Just delete all the old fields and content_panels from the model, let Django create the migration, and apply it. We’ve also removed blank=True, null=True from the content field definition, because now that the data migration has been applied, every instance of the content field should be non-null. (Django’s makemigrations will ask what to do about existing rows with null values; there shouldn’t be any, so you can choose “Ignore for now” when it prompts you. Or you can simply leave the null=True parameter on the content field.)

Example Page in Final Form, with Old Fields Removed (Step 5)


Summary: Rules for Wagtail Data Migrations

  1. You can add fields to the model, and new blocks to StreamFields, with impunity — as long as you don’t move or rename anything.
  2. If you are moving or renaming data, split the migration into several steps
    • Step 1: add new fields that will contain the post-migration data
    • Step 2: create a data migration that maps / copies all the data from the old fields to the new fields. Do this without modifying the existing fields. (Treat existing data as immutable at this point.)
    • Step 3: you might want to pause here and let the editors review everything before changing the templates to use the new fields.
    • Step 4: switch the site templates / API to the new fields.
    • Step 5: once the editors are happy, you can create a migration that deletes the old fields from the model.
  3. Data migrations involving StreamFields are best done by writing directly to the stream_data property of the StreamField. This method:
    • allows the use of a json-able dict (Python-native data structure), which is a lot easier than trying to build the StreamValue using Wagtail data structures.
    • allows using the same function for both the page data and the page revisions data, keeping things sane.
    • is not officially supported by Wagtail, but can be said to be sanctioned by at least a couple of Wagtail core developers.

The repository containing the worked example is available on GitHub.

Migrate with Confidence

There’s no question that migrating page data involving Wagtail StreamFields is an involved process, but it doesn’t have to be scary. By doing things in distinct stages and following the methods outlined here, you can migrate your data with security and confidence.

Caktus Group: A Review of ReportLab: PDF Processing with Python

These days it’s easy to get swept up into the buzz around Python’s strengths as a data science package, but Python is also great for the more mundane, business process side of computing. One of the most important business processes is generating reports, and the most used and venerable form of report is the PDF. Python has a great library for generating and manipulating PDFs: ReportLab. I recently read more about this extremely useful library in ReportLab: PDF Processing with Python, by Michael Driscoll. With a few caveats, it’s an excellent resource.

Python remains a great choice for the stuff that no one ever got rich on Patreon writing or talking about: things like processing spreadsheets (which pandas is great at, by the way), mail merge, and, of course, arguably one of the most important business activities, generating PDF reports. For this, Mike Driscoll’s book is a great introduction, tutorial, and resource for any Python programmer looking to get into the exciting world of programmatically generated Quarterly TPS reports!

The Technical

This book is available in digital format (PDF natch), and can be found on the author’s website.

There is a lot of content in this book. It contains 428 pages of examples and deep dives into the API of the library. Seriously, if there is something you wish you could do with a PDF and ReportLab can do it, then this book will get you started.

The Good

Because the bitter is often softened by the sweet, I’ll start with the sweet things about this book.

It is clear that the author, Michael Driscoll, knows ReportLab very well, and he knows how to construct illustrative snippets of code that demonstrate his material. From start to finish, this book is full of clear, useful code that works (this cannot be underlined enough): the code in the book will work if you copy it, which is sadly a rarity among resources about computing. Big publishing names like O’Reilly and Wrox, which have editorial staff, often publish books with broken examples. Full disclosure: I did not run every single piece of code, but I did sample about 40% of it, and none of it was broken.

Driscoll also does a very good job of building up his examples. Every book on programming starts with its “Hello, World!” example, and this book is no exception, but in my experience, the poorer books out there fail to continue a steady progression of ideas that layer logically one on top of the other, which can leave a reader feeling lost and frustrated. Driscoll, on the other hand, does a very good job of steadily incrementing the work already done with the new examples.

Almost every example in this book shows its result as an embedded image. This, of course, makes sense for a book about a library that works with PDFs. It is also another one of those touches that highlight the accuracy of the code. It’s one thing to say, “Hey, cool, the code I just worked through ran,” and another to be able to compare your results visually with the source.

The Not So Good

I have one major complaint about this book and a few minor editorial quibbles.

Who is the intended audience for this book?

While the parts of the book that actually deal with ReportLab are extremely well organized, the opening of the book is a mess of instructions that might turn off novice programmers, and are a little muddled for experienced developers.

The first section, “Conventions,” discusses the Python prompt, which indicates a focus on beginners, but then the very next section jumps right into setting up a virtual environment. Wait, I’m a beginner — what is the “interpreter”? What is IDLE? What is going on here? On the flip side, if this book were targeted at more experienced developers, much of this could be boiled down into a single dependencies-and-style section.

The author also adds a section about using virtualenv and dependencies, but the discussion of virtualenvs takes place before the discussion of Python itself. For a beginner, this could stop them altogether as they try to install virtualenv on a machine that doesn’t already have Python installed.

To be fair, none of this is a problem for an experienced developer, and with a specialized topic like working with a fairly extensive and powerful library like ReportLab, the author can be forgiven for assuming a more experienced readership. However, this should be spelled out at the beginning of the book. Who is the book for? What skill level is needed to get the most from the book?

Quibble: Code Styling Is Inconsistent

This is certainly a minor quibble — the code working is much more important — but quite often I would see weird switches in style from example to example and sometimes within examples.

First off, ReportLab itself uses lowerCamelCase for class methods and functions rather than snake_case, which sometimes bleeds over into the author’s choice of variable names. For example, on page 57, the author is showing us how to use ReportLab to build form letters, and his example contains the following variable styles:

magName = "Pythonista"
issueNum = 12
subPrice = "99.00"
limitedDate = "03/05/2010"
freeGift = "tin foil hat"
formatted_time = time.ctime()
full_name = "Mike Driscoll"
address_parts = ["411 State St.", "Waterloo, IA 50158"]

Is this minor? Yes. Does it make my hand itch? Yes.

Quibble: Stick with a single way of doing things.

Sometimes the author switches between two idioms for doing the same thing. In the code example I noted in the above quibble, the author uses the older printf-style % operator to do string interpolation, and in the same block of code throws in a .format() call for the exact same purpose. I noticed this only a couple of times, so again — minor. But these sorts of things can throw a new developer who is trying to grasp the material and perhaps a new language.
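To illustrate the kind of inconsistency being described, here is a hypothetical block (the variable values are made up) mixing the two idioms, followed by a consistent alternative:

```python
mag_name = "Pythonista"
issue_num = 12

# Mixed idioms in one block, as described in the review:
greeting = "Welcome to %s!" % mag_name          # printf-style interpolation
notice = "Issue {0} is out.".format(issue_num)  # str.format()

# A consistent alternative: pick one idiom and stick with it.
greeting2 = "Welcome to {0}!".format(mag_name)
notice2 = "Issue {0} is out.".format(issue_num)

assert greeting == greeting2 == "Welcome to Pythonista!"
assert notice == notice2 == "Issue 12 is out."
```

Both idioms produce identical strings, which is exactly why mixing them is confusing: a newcomer has to learn two mechanisms to read one block of code.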

Conclusion

If you are interested in learning how to automate the generation of PDFs for your projects and you plan on using ReportLab, then this book is a great choice. It covers in detail every aspect of the ReportLab library in a clear and iteratively more complex manner. Also, the code examples work!

Aside from a slightly unfocused introduction, which could hinder a new developer from approaching the material and some style inconsistencies, the author has produced a solid instructional book. It’s a great reference when you need to brush up on how to accomplish some arcane bit of PDF magic.

Note: This review was solicited by the author of the book, and my company received a free copy for review. However, all opinions are my own.

Caktus Group: Taking a Dip into Elixir

ElixirConf 2019 will be in Aurora, CO, and I'm delighted to announce that two Cakti, our CEO Tobias McNulty and Lead Developer Vinod Kurup, will be in attendance from August 27 - 30.

I caught up with both of them recently to ask a few questions about their interest in Elixir and Phoenix, which is the premier framework for web development with Elixir.

How did you learn about Elixir and Phoenix? What most excites you about each?

Vinod: I'm intrigued by functional programming languages. Like many Cakti, I don't have formal computer science training, but early in my career, I listened to some lectures from Berkeley that were presented in the functional language Scheme, and ever since then I've been interested in functional languages. Elixir has come across my radar many times, but I didn't really get interested until our former colleague Neil started talking about it, which eventually led to a recent internal project. That project was really fun, and it showed me how quickly you can go from zero to productive in Elixir and Phoenix. Since then, I've read and tinkered more. The more I look at it, the more interested I am.

I guess I'm most interested in being able to apply some of the benefits of immutability and simplicity that functional languages provide. Specifically, I'd like to learn more about how LiveView can allow us to build dynamic UIs using Elixir rather than JavaScript.

Tobias: I first learned about Elixir during a ShipIt Day at Caktus in 2016. Since that time, a few potential clients have reached out to us about Elixir projects and we've continued to learn and exercise our Elixir skills. We recently did a fun project called Chippy, which uses Elixir, Phoenix, and LiveView. Chippy is a digital implementation of the traditional physical "chips" used by a development team to determine project allocations during sprint planning.

Coming from a Python/Django background, what excites me the most about Elixir is its entirely different approach to processes and concurrency. A single Elixir app might have hundreds, if not thousands, of concurrent processes (without suffering from something like the Global Interpreter Lock in Python), and has been proven to support massive numbers of concurrent network connections. These features may not be relevant to every project, but for some they make a lot of sense.

Caktus is traditionally a Python/Django shop. Why branch out?

Vinod: Caktus has been well served by Python and I don't expect that to stop anytime soon. But Caktus has always felt like the kind of place where I'm encouraged to try new things and see what we can learn from them. There is a lot to learn from something as different as Elixir.

Tobias: I love Python and Django, and it's still our web framework of choice for nearly all client projects. Django is a stable, battle-tested solution and its batteries-included philosophy makes it quick and easy to create web applications and backend APIs. That said, projects that demand a high level of concurrency or a large number of network connections might benefit from a language like Elixir. We're even exploring the possibility of using Elixir and Python side-by-side for a single project, since both Django and Phoenix have an affinity for the Postgres database.

What are you most looking forward to at the conference?

Vinod: I'm looking forward to being immersed in Elixir while being surrounded by Elixir enthusiasts. I'm hoping to get more familiar with the language, while finding inspiration on how to apply it to our own technical problems.

Tobias: Being relatively new to the community, I'm most looking forward to meeting people at the conference and learning how they use Elixir (especially for web projects).

What are some of the talks you plan to attend?

Vinod: Phoenix LiveView Demystified and Alchemy Meets Science: Adopting Elixir in Cancer Therapeutics R&D caught my eye, and I’m also looking forward to the From Zero to Hero with Elixir tutorial to get me kickstarted.

Tobias: I'm looking forward to GraphQL Based Microservices in Elixir and Building an Elixir Team When No One Knew Elixir on Thursday. A couple talks I plan to attend on Friday are Your Guide to Understand the Initial Commit of a Phoenix Project and Kubernetes at Small Scale.

👋 We hope to see you there! Tobias and Vinod will sport their Caktus shirts during the conference, and you can also connect with them on Twitter via @TobiasMcNulty and @vkurup.

Caktus Group: DjangoCon, Here We Come!

We’re looking forward to the international gathering at DjangoCon 2019, in San Diego, CA. The six-day conference, from September 22 - 27, is focused on the Django web framework, and we’re proud to attend as sponsors for the tenth year! We’re also hosting the second annual Caktus Mini Golf event.

⛳ If you’re attending DjangoCon, come play a round of mini golf with us. Look for our insert in your conference tote bag. It includes a free pass to Tiki Town Adventure Golf on Wednesday, September 25, at 7:00 p.m. (please RSVP online). The first round of golf is on us! And whoever shoots the lowest score will win a $100 Amazon gift card.*

Talk(s) of the Town

Among this year’s talented speakers is one of our own, Erin Mullaney (pictured). Erin has been with Caktus since 2015, and has worked as a contractor for us since July 2017. On Monday, September 23, she’ll share her experiences going from a full-time developer to a contractor in her talk, “Roll Your Own Tech Job: Starting a Business or Side Hustle from Scratch.” The talk will cover her first two years as a consultant, including how she legally set up her business and found clients. Erin said she enjoys being her own boss and is excited to share her experiences.

Caktus Developer Jeremy Gibson, who will attend DjangoCon for the first time, is looking forward to expanding his knowledge of Django best practices surrounding queries and data modeling. He’s also curious to see what other developers are doing with the framework. He’s most looking forward to the sessions about datastore and Django's ORM, including:

Other talks we’re looking forward to include:

See the full schedule of talks and tutorials.

Meeting and Greeting

If you’d like to meet the Caktus team during DjangoCon, join us for our second annual Mini Golf Event. Or you can schedule a specific time to chat with us one-on-one.

During the event, you can also follow us on Twitter @CaktusGroup and #DjangoCon2019 to stay tuned in. Check out DjangoCon’s Slack channel for attendees, where you can introduce yourself, network, and even coordinate to ride share.

We hope to see you there!

*In the event of a tie, the winner will be selected from a random drawing from the names of those with the lowest score. Caktus employees can play, but are not eligible for prizes.

Caktus GroupHow to Import Multiple Excel Sheets in Pandas

Pandas is a powerful Python data analysis tool. It's used heavily in the data science community since its data structures make real-world data analysis significantly easier. At Caktus, in addition to using it for data exploration, we also incorporate it into Extract, Transform, and Load (ETL) processes.

The Southern Coalition for Social Justice’s Open Data Policing website uses Pandas in various capacities. Open Data Policing aggregates, visualizes, and publishes public records related to all known traffic stops in North Carolina, Maryland, and Illinois. The project must parse and clean data provided by state agencies, including the State of Maryland. Maryland provides data in Excel files, which can sometimes be difficult to parse. pandas.read_excel() is also quite slow compared to its pandas.read_csv() counterpart.

By default, pandas.read_excel() reads the first sheet in an Excel workbook. However, Maryland's data is typically spread over multiple sheets. Luckily, it's fairly easy to extend this functionality to support a large number of sheets:

import pandas as pd

def read_excel_sheets(xls_path):
    """Read all sheets of an Excel workbook and return a single DataFrame"""
    print(f'Loading {xls_path} into pandas')
    xl = pd.ExcelFile(xls_path)
    sheets = []
    columns = None
    for idx, name in enumerate(xl.sheet_names):
        print(f'Reading sheet #{idx}: {name}')
        sheet = xl.parse(name)
        if idx == 0:
            # Save column names from the first sheet so all sheets match
            columns = sheet.columns
        sheet.columns = columns
        sheets.append(sheet)
    # Stack the sheets into one frame, ignoring each sheet's original index
    # (pd.concat replaces DataFrame.append, which was removed in pandas 2.0)
    return pd.concat(sheets, ignore_index=True)

The Maryland data is in the same format across all sheets, so we just stack the sheets together in a single data frame. Now we can load the entire Excel workbook:

stops = read_excel_sheets("data/PIALog_through-20171231.xlsx")

Now it's easy to write this data frame to a CSV:

stops.to_csv("data/stops.01end.csv", index=False)

We can add to this data, too. Maryland sends deltas of data, rather than an updated full data set, so new data can be appended to the existing CSV by using mode="a":

stops2 = read_excel_sheets("data/PIANorthCarolina_02152019.xlsx")
stops2.to_csv("data/stops.01end.csv", mode="a", header=False, index=False)

Now we have a single CSV file with all of the data.

That's it! This certainly isn't a huge data set by any means, but since working with Excel files can be slow and sometimes painful, we hope this short how-to will be helpful!

Caktus GroupMaking Space for Wagtail

It’s no secret that Caktus ❤️’s Wagtail, our favorite Django-based CMS. On July 25, I had the pleasure of attending Wagtail Space US, an annual convening and celebration of all things Wagtail at The Wharton School at the University of Pennsylvania. After a couple days of talks, workshops, and sprints, we’re even more excited by what Wagtail can offer us and our clients.

Tom Dyson, Technical Director at Torchbox, kicked off the first day with a “State of Wagtail” overview (pictured right; photo by Will Barton). He highlighted Wagtail’s increasing traction as the most popular CMS on the fastest growing software language (Python). From Google to NASA to the UK Government, big names are increasingly investing in Wagtail applications. At the same time, the Wagtail core contributor team is growing as more developers around the world invest their time and expertise to improve the core application. In short, it’s a good time to be working with Wagtail. Watch the full presentation.

On the topic of growth, several speakers discussed new extensions and distribution models for Wagtail. For example, Brian Smith and Eric Sherman from Austin’s Office of Design and Delivery demonstrated their new guided page creation model, which provides content creators with a templatized page creation wizard. This allows organizations running Wagtail to distribute content creation while maintaining a consistent voice and style guide.

Additionally, Vince Salvino from web development firm CodeRed gave a practical demonstration of CodeRed CMS, a new distribution tool that aims to streamline the deployment and delivery of Wagtail-based sites. CodeRed has open sourced this package, making it available for any team seeking to rapidly deploy and customize a Wagtail marketing site.

Two back-to-back talks focused on accessibility, including Torchbox’s Thibaud Colas (pictured; photo by Will Barton) discussing improvements to the tab stop and voiceover features of the Wagtail admin interface. See his presentation. Plus, Columbia’s Zarina Mustapha covered her team’s implementation of voiceover accessibility using custom Wagtail fields. Watch her presentation. The second day sprints also encouraged a focus on accessibility, and it’s great to see the Wagtail community embrace accessibility as a core requirement.

I particularly enjoyed Naomi Morduch Toubman’s talk on “Thoughtful Code Review.” Naomi is a Wagtail core contributor, and spoke about how developers can be positive and productive collaborators via code review on not just Wagtail but any shared project. Her message aligns closely with Caktus’s philosophy towards team-based software development, and could be considered required watching for any developer seeking to be a stronger team member.

Finally, Tim Allen — IT Director at The Wharton School and a driving force in the Wagtail community — gave an impassioned talk that explored both personal experience and technology. While the technology focus was on an organization’s adoption of Wagtail to improve communication and user experience, Tim also spoke eloquently about the power of proactive community and the sometimes lifesaving importance of inclusivity.

Caktus will be back at Wagtail Space in 2020, and we look forward to seeing the event’s continued growth and success!

Are you interested in exploring Wagtail for your CMS needs? We’d ❤️ to talk to you!

Joe GregorioLooking back on five years of web components

Over 5 years ago I wrote No more JS frameworks and just recently Jon Udell asked for an update.

I have been blogging bits and pieces over the years but Jon’s query has given me a good excuse to roll all of that up into a single document.

For the last five years my team and I have been using web components to build our web UIs. At the time I wrote the Zero Framework Manifesto, we moved all of our development over to Polymer.

Why Polymer?

We started with Polymer 0.5 as it was the closest thing to web components that was available. At the time I wrote the Zero Framework Manifesto, all of the specifications that made up web components were still just proposed standards, and only Chrome had implemented any of them natively. We closely followed Polymer, migrating all of our apps to Polymer 0.8 and finally to Polymer 1.0 when it was released. This gave us a good taste for what building web components was like and verified that building HTML elements was a productive way to do web development.

How

One of the questions that comes up regularly when talking about zero frameworks is how can you expect to stitch together an application without a framework? The short answer is ‘the same way you stitch together native elements’, but I think it’s interesting and instructional to look at those ways of stitching elements together individually.

There are six surfaces, or points of contact, between elements that you can use when stitching elements together, whether they are native or custom elements.

Before we go further, a couple of notes on terminology and scope. For scope, realize that we are only talking about DOM; we aren’t talking about composing JS modules or strategies for composing CSS. For the terminology clarification, when talking about DOM I’m referring to the DOM Interface for an element, not the element markup. Note that there is a subtle difference between the markup element and the DOM Interface to such an element.

For example, <img data-foo="5" src="https://example.com/image.png"/> may be the markup for an image. The corresponding DOM Interface has an attribute src with a value of https://example.com/image.png, but it doesn’t have a data-foo attribute; instead, all data-* attributes are available via the dataset attribute on the DOM Interface. In the terminology of the WHATWG Living Standard, this is the distinction between content attributes and IDL attributes, and I’ll only be referring to IDL attributes.

With the preliminaries out of the way let’s get into the six surfaces that can be used to stitch together an application.

Attributes and Methods

The first two surfaces, and probably the most obvious, are attributes and methods. If you are interacting with an element it’s usually either reading and writing attribute values:

element.children

or calling element methods:

document.querySelector('#foo');

Technically these are the same thing, as they are both just properties with different types. Native elements have their set of defined attributes and methods, and depending on which element a custom element is derived from it will also have that base element’s attributes and methods along with the custom ones it defines.

Events

The next two surfaces are events. Events count as two surfaces because an element can listen for events,

ele.addEventListener('some-event', function(e) { /* */ });

and an element can dispatch its own events:

var e = new CustomEvent('some-event', {detail: details});
this.dispatchEvent(e);

DOM Position

The final two surfaces are position in the DOM tree, and again I’m counting this as two surfaces because each element has a parent and can be a parent to another element. Yeah, an element has siblings too, but that would bring the total count of surfaces to seven and ruin my nice round even six.

<button>
  <img src="">
</button>

Combinations are powerful

Let’s look at a relatively simple but powerful example, the ‘sort-stuff’ element. This is a custom element that allows the user to sort elements. All children of ‘sort-stuff’ with an attribute of ‘data-key’ are used for sorting the children of the element pointed to by the sort-stuff’s ‘target’ attribute. See below for an example usage:

 <sort-stuff target='#sortable'>
   <button data-key=one>Sort on One</button>
   <button data-key=two>Sort on Two</button>
 </sort-stuff>
 <ul id=sortable>
   <li data-one=c data-two=x>Item 3</li>
   <li data-one=a data-two=z>Item 1</li>
   <li data-one=d data-two=w>Item 4</li>
   <li data-one=b data-two=y>Item 2</li>
   <li data-one=e data-two=v>Item 5</li>
 </ul>

If the user presses the “Sort on One” button then the children of #sortable are sorted in alphabetical order of their data-one attributes. If the user presses the “Sort on Two” button then the children of #sortable are sorted in alphabetical order of their data-two attributes.

Here is the definition of the ‘sort-stuff’ element:
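
The post embeds the element’s source inline; as a stand-in, here is a minimal sketch (mine, not the original code) of how such an element could be written with the v1 custom elements API, matching the behavior described above:

```javascript
// Illustrative sketch of a 'sort-stuff' element, not the post's original source.
// Clicking a child that carries a data-key attribute sorts the children of the
// element selected by the 'target' attribute, alphabetically by that data-* key.

// Comparator for two elements, ordering by the given data-* key.
function compareByKey(key) {
  return (a, b) => {
    const x = a.dataset[key];
    const y = b.dataset[key];
    return x < y ? -1 : x > y ? 1 : 0;
  };
}

// Element wiring; guarded so the sketch also loads outside a browser.
if (typeof HTMLElement !== 'undefined' && typeof customElements !== 'undefined') {
  customElements.define('sort-stuff', class extends HTMLElement {
    connectedCallback() {
      this.addEventListener('click', (e) => {
        const key = e.target.dataset.key;
        if (!key) return;
        const target = document.querySelector(this.getAttribute('target'));
        // Re-appending each child in sorted order moves it into place.
        Array.from(target.children)
          .sort(compareByKey(key))
          .forEach((child) => target.appendChild(child));
      });
    }
  });
}
```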



Note the surfaces that were used in constructing this functionality:

  1. sort-stuff has an attribute 'target' that selects the element to sort.
  2. The target children have data attributes that elements are sorted on.
  3. sort-stuff registers for 'click' events from its children.
  4. sort-stuff children have data attributes that determine how the target children will be sorted.

In addition you could imagine adding a custom event ‘sorted’ that ‘sort-stuff’ could generate each time it sorts.

Why not Polymer?

But after having used Polymer for so many years, we looked at the direction of Polymer 2.0 and now 3.0 and decided that it may not be the direction we want to take.

There are a few reasons we moved away from Polymer. Polymer started out as, and continues to be, a platform for experimentation with proposed standards, which is great: the team can give concrete feedback to standards committees and let people see how those proposed standards could be used in development. The downside to adopting nascent standards is that sometimes they don’t become standards. For example, HTML Imports was a part of Polymer 1.0 that had a major impact on how you wrote your elements, and when HTML Imports failed to become a standard you had a choice of either undertaking a major migration to ES modules or carrying around a polyfill for HTML Imports for the remainder of that web app’s life. You can see the same thing happening today with Polymer 3.0 and CSS mixins.

There are also implementation decisions in Polymer that I don’t completely agree with, for example, the default use of Shadow DOM. Shadow DOM allows for the encapsulation of the children of a custom element so that they don’t participate in things like querySelector() and normal CSS styling. But there are several problems with that. The first is that when using Shadow DOM you lose the ability to make global styling changes: if you suddenly decide to add a “dark mode” to your app, you will need to go and modify each element’s CSS. Shadow DOM was also supposed to be faster, but since each element contains a copy of the CSS there are performance implications, though there is work underway to address that. Shadow DOM seems like a solution searching for a problem. Polymer defaults to using Shadow DOM while offering a way to opt out and use Light DOM for your elements; I believe the default should lie in the other direction.

Finally, Polymer’s data binding has some mis-features. It offers two-way data binding, which is never a good idea; every instance of two-way data binding is a bug waiting to happen. The data binding also has a lot of magic to it: in theory you just update your model and Polymer will re-render your template at some point in the future with the updated values. The “at some point in the future” is because updates happen asynchronously, which in theory makes them more efficient by batching, but in reality you spend a lot of development time updating your model, not getting updated DOM, and scratching your head, until you remember to either call the function that forces a synchronous render, or realize that you updated a deep part of your model that Polymer can’t observe, so you need to change your code to use the set() method and give the path to the part of the model you just updated. The async rendering and observing of data is fine for simple applications, but for more complex applications it leads to wasted developer time debugging situations where a simpler data binding model would suffice.

It is interesting to note that the Polymer team also produces the lit-html library which is simply a library for templating that uses template literals and HTML Templates to make the rendering more efficient, and it has none of the issues I just pointed out in Polymer.

What comes after Polymer?

This is where I started with a very concrete and data driven minimalist approach, first determining what base elements we really needed and then what library features we would need as we built up those elements, and finally what features we need as we build full fledged apps from those base elements. I was completely open to the idea that maybe I was just being naive about the need for async render or Shadow DOM and I’d let the process of building real world applications inform what features were really needed.

The first step was to determine which base elements we really needed. The library of iron-* and paper-* elements that Polymer provides is large and the idea of writing our own version of each was formidable, so instead I looked back over the previous years of code we’d written in Polymer to determine which elements we really did need. If we’d started this process today I would probably just have gone with Elix or another pure web components library of elements, but none of them existed at the time we started this process.

The first thing I did was scan each project and record every Polymer element used in every project. If I’m going to replace Polymer, at least I should know how many elements I’m signing up to rewrite. That initial list was surprising in a couple of ways. The first was how short the list was:

Polymer/Iron elements Used
iron-ajax
iron-autogrow-textarea
iron-collapse
iron-flex-layout
iron-icon
iron-pages
iron-resizable-behavior
iron-scroll-threshold
iron-selector
paper-autocomplete
paper-button
paper-checkbox
paper-dialog
paper-dialog-scrollable
paper-drawer-panel
paper-dropdown-menu
paper-fab
paper-header-panel
paper-icon-button
paper-input
paper-item
paper-listbox
paper-menu
paper-menu-button
paper-radio-button
paper-radio-group
paper-spinner
paper-tabs
paper-toast
paper-toggle-button
paper-toolbar
paper-tooltip

After four years of development I expected the list to be much larger.

The second surprise was how many of the elements in that list really shouldn’t be elements at all. Some could be replaced with native elements given some better styling, such as button for paper-button. Others could be replaced with CSS or a non-element solution, such as iron-ajax, which shouldn’t be an element at all and should be replaced with the fetch() function. After doing that analysis, the number of elements that actually needed to be re-implemented from Polymer fell to a very small number.
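
To make the iron-ajax point concrete, here is a small sketch of what the replacement can look like; loadJSON is a hypothetical helper name (not from any library), built on the standard fetch() API:

```javascript
// Sketch: what used to be a declarative <iron-ajax> element becomes a plain
// function built on the standard fetch() API. 'loadJSON' is a hypothetical
// name chosen for illustration.
function loadJSON(url) {
  return fetch(url).then((resp) => {
    // fetch() only rejects on network failure, so check HTTP status ourselves.
    if (!resp.ok) {
      throw new Error(`HTTP ${resp.status} fetching ${url}`);
    }
    return resp.json();
  });
}

// Usage (illustrative): loadJSON('/api/items').then((items) => { /* render */ });
```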

In the table below the ‘Native’ column is for places where we could use native elements and just have a good default styling for them. The ‘Use Instead’ column is what we could use in place of a custom element. Here you will notice a large number of elements that can be replaced with CSS. Finally the last column, ‘Replacement Element’, is the name of the element we made to replace the Polymer element:

Polymer Native Use Instead Replacement Element
iron-ajax   Use fetch()  
iron-collapse     collapse-sk
iron-flex-layout   Use CSS Flexbox/Grid  
iron-icon     *-icon-sk
iron-pages     tabs-panel-sk
iron-resizable-behavior   Use CSS Flexbox/Grid  
iron-scroll-threshold   Shouldn’t be an element  
iron-selector     select-sk/multi-select-sk
paper-autocomplete   No replacement yet.  
paper-button button    
paper-checkbox     checkbox-sk
paper-dialog     dialog-sk
paper-dialog-scrollable   Use CSS  
paper-drawer-panel   Use CSS Flexbox/Grid  
paper-dropdown-menu     nav-sk
paper-fab button    
paper-header-panel   Use CSS Flexbox/Grid  
paper-icon-button button   button + *-icon-sk
paper-input input    
paper-item     nav-sk
paper-listbox option/select    
paper-menu     nav-sk
paper-menu-button     nav-sk
paper-radio-button     radio-sk
paper-radio-group **    
paper-spinner     spinner-sk
paper-tabs     tabs-sk
paper-toast     toast-sk
paper-toggle-button     checkbox-sk
paper-toolbar   Use CSS Flexbox/Grid  
paper-tooltip   Use title attribute  

** - For radio-sk elements just set a common name like you would for a native radio button.

That set of minimal custom elements has now been launched as elements-sk.

Now that we have our base list of elements let’s think about the rest of the tools and techniques we are going to need.

To get a better feel for this let’s start by looking at what a web framework “normally” provides. The “normally” is in quotes because not all frameworks provide all of these features, but most frameworks provide a majority of them:

  • Framework
    • Model
    • Tooling and structure
    • Elements
    • Templating
    • State Management

All good things, but why do they have to be bundled together like a TV dinner? Let’s break each of those aspects of a framework out into its own standalone thing, and then we can pick and choose from the various implementations when we start developing an application. We call this style of development “a la carte” web development.

Instead of picking a monolithic solution like a web framework, you just pick the pieces you need. Below I outline specific criteria that need to be met for some components to participate in “a la carte” web development.

A la carte

“A la carte” web development does away with the framework: just use the browser for the model, and for the rest of the pieces pick and choose the ones that work for you. In a la carte development each bullet point is a separate piece of software:

A la carte

Tooling and structure
Defines a directory structure for how a project is put together and provides tooling such as JS transpiling, CSS prefixing, etc. for projects that conform to that directory structure. Expects ES modules with the extension that webpack, rollup, and similar tools presume, i.e. allow importing other types of files, see webpack loaders.
Elements
A library of v1 custom elements in ES6 modules. Note that these elements must be provided in ES6 modules with the extension that webpack, rollup, and similar tools presume, i.e. allow importing other types of files, see webpack loaders. The elements should also be “neat”, i.e. just HTML, CSS, and JS.
Templating
Any templating library you like, as long as it works with v1 custom elements.
State Management
Any state management library you like, if you even need one.

The assumptions needed for all of this to work together are fairly minimal:

  1. ES6 modules and the extension that webpack, rollup, and similar tools presume, i.e. allow importing other types of files, see webpack loaders.
  2. The base elements are “Neat”, i.e. they are JS, CSS, and HTML only. No additional libraries are used, such as a templating library. Note that sets of ‘neat’ elements also conform to #1, i.e. they are provided as webpack/rollup compatible ES6 modules.

Obviously there are other guidelines that could be added as advisory. For example, the Google Developers Guide - Custom Elements Best Practices should be followed when creating custom element sets, except for the admonition to use Shadow DOM, which I would avoid for now unless you really need it.

Such code will natively run in browsers that support custom elements v1. To get it to run in a wider range of browsers you will need to add polyfills and, depending on the target browser version, compile the JS back to an older version of ES, and run a prefixer on the CSS. The wider the target set of browsers and the older the versions you are targeting the more processing you will need to do, but the original code doesn’t need to change, and all those extra processing steps are only incurred by projects that need it.

Concrete

So now that we have our development system we’ve started to publish some of those pieces.

We published pulito, a stake in the ground for what a “tooling and structure” component looks like. You will note that it isn’t very complex, nothing more than an opinionated webpack config file. Similarly we published our set of “neat” custom elements elements-sk.

Our current stack looks like:

Tooling and structure
pulito
Elements
elements-sk
Templating
lit-html

We have used Redux in an experimental app that never shipped and haven’t needed any state management libraries in the other applications we’ve ported over, so our ‘state management’ library is still an open question.

Example

What is it like to use this stack? Let’s start from an empty directory and build a web app:

$ npm init
$ npm add pulito

We are starting from scratch so use the project skeleton that pulito provides:

$ unzip node_modules/pulito/skeleton.zip
$ npm install

We can now run the dev server and see our running skeleton application:

$ make serve

Now let’s add in elements-sk and add a set of tabs to the UI.

$ npm add elements-sk

Now add imports to pages/index.js to bring in the elements we need:

import 'elements-sk/tabs-sk'
import 'elements-sk/tabs-panel-sk'
import '../modules/example-element'

And then use those elements on pages/index.html:

<body>
  <tabs-sk>
    <button class=selected>Some Tab</button>
    <button>Another Tab</button>
  </tabs-sk>
  <tabs-panel-sk>
    <div>
      <p> This is Some Tab contents.</p>
    </div>
    <div>
      This is the contents for Another Tab.
    </div>
  </tabs-panel-sk>
  <example-element active></example-element>
</body>

Now restart the dev server and see the updated page:

$ make serve

Why is this better?

Web frameworks usually make all these choices for you; you don’t get to choose, even if you don’t need the functionality. For example, state management might not be needed, so why are you ‘paying’ for it, where ‘paying’ means learning about that aspect of the web framework, and possibly even serving the code that implements state management even if you never use it? With “a la carte” development you only include what you use.

An extra benefit comes when it is time to upgrade. How much time have you lost to massive upgrades from v1 to v2 of a web framework? With ‘a la carte’ development the upgrades don’t have to be monolithic: if you’ve chosen a templating library and want to upgrade to the next version, you only need to update your templates, not touch every aspect of your application.

Finally, ‘a la carte’ web development provides no “model” but the browser. Of all the things that frameworks provide, “model” is the most problematic. Instead of just using the browser as it is, many frameworks have their own model of the browser, how DOM works, how events work, etc. I have gone into depth on the issues previously, but they can be summarized as lost effort (learning something that doesn’t translate) and a barrier to reuse. What should replace it? Just use the browser: it already has a model for how to combine elements together, and now that custom elements v1 gives you the ability to create your own elements, you have all you need.

One of the most important aspects of ‘a la carte’ web development is that it decouples all the components, allowing them to evolve and adapt to user needs on a much faster cycle than the normal web framework release cycle allows. Just because we’ve published pulito and elements-sk doesn’t mean we believe they are the best solutions. I’d love to have a slew of options to choose from for tooling, base element sets, templating, and state management. I’d like to see Rollup-based tools that take the place of pulito, and a whole swarm of “neat” custom element sets with varying levels of customizability and breadth.

What we’ve learned

We continue to learn as we build larger applications.

lit-html is very fast, and all the applications we’ve ported over have been smaller and faster after the port. It is rather pleasant to call the render() function and know that the element has been rendered, without getting tripped up by async rendering. We haven’t found the need for async rendering either, but that’s not surprising. Consider the cases where async rendering would make a big difference, i.e. where batching up renders and doing them asynchronously would be a big performance win. That would have to be an element with a large number of properties, where each property change alters the DOM expressed and thus requires another call to render(). But in all the development we’ve done that situation has never arisen; elements always have a small number of attributes and properties. If an element takes in a large amount of data to display, that’s usually done by passing in a small number of complex objects as properties on the element, which results in a small number of renders.

We haven’t found the need for Shadow DOM. In fact, I’ve come to think of the Light DOM children of elements as part of their public API that goes along with the attributes, properties, and events that make up the ‘normal’ programming surface of an element.

We’ve also learned that there’s a difference between creating base elements and higher level elements as you build up your application. You are not creating bullet-proof re-usable elements at every step of development; the same level of detail and re-usability aren’t needed as you move up the stack. If an element looks like it could be re-used across applications then we may tighten up the surface of the element and add more options to cover more use cases, but that’s done on an as-needed basis, not for every element. Just because you are using the web component APIs to build an application doesn’t mean that every element you build needs to be as general purpose and bullet proof as low level elements. You can use HTML Templates without using any other web component technology. Same for template literals, and for each of the separate technologies that make up the web components group of APIs.

Caktus GroupBook Review: Creating GUI Applications with wxPython

I enjoyed working through the book Creating GUI Applications with wxPython by Michael Driscoll, learning various techniques for programming GUI applications in Python using wxPython.

This book is not intended to be a beginners' tutorial. The first chapter is titled "An Intro to wxPython," but it's very basic. I think anyone with a few simple wxPython apps under their belt would have no trouble with this book, but as a complete beginner to wxPython, I struggled a bit. Again, the book is not intended for complete beginners, so that's my fault.

Of the book's 14 chapters, 12 are dedicated to example applications, one per chapter. So these are not toy applications — some of them are small, but all are complete and useful as-is, and all of the code is provided. But the code isn't just dumped for you to try to figure out — it's presented in small sections, in a logical order, with an explanation of each part.

The first application is an image viewer that opens a dialog to let you pick an image file, then displays it. It's a good choice for a first example. The functionality is useful but not at all complicated, so you can focus on the boilerplate common to wxPython applications and how to put together a few widgets into a working application.

From there, the applications gradually get more involved, including a calculator, a database editor, a tarball creator, a tool to search for NASA images, and even an XML editor.

Some of the chapters introduce useful third-party wxPython add-ons, like ObjectListView which is much better than the built-in ListView.

The final chapter is about distributing your application using PyInstaller. Including this was a good decision. As a Python developer I'm happy to pipx install an application, but if you're building applications with wxPython, your target users are quite likely not experienced Python developers, and a simple way to distribute and install your application is important if you want it to be used.

If you're going to build applications with wxPython, I recommend taking a look at this book and if possible, working through the examples. I'm sure you'll learn a lot. There are links to purchase digital or paper copies at the author's blog.

Disclosure: The author, Michael Driscoll, provided a digital copy of his book for review. However, the author was not involved in the writing of this review and all opinions are my own.

Josh JohnsonBranching With Git And Testing With Pytest: A Comprehensive Guide: Part 3

This is part three of a three-part series. This is a comprehensive guide to a basic development workflow. Using a simple, but non-trivial web application, we learn how to write tests, fix bugs, and add features using pytest and git, via feature branches. Along the way we'll touch on application design and discuss best practices.

In this installment, we will:

  • Simulate collaborative work by two developers.
  • Use the workflow we learned in part 2 to add a new feature, and fix a new bug.
  • Create a merge conflict and resolve it.

Josh JohnsonBranching With Git And Testing With Pytest: A Comprehensive Guide: Part 2

This is part two of a three-part series. This is a comprehensive guide to a basic development workflow. Using a simple, but non-trivial web application, we learn how to write tests, fix bugs, and add features using pytest and git, via feature branches. Along the way we'll touch on application design and discuss best practices.

In this installment, we will:

  • Identify and fix a bug on a branch.
  • Build a new feature, also on a branch.
  • Use git rebase to keep our change history tidy.
  • Use tagging to mark new versions of our application.

Caktus GroupHow to Set Up a Centralized Log Server with rsyslog

For many years, we've been running an ELK (Elasticsearch, Logstash, Kibana) stack for centralized logging. We have a specific project that requires on-premise infrastructure, so sending logs off-site to a hosted solution was not an option. Over time, however, the maintenance requirements of this self-maintained ELK stack were staggering. Filebeat, for example, filled up all the disks on all the servers in a matter of hours, not once, but twice (and for different reasons) when it could not reach its Logstash/Elasticsearch endpoint. Metricbeat suffered from a similar issue: It used far too much disk space relative to the value provided in its Elasticsearch indices. And while provisioning a self-hosted ELK stack has gotten easier over the years, it's still a lengthy process, which requires extra care anytime an upgrade is needed. Are these problems solvable? Yes. But for our purposes, a simpler solution made more sense.

Enter rsyslog. rsyslog has been around since 2004. It's an alternative to the traditional syslogd and to syslog-ng. It's fast. And relative to an ELK stack, its RAM and CPU requirements are negligible.

This idea started as a proof-of-concept, and quickly turned into a production-ready centralized logging service. Our goals are as follows:

  1. Set up a single VM to serve as a centralized log aggregator. We want the simplest possible solution, so we're going to combine all logs for each environment into a single log file, relying on the source IP address, hostname, log facility, and tag in each log line to differentiate where logs are coming from. Then, we can use tail, grep, and other command-line tools to watch or search those files, as we previously did through the Kibana web interface.
  2. On every other server in our cluster, we'll also use rsyslog to read and forward logs from the log files created by our application. In other words, we want an rsyslog configuration to mimic how Filebeat worked for us previously (or how the AWS CloudWatch Logs agent works, if you're using AWS).

Disclaimer: Throughout this post, we'll show you how to install and configure rsyslog manually, but you'll probably want to automate that with your configuration management tool of choice (Ansible, Salt, Chef, Puppet, etc.).

Log Aggregator Setup

On a central logging server, first install rsyslog and its relp module (for lossless log sending/receiving):

sudo apt install rsyslog rsyslog-relp

As of 2019, rsyslog is the default logger on current Debian and Ubuntu releases, but rsyslog-relp is not installed by default. We've included both for clarity.

Now, we need to create a minimal rsyslog configuration to receive logs and write them to one or more files. Let's create a file at /etc/rsyslog.d/00-log-aggregator.conf, with the following content:

module(load="imrelp")

ruleset(name="receive_from_12514") {
    action(type="omfile" file="/data/logs/production.log")
}

input(type="imrelp" port="12514" ruleset="receive_from_12514")

If needed, we can listen on one or more additional ports, and write those logs to a different file by appending new ruleset and input settings in our config file:

ruleset(name="receive_from_12515") {
    action(type="omfile" file="/data/logs/staging.log")
}

input(type="imrelp" port="12515" ruleset="receive_from_12515")
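One wrinkle to be aware of: rsyslog won't create the /data/logs directory on its own, and it needs write access there. A minimal setup sketch (the syslog user is the default unprivileged user rsyslog drops to on Debian/Ubuntu; adjust the owner if your distribution differs):

```shell
# Create the directory the rulesets above write to, and hand it to
# the syslog user (default on Debian/Ubuntu -- adjust as needed)
sudo mkdir -p /data/logs
sudo chown syslog:adm /data/logs

# Restart so rsyslog picks up the new config file
sudo service rsyslog restart
```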

Rotating Logs

You'll probably want to rotate these logs from time to time as well. You can do that with a simple logrotate config. Create a new file /etc/logrotate.d/rsyslog_aggregator with the following content:

/data/logs/*.log {
  rotate 365
  daily
  compress
  missingok
  notifempty
  dateext
  dateformat .%Y-%m-%d
  dateyesterday
  postrotate
      /usr/lib/rsyslog/rsyslog-rotate
  endscript
}

This configuration will rotate log files daily, compress older files, and rename the rotated files with the applicable date.

To see what this logrotate configuration will do (without actually doing anything), you can run it with the --debug option:

logrotate --debug /etc/logrotate.d/rsyslog_aggregator

To customize this configuration further, look at the logrotate man page (or type man logrotate on your UNIX-like operating system of choice).
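Once you're satisfied with the dry run, you can also force an immediate rotation to verify the configuration end-to-end (note that this rotates the real log files):

```shell
# Force a rotation now, regardless of the daily schedule
sudo logrotate --force /etc/logrotate.d/rsyslog_aggregator

# A dated, compressed copy (e.g. production.log.2019-05-23.gz)
# should now sit alongside the fresh log file
ls -l /data/logs/
```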

Sending Logs to Our Central Server

We can also use rsyslog to send logs to our central server, with the help of the imfile module. First, we'll need the same packages installed on the server:

sudo apt install rsyslog rsyslog-relp

Create a file /etc/rsyslog.d/90-log-forwarder.conf with the following content:

# Poll each file every 2 seconds
module(load="imfile" PollingInterval="2")

# Create a ruleset to send logs to the right port for our environment
module(load="omrelp")
ruleset(name="send_to_remote") {
    action(type="omrelp" target="syslog" port="12514")  # production
}

# Send all files on this server to the same remote, tagged appropriately
input(
    type="imfile"
    File="/home/myapp/logs/myapp_django.log"
    Tag="myapp_django:"
    Facility="local7"
    Ruleset="send_to_remote"
)
input(
    type="imfile"
    File="/home/myapp/logs/myapp_celery.log"
    Tag="myapp_celery:"
    Facility="local7"
    Ruleset="send_to_remote"
)

Again, I listed a few example log files and tags here, but you may wish to create this file with a configuration management tool that allows you to templatize it (and create each input() in a Jinja2 {% for %} loop, for example).

Be sure to restart rsyslog (i.e., sudo service rsyslog restart) any time you change this configuration file, and inspect /var/log/syslog carefully for any errors reading and/or sending your log files.
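Before restarting, it can save a round of head-scratching to have rsyslogd check the configuration itself; the -N flag runs a validation pass without starting the daemon:

```shell
# Validate the full rsyslog configuration (level-1 check) without
# starting the daemon; any errors are printed out
sudo rsyslogd -N1

# If that's clean, restart and scan the local syslog for trouble
sudo service rsyslog restart
grep -i 'rsyslog.*error' /var/log/syslog | tail
```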

Watching & Searching Logs

Since we've given up our fancy Kibana web interface, we need to search logs through the command line now. Thankfully, that's fairly easy with the help of tail, grep, and zgrep.

To watch logs come through as they happen, just type:

tail -f /data/logs/staging.log

You can also pipe that into grep, to narrow down the logs you're watching to a specific host or tag, for example:

tail -f /data/logs/staging.log | grep django_celery

If you want to search previous log entries from today, you can do that with grep, too:

grep myapp_django /data/logs/staging.log

If you want to search the logs for a few specific days, you can do that with zgrep:

zgrep myapp_celery /data/logs/staging.log.2019-05-{23,24,25}.gz

Of course, you could search all logs from all time with the same method, but that might take a while:

zgrep myapp_django /data/logs/staging.log.*.gz
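Simple pipelines can also stand in for Kibana-style aggregations. For example, here's a rough tally of log volume per host, run against a small sample file for illustration (with the traditional syslog layout the hostname is the fourth field; adjust the field number if your rsyslog output template differs):

```shell
# Sample lines in the traditional syslog layout:
# "<month> <day> <time> <host> <tag>: <message>"
cat > /tmp/sample.log <<'EOF'
May 23 12:00:01 web1 myapp_django: request ok
May 23 12:00:02 web2 myapp_celery: task done
May 23 12:00:03 web1 myapp_django: request ok
EOF

# Count lines per hostname (field 4), busiest hosts first
awk '{print $4}' /tmp/sample.log | sort | uniq -c | sort -rn
```

The same pipeline works against /data/logs/production.log once logs are flowing.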

Conclusion

There are myriad ways to configure rsyslog (and centralized logging generally), often with little documentation about how best to do so. Hopefully this helps you consolidate logs with minimal resource overhead. Feel free to comment below with feedback, questions, or the results of your tests with this method.

Josh JohnsonBranching With Git And Testing With Pytest: A Comprehensive Guide

This is part one of a three-part series. This is a comprehensive guide to a basic development workflow. Using a simple, but non-trivial web application, we learn how to write tests, fix bugs, and add features using pytest and git, via feature branches. Along the way we'll touch on application design and discuss best practices.

In this installment, we will:

  • Talk a bit about the design of the example application.
  • Ensure we are set up for development.
  • Exercise the basics of pytest, and git by writing some tests, adding a fixture, and committing our changes.

Joe GregorioStream

I’ve launched a new micro-blog at stream.bitworking.org, which has an Atom Feed if you want to follow along. You can also follow along on Mastodon by following @stream.bitworking.org@stream.bitworking.org thanks to https://fed.brid.gy/. Any entries will also appear on Twitter thanks to https://brid.gy. Interactions on any of those sites should flow back to Stream thanks to webmention support via github.com/jcgregorio/webmention-run.

Finally the admin interface to Stream is a PWA that supports the Web Share Target API, which means I can trivially share content to Stream using the native Android Share intent.

The backend is written in Go and it runs entirely on Google Cloud Run. The login is handled via Google Sign-In for Websites, and Workbox is used for the PWA aspects.

The code for Stream is on GitHub and I’ve endeavored to make it customizable via the config.json file, but no guarantees since I just got it all working today.

Caktus GroupOur Favorite PyCon 2019 Presentations

Above: A view of the busy exhibit hall. Photo copyright © 2019 by Sean Harrison. All rights reserved.

PyCon 2019 attracted 3,393 attendees, including a group of six Cakti. When we weren’t networking with attendees at our booth, we attended some fascinating presentations. Below are some of our favorites. You can watch these talks and more on the PyCon 2019 YouTube channel.

From Co-founder & CTO Colin Copeland:

One tutorial and one talk stood out as my favorites.

Data Science Best Practices with pandas

I attended four tutorials and this was the last one on Thursday. Each tutorial is hands-on and lasts about three hours. By far, this was my favorite, primarily due to the exercise-based format. Kevin Markham was well organized and a great teacher, most likely because he runs Data School, which I discovered is in Asheville! The tutorial centered around analyzing the TED Talks dataset from Kaggle. Kevin live-coded each lesson, demonstrating best practices for slicing and analyzing the dataset with pandas. Then he turned it over to us, providing several possible exercises with increasing levels of difficulty, which utilized the tools he had just taught. I found the in-person exercises valuable, as they made us practice the techniques right then and, therefore, learn through experience. I overbooked myself with four tutorials and didn’t have enough time to practice what I learned in all of them, so I felt like I got the most out of Kevin’s format, and I look forward to future tutorials with him. The full video of the tutorial is below, or you can view a condensed version.


Plan your next eclipse viewing with Jupyter and geopandas

This was the first talk I attended. I was excited to learn about using maps in Jupyter Notebooks, as I hadn’t had the chance to do so yet. Christy Heaton’s talk was very accessible, built around an easy-to-follow hypothetical question: In what cities will we be able to see upcoming solar eclipses? After a brief intro on eclipses, spatial data, and coordinate systems, she walked through a Jupyter Notebook, demonstrating the ease of mapping data in a notebook with geopandas, eventually piecing together cities, eclipse paths, and years to show which cities would be best for viewing the 2024 eclipse over the U.S. I learned a lot from this talk, especially how easy it is to use matplotlib to visualize DataFrames with geometries. It’s great to see how notebooks can be used to easily explore spatial data with pandas and geopandas.


From Technology Support Specialist Scott Morningstar:

There was one PyCon talk that rose to the top for me.

Building an Open Source Artificial Pancreas

Sarah Withee spoke from the heart on this topic. She described the OpenAPS (open artificial pancreas system) which was created from an open source project that also involved hardware. It combines glucose monitors and insulin pumps to automatically manage insulin levels for those with type 1 diabetes. It was a project that the medical device companies didn’t want to take on because they didn’t think it’d be profitable. So Sarah and others took the matter into their own hands. From the talk, I learned a lot about how the pumps work and interact with each other and I saw how life-changing it is. Sarah is an adopter of the device, and she spoke about how her blood sugar levels went from being all over the place to being much more stable. The project really speaks to the power of people and open source. Now, the medical device companies are finally trying to incorporate it into their devices.


From Contractor Sean Harrison:

My top 3 picks this year included the following:

Leveraging the type system to write secure applications

Shannon Zhu from Instagram provided an informative overview of how Instagram has used type hints with the Pyre library to improve the security of millions of lines of Python. Type hints are a new area for me. This talk convinced me that starting to use them would be beneficial — not for improving performance, because they aren’t (yet?) used for that, but for the sake of validating the software’s interface security, particularly in terms of ensuring that “taints” from user input are “cleaned” before they are used.


Put down the deep learning: When not to use neural networks and what to do instead

Rachael Tatman from Kaggle gave an excellent overview of the different techniques of data science and encouraged people not to use deep learning until they know they need it. Machine learning has been and still is all the rage, and there were a lot of data scientists at PyCon. But I was glad to see some of the more traditional statistical methods being promoted alongside the newer ones.


Strategies for testing async code

Async was half the buzz at PyCon this year (the other half was machine learning), but a lot of places aren’t using asyncio very much yet. That’s one of the reasons I really liked this talk — not only did Neil Chazin, from Agari, give an accessible intro/overview to async programming, but he helped answer one of the key questions, which is “How do I write tests for it?” Watch his talk for the answer:


Remember, you can watch more presentations from PyCon 2019 on YouTube. Comment below on your favorites!

Joe GregorioOpenID Connect for US Citizens

Is there an OpenID Connect for US Citizens run by the US government? I’m not sure why I’ve been thinking a lot about such a utility lately. </sarcasm>

Caktus GroupBe Quick or Eat Potatoes: A Newbie’s Guide to PyCon

Pictured: I traveled to Cleveland, OH, for PyCon 2019, where I got this shot of the city skyline.

This year I attended PyCon for the first time. It’s rather amazing that I haven’t been before, since I’ve been using Python professionally for over 15 years. PyCon 2019 was held in Cleveland from May 1–9. There was so much to take in, and there are so many good things to say about it. It was a fantastic experience. But rather than provide a “mission report: 2019” a la Winter Soldier, I thought I’d do something more useful — write a guide to PyCon from a newbie perspective. Here are six lessons I learned from my first PyCon.

1. Register Early

PyCon regularly sells out, and the organizers have stated no desire to increase the size of the show (about 3,400 participants). We’re not yet to WWDC levels (Apple’s developer conference that sells out moments after registration opens), but it’s a good idea to register for PyCon as early as possible.

I learned this the hard way. By the time I was ready to register in mid-April, the conference was already sold out. Fortunately, Caktus Group, as a sponsor, had an extra conference pass available, so I was able to go after all. Note to self: Next year, register in February.

2. Open a Space

Pictured: The PyCon Open Spaces sign-up sheet. One of the most attractive aspects of PyCon is the open spaces. It’s very simple: There are several rooms available throughout the conference. The organizers put out boards with a schedule for each full conference day, and some cards and pens. Anyone (anyone!) can write a session topic on a card and post it on the board for a particular time and room. Then if anyone else is interested in your topic, they show up.

This simple setup is really inspiring. I’ve been to a lot of conferences over the years, and I’m always impressed that thousands of people will travel hundreds or even thousands of miles (and, at PyCon this year, from 59 countries) to sit in conference rooms and … passively listen to other people talk. I’ve often thought at such events, I could do this at home. Why don’t we have more open conversations? At PyCon, we can! This was one of my favorite aspects of the conference.

I attended several interesting open spaces, and there were several others that I wish I could have attended. Of note are:

  • Caktus’ own Scott Morningstar hosted a game called WINTERHORN. This live action game is a simulation in which you try to use a variety of bad techniques to disrupt people’s lives. If you win: Congratulations, you’ve done evil in the world! Then you go away and start seeing these techniques in use, and you’re wiser to the ways of the world. (You can read more about Scott and his passion for live action role play games.)

  • Vince Salvinno from CodeRed hosted an open space on Wagtail CMS. It was a great opportunity to talk about how different people are using the CMS. There were also several people present who were new to or just exploring Wagtail, so we were able to answer their questions and help them get enthused about this excellent CMS framework.

  • I went to the open space on Falcon Framework, a very fast lightweight API framework. Here, I was the new person exploring something for the first time. The team spent much of the session talking about the framework in general terms for newbies like me, as well as plans for the upcoming releases (async was a big topic). I was so impressed with both the team and the framework that I decided to join them during the sprints (more on that below).

Pictured: A card advertising the Wagtail CMS open space.

I also dove right in and hosted an open space myself — “Making Wagtail Better for Developers.” At Caktus we use Wagtail to provide our customers with a great content editing experience. We have another post on why Caktus loves Wagtail. It really is an excellent CMS, but there are a few aspects that could be made better, especially when it comes to data migrations, importing and exporting content, and editing content with nested StreamFields. I wanted to talk about a few of those issues with other developers and compare notes. Apparently, others wanted to talk about these things, too, because they showed up!

Pictured: The PyCon 2019 open space session on “Making Wagtail Better for Developers.” We made Wagtail better. (Photo by Colin Copeland.)

We had a good conversation about Wagtail and its development, and I was inspired to help make Wagtail better for developers by contributing to the project (I plan to talk about my work on data migrations in an upcoming blog post).

That is one of the best outcomes from a conference like this: You come away inspired to do better work.

3. Try Some Talks

When most people go to conferences, they focus on the talks, and while talks and keynotes are a great source of inspiration and learning, you can wear yourself out trying to go to them all. PyCon had a packed schedule of talks and keynotes (upcoming blog posts will go over some of them). Instead of maxing yourself out on all the talks, my recommendation is to pick only two or three talks to attend each day. Pick ones that will expand your knowledge and help you to think about how to grow in the coming year. Many of the scheduled speakers make their slides and recordings freely available after the conference, so if you can’t get to a talk during PyCon, you can continue the conference later in the comfort of your own home.

4. Nurture Your Network

It’s all about the community, they say, and seasoned conference goers know that conferences are one of the best opportunities to “make new friends, and keep the old.” Frankly, I put a higher priority on relationship-building activities than going to talks, and PyCon has a plethora of these activities to help you nurture your network:

  • Open Spaces: Good ones are engaging and relational. They aren’t just lectures.

  • Breakfast and Lunch: Every table is large, and you can join any conversation. Don’t miss the opportunity to talk with someone new, like I did by sleeping late and missing breakfast. At least I always got to lunch and met new people there.

  • Dinner: If you meet a new friend or an old one, you can have dinner together offsite. This is a greater time commitment than breakfast or lunch, and can lead to a much deeper relationship. This is especially important for those who work remotely, like me: Having a meal or drink together is just about the only thing we can’t do on Slack. Over a meal, you can get to know people so much better, and the good vibes carry over into a better working relationship. Pictured: A group of us from Caktus had dinner one night with a couple of folks from the Truth Initiative, with whom we’ve recently been working.

  • Vendors / Sponsors / Exhibitors: Many of the exhibitors are developers, and all of them have periods of boredom during three days of exhibiting. If you see someone at a booth who is not talking to anyone, they are probably bored out of their wits and would welcome an opportunity to talk to you. Just walk up and say, “Hi! Can you tell me about [whatever it is that they are exhibiting]?” It can lead to some of the best conversations you have. Also, free t-shirts. However, the Caktus sales team was not bored during the conference — whenever I went by, Tim Scales and Ian Huckabee (pictured) were always busy talking with someone.

  • Job Fair: The job fair, which is open to the public, is another great opportunity to talk with company representatives. You’re less likely to find someone bored, but you’ll be able to pack a lot of conversations into a couple of hours. Even if you’re not looking for a new job, it’s interesting to see what opportunities are available, and see if there is a new skill set you want to work on. You might even find the perfect opportunity. Last year, I met Caktus at the PyCon job fair (I went to the fair but not the conference itself). This year, I talked with a lot of other companies and learned more about the lay of the land. My take: There’s good work available for Python engineers who understand DevOps, can build APIs, can support data scientists, and work well in a dynamic, collaborative environment. In fact, that sounds to me a lot like working with Caktus Group. They’re hiring, by the way.

  • Sprints: Not only are you contributing to a project, but you are working alongside some really smart people whom you might have just met. (More on the sprints below.)

5. Eat Early

Speaking of meals, it’s imperative to arrive early to the meal line, because developers are a hungry lot, and there isn’t really extra. I learned this the hard way. One day, I was so busy sprinting and trying to get my tests working that I didn’t go to lunch until 45 minutes after the opening bell. At that point, all that was left was a large plate of potatoes. At least they were really good potatoes. Pictured: The lunch buffet at PyCon 2019. I was on time that day.

6. Sprint!

After the conference itself there are four days of sprints. These are like open spaces, but for coding. Anyone can host a sprint for their open-source project. You get a table in one of the open spaces rooms, and you code together for any length of time, up to four days. The conference organizers provide the space (with power and internet) and lunch (be quick or eat potatoes), and you provide the code. It’s a pretty sweet opportunity to get your feet wet with some new projects, or to work with your project team in person (a great opportunity when you all live in different places). It’s a great time to be with others, and despite the fact that a lot of joking and talking takes place around the table, a lot of serious coding also happens.

For my part, even though my work and family schedule didn’t allow me to stay for the entire sprint, I decided to stay through Monday (the first sprint day) and check out a couple of projects. Based on my interactions during the conference, I chose to sprint with Falcon Framework (top photo at right) for a few hours, and then switched to CodeRed CMS (bottom photo at right) for a couple of hours. I was a complete newbie on both projects, but both teams were super helpful in getting me onboard, and I was even able to commit some code to each project (see here and here) before driving home Monday evening. Sprinting with these teams was a wonderful experience for me.

Prepare for Pittsburgh

Cleveland was pretty cool. I’m already looking forward to being in Pittsburgh for PyCon 2020. I hope to see you there. Just be sure to register on time — details about the next PyCon should be released mid-summer; you can watch @PyCon on Twitter for updates.

Unless noted, all images copyright © 2019 by Sean Harrison. All rights reserved.

Joe GregorioColostomy Takedown

This colostomy takedown surgery is the second in the pair of surgeries I have had this year. If you would like to read the story of how I came to need a colostomy takedown please read A thing that happened.

Unlike the first surgery, which was done as an emergency procedure, this was a planned surgery, and that made a world of difference. We were able to research and hire a patient advocate to stay with me a few nights, apply for short-term disability before the surgery, and so on.

Day of surgery

I was told to arrive at 11 AM for a 2 PM surgery, but when I got there at 11 they told me that both the surgeon and anesthetist had arrived and were ready to start, so my prep was actually pretty quick. They brought me back without Lynne to get me dressed and an IV started, and by the time she was allowed to come back and see me, we only had 5 minutes together before they wheeled me off to the operating room.

The prep for the surgery went well. I have deep and abiding issues with getting IVs, a leftover from a previous surgery years ago when I was traumatized by an awful IV experience. This time, instead of just sucking it up, I talked to the anesthesiologist beforehand, and they added notes to my chart about my issues with IVs. On their advice, I also talked to the nurse giving me the IV. She was very understanding and gave me a Lidocaine shot before placing the IV. This turned out to be a much better strategy than sucking it up, and the IV was painless and easy.

Compared to the first surgery, which was an emergency surgery, I was awake for much more of the process leading up to the operation, even helping to move myself onto the operating table and watching them apply straps to hold me in place. I was also woken up in the operating room at the end of the surgery; I remember moving off the operating table and being rolled into the PACU.

Going into the surgery, the goal was to do the whole thing laparoscopically, but if that didn’t work they would have to re-open me in the same dramatic fashion as the first surgery. Also, if there was still too much swelling or scarring on my large intestine, they might have to give me an ileostomy that I would then need to wait yet another three months to have taken down. Luckily all went well, and when I woke up I was glad to hear that they were able to do the whole thing laparoscopically and that I didn’t have an ostomy of any kind.

When you come out of surgery you initially have no pain medication in you, so the staff need to judge how much pain you are in and how much pain medication to give you. This is always a point of confusion post-op for me, because I do this thing where I shake when I’m in extreme pain. Not a little shaking, mind you; I’m talking full-racking-body-swaying-the-hospital-bed shakes. This, unfortunately, leads the nurse to believe I’m cold, not in pain, so they start wrapping me in more and more layers of blankets. By the time they let Lynne into the PACU I was under 10 blankets, 6 over my body and 4 around my head. She explained that I wasn’t cold, but in pain. The nurse gave me my first dose of pain medicine, but having just come out from under anesthesia I was a little incoherent:

Nurse: Did that help?

Me: Yes.

Lynne: You are still shaking, are you still in pain?

Me: Yes.

Lynne: So that last set of pain meds wasn't enough?

Me: What pain meds?

As you can see from this exchange, both the nurse and Lynne are saints.

After they had my pain under control they set me up with a pain pump and moved me to my room.

The doctors and nurses told me that the more I walked the better I would heal, and I took them very seriously. I was out of the PACU and in my room by 5 PM, and took my first walk an hour or so later, with Lynne pushing the IV pole and me using a walker. Lynne then had to leave to take care of the kids for the night, so I took another walk with the help of the nurse at 8 PM. That exhausted me and I took an hour nap, but that got me behind on the pain pump, and I felt it when I woke up at 9 PM.

My pain was mostly in the area where the colostomy was closed, except for the hiccups, which caused intense pain right below the rib cage. My doctor explained that they were the result of the CO2 they pumped me full of for the laparoscopic surgery, and that they should go away as my body absorbed the rest of the CO2.

I actually slept pretty well that first night, sleeping in solid blocks of 2-4 hours and then waking up to use the pain pump.

Day 2

I was serious about my walking, and if I was awake and had my pain under control I would try to get out for a walk. I was also serious about keeping hydrated, as I had been since the first surgery, as becoming dehydrated and constipated with a colostomy was something I dreaded. So as the day wore on I kept drinking and walking, but after a while it felt like I couldn’t drink any more and my stomach was getting sore. The Dr. for that shift suggested I stop drinking; my bowels probably hadn’t woken up yet, and everything I was drinking was just accumulating in my stomach. I did hold off on the liquids for the rest of the day, and when I saw my own Dr. later in the day she agreed with his assessment. At this point I suggested that if I wasn’t going to drink, maybe they should increase my rate of IV fluids above the KVO (Keep Vein Open) level so I didn’t get dehydrated. She agreed and actually changed me to IV Lactate. I don’t think anyone would have thought of this on their own; I’m just pointing out that no matter how great the care, you really need to be your own advocate.

That morning they removed both the catheter and the pain pump (I apparently didn’t use the pain pump very much) and moved me to oral pain meds, falling back to IV injection for breakthrough pain, which didn’t happen very often. I was still having hiccups, which were still painful and would be triggered by coughing, laughing, or, most annoyingly, just saying the word “hiccups”.

I walked 6 more times, now without the walker, since without the pain pump I was very stable on my feet. As the day went on my stomach got better, and towards evening I started to pee a lot more, around 700ml every hour or two. The IV fluids were only coming in at 75ml/h, so that wasn’t the source, and I hoped that my digestive tract had started to wake up.

I had a total of four bowel movements during the day, but they were entirely blood clots, and each one was progressively smaller than the previous one, and since I didn’t pass any gas they didn’t “count”. What the staff were waiting for was me to pass gas, at which point I would be allowed to transition from “clears” to solid foods. I know they were blood clots and not “blood” because I dragged a nurse into the bathroom each time to inspect them and confirm that it wasn’t “blood”.

I took 3 more walks through the evening for a total of 10 walks for the whole day.

Day 3

Early in the morning I saw the other Dr. and he said I had urinated over 3L during the night; in addition, the nurse listened to my abdomen and said it sounded very active, so I was hopeful that would be enough evidence that my bowels had woken up.

The majority of the day was again more walks, keeping on top of oral pain meds, and napping between walks. The only change was that the weather was now beautiful, so I upgraded to walking around outside.

The hiccups are gone at this point, and the pain is now mostly at the site of the former colostomy, and only when transitioning between walking and lying down.

I saw my surgeon at 3 PM and she wrote me up for solid foods, and I immediately ate one Ritz cracker from a sleeve I had squirreled away in my travel bag for just such an occasion. Later they delivered the hospital dinner, which I nibbled at, and then I went for another walk and back to sleep. I woke later that evening and had a small bowl of chicken noodle soup they prepared at the nurses’ station, got my pain pills, and went back to sleep.

Day 4

I took my first walk at 6 AM and had a bowel movement with real stool and a small amount of blood clot. I also had a lot of gas coming out both ends of my digestive tract.

Later that morning I walked down to Au Bon Pain in the hospital lobby to get a croissant for breakfast, and I also went back down there for lunch. Yes, the hospital food was that bad. The folks working at the Au Bon Pain seemed oblivious to me wearing a hospital gown and pushing my IV pole as I ordered my lunch, but some of the other patrons gave me wary looks.

I was released later that day.

Day 5

I didn’t realize that when they released me I was still on 10mg of oxy every 4 hours, but they wrote a prescription for 5mg of oxy every 4 hours. It took me some time to coordinate my pills and get on an overlapping Motrin/Tylenol schedule, with just oxy for the breakthrough pain.

At this point I am home and walking 2-3 times a day, where each walk is about a mile long. A new symptom appeared at this point: occasionally I get a muscle cramp in my abdomen around the old ostomy site, and it slowly spreads across my entire upper abdomen. It usually only lasts a minute or two, but it is fairly painful when it happens.

Day 7

I am no longer using the oxy for breakthrough pain.

I did have a bit of a panic this day. I had been regular, and my stools were beginning to become more formed, but then I “missed”, or really just went a few hours past when I was due for a BM, so I tried lots of things: taking a stool softener, drinking apple juice, etc. A few hours later I had a normal BM, but unfortunately all the things I had tried to loosen up my stools were still in my system working away, and I ended up giving myself diarrhea and re-irritating my bowels. Fortunately that died down over the next 24 hours.

Day 11

No longer taking Motrin or Tylenol on a regular basis. At this point I am fine if I am standing, walking, or lying down, but sitting gets uncomfortable, and if I sit too long then when I stand up the area just under my ribs feels uncomfortable. It’s hard to describe, but it almost feels like my intestines stiffen into one position when I sit, and then when I stand up they resist, in a painful way, moving back into the standing position.

Day 12

Stopped wet packing the former colostomy site as it was almost completely closed.

Day 14

Actually had enough brain power to do some real programming, but that only lasted for about 30 minutes; it is shocking how much pain and healing will take out of you and turn your brain to mush.

Day 30

First day back to work. I am still taking Motrin occasionally, usually if I just try to do too much. I can now sit for much longer periods of time, and I don’t get that stiffness in my abdomen when I stand, but I will still occasionally get a muscle spasm around the old ostomy site. All my wounds are closed at this point, but my abdomen is covered with scars, swollen in some areas from the last surgery and distended from where the colostomy was; in summary, I look like my stomach was run over by farm equipment.

At this point I am 20 lbs lighter than when I went into the emergency room for that first surgery. I lost 15 lbs from the first surgery and another 5 lbs from this latest surgery. I’m fine with the weight loss, just not what I had to go through to get here.

Joe GregorioWebmention on Google Cloud Run

I just published webmention-run, a Google Cloud Run application written in Go that implements Webmention. I’m now using it to handle webmentions on bitworking.org. Given the generous free quota for Cloud Run, I don’t expect this to cost me anything. This is on top of using Firebase Hosting to host the static (Jekyll) parts of my blog, which is also effectively free.

Another awesome feature is that both services provide SSL certificates; in my case Firebase Hosting provides the cert for https://bitworking.org, and Google Cloud Run provides the cert for the subdomain where my instance of webmention-run is running, https://webmention.bitworking.org.

Joe GregorioThe Great Famine of 1315-1317

Great Famine

The Great Famine of 1315-1317 only lasted two years, was nowhere close to the change in climate that we are looking in the face right now, and it still wiped out 10-25% of the population.

To provide some measure of relief, the future was mortgaged by slaughtering the draft animals, eating the seed grain, abandoning children to fend for themselves (see “Hansel and Gretel”) and, among old people, voluntarily refusing food for the younger generation to survive.

One of the things that frightens me most about climate change is that small changes can have drastic effects, and institutions can unravel much more quickly than anyone imagines. My fear is that by the time things get bad enough that we need to try things like geoengineering, our institutions will have fallen apart and we’ll be incapable of launching such efforts.

Joe Gregoriosimplifying income

Bill Gates

In terms of revenue collection, you wouldn’t want to just focus on the ordinary income rate, because people who are wealthy have a rounding error of ordinary income.

I would love to see the U.S. do away with categories of income (income, earned interest, capital gains, etc) and make it all just one bucket and tax that at a progressive rate.
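Mechanically, a single progressive schedule over one combined income bucket is straightforward. Here's a toy sketch in Python; the bracket thresholds and rates are made up purely for illustration, not a policy proposal:

```python
def progressive_tax(income, brackets):
    """Tax one combined income figure against marginal brackets.

    brackets: ascending (upper_bound, rate) pairs; the final upper bound
    should be float("inf"). Each rate applies only to the slice of income
    that falls inside its bracket.
    """
    owed = 0.0
    lower = 0.0
    for upper, rate in brackets:
        if income <= lower:
            break
        owed += (min(income, upper) - lower) * rate
        lower = upper
    return owed


# Made-up brackets: 10% up to $10k, 25% up to $50k, 40% above that.
BRACKETS = [(10_000, 0.10), (50_000, 0.25), (float("inf"), 0.40)]
```

Wages, earned interest, and capital gains would simply be summed into `income` before applying the schedule.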

Joe GregorioI’m not done trickling on you!

Look at poor Ken, so freaked out that everyone is talking about taxes. Obviously a firm believer in trickle-down economics, his entire screed boils down to:

Don’t tax me, bro! I’m not done trickling on you!

Ironically he refers to Bill Gates, who appears not to agree with Ken at all.

To get the full context it’s useful to watch the video from the beginning, where historian Rutger Bregman schools Michael Dell on his ignorant comment about a top marginal tax rate of 70%:

Joe GregorioConcentrated corporate power as a threat to democracy

In this essay, I will argue that the interaction of concentrated corporate power and politics is a threat to the functioning of the free market economy and to the economic prosperity it can generate, and a threat to democracy as well.

Towards a Political Theory of the Firm (PDF).

Glad to see the Chicago School of Economics trying to resuscitate their image after the previous damage they’ve done.

Joe GregorioCEO Pay

I’ve said the Harvard MBA is the second most damaging thing to happen to business in the last 40 years. I might have to clarify that to “the myth of the Harvard MBA is the second most damaging thing to happen to business in the last 40 years”.

We found no statistically significant alphas — despite testing every possible school with a reasonable sample size. MBA programs simply do not produce CEOs who are better at running companies, if performance is measured by stock price return.

But if there is no evidence that stock returns are attributable to CEOs, then what justification is there for their stratospheric pay? How much longer will investors and boards be fooled by randomness and hollow credentialism?

Oh, and the first most damaging thing to happen to business in the past 40 years? The Chicago School of Economics.

Joe GregorioArts and crafts

When you have a stoma every day is arts and crafts day.

I swear I’ve modified more articles of clothing than my sister did during her entire teenage life in the 80s.

Caktus GroupHow to Switch to a Custom Django User Model Mid-Project

The Django documentation recommends always starting your project with a custom user model (even if it's identical to Django's to begin with), to make it easier to customize later if you need to. But what are you supposed to do if you didn't see this when starting a project, or if you inherited a project without a custom user model and you need to add one?

At Caktus, when Django first added support for a custom user model, we were still using South for migrations. Hard to believe! Nearly six years ago, I wrote a post about migrating to a custom user model that is, of course, largely obsolete now that Django has built-in support for database migrations. As such, I thought it would be helpful to put together a new post for anyone who needs to add a custom user model to their existing project on Django 2.0+.

Background

As of the time of this post, ticket #25313 is open in the Django ticket tracker for adding further documentation about this issue. This ticket includes some high-level steps to follow when moving to a custom user model, and I recommend familiarizing yourself with this first. As noted in the documentation under Changing to a custom user model mid-project, "Changing AUTH_USER_MODEL after you’ve created database tables is significantly more difficult since it affects foreign keys and many-to-many relationships, for example."

The instructions I put together below vary somewhat from the high-level instructions in ticket #25313, I think (hope) in positive and less destructive ways. That said, there's a reason this ticket has been open for more than four years — it’s hard. So, as mentioned in the ticket:

Proceed with caution, and make sure you have a database backup (and a working process for restoring it) before changing your production database.

Overview

Steps 1 and 2 below are the same as they were in 2013 (circa Django 1.5), and everything after that differs since we're now using Django's built-in migrations (instead of South). At a high level, our strategy is to create a model in one of our own apps that has all the same fields as auth.User and uses the same underlying database table. Then, we fake the initial migration for our custom user model, test the changes thoroughly, and deploy everything up until this point to production. Once complete, you'll have a custom user model in your project, as recommended in the Django documentation, which you can continue to tweak to your liking.

Contrary to some other methods (including my 2013 post), I chose this time to update the existing auth_user table to help ensure existing foreign key references stay intact. The downside is that it currently requires a little manual fiddling in the database. Still, if you're using a database with referential integrity checking (which you should be), you'll sleep easier at night knowing you didn't mess up a data migration affecting all the users in your database.

If you (and a few others) can confirm that something like the below works for you, then perhaps some iteration of this process may make it into the Django documentation at some point.

Migration Process

Here's my approach for switching to a custom user model mid-project:

  1. Assumptions:

    • You have an existing project without a custom user model.
    • You're using Django's migrations, and all migrations are up-to-date (and have been applied to the production database).
    • You have an existing set of users that you need to keep, and any number of models that point to Django's built-in User model.
  2. First, assess any third party apps to make sure they either don't have any references to Django's User model, or if they do, that they use Django's generic methods for referencing the user model.

  3. Next, do the same thing for your own project. Go through the code looking for any references you might have to the User model, and replace them with the same generic references. In short, you can use the get_user_model() method to get the model directly, or if you need to create a ForeignKey or other database relationship to the user model, use settings.AUTH_USER_MODEL (which is simply a string corresponding to the appname.ModelName path to the user model).

    Note that get_user_model() cannot be called at the module level in any models.py file (and by extension any file that a models.py imports), since you'll end up with a circular import. Generally, it's easier to keep calls to get_user_model() inside a method whenever possible (so it's called at run time rather than load time), and use settings.AUTH_USER_MODEL in all other cases. This isn't always possible (e.g., when creating a ModelForm), but the less you use it at the module level, the fewer circular imports you'll have to stumble your way through.

  4. Start a new users app (or give it another name of your choice, such as accounts). If preferred, you can use an existing app, but it must be an app without any pre-existing migration history because as noted in the Django documentation, "due to limitations of Django’s dynamic dependency feature for swappable models, the model referenced by AUTH_USER_MODEL must be created in the first migration of its app (usually called 0001_initial); otherwise, you'll have dependency issues."

    python manage.py startapp users
    
  5. Add a new User model to users/models.py, with a db_table that will make it use the same database table as the existing auth.User model. For simplicity when updating content types later (and if you'd like your many-to-many table naming in the underlying database schema to match the name of your user model), you should call it User as I've done here. You can rename it later if you like.

    from django.db import models
    from django.contrib.auth.models import AbstractUser
    
    
    class User(AbstractUser):
        class Meta:
            db_table = 'auth_user'
    
  6. As a convenience, if you'd like to inspect the user model via the admin as you go, add an entry for it to users/admin.py:

    from django.contrib import admin
    from django.contrib.auth.admin import UserAdmin
    
    from .models import User
    
    
    admin.site.register(User, UserAdmin)
    
  7. In settings.py, add users to INSTALLED_APPS and set AUTH_USER_MODEL = 'users.User':

    INSTALLED_APPS = [
        # ...
        'users',
    ]
    
    AUTH_USER_MODEL = 'users.User'
    
  8. Create an initial migration for your new User model:

    python manage.py makemigrations
    

    You should end up with a new migration file users/migrations/0001_initial.py.

  9. Since the auth_user table already exists, normally in this situation we would fake this migration with the command python manage.py migrate users --fake-initial. If you try to run that now, however, you'll get an InconsistentMigrationHistory error, because Django performs a sanity check before faking the migration that prevents it from being applied. In particular, it does not allow this migration to be faked because other migrations that depend on it, i.e., any migrations that include references to settings.AUTH_USER_MODEL, have already been run. I'm not entirely sure why Django places this restriction on faking migrations, since the whole point is to tell it that the migration has, in fact, already been applied (if you know why, please comment below). Instead, you can accomplish the same result by adding the initial migration for your new users app to the migration history by hand:

    echo "INSERT INTO django_migrations (app, name, applied) VALUES ('users', '0001_initial', CURRENT_TIMESTAMP);" | python manage.py dbshell
    

    If you're using an app name other than users, replace users in the line above with the name of the Django app that holds your user model.

    At the same time, let's update the django_content_types table with the new app_label for our user model, so existing references to this content type will remain intact. As with the prior database change, this change must be made before running migrate. The reason for this is that migrate will create any non-existent content types, which will then prevent you from updating the old content type with the new app label (with a "duplicate key value violates unique constraint" error).

    echo "UPDATE django_content_type SET app_label = 'users' WHERE app_label = 'auth' and model = 'user';" | python manage.py dbshell
    

    Again, if you called your app something other than users, be sure to update SET app_label = 'users' in the above with your chosen app name.

    Note that this SQL is for Postgres, and may vary somewhat for other database backends.

  10. At this point, you should stop and deploy everything to a staging environment, as attempting to run migrate before manually tweaking your migration history will fail. If your automated deployment process runs migrate (which it likely does), you will need to update that process to run these two SQL statements before migrate (in particular because migrate will create any non-existent content types for you, thereby preventing you from updating the existing content type in the database without further fiddling). Test this process thoroughly (perhaps even multiple times) in a staging environment to make sure you have everything automated correctly.
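    One way to make that automated step safe to run repeatedly is to write both statements so they are no-ops once applied. A sketch, assuming Postgres and the users app name used above:

```sql
-- Record the faked initial migration only if it isn't recorded yet.
INSERT INTO django_migrations (app, name, applied)
SELECT 'users', '0001_initial', CURRENT_TIMESTAMP
WHERE NOT EXISTS (
    SELECT 1 FROM django_migrations
    WHERE app = 'users' AND name = '0001_initial'
);

-- Re-label the existing user content type; matches zero rows on later runs.
UPDATE django_content_type
SET app_label = 'users'
WHERE app_label = 'auth' AND model = 'user';
```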

  11. After testing and fixing any errors, everything up until this point should be deployed to production (and/or any other environments where you need to keep the existing user database), after ensuring that you have a good backup and a process for restoring it in the event anything goes wrong.

  12. Now, you should be able to make changes to your users.User model and run makemigrations / migrate as needed. For example, as a first step, you may wish to rename the auth_user table to something in your users app's namespace. You can do so by removing db_table from your User model, so it looks like this:

    class User(AbstractUser):
        pass
    

    You'll also need to create and run a new migration to make this change in the database:

    python manage.py makemigrations --name rename_user_table
    python manage.py migrate
    

Success?

That should be it. You should now be able to make other changes (and create migrations for those changes) to your custom User model. The types of changes you can make and how to go about making them are outside the scope of this post, so I recommend carefully reading through the Django documentation on substituting a custom User model. In the event you opt to switch from AbstractUser to AbstractBaseUser, be sure to create data migrations for any of the fields provided by AbstractUser that you want to keep before deleting those columns from your database. For more on this topic, check out our post about DjangoCon 2017, where we link to a talk by Julia M Looney titled "Getting the most out of Django’s User Model." My colleague Dmitriy also has a great post with some other suggestions for picking up old projects.

Once again, please test this carefully in a staging environment before attempting it on production, and make sure you have a working database backup. Good luck, and please comment below with any success or failure stories, or ideas on how to improve upon this process!

Caktus GroupThe Secret Lives of Cakti (Part 3): Game On!

Pictured: Scott, Kat, and Tim take a quick break for a game of cards.

It may be no surprise that there are gamers among our Caktus crew, but you may be surprised by the type of games that Cakti play. From the ancient art of Mahjong to the modern fun of Pokemon, our team members cover it all.

Mahjong Master & Crossword Champ Tim Scales

Tim’s grandmother taught him to play Mahjong when he was about 10 years old. She learned to play the Chinese tile-based game in the ‘50s while living in Singapore, and she continued playing after she moved back to England. As a result, it became a regular family pastime. Tim enjoys the speed, complexity, and competitiveness of Mahjong. “It’s a hard game for beginners to pick up, so while I love bringing new people into the game, it’s also a pleasure to play at the breakneck speed of experienced players,” he said.

In addition to Mahjong, Tim’s grandmother also instilled in him a love for crossword puzzles. “When I went to visit her in England, we would complete the Daily Telegraph crossword every morning over breakfast. She lived to be over 100 and was sharp until the end. She swore that doing the crossword was her secret. I view it as an investment in my future mental sharpness,” said Tim, who’s now a daily New York Times crossword solver.

According to the NYT Crossword app, Tim has completed over 800 puzzles, not including the hundreds he previously did on paper! And there’s no stopping him now. He’s on a 15-month streak of solving the NYT crossword (almost) every day. Coincidentally, another Cakti, Karen Tracey, created NYT crosswords that were published from 2003 to 2010. “I went into the archives and found a puzzle that Karen created, and it was hard! It took about twice my average time to complete it,” Tim said. (Go Karen!)

Live Action Storyteller Scott Morningstar

In 1977, Scott’s uncle gave him and his brother Jason the original Dungeons & Dragons boxed set, which kickstarted a lifelong interest in tabletop and role-playing games. Jason began designing these types of games when he was in high school, and Scott tested them, providing feedback and suggestions. Several of Scott’s favorite games are ones that his brother created. In 2005, Jason founded Bully Pulpit Games.

Thanks in large part to D&D, Scott is especially into live-action role-playing games (or LARP games), which involve improvisation, storytelling, and acting. It’s the perfect fit for Scott, who says he was a “theater geek”. He prefers the genre of American Freeform LARP, which can last anywhere from 90 minutes to 4 hours. These games involve freeform story and character development. Scott especially enjoys LARPs based on historical events and true stories, or ones that involve complex scenarios like time travel. “A lot of people play a game to win, but we play for the best story,” Scott explained. “I like to tell a good story. If you watch us play, you’ll see that what we do is closer to improv theatre than gaming.”

For 20 years, Scott and his brother have been involved with a weekly gaming group, and they regularly test games that Jason creates. They’ve also tested games made by designers around the world. “My favorite part of it all is spending time with my brother. Gaming has kept us connected,” Scott said. It’s also an interest that Scott has passed down to his sons, who are also into gaming.

Scott enjoys sharing his gaming interest and bringing other players together. He and his brother founded LARP Shack, which attracts players from as far as Washington, D.C., and Philadelphia. Scott has hosted LARP Shack events at Caktus since August 2017, and the next one will be held on April 27 — details in this private Facebook group and on the Caktus event calendar. Scott also organizes Local Area Network (LAN) parties at Caktus. These parties are a blast from the past! Attendees bring their own computer, which they connect to form a LAN to play online games. It’s a bit like playing online games during the days of dial-up. Scott plans to host the next LAN party sometime this spring.

Video Game Enthusiast Kat Smith

Kat can’t recall a time when she didn’t have a gaming console (or two, or three, or more!). As a kid, she played Bubble Bobble on a classic Nintendo with her grandmother, and she often watched her brother play Final Fantasy X. Other consoles she had growing up included the Super Nintendo, Sega Genesis, Nintendo 64, and PlayStation 2 and 3. Currently, Kat is more interested in PC gaming, but she still owns a PlayStation 4 Pro, Nintendo Switch, and a Wii U. Kat's cat (pictured) also enjoys watching video games.

She used to play a lot of multiplayer online battle arena (MOBA) games, “but the older I get, the more I like relaxing games,” Kat said. Now she’s into world builder, simulation, and visually appealing games. She enjoys open-ended games, and a few of her favorite video games are The Witcher 3, Skyrim, Civilization V, and Banished. She’s also a regular Minecraft player, and has it for PC, phone, and console. “Every 6 months or so I play Minecraft, mostly to see what updates have been made to the game. It's a really relaxing game where you can do whatever you want. And it's very customizable!” Kat said.

Kat has also been active in the Pokemon Go community since July 2016, and she plays daily. She’s now at level 33 (out of 40)! She sometimes participates in raids with fellow Cakti Dmitriy Chukhin. The Civil Rights Mural located outside of the Caktus office is a Pokemon Gym so there's almost always an opportunity to find a Pikachu or a Magikarp on her way in and out of the office. On weekends, Kat enjoys walking around town with her fiance and their dog, checking the poke stops on their route. If Kat were a Pokemon, she said she’d be a Snorlax or a Slaking!

Caktus GroupWe're Eagerly Preparing for PyCon 2019!

Pictured: The final rush is on! Staff quickly check materials for our PyCon booth.

PyCon 2019 is almost here, and we’re excited to continue to sponsor this premier Python event, which takes place in Cleveland, OH, from May 1 - 9. PyCon attracts attendees from around the world, and for the first time, the conference will include a track of Spanish talks.

Caktus at PyCon

Connecting with the Python community is one of our favorite parts of participating in PyCon. We love to catch up with people we’ve met before and see new faces, too! We’ll be in the Exhibit Hall at booth 645 on May 2 - 4, where we’ll have swag, games, and giveaways.

Some of you may remember our Ultimate Tic Tac Toe game from previous years. Only a few committed players were able to beat the AI opponent last year. This year, any (human) champions will earn a Caktus hoodie and be entered into a drawing to win a Google AIY Vision Kit and a Google AIY Voice Kit.

Must-See Talks & Events

PyCon consistently attracts top-notch speakers who present on a variety of informative topics. Our team is especially looking forward to the following:

Check out the full schedule of talks. Some of these will likely appear in our follow-up PyCon Must-See Talks series, so if you can’t make it to the event, check back in June for our top picks.

Open Spaces: Beyond the scheduled talks, our Technology Support Specialist Scott Morningstar is looking forward to the Open Spaces sessions, which are self-organizing, meetup-like events. Scott plans to run a game of WINTERHORN during one of the open spaces times. The live-action game allows players to reflect on the government and opportunities for activism. “I’m not sure if playing WINTERHORN will make you a better developer, but it may make you a better citizen, or at least better informed about what is happening in the world,” Scott said. Read more about Scott's passion for games like WINTERHORN.

Arts Festival: This year, PyCon includes a mini arts festival called The Art of Python, which will “showcase novel art that helps us share our emotionally charged experiences of programming (particularly in Python).” With his background in STEAM education (STEM + the Arts), account executive Tim Scales is particularly excited about the arts festival, which will provide a creative complement to the technical presentations and lectures.


Job Fair Open to Public

Are you a sharp Django web developer searching for your next opportunity? Good news — we’re hiring! View the spec and apply from our Careers page. We’ll also be at table 34 during the PyCon job fair on May 5, which is open to the public, so come meet the hiring manager and learn more about what it’s like to work at Caktus.

Don’t be a Stranger!

Come see us at our booth, look for members of the Caktus team in our T-shirts during the event, or go ahead and schedule a meeting with us.

Whether you’ll be at PyCon or following along from home, we’ll tweet from @CaktusGroup. Be sure to follow us for the latest updates from the event.

Caktus GroupCaktus Adopts New Web Framework

Caktus Changing from Django to New COBOL-based Framework

Beginning immediately, Caktus will build new projects using our new COBOL-based framework, ADD COBOL TO WEB.

Time-tested

We've come to realize that new-fangled languages like Python (1989) offer nothing that more time-tested languages such as COBOL (1959) cannot. We're tired of trying to keep up with the latest, greatest thing (Django, 2003). Accordingly, we're going back to old, reliable COBOL.

Flexible and Powerful

COBOL provides us with flexibility and power that Python cannot match. A statement like "ALTER X PROCEED TO Y" can remotely modify a GO TO statement in procedure X to target a completely different place in the program!

Straightforward

Here's "Hello, world" in COBOL:

IDENTIFICATION DIVISION.
PROGRAM-ID. hello-world.
PROCEDURE DIVISION.
    DISPLAY "Hello, world!"
    .

What could be simpler? Statements are terminated by periods, which we've been used to since childhood. No pesky parentheses, just write DISPLAY "Hello, world!" and the program displays Hello, world!.

COBOL for the Web

Of course, COBOL pre-dates the web, so it doesn't come with support for building websites. We have therefore developed a new framework, ADD COBOL TO WEB, that links COBOL to this new-fangled (1989) web.

We take full advantage of COBOL's built-in templating facilities: the PICTURE clause and the Report Writer. This is ideal for table output:

01  sales-on-day TYPE DETAIL, LINE + 1.
    03  COL 3                    VALUE "Sales on".
    03  COL 12                   PIC 99/99/9999 SOURCE sales-date.
    03  COL 21                   VALUE "were".
    03  COL 26                   PIC $$$$9.99 SOURCE sales-amount.

And the application to generating HTTP responses and HTML content is, of course, obvious.

Open Source

We plan to share the complete source for ADD COBOL TO WEB as soon as someone ports git to COBOL.

Customer Benefit

Our customers will appreciate our use of long-established tools to build their websites, with future maintainability guaranteed by the many COBOL programmers in the workforce.

The IBM 370

IBM System 370, image by Oliver.obi via Wikimedia Commons.

Future Plans

We recently picked up an IBM 370 from eBay and are arranging shipping, and additional power to our office basement. We've been assured the 370 was state-of-the-art for COBOL in its day, and can compile literally tens of statements per minute. Once our business takes off, we'll be able to afford a 256 KB memory expansion to speed it up even more!

Also, keep an eye on our job postings. We will soon be looking for experienced mainframe operators who can help us deploy our new applications.

Caktus GroupCoding for Time Zones & Daylight Saving Time — Oh, the Horror

In this post, I review some reasons why it's really difficult to program correctly when using times, dates, time zones, and daylight saving time, and then I'll give some advice for working with them in Python and Django. Also, I'll go over why I hate daylight saving time (DST).

TIME ZONES

Let's start with some problems with time zones, because they're bad enough even before we consider DST, but they'll help us ease into it.

Time Zones Shuffle

Time zones are a human invention, and humans tend to change their minds, so time zones also change over time.

Many parts of the world struggle with time changes. For example, let's look at the Pacific/Apia time zone, which is the time zone of the independent country of Samoa. Through December 29, 2011, it was -11 hours from Coordinated Universal Time (UTC). From December 31, 2011, Pacific/Apia became +13 hours from UTC.

What happened on December 30, 2011? Well, it never happened in Samoa, because December 29, 23:59:59-11:00 is followed immediately by December 31, 0:00:00+13:00.

Local time                    UTC
2011-12-29 23:59:59 UTC-11    2011-12-30 10:59:59 UTC
2011-12-31 00:00:00 UTC+13    2011-12-30 11:00:00 UTC
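The arithmetic is easy to check with Python's fixed-offset datetime.timezone objects (a sketch; the offsets are hard-coded here rather than looked up from a tz database):

```python
from datetime import datetime, timedelta, timezone

# Samoa's last moment at UTC-11 and first moment at UTC+13.
before = datetime(2011, 12, 29, 23, 59, 59, tzinfo=timezone(timedelta(hours=-11)))
after = datetime(2011, 12, 31, 0, 0, 0, tzinfo=timezone(timedelta(hours=13)))

# Only one second of real time elapsed, even though the local
# calendar skipped an entire day.
print(after - before)                   # 0:00:01
print(before.astimezone(timezone.utc))  # 2011-12-30 10:59:59+00:00
```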

That's an extreme example, but time zones change more often than you might think, often due to changes in government or country boundaries.

The bottom line here is that even knowing the time and time zone, it's meaningless unless you also know the date.

Always Convert to UTC?

As programmers, we're encouraged to avoid issues with time zones by "converting" times to UTC (Coordinated Universal Time) as early as possible, and convert to the local time zone only when necessary to display times to humans. But there's a problem with that.

If all you care about is the exact moment in the lifetime of the universe when an event happened (or is going to happen), then that advice is fine. But for humans, the time zone that they expressed a time in can be important, too.

For example, suppose I'm in North Carolina, in the eastern time zone, but I’m planning an event in Memphis, which is in the central time zone. I go to my calendar program and carefully enter the date and "3:00 p.m. CST". The calendar follows the usual convention and converts my entry to UTC by adding 6 hours, so the time is stored as 9:00 p.m. UTC, or 21:00 UTC. If the calendar uses Django, there's not even any extra code needed for the conversion, because Django does it automatically.

The next day I look at my calendar to continue working on my event. The event time has been converted to my local time zone, or eastern time, so the calendar shows the event happening at "4:00 p.m." (instead of the 3:00 p.m. that it should be). The conversion is not useful for me, because I want to plan around other events in the location where the event is happening, which is using CST, so my local time zone is irrelevant.

The bottom line is that following the advice to always convert times to UTC results in lost information. We're sometimes better off storing times with their non-UTC time zones. That's why it's kind of annoying that Django always "converts" local times to UTC before saving to the database, or even before returning them from a form. That means the original time zone is lost unless you go to the trouble of saving it separately, and then converting the time back to that zone after you fetch it from the database. I wrote about this before.
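The calendar example can be sketched with fixed offsets (CST = UTC-6 and EST = UTC-5 are hard-coded here; real code would need proper zone handling). The stored UTC value displays as the wrong wall-clock time for planning purposes unless the original zone is kept alongside it:

```python
from datetime import datetime, timedelta, timezone

CST = timezone(timedelta(hours=-6), "CST")
EST = timezone(timedelta(hours=-5), "EST")

entered = datetime(2019, 3, 1, 15, 0, tzinfo=CST)  # "3:00 p.m. CST"
stored = entered.astimezone(timezone.utc)          # 21:00 UTC in the database

shown = stored.astimezone(EST)       # the calendar displays my local time
print(shown.strftime("%I:%M %p"))    # 04:00 PM -- not the 3:00 p.m. I entered

# Keeping the original zone alongside the UTC value lets us
# convert back for display:
print(stored.astimezone(CST).strftime("%I:%M %p"))  # 03:00 PM
```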

By the way, I've been putting "convert" in scare quotes because talking about converting times from one time zone to another carries an implicit assumption that such converting is simple and loses no information, but as we see, that's not really true.

DAYLIGHT SAVING TIME

Daylight saving time (DST) is even more of a human invention than time zones.

Time zones are a fairly obvious adaptation to the conflict between how our bodies prefer to be active during the hours when the sun is up, and how we communicate time with people in other parts of the world. Historical changes in time zones across the years are annoying, but since time zones are a human invention it's not surprising that we'd tweak them every now and then.

DST, on the other hand, amounts to changing entire time zones twice every year. What does the US/Eastern time zone mean? I don't know, unless you tell me the date. From January 1, 2018 to March 10, 2018, it meant UTC-5. From March 11, 2018 to November 3, 2018, it meant UTC-4. And from November 4, 2018 to December 31, 2018, it's UTC-5 again.

But it gets worse. From Wikipedia:

The Uniform Time Act of 1966 ruled that daylight saving time would run from the last Sunday of April until the last Sunday in October in the United States. The act was amended to make the first Sunday in April the beginning of daylight saving time as of 1987. The Energy Policy Act of 2005 extended daylight saving time in the United States beginning in 2007. So local times change at 2:00 a.m. EST to 3:00 a.m. EDT on the second Sunday in March and return at 2:00 a.m. EDT to 1:00 a.m. EST on the first Sunday in November.

So in a little over 50 years, the rules changed 3 times.

Even if you have complete and accurate information about the rules, daylight saving time complicates things in surprising ways. For example, you can't convert 2:30 a.m. March 11, 2018, in the US/Eastern time zone to UTC, because that time never happened — our clocks had to jump directly from 1:59:59 a.m. to 3:00:00 a.m. See below:

Local time                 UTC
2018-03-11 01:59:59 EST    2018-03-11 06:59:59 UTC
2018-03-11 03:00:00 EDT    2018-03-11 07:00:00 UTC

You can't convert 1:30 a.m. November 4, 2018, in the US/Eastern time zone to UTC either, because that time happened twice. You would have to specify whether it was 1:30 a.m. November 4, 2018 EDT or 1:30 a.m. November 4, 2018 EST:

Local time                 UTC
2018-11-04 01:00:00 EDT    2018-11-04 05:00:00 UTC
2018-11-04 01:30:00 EDT    2018-11-04 05:30:00 UTC
2018-11-04 01:59:59 EDT    2018-11-04 05:59:59 UTC
2018-11-04 01:00:00 EST    2018-11-04 06:00:00 UTC
2018-11-04 01:30:00 EST    2018-11-04 06:30:00 UTC
2018-11-04 01:59:59 EST    2018-11-04 06:59:59 UTC

Advice on How to Properly Manage datetimes

Here are some rules I try to follow.

When working in Python, never use naive datetimes. (Those are datetime objects without timezone information, which unfortunately are the default in Python, even in Python 3.)

Use the pytz library when constructing datetimes, and review its documentation frequently. Properly managing datetimes is not always intuitive, and pytz doesn't prevent me from using it incorrectly and getting results that are wrong only for some inputs, which makes bugs really hard to spot. I have to triple-check that I'm following the docs when I write the code, and not rely on testing to find problems.

Let me strengthen that even further. It is not possible to correctly construct datetimes with timezone information using only Python's own libraries when dealing with timezones that use DST. I must use pytz or something equivalent.

If I'm tempted to use datetime.replace, I need to stop, think hard, and find another way to do it. datetime.replace is almost always the wrong approach, because changing one part of a datetime without consideration of the other parts is almost guaranteed to not do what I expect for some datetimes.
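For instance, pytz's localize() method handles the fall-back ambiguity explicitly, while datetime.replace(tzinfo=...) with a pytz zone looks plausible but silently attaches the zone's oldest, pre-standard offset. A sketch, assuming pytz is installed:

```python
from datetime import datetime
import pytz

eastern = pytz.timezone("US/Eastern")
ambiguous = datetime(2018, 11, 4, 1, 30)  # this wall-clock time happened twice

# localize() lets us say which of the two 1:30s we mean:
first = eastern.localize(ambiguous, is_dst=True)    # 1:30 EDT = 05:30 UTC
second = eastern.localize(ambiguous, is_dst=False)  # 1:30 EST = 06:30 UTC

# is_dst=None refuses to guess:
try:
    eastern.localize(ambiguous, is_dst=None)
except pytz.exceptions.AmbiguousTimeError:
    print("ambiguous!")

# datetime.replace() silently attaches pytz's oldest offset for the
# zone (local mean time, roughly -4:56):
wrong = ambiguous.replace(tzinfo=eastern)
print(wrong.utcoffset())  # neither -4:00 (EDT) nor -5:00 (EST)
```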

When using Django, make sure USE_TZ = True. If Django emits warnings about naive datetimes being saved in the database, treat them as if they were fatal errors, track them down, and fix them. If I want to, I can even turn them into actual fatal errors; see this Django documentation.

When processing user input, consider whether a datetime's original timezone needs to be preserved, or if it's okay to just store the datetime as UTC. If the original timezone is important, see this post I wrote about how to get and store it.

Conclusion

Working with human times correctly is complicated, unintuitive, and needs a lot of careful attention to detail to get right. Further, some of the oft-given advice, like always working in UTC, can cause problems of its own.

Caktus GroupWhy We Love Wagtail (and You Will, Too)

New clients regularly ask us if we build WordPress sites. When we dig deeper, we generally learn that they’re looking for a user-friendly content management system (CMS) that will allow them to effortlessly publish and curate their site content. As we’ve written about previously, WordPress can be a good fit for simple sites. However, the majority of our clients need a more robust technical solution with customizable content management tools. For the Python-driven web applications that we develop, we love to work with Wagtail.

What is Wagtail?

Wagtail is a Python-driven CMS built on the Django web framework. It has all the features you’d expect from a quality CMS:

  • intuitive navigation and architecture
  • user-friendly content editing tools
  • painless image uploading and editing capabilities
  • straightforward and rapid installation

What Makes Wagtail Different?

From the user’s perspective, Wagtail’s content editor is what sets it apart, and it’s why we really love it. Most content management systems use a single WYSIWYG (“what you see is what you get”) HTML editor for page content. While Wagtail includes a WYSIWYG editor — the RichTextField — it also has the StreamField, which provides an interface that allows you to create and intermix custom content modules, each designed for a specific type of content.

What does that mean in practice? Rather than wrangling an image around text in the WYSIWYG editor and hoping it displays correctly across devices, you can drop an image into a separate, responsive module, which has a custom data model. In other words:

As a user, you don’t need to customize your content to the capabilities of your CMS. Instead, you customize your CMS to maximize your content.

Let’s Take a Look

Below is a screenshot of the WordPress editing dashboard, with the single HTML content area. You can edit and format text, add an image, and insert an HTML snippet — all the basics.

Screenshot of the WordPress CMS

Now take a look at Wagtail. Each type of content has its own block — that’s StreamField at work. The icons at the bottom display the developer-defined modules available, which in this case are Heading block, Paragraph block, Image block, Block quote, and Embed block. This list of modules can be extended to include a variety of custom content areas based on your specific website.
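On the developer side, that whole set of blocks is declared in a few lines. A sketch of a Wagtail 2.x page model with those five blocks (the BlogPage name and field names are illustrative, not taken from the screenshot):

```python
from wagtail.core import blocks
from wagtail.core.fields import StreamField
from wagtail.core.models import Page
from wagtail.admin.edit_handlers import StreamFieldPanel
from wagtail.embeds.blocks import EmbedBlock
from wagtail.images.blocks import ImageChooserBlock


class BlogPage(Page):
    # Each tuple is (block name, block type); editors can add these
    # blocks in any order and any quantity.
    body = StreamField([
        ("heading", blocks.CharBlock()),
        ("paragraph", blocks.RichTextBlock()),
        ("image", ImageChooserBlock()),
        ("quote", blocks.BlockQuoteBlock()),
        ("embed", EmbedBlock()),
    ])

    content_panels = Page.content_panels + [
        StreamFieldPanel("body"),
    ]
```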

Screenshot of the Wagtail CMS

Using the blocks and modules, a web content editor can quickly add a paragraph of text, followed by an image, and then a blockquote to create a beautiful, complete web page. To demonstrate this better, we put together a short video of StreamFields in action.

You can also learn more about StreamFields and other features on the Wagtail website.

Powered by Django

At its core, Wagtail is a Django app, meaning that it seamlessly integrates with other Django and Python applications. This allows near-endless flexibility to extend your project with added functionality. For example, if your application includes complex Python-based data analysis on the backend but you want to easily display output to site visitors, Wagtail is the ideal choice for content management.

The Bottom Line

Wagtail provides content management features that go above and beyond the current abilities of a WordPress site, plus the inherent customization and flexibility of a Django app. We love working with Wagtail because of the clear advantages it provides to our clients and content managers. We highly recommend the Wagtail CMS to all our clients.

Contact us to see if Wagtail would be a good fit for your upcoming project.

Caktus GroupDjango: Recommended Reading

Pictured: Our library of reference books at Caktus cover topics including Django and Python, as well as project management and Agile methodologies.

At Caktus, we believe in continued learning (and teaching). It's important to read up on the latest industry trends and technologies to stay current in order to address our clients' challenges. We even maintain a library in our office for staff use, and we add references frequently. Our team enjoys sharing what they've learned by contributing to online resources, such as the Django Documentation and the Mozilla Developer Network Web Docs. Below is a list (in alphabetical order) of the books, blogs, and other documents that we’ve found to be the most accurate, helpful, and practical for Django development.

Django Documentation

Authors: Various

Recommended by Developer Dmitriy Chukhin

Overview: When Dmitriy first began learning about Django, he went through the official Django tutorial. Then, as a developer, he read through other pieces of documentation that are relevant to his work.

A Valuable Lesson: Dmitriy learned that detailed documentation makes working with a framework significantly easier than trying to figure it out on his own or from other developers’ posts about their errors. The documentation is readable, uses understandable language, and gives useful examples, making Django Documentation a lot friendlier than Dmitriy expected. It encouraged him to continue using it, since other core developers consider it important to make their software usable and well-documented. One thing that’s particularly helpful about the Django documentation is that pages now have a ‘version switcher’ in the bottom right corner of the screen, allowing readers to switch between the versions of Django for a specific feature. Since our projects at Caktus involve using a number of different versions of Django, it’s helpful to switch between the documentation to see when a feature was added, changed, or deprecated. Seeing the documentation on the Django Documentation site also encouraged Dmitriy to thoroughly document the code he writes for people who will work with it in the future.

Why You Should Read This: The Django tutorial is a great place to begin learning about using Django. The reference guide is best for those who are already using Django and need to look up details on how to use forms, views, URLs, and other parts of the Django API. The topic guides provide high-level explanations.

Django User’s Group

Authors: Various

Recommended by Lead Developer and Technical Director Karen Tracey

Overview: The Django User’s Group is a public Google Group that Karen found when she first started using Django in 2006 and ran into some trouble with database tables. She posted her challenges and questions on the Google Group and received a response the same day. She’s been using Django ever since — coincidence?

A Valuable Lesson: Django Users was Karen’s first introduction to the Django community and she learned a great deal from it. It was also her entry into becoming a regular contributing member of the community.

Why You Should Read This: If you have a Django puzzle that you can’t solve, searching the group and (if that fails to yield results) writing up and posting a question is a great way to get a solution. Karen also notes that sometimes it’s not even necessary to post since the act of writing the question in a way others can understand sometimes makes the answer clear! Reading various posts in the group is also a way to see the issues that trip up newcomers, and trying to solve questions by others also provides helpful learning opportunities.

Cover of High Performance Django book

High Performance Django

Authors: Peter Baumgartner & Yann Malet

Recommended by Developer Neil Ashton

Overview: High Performance Django promises to “give you a repeatable blueprint for building and deploying fast, scalable Django sites.” Neil first learned about this book from friend and former coworker Jeff Bradberry, who pointed it out as a way to start pushing his Django development skills beyond a firm grasp of the basics.

A Valuable Lesson: Neil learned that making Django perform at scale means keeping the weight off Django itself. The book taught him about making effective use of the high-performance technologies that make up the rest of the stack to respond to browser requests as early and quickly as possible. It taught him that there’s more to building web apps with Django than just Django, and it opened the door to thinking and learning about many other features of the web app development landscape.

Why You Should Read This: This book is ideal for anyone who’s beginning a career in web app development. It’s especially helpful for those with a different background, whether it’s front-end development or something further afield like computational linguistics. It’s easy to lose sight of the forest for the trees as a new web developer, and this book manages to provide you with a feel for the big picture in a surprisingly small number of pages.

Mozilla Developer Network Web Docs

Authors: Various

Recommended by Developer Vinod Kurup

Overview: The Mozilla Developer Network (MDN) Web Docs are a popular resource when it comes to nearly any general web development topic. It’s authored by multiple contributors, and you can be an author, too. Vinod usually visits the site when he’s struggling with a piece of code, and the MDN pops up at the top of his web search results. Caktus especially loves the MDN because we were fortunate to work with Mozilla on the project that powers the MDN.

A Valuable Lesson: Vinod and his team used Vue.js on a recent project, and he learned a lot more about modern JavaScript than he needed to know in the past. One topic that confused him was JavaScript Promises. Fortunately, the MDN has documentation on using promises and more detailed reference material on Promise.then(). Those two pieces of documentation cleared up a lot of confusion for Vinod. He also likes how each page of reference documentation includes a browser compatibility section, which helps him to identify whether his code will work in browsers that our clients use.

Why You Should Read This: The MDN provides excellent documentation on every basic front-end technology including HTML, CSS, and JavaScript, among others. Since Mozilla is at the forefront of helping to create the specifications for these tools, you can trust that the documentation is authoritative. It’s also constantly being worked on and updated, so you know you’re not getting documentation on a technology that has been deprecated. Finally, and most importantly, the documentation is GOOD! They cover the basic syntax, and always include common usage examples, so that it’s clear how to use the tool. In addition, there are many other gems including tutorials (both basic and advanced) on a wide variety of web development topics.

Towards 14,000 Write Transactions Per Second on my Laptop

Author: Peter Geoghegan

Recommended by CEO Tobias McNulty

Overview: Towards 14,000 write transactions per second on my laptop is a relatively short blog post that provides an overview of two little-discussed Postgres settings: commit_delay and commit_siblings.

A Valuable Lesson: This post provides not only an overview of commit_delay and commit_siblings, but also an important change to the former, made in Postgres 9.3, that dramatically improved its effectiveness. For database servers that need to handle a lot of writes, the commit_delay setting (which is disabled by default, as of Postgres 11) gives you an efficient way to "group" writes to disk that helps increase overall throughput by sacrificing a small amount of latency. The setting has been instrumental to us at Caktus in optimizing Postgres clusters for a couple of client projects, yet Tobias rarely, if ever, sees it mentioned in more general talks and how-tos on optimizing Postgres.
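For reference, enabling the pair is a two-line change in postgresql.conf. The values below are illustrative starting points, not recommendations; commit_delay is measured in microseconds, and it only applies when at least commit_siblings other transactions are active:

```ini
# postgresql.conf -- group commits from concurrent sessions
commit_delay = 1000      # wait up to 1000 us for other commits to accumulate
commit_siblings = 5      # ...but only if 5+ other transactions are active
```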

Why You Should Read This: These settings will change nothing for read-heavy sites/apps (such as a CMS), but if you use Postgres in a write-heavy Django (or other) application, you should learn about and potentially configure these settings to improve the product.

Book "Two Scoops of Django"

Two Scoops of Django

Authors: Daniel Roy Greenfeld and Audrey Roy Greenfeld

Recommended by Developer Dan Poirier

Overview: Two Scoops of Django has several editions, and the latest is 1.11 (Dan read edition 1.8). The editions stand the test of time, and the authors go through nearly all facets of Django development. They share what has worked best for them and what to watch out for, so you don't have to learn it all the hard way. The authors’ tagline is, “Making Python and Django as fun as ice cream,” and who doesn’t love ice cream?

A Valuable Lesson: By reading the Django Documentation (referenced earlier in this post), you can learn what each setting does. Then in chapter 5 of Two Scoops, read about a battle-tested scheme for managing different settings and files across multiple environments, from local development to testing servers and production, while protecting your secrets (passwords, keys, etc). Similarly, chapter 19 covers what cases you should and shouldn't use the Django admin for, warns about using list_editable in multi-user environments and gives tips for securing the admin and customizing it.

Why You Should Read It: The great thing about the book is that the chapters stand alone. You can pick it up and read whatever chapter you need. Dan keeps the book handy at his desk, for nearly all his Django projects. The book is not only full of useful information, but almost every page also includes examples or diagrams.

Well Read

We recommend these readings on Django development because they provide valuable insight and learning opportunities. What do you refer to when you need a little help with Django? If you have any recommendations or feedback, please leave them in the comments below.

Joe Gregoriogcsfuse and systemd –user

Google Cloud Storage has an officially supported fuse client!

This is something I have always wanted and would have expected for Google Drive, but 🤷.

The only thing better than a fuse client is a fuse directory that gets mounted automatically when you log in, which you can do fairly simply using systemd --user, which is just systemd, but everything runs as you.

Here’s a gist of how I set this up on my machine:
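A minimal user unit for this looks something like the following (the bucket name, mount point, and binary paths are placeholders). Save it as ~/.config/systemd/user/gcsfuse-mybucket.service and enable it with systemctl --user enable --now gcsfuse-mybucket:

```ini
# ~/.config/systemd/user/gcsfuse-mybucket.service
[Unit]
Description=Mount the mybucket GCS bucket with gcsfuse

[Service]
Type=simple
# --foreground keeps gcsfuse attached so systemd can supervise it;
# %h expands to the user's home directory.
ExecStart=/usr/bin/gcsfuse --foreground mybucket %h/mnt/mybucket
ExecStop=/bin/fusermount -u %h/mnt/mybucket
Restart=on-failure

[Install]
WantedBy=default.target
```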

Caktus GroupImpressed by Devopsdays Charlotte 2019

We have a small two-person Infrastructure Ops team here at Caktus (including myself) so I was excited to go to my first devopsdays Charlotte and be surrounded by Ops people. The event was held just outside of Charlotte, at the Red Ventures auditorium in Indian Land, South Carolina. About 200 people gathered there for two days of talks and open sessions. Devopsdays are held multiple times a year, in various locations around the world. Check out their schedule to see if there will be an event near you.

On Thursday afternoon, Quintessence Anx gave an awesome technical Ignite talk on Sensory Friendly Monitoring. She packed a whole lot of monitoring wisdom into 5 minutes and 20 slides, so I was then looking forward to what she had to say about diversity. She spoke on Unquantified Serendipity: Diversity in Development, and it ended up being my favorite talk.

Devopsdays speaker Quintessence Anx on stage, giving her presentation

Quintessence (pictured right) provided a lot of actionable information and answered many common concerns that people have with diversity. She told the story of how as a junior developer her mentors often told her how to solve problems, while they told her male peers how to find answers. She suggested that mentors give a “hand up, not a hand out” and stressed the importance of being introduced to a mentors network so that the mentee can start building their own networks. I thought that the talk had the right balance between urgency and applicability.

View her slides and a mind map of an earlier version of the talk.

The Friday Keynote was given by Sonja Gupta and Corey Quinn, and was titled Embarrassingly Large Numbers: Salary Negotiation for Humans. It focused on how to upgrade your income by getting a new job. This talk was informative and entertaining, including more f-bombs than all the other presentations combined. Some of the points they made were:

  • interview for jobs you don’t plan to take
  • interview at least once a quarter
  • never take the first offer

They also recognized that negotiation is hard, but you are not rude if you ask for what you’re worth. I was looking forward to this keynote since I recently began following Corey’s newsletter, Last Week in AWS, where he has elevated snark to an art form.

I enjoyed the event, and I am looking forward to attending devopsdays Raleigh in the fall. The next devopsdays Charlotte will take place in 2020.

Joe GregorioMore TMI recovery stories

This is another story from my recovery that might be TMI for some people. Read with caution, or skip.

So while describing my initial hospitalization I mentioned an open question:

What exactly my anus is supposed to be doing for those three months is a question I forgot to ask.

And now I know from the questions and comments I’ve received that many of you are curious too.

The answer turns out to be “nothing much” for most of the time, with occasional bouts of phantom limb sensation.

Now the “phantom limb” in this case is my large intestines, which are no longer hooked up to my rectum. The sensation they are missing is the pressure that comes when stool builds up in the colon, which in turn triggers the sensations of “needing to go”.

So one or two times a day I get the sensation of “needing to go”. I logically know that’s an impossibility, but apparently my rectum is not swayed by logic, so I had to look for alternative methods of getting the sensation to go away. It turns out there is one way to get it to go away, and that’s to go and sit on the toilet as if I were having a bowel movement. And I realize, as I sit here, on the toilet, not having a bowel movement, that what I’m really doing is playing “make pretend” for my rectum. And it works!

I wonder if my rectum will be as grateful as my kids are for all the time I spent playing “make pretend” with them?

Caktus GroupSuggestions For Picking Up Old Projects

At Caktus, we work on many projects, some of which are built by us from start to finish, while others are inherited from other sources. Oftentimes, we pick up a project that we either have not worked on in a long time, or haven’t worked on at all, so we have to get familiar with the code and figure out the design decisions that were made by those who developed it (including when the developers are younger versions of ourselves). Moreover, it is a good idea to improve the setup process in each project, so others can have an easier time getting set up in the future. In our efforts to work on such projects, a few things have been helpful both for becoming familiar with the projects more quickly, and for making the same projects easier to pick up in the future.

Getting Set Up

Here are my top tips for getting up to speed on projects:

Documentation

As a perhaps obvious first step, it can be helpful to read through a README or other documentation, assuming that it exists. At Caktus, we write steps for how to get set up locally with each project, and those steps can be helpful when getting familiar with the major aspects of a project. If there is no README or obvious documentation, we look for comments within files. The first few lines may document the functionality of the rest of the file. One thing that can also be beneficial for future developers is either adding to or creating documentation for getting set up on a project as you get set up yourself. Though you likely don’t have a lot of project knowledge at this point, it can be helpful to write down some documentation, even notes like ‘installing node8.0 here works as of 2019-01-01’.

Project Structure

If you're working on a Django project, for instance, you can look for the files that come with the project — a urls.py file, models.py files, views.py files, and so on — to illuminate the functionality that the project provides and the pages that a user can visit. For non-Django projects, it can still be useful to look at the directories within the project and try to make sense of the different parts of the application. Even large Django projects with many models.py files can provide helpful information on what is happening in the project by looking at the directories. As an example, we once began working on a project with a few dozen models.py files, each with a number of models in it. Since reading through each models.py file wasn’t a feasible option, it was helpful to see which directory each models.py file was in, so that we could see the general structure of the project.
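A quick way to build that directory map is to list the model files from the project root; a sketch using only the standard library:

```python
from pathlib import Path

# Print every models.py under the current directory together with the
# app directory it lives in -- a fast overview of a large project.
for path in sorted(Path(".").rglob("models.py")):
    print(path.parent, "->", path.name)
```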

Tests

In terms of getting familiar with the project, tests (if they exist) are a great place to look, since they provide data for how the different parts of the project are supposed to work. For example, tests for Django views give examples of what data might be sent to the server, and what should happen when such data is sent there. Any test data files (for example, a test XML file for a project that handles XML) can also provide information about what the code should be handling. If we know that a project needs to accept a new XML structure, seeing the old XML structure can save a lot of time when figuring out how the code works.

Improving the Code

Getting familiar with the code should also mean making the code friendlier for future developers. Beware that future developers may, in fact, be us in a few years, and it’s much friendlier and more efficient to start working on a project that is well-documented and well-tested, than a project that is neither. While all the code doesn’t have to be improved all at once, it is possible to start somewhere, even if it means adding comments and tests for a short function. With time, the codebase can be improved and be easier to work with.

Refactoring

Oftentimes when beginning to look at a new (or unfamiliar) project, we get the urge to start by refactoring code to be more efficient, more readable, or just more modern. However, it has been more helpful to resist this urge until we understand the project better, since working code is better than broken code. There are often good reasons why things were written a certain way, and changing them may have consequences that we are not yet aware of, especially if the project lacks sufficient tests. It may be more helpful to add comments to the code as we figure out how things work, focus on tests, and leave refactoring for a future time.

Testing

Testing is a great place to start improving the codebase. If tests already exist, then working on a feature or bugfix should include improving the tests. If tests don’t exist, then working on a feature or bugfix should include adding relevant tests. Having tests in place will also make any refactoring work easier to do, since they can be used to check what, if anything, broke during the refactoring.

Documentation

As mentioned above, documentation makes starting to work on a project much easier. Moreover, working through getting the project set up is a great time to either add or improve a README file or other setup documentation. As you continue working on the code, you can continue to make documentation improvements along the way.

Conclusion

Having made these recommendations, I should also acknowledge that we are often faced with various constraints (time, resources, or scope) when working on client projects. While these constraints are real, best practices should still be followed, and changes can be made while working on the code, such as adding tests for new functionality, or improving comments, documentation, and tests for features that already exist and are being changed. Doing so will help any future developers to understand the project and get up to speed on it more efficiently, and this ultimately saves clients time and money.

Caktus GroupCommunity & Caktus: Charitable Giving, Winter 2018

Pictured: Developer Dan Poirier is an advocate for WCPE and a volunteer announcer. WCPE is one of the recipients of our charitable giving program.

We are pleased to continue serving the North Carolina community at-large through our semi-annual Charitable Giving Program. Twice a year we solicit proposals from our team to contribute to a variety of non-profit organizations. With this program, we look to support groups in which Cakti are involved or that have impacted their lives in some way. This gives Caktus a chance to support our own employees as well as the wider community. For winter 2018, we were pleased to donate to the following organizations:

ARTS North Carolina

ARTS North Carolina “calls for equity and access to the arts for all North Carolinians, unifies and connects North Carolina’s arts communities, and fosters arts leadership.” Our Account Executive Tim Scales has been a board member and supporter of this organization for several years.

Learn more about ARTS NC.

Museum of Life and Science

The Museum of Life and Science’s mission is to “create a place of lifelong learning where people, from young child to senior citizen, embrace science as a way of knowing about themselves, their community, and their world.” Our Chief Business Development Officer Ian Huckabee is a current museum board member and sits on the executive and finance committees.

Learn more about the Museum of Life and Science.

Sisters’ Voices

Pictured: Developer Vinod Kurup’s daughter and niece, who are members of the Sisters' Voices choir.

Sisters’ Voices is a “choral community of girls, within which each is known and supported while being challenged to grow as a musician and as a person.” Caktus Developer Vinod Kurup’s niece Vishali and daughter Anika (pictured from left to right) are members of the Sisters' Voices choir. Vinod believes that being a member of the choir has “enriched their lives and taught them the love of music, and they developed an appreciation of their own voice.”

Learn more about Sisters’ Voices.

WCPE Radio

WCPE is a non-commercial, independent, listener-supported radio station, dedicated to excellence in classical music. Broadcasting includes service to the Piedmont area, Raleigh, Durham, and Chapel Hill on 89.7 FM. Their facility is staffed 24 hours a day, 7 days a week.

“WCPE gained the distinction of being the only public radio station in the eastern half of North Carolina to stay on the air during Hurricane Fran in 1996, acting as an Emergency Broadcast Relay station, providing weather information directly from the National Weather Service.” Caktus Developer Dan Poirier is an advocate for WCPE and has been listening and donating for years. Last year, he trained as a volunteer announcer and now commutes 90 miles round-trip, 2-3 times a month to work a shift on the air.

Learn more about WCPE.

The Giving Continues!

Caktus’ next round of giving will be June 2019, and we look forward to supporting another group of organizations that are committed to enriching the lives of North Carolinians!

Joe GregorioThe real value of a 70% top marginal tax rate

The last time we had a 70% top marginal tax rate in the U.S., it generated very little revenue. That doesn’t mean it failed; it means it was doing its job, as explained in The opportunity cost of firm payouts.

Caktus GroupA Guide To Creating An API Endpoint With Django Rest Framework

As part of our work to make sharp web apps at Caktus, we frequently create API endpoints that allow other software to interact with a server. Often that other software is a frontend app (React, Vue, or Angular), though it could be any other program that talks to the server. A lot of our API endpoints, across projects, end up functioning in similar ways, so we have become efficient at writing them, and this blog post gives an example of how to do so.

First, a few resources: read more about API endpoints in this previous blog post and review documentation on Django Rest Framework.

A typical request for an API endpoint may be something like: 'the front end app needs to be able to read, create, and update companies through the API'. Here is a summary of creating a model, a serializer, and a view for such a scenario, including tests for each part:

Part 1: Model

For this example, we’ll assume that a Company model doesn’t currently exist in Django, so we will create one with some basic fields:

# models.py
from django.db import models


class Company(models.Model):
    name = models.CharField(max_length=255)
    description = models.TextField(blank=True)
    website = models.URLField(blank=True)
    street_line_1 = models.CharField(max_length=255)
    street_line_2 = models.CharField(max_length=255, blank=True)
    city = models.CharField(max_length=80)
    state = models.CharField(max_length=80)
    zipcode = models.CharField(max_length=10)

    def __str__(self):
        return self.name

Writing tests is important for making sure our app works well, so we add one for the __str__() method. Note: we use the factory-boy and Faker libraries for creating test data:

# tests/factories.py
from factory import Faker
from factory.django import DjangoModelFactory

from ..models import Company


class CompanyFactory(DjangoModelFactory):
    name = Faker('company')
    description = Faker('text')
    website = Faker('url')
    street_line_1 = Faker('street_address')
    city = Faker('city')
    state = Faker('state_abbr')
    zipcode = Faker('zipcode')

    class Meta:
        model = Company

# tests/test_models.py
from django.test import TestCase

from ..models import Company
from .factories import CompanyFactory


class CompanyTestCase(TestCase):
    def test_str(self):
        """Test for string representation."""
        company = CompanyFactory()
        self.assertEqual(str(company), company.name)

With a model created, we can move on to creating a serializer for handling the data going in and out of our app for the Company model.

Part 2: Serializer

Django Rest Framework uses serializers to handle converting data between JSON or XML and native Python objects. There are a number of helpful serializers we can import that will make serializing our objects easier. The most common one we use is a ModelSerializer, which conveniently can be used to serialize data for Company objects:

# serializers.py
from rest_framework.serializers import ModelSerializer

from .models import Company


class CompanySerializer(ModelSerializer):
    class Meta:
        model = Company
        fields = (
            'id', 'name', 'description', 'website', 'street_line_1', 'street_line_2',
            'city', 'state', 'zipcode'
        )

That is all that’s required for defining a serializer, though a lot more customization can be added, such as:

  • outputting fields that don’t exist on the model (maybe something like is_new_company, or other data that can be calculated on the backend)
  • custom validation logic for when data is sent to the endpoint for any of the fields
  • custom logic for creates (POST requests) or updates (PUT or PATCH requests)

It’s also beneficial to add a simple test for our serializer, making sure that the values for each of the fields in the serializer match the values for each of the fields on the model:

# tests/test_serializers.py
from django.test import TestCase

from ..serializers import CompanySerializer
from .factories import CompanyFactory


class CompanySerializerTestCase(TestCase):
    def test_model_fields(self):
        """Serializer data matches the Company object for each field."""
        company = CompanyFactory()
        serializer = CompanySerializer(instance=company)
        for field_name in [
            'id', 'name', 'description', 'website', 'street_line_1', 'street_line_2',
            'city', 'state', 'zipcode'
        ]:
            self.assertEqual(
                serializer.data[field_name],
                getattr(company, field_name)
            )

Part 3: View

The view is the layer in which we hook up a URL to a queryset, and a serializer for each object in the queryset. Django Rest Framework again provides helpful objects that we can use to define our view. Since we want to create an API endpoint for reading, creating, and updating Company objects, we can use Django Rest Framework mixins for such actions. Django Rest Framework does provide a ModelViewSet which by default allows handling of POST, PUT, PATCH, and DELETE requests, but since we don’t need to handle DELETE requests, we can use the relevant mixins for each of the actions we need:

# views.py
from rest_framework.mixins import (
    CreateModelMixin, ListModelMixin, RetrieveModelMixin, UpdateModelMixin
)
from rest_framework.viewsets import GenericViewSet

from .models import Company
from .serializers import CompanySerializer


class CompanyViewSet(GenericViewSet,  # generic view functionality
                     CreateModelMixin,  # handles POSTs
                     RetrieveModelMixin,  # handles GETs for 1 Company
                     UpdateModelMixin,  # handles PUTs and PATCHes
                     ListModelMixin):  # handles GETs for many Companies

    serializer_class = CompanySerializer
    queryset = Company.objects.all()

And to hook up our viewset to a URL:

# urls.py
from django.conf.urls import include, re_path
from rest_framework.routers import DefaultRouter
from .views import CompanyViewSet


router = DefaultRouter()
router.register('company', CompanyViewSet, basename='company')

urlpatterns = [
    re_path('^', include(router.urls)),
]

Now we have an API endpoint that allows making GET, POST, PUT, and PATCH requests to read, create, and update Company objects. In order to make sure it works just as we expect, we add some tests:

# tests/test_views.py
from django.test import TestCase
from django.urls import reverse
from rest_framework import status

from ..models import Company
from .factories import CompanyFactory, UserFactory


class CompanyViewSetTestCase(TestCase):
      def setUp(self):
          self.user = UserFactory(email='testuser@example.com')
          self.user.set_password('testpassword')
          self.user.save()
          self.client.login(email=self.user.email, password='testpassword')
          self.list_url = reverse('company-list')

      def get_detail_url(self, company_id):
          return reverse('company-detail', kwargs={'pk': company_id})

      def test_get_list(self):
          """GET the list page of Companies."""
          companies = [CompanyFactory() for i in range(0, 3)]

          response = self.client.get(self.list_url)

          self.assertEqual(response.status_code, status.HTTP_200_OK)
          self.assertEqual(
              set(company['id'] for company in response.data['results']),
              set(company.id for company in companies)
          )

      def test_get_detail(self):
          """GET a detail page for a Company."""
          company = CompanyFactory()
          response = self.client.get(self.get_detail_url(company.id))
          self.assertEqual(response.status_code, status.HTTP_200_OK)
          self.assertEqual(response.data['name'], company.name)

      def test_post(self):
          """POST to create a Company."""
          data = {
              'name': 'New name',
              'description': 'New description',
              'street_line_1': 'New street_line_1',
              'city': 'New City',
              'state': 'NY',
              'zipcode': '12345',
          }
          self.assertEqual(Company.objects.count(), 0)
          response = self.client.post(self.list_url, data=data)
          self.assertEqual(response.status_code, status.HTTP_201_CREATED)
          self.assertEqual(Company.objects.count(), 1)
          company = Company.objects.all().first()
          for field_name in data.keys():
                self.assertEqual(getattr(company, field_name), data[field_name])

      def test_put(self):
          """PUT to update a Company."""
          company = CompanyFactory()
          data = {
              'name': 'New name',
              'description': 'New description',
              'street_line_1': 'New street_line_1',
              'city': 'New City',
              'state': 'NY',
              'zipcode': '12345',
          }
          response = self.client.put(
              self.get_detail_url(company.id),
              data=data
          )
          self.assertEqual(response.status_code, status.HTTP_200_OK)

          # The object has really been updated
          company.refresh_from_db()
          for field_name in data.keys():
              self.assertEqual(getattr(company, field_name), data[field_name])

      def test_patch(self):
          """PATCH to update a Company."""
          company = CompanyFactory()
          data = {'name': 'New name'}
          response = self.client.patch(
              self.get_detail_url(company.id),
              data=data
          )
          self.assertEqual(response.status_code, status.HTTP_200_OK)

          # The object has really been updated
          company.refresh_from_db()
          self.assertEqual(company.name, data['name'])

      def test_delete(self):
          """DELETEing is not implemented."""
          company = CompanyFactory()
          response = self.client.delete(self.get_detail_url(company.id))
          self.assertEqual(response.status_code, status.HTTP_405_METHOD_NOT_ALLOWED)

As the app becomes more complicated, we add more functionality (and more tests) to handle things like permissions and required fields. For a quick way to limit permissions to authenticated users, we add the following to our settings file:

# settings file
REST_FRAMEWORK = {
    'DEFAULT_PERMISSION_CLASSES': ('rest_framework.permissions.IsAuthenticated',)
}

And add a test that only permissioned users can access the endpoint:

# tests/test_views.py
from django.test import TestCase
from django.urls import reverse
from rest_framework import status

from ..models import Company
from .factories import CompanyFactory, UserFactory


class CompanyViewSetTestCase(TestCase):

      ...

      def test_unauthenticated(self):
          """Unauthenticated users may not use the API."""
          self.client.logout()
          company = CompanyFactory()

          with self.subTest('GET list page'):
              response = self.client.get(self.list_url)
              self.assertEqual(response.status_code, status.HTTP_403_FORBIDDEN)

          with self.subTest('GET detail page'):
              response = self.client.get(self.get_detail_url(company.id))
              self.assertEqual(response.status_code, status.HTTP_403_FORBIDDEN)

          with self.subTest('PUT'):
              data = {
                  'name': 'New name',
                  'description': 'New description',
                  'street_line_1': 'New street_line_1',
                  'city': 'New City',
                  'state': 'NY',
                  'zipcode': '12345',
              }
              response = self.client.put(self.get_detail_url(company.id), data=data)
              self.assertEqual(response.status_code, status.HTTP_403_FORBIDDEN)
              # The company was not updated
              company.refresh_from_db()
              self.assertNotEqual(company.name, data['name'])

          with self.subTest('PATCH'):
              data = {'name': 'New name'}
              response = self.client.patch(self.get_detail_url(company.id), data=data)
              self.assertEqual(response.status_code, status.HTTP_403_FORBIDDEN)
              # The company was not updated
              company.refresh_from_db()
              self.assertNotEqual(company.name, data['name'])

          with self.subTest('POST'):
              data = {
                  'name': 'New name',
                  'description': 'New description',
                  'street_line_1': 'New street_line_1',
                  'city': 'New City',
                  'state': 'NY',
                  'zipcode': '12345',
              }
              response = self.client.post(self.list_url, data=data)
              self.assertEqual(response.status_code, status.HTTP_403_FORBIDDEN)

          with self.subTest('DELETE'):
              response = self.client.delete(self.get_detail_url(company.id))
              self.assertEqual(response.status_code, status.HTTP_403_FORBIDDEN)
              # The company was not deleted
              self.assertTrue(Company.objects.filter(id=company.id).exists())

As our project grows, we can edit these permissions, make them more specific, and continue to add more complexity, but for now, these are reasonable defaults to start with.

Conclusion

Adding an API endpoint to a project can take a considerable amount of time, but with the Django Rest Framework tools, it can be done more quickly, and be well-tested. Django Rest Framework provides helpful tools that we’ve used at Caktus to create many endpoints, so our process has become a lot more efficient, while still maintaining good coding practices. Therefore, we’ve been able to focus our efforts in other places in order to expand our abilities to grow sharp web apps.

Caktus Group7 Conferences We’re Looking Forward To

Above: The Internet Summit in Raleigh is one of the local conferences we recommend attending. (Photo by Ian Huckabee.)

At Caktus, we strongly believe in professional development and continued learning. We encourage our talented team to stay up to date with industry trends and technologies. During 2018, Cakti attended a number of conferences around the country. Below is a list (in alphabetical order) of the ones we found the most helpful, practical, and interesting. We look forward to attending these conferences again, and if you get the chance, we highly recommend that you check them out as well.

All Things Open

Recommended by Account Executive Tim Scales

Next Conference Location: Raleigh

Next Conference Date: October 13 - 15, 2019

All Things Open is a celebration and exploration of open source technology and its impact. Topics range from nuts and bolts sessions on database design and OSS languages to higher-level explorations of current trends like machine learning with Python and practical blockchain applications.

The annual conference is heavily publicized in the open source community, of which Caktus is an active member. All Things Open attracts open source thought leaders from across industries, and it’s a valuable learning experience for both non-technical newbies and expert developers to gain insight into a constantly evolving field.

Tim particularly enjoyed the session by expert Jordan Kasper of the Defense Digital Service about his efforts to implement open source development at the US Department of Defense with the code.mil project. It was an enlightening look at the use of open source in the federal government, including the challenges and opportunities involved.

Above: Hundreds of attendees at DjangoCon 2018. (Photo by Bartek Pawlik.)

DjangoCon

Recommended by Developer David Ray

Next Conference Location: San Diego, California

Next Conference Date: September 22 - 27, 2019

DjangoCon is the preeminent conference for Django users, and as early proponents of Django, Caktus has sponsored the conference since 2009. We always enjoy the annual celebration of all things Django, featuring presentations and workshops from veteran Djangonauts and enthusiastic beginners.

David enjoyed Russell Keith-Magee’s informative and hilarious talk on navigating the complexities of time zones in computer programming. It’s rare for a conference talk to explore both the infinite complexity of a topic while also providing concrete tools to resolve the complexity as Russell did. Attending DjangoCon is a must for Django newbies and veterans alike. It’s an opportunity to sharpen your craft and explore the possibilities of our favorite framework in a fun, collaborative, and supportive environment.

Read more in our conference recap post and Django Depends on You: A Takeaway from DjangoCon.

Digital PM Summit

Recommended by Lead Project Manager Gannon Hubbard

Next Conference Location: Orlando

Next Conference Date: October 20 - 22, 2019

The Digital PM Summit is an annual gathering of project managers, which includes presentations, workshops, and breakout sessions focused on managing digital projects. The event is organized by the Bureau of Digital and was held in Memphis in 2018. Cakti attended and spoke at the conference in previous years; check out the highlights of Marketing Content Manager Elizabeth Michalka’s talk on investing in relationships from the 2016 conference.

The summit provides a unique opportunity for attendees to network and learn from each other. Indeed, the biggest draw for Gannon was the chance to be in the same room as so many other project managers. The project management (PM) role encompasses an array of activities — planning and defining scope, activity planning and sequencing, resource planning, time and cost estimating, and reporting progress, to name a few. There are professional books and months-long certificates to teach this knowledge, but nothing is better than being able to ask, “Hey, have you run into this before?” The ability to compare notes with PMs he doesn’t work with every day is invaluable, and the three most impactful sessions for Gannon were:

  • Rachel Gertz’s talk “Static to Signal: A Lesson in Alignment”
  • Meghan McInerney’s talk on “The Ride or Die PM”
  • Lynn Winter’s talk “PM Burnout: The Struggle Is Real”

Each of these talks seemed to build on one another. Rachel Gertz set the stage with her keynote by pointing out that the project manager is the nexus of a project. The course of a project is determined by countless small adjustments, and the PM is the one who makes those adjustments. Nine times out of 10, when a project fails, it’s because of something the PM did or didn’t do.

Also a keynote, Meghan McInerney’s talk (pictured) identified the primary attributes of a PM who’s at the top of their game. They’re reliable, adaptable, and a strategic partner for their clients. When you hit this ideal, you’re the one asking stakeholders and team members hard questions about “Should we?” — and as the PM, you’re the only one who can be counted on to ask those questions. Lynn Winter’s lightning talk cautions against giving too much of yourself over to this, though. As she pointed out, the role is often made up of all the tasks that no one else wants to do, and there’s a good chance at least a few of those tasks will take a toll on you. You have to make space for yourself if you’re going to be effective.

Internet Summit

Recommended by Chief Business Development Officer Ian Huckabee

Next Conference Location: Raleigh

Next Conference Date: November 13 - 14, 2019

The Internet Summit is a marketing conference that attracts impressive speakers and provides opportunities to stay on top of digital marketing trends and technologies that can help drive growth and success. Internet Summit conferences are organized by Digital Summit and are also held in various cities in addition to Raleigh.

Digital marketing is constantly changing, so it’s important to stay current. At Internet Summit, Ian heard valuable information from dozens of the country’s top digital marketers who shared current trends and best practices for using real-time data to build intelligence and improve customer interaction and engagement. Keynote speakers included marketing guru, author, and former dot-com business exec Seth Godin and founder of The Onion Scott Dikkers.

These summits are key to staying on trend with how people want to be reached, and the execution strategies are, in most cases, proven. For instance, behavioral targeting tools have evolved to the point where ABM (account-based marketing) can be extremely effective when executed properly. Also meaningful for Ian was the talk Marketing Analytics: Get the Insights You Need Faster, by Matt Hertig. Ian walked away from the workshop with tangible advice on managing large volumes of data to provide meaningful insights and analysis more quickly.

JupyterDay in the Triangle

Recommended by Chief Technical Officer and Co-founder Colin Copeland

Next Conference Location: Chapel Hill

Next Conference Date: TBD

In 2018, Colin was excited to attend JupyterDay in the Triangle because it provided a chance to learn from the greater Python/Jupyter community, and it took place right around the corner in Chapel Hill. He especially enjoyed the following presentations:

  • Matthew McCormick’s talk on Interactive 3D and 2D Image Visualization for Jupyter, which demonstrated how notebooks can be used to explore very large datasets and scientific imaging data
  • Joan Pharr’s talk on Learning in Jupyter, which focused on how her company uses Jupyter notebooks for SME training and onboarding new team members

Ginny Gezzo during her presentation.

Colin’s favorite presentation was Have Yourself a Merry Little Notebook by Ginny Gezzo of IBM (pictured). She focused on using Python notebooks to solve Advent of Code (https://adventofcode.com/) puzzles. The talk highlighted the value of notebooks as a tool for exploring and experimenting with problems that you don’t know how to solve. Researching solutions can easily lead in many directions, and it’s valuable to have a medium that records what you did and how you got there, and that makes it easy to share your results with other team members.

Colin has worked with Jupyter notebooks and sees great value in their utility. For example, Caktus used them on a project with Open Data Policing to inspect merging multiple sheets of an Excel workbook in Pandas (see the project on GitHub).

Caktus booth at PyCon 2018

Above: The Caktus team and booth during PyCon 2018.

PyCon

Recommended by Technology Support Specialist Scott Morningstar

Next Conference Location: Cleveland

Next Conference Date: May 1 - 9, 2019

Cakti regularly attend and sponsor PyCon. It’s the largest annual gathering of Python users and developers. The next conference will take place in Cleveland, OH, in May 2019.

The event also includes an “unconference” that runs in parallel with the scheduled talks. Scott especially enjoyed these open sessions during the 2018 conference. PyCon dedicates space to open sessions every year: informal, community-driven meetings where anybody can post a topic with a time and place to gather. The open sessions that Scott attended covered a wide range of topics, from using Python to control CNC milling machines to how reporters can use encryption to protect sources. He also enjoyed a session on Site Reliability Engineering (SRE), which included professionals from Intel, Google, and Facebook who spoke about how they manage infrastructure at scale.

See our full recap of PyCon 2018, and our must-see talks series.

TestBash

Recommended by Quality Assurance Analyst Gerald Carlton

Next Conference Location: Varies

Next Conference Date: Multiple

TestBash was so good in 2017 that Gerald decided to attend again in 2018. The conferences are organized nationally and internationally by the Ministry of Testing, and in 2018, Gerald attended the event in San Francisco. Gerald originally learned about the event on a QA testing forum called The Club. Read about what he loved at the 2017 conference.

TestBash is a single-track conference providing a variety of talks that cover all areas of testing. Talks range from topics such as manual testing, automation, and machine learning, to less technical topics including work culture and quality.

One particularly interesting talk was given by Paul Grizzaffi, a Principal Automation Architect at Magenic, who argued that automation development is software development. The same principles used when developing features for a website also apply when building the automation that tests that website: just as code is written to build the site, code is written to create the automated scripts that test it. Automation is sometimes seen as an extra tool, but it is really something we build to perform a task, and it deserves the same development process we would apply to a new website. Paul’s talk is available on The Dojo (with a free membership), and you can read more on his blog.

TestBash provides practical information that attendees can learn and take back to their teams to implement. Attendees not only learn from the speakers but also from each other by sharing their challenges and how they overcame them. It’s also a positive environment for networking and building friendships. Gerald met people at the conference who he’s stayed in touch with and who provide a lasting sounding board.

Worth Going Again

We recommend these conferences and look forward to attending them again because they provide such valuable learning and networking opportunities.

What conferences do you plan to attend this year? If you have any recommendations, please leave them in the comments below.

Caktus GroupThe Secret Lives of Cakti (Part 2)

Pictured from left: Our musically inclined Cakti, Dane Summers, Dan Poirier, and Ian Huckabee.

The first installment of the secret lives of Cakti highlighted some colorful extracurriculars (including rescuing cats, running endurance events, and saving lives). This time, we’re taking a look at our team’s unexpected musical talents.

If you Google musicians and programming, you’ll find dozens of posts exploring the correlation between musical talent and programming expertise. Possible factors include musicians’ natural attention to detail, their trained balance between analysis and creativity, and their comfort with both solitary focus and close collaboration.

Cakti are no exception to this, and creative talent runs deep across our team. Here are a few of our musical colleagues.

Dane Summers with his fretless banjo.

Appalachian Picker Dane Summers

Contract programmer Dane is inspired by old-time Appalachian music as both a banjo player and flat foot clogger. After ten years of learning to play, he’s managed to accumulate four banjos, but his favorite (and the only currently functional one) is a fretless that he plays in the traditional Round Peak style. He’s working up to singing while he plays, at which point we hope he'll do an in-office concert.

The Multi-Talented Dan Poirier

Our sharp developer Dan has multiple musical passions. As a singer, he lends his baritone to Voices, a distinguished community chorus in Chapel Hill. You can also hear him a couple of times a month on WCPE, the Classical Station, as an announcer for Weekend Classics. Rumor has it that he’s also a dab hand at the ukulele, though until he shows off his talents at the office we won’t know for sure.

Blues Guitarist Ian Huckabee

Holding the distinction as the only Caktus team member to jam with Harry Connick, Jr., our chief business development officer Ian has played blues guitar since he was 12 years old. He also started his professional career in the music business, managing the NYC recording studios for Sony Music Entertainment. His current musical challenge is mastering Stevie Ray Vaughan’s cover of Little Wing.

Waiting for the Band to Get Together

No word yet on whether Dan, Dane, and Ian are planning to start a Caktus band, but we’ll keep you posted. If they do, they’ll have more talent to draw from: our team also includes an opera singer, multiple guitarists, a fiddle player, and others.

Want to get to know us better? Drop us a line.

Caktus GroupHow to Use Django Bulk Inserts for Greater Efficiency

It's been a while since we last discussed bulk inserts on the Caktus blog. The idea is simple: if you have an application that needs to insert a lot of data into a Django model (for example, a background task that processes a CSV file or some other text file), it pays to "chunk" those updates to the database so that multiple records are created through a single database operation. This reduces the total number of round-trips to the database, something my colleague Dan Poirier discussed in more detail in the post linked above.

Today, we use Django's Model.objects.bulk_create() regularly to help speed up operations that insert a lot of data into a database. One of those projects involves processing a spreadsheet with multiple tabs, each of which might contain thousands or even tens of thousands of records, some of which might correspond to multiple model classes. We also need to validate the data in the spreadsheet and return errors to the user as quickly as possible, so structuring the process efficiently helps to improve the overall user experience.

While it's great to have support for bulk inserts directly in Django's ORM, the ORM does not provide much assistance in terms of managing the bulk insertion process itself. One common pattern we found ourselves using for bulk insertions was to:

  1. build up a list of objects
  2. when the list got to a certain size, call bulk_create()
  3. make sure any objects remaining (i.e., which might be fewer than the chunk size of prior calls to bulk_create()) are inserted as well
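Framework details aside, those three steps can be sketched as a small generator. This is my own illustrative sketch, not code from the project: `chunked` is a made-up helper name, and each list it yields is what you would hand to `bulk_create()`.

```python
def chunked(items, chunk_size=100):
    """Yield lists of up to chunk_size items; flush the remainder last."""
    buffer = []
    for item in items:
        buffer.append(item)              # step 1: build up a list
        if len(buffer) >= chunk_size:    # step 2: a full chunk is ready
            yield buffer
            buffer = []
    if buffer:                           # step 3: the final partial chunk
        yield buffer


# Each yielded chunk would be passed to Model.objects.bulk_create():
print([len(c) for c in chunked(range(250), chunk_size=100)])  # [100, 100, 50]
```

The helper class described below wraps this same logic, adding per-model queues so one instance can serve several model classes at once.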

Since for this particular project we needed to repeat the same logic for a number of different models in a number of different places, it made sense to abstract that into a single class to handle all of our bulk insertions. The API we were looking for was relatively straightforward:

  • Set bulk_mgr = BulkCreateManager(chunk_size=100) to create an instance of our bulk insertion helper with a specific chunk size (the number of objects that should be inserted in a single query)
  • Call bulk_mgr.add(unsaved_model_object) for each model instance we needed to insert. The underlying logic should determine if/when a "chunk" of objects should be created and does so, without the need for the surrounding code to know what's happening. Additionally, it should handle objects from any model class transparently, without the need for the calling code to maintain separate object lists for each model.
  • Call bulk_mgr.done() after adding all the model objects, to insert any objects that may have been queued for insertion but not yet inserted.

Without further ado, here's a copy of the helper class we came up with for this particular project:

from collections import defaultdict
from django.apps import apps


class BulkCreateManager(object):
    """
    This helper class keeps track of ORM objects to be created for multiple
    model classes, and automatically creates those objects with `bulk_create`
    when the number of objects accumulated for a given model class exceeds
    `chunk_size`.
    Upon completion of the loop that's `add()`ing objects, the developer must
    call `done()` to ensure the final set of objects is created for all models.
    """

    def __init__(self, chunk_size=100):
        self._create_queues = defaultdict(list)
        self.chunk_size = chunk_size

    def _commit(self, model_class):
        model_key = model_class._meta.label
        model_class.objects.bulk_create(self._create_queues[model_key])
        self._create_queues[model_key] = []

    def add(self, obj):
        """
        Add an object to the queue to be created, and call bulk_create if we
        have enough objs.
        """
        model_class = type(obj)
        model_key = model_class._meta.label
        self._create_queues[model_key].append(obj)
        if len(self._create_queues[model_key]) >= self.chunk_size:
            self._commit(model_class)

    def done(self):
        """
        Always call this upon completion to make sure the final partial chunk
        is saved.
        """
        for model_name, objs in self._create_queues.items():
            if len(objs) > 0:
                self._commit(apps.get_model(model_name))

You can then use this class like so:

import csv

with open('/path/to/file', 'r') as csv_file:
    bulk_mgr = BulkCreateManager(chunk_size=20)
    for row in csv.DictReader(csv_file):
        bulk_mgr.add(MyModel(attr1=row['attr1'], attr2=row['attr2']))
    bulk_mgr.done()

I tried to simplify the code here as much as possible for the purposes of this example, but you can obviously expand it as needed to handle multiple model classes and more complex business logic. You could also potentially put bulk_mgr.done() in its own finally: or except ExceptionType: block; however, you should be careful not to write to the database again if the original exception is database-related.

Another useful pattern might be to design this as a context manager in Python. We haven't tried that yet, but you might want to.
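One way that context-manager version might look, sketched here framework-free so the chunking logic stays visible. BulkSaver is a hypothetical name, and the `saved` list is a stand-in for the database; in a real Django project, `_commit()` would call `model_class.objects.bulk_create()` instead.

```python
from collections import defaultdict


class BulkSaver:
    """Hypothetical context-manager variant of the helper: queues objects
    per class, flushes a queue when it reaches chunk_size, and flushes all
    remainders on exit, replacing the explicit done() call."""

    def __init__(self, chunk_size=100):
        self._queues = defaultdict(list)
        self.chunk_size = chunk_size
        self.saved = []  # stand-in for the database

    def _commit(self, key):
        # In Django this would be: model_class.objects.bulk_create(...)
        self.saved.extend(self._queues[key])
        self._queues[key] = []

    def add(self, obj):
        key = type(obj).__name__
        self._queues[key].append(obj)
        if len(self._queues[key]) >= self.chunk_size:
            self._commit(key)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Only flush on a clean exit, so a database-related exception in
        # the loop doesn't trigger another write.
        if exc_type is None:
            for key in list(self._queues):
                if self._queues[key]:
                    self._commit(key)


# Usage: the final partial chunk is flushed automatically on exit.
with BulkSaver(chunk_size=2) as saver:
    for i in range(5):
        saver.add(i)
print(len(saver.saved))  # 5
```

The design choice worth noting is the exc_type check in __exit__, which bakes in the caution from the post about not writing to the database again after a database-related failure.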

Good luck with speeding up your Django model inserts, and feel free to post below with any questions or comments!

Caktus GroupCaktus Blog: Top 18 Posts of 2018

In 2018, we published 44 posts on our blog, including technical how-to’s, a series on UX research methods, web development best practices, and tips for project management. Among all those posts, 18 rose to the top of the popularity list in 2018.

Most Popular Posts of 2018

  1. Creating Dynamic Forms with Django: Our most popular blog post delves into a straightforward approach to creating dynamic forms.

  2. Make ALL Your Django Forms Better: This post also focuses on Django forms. Learn how to efficiently build consistent forms, across an entire website.

  3. Django vs WordPress: How to decide?: Once you invest in a content management platform, the cost to switch later may be high. Learn about the differences between Django and WordPress, and see which one best fits your needs.

  4. Basics of Django Rest Framework: Django Rest Framework is a library which helps you build flexible APIs for your project. Learn how to use it, with this intro post.

  5. How to Fix your Python Code's Style: When you inherit code that doesn’t follow your style preferences, fix it quickly with the instructions in this post.

  6. Filtering and Pagination with Django: Learn to build a list page that allows filtering and pagination by enhancing Django with tools like django_filter.

  7. Better Python Dependency Management with pip-tools: One of our developers looked into using pip-tools to improve his workflow around projects' Python dependencies. See what he learned with pip-tools version 2.0.2.

  8. Types of UX Research: User-centered research is an important part of design and development. In this first post in the UX research series, we dive into the different types of research and when to use each one.

  9. Outgrowing Sprints: A Shift from Scrum to Kanban: Caktus teams have used Scrum for over two years. See why one team decided to switch to Kanban, and the process they went through.

  10. Avoiding the Blame Game in Scrum: The words we use, and the tone in which we use them, can either nurture or hinder the growth of Scrum teams. Learn about the importance of communicating without placing blame.

  11. What is Software Quality Assurance?: A crucial but often overlooked aspect of software development is quality assurance. Find out more about its value and why it should be part of your development process.

  12. Quick Tips: How to Find Your Project ID in JIRA Cloud: Have you ever created a filter in JIRA full of project names and returned to edit it, only to find all the project names replaced by five-digit numbers with no context? Learn how to find the project in both the old and new JIRA experience.

  13. UX Research Methods 2: Analyzing Behavior: Learn about UX research methods best suited to understand user behavior and its causes.

  14. UX Research Methods 3: Evaluating What Is: One set of techniques included in UX research involves evaluating the landscape and specific instances of existing user experience. Learn more about competitive landscape review.

  15. Django or Drupal for Content Management: Which Fits your Needs?: If you’re building or updating a website, you should integrate a content management system (CMS). See the pros and cons of Django and Drupal, and learn why we prefer Django.

  16. 5 Scrum Master Lessons Learned: Whether your team is new to Scrum or not, check out these lessons learned. Some are practical, some are abstract, and some are helpful reminders like “Stop being resistant to change, let yourself be flexible."

  17. Add Value To Your Django Project With An API: This post for business users and beginning coders outlines what an API is and how it can add value to your web development project.

  18. Caktus Blog: Best of 2017: How appropriate that the last post in this list is about our most popular posts from the previous year! So, when you’ve read the posts above, check out our best posts from 2017.

Thank You for Reading Our Blog

We look forward to giving you more content in 2019, and we welcome any questions, suggestions, or feedback. Simply leave a comment below.

Caktus GroupMy New Year’s Resolution: Work Less to Code Better

You may look at my job title (or picture) and think, “Oh, this is easy, he’s going to resolve to stand up at his desk more.” Well, you’re not wrong, that is one of my resolutions, but I have an even more important one. I, Jeremy Gibson, resolve to do less work in 2019. You’re probably thinking that it’s bold to admit this on my employer’s blog. Again, you’re not wrong, but I think I can convince them that the less work I do, the more clear and functional my code will become. My resolution has three components.

1) I will stop using os.path to do path manipulations and will only use pathlib.Path on any project that uses Python 3.4+

I acknowledge that pathlib is better than me at keeping operating system eccentricities in mind. It is also better at keeping my code DRYer and more readable. I will not fight that.

Let's take a look at an example that is very close to parity. First, a simple case using os.path and pathlib.

  # Opening a file with os.path
  import os

  p = 'my_file.txt'

  if not os.path.exists(p): open(p, 'a').close()

  with open(p) as fh:
    # Manipulate

Next, pathlib.Path

  # Opening a file with Path

  from pathlib import Path

  p = Path("my_file.txt")

  if not p.exists(): p.touch()

  with p.open() as fh:
    # Manipulate

This seems like a minor improvement, if any at all, but hear me out. The pathlib version is more internally consistent. Pathlib sticks to its own idiom, whereas os.path must step outside of itself to accomplish path related tasks like file creation. While this might seem minor, not having to code switch to accomplish a task can be a big help for new developers and veterans, too.

Not convinced by the previous example? Here’s a more complex example of path work that you might typically run across during development — validating a set of files in one location and then moving them to another location, while making the code workable over different operating systems.

With os.path

  import os
  import shutil

  p_source = os.path.join(os.getcwd(), "my", "source", "path")
  p_target = os.path.join("some", "target", "path")

  for root, dirs, files in os.walk(p_source):
    for f in files:
      if f.endswith(".tgz"):
        # Validate
        if valid:
          shutil.move(os.path.join(root, f), p_target)

With pathlib

  from pathlib import Path

  # pathlib translates path separators
  p_source = Path.cwd() / "my" / "source" / "path"
  p_target = Path("some/target/path")

  for pth in p_source.rglob("*.tgz"):
    # Validate
    if valid:
      pth.rename(p_target / pth.name)

Note: with pathlib I don't have to worry about os.sep. Less work! More readable!
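That separator handling can be seen directly with pathlib's "pure" path flavors, which model each operating system's rules without touching the filesystem:

```python
from pathlib import PurePosixPath, PureWindowsPath

# The same joins, rendered with each platform's separator:
posix = PurePosixPath("some") / "target" / "path"
windows = PureWindowsPath("some") / "target" / "path"

print(posix)    # some/target/path
print(windows)  # some\target\path

# Both flavors also accept forward slashes on input:
print(PureWindowsPath("some/target/path") == windows)  # True
```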

Also, as in the first example, all path manipulation and control is now contained within the library, so there is no need to pull in outside os functions or shutil modules. To me, this is more satisfying. When working with paths, it makes sense to work with one type of object that understands itself as a path, rather than different collections of functions nested in other modules.

Ultimately, for me, this is a more human way to think about the processes that I am manipulating, which makes it easier and less work. Yaay!

2) I will start using f'' strings on Python 3.6+ projects.

I acknowledge that adding .format() is a waste of precious line characters (I'm looking at you, PEP 8) and % notation is unreadable. f'' strings make my code more elegant and easier to read. They also sit closer to the other idioms used by Python, like r'' and b'' and the (no longer necessary on Python 3) u''. Yes, this is a small thing, but less work is the goal.

    for k, v in somedict.items():
        print('The key is {}\n The value is {}'.format(k, v))

vs.

    for k, v in somedict.items():
        print(f'The key is {k}\n The value is {v}')

Another advantage in readability and maintainability is that I don't have to keep track of parameter position as before with .format(k, v) if I later decide that I really want v before k.

3) I will work toward, as much as possible, writing my tests before I write my code.

I acknowledge that I am bad about jumping into a problem, trying to solve it before I fully understand the behavior I want to see (don't judge me, I know some of you do this, too). I hope, foolishly, that the behavior will reveal itself as I solve the various problems that crop up.

Writing your tests first may seem unintuitive, but hear me out. This is known as Test Driven Development. Rediscovered by Kent Beck in 2003, it is a programming methodology that seeks to tackle the problem of managing code complexity with our puny human brains.

The basic concept is simple: to understand how to build your program you must understand how it will fail. So, the first thing that you should do is write tests for the behaviors of your program. These tests will fail and that is good because now you (the programmer with the puny human brain) have a map for your code. As you make each test pass, you will quickly know if the code doesn’t play well with other parts of the code, causing the other tests to fail.
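As a tiny illustration of that red-green loop (count_unique_words is a made-up example, not from any particular project): the test is written first, fails while the function doesn't exist, and then drives the minimal implementation.

```python
import unittest


# Step 1: specify the behavior first. Running this before the function
# exists fails, and that failure is the "map" for the code.
class TestCountUniqueWords(unittest.TestCase):
    def test_counts_unique_words(self):
        self.assertEqual(count_unique_words("the cat the hat"), 3)

    def test_empty_string(self):
        self.assertEqual(count_unique_words(""), 0)


# Step 2: write just enough code to turn the failing tests green.
def count_unique_words(text):
    return len(set(text.split()))
```

Run with `python -m unittest` against the module; each subsequent change to count_unique_words immediately tells you whether it broke the specified behavior.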

This idea is closely related to Acceptance Test Driven Development, which you may have also heard of, and is mentioned in this Caktus post.

It All Adds Up

Although these three parts of my resolution are not huge, together they will allow me to work less: initially, as I write the code, and then in the future, when I come back to code that I wrote two sprints ago and that is now a mystery to me.

So that's it, I'm working less next year, and that will make my code better.

Caktus GroupHow to Fix your Python Code's Style

Sometimes we inherit code that doesn't follow the style guidelines we prefer when we're writing new code. We could just run flake8 on the whole codebase and fix everything before we continue, but that's not necessarily the best use of our time.

Another approach is to update the styling of files when we need to make other changes to them. To do that, it's helpful to be able to run a code style checker on just the files we're changing. I've written tools to do that for various source control systems and languages over the years. Here's the one I'm currently using for Python and flake8.

I call this script flake. I have a key in my IDE bound to run it and show the output so I can click on each line to go to the code that has the problem, which makes it pretty easy to fix things.

It can run in two modes. By default, it checks any files that have uncommitted changes. Or I can pass it the name of a git branch, and it checks all files that have changes compared to that branch. That works well when I'm working on a feature branch that is several commits downstream from develop and I want to be sure all the files I've changed while working on the feature are now styled properly.

The script is in Python, of course.

Work from the repository root

Since we're going to work with file paths output from git commands, it's simplest if we first make sure we're in the root directory of the repository.

#!/usr/bin/env python3
import os
import os.path
import subprocess

if not os.path.isdir('.git'):
    print("Working dir: %s" % os.getcwd())
    result = subprocess.run(['git', 'rev-parse', '--show-toplevel'], stdout=subprocess.PIPE)
    top = result.stdout.decode().rstrip('\n')
    os.chdir(top)
    print("Changed to %s" % top)

We use git rev-parse --show-toplevel to find out what the top directory in the repository working tree is, then change to it. But first we check for a .git directory, which tells us we don't need to change directories.

Find files changed from a branch

If a branch name is passed on the command line, we want to identify the Python files that have changed compared to that branch.

import sys
...
if len(sys.argv) > 1:
    # Run against files that are different from *branch_name*
    branch_name = sys.argv[1]
    cmd = ["git", "diff", "--name-status", branch_name, "--"]
    out = subprocess.check_output(cmd).decode('utf-8')
    changed = [
        # "M\tfilename"
        line[2:]
        for line in out.splitlines()
        if line.endswith(".py") and "migrations" not in line and line[0] != 'D'
    ]

We use git diff --name-status <branch-name> -- to list the changes compared to the branch. We skip file deletions — that means we no longer have a file to check — and migrations, which never seem to quite be PEP-8 compliant and which I've decided aren't worth trying to fix. (You may decide differently, of course.)

Find files with uncommitted changes

Alternatively, we just look at the files that have uncommitted changes.

else:
    # See what files have uncommitted changes
    cmd = ["git", "status", "--porcelain", "--untracked=no"]
    out = subprocess.check_output(cmd).decode('utf-8')
    changed = []
    for line in out.splitlines():
        if "migrations" in line:
            # Auto-generated migrations are rarely PEP-8 compliant. It's a losing
            # battle to always fix them.
            continue
        if line.endswith('.py'):
            if '->' in line:
                # A file was renamed. Consider the new name changed.
                parts = line.split(' -> ')
                changed.append(parts[1])
            elif line[0] == 'M' or line[1] != ' ':
                changed.append(line[3:])

Here we take advantage of git --porcelain to ensure the output won't change from one git version to the next, and it's fairly easy to parse in a script. (Maybe I should investigate using --porcelain with the other git commands in the script, but what I have now works well enough.)
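To make the parsing concrete, here is what a few porcelain lines look like (the paths are fabricated for illustration) and what the loop above extracts from them:

```python
# Sample `git status --porcelain` output: a two-character status code,
# a space, then the path; renames use " -> " between old and new names.
out = (
    "M  app/models.py\n"
    " M app/views.py\n"
    "R  old_name.py -> new_name.py\n"
    "M  app/migrations/0001_initial.py\n"
    "M  README.rst\n"
)

changed = []
for line in out.splitlines():
    if "migrations" in line:
        continue  # skip auto-generated migrations
    if line.endswith('.py'):
        if '->' in line:
            # A file was renamed; keep the new name.
            parts = line.split(' -> ')
            changed.append(parts[1])
        elif line[0] == 'M' or line[1] != ' ':
            changed.append(line[3:])

print(changed)  # ['app/models.py', 'app/views.py', 'new_name.py']
```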

Run flake8 on the changed files

Either way, changed now has a list of the files we want to run flake8 on.

cmd = ['flake8'] + changed
rc = subprocess.call(cmd)
if rc:
    print("Flake8 checking failed")
    sys.exit(rc)

Running flake8 with subprocess.call this way sends the output to stdout so we can see it. flake8 will exit with a non-zero status if there are problems; we print a message and also exit with a non-zero status.

Wrapping up

I might have once written a script like this in Shell or Perl, but Python turns out to work quite well once you get a handle on the subprocess module.

The resulting script is useful for me. I hope you'll find parts of it useful too, or at least see something you can steal for your own scripts.

Caktus GroupOur Top Tip for Computer Security

‘Tis the season for shopping online, sending cute holiday e-cards, and emailing photos to grandparents. But during all this festive online activity, how much do you think about your computer security? For example, is your password different for every shopping and e-card site that you use? If not, it should be!

Given that Friday, November 30, is Computer Security Day, it’s a good time to consider whether your online holiday habits are putting you at risk of a data breach. And our top tip is to use a different password for every website and online account. You’ve probably heard this a hundred times already, but it’s the first line of defense that you have against attacks.

We all should take computer and internet security seriously. The biggest threat to ordinary users is password reuse, like having the same (or similar) username and password combination for Amazon, Facebook, and your health insurance website. This issue is frighteningly common — the resource Have I Been Pwned has collected 5.6 billion username and password pairs since 2013. Once attackers breach one of your online accounts, they then try the same username and password on sites across the internet, looking for another match.

If one password on one website is breached, then all your other accounts with the same password are vulnerable.

It’s worth reiterating: Don’t use the same password on more than one website. Otherwise, your accounts are an easy target for an attacker to gain valuable data like your credit card number and go on a holiday shopping spree that’ll give you a headache worse than any eggnog hangover you’ve ever had!

More Tips to Fend Off an Online Grinch

Here are a few more tips for password security, to help protect your personal information from attacks, scams, phishing, and other unsavory Grinch-like activity:

  1. Create a strong password for every website and online account. A password manager like LastPass or 1Password can help you create unique passwords for every online account. Be sure to also choose a strong passphrase with 2-factor authentication for your password manager login, and then set it up to automatically generate passwords for you.

  2. Choose 2-factor authentication. Many websites now offer some means of 2-factor authentication. It takes a few more minutes to set up, but it’s worth it. Do this on as many websites as possible to make your logins more secure.

  3. Do not send personal or business-related passwords via email. It may be an easy means of communication, but email is not a secure method of communication.

Have Holly, Jolly Holidays

You have an online footprint consisting of various accounts, email providers, social media, and web browsing history. Essential personal info, like your health records, banking and credit records are online, too. All of this info is valuable and sellable to someone, and the tools they use to steal your data are cheap. All they need to do is get one credit card number and the payoff may be huge. Don’t let that credit card number be yours, otherwise, you won’t have a very jolly holiday.

Be vigilant, especially around the holidays, when there’s an increase in online commerce and communication, and therefore a greater chance that an attacker may succeed in getting the info they want from you.

Caktus GroupDjango Depends on You: A Takeaway from DjangoCon

Photo by Bartek Pawlik.

DjangoCon 2018 attracted attendees from around the world, including myself and several other Cakti (check out our DjangoCon recap post). Having attended a number of DjangoCons in the past, I looked forward to reconnecting with old colleagues and friends within the community, learning new things about our favorite framework, and exploring San Diego.

While it was a privilege to attend DjangoCon in person, you can experience it remotely. Thanks to technology and the motivated organizers, you can view a lot of the talks online. For that, I am thankful to the DjangoCon organizers, sponsors, and staff that put in the time and energy to ensure that these talks are readily available for viewing on YouTube.

Learn How to Give Back to the Django Framework

While I listened to a lot of fascinating talks, there was one that stood out and was the most impactful to me. I also think it is relevant and important for the whole Django community. If you have not seen it, I encourage you to watch and rewatch Carlton Gibson’s “Your web framework needs you!". Carlton was named a Django Fellow in January of 2018 and provides a unique perspective on the state of Django as an open source software project, from the day-to-day management, to the (lack of) diversity amongst the primary contributors, to the ways that people can contribute at the code and documentation levels.

This talk resonated with me because I have worked with open source software my entire career. It has enabled me to bootstrap and build elegant solutions with minimal resources. Django and its ilk have afforded me opportunities to travel the globe and engage with amazing people. However, in over 15 years of experience, my contributions back to the software and communities that have served me well have been nominal in comparison to the benefits I have received. But I came away from the talk highly motivated to contribute more, and am eager to get that ball rolling.

Carlton says in his talk, “we have an opportunity to build the future of Django here.” He’s right, our web framework needs us, and via his talk you will discover how to get involved in the process, as well as what improvements are being made to simplify onboarding. I agree with Carlton, and believe it’s imperative to widen the net of contributors by creating multiple avenues for contributions that are easily accessible and well supported. Contributions are key to ensuring a sound future for the Django framework. Whether it’s improving documentation, increasing test coverage, fixing bugs, building new features, or some other aspect that piques your interest, be sure to do your part for your framework. The time that I am able to put toward contributing to open source software has always supplied an exponential return, so give it a try yourself!

Watch the talk to see how you can contribute to the Django framework.

Caktus GroupDjangoCon 2018 Recap

Above: Hundreds of happy Djangonauts at DjangoCon 2018. (Photo by Bartek Pawlik.)

That’s it, folks — another DjangoCon in the books! Caktus was thrilled to sponsor and attend this fantastic gathering of Djangonauts for the ninth year running. This year’s conference ran from October 14 - 19, in sunny San Diego. ☀️

Our talented Caktus contractor Erin Mullaney was a core member of this year’s DjangoCon organizing team, plus five more Cakti joined as participants: CTO Colin Copeland, technical manager Karen Tracey, sales engineer David Ray, CBDO Ian Huckabee, and myself, account exec Tim Scales.

What a Crowd!

At Caktus we love coding with Django, but what makes Django particularly special is the remarkable community behind it. From the inclusive code of conduct to the friendly smiles in the hallways, DjangoCon is a welcoming event and a great opportunity to meet and learn from amazing people. With over 300 Django experts and enthusiasts attending from all over the world, we loved catching up with old friends and making new ones.

What a Lineup!

DjangoCon is three full days of impressive and inspiring sessions from a diverse lineup of presenters. Between the five Cakti there, we managed to attend almost every one of the presentations.

We particularly enjoyed Anna Makarudze’s keynote address about her journey with coding, Russell Keith-Magee’s hilarious talk about tackling time zone complexity, and Tom Dyson’s interactive presentation about Django and Machine Learning. (Videos of the talks should be posted soon by DjangoCon.)

What a Game!

Thanks to the 30+ Djangonauts who joined us for the Caktus Mini Golf Outing on Tuesday, October 16! Seven teams putted their way through the challenging course at Belmont Park, talking Django and showing off their mini golf skills. We had fun meeting new friends and playing a round during the beautiful San Diego evening.

Thanks to all the organizers, volunteers, and fellow sponsors who made DjangoCon 2018 a big success. We look forward to seeing you again next year!

Caktus GroupFiltering and Pagination with Django

If you want to build a list page that allows filtering and pagination, you have to get a few separate things to work together. Django provides some tools for pagination, but the documentation doesn't tell us how to make that work with anything else. Similarly, django_filter makes it relatively easy to add filters to a view, but doesn't tell you how to add pagination (or other things) without breaking the filtering.

The heart of the problem is that both features use query parameters, and we need to find a way to let each feature control its own query parameters without breaking the other one.

Filters

Let's start with a review of filtering, with an example of how you might subclass ListView to add filtering. To make it filter the way you want, you need to create a subclass of FilterSet and set filterset_class to that class. (See that link for how to write a filterset.)

class FilteredListView(ListView):
    filterset_class = None

    def get_queryset(self):
        # Get the queryset however you usually would.  For example:
        queryset = super().get_queryset()
        # Then use the query parameters and the queryset to
        # instantiate a filterset and save it as an attribute
        # on the view instance for later.
        self.filterset = self.filterset_class(self.request.GET, queryset=queryset)
        # Return the filtered queryset
        return self.filterset.qs.distinct()

    def get_context_data(self, **kwargs):
        context = super().get_context_data(**kwargs)
        # Pass the filterset to the template - it provides the form.
        context['filterset'] = self.filterset
        return context

Here's an example of how you might create a concrete view to use it:

class BookListView(FilteredListView):
    filterset_class = BookFilterset

And here's part of the template that uses a form created by the filterset to let the user control the filtering.

<h1>Books</h1>
  <form action="" method="get">
    {{ filterset.form.as_p }}
    <input type="submit" />
  </form>

<ul>
    {% for object in object_list %}
        <li>{{ object }}</li>
    {% endfor %}
</ul>

filterset.form is a form that controls the filtering, so we just render that however we want and add a way to submit it.

That's all you need to make a simple filtered view.

Default values for filters

I'm going to digress slightly here and show a way to give filters default values, so that, for example, when a user first loads the page, the items are sorted with the most recent first. I couldn't find anything about this in the django-filter documentation, and it took me a while to figure out a good solution.

To do this, I override __init__ on my filter set and add default values to the data being passed:

class BookFilterSet(django_filters.FilterSet):
    def __init__(self, data, *args, **kwargs):
        data = data.copy()
        data.setdefault('format', 'paperback')
        data.setdefault('order', '-added')
        super().__init__(data, *args, **kwargs)

I tried some other approaches, but this one turned out to be the simplest, in that it didn't break or complicate anything anywhere else.
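
The key to this pattern is that setdefault only fills in values the user hasn't supplied. A plain-dict sketch (standing in for the copied QueryDict in the __init__ above) shows the behavior:

```python
# User-supplied query parameters (a stand-in for request.GET)
data = {"format": "hardcover"}

params = dict(data)  # copy, as in the FilterSet __init__
params.setdefault("format", "paperback")  # already present: kept as-is
params.setdefault("order", "-added")      # missing: default applied

print(params)  # → {'format': 'hardcover', 'order': '-added'}
```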

Combining filtering and pagination

Unfortunately, linking to pages as described above breaks filtering. More specifically, whenever you follow one of those links, the view will forget whatever filtering the user has applied, because that filtering is also controlled by query parameters, and these links don't include the filter's parameters.

So if you're on a page https://example.com/objectlist/?type=paperback and then follow a page link, you'll end up at https://example.com/objectlist/?page=3 when you wanted to be at https://example.com/objectlist/?type=paperback&page=3.

It would be nice if Django offered a way to build links that change one query parameter without losing the existing ones. It doesn't, but I found a good example of a template tag on Stack Overflow and modified it slightly into this custom tag that does exactly that:

# <app>/templatetags/my_tags.py
from django import template

register = template.Library()


@register.simple_tag(takes_context=True)
def param_replace(context, **kwargs):
    """
    Return encoded URL parameters that are the same as the current
    request's parameters, only with the specified GET parameters added or changed.

    It also removes any empty parameters to keep things neat,
    so you can remove a param by setting it to ``""``.

    For example, if you're on the page ``/things/?with_frosting=true&page=5``,
    then

    <a href="/things/?{% param_replace page=3 %}">Page 3</a>

    would expand to

    <a href="/things/?with_frosting=true&page=3">Page 3</a>

    Based on
    https://stackoverflow.com/questions/22734695/next-and-before-links-for-a-django-paginated-query/22735278#22735278
    """
    d = context['request'].GET.copy()
    for k, v in kwargs.items():
        d[k] = v
    for k in [k for k, v in d.items() if not v]:
        del d[k]
    return d.urlencode()
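
Outside of a template, the same logic can be sketched (and unit-tested) with plain urllib. This standalone helper is hypothetical, given a different name to make clear it isn't the tag itself, and only mirrors what the tag does for single-valued parameters:

```python
from urllib.parse import parse_qs, urlencode


def replace_params(query, **kwargs):
    """Mirror of the param_replace tag, for illustration outside Django.

    Keeps existing single-valued parameters, overrides or adds the given
    ones, and drops any parameter whose value is empty.
    """
    params = {k: v[-1] for k, v in parse_qs(query).items()}
    params.update(kwargs)
    return urlencode({k: v for k, v in params.items() if v})


print(replace_params("with_frosting=true&page=5", page=3))
# → with_frosting=true&page=3
```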

Here's how you can use that template tag to build pagination links that preserve other query parameters used for things like filtering:

{% load my_tags %}

{% if is_paginated %}
  {% if page_obj.has_previous %}
    <a href="?{% param_replace page=1 %}">First</a>
    {% if page_obj.previous_page_number != 1 %}
      <a href="?{% param_replace page=page_obj.previous_page_number %}">Previous</a>
    {% endif %}
  {% endif %}

  Page {{ page_obj.number }} of {{ paginator.num_pages }}

  {% if page_obj.has_next %}
    {% if page_obj.next_page_number != paginator.num_pages %}
      <a href="?{% param_replace page=page_obj.next_page_number %}">Next</a>
    {% endif %}
    <a href="?{% param_replace page=paginator.num_pages %}">Last</a>
  {% endif %}

  <p>Objects {{ page_obj.start_index }} to {{ page_obj.end_index }}</p>
{% endif %}

Now, if you're on a page like https://example.com/objectlist/?type=paperback&page=3, the links will look like ?type=paperback&page=2, ?type=paperback&page=4, etc.

Caktus GroupThe Secret Lives of Cakti

Pictured from left: Caktus team members Vinod Kurup, Karen Tracey, and David Ray.

The Caktus team includes expert developers, sharp project managers, and eagle-eyed QA analysts. However, you may not know that there’s more to them than meets the eye. Here’s a peek at how Cakti spend their off-hours.

Vinod Kurup, M.D.

By day Vinod is a mild-mannered developer, but at night he swaps his keyboard for a stethoscope and heads to the hospital. Vinod’s first career was in medicine, and prior to Caktus he worked many years as an MD. While he’s now turned his expertise to programming, he still works part-time as a hospitalist. Now that’s what I call a side hustle.

Karen Tracey, Cat Rescuer

When Karen isn’t busy as both lead developer and technical manager for Caktus, she works extensively with Alley Cats and Angels, a local cat rescue organization dedicated to improving the lives and reducing the population of homeless cats in the Triangle area. She regularly fosters cats and kittens, which is why you sometimes find feline friends hanging out in the Caktus office.

David Ray, Extreme Athlete

Software development and extreme physical endurance training don’t generally go together, but let me introduce you to developer/sales engineer David. When not building solutions for Caktus clients, David straps on a 50-pound pack and completes 24-hour rucking events. Needless to say, he’s one tough Caktus. (Would you believe he’s also a trained opera singer?)

Pictured: David Ray at a recent rucking event.

These are just a few of our illustrious colleagues! Our team also boasts folk musicians, theater artists, sailboat captains, Appalachian cloggers, martial artists, and more.

Want to get to know us better? Drop us a line.

Caktus GroupDjango or Drupal for Content Management: Which Fits your Needs?

If you’re building or updating a website, you’re probably wondering about which content management system (CMS) to use. A CMS helps users — particularly non-technical users — to add pages and blog posts, embed videos and images, and incorporate other content into their site.

CMS options

You could go with something quick and do-it-yourself, like WordPress (read more about WordPress) or a drag-and-drop builder like Squarespace. If you need greater functionality, like user account management or asset tracking, or if you’re concerned about security and extensibility, you’ll need a more robust CMS. That means using a framework to build a complex website that can manage large volumes of data and content.

Wait, what’s a framework?

Put simply, a framework is a library of reusable code that web developers build on and adapt to produce custom products more quickly than coding everything from scratch.

Django and Drupal are both frameworks with dedicated functionality for content management, but there is a key difference between them:

  • Drupal combines aspects of a web application framework with aspects of a CMS
  • Django separates the framework and the CMS

The separation that Django provides makes it easier for content managers to use the CMS because they don’t have to tinker with the technical aspects of the framework. A popular combination is Django and Wagtail, which is our favorite CMS.

I think I’ve heard of Drupal ...

Drupal is open source and built with the PHP programming language. For some applications, its customizable templates and quick integrations make it a solid choice. It’s commonly used in higher education settings, among others.

However, Drupal’s predefined templates and plugins can also be its weakness: while they are useful for building a basic site, they are limiting if you want to scale the application. You’ll quickly run into challenges attempting to extend the basic functionality, including adding custom integrations and nonstandard data models.

Other criticisms include:

  • Poor backwards compatibility, particularly for versions earlier than Drupal 7. In this case, updating a Drupal site requires developers to rewrite code for elements of the templates and modules to make them compatible with the newest version. Staying up-to-date is important for security reasons, which can become problematic if the updates are put off too long.
  • Unit testing is difficult due to Drupal’s method of storing configurations in a database, making it difficult to test the effects of changes to sections of the code. Failing to do proper testing may allow errors to make it to the final version of the website.
  • Another database-related challenge lies in how the site configuration is managed. If you’re trying to implement changes on a large website consisting of thousands of individual content items or users, none of the things that usually make this easier — like the ability to view line-by-line site configuration changes during code review — are possible.

What does the above mean for non-technical stakeholders? Development processes are slowed down significantly because developers have to pass massive database files back and forth with low visibility into the changes made by other team members. It also means there is an increased likelihood that errors will reach the public version of your website, creating even more work to fix them.

Caktus prefers Django

Django is used by complex, high-profile websites, including Instagram, Pinterest, and Eventbrite. It’s written in the powerful, open-source Python programming language, and Django itself was created specifically to speed up the process of web development. It’s fast, secure, scalable, and intended for use with database-driven web applications.

A huge benefit of Django is more control over customization, plus data can easily be converted. Since it’s built on Python, Django uses a paradigm called object-oriented programming, which makes it easier to manage and manipulate data, troubleshoot errors, and re-use code. It’s also easier for developers to see where changes have been made in the code, simplifying the process of updating the application after it goes live.

How to choose the right tool

Consider the following factors when choosing between Drupal and Django:

  • Need for customization
  • Internal capacity
  • Planning for future updates

Need for customization: If your organization has specific, niche features or functionality that require custom development — for example, data types specific to a library, university, or scientific application — Django is the way to go. It requires more up-front development than template-driven Drupal but allows greater flexibility and customization. Drupal is a good choice if you’re happy to use templates to build your website and don’t need customization.

Internal capacity: Drupal’s steep learning curve means that it may take some time for content managers to get up to speed. In comparison, we’ve run training workshops that get content management teams up and running on Django-based Wagtail in only a day or two. Wagtail’s intuitive user interface makes it easier to manage regular content updates, and the level of customization afforded by Django means the user interface can be developed in a way that feels intuitive to users.

Planning for future updates: Future growth and development should be taken into account when planning a web project. The choices made during the initial project phase will impact the time, expense, and difficulty of future development. As mentioned, Drupal has backwards compatibility challenges, and therefore a web project envisioned as fast-paced and open to frequent updates will benefit from a custom Django solution.

Need a second opinion?

Don’t just take our word for it. Here’s what Brad Busenius at the University of Chicago says about their Django solution:

"[It impacts] the speed and ease at which we can create highly custom interfaces, page types, etc. Instead of trying to bend a general system like Drupal to fit our specific needs, we're able to easily build exactly what we want without any additional overhead. Also, since we're often understaffed, the fact that it's a developer-friendly system helps us a lot. Wagtail has been a very positive experience so far."

The bottom line

Deciding between Django and Drupal comes down to your specific needs and goals, and it’s worth considering the options. That said, based on our 10+ years of experience developing custom websites and web applications, we almost always recommend Django with Wagtail because it’s:

  • Easier to update and maintain
  • More straightforward for content managers to learn and use
  • More efficient with large data sets and complex queries
  • Less likely to let errors slip through the cracks

If you want to consider Django and whether it will suit your next project, we’d be happy to talk it through and share some advice. Get in touch with us.

Caktus GroupDiverse Speaker Line-up for DjangoCon is Sharp

Above: Caktus Account Manager Tim Scales gears up for DjangoCon.

We’re looking forward to taking part in the international gathering of Django enthusiasts at DjangoCon 2018, in San Diego, CA. We’ll be there from October 14 - 19, and we’re proud to attend as sponsors for the ninth year! As such, we’re hosting a mini golf event for attendees (details below).

This year’s speakers are impressive, thanks in part to Erin Mullaney, one of Caktus’ talented developers, who volunteered with DjangoCon’s Program Team. The three-person team, including Developer Jessica Deaton of Wilmington, NC, and Tim Allen, IT Director at The Wharton School, reviewed 257 speaker submissions. They ultimately chose the speakers with the help of a rating system that included community input.

“It was a lot of fun reading the submissions,” said Erin, who will also attend DjangoCon. “I’m really looking forward to seeing the talks this year, especially because I now have a better understanding of how much work goes into the selection process.”

Erin and the program team also created the talk schedule. The roster of speakers includes more women and underrepresented communities due to the DjangoCon diversity initiatives, which Erin is proud to support.

What we’re excited about

Erin said she’s excited about a new State of Django panel that will take place on Wednesday, October 17, which will cap off the conference portion of DjangoCon, before the sprints begin. It should be an informative wrap-up session.

Karen Tracey, our Lead Developer and Technical Manager, is looking forward to hearing “Herding Cats with Django: Technical and social tools to incentivize participation” by Sage Sharp. This talk seems relevant to the continued vibrancy of Django's own development, said Karen, since the core framework and various standard packages are developed with limited funding and rely tremendously on volunteer participation.

Our Account Manager Tim Scales is particularly excited about Tom Dyson’s talk, “Here Come The Robots,” which will explore how people are leveraging Django for machine learning solutions. This is an emerging area of interest for our clients, and one of particular interest to Caktus as we grow our areas of expertise.

Other talks we’re looking forward to include:

Follow us on Twitter @CaktusGroup and #DjangoCon to stay tuned on the talks.

Golf anyone?

If you’re attending DjangoCon, come play a round of mini golf with us. Look for our insert in your conference tote bag. It includes a free pass to a mini golf outing that we’re hosting at Tiki Town Adventure Golf on Tuesday, October 16, at 7:00 p.m. (please RSVP online). The first round of golf is on us! Whoever shoots the lowest score will win a $100 Amazon gift card.*

No worries if you’re not into mini golf! Instead, find a time to chat with us one-on-one during DjangoCon.

*In the event of a tie, the winner will be selected from a random drawing from the names of those with the lowest score. Caktus employees can play, but are not eligible for prizes.

Caktus GroupBetter Python Dependency Management with pip-tools

I recently looked into whether I could use pip-tools to improve my workflow around projects' Python dependencies. My conclusion was that pip-tools would help on some projects, but it wouldn't do everything I wanted, and I couldn't use it everywhere. (I tried pip-tools version 2.0.2 in August 2018. If there are newer versions, they might fix some of the things I ran into when trying pip-tools.)

My problems

What were the problems I wanted to solve that pip alone wasn't handling? Software engineer Kenneth Reitz explains them pretty well in his post, but I'll summarize here.

Let me start by briefly describing the environments I'm concerned with. First is my development environment, where I want to manage the dependencies. Second is the test environment, where I want to know exactly what packages and versions we test with, because then we come to the deployed environment, where I want to use exactly the same Python packages and versions as I've used in development and testing, to be sure no problems are introduced by an unexpected package upgrade.

The way we often handle that is to have a requirements file with every package and its version specified. We might start by installing the packages we know that we need, then saving the output of pip freeze to record all the dependencies that also got installed and their versions. Installing into an empty virtual environment using that requirements file gets us the same packages and versions.

But there are several problems with that approach.

First, we no longer know which packages in that file we originally wanted, and which were pulled in as dependencies. For example, maybe we needed Celery, but installing it pulled in a half-dozen other packages. Later we might decide we don't need Celery anymore and remove it from the requirements file, but we don't know which other packages we can also safely remove.

Second, it gets very complicated if we want to upgrade some of the packages, for the same reasons.

Third, having to do a complete install of all the packages into an empty virtual environment can be slow, which is especially aggravating when we know little or nothing has changed, but that's the only way to be sure we have exactly what we want.

Requirements

To list my requirements more concisely:

  • Distinguish direct dependencies and versions from incidental
  • Freeze a set of exact packages and versions that we know work
  • Have one command to efficiently update a virtual environment to have exactly the frozen packages at the frozen versions and no other packages
  • Make it reasonably easy to update packages
  • Work with both installing from PyPI, and installing from Git repositories
  • Take advantage of pip's hash checking to give a little more confidence that packages haven't been modified
  • Support multiple sets of dependencies (e.g. dev vs. prod, where prod is not necessarily a subset of dev)
  • Perform reasonably well
  • Be stable

That's a lot of requirements. It turned out that I could meet more of them with pip-tools than just pip, but not all of them, and not for all projects.

Here's what I tried, using pip, virtualenv, and pip-tools.

How to set it up

  1. I put the top-level requirements in requirements.in/*.txt.

    To manage multiple sets of dependencies, we can include "-r file.txt", where "file.txt" is another file in requirements.in, as many times as we want. So we might have a base.txt, a dev.txt that starts with -r base.txt and then adds django-debug-toolbar etc, and a deploy.txt that starts with -r base.txt and then adds gunicorn.

    There's one annoyance that seems minor at this point, but turns out to be a bigger problem: pip-tools only supports URLs in these requirements files if they're marked editable with -e.

# base.txt
Django<2.0
-e git+https://github.com/caktus/django-scribbler@v0.8.0#egg=django-scribbler

# dev.txt
-r base.txt
django-debug-toolbar

# deploy.txt
-r base.txt
gunicorn
  2. Install pip-tools in the relevant virtual environment:
$ <venv>/bin/pip install pip-tools
  3. Compile the requirements as follows:
$ <venv>/bin/pip-compile --output-file requirements/dev.txt requirements.in/dev.txt

This looks only at the requirements file(s) we tell it to look at, and not at what's currently installed in the virtual environment. So one unexpected benefit is that pip-compile is faster and simpler than installing everything and then running pip freeze.

The output is a new requirements file at requirements/dev.txt.

pip-compile nicely puts a comment at the top of the output file to tell developers exactly how the file was generated and how to make a newer version of it.

#
# This file is autogenerated by pip-compile
# To update, run:
#
#    pip-compile --output-file requirements/dev.txt requirements.in/dev.txt
#
-e git+https://github.com/caktus/django-scribbler@v0.8.0#egg=django-scribbler
django-debug-toolbar==1.9.1
django==1.11.15
pytz==2018.5
sqlparse==0.2.4           # via django-debug-toolbar
  4. Be sure requirements, requirements.in, and their contents are in version control.

How to make the current virtual environment have the same packages and versions

To update your virtual environment to match your requirements file, ensure pip-tools is installed in the desired virtual environment, then:

$ <venv>/bin/pip-sync requirements/dev.txt

And that's all. There's no need to create a new empty virtual environment to make sure only the listed requirements end up installed. If everything is already as we want it, no packages need to be installed at all. Otherwise only the necessary changes are made. And if there's anything installed that's no longer mentioned in our requirements, it gets removed.

Except ...

pip-sync doesn't seem to know how to uninstall the packages that we installed using -e <URL>. I get errors like this:

Can't uninstall 'pkgname1'. No files were found to uninstall.
Can't uninstall 'pkgname2'. No files were found to uninstall.

I don't really know, then, whether pip-sync is keeping those packages up to date. Maybe before running pip-sync, I could just

rm -rf $VIRTUAL_ENV/src

to delete any packages that were installed with -e? But that's ugly and would be easy to forget, so I don't want to do that.

How to update versions

  1. Edit requirements.in/dev.txt if needed.
  2. Run pip-compile again, exactly as before:
$ <venv>/bin/pip-compile --output-file requirements/dev.txt requirements.in/dev.txt
  3. Update the requirements files in version control.

Hash checking

I'd like to use hash checking, but I can't yet. pip-compile can generate hashes for packages we will install from PyPI, but not for ones we install with -e <URL>. Also, pip-sync doesn't check hashes. pip install will check hashes, but if there are any hashes, then it will fail unless all packages have hashes. So if we have any -e <URL> packages, we have to turn off hash generation or we won't be able to pip install with the compiled requirements file. We could still use pip-sync with the requirements file, but since pip-sync doesn't check hashes, there's not much point in having them, even if we don't have any -e packages.
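
For reference, when hash generation is enabled (pip-compile --generate-hashes), each pinned package in the compiled output carries one or more hash lines, roughly like this (the digest below is a placeholder, not a real hash):

django==1.11.15 \
    --hash=sha256:<64-character-hex-digest>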

What about pipenv?

Pipenv promises to solve many of these same problems. Unfortunately, it imposes other constraints on my workflow that I don't want. It's also changing too fast at the moment to rely on in production.

Pipenv solves several of the requirements I listed above, but fails on these: It only supports two sets of requirements: base, and base plus dev, not arbitrary sets as I'd like. It can be very slow. And it's not (yet?) stable: the interface and behavior are changing constantly, sometimes multiple times in the same day.

It also introduces some new constraints on my workflow. Primarily, it wants to control where the virtual environment is in the filesystem. That both prevents me from putting my virtual environment where I'd like it to be, and prevents me from using different virtual environments with the same working tree.

Shortcomings

pip-tools still has some shortcomings, in addition to the problems with checking hashes I've already mentioned.

Most concerning are the errors from pip-sync when packages have previously been installed using -e <URL>. I feel this is an unresolved issue that needs to be fixed.

Also, I'd prefer not to have to use -e at all when installing from a URL.

This workflow is more complicated than the one we're used to, though I don't think it's any more complicated than it would be with pipenv.

The number and age of open issues in the pip-tools git repository worry me. True, it's orders of magnitude fewer than some projects, but it still suggests to me that pip-tools isn't as well maintained as I might like if I'm going to rely on it in production.

Conclusions

I don't feel that I can trust pip-tools when I need to install packages from Git URLs.

But many projects don't need to install packages from Git URLs, and for those, I think adding pip-tools to my workflow might be a win. I'm going to try it with some real projects and see how that goes for a while.

Josh JohnsonState And Events In CircuitPython: Part 3: State And Microcontrollers And Events (Oh My!)

In this part of the series, we'll apply what we've learned about state to our simple testing code from part one.

Not only will we debounce some buttons without blocking, we'll use state to more efficiently control some LEDs.

We'll also explore what happens when state changes, and how we can take advantage of that to do even more complex things with very little code, using the magic of event detection 🌈 .

All of this will be done in an object-oriented fashion, so we'll learn a lot about OOP as we go along.

Josh JohnsonState And Events In CircuitPython: Part 2: Exploring State And Debouncing The World

In this part of the series, we're going to really dig into what state actually is. We'll use analogies from real life, and then look at how we might model real-life state using Python data structures.

But first, we'll discuss a common problem that all budding electronics engineers have to deal with at some point: "noisy" buttons and how to make them "un-noisy", commonly referred to as "debouncing".

We'll talk about fixing the problem in the worst, but maybe easiest way: by blocking. We'll also talk about why it's bad.

Caktus GroupNational Day of Civic Hacking in Durham

Pictured: Simone Sequeira, Senior Product Manager of GetCalFresh, with event attendees at Caktus.

On August 11, I attended the National Day of Civic Hacking hosted by Code for Durham. More than 30 attendees came to the event, hosted in the Caktus Group Tech Space, to collaborate on civic projects that focus on the needs of Durham residents.

National Day of Civic Hacking is a nationwide day of action that brings together civic leaders, local government officials, and community organizers who volunteer their skills to help their local community. Simone Sequeira, Senior Product Manager of GetCalFresh, came from Oakland to participate and present at our Durham event. Simone inspired us with a presentation of GetCalFresh, a project supported by Code for America, that streamlines the application process for food assistance in California. It started as just an idea, and turned into a product used statewide that’s supported by over a half dozen employees. Many Code for Durham projects also start as ideas, and the National Day of Civic Hacking provided an opportunity to turn those ideas into realities.

Pictured: Laura Biedeger, a Team Captain at Code for Durham and a co-organizer of the event, speaks to attendees. I'm standing to the left.

Durham Projects

We worked on a variety of projects in Durham, including the following:

One group of designers, programmers, and residents audited the Code for Durham website. The group approached the topic from a user-centered design perspective: they identified and defined user personas and wrote common scenarios of visitors to the site. By the end of the event they had documented the needs of the site and designed mockups for the new site.

Regular volunteers with Code for Durham have been working with the Durham Innovation Team to create an automated texting platform for the Drivers License Restoration Initiative, which aims to support a regular amnesty of driver’s license suspensions in partnership with the Durham District Attorney’s Office. During our event volunteers added a Spanish language track to the platform.

The “Are We Represented?” project focused on voter education: showing how the makeup of County Commissioner boards across the state compare to the population in their county. During the event I worked with Jason Jones, the Analytics and Innovation Manager of Greensboro, to deploy the project to the internet (and we succeeded!).

Pictured: The Are We Represented group reviews State Board of Elections data files.

Another group partnered with End Hunger in Durham, which provides a regularly updated list of food pantries and food producers (gardeners, farmers, grocery stores, bakeries) that regularly donate surplus food. The volunteers reviewed an iOS app they had developed to easily find a pantry, and discussed the development of an Android app.

Join Us Next Time!

The National Day of Civic Hacking gave volunteers a chance to get inspired about new project opportunities, to meet new volunteers, city employees, and to focus on a project for an extended period of time. The projects will continue at Code for Durham’s regularly hosted Meetup at the Caktus Group Tech Space. Volunteers are always welcome, so join us at the next Meetup!

Josh JohnsonState And Events In CircuitPython: Part 1: Setup

This is the first article in a series that explores concepts of state in CircuitPython.

In this installment, we discuss the platform we're using (both CircuitPython and the Adafruit M0/M4 boards that support it), and build a simple circuit for demonstration purposes. We'll also talk a bit about abstraction.

This series is intended for people who are new to Python, programming, and/or microcontrollers, so there's an effort to explain things as thoroughly as possible. However, experience with basic Python would be helpful.
