Whether you’re a Data Scientist, a Web Developer working on an API, or in any other of a long list of roles, chances are you’ll stumble upon Python at some point. When you do, expect to run into List Comprehensions.
Some of us love Python for its simplicity, its fluidity and legibility. Others hate it for not being as performant as C or pure Assembly, having Duck Typing, or being single-threaded (ish).
No matter what group you belong to, if you’re in a position where you want/have to write Python code, you’ll want it to be as clean and readable as possible. Or you may have stumbled upon a List Comprehension in the wild and be confused as to how to tame it. If either of those is true, then this article is for you.
What are List Comprehensions?
First of all, let’s define our terms. A list comprehension is a piece of syntactic sugar that replaces the following pattern:
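Here is a minimal sketch of that pattern. The names `transform` and `k` are placeholders I made up for illustration; any function and any iterable would do.

```python
def transform(x):
    """Hypothetical per-element work; stands in for any function."""
    return x * 2

k = 5  # arbitrary size, chosen only for the example

# The classic pattern: create an empty list, loop, append.
new_list = []
for i in range(k):
    new_list.append(transform(i))

print(new_list)  # [0, 2, 4, 6, 8]
```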
With this equivalent one:
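As a sketch, using the same kind of placeholder names (`transform` and `k` are hypothetical, not part of any library), the comprehension form could look like this:

```python
def transform(x):
    """Hypothetical per-element work; stands in for any function."""
    return x * 2

k = 5  # arbitrary size, chosen only for the example

# One expression: "transform(i) for every i in range(k)", collected in a list.
new_list = [transform(i) for i in range(k)]

print(new_list)  # [0, 2, 4, 6, 8]
```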
That looks… Nice, right?
It may look kinda quirky the first time you see it, I know, but trust me. It’s an acquired taste.
Why should we use Python List Comprehensions?
What are the advantages of using List Comprehensions?
First of all, you’re reducing 3 lines of code into one, which will be instantly recognizable to anyone who understands list comprehensions.
Secondly, the comprehension is usually faster. CPython builds the list with a dedicated bytecode instruction, instead of looking up and calling ‘append’ on every iteration. Those calls may be cheap, but they add up.
Lastly, code using comprehensions is considered more ‘Pythonic’ — better fitting Python’s style guidelines.
Refactoring Code Smells
Another, more subtle advantage is smell detection. Your code without comprehensions may look like this:
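A sketch of what I mean, assuming for illustration that the loop squares even numbers and cubes odd ones (the branch logic is made up; only the shape matters):

```python
def some_function(k):
    # ... imagine many unrelated lines before this point ...
    another_list = []
    for i in range(k):
        if i % 2 == 0:
            another_list.append(i ** 2)  # even numbers get squared
        else:
            another_list.append(i ** 3)  # odd numbers get cubed
    # ... and many more unrelated lines after it ...
    return another_list

print(some_function(4))  # [0, 1, 4, 27]
```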
If the preceding or following code is long enough in ‘some_function’, that bit about the list may get lost. But using list comprehensions directly on those 6 lines wouldn’t look that pretty:
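For instance, assuming the loop squares even numbers and cubes odd ones (a made-up rule, just for the example), the direct translation might be:

```python
k = 4  # arbitrary size, for illustration only

# Branching and iteration crammed into a single expression.
another_list = [i ** 2 if i % 2 == 0 else i ** 3 for i in range(k)]

print(another_list)  # [0, 1, 4, 27]
```

It works, and real code rarely has branches this short, so the line only gets worse from here.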
Trying to parse that with your eyes will give you a headache. There are some paper bags below your seats in case any of you need to use them. What’s happening here? Well, it’s clear that bit of logic should be abstracted into a new function, like this:
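A possible version of that function, again assuming the made-up rule of squaring even numbers and cubing odd ones:

```python
def new_function(i):
    """Hypothetical helper: square even numbers, cube odd ones."""
    if i % 2 == 0:
        return i ** 2
    return i ** 3

print(new_function(2), new_function(3))  # 4 27
```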
Then those first six lines of code end up just being
another_list = [new_function(i) for i in range(k)]
Which is a lot clearer (if I hadn’t picked such awful names for our functions) and reads faster if you know what’s going on.
Some may argue I did end up adding 6 lines of overhead code in order to get to this place.
That’s true, but if this behavior appeared at least once more in the code, then even that’s not really a loss. And even if that were not the case, what we lose in code size, we gain in maintainability and readability, which should be sought after.
Good programmers write code that humans can understand.
— Martin Fowler.
Some other things that are easy to do with list comprehensions are:
Unwrapping a matrix into a vector:
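A small sketch with a made-up 3x3 matrix stored as a list of rows. Note that the `for` clauses read left to right in the same order as the equivalent nested loops:

```python
matrix = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]

# Outer loop first (rows), then inner loop (values in each row).
vector = [value for row in matrix for value in row]

print(vector)  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```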
Filtering a list:
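For example, keeping only the positive numbers of an arbitrary sample list (the data is invented for the example):

```python
numbers = [3, -1, 4, -1, 5, -9, 2, 6]

# The trailing "if" decides which elements make it into the new list.
positives = [n for n in numbers if n > 0]

print(positives)  # [3, 4, 5, 2, 6]
```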
Generating many instances of a class (in this case modeled with a simple dictionary, like a JSON object):
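A sketch with hypothetical fields (`id`, `name` and `active` are names I picked for illustration):

```python
# Each iteration evaluates the dict literal again, so every element
# is a separate dictionary rather than a shared reference.
users = [{"id": i, "name": f"user_{i}", "active": True} for i in range(3)]

print(users[1])  # {'id': 1, 'name': 'user_1', 'active': True}
```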
Casting a list of objects of a certain type into a list of another type:
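For instance, turning a list of numeric strings into integers (sample data invented for the example):

```python
raw_values = ["1", "2", "3"]

# Apply the target type's constructor to every element.
as_ints = [int(value) for value in raw_values]

print(as_ints)  # [1, 2, 3]
```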
Python List Comprehensions Benchmarks
In order to verify there is an actual boost in performance, I decided to run some tests. I ran the for-loop version and the list comprehension version of the same code, with and without filtering. Here’s the test’s snippet:
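A minimal sketch of such a benchmark, using the standard `timeit` module. The input size of one million and the function bodies are my assumptions, named after the labels in the results; timings on your machine will not match the figures reported here.

```python
import timeit

N = 1_000_000  # assumed input size; change it to see how the gap scales


def list_a(n=N):
    """For-loop version: append every element."""
    result = []
    for i in range(n):
        result.append(i)
    return result


def list_b(n=N):
    """Comprehension version: same list, no explicit append."""
    return [i for i in range(n)]


def filtered_list_a(n=N):
    """For-loop version that keeps only even numbers."""
    result = []
    for i in range(n):
        if i % 2 == 0:
            result.append(i)
    return result


def filtered_list_b(n=N):
    """Comprehension version that keeps only even numbers."""
    return [i for i in range(n) if i % 2 == 0]


if __name__ == "__main__":
    for fn in (list_a, list_b, filtered_list_a, filtered_list_b):
        # timeit accepts a zero-argument callable and returns total seconds.
        elapsed = timeit.timeit(fn, number=10)
        print(f"{fn.__name__}: {elapsed:.2f} s for 10 runs")
```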
The list_a functions generate lists the ugly way, with a for-loop and appending. The list_b functions use List Comprehensions.
As you can see, one filters half of the elements before adding them to the list, whereas the other one just adds them all.
My results, after running the script ten times and averaging the resulting time measures, were the following:
- 5.84 seconds for list a
- 4.07 seconds for list b
- 4.85 seconds for filtered list a
- 4.13 seconds for filtered list b
I encourage you to run that same script on your computer and see the boost for yourself, maybe even change the input size.
Switching to List Comprehensions cuts the running time by about 30% in the unfiltered case (5.84 s down to 4.07 s), whereas the filtered algorithm only improves by about 15% (4.85 s down to 4.13 s).
This is consistent with our theory that the main performance advantage comes from not having to call the append method at each iteration: in the filtered case that call is already skipped on every other iteration, so there is less to save.
Finally, I should add that everything I just showed you about list comprehensions also works for Python dictionaries, through dictionary comprehensions.
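A short sketch of the dictionary version, with sample data invented for the example. The only syntactic differences are the curly braces and the `key: value` pair:

```python
words = ["apple", "banana", "cherry"]

# Map each word to its length.
lengths = {word: len(word) for word in words}

print(lengths)  # {'apple': 5, 'banana': 6, 'cherry': 6}
```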
Notice that every assignment to a used key after the first will overwrite the previous value.
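To see the overwriting in action, here is a sketch with a made-up list of pairs in which the key "a" appears twice:

```python
pairs = [("a", 1), ("b", 2), ("a", 3)]

# The second ("a", ...) pair replaces the value stored by the first one.
lookup = {key: value for key, value in pairs}

print(lookup)  # {'a': 3, 'b': 2}
```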
This also provides a fast way to generate a set from a list, though in most cases we would just do set(my_list).
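One way to read this, assuming the same brace syntax without the `key: value` pair (a set comprehension), sketched with invented sample data:

```python
my_list = [1, 2, 2, 3, 3, 3]

# Curly braces with a single expression build a set, dropping duplicates.
unique = {x for x in my_list}

# The comprehension earns its keep when you transform elements on the way in.
squares = {x * x for x in my_list}

print(unique == set(my_list))  # True
```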
We showed Python’s List Comprehensions are noticeably faster than initializing a list by appending elements to it (roughly 15% to 30% less running time in my tests), and also make for cleaner and more Pythonic code.
We also saw there is an equivalent expression for dictionaries.
That was my crash course in list comprehensions. I hope you liked it!