This post covers one of my favorite fundamental design choices in Ruby: its design favors forward flow of execution in complex statements, and allows programmers to use it in a consistent fashion via blocks. I will contrast this with Python, where the situation is... mixed. Although it is certainly possible to write fluent interfaces in Python, the language and libraries tends to fight you.
One of my biggest gripes with Python is that APIs are often designed in such a way that forces you to scan forwards and backwards repeatedly when chaining function calls. For example, consider this directory structure:
test/ ├── a/ └── b/
To get from A
(our cwd
) to its sibling directory B
using Python's os.path
without needing lots of intermediate
variables, you need to do something like this:
import os.path b = os.path.join(os.path.dirname(os.getcwd()), 'b') # '/home/jtk/test/b'
In English, that's
"join the directory that the cwd
is in with b
".
Notice how the starting point of the calculation, the cwd
,
comes in the middle of the sentence.
If you started there, you would need to read out to the left
until you hit the join, then jump back to the end of the
sentence to figure out what you're joining to.
(The terminology is also somewhat noodly, at least to me -
"the directory that the cwd
is in" means "the parent of the cwd
",
but I associate "the directory that something is in" with files
more than I do directories.)
If you use Python's pathlib
module instead, you get
from pathlib import Path b = Path.cwd().parent / 'b' # PosixPath('/home/jtk/test/b')
This is this significantly more compact
(no repetition of os.path
; most people don't from os.path import
the individual functions)
and expressive
(parent
instead of dirname
, using /
to join path components).
Even better, it reads in an obvious manner from left to right:
"start in the cwd
, go to its parent, then go down into b
".
This matches our intuition about how you can traverse the filesystem locally, and reads sensibly from left to right without needing to jump around the structure of the call.
In Python, you can only really achieve this kind of "forward flow"
when working with an interface that is explicitly
designed to be fluent, like pathlib
.
Because the Path
methods generally return a new path, you can chain
them from left to right without having to nest
the nouns deep inside a function call.
Compare something like
verb(noun, verb(noun, noun))
to something like
noun.verb(noun).verb(noun)
I vastly prefer the latter when at all practical. Control flows smoothly from left to right without needing to jump aorund.
This is particularly problematic when you want to use Python's map
,
or even its comprehensions:
doubled = map(lambda x: 2 * x, range(10)) doubled = [2 * x for x in range(10)]
In both cases, the verb comes before the noun. The list comprehension at
least reads like an English sentence,
but it can get very messy if you want to perform multiple operations on
each element, filter or replace elements, or
use a more complex expression than is reasonable for a lambda
.
Well-designed fluent APIs like pathlib
certainly exist, but they are
the exception, not the rule.
There is no driving principle welling up from the design of the language
itself to help programmers write fluent interfaces.
Instead, Python's own syntax tends to fight you.
Ruby flips this around, emphasizing forward flow by making
"control flow methods" like map
into methods on
generic "Enumerable" (in Python, we would say "iterable") objects.
In Ruby, the same maps as above look like one of these two equivalent forms:
doubled = (0...10).map { |x| 2 * x } doubled = (0...10).map do |x| 2 * x end
In both cases, we are passing a "block" (the bit inside { }
or do end
)
to the map
method of a Range
object.
In Python, one could imagine doing
doubled = range(10).map(lambda x: 2 * x)
... except that range
has no such method.
The unifying feature in Ruby is that Enumerable
is itself a class with
many methods,
while Python's iterators are not equipped with any useful methods beyond
what is absolutely necessary for them to work.
Even the basic for
loop can be performed with Enumerable.each
in Ruby.
Because any code can go in a block, they can act like super-powered Python
lambda
functions, allowing you to express multi-step operations while
maintaining forward flow.
Note that the Ruby version of the map
is in noun-verb order,
while the original Python examples are in verb-noun order.
Noun-verb feels more natural to me:
I want to know what we're operating on before I know what the operation is.
Of course, that thing might be abstract
(an interface, or a duck-typed object, etc.),
but I want to understand the context of the action we're about to take.
One must be very careful when deciding whether an interface should be fluent. A good way to decide is to look at possible side effects that might happen in the fluent call chain: if there are any, you probably shouldn't provide a fluent interface.
Fluent interfaces are most effective when they describe a chain of transformations,
returning a new object from each method in the chain.
The examples above are all, well, examples of this.
Guido van Rossum sums it up in this Python-Dev
post from 2003:
I'd like to explain once more why I'm so adamant that
sort()
shouldn't returnself
.This comes from a coding style (popular in various other languages, I believe especially Lisp revels in it) where a series of side effects on a single object can be chained like this:
pythonx.compress().chop(y).sort(z)
which would be the same as
pythonx.compress() x.chop(y) x.sort(z)
I find the chaining form a threat to readability; it requires that the reader must be intimately familiar with each of the methods. The second form makes it clear that each of these calls acts on the same object, and so even if you don't know the class and its methods very well, you can understand that the second and third call are applied to
x
(and that all calls are made for their side-effects), and not to something else.I'd like to reserve chaining for operations that return new values, like string processing operations:
pythony = x.rstrip("\n").split(":").lower()
There are a few standard library modules that encourage chaining of side-effect calls (
pstat
comes to mind). There shouldn't be any new ones;pstat
slipped through my filter when it was weak.
Of course, Python has good support for keyword arguments with optional default values, which largely eliminates the need for the Builder pattern that is popular in some languages. There will always be exceptions where a fluent interface might make sense even with underlying mutation.
The discussion in this post so far has been rather philosophical. What do I actually try to do while writing code on a day-to-day basis?
verb(noun, noun, ...)
, noun.verb(noun, noun).verb(noun)
, or verb(verb(noun))
,
where the control flow moves exclusively from either left-to-right or inside-to-outside,
without needing to jump around.
This improves the readability of the main code path.
(at the expense of shuffling complexity into many small functions/methods,
which I consider an acceptable tradeoff).noun.property
instead of compute(noun)
)
and makes it easier to chain operations.
(This is usually good for encapsulation anyway.)self
.
Mutating method calls should be isolated to their own line of code,
and preferably have a very explicit name.
If you simply must mutate in a chained method call,
do it at the end of the chain.These aren't rules, and I don't follow them religiously. But I'd like to think I have them in the back of my mind, especially when refactoring existing code.
This Website Uses React | 2024‑06‑16 |
Adding Support for Inline DOT to Blahs | 2021‑11‑07 |
Adventures in GitHub Project Automation | 2020‑09‑05 |
Fluent Interfaces in Python and Ruby | 2020‑05‑25 |
My Favorite Software Materials | 2020‑03‑07 |