Multiple Mixins (and naming conflicts) in Python
Jun 28, 2020 • #programming, #python
Mixins are quite useful. In case you're not familiar with the idea, it's a step in the direction of "composition over inheritance".
In Object Oriented Programming, suppose you're implementing a class and you want a bunch of functions to be available inside of that class. The idea behind mixins is to implement these functions in other independent classes and have your actual class inherit from these utility classes. Such utility classes are called "mixins".
In Python, for instance, this is possible due to multiple inheritance.
As an example, assume that we want to implement an Animal class, the objects of which should have the capability to both "bark" and "meow". Putting biological concerns aside, this can be implemented by defining the "barking" functionality inside a Barker class and the "meowing" functionality inside a Meower class. You can then have the Animal class inherit from both of them.
```python
class Barker:
    def bark(self):
        print('Woof!')


class Meower:
    def meow(self):
        print('Meow!')


class Animal(Barker, Meower):
    def greet(self):
        self.bark()
        self.meow()
```
This has the advantage that if we later want to add another class that needs the ability to "bark", we simply have to choose Barker as one of its base classes to have the bark() function available in the new class.
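As a quick illustration of that reuse, here is a minimal sketch. The Dog class is hypothetical (it is not part of the example above); it picks up bark() simply by listing Barker as a base class.

```python
class Barker:
    def bark(self):
        print('Woof!')


# Hypothetical new class: it only needs to "bark", so Barker is its only mixin.
class Dog(Barker):
    def greet(self):
        self.bark()


Dog().greet()  # prints: Woof!
```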
While mixins come in super handy when solving real-world problems, they often come with their own set of issues. And some of these issues come from how the programming language you're using implements multiple inheritance.
In the Python world, one such problem is: what happens if two different mixins define the same function?
Let's take a real-world-ish problem to illustrate the issue.
Consider that we're writing a web application serving HTTP requests. At the end of every web request we would like to perform some cleanup related to the database and the cache. Let's also assume that our web framework of choice provides an on_finish hook on individual requests to place such cleanup code.
When implementing something like that using mixins, this is roughly what it could look like.
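```python
class DatabaseCleaner:
    def on_finish(self):
        print('Cleaning the database')


class CacheCleaner:
    def on_finish(self):
        print('Cleaning the cache')


class Request(DatabaseCleaner, CacheCleaner):
    pass


# Simulate the framework calling the hook at the end of a request.
Request().on_finish()
```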
For this code snippet, what do you think the output would be?
Because of the way Python implements multiple inheritance, it ends up calling the function defined in the first base class that provides it. In our example, this happens to be DatabaseCleaner.on_finish, which is what ends up being called, so only "Cleaning the database" is printed. This has the unfortunate side-effect that CacheCleaner.on_finish never gets called.
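To see why the first base class wins, it can help to look at the method resolution order (MRO), which is the order in which Python searches classes when looking up an attribute. The sketch below repeats the two mixins from the example and inspects the MRO of Request.

```python
class DatabaseCleaner:
    def on_finish(self):
        print('Cleaning the database')


class CacheCleaner:
    def on_finish(self):
        print('Cleaning the cache')


class Request(DatabaseCleaner, CacheCleaner):
    pass


# Classes are searched from left to right in this order.
print([cls.__name__ for cls in Request.__mro__])
# ['Request', 'DatabaseCleaner', 'CacheCleaner', 'object']

# The first class in the MRO that defines on_finish is the one that gets used.
print(Request.on_finish is DatabaseCleaner.on_finish)  # True
```

Swapping the base classes, as in class Request(CacheCleaner, DatabaseCleaner), would flip the result: the cache cleanup would run and the database cleanup would be skipped.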
This is obviously not what we want. We want to print both "Cleaning the database" as well as "Cleaning the cache".
One solution for this problem is to name those functions differently, so that each mixin gets its own on_finish_* function and the names no longer clash.
While that could work (with a little bit of extra effort), it sounds a little
odd. What if those two classes are written by two completely different authors
who don't know about the existence of each other? Not to mention that this now
puts the responsibility on the calling code to remember to call those
on_finish_* functions, defeating the purpose of the mixin magic in the first place.
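A rough sketch of what that renaming approach could look like follows. The names on_finish_database and on_finish_cache are made up for illustration; the point is that Request now has to know about every mixin's hook and call each one itself.

```python
class DatabaseCleaner:
    def on_finish_database(self):  # hypothetical name
        print('Cleaning the database')


class CacheCleaner:
    def on_finish_cache(self):  # hypothetical name
        print('Cleaning the cache')


class Request(DatabaseCleaner, CacheCleaner):
    def on_finish(self):
        # The calling code must remember to call every mixin's hook.
        self.on_finish_database()
        self.on_finish_cache()


Request().on_finish()
```

Both messages are printed, but adding a third mixin means editing Request.on_finish again, which is exactly the bookkeeping mixins were supposed to remove.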
Luckily, Python provides a way to call the "next in line" function, which refers to the function with the same name in the "next" base class. Let's modify the on_finish() functions in both the mixins to something like the following.

```python
class DatabaseCleaner:
    def on_finish(self):
        print('Cleaning the database')

        try:
            next_on_finish = super().on_finish  # look it up, don't call it yet
        except AttributeError:
            pass  # nothing else in line defines on_finish
        else:
            next_on_finish()
```

Here, inside the try clause, we first try to find the function that's "next in line" by using super(). In case one was found, we simply call it from the else clause. In case no such function was found (which most likely means that this is the last base class), then super().on_finish would raise an AttributeError, in which case we can choose to do nothing and do a clean exit.
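Putting the pieces together, here is a sketch of both mixins from the earlier example using this pattern on on_finish. Note that the lookup is super().on_finish without parentheses: the attribute access is what may raise AttributeError, and the call happens only in the else clause.

```python
class DatabaseCleaner:
    def on_finish(self):
        print('Cleaning the database')

        try:
            next_on_finish = super().on_finish  # look it up, don't call it yet
        except AttributeError:
            pass  # nothing else in line defines on_finish
        else:
            next_on_finish()


class CacheCleaner:
    def on_finish(self):
        print('Cleaning the cache')

        try:
            next_on_finish = super().on_finish
        except AttributeError:
            pass
        else:
            next_on_finish()


class Request(DatabaseCleaner, CacheCleaner):
    pass


Request().on_finish()
# Cleaning the database
# Cleaning the cache
```

With this in place, both cleanups run regardless of the order in which the mixins are listed; only the order of the two messages changes.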
While this does add 6 extra lines to the mixin code, I feel it's a small price to pay. As library authors, it's nice to act as good citizens and keep in mind that our users might combine our code with other libraries, which may result in name clashes. I feel it's a good idea to make our code resistant to such things.
I came across this problem when writing tornado-sqlalchemy, which is a Python
package that provides SQLAlchemy integration for Tornado projects. This
integration is provided using a
DatabaseMixin which makes database
objects available in the request classes.
When using this library in a work project which also had Sentry integration enabled through sentry-python, I noticed that both tornado-sqlalchemy and sentry-python provided (at least at that time) mixins to perform cleanup at the end of the request. tornado-sqlalchemy did the database cleanup while sentry-python did some cleanup related to Sentry.
But since Tornado provides only a single cleanup function called on_finish, the same function was being overridden by both the libraries. As a result, if a user was using both the mixins in their application code, then depending on the class order they used when defining the (multiple) inheritance, one cleanup function would be skipped completely.
I've since patched the
DatabaseMixin provided by
tornado-sqlalchemy so that
this problem doesn't exist anymore. But it was still interesting to run into
this issue and find the solution.
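For a sense of how cooperative mixins behave in this kind of setup, here is a small sketch. It is not the actual tornado-sqlalchemy patch: RequestHandler below is a local stand-in rather than an import from Tornado, and ErrorReportingMixin, MyHandler and MyOtherHandler are hypothetical names. Because the stand-in base class defines on_finish itself, the chain of super() calls always ends there.

```python
class RequestHandler:
    """Local stand-in for a framework handler; not imported from Tornado."""

    def on_finish(self):
        # Default hook: does nothing.
        pass


class DatabaseMixin:
    def on_finish(self):
        print('Cleaning up the database session')
        super().on_finish()  # hand over to the next class in line


class ErrorReportingMixin:  # hypothetical second mixin
    def on_finish(self):
        print('Cleaning up error reporting state')
        super().on_finish()


class MyHandler(DatabaseMixin, ErrorReportingMixin, RequestHandler):
    pass


class MyOtherHandler(ErrorReportingMixin, DatabaseMixin, RequestHandler):
    pass


MyHandler().on_finish()       # both cleanups run, database first
MyOtherHandler().on_finish()  # both cleanups run, error reporting first
```

Calling super().on_finish() directly assumes that a class further down the line always defines on_finish. A mixin that might be used without such a base class is safer with the try/except lookup shown earlier.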