Mixins are quite useful. In case you're not familiar with the idea, they're a step in the direction of "composition over inheritance".
In object-oriented programming, suppose you're implementing a class and you want a bunch of functions to be available inside that class. The idea behind mixins is to implement these functions in other, independent classes and have your actual class inherit from these utility classes. Such utility classes are called "mixins".
In Python, for instance, this is possible due to multiple inheritance.
As an example, assume that we want to implement an Animal class, the objects of which should have the capability to both "bark" and "meow". Putting biological concerns aside, this can be implemented by defining the "barking" functionality inside a Barker class and the "meowing" functionality inside a Meower class. You can then inherit the Animal class from both Barker and Meower.

    class Barker:
        def bark(self):
            print('Woof!')

    class Meower:
        def meow(self):
            print('Meow!')

    class Animal(Barker, Meower):
        def greet(self):
            self.bark()
            self.meow()

This has the advantage that if later on we want to add another Animal-ish class which wants the ability to "bark", we simply have to choose Barker as one of the base classes to have the bark() function available in the new class.
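As a minimal sketch of that reuse, the hypothetical Dog class below (it is not part of the original example) picks up bark() simply by listing Barker as a base class. The method returns a string instead of printing, purely to make the result easy to check.

```python
class Barker:
    def bark(self):
        return 'Woof!'


class Dog(Barker):
    """Hypothetical Animal-ish class that only needs to bark."""

    def greet(self):
        # bark() comes from the Barker mixin; Dog defines no bark() of its own.
        return self.bark()


dog = Dog()
print(dog.greet())
```

Since Dog does not inherit from Meower, it gets no meow() function, which is the point of picking mixins individually.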
While mixins come in super handy when solving real-world problems, they often come with their own set of issues. And some of these issues come from how the programming language you're using implements multiple inheritance.
In the Python world, one such problem is: what happens if two different mixins define the same function?
Let's take a real-world-ish problem to illustrate the issue.
Consider that we're writing a web application serving HTTP requests. At the end of every web request we would like to perform some cleanup related to the database and the cache. Let's also assume that our web framework of choice provides an on_finish hook on individual requests to place such cleanup code.
When implementing something like that using mixins, this is roughly what it could look like.

    class DatabaseCleaner:
        def on_finish(self):
            print('Cleaning the database')

    class CacheCleaner:
        def on_finish(self):
            print('Cleaning the cache')

    class Request(DatabaseCleaner, CacheCleaner):
        pass

    # Simulate the framework invoking the hook at the end of a request.
    Request().on_finish()

For this code snippet, what do you think the output would be?
Because of the way Python implements multiple inheritance, the lookup follows the class's method resolution order (MRO) and stops at the first base class that defines the function. In our example, this happens to be DatabaseCleaner.on_finish, which is what ends up being called. This has the unfortunate side effect that CacheCleaner.on_finish never gets called.
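One way to see why is to inspect the method resolution order directly. The sketch below repeats the classes from the example so that it runs on its own; the mro_names variable is only there for illustration.

```python
class DatabaseCleaner:
    def on_finish(self):
        print('Cleaning the database')


class CacheCleaner:
    def on_finish(self):
        print('Cleaning the cache')


class Request(DatabaseCleaner, CacheCleaner):
    pass


# Names of the classes Python searches, in order, when looking up an attribute.
mro_names = [cls.__name__ for cls in Request.__mro__]
print(mro_names)

# The lookup stops at the first class in the MRO that defines on_finish.
print(Request.on_finish is DatabaseCleaner.on_finish)
```

DatabaseCleaner appears before CacheCleaner in the MRO because it is listed first in the class definition, so its on_finish shadows the other one.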
This is obviously not what we want. We want to print both "Cleaning the database" as well as "Cleaning the cache".
One solution for this problem is to name those on_finish functions differently, so that DatabaseCleaner implements on_finish_database and CacheCleaner implements on_finish_cache.
While that could work (with a little bit of extra effort), it sounds a little odd. What if those two classes are written by two completely different authors who don't know about the existence of each other? Not to mention that this now puts the responsibility on the calling code to remember to call those on_finish_* functions, defeating the purpose of the mixins in the first place.
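To make that cost concrete, here is a rough sketch of the renaming approach. The functions return strings rather than printing so the result is easy to check, and the on_finish body in Request is an assumption about how the calling code would have to wire things up.

```python
class DatabaseCleaner:
    def on_finish_database(self):
        return 'Cleaning the database'


class CacheCleaner:
    def on_finish_cache(self):
        return 'Cleaning the cache'


class Request(DatabaseCleaner, CacheCleaner):
    def on_finish(self):
        # The calling code now has to know about, and call, every mixin's hook.
        return [self.on_finish_database(), self.on_finish_cache()]


print(Request().on_finish())
```

Every new mixin would need another line in Request.on_finish, so forgetting one silently skips its cleanup.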
Luckily, Python provides a way to call the "next in line" function, which refers to the function with the same name in the "next" base class. Let's modify the on_finish functions in both the mixins to something like the following (shown here for DatabaseCleaner; CacheCleaner gets the same treatment).

    class DatabaseCleaner:
        def on_finish(self):
            print('Cleaning the database')
            try:
                # Look up the next on_finish without calling it yet.
                next_on_finish = super().on_finish
            except AttributeError:
                pass
            else:
                next_on_finish()

Here, inside the try/except clause, we first try to find the function that's "next in line" by using super(). In case one was found, we simply call it. In case no such function was found (which most likely means that this is the last base class), then looking up super().on_finish would raise an AttributeError, in which case we can choose to do nothing and do a clean exit.
While this does add 6 extra lines to the mixin code, I feel it's a small price to pay. As library authors, it's nice to act as good citizens and keep in mind that our users may also be using other libraries, which can result in name clashes. I feel it's a good idea to make our code resistant to such things.
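For completeness, here is a sketch of the cooperative pattern applied to both mixins and the on_finish hook from the earlier example. The calls list is an addition for illustration only; it records which cleanup steps ran and in what order.

```python
calls = []  # records which cleanup steps ran (added for illustration)


class DatabaseCleaner:
    def on_finish(self):
        calls.append('database')
        try:
            # Look up the next on_finish in the MRO without calling it yet.
            next_on_finish = super().on_finish
        except AttributeError:
            pass  # no further on_finish in the MRO
        else:
            next_on_finish()


class CacheCleaner:
    def on_finish(self):
        calls.append('cache')
        try:
            next_on_finish = super().on_finish
        except AttributeError:
            pass  # no further on_finish in the MRO
        else:
            next_on_finish()


class Request(DatabaseCleaner, CacheCleaner):
    pass


Request().on_finish()
print(calls)
```

Both cleanup steps now run, in the order the mixins are listed in the class definition. Keeping only the attribute lookup inside the try block means an AttributeError raised by the next function's own body is not swallowed by accident.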
I came across this problem when writing tornado-sqlalchemy, which is a Python package that provides SQLAlchemy integration for Tornado projects. This integration is provided using a DatabaseMixin which makes database Session objects available in the request classes.
When using this library in a work project which also had Sentry integration enabled through sentry-python, I noticed that both tornado-sqlalchemy and sentry-python provided (at least at that time) mixins to perform cleanup at the end of the request. tornado-sqlalchemy did the database cleanup while sentry-python did some cleanup related to Sentry.
But since Tornado provides only a single cleanup function called on_finish, the same function was being overridden by both the libraries. As a result, if a user was using both the mixins in their application code, then depending on the order of the base classes in the (multiple) inheritance, one cleanup function would be skipped completely.
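The order dependence can be reproduced without Tornado or either library. In the sketch below, BaseHandler, DbCleanupMixin and ErrorReportingMixin are hypothetical stand-ins (they are not the real classes from Tornado, tornado-sqlalchemy or sentry-python), and the hooks return a list of the steps that ran so the outcome is easy to inspect.

```python
class BaseHandler:
    """Stand-in for a framework request handler with a no-op on_finish hook."""

    def on_finish(self):
        return []


class DbCleanupMixin:
    # Hypothetical mixin that overrides the hook without calling super().
    def on_finish(self):
        return ['database cleanup']


class ErrorReportingMixin:
    # Hypothetical mixin that does the same for its own cleanup.
    def on_finish(self):
        return ['error-reporting cleanup']


class HandlerA(DbCleanupMixin, ErrorReportingMixin, BaseHandler):
    pass


class HandlerB(ErrorReportingMixin, DbCleanupMixin, BaseHandler):
    pass


print(HandlerA().on_finish())
print(HandlerB().on_finish())
```

Whichever mixin is listed first wins, and the other mixin's cleanup is silently dropped.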
I've since patched the DatabaseMixin provided by tornado-sqlalchemy so that this problem doesn't exist anymore. But it was still interesting to run into this issue and find the solution.