Can Python Overload Methods?
The question in the title comes up quite often in beginner forums; a web search turns up plenty of hits on Stack Overflow and elsewhere. Not surprisingly, it usually comes from people who have worked in languages where method/function overloading is a big part of the language design.
According to Wikipedia,
…function overloading or method overloading is the ability to create multiple methods of the same name with different implementations. Calls to an overloaded function will run a specific implementation of that function appropriate to the context of the call, allowing one function call to perform different tasks depending on context.
Sometimes this question ends up dismissed as stupid in the Python context, “because you don’t need that in Python”. I’d like to take some time to talk about that, because the ideas behind it aren’t worthless.
Arguments of Differing Types
As a simplistic example, consider a class which should work whether the argument to its constructor is an integer or a double-precision floating-point value. In C++ this might look like:
class Foo {
public:
    Foo(int x);
    Foo(double x);
    ...
};

Foo::Foo(int x)
{
    ...
}

Foo::Foo(double x)
{
    ...
}

Foo *a = new Foo(12);
Foo *b = new Foo(18.0);
So there are two different constructor functions, both named after the class, as that is the C++ syntax. C++ needs the type information up front: the compiler picks which overload to call from the declared parameter types, and reports an error if the argument cannot be matched (or converted) to any of them. Overloading is often discussed in terms of constructors, though it is not limited to those methods.
Python does not let multiple functions/methods share the same name within the same scope, any more than multiple variables can. People are sometimes a little surprised at this, but a Python function definition is an executable statement in which two things happen: a function object is created from the body of the definition, and a reference to that object is then assigned to a variable with the name of the function. In other words, a def statement is a variable assignment. So assigning two different functions to the same name just means the second one replaces the first. It’s "override" rather than "overload".
In the case of Python, the method we are interested in is __init__. It is not exactly a constructor, but as the method called automatically after construction, it is where the behavior that people associate with constructors (“set things up”) happens.
We can show the name-replacement sequence with a bit of interactive Python use:
>>> a = 2
>>> a = 4
>>> print(a)
4
>>> def a():
...     print("Hello")
...
>>> print(a)
<function a at 0x7f7fa29b4c80>
>>> a()
Hello
>>> def a(x):
...     print("Hi, argument was", x)
...
>>> print(a)
<function a at 0x7f7fa29b4cf8>  # <1>
>>> a()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: a() missing 1 required positional argument: 'x'
>>> a(8)
Hi, argument was 8
>>>
<1> Note the different address from the previous function object.
So while you cannot define multiple functions/methods of the same name, Python is dynamically typed (names and parameters carry no declared type), so that is not necessarily a problem. Consider the following snippet:
class Foo:
    def __init__(self, x):
        self.x = x

a = Foo(12)
b = Foo(18.0)
c = Foo("text")
That is, we’ve instantiated with three different types of arguments, but the initializer doesn’t look any different. Of course, you have to make sure the method you write in Python can actually handle arguments of different types. Python will let you check the type of an object, so you could go ahead and carefully check your arguments and make sure:
>>> x = 12
>>> isinstance(x, int)
True
>>> x = 18.0
>>> isinstance(x, int)
False
>>> isinstance(x, float)
True
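As a minimal sketch of what that careful checking might look like inside an initializer (this variant of Foo, its error message, and the decision to accept only int and float are assumptions made for illustration, not part of the original example):

```python
class Foo:
    """Hypothetical variant of Foo that checks its argument type up front."""

    def __init__(self, x):
        # bool is a subclass of int, so it is rejected explicitly here.
        if isinstance(x, bool) or not isinstance(x, (int, float)):
            raise TypeError(f"expected int or float, got {type(x).__name__}")
        self.x = x


a = Foo(12)
b = Foo(18.0)

try:
    Foo("text")
except TypeError as err:
    message = str(err)
    print("rejected:", message)
```

Note that isinstance accepts a tuple of types, so a single test can cover several acceptable types.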
But there’s a concept often cited in Python circles that it’s "Easier to Ask Forgiveness than Permission" (or EAFP), which suggests that you just go ahead and try something, and clean up the mess afterwards if it didn’t work. That’s antithetical to some software development theory which says to check everything first ("Look Before You Leap", or LBYL), but as long as you can properly deal with the fallout from failures there’s not any real problem. That idea isn’t unique to Python, though it may be expressed in different ways when talking about different languages. Exceptions are an example of this way of thinking, and they exist in many languages.
This isn’t just a case of dogma. Consider that many LBYL sequences have proven to have timing holes that lead to security issues. For example, if you write code to validate a filename path, then open it after the check completes, the time difference between the two, however minimal, is a window in which someone external to the program can link the pathname elsewhere and cause the program to open something it didn’t intend to. This class of security issues is called TOCTOU (Time Of Check, Time Of Use).
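To make the contrast concrete, here is a small sketch of the two styles applied to reading a file. The function names read_lbyl and read_eafp and the temporary file names are made up for this illustration. The EAFP version does not by itself solve every path-related security problem, but it does remove the gap between a separate check and the open.

```python
import os
import tempfile


def read_lbyl(path):
    """Look before you leap: check first, then open."""
    if os.path.isfile(path):
        # The file could be removed or replaced between the check and the open.
        with open(path) as f:
            return f.read()
    return None


def read_eafp(path):
    """Ask forgiveness: just try the open and handle the failure."""
    try:
        with open(path) as f:
            return f.read()
    except OSError:
        return None


with tempfile.TemporaryDirectory() as tmp:
    present = os.path.join(tmp, "present.txt")
    missing = os.path.join(tmp, "missing.txt")
    with open(present, "w") as f:
        f.write("hello")

    results = (read_lbyl(present), read_eafp(present),
               read_lbyl(missing), read_eafp(missing))

print(results)
```

Both functions give the same answers in this simple run; the difference is only in what can go wrong between the check and the use.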
And it turns out Python can often handle differing types without any problem - it’s not an error, for example, to perform arithmetic operations between an integer and a float. That gets into another concept often discussed around Python, "duck typing". That is a term which is intentionally a little silly:
In other words, don’t check whether it IS-a duck: check whether it QUACKS-like-a duck, WALKS-like-a duck, etc, etc, depending on exactly what subset of duck-like behavior you need to play your language-games with.
— "Alex Martelli"
In this simple case, if your objective is to perform arithmetic on your arguments, then integers and floats do both have methods like add, subtract, multiply, etc. So there is not a compelling reason to treat them in different ways unless you actually run into a case where behavior is incompatible with your expectations.
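A short sketch of that idea, using a hypothetical function named double that only relies on its argument supporting the + operator:

```python
from decimal import Decimal
from fractions import Fraction


def double(value):
    """Return value + value; works for anything that supports '+' with itself."""
    return value + value


results = [
    double(12),
    double(18.0),
    double(Fraction(1, 3)),
    double(Decimal("1.5")),
    double("ab"),  # strings "quack" too, which may or may not be what you want
]
print(results)
```

The last call is the kind of case where behavior might be incompatible with your expectations: a string also supports +, so it is accepted and simply concatenated.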
Differing Numbers of Arguments
Another case for overloading in statically typed languages is when a method may need to take different numbers of arguments. This can come up in a few different ways; to list a couple of examples:
- You want to offer different ways to instantiate a class, as in a hypothetical employee database where a new employee can be added by a (Firstname, Lastname, Salary) triple, or by a string encoding all three as "Firstname-Lastname-Salary".
- API evolution: say you’ve implemented a class, and later find you need to extend your API in a way that involves passing an additional parameter. If you just change the constructor, all the code instantiating that class must change too. With overloading, you add a new constructor and keep the old one, adjusting it to supply a sensible default for the added argument; old and new code are then both supported.
API Evolves, Arguments Added
Of the two examples, the "we added an argument but don’t want to break backwards compatibility" case seems fairly easy to handle in Python. A combination of keyword arguments and/or default arguments normally does the trick. So we can go from:
class Foo:
    def __init__(self, x):
        self.x = x

a = Foo(12)
to:
class Foo:
    def __init__(self, x, y=None):
        self.x = x
        self.y = y  # <1>

a = Foo(12)  # <2>
b = Foo(12, 18.0)  # <3>
<1> Even if y was not passed, this is okay since it is set to default to something (None in this case).
<2> Old way, one argument, still works.
<3> New way, two arguments.
Differing Class Instantiations
The other example case has some more nuances. It suggests we’re intending, up front, to allow the class to be instantiated in quite different ways (although this change could of course also happen as part of an evolution).
One way to approach this case is to use Python’s keyword argument passing. Rather than trying to put this in words, here’s an example:
class Employee:
    num_of_emps = 0

    def __init__(self, **kwargs):
        if "emp_str" in kwargs:
            first, last, pay = kwargs["emp_str"].split('-')
        elif "first" in kwargs and "last" in kwargs and "pay" in kwargs:
            first = kwargs["first"]
            last = kwargs["last"]
            pay = kwargs["pay"]
        else:
            print("invalid initializer:", kwargs)
            return
        self.first = first
        self.last = last
        self.pay = pay
        Employee.num_of_emps += 1

    def __str__(self):
        return "Name: {} {}, Pay: {}".format(self.first, self.last, self.pay)

emp_1 = Employee(first="John", last="Public", pay=50000)  # <1>
emp_2 = Employee(emp_str="Test-Employee-60000")  # <2>
print(emp_1)
print(emp_2)
print("Employees:", Employee.num_of_emps)
<1> Pass a tuple of values.
<2> Pass a string encoding all the values.
We have managed to instantiate an Employee two ways: by passing a tuple of values, or by passing an encoded string. In the initializer, we try to work out which way we were called by digging around in the dictionary that is given to us as kwargs, then fishing the actual values out of that dict, and saving them in instance variables. So this is certainly a form of "overloading", though it feels kind of clunky.
We might as well try using default values instead:
class Employee:
    num_of_emps = 0

    def __init__(self, pay=None, last=None, first=None, emp_str=None):
        if emp_str:
            first, last, pay = emp_str.split('-')
        elif not (first and last and pay):
            print("invalid initializer")
            return
        self.first = first
        self.last = last
        self.pay = pay
        Employee.num_of_emps += 1

    def __str__(self):
        return "Name: {} {}, Pay: {}".format(self.first, self.last, self.pay)

emp_1 = Employee(first="John", last="Public", pay=50000)  # <1>
emp_2 = Employee(emp_str="Test-Employee-60000")  # <2>
print(emp_1)
print(emp_2)
print("Employees:", Employee.num_of_emps)
<1> Pass a tuple of values.
<2> Pass a string encoding all the values.
Notice the caller side of this is identical: the "API" is the same. This looks a little simpler, but it still feels awkward because the __init__ function makes assumptions based on possibly not terribly reliable information. In the first example we looked for the presence of key names in a dictionary; in this one we’re looking for non-default values of named arguments. If the string value is present we use it; otherwise we check that we have all three of the expected arguments in the other form, and go from there.
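One concrete way the "non-default value" test can mislead: a truthiness check treats a pay of 0 the same as a missing pay. The sketch below is a hypothetical variant, not the version above; it compares against None explicitly and raises an exception instead of printing and returning, so a half-initialized object never reaches the caller. The names volunteer and error_text and the error message are made up for the illustration.

```python
class Employee:
    def __init__(self, first=None, last=None, pay=None, emp_str=None):
        if emp_str is not None:
            first, last, pay = emp_str.split("-")
        elif first is None or last is None or pay is None:
            # Raising keeps a half-initialized object from escaping.
            raise TypeError("need either emp_str or all of first, last, pay")
        self.first = first
        self.last = last
        self.pay = pay


volunteer = Employee(first="Pat", last="Helper", pay=0)  # 0 is falsy but valid
print(volunteer.pay)

try:
    Employee(first="Only")
except TypeError as err:
    error_text = str(err)
    print("rejected:", error_text)
```

Even with these adjustments the initializer still has to guess how it was called, which is the awkwardness described above.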
There is another way to tackle this, which gets back to my objective in writing these posts: learning things added to Python since the "early days" of Python 2, and seeing how they can be used to make code nicer looking. That way is to use class methods. Class methods are not really new to Python: they appeared in 2.2, and the decorator form was added in 2.4. Still, it’s not something I had learned about in those early Python 2 days.
To understand what’s going on here: when a method is defined inside a class definition, it is by default what is called an instance method. That means it receives an implicit first argument which is a reference to the instance object. By convention this argument is named self, though the name itself is not anything magical. For a class method, this implicit argument is instead a reference to the class object, and is by convention named cls. The simple way to set this up is to decorate the method definition with @classmethod. There is another kind of method known as a static method, which does not receive either an instance or class argument.
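A tiny sketch of the three kinds of method side by side; the class Demo and its method names are invented purely to show what each one receives as its implicit first argument:

```python
class Demo:
    def instance_method(self):
        # Receives the instance the method was called on.
        return self

    @classmethod
    def class_method(cls):
        # Receives the class, whether called on the class or on an instance.
        return cls

    @staticmethod
    def static_method():
        # Receives nothing implicitly.
        return "no implicit argument"


d = Demo()
print(d.instance_method() is d)
print(Demo.class_method() is Demo)
print(d.class_method() is Demo)
print(Demo.static_method())
```

Here is the employee class reworked to use a class method: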
class Employee:
    num_of_emps = 0

    def __init__(self, first, last, pay):
        self.first = first
        self.last = last
        self.pay = pay
        Employee.num_of_emps += 1

    @classmethod
    def from_string(cls, emp_str):
        first, last, pay = emp_str.split('-')
        return cls(first, last, pay)

    def __str__(self):
        return "Name: {} {}, Pay: {}".format(self.first, self.last, self.pay)

emp_1 = Employee(first="John", last="Public", pay=50000)  # <1>
emp_2 = Employee.from_string(emp_str="Test-Employee-60000")  # <2>
print(emp_1)
print(emp_2)
print(Employee.num_of_emps)
<1> Pass a tuple of values.
<2> Pass a string containing all the values, using the from_string classmethod.
This leaves something nice and clean looking. I will admit that for those who come from the "method overloading" point of view, the way the string form is instantiated is different, so it doesn’t look quite like classical overloading any more. Also note that, for symmetry, the tuple form could also be written as a class method, with both then reaching the initializer by calling through the class. Then at least the invocations would look more similar, as in:
emp_1 = Employee.from_tuple(first="John", last="Public", pay=50000)
emp_2 = Employee.from_string(emp_str="Test-Employee-60000")
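A possible sketch of that symmetric version follows. The from_tuple method is not defined anywhere earlier in this post, so its signature here is an assumption chosen to match the call above, and the Manager subclass is added only to show that cls follows whichever class the method was called on.

```python
class Employee:
    num_of_emps = 0

    def __init__(self, first, last, pay):
        self.first = first
        self.last = last
        self.pay = pay
        Employee.num_of_emps += 1

    @classmethod
    def from_tuple(cls, first, last, pay):
        # Hypothetical counterpart to from_string; simply forwards to __init__.
        return cls(first, last, pay)

    @classmethod
    def from_string(cls, emp_str):
        first, last, pay = emp_str.split("-")
        return cls(first, last, pay)

    def __str__(self):
        return "Name: {} {}, Pay: {}".format(self.first, self.last, self.pay)


class Manager(Employee):
    """Hypothetical subclass, to show that cls follows the class used in the call."""


emp_1 = Employee.from_tuple(first="John", last="Public", pay=50000)
emp_2 = Employee.from_string(emp_str="Test-Employee-60000")
boss = Manager.from_string("Big-Boss-90000")
print(emp_1)
print(emp_2)
print(type(boss).__name__, Employee.num_of_emps)
```

Because both alternate constructors call cls(...) rather than Employee(...), a subclass gets working versions of them without any extra code.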