The Bottom-Line Single Main Difference Between Python 2 and 3
"How is Python 3 different from Python 2?"
An excellent question. A hard one, too. Python 3 has hundreds of improvements compared to the last Python 2. Some of these may matter to you a lot. It's possible most won't matter to you at all.
Which are which? I have no idea.
But I can bottom line it for you.
We can summarize all the ways Python 3 differs from 2.7 - digest all the things that are new in 3, and will now never be backported to 2 - in one sentence:
Python 3 makes it easier to develop high quality software.
"High quality" means software that is:
- Less prone to hidden or tricky bugs - i.e., robust and reliable;
- More straightforward to change over time - in other words, maintainable; and
- Simpler to troubleshoot and fix when things do go wrong.
You can, of course, create high quality software in Python 2. But the two hundred-ish improvements in Python 3 generally make it easier to do so. Here are three examples.
Evading the Worst Kind of Bug
Imagine a class representing an angle, in the range of 0 to 360 degrees:
- >>> # Python 2 code.
- >>> class TrickyAngle(object):
- ... def __init__(self, degrees):
- ... self.degrees = degrees % 360
- ...
- >>> TrickyAngle(6) < TrickyAngle(5)
- True
- >>> TrickyAngle(6) < TrickyAngle(5)
- False
Whoa, what just happened? The same char-for-char expression,
TrickyAngle(6) < TrickyAngle(5)
, evaluated twice,
produces two completely opposite values. In Python 2, if you forget to
define the __lt__
method on your class, the comparison
falls back to using the object's ID. This is effectively a random
integer. Whether you get True or False is like tossing a coin:
- # In Python 2...
- >>> a, b = TrickyAngle(42), TrickyAngle(84)
- >>> id(a), id(b)
- (4518897296, 4518897232)
- >>> id(a) < id(b)
- False
- >>>
- >>> # Exact same code - run it again...
- >>> a, b = TrickyAngle(42), TrickyAngle(84)
- >>> id(a), id(b)
- (4518897040, 4518897104)
- >>> id(a) < id(b)
- True
This is the worst kind of bug. It's the kind that easily sneaks its way past the most vigilant unit tests, even rigorous manual testing, all the way into the deployed, production code base. And you won't even know it's there, until your most valued clients are screaming at you.
If that's the worst kind of bug, what's the best kind? The kind that immediately, unambiguously, loudly announces its presence, in a way that is impossible to miss:
- >>> # Python 3 code.
- ... class TrickyAngle:
- ... def __init__(self, degrees):
- ... self.degrees = degrees % 360
- ...
- >>> TrickyAngle(6) < TrickyAngle(5)
- Traceback (most recent call last):
- File "<stdin>", line 1, in <module>
- TypeError: unorderable types: TrickyAngle() < TrickyAngle()
That's right: in Python 3, comparing two objects raises a
TypeError
by default, until you explicitly define an
ordering.
This one change is representative of a large class of subtle semantic improvements in Python 3, each of which eliminates many potential bugs. Imagine how much debugging time and frustration they save.
Streamlined super()
I'll assume you know about Python's built-in super()
function. It's used in methods, to call a method in the parent
class. In Python 2, you use it like this:
- import json
- class Config(object):
- def __init__(self, config_dict):
- self.data = config_dict
-
- class ClientConfig(Config):
- def __init__(self, json_config_file):
- with open(json_config_file) as fh:
- data = json.load(fh)
- super(ClientConfig, self).__init__(data)
Whenever you call super()
, you have to pass in two
arguments. These parameters are useful in multiple inheritance, to
make your classes cooperate. But in the much more common case
of single inheritance, typing in ClientConfig
and
self
are redundant. That's why in Python 3,
you can omit them both:
- class ClientConfig(Config):
- def __init__(self, json_config_file):
- with open(json_config_file) as fh:
- data = json.load(fh)
- super().__init__(data) # <---- Shorter!
This has two benefits. The first, obvious one is that if the name
of ClientConfig
changes, the call to super()
need not be modified. A small but real maintainability benefit, right there.
There's also a more immediate, cognitive benefit: you don't have to
think as much when typing out the super
line. In Python
2, I always have to take a moment to remember the order of the
arguments, or even look it up. (Is it super(MyClass,
self)
or super(self, MyClass)
?) And I have to
glance up at what current class I'm in, if I don't immediately recall
its name.
Running into this just once this doesn't drain much mental
energy. But when you're deep in your focus of coding, even the
smallest distraction can disrupt the pictures you're holding in your
mind; having the freedom to just type "super()
" in that
moment can help maintain your flow. Multiplying over every time you
use super
, and other improvements like this in Python
3, compounds the effect.
Unmasked Exceptions
Suppose you have the following class:
- # Valid in both Python 2 and 3.
- class DataSet(object):
- def __init__(self, data):
- # data is of type list.
- self.data = data
- def mean_of_positives(self):
- '''
- Return average of *positive* elements in data set.
- '''
- # (I'll tell you what's in here momentarily.)
- def record_first_element(self, dest_path):
- '''
- Sample the current data set by writing
- first element to a log file.
- '''
- # (Again, be patient.)
-
- # Now, somewhere in your application:
- dataset = DataSet(initial_data)
- try:
- posmean = dataset.mean_of_positives()
- finally:
- dataset.record_first_element(sample_file)
Now suppose the code above is run under Python 2.7, and you see the following traceback:
- Traceback (most recent call last):
- File "chaining.py", line 25, in <module>
- dataset.record_first_element(sample_file)
- File "chaining.py", line 16, in record_first_element
- dest.write(str(self.data[0]) + ',')
- IndexError: list index out of range
Pop quiz: what method of DataSet
triggered an
exception?
Got it? Look carefully. Are you sure?
It turns out there are two different exceptions raised
here. In Python 2, the IndexError
is the second, and
fully masks the first one. But in Python 3, all is revealed:
- Traceback (most recent call last):
- File "chaining.py", line 23, in <module>
- posmean = dataset.mean_of_positives()
- File "chaining.py", line 12, in mean_of_positives
- return total / len(positives)
- ZeroDivisionError: float division by zero
-
- During handling of the above exception, another exception occurred:
-
- Traceback (most recent call last):
- File "chaining.py", line 25, in <module>
- dataset.record_first_element(sample_file)
- File "chaining.py", line 16, in record_first_element
- dest.write(str(self.data[0]) + ',')
- IndexError: list index out of range
In the try
block, mean_of_positives
raises a ZeroDivisionError
. Before bubbling up,
execution jumps to the finally
block, where
record_first_element
raises IndexError
.
Python 3's way of handling the situation is called exception chaining. There are steps you can take to implement this manually, but Python 3 gives it to you for free, so to speak.
This benefits your development process in obvious ways. One that may not be obvious: if your long-running application is properly logging exceptions, this can massively assist with troubleshooting rare bugs that are hard to reproduce. Exception chaining tells you about everything that went wrong, not just the last thing.
All Else Being ==
All else being equal, Python 3 makes it easier to write high quality software than Python 2. These are three changes enabling that, and there are hundreds more.
Of course, things are not all equal. If you're starting a brand-new application today, there are plenty of valid reasons to write it in 2 instead of 3. There will be for a long time.
And if you have a large existing code base, you definitely want to carefully consider before dropping everything to port it to the new major version. Python 3 brings many benefits, but switching to it does have a cost in time and energy.
There's another aspect to all this. The examples above don't convey the new syntax, core language enhancements, and modules exclusively available in Python 3... enabling expressive new patterns, and making a real impact on development speed and productivity. In other words, Python 3 is a more powerful and expressive language as a whole.
That tends to make the language more of a joy to work with. Which arguably doesn't matter professionally. But as a developer, it might matter to you personally.
In any event, just keep your eyes open, so you can make the authentically best decision for you and your team. And whatever you do, write the best software you can!
Thanks to Simeon Franklin for feedback on a draft of this article.