Python Tricks: String Conversion (Every Class Needs a __repr__)
When you define a custom class in Python and then try to print one of its instances to the console (or inspect it in an interpreter session), you get a relatively unsatisfying result. The default “to string” conversion behavior is basic and lacks detail:
In [12]: class Car:
    ...:     def __init__(self, color, mileage):
    ...:         self.color = color
    ...:         self.mileage = mileage
In [13]: my_car = Car('red', 37281)
In [14]: print(my_car)
<__main__.Car object at 0x7fa8bc4c3b20>
In [15]: my_car
Out[15]: <__main__.Car at 0x7fa8bc4c3b20>
By default, all you get is a string containing the class name and the id of the object instance (which is the object’s memory address in CPython). That’s better than nothing, but it’s also not very useful.
You might find yourself trying to work around this by printing attributes of the class directly, or even by adding a custom to_string() method to your classes:
In [16]: print(my_car.color, my_car.mileage)
red 37281
The general idea here is the right one, but it ignores the conventions and built-in mechanisms Python uses to handle how objects are represented as strings.
Instead of building your own to-string conversion machinery, you’ll be better off adding the __str__ and __repr__ “dunder” methods to your class. They are the Pythonic way to control how objects are converted to strings in different situations.
Let’s take a look at how these methods work in practice. To get started, we’re going to add a __str__ method to the Car class we defined earlier:
In [19]: class Car:
    ...:     def __init__(self, color, mileage):
    ...:         self.color = color
    ...:         self.mileage = mileage
    ...:     def __str__(self):
    ...:         return f'a {self.color} car'
When you try printing or inspecting a Car instance now, you’ll get a different, slightly improved result:
In [20]: my_car = Car('red', 37281)
In [21]: print(my_car)
a red car
In [22]: my_car
Out[22]: <__main__.Car at 0x7fa8be313850>
Inspecting the car object in the console still gives us the previous result containing the object’s id. But printing the object now returns the string produced by the __str__ method we added.
__str__ is one of Python’s “dunder” (double-underscore) methods and gets called when you try to convert an object into a string through the various means that are available:
In [23]: print(my_car)
a red car
In [24]: str(my_car)
Out[24]: 'a red car'
In [25]: '{}'.format(my_car)
Out[25]: 'a red car'
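As a small sketch beyond the calls shown above, f-strings and the older %-style formatting also go through __str__. The Car class is repeated here only so the snippet runs on its own; the variable names are illustrative.

```python
class Car:
    def __init__(self, color, mileage):
        self.color = color
        self.mileage = mileage

    def __str__(self):
        return f'a {self.color} car'


my_car = Car('red', 37281)

# Each of these conversions ends up calling Car.__str__
via_str = str(my_car)
via_format = '{}'.format(my_car)
via_fstring = f'{my_car}'
via_percent = '%s' % my_car

print(via_str, via_format, via_fstring, via_percent, sep=' | ')
```

All four variables hold the same string, so it rarely matters which formatting style the calling code prefers.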
With a proper __str__ implementation, you won’t have to worry about printing object attributes directly or writing a separate to_string() function. It’s the Pythonic way to control string conversion.
By the way, some people refer to Python’s “dunder” methods as “magic methods.” But these methods are not supposed to be magical in any way.
Don’t be afraid to use Python’s dunder methods; they’re meant to help you.
__str__ vs __repr__
Now, our string conversion story doesn’t end there. Did you see how inspecting my_car in an interpreter session still gave that odd <Car object at 0x7fa8bc4c3b20> result?
This happened because there are actually two dunder methods that control how objects are converted to strings: __str__ and __repr__.
Here’s a simple experiment you can use to get a feel for when __str__ or __repr__ is used. Let’s redefine our Car class so it contains both to-string dunder methods, each with an output that is easy to tell apart:
In [26]: class Car:
    ...:     def __init__(self, color, mileage):
    ...:         self.color = color
    ...:         self.mileage = mileage
    ...:     def __repr__(self):
    ...:         return '__repr__ for Car'
    ...:     def __str__(self):
    ...:         return '__str__ for Car'
Now, when you play through the previous examples you can see which method controls the string conversion result in each case:
In [27]: my_car = Car('red', 37281)
In [28]: print(my_car)
__str__ for Car
In [29]: '{}'.format(my_car)
Out[29]: '__str__ for Car'
In [30]: my_car
Out[30]: __repr__ for Car
This experiment confirms that inspecting an object in a Python interpreter session simply prints the result of the object’s __repr__.
Interestingly, containers like lists and dicts always use the result of __repr__ to represent the objects they contain, even if you call str() on the container itself:
In [31]: str([my_car])
Out[31]: '[__repr__ for Car]'
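The same behavior can be checked for other built-in containers. The following is a minimal sketch that reuses the experimental Car class from above; the dictionary key 'first' is just a placeholder.

```python
class Car:
    def __init__(self, color, mileage):
        self.color = color
        self.mileage = mileage

    def __repr__(self):
        return '__repr__ for Car'

    def __str__(self):
        return '__str__ for Car'


my_car = Car('red', 37281)

# Containers format their elements with repr(), even inside str()
as_list = str([my_car])
as_tuple = str((my_car,))
as_dict = str({'first': my_car})

# To get the __str__ output, convert each element explicitly
joined = ', '.join(str(car) for car in [my_car, my_car])

print(as_list, as_tuple, as_dict, joined, sep='\n')
```

If you want the friendlier __str__ text for a collection, converting the elements one by one as in the last step is one straightforward option.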
To manually choose between the two string conversion methods, for example to express your code’s intent more clearly, it’s best to use the built-in str() and repr() functions. Using them is preferable to calling the object’s __str__ or __repr__ directly, as it looks nicer and gives the same result:
In [32]: str(my_car)
Out[32]: '__str__ for Car'
In [33]: repr(my_car)
Out[33]: '__repr__ for Car'
Even with this investigation complete, you might be wondering what the “real-world” difference is between __str__ and __repr__. They both seem to serve the same purpose, so it might be unclear when to use each.
With questions like that, it’s usually a good idea to look into what the Python standard library does. Time to devise another experiment. We’ll create a datetime.date object and find out how it uses __repr__ and __str__ to control string conversion:
In [34]: import datetime
In [35]: today = datetime.date.today()
The result of the date object’s __str__ function should primarily be readable. It’s meant to return a concise textual representation for human consumption, something you’d feel comfortable displaying to a user. Therefore, we get something that looks like an ISO date format when we call str() on the date object:
In [36]: str(today)
Out[36]: '2022-05-23'
With __repr__, the idea is that its result should be, above all, unambiguous. The resulting string is intended more as a debugging aid for developers. And for that it needs to be as explicit as possible about what this object is. That’s why you’ll get a more elaborate result calling repr() on the object. It even includes the full module and class name:
In [37]: repr(today)
Out[37]: 'datetime.date(2022, 5, 23)'
We could copy and paste the string returned by __repr__ and execute it as valid Python to recreate the original date object. This is a neat approach and a goal to keep in mind while writing your own __repr__ methods.
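To make the round trip concrete, here is a minimal sketch that feeds the repr string back into eval(). It uses a fixed date instead of date.today() so the result is reproducible, and it assumes the datetime module is already imported where eval() runs.

```python
import datetime

# Fixed date so the example gives the same result on every run
today = datetime.date(2022, 5, 23)

text = repr(today)

# eval() resolves the name "datetime" from the current scope.
# Only do this with strings you trust.
restored = eval(text)

print(text)
print(restored == today)
```

This works because the repr spells out the module, the class, and every constructor argument. eval() is used purely to illustrate the idea; it is not a general-purpose way to deserialize objects.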
On the other hand, I find that it is quite difficult to put into practice. Usually it won’t be worth the trouble and it’ll just create extra work for you. My rule of thumb is to make my __repr__ strings unambiguous and helpful for developers, but I don’t expect them to be able to restore an object’s complete state.
Why Every Class Needs a __repr__
If you don’t add a __str__ method, Python falls back on the result of __repr__ when looking for __str__. Therefore, I recommend that you always add at least a __repr__ method to your classes. This will guarantee a useful string conversion result in almost all cases, with a minimum of implementation work.
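The fallback only works in one direction, which is part of why __repr__ is the better method to start with. The sketch below uses two throwaway classes, OnlyRepr and OnlyStr, that are invented purely for illustration.

```python
class OnlyRepr:
    def __repr__(self):
        return 'OnlyRepr()'


class OnlyStr:
    def __str__(self):
        return 'an OnlyStr instance'


# str() falls back to __repr__ when __str__ is missing
str_of_only_repr = str(OnlyRepr())

# repr() does NOT fall back to __str__; the default output is used
repr_of_only_str = repr(OnlyStr())

print(str_of_only_repr)
print(repr_of_only_str)
```

The second print still shows the default angle-bracket output with a memory address, so a class that defines only __str__ remains hard to read in containers and interpreter sessions.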
Here’s how to add basic string conversion support to your classes quickly and efficiently. For our Car class we might start with the following __repr__:
In [38]: def __repr__(self):
    ...:     return f'Car({self.color!r}, {self.mileage!r})'
Please note that I’m using the !r conversion flag to make sure the output string uses repr(self.color) and repr(self.mileage) instead of str(self.color) and str(self.mileage).
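A quick way to see what the !r flag changes is to format the same values with and without it. This is a small sketch using plain variables rather than the Car class.

```python
color = 'red'

# Without !r the string is inserted via str(), so the quotes are lost
without_flag = f'Car({color}, 37281)'

# With !r the string is inserted via repr(), quotes included
with_flag = f'Car({color!r}, 37281)'

# The flag also keeps an int and a numeric string distinguishable
int_mileage = f'{37281!r}'
str_mileage = f'{"37281"!r}'

print(without_flag)
print(with_flag)
print(int_mileage, str_mileage)
```

Only the version with !r is unambiguous about the attribute types, which is the main goal for a __repr__ result.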
This works nicely, but one downside is that we’ve repeated the class name inside the format string. A trick you can use here to avoid this repetition is to use the object’s __class__.__name__ attribute, which will always reflect the class’s name as a string.
The benefit is that you won’t have to modify the __repr__ implementation when the class name changes. This makes it easy to adhere to the Don’t Repeat Yourself (DRY) principle:
In [39]: def __repr__(self):
    ...:     return (f'{self.__class__.__name__}('
    ...:             f'{self.color!r}, {self.mileage!r})')
The downside of this implementation is that the format string is quite long and unwieldy. But with careful formatting, you can keep the code nice and PEP 8 compliant.
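One further effect of looking up the class name at runtime is that subclasses pick up the right name without any extra code. The sketch below adds a hypothetical Truck subclass, which is not part of the original example, to show this.

```python
class Car:
    def __init__(self, color, mileage):
        self.color = color
        self.mileage = mileage

    def __repr__(self):
        return (f'{self.__class__.__name__}('
                f'{self.color!r}, {self.mileage!r})')


# Hypothetical subclass, added only for illustration
class Truck(Car):
    pass


car_repr = repr(Car('red', 37281))
truck_repr = repr(Truck('blue', 1200))

print(car_repr)
print(truck_repr)
```

With the hard-coded 'Car(...)' format string, the Truck instance would have been reported as a Car.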
With the above __repr__ implementation, we get a useful result when we inspect the object or call repr() on it directly:
In [49]: repr(my_car)
Out[49]: "Car('red', 37281)"
Printing the object or calling str() on it returns the same string because the default __str__ implementation simply calls __repr__:
In [50]: print(my_car)
Car('red', 37281)
In [51]: str(my_car)
Out[51]: "Car('red', 37281)"
I believe this approach provides the most value with a modest amount of implementation work. It’s also a fairly cookie-cutter approach that can be applied without much deliberation. For this reason, I always try to add a basic __repr__ implementation to my classes.
Here’s a complete example for Python 3, including an optional __str__ implementation:
In [52]: class Car:
    ...:     def __init__(self, color, mileage):
    ...:         self.color = color
    ...:         self.mileage = mileage
    ...:     def __repr__(self):
    ...:         return (f'{self.__class__.__name__}('
    ...:                 f'{self.color!r}, {self.mileage!r})')
    ...:     def __str__(self):
    ...:         return f'a {self.color} car'
Python 2.x Differences: __unicode__
In Python 3 there’s one data type to represent text across the board: str. It holds Unicode characters and can represent most of the world’s writing systems.
Python 2.x uses a different data model for strings. There are two types to represent text: str, which is limited to the ASCII character set, and unicode, which is equivalent to Python 3’s str.
Due to this difference, there’s yet another dunder method in the mix for controlling string conversion in Python 2: __unicode__. In Python 2, __str__ returns bytes, whereas __unicode__ returns characters.
For most intents and purposes, __unicode__ is the newer and preferred method to control string conversion. There’s also a built-in unicode() function to go along with it. It calls the respective dunder method, similar to how str() and repr() work.
So far so good. Now, it gets a little more quirky when you look at the rules for when __str__ and __unicode__ are called in Python 2:
The print statement and str() call __str__. The unicode() built-in calls __unicode__ if it exists, and otherwise falls back to __str__ and decodes the result with the system text encoding.
Compared to Python 3, these special cases complicate the text conversion rules somewhat. But there is a way to simplify things again for practical purposes. Unicode is the preferred and future-proof way of handling text in your Python programs.
So generally, what I would recommend you do in Python 2.x is to put all of your string formatting code inside the __unicode__ method and then create a stub __str__ implementation that returns the unicode representation encoded as UTF-8:
In [53]: def __str__(self):
    ...:     return unicode(self).encode('utf-8')
The __str__ stub will be the same for most classes you write, so you can just copy and paste it around as needed (or put it into a base class where it makes sense). All of your string conversion code that is meant for non-developer use then lives in __unicode__.
Here’s a complete example for Python 2.x:
In [54]: class Car(object):
    ...:     def __init__(self, color, mileage):
    ...:         self.color = color
    ...:         self.mileage = mileage
    ...:     def __repr__(self):
    ...:         return '{}({!r}, {!r})'.format(
    ...:             self.__class__.__name__,
    ...:             self.color, self.mileage)
    ...:     def __unicode__(self):
    ...:         return u'a {self.color} car'.format(self=self)
    ...:     def __str__(self):
    ...:         return unicode(self).encode('utf-8')