Python Tricks: Underscores, Dunders, and More
Continued from the previous part: https://developer.aliyun.com/article/1618427?spm=a2c6h.13148508.setting.17.77924f0edu1m2D
3. Double Leading Underscore: “__var”
The naming patterns we’ve covered so far receive their meaning from agreed-upon convention only. With Python class attributes (variables and methods) that start with double underscores, things are a little different.
A double underscore prefix causes the Python interpreter to rewrite the attribute name in order to avoid naming conflicts in subclasses.
This is also called name mangling - the interpreter changes the name of the variable in a way that makes it harder to create collisions when the class is extended later.
I know this sounds rather abstract. That’s why I put together this little code example we can use for experimentation:
>>> class Test:
...     def __init__(self):
...         self.foo = 11
...         self._bar = 23
...         self.__baz = 42
Let’s take a look at the attributes on this object using the built-in dir() function:
>>> t = Test()
>>> dir(t)
['_Test__baz', '__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_bar', 'foo']
This gives us a list with the object’s attributes. Let’s take this list and look for our original variable names foo, _bar, and __baz. I promise you’ll notice some interesting changes.
First of all, the self.foo variable appears unmodified as foo in the attribute list.
Next up, self._bar behaves the same way: it shows up on the class as _bar. Like I said before, the leading underscore is just a convention in this case, a hint for the programmer.
However, with self.__baz things look a little different. When you search for __baz in that list, you’ll see that there is no variable with that name.
So what happened to __baz?
If you look closely, you’ll see there’s an attribute called _Test__baz on this object. This is the name mangling that the Python interpreter applies. It does this to protect the variable from getting overridden in subclasses.
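A more compact way to spot the mangled name is to look only at the instance's own attributes with the built-in vars() function. The following is a small sketch that repeats the Test class as a script rather than a REPL session; the variable name attrs is my own addition for illustration:

```python
class Test:
    def __init__(self):
        self.foo = 11
        self._bar = 23
        self.__baz = 42  # stored on the instance as _Test__baz


t = Test()
attrs = vars(t)  # the instance's own attributes, without inherited names

print(attrs)
# {'foo': 11, '_bar': 23, '_Test__baz': 42}
```

Unlike dir(), this leaves out everything inherited from object, so the renamed attribute is easier to see.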
Let’s create another class that extends the Test class and attempts to override its existing attributes added in the constructor:
>>> class ExtendedTest(Test):
...     def __init__(self):
...         super().__init__()
...         self.foo = 'overridden'
...         self._bar = 'overridden'
...         self.__baz = 'overridden'
Now, what do you think the values of foo, _bar, and __baz will be on instances of this ExtendedTest class? Let’s take a look:
>>> t2 = ExtendedTest()
>>> t2.foo
'overridden'
>>> t2._bar
'overridden'
>>> t2.__baz
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'ExtendedTest' object has no attribute '__baz'
Wait, why did we get that AttributeError when we tried to inspect the value of t2.__baz? Name mangling strikes again! It turns out this object doesn’t even have a __baz attribute:
>>> dir(t2)
['_ExtendedTest__baz', '_Test__baz', '__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_bar', 'foo']
As you can see, __baz got turned into _ExtendedTest__baz to prevent accidental modification. But the original _Test__baz is also still around:
>>> t2._ExtendedTest__baz
'overridden'
>>> t2._Test__baz
42
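To see why keeping both attributes is useful, here is a sketch of the situation name mangling is meant for. It assumes a hypothetical get_baz() method on the base class, which is not part of the example above. Because the method body is written inside Test, its reference to __baz is rewritten to _Test__baz, so the subclass assignment cannot interfere with it:

```python
class Test:
    def __init__(self):
        self.__baz = 42  # becomes self._Test__baz

    def get_baz(self):
        # Hypothetical helper: written inside Test, so it reads self._Test__baz
        return self.__baz


class ExtendedTest(Test):
    def __init__(self):
        super().__init__()
        self.__baz = 'overridden'  # becomes self._ExtendedTest__baz


t2 = ExtendedTest()
base_value = t2.get_baz()
print(base_value)  # 42
```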
Double underscore name mangling is fully transparent to the programmer. Take a look at the following example that will confirm this:
>>> class ManglingTest:
...     def __init__(self):
...         self.__mangled = 'hello'
...     def get_mangled(self):
...         return self.__mangled
...
>>> ManglingTest().get_mangled()
'hello'
>>> ManglingTest().__mangled
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'ManglingTest' object has no attribute '__mangled'
Does name mangling also apply to method names? It sure does! Name mangling affects all names that start with two underscore characters (“dunders”) in a class context:
>>> class MangledMethod:
...     def __method(self):
...         return 42
...     def call_it(self):
...         return self.__method()
...
>>> MangledMethod().__method()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'MangledMethod' object has no attribute '__method'
>>> MangledMethod().call_it()
42
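One practical consequence for methods is that a subclass cannot accidentally replace a double-underscore helper that the base class relies on. The following is a minimal sketch of that idea; the Greeter and LoudGreeter classes and their method names are made up for illustration:

```python
class Greeter:
    def __prefix(self):  # stored on the class as _Greeter__prefix
        return 'Hello'

    def greet(self, name):
        # Written inside Greeter, so this always calls _Greeter__prefix
        return self.__prefix() + ', ' + name


class LoudGreeter(Greeter):
    def __prefix(self):  # stored as _LoudGreeter__prefix, a separate method
        return 'HEY'


result = LoudGreeter().greet('Ada')
print(result)  # Hello, Ada
```

Both methods exist side by side on LoudGreeter, each under its own mangled name, so greet() keeps using the helper it was written against.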
Here’s another, perhaps surprising, example of name mangling in action:
>>> _MangledGlobal__mangled = 23
>>> class MangledGlobal:
...     def test(self):
...         return __mangled
...
>>> MangledGlobal().test()
23
In this example, I declared _MangledGlobal__mangled as a global variable. Then I accessed the variable inside the context of a class named MangledGlobal. Because of name mangling, I was able to reference the _MangledGlobal__mangled global variable as just __mangled inside the test() method on the class.
The Python interpreter automatically expanded the name __mangled to _MangledGlobal__mangled because it begins with two underscore characters. This demonstrates that name mangling isn’t tied to class attributes specifically. It applies to any name starting with two underscore characters that is used in a class context.
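A related detail worth knowing: the rewrite happens to identifiers in the source code when the class body is compiled, so attribute names passed around as strings are left alone. The sketch below illustrates this with a hypothetical Config class; the attribute names __token and __raw are made up for the example:

```python
class Config:
    def __init__(self):
        self.__token = 'abc'           # identifier: mangled to _Config__token
        setattr(self, '__raw', 'xyz')  # string: stored literally as __raw

    def read_token_by_string(self):
        # getattr() receives the plain string '__token', which was never stored
        return getattr(self, '__token', 'missing')


c = Config()
names = sorted(vars(c))
print(names)                     # ['_Config__token', '__raw']
print(c.read_token_by_string())  # missing
```

If you need to reach a mangled attribute through getattr() or setattr(), you have to spell out the mangled name yourself.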
Whew! That was a lot to absorb.
To be honest with you, I didn’t write down these examples and explanations off the top of my head. It took me some research and editing to do it. I’ve been using Python for years, but rules and special cases like that aren’t constantly on my mind.
Sometimes the most important skills for a programmer are “pattern recognition” and knowing where to look things up. If you feel a little overwhelmed at this point, don’t worry. Take your time and play with some of the examples in this chapter.
Let these concepts sink in enough so that you’ll recognize the general idea of name mangling and some of the other behaviors I’ve shown you. If you encounter them “in the wild” one day, you’ll know what to look for in the documentation.
Sidebar: What are dunders?
If you’ve heard some experienced Pythonistas talk about Python or watched a few conference talks you may have heard the term dunder. If you’re wondering what that is, well, here’s your answer:
Double underscores are often referred to as “dunders” in the Python community. The reason is that double underscores appear quite often in Python code, and to avoid fatiguing their jaw muscles, Pythonistas often shorten “double underscore” to “dunder”.
For example, you’d pronounce __baz as “dunder baz”. Likewise, __init__ would be pronounced as “dunder init”, even though one might think it should be “dunder init dunder”.
4. Double Leading and Trailing Underscore: “__var__”
Perhaps surprisingly, name mangling is not applied if a name starts and ends with double underscores. Variables surrounded by a double underscore prefix and postfix are left unscathed by the Python interpreter:
>>> class PrefixPostfixTest:
...     def __init__(self):
...         self.__bam__ = 42
...
>>> PrefixPostfixTest().__bam__
42
However, names that have both leading and trailing double underscores are reserved for special use in the language. This rule covers things like __init__ for object constructors, or __call__ to make objects callable.
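As a quick illustration of these two reserved names, here is a small sketch of a class that defines both __init__ and __call__. The Multiplier class and its factor attribute are hypothetical names chosen for the example:

```python
class Multiplier:
    def __init__(self, factor):
        # Runs when Multiplier(...) creates a new object
        self.factor = factor

    def __call__(self, value):
        # Runs when an instance is used like a function
        return value * self.factor


double = Multiplier(2)
answer = double(21)
print(answer)            # 42
print(callable(double))  # True
```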
These dunder methods are often referred to as magic methods, but many people in the Python community, including myself, don’t like that word. It implies that the use of dunder methods is discouraged, which is entirely not the case. They’re a core feature in Python and should be used as needed. There’s nothing “magical” or arcane about them.
However, as far as naming conventions go, it’s best to stay away from using names that start and end with double underscores in your own programs to avoid collisions with future changes to the Python language.
5. Single Underscore: “_”
Per convention, a single stand-alone underscore is sometimes used as a name to indicate that a variable is temporary or insignificant.
For example, in the following loop we don’t need access to the running index, and we can use “_” to indicate that it is just a temporary value:
>>> for _ in range(32):
...     print('Hello, World.')
Hello, World.
Hello, World.
Hello, World.
Hello, World.
Hello, World.
Hello, World.
Hello, World.
Hello, World.
Hello, World.
Hello, World.
Hello, World.
Hello, World.
Hello, World.
Hello, World.
Hello, World.
Hello, World.
Hello, World.
Hello, World.
Hello, World.
Hello, World.
Hello, World.
Hello, World.
Hello, World.
Hello, World.
Hello, World.
Hello, World.
Hello, World.
Hello, World.
Hello, World.
Hello, World.
Hello, World.
Hello, World.
You can also use single underscores in unpacking expressions as a “don’t care” variable to ignore particular values. Again, this meaning is per convention only and it doesn’t trigger any special behaviors in the Python parser. The single underscore is simply a valid variable name that’s sometimes used for this purpose.
In the following code example, I’m unpacking a tuple into separate variables, but I’m only interested in the values for the color and mileage fields. However, in order for the unpacking expression to succeed, I need to assign all values contained in the tuple to variables. That’s where “_” is useful as a placeholder variable:
>>> car = ('red', 'auto', 12, 3812.4)
>>> color, _, _, mileage = car
>>> color
'red'
>>> mileage
3812.4
>>> _
12
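The same placeholder works when unpacking inside a for loop. Here is a short sketch along those lines; the second tuple and the summary list are made-up data added only for illustration:

```python
cars = [
    ('red', 'auto', 12, 3812.4),
    ('blue', 'manual', 8, 40231.0),  # hypothetical second record
]

summary = []
for color, _, _, mileage in cars:
    # The two middle fields are assigned to _ and simply ignored
    summary.append((color, mileage))

print(summary)  # [('red', 3812.4), ('blue', 40231.0)]
```

As in the REPL session above, “_” is still an ordinary variable here and ends up holding the last value assigned to it.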
Besides its use as a temporary variable, “_” is a special variable in most Python REPLs that represents the result of the last expression evaluated by the interpreter.
This is handy if you’re working in an interpreter session and you’d like to access the result of a previous calculation:
>>> 20 + 3
23
>>> _
23
>>> print(_)
23
It’s also handy if you’re constructing objects on the fly and want to interact with them without assigning them a name first:
>>> list()
[]
>>> _.append(1)
>>> _.append(2)
>>> _.append(3)
>>> _
[1, 2, 3]
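If you’re curious where this behavior comes from: in the standard CPython REPL, the value of an evaluated expression is passed to sys.displayhook, which prints it and stores it in builtins._. The following sketch, meant as an illustration rather than something you would write in real code, calls the original hook directly from a script to show the effect; the variable name last is my own:

```python
import builtins
import sys

# sys.__displayhook__ is the interpreter's original display hook.
# Calling it by hand mimics what the REPL does after evaluating 20 + 3.
sys.__displayhook__(20 + 3)  # prints 23

last = builtins._  # the hook saved the value under the name _
print(last)        # 23
```

This also explains why “_” has no special meaning in a regular script: nothing calls the display hook there, so “_” is just another variable name.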