Python Tricks: A Shocking Truth About String Formatting(二)

简介: Python Tricks: A Shocking Truth About String Formatting(二)

接上文 Python Tricks: A Shocking Truth About String Formatting(一)https://developer.aliyun.com/article/1618445

#3 - Literal String Interpolation(Python3.6+)
Python3.6 adds yet another way to format strings, called Formatted String Literals. This new way of formatting strings lets you use embedded Python expressions inside string constants. Here’s a simple example to give you a feel for the feature:

>>> f'Hello, {name}'
'Hello, Bob'

This new formatting syntax is powerful. Because you can embed arbitrary Python expressions, you can even do inline arithmetic with it, like this:

>>> a = 5
>>> b = 10
>>> f'Five plus ten is {a + b} and not {2 * (a + b)}.'
'Five plus ten is 15 and not 30.'

Behind the scenes, formatted string literals are a Python parser feature that converts f-strings into a series of string constants and expressions. They then get joined up to build the final string.

Imagine we had the following greet() function that contains an f-string:

>>> def greet(name, question):
...     return f"Hello, {name}! How's it {question}?"
... 
>>> greet('Bob', 'going')
"Hello, Bob! How's it going?"

When we disassemble the function and inspect what’s going on behind the scenes, we can see that the f-string in the function gets transformed into something similar to the following:

>>> def greet(name, question):
...     return("Hello, " + name + "! How's it " + question +"?")

The real implementation is slightly faster than that because it uses the BUILD_STRING opcode as an optimization. But functionally they’re the same:

>>> import dis
>>> dis.dis(greet)
  2           0 LOAD_CONST               1 ('Hello, ')
              2 LOAD_FAST                0 (name)
              4 BINARY_ADD
              6 LOAD_CONST               2 ("! How's it ")
              8 BINARY_ADD
             10 LOAD_FAST                1 (question)
             12 BINARY_ADD
             14 LOAD_CONST               3 ('?')
             16 BINARY_ADD
             18 RETURN_VALUE

String literals also support the existing format string syntax of the str.format() method. That allows you to solve the same formatting problems we’ve discussed in the previous two sections:

>>> f"Hey {name}, there's a {errno:#x} error!" 
"Hey Bob, there's a 0xbadc0ffee error!"

Python’s new Formatted String Literals are similar to the JavaScript Template Literals added in ES2015. I think they’re quite a nice addition to the language, and I’ve already started using them in my day-to-day Python3 work. You can learn more about Formatted String Literals in the offical Python documentation.

#4 - Template Strings
One more technique for string formatting in Python is Template Strings. It’s a simpler and less powerful mechanism, but in some cases this might be exactly what you’re looking for.

Let’s take a look at a simple greeting example:

>>> from string import Template
>>> t = Template('Hey, $name!')
>>> t.substitute(name=name)
'Hey, Bob!'

You see here that we need to import the Template class from Python’s built-in string module. Template strings are not a core language feature but they’re supplied by a module in the standard library.

Another difference is that template strings don’t allow format specifiers. So in order to get our error string example to work, we need to transform our int error number into a hex-string ourselves:

>>> templ_string = 'Hey $name, there is a $error error!'
>>> Template(templ_string).substitute(name=name, error=hex(errno))
'Hey Bob, there is a 0xbadc0ffee error!'

That worked great but you’re probably wondering when you use template strings in your Python programs. In my opinion, the best use case for template strings is when you’re handling format strings generated by users of your program. Due to their reduced complexity, template strings are a safer choice.

The more complex formatting mini-languages of other string formatting techniques might introduce security vulnerabilities to your programs. For example, it’s possible for format strings to access arbitrary variables in your program.

That means, if a malicious user can supply a format string they can also potentially leak secret keys and other sensitive information!Here’s a simple proof of concept of how this attack might be used:

>>> SECRET = 'this-is-a-secret'
>>> class Error:
...     def __init__(self):
...             pass

>>> err = Error()
>>> user_input = '{error.__init__.__globals__[SECRET]}'
# Uh-oh...
>>> user_input.format(error=err)
'this-is-a-secret'

See how the hypothetical attacker was able to extract our secret string by accessing the globals dictionary from the format string? Scary, huh! Template Strings close this attack vector, and this makes them a safer choice if you’re handling format strings generated from user input:

>>> user_input = '${error.__init__.__globals__[SECRET]}'
>>> Template(user_input).substitute(error=err)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/liuxiaowei/opt/anaconda3/lib/python3.9/string.py", line 121, in substitute
    return self.pattern.sub(convert, self.template)
  File "/Users/liuxiaowei/opt/anaconda3/lib/python3.9/string.py", line 118, in convert
    self._invalid(mo)
  File "/Users/liuxiaowei/opt/anaconda3/lib/python3.9/string.py", line 101, in _invalid
    raise ValueError('Invalid placeholder in string: line %d, col %d' %
ValueError: Invalid placeholder in string: line 1, col 1

Which String Formatting Method Should I Use?
I totally get that having so much choice for how to format your strings in Python can feel very confusing. This would be a good time to bust out some flowchart infographic…

But I’m not going to do that. Instead, I’ll try to boil it down to the simple rule of thumb that I apply when I’m writing Python.

Here we go—you can use this rule of thum any time you’re having difficulty deciding which string formatting method to use, depending on the circumstances:

Dan’s Python String Formatting Rule of Thumb:

if your format strings are user-supplied, use Template Strings to avoid security issues. Otherwise, use Literal String Interpolation if you’re on Python3.6+, and “New Style” String Formatting if you’re not.

相关文章
|
2月前
|
存储 Java 索引
Python String详解!
本文详细介绍了Python中的字符串数据类型,包括其创建、访问、切片、反转及格式化等操作。文章涵盖字符串的基本概念、各种操作方法以及常用内置函数。通过多个示例代码展示了如何使用单引号、双引号和三重引号创建字符串,如何通过索引和切片访问与修改字符串内容,以及如何利用格式化方法处理字符串。此外,还介绍了字符串的不可变性及其在实际应用中的重要性。通过本文的学习,读者可以全面掌握Python字符串的使用技巧。
54 4
|
2月前
|
Go C++ Python
Python Tricks: String Conversion(Every Class Needs a ___repr__)
Python Tricks: String Conversion(Every Class Needs a ___repr__)
|
2月前
|
前端开发 Python
Python Tricks-- Abstract Base Classes Keep Inheritance in Check
Python Tricks-- Abstract Base Classes Keep Inheritance in Check
|
2月前
|
C++ Python
Python Tricks--- Object Comparisons:“is” vs “==”
Python Tricks--- Object Comparisons:“is” vs “==”
|
2月前
|
Python
Python Tricks: Nothing to Return Here
Python Tricks: Nothing to Return Here
|
2月前
|
C# Python
Python Tricks : Function Argument Unpacking
Python Tricks : Function Argument Unpacking
|
2月前
|
Python
Python Tricks : How to Write Debuggable Decorators
Python Tricks : How to Write Debuggable Decorators
|
2月前
|
Linux Go Python
Python Tricks :The Power Of Decorators
Python Tricks :The Power Of Decorators
|
2月前
|
API Python
Python Tricks : Fun With args and kwargs
Python Tricks : Fun With args and kwargs
|
2月前
|
Go C# Python
Python Tricks :Lambdas Are Single-Expression Functions 原创
Python Tricks :Lambdas Are Single-Expression Functions 原创