Python Tricks: A Shocking Truth About String Formatting (Part 2)

Continued from Python Tricks: A Shocking Truth About String Formatting (Part 1): https://developer.aliyun.com/article/1618445

#3 - Literal String Interpolation (Python 3.6+)
Python 3.6 adds yet another way to format strings, called Formatted String Literals. This new way of formatting strings lets you use embedded Python expressions inside string constants. Here’s a simple example to give you a feel for the feature (name is the variable from part one):

>>> name = 'Bob'
>>> f'Hello, {name}'
'Hello, Bob'

This new formatting syntax is powerful. Because you can embed arbitrary Python expressions, you can even do inline arithmetic with it, like this:

>>> a = 5
>>> b = 10
>>> f'Five plus ten is {a + b} and not {2 * (a + b)}.'
'Five plus ten is 15 and not 30.'
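
Arithmetic is only one kind of expression that can go inside the braces. As a small sketch, the following uses a made-up user dictionary (not part of the original example) to show method calls, subscripts, and a conditional expression inside f-strings. The inner quotes differ from the outer ones because Python versions before 3.12 do not allow reusing the enclosing quote character inside the braces.

```python
# Hypothetical sample data, invented for this illustration
user = {'name': 'bob', 'langs': ['Python', 'Go']}
count = len(user['langs'])

# Method calls and subscripts are evaluated inside the braces
who = f"{user['name'].title()} knows {count} languages"
# Any expression works, including a conditional or a function call
detail = f"{'more than one' if count > 1 else 'just one'}: {', '.join(user['langs'])}"

print(f"{who}, {detail}.")
```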

Behind the scenes, formatted string literals are a Python parser feature that converts f-strings into a series of string constants and expressions. They then get joined up to build the final string.

Imagine we had the following greet() function that contains an f-string:

>>> def greet(name, question):
...     return f"Hello, {name}! How's it {question}?"
... 
>>> greet('Bob', 'going')
"Hello, Bob! How's it going?"

When we disassemble the function and inspect what’s going on behind the scenes, we can see that the f-string in the function gets transformed into something similar to the following:

>>> def greet(name, question):
...     return "Hello, " + name + "! How's it " + question + "?"

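The real implementation is slightly faster than that because it uses the BUILD_STRING opcode as an optimization. But functionally they’re the same. The listing below disassembles the concatenation version just defined, which is why it shows BINARY_ADD rather than BUILD_STRING: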

>>> import dis
>>> dis.dis(greet)
  2           0 LOAD_CONST               1 ('Hello, ')
              2 LOAD_FAST                0 (name)
              4 BINARY_ADD
              6 LOAD_CONST               2 ("! How's it ")
              8 BINARY_ADD
             10 LOAD_FAST                1 (question)
             12 BINARY_ADD
             14 LOAD_CONST               3 ('?')
             16 BINARY_ADD
             18 RETURN_VALUE
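
To check the BUILD_STRING claim on your own interpreter, a minimal sketch like the following disassembles the f-string version of greet() and collects its opcode names. The helper opcode_names is a name made up for this illustration, and the exact opcodes around BUILD_STRING differ between CPython versions, so treat the printed list as something to inspect rather than a fixed result.

```python
import dis


def greet(name, question):
    return f"Hello, {name}! How's it {question}?"


def opcode_names(func):
    """Return the opcode names in func's bytecode, in order (hypothetical helper)."""
    return [instruction.opname for instruction in dis.get_instructions(func)]


names = opcode_names(greet)
print(names)
# The f-string pieces are joined by one BUILD_STRING instruction
print('BUILD_STRING' in names)
```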

Formatted string literals also support the existing format string syntax of the str.format() method. That allows you to solve the same formatting problems we’ve discussed in the previous two sections (errno is the error number from part one):

>>> errno = 50159747054
>>> f"Hey {name}, there's a {errno:#x} error!"
"Hey Bob, there's a 0xbadc0ffee error!"

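The part after the colon is the same format specification mini-language that str.format() uses. As a brief sketch, here are a few other specifiers applied inside f-strings; the price and width variables are invented for this illustration.

```python
name = 'Bob'
errno = 50159747054
price = 1234.5   # made-up value for the example
width = 10       # made-up field width

as_hex = f"{errno:#x}"          # hexadecimal with the 0x prefix
as_money = f"{price:,.2f}"      # thousands separator, two decimals
padded = f"{name:>{width}}|"    # right-aligned; the width itself is an expression
quoted = f"{name!r}"            # !r applies repr() before formatting

print(as_hex, as_money, padded, quoted, sep='\n')
```
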
Python’s new Formatted String Literals are similar to the JavaScript Template Literals added in ES2015. I think they’re quite a nice addition to the language, and I’ve already started using them in my day-to-day Python 3 work. You can learn more about Formatted String Literals in the official Python documentation.

#4 - Template Strings
One more technique for string formatting in Python is Template Strings. It’s a simpler and less powerful mechanism, but in some cases this might be exactly what you’re looking for.

Let’s take a look at a simple greeting example:

>>> from string import Template
>>> t = Template('Hey, $name!')
>>> t.substitute(name=name)
'Hey, Bob!'

You see here that we need to import the Template class from Python’s built-in string module. Template strings are not a core language feature but they’re supplied by a module in the standard library.

Another difference is that template strings don’t allow format specifiers. So in order to get our error string example to work, we need to transform our int error number into a hex-string ourselves:

>>> templ_string = 'Hey $name, there is a $error error!'
>>> Template(templ_string).substitute(name=name, error=hex(errno))
'Hey Bob, there is a 0xbadc0ffee error!'
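
Since the template itself cannot carry a format specifier, any formatting has to happen before the value is passed in. The sketch below does that with an f-string and also shows the difference between substitute() and safe_substitute() when a placeholder has no value. The ${host} placeholder is an addition made up for this illustration.

```python
from string import Template

errno = 50159747054
template = Template('Hey $name, there is a $error error on ${host}!')

# Format the value first; the template only does plain substitution
values = {'name': 'Bob', 'error': f'{errno:#x}'}

# safe_substitute() leaves placeholders without a value untouched
partial = template.safe_substitute(values)
print(partial)

# substitute() raises KeyError for the same missing placeholder
try:
    template.substitute(values)
except KeyError as exc:
    missing = exc.args[0]
    print('missing placeholder:', missing)
```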

That worked great, but you’re probably wondering when you should use template strings in your Python programs. In my opinion, the best use case for template strings is when you’re handling format strings generated by users of your program. Due to their reduced complexity, template strings are a safer choice.

The more complex formatting mini-languages of other string formatting techniques might introduce security vulnerabilities to your programs. For example, it’s possible for format strings to access arbitrary variables in your program.

That means that if a malicious user can supply a format string, they can also potentially leak secret keys and other sensitive information! Here’s a simple proof of concept of how this attack might be used:

>>> SECRET = 'this-is-a-secret'
>>> class Error:
...     def __init__(self):
...         pass

>>> err = Error()
>>> user_input = '{error.__init__.__globals__[SECRET]}'
# Uh-oh...
>>> user_input.format(error=err)
'this-is-a-secret'

See how the hypothetical attacker was able to extract our secret string by accessing the globals dictionary from the format string? Scary, huh! Template Strings close this attack vector, and this makes them a safer choice if you’re handling format strings generated from user input:

>>> user_input = '${error.__init__.__globals__[SECRET]}'
>>> Template(user_input).substitute(error=err)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/liuxiaowei/opt/anaconda3/lib/python3.9/string.py", line 121, in substitute
    return self.pattern.sub(convert, self.template)
  File "/Users/liuxiaowei/opt/anaconda3/lib/python3.9/string.py", line 118, in convert
    self._invalid(mo)
  File "/Users/liuxiaowei/opt/anaconda3/lib/python3.9/string.py", line 101, in _invalid
    raise ValueError('Invalid placeholder in string: line %d, col %d' %
ValueError: Invalid placeholder in string: line 1, col 1
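
In a real program you would probably not want that traceback to reach the user. One possible way to wrap the call, sketched here under the assumption that returning None for a bad template is acceptable, is shown below. The function name render_user_template is hypothetical and not part of the string module.

```python
from string import Template


def render_user_template(user_template, values):
    """Render an untrusted template string (hypothetical helper).

    Returns None when the template is malformed (ValueError) or refers
    to a name that is not in values (KeyError).
    """
    try:
        return Template(user_template).substitute(values)
    except (KeyError, ValueError):
        return None


class Error:
    def __init__(self):
        pass


err = Error()

ok = render_user_template('Hey, $name!', {'name': 'Bob'})
attack = render_user_template('${error.__init__.__globals__[SECRET]}', {'error': err})
unknown = render_user_template('Hey, $nickname!', {'name': 'Bob'})
print(ok, attack, unknown)
```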

Which String Formatting Method Should I Use?
I totally get that having so much choice for how to format your strings in Python can feel very confusing. This would be a good time to bust out some flowchart infographic…

But I’m not going to do that. Instead, I’ll try to boil it down to the simple rule of thumb that I apply when I’m writing Python.

Here we go. You can use this rule of thumb any time you’re having difficulty deciding which string formatting method to use, depending on the circumstances:

Dan’s Python String Formatting Rule of Thumb:

If your format strings are user-supplied, use Template Strings to avoid security issues. Otherwise, use Literal String Interpolation if you’re on Python 3.6+, and “New Style” String Formatting if you’re not.
