What Is Metaprogramming?
Metaprogramming is the use of code to modify or create other code.
It is primarily a developer tool and acts as a force multiplier, allowing large amounts of predictable code to be generated from just a few statements in the host language (or “metalanguage”). It is extremely useful for automating repetitive, boilerplate code.
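First, what is metaprogramming?
Ordinary programming uses code to modify data; metaprogramming uses code to create and modify code.
Why is metaprogramming needed?
Every language is designed around a set of basic, abstract keywords (its schema) that programmers can use directly.
But that schema is deliberately minimal, so it cannot cover every need.
In day-to-day development, programmers still run into requirements such as these:
A piece of code recurs over and over and could serve as boilerplate, yet you do not want to, or cannot, wrap it in a function (to avoid function-call overhead, to keep arguments from being evaluated early, or for other reasons).
Or a particular domain calls for a DSL.
In such cases programmers want to modify and extend the language's schema, that is, to use code to create and modify code. This is called metaprogramming.
Metaprogramming is therefore an advanced feature; not every programmer needs it or thinks to use it.
After all, it only makes code cleaner and easier to maintain; it is not indispensable.
But for advanced programmers, code should be beautiful as well as functional, so most languages support metaprogramming to some degree.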
Most programming languages support some form of metaprogramming.
C has a preprocessor (definitions and macros)
C++ has templates
Java has annotations and aspect-oriented programming extensions
Scripting languages have “eval” statements.
Most languages have some sort of API that can be used to introspect or modify the core language features (such as classes and methods). As a last resort, any language can be used to build source code using string manipulation and then feed it to a compiler.
Code vs. Data
For most languages, treating code as data or data as code is a more or less cumbersome process.
One common strategy is to treat code as a textual string.
Another strategy is to provide a set of APIs that expose the concepts of a programming language as objects within the language, allowing the programmer to make calls such as createClass() or addMethod(), to build code structures programmatically.
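In most languages, code and data are entirely different things, so metaprogramming requires treating code as data, which is cumbersome.
Either you treat code as strings, concatenate them, and hand the result to the compiler, which is tedious and error-prone,
or you implement a set of APIs that wrap and generate code, which is more usable but still fairly cumbersome.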
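To make the string-based strategy concrete, here is a minimal sketch in Clojure itself. read-string and eval are standard clojure.core functions; the names source-text and string-result are hypothetical, chosen only for illustration.

```clojure
;; Strategy 1: build source code as text, then parse and evaluate it.
;; Nothing checks the string until read-string parses it, so a missing
;; parenthesis or space only shows up at that point.
(def source-text (str "(+ " 1 " " 2 ")"))   ; the text "(+ 1 2)"

(def string-result (eval (read-string source-text)))
```

The concatenation carries no structure: the pieces are only characters until the reader parses them, which is why this approach tends to be error-prone.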
Homoiconicity, Code = Data
Clojure (and other Lisps) provide a third way of handling the code/data distinction: there is no distinction. In Clojure, all code is data and all data is code.
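For Clojure, though, this is very simple.
Other languages have a lot of complex syntax, so modifying and generating code is a complicated affair.
Clojure, by contrast, is said to have almost no syntax: code is written as lists, the same data structure used for data, so there is no difference between the two and manipulating code becomes easy.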
(println "Hello, world") ;code
'(println "Hello, world") ;data
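As a small illustration of code being ordinary data, the sketch below inspects a quoted form and builds a new one with plain sequence functions. The names form, sum-form, and sum-result are hypothetical.

```clojure
;; A quoted form is an ordinary list that can be inspected like any data.
(def form '(println "Hello, world"))

(first form)   ; the symbol println
(count form)   ; two elements: a symbol and a string

;; Build a new form with ordinary sequence functions, then evaluate it.
(def sum-form (cons '+ [1 2 3]))   ; the list (+ 1 2 3)
(def sum-result (eval sum-form))
```

No parser or code-generation API is involved; cons and first are the same functions used on any other list.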
Macros
Macros are the primary means of metaprogramming in Clojure.
A Clojure macro is a construct which can be used to transform or replace code before it is compiled. Syntactically, they look a lot like functions, but with several crucial distinctions:
• Macros shouldn't return values directly, but a form.
• Arguments to macros are passed in without being evaluated.
• Macros are evaluated only at compile-time.
When you use a macro in your code, what you are really telling Clojure to do is to replace your macro expression with the expression returned by the macro.
This is a powerful means of abstraction, and is very useful for implementing control structures or eliminating boilerplate or "wrapper" code.
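Macros are Clojure's tool for metaprogramming. The key thing to remember is that a macro runs only at compile time, and the macro call is replaced by the form it returns; unlike an ordinary function, it is not evaluated at run time.
For example, define a macro triple-do that executes a form three times. The call
(triple-do (println "Hello"))
is compiled as this expression:
(do (println "Hello") (println "Hello") (println "Hello"))
As this shows, a macro can make code more concise and save the cost of function calls.
At compile time the compiler replaces the macro call with the generated code, so the code that finally runs is no different from code written out by hand.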
Working with Macros
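A macro is defined with defmacro, which really defines a function and registers it as a macro, so its parameters are the same as those of defn.
It is a special kind of function, though: it is executed only by the compiler and must return a valid form, which the compiler substitutes for the macro call.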
To create a macro, use the defmacro macro.
This defines a function and registers it as a macro with the Clojure compiler.
From then on, when the compiler encounters the macro, it will call the function and use the return value instead of the original expression.
defmacro takes basically the same arguments as defn:
a name, an optional documentation string, a vector of arguments, and a body.
As previously mentioned, the body should evaluate to a valid Clojure form. If the form returned by the macro function is syntactically invalid, it will cause an error wherever it is used.
The definition of triple-do:
(defmacro triple-do [form]
  (list 'do form form form))
Note that do is quoted, so it is added to the resultant list as a symbol, rather than being evaluated in place in the body of the macro.
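list is a function, so all of its arguments are evaluated first; do must therefore be quoted to prevent evaluation. form, on the other hand, must be evaluated, otherwise the resulting list would literally be (do form form form).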
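The following sketch makes the difference between a macro argument and a function argument observable. The atoms macro-counter and fn-counter and the function triple-fn are hypothetical helpers introduced only for this comparison.

```clojure
(defmacro triple-do [form]
  (list 'do form form form))

;; Hypothetical counters used only to observe how often the argument runs.
(def macro-counter (atom 0))
(def fn-counter (atom 0))

;; The macro receives the unevaluated form and pastes it in three times.
(triple-do (swap! macro-counter inc))

;; A function receives the already-evaluated value, so the side effect
;; happens once no matter how often the value is used.
(defn triple-fn [value]
  [value value value])

(triple-fn (swap! fn-counter inc))
```

After running this, macro-counter holds 3 and fn-counter holds 1.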
Debugging Macros
Using macros can be somewhat mind-bending, since you have to keep in mind not only the code you're writing, but the code you're generating. Clojure provides two functions that help debug macros as you write them: macroexpand and macroexpand-1.
macroexpand expands the given form repeatedly until it is no longer a macro expression.
macroexpand-1 expands the expression only once.
(macroexpand '(triple-do (println "test"))) ; the quote prevents the form from being evaluated early
(do (println "test") (println "test") (println "test"))
clojure.walk/macroexpand-all, unlike macroexpand or macroexpand-1, recursively expands every macro it can find, including those nested in subforms, until none are left.
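The three expansion functions can be compared side by side. This is a sketch under one assumption: my-when-not is a hypothetical macro, defined here only because its expansion contains another macro (when).

```clojure
(require '[clojure.walk :as walk])

;; Hypothetical macro whose expansion is itself a macro call (when).
(defmacro my-when-not [test & body]
  `(when (not ~test) ~@body))

;; One step: my-when-not becomes a when form.
(def one-step (macroexpand-1 '(my-when-not a b)))

;; Repeated steps at the top level: when is expanded as well.
(def top-level (macroexpand '(my-when-not a b)))

;; macroexpand only looks at the outermost form, so a macro call nested
;; inside a do is left alone, while macroexpand-all reaches it.
(def untouched (macroexpand '(do (my-when-not a b))))
(def everywhere (walk/macroexpand-all '(do (my-when-not a b))))
```

Here one-step is (clojure.core/when (clojure.core/not a) b), top-level is (if (clojure.core/not a) (do b)), untouched is the original do form, and everywhere has the nested call fully expanded.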
Code Templating
Manually creating forms to return from macro functions can sometimes be tedious. Worse, with complex macros it can be difficult to determine what the output form will actually be.
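Why did the macro definition above build its return value with list instead of simply returning the quoted literal '(do form form form)?
The quoted literal is more intuitive, but form has to be replaced by the actual code that was passed in, and that requires evaluation by a function such as list.
For more complex code, though, the literal style is still easier to write. How can the two be reconciled?
Clojure provides a templating system based on ` (the syntax-quote, or backquote). Its main difference from the ordinary ' is that inside ` you can use the unquote symbol (the tilde, ~) to evaluate a value.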
The templating system is based around the syntax-quote character, a backquote: `.
(defmacro template-triple-do [form] `(do ~form ~form ~form))
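Comparing this with the list-based implementation above, the template has two advantages:
1. The literal form is easier to read, especially for complex code.
2. The two approaches mark opposite things: with list you use ' to mark the parts that must not be evaluated, whereas with a template you use ~ to mark the parts that must be evaluated.
This example is a special case; in most code, the parts that need no evaluation far outnumber those that do.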
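The three ways of building a form can be compared directly at the REPL. In this sketch x, quoted, built, and templated are hypothetical names.

```clojure
(def x 5)   ; hypothetical value to place into the forms below

;; Plain quote: nothing inside is evaluated, so x stays a symbol.
(def quoted '(+ x 1))

;; list: every argument is evaluated, so + must be quoted by hand.
(def built (list '+ x 1))

;; Syntax-quote: nothing is evaluated except what is marked with ~.
;; It also namespace-qualifies symbols, so + becomes clojure.core/+.
(def templated `(+ ~x 1))
```

The results are (+ x 1), (+ 5 1), and (clojure.core/+ 5 1) respectively. The namespace qualification is a second difference between ` and ', and it helps keep a macro's expansion from resolving to a different + at the call site.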
Splicing Unquotes
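Unquoting sequences within a syntax-quote doesn't always work out quite as intended. Sometimes, it is desirable to insert the contents of a sequence into the templated list, rather than the list itself.
(defmacro template-infix [form]
  `(~(second form) ~(first form) ~(nnext form)))
(macroexpand '(template-infix (1 + 3)))
(+ 1 (3)) ; what we actually want is (+ 1 3)
The example shows the problem directly: when code is pieced together, an extra pair of list parentheses sometimes appears.
Here nnext returns a list, so the 3 ends up wrapped in an extra pair of parentheses.
The solution is a special marker saying that what is wanted is the contents of the list, not the list itself.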
To insert the contents of a list, use the splicing unquote, denoted by ~@.
(defmacro template-infix [form]
  `(~(second form) ~(first form) ~@(nnext form)))
(macroexpand '(template-infix (1 + 3)))
(+ 1 3)
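The difference between ~ and ~@ can also be seen outside any macro, since syntax-quote works on its own. The names args, nested, and spliced below are hypothetical.

```clojure
(def args [1 2 3])   ; hypothetical argument sequence

;; ~ inserts the vector itself as a single element.
(def nested `(+ ~args))

;; ~@ splices the elements of the vector into the enclosing list.
(def spliced `(+ ~@args))
```

nested is (clojure.core/+ [1 2 3]), which fails when evaluated because + is applied to a vector; spliced is (clojure.core/+ 1 2 3), which evaluates to 6.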
Generating Symbols
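A macro usually needs local variables, but you do not know the context in which the macro will be used. After the compiler substitutes the generated code, a local name inside the macro may well clash with a local name in the surrounding code.
The simplest solution, of course, is not to use local variables at all.
That is not realistic, so the actual approach is to append # to the name inside a ` form; the compiler then generates a unique name for that variable so that no clash can occur.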
Within any syntax-quoted form (forms using the back-tick, `), you can append the # character to the end of any local symbol name, and when the macro is expanded, it will replace the symbol with a randomly generated symbol that is guaranteed not to conflict with anything, and which will match any other symbol created with auto gensym in the same syntax-quote template.
(defmacro debug-println [expr]
  `(let [result# ~expr]
     (println (str "Value is: " result#))
     result#))
(macroexpand-1 '(debug-println (/ 4 3))) ; macroexpand-1 leaves the let itself unexpanded; the number in the generated name varies
(clojure.core/let [result__2349__auto__ (/ 4 3)] (clojure.core/println (clojure.core/str "Value is: " result__2349__auto__)) result__2349__auto__)
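To see the name clash that auto-gensym prevents, the sketch below deliberately writes a macro that forces a fixed local name. Both bad-twice and good-twice are hypothetical macros; ~'x is the idiom for inserting an unqualified symbol into a syntax-quoted form.

```clojure
;; Deliberately broken: ~'x forces the literal local name x into the
;; expansion, so it shadows any x the caller passes in.
(defmacro bad-twice [expr]
  `(let [~'x 2]
     (* ~'x ~expr)))

;; Safe: x# becomes a unique generated symbol in the expansion.
(defmacro good-twice [expr]
  `(let [x# 2]
     (* x# ~expr)))

(def captured (let [x 10] (bad-twice x)))    ; the caller's x is shadowed
(def hygienic (let [x 10] (good-twice x)))   ; the caller's x is used
```

captured is 4 because the expression x inside the expansion refers to the macro's own binding, while hygienic is 20 as intended.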
When to Use Macros
Macros are extremely powerful and allow you to control and abstract code in ways that would not be otherwise possible. However, using them does come at a cost. They operate at a higher level of abstraction, and so they are significantly more difficult to reason about than normal code. If a problem occurs, it can be much trickier to debug, since there's an extra level of indirection between where the problem actually is, and where the error message originates.
Macros are very powerful and make it easy to control and abstract code. But there is no free lunch: code that uses macros is harder to understand and reason about, and harder to debug when a problem occurs. So the best way to use macros is to use them as little as possible; an ultimate move is impressive precisely because you cannot fire it off all the time. Whatever can be done with a function should be done with a function. Still, some scenarios in which macros are required are listed below.
Therefore, the best way to use macros is to use them as little as possible. A few macros go a long way. Most things you need macros for (including some of the examples in this chapter) could also be accomplished with first-class functions. When you can, do that instead, and don't use macros.
That said, there are certain situations where using a macro is the best, easiest, or the only way to accomplish a given task. Usually, they fall into one of the following categories:
• Implement control structures: One of the main differences between macros and functions is that the arguments of macros are not evaluated. If you need to write a control structure that might not evaluate some of its parameters, it has to be a macro.
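Because a function evaluates its arguments automatically, a macro is required to control the logical structure of code.
A typical use of macros in another language: in C, macros are used to control and switch between code for multiple products.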
• Wrap def or defn: Usually, you only want to call def or defn at compile time. Calling them programmatically while a program is running is usually a recipe for disaster. So, if you need to wrap their behavior in additional logic, the best place to do it is usually a macro.
• Performance: Because they are expanded at compile time, using a macro can be faster than calling a function. Usually, this doesn't make much of a difference, but in extremely tight loops, you can sometimes eke out performance by eliminating a function call or two and using macros instead.
• Codify reoccurring patterns: Macros can be used to formalize any commonly occurring pattern in your code. In essence, macros are your means of modifying the language itself to suit your needs. Macros aren't the only way to do this, but they can sometimes do it in a way that is least invasive to other parts of your code.
The greatest strength of Clojure, and of Lisp in general, is how easily it can produce DSLs.
Using Macros: Examples
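As an illustration of the "wrap def or defn" category, here is a minimal sketch of a macro that adds logic around def. The macro def-logged and the var answer are hypothetical names, not part of any library.

```clojure
;; Hypothetical macro: announce a definition, then perform it.
;; '~sym expands to (quote <symbol>), so the name is printed rather
;; than evaluated.
(defmacro def-logged [sym value]
  `(do
     (println "defining" '~sym)
     (def ~sym ~value)))

(def-logged answer 42)
```

Because the macro expands to an ordinary def, the var is created as part of compiling and loading the code rather than by a function that calls def at an arbitrary point at run time.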
Implementing a Control Structure
Consider a control form which takes two expressions and executes only one of them randomly.
This might be used in a game, or in an artificial intelligence implementation.
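The example executes one of two forms at random, that is, it changes the logical structure of the code.
(defmacro rand-expr [form1 form2]
  `(let [n# (rand-int 2)]
     (if (zero? n#) ~form1 ~form2)))
(rand-expr (println "A") (println "B")) ; prints either A or B, for example:
B
With a function, the arguments would be evaluated automatically and both A and B would be printed first, so this has to be a macro.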
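The claim that a function cannot do this can be checked with a side-effecting argument. In the sketch below, rand-expr-fn and the two counters are hypothetical; rand-expr is repeated so the block is self-contained.

```clojure
(defmacro rand-expr [form1 form2]
  `(let [n# (rand-int 2)]
     (if (zero? n#) ~form1 ~form2)))

;; Hypothetical function version: both arguments are evaluated before
;; the body runs, so the random choice comes too late.
(defn rand-expr-fn [value1 value2]
  (if (zero? (rand-int 2)) value1 value2))

(def fn-calls (atom 0))
(def macro-calls (atom 0))

(rand-expr-fn (swap! fn-calls inc) (swap! fn-calls inc))
(rand-expr (swap! macro-calls inc) (swap! macro-calls inc))
```

Whichever branch the random number selects, fn-calls ends at 2 and macro-calls ends at 1.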
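Taking this further, how would a variable number of arguments be supported?
(defmacro rand-expr-multi [& forms] …)
This is somewhat more involved, so first sketch the code the macro should return:
(let [ct <number of expressions>]
  (case (rand-int ct)
    0 (println "A")
    1 (println "B")
    2 (println "C")))
The final macro is shown below. It demonstrates the power of writing macros in Clojure: because code is data, the sequence function interleave can be used directly instead of writing the clauses out one line at a time.
(defmacro rand-expr-multi [& exprs]
  `(let [ct# ~(count exprs)]
     (case (rand-int ct#)
       ~@(interleave (range (count exprs)) exprs))))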
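A short usage sketch for the variadic version follows. The macro is repeated so the block runs on its own; clauses, picked, and evaluated are hypothetical names.

```clojure
(defmacro rand-expr-multi [& exprs]
  `(let [ct# ~(count exprs)]
     (case (rand-int ct#)
       ~@(interleave (range (count exprs)) exprs))))

;; interleave pairs each index with its expression, which is exactly
;; the flat clause list that case expects.
(def clauses (interleave (range 3) '[(println "A") (println "B") (println "C")]))

;; Exactly one of the expressions is evaluated per call.
(def evaluated (atom 0))
(def picked
  (rand-expr-multi
    (do (swap! evaluated inc) :a)
    (do (swap! evaluated inc) :b)
    (do (swap! evaluated inc) :c)))
```

picked is one of :a, :b, or :c, and evaluated is 1 regardless of which branch was chosen.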
Implementing a Macro Using Recursion
Macros can also be applied recursively.
As an example, consider a custom macro, ++, which can be used instead of +, and which automatically replaces multiargument addition expressions with nested binary expressions which perform slightly better in Clojure.
(++ 1 2 3 4 5) ==> (+ 1 (+ 2 (+ 3 (+ 4 5))))
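++ exists purely for convenience of writing. How can it be implemented with a macro?
(defmacro ++ [& exprs]
  (if (>= 2 (count exprs))
    `(+ ~@exprs)
    `(+ ~(first exprs) (++ ~@(rest exprs)))))
++ is implemented through macro recursion, which is convenient. A macro is itself a special kind of function, so it is natural that it supports recursion; more precisely, the expansion contains another ++ call, which the compiler expands in turn.
Note that debugging a recursive macro requires macroexpand-all: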
(macroexpand '(++ 1 2 3 4))
(clojure.core/+ 1 (user/++ 2 3 4))
(clojure.walk/macroexpand-all '(++ 1 2 3 4))
(clojure.core/+ 1 (clojure.core/+ 2 (clojure.core/+ 3 4)))
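The expansion can be checked end to end with a runnable sketch of ++. The macro is written out in full so the block is self-contained; expanded and total are hypothetical names.

```clojure
(require '[clojure.walk :as walk])

;; With two or fewer arguments, emit a plain +. Otherwise peel off the
;; first argument and leave a smaller ++ call for the next expansion.
(defmacro ++ [& exprs]
  (if (>= 2 (count exprs))
    `(+ ~@exprs)
    `(+ ~(first exprs) (++ ~@(rest exprs)))))

(def expanded (walk/macroexpand-all '(++ 1 2 3 4)))
(def total (++ 1 2 3 4 5))
```

expanded is the fully nested form of binary additions and total is 15. Note that the first argument is inserted with ~ rather than ~@, since it is a single expression and not a sequence to splice.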
Using Macros to Create DSLs
One common use of macros is to generate custom DSLs. Using macros, a few simple, intuitive expressions can generate much more bulky, complex code without exposing it to the user.
The potential use for DSLs in Clojure is unlimited.
Compojure (a web framework for Clojure) allows the user to define web application paths and RESTful APIs using a simple, immediately understandable DSL syntax.
Incanter provides a DSL based on the R programming language that is incredibly succinct and useful for doing statistics and building charts.
As the simplest possible DSL example, an xml macro converts Clojure data into XML:
(defn xml-helper [form]
  (if (not (seq? form))
    (str form)
    (let [name (first form)
          children (rest form)]
      (str "<" name ">"
           (apply str (map xml-helper children))
           "</" name ">"))))
(defmacro xml [form]
  (xml-helper form))
With the xml macro defined, it can be used directly for the conversion:
(xml (book (authors (author "Luke") (author "Stuart"))))
<book><authors><author>Luke</author><author>Stuart</author></authors></book>
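What the macro adds over the helper function can be seen by calling both. This sketch repeats the two definitions so it runs on its own; via-function and expansion are hypothetical names.

```clojure
(defn xml-helper [form]
  (if (not (seq? form))
    (str form)
    (let [name (first form)
          children (rest form)]
      (str "<" name ">"
           (apply str (map xml-helper children))
           "</" name ">"))))

(defmacro xml [form]
  (xml-helper form))

;; Called as a function, the data has to be quoted by the caller.
(def via-function (xml-helper '(author "Luke")))

;; The macro receives the form unevaluated, so no quote is needed, and
;; its expansion is already the finished string.
(def expansion (macroexpand-1 '(xml (author "Luke"))))
```

Both produce the string <author>Luke</author>. With the macro, the conversion happens at compile time and the compiled code contains only the resulting string.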
This article is excerpted from cnblogs (博客园). Original publication date: 2013-02-27