Repost: [Original translation] The Eight Myths of Erlang Performance (Efficiency Guide)

Reposted from: http://www.cnblogs.com/futuredo/archive/2012/10/16/2725770.html



The Eight Myths of Erlang Performance

Erlang/OTP R15B02

1  Myth: Funs are slow


  Yes, funs used to be slow. Very slow. Slower than apply/3. Originally, funs were implemented using nothing more than compiler trickery, ordinary tuples, apply/3, and a great deal of ingenuity.


  But that is ancient history. Funs were given their own data type in the R6B release and were further optimized in the R7B release. Now the cost of a fun call falls roughly between the cost of a call to a local function and the cost of apply/3.
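
  To make the three kinds of call concrete, here is a minimal sketch. The module name fun_calls and the function double/1 are hypothetical names chosen for illustration; the sketch only shows the call forms being compared and does not measure anything.

```erlang
-module(fun_calls).
-export([double/1, demo/0]).

%% An ordinary function. It is exported so that apply/3 can reach it.
double(X) -> 2 * X.

demo() ->
    %% 1. Direct call to a local function (the cheapest of the three).
    A = double(21),
    %% 2. Call through a fun (here, a fun that refers to double/1).
    F = fun double/1,
    B = F(21),
    %% 3. Dynamic call through apply/3.
    C = apply(?MODULE, double, [21]),
    {A, B, C}.
```

  All three calls compute the same value; only the calling mechanism differs.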


2  Myth: List comprehensions are slow


  List comprehensions used to be implemented using funs, and in the bad old days funs were really slow.


  Nowadays the compiler rewrites list comprehensions into an ordinary recursive function. Of course, using a tail-recursive function with a reverse at the end would still be faster. Or would it? That leads us to the next myth.
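
  As a rough illustration of that rewrite, the sketch below puts a list comprehension next to a hand-written recursive function of the same shape. The module and function names are hypothetical, and squares_rec/1 is an approximation written by hand, not actual compiler output.

```erlang
-module(lc_demo).
-export([squares_lc/1, squares_rec/1]).

%% Written as a list comprehension.
squares_lc(List) ->
    [X * X || X <- List].

%% Roughly the kind of ordinary (body-recursive) function that the
%% comprehension above corresponds to.
squares_rec([X | Rest]) ->
    [X * X | squares_rec(Rest)];
squares_rec([]) ->
    [].
```

  Both functions return the same list for the same input.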


3  Myth: Tail-recursive functions are MUCH faster than recursive functions


  According to the myth, recursive functions leave references to dead terms on the stack and the garbage collector will have to copy all those dead terms, while tail-recursive functions immediately discard those terms.


  That used to be true before R7B. In R7B, the compiler started to generate code that overwrites references to terms that will never be used with an empty list, so that the garbage collector would not keep dead values any longer than necessary.


  Even after that optimization, a tail-recursive function would still most of the time be faster than a body-recursive function. Why?


  It has to do with how many words of stack are used in each recursive call. In most cases, a recursive function would use more words on the stack for each recursion than the number of words a tail-recursive function would allocate on the heap. Since more memory is used, the garbage collector will be invoked more frequently, and it will have more work traversing the stack.


  In R12B and later releases, there is an optimization that will in many cases reduce the number of words used on the stack in body-recursive calls, so that a body-recursive list function and a tail-recursive function that calls lists:reverse/1 at the end will use exactly the same amount of memory. lists:map/2, lists:filter/2, list comprehensions, and many other recursive functions now use the same amount of space as their tail-recursive equivalents.


  So which is faster?


  It depends. On Solaris/Sparc, the body-recursive function seems to be slightly faster, even for lists with very many elements. On the x86 architecture, tail-recursion was up to about 30 percent faster.


  So the choice is now mostly a matter of taste. If you really do need the utmost speed, you must measure. You can no longer be absolutely sure that the tail-recursive list function will be the fastest in all circumstances.
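
  One way to measure is sketched below, assuming a simple map-style function written in both styles. The module name recursion_bench and its functions are hypothetical. timer:tc/3 reports elapsed time in microseconds; a single run is noisy, so treat the numbers as a starting point and repeat the measurement on your own data.

```erlang
-module(recursion_bench).
-export([map_body/2, map_tail/2, compare/1]).

%% Body-recursive: the result list is built as the calls return.
map_body(F, [H | T]) ->
    [F(H) | map_body(F, T)];
map_body(_F, []) ->
    [].

%% Tail-recursive: accumulate in reverse order, then reverse once.
map_tail(F, List) ->
    map_tail(F, List, []).

map_tail(F, [H | T], Acc) ->
    map_tail(F, T, [F(H) | Acc]);
map_tail(_F, [], Acc) ->
    lists:reverse(Acc).

%% Time both versions on a list of N integers.
compare(N) ->
    List = lists:seq(1, N),
    F = fun(X) -> X + 1 end,
    {BodyUs, R1} = timer:tc(?MODULE, map_body, [F, List]),
    {TailUs, R2} = timer:tc(?MODULE, map_tail, [F, List]),
    %% Sanity check: both versions must produce the same list.
    true = (R1 =:= R2),
    {{body_us, BodyUs}, {tail_us, TailUs}}.
```

  Calling recursion_bench:compare(1000000) returns the two timings as a tagged tuple.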


  Note: A tail-recursive function that does not need to reverse the list at the end is, of course, faster than a body-recursive function, as are tail-recursive functions that do not construct any terms at all (for instance, a function that sums all integers in a list).
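
  The summing function mentioned in the note could look like the following sketch (the module name sum_demo is a hypothetical name). It carries the running total in an accumulator, so it builds no list and needs no reverse at the end.

```erlang
-module(sum_demo).
-export([sum/1]).

%% Public entry point: start the accumulator at 0.
sum(List) ->
    sum(List, 0).

%% Tail-recursive helper: no terms are constructed along the way.
sum([H | T], Acc) ->
    sum(T, H + Acc);
sum([], Acc) ->
    Acc.
```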


4  Myth: '++' is always bad


  The ++ operator has, somewhat undeservedly, got a very bad reputation. It probably has something to do with code like



naive_reverse([H|T]) ->
    naive_reverse(T)++[H];
naive_reverse([]) ->
    [].

  which is the most inefficient way there is to reverse a list. Since the ++ operator copies its left operand, the result will be copied again and again and again... leading to quadratic complexity.


  On the other hand, using ++ like this



naive_but_ok_reverse([H|T], Acc) ->
    naive_but_ok_reverse(T, [H]++Acc);
naive_but_ok_reverse([], Acc) ->
    Acc.

is not bad. Each list element will only be copied once. The growing result Acc is the right operand for the ++ operator, and it will not be copied.


  Of course, experienced Erlang programmers would actually write



vanilla_reverse([H|T], Acc) ->
    vanilla_reverse(T, [H|Acc]);
vanilla_reverse([], Acc) ->
    Acc.

  which is slightly more efficient because you don't build a list element only to directly copy it. (Or it would be more efficient if the compiler did not automatically rewrite [H]++Acc to [H|Acc].)


5  Myth: Strings are slow


  Actually, string handling can be slow if done improperly. In Erlang, you have to think a little more about how the strings are used and choose an appropriate representation. If you are going to use regular expressions, use the re module instead of the obsolete regexp module.
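
  As a small sketch of both points, the code below uses re:run/3 for matching and builds output as an iolist that is flattened once, instead of repeatedly appending flat strings. The module and function names are hypothetical, and an iolist is only one possible representation; which one is appropriate depends on how the strings are used.

```erlang
-module(string_demo).
-export([first_word/1, build_line/2]).

%% Find the first run of lowercase letters using the re module.
%% Subject may be a string (list) or a binary.
first_word(Subject) ->
    case re:run(Subject, "[a-z]+", [{capture, first, list}]) of
        {match, [Word]} -> {ok, Word};
        nomatch -> none
    end.

%% Build a "Key: Value" line as an iolist and flatten it once.
%% Key and Value may each be a string or a binary.
build_line(Key, Value) ->
    iolist_to_binary([Key, ": ", Value, "\n"]).
```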


6  Myth: Repairing a Dets file is very slow


  The repair time is still proportional to the number of records in the file, but Dets repairs used to be much, much slower in the past. Dets has been massively rewritten and improved.


7  Myth: BEAM is a stack-based byte-code virtual machine (and therefore slow)


  BEAM is a register-based virtual machine. It has 1024 virtual registers that are used for holding temporary values and for passing arguments when calling functions. Variables that need to survive a function call are saved to the stack.


  BEAM is a threaded-code interpreter. Each instruction is a word pointing directly to executable C-code, making instruction dispatching very fast.


8  Myth: Use '_' to speed up your program when a variable is not used


  That was once true, but since R6B the BEAM compiler is quite capable of seeing for itself that a variable is not used.
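
  In practice this means the two clauses sketched below are a matter of readability rather than speed. The module and function names are hypothetical. The leading underscore in _Ignored only tells the compiler not to warn about the unused variable.

```erlang
-module(unused_demo).
-export([first_a/2, first_b/2]).

%% Anonymous variable: the second argument is never bound to a name.
first_a(X, _) ->
    X.

%% Named but unused variable: the name documents what is being ignored.
first_b(X, _Ignored) ->
    X.
```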

