[Erlang 0028] Erlang atom

简介:
 Erlang中atom数据类型能够做的唯一的运算就是比较;在erlang中模块名和方法名都是原子;Atom用来构造Tag-Message,Atom的比较时间是常量的,与Atom的长度无关(如果拿binary做tag,比较时间是线性的);Atom就是为比较而设计,除了比较运算不要把Atom用在别的运算中. Erlang M-F-A方法调用可以做的非常灵活,我们在shell里面操练一下:
复制代码
Eshell V5.9 (abort with ^G)
1> lists:seq(1,5).
[1,2,3,4,5]
2> L=lists.
lists
3> S=seq.
seq
4> L:S(1,5).
[1,2,3,4,5]
5> L:seq(1,5).
[1,2,3,4,5]
6> L2=list_to_atom("list" ++"s").
lists
7> L2:seq(1,5).
[1,2,3,4,5]
8> apply(list_to_atom("li"++"sts"),seq,[1,5]).
[1,2,3,4,5]
9>
复制代码

 
Erlang atom 不参与垃圾回收,一旦创建就不会被移除掉;一旦超出atom的数量限制(默认是1048576) VM就会终止掉.对于一个会持续运行很久的系统,把任意字符串转成atom是很危险的,内存会慢慢被吃光.如果使用的原子是在预期范围内的,比如协议模块的名称,那么可以使用list_to_existing_atom来进行防范,这个方法所产出的atom必须是之前已经创建过的.
 
 
复制代码
%% list_to_existing_atom demo

Eshell V5.9 (abort with ^G)
1> list_to_existing_atom("player_1").
** exception error: bad argument %尝试调用 list_to_existing_atom("player_1").由于player_1的原子之前没有被创建过
in function list_to_existing_atom/1 %这里报错了 异常是bad argument
called as list_to_existing_atom("player_1")
2> list_to_atom("player_1"). %创建一下player_1
player_1
3> list_to_existing_atom("player_1"). %再次调用list_to_existing_atom就是对的了
player_1
4> list_to_existing_atom("player_2").
** exception error: bad argument
in function list_to_existing_atom/1
called as list_to_existing_atom("player_2")
5> player_2. %也可以这样创建原子
player_2
6> list_to_existing_atom("player_2"). %这时调用也是对的
player_2
7>
复制代码

我们可以使用 string:tokens( binary_to_list(erlang:system_info(info)),"\n")在shell中看一下atom的使用情况,输出的片段中恰好包含了原子内存使用的情况,当前数量和数量限制;想看完整的输出?穿越到这里
复制代码
%% list_to_atom demo limit

Eshell V5.9 (abort with ^G)
1> string:tokens( binary_to_list(erlang:system_info(info)),"\n").
["=memory","total: 4331920","processes: 438877",
"processes_used: 438862","system: 3893043","atom: 146321",
"atom_used: 119102","binary: 327936","code: 1929551",
"ets: 123308","=hash_table:atom_tab","size: 4813",
"used: 3508","objs: 6410","depth: 7",
"=index_table:atom_tab","size: 7168","limit: 1048576",
"entries: 6410","=hash_table:module_code","size: 97",
"used: 52","objs: 72","depth: 4","=index_table:module_code",
"size: 1024","limit: 65536","entries: 72",
[...]|...]
2> [list_to_atom("player_"++integer_to_list(Item)) || Item <- lists:seq(1,1000000) ].
[player_1,player_2,player_3,player_4,player_5,player_6,
player_7,player_8,player_9,player_10,player_11,player_12,
player_13,player_14,player_15,player_16,player_17,player_18,
player_19,player_20,player_21,player_22,player_23,player_24,
player_25,player_26,player_27,player_28,player_29|...]
3> string:tokens( binary_to_list(erlang:system_info(info)),"\n").
["=memory","total: 93955448","processes: 37830214",
"processes_used: 37830214","system: 56125234",
"atom: 20296629","atom_used: 20279627","binary: 360656",
"code: 1965264","ets: 124260","=hash_table:atom_tab",
"size: 823117","used: 627221","objs: 1006479","depth: 7",
"=index_table:atom_tab","size: 1006592","limit: 1048576",
"entries: 1006479","=hash_table:module_code","size: 97",
"used: 54","objs: 74","depth: 4","=index_table:module_code",
"size: 1024","limit: 65536","entries: 74",
[...]|...]
4> [list_to_atom("player_"++integer_to_list(Item)) || Item <- lists:seq(1,1000000) ].
[player_1,player_2,player_3,player_4,player_5,player_6,
player_7,player_8,player_9,player_10,player_11,player_12,
player_13,player_14,player_15,player_16,player_17,player_18,
player_19,player_20,player_21,player_22,player_23,player_24,
player_25,player_26,player_27,player_28,player_29|...]
5> string:tokens( binary_to_list(erlang:system_info(info)),"\n").
["=memory","total: 98839096","processes: 42712630",
"processes_used: 42712630","system: 56126466",
"atom: 20296629","atom_used: 20279627","binary: 361888",
"code: 1965264","ets: 124260","=hash_table:atom_tab",
"size: 823117","used: 627221","objs: 1006479","depth: 7",
"=index_table:atom_tab","size: 1006592","limit: 1048576", %注意再次调用的时候这里没有变化
"entries: 1006479","=hash_table:module_code","size: 97",
"used: 54","objs: 74","depth: 4","=index_table:module_code",
"size: 1024","limit: 65536","entries: 74",
[...]|...]
6>
复制代码
我们挑战一下atom的数量上限,  [list_to_atom("player_"++integer_to_list(Item)) || Item <- lists:seq(1,100000000) ].只要运行这个就可以了,不久我们就看到下面的提示:

所以在How to Crash Erlang 一文中,无节制使用atom名列前茅:

Run out of atoms. Atoms in Erlang are analogous to symbols in Lisp--that is, symbolic, non-string identifiers that make code more readable, like green or unknown_value--with one exception. Atoms in Erlang are not garbage collected. Once an atom has been created, it lives as long as the Erlang node is running. An easy way to crash the Erlang virtual machine is to loop from 1 to some large number, calling integer_to_list and then list_to_atom on the current loop index. The atom table will fill up with unused entries, eventually bringing the runtime system to halt. 

Why is this is allowed? Because garbage collecting atoms would involve a pass over all data in all processes, something the garbage collector was specifically designed to avoid. And in practice, running out of atoms will only happen if you write code that's generating new atoms on the fly. 

 

Note:

Atoms are really nice and a great way to send messages or represent constants. However there are pitfalls to using atoms for too many things: an atom is referred to in an "atom table" which consumes memory (4 bytes/atom in a 32-bit system, 8 bytes/atom in a 64-bit system). The atom table is not garbage collected, and so atoms will accumulate until the system tips over, either from memory usage or because 1048577 atoms were declared.

This means atoms should not be generated dynamically for whatever reason; if your system has to be reliable and user input lets someone crash it at will by telling it to create atoms, you're in serious trouble. Atoms should be seen as tools for the developer because honestly, it's what they are.

 

 Even though bit strings are pretty light, you should avoid using them to tag values. It could be tempting to use string literals to say {<<"temperature">>,50}, but always use atoms when doing that. atoms were said to be taking only 4 or 8 bytes in space, no matter how long they are. By using them, you'll have basically no overhead when copying data from function to function or sending it to another Erlang node on another server. 
Conversely, do not use atoms to replace strings because they are lighter. Strings can be manipulated (splitting, regular expressions, etc) while atoms can only be compared and nothing else.

link :http://learnyousomeerlang.com/starting-out-for-real 

 

Using list_to_atom/1 to construct an atom that is passed to apply/3 like this

apply(list_to_atom("some_prefix"++Var), foo, Args)

is quite expensive and is not recommended in time-critical code.

 

2012-08-30 更新

Note: If you remember earlier texts, atoms can be used in a limited (though high) number. You shouldn't ever create dynamic atoms. This means naming processes should be reserved to important services unique to an instance of the VM and processes that should be there for the whole time your application runs.

If you need named processes but they are transient or there isn't any of them which can be unique to the VM, it may mean they need to be represented as a group instead. Linking and restarting them together if they crash might be the sane option, rather than trying to use dynamic names.

2014-8-5 11:44:00更新 

Intent
Prevent the atom table from filling up
Motivation
The VM will crash if you use too many atoms (by default 1048576)
Atoms are created in many ways:
   hand-written code (modules, functions, intentional atoms)
   generated code (e.g. ASN.1 compiler, yecc)
   reading files (config, file:consult)
   parsing (e.g. XML, JSON, ...)
Recommendation
Don’t use list_to_atom/1 and beware of libraries that do (e.g.xmerl). Use list_to_existing_atom/1 or tag tuples with strings/binaries.

 

如果你在erl_crash.dump看到了"no more index entries in atom_tab (max=1048576)."的信息,那就crashdump_viewer:start/0 检查一下看看什么逻辑创建了那么多的原子吧 

目录
相关文章
|
Shell Perl
erlang 小技巧总结
开一个页面总结一些erlang的使用技巧,随时添加新的技巧。
|
消息中间件 网络协议 Go
[Erlang 0123] Erlang EPMD
 epmd进程和Erlang节点进程如影随形,在Rabbitmq集群,Ejabberd集群,Couchbase集群产品文档中都会有相当多的内容讲epmd,epmd是什么呢?   epmd 是Erlang Port Mapper Daemon的缩写,全称足够明确表达它的功能了(相比之下,OTP就.
2063 0
[Erlang 0125] Know a little Erlang opcode
  Erlang源代码编译为beam文件,代码要经过一系列的过程(见下面的简图),Core Erlang之前已经简单介绍过了Core Erlang,代码转换为Core Erlang,就容易拨开一些语法糖的真面目了.
2302 0
|
编解码 计算机视觉 存储
|
JavaScript 前端开发 Java
|
网络协议 Shell Go