Creating an Index
The basic format of Elasticsearch's RESTful API:
http://<ip>:<port>/<index>/<type>/<document id>
- Common HTTP verbs: GET / POST / PUT / DELETE
- Creating an unstructured index:
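In es-head this is done through its index-creation dialog; with any HTTP client, the equivalent request is roughly the following sketch (the host is a placeholder, and the shard/replica counts match the index info shown below):

```bash
# Create an index named test_index with 5 shards and 1 replica and no
# mappings, i.e. an unstructured index
curl -X PUT "http://<ip>:<port>/test_index" \
  -H 'Content-Type: application/json' \
  -d '{ "settings": { "number_of_shards": 5, "number_of_replicas": 1 } }'
```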
After the index is created, the shard information can be seen on the Overview page; the thin-bordered shards are backups, or rather replicas, of the thick-bordered (primary) shards:
We can tell whether the index is unstructured or structured by viewing its index info:
" state ”:" open ”, " settings ”:{ " index ”:{ " creation _ date ”:"1529460497345”," number _ of _ shards ”:"5", " number _ of _ replicas ”:"1”, " uuid ”:"1gB5wr0uQSuHyiaGrF1wAg”, “ version ”:{ “ created ”:“s05029g” " provided _ name :“ test index " mappings ”:{}," aliases ”:[] " primary _ terms ": 该项的值为空,则代表是非结构化的素引 “ in _ sync _ allocations :{ O :[ “RFc9heIfRHuoAtnJ0me7Hg” “ y _G106DOTWGnSP6zuCp- fw ” "1":[ “y9zFE-AJTDaDrcvVOR85NA” “Boq2XuQeTQqB05k66P_ Dtg ”
- Above, we created an unstructured index and saw how to view its index info. Next let's create a structured index: go to the Compound Query page, specify an index, write the structured data, and submit the request:
After it succeeds, view the index info again: you can see the structured data we wrote. This is a structured index:
" state ”:" open ”, " settings ":{ " index ": t " creation _ date ”:"1529460497345", " number _ of _ shards ":"5", " number _ of _ replicas ”:"1”, " uuid ”:"1gB5wr0uQSuHyiaGrF1wAg”, ' version ”:{ " created ”:“s050299” “ provided _ name ”:“ test _ index ” " mappings ":{ " test ”:{ " properties ":{ “ title ”:{ “ type ”:“ text ” 映射了结构化的数据 aliases :[ primary _ terms ":{ " in _ sync _ allocations ”:{ “ O : “RFc9heIfRHuoAtnJ0me7Hg”
- Honestly, using the es-head plugin to create structured indexes isn't very convenient: writing the structured JSON by hand is painful without smart formatting. We can also use Postman to create structured indexes; in fact, just about any tool that can simulate HTTP requests can be used to create structured ES indexes. For example:
" settings ":{ " number _ of _ shards ":3, " number _ of _ replicas ":1 “ mappings ":{ " man ”:{ " properties ":{ " name ": t Cype ”: “ text ” " country ”: " type ":" keyword " " age ":{ " type ":" integer " " date ":{ " type ":" date ", “ format ”:“ yyyy - MM - dd HH : mm : ss | yyyy - MM - dd epoch _mi11is”
The ip and port in the screenshot above are those of the ES service, and people is the name of the index to create.
The mapping JSON is as follows:
{ "settings": { "number_of_shards": 3, // 分片的数量 "number_of_replicas": 1 // 副本的数量 }, "mappings": { // 索引所映射的结构化数据 "man": { "properties": { "name": { "type": "text" }, "country": { "type": "keyword" }, "age": { "type": "integer" }, "date": { "type": "date", "format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis" } } }, "woman": { } } }
The response returned after the request succeeds:
" acknowledged ": true , " shards _ acknowledged ": true
Back on es-head's Overview page, you can see the newly created index:
“ state ":" open ”, " settings ":{ " index ”:{ " creation _ date ”:“1529462295858”," number _ of _ shards ":"3", " number _ of _ replicas ”:"1' " uuid ”:"nkvECvfhTpSCD0pEni2n1g”," version ”:{ " created ”:“505029g"" provided _ name ”:" people ” " mappings ":{ " woman ”:{}, " man ": " properties ":{ " date ":{ " format ”:" yyyy - MM - dd HH : mm : ss lIyyyy - MM - dd | lepoch _ millis ”,“ type ”:" date ” " country ”:{ " type ”:" keyword ” v " name ”:- “ type ":“ text " age ”:{ “ type ”:" integer "
Insert
ES supports two ways of inserting documents:
- insert with a specified document id
- insert with an auto-generated document id

Inserting with a specified document id, example:
" name ":" zero ", " country ":" China ",“ age ":"20", " date ”:“1997-01-p1"
A quick explanation:
- people — the index
- man — the type
- 1 — the document id

If the request succeeds, the data has been inserted with the specified document id:
"_ index ":" people ","_ type ":“ man ”, "_ id ":"1","_ version ":1 " result ":" created ", "_ shards ":{ " total ":2, " successful ":2" failed ":0 “ created ”: true
On es-head's Data Browser page, you can see the data we just inserted:
That covers inserting with a specified document id. Next, let's see how to have ES generate the document id automatically:
“ name ”:" Jon , “ country ":“ China ”, " age ": “ date ”:“1993-01-01”
After the request succeeds, the id automatically generated by ES is shown in the response:
index ":" people ". " type ":" man ” "_ id ":“AwQbJ2gyYQ version ":1, " result ":" created ", "_ shards ":{ " total ":2, " successful ”: " failed ":0 " created ": true
Go to es-head's Data Browser page to verify the data we just inserted:
Update
ES has two ways to update document data:
- updating the document directly
- updating the document with a script

Updating the document directly, example:
" doc ":{ " name ":" test _ name " 新数据写在 doc 里
On success, the returned info is as follows:
"_ index ":" people ","_ type ":" man ”, "_ id ":"1", "_ version ":2, " result ":" updated ","_ shards ":{ " total ":2, " successful ":2," failed ":0
The name of document id 1 has been successfully changed to test_name:
Updating the document with a script, example:
“ script ":{ ang :" painless ", " inline :" ctx ._ source . age +=10
Note: ES supports multiple scripting languages; painless is just the one used as an example here.
On success, the returned info is as follows:
"_ index ":" people ","_ type ":" man ", "_ id ":"1", "_ version ":3, " result ":" updated ","_ shards ":{ " total ":2, " successful ":2, " failed ":0
Because age was originally set as a string value, and "adding" strings means concatenation, the age of document id 1 became 2010 ("20" + "10"):
We can also move the data out into params and simply reference it from the script code, as in this example:
" script ":{ " lang ":" painless ", " inline ":" ctx ._ source . age = params . age "," params ":{ " age ":21
After the update succeeds:
Delete
For ES deletion, we'll mainly cover the following two operations:
- deleting a document
- deleting an index

Deleting a document, example:
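A minimal curl sketch of the request (placeholder host):

```bash
# Delete the document with id 1 under index people, type man
curl -X DELETE "http://<ip>:<port>/people/man/1"
```

The response to a successful delete: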
" found ”: true , "_ index ":" people ", "_ type ":" man ", "_ id ”:“1", "_ version ”:6, " result ":" deleted ","_ shards ":{ " total ":2, " successful ":2, “fai1ed":0
At this point only one document is left:
Deleting an index in es-head, example:
Confirm the deletion:
After the deletion succeeds, all data under the index is gone:
Deleting an index with Postman or a similar tool, example:
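A minimal curl sketch of the same request (placeholder host):

```bash
# Delete the entire people index and all data in it
curl -X DELETE "http://<ip>:<port>/people"
```

A successful request returns: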
" acknowledged ": true
Deletion succeeded:
Note: deleting an index is a very dangerous operation; delete with caution, lest you end up "dropping the database and running".
Query
Commonly used ES query forms:
- simple query
- conditional query
- aggregate query

I've prepared a structured index here in advance:
" state ”:" open ”, " settings ”:{ " index ”:{ " creation _ date ”:"1529476542880”, - a . " uuid ”:ss8vaCRRQ1q9lMHG_nS1gg”," version ”:{ " created ":“505029g”" provided _ name ”:“ book ” “ mappings ”:{ " novel ”:{ " properties ":{ " word _ count ”:{ "“ type ”:" integer " author ”:{ “ type :“ keyword ” " title ”:{ “ type ”:“ text ” " publish _ date ”:{ " format ”:" yyyy - MM - dd HH : mm : ssllyyyy - MM - ddllepoch _ millis ”,“ type ”:" date ”
Along with some data:
A simple query is just a GET request to the desired index -> type -> document id. Example:
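A minimal curl sketch (placeholder host; the document id is the auto-generated one that appears in the response below):

```bash
# Fetch a single document by index/type/id
curl -X GET "http://<ip>:<port>/book/novel/AWQb7YOKYQ_mmOCLRPV"
```

which returns: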
"_ index ": book ", _ type ":“ novel ”, Id ":“AwQb7YOKYQ mmOCLRPV ", ”_ version ”:1, " found ": true , _ Source ":{ " author ":"小明", " title ":"中小学数学"," word _ count ":69056, " publish _ date ”:"2011-07-11"
For conditional queries, let's query all the data under the book index:
" query ":{条件写在这里 " match _ all ":
The query result is as follows:
" took ”:4," timed _ out ”: "_ shards ":{ " total ":3, " successful ":3, " failed ":0 false , “ hits ":{ " total ":8, “ max score :1,“ hits ”# "_ index ”:“ book ”, _ type ”:“ novel ”, ”_ id ”:"AWQb7YOKYQmm0CLRPV”,"_ score ”:1, "_ source ”:{ " author ":"小明", " title ":"中小学数学", " word _ cOunt ”:69056, “pub1ish_ date ”:“2011-e7-1i" " index ”:" book ", " type ”:" novel ”, " id ":“AWQb7TAgYQ mmOCLRPU ”, "_ SCore : l , "_ source ":{ " author ”:" Mark ” “ title ”:"电脑装机指南", ” word _ count ”:63056, " publish _ date ":"2010-07-11" "_ index ":" book ”, "_ type ":“ novel ”, "_ id ”:“AWQb72gwYQ mmOCLRPY ”, SCOre :1, source :{ " author ":"隔壁老王"」 " titie ":"爬窗术的奥秘", “ word _ count ":1233616, " publish _ date ":“200s-07-11" "_ index ":“ book ”, " type ":" novel ", "_ id ":"AWQb6qe0YQ
A quick explanation:
- took — how long the query took
- timed_out — whether the query timed out
- _shards — shard info
- hits — the query results live here; by default only ten documents are included

We can control how many documents are returned with the following two parameters. For example, to fetch just one document:
" query :{ " match _al1": " from ": “ size ":
The query result is as follows:
" took ":11, " timed _ out ": false , "_ shards ":{ “ total ":3, " successful ":3," failed ":0 " hits ":{ " total ”:8," max _ score ":1" hits ":[ " index ":" book " "_ type ":" novel ", "_ id ”:“AWQb7TAgYQ_ mmOCLRPU ”,"_ score ":1, "_ source ":{ " author ":" Mark ", " title ":"电脑装机指南", “ word _ count ”:63056, " publish _ date ":"2010-07-11"
Fuzzy (full-text) matching on a keyword:
" query ": “ match ":{ "tit1le":" Java " 将需要匹配的属性写在这里
The query result is as follows:
" took ":45, " timed _ out ": false ,"_ shards ":{ " total ":3, " successful ":3, " failed ":0 " hits ":{ " total ":1, “ max _ score ":1.1360463,“ hits ":[ "_ index ":" book ", "_ type ":" novel ", "_ id ":"AWQb6- yeYQ _ mmOCLRPS ”, "_ score ":1.1360463, "_ source ":{ " author ":" Jon ", " title ":" Think in Java "," word _ count ":5000, " publish _ date ":"2005-02-03”
We can also specify how the query results are sorted, example:
" query ":{ " match _al1:{ " sort ":[ 指定根据哪个字段进行排序及排序的方式 " publish _ date ":{ " order ":" desc "23.128:9200/
So far we've covered simple and conditional queries. Next, a brief look at aggregate queries. A single-group aggregation example:
{ "aggs": { "group_by_word_count": { "terms": { "field": "word_count" } } } }
The aggregation result of the query:
" index ":" book ", "_ type ":" novel ” “_ id ":"AWQb7oNLYQ mmOCLRPX ”,"_ sCOre ":1, "_ source ":{ " author ":" Linus ", " title ":" Linux ", " word _ count ":195616, “ publish _ date ":"2001-07-11" " aggregations ":{ “ group _ by _ word _ count ":{ " doc _ count _ error _ upper _ bound ":0," sum _ other _ doc _ count ":0, " buckets ”:[ “ key ”:195616, “ doc _ count ":2 “ key ”:1000," doc _ count ":1 “ key ”:5000, " doc _ count ":1 “ key ":26000, " doc _ count ":1 " key ":63056, " doc _ count ":1 “ key ":69056, " doc _ count ":1 " key ":1233616, " doc _ count ":1
A multi-group aggregation example:
{ "aggs": { "group_by_word_count": { "terms": { "field": "word_count" } }, "group_by_publish_date": { "terms": { "field": "publish_date" } } } }
The aggregation results of the query:
{ ... "aggregations": { "group_by_publish_date": { "doc_count_error_upper_bound": 0, "sum_other_doc_count": 0, "buckets": [ { "key": 981158400000, "key_as_string": "2001-02-03 00:00:00", "doc_count": 1 }, { "key": 994809600000, "key_as_string": "2001-07-11 00:00:00", "doc_count": 1 }, { "key": 1107388800000, "key_as_string": "2005-02-03 00:00:00", "doc_count": 1 }, { "key": 1121040000000, "key_as_string": "2005-07-11 00:00:00", "doc_count": 1 }, { "key": 1215734400000, "key_as_string": "2008-07-11 00:00:00", "doc_count": 1 }, { "key": 1278806400000, "key_as_string": "2010-07-11 00:00:00", "doc_count": 1 }, { "key": 1310342400000, "key_as_string": "2011-07-11 00:00:00", "doc_count": 1 }, { "key": 1341964800000, "key_as_string": "2012-07-11 00:00:00", "doc_count": 1 } ] }, "group_by_word_count": { "doc_count_error_upper_bound": 0, "sum_other_doc_count": 0, "buckets": [ { "key": 195616, "doc_count": 2 }, { "key": 1000, "doc_count": 1 }, { "key": 5000, "doc_count": 1 }, { "key": 26000, "doc_count": 1 }, { "key": 63056, "doc_count": 1 }, { "key": 69056, "doc_count": 1 }, { "key": 1233616, "doc_count": 1 } ] } } }
Besides group-by aggregations, you can also run statistical queries, which are somewhat similar to aggregate functions in databases. Example:
{ "aggs": { "grades_word_count": { "stats": { "field": "word_count" // 以word_count作为统计字段 } } } }
The statistics returned by the query:
{ ... "aggregations": { "grades_word_count": { "count": 8, "min": 1000, "max": 1233616, "avg": 223620, "sum": 1788960 } } }
Source: blog.51cto.com/zero01/2130…