Elastic Stack in Action Handbook: III. Product Capabilities, 3.4 Getting Started, 3.4.2 Elasticsearch Basic Applications, 3.4.2.15 Ingest pipelines (1) https://developer.aliyun.com/article/1230173
For the full list of pipeline processors, see: https://www.elastic.co/guide/en/elasticsearch/reference/master/processors.html
The most commonly used processors are:
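Trim
Removes leading and trailing whitespace. If the field is an array of strings, every string in the array is trimmed.
Split
Splits a string into an array using the specified separator. Works only on string fields.
Rename
Renames a field.
Foreach
Applies the same processor to every element of a collection.
Lowercase / Uppercase
Converts a field's value to lowercase or uppercase.
Script
Preprocesses data with a script.
Gsub
Replaces matching parts of a string.
Append
Appends values to an array.
Set
Sets the value of a field.
Remove
Removes a field.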
Trim
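Remove leading and trailing whitespace from strings. The pipeline below wraps trim in foreach so that it is applied to each element of the message array: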
PUT _ingest/pipeline/trim_pipeline
{
  "processors": [
    {
      "foreach": {
        "field": "message",
        "processor": {
          "trim": {
            "field": "_ingest._value"
          }
        }
      }
    }
  ]
}

POST _ingest/pipeline/trim_pipeline/_simulate
{
  "docs": [
    {
      "_source": {
        "message": [ "car222 ", " auto2222 " ]
      }
    }
  ]
}

# Response:
{
  "docs" : [
    {
      "doc" : {
        "_index" : "_index",
        "_type" : "_doc",
        "_id" : "_id",
        "_source" : {
          "message" : [ "car222", "auto2222" ]
        },
        "_ingest" : {
          "_value" : null,
          "timestamp" : "2021-04-28T13:19:13.542743Z"
        }
      }
    }
  ]
}
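The effect of the foreach + trim pipeline above can be modeled locally. The following is a minimal Python sketch, not Elasticsearch code: the function name `trim_each` is hypothetical, and Python's `str.strip()` is assumed to be a close enough stand-in for the processor's trimming, since both remove whitespace only at the start and end of a string.

```python
def trim_each(source: dict, field: str) -> dict:
    """Local model of foreach + trim: strip each string in an array field."""
    result = dict(source)  # shallow copy so the input document is not modified
    # Only leading and trailing whitespace is removed; inner spaces are kept.
    result[field] = [value.strip() for value in source[field]]
    return result


doc = {"message": ["car222 ", " auto2222 "]}
print(trim_each(doc, "message"))
# {'message': ['car222', 'auto2222']}
```

The printed result matches the `_source` in the simulated response above. A value such as " a b " would become "a b", because whitespace inside the string is left alone.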
Split / Foreach
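Split a string into an array using the specified separator. The processor works only on string fields, so foreach is used here to apply it to each element of the message array: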
PUT _ingest/pipeline/split_pipeline
{
  "processors": [
    {
      "foreach": {
        "field": "message",
        "processor": {
          "split": {
            "field": "_ingest._value",
            "separator": " "
          }
        }
      }
    }
  ]
}

# Test
POST _ingest/pipeline/split_pipeline/_simulate
{
  "docs": [
    {
      "_source": {
        "message": [ "car222 aaa", " auto2222 aaaa bbb" ]
      }
    }
  ]
}

# Response: each element of message has been split on spaces into its own array of strings
{
  "docs" : [
    {
      "doc" : {
        "_index" : "_index",
        "_type" : "_doc",
        "_id" : "_id",
        "_source" : {
          "message" : [
            [ "car222", "aaa" ],
            [ "", "auto2222", "aaaa", "bbb" ]
          ]
        },
        "_ingest" : {
          "_value" : null,
          "timestamp" : "2021-04-28T13:28:20.762312Z"
        }
      }
    }
  ]
}
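The empty string at the start of the second array in the response above comes from the leading space in " auto2222 aaaa bbb". A small Python model makes this visible. This is a sketch under assumptions, not Elasticsearch code: `split_each` is a hypothetical name, and the separator is treated as a literal string, whereas the real processor treats it as a regular expression. The model is only meant to match the sample input; inputs with trailing separators may behave differently in Elasticsearch.

```python
def split_each(source: dict, field: str, separator: str) -> dict:
    """Local model of foreach + split: split each string in an array field."""
    result = dict(source)  # shallow copy so the input document is not modified
    # A separator at position 0 produces an empty first element.
    result[field] = [value.split(separator) for value in source[field]]
    return result


doc = {"message": ["car222 aaa", " auto2222 aaaa bbb"]}
print(split_each(doc, "message", " "))
# {'message': [['car222', 'aaa'], ['', 'auto2222', 'aaaa', 'bbb']]}
```

In this model, stripping each value before splitting removes the empty first element, which suggests running a trim step ahead of split when the input may carry leading spaces.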
Rename
Rename a field. The rename processor is often used together with reindex:
POST goods_info_comment_message/_bulk
{"index":{"_id":1}}
{"message":"美 国苹果 "}
{"index":{"_id":2}}
{"message":"山东 苹果 "}

# Define rename_pipeline
PUT _ingest/pipeline/rename_pipeline
{
  "processors": [
    {
      "rename": {
        "field": "message",
        "target_field": "message_new"
      }
    }
  ]
}

# Reindex
POST _reindex
{
  "source": {
    "index": "goods_info_comment_message"
  },
  "dest": {
    "index": "goods_info_comment_message_new",
    "pipeline": "rename_pipeline"
  }
}

# Check the mapping
GET goods_info_comment_message_new/_mapping

# Response
{
  "goods_info_comment_message_new" : {
    "mappings" : {
      "properties" : {
        "message_new" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        }
      }
    }
  }
}
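What the reindex with rename_pipeline does to each document can be sketched in plain Python. This is only an illustrative model, not Elasticsearch code: `rename_field` is a hypothetical helper, and the list comprehension stands in for the reindex step that passes every source document through the pipeline.

```python
def rename_field(source: dict, field: str, target_field: str) -> dict:
    """Local model of the rename processor: move a value to a new field name."""
    result = dict(source)  # shallow copy so the input document is not modified
    # pop() raises KeyError when the field is absent, so a missing field is an error here.
    result[target_field] = result.pop(field)
    return result


# The two documents indexed by the _bulk request above.
docs = [{"message": "美 国苹果 "}, {"message": "山东 苹果 "}]
reindexed = [rename_field(doc, "message", "message_new") for doc in docs]
print([sorted(doc.keys()) for doc in reindexed])
# [['message_new'], ['message_new']]
```

Only the field name changes. The values, including their spaces, are carried over untouched, which is consistent with the new index's mapping containing only message_new.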
Elastic Stack in Action Handbook: III. Product Capabilities, 3.4 Getting Started, 3.4.2 Elasticsearch Basic Applications, 3.4.2.15 Ingest pipelines (3) https://developer.aliyun.com/article/1230171