Adding Pinyin Search Support to Elasticsearch

2022-09-06

Summary: how to add pinyin search support to Elasticsearch.

I. Install the plugin

Install the pinyin analysis plugin, elasticsearch-analysis-pinyin.

Documentation: https://github.com/medcl/elasticsearch-analysis-pinyin
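
The article does not show the install command itself. As a rough sketch, the helper below builds an `elasticsearch-plugin install` command line for a given Elasticsearch version. The release URL pattern and the version `6.3.0` are assumptions used for illustration, so check the project's Releases page for the actual download link. The plugin version has to match the Elasticsearch version exactly.

```python
def plugin_install_command(es_version: str) -> str:
    """Build an elasticsearch-plugin command for the given Elasticsearch version."""
    # Assumed release URL pattern; confirm it on the project's Releases page.
    url = (
        "https://github.com/medcl/elasticsearch-analysis-pinyin/releases/download/"
        f"v{es_version}/elasticsearch-analysis-pinyin-{es_version}.zip"
    )
    # The command is meant to be run from the Elasticsearch home directory.
    return f"bin/elasticsearch-plugin install {url}"


# "6.3.0" is only an example version.
print(plugin_install_command("6.3.0"))
```

Each node needs the plugin installed and must be restarted before the `pinyin` tokenizer type becomes available.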

II. Create a new index with pinyin support

Replace <index> with the actual index name.

Replace <type> with the actual mapping type.

PUT <index>
{
  "settings": {
    "analysis": {
      "analyzer": {
        "pinyin_analyzer": {
          "tokenizer": "my_pinyin"
        }
      },
      "tokenizer": {
        "my_pinyin": {
          "type": "pinyin",
          "keep_first_letter": false,
          "keep_separate_first_letter": false,
          "keep_full_pinyin": true,
          "keep_original": false,
          "limit_first_letter_length": 16,
          "lowercase": true
        }
      }
    }
  },
  "mappings": {
    "<type>": {
      "properties": {
        "name": {
          "type": "text",
          "index": true,
          "fields": {
            "pinyin": {
              "type": "text",
              "analyzer": "pinyin_analyzer"
            }
          }
        },
        "link": {
          "type": "keyword",
          "index": false
        },
        "id": {
          "type": "long"
        },
        "update_time": {
          "type": "date",
          "format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis"
        }
      }
    }
  }
}
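
The request above nests the mapping under a <type> name, which matches Elasticsearch 6.x and earlier. Mapping types were deprecated in 7.0 and removed in 8.0, where `properties` sits directly under `mappings`. The sketch below builds both shapes of the request body so the difference is easy to see. The function name `build_index_body` and the type name `doc` are hypothetical, and only the `name` field is included to keep it short.

```python
import json


def build_index_body(mapping_type=None):
    """Return the create-index body; pass mapping_type only for typed (6.x style) mappings."""
    analysis = {
        "analyzer": {"pinyin_analyzer": {"tokenizer": "my_pinyin"}},
        "tokenizer": {
            "my_pinyin": {
                "type": "pinyin",
                "keep_first_letter": False,
                "keep_separate_first_letter": False,
                "keep_full_pinyin": True,
                "keep_original": False,
                "limit_first_letter_length": 16,
                "lowercase": True,
            }
        },
    }
    properties = {
        "name": {
            "type": "text",
            # The pinyin sub-field is queried as name.pinyin.
            "fields": {"pinyin": {"type": "text", "analyzer": "pinyin_analyzer"}},
        }
    }
    mappings = {"properties": properties}
    if mapping_type is not None:
        # Typed layout, as used in the request above.
        mappings = {mapping_type: mappings}
    return {"settings": {"analysis": analysis}, "mappings": mappings}


print(json.dumps(build_index_body(), indent=2))       # typeless layout
print(json.dumps(build_index_body("doc"), indent=2))  # typed layout
```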

Test the analyzer

GET <index>/_analyze
{
  "field": "name.pinyin",
  "text": "内蒙古"
}
Response:
{
  "tokens": [
    {
      "token": "nei",
      "start_offset": 0,
      "end_offset": 1,
      "type": "word",
      "position": 0
    },
    {
      "token": "meng",
      "start_offset": 1,
      "end_offset": 2,
      "type": "word",
      "position": 1
    },
    {
      "token": "gu",
      "start_offset": 2,
      "end_offset": 3,
      "type": "word",
      "position": 2
    }
  ]
}

III. Add pinyin support to an existing index

1. Create the index (this stands in for an existing index without pinyin support)

PUT <index>
{
  "mappings": {
    "<type>": {
      "properties": {
        "name": {
          "type": "keyword",
          "index": true
        },
        "link": {
          "type": "keyword",
          "index": false
        },
        "id": {
          "type": "long"
        },
        "update_time": {
          "type": "date",
          "format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis"
        }
      }
    }
  }
}

2. Configure the pinyin analyzer

# Analysis settings are static, so close the index before changing them
POST <index>/_close
PUT <index>/_settings
{
  "index": {
    "analysis": {
      "analyzer": {
        "pinyin_analyzer": {
          "tokenizer": "my_pinyin"
        }
      },
      "tokenizer": {
        "my_pinyin": {
          "type": "pinyin",
          "keep_first_letter": true,
          "keep_separate_first_letter": true,
          "keep_full_pinyin": true,
          "keep_original": false,
          "limit_first_letter_length": 16,
          "lowercase": true
        }
      }
    }
  }
}
POST <index>/_open
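
This tokenizer turns on `keep_first_letter` and `keep_separate_first_letter`, unlike the one in section II. The sketch below is a rough model of which token strings each option contributes, based on the option descriptions in the plugin README. It is not the plugin's algorithm: it ignores token order, positions and offsets, it takes already-romanized syllables as input, and treating `limit_first_letter_length` as a plain truncation is an assumption. The function name `expected_tokens` is hypothetical.

```python
def expected_tokens(syllables, *, keep_first_letter, keep_separate_first_letter,
                    keep_full_pinyin, limit_first_letter_length):
    """Return the set of token strings the options would roughly produce."""
    tokens = set()
    initials = [s[0] for s in syllables if s]
    if keep_full_pinyin:
        # One token per syllable, e.g. nei, meng, gu.
        tokens.update(syllables)
    if keep_separate_first_letter:
        # One token per initial, e.g. n, m, g.
        tokens.update(initials)
    if keep_first_letter and initials:
        # All initials joined into one token, e.g. nmg (truncation is assumed).
        tokens.add("".join(initials)[:limit_first_letter_length])
    return tokens


syllables = ["nei", "meng", "gu"]  # pinyin of the sample text used in section II

# Options from section II.
print(sorted(expected_tokens(syllables, keep_first_letter=False,
                             keep_separate_first_letter=False,
                             keep_full_pinyin=True, limit_first_letter_length=16)))

# Options from this section.
print(sorted(expected_tokens(syllables, keep_first_letter=True,
                             keep_separate_first_letter=True,
                             keep_full_pinyin=True, limit_first_letter_length=16)))
```

Running the `_analyze` request from section II against the reopened index is the reliable way to see the real tokens.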

3. Update the mapping to use the pinyin analyzer

PUT <index>/<type>/_mapping
{
  "<type>": {
    "properties": {
      "name": {
        "type": "keyword",
        "index": true,
        "fields": {
          "pinyin": {
            "type": "text",
            "analyzer": "pinyin_analyzer"
          }
        }
      },
      "link": {
        "type": "keyword",
        "index": false
      },
      "id": {
        "type": "long"
      },
      "update_time": {
        "type": "date",
        "format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis"
      }
    }
  }
}
GET <index>/_mapping
# Re-index the existing documents in place so that the new name.pinyin sub-field is populated
POST <index>/_update_by_query?conflicts=proceed
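
The order of the requests in steps 2 and 3 matters: the analyzer has to exist before the mapping can reference it, and documents indexed earlier only get a `name.pinyin` value once they are re-indexed. As a small sketch, the helper below lists the sequence as (method, path) pairs. The function name `migration_requests` and the placeholder values `my_index` and `doc` are hypothetical, and the request bodies are the ones shown above.

```python
def migration_requests(index, mapping_type):
    """Return the console requests for the migration, in the order they should run."""
    return [
        ("POST", f"{index}/_close"),                    # static settings need a closed index
        ("PUT", f"{index}/_settings"),                  # define pinyin_analyzer and my_pinyin
        ("POST", f"{index}/_open"),                     # reopen before touching the mapping
        ("PUT", f"{index}/{mapping_type}/_mapping"),    # add the name.pinyin sub-field
        ("POST", f"{index}/_update_by_query?conflicts=proceed"),  # re-index existing docs
    ]


for method, path in migration_requests("my_index", "doc"):
    print(method, path)
```

With `conflicts=proceed`, documents that change while the update runs are skipped rather than aborting the whole request, so it may be worth checking the response for version conflicts afterwards.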

4. Test the search

GET <index>/_search
{
  "query": {
    "query_string": {
      "fields": [
        "name",
        "name.pinyin"
      ],
      "query": "王苏川",
      "default_operator": "AND"
    }
  }
}
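
The search API expects the `query_string` clause under a top-level `query` key. When the request is built from application code, a small helper keeps that wrapper in one place. This is a minimal sketch: the function name `build_pinyin_search` is hypothetical, and sending the body to the cluster is left out.

```python
import json


def build_pinyin_search(text, operator="AND"):
    """Build a _search body that queries both the raw name and its pinyin sub-field."""
    return {
        "query": {
            "query_string": {
                "fields": ["name", "name.pinyin"],
                "query": text,
                # AND requires every term of the query to match.
                "default_operator": operator,
            }
        }
    }


# ensure_ascii=False keeps the Chinese text readable in the printed JSON.
print(json.dumps(build_pinyin_search("王苏川"), ensure_ascii=False, indent=2))
```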
