https://github.com/medcl/elasticsearch-analysis-ik
Installation
1. First check the Elasticsearch version:
http://localhost:9200/
Find the plugin release that matches that version:
https://github.com/medcl/elasticsearch-analysis-ik/releases
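The version lookup and the download URL used in the install command can be sketched as two small shell helpers. This is only a sketch: the function names es_version_from_json and ik_release_url are made up for this example, and the URL pattern is inferred from the v6.3.0 link in this guide, so other versions should be checked against the releases page.

```shell
# Hypothetical helpers; the names are illustrative, not part of Elasticsearch or IK.

# Read the JSON returned by http://localhost:9200/ on stdin and print version.number.
es_version_from_json() {
  sed -n 's/.*"number"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Build a release URL for a version, assuming the same pattern as the v6.3.0 link.
ik_release_url() {
  printf 'https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v%s/elasticsearch-analysis-ik-%s.zip\n' "$1" "$1"
}

# Usage (needs a running node, so it is left commented out):
# version=$(curl -s http://localhost:9200/ | es_version_from_json)
# ./bin/elasticsearch-plugin install "$(ik_release_url "$version")"
```

Keeping the plugin version identical to the Elasticsearch version is the point of step 1; the helper simply avoids typing the version twice.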
2. Install the plugin
./bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.3.0/elasticsearch-analysis-ik-6.3.0.zip
3. Restart Elasticsearch
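After the restart it is worth confirming that the plugin was picked up. A hedged sketch: both elasticsearch-plugin list and the _cat/plugins endpoint report installed plugins. The helper name has_ik_plugin is made up here, and it assumes the plugin registers under a name containing analysis-ik.

```shell
# Hypothetical helper: succeeds if a plugin listing on stdin mentions the IK plugin.
# Assumption: the plugin name contains "analysis-ik".
has_ik_plugin() {
  grep -q 'analysis-ik'
}

# Usage (run from the Elasticsearch home directory; left commented out):
# ./bin/elasticsearch-plugin list | has_ik_plugin && echo "IK plugin installed"
# curl -s 'http://localhost:9200/_cat/plugins?v' | has_ik_plugin && echo "IK plugin loaded"
```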
4. Test the tokenizer
curl -X PUT 'localhost:9200/website'
curl -X GET "http://localhost:9200/website/_analyze" -H 'Content-Type: application/json' -d '{ "text": "中华人民共和国国歌", "tokenizer": "ik_max_word" }'
Response
{
  "tokens": [
    { "token": "中华人民共和国", "start_offset": 0, "end_offset": 7, "type": "CN_WORD", "position": 0 },
    { "token": "中华人民", "start_offset": 0, "end_offset": 4, "type": "CN_WORD", "position": 1 },
    { "token": "中华", "start_offset": 0, "end_offset": 2, "type": "CN_WORD", "position": 2 },
    { "token": "华人", "start_offset": 1, "end_offset": 3, "type": "CN_WORD", "position": 3 },
    { "token": "人民共和国", "start_offset": 2, "end_offset": 7, "type": "CN_WORD", "position": 4 },
    { "token": "人民", "start_offset": 2, "end_offset": 4, "type": "CN_WORD", "position": 5 },
    { "token": "共和国", "start_offset": 4, "end_offset": 7, "type": "CN_WORD", "position": 6 },
    { "token": "共和", "start_offset": 4, "end_offset": 6, "type": "CN_WORD", "position": 7 },
    { "token": "国", "start_offset": 6, "end_offset": 7, "type": "CN_CHAR", "position": 8 },
    { "token": "国歌", "start_offset": 7, "end_offset": 9, "type": "CN_WORD", "position": 9 }
  ]
}
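To scan a long response like the one above, the token strings can be pulled out with standard text tools. A rough sketch that assumes the response format shown above and token text without double quotes; list_tokens is a name made up for this example, and a real JSON parser would be more robust.

```shell
# Hypothetical helper: print one token per line from an _analyze response on stdin.
list_tokens() {
  grep -o '"token"[[:space:]]*:[[:space:]]*"[^"]*"' |
    sed 's/^"token"[[:space:]]*:[[:space:]]*"//; s/"$//'
}

# Usage (needs a running node; left commented out):
# curl -s -X GET 'http://localhost:9200/website/_analyze' -H 'Content-Type: application/json' \
#   -d '{ "text": "中华人民共和国国歌", "tokenizer": "ik_max_word" }' | list_tokens
```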
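If the installation fails, the plugin can be installed manually instead:
Unzip the downloaded package, copy its contents into plugins/ik under the Elasticsearch directory, and restart the service.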
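The manual route can be sketched as a short shell function. This is an illustration under assumptions: install_ik_manually is a made-up name, the zip is assumed to be a release archive already downloaded from the releases page, and the unzip tool must be available.

```shell
# Hypothetical helper: unpack a downloaded IK release zip into <es_home>/plugins/ik.
# Usage: install_ik_manually ES_HOME ZIP_FILE
install_ik_manually() {
  es_home=$1
  zip_file=$2
  # Refuse to create anything unless both inputs exist.
  [ -d "$es_home" ] || { echo "Elasticsearch directory not found: $es_home" >&2; return 1; }
  [ -f "$zip_file" ] || { echo "zip file not found: $zip_file" >&2; return 1; }
  mkdir -p "$es_home/plugins/ik" &&
    unzip -o "$zip_file" -d "$es_home/plugins/ik"
}

# Usage (paths are examples; restart Elasticsearch afterwards):
# install_ik_manually /path/to/elasticsearch ./elasticsearch-analysis-ik-6.3.0.zip
```

If the archive unpacks into a nested folder, move its contents up so that plugin-descriptor.properties sits directly in plugins/ik.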
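ik_max_word: splits the text at the finest granularity, so overlapping words such as 中华人民共和国, 中华人民 and 中华 are all emitted.
ik_smart: splits the text at the coarsest granularity.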
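To see the difference between the two modes, the same text can be sent to _analyze with each tokenizer. A minimal sketch reusing the request from step 4; the helper names analyze_body and analyze_with are made up here, and the text must not contain double quotes or backslashes because no JSON escaping is done.

```shell
# Hypothetical helpers for comparing ik_max_word and ik_smart on the same text.

# Print the JSON body for an _analyze request: analyze_body TOKENIZER TEXT
analyze_body() {
  printf '{ "text": "%s", "tokenizer": "%s" }\n' "$2" "$1"
}

# Send the request to the local node used in this guide: analyze_with TOKENIZER TEXT
analyze_with() {
  curl -s -X GET 'http://localhost:9200/website/_analyze' \
    -H 'Content-Type: application/json' \
    -d "$(analyze_body "$1" "$2")"
}

# Usage (needs a running node with the plugin installed; left commented out):
# analyze_with ik_max_word '中华人民共和国国歌'
# analyze_with ik_smart '中华人民共和国国歌'
```

Given the descriptions above, the ik_smart call should return fewer and longer tokens than the ik_max_word call.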
References