Common queries
match: full-text search

    # Numeric field
    GET bank/_search
    {
      "query": {
        "match": { "account_number": "20" }
      }
    }

    # String field
    GET bank/_search
    {
      "query": {
        "match": { "address": "kings" }
      }
    }
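match_phrase: phrase matching
The rules are:

    # match_phrase: the query string is matched as one phrase (same terms, same order, adjacent)
    # instead of being split into independent terms
    GET bank/_search
    {
      "query": {
        "match_phrase": { "address": "mill road" }
      }
    }

    # <field>.keyword: the whole field value must equal the query string,
    # so this only matches documents whose address is exactly "990 Mill"
    GET bank/_search
    {
      "query": {
        "match": { "address.keyword": "990 Mill" }
      }
    }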
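The difference between match, match_phrase and the .keyword sub-field can be illustrated with a small in-memory sketch. This is only a toy model, not how Elasticsearch is implemented: the `analyze` method is a simplified stand-in for a real analyzer (lowercase plus whitespace split), and the class and method names are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

class PhraseMatchSketch {

    // Simplified stand-in for an analyzer (assumption): lowercase, then split on whitespace.
    static List<String> analyze(String text) {
        String trimmed = text.trim().toLowerCase(Locale.ROOT);
        if (trimmed.isEmpty()) {
            return new ArrayList<String>();
        }
        return new ArrayList<String>(Arrays.asList(trimmed.split("\\s+")));
    }

    // match with the default OR operator: one shared token is enough.
    static boolean match(String fieldValue, String query) {
        List<String> fieldTokens = analyze(fieldValue);
        for (String token : analyze(query)) {
            if (fieldTokens.contains(token)) {
                return true;
            }
        }
        return false;
    }

    // match_phrase: all query tokens must appear next to each other, in order.
    static boolean matchPhrase(String fieldValue, String query) {
        List<String> queryTokens = analyze(query);
        if (queryTokens.isEmpty()) {
            return false;
        }
        return Collections.indexOfSubList(analyze(fieldValue), queryTokens) >= 0;
    }

    // .keyword: the raw field value is compared as a whole, without analysis.
    static boolean keywordMatch(String fieldValue, String query) {
        return fieldValue.equals(query);
    }

    public static void main(String[] args) {
        System.out.println(match("198 Mill Lane", "mill road"));       // true
        System.out.println(matchPhrase("198 Mill Lane", "mill road")); // false
        System.out.println(matchPhrase("990 Mill Road", "mill road")); // true
        System.out.println(keywordMatch("990 Mill Road", "990 Mill")); // false
    }
}
```

In this model "198 Mill Lane" is a hit for match but not for match_phrase, and a keyword comparison only succeeds on the complete value.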
query/bool/must: compound queries

    # must: every condition listed under must has to match
    GET bank/_search
    {
      "query": {
        "bool": {
          "must": [
            { "match": { "address": "mill" } },
            { "match": { "gender": "M" } }
          ]
        }
      }
    }

    # must_not: documents must not match any condition listed under must_not
    GET bank/_search
    {
      "query": {
        "bool": {
          "must": [
            { "match": { "gender": "M" } },
            { "match": { "address": "mill" } }
          ],
          "must_not": [
            { "match": { "age": "38" } }
          ]
        }
      }
    }

    # should: conditions that ought to match. A matching should clause raises the score of a
    # document but does not change which documents are returned. If the bool query contains
    # only should clauses (no must or filter), at least one of them has to match, so in that
    # case should does change the result set.
    GET bank/_search
    {
      "query": {
        "bool": {
          "must": [
            { "match": { "gender": "M" } },
            { "match": { "address": "mill" } }
          ],
          "must_not": [
            { "match": { "age": "18" } }
          ],
          "should": [
            { "match": { "lastname": "Wallace" } }
          ]
        }
      }
    }

In the results, the more relevant a document is, the higher its score.
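query/filter (result filtering)
- must contributes to the score
- should contributes to the score
- must_not does not contribute to the score
- filter does not contribute to the score
The more scoring clauses a document matches, the higher its score. Only must and should affect the relevance score.

    GET bank/_search
    {
      "query": {
        "bool": {
          "must": [
            { "match": { "address": "mill" } }
          ],
          "filter": {
            "range": {
              "balance": { "gte": "10000", "lte": "20000" }
            }
          }
        }
      }
    }

    # With only a filter clause there is no scoring clause, so every hit has a _score of 0
    GET bank/_search
    {
      "query": {
        "bool": {
          "filter": {
            "range": {
              "balance": { "gte": "10000", "lte": "20000" }
            }
          }
        }
      }
    }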
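query/term

    # Use match for full-text (text) fields and term for non-text fields.
    # A term query is not analyzed: address is a text field whose indexed tokens are
    # lowercased single words, so the exact value "mill Road" is not expected to match.
    GET bank/_search
    {
      "query": {
        "term": { "address": "mill Road" }
      }
    }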
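Why term behaves badly on a text field can be sketched with a tiny inverted index. This is a toy model under stated assumptions, not Elasticsearch internals: `analyze` is a simplified analyzer (lowercase, whitespace split), the two addresses are taken from the bank documents shown in the search output of these notes, and all class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class TermVsMatchSketch {

    // Simplified analyzer (assumption): lowercase, then split on whitespace.
    static String[] analyze(String text) {
        return text.trim().toLowerCase(Locale.ROOT).split("\\s+");
    }

    // Inverted index: token -> ids of the documents containing it.
    static Map<String, Set<Integer>> buildIndex(Map<Integer, String> docs) {
        Map<String, Set<Integer>> index = new TreeMap<String, Set<Integer>>();
        for (Map.Entry<Integer, String> doc : docs.entrySet()) {
            for (String token : analyze(doc.getValue())) {
                index.computeIfAbsent(token, t -> new TreeSet<Integer>()).add(doc.getKey());
            }
        }
        return index;
    }

    // Two addresses from the bank sample data.
    static Map<String, Set<Integer>> sampleIndex() {
        Map<Integer, String> docs = new LinkedHashMap<Integer, String>();
        docs.put(970, "990 Mill Road");
        docs.put(136, "198 Mill Lane");
        return buildIndex(docs);
    }

    // term: look up the value exactly as given, without analyzing it.
    static Set<Integer> term(Map<String, Set<Integer>> index, String value) {
        Set<Integer> hits = index.get(value);
        return hits == null ? new TreeSet<Integer>() : hits;
    }

    // match: analyze the query first, then collect the hits of every token (OR).
    static Set<Integer> match(Map<String, Set<Integer>> index, String query) {
        Set<Integer> hits = new TreeSet<Integer>();
        for (String token : analyze(query)) {
            hits.addAll(term(index, token));
        }
        return hits;
    }

    public static void main(String[] args) {
        Map<String, Set<Integer>> index = sampleIndex();
        System.out.println(term(index, "mill Road"));  // []
        System.out.println(term(index, "mill"));       // [136, 970]
        System.out.println(match(index, "mill Road")); // [136, 970]
    }
}
```

The index only contains the tokens 198, 990, lane, mill and road, so the unanalyzed value "mill Road" finds nothing, while the analyzed match query finds both documents.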
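aggs/agg1 (aggregation)

    # An aggregation computes several summaries over the matched documents in one request,
    # instead of returning a single kind of result.
    # Here: the age distribution, the average age and the average balance.
    # "size": 0 suppresses the hits so that only the aggregations are returned.
    GET bank/_search
    {
      "query": {
        "match": { "address": "Mill" }
      },
      "aggs": {
        "ageAgg": {
          "terms": { "field": "age", "size": 10 }
        },
        "ageAvg": {
          "avg": { "field": "age" }
        },
        "balanceAvg": {
          "avg": { "field": "balance" }
        }
      },
      "size": 0
    }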
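What the terms and avg aggregations compute can be reproduced by hand. The sketch below is plain Java over in-memory arrays, not the Elasticsearch API; the ages and balances are those of the four "Mill" documents that appear in the search output later in these notes, and the class and method names are made up for the illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class AggregationSketch {

    // terms aggregation: one bucket per distinct value with its document count,
    // ordered by count descending, then by key ascending.
    static Map<Integer, Long> terms(int[] values) {
        Map<Integer, Long> counts = new TreeMap<Integer, Long>(); // keys ascending
        for (int value : values) {
            counts.merge(value, 1L, Long::sum);
        }
        List<Map.Entry<Integer, Long>> entries =
                new ArrayList<Map.Entry<Integer, Long>>(counts.entrySet());
        // The sort is stable, so equal counts keep their ascending key order.
        entries.sort((a, b) -> Long.compare(b.getValue(), a.getValue()));
        Map<Integer, Long> buckets = new LinkedHashMap<Integer, Long>();
        for (Map.Entry<Integer, Long> entry : entries) {
            buckets.put(entry.getKey(), entry.getValue());
        }
        return buckets;
    }

    // avg aggregation: arithmetic mean of the field values.
    static double avg(int[] values) {
        long sum = 0;
        for (int value : values) {
            sum += value;
        }
        return (double) sum / values.length;
    }

    public static void main(String[] args) {
        int[] ages = {28, 38, 38, 32};
        int[] balances = {19648, 45801, 9812, 25571};
        System.out.println(terms(ages));   // {38=2, 28=1, 32=1}
        System.out.println(avg(ages));     // 34.0
        System.out.println(avg(balances)); // 25208.0
    }
}
```

These values agree with the ageAgg buckets, ageAvg and balanceAvg that Elasticsearch returns for the same four documents.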
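Aggregation on nested objects

    # The outer "nested" is the aggregation name, the inner "nested" is the aggregation type;
    # "path" points at the nested object field
    GET articles/_search
    {
      "size": 0,
      "aggs": {
        "nested": {
          "nested": {
            "path": "payment"
          },
          "aggs": {
            "amount_avg": {
              "avg": { "field": "payment.amount" }
            }
          }
        }
      }
    }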
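Mapping: field mappings
A mapping defines how documents are stored and searched.
More precisely, a mapping defines a document and how the fields it contains are stored and indexed, for example the type of each field.
The mapping of a field that already exists cannot be updated. Changing it requires creating a new index.

    # Specify the mapping when creating the index
    PUT /my_index
    {
      "mappings": {
        "properties": {
          "age": { "type": "integer" },
          "email": { "type": "keyword" },
          "name": { "type": "text" }
        }
      }
    }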
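Updating an index
An existing field mapping cannot be updated. To change it, create a new index with the desired mapping and migrate the data.
To add a mapping for a new field, use `PUT /my_index/_mapping`:

    # "index": false means the field is not indexed, so it cannot be searched
    PUT /my_index/_mapping
    {
      "properties": {
        "employee-id": {
          "type": "keyword",
          "index": false
        }
      }
    }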
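Data migration

    # Copy the documents from bank into newbank.
    # "type" is only needed because the old index was created with a custom type.
    POST _reindex
    {
      "source": {
        "index": "bank",
        "type": "account"
      },
      "dest": {
        "index": "newbank"
      }
    }

    GET /newbank/_search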
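Tokenization
In ES, a tokenizer receives a stream of characters, splits it into individual tokens (usually single words), and outputs a stream of tokens.
Put simply, with the default analyzer ES reads Chinese input one character at a time, so every character becomes its own token.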
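A rough sketch of this behaviour is shown below. It is a simplified imitation of what a default-style tokenizer does, not the actual Elasticsearch implementation: letters and digits are grouped into lowercased words, every CJK ideograph becomes a token of its own, and everything else is treated as a separator. The class name is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class TokenizeSketch {

    // Character stream in, token list out.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<String>();
        StringBuilder word = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            i += Character.charCount(codePoint);
            if (Character.isIdeographic(codePoint)) {
                // A CJK ideograph ends the current word and is emitted on its own.
                flush(word, tokens);
                tokens.add(new String(Character.toChars(codePoint)));
            } else if (Character.isLetterOrDigit(codePoint)) {
                word.appendCodePoint(Character.toLowerCase(codePoint));
            } else {
                // Whitespace and punctuation separate tokens.
                flush(word, tokens);
            }
        }
        flush(word, tokens);
        return tokens;
    }

    // Emit the pending word, if any, and reset the buffer.
    private static void flush(StringBuilder word, List<String> tokens) {
        if (word.length() > 0) {
            tokens.add(word.toString());
            word.setLength(0);
        }
    }

    public static void main(String[] args) {
        System.out.println(tokenize("990 Mill Road")); // [990, mill, road]
        System.out.println(tokenize("我是中国人"));      // [我, 是, 中, 国, 人]
    }
}
```

The English address is split into words, while the Chinese sentence falls apart into five single-character tokens, which is why a dedicated Chinese analyzer is needed.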
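Using an analyzer
About analyzers: https://www.elastic.co/guide/en/elasticsearch/reference/7.6/analysis.html
For Chinese, we need to install an additional analyzer.
Download:
https://github.com/medcl/elasticsearch-analysis-ik/releases
Put the downloaded plugin into the plugins directory of ES:

    [root@localhost plugins]# pwd
    /mydata/elasticsearch/plugins/

Restart ES and test whether the analyzer was installed successfully:

    # ik_smart is one of the analyzers provided by the IK plugin
    GET _analyze
    {
      "analyzer": "ik_smart",
      "text": "我是中国人"
    }

If the Chinese text is split into words rather than single characters, the installation succeeded.
Adjust the virtual machine memory to 3 GB.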
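Custom dictionary
ES can now handle Chinese, but words it does not know are still split character by character. To make it segment text the way we want, we need to customize the dictionary.
By default the dictionary does not contain newer words, so such words are broken up in fairly arbitrary ways.
We can therefore extend the dictionary ourselves. How?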
Installing Nginx
The word list is served from an nginx server.

    # Start a throwaway nginx container, only to copy its default config files
    docker run -p 80:80 --name nginx -d nginx:1.10
    docker container cp nginx:/etc/nginx .
    mv nginx/ conf
    # Make sure the target directory exists so the config ends up in /mydata/nginx/conf
    mkdir -p /mydata/nginx
    mv conf/ /mydata/nginx
    docker rm -f nginx

The resulting layout:

    [root@localhost mydata]# cd nginx/
    [root@localhost nginx]# ls
    conf
    [root@localhost nginx]# cd conf/
    [root@localhost conf]# ls
    conf.d  fastcgi_params  html  koi-utf  koi-win  logs  mime.types  modules  nginx.conf  scgi_params  uwsgi_params  win-utf

Start nginx again with the directories mounted, then create the word list:

    docker run -p 80:80 --name nginx \
      -v /mydata/nginx/html:/usr/share/nginx/html \
      -v /mydata/nginx/logs:/var/log/nginx \
      -v /mydata/nginx/conf:/etc/nginx \
      -d nginx:1.10

    mkdir /mydata/nginx/html/es

    [root@localhost ~]# cat /mydata/nginx/html/es/fenci.txt
    彭于晏
    杜兰特
Modifying IKAnalyzer.cfg.xml in ES

    [root@localhost config]# cat IKAnalyzer.cfg.xml
    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
    <properties>
        <comment>IK Analyzer extension configuration</comment>
        <!-- Configure your own extension dictionary here -->
        <entry key="ext_dict"></entry>
        <!-- Configure your own extension stop-word dictionary here -->
        <entry key="ext_stopwords"></entry>
        <!-- Configure a remote extension dictionary here -->
        <entry key="remote_ext_dict">http://192.168.1.8/es/fenci.txt</entry>
        <!-- Configure a remote extension stop-word dictionary here -->
        <!-- <entry key="remote_ext_stopwords">words_location</entry> -->
    </properties>
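After the change, the ES container must be restarted, otherwise the change does not take effect.

    docker restart elasticsearch

    POST _analyze
    {
      "analyzer": "ik_max_word",
      "text": "我是彭于晏"
    }

When new words come up later, just append them to the word list in the es directory (/mydata/nginx/html/es/fenci.txt).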
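To get a feeling for why a dictionary entry changes the segmentation, here is a toy sketch based on forward maximum matching. This is a swapped-in textbook technique chosen for illustration, not the algorithm the IK plugin actually uses; the class name is hypothetical and the code assumes characters from the Basic Multilingual Plane.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class DictionarySegmentSketch {

    // Forward maximum matching: at each position take the longest dictionary word,
    // falling back to a single character when no word matches.
    static List<String> segment(String text, Set<String> dictionary) {
        int maxLength = 1;
        for (String word : dictionary) {
            maxLength = Math.max(maxLength, word.length());
        }
        List<String> tokens = new ArrayList<String>();
        int start = 0;
        while (start < text.length()) {
            int end = Math.min(text.length(), start + maxLength);
            // Shrink the candidate until it is a known word or a single character.
            while (end > start + 1 && !dictionary.contains(text.substring(start, end))) {
                end--;
            }
            tokens.add(text.substring(start, end));
            start = end;
        }
        return tokens;
    }

    public static void main(String[] args) {
        Set<String> empty = new HashSet<String>();
        Set<String> custom = new HashSet<String>(Arrays.asList("彭于晏", "杜兰特"));
        System.out.println(segment("我是彭于晏", empty));  // [我, 是, 彭, 于, 晏]
        System.out.println(segment("我是彭于晏", custom)); // [我, 是, 彭于晏]
    }
}
```

Without the custom entries the name is cut into three characters. Once the name is in the dictionary it comes out as a single token, which is the effect the remote word list is meant to achieve.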
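Integrating ElasticSearch
Let's start by writing something simple.
See the official documentation of the Java high-level REST client.
Configuration class
This is boilerplate. In `RestClient.builder(new HttpHost("192.168.1.8", 9200, "http"))`, several HttpHost arguments can be passed to list the hosts of an ES cluster.

    import org.apache.http.HttpHost;
    import org.elasticsearch.client.RequestOptions;
    import org.elasticsearch.client.RestClient;
    import org.elasticsearch.client.RestClientBuilder;
    import org.elasticsearch.client.RestHighLevelClient;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;

    @Configuration
    public class EsConfig {

        // Request options shared by all calls
        public static final RequestOptions COMMON_OPTIONS;

        static {
            RequestOptions.Builder builder = RequestOptions.DEFAULT.toBuilder();
            COMMON_OPTIONS = builder.build();
        }

        /**
         * Registers a RestHighLevelClient bean in the Spring container.
         */
        @Bean
        public RestHighLevelClient esRestClient() {
            // Several HttpHost arguments can be passed here to list multiple ES nodes
            RestClientBuilder builder = RestClient.builder(new HttpHost("192.168.1.8", 9200, "http"));
            return new RestHighLevelClient(builder);
        }
    }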
Test class
Saving data
When testing, exclude the data source auto-configuration on the application class:
@SpringBootApplication(exclude = {DataSourceAutoConfiguration.class})

    @SpringBootTest
    class MallSearchApplicationTests {

        @Resource
        private RestHighLevelClient client;

        @Test
        void contextLoads() {
        }

        @Test
        public void indexData() throws IOException {
            // Target index and document id
            IndexRequest indexRequest = new IndexRequest("users");
            indexRequest.id("1");

            User user = new User();
            user.setUserName("张三");
            user.setAge(20);
            user.setGender("男");
            String jsonString = JSON.toJSONString(user);

            // Set the document body and declare its content type
            indexRequest.source(jsonString, XContentType.JSON);

            // Send the request: creates the index if needed and saves the document
            IndexResponse index = client.index(indexRequest, EsConfig.COMMON_OPTIONS);
            System.out.println(index);
        }
    }
Simple search

    // Test method of MallSearchApplicationTests
    @Test
    public void find() throws IOException {
        // 1. Create the search request
        SearchRequest searchRequest = new SearchRequest();
        searchRequest.indices("bank");
        SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
        sourceBuilder.query(QueryBuilders.matchQuery("address", "mill"));
        // Note: calling query(...) again replaces the previous query.
        // To combine several conditions, wrap them in QueryBuilders.boolQuery().
        System.out.println(sourceBuilder.toString());
        searchRequest.source(sourceBuilder);

        // 2. Execute the search
        SearchResponse response = client.search(searchRequest, EsConfig.COMMON_OPTIONS);

        // 3. Inspect the response
        System.out.println(response.toString());
    }
    {"query":{"match":{"address":{"query":"mill","operator":"OR","prefix_length":0,"max_expansions":50,"fuzzy_transpositions":true,"lenient":false,"zero_terms_query":"NONE","auto_generate_synonyms_phrase_query":true,"boost":1.0}}}}
    {"took":10,"timed_out":false,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0},"hits":{"total":{"value":4,"relation":"eq"},"max_score":5.4032025,"hits":[{"_index":"bank","_type":"account","_id":"970","_score":5.4032025,"_source":{"account_number":970,"balance":19648,"firstname":"Forbes","lastname":"Wallace","age":28,"gender":"M","address":"990 Mill Road","employer":"Pheast","email":"forbeswallace@pheast.com","city":"Lopezo","state":"AK"}},{"_index":"bank","_type":"account","_id":"136","_score":5.4032025,"_source":{"account_number":136,"balance":45801,"firstname":"Winnie","lastname":"Holland","age":38,"gender":"M","address":"198 Mill Lane","employer":"Neteria","email":"winnieholland@neteria.com","city":"Urie","state":"IL"}},{"_index":"bank","_type":"account","_id":"345","_score":5.4032025,"_source":{"account_number":345,"balance":9812,"firstname":"Parker","lastname":"Hines","age":38,"gender":"M","address":"715 Mill Avenue","employer":"Baluba","email":"parkerhines@baluba.com","city":"Blackgum","state":"KY"}},{"_index":"bank","_type":"account","_id":"472","_score":5.4032025,"_source":{"account_number":472,"balance":25571,"firstname":"Lee","lastname":"Long","age":32,"gender":"F","address":"288 Mill Street","employer":"Comverges","email":"leelong@comverges.com","city":"Movico","state":"MT"}}]}}
A more complex query

    /**
     * Complex search: for everyone in bank whose address contains "mill",
     * get the age distribution, the average age and the average balance.
     * Test method of MallSearchApplicationTests.
     *
     * @throws IOException
     */
    @Test
    public void searchData() throws IOException {
        // 1. Create the search request
        SearchRequest searchRequest = new SearchRequest();

        // 1.1) Target index
        searchRequest.indices("bank");
        // 1.2) Build the search conditions
        SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
        sourceBuilder.query(QueryBuilders.matchQuery("address", "Mill"));

        // 1.2.1) Aggregate by age distribution
        TermsAggregationBuilder ageAgg = AggregationBuilders.terms("ageAgg").field("age").size(10);
        sourceBuilder.aggregation(ageAgg);

        // 1.2.2) Average age
        AvgAggregationBuilder ageAvg = AggregationBuilders.avg("ageAvg").field("age");
        sourceBuilder.aggregation(ageAvg);

        // 1.2.3) Average balance
        AvgAggregationBuilder balanceAvg = AggregationBuilders.avg("balanceAvg").field("balance");
        sourceBuilder.aggregation(balanceAvg);

        System.out.println("检索条件:" + sourceBuilder);
        searchRequest.source(sourceBuilder);

        // 2. Execute the search
        SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
        System.out.println("检索结果:" + searchResponse);

        // 3. Map each hit to a bean
        SearchHits hits = searchResponse.getHits();
        SearchHit[] searchHits = hits.getHits();
        for (SearchHit searchHit : searchHits) {
            String sourceAsString = searchHit.getSourceAsString();
            Account account = JSON.parseObject(sourceAsString, Account.class);
            System.out.println(account);
        }

        // 4. Read the aggregation results
        Aggregations aggregations = searchResponse.getAggregations();

        Terms ageAgg1 = aggregations.get("ageAgg");
        for (Terms.Bucket bucket : ageAgg1.getBuckets()) {
            String keyAsString = bucket.getKeyAsString();
            System.out.println("年龄:" + keyAsString + " ==> " + bucket.getDocCount());
        }

        Avg ageAvg1 = aggregations.get("ageAvg");
        System.out.println("平均年龄:" + ageAvg1.getValue());

        Avg balanceAvg1 = aggregations.get("balanceAvg");
        System.out.println("平均薪资:" + balanceAvg1.getValue());
    }

The output is as follows:
    检索条件:{"query":{"match":{"address":{"query":"Mill","operator":"OR","prefix_length":0,"max_expansions":50,"fuzzy_transpositions":true,"lenient":false,"zero_terms_query":"NONE","auto_generate_synonyms_phrase_query":true,"boost":1.0}}},"aggregations":{"ageAgg":{"terms":{"field":"age","size":10,"min_doc_count":1,"shard_min_doc_count":0,"show_term_doc_count_error":false,"order":[{"_count":"desc"},{"_key":"asc"}]}},"ageAvg":{"avg":{"field":"age"}},"balanceAvg":{"avg":{"field":"balance"}}}}
    检索结果:{"took":11,"timed_out":false,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0},"hits":{"total":{"value":4,"relation":"eq"},"max_score":5.4032025,"hits":[{"_index":"bank","_type":"account","_id":"970","_score":5.4032025,"_source":{"account_number":970,"balance":19648,"firstname":"Forbes","lastname":"Wallace","age":28,"gender":"M","address":"990 Mill Road","employer":"Pheast","email":"forbeswallace@pheast.com","city":"Lopezo","state":"AK"}},{"_index":"bank","_type":"account","_id":"136","_score":5.4032025,"_source":{"account_number":136,"balance":45801,"firstname":"Winnie","lastname":"Holland","age":38,"gender":"M","address":"198 Mill Lane","employer":"Neteria","email":"winnieholland@neteria.com","city":"Urie","state":"IL"}},{"_index":"bank","_type":"account","_id":"345","_score":5.4032025,"_source":{"account_number":345,"balance":9812,"firstname":"Parker","lastname":"Hines","age":38,"gender":"M","address":"715 Mill Avenue","employer":"Baluba","email":"parkerhines@baluba.com","city":"Blackgum","state":"KY"}},{"_index":"bank","_type":"account","_id":"472","_score":5.4032025,"_source":{"account_number":472,"balance":25571,"firstname":"Lee","lastname":"Long","age":32,"gender":"F","address":"288 Mill Street","employer":"Comverges","email":"leelong@comverges.com","city":"Movico","state":"MT"}}]},"aggregations":{"lterms#ageAgg":{"doc_count_error_upper_bound":0,"sum_other_doc_count":0,"buckets":[{"key":38,"doc_count":2},{"key":28,"doc_count":1},{"key":32,"doc_count":1}]},"avg#ageAvg":{"value":34.0},"avg#balanceAvg":{"value":25208.0}}}
    MallSearchApplicationTests.Account(account_number=970, balance=19648, firstname=Forbes, lastname=Wallace, age=28, gender=M, address=990 Mill Road, employer=Pheast, email=forbeswallace@pheast.com, city=Lopezo, state=AK)
    MallSearchApplicationTests.Account(account_number=136, balance=45801, firstname=Winnie, lastname=Holland, age=38, gender=M, address=198 Mill Lane, employer=Neteria, email=winnieholland@neteria.com, city=Urie, state=IL)
    MallSearchApplicationTests.Account(account_number=345, balance=9812, firstname=Parker, lastname=Hines, age=38, gender=M, address=715 Mill Avenue, employer=Baluba, email=parkerhines@baluba.com, city=Blackgum, state=KY)
    MallSearchApplicationTests.Account(account_number=472, balance=25571, firstname=Lee, lastname=Long, age=32, gender=F, address=288 Mill Street, employer=Comverges, email=leelong@comverges.com, city=Movico, state=MT)
    年龄:38 ==> 2
    年龄:28 ==> 1
    年龄:32 ==> 1
    平均年龄:34.0
    平均薪资:25208.0
This matches the results shown in Kibana.