谷粒商城--整合Elasticsearch和商品的上架-2

简介: 谷粒商城--整合Elasticsearch和商品的上架

常用查询

match全文检索

GET bank/_search
{
  "query": {
    "match": {
      "account_number": "20"
    }
  }
}
#字符串
GET bank/_search
{
  "query": {
    "match": {
      "address": "kings"
    }
  }
}

match_phrase短语匹配


规则如下:


# 精准匹配, 不拆分字符串进行检索
# match_phrase:不拆分字符串进行检索
# 字段.keyword:必须全匹配上才检索成功
GET bank/_search
{
  "query": {
    "match_phrase": {
      "address": "mill road"
    }
  }
}
GET bank/_search
{
  "query": {
    "match": {
      "address.keyword": "990 Mill"  
    }
  }
}

query/bool/must复合查询

# must:必须达到must所列举的所有条件
GET bank/_search
{
   "query":{
        "bool":{   
             "must":[
              {"match":{"address":"mill"}},
              {"match":{"gender":"M"}}
             ]
         }
    }
}
# must_not:必须不匹配must_not所列举的所有条件。
GET bank/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "gender": "M"
          }
        },
        {
          "match": {
            "address": "mill"
          }
        }
      ],
      "must_not": [
        {
          "match": {
            "age": "38"
          }
        }
      ]
    }
  }
}
# should:应该达到should列举的条件,如果到达会增加相关文档的评分,并不会改变查询的结果。如果query中只有should且只有一种匹配规则,那么should的条件就会被作为默认匹配条件二区改变查询结果。
GET bank/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "gender": "M"
          }
        },
        {
          "match": {
            "address": "mill"
          }
        }
      ],
      "must_not": [
        {
          "match": {
            "age": "18"
          }
        }
      ],
      "should": [
        {
          "match": {
            "lastname": "Wallace"
          }
        }
      ]
    }
  }
}

6d1b8667621625b054e3e20b506f915c_433d9a03461e7db46228ad7c17502b02.png


能够看到相关度越高,得分也越高。


query/filter【结果过滤】


  • must 贡献得分
  • should 贡献得分
  • must_not 不贡献得分
  • filter 不贡献得分


匹配的越多,得分越高!只有must和should影响相关性得分


GET bank/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": {"address": "mill" } }
      ],
      "filter": {  
        "range": {
          "balance": {  
            "gte": "10000",
            "lte": "20000"
          }
        }
      }
    }
  }
}
#这条是不会显示结果的,结果都是0
GET bank/_search
{
  "query": {
    "bool": {
      "filter": {
        "range": {
          "balance": {
            "gte": "10000",
            "lte": "20000"
          }
        }
      }
    }
  }
}

query/term


# 全文检索字段用match,其他非text字段匹配用term。
GET bank/_search
{
  "query": {
    "term": {
      "address": "mill Road"
    }
  }
}

aggs/agg1(聚合)

# 聚合查询就是查一点这个,查一点哪个,不是单一的查询
GET bank/_search
{
  "query": {
    "match": {
      "address": "Mill"
    }
  },
  "aggs": {
    "ageAgg": {
      "terms": {
        "field": "age",
        "size": 10
      }
    },
    "ageAvg": {
      "avg": {
        "field": "age"
      }
    },
    "balanceAvg": {
      "avg": {
        "field": "balance"
      }
    }
  },
  "size": 0
}

e64a59d796486276b34614b1f216a8b7_0409d494fc8af8ef0afec2e494ebee94.png


nested对象聚合


GET articles/_search
{
  "size": 0, 
  "aggs": {
    "nested": { # 
      "nested": { #
        "path": "payment"
      },
      "aggs": {
        "amount_avg": {
          "avg": {
            "field": "payment.amount"
          }
        }
      }
    }
  }
}

Mapping字段映射

映射定义文档如何被存储和检索的


Mapping映射是用来定义一个文档(document),以及它所包含的属性(field)是如何存储和索引的。比如:使用mapping来定义:


当然对于已经存在的字段进行映射的时候,我们不能进行更新。更新必须创建新的索引才行


# 创建索引的时候去指定
PUT /my_index
{
  "mappings": {
    "properties": {
      "age": {
        "type": "integer"
      },
      "email": {
        "type": "keyword" 
      },
      "name": {
        "type": "text"
      }
    }
  }
}

更新索引


对于已经存在的字段映射,我们不能更新。更新必须创建新的索引,进行数据迁移。


添加新的字段映射`PUT /my_index/_mapping`
PUT /my_index/_mapping
{
  "properties": {
    "employee-id": {
      "type": "keyword",
      "index": false # 字段不能被检索。检索
    }
  }
}

数据迁移


POST _reindex
{
  "source": {
    "index": "bank",
    "type": "account"
  },
  "dest": {
    "index": "newbank"
  }
}
GET /newbank/_search

分词

ES中的分词就是能接收一个字符流,将之分割为独立的token(词元,通常是独立的单词),然后输入tokens流


简单来说,ES解析你的输入,是一个一个字的去读取


如下:


50cdc03ca3f7b33464d2e7383ed35c9c_c97406851d4727204921589fa8dcb16a.png


使用分词器


关于分词器: https://www.elastic.co/guide/en/elasticsearch/reference/7.6/analysis.html


对于中文,我们需要安装额外的分词器


下载地址:


https://github.com/medcl/elasticsearch-analysis-ik/releases


将上面下载的插件,放到es的plugs文件夹下


[root@localhost plugins]# pwd
/mydata/elasticsearch/plugins/

重启ES,测试分词器是否安装成功

GET _analyze
{
   "text":"我是中国人"
}

如果可以解析中文成功,证明安装成功


9414c073b3e935e1636652ba0d83f965_ff824262a8756d53c546d98a0c28310d.png


调整虚拟机内存为3G


自定义词库

上面ES虽然能识别中文了,但是也是一个一个字识别的,我们想让它按我们的规矩来的话,我们需要自定义一下


默认es是不支持一些新的词语,它的词库里组合是很不规律的


那么我们可以自定义扩展一下这个词库!怎么扩展呢?


安装Nginx

将分词内容放到nginx服务器当中


随便启动一个Nginx,目的是单纯为了复制配置文件
docker run -p 80:80 --name nginx -d nginx:1.10
docker container cp nginx:/etc/nginx .
mv nginx/ conf
mv conf/ /mydata/nginx
docker rm -f nginx
结构如下:
[root@localhost mydata]# cd nginx/
[root@localhost nginx]# ls
conf
[root@localhost nginx]# cd conf/
[root@localhost conf]# ls
conf.d  fastcgi_params  html  koi-utf  koi-win  logs  mime.types  modules  nginx.conf  scgi_params  uwsgi_params  win-utf
docker run -p 80:80 --name nginx \
-v /mydata/nginx/html:/usr/share/nginx/html \
-v /mydata/nginx/logs:/var/log/nginx \
-v /mydata/nginx/conf:/etc/nginx \
-d nginx:1.10
mkdir /mydata/nginx/html/es
[root@localhost ~]# cat /mydata/nginx/html/es/fenci.txt 
彭于晏
杜兰特

修改ESIKAnalyzer.cfg.xml


[root@localhost config]# cat IKAnalyzer.cfg.xml 
CTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
  <comment>IK Analyzer 扩展配置</comment>
  <!--用户可以在这里配置自己的扩展字典 -->
  <entry key="ext_dict"></entry>
   <!--用户可以在这里配置自己的扩展停止词字典-->
  <entry key="ext_stopwords"></entry>
  <!--用户可以在这里配置远程扩展字典 -->
  <entry key="remote_ext_dict">http://192.168.1.8/es/fenci.txt</entry> 
  <!--用户可以在这里配置远程扩展停止词字典-->
  <!-- <entry key="remote_ext_stopwords">words_location</entry> -->
</properties>

修改完成后,需要重启es容器,否则修改不生效。


docker restart elasticsearch
POST _analyze
{
   "analyzer": "ik_max_word", 
   "text":"我是彭于晏"
}

9de91c4e64cf83da9a89a56f97d409a2_55f45801808c9ed16a09be17cd212192.png


当然后面有新词的话,那就直接在es指定目录下继续添加新词即可


整合ElasticSearch

我们先来一个简单的写写试试


官方文档如下:

https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/java-rest-high-document-index.html


配置类

固定写法,其中builder = RestClient.builder(new HttpHost(“192.168.1.8”, 9200, “http”));如果有ES集群的话可以用来指定多个ES主机地址

@Configuration
public class EsConfig {
    public static final RequestOptions COMMON_OPTIONS;
    static {
        RequestOptions.Builder builder = RequestOptions.DEFAULT.toBuilder();
        COMMON_OPTIONS = builder.build();
    }
     /**
     * 主要是给容器中注入一个RestHighLevelClient
     * @return
     */
    @Bean
    public RestHighLevelClient esRestClient() {
        RestClientBuilder builder = null;
        // 可以指定多个es
        builder = RestClient.builder(new HttpHost("192.168.1.8", 9200, "http"));
        RestHighLevelClient client = new RestHighLevelClient(builder);
        return client;
    }
}

测试类

保存数据

测试的时候要在启动类上排除数据库


@SpringBootApplication(exclude = {DataSourceAutoConfiguration.class})


0018b601cb706a8eac2a49de84443f1a_65ebf972b9e2a2e1db2113115a9dfe0f.png


6d17c401e14cd7f31cfc0191cf63b4e2_eb258f187bb41a1342b910a6ea653d2c.png

@SpringBootTest
class MallSearchApplicationTests {
    @Resource
    private RestHighLevelClient client;
    @Test
    void contextLoads() {
    }
    @Test
    public void indexData() throws IOException {
        // 设置索引
        IndexRequest indexRequest = new IndexRequest ("users");
        indexRequest.id("1");
        User user = new User();
        user.setUserName("张三");
        user.setAge(20);
        user.setGender("男");
        String jsonString = JSON.toJSONString(user);
        //设置要保存的内容,指定数据和类型
        indexRequest.source(jsonString, XContentType.JSON);
        //执行创建索引和保存数据
        IndexResponse index = client.index(indexRequest, EsConfig.COMMON_OPTIONS);
        System.out.println(index);
    }

601da91d69f3c5be213bd277e838551e_b9e9edf6d200d9a78a6f2b8cfde1625e.png


简单的检索

@Test
public void find() throws IOException {
    // 1 创建检索请求
    SearchRequest searchRequest = new SearchRequest();
    searchRequest.indices("bank");
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
    sourceBuilder.query(QueryBuilders.matchQuery("address","mill"));
    //        可以加多个检索条件
    //        sourceBuilder.query(QueryBuilders.matchQuery("address","mill"));
    //        sourceBuilder.query(QueryBuilders.matchQuery("address","mill"));
    System.out.println(sourceBuilder.toString());
    searchRequest.source(sourceBuilder);
    // 2 执行检索
    SearchResponse response = client.search(searchRequest, EsConfig.COMMON_OPTIONS);
    // 3 分析响应结果
    System.out.println(response.toString());
}
{"query":{"match":{"address":{"query":"mill","operator":"OR","prefix_length":0,"max_expansions":50,"fuzzy_transpositions":true,"lenient":false,"zero_terms_query":"NONE","auto_generate_synonyms_phrase_query":true,"boost":1.0}}}}
{"took":10,"timed_out":false,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0},"hits":{"total":{"value":4,"relation":"eq"},"max_score":5.4032025,"hits":[{"_index":"bank","_type":"account","_id":"970","_score":5.4032025,"_source":{"account_number":970,"balance":19648,"firstname":"Forbes","lastname":"Wallace","age":28,"gender":"M","address":"990 Mill Road","employer":"Pheast","email":"forbeswallace@pheast.com","city":"Lopezo","state":"AK"}},{"_index":"bank","_type":"account","_id":"136","_score":5.4032025,"_source":{"account_number":136,"balance":45801,"firstname":"Winnie","lastname":"Holland","age":38,"gender":"M","address":"198 Mill Lane","employer":"Neteria","email":"winnieholland@neteria.com","city":"Urie","state":"IL"}},{"_index":"bank","_type":"account","_id":"345","_score":5.4032025,"_source":{"account_number":345,"balance":9812,"firstname":"Parker","lastname":"Hines","age":38,"gender":"M","address":"715 Mill Avenue","employer":"Baluba","email":"parkerhines@baluba.com","city":"Blackgum","state":"KY"}},{"_index":"bank","_type":"account","_id":"472","_score":5.4032025,"_source":{"account_number":472,"balance":25571,"firstname":"Lee","lastname":"Long","age":32,"gender":"F","address":"288 Mill Street","employer":"Comverges","email":"leelong@comverges.com","city":"Movico","state":"MT"}}]}}

复杂的查询

/**
     * 复杂检索:在bank中搜索address中包含mill的所有人的年龄分布以及平均年龄,平均薪资
     *
     * @throws IOException
     */
@Test
public void searchData() throws IOException {
    //1. 创建检索请求
    SearchRequest searchRequest = new SearchRequest();
    //1.1)指定索引
    searchRequest.indices("bank");
    //1.2)构造检索条件
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
    sourceBuilder.query(QueryBuilders.matchQuery("address", "Mill"));
    //1.2.1)按照年龄分布进行聚合
    TermsAggregationBuilder ageAgg = AggregationBuilders.terms("ageAgg").field("age").size(10);
    sourceBuilder.aggregation(ageAgg);
    //1.2.2)计算平均年龄
    AvgAggregationBuilder ageAvg = AggregationBuilders.avg("ageAvg").field("age");
    sourceBuilder.aggregation(ageAvg);
    //1.2.3)计算平均薪资
    AvgAggregationBuilder balanceAvg = AggregationBuilders.avg("balanceAvg").field("balance");
    sourceBuilder.aggregation(balanceAvg);
    System.out.println("检索条件:" + sourceBuilder);
    searchRequest.source(sourceBuilder);
    //2. 执行检索
    SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
    System.out.println("检索结果:" + searchResponse);
    //3. 将检索结果封装为Bean
    SearchHits hits = searchResponse.getHits();
    SearchHit[] searchHits = hits.getHits();
    for (SearchHit searchHit : searchHits) {
        String sourceAsString = searchHit.getSourceAsString();
        Account account = JSON.parseObject(sourceAsString, Account.class);
        System.out.println(account);
    }
    //4. 获取聚合信息
    Aggregations aggregations = searchResponse.getAggregations();
    Terms ageAgg1 = aggregations.get("ageAgg");
    for (Terms.Bucket bucket : ageAgg1.getBuckets()) {
        String keyAsString = bucket.getKeyAsString();
        System.out.println("年龄:" + keyAsString + " ==> " + bucket.getDocCount());
    }
    Avg ageAvg1 = aggregations.get("ageAvg");
    System.out.println("平均年龄:" + ageAvg1.getValue());
    Avg balanceAvg1 = aggregations.get("balanceAvg");
    System.out.println("平均薪资:" + balanceAvg1.getValue());
}

执行结果如下:


检索条件:{"query":{"match":{"address":{"query":"Mill","operator":"OR","prefix_length":0,"max_expansions":50,"fuzzy_transpositions":true,"lenient":false,"zero_terms_query":"NONE","auto_generate_synonyms_phrase_query":true,"boost":1.0}}},"aggregations":{"ageAgg":{"terms":{"field":"age","size":10,"min_doc_count":1,"shard_min_doc_count":0,"show_term_doc_count_error":false,"order":[{"_count":"desc"},{"_key":"asc"}]}},"ageAvg":{"avg":{"field":"age"}},"balanceAvg":{"avg":{"field":"balance"}}}}
检索结果:{"took":11,"timed_out":false,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0},"hits":{"total":{"value":4,"relation":"eq"},"max_score":5.4032025,"hits":[{"_index":"bank","_type":"account","_id":"970","_score":5.4032025,"_source":{"account_number":970,"balance":19648,"firstname":"Forbes","lastname":"Wallace","age":28,"gender":"M","address":"990 Mill Road","employer":"Pheast","email":"forbeswallace@pheast.com","city":"Lopezo","state":"AK"}},{"_index":"bank","_type":"account","_id":"136","_score":5.4032025,"_source":{"account_number":136,"balance":45801,"firstname":"Winnie","lastname":"Holland","age":38,"gender":"M","address":"198 Mill Lane","employer":"Neteria","email":"winnieholland@neteria.com","city":"Urie","state":"IL"}},{"_index":"bank","_type":"account","_id":"345","_score":5.4032025,"_source":{"account_number":345,"balance":9812,"firstname":"Parker","lastname":"Hines","age":38,"gender":"M","address":"715 Mill Avenue","employer":"Baluba","email":"parkerhines@baluba.com","city":"Blackgum","state":"KY"}},{"_index":"bank","_type":"account","_id":"472","_score":5.4032025,"_source":{"account_number":472,"balance":25571,"firstname":"Lee","lastname":"Long","age":32,"gender":"F","address":"288 Mill Street","employer":"Comverges","email":"leelong@comverges.com","city":"Movico","state":"MT"}}]},"aggregations":{"lterms#ageAgg":{"doc_count_error_upper_bound":0,"sum_other_doc_count":0,"buckets":[{"key":38,"doc_count":2},{"key":28,"doc_count":1},{"key":32,"doc_count":1}]},"avg#ageAvg":{"value":34.0},"avg#balanceAvg":{"value":25208.0}}}
MallSearchApplicationTests.Account(account_number=970, balance=19648, firstname=Forbes, lastname=Wallace, age=28, gender=M, address=990 Mill Road, employer=Pheast, email=forbeswallace@pheast.com, city=Lopezo, state=AK)
MallSearchApplicationTests.Account(account_number=136, balance=45801, firstname=Winnie, lastname=Holland, age=38, gender=M, address=198 Mill Lane, employer=Neteria, email=winnieholland@neteria.com, city=Urie, state=IL)
MallSearchApplicationTests.Account(account_number=345, balance=9812, firstname=Parker, lastname=Hines, age=38, gender=M, address=715 Mill Avenue, employer=Baluba, email=parkerhines@baluba.com, city=Blackgum, state=KY)
MallSearchApplicationTests.Account(account_number=472, balance=25571, firstname=Lee, lastname=Long, age=32, gender=F, address=288 Mill Street, employer=Comverges, email=leelong@comverges.com, city=Movico, state=MT)
年龄:38 ==> 2
年龄:28 ==> 1
年龄:32 ==> 1
平均年龄:34.0
平均薪资:25208.0

和我们Kibana显示的结果也对应上了

e64a59d796486276b34614b1f216a8b7_0409d494fc8af8ef0afec2e494ebee94.png

相关实践学习
使用阿里云Elasticsearch体验信息检索加速
通过创建登录阿里云Elasticsearch集群,使用DataWorks将MySQL数据同步至Elasticsearch,体验多条件检索效果,简单展示数据同步和信息检索加速的过程和操作。
ElasticSearch 入门精讲
ElasticSearch是一个开源的、基于Lucene的、分布式、高扩展、高实时的搜索与数据分析引擎。根据DB-Engines的排名显示,Elasticsearch是最受欢迎的企业搜索引擎,其次是Apache Solr(也是基于Lucene)。 ElasticSearch的实现原理主要分为以下几个步骤: 用户将数据提交到Elastic Search 数据库中 通过分词控制器去将对应的语句分词,将其权重和分词结果一并存入数据 当用户搜索数据时候,再根据权重将结果排名、打分 将返回结果呈现给用户 Elasticsearch可以用于搜索各种文档。它提供可扩展的搜索,具有接近实时的搜索,并支持多租户。
相关文章
|
存储 SQL 自然语言处理
谷粒商城--整合Elasticsearch和商品的上架-3
谷粒商城--整合Elasticsearch和商品的上架
157 0
|
自然语言处理 关系型数据库 MySQL
谷粒商城--整合Elasticsearch和商品的上架-1
谷粒商城--整合Elasticsearch和商品的上架
154 0
|
4月前
|
存储 安全 数据管理
如何在 Rocky Linux 8 上安装和配置 Elasticsearch
本文详细介绍了在 Rocky Linux 8 上安装和配置 Elasticsearch 的步骤,包括添加仓库、安装 Elasticsearch、配置文件修改、设置内存和文件描述符、启动和验证 Elasticsearch,以及常见问题的解决方法。通过这些步骤,你可以快速搭建起这个强大的分布式搜索和分析引擎。
129 5
|
5月前
|
存储 JSON Java
elasticsearch学习一:了解 ES,版本之间的对应。安装elasticsearch,kibana,head插件、elasticsearch-ik分词器。
这篇文章是关于Elasticsearch的学习指南,包括了解Elasticsearch、版本对应、安装运行Elasticsearch和Kibana、安装head插件和elasticsearch-ik分词器的步骤。
475 0
elasticsearch学习一:了解 ES,版本之间的对应。安装elasticsearch,kibana,head插件、elasticsearch-ik分词器。
|
6月前
|
NoSQL 关系型数据库 Redis
mall在linux环境下的部署(基于Docker容器),Docker安装mysql、redis、nginx、rabbitmq、elasticsearch、logstash、kibana、mongo
mall在linux环境下的部署(基于Docker容器),docker安装mysql、redis、nginx、rabbitmq、elasticsearch、logstash、kibana、mongodb、minio详细教程,拉取镜像、运行容器
mall在linux环境下的部署(基于Docker容器),Docker安装mysql、redis、nginx、rabbitmq、elasticsearch、logstash、kibana、mongo
|
7月前
|
数据可视化 Docker 容器
一文教会你如何通过Docker安装elasticsearch和kibana 【详细过程+图解】
这篇文章提供了通过Docker安装Elasticsearch和Kibana的详细过程和图解,包括下载镜像、创建和启动容器、处理可能遇到的启动失败情况(如权限不足和配置文件错误)、测试Elasticsearch和Kibana的连接,以及解决空间不足的问题。文章还特别指出了配置文件中空格的重要性以及环境变量中字母大小写的问题。
一文教会你如何通过Docker安装elasticsearch和kibana 【详细过程+图解】
|
7月前
|
JSON 自然语言处理 数据库
Elasticsearch从入门到项目部署 安装 分词器 索引库操作
这篇文章详细介绍了Elasticsearch的基本概念、倒排索引原理、安装部署、IK分词器的使用,以及如何在Elasticsearch中进行索引库的CRUD操作,旨在帮助读者从入门到项目部署全面掌握Elasticsearch的使用。
|
8月前
|
Docker 容器
docker desktop安装es并连接elasticsearch-head:5
以上就是在Docker Desktop上安装Elasticsearch并连接Elasticsearch-head:5的步骤。
317 2
|
7月前
|
Ubuntu Oracle Java
如何在 Ubuntu VPS 上安装 Elasticsearch
如何在 Ubuntu VPS 上安装 Elasticsearch
85 0
|
7月前
|
存储 Ubuntu Oracle
在Ubuntu 14.04上安装和配置Elasticsearch的方法
在Ubuntu 14.04上安装和配置Elasticsearch的方法
75 0

热门文章

最新文章