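I. Introduction to Elasticsearch
Elasticsearch is a distributed document store. Instead of storing information as rows of columnar data, it stores complex data structures that have been serialized as JSON documents. When a cluster has multiple nodes, stored documents are distributed across the cluster and can be accessed immediately from any node.
When a document is stored, it is indexed and becomes searchable in near real time (within about 1 second). Elasticsearch uses a data structure called an inverted index, which supports fast full-text search. An inverted index lists every unique word that appears in any document and identifies all the documents in which each word occurs.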
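The inverted index described above can be illustrated with a toy sketch. This is not how Lucene or Elasticsearch implement it internally; the class name `InvertedIndexSketch`, the sample documents, and the naive tokenizer (lowercase, split on anything that is not a letter or digit) are all assumptions made for illustration.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

/** Toy inverted index: maps each unique term to the sorted set of document IDs that contain it. */
class InvertedIndexSketch {

    static Map<String, SortedSet<Integer>> build(List<String> docs) {
        Map<String, SortedSet<Integer>> index = new TreeMap<>();
        for (int docId = 0; docId < docs.size(); docId++) {
            // Naive "analysis": lowercase, then split on runs of non-letter, non-digit characters.
            String[] terms = docs.get(docId).toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+");
            for (String term : terms) {
                if (term.isEmpty()) {
                    continue;
                }
                index.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
            }
        }
        return index;
    }

    public static void main(String[] args) {
        // Hypothetical sample documents; the list position is used as the document ID.
        List<String> docs = List.of("quick brown fox", "quick red fox", "lazy brown dog");
        Map<String, SortedSet<Integer>> index = build(docs);
        // Looking up a term returns the documents containing it without scanning every document.
        System.out.println("quick -> " + index.get("quick"));
        System.out.println("brown -> " + index.get("brown"));
    }
}
```

A search for a word then becomes a map lookup followed by set operations on document IDs, which is why full-text queries stay fast as the number of documents grows.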
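An index can be thought of as an optimized collection of documents, and each document is a collection of fields, which are the key-value pairs that contain the data. By default, Elasticsearch indexes all data in every field, and each indexed field has a dedicated, optimized data structure. For example, text fields are stored in inverted indices, while numeric and geo fields are stored in BKD trees. Combining these per-field data structures is what makes Elasticsearch searches so fast.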
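II. Installing an Elasticsearch Cluster
This article installs version 7.15.2 (the Docker section below uses the 7.14.2 image); some parameters differ in older versions. The JDK version is JDK 17.
Traditional installation
1. Install the JDK
2. Extract the installation package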
#Extract the archive
tar -zxvf elasticsearch-7.15.2-linux-x86_64.tar.gz -C /usr/local/
#Rename the extracted directory
mv /usr/local/elasticsearch-7.15.2 /usr/local/elasticsearch
3. Create a user and group
Elasticsearch cannot be started as root, so a dedicated account is required.
#Create the group
groupadd es
#Create the user
useradd -g es snail_es
#Grant ownership of the installation directory
chown -R snail_es:es /usr/local/elasticsearch/
4. Create the directory that holds Elasticsearch data and logs, and grant ownership
#Create the directory and grant ownership
mkdir -p /usr/local/es
chown -R snail_es:es /usr/local/es
5. Edit the configuration file config/elasticsearch.yml (adjust the values below accordingly on each node)
# ======================== Elasticsearch Configuration =========================
#
# NOTE: Elasticsearch comes with reasonable defaults for most settings.
#       Before you set out to tweak and tune the configuration, make sure you
#       understand what are you trying to accomplish and the consequences.
#
# The primary way of configuring a node is via this file. This template lists
# the most important settings you may want to configure for a production cluster.
#
# Please consult the documentation for further information on configuration options:
# https://www.elastic.co/guide/en/elasticsearch/reference/index.html
#
# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
#Cluster name; identical on all three nodes
cluster.name: my-es
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#Node name; different on every node
node.name: node-1
#
# Add custom attributes to the node:
#
#node.attr.rack: r1
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#Data directory
path.data: /usr/local/es/data
#
# Path to log files:
#Log directory
path.logs: /usr/local/es/logs
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
#bootstrap.memory_lock: true
#
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# By default Elasticsearch is only accessible on localhost. Set a different
# address here to expose this node on the network:
#IP address of the current host
network.host: 192.168.139.160
#
# By default Elasticsearch listens for HTTP traffic on the first free port it
# finds starting at 9200. Set a specific HTTP port here:
#HTTP port exposed to clients
http.port: 9200
#
# For more information, consult the network module documentation.
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when this node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#Cluster discovery; the default transport port is 9300
discovery.seed_hosts: ["192.168.139.160","192.168.139.161", "192.168.139.162"]
#Names of the initial master-eligible nodes
# Bootstrap the cluster using an initial set of master-eligible nodes:
#
cluster.initial_master_nodes: ["node-1", "node-2","node-3"]
#
# For more information, consult the discovery and cluster formation module documentation.
#
# ---------------------------------- Various -----------------------------------
#
# Require explicit names when deleting indices:
#
#action.destructive_requires_name: true
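6. Adjust the server's system settings; otherwise Elasticsearch reports errors at startup
(1) Elasticsearch uses a large number of file descriptors (file handles). Running out of them can be disastrous and will most likely lead to data loss. Make sure the open-file limit for the user running Elasticsearch is raised to 65536 or higher by setting nofile to 65536 in /etc/security/limits.conf.
(2) Elasticsearch uses a number of thread pools for different kinds of operations, and it must be able to create new threads whenever needed. Make sure the Elasticsearch user can create at least 4096 threads, either by running ulimit -u 4096 as root before starting Elasticsearch, or by setting nproc to 4096 in /etc/security/limits.conf.
#Fix: vi /etc/security/limits.conf and add the following lines
* soft nofile 65536
* hard nofile 131072
* soft nproc 4096
* hard nproc 4096
The new limits apply to new login sessions, so log in again (or reboot the server) for them to take effect.
(3) Elasticsearch uses an mmapfs directory by default to store its indices. The operating system's default limit on mmap counts is likely to be too low, which may result in out-of-memory exceptions. To raise it temporarily, run:
sysctl -w vm.max_map_count=262144
#Fix (permanent): append the following line to /etc/sysctl.conf
vm.max_map_count=262144
Then run /sbin/sysctl -p to apply it immediately.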
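After changing the kernel setting, it can help to confirm the live value before starting a node. The following is a minimal sketch, assuming a Linux host where the current value is readable at /proc/sys/vm/max_map_count; the class and method names are hypothetical and not part of Elasticsearch.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Checks whether vm.max_map_count meets the 262144 minimum used in this article. */
class MaxMapCountCheck {

    // Minimum value configured in /etc/sysctl.conf above.
    static final long REQUIRED = 262144L;

    /** Parses the raw file content (for example "262144\n") and compares it with the minimum. */
    static boolean isSufficient(String rawValue) {
        return Long.parseLong(rawValue.trim()) >= REQUIRED;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: Linux exposes the current setting through procfs at this path.
        String raw = Files.readString(Path.of("/proc/sys/vm/max_map_count"));
        String verdict = isSufficient(raw) ? "ok" : "too low, raise it as described above";
        System.out.println("vm.max_map_count=" + raw.trim() + " (" + verdict + ")");
    }
}
```

The file descriptor and thread limits can be checked in a similar spirit by running ulimit -n and ulimit -u as the Elasticsearch user after logging in again.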
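7. Switch to the Elasticsearch user, change into the installation directory, and start each of the three nodes
./bin/elasticsearch
#Start in the background
./bin/elasticsearch -d
8. Check the result by opening the following URL in a browser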
http://192.168.139.160:9200/_cat/nodes?pretty
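The same check can be scripted instead of done in a browser. The sketch below uses the JDK's built-in `java.net.http.HttpClient` to call `_cat/nodes` and counts the lines in the reply, on the assumption that the plain `_cat/nodes` output (without the `v` header parameter) contains one line per node. The class name and helper methods are hypothetical; the host and port are the ones used in this article.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

/** Calls _cat/nodes on one node and reports how many nodes that node can see. */
class CatNodesCheck {

    static URI catNodesUri(String host, int port) {
        return URI.create("http://" + host + ":" + port + "/_cat/nodes");
    }

    /** Counts non-blank lines; assumes one line per node and no header row. */
    static long countNodes(String body) {
        return body.lines().filter(line -> !line.isBlank()).count();
    }

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(catNodesUri("192.168.139.160", 9200))
                .GET()
                .build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
        // With the three-node setup from this article, 3 lines are expected once the cluster has formed.
        System.out.println("nodes visible: " + countNodes(response.body()));
    }
}
```

If fewer nodes are listed than expected, the `discovery.seed_hosts` and `cluster.name` values on the missing node are a reasonable first thing to compare.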
Docker installation
1. Pull the image
[root@bogon ~]# docker pull elasticsearch:7.14.2
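2. Create the mount directories and grant permissions
[root@localhost ~]# mkdir -p /data/es/{conf,data,logs,plugins}
#Grant permissions
[root@localhost ~]# chmod 777 -R /data/
3. Configuration file, saved as /data/es/conf/elasticsearch.yml. Only the node name needs to change on each node:
node.name: node-1, node.name: node-2, node.name: node-3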
# ======================== Elasticsearch Configuration =========================
#
# NOTE: Elasticsearch comes with reasonable defaults for most settings.
#       Before you set out to tweak and tune the configuration, make sure you
#       understand what are you trying to accomplish and the consequences.
#
# The primary way of configuring a node is via this file. This template lists
# the most important settings you may want to configure for a production cluster.
#
# Please consult the documentation for further information on configuration options:
# https://www.elastic.co/guide/en/elasticsearch/reference/index.html
#
# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
#Cluster name; identical on all three nodes
cluster.name: my-es
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#Node name; different on every node
node.name: node-1
#
# Add custom attributes to the node:
#
#node.attr.rack: r1
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#Data directory
#path.data: /usr/local/es/data
#
# Path to log files:
#Log directory
#path.logs: /usr/local/es/logs
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
#bootstrap.memory_lock: true
#
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# By default Elasticsearch is only accessible on localhost. Set a different
# address here to expose this node on the network:
#Bind address; 0.0.0.0 listens on all interfaces
network.host: 0.0.0.0
#
# By default Elasticsearch listens for HTTP traffic on the first free port it
# finds starting at 9200. Set a specific HTTP port here:
#HTTP port exposed to clients
http.port: 9200
#
# For more information, consult the network module documentation.
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when this node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#Cluster discovery; the default transport port is 9300
discovery.seed_hosts: ["192.168.139.160","192.168.139.161", "192.168.139.162"]
#Names of the initial master-eligible nodes
# Bootstrap the cluster using an initial set of master-eligible nodes:
#
cluster.initial_master_nodes: ["node-1", "node-2","node-3"]
#
# For more information, consult the discovery and cluster formation module documentation.
#
# ---------------------------------- Various -----------------------------------
#
# Require explicit names when deleting indices:
#
#action.destructive_requires_name: true
4. Create the Docker container; first apply the system settings from step 6 of the traditional installation above
docker run --name elasticsearch --privileged=true --net=host \
  -v /data/es/conf/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml \
  -v /data/es/data:/usr/share/elasticsearch/data \
  -v /data/es/logs:/usr/share/elasticsearch/logs \
  -v /data/es/plugins:/usr/share/elasticsearch/plugins \
  -d elasticsearch:7.14.2
5. Verify
http://192.168.139.160:9200/_cat/nodes?pretty
III. Configuring the Chinese Analyzer
Download the IK analyzer release that matches your Elasticsearch version:
https://github.com/medcl/elasticsearch-analysis-ik/releases
Traditional installation
1. Create the ik directory
[root@bogon plugins]# mkdir -p /usr/local/elasticsearch/plugins/ik
2. Unzip the downloaded analyzer into the ik directory
[root@bogon ik]# unzip /root/elasticsearch-analysis-ik-7.15.2.zip -d /usr/local/elasticsearch/plugins/ik/
3. Start Elasticsearch and verify the analyzer
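One way to verify the plugin is to send some Chinese text to the `_analyze` API with one of the analyzers the IK plugin registers (`ik_max_word` or `ik_smart`). The sketch below is an assumption-laden example rather than the author's method: the class name and helpers are hypothetical, the host is the one used earlier in this article, and the JSON body is built by hand with minimal escaping to avoid extra dependencies.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

/** Sends a sample sentence to the _analyze API using an IK analyzer and prints the raw response. */
class IkAnalyzeCheck {

    // Minimal JSON string escaping (backslash and double quote only); enough for this sketch.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    static String analyzeBody(String analyzer, String text) {
        return "{\"analyzer\":\"" + escape(analyzer) + "\",\"text\":\"" + escape(text) + "\"}";
    }

    public static void main(String[] args) throws Exception {
        String body = analyzeBody("ik_max_word", "世界乒乓球锦标赛");
        HttpRequest request = HttpRequest.newBuilder(URI.create("http://192.168.139.160:9200/_analyze"))
                .header("Content-Type", "application/json")
                // BodyPublishers.ofString encodes the body as UTF-8.
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        // A token list suggests the plugin is loaded; an error about an unknown analyzer suggests it is not.
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}
```

Swapping `ik_max_word` for `ik_smart` in the same request shows the coarser-grained segmentation for comparison.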
Docker installation
1. Unzip into the mounted plugins directory
[root@bogon plugins]# unzip /root/elasticsearch-analysis-ik-7.14.2.zip -d /data/es/plugins/ik
2. Restart the container
[root@bogon plugins]# docker restart elasticsearch
3. Verify in the same way as above
IV. Integrating with Spring Boot
Maven dependencies
<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
    </dependency>
    <!-- Swagger dependency -->
    <dependency>
        <groupId>io.springfox</groupId>
        <artifactId>springfox-boot-starter</artifactId>
        <version>3.0.0</version>
    </dependency>
</dependencies>
Core code
package com.xiaojie.es.service;

import com.xiaojie.es.entity.User;
import com.xiaojie.es.mapper.UserMapper;
import com.xiaojie.es.util.ElasticSearchUtils;
import org.apache.commons.lang3.RandomStringUtils;
import org.apache.commons.lang3.RandomUtils;
import org.apache.commons.lang3.StringUtils;
import org.elasticsearch.common.Strings;
import org.elasticsearch.index.query.BoolQueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.elasticsearch.search.fetch.subphase.FetchSourceContext;
import org.elasticsearch.search.sort.SortOrder;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

import java.io.IOException;
import java.util.List;
import java.util.Map;

/**
 * User service: writes users to the database and to Elasticsearch, and searches them.
 *
 * @author yan
 * @date 2021.11.30
 */
@Service
public class UserService {
    @Autowired
    private UserMapper userMapper;
    @Autowired
    private ElasticSearchUtils elasticSearchUtils;

    // Add users
    public void add() throws IOException {
        // elasticSearchUtils.createIndex("user");
        // Character pool used to generate random three-character names
        String chars = "11月29日在美国休斯敦进行的2021世界乒乓球锦标赛女子双打决赛中中国组合孙颖莎王曼昱以3比0击败日本组合伊藤美诚早田希娜夺得冠军";
        for (int i = 0; i < 100; i++) {
            User user = new User();
            user.setName(RandomStringUtils.random(3, chars));
            user.setAge(RandomUtils.nextInt(18, 40));
            userMapper.add(user);
            // Add the same user to Elasticsearch
            elasticSearchUtils.addData(user, "user");
        }
    }

    /**
     * Search users.
     *
     * @author yan
     * @date 2021/11/30 16:24
     * @return the matching documents, one map per hit
     */
    public List<Map<String, Object>> search() throws IOException {
        // Build the query conditions
        BoolQueryBuilder boolQueryBuilder = new BoolQueryBuilder();
        // Exact match (term query on the keyword sub-field)
        //boolQueryBuilder.must(QueryBuilders.termQuery("name.keyword", "张三"));
        // Fuzzy match: a wildcard pattern needs * or ?, otherwise it only matches the exact term
        boolQueryBuilder.filter(QueryBuilders.wildcardQuery("name", "*王*"));
        // Range query: from/to are inclusive by default; gt is exclusive (>), gte is inclusive (>=),
        // lt is exclusive (<), lte is inclusive (<=)
        boolQueryBuilder.filter(QueryBuilders.rangeQuery("age").from(18).to(32));
        SearchSourceBuilder query = new SearchSourceBuilder();
        query.query(boolQueryBuilder);
        // Fields to return; leave empty to return all fields
        String fields = "";
        // Field to highlight
        String highlightField = "name";
        if (StringUtils.isNotBlank(fields)) {
            // Return only the listed fields. Skip this call to return every field.
            query.fetchSource(new FetchSourceContext(true, fields.split(","), Strings.EMPTY_ARRAY));
        }
        // Paging: offset of the first hit, i.e. (pageNum - 1) * pageSize
        int from = 0;
        // Paging: number of hits per page, i.e. pageSize
        int size = 10;
        // Apply the paging parameters
        query.from(from);
        query.size(size);
        // Sort field and order. Note: a text field must be sorted through its .keyword sub-field
        //query.sort("age", SortOrder.DESC);
        query.sort("name" + ".keyword", SortOrder.ASC);
        return elasticSearchUtils.searchListData("user", query, highlightField);
    }
}
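The `from` value passed to `SearchSourceBuilder` is the zero-based offset of the first hit rather than a page number, so a page-based API needs a small conversion. The helper below is a minimal sketch of that arithmetic, assuming 1-based page numbers; `PagingSketch` and `offset` are hypothetical names, not part of the project or of the Elasticsearch client.

```java
/** Converts a 1-based page number and a page size into the offset expected by query.from(...). */
class PagingSketch {

    static int offset(int pageNum, int pageSize) {
        if (pageNum < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNum and pageSize must be at least 1");
        }
        // multiplyExact throws ArithmeticException instead of silently overflowing.
        return Math.multiplyExact(pageNum - 1, pageSize);
    }

    public static void main(String[] args) {
        int pageSize = 10;
        for (int pageNum = 1; pageNum <= 3; pageNum++) {
            System.out.println("page " + pageNum + " -> from=" + offset(pageNum, pageSize) + ", size=" + pageSize);
        }
    }
}
```

In the service above this would be used as `query.from(PagingSketch.offset(pageNum, size))` once `pageNum` is passed in by the caller.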
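Full code: the es module of the spring-boot repository (code integrating Spring Boot with Redis, message middleware, and related components)
References:
Elasticsearch Chinese Documentation (《Elasticsearch中文文档》) | Elasticsearch technical forum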