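Preface
In some applications, Elasticsearch is rarely used for full-text search. It mostly serves consumer-facing (ToC) queries and behaves more like a cache database.
Business-facing (ToB) queries have fixed search fields, such as name, gender, or address, so you can query the matching DB columns directly. ToC queries are different: there is usually a single search box that maps to several search fields, as shown in the figure:
Handling this in the DB is clumsy; Elasticsearch is a better fit.
This article implements hotel search with SpringBoot and Elasticsearch and lists some commonly used Elasticsearch APIs along the way.
The features are:
1. Create the hotel index in Elasticsearch and configure its mapping.
2. Bulk-import hotel records from MySQL into Elasticsearch.
3. Add a single hotel record to both MySQL and Elasticsearch.
4. Query hotels from Elasticsearch by keyword.
5. Query hotels from Elasticsearch by coordinates.
6. Query hotels from Elasticsearch by price range.
Note: the main goal is to demonstrate commonly used APIs, so the design is not optimized. The table design has some flaws (price should not live in the hotel table); the code is not optimized either, with paging and sort order hard-coded; and Elasticsearch is not fully configured, since the commonly used IK analyzer is not installed.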
Follow the WeChat official account 麒麟改bug to discuss Java and get core study notes.
Code & Walkthrough
pom.xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<parent>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-parent</artifactId>
<version>2.4.0</version>
<relativePath /> <!-- lookup parent from repository -->
</parent>
<groupId>org.leo</groupId>
<artifactId>hotel-server</artifactId>
<version>0.0.1-SNAPSHOT</version>
<name>hotel-server</name>
<description>Hotel-MySQL-ES</description>
<properties>
<java.version>1.8</java.version>
</properties>
<dependencies>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-data-elasticsearch</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
<groupId>org.mybatis.spring.boot</groupId>
<artifactId>mybatis-spring-boot-starter</artifactId>
<version>2.1.4</version>
</dependency>
<dependency>
<groupId>mysql</groupId>
<artifactId>mysql-connector-java</artifactId>
<scope>runtime</scope>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-test</artifactId>
<scope>test</scope>
</dependency>
<!-- https://mvnrepository.com/artifact/com.google.guava/guava -->
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
<version>29.0-jre</version>
</dependency>
<!-- https://mvnrepository.com/artifact/com.google.code.gson/gson -->
<dependency>
<groupId>com.google.code.gson</groupId>
<artifactId>gson</artifactId>
</dependency>
</dependencies>
<build>
<plugins>
<plugin>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-maven-plugin</artifactId>
</plugin>
</plugins>
</build>
</project>
application.properties
server.servlet.context-path=/hotel
server.port=8080
#AOP
spring.aop.proxy-target-class=true
#JDBC
spring.datasource.name=mall
spring.datasource.url=jdbc:mysql://127.0.0.1:3306/mall?useUnicode=true&useSSL=false
spring.datasource.username=root
spring.datasource.password=root
spring.datasource.driver-class-name=com.mysql.cj.jdbc.Driver
# Location of the MyBatis mapper XML files
mybatis.mapper-locations=classpath:mapper/*.xml
# Map underscore column names to camelCase properties
mybatis.configuration.map-underscore-to-camel-case=true
Table creation statement
CREATE TABLE `t_hotel` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`hotel_name` varchar(128) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '' COMMENT 'hotel name',
`province` varchar(128) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '' COMMENT 'province',
`city` varchar(128) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '' COMMENT 'city',
`area` varchar(128) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '' COMMENT 'district',
`location` varchar(128) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '' COMMENT 'longitude,latitude',
`landmark` varchar(128) COLLATE utf8mb4_unicode_ci DEFAULT '' COMMENT 'landmarks',
`label` varchar(128) COLLATE utf8mb4_unicode_ci DEFAULT NULL COMMENT 'labels',
`price` int(11) NOT NULL DEFAULT '0' COMMENT 'price',
`available_flag` tinyint(4) NOT NULL DEFAULT '2' COMMENT 'business flag: 1 = open, 2 = closed',
`hotel_desc` varchar(128) COLLATE utf8mb4_unicode_ci DEFAULT NULL COMMENT 'description',
`create_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'opening time',
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci COMMENT='hotel table';
This table is for demonstration only: some fields are missing and some do not really belong here.
Entity class & VO
Getters and setters are omitted.
public class HotelEntity implements Serializable {
private static final long serialVersionUID = 1980059323208910883L;
/**
* id
*/
private Integer id;
/**
* Hotel name
*/
private String hotelName;
/**
* Province
*/
private String province;
/**
* City
*/
private String city;
/**
* District
*/
private String area;
/**
* Longitude and latitude, stored as "lon,lat"
*/
private String location;
/**
* Landmarks
*/
private String landmark;
/**
* Labels
*/
private String label;
/**
* Price
*/
private Integer price;
/**
* Business flag: 1 = open, 2 = closed
*/
private Integer availableFlag;
/**
* Description
*/
private String hotelDesc;
/**
* Opening time
*/
private Date createTime;
}
public class GeoLocation implements Serializable {
private static final long serialVersionUID = 8940851639489109853L;
private double lat;
private double lon;
}
public class HotelESVO implements Serializable {
private static final long serialVersionUID = -2334463717030263164L;
/**
* id
*/
private Integer id;
/**
* Hotel name
*/
private String hotelName;
/**
* Province
*/
private String province;
/**
* City
*/
private String city;
/**
* District
*/
private String area;
/**
* Longitude and latitude
*/
private GeoLocation location;
/**
* Landmarks
*/
private String landmark;
/**
* Labels
*/
private String label;
/**
* Price
*/
private Integer price;
/**
* Business flag: 1 = open, 2 = closed
*/
private Integer availableFlag;
/**
* Description
*/
private String hotelDesc;
/**
* Opening time
*/
private String createTime;
/**
* Search keywords
*/
private String searchKeywords;
}
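HotelESVO differs from HotelEntity in a few ways:
1. The coordinates become a GeoLocation object.
2. There is an extra searchKeywords field, used for searching.
The reason for this field: users typically search by hotel name, province, city, district, landmark (Wangfujing, Chunxi Road, Jinli, and so on), or label (popular photo spot, station pickup, and so on). Rather than searching across every field, it works better to gather these keywords into one field and search only that.
3. The time changes from Date to String, because the value sent to Elasticsearch has to match the date format declared in the mapping, and Gson's default Date serialization does not produce that format.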
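The idea behind searchKeywords can be sketched in plain Java without any Elasticsearch dependency. This is only an illustration: the class name `SearchKeywordsSketch` and the sample values are hypothetical, and it assumes labels and landmarks arrive as comma-separated strings, as they do in the controller code later in the article.

```java
import java.util.LinkedHashSet;
import java.util.Set;

final class SearchKeywordsSketch {

    // Collects every non-blank value into one whitespace-separated string.
    // Comma-separated values (labels, landmarks) are split into single keywords first.
    static String build(String... values) {
        Set<String> keywords = new LinkedHashSet<>(); // keeps first-seen order, drops duplicates
        for (String value : values) {
            if (value == null) {
                continue; // nullable columns are simply skipped
            }
            for (String part : value.split(",")) {
                String trimmed = part.trim();
                if (!trimmed.isEmpty()) {
                    keywords.add(trimmed);
                }
            }
        }
        return String.join(" ", keywords);
    }

    public static void main(String[] args) {
        // "Beijing" is both province and city here, so it appears only once in the result.
        System.out.println(build("HomeInn", "Beijing", "Beijing", "Dongcheng", "Wangfujing,Tiananmen", null));
    }
}
```

Because the result is separated by single spaces, it lines up with a whitespace analyzer: each keyword becomes exactly one token.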
Hotel index mapping
{
"properties": {
"id": {
"type": "integer"
},
"hotelName": {
"type": "keyword"
},
"province": {
"type": "keyword"
},
"city": {
"type": "keyword"
},
"area": {
"type": "keyword"
},
"location": {
"type": "geo_point"
},
"landmark": {
"type": "keyword"
},
"label": {
"type": "keyword"
},
"price": {
"type": "integer"
},
"availableFlag": {
"type": "integer"
},
"hotelDesc": {
"type": "text",
"index": "false"
},
"searchKeywords": {
"type": "text",
"analyzer": "whitespace"
},
"createTime": {
"type": "date",
"format": "yyyy-MM-dd HH:mm:ss"
}
}
}
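A few points to note:
1. To keep the demo simple, searchKeywords uses the whitespace analyzer rather than IK.
2. Some fields do not need to be indexed; for example, indexing is disabled for hotelDesc.
3. The coordinates are mapped as geo_point, which the distance query and distance sort used later require.
4. Date-time fields need an explicit format.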
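To make point 4 concrete, here is a minimal sketch of formatting a `java.util.Date` with the same pattern the mapping declares for createTime. It uses `java.time` rather than the library the controller uses, and the class name `CreateTimeFormatSketch` and the explicit time-zone parameter are assumptions for illustration. The zone matters because the same instant formats to different strings in different zones.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;
import java.util.Date;

final class CreateTimeFormatSketch {

    // Same pattern as the "format" of the createTime field in the mapping.
    private static final DateTimeFormatter ES_FORMAT = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Formats the date in the given zone; which zone to use is the caller's decision.
    static String format(Date date, ZoneId zone) {
        return ES_FORMAT.withZone(zone).format(date.toInstant());
    }

    public static void main(String[] args) {
        Date date = Date.from(Instant.parse("2020-11-12T08:30:00Z"));
        System.out.println(format(date, ZoneId.of("UTC")));           // 2020-11-12 08:30:00
        System.out.println(format(date, ZoneId.of("Asia/Shanghai"))); // 2020-11-12 16:30:00
    }
}
```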
RestClientConfig
@Configuration
public class RestClientConfig extends AbstractElasticsearchConfiguration {
@Override
@Bean
public RestHighLevelClient elasticsearchClient() {
final ClientConfiguration clientConfiguration = ClientConfiguration.builder().connectedTo("127.0.0.1:9200")
.build();
return RestClients.create(clientConfiguration).rest();
}
}
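SpringBoot used to offer ElasticsearchTemplate for talking to Elasticsearch, but it is now deprecated; the Java High Level REST Client should be used instead.
The Mapper and Service that talk to the database are not listed here.
Controller structure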
@RestController
public class HotelController {
private static final Logger logger = LoggerFactory.getLogger(HotelController.class);
@Autowired
private HotelService hotelService;
@Autowired
RestHighLevelClient highLevelClient;
private HotelESVO buildVoFromEntity(HotelEntity hotelEntity) {
// A bean-copy utility could be used here instead
HotelESVO vo = new HotelESVO();
vo.setArea(hotelEntity.getArea());
vo.setAvailableFlag(hotelEntity.getAvailableFlag());
vo.setCity(hotelEntity.getCity());
vo.setHotelDesc(hotelEntity.getHotelDesc());
vo.setHotelName(hotelEntity.getHotelName());
vo.setId(hotelEntity.getId());
vo.setLabel(hotelEntity.getLabel());
vo.setLandmark(hotelEntity.getLandmark());
vo.setPrice(hotelEntity.getPrice());
vo.setProvince(hotelEntity.getProvince());
// Format the time; LocalDateTime here is org.joda.time.LocalDateTime
vo.setCreateTime(LocalDateTime.fromDateFields(hotelEntity.getCreateTime()).toString("yyyy-MM-dd HH:mm:ss"));
// Parse the coordinates; the location column is stored as "lon,lat"
String[] lonLat = hotelEntity.getLocation().split(",");
GeoLocation location = new GeoLocation();
location.setLat(Double.valueOf(lonLat[1]));
location.setLon(Double.valueOf(lonLat[0]));
vo.setLocation(location);
// Build the search keywords; a Set removes duplicates
Set<String> skw = Sets.newHashSet();
// label and landmark are nullable in t_hotel, so guard against null before splitting
if (hotelEntity.getLabel() != null) {
skw.addAll(Arrays.asList(hotelEntity.getLabel().split(",")));
}
if (hotelEntity.getLandmark() != null) {
skw.addAll(Arrays.asList(hotelEntity.getLandmark().split(",")));
}
skw.add(hotelEntity.getProvince());
skw.add(hotelEntity.getArea());
skw.add(hotelEntity.getCity());
skw.add(hotelEntity.getHotelName());
String searchKeywords = Joiner.on(" ").skipNulls().join(skw);
vo.setSearchKeywords(searchKeywords);
return vo;
}
}
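The private method converts an Entity into a VO.
All the following methods are written in the Controller. In real projects, put them wherever your team's conventions require.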
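The conversion trusts that the location string is well formed. A slightly more defensive version of that parsing step might look like the sketch below. It assumes the "lon,lat" order that the controller's index usage implies; the class name `LocationParseSketch` and the range checks are additions for illustration and are not part of the original project.

```java
final class LocationParseSketch {

    // Returns {lat, lon} parsed from a "lon,lat" string.
    static double[] parseLonLat(String location) {
        String[] parts = location.split(",");
        if (parts.length != 2) {
            throw new IllegalArgumentException("expected \"lon,lat\" but got: " + location);
        }
        double lon = Double.parseDouble(parts[0].trim());
        double lat = Double.parseDouble(parts[1].trim());
        // Reject values outside the valid latitude/longitude ranges.
        if (lat < -90 || lat > 90 || lon < -180 || lon > 180) {
            throw new IllegalArgumentException("coordinates out of range: " + location);
        }
        return new double[] { lat, lon };
    }

    public static void main(String[] args) {
        double[] latLon = parseLonLat("116.40,39.90");
        System.out.println("lat=" + latLon[0] + ", lon=" + latLon[1]);
    }
}
```

Failing early here gives a clearer error than a rejected document at indexing time.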
Creating the index
@GetMapping("/addHotelIndex")
@ResponseBody
public CreateIndexResponse addHotelIndex() throws IOException {
CreateIndexRequest req = new CreateIndexRequest("hotel");
req.settings(Settings.builder().put("index.number_of_shards", 1).put("index.number_of_replicas", 1));
req.mapping(
"{\n" + " \"properties\": {\n" + " \"id\": {\n" + " \"type\": \"integer\"\n"
+ " },\n" + " \"hotelName\": {\n" + " \"type\": \"keyword\"\n"
+ " },\n" + " \"province\": {\n" + " \"type\": \"keyword\"\n"
+ " },\n" + " \"city\": {\n" + " \"type\": \"keyword\"\n"
+ " },\n" + " \"area\": {\n" + " \"type\": \"keyword\"\n"
+ " },\n" + " \"location\": {\n" + " \"type\": \"geo_point\"\n"
+ " },\n" + " \"landmark\": {\n" + " \"type\": \"keyword\"\n"
+ " },\n" + " \"label\": {\n" + " \"type\": \"keyword\"\n"
+ " },\n" + " \"price\": {\n" + " \"type\": \"integer\"\n"
+ " },\n" + " \"availableFlag\": {\n" + " \"type\": \"integer\"\n"
+ " },\n" + " \"hotelDesc\": {\n" + " \"type\": \"text\",\n"
+ " \"index\": \"false\"\n" + " },\n" + " \"searchKeywords\": {\n"
+ " \"type\": \"text\",\n" + " \"analyzer\": \"whitespace\"\n"
+ " },\n" + " \"createTime\": {\n" + " \"type\": \"date\",\n"
+ " \"format\": \"yyyy-MM-dd HH:mm:ss\"\n" + " }\n" + " }\n" + "}",
XContentType.JSON);
CreateIndexResponse createIndexResponse = highLevelClient.indices().create(req, RequestOptions.DEFAULT);
return createIndexResponse;
}
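Normally, creating an index in Elasticsearch is like creating a table in MySQL: it is done by running a script or through Kibana. It lives in code here only to demonstrate the API.
Importing database data into ES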
@GetMapping("/importHotel2ES")
@ResponseBody
public BulkResponse importHotel2ES() throws IOException {
// For large tables, fetch the rows in batches rather than all at once
List<HotelEntity> hotels = hotelService.findAll();
BulkRequest bulkReq = new BulkRequest("hotel");
Gson gson = new Gson();
for (HotelEntity hotelEntity : hotels) {
HotelESVO vo = this.buildVoFromEntity(hotelEntity);
IndexRequest req = new IndexRequest();
req.id(hotelEntity.getId().toString());
req.source(gson.toJson(vo), XContentType.JSON);
bulkReq.add(req);
}
BulkResponse bulkRes = highLevelClient.bulk(bulkReq, RequestOptions.DEFAULT);
return bulkRes;
}
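In real projects Elasticsearch is often introduced after the system is already running, so a bulk-import initialization method is well worth having. To cope with large data volumes, the import should run in batches.
The main point here is to demonstrate Elasticsearch's bulk API.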
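Importing in batches comes down to splitting the rows into fixed-size chunks and sending one bulk request per chunk. The splitting step can be sketched in plain Java as follows; the class name `BatchSketch` and the batch size of 3 are hypothetical, and the bulk call itself is left out.

```java
import java.util.ArrayList;
import java.util.List;

final class BatchSketch {

    // Splits a list into consecutive sublists of at most batchSize elements.
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 1; i <= 7; i++) {
            ids.add(i);
        }
        // Each batch would become one bulk request in the real import.
        for (List<Integer> batch : partition(ids, 3)) {
            System.out.println(batch);
        }
    }
}
```

Since Guava is already a dependency of this project, `Lists.partition` would do the same job. For very large tables the rows should also be read from MySQL page by page, so that the full list never sits in memory.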
Adding a hotel
@PostMapping("/add")
@ResponseBody
public HotelEntity add(@RequestBody HotelEntity entity) throws IOException {
entity.setCreateTime(new Date());
hotelService.add(entity);
logger.info("Saved: " + entity.toString());
// Sync to ES
Gson gson = new Gson();
HotelESVO vo = this.buildVoFromEntity(entity);
IndexRequest req = new IndexRequest("hotel");
req.id(entity.getId().toString());
req.source(gson.toJson(vo), XContentType.JSON);
IndexResponse indexResponse = highLevelClient.index(req, RequestOptions.DEFAULT);
logger.info("Synced to ES: " + indexResponse.toString());
return entity;
}
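In practice, depending on business requirements, data can be synchronized to Elasticsearch as shown above, which keeps the index close to real time. Alternatively, use the asynchronous indexAsync method that the client provides, or run a scheduled job that bulk-imports from the database at intervals.
Keyword search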
@GetMapping("/search")
@ResponseBody
public List<HotelESVO> search(@RequestParam String keywords) throws IOException {
SearchRequest searchRequest = new SearchRequest("hotel");
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
// Query: the default operator is OR; AND is set explicitly here
sourceBuilder.query(QueryBuilders.matchQuery("searchKeywords", keywords).operator(Operator.AND));
// Paging
sourceBuilder.from(0);
sourceBuilder.size(10);
// Sorting
sourceBuilder.sort(new FieldSortBuilder("id").order(SortOrder.DESC));
// Choose which fields to return and which to hide
String[] includeFields = new String[] { "id", "hotelName", "province", "city", "area", "location", "landmark",
"label", "availableFlag", "hotelDesc", "price", "createTime" };
String[] excludeFields = new String[] { "searchKeywords" };
sourceBuilder.fetchSource(includeFields, excludeFields);
searchRequest.source(sourceBuilder);
SearchResponse res = highLevelClient.search(searchRequest, RequestOptions.DEFAULT);
SearchHit[] searchHits = res.getHits().getHits();
Gson gson = new Gson();
List<HotelESVO> vos = Lists.newArrayList();
for (SearchHit hit : searchHits) {
String sourceAsString = hit.getSourceAsString();
vos.add(gson.fromJson(sourceAsString, HotelESVO.class));
}
return vos;
}
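Note the operator at the end of the QueryBuilders call. The default is OR: if the user enters "Beijing Sichuan", any hotel whose searchKeywords contains Beijing or Sichuan is returned. With AND, searchKeywords must contain both Beijing and Sichuan.
With OR, this search would therefore also return every other hotel in Beijing. Whether to use OR or AND in practice depends on the business requirements.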
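The difference between the two operators can be modeled in a few lines of plain Java. This is a deliberately simplified sketch: it only mimics whitespace tokenization and yes/no matching, not Elasticsearch's relevance scoring, and the class name `OperatorSketch` and the sample strings are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class OperatorSketch {

    // Whitespace tokenization, mirroring the whitespace analyzer used for searchKeywords.
    static Set<String> tokens(String text) {
        return new HashSet<>(Arrays.asList(text.trim().split("\\s+")));
    }

    // OR: at least one query token appears in the document.
    static boolean matchesOr(String document, String query) {
        Set<String> docTokens = tokens(document);
        return tokens(query).stream().anyMatch(docTokens::contains);
    }

    // AND: every query token appears in the document.
    static boolean matchesAnd(String document, String query) {
        return tokens(document).containsAll(tokens(query));
    }

    public static void main(String[] args) {
        String doc = "HomeInn Beijing Dongcheng Wangfujing";
        System.out.println(matchesOr(doc, "Beijing Sichuan"));     // true
        System.out.println(matchesAnd(doc, "Beijing Sichuan"));    // false
        System.out.println(matchesAnd(doc, "Beijing Wangfujing")); // true
    }
}
```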
Searching by latitude and longitude
The distance is given in kilometers, and results are sorted from nearest to farthest.
@GetMapping("/searchGeo")
@ResponseBody
public List<HotelESVO> searchGeo(@RequestParam Double lat, @RequestParam Double lon, @RequestParam Integer distance)
throws IOException {
SearchRequest searchRequest = new SearchRequest("hotel");
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
// Query
sourceBuilder.query(
QueryBuilders.geoDistanceQuery("location").point(lat, lon).distance(distance, DistanceUnit.KILOMETERS));
// Paging
sourceBuilder.from(0);
sourceBuilder.size(10);
// Sorting: ascending distance, so the nearest hotels come first
sourceBuilder.sort(new GeoDistanceSortBuilder("location", lat, lon).order(SortOrder.ASC));
// Choose which fields to return and which to hide
String[] includeFields = new String[] { "id", "hotelName", "province", "city", "area", "location", "landmark",
"label", "availableFlag", "hotelDesc", "price", "createTime" };
String[] excludeFields = new String[] { "searchKeywords" };
sourceBuilder.fetchSource(includeFields, excludeFields);
searchRequest.source(sourceBuilder);
SearchResponse res = highLevelClient.search(searchRequest, RequestOptions.DEFAULT);
SearchHit[] searchHits = res.getHits().getHits();
Gson gson = new Gson();
List<HotelESVO> vos = Lists.newArrayList();
for (SearchHit hit : searchHits) {
String sourceAsString = hit.getSourceAsString();
vos.add(gson.fromJson(sourceAsString, HotelESVO.class));
}
return vos;
}
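For intuition about what a distance filter checks, the great-circle distance between two points can be computed with the haversine formula. The sketch below is standalone Java and is not how Elasticsearch is invoked; the class name `GeoDistanceSketch`, the mean Earth radius of 6371 km, and the sample coordinates are assumptions for illustration, so results may differ slightly from what Elasticsearch computes.

```java
final class GeoDistanceSketch {

    private static final double EARTH_RADIUS_KM = 6371.0; // mean Earth radius, an approximation

    // Great-circle distance in kilometers between two lat/lon points (haversine formula).
    static double distanceKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    // True when the point lies within radiusKm of the search center.
    static boolean withinKm(double centerLat, double centerLon, double lat, double lon, double radiusKm) {
        return distanceKm(centerLat, centerLon, lat, lon) <= radiusKm;
    }

    public static void main(String[] args) {
        // Approximate city-center coordinates for Beijing and Shanghai.
        double km = distanceKm(39.9042, 116.4074, 31.2304, 121.4737);
        System.out.printf("Beijing to Shanghai: about %.0f km%n", km);
    }
}
```

Sorting hotels by this value in ascending order gives the nearest-first ordering described above.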
Price range search
@GetMapping("/searchPrice")
@ResponseBody
public List<HotelESVO> searchPrice(@RequestParam(required = false) Integer min,
@RequestParam(required = false) Integer max) throws IOException {
if (Objects.isNull(min)) {
min = 0;
}
if (Objects.isNull(max)) {
max = Integer.MAX_VALUE;
}
SearchRequest searchRequest = new SearchRequest("hotel");
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
// Query
sourceBuilder.query(QueryBuilders.rangeQuery("price").gte(min).lte(max));
// Paging
sourceBuilder.from(0);
sourceBuilder.size(10);
// Sorting
sourceBuilder.sort(new FieldSortBuilder("price").order(SortOrder.ASC));
// Choose which fields to return and which to hide
String[] includeFields = new String[] { "id", "hotelName", "province", "city", "area", "location", "landmark",
"label", "availableFlag", "hotelDesc", "price", "createTime" };
String[] excludeFields = new String[] { "searchKeywords" };
sourceBuilder.fetchSource(includeFields, excludeFields);
searchRequest.source(sourceBuilder);
SearchResponse res = highLevelClient.search(searchRequest, RequestOptions.DEFAULT);
SearchHit[] searchHits = res.getHits().getHits();
Gson gson = new Gson();
List<HotelESVO> vos = Lists.newArrayList();
for (SearchHit hit : searchHits) {
String sourceAsString = hit.getSourceAsString();
vos.add(gson.fromJson(sourceAsString, HotelESVO.class));
}
return vos;
}