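1 Graphical environment (graph database)
1.1 Background
Gremlin is the graph traversal language of the Apache TinkerPop framework. It supports both OLTP and OLAP and is currently the mainstream query language in the graph database field, comparable to what SQL is for relational databases.
HugeGraph is an open-source graph database developed in China that fully supports Gremlin. This article describes how to set up a graphical environment for running Gremlin on top of HugeGraph.
The HugeGraph GitHub organization hosts many subprojects; only two of them are needed here:
hugegraph
and hugegraph-studio
The HugeGraph manual is at:
https://hugegraph.github.io/hugegraph-doc/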
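1.2 Deploy HugeGraphServer
1.2.1 Download and extract the package
Download the release package from GitHub.
Address: https://github.com/hugegraph/hugegraph
After downloading, upload it to the virtual machine and extract it: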
$ tar -zxvf hugegraph-0.11.2.tar.gz
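1.2.2 Configure parameters
Go into the hugegraph-0.11.2 directory and edit conf/rest-server.properties.
$ vim conf/rest-server.properties
restserver.url is the address at which HugeGraphServer exposes its RESTful API. With the host set to 127.0.0.1 the service can only be reached from the local machine; change the host and port parts as needed.
Here I let every address connect (host 0.0.0.0), and because port 8080 was already taken I switched to 8008. On a cloud machine, remember to open the port in the security group, otherwise the web page will not be reachable.
graphs is a list of key-value pairs mapping graph names to configuration files. hugegraph:conf/hugegraph.properties means that HugeGraphServer exposes a graph instance named hugegraph whose configuration file is conf/hugegraph.properties. The graph configuration file can be left alone, and although the graph name could be changed, the default is fine here.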
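To make the two settings above concrete, here is a minimal sketch that reads restserver.url and graphs from the text of a properties file using only the Java standard library. The class name RestServerConfig and its fields are hypothetical helpers of mine, not part of HugeGraph, and the sketch assumes graphs holds entries of the form [name:path] as shown above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.Properties;

// Hypothetical helper (not part of HugeGraph): extracts the values discussed above.
class RestServerConfig {
    final String host;
    final int port;
    final String graphName;
    final String graphConfPath;

    RestServerConfig(String host, int port, String graphName, String graphConfPath) {
        this.host = host;
        this.port = port;
        this.graphName = graphName;
        this.graphConfPath = graphConfPath;
    }

    // Parses the text of conf/rest-server.properties.
    static RestServerConfig parse(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        URI url = URI.create(props.getProperty("restserver.url").trim());
        // Assumed format: [name:path]. Strip the brackets and take the first entry.
        String graphs = props.getProperty("graphs").trim();
        String entry = graphs.substring(1, graphs.length() - 1).split(",")[0].trim();
        int colon = entry.indexOf(':');
        return new RestServerConfig(url.getHost(), url.getPort(),
                entry.substring(0, colon), entry.substring(colon + 1));
    }

    public static void main(String[] args) {
        String sample = "restserver.url=http://0.0.0.0:8008\n"
                + "graphs=[hugegraph:conf/hugegraph.properties]\n";
        RestServerConfig c = parse(sample);
        System.out.println(c.host + ":" + c.port + " -> " + c.graphName
                + " (" + c.graphConfPath + ")");
    }
}
```

With the sample text, the host is 0.0.0.0, the port is 8008, and the graph name is hugegraph, which are the three values the Studio configuration later has to agree with.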
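1.2.3 Initialize the backend
Run the bin/init-store.sh script:
$ bin/init-store.sh
This step initializes the storage backend, which is configured in conf/hugegraph.properties.
In that file, backend=rocksdb is the setting that selects RocksDB as the backend.
Other supported backends are memory, cassandra, scylladb, hbase, mysql and palo. There is no need to change anything here; the default rocksdb is fine.
After initialization, a rocksdb-data directory appears in the current directory. This is where the backend data is stored, so do not delete or move it casually.
*Note: the backend only needs to be initialized once, before the service is started for the first time; do not run it on every start. Running it again does no harm, though, because hugegraph detects that initialization has already been done and skips it.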
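1.2.4 Start the service
Run the command:
$ bin/start-hugegraph.sh
When the output shows OK, the service has started successfully. We can also check the process with jps.
Next, verify it in a browser.
At this point the deployment of HugeGraphServer is complete.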
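The browser check can also be done from code. The sketch below sends a GET request to the server and prints the response body. It assumes that the server lists its graphs under the /graphs path and that it listens on port 8008 as configured above; the class name ServerCheck is my own. Please treat the endpoint as an assumption and check the manual for your version.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: checks that the REST service answers over HTTP.
class ServerCheck {
    // Builds the URL that is assumed to list the graphs exposed by the server.
    static String graphsUrl(String host, int port) {
        return "http://" + host + ":" + port + "/graphs";
    }

    // Performs a plain GET and returns the response body as text.
    static String fetch(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setConnectTimeout(3000);
        conn.setReadTimeout(3000);
        try (InputStream in = conn.getInputStream()) {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
            return new String(buf.toByteArray(), StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "127.0.0.1";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 8008;
        try {
            System.out.println(fetch(graphsUrl(host, port)));
        } catch (IOException e) {
            // Typical causes: server not started, wrong port, or a closed security group.
            System.out.println("Server not reachable: " + e.getMessage());
        }
    }
}
```

If the request fails from another machine but works locally, the host in restserver.url or the cloud security group is the first thing to check.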
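1.3 Deploy HugeGraphStudio
1.3.1 Download and extract the package
Download the release package from GitHub.
Address: https://github.com/hugegraph/hugegraph-studio
After downloading, upload it to the virtual machine (preferably next to the HugeGraphServer installation so it is easy to find) and extract it: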
$ tar -zxvf hugegraph-studio-0.11.0.tar.gz
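1.3.2 Configure parameters
Go into the hugegraph-studio-0.11.0 directory and edit its only configuration file.
$ vim conf/hugegraph-studio.properties
The parameters to change are
graph.server.host=localhost, graph.server.port=8080 and graph.name=hugegraph.
They correspond to settings in the HugeGraphServer configuration file conf/rest-server.properties,
where:
graph.server.host=192.168.0.161 corresponds to the host in restserver.url=http://0.0.0.0:8008 (0.0.0.0 is only a bind address, so Studio is given the machine's actual IP);
graph.server.port=8008 corresponds to the port in restserver.url=http://0.0.0.0:8008;
graph.name=hugegraph corresponds to the graph name in graphs=[hugegraph:conf/hugegraph.properties].
If conf/rest-server.properties was left at its defaults, conf/hugegraph-studio.properties needs no changes either. I changed the port to 8008 earlier, so the values here are changed to match.
*Note: prefer the machine's own IP address over 127.0.0.1 or localhost, otherwise the service cannot be reached from outside.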
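The correspondence between the two files can be written down as a small sketch. The class StudioConfig and the method fromServer are hypothetical names of mine. The only logic is that the port and graph name are copied, while a 0.0.0.0 host is replaced by an address that clients can actually reach.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: derives the Studio settings from the server settings.
class StudioConfig {
    static Map<String, String> fromServer(String restServerUrl, String reachableHost,
                                          String graphName) {
        URI url = URI.create(restServerUrl);
        // 0.0.0.0 means "listen on all interfaces"; a client needs a concrete address.
        String host = "0.0.0.0".equals(url.getHost()) ? reachableHost : url.getHost();
        Map<String, String> props = new LinkedHashMap<>();
        props.put("graph.server.host", host);
        props.put("graph.server.port", String.valueOf(url.getPort()));
        props.put("graph.name", graphName);
        return props;
    }

    public static void main(String[] args) {
        Map<String, String> props =
                fromServer("http://0.0.0.0:8008", "192.168.0.161", "hugegraph");
        // Print in the key=value form used by conf/hugegraph-studio.properties.
        for (Map.Entry<String, String> e : props.entrySet()) {
            System.out.println(e.getKey() + "=" + e.getValue());
        }
    }
}
```

Running it with the values from this article prints the three lines that go into conf/hugegraph-studio.properties.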
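1.3.3 Start the service
Run the command:
$ bin/hugegraph-studio.sh
To start it in the background:
$ nohup bin/hugegraph-studio.sh &
Once it has started successfully, open it in a browser:
http://ip:8088 brings up the Studio interface.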
2 Sample data
2.1 Create the relationship graph
Test data script:
// PropertyKey
graph.schema().propertyKey("name").asText().ifNotExist().create()
graph.schema().propertyKey("age").asInt().ifNotExist().create()
graph.schema().propertyKey("addr").asText().ifNotExist().create()
graph.schema().propertyKey("lang").asText().ifNotExist().create()
graph.schema().propertyKey("tag").asText().ifNotExist().create()
graph.schema().propertyKey("weight").asFloat().ifNotExist().create()
// VertexLabel
graph.schema().vertexLabel("person").properties("name", "age", "addr", "weight").useCustomizeStringId().ifNotExist().create()
graph.schema().vertexLabel("software").properties("name", "lang", "tag", "weight").primaryKeys("name").ifNotExist().create()
graph.schema().vertexLabel("language").properties("name", "lang", "weight").primaryKeys("name").ifNotExist().create()
// EdgeLabel
graph.schema().edgeLabel("knows").sourceLabel("person").targetLabel("person").properties("weight").ifNotExist().create()
graph.schema().edgeLabel("created").sourceLabel("person").targetLabel("software").properties("weight").ifNotExist().create()
graph.schema().edgeLabel("contains").sourceLabel("software").targetLabel("software").properties("weight").ifNotExist().create()
graph.schema().edgeLabel("define").sourceLabel("software").targetLabel("language").properties("weight").ifNotExist().create()
graph.schema().edgeLabel("implements").sourceLabel("software").targetLabel("software").properties("weight").ifNotExist().create()
graph.schema().edgeLabel("supports").sourceLabel("software").targetLabel("language").properties("weight").ifNotExist().create()
// TinkerPop
okram = graph.addVertex(T.label, "person", T.id, "okram", "name", "Marko A. Rodriguez", "age", 29, "addr", "Santa Fe, New Mexico", "weight", 1)
spmallette = graph.addVertex(T.label, "person", T.id, "spmallette", "name", "Stephen Mallette", "age", 0, "addr", "", "weight", 1)
tinkerpop = graph.addVertex(T.label, "software", "name", "TinkerPop", "lang", "java", "tag", "Graph computing framework", "weight", 1)
tinkergraph = graph.addVertex(T.label, "software", "name", "TinkerGraph", "lang", "java", "tag", "In-memory property graph", "weight", 1)
gremlin = graph.addVertex(T.label, "language", "name", "Gremlin", "lang", "groovy/python/javascript", "weight", 1)
okram.addEdge("created", tinkerpop, "weight", 1)
spmallette.addEdge("created", tinkerpop, "weight", 1)
okram.addEdge("knows", spmallette, "weight", 1)
tinkerpop.addEdge("define", gremlin, "weight", 1)
tinkerpop.addEdge("contains", tinkergraph, "weight", 1)
tinkergraph.addEdge("supports", gremlin, "weight", 1)
// Titan
dalaro = graph.addVertex(T.label, "person", T.id, "dalaro", "name", "Dan LaRocque", "age", 0, "addr", "", "weight", 1)
mbroecheler = graph.addVertex(T.label, "person", T.id, "mbroecheler", "name", "Matthias Broecheler", "age", 29, "addr", "San Francisco", "weight", 1)
titan = graph.addVertex(T.label, "software", "name", "Titan", "lang", "java", "tag", "Graph Database", "weight", 1)
dalaro.addEdge("created", titan, "weight", 1)
mbroecheler.addEdge("created", titan, "weight", 1)
okram.addEdge("created", titan, "weight", 1)
dalaro.addEdge("knows", mbroecheler, "weight", 1)
titan.addEdge("implements", tinkerpop, "weight", 1)
titan.addEdge("supports", gremlin, "weight", 1)
// HugeGraph
javeme = graph.addVertex(T.label, "person", T.id, "javeme", "name", "Jermy Li", "age", 29, "addr", "Beijing", "weight", 1)
zhoney = graph.addVertex(T.label, "person", T.id, "zhoney", "name", "Zhoney Zhang", "age", 29, "addr", "Beijing", "weight", 1)
linary = graph.addVertex(T.label, "person", T.id, "linary", "name", "Linary Li", "age", 28, "addr", "Wuhan. Hubei", "weight", 1)
hugegraph = graph.addVertex(T.label, "software", "name", "HugeGraph", "lang", "java", "tag", "Graph Database", "weight", 1)
javeme.addEdge("created", hugegraph, "weight", 1)
zhoney.addEdge("created", hugegraph, "weight", 1)
linary.addEdge("created", hugegraph, "weight", 1)
javeme.addEdge("knows", zhoney, "weight", 1)
javeme.addEdge("knows", linary, "weight", 1)
hugegraph.addEdge("implements", tinkerpop, "weight", 1)
hugegraph.addEdge("supports", gremlin, "weight", 1)
2.2 Verify with a graph query
Run this command in Studio:
g.V()
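2.3 About the sample data
1. javeme knows zhoney and linary, and they created the hugegraph graph database
2. dalaro knows mbroecheler, and together with okram they created the titan graph database
3. okram knows spmallette, and they created the graph computing framework TinkerPop
4. TinkerPop defines the Gremlin graph language
5. TinkerPop contains an in-memory graph database, TinkerGraph
6. hugegraph, titan and TinkerGraph all support the Gremlin graph language
7. hugegraph and titan both implement the TinkerPop specification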
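Some of these facts can be checked with short traversals. The sketch below just collects a few Gremlin query strings in Java so that they can be pasted into Studio or passed to a client. The class name SampleQueries is hypothetical, and the queries rely on the person vertices using their custom string ids (okram, javeme, dalaro) from the script in 2.1. I start from vertex ids rather than property filters, on the assumption that no index has been created on name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: Gremlin queries matching the facts listed above.
class SampleQueries {
    static Map<String, String> queries() {
        Map<String, String> q = new LinkedHashMap<>();
        // Fact 1: the people javeme knows.
        q.put("javeme knows", "g.V('javeme').out('knows').values('name')");
        // Facts 2 and 3: the software okram took part in creating.
        q.put("okram created", "g.V('okram').out('created').values('name')");
        // Fact 2: everyone who created the same software as dalaro (dalaro included).
        q.put("titan creators", "g.V('dalaro').out('created').in('created').values('name')");
        // Fact 7: what the software created by javeme implements.
        q.put("hugegraph implements",
                "g.V('javeme').out('created').out('implements').values('name')");
        return q;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> e : queries().entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue());
        }
    }
}
```

Each string uses only the out, in and values steps, so the direction of every edge in the sample script (for example person to software for created) is what decides whether out or in is needed.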
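3 java-client
3.1 HugeGraph-Client
Overview: HugeGraph-Client sends HTTP requests to HugeGraph-Server, then fetches and parses the results. Only a Java version is currently available. With HugeGraph-Client you can write Java code to operate HugeGraph, for example to create, read, update and delete schema and graph data, or to execute gremlin statements.
3.1.1 Environment setup
jdk1.8
maven-3.3.9
Add the hugegraph-client dependency: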
<dependencies>
    <dependency>
        <groupId>com.baidu.hugegraph</groupId>
        <artifactId>hugegraph-client</artifactId>
        <version>${version}</version>
    </dependency>
</dependencies>
The recommended version is 1.9.1.
3.1.2 Usage
Connection settings:
Define the HugeGraph connection address and graph name in application.yml:
hugeGraph:
  url: http://139.159.207.239:8008
  dbName: hugegraph
Initializing the connection:
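import com.baidu.hugegraph.driver.GremlinManager;
import com.baidu.hugegraph.driver.HugeClient;
import com.baidu.hugegraph.structure.gremlin.ResultSet;
import org.springframework.beans.factory.DisposableBean;
import org.springframework.beans.factory.InitializingBean;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Component;

import java.util.List;

@Component
public class HugeGraphComponent implements InitializingBean, DisposableBean {

    @Value("${hugeGraph.url}")
    private String hugeUrl;

    @Value("${hugeGraph.dbName}")
    private String hugeDbName;

    HugeClient hugeClient;
    GremlinManager gremlin;

    /**
     * Executes a gremlin statement and returns the graph data.
     * @param dsl the gremlin statement
     * @return List<Object>
     */
    public List<Object> executeGremlin(String dsl) {
        ResultSet execute = gremlin.gremlin(dsl).execute();
        return execute.data();
    }

    /** Closes the client when the Spring context shuts down. */
    @Override
    public void destroy() {
        hugeClient.close();
    }

    /**
     * Initializes hugeClient and gremlin once the properties have been injected.
     */
    @Override
    public void afterPropertiesSet() {
        hugeClient = HugeClient.builder(hugeUrl, hugeDbName).build();
        gremlin = hugeClient.gremlin();
    }
}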
Using the component to query the database:
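import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.test.context.junit4.SpringRunner;

import java.util.List;

@SpringBootTest
@RunWith(SpringRunner.class)
public class Demo1ApplicationTests {

    @Autowired
    private HugeGraphComponent hgc;

    // Fetches all vertices and prints them one per line.
    @Test
    public void testHuge() {
        List<Object> data = hgc.executeGremlin("g.V()");
        for (Object o : data) {
            System.out.println(o.toString());
        }
    }
}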
Result: each vertex returned by g.V() is printed on its own line.
3.2 tinkerpop.gremlin.driver
Add the Maven packages under org.apache.tinkerpop, mainly gremlin-driver and gremlin-core; it is simplest to install both.
The main statements are all documented at http://tinkerpop.apache.org/docs/current/reference/
import org.apache.tinkerpop.gremlin.driver.Client;
import org.apache.tinkerpop.gremlin.driver.Cluster;

public class GremlinConnect {

    // Path to the driver configuration file (gremlin.yaml); adjust it to your own project.
    String filename = "D:\\study\\movies-java-spring-data-neo4j_20180423\\movies-java-spring-data-neo4j\\src\\main\\resources\\conf\\gremlin.yaml";

    // Opens a cluster from the YAML file and returns a client for submitting gremlin statements.
    // Failures are propagated to the caller instead of being swallowed and turned into null.
    public Client connectGremlinServer() throws Exception {
        Cluster cluster = Cluster.open(filename);
        return cluster.connect();
    }
}
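The code above sets up the connection. It uses a client, whose main job is to submit gremlin statements.
gremlin.yaml
The file looks like this and must be configured correctly: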
hosts: [192.168.1.1]
port: 8182
serializer: {
    className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0,
    config: {
        ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry]
    }
}
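hosts is the server's IP address and port is its port.
The rest is the serializer, which can be copied as is. If it does not work, refer to gremlin-server.yaml in your installation directory:
JanusGraph/janusgraph-0.2.0-hadoop2/conf/gremlin-server/gremlin-server.yaml
That file holds all the parameters your gremlin server uses; just substitute the matching values.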
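Usage:
String gremlin = "g = graph.traversal(); g.V().has('name','aa').valueMap()";
List<Result> results = client.submit(gremlin).all().get();
What the client executes is simply a gremlin statement, much like SQL for a relational database.
client.submit(gremlin).all().get() returns the query results as a list of Result objects; to read them, iterate over the list:
for (Result result : results) {
    Map<?, ?> map = (Map<?, ?>) result.getObject(); // cast each result to a Map
}
Then read the values from the map.
Note that some of the map's keys are not Strings, so take care when looking values up.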
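One way to deal with the non-String keys is to copy each result map into a new map whose keys are converted with String.valueOf. The sketch below does this with the standard library only. ResultMaps is a hypothetical helper, and the Token enum is just a stand-in I made up to imitate non-String keys; it is not the driver's own type.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: normalizes result maps so every key is a String.
class ResultMaps {
    // Stand-in for non-String keys (an assumption, not a TinkerPop class).
    enum Token { id, label }

    // Copies the map, converting every key to its String form; values are kept as they are.
    static Map<String, Object> withStringKeys(Map<?, ?> raw) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<?, ?> e : raw.entrySet()) {
            out.put(String.valueOf(e.getKey()), e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        Map<Object, Object> raw = new LinkedHashMap<>();
        raw.put(Token.id, "okram");
        raw.put(Token.label, "person");
        raw.put("name", Arrays.asList("Marko A. Rodriguez"));
        Map<String, Object> clean = withStringKeys(raw);
        System.out.println(clean.get("id") + " / " + clean.get("name"));
    }
}
```

After the conversion, lookups such as clean.get("name") work the same way for every key, at the cost of losing the original key type.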
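4 Learning gremlin statements
The gremlin language is too large a topic to cover here, so I will just leave a link below for anyone who wants to study it further. The content is comprehensive and in-depth.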
https://blog.csdn.net/javeme/article/details/82631834
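5 Summary
Every database has its own query language, and learning one can be complex and dry. But once you truly understand how the language is used, you will find yourself immersed in it. Are you ready to learn gremlin?
Then hoist the sails and set off!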