JanusGraph Source Code Analysis 1: Download, Build, and Run

Summary: Reposted from "JanusGraph Source Code Analysis 1: Download, Build, and Run".

date: 2018-04-26
title: "JanusGraph Source Code Analysis 1: Download, Build, and Run"
author: "邓子明"
tags:
- source code
- janusgraph
categories:
- source code analysis


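I had been studying the Neo4j source code for a long time, but my company is now switching to JanusGraph, so I had to drop that halfway and start studying JanusGraph instead.
https://github.com/JanusGraph/janusgraph
http://janusgraph.org/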

1. Download and Build

I cloned the JanusGraph source with GitHub Desktop, opened the project in IntelliJ IDEA, and then built it:

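# Build the whole project
mvn --settings ~/opt/soft/apache-maven-3.5.0/conf/settings.xml -Dlicense.skip=true -DskipTests clean install
# Build only the core module (plus the modules it depends on)
mvn -pl janusgraph-core -am clean install -Dlicense.skip=true -DskipTests -P prod

# To resume a failed build from a given module, append: -rf :janusgraph-test
# Build only janusgraph-test (plus the modules it depends on)
mvn -pl janusgraph-test -am clean install -Dlicense.skip=true -DskipTests -P prod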

Next, we write an example class, FirstTest, in the janusgraph-test module:

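import static org.janusgraph.core.attribute.Geo.geoWithin;

import java.util.Map;

import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversal;
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
import org.apache.tinkerpop.gremlin.structure.Vertex;
import org.janusgraph.core.JanusGraph;
import org.janusgraph.core.JanusGraphFactory;
import org.janusgraph.core.attribute.Geoshape;
import org.janusgraph.example.GraphOfTheGodsFactory;

public class FirstTest {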

    public static void main(String[] args) {

        /*
         * The example below will open a JanusGraph graph instance and load The Graph of the Gods dataset diagrammed above.
         * JanusGraphFactory provides a set of static open methods,
         * each of which takes a configuration as its argument and returns a graph instance.
         * This tutorial calls one of these open methods on a configuration
         * that uses the BerkeleyDB storage backend and the Elasticsearch index backend,
         * then loads The Graph of the Gods using the helper class GraphOfTheGodsFactory.
         * This section skips over the configuration details, but additional information about storage backends,
         * index backends, and their configuration are available in
         * Part III, “Storage Backends”, Part IV, “Index Backends”, and Chapter 13, Configuration Reference.
         */

        // Loading the Graph of the Gods Into JanusGraph
        JanusGraph graph = JanusGraphFactory
                .open("janusgraph-dist/src/assembly/cfilter/conf/janusgraph-berkeleyje-es.properties");

        GraphOfTheGodsFactory.load(graph);
        GraphTraversalSource g = graph.traversal();

        /*
         * The typical pattern for accessing data in a graph database is to first locate the entry point into the graph
         * using a graph index. That entry point is an element (or set of elements) 
         * — i.e. a vertex or edge. From the entry elements,
         * a Gremlin path description describes how to traverse to other elements in the graph via the explicit graph structure.
         * Given that there is a unique index on name property, the Saturn vertex can be retrieved.
         * The property map (i.e. the key/value pairs of Saturn) can then be examined.
         * As demonstrated, the Saturn vertex has a name of "saturn", an age of 10000, and a type of "titan".
         * The grandchild of Saturn can be retrieved with a traversal that expresses:
         * "Who is Saturn’s grandchild?" (the inverse of "father" is "child"). The result is Hercules.
         */
        // Global Graph Indices
        Vertex saturn = g.V().has("name", "saturn").next();
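        // The two traversals below are only built here; nothing is executed
        // until they are iterated (for example with next() or toList()).
        GraphTraversal<Vertex, Map<String, Object>> vertexMapGraphTraversal = g.V(saturn).valueMap();

        GraphTraversal<Vertex, Object> values = g.V(saturn).in("father").in("father").values("name");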

        /*
         * The property place is also in a graph index. The property place is an edge property.
         * Therefore, JanusGraph can index edges in a graph index.
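         * It is possible to query The Graph of the Gods for all events that have happened within 50 kilometers of Athens
         * (latitude: 37.97, longitude: 23.72),
         * and then, given that information, to find which vertices were involved in those events.
         */
        // Note: println on a traversal prints its step list (see the program output further down),
        // not the query results; iterate it with toList() or next() to actually run the query.
        System.out.println(g.E().has("place", geoWithin(Geoshape.circle(37.97, 23.72, 50))));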
        System.out.println(g.E().has("place", geoWithin(Geoshape.circle(37.97, 23.72, 50)))
                .as("source").inV()
                .as("god2")
                .select("source").outV()
                .as("god1").select("god1", "god2")
                .by("name"));
    }

}

Then, in the file "janusgraph-dist/src/assembly/cfilter/conf/janusgraph-berkeleyje-es.properties", uncomment the settings that are commented out.

Running the example showed that the dependencies are rather troublesome.
The first run failed with:

Exception in thread "main" java.lang.IllegalArgumentException: Could not find implementation class: org.janusgraph.diskstorage.berkeleyje.BerkeleyJEStoreManager

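Looking at the code where the error is thrown, we find that janusgraph-core creates an instance of a class via reflection, but that class lives in janusgraph-berkeleyje. Since the former does not depend on the latter, the class cannot be found. We could add the latter as a dependency of the former,
but the latter already depends on the former, so that would make the two modules depend on each other; this is a consequence of how the JanusGraph project is laid out. The only thing left to try is adding both dependencies to the module that contains FirstTest.
(Note: if everything is packaged into one distribution, this problem does not exist. Running locally is different, because each module's compiled output sits in its own location.) In janusgraph-test, add: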

        <dependency>
            <groupId>org.janusgraph</groupId>
            <artifactId>janusgraph-berkeleyje</artifactId>
            <version>0.3.0-SNAPSHOT</version>
        </dependency>

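It turns out that janusgraph-berkeleyje also depends on janusgraph-test, so this creates yet another circular dependency, which is a real nuisance. This is something to watch out for when writing our own code. My workaround was to move the code into janusgraph-berkeleyje and run it from there.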
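To make the circular-dependency problem concrete, here is a minimal sketch of a cycle check over a module dependency graph. It is not how Maven is implemented; the class ModuleCycleCheck and its methods are hypothetical names made up for illustration, and the only edges used are the ones described in this post.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: a depth-first search for cycles in a module graph.
public class ModuleCycleCheck {

    // Module name -> modules it depends on.
    private final Map<String, List<String>> deps = new HashMap<>();

    public void addDependency(String from, String to) {
        deps.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
        deps.computeIfAbsent(to, k -> new ArrayList<>());
    }

    public boolean hasCycle() {
        Set<String> done = new HashSet<>();
        Set<String> onPath = new HashSet<>();
        for (String module : deps.keySet()) {
            if (visit(module, done, onPath)) {
                return true;
            }
        }
        return false;
    }

    // Returns true if a module already on the current DFS path is reached again.
    private boolean visit(String module, Set<String> done, Set<String> onPath) {
        if (onPath.contains(module)) {
            return true;
        }
        if (done.contains(module)) {
            return false;
        }
        onPath.add(module);
        for (String next : deps.get(module)) {
            if (visit(next, done, onPath)) {
                return true;
            }
        }
        onPath.remove(module);
        done.add(module);
        return false;
    }

    public static void main(String[] args) {
        ModuleCycleCheck graph = new ModuleCycleCheck();
        // Edges described in this post.
        graph.addDependency("janusgraph-berkeleyje", "janusgraph-core");
        graph.addDependency("janusgraph-berkeleyje", "janusgraph-test");
        System.out.println(graph.hasCycle()); // false

        // The dependency we tried to add in janusgraph-test closes a loop.
        graph.addDependency("janusgraph-test", "janusgraph-berkeleyje");
        System.out.println(graph.hasCycle()); // true
    }
}
```

The second check reports a cycle because janusgraph-berkeleyje and janusgraph-test would each need the other to be built first, which is why the build tool cannot order them.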

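Running it from there produces a second error:

Exception in thread "main" java.lang.IllegalArgumentException: Could not find implementation class: org.janusgraph.diskstorage.es.ElasticSearchIndex

As before, the code also needs janusgraph-es, so I had to copy it into the test sources of janusgraph-es (note: the test sources, not the main sources) and add janusgraph-berkeleyje as a dependency of janusgraph-es.
This time the program started, but it reported a connection failure because Elasticsearch was not running locally. I started it with the command: elasticsearch
Then I ran the program again: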

Exception in thread "main" org.janusgraph.core.SchemaViolationException: Adding this property for key [~T$SchemaName] and value [rtname] violates a uniqueness constraint [SystemIndex#~T$SchemaName]

A Google search turned up the cause: https://groups.google.com/forum/#!topic/aureliusgraphs/vZ_nTXlXj4k

This exception is thrown only when you already have added property key to index. So "name" is already added and next time when you run your program somewhere it is again adding "name" property key. So check if that particular code is running twice

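In the configuration file we passed in, we can find the setting storage.directory=../db/berkeley. Delete that directory and run the program again, and this time it succeeds: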

11:20:17,051  INFO GraphDatabaseConfiguration:1285 - Set default timestamp provider MICRO
11:20:17,296  INFO GraphDatabaseConfiguration:1492 - Generated unique-instance-id=c0a815a789637-dengzimings-MacBook-Pro-local1
11:20:17,547  INFO Backend:462 - Configuring index [search]
11:20:19,279  INFO Backend:177 - Initiated backend operations thread pool of size 8
11:20:19,461  INFO KCVSLog:753 - Loaded unidentified ReadMarker start time 2018-04-26T03:20:19.408Z into org.janusgraph.diskstorage.log.kcvs.KCVSLog$MessagePuller@73cd37c0
[GraphStep(edge,[]), HasStep([place.geoWithin(BUFFER (POINT (23.72 37.97), 0.44966))])]
[GraphStep(edge,[]), HasStep([place.geoWithin(BUFFER (POINT (23.72 37.97), 0.44966))])@[source], EdgeVertexStep(IN)@[god2], SelectOneStep(last,source), EdgeVertexStep(OUT)@[god1], SelectStep(last,[god1, god2],[value(name)])]
11:20:29,578  INFO ManagementLogger:192 - Received all acknowledgements for eviction [1]
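The cleanup step described before the log (deleting the storage.directory folder so the sample data can be loaded again) can also be scripted. The following is a minimal sketch using only the JDK; the class name ResetStorageDir is hypothetical, and the directory must be passed explicitly so that nothing is deleted by accident.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: removes a local storage directory and everything in it.
public class ResetStorageDir {

    // Deletes dir recursively and returns the number of files and
    // directories removed (0 if dir does not exist).
    public static int deleteRecursively(Path dir) {
        if (!Files.exists(dir)) {
            return 0;
        }
        try (Stream<Path> walk = Files.walk(dir)) {
            // Reverse order puts children before their parent directories.
            List<Path> paths = walk.sorted(Comparator.reverseOrder())
                    .collect(Collectors.toList());
            for (Path p : paths) {
                Files.delete(p);
            }
            return paths.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: ResetStorageDir <storage.directory>");
            return;
        }
        // For the configuration in this post the argument would be ../db/berkeley
        int removed = deleteRecursively(Paths.get(args[0]));
        System.out.println("removed " + removed + " entries");
    }
}
```

Run it only while no JanusGraph instance has the directory open, since it wipes all data stored there.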

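We can then look in the ../db/berkeley directory, which now contains a number of new files; we will analyze what they are for in a later post.
Next we check Elasticsearch with curl -XGET 'localhost:9200/_cat/indices?v&pretty' and find two new indices: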

yellow open   janusgraph_edges    QT-E7AV6SMWr8Cu_ywKsXg   5   1          6            0     13.7kb         13.7kb
yellow open   janusgraph_vertices gE4TSXFATnSZUWYdAf46Xg   5   1          6            0     10.9kb         10.9kb
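If you want to check these indices from code rather than by eye, the text output can be split into columns. The sketch below assumes the column order that _cat/indices?v prints in the Elasticsearch versions I have seen (health, status, index, uuid, pri, rep, docs.count, docs.deleted, store.size, pri.store.size); the class CatIndicesParser is a hypothetical name, and the input is pasted in rather than fetched over HTTP.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: parses the plain-text output of _cat/indices?v.
public class CatIndicesParser {

    public static final class IndexRow {
        public final String health;
        public final String status;
        public final String name;
        public final long docsCount;

        IndexRow(String health, String status, String name, long docsCount) {
            this.health = health;
            this.status = status;
            this.name = name;
            this.docsCount = docsCount;
        }
    }

    public static List<IndexRow> parse(String body) {
        List<IndexRow> rows = new ArrayList<>();
        for (String line : body.split("\\R")) {
            String[] cols = line.trim().split("\\s+");
            // Skip blank or short lines and the header row.
            if (cols.length < 10 || cols[0].equals("health")) {
                continue;
            }
            // Assumed layout: cols[2] is the index name, cols[6] is docs.count.
            rows.add(new IndexRow(cols[0], cols[1], cols[2], Long.parseLong(cols[6])));
        }
        return rows;
    }

    public static void main(String[] args) {
        String body =
            "yellow open   janusgraph_edges    QT-E7AV6SMWr8Cu_ywKsXg   5   1          6            0     13.7kb         13.7kb\n"
          + "yellow open   janusgraph_vertices gE4TSXFATnSZUWYdAf46Xg   5   1          6            0     10.9kb         10.9kb\n";
        for (IndexRow row : parse(body)) {
            System.out.println(row.name + " health=" + row.health + " docs=" + row.docsCount);
        }
    }
}
```

A health of yellow is expected on a single-node setup, because the replica shards have nowhere to be allocated.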

We can also look at the indexed content itself. For example, to search for documents whose name is titan: curl -XGET 'localhost:9200/janusgraph_vertices/_search?q=name:titan&pretty'

That concludes our first example.

g.E().has("place", geoWithin(Geoshape.circle(37.97, 23.72, 50)))
                .as("source").inV()
                .as("god2")
                .select("source").outV()
                .as("god1").select("god1", "god2")
                .by("name")

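This chained style of code is the Gremlin traversal language. The same expression can be typed almost unchanged into the Gremlin Console, which is based on Groovy, so it is worth taking a look at Groovy as well.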

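Notes:
The first failure above happened because janusgraph-core needs a class from janusgraph-berkeleyje,
while janusgraph-berkeleyje itself depends on janusgraph-core, so a direct dependency would make the two modules depend on each other.
JanusGraph's approach is to load the class via reflection in core, so compilation succeeds, and once everything is packaged together there is no problem. Running from a single module locally, however, fails.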
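The reflection pattern described in these notes can be sketched with plain JDK calls. This is not JanusGraph's actual code: the class ReflectiveLoad is a hypothetical name, and it assumes a no-argument constructor for simplicity, whereas the real store manager classes may take constructor arguments. The point is that the code compiles without the implementation module and only fails at runtime when the class is missing from the classpath.

```java
// Hypothetical helper: loads an implementation class by name at runtime.
public class ReflectiveLoad {

    // Instantiates className through its no-argument constructor
    // (an assumption made to keep the sketch short).
    public static Object instantiate(String className) {
        try {
            Class<?> clazz = Class.forName(className);
            return clazz.getDeclaredConstructor().newInstance();
        } catch (ClassNotFoundException e) {
            // The compiler never needed this class, so the problem only
            // shows up here, when it is absent from the runtime classpath.
            throw new IllegalArgumentException(
                    "Could not find implementation class: " + className, e);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException(
                    "Could not instantiate implementation class: " + className, e);
        }
    }

    public static void main(String[] args) {
        // A class that is always on the classpath loads fine.
        System.out.println(instantiate("java.util.ArrayList").getClass().getName());

        // Without janusgraph-berkeleyje on the classpath, this lookup fails
        // at runtime even though the code above compiled without it.
        try {
            instantiate("org.janusgraph.diskstorage.berkeleyje.BerkeleyJEStoreManager");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In a packaged distribution all module jars end up on one classpath, so the lookup succeeds; when running a single module from the IDE, only that module's declared dependencies are visible.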
