Getting Started with Titan 1.0 on a Container Engine

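Summary: The distributed graph database [Titan](http://titan.thinkaurelius.com) is one option for the underlying engine of big-data relationship analysis. This article shows how to get started with Titan quickly using Docker.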

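This article shows how to get started with Titan quickly using Docker. Titan supports several storage backends and index (search) backends; here we use Cassandra and Elasticsearch. The latest Titan release (1.0) works with Elasticsearch 1.5 and Cassandra 2.1. Because these versions are fairly old, we do not build custom Dockerfiles and instead run the images from Docker Hub directly.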

1 Start the engine containers

1.1 Search engine: Elasticsearch
docker run -d --name es1.5 --net=host elasticsearch:1.5
1.2 Storage engine: Cassandra
  • 10.101.95.23 c1(seed node)

    sudo docker run -d --name c1 \
    -e CASSANDRA_BROADCAST_ADDRESS=10.101.95.23 \
    --net=host \
    cassandra:2.1
  • 10.189.193.225 c2(seed node)

    sudo docker run -d --name c2 \
    -e CASSANDRA_BROADCAST_ADDRESS=10.189.193.225 \
    --net=host \
    -e CASSANDRA_SEEDS=10.101.95.23,10.189.193.225 \
    cassandra:2.1
  • 10.101.110.3 c3

    sudo docker run -d --name c3 \
    -e CASSANDRA_BROADCAST_ADDRESS=10.101.110.3 \
    --net=host \
    -e CASSANDRA_SEEDS=10.101.95.23,10.189.193.225 \
    cassandra:2.1
  • 100.81.0.123 c4

    sudo docker run -d --name c4 \
    -e CASSANDRA_BROADCAST_ADDRESS=100.81.0.123 \
    --net=host \
    -e CASSANDRA_SEEDS=10.101.95.23,10.189.193.225 \
    cassandra:2.1
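
The four commands above follow one pattern: every node advertises its own address, and every node except the first seed is pointed at the same seed list. As a minimal sketch of that pattern, the commands can be generated from a node table. The helper name `cassandra_run_command` and the `NODES` table are hypothetical and exist only for this illustration; the flags and environment variables are the ones used above.

```python
# Hypothetical helper for illustration: rebuilds the `docker run` commands shown above.
SEEDS = ["10.101.95.23", "10.189.193.225"]


def cassandra_run_command(name, address, seeds=None):
    """Return the docker run command (as an argument list) for one Cassandra node."""
    cmd = [
        "sudo", "docker", "run", "-d", "--name", name,
        "-e", f"CASSANDRA_BROADCAST_ADDRESS={address}",
        "--net=host",
    ]
    if seeds:
        # Nodes other than the first seed are told where the seeds are.
        cmd += ["-e", "CASSANDRA_SEEDS=" + ",".join(seeds)]
    cmd.append("cassandra:2.1")
    return cmd


# (container name, host address, seed list) for each node in this article.
NODES = [
    ("c1", "10.101.95.23", None),   # first seed, started without CASSANDRA_SEEDS as above
    ("c2", "10.189.193.225", SEEDS),
    ("c3", "10.101.110.3", SEEDS),
    ("c4", "100.81.0.123", SEEDS),
]

for node_name, node_address, node_seeds in NODES:
    print(" ".join(cassandra_run_command(node_name, node_address, node_seeds)))
```

Each printed line is meant to be run on the host that owns the corresponding address.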

Once the cluster is up, check the node status:

  sudo docker exec -ti c1 nodetool status
  Datacenter: datacenter1
  =======================
  Status=Up/Down
  |/ State=Normal/Leaving/Joining/Moving
  --  Address         Load       Tokens  Owns (effective)  Host ID                               Rack
  UN  10.101.95.23    152.12 KB  256     48.3%             7bab2c1d-91f4-40c0-a2ba-2c41c6e6e78d  rack1
  UN  100.81.0.123    19.3 KB    256     50.4%             0256243a-7434-4a1d-83db-58505d894bcc  rack1
  UN  10.189.193.225  152.56 KB  256     50.2%             62b461ed-18f8-4268-8a6f-2ce2ad126646  rack1
  UN  10.101.110.3    167.72 KB  256     51.1%             c1825c5e-5d22-4cba-9350-ffa5c73ea881  rack1
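
A healthy node is reported as `UN` (Up/Normal). When the check is scripted, the output above can be parsed instead of being read by eye. The following is a minimal sketch; the function names `parse_nodetool_status` and `unhealthy_nodes` are hypothetical, and the sample text is copied from the output above.

```python
import re

# A node line starts with a two-letter code: U/D (up/down) then N/L/J/M (state).
NODE_LINE = re.compile(r"^\s*([UD][NLJM])\s+(\S+)")

SAMPLE = """\
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load       Tokens  Owns (effective)  Host ID                               Rack
UN  10.101.95.23    152.12 KB  256     48.3%             7bab2c1d-91f4-40c0-a2ba-2c41c6e6e78d  rack1
UN  100.81.0.123    19.3 KB    256     50.4%             0256243a-7434-4a1d-83db-58505d894bcc  rack1
UN  10.189.193.225  152.56 KB  256     50.2%             62b461ed-18f8-4268-8a6f-2ce2ad126646  rack1
UN  10.101.110.3    167.72 KB  256     51.1%             c1825c5e-5d22-4cba-9350-ffa5c73ea881  rack1
"""


def parse_nodetool_status(text):
    """Map each node address to its two-letter status code, e.g. 'UN'."""
    nodes = {}
    for line in text.splitlines():
        match = NODE_LINE.match(line)
        if match:
            status, address = match.groups()
            nodes[address] = status
    return nodes


def unhealthy_nodes(text):
    """Return the sorted addresses of nodes that are not Up/Normal."""
    return sorted(addr for addr, status in parse_nodetool_status(text).items()
                  if status != "UN")


print(parse_nodetool_status(SAMPLE))
print(unhealthy_nodes(SAMPLE))
```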

2 Edit the configuration and start Titan

First, download titan-1.0.0-hadoop2.zip from https://github.com/thinkaurelius/titan/wiki/Downloads and unzip it on the ECS instance.

2.1 conf/eric-titan.properties
gremlin.graph=com.thinkaurelius.titan.core.TitanFactory

storage.backend=cassandra
storage.hostname=10.101.110.3

cache.db-cache = true
cache.db-cache-clean-wait = 20
cache.db-cache-time = 180000
cache.db-cache-size = 0.25

index.search.backend=elasticsearch
index.search.hostname=127.0.0.1
index.search.elasticsearch.client-only=true
2.2 conf/eric-gremlin-server.yaml
host: 10.101.91.65
port: 18182
threadPoolWorker: 1
gremlinPool: 8
scriptEvaluationTimeout: 30000
serializedResponseTimeout: 30000
channelizer: org.apache.tinkerpop.gremlin.server.channel.WebSocketChannelizer
graphs: {
  graph: conf/eric-titan.properties}
plugins:
  - aurelius.titan
scriptEngines: {
  gremlin-groovy: {
    imports: [java.lang.Math],
    staticImports: [java.lang.Math.PI],
    scripts: [scripts/empty-sample.groovy]},
  nashorn: {
    imports: [java.lang.Math],
    staticImports: [java.lang.Math.PI]}}
serializers:
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { useMapperFromGraph: graph }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { serializeResultToString: true }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV1d0, config: { useMapperFromGraph: graph }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0, config: { useMapperFromGraph: graph }}
processors:
  - { className: org.apache.tinkerpop.gremlin.server.op.session.SessionOpProcessor, config: { sessionTimeout: 28800000 }}
metrics: {
  consoleReporter: {enabled: true, interval: 180000},
  csvReporter: {enabled: true, interval: 180000, fileName: /tmp/gremlin-server-metrics.csv},
  jmxReporter: {enabled: true},
  slf4jReporter: {enabled: true, interval: 180000},
  gangliaReporter: {enabled: false, interval: 180000, addressingMode: MULTICAST},
  graphiteReporter: {enabled: false, interval: 180000}}
threadPoolBoss: 1
maxInitialLineLength: 4096
maxHeaderSize: 8192
maxChunkSize: 8192
maxContentLength: 65536
maxAccumulationBufferComponents: 1024
resultIterationBatchSize: 64
writeBufferLowWaterMark: 32768
writeBufferHighWaterMark: 65536
ssl: {
  enabled: false}
2.3 Start the Titan service
bin/gremlin-server.sh conf/eric-gremlin-server.yaml
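
Gremlin Server takes a little while to connect to Cassandra and Elasticsearch before it starts listening. A script that launches the server and then connects a client may want to wait for the port first. The following is a minimal sketch using only the Python standard library; the function name `wait_for_port` is hypothetical, and it only checks that the TCP port accepts connections, not that the graph itself is usable.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=1.0):
    """Return True once a TCP connection to host:port succeeds, False after timeout seconds."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError (refused, unreachable, timed out) on failure.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

With the configuration above, the call would be `wait_for_port("10.101.91.65", 18182)`, since those are the `host` and `port` values in conf/eric-gremlin-server.yaml.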

3 Build the graph

3.1 conf/eric-remote.yaml
hosts: [127.0.0.1]
port: 18182
serializer: { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { custom: [com.thinkaurelius.titan.graphdb.relations.RelationIdentifier] }}
3.2 Connect to Titan from the client
bin/gremlin.sh
gremlin> :remote connect tinkerpop.server conf/eric-remote.yaml
3.3 Build the Graph of the Gods
gremlin> graph = TitanFactory.open('conf/eric-titan.properties')
==>standardtitangraph[cassandra:[10.101.110.3]]

gremlin> GraphOfTheGodsFactory.load(graph)
==>null

Implementation class: https://github.com/thinkaurelius/titan/blob/titan10/titan-core/src/main/java/com/thinkaurelius/titan/example/GraphOfTheGodsFactory.java

4 Query the graph

gremlin> g = graph.traversal()
==>graphtraversalsource[standardtitangraph[cassandra:[10.101.110.3]], standard]

gremlin> saturn = g.V().has('name', 'saturn').next()
==>v[4160]

gremlin> g.V(saturn).valueMap()
==>[name:[saturn], age:[10000]]

gremlin> g.V(saturn).in('father').in('father').values('name')
==>hercules

gremlin> g.E().has('place', geoWithin(Geoshape.circle(37.97, 23.72, 50)))
==>e[4r0-38g-9hx-6e0][4192-battled->8280]
==>e[4cs-38g-9hx-6i0][4192-battled->8424]

gremlin> g.E().has('place', geoWithin(Geoshape.circle(37.97, 23.72, 50))).as('source').inV().as('god2').select('source').outV().as('god1').select('god1', 'god2').by('name')
==>[god1:hercules, god2:hydra]
==>[god1:hercules, god2:nemean]

gremlin> hercules = g.V(saturn).repeat(__.in('father')).times(2).next()
==>v[4192]

gremlin> g.V(hercules).out('father', 'mother').values('name')
==>jupiter
==>alcmene

gremlin> g.V(hercules).out('father', 'mother').label()
==>god
==>human

gremlin> hercules.label()
==>demigod

gremlin> g.V(hercules).outE('battled').has('time', gt(1)).inV().values('name')
==>cerberus
==>hydra

gremlin> g.V(hercules).outE('battled').has('time', gt(1)).inV().values('name').toString()
==>[GraphStep([v[4192]],vertex), VertexStep(OUT,[battled],edge), HasStep([time.gt(1)]), EdgeVertexStep(IN), PropertiesStep([name],value)]

gremlin> pluto = g.V().has('name', 'pluto').next()
==>v[4272]

gremlin> g.V(pluto).out('lives').in('lives').values('name')
==>pluto
==>cerberus

gremlin> g.V(pluto).out('brother').out('lives').values('name')
==>sky
==>sea

gremlin> g.V(pluto).out('brother').as('god').out('lives').as('place')
==>v[4232]
==>v[8328]

gremlin> g.V(pluto).outE('lives').values('reason')
==>no fear of death

gremlin> g.E().has('reason', textContains('loves'))
==>e[3kb-388-b2t-39k][4184-lives->4232]
==>e[36l-3c8-b2t-6fc][4328-lives->8328]

gremlin> g.E().has('reason', textContains('loves')).as('source').values('reason').as('reason').select('source').outV().values('name').as('god').select('source').inV().values('name').as('thing').select('god', 'reason', 'thing')
==>[god:jupiter, reason:loves fresh breezes, thing:sky]
==>[god:neptune, reason:loves waves, thing:sea]
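
To make the direction of `in()` and `out()` in the queries above concrete, here is a small in-memory model in Python. It is only an illustration of edge direction, not how Titan executes traversals. The edge list contains just the three edges implied by the results in this section, and the helper names `in_` and `out` are hypothetical.

```python
# Edges as (out-vertex, label, in-vertex): "hercules -father-> jupiter", and so on.
EDGES = [
    ("jupiter", "father", "saturn"),
    ("hercules", "father", "jupiter"),
    ("hercules", "mother", "alcmene"),
]


def in_(vertices, label):
    """Follow incoming edges with the given label, like Gremlin's in(label)."""
    return [src for v in vertices
            for (src, lbl, dst) in EDGES if dst == v and lbl == label]


def out(vertices, *labels):
    """Follow outgoing edges with any of the given labels, like Gremlin's out(labels)."""
    return [dst for v in vertices
            for (src, lbl, dst) in EDGES if src == v and lbl in labels]


# g.V(saturn).in('father').in('father').values('name')
grandchildren = in_(in_(["saturn"], "father"), "father")
print(grandchildren)

# g.V(hercules).out('father', 'mother').values('name')
parents = out(["hercules"], "father", "mother")
print(parents)
```

Because a `father` edge points from child to father, walking it backwards twice from saturn reaches his grandchild.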

5 Inspect the storage

5.1 Inspect the storage with cassandra-cli inside the container
sudo docker exec -ti c1 bash
cassandra-cli
use titan;
show schema;
5.2 Inspect the column families in detail
[default@titan] list titan_ids limit 3;
-------------------
RowKey: 0000000000000003
=> (name=fffffffffffec77f00053a400cd0f0f8306136353562343132383133382d653031303130313039313036352d7a6d6631, value=, timestamp=1471421650891001)
-------------------
RowKey: a800000000000000
=> (name=ffffffffffffd8ef00053a400d2e7278306136353562343132383133382d653031303130313039313036352d7a6d6631, value=, timestamp=1471421657019001)
-------------------
RowKey: b000000000000003
=> (name=fffffffffffec77f00053a400d28d8e0306136353562343132383133382d653031303130313039313036352d7a6d6631, value=, timestamp=1471421656652001)

[default@titan] list graphindex limit 3;
-------------------
RowKey: 10a5a072741e6d6f746865f2
=> (name=00, value=5895, timestamp=1471421652566001)
-------------------
RowKey: 0489a07361747572ee
=> (name=00, value=20c0, timestamp=1471421658055001)
-------------------
RowKey: 10a5a072741e62726f746865f2
=> (name=00, value=010095, timestamp=1471421652566001)
[default@titan] list graphindex_lock_ limit 3;
-------------------
RowKey: 0000000910a5a072741e6167e500
-------------------
RowKey: 0000000910a5a0766c1e676fe400
-------------------
RowKey: 0000000b10a5a072741e6c697665f300

[default@titan] list edgestore limit 3;
RowKey: 0000000000003415
=> (name=02, value=0001045080, timestamp=1471421652566001)
=> (name=10c0, value=a072741e31323330393a626174746c6573427954696de5044c80, timestamp=1471421652566001)
=> (name=10c2846000, value=8f01018e008080, timestamp=1471421652566001)
=> (name=10c2846400, value=99820000000000001805018e008180, timestamp=1471421652566001)
=> (name=10c2846800, value=ad81018e008280, timestamp=1471421652566001)
=> (name=10c2846c00, value=9981018e008380, timestamp=1471421652566001)
=> (name=10c2847000, value=ae80018e008480, timestamp=1471421652566001)
=> (name=10c2847400, value=b082018e008680, timestamp=1471421652566001)
=> (name=10c2847800, value=b382018e008780, timestamp=1471421652566001)
=> (name=10c4, value=0080045480, timestamp=1471421652566001)
=> (name=10c8, value=0080053a400ce4d6b8045c80, timestamp=1471421652566001)
=> (name=30c9806015847c00, value=009180, timestamp=1471421652566001)
-------------------
RowKey: 0000000000002815
=> (name=02, value=0001034080, timestamp=1471421652566001)
=> (name=10c0, value=a072741e6661746865f2033c80, timestamp=1471421652566001)
=> (name=10c2835000, value=8f00018e008080, timestamp=1471421652566001)
=> (name=10c2835400, value=9981018e008180, timestamp=1471421652566001)
=> (name=10c2835800, value=ad80018e008280, timestamp=1471421652566001)
=> (name=10c2835c00, value=9981018e008380, timestamp=1471421652566001)
=> (name=10c2836000, value=ae83018e008480, timestamp=1471421652566001)
=> (name=10c2836400, value=b082018e008680, timestamp=1471421652566001)
=> (name=10c2836800, value=b382018e008780, timestamp=1471421652566001)
=> (name=10c4, value=0080034480, timestamp=1471421652566001)
=> (name=10c8, value=0080053a400ce48898034c80, timestamp=1471421652566001)
-------------------
RowKey: 0000000000003815
=> (name=02, value=0001050480, timestamp=1471421652566001)
=> (name=10c0, value=a072741e6c697665f3050080, timestamp=1471421652566001)
=> (name=10c2851400, value=8f00018e008080, timestamp=1471421652566001)
=> (name=10c2851800, value=9981018e008180, timestamp=1471421652566001)
=> (name=10c2851c00, value=ad80018e008280, timestamp=1471421652566001)
=> (name=10c2852000, value=99820000000000001c05018e008380, timestamp=1471421652566001)
=> (name=10c2852400, value=ae80018e008480, timestamp=1471421652566001)
=> (name=10c2852800, value=b082018e008680, timestamp=1471421652566001)
=> (name=10c2852c00, value=b382018e008780, timestamp=1471421652566001)
=> (name=10c4, value=0080050880, timestamp=1471421652566001)
=> (name=10c8, value=0080053a400ce4f210051080, timestamp=1471421652566001)
[default@titan] list edgestore_lock_ limit 3;
0 Row Returned.
[default@titan] list system_properties limit 3;
-------------------
RowKey: 636f6e66696775726174696f6e
=> (name=63616368652e64622d6361636865, value=8f01, timestamp=1471415029987001)
=> (name=63616368652e64622d63616368652d636c65616e2d77616974, value=8ca8, timestamp=1471415029982001)
=> (name=63616368652e64622d63616368652d73697a65, value=943fd0000000000000, timestamp=1471415029892001)
=> (name=63616368652e64622d63616368652d74696d65, value=8d800000000002bf20, timestamp=1471415029965001)
=> (name=67726170682e74696d657374616d7073, value=b681, timestamp=1471415030003001)
=> (name=67726170682e746974616e2d76657273696f6e, value=92a0312e302eb0, timestamp=1471415030000001)
=> (name=68696464656e2e66726f7a656e, value=8f01, timestamp=1471415030159001)
=> (name=696e6465782e7365617263682e6261636b656e64, value=92a0656c61737469637365617263e8, timestamp=1471415029972001)
=> (name=696e6465782e7365617263682e656c61737469637365617263682e636c69656e742d6f6e6c79, value=8f01, timestamp=1471415029977001)
=> (name=696e6465782e7365617263682e686f73746e616d65, value=9e84a031302e3130312e38392eb3a031302e3130312e39302eb9a031302e3130312e38352e3230b8, timestamp=1471415029992001)
=> (name=73797374656d2d726567697374726174696f6e2e306136353562343131383235332d653031303130313039313036352d7a6d66312e737461727475702d74696d65, value=c18000000057b419ce0119452980, timestamp=1471420878163001)
[default@titan] list system_properties_lock_ limit 3;
0 Row Returned.
[default@titan] list systemlog limit 3;
-------------------
RowKey: ffffffffa0306136353562343131333933362d653031303130313039313036352d7a6d66b1
=> (name=01, value=0000000000000000, timestamp=1471420623958001)
-------------------
RowKey: 000000000000000000e08568
=> (name=00053a400cecf8c0a0306136353562343132383133382d653031303130313039313036352d7a6d66b10000000000000001, value=8081810489, timestamp=1471421652730001)
=> (name=00053a400d151648a0306136353562343131383235332d653031303130313039313036352d7a6d66b10000000000000001, value=81a0306136353562343132383133382d653031303130313039313036352d7a6d66b181, timestamp=1471421655362001)
=> (name=00053a400d433eb0a0306136353562343132383133382d653031303130313039313036352d7a6d66b10000000000000002, value=81a0306136353562343132383133382d653031303130313039313036352d7a6d66b181, timestamp=1471421658382001)
-------------------
RowKey: ffffffffa0306136353562343132383133382d653031303130313039313036352d7a6d66b1
=> (name=01, value=0000000000000002, timestamp=1471423313062001)
[default@titan] list txlog limit 3;
-------------------
RowKey: ffffffffa0306136353562343131333933362d653031303130313039313036352d7a6d66b1
=> (name=01, value=0000000000000000, timestamp=1471420623962001, ttl=604800)
-------------------
RowKey: ffffffffa0306136353562343132383133382d653031303130313039313036352d7a6d66b1
=> (name=01, value=0000000000000000, timestamp=1471423313067001, ttl=604800)
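
cassandra-cli prints row keys and column names as hexadecimal. In the `system_properties` listing above, the row key and column names are plain ASCII once decoded, which makes the listing easier to read. The following is a minimal sketch; the helper names are hypothetical, the column values are left alone because they use Titan's own serialization, and treating the timestamps as microseconds since the Unix epoch is an assumption based on the usual Cassandra convention.

```python
from datetime import datetime, timezone


def decode_hex_ascii(hex_string):
    """Decode a hex string such as a cassandra-cli row key into ASCII text."""
    return bytes.fromhex(hex_string).decode("ascii")


def timestamp_to_utc(micros):
    """Convert a write timestamp to a UTC datetime (assumes microseconds since epoch)."""
    return datetime.fromtimestamp(micros / 1_000_000, tz=timezone.utc)


# Row key and a few column names copied from the system_properties listing above.
ROW_KEY = "636f6e66696775726174696f6e"
COLUMN_NAMES = [
    "63616368652e64622d6361636865",
    "63616368652e64622d63616368652d74696d65",
    "696e6465782e7365617263682e6261636b656e64",
]

print(decode_hex_ascii(ROW_KEY))
for column_name in COLUMN_NAMES:
    print(decode_hex_ascii(column_name))
print(timestamp_to_utc(1471415029987001).date())
```

The decoded names match the keys set in conf/eric-titan.properties, which suggests that Titan stores its global configuration in this column family.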

6 Inspect the indexes

6.1 List the indexes Titan created
$ curl localhost:9200/_cat/indices
yellow open titan 5 1 12 0 13.7kb 13.7kb
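
The `_cat/indices` line above has no header. The sketch below labels its fields, assuming the default column order of Elasticsearch 1.x (health, status, index, pri, rep, docs.count, docs.deleted, store.size, pri.store.size). That order is an assumption here; appending `?v` to the URL makes Elasticsearch print the header row so it can be confirmed. The function name `parse_cat_indices` is hypothetical.

```python
# Assumed default column order for _cat/indices on Elasticsearch 1.x.
CAT_INDICES_COLUMNS = [
    "health", "status", "index", "pri", "rep",
    "docs.count", "docs.deleted", "store.size", "pri.store.size",
]


def parse_cat_indices(text):
    """Turn headerless _cat/indices output into a list of dicts, one per index."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines or lines that do not match the expected column count.
        if len(fields) == len(CAT_INDICES_COLUMNS):
            rows.append(dict(zip(CAT_INDICES_COLUMNS, fields)))
    return rows


rows = parse_cat_indices("yellow open titan 5 1 12 0 13.7kb 13.7kb")
print(rows[0])
```

Read this way, the `titan` index has 5 primary shards, 1 replica and 12 documents. A `yellow` health status usually means that replica shards are unassigned, which is expected when a single Elasticsearch node runs an index with one replica.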
6.2 Inspect the types under the titan index
$ curl localhost:9200/titan?pretty
{
  "titan" : {
    "aliases" : { },
    "mappings" : {
      "vertices" : {
        "_ttl" : {
          "enabled" : true
        },
        "properties" : {
          "age" : {
            "type" : "integer"
          }
        }
      },
      "edges" : {
        "_ttl" : {
          "enabled" : true
        },
        "properties" : {
          "place" : {
            "type" : "geo_point"
          },
          "reason" : {
            "type" : "string"
          }
        }
      }
    },
    "settings" : {
      "index" : {
        "creation_date" : "1471420330925",
        "number_of_shards" : "5",
        "uuid" : "WSAMXWioQX6_JAZZ4E_RWw",
        "version" : {
          "created" : "1050199"
        },
        "number_of_replicas" : "1"
      }
    },
    "warmers" : { }
  }
}
6.3 Count the documents with type=edges
$ curl localhost:9200/titan/edges/_count?pretty
{
  "count" : 6,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  }
}
6.4 Count the documents with type=vertices
$ curl localhost:9200/titan/vertices/_count?pretty
{
  "count" : 6,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  }
}