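Titan, a distributed graph database, is one option for the underlying engine of relationship analysis over big data.
This article shows how to get started with Titan quickly using Docker. Titan supports several storage and search engines; this article uses Cassandra and Elasticsearch. The latest Titan release (1.0) works with Elasticsearch 1.5 and Cassandra 2.1. Because these engine versions are fairly old, we do not build custom Dockerfiles and instead use the images from Docker Hub directly.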
1 Start the engine containers
1.1 Search engine: Elasticsearch
docker run -d --name es1.5 --net=host elasticsearch:1.5
1.2 Storage engine: Cassandra
10.101.95.23 c1 (seed node)
sudo docker run -d --name c1 \
  -e CASSANDRA_BROADCAST_ADDRESS=10.101.95.23 \
  --net=host \
  cassandra:2.1
10.189.193.225 c2 (seed node)
sudo docker run -d --name c2 \
  -e CASSANDRA_BROADCAST_ADDRESS=10.189.193.225 \
  --net=host \
  -e CASSANDRA_SEEDS=10.101.95.23,10.189.193.225 \
  cassandra:2.1
10.101.110.3 c3
sudo docker run -d --name c3 \
  -e CASSANDRA_BROADCAST_ADDRESS=10.101.110.3 \
  --net=host \
  -e CASSANDRA_SEEDS=10.101.95.23,10.189.193.225 \
  cassandra:2.1
100.81.0.123 c4
sudo docker run -d --name c4 \
  -e CASSANDRA_BROADCAST_ADDRESS=100.81.0.123 \
  --net=host \
  -e CASSANDRA_SEEDS=10.101.95.23,10.189.193.225 \
  cassandra:2.1
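The four commands above differ only in the container name, the broadcast address, and whether a seed list is passed. As a minimal sketch of that pattern, the following Python assembles the same command line for each node. The helper name `cassandra_run_command` is hypothetical (it is not part of Docker or Cassandra), and the script only prints the commands; it does not run them.

```python
import shlex

# Seed nodes as used in this walkthrough (c1 and c2).
SEEDS = ["10.101.95.23", "10.189.193.225"]


def cassandra_run_command(name, address, seeds=None):
    """Build the `docker run` argument list for one Cassandra 2.1 node.

    Hypothetical helper for illustration; it only assembles the command
    shown in this section and does not execute anything.
    """
    cmd = [
        "sudo", "docker", "run", "-d",
        "--name", name,
        "-e", f"CASSANDRA_BROADCAST_ADDRESS={address}",
        "--net=host",
    ]
    if seeds:
        # Every node after the first is pointed at the same seed list.
        cmd += ["-e", "CASSANDRA_SEEDS=" + ",".join(seeds)]
    cmd.append("cassandra:2.1")
    return cmd


if __name__ == "__main__":
    nodes = [
        ("c1", "10.101.95.23", None),  # first seed, started without a seed list
        ("c2", "10.189.193.225", SEEDS),
        ("c3", "10.101.110.3", SEEDS),
        ("c4", "100.81.0.123", SEEDS),
    ]
    for name, address, seeds in nodes:
        command = cassandra_run_command(name, address, seeds)
        print(" ".join(shlex.quote(part) for part in command))
```

Keeping the seed list in one place makes it harder for a later node to be started with a different set of seeds than the others.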
Once the cluster has started, check the node status:
sudo docker exec -ti c1 nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load       Tokens  Owns (effective)  Host ID                               Rack
UN  10.101.95.23    152.12 KB  256     48.3%             7bab2c1d-91f4-40c0-a2ba-2c41c6e6e78d  rack1
UN  100.81.0.123    19.3 KB    256     50.4%             0256243a-7434-4a1d-83db-58505d894bcc  rack1
UN  10.189.193.225  152.56 KB  256     50.2%             62b461ed-18f8-4268-8a6f-2ce2ad126646  rack1
UN  10.101.110.3    167.72 KB  256     51.1%             c1825c5e-5d22-4cba-9350-ffa5c73ea881  rack1
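In the output above, every node should report UN (Up, Normal) before Titan is started. As a rough sketch, the check can be automated by parsing the text that `nodetool status` prints. The function names below are hypothetical, and the sample text is the output shown in this section; in practice the text would come from running the `docker exec` command above.

```python
import re

# A status line starts with Up/Down plus Normal/Leaving/Joining/Moving,
# followed by the node address, e.g. "UN 10.101.95.23 152.12 KB 256 ...".
STATUS_LINE = re.compile(r"^([UD][NLJM])\s+(\S+)")


def node_states(nodetool_output):
    """Return {address: state} parsed from `nodetool status` text."""
    states = {}
    for line in nodetool_output.splitlines():
        match = STATUS_LINE.match(line.strip())
        if match:
            state, address = match.groups()
            states[address] = state
    return states


def all_up_and_normal(nodetool_output, expected_nodes):
    """True when exactly `expected_nodes` nodes are listed and all are UN."""
    states = node_states(nodetool_output)
    return len(states) == expected_nodes and all(
        state == "UN" for state in states.values()
    )


SAMPLE = """\
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 10.101.95.23 152.12 KB 256 48.3% 7bab2c1d-91f4-40c0-a2ba-2c41c6e6e78d rack1
UN 100.81.0.123 19.3 KB 256 50.4% 0256243a-7434-4a1d-83db-58505d894bcc rack1
UN 10.189.193.225 152.56 KB 256 50.2% 62b461ed-18f8-4268-8a6f-2ce2ad126646 rack1
UN 10.101.110.3 167.72 KB 256 51.1% c1825c5e-5d22-4cba-9350-ffa5c73ea881 rack1
"""

if __name__ == "__main__":
    print(all_up_and_normal(SAMPLE, expected_nodes=4))
```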
2 Edit the configuration and start Titan
First, download titan-1.0.0-hadoop2.zip from https://github.com/thinkaurelius/titan/wiki/Downloads and unzip it on the ECS instance.
2.1 conf/eric-titan.properties
gremlin.graph=com.thinkaurelius.titan.core.TitanFactory
storage.backend=cassandra
storage.hostname=10.101.110.3
cache.db-cache = true
cache.db-cache-clean-wait = 20
cache.db-cache-time = 180000
cache.db-cache-size = 0.25
index.search.backend=elasticsearch
index.search.hostname=127.0.0.1
index.search.elasticsearch.client-only=true
2.2 conf/eric-gremlin-server.yaml
host: 10.101.91.65
port: 18182
threadPoolWorker: 1
gremlinPool: 8
scriptEvaluationTimeout: 30000
serializedResponseTimeout: 30000
channelizer: org.apache.tinkerpop.gremlin.server.channel.WebSocketChannelizer
graphs: {
graph: conf/eric-titan.properties}
plugins:
- aurelius.titan
scriptEngines: {
gremlin-groovy: {
imports: [java.lang.Math],
staticImports: [java.lang.Math.PI],
scripts: [scripts/empty-sample.groovy]},
nashorn: {
imports: [java.lang.Math],
staticImports: [java.lang.Math.PI]}}
serializers:
- { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { useMapperFromGraph: graph }}
- { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { serializeResultToString: true }}
- { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV1d0, config: { useMapperFromGraph: graph }}
- { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0, config: { useMapperFromGraph: graph }}
processors:
- { className: org.apache.tinkerpop.gremlin.server.op.session.SessionOpProcessor, config: { sessionTimeout: 28800000 }}
metrics: {
consoleReporter: {enabled: true, interval: 180000},
csvReporter: {enabled: true, interval: 180000, fileName: /tmp/gremlin-server-metrics.csv},
jmxReporter: {enabled: true},
slf4jReporter: {enabled: true, interval: 180000},
gangliaReporter: {enabled: false, interval: 180000, addressingMode: MULTICAST},
graphiteReporter: {enabled: false, interval: 180000}}
threadPoolBoss: 1
maxInitialLineLength: 4096
maxHeaderSize: 8192
maxChunkSize: 8192
maxContentLength: 65536
maxAccumulationBufferComponents: 1024
resultIterationBatchSize: 64
writeBufferLowWaterMark: 32768
writeBufferHighWaterMark: 65536
ssl: {
enabled: false}
2.3 Start the Titan server
bin/gremlin-server.sh conf/eric-gremlin-server.yaml
3 Build the graph
3.1 conf/eric-remote.yaml
hosts: [127.0.0.1]
port: 18182
serializer: { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { custom: [com.thinkaurelius.titan.graphdb.relations.RelationIdentifier] }}
3.2 Connect the client to Titan
bin/gremlin.sh
gremlin> :remote connect tinkerpop.server conf/eric-remote.yaml
3.3 Build the Graph of the Gods
gremlin> graph = TitanFactory.open('conf/eric-titan.properties')
==>standardtitangraph[cassandra:[10.101.110.3]]
gremlin> GraphOfTheGodsFactory.load(graph)
==>null
4 Query the graph
gremlin> g = graph.traversal()
==>graphtraversalsource[standardtitangraph[cassandra:[10.101.110.3]], standard]
gremlin> saturn = g.V().has('name', 'saturn').next()
==>v[4160]
gremlin> g.V(saturn).valueMap()
==>[name:[saturn], age:[10000]]
gremlin> g.V(saturn).in('father').in('father').values('name')
==>hercules
gremlin> g.E().has('place', geoWithin(Geoshape.circle(37.97, 23.72, 50)))
==>e[4r0-38g-9hx-6e0][4192-battled->8280]
==>e[4cs-38g-9hx-6i0][4192-battled->8424]
gremlin> g.E().has('place', geoWithin(Geoshape.circle(37.97, 23.72, 50))).as('source').inV().as('god2').select('source').outV().as('god1').select('god1', 'god2').by('name')
==>[god1:hercules, god2:hydra]
==>[god1:hercules, god2:nemean]
gremlin> hercules = g.V(saturn).repeat(__.in('father')).times(2).next()
==>v[4192]
gremlin> g.V(hercules).out('father', 'mother').values('name')
==>jupiter
==>alcmene
gremlin> g.V(hercules).out('father', 'mother').label()
==>god
==>human
gremlin> hercules.label()
==>demigod
gremlin> g.V(hercules).outE('battled').has('time', gt(1)).inV().values('name')
==>cerberus
==>hydra
gremlin> g.V(hercules).outE('battled').has('time', gt(1)).inV().values('name').toString()
==>[GraphStep([v[4192]],vertex), VertexStep(OUT,[battled],edge), HasStep([time.gt(1)]), EdgeVertexStep(IN), PropertiesStep([name],value)]
gremlin> pluto = g.V().has('name', 'pluto').next()
==>v[4272]
gremlin> g.V(pluto).out('lives').in('lives').values('name')
==>pluto
==>cerberus
gremlin> g.V(pluto).out('brother').out('lives').values('name')
==>sky
==>sea
gremlin> g.V(pluto).out('brother').as('god').out('lives').as('place')
==>v[4232]
==>v[8328]
gremlin> g.V(pluto).outE('lives').values('reason')
==>no fear of death
gremlin> g.E().has('reason', textContains('loves'))
==>e[3kb-388-b2t-39k][4184-lives->4232]
==>e[36l-3c8-b2t-6fc][4328-lives->8328]
gremlin> g.E().has('reason', textContains('loves')).as('source').values('reason').as('reason').select('source').outV().values('name').as('god').select('source').inV().values('name').as('thing').select('god', 'reason', 'thing')
==>[god:jupiter, reason:loves fresh breezes, thing:sky]
==>[god:neptune, reason:loves waves, thing:sea]
5 Inspect the storage
5.1 Inspect the storage with cassandra-cli inside the container
sudo docker exec -ti c1 bash
cassandra-cli
use titan;
show schema;
5.2 Inspect the column families in detail
[default@titan] list titan_ids limit 3;
-------------------
RowKey: 0000000000000003
=> (name=fffffffffffec77f00053a400cd0f0f8306136353562343132383133382d653031303130313039313036352d7a6d6631, value=, timestamp=1471421650891001)
-------------------
RowKey: a800000000000000
=> (name=ffffffffffffd8ef00053a400d2e7278306136353562343132383133382d653031303130313039313036352d7a6d6631, value=, timestamp=1471421657019001)
-------------------
RowKey: b000000000000003
=> (name=fffffffffffec77f00053a400d28d8e0306136353562343132383133382d653031303130313039313036352d7a6d6631, value=, timestamp=1471421656652001)
[default@titan] list graphindex limit 3;
-------------------
RowKey: 10a5a072741e6d6f746865f2
=> (name=00, value=5895, timestamp=1471421652566001)
-------------------
RowKey: 0489a07361747572ee
=> (name=00, value=20c0, timestamp=1471421658055001)
-------------------
RowKey: 10a5a072741e62726f746865f2
=> (name=00, value=010095, timestamp=1471421652566001)
[default@titan] list graphindex_lock_ limit 3;
-------------------
RowKey: 0000000910a5a072741e6167e500
-------------------
RowKey: 0000000910a5a0766c1e676fe400
-------------------
RowKey: 0000000b10a5a072741e6c697665f300
[default@titan] list edgestore limit 3;
RowKey: 0000000000003415
=> (name=02, value=0001045080, timestamp=1471421652566001)
=> (name=10c0, value=a072741e31323330393a626174746c6573427954696de5044c80, timestamp=1471421652566001)
=> (name=10c2846000, value=8f01018e008080, timestamp=1471421652566001)
=> (name=10c2846400, value=99820000000000001805018e008180, timestamp=1471421652566001)
=> (name=10c2846800, value=ad81018e008280, timestamp=1471421652566001)
=> (name=10c2846c00, value=9981018e008380, timestamp=1471421652566001)
=> (name=10c2847000, value=ae80018e008480, timestamp=1471421652566001)
=> (name=10c2847400, value=b082018e008680, timestamp=1471421652566001)
=> (name=10c2847800, value=b382018e008780, timestamp=1471421652566001)
=> (name=10c4, value=0080045480, timestamp=1471421652566001)
=> (name=10c8, value=0080053a400ce4d6b8045c80, timestamp=1471421652566001)
=> (name=30c9806015847c00, value=009180, timestamp=1471421652566001)
-------------------
RowKey: 0000000000002815
=> (name=02, value=0001034080, timestamp=1471421652566001)
=> (name=10c0, value=a072741e6661746865f2033c80, timestamp=1471421652566001)
=> (name=10c2835000, value=8f00018e008080, timestamp=1471421652566001)
=> (name=10c2835400, value=9981018e008180, timestamp=1471421652566001)
=> (name=10c2835800, value=ad80018e008280, timestamp=1471421652566001)
=> (name=10c2835c00, value=9981018e008380, timestamp=1471421652566001)
=> (name=10c2836000, value=ae83018e008480, timestamp=1471421652566001)
=> (name=10c2836400, value=b082018e008680, timestamp=1471421652566001)
=> (name=10c2836800, value=b382018e008780, timestamp=1471421652566001)
=> (name=10c4, value=0080034480, timestamp=1471421652566001)
=> (name=10c8, value=0080053a400ce48898034c80, timestamp=1471421652566001)
-------------------
RowKey: 0000000000003815
=> (name=02, value=0001050480, timestamp=1471421652566001)
=> (name=10c0, value=a072741e6c697665f3050080, timestamp=1471421652566001)
=> (name=10c2851400, value=8f00018e008080, timestamp=1471421652566001)
=> (name=10c2851800, value=9981018e008180, timestamp=1471421652566001)
=> (name=10c2851c00, value=ad80018e008280, timestamp=1471421652566001)
=> (name=10c2852000, value=99820000000000001c05018e008380, timestamp=1471421652566001)
=> (name=10c2852400, value=ae80018e008480, timestamp=1471421652566001)
=> (name=10c2852800, value=b082018e008680, timestamp=1471421652566001)
=> (name=10c2852c00, value=b382018e008780, timestamp=1471421652566001)
=> (name=10c4, value=0080050880, timestamp=1471421652566001)
=> (name=10c8, value=0080053a400ce4f210051080, timestamp=1471421652566001)
[default@titan] list edgestore_lock_ limit 3;
0 Row Returned.
[default@titan] list system_properties limit 3;
-------------------
RowKey: 636f6e66696775726174696f6e
=> (name=63616368652e64622d6361636865, value=8f01, timestamp=1471415029987001)
=> (name=63616368652e64622d63616368652d636c65616e2d77616974, value=8ca8, timestamp=1471415029982001)
=> (name=63616368652e64622d63616368652d73697a65, value=943fd0000000000000, timestamp=1471415029892001)
=> (name=63616368652e64622d63616368652d74696d65, value=8d800000000002bf20, timestamp=1471415029965001)
=> (name=67726170682e74696d657374616d7073, value=b681, timestamp=1471415030003001)
=> (name=67726170682e746974616e2d76657273696f6e, value=92a0312e302eb0, timestamp=1471415030000001)
=> (name=68696464656e2e66726f7a656e, value=8f01, timestamp=1471415030159001)
=> (name=696e6465782e7365617263682e6261636b656e64, value=92a0656c61737469637365617263e8, timestamp=1471415029972001)
=> (name=696e6465782e7365617263682e656c61737469637365617263682e636c69656e742d6f6e6c79, value=8f01, timestamp=1471415029977001)
=> (name=696e6465782e7365617263682e686f73746e616d65, value=9e84a031302e3130312e38392eb3a031302e3130312e39302eb9a031302e3130312e38352e3230b8, timestamp=1471415029992001)
=> (name=73797374656d2d726567697374726174696f6e2e306136353562343131383235332d653031303130313039313036352d7a6d66312e737461727475702d74696d65, value=c18000000057b419ce0119452980, timestamp=1471420878163001)
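The row key and column names in the system_properties listing above are hex-encoded ASCII. Decoding them shows that they are the configuration option names from section 2.1. The sketch below decodes a few of them; `decode_hex_ascii` is a hypothetical helper name. The values use a binary encoding and are left undecoded here.

```python
def decode_hex_ascii(hex_text):
    """Decode a hex string such as a cassandra-cli row key into ASCII text."""
    return bytes.fromhex(hex_text).decode("ascii")


# Row key and a few column names copied from the listing above.
ROW_KEY = "636f6e66696775726174696f6e"
COLUMN_NAMES = [
    "63616368652e64622d6361636865",
    "63616368652e64622d63616368652d73697a65",
    "67726170682e746974616e2d76657273696f6e",
    "696e6465782e7365617263682e6261636b656e64",
]

if __name__ == "__main__":
    print(decode_hex_ascii(ROW_KEY))
    for name in COLUMN_NAMES:
        print(decode_hex_ascii(name))
```

The row key decodes to "configuration", and the column names decode to option names such as "cache.db-cache" and "index.search.backend".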
[default@titan] list system_properties_lock_ limit 3;
0 Row Returned.
[default@titan] list systemlog limit 3;
-------------------
RowKey: ffffffffa0306136353562343131333933362d653031303130313039313036352d7a6d66b1
=> (name=01, value=0000000000000000, timestamp=1471420623958001)
-------------------
RowKey: 000000000000000000e08568
=> (name=00053a400cecf8c0a0306136353562343132383133382d653031303130313039313036352d7a6d66b10000000000000001, value=8081810489, timestamp=1471421652730001)
=> (name=00053a400d151648a0306136353562343131383235332d653031303130313039313036352d7a6d66b10000000000000001, value=81a0306136353562343132383133382d653031303130313039313036352d7a6d66b181, timestamp=1471421655362001)
=> (name=00053a400d433eb0a0306136353562343132383133382d653031303130313039313036352d7a6d66b10000000000000002, value=81a0306136353562343132383133382d653031303130313039313036352d7a6d66b181, timestamp=1471421658382001)
-------------------
RowKey: ffffffffa0306136353562343132383133382d653031303130313039313036352d7a6d66b1
=> (name=01, value=0000000000000002, timestamp=1471423313062001)
[default@titan] list txlog limit 3;
-------------------
RowKey: ffffffffa0306136353562343131333933362d653031303130313039313036352d7a6d66b1
=> (name=01, value=0000000000000000, timestamp=1471420623962001, ttl=604800)
-------------------
RowKey: ffffffffa0306136353562343132383133382d653031303130313039313036352d7a6d66b1
=> (name=01, value=0000000000000000, timestamp=1471423313067001, ttl=604800)
6 Inspect the index
6.1 List the indices Titan created
$ curl localhost:9200/_cat/indices
yellow open titan 5 1 12 0 13.7kb 13.7kb
6.2 Show the types under the titan index
$ curl 'localhost:9200/titan?pretty'
{
"titan" : {
"aliases" : { },
"mappings" : {
"vertices" : {
"_ttl" : {
"enabled" : true
},
"properties" : {
"age" : {
"type" : "integer"
}
}
},
"edges" : {
"_ttl" : {
"enabled" : true
},
"properties" : {
"place" : {
"type" : "geo_point"
},
"reason" : {
"type" : "string"
}
}
}
},
"settings" : {
"index" : {
"creation_date" : "1471420330925",
"number_of_shards" : "5",
"uuid" : "WSAMXWioQX6_JAZZ4E_RWw",
"version" : {
"created" : "1050199"
},
"number_of_replicas" : "1"
}
},
"warmers" : { }
}
}
6.3 Count the documents of type edges
$ curl 'localhost:9200/titan/edges/_count?pretty'
{
"count" : 6,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
}
}
6.4 Count the documents of type vertices
$ curl 'localhost:9200/titan/vertices/_count?pretty'
{
"count" : 6,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
}
}
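The 12 documents reported by `_cat/indices` in section 6.1 match the 6 edges plus 6 vertices counted in sections 6.3 and 6.4. A minimal sketch of that cross-check follows, working on the response text shown in this article rather than a live server. The function names are hypothetical, and the column order is assumed to be the Elasticsearch 1.x default for `_cat/indices`. The yellow health value is what a single-node Elasticsearch typically reports when number_of_replicas is 1, because the replica shards have nowhere to be allocated.

```python
import json

# Column order of the `_cat/indices` line printed in section 6.1
# (assumed to be the Elasticsearch 1.x default columns).
CAT_COLUMNS = [
    "health", "status", "index", "pri", "rep",
    "docs.count", "docs.deleted", "store.size", "pri.store.size",
]


def parse_cat_indices_line(line):
    """Map one `_cat/indices` output line to a dict keyed by column name."""
    return dict(zip(CAT_COLUMNS, line.split()))


def count_from_response(body):
    """Return `count` from a `_count` response, rejecting shard failures."""
    data = json.loads(body)
    if data["_shards"]["failed"] != 0:
        raise ValueError("some shards failed; the count may be incomplete")
    return data["count"]


# Sample text copied from sections 6.1, 6.3 and 6.4.
CAT_LINE = "yellow open titan 5 1 12 0 13.7kb 13.7kb"
COUNT_BODY = '{"count": 6, "_shards": {"total": 5, "successful": 5, "failed": 0}}'

if __name__ == "__main__":
    index = parse_cat_indices_line(CAT_LINE)
    edges = count_from_response(COUNT_BODY)
    vertices = count_from_response(COUNT_BODY)
    print(index["index"], index["docs.count"], edges + vertices)
```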