1. Starting/Stopping OushuDB
There are two ways to start OushuDB. The first is to start the entire cluster, master and segments included, with the "hawq start cluster" command. Which segments get started is determined by the nodes listed in "/hawq-install-path/etc/slaves".
source /usr/local/hawq/greenplum_path.sh  # set up the OushuDB environment variables
hawq start cluster                        # start the entire OushuDB cluster
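The slaves file is simply a list of segment hostnames, one per line. As a purely hypothetical illustration (the hostnames below are made up, and the path assumes the default install location used above):

cat /usr/local/hawq/etc/slaves
segment-host-1
segment-host-2
segment-host-3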
The other way is to start the OushuDB master and segments separately. Because the OushuDB master and segments are decoupled, starting them independently works fine.
hawq start master   # start the master, i.e. the master on the local host
hawq start segment  # start the segment on the local host
Likewise, there are two ways to restart or stop OushuDB:
# Method 1
hawq restart cluster  # restart the OushuDB cluster
hawq stop cluster     # stop the OushuDB cluster

# Method 2
hawq restart master   # restart the OushuDB master on this host
hawq restart segment  # restart the OushuDB segment on this host
hawq stop master      # stop the OushuDB master on this host
hawq stop segment     # stop the OushuDB segment on this host
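After a start or restart, you can confirm that the cluster came up using the state command (listed in the hawq --help output below); a minimal check might look like this:

source /usr/local/hawq/greenplum_path.sh  # make sure the hawq utilities are on PATH
hawq state                                # show the status of the OushuDB cluster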
Starting/Stopping Magma
OushuDB 4.0 introduces the ability to start and stop the Magma service on its own. The commands are as follows:
# Method 1: start/stop the OushuDB 4.0 cluster together with the Magma service
# (only the hawq init|start|stop cluster commands accept the --with_magma option)
hawq init cluster --with_magma  # pass --with_magma when bringing up the OushuDB cluster to start the Magma service as well; not supported in 3.X versions

# Method 2: start/stop the Magma service independently
magma start|stop|restart cluster
magma start|stop|restart node
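For example, to bounce only the Magma service while leaving the rest of the cluster running, a sketch using the commands above would be:

magma stop cluster   # stop the Magma service on every node
magma start cluster  # bring it back up; use "magma restart node" to bounce just the local node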
Detailed usage of the OushuDB hawq command is available via "hawq --help".
changlei:build ChangLei$ hawq --help
usage: hawq <command> [<object>] [options]
[--version]
The most commonly used hawq "commands" are:
   start         Start hawq service.
   stop          Stop hawq service.
   init          Init hawq service.
   restart       Restart hawq service.
   activate      Activate hawq standby master as master.
   version       Show hawq version information.
   config        Set hawq GUC values.
   state         Show hawq cluster status.
   filespace     Create hawq filespaces.
   extract       Extract table metadata into a YAML formatted file.
   load          Load data into hawq.
   scp           Copies files between multiple hosts at once.
   ssh           Provides ssh access to multiple hosts at once.
   ssh-exkeys    Exchanges SSH public keys between hosts.
   check         Verifies and validates HAWQ settings.
   checkperf     Verifies the baseline hardware performance of hosts.
   register      Register parquet files generated by other system into the corresponding table in HAWQ
See 'hawq <command> help' for more information on a specific command.
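For instance, the config subcommand manages server GUC values. As a hedged sketch, the flags below follow Apache HAWQ usage (-s shows a value, -c/-v changes one); confirm them on your version with "hawq config help":

hawq config -s default_hash_table_bucket_number       # show the current value of a GUC
hawq config -c default_hash_table_bucket_number -v 6  # set it to 6; typically takes effect after a restart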
2. Creating Databases and Tables
This section uses psql, OushuDB's command-line tool, to show how to create the basic database objects: databases and tables. Because OushuDB is compatible with PostgreSQL, working with OushuDB is essentially the same as working with PostgreSQL; wherever the OushuDB documentation is unclear, you can consult the PostgreSQL documentation to learn more about OushuDB.
The following session uses psql to connect to postgres, the database created by the default OushuDB installation, then creates a new database test and a table foo inside it.
changlei:build ChangLei$ psql -d postgres
psql (8.2.15)
Type "help" for help.
postgres=# create database test; # create the database test
CREATE DATABASE
postgres=# \c test # connect to the test database
You are now connected to database "test" as user "ChangLei".
test=# create table foo(id int, name varchar); # create the table foo
CREATE TABLE
test=# \d # list all tables in the current database test
List of relations
Schema | Name | Type | Owner | Storage
--------+------+-------+----------+-------------
public | foo | table | ChangLei | append only
(1 row)
test=# insert into foo values(1, 'hawq'),(2, 'hdfs');
INSERT 0 2
test=# select * from foo; # select data from table foo
id | name
----+------
1 | hawq
2 | hdfs
(2 rows)
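Note the Storage column in the \d output above: tables default to the append-only format. If you need a different layout, HAWQ-style WITH options can be supplied at creation time; this is a sketch using Apache HAWQ option names, which may vary by OushuDB version:

test=# create table foo_parquet(id int, name varchar) with (appendonly=true, orientation=parquet);
CREATE TABLE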
To drop a table or a database, use a drop statement.
test=# drop table foo;
DROP TABLE
test=# \d
No relations found.
test=# drop database test; # fails because we are currently inside test
ERROR: cannot drop the currently open database
test=# \c postgres # connect to the postgres database first, then drop test
You are now connected to database "postgres" as user "ChangLei".
postgres=# drop database test;
DROP DATABASE
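Because OushuDB is PostgreSQL-compatible, the standard PostgreSQL wrapper utilities offer a shell-level alternative, assuming they are installed alongside OushuDB:

createdb test  # equivalent to "create database test;" in psql
dropdb test    # equivalent to "drop database test;"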
3. Inspecting Query Execution
The \timing command prints the execution time of queries.
test=# \timing on
Timing is on.
test=# select * from foo; # SQL statements now report their execution time
id | name
----+------
1 | hawq
2 | hdfs
(2 rows)
Time: 16.369 ms
test=# \timing off # turn timing output off
Timing is off.
The explain statement displays the query plan.
test=# explain select count(*) from foo;
QUERY PLAN
----------------------------------------------------------------------------------
Aggregate (cost=1.07..1.08 rows=1 width=8)
-> Gather Motion 1:1 (slice1; segments: 1) (cost=1.03..1.06 rows=1 width=8)
-> Aggregate (cost=1.03..1.04 rows=1 width=8)
-> Append-only Scan on foo (cost=0.00..1.02 rows=2 width=0)
Settings: default_hash_table_bucket_number=6
(5 rows)
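The Settings line shows the GUC values the plan was produced under. You can check such a value from psql with show; whether it can also be changed per session or only in the server configuration depends on the particular GUC:

test=# show default_hash_table_bucket_number; # display the value in effect for this session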
explain analyze shows how a query actually executed, including when each operator started and finished, which helps you locate a query's bottleneck and then optimize it. For an explanation of query plans and of explain analyze output, see the Query Plans and Query Execution chapter. For any given query there can be a vast number of possible plans; producing an optimized plan is the job of the query optimizer. A query's execution time depends heavily on its plan, so familiarity with query plans and with how a query actually executes is very valuable for query optimization.
test=# explain analyze select count(*) from foo;
QUERY PLAN
----------------------------------------------------------------------------------------------------
Aggregate (cost=1.07..1.08 rows=1 width=8)
Rows out: Avg 1.0 rows x 1 workers. Max/Last(seg-1:changlei/seg-1:changlei) 1/1 rows with 5.944/5.944 ms to end, start offset by 6.568/6.568 ms.
-> Gather Motion 1:1 (slice1; segments: 1) (cost=1.03..1.06 rows=1 width=8)
Rows out: Avg 1.0 rows x 1 workers at destination. Max/Last(seg-1:changlei/seg-1:changlei) 1/1 rows with 5.941/5.941 ms to first row, 5.942/5.942 ms to end, start offset by 6.569/6.569 ms.
-> Aggregate (cost=1.03..1.04 rows=1 width=8)
Rows out: Avg 1.0 rows x 1 workers. Max/Last(seg0:changlei/seg0:changlei) 1/1 rows with 5.035/5.035 ms to first row, 5.036/5.036 ms to end, start offset by 7.396/7.396 ms.
-> Append-only Scan on foo (cost=0.00..1.02 rows=2 width=0)
Rows out: Avg 2.0 rows x 1 workers. Max/Last(seg0:changlei/seg0:changlei) 2/2 rows with 5.011/5.011 ms to first row, 5.032/5.032 ms to end, start offset by 7.397/7.397 ms.
Slice statistics:
(slice0) Executor memory: 223K bytes.
(slice1) Executor memory: 279K bytes (seg0:changlei).
Statement statistics:
Memory used: 262144K bytes
Settings: default_hash_table_bucket_number=6
Dispatcher statistics:
executors used(total/cached/new connection): (1/1/0); dispatcher time(total/connection/dispatch data): (1.462 ms/0.000 ms/0.029 ms).
dispatch data time(max/min/avg): (0.029 ms/0.029 ms/0.029 ms); consume executor data time(max/min/avg): (0.012 ms/0.012 ms/0.012 ms); free executor time(max/min/avg): (0.000 ms/0.000 ms/0.000 ms).
Data locality statistics:
data locality ratio: 1.000; virtual segment number: 1; different host number: 1; virtual segment number per host(avg/min/max): (1/1/1); segment size(avg/min/max): (56.000 B/56 B/56 B); segment size with penalty(avg/min/max): (56.000 B/56 B/56 B); continuity(avg/min/max): (1.000/1.000/1.000); DFS metadatacache: 0.049 ms; resource allocation: 0.612 ms; datalocality calculation: 0.085 ms.
Total runtime: 13.398 ms
(20 rows)