Apache IoTDB System Integration: Spark IoTDB Connector

2023-09-19

version

The versions required for Spark and Java are as follows:

Spark Version	Scala Version	Java Version	TsFile Version
`2.4.3`	`2.11`	`1.8`	`0.10.0`

install

mvn clean scala:compile compile install

1. maven dependency

<dependency>
  <groupId>org.apache.iotdb</groupId>
  <artifactId>spark-iotdb-connector</artifactId>
  <version>0.10.0</version>
</dependency>

2. spark-shell user guide

spark-shell --jars spark-iotdb-connector-0.10.0.jar,iotdb-jdbc-0.10.0-jar-with-dependencies.jar
import org.apache.iotdb.spark.db._
val df = spark.read.format("org.apache.iotdb.spark.db").option("url","jdbc:iotdb://127.0.0.1:6667/").option("sql","select * from root").load
df.printSchema()
df.show()

To partition the RDD, you can do the following:

spark-shell --jars spark-iotdb-connector-0.10.0.jar,iotdb-jdbc-0.10.0-jar-with-dependencies.jar
import org.apache.iotdb.spark.db._
val df = spark.read.format("org.apache.iotdb.spark.db").option("url","jdbc:iotdb://127.0.0.1:6667/").option("sql","select * from root").
option("lowerBound", [lower bound of the query time range (inclusive)]).option("upperBound", [upper bound of the query time range (inclusive)]).
option("numPartition", [number of partitions you want]).load
df.printSchema()
df.show()
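
To make the roles of lowerBound, upperBound, and numPartition more concrete, the plain-Java sketch below splits an inclusive time range into contiguous sub-ranges, one per partition. The class TimeRangeSplitter and its split method are hypothetical names used only for illustration, and the even split is an assumption: the connector's actual partitioning logic may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of spark-iotdb-connector.
final class TimeRangeSplitter {

    // Splits the inclusive range [lower, upper] into contiguous inclusive
    // sub-ranges whose sizes differ by at most one timestamp.
    static List<long[]> split(long lower, long upper, int numPartitions) {
        if (numPartitions <= 0 || upper < lower) {
            throw new IllegalArgumentException("need numPartitions > 0 and lower <= upper");
        }
        long total = upper - lower + 1;               // timestamps in the range
        long parts = Math.min(numPartitions, total);  // avoid empty partitions
        long base = total / parts;
        long extra = total % parts;                   // first `extra` ranges get one more
        List<long[]> ranges = new ArrayList<>();
        long start = lower;
        for (long i = 0; i < parts; i++) {
            long size = base + (i < extra ? 1 : 0);
            ranges.add(new long[] {start, start + size - 1});
            start += size;
        }
        return ranges;
    }

    public static void main(String[] args) {
        // Print the time predicate each partition would be responsible for.
        for (long[] r : split(1000L, 1099L, 4)) {
            System.out.println("time >= " + r[0] + " and time <= " + r[1]);
        }
    }
}
```

Under this assumption, each partition would read only the rows whose timestamps fall inside its own sub-range, so the partitions can be loaded in parallel without overlap.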

3. Schema inference

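Take the following TsFile structure as an example. The TsFile schema contains three measurements: status, temperature, and hardware. The basic information for these three measurements is as follows: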

Name	Type	Encode
status	Boolean	PLAIN
temperature	Float	RLE
hardware	Text	PLAIN

The existing data in the TsFile is as follows:

device:root.ln.wf01.wt01				device:root.ln.wf02.wt02
status		temperature		hardware		status
time	value	time	value	time	value	time	value
1	True	1	2.2	2	“aaa”	1	True
3	True	2	2.2	4	“bbb”	2	False
5	False	3	2.1	6	“ccc”	4	True

The wide (default) table form is as follows:

time	root.ln.wf02.wt02.temperature	root.ln.wf02.wt02.status	root.ln.wf02.wt02.hardware	root.ln.wf01.wt01.temperature	root.ln.wf01.wt01.status	root.ln.wf01.wt01.hardware
1	null	true	null	2.2	true	null
2	null	false	aaa	2.2	null	null
3	null	null	null	2.1	true	null
4	null	true	bbb	null	null	null
5	null	null	null	null	false	null
6	null	null	ccc	null	null	null
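
One way to read the wide form is as an outer join of all series on the timestamp: every distinct timestamp becomes a row, and a series with no point at that timestamp contributes null. The plain-Java sketch below rebuilds that shape from the sample data above. WideTableSketch and its methods are hypothetical names, this is not the connector's implementation, and only the four series that actually hold data appear as columns here.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration; not part of spark-iotdb-connector.
final class WideTableSketch {

    // series: full series path -> (timestamp -> value).
    // Returns timestamp -> (series path -> value or null), ordered by timestamp.
    static TreeMap<Long, Map<String, Object>> toWide(Map<String, Map<Long, Object>> series) {
        TreeMap<Long, Map<String, Object>> rows = new TreeMap<>();
        for (Map.Entry<String, Map<Long, Object>> s : series.entrySet()) {
            for (Map.Entry<Long, Object> point : s.getValue().entrySet()) {
                Map<String, Object> row = rows.computeIfAbsent(point.getKey(), t -> {
                    // A new row starts with null in every column.
                    Map<String, Object> r = new LinkedHashMap<>();
                    for (String column : series.keySet()) r.put(column, null);
                    return r;
                });
                row.put(s.getKey(), point.getValue());
            }
        }
        return rows;
    }

    // Builds a timestamp -> value map from alternating (time, value) arguments.
    static Map<Long, Object> points(Object... timeValuePairs) {
        Map<Long, Object> m = new LinkedHashMap<>();
        for (int i = 0; i < timeValuePairs.length; i += 2) {
            m.put(((Number) timeValuePairs[i]).longValue(), timeValuePairs[i + 1]);
        }
        return m;
    }

    // The sample data from the TsFile table above.
    static Map<String, Map<Long, Object>> sample() {
        Map<String, Map<Long, Object>> s = new LinkedHashMap<>();
        s.put("root.ln.wf01.wt01.status", points(1, true, 3, true, 5, false));
        s.put("root.ln.wf01.wt01.temperature", points(1, 2.2f, 2, 2.2f, 3, 2.1f));
        s.put("root.ln.wf02.wt02.hardware", points(2, "aaa", 4, "bbb", 6, "ccc"));
        s.put("root.ln.wf02.wt02.status", points(1, true, 2, false, 4, true));
        return s;
    }

    public static void main(String[] args) {
        toWide(sample()).forEach((time, row) -> System.out.println(time + "\t" + row.values()));
    }
}
```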

You can also use the narrow table form, which is as follows (see part 4 for how to convert to the narrow form):

time	device_name	status	hardware	temperature
1	root.ln.wf01.wt01	true	null	2.2
1	root.ln.wf02.wt02	true	null	null
2	root.ln.wf01.wt01	null	null	2.2
2	root.ln.wf02.wt02	false	aaa	null
3	root.ln.wf01.wt01	true	null	2.1
4	root.ln.wf02.wt02	true	bbb	null
5	root.ln.wf01.wt01	false	null	null
6	root.ln.wf02.wt02	null	ccc	null

4. Transformation between wide and narrow tables

From wide to narrow

import org.apache.iotdb.spark.db._
val wide_df = spark.read.format("org.apache.iotdb.spark.db").option("url", "jdbc:iotdb://127.0.0.1:6667/").option("sql", "select * from root where time < 1100 and time > 1000").load
val narrow_df = Transformer.toNarrowForm(spark, wide_df)

From narrow to wide

import org.apache.iotdb.spark.db._
val wide_df = Transformer.toWideForm(spark, narrow_df)
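
Conceptually, going from wide to narrow means splitting each wide column name into a device prefix and a measurement name, then grouping the non-null cells of a row by device. The plain-Java sketch below shows that idea for a single wide row. NarrowFormSketch is a hypothetical name, and splitting at the last dot is an assumption made for illustration; Transformer.toNarrowForm works on Spark DataFrames and its implementation is not described in this article.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; not part of spark-iotdb-connector.
final class NarrowFormSketch {

    // Converts one wide row (full series path -> value) at a given timestamp
    // into narrow rows: one map per device with "time", "device_name", and
    // that device's non-null measurements.
    static List<Map<String, Object>> toNarrow(long time, Map<String, Object> wideRow) {
        Map<String, Map<String, Object>> byDevice = new LinkedHashMap<>();
        for (Map.Entry<String, Object> cell : wideRow.entrySet()) {
            if (cell.getValue() == null) continue;   // empty cell: nothing to emit
            int dot = cell.getKey().lastIndexOf('.');
            if (dot < 0) continue;                   // not a device.measurement path
            String device = cell.getKey().substring(0, dot);
            String measurement = cell.getKey().substring(dot + 1);
            Map<String, Object> row = byDevice.computeIfAbsent(device, d -> {
                Map<String, Object> r = new LinkedHashMap<>();
                r.put("time", time);
                r.put("device_name", d);
                return r;
            });
            row.put(measurement, cell.getValue());
        }
        return new ArrayList<>(byDevice.values());
    }

    public static void main(String[] args) {
        // The wide row at time 2 from the sample data in part 3.
        Map<String, Object> wideRow = new LinkedHashMap<>();
        wideRow.put("root.ln.wf02.wt02.status", false);
        wideRow.put("root.ln.wf02.wt02.hardware", "aaa");
        wideRow.put("root.ln.wf01.wt01.temperature", 2.2f);
        wideRow.put("root.ln.wf01.wt01.status", null);
        toNarrow(2L, wideRow).forEach(System.out::println);
    }
}
```

A device whose cells are all null at a given timestamp produces no narrow row, which matches the narrow table in part 3, where each timestamp lists only the devices that have data.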

5. Java user guide

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.iotdb.spark.db.*;

public class Example {
  public static void main(String[] args) {
    SparkSession spark = SparkSession
        .builder()
        .appName("Build a DataFrame from Scratch")
        .master("local[*]")
        .getOrCreate();

    // Read from IoTDB; the result is in the default wide form
    Dataset<Row> df = spark.read().format("org.apache.iotdb.spark.db")
        .option("url", "jdbc:iotdb://127.0.0.1:6667/")
        .option("sql", "select * from root").load();
    df.printSchema();
    df.show();

    // Convert the wide table to the narrow form
    Dataset<Row> narrowTable = Transformer.toNarrowForm(spark, df);
    narrowTable.show();
  }
}
