E-MapReduce: Bank Employee Information Query Example
Paragraph 1: Create a temporary table
- %spark
- import org.apache.commons.io.IOUtils
- import java.net.URL
- import java.nio.charset.Charset
- // Zeppelin creates and injects sc (SparkContext) and sqlContext (HiveContext or SQLContext),
- // so you don't need to create them manually.
- // load bank data
- val bankText = sc.parallelize(
- IOUtils.toString(
- new URL("http://emr-sample-projects.oss-cn-hangzhou.aliyuncs.com/bank.csv"),
- Charset.forName("utf8")).split("\n"))
- case class Bank(age: Integer, job: String, marital: String, education: String, balance: Integer)
- val bank = bankText.map(s => s.split(";")).filter(s => s(0) != "\"age\"").map(
- s => Bank(s(0).toInt,
- s(1).replaceAll("\"", ""),
- s(2).replaceAll("\"", ""),
- s(3).replaceAll("\"", ""),
- s(5).replaceAll("\"", "").toInt
- )
- ).toDF()
- bank.registerTempTable("bank")
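The parsing step in the paragraph above can be tried without a cluster. The following is a minimal sketch in plain Scala collections rather than Spark, assuming the file is semicolon-separated with a quoted header row. The two sample lines are made up for illustration and only mimic that assumed layout; `ParseSketch`, `sampleLines`, and `parse` are hypothetical names that do not appear in the original notebook.

```scala
// Standalone sketch of the split / filter / map chain, runnable without Spark.
case class Bank(age: Integer, job: String, marital: String, education: String, balance: Integer)

object ParseSketch {
  // Made-up sample input: one header line and one data line (assumed layout).
  val sampleLines: Seq[String] = Seq(
    "\"age\";\"job\";\"marital\";\"education\";\"default\";\"balance\"",
    "30;\"unemployed\";\"married\";\"primary\";\"no\";1787"
  )

  def parse(lines: Seq[String]): Seq[Bank] =
    lines
      .map(_.split(";"))                           // split each line into fields
      .filter(fields => fields(0) != "\"age\"")    // drop the header row
      .map { f =>
        Bank(
          f(0).toInt,
          f(1).replaceAll("\"", ""),               // strip surrounding quotes
          f(2).replaceAll("\"", ""),
          f(3).replaceAll("\"", ""),
          f(5).replaceAll("\"", "").toInt          // index 4 is skipped, as in the notebook
        )
      }

  def main(args: Array[String]): Unit =
    parse(sampleLines).foreach(println)
}
```

In the notebook the same chain runs on an RDD, and `toDF()` then turns the resulting `Bank` objects into a DataFrame whose columns are the case class fields.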
Paragraph 2: Inspect the table schema
- %sql
- desc bank
Paragraph 3: Count employees at each age below 30
- %sql select age, count(1) value from bank where age < 30 group by age order by age
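To make the filter, group, and sort order of this query concrete, here is a rough equivalent over an in-memory list, written with plain Scala collections rather than Spark SQL. The ages are made up for illustration, and `AgeCountSketch` and `countByAgeUnder` are hypothetical names.

```scala
object AgeCountSketch {
  // Hypothetical ages, made up for illustration.
  val ages: Seq[Int] = Seq(25, 19, 25, 31, 19, 25, 42)

  // Mirrors: select age, count(1) value from bank where age < limit group by age order by age
  def countByAgeUnder(ages: Seq[Int], limit: Int): Seq[(Int, Int)] =
    ages
      .filter(_ < limit)                               // where age < limit
      .groupBy(identity)                               // group by age
      .map { case (age, group) => (age, group.size) }  // count(1) per group
      .toSeq
      .sortBy(_._1)                                    // order by age

  def main(args: Array[String]): Unit =
    countByAgeUnder(ages, 30).foreach { case (age, value) => println(s"$age\t$value") }
}
```

Ages 31 and 42 are dropped by the filter before grouping, so they never appear in the result.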
Paragraph 4: Query employees aged 20 or younger
- %sql select * from bank where age <= 20