@[toc]
入门
概述
Scala combines object-oriented and functional programming in one concise, high-level language. Scala's static types help avoid bugs in complex applications, and its JVM and JavaScript runtimes let you build high-performance systems with easy access to huge ecosystems of libraries.
- 学习Scala的意义:
Spark、Kafka、Flink
优雅
开发速度快
融合到生态圈
安装
- 安装 Java 8
- 下载 download scala 网址:https://www.scala-lang.org/download/2.11.8.html
- 解压 unzip scala
- 配置环境变量(可选)
Windows 需配置两个 Path中: D:\scala\bin 和 D:\scala\jre\bin
- 查看是否生效
Linux或Mac中操作步骤:
1.tar -zxvf scala-2.11.8.tgz -C 解压路径
2.到解压目录下 pwd 复制整个路径
3.将上面的路径 添加到环境变量中
vi ~/.bash_profile
export SCALA_HOME=复制的路径
export PATH=$SCALA_HOME/bin:$Path
保存
source ~/.bash_profile
echo $SCALA_HOME
下载之后的scala目录下的bin目录中 有普通文件 和 .bat文件
.bat文件是在Windows中用的,Linux或Mac中用不到,所以可以删掉 rm *.bat
Java VS Scala
Java
public class HelloWorld {
public static void main(String[] args) {
System.out.println("Hello World..");
}
}
Scala(每行代码并不强求使用;结束,但是Java是必须的)
Object HelloWorld {
def main(args : Array[String]) {
println("Hello World..")
}
}
val 和 var
- val:值
final
val 值名称:类型=xx
val a = 1 (不可变)
val a : int = 1
- var:变量
var 值名称:类型=xxx
var b = 1
var b : int = 1
基本数据类型
- Byte/Char
- Short/Int/Long/Float/Double
- Boolean
只有Float声明时比较特别
- var c : Float = 1.1f
scala> b=20
b: Int = 20
scala> val b:Int =10
b: Int = 10
scala> val c:Boolean=true
c: Boolean = true
scala> val d =1.1
d: Double = 1.1
scala> val e:Float=1.2f
e: Float = 1.2
lazy在Scala中的应用
lazy var d : int = 1;
延迟加载,只有在第一次使用时才加载
读取文件并以字符串形式输出
import scala.io.Source._
var info = fromFile("...").mkString
如果用lazy var info = fromFile("...").mkString,开始是检测不到错误的,要小心使用
*注意:当一个变量声明为lazy,只有当你第一次操作时才会去真正访问,如果不去访问,即使写错了,也不会发现
开发工具IDEA
Maven
1.下载IDEA和Maven
2.进入IDEA,新建项目 选择Maven 勾选create from archetype 选择scala-archetype simple-> 正常创建(注意Maven仓库位置)
3.IDEA默认是不支持Scala的,需要下载Scala插件
File -> settings -> Plugins -> install JetBrains plugin -> scala
之后就可以new 一个Scala类了
4.新建测试类,运行报错
删除pom.xml中<arg>make:transitive</args>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>org.example</groupId>
<artifactId>untitled5</artifactId>
<version>1.0-SNAPSHOT</version>
<inceptionYear>2008</inceptionYear>
<properties>
<maven.compiler.source>1.8</maven.compiler.source>
<maven.compiler.target>1.8</maven.compiler.target>
<encoding>UTF-8</encoding>
<scala.version>2.11.8</scala.version>
<spark.version>2.3.0</spark.version>
</properties>
<dependencies>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-library</artifactId>
<version>${scala.version}</version>
</dependency>
<!--引入Spark Core的依赖-->
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-core -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.11</artifactId>
<version>2.3.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-sql -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.12</artifactId>
<version>2.4.3</version>
</dependency>
</dependencies>
<build>
<sourceDirectory>src/main/scala</sourceDirectory>
<testSourceDirectory>src/test/scala</testSourceDirectory>
<plugins>
<plugin>
<groupId>org.scala-tools</groupId>
<artifactId>maven-scala-plugin</artifactId>
<version>2.15.0</version>
<executions>
<execution>
<goals>
<goal>compile</goal>
<goal>testCompile</goal>
</goals>
<configuration>
<args>
<arg>-dependencyfile</arg>
<arg>${project.build.directory}/.scala_dependencies</arg>
</args>
</configuration>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-surefire-plugin</artifactId>
<version>2.6</version>
<configuration>
<useFile>false</useFile>
<disableXmlReport>true</disableXmlReport>
<!-- If you have classpath issue like NoDefClassError,... -->
<!-- useManifestOnlyJar>false</useManifestOnlyJar -->
<includes>
<include>**/*Test.*</include>
<include>**/*Suite.*</include>
</includes>
</configuration>
</plugin>
</plugins>
</build>
</project>
函数
方法定义
def 方法名(参数: 参数类型): 返回值类型 = {
//方法体
//最后一行作为返回值(不需要使用return)
}
def max(x: Int, y: Int): Int = {
if(x > y)
x
else
y
}
package org.example
object App {
def main(args: Array[String]): Unit = {
println(add(2,5))
}
def add(x:Int,y:Int):Int={
x+y
}
}
7
package org.example
object App {
def main(args: Array[String]): Unit = {
println(three())
//没有入参的时候可以不用写
println(three)
}
def three()=1+2
}
无返回值 自动加Unit
默认参数
默认参数: 在函数定义时,允许指定参数的默认值
//参数
def sayName(name: String ) = {
println(name)
}
//默认参数
def sayName1(name: String ="Jack") = {
println(name)
}
//main调用
sayName("jaja")
sayName1()
sayName1("Ma")
jaja
Jack
Ma
相关源码:SparkContext中使用
命名参数
可以修改参数的传入 顺序
def speed(destination: Float, time: Float): Float {
destination / time
}
println(speed(100, 10))
println(speed(time = 10, destination = 100))
可变参数
变参数(可传入任意多个相同类型的参数) java中 int... numbers
JDK5+:可变参数
def sum(number: Int*) = {
var result = 0
for(num <- number) {
result += num
}
result
}
相关源码:org.apache.spark.sql.Dataset中的select方法
条件语句
循环语句
- to 1 to 10 (左闭右闭) 1.to(10)
- range Range(1,10) (左闭右开的) Range(1,10,2) (2为步长)
- until 1 until 10 (左闭右开)
to、until的底层调用都是Range
scala> 1 to 10
res1: scala.collection.immutable.Range.Inclusive = Range(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
scala> Range(1,10)
res2: scala.collection.immutable.Range = Range(1, 2, 3, 4, 5, 6, 7, 8, 9)
scala> 1.to(10)
res3: scala.collection.immutable.Range.Inclusive = Range(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
scala> Range(1,10,2)
res4: scala.collection.immutable.Range = Range(1, 3, 5, 7, 9)
scala> Range(1,10,5)
res5: scala.collection.immutable.Range = Range(1, 6)
scala> Range(10,1,-1)
res8: scala.collection.immutable.Range = Range(10, 9, 8, 7, 6, 5, 4, 3, 2)
scala> 1 until 10
res9: scala.collection.immutable.Range = Range(1, 2, 3, 4, 5, 6, 7, 8, 9)
- for
for(i <- 1.to(10)) {
println(i)
}
for(i <- 1.until(10, 2)) {
println(i)
}
for(i <- 1 to 10 if i % 2 == 0) {
println(i)
}
val courses = Array("Hadoop", "Spark SQL", "Spark Streaming", "Storm", "Scala")
for(x<- courses) {
println(x)
}
//x其实就是courses里面的每个元素
// => 就是将左边的x作用上一个函数,变成另外一个结果
courses.foreach(x=> println(x))
- while
var (num, sum) = (100, 0)
while(num > 0){
sum = sum + num
num = num - 1
}
println(sum)
面向对象
概述
Java/Scala OO(Object Oriented)
- 封装:属性、方法封装到类中,可设置访问级别
- 继承:父类和子类之间的关系 重写
- 多态:父类引用指向子类对象 开发框架基石
Person person = new Person();
User user = new User();
Person person =new User();
类的定义和使用
package org.example
object ObjectApp {
def main(args: Array[String]): Unit = {
val person = new People()
person.name = "Messi"
// println(person.name + ".." + person.age)
println("invoke eat method: " + person.eat)
person.watchFootball("Barcelona")
person.printInfo()
//编译不通过 private 修饰
// println(person.gender)
}
}
class People{
//var(变量)类型自动生成getter/setter
//这种写法就是一个占位符
var name: String = _
//val(常量)类型自动生成getter
val age: Int = 10
private [this] var gender = "male"
def printInfo() : Unit = {
print("gender: " + gender)
}
def eat(): String = {
name + " eat..."
}
def watchFootball(teamName: String): Unit = {
println(name + " is watching match of " + teamName)
}
}
invoke eat method: Messi eat...
Messi is watching match of Barcelona
gender: male
继承和重写
- 继承
class Student(name: String, age: Int, var major: String) extends Person(name, age) {}
- 重写
override def acquireUnrollMemory()
override def toString = "test override"
package org.example
object ConstructorApp {
def main(args: Array[String]): Unit = {
var person =new Person("zhangsan",99)
println(person.age+":"+person.name)
var person2 =new Person("zhangsan",99,"Man")
println(person2.age+":"+person2.name+";"+person2.gender)
}
}
//主构造器
class Person(val name: String, val age: Int){
println("Person constructor enter...")
val school = "ustc"
//占位符肯定要预先指定类型
var gender: String = _
//附属构造器
def this(name: String , age: Int, gender: String){
//必须要调用主构造器或者其他附属构造器
this(name, age)
this.gender = gender
}
override def toString = "test override"
println("Person Constructor leave...")
}
//继承
//name: String, age: Int, var major: String 继承父类的可以不用直接写var 否则需要重新申明
class Student(name: String, age: Int, var major: String) extends Person(name, age) {
//重写
override val school = "pku"
println("Person Student enter...")
println("Person Student leave...")
}
抽象类
package org.example
object AbstractApp {
def main(args: Array[String]): Unit = {
var stu =new Student1();
println(stu.age)
println(stu.name)
stu.speak;
}
}
abstract class Person3{
def speak
val name: String
val age: Int
}
class Student1 extends Person3{
override def speak: Unit = {
println("speak")
}
override val name: String = "Messi"
override val age: Int = 32
}
伴生类和伴生对象
如果有一个class
,还有一个与class
同名的object
互为 伴生类和伴生对象
class ApplyTest{
def apply(){
println(...)
}
}
object ApplyTest{
def apply(){
println("Object ApplyTest apply...")
new ApplyTest
}
}
类名() ==> Object.apply
对象() ==> Class.apply
最佳实践:在Object的apply方法中去new一个Class
package org.example
object ApplyApp {
def main(args: Array[String]): Unit = {
// for(i<-1 to 10){
// ApplyTest.incr
// }
// //object 是一个单例对象
// println(ApplyTest.count)
var b=ApplyTest()
//默认走的是object=》apply
//Object ApplyTest apply...
println("-----------------------")
var c= new ApplyTest()
c()
//Class ApplyTest apply...
}
}
class ApplyTest {
def apply() = {
println("Class ApplyTest apply...")
}
}
object ApplyTest {
println("Object start...")
var count = 0
def incr={
count=count+1
}
def apply() = {
println("Object ApplyTest apply...")
//在object中的apply中new class
new ApplyTest
}
println("Object end...")
}
case和trait
case class :不用new
case class Dog(name: String)
直接可以调用Dog("wangcai")
Trait: 类似implements
xxx entends ATrait
xxx extends Cloneable with Logging with Serializable
源码中Partition类
集合
数组
package org.example
object ArrayApp extends App{
//println("hello")
val a = new Array[String](5)
a(0)="hello"
println(a(0))
val b = Array("hello","world")
val c = Array(1,2,3,4,5,67)
c.sum
c.max
c.mkString("/")
}
val d=scala.collection.mutable.ArrayBuffer[Int]()
d+=1
d+=2
d+=(2,33,4)
d++=Array(33,45,22)
println(d+"-------------------")
d.insert(0,999)
d.remove(1,2)
d.trimEnd(2)
println(d+"-------------------")
//转化成不可变的
d.toString()
for(i<-0 until d.length){
println(c(i))
}
hello
ArrayBuffer(1, 2, 2, 33, 4, 33, 45, 22)-------------------
ArrayBuffer(999, 2, 33, 4, 33)-------------------
1
2
3
4
5
List
list是不可变的,对list进行添加删除或者取值等操作均会返回一个新的list。
scala> Nil
res4: scala.collection.immutable.Nil.type = List()
scala> Nil
res4: scala.collection.immutable.Nil.type = List()
scala> val l= List(1,2,3,4,5,56)
l: List[Int] = List(1, 2, 3, 4, 5, 56)
scala> l.head
head headOption
scala> l.head
res5: Int = 1
scala> l.tail
tail tails
scala> l.tail
res6: List[Int] = List(2, 3, 4, 5, 56)
scala> l.tails
res7: Iterator[List[Int]] = non-empty iterator
scala>
val d=scala.collection.mutable.ArrayBuffer[Int]()
d+=1
d+=2
d+=(2,33,4)
d++=Array(33,45,22)
d++ =List(1,2,3,4,)
Set
set是一个非重复的集合,若有重复数据,则会自动去重。
scala> val set = Set(1,2,3,1,2,5)
set: scala.collection.immutable.Set[Int] = Set(1, 2, 3, 5)
Map
map是K-V键值对集合。
package org.example
object MapApp {
def main(args: Array[String]): Unit = {
val map = Map(
"1" -> "hello" ,
2 -> "world",
3 -> "!!!!!"
)
println(map.mkString(","))
println("-----------------------")
for(x<-map){
println(x._1+":"+x._2)
}
println("-----------------------")
var keys = map.keys
var keyIterator = keys.iterator
while(keyIterator.hasNext) {
val key = keyIterator.next()
println(key + "\t" + map.get(key).get)
}
}
}
1 -> hello,2 -> world,3 -> !!!!!
-----------------------
1:hello
2:world
3:!!!!!
-----------------------
1 hello
2 world
3 !!!!!
Optuon&Some&None
val map = Map(
"1" -> "hello" ,
2 -> "world",
3 -> "!!!!!"
)
println(map.get(2))
println(map.get(999))
Some(world)
None
option.scala
@SerialVersionUID(5066590221178148012L) // value computed by serialver for 2.11.2, annotation added in 2.11.4
case object None extends Option[Nothing] {
def isEmpty = true
def get = throw new NoSuchElementException("None.get")
}
@SerialVersionUID(1234815782226070388L) // value computed by serialver for 2.11.2, annotation added in 2.11.4
final case class Some[+A](x: A) extends Option[A] {
def isEmpty = false
def get = x
}
Tuple
与列表一样,与列表不同的是元组可以包含不同类型的元素。元组的值是通过将单个的值包含在圆括号中构成的。创建过程可加new关键词,也可不加。
package org.example
object TupleApp {
def main(args: Array[String]): Unit = {
var t=new Tuple3[Int,Int,String](1,99,"hello")
println(t.toString())
println("----------------")
var t2=(9999,"hello")
println(t2.toString())
println(t2.swap.toString())
}
}
(1,99,hello)
----------------
(9999,hello)
(hello,9999)
模式匹配
基本类型
Java : 对一个值进行条件判断例如switch
模式匹配类似于java的switch case。Scala的模式匹配不仅可以匹配值还可以匹配类型、从上到下顺序匹配,如果匹配到则不再往下匹配、都匹配不上时,会匹配到case _ ,相当于default、match 的最外面的”{ }”可以去掉看成一个语句。
def match_test(m:Any) = {
m match {
case 1 => println("nihao")
case m:Int => println("Int")
case _ => println("default")
}
}
package org.example
object MarchApp {
def main(args: Array[String]): Unit = {
def judeGrade(grade:String)={
grade match{
case "B" => println("Just so so")
case "A" => println("good")
case "S" => println("cool")
case _=> println("No.1")
}
}
judeGrade("S")
judeGrade("A")
judeGrade("SSS")
}
}
cool
good
No.1
Array
def greeting(array:Array[String]) = {
array match {
case Array("zs")=> println("hi,zs")
case Array(x,y)=> println(x+"and"+y)
case Array("zs",_*)=>println("zs and other")
case _=>println("everyone")
}
}
greeting(Array("zs"))
greeting(Array("zs","ls"))
hi,zs
zsandls
List
def greeting1(list:List[String]) = {
list match {
case "zs"::Nil=> println("hi,zs")
case x::y::Nil=> println(x+"and"+y)
case "zs"::tail =>println("zs and other")
case _=>println("everyone")
}
}
greeting1(List("zs"))
greeting1(List("zs","ls"))
}
类型匹配
def matchType(obj: Any) = {
obj match {
case x: Int => println("hi,int")
case y: String => println(y)
case m: Map[_, _] => println("map")
case _ => println("everyone")
}
}
matchType(Map(1 -> "yes"))
matchType(11)
matchType("hello")
}
异常处理
try{
val i=10/0
println(i)
}catch {
case e:ArithmeticException=>println(e.getMessage)
case e:Exception=>println(e.getMessage)
}finally {
}
/ by zero
高级函数
字符串
- 插值
val s ="hello"
val name="jacksun"
println(s+name)
println(s+":"+name)
println(s"hello:$name")
- 多行字符串
//多行
var d =
"""
|1
|2
|3
|4
|5
|5
|6
""".stripMargin
匿名函数
匿名函数分为有参匿名函数、无参匿名函数、有返回值的匿名函数。(可以将匿名参数的返回给一个val声明的值,匿名函数不能显式的声明返回值)
package org.example
object FunctionApp extends App {
//有参数匿名函数
val printy = (a : Int) => {
println(a)
}
printy(999)
//无参数匿名函数
val printx = ()=>{
println("Scala No.1")
}
printx()
//有返回值的匿名函数
val add = (a:Int,b:Int) =>{
a+b
}
println(add(4,4))
}
Currying
将接受一个参数的转化成2个
def add(a:Int,b:Int) = a+b
println(add(2,1))
//Currying
def add2(a:Int)(b:Int) = a+b
println(add2(2)(1))
高阶函数
高阶函数(Higher-Order Function)就是操作其他函数的函数。
Scala 中允许使用高阶函数, 高阶函数可以使用其他函数作为参数,或者使用函数作为输出结果。
object Test {
def main(args: Array[String]) {
println( apply( layout, 10) )
}
// 函数 f 和 值 v 作为参数,而函数 f 又调用了参数 v
def apply(f: Int => String, v: Int) = f(v)
def layout[A](x: A) = "[" + x.toString() + "]"
}
- map
对每个集合的元组进行操作
scala> val l =List(1,2,3,4,5,6,7,8,9)
l: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9)
scala> l.map(x=>(x+1))
res5: List[Int] = List(2, 3, 4, 5, 6, 7, 8, 9, 10)
scala> l.map((x:Int)=>x*2)
res6: List[Int] = List(2, 4, 6, 8, 10, 12, 14, 16, 18)
scala> l.map(x=>x*2)
res7: List[Int] = List(2, 4, 6, 8, 10, 12, 14, 16, 18)
scala> l.map(_*2)
- filter
过滤条件
scala> l.filter(_>5)
res9: List[Int] = List(6, 7, 8, 9)
- take
取数
scala> l.take(1)
res10: List[Int] = List(1)
scala> l.take(3)
res11: List[Int] = List(1, 2, 3)
- reduce
两两相加相减
scala> l.take(3).reduce(_-_)
res15: Int = -4
// 从左相减
scala> l.take(3).reduceLeft(_-_)
res16: Int = -4
// 从右相减
scala> l.take(3).reduceRight(_-_)
res17: Int = 2