Alfresco: Scala continued

=======================
tuple

A tuple can hold elements of different data types. It is a lightweight collection
that can be thought of as a row.

val captainStuff = ("Picard","Enterprise-D","NCC-1701-D")
println(captainStuff)
println(captainStuff._1)

Tuple fields are read with the accessors _1, _2, and so on; numbering starts at 1.

scala> val captainStuff = ("Picard","Enterprise-D","NCC-1701-D")
captainStuff: (String, String, String) = (Picard,Enterprise-D,NCC-1701-D)

scala> println(captainStuff)
(Picard,Enterprise-D,NCC-1701-D)

scala> println(captainStuff._1)
Picard

===============

val aBunchOfStuff = ("Kirk", 1964, true)
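
As a small sketch of working with a mixed-type tuple, the fields can be read one at a time with the 1-based accessors or unpacked together with a pattern. The names first, second, name, number and flag are illustrative only and do not come from the notes.

```scala
// A tuple whose three fields each have a different type.
val aBunchOfStuff = ("Kirk", 1964, true)

// Accessors are 1-based: _1 is the first field, _2 the second.
val first: String = aBunchOfStuff._1
val second: Int = aBunchOfStuff._2

// A pattern on the left-hand side unpacks all fields in one step.
// (name, number, flag are hypothetical names chosen for this sketch.)
val (name, number, flag) = aBunchOfStuff

println(s"$name, $number, $flag")
```

The compiler infers the type (String, Int, Boolean), so each unpacked value keeps its own static type.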

A tuple of two elements is special: it can be treated as a key-value pair.

val picardsShip = "picard" -> "Enterprise-D"
("picard","Enterprise-D" )

println(picardsShip._2)

scala> val picardsShip = "picard" -> "Enterprise-D"
picardsShip: (String, String) = (picard,Enterprise-D)

scala> ("picard","Enterprise-D" )
res20: (String, String) = (picard,Enterprise-D)

scala> println(picardsShip._2)
Enterprise-D

val picardsShipAnotherWay = ("picard","Enterprise-D" )
println(picardsShipAnotherWay._1)

scala> val picardsShipAnotherWay = ("picard","Enterprise-D" )
picardsShipAnotherWay: (String, String) = (picard,Enterprise-D)

scala> println(picardsShipAnotherWay._1)
picard

A collection of tuples can contain the same key more than once; a Map cannot have duplicate keys.
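
The difference can be seen by converting a list of pairs into a Map. This is only a sketch; the values "Stargazer", "sisko" and "Defiant" and the names assignments and byCaptain are made up for illustration.

```scala
// A list of pairs may repeat the key "picard".
val assignments = List(
  "picard" -> "Stargazer",
  "picard" -> "Enterprise-D",
  "sisko"  -> "Defiant"
)
println(assignments.length)   // 3

// Converting to a Map keeps one value per key; the later pair wins.
val byCaptain = assignments.toMap
println(byCaptain.size)       // 2
println(byCaptain("picard"))  // Enterprise-D
```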

Tuples are used heavily.

===============

val shipList = List("Enterprise", "Defiant")
println(shipList(1))

scala> val shipList = List("Enterprise", "Defiant")
shipList: List[String] = List(Enterprise, Defiant)

scala> println(shipList(1))
Defiant

A List uses a 0-based index.
All elements of a List have the same type, whereas a tuple can mix different types.

println(shipList.head)
println(shipList.tail) // everything except the first element

scala> println(shipList.head)
Enterprise

scala> println(shipList.tail)
List(Defiant)
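
A short sketch of how head and tail behave at the edges of a list. The empty list and the use of headOption are additions for illustration and are not part of the original notes.

```scala
val shipList = List("Enterprise", "Defiant")

// head is the first element, tail is the rest (itself a List).
println(shipList.head)        // Enterprise
println(shipList.tail)        // List(Defiant)
println(shipList.tail.tail)   // List()

// head on an empty list throws NoSuchElementException;
// headOption returns None instead.
val empty = List.empty[String]
println(empty.headOption)     // None
println(shipList.headOption)  // Some(Enterprise)
```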

================

// reverse the contents of each element
shipList.map((a :String) => {a.reverse})

scala> shipList.map((a :String) => {a.reverse})
res26: List[String] = List(esirpretnE, tnaifeD)

map in Scala and map in Spark behave the same way; the only difference is that Spark executes the code in a distributed manner.
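
For the plain Scala side, here is a sketch of the same map call written three ways. The names explicit, inferred, placeholder and lengths are illustrative.

```scala
val shipList = List("Enterprise", "Defiant")

// Three equivalent ways to pass the same function to map.
val explicit = shipList.map((a: String) => a.reverse)
val inferred = shipList.map(a => a.reverse)
val placeholder = shipList.map(_.reverse)

// map returns a new list; the original is unchanged.
val lengths = shipList.map(_.length)
println(lengths)   // List(10, 7)
println(shipList)  // List(Enterprise, Defiant)
```
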
// find the sum of a list
val numberList = List(1,2,3,4,5)
val sum = numberList.reduce((x:Int, y:Int) => x+y)
println(sum)

The function passed to reduce takes 2 parameters and combines 2 elements at a time. The same idea applies to reduce and reduceByKey in Spark.

scala> val numberList = List(1,2,3,4,5)
numberList: List[Int] = List(1, 2, 3, 4, 5)

scala> val sum = numberList.reduce((x:Int, y:Int) => x+y)
sum: Int = 15
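
To make the pairwise behaviour visible, the sketch below prints each step. It uses reduceLeft so that the left-to-right order is guaranteed; sum2, max and emptySum are illustrative names.

```scala
val numberList = List(1, 2, 3, 4, 5)

// Print each pairwise step; x is the running result, y the next element.
val sum = numberList.reduceLeft { (x, y) =>
  println(s"$x + $y = ${x + y}")
  x + y
}
// steps: 1 + 2 = 3, 3 + 3 = 6, 6 + 4 = 10, 10 + 5 = 15

// The same idea in shorter forms.
val sum2 = numberList.reduce(_ + _)
val max = numberList.reduce((x, y) => if (x > y) x else y)

// reduce throws on an empty list; fold takes a start value instead.
val emptySum = List.empty[Int].fold(0)(_ + _)
```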

=============================

filter

val iHateFives = numberList.filter(x=> x!=5)

scala> val iHateFives = numberList.filter(x=> x!=5)
iHateFives: List[Int] = List(1, 2, 3, 4)

val iHateThrees = numberList.filter(_ != 3)

scala> val iHateThrees = numberList.filter(_ != 3)
iHateThrees: List[Int] = List(1, 2, 4, 5)

Observe the syntax: both forms are correct.
The underscore _ is a placeholder for the function's argument, so _ != 3 means the same as x => x != 3.

val alltrues = numberList.filter(x=> true)

scala> val alltrues = numberList.filter(x=> true)
alltrues: List[Int] = List(1, 2, 3, 4, 5)
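
A brief sketch comparing the named-parameter and placeholder forms, plus two further predicates. The names named, short, evens and odds are illustrative.

```scala
val numberList = List(1, 2, 3, 4, 5)

// The placeholder form and the named-parameter form are the same function.
val named = numberList.filter(x => x != 3)
val short = numberList.filter(_ != 3)
println(named == short)  // true

// A predicate can be any Boolean expression of the element.
val evens = numberList.filter(_ % 2 == 0)
println(evens)           // List(2, 4)

// filterNot keeps the elements for which the predicate is false.
val odds = numberList.filterNot(_ % 2 == 0)
println(odds)            // List(1, 3, 5)
```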

============

What is Scala?
Scala is a hybrid programming language that supports both object-oriented and functional
programming. As data engineers we are more concerned with functional programming.

Why Scala?
It is a concise language with functional features. It also supports a scripting approach, so we get better
productivity. Spark is written in Scala. Data crunching can be done in Scala.

REPL vs IDE ?

What is functional programming?
Functional programming is a way of writing software using pure functions and immutable values.

Function: a function is a relationship between input and output.
Ex: the square root of 4 is 2.

What is a pure function?
It has 3 characteristics:
1. The function should depend only on its input parameters. No external variable should
affect the result.

def func(i:Int) = {
2*i
}

2. A pure function should not modify the value of its input parameter.
def func(i:Int) = {
2*i
}

3. There should not be any side effects.
The function should do only what it is intended to do.

def doubler(x:Int) = {
println("hello this is a double function")
2*x
}
Here even the println statement is a side effect, so this function is not pure.
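
The three characteristics can be contrasted side by side. This is only a sketch: factor, scaleImpure, doubleInPlace, doublerImpure and doubleAll are hypothetical names. An Array is used for the second case because an Int parameter cannot be reassigned in Scala, so mutating an input needs a mutable structure.

```scala
// Characteristic 1 violated: the result depends on an external variable.
var factor = 2
def scaleImpure(i: Int): Int = factor * i

// Characteristic 2 violated: the function mutates its input.
def doubleInPlace(xs: Array[Int]): Unit =
  for (k <- xs.indices) xs(k) = 2 * xs(k)

// Characteristic 3 violated: printing is a side effect.
def doublerImpure(x: Int): Int = {
  println("doubling")
  2 * x
}

// Pure: depends only on the input, changes nothing, does nothing else.
def doubler(x: Int): Int = 2 * x
def doubleAll(xs: Array[Int]): Array[Int] = xs.map(_ * 2)

val a = scaleImpure(3)  // 6
factor = 10
val b = scaleImpure(3)  // 30: same input, different result
```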

The purity of a function is tested using referential transparency.
Referential transparency:
A function is referentially transparent if every call to it can be replaced with its
output value and the program still gives the same result.
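
A minimal sketch of the substitution test. The counter calls and the function countingDoubler are hypothetical and exist only to show a case that fails the test.

```scala
def doubler(x: Int): Int = 2 * x

// Replacing the call doubler(4) with its value 8 changes nothing.
val withCall = doubler(4) + doubler(4)
val withValue = 8 + 8
println(withCall == withValue)  // true

// A function with a side effect fails the test.
var calls = 0
def countingDoubler(x: Int): Int = {
  calls += 1   // side effect: updates external state
  2 * x
}

val r = countingDoubler(4) + countingDoubler(4)
println(calls)  // 2; replacing the calls with 8 + 8 would leave calls at 0
```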

Sunday, November 10, 2019
