R Presentation

Jeffery Russell
3-29-19

History

  • Back in the day (1976) Bell Laboratories created the S statistical programming language
  • People were sad because it was exclusively licensed by AT&T
  • During the 1990s a group of people developed an S replacement called R, licensed under the GNU GPL

Why use R

  • Statistics and data analysis
  • Machine learning
  • Fast prototyping
  • Creating graphs
  • Writing research papers and reports
  • Creating presentations (like this one)

Embedding Code Output in a Document

summary(cars)
     speed           dist       
 Min.   : 4.0   Min.   :  2.00  
 1st Qu.:12.0   1st Qu.: 26.00  
 Median :15.0   Median : 36.00  
 Mean   :15.4   Mean   : 42.98  
 3rd Qu.:19.0   3rd Qu.: 56.00  
 Max.   :25.0   Max.   :120.00  

Embedding Graphs in a Document

plot(mtcars$wt, mtcars$mpg, main="Weight vs MPG", xlab = "weight", ylab="MPG")

[Plot: scatter plot of car weight vs MPG]

Syntax

  • R's syntax is C-like in its use of curly braces
  • Variables are similar to Python's: types are inferred from the assigned value rather than declared
  • The type system is rather weird: the base unit for everything is a vector, even a single integer
if(TRUE)
{
  print("Hello World")
}
[1] "Hello World"
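
The points about inferred types and vectors can be checked at the console. This is a minimal sketch using only base R; the variable names are illustrative and not from the original slides.

```r
# Types are inferred from the assigned value; nothing is declared.
x <- 5L          # integer literal
y <- "hello"     # character string
class(x)         # "integer"
class(y)         # "character"

# Even a single number is a vector of length one.
is.vector(x)     # TRUE
length(x)        # 1

# Arithmetic is vectorised: it applies element-wise.
v <- c(1, 2, 3)
doubled <- v * 2
doubled          # 2 4 6
```

Because scalars are just length-one vectors, the same functions work on one value or many without a loop.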

Syntax cont

ML Example pt: 1

plot of chunk unnamed-chunk-4

ML Example pt: 2

plot of chunk unnamed-chunk-5

Super Cool ML Example pt: 3

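library(sparklyr)
library(dplyr)

# Connect to a local Spark instance
sc <- spark_connect(master = "local")

# Copy the built-in iris data set into Spark
iris_tbl <- sdf_copy_to(sc, iris, name = "iris_tbl", overwrite = TRUE)

# Split into roughly 70% training and 30% test data
partitions <- iris_tbl %>%
  sdf_partition(training = 0.7, test = 0.3, seed = 1111)

iris_training <- partitions$training
iris_test <- partitions$test

# Fit a decision tree predicting Species from all other columns
dt_model <- iris_training %>%
  ml_decision_tree(Species ~ .)

# Predict on the held-out test set and score the predictions
pred <- ml_predict(dt_model, iris_test)

ml_multiclass_classification_evaluator(pred)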
[1] 0.9451737
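
The train/test partition idea can also be sketched locally in base R, without a Spark connection. This illustrates the same 70/30 split and is not what `sdf_partition` does internally. The names `train_idx`, `iris_train_local`, and `iris_test_local` are hypothetical, and this version fixes the training size exactly.

```r
# Reproducible random split of the built-in iris data (150 rows).
set.seed(1111)
n <- nrow(iris)
train_idx <- sample(n, size = round(0.7 * n))   # 105 distinct row indices

# Rows drawn go to training; every remaining row goes to test.
iris_train_local <- iris[train_idx, ]
iris_test_local  <- iris[-train_idx, ]

nrow(iris_train_local)   # 105
nrow(iris_test_local)    # 45
```

Setting the seed makes the split repeatable, which plays the same role as the `seed` argument in the Spark example.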

Resources

Demo/Questions?