diff --git a/blogContent/posts/data-science/html/RPresentation.html b/blogContent/posts/data-science/html/RPresentation.html new file mode 100644 index 0000000..1abfcb0 --- /dev/null +++ b/blogContent/posts/data-science/html/RPresentation.html @@ -0,0 +1,1056 @@ + + + + + + R Presentation + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+

R Presentation

Jeffery Russell
3-29-19

+ +
+

Sign in: bit.ly/ritlug-mar-29

+ +

banner

+ +
+ +
+
+

History

+
+
    +
  • Back in the day (1976) Bell Laboratories created the S statistical programming language
  • +
  • People were sad because it was exclusively licensed by AT&T
  • +
  • During the '90s, a group of people developed an S replacement called R and licensed it under the GNU GPL
  • +
+ +

+ +
+ +
+
+

Why use R

+
+
    +
  • Statistics and data analysis
  • +
  • Machine learning
  • +
  • Fast prototyping
  • +
  • Creating graphs
  • +
  • Writing research papers and reports
  • +
  • Creating presentations (like this one)
  • +
+ +
+ +
+
+

Embedding Code Output in a Document

+
+
summary(cars)
+
+ +
     speed           dist       
+ Min.   : 4.0   Min.   :  2.00  
+ 1st Qu.:12.0   1st Qu.: 26.00  
+ Median :15.0   Median : 36.00  
+ Mean   :15.4   Mean   : 42.98  
+ 3rd Qu.:19.0   3rd Qu.: 56.00  
+ Max.   :25.0   Max.   :120.00  
+
+ +
+ +
+
+

Embedding Graphs in a Document

+
+
plot(mtcars$wt, mtcars$mpg, main = "Weight vs MPG", xlab = "weight", ylab = "MPG")
+
+ +

plot of chunk unnamed-chunk-2

+ +
+ +
+
+

Syntax

+
+
    +
  • R's syntax is C-esque in its use of curly braces
  • +
  • Variables work much like Python's: R infers the data type from the value you assign
  • +
  • The type system is rather weird: the base unit for everything is a vector, even a single integer
  • +
+ +
if(TRUE)
+{
+  print("Hello World")
+}
+
+ +
[1] "Hello World"
+
+ +
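The bullets above can be sketched in a few lines of base R. This is only an illustration: the variable names are made up for the example and do not come from the slides. It shows that a lone number is already a length-one vector and that types are inferred on assignment.

```r
# A "single" value is just a vector of length one.
x <- 5L
is.vector(x)      # TRUE
length(x)         # 1

# Types are inferred from the assigned value; no declarations needed.
y <- 3.14         # stored as a double
z <- "hello"      # stored as a character vector
class(x); class(y); class(z)

# Because everything is a vector, arithmetic is element-wise.
v <- c(1, 2, 3)
doubled <- v * 2  # each element is multiplied by 2
doubled
```

Since scalars are vectors, functions written for one value usually work unchanged on many values.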
+ +
+
+

Syntax cont.

+
+

+ +
+ +
+
+

ML Example pt. 1

+
+

plot of chunk unnamed-chunk-4

+ +
+ +
+
+

ML Example pt. 2

+
+

plot of chunk unnamed-chunk-5

+ +
+ +
+
+

Super Cool ML Example pt. 3

+
+
library(sparklyr)  # Spark interface used by every call below
+
+sc <- spark_connect(master = "local")
+
+iris_tbl <- sdf_copy_to(sc, iris, name = "iris_tbl", overwrite = TRUE)
+
+partitions <- iris_tbl %>%
+  sdf_partition(training = 0.7, test = 0.3, seed = 1111)
+
+iris_training <- partitions$training
+iris_test <- partitions$test
+
+dt_model <- iris_training %>%
+  ml_decision_tree(Species ~ .)
+
+pred <- ml_predict(dt_model, iris_test)
+
+ml_multiclass_classification_evaluator(pred)
+
+ +
[1] 0.9451737
+
+ +
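For readers without a Spark installation, the same split, fit, predict, evaluate workflow can be sketched locally. This is a rough analogue under stated assumptions, not the model from the slide: `rpart` stands in for `ml_decision_tree`, the split is an exact 70/30 row sample rather than Spark's random partition, and the score is plain accuracy, so the number it prints need not match the evaluator output above. The names `iris_train`, `iris_holdout`, and `tree_fit` are illustrative.

```r
# Local sketch of the workflow without Spark.
# Assumption: rpart replaces ml_decision_tree; this is not the slide's model.
library(rpart)

set.seed(1111)                                  # reproducible split
n <- nrow(iris)
train_idx <- sample(n, size = round(0.7 * n))   # 70% of rows for training
iris_train <- iris[train_idx, ]
iris_holdout <- iris[-train_idx, ]              # remaining 30% for testing

# Fit a classification tree on all four measurements.
tree_fit <- rpart(Species ~ ., data = iris_train, method = "class")

# Predict a class for each held-out row and compute plain accuracy.
predicted <- predict(tree_fit, iris_holdout, type = "class")
accuracy <- mean(predicted == iris_holdout$Species)
accuracy
```

Running both versions side by side is a quick way to see what Spark adds: the code shape is nearly identical, but the Spark version can scale past a single machine's memory.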
+ +
+
+

Resources

+ + +
+
+

Demo/Questions?

+
+

+ +
+ +
+ + +
+
+ + + + + + + + + +