Re-formatted some blog posts using custom paragraph formatter which limits col to 70 characters.

pull/77/head
jrtechs 6 years ago
parent
commit
60356eb059
6 changed files with 406 additions and 302 deletions
  1. blogContent/posts/data-science/is-using-ml-for-antivirus-safe.md (+82, -44)
  2. blogContent/posts/data-science/lets-build-a-genetic-algorithm.md (+128, -89)
  3. blogContent/posts/data-science/r-programming-language.md (+35, -22)
  4. blogContent/posts/open-source/the-essential-vim-configuration.md (+64, -75)
  5. blogContent/posts/other/2018-in-review.md (+10, -7)
  6. blogContent/posts/other/morality-of-self-driving-cars.md (+87, -65)

blogContent/posts/data-science/is-using-ml-for-antivirus-safe.md (+82, -44)

@ -1,28 +1,41 @@
In this blog post I examine the ways in which antivirus programs
currently employ machine learning and then go into the security
vulnerabilities that it brings.

# ML in the Antivirus Industry

Malware detection falls into two broad categories: static and dynamic
analysis. Static analysis examines the program without actually
running the code. It looks at things like file fingerprints, hashes,
reverse engineering, memory artifacts, packer detection, and
debugging. Static analysis is largely known for looking up the hash of
a file against a database of known viruses. It is super easy to fool
signature-based malware detection using simple obfuscation methods.
Dynamic analysis is a technique where you run the program in a sandbox
and monitor all the actions that it takes. If you notice that the
program is acting suspiciously, it is likely a virus. Suspicious
behavior typically includes things like registry edits and API calls
to bad host names.

Antivirus detection is very difficult, but probably not for the
reasons you think. The issue isn't writing programs which can detect
these static or dynamic properties of viruses -- that is the easy
part. It is also relatively easy to determine a general rule set for
what makes a program dangerous. You can also easily blacklist
suspicious domains, block malicious activity, and implement a
signature-based malware detection program.

The real problem is that there are hundreds of thousands of malware
applications and more are created every day. Not only are there tons
of pesky malware applications, there is also an absurd number of
normal programs which we don't want antivirus applications to block.
It is impossible for a small team of malware researchers to create a
definitive set of heuristics which can correctly identify all malware
programs. This is where we turn to the field of machine learning.
Humans are very bad with big data, but computers love big data. Most
antivirus companies use machine learning, and it has been a large
success so far because it has allowed us to dramatically improve our
ability to detect zero-day viruses.
## Interesting Examples
@ -39,50 +52,75 @@ Anything which is not a normal program, it alerts you about since it can be a vi
### Kaspersky

Kaspersky appears to have done a ton of research into using machine
learning for malware detection. I would highly recommend that you read
their [white
paper](https://media.kaspersky.com/en/enterprise-security/Kaspersky-Lab-Whitepaper-Machine-Learning.pdf)
on this subject.
# Why is this a problem?

It turns out that machine learning systems can be easily fooled by
using other machine learning algorithms. A classic example of this is
image classification. It is easy to use neural networks or genetic
algorithms to generate examples which fool the machine learning
application by learning the weights of the application and then
making slight tweaks to your input to produce a false classification.

![](media/AISaftey/AdversarialExample.png)

Since virus generation is a non-differentiable problem, people often
use genetic algorithms for the adversarial network to fool the
antivirus. In other words, you don't want to attempt to calculate the
derivative between two versions of a virus for gradient descent. Since
viruses are high dimensional problems, it turns out that most
calculus-based implementations would actually be inefficient at
traversing the search space to find the global minimum. If you want to
learn more about genetic algorithms, check out my [recent blog
post](https://jrtechs.net/data-science/lets-build-a-genetic-algorithm)
on it.
# Fooling Antivirus Software

## Genetic Algorithms

There are two major approaches which people have used to generate
antivirus resistant malware with genetic algorithms. The first
approach is to slowly make polymorphic changes to the virus in order
to fool the malware detection. One of the interesting things about
this approach is that you have to have some way of verifying that the
polymorphic changes that you apply to the virus don't break its
"virus capabilities".

Another approach is to represent a virus as a set of properties.
These properties are everything from the port of attack, the payloads,
obfuscation parameters, etc. The genetic algorithm would simply tweak
the properties of the virus until it found a configuration which
evaded the antivirus program.
## Reinforcement Learning

A research group at [Endgame](https://www.endgame.com/) recently gave
a [Def Con](https://www.defcon.org/) talk where they presented a
framework which uses reinforcement learning to evade static virus
detection.

![Reinforcement Learning Diagram](media/AISaftey/Reinforcement_learning_diagram.png)

At a high level, the AI plays a "game" against the antivirus where the
agent can make functionality-preserving mutations to the virus. The
reward for the agent is its ability to not get detected by the
antivirus. Over time the AI will learn which types of actions result
in getting detected by the antivirus. This framework can be found on
[GitHub](https://github.com/endgameinc/gym-malware).
# Takeaways

Machine learning is great, but it needs to be properly defended. As
we start to use machine learning more and more, a large portion of the
cyber security field may shift its focus away from securing systems to
securing big data applications.

# Resources

blogContent/posts/data-science/lets-build-a-genetic-algorithm.md (+128, -89)

@ -4,36 +4,49 @@
# Background and Theory

Since you stumbled upon this article, you might be wondering what the
heck genetic algorithms are. To put it simply: genetic algorithms
employ the same tactics used in natural selection to find an optimal
solution to an optimization problem. Genetic algorithms are often used
in high dimensional problems where the optimal solutions are not
apparent. Genetic algorithms are commonly used to tune the
[hyper-parameters](https://en.wikipedia.org/wiki/Hyperparameter) of a
program. However, this algorithm can be used in any scenario where you
have a function which defines how good a solution is. Many people have
used genetic algorithms in video games to automatically learn the
weaknesses of players.

The beautiful part about genetic algorithms is their simplicity; you
need absolutely no knowledge of linear algebra or calculus. To
implement a genetic algorithm from scratch you only need **very
basic** algebra and a general grasp of evolution.
# Genetic Algorithm

All genetic algorithms typically have a single cycle where you
continuously mutate, breed, and select the most optimal solutions. I
will dive into each section of this algorithm using simple JavaScript
code snippets. The algorithm which I present is very generic and
modular so it should be easy to port into other programming languages
and applications.

![Genetic Algorithms Flow Chart](media/GA/GAFlowChart.svg)
## Population Creation

The very first thing we need to do is specify a data structure for
storing our genetic information. In biology, chromosomes are composed
of sequences of genes. Many people run genetic algorithms on binary
arrays since they more closely represent DNA. However, as computer
scientists, it is often easier to model problems using continuous
numbers. In this approach, every gene will be a single floating point
number ranging between zero and one. Every type of gene will have a
max and min value which represent the absolute extremes of that gene.
This works well for optimization because it allows us to easily limit
our search space. For example, we can specify that a "height" gene can
only vary between 0 and 90. To get the actual value of the gene from
its \[0-1] value we simply de-normalize it.
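
The de-normalization step described above can be sketched as a pair of
small functions. This is a minimal illustration rather than the code
from the post's `Gene` class: the names `denormalize` and `normalize`
are assumptions made for this example.

```javascript
// Hypothetical helper names; the post's own Gene class is not shown here.
// Map a normalized gene value in [0, 1] onto its real range [low, high].
const denormalize = function(norm, low, high)
{
    return (high - low) * norm + low;
};

// Inverse mapping, useful when seeding a gene from a known real value.
const normalize = function(real, low, high)
{
    return (real - low) / (high - low);
};

// A "height" gene limited to the range 0 to 90.
console.log(denormalize(0.5, 0, 90)); // 45
console.log(normalize(45, 0, 90));    // 0.5
```

Keeping every gene in the same \[0-1] range means breeding and mutation
code never needs to know the real units of a gene.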
$$
g_{real value} = (g_{high}- g_{low})g_{norm} + g_{low}
@ -87,17 +100,22 @@ class Gene
```

Now that we have genes, we can create chromosomes. Chromosomes are
simply collections of genes. Whatever language you make this in, make
sure that when you create a new chromosome it has a [deep
copy](https://en.wikipedia.org/wiki/Object_copying) of the original
genetic information rather than a shallow copy. A shallow copy is when
you simply copy the object pointer, whereas a deep copy actually
creates a new object. If you fail to do a deep copy, you will have
weird issues where multiple chromosomes share the same DNA.
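
To make the difference concrete, here is a small sketch of the shared
DNA problem. The plain objects below are illustrative stand-ins and are
not the post's `Gene` or `Chromosome` classes.

```javascript
// Illustrative objects only; these are not the post's Gene/Chromosome classes.
const original = { genes: [{ value: 0.2 }, { value: 0.8 }] };

// Shallow copy: a new outer object, but the same gene objects inside.
const shallow = { genes: original.genes };

// Deep copy: every gene is rebuilt as a new object.
const deep = { genes: original.genes.map(g => ({ value: g.value })) };

// Mutating the original leaks into the shallow copy but not the deep copy.
original.genes[0].value = 0.99;

console.log(shallow.genes[0].value); // 0.99
console.log(deep.genes[0].value);    // 0.2
```
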

In this class I added helper functions to clone the chromosome as a
random copy. You can only create a new chromosome by cloning because
I wanted to keep the program generic and make no assumptions about the
domain. Since you only provide the min/max information for the genes
once, cloning an existing chromosome is the easiest way of ensuring
that all corresponding chromosomes contain genes with identical
extrema.

```javascript
@ -148,7 +166,8 @@ class Chromosome
}
```

Creating a random population is pretty straightforward if you have
implemented a method to create a random clone of a chromosome.

```javascript
/**
@ -170,16 +189,17 @@ const createRandomPopulation = function(geneticChromosome, populationSize)
};
```

This is where nearly all the domain information is introduced. After
you define what types of genes are found on each chromosome, you can
create an entire population. In this example all genes contain values
ranging between one and ten.

```javascript
let gene1 = new Gene(1,10,10);
let gene2 = new Gene(1,10,0.4);
let geneList = [gene1, gene2];

let exampleOrganism = new Chromosome(geneList);

// Pass the chromosome defined above as the template for the population.
let population = createRandomPopulation(exampleOrganism, 100);
```
@ -187,10 +207,14 @@ let population = createRandomPopulation(genericChromosome, 100);
## Evaluate Fitness

Like all optimization problems, you need a way to evaluate the
performance of a particular solution. The cost function takes in a
chromosome and evaluates how close it got to the ideal solution. This
particular example just computes the [Manhattan
Distance](https://en.wiktionary.org/wiki/Manhattan_distance) to a
random 2D point. I chose two dimensions because it is easy to graph;
however, real applications may have dozens of genes on each
chromosome.

```javascript
let costx = Math.random() * 10;
@ -209,9 +233,11 @@ const basicCostFunction = function(chromosome)
## Selection

Selecting the best performing chromosomes is straightforward after you
have a function for evaluating the performance. This code snippet also
computes the average and best chromosome of the population to make it
easier to graph and define the stopping point for the algorithm's main
loop.

```javascript
/**
@ -249,10 +275,12 @@ const naturalSelection = function(population, keepNumber, fitnessFunction)
};
```

You might be wondering how I sorted a list of objects rather than a
numerical array. I used the following function as a comparator for
JavaScript's built-in sort function. This comparator compares objects
based on a specific attribute that you give it. This is a very handy
function to include in all of your JavaScript projects for easy
sorting.

```javascript
/**
@ -281,15 +309,16 @@ function predicateBy(prop)
## Reproduction

The process of reproduction can be broken down into Pairing and
Mating.

### Pairing

Pairing is the process of selecting mates to produce offspring. A
typical approach will separate the population into two segments of
mothers and fathers. You then randomly pick pairs of mothers and
fathers to produce offspring. It is ok if one chromosome mates more
than once. It is just important that you keep this process random.

```javascript
/**
@ -317,22 +346,26 @@ const matePopulation = function(population, desiredPopulationSize)
### Mating

Mating is the actual act of forming new chromosomes/organisms based on
your previously selected pairs. From my research, there are two major
forms of mating: blending and crossover.

Blending is typically the preferred approach to mating when dealing
with continuous variables. In this approach you combine the genes of
both parents based on a random factor.

$$
c_{new} = r * c_{mother} + (1-r) * c_{father}
$$

The second offspring simply uses (1-r) for its random factor to
adjust the chromosomes.

Crossover is the simplest approach to mating. In this process you
clone the parents and then you randomly swap *n* of their genes. This
works fine in some scenarios; however, it severely limits genetic
diversity because you have to rely solely on mutations to introduce
new gene values.

```javascript
/**
@ -373,14 +406,16 @@ const blendGene = function(gene1, gene2, blendCoef)
## Mutation

Mutations are random changes to an organism's DNA. In the scope of
genetic algorithms, they help our population converge on the correct
solution.

You can either adjust genes by a factor, resulting in a smaller
change, or you can change the value of the gene to be something
completely random. Since we are using the blending technique for
reproduction, we already have small incremental changes. I prefer to
use mutations to randomly change the entire gene since it helps
prevent the algorithm from settling on a local minimum rather than the
global minimum.
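
The two mutation styles mentioned above can be contrasted with a short
sketch. Both helpers are hypothetical names written for this example,
and they assume a gene is stored as a normalized value between zero
and one.

```javascript
// Hypothetical helpers operating on a normalized gene value in [0, 1].
// rand is injectable so the behaviour can be reproduced in a test.

// Style 1: nudge the gene by a small amount and clamp it to the range.
const mutateByFactor = function(value, maxStep, rand = Math.random)
{
    const step = (rand() * 2 - 1) * maxStep; // in [-maxStep, maxStep)
    return Math.min(1, Math.max(0, value + step));
};

// Style 2: throw the old value away and draw a completely new one.
const mutateByReset = function(value, rand = Math.random)
{
    return rand();
};

console.log(mutateByFactor(0.5, 0.1)); // stays close to 0.5
console.log(mutateByReset(0.5));       // anywhere in [0, 1)
```
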

```javascript
@ -408,11 +443,13 @@ const mutatePopulation = function(population, mutatePercentage)
## Immigration

Immigration or "new blood" is the process of dumping random organisms
into your population at each generation. This prevents us from getting
stuck in a local minimum rather than the global minimum. There are
more advanced techniques to accomplish this same concept. My favorite
approach (not implemented here) is raising **x** populations
simultaneously and every **y** generations you take **z** organisms
from each population and move them to another population.
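
Since the multi-population idea is not implemented in the post, here is
one possible sketch of its migration step. Everything in it is an
assumption made for illustration: the name `migrate`, the ring-shaped
migration order, and the choice to move the first **z** organisms of
each population.

```javascript
// Hypothetical sketch of the multi-population ("island") idea.
// Each island is an array of organisms; organisms can be any value here.
// Every call moves z organisms from each island to the next one in a ring.
const migrate = function(islands, z)
{
    // Take the emigrants out first so nobody migrates twice in one round.
    const emigrants = islands.map(island => island.splice(0, z));

    emigrants.forEach((group, i) =>
    {
        const destination = islands[(i + 1) % islands.length];
        destination.push(...group);
    });
    return islands;
};

// x = 3 islands, z = 1 organism moved per island.
const islands = [["a1", "a2"], ["b1", "b2"], ["c1", "c2"]];
migrate(islands, 1);
console.log(islands);
// [["a2", "c1"], ["b2", "a1"], ["c2", "b1"]]
```

A main loop would call `migrate` once every **y** generations and evolve
each island independently in between.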

```javascript
/**
@ -432,7 +469,8 @@ const newBlood = function(population, immigrationSize)
## Putting It All Together

Now that we have all the ingredients for a genetic algorithm we can
piece it together in a simple loop.

```javascript
/**
@ -487,11 +525,14 @@ const runGeneticOptimization = function(geneticChromosome, costFunction,
## Running

Running the program is pretty straightforward after you have your
genes and cost function defined. You might be wondering if there is an
optimal configuration of parameters to use with this algorithm. The
answer is that it varies based on the particular problem. Problems
like the one graphed by this website perform very well with a low
mutation rate and a high population. However, some higher dimensional
problems won't even converge on a local answer if you set your
mutation rate too low.

```javascript
let gene1 = new Gene(1,10,10);
@ -499,17 +540,15 @@ let gene1 = new Gene(1,10,10);
let geneN = new Gene(1,10,0.4);
let geneList = [gene1,..., geneN];

let exampleOrganism = new Chromosome(geneList);

const costFunction = function(chromosome)
{
    var d =...;
    //compute cost
    return d;
}

runGeneticOptimization(exampleOrganism, costFunction, 100, 50, 0.01, 0.3, 20, 10);
```

The complete code for the genetic algorithm and the fancy JavaScript
graphs can be found in my [Random Scripts GitHub
Repository](https://github.com/jrtechs/RandomScripts). In the future I
may package this into an [npm](https://www.npmjs.com/) package.

blogContent/posts/data-science/r-programming-language.md (+35, -22)

@ -1,39 +1,52 @@
R is a programming language designed for statistical analysis and
graphics. Since R has been around since 1992, it has developed a large
community and has over [13 thousand
packages](https://cran.r-project.org/web/packages/) publicly
available. What is really cool about R is that it is an open source
[GNU](http://www.gnu.org/) project.

# R Syntax and Paradigms

The syntax of R is C-esque with its use of curly braces. The type
system of R is similar to Python's in that it can infer what type you
are using. This "lazy" type system allows for "faster" development
since you don't have to worry about declaring types -- although this
laziness makes it harder to debug and read your code. The type system
of R is rather strange and distinctly different from most other
languages. For starters, integers are represented as vectors of length
1. These things may feel weird at first, but R's type system is one of
the things that make it a great tool for manipulating data.

![R Arrays Start at 1](media/r/arrays.jpg)
Did I mention that arrays start at 1? Did I mention that arrays start at 1? Technically, the thing which we
Technically, the thing which we refer to as an array in Java are really vectors in R. refer to as an array in Java are really vectors in R. Arrays in R are
Arrays in R are data objects which can store data in more than two dimensions. data objects which can store data in more than two dimensions. Since R
Since R tries to follow mathematical notation, indexing starts at 1 -- just like in linear algebra. tries to follow mathematical notation, indexing starts at 1 -- just
Using zero based indexing makes sense for languages like C because the index is used to get at a particular memory location from a pointer. like in linear algebra. Using zero based indexing makes sense for
languages like C because the index is used to get at a particular
memory location from a pointer.
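
The indexing behavior described above can be sketched as follows. This
is only a small example with made-up values, not anything taken from
the original post.

```r
v <- c(10, 20, 30)  # a vector: the closest thing to a Java array

first <- v[1]          # 10: the first element lives at index 1
last  <- v[length(v)]  # 30: the last element
none  <- v[0]          # numeric(0): index 0 selects nothing

# An array carries dimensions and can have more than two of them.
a <- array(1:24, dim = c(2, 3, 4))
dim(a)              # 2 3 4
a[1, 1, 1]          # 1
a[2, 3, 4]          # 24
```

Note that `v[0]` does not raise an error; it quietly returns an empty
vector, which is worth keeping in mind when porting zero-based code.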
<youtube src="s3FozVfd7q4" />

I don't have the time to go over the basic syntax of R in a single
blog post; however, I feel that this YouTube video does a pretty good
job.
# R Markdown

One of my favorite aspects of R is its markdown language called Rmd.
Rmd is essentially markdown which can have embedded R scripts run in
it. The Rmd file is compiled down to a markdown file which is
converted to either a PDF, HTML file, or a slide show using pandoc.
You can provide options for the pandoc render using a YAML header in
the Rmd file. This is an amazing tool for creating reports and writing
research papers. The documents which you create are reproducible since
you can share the source code for them. If the data which you are
using changes, you simply have to recompile the document to get an
updated view. You no longer have to re-generate a dozen graphs and
update figures and statistics across your document.
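
To make the workflow a little more concrete, here is a minimal sketch
that writes a tiny Rmd file and compiles it. The file name, title, and
contents are hypothetical, and the render step assumes that the
`rmarkdown` package and pandoc are installed, so it is guarded and
skipped otherwise.

```r
# Hypothetical document: a YAML header followed by one R code chunk.
rmd_lines <- c(
  "---",
  "title: \"Example report\"",
  "output: html_document",
  "---",
  "",
  "```{r}",
  "summary(cars)",
  "```"
)

rmd_path <- file.path(tempdir(), "example.Rmd")
writeLines(rmd_lines, rmd_path)

# Only render when the tooling is available (an assumption about your setup).
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  rmarkdown::render(rmd_path, quiet = TRUE)
}
```

Re-running the render call after the underlying data changes is what
regenerates every figure and statistic in one step.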
# Resources

+ 64
- 75
blogContent/posts/open-source/the-essential-vim-configuration.md View File

@ -1,31 +1,27 @@
# Vim Configuration

Stock Vim is pretty boring. The good news is that Vim has a very
comprehensive configuration file which allows you to tweak it to your
heart's content. To make changes to Vim you simply modify the ~/.vimrc
file in your home directory. By adding simple commands to this file
you can easily change the way your text editor looks. Neat.

I attempted to create the smallest Vim configuration file which makes
Vim usable enough for me to use as my daily text editor. I believe
that it is important for everyone to know what their Vim configuration
does. This knowledge will help ensure that you are only adding the
things you want and that you can later customize it for your workflow.

Although it may be tempting to download somebody else's massive Vim
configuration, I argue that this can lead to problems down the road.

I want to mention that I don't use Vim as my primary IDE; I only use
Vim as a text editor. I tend to use JetBrains tools on larger projects
since they have amazing autocomplete functionality, build tools, and
comprehensive error detection. There are great Vim configurations out
there on the internet; however, most tend to be a bit overkill for
what most people want to do.

Alright, let's dive into my Vim configuration!

# Spell Check
@ -35,34 +31,32 @@ autocmd BufRead,BufNewFile *.md setlocal spell spelllang=en_us
autocmd BufRead,BufNewFile *.txt setlocal spell spelllang=en_us
```

Since I am often an atrocious speller, having basic spell check
abilities in Vim is a lifesaver. It does not make sense to have spell
check enabled for most files since it would light up most programming
files like a Christmas tree. I have my Vim configuration set to
automatically enable spell check for markdown files and basic text
files. If you need spell check in other files, you can enter the
command ":set spell" to enable spell check for that file. To see the
spelling recommendations, type "z=" when you are over a highlighted
word.

# Appearance

Adding colors to Vim is fun. The "syntax enable" command tells Vim to
highlight keywords in programming files and other structured files.

```vim
syntax enable
```

I would encourage everyone to look at the different color schemes
available for Vim. I threw the color scheme command in a try-catch
block to ensure that it does not crash Vim if you don't have the color
scheme installed. By default the desert color scheme is installed;
however, that is not always the case for [community
created](http://vimcolors.com/) Vim color schemes.

```vim
try
@ -70,13 +64,12 @@ try
catch
endtry

set background=dark
```

# Indentation and Tabs

Having your indentation settings squared away will save you a ton of
time if you are doing any programming in Vim.

```vim
"copy indentation from current line when making a new line
@ -84,28 +77,25 @@ set autoindent
" Smart indentation when programming: indent after { " Smart indentation when programming: indent after {
set smartindent

set tabstop=4     " number of spaces per tab
set expandtab     " convert tabs to spaces
set shiftwidth=4  " set a tab press equal to 4 spaces
```

# Useful UI Tweaks

These are three UI tweaks that I find really useful to have; some
people may have different opinions on these. Seeing line numbers is
useful since error messages typically just tell you the line on which
your program went up in flames. The cursor line is useful since it
allows you to easily find your place in the file -- this may be a bit
too much for some people.

I like to keep every line under 80 characters long for technical
files, so having a visual cue for this is helpful. Some people prefer
to just use the auto word wrap and keep their lines as long as they
like. I like to keep to the 80 character limit and explicitly choose
where I cut each line. Some of my university classes mandate the 80
character limit and take points off if you don't follow it.

```vim
" Set Line Numbers to show "
@ -121,7 +111,7 @@ set colorcolumn=80
# Searching and Auto Complete

These configurations make searching in Vim less painful.

```vim
" search as characters are entered "
@ -133,8 +123,8 @@ set hlsearch
set ignorecase
```

These configurations will make command completion easier by showing
an auto-complete menu when you press tab.

```vim
" Shows an auto complete menu when you are typing a command "
@ -147,11 +137,10 @@ set wildignorecase " ignore case for auto complete
# Useful Things to Have

There is nothing too earth shattering in this section, just things
that might save you some time. Enabling mouse support is a really
interesting configuration. When enabled, this allows you to select
text and jump between different locations with your mouse.

```vim
" Enables mouse support "
@ -170,7 +159,7 @@ set autoread
set lazyredraw
```

Setting your file format is always a good idea for compatibility.

```vim
" Set utf8 as standard encoding and en_US as the standard language "
@ -183,6 +172,6 @@ set ffs=unix,dos,mac
# Wrapping it up

I hope that this quick blog post inspired you to maintain your own Vim
configuration file. You can find my current configuration files in my
[random scripts
repository](https://github.com/jrtechs/RandomScripts/tree/master/config).

+ 10
- 7
blogContent/posts/other/2018-in-review.md View File

@ -1,7 +1,10 @@
Inspired by [Justin Flory](https://justinwflory.com/) and [Dan
Schneiderman](http://www.schneidy.com), I decided to make a 2018
review post. I believe that it would be a good way to reflect upon
what I did in 2018 and make plans for 2019. This post will be a very
high level overview of the projects and activities that I did in 2018
-- nothing personal. A picture is worth a thousand words, so I will
include a lot.

# January:
@ -11,7 +14,7 @@ activities that I did in 2018 -- nothing personal. Pictures say a thousand words
**Started Second Semester of College**

Classes:

- Mechanics of Programming
- Statistics
@ -92,9 +95,9 @@ Classes:
**Second Year of College**

First year on the Eboard of RITlug as Vice President.

Classes:

- Linear Algebra
- Analysis Of Algorithms

+ 87
- 65
blogContent/posts/other/morality-of-self-driving-cars.md View File

@ -1,84 +1,106 @@
<youtube src="_MFGx8d1zl0" />

Although the movie *I, Robot* has not aged well, it still brings up
some interesting ethical questions that we are still discussing
concerning self-driving cars. The protagonist, Detective Spooner, has
an almost unhealthy amount of distrust towards robots. In the movie, a
robot decided to save Spooner's life over a 12 year old girl's in a
car accident. This ignites the famous ethical debate of the trolley
problem, but now with artificial intelligence. The debate boils down
to this: are machines capable of making moral decisions? The surface
level answer from the movie is presented as **no** when Spooner
presents his car crash anecdote. This question parallels the
discussion that we are currently having with self-driving cars. When a
self-driving car is presented with two options which both result in
the loss of life, what should it choose?
<iframe width="100%" height="315" src="https://www.youtube.com/embed/ixIoDYVfKA0" frameborder="0" allow="autoplay; encrypted-media" allowfullscreen></iframe>
When surveyed, most people say that they would prefer to have
self-driving cars take the utilitarian approach towards the trolley
problem. A utilitarian approach would try to minimize the total
amount of harm. MIT made a neat
[website](http://moralmachine.mit.edu/) which presents you with a
bunch of "trolley problems" where you have to decide who dies. At the
end of the survey the website presents you with a list of the
preferences it observed when you decided whose life was more
important to save. The purpose of the trolley problem is merely to
ponder what decision a self-driving car should make when **all** of
its alternatives are depleted.

![Moral Machine](media/selfDrivingCars/moralmachine3.png)
We still need to question whether utilitarianism is the right moral
engine for self-driving cars. Would it be ethical for a car to take
into account your age, race, gender, and social status when deciding
if you get to live? If self-driving cars could access personal
information such as criminal history or known friends, would it be
ethical to use that information? Would it be moral for someone to make
a car which favored the safety of the passengers of the car above
others?

![Moral Machine](media/selfDrivingCars/moralMachine.png)
Even though most people want self-driving cars to use utilitarianism,
most people surveyed also responded that they would not buy a car
which did not have their safety as its top priority. This brings up a
serious social dilemma. If people want everyone else's cars to be
utilitarian, yet have their own cars be greedy and favor their
safety, we would see none of the utilitarian improvements. This
presents us with the tragedy of the commons problem since everyone
would favor their own safety and nobody would sacrifice their safety
for the public good. This brings up yet another question: would it be
fair to ask someone to sacrifice their safety in this way?

In most cases, when a tragedy of the commons situation is presented,
government intervention is the most practical solution. It might be
best to have the government mandate that all cars try to maximize the
amount of life saved when a car is presented with the trolley
problem. Despite appearing to be a good solution, the flaw in this
does not become apparent until you use consequentialism to examine
this problem.

![Moral Machine](media/selfDrivingCars/moralMachine6.png)
Self-driving cars are expected to reduce car accidents by 90% by
eliminating human error. If people decide to not use self-driving cars
due to the utilitarian moral engine, we run the risk of actually
losing more lives. Some people have actually argued that since
artificial intelligence is incapable of making moral decisions, it
should take no action at all when there is a situation which will
always result in the loss of life. In the frame of the trolley
problem, it is best for the artificial intelligence to not pull the
lever. I will argue that it is best for self-driving cars to not make
ethical decisions because it would result in the highest adoption
rate of self-driving cars. This would end up saving the most lives in
the long run. Plus, the likelihood that a car is actually presented
with a trolley problem is pretty slim.

The discussion over the moral decisions a car has to make is almost
fruitless. It turns out that humans are not even good at making moral
decisions in emergency situations. When we make rash decisions
influenced by anxiety, we are heavily influenced by prejudices and
selfish motives. Despite our own shortcomings when it comes to
decision making, that does not mean that we cannot do better with
self-driving cars. However, we need to realize that it is the mass
adoption of self-driving cars which will save the most lives, not the
moral engine which we program the cars with. We cannot let the moral
engine of the self-driving cars get in the way of adoption.

The conclusion that I made parallels Spooner's problem with robots in
the movie *I, Robot*. Spooner was so mad at the robots for saving his
own life rather than the girl's that he never realized that if it was
not for the robots, neither of them would have survived that car
crash. Does that mean we can't do better than not pulling the lever?
Well... not exactly. Near the end of the movie a robot was presented
with another trolley problem, but this time he managed to find a way
which saved both parties. Without reading into this movie too deeply,
this illustrates how the early adoption of artificial intelligence
ended up saving tons of lives like Spooner's. It is only when the
technology fully develops that we can start to avoid the trolley
problem completely.
