diff --git a/blogContent/posts/data-science/csci-331-final-review.md b/blogContent/posts/data-science/csci-331-final-review.md
new file mode 100644
index 0000000..58ca7ea
--- /dev/null
+++ b/blogContent/posts/data-science/csci-331-final-review.md
@@ -0,0 +1,170 @@
+Quick review sheet for Dr. Homan's RIT CSCI-331 final.
+
+# Learning from examples (Ch 18)
+
+- Supervised learning: learning from examples where the correct answers are already known
+- Reinforcement learning: learning from rewards
+- Unsupervised learning: finding structure without known answers, e.g. clustering
+
+
+![](media/final/learningAgent.PNG)
+
+## Inductive learning problems
+
+![](media/final/inductiveLearning.PNG)
+
+![](media/final/ock.PNG)
+
+Ockham's razor: prefer the hypothesis that maximizes a combination of consistency and simplicity.
+Overly complex models that perfectly fit the training data often do not generalize well to new data.
+
+
+## Decision trees
+
+Often the most natural way of representing a boolean function, but they often do not generalize well.
+
+![](media/final/decisionTree.PNG)
+
+## Entropy
+
+Decision trees use entropy to pick which input to branch on first.
+A 50/50 split in the data is usually less useful than an 80/20 split because the 50/50 split leaves more uncertainty (higher entropy) about the answer.
+We pick the input whose split minimizes the remaining entropy.
+
+$$
+entropy = -\sum^n_{i = 1} P_i \log_2 P_i
+$$
+
+## Neural networks
+
+Loosely based on the neurons in the human brain.
+
+
+McCulloch-Pitts neuron:
+
+![](media/final/pitts.PNG)
+
+Examples of logic functions:
+
+![](media/final/logicNeurons.PNG)
+
+### Single Layer Perceptrons
+
+![](media/final/singleLayer.PNG)
+
+### Multi-layer Perceptrons
+
+![](media/final/multiLayer.PNG)
+
+
+## Backpropagation
+
+A way of incrementally adjusting the weights, by propagating the output error backward through the layers, so that the model better fits the training data.
+
+
+## SVMs: Support Vector Machines
+
+- Can operate in very high-dimensional spaces
+- As long as the data is sparse, the curse of dimensionality is not an issue
+- By default, it assumes the data can be linearly separated, given a large enough number of dimensions.
+- Sometimes the kernel trick is used to distort the space so that the data becomes linearly separable.
+
+![](media/final/svm.PNG)
+
+## CNNs: Convolutional Neural Networks
+
+![](media/final/ccn.PNG)
+
+## LSTMs: Long Short-Term Memory
+
+- Heavily used in natural language processing (NLP).
+
+![](media/final/lstm.PNG)
+
+# Probabilistic Learning (Ch. 20)
+
+## Maximum A Posteriori approximation (MAP)
+
+You assume the single most likely model and use it to make your prediction.
+This approximates full Bayesian learning.
+
+Full Bayesian learning instead makes its prediction using the average of the predictions of all the potential models, weighted by how probable each model is given the data.
+
+
+``` python
+"""
+Equation 20.1
+    P(h_i|d) = gamma * P(d|h_i) * P(h_i)
+gamma is 1/P(d), where P(d) is the sum of P(d|h_i) * P(h_i) over all i.
+P(h_i) is the prior: the frequency of that bag in the wild.
+P(d|h_i) is the likelihood: the product, over the observations, of each
+observation's probability under that bag's distribution (i.i.d. data).
+"""
+```
+
+## Maximum Likelihood approximation (MLE)
+
+This process has three steps: (1) write down an expression for the likelihood of the data as a function of the parameters; (2) write down the derivative of the log likelihood with respect to each parameter; (3) find the parameter values for which the derivatives are zero.
+
+
+## EM
+
+EM (expectation-maximization): the idea behind k-means clustering, which alternates between assigning points to clusters and re-estimating the cluster centers.
+
+# Reinforcement learning (Ch. 21)
+
+MDP (Markov decision process): the goal is to find an optimal policy.
+The agent often has to explore the space to learn the rewards.
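+
+As a sketch of what "optimal policy" usually means for an MDP, the formula below picks the policy with the highest expected discounted sum of rewards. The symbols are notation assumed here for illustration and do not come from the notes: `\pi` is a policy, `\gamma` is a discount factor between 0 and 1, and `R(s_t)` is the reward received in the state visited at time `t`.
+
+$$
+\pi^* = \underset{\pi}{\arg\max} \; E\left[ \sum_{t = 0}^{\infty} \gamma^t R(s_t) \mid \pi \right]
+$$
+
+A discount factor below 1 makes rewards that arrive sooner count more than rewards that arrive later, and keeps the infinite sum finite when the rewards are bounded.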
+
+
+## Bellman equation
+
+![](media/final/bellman.png)
+
+# Logic (Ch 7)
+
+- Knowledge base: a set of sentences in a formal language
+- Inference engine: domain-independent algorithms
+
+- Declarative approach: tell the agent what it needs to know
+
+![](media/final/propositional.png)
+
+- Logics are formal languages for representing information so that conclusions can be drawn
+- Syntax defines the sentences in the language
+- Semantics defines the meaning of the sentences
+
+- Models are formally structured worlds with respect to which truth can be evaluated.
+
+## Propositional Logic
+
+- Assumes the world contains facts; a model assigns a truth value to each propositional symbol.
+
+![](media/final/propLogic.png)
+
+## Entailment
+
+- Entailment means that one thing follows from another.
+- KB |= alpha: knowledge base KB entails sentence "alpha" iff "alpha" is true in all worlds where KB is true. Ex: x + y = 4 entails 4 = x + y
+- Put another way, entailment is a relationship between sentences (syntax) that is based on their meaning (semantics)
+
+![](media/final/wumpus.png)
+
+## Inference
+
+- Inference: deriving sentences from other sentences
+- Soundness: derivations produce only entailed sentences
+- Completeness: derivations can produce all entailed sentences
+
+
+## Forward chaining
+
+Forward chaining can derive every fact that follows from the rules in the knowledge base. The basic idea: fire every rule whose premises are satisfied in the knowledge base and add its conclusion to the knowledge base, repeating until the query is found or nothing new can be added.
+
+## Resolution
+
+Resolution is sound and complete for propositional logic.
+
+
+## First-order logic (Ch 8)
+
+First-order logic (FOL), like natural language, assumes the world contains objects, relations, and functions. It has more expressive power than propositional logic.
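+
+To make the extra expressiveness concrete, here is a small illustrative sentence. The predicate names `King` and `Person` are assumptions chosen for this example, not something from the notes. A single FOL sentence with a quantified variable can state a rule about every object at once:
+
+$$
+\forall x \; \text{King}(x) \Rightarrow \text{Person}(x)
+$$
+
+Propositional logic has no variables or quantifiers, so the same knowledge would need a separate sentence for each individual, for example one symbol pair per named king.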
diff --git a/blogContent/posts/data-science/media/final/bellman.png b/blogContent/posts/data-science/media/final/bellman.png
new file mode 100644
index 0000000..2948d6d
Binary files /dev/null and b/blogContent/posts/data-science/media/final/bellman.png differ
diff --git a/blogContent/posts/data-science/media/final/ccn.PNG b/blogContent/posts/data-science/media/final/ccn.PNG
new file mode 100644
index 0000000..bbe42e2
Binary files /dev/null and b/blogContent/posts/data-science/media/final/ccn.PNG differ
diff --git a/blogContent/posts/data-science/media/final/decisionTree.PNG b/blogContent/posts/data-science/media/final/decisionTree.PNG
new file mode 100644
index 0000000..566625d
Binary files /dev/null and b/blogContent/posts/data-science/media/final/decisionTree.PNG differ
diff --git a/blogContent/posts/data-science/media/final/inductiveLearning.PNG b/blogContent/posts/data-science/media/final/inductiveLearning.PNG
new file mode 100644
index 0000000..53fa1bb
Binary files /dev/null and b/blogContent/posts/data-science/media/final/inductiveLearning.PNG differ
diff --git a/blogContent/posts/data-science/media/final/learningAgent.PNG b/blogContent/posts/data-science/media/final/learningAgent.PNG
new file mode 100644
index 0000000..4c1fb77
Binary files /dev/null and b/blogContent/posts/data-science/media/final/learningAgent.PNG differ
diff --git a/blogContent/posts/data-science/media/final/logicNeurons.PNG b/blogContent/posts/data-science/media/final/logicNeurons.PNG
new file mode 100644
index 0000000..8ee0a04
Binary files /dev/null and b/blogContent/posts/data-science/media/final/logicNeurons.PNG differ
diff --git a/blogContent/posts/data-science/media/final/lstm.PNG b/blogContent/posts/data-science/media/final/lstm.PNG
new file mode 100644
index 0000000..7c7ae79
Binary files /dev/null and b/blogContent/posts/data-science/media/final/lstm.PNG differ
diff --git a/blogContent/posts/data-science/media/final/multiLayer.PNG b/blogContent/posts/data-science/media/final/multiLayer.PNG
new file mode 100644
index 0000000..eb2e1c1
Binary files /dev/null and b/blogContent/posts/data-science/media/final/multiLayer.PNG differ
diff --git a/blogContent/posts/data-science/media/final/ock.PNG b/blogContent/posts/data-science/media/final/ock.PNG
new file mode 100644
index 0000000..d71bf54
Binary files /dev/null and b/blogContent/posts/data-science/media/final/ock.PNG differ
diff --git a/blogContent/posts/data-science/media/final/pitts.PNG b/blogContent/posts/data-science/media/final/pitts.PNG
new file mode 100644
index 0000000..67ab8f0
Binary files /dev/null and b/blogContent/posts/data-science/media/final/pitts.PNG differ
diff --git a/blogContent/posts/data-science/media/final/propLogic.png b/blogContent/posts/data-science/media/final/propLogic.png
new file mode 100644
index 0000000..c925d99
Binary files /dev/null and b/blogContent/posts/data-science/media/final/propLogic.png differ
diff --git a/blogContent/posts/data-science/media/final/propositional.png b/blogContent/posts/data-science/media/final/propositional.png
new file mode 100644
index 0000000..aad79e3
Binary files /dev/null and b/blogContent/posts/data-science/media/final/propositional.png differ
diff --git a/blogContent/posts/data-science/media/final/singleLayer.PNG b/blogContent/posts/data-science/media/final/singleLayer.PNG
new file mode 100644
index 0000000..d262c1a
Binary files /dev/null and b/blogContent/posts/data-science/media/final/singleLayer.PNG differ
diff --git a/blogContent/posts/data-science/media/final/svm.PNG b/blogContent/posts/data-science/media/final/svm.PNG
new file mode 100644
index 0000000..41adeb1
Binary files /dev/null and b/blogContent/posts/data-science/media/final/svm.PNG differ
diff --git a/blogContent/posts/data-science/media/final/wumpus.png b/blogContent/posts/data-science/media/final/wumpus.png
new file mode 100644
index 0000000..c76971c
Binary files /dev/null and b/blogContent/posts/data-science/media/final/wumpus.png differ