From 89d2cf37c44361f023e14d9c3b1e3966e665c8a6 Mon Sep 17 00:00:00 2001
From: Ryan Missel <rxm7244@rit.edu>
Date: Sat, 30 Mar 2019 09:56:40 -0400
Subject: [PATCH] Added docs folder to hold relevant CSV docs.

---
 data_preparation/docs/wellness.txt | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)
 create mode 100644 data_preparation/docs/wellness.txt

diff --git a/data_preparation/docs/wellness.txt b/data_preparation/docs/wellness.txt
new file mode 100644
index 0000000..5df9c2c
--- /dev/null
+++ b/data_preparation/docs/wellness.txt
@@ -0,0 +1,22 @@
+Features in Wellness:
+    Pain [0, 1] - no NaNs
+    Illness [0, 0.5, 1] - no NaNs
+    Menstruation [0, 1] - 16 NaNs, filled with 0. Not a big statistical difference, so this is fine
+    Nutrition [0, 0.5, 1] - 837 NaNs, filled with 0. Not a useful feature
+    NutritionAdj [0, 1] - 745 NaNs, filled with 0. Again, not useful
+    USGMeasurement [0, 1] - 168 NaNs, filled with 0
+    USG [1.0...] - 4382 NaNs. Not a useful feature
+    TrainingReadiness [0..1] - no NaNs
+
+The useful features are Pain, Illness, Menstruation, and TrainingReadiness.
+The others either have too many NaNs to carry any useful meaning or are unhelpful features
+to begin with, like Nutrition.
+
+
+Notnormalized_with_0NaN_wellness.csv:
+
+- The only significant feature whose NaN values were filled with 0 was Menstruation. Only 16 NaNs were present,
+so the fill should not make a statistical difference either way.
+
+- Working in notnormalized_with_0NaN_wellness.csv should be functional. Just remove any string columns
+before passing the data into algorithms, as they are not removed in this version.
\ No newline at end of file