datafest competition 2019

Features in Wellness:
Pain [0, 1] - no NaNs
Illness [0, 0.5, 1] - no NaNs
Menstruation [0, 1] - 16 NaNs, filled with 0. With so few missing values, the fill makes no meaningful statistical difference
Nutrition [0, 0.5, 1] - 837 NaNs, filled with 0. Not a useful feature
NutritionAdj [0, 1] - 745 NaNs, filled with 0. Also not useful
USGMeasurement [0, 1] - 168 NaNs, filled with 0
USG [1.0...] - 4382 NaNs, not a useful feature
TrainingReadiness [0..1] - no NaNs

Useful features: Pain, Illness, Menstruation, TrainingReadiness.
The others either have too many NaNs to extract any useful meaning or, like Nutrition, are
unhelpful features to begin with.
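
The NaN counts and zero fill described above could be reproduced roughly as follows. This is a minimal standard-library sketch, not the code actually used for these notes: the sample rows are made up, empty cells stand in for NaNs, and the helper names count_missing and fill_and_select are hypothetical.

```python
import csv
import io

# Hypothetical sample rows; the real wellness data is not reproduced here.
# An empty cell stands in for a NaN.
SAMPLE_CSV = """Pain,Illness,Menstruation,Nutrition,TrainingReadiness
0,0,1,,0.8
1,0.5,,0.5,0.6
0,1,0,,1.0
"""

# Feature list taken from the notes above.
USEFUL_FEATURES = ["Pain", "Illness", "Menstruation", "TrainingReadiness"]


def count_missing(rows):
    """Return {column: number of empty cells} over all rows."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            is_missing = 1 if value.strip() == "" else 0
            counts[column] = counts.get(column, 0) + is_missing
    return counts


def fill_and_select(rows, columns, fill_value=0.0):
    """Keep only `columns`, replacing empty cells with `fill_value`."""
    cleaned = []
    for row in rows:
        cleaned.append({
            column: float(row[column]) if row[column].strip() else fill_value
            for column in columns
        })
    return cleaned


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
missing = count_missing(rows)
features = fill_and_select(rows, USEFUL_FEATURES)

print(missing)      # per-column count of missing values
print(features[1])  # second row, with the empty Menstruation cell filled with 0
```

Counting before filling matters here: once the gaps are replaced with 0, a column such as Nutrition can no longer be told apart from one that was complete to begin with.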

Notnormalized_with_0NaN_wellness.csv:
- Menstruation is the only feature of significance that had NaN values filled in; only 16 NaNs were present,
  so the fill makes no statistical difference either way.
- Working from the notnormalized_with_0NaN_wellness csv should work; just remove any string columns
  before passing the data to algorithms, as they are not removed in this version.
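
One way to drop the string columns before handing the data to an algorithm is sketched below. It is an assumption-laden illustration: the notes do not say which columns hold strings, so the Date and PlayerID columns, the sample values, and the helper names is_number, numeric_columns and to_matrix are all hypothetical.

```python
import csv
import io

# Hypothetical sample; "Date" and "PlayerID" are assumed string columns,
# the notes do not name the actual ones.
SAMPLE_CSV = """Date,PlayerID,Pain,TrainingReadiness
2018-07-21,P1,0,0.8
2018-07-22,P2,1,0.6
"""


def is_number(text):
    """Return True if `text` parses as a float."""
    try:
        float(text)
    except ValueError:
        return False
    return True


def numeric_columns(rows):
    """Return the names of columns whose every value parses as a number."""
    if not rows:
        return []
    return [col for col in rows[0] if all(is_number(row[col]) for row in rows)]


def to_matrix(rows, columns):
    """Build a list-of-lists feature matrix from the chosen columns."""
    return [[float(row[col]) for col in columns] for row in rows]


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
kept = numeric_columns(rows)   # string columns are left out
X = to_matrix(rows, kept)      # purely numeric input for a model

print(kept)
print(X)
```

Because the NaNs in this file are already filled with 0, every remaining cell in a numeric column parses cleanly; on the raw data the empty cells would need handling first.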