https://rpubs.com/patrickoster/481524

https://rpubs.com/patrickoster/481544

# Importing and Cleaning Data


# Data Visualization


# Analysis


# Report



## Abstract

The way in which a team trains is critical in ensuring that everyone
performs at their peak performance during a game. In order to
effectively train a team to optimize their gameday performance, it
would make intuitive sense to monitor their training data with respect
to their perceived fatigue. Through analyzing time series data
provided by our partnering women’s rugby team, it was observed that
this team altered their training schedule close to games. Although
there is some relationship between the two in the long run, our
attempts at modeling fatigue and work load in the short run suggests
little to no correlation using linear regressions. This suggests that
modeling fatigue is a more complex problem including a slew of factors
both psychological and physical which spans over a period of time;
coaches should pay attention not only to training but also sleep and
mental wellness for happy and competitive teams. To most effectively
forecast an individual’s performance during a game, we propose a
system which takes into account physiological factors such as desire
and physical factors such as sleep, soreness and amount of training. 

## Methodology

We employed a wide range of techniques for establishing our models and
hypotheses, including smoothing of time series Information, testing of
hypotheses based on a prior understanding of the domain, plotting and
visually analyzing pairs of variables, and artificial intelligence
algorithms that found various linear and nonlinear patterns in the
dataset. Coefficients of determination were calculated to determine
fitness of linear models, and F1 scores were analyzed to validate
complex nonlinear classification models. 


## Modeling Fatigue

Fatigue can be effectively and linearly modeled using daily records
and time series moving averages of acute chronic ratios, daily
workload, sleep quality, and sleep hours. This means that instead of
only lowering training before competitions, coaches should put focus
on preparing the athletes physically and mentally through a
combination of measures with a focus on sleep. 


| Iterations/100      | Mean Squared Error |
| ----------- | ----------- |
| 1      | 90.4998       |
| 11   | 1.0265        |
| 21   | 0.9604        |
| 31   | 0.8671        |
| 41   | 0.7838        |
|100 | 0.0925 |
Sample Size: 304864
Final R2: 0.532


## Predicting Performance

Trivially, performance of an individual cannot be modeled using simple
linear regressions  only involving one factors. We therefore developed
and optimized a deep neural network to capture the patterns involving
fatigue, sleep, and self-rated performance.  

The structure of the network is a 3-layer (input, output, and a hidden
layer)  sigmoid classifier that was trained on batches of 32 samples
from players with  respect to features: normalized perceived fatigue,
sliding average of perceived fatigue, sliding average over sleep
hours, and the perceived sleep quality of the players. It is optimized
through the Adam optimizer with a learning rate of  .005 and cross
entropy to calculate the loss between the logits and labels.  

The logits of the work are a confidence output on which class the
network feels the sample most likely belongs to, the real value of
which is the  classification of perceived performance by the player.
Through this method, we can show a correlation between fatigue, sleep,
and self-rated performance, as well as a means to predict this
self-rate performance based off of fatigue and self-perceived sleep
quality. 

Results with LR=.01, Batch=32: 

- Accuracy before training: 20.44388%
- Loss after step 49: .531657
- Accuracy after training: 74.846625%
- F1 Score: .94

![](media/datafest/network.png)

## Future Work

With more data to to test with we can further improve and validate out
models. With historical data from other teams we can take our analysis
one step further. Based on the training, performance, and fatigue
information from other teams we can use that to create a model to make
a recommendation for our team’s training. This model would be able to
make recommendations for our training intensity leading up to a game.
Since this will be heavily dealing with multivariate time series data
leading up to a game, using a Long Short-term Network (LSTM) would
bring promising results.