|
|
@ -0,0 +1,96 @@ |
|
|
|
|
|
|
|
|
|
|
|
# Importing and Cleaning Data |
|
|
|
|
|
|
|
|
|
|
|
# Data Visualization |
|
|
|
|
|
|
|
|
|
|
|
# Analysis |
|
|
|
|
|
|
|
|
|
|
|
# Report |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
## Abstract |
|
|
|
|
|
|
|
The way in which a team trains is critical in ensuring that everyone performs at their peak performance |
|
|
|
during a game. In order to effectively train a team to optimize their gameday performance, it would make |
|
|
|
intuitive sense to monitor their training data with respect to their perceived fatigue. Through analyzing |
|
|
|
time series data provided by our partnering women’s rugby team, it was observed that this team altered |
|
|
|
their training schedule close to games. Although there is some relationship between the two in the long |
|
|
|
run, our attempts at modeling fatigue and work load in the short run suggests little to no correlation using |
|
|
|
linear regressions. This suggests that modeling fatigue is a more complex problem including a slew of factors |
|
|
|
both psychological and physical which spans over a period of time; coaches should pay attention not only to |
|
|
|
training but also sleep and mental wellness for happy and competitive teams. To most effectively forecast an |
|
|
|
individual’s performance during a game, we propose a system which takes into account physiological factors |
|
|
|
such as desire and physical factors such as sleep, soreness and amount of training. |
|
|
|
|
|
|
|
## Methodology |
|
|
|
|
|
|
|
We employed a wide range of techniques for establishing our models and hypotheses, including smoothing |
|
|
|
of time series Information, testing of hypotheses based on a prior understanding of the domain, plotting |
|
|
|
and visually analyzing pairs of variables, and artificial intelligence algorithms that found various linear and |
|
|
|
nonlinear patterns in the dataset. Coefficients of determination were calculated to determine fitness of linear |
|
|
|
models, and F1 scores were analyzed to validate complex nonlinear classification models. |
|
|
|
|
|
|
|
|
|
|
|
## Modeling Fatigue |
|
|
|
|
|
|
|
Fatigue can be effectively and linearly modeled using daily records and time series moving |
|
|
|
averages of acute chronic ratios, daily workload, sleep quality, and sleep hours. |
|
|
|
This means that instead of only lowering training before competitions, coaches |
|
|
|
should put focus on preparing the athletes physically and mentally through a |
|
|
|
combination of measures with a focus on sleep. |
|
|
|
|
|
|
|
|
|
|
|
| Iterations/100 | Mean Squared Error | |
|
|
|
| ----------- | ----------- | |
|
|
|
| 1 | 90.4998 | |
|
|
|
| 11 | 1.0265 | |
|
|
|
| 21 | 0.9604 | |
|
|
|
| 31 | 0.8671 | |
|
|
|
| 41 | 0.7838 | |
|
|
|
|100 | 0.0925 | |
|
|
|
Sample Size: 304864 |
|
|
|
Final R2: 0.532 |
|
|
|
|
|
|
|
|
|
|
|
## Predicting Performance |
|
|
|
|
|
|
|
Trivially, performance of an individual cannot be modeled using simple linear regressions |
|
|
|
only involving one factors. We therefore developed and optimized a deep neural |
|
|
|
network to capture the patterns involving fatigue, sleep, and self-rated performance. |
|
|
|
|
|
|
|
The structure of the network is a 3-layer (input, output, and a hidden layer) |
|
|
|
sigmoid classifier that was trained on batches of 32 samples from players with |
|
|
|
respect to features: normalized perceived fatigue, sliding average of |
|
|
|
perceived fatigue, sliding average over sleep hours, and the perceived sleep quality of |
|
|
|
the players. It is optimized through the Adam optimizer with a learning rate of |
|
|
|
.005 and cross entropy to calculate the loss between the logits and labels. |
|
|
|
|
|
|
|
The logits of the work are a confidence output on which class the network |
|
|
|
feels the sample most likely belongs to, the real value of which is the |
|
|
|
classification of perceived performance by the player. Through this method, |
|
|
|
we can show a correlation between fatigue, sleep, and self-rated performance, |
|
|
|
as well as a means to predict this self-rate performance based off of fatigue |
|
|
|
and self-perceived sleep quality. |
|
|
|
|
|
|
|
Results with LR=.01, Batch=32: |
|
|
|
|
|
|
|
- Accuracy before training: 20.44388% |
|
|
|
- Loss after step 49: .531657 |
|
|
|
- Accuracy after training: 74.846625% |
|
|
|
- F1 Score: .94 |
|
|
|
|
|
|
|
![](media/datafest/network.png) |
|
|
|
|
|
|
|
## Future Work |
|
|
|
|
|
|
|
With more data to to test with we can further improve and validate out models. With historical data from |
|
|
|
other teams we can take our analysis one step further. Based on the training, performance, and fatigue |
|
|
|
information from other teams we can use that to create a model to make a recommendation for our team’s |
|
|
|
training. This model would be able to make recommendations for our training intensity leading up to a |
|
|
|
game. Since this will be heavily dealing with multivariate time series data leading up to a game, using a Long |
|
|
|
Short-term Network (LSTM) would bring promising results. |