diff --git a/blogContent/headerImages/playTimes.png b/blogContent/headerImages/playTimes.png
new file mode 100644
index 0000000..fd4256f
Binary files /dev/null and b/blogContent/headerImages/playTimes.png differ
diff --git a/blogContent/posts/data-science/media/steamGames/output_11_1.png b/blogContent/posts/data-science/media/steamGames/output_11_1.png
new file mode 100644
index 0000000..1c2e3ed
Binary files /dev/null and b/blogContent/posts/data-science/media/steamGames/output_11_1.png differ
diff --git a/blogContent/posts/data-science/media/steamGames/output_13_1.png b/blogContent/posts/data-science/media/steamGames/output_13_1.png
new file mode 100644
index 0000000..8b1cbb0
Binary files /dev/null and b/blogContent/posts/data-science/media/steamGames/output_13_1.png differ
diff --git a/blogContent/posts/data-science/media/steamGames/output_15_1.png b/blogContent/posts/data-science/media/steamGames/output_15_1.png
new file mode 100644
index 0000000..8b86e94
Binary files /dev/null and b/blogContent/posts/data-science/media/steamGames/output_15_1.png differ
diff --git a/blogContent/posts/data-science/media/steamGames/output_17_0.png b/blogContent/posts/data-science/media/steamGames/output_17_0.png
new file mode 100644
index 0000000..8265d56
Binary files /dev/null and b/blogContent/posts/data-science/media/steamGames/output_17_0.png differ
diff --git a/blogContent/posts/data-science/media/steamGames/output_19_0.png b/blogContent/posts/data-science/media/steamGames/output_19_0.png
new file mode 100644
index 0000000..b422f60
Binary files /dev/null and b/blogContent/posts/data-science/media/steamGames/output_19_0.png differ
diff --git a/blogContent/posts/data-science/media/steamGames/output_25_0.png b/blogContent/posts/data-science/media/steamGames/output_25_0.png
new file mode 100644
index 0000000..487c5df
Binary files /dev/null and b/blogContent/posts/data-science/media/steamGames/output_25_0.png differ
diff --git a/blogContent/posts/data-science/media/steamGames/output_27_1.png b/blogContent/posts/data-science/media/steamGames/output_27_1.png
new file mode 100644
index 0000000..da75984
Binary files /dev/null and b/blogContent/posts/data-science/media/steamGames/output_27_1.png differ
diff --git a/blogContent/posts/data-science/media/steamGames/output_31_0.png b/blogContent/posts/data-science/media/steamGames/output_31_0.png
new file mode 100644
index 0000000..d3bbc94
Binary files /dev/null and b/blogContent/posts/data-science/media/steamGames/output_31_0.png differ
diff --git a/blogContent/posts/data-science/media/steamGames/output_34_0.png b/blogContent/posts/data-science/media/steamGames/output_34_0.png
new file mode 100644
index 0000000..24ec6be
Binary files /dev/null and b/blogContent/posts/data-science/media/steamGames/output_34_0.png differ
diff --git a/blogContent/posts/data-science/media/steamGames/output_37_0.png b/blogContent/posts/data-science/media/steamGames/output_37_0.png
new file mode 100644
index 0000000..985dc41
Binary files /dev/null and b/blogContent/posts/data-science/media/steamGames/output_37_0.png differ
diff --git a/blogContent/posts/data-science/media/steamGames/output_38_0.png b/blogContent/posts/data-science/media/steamGames/output_38_0.png
new file mode 100644
index 0000000..c515f84
Binary files /dev/null and b/blogContent/posts/data-science/media/steamGames/output_38_0.png differ
diff --git a/blogContent/posts/data-science/media/steamGames/output_9_1.png b/blogContent/posts/data-science/media/steamGames/output_9_1.png
new file mode 100644
index 0000000..28d566c
Binary files /dev/null and b/blogContent/posts/data-science/media/steamGames/output_9_1.png differ
diff --git a/blogContent/posts/data-science/time-spent-in-steam-games.md b/blogContent/posts/data-science/time-spent-in-steam-games.md
new file mode 100644
index 0000000..6087528
--- /dev/null
+++ b/blogContent/posts/data-science/time-spent-in-steam-games.md
@@ -0,0 +1,900 @@
+Last week I scraped a bunch of data from the Steam API using my [Steam Graph Project](https://github.com/jrtechs/SteamFriendsGraph).
+This project captures Steam users, their friends, and the games that they own.
+Using the JanusGraph traversal object, I use the Gremlin graph query language to pull this data.
+Since I am storing the hours played in a game as a property on the relationship between a player and a game node, I had to make a "join" statement to get the hours property with the game information in a single query.
+
+```java
+Object o = graph.con.getTraversal()
+    .V()
+    .hasLabel(Game.KEY_DB)
+    .match(
+            __.as("c").values(Game.KEY_STEAM_GAME_ID).as("gameID"),
+            __.as("c").values(Game.KEY_GAME_NAME).as("gameName"),
+            __.as("c").inE(Game.KEY_RELATIONSHIP).values(Game.KEY_PLAY_TIME).as("time")
+    ).select("gameID", "time", "gameName").toList();
+WrappedFileWriter.writeToFile(new Gson().toJson(o).toLowerCase(), "games.json");
+```
+
+Using the game indexing property on the players, I found that I had fully indexed the games of only 481 players after 8 hours.
+
+```java
+graph.con.getTraversal()
+    .V()
+    .hasLabel(SteamGraph.KEY_PLAYER)
+    .has(SteamGraph.KEY_CRAWLED_GAME_STATUS, 1)
+    .count().next()
+```
+
+We now transition to Python and Matplotlib to visualize the data exported from our JanusGraph query as a JSON object.
+The dependencies for this [notebook](https://github.com/jrtechs/RandomScripts/tree/master/notebooks) can be installed using pip.
+
+
+```python
+!pip install pandas
+!pip install matplotlib
+```
+
+```
+    Collecting pandas
+      Downloading pandas-1.0.5-cp38-cp38-manylinux1_x86_64.whl (10.0 MB)
+         |████████████████████████████████| 10.0 MB 4.3 MB/s eta 0:00:01
+    Collecting pytz>=2017.2
+      Downloading pytz-2020.1-py2.py3-none-any.whl (510 kB)
+         |████████████████████████████████| 510 kB 2.9 MB/s eta 0:00:01
+    Requirement already satisfied: numpy>=1.13.3 in /home/jeff/Documents/python/ml/lib/python3.8/site-packages (from pandas) (1.18.5)
+    Requirement already satisfied: python-dateutil>=2.6.1 in /home/jeff/Documents/python/ml/lib/python3.8/site-packages (from pandas) (2.8.1)
+    Requirement already satisfied: six>=1.5 in /home/jeff/Documents/python/ml/lib/python3.8/site-packages (from python-dateutil>=2.6.1->pandas) (1.15.0)
+    Installing collected packages: pytz, pandas
+    Successfully installed pandas-1.0.5 pytz-2020.1
+```
+
+First, we import our JSON data as a pandas data frame.
+Pandas is an open-source data analysis and manipulation tool.
+I enjoy pandas because it has native integration with matplotlib and supports operations like aggregations and groupings. 
+
+
+```python
+import matplotlib.pyplot as plt
+import pandas as pd
+
+games_df = pd.read_json('games.json')
+games_df
+```
+
+
+<div>
+<style scoped>
+    .dataframe tbody tr th:only-of-type {
+        vertical-align: middle;
+    }
+
+    .dataframe tbody tr th {
+        vertical-align: top;
+    }
+
+    .dataframe thead th {
+        text-align: right;
+    }
+</style>
+<table border="1" class="dataframe">
+  <thead>
+    <tr style="text-align: right;">
+      <th></th>
+      <th>gameid</th>
+      <th>time</th>
+      <th>gamename</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <th>0</th>
+      <td>210770</td>
+      <td>243</td>
+      <td>sanctum 2</td>
+    </tr>
+    <tr>
+      <th>1</th>
+      <td>210770</td>
+      <td>31</td>
+      <td>sanctum 2</td>
+    </tr>
+    <tr>
+      <th>2</th>
+      <td>210770</td>
+      <td>276</td>
+      <td>sanctum 2</td>
+    </tr>
+    <tr>
+      <th>3</th>
+      <td>210770</td>
+      <td>147</td>
+      <td>sanctum 2</td>
+    </tr>
+    <tr>
+      <th>4</th>
+      <td>210770</td>
+      <td>52</td>
+      <td>sanctum 2</td>
+    </tr>
+    <tr>
+      <th>...</th>
+      <td>...</td>
+      <td>...</td>
+      <td>...</td>
+    </tr>
+    <tr>
+      <th>36212</th>
+      <td>9800</td>
+      <td>9</td>
+      <td>death to spies</td>
+    </tr>
+    <tr>
+      <th>36213</th>
+      <td>445220</td>
+      <td>0</td>
+      <td>avorion</td>
+    </tr>
+    <tr>
+      <th>36214</th>
+      <td>445220</td>
+      <td>25509</td>
+      <td>avorion</td>
+    </tr>
+    <tr>
+      <th>36215</th>
+      <td>445220</td>
+      <td>763</td>
+      <td>avorion</td>
+    </tr>
+    <tr>
+      <th>36216</th>
+      <td>445220</td>
+      <td>3175</td>
+      <td>avorion</td>
+    </tr>
+  </tbody>
+</table>
+<p>36217 rows × 3 columns</p>
+</div>
+
+
+Using the built-in matplotlib wrapper function, we can graph a histogram of the number of minutes played per owned game.
+
+
+```python
+ax = games_df.hist(column='time', bins=20, range=(0, 4000))
+ax=ax[0][0]
+ax.set_title("Game Play Distribution")
+ax.set_xlabel("Minutes Played")
+ax.set_ylabel("Frequency")
+```
+
+![png](media/steamGames/output_9_1.png)
+
+
+Notice that the vast majority of games are rarely ever played; the distribution is skewed to the right with a lot of outliers.
+We can change the scale to make it easier to view using the range parameter.
+
+
+```python
+ax = games_df.hist(column='time', bins=20, range=(0, 100))
+ax=ax[0][0]
+ax.set_title("Game Play Distribution")
+ax.set_xlabel("Minutes Played")
+ax.set_ylabel("Frequency")
+```
+
+
+![png](media/steamGames/output_11_1.png)
+
+
+If we remove games that have been played for less than two minutes, the distribution looks more reasonable.
+
+
+```python
+ax = games_df.hist(column='time', bins=20, range=(2, 100))
+ax=ax[0][0]
+ax.set_title("Game Play Distribution")
+ax.set_xlabel("Minutes Played")
+ax.set_ylabel("Frequency")
+```
+
+![png](media/steamGames/output_13_1.png)
+
+
+Although histograms are useful, viewing the cumulative distribution function (CDF) is often more helpful since it is easier to read numerical information off it.
+
+
+```python
+ax = games_df.hist(column='time', density=True, range=(0, 2000), histtype='step', cumulative=True)
+ax=ax[0][0]
+ax.set_title("Game Play Distribution")
+ax.set_xlabel("Minutes Played")
+ax.set_ylabel("Frequency")
+```
+
+![png](media/steamGames/output_15_1.png)
+
+
+According to this graph, in about 80% of cases, a player who owns a game has played it for under 4 hours. Nearly half of all downloaded or purchased Steam games go unplayed. This data is a neat example of the legendary 80/20 principle, also known as the Pareto principle, which states that roughly 80% of the effects come from 20% of the causes. For example, 20% of software bugs result in 80% of debugging time.
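+
+These percentages can also be computed straight from the data frame rather than read off the plot. The snippet below is a minimal sketch that uses a small made-up sample in place of `games.json` (the values are illustrative only, not taken from the crawl), and `share_under` is a hypothetical helper name. One caveat: a histogram drawn with `range=(0, 2000)` and `density=True` is normalized over only the values that fall inside that range, so fractions computed over the whole column can come out somewhat lower than the curve suggests.
+
+```python
+import pandas as pd
+
+# Illustrative sample only; in the notebook this would be games_df from games.json.
+sample_df = pd.DataFrame({"time": [0, 0, 0, 15, 45, 120, 200, 600, 3000, 50000]})
+
+
+def share_under(df, minutes):
+    """Fraction of player-game entries with less than `minutes` of playtime."""
+    return (df["time"] < minutes).mean()
+
+
+unplayed_share = (sample_df["time"] == 0).mean()   # entries that were never launched
+under_four_hours = share_under(sample_df, 4 * 60)  # the time column is in minutes
+
+print(f"unplayed: {unplayed_share:.0%}, under 4 hours: {under_four_hours:.0%}")
+```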
+
+As mentioned earlier, the distribution of time played per owned game is heavily skewed to the right.
+
+
+```python
+ax = plt.gca()
+ax.set_title('Game Play Distribution')
+ax.boxplot(games_df['time'], vert=False,manage_ticks=False, notch=True)
+plt.xlabel("Game Play in Minutes")
+ax.set_yticks([])
+plt.show()
+```
+
+
+![png](media/steamGames/output_17_0.png)
+
+
+When zooming in on the distribution, we see that nearly half of all purchased games go unopened.
+
+```python
+ax = plt.gca()
+ax.set_title('Game Play Distribution')
+ax.boxplot(games_df['time']/60, vert=False,manage_ticks=False, notch=True)
+plt.xlabel("Game Play in Hours")
+ax.set_yticks([])
+ax.set_xlim([0, 10])
+plt.show()
+```
+
+
+![png](media/steamGames/output_19_0.png)
+
+
+Viewing playtime pooled across all games is insightful; however, comparing individual games against each other is more interesting.
+In pandas, after we create a grouping on a column, we can aggregate it into metrics such as max, min, mean, etc.
+I am also sorting the results by count since we are more interested in "popular" games.
+
+
+```python
+stats_df = (games_df.groupby("gamename")
+                    .agg({'time': ['count', "min", 'max', 'mean']})
+                    .sort_values(by=('time', 'count')))
+stats_df
+```
+
+<div>
+<style scoped>
+    .dataframe tbody tr th:only-of-type {
+        vertical-align: middle;
+    }
+
+    .dataframe tbody tr th {
+        vertical-align: top;
+    }
+
+    .dataframe thead tr th {
+        text-align: left;
+    }
+
+    .dataframe thead tr:last-of-type th {
+        text-align: right;
+    }
+</style>
+<table border="1" class="dataframe">
+  <thead>
+    <tr>
+      <th></th>
+      <th colspan="4" halign="left">time</th>
+    </tr>
+    <tr>
+      <th></th>
+      <th>count</th>
+      <th>min</th>
+      <th>max</th>
+      <th>mean</th>
+    </tr>
+    <tr>
+      <th>gamename</th>
+      <th></th>
+      <th></th>
+      <th></th>
+      <th></th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <th>龙魂时刻</th>
+      <td>1</td>
+      <td>14</td>
+      <td>14</td>
+      <td>14.000000</td>
+    </tr>
+    <tr>
+      <th>gryphon knight epic</th>
+      <td>1</td>
+      <td>0</td>
+      <td>0</td>
+      <td>0.000000</td>
+    </tr>
+    <tr>
+      <th>growing pains</th>
+      <td>1</td>
+      <td>0</td>
+      <td>0</td>
+      <td>0.000000</td>
+    </tr>
+    <tr>
+      <th>shoppy mart: steam edition</th>
+      <td>1</td>
+      <td>0</td>
+      <td>0</td>
+      <td>0.000000</td>
+    </tr>
+    <tr>
+      <th>ground pounders</th>
+      <td>1</td>
+      <td>0</td>
+      <td>0</td>
+      <td>0.000000</td>
+    </tr>
+    <tr>
+      <th>...</th>
+      <td>...</td>
+      <td>...</td>
+      <td>...</td>
+      <td>...</td>
+    </tr>
+    <tr>
+      <th>payday 2</th>
+      <td>102</td>
+      <td>0</td>
+      <td>84023</td>
+      <td>5115.813725</td>
+    </tr>
+    <tr>
+      <th>team fortress 2</th>
+      <td>105</td>
+      <td>7</td>
+      <td>304090</td>
+      <td>25291.180952</td>
+    </tr>
+    <tr>
+      <th>unturned</th>
+      <td>107</td>
+      <td>0</td>
+      <td>16974</td>
+      <td>1339.757009</td>
+    </tr>
+    <tr>
+      <th>garry's mod</th>
+      <td>121</td>
+      <td>0</td>
+      <td>311103</td>
+      <td>20890.314050</td>
+    </tr>
+    <tr>
+      <th>counter-strike: global offensive</th>
+      <td>129</td>
+      <td>0</td>
+      <td>506638</td>
+      <td>46356.209302</td>
+    </tr>
+  </tbody>
+</table>
+<p>9235 rows × 4 columns</p>
+</div>
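+
+The two-level column labels such as `('time', 'count')` come from passing a dictionary to `agg`. As an alternative sketch (not what the notebook above uses), pandas named aggregation produces flat column names instead. The sample rows and the output column names `min_time`, `max_time`, and `mean_time` are made up for illustration.
+
+```python
+import pandas as pd
+
+# Made-up rows standing in for games_df; times are in minutes.
+sample_df = pd.DataFrame({
+    "gamename": ["unturned", "unturned", "avorion", "avorion", "avorion"],
+    "time": [140, 723, 0, 763, 3175],
+})
+
+# Named aggregation: each keyword maps to (column, function) and becomes a flat column name.
+flat_stats_df = (sample_df.groupby("gamename")
+                          .agg(count=("time", "count"),
+                               min_time=("time", "min"),
+                               max_time=("time", "max"),
+                               mean_time=("time", "mean"))
+                          .sort_values(by="count"))
+print(flat_stats_df)
+```
+
+With flat names, later filters can use a plain string key such as `flat_stats_df["count"]` rather than a tuple.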
+
+
+To prevent one-off esoteric games with little data from throwing off the metrics, I am disregarding any game that has ten or fewer data points.
+
+
+```python
+stats_df = stats_df[stats_df[('time', 'count')] > 10]
+stats_df
+```
+
+
+<div>
+<style scoped>
+    .dataframe tbody tr th:only-of-type {
+        vertical-align: middle;
+    }
+
+    .dataframe tbody tr th {
+        vertical-align: top;
+    }
+
+    .dataframe thead tr th {
+        text-align: left;
+    }
+
+    .dataframe thead tr:last-of-type th {
+        text-align: right;
+    }
+</style>
+<table border="1" class="dataframe">
+  <thead>
+    <tr>
+      <th></th>
+      <th colspan="4" halign="left">time</th>
+    </tr>
+    <tr>
+      <th></th>
+      <th>count</th>
+      <th>min</th>
+      <th>max</th>
+      <th>mean</th>
+    </tr>
+    <tr>
+      <th>gamename</th>
+      <th></th>
+      <th></th>
+      <th></th>
+      <th></th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <th>serious sam hd: the second encounter</th>
+      <td>11</td>
+      <td>0</td>
+      <td>329</td>
+      <td>57.909091</td>
+    </tr>
+    <tr>
+      <th>grim fandango remastered</th>
+      <td>11</td>
+      <td>0</td>
+      <td>248</td>
+      <td>35.000000</td>
+    </tr>
+    <tr>
+      <th>evga precision x1</th>
+      <td>11</td>
+      <td>0</td>
+      <td>21766</td>
+      <td>2498.181818</td>
+    </tr>
+    <tr>
+      <th>f.e.a.r. 2: project origin</th>
+      <td>11</td>
+      <td>0</td>
+      <td>292</td>
+      <td>43.272727</td>
+    </tr>
+    <tr>
+      <th>transistor</th>
+      <td>11</td>
+      <td>0</td>
+      <td>972</td>
+      <td>298.727273</td>
+    </tr>
+    <tr>
+      <th>...</th>
+      <td>...</td>
+      <td>...</td>
+      <td>...</td>
+      <td>...</td>
+    </tr>
+    <tr>
+      <th>payday 2</th>
+      <td>102</td>
+      <td>0</td>
+      <td>84023</td>
+      <td>5115.813725</td>
+    </tr>
+    <tr>
+      <th>team fortress 2</th>
+      <td>105</td>
+      <td>7</td>
+      <td>304090</td>
+      <td>25291.180952</td>
+    </tr>
+    <tr>
+      <th>unturned</th>
+      <td>107</td>
+      <td>0</td>
+      <td>16974</td>
+      <td>1339.757009</td>
+    </tr>
+    <tr>
+      <th>garry's mod</th>
+      <td>121</td>
+      <td>0</td>
+      <td>311103</td>
+      <td>20890.314050</td>
+    </tr>
+    <tr>
+      <th>counter-strike: global offensive</th>
+      <td>129</td>
+      <td>0</td>
+      <td>506638</td>
+      <td>46356.209302</td>
+    </tr>
+  </tbody>
+</table>
+<p>701 rows × 4 columns</p>
+</div>
+
+
+We see that the average playtime per player per game is about 5 hours. However, as noted before, nearly half of all purchased games go unplayed.
+
+
+```python
+ax = plt.gca()
+ax.set_title('Game Play Distribution')
+ax.boxplot(stats_df[('time', 'mean')]/60, vert=False,manage_ticks=False, notch=True)
+plt.xlabel("Mean Game Play in Hours")
+ax.set_xlim([0, 40])
+ax.set_yticks([])
+plt.show()
+```
+
+
+![png](media/steamGames/output_25_0.png)
+
+
+I had a hunch that more popular games got played more; however, this dataset is still too small to verify this hunch.
+
+```python
+stats_df.plot.scatter(x=('time', 'count'), y=('time', 'mean'))
+```
+
+![png](media/steamGames/output_27_1.png)
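+
+One way to put a number on a hunch like this is a rank correlation between how many players own a game and its mean playtime. The sketch below computes a Spearman-style coefficient by ranking both columns and taking the Pearson correlation of the ranks. The data frame is a made-up stand-in for a flattened `stats_df`, and its column names `count` and `mean_time` are assumptions for illustration.
+
+```python
+import pandas as pd
+
+# Made-up per-game statistics; mean_time is in minutes.
+popularity_df = pd.DataFrame({
+    "count": [11, 20, 35, 60, 102, 129],
+    "mean_time": [58.0, 35.0, 300.0, 250.0, 5100.0, 46000.0],
+})
+
+# Spearman's rho is the Pearson correlation of the ranked values,
+# which makes it robust to the heavy right skew in playtime.
+rho = popularity_df["count"].rank().corr(popularity_df["mean_time"].rank())
+print(f"rank correlation: {rho:.2f}")
+```
+
+On a sample this small, a coefficient like this is noisy and should be read as a rough indicator rather than proof.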
+
+We can create a new filtered data frame that only contains the results for a single game to graph it.
+
+
+```python
+cc_df = games_df[games_df['gamename'] == "counter-strike: global offensive"]
+cc_df
+```
+
+<div>
+<style scoped>
+    .dataframe tbody tr th:only-of-type {
+        vertical-align: middle;
+    }
+
+    .dataframe tbody tr th {
+        vertical-align: top;
+    }
+
+    .dataframe thead th {
+        text-align: right;
+    }
+</style>
+<table border="1" class="dataframe">
+  <thead>
+    <tr style="text-align: right;">
+      <th></th>
+      <th>gameid</th>
+      <th>time</th>
+      <th>gamename</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <th>13196</th>
+      <td>730</td>
+      <td>742</td>
+      <td>counter-strike: global offensive</td>
+    </tr>
+    <tr>
+      <th>13197</th>
+      <td>730</td>
+      <td>16019</td>
+      <td>counter-strike: global offensive</td>
+    </tr>
+    <tr>
+      <th>13198</th>
+      <td>730</td>
+      <td>1781</td>
+      <td>counter-strike: global offensive</td>
+    </tr>
+    <tr>
+      <th>13199</th>
+      <td>730</td>
+      <td>0</td>
+      <td>counter-strike: global offensive</td>
+    </tr>
+    <tr>
+      <th>13200</th>
+      <td>730</td>
+      <td>0</td>
+      <td>counter-strike: global offensive</td>
+    </tr>
+    <tr>
+      <th>...</th>
+      <td>...</td>
+      <td>...</td>
+      <td>...</td>
+    </tr>
+    <tr>
+      <th>13320</th>
+      <td>730</td>
+      <td>3867</td>
+      <td>counter-strike: global offensive</td>
+    </tr>
+    <tr>
+      <th>13321</th>
+      <td>730</td>
+      <td>174176</td>
+      <td>counter-strike: global offensive</td>
+    </tr>
+    <tr>
+      <th>13322</th>
+      <td>730</td>
+      <td>186988</td>
+      <td>counter-strike: global offensive</td>
+    </tr>
+    <tr>
+      <th>13323</th>
+      <td>730</td>
+      <td>103341</td>
+      <td>counter-strike: global offensive</td>
+    </tr>
+    <tr>
+      <th>13324</th>
+      <td>730</td>
+      <td>10483</td>
+      <td>counter-strike: global offensive</td>
+    </tr>
+  </tbody>
+</table>
+<p>129 rows × 3 columns</p>
+</div>
+
+
+It is shocking how many hours certain people put into Counter-Strike. The highest number in the dataset was 8,444 hours, or 352 days!
+
+```python
+ax = plt.gca()
+ax.set_title('Game Play Distribution for Counter-Strike')
+ax.boxplot(cc_df['time']/60, vert=False,manage_ticks=False, notch=True)
+plt.xlabel("Game Play in Hours")
+ax.set_yticks([])
+plt.show()
+```
+
+![png](media/steamGames/output_31_0.png)
+
+
+Viewing the distribution for a different game like Unturned yields a vastly different picture than Counter-Strike. I believe the key difference is that Counter-Strike gets played competitively, whereas Unturned is a more leisurely game. Competitive gamers likely skew the distribution of Counter-Strike to be very high.
+
+```python
+u_df = games_df[games_df['gamename'] == "unturned"]
+u_df
+```
+
+<div>
+<style scoped>
+    .dataframe tbody tr th:only-of-type {
+        vertical-align: middle;
+    }
+
+    .dataframe tbody tr th {
+        vertical-align: top;
+    }
+
+    .dataframe thead th {
+        text-align: right;
+    }
+</style>
+<table border="1" class="dataframe">
+  <thead>
+    <tr style="text-align: right;">
+      <th></th>
+      <th>gameid</th>
+      <th>time</th>
+      <th>gamename</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <th>167</th>
+      <td>304930</td>
+      <td>140</td>
+      <td>unturned</td>
+    </tr>
+    <tr>
+      <th>168</th>
+      <td>304930</td>
+      <td>723</td>
+      <td>unturned</td>
+    </tr>
+    <tr>
+      <th>169</th>
+      <td>304930</td>
+      <td>1002</td>
+      <td>unturned</td>
+    </tr>
+    <tr>
+      <th>170</th>
+      <td>304930</td>
+      <td>1002</td>
+      <td>unturned</td>
+    </tr>
+    <tr>
+      <th>171</th>
+      <td>304930</td>
+      <td>0</td>
+      <td>unturned</td>
+    </tr>
+    <tr>
+      <th>...</th>
+      <td>...</td>
+      <td>...</td>
+      <td>...</td>
+    </tr>
+    <tr>
+      <th>269</th>
+      <td>304930</td>
+      <td>97</td>
+      <td>unturned</td>
+    </tr>
+    <tr>
+      <th>270</th>
+      <td>304930</td>
+      <td>768</td>
+      <td>unturned</td>
+    </tr>
+    <tr>
+      <th>271</th>
+      <td>304930</td>
+      <td>1570</td>
+      <td>unturned</td>
+    </tr>
+    <tr>
+      <th>272</th>
+      <td>304930</td>
+      <td>23</td>
+      <td>unturned</td>
+    </tr>
+    <tr>
+      <th>273</th>
+      <td>304930</td>
+      <td>115</td>
+      <td>unturned</td>
+    </tr>
+  </tbody>
+</table>
+<p>107 rows × 3 columns</p>
+</div>
+
+
+
+
+```python
+ax = plt.gca()
+ax.set_title('Game Play Distribution for Unturned')
+ax.boxplot(u_df['time']/60, vert=False,manage_ticks=False, notch=True)
+plt.xlabel("Game Play in Hours")
+ax.set_yticks([])
+plt.show()
+```
+
+![png](media/steamGames/output_34_0.png)
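+
+Box plots drawn on separate axes are hard to compare by eye, so a small table of quartiles can help. The sketch below uses made-up playtimes for the two games (not the full crawl data) and converts minutes to hours before summarizing.
+
+```python
+import pandas as pd
+
+# Made-up playtimes in minutes for two games, standing in for games_df.
+sample_df = pd.DataFrame({
+    "gamename": ["counter-strike: global offensive"] * 5 + ["unturned"] * 5,
+    "time": [0, 742, 1781, 16019, 103341, 0, 23, 115, 723, 1002],
+})
+
+# Convert to hours, then compute the quartiles of each game side by side.
+quartiles_df = (sample_df.assign(hours=sample_df["time"] / 60)
+                         .groupby("gamename")["hours"]
+                         .quantile([0.25, 0.5, 0.75])
+                         .unstack())
+print(quartiles_df)
+```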
+
+
+Next, I made a data frame containing just the raw data points of games that had an aggregate count of over 80. For a crawl of this size, a count above 80 makes a game "popular": since we only have 485 players indexed, having over 80 entries implies that over 16% of the people indexed own the game. Glancing at the results confirms that the games returned are indeed very popular.
+
+
+```python
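+# Keep only rows for games with more than 80 entries; copy so we do not modify a view of games_df.
+df1 = games_df[games_df['gamename'].map(games_df['gamename'].value_counts()) > 80].copy()
+df1['time'] = df1['time']/60  # convert minutes to hours
+df1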
+```
+
+<div>
+<style scoped>
+    .dataframe tbody tr th:only-of-type {
+        vertical-align: middle;
+    }
+
+    .dataframe tbody tr th {
+        vertical-align: top;
+    }
+
+    .dataframe thead th {
+        text-align: right;
+    }
+</style>
+<table border="1" class="dataframe">
+  <thead>
+    <tr style="text-align: right;">
+      <th></th>
+      <th>gameid</th>
+      <th>time</th>
+      <th>gamename</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <th>167</th>
+      <td>304930</td>
+      <td>2.333333</td>
+      <td>unturned</td>
+    </tr>
+    <tr>
+      <th>168</th>
+      <td>304930</td>
+      <td>12.050000</td>
+      <td>unturned</td>
+    </tr>
+    <tr>
+      <th>169</th>
+      <td>304930</td>
+      <td>16.700000</td>
+      <td>unturned</td>
+    </tr>
+    <tr>
+      <th>170</th>
+      <td>304930</td>
+      <td>16.700000</td>
+      <td>unturned</td>
+    </tr>
+    <tr>
+      <th>171</th>
+      <td>304930</td>
+      <td>0.000000</td>
+      <td>unturned</td>
+    </tr>
+    <tr>
+      <th>...</th>
+      <td>...</td>
+      <td>...</td>
+      <td>...</td>
+    </tr>
+    <tr>
+      <th>22682</th>
+      <td>578080</td>
+      <td>51.883333</td>
+      <td>playerunknown's battlegrounds</td>
+    </tr>
+    <tr>
+      <th>22683</th>
+      <td>578080</td>
+      <td>47.616667</td>
+      <td>playerunknown's battlegrounds</td>
+    </tr>
+    <tr>
+      <th>22684</th>
+      <td>578080</td>
+      <td>30.650000</td>
+      <td>playerunknown's battlegrounds</td>
+    </tr>
+    <tr>
+      <th>22685</th>
+      <td>578080</td>
+      <td>170.083333</td>
+      <td>playerunknown's battlegrounds</td>
+    </tr>
+    <tr>
+      <th>22686</th>
+      <td>578080</td>
+      <td>399.950000</td>
+      <td>playerunknown's battlegrounds</td>
+    </tr>
+  </tbody>
+</table>
+<p>1099 rows × 3 columns</p>
+</div>
+
+
+```python
+ax = df1.boxplot(column=["time"], by='gamename', notch=True, vert=False)
+fig = ax.get_figure()
+fig.suptitle('')
+ax.set_title('Play-time Distribution')
+plt.xlabel("Hours Played")
+ax.set_xlim([0, 2000])
+plt.ylabel("Game")
+plt.savefig("playTimes.png", dpi=300, bbox_inches = "tight")
+```
+
+![png](media/steamGames/output_38_0.png)
+
+Overall, it is fascinating to see how the distributions for different games vary. In the future, I will re-run some of these analytics with even more data and possibly put them on my website as an interactive graph.