{
|
|
"cells": [
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"This post dives head first into PyTorch: a powerful open source ML library for Python. \n",
|
|
"Google's flagship ML library is TensorFlow, where Facebook developed PyTorch. \n",
|
|
"Researchers are gravitating towards PyTorch due to its flexibility and efficiency. This tutorial goes over the basics of PyTorch including tensors and a simple perceptron. \n",
|
|
"This tutorial requires knowledge of Python, Numpy and neural networks.\n",
|
|
"If you don't know anything about neural networks, I suggest that you watch this amazing video by 3Blue1Brown:\n",
|
|
"\n",
|
|
"<youtube src=\"aircAruvnKk\" />"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"You can install both numpy and pytorch using pip. Check out the [PyTorch](https://pytorch.org/) website to get the version of pytorch that works with your version of CUDA installed on your computer. If you don't have an NVIDA GPU with CUDA, you can still run PyTorch, but, you won't be able to run your programs on the GPU."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"!pip install numpy\n",
|
|
"!pip install torch torchvision"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"Pro tip: if you are in a notebook adding a \"!\" will execute your command on the terminal. "
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 24,
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"/home/jeff/Documents/python\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"!pwd"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"# Tensors\n",
|
|
"\n",
|
|
"The core of the PyTorch library is centered around Tensors. Tensors are analogous to Numpy matricies, however, the benefit of tensors is their ability to get placed on the GPU. Tensors also allows you to do auto gradients which makes doing back propagation in neural networks a lot faster. \n",
|
|
"\n",
|
|
"Creating an empty tensor is similar to creating a new C array: anything can be in the memory that you grabbed so don't expect it to be zero, ones, or \"random\"."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 6,
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"tensor([[4.8132e-36, 4.5597e-41],\n",
|
|
" [1.4906e-11, 3.0957e-41],\n",
|
|
" [4.4842e-44, 0.0000e+00],\n",
|
|
" [8.9683e-44, 0.0000e+00],\n",
|
|
" [7.7759e-13, 3.0957e-41]])"
|
|
]
|
|
},
|
|
"execution_count": 6,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"import torch\n",
|
|
"\n",
|
|
"torch.empty(5, 2)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"If you explicitely create a random matrix, you will get values between zero and one. "
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 5,
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"tensor([[0.7825, 0.7864, 0.1257],\n",
|
|
" [0.7588, 0.6572, 0.9262],\n",
|
|
" [0.4881, 0.6329, 0.3424],\n",
|
|
" [0.1333, 0.4235, 0.6760],\n",
|
|
" [0.9737, 0.6657, 0.9946]])\n",
|
|
"torch.Size([5, 3])\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"x = torch.rand(5, 3)\n",
|
|
"print(x)\n",
|
|
"print(x.shape)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"Similarly there is a function for random integers."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 8,
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"tensor([[4, 2, 1],\n",
|
|
" [2, 0, 3],\n",
|
|
" [2, 2, 2]])"
|
|
]
|
|
},
|
|
"execution_count": 8,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"torch.randint(low=0, high=5, size=(3,3))"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 25,
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"tensor([[1.],\n",
|
|
" [1.],\n",
|
|
" [1.]])"
|
|
]
|
|
},
|
|
"execution_count": 25,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"torch.ones(3,1)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"Similar to numpy you can also specify an empty array filled with zeros and specify a data type.\n",
|
|
"\n",
|
|
"Common data types:\n",
|
|
"- torch.long\n",
|
|
"- torch.bool\n",
|
|
"- torch.float\n",
|
|
"- torch.int\n",
|
|
"- torch.int8\n",
|
|
"- torch.int16\n",
|
|
"- torch.int32\n",
|
|
"- torch.int64\n",
|
|
"- torch.double"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 26,
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"tensor([[0, 0],\n",
|
|
" [0, 0],\n",
|
|
" [0, 0],\n",
|
|
" [0, 0],\n",
|
|
" [0, 0]])\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"x = torch.zeros(5, 2, dtype=torch.long)\n",
|
|
"print(x)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"Size returns a tuple. In PyTorch it is common to do .size(), however .shape will return the same thing."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 29,
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"torch.Size([5, 2])\n",
|
|
"torch.Size([5, 2])\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"print(x.size())\n",
|
|
"print(x.shape)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"Like Numpy, Pytorch suports a ton of operators on a Tensor.\n",
|
|
"\n",
|
|
"Check out the documentation at the [official website](https://pytorch.org/docs/stable/tensors.html)."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 39,
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"tensor([[0.4942, 0.7370],\n",
|
|
" [0.9927, 0.7068],\n",
|
|
" [0.1702, 0.9578],\n",
|
|
" [0.6510, 0.4992],\n",
|
|
" [0.2482, 0.4928]])"
|
|
]
|
|
},
|
|
"execution_count": 39,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"y = torch.rand(5,2)\n",
|
|
"\n",
|
|
"x + y # same as torch.add(x, y)\n",
|
|
"\n",
|
|
"result = torch.add(x, y)\n",
|
|
"\n",
|
|
"torch.add(x, y, out= result) "
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"Pytorch added a functions with \"_\" for common operators that performs the operation on the calling tensor."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 40,
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"tensor([[0.9885, 1.4740],\n",
|
|
" [1.9855, 1.4135],\n",
|
|
" [0.3405, 1.9155],\n",
|
|
" [1.3020, 0.9984],\n",
|
|
" [0.4964, 0.9856]])"
|
|
]
|
|
},
|
|
"execution_count": 40,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"# adds result to y\n",
|
|
"y.add_(result)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 43,
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"tensor(0.9885)\n",
|
|
"tensor(0.9885)\n",
|
|
"0.98846435546875\n",
|
|
"tensor([0.9885, 1.9855, 0.3405, 1.3020, 0.4964])\n",
|
|
"tensor([1.9855, 1.4135])\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"print(y[0][0]) # first element as a tensor\n",
|
|
"print(y[0, 0]) # same thing as y[0][0]\n",
|
|
"print(y[0][0].item()) # grabs data inside tensor\n",
|
|
"print(y[:, 0]) # gets first col\n",
|
|
"print(y[1, :]) # gets second row"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"You can resize the tensor using the view function."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 45,
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"tensor([[0.9885, 1.4740, 1.9855, 1.4135, 0.3405, 1.9155, 1.3020, 0.9984, 0.4964,\n",
|
|
" 0.9856]])\n",
|
|
"tensor([[0.9885, 1.4740, 1.9855, 1.4135, 0.3405],\n",
|
|
" [1.9155, 1.3020, 0.9984, 0.4964, 0.9856]])\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"print(y.view(1,10))\n",
|
|
"\n",
|
|
"print(y.view(2,5))"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"## CUDA\n",
|
|
"\n",
|
|
"One of the great things about PyTorch is that you can run everything on either the GPU or the CPU. To make code more flexible to run on either devices, most people set the device dynamically. Keeping your devices consistent is important because you can't do operations to a \"cuda\" tensor by a \"cpu\" tensor. This makes sence because one is on the GPU's memory where the other is in the computers main memory -- RAM. "
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 9,
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"True"
|
|
]
|
|
},
|
|
"execution_count": 9,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"torch.cuda.is_available() # prints if CUDA is available on system"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 14,
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"tensor([[0.7825, 0.7864, 0.1257],\n",
|
|
" [0.7588, 0.6572, 0.9262],\n",
|
|
" [0.4881, 0.6329, 0.3424],\n",
|
|
" [0.1333, 0.4235, 0.6760],\n",
|
|
" [0.9737, 0.6657, 0.9946]], device='cuda:0')"
|
|
]
|
|
},
|
|
"execution_count": 14,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"device = torch.device(\"cpu\")\n",
|
|
"if torch.cuda.is_available():\n",
|
|
" device = torch.device(\"cuda\")\n",
|
|
"x.to(device) # puts the x matrix on device selected"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"# let us run this cell only if CUDA is available\n",
|
|
"# We will use ``torch.device`` objects to move tensors in and out of GPU\n",
|
|
"if torch.cuda.is_available():\n",
|
|
" device = torch.device(\"cuda\") # a CUDA device object\n",
|
|
" y = torch.ones_like(x, device=device) # directly create a tensor on GPU\n",
|
|
" x = x.to(device) # or just use strings ``.to(\"cuda\")``\n",
|
|
" z = x + y\n",
|
|
" print(z)\n",
|
|
" print(z.to(\"cpu\", torch.double)) # ``.to`` can also change dtype together!"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"## NumPy to Tensor"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"It is possible to switch between Numpy arrays and Tensors. Note, that this is now a shadow reference. Anything done to the numpy array will be reflected in the original tensor and vice versa. "
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 16,
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"tensor([0., 0., 0., 0., 0.], dtype=torch.float64)\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"import numpy as np\n",
|
|
"\n",
|
|
"g = np.zeros(5)\n",
|
|
"\n",
|
|
"gg = torch.from_numpy(g)\n",
|
|
"\n",
|
|
"print(gg)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"## CUDA Performance\n",
|
|
"\n",
|
|
"Without question the performance of matrix operations on the GPU is lightyears faster than on the CPU. The following code is an example of the "
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 17,
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"1.8906972408294678\n",
|
|
"0.003466367721557617\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"import time # times in seconds\n",
|
|
"def time_torch(size):\n",
|
|
" x = torch.rand(size, size, device=torch.device(\"cuda\"))\n",
|
|
" start = time.time()\n",
|
|
" x.sin_()\n",
|
|
" end = time.time()\n",
|
|
" return(end - start)\n",
|
|
"\n",
|
|
"def time_numpy(size):\n",
|
|
" x = np.random.rand(size, size)\n",
|
|
" start = time.time()\n",
|
|
" np.sin(x, out=x)\n",
|
|
" end = time.time()\n",
|
|
" return(end - start)\n",
|
|
"\n",
|
|
"print(time_numpy(10000))\n",
|
|
"print(time_torch(10000))"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"On the CPU it took 1.9 seconds to take the sin of a 10k by 10k matrix where on my GPU (Nvidia 1060) it only took 0.003 seconds!\n",
|
|
"It is worth pointing out that there is overhead when transfering matricies from the GPU's memory to main memory. \n",
|
|
"For this reason when designing algorithms you avoid this exchange and try to keep everything on the GPU for as long as possible."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"# Basic Perceptron\n",
|
|
"\n",
|
|
"Now that we have seen Tensors, we can walk into a basic neural network example.\n",
|
|
"In this example we are simply going to be doing linear regression.\n",
|
|
"IE: our algorihtm takes in a single input and tries to predict the output using the equation:\n",
|
|
"\n",
|
|
"$$\n",
|
|
"y = mx+b\n",
|
|
"$$\n",
|
|
"\n",
|
|
"The x is our weight and the m is the weight and the b is the bias.\n",
|
|
"Since this is linear we are not including an activation function."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 20,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"import torch\n",
|
|
"from torch.autograd import Variable\n",
|
|
"import torch.nn as nn\n",
|
|
"import torch.nn.functional as F"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"There is a lot to be said about how you define a neural network in PyTorch, however, most follow this basic example.\n",
|
|
"The constructor creates each layer and the forward function defines how data is calculated as it gets pushed through the network."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 21,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"class Net(nn.Module):\n",
|
|
" def __init__(self):\n",
|
|
" super(Net, self).__init__()\n",
|
|
" self.fc1 = nn.Linear(1,1)\n",
|
|
" def forward(self, x):\n",
|
|
" x = self.fc1(x)\n",
|
|
" return x"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 22,
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"Net(\n",
|
|
" (fc1): Linear(in_features=1, out_features=1, bias=True)\n",
|
|
")\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"net = Net()\n",
|
|
"net.cuda() # puts the NN on the GPU\n",
|
|
"print(net) # displays NN structure"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"Printing the network like this is useful because you can see the dimensions of the data going in and out of the neural net."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 47,
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"[Parameter containing:\n",
|
|
"tensor([[0.2431]], device='cuda:0', requires_grad=True), Parameter containing:\n",
|
|
"tensor([0.3372], device='cuda:0', requires_grad=True)]\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"print(list(net.parameters()))"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"As to be expected, we see that the NN has two parameters initialized to random values.\n",
|
|
"\n",
|
|
"Next, we have to define our loss function before we can train.\n",
|
|
"Our loss is essentially how incorrect our prediction made by the network was.\n",
|
|
"For this example we are doing the squared difference this can be changed to other things.\n",
|
|
"The important thing is that the loss is positive so that we can do back propagation on it. "
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 48,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"def criterion(out, label):\n",
|
|
" return (label - out)**2"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"The optimizer is a special object by Pytorch that will adjust the weights of the neural network based on how they affected the gradient of the error."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 49,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"import torch.optim as optim\n",
|
|
"optimizer = optim.SGD(net.parameters(), lr=0.01, momentum=0.5)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"The dummy data simply follows the equation y = 3x + 0."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 50,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"data = [(1,3), (2,6), (3,9), (4,12), (5,15), (6,18)]"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"All training loops looks something like this. An epoch is simply how many iterations the neural network looks at the data. Optionally people will include an other outer loop for \"runs\" this will simply run this training process multiple times to see if we are always converging on the same answer or are we getting stuck at local minimums."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 51,
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"Epoch 0 - loss: 5.854729652404785\n",
|
|
"Epoch 1 - loss: 2.294259548187256\n",
|
|
"Epoch 2 - loss: 0.5001814961433411\n",
|
|
"Epoch 3 - loss: 0.8155164122581482\n",
|
|
"Epoch 4 - loss: 0.6028059720993042\n",
|
|
"Epoch 5 - loss: 0.5850293040275574\n",
|
|
"Epoch 6 - loss: 0.5123676657676697\n",
|
|
"Epoch 7 - loss: 0.4659935534000397\n",
|
|
"Epoch 8 - loss: 0.4179149270057678\n",
|
|
"Epoch 9 - loss: 0.3767469823360443\n",
|
|
"Epoch 10 - loss: 0.3389827311038971\n",
|
|
"Epoch 11 - loss: 0.3052201569080353\n",
|
|
"Epoch 12 - loss: 0.2747483551502228\n",
|
|
"Epoch 13 - loss: 0.2473425269126892\n",
|
|
"Epoch 14 - loss: 0.2226628214120865\n",
|
|
"Epoch 15 - loss: 0.2004476934671402\n",
|
|
"Epoch 16 - loss: 0.18044869601726532\n",
|
|
"Epoch 17 - loss: 0.16244502365589142\n",
|
|
"Epoch 18 - loss: 0.14623744785785675\n",
|
|
"Epoch 19 - loss: 0.13164710998535156\n",
|
|
"Epoch 20 - loss: 0.11851246654987335\n",
|
|
"Epoch 21 - loss: 0.10668816417455673\n",
|
|
"Epoch 22 - loss: 0.09604380279779434\n",
|
|
"Epoch 23 - loss: 0.08646118640899658\n",
|
|
"Epoch 24 - loss: 0.07783503830432892\n",
|
|
"Epoch 25 - loss: 0.0700690820813179\n",
|
|
"Epoch 26 - loss: 0.06307818740606308\n",
|
|
"Epoch 27 - loss: 0.056784771382808685\n",
|
|
"Epoch 28 - loss: 0.051119253039360046\n",
|
|
"Epoch 29 - loss: 0.046018924564123154\n",
|
|
"Epoch 30 - loss: 0.041427500545978546\n",
|
|
"Epoch 31 - loss: 0.0372944138944149\n",
|
|
"Epoch 32 - loss: 0.03357337787747383\n",
|
|
"Epoch 33 - loss: 0.03022376075387001\n",
|
|
"Epoch 34 - loss: 0.02720830962061882\n",
|
|
"Epoch 35 - loss: 0.0244936253875494\n",
|
|
"Epoch 36 - loss: 0.022049833089113235\n",
|
|
"Epoch 37 - loss: 0.019849959760904312\n",
|
|
"Epoch 38 - loss: 0.017869414761662483\n",
|
|
"Epoch 39 - loss: 0.016086600720882416\n",
|
|
"Epoch 40 - loss: 0.014481627382338047\n",
|
|
"Epoch 41 - loss: 0.013036711141467094\n",
|
|
"Epoch 42 - loss: 0.011736114509403706\n",
|
|
"Epoch 43 - loss: 0.010565121658146381\n",
|
|
"Epoch 44 - loss: 0.00951105635613203\n",
|
|
"Epoch 45 - loss: 0.008562112227082253\n",
|
|
"Epoch 46 - loss: 0.007707839831709862\n",
|
|
"Epoch 47 - loss: 0.006938829552382231\n",
|
|
"Epoch 48 - loss: 0.00624650064855814\n",
|
|
"Epoch 49 - loss: 0.005623326636850834\n",
|
|
"Epoch 50 - loss: 0.005062263924628496\n",
|
|
"Epoch 51 - loss: 0.004557166714221239\n",
|
|
"Epoch 52 - loss: 0.0041025192476809025\n",
|
|
"Epoch 53 - loss: 0.003693199949339032\n",
|
|
"Epoch 54 - loss: 0.003324714954942465\n",
|
|
"Epoch 55 - loss: 0.0029929918237030506\n",
|
|
"Epoch 56 - loss: 0.002694392576813698\n",
|
|
"Epoch 57 - loss: 0.002425551414489746\n",
|
|
"Epoch 58 - loss: 0.0021835630759596825\n",
|
|
"Epoch 59 - loss: 0.001965709263458848\n",
|
|
"Epoch 60 - loss: 0.0017695967108011246\n",
|
|
"Epoch 61 - loss: 0.0015930046793073416\n",
|
|
"Epoch 62 - loss: 0.001434113597497344\n",
|
|
"Epoch 63 - loss: 0.0012909761862829328\n",
|
|
"Epoch 64 - loss: 0.0011621960438787937\n",
|
|
"Epoch 65 - loss: 0.0010462489444762468\n",
|
|
"Epoch 66 - loss: 0.0009418732952326536\n",
|
|
"Epoch 67 - loss: 0.0008478892268612981\n",
|
|
"Epoch 68 - loss: 0.000763303367421031\n",
|
|
"Epoch 69 - loss: 0.0006871427176520228\n",
|
|
"Epoch 70 - loss: 0.0006185721722431481\n",
|
|
"Epoch 71 - loss: 0.0005568634951487184\n",
|
|
"Epoch 72 - loss: 0.0005012964247725904\n",
|
|
"Epoch 73 - loss: 0.0004512994782999158\n",
|
|
"Epoch 74 - loss: 0.00040626057307235897\n",
|
|
"Epoch 75 - loss: 0.000365746789611876\n",
|
|
"Epoch 76 - loss: 0.00032922677928581834\n",
|
|
"Epoch 77 - loss: 0.0002963977458421141\n",
|
|
"Epoch 78 - loss: 0.0002668169909156859\n",
|
|
"Epoch 79 - loss: 0.00024020038836169988\n",
|
|
"Epoch 80 - loss: 0.00021623533393722028\n",
|
|
"Epoch 81 - loss: 0.00019466542289592326\n",
|
|
"Epoch 82 - loss: 0.00017523708811495453\n",
|
|
"Epoch 83 - loss: 0.00015775684732943773\n",
|
|
"Epoch 84 - loss: 0.0001420119369868189\n",
|
|
"Epoch 85 - loss: 0.00012784828140866011\n",
|
|
"Epoch 86 - loss: 0.00011509257456054911\n",
|
|
"Epoch 87 - loss: 0.00010360320447944105\n",
|
|
"Epoch 88 - loss: 9.327425505034626e-05\n",
|
|
"Epoch 89 - loss: 8.396316115977243e-05\n",
|
|
"Epoch 90 - loss: 7.559276855317876e-05\n",
|
|
"Epoch 91 - loss: 6.804279837524518e-05\n",
|
|
"Epoch 92 - loss: 6.125887739472091e-05\n",
|
|
"Epoch 93 - loss: 5.5145825172076e-05\n",
|
|
"Epoch 94 - loss: 4.9642534577287734e-05\n",
|
|
"Epoch 95 - loss: 4.468947372515686e-05\n",
|
|
"Epoch 96 - loss: 4.02352525270544e-05\n",
|
|
"Epoch 97 - loss: 3.622113945311867e-05\n",
|
|
"Epoch 98 - loss: 3.2605526939732954e-05\n",
|
|
"Epoch 99 - loss: 2.934764779638499e-05\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"for epoch in range(100):\n",
|
|
" for i, data2 in enumerate(data):\n",
|
|
" X, Y = iter(data2)\n",
|
|
" X, Y = Variable(torch.FloatTensor([X]), requires_grad=True).cuda(), Variable(torch.FloatTensor([Y])).cuda()\n",
|
|
" optimizer.zero_grad()\n",
|
|
" outputs = net(X)\n",
|
|
" loss = criterion(outputs, Y)\n",
|
|
" loss.backward()\n",
|
|
" optimizer.step()\n",
|
|
" if (i % 10 == 0):\n",
|
|
" print(\"Epoch {} - loss: {}\".format(epoch, loss.data[0]))"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 52,
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"[Parameter containing:\n",
|
|
"tensor([[2.9989]], device='cuda:0', requires_grad=True), Parameter containing:\n",
|
|
"tensor([0.0063], device='cuda:0', requires_grad=True)]\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"print(list(net.parameters()))"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"We can see that our NN has extimated the data generated by \"y= 3x\" to be \"y= 2.999x + 0.006\".\n",
|
|
"\n",
|
|
"We can now use this network to make predictions.\n",
|
|
"Note: the shape of the input has to compy to the forward function and the device of your input tensor must be the same device that the network is on. "
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 58,
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"tensor([[[1.]]], device='cuda:0')\n",
|
|
"tensor([[[3.0051]]], device='cuda:0', grad_fn=<AddBackward0>)\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"input = Variable(torch.ones(1,1,1).cuda())\n",
|
|
"print(input)\n",
|
|
"print(net(input))"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"From this example, we can easily create more complex neural networks. By adding a few lines of code we can create a multi-layer perceptron. "
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 64,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"class Net(nn.Module):\n",
|
|
" def __init__(self):\n",
|
|
" super(Net, self).__init__()\n",
|
|
" self.fc1 = nn.Linear(1,10)\n",
|
|
" self.fc2 = nn.Linear(10,1)\n",
|
|
" def forward(self, x):\n",
|
|
" x = self.fc2(self.fc1(x))\n",
|
|
" return x"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 73,
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"Net(\n",
|
|
" (fc1): Linear(in_features=1, out_features=10, bias=True)\n",
|
|
" (fc2): Linear(in_features=10, out_features=1, bias=True)\n",
|
|
")"
|
|
]
|
|
},
|
|
"execution_count": 73,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"net = Net().cuda()\n",
|
|
"net"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 74,
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"Epoch 0 - loss: 7.190098285675049\n",
|
|
"Epoch 1 - loss: 1.51701192407927e-06\n",
|
|
"Epoch 2 - loss: 0.1253584325313568\n",
|
|
"Epoch 3 - loss: 0.5402220487594604\n",
|
|
"Epoch 4 - loss: 1.1704645156860352\n",
|
|
"Epoch 5 - loss: 1.7808796167373657\n",
|
|
"Epoch 6 - loss: 2.0603320598602295\n",
|
|
"Epoch 7 - loss: 1.8789410591125488\n",
|
|
"Epoch 8 - loss: 1.4110114574432373\n",
|
|
"Epoch 9 - loss: 0.9053158164024353\n",
|
|
"Epoch 10 - loss: 0.497363805770874\n",
|
|
"Epoch 11 - loss: 0.22036807239055634\n",
|
|
"Epoch 12 - loss: 0.06279802322387695\n",
|
|
"Epoch 13 - loss: 0.0023560039699077606\n",
|
|
"Epoch 14 - loss: 0.019712744280695915\n",
|
|
"Epoch 15 - loss: 0.10364333540201187\n",
|
|
"Epoch 16 - loss: 0.2530903220176697\n",
|
|
"Epoch 17 - loss: 0.4757329523563385\n",
|
|
"Epoch 18 - loss: 0.7704269289970398\n",
|
|
"Epoch 19 - loss: 1.0567550659179688\n",
|
|
"Epoch 20 - loss: 1.0800272226333618\n",
|
|
"Epoch 21 - loss: 0.697075366973877\n",
|
|
"Epoch 22 - loss: 0.27709755301475525\n",
|
|
"Epoch 23 - loss: 0.05667449161410332\n",
|
|
"Epoch 24 - loss: 6.271171878324822e-05\n",
|
|
"Epoch 25 - loss: 0.024447228759527206\n",
|
|
"Epoch 26 - loss: 0.06642721593379974\n",
|
|
"Epoch 27 - loss: 0.09308802336454391\n",
|
|
"Epoch 28 - loss: 0.09632419049739838\n",
|
|
"Epoch 29 - loss: 0.08166800439357758\n",
|
|
"Epoch 30 - loss: 0.0587223619222641\n",
|
|
"Epoch 31 - loss: 0.03566744551062584\n",
|
|
"Epoch 32 - loss: 0.017448360100388527\n",
|
|
"Epoch 33 - loss: 0.005931335035711527\n",
|
|
"Epoch 34 - loss: 0.0007410198450088501\n",
|
|
"Epoch 35 - loss: 0.00021591292170342058\n",
|
|
"Epoch 36 - loss: 0.0022106748074293137\n",
|
|
"Epoch 37 - loss: 0.004746304824948311\n",
|
|
"Epoch 38 - loss: 0.00644687982276082\n",
|
|
"Epoch 39 - loss: 0.006729086861014366\n",
|
|
"Epoch 40 - loss: 0.005728728137910366\n",
|
|
"Epoch 41 - loss: 0.004019442945718765\n",
|
|
"Epoch 42 - loss: 0.002261112444102764\n",
|
|
"Epoch 43 - loss: 0.0009276010096073151\n",
|
|
"Epoch 44 - loss: 0.000198100067791529\n",
|
|
"Epoch 45 - loss: 1.51349013322033e-08\n",
|
|
"Epoch 46 - loss: 0.00012611271813511848\n",
|
|
"Epoch 47 - loss: 0.0003555955190677196\n",
|
|
"Epoch 48 - loss: 0.0005311298882588744\n",
|
|
"Epoch 49 - loss: 0.0005841966485604644\n",
|
|
"Epoch 50 - loss: 0.0005200451705604792\n",
|
|
"Epoch 51 - loss: 0.00038515732740052044\n",
|
|
"Epoch 52 - loss: 0.00023477006470784545\n",
|
|
"Epoch 53 - loss: 0.00011088581231888384\n",
|
|
"Epoch 54 - loss: 3.358204776304774e-05\n",
|
|
"Epoch 55 - loss: 2.3553052415081766e-06\n",
|
|
"Epoch 56 - loss: 3.887686034431681e-06\n",
|
|
"Epoch 57 - loss: 2.055288859992288e-05\n",
|
|
"Epoch 58 - loss: 3.7825520848855376e-05\n",
|
|
"Epoch 59 - loss: 4.7023870138218626e-05\n",
|
|
"Epoch 60 - loss: 4.590576281771064e-05\n",
|
|
"Epoch 61 - loss: 3.690157973323949e-05\n",
|
|
"Epoch 62 - loss: 2.4580915123806335e-05\n",
|
|
"Epoch 63 - loss: 1.3046843378106132e-05\n",
|
|
"Epoch 64 - loss: 4.940734925185097e-06\n",
|
|
"Epoch 65 - loss: 8.360089509551472e-07\n",
|
|
"Epoch 66 - loss: 3.7756819892820204e-08\n",
|
|
"Epoch 67 - loss: 1.0964904504362494e-06\n",
|
|
"Epoch 68 - loss: 2.620714667500579e-06\n",
|
|
"Epoch 69 - loss: 3.7055926895845914e-06\n",
|
|
"Epoch 70 - loss: 3.951881808461621e-06\n",
|
|
"Epoch 71 - loss: 3.42826956511999e-06\n",
|
|
"Epoch 72 - loss: 2.441704282318824e-06\n",
|
|
"Epoch 73 - loss: 1.4376178114616778e-06\n",
|
|
"Epoch 74 - loss: 6.314672305052227e-07\n",
|
|
"Epoch 75 - loss: 1.5456868140972801e-07\n",
|
|
"Epoch 76 - loss: 2.482977379258955e-09\n",
|
|
"Epoch 77 - loss: 4.905820105705061e-08\n",
|
|
"Epoch 78 - loss: 1.7209913494298235e-07\n",
|
|
"Epoch 79 - loss: 2.890502059926803e-07\n",
|
|
"Epoch 80 - loss: 3.2851039577508345e-07\n",
|
|
"Epoch 81 - loss: 3.0332216738315765e-07\n",
|
|
"Epoch 82 - loss: 2.374881660216488e-07\n",
|
|
"Epoch 83 - loss: 1.5010215292932116e-07\n",
|
|
"Epoch 84 - loss: 7.50447384234576e-08\n",
|
|
"Epoch 85 - loss: 2.213346306234598e-08\n",
|
|
"Epoch 86 - loss: 1.6427748050773516e-09\n",
|
|
"Epoch 87 - loss: 2.1614710021822248e-09\n",
|
|
"Epoch 88 - loss: 1.1408701539039612e-08\n",
|
|
"Epoch 89 - loss: 1.978719410544727e-08\n",
|
|
"Epoch 90 - loss: 2.7853275241795927e-08\n",
|
|
"Epoch 91 - loss: 2.675028554222081e-08\n",
|
|
"Epoch 92 - loss: 2.1639664282702142e-08\n",
|
|
"Epoch 93 - loss: 1.3042210866842652e-08\n",
|
|
"Epoch 94 - loss: 6.844459221611032e-09\n",
|
|
"Epoch 95 - loss: 3.3565470403118525e-09\n",
|
|
"Epoch 96 - loss: 5.913989298278466e-10\n",
|
|
"Epoch 97 - loss: 1.8417267710901797e-11\n",
|
|
"Epoch 98 - loss: 2.3283064365386963e-10\n",
|
|
"Epoch 99 - loss: 7.130438461899757e-10\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"criterion = nn.MSELoss() # mean squared error for loss function\n",
|
|
"optimizer = optim.SGD(net.parameters(), lr=0.01, momentum=0.01)\n",
|
|
"for epoch in range(100):\n",
|
|
" for i, data2 in enumerate(data):\n",
|
|
" X, Y = iter(data2)\n",
|
|
" X, Y = Variable(torch.FloatTensor([X]), requires_grad=True).cuda(), Variable(torch.FloatTensor([Y])).cuda()\n",
|
|
" optimizer.zero_grad()\n",
|
|
" outputs = net(X)\n",
|
|
" loss = criterion(outputs, Y)\n",
|
|
" loss.backward()\n",
|
|
" optimizer.step()\n",
|
|
" if (i % 10 == 0):\n",
|
|
" print(\"Epoch {} - loss: {}\".format(epoch, loss.data))"
|
|
]
|
|
}
|
|
],
|
|
"metadata": {
|
|
"kernelspec": {
|
|
"display_name": "ml",
|
|
"language": "python",
|
|
"name": "ml"
|
|
},
|
|
"language_info": {
|
|
"codemirror_mode": {
|
|
"name": "ipython",
|
|
"version": 3
|
|
},
|
|
"file_extension": ".py",
|
|
"mimetype": "text/x-python",
|
|
"name": "python",
|
|
"nbconvert_exporter": "python",
|
|
"pygments_lexer": "ipython3",
|
|
"version": "3.8.3"
|
|
}
|
|
},
|
|
"nbformat": 4,
|
|
"nbformat_minor": 4
|
|
}
|