Personal blog written from scratch using Node.js, Bootstrap, and MySQL. https://jrtechs.net

It is not uncommon for me to get exuberantly excited over an
open-source project that I stumble upon, and Jupyter Lab has taken the
cake this month. The Jupyter project grew out of the browser-based
IPython notebook and added support for multiple languages.

# Why Notebooks?

As a researcher, I love notebooks because they enable you to easily
share your code with others. Notebooks are much more interactive than
simply sharing source code because you can mix text (Markdown), code,
and the output of code execution. For classes and at work, this makes
it very easy to generate quick reports. You can simply write a
document that auto-generates the graphs and figures you want to talk
about.

Last week I worked on a computer vision assignment that required me to
use OpenCV to manipulate images using filters, convolutions, etc. The
entirety of the assignment required me to produce roughly 30 images. A
majority of the class wrote Python scripts and threw each image they
generated into a massive Word document. They then added their analysis
and submitted their assignment as a PDF alongside a bunch of Python
scripts. There is nothing wrong with doing that; however, what happens
if at the end of the assignment you realize that you were generating
Gaussian filters incorrectly? If you wrote everything in a Jupyter
notebook, you would just have to fix the dubious code and re-run the
notebook, and it would produce your report in its entirety. But if you
had your scripts as separate files, you would have to fix your code and
then go through and generate a dozen new images to update your report.

The ability to accurately reproduce your report is paramount to making
research more verifiable and reproducible. This is something that the
R and open-science communities heavily focus on. Directly mixing your
code and analysis with your report is very useful when trying to
explain things unambiguously. Consider what happens if the data that
you are working with changes halfway through writing your research
report. With a notebook, you would just have to re-run the notebook,
whereas if you had the report as a separate Word or LaTeX file, you
would run the risk of misreporting your results.

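
The re-run step can also be done from the command line. The sketch
below is a minimal example, not something from my own setup: it assumes
a notebook named `report.ipynb` (a hypothetical file name) and uses
`jupyter nbconvert`, which is normally installed alongside Jupyter, to
execute every cell in order and export the result as HTML. The helper
name `rebuild_report` is made up for illustration.

```bash
# Hypothetical helper: re-execute a notebook top to bottom and export it as HTML.
# The function name and the default "report.ipynb" are illustrative assumptions.
rebuild_report() {
    local notebook="${1:-report.ipynb}"
    # --execute runs every cell in order before converting;
    # --to html selects the HTML exporter.
    jupyter nbconvert --to html --execute "$notebook"
}
```

Calling `rebuild_report assignment.ipynb` after fixing the code should
regenerate the whole report in one step; by default the HTML file lands
next to the notebook.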
# Jupyter Notebook

When you run Jupyter Notebook, it starts a local server and opens your
default web browser. From the browser, you can view the files in your
current directory and choose one to edit. The one that you pick will
open in a new window.

![Jupyter Notebook home](media/jupyter/jupyterHome.png)

In this notebook view, you can add snippets, which Jupyter calls
cells. Each snippet can be either code, Markdown, or raw text. You can
run snippets or rearrange them however you please.

The concept of snippets introduces the final and most compelling
reason to use notebooks. Although you should be able to execute your
notebook by running all snippets sequentially, you don't have to
follow that order. Plus, the results of running snippets stay in your
"workspace" (the memory of the running kernel) between runs. This
means that you don't have to re-compute your costly computations every
time you sit down to program, as long as the kernel keeps running. You
can load a large dataset, run complex computations, store the results
in a variable, and then access that variable the next day. This
enables quick R&D because in a traditional setting you would have to
consider building out infrastructure like databases to store your
temporary computations.

![Jupyter Notebook preview](media/jupyter/notebookPreview.png)

# Jupyter Lab

Although notebooks have been around for quite some time, I got hooked
on Jupyter Lab because it brings the entire ecosystem together very
nicely. With Jupyter Notebook, you could only have one notebook open
in a single web browser tab. If you wanted multiple notebooks, you had
to open multiple windows. Jupyter Lab has a built-in window manager
that lets you view files, notebooks, terminals, and other file formats
all in the same browser tab!

In case you missed it, in Jupyter Lab you can launch terminals! This
is important for a development environment to have because it enables
you to run any program that is on your computer. I find this
particularly useful when I am running Jupyter Lab on a remote computer
and I want to use git.

Jupyter Lab also has built-in light and dark themes you can use.

![light theme](media/jupyter/light.png)

![dark theme](media/jupyter/darkTheme.png)

# Running and Installing

Since the instructions will probably change, I'm just going to link to
the website where you can install Jupyter Lab from:
[https://jupyter.org/install.html](https://jupyter.org/install.html)

The installation is essentially just a pip install command.

```bash
pip install jupyterlab
```

Running Jupyter Lab is also a single command:

```bash
jupyter lab
```

# Running for remote use

Imagine that you are running an old computer and you simply want your
code to run on a remote computer that has a beefier GPU for ML. With
Jupyter Lab or Notebook you can do that, but it takes a little
trickery. The easiest solution that I found involves SSH port
forwarding.

![network diagram](media/jupyter/network.jpg)

The first thing that you want to do is set up a password so that you
can connect to the Jupyter Lab instance using a password rather than
the authentication token that gets printed in the terminal.

```bash
jupyter notebook password
```

**Note:** the password that you set is stored in a config that is shared by both Jupyter Lab and Jupyter Notebook.

The next thing you should do is run the Jupyter Lab instance on the
port that you want it to listen on.

```bash
jupyter lab --no-browser --port=6000
```

The `--no-browser` flag prevents Jupyter from opening your default
web browser.

The next step is to set up a local SSH port forward on your machine so
you can access the Jupyter instance on the remote server. The benefit
of doing this is that you can reach a server that sits behind a
firewall, as long as SSH is allowed through, and that all your traffic
is encrypted.

![local port forwarding](media/jupyter/localForward.png)

The image above comes from my presentation on "[Everything
SSH](https://jrtechs.net/open-source/teaching-ssh-through-a-ctf)". The
essence of the command below is that every connection to port 6000 on
your machine is forwarded through the SSH session to localhost:6000 on
the remote server.

```bash
ssh -L 6000:localhost:6000 user@some-remote-host.rit.edu
```

After you run that command, you can access the Jupyter Lab instance by
opening your favorite web browser and going to localhost:6000. Some
browsers, including Chrome and Firefox, treat port 6000 as unsafe and
may refuse to connect; if that happens, use a different port in both
the `jupyter lab` and `ssh` commands. Typing the full ssh command every
time is tedious, so I recommend that you alias it in your shell's
config file.

```bash
alias jj="ssh -L 6000:localhost:6000 user@some-remote-host.rit.edu"
```

Now all you have to type in your command prompt is "jj" to connect to
your remote Jupyter server. Neat.

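
If the port changes often, a shell function can take it as an argument
where an alias cannot. The following is a small sketch rather than what
I actually run: the function names `jj_port` and `tunnel_is_up` are
made up, the host is the same placeholder used in the alias, and the
check assumes `curl` is installed. Running `jj_port 8888` in one
terminal and `tunnel_is_up 8888` in a second one would open the forward
and then test it.

```bash
# Hypothetical variant of the "jj" alias that accepts the port as an argument.
# Defaults to 6000, the port used in this post.
jj_port() {
    local port="${1:-6000}"
    ssh -L "${port}:localhost:${port}" user@some-remote-host.rit.edu
}

# Hypothetical check, meant to be run locally while the tunnel is open:
# succeeds if an HTTP server answers on the forwarded local port.
tunnel_is_up() {
    local port="${1:-6000}"
    curl --silent --output /dev/null --max-time 5 "http://localhost:${port}/"
}
```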

But what if your roommate trips and your server gets restarted? Well,
you can write a systemd unit file to automatically start your Jupyter
server when the computer boots. This is what my systemd unit looks
like.

```bash
# location /lib/systemd/system
#
# After file creation run: systemctl daemon-reload
# enable service on start up: systemctl enable jupyter-lab
# start the service: systemctl start jupyter-lab

[Unit]
Description=Script to start jupyter lab
Documentation=https://jrtechs.net
After=network.target

[Service]
Type=simple
User=jeff
WorkingDirectory=/home/jeff/Documents/school/csci-431/
ExecStart=/usr/local/bin/jupyter lab --no-browser --port=6969
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

You want to set the working directory to the location where your
Jupyter notebooks are stored. Note that this unit listens on port 6969
rather than 6000, so the remote port in your SSH forward has to match
whichever port you choose. You also want to make sure that you specify
the absolute path to the Jupyter binary in the `ExecStart` parameter.
You can find that path using the `which` command:

```bash
which jupyter
```

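
Putting the comments at the top of the unit file together, installing
the service could look like the sketch below. It assumes the unit was
saved locally as `jupyter-lab.service` (the name implied by the
`systemctl enable jupyter-lab` comment) and that you have sudo rights;
the function name `install_jupyter_service` is made up for
illustration.

```bash
# Hypothetical helper that follows the steps listed in the unit file's comments.
# Stops at the first step that fails.
install_jupyter_service() {
    local unit_file="${1:-jupyter-lab.service}"
    # Copy the unit to the location named in the unit file's first comment.
    sudo cp "$unit_file" /lib/systemd/system/jupyter-lab.service || return 1
    # Make systemd re-read its unit files.
    sudo systemctl daemon-reload || return 1
    # Start the service on every boot, then start it right now.
    sudo systemctl enable jupyter-lab || return 1
    sudo systemctl start jupyter-lab || return 1
}
```

Afterwards, `systemctl status jupyter-lab` shows whether the server
came up, and `journalctl -u jupyter-lab` prints its log output.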
# Conclusion

If you do any data science or educational Python, I would strongly
recommend that you check out the Jupyter project. If you want multiple
users to connect to the same Jupyter server, there is a project
called [JupyterHub](https://github.com/jupyterhub/jupyterhub) that
manages all of that.