## What is Gremlin?

Gremlin is a graph traversal language: think of it as the SQL of graph databases. Gremlin itself is a language,
not a graph database server, but a Gremlin Server and a Gremlin Console are available for
interacting with graph databases. Gremlin can also be used on large graph database platforms
like [Titan](https://www.digitalocean.com/community/tutorials/how-to-set-up-the-titan-graph-database-with-cassandra-and-elasticsearch-on-ubuntu-16-04)
and [JanusGraph backed by HBase](https://docs.janusgraph.org/latest/hbase.html).
## Graph Database Basics

A graph database is based on graph theory. A graph is composed of nodes, edges, and properties. Each key
object in a graph database is stored as a node, and nodes are connected by edges that represent
relationships. For example, you may represent people as nodes and friendships as edges.
You can assign properties to both nodes and edges: a person (node) may have an age and a name,
while a friendship (edge) may have a start date.
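
The node, edge, and property model described above can be sketched in a few lines of plain Java. This is only an illustration of the data model, not how a graph database stores data internally; the `TinyGraph` class and all of its members are hypothetical names invented for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, minimal in-memory property graph used only to illustrate the model. */
class TinyGraph {
    /** A node: a label plus free-form properties. */
    static class Node {
        final String label;
        final Map<String, Object> properties = new HashMap<>();

        Node(String label) {
            this.label = label;
        }
    }

    /** A directed edge between two nodes, with its own properties. */
    static class Edge {
        final String label;
        final Node from;
        final Node to;
        final Map<String, Object> properties = new HashMap<>();

        Edge(String label, Node from, Node to) {
            this.label = label;
            this.from = from;
            this.to = to;
        }
    }

    final List<Node> nodes = new ArrayList<>();
    final List<Edge> edges = new ArrayList<>();

    Node addNode(String label) {
        Node node = new Node(label);
        nodes.add(node);
        return node;
    }

    Edge addEdge(String label, Node from, Node to) {
        Edge edge = new Edge(label, from, to);
        edges.add(edge);
        return edge;
    }

    /** Nodes connected to the given node by an edge in either direction. */
    List<Node> neighbors(Node node) {
        List<Node> result = new ArrayList<>();
        for (Edge edge : edges) {
            if (edge.from == node) {
                result.add(edge.to);
            } else if (edge.to == node) {
                result.add(edge.from);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        TinyGraph graph = new TinyGraph();

        // People are nodes with name and age properties.
        Node alice = graph.addNode("person");
        alice.properties.put("name", "Alice");
        alice.properties.put("age", 30);
        Node bob = graph.addNode("person");
        bob.properties.put("name", "Bob");
        bob.properties.put("age", 27);

        // A friendship is an edge with a start date property.
        Edge friendship = graph.addEdge("friendship", alice, bob);
        friendship.properties.put("startDate", "2018-09-01");

        for (Node friend : graph.neighbors(alice)) {
            System.out.println(friend.properties.get("name"));
        }
    }
}
```

A real graph database adds indexing, persistence, and a query language such as Gremlin on top of this basic shape.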
## Why Graph Databases?

Graph databases are great for modeling data where the value lies in the shape of the graph. They
also allow you to model complex relationships that would be difficult to represent in a normal table-based
database.
## Gremlin Installation

Download and extract the following:

- [Gremlin Console](https://www.apache.org/dyn/closer.lua/tinkerpop/3.3.3/apache-tinkerpop-gremlin-console-3.3.3-bin.zip)
- [Gremlin Server](https://www.apache.org/dyn/closer.lua/tinkerpop/3.3.3/apache-tinkerpop-gremlin-server-3.3.3-bin.zip)

You must have Java installed on your computer to run Gremlin. Start the Gremlin Server by running the start script in its bin folder.

```
./gremlin-server.sh
```

Start the Gremlin Console by running the gremlin.sh or gremlin.bat script in its bin folder.

```
./gremlin.sh
```
Now you need a graph to work with that is connected to the server. To do that, execute the following commands in
the Gremlin Console.

```java
// Creates an empty graph
gremlin> graph = EmptyGraph.instance()
==>emptygraph[empty]
// Opens a connection to the server -- connects to localhost by default
gremlin> cluster = Cluster.open()
==>localhost/127.0.0.1:8182
// Tells the server to use g as the graph traversal source
gremlin> g = graph.traversal().withRemote(DriverRemoteConnection.using(cluster, "g"))
==>graphtraversalsource[emptygraph[empty], standard]
```
## Gremlin Syntax

Now that you have your Gremlin Server and Console set up, you are ready to start executing Gremlin queries.
### Adding a Vertex

In Gremlin, nodes are referred to as "vertices". To add a vertex to the graph, you use the
addV() command on your graph traversal source. For consistency, most people
use "g" as their default graph traversal source. To attach properties to your vertex, you chain a series of
".property('property_name', 'property_value')" calls onto the add vertex query.

EX:

```java
g.addV('student').property('name', 'Jeffery').property('gpa', 4.0);
```
### Updating a Property

Unlike SQL, you are not limited to a specific schema in a graph database. If you want to add or change
a property on a vertex or edge, you simply use the property command again.

The "g.V(1)" in the following example refers to the vertex with the id of 1; the graph database assigns these ids automatically.
You can replace "g.V(1)" with any query that selects a specific vertex or edge.

```java
g.V(1).property('name', 'Jeffery R');
```
### Selection

Selecting nodes and edges is the most complicated part of Gremlin. The concept is not particularly hard, but there
are dozens of ways to do graph traversals and selections. I will cover the most common ways to traverse a graph.

This example selects all vertices that have the label "student". The ".valueMap()" step appended to the end of the query
makes Gremlin return a map of properties for each vertex it selects.

```java
g.V().hasLabel('student').valueMap();
```

In the following example, instead of returning a map of values, we return just the names of the students
in the graph.

```java
g.V().hasLabel('student').values('name');
```

This example returns the GPA of the student with the name "Jeffery R".

```java
g.V().hasLabel('student').has('name', 'Jeffery R').values('gpa');
```

This command returns the names of all the students, ordered from highest to lowest GPA.

```java
g.V().hasLabel('student').order().by('gpa', decr).values('name');
```
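
To make the semantics of that last query concrete, here is a rough analogy using plain Java streams over an in-memory list. It is only a sketch of what each step does conceptually (filter by label, sort by GPA descending, keep the name); the `Vertex` class here is a hypothetical stand-in, not a TinkerPop type, and a real graph database does not evaluate traversals this way.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class StudentQuerySketch {
    /** Hypothetical stand-in for a vertex with a label and two properties. */
    static class Vertex {
        final String label;
        final String name;
        final double gpa;

        Vertex(String label, String name, double gpa) {
            this.label = label;
            this.name = name;
            this.gpa = gpa;
        }
    }

    /** Mirrors g.V().hasLabel('student').order().by('gpa', decr).values('name'). */
    static List<String> studentNamesByGpaDescending(List<Vertex> vertices) {
        return vertices.stream()
                .filter(v -> v.label.equals("student"))                               // hasLabel('student')
                .sorted(Comparator.comparingDouble((Vertex v) -> v.gpa).reversed())   // order().by('gpa', decr)
                .map(v -> v.name)                                                     // values('name')
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Vertex> vertices = Arrays.asList(
                new Vertex("student", "Ann", 3.2),
                new Vertex("teacher", "Tom", 0.0),
                new Vertex("student", "Jeffery R", 4.0));

        System.out.println(studentNamesByGpaDescending(vertices)); // prints [Jeffery R, Ann]
    }
}
```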
### Adding Edges

The easiest way (in my opinion) to add edges in Gremlin is by
using aliasing. In this example we select two vertices and give each a name, in this case "a" and "b".
After we have selected the two vertices, we can add an edge between them using the "addE()" command. This syntax is
nice because it makes the direction of the edge easy to tell: the edge goes from "a" to "b".

```java
g.V(0).as('a').V(1).as('b').addE('knows')
    .from('a').to('b');
```
## Using Gremlin with Java

Now that you know the basic syntax of Gremlin, you are ready to use it somewhere other than the Gremlin Console. If you
are trying to use Gremlin with Java, there are Maven dependencies for TinkerPop and Gremlin. If you want to quickly
connect to your Gremlin Server with Java, make sure your server is set up exactly as it was before this tutorial started discussing
Gremlin syntax.
#### Maven dependency for Java:

```xml
<!-- https://mvnrepository.com/artifact/org.apache.tinkerpop/gremlin-core -->
<dependency>
    <groupId>org.apache.tinkerpop</groupId>
    <artifactId>gremlin-core</artifactId>
    <version>3.3.3</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.tinkerpop/gremlin-driver -->
<dependency>
    <groupId>org.apache.tinkerpop</groupId>
    <artifactId>gremlin-driver</artifactId>
    <version>3.3.3</version>
</dependency>
<dependency>
    <groupId>org.apache.tinkerpop</groupId>
    <artifactId>tinkergraph-gremlin</artifactId>
    <version>3.3.3</version>
</dependency>
```
It is helpful to wrap everything relating to the graph database connection into a single Java class. This is roughly
the code that I usually use to interact with a Gremlin Server; anybody is free to use it.

```java
import org.apache.tinkerpop.gremlin.driver.Client;
import org.apache.tinkerpop.gremlin.driver.Cluster;
import org.apache.tinkerpop.gremlin.driver.ResultSet;

public class GraphConnection
{
    /** Stores/manages client connections */
    private Cluster cluster;

    /** Connection to the graph db */
    private Client client;

    /** Connects to a Gremlin Server on localhost:8182 */
    public GraphConnection()
    {
        Cluster.Builder b = Cluster.build();
        b.addContactPoint("localhost");
        b.port(8182);
        this.cluster = b.create();
        this.client = cluster.connect();
    }

    /** Submits a Gremlin query string to the server */
    public synchronized ResultSet queryGraph(String q)
    {
        return this.client.submit(q);
    }

    /** Closes the cluster and its client connections */
    public void closeConnection()
    {
        this.cluster.close();
    }
}
```
#### Basic GraphConnection.java Usage:

```java
// p1 and p2 hold the ids of the two players to connect
GraphConnection con = new GraphConnection();
String query = "g.V().hasLabel('player')" +
        ".has('id', '" + p1 + "')" +
        ".as('p1')" +
        ".V().hasLabel('player')" +
        ".has('id', '" + p2 + "')" +
        ".as('p2')" +
        ".addE('friends')" +
        ".from('p1').to('p2')";
con.queryGraph(query);
```
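
Because the query is built by string concatenation, an id that contains a single quote would break the query or change its meaning. One possible way to guard against that is to escape values before splicing them in. The sketch below assumes that escaping backslashes and single quotes is sufficient for a single-quoted string literal in a Gremlin (Groovy) script; `FriendQueryBuilder`, `quote`, and `addFriendEdge` are hypothetical names made up for this example.

```java
class FriendQueryBuilder {
    /**
     * Wraps a value in single quotes, escaping backslashes and single quotes.
     * Assumption: this is enough for a single-quoted Groovy string literal.
     */
    static String quote(String value) {
        String escaped = value
                .replace("\\", "\\\\")  // escape backslashes first
                .replace("'", "\\'");   // then escape single quotes
        return "'" + escaped + "'";
    }

    /** Builds the query that adds a 'friends' edge between two players. */
    static String addFriendEdge(String p1, String p2) {
        return "g.V().hasLabel('player').has('id', " + quote(p1) + ").as('p1')"
                + ".V().hasLabel('player').has('id', " + quote(p2) + ").as('p2')"
                + ".addE('friends').from('p1').to('p2')";
    }

    public static void main(String[] args) {
        // The resulting string can be passed to queryGraph(...)
        System.out.println(addFriendEdge("1001", "o'brien"));
    }
}
```

The driver's `Client` also has a `submit` overload that accepts a map of parameters alongside the script, which avoids hand-written escaping altogether and is probably the better choice for anything beyond experiments.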
#### Overly complex usage with a lambda statement

```java
/**
 * Fetches the list of a player's friends.
 *
 * @param id steam id
 * @return list of friends
 */
private List<Player> getFriendsFromGraph(String id)
{
    List<Player> friends = new ArrayList<>();
    String query = "g.V().hasLabel('player')" +
            ".has('id', '" + id + "')" +
            ".both().valueMap()";
    this.con.queryGraph(query).stream().forEach(r ->
            friends.add(new Player(
                    ((ArrayList) (((HashMap<String, Object>) (r.getObject()))
                            .get("name"))).get(0).toString(),
                    ((ArrayList) (((HashMap<String, Object>) (r.getObject()))
                            .get("id"))).get(0).toString()))
    );
    return friends;
}
```
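
The nested casts above come from the shape of a valueMap() result: each property key maps to a list of values, because a vertex property can hold more than one value. The sketch below mimics that shape with plain Java collections and pulls the first value out with a small helper. `ValueMapSketch` and `firstValue` are hypothetical names for this example, and the map is built by hand rather than coming from a server.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ValueMapSketch {
    /** Returns the first value stored under key as a string, or null if the key is missing or empty. */
    static String firstValue(Map<String, Object> valueMap, String key) {
        Object values = valueMap.get(key);
        if (!(values instanceof List)) {
            return null;
        }
        List<?> list = (List<?>) values;
        return list.isEmpty() ? null : String.valueOf(list.get(0));
    }

    public static void main(String[] args) {
        // Hand-built map that mimics one valueMap() row: every value is wrapped in a list.
        Map<String, Object> row = new HashMap<>();
        List<Object> names = new ArrayList<>();
        names.add("Bob");
        row.put("name", names);
        List<Object> ids = new ArrayList<>();
        ids.add("1001");
        row.put("id", ids);

        System.out.println(firstValue(row, "name") + " " + firstValue(row, "id"));
    }
}
```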
The most important thing to do while playing around with Gremlin in Java is to keep an eye on the
return type. From experience, I can say that it is often easier to return the vertex from your
query rather than returning the valueMap.
Without the valueMap in the query, you can directly access the vertex
in the result rather than doing some voodoo witchcraft and casting between ArrayLists and HashMaps.
The previous example could be rewritten as this:

```java
List<Player> friends = new ArrayList<>();
String query = "g.V().hasLabel('player')" +
        ".has('id', '" + id + "')" +
        ".both()";
for(Result r: this.con.queryGraph(query))
{
    friends.add(new Player(
            r.getVertex().value("name").toString(),
            r.getVertex().value("id").toString()));
}
```
You now know enough about Gremlin to be dangerous with it. Yay! If you want to do more than basic things with Gremlin,
I highly suggest that you look at the tutorial [SQL 2 Gremlin](http://sql2gremlin.com/).
If you plan on deploying this to production, it is recommended that you use a persistent storage back end such as HBase.
## Resources

- [SQL 2 Gremlin](http://sql2gremlin.com/)
- [Practical Gremlin](http://kelvinlawrence.net/book/Gremlin-Graph-Guide.html)
- [Apache TinkerPop](http://tinkerpop.apache.org/)
- [Steam Friends Graph (Personal Gremlin Project)](https://github.com/jrtechs/SteamFriendsGraph)