Update and finalize Lecture 5

5 years ago · cfb93d39ae
--- a/index.md
+++ b/index.md
@ -18,7 +18,7 @@ Starting with the Fall 2019 offering of CS 224W, the course covers three broad t
 ## Network Methods

 1. [Structural Roles in Networks](): RolX, Granovetter, the Louvain algorithm
 2. [Spectral Clustering](): Graph partitions and cuts, the Laplacian, and motif clustering
 2. [Spectral Clustering](network-methods/spectral-clustering): Graph partitions and cuts, the Laplacian, and motif clustering
 3. [Influence Maximization](): Influential sets, submodularity, hill climbing
 4. [Outbreak Detection](): CELF, lazy hill climbing
 5. [Link Analysis](): PageRank and SimRank
--- a/network-methods/spectral-clustering.md
+++ b/network-methods/spectral-clustering.md
@ -34,9 +34,9 @@ In particular, $$\lambda_2$$, the second smallest eigenvalue of $$L$$, is alread

 Using these properties and the definition of $$L$$, we can write out a more concrete formula for $$\lambda_2$$: $$\lambda_2 = \min_x \frac{\sum_{(i, j) \in E} (x_i - x_j)^2}{\sum_i x_i^2}$$, subject to the constraint $$\sum_i x_i = 0$$. If we additionally constrain $$x$$ to have unit length, the objective turns into simply $$\min_x \sum_{(i, j) \in E} (x_i - x_j)^2$$.

 How does $$\lambda_2$$ relate to our original objective of finding a best partition of our graph? Let's express our partition $$(A, B)$$ as a vector $$y$$ defined by $$y_i = 1$$ if $$i \in A$$ and $$y_i = -1$$ if $$i \in B$$. Instead of using the conductance here, let's first try to minimize the cut while taking care of the problem of balancing partition sizes by enforcing that $$|A| = |B|$$ (balance size of partitions), which amounts to constraining $$\sum_i y_i = 0$$. Given this size constraint, let's minimize the cut of the partition, i.e. find $$y$$ that minimizes $$\sum_{(i, j) \in E} (y_i - y_j)^2$$. Note that the entries of $$y$$ must be $$+1$$ or $$-1$$, which has the consequence that the length of $$y$$ is fixed. *This optimization problem looks a lot like the definition of $$\lambda_2$$!* Indeed, by our findings above we have that this objective is minimized by $$\lambda_2$$ of our Laplacian, and the optimal clustering $$y$$ is given by its corresponding eigenvector, known as the **Fiedler vector**.
 How does $$\lambda_2$$ relate to our original objective of finding a best partition of our graph? Let's express our partition $$(A, B)$$ as a vector $$y$$ defined by $$y_i = 1$$ if $$i \in A$$ and $$y_i = -1$$ if $$i \in B$$. Instead of using the conductance here, let's first try to minimize the cut while taking care of the problem of balancing partition sizes by enforcing that $$\vert A\vert = \vert B\vert$$ (balance size of partitions), which amounts to constraining $$\sum_i y_i = 0$$. Given this size constraint, let's minimize the cut of the partition, i.e. find $$y$$ that minimizes $$\sum_{(i, j) \in E} (y_i - y_j)^2$$. Note that the entries of $$y$$ must be $$+1$$ or $$-1$$, which has the consequence that the length of $$y$$ is fixed. *This optimization problem looks a lot like the definition of $$\lambda_2$$!* Indeed, by our findings above we have that this objective is minimized by $$\lambda_2$$ of our Laplacian, and the optimal clustering $$y$$ is given by its corresponding eigenvector, known as the **Fiedler vector**.

 Now that we have a link between an eigenvalue of $$L$$ and graph partitioning, let's push the connection further and see if we can get rid of the hard $$|A| = |B|$$ constraint -- maybe there is a link between the more flexible conductance measure and $$\lambda_2$$. Let's rephrase conductance here in the following way: if a graph $$G$$ is partitioned into $$A$$ and $$B$$ where $$|A| \leq |B|$$, then the conductance of the cut is defined as $$\beta = cut(A, B)/|A|$$. A result called the Cheeger inequality links $$\beta$$ to $$\lambda_2$$: in particular, $$\frac{\beta^2}{2k_{max}} \leq \lambda_2 \leq 2\beta$$ where $$k_{max}$$ is the maximum node degree in the graph. The upper bound on $$\lambda_2$$ is most useful to us for graph partitioning, since we are trying to minimize the conductance; it says that $$\lambda_2$$ gives us a good estimate of the conductance -- we never overestimate it more than by a factor of 2! The corresponding eigenvector $$x$$ is defined by $$x_i = -1/a$$ if $$i \in A$$ and $$x_j = 1/b$$ if $$i \in B$$; the signs of the entries of $$x$$ give us the partition assignments of each node.
 Now that we have a link between an eigenvalue of $$L$$ and graph partitioning, let's push the connection further and see if we can get rid of the hard $$\vert A\vert = \vert B\vert$$ constraint -- maybe there is a link between the more flexible conductance measure and $$\lambda_2$$. Let's rephrase conductance here in the following way: if a graph $$G$$ is partitioned into $$A$$ and $$B$$ where $$\vert A\vert \leq \vert B\vert$$, then the conductance of the cut is defined as $$\beta = cut(A, B)/\vert A\vert$$. A result called the Cheeger inequality links $$\beta$$ to $$\lambda_2$$: in particular, $$\frac{\beta^2}{2k_{max}} \leq \lambda_2 \leq 2\beta$$ where $$k_{max}$$ is the maximum node degree in the graph. The upper bound on $$\lambda_2$$ is most useful to us for graph partitioning, since we are trying to minimize the conductance; it says that $$\lambda_2$$ gives us a good estimate of the conductance -- we never overestimate it more than by a factor of 2! The corresponding eigenvector $$x$$ is defined by $$x_i = -1/a$$ if $$i \in A$$ and $$x_j = 1/b$$ if $$i \in B$$; the signs of the entries of $$x$$ give us the partition assignments of each node.

 # Spectral Partitioning Algorithm
 Let's put all our findings together to state the spectral partitioning algorithm.
@ -53,7 +53,7 @@ Some practical considerations emerge.
 What if we want to cluster by higher-level patterns than raw edges? We can instead cluster graph motifs into "modules". We can do everything in an analogous way. Let's start by proposing analogous definitions for cut, volume and conductance:
 - $$cut_M(S)$$ is the number of motifs for which some nodes in the motif are in one side of the cut and the rest of the nodes are in the other cut
 - $$vol_M(S)$$ is the number of motif endpoints in $$S$$ for the motif $$M$$
 - $$\phi(S) = cut_M(S) / vol_M(S)$$
 - We define $$\phi(S) = cut_M(S) / vol_M(S)$$

 How do we find clusters of motifs? Given a motif $$M$$ and graph $$G$$, we'd like to find a set of nodes $$S$$ that minimizes $$\phi_M(S)$$. This problem is NP-hard, so we will again make use of spectral methods, namely **motif spectral clustering**:
 1. Preprocessing: create a matrix $$W^{(M)}$$ defined by $$W_{ij}^{(M)}$$ equals the number of times edge $$(i, j)$$ participates in $$M$$.