The Definition of a Graph
A graph \( G \) consists of a vertex set \( V(G) \), whose elements are called vertices and an edge set \( E(G) \), where elements in \( E(G) \) are unordered pairs of vertices.
The elements of \(E(G)\) are called edges.
Remarks: Recall that an unordered pair is just any two-element set \(\{ u , v \}\), where \(u\ne v.\) For brevity, we often write such an edge more simply as \( uv \). In various contexts, it is common to denote an edge by the ordered pair \( (u,v) \). This leads to an alternative definition of a graph, equivalent to ours, as a relation on the vertex set \(V(G)\) that is both symmetric (meaning that \( (u,v) \) is an edge if and only if \( (v,u) \) is an edge) and irreflexive (meaning that \( (u,u) \) is never an edge.)
Example 1:
For any edge \( uv \in E(G) \), the vertices \( u, v \in V(G) \) are said to be adjacent and/or neighbors in \( G \). The vertices \( u, v \) are called the ends of the edge \( uv \), and any edge is said to be incident at either of its ends.
For any vertex \( u \in V(G) \), we define its neighborhood as the set $$ \Gamma(u) = \{ v \in V(G) \mid uv \in E(G) \}. $$ The degree of \( u \) is the integer \( d(u) = \#\Gamma(u) \), which counts the number of \(u\)'s neighbors.
Our notation for neighborhood and degree suppress the dependence on the graph \(G,\) which is usually clear from the context. In cases where we must consider multiple graphs, we may instead write \(\Gamma_G\) and \(d_G.\)
If we sort the degrees of a graph's vertices in descending order, \( d(v_1) \ge \ldots \ge d(v_n) \), we obtain a sequence of nonnegative integers, written \( d(v_1) d(v_2) \cdots d(v_n) \), which is called the degree sequence of the graph.
Example 1 (cont.):
Let us determine the degrees of all vertices in our running example:
One consequence of this result is the saying that "at any party, an even number of hands are shaken."
A graph \( G' \) is a subgraph of \( G \) if \( V(G') \subseteq V(G) \) and \( E(G') \subseteq E(G) \). We write \( G' \subset G \).
For any subset of vertices \( U \subseteq V(G) \), we may form the induced subgraph \( G[U] \) with vertex set \( U \) and edge set \( \{ uv \in E(G) \mid u,v \in U \} \). "Induced subgraph" is a stronger condition than "subgraph."
Consider the graph \( G \) shown below:
Here is another image of a \(4\)-clique:
All cliques of a given size are exactly the same up to relabeling: this is formalized by the notion of graph isomorphism. Recall that a bijection between two sets is a function that is one-to-one and onto. Two graphs \( G, H \) are isomorphic if there exists a bijection of vertex sets \( \iota : V(G) \to V (H) \) that preserves adjacency: $$ uv \in E(G) \quad \Leftrightarrow \quad \iota (u) \iota (v) \in E(H).$$
It is not always obvious when two graphs are isomorphic: consider, for example, the two isomorphic graphs shown below, where we may construct the following isomorphism: $$\iota (1)= 8, \quad \iota (2) = 10, \quad \iota (3) = 9, \quad \iota (4) = 6, \quad \iota (5) = 7. $$
For the \( 5 \)-clique shown below, it turns out we cannot draw all \( 10 \) edges in the plane without crossings. Graphs which can be drawn in this way are said to be planar. Planar graphs are characterized by a theorem of Kuratowski, which we will discuss later in the course.
A path is a graph \( P \) of the form \( V(P) = \{x_0, \dots, x_l\} \) and \( E(P) = \{x_0x_1, x_1x_2, \dots, x_{l-1}x_l\} \), where all vertices \( x_i \) must be distinct. The number of edges \( l \) is the length of \( P \). A path may also be denoted by \( x_0 \dots x_l \). The vertices \( x_0, x_l \) are the ends of the path \( P \), which is also said to be a \( x_0 x_l \)-path, or a path from \( x_0 \) to \( x_l) \). The ends of a path are the two unique degree-1 vertices: any other vertices must have degree 2.
Similarly, a cycle of length \( l \) is a graph \( C \) with vertex set \( V(C) = \{ x_1, \dots, x_l\} \) and edge set \( E(C) = \{ x_1x_2, \dots, x_{l-1}x_l, x_l x_1 \}. \) All vertices in a cycle must have degree 2.
When we refer to a "path" or a "cycle" in a graph, one should think of a subgraph isomorphic to the path and cycle graphs defined above. In an \( n \)-vertex graph, a Hamiltonian path is any path of length \( n-1 \), and a Hamiltonian cycle is any cycle of length \( n \). Currently, there is no known algorithm for efficiently detecting Hamiltonian paths or cycles in a graph: this is related to the famous unsolved P vs NP problem.
A graph \( G \) is connected if it contains a \( uv \)-path for any \( u, v \in V(G) \). A maximal connected subgraph is a called a connected component. In most circumstances, if we have a graph with multiple connected components, it is better to think of the connected components separately as connected graphs.
Remark: The notation \(A \sqcup B\) is used when taking the union \( A \cup B \) of two sets which are disjoint, \( A \cap B = \emptyset \).
(2) \(\Rightarrow\) (1): Consider any cycle \( C \subset G \), \(C= x_1 x_2 \dots x_l x_1 \). Either \( x_1 \in A \) or \( x_1 \in B \); assume \( x_1 \in A \) without loss of generality. Then \( x_2 \in B \), \( x_3 \in A \), etc. Vertex colors alternate. Since the cycle returns to \( x_1 \in A \) after \( l \) steps, the length of the cycle \( l \) must be even.
(1) \(\Rightarrow\) (2): It is enough to consider the case where \( G \) is connected. (If \( G \) has components \( G_1, \dots, G_c \) with vertex bipartitions \( V(G_i) = A_i \sqcup B_i \), we may then take \( A = \bigcup_{1\le i \le c} A_i \) and \( B = \bigcup_{1\le i \le c} B_i \)).
Fix a vertex \( v \in V(G) \) and define: $$ A = \{ u \in V(G) \mid \exists \text{ even } uv \text{-walk in } G \} $$ $$ B = \{ u \in V(G) \mid \exists \text{ odd } uv \text{-walk in } G \} $$ Since \( G \) is connected, we have \( V(G) = A \cup B \).
Claim: \( A \cap B = \emptyset \).
If not, then for any \( u \in A \cap B \), we have an even \( uv \)-walk \( W_1 \) and an odd \( vu \)-walk \( W_2 \), the latter of which can be obtained by reversing a \( uv \)-walk promised by the definition of \(B \).
Gluing the two walks gives us \( (W_1, W_2) \), which is an odd closed walk from \( u \) to \( u \).
By the previous lemma, this contains an odd cycle, which would contradict our starting assumption that the graph \( G \) contains only even cycles.
Thus \( V(G) = A \sqcup B \). Finally, we check that all edges cross between \(A \) and \(B \). If there were an edge \( uw \) with both ends in one of the sets \( A \) or \( B \), we could combine \(vu\)-walk \( W \) with the edge \( uw \) to create a \(vw\)-walk \( W ' = (W, uw, w ) \) such that \( W \) and \( W' \) have opposite parity (meaning that either \( W \) is even and \( W' \) is odd or vice-versa.) This, however, would contradict the previously-established disjointness of \( A \) and \( B \). Thus every edge of \( G\) has one end in \( A \) and the other in \( B \).
We've already encountered several examples of bipartite graphs. One of them is the "UFO graph" shown below, better known as the complete bipartite graph \( K_{2,3} \). A suitable vertex bipartition is indicated by the red and blue vertices below. Observe that every cycle in this graph has length \( 4 \). More generally, for any integers \( m,n \ge 1 \), there is a complete bipartite graph \( K_{m,n} \) on \( m+n \) vertices which can be defined as folllows: $$ V(K_{m,n})= \{ a_1, \ldots , a_m, b_1, \ldots , b_n \}, \quad E (K_{m,n} ) = \{ a_i b_j \mid a_i \in A , b_j \in B \}. $$
A trail in a multigraph \( G \) that visits every edge is said to be an Eulerian trail. Similarly, we define an Eulerian circuit to be a circuit which visits every edge exactly once. A graph is Eulerian if it contains an Eulerian circuit.
Eulerian circuits are named after Leonhard Euler, who solved in 1736 a famous problem known as the Seven Bridges of Königsberg. The image below shows the city of Königsberg (now Kaliningrad, Russia.) Notice there are 7 bridges separating 4 landmasses. Can we cross all 7 bridges without using the same bridge twice?
Naturally, we will model this problem by constructing a multigraph, whose four vertices correspond to the landmasses, and whose seven edges correspond to the bridges.
Suppose \( G \) is a graph containing an Eulerian circuit \( C \). Every time a vertex of \( G \) appears in \( C \), this accounts for an two number of edges (1 in and 1 out). Since \( C \) is a circuit, none of the edges in \( C \) are repeated. Furthermore, since \( C \) is Eulerian, every edge of \( G \) appears in \(C \). Finally, since \( G \) is connected, every vertex must appear in \( C \). Altogether, these facts show that every vertex of \( G \) has even degree. A similar argument shows if \(G \) has an Eulerian trail from \( u \) to \( v \), then \( \text{deg}(u) \) and \( \text{deg}(v) \) are odd, and all other vertices even.
First, consider the case where \( G \) is a connected graph such that every vertex has even degree. We will prove that \( G \) has an Eulerian circuit using induction on the number of edges \( \# E( G) \).
As a base case, we may assume \( \# E(G) = 0.\) In this case, connectedness implies that \( G \) is a graph with one vertex \( v \), and the sequence \( C = (v) \) is an Eulerian circuit, since there are no edges in \(G \) to visit.
Now, suppose \( G \) is a graph with \( \# E(G) > 0\). We establish two claims.
Claim 1: \( G \) contains a circuit.
Proof: Let \( P \subseteq G \) be a path of maximal length, with ends \( u \) and \( v \). Since \( \text{deg}(u) \ge 2 \), \( u \) must have an incident edge \( e=uw \notin E(P) \). However, we cannot have \( w \notin V(P) \), since then we could get a path longer than \( P \). Thus, \( w \in V(P) \), and following \( P \) from \( u \) to \( w \) and then \( e \) gives us a circuit (in fact, a cycle.) (Note: this argument will give you a hint about how to solve Homework 1's Problem 5).
Claim 2: A maximal circuit \( C \) in \( G \) is Eulerian.
Proof: Suppose not, and consider the subgraph \( G' \subseteq G \) obtained by deleting edges in \( C \). If \( C\) was not Eulerian, then \(G ' \) would have a connected component \( H \) with at least one edge. Since every vertex in \( H \) has lost an even number of edges from \( C\), we may apply the inductive hypothesis to see that \( H \) has an Eulerian circuit \( D \). The circuits \( C \) and \( D \) are both edge-disjoint. They also have a common vertex \( u \in V(C) \cap V(D) \): this follows from the connectivity of \( G \). Without loss of generality, we may assume that \( u \) is the starting vertex for both circuits; with this assumption, we could then obtain a new Eulerian circuit by gluing, namely \( (C, D) \), whose length is strictly longer than that of \( C \). This is a contraction, so the claim that \( C \) is an Eulerian circuit must hold.
Trail Case:
Suppose now that \( G \) is connected and \( u, v \in V(G) \) are the only vertices of odd degree. Construct a new graph \( G^* \) with \( V(G^*) = V(G) \cup \{w\} \), where \( w \) is a new vertex not already in \( V(G) \), and \( E(G^*) = E(G) \cup \{uw, vw\} \). By our previous arguments, the graph \( G^* \) contains an Eulerian circuit. Finally, we may observe that removing the edges connected to \( w \) from this circuit produces a \( uv\)-Eulerian trail in the original graph \( G \).
We have seen that Eulerian graphs have a characterization which can be checked efficiently: examine each vertex and check if it has even degree. In fact, our inductive proof of this characterization already suggests an algorithm for constructing an Eulerian circuit, given that the graph is in fact Eulerian: find a circuit, delete the edges, and recurse on some smaller connected component.
The Seven Bridges Problem that motivated our study of Eulerian graphs is superficially similar to the Traveling Salesperson Problem (TSP), which relates to Hamiltonian graphs, those graphs which contain a cycle through every vertex. Suppose a salesperson must visit \( n \) cities, labeled \( 1, 2, \dots, n \). The cost of traveling from city \( i \) to city \( j \) is given by a nonnegative number \( c (i,j) \ge 0 \). The TSP then asks: what is the minimum cost trip that visits all cities?
Recognizing Hamiltonian graphs is a special case of TSP. To see this, let \( G \) any graph on \( V(G)=\{1, \dots, n\} \). We define a set of \( \binom{n}{2} \) costs as follows: whenever \( 1 \le i < j \le n \), $$ c(i,j) = \begin{cases} 0 & \text{if } ij \in E(G) \\ 1 & \text{if } ij \notin E(G) \end{cases} $$ With this choice of costs, observe that \( G \) has a Hamiltonian path \( \iff \) Salesperson has a free trip. Furthermore, \( G \) has a Hamiltonian \(uv \)-path if and only if the graph \(G^* \) is Hamiltonian, where \( G^* \) constructed by joining \( u\) and \(v \) to a new vertex as in our characterization of graphs with Eulerian trails.
As we have already noted, an efficient characterization of Hamiltonian graphs is currently unknown: providing one would be a major breakthrough. On the other hand, it is possible to give some sufficient conditions implying Hamiltonicity.
Note that the converse of Ore's theorem is false: there are many Hamiltonian graphs where \( d(u) + d(v) < \# V(G) \) for some non-edge \( uv \notin E(G) \). The simplest example is the cycle \( C_5 \), shown below.
Our proof of Ore's Theorem will rely on the lemma below.
In contrapositive form, Lemma 1 states that if a longest path in a connected graph \( G \) is shorter than a longest cycle, then \( G \) is Hamiltonian. Indeed, in any Hamiltonian graph \(G \) on \( n \) vertices, the longest cycle has length \( n \) and a longest path has \( n- 1 \).
Assume \( G \) satisfies the assumptions of Ore's theorem: \( \# V(G) \ge 3 \), and \( d(u) + d(v) \ge \# V(G)\) for any nonedge \(uv \in E(G) \). Using the pigeonhole principle as in our "Connectivity Criterion" from Section 2, we obtain that \( G \) that is connected.
Suppose now that \(G \) is non-Hamiltonian. Consider a longest path in \( G \), \( P = v_0 \dots v_\ell \). Note that \( \# V(G) \ge 2 \) implies \( \ell \ge 2 \). Having assumed that \( G \) is not Hamiltonian, it follows from Lemma 1 that \( G \) contains no cycle of length greater than \( \ell \). In particular, this implies \( v_0 v_\ell \notin E(G) \).
Consider the sets: $$ A = \{ i \mid 1 \le i \le \ell, \quad v_0 v_i \in E(G) \} $$ $$ B = \{ i \mid 1 \le i \le \ell, \quad v_{i-1} v_\ell \in E(G) \} $$
We make the following two observations:
Our assumptions on \( G \) and the two observations above imply that $$ \# V(G) \le d(v_0) + d(v_\ell) = \#A + \#B \le \ell < \ell +1 = \# V(P),$$ which is a contradiction since the path \( P \subset G \) cannot have strictly more vertices than \( G \). Thus, our asumption that \( G \) was non-Hamiltonian cannot hold.
Graph Distance: On a connected graph, for \( u, v \in V(G) \), \( d(u,v) \) is the length of a shortest \( uv \) path.
Finding a longest path has played a role in the proofs of several of our results so far. Unfortunately, this is yet another problem that is difficult to solve in practice. Surprisingly, it is much easier to find shortest paths in a graph \( G \). This is true even if we attach nonegative weights/costs to the edges of \(G \).
Dijkstra's Algorithm finds a shortest \( s-v \) path for some fixed \( s \in V(G) \), and all \( v \in V(G) \setminus \{s\} \). More generally, this algorithm works for graphs with cost function on edges, \( c: E(G) \to \mathbb{R}_{\ge 0} \). $$ d(u,v) = \displaystyle\min_{P: \text{ a } uv\text{-path}} \Bigg( \displaystyle\sum_{e \in P} c(e)\Bigg)$$
To illustrate Dijkstra's algorithm, let us consider an example of an undirected graph \( G \), with numeric labels indicating the cost \( c(u,v) \) on each edge \( uv \in E(G) \):
\( V = \{ s \} \)
Total costs of cut edges: sa (cost 6), sb (cost 2).
\( V = \{ s , b\} \)
Total costs of cut edges: sa (cost 6), bd (cost 2+2=4 ), be (cost 2+5=7).
\( V = \{ s , b, d\} \)
Total costs of cut edges: sa (cost 6), da (cost 1+4=5), dc (cost 4+4=8), de (cost 4+2=6), be (cost 2+5=7)
\( V = \{ s , b, d, a\} \)
Total costs of cut edges: ac (cost 5+2=7), dc (cost 4+4=8), de (cost 4+2=6), be (cost 2+5=7)
\( V = \{ s , b, d, a, e\} \)
Total costs of cut edges: ac (cost 5+2=7), ec (cost 6+4=10), dc (cost 4+4=8)
Upon termination, all vertices are visited, and Dijkstra's Algorithm has generated a search tree in which every vertex is labeled by its distance from the source.
A tree is any graph satisfying the equivalent conditions of this theorem.
A subgraph \( H \subseteq G \) is spanning if \( V(H) = V(G) \).
Corollary: Any connected graph \( G \) contains a spanning tree. (Proof: Delete edges from cycles until acyclic but still connected).
Let \( T \) be a tree.
Problem: Connect cities (vertices) with edges having costs, such that total cost is minimized.
Input: Connected graph \( G \), cost function \( c: E(G) \to \mathbb{R} \).
Goal: Find a spanning tree \( T \) minimizing \( \sum_{e \in E(T)} c(e) \).
Kruskal's algorithm is a "greedy algorithm" for finding a minimum spanning tree.