Graph Algorithms - Depth First Search with Ruby

By on

Purpose: This project implements a modified depth first search algorithm of a connected, undirected graph in order to find its biconnectivity and articulation points.

Input File: a text file with adjacency list to

Graph Theory

Background

Graphs has many practical applications in the world of computer science and data processing. Think of the Facebook friends network in social media, semantic networks in natural language processing, or even the internet itself.

With graph theory, we have the tool to answer questions such as: how big is the internet? How many degrees of separation between me a Beyonce on Twitter?

Using formal algormiths, we can optimize these answers.

Formal Definition

Let the graph called G = (V,E) where v is the finite set of vertices (or we can call them nodes. I will use nodes and vertices intechangeably but know are the same). E represents a finite set of unordered pair of vertices (distinct). Notice in graphs, undistinct graphs allow loops, whereas distinct means no loops. Distinct graphs have at most one edge between them.
We can include additional structures such as weights, represented as postive real number, for each edge.

For example, a pair (v,w) node is drected from v to w, where v is the tail and w is the head. If (v,w) is an edge in an undirected or directed graph, we say w is adjacent to v. In undirected graphs, (v,w) and (w,v) are the same edge, so both v and w are adjance to each other.

Sequence

A sequence v1, v2, v3, … , vk is called a path if (v1, v2), (v2, v3), … , (vk - 1, vk) are with one edges.

Path A path is called (a simple path) simple if all the vertices are distinct.

NOTE: Mathemeticians call a simple path a path and an arbritray path where vertices may be repeated a walk.

Likewise, a cycle (i.e. circuit) is a path in which v1 = vk and each vertex has only two of its neighbor on the path.

Subgraph - a subgraph of a graph G = (V,E) is a graph S = (W,F) where W exists in V and F exists in E.

Undirected graph is called contected if and only if for any two vertices v and w, there is a path between v and w.

Tree - A graph is acyclic if and only if it has no subgraphs which are cycles (circuits). An acyclic connected graph is called a tree.

Trees

  1. Trees has no cycles
  2. Is connected
  3. The number of vertices is equal to the number of edges + 1. If n is the number of vertices (nodes), then: Number of Edges = N - 1.

This is important because the third statement is a stopping step for the algorithm. An example is a spanning tree to an undirected, connected graph.

Component

Component is a graph that is connected and is the largest subgraph (connected) containing the subgraph. Somtimes also called the connected component.

The articulation point in a connected graph whos removal (and removal of all edge incident to it) will disconnect the graph.

Biconnected graph - you have to remove at least 2 verticles to disconnect the graph.

Searching a Graph

There are two standard methods to systematically traverse all the verticles of the graph. In general, we want to traverse with the following goals in mind:

  • needs to be efficient - no unnecesary repeating of edges, paths, vertices
  • needs to be complete - I know every vertex and the edge is visted
  • also needs to be concise - not too many lines of code, finding the correct data structure will help this

It’s easier to think about each nodes as objects that have states: initially all nodes are Undiscovered (U). During the search, the first time a vertex is seen, it is changed to discovered (D). When all nodes adjacent to a given vertex v are discovered, then vertex v is challeng to fully explored (X). In some text books, U D and X usually are colored-represented as: White, Grey, Black.

We can use Breath First Search (BFS) or Depth First Search(DFS). BFS systematically explores the edges of the graph to discover every vertex from a starting node. DFS is a post-order traversal. It does processing after you completely search all adjacency vertices. Articulation points can be found with DFS.

For depth first use a stack. The recursive implementation uses the call-stack). For breadth-first use a queue.

An example of BFS and DFS.

    A
   / \
  B   C
 /   / \
D   E   F

# for DFS, the order of traversal is A, B, D, C, E, F
# for BFS, the order of traversal is A, B, C, D, E, F

Updated