DFS in Graphs
Contents
DFS (Depth First Search) helps in finding the connected nodes in a graph. By connected, we mean two nodes which can be reached by following a set of edges. This makes sense in both directed and undirected graph – in directed graph, there’s only one way of moving along the edge, but in the undirected graph we can move both forward and backwards along the edge.
In simpler terms, we would want to start at a given node, follow along a path until we reach a deadend, and then move backtrack until we reach a new unvisited node, choose it and start moving along its path. Since we are backtracking, we would want the computer to take care of the bookkeeping, and therefore we use recursion (alternate is to manually handle that using a stack).
Path along one node gives us a depth first tree starting at that node; adding up all the paths for all the nodes in the graph, we get a depth first forest.
A very simple DFS snippet works like (namespace and other boilerplate is already defined here):


In the previous code, GetVertices
gets all the vertices of a graph. In
contrast, GetNeighbours
gets all the nodes which can be reached from
the given node. This will work correctly for both directed and
undirected graphs.
The Start
method starts a DFS tree for each node, and the Explore
method takes the job of going down the node path. Since the method is
recursive, once Explore
reaches a point where there are no unvisited
neigbours of current node, it backtracks until it finds an ancestor
which has unvisited neighbours, picks one and again follows the path.
Each visited node is marked as such so that it is not picked again by
any of the methods.
Bookeeping
DFS also allows to keep bookeeping information regarding the structure
of the tree and how the nodes are placed relative to one another. For
example, we can have start
and finish
times associated with every
node for the time when it was first “visited” and then revisited during
backtracking respectively. And it is quite clear that the relative
structure of nodes depends on these timestamps. Parent node (or a node
found first) will start an Explore
subroutine on its neighbours (which
are found later), and therefore every neighbour will have a greater
start
timestamp. During backtracking, the neighbours will have lesser
value of the finish
timestamp than the node which started the
Explore
method on the node.
The following code assumes that CurrentTime
method is responsible for
updating the timestamp. Implementation is available
here),
where I am using delegates to add pre/post node visit hooks.


DFS has also some interesting uses like arranging the nodes topologically via Topological Sort and also to find the connected nodes (Connected Components) in the graph. I will discuss the same in another article.
Author Tushar Tyagi
LastMod Oct 3, 2016