
A holistic perspective into exploring graphs using functional programming
A holistic perspective into exploring graphs using functional programming
Before we start, let me ask you a question — when you encounter a complex problem, isn’t it tempting to start with basic building blocks and then gradually build up the solution? Well, I won’t blame you, so to speak.
For someone who has spent a considerable time learning and programming in imperative programming languages, it often seems a ‘natural’ strategy. Unfortunately, depending on the complexity of the problem, this might not be the best strategy. If you’re already feeling lost, thinking what I am blabbering about, please bear with me for a while, at least. Let’s delve into the world of graphs, which are the perfect candidate to elucidate my point.
Graphs are ubiquitous. Rarely you would find a field where they haven’t made their presence felt. A graph is represented by a set of nodes(vertices) and edges between these nodes(G=(V,E)). The edges can either be directed or undirected. I have shown below an example of a directed graph that we’ll be using for the demonstration.
We will use a query function named edge that will tell us if two nodes are connected.
edge :: Char -> Char -> Bool
edge ‘A’ ‘B’ = True
edge ‘B’ ‘C’ = True
edge ‘C’ ‘A’ = True
edge ‘A’ ‘D’ = True
edge ‘D’ ‘E’ = True
edge ‘D’ ‘F’ = True
edge ‘F’ ‘E’ = True
edge ‘E’ ‘B’ = True
edge _ _ = False
Given a directed graph, one can easily tell if two vertices are connected if there is a path connecting them. Now, hold on for a second. You must be thinking about what does “path” mean.
Well, a path is what it means. A path between two vertices is a sequence of connected edges where each edge in the chain is connected with its neighbors by edges. For instance, a path between A and C is [(A,B),(B,C)]. Notice that there might be more than one path connecting two vertices. Length of a path is the number of intermediate edges between the endpoints or number of intermediate vertices + 1.
Inductively, any two vertices x and z are said to be connected if there exists some intermediate vertex y, such that there is a path of length k between x and y, and there is a direct edge connecting y and z. This implies that there is a path of length (k+1) connecting x and z.
In Haskell, we can define this abstraction by a type Path
> type Path = [Char]
To discover paths, we will use a function extendPath.
> extendPath :: Path → [Path]
> extendPath [ ] = [[c]| c ← [‘A’..’F’]]
> extendPath p = [p ++ [c] | c ← [‘A’ .. ‘F’], edge (last p) c]
This function will extend a path of length k to (k+1) by adding an edge from terminal vertex to another vertex if there is an edge connecting them.
To extend all paths of length k by 1, we can define a function called extendAllPath —
> extendAllPath :: [Path] → [Path]
> extendAllPath [ ] = [[c] | c ← [‘A’..’F’]]
> extendAllPath l = concatMap extendPath l
concatMap function combines map (map :: (a → b) → a → b)) and concat ([c | p ← l, c ← p]) together.
Now, how about discovering all paths of all possible lengths?
In a functional language like Haskell, we can implement it by repeatedly mapping extendAllPath to an empty list of lists.
> iterate :: (a → a) → a → a
iterate f x => [x, f x , f (f x), f (f (f x)) …]
> allPaths = iterate extendAllPath [[]]
Ignoring, the degenerate paths of length 0 and 1, we could exploit lazy evaluation offered by Haskell, which efficiently computes all paths(including paths with loops).
> drop 2 (take (n+1) allPaths)
As evident, we could clearly and explicitly define different operations on graphs without worrying about managing state variables and the extra work associated with auxiliary operations.
To conclude, I think the real difference is the entirely new paradigm you’re exposed to when you begin writing in Haskell. I started with Python, then learnt C. I was really excited about using list comprehensions, lambdas, map, reduce, filter in Python and some functional stuff offered by C STL. However, Haskell is entirely a different beast. It radically changes your thinking for good. It seems so elegant. The major differences that I notice are the immutable “variables” and the mathematical feel of it. There are other differences that feel more akin to grammar variations that I would expect to find between any programming languages.
Is it “better”? I reserve judgement on that but I am sure that you will find Haskell has an advantage in certain if not many circumstances.