Monday, April 6, 2009

Why Functional Programming Matters

I'm always trying to sharpen my skills as a programmer. One approach I've been using to become a better programmer is to learn as many development methodologies as possible. In the past few years, I've become a big fan of functional programming. The basic idea behind functional programming is to compose your programs using functions that return a single value and avoid side effects whenever possible. A side effect is an action that is not required for a function to do it's work. An example of a function with a side effect would be:

function sum(a, b) {
    print "a + b = " + (a + b)
    return a + b
}

In this case, the print statement wouldn't be required to accurately calculate the sum, so it would be considered a side effect. A pure functional language prohibits side effects altogether; although, most functional languages allow some level of impurity for convenience.

Another common feature of functional programming is the use of higher order functions. Higher order functions are functions that take functions as their arguments; thereby, providing an additional level of abstraction. An example of a higher order function included in most modern languages would be the map function.

function square(x) { return x * x }
numbers = [1, 2, 3, 4, 5]
squares = map(square, numbers)
print squares
[1, 4, 9, 16, 25]

Higher order functions can typically take anonymous functions as their arguments as well leading to increased flexibility.

I can already hear you groaning as you read this article and saying, "yeah yeah, my language can do all that too... what's the point?" Well, in the typical zen fashion of functional languages, it's not always what you can do that matters; rather it's what you can't do. A striking difference between a number of functional languages and their imperative counterparts is how mutability is handled.

A number of functional languages such as Haskell, Erlang, and Clojure address the majority of their built-in types as immutable. This means that once a value is assigned to a local variable, it cannot be changed. At face value, this may seem like a huge disadvantage and might even lead one to believe that certain types of programs might be impossible to write. The truth is that any imperative program can be rewritten in a functional style but doing so requires the adoption of a new set of tools. Specifically, recursion must be adopted in lieu of traditional iteration. Things like closures can be used to simulate state, and objects are generally thrown by the wayside.

At this point, the resounding question in your mind is probably, "why bother?" Well, fortunately there's a very strong reason behind all this shifting of methodology, and that's concurrency. Anybody that's written a non-trivial multi-threaded program in an imperative language has dealt with the difficult issues of synchronization, locking, and sharing of state. The larger and more complex the program, the more difficult it becomes to avoid subtle errors and race conditions. For most programmers writing traditional multi-threaded applications, there are more things that can go wrong than right.

Functional programs that operate on immutable data don't have to deal with any of these issues. Since nothing is shared, there's no potential for race conditions, multiple threads stomping on one another's data, or any of the other trappings one might traditionally succumb to. Freeing the programmer from worrying about such things allows focus to be shifted to solving the actual problem at hand instead of dealing with the fine grained details of threading semantics.

Why should you care about threading? I believe the introduction to Java Concurrency in Practice puts it well when they say, "For the past 30 years, computer performance has been driven by Moore's Law; from now on, it will be driven by Amdahl's Law." Basically, processors aren't getting any faster. For many years, programmers have written code under the assumption that performance wasn't an issue because the next generation of processors would pick up the slack. We've reached the point where we'll never be able to assume this again. From this point on, things will become increasingly parallel, which means that learning to work with multi-core machines will become an increasingly marketable skill.

Traditionally relegated to academia, theorem proving, and compiler writing, functional languages are finally becoming mainstream by filling the multi-core niche. Even if you don't adopt a functional language for your day job, taking the time to learn a different method of problem solving will make you a better programmer and provide valuable techniques you can use in your day to day work.

This being said, there are some really nice languages out there to play around with. I will briefly highlight a handful of them for your perusal. Please take note, that not all of these adhere explicitly to the immutable state share nothing model previously mentioned; although, most provide exceptional facilities for multi-core programming.

Clojure - A modern Lisp running on the JVM. Clojure throws out all the cruft of old world Lisp and provides a lean and mean core with immutable data structures. One of Clojure's core features is excellent concurrency performance, and it also provides seamless access to Java libraries whenever desired.

Scala - Adpoted by Twitter to solve their initial concurrency issues that Ruby could not handle, Scala runs on the JVM, which makes deployment a breeze and provides a multi-paradigm approach to development.

Erlang - Initially developed as a proprietary language by Ericcson, Erlang was open sourced some years back. Erlang makes use of immutable state and sends data back and forth between managed processes using a built-in messaging system. The process model allows code to be seamlessly distributed across several cores, servers, or networks. Ericcson has successfully used Erlang to manage their telco switching on massive networks for many years.

Haskell - A purely functional language with an advanced type system. Haskell provides a robust feature set and an extremely advanced compiler. The ability to introduce arbitrary operators and a non-standard syntax can make Haskell intimidating at first glance, but it's concurrency peformance rivals any other language. Notable projects written in Haskell include the Xmonad window manager and Pugs, a Perl 6 compiler.

OCaml - Numerous wins in the ICFP helped OCaml gain initial notoriety. Native code compilers allow OCaml programs to run at blazing speeds, and a sophisticated type system provides both safety and flexibility.

F# - Developed by Microsoft Research, F# is a variant of ML with a core that is similar to OCaml. It runs on .Net and will be a fully supported language in the .NET Framework and Visual Studio ecosystem as part of Visual Studio 2010.

PLT Scheme - A functional language with a history in academia. The ubiquitious MIT book, how to design programs, uses PLT as it's host language. Excellent documentation makes PLT a great way to cut your teeth on functional programming, and it's concurrency features are actually pretty good as well. Threading in PLT is based on the mechanism used in Concurrent ML.

New Lisp - Although including the word "New" in the name of any product might not be the best idea, New Lisp fills a nice niche in the scripted functional language domain. It seems to be at least partially inspired by Paul Graham's Arc, and it provides a good number of modern features out of the box.

SBCL - One of the de-facto standard Common Lisp implementations. I haven't done any concurrency stuff in Lisp, but I do know that SBCL provides a high quality Lisp implementation for your general Lisp-ing needs. If SBCL doesn't meet your needs, Clisp is another very solid Lisp distribution.

Javascript - The ubiquitous Javascript language now has a JVM port and runs everywhere. It's one of the most widely deployed programming languages on earth and allows the application of higher order functions, lambdas, closures, currying, etc. Who would've thought there was a functional language right under your nose?

16 comments:

Anonymous said...

see also stackless Python. Not pure, but useful.

Unknown said...

what about ocaml?

Anonymous said...

F# belongs on this list too.

Anonymous said...

The D Programming Language in version 2 allows for "Pure" functions to be created that do not have any side effects. It's an interesting mix of imperative/functional programming because you can do non-pure things inside a pure function, but the won't affect anything outside of the function.

m i c h a e l said...

Actually, newLISP predates Arc by 10 years. newLISP first appeared in 1991, whereas Arc was not announced until 2001.

Stefan said...

Are there any serious benchmarks that show that implementations of functional languagues scale better than e.g. cilk++ or the fork-join framework for java?

Niki said...

You have the definition of a side effect wrong: a side effect is code that changes the state of the program. The print statement in your example isn't a side effect (although it is arguably bad practice).

So for example:
int g_SomeGlobal = 0;
int SomeFunction(int a, int b)
{
g_SomeGlobal = a + b;
return g_SomeGlobal;
}

has a side effect, in that it changes a global variable.

A similar functional concept is a pure function which is a function that doesn't depend on any external state. These are useful in that they always return the same result when given the same input.

Niki said...

Actually I'm a little mistaken, the print statement in your example does cause a side effect - the act of printing to the stdout changes the state of the stdout.

Travis Whitton said...

@Stefan

I'm not sure if it applies to the frameworks you've mentioned, but I found the following benchmark interesting.

Thread-Ring Benchmark

Isaac Gouy said...

@Travis Whitton

Note the Smalltalk program and then wonder if you might be looking at the task-switching advantage of light-weight threads compared to OS threads.

Travis Whitton said...

@Isaac Gouy

Haskell uses green threads and Erlang uses lightweight processes. Both run inside the VM and neither are managed by the OS, which does explain why context switches are so fast.

That being said, both Erlang and Haskell have SMP support in their runtime, which allows exploitation of multiple processors while taking advantage of cheap context switching.

I agree that in some degree it's comparing apples to oranges, but if you can take advantage light-weight threading, why not use it?

Jeremy said...

Surprised Javascript isn't on the list; it's got to be the most popular (by usage) functional language out there.

Anonymous said...

The destruction of OOP as brought by functional languages is interesting.

In particular Scala, by attempting to use its statically typed powers to bridge between Java and the next world fascinates me.

But Im also a bit perplexed at how we got here except to say, in reality a new system will have to be built on top of ALOT of bad code.

Decades of it by now.

So if the spread of functional languages is needed to jar the OOP paradigm back into some normal, scalable, productive venture, then Im all for it.

Even if its "just a fad", OOP at the enterprise level is basically dead.

Travis Whitton said...

@Jeremy

Good point! Added it to the list.

Dominique De Vito said...

Oh, yes, indeed, FP do matter more and more.

FP is risen since years.

In 2005, I have written 2 posts about this rise, due to, to give one example, multi-core issue.

Thoughts about Java, C# future and OCaml (?) relationships

More ramblings on programming languages

So, Java and C# are including more and more FP feature (well, this is more true for C# than Java, as its evolution is a bit stopped these days).

I have written recently Functional programming on-going adoption as a side-effect, as "mainstream languages have already some functional features". So, on-going FP adoption happens, with mainstream language adoption, often as a side-effect, as languages are still not that much adopted for FP features. Funny, isn't (due to FP nature) ;-) !?

Other news/trends push forward FP too, as Google do. Indeed, Google has helped popularized map/reduce distributed approach, and FP matches well this algorithm as said also in Functional programming on-going adoption as a side-effect.

So, FP rules!

PS: you may include also F# into your list, at least, a Microsoft clone of OCaml (due OCaml strong influence as stated here).

Unknown said...

JavaScript isn't a functional programming language at all. That's like saying Ruby is functional. The only functional aspect of it is anonymous functions.

Anyway, good article.