I've migrated my blog

Thanks for visiting my blog. I am no longer maintaining it and have migrated the blog to my personal domain. If you would like to follow my blog, kindly use my new RSS feed.

Tuesday, November 22, 2011

Going Declarative in C#

Declarative programming can often be a simpler, more concise way to describe the behaviour of a software program than imperative programming. I have been an admirer of the declarative aspects of programming ever since I started writing SQL queries. We always do our best to write code that is easy to read and maintain, and declarative style is one of the proven ways to write clean code. LINQ is an excellent example of declarative programming: it enables developers to simply state what they want to do. While learning higher order functions in Haskell, I discovered the relationship between higher order functions and LINQ. It really made me think about solving problems in a different way. Through this blog post I would like to share my experiments with higher order functions in C#.
Let me start with a very simple requirement.
Write a program to print the even numbers present in the given n numbers.
The implementation is fairly straightforward, as below.
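
A minimal sketch of such an implementation, assuming the numbers arrive as an `IEnumerable<int>`. The class name `EvenNumbersDemo` and the sample data are illustrative assumptions, not part of the requirement.

```csharp
using System;
using System.Collections.Generic;

public static class EvenNumbersDemo
{
    // Prints every even number in the sequence, one per line.
    public static void PrintEvenNumbers(IEnumerable<int> numbers)
    {
        foreach (int number in numbers)
        {
            if (number % 2 == 0)
            {
                Console.WriteLine(number);
            }
        }
    }

    public static void Main()
    {
        int[] numbers = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
        PrintEvenNumbers(numbers); // prints 2, 4, 6, 8 and 10
    }
}
```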

Fine. Let me add a twist by adding two more requirements.
Modify the program implemented above to print the odd numbers and the multiples of four present in the given n numbers.
To be honest, if I had encountered these requirements before I learnt higher order functions, my implementation would have been as follows.
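
A sketch of what that copy-and-paste style could look like. The three method names follow the PrintXXXX pattern mentioned in the text; the exact names and the class name `NaivePrinters` are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class NaivePrinters
{
    public static void PrintEvenNumbers(IEnumerable<int> numbers)
    {
        foreach (int number in numbers)
        {
            if (number % 2 == 0) Console.WriteLine(number);
        }
    }

    public static void PrintOddNumbers(IEnumerable<int> numbers)
    {
        foreach (int number in numbers)
        {
            // != 0 rather than == 1 so that negative odd numbers are included too
            if (number % 2 != 0) Console.WriteLine(number);
        }
    }

    public static void PrintMultiplesOfFour(IEnumerable<int> numbers)
    {
        foreach (int number in numbers)
        {
            if (number % 4 == 0) Console.WriteLine(number);
        }
    }

    public static void Main()
    {
        int[] numbers = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
        PrintEvenNumbers(numbers);
        PrintOddNumbers(numbers);
        PrintMultiplesOfFour(numbers);
    }
}
```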

If you look at the above implementation with a critical eye, you can spot the duplication. Let me explain the common pattern used in the PrintXXXX functions.
1.      For each number in the numbers enumerable
a.      Decide whether the number should be printed or not (Deciding)
b.      Print the number if it passes the above decision (Doing)

All three functions iterate over the numbers enumerable and print the numbers. The only thing that actually differentiates them is deciding which numbers are to be printed.

Now the question is: how can we eliminate this duplication?

This is where higher order functions come into the picture. If we move the deciding part of the function out of its implementation, we can easily eliminate the duplication. Here we go! The brand new implementation of Print would be
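
A minimal sketch of such a Print function, assuming the decision is passed in as a `Func<int, bool>`. The parameter name `shouldPrint` and the class name `HigherOrderPrinter` are illustrative.

```csharp
using System;
using System.Collections.Generic;

public static class HigherOrderPrinter
{
    // The "deciding" part is supplied by the caller; only the "doing" part lives here.
    public static void Print(IEnumerable<int> numbers, Func<int, bool> shouldPrint)
    {
        foreach (int number in numbers)
        {
            if (shouldPrint(number))
            {
                Console.WriteLine(number);
            }
        }
    }

    public static void Main()
    {
        int[] numbers = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
        Print(numbers, n => n % 2 == 0); // even numbers
        Print(numbers, n => n % 2 != 0); // odd numbers
        Print(numbers, n => n % 4 == 0); // multiples of four
    }
}
```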

In the new implementation we have isolated the deciding part of the function from its implementation and parameterized it as a function delegate that takes an integer as its input and returns a Boolean value. In the client code (the Main function) we simply call the Print function and declaratively tell it to print only those numbers which satisfy the given condition. Because we separated the deciding part from the actual implementation, we can easily accommodate future requirements such as “print multiples of five” or “print only single-digit numbers” by declaratively calling the Print function as below
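
A sketch of those two calls, with the Print function repeated so that the example stands on its own. The class name `MoreConditions` and the sample data are assumptions.

```csharp
using System;
using System.Collections.Generic;

public static class MoreConditions
{
    public static void Print(IEnumerable<int> numbers, Func<int, bool> shouldPrint)
    {
        foreach (int number in numbers)
        {
            if (shouldPrint(number)) Console.WriteLine(number);
        }
    }

    public static void Main()
    {
        int[] numbers = { 3, 5, 10, 12, 25, 101 };

        // Multiples of five: prints 5, 10 and 25
        Print(numbers, n => n % 5 == 0);

        // Single-digit numbers: prints 3 and 5
        Print(numbers, n => n > -10 && n < 10);
    }
}
```

No new PrintXXXX function is needed for either requirement; only the lambda changes.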

Cool, isn’t it? Let me complicate things a little more. What would you do if you wanted to call this Print method from different classes? A well-known option would be to create a utility class with the Print method and call it from the other classes. We can also solve this using extension methods, which results in clean, readable code as below
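
A sketch of the extension-method version. The static class name `IntEnumerableExtensions` is an assumption; any non-nested static class would do.

```csharp
using System;
using System.Collections.Generic;

public static class IntEnumerableExtensions
{
    // "this" on the first parameter lets callers write numbers.Print(...)
    public static void Print(this IEnumerable<int> numbers, Func<int, bool> shouldPrint)
    {
        foreach (int number in numbers)
        {
            if (shouldPrint(number)) Console.WriteLine(number);
        }
    }
}

public static class ExtensionDemo
{
    public static void Main()
    {
        int[] numbers = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
        numbers.Print(n => n % 2 == 0); // reads as "numbers, print the even ones"
    }
}
```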

So far, so good. We started with a single function, added two more, eliminated the duplication using higher order functions and finally made the code more readable by using an extension method.
Okay. Now “I want to print the strings which start with ‘s’ in the given n strings”. Pardon me for complicating things; I will stop with this one.
It is logically almost the same as what we have done so far; instead of numbers, we now have strings. How can we put it into action? Thanks to generics, we can easily achieve this by modifying the extension method to support a generic type as below
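
A sketch of the generic version, assuming a type parameter `T` on the extension method. The class names and the sample words are illustrative.

```csharp
using System;
using System.Collections.Generic;

public static class GenericEnumerableExtensions
{
    // Works for any element type; the caller supplies a matching predicate.
    public static void Print<T>(this IEnumerable<T> items, Func<T, bool> shouldPrint)
    {
        foreach (T item in items)
        {
            if (shouldPrint(item)) Console.WriteLine(item);
        }
    }
}

public static class GenericDemo
{
    public static void Main()
    {
        string[] words = { "sun", "moon", "star", "sky", "cloud" };
        // prints sun, star and sky
        words.Print(w => w.StartsWith("s", StringComparison.Ordinal));

        int[] numbers = { 1, 2, 3, 4 };
        numbers.Print(n => n % 2 == 0); // the same method still works for integers
    }
}
```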

That’s it. Now you are free to play with any sort of logic you want. You can use different conditions to print the elements, or even collections of your own custom classes. And all of it can be done declaratively!
Now it's time to reveal the relationship between LINQ and higher order functions. The LINQ methods are extension methods of the same kind as this Print method: under the hood they take functions as parameters, which makes the developer's life easier by letting them work declaratively.
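
To make the resemblance concrete, here is a simplified sketch of a `Where`-like method. `Filter` is a hypothetical name, and the real implementation inside `System.Linq` is more elaborate; the sketch only assumes the same shape, an extension method over `IEnumerable<T>` that takes a `Func<T, bool>`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FilterExtensions
{
    // Same shape as Print, but it hands matching items back instead of printing them.
    public static IEnumerable<T> Filter<T>(this IEnumerable<T> items, Func<T, bool> predicate)
    {
        foreach (T item in items)
        {
            if (predicate(item)) yield return item;
        }
    }
}

public static class LinqComparisonDemo
{
    public static void Main()
    {
        int[] numbers = { 1, 2, 3, 4, 5, 6 };

        foreach (int n in numbers.Filter(x => x % 2 == 0)) Console.WriteLine(n); // our sketch
        foreach (int n in numbers.Where(x => x % 2 == 0)) Console.WriteLine(n);  // LINQ's Where
    }
}
```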
The Parallel class, a new addition in .NET Framework 4.0, also uses higher order functions and enables the developer to say “Hey CLR, I wanna run these methods in parallel”.
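
A small sketch using `Parallel.Invoke`, which accepts a list of `Action` delegates. The three worker methods are hypothetical placeholders for real work, and the order of their output is not guaranteed.

```csharp
using System;
using System.Threading.Tasks;

public static class ParallelDemo
{
    // Hypothetical units of work; real code would do something more substantial.
    public static void ResizeImages() { Console.WriteLine("Resizing images"); }
    public static void SendEmails() { Console.WriteLine("Sending emails"); }
    public static void ArchiveLogs() { Console.WriteLine("Archiving logs"); }

    public static void RunAll()
    {
        // Invoke returns only after all three actions have completed.
        Parallel.Invoke(
            () => ResizeImages(),
            () => SendEmails(),
            () => ArchiveLogs());
    }

    public static void Main()
    {
        RunAll();
    }
}
```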

Awesome! No explicit thread creation and no verbose plumbing code.
Declarative programming is a powerful tool. It produces more readable, cleaner code and also reduces the chance of logical mistakes across multiple similar algorithms. That means fewer mistakes now and in the future.

Thursday, November 3, 2011

Think Before You LINQ

LINQ is an awesome feature and the one I like most in C#. The abstraction, expressiveness and power it brings to code are simply amazing. In general, when we think of abstractions, we tend to think about expressiveness and fluent interfaces and get carried away.

The efficiency of an abstraction is often an afterthought (based on my experience; correct me if I am wrong), and it is very hard to define an abstraction that is efficient for all the real-world problems it addresses. When we encounter an efficiency issue in an abstraction, it is our primary responsibility to get rid of it.

Let us assume that you have found an efficiency issue with an abstraction. How would you troubleshoot it? Think! I believe awareness of the internals of the abstraction is the prime prerequisite for working around the problem. Hence, as professional developers, we should be aware of what is happening under the hood when we use LINQ or any similar abstraction. Though we are not going to need this knowledge in most of our coding efforts, I feel it is an ideal weapon to keep in our arsenal.

I have encountered one such efficiency issue with LINQ, and it really made me think twice (even thrice) before applying LINQ to solve problems. Let me explain it through a simple example. Here is the problem which I am going to solve using LINQ.

“I need a method that takes a collection of numbers as its parameter and writes all the numbers to the console. If the collection contains only one number, it should not write anything.”

Here is the code snippet which addresses this problem, along with the output.
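
A minimal sketch of such a method, assuming the collection is passed as an `IEnumerable<int>`. The method name `PrintMe` comes from the text below; the class name and the sample data are assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PrintMeDemo
{
    public static void PrintMe(IEnumerable<int> numbers)
    {
        // Only print when the collection holds more than one number.
        if (numbers.Count() > 1)
        {
            foreach (int number in numbers)
            {
                Console.WriteLine(number);
            }
        }
    }

    public static void Main()
    {
        PrintMe(new[] { 1, 2, 3 }); // prints 1, 2 and 3
        PrintMe(new[] { 42 });      // prints nothing
    }
}
```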

I have used two abstractions in this function: one is the LINQ extension method “Count” and the other is iterating through the enumeration. Are you able to find the efficiency issue lurking in this very simple function? Kudos if you can.

Let me give a little background on LINQ extension methods and enumeration. Most LINQ extension methods use lazy (deferred) execution internally and compute the enumeration on demand. However, some extension methods (Count, Sum), collectively called aggregate operators, cause immediate execution of the enumeration instead. We can make an enumeration evaluate lazily by using “yield” statements. Enough theory; let us see some code which shows the efficiency issue in the function we saw earlier.

I have added some code to the “PrintMe” method to log how it actually gets executed. I have also added the “GetNumbers” method, which lazily produces a sequence of numbers using the yield statement.
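
A sketch of that instrumented version. The log messages and the three-element sequence are assumptions chosen to keep the output short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyPrintMeDemo
{
    // Lazily produces 1, 2, 3 and logs each time an element is yielded.
    public static IEnumerable<int> GetNumbers()
    {
        for (int i = 1; i <= 3; i++)
        {
            Console.WriteLine("Yielding " + i);
            yield return i;
        }
    }

    public static void PrintMe(IEnumerable<int> numbers)
    {
        Console.WriteLine("Calling Count");
        if (numbers.Count() > 1)
        {
            Console.WriteLine("Starting foreach");
            foreach (int number in numbers)
            {
                Console.WriteLine("Printing " + number);
            }
        }
    }

    public static void Main()
    {
        PrintMe(GetNumbers());
        // Output:
        // Calling Count
        // Yielding 1
        // Yielding 2
        // Yielding 3
        // Starting foreach
        // Yielding 1
        // Printing 1
        // Yielding 2
        // Printing 2
        // Yielding 3
        // Printing 3
    }
}
```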

Now are you able to find the exact issue with the method “PrintMe”? The red lines are the areas of concern. The sequence is yielded twice! The first time is when the LINQ aggregate operator “Count” forces immediate execution and enumerates all the yields; the second is when the “foreach” loop lazily enumerates the sequence once again.

Though it is just a matter of nanoseconds in this example, it may be a possible bottleneck in the real world. So, whenever you perform more than one operation on a LINQ query, an enumeration or both combined, do not forget to think about efficiency. In fact, we should pay special attention to the speed of our algorithm while we are actually coding it. (Refer to The Pragmatic Programmer, Chapter 6, While You Are Coding.)

I hope you are now ready to think about efficiency when you code. In our case we can get rid of the efficiency issue by converting the enumeration of numbers to an array or a list using the LINQ conversion operators ToArray or ToList respectively. Like the aggregate operators, they cause immediate execution and convert the enumeration to the target type. Then we can perform the operations on the converted target. Here is the code snippet for that, with the output.
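
A sketch of the modified method using `ToList`, with the same assumed logging as before so the difference in the output is visible.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MaterializedPrintMeDemo
{
    public static IEnumerable<int> GetNumbers()
    {
        for (int i = 1; i <= 3; i++)
        {
            Console.WriteLine("Yielding " + i);
            yield return i;
        }
    }

    public static void PrintMe(IEnumerable<int> numbers)
    {
        Console.WriteLine("Calling ToList");
        List<int> list = numbers.ToList(); // the only pass over the lazy sequence

        // Count here is the List<int> property, not the LINQ method, so nothing is re-enumerated.
        if (list.Count > 1)
        {
            foreach (int number in list)
            {
                Console.WriteLine("Printing " + number);
            }
        }
    }

    public static void Main()
    {
        PrintMe(GetNumbers());
        // Output:
        // Calling ToList
        // Yielding 1
        // Yielding 2
        // Yielding 3
        // Printing 1
        // Printing 2
        // Printing 3
    }
}
```

The trade-off is memory: the whole sequence is held in the list, which is fine for small inputs but worth a second thought for very large ones.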

The modified code now iterates through the sequence only once!


Would you use a powerful weapon to get rid of a small problem? Certainly not. LINQ is such a powerful weapon, meant to solve powerful problems. So think twice before using LINQ, and don’t use it blindly. Efficiency matters!