Why is the 'if' statement considered evil?

Vadim picture Vadim · Oct 12, 2009 · Viewed 39.2k times · Source

I just came from Simple Design and Testing Conference. In one of the session we were talking about evil keywords in programming languages. Corey Haines, who proposed the subject, was convinced that if statement is absolute evil. His alternative was to create functions with predicates. Can you please explain to me why if is evil.

I understand that you can write very ugly code abusing if. But I don't believe that it's that bad.

Answer

Groo picture Groo · Oct 12, 2009

The if statement is rarely considered as "evil" as goto or mutable global variables -- and even the latter are actually not universally and absolutely evil. I would suggest taking the claim as a bit hyperbolic.

It also largely depends on your programming language and environment. In languages which support pattern matching, you will have great tools for replacing if at your disposal. But if you're programming a low-level microcontroller in C, replacing ifs with function pointers will be a step in the wrong direction. So, I will mostly consider replacing ifs in OOP programming, because in functional languages, if is not idiomatic anyway, while in purely procedural languages you don't have many other options to begin with.

Nevertheless, conditional clauses sometimes result in code which is harder to manage. This does not only include the if statement, but even more commonly the switch statement, which usually includes more branches than a corresponding if would.

There are cases where it's perfectly reasonable to use an if

When you are writing utility methods, extensions or specific library functions, it's likely that you won't be able to avoid ifs (and you shouldn't). There isn't a better way to code this little function, nor make it more self-documented than it is:

// this is a good "if" use-case
int Min(int a, int b)
{
    if (a < b) 
       return a;
    else
       return b;
}

// or, if you prefer the ternary operator
int Min(int a, int b)
{
    return (a < b) ? a : b;
}

Branching over a "type code" is a code smell

On the other hand, if you encounter code which tests for some sort of a type code, or tests if a variable is of a certain type, then this is most likely a good candidate for refactoring, namely replacing the conditional with polymorphism.

The reason for this is that by allowing your callers to branch on a certain type code, you are creating a possibility to end up with numerous checks scattered all over your code, making extensions and maintenance much more complex. Polymorphism on the other hand allows you to bring this branching decision as closer to the root of your program as possible.

Consider:

// this is called branching on a "type code",
// and screams for refactoring
void RunVehicle(Vehicle vehicle)
{
    // how the hell do I even test this?
    if (vehicle.Type == CAR)
        Drive(vehicle);
    else if (vehicle.Type == PLANE)
        Fly(vehicle);
    else
        Sail(vehicle);
}

By placing common but type-specific (i.e. class-specific) functionality into separate classes and exposing it through a virtual method (or an interface), you allow the internal parts of your program to delegate this decision to someone higher in the call hierarchy (potentially at a single place in code), allowing much easier testing (mocking), extensibility and maintenance:

// adding a new vehicle is gonna be a piece of cake
interface IVehicle
{
    void Run();
}

// your method now doesn't care about which vehicle 
// it got as a parameter
void RunVehicle(IVehicle vehicle)
{
    vehicle.Run();
}

And you can now easily test if your RunVehicle method works as it should:

// you can now create test (mock) implementations
// since you're passing it as an interface
var mock = new Mock<IVehicle>();

// run the client method
something.RunVehicle(mock.Object);

// check if Run() was invoked
mock.Verify(m => m.Run(), Times.Once());

Patterns which only differ in their if conditions can be reused

Regarding the argument about replacing if with a "predicate" in your question, Haines probably wanted to mention that sometimes similar patterns exist over your code, which differ only in their conditional expressions. Conditional expressions do emerge in conjunction with ifs, but the whole idea is to extract a repeating pattern into a separate method, leaving the expression as a parameter. This is what LINQ already does, usually resulting in cleaner code compared to an alternative foreach:

Consider these two very similar methods:

// average male age
public double AverageMaleAge(List<Person> people)
{
    double sum = 0.0;
    int count = 0;
    foreach (var person in people)
    {
       if (person.Gender == Gender.Male)
       {
           sum += person.Age;
           count++;
       }
    }
    return sum / count; // not checking for zero div. for simplicity
}

// average female age
public double AverageFemaleAge(List<Person> people)
{
    double sum = 0.0;
    int count = 0;
    foreach (var person in people)
    {
       if (person.Gender == Gender.Female) // <-- only the expression
       {                                   //     is different
           sum += person.Age;
           count++;
       }
    }
    return sum / count;
}

This indicates that you can extract the condition into a predicate, leaving you with a single method for these two cases (and many other future cases):

// average age for all people matched by the predicate
public double AverageAge(List<Person> people, Predicate<Person> match)
{
    double sum = 0.0;
    int count = 0;
    foreach (var person in people)
    {
       if (match(person))       // <-- the decision to match
       {                        //     is now delegated to callers
           sum += person.Age;
           count++;
       }
    }
    return sum / count;
}

var males = AverageAge(people, p => p.Gender == Gender.Male);
var females = AverageAge(people, p => p.Gender == Gender.Female);

And since LINQ already has a bunch of handy extension methods like this, you actually don't even need to write your own methods:

// replace everything we've written above with these two lines
var males = list.Where(p => p.Gender == Gender.Male).Average(p => p.Age);
var females = list.Where(p => p.Gender == Gender.Female).Average(p => p.Age);

In this last LINQ version the if statement has "disappeared" completely, although:

  1. to be honest the problem wasn't in the if by itself, but in the entire code pattern (simply because it was duplicated), and
  2. the if still actually exists, but it's written inside the LINQ Where extension method, which has been tested and closed for modification. Having less of your own code is always a good thing: less things to test, less things to go wrong, and the code is simpler to follow, analyze and maintain.

Huge runs of nested if/else statements

When you see a function spanning 1000 lines and having dozens of nested if blocks, there is an enormous chance it can be rewritten to

  1. use a better data structure and organize the input data in a more appropriate manner (e.g. a hashtable, which will map one input value to another in a single call),
  2. use a formula, a loop, or sometimes just an existing function which performs the same logic in 10 lines or less (e.g. this notorious example comes to my mind, but the general idea applies to other cases),
  3. use guard clauses to prevent nesting (guard clauses give more confidence into the state of variables throughout the function, because they get rid of exceptional cases as soon as possible),
  4. at least replace with a switch statement where appropriate.

Refactor when you feel it's a code smell, but don't over-engineer

Having said all this, you should not spend sleepless nights over having a couple of conditionals now and there. While these answers can provide some general rules of thumb, the best way to be able to detect constructs which need refactoring is through experience. Over time, some patterns emerge that result in modifying the same clauses over and over again.