Sunday, 18 December 2011

I followed through on my threat to do a presentation on using tiny objects. My supposition is that most developers (myself included) resist creating trivial (tiny) objects even when they could clean up code. Objects don’t have to possess lots of properties or methods to become useful.

Let’s take this example…

Random _randy = new Random();
public List<double> TestSeriesPoints(int numberOfPoints, double minValue, double maxValue)
{
    if(minValue >= maxValue)
    {
       throw new Exception("minValue must be less than maxValue");
    }

    List<double> points = new List<double>();
    for (int pointIndex = 0; pointIndex < numberOfPoints; pointIndex++)
    {
        double point = _randy.NextDouble() * (maxValue - minValue) + minValue;
        points.Add(point);
    }
    return points;
}

It is a pretty straightforward method that returns some test data within the range of a min and max value.  It's a short method, and pretty easy to grok, so 99 times out of 100, it wouldn't be given a second glance.  But there is a tiny object buried in here, one that will not only make this code cleaner, it will make modifying it easier too.

Any time we see two variables that are always used in tandem, like minValue and maxValue here, it is time to make a Tiny Object.

Where there’s a min value, there’s bound to be a max value. They don’t exist without each other, so let’s make the commitment and formalize the relationship…

public class DoubleRange
{
    public double MinValue = 0;
    public double MaxValue = 100;
}

By combining two floating (but always proximate) variables into a single Tiny Object, we've pulled ourselves out of procedural thinking and opened up the OOP world.  We could leave it like this and still get some benefits.  But by its mere creation, our new Tiny Object has given us something concrete to work with and re-use. It gives us a place to add behaviors and properties that consuming code will no longer have to repeatedly re-implement.

Once this object has materialized, it becomes apparent we can ensure that our range is always valid by using a constructor and encapsulated properties.  And we can turn the actual range of our DoubleRange into a property, so the consuming code does not have to recalculate it. Finally, we can also create a default constructor that sets our range to common min and max values.

public class DoubleRange
{
    private double _minValue;
    private double _maxValue;

    public DoubleRange():this(0,100)
    {
    }

    public DoubleRange(double minValue, double maxValue)
    {
        if(minValue >= maxValue)
        {
             throw new Exception("minValue must be less than maxValue");
        }
        _minValue = minValue;
        _maxValue = maxValue;
    }

    public double MinValue
    {
        get{return _minValue;}
    }

    public double MaxValue
    {
        get{return _maxValue;}
    }

    public double Range
    {
        get{return MaxValue - MinValue;}
    }
}

Our consuming code can boil off some logic that lives in the Tiny Object, so it looks like this…

Random _randy = new Random();
public List<double> TestSeriesPoints(int numberOfPoints, DoubleRange testDataRange)
{
    List<double> points = new List<double>();
    for (int pointIndex = 0; pointIndex < numberOfPoints; pointIndex++)
    {
        double point = _randy.NextDouble() * testDataRange.Range + testDataRange.MinValue;
        points.Add(point);
    }
    return points;
}
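
Just to see the payoff from the calling side, here is a quick usage sketch (the values are made up, and I'm assuming we're calling from inside the same class that holds TestSeriesPoints)…

DoubleRange testDataRange = new DoubleRange(-1.5, 1.5);
List<double> points = TestSeriesPoints(50, testDataRange);

// the range now polices itself -- this throws before a single point is generated
DoubleRange badRange = new DoubleRange(10, 5);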

Our Tiny Object has grown into something useful…

  • the consuming code is clearer, working with a DoubleRange object, not some implicitly related values
  • the unexplained (maxValue – minValue) expression in our calculation is explicitly defined in its new object home
  • the object handles its own validation

Finally, adding one small method to our Tiny Object will also make debugging and logging easier…

public override string ToString()
{
   return string.Format("Range {0:n4} to {1:n4}", _minValue, _maxValue);
}

Now, any time the object is added to a log statement, we’ll always see the current range values like so…

Range 0.0000 to 100.0000

That gives us three of the four joys of OOP -- abstraction, encapsulation, and polymorphism -- all in one Tiny Object.

So the next time you see two or three variables spending a lot of time together, do the right thing: move them into their own Tiny Object and see what other wonderful attributes and behaviors blossom.

And always remember there are no Tiny Objects, only tiny developers.

Sunday, 18 December 2011 16:09:56 (Eastern Standard Time, UTC-05:00)
Wednesday, 16 November 2011

The links have the zipped versions of the before and after refactoring projects from the Break the If Habit presentation. 

Wednesday, 16 November 2011 00:29:11 (Eastern Standard Time, UTC-05:00)

I headed out to Columbia, Maryland this past weekend to speak at the CMAP Fall 2011 Code Camp. I was shocked that so many people wanted to see code refactoring at 8:45 on a Saturday morning, but was happy to have an excellent crowd.

The presentation was on breaking the If habit.  To steal the abstract…

Branching logic is one of the foundations of computer programming. But like vodka, the If statement is best consumed in moderation. While the first If is harmless, and maybe even a little thrilling, soon there's a second, a third, a fourth and before you know it -- you wake up under your desk with a method you don't even recognize much less understand.

In this session we'll explore some of the common If abuses and learn how to refactor them into more readable and adaptable code. We'll use design patterns to put the demon If back in its bottle and keep it there. Saner code is within reach, and this is the first step to breaking the If habit.

The inspiration for this talk was some legacy code I had to update.  It was sagging under a dense thicket of if statements that crossed two methods and already had 18 paths. I was supposed to add yet another set of if statements to support yet another variation of the report this code generated.

I just couldn't bring myself to do it.  It had taken me two hours to analyze where the changes had to be made so they wouldn't break anything else. I knew adding these statements was just stuffing more complexity into two methods that were already bursting with branching logic.  The next change would require even more analysis and would be even more likely to break something.

So, I started ripping out code.  I'd like to claim I was purposefully refactoring to a pattern, but it was really more instinctual, moving the code for each variation into its own class.  It turned out that what I was doing was replacing a conditional with polymorphism, or implementing a Strategy pattern.

The example from the talk started out looking like this…

public double CalculateProfitSharing(Employee employee, DateTime asOfDate, double grossProfit, EmployeeStats stats)
{
    TimeSpan lengthofEmployment = asOfDate - employee.StartDate;
    double pool = grossProfit * 0.04;
    double profitSharingAmount = 0;
    if (lengthofEmployment.Days > 182)
    {
        double basePartofPool = 1.0 / (double)stats.EmployeeCount * pool;
        double loyaltyMultiplier = lengthofEmployment.Days / stats.AverageLengthEmploymentinDays;

        //current year pro-rate
        double proratedRate = 1;
        if (lengthofEmployment.Days < 365)
        {
            proratedRate = lengthofEmployment.Days / 365.0;
        }
        profitSharingAmount = basePartofPool * loyaltyMultiplier * proratedRate;

        if (employee.Sales)
        {
            profitSharingAmount = 0;
        }
        else if (employee.Level < 4)
        {
            if (employee.Level == 2)
            {
                profitSharingAmount *= 1.5;
            }
            else if (employee.Level == 3)
            {
                profitSharingAmount *= 2;
            }
            if (profitSharingAmount > 10000)
            {
                profitSharingAmount = 10000;
            }
        }
        else
        {
            profitSharingAmount *= 2.5;
            if (profitSharingAmount > 5000)
            {
                profitSharingAmount = 5000;
            }
        }
    }
    if (profitSharingAmount < 0 && employee.Level < 4)
    {
        profitSharingAmount = 0;
    }
    if (employee.Level == 4)
    {
        if (profitSharingAmount < -5000)
        {
            profitSharingAmount = -5000;
        }

        if (profitSharingAmount > 0)
        {
            if ((asOfDate - employee.LastOptionGrant).Days <= 365)
            {
                profitSharingAmount = 0;
            }
        }
    }
    if (employee.Level < 4)
    {
        if (employee.ChoseDeferred)
        {
            profitSharingAmount *= 1.25;
        }
    }
    return Math.Ceiling(profitSharingAmount);
}

And ended up looking like this...

public double CalculateProfitSharing(Employee employee, DateTime asOfDate, double grossProfit, EmployeeStats stats)
{
     var profitSharingLevel = GetProfitSharingLevel(employee);
     return profitSharingLevel.CalculateProfitSharing(employee, asOfDate, grossProfit, stats);
}

private ProfitSharingEmployeeLevelBase GetProfitSharingLevel(Employee employee)
{
     if(profitSharingLevels.ContainsKey(employee.Level))
     {
         return profitSharingLevels[employee.Level];
     }

     return new ProfitSharingEmployeeLevelNullObject();
}

There is obviously some code in the new classes that encapsulates the variations, so there is not a reduction in the total amount of code. But every method becomes much easier to understand when you don't have to scroll through pages of nested if statements.  It also becomes easier to change and maintain code when your rules for each variation aren't tangled together in gobs of if statements.
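
The level classes themselves aren't listed here, but their shape is roughly this -- a sketch based only on how they are called above, so the real classes from the talk surely differ in the details (I'm also assuming profitSharingLevels is a Dictionary<int, ProfitSharingEmployeeLevelBase> keyed by employee level)…

public abstract class ProfitSharingEmployeeLevelBase
{
    public abstract double CalculateProfitSharing(Employee employee, DateTime asOfDate, double grossProfit, EmployeeStats stats);
}

public class ProfitSharingEmployeeLevelNullObject : ProfitSharingEmployeeLevelBase
{
    // an unrecognized level simply gets no profit sharing -- and callers never have to null check
    public override double CalculateProfitSharing(Employee employee, DateTime asOfDate, double grossProfit, EmployeeStats stats)
    {
        return 0;
    }
}

Each real level gets its own subclass holding only its own multipliers and caps, which is exactly where the branching logic from the original method went.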

The presentation also included how to use the Null Object Pattern so you can stop sprinkling your code with null checks (there’s an example above – just imagine GetProfitSharingLevel is public and is called by more than one other method). It was also supposed to show the replacement of branching logic with the Decorator Pattern, but we ran out of time. I’ll try to explore these in more detail in other blog posts, but not tonight.

I think the root cause of most If abuse is the refusal of developers to create objects, especially small ones.  I think there is an innate aversion to creating seemingly trivial objects no matter how helpful they might be.  Packaging two pieces of related information and/or a method together should be reason enough to make a new class, but that doesn’t seem to pass the “I need a new class” threshold for most developers. I know it’s below mine. 

For instance, implementing a calculation for a tiered sales commission is usually done with if statements. 

double commission = 0;
if (yearSales < yearTarget)
{
    commission = 0.005 * yearSales;
}
else if (yearSales < yearTarget * 2)
{
    commission = 0.01 * (yearSales - yearTarget) + 0.005 * yearTarget;
}
else
{
    commission = 0.02 * (yearSales - yearTarget * 2) + 0.01 * yearTarget + 0.005 * yearTarget;
}
compensation += commission;

And it's fine, it works, it is even fairly compact – but it takes a while to understand, extending it requires additional if statements, and it is buried inside a method, wholly unusable anywhere else.  But the worst sin is that we have a clear object here with two properties and a method, and we have denied its existence.

What we should do is create a small object…

public class CommissionTier
{
    private double _tierStart;
    private double _rate;

    public CommissionTier(double tierStart, double rate)
    {
        _tierStart = tierStart;
        _rate = rate;
    }

    public double TierStart
    {
        get { return _tierStart; }
    }

    public double GetCommission(double sales)
    {
        return (sales - TierStart) * _rate;
    }

    public override string ToString()
    {
        return string.Format("Tier Start:{0:c0}, Rate {1:0.00%}", _tierStart, _rate);
    }
}

Then we can use a looping structure that is clearer about what we're doing. It is also now easily extensible.  To add a tier, we just need to add a new instance of the CommissionTier object to the collection – no logic changes required.  Heck, we could even drive tier creation from a database…

// tiers are added from the highest start down -- the loop below depends on that order
_tiers.Add(new CommissionTier(yearTarget * 2, 0.02));
_tiers.Add(new CommissionTier(yearTarget, 0.01));
_tiers.Add(new CommissionTier(0, 0.005));

double commission = 0;
double workingSales = yearSales;
foreach (CommissionTier tier in _tiers)
{
    if (workingSales > tier.TierStart)
    {
        commission += tier.GetCommission(workingSales);
        workingSales = tier.TierStart;
    }
}
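
If we want to squeeze a little more reuse out of it, the whole calculation drops neatly into one small method -- a sketch, where the _tiers declaration and the method name are my own, and which assumes the tiers stay ordered from highest start to lowest as above…

private readonly List<CommissionTier> _tiers = new List<CommissionTier>();

public double CalculateCommission(double yearSales)
{
    double commission = 0;
    double workingSales = yearSales;
    foreach (CommissionTier tier in _tiers)
    {
        if (workingSales > tier.TierStart)
        {
            commission += tier.GetCommission(workingSales);
            workingSales = tier.TierStart;
        }
    }
    return commission;
}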

By promoting the commission tier to an object instead of leaving it strewn about in manually related variables, magic numbers and calculations, all of a sudden we have all the tools of OOP available to us. Our logic and our data are packaged together nicely.  The object has control of its data and we can leverage the power of collections.

So instead of continuing to pack procedural programming spaghetti into our methods, we need to be object-oriented designers and create more frickin' objects. Only then can we begin to break the If habit.

Wednesday, 16 November 2011 00:03:34 (Eastern Standard Time, UTC-05:00)
Thursday, 18 May 2006
Just because an application is developed in .NET doesn't make it object oriented.

It might have classes.

It might have interfaces.

It might have methods.

It can still be a mass of spaghetti code, with some objects thrown in to act as meatballs.

It never ceases to amaze me what lengths developers will go to in order to turn .NET into an overwrought scripting language, instead of harnessing some simple OOP principles to make their code and their lives simpler.

So, today we are going to talk about Encapsulation. Yes, I have spent the last two weeks ripping through an app that had objects, but frequently didn't use them as much more than fancy structures.  And, yes, I will rant a little bit (more).

Encapsulation is probably the simplest OOP concept to grasp, and it should be the easiest to implement.  A developer does not need to understand class factories, inheritance or polymorphism to develop encapsulated classes.  A desire for clean interfaces coupled with an abhorrence for writing the same code twice will lead naturally to encapsulation.

Yet this well-known tool of code reuse is often left unloved and dusty, next to the once-cracked Gang of Four OOP bible.

I like to think of Encapsulation as empowering an object.  It allows the object to say, "This is what I do. This is what I expect.  Don't tell me how to do my job, just give me what I need and let me do it."

Simple, no?  You'd think so, but it is clearly not a universal practice.

Here is an object.  What it does really doesn't matter for this discussion.  It has two ways of being populated -- one for when it is first created, another for when it is reconstituted from the database. It has two constructors. This is a very common situation.

public class SomeEntity{

    //Constructor for new instance
     public SomeEntity(string Name, string ANeededValue, int AnotherNeededValue){...}
    
    //Constructor for a retrieved instance
     public SomeEntity(int ID, string Name, XmlDocument TheReasonForTheObject){...}

    //Members
    private string mName;
    //...

    //Properties, Methods, Etc.

    public void DoSomethingImportant(){...}

    public string Name{
        get{return mName;}
    }
    //...
}

Now, I don't have a problem with the first constructor -- at least there is one, and it actually demonstrates an important bit of encapsulation: controlling how the object is instantiated.  The constructor gives a way to tell the world, "This is what I need to start properly!"

To see the free-for-all that can result without encapsulated instantiation, here's the same object sans a defined constructor...

public class SomeEntity{

    //default constructor
    public SomeEntity(){}

    //Members
    public string Name;
    public string ANeededValue;
    public int AnotherNeededValue;

    //Properties, Methods, Etc....
    public void DoSomethingImportant(){...}

}

What's required to make sure an instance of this object works properly?  The developer's diligence, memory and typing skills.  Everywhere the object is created, the developer must remember to type four lines of code -- just to get the object in a state where it is minimally functional.

SomeEntity AnObjectINeed = new SomeEntity();
AnObjectINeed.Name = "Fred";
AnObjectINeed.ANeededValue = "Tuesday";
AnObjectINeed.AnotherNeededValue = 789;

Forget to supply a value -- oops, error. What's missing?  Hope the exception message is clear and go hunting. It is bad enough that this code will be cut-and-pasted willy-nilly every time the object is needed, but what is worse is the maintenance implication.  Things change, and the object needs an additional value to work properly....

public class SomeEntity{

    //Members
    public string Name;
    public string ANeededValue;
    public int AnotherNeededValue;
    public DateTime OoopsForgotThis;

    //Properties, Methods, Etc....
    public void DoSomethingImportant(){...}

}

Hmmm, what happens now? The object won't work without the new value, but the application still compiles.  The developer must hunt for EVERY creation of the object and add another line of code, and then clean up the inevitable bugs when he misses a few.

Using a defined constructor avoids this problem.

    //Constructor for new instance
     public SomeEntity(string Name, string ANeededValue, int AnotherNeededValue, DateTime OoopsForgotThis){...}

The application won't compile now. All the places the code needs to be changed will be listed; they don't have to be hunted for.  Once the changes have been made and the app compiles, the object will always have what it needs to do its job.
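
And once those call sites are fixed, each one collapses back to a single line (the date here is just an example value)…

SomeEntity AnObjectINeed = new SomeEntity("Fred", "Tuesday", 789, DateTime.Today);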

Pardon the long aside, and let me finally return to what bothered me about the second constructor in the example above...

//Constructor for a retrieved instance
  public SomeEntity(int ID, string Name, XmlDocument TheReasonForTheObject){...}

What bothers me here is a subtle, but more troubling, lack of encapsulation related to this constructor. 

In order to call this constructor, three parameters must be supplied: an ID, a name and an XmlDocument. Now, where are these values stored? In a database record.  They are retrieved via a stored procedure call using the entity's ID, which is a unique integer value.

A UNIQUE value.

The ID is unique and possesses all the object needs to know to reconstitute itself.  Why then, are the other two values needed in the constructor?

They're NOT!

Why are they there? 

I have no friggin' clue.  But this is part of the foo I've been fighting for the last few weeks. There are 5 or 6 lines of database-related code that precede every use of this constructor, along with the creation of an XmlDocument object from its string representation.  That's 8 lines of code repeated each time the object is recreated from the database, all of which SHOULD have been placed in the object, so it could have populated itself.
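
Picture something like this sitting in front of every single use -- a sketch of the shape only, with the stored procedure, column and variable names invented and the connection setup omitted…

SqlCommand command = new SqlCommand("usp_GetSomeEntity", connection);
command.CommandType = CommandType.StoredProcedure;
command.Parameters.AddWithValue("@ID", id);
SqlDataReader reader = command.ExecuteReader();
reader.Read();
string name = (string)reader["Name"];
XmlDocument document = new XmlDocument();
document.LoadXml((string)reader["EntityXml"]);
SomeEntity entity = new SomeEntity(id, name, document);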

But that's not all.  What is controlling whether or not:
1) The name and XML document supplied are actually related to the supplied ID?
2) The XmlDocument has the proper schema the object expects?

The memory, diligence and typing skills of the developer.  In other words, NO ONE! 

By writing the object so it controls its own data retrieval, these problems are avoided. The object saved (or should have saved -- don't get me started) its own data and can be fairly confident it is getting the same information back -- in proper form. 

//Constructor for a retrieved instance
public SomeEntity(int ID){...}

The object is empowered, it controls its data, it works properly.  It can be used in code without requiring 8 supporting lines of code to be written each time.

Why? 

It's encapsulated!
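
Inside, the constructor does essentially what all of those call sites used to do, just once and in one place -- again a sketch, with the GetConnection helper, procedure, column and member names standing in for whatever the real data access looks like…

public SomeEntity(int ID)
{
    // GetConnection() is a stand-in for however the app hands out an open connection
    SqlCommand command = new SqlCommand("usp_GetSomeEntity", GetConnection());
    command.CommandType = CommandType.StoredProcedure;
    command.Parameters.AddWithValue("@ID", ID);
    using (SqlDataReader reader = command.ExecuteReader())
    {
        reader.Read();
        mName = (string)reader["Name"];
        mTheReasonForTheObject = new XmlDocument();
        mTheReasonForTheObject.LoadXml((string)reader["EntityXml"]);
    }
}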

Want to make the app work when it's disconnected and need to store the object in a local Access database?  No problem, change the storage and retrieval code in the object.  The rest of the app doesn't know and doesn't care how or where the object data is stored.  All it knows and cares about is that if it saved a SomeEntity object with a unique ID = 1776, then when this code is called...

    SomeEntity AnObjectINeed = new SomeEntity(1776);

It will get back a SomeEntity object with an ID = 1776 and the proper XmlDocument.

An object born ready to do what it is supposed to do.

What's hard about that?

Really?

Thursday, 18 May 2006 22:35:28 (Eastern Standard Time, UTC-05:00)
