Friday, November 7, 2014

.NET Iteration Performance

Last night I gave a talk to the .NET User Group of British Columbia on .NET Performance Engineering. During the presentation I demonstrated the performance characteristics of four different C# iteration mechanisms: incremented for, decremented for, foreach and LINQ.

Here (as promised to the audience) is the code for the test application and the individual test cases:
namespace PerformanceMeasurement
{
     using System;
     using System.Collections.Generic;
     using System.Diagnostics;
     using System.IO;
     using System.Linq;

     public class Program
    {
         private enum TimerUnits
        {
            Ticks,
            Milliseconds
        }

         private enum ReturnValue
        {
            TotalTime,
            AverageTime
        }

         public static void Main(string[] args)
        {   
             if (Debugger.IsAttached)
            {                  Console.WriteLine("A debugger is attached! JIT-compiled code will not be optimized ");
            }
            
            Console.Write("Press any key to warm up tests");
            Console.ReadKey();
            var testData = InitAndWarmUpIterationTests();
            Console.Write("Press any key to start tests     ");
            Console.ReadKey();
            Console.WriteLine("\rStarting Performance Tests      ");

#if DEBUG
            Console.WriteLine("This is a Debug build! JIT-compiled code will not be optimized ");
#endif
        
            Console.WriteLine();

            const int IterationCount = 1000;

            RuntIterationTests(IterationCount, testData);
            RuntCollectionTests(IterationCount);

            Console.Write("Tests Complete. Press any key to exit.");
            Console.ReadLine();
        }

        private static List<List<string>> InitAndWarmUpIterationTests()
        {
            var testData = CreateIterationTestData();

            var warmupResults = new List<int>
                                    {
                                        WordCountWithLinq(testData),
                                        WordCountWithForEach(testData),
                                        WordCountWithFor(testData),
                                        WordCountWithForDownward(testData)
                                    };

            Trace.WriteLine(warmupResults.Count);
            
            return testData;
        }

        private static List<List<string>> CreateIterationTestData()
        {
            var entries = new List<List<string>>();

            const string Path = @"data\";

            foreach (var fileName in Directory.EnumerateFiles(Path))
            {
                using (var stream = File.OpenRead(fileName))
                using (var bufferedStream = new BufferedStream(stream))
                {
                    var reader = new StreamReader(bufferedStream);
                    var lines = new List<string>();
                    string current;
                    while ((current = reader.ReadLine()) != null)
                    {
                        lines.Add(current);
                    }

                    entries.Add(lines);
                }
            }

            return entries;
        }

        private static void RuntIterationTests(int iterationCount, List<List<string>> testData)
        {
            Console.WriteLine("Iteration Performance Tests:");

            const TimerUnits Units = TimerUnits.Ticks;
            const ReturnValue ReturnValue = ReturnValue.AverageTime;

            Console.WriteLine("Creating Test Data...");
            var results = new List<int>();

            Console.WriteLine("Running Tests...");
            var t1 = TimeMe(WordCountWithLinq, testData, iterationCount, ref results, Units);
            var t2 = TimeMe(WordCountWithForEach, testData, iterationCount, ref results, Units);
            var t3 = TimeMe(WordCountWithFor, testData, iterationCount, ref results, Units);
            var t4 = TimeMe(WordCountWithForDownward, testData, iterationCount, ref results, Units);

            Console.WriteLine("Result Count: {0}", results.Count);
            Console.WriteLine("WordCountWithLinq \t\t{0}\t: {1} {2}", ReturnValue, t1, Units);
            Console.WriteLine("WordCountWithForEach \t\t{0}\t: {1} {2}", ReturnValue, t2, Units);
            Console.WriteLine("WordCountWithFor \t\t{0}\t: {1} {2}", ReturnValue, t3, Units);
            Console.WriteLine("WordCountWithForDownward \t{0}\t: {1} {2}", ReturnValue, t4, Units);
            Console.WriteLine();
        }

        private static void RuntCollectionTests(int iterationCount)
        {
            const TimerUnits Units = TimerUnits.Ticks;
            const ReturnValue ReturnValue = ReturnValue.AverageTime;

            var results = new List<int>();

            const int MaxValue = 10000;
            var t1 = TimeMe(CreateAndEnumerateList, MaxValue, iterationCount, ref results, Units);
            var t2 = TimeMe(CreateAndEnumerateGenericList, MaxValue, iterationCount, ref results, Units);

            Console.WriteLine("Collection Performance Tests:");
            Console.WriteLine("CreateAndEnumerateList \t\t{0}\t: {1} {2}", ReturnValue, t1, Units);
            Console.WriteLine("CreateAndEnumerateGenericList \t{0}\t: {1} {2}", ReturnValue, t2, Units);
            Console.WriteLine();
        }

        private static float TimeMe<TArg, TReturn>(
            Func<TArg, TReturn> me,
            TArg arg,
            long iterationCount,
            ref List<TReturn> results,
            TimerUnits units = TimerUnits.Milliseconds,
            ReturnValue returnValue = ReturnValue.AverageTime)
        {
            var timer = new Stopwatch();

            var currentIteration = 0L;
            var time = 0F;

            do
            {
                timer.Start();
                results.Add(me(arg));
                timer.Stop();
                time += (units == TimerUnits.Milliseconds) ? timer.ElapsedMilliseconds : timer.ElapsedTicks;
                currentIteration += 1;
                timer.Reset();
            }
            while (currentIteration < iterationCount);

            if (returnValue == ReturnValue.AverageTime)
            {
                time = time / iterationCount;
            }

            return time;
        }

        private static int WordCountWithFor(List<List<string>> entries)
        {
            var wordcount = 0;

            for (var i = 0; i < entries.Count; i++)
            {
                for (var j = 0; j < entries[i].Count; j++)
                {
                    wordcount += entries[i][j].Length;
                }
            }

            return wordcount;
        }

        private static int WordCountWithForDownward(List<List<string>> entries)
        {
            var wordcount = 0;
            for (var i = entries.Count - 1; i > -1; i--)
            {
                for (var j = entries[i].Count - 1; j > -1; j--)
                {
                    wordcount += entries[i][j].Length;
                }
            }

            return wordcount;
        }

        private static int WordCountWithForEach(List<List<string>> entries)
        {
            var wordcount = 0;

            foreach (var entry in entries)
            {
                foreach (var line in entry)
                {
                    wordcount += line.Length;
                }
            }
            return wordcount;
        }

        private static int WordCountWithLinq(List<List<string>> entries)
        {
            return entries.SelectMany(entry => entry).Sum(line => line.Length);
        }

        private static int CreateAndEnumerateList(int maxValue)
        {
            var list = new System.Collections.ArrayList(maxValue);

            foreach (var val in Enumerable.Range(0, maxValue))
            {
                list.Add(val);
            }

            var total = 0;

            foreach (int val in list)
            {
                total += val;
            }

            return total;
        }

        private static int CreateAndEnumerateGenericList(int maxValue)
        {
            var list = new List<int>(maxValue);

            foreach (var val in Enumerable.Range(0, maxValue))
            {
                list.Add(val);
            }

            var total = 0;

            foreach (var val in list)
            {
                total += val;
            }

            return total;
        }
    }
}
Note: in order to run this code you will need to create a data folder that contains a number of multi-line text files and modify the code to point at that folder.
 
On my workstation I get the following output when directly executing the application:
 Starting Performance Tests

Iteration Performance Tests:
Creating Test Data...
Running Tests...
Result Count: 4000
WordCountWithLinq AverageTime : 3697.65 Ticks
WordCountWithForEach AverageTime : 916.269 Ticks
WordCountWithFor AverageTime : 833.674 Ticks
WordCountWithForDownward AverageTime : 736.798 Ticks

Collection Performance Tests:
CreateAndEnumerateList AverageTime : 582.813 Ticks
CreateAndEnumerateGenericList AverageTime : 232.494 Ticks

Tests Complete. Press any key to exit.



As you can see, the decremented for loop is the fastest, and approximately 5 times faster than the LINQ implementation! It is also about 13% faster than the incremented for loop, which is significant. After showing the code and results above, I walked the audience through how I discovered why this is the case. I won't document the entire process here, but after looking at the disassembly of the optimized native code generated by the CLR x86 JIT, I discovered that in the decremented case the evaluation of the condition uses the test and jl (Jump If Less) instructions, rather than the cmp (Compare) and jae (Jump if Above or Equal) instructions, which are used in the incremented case. Obviously the former combination of instructions executes faster than the latter. The latter combination of instructions are also used in the un-optimized, i.e. Debug, version of the decremented for, which is even slower than the un-optimized foreach, which is why it is important to make sure you always performance-test the Release version of the code.

 

So why should anyone care about this?


Well if you are using LINQ in a performance-sensitive code path then you should STOP doing that immediately, and in cases that you are iterating over very large arrays with incremented for, you should try the decremented for and see what you gain. Note that the overall performance gain will vary depending on how much work you are doing in the body of the loop.


And lastly, don't be afraid to look at the disassembly of your managed code; it's right there in Visual Studio and it represents the ultimate truth of how your code is executing on the hardware.