Factors Affecting Code Performance
12 January 2021
“Premature optimisation is the root of all evil.” Whilst that famous quote from Donald Knuth obviously has some truth to it, thinking about performance early on, when designing and writing code, can be a very good thing too.
Here are some small things software engineers could think about when writing software that might yield performance benefits for little or no work. It is not a complete list, of course, and it does not consider databases, web browser DOMs, GPUs, etc. It just considers basic, everyday code, with C# as the particular focus.
Programming languages typically have multiple numeric variable types. In C#, there are int, long, float, double and decimal, plus the unsigned and smaller integer types. The int type is 32-bit and therefore has a smaller range than the 64-bit long type. If you know you will not need that larger range, i.e. you will be using numbers less than about 2 billion in magnitude, then using a long is just wasteful. It will take longer to process, longer to read or write from memory and longer to read or write to disk. Such gains, however, are small fry. The biggest differences are between the floating-point types. The decimal type gives 128-bit floating-point numbers based on powers of 10 and therefore has advantages over the base-2, 64-bit double type, e.g. in financial calculations. A double cannot store even £0.01 precisely and might instead end up as £0.01000000000000000021, which is very close, but if you then round up to the nearest penny due to some rule, you end up with £0.02 and look pretty silly. The base-2 types are native to the chip architecture, so typically execute hundreds of times faster than the non-native base-10 one. This can be a free way of making a huge improvement in code performance: don’t use decimal without good reason.
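To make the rounding pitfall concrete, here is a small illustrative snippet (mine, not from any particular codebase) showing the same effect with 0.1 + 0.2:

```csharp
using System;

class MoneyRounding
{
    static void Main()
    {
        double d = 0.1 + 0.2;     // base-2 floating point
        decimal m = 0.1m + 0.2m;  // base-10 floating point

        Console.WriteLine(d.ToString("G17")); // 0.30000000000000004
        Console.WriteLine(m);                 // 0.3

        // "Round up to the nearest penny" amplifies the tiny base-2 error:
        Console.WriteLine(Math.Ceiling(d * 100) / 100); // 0.31 -- a penny too much
        Console.WriteLine(Math.Ceiling(m * 100) / 100); // 0.3
    }
}
```

The decimal arithmetic gets the answer a human expects, but on typical hardware it runs far slower than the double arithmetic, so the choice is a trade between base-10 exactness and raw speed.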
CPUs are about 1000 times faster than they were 30 years ago, but RAM is only about 30 times faster, so RAM needs to be treated with a little respect at times. To help hide this issue, chips typically have multiple levels of memory cache built into them. For example, I am typing this on a laptop with 9MB of level 3 cache. This is the largest but slowest of the caches on the chip and is the last stand before a call has to be made to the much slower main memory. If your application’s active memory usage sits nicely within your cache, it will perform significantly faster.
The graph below shows the result of a little experiment, which involved increasing the size of a dataset from a small fraction of the L3 (level 3) cache up to many times its size. If the time taken were proportional to the size of the dataset, i.e. if the time taken per element were fixed, then this graph would be a horizontal line. But as you can see, there is a notable increase in time per element, which begins once the data size approaches the cache size.
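If you want to try this yourself, a sketch of the kind of benchmark involved is below (details are my own assumptions; note that a simple sequential scan is prefetch-friendly, so a random access pattern shows the cliff even more sharply):

```csharp
using System;
using System.Diagnostics;

class CacheSizeEffect
{
    static void Main()
    {
        // Walk arrays from well below a typical L3 cache size (a few MB)
        // to well above it, and report the time per element at each size.
        for (int size = 1 << 16; size <= 1 << 25; size <<= 1)
        {
            var data = new double[size];
            for (int i = 0; i < size; i++) data[i] = i;

            const int passes = 20;
            double sum = 0;
            var sw = Stopwatch.StartNew();
            for (int p = 0; p < passes; p++)
                for (int i = 0; i < size; i++)
                    sum += data[i];
            sw.Stop();

            double nsPerElement =
                sw.Elapsed.TotalSeconds * 1e9 / ((double)size * passes);
            Console.WriteLine(
                $"{size * 8 / 1024,8} KB  {nsPerElement:F2} ns/element  (sum {sum:E2})");
        }
    }
}
```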
Firstly, the chart makes an important point about assessing code performance: don’t test on a small dataset and expect the time taken to scale proportionally on a larger one. Secondly, just don’t be wasteful with memory. For example, if you create a large array of items to be processed later in multiple steps, when actually you could process each item in one go as you create it, then removing that intermediate array might yield a huge performance benefit, as sketched below.
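A hypothetical illustration (the names and workload are invented for the example):

```csharp
using System;

static class StreamingVsBuffering
{
    // Wasteful: build a big intermediate array, then traverse it again.
    static double SumOfSquaresBuffered(int n)
    {
        var squares = new double[n];               // n * 8 bytes of memory traffic
        for (int i = 0; i < n; i++) squares[i] = (double)i * i;
        double sum = 0;
        for (int i = 0; i < n; i++) sum += squares[i];
        return sum;
    }

    // Frugal: consume each value as it is produced; the working set
    // is a couple of registers rather than n * 8 bytes.
    static double SumOfSquaresStreamed(int n)
    {
        double sum = 0;
        for (int i = 0; i < n; i++) sum += (double)i * i;
        return sum;
    }

    static void Main()
    {
        Console.WriteLine(SumOfSquaresBuffered(10_000_000));
        Console.WriteLine(SumOfSquaresStreamed(10_000_000));
    }
}
```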
You can get huge zero-effort gains if you keep memory locality in mind. For example, if looping over a two-dimensional array, you need to choose whether the inner loop runs over the first or the second index.
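The code sample from the original article is not reproduced here, so the following is a reconstruction of the kind of loop being discussed:

```csharp
using System;

const int N = 4096;
var a = new double[N, N];

// Fast: the inner loop varies j, the last index. C# stores double[,]
// row-major, so consecutive iterations touch adjacent addresses.
double fast = 0;
for (int i = 0; i < N; i++)
    for (int j = 0; j < N; j++)
        fast += a[i, j];

// Slow: the inner loop varies i, so each iteration jumps
// N * 8 bytes through memory and keeps missing the cache.
double slow = 0;
for (int j = 0; j < N; j++)
    for (int i = 0; i < N; i++)
        slow += a[i, j];

Console.WriteLine($"{fast} {slow}");
```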
In the above C# example, changing j alters the memory address less than changing i. Having the inner loop change j rather than i is therefore many times faster for large arrays. It is also often faster to use a true multi-dimensional array, e.g. double[,], rather than an array-of-arrays, e.g. double[][], at least for large arrays and for certain operations. An array-of-arrays is a set of separate arrays, which is not as easy to handle as the single block of memory used for the true multi-dimensional array. For certain operations, however, e.g. swapping rows, an array-of-arrays is much faster, and much easier to code, as the sketch below shows.
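```csharp
// With an array-of-arrays, swapping two rows swaps two references;
// no element data moves at all.
double[][] jagged = new double[1000][];
for (int r = 0; r < 1000; r++) jagged[r] = new double[1000];

(jagged[0], jagged[1]) = (jagged[1], jagged[0]); // O(1)

// With a true double[1000, 1000], the same swap would have to copy
// all 1000 elements of each row: O(columns) work instead of O(1).
```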
LINQ is a useful toolbox in C# and can make code easier to write, more standard and easier to read. It can, however, yield some surprisingly slow results. A key point is that LINQ does not understand your data, but you do.
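A small, invented illustration of the kind of trap this can cause:

```csharp
using System;
using System.Linq;

class LinqCosts
{
    static void Main()
    {
        double[] values = Enumerable.Range(0, 1_000_000)
                                    .Select(i => (double)((i * 37) % 1000))
                                    .ToArray();

        // Easy to write, but sorts a million elements just to read one:
        double min1 = values.OrderBy(v => v).First();   // O(n log n)

        // You know you only want the smallest value:
        double min2 = values.Min();                     // one O(n) pass

        Console.WriteLine($"{min1} {min2}");
    }
}
```

Both lines are LINQ; the difference is that the second states exactly what you want, so only a single pass over the data is needed.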
Here are some extra pointers: