Lately I’ve been trying to challenge myself by taking a stab at some common coding questions asked in software engineering job interviews. After years of working on apps you would think that this kind of thing comes easy, but software development in the modern age comes with a heavy side of Googling and lookup, so challenging your brain in this way to come up with an algorithm or process from scratch without that assistance is good practice to keep you sharp.
One of the first problems I attempted was the relatively easy:
How do you find the missing number in a given integer array of 1 to 100?
I’ll present all code in this post as easytoread Ruby
code, and I will also avoid all language shortcuts to make the method more clear.
Because we are told in the problem that the array is from 1 through 100 inclusive, I am not going to consider the edge case that a 1 or a 100 is missing in any of these solutions.
If you wanted to check for that, you would simply need to specify the beginning and end range values along with the array, and check those two values separately.
When working on these kind of problems, I generally try to identify the naive approach first. Here, the naive solution would be to iterate over every element from the beginning, until you found the missing one.
1 2 3 4 5 6 7 8 9 10 

Simply put, this code looks at each element in the array. If the number is not equal to the previous value + 1, then the missing number must be the previous + 1. To test its accuracy, I run the program through a method I wrote that tests every possible array for the right answer.
1 2 3 4 5 6 7 8 9 10 11 

Running the naive approach through the test shows that it works for all values.
1 2 

This is O(N) but not very elegant, and I think we can do better. Because the numbers are 1 through 100, we can use a special trick that Sir Isaac Newton thought of as a school boy. He was able to determine that the sum of all numbers from 1 through n was equal to (n+1)(n)/1. This works because we can line up all numbers in a row, and if we add each number to its counterpart in a reversed row, they always equal n+1. So in the array from 1100:
1 2 

The 1
plus the 100
equals 101
, the 2
plus the 99
equals 101
, and so on, n
times. So we have (n+1) * n
when we add these all up,
but then we have to divide by two because we added the array onto itself. (n+1)(n)/2
Implementing this algorithm requires us to sum the array. We still have to go through every element of the array so we don’t get any speed improvements over O(N), however the code for this approach is much simpler:
1 2 3 4 5 6 7 

Running the newton approach through the test shows that it works for all values.
1 2 

This is O(N) and a little more elegant, but I think we can still do better. I’m often reminded that when dealing with finding elements amongst a group, it is beneficial to see if you can eliminate an entire section of the group in one go. I thought of a third approach to this problem where we would divide the array in half, then check the first element of the second half. If the number is what it should be, then we know the missing number must be somewhere after this number, in the second array. We can safely eliminate the entire first half and call the method recursively. If the number is not what we would expect, then we can eliminate the entire second half, and call the method recursively. Here is what the basic logic looks like:
The first thing we need is a stop condition. Due to the unknown array item count, we can get a situation where a recursive call only has 2 or 1 items. If we see an array of 2, we can guarantee that it is an array of two numbers missing one between them, or that there is no missing number at all. For example:
1 2 3 

If we see an array of 1 number, it means that there was a missing number at the end of the first array. This is only because we are testing on the first value of the second array, and it sets up the possibility for this to occur. When we see an array of length 1, we will simply add 1 to its element to find the missing number.
1 2 

There is one more stop condition, and it checks the last number of the first half against the first number of the last half. It catches the edge case where the missing number would fall in between the halves. This check falls after we do our array split.
1 2 

After those three checks, it’s just the test on the first value of the second half being what we think it should be, the target
. The target
is defined as the entire array’s last number,
minus the first number, divided by 2.0
and rounding down, then adding back in the arrays first number to shift the midpoint up accordingly. All together, it looks like this:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 

Running this halving approach through the test shows that it also works for all values.
1 2 

I could probably make it more elegant, and something about the three different stop conditions tells me that perhaps an offbyone error is in here and it could be refactored down to two stop conditions, but the rough concept is here. It runs in far fewer iterations that the previous algorithms, somewhere between O(1) and O(N), though the array slicing may be negating those speed benefits. To check, I modified the test code to include more iterations, and time tracking:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 

Then I ran the three algorithms through again:
1 2 3 4 5 6 

They are too close to be able to tell which is actually the fastest, so I will increase the high end of the array range by a factor of 1000 and run again.
1 2 3 4 5 6 

Ok, now we are seeing some interesting time benefits of the halving approach with large arrays. But what about numerous iterations on small arrays? I modified the testing code again to run the 1100 array through ten thousand times:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 

1 2 3 4 5 6 

So just as we thought, the halving approach works fastest for both the scenario of determining the missing value in a very large array, and finding the missing value in a small array over numerous iterations.
These are the three approaches I thought of to solve this question, can you think of any more? Leave a comment and let me know!