Coding Challenge: Find Missing Number in Array of 1 Through 100

7 minute read

Lately I've been trying to challenge myself by taking a stab at some common coding questions asked in software engineering job interviews. After years of working on apps you would think that this kind of thing comes easy, but software development in the modern age comes with a heavy side of Googling and lookup, so challenging your brain in this way to come up with an algorithm or process from scratch without that assistance is good practice to keep you sharp.

One of the first problems I attempted was the relatively easy:

How do you find the missing number in a given integer array of 1 to 100?

I'll present all code in this post as easy-to-read Ruby code, and I will also avoid all language shortcuts to make the method more clear. Because we are told in the problem that the array is from 1 through 100 inclusive, I am not going to consider the edge case that a 1 or a 100 is missing in any of these solutions. If you wanted to check for that, you would simply need to specify the beginning and end range values along with the array, and check those two values separately.

When working on these kind of problems, I generally try to identify the naive approach first. Here, the naive solution would be to iterate over every element from the beginning, until you found the missing one.

def missing_number_naive(arr)
  prev = arr.first - 1
  for i in arr
    if i != prev + 1
      return prev + 1
    else
      prev = i
    end
  end
end

Simply put, this code looks at each element in the array. If the number is not equal to the previous value + 1, then the missing number must be the previous + 1. To test its accuracy, I run the program through a method I wrote that tests every possible array for the right answer.

def check_missing_number_algorithm(method_name, range_start, range_finish)
  failed = []

  for i in (range_start+1..range_finish-1).to_a
    a = (range_start..range_finish).to_a - [i]
    res = send(method_name, a)
    failed << i if i !=res
  end
  failed.empty? ? true : failed
end

Running the naive approach through the test shows that it works for all values.

>> check_missing_number_algorithm :missing_number_naive, 1, 100
=> true

This is O(N) but not very elegant, and I think we can do better. Because the numbers are 1 through 100, we can use a special trick that Sir Isaac Newton thought of as a school boy. He was able to determine that the sum of all numbers from 1 through n was equal to (n+1)(n)/1. This works because we can line up all numbers in a row, and if we add each number to its counterpart in a reversed row, they always equal n+1. So in the array from 1-100:

1 2 3 4 5 6....
100 99 98 97 96 95...

The 1 plus the 100 equals 101, the 2 plus the 99 equals 101, and so on, n times. So we have (n+1) * n when we add these all up, but then we have to divide by two because we added the array onto itself. (n+1)(n)/2

Implementing this algorithm requires us to sum the array. We still have to go through every element of the array so we don't get any speed improvements over O(N), however the code for this approach is much simpler:

def missing_number_newton(arr)
  n = arr.last
  correct_sum = (n+1)*n/2
  array_sum = 0
  arr.each {|i| array_sum = array_sum + i }
  correct_sum - array_sum
end

Running the newton approach through the test shows that it works for all values.

>> check_missing_number_algorithm :missing_number_newton, 1, 100
=> true

This is O(N) and a little more elegant, but I think we can still do better. I'm often reminded that when dealing with finding elements amongst a group, it is beneficial to see if you can eliminate an entire section of the group in one go. I thought of a third approach to this problem where we would divide the array in half, then check the first element of the second half. If the number is what it should be, then we know the missing number must be somewhere after this number, in the second array. We can safely eliminate the entire first half and call the method recursively. If the number is not what we would expect, then we can eliminate the entire second half, and call the method recursively. Here is what the basic logic looks like:

The first thing we need is a stop condition. Due to the unknown array item count, we can get a situation where a recursive call only has 2 or 1 items. If we see an array of 2, we can guarantee that it is an array of two numbers missing one between them, or that there is no missing number at all. For example:

[2, 4] => Missing 3
[66, 68] => Missing 67
[13, 14] => Nothing missing

If we see an array of 1 number, it means that there was a missing number at the end of the first array. This is only because we are testing on the first value of the second array, and it sets up the possibility for this to occur. When we see an array of length 1, we will simply add 1 to its element to find the missing number.

[3] => Missing 4
[22] => Missing 23

There is one more stop condition, and it checks the last number of the first half against the first number of the last half. It catches the edge case where the missing number would fall in between the halves. This check falls after we do our array split.

...12,13] vs [15,16... => Missing 14
...44,45] vs [47,48... => Missing 46

After those three checks, it's just the test on the first value of the second half being what we think it should be, the target. The target is defined as the entire array's last number, minus the first number, divided by 2.0 and rounding down, then adding back in the arrays first number to shift the midpoint up accordingly. All together, it looks like this:

def missing_number(arr)

  if arr.length == 2
    if arr.last - arr.first == 1
      return "NO MISSING NUMBERS"
    else
      return (arr.first + arr.last) / 2
    end
  elsif arr.length == 1
    return arr.first + 1
  else

    a1 = arr.slice(0, arr.length/2)
    a2 = arr.slice(arr.length/2, arr.length)

    target = ((arr.last - arr.first)/2.0).floor + arr.first

    if a2.first - a1.last != 1
      return a2.first - 1
    elsif a2.first == target
      missing_number(a2)
    else
      missing_number(a1)
    end
  end
end

Running this halving approach through the test shows that it also works for all values.

>> check_missing_number_algorithm :missing_number, 1, 100
=> true

I could probably make it more elegant, and something about the three different stop conditions tells me that perhaps an off-by-one error is in here and it could be refactored down to two stop conditions, but the rough concept is here. It runs in far fewer iterations that the previous algorithms, somewhere between O(1) and O(N), though the array slicing may be negating those speed benefits. To check, I modified the test code to include more iterations, and time tracking:

def check_missing_number_algorithm(method_name, range_start, range_finish)
  failed = []

  time_start = Time.now

  for i in (range_start+1..range_finish-1).to_a
    a = (range_start..range_finish).to_a - [i]
    res = send(method_name, a)
    failed << i if i !=res
  end
  time_finish=Time.now time_diff=time_finish - time_start failed.empty? ? "PASSED: #{time_diff} sec" : failed
end

Then I ran the three algorithms through again:

>> check_missing_number_algorithm :missing_number_naive, 1, 100
=> "PASSED: 0.003274 sec"
>> check_missing_number_algorithm :missing_number_newton, 1, 100
=> "PASSED: 0.0039 sec"
>> check_missing_number_algorithm :missing_number, 1, 100
=> "PASSED: 0.003382 sec"

They are too close to be able to tell which is actually the fastest, so I will increase the high end of the array range by a factor of 1000 and run again.

>> check_missing_number_algorithm :missing_number_naive, 1, 100000
=> "PASSED: 886.426262 sec"
>> check_missing_number_algorithm :missing_number_newton, 1, 100000
=> "PASSED: 997.462695 sec"
>> check_missing_number_algorithm :missing_number, 1, 100000
=> "PASSED: 598.027811 sec"

Ok, now we are seeing some interesting time benefits of the halving approach with large arrays. But what about numerous iterations on small arrays? I modified the testing code again to run the 1-100 array through ten thousand times:

def check_missing_number_algorithm(method_name, range_start, range_finish)
  failed = []

  time_start = Time.now

  10000.times do
    for i in (range_start+1..range_finish-1).to_a
      a = (range_start..range_finish).to_a - [i]
      res = send(method_name, a)
      failed << i if i !=res
    end
  end

  time_finish=Time.now time_diff=time_finish - time_start failed.empty? ? "PASSED 10000 TIMES: #{time_diff} sec" : failed
end
>> check_missing_number_algorithm :missing_number_naive, 1, 100
=> "PASSED 10000 TIMES: 14.399464 sec"
>> check_missing_number_algorithm :missing_number_newton, 1, 100
=> "PASSED 10000 TIMES: 12.69414 sec"
>> check_missing_number_algorithm :missing_number, 1, 100
=> "PASSED 10000 TIMES: 10.742814 sec"

So just as we thought, the halving approach works fastest for both the scenario of determining the missing value in a very large array, and finding the missing value in a small array over numerous iterations.

These are the three approaches I thought of to solve this question, can you think of any more? Leave a comment and let me know!

Was this page helpful for you? Buy me a slice of 🍕 to say thanks!

Comments