Matt Coneybeare

MC

Coding Challenge: Implementing MergeSort in Ruby

| Comments

Lately I’ve been trying to challenge myself by taking a stab at some common coding questions asked in software engineering job interviews. After years of working on apps you would think that this kind of thing comes easy, but software development in the modern age comes with a heavy side of Googling and lookup, so challenging your brain in this way to come up with or code an algorithm or process from scratch without that assistance is good practice to keep you sharp.

Previously, I had come up with three solutions of finding a missing number in array of 1 through 100, and today, after reintroducing myself to the algorithm, I am going to implement a O(N) search method: MergeSort.

I’ll present all code in this post as easy-to-read Ruby code, and I will also try to avoid some of the obscure language shortcuts to make the method more clear.

In order to implement an algorithm in any language, you need to understand it, so here is where I needed to lookup how MergeSort works:

Merge Sort is a Divide and Conquer algorithm. It divides input array in two halves, calls itself for the two halves and then merges the two sorted halves. GeeksforGeeks

Sounds simple enough. Looking through the many different implementations in other languages around the web, it can get trickier if you are trying to do everything in constant memory, or if you are working within a language which makes you define array lengths beforehand. I’m not going to worry about that here. I’m only going to implement the logic based on the algorithm.

First, let’s define a recursive method to call and setup a simple test to see if it is working:

1
2
3
4
5
6
7
def merge_sort(array)
  array
end

unsorted_array = (1..10).to_a.shuffle
sorted = merge_sort(unsorted_array)
puts "SORTED CORRECTLY: #{sorted == unsorted_array.sort} | #{unsorted_array} => #{sorted}"

Obviously, this will fail. Running it produces our failure output:

1
SORTED CORRECTLY: false | [3, 9, 5, 2, 1, 6, 7, 10, 8, 4] => [3, 9, 5, 2, 1, 6, 7, 10, 8, 4]

Now that we have a basic test for success, we can begin, starting with “It divides input array in two halves, calls itself for the two halves”. While this is the majority of the algorithm’s description above, it’s actually the uninteresting part. After determining the half size of the array, I’ll use take and drop to split them, them call a new (unwritten) method to merge them correctly.

1
2
3
4
5
6
def merge_sort(array)
  half = array.size / 2
  array1 = array.take(half)
  array2 = array.drop(half)
  merge merge_sort(array1), merge_sort(array2)
end

But, because this is a recursive function, we need a stop condition. merge_sort returns an array of sorted values, and it doesn’t really need to concern itself with how the order is, so the only stop condition is when we have an array that is too small to sort. This works because as we keep dividing and dividing the array in to subarrays and subarrays, we will eventually get to a place where merge_sort is asked to sort an array of length 1. When this happens, we just return the already sorted array. Otherwise, we return the merged array as above. Adding this to the method:

1
2
3
4
5
6
7
8
9
10
def merge_sort(array)
  if array.size > 1
    half = array.size / 2
    array1 = array.take(half)
    array2 = array.drop(half)
    array = merge merge_sort(array1), merge_sort(array2)
  end

  array
end

Defining our merge function with some basic (incorrect) code, we can run the test again.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
def merge_sort(array)
  if array.size > 1
    half = array.size / 2
    array1 = array.take(half)
    array2 = array.drop(half)
    array = merge merge_sort(array1), merge_sort(array2)
  end

  array
end

def merge(array1, array2)
  array1 + array2
end

1
SORTED CORRECTLY: false | [4, 1, 6, 8, 7, 2, 10, 5, 9, 3] => [4, 1, 6, 8, 7, 2, 10, 5, 9, 3]

There is no change, as expected. What we have done so far is recursively broken this array up into 10 different subarrays of length 1, then combined them all through simple concatenation, leaving us with the original sequence. Let’s get down to the interesting part of the written algorithm above: “and then merges the two sorted halves”.

To merge these arrays, we need to realize that they are already sorted. This is the contract we have with the merge_sort algorithm. We are only expected to merge two already sorted arrays in this method. At the base case, we would be asked to merge two arrays of count 1 (eg: [3], [7]), but it will eventually have to merge sorted arrays of size array.size / 2 (eg: [1, 3, 4, 8, 9], [2, 5, 6, 7, 10]). Because these are already sorted, we only need to compare the left-most values of each array with one another, and choose the lowest one to put into a new array. We repeat until we have placed all items from both arrays. The pseudocode looks something like this

1
2
3
4
5
6
7
8
While there are elements in array1 and array2 to compare
  - compare the first elements, choosing the lowest one
  - move the lowest element to the sorted array
If there are any extra elements in array1
  - add array1 extras to the sorted array
Else if there are any extra elements in array2
  - add array2 extras to the sorted array
Return the sorted array

After the while block, we can be sure that either array1 or array2 is empty, so we can just add the rest on top, in order, as they are all larger than the last element in the array. The real code looks like this:

1
2
3
4
5
6
7
8
9
10
11
12
13
def merge(array1, array2)
  merged = []
  while array1.any? && array2.any?
    if array1.first < array2.first
      merged.push array1.shift
    else
      merged.push array2.shift
    end
  end

  merged += array1
  merged += array2
end

This works because push adds an element to the end of an array, shift pops it off the front, and += will return the original array when passed an empty one. Putting everything together we have:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
def merge_sort(array)
  if array.size > 1
    half = array.size / 2
    array1 = array.take(half)
    array2 = array.drop(half)
    array = merge merge_sort(array1), merge_sort(array2)
  end

  array
end

def merge(array1, array2)
  merged = []
  while array1.any? && array2.any?
    if array1.first < array2.first
      merged.push array1.shift
    else
      merged.push array2.shift
    end
  end

  merged += array1
  merged += array2
end

And finally running our test, we see success:

1
SORTED CORRECTLY: true | [8, 2, 5, 4, 10, 7, 3, 6, 1, 9] => [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

And to make sure it works for larger arrays, let’s try one with 1000 numbers

1
2
3
unsorted_array = (1..1000).to_a.shuffle
sorted = merge_sort(unsorted_array)
puts "SORTED CORRECTLY: #{sorted == unsorted_array.sort}\n\n#{unsorted_array}\n\n#{sorted}"

1
2
3
4
5
SORTED CORRECTLY: true

[109, 626, 70, 142, 467, 879, 658, 653, 863, 392, 60, 196, 710, 669, 28, 529, 77, 596, 344, 714, 607, 401, 558, 107, 115, 211, 14, 37, 135, 963, 647, 799, 3, 928, 25, 310, 643, 640, 605, 87, 374, 521, 384, 884, 660, 398, 623, 486, 268, 666, 302, 17, 751, 264, 347, 543, 5, 478, 788, 150, 971, 200, 967, 103, 7, 172, 537, 616, 907, 64, 948, 319, 331, 448, 501, 316, 73, 531, 613, 872, 578, 96, 360, 970, 373, 112, 121, 50, 659, 679, 363, 705, 195, 990, 908, 939, 549, 481, 125, 1000, 913, 729, 849, 477, 322, 90, 621, 413, 85, 738, 697, 106, 815, 409, 210, 119, 251, 298, 293, 252, 307, 313, 122, 734, 728, 827, 33, 763, 946, 560, 883, 369, 74, 816, 737, 139, 66, 361, 662, 323, 691, 65, 415, 881, 962, 294, 657, 683, 311, 459, 686, 720, 466, 773, 539, 231, 246, 793, 346, 742, 338, 424, 490, 348, 267, 278, 770, 889, 367, 456, 843, 632, 595, 186, 775, 452, 6, 740, 698, 275, 209, 52, 566, 396, 407, 138, 998, 158, 927, 84, 667, 492, 789, 704, 834, 796, 600, 180, 568, 914, 13, 104, 53, 726, 283, 541, 865, 991, 886, 235, 394, 891, 321, 491, 260, 725, 205, 62, 434, 226, 694, 161, 571, 750, 11, 536, 861, 735, 441, 67, 592, 890, 353, 615, 366, 744, 842, 291, 202, 111, 58, 455, 781, 853, 128, 151, 300, 603, 787, 116, 695, 870, 159, 608, 674, 988, 239, 436, 230, 420, 612, 341, 522, 336, 649, 427, 535, 279, 706, 453, 858, 249, 335, 574, 474, 256, 430, 920, 622, 892, 334, 746, 243, 411, 644, 854, 818, 342, 576, 162, 538, 665, 577, 223, 646, 554, 954, 997, 555, 194, 748, 821, 601, 94, 606, 402, 232, 932, 237, 597, 684, 127, 807, 617, 76, 899, 777, 637, 510, 810, 709, 476, 80, 749, 652, 255, 968, 921, 447, 147, 918, 257, 634, 561, 406, 915, 56, 638, 438, 585, 627, 897, 282, 774, 979, 2, 512, 428, 419, 184, 337, 755, 556, 153, 926, 814, 758, 722, 609, 178, 446, 213, 958, 547, 961, 259, 425, 739, 328, 992, 463, 61, 766, 59, 145, 82, 598, 759, 326, 404, 389, 977, 896, 762, 753, 228, 551, 656, 123, 878, 403, 903, 945, 285, 93, 672, 841, 885, 628, 91, 387, 949, 572, 513, 214, 318, 250, 306, 19, 175, 880, 164, 49, 557, 931, 570, 15, 98, 31, 247, 197, 937, 839, 835, 857, 846, 383, 219, 813, 177, 299, 573, 450, 241, 947, 509, 39, 518, 745, 938, 221, 707, 590, 289, 220, 224, 10, 464, 290, 81, 277, 309, 664, 370, 315, 183, 69, 901, 132, 437, 292, 29, 524, 527, 431, 494, 423, 866, 254, 305, 520, 414, 352, 855, 54, 983, 593, 680, 179, 488, 610, 233, 747, 953, 712, 130, 330, 72, 429, 553, 386, 859, 418, 57, 756, 981, 809, 468, 171, 630, 206, 692, 862, 131, 972, 20, 663, 764, 18, 765, 191, 795, 140, 633, 154, 586, 88, 930, 960, 9, 911, 143, 78, 391, 845, 362, 685, 296, 333, 349, 189, 412, 71, 157, 769, 625, 715, 281, 822, 288, 650, 378, 540, 511, 358, 544, 985, 95, 89, 530, 144, 587, 43, 811, 701, 569, 812, 820, 482, 727, 888, 936, 454, 218, 185, 375, 562, 100, 504, 397, 973, 27, 929, 417, 922, 287, 421, 980, 498, 803, 258, 951, 689, 760, 783, 461, 266, 489, 134, 648, 26, 108, 176, 564, 584, 42, 167, 847, 690, 731, 602, 356, 594, 708, 943, 676, 465, 860, 525, 933, 496, 192, 804, 274, 129, 611, 642, 876, 935, 790, 984, 301, 893, 542, 681, 207, 110, 332, 772, 432, 523, 286, 390, 550, 75, 808, 717, 502, 141, 86, 817, 711, 385, 838, 364, 40, 395, 458, 462, 371, 784, 898, 146, 261, 996, 993, 552, 895, 439, 900, 956, 982, 479, 563, 327, 969, 528, 519, 500, 339, 312, 801, 797, 36, 442, 204, 475, 234, 965, 187, 837, 379, 906, 55, 942, 38, 469, 639, 589, 443, 382, 575, 869, 68, 877, 182, 515, 22, 833, 380, 909, 215, 825, 212, 824, 867, 23, 583, 229, 156, 41, 588, 517, 137, 700, 955, 217, 99, 190, 678, 444, 351, 917, 63, 661, 126, 368, 133, 295, 966, 830, 579, 645, 393, 682, 460, 916, 45, 655, 721, 696, 919, 280, 102, 388, 240, 994, 238, 636, 791, 829, 181, 51, 340, 343, 169, 248, 671, 308, 303, 718, 850, 472, 856, 35, 120, 779, 778, 499, 874, 581, 832, 868, 851, 408, 668, 242, 48, 792, 950, 875, 304, 236, 449, 514, 47, 924, 836, 635, 263, 516, 591, 487, 445, 619, 654, 199, 687, 699, 471, 272, 317, 910, 32, 716, 262, 173, 276, 245, 271, 599, 733, 974, 297, 894, 848, 174, 244, 160, 604, 433, 629, 101, 904, 320, 905, 618, 506, 46, 495, 761, 887, 534, 227, 732, 508, 188, 355, 693, 723, 873, 124, 117, 565, 752, 826, 457, 426, 582, 381, 83, 934, 485, 329, 314, 959, 941, 405, 620, 828, 757, 113, 12, 882, 533, 780, 800, 92, 118, 359, 473, 480, 376, 1, 507, 923, 203, 768, 776, 493, 503, 198, 786, 978, 730, 995, 743, 284, 222, 422, 105, 34, 163, 798, 864, 794, 4, 253, 410, 365, 771, 114, 703, 741, 844, 225, 273, 614, 470, 416, 345, 483, 24, 136, 548, 532, 736, 97, 673, 806, 193, 713, 852, 823, 440, 435, 871, 651, 79, 148, 940, 719, 702, 152, 149, 44, 155, 559, 269, 670, 675, 677, 631, 957, 975, 925, 325, 819, 324, 265, 580, 688, 840, 952, 21, 902, 545, 451, 999, 567, 201, 802, 754, 16, 399, 782, 270, 641, 166, 165, 785, 497, 30, 168, 805, 624, 484, 546, 987, 964, 944, 400, 989, 170, 208, 526, 372, 767, 354, 986, 357, 724, 216, 912, 976, 831, 505, 8, 377, 350]

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255, 256, 257, 258, 259, 260, 261, 262, 263, 264, 265, 266, 267, 268, 269, 270, 271, 272, 273, 274, 275, 276, 277, 278, 279, 280, 281, 282, 283, 284, 285, 286, 287, 288, 289, 290, 291, 292, 293, 294, 295, 296, 297, 298, 299, 300, 301, 302, 303, 304, 305, 306, 307, 308, 309, 310, 311, 312, 313, 314, 315, 316, 317, 318, 319, 320, 321, 322, 323, 324, 325, 326, 327, 328, 329, 330, 331, 332, 333, 334, 335, 336, 337, 338, 339, 340, 341, 342, 343, 344, 345, 346, 347, 348, 349, 350, 351, 352, 353, 354, 355, 356, 357, 358, 359, 360, 361, 362, 363, 364, 365, 366, 367, 368, 369, 370, 371, 372, 373, 374, 375, 376, 377, 378, 379, 380, 381, 382, 383, 384, 385, 386, 387, 388, 389, 390, 391, 392, 393, 394, 395, 396, 397, 398, 399, 400, 401, 402, 403, 404, 405, 406, 407, 408, 409, 410, 411, 412, 413, 414, 415, 416, 417, 418, 419, 420, 421, 422, 423, 424, 425, 426, 427, 428, 429, 430, 431, 432, 433, 434, 435, 436, 437, 438, 439, 440, 441, 442, 443, 444, 445, 446, 447, 448, 449, 450, 451, 452, 453, 454, 455, 456, 457, 458, 459, 460, 461, 462, 463, 464, 465, 466, 467, 468, 469, 470, 471, 472, 473, 474, 475, 476, 477, 478, 479, 480, 481, 482, 483, 484, 485, 486, 487, 488, 489, 490, 491, 492, 493, 494, 495, 496, 497, 498, 499, 500, 501, 502, 503, 504, 505, 506, 507, 508, 509, 510, 511, 512, 513, 514, 515, 516, 517, 518, 519, 520, 521, 522, 523, 524, 525, 526, 527, 528, 529, 530, 531, 532, 533, 534, 535, 536, 537, 538, 539, 540, 541, 542, 543, 544, 545, 546, 547, 548, 549, 550, 551, 552, 553, 554, 555, 556, 557, 558, 559, 560, 561, 562, 563, 564, 565, 566, 567, 568, 569, 570, 571, 572, 573, 574, 575, 576, 577, 578, 579, 580, 581, 582, 583, 584, 585, 586, 587, 588, 589, 590, 591, 592, 593, 594, 595, 596, 597, 598, 599, 600, 601, 602, 603, 604, 605, 606, 607, 608, 609, 610, 611, 612, 613, 614, 615, 616, 617, 618, 619, 620, 621, 622, 623, 624, 625, 626, 627, 628, 629, 630, 631, 632, 633, 634, 635, 636, 637, 638, 639, 640, 641, 642, 643, 644, 645, 646, 647, 648, 649, 650, 651, 652, 653, 654, 655, 656, 657, 658, 659, 660, 661, 662, 663, 664, 665, 666, 667, 668, 669, 670, 671, 672, 673, 674, 675, 676, 677, 678, 679, 680, 681, 682, 683, 684, 685, 686, 687, 688, 689, 690, 691, 692, 693, 694, 695, 696, 697, 698, 699, 700, 701, 702, 703, 704, 705, 706, 707, 708, 709, 710, 711, 712, 713, 714, 715, 716, 717, 718, 719, 720, 721, 722, 723, 724, 725, 726, 727, 728, 729, 730, 731, 732, 733, 734, 735, 736, 737, 738, 739, 740, 741, 742, 743, 744, 745, 746, 747, 748, 749, 750, 751, 752, 753, 754, 755, 756, 757, 758, 759, 760, 761, 762, 763, 764, 765, 766, 767, 768, 769, 770, 771, 772, 773, 774, 775, 776, 777, 778, 779, 780, 781, 782, 783, 784, 785, 786, 787, 788, 789, 790, 791, 792, 793, 794, 795, 796, 797, 798, 799, 800, 801, 802, 803, 804, 805, 806, 807, 808, 809, 810, 811, 812, 813, 814, 815, 816, 817, 818, 819, 820, 821, 822, 823, 824, 825, 826, 827, 828, 829, 830, 831, 832, 833, 834, 835, 836, 837, 838, 839, 840, 841, 842, 843, 844, 845, 846, 847, 848, 849, 850, 851, 852, 853, 854, 855, 856, 857, 858, 859, 860, 861, 862, 863, 864, 865, 866, 867, 868, 869, 870, 871, 872, 873, 874, 875, 876, 877, 878, 879, 880, 881, 882, 883, 884, 885, 886, 887, 888, 889, 890, 891, 892, 893, 894, 895, 896, 897, 898, 899, 900, 901, 902, 903, 904, 905, 906, 907, 908, 909, 910, 911, 912, 913, 914, 915, 916, 917, 918, 919, 920, 921, 922, 923, 924, 925, 926, 927, 928, 929, 930, 931, 932, 933, 934, 935, 936, 937, 938, 939, 940, 941, 942, 943, 944, 945, 946, 947, 948, 949, 950, 951, 952, 953, 954, 955, 956, 957, 958, 959, 960, 961, 962, 963, 964, 965, 966, 967, 968, 969, 970, 971, 972, 973, 974, 975, 976, 977, 978, 979, 980, 981, 982, 983, 984, 985, 986, 987, 988, 989, 990, 991, 992, 993, 994, 995, 996, 997, 998, 999, 1000]

There are some improvements that could be made to increase efficiency, such as not creating new arrays but rather manipulate original ones, but I’ll leave that for another day.

Comments

My name is Matt Coneybeare, I design and develop for iOS (iPhone, iPad and iPod Touch), Mac OS X and the Web out of New York. In 2008 I started a software company called Urban Apps that has made some pretty popular apps such as Ambiance and Hourly News. My current Stack Overflow reputation is about 27k.

I was a Rockstar a decade ago, but then went back to school and collected a Bachelor's Degree in Computer Science from U.C. Berkeley. Now I am settled down with my beautiful wife Di and our dog Hamachi. When not at my desk, I love exploring New York City as a Yelp Elite, or training for marathons.

Contact information

Name
Matt Coneybeare
Email
Website
Twitter
Instagram
GitHub
LinkedIn