all articles

leetcode war 4 - Median of Two Sorted Arrays

2016-09-19 @sunderls

leetCode

following solutions are all created by myself, if I didn't tackle it and got solution from editorial solution, it will be made clear at the heading.

leetcode war :Median of Two Sorted Arrays

https://leetcode.com/problems/median-of-two-sorted-arrays/

There are two sorted arrays nums1 and nums2 of size m and n respectively.
Find the median of the two sorted arrays. The overall run time complexity should be O(log (m+n)).
Example 1:

nums1 = [1, 3]
nums2 = [2]

The median is 2.0
Example 2:

nums1 = [1, 2]
nums2 = [3, 4]

The median is (2 + 3)/2 = 2.5

analysis

The primitive solution is very obvious, merge two arrays into one array and get the median instantly.

Since the input arrays are already sorted, merging costs us O(m + n), and getting median is O(1), so total complexity is O(m + n), but this does not meet with the requirements of O(log(m+n)).

The goal complexity is O(log(m + n)), so the right algorithm should be something like binary search. Let's firstly get the median of one array, M1, now if we merge the other array into this array, then result median should shift left or shift right or stay the same. Two let the median stay the same, the condition is very obvious - which is the median of second array, M2, should equal to M1, which means the second array has the same account of both numbers below M1, and M2. So we can conclude:

1 . if M1 == M2, the median is the same, M1

  1. if M1 < M2, so second array as the same account of numbers divided by M2, it must have more numbers over M1 than numbers under M1, since M2 > M1. and the real median should be M1<=M3<=M2.
  2. if M1 > M2, the same analysis with above (2).

This gives us an algorithm which cuts off half of both arrays, so the complexity should be O(log(m + n)) (right?). The code should recursive, something similar to binary search I think.

solution: binary range search? I don't know how to call this

while I made a lot of mistakes doing above methods, tried and tried again, and finally get it accepted at 175ms, some gotchas are :

  1. empty arrays.
  2. above assumptions are working great if the arrays at all phases have odd count of numbers, but for the ending condition, if there are event count of numbers left, you cannot cut the head anymore, since the median you got is a combination of these two numbers. for example: [1,6], [3,4,5], if we do our reducing, [6], [3,4] will lead to wrong answer. So we can only reduce to minimum 2 elements for array. If this condition is reached, we have to merge the left arrays and get the median, is there any other good ways?
  3. we put the shortest array in the first place, this cut off extra comparison
  4. cannot calculate the complexity....suppose m < n, should it be O(log m) + O(n)?

var getMedian = function(nums, start, end){
    if (start > end || nums.length === 0){
        return null;
    }
    var total = start + end;
    if (total % 2 === 0){
        return nums[total / 2];
    } else {
        return (nums[(total - 1) / 2] + nums[(total + 1) / 2]) / 2;
    }
};

var recursiveFindMedian = function(nums1, start1, end1, nums2, start2, end2){
    // 1. if two medians are the same, then it ends
    // 2. if not the same, we cut the head of array with smaller median, and cut off tail(same account with previous head) of the other array.
    // 3. do it recursively
    // at then end, if one of them has max 2 element, merge them.

    var m1 = getMedian(nums1, start1, end1);
    var m2 = getMedian(nums2, start2, end2);
    if (m1 === null){
        return m2;
    }

    if (m2 === null){
        return m1;
    }
    if (m1 === m2){
        return m1;
    }

    // merge list
    if (end1 - start1 + 1 <= 2){
        var finalList = [];
        var i = start1;
        var j = start2;
        while (i <= end1 || j <= end2){
            if (i > end1){
                finalList.push(nums2[j]);
                j++;
            } else if (j > end2){
                finalList.push(nums1[i]);
                i++;
            } else if (nums1[i] <= nums2[j]){
                finalList.push(nums1[i]);
                i++;                    
            } else {
                finalList.push(nums2[j]);
                j++;                    
            }
        }

        return getMedian(finalList, 0, finalList.length - 1);
    }
    // if median is average of the two middel numbers
    // we cannot cut so deep
    var accountToCut = (end1 - start1 + 1) % 2 === 0 ? Math.floor((end1 - start1 + 1) / 2 - 1) : Math.floor((end1 - start1 + 1) / 2);
    // we cut the head of nums1 & tail of nums2
    if (m1 < m2){
        return recursiveFindMedian(nums1, start1 + accountToCut, end1, nums2, start2, end2 - accountToCut);
    } else {
        return recursiveFindMedian(nums1, start1, end1 - accountToCut, nums2, start2 + accountToCut, end2);
    }
}

/**
 * @param {number[]} nums1
 * @param {number[]} nums2
 * @return {number}
 */

var findMedianSortedArrays = function(nums1, nums2) {
    if (nums2.length >= nums1.length){
        return recursiveFindMedian(nums1, 0, nums1.length - 1, nums2, 0, nums2.length - 1);
    } else {
        return recursiveFindMedian(nums2, 0, nums2.length - 1, nums1, 0, nums1.length - 1)
    }
};

improvements

Leetcode doesn't offer editorial solution , so I searched on the internet and found other similar solutions like this: http://www.geeksforgeeks.org/median-of-two-sorted-arrays-of-different-sizes/.

Appeared that the special case handling is necessary, while the last-step list merging is improvable, which I shall explain:

suppose arrays at last step, is something like [a1,a2], [b1,b2,b3,b4,b5], we don't need to merge the arrays, the change array 1 can give to array 2, is only shift the median left or right. so, there are following cases:

  • array1: [a1,a2] , array2: [b1,b2,b3....m2.....bn]
    • if n % 2 === 0, m2 is [bi, bi+1]
      • if a1 <= a2 <= bi: m = mean(max(a2,bi-1), bi);
      • if a1 <= bi <= a2 <= bi+1: m = mean(a2, bi);
      • if a1 <= bi <= bi+1 <= a2: m = m2;
      • if bi <= a1 <= a2 <= bi+1: m = m1;
      • if bi <= a1 <= bi+1 < a2: m = mean(a1, bi+1);
      • if bi <= bi+1 <= a1 <= a2: m = mean(bi+1, min(a1, bi+2))
    • if n% 2 === 1, m2 is bi
      • if a2 <= bi: m = max(a2,bi-1)
      • if a1 <= bi <= a2: m = m2;
      • if a1 > bi: m = min(a1, bi+1);
  • array1: [a1], array2: [b1,b2,b3...m2...bn]
    • if n % 2 === 0, m2 is [bi, bi+1]
      • if bi <= a1 <= bi+1 : m = m1
      • if a1 <= bi: m = bi
      • if a1 > bi: m = bi+1;
    • if n%2 === 1, m2 is bi
      • if a1 < bi: m = mean(max(a1,bi-1),bi);
      • if a1 > bi: m = mean(min(a1,bi+1),bi);

we can see that this is improved from O(n-m+2) to O(1), right? After typo & typo, following accepted answer is generated,


var getMedian = function(nums, start, end){
    if (start > end || nums.length === 0){
        return null;
    }
    var total = start + end;
    if (total % 2 === 0){
        return nums[total / 2];
    } else {
        return (nums[(total - 1) / 2] + nums[(total + 1) / 2]) / 2;
    }
};

var recursiveFindMedian = function(nums1, start1, end1, nums2, start2, end2){
    // 1. if two medians are the same, then it ends
    // 2. if not the same, we cut the head of array with smaller median, and cut off tail(same account with previous head) of the other array.
    // 3. do it recursively
    // at then end, if one of them has max 2 element, merge them.

    var m1 = getMedian(nums1, start1, end1);
    var m2 = getMedian(nums2, start2, end2);
    if (m1 === null){
        return m2;
    }

    if (m2 === null){
        return m1;
    }
    if (m1 === m2){
        return m1;
    }

    // if for last 2 elements
    if (end1 - start1 + 1 <= 2){
        var medstart2 = Math.floor((start2 + end2) / 2);
        var isArray2Even =  (end2 -  start2 + 1) % 2 === 0;
        var medend2 = isArray2Even ? medstart2 + 1 : medstart2;
        if (end1 - start1 == 0){
            if (isArray2Even){
                if (m1 <= nums2[medstart2]){
                    return nums2[medstart2];
                } else if (m1 <= nums2[medend2]){
                    return m1;
                } else {
                    return nums2[medend2];
                }
            } else {
                if (m1 <= m2){
                    return (m2 + (medstart2 > 0 ? Math.max(m1, nums2[medstart2 - 1]) : m1)) / 2;
                } else {
                    return (m2 + (medend2 < nums2.length - 1 ? Math.min(m1, nums2[medend2 + 1]) : m1)) / 2;
                }
            }
        } else {
            if (isArray2Even){
                if (nums1[end1] <= nums2[medstart2]){
                    return (nums2[medstart2] + (medstart2 > 0 ? Math.max(nums1[end1], nums2[medstart2 - 1]) : nums1[end1])) / 2;
                } else if (nums1[end1] <= nums2[medend2]){
                    if (nums1[start1] <= nums2[medstart2]){
                        return (nums1[end1] + nums2[medstart2]) / 2;
                    } else {
                        return m1;
                    }

                } else {
                    if (nums1[start1] <= nums2[medstart2]){
                        return m2;
                    } else if (nums1[start1] <= nums2[medend2]){
                        return (nums1[start1] + nums2[medend2]) / 2;
                    } else {
                        return (nums2[medend2] + (medend2 < nums2.length - 1 ? Math.min(nums1[start1], nums2[medend2 + 1]) : nums1[start1]))/ 2;
                    }
                }
            } else {
                if (m2 <= nums1[start1]){
                    return medend2 < nums2.length - 1 ? Math.min(nums1[start1], nums2[medend2+1]) : nums1[start1];
                } else if (m2 > nums1[end1]){
                    return medstart2 > 0 ? Math.max(nums1[end1], nums2[medstart2 - 1]) : nums1[end1];
                } else {
                    return m2;
                }
            }
        }
    }
    // if median is average of the two middel numbers
    // we cannot cut so deep
    var accountToCut = (end1 - start1 + 1) % 2 === 0 ? Math.floor((end1 - start1 + 1) / 2 - 1) : Math.floor((end1 - start1 + 1) / 2);
    // we cut the head of nums1 & tail of nums2
    if (m1 < m2){
        return recursiveFindMedian(nums1, start1 + accountToCut, end1, nums2, start2, end2 - accountToCut);
    } else {
        return recursiveFindMedian(nums1, start1, end1 - accountToCut, nums2, start2 + accountToCut, end2);
    }
}

var findMedianSortedArrays = function(nums1, nums2) {
    if (nums2.length >= nums1.length){
        return recursiveFindMedian(nums1, 0, nums1.length - 1, nums2, 0, nums2.length - 1);
    } else {
        return recursiveFindMedian(nums2, 0, nums2.length - 1, nums1, 0, nums1.length - 1)
    }
};

Another Better Answer (2016.09.25)

After 4 days of cold battling, on the peaceful Sunday night, I happened to find a much much better answer for this problem: http://www.jiuzhang.com/solutions/median-of-two-sorted-arrays/. My mind is so narrowed, spending lot of time handling odd & event numbers cases, in fact this problem can be much more simply understood. Solution is similar, but we can generalize problem to finding k-th number of merged arrays, which will be much more simpler. The assumption is the same:

if we are to find the k-th number, we compare the 2/k-th number from the two arrays, cut off the head of the smaller one and iterate, until there k is to be 1, or one of the arrays are all cut off.

/**
 * @param {number[]} nums1
 * @param {number[]} nums2
 * @return {number}
 */

// k start from 1
var findKthNumber = function(nums1, start1, nums2, start2, k){
    // if one array can be safely abandoned
    if (start1 > nums1.length - 1){
        return nums2[start2 + k - 1];
    }

    if (start2 > nums2.length - 1){
        return nums1[start1 + k - 1];
    }
    // if we need get the last one
    if (k === 1){
        return Math.min(nums1[start1], nums2[start2]);
    }

    /* we compare the k/2 th number
     * if one of the array is not enough, 
     * e.g. [1,100,101], [1000], find the 4th. event we add all
     * numbers from short one to the other, it doesn't count up to k,
     * so we can safely cut the head.
     */
    var half = Math.floor(k / 2);
    var halfValue1 = start1 + half - 1 < nums1.length ? nums1[start1 + half - 1] : Number.MAX_VALUE;
    var halfValue2 = start2 + half - 1 < nums2.length ? nums2[start2 + half - 1] : Number.MAX_VALUE;

    if (halfValue1 < halfValue2){
        return findKthNumber(nums1, start1 + half, nums2, start2, k - half);
    } else {
        return findKthNumber(nums1, start1, nums2, start2 + half, k - half);
    }

}
var findMedianSortedArrays = function(nums1, nums2) {
    var total = nums1.length + nums2.length;
    if (total % 2 === 0){
        return (findKthNumber(nums1, 0, nums2, 0, total / 2) + findKthNumber(nums1, 0, nums2, 0, total / 2 + 1)) / 2;
    } else {
        return findKthNumber(nums1, 0, nums2, 0, (total + 1) / 2);
    }
};