January 06, 2018

Remove Duplicate Elements From An Array

23:35 Posted by Durga swaroop Perla , , , No comments

Interviews are a great place to learn about your strengths and weaknesses, which makes them a great way to improve oneself. In one of my interviews, I was asked to Remove duplicate elements from an array. So, given the array a as below, I've to produce b.

a = {1, -2, 3, 1, 0, 9, 5, 6, 4, 5, 3, 1, 0}
b = {1, -2, 3, 0, 9, 5, 6, 4}

Here b has the same order of elements as a but per the problem statement, It is not necessary to do that.

I was flustered for a bit after getting the question. It took me a while to get to a proper solution, not before getting my first solution rejected for using HashMap which apparently, I was not supposed to. I am attributing this mainly to the fact that I was told to write Java code on a piece of paper and not an IDE. Anyway, I came home after that and decided to try it out and find what other's have done online. That is what this article is about.

Array of Donuts

Since that particular interview was in Java, It is only fair that I use Java for the solution here, although I really wanted to do it in Python. Maybe some other time.

Approaches for solving the problem:

Approach #1

The most naive approach would be to just look through the entire array and compare each element to every other element to see if there's a duplicate. Of course, this is useless as its time complexity is O(n^2). So, Let's skip this one and go to the next one.

Approach #2

Another approach is using a HashMap to keep track of elements. This is what I've tried initially but was rejected because I've used HashMap when I wasn't supposed to. The pseudo code would be:

map = new Map // Create map
new_arrray = []

for number in numbers_array
  if not map.contains(number)
  map += number
  new_array += number

print(new_array)

Of course, Since I wrote my implementation of this in Java, I had to make a few modifications to this as you need to first define the size of the array and only then can you add elements to it. So, I've added a count variable to count unique elements and then created a new array after the iteration with that size. This would require two loop iterations, but it is still O(n) which is fine. But Alas I couldn't use this.

And so, then comes my final approach.

Approach #3

The third solution is to first sort the array and then from the sorted array, remove duplicates. We can do this because the problem didn't want us to maintain the given input order. Otherwise, we wouldn't have been able to sort the array.

Sorting is easy enough. We just use the built-in sort method, which will sort the array in place.

Arrays.sort(numbers);

Then comes the major part which is removing the duplicates in that sorted array. We can accomplish that by using two pointers i, j on our array. i goes through the entire loop while j is the slow-moving pointer that only changes based on a condition.

 int j = 0; // Slow moving index

// i is the fast moving index that loops through the entire array
for (int i = 1; i < numbers.length; i++) {
    if (numbers[i] != numbers[j]) {
        j++;
        numbers[j] = numbers[i];
    }
}

The j index is basically playing catch up with i. When there is a duplicate element, i moves ahead while j stays back at the first duplicate element and then with numbers[j] = numbers[i], we assign the next unique value to the j location. After this, our original array has unique elements till index j but after that, we'll have leftover elements. To take care of that, we can create a new array from the numbers array.

int[] result = Arrays.copyOf(numbers, j + 1);
System.out.println(Arrays.toString(result));

And that's it. This will remove all the duplicated elements from the array. To test it, let me run the code:

Input array: [1, -2, 3, 1, 0, 9, 5, 6, 4, 5]
Final result after removing duplicates: [-2, 0, 1, 3, 4, 5, 6, 9]

The sorting of the array can be assumed to be done in O(nlogn). And then the iteration after that is O(n). Put together you still technically get O(n), which is the same as the previous case. Of course depending on a more specific kind of array, the sort might take less time as well. O(n) for the average case is what you finally get.

The full code is present as a gist:

Let me know if you have any more questions that need answers. That is all for this article.


For more programming articles, checkout Freblogg Freblogg/Java


This is the 15th article as part of my twitter challenge #30DaysOfBlogging. Fifteen more articles on various topics, including but not limited to, Java, Git, Vim, Software Development, Python, to come.

If you are interested in this, make sure to follow me on Twitter @durgaswaroop. While you're at it, Go ahead and subscribe here on medium and my other blog as well.


If you are interested in contributing to any open source projects and haven't found the right project or if you were unsure on how to begin, I would like to suggest my own project, Delorean which is a Distributed Version control system, built from scratch in scala. You can contribute not only in the form of code but also with usage documentation and also by identifying any bugs in its functionality.


Thanks for reading. See you again in the next article.


Image of Donuts: http://echmondprojects.com/wp-content/uploads/2016/04/two-arrays-700x525.png

0 comments:

Post a Comment

Please Enter your comment here......