The sixth of the series of programming job interview challenge is out. Other then commenting the solution, I remind you that you can post the solution on your blog and get a link next week!
Here is the solution to the previous challenge:
Two-way merge sort (External Sorting) : The idea is breaking the big file into subfiles, sorting them and than merging them back together. In the first pass read one page at a time, the records in the page are sorted (quicksort for example) and the page is written back to disk. in subsequent passes, each pair of sorted output from the previous pass are read and merged to produce sorted subfile twice as long:
1. Read each page, Sort it, Write it back
2. while more than one sorted subfile:
while subfiles from previous pass left to merge:
- Choose next two subfiles from previous pass
- Read each subfile into a memory page (one page at a time)
- Merge the subfiles (remember each of them is already sorted), write output to a memory page
- Write output page to disk when needed.
Note that 3 pages are enough to run this algorithm. (do you see why?)
Heiko Hatzfeld provided a nice solution in his blog including c# implementation, notice you used 4 pages in the MergeSort stage, you could write one output file each time it is full.
Martin Ankerl also provided few correct and simple solutions, I liked his simple post.
From the commenters:
Kirill Sorokin, Jonathan Gilbert, Jonas Christensen, Heiko Hatzfeld, Tim C, Chris Jobson, and Sh4DoW provided a full and correct answer.
Amgel, Edward Shen, Per, Kris Warkentin answered merge sort but didn’t provide the full algorithm.
Alex, Your solution will produce the correct answer, however it is much less efficient in terms of IO access. since you don’t know if values are uniformly spread across the file you might end up with some very small temp files and some very large once, and might cause lots of reduced disk accesses.
Mark Bradshaw provided a bubble sort variation, a bit less efficient than merge sort.
Poul Foged Nielsen the simple bubble sort would be very inefficient.
And now let’s head on to this week’s question:
Look at the following Code segment written in C#:
1: ArrayList a = new ArrayList();
2: ArrayList b = new ArrayList();
3:
4: a.Add(1);
5: b.Add(1);
6: a.Add(2);
7: b.Add(2.0);
8:
9: Console.WriteLine((a[0] == b[0]));
10: Console.WriteLine((a[1] == b[1]));
What will be typed into the console? And WHY? Hopefully you will “play fair” and won’t type it in Visual Studio to see the output, you are in a job interview after all…
But the main issue of the question is the why.
The answer in now published in the 7th question in the series.
Good Luck!
Kirill Sorokin Said on Jun 2, 2008 :
I don’t know about C#, but in Java 5.0 (and beyond, where auto-boxing is enabled), the output would be “true” and then “false”.
Why? As the numeric primitives are added to the array list, they are auto-boxed to objects. So we have a = {new Integer(1), new Integer(2)} and b = {new Integer(1), new Double(2.0)}.
Then, when we compare a[0] and b[0], we’re comparing two integer objects created via autoboxing. It turns out that == will return true, as for integers < 127 auto-boxed objects are cached, so in fact both a and b contain the same object instance.
When comparing a[1] and b[1] we’re applying == to two different objects, and thus it will return false.
leppie Said on Jun 2, 2008 :
As expected both times it prints False.
Reason: == defined on type object translates to ReferenceEquals. As every value type gets boxed, the reference will never be equal.
Using Equals instead will make the 1st comparison True. The second remains False as the double is not automatically converted to an int or vice versa.
IMO, using Equals instead of == would have been a better test.
Heiko Hatzfeld Said on Jun 2, 2008 :
I think the output should be false/false.
The reason for this is the Arraylists are storing objects, and not value types. So a comparison invoked on the elements will allways be an object comparison. And since you are comparing pointer then, the results wont match.
To avoid this you could use a generic list, that holds int/double (for a/b). Since the result of the generic List will return a value type. the comparison will be correct, and the result would be what you expect.
Another option is to cast the stored object, prior to compaing it.
Also nitpicking about the comment… We are allowed to use up to 10 pages, and the question was not to implement “the most memory efficient algorithem”, but one that fits within the given bounds. And opening and closing file handels is more liklely to have a bad effect on such a constrained system
Jason Kikel Said on Jun 2, 2008 :
output:
false
false
explanation:
An ArrayList is a collection of System.Object. By default, System.Object.Equals only supports reference equality. This means ob1 == ob2 only if ob1 and ob2 both reference the same System.Object instance. In the above case, the arrays are populated with literal (value type) values which get boxed individually into different System.Objects and thus return false on equality tests.
Damian Brady Said on Jun 2, 2008 :
Hey,
Nice test! - brings together a couple of important under-the-cover behaviours that .Net developers often don’t think about.
I’ll leave my answer out of this comment so as not to ruin it for others, but see my site if you want my response.
configurator Said on Jun 2, 2008 :
false
false
The second answer is quite obvious: an object boxing an int will not equal to be an object boxing a float.
But why is the first one false?
Well, let’s see what happens here.
a[0] is an object boxing an int. That means it is a reference to some memory location holding the value 1.
b[0] is also an object boxing an int. But it’s a different object
i.e. if we did
object one = 1;
a.Add(one);
b.Add(one);
a[0] == b[0] would return true.
But we didn’t do that, we created two different object. And when we compare two objects, the comparison is (object == object) and not (int == int), causing it to be a reference comparison, returning a false answer.
Jonathan Gilbert Said on Jun 2, 2008 :
I would expect it to output “false” twice, as the ArrayList class returns System.Object references, and ‘==’ on System.Objects is a reference comparison. In this particular case, the objects are boxed ints and one boxed double, but System.Object’s operator== has no way of knowing that.
Justin Said on Jun 2, 2008 :
Just guessing here, I haven’t written C# for a couple of years but immediately I would say
false
false
since the boxing of the primitives creates a new object every time and the output statement provides no inference as to what the contained types are and therefore the == operator is comparing objects which are all different.
Ivan Said on Jun 2, 2008 :
It will be typed
false
false
because the comparison operator compares the references of boxed value types.
If we instead write:
9: Console.WriteLine((a[0].Equals(b[0])));
10: Console.WriteLine((a[1].Equals(b[1])));
the result will be:
true
false
because in that case we are using the int overrided Equals method
Alex Said on Jun 2, 2008 :
Isn’t this an awfully language-specific question?
It’s hard not to copy/paste code in… but I’m going with
——
True
False
——
even though a direct comparison of (2 == 2.0) returns true.
The “WHY” of it is that the integers are being abstracted into objects as soon as they’re added to the ArrayList. When comparing objects, obj.equals(obj2) is the way to determine if the two objects are logically equal. Using == just determines if the two variables are pulling from the same place in memory. a[0]==b[0] is going to return true, because each is referring to the constant “1″. a[1]==b[1] is going to return false, because since a[1] is an int and b[1] is a float, they can’t be referred to using the same reference address in memory. Since those don’t match up, it returns false, regardless of the actual values of a[1] and b[1].
An easy way to test this theory would be using Generics, which are usable in .NET 2.0 and forward.
—Code—
List a = new List();
List b = new List();
a.Add(1);
b.Add(1.0);
a.Add(2);
b.Add(2.0);
———-
If I’m right and it’s a matter of the arraylist abstracting the numbers into vague objects, than this should return:
—-
True
True
—-
Just checked. It did. Hooray!
Kris Warkentin Said on Jun 2, 2008 :
I’m assuming you will see something like:
true
false
The why is complex and related to the IEEE implementation of floating point numbers. I often refer people to http://www.physics.ohio-state.edu/~dws/grouplinks/floating_point_math.pdf which explains the very complex subject of representing floating point numbers in computer hardware. The short answer is that in general 2.0 != 2 but rather something like 1.9999999999 or 2.00000000001 and there isn’t a valid way to compare the floating point and integer versions.
The best understanding I had was that the decimal portion of the floating point is also represented in base 2 notation which means that not all numbers can be represented exactly in that format.
My grasp of the whole subject is pretty limited though. It has always seemed strange to me that there wouldn’t be some sort of implicit cast one way or another that would give the ‘intuitively’ correct answer. I have a feeling the answer to this question may be implementation dependent too. The C# aspect may be important. I’m not sure if C will do it one way, the CLR or a JVM a different.
Looking forward to see if any replies have a better explanation.
cheers,
Kris
Justin - Web Development Blog Said on Jun 2, 2008 :
My gut instinct on this is the output will be “TrueFalse”.
Without testing this I dont’t know for sure but because 2 is an integer and 2.0 is a float or deciaml my guess is that even though they are both stored as objects in the ArrayList they are still different when compared.
Justin - Web Development Blog Said on Jun 2, 2008 :
Well, I am wrong. It is FalseFalse. I ran a test through VS. So because they are stored as objects each item in each ArrayList is different until it is cast to a specific type.
The only way array list items would eb the same without casting is like a.Add(1); b.Add(a[0]);
A comparision of a[0] == b[0] would be true because a[0] was copied into b[0].
Thanks for the great questions. This is making me look harder into the details.
Tobi Said on Jun 2, 2008 :
The output:
False
False
This is the case because the reference are not the same (that’s what matters, because a[x] and b[x] return object).
The “expected” result would require:
Console.WriteLine((a[0].Equals(b[0])));
Console.WriteLine((a[1].Equals(b[1])));
And would result in:
True
False
Jon von Gillern Said on Jun 2, 2008 :
Both will be false because both all of the value types are boxed when they’re placed inside the array list.
Jonas Christensen Said on Jun 2, 2008 :
An Arraylist holds objects so you check if it points to the same object and not the content of the objects.
Ryan Emerle Said on Jun 2, 2008 :
The output would be:
false
false
The reason is:
The values passed to the Add method are boxed into objects and, by default, the operator == tests for reference equality which will only return true if the two variables are pointing to the same object.
Damian Said on Jun 2, 2008 :
False
False
Reason: the Add methods takes an object, a reference type and the the numbers are value types, so they are boxed (converted to reference types) before being passed into the Add method. As they are differenct objects, the equality operator will return false.
Shahriar Said on Jun 2, 2008 :
Hi Dear
please look at this code:
ArrayList a = new ArrayList();
ArrayList b = new ArrayList();
a.Add(1);
b.Add(1);
a.Add(2);
b.Add(2.0);
Console.WriteLine(string.Format(”The Type of a[1] => {0}”, a[1].GetType()));
Console.WriteLine(string.Format(”The Type of b[1] => {0}”,b[1].GetType()));
Console.WriteLine(string.Format(”a[0]==b[0] => {0}”,a[0]==b[0]));
Console.WriteLine(string.Format(”a[1]==b[1] => {0}”, a[1] == b[1]));
Console.WriteLine(string.Format(”(int)a[0]==(int)b[0] => {0}”, (int)a[0] == (int)b[0]));
Console.WriteLine(string.Format(”(int)a[1]==(double)b[1] => {0}”, (int)a[1]==(double)b[1]));
Console.WriteLine(string.Format(”a[0].Equals(b[0]) => {0}”, a[0].Equals(b[0])));
Console.WriteLine(string.Format(”a[1].Equals(b[1]) => {0}”, a[1].Equals(b[1])));
I feeling .net framework just when you try to compare two decimal by Equals Method compare correctly , such as:
object v = 12.0;
Console.WriteLine(double.Equals(v,12.0));
equal to:
decimal v=12.0;
Console.WriteLine(double.Equals(v,12.0));
Jamie Penney Said on Jun 2, 2008 :
false
false
Autoboxing occurs when you put the numbers into the ArrayList. Since a[x] and b[x] return Objects, they are compared using reference equality(==), and since the numbers are autoboxed they are different objects. Thus both of the comparasons are false as all 4 numbers are different objects in memory.
Alex Shenoy Said on Jun 3, 2008 :
If I am correct, the response in the console will be
True
False
The first will check the value of 1==1. However 2 == 2.0 is not true b/c 2 is an integer and 2.0 is a float.
ecco Said on Jun 3, 2008 :
false in both cases
why:
is it comparing 2 addresses rather then values which compare method offers when using collections?
Rob R Said on Jun 3, 2008 :
Firstly, unless you include System.Collections that won’t even compile. But regardless, an ArrayList stores Objects. You are comparing POINTERS with a equality operator. Both will be false.
For a more clever question, what would the following code return (and why):
1: ArrayList a = new ArrayList();
2: ArrayList b = new ArrayList();
3:
4: a.Add(1);
5: b.Add(1);
6: a.Add(2);
7: b.Add(2.0);
8:
9: Console.WriteLine((a[0].Equals(b[0])));
10: Console.WriteLine((a[1].Equals(b[1])));
Mark Bradshaw Said on Jun 3, 2008 :
I’m not entirely sure. From the hip I’d guess that both will equate to true, assuming that the second equality comparison will end up converting the float into an int.
fred Said on Jun 3, 2008 :
Both answers would be “false”.
Reason: autoboxing
Dan Said on Jun 3, 2008 :
true and true, you used the “==” operator which does not test variable type. If it was a[1] === b[1], it would return false because of the type difference (a[1] is int, b[1] is float (or double).
AdamC Said on Jun 3, 2008 :
All false because identity != equality, we’re dealing with ref types here (implicit boxing).
Rickyah Said on Jun 3, 2008 :
In this entry I try to answer this’s week challenge:
http://crazypointer.blogspot.com/2008/06/programming-job-interview-challenge-6.html
My blog is in spanish, but the answers to the challenges are written in
english.
Best Regards!
sood Said on Jun 3, 2008 :
I tried the similar kindda code in java. I have to say that I used eclipse to cross check my answer before submitting, and one of my answer was incorrect(earlier
), but I was thinking on the right lines..
I am not sure about how c++ works but considering that it is also an OO language the working should be similar to java.
So when you create two new array lists and add 1 to them and compare the result is true. I guess this is because as arraylists take up only objects hence when primitive type 1 is added it is automattically boxed to an integer object before adding it to the list. ans reference to the same object is used when adding to the second list hence the first expression evaluates to true. But if u use new Integer(1) while adding to the list the expression will fail.
Regarding the second expression m definite that it should be false. But all I am worried about it is how the == operator is making a comparison b/w an int object and a double object. Does it type cast one to another before making the comparison.
Also one more thing that i noticed is how the .toArray method works for a list having objects of diff types….
I hope I have not written too much of idioti stuff bcoz I am self taught JAVA person, so no wonder if my understanding of certain concepts is incorrect
kuldip saini Said on Jun 3, 2008 :
It will be
false
false
while adding element to arralist will cause boxing which will result in different references for every value type.
D.L Said on Jun 3, 2008 :
false
false
because ArrayList is a ref type( located in the heap)
the comparison is made by comparing two different places in memory (whice be in both cases false).
Jelle Hissink Said on Jun 3, 2008 :
It will return:
False
False
Because the comparison will be preformed on the object (boxed values) and thus it will compare the references of those.
Regards,
Jelle
AJ Said on Jun 3, 2008 :
true
false
as arraylist will store 2 as int whereas 2.0 as float, hope i am thinking in right direction.
Xerxes Said on Jun 3, 2008 :
Without cheating - they will both return false, because the ArrayList boxes all items to the loose “object” type.
Upon reference to a[0] and b[0], there is no type conversion to compare ints, so the object comparison returns false because they are different object references. Same goes for a[1] and b[1].
Greg Said on Jun 3, 2008 :
The results would be false false.
The reason is an array list is an object array. Therefore, the value types would be boxed and added to the array. Each object in the array will have it’s own pointer; therefore, == will be evaluating the reference not the value.
Chris Said on Jun 4, 2008 :
ArrayList?! Who still uses ArrayLists? That’s so .NET 1.1!
NMasao Said on Jun 5, 2008 :
from VB view
line 9 output - true ( both are integers)
line 10 output - false ( integer double)
pb Said on Jun 7, 2008 :
Arraylist takes objects so comparison will be reference comparison, so two falses