The twelfth post of the series of programming job interview challenge is out, 31 readers provided answers to job interview challenge #11. I have to admit that I probably failed explaining what I was looking for in challenge #11, because I asked you to provide the best algorithm in both manners: performance and memory. What I really meant is that performance is most important but don’t neglect the memory issue. Due to my little “embarrassing failure”, there are two groups of correct answers – the performance oriented and the memory oriented.

The correct answer which I was looking for (best at performance) as Alex, the first one to provide a detailed solution (its two times in a row), wrote:

The most time-efficient way would be to utilize a hashtable. (Or, simply a set if your language supports it)

Let’s take the example of searching for a sum 14
in the list 2,6,4,9,1,12,7.
Iterate through the list. For each value x in the list, add it to the hashtable, then check if 14-x exists in the hashtable. If not, go to the next item. if it does, return true (you found your pair). That “add to hashtable before checking” thing is necessary in order to prevent a 14-7=7 situation.

If you reach the end of the list without finding a matching pair, return false.

This algorithm is O(n) time, but also O(n) space.

Those are the other readers who provided such a solution: Kimmen, Dejan Dimitrovski, Alessio Spadaro AS, Guido Domenici, Michael and lucas.

The first one to provide the best solution when it comes to memory consumption was Morgan Cheng:

Here is a time O(n*log(n)) and space O(1) solution:

The first step: sort the N-element list which takes O(n*log(n)) time complexity.

The second step: search two elements whose sum is M. Since the list is already sorted(smaller to the left and larger to the right), we find one number from left to right, while the other number from right to left. The index of left number is indexed by i, and the index of right number is indexed by j. For each step, we calculate delta = M – Array[i] and compare it to Array[j]. If delta equals Array[j], we can return true; if delta Array[j], we increment i and continue. When i >= j, we break and return false.

Those are the other readers who provided such a solution: Mark R, Maneesh, Christof Jans, Dan Sydner, Alessio Spadaro AS, Michael Mrozek and Michael Dikman.

Please check the bloggers algorithms:

Some readers provided code solutions with no explanation and complexity, I am really sorry but I didn’t have time to understand the complexity of your solutions. next time please explain what you did:

  1. I am not familiar with each and every programming language.
  2. I don’t have the time to investigate what is the complexity of a solution written in Linq (I think that it is O(n^2)), for example.

This Week Question:

Take a look at the following managed C++ code:

__gc class ManagedClass 
{
   public:
   int i; 
   ManagedClass () { i = 0; };
};

class UnmanagedClass
{
   public:
   void incr(int * i) 
   {   
      (*i)++;
      std::cout << *i << std::endl;
   };
};

int main() {
   ManagedClass* pMngdClass = new ManagedClass;
   UnmanagedClass* pUnmngd = new UnmanagedClass;
   pUnmngd->incr(&(pMngdClass->i));
}

What is wrong with this code and how would you fix it? There are no dirty tricks here, don’t try to catch me on a typo or something like that, you can assume that this code compiles. The correct answer is about an important issue which was neglected in the above sample code. Don’t catch me on the fact that I directly expose a member variable i, for intance. This is just a little sample code where a basic concept is violated.

Get updates by RSS to catch up with the next posts in this series and to get the correct answer for today’s question. As always you may post the solution in your blog or comment. Comments with answers will be approved only next week.

Tags :

26 Responses to “A Programming Job Interview Challenge #12 – Managed and Unmanaged”


  1. Jason Kikel

    Said on July 14, 2008 :

    UnmanagedClass is referencing an address on ManagedClass without pinning it. The ManagedClass instance needs to be pinned so the GC won’t move it to another location during a collection.

  2. Michael Russell

    Said on July 14, 2008 :

    You didn’t pin the managed class in the unmanaged class prior to working on it.

  3. Heiko Hatzfeld

    Said on July 14, 2008 :

    Hello…

    I have to “complain” about the hashtable solution… The hashtable solution is NOT O(N).

    The optimal cost for retrieving an item from a hashtable is O(log(n)). And since we do this (worst case) N Times, we still have to spend N*Log(N) operations to achieve the result same result as the sorting comparing version.

    This also ignores the cost for generating the hashtable. Just because it is one line of code, it does not mean it is one operation…

    So after all you have the “same cost” as sorting and finding, you just have an overhead of:
    – Creating a hashtable
    – O(2 N) extra Memory to store the data + hashes

    So I think to solution might “look” smarter, but you are actually doing the same here, with the only difference the the hash solution would have to do N-1 comparisons in the worst case, while the sorting one will stop after N/2 comparisons

  4. Shahar Y

    Said on July 14, 2008 :

    Hi Heiko,

    “The optimal cost for retrieving an item from a hashtable is O(log(n))” – This would be true only if the values in the list are not unique, which is not the case. In our case, it would definetly be O(1).

  5. Trinity

    Said on July 14, 2008 :

    Hi,
    I think the problem is with the line

    (*i)++;

    inside the incr function of the Unmanaged class. I assume that we are trying to increment the value held in the variable pointed to by i. But because it is a single reference it wouldnt refer to the intended memory location and hence would not increment the integer at the right memory location. The way to fix this would be to declare i, the paramter to the incr function, as a pointer to a pointer as in

    void incr(int **i)

    That would fix the problem

    Thanks
    Trinity

  6. Heiko Hatzfeld

    Said on July 14, 2008 :

    A “normal” hashtable is usually implemented as a “balanced binary tree”. So in order to retrieve an element, you would have to do log(n) comparisons. If we have unique values, we just dont have to deal with more then “1 item per bucket”…

    If you want to have a retrieval with a speed of O(1), then you would have to allocate Int.Max slots.

    Please explain to me how you would locate the
    Item “2354357” in an O(1) operation if there are 4 Billion possible items, but only 10 are located in your list…

    By creating the hashtable you efficently “sort the list”. The final step of locating the final element is just not es efficient as traversing the sorted list from both sides.

  7. Heiko Hatzfeld

    Said on July 14, 2008 :

    Okok… I messed up with my collection types. I reread the documentation and I repent my ways. Ignore my last comment… At least I have I refreshed something that I remember dimly from ages ago…

  8. Stephen Goldbaum

    Said on July 15, 2008 :

    Another issue with the hashtable, is that the context switching erases any gains in algorithmic efficiency (at least in Java). For instance, up until 10K records, both solutions run dead even. At 100K, the tight loop solution pulls ahead. At 1M it’s an order of magnitude. At 10M, it’s not even close (like 1500ms vs 23000ms on m dual CPU). I don’t have enough memory to go beyond that with the hashtable solution. So for all practical purposes, the tight loop wins in performance and memory.

  9. ZagNut

    Said on July 15, 2008 :

    The code has a case of the marshaling uglies.

    A few minor changes allows marshaling easily:

    __gc class ManagedClass
    {
    public:
    int i;
    ManagedClass () { i = 0; };
    };

    class UnmanagedClass
    {
    public:
    void incr(int &i) // all those asteriks were ugly
    {
    i++;
    std::cout << i <i; // marshal here to temp variable
    pUnmngd->incr(temp_i); // allow unmanaged code to alter data
    pMngdClass->i = temp_i; // “correct” the managed object’s member variable
    }

  10. Alex

    Said on July 15, 2008 :

    Awe man, and I was on a winning streak! Unfortunately my only exposure to c++ is a class I took back in 2003, so I don’t really have a shot at this one. The only things I notice out of hand, I’m afraid will fall into the “nitpicky” category. Those two things being- The main method isn’t wrapped in a class, so I don’t know out-of-hand if it’s managed code or not, and the fact that you’re using unmanaged code to change state in managed code. Even if that’s legal, it just feels wrong.

  11. befarino

    Said on July 15, 2008 :

    The problem with the code is that the address of a managed object member is passed to unmanaged code without pinning the memory pointer first; specifically, in this line:

    pUnmngd->incr(&(pMngdClass->i));

    the &(pMngdClass->i) yields the address of a member of a managed object.

    As the code stands, the value at the address may change as a result of the garbage collector reorganizing objects on the managed heaps, resulting in various Bad Things(tm) like access violations when the incr method of the UnmanagedClass attempts to dereference the pointer. Pinning the pointer will keep the .NET garbage collector from moving the object until the address is unpinned, preventing these Bad Things.

    Fixing it requires pinning the pointer before passing it to the incr method:

    pin_ptr p = &(pMngdClass->i);
    pUnmngd->incr( ( int * )p );

    The pin will remain in effect (and the pMngdClass remains immobile in memory) until the pin_ptr goes out-of-scope.

    Cheers,
    beefarino

  12. leppie

    Said on July 15, 2008 :

    What idiot implements a hashtable as binary tree? Obviously lookups are going to be slow. A normal hashtable’s lookup is O(1). Please go read your theory books and understand why.

  13. leppie

    Said on July 15, 2008 :

    I guess the field pointer needs to be ‘fixed’ somehow like in C#. Not sure about the C++ :)

  14. Shahar Y

    Said on July 15, 2008 :

    Lets not call other people idiots and keep the discussion on a proffesional level…

  15. Christof Jans

    Said on July 15, 2008 :

    I think the problem with challenge #12 is that the managed class instance is under control of the .net memory manager and thus it can be moved around in memory whenever the .net memory manager feels like it.
    The managed instance should probably be “pinned” to a specific location in memory before it is passed to the unmanaged method call.

  16. Michael

    Said on July 15, 2008 :

    From java.util.HashSet doc: “This class offers constant time performance for the basic operations (add, remove, contains and size)…”

  17. leppie

    Said on July 15, 2008 :

    Sorry :)

  18. Jason Kikel

    Said on July 15, 2008 :

    I think Heiko’s confusion arose from ambiguity around hashtables and dictionaries. Many people use the two terms interchangeably, but do so incorrectly. A dictionary can be implemented as a hashtable or binary tree (or anything else really that supports keyed lookup), but a hashtable refers to a very specific implementation.

  19. Hugh Brown

    Said on July 15, 2008 :

    Re: “What idiot implements a hashtable as binary tree?”

    The C++ STL dictionary is implemented as a red-black tree, if I recall correctly.

    Of course, this is an implementation of a dictionary, not a hashtable.

  20. Michael Dikman

    Said on July 15, 2008 :

    Well, i guess the value incremented in UnmanagedClass will not be saved anywhere.
    (*i) retrieves the value
    then (*i)++ increments it, but it is not saved anywhere, thus managed i will still be the same.

  21. James Curran

    Said on July 15, 2008 :

    Two comments on Alex’s solution…

    >> “That “add to hashtable before checking” thing is necessary in order to prevent a 14-7=7 situation.<> This algorithm is O(n) time, but also O(n) space. <<

    This is not exactly true. It’s either O(n) time OR O(n) space.

    To achieve O(n) time, we’d need a perfect hash function: one that would always point a value to a distinct bucket. With no pre-knowledge of the values in the list (or even the size of the list), the only way you could do this is with M buckets and hash(n) = n. This would reduced the algorithm to Shams Mahmood’s boolean table. (see last weeks’s comments). Hence O(n) time, but O(m) space.

    To achieve O(n) space, we would need a 100% fill rate. Again lacking a perfect hash function, when allowed only n buckets, some collisions will occur, requiring a linear search. Taking the extreme of the completely imperfect hash function, hash(n) = 1, every lookup would itself be O(N), making the entire task O(N*N).

    Finding the balance between the two is a bit tricky: Normally space is cheaper than time, so the first would be preferred, however, there’s a practical limit on size of n, while there isn’t on m (i.e. N={1,10000000}, m=10000001 would need an awful big boolean table).

  22. Jelle Hissink

    Said on July 16, 2008 :

    I see two problems with the code:
    1) You don’t free pUnmngd
    2) You need to pin pMMngdClass because you are using the raw memory location (GC might relocate the object)

    So your main should be:
    ManagedClass __pin * pMngdClass = new ManagedClass;
    UnmanagedClass* pUnmngd = new UnmanagedClass;
    try {
    pUnmngd->incr(&(pMngdClass->i));
    } finally {
    delete pUnmngd;
    }

  23. James Curran

    Said on July 16, 2008 :

    hmmmm… It seems that the editor here swallowed my first comment in the above post. Let’s try it again:

    [[That “add to hashtable before checking” thing is necessary in order to prevent a 14-7=7 situation.]]

    I’m not sure what problem he’s trying to avoid. The only problem I see is actually caused by this technique. If you are looking for 14, and have a 7, if you add it first, and then check for 14-7, you’ll find the 7 and think you have two. For this to work, you’d need to look first, and THEN add the number to the hash.

  24. James Curran

    Said on July 16, 2008 :

2 Trackback(s)

  1. Jul 16, 2008: Honest Illusion : Dev102's Challenge #12 : Managed & unmanaged
  2. Jul 21, 2008: A Programming Job Interview Challenge #13 - Brackets | Dev102.com

Post a Comment