Today I had an enlightening conversation. I had just finished my portion of the grading from at least 200 exams. My two sections combined had a total of about 50 exams and were now completely graded and in a stack by themselves. Searching through a randomly sorted stack of 50 exams to find the 20 or so that belong to the people who will show up to my first section is very inefficient. And efficiency is always important. So I of course decided that I should sort the stack of exams alphabetically.

I started with the big unsorted stack of exams, call it stack , and sorted it using what I assume is the standardly used procedure by the unenlightened. It goes as follows. I began by moving the exam on the top of stack to a separate pile, call it stack . I then took the second exam from stack and moved it to stack and placed it either on top of or underneath the first exam so that those two are now in alphabetical order. Then I did the same thing with the third exam. Each time I moved an exam from to and placed it in stack so that was still sorted alphabetically.

In the beginning, this process was working well. I felt like I very quickly sorted the first ten or so exams. However as time went on, my rate-of-sorting decreased. And so I naturally started to doubt whether I was using the best sorting algorithm possible.

I did remember hearing once that the best sorting algorithm runs on the order of time, where is the size of the input (more on that later). But as I was sitting there wondering what the best algorithm could be, I realized that I was not just in any old room, I was in a math graduate student office! Where knowledge (and beer after 3pm on Fridays) flow like wine!

So I proposed the question to the office: What is the best approach and how much slower was I operating as a result of my silliness to not first sit down and carefully choose the algorithm for which I was going to use to sort graded exams?! Luckily, Gautam Wilkins spoke up! He told me that my algorithm was asymptotically working at speed. And, indeed, algorithms exist and are provably the best. (You can actually prove that there does not exist a better algorithm!) He shared a few of them with me.

Now, before I continue, in case there are any computer scientists or complexity theorists in the room, I should mention one technicality. I am insisting on using a comparison-based sorting algorithm. What this means is that a priori I am not assuming that I know the name of anybody in my class (which is pretty accurate). The reason is, if I knew everyone’s name, say, I could ahead of time order them (or look at the grade sheet) and assign them each a number based on where their exam’s final position in the sorted stack will be. If I did I could instead take over half of the floor space in our office and organize it in a logical way so that for each number there is a space on the floor corresponding to that number. And then all I have to do is when I find an exam, I place it in its assigned spot.

Now, this procedure takes a tiny bit longer than a computer takes when it’s putting stuff in a pre-specified place in memory, but o well. The point is that this procedure requires essentially no comparisons at all! When I come to an exam in my stack, I never have to compare it to the ones that I have already placed on the floor, since either way that exam is going to the same spot. When they’re all placed, I can quickly just pick them up in order since they’re already placed in the order I want them in. So it’s conceivable that I could come up with a clever way to take advantage of this type of comparison-less procedure (like something similar to RadixSort). But we’re throwing this possibility out because then I would first have to learn my students names, and only Rob does that.

Ok, now let’s go back to what we were discussing two paragraphs back. But before I tell you the Gautam’s algorithm, let’s talk briefly about what it means for something to run at or speed.

In our case, is the number of exams. It means that in order for me to sort the exams, I have to perform about (up to constant factors) number of operations, each of which takes roughly the same amount of time to execute. Now, since the numbers are small enough, it is worthwhile to estimate what the constant factor is. It’s on average a little less that one half. So it will take roughly operations.

But what exactly is an “operation”? Well, for instance, one operation is to move the first exam from stack to stack . Another is to move the second exam to stack . And another to compare the second exam to the first to determine whether it should go before or after it. Then we must move the third exam to stack , compare it with the first, and if it comes after it alphabetically, then it must be then compared with the second. And so on.

Well, it turns out that I was needing to perform on the order of operations to sort my exams! Who has time for that?!

So is there a better way? Yes! There are several. Two that I am not going to talk about now but may in a follow-up post are QuickSort and HeapSort. Today I’m going to tell you about one called MergeSort. This is an algorithm which can be implemented on the exam-sorting problem as follows.

- Take all the exams and divide them into piles of size about 7-9. You don’t have to waste time counting to get it exact. Once you get a feel for how many are in a stack, just roughly divide them up into about that-sized stacks.
- Sort each of these smaller piles alphabetically using the algorithm that I used before.
- Pair up these piles (if you have an odd number, just leave one pile by itself).
- Merge the two piles in each pair by looking at the exam on top of each stack and picking the exam that comes first alphabetically. Place this exam face down in a new pile. Then look at the two exams which are now on top (one will be the same as last time, one will be different) and pick the one that comes first alphabetically between those two. Place this exam facedown on the top of the first selected one. And repeat until you have merged those two piles. Do the same thing for each pair.
- If there is only one pile at this point, we’re done! If not, then we have a new collection of sorted piles which need mergin’. Go to step 3.

Below are two examples of this in action. This first one is painfully slow.

This next one is pretty cool, although maybe not as lucid.

This algorithm requires on average steps to complete. So when as in my case, I would have needed to perform roughly operations. Wow! That’s only 16% of the number from the last algorithm!

Note: some of the 1,250 operations could likely get skipped, different ones take different amounts of time to execute, and of course there are issues with how quickly the running time converges to these asymptotic functions, but glossing over all that, I will just state that Gautam reports that the net gain in practice is around 50% savings when sorting 50ish exams.

And the difference will be even more noticeable the more exams I grade, or the longer I TA. Therefore, as my years in grad school approaches infinity, the time savings will also go to infinity! And pretty quickly too!

So to all you TAs out there: work smarter, not harder. And pick faster-running algorithms to do the menial tasks involved with grad school. Then you will have more time for the important stuff, like looking forward to that beer on Fridays.

I’ve often pondered this problem myself, having TA’d for many years before arriving in SD, and I’m not sure that mergesort is the best. You have to take into account that for human TAs, not all operations are created equal. Comparisons are /much/ cheaper to perform than a “place” (exam in this slot) operation. I do feel that insertion sort is actually the best method.

You should be using quicksort. Better constants, and significantly faster (but still the same time complexity) when you pick a good pivot. This is something computers are bad at, but humans are good at.

But again, once you get down to 6-10, just do it by glancing. Again, this is something humans do quickly that computers can’t do.

http://en.wikipedia.org/wiki/Quicksort

I must admit that I have not tested these all myself. Gautam claims a big increase using mergesort over insertion sort, though. Especially when dealing with around 50 exams. I did realize that I was sweeping under the rug how the operations take different amounts of time, sometimes significantly different. But it seemed fine assuming Gautam’s final time-savings in practice are accurate.

He did also mention Quicksort as another which takes about as long. I’m not sure if he has tried it in practice. Is that what you use, Andy? I was also thinking that that one seems like one where a human can have a significant advantage over a computer. By not picking stupid pivots. Maybe I will try that one too. It seems like it would be easier to mess up with quicksort, though.

Jay! I used this algorithm to sort my exams today. Awesome.

Today I mentioned your post to my officemate, who pointed me to an old blog post of his: http://rosenblog.wordpress.com/2009/07/18/visualizing-sorting-algorithms/

There’s a pretty picture there.

Nice! And just today I saw another way to visualize the Quick-Sort sorting algorithm: