Sunday, May 22, 2011

Paul the Pessimist and the Log N Realization

Big-O notation is a method of specifying the worst-case asymptotic complexity of an algorithm. An O(Log N) complexity means that the algorithm’s running time increases as the logarithm of the input size.

Paul was pessimistic about everything. He fretted that none of his letters would arrive in time and bemoaned the possibility of rain each day. However, his true obsession was worrying about how long tasks would take him. On the day he was scheduled to interview with the Bureau of Farm Animal Accounting, he left his house three hours early to account for potential delays during his 200 meter walk.

Three months into his new job, Paul found himself overwhelmed by the thought of his daily workload. His job as a file clerk consisted of retrieving files for other members of the bureau. Each morning he received a long list of files that he needed to retrieve. Even though he always finished early, his first though each morning was: "There is no way I can finish this."

Luckily for Paul, the Bureau of Farm Animal Accounting: Large Mammal Division was home to some of the kingdom's finest computational theorists. Over her ten years in the bureau, Clare O'Connell had assembled a collection of some of the greatest minds. Thus, it was during a serendipitous lunch conversation that Paul finally came to the Log N realization that would forever change his life.

"There is no way I can finish finding all of these papers," Paul complained for the fifteenth day in a row.

Sid groaned. He was tired of hearing the same pessimistic complaint every day.

"Sure you can," Sid assured Paul. "You do it every day."

"But, today's list looks bigger," responded Paul.

"How much bigger?" asked Sid. He was determined to end Paul's complaining today.

"I don't know. Maybe two more files." answered Paul.

"And how long does it take you to find a file?" inquired Sid.

"Well, usually not that long. But it COULD take a long time! Imagine if I cannot find the correct location. I could spend hours on a single file!" Paul started to panic.

"No, you couldn't." Sid laughed. "The files are all sorted. Right?"

"Yes." answered Paul. He was unsure why it mattered.

"And you use a binary search to find the files, right?" prompted Sid.

"Yes," answered Paul.

"Then you are all set. Binary search is an O(log N) algorithm." explained Sid. The matter thus resolved, Sid went back to eating his pie. It was key-lime pie today, which was Sid's third favorite flavor.

"I am not sure what that means," admitted Paul. "Couldn't it still take hours? What if they add another filing cabinet? I heard a rumor that they might merge in the files from the Small Mammal Division. They have over twice as many files as we do. Do you know how many chipmunks there are in the kingdom?"

"You are looking at it wrong," answered Sid. "Big-O notation gives the WORST case solution."

"But Log N can still be big, right?" asked Paul.

"Sure. In theory log N can grow large. But it takes an absurdly large value of N." answered Sid. "In binary search, each step chops the problem in half. If you have 1,000 files: after the first step you have limited it to 500 files, then after the second step you have 250, then after the third step 125, and so on. That means if you double the number of files to search through, you are only adding one more step. Log N algorithms are wonderful! You would be fine if they merged in the small animal division, the insect division, the fish division, AND the bird division. You still would only add a few more steps to your search."

Paul gasped at the thought of so many files. As the panic started to build inside him, he focused on what Sid had just said. As long as he continued to use a Log N algorithm, they could double the number of files several times over, and he could still get his work done on time. With that thought, life did not seem so scary anymore. He wanted to stand on the table and sing for joy.

"Wow. You are right." Paul said. "It is not that bad. Do you have any other advice?"

"Yeah. Don't let them double the size of the list." offered Sid. "Even though it is O(Log N) to find each file, you still have to search for each one. If you have M files on your list, finding them all is O(M * Log N) time."

With that one statement, the pit of dread returned to Paul's stomach.

---------------------------

If you are interested in learning more about Big-O Notation, check out the story of how Clare used the computational complexity to shift the balance of power in the wizard's war.

No comments:

Post a Comment

Note: Only a member of this blog may post a comment.