Hi, I’m Carrie Anne, and welcome to CrashCourse

Computer Science! Over the past two episodes, we got our first

taste of programming in a high-level language, like Python or Java. We talked about different types of programming

language statements – like assignments, ifs, and loops – as well as putting statements

into functions that perform a computation, like calculating an exponent. Importantly, the function we wrote to calculate

exponents is only one possible solution. There are other ways to write this function

– using different statements in different orders – that achieve exactly the same numerical

result. The difference between them is the algorithm,

that is, the specific steps used to complete the computation. Some algorithms are better than others even

if they produce equal results. Generally, the fewer steps it takes to compute,

the better it is, though sometimes we care about other factors, like how much memory

it uses. The term algorithm comes from Persian polymath

Muḥammad ibn Mūsā al-Khwārizmī, who was one of the fathers of algebra more than a

millennium ago. The crafting of efficient algorithms – a

problem that existed long before modern computers – led to a whole science surrounding computation,

which evolved into the modern discipline of… you guessed it! Computer Science!

INTRO

One of the most storied algorithmic problems

in all of computer science is sorting… as in sorting names or sorting numbers. Computers sort all the time. Looking for the cheapest airfare, arranging

your email by most recently sent, or scrolling your contacts by last name — those all require

sorting. You might think “sorting isn’t so tough…

how many algorithms can there possibly be?” The answer is: a lot. Computer Scientists have spent decades inventing

algorithms for sorting, with cool names like Bubble Sort and Spaghetti Sort. Let’s try sorting! Imagine we have a set of airfare prices to

Indianapolis. We’ll talk about how data like this is represented

in memory next week, but for now, a series of items like this is called an array. Let’s take a look at these numbers to help

see how we might sort this programmatically. We’ll start with a simple algorithm. First, let’s scan down the array to find

the smallest number. Starting at the top with 307. It’s the only number we’ve seen, so it’s

also the smallest. The next is 239, that’s smaller than 307,

so it becomes our new smallest number. Next is 214, our new smallest number. 250 is not, neither is 384, 299, 223 or 312. So we’ve finished scanning all numbers,

and 214 is the smallest. To put this into ascending order, we swap

214 with the number in the top location. Great. We sorted one number! Now we repeat the same procedure, but instead

of starting at the top, we can start one spot below. First we see 239, which we save as our new

smallest number. Scanning the rest of the array, we find 223

is the next smallest, so we swap this with the number in the second spot. Now we repeat again, starting from the third

number down. This time, we swap 239 with 307. This process continues until we get to the

very last number, and voila, the array is sorted and you’re ready to book that flight

to Indianapolis! The process we just walked through is one

way – or one algorithm – for sorting an array. It’s called Selection Sort — and it’s

pretty basic. Here’s the pseudo-code. This function can be used to sort 8, 80, or

80 million numbers – and once you’ve written the function, you can use it over and over

again. With this sort algorithm, we loop through

each position in the array, from top to bottom, and then for each of those positions, we have

to loop through the array to find the smallest number to swap. You can see this in the code, where one FOR

loop is nested inside of another FOR loop. This means, very roughly, that if we want

to sort N items, we have to loop N times, inside of which, we loop N times, for a grand

total of roughly N times N loops… Or N squared. This relationship of input size to the number

of steps the algorithm takes to run characterizes the complexity of the Selection Sort algorithm. It gives you an approximation of how fast,

or slow, an algorithm is going to be. Computer Scientists write this order of growth

in something known as – no joke – “big O notation”. N squared is not particularly efficient. Our example array had N = 8 items, and 8 squared

is 64. If we increase the size of our array from

8 items to 80, the running time is now 80 squared, which is 6,400. So although our array only grew by 10 times

– from 8 to 80 – the running time increased by 100 times – from 64 to 6,400! This effect magnifies as the array gets larger. That’s a big problem for a company like

Google, which has to sort arrays with millions or billions of entries. So, you might ask, as a burgeoning computer scientist, is there a more efficient sorting algorithm? Let’s go back to our old, unsorted array

and try a different algorithm, merge sort. The first thing merge sort does is check if

the size of the array is greater than 1. If it is, it splits the array into two halves. Since our array is size 8, it gets split into

two arrays of size 4. These are still bigger than size 1, so they

get split again, into arrays of size 2, and finally they split into 8 arrays with 1 item

in each. Now we are ready to merge, which is how “merge

sort” gets its name. Starting with the first two arrays, we read

the first – and only – value in them, in this case, 307 and 239. 239 is smaller, so we take that value first. The only number left is 307, so we put that

value second. We’ve successfully merged two arrays. We now repeat this process for the remaining

pairs, putting them each in sorted order. Then the merge process repeats. Again, we take the first two arrays, and we

compare the first numbers in them. This time it’s 239 and 214. 214 is lower, so we take that number first. Now we look again at the first two numbers

in both arrays: 239 and 250. 239 is lower, so we take that number next. Now we look at the next two numbers: 307 and

250. 250 is lower, so we take that. Finally, we’re left with just 307, so that

gets added last. In every case, we start with two arrays, each

individually sorted, and merge them into a larger sorted array. We repeat the exact same merging process for

the two remaining arrays of size two. Now we have two sorted arrays of size 4. Just as before, we merge, comparing the first

number in each array, and taking the lower. We repeat this until all the numbers are merged,

and then our array is fully sorted again! The bad news is: no matter how many times

we sort these, you’re still going to have to pay $214 to get to Indianapolis. Anyway, the “Big O” computational complexity

of merge sort is N times the Log of N. The N comes from the number of times we need

to compare and merge items, which is directly proportional to the number of items in the

array. The Log N comes from the number of merge steps. In our example, we broke our array of 8 items

into 4, then 2, and finally 1. That’s 3 splits. Splitting in half repeatedly like this has

a logarithmic relationship with the number of items – trust me! Log base 2 of 8 equals 3 splits. If we double the size of our array to 16 – that’s

twice as many items to sort – it only increases the number of split steps by 1 since log base

2 of 16 equals 4. Even if we increase the size of the array

more than a thousand times, from 8 items to 8000 items, the number of split steps stays

pretty low. Log base 2 of 8000 is roughly 13. That’s more, but not much more than 3 — about

four times larger – and yet we’re sorting a lot more numbers. For this reason, merge sort is much more efficient

than selection sort. And now I can put my ceramic cat collection

in name order MUCH faster! There are literally dozens of sorting algorithms

we could review, but instead, I want to move on to my other favorite category of classic

algorithmic problems: graph search! A graph is a network of nodes connected by

lines. You can think of it like a map, with cities

and roads connecting them. Routes between these cities take different

amounts of time. We can label each line with what is called

a cost or weight. In this case, it’s weeks of travel. Now let’s say we want to find the fastest

route for an army at Highgarden to reach the castle at Winterfell. The simplest approach would just be to try

every single path exhaustively and calculate the total cost of each. That’s a brute force approach. We could have used a brute force approach

in sorting, by systematically trying every permutation of the array to check if it’s

sorted. This would have an N factorial complexity

– that is, the number of items, times one less, times one less than that, and so on until

1. Which is way worse than even N squared. But, we can be way more clever! The classic algorithmic solution to this graph

problem was invented by one of the greatest minds in computer science practice and theory,

Edsger Dijkstra, so it’s appropriately named Dijkstra’s algorithm. We start in Highgarden with a cost of 0, which

we mark inside the node. For now, we mark all other cities with question

marks – we don’t know the cost of getting to them yet. Dijkstra’s algorithm always starts with the

node with lowest cost. In this case, it only knows about one node,

Highgarden, so it starts there. It follows all paths from that node to all

connecting nodes that are one step away, and records the cost to get to each of them. That completes one round of the algorithm. We haven’t encountered Winterfell yet, so

we loop and run Dijkstra’s algorithm again. With Highgarden already checked, the next

lowest cost node is King’s Landing. Just as before, we follow every unvisited

line to any connecting cities. The line to The Trident has a cost of 5. However, we want to keep a running cost from

Highgarden, so the total cost of getting to The Trident is 8 plus 5, which is 13 weeks. Now we follow the offroad path to Riverrun,

which has a high cost of 25, for a total of 33. But we can see inside of Riverrun that we’ve

already found a path with a lower cost of just 10. So we disregard our new path, and stick with

the previous, better path. We’ve now explored every line from King’s

Landing and didn’t find Winterfell, so we move on. The next lowest cost node is Riverrun, at

10 weeks. First we check the path to The Trident, which

has a total cost of 10 plus 2, or 12. That’s slightly better than the previous

path we found, which had a cost of 13, so we update the path and cost to The Trident. There is also a line from Riverrun to Pyke

with a cost of 3. 10 plus 3 is 13, which beats the previous

cost of 14, and so we update Pyke’s path and cost as well. That’s all paths from Riverrun checked…

so… you guessed it, Dijkstra’s algorithm loops again. The node with the next lowest cost is The

Trident and the only line from The Trident that we haven’t checked is a path to Winterfell! It has a cost of 10, plus we need to add in

the cost of 12 it takes to get to The Trident, for a grand total cost of 22. We check our last path, from Pyke to Winterfell,

which sums to 31. Now we know the lowest total cost, and also

the fastest route for the army to get there, which avoids King’s Landing! Dijkstra’s original algorithm, conceived in

1956, had a complexity of the number of nodes in the graph squared. And squared, as we already discussed, is never

great, because it means the algorithm can’t scale to big problems – like the entire road

map of the United States. Fortunately, Dijkstra’s algorithm was improved

later to take the number of nodes in the graph, times the log of the number

of nodes, PLUS the number of lines. Although this looks more complicated, it’s

actually quite a bit faster. Plugging in our example graph, with 6 cities

and 9 lines, illustrates it. Our algorithm drops from 36 loops to around 14 (taking the log in base 10; with a base-2 log it is closer to 25). As with sorting, there are innumerable graph search algorithms, with different pros and cons. Every time you use a service like Google Maps

to find directions, an algorithm much like Dijkstra’s is running on servers to figure

out the best route for you. Algorithms are everywhere and the modern world

would not be possible without them. We touched only the very tip of the algorithmic

iceberg in this episode, but a central part of being a computer scientist is leveraging

existing algorithms and writing new ones when needed, and I hope this little taste has intrigued

you to SEARCH further. I’ll see you next week.
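
The Selection Sort walkthrough in this episode refers to on-screen pseudo-code that does not survive in the transcript. As a stand-in, here is a minimal Python sketch of the steps as described: scan for the smallest remaining number, swap it into place, then start one spot lower. The names `selection_sort` and `fares` are illustrative assumptions, not taken from the episode.

```python
def selection_sort(values):
    """Return a new list with the items of `values` in ascending order."""
    items = list(values)  # work on a copy so the caller's list is untouched
    n = len(items)
    for i in range(n):
        # Scan the unsorted part items[i:] for the smallest number.
        smallest = i
        for j in range(i + 1, n):
            if items[j] < items[smallest]:
                smallest = j
        # Swap it into position i; positions 0..i are now in final order.
        items[i], items[smallest] = items[smallest], items[i]
    return items


# The airfare prices from the walkthrough.
fares = [307, 239, 214, 250, 384, 299, 223, 312]
print(selection_sort(fares))  # [214, 223, 239, 250, 299, 307, 312, 384]
```

The nested `for` loops are where the N squared behaviour comes from. Because the inner loop only scans the unsorted part, the exact number of comparisons is N(N-1)/2, which is 28 for 8 items; that still grows in proportion to N squared.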
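
The merge sort walkthrough can be sketched in a similar way. This is one possible recursive version, assuming Python lists; the helper names `merge` and `merge_sort` are illustrative and not from the episode.

```python
def merge(left, right):
    """Merge two individually sorted lists into one larger sorted list."""
    merged = []
    i = j = 0
    # Repeatedly compare the first unused number in each list, take the lower.
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # One list has run out; whatever is left in the other is already sorted.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def merge_sort(values):
    """Return a new sorted list by splitting in half and merging the halves."""
    if len(values) <= 1:  # arrays of size 0 or 1 need no sorting
        return list(values)
    mid = len(values) // 2
    return merge(merge_sort(values[:mid]), merge_sort(values[mid:]))


print(merge([239, 307], [214, 250]))  # [214, 239, 250, 307]
print(merge_sort([307, 239, 214, 250, 384, 299, 223, 312]))
# [214, 223, 239, 250, 299, 307, 312, 384]
```

The recursion finishes sorting the left half before it touches the right half, whereas the walkthrough splits and merges level by level. The individual merges are the same either way, and for 8 items there are still 3 levels of splitting.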
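
The route from Highgarden to Winterfell can also be checked in code. The sketch below is a priority-queue variant of Dijkstra's algorithm built on Python's standard `heapq` module, not a reconstruction of the original 1956 formulation. The road costs are assumptions inferred from the totals quoted in the walkthrough (for example, Pyke to Winterfell is taken as 18 because the narration gives 13 + 18 = 31), since the map itself is not part of the transcript. The names `ROADS`, `build_graph`, and `dijkstra` are illustrative.

```python
import heapq

# (city, city, weeks of travel). Costs inferred from the narration.
ROADS = [
    ("Highgarden", "King's Landing", 8),
    ("Highgarden", "Riverrun", 10),
    ("Highgarden", "Pyke", 14),
    ("King's Landing", "The Trident", 5),
    ("King's Landing", "Riverrun", 25),
    ("Riverrun", "The Trident", 2),
    ("Riverrun", "Pyke", 3),
    ("The Trident", "Winterfell", 10),
    ("Pyke", "Winterfell", 18),
]


def build_graph(roads):
    """Turn the road list into {city: [(neighbour, weeks), ...]}, both directions."""
    graph = {}
    for a, b, weeks in roads:
        graph.setdefault(a, []).append((b, weeks))
        graph.setdefault(b, []).append((a, weeks))
    return graph


def dijkstra(graph, start, goal):
    """Return (lowest total cost, list of cities on that route)."""
    best = {start: 0}      # lowest known cost to reach each city
    previous = {}          # city -> city we arrived from on the best route
    queue = [(0, start)]   # min-heap of (cost so far, city)
    visited = set()
    while queue:
        cost, city = heapq.heappop(queue)  # unvisited city with the lowest cost
        if city in visited:
            continue  # stale entry: a cheaper route was already processed
        visited.add(city)
        if city == goal:
            break
        for neighbour, weeks in graph[city]:
            new_cost = cost + weeks
            # Keep the new path only if it beats the best one found so far.
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                previous[neighbour] = city
                heapq.heappush(queue, (new_cost, neighbour))
    if goal not in best:
        return None, []
    # Walk backwards from the goal to recover the route.
    path = [goal]
    while path[-1] != start:
        path.append(previous[path[-1]])
    path.reverse()
    return best[goal], path


cost, route = dijkstra(build_graph(ROADS), "Highgarden", "Winterfell")
print(cost, route)
# 22 ['Highgarden', 'Riverrun', 'The Trident', 'Winterfell']
```

Under these assumed costs the result matches the walkthrough: 22 weeks, on a route that avoids King's Landing. Note that `heapq` is a binary heap, so this sketch does not reach the "nodes times log of nodes, plus lines" bound mentioned in the episode; that bound relies on a more elaborate priority queue.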
