Into the Unknown

  • 0

Into the Unknown

Category : Uncategorized

One of my (grown-up) students at JobTrain wasn’t impressed with my Shakespeare-writing organisms since we knew the phrase we were trying to make it write. He has a point, and that’s what’s wrong with a lot of problems we assign in Math: we already know the answer. It reminded me of the Simpson’s episode where Lisa steals all the Teachers’ Editions of the schoolbooks. Without the answers, the teachers panic!

Making sure a formula or a program works is important and that’s why we test it with something we know the answer to. But once we know it works correctly, we should be extending it and using it on questions we don’t know the answer to!

There’s a famous problem in math and programming called the Traveling Salesman Problem (TSP), in which a salesperson has to travel to a bunch of cities and is of course looking for the shortest route. Even introducing it is a terrific math problem: how many routes are there between 3 cities? 4 cities? 10 cities? It really starts to add up!

Between n cities there are (n-1)! routes, and you need to divide that by 2 because half of them are duplicates, just in the opposite direction.

So between 10 cities there are 181,440 unique routes, and between 20 cities there are 60,822,550,204,416,000 routes! If you could check a million routes randomly every second, that would still take you 1,928 years.

In programming it’s challenging enough to draw the routes between the cities randomly and test their total distance. That’s what I did, following one of Dan Shiffman’s excellent Code Challenges on YouTube. But then you leave it running and it just chooses random routes. The best route improves a little now and then. To apply the genetic algorithm, I gave each organism a score, a length, and a list of numbers signifying a route through the cities. If there were 10 cities, the “cities” list would just be the numbers 0 through 9, shuffled randomly:

class Organism:
    def __init__(self):
        self.score = 0
        self.length = 0
        self.cities = nums[:]
        random.shuffle(self.cities)

Just like the Shakespeare program, but with numbers instead of letters. But how do you score the organism if you don’t know the answer? You give the highest score to the route with the lowest distance.

 def calcScore(self):
     #calculate the length of the route:
     myLength = self.calculateLength()
     #give a score. Lower distance --> higher score
     self.score = 1000000.0/myLength
  
     return self.score

Those organisms with “better genes” get more copies put into the mating pool and little by little the surviving routes improve. I added in some helpful mutations, where a random number of genes get switched.

It isn’t a foolproof system, though: the pool of genes could get stuck in a non-optimal route, but it does a great job for 10,000 organisms (in this model) starting with random routes. In this GIF you can see the process of improving the route in increments of 100 generations per frame. The white number is the best distance in the first generation: 13,537 units. By the end of 1000 generations or so of the organisms evolving better and better genes (which are just lists of numbers), we’ve improved the distance to 3,808 units.

My entire code is on Github.

Your students might be fascinated to know they (and their virtual helpers) are solving problems that have never been solved before! Is their solution the best one? There’s no answer in the back of the book!

Have fun and Go Organisms!


-Peter Farrell
May 15, 2017


Leave a Reply