GEOG 489
Advanced Python Programming for GIS

1.6.3.3 Distributed processing


Distributed processing is a type of parallel processing that, instead of using only the processors in a single machine, uses the processors of multiple machines. Of course, this requires access to multiple machines to run your code on, but with the rise of cloud computing platforms from providers such as Amazon, Google, and Microsoft, this is becoming more widespread and more affordable. We won’t cover the specifics of how to implement distributed processing in this class, but we have provided a few links if you want to explore the theory in more detail.

In a nutshell, distributed processing takes our idea of multiprocessing on a single machine and scales it out: instead of using only the 4 (or however many) processors we have available locally, we access a number of machines over the internet and use the processors in all of them. Hadoop is one framework for achieving this and Amazon's Elastic MapReduce is another; distributed databases such as MongoDB and Cassandra similarly spread data and queries across many machines. GEOG 865 has cloud computing as its main topic, so if you are interested in this, you may want to check it out.
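To make the idea a little more concrete, here is a minimal sketch of the split, process, and combine pattern that distributed frameworks build on. It is not how Hadoop or any other specific framework is implemented. The function names (split_into_chunks, sum_of_squares, distributed_sum_of_squares) are hypothetical and invented for this illustration, and a local thread pool stands in for the remote machines, which is an assumption made only so the example runs on a single computer.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, n_chunks):
    """Split a list into n_chunks contiguous pieces of roughly equal size."""
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks each take one additional item.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def sum_of_squares(chunk):
    """The per-worker step: process one chunk independently of the others."""
    return sum(x * x for x in chunk)


def distributed_sum_of_squares(items, executor, n_chunks=4):
    """Send each chunk to a worker, then combine the partial results.

    `executor` is anything with a concurrent.futures-style map() method.
    Here it is a local thread pool (an assumption for illustration); in a
    real distributed setup the workers would be other machines.
    """
    chunks = split_into_chunks(items, n_chunks)
    partial_sums = executor.map(sum_of_squares, chunks)  # one task per chunk
    return sum(partial_sums)                             # combine step


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=4) as pool:
        total = distributed_sum_of_squares(list(range(1, 101)), pool)
    print(total)  # 338350
```

The key design point is that each chunk can be processed without knowing anything about the other chunks, so it does not matter whether a worker is another core on the same machine or a processor on a different machine. Only the final combine step needs all of the partial results together.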