Now that we have completed a non-ArcGIS parallel processing exercise, let's look at a couple of examples using ArcGIS functions. There are several caveats or gotchas to using multiprocessing with ArcGIS and it is important to cover them up-front because they affect the ways in which we can write our code.
Esri describes several best practices for multiprocessing with arcpy. These include:
So bearing the top two points in mind we should make use of memory workspaces wherever possible and we should avoid writing to FGDBs (in our worker functions at least – but we could use them in our master function to merge a number of shapefiles or even individual FGDBs back into a single source).
Since we work with other packages such as arcpy, it is important to note that Classes within arcpy such as the Featureclass, Layer, Table, Raster, etc.,. cannot be returned from the worker threads without creating custom serializers to serialize and deserialize the objects between threads. This is known as Pickling and is the process of converting the object to JSON and back to an object. This method is beyond the scope of this course but built in classes and objects within python can be returned. For our example, we will return a dictionary containing information of the process.
Lesson content developed by Jan Wallgrun and James O’Brien