Working with GIS and Python

Python is a powerful scripting language that lets users automate repetitive tasks and system behaviors. Unlike compiled languages such as C++ and Java, Python is interpreted, and its scripting syntax can ease the learning curve for those new to programming concepts. GIS is a great introduction to Python programming. Please excuse any formatting errors; some indentation was lost when copying the scripts over.

 


 

ArcGIS has robust support for Python, allowing many GIS methods to be automated for streamlined development within the ArcMap software framework. ArcPy is a Python module that lets Python interface directly with ArcGIS software, giving the user powerful scripting capabilities inside the ESRI software ecosystem. ArcMap also includes Model Builder, a graphical interface for chaining geoprocessing tools together without touching the Python shell; the resulting models can be exported as Python scripts, which makes the relationship between Python and the ArcGIS toolkit easier to understand and implement for the visual thinker.
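As a minimal sketch of that interface (assuming a standard ArcGIS install where the arcpy module is importable), you can ask ArcPy which geoprocessing tools it exposes and how a given tool is called:

import arcpy

# list the geoprocessing tools in the Analysis toolbox
for tool in arcpy.ListTools("*_analysis"):
    print(tool)

# print the calling syntax for the Buffer tool
print(arcpy.Usage("Buffer_analysis"))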

I provided examples of my own work that have been written in either Model Builder or in the Python IDE. I tried to keep the examples strictly geographic for this post. These scripts aren’t guaranteed to work flawlessly or gracefully. This is a continued learning experience for me and any constructive criticism is welcome.

Here’s an example of what I’ve found possible using Python and ArcPy.

 


# -*- coding: utf-8 -*-
# ---------------------------------------------------------------------------
# 50m.py
# Created on: 2017-03-28 15:18:18.00000
# (generated by ArcGIS/ModelBuilder)
# Description:
# ---------------------------------------------------------------------------

# Import arcpy module
import arcpy

# Local variables:
Idaho_Moscow_students = "Idaho_Moscow_students"
Idaho_Moscow_busstops_shp = "H:\\Temp\\Idaho_Moscow_busstops.shp"
busstops_50m_shp = "H:\\Temp\\busstops_50m.shp"
within_50m_shp = "H:\\Temp\\within_50m.shp"

# Process: Buffer
arcpy.Buffer_analysis(Idaho_Moscow_busstops_shp, busstops_50m_shp, "50 Meters", "FULL", "ROUND", "ALL", "", "PLANAR")

# Process: Intersect
arcpy.Intersect_analysis("Idaho_Moscow_students #;H:\\Temp\\busstops_50m.shp #", within_50m_shp, "ALL", "", "INPUT")

 

The output is tidy and properly commented by default, saving the user the time it would usually take to make the code neat and legible. It also includes a proper header and the locations of the map assets. All of this is generated on the fly, so presentable code is produced every time, which is a good reason to use Model Builder rather than writing everything by hand in the IDE.

The script above takes a dataset containing spatial information about students and bus stops in Moscow, Idaho, applies a 50 meter buffer to the bus stops, and creates a shapefile of all the students that fall within that buffer. Further operations can then be applied to the newly created 50m layer on the fly, and the buffer distance can be incremented in Model Builder to create shapefiles for other distances.

The benefit of this over manually creating the shapefile is the enormous amount of time saved. Depending on how thorough the GIS is, each one of these points might need its own shapefile or aggregation of shapefiles. This script runs the hundred or so geoprocessing operations needed to create the spatial assets in a fraction of the time it would take a human.

The script below takes the same concept but changes the variables so the output is 100m instead of 50m. Segments of the code can be changed to augment the operation without starting from scratch. This makes it possible to automate the creation of these scripts, the ultimate goal.

 

# -*- coding: utf-8 -*-
# ---------------------------------------------------------------------------
# 100m.py
# Created on: 2017-03-28 15:19:04.00000
# (generated by ArcGIS/ModelBuilder)
# Description:
# ---------------------------------------------------------------------------

# Import arcpy module
import arcpy

# Local variables:
Idaho_Moscow_students = "Idaho_Moscow_students"
Idaho_Moscow_busstops_shp = "H:\\Temp\\Idaho_Moscow_busstops.shp"
busstops_100m_shp = "H:\\Temp\\busstops_100m.shp"
within_100m_shp = "H:\\Temp\\within_100m.shp"

# Process: Buffer
arcpy.Buffer_analysis(Idaho_Moscow_busstops_shp, busstops_100m_shp, "100 Meters", "FULL", "ROUND", "ALL", "", "PLANAR")

# Process: Intersect
arcpy.Intersect_analysis("Idaho_Moscow_students #;H:\\Temp\\busstops_100m.shp #", within_100m_shp, "ALL", "", "INPUT")

 

This 100m version can be produced by changing the buffer distance in Model Builder and exporting the script again, or simply by using the replace function in your favorite text editor. By changing one variable we have another properly formatted script, saving the time that would have been spent manually operating the tools in the ArcMap workspace. This can be developed further to take input from the user and run the tools directly through ArcPy, opening up the possibility of "headless" GIS operations with no manual design work at all.
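As a rough sketch of that idea, the buffer distance could be read from user input and the same Buffer and Intersect tools run for each value. The inputs and paths below are the ones from the exported models above; the prompt and the loop are my own:

import arcpy

students = "Idaho_Moscow_students"
busstops = "H:\\Temp\\Idaho_Moscow_busstops.shp"

# e.g. "50;100;250" typed by the user
distances = raw_input("Buffer distances in meters (separated by semicolons): ").split(";")

for d in distances:
    d = d.strip()
    buffer_out = "H:\\Temp\\busstops_%sm.shp" % d
    within_out = "H:\\Temp\\within_%sm.shp" % d

    # the same Buffer + Intersect sequence as the exported models
    arcpy.Buffer_analysis(busstops, buffer_out, "%s Meters" % d, "FULL", "ROUND", "ALL", "", "PLANAR")
    arcpy.Intersect_analysis([students, buffer_out], within_out, "ALL", "", "INPUT")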

This functionality extends to database operations. In the following script, shapefiles are created based on attributes in a table.

 


# ---------------------------------------------------------------------------
# airport_buffer.py
# Created on: 2017-03-30 09:06:41.00000
# (generated by ArcGIS/ModelBuilder)
# Description:
# ---------------------------------------------------------------------------

# Import arcpy module
import arcpy

 

# Local variables:
airports = "airports"
airports__4_ = airports
Airport_buffer_shp = "H:\\Ex7\\Exercise07\\Challenge\\Airport_buffer.shp"

# Process: Select Layer By Attribute
arcpy.SelectLayerByAttribute_management(airports, "NEW_SELECTION", "\"FEATURE\" = 'Airport'")

# Process: Buffer
arcpy.Buffer_analysis(airports__4_, Airport_buffer_shp, "15000 Meters", "FULL", "ROUND", "ALL", "", "PLANAR")

 

This script finds all features labeled "Airport" in a dataset and creates a 15km buffer around each one. By integrating SQL queries, data can easily be parsed and presented. All of this code can be generated using Model Builder in the ArcMap client. Efficient scripting comes down to applying Python's logic cleanly, with a clear sense of the objective to be achieved.
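The same pattern generalizes to any attribute value. In the sketch below, the two categories and buffer distances come from the airport scripts in this post and the output folder is the same Challenge folder used above; the loop itself is my own:

import arcpy

airports = "airports"

# category / buffer distance pairs taken from the examples in this post
categories = [("Airport", "15000 Meters"), ("Seaplane Base", "7500 Meters")]

for value, distance in categories:
    # select the features of this category with a SQL where clause
    where = "\"FEATURE\" = '%s'" % value
    arcpy.SelectLayerByAttribute_management(airports, "NEW_SELECTION", where)

    # buffer only the selected features
    out_shp = "H:\\Ex7\\Exercise07\\Challenge\\%s_buffer.shp" % value.replace(" ", "_")
    arcpy.Buffer_analysis(airports, out_shp, distance, "FULL", "ROUND", "ALL", "", "PLANAR")

Scripts can also inspect data rather than transform it, as the next example shows.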

 


import arcpy
arcpy.env.workspace = "H:/Exercise12"

def countstringfields():
	fields = arcpy.ListFields("H:/Exercise12/streets.shp", "", "String")
	namelist = []
	for field in fields:
		namelist.append(field.type)
	print(len(namelist))

countstringfields()

 

This script counts the number of "string" fields in a shapefile's attribute table. The function countstringfields starts by asking arcpy for all fields of type "String" in the shapefile. Next, an empty list of names is defined. A loop then appends the type of each matching field to the list, essentially counting the string fields "by hand", and the resulting count is printed for the user, all without opening the ArcMap client. The script can be developed further by reading the shapefile and the field type from user input, as sketched below. The proper use of indentation and whitespace is an important part of Python syntax, so special care should be taken with constructs like nested loops.
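A parameterized version might look something like this minimal sketch, which assumes the user supplies a valid shapefile path and an arcpy field type such as "String" or "Integer":

import arcpy

def countfields(shapefile, fieldtype):
    # count the fields of the requested type in the shapefile's attribute table
    fields = arcpy.ListFields(shapefile, "", fieldtype)
    return len(fields)

shapefile = raw_input("Path to the shapefile: ")
fieldtype = raw_input("Field type to count (e.g. String): ")
print("%s has %s %s field(s)" % (shapefile, countfields(shapefile, fieldtype), fieldtype))

Scripts can also be used to update datasets in addition to parsing them.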

 


import arcpy
from arcpy import env
env.workspace = "H:/Ex7/Exercise07/"
fc = "Results/airports.shp"

cursor = arcpy.da.UpdateCursor(fc, ["TOT_ENP"])
for row in cursor:
	# placeholder body: write each row back unchanged
	cursor.updateRow(row)
del cursor

Python is just as comfortable outside of ArcMap. The next script takes lists of triangle side lengths, classifies each triangle, and checks whether it is valid.

def triangle_type(a, b, c):
	if a == b == c:
		print "Type: Equilateral"
	elif a == b or b == c or a == c:
		print "Type: Isosceles"
	else:
		print "Type: Scalene"

def triangle_validity(a, b, c):
	if (c > a + b) or (a > b + c) or (b > a + c):
		print "Valid: No"
	else:
		print "Valid: Yes"

##example side lengths
listA = [3, 4, 5]
listB = [6, 6, 6]
listC = [1, 2, 8]
listD = [4, 4, 7]
count = 0

##inputs
print("Feeding program lists of measurements...")

while count < 4:
	if count == 0:
		print listA
		a, b, c = listA
		count = count + 1
		triangle_type(a, b, c)
		triangle_validity(a, b, c)
	elif count == 1:
		print listB
		a, b, c = listB
		count = count + 1
		triangle_type(a, b, c)
		triangle_validity(a, b, c)
	elif count == 2:
		print listC
		a, b, c = listC
		count = count + 1
		triangle_type(a, b, c)
		triangle_validity(a, b, c)
	else:
		print listD
		a, b, c = listD
		count = count + 1
		triangle_type(a, b, c)
		triangle_validity(a, b, c)

 

The script above accepts input in the form of multiple lists. The lists are preprogrammed in this case, but they could just as easily come from user input or be read from a text file, as sketched below. The while loop uses a counter to track how many times it has run, and the loop is nested with some conditional logic.
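As a minimal sketch of the file-based variant, reusing the triangle_type and triangle_validity functions defined above and assuming a plain text file (here called triangles.txt) with three comma-separated side lengths per line, the hardcoded lists could be dropped entirely:

# read side lengths from a text file instead of hardcoded lists
f = open("H:\\triangles.txt")
for line in f.read().splitlines():
    a, b, c = [float(x) for x in line.split(",")]
    print (a, b, c)
    triangle_type(a, b, c)
    triangle_validity(a, b, c)
f.close()

The following script was fun to make.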

 

import csv

yield1999 = []
yield2000 = []
yield2001 = []

f = open(r'C:\Users\jdean32\Downloads\yield_over_the_years.csv')
csv_f = csv.reader(f)
next(csv_f)
for row in csv_f:
	yield1999.append(row[0])
	yield2000.append(row[1])
	yield2001.append(row[2])

yield1999 = map(float, yield1999)
yield2000 = map(float, yield2000)
yield2001 = map(float, yield2001)

f.close()

print("1999: %s" % yield1999)
print("2000: %s" % yield2000)
print("2001: %s" % yield2001)

year1999 = 1999
max_value_1999 = max(yield1999)
min_value_1999 = min(yield1999)
avg_value_1999 = sum(yield1999)/len(yield1999)
print("\nIn %s, %s was the maximum yield, %s was the minimum yield and %s was the average yield." % (year1999, max_value_1999, min_value_1999, avg_value_1999))

year2000 = 2000
max_value_2000 = max(yield2000)
min_value_2000 = min(yield2000)
avg_value_2000 = sum(yield2000)/len(yield2000)
print("In %s, %s was the maximum yield, %s was the minimum yield and %s was the average yield." % (year2000, max_value_2000, min_value_2000, avg_value_2000))

year2001 = 2001
max_value_2001 = max(yield2001)
min_value_2001 = min(yield2001)
avg_value_2001 = sum(yield2001)/len(yield2001)
print("In %s, %s was the maximum yield, %s was the minimum yield and %s was the average yield." % (year2001, max_value_2001, min_value_2001, avg_value_2001))

 

Like I said before, this was fun to make. Always eager to take the road less traveled, I thought of the most obtuse way to make this calculation. The objective of the script above is to read text from a file and compare three years of agricultural data, finding the maximum, minimum and average yield for each year. This is all accomplished with a quick for loop and several sets of variables to make sure the final answers are correct. The program can ingest different input, so changing the text file, or the location it is read from, produces different results from the same script, as sketched below. Different data can be run through this particular operation automatically.
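To make that concrete, here is a small sketch (my own refactoring, not the original script) that wraps the summary in a function taking the CSV path as an argument, so any file with the same three-column layout can be run through it:

import csv

def summarize_yields(csv_path):
    # read one column of yields per year, skipping the header row
    f = open(csv_path)
    rows = list(csv.reader(f))[1:]
    f.close()

    years = {1999: [], 2000: [], 2001: []}
    for row in rows:
        years[1999].append(float(row[0]))
        years[2000].append(float(row[1]))
        years[2001].append(float(row[2]))

    for year in sorted(years):
        values = years[year]
        print("In %s, %s was the maximum yield, %s was the minimum yield and %s was the average yield."
              % (year, max(values), min(values), sum(values) / len(values)))

summarize_yields(r'C:\Users\jdean32\Downloads\yield_over_the_years.csv')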

 

yield1999 = [3.34, 21.8, 1.34, 3.75, 4.81]
yield2000 = [4.07, 4.51, 3.9, 3.63, 3.15]
yield2001 = [4.21, 4.29, 4.64, 4.27, 3.55]

location1 = (yield1999[0] + yield2000[0] + yield2001[0])/3
location2 = (yield1999[1] + yield2000[1] + yield2001[1])/3
location3 = (yield1999[2] + yield2000[2] + yield2001[2])/3
location4 = (yield1999[3] + yield2000[3] + yield2001[3])/3
location5 = (yield1999[4] + yield2000[4] + yield2001[4])/3

locations = [location1, location2, location3, location4, location5]
count = 1

text_fileA = open("C:\Temp\OutputA.txt", "w")

for i in locations:
	text_fileA.write(("The average yield at location %s between 1999 and 2001 was %.2f\n") %(count, i))
	count = count + 1

text_fileA.close()

max1999 = (yield1999.index(max(yield1999))+1)
max2000 = (yield2000.index(max(yield2000))+1)
max2001 = (yield2001.index(max(yield2001))+1)
min1999 = (yield1999.index(min(yield1999))+1)
min2000 = (yield2000.index(min(yield2000))+1)
min2001 = (yield2001.index(min(yield2001))+1)

minmax1999 = [1999, max1999, min1999]
minmax2000 = [2000, max2000, min2000]
minmax2001 = [2001, max2001, min2001]

minmax = [minmax1999, minmax2000, minmax2001]

text_fileB = open("C:\Temp\OutputB.txt", "w")

for i in minmax:
	text_fileB.write(("In %s we yielded the least at location %s and the most at location %s.\n") %(i[0], i[2], i[1]))

text_fileB.close()

 

Another attempt at the agriculture problem. Versioning is something I find useful not only for keeping a record of changes but also for keeping track of progress. This was the fourth version of this script, and I think it turned out rather unorthodox, which is what I find most interesting about coding: there are multiple approaches to completing any objective. The two scripts above are similar but approached in different ways. This one uses a for loop to run through a contextually sensitive number of inputs. The values were hardcoded into the program as variables at the start of the script, but they could be read from a file if necessary.

The following script looks for the basin layer in an ArcMap document and clips the soils layer using the basin layer, producing an area where both the soil layer and the basin layer are present. From this clipped soil layer, the script then selects the features whose farmland classification is "Not prime farmland". This is useful for property development, where the amount of available farmland is a consideration.

 

 

import arcpy

print "Starting"

soils = "H:\\Final_task1\\soils.shp"
basin = "H:\\Final_task1\\basin.shp"
basin_Clip = "C:\\Users\\jdean32\\Documents\\ArcGIS\\Default.gdb\\basin_Clip"
task1_result_shp = "H:\\task1_result.shp"

arcpy.Clip_analysis(soils, basin, basin_Clip, "")

arcpy.Select_analysis(basin_Clip, task1_result_shp, "FARMLNDCL = 'Not prime farmland'")

print "Completed"

 

The next script clips all feature classes from a folder called "USA" according to the Iowa state boundary. It then places them in a new folder. This is useful if you have country-wide data but only want to present the data from a particular area, in this case Iowa.

The script will automatically read all shapefiles in the USA folder, no matter the amount.

 

 

import arcpy

sourceUSA = "H:\\Final_task2\\USA"
sourceIowa = "H:\\Final_task2\\Iowa"
iowaBoundary = "H:\\Final_task2\\Iowa\\IowaBoundary.shp"

arcpy.env.workspace = sourceUSA
fcList = arcpy.ListFeatureClasses()

print "Starting"

for features in fcList:
	outputfc = sourceIowa + "\\Iowa" + features
	arcpy.Clip_analysis(features, iowaBoundary, outputfc)

print "Completed"

 

The following script finds the average population for a set of counties in a dataset. By dividing the total population by the number of counties, the average population is found. This is useful for calculating summary values in large datasets without doing it by hand.

 

 

import arcpy

featureClass = "H:\\Final_task3\\Counties.shp"

row1 = arcpy.SearchCursor(featureClass)
row2 = row1.next()

avg = 0
totalPop = 0
totalRecords = 0

while row2:
	totalPop += row2.POP1990
	totalRecords += 1
	row2 = row1.next()

avg = totalPop / totalRecords
print "The average population of the " + str(totalRecords) + " counties is: " + str(avg)

 

The following is a modified script that calculates the driving distance between two locations. Originally the script calculated the distance between UNCC and uptown Charlotte; it has been edited to accept user input. The API is finicky, so the variables have to be formatted exactly right to call the correct data. User input is therefore reconciled by replacing spaces with plus signs before it is inserted into the request URL.

 

## Script Title: Printing data from a URL (webpage)
## Author(s): CoDo
## Date: December 2, 2015

# Import the urllib2 and json libraries
import urllib2
import json
import re

originaddress = raw_input("What is the address?\n")
originstate = raw_input("What is the state?\n")
originzip = raw_input("What is the zipcode\n")
destinationaddress = raw_input("What is the destination address?\n")
destinationstate = raw_input("What is the state?\n")
destinationzip = raw_input("What is the destination zipcode\n")

print originaddress
print originstate
print originzip
print destinationaddress
print destinationstate
print destinationzip

originaddress = originaddress.replace(" ", "+")
destinationaddress = destinationaddress.replace(" ", "+")

# Google API key (get it at https://code.google.com/apis/console)
google_APIkey = "" ##removed for security

# Read the response url of our request to get directions from UNCC to the Time Warner Cable Arena
url_address = 'https://maps.googleapis.com/maps/api/directions/json?origin=%s,%s+%s&destination=%s,%s+%s&key='% (originaddress, originstate, originzip, destinationaddress, destinationstate, destinationzip) + google_APIkey
##url_address = 'https://maps.googleapis.com/maps/api/directions/json?origin=1096+Meadowbrook+Ln+SW,NC+28027&destination=9201+University+City+Blvd,NC+28223
url_sourceCode = urllib2.urlopen(url_address).read()

# Convert the url's source code from a string to a json format (i.e. dictionary type)
directions_info = json.loads(url_sourceCode)

# Extract information from the dictionary holding the information about the directions
origin_name = directions_info['routes'][0]['legs'][0]['start_address']
origin_latlng = directions_info['routes'][0]['legs'][0]['start_location'].values()
destination_name = directions_info['routes'][0]['legs'][0]['end_address']
destination_latlng = directions_info['routes'][0]['legs'][0]['end_location'].values()
distance = directions_info['routes'][0]['legs'][0]['distance']['text']
traveltime = directions_info['routes'][0]['legs'][0]['duration']['value'] / 60

# Print a phrase that summarizes the trip
print "Origin: %s %s \nDestination: %s %s \nEstimated travel time: %s minutes" % (origin_name, origin_latlng, destination_name, destination_latlng, traveltime)

 

This next script looks for feature classes in a workspace and prints the name of each feature class and the geometry type. This would be useful for parsing datasets and looking for specific features, like polygons.

 

import arcpy
from arcpy import env
env.workspace = "H:/arcpy_ex6/Exercise06"
fclist = arcpy.ListFeatureClasses()
for fc in fclist:
	fcdescribe = arcpy.Describe(fc)
	print (fcdescribe.basename + " is a " + fcdescribe.shapeType.lower() + " feature class")

 

The following script adds a text field called Ferry to the attribute table of a roads feature class and populates it with either "Yes" or "No", depending on the value of the FEATURE field.

This is useful for quickly altering data in an attribute field or dataset without directly interfacing with the ArcMap client.

 

##libraries
import arcpy
from arcpy import env
env.workspace = "C:/Users/jdean32/Downloads/Ex7/Exercise07"

##variables
fclass = "roads.shp"
nfield = "Ferry"
ftype = "TEXT"
fname = arcpy.ValidateFieldName(nfield)
flist = [field.name for field in arcpy.ListFields(fclass)]

if fname not in flist:
	arcpy.AddField_management(fclass, fname, ftype, "", "", 12)
	print "Ferry attribute added."

cursor = arcpy.da.UpdateCursor(fclass, ["FEATURE", "FERRY"])

for row in cursor:
	if row[0] == "Ferry Crossing":
		row[1] = "Yes"
	else:
		row[1] = "No"
	cursor.updateRow(row)
del cursor

 

The following script uses the same approach as the airport script near the beginning of this article. That earlier script created a 15,000 meter buffer around airport features in a shapefile; this one creates a 7,500 meter buffer around the airports that operate seaplanes, which requires looking at the attribute table for seaplane bases. The end result is two separate buffers. A picture is worth a thousand words, and with two buffers we multiply the amount of information a single cartographic visualization can convey.

 

# -*- coding: utf-8 -*-
# ---------------------------------------------------------------------------
# airport_buffer.py
# Created on: 2017-03-30 09:06:41.00000
# (generated by ArcGIS/ModelBuilder)
# Description:
# ---------------------------------------------------------------------------

# Import arcpy module
import arcpy

# Local variables:
airports = "airports"
airports__4_ = airports
Airport_buffer_shp = "H:\\Ex7\\Exercise07\\Challenge\\Seaplane_base_buffer.shp"

# Process: Select Layer By Attribute
arcpy.SelectLayerByAttribute_management(airports, "NEW_SELECTION", "\"FEATURE\" = 'Seaplane Base'")

# Process: Buffer
arcpy.Buffer_analysis(airports__4_, Airport_buffer_shp, "7500 Meters", "FULL", "ROUND", "ALL", "", "PLANAR")

 

Finally, we have a script that looks through a workspace, reads the feature classes, copies each of them into one geodatabase, and copies the polygon feature classes into a second geodatabase as well. Once again, this makes it easy to parse and migrate data between sets.

 

import arcpy, os
arcpy.env.workspace = r'H:\arcpy_ex6\Exercise06'
fclass = arcpy.ListFeatureClasses()

outputA = r'H:\arcpy_ex6\Exercise06\testA.gdb'
outputB = r'H:\arcpy_ex6\Exercise06\testB.gdb'

for fc in fclass:
	fcdesc = arcpy.Describe(fc).shapeType
	# drop the .shp extension so the output name is valid inside a geodatabase
	fcname = os.path.splitext(fc)[0]
	outputC = os.path.join(outputA, fcname)
	arcpy.CopyFeatures_management(fc, outputC)
	if fcdesc == 'Polygon':
		outputC = os.path.join(outputB, fcname)
		arcpy.CopyFeatures_management(fc, outputC)

 

Python is a blessing for geographers who want to automate their work. Its logical but strict syntax makes for easily legible code, and its integration with the ArcGIS suite and relatively gentle learning curve make it easy to pick up for experts and beginners alike. Model Builder abstracts the programming process and makes it easier for people who are more comfortable with GUI interfaces. There is little a geographer with a strong knowledge of Python and mathematics can't do.
