Python for Quick and Easy GIS Data Manipulation

Getting started with Python

The key to every information system is the data within it. It’s rare to find an organization that has all of its data structured perfectly and organized in the exact formats and systems needed to accomplish its goals. Many of our clients come to us with data that is incomplete, or in a spreadsheet when it needs to be in a Shapefile or a geodatabase. The Python programming language is an invaluable tool when it comes to geo-data manipulation. From copying field values from one table to another to converting between spatial data storage formats, Python scripts can speed up many data-related workflows.

Python scripts are great time savers for situations when there are a lot of features to be updated and more complicated rules regarding the updates. For something like calculating the value of a field based on another field, the field calculator built into ArcGIS or QGIS is the way to go. However, with situations like updates that depend upon a spatial relationship, or field calculations based on a field in another table, Python is an excellent tool for the job. Without scripting, these types of workflows become painstaking grinds of manual feature-by-feature updates. But with a little know-how, it doesn’t have to be that way.

Recently, we needed to update several thousand parcels to create a field that would indicate which rivers and tributaries ran through each parcel. In our rivers layer, we had several dozen rivers and several hundred creeks and tributaries. Instead of doing a tedious manual update, we wrote a Python script using the open source OGR module. Read on for a line-by-line walkthrough of the script.

Python Script – Line-by-Line Walkthrough:  Startup

Assuming you have the Python OGR/GDAL module installed (it comes with QGIS, or you can install it separately with pip or conda), start by importing the OGR module:

from osgeo import ogr

First, we define the data sources by assigning their file system paths to variables:

rivers_shp = r"L:\Line 45\Shapefiles\rivers.shp"
parcels_shp = r"L:\Line 45\Shapefiles\parcels.shp"

Next, we open the files and store the resulting objects to variables:

parcels_src = ogr.Open(parcels_shp, 1)
rivers_src = ogr.Open(rivers_shp, 0)

The digit passed after each file name variable indicates the mode the file should be opened in: 0 for read-only, 1 for write access. Since we’re updating the parcels layer, we open it with write access.

OGR works with objects called layers, which we can get from our opened file object variables using the GetLayer() method:

rivers_layer = rivers_src.GetLayer()
parcels_layer = parcels_src.GetLayer()

Python Script – Line-by-Line Walkthrough:  Loop Through Rivers

Once we have a layer, we can iterate through it feature by feature using a for loop. If you’re familiar with ArcPy, this is similar to how the Data Access cursors let you iterate through data. Now that we have rivers to iterate over, we need to do something with each river:

for river in rivers_layer:
    river_name = river.GetField("GNIS_NAME")
    river_shape = river.GetGeometryRef()
    parcels_layer.SetSpatialFilter(river_shape)

The first line above sets up the for loop, which will access each feature (each river, in our case) in the layer one at a time. The three lines that follow tell it what we want done with each river once it’s accessed. First, we read a field, GNIS_NAME, and store its value in the variable river_name. Second, we look up the geometry (the shape) of the river and store it in the variable river_shape. Finally, we set a spatial filter on our parcels layer. This is like a select-by-location, where we limit our parcels to only those intersecting the shape we pass to the filter. In our case, we want only parcels that intersect the river.

Python Script – Line-by-Line Walkthrough:  Loop Through Parcels

With the parcels selected, we can iterate through them with another for loop to apply our updates:

for parcel in parcels_layer:
    frontage = parcel.GetField("frontage")
    if frontage:
        frontage += ", {}".format(river_name)
    else:
        frontage = river_name
    parcel.SetField("frontage", frontage)
    parcels_layer.SetFeature(parcel)
    parcel = None

Just like before, we use GetField() to read the value of the frontage field and store it in a variable. The next line uses if to create a condition: if there is a value in the frontage field, run the code indented below it; otherwise, run the code indented below the corresponding else statement. So, if the frontage field already has a value, we append a comma, a space, and the name of the current river. If there was no value at all in the frontage field, we set our frontage variable to the name of the river.

Similar to how we read field values with GetField(), we can write values to fields using SetField(). In the code, the first argument, the string "frontage", indicates which field we want to write to; the second argument is the value we want written (the variable we updated above with the river name). To save our changes to the file, we call SetFeature() on the parcel we’re updating, then de-reference it by setting its value to None. This finalizes our changes to that parcel.
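For reference, here’s the complete script with the snippets above assembled in their proper nesting. Three housekeeping lines the walkthrough didn’t show are added here: a ResetReading() call so each pass starts at the first parcel, clearing the spatial filter at the end, and de-referencing the data source so edits are flushed to disk.

from osgeo import ogr

rivers_shp = r"L:\Line 45\Shapefiles\rivers.shp"
parcels_shp = r"L:\Line 45\Shapefiles\parcels.shp"

parcels_src = ogr.Open(parcels_shp, 1)  # 1 = write access
rivers_src = ogr.Open(rivers_shp, 0)    # 0 = read-only

rivers_layer = rivers_src.GetLayer()
parcels_layer = parcels_src.GetLayer()

for river in rivers_layer:
    river_name = river.GetField("GNIS_NAME")
    river_shape = river.GetGeometryRef()
    # Limit the parcels layer to parcels intersecting this river
    parcels_layer.SetSpatialFilter(river_shape)
    parcels_layer.ResetReading()  # start each pass at the first parcel
    for parcel in parcels_layer:
        frontage = parcel.GetField("frontage")
        if frontage:
            frontage += ", {}".format(river_name)
        else:
            frontage = river_name
        parcel.SetField("frontage", frontage)
        parcels_layer.SetFeature(parcel)  # write the change back to the layer
        parcel = None

parcels_layer.SetSpatialFilter(None)  # clear the filter when done
parcels_src = None  # de-reference the data source to flush edits to disk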

Easily Extensible

This is the logic for a single iteration of the loop through the rivers. Now that we have that set up, when we run the script it will go through all of our hundreds of rivers and thousands of parcels and update them, all in a matter of seconds. This example uses parcels and rivers, but the same logic could be applied to compile records of all the roads that pass through sensitive areas, all the trails that lead into a particular wilderness area, all the buildings within a given business district, or any number of other scenarios. With a little tweaking, we could get even more information, such as how many feet of frontage a parcel has along each river.
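For instance, a hedged sketch of that frontage-length tweak (our addition, not part of the original script) could go inside the parcel loop. It assumes the layers are in a projected coordinate system measured in feet:

# Inside the parcel loop, after the spatial filter is set:
overlap = river_shape.Intersection(parcel.GetGeometryRef())
if overlap is not None and not overlap.IsEmpty():
    # Length of the river within the parcel; units come from the
    # layer's coordinate system (assumed here to be feet)
    frontage_feet = overlap.Length()
    print("Parcel {} has {:.1f} ft of frontage on {}".format(
        parcel.GetFID(), frontage_feet, river_name))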

If this has you interested in using Python to solve your problems, check out some of the examples in this GDAL/OGR cookbook.  ArcPy is also very well documented, with some quick walkthroughs.  We’d also love to hear from you and help implement solutions like these with your data and requirements.

– Ben

Updates from the Line 45 Team

It has been a while since we’ve posted and a lot has been going on.  Since our last posts, we’ve been busy looking at the direction of our business and making changes to orient Line 45 for future growth.  The following shares updates on some of our key efforts.

Refining Our Goals

When we started our business, we saw lots of opportunities and possibilities.  This was extremely exciting and we wanted to keep those opportunities wide open so that we didn’t focus too narrowly and we could get experience providing a variety of services to many kinds of clients.  This has been a good experience and we’ve gotten to do some great projects.  However, as we’ve gotten busier and busier, we’ve needed to focus on the things that we do best and enjoy most.  With that, we’ve focused our business on providing mapping, GIS, and software solutions to the environmental and natural resource sectors.  This focus draws on our own expertise in natural resources and allows us to provide a bridge between natural resource companies and the technology which allows them to be more efficient and more effective.

Growing Our Team

Ben Romlein – GIS Specialist & Developer

Steve Buyze – Pilot & Environmental Specialist

If you check out our About page, you’ll notice that we’ve added Benjamin Romlein and Steven Buyze to our team.  Steve is our UAV/Drone pilot.  He is a geologist by training and provides expertise in the environmental and regulatory fields.

Ben is an experienced GIS analyst and developer.  He holds a master’s degree in Information Science from Indiana University in Bloomington and worked for a number of years with a civil engineering firm.  He brings a wealth of knowledge in the Esri/ArcGIS stack, data manipulation with Python, data modeling and database design, cartography, SQL, and front end development with JavaScript and OpenLayers. His past work includes building custom geoprocessing tools, CAD to GIS conversions, business development, and implementing GIS systems for power utilities in the Great Lakes region.  Ben and Steve have added technical expertise, a fresh perspective, great ideas and much needed help to our team.

Upgrading Our Image

Line 45 website

We just released our new website which we built using Bootstrap and Material Design concepts. We also updated the content to better reflect what we do.  Please check it out and let us know what you think: www.line-45.com.

Planning for the Future

We continue to look forward to and plan for the future.  We’ve started working with GIS Cloud to provide a cloud infrastructure that we can implement for clients, whether they want to take their GIS to the next level or provide access to team members in a variety of locations.

We will also work with Boundless Geo to provide enterprise-level support and implementation of open source GIS systems that lower costs, avoid vendor lock-in, and provide excellent functionality that can drive any business.

Expect to hear more about these topics and others in the coming months.

– Alex

Fetch and Combine CSV Files from the Web

If you’ve ever analyzed large amounts of data, you know that it can be time consuming to get all the data you need into one place.  Data providers (often government agencies) frequently provide the data online in a comma delimited format (aka CSV).  The data is regularly broken down into multiple files to provide smaller downloads or to provide more manageable data sets for users.  It is a time-consuming task to fetch and combine CSV files from the web into one file for large scale analysis.

A recent discussion with an engineer inspired us to put together a simple program to solve this problem.  We created a tool to fetch and combine CSV files from the web into one locally stored file.  This can save you lots of time if you routinely need to retrieve the same data sets over and over for analysis.  The tool can be configured to fetch the files you need and then scheduled to run at routine intervals.

Everybody likes free stuff, so we’re making this tool available free of charge to any of our readers.  If this interests you, we have some instructions below to help you get started.

Instructions

Popular data sets include files put out by the Census Bureau, Department of Energy and many other governmental agencies that distribute data for public consumption. The Michigan Public Service Commission (MPSC) compiles data from monthly natural gas production reports submitted by oil and gas operators in the state of Michigan. We’ll be using some CSV files from their website as an example.

Step 1: Download the Tool

You can download the zip file containing the program files here:

After you unzip the file, you will see that the tool consists of two files: a “main.exe” file and a text file called “urls.txt.”

Files for tool for combining multiple CSV files

The executable “main.exe” is what you run to fetch your files.  The file “urls.txt” is the file you use to configure what files the program will fetch.

Step 2: Locate the desired CSV files

For this step we will use an example site to demonstrate how you can use the tool.  When you go to the Michigan Natural Gas Production Reports page of the MPSC’s website, you’ll see that the Commission has compiled natural gas production data from July 1990 through March 2016. There are over a million rows of data for this time period and the Commission has broken it down into five CSV files:

Screenshot: the MPSC website with the five CSV files

Step 3:  Inspect the CSV files to make sure they all have the same columns

When using this tool, you should keep in mind that the tool is designed to fetch and combine multiple CSV files that use the same schema. That means that all of your CSV files should have the same columns in the same order (when opened in Microsoft Excel). You can see that all five of the files we are combining from the MPSC site have the following columns:

CSV column headers
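For readers comfortable with a little Python, here’s a quick, hedged sketch that confirms a set of files share the same columns before you combine them. The URLs below are placeholders for the links you’ll copy in Step 4:

import csv
import io
import urllib.request

# Placeholder URLs; substitute the CSV links you copy in Step 4
urls = [
    "https://example.com/production_part1.csv",
    "https://example.com/production_part2.csv",
]

headers = []
for url in urls:
    # Download each file and read just its header row
    text = urllib.request.urlopen(url).read().decode("utf-8-sig")
    headers.append(next(csv.reader(io.StringIO(text))))

if all(header == headers[0] for header in headers):
    print("All files share the same columns:", headers[0])
else:
    print("Column mismatch - inspect the files before combining")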

Step 4: Copy your URL(s) from the web by right-clicking on the link and selecting “Copy link address”

'copy link as' screenshot

Step 5: Paste the URL for each file you want included in your output onto a new line in the urls.txt file and save the file

Open the “urls.txt” file in a standard text editor such as Notepad. After you’ve pasted the URL for each file onto a new line, save and close the file.

*PLEASE NOTE:  Paste first the URL of the file whose column labels you want to keep. The tool uses the column labels from the first URL; the label rows from all subsequent URLs are discarded.

Screenshot: urls.txt with one URL per line

Step 6: Run the “main.exe” file

Executing the “main.exe” file will open a command line dialogue. The dialogue may appear to be doing nothing at first, but be patient. When it finishes one file, it will show its progress and move on to the next. No user action is needed on this screen. When the operation is finished, the screen will close automatically.

Screenshot: the tool’s command line window while it runs

Step 7: Open “output.csv”

Once the operation is finished, a new file named “output.csv” will be in the folder. This file will have all of the data from the multiple CSVs.

Screenshot: the combined output.csv in the tool’s folder
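If you’d like to see the idea behind the tool, here’s a minimal Python sketch of the same fetch-and-combine logic. The urls.txt and output.csv names mirror the tool; the internals are our own approximation, not the tool’s actual code:

import csv
import io
import urllib.request

# Read one URL per line from urls.txt, skipping blank lines
with open("urls.txt") as f:
    urls = [line.strip() for line in f if line.strip()]

with open("output.csv", "w", newline="") as out:
    writer = csv.writer(out)
    for i, url in enumerate(urls):
        # Download the CSV and decode it to text
        text = urllib.request.urlopen(url).read().decode("utf-8-sig")
        rows = csv.reader(io.StringIO(text))
        header = next(rows)          # first row = column labels
        if i == 0:
            writer.writerow(header)  # keep the labels from the first URL only
        writer.writerows(rows)       # append the data rows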

Tip

If you find yourself routinely (weekly, monthly, etc.) needing to fetch and combine CSV files from the web because the provider is frequently posting updates, try using Windows Task Scheduler.

Windows Task Scheduler

You can set tasks and triggers with Task Scheduler that Windows will automatically run based on set time intervals or in response to other actions.

Additional Help

We hope this is a useful tool for you to fetch and combine CSV files from the web. If you have any issues, would like additional support, or would like us to help you develop a more powerful tool, contact us. We’d love to hear from you!

– Jess

PLEASE NOTE:  This application is provided free of charge.  It is still in beta mode and may contain bugs or issues.  Users should review the output.csv file to ensure that all data was fetched and combined correctly.

Resources for Water, Wastewater and Stormwater Utilities

We’re currently conducting research on data management for water utilities.  Since we have a number of readers who are interested in water management, we wanted to share some resources that could be useful for water, wastewater and stormwater utilities as they plan for the future. Short descriptions and links for you to explore more information are below:

MWEA’s Watershed & Stormwater Seminar: Dec 6

The Michigan Water Environment Association (MWEA) is hosting a Watershed & Stormwater Seminar in East Lansing on December 6, 2016. Topics will include:

  • Stormwater Utility Legislation, Bill Overview
  • Stormwater SAW Grants
  • Stormwater Asset Management Plans
  • Watershed Certificate Program
  • MDEQ Update
  • Groundwater Recharge Products for Stormwater Management
  • Adapting Stormwater Systems to Meet Changing Expectations
  • Green Infrastructure
  • Eastern Market (City of Detroit) Green Stormwater Management Strategy

Event Information | Agenda

CREAT 3.0: Planning for Extreme Weather

The Environmental Protection Agency’s Climate Resilience Evaluation and Awareness Tool (CREAT) helps water, wastewater, and stormwater utilities plan for future climate and weather events. As utilities upgrade aging infrastructure, CREAT 3.0 can help them identify potential extreme weather events and analyze how to protect critical assets. To learn more, check out the video below and visit the resources at: Build Climate Resilience at Your Utility.

Green Infrastructure Toolkit

Green infrastructure manages stormwater runoff by providing vegetated and permeable areas instead of expansive paved surfaces.  This allows water to be captured on site instead of flowing into stormwater systems.

The Georgetown Climate Center has put together a Green Infrastructure Toolkit for local governments to compare best practices and create policies for implementing green infrastructure in their areas. The toolkit includes:

  • Pilot Projects
  • Integrating Green Infrastructure into Existing Processes
  • Funding and Financing
  • Communication Strategies
  • Equity and Environmental Justice

We hope you find these resources useful!

–  Jess

Mapping Pollutants with Leaflet.js

There are plenty of ways to visualize data, but mapping it is one of the most powerful ways to make it immediately mean something to the viewer.  Creating maps that make data meaningful is satisfying for some of the same reasons I like doing the dishes. This might seem like a weird comparison, but bear with me, I’m a weird gal… what can I say?

Start and End

The thing I like most about doing the dishes is being able to SEE the progress and accomplishment. The starting point is a kitchen with dirty dishes, pots, pans, and food scattered everywhere.  The ending point is a clean-looking, well-organized kitchen.  Similarly, with visualizing and mapping data, you start with something that doesn’t have much meaning at first glance and work to bring order and clarity to it. There’s something satisfying about starting with this:

Data displayed in Microsoft Excel

and ending with this:

Data displayed in an interactive web map

Pretty big difference, right? You can tell a lot about the data from just a glance at the map, let alone exploring the interactive features that display additional information. With that in mind, I’ll go through a high-level, introductory overview of the four (more or less) pieces that got us from a spreadsheet to an interactive web map. I plan on diving into more detail in future blog posts, so leave a comment if there is something in particular you’d like to hear more about.

NOTE:  I put this map together quickly for the sake of demonstration. There is plenty of code clean-up and additional styling that could be done of the legend, pop-ups, etc. What do you like to see in an interactive web map? Let me know in the comments!

1. Data and GeoJSON

When mapping data, you have to start with the data (of course!). I started with data from the Canadian National Pollutant Release Inventory (NPRI) in a .csv (comma separated values) file. In the screenshot above, I had opened the .csv file in Microsoft Excel, but .csv files can also be opened in a text editor, which will display a series of values separated by commas in a series of rows:

Data in a .csv file, displayed in a text editor

.csv files are more versatile because they can be opened by a wide variety of applications, while .xls files can only be opened by certain applications such as Microsoft Excel.

Since I’ll be using JavaScript to create the map, I will convert the data into JSON (JavaScript Object Notation). And because my data was geospatial in nature (meaning it had latitude and longitude points), I used a special format of JSON called GeoJSON.  After using QGIS (an open source Geographic Information System) to convert the data, it comes out looking like this:

Data in GeoJSON format displayed in a text editor
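I did the conversion in QGIS, but if you prefer scripting, here’s a minimal Python sketch of the same CSV-to-GeoJSON step. The file name and the Latitude/Longitude column names are placeholders for whatever your data actually uses:

import csv
import json

features = []
with open("npri_data.csv", newline="") as f:
    for row in csv.DictReader(f):
        # Each CSV row becomes one GeoJSON point feature;
        # GeoJSON orders coordinates [longitude, latitude]
        features.append({
            "type": "Feature",
            "geometry": {
                "type": "Point",
                "coordinates": [float(row["Longitude"]), float(row["Latitude"])],
            },
            "properties": row,  # keep every CSV column as a property
        })

with open("npri_data.geojson", "w") as f:
    json.dump({"type": "FeatureCollection", "features": features}, f)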

2. HTML and CSS

There are three main components of many web pages that you see today: HTML, CSS, and JavaScript.

  • HTML (Hypertext Markup Language) provides the content of the page.
  • CSS (Cascading Style Sheets) controls what that content looks like including layout, colors, backgrounds, font styles, etc.
  • JavaScript is used to control the behavior of the page. When the user interacts with the page by clicking on something or hovering over something, the page responds using JavaScript.

So of course, my interactive web map uses HTML (on the left) and CSS (some of which is displayed on the right). There isn’t very much in the HTML because the only content on the webpage is the map (seen below as <div id="mapid"></div>), which is created by JavaScript and styled with CSS.

Screenshots: the page’s HTML (left) and CSS (right)

3. JavaScript and jQuery

The largest piece of this web map is the JavaScript. You probably guessed this, since JavaScript controls the user’s interactions with the webpage and we are building an INTERACTIVE web map. JavaScript is so ubiquitous on the web these days that programmers have created numerous JavaScript libraries. Using these libraries allows for faster and easier development of websites and web applications. jQuery is the most popular JavaScript library. You can see in the HTML file above that I’ve downloaded the file of pre-written JavaScript and included it: <script src="js/jquery.js"></script>.

Below is the beginning of my JavaScript file. Wherever you see a $ symbol, my JavaScript is calling parts of the jQuery file.

Mapping data with JavaScript

4. Leaflet.js

Leaflet.js is a lightweight JavaScript library for creating interactive maps. You can see that I included it in my HTML file just below the jQuery file. Leaflet.js allows me to pull my GeoJSON data into the map (with L.geoJSON, which you can see in the file below) and use Mapbox for my base map. It also helps me create the different colors and sizes of the circles (with L.circleMarker) to represent the different industries and the amounts of pollutants released for each incident. It helps me create pop-ups with more information, a legend, and the layers control that lets the user turn different data on and off on the map.

Mapping data with Leaflet

Mapping Data: the Results

By pulling all of these different pieces together, you can present your data cleanly, clearly and concisely to a customer, investor or stakeholder.  At the end of the day, you wouldn’t invite a client or your boss over to see your dirty kitchen, so why would you show them your messy and unorganized data?

Make sure to explore the possibilities on the interactive map here and let me know if you have any fun data to map!

– Jess