
A script to batch merge multiple shapefiles into one using ogr2ogr / GDAL on Mac

Original wiki page (you should use the version there). I copied it here with a more descriptive title in the hope that it becomes easier to find.

#Make a new directory called "tmp" and a sub directory called "merged"
mkdir tmp
mkdir tmp/merged

#copy all zipped files to the "tmp" directory and then "cd" into it
cp *.zip tmp
cd tmp

#unzip all the .zip archives
find . -name "*.zip" -exec unzip '{}' \;

#delete all .zip archives
rm *.zip

# move a single shapefile (and the corresponding .shx, .dbf, etc. files) to the
# "merged" directory (exchange 'myshape*' for the name of one of your shapefiles,
# keeping the '*' at the end of the name)
find . -name 'myshape*' -exec mv '{}' merged \;

#Batch merge all the remaining shapefiles from the tmp dir into the moved
# file in the merged dir (exchange 'myshape' for the name of the moved shapefile)
# The glob and the quotes around "$i" keep file names with spaces intact.
for i in *.shp; do
  ogr2ogr -f 'ESRI Shapefile' -update -append merged "$i" -nln myshape
done
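
If you'd rather drive the same loop from Python, here is a minimal sketch. It assumes ogr2ogr is on your PATH and that you run it from inside the "tmp" directory; the function names build_merge_command and merge_all are my own, not part of GDAL.

```python
import subprocess
from pathlib import Path


def build_merge_command(shapefile, merged_dir="merged", layer="myshape"):
    """Return the ogr2ogr arguments that append one shapefile to the merged layer."""
    return [
        "ogr2ogr", "-f", "ESRI Shapefile",
        "-update", "-append",
        str(merged_dir), str(shapefile),
        "-nln", layer,
    ]


def merge_all(src_dir=".", merged_dir="merged", layer="myshape"):
    """Append every .shp file found directly in src_dir to the merged layer."""
    for shapefile in sorted(Path(src_dir).glob("*.shp")):
        # check=True stops the loop as soon as one ogr2ogr call fails
        subprocess.run(build_merge_command(shapefile, merged_dir, layer), check=True)
```

Calling merge_all() with no arguments mirrors the shell loop above. Passing the arguments as a list means file names with spaces need no extra quoting.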

badgers


Using SELECT … WHERE IN in Hive

You can’t do it in older versions of Hive (IN subqueries in the WHERE clause only arrived in Hive 0.13), but you can use LEFT SEMI JOINs.

From the examples:

  SELECT a.key, a.value
  FROM a
  WHERE a.key IN
   (SELECT b.key
    FROM b);

can be rewritten to:

  SELECT a.key, a.value
  FROM a LEFT SEMI JOIN b ON (a.key = b.key)
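
To remind myself what a semi join actually returns, here is a small Python sketch of the semantics. This is not Hive code: the function left_semi_join and the sample rows are made up for illustration.

```python
def left_semi_join(a_rows, b_rows):
    """Return the (key, value) rows of a whose key appears in b, each at most once."""
    b_keys = {key for key, _ in b_rows}
    return [(key, value) for key, value in a_rows if key in b_keys]


a = [(1, "x"), (2, "y"), (3, "z")]
b = [(1, "p"), (1, "q"), (3, "r")]  # key 1 appears twice in b

print(left_semi_join(a, b))  # [(1, 'x'), (3, 'z')]
```

An ordinary inner join would return the row for key 1 twice because b has two matches. The semi join only asks whether a match exists, which is what makes it equivalent to WHERE ... IN.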

(Mostly a note-to-self)


Notes to self on map design

At far zoom levels:

Use neighborhood names instead of major street names

Just outline highways

Use thin, light lines for minor streets

As you get closer, use the same color for minor streets, but now outline them (white/lighter center)

Put highways below other road layers

This round of notes comes from looking at the devseed Baltimore local maps.


Export an Excel file with several sheets as multiple csv files using Open Office

Datasets often come as Excel files with multiple sheets (for example, one per year). If you don’t have Excel, you can use OpenOffice and this macro to export all the sheets as separate CSV files. It takes a directory path, searches for all Excel files in that directory, and then saves each sheet in each file as CSV with the name (file name)_(sheet name).csv

To run it, you need to:

  1. Open OpenOffice Calc and click Tools -> Macros -> Organize Macros -> OpenOffice.org Basic.
  2. Click the "New" button.
  3. Paste in the macro code.
  4. Change the file directory (the cFolder variable, around line 29) to the folder you want. This is where OpenOffice will search for Excel files and save the results.
  5. Run the macro (you might have to go through Tools -> Macros -> Run Macro)

There probably is a way to do this directly from the command line.
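
As a rough alternative to the macro, here is a minimal Python sketch that follows the same (file name)_(sheet name).csv naming. It assumes the third-party openpyxl library is installed, which only reads .xlsx files (not the older .xls format). The function names csv_name, write_sheet and export_workbook are my own.

```python
import csv
from pathlib import Path


def csv_name(excel_path, sheet_name):
    """Build the output name: (file name)_(sheet name).csv."""
    return f"{Path(excel_path).stem}_{sheet_name}.csv"


def write_sheet(rows, out_path):
    """Write an iterable of rows (tuples or lists of cell values) to a CSV file."""
    with open(out_path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(rows)


def export_workbook(excel_path, out_dir):
    """Save every sheet of one .xlsx workbook as its own CSV file in out_dir."""
    from openpyxl import load_workbook  # assumption: openpyxl is installed

    workbook = load_workbook(excel_path, read_only=True)
    written = []
    for sheet in workbook.worksheets:
        out_path = Path(out_dir) / csv_name(excel_path, sheet.title)
        # values_only=True yields plain cell values instead of cell objects
        write_sheet(sheet.iter_rows(values_only=True), out_path)
        written.append(out_path)
    return written
```

To cover a whole folder like the macro does, loop over Path(folder).glob("*.xlsx") and call export_workbook on each file.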

 


Fixing DocSplit / GraphicsMagick Postscript delegate failed

After installing DocSplit, I got this error when attempting to extract text from a PDF via OCR:

$>docsplit text ./BSED.pdf --pages all
execvp failed, errno = 2 (No such file or directory)
gm convert: "gs" "-q" "-dBATCH" "-dMaxBitmap=50000000" "-dNOPAUSE" "-sDEVICE=ppmraw" "-dTextAlphaBits=4" "-dGraphicsAlphaBits=4" "-r200x200" "-dFirstPage=1" "-dLastPage=1" "-sOutputFile=/var/folders/D5/D5Cief2MHW8vTHOI8nkyPU+++TM/-Tmp-/d20101117-83113-1f9etpp/gmUoD1ux" "--" "/var/folders/D5/D5Cief2MHW8vTHOI8nkyPU+++TM/-Tmp-/d20101117-83113-1f9etpp/gmXsXajF" "-c" "quit".
gm convert: Postscript delegate failed (./BSED.pdf).

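I am on Mac OS X 10.6. The "errno = 2" line means GraphicsMagick could not find the gs (Ghostscript) executable it delegates PDF rendering to. The solution was to install (or upgrade to) the latest version of Ghostscript with Homebrew:

$>brew install ghostscript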
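
A quick way to check whether the delegate is visible before running DocSplit again is sketched below in Python. The helper name find_delegate is my own; it only looks the program up on PATH and does not test whether the Ghostscript version is recent enough.

```python
import shutil


def find_delegate(program="gs"):
    """Return the full path of the program if it is on PATH, otherwise None."""
    return shutil.which(program)


path = find_delegate()
if path is None:
    print("gs not found on PATH, so GraphicsMagick cannot render PDFs")
else:
    print("gs found at", path)
```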


Open Government and data in Detroit

If you are interested in either, both, or related subjects, please do get in touch: matthew.hampel@gmail.com

This post is way too long in coming, especially because I was disappointed to find no other like it when I googled those keywords earlier this year.

Some starting points: (hopefully I’ll be able to add more)

http://detroitwiki.org/Data_about_Detroit

http://datadrivendetroit.org/ (501c3, I’m interning there this semester)


Lost Landscapes of Detroit

Last week, Rick Prelinger of the Prelinger Archives showed a collection of home movies, newsreels, and other film clips of Detroit he has collected and digitized. The audience at MOCAD had a lot of fun shouting out the names of places and people as they went by.

He handed out copies of the DVD and the work is under a Creative Commons license, so I’ve put up a copy online as a torrent.

You’ll need a torrent client to download it; I’d recommend Transmission for Mac or uTorrent for Windows. I’m looking for a place to upload it for streaming online.


Class notes from today

Wayne County’s version of anti-Kelo (actually pre-Kelo)

Dumbbell tenements from a great Columbia University interactive page on apartment houses

The Detroit Historic District Commission has a page about each historic building, including the date its significance was recognized. (Can I get these in a shapefile?)

Readings from The Power of Place.

You can transfer air rights.



Looking for an online application management system.

I’m looking for a tool that will let the Semester in Detroit program easily accept and process applications from students and community partners online.

Here are the basic features I’m looking for:

  • We can easily create an application with custom fields (Preferably with chunking. For example: personal information on one page, personal statement on the next, etc.)
  • Applicants create an account and fill out the fields online
  • Applicants can stop halfway through and finish the application later
  • Applicants push a button to submit their application
  • File uploads allowed
  • We can get the data out
  • Nice but not necessary: Some hidden fields for processing (like accepted/rejected/pending)


Download every PDF linked from a page using Python.

I wanted to download every agenda posted on the Detroit City Council website, but they were in different folders.

Happily, there’s one page that lists all of them, so I wrote this short script:

# Python 2 script (uses urllib2 and BeautifulSoup 3)
import urllib2
import urlparse
import re
import subprocess
import time
from BeautifulSoup import BeautifulSoup, SoupStrainer

# define the URL where all the links are:
url = "http://www.detroitmi.gov/legislative/CityClerk/2009add_cal.htm"
base_url = "http://www.detroitmi.gov/legislative/CityClerk/"
html = urllib2.urlopen(url).read()

# only select links with 'pdf' in the href
pdf_links = SoupStrainer('a', href=re.compile('pdf'))
soup = BeautifulSoup(html, parseOnlyThese=pdf_links)

for a_tag in soup:
    pdf_url = urlparse.urljoin(base_url, a_tag['href']) # build the full URL to the PDF
    # pass the URL as its own argument so the shell never interprets it
    subprocess.call(["wget", pdf_url])
    time.sleep(10) # wait a little while to be courteous
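
The script above relies on Python 2 modules (urllib2 and BeautifulSoup 3). For anyone on Python 3, here is a minimal sketch of the same idea using only the standard library. The names PdfLinkParser, extract_pdf_urls and download_all are my own, and the example.com page is a made-up sample, not the real council site. One small difference: this version matches "pdf" case-insensitively.

```python
import time
import urllib.request
from html.parser import HTMLParser
from urllib.parse import urljoin


class PdfLinkParser(HTMLParser):
    """Collect absolute URLs of <a> tags whose href contains 'pdf'."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.pdf_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and "pdf" in href.lower():
            self.pdf_urls.append(urljoin(self.base_url, href))


def extract_pdf_urls(html, base_url):
    """Return the full URL of every PDF link found in the HTML text."""
    parser = PdfLinkParser(base_url)
    parser.feed(html)
    return parser.pdf_urls


def download_all(pdf_urls, delay_seconds=10):
    """Save each PDF into the current directory, pausing between requests."""
    for pdf_url in pdf_urls:
        filename = pdf_url.rsplit("/", 1)[-1]
        urllib.request.urlretrieve(pdf_url, filename)
        time.sleep(delay_seconds)  # wait a little while to be courteous


# Hypothetical sample page, not the real council site
sample = '<a href="agendas/jan.pdf">Jan</a> <a href="index.htm">Home</a>'
print(extract_pdf_urls(sample, "http://example.com/docs/"))
# ['http://example.com/docs/agendas/jan.pdf']
```

For a real page, read the HTML with urllib.request.urlopen(url).read().decode() and pass the result to extract_pdf_urls, then hand the list to download_all.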
