Date Sun 28 March 2010

Batch importing data into PostGIS is a topic I have covered previously in my article on importing CDSM (Chief Directorate: Surveys and Mapping) data into postgis. However its a popular question I get asked so today I am going to take another look at the problem. In particular the client has a directory heirachy of zipped shapefiles divided by regions in Africa. The shapefiles use a common naming convention which makes things easier. The convention is basically something like:

<number><region_name>/<region_abbreviation>_master/<region_abbreviation>_<feature_type>_<feature name>.zip

For example:

02 Mozambique Malawi/moz_master/moz_a_parks.zip

When unzipped the shapefiles are consistently named across regions e.g.:

moz_a_parks.shp
moz_a_parks.shx
moz_a_parks.dbf
moz_a_parks.sbn
moz_a_parks.sbx

From that we can define a few rules:

  • Recurse the top level directory looking for all zip files
  • Get the base name of the zip file without its extension (e.g. moz_a_parks)
  • Strip the region off the base name - this will give us the standard feature class name / table name used across all regions for that feature type.
  • Extract the zip file to /tmp
  • If the feature class / table does not exist in PostGIS, import it with the -c (create) option
  • Otherwise import it with the -a (append) option

So here is the little script I wrote that achieves this.

#!/bin/bash
DATABASE=foo
IFS="$(echo -e '\n\r')";
FILES=$(find -name "*.zip");
for FILE in $FILES
do
  BASE=$(basename $FILE .zip)
  TABLE=$( echo $BASE | sed 's/^[a-z]*_//g')
  # double check the extracted files have been cleaned up or zip will moan
  rm /tmp/${BASE}.*
  echo "$BASE to $TABLE"
  unzip -d /tmp $FILE
  MATCH=$(echo "\d" | psql ${DATABASE} | grep -o "${TABLE}")
  if [ "$MATCH" ]
  then
    # Append to the exisitng table
    echo "Appending to ${TABLE}"
    shp2pgsql -s 4326 -I -S -a -W UTF-8 "/tmp/${BASE}.shp" $TABLE | psql $DATABASE
  else
    # Create the table
    echo "Creating $TABLE"
    shp2pgsql -s 4326 -I -S -c -W UTF-8 "/tmp/${BASE}.shp" $TABLE | psql $DATABASE
  fi
  # clean up
  rm /tmp/${BASE}.*
done

If your shapefiles are already unzipped into a directory structure, you could easily modify the script above to find '.shp' files rather than zip files and leave out hte unzipping , cleaning up steps.

Once you have run the scripts, open your favourite PostGIS client (I use psql) and take a look at the tables created there. If all looks good you should be able to start browsing the datasets with QGIS or publishing them with Mapserver!


Comments

comments powered by Disqus