Posted by & filed under General FOSSGIS.

Intro

A client recently asked me to prepare some aeronautical data for use with QGIS. The data was supplied in about twenty 8 bit paletted GeoTiff images. The client wanted to produce a seamless mosaic so that operators only needed to open a single file to view the rasters. Also he wanted good read access times for the data (hey who doesn’t :-) ).

Trying things out…

Compressed Tiffs

The first thing I tried was simply resaving the imagery as compressed tiffs, and then building a virtual raster dataset using gdalbuildvrt. Compressing the images was easy with a few lines of bash:

#!/bin/bash

for FILE in *.tif
do
  BASE=`basename $FILE .tif`
  NEWFILE=${BASE}_c.tif
  gdal_translate -of GTiff -co COMPRESS=DEFLATE -co TILED=YES $FILE $NEWFILE
done

The script creates a new version of each tif file with a _c.tif suffix. One caveat is that last time I checked, ESRI GIS products didn’t support compressed tiff images (though it was a while ago that I last checked). The compressed tiff format works especially well on imagery with large areas of contiguous colour.

Creating a mosaic

More recent versions of gdal include the excellent gdalbuildvrt application that will build a seamless mosaic from a collection of images. The output is a ‘virtual raster’ – the original files are left untouched and the virtual raster contains a catalogue of the original files. This .vrt file can then be treated as a single file from within your GIS application. Here is an example:

gdalbuildvrt mosaic.vrt *_c.tif

Pop around to QGIS and use the add raster function to add mosaic.vrt and you should see a nice looking mosaic. However, there was one big problem with this approach – using gdalbuildvrt on paletted imagery will result in one master palette of 256 colours being assigned to all images. This results in the images having a speckled, unusable appearance.

So the next thing I did was to use gdal to expand the imagery out to 24bit rgba imagery. Doing this loses most of the gains we made by compressing the tifs, but resolves the palette incompatibility issue.

First I removed all the _c.tif files:

rm *_c.tif

Next I adapted my script to look like this:


#!/bin/bash

for FILE in *.tif
do
  BASE=`basename $FILE .tif`
  NEWFILE=${BASE}_c.tif
  gdal_translate -expand rgb -of GTiff -co COMPRESS=DEFLATE -co TILED=YES $FILE $NEWFILE
done

The noteworthy bit here is the -expand rgb option I have added which will convert the image from a paletted image to a true colour 24 bit image.

Now we can rerun our mosaic build to create a true-colour virtual raster:

gdalbuildvrt mosaic.vrt *_c.tif

Once again you can open the dataset in QGIS to test it. There are some things to bear in mind when building image mosaics. Firstly the imagery should all be in the same Coordinate Reference System. Secondly, they should all have the same spatial resolution. Lastly the band configuration should be the same for all images.

Building pyramids

If you have a particularly large collection of images, you may still find access times a little slow. You can speed things up (at the expense of some disk space) by building pyramids (also know as overviews). Pyramids presample the imagery at various scales so that it does not to resample as much data as you zoom and out of the image. Building pyramids is easy using the gdaladdo command (there is also an option within the QGIS raster properties dialog to do this).

gdaladdo -ro --config INTERLEAVE_OVERVIEW PIXEL --config \
           COMPRESS_OVERVIEW JPEG mosaic.vrt 2 4 8 16

Performance optimisation – Take 1

Because of the nature of my dataset, even with the above tweaking, data access was still too slow for me to be comfortable with passing it along to my client as a solution. To for my next party trick, I tried chopping up the mosaic into little 1000×1000 pixel tiles. I wrote a simple script to do it:

#!/bin/bash
mkdir tiles
XDIM=79011
YDIM=88977
BLOCKSIZE=1000
XPOS=0
YPOS=0
BLOCKNO=0
while [ $YPOS -le $YDIM ]
do
  while [ $XPOS -le $XDIM ]
  do
    echo "$XPOS $YPOS : ${BLOCKNO}.tif"
    gdal_translate -of GTiff -srcwin $XPOS $YPOS $BLOCKSIZE $BLOCKSIZE mosaic.vrt \
       tiles/${BLOCKNO}.tif
    BLOCKNO=`echo "$BLOCKNO + 1" | bc`
    XPOS=`echo "$XPOS + $BLOCKSIZE" | bc`
  done
  YPOS=`echo "$YPOS + $BLOCKSIZE" | bc`
  XPOS=0
done

This is just a quick and dirty approach to creating tiles and I didn’t spend more than about ten minutes writing the script, so it no doubt can be improved in all sorts of ways. One thing you should note is that I hard-coded the image dimensions.

Next I opened the newly created tiles directory and manually removed all unneeded tiles (e.g. all black ones on the periphery of the map.

Once again I built a virtual raster from this dataset:

cd tiles
gdalbuildvrt mosaic_for_tiles.vrt *_c.tif

I also made a pyramid. However performance was still too slow for my liking so I thought I would try the same approach with bigger tiles and compressing the tiles as ECW.

Performance optimisation – Take 2 (ecw)

GDAL includes support for the ECW wavelet compression format. There is a limitation in that if you don’t possess a commercial license, you may not compress a dataset larger than 500mb. Ubuntu does not ship with ecw support so you need to manually build gdal with it enabled. Here is how I built gdal from the source trunk. First I installed all of the build dependencies with some extra packages so that I could get hdf support:

sudo apt-get install build-essential libjpeg62-dev libtiff4-dev libhdf5-serial-dev \
libhdf5-serial-1.6.6-0 libhdf4g-dev libhdf4g libhdf4g-doc

Next I go hold of a copy of the ecw developer kit. Go to the erdas ecw sdk web page (link below) or just search their site for ‘ecw sdk’.

http://www.erdas.com/Products/ERDASProductInformation/tabid/84/CurrentID/1142/Default.aspx

Get the ECW JPEG2000 Codec SDK Source Code (second item listed – first if win version).

After the download you should have:

ecw_jpeg_2000_sdk_3_3_source.zip

Now unzip it:

cd /tmp
unzip ecw_jpeg_2000_sdk_3_3_source.zip
unzip ImageCompressionSDKSourceCode3.3Setup_20070509.zip
sudo mv libecwj2-3.3 /usr/local/src/
cd /usr/local/src/
sudo chown -R timlinux libecwj2-3.3/
cd libecwj2-3.3

Now build the ecw lib:

./configure
make
sudo make install

Next up we will build gdal. First install the subversion client if you don’t already have it:

sudo apt-get install subversion

Nowcheck it out gdal from svn (perhaps you prefer to use one of the stable release versions – I like to live on the edge…).

cd ~
mkdir -p dev/cpp
cd dev/cpp
svn co https://svn.osgeo.org/gdal/trunk/gdal gdal_svn
cd gdal_svn

Now configure and make gdal. Our build will go into /usr/local by default so it can exist side by side with any package installed version you may have.

make clean
./configure --with-ecw=/usr/local
make -j8
sudo make install
sudo echo "/usr/local/lib" | /etc/ld.so.conf
sudo ldconfig

It might take a little while to build but when it is done you can verify that you now have gdal built with ecw support like this:

/usr/local/bin/gdalinfo --formats | grep ECW

The response should look something like this:

  ECW (rw): ERMapper Compressed Wavelets
  JP2ECW (rw+): ERMapper JPEG2000

Ok now we can make ecw compressed images! Lets head back to our data directory and rechop up our mosaic into larger, ecw compressed pieces. My client has an ermapper license so I believe it would be ok to create a single large image for them using gdal, but for our purposes its better to stick under the 500mb limit. Here is a revision of my bash script from earlier:

#!/bin/bash
mkdir ecwtiles
XDIM=79011
YDIM=88977
#Notice I used a bigger block size now
BLOCKSIZE=10000
XPOS=0
YPOS=0
BLOCKNO=0
while [ $YPOS -le $YDIM ]
do
  while [ $XPOS -le $XDIM ]
  do
    echo "$XPOS $YPOS : ${BLOCKNO}.tif"
   # Notice I am using my hand built gdal now!
    /usr/local/bin/gdal_translate -of ECW -co LARGE_OK=NO -srcwin $XPOS $YPOS \
      $BLOCKSIZE $BLOCKSIZE mosaic.vrt ecwtiles/${BLOCKNO}.ecw
    BLOCKNO=`echo "$BLOCKNO + 1" | bc`
    XPOS=`echo "$XPOS + $BLOCKSIZE" | bc`
  done
  YPOS=`echo "$YPOS + $BLOCKSIZE" | bc`
  XPOS=0
done

When I ran that script, it created a bunch of image tiles for me in ecwtiles/ directory (it took a while!). Adding individual tiles to a QGIS project I found they loaded very quickly and pan and zoom operations were satisfyingly responsive. So how would this work out for a mosaic comprised of these tiles? Its easy to find out:

cd ecwtiles
gdalbuildvrt mosaic_for_tiles.vrt *.ecw

Afterwards I was able to open my mosaic in QGIS. It loaded in a few seconds and pan and soom operations take a second or two each. Quite a good result considering the dataset is quite large (~4GB).

Some reflections

GDAL and Linux provide an incredible toolbox to do geospatial dataprocessing. Its easy to experiment with the aid of writing bash scripts. Many people are command line averse, but I love the repeatability that using the command line gives you. Once I work out a procedure, it is easy to document and / or rerun on a different dataset. If I need to tweak it, I just create a copy of the script containing my tweaks. Compare this to a gui environment for dataprocessing where you typically have to click through sequences of dialogs in a very un-repeatable manner.

Sometimes you don’t know what will work best when you start out processing some data, but having a plethora of free tools to use makes it simple to mix and match, chop and change, until you find just the right combination that works for you.

An aside: dealing with libtiff incompatibilities

I did encounter one issue with building pyramids which (along with its solution) is described here:

http://trac.osgeo.org/gdal/ticket/3139

Here is the symptom:

gdaladdo --config COMPRESS_OVERVIEW JPEG --config INTERLEAVE_OVERVIEW PIXEL mosaic.vrt 2 4 8 16
0gdaladdo: tif_jpeg.c:1299: JPEGPreEncode: Assertion `!sp->cinfo.comm.is_decompressor' failed.
Aborted (core dumped)
  • Pingback: Linfiniti Geo Blog » Blog Archive » Clipping rasters with GDAL using polygons

  • ttamba

    Hi Tim,

    I’m getting problem with building dgal from trunk to enable ecw support following your post. I’m wondering what is going on… because when building from sources (using subversion) and I noticed that the process of importing files from trunk stop after 2 minutes and nothing happens… Could not follow up and could not figure out what i’m doing wrong. Here is a piece of shell commande to illustrate my issue.

    *************************************************************************8
    A gdal/frmts/wms/frmt_wms_tileservice_bmng.xml
    A gdal/frmts/wms/frmt_wms_metacarta_tms.xml
    A gdal/frmts/wms/frmt_wms.html
    A gdal/frmts/wms/wmsdriver.cpp
    A gdal/frmts/wms/md5.cpp
    A gdal/frmts/wms/dataset.cpp
    A gdal/frmts/wms/rasterband.cpp
    A gdal/frmts/wms/frmt_wms_metacarta_wmsc.xml
    A gdal/frmts/wms/stuff.cpp
    A gdal/frmts/wms/wmsdriver.h
    A gdal/frmts/wms/stdinc.h
    A gdal/frmts/wms/md5.h
    A gdal/frmts/wms/minidriver_tms.cpp
    A gdal/frmts/envisat
    A gdal/frmts/envisat/EnvisatFile.h
    A gdal/frmts/envisat/envisat_dump.c
    A gdal/frmts/envisat/envisatdataset.cpp
    A gdal/frmts/envisat/dumpgeo.c
    A gdal/frmts/envisat/makefile.vc
    A gdal/frmts/envisat/GNUmakefile
    A gdal/frmts/envisat/EnvisatFile.c
    A gdal/frmts/pgchip
    A gdal/frmts/pgchip/pgchip.h
    A gdal/frmts/pgchip/pgchipdataset.cpp

    ***************************************************************************

    Any idea of what is wrong ???

    Regards

  • http://linfiniti.com Tim Sutton

    It was probably just the svn server being a bit overloaded or an issue with your connection. Try to wait a few hours and then update again. Otherwise take it up with the gdal team!

    Regards

    Tim

  • ttamba

    Hi Tim,
    Thank you for your answer. You’re right I think when I tried to build the source fron svn the server was overloaded. Even I re-tried and the first time the archives have been locked after unzipping the abunch of sources and the second time was the good one.

    Finally, I successfully build gdal with the support of ECW format. Another question, when building ecw support format with gdal, could I add ECW and jpeg2000 raster files into QGIS. Looks like the ER-Mapper ecw and jpeg2000 are not available into qgis raster type ???

    Thank again Tim

    Regards

  • http://linfiniti.com Tim Sutton

    Hi

    Yes you should be able to open ecw files in QGIS. But you need to make sure that your QGIS build is linking to your hand built gdal and not to any other gdal lying around on your system otherwise it isnt going to work!

    Regards

    Tim

  • ttamba

    Hi Tim,

    Thank again, I think my qgis build is not linked to my gdal built. So the problem I don’t know how to walk through this??? I guess I have to recompile qgis from sources ???

    Regards

  • MD

    Hi!
    I got the same problem like ttamba. It seems that my QGIS does not see propper GDAL?
    How to solve that problem???

    Ubuntu 10.4
    QGIS 1.4.0 (from repository)

  • Pingback: GDAL – poradnik | Geostrona

  • http://www.biberhapial.net cexk

    Hi!
    I got the same problem like ttamba. It seems that my QGIS does not see propper GDAL?
    How to solve that problem??? http://www.bbierhapial.net

  • Pingback: GIS SUPPORT

  • Clara

    Hello! I have 18 Landsat scenes, composite images, which I need to “glue” to form a big map, for further analysis. Searching how to do this, I ended up here. But I don’t think this is the solution for me. Do you know if and how I can do this in QGIS? Would really appreciate any hints. Thanks, Clara

    • http://kartoza.com/ Tim Sutton

      Hi Clara. What kind of analysis do you want to do on the merged imagery? There are other techniques for creating mosiacs which include global colour balancing (so all scenes appear to be in the same colour range), feathering, cutlines along linear features etc – all designed to create the illusion that you have one big seamless image. However if you do this the result would normally be treated as a visual product and probably not suitable for analysis work such as image classification etc. You might want to delve into GRASS and OTB if you are planning to do more serious remote sensing work.

      Regards

      Tim

      • Clara

        Dear Tim, thank you for replying!
        I need to do a land cover classification for Romania, and my line of thought was to create a big mosaic on which then to perform the classification. I worked so far in Grass, correcting the data and creating the composites, but I am stuck at this point, as I cannot merge them together. I thought of performing the analysis separately for each image, and I know I can use Grass and OTB for further analysis, but this particular step, of merging them together (which I still will have to do in the end) remains unsolved.. any thoughts? :) Best, Clara

        • http://kartoza.com/ Tim Sutton

          Hi Clara

          Do you know about the Semi-Automatic Classification Plugin (http://fromgistors.blogspot.com/2014/06/land-cover-classification-using-SCP-3.html) For QGIS? As far as I know the ‘normal’ way would be to do your classification on a ‘per scene’ basis and then merge the result into a single image.

          Note that I am not a Remote Sensing expert so above is my intuitive response.

          Regards

          Tim

          • Clara

            Dear Tim, thanks for the link! Looks very helpful! I guess it makes sense to do it this way, still, at the end, the results need to be merged. hopefully, will find a solution! All the best, Clara