When Storing Your Spatial Data, Which Approach Is Better, Shapefiles Or Geodatabases? There’s Only One Way To Find Out….FIGHT!!

28 Oct

Apologies to Harry Hill for plagiarising one of his catchphrases, but I think this is an interesting question and one that is worthy of discussion. From ArcGIS 9 onwards, there was a shift away from storing spatial data using individual shapefiles and raster grids (what I’ll call the ‘shapefile approach’) and towards storing all spatial data for a GIS project in a single Geodatabase (the ‘geodatabase approach’). Most GIS textbooks, instructional information and, indeed, training courses for ArcGIS now seem to recommend the use of Geodatabases, but these generally seem to be aimed at large organisations with very big and complex GISs that are accessed by many different people for many different purposes. The situation in ecological research is often very different, and it is much more likely to be one person and a laptop with a relatively small and simple GIS. So when using GIS for ecological research, is there actually any advantage to using one approach over the other?

Here are my thoughts:

1. Geodatabases contain all the information in a single file, while using shapefiles and rasters requires that all the data layers are stored as separate files: This can be seen either as a benefit of using geodatabases or as a disadvantage. With a single file, it’s much easier to keep track of all the data layers and to back up or transfer your data between computers. It also means that it’s easier to ensure that you are always working on the same files (this can be a problem with shapefiles, which have a nasty habit of multiplying!). However, it also means that if something goes wrong with that single file, you are completely screwed. At least if you’re using shapefiles and your project crashes, you can easily re-build it from the individual shapefiles themselves, as they are stored separately from the project file.

2. Geodatabases are specifically designed to work with ArcGIS – Part one: This is not a problem as long as you continue to have access to ArcGIS. However, what happens when the ArcGIS licence for your project runs out, or you have to move institutions and no longer have a licence? If you’ve used the geodatabase approach, you may find that you cannot access all your GIS data any more. If you use the shapefile approach, there are many alternative, and often free, GIS software packages out there that you can use to access, explore, plot and manipulate your data layers. Therefore, if you are unsure where your next ArcGIS licence might come from (and I am sure this is true for many ecologists), using the shapefile approach means that you will always be able to access your data no matter what. This may not always be the case if you use the geodatabase approach.

3. Geodatabases are specifically designed to work with ArcGIS – Part two: Because geodatabases are specifically designed to work with ArcGIS, you can take full advantage of all the bells and whistles of the ArcGIS software. However, it also means that you cannot easily access your data layers using different GIS software packages. While this might not always be an issue, there are often instances in ecological research where ArcGIS just can’t do what you want it to and you find that you wish to use a different software package (e.g. doing a viewshed analysis in GRASS so that you don’t have to pay for the expensive Spatial Analyst extension to the basic ArcGIS software package just to do one thing). If you use the geodatabase approach you may find that you can’t easily do this, whereas it’s much easier to move seamlessly between different software packages if you use the shapefile approach.

4. Geodatabases are specifically designed to work with ArcGIS – Part three:  If you are working with people from different organisations/research groups and not everyone has an ArcGIS licence, you may find sharing your data difficult if you use the geodatabase approach.  However, with the shapefile approach sharing your data layers with people using other GIS software packages is much easier.

5. Geodatabases are specifically designed to work with ArcGIS – Part four: If you learn all your GIS using ArcGIS and geodatabases, you may find that you cannot as easily transfer this knowledge to other GIS software, and especially to free, open source GIS software. This is not a problem if you can guarantee that you will always have access to ArcGIS for the rest of your research career, but if you think you might one day have to rely on using different GIS software, you may find it much easier to transfer your skills if you are at least familiar with the shapefile approach. This also means that if you start your GIS career using the shapefile approach, even if you’re doing it with ArcGIS, you can then choose whether to specialise in this software and move onto geodatabases, or whether to move on to other GIS software.

6. Geodatabases are more difficult to learn to use for complete beginners: One of the main obstacles to encouraging ecologists to use GIS in their research is that they get put off by over-complicated explanations of what GIS is and how it can be used. I’ve found that if I can get people playing around with real data layers as soon as possible, they see how useful a tool GIS can be for their research, and they will persist with it. If they don’t see this within the first few hours of using GIS, they will often abandon it; after all, there are a lot of other key research skills out there that they can spend their time learning that will also benefit their research. One of the main reasons that I tend to teach people GIS using the shapefile approach is that it gets them up and running, working with real data, as quickly as possible (often within minutes if I’m sitting with them and using their own data in a one-on-one session). If they become sufficiently interested, they can explore whether they would prefer using the geodatabase approach for their own work. If I start by teaching the geodatabase approach, I have to spend those first precious minutes explaining how geodatabases are structured, how they work, the terminology, and so on. I will then quickly see their eyes glaze over and know that they’ll be lost from GIS forever.
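As a practical footnote to point 1: a ‘shapefile’ is really a small set of files that share a name (at minimum the .shp, .shx and .dbf parts, usually with a .prj file holding the coordinate system), so it is easy to lose one when copying or backing up. The following is a minimal Python sketch, not part of any GIS package, that lists the layers in a folder with a required part missing. The function names are made up for this illustration, and it assumes all your shapefiles sit directly in one folder.

```python
from collections import defaultdict
from pathlib import Path

# Parts that every shapefile needs: geometry, index and attribute table.
REQUIRED = {".shp", ".shx", ".dbf"}


def group_shapefile_parts(folder):
    """Map each layer name to the set of file extensions found for it."""
    layers = defaultdict(set)
    for path in Path(folder).iterdir():
        if path.is_file():
            layers[path.stem].add(path.suffix.lower())
    # Only keep groups that actually contain a .shp file.
    return {name: exts for name, exts in layers.items() if ".shp" in exts}


def incomplete_layers(folder):
    """Return {layer name: sorted list of missing required parts}."""
    report = {}
    for name, exts in group_shapefile_parts(folder).items():
        missing = REQUIRED - exts
        if missing:
            report[name] = sorted(missing)
    return report
```

Calling `incomplete_layers()` on your own project folder before a backup or before emailing data to a collaborator should return an empty dictionary if every layer is complete.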

From the above, you will clearly see that I favour the shapefile approach. I think it is more flexible, and it means that you are much less chained to using ArcGIS whether you like it or not. However, it is worth emphasising here that the bottom line is that you should use the approach that is best suited to you and your own circumstances. If you like the geodatabase approach, go with it; if you like the shapefile approach, why not use it? In the end, GIS is just a tool to help you do your research. As long as you succeed in doing what you need to, it doesn’t really matter which approach you take, and don’t let anyone tell you anything different.

This article was originally posted on the GIS In Ecology forum at https://groups.google.com/forum/?fromgroups#!topic/gis-in-ecology-forum/piG6OzJEKAY.
