TransWikia.com

Getting topology or out of memory errors with large dataset intersects and spatial joins in ArcMap

Geographic Information Systems Asked by Nate Roth on January 12, 2021

I receive a topology error notification when intersecting a point dataset (~5 million points) with a polygon dataset created by buffering those points by half a mile. The goal is to create a table containing the intersection of the two datasets, so that I have a list of all points within a half-mile radius of each starting point. I can generate effectively identical results using either an intersect or a spatial join.

My prototype of this process works fine on a small subset of each dataset. When I scale up to the full dataset, the intersect operation fails with a topology error and the spatial join fails with an out-of-memory error (which is plausible given the dataset size and the memory addressing limitations of a 32-bit application).

Much of the time I do these operations in PostGIS (successfully and easily), but on this project I’m constrained to working in ArcMap, with the assumption that my users will have only the ArcView (Basic) level of licensing. I’ve also done these operations in SpatiaLite. I’d really rather not have to pull in ogr2ogr to move the datasets to SpatiaLite for the processing, but can if I must.

Machine specs: Intel Core2 Quad (Q9550), Windows 7 (64-bit), 8 GB of RAM, plenty of hard drive space

3 Answers

Here are a few suggestions that may help:

1) Perform a "Repair Geometry" on both of the datasets before attempting the intersect, or better yet, build a quick topology and correct any polygon overlay errors as needed.

2) Convert multipart features to single part features.

3) Make sure that you set a realistic/appropriate XY tolerance and XY resolution for your data using the geoprocessor.

4) Eliminate any unnecessary fields, either through layers or by choosing the "ONLY_FID" option in the Join Attributes parameter (you can get the attributes later by joining via FID/OID to the original tables).

5) Play around with different formats - despite ESRI's push to get everyone to use the FGDB format, the lowly Shapefile format is often much faster and less error-prone when performing complex spatial operations.
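For anyone scripting this, here is a rough sketch of suggestions 1 and 3 plus the ONLY_FID option. It assumes ArcGIS 10.x with arcpy available; the function name, the argument names and the 0.001 Meters tolerance are placeholders of mine, not values from the question. Repair Geometry modifies its input in place, so run it on copies if the originals matter.

```python
def prepare_and_intersect(points_fc, buffers_fc, out_fc, xy_tolerance="0.001 Meters"):
    """Repair both inputs, then intersect them keeping only feature IDs.

    All arguments are hypothetical; substitute your own feature class paths
    and a tolerance that suits your data.
    """
    # arcpy ships with ArcGIS; importing it here lets this file load without it.
    import arcpy

    # Environment setting picked up by the geoprocessing tools called below.
    arcpy.env.XYTolerance = xy_tolerance

    # Repair Geometry edits the feature classes in place.
    for fc in (points_fc, buffers_fc):
        arcpy.RepairGeometry_management(fc)

    # ONLY_FID carries just the FID_<input name> fields into the output;
    # other attributes can be joined back by FID/OID afterwards.
    arcpy.Intersect_analysis([points_fc, buffers_fc], out_fc, "ONLY_FID")
    return out_fc
```

The attributes left behind can be brought back later with an ordinary attribute join on the FID fields, as described in the ONLY_FID suggestion above.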

Answered by Brent Edwards on January 12, 2021

First the Why! The issue you are seeing is related to ESRI's scratch workspace. ArcGIS now uses a file geodatabase (FGDB) to store data for temporary tasks. This workspace is built with a default maximum size of 4 GB; the explanation I was given is that the cap keeps a large runaway geoprocess from crashing the machine, which could happen with an unlimited setting.

The best workaround! The way you resolve this is to do a JOIN using the ONLY_FID option. This reduces the size of the intermediate working layer, which matters because the temp workspace in ArcGIS is limited to 4 GB. I have even tried running my processes entirely in an ArcSDE database and still had this issue, because the tables that the software creates in the temp/scratch workspace grow explosively.

I actually had an ESRI guy from Redlands come in-house to track this down with me, which is how I know the why. It is by design; we put in a feature request to make it configurable, but it died in the depths of Redlands.

Answered by D.E.Wright on January 12, 2021

If doing a JOIN using the ONLY_FID option works, and your dataset is unlikely to grow significantly, that is your best bet. If not, you might be back to the old fun of tiling your dataset, processing, and merging the results. The easiest way might be to create a grid and, for each grid cell, select the points inside it, join those with the full dataset, and append the results to the output feature class.

This won't be particularly speedy, but should get the job done for an arbitrarily large dataset.
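To make the tiling idea concrete, here is a minimal pure-Python sketch of the same logic, not the ArcMap workflow itself. It assumes the points fit in a dict of id -> (x, y) in a projected coordinate system, and it only compares each tile's points with points from tiles close enough to matter rather than with the full dataset. The function and parameter names are made up for illustration, and pairing a point with itself is skipped (an intersect against the buffers would include it).

```python
import math
from collections import defaultdict


def neighbors_by_tile(points, radius, tile_size):
    """Yield (id_a, id_b) for each ordered pair of distinct points within radius.

    points: dict mapping a point id to an (x, y) tuple in projected units.
    """
    # Bucket point ids by the grid tile that contains them.
    tiles = defaultdict(list)
    for pid, (x, y) in points.items():
        key = (int(math.floor(x / tile_size)), int(math.floor(y / tile_size)))
        tiles[key].append(pid)

    # Number of tiles the search radius can span in each direction.
    reach = int(math.ceil(radius / float(tile_size)))
    radius_sq = radius * radius

    # Process one tile at a time; in a real workflow each tile's results
    # would be appended to the output table before moving on.
    for (tx, ty), members in tiles.items():
        candidates = []
        for dx in range(-reach, reach + 1):
            for dy in range(-reach, reach + 1):
                candidates.extend(tiles.get((tx + dx, ty + dy), []))
        for a in members:
            ax, ay = points[a]
            for b in candidates:
                if a == b:
                    continue  # skip the point's own buffer
                bx, by = points[b]
                if (ax - bx) ** 2 + (ay - by) ** 2 <= radius_sq:
                    yield a, b
```

Usage would look like `pairs = list(neighbors_by_tile(pts, radius, tile_size))`, with the radius and tile size expressed in the dataset's linear units. A tile size close to the search radius keeps the candidate list small; larger tiles mean fewer passes but more comparisons per pass.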

Answered by David on January 12, 2021
