Is it possible to read a subset of points with the PDAL C++ api?

Geographic Information Systems Asked on November 29, 2021

I am using the PDAL C++ API to read LAS/LAZ data. Some of our tile files are very large (160M+ points), so I want to work with a sample of the points. Although the point order in the tiles is arbitrary, it definitely isn’t random, so we probably want random sampling. The alternative methods appear to be too slow and/or require an index, and index building is the specific limit I’m hitting with 160M points!

I could easily sample the points myself once they are loaded, but it looks like PDAL reads all points into memory first. This seems wasteful for such large datasets.
Is it possible to tell PDAL to "only read a random selection of N valid points from disk"?

One Answer

PDAL supports streaming, but not all filters support it. If you don't want to load all of the data into memory at once, you need to use streaming.

filters.decimation supports streaming because it doesn't need to consider all of the points at once to do its work. For example, the following would keep every tenth point in the file:

pdal translate input.las output.las decimation --filters.decimation.step=10

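This isn't a random sample, however. It's an order-dependent decimation, so the result reflects however the points happen to be ordered in the file. The more rigorous filters.sample performs Poisson sampling, keeping points so that no two are closer than a minimum radius. For example, this enforces a 10 m radius: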

pdal translate input.las output.las sample --filters.sample.radius=10

filters.sample requires looking at the entire set of data, however, and does not support streaming.
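If what you want is a uniformly random selection of N points in a single pass, reservoir sampling is a standard technique that fits a streaming read, since it only ever holds N points in memory. The sketch below is plain C++ and is not part of PDAL: the Point struct and the ReservoirSampler class are hypothetical names made up for illustration, and the fixed seed is only there to make runs repeatable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical stand-in for whatever per-point record you keep (not a PDAL type).
struct Point {
    double x;
    double y;
    double z;
};

// Reservoir sampling (Algorithm R): after offering n points, each of them
// has the same probability capacity/n of being in the sample.
class ReservoirSampler {
public:
    ReservoirSampler(std::size_t capacity, std::uint64_t seed)
        : capacity_(capacity), rng_(seed) {
        reservoir_.reserve(capacity_);
    }

    // Call once per point, in whatever order the points arrive.
    void offer(const Point& p) {
        ++seen_;
        if (reservoir_.size() < capacity_) {
            // Still filling the reservoir: keep every point.
            reservoir_.push_back(p);
            return;
        }
        // Draw j uniformly from [0, seen_ - 1]; replace slot j only if it
        // falls inside the reservoir, otherwise discard the new point.
        std::uniform_int_distribution<std::uint64_t> pick(0, seen_ - 1);
        const std::uint64_t j = pick(rng_);
        if (j < capacity_) {
            reservoir_[static_cast<std::size_t>(j)] = p;
        }
        assert(reservoir_.size() <= capacity_);
    }

    const std::vector<Point>& sample() const { return reservoir_; }
    std::uint64_t seen() const { return seen_; }

private:
    std::size_t capacity_;
    std::mt19937_64 rng_;
    std::uint64_t seen_ = 0;
    std::vector<Point> reservoir_;
};
```

In a PDAL program, offer() would presumably be called from a per-point callback in a streaming pipeline (for instance via StreamCallbackFilter); check that wiring against the PDAL documentation rather than taking it from this sketch. Note that every point is still read and decompressed from disk. Only the memory footprint is reduced.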

You don't state which indexing method you are using. 160M points isn't that big a dataset, so I'm not sure what specific issue you are hitting there.

Answered by Howard Butler on November 29, 2021
