May 14, 2019

Deep Learning with 3D Spatial Data

Esri has been making progress in applying deep learning models to extract information from 3D spatial data. Recent experiments have used 3D point clouds as well as 3D triangulated meshes produced by structure-from-motion algorithms from oblique imagery. A pilot project in Miami, undertaken in 2018, aimed to optimize a workflow for reconstructing 3D building models from lidar data in the form of high-density point clouds. That workflow required manual digitization of roof segments for different roof types.

Because manual digitization was slow, machine learning was used to automate it: the manually digitized data served as a training dataset for a neural network. This is supervised learning, in which manually categorized or labeled data is provided to a learning algorithm. In general, the more example data is provided, the better the algorithm performs on the task it is being trained for.

Most of what is called artificial intelligence today is in reality supervised learning. These experiments with 3D data are also an application of deep learning, a type of machine learning that uses many-layered artificial neural networks: software that roughly emulates how neurons operate in a human brain.

Reconstructing 3D buildings from aerial LiDAR with AI

In the Miami workflow, the 3D point cloud was first converted to a 2D raster that stores the average height of the lidar points in each pixel. Next, GIS engineers manually digitized the roof segment polygons on top of the newly created 2D raster. Then, ArcGIS 3D extension tools as well as CityEngine procedural rules were used to extrude the building models from the roof segment polygons.
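The point-cloud-to-raster step can be sketched in a few lines of plain Python. This is only an illustration of the idea of averaging point heights per pixel, not the ArcGIS implementation; the function name rasterize_mean_height, the cell_size parameter and the NaN no-data value are assumptions made for this example.

```python
import math


def rasterize_mean_height(points, cell_size, nodata=float("nan")):
    """Grid (x, y, z) points into a 2D raster holding the mean z per cell.

    Row 0 is the northernmost row, as in most raster formats.
    Cells that contain no points receive the `nodata` value.
    """
    if not points:
        return []
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    min_x, max_y = min(xs), max(ys)
    n_cols = int((max(xs) - min_x) // cell_size) + 1
    n_rows = int((max_y - min(ys)) // cell_size) + 1

    # Accumulate the sum of heights and the point count for every cell.
    sums = [[0.0] * n_cols for _ in range(n_rows)]
    counts = [[0] * n_cols for _ in range(n_rows)]
    for x, y, z in points:
        col = int((x - min_x) // cell_size)
        row = int((max_y - y) // cell_size)
        sums[row][col] += z
        counts[row][col] += 1

    # Mean height where points exist, nodata elsewhere.
    return [
        [sums[r][c] / counts[r][c] if counts[r][c] else nodata
         for c in range(n_cols)]
        for r in range(n_rows)
    ]


# Three made-up lidar returns on a 1 m grid: two share a cell, one is alone.
sample = [(0.2, 0.2, 10.0), (0.8, 0.4, 12.0), (1.5, 1.5, 20.0)]
raster = rasterize_mean_height(sample, cell_size=1.0)
```

Averaging is a simple choice that matches the description above; other statistics per cell (maximum, median) would change how roof edges and vegetation show up in the raster.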

Before training the neural network, extra preprocessing was needed to subtract ground elevation from the dataset; otherwise the network would have required many more examples to become terrain-invariant. This resulted in a normalized DSM (Digital Surface Model), in which heights are measured relative to the ground. ArcGIS Pro offers an Export Training Data for Deep Learning geoprocessing tool to create a training set from the preprocessed raster and polygon data. Next, a Mask R-CNN neural network was trained with this data, and the results were loaded back into ArcGIS Pro.
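As a rough illustration of the ground-subtraction step, the sketch below subtracts a ground-elevation raster from a surface-elevation raster cell by cell. The function name normalize_heights, the nested-list raster representation and the clamping of negative values to zero are assumptions for this example, not details taken from the Esri workflow.

```python
import math


def normalize_heights(surface, ground):
    """Return height above ground per cell: surface elevation minus ground elevation.

    Both rasters are lists of rows with identical shape. NaN cells stay NaN.
    Negative differences are clamped to 0.0 (an assumption of this sketch).
    """
    if len(surface) != len(ground) or any(
        len(s_row) != len(g_row) for s_row, g_row in zip(surface, ground)
    ):
        raise ValueError("surface and ground rasters must have the same shape")
    return [
        [
            float("nan") if math.isnan(s) or math.isnan(g) else max(s - g, 0.0)
            for s, g in zip(s_row, g_row)
        ]
        for s_row, g_row in zip(surface, ground)
    ]


# Two 5 m tall roofs on terrain at 100 m and 140 m elevation (made-up values).
surface = [[105.0, 145.0], [100.0, 140.0]]
ground = [[100.0, 140.0], [100.0, 140.0]]
heights = normalize_heights(surface, ground)  # both roofs become 5.0
```

After subtraction the two roofs have the same value regardless of the elevation of the terrain they stand on, which is why the network no longer has to learn terrain invariance from examples.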

The trained neural network sped up digitization considerably compared to the manual process. The final results of the procedural rule-based extrusion can be seen in a live web scene, which shows the different roof segment polygons of the buildings in 3D over a large area (Fig. 1, source: Esri). One thing to keep in mind is that raw point cloud data consists of unsorted x, y, z points, with no indication of which points belong to a building.

Deep learning using lidar data without intermediate raster file conversion

A more recent application of deep learning models to 3D spatial data used lidar data directly, without the intermediate conversion to a raster file used in the Miami-Dade County experiment. Instead, RANSAC algorithms were used. RANSAC, short for “random sample consensus”, is an iterative method for estimating the parameters of a mathematical model from a set of observed data that contains outliers. RANSAC algorithms in ArcGIS Pro, such as the LAS Building Multipatch tool, which creates building models from rooftop points captured in lidar data, were used to build 3D shells from building footprints, a DEM and building points. From a distance the results look impressive, but on closer inspection they contain a lot of noise. This is due to the unstructured nature of point clouds, which makes it hard for a classification algorithm to do a perfect job (Fig. 2, source: Esri).
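To make the RANSAC idea concrete, here is a minimal generic sketch that fits a single plane, such as a flat roof segment, to 3D points containing outliers. It is not the algorithm inside the LAS Building Multipatch tool; the function names fit_plane and ransac_plane and the threshold, iterations and seed parameters are assumptions chosen for illustration.

```python
import random


def fit_plane(p1, p2, p3):
    """Plane through three points as (a, b, c, d) with unit normal; None if collinear."""
    ux, uy, uz = (p2[i] - p1[i] for i in range(3))
    vx, vy, vz = (p3[i] - p1[i] for i in range(3))
    # The normal vector is the cross product of the two edge vectors.
    a = uy * vz - uz * vy
    b = uz * vx - ux * vz
    c = ux * vy - uy * vx
    norm = (a * a + b * b + c * c) ** 0.5
    if norm < 1e-12:
        return None
    a, b, c = a / norm, b / norm, c / norm
    d = -(a * p1[0] + b * p1[1] + c * p1[2])
    return a, b, c, d


def ransac_plane(points, threshold=0.1, iterations=200, seed=0):
    """Return (plane, inliers) for the candidate plane with the most inliers."""
    rng = random.Random(seed)
    best_plane, best_inliers = None, []
    for _ in range(iterations):
        # 1. Fit a candidate model to a minimal random sample (three points).
        plane = fit_plane(*rng.sample(points, 3))
        if plane is None:
            continue
        a, b, c, d = plane
        # 2. Collect the points that lie within `threshold` of the candidate.
        inliers = [p for p in points
                   if abs(a * p[0] + b * p[1] + c * p[2] + d) <= threshold]
        # 3. Keep the candidate that the largest number of points agrees with.
        if len(inliers) > len(best_inliers):
            best_plane, best_inliers = plane, inliers
    return best_plane, best_inliers


# A flat roof at z = 5 sampled on a 5 x 5 grid, plus four made-up outliers.
roof = [(float(x), float(y), 5.0) for x in range(5) for y in range(5)]
noise = [(1.0, 1.0, 9.0), (2.0, 3.0, 0.0), (0.0, 4.0, 7.5), (3.0, 3.0, 12.0)]
plane, inliers = ransac_plane(roof + noise)
```

Because each candidate is fitted to only three points, outliers influence the result only when they happen to be sampled, and such candidates attract few inliers. The threshold controls how much measurement noise is tolerated before a point counts as an outlier.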

Further experimentation with the open source PointCNN framework for feature learning on point clouds yielded promising results using open 3D lidar data from the Netherlands. The same PointCNN framework can also be used on raw 3D meshes, resulting in a segmented 3D mesh. In general, PointCNN is expected to have great potential for other, more complex point classification tasks, not just building classification.

Resources:

3D cities: Deep Learning in three-dimensional space

Architects of Intelligence