Canadian Forest Service Publications
Land cover harmonization using Latent Dirichlet Allocation. 2021. Li, Z., White, J.C., Wulder, M.A., Hermosilla, T., Davidson, A.M. International Journal of Geographical Information Science, 35:2, 348-374.
Issued by: Pacific Forestry Centre
Catalog ID: 40389
Availability: PDF (download)
Available from the Journal's Web site. †
† This site may require a fee
Large-area land cover maps are produced to satisfy different information needs. Land cover maps having partial or complete spatial and/or temporal overlap, different legends, and varying accuracies for similar classes, are increasingly common. To address these concerns and combine two 30-m resolution land cover products, we implemented a harmonization procedure using a Latent Dirichlet Allocation (LDA) model. The LDA model used regionalized class co-occurrences from multiple maps to generate a harmonized class label for each pixel by statistically characterizing land attributes from the class co-occurrences. We evaluated multiple harmonization approaches: using the LDA model alone and in combination with more commonly used information sources for harmonization (i.e. error matrices and semantic affinity scores). The results were compared with the benchmark maps generated using simple legend crosswalks and showed that using LDA outputs with error matrices performed better and increased harmonized map overall accuracy by 6–19% for areas of disagreement between the source maps. Our results revealed the importance of error matrices to harmonization, since excluding error matrices reduced overall accuracy by 4–20%. The LDA-based harmonization approach demonstrated in this paper is quantitative, transparent, portable, and efficient at leveraging the strengths of multiple land cover maps over large areas.
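As a rough illustration of the "regionalized class co-occurrences" mentioned in the abstract, the sketch below counts how often each pair of class labels from two co-registered maps coincides within a region. It is not the authors' implementation: the function name `cooccurrence_documents`, the square-block definition of a region, the `block_size` parameter, and the toy legends are all assumptions made for this example. In topic-model terms, each region plays the role of a document and each label pair the role of a word; the resulting counts could then be passed to any LDA implementation.

```python
from collections import Counter, defaultdict


def cooccurrence_documents(map_a, map_b, block_size):
    """Count (class_a, class_b) label pairs within each square block of pixels.

    map_a and map_b are equally sized 2-D lists of class labels.
    Returns a dict mapping a block index (row, col) to a Counter of label pairs.
    Square blocks are an assumed stand-in for the paper's regions.
    """
    same_rows = len(map_a) == len(map_b)
    same_cols = all(len(ra) == len(rb) for ra, rb in zip(map_a, map_b))
    if not (same_rows and same_cols):
        raise ValueError("both maps must share the same grid")

    docs = defaultdict(Counter)
    for r, (row_a, row_b) in enumerate(zip(map_a, map_b)):
        for c, (label_a, label_b) in enumerate(zip(row_a, row_b)):
            region = (r // block_size, c // block_size)
            docs[region][(label_a, label_b)] += 1
    return dict(docs)


# Toy 4x4 maps with hypothetical legends (not the legends used in the paper).
forest_map = [
    ["conifer", "conifer", "shrub", "shrub"],
    ["conifer", "conifer", "shrub", "herb"],
    ["wetland", "wetland", "herb", "herb"],
    ["wetland", "water", "herb", "herb"],
]
agri_map = [
    ["forest", "forest", "shrubland", "pasture"],
    ["forest", "forest", "pasture", "pasture"],
    ["wetland", "wetland", "cropland", "cropland"],
    ["wetland", "water", "cropland", "cropland"],
]

docs = cooccurrence_documents(forest_map, agri_map, block_size=2)
for region, counts in sorted(docs.items()):
    print(region, dict(counts))
```

Blocks where the two maps agree under a simple crosswalk (for example, conifer with forest) produce a single dominant pair, while blocks with mixed pairs mark the areas of disagreement that a harmonization procedure has to resolve.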
Plain Language Summary
Land cover information is critical to understanding climate and biogeochemical cycling on the Earth, as well as to support sustainable management of natural resources. A variety of large-area land cover maps, often produced to satisfy different information needs, now exist concurrently, with partial or complete spatial and/or temporal overlap, different legends, and varying accuracies for similar classes. To provide a regional land cover map following an integrated and cross-sector legend, herein we harmonized two 30-m resolution land cover products that respectively focus on regions dominated by forest and agricultural land use in their overlap area. To automate the harmonization, we used a Latent Dirichlet Allocation (LDA) model, a statistical topic model from the field of natural language processing. Using both LDA outputs and error matrices of the source maps yielded improved harmonized maps with 6–19% increases in overall accuracy over areas of disagreement between the source maps when compared with the benchmark maps.