Forests and Trees: the Formal Semantics of Collective Categorization (ROCKY)

Datasets

GeoRic Dataset

The GeoRic dataset contains images with corresponding captions and image location coordinates (latitude and longitude). The dataset is intended for use in image captioning and other vision-and-language tasks.
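As a rough illustration of the kind of record described above (an image, its caption, and a latitude/longitude pair), here is a minimal Python sketch. The class name, field names, and example values are all hypothetical and chosen only for illustration; they are not taken from the GeoRic distribution, whose actual file format is not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeoCaptionedImage:
    """Hypothetical record: one image, its caption, and its location."""

    image_path: str   # assumed field name, not from the dataset itself
    caption: str      # assumed field name
    latitude: float   # degrees, expected in [-90, 90]
    longitude: float  # degrees, expected in [-180, 180]

    def __post_init__(self) -> None:
        # Reject coordinates outside the valid geographic ranges.
        if not -90.0 <= self.latitude <= 90.0:
            raise ValueError(f"latitude out of range: {self.latitude}")
        if not -180.0 <= self.longitude <= 180.0:
            raise ValueError(f"longitude out of range: {self.longitude}")


# Invented example values, for illustration only.
example = GeoCaptionedImage(
    image_path="images/example.jpg",
    caption="A stone bridge over a canal.",
    latitude=52.0907,
    longitude=5.1214,
)
```

Validating the coordinate ranges on construction is one possible design choice; it catches swapped or malformed latitude/longitude values before they reach a captioning model.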

For more information about the GeoRic dataset and its application to training a geographically aware image captioning system, see:
Nikiforova, S., Deoskar, T., Paperno, D., & Winter, Y. (2020). Geo-Aware Image Caption Generation. To appear in Proceedings of the 28th International Conference on Computational Linguistics (COLING 2020).