The Urban Panorama Project is developing a method to assess urban change by introducing two novel elements into the craft of digital history: 1) the use of historical images of streetscapes as primary sources; 2) the use of computer vision and machine learning to geolocate and analyze these images.
Initially, a case study will be used to develop these methods and tools. For this stage, we will employ image-analysis techniques to match a corpus of a few hundred historical digital photographs of Raleigh, NC, held at the State Archives of North Carolina, to present-day images. We plan to compare this corpus of historical street images of Raleigh with photographs from the 2010s scraped from the Google Street View service. This will be accomplished through the development of a platform, powered by a trained computer vision algorithm, that will allow users to quickly annotate, compare, and match images.
Understanding Spatial Change
There are many ways of understanding urban and spatial change, from the close reading of memoirs of historical characters embedded in particular locations to the interpretation of the legislation, urban plans, and reports produced by private developers and state officials to the plotting of historical data and the subsequent analysis of patterns through naked-eye and statistical methods. In recent years, the latter approach has become ubiquitous thanks to the popularization of GIS in the humanities and social sciences. GIS has allowed scholars to aggregate data through location in space in the form of points, lines, and polygons. However, there is a bias implicit in this method. GIS-based approaches understand urban change from the air, reproducing the point of view of the planner.
This project aims to introduce a new dimension in the historical assessment of how cities change. By changing the focus from geometric parcels, as seen from the air, to images of streetscapes, as seen at the street level, we intend to move closer to the perspective of the people experiencing change in the landscape of the city as they traverse its streets and avenues. To achieve this goal, our project will combine GIS with machine learning and computer vision technologies to geolocate, analyze, and interpret historical images of urban landscapes.
Computer Vision with Small Datasets
In recent years, researchers have used state-of-the-art deep convolutional neural networks (ConvNets) to analyze and locate large datasets of present-day images. Urban Panorama takes a different approach, based on grammar-driven learning and inference. Due to their high number of parameters, ConvNets require a large amount of training data (on the order of 10^6 images). ConvNets, therefore, are usually not useful for analyzing historical datasets, which are small and limited. Grammar models, on the other hand, have shown very good performance when trained on small datasets, which makes them ideal for the type of dataset used in this project.
The task of image parsing is a grammar-driven learning and inference process analogous to language parsing in natural language understanding and text mining. Image parsing consists of two components: 1) the syntactic decomposition of a scene into objects, parts, and subparts, all the way down to image pixels; 2) the use of semantic understanding and reasoning to infer the spatial, semantic, and causal relations between the decomposed entities. Both are recursive procedures—the more one looks, the more one sees. These techniques have led to very successful results in two previous projects led by Tianfu Wu. In the case of Urban Panorama, our algorithm will be trained to decompose façades into their structural components (e.g., doors, window frames, columns, signs, bricks).
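To make the recursive decomposition concrete, here is a minimal, illustrative sketch of the parse-tree idea in Python. It is not the project's actual grammar model; the class and labels (`SceneNode`, "facade", "window", etc.) are hypothetical names chosen for the example, and real image parsing would operate on pixels rather than strings.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class SceneNode:
    """A node in an image parse tree: a scene element and its sub-parts."""
    label: str
    children: List["SceneNode"] = field(default_factory=list)

    def decompose(self, *part_labels: str) -> List["SceneNode"]:
        """Syntactically decompose this node into named sub-parts."""
        self.children = [SceneNode(label) for label in part_labels]
        return self.children

    def leaves(self) -> List[str]:
        """Collect the terminal elements of the decomposition, left to right."""
        if not self.children:
            return [self.label]
        return [leaf for child in self.children for leaf in child.leaves()]

# Build a toy parse tree for a façade: the deeper one decomposes,
# the more elements the tree exposes ("the more one looks, the more one sees").
facade = SceneNode("facade")
door, window, sign = facade.decompose("door", "window", "sign")
window.decompose("frame", "glass", "sill")

print(facade.leaves())
# ['door', 'frame', 'glass', 'sill', 'sign']
```

A grammar model would additionally encode which decompositions are valid (a window may expand into frame, glass, and sill, but not into a door) and score competing parses of the same scene.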
Image segmentation and annotation interface (ISAI)
One of the goals of this project is to develop a platform for the segmentation and annotation of images (Image Segmentation and Annotation Interface – ISAI). This platform will allow users to graphically select features in streetscape images ("annotate") and use the image analysis algorithm to query the database for possible matches. The platform will also provide data on how users (i.e., scholars, graduate students, stakeholders) match urban locations in streetscape images through a process of decomposing façades into their basic elements. This data, in turn, will be used to train and refine the algorithm that analyzes the image database, offering users suggestions of possible matches.
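The annotate-then-query workflow described above can be sketched in a few lines of Python. This is an assumption-laden toy, not ISAI's implementation: annotations are reduced to sets of feature labels, and similarity is a simple Jaccard overlap, where the real platform would compare visual features via the trained algorithm. All identifiers (`Annotation`, `match_candidates`, the image ids) are invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple

@dataclass
class Annotation:
    """A user-made annotation: an image id plus the façade features marked in it."""
    image_id: str
    features: Set[str]

def match_candidates(query: Annotation,
                     database: List[Annotation],
                     top_k: int = 3) -> List[Tuple[str, float]]:
    """Rank database images by Jaccard overlap of annotated features."""
    scored = []
    for record in database:
        union = query.features | record.features
        score = len(query.features & record.features) / len(union) if union else 0.0
        scored.append((record.image_id, score))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# A historical photograph annotated by a user, queried against
# two annotated present-day images.
historical = Annotation("raleigh_1925_014", {"arched door", "brick", "cornice"})
present_day = [
    Annotation("gsv_2014_a", {"arched door", "brick", "awning"}),
    Annotation("gsv_2014_b", {"glass curtain wall"}),
]
print(match_candidates(historical, present_day))
# [('gsv_2014_a', 0.5), ('gsv_2014_b', 0.0)]
```

The same records of which features users select, and which matches they accept, are exactly the training signal the section describes feeding back into the algorithm.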
Geolocation through image analysis
In this project, we propose geolocating historical streetscape photographs by matching them to geolocated present-day images. This can be a multi-step, multi-scalar, and often iterative process. In it, a researcher decomposes a scene into its visual elements, parsing identities and establishing relationships with elements external to the image in question. Ultimately, he or she separates the features that are relevant for geolocation from those extraneous to the task. This sorting process is highly dependent on the knowledge of the viewer—their familiarity with architectural forms and typologies, their proficiency in local history, and their awareness of an urban landscape with both perennial and transitional elements are all necessary for finding visual matches diachronically. Eventually, the viewer will narrow the field of potential locations or trigger other avenues of inquiry.
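The narrowing step above—keeping durable features, discarding transitional ones, and filtering candidate locations—can be sketched as follows. This is a hypothetical illustration under simplifying assumptions: candidate locations and features are plain strings, and the perennial/transitional split is given in advance, whereas in practice it embodies the viewer's architectural and local-historical knowledge. The function and location names are invented for the example.

```python
from typing import Dict, List, Set

def narrow_locations(candidates: Dict[str, Set[str]],
                     perennial: Set[str],
                     transitional: Set[str]) -> List[str]:
    """Keep only candidate locations whose durable features cover every
    perennial feature of the historical photograph, ignoring transitional
    elements (signs, awnings, parked cars, ...)."""
    surviving = []
    for location, features in candidates.items():
        relevant = features - transitional  # drop features extraneous to geolocation
        if perennial <= relevant:           # all perennial features must be present
            surviving.append(location)
    return surviving

# Two candidate present-day locations, described by their visible features.
candidates = {
    "fayetteville_st_100": {"granite base", "paired columns", "soda sign"},
    "fayetteville_st_200": {"granite base", "awning"},
}
print(narrow_locations(candidates,
                       perennial={"granite base", "paired columns"},
                       transitional={"soda sign", "awning"}))
# ['fayetteville_st_100']
```

Iterating this filter with progressively finer features mirrors the multi-step, multi-scalar process the section describes: each pass shrinks the field of potential locations or prompts a new avenue of inquiry.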
Project Team
- Frederico Freitas, Assistant Professor, History (NCSU)
- Tianfu (Matt) Wu, Assistant Professor, Electrical and Computer Engineering (NCSU)
- Todd Berreth, Assistant Professor, Art and Design (NCSU)
- Arnav Jhala, Associate Professor, Computer Science (NCSU)
- Matthew Morse Booker, Associate Professor, History (NCSU)
Current and Past Research Assistants
- Stephanie Lee Huang, Art + Design (NCSU)
- Catherine Stiers, History (NCSU)
- Austin Robinson, History (NCSU)
- Mahesh Masale, Computer Science (NCSU)
- Suraj Shanbhag, Electrical and Computer Engineering (NCSU)