Street View Image Analysis for Building Assessments
RISE is applying multimodal large language models (LLMs) to analyse Google Street View images for cultural heritage and building usage assessments. The models identify cultural heritage values, building types, materials, and architectural details, automating what traditionally requires extensive manual inspection.
The European Context
During 2025 and 2026, the Energy Performance of Buildings Directive is being implemented across the European Union, requiring every member state to have a National Building Renovation Plan. Sweden has no national register of buildings with heritage values, and its estimates of building usage are incomplete.
Scale of Analysis
As part of the analyses, 248,855 street view images representing 154,710 buildings have been analysed using multimodal large language models. The results are used to estimate aspects of cultural heritage value, building usage, and building renovation state.
Zero-shot predictions by LLMs were used as a basis for assigning heritage values to 5.0 million square metres of heated building area for the Swedish Building Renovation Plan.
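The figures above imply that some buildings are covered by more than one image, so per-image predictions have to be combined into one value per building before heated area can be attributed to a heritage class. The summary does not say how this was done. The following is only a minimal sketch under assumed rules: a majority vote per building, followed by a sum of heated area per label. All identifiers, labels, and numbers in it are hypothetical illustration data, not project results.

```python
from collections import Counter, defaultdict

# Hypothetical per-image zero-shot outputs: (building_id, predicted heritage label).
image_predictions = [
    ("b1", "high"),
    ("b1", "high"),
    ("b1", "low"),
    ("b2", "low"),
    ("b3", "high"),
]

# Hypothetical heated floor area per building, in square metres.
heated_area_m2 = {"b1": 1200.0, "b2": 450.0, "b3": 300.0}


def label_per_building(predictions):
    """Assumed rule: majority vote over all images of a building.

    On a tie, the label seen first for that building wins.
    """
    votes = defaultdict(Counter)
    for building_id, label in predictions:
        votes[building_id][label] += 1
    return {b: counts.most_common(1)[0][0] for b, counts in votes.items()}


def area_per_label(building_labels, areas):
    """Sum heated area over all buildings that share the same label."""
    totals = defaultdict(float)
    for building_id, label in building_labels.items():
        totals[label] += areas[building_id]
    return dict(totals)


building_labels = label_per_building(image_predictions)
area_totals = area_per_label(building_labels, heated_area_m2)
print(building_labels)
print(area_totals)
```

A majority vote is the simplest choice. A real pipeline might instead weight images by quality or model confidence, or flag buildings whose images disagree for manual review.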
Ethical Considerations
The project also highlights the ethical and regulatory questions surrounding the use of public imagery and AI in decision-making processes. Issues such as data privacy, consent, and bias in model training require careful navigation. The project addresses potential risks for authorities that rely on LLM-based data, focusing on transparency, error detection, and sycophancy.
Impact
The work demonstrates both AI’s potential for large-scale sustainability efforts and RISE’s commitment to developing responsible methods that respect individual and cultural contexts. The same models can be used to identify many other building features for other applications.


