The New Science

While the real estate industry has long understood the need for data, it still struggles with connecting information to decision making. New strides in data science could change that.

Real estate is an “informationally inefficient” asset class, which means that real estate prices may not fully reflect the information available about the underlying assets. Because real estate pricing does not always reflect the underlying fundamental value of the asset, research to discover and exploit data in real estate markets can uncover fundamental value, identify mispricing, and generate above-market returns more easily than in other, publicly traded asset classes, such as stocks and bonds.

Advances in data availability and data science techniques are bringing new transparency to historically opaque real estate markets. Taking a data science approach to real estate analysis can reveal fundamental relationships between building-specific features, amenities in the built environment, sociodemographic factors, and real estate pricing at a scale previously unthinkable using traditional real estate analysis techniques.

This article explores the use of data science in real estate analysis. We draw connections between data science and comparator analysis often used in traditional real estate analysis, outline ways that data science can be integrated into the investment process, showcase a practical example, and conclude with thoughts about the future of data science in real estate.


The data science approach to real estate analysis involves collecting information on a large number of real estate assets and testing whether there are consistent relationships between factors that may drive value (e.g., coffee shop density or median household income) and real estate value indicators (e.g., rent level or asset value growth).

We build a bridge between real estate data science and a traditional real estate pricing method—comparator analysis—by highlighting where they overlap and where they differ on four key traits:

  • Sample size: Both the data science approach and the comparator set analysis approach are comparative methods; that is, both seek to identify comparator properties that can guide pricing on a subject asset. They differ in the number of comparator properties used in the analysis. Data science requires a very large number of comparator properties, typically ranging from the hundreds to, potentially, the millions. By contrast, traditional comparator analysis uses only a handful of comparators that are as similar as possible to the subject asset.
  • Analytical method: The data science approach and the comparator set approach both adjust information on comparator properties to come up with an estimated price for the subject property. Data science uses statistical models such as regression, decision trees, and neural networks to identify relationships between variables. Comparator set analysis involves manually comparing similarities and differences among a subject asset and comparator assets to develop a range and point estimate for subject asset pricing.
  • Granularity of analysis: Granularity of analysis refers to how deep the approach can go into the unit, suite, or storefront at a subject property. In principle, data science can go down to the finest detail. In practice, however, data availability limits the data science approach to the building level, with unit-specific analysis possible only for a relatively small set of assets. Since the comparator set approach is manual and limited to a small number of comparator assets, it allows for highly unit-specific adjustments when estimating a price for the subject property. In most cases, the comparator set approach will be more granular than the data science approach.
  • Presentation of price ranges: Estimated pricing is just that: an estimate. Prices, therefore, are often presented as ranges to reflect the uncertainty over the spot estimate ultimately used for a bid or rental contract. Because the data science approach relies on statistical models to estimate pricing, price ranges take the form of statistical confidence intervals reflecting deviations around the point estimate at the desired level of confidence. Comparator set analysis uses scenario analysis and sensitivity tables to show how changes in key assumptions would impact pricing and show a range of pricing outcomes.
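To make the statistical side of this comparison concrete, the following is a minimal sketch of how a regression on a large comparator set can produce a point estimate and a price range for a subject asset. It is an illustration under stated assumptions, not the authors' model: the data are synthetic, the single value driver (median household income) and all parameter values are invented for the example, and the range uses a normal approximation that is only reasonable for large samples.

```python
import random
import statistics


def fit_line(x, y):
    """Ordinary least squares for y = a + b * x.

    Returns the intercept a, slope b, residual standard deviation, and the
    sum of squared deviations of x (used for the interval width).
    """
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    resid_sd = (sse / (len(x) - 2)) ** 0.5
    return a, b, resid_sd, sxx


def price_range(x, y, x_subject, confidence=0.95):
    """Return (low, point, high) for a new asset whose driver equals x_subject."""
    a, b, resid_sd, sxx = fit_line(x, y)
    n = len(x)
    mean_x = statistics.fmean(x)
    point = a + b * x_subject
    # Standard error for predicting a single new observation.
    se = resid_sd * (1 + 1 / n + (x_subject - mean_x) ** 2 / sxx) ** 0.5
    # Normal quantile used as a large-sample stand-in for the t quantile.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return point - z * se, point, point + z * se


# Synthetic comparator set (NOT real market data; all values are assumptions).
random.seed(42)
TRUE_SLOPE = 0.01  # assumed: rent per sq ft gained per $1,000 of median income
incomes = [random.uniform(40, 150) for _ in range(500)]
rents = [1.5 + TRUE_SLOPE * inc + random.gauss(0, 0.3) for inc in incomes]

low, point, high = price_range(incomes, rents, x_subject=95.0)
print(f"estimated rent: {point:.2f} (95% range {low:.2f} to {high:.2f})")
```

A comparator set analysis would reach its range differently, by varying key assumptions in a sensitivity table rather than by reading an interval off a fitted model.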


To bring the data science approach to life, we will demonstrate a practical example estimating the transportation premium on apartment rents.

Intuitively, properties that are closer to a major employment center will command higher rents due to a variety of factors, including shorter commute times to work. Renters are known to value transportation connectivity and the shorter commute times that come with it, but by how much? A look at the Seattle multifamily market can help us understand how data science can be used to estimate this transportation or commute premium.

At first glance we can already see the relationship between rents and commute time. Exhibit 2 shows a map of Seattle and the weighted average commute time to work. It is evident that downtown Seattle and Bellevue have significantly lower commute times than areas farther out. In an almost inverted image, Exhibit 3 shows that downtown Seattle and Bellevue have the highest rents, with rents declining farther out. Exhibit 4 demonstrates this explicitly, showing that lower commute times are associated with higher rents and vice versa.


We used a data science model to estimate this premium. Our model uses rent per square foot as the dependent variable and eighteen independent variables that together explain about 68% of the variation in Seattle apartment rents. After controlling for these variables, we can isolate the transportation premium. We find that for every additional ten minutes in commute, rents fall by $0.02 per square foot. Our results illustrate and quantify the inverse relationship between commute time and rents while supporting the well-known assumption that renters are willing to pay for shorter commutes.
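The idea of isolating one coefficient while controlling for other variables can be sketched with a small multiple regression. This is a hedged illustration only: it uses three invented explanatory variables rather than the eighteen in the model described above, the data are synthetic, and the "true" commute effect is planted in the data so that the regression has something to recover. The helper names (`solve`, `ols`) are hypothetical and written here in plain Python.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def ols(rows, y):
    """Least-squares coefficients from the normal equations; intercept comes first."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


# Synthetic buildings (NOT the Seattle data set; every value is an assumption).
random.seed(0)
TRUE_COMMUTE_EFFECT = -0.002  # rent per sq ft per extra minute, i.e. -0.02 per ten minutes
rows, rents = [], []
for _ in range(2000):
    commute = random.uniform(10, 60)   # minutes to work
    income = random.uniform(40, 150)   # median household income, $ thousands
    age = random.uniform(0, 50)        # building age in years
    rent = (2.0 + TRUE_COMMUTE_EFFECT * commute + 0.01 * income
            - 0.005 * age + random.gauss(0, 0.05))
    rows.append((commute, income, age))
    rents.append(rent)

beta = ols(rows, rents)
per_ten_minutes = 10 * beta[1]  # commute coefficient, rescaled to ten minutes

# Share of rent variation explained by the fitted model (R squared).
pred = [beta[0] + sum(b * v for b, v in zip(beta[1:], r)) for r in rows]
mean_rent = sum(rents) / len(rents)
ss_res = sum((y - p) ** 2 for y, p in zip(rents, pred))
ss_tot = sum((y - mean_rent) ** 2 for y in rents)
r_squared = 1 - ss_res / ss_tot

print(f"rent change per extra ten minutes of commute: {per_ten_minutes:.3f}")
print(f"R squared: {r_squared:.2f}")
```

Because income and building age are included as controls, the commute coefficient reflects the rent difference associated with commute time alone, which is the sense in which the premium is "isolated".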


After building a data science capability, the next challenge investors face is integrating data science into their existing workflow. In this section, we outline some ideas for how to weave data science into existing workflows.

  • Identify mispricing: Because real estate is an informationally inefficient asset class, there can be long lags between information entering the market and its incorporation into pricing. If an investor can identify what an asset’s price should be, based on its fundamental value drivers, they can exploit the lag between information availability and pricing changes to acquire undervalued assets or sell overvalued ones. Data science is one of the best tools for uncovering and pricing this information efficiently and at a scale that manual analysis cannot match. Using the commute time example, we know that on average rents at a building level fall by $0.02 for every additional ten minutes of commute time. If an investor spots a collection of buildings whose rents appear to be $0.10 lower than their commute time to an employment hub would imply, the market may be undervaluing this neighborhood, with properties trading at a discount to fair value, which represents an attractive buying opportunity.
  • Scenario analysis and simulations: Investors make capital allocation decisions based on investment theses, such as the rise of e-commerce, long-term trends in remote working, or changing preferences for residential living. These investment theses can involve long-term calls on megatrends whose ultimate impact is as much guesswork as conviction. Data science offers spot estimates for variables that investors can adjust to reflect future states of the world based on their investment theses. One major trend post-pandemic is the rise of hybrid working arrangements. An investor could use data science to simulate what residential rents might be if rents fell about $0.01 for every extra ten-minute commute instead of $0.02 pre-pandemic due to less frequent commuting. This makes long-term investment theses more concrete and actionable.
  • Adjudicate between competing deals: Capital-constrained investors must adjudicate between competing investment opportunities when making capital allocation decisions. Is it better to be near a popular park, in a sought-after school district, or next to a major train station? Data science can help adjudicate between competing deals by objectively pricing features based on observed tenant willingness-to-pay data, taking part of the subjectivity out of the process. Investors and companies are moving in this direction as well. The World Green Building Council has 132 global signatories committed to reducing carbon emissions to net zero in their real estate footprints.13 The National Council of Real Estate Investment Fiduciaries (NCREIF) is implementing measures to track key environmental and social goals.14 European investors have been proactive in this area for some time; North America is beginning to catch up.

For instance, say an investor was presented with two opportunities: an apartment building very near the CBD with a low average commute time and an office building next to the busiest metro station in the city. If the investor finds that the apartment building undervalues its low commute time while the office overvalues its proximity to transit, the investor will have higher rental growth prospects in the apartment building.
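The mispricing screen and the hybrid-work scenario described in the points above reduce to simple arithmetic once a commute coefficient is in hand. The sketch below uses the $0.02 and $0.01 per-ten-minute figures and the $0.10 gap from the text; the buildings, their rents, and the `ZERO_COMMUTE_RENT` baseline (the rent a model would predict at zero commute with all other variables held fixed) are hypothetical values chosen only for illustration.

```python
BASELINE_EFFECT = -0.02  # rent change per extra ten minutes of commute (from the text)
HYBRID_EFFECT = -0.01    # scenario value for hybrid working (from the text)
ZERO_COMMUTE_RENT = 3.00  # hypothetical model-implied rent at zero commute

# Hypothetical buildings: (name, observed rent, commute minutes).
buildings = [("A", 2.84, 20), ("B", 2.93, 40), ("C", 2.80, 30)]


def expected_rent(zero_commute_rent, commute_minutes, effect_per_ten_min=BASELINE_EFFECT):
    """Rent implied by the commute coefficient alone, other factors held fixed."""
    return zero_commute_rent + effect_per_ten_min * commute_minutes / 10


def find_underpriced(candidates, zero_commute_rent, threshold=-0.10):
    """Flag buildings whose observed rent sits at least 0.10 below the implied rent."""
    flagged = []
    for name, observed, minutes in candidates:
        gap = observed - expected_rent(zero_commute_rent, minutes)
        if gap <= threshold:
            flagged.append((name, round(gap, 2)))
    return flagged


def scenario_shift(commute_minutes, new_effect, old_effect=BASELINE_EFFECT):
    """Change in implied rent if the commute coefficient moves to new_effect."""
    return (new_effect - old_effect) * commute_minutes / 10


flagged = find_underpriced(buildings, ZERO_COMMUTE_RENT)
shift_b = scenario_shift(40, HYBRID_EFFECT)

print("possible buying opportunities:", flagged)
print(f"implied rent change for a 40-minute commute under hybrid work: {shift_b:+.2f}")
```

In this toy set, buildings A and C rent well below what their commute times imply, while the scenario calculation shows how a weaker commute penalty would lift implied rents for longer-commute locations. A real screen would compare observed rents against the full model prediction rather than a single coefficient.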



Data science is quickly becoming the gold standard for real estate research. Nonetheless, data scientists in real estate still face several practical challenges. High-quality and large-scale data can be difficult to come by, particularly if one wants to price highly granular, unit-specific features. Methodological standards for data collection continue to evolve, making long-run time series analysis more difficult than in other asset classes.

Real estate also faces certain problems unique to spatial data science, such as what to do when data is constrained within artificial boundaries like census tracts or neighborhood definitions. And investors continue to struggle with how to incorporate data science into their existing investment workflow.

Still, the data science opportunity in real estate—a famously illiquid, opaque, and informationally-inefficient asset class—is enormous. Advances in data availability and innovations in data science are starting to shed new light on real estate markets. Investors with a data science capability and an organizational structure set up to harness the power of data science can reap outsized, market-beating investment performance.

A friend once told me that as a real estate tech investor, he’s excited about how new data tools can remove friction and make real estate transactions more transparent. However, as a real estate investor he’s less enthused, because such tools allow more competitors to enter the field with a better understanding of local market dynamics. This is the tension between traditional comparator analysis and the emerging power of data science highlighted by the authors.

Data science encourages us to look beyond conventional “real estate” data toward innovative new sources such as geo-social data (e.g., Instagram, Twitter, and Facebook), satellite imagery, cell phone data, and other novel datasets to better understand real estate markets and trends. For example, Dr. Andrea Chegut’s Wide Data Project at MIT draws on more than 3,000 variables from 22 datasets to model asset pricing of every building in New York City. No individual could process this amount of data, but by using machine learning and proprietary computational techniques, her team has been able to answer granular questions, such as the value of seasonal daylight exposure, proximity to coffee shops or craft brewpubs, trees on a street, or superior data connectivity. Understanding where to find relevant data and how to extract strategic information from it will be key for future real estate practitioners as they move to distinguish themselves from their competition.

Data science is not a panacea, however. As anyone building a data lake will attest, data sets are often dirty, siloed, incomplete and incompatible. As the authors point out, acquiring accurate and relevant data remains a challenge, such as differentiating between asking and realized rents after concessions. Moreover, irrational exuberance in markets can trump “true” value. For now, machine learning augments human knowledge and understanding, allowing real estate practitioners to make better decisions instead of having to rely on gut or intuition. But in the near future we will likely rely more heavily on increasingly automated, dynamic, real-time, and significantly more accurate valuation models that are able to process immense amounts of data and overcome the market’s information inefficiencies.

Steve Weikal
Head of Industry Relations, MIT Center for Real Estate
Editorial Board Member, Summit Journal


Brian Biggs, CFA, is Vice President, Research, and Ashton Sein is Research Analyst for Grosvenor’s North American property business, part of Grosvenor, an international organization whose activities span urban property, food and agtech, rural estate management, and support for philanthropic initiatives.




