Real Estate Development Models

The real estate development models in this module implement pencil-out pro formas, which measure the cash inflows and outflows of a potential investment (in this case, real estate development) and produce some measure of profitability or return on investment. Pro formas would normally be built in a spreadsheet program (e.g. Excel), but here they are implemented in vectorized Python so that many (think millions) of pro formas can be computed at a time.

The functionality is split into two modules, the square foot pro forma and the developer model, because many use cases call for the pro formas without the developer model. The sqftproforma module computes real estate feasibility for a set of parcels dependent on allowed uses, prices, and building costs, but does not actually build anything (both figuratively and literally). The developer model decides how much to build, then picks among the set of feasible buildings attempting to meet demand, and adds the new buildings to the set of current buildings. Thus the developer model is primarily useful in the context of an urban forecast.

An example of the code required to generate the set of feasible buildings is shown below. This code comes from the utils module of the current sanfran_urbansim demo. Notice that the SqFtProForma is first initialized and a DataFrame of parcels is tested for feasibility (each individual parcel is tested). Each use (e.g. retail, office, residential) is assigned a price per parcel, typically from empirical data on current rents and prices in the city, although forecast rents and prices can be used as well. The lookup function is then called with a specific building form, and the pro forma returns whether that form is profitable for each parcel.

A large number of assumptions enter into the computation of profitability. These are set in the SqFtProFormaConfig class and include such things as the set of uses to model, the mix of uses into forms, the impact of parking requirements, parking costs, building costs at different heights (taller buildings typically require more expensive construction methods), the profit ratio required, the building efficiency, parcel coverage, and cap rate, to name a few. See the API documentation for the complete list and detailed descriptions.

Note that unit mixes don’t typically enter into the square foot pro forma (hence the name). After discussions with numerous real estate developers, we found that most developers think first and foremost in terms of price and cost per square foot and the arbitrage between them, and only second in terms of the translation to unit sizes and mixes in a given market (also, larger and smaller units of a given unit type will typically lower and raise their prices, as stands to reason). Since getting data on unit mixes in the current building stock is extremely difficult, most feasibility computations here happen on a square foot basis, and the developer model below handles the translation to units.

import pandas as pd
from urbansim.developer import sqftproforma

pf = sqftproforma.SqFtProForma()

df = parcels.to_frame()

# add prices for each use
for use in pf.config.uses:
    df[use] = parcel_price_callback(use)

# convert from cost to yearly rent
if residential_to_yearly:
    df["residential"] *= pf.config.cap_rate

d = {}
for form in pf.config.forms:
    print("Computing feasibility for form %s" % form)
    d[form] = pf.lookup(form, df[parcel_use_allowed_callback(form)])

far_predictions = pd.concat(d.values(), keys=d.keys(), axis=1)

sim.add_table("feasibility", far_predictions)

The developer model is responsible for picking among feasible buildings in order to meet demand. An example usage of the model is shown below, which is also lifted from the sanfran_urbansim demo.

This module provides a simple utility to compute the number of units (or amount of floorspace) to build. Although the vacancy rate can be applied at the regional level, it can also be used to meet vacancy rates at a sub-regional level. The developer model itself is agnostic to which parcels the user passes it, and the user is responsible for knowing at which level of geography demand is assumed to operate. The developer model then chooses which buildings to “build,” usually as a random choice weighted by profitability. This means more profitable buildings are more likely to be built although the results are a bit stochastic.

The only remaining steps are then “bookkeeping” in the sense that some additional fields might need to be added (year_built or a conversion from developer forms to building_type_ids). Finally the new buildings and old buildings need to be merged in such a way that the old ids are preserved and not duplicated (new ids are assigned at the max of the old ids+1 and then incremented from there).

dev = developer.Developer(feasibility.to_frame())

target_units = dev.\

new_buildings = dev.pick(forms,

if year is not None:
    new_buildings["year_built"] = year

if form_to_btype_callback is not None:
    new_buildings["building_type_id"] = new_buildings["form"].\

all_buildings = dev.merge(buildings.to_frame(buildings.local_columns),

sim.add_table("buildings", all_buildings)
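
The weighted random choice described above can be illustrated with a small standalone sketch. It is not the library's implementation: the `net_units` column, the `pick_weighted` helper, and all of the numbers are hypothetical, and the probabilities are assumed to be simply proportional to profit.

```python
import numpy as np
import pandas as pd

# Hypothetical feasibility results: one row per parcel.
feasible = pd.DataFrame(
    {"max_profit": [100.0, 300.0, 600.0], "net_units": [10, 20, 30]},
    index=pd.Index([11, 12, 13], name="parcel_id"),
)


def pick_weighted(df, target_units, seed=0):
    """Draw parcels without replacement, weighted by profit, and keep
    them until the cumulative net units reach target_units."""
    rng = np.random.default_rng(seed)
    # Assumption: selection probability is proportional to profit.
    probs = (df["max_profit"] / df["max_profit"].sum()).to_numpy()
    order = rng.choice(df.index.to_numpy(), size=len(df), replace=False, p=probs)
    cum_units = df.loc[order, "net_units"].cumsum()
    # Keep every parcel up to and including the one that meets the target.
    n_keep = int((cum_units < target_units).sum()) + 1
    return df.loc[order[:n_keep]]


chosen = pick_weighted(feasible, target_units=25)
```

More profitable parcels are more likely to appear early in the draw, but any parcel with positive profit can be chosen, which is where the stochastic variation in results comes from.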

Square Foot Pro Forma API

class urbansim.developer.sqftproforma.SqFtProForma(config=None)[source]

Initialize the square foot based pro forma.

This pro forma has no representation of units: it does not differentiate between the rent attained by 1BR, 2BR, or 3BR units and change the rents accordingly. This is largely because it is difficult to get information on the unit mix in an existing building in order to compute its acquisition cost. Thus rents and costs per sqft are used for new and current buildings, which assumes there is a constant return on increasing and decreasing unit sizes, an extremely useful simplifying assumption above the project scale (i.e. city or regional scale).


The configuration object, which should be an instance of SqFtProFormaConfig. The configuration options for this pro forma are documented on the configuration object.

get_ave_cost_sqft(self, form, parking_config)[source]

Get the average cost per sqft for the pro forma for a given form


Get a series representing the average cost per sqft for each form in the config


The parking configuration to get debug info for


A series where the index is the FAR and the values are the average cost per sqft at which the building is “break even” given the configuration parameters that were passed at run time.

get_debug_info(self, form, parking_config)[source]

Get the debug info after running the pro forma for a given form and parking configuration


The form to get debug info for


The parking configuration to get debug info for


A dataframe where the index is the FAR, with many columns representing intermediate steps in the pro forma computation. Additional documentation will be added at a later date, although many of the columns should be fairly self-explanatory.

lookup(self, form, df, only_built=True, pass_through=None)[source]

This function does the developer model lookups for all the actual input data.


One of the forms specified in the configuration file

df: dataframe

Pass in a single data frame which is indexed by parcel_id and has the following columns


Whether to return only those buildings that are profitable and allowed by zoning, or whether to return as much information as possible, even if unlikely to be built (can be used when development might be subsidized or when debugging)

pass_through: list of strings

List of field names to take from the input parcel frame and pass to the output feasibility frame. This is usually used for debugging purposes. These fields will be passed all the way through developer.

Input Dataframe Columns

A set of columns, one for each of the uses passed in the configuration. Values are yearly rents for that use. Typical column names would be “residential”, “retail”, “industrial” and “office”


A series representing the CURRENT yearly rent for each parcel. Used to compute acquisition costs for the parcel.


A series representing the parcel size for each parcel.


A series representing the maximum FAR allowed by zoning. Buildings will not be built above these FARs.


A series representing the maximum height allowed by zoning. Buildings will not be built above these heights. The more restrictive of the FAR and height limits is applied; if one of them is nan it is ignored, but nothing is built if both are nan.

max_dua: series, optional

A series representing the maximum dwelling units per acre allowed by zoning. If max_dua is passed, the average unit size should be passed below to translate from dua to floor space.

ave_unit_size: series, optional

This is required if max_dua is passed above, otherwise it is optional. This is the same as the parameter to Developer.pick() (it should be the same series).
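
The way the zoning columns above combine can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the height-to-FAR conversion (stories allowed by the height limit times parcel coverage), the dua-to-FAR conversion (which ignores any efficiency adjustment), the use of feet, and every numeric value are assumptions.

```python
import numpy as np
import pandas as pd

SQFT_PER_ACRE = 43560.0

# Hypothetical config values (not library defaults).
height_per_story = 12.0
parcel_coverage = 0.8

parcels = pd.DataFrame(
    {
        "max_far": [2.0, np.nan, np.nan],
        "max_height": [60.0, 36.0, np.nan],
    },
    index=pd.Index([1, 2, 3], name="parcel_id"),
)

# Assumed conversion: stories allowed by height times share of parcel covered.
far_from_height = parcels["max_height"] / height_per_story * parcel_coverage

# np.fmin ignores a nan on one side and returns nan only if both are nan,
# which matches the rule stated for max_far and max_height.
effective_far = pd.Series(
    np.fmin(parcels["max_far"].to_numpy(), far_from_height.to_numpy()),
    index=parcels.index,
)

# Assumed dua -> FAR conversion: units per acre times sqft per unit,
# divided by the square feet in an acre.
max_dua = pd.Series([40.0, 20.0, 10.0], index=parcels.index)
ave_unit_size = pd.Series([1089.0, 1089.0, 1089.0], index=parcels.index)
far_from_dua = max_dua * ave_unit_size / SQFT_PER_ACRE
```

Parcel 3 has neither limit, so its effective FAR stays nan and it would not be built.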

index: Series, int

parcel identifiers

building_sqft: Series, float

The number of square feet for the building to build. Keep in mind this includes parking and common space. A helper function is needed to convert from gross square feet to actual usable square feet in residential units.

building_cost: Series, float

The cost of constructing the building as given by the ave_cost_per_sqft from the cost model (for this FAR) and the number of square feet.

total_cost: Series, float

The cost of constructing the building plus the cost of acquisition of the current parcel/building.

building_revenue: Series, float

The NPV of the revenue for the building to be built, which is the number of square feet times the yearly rent divided by the cap rate (with a few adjustment factors including building efficiency).

max_profit_far: Series, float

The FAR of the maximum profit building (constrained by the max_far and max_height from the input dataframe).

max_profit: Series, float

The profit for the maximum profit building (constrained by the max_far and max_height from the input dataframe).
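
The output columns described above can be tied together with a small vectorized sketch. It follows the definitions in the text (cost is square feet times cost per sqft, revenue is square feet times yearly rent divided by the cap rate with a building efficiency adjustment), but it is a simplification under assumed inputs: parking, the profit factor, and height limits are left out, and all variable names and numbers are hypothetical.

```python
import numpy as np

fars = np.array([1.0, 2.0, 3.0])                  # FARs to test
cost_per_sqft = np.array([160.0, 175.0, 200.0])   # assumed cost at each FAR

# Two hypothetical parcels.
parcel_size = np.array([10000.0, 5000.0])         # sqft of land
rent = np.array([15.0, 10.0])                     # yearly rent per sqft
acquisition = np.array([500000.0, 100000.0])      # cost of current parcel/building
max_far = np.array([2.0, 3.0])                    # zoning limit

cap_rate, efficiency = 0.05, 0.8

# Broadcasting gives arrays of shape (n_parcels, n_fars).
building_sqft = parcel_size[:, None] * fars[None, :]
building_cost = building_sqft * cost_per_sqft[None, :]
total_cost = building_cost + acquisition[:, None]
building_revenue = building_sqft * efficiency * rent[:, None] / cap_rate
profit = building_revenue - total_cost

# FARs above the zoning limit are not allowed.
profit[fars[None, :] > max_far[:, None]] = np.nan

best = np.nanargmax(profit, axis=1)
max_profit_far = fars[best]
max_profit = profit[np.arange(len(best)), best]
feasible = max_profit > 0
```

Because every step is an array operation, the same code handles two parcels or millions. In this example the first parcel is most profitable at the highest FAR its zoning allows, while the second loses money at every FAR and so would not be feasible.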

class urbansim.developer.sqftproforma.SqFtProFormaConfig[source]

This class encapsulates the configuration options for the square foot based pro forma.


A list of parcel sizes to test. Interestingly, right now the parcel sizes cancel in this style of pro forma computation so you can set this to something reasonable for debugging purposes - e.g. [10000]. All sizes can be feet or meters as long as they are consistently used.


A list of floor area ratios to use. FAR is a multiple of the parcel size that is the total building bulk that is allowed by zoning on the site. In this case, all of these ratios will be tested regardless of zoning and the zoning test will be performed later.


A list of space uses to use within a building. These are mixed into forms. Generally speaking, you should only have uses for which you have estimated (or observed) values for rents in the building. By default, uses are retail, industrial, office, and residential.


A dictionary where keys are names for the form and values are also dictionaries, where keys are uses and values are the proportion of that use in this form. The values of each inner dictionary should sum to 1.0. For instance, a form called “residential” might have a dict of space allocations equal to {“residential”: 1.0}, while a form called “mixedresidential” might have a dictionary of space allocations equal to {“retail”: 0.1, “residential”: 0.9}, which is 90% residential and 10% retail.


A dict of rates per thousand square feet where keys are the uses from the list specified in the attribute above. The ratios are typically in the range 0.5 - 3.0 or similar. So for instance, a key-value pair of “retail”: 2.0 would be two parking spaces per 1,000 square feet of retail. This is a per square foot pro forma, so the more typical parking ratio of spaces per residential unit must be converted to square feet for use in this pro forma.


The number of square feet per unit for use in the parking_rates above. By default this is set to 1,000 but can be overridden.


An expert parameter that is usually left unchanged. By default it is set to [‘surface’, ‘deck’, ‘underground’], and substantially different computations are performed for each of these parking configurations. Changing this array will generally break things, but an item can be removed if that parking configuration should not be tested.


A dictionary where keys are the three parking configurations listed above and values are square foot uses of parking spaces in that configuration. This is to capture the fact that surface parking is usually more space intensive than deck or underground parking.


The parking cost for each parking configuration. Keys are the name of the three parking configurations listed above and values are dollars PER SQUARE FOOT for parking in that configuration. Used to capture the fact that underground and deck are far more expensive than surface parking.


A list of “break points” as heights at which construction becomes more expensive. Generally these are the heights at which construction materials change from wood, to concrete, to steel. Costs are also given as lists by use for each of these break points and are considered to be valid up to the break point. A list would look something like [15, 55, 120, np.inf].


The keys are uses from the attribute above and the values are a list of floating point numbers of the same length as the height_for_costs attribute. A key-value pair of “residential”: [160.0, 175.0, 200.0, 230.0] would say that the residential use costs $160/sqft up to 15ft in total height for the building, $175/sqft up to 55ft, $200/sqft up to 120ft, and $230/sqft beyond. A final value in the height_for_costs array of np.inf is typical.


The per-story height for the building used to turn an FAR into an actual height.


The maximum height of retail buildings to consider.


The maximum height of industrial buildings to consider.


The ratio of profit a developer expects to make above the break even rent. Should be greater than 1.0, e.g. a 10% profit would be a profit factor of 1.1.


The efficiency of the building. This turns total FAR into the amount of space which gets a square foot rent. The entire building gets the cost of course.


The ratio of the building footprint to the parcel size. Also used to turn an FAR into a height to cost properly.


The rate an investor is willing to pay for a cash flow per year. This means $1/year is equivalent to 1/cap_rate present dollars. This is a macroeconomic input that is widely available on the internet.
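
A few of the configuration conventions above can be checked with plain Python. The sketch below is only illustrative: it does not use SqFtProFormaConfig, the helper names are hypothetical, and it assumes a cost tier applies up to and including its break point.

```python
import numpy as np

# Form definitions: each form's use proportions should sum to 1.0.
forms = {
    "residential": {"residential": 1.0},
    "mixedresidential": {"retail": 0.1, "residential": 0.9},
}
for name, mix in forms.items():
    assert abs(sum(mix.values()) - 1.0) < 1e-9, name

# Height break points and the residential costs from the example above.
height_for_costs = [15, 55, 120, np.inf]
residential_costs = [160.0, 175.0, 200.0, 230.0]


def cost_per_sqft(height):
    """Return the cost tier for a building of the given total height in feet.

    Assumption: each cost is valid up to and including its break point.
    """
    tier = np.searchsorted(height_for_costs, height, side="left")
    return residential_costs[tier]


def spaces_per_thousand_sqft(spaces_per_unit, unit_sqft):
    """Convert a per-unit parking ratio to spaces per 1,000 square feet."""
    return spaces_per_unit * 1000.0 / unit_sqft


low_rise = cost_per_sqft(40)                          # second tier
tower = cost_per_sqft(300)                            # last tier
parking_rate = spaces_per_thousand_sqft(1.0, 800.0)   # 1 space per 800 sqft unit
```

With one space per 800 square foot unit, the equivalent rate is 1.25 spaces per 1,000 square feet, which is the form the parking rates expect.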

Developer Model API

class urbansim.developer.developer.Developer(feasibility)[source]

Pass the dataframe that is returned by feasibility here.

It can also be a dictionary where keys are building forms and values are the individual data frames returned by the pro forma lookup routine.

static compute_units_to_build(num_agents, num_units, target_vacancy)[source]

Compute number of units to build to match target vacancy.


number of agents that need units in the region


number of units in buildings

target_vacancy: float (0-1.0)

target vacancy rate


the number of units that need to be built
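
One plausible reading of this computation is sketched below. The formula is an assumption for illustration (enough units that the agents fill all but the target vacancy share of the stock, never a negative number), and `units_to_build` is a hypothetical stand-in rather than the method itself.

```python
def units_to_build(num_agents, num_units, target_vacancy):
    """Units to add so that num_agents occupy (1 - target_vacancy) of the stock."""
    needed = num_agents / (1.0 - target_vacancy)
    # Never return a negative number when the stock is already large enough.
    return int(max(needed - num_units, 0))


shortfall = units_to_build(num_agents=950, num_units=900, target_vacancy=0.05)
surplus = units_to_build(num_agents=100, num_units=500, target_vacancy=0.05)
```

In the first call 950 agents at a 5% target vacancy need 1,000 units, so 100 more are required; in the second the existing stock already exceeds the need and nothing is built.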

keep_form_with_max_profit(self, forms=None)[source]

This converts the dataframe, which shows all profitable forms, to the form with the greatest profit, so that more profitable forms outcompete less profitable forms.

forms: list of strings

List of forms which compete with each other. Some forms can be left out.

Nothing. Goes from a multi-index to a single index with only the most profitable form.

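
The idea of keeping only the most profitable form per parcel can be shown with plain pandas. This is a sketch on hypothetical data, not the method's actual implementation, and the layout of the input (a dictionary of per-form frames indexed by parcel_id) is an assumption.

```python
import pandas as pd

# Hypothetical feasibility results keyed by form, each indexed by parcel_id.
by_form = {
    "residential": pd.DataFrame({"max_profit": [100.0, 50.0]}, index=[1, 2]),
    "office": pd.DataFrame({"max_profit": [80.0, 90.0]}, index=[1, 3]),
}

# Stack into one long frame with "form" and "parcel_id" columns.
stacked = pd.concat(by_form, names=["form", "parcel_id"]).reset_index()

# For each parcel keep the row with the highest profit.
best_rows = stacked.groupby("parcel_id")["max_profit"].idxmax()
best = stacked.loc[best_rows].set_index("parcel_id").sort_index()
```

Parcel 1 is feasible under both forms and keeps the residential row because its profit is higher; parcels 2 and 3 each have only one candidate.
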
static merge(old_df, new_df, return_index=False)[source]

Merge two dataframes of buildings. The old dataframe is usually the buildings dataset and the new dataframe is a modified (by the user) version of what is returned by the pick method.


Current set of buildings


New buildings to add, usually comes from this module


If return_index is true, this method will return the new index of new_df (which changes in order to create a unique index after the merge)


Combined DataFrame of buildings, makes sure indexes don’t overlap


If and only if return_index is True, return the new index for the new_df dataframe (which changes in order to create a unique index after the merge)
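
The index handling described for this method can be sketched with pandas alone. The helper below is a hypothetical stand-in, assuming integer building ids and new ids that start at the maximum old id plus one.

```python
import pandas as pd

old = pd.DataFrame(
    {"year_built": [1990, 2005]},
    index=pd.Index([7, 12], name="building_id"),
)
new = pd.DataFrame({"year_built": [2020, 2020, 2020]}, index=[0, 1, 2])


def merge_buildings(old_df, new_df):
    """Append new_df to old_df, giving the new rows ids above the old maximum."""
    start = int(old_df.index.max()) + 1
    new_df = new_df.copy()
    new_df.index = pd.RangeIndex(start, start + len(new_df), name=old_df.index.name)
    # verify_integrity raises if any id would be duplicated.
    combined = pd.concat([old_df, new_df], verify_integrity=True)
    return combined, new_df.index


all_buildings, new_index = merge_buildings(old, new)
```

The old ids 7 and 12 are preserved and the three new buildings receive ids 13, 14, and 15.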

pick(self, form, target_units, parcel_size, ave_unit_size, current_units, max_parcel_size=200000, min_unit_size=400, drop_after_build=True, residential=True, bldg_sqft_per_job=400.0, profit_to_prob_func=None)[source]

Choose the buildings from the list that are feasible to build in order to match the specified demand.

form: string or list

One or more of the building forms from the pro forma specification, e.g. “residential” or “mixedresidential”. These are configuration parameters passed previously to the pro forma. If more than one form is passed, the forms compete with each other (based on profitability) for which one gets built in order to meet demand.


The number of units to build. For non-residential buildings this should be passed as the number of job spaces that need to be created.


The size of the parcels. This was passed to feasibility, but should be passed here as well. Index should be parcel_ids.


The average residential unit size around each parcel - this is indexed by parcel, but is usually a disaggregated version of a zonal or accessibility aggregation.

bldg_sqft_per_job: float (default 400.0)

The average square feet per job for this building form.


Values less than this number in ave_unit_size will be set to this number. Deals with cases where units are currently not built.


The current number of units on the parcel. Is used to compute the net number of units produced by the developer model. Many times the developer model is redeveloping units (demolishing them) and is trying to meet a total number of net units produced.


Parcels larger than this size will not be considered for development - usually large parcels should be specified manually in a development projects table.


Whether or not to drop parcels from consideration after they have been chosen for development. Usually this is true so as to not develop the same parcel twice.

residential: bool

If creating non-residential buildings set this to false and developer will fill in job_spaces rather than residential_units

profit_to_prob_func: function

As there are so many ways to turn the development feasibility into a probability to select it for building, the user may pass a function which takes the feasibility dataframe and returns a series of probabilities. If no function is passed, the default behavior of this method is used.

None if there are no feasible buildings

DataFrame of buildings to add. These buildings are rows from the DataFrame that is returned from feasibility.
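
A function suitable for profit_to_prob_func might look like the sketch below. The squared-profit weighting and the function name are hypothetical choices for illustration, and the sketch assumes the dataframe passed in has a max_profit column as described for the feasibility output.

```python
import pandas as pd


def squared_profit_to_prob(df):
    """Hypothetical callback: weight parcels by squared profit so that the
    most profitable parcels are favored more strongly than a linear weighting."""
    score = df["max_profit"].clip(lower=0) ** 2
    return score / score.sum()


# Small hypothetical feasibility frame to exercise the callback.
feasibility = pd.DataFrame({"max_profit": [100.0, 200.0, 0.0]}, index=[1, 2, 3])
probs = squared_profit_to_prob(feasibility)
```

The returned series shares the index of the input and sums to 1.0, which is what a weighted random choice needs.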