Distance utilities API
ChoiceModels also includes tools for constructing pairwise distance matrices and calculating which geographies are within various distance bands of some reference geography.
Distance matrices

choicemodels.tools.great_circle_distance_matrix(df, x, y, earth_radius=6371009, return_int=True)
Calculate a pairwise great-circle distance matrix from a DataFrame of points. Distances returned are in units of earth_radius (default is meters).
 Parameters
df (pandas DataFrame) – a DataFrame of points, uniquely indexed by place identifier (e.g., tract ID or parcel ID), represented by x and y coordinate columns
x (str) – label of the x coordinate column in the DataFrame
y (str) – label of the y coordinate column in the DataFrame
earth_radius (numeric) – radius of earth in units in which distance will be returned (default is meters)
return_int (bool) – if True, convert all distances to integers
 Returns
Multi-indexed distance vector in units of earth_radius, with top-level index representing “from” and second-level index representing “to”.
 Return type
pandas Series
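
To make the return format concrete, the following is a minimal sketch of a pairwise great-circle calculation that produces a Series indexed by (“from”, “to”). It is not the library's implementation: the haversine formula, the rounding used for return_int, the function name haversine_matrix_sketch, and the sample points are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def haversine_matrix_sketch(df, x, y, earth_radius=6371009, return_int=True):
    """Illustrative only: pairwise haversine distances as a multi-indexed Series."""
    lng = np.radians(df[x].to_numpy(dtype=float))
    lat = np.radians(df[y].to_numpy(dtype=float))

    # Broadcasting builds an n-by-n grid covering every (from, to) pair.
    dlat = lat[:, None] - lat[None, :]
    dlng = lng[:, None] - lng[None, :]
    h = (np.sin(dlat / 2) ** 2
         + np.cos(lat)[:, None] * np.cos(lat)[None, :] * np.sin(dlng / 2) ** 2)
    dist = 2 * earth_radius * np.arcsin(np.sqrt(np.clip(h, 0, 1)))

    if return_int:
        # Assumption: round to the nearest integer unit.
        dist = dist.round().astype(int)

    index = pd.MultiIndex.from_product([df.index, df.index], names=["from", "to"])
    return pd.Series(dist.ravel(), index=index)


# Hypothetical points: two on the equator and one at the north pole.
points = pd.DataFrame(
    {"lng": [0.0, 0.0, 90.0], "lat": [0.0, 90.0, 0.0]},
    index=["a", "b", "c"],
)
dists = haversine_matrix_sketch(points, x="lng", y="lat")
```

With the default earth_radius, each pair of distinct points here is a quarter of a great circle apart, roughly 10,000 km expressed in meters.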

choicemodels.tools.euclidean_distance_matrix(df)
Calculate a pairwise Euclidean distance matrix from a DataFrame of points. Distances returned are in units of x and y columns.
 Parameters
df (pandas DataFrame) – a DataFrame of points, uniquely indexed by place identifier (e.g., tract ID or parcel ID), represented by x and y coordinate columns
 Returns
Multi-indexed distance vector in units of df’s values, with top-level index representing “from” and second-level index representing “to”.
 Return type
pandas Series
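
As an illustration of the Euclidean case, the sketch below computes all pairwise distances with NumPy broadcasting and returns them in the same (“from”, “to”) layout. It is an independent sketch, not the library's code: the function name euclidean_matrix_sketch and the sample data are hypothetical, and it assumes that every column of df is a coordinate, since the documented signature takes no column labels.

```python
import numpy as np
import pandas as pd


def euclidean_matrix_sketch(df):
    """Illustrative only: pairwise Euclidean distances as a multi-indexed Series."""
    # Assumption: every column of df is a coordinate.
    coords = df.to_numpy(dtype=float)

    # diff[i, j, :] is the coordinate difference between point i and point j.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    index = pd.MultiIndex.from_product([df.index, df.index], names=["from", "to"])
    return pd.Series(dist.ravel(), index=index)


# Hypothetical points in arbitrary planar units.
points = pd.DataFrame(
    {"x": [0.0, 3.0, 0.0], "y": [0.0, 4.0, 1.0]},
    index=["p1", "p2", "p3"],
)
dists = euclidean_matrix_sketch(points)
```

Here the distance from p1 to p2 is 5 units (a 3-4-5 triangle), expressed in whatever units the x and y columns use.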

choicemodels.tools.distance_matrix(df, method='euclidean', x='lng', y='lat', earth_radius=6371009, return_int=True)
Calculate a pairwise distance matrix from a DataFrame of two-dimensional points.
 Parameters
df (pandas DataFrame) – a DataFrame of points, uniquely indexed by place identifier (e.g., tract ID or parcel ID), represented by x and y coordinate columns
method (str) – {‘euclidean’, ‘greatcircle’, ‘network’} which algorithm to use for calculating pairwise distances
x (str) – if method=’greatcircle’ or ‘network’, label of the x coordinate column in the DataFrame
y (str) – if method=’greatcircle’ or ‘network’, label of the y coordinate column in the DataFrame
earth_radius (numeric) – if method=’greatcircle’, radius of earth in units in which distance will be returned (default is meters)
return_int (bool) – if method=’greatcircle’ and return_int is True, convert all distances to integers
 Returns
Multi-indexed distance vector in units of df’s values, with top-level index representing “from” and second-level index representing “to”.
 Return type
pandas Series
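
The multi-indexed return format described above can be queried with ordinary pandas indexing. The sketch below builds a small distance vector by hand (the IDs and distances are made-up values, not library output) and shows three common lookups: a single pair, all destinations from one origin, and a reshape into a square matrix.

```python
import pandas as pd

# Hand-built stand-in for a distance vector; values are hypothetical.
ids = ["a", "b", "c"]
index = pd.MultiIndex.from_product([ids, ids], names=["from", "to"])
dist_vector = pd.Series([0, 5, 12, 5, 0, 9, 12, 9, 0], index=index)

# Distance for a single (from, to) pair.
d_ab = dist_vector.loc[("a", "b")]

# All distances from origin "a", indexed by destination.
from_a = dist_vector.loc["a"]

# Square matrix: rows are "from", columns are "to".
square = dist_vector.unstack("to")
```

Keeping distances in long form makes it cheap to filter by threshold, while unstack recovers the conventional matrix layout when one is needed.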
Distance bands

choicemodels.tools.distance_bands(dist_vector, distances)
Identify all geographies located within each distance band of each geography.
The list of distances is treated pairwise to create distance bands, with the first element of each pair forming the band’s inclusive lower limit and the second element of each pair forming the band’s exclusive upper limit. For example, if distances=[0, 10, 30], band 0 will contain all geographies with a distance >= 0 and < 10 units (e.g., meters) from the reference geography, and band 1 will contain all geographies with a distance >= 10 and < 30 units from the reference geography.
To make the final distance band include all geographies beyond a certain distance, make the final value in the distances list np.inf.
 Parameters
dist_vector (pandas Series) – Multi-indexed distance vector (for example, the output of distance_matrix), with top-level index representing “from” and second-level index representing “to”.
distances (list) – a list of distance band increments
 Returns
A Series multi-indexed by geography ID and distance band number, whose values are arrays of the geography IDs that fall within that band of that geography.
 Return type
pandas Series
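
The banding rule described above (consecutive pairs of distances forming bands with an inclusive lower and exclusive upper limit, with np.inf closing an open-ended final band) can be sketched as follows. This is an illustration of the rule, not the library's implementation: the function name distance_bands_sketch, the index level names, and the sample distances are assumptions. In this sketch a geography appears in its own band 0 because its distance to itself is 0, and empty bands are omitted.

```python
import numpy as np
import pandas as pd


def distance_bands_sketch(dist_vector, distances):
    """Illustrative only: group destination IDs into [lower, upper) bands."""
    bands = {}
    # Consecutive pairs of values define each band's limits.
    for band, (lower, upper) in enumerate(zip(distances[:-1], distances[1:])):
        in_band = dist_vector[(dist_vector >= lower) & (dist_vector < upper)]
        for origin, group in in_band.groupby(level=0):
            bands[(origin, band)] = group.index.get_level_values(1).to_numpy()

    # Store each array as a single element of an object-dtype Series.
    values = np.empty(len(bands), dtype=object)
    for i, ids in enumerate(bands.values()):
        values[i] = ids
    index = pd.MultiIndex.from_tuples(list(bands), names=["geography", "band"])
    return pd.Series(values, index=index).sort_index()


# Hypothetical distance vector for three geographies.
ids = ["a", "b", "c"]
index = pd.MultiIndex.from_product([ids, ids], names=["from", "to"])
dist_vector = pd.Series([0, 5, 12, 5, 0, 40, 12, 40, 0], index=index)

# Bands: [0, 10), [10, 30), and [30, infinity).
result = distance_bands_sketch(dist_vector, [0, 10, 30, np.inf])
```

For geography "a", band 0 holds "a" and "b" and band 1 holds "c". Geography "b" has nothing between 10 and 30 units, so this sketch leaves the ("b", 1) entry out.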