Distance utilities API
ChoiceModels also includes tools for constructing pairwise distance matrices and calculating which geographies are within various distance bands of some reference geography.
Distance matrices
choicemodels.tools.great_circle_distance_matrix(df, x, y, earth_radius=6371009, return_int=True)
Calculate a pairwise great-circle distance matrix from a DataFrame of points. Distances are returned in the units of earth_radius (meters by default).
- Parameters
df (pandas DataFrame) – a DataFrame of points, uniquely indexed by place identifier (e.g., tract ID or parcel ID), represented by x and y coordinate columns
x (str) – label of the x coordinate column in the DataFrame
y (str) – label of the y coordinate column in the DataFrame
earth_radius (numeric) – radius of earth in units in which distance will be returned (default is meters)
return_int (bool) – if True, convert all distances to integers
- Returns
Multi-indexed distance vector in the units of earth_radius (meters by default), with top-level index representing “from” and second-level index representing “to”.
- Return type
pandas Series
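To make the role of earth_radius concrete, here is a minimal sketch of a pairwise great-circle calculation using the haversine formula. It is an illustration only, not the ChoiceModels implementation: the haversine helper, the sample points, and the plain dict keyed by (from, to) are all hypothetical, and it assumes x is longitude and y is latitude in decimal degrees.

```python
import math
from itertools import product


def haversine(lat1, lng1, lat2, lng2, earth_radius=6371009):
    """Great-circle distance between two points given in decimal degrees.

    The result is in the same units as earth_radius (meters by default).
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lng2 - lng1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * earth_radius * math.asin(math.sqrt(a))


# Hypothetical points keyed by place ID: (x=longitude, y=latitude)
points = {"a": (0.0, 0.0), "b": (90.0, 0.0), "c": (0.0, 90.0)}

# Long-format result keyed by (from, to), mirroring a two-level index
pairwise = {
    (i, j): haversine(points[i][1], points[i][0], points[j][1], points[j][0])
    for i, j in product(points, repeat=2)
}
```

Passing a radius in other units, for example earth_radius=6371.009, would return kilometers instead of meters.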
choicemodels.tools.euclidean_distance_matrix(df)
Calculate a pairwise Euclidean distance matrix from a DataFrame of points. Distances are returned in the units of the x and y columns.
- Parameters
df (pandas DataFrame) – a DataFrame of points, uniquely indexed by place identifier (e.g., tract ID or parcel ID), represented by x and y coordinate columns
- Returns
Multi-indexed distance vector in units of df’s values, with top-level index representing “from” and second-level index representing “to”.
- Return type
pandas Series
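The shape of the returned Series can be sketched with NumPy and pandas alone. This is a hedged illustration of a long-format, (from, to)-indexed distance vector, not the library's own code; the sample DataFrame, the column names "x" and "y", and the level names "from" and "to" are assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical projected coordinates (e.g., meters), indexed by place ID
df = pd.DataFrame(
    {"x": [0.0, 3.0, 0.0], "y": [0.0, 4.0, 8.0]},
    index=["a", "b", "c"],
)

coords = df[["x", "y"]].to_numpy()

# Broadcast to an (n, n, 2) array of coordinate differences,
# then reduce to an (n, n) matrix of Euclidean distances
diff = coords[:, None, :] - coords[None, :, :]
square = np.sqrt((diff ** 2).sum(axis=-1))

# Flatten row by row into a Series indexed by (from, to)
index = pd.MultiIndex.from_product([df.index, df.index], names=["from", "to"])
dist_vector = pd.Series(square.ravel(), index=index)
```

With this layout, dist_vector.loc["a"] selects every distance measured from place "a", and dist_vector.loc[("a", "b")] selects a single pair.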
choicemodels.tools.distance_matrix(df, method='euclidean', x='lng', y='lat', earth_radius=6371009, return_int=True)
Calculate a pairwise distance matrix from a DataFrame of two-dimensional points.
- Parameters
df (pandas DataFrame) – a DataFrame of points, uniquely indexed by place identifier (e.g., tract ID or parcel ID), represented by x and y coordinate columns
method (str) – {'euclidean', 'greatcircle', 'network'} which algorithm to use for calculating pairwise distances
x (str) – if method='greatcircle' or 'network', label of the x coordinate column in the DataFrame
y (str) – if method='greatcircle' or 'network', label of the y coordinate column in the DataFrame
earth_radius (numeric) – if method='greatcircle', radius of earth in units in which distance will be returned (default is meters)
return_int (bool) – if method='greatcircle' and return_int is True, convert all distances to integers
- Returns
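Multi-indexed distance vector, with top-level index representing “from” and second-level index representing “to”. Distances are in the units of the coordinate columns if method='euclidean', or in the units of earth_radius if method='greatcircle'.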
- Return type
pandas Series
Distance bands
choicemodels.tools.distance_bands(dist_vector, distances)
Identify all geographies located within each distance band of each geography.
The list of distances is treated pairwise to create distance bands, with the first element of each pair forming the band’s inclusive lower limit and the second element of each pair forming the band’s exclusive upper limit. For example, if distances=[0, 10, 30], band 0 will contain all geographies with a distance >= 0 and < 10 units (e.g., meters) from the reference geography, and band 1 will contain all geographies with a distance >= 10 and < 30 units from the reference geography.
To make the final distance band include all geographies beyond a certain distance, make the final value in the distances list np.inf.
- Parameters
dist_vector (pandas Series) – Multi-indexed distance vector, such as the output of distance_matrix, with top-level index representing “from” and second-level index representing “to”.
distances (list) – a list of distance band boundaries, in the same units as dist_vector, treated pairwise as described above
- Returns
a Series multi-indexed by geography ID and distance band number, whose values are arrays of the IDs of the geographies that fall within that band of that geography
- Return type
pandas Series
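The pairwise band rule described above (inclusive lower limit, exclusive upper limit) can be sketched in plain Python. This is only an illustration of the rule, not the ChoiceModels implementation: the sample distances and the dict keyed by (from, to) are hypothetical stand-ins for the multi-indexed Series, and math.inf plays the role of np.inf.

```python
import math
from bisect import bisect_right
from collections import defaultdict

# Band boundaries: band 0 is [0, 10), band 1 is [10, 30), band 2 is [30, inf)
distances = [0, 10, 30, math.inf]

# Hypothetical long-format distances keyed by (from, to)
dist_vector = {
    ("a", "a"): 0, ("a", "b"): 5, ("a", "c"): 25, ("a", "d"): 120,
    ("b", "a"): 5, ("b", "b"): 0, ("b", "c"): 10, ("b", "d"): 30,
}

bands = defaultdict(list)
for (origin, dest), d in dist_vector.items():
    # bisect_right places a value equal to a boundary in the band that
    # starts at that boundary, which gives the inclusive lower limit
    band = bisect_right(distances, d) - 1
    # Skip values below the first boundary or at/after the last one
    if 0 <= band < len(distances) - 1:
        bands[(origin, band)].append(dest)

bands = dict(bands)
```

In this sketch a distance of exactly 10 falls in band 1 and a distance of exactly 30 falls in band 2, and each geography appears in its own band 0 because its distance to itself is 0.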