roll_rate_analysis package

Module contents

Roll rate analysis for credit risk scorecards.

class roll_rate_analysis.MOMRollRateTable(month_i: LazyFrame | DataFrame | str | Path, month_i_plus_1: LazyFrame | DataFrame | str | Path, *, unique_key_col: str, delinquency_col: str, max_delq: int = 6, binary_cols: Sequence[str] = ())[source]

Bases: object

Month-over-month roll rate table for two consecutive monthly snapshots.

Parameters

month_i:

Data for month i. Accepts a polars LazyFrame/DataFrame or a path/string pointing to a CSV file.

month_i_plus_1:

Data for month i+1. Same supported types as month_i.

unique_key_col:

Name of the account identifier column. Must exist in both inputs.

delinquency_col:

Name of the delinquency column (integer months past due). Must exist in both inputs.

max_delq:

Largest delinquency level kept as its own row/column. Anything above rolls into the N+ bucket.

binary_cols:

Optional binary indicator columns to append to the matrix. Listed in descending priority: the first entry wins ties. Each indicator gets one extra row and column.

Use

>>> table = MOMRollRateTable(
...     "jan.csv", "feb.csv",
...     unique_key_col="id", delinquency_col="delq", max_delq=6,
... )
>>> matrix = table.compute()      # polars.DataFrame, the full transition matrix
>>> reduced = table.reduce()      # polars.DataFrame, roll_down / stable / roll_up

compute and reduce are idempotent; the matrix is cached after the first call. Both return polars DataFrames whose first column (from_state) holds the row label.
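To make the shape of the transition matrix concrete, here is a minimal pure-Python sketch of the counting step. It is not the library's implementation: the helper names cap and transition_counts, the dict inputs, the choice to count only accounts present in both months, and giving the overflow bucket its own index are all assumptions made for illustration.

```python
def cap(delq: int, max_delq: int) -> int:
    """Collapse anything above max_delq into one overflow index (assumed layout)."""
    return min(delq, max_delq + 1)


def transition_counts(month_i, month_i_plus_1, max_delq=6):
    """Count accounts moving from each capped state to each capped state.

    Both inputs are hypothetical dicts mapping account id -> months past due.
    Only accounts present in both months are counted (an assumption).
    """
    size = max_delq + 2  # levels 0..max_delq plus one overflow bucket
    matrix = [[0] * size for _ in range(size)]
    for key, before in month_i.items():
        if key in month_i_plus_1:
            after = month_i_plus_1[key]
            matrix[cap(before, max_delq)][cap(after, max_delq)] += 1
    return matrix


jan = {"a": 0, "b": 1, "c": 2, "d": 9}
feb = {"a": 0, "b": 2, "c": 1, "d": 10}
counts = transition_counts(jan, feb, max_delq=3)
```

Rows are the month i state and columns the month i+1 state, so account "d" (9 then 10 months past due) lands on the diagonal of the overflow bucket.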

compute() → DataFrame[source]

Compute the transition matrix and return it as a polars DataFrame.

property matrix: DataFrame

Return the cached transition matrix, computing it on first access.

reduce(percentages: bool = True) → DataFrame[source]

Return roll_down / stable / roll_up per row, in percentages or counts.
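The reduction can be sketched on a plain list-of-lists matrix. This is an illustration under stated assumptions, not the library's code: roll_down is taken to mean moves to a lower delinquency state (left of the diagonal), roll_up moves to a higher one, and percentages are expressed on a 0 to 100 scale. The function name reduce_matrix is hypothetical.

```python
def reduce_matrix(matrix, percentages=True):
    """Collapse each row of a square transition matrix into three numbers.

    Returns a list of (roll_down, stable, roll_up) tuples, one per row.
    """
    reduced = []
    for i, row in enumerate(matrix):
        roll_down = sum(row[:i])      # columns left of the diagonal
        stable = row[i]               # the diagonal cell
        roll_up = sum(row[i + 1:])    # columns right of the diagonal
        total = roll_down + stable + roll_up
        if percentages:
            if total == 0:
                # Empty rows stay at zero rather than dividing by zero.
                reduced.append((0.0, 0.0, 0.0))
            else:
                reduced.append(tuple(100.0 * v / total
                                     for v in (roll_down, stable, roll_up)))
        else:
            reduced.append((roll_down, stable, roll_up))
    return reduced


example = [
    [8, 2, 0],
    [1, 3, 1],
    [0, 0, 0],
]
as_counts = reduce_matrix(example, percentages=False)
as_percent = reduce_matrix(example, percentages=True)
```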

class roll_rate_analysis.SnapshotRollRateTable(snapshot: LazyFrame | DataFrame | str | Path, observation: Sequence[LazyFrame | DataFrame | str | Path], performance: Sequence[LazyFrame | DataFrame | str | Path], *, unique_key_col: str, delinquency_col: str, max_delq: int = 6, detailed: bool = False, granularity: int = 1, keep_cols: Sequence[str] | None = None)[source]

Bases: object

Roll rate table for a snapshot month with observation and performance windows.

For every account in the snapshot, the observation window is reduced to its maximum delinquency across the supplied observation files, and similarly for the performance window. The resulting transition matrix has rows indexed by the observation max-delinquency and columns indexed by the performance max-delinquency.
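The window reduction described above can be sketched in plain Python. The dict-based inputs and the helper name window_max are hypothetical, and skipping accounts that never appear in a window is an assumption; the library may treat missing accounts differently.

```python
from collections import Counter


def window_max(frames, universe):
    """Worst delinquency per account across a window of monthly dicts."""
    worst = {}
    for key in universe:
        values = [frame[key] for frame in frames if key in frame]
        if values:  # accounts absent from every frame are skipped (assumption)
            worst[key] = max(values)
    return worst


snapshot = ["a", "b", "c"]  # defines the account universe
observation = [{"a": 0, "b": 1, "c": 3}, {"a": 1, "b": 0, "c": 2}]
performance = [{"a": 0, "b": 2, "c": 4}, {"a": 0, "b": 1, "c": 1, "z": 5}]

obs_max = window_max(observation, snapshot)
perf_max = window_max(performance, snapshot)

# (row, column) = (observation max-delinquency, performance max-delinquency)
transitions = Counter(
    (obs_max[k], perf_max[k])
    for k in snapshot
    if k in obs_max and k in perf_max
)
```

Account "z" appears only in a performance frame and is not in the snapshot, so it never enters the matrix.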

Parameters

snapshot:

Data for the snapshot month (defines the account universe). Accepts a polars LazyFrame/DataFrame or a path/string pointing to a CSV.

observation:

Sequence of frames or paths forming the observation window.

performance:

Sequence of frames or paths forming the performance window.

unique_key_col:

Name of the account identifier column. Must exist in every input.

delinquency_col:

Name of the delinquency column. Must exist in every observation and performance frame.

max_delq:

Largest delinquency level kept as its own row/column. Anything above rolls into the N+ bucket.

detailed:

If True, split delinquency levels 3 and 4 into sub-rows (granularity of them per level) according to how many times the account hit that level during the observation window.

granularity:

Number of sub-rows per detailed level. Must be ≥ 2 when detailed is True.

keep_cols:

Optional column whitelist applied to each observation/performance frame before joining (memory optimisation). Must include delinquency_col.

Use

>>> table = SnapshotRollRateTable(
...     "snap.csv",
...     ["obs1.csv", "obs2.csv"],
...     ["perf1.csv", "perf2.csv"],
...     unique_key_col="id",
...     delinquency_col="delq",
...     detailed=True,
...     granularity=2,
... )
>>> matrix = table.compute()      # polars.DataFrame, the full transition matrix
>>> reduced = table.reduce()      # polars.DataFrame, roll_down / stable / roll_up

compute and reduce are idempotent; the matrix is cached after the first call.

compute() → DataFrame[source]

Compute the transition matrix and return it as a polars DataFrame.

property extra_rows: int

Number of additional rows beyond max_delq + 1 due to detailed mode.

property matrix: DataFrame

Return the cached transition matrix, computing it on first access.

reduce(percentages: bool = True) → DataFrame[source]

Return roll_down / stable / roll_up per row, in percentages or counts.

roll_rate_analysis.mom module

Month-over-month roll rate table.

class roll_rate_analysis.mom.MOMRollRateTable(month_i: LazyFrame | DataFrame | str | Path, month_i_plus_1: LazyFrame | DataFrame | str | Path, *, unique_key_col: str, delinquency_col: str, max_delq: int = 6, binary_cols: Sequence[str] = ())[source]

Bases: object

Month-over-month roll rate table for two consecutive monthly snapshots.

Parameters

month_i:

Data for month i. Accepts a polars LazyFrame/DataFrame or a path/string pointing to a CSV file.

month_i_plus_1:

Data for month i+1. Same supported types as month_i.

unique_key_col:

Name of the account identifier column. Must exist in both inputs.

delinquency_col:

Name of the delinquency column (integer months past due). Must exist in both inputs.

max_delq:

Largest delinquency level kept as its own row/column. Anything above rolls into the N+ bucket.

binary_cols:

Optional binary indicator columns to append to the matrix. Listed in descending priority: the first entry wins ties. Each indicator gets one extra row and column.

Use

>>> table = MOMRollRateTable(
...     "jan.csv", "feb.csv",
...     unique_key_col="id", delinquency_col="delq", max_delq=6,
... )
>>> matrix = table.compute()      # polars.DataFrame, the full transition matrix
>>> reduced = table.reduce()      # polars.DataFrame, roll_down / stable / roll_up

compute and reduce are idempotent; the matrix is cached after the first call. Both return polars DataFrames whose first column (from_state) holds the row label.
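One plausible reading of the binary_cols priority rule is sketched below. It is an interpretation, not confirmed library behaviour: it assumes a set indicator overrides the delinquency bucket, that the first listed indicator wins when several are set, and that levels above max_delq share one overflow index. The function assign_state and the column names bankrupt and closed are hypothetical.

```python
def assign_state(record, delinquency_col, max_delq, binary_cols=()):
    """Pick the matrix state for one account record (a plain dict)."""
    for col in binary_cols:  # iterated in descending priority
        if record.get(col):
            return col       # the first set indicator wins
    # No indicator set: fall back to the capped delinquency level.
    return min(record[delinquency_col], max_delq + 1)


priority = ("bankrupt", "closed")
both_set = assign_state({"delq": 2, "bankrupt": 1, "closed": 1}, "delq", 6, priority)
only_closed = assign_state({"delq": 2, "bankrupt": 0, "closed": 1}, "delq", 6, priority)
none_set = assign_state({"delq": 2, "bankrupt": 0, "closed": 0}, "delq", 6, priority)
overflow = assign_state({"delq": 9}, "delq", 6, priority)
```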

compute() → DataFrame[source]

Compute the transition matrix and return it as a polars DataFrame.

property matrix: DataFrame

Return the cached transition matrix, computing it on first access.
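The compute-once behaviour can be illustrated with a generic caching pattern. The class CachedTable and its build_calls counter are hypothetical and exist only to show the idea; how the library stores its cache internally is not documented here.

```python
class CachedTable:
    """Toy stand-in showing a matrix that is built once and then reused."""

    def __init__(self, builder):
        self._builder = builder   # callable that produces the matrix
        self._matrix = None       # cache slot, empty until first use
        self.build_calls = 0      # counts how often the builder actually ran

    def compute(self):
        if self._matrix is None:
            self.build_calls += 1
            self._matrix = self._builder()
        return self._matrix

    @property
    def matrix(self):
        # Property access delegates to compute(), so it also fills the cache.
        return self.compute()


table = CachedTable(lambda: [[1, 0], [0, 1]])
first = table.matrix      # triggers the build
second = table.compute()  # served from the cache
third = table.matrix      # served from the cache
```

Because every call returns the same cached object, mutating the returned value in this toy version would also change later results.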

reduce(percentages: bool = True) → DataFrame[source]

Return roll_down / stable / roll_up per row, in percentages or counts.

roll_rate_analysis.snapshot module

Snapshot roll rate table over observation and performance windows.

class roll_rate_analysis.snapshot.SnapshotRollRateTable(snapshot: LazyFrame | DataFrame | str | Path, observation: Sequence[LazyFrame | DataFrame | str | Path], performance: Sequence[LazyFrame | DataFrame | str | Path], *, unique_key_col: str, delinquency_col: str, max_delq: int = 6, detailed: bool = False, granularity: int = 1, keep_cols: Sequence[str] | None = None)[source]

Bases: object

Roll rate table for a snapshot month with observation and performance windows.

For every account in the snapshot, the observation window is reduced to its maximum delinquency across the supplied observation files, and similarly for the performance window. The resulting transition matrix has rows indexed by the observation max-delinquency and columns indexed by the performance max-delinquency.

Parameters

snapshot:

Data for the snapshot month (defines the account universe). Accepts a polars LazyFrame/DataFrame or a path/string pointing to a CSV.

observation:

Sequence of frames or paths forming the observation window.

performance:

Sequence of frames or paths forming the performance window.

unique_key_col:

Name of the account identifier column. Must exist in every input.

delinquency_col:

Name of the delinquency column. Must exist in every observation and performance frame.

max_delq:

Largest delinquency level kept as its own row/column. Anything above rolls into the N+ bucket.

detailed:

If True, split delinquency levels 3 and 4 into sub-rows (granularity of them per level) according to how many times the account hit that level during the observation window.

granularity:

Number of sub-rows per detailed level. Must be ≥ 2 when detailed is True.

keep_cols:

Optional column whitelist applied to each observation/performance frame before joining (memory optimisation). Must include delinquency_col.

Use

>>> table = SnapshotRollRateTable(
...     "snap.csv",
...     ["obs1.csv", "obs2.csv"],
...     ["perf1.csv", "perf2.csv"],
...     unique_key_col="id",
...     delinquency_col="delq",
...     detailed=True,
...     granularity=2,
... )
>>> matrix = table.compute()      # polars.DataFrame, the full transition matrix
>>> reduced = table.reduce()      # polars.DataFrame, roll_down / stable / roll_up

compute and reduce are idempotent; the matrix is cached after the first call.

compute() → DataFrame[source]

Compute the transition matrix and return it as a polars DataFrame.

property extra_rows: int

Number of additional rows beyond max_delq + 1 due to detailed mode.
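As a rough arithmetic check, if each of the two detailed levels (3 and 4) is replaced by granularity sub-rows, the row count grows by granularity - 1 per level. The sketch below encodes that reading; the formula is inferred from the parameter descriptions, not taken from the library source, and the standalone function extra_rows is a hypothetical stand-in for the property.

```python
DETAILED_LEVELS = (3, 4)  # the levels the parameter description says are split


def extra_rows(detailed: bool, granularity: int) -> int:
    """Rows added by detailed mode, assuming one row becomes `granularity` rows."""
    if not detailed:
        return 0
    if granularity < 2:
        raise ValueError("granularity must be >= 2 when detailed is True")
    return len(DETAILED_LEVELS) * (granularity - 1)


plain = extra_rows(False, 1)
two_way = extra_rows(True, 2)
three_way = extra_rows(True, 3)
```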

property matrix: DataFrame

Return the cached transition matrix, computing it on first access.

reduce(percentages: bool = True) → DataFrame[source]

Return roll_down / stable / roll_up per row, in percentages or counts.