boilercv_pipeline.dfs

Data frame operations.

Module Contents

Functions

sparkhist

Render a sparkline histogram.

get_hists

Add a row of sparklines to the top of a data frame.

save_df

Save data frame to a compressed HDF5 file.

limit_group_size

Filter out groups shorter than a certain length.

API

boilercv_pipeline.dfs.sparkhist(grp: pandas.DataFrame) -> str

Render a sparkline histogram.
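
The source of `sparkhist` is not shown here, so the following is only a minimal sketch of the general sparkline-histogram technique, not the library's implementation. It assumes a plain sequence of numbers rather than a `pandas.DataFrame`, and the name `sparkline_histogram`, the `bins` parameter, and the choice of block characters are all hypothetical.

```python
# Hypothetical sketch of a sparkline histogram; not the boilercv_pipeline code.
BARS = "▁▂▃▄▅▆▇█"  # eight block characters, shortest to tallest


def sparkline_histogram(values, bins=8):
    """Bin the values and draw each bin's count as one block character."""
    values = list(values)
    if not values:
        return ""
    lo, hi = min(values), max(values)
    # Fall back to a width of 1 when all values are equal (zero range).
    width = (hi - lo) / bins or 1
    counts = [0] * bins
    for v in values:
        # Clamp so the maximum value lands in the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    peak = max(counts)
    # Scale each count to an index into BARS, relative to the fullest bin.
    return "".join(BARS[round(c / peak * (len(BARS) - 1))] for c in counts)


print(sparkline_histogram([1, 1, 1, 2, 3, 3, 10], bins=3))
```

Each output character stands for one bin, so the string length equals `bins` and the tallest block marks the most populated bin.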

boilercv_pipeline.dfs.get_hists(
    df: pandas.DataFrame,
    groupby: str,
    cols: list[str],
) -> pandas.DataFrame

Add a row of sparklines to the top of a data frame.

boilercv_pipeline.dfs.save_df(
    df: pandas.DataFrame,
    path: pathlib.Path | str,
    key: str | None = None,
)

Save data frame to a compressed HDF5 file.

boilercv_pipeline.dfs.limit_group_size(
    df: pandas.DataFrame,
    by: str | list[str],
    n: int,
) -> pandas.DataFrame

Filter out groups shorter than a certain length.