Python API#
Feedback Forensics can be used to interpret annotator data within Python. Below is a minimal example:
import feedback_forensics as ff

# Load a dataset from an AnnotatedPairs JSON file produced by the ICAI package
dataset = ff.DatasetHandler()
dataset.add_data_from_path("data/output/example/annotated_pairs.json")

# Compute metrics across all loaded dataset columns
overall_metrics = dataset.get_overall_metrics()
annotator_metrics = dataset.get_annotator_metrics()
DatasetHandler#
- class DatasetHandler(cache: dict | None = None, avail_datasets: dict | None = None, reference_models: list[str] | None = None)#
Class to handle dataset operations of multi-column annotation datasets.
A dataset consists of one or more annotation columns, each represented by a ColumnHandler object. Each column can be either a different annotator or the same annotator on a different data (sub)set.
Initialization
Initialize the dataset handler.
Args:
cache (dict | None): Cache dictionary to store and retrieve model annotators.
avail_datasets (dict | None): Dictionary of available datasets.
reference_models (list[str] | None): List of reference models to use for virtual annotators. Only relevant if using model annotators.
- property cache#
- property avail_datasets#
- property col_handlers#
Get column handlers (each handler is a single dataset).
- property first_handler#
Get the first column handler.
- property first_handler_name#
Get the name of the first dataset handler.
- property is_single_dataset#
Check whether the dataset handler holds a single dataset.
- property votes_dicts#
Get the votes_dicts for all dataset handlers.
- property num_cols#
Get the number of columns in the dataset.
- add_col_handler(name: str, handler: feedback_forensics.data.handler.ColumnHandler)#
Add a column handler.
- reset_handlers()#
Reset the dataset handlers.
- add_data_from_name(name: str)#
Load data from a given dataset name.
- load_data_from_names(names: list[str])#
Load data from a given list of dataset names.
- add_data_from_path(path: str | pathlib.Path, name: str | None = None)#
Add data from a given path.
- load_data_from_paths(paths: list[str | pathlib.Path])#
Load data from a given list of paths.
- add_data_from_votes_dict(votes_dict: dict, name: str)#
Add data from a given votes_dict.
- load_data_from_votes_dicts(votes_dicts: dict[str, dict])#
Load data from a given dictionary of votes_dicts, keyed by name.
- get_col_handler(name: str)#
Get the column handler for a given dataset name.
- get_available_annotators()#
Get annotators available on all dataset columns.
Some datasets may not have all annotators. This method provides access to the shared annotators.
- get_available_annotator_visible_names()#
Get the visible names of the available annotators.
- set_annotator_rows(annotator_visible_names: list[str] | None = None, annotator_keys: list[str] | None = None)#
Change the visible annotator rows for all dataset handlers.
- split_by_col(col: str, selected_vals: list[str] | None = None)#
Split the dataset by a given column to create multiple datasets.
- set_annotator_cols(annotator_visible_names: list[str] | None = None, annotator_keys: list[str] | None = None)#
Change the annotator columns for all dataset handlers.
- get_overall_metrics()#
Get the overall metrics for all dataset handlers.
- get_annotator_metrics()#
Get the annotator metrics for all dataset handlers.
- get_annotator_metrics_df(metric_name: str, add_max_diff_col: bool = True, index_col_name: str = 'Annotator')#
Get the annotator metrics for all dataset handlers as a single dataframe.
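To make two of the ideas above more concrete (annotators shared across all dataset columns, and a single per-annotator table with a maximum-difference column), here is a minimal sketch in plain Python. It does not use or reproduce the library's implementation: the function names `shared_annotators` and `metrics_table`, the `max_diff` key, and the example numbers are all hypothetical. In particular, it assumes "max diff" means the largest minus the smallest value across columns, which may not match how Feedback Forensics defines it.

```python
# Illustrative stand-in only: plain dicts instead of the library's own objects.
# All names and numbers below are hypothetical.


def shared_annotators(metrics_by_column):
    """Return the annotators present in every column, sorted by name."""
    annotator_sets = [set(col) for col in metrics_by_column.values()]
    if not annotator_sets:
        return []
    return sorted(set.intersection(*annotator_sets))


def metrics_table(metrics_by_column, add_max_diff_col=True):
    """Build one row per shared annotator, with one value per column.

    ASSUMPTION: "max diff" is taken to be the largest minus the smallest
    value across columns; the library may define it differently.
    """
    rows = []
    for annotator in shared_annotators(metrics_by_column):
        row = {"Annotator": annotator}
        values = []
        for column_name, column_metrics in metrics_by_column.items():
            row[column_name] = column_metrics[annotator]
            values.append(column_metrics[annotator])
        if add_max_diff_col:
            row["max_diff"] = max(values) - min(values)
        rows.append(row)
    return rows


# Made-up metric values for two hypothetical dataset columns.
example = {
    "dataset_a": {"concise": 0.25, "polite": 0.5, "formal": 0.75},
    "dataset_b": {"concise": 0.75, "polite": 0.5},
}

table = metrics_table(example)
for row in table:
    print(row)
```

Here "formal" appears only in `dataset_a`, so it is left out of the table. Restricting to the shared annotators keeps every row fully populated, at the cost of hiding annotators that exist in only some columns.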
Comprehensive API docs#
See the comprehensive API docs here