Python API#

Feedback Forensics can be used to interpret annotator data within Python. Below is a minimal example:

import feedback_forensics as ff

# Load a dataset from an AnnotatedPairs JSON file produced by the ICAI package
dataset = ff.DatasetHandler()
dataset.add_data_from_path("data/output/example/annotated_pairs.json")

# Compute metrics for the dataset as a whole and for each annotator
overall_metrics = dataset.get_overall_metrics()
annotator_metrics = dataset.get_annotator_metrics()

DatasetHandler#

class DatasetHandler(cache: dict | None = None, avail_datasets: dict | None = None, reference_models: list[str] | None = None)#

Class to handle dataset operations of multi-column annotation datasets.

A dataset consists of one or more annotation columns, each represented by a ColumnHandler object. Each column can be either a different annotator or the same annotator applied to a different dataset or subset.

Initialization

Initialize the dataset handler.

Args:
    cache (dict | None): Cache dictionary used to store and retrieve model annotators.
    avail_datasets (dict | None): Dictionary of available datasets.
    reference_models (list[str] | None): List of reference models to use for virtual annotators. Only relevant when using model annotators.

property cache#

property avail_datasets#

property col_handlers#: Get column handlers (each handler is a single dataset).

property first_handler#: Get the first column handler.

property first_handler_name#: Get the name of the first dataset handler.

property is_single_dataset#: Check if the dataset handler is a single dataset.

property votes_dicts#: Get the votes_dicts for all dataset handlers.

property num_cols#: Get the number of columns in the dataset.

add_col_handler(name: str, handler: feedback_forensics.data.handler.ColumnHandler)#: Add a column handler.

reset_handlers()#: Reset the dataset handlers.

add_data_from_name(name: str)#: Load data from a given dataset name.

load_data_from_names(names: list[str])#: Load data from a given list of dataset names.

add_data_from_path(path: str | pathlib.Path, name: str | None = None)#: Add data from a given path.

load_data_from_paths(paths: list[str | pathlib.Path])#: Load data from a given list of paths.

add_data_from_votes_dict(votes_dict: dict, name: str)#: Add data from a given votes_dict.

load_data_from_votes_dicts(votes_dicts: dict[str, dict])#: Load data from a given dictionary of votes_dicts.
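When loading several files at once, it can help to check the paths before handing them to the handler. The following is a minimal sketch, not part of the package: `load_existing_paths` is a hypothetical helper, and it only assumes that the object passed in exposes `load_data_from_paths(paths)` with the signature listed above.

```python
from pathlib import Path


def load_existing_paths(handler, paths):
    """Load only the paths that exist on disk into a DatasetHandler-like object.

    Hypothetical helper: `handler` is assumed to provide
    `load_data_from_paths(paths)` as listed in the reference above.
    Returns the paths that were loaded and the paths that were skipped.
    """
    existing = [Path(p) for p in paths if Path(p).is_file()]
    missing = [str(p) for p in paths if not Path(p).is_file()]
    if not existing:
        raise FileNotFoundError(f"None of the given paths exist: {missing}")
    # Delegate the actual loading to the documented method.
    handler.load_data_from_paths(existing)
    return existing, missing
```

With the package installed, this could be called as `load_existing_paths(ff.DatasetHandler(), [...])`, passing your own AnnotatedPairs JSON file paths.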

get_col_handler(name: str)#: Get the column handler for a given dataset name.

get_available_annotators()#

Get annotators available on all dataset columns.

Some datasets may not have all annotators. This method provides access to the shared annotators.
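The idea of "shared annotators" can be illustrated as a set intersection over the annotators of each column. This is a sketch of the concept only, not the library's implementation; the function name, column names, and annotator names are made up for illustration.

```python
def shared_annotators(annotators_by_column):
    """Return the annotators present in every column, sorted for stable output.

    `annotators_by_column` maps a (hypothetical) column name to the
    annotators available in that column.
    """
    sets = [set(names) for names in annotators_by_column.values()]
    if not sets:
        return []
    return sorted(set.intersection(*sets))


# Illustrative data: "length" exists only in the first column.
columns = {
    "dataset_a": ["length", "politeness", "human"],
    "dataset_b": ["politeness", "human"],
}
print(shared_annotators(columns))  # ['human', 'politeness']
```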

get_available_annotator_visible_names()#: Get the visible names of the available annotators.

set_annotator_rows(annotator_visible_names: list[str] | None = None, annotator_keys: list[str] | None = None)#: Change the visible annotator rows for all dataset handlers.

split_by_col(col: str, selected_vals: list[str] | None = None)#: Split the dataset by a given column to create multiple datasets.
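Conceptually, splitting by a column groups the rows by their value in that column, optionally restricted to a selected set of values. The sketch below shows that idea on plain dictionaries. It is an assumption about the semantics based on the signature above, not the library's code, and the record fields are invented for the example.

```python
from collections import defaultdict


def split_records_by_col(records, col, selected_vals=None):
    """Group records by the value in `col`.

    If `selected_vals` is given, only groups for those values are kept.
    Assumed semantics, for illustration only.
    """
    groups = defaultdict(list)
    for record in records:
        value = record[col]
        if selected_vals is None or value in selected_vals:
            groups[value].append(record)
    return dict(groups)


# Illustrative records with a hypothetical "model" column.
records = [
    {"model": "m1", "preferred": "a"},
    {"model": "m2", "preferred": "b"},
    {"model": "m1", "preferred": "b"},
]
by_model = split_records_by_col(records, "model")
only_m2 = split_records_by_col(records, "model", selected_vals=["m2"])
```

Here `by_model` holds one group per distinct value of the column, while `only_m2` keeps just the selected value.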

set_annotator_cols(annotator_visible_names: list[str] | None = None, annotator_keys: list[str] | None = None)#: Change the annotator columns for all dataset handlers.

get_overall_metrics()#: Get the overall metrics for all dataset handlers.

get_annotator_metrics()#: Get the annotator metrics for all dataset handlers.

get_annotator_metrics_df(metric_name: str, add_max_diff_col: bool = True, index_col_name: str = 'Annotator')#: Get the annotator metrics for all dataset handlers as a single dataframe.
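The reference above does not define what the column added by `add_max_diff_col` contains. One plausible reading, stated here purely as an assumption, is the spread between the largest and smallest metric value an annotator has across dataset columns. The sketch below illustrates that reading on plain dictionaries; the function, annotator names, and values are hypothetical.

```python
def add_max_diff(metric_by_annotator):
    """Add a "max_diff" entry per annotator: max minus min across columns.

    `metric_by_annotator` maps annotator -> {column_name: metric_value}.
    Assumed interpretation of "max diff", not confirmed by the docs.
    """
    result = {}
    for annotator, per_column in metric_by_annotator.items():
        values = list(per_column.values())
        row = dict(per_column)  # copy so the input is left untouched
        row["max_diff"] = max(values) - min(values)
        result[annotator] = row
    return result


# Illustrative values for two hypothetical dataset columns.
metrics = {
    "politeness": {"dataset_a": 0.5, "dataset_b": 0.25},
    "length": {"dataset_a": 0.75, "dataset_b": 0.75},
}
with_diff = add_max_diff(metrics)
```

Under this reading, an annotator whose metric is identical across columns gets a difference of zero, which makes annotators that behave differently between datasets easy to spot.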

Comprehensive API docs#

See the comprehensive API docs here
