feedback_forensics.data.loader

Module Contents

Functions

load_json_file

convert_vote_to_string

get_votes_dict

Get the votes dataframe for a given results directory. If the dataframe is already in the cache, return it. Otherwise, create it, add it to the cache, and return it.

add_virtual_annotators

Add virtual model annotators to a votes dictionary.

get_votes_dict_from_annotated_pairs_json

Get the votes dataframe for a given JSON path.

_check_for_nondefault_annotators

Check for non-default annotators in the dataframe.

_remove_empty_response_comparisons

create_votes_dict_from_icai_log_files

Create the votes dataframe and voter metadata from ICAI log files.

API

feedback_forensics.data.loader.load_json_file(path: str)
feedback_forensics.data.loader.convert_vote_to_string(vote: bool | None) → str
feedback_forensics.data.loader.get_votes_dict(results_path: pathlib.Path, cache: dict | None = None) → dict

Get the votes dataframe for a given results directory. If the dataframe is already in the cache, return it. Otherwise, create it, add it to the cache, and return it.
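The cache behaviour described above is a get-or-create lookup. The following is a minimal sketch of that pattern, not the library's implementation: `get_or_create`, `fake_build`, and the `"df"` / `"annotator_metadata"` keys are hypothetical names chosen for illustration.

```python
from __future__ import annotations

from pathlib import Path
from typing import Callable


def get_or_create(
    results_path: Path,
    cache: dict | None,
    build: Callable[[Path], dict],
) -> dict:
    """Return the cached entry for results_path, building it on a cache miss."""
    if cache is None:
        # No cache supplied: use a throwaway one, so nothing is reused across calls.
        cache = {}
    if results_path not in cache:
        # Cache miss: create the entry and store it under the results path.
        cache[results_path] = build(results_path)
    return cache[results_path]


# Hypothetical builder that records how often it is invoked.
calls: list[Path] = []


def fake_build(path: Path) -> dict:
    calls.append(path)
    return {"df": f"votes for {path.name}", "annotator_metadata": {}}


shared_cache: dict = {}
first = get_or_create(Path("results/run_1"), shared_cache, fake_build)
second = get_or_create(Path("results/run_1"), shared_cache, fake_build)
```

With a shared cache, the second call returns the same object as the first and the builder runs only once.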

feedback_forensics.data.loader.add_virtual_annotators(votes_dict: dict, cache: dict | None, dataset_cache_key: pathlib.Path, reference_models: list | None, target_models: list | None) → dict

Add virtual model annotators to a votes dictionary.

Args:
    votes_dict: Base votes dictionary to add the annotators to.
    cache: Cache dictionary to store and retrieve model annotators.
    dataset_cache_key: Key used for caching (typically the results path).
    reference_models: List of model names to use as reference models. Empty list means all.
    target_models: List of model names to use as target models. Empty list means all.

Returns:
    A votes dictionary with model annotators added.
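The "empty list means all" convention for `reference_models` and `target_models` can be illustrated with a small helper. This is a sketch under stated assumptions: `resolve_models` is a hypothetical function, not part of the module, and treating `None` as "select nothing" is an assumption the documentation does not confirm.

```python
from __future__ import annotations


def resolve_models(requested: list | None, available: list[str]) -> list[str]:
    """Hypothetical helper: turn a model-list argument into concrete model names."""
    if requested is None:
        # Assumption (not stated in the docs): None selects no models.
        return []
    if len(requested) == 0:
        # Documented convention: an empty list means all available models.
        return list(available)
    # Otherwise keep only the requested models that actually exist.
    return [name for name in requested if name in available]


available_models = ["model_a", "model_b", "model_c"]
all_models = resolve_models([], available_models)
some_models = resolve_models(["model_b"], available_models)
no_models = resolve_models(None, available_models)
```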

feedback_forensics.data.loader.get_votes_dict_from_annotated_pairs_json(results_path: pathlib.Path) → dict

Get the votes dataframe for a given JSON path.

feedback_forensics.data.loader._check_for_nondefault_annotators(df: pandas.DataFrame) → dict

Check for non-default annotators in the dataframe.

Checks each column in the dataframe for the values “text_a” or “text_b”. Any column containing either value is added as an annotator metadata entry.
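The column check described above can be sketched without pandas by letting a plain dictionary of column lists stand in for the dataframe. This is illustrative only: `find_annotator_columns` and the shape of the metadata entry (`{"column": name}`) are hypothetical, and the real function operates on a `pandas.DataFrame`.

```python
from __future__ import annotations

# The two values that mark a column as holding annotator preferences.
PREFERENCE_VALUES = {"text_a", "text_b"}


def find_annotator_columns(columns: dict[str, list]) -> dict[str, dict]:
    """Return a metadata entry for every column that contains a preference value."""
    found: dict[str, dict] = {}
    for name, values in columns.items():
        # Only string cells can match; this also skips None and numeric cells.
        if any(isinstance(v, str) and v in PREFERENCE_VALUES for v in values):
            found[name] = {"column": name}  # hypothetical metadata shape
    return found


# Column-oriented stand-in for a dataframe.
table = {
    "prompt": ["p1", "p2"],
    "text_a": ["response a1", "response a2"],
    "human_vote": ["text_a", "text_b"],
    "score": [0.1, None],
}
annotators = find_annotator_columns(table)
```

Note that the column named `text_a` is not selected here, because the check looks at cell values rather than column names.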

feedback_forensics.data.loader._remove_empty_response_comparisons(df: pandas.DataFrame) → pandas.DataFrame
feedback_forensics.data.loader.create_votes_dict_from_icai_log_files(results_dir: pathlib.Path) → list[dict]

Create the votes dataframe and voter metadata from ICAI log files.

Args:
    results_dir (pathlib.Path): Path to the results directory.

Returns:
    dict: A dictionary containing the votes dataframe and annotator metadata.