feedback_forensics.data.dataset_utils

`feedback_forensics.data.dataset_utils`

Utilities for working with datasets.

Module Contents

Functions

`get_available_models`	Get all available model names from a dataset.
`add_annotators_to_votes_dict`	Add annotators to an existing votes_dict.
`get_annotators_by_type`	Extract all annotators grouped by their type from a votes_dict.
`get_first_json_key_value`	Get the first key and value from a JSON file.

API

feedback_forensics.data.dataset_utils.get_available_models(df: pandas.DataFrame) → list

Get all available model names from a dataset.

Args: df: DataFrame containing the dataset

Returns: List of unique model names found in the dataset
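As a rough illustration of what a helper like this might do, the sketch below collects unique model names from a small DataFrame. The column names `model_a` and `model_b`, the function name `sketch_available_models`, and the toy data are all assumptions made for this example; the actual dataset schema and implementation may differ.

```python
import pandas as pd


def sketch_available_models(df: pd.DataFrame, model_columns=("model_a", "model_b")) -> list:
    """Hypothetical stand-in: gather unique model names across the given columns."""
    names = set()
    for column in model_columns:
        # Skip columns that are absent so the sketch works on partial schemas.
        if column in df.columns:
            names.update(df[column].dropna().unique().tolist())
    # Sort for a deterministic result.
    return sorted(names)


# Toy data; column names are assumed, not taken from the library.
example_df = pd.DataFrame(
    {
        "model_a": ["gpt-x", "llama-y", "gpt-x"],
        "model_b": ["llama-y", "mistral-z", None],
    }
)
models = sketch_available_models(example_df)
```

With the toy data above, `models` contains each of the three names once, and the missing value is ignored.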

feedback_forensics.data.dataset_utils.add_annotators_to_votes_dict(votes_dict: dict, annotator_metadata: dict, annotations_df: pandas.DataFrame) → dict

Add annotators to an existing votes_dict.

Args:
votes_dict: An existing votes_dict with dataframe and metadata
annotator_metadata: Dictionary of annotator metadata to add
annotations_df: DataFrame with only comparison_id and annotator columns

Returns: A new votes_dict with annotators added
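The following is a minimal sketch of how such a merge could work, not the library's implementation. It assumes the votes_dict stores its dataframe under a `"df"` key and its metadata under an `"annotator_metadata"` key; those key names, the function name `sketch_add_annotators`, and the toy data are hypothetical.

```python
import pandas as pd


def sketch_add_annotators(votes_dict: dict, annotator_metadata: dict, annotations_df: pd.DataFrame) -> dict:
    """Hypothetical stand-in: return a new votes_dict with extra annotator columns."""
    # Left-merge on comparison_id so every existing comparison is kept,
    # even if it has no new annotation.
    merged_df = votes_dict["df"].merge(annotations_df, on="comparison_id", how="left")
    # Build a new metadata dict rather than mutating the input.
    merged_metadata = {**votes_dict["annotator_metadata"], **annotator_metadata}
    return {**votes_dict, "df": merged_df, "annotator_metadata": merged_metadata}


# Toy inputs; the "df" and "annotator_metadata" keys are assumed names.
votes_dict = {
    "df": pd.DataFrame({"comparison_id": [1, 2], "annotator_1": ["text_a", "text_b"]}),
    "annotator_metadata": {"annotator_1": {"variant": "human"}},
}
new_annotations = pd.DataFrame({"comparison_id": [1, 2], "annotator_2": ["text_b", "text_b"]})
updated = sketch_add_annotators(votes_dict, {"annotator_2": {"variant": "model"}}, new_annotations)
```

Because the sketch builds new objects instead of editing in place, the original `votes_dict` is left unchanged, which matches the documented behaviour of returning a new votes_dict.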

feedback_forensics.data.dataset_utils.get_annotators_by_type(votes_dict: Dict[str, Any]) → Dict[str, Dict[str, List[str]]]

Extract all annotators grouped by their type from a votes_dict.

Args: votes_dict: Dictionary containing annotator metadata

Returns: Dictionary mapping variant types to dictionaries with “column_ids” and “visible_names” keys. Missing keys are handled automatically with empty lists via a defaultdict.
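To make the return shape concrete, here is a small sketch of grouping annotators by type with a `defaultdict`. It assumes the metadata lives under an `"annotator_metadata"` key and that each entry has `"variant"` and `"annotator_visible_name"` fields; those names, the `"unknown"` fallback, and the function name `sketch_annotators_by_type` are hypothetical and may not match the real votes_dict layout.

```python
from collections import defaultdict
from typing import Any, Dict, List


def sketch_annotators_by_type(votes_dict: Dict[str, Any]) -> Dict[str, Dict[str, List[str]]]:
    """Hypothetical stand-in: group annotator column ids and names by variant type."""
    # Unknown types yield empty lists instead of raising KeyError.
    grouped = defaultdict(lambda: {"column_ids": [], "visible_names": []})
    for column_id, metadata in votes_dict.get("annotator_metadata", {}).items():
        variant = metadata.get("variant", "unknown")  # assumed field name
        grouped[variant]["column_ids"].append(column_id)
        # Fall back to the column id when no visible name is given.
        grouped[variant]["visible_names"].append(
            metadata.get("annotator_visible_name", column_id)
        )
    return grouped


# Toy metadata; field names are assumptions for illustration.
example_votes = {
    "annotator_metadata": {
        "col_1": {"variant": "human", "annotator_visible_name": "Human vote"},
        "col_2": {"variant": "principle", "annotator_visible_name": "Is concise"},
        "col_3": {"variant": "principle", "annotator_visible_name": "Is polite"},
    }
}
by_type = sketch_annotators_by_type(example_votes)
```

Looking up a type that never occurred, for example `by_type["other"]`, returns a fresh entry with two empty lists rather than failing.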

feedback_forensics.data.dataset_utils.get_first_json_key_value(file_path)

Get the first key and value from a JSON file.

This is useful for extracting metadata from an AnnotatedPairs file without loading the full dataset.
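One way to read only the first entry is to feed the file to a JSON decoder in small chunks and stop as soon as the first key and value are complete. The sketch below does this with the standard library only; it is an illustration under the assumption that the file holds a top-level JSON object, and it is not necessarily how the library implements the function. The names `sketch_first_json_key_value` and `_skip_ws`, the `chunk_size` parameter, and the toy file contents are hypothetical.

```python
import json
import os
import tempfile


def _skip_ws(text, pos):
    """Advance pos past JSON whitespace."""
    while pos < len(text) and text[pos] in " \t\r\n":
        pos += 1
    return pos


def sketch_first_json_key_value(file_path, chunk_size=4096):
    """Hypothetical stand-in: return (key, value) for the first entry of a JSON object."""
    decoder = json.JSONDecoder()
    buffer = ""
    with open(file_path, "r", encoding="utf-8") as handle:
        while True:
            chunk = handle.read(chunk_size)
            buffer += chunk
            at_eof = chunk == ""
            try:
                start = buffer.index("{") + 1
                key, pos = decoder.raw_decode(buffer, _skip_ws(buffer, start))
                pos = _skip_ws(buffer, pos)
                if isinstance(key, str) and buffer[pos] == ":":
                    value, pos = decoder.raw_decode(buffer, _skip_ws(buffer, pos + 1))
                    pos = _skip_ws(buffer, pos)
                    # Accept the value only once its terminator has been read,
                    # so a number split across two chunks is not cut short.
                    if buffer[pos] in ",}":
                        return key, value
            except (ValueError, IndexError):
                pass  # Entry is still incomplete: read another chunk.
            if at_eof:
                raise ValueError("no complete first key/value pair found")


# Toy file; the contents are made up for illustration.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.json")
    with open(path, "w", encoding="utf-8") as handle:
        json.dump({"metadata": {"version": "2.0"}, "comparisons": [{"id": 1}, {"id": 2}]}, handle)
    # A tiny chunk size forces the incremental path to be exercised.
    first_key, first_value = sketch_first_json_key_value(path, chunk_size=8)
```

The sketch re-parses the buffer after every chunk, which keeps it short but is wasteful if the first value is very large; a streaming JSON parser would be a better fit in that case.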
