squadds.database package#

Submodules#

squadds.database.HuggingFace module#

squadds.database.HuggingFace.add_column_to_dataset(dataset, column_name, column_data)[source]#

Add a new column to a dataset.

Parameters:
  • dataset (Dataset) – Hugging Face dataset to which you want to add a column.

  • column_name (str) – Name of the new column.

  • column_data (list) – Data for the new column.

Returns:

Dataset with the new column.

Return type:

Dataset
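As an illustration of the column-add contract, the sketch below models a dataset as a plain dict of equal-length lists. It is a stand-in, not the Hugging Face Dataset API; the helper name add_column, the sample column names, and the length check are assumptions made for this example.

```python
def add_column(table, column_name, column_data):
    """Return a new column-oriented table with one extra column.

    `table` is a plain dict mapping column names to equal-length lists,
    used here as a stand-in for a Hugging Face dataset (an assumption).
    """
    n_rows = len(next(iter(table.values()))) if table else len(column_data)
    if column_name in table:
        raise ValueError(f"column {column_name!r} already exists")
    if len(column_data) != n_rows:
        raise ValueError("column_data must have one value per row")
    # Copy the existing columns so the input table is left untouched.
    new_table = {name: list(values) for name, values in table.items()}
    new_table[column_name] = list(column_data)
    return new_table


table = {"qubit": ["Q1", "Q2"], "freq_GHz": [4.8, 5.1]}
extended = add_column(table, "anharm_MHz", [-210.0, -205.0])
```

Returning a new table rather than mutating the input mirrors the documented behaviour of returning the dataset with the new column.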

squadds.database.HuggingFace.add_row_to_dataset(dataset, row_data)[source]#

Add a new row to a dataset.

Parameters:
  • dataset (Dataset) – The Hugging Face dataset to which you want to add a row.

  • row_data (dict) – The row data in dictionary format.

Returns:

Dataset with the new row added.

Return type:

Dataset

squadds.database.HuggingFace.create_PR(repo_id, branch_name, title, description)[source]#

Create a Pull Request (PR) on Hugging Face Hub.

Parameters:
  • repo_id (str) – The repo ID (namespace/repo) where the PR will be created.

  • branch_name (str) – The branch name where the changes are made.

  • title (str) – The title of the PR.

  • description (str) – A description of the changes made in the PR.

Returns:

Information about the created PR.

Return type:

dict

squadds.database.HuggingFace.filter_dataset(dataset, filter_fn)[source]#

Filter a dataset based on a custom condition.

Parameters:
  • dataset (Dataset) – Hugging Face dataset to filter.

  • filter_fn (function) – Function that returns True or False for filtering.

Returns:

Filtered dataset.

Return type:

Dataset
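The role of filter_fn can be sketched on a plain list of row dictionaries. This is a minimal illustration rather than the library's implementation; the helper filter_rows and the sample rows are hypothetical.

```python
def filter_rows(rows, filter_fn):
    """Keep only the rows for which filter_fn(row) is truthy."""
    return [row for row in rows if filter_fn(row)]


rows = [
    {"design": "A", "freq_GHz": 4.8},
    {"design": "B", "freq_GHz": 5.3},
    {"design": "C", "freq_GHz": 5.0},
]

# Predicate: keep designs at or above 5 GHz.
high = filter_rows(rows, lambda row: row["freq_GHz"] >= 5.0)
```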

squadds.database.HuggingFace.fork_dataset(repo_id, dataset_name, new_dataset_name, private=True)[source]#

Fork a dataset from Hugging Face Hub.

Parameters:
  • repo_id (str) – The repo ID (namespace/repo) of the dataset to fork.

  • dataset_name (str) – Name of the dataset to fork.

  • new_dataset_name (str) – Name of the new dataset.

  • private (bool) – Whether the new dataset should be private or public.

Returns:

None

squadds.database.HuggingFace.load_hf_dataset(dataset_name, config=None)[source]#

Load a dataset from Hugging Face Hub.

Parameters:
  • dataset_name (str) – The name or path of the dataset on the Hugging Face Hub.

  • config (str) – Specific configuration or version of the dataset.

Returns:

Loaded dataset.

Return type:

Dataset or DatasetDict

squadds.database.HuggingFace.login_to_huggingface()[source]#

Log in to Hugging Face using an API token read from an environment variable.
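Reading a token from the environment might look like the sketch below. The variable name HUGGINGFACE_API_KEY and the choice to raise ValueError when it is missing are assumptions; this page does not state which variable the function reads or how it reports a missing token.

```python
import os


def read_hf_token(env_var="HUGGINGFACE_API_KEY"):
    """Return the token stored in env_var, or raise if it is unset or empty.

    The default variable name is hypothetical.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise ValueError(f"environment variable {env_var} is not set")
    return token
```

Keeping the token in the environment avoids hard-coding credentials in scripts or notebooks.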

squadds.database.HuggingFace.merge_datasets(dataset1, dataset2)[source]#

Merge two datasets into one.

Parameters:
  • dataset1 (Dataset) – First dataset.

  • dataset2 (Dataset) – Second dataset.

Returns:

Merged dataset.

Return type:

Dataset

squadds.database.HuggingFace.remove_column_from_dataset(dataset, column_name)[source]#

Remove a column from a dataset.

Parameters:
  • dataset (Dataset) – Hugging Face dataset from which you want to remove a column.

  • column_name (str) – Name of the column to remove.

Returns:

Dataset with the column removed.

Return type:

Dataset

squadds.database.HuggingFace.remove_row_from_dataset(dataset, row_index)[source]#

Remove a row from a dataset by index.

Parameters:
  • dataset (Dataset) – Hugging Face dataset from which you want to remove a row.

  • row_index (int) – Index of the row to remove.

Returns:

Dataset with the row removed.

Return type:

Dataset

squadds.database.HuggingFace.save_dataset_to_hf(dataset, repo_id, dataset_name, private=True)[source]#

Push a dataset to Hugging Face Hub.

Parameters:
  • dataset (Dataset) – The dataset to push to Hugging Face Hub.

  • repo_id (str) – The repo ID (namespace/repo) on Hugging Face Hub.

  • dataset_name (str) – Name of the dataset on Hugging Face Hub.

  • private (bool) – Whether the dataset should be private or public.

Returns:

None

squadds.database.HuggingFace.update_column_in_dataset(dataset, column_name, new_column_data)[source]#

Update a specific column in the dataset.

Parameters:
  • dataset (Dataset) – Hugging Face dataset to update.

  • column_name (str) – Name of the column to update.

  • new_column_data (list) – List of new data to replace the existing column.

Returns:

Updated dataset.

Return type:

Dataset

squadds.database.HuggingFace.update_row_in_dataset(dataset, row_index, new_row_data)[source]#

Update an existing row in a dataset by index.

Parameters:
  • dataset (Dataset) – Hugging Face dataset to update.

  • row_index (int) – Index of the row to update.

  • new_row_data (dict) – The new data for the row.

Returns:

Updated dataset.

Return type:

Dataset

squadds.database.HuggingFace.view_column_in_dataset(dataset, column_name, num_values)[source]#

View a specific column in the dataset by its name.

Parameters:
  • dataset (Dataset) – Hugging Face dataset.

  • column_name (str) – Name of the column to view.

  • num_values (int) – Number of values to view from the column.

Returns:

Data from the specified column.

Return type:

list

squadds.database.HuggingFace.view_row_in_dataset(dataset, row_index)[source]#

View a specific row in the dataset by index.

Parameters:
  • dataset (Dataset) – Hugging Face dataset.

  • row_index (int) – Index of the row to view.

Returns:

Data for the specified row.

Return type:

dict

squadds.database.abstract_upload_data module#

class squadds.database.abstract_upload_data.AbstractUploadData(config_name)[source]#

Bases: ABC

abstract add_design(design)[source]#
abstract add_notes(notes={})[source]#
abstract add_sim_result(result_name, result_value, unit)[source]#
abstract add_sim_setup(sim_setup)[source]#
abstract clear()[source]#
abstract create_PR()[source]#
abstract get_config_schema()[source]#
abstract show()[source]#
abstract show_config_schema()[source]#
abstract submit()[source]#
abstract to_dict()[source]#
abstract validate()[source]#
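Because every method above is abstract, a concrete uploader has to override all of them before it can be instantiated. The sketch below shows the pattern on a trimmed-down, hypothetical interface (UploadData, with only two abstract methods) so that it runs without the package installed; the class names and the config string are illustrative only.

```python
from abc import ABC, abstractmethod


class UploadData(ABC):
    """Hypothetical, trimmed-down stand-in for an upload-data interface."""

    def __init__(self, config_name):
        self.config_name = config_name

    @abstractmethod
    def add_design(self, design):
        """Attach a design dictionary to the contribution."""

    @abstractmethod
    def to_dict(self):
        """Return the contribution as a plain dictionary."""


class InMemoryUpload(UploadData):
    """Concrete subclass that keeps everything in memory."""

    def __init__(self, config_name):
        super().__init__(config_name)
        self.design = {}

    def add_design(self, design):
        self.design = dict(design)

    def to_dict(self):
        return {"config": self.config_name, "design": self.design}


upload = InMemoryUpload("example-config")
upload.add_design({"design_tool": "qiskit-metal"})
```

Instantiating the abstract base directly raises TypeError, which is how Python enforces the interface.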

squadds.database.checker module#

class squadds.database.checker.Checker[source]#

Bases: object

check(file)[source]#

squadds.database.config module#

Helper methods to create config files

class squadds.database.config.SQuADDS_DB_Config(circuit_element=None, element_name=None, result_type=None, **kwargs)[source]#

Bases: BuilderConfig

BuilderConfig for SQuADDS_DB.

squadds.database.contributor module#

class squadds.database.contributor.ExistingConfigData(config='')[source]#

Bases: object

Represents an existing configuration data object.

config#

The name of the configuration.

Type:

str

sim_results#

A dictionary containing simulation results.

Type:

dict

design#

A dictionary containing design options and the design tool.

Type:

dict

sim_options#

A dictionary containing simulation setup options.

Type:

dict

units#

A set containing the units used in the simulation results.

Type:

set

notes#

A dictionary containing additional notes.

Type:

dict

ref_entry#

A dictionary containing the reference entry.

Type:

dict

contributor#

A dictionary containing contributor information.

Type:

dict

entry#

A dictionary containing the contribution data.

Type:

dict

local_repo_path#

The local repository path.

Type:

str

sweep_data#

A list containing sweep data.

Type:

list

_validate_config_name()[source]#

Validates the configuration name.

get_config_schema()[source]#

Retrieves the schema for the given configuration name.

show_config_schema()[source]#

Prints the schema for the given configuration name.

_supported_config_names()[source]#

Retrieves the supported configuration names.

show()[source]#

Prints the contribution data.

__set_contributor_info()#

Sets the contributor information.

get_contributor_info()[source]#

Retrieves the contributor information.

add_sim_result(result_name, result_value, unit)[source]#

Adds a simulation result.

add_sim_setup(sim_setup)[source]#

Adds simulation setup options to the contribution.

add_design(design)[source]#

Adds a design to the contribution.

add_design_v0(design)[source]#

Adds a design to the contribution (version 0).

to_dict()[source]#

Converts the contribution data to a dictionary.

clear()[source]#

Clears the contribution data.

add_notes(notes)[source]#

Adds notes to the contribution.

validate_structure(actual_structure)[source]#

Validates the structure of the contributor object.

_validate_structure()[source]#

Validates the structure of the contributor object.

validate_types(data)[source]#

Validates the types of the data.

_validate_types()[source]#

Validates the types of the data.

_validate_content_v0()[source]#

Validates the content of the contribution against the dataset schema.

add_design(design)[source]#

Adds a design to the contribution.

Parameters:

design (dict) – A dictionary containing design options and the design tool.

add_design_v0(design)[source]#

Adds a design to the contribution.

Parameters:

design (dict) – A dictionary containing design options and the design tool.

add_notes(notes={})[source]#

Adds notes to the contribution.

Parameters:

notes (dict) – A dictionary containing notes.

add_sim_result(result_name, result_value, unit)[source]#

Add a simulation result to the contributor.

Parameters:
  • result_name (str) – The name of the simulation result.

  • result_value (float) – The value of the simulation result.

  • unit (str) – The unit of measurement for the simulation result.

Returns:

None
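One way to picture how a result and its unit end up in the sim_results and units attributes is sketched below. ResultStore is a hypothetical container, and the "<name>_unit" key convention is an assumption, not the documented storage layout.

```python
class ResultStore:
    """Hypothetical container mirroring the sim_results and units attributes."""

    def __init__(self):
        self.sim_results = {}
        self.units = set()

    def add_sim_result(self, result_name, result_value, unit):
        # Store the value and its unit side by side; the "<name>_unit"
        # key convention is an assumption made for this sketch.
        self.sim_results[result_name] = result_value
        self.sim_results[f"{result_name}_unit"] = unit
        self.units.add(unit)


store = ResultStore()
store.add_sim_result("cavity_frequency", 6.5, "GHz")
store.add_sim_result("kappa", 120.0, "kHz")
```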

add_sim_setup(sim_setup)[source]#

Adds simulation setup options to the contribution.

Parameters:

sim_setup (dict) – A dictionary containing simulation setup options that match the config's schema.

clear()[source]#

Clears the contribution data.

contribute(path_to_repo, is_sweep=False)[source]#

Contributes to the repository by updating the local repo, updating the database, and uploading to Hugging Face.

Parameters:
  • path_to_repo (str) – The path to the repository.

  • is_sweep (bool) – True if the contribution is a sweep, False otherwise.

Returns:

None

from_json(json_file, is_sweep=False)[source]#

Loads a contribution from a JSON file.

Parameters:
  • json_file (str) – The path to the JSON file.

  • is_sweep (bool) – True if the contribution is a sweep, False otherwise.

get_config_schema()[source]#

Connects to the repository with the given configuration name. Chooses the first entry from the config dataset and extracts the schema.

Returns:

A dictionary containing the schema for the given configuration name.
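Extracting a schema from the first entry can be sketched as a recursive walk that replaces each value with the name of its type. How the library actually represents the schema is not specified here, so the type-name mapping, the helper extract_schema, and the sample entry are all assumptions.

```python
def extract_schema(entry):
    """Map each value in a nested entry to the name of its Python type."""
    if isinstance(entry, dict):
        return {key: extract_schema(value) for key, value in entry.items()}
    return type(entry).__name__


# Hypothetical first entry of a config dataset.
first_entry = {
    "design": {"design_tool": "qiskit-metal", "design_options": {"gap": "30um"}},
    "sim_results": {"frequency": 5.2, "unit": "GHz"},
    "notes": {},
}
schema = extract_schema(first_entry)
```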

get_contributor_info()[source]#

Returns the contributor information.

Returns:

The contributor information.

Return type:

dict

property invalidate#

Invalidates the contributor by setting the isValidated flag to False.

property is_validated#

Returns True if the contribution is validated, False otherwise.

Returns:

True if the contribution is validated, False otherwise.

Return type:

bool

show()[source]#

Print the contribution data in a pretty format.

Parameters:

None

Returns:

None

show_config_schema()[source]#

Connects to the repository with the given configuration name, chooses the first entry from the config dataset, extracts the schema, and prints it.

Returns:

None

submit()[source]#

Sends the data and the config name to a remote server.

to_dict()[source]#

Converts the Contributor object to a dictionary.

Returns:

A dictionary representation of the Contributor object.

Return type:

dict

update_db(path_to_repo, is_sweep=False)[source]#

Updates the local repository with the validated data.

Parameters:
  • path_to_repo (str) – The path to the local repository.

  • is_sweep (bool) – True if the contribution is a sweep, False otherwise.

Raises:

ValueError – If the data has not been validated.

update_repo(path_to_repo)[source]#

Updates the repository at the specified path.

Parameters:

path_to_repo (str) – The path to the repository.

Raises:

subprocess.CalledProcessError – If the git commands fail.
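The CalledProcessError behaviour comes from running commands with check=True. The sketch below demonstrates it with the Python interpreter standing in for git so that it runs anywhere; the helper run_checked is hypothetical and is not the method's actual implementation.

```python
import subprocess
import sys


def run_checked(command, cwd=None):
    """Run a command and raise CalledProcessError on a non-zero exit code."""
    return subprocess.run(
        command, cwd=cwd, check=True, capture_output=True, text=True
    )


# The Python interpreter stands in for git so the sketch runs anywhere.
ok = run_checked([sys.executable, "-c", "print('up to date')"])

try:
    run_checked([sys.executable, "-c", "raise SystemExit(3)"])
    failed_code = None
except subprocess.CalledProcessError as err:
    # The exception carries the exit status of the failed command.
    failed_code = err.returncode
```

Callers that wrap update_repo can catch the same exception to report which git step failed.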

upload_to_HF(path_to_repo)[source]#

Uploads validated data to the specified repository.

Parameters:

path_to_repo (str) – The path to the repository.

Raises:
  • ValueError – If the data has not been validated.

  • subprocess.CalledProcessError – If the git commands fail.

Returns:

None

validate()[source]#

Validates the contribution by performing various checks.

Raises:

Exception – If any validation check fails.

validate_content(data)[source]#

Validates the content of the contribution against the dataset schema.

Parameters:

data (dict) – The data to be validated.

validate_structure(actual_structure)[source]#

Validates the structure of the contributor object.

Parameters:

actual_structure (dict) – The actual structure of the contributor object.

Raises:

ValueError – If any required key or sub-key is missing in the actual structure.
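A key and sub-key check of this kind can be sketched as a recursive comparison against an expected layout. Unlike the documented method, which takes only actual_structure, this hypothetical helper receives the expected layout explicitly; the layout and sample data are assumptions.

```python
def validate_structure(expected, actual, path=""):
    """Raise ValueError if `actual` lacks a key or sub-key found in `expected`."""
    for key, expected_value in expected.items():
        location = f"{path}.{key}" if path else key
        if key not in actual:
            raise ValueError(f"missing required key: {location}")
        if isinstance(expected_value, dict):
            if not isinstance(actual[key], dict):
                raise ValueError(f"expected a mapping at: {location}")
            # Descend into nested sections to check their sub-keys.
            validate_structure(expected_value, actual[key], location)


expected = {"design": {"design_tool": None, "design_options": None}, "sim_results": {}}
complete = {"design": {"design_tool": "qiskit-metal", "design_options": {}}, "sim_results": {}}
incomplete = {"design": {"design_tool": "qiskit-metal"}, "sim_results": {}}

validate_structure(expected, complete)  # passes silently

try:
    validate_structure(expected, incomplete)
    error = None
except ValueError as exc:
    error = str(exc)
```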

validate_sweep()[source]#

Validates the sweep data by performing structure, type, and content validation on each entry.

Raises:

Exception – If the validation fails.

Returns:

None

validate_types(data)[source]#

Validates the types of the data using the schema defined in the config.

Parameters:

data (dict) – The data to be validated.
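Type validation against a schema can be sketched as follows, assuming the schema maps each field to a type name such as "float" or "str". That representation, the TYPE_NAMES table, and the choice to collect mismatches in a list rather than raise are assumptions for illustration.

```python
# Assumed mapping from schema type names to Python types.
TYPE_NAMES = {"str": str, "float": float, "int": int, "dict": dict, "list": list}


def validate_types(schema, data, path=""):
    """Return a list of type mismatches between `data` and `schema`."""
    problems = []
    for key, expected in schema.items():
        location = f"{path}.{key}" if path else key
        if key not in data:
            continue  # presence is a structure check, not a type check
        value = data[key]
        if isinstance(expected, dict):
            if isinstance(value, dict):
                problems += validate_types(expected, value, location)
            else:
                problems.append(f"{location}: expected dict, got {type(value).__name__}")
        elif not isinstance(value, TYPE_NAMES[expected]):
            problems.append(f"{location}: expected {expected}, got {type(value).__name__}")
    return problems


schema = {"sim_results": {"frequency": "float", "unit": "str"}}
good = {"sim_results": {"frequency": 5.2, "unit": "GHz"}}
bad = {"sim_results": {"frequency": "5.2", "unit": "GHz"}}
```

Collecting every mismatch lets a contributor fix all type errors in one pass.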

squadds.database.contributor_HF module#

class squadds.database.contributor_HF.Contribute(data_files)[source]#

Bases: object

Class representing a contributor for dataset creation and upload.

dataset_files#

List of dataset file paths.

Type:

list

institute#

Institution name.

Type:

str

pi_name#

PI (Principal Investigator) name.

Type:

str

api#

Hugging Face API object.

Type:

HfApi

token#

Hugging Face API token.

Type:

str

dataset_name#

Name of the dataset.

Type:

str

dataset_link#

Link to the dataset.

Type:

str

check_for_api_key()[source]#

Checks for the presence of Hugging Face API key.

create_dataset_name()[source]#

Creates a unique name for the dataset.

get_dataset_link()[source]#

Retrieves the link to the dataset.

upload_dataset()[source]#

Uploads the dataset to Hugging Face.

create_dataset_repository()[source]#

Creates a repository for the dataset on Hugging Face.

upload_dataset_no_validation()[source]#

Uploads the dataset to Hugging Face without validation.

check_for_api_key()[source]#

Checks for the presence of Hugging Face API key.

Returns:

The Hugging Face API object (api) and the Hugging Face API token (token).

Return type:

tuple (HfApi, str)

Raises:

ValueError – If Hugging Face token is not found.

create_dataset_name(components, data_type, data_nature, data_source, date=None)[source]#

Creates a unique name for the dataset.

Parameters:
  • components (list) – List of components.

  • data_type (str) – Type of the data.

  • data_nature (str) – Nature of the data.

  • data_source (str) – Source of the data.

  • date (str, optional) – Date of the dataset creation. Defaults to None.

Returns:

Unique name for the dataset.

Return type:

str
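A naming scheme built from these parameters might look like the sketch below. The separator choices and field order are hypothetical; the format the method actually produces is not documented on this page.

```python
from datetime import date as _date


def make_dataset_name(components, data_type, data_nature, data_source, date=None):
    """Join the descriptors into one name; the format is hypothetical."""
    if date is None:
        # Default to today's date in ISO format (YYYY-MM-DD).
        date = _date.today().isoformat()
    parts = ["-".join(components), data_type, data_nature, data_source, date]
    return "_".join(part.replace(" ", "-") for part in parts)


name = make_dataset_name(
    ["qubit", "cavity"], "eigenmode", "sim", "ansys", date="2024-01-15"
)
```

Passing date explicitly keeps the name reproducible; leaving it as None ties the name to the day of creation.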

create_dataset_repository(components, data_type, data_nature, data_source)[source]#

Creates a repository for the dataset on Hugging Face if it does not already exist.

Parameters:
  • components (list) – List of components.

  • data_type (str) – Type of the data.

  • data_nature (str) – Nature of the data.

  • data_source (str) – Source of the data.

get_dataset_link()[source]#

Retrieves the link to the dataset.

Returns:

Link to the dataset.

Return type:

str

upload_dataset()[source]#

Uploads the dataset to Hugging Face.

Raises:

NotImplementedError – If dataset upload is not implemented.

upload_dataset_no_validation(components, data_type, data_nature, data_source, files, date=None)[source]#

Uploads the dataset to Hugging Face without validation.

Parameters:
  • components (list) – List of components.

  • data_type (str) – Type of the data.

  • data_nature (str) – Nature of the data.

  • data_source (str) – Source of the data.

  • files (list) – List of file paths.

  • date (str, optional) – Date of the dataset creation. Defaults to None.

squadds.database.new_contribution module#

class squadds.database.new_contribution.ConfigMaker(component, component_name, data_type)[source]#

Bases: object

create_metadata(interactive=True)[source]#
set_schema(ref_file=None, interactive=True)[source]#

TODO: Implement create_metadata method (both interactive and non-interactive) for required fields.

    if interactive:
        self.set_design_fields()
        self.set_sim_options_fields()
        self.set_sim_results_fields()
        self.set_other_fields()
    else:
        self.set_fields(ref_file)

submit(results)[source]#

squadds.database.utils module#

Utilities for the database package.

squadds.database.utils.copy_files_to_new_location(data_path, new_path)[source]#

Copy files from the given data path to the new location.

Parameters:
  • data_path (str) – The path to the directory containing the files to be copied.

  • new_path (str) – The path to the directory where the files will be copied to.

Returns:

None

Raises:

None
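A flat directory copy can be sketched with shutil and pathlib. Whether the library copies recursively or creates the destination is not stated, so the non-recursive behaviour, the automatic mkdir, and the helper name copy_files are assumptions.

```python
import shutil
import tempfile
from pathlib import Path


def copy_files(data_path, new_path):
    """Copy every regular file in data_path into new_path (non-recursive)."""
    destination = Path(new_path)
    destination.mkdir(parents=True, exist_ok=True)
    for source in Path(data_path).iterdir():
        if source.is_file():
            # copy2 preserves file metadata such as modification time.
            shutil.copy2(source, destination / source.name)


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "data"
    src.mkdir()
    (src / "result.json").write_text("{}")
    copy_files(src, Path(tmp) / "backup")
    copied = sorted(p.name for p in (Path(tmp) / "backup").iterdir())
```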

squadds.database.utils.create_contributor_info()[source]#

Prompt the user for information and update the .env file.

This function prompts the user to enter information such as institution name, group name, PI name, user name, and an optional contrib_misc. It then validates the input and updates the corresponding fields in the .env file. If the fields already exist in the .env file, the function prompts the user to confirm whether to overwrite the existing values.

Raises:

ValueError – If any of the input fields are empty (except for contrib_misc).
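The validate-then-update step described above can be sketched without the interactive prompts. The helper update_env_file and the key names PI_NAME and INSTITUTION are hypothetical, and the overwrite confirmation is omitted: existing keys are simply replaced.

```python
import tempfile
from pathlib import Path


def update_env_file(env_path, updates, required=()):
    """Merge KEY=VALUE pairs into an .env-style file, replacing existing keys."""
    for key in required:
        if not updates.get(key, "").strip():
            raise ValueError(f"{key} must not be empty")
    path = Path(env_path)
    lines = path.read_text().splitlines() if path.exists() else []
    remaining = dict(updates)
    merged = []
    for line in lines:
        key = line.split("=", 1)[0].strip()
        if "=" in line and key in remaining:
            merged.append(f"{key}={remaining.pop(key)}")  # overwrite in place
        else:
            merged.append(line)
    # Keys not already present are appended at the end of the file.
    merged += [f"{key}={value}" for key, value in remaining.items()]
    path.write_text("\n".join(merged) + "\n")


with tempfile.TemporaryDirectory() as tmp:
    env_file = Path(tmp) / ".env"
    env_file.write_text("PI_NAME=Old Name\nOTHER=keep\n")
    update_env_file(
        env_file,
        {"PI_NAME": "New Name", "INSTITUTION": "Example University"},
        required=("PI_NAME", "INSTITUTION"),
    )
    contents = env_file.read_text()
```

Validating before touching the file means a rejected input leaves the existing .env unchanged.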

squadds.database.utils.generate_file_name(data_file)[source]#

Generate a unique file name based on the given data file.

Parameters:

data_file (str) – The path to the data file.

Returns:

The generated file name.

Return type:

str
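One possible way to derive a unique name from a data file is to append a short hash of its contents to the original stem. This is an assumed scheme, not necessarily the one the function uses; content_based_name and the 12-character digest length are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def content_based_name(data_file, digest_chars=12):
    """Build a name from the file stem plus a hash of the file contents."""
    path = Path(data_file)
    digest = hashlib.sha256(path.read_bytes()).hexdigest()[:digest_chars]
    return f"{path.stem}_{digest}{path.suffix}"


# Two different payloads under the same file name yield different names.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sweep.json"
    sample.write_text('{"frequency": 5.2}')
    first = content_based_name(sample)
    sample.write_text('{"frequency": 5.3}')
    second = content_based_name(sample)
```

A content hash gives the same name for identical files, which makes accidental duplicate uploads easy to detect.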

Module contents#