squadds.core package#
Submodules#
squadds.core.analysis module#
- class squadds.core.analysis.Analyzer(db=None)[source]#
Bases:
object
The Analyzer class is responsible for analyzing designs and finding the closest designs based on target parameters.
- _add_target_params_columns()[source]#
Adds target parameter columns to the dataframe based on the selected system.
- _fix_cavity_claw_df()[source]#
Fixes the cavity claw DataFrame by renaming columns and updating values.
- _get_H_param_keys()[source]#
Gets the parameter keys for the Hamiltonian based on the selected system.
- set_metric_strategy(strategy: MetricStrategy): Sets the metric strategy to use for calculating the distance metric.
- _outside_bounds(df: pd.DataFrame, params: dict, display=True) -> bool: Checks if entered parameters are outside the bounds of a dataframe.
- find_closest(target_params: dict, num_top: int, metric: str = 'Euclidean', display: bool = True): Finds the closest designs in the library based on the target parameters.
- get_interpolated_design(target_params: dict, metric: str = 'Euclidean', display: bool = True): Gets the interpolated design based on the target parameters.
Initializes an instance of the Analyzer class.
- Parameters:
db – The database object.
- db
The database object.
- selected_component_name
The name of the selected component.
- selected_component
The selected component.
- selected_data_type
The selected data type.
- selected_confg
The selected configuration.
- selected_qubit
The selected qubit.
- selected_cavity
The selected cavity.
- selected_coupler
The selected coupler.
- selected_system
The selected system.
- df
The selected dataframe.
- closest_df_entry
The closest dataframe entry.
- closest_design
The closest design.
- presimmed_closest_cpw_design
The presimmed closest CPW design.
- presimmed_closest_qubit_design
The presimmed closest qubit design.
- presimmed_closest_coupler_design
The presimmed closest coupler design.
- interpolated_design
The interpolated design.
- metric_strategy
The metric strategy (will be set dynamically).
- custom_metric_func
The custom metric function.
- metric_weights
The metric weights.
- target_params
The target parameters.
- H_param_keys
The H parameter keys.
- closest_design_in_H_space()[source]#
Plots a scatter plot of the closest design in the H-space.
This method creates a scatter plot with two subplots. The first subplot shows the relationship between ‘cavity_frequency_GHz’ and ‘kappa_kHz’, while the second subplot shows the relationship between ‘anharmonicity_MHz’ and ‘g_MHz’. The scatter plot includes pre-simulated data, target data, and the closest design entry from the database.
- Returns:
None
- find_closest(target_params, num_top, metric='Euclidean', display=True, parallel=False, num_cpu='auto', skip_df_gen=False)[source]#
Find the closest designs in the library based on the target parameters.
- Parameters:
target_params (dict) – A dictionary containing the target parameters.
num_top (int) – The number of closest designs to retrieve.
metric (str) – The distance metric to use for calculating distances. Defaults to ‘Euclidean’.
display (bool) – Whether to display warnings for parameters outside of the library bounds. Defaults to True.
parallel – Whether to run the metric calculation in parallel. Defaults to False.
num_cpu – The number of CPUs to use for the parallel job. Defaults to ‘auto’.
skip_df_gen – Whether to skip generating the DataFrame and reuse the one already in memory. Defaults to False.
- Returns:
A DataFrame containing the closest designs.
- Return type:
closest_df (DataFrame)
- Raises:
- ValueError – If the specified metric is not supported or if num_top is bigger than the size of the library.
- ValueError – If the metric is invalid.
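The selection step described above can be illustrated with a minimal, library-independent sketch. It is not the Analyzer implementation: the function names `euclidean` and `find_closest_rows`, the plain-dict rows, and the column values are all assumptions made for illustration. It only shows the idea of ranking library entries by a distance metric and rejecting a `num_top` larger than the library.

```python
import heapq
import math


def euclidean(target, row):
    """Plain Euclidean distance over the keys of `target` (illustrative metric)."""
    return math.sqrt(sum((row[k] - target[k]) ** 2 for k in target))


def find_closest_rows(library, target_params, num_top, metric=euclidean):
    """Return the `num_top` rows of `library` closest to `target_params`."""
    if num_top > len(library):
        raise ValueError("num_top cannot be bigger than the size of the library.")
    # nsmallest keeps the rows sorted from closest to farthest.
    return heapq.nsmallest(
        num_top, library, key=lambda row: metric(target_params, row)
    )


# Hypothetical library rows; real entries come from the SQuADDS database.
library = [
    {"cavity_frequency_GHz": 6.0, "g_MHz": 60.0},
    {"cavity_frequency_GHz": 7.0, "g_MHz": 70.0},
    {"cavity_frequency_GHz": 9.0, "g_MHz": 95.0},
]
target = {"cavity_frequency_GHz": 6.8, "g_MHz": 72.0}
closest = find_closest_rows(library, target, num_top=2)

try:
    find_closest_rows(library, target, num_top=10)
except ValueError as err:
    too_many_error = str(err)
```

Because the metric is passed in as a callable, swapping the distance definition does not change the ranking code, which mirrors the role of the metric argument documented above.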
- get_Ljs(df)[source]#
Extracts the EJ values from the dataframe and converts them to Josephson inductance values using pyEPR.
- Parameters:
df (pd.DataFrame) – The dataframe containing design options.
- Returns:
An array of Josephson inductance values.
- Return type:
np.array
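For orientation, the physical relation behind an EJ-to-LJ conversion is L_J = (ħ/2e)² / E_J. The sketch below evaluates it with the standard library only; it does not call pyEPR, and it assumes that EJ is given as a frequency in GHz and that the result is wanted in nH. The source does not state the units used by get_Ljs, and the name `lj_from_ej_ghz` is hypothetical.

```python
import math

PLANCK_H = 6.62607015e-34            # J*s, exact SI value
ELEMENTARY_CHARGE = 1.602176634e-19  # C, exact SI value


def lj_from_ej_ghz(ej_ghz):
    """Josephson inductance in nH for a Josephson energy E_J/h given in GHz."""
    # Reduced flux quantum hbar / (2e), in webers.
    reduced_flux_quantum = PLANCK_H / (2 * math.pi) / (2 * ELEMENTARY_CHARGE)
    ej_joule = PLANCK_H * ej_ghz * 1e9
    return reduced_flux_quantum ** 2 / ej_joule * 1e9  # convert H to nH


ljs = [lj_from_ej_ghz(ej) for ej in (10.0, 20.0)]
```

The inductance scales inversely with EJ, so doubling EJ halves LJ.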
- get_closest_cavity()[source]#
Returns the closest cavity design.
- Returns:
The closest cavity design.
- Return type:
pd.Series
- get_complete_df(target_params, metric='Euclidean', display=True)[source]#
Returns the complete DataFrame (design + Hamiltonian parameters) sourced using the target parameters.
- Parameters:
target_params (dict) – A dictionary containing the target parameters.
metric (str) – The distance metric to use for calculating distances. Defaults to ‘Euclidean’.
display (bool) – Whether to display warnings for parameters outside of the library bounds. Defaults to True.
- Returns:
A DataFrame containing all designs and Hamiltonian parameters.
- Return type:
complete_df (DataFrame)
- Raises:
- ValueError – If the specified metric is not supported or if num_top is bigger than the size of the library.
- ValueError – If the metric is invalid.
- get_coupler_options(df)[source]#
Extracts coupler options from the dataframe.
- Parameters:
df (pd.DataFrame) – The dataframe containing design options.
- Returns:
A dictionary containing lists of the extracted coupler options.
- Return type:
Dict[str, List[Any]]
- get_cpw_options(df)[source]#
Extracts CPW options from the dataframe.
- Parameters:
df (pd.DataFrame) – The dataframe containing design options.
- Returns:
A dictionary containing lists of the extracted CPW options.
- Return type:
Dict[str, List[Any]]
- get_design(df)[source]#
Extracts the design parameters from the dataframe and returns a dict.
- Returns:
A dict containing the design parameters.
- Return type:
dict
- get_qubit_options(df)[source]#
Extracts qubit design options from the dataframe.
- Parameters:
df (pd.DataFrame) – The dataframe containing design options.
- Returns:
A dictionary containing lists of the extracted qubit options.
- Return type:
Dict[str, List[Any]]
- set_metric_strategy(strategy)[source]#
Sets the metric strategy to use for calculating the distance metric.
- Parameters:
strategy (MetricStrategy) – The strategy to use for calculating the distance metric.
- Raises:
ValueError – If the specified metric is not supported.
- squadds.core.analysis.scale_value(value, ratio)[source]#
Scales the given value by the specified ratio.
- Parameters:
value (str) – The value to be scaled, in the format ‘Xum’ where X is a number.
ratio – The scaling ratio.
- Returns:
The scaled value in the format ‘Xum’ where X is the scaled number.
- Return type:
scaled_value (str)
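A minimal sketch of this kind of scaling, assuming the input always ends in the literal suffix ‘um’ and that the number is formatted with Python's default float formatting. The helper name `scale_um_value` is hypothetical, and the exact output formatting of scale_value may differ.

```python
def scale_um_value(value, ratio):
    """Scale a length string such as '10um' by `ratio`, keeping the 'um' suffix."""
    if not value.endswith("um"):
        raise ValueError(f"expected a value like '10um', got {value!r}")
    number = float(value[:-2])  # strip the two-character unit suffix
    return f"{number * ratio}um"


scaled = scale_um_value("10um", 1.5)
```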
squadds.core.db module#
!TODO: add FULL support for half-wave cavity
- class squadds.core.db.SQuADDS_DB(*args, **kwargs)[source]#
Bases:
object
A class representing the SQuADDS database.
- _delete_cache()#
Delete the dataset cache directory.
- get_dataset_info(component, component_name, data_type)[source]#
Print information about a specific dataset.
- view_contributors_of_config(config)[source]#
Print a table of contributors for a specific configuration.
- view_contributors_of(component, component_name, data_type)[source]#
Print a table of contributors for a specific component, component name, and data type.
- select_components(component_dict)[source]#
Select a configuration based on a component dictionary or string.
- select_system(components)[source]#
Select a system based on a list of components or a single component.
Constructor for the SQuADDS_DB class.
- repo_name#
The name of the repository.
- Type:
str
- configs#
List of supported configuration names.
- Type:
list
- selected_component_name#
The name of the selected component.
- Type:
str
- selected_component#
The selected component.
- Type:
str
- selected_data_type#
The selected data type.
- Type:
str
- selected_confg#
The selected configuration.
- Type:
str
- selected_qubit#
The selected qubit.
- Type:
str
- selected_cavity#
The selected cavity.
- Type:
str
- selected_coupler#
The selected coupler.
- Type:
str
- selected_resonator_type#
The selected resonator type.
- Type:
str
- selected_system#
The selected system.
- Type:
str
- selected_df#
The selected dataframe.
- Type:
str
- target_param_keys#
The target parameter keys.
- Type:
str
- units#
The units.
- Type:
str
- _internal_call#
Flag to track internal calls.
- Type:
bool
- create_qubit_cavity_df(qubit_df, cavity_df, merger_terms=None, parallelize=False, num_cpu=None)[source]#
Creates a merged DataFrame by merging the qubit and cavity DataFrames based on the specified merger terms.
- Parameters:
qubit_df (pandas.DataFrame) – The DataFrame containing qubit data.
cavity_df (pandas.DataFrame) – The DataFrame containing cavity data.
merger_terms (list) – A list of column names to be used for merging the DataFrames. Defaults to None.
parallelize (bool) – Whether to use multiprocessing to speed up the merging. Defaults to False.
num_cpu (int) – The number of CPU cores to use for multiprocessing. If not specified, the function will use the maximum number of available cores.
- Returns:
The merged DataFrame.
- Return type:
pandas.DataFrame
- Raises:
None –
- create_system_df(parallelize=False, num_cpu=None)[source]#
Creates and returns a DataFrame based on the selected system.
- Parameters:
parallelize (bool) – Whether to use multiprocessing to speed up the merging. Defaults to False.
num_cpu (int) – The number of CPU cores to use for multiprocessing. If not specified, the function will use the maximum number of available cores.
If the selected system is a single component, it retrieves the dataset based on the selected data type, component, and component name. If a coupler is selected, the DataFrame is filtered by the coupler. The resulting DataFrame is stored in the selected_df attribute.
If the selected system is a list of components (qubit and cavity), it retrieves the qubit and cavity DataFrames. The qubit DataFrame is obtained based on the selected qubit component name and data type “cap_matrix”. The cavity DataFrame is obtained based on the selected cavity component name and data type “eigenmode”. The qubit and cavity DataFrames are merged into a single DataFrame using the merger terms [‘claw_width’, ‘claw_length’, ‘claw_gap’]. The resulting DataFrame is stored in the selected_df attribute.
- Raises:
UserWarning – If the selected system is either not specified or does not contain a cavity.
- Returns:
The created DataFrame based on the selected system.
- Return type:
pandas.DataFrame
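The qubit-cavity merge described above amounts to an inner join on the claw geometry columns. The sketch below shows that join on plain lists of dictionaries rather than DataFrames. The helper `inner_join`, the row contents, and the `qubit_id` / `cavity_id` columns are assumptions for illustration only; the merger terms are the ones named in the description.

```python
def inner_join(left_rows, right_rows, keys):
    """Inner-join two lists of dict rows on the columns listed in `keys`."""
    index = {}
    for row in right_rows:
        index.setdefault(tuple(row[k] for k in keys), []).append(row)
    merged = []
    for row in left_rows:
        for match in index.get(tuple(row[k] for k in keys), []):
            merged.append({**row, **match})
    return merged


merger_terms = ["claw_width", "claw_length", "claw_gap"]

# Hypothetical rows standing in for the qubit and cavity datasets.
qubit_rows = [
    {"claw_width": "10um", "claw_length": "100um", "claw_gap": "5um", "qubit_id": "q1"},
    {"claw_width": "10um", "claw_length": "200um", "claw_gap": "5um", "qubit_id": "q2"},
]
cavity_rows = [
    {"claw_width": "10um", "claw_length": "100um", "claw_gap": "5um", "cavity_id": "c1"},
    {"claw_width": "10um", "claw_length": "100um", "claw_gap": "5um", "cavity_id": "c2"},
    {"claw_width": "10um", "claw_length": "300um", "claw_gap": "5um", "cavity_id": "c3"},
]
merged = inner_join(qubit_rows, cavity_rows, merger_terms)
```

Each qubit row is paired with every cavity row that shares the same claw geometry, so the merged table can be much larger than either input; rows without a partner are dropped.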
- find_parquet_files()[source]#
Searches for parquet files in the repository and returns their paths/filenames.
- Returns:
A list of paths/filenames of parquet files in the repository.
- Return type:
list
- generate_qubit_half_wave_cavity_df(parallelize=False, num_cpu=None, save_data=False)[source]#
Generates a DataFrame that combines the qubit and half-wave cavity data.
- Parameters:
parallelize (bool, optional) – Flag indicating whether to parallelize the computation. Defaults to False.
num_cpu (int, optional) – Number of CPUs to use for parallelization. Defaults to None.
save_data (bool, optional) – Flag indicating whether to save the generated data. Defaults to False.
- Returns:
The generated DataFrame.
- Return type:
pandas.DataFrame
- Raises:
None –
Notes
This method generates a DataFrame by combining the qubit and half-wave cavity data.
The qubit and cavity data are obtained from the get_dataset and generate_updated_half_wave_cavity_df methods, respectively.
The generated DataFrame is optimized to reduce memory usage using various optimization techniques.
If save_data is True, the generated DataFrames are saved in the “data” directory.
- generate_updated_half_wave_cavity_df(parallelize=False, num_cpu=None)[source]#
!TODO: speed this up!
- get_component_names(component=None)[source]#
Get the names of the components associated with a specific component.
- Parameters:
component (str) – The specific component to retrieve names for.
- Returns:
A list of component names associated with the specified component.
- Return type:
list
- get_configs()[source]#
Returns the configurations stored in the database.
- Returns:
A list of configuration names.
- Return type:
list
- get_dataset(data_type=None, component=None, component_name=None)[source]#
Retrieves a dataset based on the specified data type, component, and component name.
- Parameters:
data_type (str) – The type of data to retrieve.
component (str) – The component to retrieve the data from.
component_name (str) – The name of the component to retrieve the data from.
- Returns:
The retrieved dataset.
- Return type:
pandas.DataFrame
- Raises:
ValueError – If the system and component name are not defined.
ValueError – If the data type is not specified.
ValueError – If the component is not supported.
ValueError – If the component name is not supported.
ValueError – If the data type is not supported.
Exception – If an error occurs while loading the dataset.
- get_dataset_info(component=None, component_name=None, data_type=None)[source]#
Retrieves and prints information about a dataset.
- Parameters:
component (str) – The component of the dataset.
component_name (str) – The name of the component.
data_type (str) – The type of data.
- Returns:
None
- get_device_contributors_of(component=None, component_name=None, data_type=None)[source]#
View the reference/source experimental device that was used to validate a specific simulation configuration.
- Parameters:
component (str) – The component of interest.
component_name (str) – The name of the component.
data_type (str) – The type of data.
- Returns:
The relevant contributor information.
- Return type:
dict
- get_existing_files()[source]#
Retrieves the list of existing files in the repository.
- Returns:
A list of existing file names in the repository.
- Return type:
list
- get_measured_devices()[source]#
Retrieve all measured devices with their corresponding design codes, paper links, images, foundries, and fabrication recipes.
- Returns:
A DataFrame containing the name, design code, paper link, image, foundry, and fabrication recipe for each device.
- Return type:
pd.DataFrame
- read_parquet_file(file_name)[source]#
Takes in the filename and returns the object to be read as a pandas dataframe.
- Parameters:
file_name (str) – The name of the parquet file to read.
- Returns:
The dataframe read from the parquet file.
- Return type:
pandas.DataFrame
- see_dataset(data_type=None, component=None, component_name=None)[source]#
View a dataset based on the provided data type, component, and component name.
- Parameters:
data_type (str) – The type of data to view.
component (str) – The component to use. If not provided, the selected system will be used.
component_name (str) – The name of the component. If not provided, the selected component name will be used.
- Returns:
The flattened dataset.
- Return type:
pandas.DataFrame
- Raises:
ValueError – If both system and component name are not defined.
ValueError – If data type is not specified.
ValueError – If the component is not supported.
ValueError – If the component name is not supported.
ValueError – If the data type is not supported.
Exception – If an error occurs while loading the dataset.
- select_cavity(cavity=None)[source]#
Selects a cavity and sets the necessary attributes for further operations.
- Parameters:
cavity (str) – The name of the cavity to be selected.
- Raises:
UserWarning – If the selected system is either not specified or does not contain a cavity.
- Returns:
None
- select_cavity_claw(cavity=None)[source]#
Selects a cavity claw component.
- Parameters:
cavity (str) – The name of the cavity to select.
- Raises:
UserWarning – If the selected system is not specified or does not contain a cavity.
- Returns:
None
- select_components(component_dict=None)[source]#
Selects components based on the provided component dictionary or string.
- Parameters:
component_dict (dict or str) – A dictionary containing the component details (component, component_name, data_type) or a string representing the component.
- Returns:
None
- select_coupler(coupler=None)[source]#
Selects a coupler for the database.
- Parameters:
coupler (str, optional) – The name of the coupler to select. Defaults to None.
- Returns:
None
- select_qubit(qubit=None)[source]#
Selects a qubit and sets the necessary attributes for the selected qubit.
- Parameters:
qubit (str) – The name of the qubit to be selected.
- Raises:
UserWarning – If the selected system is not specified or does not contain a qubit.
- Returns:
None
- select_resonator_type(resonator_type)[source]#
Select the coupler based on the resonator type.
- Parameters:
resonator_type (str) – The type of resonator, e.g., “quarter” or “half”.
- select_system(components=None)[source]#
Selects the system and component(s) to be used.
- Parameters:
components (list or str) – The component(s) to be selected. If a list is provided, each component will be checked against the supported components. If a string is provided, it will be checked against the supported components.
- Returns:
None
- Raises:
None –
- show_selections()[source]#
Prints the selected system, component, and data type.
If the selected system is a list, it prints the selected qubit, cavity, coupler, and system. If the selected system is a string, it prints the selected component, component name, data type, system, and coupler.
- supported_component_names()[source]#
Returns a list of supported component names extracted from the configs.
- Returns:
A list of supported component names.
- Return type:
list
- supported_components()[source]#
Returns a list of supported components based on the configurations.
- Returns:
A list of supported components.
- Return type:
list
- supported_config_names()[source]#
Retrieves the supported configuration names from the repository.
- Returns:
A list of supported configuration names.
- supported_data_types()[source]#
Returns a list of supported data types.
- Returns:
A list of supported data types.
- Return type:
list
- unselect(param)[source]#
Unselects the specified parameter.
- Parameters:
param (str) – The parameter to unselect. Valid options are “component”, “component_name”, “data_type”, “qubit”, “cavity_claw”, “coupler”, and “system”.
- Returns:
None
- unselect_all()[source]#
Clears the selected component, data type, qubit, cavity, coupler, and system.
- upload_dataset(file_paths, repo_file_names, overwrite=False)[source]#
Uploads a dataset to the repository.
- Parameters:
file_paths (list) – A list of file paths to upload.
repo_file_names (list) – A list of file names to use in the repository.
overwrite (bool) – Whether to overwrite an existing dataset. Defaults to False.
- view_all_contributors()[source]#
View all unique contributors and their relevant information from simulation configurations.
This method iterates through the simulation configurations and extracts the relevant information of each contributor. It checks if the combination of uploader, PI, group, and institution is already in the list of unique contributors. If not, it adds the relevant information to the list. Finally, it prints the list of unique contributors in a tabular format with a banner.
- view_all_simulation_contributors()[source]#
View all unique simulation contributors and their relevant information.
- view_component_names(component=None)[source]#
Prints the names of the components available in the database.
- Parameters:
component (str) – The specific component to view names for. If None, all component names will be printed.
- Returns:
None
- view_contributors_of(component=None, component_name=None, data_type=None, measured_device_name=None)[source]#
View contributors of a specific component, component name, and data type.
- Parameters:
component (str) – The component of interest.
component_name (str) – The name of the component.
data_type (str) – The type of data.
measured_device_name (str) – The name of the measured device.
- Returns:
None
- view_contributors_of_config(config)[source]#
View the contributors of a specific configuration.
- Parameters:
config (str) – The name of the configuration.
- Returns:
None
- view_datasets()[source]#
View the datasets available in the database.
This method retrieves the supported components, component names, and data types from the database and displays them in a tabular format.
- view_device_contributors_of(component=None, component_name=None, data_type=None)[source]#
View the reference/source experimental device that was used to validate a specific simulation configuration.
- Parameters:
component (str) – The component of interest.
component_name (str) – The name of the component.
data_type (str) – The type of data.
- Returns:
The name of the experimentally validated reference device, or an error message if not found.
- Return type:
str
- view_measured_devices()[source]#
View all measured devices with their corresponding design codes, paper links, images, foundries, and fabrication recipes.
This method retrieves and displays the relevant information for each device in the dataset in a well-formatted table.
- view_recipe_of(device_name)[source]#
Retrieve the foundry and fabrication recipe information for a specified device.
- Parameters:
device_name (str) – The name of the device to retrieve information for.
- Returns:
A dictionary containing foundry and fabrication recipe information.
- Return type:
dict
- view_reference_device_of(component=None, component_name=None, data_type=None)[source]#
View the reference/source experimental device that was used to validate a specific simulation configuration.
- Parameters:
component (str) – The component of interest.
component_name (str) – The name of the component.
data_type (str) – The type of data.
- view_reference_devices()[source]#
View all unique reference (experimental) devices and their relevant information.
This method iterates through the configurations and extracts the chip’s name within the SQuADDS DB, the group, and who the chip was measured by. It also finds the simulation results for the device. It then checks whether the combination of simulation results uploader, PI, group, and institution is already in the list of unique contributors and, if not, adds the relevant information to the list. Finally, it prints the list of unique devices in a tabular format.
- view_sim_contributors_of(component=None, component_name=None, data_type=None, measured_device_name=None)[source]#
View the simulation contributors of a specific component, component name, and data type.
- Parameters:
component (str) – The component of interest.
component_name (str) – The name of the component.
data_type (str) – The type of data.
measured_device_name (str) – The name of the measured device.
- Returns:
None
squadds.core.design_patterns module#
squadds.core.globals module#
squadds.core.metrics module#
- class squadds.core.metrics.ChebyshevMetric[source]#
Bases:
MetricStrategy
Implements the Chebyshev metric strategy.
- calculate(target_params, df_row)[source]#
Calculate the Chebyshev distance between target_params and df_row.
- Parameters:
target_params (dict) – The target parameters as a dictionary.
df_row (pd.Series) – A single row from a DataFrame representing a set of parameters.
- Returns:
The Chebyshev distance.
- Return type:
float
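As a reference point, the textbook Chebyshev distance is the largest absolute difference over all compared parameters. The sketch below uses plain dictionaries in place of a pd.Series row; the function name `chebyshev_distance` and the sample values are illustrative assumptions, and the class above may handle missing or non-numeric columns differently.

```python
def chebyshev_distance(target_params, row):
    """Largest absolute per-parameter difference between target and row."""
    return max(abs(row[key] - target_params[key]) for key in target_params)


target = {"cavity_frequency_GHz": 1.0, "g_MHz": 5.0}
row = {"cavity_frequency_GHz": 4.0, "g_MHz": 4.0}
distance = chebyshev_distance(target, row)
```

Only the single worst-matching parameter determines the result, so this metric suits cases where no individual parameter may deviate too far.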
- class squadds.core.metrics.CustomMetric(custom_metric_func)[source]#
Bases:
MetricStrategy
Implements a custom metric strategy using a user-defined function.
- Example Usage:
To use a custom Manhattan distance metric, define the function as follows:
    def manhattan_distance(target, simulated):
        return sum(abs(target[key] - simulated.get(key, 0)) for key in target)
Then, instantiate CustomMetric with this function:
    custom_metric = CustomMetric(manhattan_distance)
Initialize CustomMetric with a custom metric function.
- Parameters:
custom_metric_func (callable) – User-defined custom metric function. The function should take two dictionaries as arguments and return a float.
- calculate(target_params, df_row)[source]#
Calculate the custom metric between target_params and df_row using the user-defined function.
- Parameters:
target_params (dict) – The target parameters as a dictionary.
df_row (pd.Series) – A single row from a DataFrame representing a set of parameters.
- Returns:
The custom metric calculated using the user-defined function.
- Return type:
float
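The relationship between the abstract strategy and a user-supplied function can be sketched with stand-in classes. `MetricStrategySketch` and `CustomMetricSketch` are hypothetical names that mimic the documented interface; they are not the squadds classes, and the real calculate may convert the row before calling the function.

```python
from abc import ABC, abstractmethod


class MetricStrategySketch(ABC):
    """Stand-in for an abstract metric strategy with a single calculate method."""

    @abstractmethod
    def calculate(self, target_params, row):
        """Return the distance between the target parameters and one row."""


class CustomMetricSketch(MetricStrategySketch):
    """Wraps a user-defined callable taking two dictionaries, returning a float."""

    def __init__(self, custom_metric_func):
        self.custom_metric_func = custom_metric_func

    def calculate(self, target_params, row):
        return self.custom_metric_func(target_params, row)


def manhattan_distance(target, simulated):
    return sum(abs(target[key] - simulated.get(key, 0)) for key in target)


metric = CustomMetricSketch(manhattan_distance)
# "b" is missing from the row, so simulated.get falls back to 0 for it.
distance = metric.calculate({"a": 1.0, "b": 2.0}, {"a": 3.0})
```

Code that only calls `calculate` never needs to know which distance definition is behind it.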
- class squadds.core.metrics.EuclideanMetric[source]#
Bases:
MetricStrategy
Implements the custom Euclidean metric strategy defined under calculate.
- calculate(target_params, df_row)[source]#
Calculate the custom Euclidean distance between target_params and df_row.
The Euclidean distance is calculated as: sqrt(sum_i (x_i - x_{target})^2 / x_{target}), where x_i are the values in df_row and x_{target} are the target parameters.
- Parameters:
target_params (dict) – The target parameters as a dictionary.
df_row (pd.Series) – A single row from a DataFrame representing a set of parameters.
- Returns:
The custom Euclidean distance.
- Return type:
float
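The formula quoted above can be transcribed literally as follows. This is a sketch of the stated expression only, read as a sum of per-parameter terms each divided by its own target value; it assumes all target values are nonzero and uses plain dictionaries. The name `custom_euclidean` is hypothetical and the library code may differ in detail.

```python
import math


def custom_euclidean(target_params, row):
    """sqrt(sum_i (x_i - x_target)^2 / x_target), as stated in the description."""
    total = 0.0
    for key, target_value in target_params.items():
        # Assumes target_value != 0; a zero target would divide by zero.
        total += (row[key] - target_value) ** 2 / target_value
    return math.sqrt(total)


target = {"a": 4.0, "b": 1.0}
row = {"a": 6.0, "b": 2.0}
distance = custom_euclidean(target, row)
```

Dividing each squared difference by its target value reduces the influence of parameters with large magnitudes compared with a plain Euclidean distance.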
- class squadds.core.metrics.ManhattanMetric[source]#
Bases:
MetricStrategy
Implements the Manhattan metric strategy.
- calculate(target_params, df_row)[source]#
Calculate the Manhattan distance between target_params and df_row.
- Parameters:
target_params (dict) – The target parameters as a dictionary.
df_row (pd.Series) – A single row from a DataFrame representing a set of parameters.
- Returns:
The Manhattan distance.
- Return type:
float
- class squadds.core.metrics.MetricStrategy[source]#
Bases:
ABC
Abstract class for metric strategies.
- abstract calculate(target_params, row)[source]#
Calculate the distance metric between target parameters and a DataFrame row.
- Parameters:
target_params (dict) – Dictionary of target parameters.
row (pd.Series) – A row from a DataFrame.
- Returns:
Calculated distance.
- Return type:
float
- calculate_in_parallel(target_params, df, num_jobs=4)[source]#
Calculate distances in parallel.
- Parameters:
target_params (dict) – Dictionary of target parameters.
df (pd.DataFrame) – The DataFrame containing rows to calculate distances for.
num_jobs (int) – Number of jobs for parallel processing.
- Returns:
Series of calculated distances.
- Return type:
pd.Series
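One way to spread per-row distance calculations over several workers is shown below with the standard library's ThreadPoolExecutor. This is an assumption-laden sketch: the source does not say which parallel backend calculate_in_parallel uses, the rows are plain dictionaries rather than a DataFrame, and the names `distances_in_parallel` and `manhattan` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def manhattan(target_params, row):
    return sum(abs(row[key] - target_params[key]) for key in target_params)


def distances_in_parallel(metric, target_params, rows, num_jobs=4):
    """Apply `metric` to every row using `num_jobs` workers, preserving row order."""
    with ThreadPoolExecutor(max_workers=num_jobs) as pool:
        return list(pool.map(lambda row: metric(target_params, row), rows))


rows = [{"a": 1.0, "b": 2.0}, {"a": 4.0, "b": 6.0}]
target = {"a": 1.0, "b": 4.0}
distances = distances_in_parallel(manhattan, target, rows, num_jobs=2)
```

Threads keep the example short, but they do not speed up CPU-bound pure-Python metrics; a process-based pool is the usual choice when the calculation itself is the bottleneck.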
- class squadds.core.metrics.WeightedEuclideanMetric(weights)[source]#
Bases:
MetricStrategy
Concrete class for weighted Euclidean metric.
Initialize the weights.
- Parameters:
weights (dict) – Dictionary of weights for each parameter.
- calculate(target_params, row)[source]#
Calculate the weighted Euclidean distance between target parameters and a DataFrame row.
- Parameters:
target_params (dict) – Dictionary of target parameters.
row (pd.Series) – A row from a DataFrame.
- Returns:
Calculated weighted Euclidean distance.
- Return type:
float
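The documentation above does not spell out the formula, so the following sketch uses one common definition of a weighted Euclidean distance, sqrt(sum_i w_i (x_i - t_i)^2), with a default weight of 1 for parameters that have no entry in `weights`. Both the definition and the default are assumptions, as is the name `weighted_euclidean`.

```python
import math


def weighted_euclidean(target_params, row, weights):
    """sqrt(sum_i w_i * (x_i - t_i)^2); missing weights default to 1 (assumption)."""
    total = 0.0
    for key, target_value in target_params.items():
        weight = weights.get(key, 1.0)
        total += weight * (row[key] - target_value) ** 2
    return math.sqrt(total)


target = {"a": 0.0, "b": 0.0}
row = {"a": 3.0, "b": 2.0}
distance = weighted_euclidean(target, row, weights={"a": 1.0, "b": 4.0})
```

A larger weight makes deviations in that parameter count more, which is useful when one Hamiltonian parameter matters more than the others for a given design goal.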
squadds.core.processing module#
- squadds.core.processing.update_cavity_frequency_and_kappa(merged_df, Z0=50)[source]#
Updates the cavity frequency and kappa based on the given merged_df DataFrame.
- Parameters:
merged_df – DataFrame containing the necessary simulation results.
Z0 – Characteristic impedance of the system. Defaults to 50 Ohms.
- Returns:
cavity_frequency_updated – Updated cavity frequency in Hz.
kappa – Updated kappa in Hz.
squadds.core.utils module#
- squadds.core.utils.can_be_categorical(column)[source]#
Check if all elements in the column are hashable.
- squadds.core.utils.columns_memory_usage(df)[source]#
Calculates the memory usage of each column and returns a DataFrame showing each column’s memory usage and percentage of total memory usage.
- Parameters:
df – DataFrame to process.
- Returns:
mem_usage_df – DataFrame with columns ‘Column’, ‘Memory Usage (MB)’, and ‘Percentage of Total Memory Usage’.
- squadds.core.utils.compare_schemas(data_schema, expected_schema, path='')[source]#
Compare two schemas and raise an error if there are any mismatches.
- Parameters:
data_schema (dict) – The data schema to compare.
expected_schema (dict) – The expected schema to compare against.
path (str, optional) – The current path in the schema. Used for error messages. Defaults to ‘’.
- Raises:
ValueError – If there is a key in the data schema that is not present in the expected schema.
ValueError – If there is a type mismatch between the data schema and the expected schema.
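A recursive comparison with the two failure modes listed above might look like the sketch below. It assumes schemas are nested dictionaries whose leaves are type-name strings, and the error messages are invented for illustration; `compare_schemas_sketch` is a hypothetical stand-in, not the library function.

```python
def compare_schemas_sketch(data_schema, expected_schema, path=""):
    """Raise ValueError on an unexpected key or a type mismatch."""
    for key, data_type in data_schema.items():
        if key not in expected_schema:
            raise ValueError(f"Unexpected key: {path}{key}")
        expected_type = expected_schema[key]
        if isinstance(expected_type, dict) and isinstance(data_type, dict):
            # Recurse, extending the path so nested errors are easy to locate.
            compare_schemas_sketch(data_type, expected_type, f"{path}{key}.")
        elif data_type != expected_type:
            raise ValueError(
                f"Type mismatch at {path}{key}: "
                f"expected {expected_type}, got {data_type}"
            )


expected = {"design": {"width": "str"}, "sim_results": {"freq": "float"}}

compare_schemas_sketch({"design": {"width": "str"}}, expected)  # passes silently

try:
    compare_schemas_sketch({"design": {"height": "str"}}, expected)
except ValueError as err:
    key_error = str(err)

try:
    compare_schemas_sketch({"sim_results": {"freq": "int"}}, expected)
except ValueError as err:
    type_error = str(err)
```

The `path` argument plays the role described in the parameter list: it accumulates the location inside the schema so the message points at the offending entry.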
- squadds.core.utils.compute_memory_usage(df)[source]#
Compute the memory usage of the given DataFrame.
- Parameters:
df (pandas.DataFrame) – The DataFrame to compute the memory usage for.
- Returns:
The memory usage of the DataFrame in megabytes.
- Return type:
float
- squadds.core.utils.convert_list_to_str(lst)[source]#
Converts the given list of floats to a string representation.
- Parameters:
lst (list) – The list of floats to be converted.
- Returns:
The string representation of the list.
- Return type:
str
- squadds.core.utils.convert_numpy(obj)[source]#
Converts NumPy arrays to Python lists recursively.
- Parameters:
obj – The object to be converted.
- Returns:
The converted object.
- squadds.core.utils.convert_to_numeric(value)[source]#
Converts a value to a numeric type if possible.
- Parameters:
value – The value to be converted.
- Returns:
The converted value if it can be converted to int or float, otherwise returns the original value.
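The fallback behaviour described above (try int, then float, otherwise return the input unchanged) can be sketched as follows. The name `to_numeric` is hypothetical, and the choice to pass existing numbers through untouched is an assumption made here to avoid truncating floats.

```python
def to_numeric(value):
    """Return `value` as int or float when possible, otherwise unchanged."""
    if isinstance(value, (int, float)):
        return value  # already numeric; avoids int(3.7) silently truncating
    for cast in (int, float):
        try:
            return cast(value)
        except (TypeError, ValueError):
            continue
    return value


converted = [to_numeric(v) for v in ("3", "3.5", "10um", None, 3.7)]
```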
- squadds.core.utils.convert_to_str(value, units)[source]#
Converts the given value to a string with the given units.
- Parameters:
value (float) – The value to be converted.
units (str) – The units to be appended to the value.
- Returns:
The value as a string with the units.
- Return type:
str
- squadds.core.utils.create_mailto_link(recipients, subject, body)[source]#
Create a mailto link with the given recipients, subject, and body.
- Parameters:
recipients (list) – A list of email addresses of the recipients.
subject (str) – The subject of the email.
body (str) – The body of the email.
- Returns:
The generated mailto link.
- Return type:
str
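A mailto link of this shape can be assembled with urllib.parse.quote, which percent-encodes the subject and body. The sketch below is an assumption about the link layout (comma-separated recipients, then `subject` and `body` query fields); the helper name `mailto_link` and the addresses are placeholders.

```python
from urllib.parse import quote


def mailto_link(recipients, subject, body):
    """Build a mailto: URL with percent-encoded subject and body."""
    to_field = ",".join(recipients)
    return f"mailto:{to_field}?subject={quote(subject)}&body={quote(body)}"


link = mailto_link(
    ["a@example.org", "b@example.org"],
    "New dataset",
    "Hello, team!",
)
```

Percent-encoding matters because spaces, commas, and line breaks in the body would otherwise break the URL when it is handed to a mail client.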
- squadds.core.utils.create_unified_design_options(row)[source]#
Create a unified design options dictionary based on the given row.
- Parameters:
row (pandas.Series) – The row containing the design options.
- Returns:
The unified design options dictionary.
- Return type:
dict
- squadds.core.utils.delete_categorical_columns(df)[source]#
Deletes all columns of type ‘category’ from the DataFrame.
- Parameters:
df – DataFrame to process.
- Returns:
df – DataFrame with ‘category’ columns removed.
- squadds.core.utils.delete_object_columns(df)[source]#
Deletes all columns of type ‘object’ from the DataFrame.
- Parameters:
df – DataFrame to process.
- Returns:
df – DataFrame with ‘object’ columns removed.
- squadds.core.utils.filter_df_by_conditions(df, conditions)[source]#
Filter a DataFrame based on given conditions.
- Parameters:
df (pandas.DataFrame) – The DataFrame to be filtered.
conditions (dict) – A dictionary containing column-value pairs as conditions.
- Returns:
The filtered DataFrame.
- Return type:
pandas.DataFrame
- Raises:
None –
- squadds.core.utils.flatten_df_second_level(df)[source]#
Flattens a DataFrame by expanding dictionary-like data in the second level of columns.
- Parameters:
df (pandas.DataFrame) – The DataFrame to be flattened.
- Returns:
A new DataFrame with the flattened data.
- Return type:
pandas.DataFrame
- squadds.core.utils.float_to_string(value, units)[source]#
Converts a float value to a string representation with units.
- Parameters:
value (float) – The value to be converted.
units (str) – The units to be appended to the value.
- Returns:
The value as a string with the units.
- Return type:
str
- squadds.core.utils.get_config_schema(entry)[source]#
Generates the schema for the given entry with specific rules. The ‘sim_results’ are fully expanded, while others are expanded to the first level.
- squadds.core.utils.get_entire_schema(obj)[source]#
Recursively traverses the given object and returns a schema representation.
- Parameters:
obj – The object to generate the schema for.
- Returns:
The schema representation of the object.
- squadds.core.utils.get_schema(obj)[source]#
Returns the schema of the given object.
- Parameters:
obj – The object for which the schema needs to be determined.
- Returns:
The schema of the object. If the object is a dictionary, the schema will be a dictionary with the same keys as the original dictionary, where the values represent the schema of the corresponding values in the original dictionary. If the object is a list, the schema will be either ‘dict’ if the list contains dictionaries, or the type name of the first element in the list. For any other type of object, the schema will be the type name of the object.
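The rules in the Returns description translate almost directly into a small recursive function. The sketch below follows them for dictionaries, lists, and other objects; its handling of an empty list (returning ‘list’) is an assumption, since the description does not cover that case, and `schema_of` is a hypothetical name.

```python
def schema_of(obj):
    """Schema per the rules above: dicts recurse, lists summarize, others name their type."""
    if isinstance(obj, dict):
        return {key: schema_of(value) for key, value in obj.items()}
    if isinstance(obj, list):
        if any(isinstance(item, dict) for item in obj):
            return "dict"
        # Empty-list behaviour is not specified in the description (assumption).
        return type(obj[0]).__name__ if obj else "list"
    return type(obj).__name__


entry = {
    "name": "example",
    "design": {"width": "10um", "count": 3},
    "freqs": [5.1, 5.2],
    "contributors": [{"pi": "someone"}],
}
schema = schema_of(entry)
```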
- squadds.core.utils.get_sim_results_keys(dataframes)[source]#
Get the unique keys from the ‘sim_results’ column of the given dataframes.
- Parameters:
dataframes (list or pandas.DataFrame) – A list of dataframes or a single dataframe.
- Returns:
A list of unique keys extracted from the ‘sim_results’ column.
- Return type:
list
- squadds.core.utils.optimize_dataframe(df)[source]#
Optimize the memory usage of a pandas DataFrame by downcasting data types.
- Parameters:
df (pandas.DataFrame) – The DataFrame to be optimized.
- Returns:
df_optimized (pandas.DataFrame) – The optimized DataFrame.
- squadds.core.utils.print_column_types(df)[source]#
Prints out the data type of each column in the DataFrame.
- Parameters:
df – DataFrame to analyze.
- squadds.core.utils.process_design_options(merged_df)[source]#
Processes the ‘design_options’ column in merged_df, appends new columns, converts values, and drops ‘design_options’.
- Parameters:
merged_df – DataFrame containing the ‘design_options’ column.
- Returns:
merged_df – Modified DataFrame with new columns added and ‘design_options’ dropped.
- squadds.core.utils.save_intermediate_df(df, filename, file_idx)[source]#
Save the intermediate DataFrame to disk in Parquet format.
- Parameters:
df (pd.DataFrame) – The DataFrame to save.
filename (str) – The base name of the file to save the DataFrame to.
file_idx (int) – The index of the file chunk.
- squadds.core.utils.send_email_via_client(dataset_name, institute, pi_name, date, dataset_link)[source]#
Sends an email notification to recipients with the details of the created dataset.
- Parameters:
dataset_name (str) – The name of the dataset.
institute (str) – The name of the institute where the dataset was created.
pi_name (str) – The name of the principal investigator who created the dataset.
date (str) – The date when the dataset was created.
dataset_link (str) – The link to the created dataset.
- Returns:
None
- squadds.core.utils.set_github_token()[source]#
Sets the GitHub token by appending it to the .env file. If the token already exists in the .env file, it does not add it again. If the GitHub token is not found, it raises a ValueError.
- squadds.core.utils.set_huggingface_api_key()[source]#
Sets the Hugging Face API key by appending it to the .env file. If the API key already exists in the .env file, it does not add it again. If the Hugging Face token is not found, it raises a ValueError.
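The append-only-if-missing behaviour that both token helpers describe can be sketched as below. Where the real functions obtain the token, and which variable names they write, is not stated in this section; `append_env_var`, the key `GITHUB_TOKEN`, and the dummy value are assumptions, and the example writes to a temporary directory rather than a real .env file.

```python
import os
import tempfile


def append_env_var(env_path, key, value):
    """Append KEY=value to env_path unless KEY is already present; return True if written."""
    if os.path.exists(env_path):
        with open(env_path) as handle:
            if any(line.startswith(f"{key}=") for line in handle):
                return False
    with open(env_path, "a") as handle:
        handle.write(f"{key}={value}\n")
    return True


with tempfile.TemporaryDirectory() as tmp_dir:
    env_path = os.path.join(tmp_dir, ".env")
    first_write = append_env_var(env_path, "GITHUB_TOKEN", "dummy-token")
    second_write = append_env_var(env_path, "GITHUB_TOKEN", "dummy-token")
    with open(env_path) as handle:
        contents = handle.read()
```

Checking for the key before appending keeps the file free of duplicate entries when the helper is called more than once.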
- squadds.core.utils.string_to_float(string)[source]#
Converts a string representation of a number to a float.
- Parameters:
string (str) – The string representation of the number.
- Returns:
The converted float value.
- Return type:
float
- squadds.core.utils.validate_types(data_part, schema_part)[source]#
Recursively validates the types of data_part against the expected types defined in schema_part.
- Parameters:
data_part (dict) – The data to be validated.
schema_part (dict) – The schema defining the expected types.
- Raises:
TypeError – If the type of any key in data_part does not match the expected type in schema_part.
- Returns:
None
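A recursive type check in the spirit of the description is sketched below. It assumes the schema maps keys either to Python types or to nested dictionaries, skips keys that are absent from the data, and adds a `path` argument for readable messages; all three choices, and the name `validate_types_sketch`, are illustrative assumptions rather than the library's behaviour.

```python
def validate_types_sketch(data_part, schema_part, path=""):
    """Raise TypeError when a value's type differs from the schema's expectation."""
    for key, expected in schema_part.items():
        if key not in data_part:
            continue  # assumption: missing keys are not a type error
        value = data_part[key]
        if isinstance(expected, dict):
            if not isinstance(value, dict):
                raise TypeError(f"{path}{key}: expected dict, got {type(value).__name__}")
            validate_types_sketch(value, expected, f"{path}{key}.")
        elif not isinstance(value, expected):
            raise TypeError(
                f"{path}{key}: expected {expected.__name__}, got {type(value).__name__}"
            )


schema = {"name": str, "options": {"width": float}}

validate_types_sketch({"name": "q1", "options": {"width": 1.0}}, schema)  # passes

try:
    validate_types_sketch({"name": "q1", "options": {"width": "10um"}}, schema)
except TypeError as err:
    type_message = str(err)
```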