Tutorial 3: Contributing to SQuADDS#

In this tutorial, we will go over the basics of contributing to SQuADDS. We will cover the following topics:

  1. Contribution Information Setup

  2. Understanding the terminology and database structure

  3. Contributing to an existing database node

  4. Creating a new database node

  5. Building on top of others' work


[86]:
%load_ext autoreload
%autoreload 2
[14]:
!pip install -e ../.

Contribution Information Setup#

In order to contribute to SQuADDS, you will need to provide some information about yourself. This information will be used to track your contributions and to give you credit for your work. You can provide this information by updating the following variables in the .env file in the root directory of the repository:

GROUP_NAME = ""
PI_NAME = ""
INSTITUTION = ""
USER_NAME = ""

Or you can provide this information by executing the following cell.

[17]:
from squadds.database import *
[ ]:
create_contributor_info()
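
For readers who would rather script the first option, the sketch below formats the four variables listed above as .env text and parses them back. It uses only the standard library. The helper names format_contributor_env and parse_contributor_env and the example values are assumptions made for illustration; this is not the code that create_contributor_info() runs.

```python
# The four variables listed in the .env template above.
ENV_KEYS = ("GROUP_NAME", "PI_NAME", "INSTITUTION", "USER_NAME")


def format_contributor_env(values):
    """Return .env text with one KEY = "value" line per contributor field."""
    missing = [key for key in ENV_KEYS if not values.get(key)]
    if missing:
        raise ValueError(f"missing contributor fields: {missing}")
    return "".join(f'{key} = "{values[key]}"\n' for key in ENV_KEYS)


def parse_contributor_env(text):
    """Parse KEY = "value" lines back into a dict, skipping blanks and comments."""
    info = {}
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#") or "=" not in stripped:
            continue
        key, _, value = stripped.partition("=")
        info[key.strip()] = value.strip().strip('"')
    return info


# Placeholder values; replace them with your own details.
example = {
    "GROUP_NAME": "Example Group",
    "PI_NAME": "Example PI",
    "INSTITUTION": "Example Institution",
    "USER_NAME": "example_user",
}
env_text = format_contributor_env(example)
print(env_text)
```

The returned text could then be written to the .env file in the repository root, taking care not to overwrite other variables already stored there.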

Understanding the terminology and database structure#

  • HuggingFace

  • Datasets

  • Configurations

  • Structure of SQuADDS_DB

  • Adding to SQuADDS_DB

Where we left off in Tutorial 1 - Contributing to an existing configuration#

Format the result from Tutorial 1 to be compatible with SQuADDS.


How to contribute to an existing configuration:#

  1. Clone/Fork the Repository: If you have not already forked or cloned the repository, do so.

  2. Create or Checkout a Branch: If adding new data, it might be best to do it on a new branch:

    git checkout -b add_to_configuration
    
  3. Modify the Configuration: Add or modify the data files as necessary for the configuration. Make sure to follow any guidelines provided by the dataset maintainers for the specific structure and format required.

  4. Commit and Push Your Changes: Commit the new data and push it to your fork:

    git add .
    git commit -m "Add new data to configuration Y"
    git push origin add_to_configuration
    
  5. Pull Request: Create a pull request against the original repository.
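
Since this tutorial is a notebook, the git steps above can also be driven from Python. The following is a minimal sketch built on the standard subprocess module. The helper names contribution_commands and run_commands and the dry_run flag are hypothetical. It assumes the data files have already been edited in the working tree, since git checkout -b carries uncommitted changes over to the new branch.

```python
import subprocess


def contribution_commands(branch, commit_message, remote="origin"):
    """Return the git commands from the steps above as argument lists."""
    return [
        ["git", "checkout", "-b", branch],
        ["git", "add", "."],
        ["git", "commit", "-m", commit_message],
        ["git", "push", remote, branch],
    ]


def run_commands(commands, repo_dir, dry_run=True):
    """Print each command; execute it in repo_dir only when dry_run is False."""
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            # check=True raises CalledProcessError if git exits with an error.
            subprocess.run(command, cwd=repo_dir, check=True)


commands = contribution_commands(
    "add_to_configuration", "Add new data to configuration Y"
)
# Dry run: only prints the commands, nothing is executed.
run_commands(commands, repo_dir=".", dry_run=True)
```

Keeping dry_run=True by default lets you review the commands before anything touches the repository. The pull request itself is still opened from the web interface.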


Show how to add this new data entry to the SQuADDS database by creating a PR.

  1. Code to fork the repo and create a new branch.

  2. Code to add the new data to the chosen configuration.

  3. Code to commit and push the changes.

  4. Code to create a PR.
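
As an illustration of the second item, here is one way a new entry might be appended to a configuration's data file. This sketch assumes the configuration is stored as a JSON file holding a list of entries, which may not match the actual layout of SQuADDS_DB. The function name append_entry and the file name example_configuration.json are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def append_entry(data_file, entry):
    """Append one entry to a JSON file that holds a list of entries."""
    path = Path(data_file)
    # Start from an empty list if the file does not exist yet.
    entries = json.loads(path.read_text(encoding="utf-8")) if path.exists() else []
    if not isinstance(entries, list):
        raise TypeError(f"{path} does not contain a JSON list")
    entries.append(entry)
    path.write_text(json.dumps(entries, indent=4), encoding="utf-8")
    return len(entries)


# Demonstrate on a scratch file rather than a real configuration file.
with tempfile.TemporaryDirectory() as scratch:
    data_file = Path(scratch) / "example_configuration.json"  # hypothetical name
    append_entry(data_file, {"sim_results": {"result1": 1.0}})
    count = append_entry(data_file, {"sim_results": {"result1": 2.0}})
    print(count)  # 2
```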


Contributing a New Configuration#

We may find that we possess a dataset that is not currently included in SQuADDS. In this case, we can add a new configuration to SQuADDS.

But before we do that, we need to make sure that the dataset is in a format that is compatible with SQuADDS. In order to do this, we need to refer back to the structure of SQuADDS.

Data Processing:#

We want the data to be in JSON format with at least the following fields. You can add as many supplementary fields as you want.

{
    "design":{
        "design_options": design_options,
        "design_tool": design_tool_name,
    },
    "sim_options":{
        "setup": sim_setup_options,
        "simulator": simulator_name,
    },
    "sim_results":{
        "result1": sim_result1,
        "unit1": unit1,
        "result2": sim_result2,
        "unit2": unit2,
    },
    "contributor":{
        "group": group_name,
        "PI": pi_name,
        "institution": institution,
        "uploader": user_name,
        "date_created": "YYYY-MM-DD-HHMMSS",
    },
}

If all the sim_results have the same units, you can use a single "units": units field instead of repeating the unit for each result.
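
A quick way to check an entry against the template above is sketched below. The REQUIRED_FIELDS table mirrors the template. The helper name missing_fields and all example values are placeholders chosen for illustration, and the check is not part of the SQuADDS API.

```python
from datetime import datetime

# Top-level sections and the keys each must contain, per the template above.
REQUIRED_FIELDS = {
    "design": ("design_options", "design_tool"),
    "sim_options": ("setup", "simulator"),
    "sim_results": (),
    "contributor": ("group", "PI", "institution", "uploader", "date_created"),
}


def missing_fields(entry):
    """Return dotted names of required fields that are absent from entry."""
    missing = []
    for section, keys in REQUIRED_FIELDS.items():
        if section not in entry:
            missing.append(section)
            continue
        missing.extend(
            f"{section}.{key}" for key in keys if key not in entry[section]
        )
    return missing


entry = {
    "design": {
        "design_options": {"example_option": "10um"},
        "design_tool": "example_design_tool",
    },
    "sim_options": {
        "setup": {"example_setting": 1},
        "simulator": "example_simulator",
    },
    # All results share one unit, so a single "units" field is used.
    "sim_results": {"result1": 1.0, "result2": 2.0, "units": "example_unit"},
    "contributor": {
        "group": "Example Group",
        "PI": "Example PI",
        "institution": "Example Institution",
        "uploader": "example_user",
        # Matches the YYYY-MM-DD-HHMMSS pattern from the template.
        "date_created": datetime.now().strftime("%Y-%m-%d-%H%M%S"),
    },
}

print(missing_fields(entry))  # prints [] because every required field is present
```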

Metadata:#

The metadata for the configuration should include

  • the name of the configuration

  • a description of the configuration

  • contributors to the configuration

  • device design, design tool and name

  • simulation setup and simulator name

  • simulation code (Simulator)

  • simulation result parameters

  • measured parameters

Code to help generate portions of the metadata and to guide the user in creating a custom Simulator will be provided in a future release and tutorial.
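
As a rough illustration of how part of this metadata could be derived from the data entries themselves, consider the sketch below. The metadata key names and the helpers result_parameters and draft_metadata are assumptions, not an established SQuADDS schema. The sketch covers only some of the items listed above and leaves out the Simulator code and the measured parameters.

```python
def result_parameters(entries):
    """Collect the sim_results keys seen across entries, excluding unit fields."""
    names = set()
    for entry in entries:
        for key in entry.get("sim_results", {}):
            # Skip "units", "unit1", "unit2", ... which describe units, not results.
            if not key.startswith("unit"):
                names.add(key)
    return sorted(names)


def draft_metadata(name, description, contributors, entries):
    """Build a partial metadata dict; tool names are taken from the first entry."""
    first = entries[0]
    return {
        "name": name,
        "description": description,
        "contributors": contributors,
        "design_tool": first["design"]["design_tool"],
        "simulator": first["sim_options"]["simulator"],
        "sim_result_parameters": result_parameters(entries),
    }


# Placeholder entries that follow the data format described earlier.
entries = [
    {
        "design": {"design_options": {}, "design_tool": "example_design_tool"},
        "sim_options": {"setup": {}, "simulator": "example_simulator"},
        "sim_results": {"result1": 1.0, "unit1": "a", "result2": 2.0, "unit2": "b"},
    }
]
metadata = draft_metadata(
    "example_configuration", "Placeholder description", ["Example Group"], entries
)
print(metadata["sim_result_parameters"])  # ['result1', 'result2']
```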


How to contribute a new configuration:#

  1. Fork the Dataset Repository: On the Hugging Face Hub, fork the dataset repository you want to contribute to.

  2. Clone Your Fork Locally: Clone the forked repository to your local machine using the following command:

    git clone https://huggingface.co/datasets/YOUR_USERNAME/DATASET_NAME
    
  3. Create a New Branch: It’s a good practice to create a new branch for your configuration contribution:

    git checkout -b new_configuration
    
  4. Add Your Configuration: Depending on the dataset’s structure, this might involve adding new files or modifying existing ones. If the dataset uses the datasets library’s builder configurations, you will need to modify the Python script that defines the configurations.

  5. Commit Your Changes: Commit the changes with a clear commit message:

    git add .
    git commit -m "Add new configuration for circuit element X"
    
  6. Push to Your Fork: Push your new branch to your fork on the Hugging Face Hub:

    git push origin new_configuration
    
  7. Create a Pull Request: Go to the Hugging Face Hub, navigate to your fork, and create a pull request for your new branch. The pull request will be reviewed by the dataset maintainers.

Show how to add the new configuration to the SQuADDS database by creating a PR.

  1. Code to fork the repo and create a new branch.

  2. Code to add the new data to a new configuration.

  3. Code to add metadata to the configuration.

  4. Code to commit and push the changes.

  5. Code to create a PR.


Building on top of others' work#

We might have some data that can be appended to an existing dataset in SQuADDS. In this case, we can add to the existing configuration in SQuADDS_DB without pushing new entries to the original dataset.

  1. Code to fork the repo and create a new branch.

  2. Code to add the new data to the chosen configuration.

    • Handling contributions: the appending code records only the uploader name, institution, and notes.

  3. Code to commit and push the changes.

  4. Code to create a PR.
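
The contribution-handling step in the list above might look like the following sketch, in which an appended entry carries only the uploader name, institution, and notes. The function name with_contribution and the "notes" key are hypothetical. The "uploader" and "institution" keys follow the contributor block of the data format shown earlier.

```python
import copy


def with_contribution(entry, uploader, institution, notes=""):
    """Return a copy of entry tagged with who appended it and any notes."""
    tagged = copy.deepcopy(entry)  # leave the caller's entry untouched
    tagged["contributor"] = {
        "uploader": uploader,
        "institution": institution,
        "notes": notes,
    }
    return tagged


# Placeholder entry and contributor details.
new_entry = {"sim_results": {"result1": 1.0, "unit1": "example_unit"}}
tagged = with_contribution(
    new_entry, "example_user", "Example Institution", notes="Appended to existing data"
)
print(tagged["contributor"]["uploader"])  # example_user
```

Working on a deep copy keeps the original entry unchanged, so the same data can be reused or re-tagged without side effects.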


License#

This code is a part of SQuADDS

Developed by Sadman Ahmed Shanto

© Copyright Sadman Ahmed Shanto & Eli Levenson-Falk 2023.

This code is licensed under the MIT License. You may obtain a copy of this license in the LICENSE.txt file in the root directory of this source tree.

Any modifications or derivative works of this code must retain this copyright notice, and modified files need to carry a notice indicating that they have been altered from the originals.