{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Tutorial 9: Learning the Inverse Design Map\n", "\n", "In this tutorial, we explore how to infer device geometry from Hamiltonian parameters by learning the inverse design map. We introduce a custom machine learning architecture tailored to capture the underlying physics of this mapping. Specifically, we aim to answer:\n", "\n", "1) Which design space parameters most significantly influence a given set of Hamiltonian parameters?\n", "2) How does each Hamiltonian parameter quantitatively depend on these key design variables?\n", "\n", "\n", "**Note: This tutorial was originally created for the [Quantum Device Workshop](https://qdw-ucla.squarespace.com/), and includes some optional \"tasks\" for workshop participants. Feel free to skip these if you're just interested in the main content.**\n", "\n", "### Environment Setup Recommendation\n", "\n", "**We strongly recommend creating a separate Python environment for this notebook to ensure all dependencies are installed and to avoid conflicts with your base environment.**\n", "\n", "Please follow the setup instructions here: \n", "https://github.com/LFL-Lab/QDW2025/blob/main/notebooks/advance-track/setup_instructions.md" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%load_ext autoreload\n", "%autoreload 2\n", "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Our Model:\n", "\n", "We propose a two-model architecture to understand the physics of the Hamiltonian-Design space mapping. \n", "\n", "1. **Design-Relevance Encoder**\n", " The first stage is a lightweight model (e.g., Random Forest or Lasso) that **identifies the subset of geometric design parameters most relevant** to each Hamiltonian parameter. This compresses the full design space into a minimal, interpretable feature subset for each target Hamiltonian.\n", "\n", "2. 
**KAN-based Symbolic Decoder**\n", " The second model, a **Kolmogorov–Arnold Network (KAN)**, is trained to **symbolically model each Hamiltonian parameter** as a function of only its relevant design variables. The KAN outputs a closed-form symbolic expression, enabling direct interpretation of the underlying physics-driven dependency." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Design-Relevance Encoder\n", "\n", "The **Design-Relevance Encoder** identifies which geometric parameters $\\boldsymbol{\\xi}$ most influence each target Hamiltonian parameter $\\hat{\\mathcal{H}}$. To do this, we train an interpretable model, **Lasso regression**, to predict each Hamiltonian component from the full set of geometric inputs.\n", "\n", "* **Lasso (Least Absolute Shrinkage and Selection Operator)** selects relevant features via coefficient shrinkage: its L1 penalty drives the coefficients of irrelevant features to zero.\n", "\n", "From this model's coefficients, we construct a relevance mask that tells us which subset of design parameters significantly affects each $\\hat{\\mathcal{H}}$. This subset is then used as input to symbolic regression (e.g. 
KAN) to extract compact, human-readable expressions.\n" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import numpy as np\n", "from sklearn.model_selection import train_test_split\n", "from sklearn.preprocessing import StandardScaler\n", "import json\n", "from sklearn.ensemble import RandomForestRegressor\n", "from sklearn.linear_model import MultiTaskLassoCV\n", "from sklearn.tree import plot_tree\n", "import torch\n", "import sympy as sp\n", "import matplotlib.pyplot as plt\n", "import seaborn as sns  # used by the heatmap methods of DesignRelevanceEncoder\n", "import random" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's use the training dataset we generated in [Tutorial 8](https://lfl-lab.github.io/SQuADDS/source/tutorials/Tutorial-8_ML_interpolation_in_SQuADDS.html)." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "training_df = pd.read_parquet(\"data/training_data.parquet\")" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "hamiltonian_parameters = ['qubit_frequency_GHz', 'anharmonicity_MHz', 'cavity_frequency_GHz', 'kappa_kHz', 'g_MHz']\n", "design_parameters = ['cross_length', 'claw_length', 'coupling_length', 'total_length', 'ground_spacing']" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "Y_hamiltonian = training_df[hamiltonian_parameters].values  # Hamiltonian parameters\n", "X_design = training_df[design_parameters].values  # Design parameters" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The following class implements the Lasso model (along with an optional Random Forest analysis) to serve as the Design-Relevance Encoder." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "class DesignRelevanceEncoder:\n", "    def __init__(self, X_design, Y_hamiltonian, design_labels, hamiltonian_labels):\n", "        \"\"\"\n", "        Initialize the analyzer with design inputs 
and Hamiltonian outputs.\n", "        \"\"\"\n", "        self.X_raw = X_design\n", "        self.Y_raw = Y_hamiltonian\n", "        self.design_labels = design_labels\n", "        self.hamiltonian_labels = hamiltonian_labels\n", "\n", "        self.scaler_X = StandardScaler()\n", "        self.scaler_Y = StandardScaler()\n", "\n", "        self.X = self.scaler_X.fit_transform(self.X_raw)\n", "        self.Y = self.scaler_Y.fit_transform(self.Y_raw)\n", "\n", "    def run_random_forest(self, n_estimators=200, random_state=42):\n", "        \"\"\"\n", "        Trains a random forest for each Hamiltonian parameter.\n", "        Stores and returns a feature importance DataFrame.\n", "        \"\"\"\n", "        importance_matrix = np.zeros((self.X.shape[1], self.Y.shape[1]))\n", "\n", "        for i, h_name in enumerate(self.hamiltonian_labels):\n", "            rf = RandomForestRegressor(n_estimators=n_estimators, random_state=random_state)\n", "            rf.fit(self.X, self.Y[:, i])\n", "            importance_matrix[:, i] = rf.feature_importances_\n", "\n", "        self.rf_importance_df = pd.DataFrame(importance_matrix,\n", "                                             index=self.design_labels,\n", "                                             columns=self.hamiltonian_labels)\n", "        return self.rf_importance_df\n", "\n", "    def run_multitask_lasso(self, alpha_grid=np.logspace(-4, 1, 20)):\n", "        \"\"\"\n", "        Trains a multi-task Lasso model.\n", "        Stores and returns the coefficient matrix as DataFrame.\n", "        \"\"\"\n", "        model = MultiTaskLassoCV(alphas=alpha_grid, cv=5, random_state=42)\n", "        model.fit(self.X, self.Y)\n", "        coef_matrix = model.coef_.T  # (n_design, n_hamiltonian)\n", "\n", "        self.lasso_coef_df = pd.DataFrame(coef_matrix,\n", "                                          index=self.design_labels,\n", "                                          columns=self.hamiltonian_labels)\n", "        return self.lasso_coef_df\n", "\n", "    def plot_heatmaps(self):\n", "        \"\"\"\n", "        Plots side-by-side heatmaps of RF and Lasso results.\n", "        \"\"\"\n", "        fig, axs = plt.subplots(1, 2, figsize=(16, 6))\n", "\n", "        sns.heatmap(self.rf_importance_df,\n", "                    annot=True, cmap=\"YlOrRd\", ax=axs[0], cbar_kws={'label': 'Feature Importance'})\n", "        axs[0].set_title(\"Random Forest: Design Parameter 
Importance\")\n", "        axs[0].set_xlabel(\"Hamiltonian Parameter\")\n", "        axs[0].set_ylabel(\"Design Parameter\")\n", "\n", "        sns.heatmap(self.lasso_coef_df,\n", "                    annot=True, center=0, cmap=\"coolwarm\", ax=axs[1], cbar_kws={'label': 'Coefficient Value'})\n", "        axs[1].set_title(\"Multi-Task Lasso: Design Influence\")\n", "        axs[1].set_xlabel(\"Hamiltonian Parameter\")\n", "        axs[1].set_ylabel(\"Design Parameter\")\n", "\n", "        plt.tight_layout()\n", "        plt.show()\n", "\n", "    def print_dependency_summary(self, top_k=3, threshold=1e-3):\n", "        \"\"\"\n", "        Prints top-k most important design variables per Hamiltonian parameter\n", "        from both Random Forest and Lasso results.\n", "        \"\"\"\n", "        print(\"\\n=== Top Influencers from Random Forest ===\")\n", "        for h in self.hamiltonian_labels:\n", "            top = self.rf_importance_df[h].sort_values(ascending=False)\n", "            print(f\"\\n- {h}:\")\n", "            for i in range(top_k):\n", "                print(f\" • {top.index[i]} → importance = {top.values[i]:.4f}\")\n", "\n", "        print(\"\\n=== Top Influencers from Lasso (with direction) ===\")\n", "        for h in self.hamiltonian_labels:\n", "            top = self.lasso_coef_df[h].abs().sort_values(ascending=False)\n", "            print(f\"\\n- {h}:\")\n", "            for i in range(top_k):\n", "                param = top.index[i]\n", "                coef = self.lasso_coef_df[h][param]\n", "                print(f\" • {param} → coef = {coef:.4f} ({'↑' if coef > 0 else '↓'})\")\n", "\n", "    def plot_heatmap(self):\n", "        \"\"\"\n", "        Plots a single heatmap depending on which model was run.\n", "        If both are run, plots both side by side.\n", "        If only one is run, plots only that one.\n", "        \"\"\"\n", "        has_rf = hasattr(self, 'rf_importance_df')\n", "        has_lasso = hasattr(self, 'lasso_coef_df')\n", "\n", "        if has_rf and has_lasso:\n", "            self.plot_heatmaps()\n", "        elif has_rf:\n", "            plt.figure(figsize=(8, 6))\n", "            sns.heatmap(self.rf_importance_df, annot=True, cmap=\"YlOrRd\", cbar_kws={'label': 'Feature Importance'})\n", "            plt.title(\"Random Forest: Design Parameter Importance\")\n", "            
plt.xlabel(\"Hamiltonian Parameter\")\n", "            plt.ylabel(\"Design Parameter\")\n", "            plt.tight_layout()\n", "            plt.show()\n", "        elif has_lasso:\n", "            plt.figure(figsize=(8, 6))\n", "            sns.heatmap(self.lasso_coef_df, annot=True, center=0, cmap=\"coolwarm\", cbar_kws={'label': 'Coefficient Value'})\n", "            plt.title(\"Multi-Task Lasso: Design Influence\")\n", "            plt.xlabel(\"Hamiltonian Parameter\")\n", "            plt.ylabel(\"Design Parameter\")\n", "            plt.tight_layout()\n", "            plt.show()\n", "        else:\n", "            raise ValueError(\"Neither Random Forest nor Lasso results are available. Run one of the analysis methods first.\")\n", "\n", "    def get_dependency_summary_json(self, top_k=3, threshold=1e-3, pretty_print=True):\n", "        \"\"\"\n", "        Returns a JSON-like dict of top design parameters (and their scores) for each Hamiltonian parameter.\n", "        Only includes parameters with absolute score above the threshold.\n", "        If both models are run, returns both. Otherwise, returns only the available one.\n", "        If pretty_print is True, also prints the summary in a human-readable format.\n", "        \"\"\"\n", "        summary = {}\n", "        if hasattr(self, 'rf_importance_df'):\n", "            rf_summary = {}\n", "            for h in self.hamiltonian_labels:\n", "                top = self.rf_importance_df[h].sort_values(ascending=False)\n", "                filtered = [\n", "                    {\"parameter\": top.index[i], \"importance\": float(top.values[i])}\n", "                    for i in range(len(top))\n", "                    if abs(top.values[i]) >= threshold\n", "                ][:top_k]\n", "                rf_summary[h] = filtered\n", "            summary['random_forest'] = rf_summary\n", "        if hasattr(self, 'lasso_coef_df'):\n", "            lasso_summary = {}\n", "            for h in self.hamiltonian_labels:\n", "                top = self.lasso_coef_df[h].abs().sort_values(ascending=False)\n", "                filtered = [\n", "                    {\n", "                        \"parameter\": top.index[i],\n", "                        \"coef\": float(self.lasso_coef_df[h][top.index[i]])\n", "                    }\n", "                    for i in range(len(top))\n", "                    if abs(top.values[i]) >= threshold\n", "                ][:top_k]\n", "                lasso_summary[h] = filtered\n", "            summary['lasso'] = lasso_summary\n", 
"\n", "        if pretty_print:\n", "            print(\"\\n=== Dependency Summary (Top {} per Hamiltonian parameter, threshold={}) ===\".format(top_k, threshold))\n", "            print(json.dumps(summary, indent=4))\n", "\n", "        return summary\n" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "ml_analyzer = DesignRelevanceEncoder(\n", "    X_design,\n", "    Y_hamiltonian,\n", "    design_labels=design_parameters,\n", "    hamiltonian_labels=hamiltonian_parameters\n", ")\n" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
| | qubit_frequency_GHz | anharmonicity_MHz | cavity_frequency_GHz | kappa_kHz | g_MHz |
|---|---|---|---|---|---|
| cross_length | -0.956645 | 0.904895 | 0.000207 | 0.000060 | -0.574872 |
| claw_length | -0.005958 | 0.007807 | -0.220192 | -0.141799 | 0.932547 |
| coupling_length | -0.000277 | 0.000236 | -0.153045 | 0.877849 | -0.073914 |
| total_length | 0.000482 | -0.000646 | -0.924025 | -0.484510 | -0.394851 |
| ground_spacing | -0.001715 | 0.002270 | 0.000009 | 0.000282 | -0.350630 |
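
The table above can be turned into the relevance mask described earlier. The sketch below is a minimal, standard-library-only illustration, assuming the table holds the standardized coefficients returned by `run_multitask_lasso()`. The helper `relevance_mask` and the cutoff of 0.05 are hypothetical choices made for this example and are not part of the tutorial's code. Because multi-task Lasso keeps or drops a design parameter for all targets jointly, per-target relevance here comes from thresholding coefficient magnitudes rather than from exact zeros.

```python
# Column order of the coefficient table (Hamiltonian parameters).
hamiltonian_parameters = [
    "qubit_frequency_GHz", "anharmonicity_MHz",
    "cavity_frequency_GHz", "kappa_kHz", "g_MHz",
]

# Rows copied from the table above: one row per design parameter.
lasso_coef_rows = {
    "cross_length":    [-0.956645, 0.904895, 0.000207, 0.000060, -0.574872],
    "claw_length":     [-0.005958, 0.007807, -0.220192, -0.141799, 0.932547],
    "coupling_length": [-0.000277, 0.000236, -0.153045, 0.877849, -0.073914],
    "total_length":    [0.000482, -0.000646, -0.924025, -0.484510, -0.394851],
    "ground_spacing":  [-0.001715, 0.002270, 0.000009, 0.000282, -0.350630],
}


def relevance_mask(coef_rows, targets, threshold=0.05):
    """Hypothetical helper: for each target, list the design parameters whose
    absolute coefficient is at least `threshold`, strongest first.
    The default threshold is an illustrative assumption."""
    mask = {}
    for j, target in enumerate(targets):
        scored = [
            (abs(row[j]), name)
            for name, row in coef_rows.items()
            if abs(row[j]) >= threshold
        ]
        scored.sort(reverse=True)  # largest magnitude first
        mask[target] = [name for _, name in scored]
    return mask


mask = relevance_mask(lasso_coef_rows, hamiltonian_parameters)
for target, params in mask.items():
    print(f"{target}: {params}")
```

With this cutoff, `qubit_frequency_GHz` and `anharmonicity_MHz` depend only on `cross_length`, while `g_MHz` retains all five design parameters. A stricter or looser cutoff changes the subsets, so the threshold should be treated as a tunable choice.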
Developed by Sadman Ahmed Shanto

This tutorial is written by Sadman Ahmed Shanto

© Copyright Sadman Ahmed Shanto & Eli Levenson-Falk 2025.

This code is licensed under the MIT License. You may obtain a copy of this license in the LICENSE.txt file in the root directory of this source tree.

Any modifications or derivative works of this code must retain this copyright notice, and modified files need to carry a notice indicating that they have been altered from the originals.