squadds.core package#
Submodules#
squadds.core.analysis module#
- class squadds.core.analysis.Analyzer(db=None)[source]#
Bases: object
The Analyzer class is responsible for analyzing designs and finding the closest designs based on target parameters.
- _add_target_params_columns()[source]#
Adds target parameter columns to the dataframe based on the selected system.
- _fix_cavity_claw_df()[source]#
Fixes the cavity claw DataFrame by renaming columns and updating values.
- _get_H_param_keys()[source]#
Gets the parameter keys for the Hamiltonian based on the selected system.
- set_metric_strategy(strategy: MetricStrategy): Sets the metric strategy to use for calculating the distance metric.
- _outside_bounds(df: pd.DataFrame, params: dict, display=True) -> bool: Checks if entered parameters are outside the bounds of a dataframe.
- find_closest(target_params: dict, num_top: int, metric: str = 'Euclidean', display: bool = True): Finds the closest designs in the library based on the target parameters.
- get_interpolated_design(target_params: dict, metric: str = 'Euclidean', display: bool = True): Gets the interpolated design based on the target parameters.
Initializes an instance of the Analyzer class.
- Parameters:
db – The database object.
- db
The database object.
- selected_component_name
The name of the selected component.
- selected_component
The selected component.
- selected_data_type
The selected data type.
- selected_confg
The selected configuration.
- selected_qubit
The selected qubit.
- selected_cavity
The selected cavity.
- selected_coupler
The selected coupler.
- selected_system
The selected system.
- df
The selected dataframe.
- closest_df_entry
The closest dataframe entry.
- closest_design
The closest design.
- presimmed_closest_cpw_design
The presimmed closest CPW design.
- presimmed_closest_qubit_design
The presimmed closest qubit design.
- presimmed_closest_coupler_design
The presimmed closest coupler design.
- interpolated_design
The interpolated design.
- metric_strategy
The metric strategy (will be set dynamically).
- custom_metric_func
The custom metric function.
- metric_weights
The metric weights.
- target_params
The target parameters.
- H_param_keys
The H parameter keys.
- closest_design_in_H_space()[source]#
Plots a scatter plot of the closest design in the H-space.
This method creates a scatter plot with two subplots. The first subplot shows the relationship between ‘cavity_frequency_GHz’ and ‘kappa_kHz’, while the second subplot shows the relationship between ‘anharmonicity_MHz’ and ‘g_MHz’. The scatter plot includes pre-simulated data, target data, and the closest design entry from the database.
- Returns:
None
- find_closest(target_params, num_top, metric='Euclidean', display=True, parallel=False, num_cpu='auto', skip_df_gen=False)[source]#
Find the closest designs in the library based on the target parameters.
- Parameters:
target_params – A dictionary containing the target parameters.
num_top – The number of closest designs to retrieve.
metric – The distance metric to use for calculating distances. Defaults to 'Euclidean'.
display – Whether to display warnings for parameters outside of the library bounds. Defaults to True.
parallel – Whether to run the metric calculation in parallel. Defaults to False.
num_cpu – The number of CPUs to use for the parallel job. Defaults to 'auto'.
skip_df_gen – Whether to skip generating the dataframe and reuse the one already in memory. Defaults to False.
- Returns:
A DataFrame containing the closest designs.
- Return type:
closest_df (DataFrame)
- Raises:
- ValueError – If the specified metric is not supported or if num_top is bigger than the size of the library.
- ValueError – If the metric is invalid.
- get_Ljs(df)[source]#
Extracts the EJ values from the dataframe and converts them to Josephson inductance values using pyEPR.
- Parameters:
df (pd.DataFrame) – The dataframe containing design options.
- Returns:
An array of Josephson inductance values.
- Return type:
np.array
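The EJ to LJ conversion that get_Ljs delegates to pyEPR follows the standard relation L_J = phi0^2 / E_J, with phi0 = hbar / (2e). The following is a minimal standalone sketch of that relation, not the pyEPR call itself. It assumes E_J is stored as a frequency in GHz and that the inductance is wanted in nH; both units and the helper name are assumptions.

```python
import math

PLANCK_H = 6.62607015e-34            # Planck constant in J*s (exact SI value)
ELEMENTARY_CHARGE = 1.602176634e-19  # elementary charge in C (exact SI value)


def josephson_inductance_nH(ej_ghz: float) -> float:
    """Return L_J in nH for a Josephson energy given as E_J/h in GHz (assumed unit)."""
    reduced_flux_quantum = PLANCK_H / (2 * math.pi) / (2 * ELEMENTARY_CHARGE)  # Wb
    ej_joule = PLANCK_H * ej_ghz * 1e9           # convert GHz to joules
    return reduced_flux_quantum**2 / ej_joule * 1e9  # convert H to nH


# Larger E_J means smaller L_J; 16.346 GHz corresponds to roughly 10 nH.
lj_values = [josephson_inductance_nH(ej) for ej in (10.0, 16.346, 20.0)]
```

If the dataframe stores EJ in another unit, only the conversion to joules changes.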
- get_closest_cavity()[source]#
Returns the closest cavity design.
- Returns:
The closest cavity design.
- Return type:
pd.Series
- get_complete_df(target_params, metric='Euclidean', display=True)[source]#
Returns the complete DataFrame (design + Hamiltonian parameters) sourced using the target parameters.
- Parameters:
target_params – A dictionary containing the target parameters.
metric – The distance metric to use for calculating distances. Defaults to 'Euclidean'.
display – Whether to display warnings for parameters outside of the library bounds. Defaults to True.
- Returns:
A DataFrame containing all designs and Hamiltonian parameters.
- Return type:
complete_df (DataFrame)
- Raises:
- ValueError – If the specified metric is not supported or if num_top is bigger than the size of the library.
- ValueError – If the metric is invalid.
- get_coupler_options(df)[source]#
Extracts coupler options from the dataframe.
- Parameters:
df (pd.DataFrame) – The dataframe containing design options.
- Returns:
A dictionary containing lists of the extracted coupler options.
- Return type:
dict[str, list[Any]]
- get_cpw_options(df)[source]#
Extracts CPW options from the dataframe.
- Parameters:
df (pd.DataFrame) – The dataframe containing design options.
- Returns:
A dictionary containing lists of the extracted CPW options.
- Return type:
dict[str, list[Any]]
- get_design(df)[source]#
Extracts the design parameters from the dataframe and returns a dict.
- Returns:
A dict containing the design parameters.
- Return type:
dict
- get_qubit_options(df)[source]#
Extracts qubit design options from the dataframe.
- Parameters:
df (pd.DataFrame) – The dataframe containing design options.
- Returns:
A dictionary containing lists of the extracted qubit options.
- Return type:
dict[str, list[Any]]
- set_metric_strategy(strategy)[source]#
Sets the metric strategy to use for calculating the distance metric.
- Parameters:
strategy (MetricStrategy) – The strategy to use for calculating the distance metric.
- Raises:
ValueError – If the specified metric is not supported.
- squadds.core.analysis.scale_value(value, ratio)[source]#
Scales the given value by the specified ratio.
- Parameters:
value – The value to be scaled, in the format 'Xum' where X is a number.
ratio – The scaling ratio.
- Returns:
The scaled value in the format ‘Xum’ where X is the scaled number.
- Return type:
scaled_value (str)
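The 'Xum' convention described above can be illustrated with a small standalone helper. This is a sketch under stated assumptions, not the library's implementation: the name scale_um_value is hypothetical, and the choice to reject strings without the 'um' suffix is an assumption.

```python
def scale_um_value(value: str, ratio: float) -> str:
    """Scale a length string such as '30um' by ratio and return it in the same format."""
    if not value.endswith("um"):
        # Assumed behaviour: reject anything that is not in the 'Xum' format.
        raise ValueError(f"expected a value like '30um', got {value!r}")
    number = float(value[:-2])  # strip the 'um' suffix and parse the number
    return f"{number * ratio}um"


scaled = scale_um_value("30um", 1.5)
```

Keeping the unit suffix in the return value lets the result be passed straight back into design options that expect strings.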
squadds.core.analysis_enrichment module#
Pure dataframe enrichment helpers used by the Analyzer compatibility facade.
- squadds.core.analysis_enrichment.extract_coupler_options(df)[source]#
Extract coupler geometry arrays from Analyzer result rows.
- Return type:
dict[str, list[Any]]
- squadds.core.analysis_enrichment.extract_cpw_options(df)[source]#
Extract CPW geometry arrays from Analyzer result rows.
- Return type:
dict[str, list[Any]]
squadds.core.analysis_plotting module#
Plotting helpers for the Analyzer compatibility facade.
squadds.core.analysis_search module#
Pure search helpers used by the Analyzer compatibility facade.
- squadds.core.analysis_search.filter_df_by_target_params(df, target_params)[source]#
Filter a dataframe by the categorical values present in target params.
- Return type:
DataFrame
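A minimal sketch of this kind of categorical filtering, using plain lists of dicts in place of a DataFrame. It assumes that "categorical" means string-valued target parameters; the function name is hypothetical and the rows are made-up illustrative data.

```python
def filter_rows_by_categorical_targets(rows, target_params):
    """Keep rows that match every string-valued entry of target_params."""
    # Assumption: only string-valued targets are treated as categorical filters.
    categorical = {k: v for k, v in target_params.items() if isinstance(v, str)}
    return [
        row for row in rows
        if all(row.get(key) == value for key, value in categorical.items())
    ]


library = [
    {"resonator_type": "quarter", "cavity_frequency_GHz": 6.8},
    {"resonator_type": "half", "cavity_frequency_GHz": 7.1},
    {"resonator_type": "quarter", "cavity_frequency_GHz": 7.3},
]
target = {"resonator_type": "quarter", "cavity_frequency_GHz": 7.0}
matching = filter_rows_by_categorical_targets(library, target)
```

Numeric targets such as cavity_frequency_GHz are left for the distance metric to handle, so they do not remove rows here.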
- squadds.core.analysis_search.get_H_param_keys_for_system(selected_system)[source]#
Return Hamiltonian parameter keys for the legacy supported systems.
- squadds.core.analysis_search.outside_bounds(df, params, display=True)[source]#
Check whether requested numeric or categorical parameters fall outside the available library.
- Return type:
bool
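A bounds check of this kind might look like the following sketch, which uses lists of dicts instead of a DataFrame and omits the display option. The function name and the rule for categorical values (outside the library if the value never occurs) are assumptions.

```python
def is_outside_bounds(rows, params):
    """Return True if any requested parameter lies outside what the rows cover."""
    for key, value in params.items():
        column = [row[key] for row in rows if key in row]
        if not column:
            continue  # nothing to compare against for this key
        if isinstance(value, str):
            # Categorical value: outside the library if it never occurs.
            if value not in column:
                return True
        elif not (min(column) <= value <= max(column)):
            # Numeric value: outside the library if beyond the min/max range.
            return True
    return False


library = [
    {"kappa_kHz": 100.0, "resonator_type": "quarter"},
    {"kappa_kHz": 210.0, "resonator_type": "quarter"},
]
inside = is_outside_bounds(library, {"kappa_kHz": 150.0, "resonator_type": "quarter"})
outside = is_outside_bounds(library, {"kappa_kHz": 500.0})
```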
- squadds.core.analysis_search.rank_closest_indices(filtered_df, target_params, metric_strategy, num_top)[source]#
Return indices for the closest rows according to the configured metric.
- Return type:
Index
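The ranking step can be sketched without pandas as follows: compute one distance per row with the configured metric and keep the num_top smallest. The function names and sample rows are hypothetical, and the metric here is a plain Manhattan distance chosen only for illustration.

```python
import heapq


def manhattan(target_params, row):
    """Sum of absolute differences over the target keys."""
    return sum(abs(row[key] - target) for key, target in target_params.items())


def rank_closest(rows, target_params, metric, num_top):
    """Return positions of the num_top rows with the smallest metric value."""
    if num_top > len(rows):
        raise ValueError("num_top cannot exceed the number of rows")
    distances = [(metric(target_params, row), idx) for idx, row in enumerate(rows)]
    return [idx for _, idx in heapq.nsmallest(num_top, distances)]


library = [
    {"cavity_frequency_GHz": 6.0, "kappa_kHz": 100.0},
    {"cavity_frequency_GHz": 7.0, "kappa_kHz": 210.0},
    {"cavity_frequency_GHz": 6.9, "kappa_kHz": 195.0},
]
target = {"cavity_frequency_GHz": 7.0, "kappa_kHz": 200.0}
closest = rank_closest(library, target, manhattan, num_top=2)
```

Passing the metric as an argument mirrors the metric_strategy parameter in the signature above, so a different distance can be swapped in without touching the ranking code.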
squadds.core.db module#
!TODO: add FULL support for half-wave cavity
- class squadds.core.db.SQuADDS_DB(*args, **kwargs)[source]#
Bases: object
A class representing the SQuADDS database.
- _delete_cache()#
Delete the dataset cache directory.
- get_dataset_info(component, component_name, data_type)[source]#
Print information about a specific dataset.
- view_contributors_of_config(config)[source]#
Print a table of contributors for a specific configuration.
- view_contributors_of(component, component_name, data_type)[source]#
Print a table of contributors for a specific component, component name, and data type.
- select_components(component_dict)[source]#
Select a configuration based on a component dictionary or string.
- select_system(components)[source]#
Select a system based on a list of components or a single component.
Constructor for the SQuADDS_DB class.
- repo_name#
The name of the repository.
- Type:
str
- configs#
List of supported configuration names.
- Type:
list
- selected_component_name#
The name of the selected component.
- Type:
str
- selected_component#
The selected component.
- Type:
str
- selected_data_type#
The selected data type.
- Type:
str
- selected_confg#
The selected configuration.
- Type:
str
- selected_qubit#
The selected qubit.
- Type:
str
- selected_cavity#
The selected cavity.
- Type:
str
- selected_coupler#
The selected coupler.
- Type:
str
- selected_resonator_type#
The selected resonator type.
- Type:
str
- selected_system#
The selected system.
- Type:
str
- selected_df#
The selected dataframe.
- Type:
str
- target_param_keys#
The target parameter keys.
- Type:
str
- units#
The units.
- Type:
str
- _internal_call#
Flag to track internal calls.
- Type:
bool
- create_qubit_cavity_df(qubit_df, cavity_df, merger_terms=None, parallelize=False, num_cpu=None)[source]#
Creates a merged DataFrame by merging the qubit and cavity DataFrames based on the specified merger terms.
- Parameters:
qubit_df (pandas.DataFrame) – The DataFrame containing qubit data.
cavity_df (pandas.DataFrame) – The DataFrame containing cavity data.
merger_terms (list) – A list of column names to be used for merging the DataFrames. Defaults to None.
parallelize (bool) – Whether to use multiprocessing to speed up the merging. Defaults to False.
num_cpu (int) – The number of CPU cores to use for multiprocessing. If not specified, the function will use the maximum number of available cores.
- Returns:
The merged DataFrame.
- Return type:
pandas.DataFrame
- Raises:
None
- create_system_df(parallelize=False, num_cpu=None)[source]#
Creates and returns a DataFrame based on the selected system.
- Parameters:
parallelize (bool) – Whether to use multiprocessing to speed up the merging. Defaults to False.
num_cpu (int) – The number of CPU cores to use for multiprocessing. If not specified, the function will use the maximum number of available cores.
If the selected system is a single component, it retrieves the dataset based on the selected data type, component, and component name. If a coupler is selected, the DataFrame is filtered by the coupler. The resulting DataFrame is stored in the selected_df attribute.
If the selected system is a list of components (qubit and cavity), it retrieves the qubit and cavity DataFrames. The qubit DataFrame is obtained based on the selected qubit component name and data type “cap_matrix”. The cavity DataFrame is obtained based on the selected cavity component name and data type “eigenmode”. The qubit and cavity DataFrames are merged into a single DataFrame using the merger terms [‘claw_width’, ‘claw_length’, ‘claw_gap’]. The resulting DataFrame is stored in the selected_df attribute.
- Raises:
UserWarning – If the selected system is either not specified or does not contain a cavity.
- Returns:
The created DataFrame based on the selected system.
- Return type:
pandas.DataFrame
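The qubit and cavity merge on ['claw_width', 'claw_length', 'claw_gap'] described above can be sketched with plain dicts. This assumes an inner join, meaning only rows whose claw geometry matches on both sides are kept; the function name and sample values are hypothetical.

```python
def merge_on_claw_terms(qubit_rows, cavity_rows,
                        merger_terms=("claw_width", "claw_length", "claw_gap")):
    """Inner-join qubit and cavity rows that share the same claw geometry."""
    # Index cavity rows by their claw geometry for constant-time lookup.
    cavity_index = {}
    for cavity in cavity_rows:
        key = tuple(cavity[term] for term in merger_terms)
        cavity_index.setdefault(key, []).append(cavity)

    merged = []
    for qubit in qubit_rows:
        key = tuple(qubit[term] for term in merger_terms)
        for cavity in cavity_index.get(key, []):
            merged.append({**qubit, **cavity})  # combine columns from both rows
    return merged


qubit_rows = [
    {"claw_width": "10um", "claw_length": "160um", "claw_gap": "5um", "qubit_frequency_GHz": 4.2},
    {"claw_width": "10um", "claw_length": "200um", "claw_gap": "5um", "qubit_frequency_GHz": 4.0},
]
cavity_rows = [
    {"claw_width": "10um", "claw_length": "160um", "claw_gap": "5um", "cavity_frequency_GHz": 6.9},
]
system_rows = merge_on_claw_terms(qubit_rows, cavity_rows)
```

Only the first qubit row survives because the second has no cavity with a matching claw_length.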
- find_parquet_files()[source]#
Searches for parquet files in the repository and returns their paths/filenames.
- Returns:
A list of paths/filenames of parquet files in the repository.
- Return type:
list
- generate_qubit_half_wave_cavity_df(parallelize=False, num_cpu=None, save_data=False)[source]#
Generates a DataFrame that combines the qubit and half-wave cavity data.
- Parameters:
parallelize (bool, optional) – Flag indicating whether to parallelize the computation. Defaults to False.
num_cpu (int, optional) – Number of CPUs to use for parallelization. Defaults to None.
save_data (bool, optional) – Flag indicating whether to save the generated data. Defaults to False.
- Returns:
The generated DataFrame.
- Return type:
pandas.DataFrame
- Raises:
None
Notes
This method generates a DataFrame by combining the qubit and half-wave cavity data.
The qubit and cavity data are obtained from the get_dataset and generate_updated_half_wave_cavity_df methods, respectively.
The generated DataFrame is optimized to reduce memory usage using various optimization techniques.
If save_data is True, the generated DataFrames are saved in the “data” directory.
- generate_updated_half_wave_cavity_df(parallelize=False, num_cpu=None)[source]#
!TODO: speed this up!
- get_component_names(component=None)[source]#
Get the names of the components associated with a specific component.
- Parameters:
component (str) – The specific component to retrieve names for.
- Returns:
A list of component names associated with the specified component.
- Return type:
list
- get_configs()[source]#
Returns the configurations stored in the database.
- Returns:
A list of configuration names.
- Return type:
list
- get_dataset(data_type=None, component=None, component_name=None)[source]#
Retrieves a dataset based on the specified data type, component, and component name.
- Parameters:
data_type (str) – The type of data to retrieve.
component (str) – The component to retrieve the data from.
component_name (str) – The name of the component to retrieve the data from.
- Returns:
The retrieved dataset.
- Return type:
pandas.DataFrame
- Raises:
ValueError – If the system and component name are not defined.
ValueError – If the data type is not specified.
ValueError – If the component is not supported.
ValueError – If the component name is not supported.
ValueError – If the data type is not supported.
Exception – If an error occurs while loading the dataset.
- get_dataset_info(component=None, component_name=None, data_type=None)[source]#
Retrieves and prints information about a dataset.
- Parameters:
component (str) – The component of the dataset.
component_name (str) – The name of the component.
data_type (str) – The type of data.
- Returns:
None
- get_device_contributors_of(component=None, component_name=None, data_type=None)[source]#
View the reference/source experimental device that was used to validate a specific simulation configuration.
- Parameters:
component (str) – The component of interest.
component_name (str) – The name of the component.
data_type (str) – The type of data.
- Returns:
The relevant contributor information.
- Return type:
dict
- get_existing_files()[source]#
Retrieves the list of existing files in the repository.
- Returns:
A list of existing file names in the repository.
- Return type:
list
- get_measured_devices()[source]#
Retrieve all measured devices with their corresponding design codes, paper links, images, foundries, and fabrication recipes.
- Returns:
A DataFrame containing the name, design code, paper link, image, foundry, and fabrication recipe for each device.
- Return type:
pd.DataFrame
- read_parquet_file(file_name)[source]#
Takes in the filename and returns the object to be read as a pandas dataframe.
- Parameters:
file_name (str) – The name of the parquet file to read.
- Returns:
The dataframe read from the parquet file.
- Return type:
pandas.DataFrame
- see_dataset(data_type=None, component=None, component_name=None)[source]#
View a dataset based on the provided data type, component, and component name.
- Parameters:
data_type (str) – The type of data to view.
component (str) – The component to use. If not provided, the selected system will be used.
component_name (str) – The name of the component. If not provided, the selected component name will be used.
- Returns:
The flattened dataset.
- Return type:
pandas.DataFrame
- Raises:
ValueError – If both system and component name are not defined.
ValueError – If data type is not specified.
ValueError – If the component is not supported.
ValueError – If the component name is not supported.
ValueError – If the data type is not supported.
Exception – If an error occurs while loading the dataset.
- select_cavity(cavity=None)[source]#
Selects a cavity and sets the necessary attributes for further operations.
- Parameters:
cavity (str) – The name of the cavity to be selected.
- Raises:
UserWarning – If the selected system is either not specified or does not contain a cavity.
- Returns:
None
- select_cavity_claw(cavity=None)[source]#
Selects a cavity claw component.
- Parameters:
cavity (str) – The name of the cavity to select.
- Raises:
UserWarning – If the selected system is not specified or does not contain a cavity.
- Returns:
None
- select_components(component_dict=None)[source]#
Selects components based on the provided component dictionary or string.
- Parameters:
component_dict (dict or str) – A dictionary containing the component details (component, component_name, data_type) or a string representing the component.
- Returns:
None
- select_coupler(coupler=None)[source]#
Selects a coupler for the database.
- Parameters:
coupler (str, optional) – The name of the coupler to select. Defaults to None.
- Returns:
None
- select_qubit(qubit=None)[source]#
Selects a qubit and sets the necessary attributes for the selected qubit.
- Parameters:
qubit (str) – The name of the qubit to be selected.
- Raises:
UserWarning – If the selected system is not specified or does not contain a qubit.
- Returns:
None
- select_resonator_type(resonator_type)[source]#
Select the coupler based on the resonator type.
- Parameters:
resonator_type (str) – The type of resonator, e.g., “quarter” or “half”.
- select_system(components=None)[source]#
Selects the system and component(s) to be used.
- Parameters:
components (list or str) – The component(s) to be selected. If a list is provided, each component will be checked against the supported components. If a string is provided, it will be checked against the supported components.
- Returns:
None
- Raises:
None
- show_selections()[source]#
Prints the selected system, component, and data type.
If the selected system is a list, it prints the selected qubit, cavity, coupler, and system. If the selected system is a string, it prints the selected component, component name, data type, system, and coupler.
- supported_component_names()[source]#
Returns a list of supported component names extracted from the configs.
- Returns:
A list of supported component names.
- Return type:
list
- supported_components()[source]#
Returns a list of supported components based on the configurations.
- Returns:
A list of supported components.
- Return type:
list
- supported_config_names()[source]#
Retrieves the supported configuration names from the repository.
- Returns:
A list of supported configuration names.
- supported_data_types()[source]#
Returns a list of supported data types.
- Returns:
A list of supported data types.
- Return type:
list
- unselect(param)[source]#
Unselects the specified parameter.
- Parameters:
param (str) – The parameter to unselect. Valid options are “component”, “component_name”, “data_type”, “qubit”, “cavity_claw”, “coupler”, and “system”.
- Returns:
None
- unselect_all()[source]#
Clears the selected component, data type, qubit, cavity, coupler, and system.
- upload_dataset(file_paths, repo_file_names, overwrite=False)[source]#
Uploads a dataset to the repository.
- Parameters:
file_paths (list) – A list of file paths to upload.
repo_file_names (list) – A list of file names to use in the repository.
overwrite (bool) – Whether to overwrite an existing dataset. Defaults to False.
- view_all_contributors()[source]#
View all unique contributors and their relevant information from simulation configurations.
This method iterates through the simulation configurations and extracts the relevant information of each contributor. It checks if the combination of uploader, PI, group, and institution is already in the list of unique contributors. If not, it adds the relevant information to the list. Finally, it prints the list of unique contributors in a tabular format with a banner.
- view_all_simulation_contributors()[source]#
View all unique simulation contributors and their relevant information.
- view_component_names(component=None)[source]#
Prints the names of the components available in the database.
- Parameters:
component (str) – The specific component to view names for. If None, all component names will be printed.
- Returns:
None
- view_contributors_of(component=None, component_name=None, data_type=None, measured_device_name=None)[source]#
View contributors of a specific component, component name, and data type.
- Parameters:
component (str) – The component of interest.
component_name (str) – The name of the component.
data_type (str) – The type of data.
measured_device_name (str) – The name of the measured device.
- Returns:
None
- view_contributors_of_config(config)[source]#
View the contributors of a specific configuration.
- Parameters:
config (str) – The name of the configuration.
- Returns:
None
- view_datasets()[source]#
View the datasets available in the database.
This method retrieves the supported components, component names, and data types from the database and displays them in a tabular format.
- view_device_contributors_of(component=None, component_name=None, data_type=None)[source]#
View the reference/source experimental device that was used to validate a specific simulation configuration.
- Parameters:
component (str) – The component of interest.
component_name (str) – The name of the component.
data_type (str) – The type of data.
- Returns:
The name of the experimentally validated reference device, or an error message if not found.
- Return type:
str
- view_measured_devices()[source]#
View all measured devices with their corresponding design codes, paper links, images, foundries, and fabrication recipes.
This method retrieves and displays the relevant information for each device in the dataset in a well-formatted table.
- view_recipe_of(device_name)[source]#
Retrieve the foundry and fabrication recipe information for a specified device.
- Parameters:
device_name (str) – The name of the device to retrieve information for.
- Returns:
A dictionary containing foundry and fabrication recipe information.
- Return type:
dict
- view_reference_device_of(component=None, component_name=None, data_type=None)[source]#
View the reference/source experimental device that was used to validate a specific simulation configuration.
- Parameters:
component (str) – The component of interest.
component_name (str) – The name of the component.
data_type (str) – The type of data.
- view_reference_devices()[source]#
View all unique reference (experimental) devices and their relevant information.
This method iterates through the configurations and extracts each chip’s name within the SQuADDS DB, its group, and who measured it. It also finds the simulation results for the device. It then checks whether the combination of simulation results uploader, PI, group, and institution is already in the list of unique contributors and, if not, adds the relevant information to the list. Finally, it prints the list of unique devices in a tabular format.
- view_sim_contributors_of(component=None, component_name=None, data_type=None, measured_device_name=None)[source]#
View the simulation contributors of a specific component, component name, and data type.
- Parameters:
component (str) – The component of interest.
component_name (str) – The name of the component.
data_type (str) – The type of data.
measured_device_name (str) – The name of the measured device.
- Returns:
None
squadds.core.db_catalog module#
Catalog helpers for SQuADDS database config discovery.
- squadds.core.db_catalog.extract_supported_component_names(configs)[source]#
Return supported component names while preserving legacy duplicate entries.
- Return type:
list[str]
- squadds.core.db_catalog.extract_supported_components(configs)[source]#
Return supported components while preserving legacy duplicate entries.
- Return type:
list[str]
- squadds.core.db_catalog.extract_supported_data_types(configs)[source]#
Return supported data types while preserving legacy duplicate entries.
- Return type:
list[str]
- squadds.core.db_catalog.filter_simulation_config_names(configs)[source]#
Keep only config names that follow the legacy three-part naming convention.
- Return type:
list[str]
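A sketch of filtering by a three-part naming convention. It assumes the three parts are component, component name, and data type joined by hyphens; the separator, the function name, and the sample names are illustrative assumptions rather than confirmed details of the catalog.

```python
def keep_three_part_config_names(config_names):
    """Keep names that split into exactly three hyphen-separated parts."""
    # Assumption: configs look like '<component>-<component_name>-<data_type>'.
    return [name for name in config_names if len(name.split("-")) == 3]


names = [
    "qubit-TransmonCross-cap_matrix",
    "cavity_claw-RouteMeander-eigenmode",
    "measured_device_database",
]
simulation_names = keep_three_part_config_names(names)
```

Names that do not follow the convention, such as a single-token metadata config, are dropped.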
squadds.core.db_devices module#
- squadds.core.db_devices.build_measured_device_records(dataset)[source]#
Build the measured-device dataframe payload used by get_measured_devices.
- squadds.core.db_devices.build_measured_device_rows(dataset)[source]#
Build the rows printed by view_measured_devices.
- squadds.core.db_devices.build_recipe_rows(dataset, device_name)[source]#
Return the printable recipe rows for a measured device.
- squadds.core.db_devices.build_reference_device_records(dataset, simulation_lookup_fn)[source]#
Return unique reference-device rows for view_reference_devices.
- squadds.core.db_devices.collect_all_simulation_contributors(configs, repo_name, load_dataset_fn)[source]#
Gather unique simulation contributors across all dataset configs.
- squadds.core.db_devices.find_device_contributor_info(dataset, config)[source]#
Return contributor information for the measured device that validates a config.
- squadds.core.db_devices.find_reference_device_info(dataset, config)[source]#
Return the combined measured-device metadata for a config.
squadds.core.db_half_wave module#
Helpers for the half-wave cavity flow in SQuADDS_DB.
- squadds.core.db_half_wave.filter_and_validate_ncap_cavity_df(cavity_df, *, filter_df_by_conditions_fn, coupler_type='NCap')[source]#
Filter a cavity dataframe to the required coupler type and validate the result.
squadds.core.db_loader module#
Dataset request helpers for the SQuADDS_DB compatibility facade.
- class squadds.core.db_loader.DatasetRequestValidation(is_valid, message=None, options=None)[source]#
Bases: object
Validation result for a dataset lookup request.
- is_valid: bool#
- message: str | None = None#
- options: list[str] | None = None#
- squadds.core.db_loader.build_dataset_config(component, component_name, data_type)[source]#
Build the legacy dataset config identifier.
- Return type:
str
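The following sketch shows how a validation result with the three fields listed above might be paired with a config builder. The class and function names are stand-ins, the hyphen-joined identifier format is an assumption, and the validation rule is hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RequestValidation:
    """Stand-in with the same three fields as DatasetRequestValidation."""
    is_valid: bool
    message: Optional[str] = None
    options: Optional[list] = None


def build_config_name(component, component_name, data_type):
    """Join the three parts with hyphens (assumed identifier format)."""
    return f"{component}-{component_name}-{data_type}"


def validate_component(component, supported_components):
    """Hypothetical check that reports the valid options on failure."""
    if component in supported_components:
        return RequestValidation(is_valid=True)
    return RequestValidation(
        is_valid=False,
        message=f"Component '{component}' is not supported.",
        options=sorted(supported_components),
    )


supported = ["qubit", "cavity_claw", "coupler"]
ok = validate_component("qubit", supported)
bad = validate_component("antenna", supported)
config = build_config_name("qubit", "TransmonCross", "cap_matrix")
```

Returning a result object instead of raising lets the caller decide whether to print the options or raise a ValueError.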
squadds.core.db_merge module#
Merge helpers for composing qubit and cavity datasets.
squadds.core.db_selection module#
Selection helpers for the SQuADDS_DB compatibility facade.
- squadds.core.db_selection.build_component_selection(selected_system, required_component, component_name, data_type, warning_message)[source]#
Return the selected component name and data type for matching systems.
- Return type:
tuple[str | None, str]
- squadds.core.db_selection.is_supported_coupler(coupler, supported_component_names)[source]#
Check coupler support while preserving the legacy CLT alias.
- Return type:
bool
squadds.core.db_state module#
- squadds.core.db_state.format_selection_lines(selected_system, selected_component, selected_component_name, selected_data_type, selected_qubit, selected_cavity, selected_coupler, selected_resonator_type)[source]#
Return the printed selection lines for show_selections.
- squadds.core.db_state.get_unselect_attr_name(param)[source]#
Map the public unselect parameter to the instance attribute name.
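A lookup table is one simple way to express such a mapping. The sketch below pairs the documented unselect options with the selected_* attribute names listed for SQuADDS_DB; the exact pairing, the function name, and the error behaviour are assumptions.

```python
from types import SimpleNamespace

# Assumed pairing of public unselect options with instance attribute names.
UNSELECT_ATTRS = {
    "component": "selected_component",
    "component_name": "selected_component_name",
    "data_type": "selected_data_type",
    "qubit": "selected_qubit",
    "cavity_claw": "selected_cavity",
    "coupler": "selected_coupler",
    "system": "selected_system",
}


def unselect_attr_name(param):
    """Return the attribute name for an unselect option, or raise ValueError."""
    if param not in UNSELECT_ATTRS:
        raise ValueError(f"Unknown parameter {param!r}; valid options: {sorted(UNSELECT_ATTRS)}")
    return UNSELECT_ATTRS[param]


# Illustrative state object standing in for a database instance.
state = SimpleNamespace(selected_qubit="TransmonCross", selected_coupler="NCap")
setattr(state, unselect_attr_name("qubit"), None)  # clear only the qubit selection
```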
squadds.core.db_views module#
squadds.core.design_patterns module#
squadds.core.globals module#
squadds.core.json_utils module#
Internal helpers for normalizing legacy JSON-like payloads.
- squadds.core.json_utils.deserialize_json_like(value)[source]#
Recursively deserialize dataset fields that may already be dicts or JSON strings.
- squadds.core.json_utils.extract_optional_setup_payload(row, *keys, default=None)[source]#
Best-effort variant of extract_setup_payload().
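The recursive behaviour described for deserialize_json_like can be sketched with the standard json module. This is an illustration under assumptions, not the library code: the function name is hypothetical, and the rule that a string is only replaced when it decodes to a dict or list is a design choice made here so that plain values such as '200um' stay strings.

```python
import json


def deserialize_json_like_sketch(value):
    """Recursively decode JSON strings nested inside dicts and lists."""
    if isinstance(value, str):
        try:
            parsed = json.loads(value)
        except json.JSONDecodeError:
            return value  # not JSON, keep the original string
        if isinstance(parsed, (dict, list)):
            return deserialize_json_like_sketch(parsed)
        return value  # scalars such as '5' are left as strings
    if isinstance(value, dict):
        return {key: deserialize_json_like_sketch(item) for key, item in value.items()}
    if isinstance(value, list):
        return [deserialize_json_like_sketch(item) for item in value]
    return value


payload = {
    "design_options": '{"cross_length": "200um", "connection_pads": {"readout": {"claw_gap": "5um"}}}',
    "setup": {"passes": "[1, 2]"},
    "notes": "plain text",
}
decoded = deserialize_json_like_sketch(payload)
```

Values that are already dicts pass through unchanged apart from their nested strings, which matches the "may already be dicts or JSON strings" wording.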
squadds.core.metrics module#
- class squadds.core.metrics.ChebyshevMetric[source]#
Bases: MetricStrategy
Implements the Chebyshev metric strategy.
- calculate(target_params, df_row)[source]#
Calculate the Chebyshev distance between target_params and df_row.
- Parameters:
target_params (dict) – The target parameters as a dictionary.
df_row (pd.Series) – A single row from a DataFrame representing a set of parameters.
- Returns:
The Chebyshev distance.
- Return type:
float
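For reference, the Chebyshev distance is the largest absolute difference over all compared parameters. A minimal sketch using plain dicts, assuming every target key is present in the row; the function name and values are illustrative, and the library may preprocess parameters differently.

```python
def chebyshev_distance(target_params, row):
    """Largest absolute difference between a row and the targets."""
    return max(abs(row[key] - target) for key, target in target_params.items())


target = {"qubit_frequency_GHz": 4.0, "anharmonicity_MHz": -200.0}
candidate = {"qubit_frequency_GHz": 4.5, "anharmonicity_MHz": -210.0}
distance = chebyshev_distance(target, candidate)
```

Because the parameters carry different units, the one with the largest numeric scale (here the anharmonicity in MHz) tends to dominate an unnormalized Chebyshev distance.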
- class squadds.core.metrics.CustomMetric(custom_metric_func)[source]#
Bases: MetricStrategy
Implements a custom metric strategy using a user-defined function.
- Example Usage:
To use a custom Manhattan distance metric, define the function as follows:
    def manhattan_distance(target, simulated):
        return sum(abs(target[key] - simulated.get(key, 0)) for key in target)
Then, instantiate CustomMetric with this function:
    custom_metric = CustomMetric(manhattan_distance)
Initialize CustomMetric with a custom metric function.
- Parameters:
custom_metric_func (callable) – User-defined custom metric function. The function should take two dictionaries as arguments and return a float.
- calculate(target_params, df_row)[source]#
Calculate the custom metric between target_params and df_row using the user-defined function.
- Parameters:
target_params (dict) – The target parameters as a dictionary.
df_row (pd.Series) – A single row from a DataFrame representing a set of parameters.
- Returns:
The custom metric calculated using the user-defined function.
- Return type:
float
- class squadds.core.metrics.EuclideanMetric[source]#
Bases: MetricStrategy
Implements the specific Euclidean metric strategy as per your definition.
- calculate(target_params, df_row)[source]#
Calculate the custom Euclidean distance between target_params and df_row.
The Euclidean distance is calculated as: sqrt(sum_i (x_i - x_{target})^2 / x_{target}), where x_i are the values in df_row and x_{target} are the target parameters.
- Parameters:
target_params (dict) – The target parameters as a dictionary.
df_row (pd.Series) – A single row from a DataFrame representing a set of parameters.
- Returns:
The custom Euclidean distance.
- Return type:
float
- class squadds.core.metrics.ManhattanMetric[source]#
Bases: MetricStrategy
Implements the Manhattan metric strategy.
- calculate(target_params, df_row)[source]#
Calculate the Manhattan distance between target_params and df_row.
- Parameters:
target_params (dict) – The target parameters as a dictionary.
df_row (pd.Series) – A single row from a DataFrame representing a set of parameters.
- Returns:
The Manhattan distance.
- Return type:
float
- class squadds.core.metrics.MetricStrategy[source]#
Bases: ABC
Abstract class for metric strategies.
- abstractmethod calculate(target_params, row)[source]#
Calculate the distance metric between target parameters and a DataFrame row.
- Parameters:
target_params (dict) – Dictionary of target parameters.
row (pd.Series) – A row from a DataFrame.
- Returns:
Calculated distance.
- Return type:
float
- calculate_vectorized(target_params, df)[source]#
Calculate distances using vectorized operations.
- Parameters:
target_params (dict) – Dictionary of target parameters.
df (pd.DataFrame) – The DataFrame containing rows to calculate distances for.
- Returns:
Series of calculated distances.
- Return type:
pd.Series
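The split between an abstract per-row calculate and a shared batch method can be sketched as follows. The class and method names are stand-ins, rows are plain dicts, and the batch method is a simple loop standing in for the pandas-based vectorized path, which is not reproduced here.

```python
from abc import ABC, abstractmethod


class DistanceStrategy(ABC):
    """Stand-in for an abstract metric strategy."""

    @abstractmethod
    def calculate(self, target_params, row):
        """Return the distance between the targets and one row."""

    def calculate_many(self, target_params, rows):
        # Default batch behaviour: apply calculate to each row in turn.
        return [self.calculate(target_params, row) for row in rows]


class ManhattanDistance(DistanceStrategy):
    """Concrete strategy: sum of absolute differences."""

    def calculate(self, target_params, row):
        return sum(abs(row[key] - target) for key, target in target_params.items())


target = {"cavity_frequency_GHz": 7.0, "kappa_kHz": 200.0}
rows = [
    {"cavity_frequency_GHz": 7.0, "kappa_kHz": 210.0},
    {"cavity_frequency_GHz": 6.0, "kappa_kHz": 200.0},
]
distances = ManhattanDistance().calculate_many(target, rows)
```

Subclasses only need to implement calculate; a subclass with a faster batch formulation could override the batch method instead.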
- class squadds.core.metrics.WeightedEuclideanMetric(weights)[source]#
Bases: MetricStrategy
Concrete class for weighted Euclidean metric.
Initialize the weights.
- Parameters:
weights (dict) – Dictionary of weights for each parameter.
- calculate(target_params, row)[source]#
Calculate the weighted Euclidean distance between target parameters and a DataFrame row.
- Parameters:
target_params (dict) – Dictionary of target parameters.
row (pd.Series) – A row from a DataFrame.
- Returns:
Calculated weighted Euclidean distance.
- Return type:
float
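A weighted Euclidean distance in its textbook form is sqrt(sum_i w_i (x_i - t_i)^2). The sketch below implements that form on plain dicts; the function name is hypothetical, defaulting missing weights to 1.0 is an assumption, and the library may normalize terms differently.

```python
import math


def weighted_euclidean_distance(target_params, row, weights):
    """Textbook weighted Euclidean distance over the target keys."""
    total = 0.0
    for key, target in target_params.items():
        weight = weights.get(key, 1.0)  # assumed default weight
        total += weight * (row[key] - target) ** 2
    return math.sqrt(total)


target = {"cavity_frequency_GHz": 7.0, "kappa_kHz": 200.0}
candidate = {"cavity_frequency_GHz": 6.0, "kappa_kHz": 204.0}
weights = {"cavity_frequency_GHz": 9.0, "kappa_kHz": 1.0}
distance = weighted_euclidean_distance(target, candidate, weights)
```

Raising the weight on cavity_frequency_GHz makes a 1 GHz miss count as much as a 3 kHz miss in kappa, which is how weights express which parameters matter most.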
squadds.core.processing module#
- squadds.core.processing.update_cavity_frequency_and_kappa(merged_df, Z0=50)[source]#
Updates the cavity frequency and kappa based on the given merged_df DataFrame.
- Parameters:
merged_df – DataFrame containing the necessary simulation results.
Z0 – Characteristic impedance of the system (default: 50 Ohms).
- Returns:
cavity_frequency_updated – Updated cavity frequency in Hz.
kappa – Updated kappa in Hz.
squadds.core.utils module#
- class squadds.core.utils.HfApi(endpoint=None, token=None, library_name=None, library_version=None, user_agent=None, headers=None)[source]#
Bases: object
Client to interact with the Hugging Face Hub via HTTP.
The client is initialized with some high-level settings used in all requests made to the Hub (HF endpoint, authentication, user agents…). Using the HfApi client is preferred but not mandatory as all of its public methods are exposed directly at the root of huggingface_hub.
- Parameters:
endpoint (str, optional) – Endpoint of the Hub. Defaults to <https://huggingface.co>.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
library_name (str, optional) – The name of the library that is making the HTTP request. Will be added to the user-agent header. Example: “transformers”.
library_version (str, optional) – The version of the library that is making the HTTP request. Will be added to the user-agent header. Example: “4.24.0”.
user_agent (str, dict, optional) – The user agent info in the form of a dictionary or a single string. It will be completed with information about the installed packages.
headers (dict, optional) – Additional headers to be sent with each request. Example: {“X-My-Header”: “value”}. Headers passed here take precedence over the default headers.
- accept_access_request(repo_id, user, *, repo_type=None, token=None)[source]#
Accept an access request from a user for a given gated repo.
Once the request is accepted, the user will be able to download any file of the repo and access the community tab. If the approval mode is automatic, you don’t have to accept requests manually. An accepted request can be cancelled or rejected at any time using [cancel_access_request] and [reject_access_request].
For more info about gated repos, see https://huggingface.co/docs/hub/models-gated.
- Parameters:
repo_id (str) – The id of the repo to accept access request for.
user (str) – The username of the user whose access request should be accepted.
repo_type (str, optional) – The type of the repo to accept access request for. Must be one of model, dataset or space. Defaults to model.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Raises:
[HfHubHTTPError] – HTTP 400 if the repo is not gated.
[HfHubHTTPError] – HTTP 403 if you only have read-only access to the repo. This can be the case if you don’t have write or admin role in the organization the repo belongs to or if you passed a read token.
[HfHubHTTPError] – HTTP 404 if the user does not exist on the Hub.
[HfHubHTTPError] – HTTP 404 if the user access request cannot be found.
[HfHubHTTPError] – HTTP 404 if the user access request is already in the accepted list.
- add_collection_item(collection_slug, item_id, item_type, *, note=None, exists_ok=False, token=None)[source]#
Add an item to a collection on the Hub.
- Parameters:
collection_slug (str) – Slug of the collection to update. Example: “TheBloke/recent-models-64f9a55bb3115b4f513ec026”.
item_id (str) – Id of the item to add to the collection. Use the repo_id for repos/spaces/datasets, the paper id for papers, or the slug of another collection (e.g. “moonshotai/kimi-k2”).
item_type (str) – Type of the item to add. Can be one of “model”, “dataset”, “space”, “paper” or “collection”.
note (str, optional) – A note to attach to the item in the collection. The maximum size for a note is 500 characters.
exists_ok (bool, optional) – If True, do not raise an error if item already exists.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Return type:
Collection
Returns: [Collection]
- Raises:
[HfHubHTTPError] – HTTP 403 if you only have read-only access to the repo. This can be the case if you don’t have write or admin role in the organization the repo belongs to or if you passed a read token.
[HfHubHTTPError] – HTTP 404 if the item you try to add to the collection does not exist on the Hub.
[HfHubHTTPError] – HTTP 409 if the item you try to add to the collection is already in the collection (and exists_ok=False)
Example:
```py
>>> from huggingface_hub import add_collection_item
>>> collection = add_collection_item(
...     collection_slug="davanstrien/climate-64f99dc2a5067f6b65531bab",
...     item_id="pierre-loic/climate-news-articles",
...     item_type="dataset"
... )
>>> collection.items[-1].item_id
"pierre-loic/climate-news-articles"
# ^item got added to the collection on last position

# Add item with a note
>>> add_collection_item(
...     collection_slug="davanstrien/climate-64f99dc2a5067f6b65531bab",
...     item_id="datasets/climate_fever",
...     item_type="dataset",
...     note="This dataset adopts the FEVER methodology that consists of 1,535 real-world claims regarding climate-change collected on the internet."
... )
(...)
```
- add_space_secret(repo_id, key, value, *, description=None, token=None)[source]#
Adds or updates a secret in a Space.
Secrets let you set secret keys or tokens on a Space without hardcoding them. For more details, see https://huggingface.co/docs/hub/spaces-overview#managing-secrets.
- Parameters:
repo_id (str) – ID of the repo to update. Example: “bigcode/in-the-stack”.
key (str) – Secret key. Example: “GITHUB_API_KEY”
value (str) – Secret value. Example: “your_github_api_key”.
description (str, optional) – Secret description. Example: “Github API key to access the Github API”.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- add_space_variable(repo_id, key, value, *, description=None, token=None)[source]#
Adds or updates a variable in a Space.
Variables let you set environment variables on a Space without hardcoding them. For more details, see https://huggingface.co/docs/hub/spaces-overview#managing-secrets-and-environment-variables.
- Parameters:
repo_id (str) – ID of the repo to update. Example: “bigcode/in-the-stack”.
key (str) – Variable key. Example: “MODEL_REPO_ID”
value (str) – Variable value. Example: “the_model_repo_id”.
description (str) – Description of the variable. Example: “Model Repo ID of the implemented model”.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Return type:
dict[str, SpaceVariable]
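The two methods above can be combined to configure a Space in one pass. The sketch below is an assumption-laden illustration: `configure_space` and `RecordingApi` are hypothetical names, and the stand-in class only records calls so the example runs offline. In real use, `api` would be an `HfApi` instance.

```python
def configure_space(api, repo_id, secrets, variables):
    """Push secrets and variables to a Space.

    `api` is expected to expose `add_space_secret` and `add_space_variable`
    with the signatures documented above.
    """
    for key, value in secrets.items():
        api.add_space_secret(repo_id, key, value)
    for key, value in variables.items():
        api.add_space_variable(repo_id, key, value)


class RecordingApi:
    """Offline stand-in that records calls instead of contacting the Hub."""

    def __init__(self):
        self.calls = []

    def add_space_secret(self, repo_id, key, value, *, description=None, token=None):
        self.calls.append(("secret", repo_id, key))

    def add_space_variable(self, repo_id, key, value, *, description=None, token=None):
        self.calls.append(("variable", repo_id, key))


api = RecordingApi()
configure_space(
    api,
    "bigcode/in-the-stack",
    secrets={"GITHUB_API_KEY": "your_github_api_key"},
    variables={"MODEL_REPO_ID": "the_model_repo_id"},
)
```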
- auth_check(repo_id, *, repo_type=None, token=None)[source]#
Check if the provided user token has access to a specific repository on the Hugging Face Hub.
This method verifies whether the user, authenticated via the provided token, has access to the specified repository. If the repository is not found or if the user lacks the required permissions to access it, the method raises an appropriate exception.
- Parameters:
repo_id (str) – The repository to check for access. Format should be “user/repo_name”. Example: “user/my-cool-model”.
repo_type (str, optional) – The type of the repository. Should be one of “model”, “dataset”, or “space”. If not specified, the default is “model”.
token (bool or str, optional) – A valid user access token. If not provided, the locally saved token will be used, which is the recommended authentication method. Set to False to disable authentication. Refer to: https://huggingface.co/docs/huggingface_hub/quick-start#authentication.
- Raises:
[RepositoryNotFoundError] – Raised if the repository does not exist, is private, or the user does not have access. This can occur if the repo_id or repo_type is incorrect or if the repository is private but the user is not authenticated.
[GatedRepoError] – Raised if the repository exists but is gated and the user is not authorized to access it.
Example
Check if the user has access to a repository:
```python
from huggingface_hub import auth_check
from huggingface_hub.utils import GatedRepoError, RepositoryNotFoundError

try:
    auth_check("user/my-cool-model")
except GatedRepoError:
    # Handle gated repository error
    print("You do not have permission to access this gated repository.")
except RepositoryNotFoundError:
    # Handle repository not found error
    print("The repository was not found or you do not have access.")
```
In this example:
- If the user has access, the method completes successfully.
- If the repository is gated or does not exist, appropriate exceptions are raised, allowing the user to handle them accordingly.
- cancel_access_request(repo_id, user, *, repo_type=None, token=None)[source]#
Cancel an access request from a user for a given gated repo.
A cancelled request will go back to the pending list and the user will lose access to the repo.
For more info about gated repos, see https://huggingface.co/docs/hub/models-gated.
- Parameters:
repo_id (str) – The id of the repo to cancel access request for.
user (str) – The username of the user whose access request should be cancelled.
repo_type (str, optional) – The type of the repo to cancel access request for. Must be one of model, dataset or space. Defaults to model.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Raises:
[HfHubHTTPError] – HTTP 400 if the repo is not gated.
[HfHubHTTPError] – HTTP 403 if you only have read-only access to the repo. This can be the case if you don’t have write or admin role in the organization the repo belongs to or if you passed a read token.
[HfHubHTTPError] – HTTP 404 if the user does not exist on the Hub.
[HfHubHTTPError] – HTTP 404 if the user access request cannot be found.
[HfHubHTTPError] – HTTP 404 if the user access request is already in the pending list.
- cancel_job(*, job_id, namespace=None, token=None)[source]#
Cancel a compute Job on Hugging Face infrastructure.
- Parameters:
job_id (str) – ID of the Job.
namespace (str, optional) – The namespace where the Job is running. Defaults to the current user’s namespace.
token (bool or str, optional) – A valid user access token. If not provided, the locally saved token will be used, which is the recommended authentication method. Set to False to disable authentication. Refer to: https://huggingface.co/docs/huggingface_hub/quick-start#authentication.
- change_discussion_status(repo_id, discussion_num, new_status, *, token=None, comment=None, repo_type=None)[source]#
Closes or re-opens a Discussion or Pull Request.
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
discussion_num (int) – The number of the Discussion or Pull Request. Must be a strictly positive integer.
new_status (str) – The new status for the discussion, either “open” or “closed”.
comment (str, optional) – An optional comment to post with the status change.
repo_type (str, optional) – Set to “dataset” or “space” if uploading to a dataset or space, None or “model” if uploading to a model. Default is None.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
the status change event
- Return type:
[DiscussionStatusChange]
Examples
```python
>>> HfApi().change_discussion_status(
...     repo_id="username/repo_name",
...     discussion_num=34,
...     new_status="closed"
... )
# DiscussionStatusChange(id='deadbeef0000000', type='status-change', ...)
```
> [!TIP]
> Raises the following errors:
>
> - [HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError) if the HuggingFace API returned an error
> - [ValueError](https://docs.python.org/3/library/exceptions.html#ValueError) if some parameter value is invalid
> - [~utils.RepositoryNotFoundError] If the repository to download from cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.
- comment_discussion(repo_id, discussion_num, comment, *, token=None, repo_type=None)[source]#
Creates a new comment on the given Discussion.
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
discussion_num (int) – The number of the Discussion or Pull Request. Must be a strictly positive integer.
comment (str) – The content of the comment to create. Comments support markdown formatting.
repo_type (str, optional) – Set to “dataset” or “space” if uploading to a dataset or space, None or “model” if uploading to a model. Default is None.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
the newly created comment
- Return type:
[DiscussionComment]
Examples
>>> comment = """
... Hello @otheruser!
...
... # This is a title
...
... **This is bold**, *this is italic* and ~this is strikethrough~
... And [this](http://url) is a link
... """
>>> HfApi().comment_discussion(
...     repo_id="username/repo_name",
...     discussion_num=34,
...     comment=comment
... )
# DiscussionComment(id='deadbeef0000000', type='comment', ...)
> [!TIP]
> Raises the following errors:
>
> - [HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError) if the HuggingFace API returned an error
> - [ValueError](https://docs.python.org/3/library/exceptions.html#ValueError) if some parameter value is invalid
> - [~utils.RepositoryNotFoundError] If the repository to download from cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.
- create_branch(repo_id, *, branch, revision=None, token=None, repo_type=None, exist_ok=False)[source]#
Create a new branch for a repo on the Hub, starting from the specified revision (defaults to main). To find a revision suiting your needs, you can use [list_repo_refs] or [list_repo_commits].
- Parameters:
repo_id (str) – The repository in which the branch will be created. Example: “user/my-cool-model”.
branch (str) – The name of the branch to create.
revision (str, optional) – The git revision to create the branch from. It can be a branch name or the OID/SHA of a commit, as a hexadecimal string. Defaults to the head of the “main” branch.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
repo_type (str, optional) – Set to “dataset” or “space” if creating a branch on a dataset or space, None or “model” if creating a branch on a model. Default is None.
exist_ok (bool, optional, defaults to False) – If True, do not raise an error if branch already exists.
- Raises:
[RepositoryNotFoundError] – If repository is not found (error 404): wrong repo_id/repo_type, private but not authenticated or repo does not exist.
[BadRequestError] – If invalid reference for a branch. Ex: refs/pr/5 or ‘refs/foo/bar’.
[HfHubHTTPError] – If the branch already exists on the repo (error 409) and exist_ok is set to False.
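The effect of exist_ok can be sketched with a small idempotent helper. The names `ensure_branch` and `FakeHub` are hypothetical; `FakeHub` imitates the documented conflict behaviour offline, with a `RuntimeError` standing in for the HTTP 409 `HfHubHTTPError`. In real use, `api` would be an `HfApi` instance.

```python
def ensure_branch(api, repo_id, branch, revision=None):
    """Create `branch` if needed; an existing branch is not treated as an error."""
    api.create_branch(repo_id, branch=branch, revision=revision, exist_ok=True)
    return branch


class FakeHub:
    """Offline stand-in mirroring the create_branch signature documented above."""

    def __init__(self):
        self.branches = {"main"}

    def create_branch(self, repo_id, *, branch, revision=None, token=None,
                      repo_type=None, exist_ok=False):
        if branch in self.branches and not exist_ok:
            # Stand-in for the 409 conflict raised by the Hub.
            raise RuntimeError("409: branch already exists")
        self.branches.add(branch)


hub = FakeHub()
ensure_branch(hub, "user/my-cool-model", "experiment")
ensure_branch(hub, "user/my-cool-model", "experiment")  # second call is harmless
```

Passing exist_ok=True makes the call safe to repeat, which is convenient in scripts that may be re-run.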
- create_collection(title, *, namespace=None, description=None, private=False, exists_ok=False, token=None)[source]#
Create a new Collection on the Hub.
- Parameters:
title (str) – Title of the collection to create. Example: “Recent models”.
namespace (str, optional) – Namespace of the collection to create (username or org). Will default to the owner name.
description (str, optional) – Description of the collection to create.
private (bool, optional) – Whether the collection should be private or not. Defaults to False (i.e. public collection).
exists_ok (bool, optional) – If True, do not raise an error if collection already exists.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Return type:
Collection
Returns: [Collection]
Example:
```py
>>> from huggingface_hub import create_collection
>>> collection = create_collection(
...     title="ICCV 2023",
...     description="Portfolio of models, papers and demos I presented at ICCV 2023",
... )
>>> collection.slug
"username/iccv-2023-64f9a55bb3115b4f513ec026"
```
- create_commit(repo_id, operations, *, commit_message, commit_description=None, token=None, repo_type=None, revision=None, create_pr=None, num_threads=5, parent_commit=None, run_as_future=False)[source]#
Creates a commit in the given repo, deleting & uploading files as needed.
> [!WARNING]
> The input list of CommitOperation will be mutated during the commit process. Do not reuse the same objects for multiple commits.

> [!WARNING]
> create_commit assumes that the repo already exists on the Hub. If you get a Client error 404, please make sure you are authenticated and that repo_id and repo_type are set correctly. If repo does not exist, create it first using [~hf_api.create_repo].

> [!WARNING]
> create_commit is limited to 25k LFS files and a 1GB payload for regular files.
- Parameters:
repo_id (str) – The repository in which the commit will be created, for example: “username/custom_transformers”
operations (Iterable of [~hf_api.CommitOperation]) –
An iterable of operations to include in the commit, either:
[~hf_api.CommitOperationAdd] to upload a file
[~hf_api.CommitOperationDelete] to delete a file
[~hf_api.CommitOperationCopy] to copy a file
Operation objects will be mutated to include information relative to the upload. Do not reuse the same objects for multiple commits.
commit_message (str) – The summary (first line) of the commit that will be created.
commit_description (str, optional) – The description of the commit that will be created
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
repo_type (str, optional) – Set to “dataset” or “space” if uploading to a dataset or space, None or “model” if uploading to a model. Default is None.
revision (str, optional) – The git revision to commit from. Defaults to the head of the “main” branch.
create_pr (boolean, optional) – Whether or not to create a Pull Request with that commit. Defaults to False. If revision is not set, PR is opened against the “main” branch. If revision is set and is a branch, PR is opened against this branch. If revision is set and is not a branch name (example: a commit oid), a RevisionNotFoundError is returned by the server.
num_threads (int, optional) – Number of concurrent threads for uploading files. Defaults to 5. Setting it to 2 means at most 2 files will be uploaded concurrently.
parent_commit (str, optional) – The OID / SHA of the parent commit, as a hexadecimal string. Shorthands (7 first characters) are also supported. If specified and create_pr is False, the commit will fail if revision does not point to parent_commit. If specified and create_pr is True, the pull request will be created from parent_commit. Specifying parent_commit ensures the repo has not changed before committing the changes, and can be especially useful if the repo is updated / committed to concurrently.
run_as_future (bool, optional) – Whether or not to run this method in the background. Background jobs are run sequentially without blocking the main thread. Passing run_as_future=True will return a [Future](https://docs.python.org/3/library/concurrent.futures.html#future-objects) object. Defaults to False.
- Returns:
Instance of [CommitInfo] containing information about the newly created commit (commit hash, commit url, pr url, commit message,…). If run_as_future=True is passed, returns a Future object which will contain the result when executed.
- Return type:
[CommitInfo] or Future
- Raises:
[ValueError](https://docs.python.org/3/library/exceptions.html#ValueError) – If commit message is empty.
[ValueError](https://docs.python.org/3/library/exceptions.html#ValueError) – If parent commit is not a valid commit OID.
[ValueError](https://docs.python.org/3/library/exceptions.html#ValueError) – If a README.md file with an invalid metadata section is committed. In this case, the commit will fail early, before trying to upload any file.
[ValueError](https://docs.python.org/3/library/exceptions.html#ValueError) – If create_pr is True and revision is neither None nor “main”.
[RepositoryNotFoundError] – If repository is not found (error 404): wrong repo_id/repo_type, private but not authenticated or repo does not exist.
- create_discussion(repo_id, title, *, token=None, description=None, repo_type=None, pull_request=False)[source]#
Creates a Discussion or Pull Request.
Pull Requests created programmatically will be in “draft” status.
Creating a Pull Request with changes can also be done at once with [HfApi.create_commit].
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
title (str) – The title of the discussion. It can be up to 200 characters long, and must be at least 3 characters long. Leading and trailing whitespaces will be stripped.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
description (str, optional) – An optional description for the Pull Request. Defaults to “Discussion opened with the huggingface_hub Python library”
pull_request (bool, optional) – Whether to create a Pull Request or discussion. If True, creates a Pull Request. If False, creates a discussion. Defaults to False.
repo_type (str, optional) – Set to “dataset” or “space” if uploading to a dataset or space, None or “model” if uploading to a model. Default is None.
- Return type:
DiscussionWithDetails
Returns: [DiscussionWithDetails]
> [!TIP]
> Raises the following errors:
>
> - [HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError) if the HuggingFace API returned an error
> - [ValueError](https://docs.python.org/3/library/exceptions.html#ValueError) if some parameter value is invalid
> - [~utils.RepositoryNotFoundError] If the repository to download from cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.
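The title constraints described above (3 to 200 characters, surrounding whitespace stripped) can be checked locally before calling the Hub. This is only a sketch: `open_discussion` and `FakeHub` are hypothetical names, the local check is an assumption that mirrors the documented limits rather than the Hub's own validation, and `FakeHub` returns a plain dictionary so the example runs offline.

```python
def open_discussion(api, repo_id, title, description=None, pull_request=False):
    """Validate the title locally, then delegate to `api.create_discussion`."""
    cleaned = title.strip()
    if not 3 <= len(cleaned) <= 200:
        raise ValueError("title must be 3 to 200 characters after stripping whitespace")
    return api.create_discussion(
        repo_id, cleaned, description=description, pull_request=pull_request
    )


class FakeHub:
    """Offline stand-in mirroring the create_discussion signature documented above."""

    def create_discussion(self, repo_id, title, *, token=None, description=None,
                          repo_type=None, pull_request=False):
        return {"repo_id": repo_id, "title": title, "is_pull_request": pull_request}


result = open_discussion(FakeHub(), "username/repo_name", "  Fix typo in README  ")
```

Failing early on an invalid title avoids a round trip to the Hub for a request that would be rejected anyway.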
- create_inference_endpoint(name, *, repository, framework, accelerator, instance_size, instance_type, region, vendor, account_id=None, min_replica=1, max_replica=1, scaling_metric=None, scaling_threshold=None, scale_to_zero_timeout=None, revision=None, task=None, custom_image=None, env=None, secrets=None, type=InferenceEndpointType.PROTECTED, domain=None, path=None, cache_http_responses=None, tags=None, namespace=None, token=None)[source]#
Create a new Inference Endpoint.
- Parameters:
name (str) – The unique name for the new Inference Endpoint.
repository (str) – The name of the model repository associated with the Inference Endpoint (e.g. “gpt2”).
framework (str) – The machine learning framework used for the model (e.g. “custom”).
accelerator (str) – The hardware accelerator to be used for inference (e.g. “cpu”).
instance_size (str) – The size or type of the instance to be used for hosting the model (e.g. “x4”).
instance_type (str) – The cloud instance type where the Inference Endpoint will be deployed (e.g. “intel-icl”).
region (str) – The cloud region in which the Inference Endpoint will be created (e.g. “us-east-1”).
vendor (str) – The cloud provider or vendor where the Inference Endpoint will be hosted (e.g. “aws”).
account_id (str, optional) – The account ID used to link a VPC to a private Inference Endpoint (if applicable).
min_replica (int, optional) – The minimum number of replicas (instances) to keep running for the Inference Endpoint. To enable scaling to zero, set this value to 0 and adjust scale_to_zero_timeout accordingly. Defaults to 1.
max_replica (int, optional) – The maximum number of replicas (instances) to scale to for the Inference Endpoint. Defaults to 1.
scaling_metric (str or [InferenceEndpointScalingMetric], optional) – The metric reference for scaling. Either “pendingRequests” or “hardwareUsage” when provided. Defaults to None (meaning: let the HF Endpoints service specify the metric).
scaling_threshold (float, optional) – The scaling metric threshold used to trigger a scale up. Ignored when scaling metric is not provided. Defaults to None (meaning: let the HF Endpoints service specify the threshold).
scale_to_zero_timeout (int, optional) – The duration in minutes before an inactive endpoint is scaled to zero, or no scaling to zero if set to None and min_replica is not 0. Defaults to None.
revision (str, optional) – The specific model revision to deploy on the Inference Endpoint (e.g. “6c0e6080953db56375760c0471a8c5f2929baf11”).
task (str, optional) – The task on which to deploy the model (e.g. “text-classification”).
custom_image (dict, optional) – A custom Docker image to use for the Inference Endpoint. This is useful if you want to deploy an Inference Endpoint running on the text-generation-inference (TGI) framework (see examples).
env (dict[str, str], optional) – Non-secret environment variables to inject in the container environment.
secrets (dict[str, str], optional) – Secret values to inject in the container environment.
type ([InferenceEndpointType], optional) – The type of the Inference Endpoint, which can be “protected” (default), “public” or “private”.
domain (str, optional) – The custom domain for the Inference Endpoint deployment. If set up, the Inference Endpoint will be available at this domain (e.g. “my-new-domain.cool-website.woof”).
path (str, optional) – The custom path to the deployed model, should start with a / (e.g. “/models/google-bert/bert-base-uncased”).
cache_http_responses (bool, optional) – Whether to cache HTTP responses from the Inference Endpoint. Defaults to False.
tags (list[str], optional) – A list of tags to associate with the Inference Endpoint.
namespace (str, optional) – The namespace where the Inference Endpoint will be created. Defaults to the current user’s namespace.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
information about the newly created Inference Endpoint.
- Return type:
[InferenceEndpoint]
Example
```python
>>> from huggingface_hub import HfApi
>>> api = HfApi()
>>> endpoint = api.create_inference_endpoint(
...     "my-endpoint-name",
...     repository="gpt2",
...     framework="pytorch",
...     task="text-generation",
...     accelerator="cpu",
...     vendor="aws",
...     region="us-east-1",
...     type="protected",
...     instance_size="x2",
...     instance_type="intel-icl",
... )
>>> endpoint
InferenceEndpoint(name='my-endpoint-name', status="pending",...)

# Run inference on the endpoint
>>> endpoint.client.text_generation(...)
"..."
```
```python
# Start an Inference Endpoint running Zephyr-7b-beta on TGI
>>> from huggingface_hub import HfApi
>>> api = HfApi()
>>> endpoint = api.create_inference_endpoint(
...     "aws-zephyr-7b-beta-0486",
...     repository="HuggingFaceH4/zephyr-7b-beta",
...     framework="pytorch",
...     task="text-generation",
...     accelerator="gpu",
...     vendor="aws",
...     region="us-east-1",
...     type="protected",
...     instance_size="x1",
...     instance_type="nvidia-a10g",
...     env={
...         "MAX_BATCH_PREFILL_TOKENS": "2048",
...         "MAX_INPUT_LENGTH": "1024",
...         "MAX_TOTAL_TOKENS": "1512",
...         "MODEL_ID": "/repository"
...     },
...     custom_image={
...         "health_route": "/health",
...         "url": "ghcr.io/huggingface/text-generation-inference:1.1.0",
...     },
...     secrets={"MY_SECRET_KEY": "secret_value"},
...     tags=["dev", "text-generation"],
... )
```
```python
# Start an Inference Endpoint running ProsusAI/finbert while scaling to zero in 15 minutes
>>> from huggingface_hub import HfApi
>>> api = HfApi()
>>> endpoint = api.create_inference_endpoint(
...     "finbert-classifier",
...     repository="ProsusAI/finbert",
...     framework="pytorch",
...     task="text-classification",
...     min_replica=0,
...     scale_to_zero_timeout=15,
...     accelerator="cpu",
...     vendor="aws",
...     region="us-east-1",
...     type="protected",
...     instance_size="x2",
...     instance_type="intel-icl",
... )
>>> endpoint.wait(timeout=300)
# Run inference on the endpoint
>>> endpoint.client.text_classification(...)
TextClassificationOutputElement(label='positive', score=0.8983615040779114)
```
- create_inference_endpoint_from_catalog(repo_id, *, name=None, token=None, namespace=None)[source]#
Create a new Inference Endpoint from a model in the Hugging Face Inference Catalog.
The goal of the Inference Catalog is to provide a curated list of models that are optimized for inference and for which default configurations have been tested. See https://endpoints.huggingface.co/catalog for a list of available models in the catalog.
- Parameters:
repo_id (str) – The ID of the model in the catalog to deploy as an Inference Endpoint.
name (str, optional) – The unique name for the new Inference Endpoint. If not provided, a random name will be generated.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication).
namespace (str, optional) – The namespace where the Inference Endpoint will be created. Defaults to the current user’s namespace.
- Returns:
information about the new Inference Endpoint.
- Return type:
[InferenceEndpoint]
> [!WARNING] > create_inference_endpoint_from_catalog is experimental. Its API is subject to change in the future. Please provide feedback > if you have any suggestions or requests.
- create_pull_request(repo_id, title, *, token=None, description=None, repo_type=None)[source]#
Creates a Pull Request. Pull Requests created programmatically will be in “draft” status.
Creating a Pull Request with changes can also be done at once with [HfApi.create_commit].
This is a wrapper around [HfApi.create_discussion].
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
title (str) – The title of the discussion. It can be up to 200 characters long, and must be at least 3 characters long. Leading and trailing whitespaces will be stripped.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
description (str, optional) – An optional description for the Pull Request. Defaults to “Discussion opened with the huggingface_hub Python library”
repo_type (str, optional) – Set to “dataset” or “space” if uploading to a dataset or space, None or “model” if uploading to a model. Default is None.
- Return type:
DiscussionWithDetails
Returns: [DiscussionWithDetails]
- Raises:
[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError) – If the HuggingFace API returned an error.
[ValueError](https://docs.python.org/3/library/exceptions.html#ValueError) – If some parameter value is invalid.
[RepositoryNotFoundError] – If the repository to download from cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.
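The title constraints listed for create_pull_request (3 to 200 characters, surrounding whitespace stripped) can be checked locally before any request is sent. The following is a minimal sketch, assuming the length limits apply to the stripped title. `normalize_pr_title` and `open_draft_pr` are hypothetical helper names that are not part of huggingface_hub, and `api` is assumed to be an `HfApi` instance.

```python
def normalize_pr_title(title: str) -> str:
    """Strip whitespace and enforce the documented 3-200 character range."""
    cleaned = title.strip()
    if not 3 <= len(cleaned) <= 200:
        raise ValueError(
            f"Title must be 3 to 200 characters after stripping, got {len(cleaned)}"
        )
    return cleaned


def open_draft_pr(api, repo_id: str, title: str, description=None):
    """Open a draft Pull Request through an HfApi-like object (hypothetical helper)."""
    return api.create_pull_request(
        repo_id, normalize_pr_title(title), description=description
    )
```

A call might look like `open_draft_pr(HfApi(), "user/my-cool-model", "Fix typo in README")`. Validating first turns an avoidable server-side ValueError into an immediate local one.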
- create_repo(repo_id, *, token=None, private=None, repo_type=None, exist_ok=False, resource_group_id=None, space_sdk=None, space_hardware=None, space_storage=None, space_sleep_time=None, space_secrets=None, space_variables=None)[source]#
Create an empty repo on the HuggingFace Hub.
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
private (bool, optional) – Whether to make the repo private. If None (default), the repo will be public unless the organization’s default is private. This value is ignored if the repo already exists.
repo_type (str, optional) – Set to “dataset” or “space” if uploading to a dataset or space, None or “model” if uploading to a model. Default is None.
exist_ok (bool, optional, defaults to False) – If True, do not raise an error if repo already exists.
resource_group_id (str, optional) – Resource group in which to create the repo. Resource groups are only available for Enterprise Hub organizations and let you define which members of the organization can access the resource. The ID of a resource group can be found in the URL of the resource’s page on the Hub (e.g. “66670e5163145ca562cb1988”). To learn more about resource groups, see https://huggingface.co/docs/hub/en/security-resource-groups.
space_sdk (str, optional) – Choice of SDK to use if repo_type is “space”. Can be “streamlit”, “gradio”, “docker”, or “static”.
space_hardware (SpaceHardware or str, optional) – Choice of Hardware if repo_type is “space”. See [SpaceHardware] for a complete list.
space_storage (SpaceStorage or str, optional) – Choice of persistent storage tier. Example: “small”. See [SpaceStorage] for a complete list.
space_sleep_time (int, optional) – Number of seconds of inactivity to wait before a Space is put to sleep. Set to -1 if you don’t want your Space to sleep (default behavior for upgraded hardware). For free hardware, you can’t configure the sleep time (value is fixed to 48 hours of inactivity). See https://huggingface.co/docs/hub/spaces-gpus#sleep-time for more details.
space_secrets (list[dict[str, str]], optional) – A list of secret keys to set in your Space. Each item is in the form {“key”: …, “value”: …, “description”: …} where description is optional. For more details, see https://huggingface.co/docs/hub/spaces-overview#managing-secrets.
space_variables (list[dict[str, str]], optional) – A list of public environment variables to set in your Space. Each item is in the form {“key”: …, “value”: …, “description”: …} where description is optional. For more details, see https://huggingface.co/docs/hub/spaces-overview#managing-secrets-and-environment-variables.
- Returns:
URL to the newly created repo. Value is a subclass of str containing attributes like endpoint, repo_type and repo_id.
- Return type:
[RepoUrl]
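The space_secrets and space_variables parameters of create_repo expect a list of {“key”, “value”, “description”} dictionaries, while configuration is often kept as a plain mapping. The sketch below shows one way to convert between the two. `to_space_entries` and `create_gradio_space` are hypothetical helpers, and `api` is assumed to be an `HfApi` instance.

```python
from typing import Optional


def to_space_entries(values: dict, descriptions: Optional[dict] = None) -> list:
    """Convert {"KEY": "value"} into the list-of-dicts form create_repo documents."""
    descriptions = descriptions or {}
    entries = []
    for key, value in values.items():
        entry = {"key": key, "value": value}
        if key in descriptions:  # the description field is optional
            entry["description"] = descriptions[key]
        entries.append(entry)
    return entries


def create_gradio_space(api, repo_id: str, secrets: dict, variables: dict):
    """Create a Gradio Space through an HfApi-like object (hypothetical helper)."""
    return api.create_repo(
        repo_id,
        repo_type="space",
        space_sdk="gradio",
        space_secrets=to_space_entries(secrets),
        space_variables=to_space_entries(variables),
        exist_ok=True,  # do not raise if the Space already exists
    )
```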
- create_scheduled_job(*, image, command, schedule, suspend=None, concurrency=None, env=None, secrets=None, flavor=None, timeout=None, namespace=None, token=None)[source]#
Create scheduled compute Jobs on Hugging Face infrastructure.
- Parameters:
image (str) – The Docker image to use. Examples: “ubuntu”, “python:3.12”, “pytorch/pytorch:2.6.0-cuda12.4-cudnn9-devel”. Example with an image from a Space: “hf.co/spaces/lhoestq/duckdb”.
command (list[str]) – The command to run. Example: [“echo”, “hello”].
schedule (str) – One of “@annually”, “@yearly”, “@monthly”, “@weekly”, “@daily”, “@hourly”, or a CRON schedule expression (e.g., ‘0 9 * * 1’ for 9 AM every Monday).
suspend (bool, optional) – If True, the scheduled Job is suspended (paused). Defaults to False.
concurrency (bool, optional) – If True, multiple instances of this Job can run concurrently. Defaults to False.
env (dict[str, Any], optional) – Defines the environment variables for the Job.
secrets (dict[str, Any], optional) – Defines the secret environment variables for the Job.
flavor (str, optional) – Flavor for the hardware, as in Hugging Face Spaces. See [SpaceHardware] for possible values. Defaults to “cpu-basic”.
timeout (Union[int, float, str], optional) – Max duration for the Job: int/float with s (seconds, default), m (minutes), h (hours) or d (days). Example: 300 or “5m” for 5 minutes.
namespace (str, optional) – The namespace where the Job will be created. Defaults to the current user’s namespace.
token (bool or str, optional) – A valid user access token. If not provided, the locally saved token will be used, which is the recommended authentication method. Set to False to disable authentication. Refer to: https://huggingface.co/docs/huggingface_hub/quick-start#authentication.
- Return type:
ScheduledJobInfo
Example
Create your first scheduled Job:
```python
>>> from huggingface_hub import create_scheduled_job
>>> create_scheduled_job(image="python:3.12", command=["python", "-c", "print('Hello from HF compute!')"], schedule="@hourly")
```
Use a CRON schedule expression:
```python
>>> from huggingface_hub import create_scheduled_job
>>> create_scheduled_job(image="python:3.12", command=["python", "-c", "print('this runs every 5min')"], schedule="*/5 * * * *")
```
Create a scheduled GPU Job:
```python
>>> from huggingface_hub import create_scheduled_job
>>> image = "pytorch/pytorch:2.6.0-cuda12.4-cudnn9-devel"
>>> command = ["python", "-c", "import torch; print(f'This code ran with the following GPU: {torch.cuda.get_device_name()}')"]
>>> create_scheduled_job(image=image, command=command, flavor="a10g-small", schedule="@hourly")
```
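The schedule parameter accepts either one of six presets or a CRON expression. A cheap local check can catch typos such as “@minutely” before a request is made. This is only a sketch: `is_plausible_schedule` is a hypothetical helper, and it assumes a CRON expression has five whitespace-separated fields, as in the documented examples. It does not validate field ranges, so the Hub remains the final authority.

```python
PRESETS = {"@annually", "@yearly", "@monthly", "@weekly", "@daily", "@hourly"}


def is_plausible_schedule(schedule: str) -> bool:
    """Return True for a documented preset or a five-field CRON expression."""
    text = schedule.strip()
    if text in PRESETS:
        return True
    # Assumption: CRON expressions use five whitespace-separated fields.
    return not text.startswith("@") and len(text.split()) == 5
```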
- create_scheduled_uv_job(script, *, script_args=None, schedule, suspend=None, concurrency=None, dependencies=None, python=None, image=None, env=None, secrets=None, flavor=None, timeout=None, namespace=None, token=None)[source]#
Create a scheduled UV script Job on Hugging Face infrastructure.
- Parameters:
script (str) – Path or URL of the UV script, or a command.
script_args (list[str], optional) – Arguments to pass to the script, or a command.
schedule (str) – One of “@annually”, “@yearly”, “@monthly”, “@weekly”, “@daily”, “@hourly”, or a CRON schedule expression (e.g., ‘0 9 * * 1’ for 9 AM every Monday).
suspend (bool, optional) – If True, the scheduled Job is suspended (paused). Defaults to False.
concurrency (bool, optional) – If True, multiple instances of this Job can run concurrently. Defaults to False.
dependencies (list[str], optional) – Dependencies to use to run the UV script.
python (str, optional) – Use a specific Python version. Default is 3.12.
image (str, optional, defaults to “ghcr.io/astral-sh/uv:python3.12-bookworm”) – Use a custom Docker image with uv installed.
env (dict[str, Any], optional) – Defines the environment variables for the Job.
secrets (dict[str, Any], optional) – Defines the secret environment variables for the Job.
flavor (str, optional) – Flavor for the hardware, as in Hugging Face Spaces. See [SpaceHardware] for possible values. Defaults to “cpu-basic”.
timeout (Union[int, float, str], optional) – Max duration for the Job: int/float with s (seconds, default), m (minutes), h (hours) or d (days). Example: 300 or “5m” for 5 minutes.
namespace (str, optional) – The namespace where the Job will be created. Defaults to the current user’s namespace.
token (bool or str, optional) – A valid user access token. If not provided, the locally saved token will be used, which is the recommended authentication method. Set to False to disable authentication. Refer to: https://huggingface.co/docs/huggingface_hub/quick-start#authentication.
- Return type:
ScheduledJobInfo
Example
Schedule a script from a URL:
```python
>>> from huggingface_hub import create_scheduled_uv_job
>>> script = "https://raw.githubusercontent.com/huggingface/trl/refs/heads/main/trl/scripts/sft.py"
>>> script_args = ["--model_name_or_path", "Qwen/Qwen2-0.5B", "--dataset_name", "trl-lib/Capybara", "--push_to_hub"]
>>> create_scheduled_uv_job(script, script_args=script_args, dependencies=["trl"], flavor="a10g-small", schedule="@weekly")
```
Schedule a local script:
```python
>>> from huggingface_hub import create_scheduled_uv_job
>>> script = "my_sft.py"
>>> script_args = ["--model_name_or_path", "Qwen/Qwen2-0.5B", "--dataset_name", "trl-lib/Capybara", "--push_to_hub"]
>>> create_scheduled_uv_job(script, script_args=script_args, dependencies=["trl"], flavor="a10g-small", schedule="@weekly")
```
Schedule a command:
```python
>>> from huggingface_hub import create_scheduled_uv_job
>>> script = "lighteval"
>>> script_args = ["endpoint", "inference-providers", "model_name=openai/gpt-oss-20b,provider=auto", "lighteval|gsm8k|0|0"]
>>> create_scheduled_uv_job(script, script_args=script_args, dependencies=["lighteval"], flavor="a10g-small", schedule="@weekly")
```
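The timeout parameter takes a number of seconds or a string with an s, m, h or d suffix. To make the equivalence concrete, the sketch below converts both forms to seconds. `timeout_to_seconds` is a hypothetical helper that reflects one reading of the parameter description. It is not the library's own parser, which is not shown here.

```python
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def timeout_to_seconds(timeout) -> float:
    """Interpret 300, "300", "5m", "2h" or "1d" as a duration in seconds."""
    if isinstance(timeout, (int, float)):
        return float(timeout)  # bare numbers are already seconds
    text = timeout.strip()
    if text and text[-1] in _UNIT_SECONDS:
        return float(text[:-1]) * _UNIT_SECONDS[text[-1]]
    return float(text)  # no suffix: seconds, the documented default unit
```

Under this reading, `300` and `"5m"` describe the same limit, which matches the example given for the parameter.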
- create_tag(repo_id, *, tag, tag_message=None, revision=None, token=None, repo_type=None, exist_ok=False)[source]#
Tag a given commit of a repo on the Hub.
- Parameters:
repo_id (str) – The repository in which a commit will be tagged. Example: “user/my-cool-model”.
tag (str) – The name of the tag to create.
tag_message (str, optional) – The description of the tag to create.
revision (str, optional) – The git revision to tag. It can be a branch name or the OID/SHA of a commit, as a hexadecimal string. Shorthands (7 first characters) are also supported. Defaults to the head of the “main” branch.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
repo_type (str, optional) – Set to “dataset” or “space” if tagging a dataset or space, None or “model” if tagging a model. Default is None.
exist_ok (bool, optional, defaults to False) – If True, do not raise an error if tag already exists.
- Raises:
[RepositoryNotFoundError] – If repository is not found (error 404): wrong repo_id/repo_type, private but not authenticated or repo does not exist.
[RevisionNotFoundError] – If revision is not found (error 404) on the repo.
[HfHubHTTPError] – If the tag already exists on the repo (error 409) and exist_ok is set to False.
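Because exist_ok suppresses the 409 error, a release script that calls create_tag can be re-run safely. A minimal sketch follows. `tag_release` and the `v<version>` naming scheme are illustrative choices that the library does not prescribe, and `api` is assumed to be an `HfApi` instance.

```python
def tag_release(api, repo_id: str, version: str, revision=None) -> str:
    """Tag a revision as v<version>; re-running with the same version is a no-op."""
    tag = f"v{version}"
    api.create_tag(
        repo_id,
        tag=tag,
        tag_message=f"Release {version}",
        revision=revision,  # None tags the head of the "main" branch
        exist_ok=True,  # an existing tag does not raise
    )
    return tag
```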
- create_webhook(*, url=None, job_id=None, watched, domains=None, secret=None, token=None)[source]#
Create a new webhook.
The webhook can either send a payload to a URL, or trigger a Job to run on Hugging Face infrastructure. This function should be called with one of url or job_id, but not both.
- Parameters:
url (str) – URL to send the payload to.
job_id (str) – ID of the source Job to trigger with the webhook payload in the environment variable WEBHOOK_PAYLOAD. Additional environment variables are available for convenience: WEBHOOK_REPO_ID, WEBHOOK_REPO_TYPE and WEBHOOK_SECRET.
watched (list[WebhookWatchedItem]) – List of [WebhookWatchedItem] to be watched by the webhook. It can be users, orgs, models, datasets or spaces. Watched items can also be provided as plain dictionaries.
domains (list[Literal[“repo”, “discussion”]], optional) – List of domains to watch. It can be “repo”, “discussion” or both.
secret (str, optional) – A secret to sign the payload with.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
Info about the newly created webhook.
- Return type:
[WebhookInfo]
Example
Create a webhook that sends a payload to a URL:
```python
>>> from huggingface_hub import create_webhook
>>> payload = create_webhook(
...     watched=[{"type": "user", "name": "julien-c"}, {"type": "org", "name": "HuggingFaceH4"}],
...     url="https://webhook.site/a2176e82-5720-43ee-9e06-f91cb4c91548",
...     domains=["repo", "discussion"],
...     secret="my-secret",
... )
>>> print(payload)
WebhookInfo(
    id="654bbbc16f2ec14d77f109cc",
    url="https://webhook.site/a2176e82-5720-43ee-9e06-f91cb4c91548",
    job=None,
    watched=[WebhookWatchedItem(type="user", name="julien-c"), WebhookWatchedItem(type="org", name="HuggingFaceH4")],
    domains=["repo", "discussion"],
    secret="my-secret",
    disabled=False,
)
```
Run a Job and then create a webhook that triggers this Job:
```python
>>> from huggingface_hub import create_webhook, run_job
>>> job = run_job(
...     image="ubuntu",
...     command=["bash", "-c", r"echo An event occurred in $WEBHOOK_REPO_ID: $WEBHOOK_PAYLOAD"],
... )
>>> payload = create_webhook(
...     watched=[{"type": "user", "name": "julien-c"}, {"type": "org", "name": "HuggingFaceH4"}],
...     job_id=job.id,
...     domains=["repo", "discussion"],
...     secret="my-secret",
... )
>>> print(payload)
WebhookInfo(
    id="654bbbc16f2ec14d77f109cc",
    url=None,
    job=JobSpec(
        docker_image='ubuntu',
        space_id=None,
        command=['bash', '-c', 'echo An event occurred in $WEBHOOK_REPO_ID: $WEBHOOK_PAYLOAD'],
        arguments=[],
        environment={},
        secrets=[],
        flavor='cpu-basic',
        timeout=None,
        tags=None,
        arch=None
    ),
    watched=[WebhookWatchedItem(type="user", name="julien-c"), WebhookWatchedItem(type="org", name="HuggingFaceH4")],
    domains=["repo", "discussion"],
    secret="my-secret",
    disabled=False,
)
```
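create_webhook must be called with exactly one of url or job_id. When the target comes from configuration, that rule is easy to violate by accident, so it can help to enforce it before the call. The sketch below is one way to do that. `webhook_target` and `watch_user` are hypothetical helpers, and `api` is assumed to be an `HfApi` instance.

```python
def webhook_target(url=None, job_id=None) -> dict:
    """Return the target keyword argument, requiring exactly one of url/job_id."""
    if (url is None) == (job_id is None):
        raise ValueError("Pass exactly one of url or job_id")
    return {"url": url} if url is not None else {"job_id": job_id}


def watch_user(api, username: str, url=None, job_id=None):
    """Create a repo-domain webhook for one user (hypothetical helper)."""
    return api.create_webhook(
        watched=[{"type": "user", "name": username}],
        domains=["repo"],
        **webhook_target(url=url, job_id=job_id),
    )
```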
- dataset_info(repo_id, *, revision=None, timeout=None, files_metadata=False, expand=None, token=None)[source]#
Get info on one specific dataset on huggingface.co.
Dataset can be private if you pass an acceptable token.
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
revision (str, optional) – The revision of the dataset repository from which to get the information.
timeout (float, optional) – Whether to set a timeout for the request to the Hub.
files_metadata (bool, optional) – Whether or not to retrieve metadata for files in the repository (size, LFS metadata, etc). Defaults to False.
expand (list[ExpandDatasetProperty_T], optional) – List properties to return in the response. When used, only the properties in the list will be returned. This parameter cannot be used if files_metadata is passed. Possible values are “author”, “cardData”, “citation”, “createdAt”, “disabled”, “description”, “downloads”, “downloadsAllTime”, “gated”, “lastModified”, “likes”, “paperswithcode_id”, “private”, “siblings”, “sha”, “tags”, “trendingScore”, “usedStorage”, and “resourceGroup”.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
The dataset repository information.
- Return type:
[hf_api.DatasetInfo]
- Raises:
[RepositoryNotFoundError] – If the repository to download from cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.
[RevisionNotFoundError] – If the revision to download from cannot be found.
- delete_branch(repo_id, *, branch, token=None, repo_type=None)[source]#
Delete a branch from a repo on the Hub.
- Parameters:
repo_id (str) – The repository in which a branch will be deleted. Example: “user/my-cool-model”.
branch (str) – The name of the branch to delete.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
repo_type (str, optional) – Set to “dataset” or “space” if deleting a branch on a dataset or space, None or “model” if deleting a branch on a model. Default is None.
- Raises:
[RepositoryNotFoundError] – If repository is not found (error 404): wrong repo_id/repo_type, private but not authenticated or repo does not exist.
[HfHubHTTPError] – If trying to delete a protected branch. Ex: main cannot be deleted.
[HfHubHTTPError] – If trying to delete a branch that does not exist.
- delete_collection(collection_slug, *, missing_ok=False, token=None)[source]#
Delete a collection on the Hub.
- Parameters:
collection_slug (str) – Slug of the collection to delete. Example: “TheBloke/recent-models-64f9a55bb3115b4f513ec026”.
missing_ok (bool, optional) – If True, do not raise an error if the collection doesn’t exist.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
Example:
```py
>>> from huggingface_hub import delete_collection
>>> collection = delete_collection("username/useless-collection-64f9a55bb3115b4f513ec026", missing_ok=True)
```
> [!WARNING] > This is a non-revertible action. A deleted collection cannot be restored.
- delete_collection_item(collection_slug, item_object_id, *, missing_ok=False, token=None)[source]#
Delete an item from a collection.
- Parameters:
collection_slug (str) – Slug of the collection to update. Example: “TheBloke/recent-models-64f9a55bb3115b4f513ec026”.
item_object_id (str) – ID of the item in the collection. This is not the id of the item on the Hub (repo_id or paper id). It must be retrieved from a [CollectionItem] object. Example: collection.items[0].item_object_id.
missing_ok (bool, optional) – If True, do not raise an error if the item doesn’t exist.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
Example:
```py
>>> from huggingface_hub import get_collection, delete_collection_item

# Get collection first
>>> collection = get_collection("TheBloke/recent-models-64f9a55bb3115b4f513ec026")

# Delete item based on its ID
>>> delete_collection_item(
...     collection_slug="TheBloke/recent-models-64f9a55bb3115b4f513ec026",
...     item_object_id=collection.items[-1].item_object_id,
... )
```
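Since item_object_id differs from the repo or paper id, deleting a specific repo from a collection usually involves a lookup step first. The sketch below assumes each [CollectionItem] exposes an `item_id` attribute holding the Hub id alongside `item_object_id`. `find_item_object_id` is a hypothetical helper.

```python
def find_item_object_id(items, item_id: str) -> str:
    """Return the item_object_id of the collection item whose Hub id is item_id."""
    for item in items:
        # Assumption: item.item_id holds the repo_id or paper id of the entry.
        if item.item_id == item_id:
            return item.item_object_id
    raise LookupError(f"{item_id!r} is not in the collection")
```

With a collection fetched through get_collection, the result can be passed directly as the item_object_id argument of delete_collection_item.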
- delete_file(path_in_repo, repo_id, *, token=None, repo_type=None, revision=None, commit_message=None, commit_description=None, create_pr=None, parent_commit=None)[source]#
Deletes a file in the given repo.
- Parameters:
path_in_repo (str) – Relative filepath in the repo, for example: “checkpoints/1fec34a/weights.bin”
repo_id (str) – The repository from which the file will be deleted, for example: “username/custom_transformers”
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
repo_type (str, optional) – Set to “dataset” or “space” if the file is in a dataset or space, None or “model” if in a model. Default is None.
revision (str, optional) – The git revision to commit from. Defaults to the head of the “main” branch.
commit_message (str, optional) – The summary / title / first line of the generated commit. Defaults to f”Delete {path_in_repo} with huggingface_hub”.
commit_description (str, optional) – The description of the generated commit.
create_pr (bool, optional) – Whether or not to create a Pull Request with that commit. Defaults to False. If revision is not set, PR is opened against the “main” branch. If revision is set and is a branch, PR is opened against this branch. If revision is set and is not a branch name (example: a commit oid), a RevisionNotFoundError is returned by the server.
parent_commit (str, optional) – The OID / SHA of the parent commit, as a hexadecimal string. Shorthands (7 first characters) are also supported. If specified and create_pr is False, the commit will fail if revision does not point to parent_commit. If specified and create_pr is True, the pull request will be created from parent_commit. Specifying parent_commit ensures the repo has not changed before committing the changes, and can be especially useful if the repo is updated / committed to concurrently.
- Return type:
CommitInfo
- Raises:
[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError) – If the HuggingFace API returned an error.
[ValueError](https://docs.python.org/3/library/exceptions.html#ValueError) – If some parameter value is invalid.
[RepositoryNotFoundError] – If the repository to download from cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.
[RevisionNotFoundError] – If the revision to download from cannot be found.
[EntryNotFoundError] – If the file to download cannot be found.
- delete_files(repo_id, delete_patterns, *, token=None, repo_type=None, revision=None, commit_message=None, commit_description=None, create_pr=None, parent_commit=None)[source]#
Delete files from a repository on the Hub.
If a folder path is provided, the entire folder is deleted as well as all files it contained.
- Parameters:
repo_id (str) – The repository from which the files will be deleted, for example: “username/custom_transformers”
delete_patterns (list[str]) – List of files or folders to delete. Each string can either be a file path, a folder path or a Unix shell-style wildcard. E.g. [“file.txt”, “folder/”, “data/*.parquet”]
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
repo_type (str, optional) – Type of the repo to delete files from. Can be “model”, “dataset” or “space”. Defaults to “model”.
revision (str, optional) – The git revision to commit from. Defaults to the head of the “main” branch.
commit_message (str, optional) – The summary (first line) of the generated commit. Defaults to f”Delete files using huggingface_hub”.
commit_description (str, optional) – The description of the generated commit.
create_pr (bool, optional) – Whether or not to create a Pull Request with that commit. Defaults to False. If revision is not set, PR is opened against the “main” branch. If revision is set and is a branch, PR is opened against this branch. If revision is set and is not a branch name (example: a commit oid), a RevisionNotFoundError is returned by the server.
parent_commit (str, optional) – The OID / SHA of the parent commit, as a hexadecimal string. Shorthands (7 first characters) are also supported. If specified and create_pr is False, the commit will fail if revision does not point to parent_commit. If specified and create_pr is True, the pull request will be created from parent_commit. Specifying parent_commit ensures the repo has not changed before committing the changes, and can be especially useful if the repo is updated / committed to concurrently.
- Return type:
CommitInfo
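Since delete_files commits the deletion right away, it can be useful to preview which paths a set of delete_patterns would select. The sketch below approximates the documented rules (file path, folder path, Unix shell-style wildcard) with the standard library. `preview_deletions` is a hypothetical helper, and it assumes folder patterns end with a slash as in the parameter example. The matching performed by the library itself may differ in edge cases.

```python
from fnmatch import fnmatchcase


def preview_deletions(repo_files, delete_patterns):
    """Return the paths from repo_files that at least one pattern selects."""
    selected = []
    for path in repo_files:
        for pattern in delete_patterns:
            if pattern.endswith("/"):
                matched = path.startswith(pattern)  # folder: everything below it
            else:
                matched = fnmatchcase(path, pattern)  # exact path or wildcard
            if matched:
                selected.append(path)
                break
    return selected
```

The list of repo_files could come from a call such as `HfApi().list_repo_files(repo_id)`.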
- delete_folder(path_in_repo, repo_id, *, token=None, repo_type=None, revision=None, commit_message=None, commit_description=None, create_pr=None, parent_commit=None)[source]#
Deletes a folder in the given repo.
Simple wrapper around [create_commit] method.
- Parameters:
path_in_repo (str) – Relative folder path in the repo, for example: “checkpoints/1fec34a”.
repo_id (str) – The repository from which the folder will be deleted, for example: “username/custom_transformers”
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
repo_type (str, optional) – Set to “dataset” or “space” if the folder is in a dataset or space, None or “model” if in a model. Default is None.
revision (str, optional) – The git revision to commit from. Defaults to the head of the “main” branch.
commit_message (str, optional) – The summary / title / first line of the generated commit. Defaults to f”Delete folder {path_in_repo} with huggingface_hub”.
commit_description (str, optional) – The description of the generated commit.
create_pr (bool, optional) – Whether or not to create a Pull Request with that commit. Defaults to False. If revision is not set, PR is opened against the “main” branch. If revision is set and is a branch, PR is opened against this branch. If revision is set and is not a branch name (example: a commit oid), a RevisionNotFoundError is returned by the server.
parent_commit (str, optional) – The OID / SHA of the parent commit, as a hexadecimal string. Shorthands (7 first characters) are also supported. If specified and create_pr is False, the commit will fail if revision does not point to parent_commit. If specified and create_pr is True, the pull request will be created from parent_commit. Specifying parent_commit ensures the repo has not changed before committing the changes, and can be especially useful if the repo is updated / committed to concurrently.
- Return type:
CommitInfo
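The parent_commit parameter of delete_folder acts as an optimistic lock: the deletion only goes through if the branch still points at the commit that was inspected. A minimal sketch of that pattern follows, assuming `api` is an `HfApi` instance whose `repo_info(repo_id)` result exposes the head commit as `.sha`. `delete_folder_if_unchanged` and the `inspect` callback are hypothetical.

```python
def delete_folder_if_unchanged(api, repo_id: str, path_in_repo: str, inspect):
    """Delete a folder only if the repo head is still the commit that was inspected."""
    head = api.repo_info(repo_id).sha  # snapshot of the current head commit
    if not inspect(head):  # caller decides whether this state is safe to modify
        return None
    # If another commit lands after the snapshot, the commit fails
    # instead of silently deleting from a newer state.
    return api.delete_folder(path_in_repo, repo_id, parent_commit=head)
```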
- delete_inference_endpoint(name, *, namespace=None, token=None)[source]#
Delete an Inference Endpoint.
This operation is not reversible. If you don’t want to be charged for an Inference Endpoint, it is preferable to pause it with [pause_inference_endpoint] or scale it to zero with [scale_to_zero_inference_endpoint].
For convenience, you can also delete an Inference Endpoint using [InferenceEndpoint.delete].
- Parameters:
name (str) – The name of the Inference Endpoint to delete.
namespace (str, optional) – The namespace in which the Inference Endpoint is located. Defaults to the current user.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- delete_repo(repo_id, *, token=None, repo_type=None, missing_ok=False)[source]#
Delete a repo from the HuggingFace Hub. CAUTION: this is irreversible.
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
repo_type (str, optional) – Set to “dataset” or “space” if deleting a dataset or space, None or “model” if deleting a model.
missing_ok (bool, optional, defaults to False) – If True, do not raise an error if repo does not exist.
- Raises:
[RepositoryNotFoundError] – If the repository to delete cannot be found and missing_ok is set to False (default).
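Because delete_repo is irreversible, scripts that call it may want a confirmation step similar to the one in the Hub web interface. The sketch below requires the caller to repeat the repo id before anything is deleted. `delete_repo_confirmed` is a hypothetical wrapper, and `api` is assumed to be an `HfApi` instance.

```python
def delete_repo_confirmed(api, repo_id: str, confirmation: str, repo_type=None) -> None:
    """Delete a repo only when the caller repeats its exact id as confirmation."""
    if confirmation != repo_id:
        raise ValueError("Confirmation does not match repo_id; nothing was deleted")
    # missing_ok=True makes cleanup idempotent: a second run does not raise.
    api.delete_repo(repo_id, repo_type=repo_type, missing_ok=True)
```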
- delete_scheduled_job(*, scheduled_job_id, namespace=None, token=None)[source]#
Delete a scheduled compute Job on Hugging Face infrastructure.
- Parameters:
scheduled_job_id (str) – ID of the scheduled Job.
namespace (str, optional) – The namespace where the scheduled Job is. Defaults to the current user’s namespace.
token (bool or str, optional) – A valid user access token. If not provided, the locally saved token will be used, which is the recommended authentication method. Set to False to disable authentication. Refer to: https://huggingface.co/docs/huggingface_hub/quick-start#authentication.
- delete_space_secret(repo_id, key, *, token=None)[source]#
Deletes a secret from a Space.
Secrets let you provide secret keys or tokens to a Space without hardcoding them. For more details, see https://huggingface.co/docs/hub/spaces-overview#managing-secrets.
- Parameters:
repo_id (str) – ID of the repo to update. Example: “bigcode/in-the-stack”.
key (str) – Secret key. Example: “GITHUB_API_KEY”.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- delete_space_storage(repo_id, *, token=None)[source]#
Delete persistent storage for a Space.
- Parameters:
repo_id (str) – ID of the Space to update. Example: “open-llm-leaderboard/open_llm_leaderboard”.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
Runtime information about a Space including Space stage and hardware.
- Return type:
[SpaceRuntime]
- Raises:
[BadRequestError] – If space has no persistent storage.
- delete_space_variable(repo_id, key, *, token=None)[source]#
Deletes a variable from a Space.
Variables let you set environment variables on a Space without hardcoding them. For more details, see https://huggingface.co/docs/hub/spaces-overview#managing-secrets-and-environment-variables
- Parameters:
repo_id (str) – ID of the repo to update. Example: “bigcode/in-the-stack”.
key (str) – Variable key. Example: “MODEL_REPO_ID”
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Return type:
dict[str, SpaceVariable]
- delete_tag(repo_id, *, tag, token=None, repo_type=None)[source]#
Delete a tag from a repo on the Hub.
- Parameters:
repo_id (str) – The repository in which a tag will be deleted. Example: “user/my-cool-model”.
tag (str) – The name of the tag to delete.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
repo_type (str, optional) – Set to “dataset” or “space” if tagging a dataset or space, None or “model” if tagging a model. Default is None.
- Raises:
[RepositoryNotFoundError] – If repository is not found (error 404): wrong repo_id/repo_type, private but not authenticated or repo does not exist.
[RevisionNotFoundError] – If tag is not found.
- delete_webhook(webhook_id, *, token=None)[source]#
Delete a webhook.
- Parameters:
webhook_id (str) – The unique identifier of the webhook to delete.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
None
- Return type:
None
Example
```python
>>> from huggingface_hub import delete_webhook
>>> delete_webhook("654bbbc16f2ec14d77f109cc")
```
- disable_webhook(webhook_id, *, token=None)[source]#
Disable a webhook (makes it “disabled”).
- Parameters:
webhook_id (str) – The unique identifier of the webhook to disable.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
Info about the disabled webhook.
- Return type:
[WebhookInfo]
Example
```python
>>> from huggingface_hub import disable_webhook
>>> disabled_webhook = disable_webhook("654bbbc16f2ec14d77f109cc")
>>> disabled_webhook
WebhookInfo(
    id="654bbbc16f2ec14d77f109cc",
    url="https://webhook.site/a2176e82-5720-43ee-9e06-f91cb4c91548",
    job=None,
    watched=[WebhookWatchedItem(type="user", name="julien-c"), WebhookWatchedItem(type="org", name="HuggingFaceH4")],
    domains=["repo", "discussion"],
    secret="my-secret",
    disabled=True,
)
```
- duplicate_space(from_id, to_id=None, *, private=None, token=None, exist_ok=False, hardware=None, storage=None, sleep_time=None, secrets=None, variables=None)[source]#
Duplicate a Space.
Programmatically duplicate a Space. The new Space will be created in your account and will be in the same state as the original Space (running or paused). You can duplicate a Space no matter the current state of a Space.
- Parameters:
from_id (str) – ID of the Space to duplicate. Example: “pharma/CLIP-Interrogator”.
to_id (str, optional) – ID of the new Space. Example: “dog/CLIP-Interrogator”. If not provided, the new Space will have the same name as the original Space, but in your account.
private (bool, optional) – Whether the new Space should be private or not. Defaults to the same privacy as the original Space.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
exist_ok (bool, optional, defaults to False) – If True, do not raise an error if repo already exists.
hardware (SpaceHardware or str, optional) – Choice of Hardware. Example: “t4-medium”. See [SpaceHardware] for a complete list.
storage (SpaceStorage or str, optional) – Choice of persistent storage tier. Example: “small”. See [SpaceStorage] for a complete list.
sleep_time (int, optional) – Number of seconds of inactivity to wait before a Space is put to sleep. Set to -1 if you don’t want your Space to sleep (default behavior for upgraded hardware). For free hardware, you can’t configure the sleep time (value is fixed to 48 hours of inactivity). See https://huggingface.co/docs/hub/spaces-gpus#sleep-time for more details.
secrets (list[dict[str, str]], optional) – A list of secret keys to set in your Space. Each item is in the form {“key”: …, “value”: …, “description”: …} where description is optional. For more details, see https://huggingface.co/docs/hub/spaces-overview#managing-secrets.
variables (list[dict[str, str]], optional) – A list of public environment variables to set in your Space. Each item is in the form {“key”: …, “value”: …, “description”: …} where description is optional. For more details, see https://huggingface.co/docs/hub/spaces-overview#managing-secrets-and-environment-variables.
- Returns:
URL to the newly created repo. Value is a subclass of str containing attributes like endpoint, repo_type and repo_id.
- Return type:
[RepoUrl]
- Raises:
[RepositoryNotFoundError] – If one of from_id or to_id cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.
[HfHubHTTPError] – If the HuggingFace API returned an error
Example:
```python
>>> from huggingface_hub import duplicate_space

# Duplicate a Space to your account
>>> duplicate_space("multimodalart/dreambooth-training")
RepoUrl('https://huggingface.co/spaces/nateraw/dreambooth-training',...)

# Can set custom destination id and visibility flag.
>>> duplicate_space("multimodalart/dreambooth-training", to_id="my-dreambooth", private=True)
RepoUrl('https://huggingface.co/spaces/nateraw/my-dreambooth',...)
```
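The `secrets` and `variables` arguments each expect a list of dicts with a `key`, a `value`, and an optional `description`. A minimal sketch of building such a list from a plain mapping follows; the helper name `to_space_entries` and the sample values are hypothetical and not part of huggingface_hub.

```python
from typing import Optional


def to_space_entries(
    values: dict[str, str],
    descriptions: Optional[dict[str, str]] = None,
) -> list[dict[str, str]]:
    """Build a list of {"key", "value", "description"} dicts (hypothetical helper)."""
    descriptions = descriptions or {}
    entries = []
    for key, value in values.items():
        entry = {"key": key, "value": value}
        # "description" is optional, so only include it when one was given.
        if key in descriptions:
            entry["description"] = descriptions[key]
        entries.append(entry)
    return entries


# Illustrative values only.
variables = to_space_entries(
    {"MODEL_REPO_ID": "user/my-cool-model"},
    descriptions={"MODEL_REPO_ID": "Model served by the Space"},
)
```

The resulting list could then be passed as `duplicate_space(..., variables=variables)`.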
- edit_discussion_comment(repo_id, discussion_num, comment_id, new_content, *, token=None, repo_type=None)[source]#
Edits a comment on a Discussion / Pull Request.
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
discussion_num (int) – The number of the Discussion or Pull Request. Must be a strictly positive integer.
comment_id (str) – The ID of the comment to edit.
new_content (str) – The new content of the comment. Comments support markdown formatting.
repo_type (str, optional) – Set to “dataset” or “space” if uploading to a dataset or space, None or “model” if uploading to a model. Default is None.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
the edited comment
- Return type:
[DiscussionComment]
> [!TIP]
> Raises the following errors:
>
> - [HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError) if the HuggingFace API returned an error
> - [ValueError](https://docs.python.org/3/library/exceptions.html#ValueError) if some parameter value is invalid
> - [~utils.RepositoryNotFoundError] if the repository to download from cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.
- enable_webhook(webhook_id, *, token=None)[source]#
Enable a webhook (makes it “active”).
- Parameters:
webhook_id (str) – The unique identifier of the webhook to enable.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
Info about the enabled webhook.
- Return type:
[WebhookInfo]
Example
```python
>>> from huggingface_hub import enable_webhook
>>> enabled_webhook = enable_webhook("654bbbc16f2ec14d77f109cc")
>>> enabled_webhook
WebhookInfo(
    id="654bbbc16f2ec14d77f109cc",
    job=None,
    url="https://webhook.site/a2176e82-5720-43ee-9e06-f91cb4c91548",
    watched=[WebhookWatchedItem(type="user", name="julien-c"), WebhookWatchedItem(type="org", name="HuggingFaceH4")],
    domains=["repo", "discussion"],
    secret="my-secret",
    disabled=False,
)
```
- fetch_job_logs(*, job_id, namespace=None, token=None)[source]#
Fetch all the logs from a compute Job on Hugging Face infrastructure.
- Parameters:
job_id (str) – ID of the Job.
namespace (str, optional) – The namespace where the Job is running. Defaults to the current user’s namespace.
token (bool or str, optional) – A valid user access token. If not provided, the locally saved token will be used, which is the recommended authentication method. Set to False to disable authentication. Refer to: https://huggingface.co/docs/huggingface_hub/quick-start#authentication.
- Return type:
Iterable[str]
Example
```python
>>> from huggingface_hub import fetch_job_logs, run_job
>>> job = run_job(image="python:3.12", command=["python", "-c" ,"print('Hello from HF compute!')"])
>>> for log in fetch_job_logs(job_id=job.id):
...     print(log)
Hello from HF compute!
```
- fetch_job_metrics(*, job_id, namespace=None, token=None)[source]#
Fetch all the live metrics from a compute Job on Hugging Face infrastructure.
- Parameters:
job_id (str) – ID of the Job.
namespace (str, optional) – The namespace where the Job is running. Defaults to the current user’s namespace.
token (bool or str, optional) – A valid user access token. If not provided, the locally saved token will be used, which is the recommended authentication method. Set to False to disable authentication. Refer to: https://huggingface.co/docs/huggingface_hub/quick-start#authentication.
- Return type:
Iterable[dict[str, Any]]
Example
```python
>>> from huggingface_hub import fetch_job_metrics, run_job
>>> job = run_job(image="python:3.12", command=["python", "-c" ,"print('Hello from HF compute!')"], flavor="a10g-small")
>>> for metrics in fetch_job_metrics(job_id=job.id):
...     print(metrics)
{
    "cpu_usage_pct": 0,
    "cpu_millicores": 3500,
    "memory_used_bytes": 1306624,
    "memory_total_bytes": 15032385536,
    "rx_bps": 0,
    "tx_bps": 0,
    "gpus": {
        "882fa930": {
            "utilization": 0,
            "memory_used_bytes": 0,
            "memory_total_bytes": 22836000000
        }
    },
    "replica": "57vr7"
}
```
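Each item yielded by `fetch_job_metrics` is a dict, so derived values can be computed per sample. The sketch below assumes the `memory_used_bytes` and `memory_total_bytes` keys shown in the example output are present; `memory_used_fraction` is a hypothetical helper, not part of huggingface_hub.

```python
def memory_used_fraction(metrics: dict) -> float:
    """Return used/total memory for one metrics sample (hypothetical helper)."""
    total = metrics["memory_total_bytes"]
    if total == 0:
        # Avoid dividing by zero if a sample reports no memory at all.
        return 0.0
    return metrics["memory_used_bytes"] / total


# Values copied from the example output above.
sample = {"memory_used_bytes": 1306624, "memory_total_bytes": 15032385536}
fraction = memory_used_fraction(sample)
```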
- file_exists(repo_id, filename, *, repo_type=None, revision=None, token=None)[source]#
Checks if a file exists in a repository on the Hugging Face Hub.
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
filename (str) – The name of the file to check, for example: “config.json”
repo_type (str, optional) – Set to “dataset” or “space” if getting repository info from a dataset or a space, None or “model” if getting repository info from a model. Default is None.
revision (str, optional) – The revision of the repository from which to get the information. Defaults to “main” branch.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
True if the file exists, False otherwise.
- Return type:
bool
Examples
```py
>>> from huggingface_hub import file_exists
>>> file_exists("bigcode/starcoder", "config.json")
True
>>> file_exists("bigcode/starcoder", "not-a-file")
False
>>> file_exists("bigcode/not-a-repo", "config.json")
False
```
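One possible use of `file_exists` is as a guard before calling `hf_hub_download`. The sketch below is an illustration under assumptions: `download_if_present` is a hypothetical wrapper, and the `exists` and `download` parameters exist only so it can be exercised with stand-ins instead of network calls.

```python
from typing import Callable, Optional


def download_if_present(
    repo_id: str,
    filename: str,
    exists: Optional[Callable[[str, str], bool]] = None,
    download: Optional[Callable[[str, str], str]] = None,
) -> Optional[str]:
    """Return a local path if the file exists on the Hub, else None."""
    if exists is None or download is None:
        # Imported lazily so the sketch can be tried with stand-in callables.
        from huggingface_hub import file_exists, hf_hub_download

        exists = exists or file_exists
        download = download or hf_hub_download
    if not exists(repo_id, filename):
        return None
    return download(repo_id, filename)
```

Calling `download_if_present("bigcode/starcoder", "config.json")` with no stand-ins would use the real Hub functions and therefore needs network access.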
- get_collection(collection_slug, *, token=None)[source]#
Gets information about a Collection on the Hub.
- Parameters:
collection_slug (str) – Slug of the collection of the Hub. Example: “TheBloke/recent-models-64f9a55bb3115b4f513ec026”.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Return type:
Collection
Returns: [Collection]
Example:
```py
>>> from huggingface_hub import get_collection
>>> collection = get_collection("TheBloke/recent-models-64f9a55bb3115b4f513ec026")
>>> collection.title
'Recent models'
>>> len(collection.items)
37
>>> collection.items[0]
CollectionItem(
    item_object_id='651446103cd773a050bf64c2',
    item_id='TheBloke/U-Amethyst-20B-AWQ',
    item_type='model',
    position=88,
    note=None
)
```
- get_dataset_tags()[source]#
List all valid dataset tags as a nested namespace object.
- Return type:
dict
- get_discussion_details(repo_id, discussion_num, *, repo_type=None, token=None)[source]#
Fetches the details of a Discussion or Pull Request from the Hub.
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
discussion_num (int) – The number of the Discussion or Pull Request. Must be a strictly positive integer.
repo_type (str, optional) – Set to “dataset” or “space” if uploading to a dataset or space, None or “model” if uploading to a model. Default is None.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Return type:
DiscussionWithDetails
Returns: [DiscussionWithDetails]
> [!TIP]
> Raises the following errors:
>
> - [HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError) if the HuggingFace API returned an error
> - [ValueError](https://docs.python.org/3/library/exceptions.html#ValueError) if some parameter value is invalid
> - [~utils.RepositoryNotFoundError] if the repository to download from cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.
- get_full_repo_name(model_id, *, organization=None, token=None)[source]#
Returns the repository name for a given model ID and optional organization.
- Parameters:
model_id (str) – The name of the model.
organization (str, optional) – If passed, the repository name will be in the organization namespace instead of the user namespace.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
The repository name in the user’s namespace ({username}/{model_id}) if no organization is passed, and under the organization namespace ({organization}/{model_id}) otherwise.
- Return type:
str
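The naming rule in the Returns description can be sketched without any network access. This is only an illustration of the documented rule: `compose_repo_name` is a hypothetical helper that takes the username as an explicit argument, whereas `get_full_repo_name` resolves it from the token.

```python
from typing import Optional


def compose_repo_name(
    model_id: str,
    username: str,
    organization: Optional[str] = None,
) -> str:
    """Mirror the documented rule: the organization namespace wins if given."""
    namespace = organization if organization is not None else username
    return f"{namespace}/{model_id}"


# Illustrative names only.
user_repo = compose_repo_name("my-cool-model", username="user")
org_repo = compose_repo_name("my-cool-model", username="user", organization="my-org")
```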
- get_hf_file_metadata(*, url, token=None, timeout=10)[source]#
Fetch metadata of a file versioned on the Hub for a given url.
- Parameters:
url (str) – File url, for example returned by [hf_hub_url].
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
timeout (float, optional, defaults to 10) – How many seconds to wait for the server to send metadata before giving up.
- Returns:
A [HfFileMetadata] object containing metadata such as location, etag, size and commit_hash.
- Return type:
HfFileMetadata
- get_inference_endpoint(name, *, namespace=None, token=None)[source]#
Get information about an Inference Endpoint.
- Parameters:
name (str) – The name of the Inference Endpoint to retrieve information about.
namespace (str, optional) – The namespace in which the Inference Endpoint is located. Defaults to the current user.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
information about the requested Inference Endpoint.
- Return type:
[InferenceEndpoint]
Example:
```python
>>> from huggingface_hub import HfApi
>>> api = HfApi()
>>> endpoint = api.get_inference_endpoint("my-text-to-image")
>>> endpoint
InferenceEndpoint(name='my-text-to-image', ...)

# Get status
>>> endpoint.status
'running'
>>> endpoint.url
'https://my-text-to-image.region.vendor.endpoints.huggingface.cloud'
```
- get_organization_overview(organization, token=None)[source]#
Get an overview of an organization on the Hub.
- Parameters:
organization (str) – Name of the organization to get an overview of.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
An [Organization] object with the organization’s overview.
- Return type:
Organization
- Raises:
[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError) – HTTP 404 if the organization does not exist on the Hub.
- get_paths_info(repo_id, paths, *, expand=False, revision=None, repo_type=None, token=None)[source]#
Get information about a repo’s paths.
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
paths (Union[list[str], str], optional) – The paths to get information about. If a path does not exist, it is ignored without raising an exception.
expand (bool, optional, defaults to False) – Whether to fetch more information about the paths (e.g. last commit and files’ security scan results). This operation is more expensive for the server so only 50 results are returned per page (instead of 1000). As pagination is implemented in huggingface_hub, this is transparent for you except for the time it takes to get the results.
revision (str, optional) – The revision of the repository from which to get the information. Defaults to “main” branch.
repo_type (str, optional) – The type of the repository from which to get the information (“model”, “dataset” or “space”). Defaults to “model”.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
The information about the paths, as a list of [RepoFile] and [RepoFolder] objects.
- Return type:
list[Union[RepoFile, RepoFolder]]
- Raises:
[RepositoryNotFoundError] – If repository is not found (error 404): wrong repo_id/repo_type, private but not authenticated or repo does not exist.
[RevisionNotFoundError] – If revision is not found (error 404) on the repo.
Example:
```py
>>> from huggingface_hub import get_paths_info
>>> paths_info = get_paths_info("allenai/c4", ["README.md", "en"], repo_type="dataset")
>>> paths_info
[
    RepoFile(path='README.md', size=2379, blob_id='f84cb4c97182890fc1dbdeaf1a6a468fd27b4fff', lfs=None, last_commit=None, security=None),
    RepoFolder(path='en', tree_id='dc943c4c40f53d02b31ced1defa7e5f438d5862e', last_commit=None)
]
```
- get_repo_discussions(repo_id, *, author=None, discussion_type=None, discussion_status=None, repo_type=None, token=None)[source]#
Fetches Discussions and Pull Requests for the given repo.
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
author (str, optional) – Pass a value to filter by discussion author. None means no filter. Default is None.
discussion_type (str, optional) – Set to “pull_request” to fetch only pull requests, “discussion” to fetch only discussions. Set to “all” or None to fetch both. Default is None.
discussion_status (str, optional) – Set to “open” (respectively “closed”) to fetch only open (respectively closed) discussions. Set to “all” or None to fetch both. Default is None.
repo_type (str, optional) – Set to “dataset” or “space” if fetching from a dataset or space, None or “model” if fetching from a model. Default is None.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
An iterator of [Discussion] objects.
- Return type:
Iterator[Discussion]
Example
Collecting all discussions of a repo in a list:
```python
>>> from huggingface_hub import get_repo_discussions
>>> discussions_list = list(get_repo_discussions(repo_id="bert-base-uncased"))
```
Iterating over discussions of a repo:
```python
>>> from huggingface_hub import get_repo_discussions
>>> for discussion in get_repo_discussions(repo_id="bert-base-uncased"):
...     print(discussion.num, discussion.title)
```
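The `discussion_type` and `discussion_status` filters described in the parameters can be combined, for instance to list only open pull requests. The following is a sketch under assumptions: `open_pull_requests` is a hypothetical wrapper, and its `fetch` parameter exists only so the logic can be tried with a stand-in instead of a network call.

```python
from typing import Callable, Iterable, Iterator, Optional


def open_pull_requests(
    repo_id: str,
    fetch: Optional[Callable[..., Iterable]] = None,
) -> Iterator[tuple[int, str]]:
    """Yield (num, title) for each open pull request of a repo."""
    if fetch is None:
        # Imported lazily so a stand-in can replace the real Hub call.
        from huggingface_hub import get_repo_discussions as fetch
    for discussion in fetch(
        repo_id=repo_id,
        discussion_type="pull_request",
        discussion_status="open",
    ):
        yield discussion.num, discussion.title
```

Because the wrapper is a generator, nothing is fetched until it is iterated, for example with `list(open_pull_requests("bert-base-uncased"))`.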
- get_safetensors_metadata(repo_id, *, repo_type=None, revision=None, token=None)[source]#
Parse metadata for a safetensors repo on the Hub.
We first check if the repo has a single safetensors file or a sharded safetensors repo. If it’s a single safetensors file, we parse the metadata from this file. If it’s a sharded safetensors repo, we parse the metadata from the index file and then parse the metadata from each shard.
To parse metadata from a single safetensors file, use [parse_safetensors_file_metadata].
For more details regarding the safetensors format, check out https://huggingface.co/docs/safetensors/index#format.
- Parameters:
repo_id (str) – A user or an organization name and a repo name separated by a /.
repo_type (str, optional) – Set to “dataset” or “space” if the file is in a dataset or space, None or “model” if in a model. Default is None.
revision (str, optional) – The git revision to fetch the file from. Can be a branch name, a tag, or a commit hash. Defaults to the head of the “main” branch.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
information related to safetensors repo.
- Return type:
[SafetensorsRepoMetadata]
- Raises:
[NotASafetensorsRepoError] – If the repo is not a safetensors repo i.e. doesn’t have either a model.safetensors or a model.safetensors.index.json file.
[SafetensorsParsingError] – If a safetensors file header couldn’t be parsed correctly.
Example
```py
# Parse repo with single weights file
>>> metadata = get_safetensors_metadata("bigscience/bloomz-560m")
>>> metadata
SafetensorsRepoMetadata(
    metadata=None,
    sharded=False,
    weight_map={'h.0.input_layernorm.bias': 'model.safetensors', ...},
    files_metadata={'model.safetensors': SafetensorsFileMetadata(...)}
)
>>> metadata.files_metadata["model.safetensors"].metadata
{'format': 'pt'}

# Parse repo with sharded model
>>> metadata = get_safetensors_metadata("bigscience/bloom")
Parse safetensors files: 100%|██████████████████████████████████████████| 72/72 [00:12<00:00, 5.78it/s]
>>> metadata
SafetensorsRepoMetadata(metadata={'total_size': 352494542848}, sharded=True, weight_map={...}, files_metadata={...})
>>> len(metadata.files_metadata)
72  # All safetensors files have been fetched

# Parse repo that is not a safetensors repo
>>> get_safetensors_metadata("runwayml/stable-diffusion-v1-5")
NotASafetensorsRepoError: 'runwayml/stable-diffusion-v1-5' is not a safetensors repo. Couldn't find 'model.safetensors.index.json' or 'model.safetensors' files.
```
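The `weight_map` shown in the example maps tensor names to the file that stores them, so it can be summarized per file. The sketch below assumes that shape; `tensors_per_file` is a hypothetical helper and the sample map is made up for illustration.

```python
from collections import Counter


def tensors_per_file(weight_map: dict[str, str]) -> Counter:
    """Count how many tensors each safetensors file holds."""
    return Counter(weight_map.values())


# Hypothetical weight map; a real one would come from the returned metadata.
example_map = {
    "h.0.input_layernorm.bias": "model-00001-of-00002.safetensors",
    "h.0.input_layernorm.weight": "model-00001-of-00002.safetensors",
    "h.1.input_layernorm.bias": "model-00002-of-00002.safetensors",
}
counts = tensors_per_file(example_map)
```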
- get_space_runtime(repo_id, *, token=None)[source]#
Gets runtime information about a Space.
- Parameters:
repo_id (str) – ID of the Space to query. Example: “bigcode/in-the-stack”.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
Runtime information about a Space including Space stage and hardware.
- Return type:
[SpaceRuntime]
- get_space_variables(repo_id, *, token=None)[source]#
Gets all variables from a Space.
Variables let you set environment variables on a Space without hardcoding them. For more details, see https://huggingface.co/docs/hub/spaces-overview#managing-secrets-and-environment-variables
- Parameters:
repo_id (str) – ID of the repo to query. Example: “bigcode/in-the-stack”.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Return type:
dict[str, SpaceVariable]
- get_user_overview(username, token=None)[source]#
Get an overview of a user on the Hub.
- Parameters:
username (str) – Username of the user to get an overview of.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
A [User] object with the user’s overview.
- Return type:
User
- Raises:
[HfHubHTTPError] – HTTP 404 If the user does not exist on the Hub.
- get_webhook(webhook_id, *, token=None)[source]#
Get a webhook by its id.
- Parameters:
webhook_id (str) – The unique identifier of the webhook to get.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
Info about the webhook.
- Return type:
[WebhookInfo]
Example
```python
>>> from huggingface_hub import get_webhook
>>> webhook = get_webhook("654bbbc16f2ec14d77f109cc")
>>> print(webhook)
WebhookInfo(
    id="654bbbc16f2ec14d77f109cc",
    job=None,
    watched=[WebhookWatchedItem(type="user", name="julien-c"), WebhookWatchedItem(type="org", name="HuggingFaceH4")],
    url="https://webhook.site/a2176e82-5720-43ee-9e06-f91cb4c91548",
    secret="my-secret",
    domains=["repo", "discussion"],
    disabled=False,
)
```
- grant_access(repo_id, user, *, repo_type=None, token=None)[source]#
Grant access to a user for a given gated repo.
Granting access does not require the user to send an access request themselves. The user is automatically added to the accepted list, meaning they can download the files. You can revoke the granted access at any time using [cancel_access_request] or [reject_access_request].
For more info about gated repos, see https://huggingface.co/docs/hub/models-gated.
- Parameters:
repo_id (str) – The id of the repo to grant access to.
user (str) – The username of the user to grant access.
repo_type (str, optional) – The type of the repo to grant access to. Must be one of model, dataset or space. Defaults to model.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Raises:
[HfHubHTTPError] – HTTP 400 if the repo is not gated.
[HfHubHTTPError] – HTTP 400 if the user already has access to the repo.
[HfHubHTTPError] – HTTP 403 if you only have read-only access to the repo. This can be the case if you don’t have write or admin role in the organization the repo belongs to or if you passed a read token.
[HfHubHTTPError] – HTTP 404 if the user does not exist on the Hub.
- hf_hub_download(repo_id, filename, *, subfolder=None, repo_type=None, revision=None, cache_dir=None, local_dir=None, force_download=False, etag_timeout=10, token=None, local_files_only=False, tqdm_class=None, dry_run=False)[source]#
Download a given file if it’s not already present in the local cache.
The new cache file layout looks like this:
- The cache directory contains one subfolder per repo_id (namespaced by repo type)
- Inside each repo folder:
  - refs is a list of the latest known revision => commit_hash pairs
  - blobs contains the actual file blobs (identified by their git-sha or sha256, depending on whether they’re LFS files or not)
  - snapshots contains one subfolder per commit, each “commit” contains the subset of the files that have been resolved at that particular commit. Each filename is a symlink to the blob at that particular commit.

```
[  96]  .
└── [ 160]  models--julien-c--EsperBERTo-small
    ├── [ 160]  blobs
    │   ├── [321M]  403450e234d65943a7dcf7e05a771ce3c92faa84dd07db4ac20f592037a1e4bd
    │   ├── [ 398]  7cb18dc9bafbfcf74629a4b760af1b160957a83e
    │   └── [1.4K]  d7edf6bd2a681fb0175f7735299831ee1b22b812
    ├── [  96]  refs
    │   └── [  40]  main
    └── [ 128]  snapshots
        ├── [ 128]  2439f60ef33a0d46d85da5001d52aeda5b00ce9f
        │   ├── [  52]  README.md -> ../../blobs/d7edf6bd2a681fb0175f7735299831ee1b22b812
        │   └── [  76]  pytorch_model.bin -> ../../blobs/403450e234d65943a7dcf7e05a771ce3c92faa84dd07db4ac20f592037a1e4bd
        └── [ 128]  bbc77c8132af1cc5cf678da3f1ddf2de43606d48
            ├── [  52]  README.md -> ../../blobs/7cb18dc9bafbfcf74629a4b760af1b160957a83e
            └── [  76]  pytorch_model.bin -> ../../blobs/403450e234d65943a7dcf7e05a771ce3c92faa84dd07db4ac20f592037a1e4bd
```
If local_dir is provided, the file structure from the repo will be replicated in this location. When using this option, the cache_dir will not be used and a .cache/huggingface/ folder will be created at the root of local_dir to store some metadata related to the downloaded files. While this mechanism is not as robust as the main cache-system, it’s optimized for regularly pulling the latest version of a repository.
- Parameters:
repo_id (str) – A user or an organization name and a repo name separated by a /.
filename (str) – The name of the file in the repo.
subfolder (str, optional) – An optional value corresponding to a folder inside the repository.
repo_type (str, optional) – Set to “dataset” or “space” if downloading from a dataset or space, None or “model” if downloading from a model. Default is None.
revision (str, optional) – An optional Git revision id which can be a branch name, a tag, or a commit hash.
cache_dir (str, Path, optional) – Path to the folder where cached files are stored.
local_dir (str or Path, optional) – If provided, the downloaded file will be placed under this directory.
force_download (bool, optional, defaults to False) – Whether the file should be downloaded even if it already exists in the local cache.
etag_timeout (float, optional, defaults to 10) – When fetching ETag, how many seconds to wait for the server to send data before giving up which is passed to httpx.request.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
local_files_only (bool, optional, defaults to False) – If True, avoid downloading the file and return the path to the local cached file if it exists.
tqdm_class (tqdm, optional) – If provided, overwrites the default behavior for the progress bar. Passed argument must inherit from tqdm.auto.tqdm or at least mimic its behavior. Defaults to the custom HF progress bar that can be disabled by setting HF_HUB_DISABLE_PROGRESS_BARS environment variable.
dry_run (bool, optional, defaults to False) – If True, perform a dry run without actually downloading the file. Returns a [DryRunFileInfo] object containing information about what would be downloaded.
- Returns:
If dry_run=False: Local path of file or if networking is off, last version of file cached on disk.
If dry_run=True: A [DryRunFileInfo] object containing download information.
- Return type:
str or [DryRunFileInfo]
- Raises:
[RepositoryNotFoundError] – If the repository to download from cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.
[RevisionNotFoundError] – If the revision to download from cannot be found.
[RemoteEntryNotFoundError] – If the file to download cannot be found.
[LocalEntryNotFoundError] – If network is disabled or unavailable and file is not found in cache.
[EnvironmentError](https://docs.python.org/3/library/exceptions.html#EnvironmentError) – If token=True but the token cannot be found.
[OSError](https://docs.python.org/3/library/exceptions.html#OSError) – If ETag cannot be determined.
[ValueError](https://docs.python.org/3/library/exceptions.html#ValueError) – If some parameter value is invalid.
- hide_discussion_comment(repo_id, discussion_num, comment_id, *, token=None, repo_type=None)[source]#
Hides a comment on a Discussion / Pull Request.
> [!WARNING]
> Hidden comments’ content cannot be retrieved anymore. Hiding a comment is irreversible.
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
discussion_num (int) – The number of the Discussion or Pull Request. Must be a strictly positive integer.
comment_id (str) – The ID of the comment to hide.
repo_type (str, optional) – Set to “dataset” or “space” if the Discussion belongs to a dataset or Space, None or “model” if it belongs to a model. Default is None.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
the hidden comment
- Return type:
[DiscussionComment]
- Raises:
[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError) – If the HuggingFace API returned an error.
[ValueError](https://docs.python.org/3/library/exceptions.html#ValueError) – If some parameter value is invalid.
[RepositoryNotFoundError] – If the repository to download from cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.
- inspect_job(*, job_id, namespace=None, token=None)[source]#
Inspect a compute Job on Hugging Face infrastructure.
- Parameters:
job_id (str) – ID of the Job.
namespace (str, optional) – The namespace where the Job is running. Defaults to the current user’s namespace.
token (bool or str, optional) – A valid user access token. If not provided, the locally saved token will be used, which is the recommended authentication method. Set to False to disable authentication. Refer to: https://huggingface.co/docs/huggingface_hub/quick-start#authentication.
- Return type:
JobInfo
Example
```python
>>> from huggingface_hub import inspect_job, run_job
>>> job = run_job(image="python:3.12", command=["python", "-c", "print('Hello from HF compute!')"])
>>> inspect_job(job_id=job.id)
JobInfo(
    id='68780d00bbe36d38803f645f',
    created_at=datetime.datetime(2025, 7, 16, 20, 35, 12, 808000, tzinfo=datetime.timezone.utc),
    docker_image='python:3.12',
    space_id=None,
    command=['python', '-c', "print('Hello from HF compute!')"],
    arguments=[],
    environment={},
    secrets={},
    flavor='cpu-basic',
    status=JobStatus(stage='RUNNING', message=None)
)
```
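Because inspect_job returns a snapshot of the Job's status, waiting for completion means polling. The following is a minimal sketch, not part of the library: `wait_for_job` and `TERMINAL_STAGES` are hypothetical names, and the set of terminal stage names is an assumption that should be checked against the installed huggingface_hub version.

```python
import time

# Assumption: these stage names mark a finished Job; verify for your version.
TERMINAL_STAGES = frozenset({"COMPLETED", "ERROR", "CANCELED", "DELETED"})


def wait_for_job(inspect, job_id, *, interval=5.0, max_polls=60, sleep=time.sleep):
    """Poll `inspect(job_id=...)` until the Job reaches a terminal stage.

    `inspect` is any callable with the keyword signature of `inspect_job`.
    Returns the last JobInfo-like object, or raises TimeoutError.
    """
    for _ in range(max_polls):
        info = inspect(job_id=job_id)
        # Works whether `stage` is a plain string or an enum with a `.value`.
        stage = str(getattr(info.status.stage, "value", info.status.stage))
        if stage in TERMINAL_STAGES:
            return info
        sleep(interval)
    raise TimeoutError(f"Job {job_id} still running after {max_polls} polls")
```

With the real client this would be called as `wait_for_job(inspect_job, job.id)`.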
- inspect_scheduled_job(*, scheduled_job_id, namespace=None, token=None)[source]#
Inspect a scheduled compute Job on Hugging Face infrastructure.
- Parameters:
scheduled_job_id (str) – ID of the scheduled Job.
namespace (str, optional) – The namespace where the scheduled Job is. Defaults to the current user’s namespace.
token (bool or str, optional) – A valid user access token. If not provided, the locally saved token will be used, which is the recommended authentication method. Set to False to disable authentication. Refer to: https://huggingface.co/docs/huggingface_hub/quick-start#authentication.
- Return type:
ScheduledJobInfo
Example
```python
>>> from huggingface_hub import inspect_scheduled_job, create_scheduled_job
>>> scheduled_job = create_scheduled_job(image="python:3.12", command=["python", "-c", "print('Hello from HF compute!')"], schedule="@hourly")
>>> inspect_scheduled_job(scheduled_job_id=scheduled_job.id)
```
- list_accepted_access_requests(repo_id, *, repo_type=None, token=None)[source]#
Get accepted access requests for a given gated repo.
An accepted request means the user has requested access to the repo and the request has been accepted. The user can download any file of the repo. If the approval mode is automatic, this list should contain all requests by default. Accepted requests can be cancelled or rejected at any time using [cancel_access_request] and [reject_access_request]. A cancelled request will go back to the pending list while a rejected request will go to the rejected list. In both cases, the user will lose access to the repo.
For more info about gated repos, see https://huggingface.co/docs/hub/models-gated.
- Parameters:
repo_id (str) – The id of the repo to get access requests for.
repo_type (str, optional) – The type of the repo to get access requests for. Must be one of model, dataset or space. Defaults to model.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
An iterable of [AccessRequest] objects. Each item contains a username, email, status and timestamp attribute. If the gated repo has a custom form, the fields attribute will be populated with the user’s answers.
- Return type:
Iterable[AccessRequest]
- Raises:
[HfHubHTTPError] – HTTP 400 if the repo is not gated.
[HfHubHTTPError] – HTTP 403 if you only have read-only access to the repo. This can be the case if you don’t have write or admin role in the organization the repo belongs to or if you passed a read token.
Example:
```py
>>> from huggingface_hub import list_accepted_access_requests

>>> requests = list(list_accepted_access_requests("meta-llama/Llama-2-7b"))
>>> len(requests)
411
>>> requests[0]
AccessRequest(
    username='clem',
    fullname='Clem 🤗',
    email='***',
    timestamp=datetime.datetime(2023, 11, 23, 18, 4, 53, 828000, tzinfo=datetime.timezone.utc),
    status='accepted',
    fields=None,
)
```
- list_collections(*, owner=None, item=None, sort=None, limit=None, token=None)[source]#
List collections on the Hugging Face Hub, given some filters.
> [!WARNING]
> When listing collections, the item list per collection is truncated to 4 items maximum. To retrieve all items from a collection, you must use [get_collection].
- Parameters:
owner (list[str] or str, optional) – Filter by owner’s username.
item (list[str] or str, optional) – Filter collections containing a particular item. Example: “models/teknium/OpenHermes-2.5-Mistral-7B”, “datasets/squad” or “papers/2311.12983”.
sort (Literal[“lastModified”, “trending”, “upvotes”], optional) – Sort collections by last modified, trending or upvotes.
limit (int, optional) – Maximum number of collections to be returned.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
an iterable of [Collection] objects.
- Return type:
Iterable[Collection]
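Since the listing truncates each collection's items, retrieving complete collections takes a second call per collection. A minimal sketch, assuming `api` is an HfApi-like object and that collections expose `slug` and `items` (with `item_type` and `item_id` on each item); `full_collection_items` is a hypothetical helper name:

```python
def full_collection_items(api, owner, *, limit=5):
    """Map each collection slug to its complete list of (item_type, item_id) pairs."""
    result = {}
    for summary in api.list_collections(owner=owner, limit=limit):
        # The listing truncates `items`, so re-fetch each collection by its slug.
        full = api.get_collection(summary.slug)
        result[summary.slug] = [(item.item_type, item.item_id) for item in full.items]
    return result
```

This issues one extra request per collection, so keeping `limit` small is advisable.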
- list_daily_papers(*, date=None, token=None, week=None, month=None, submitter=None, sort=None, p=None, limit=None)[source]#
List the daily papers published on a given date on the Hugging Face Hub.
- Parameters:
date (str, optional) – Date in ISO format (YYYY-MM-DD) for which to fetch daily papers. Defaults to most recent ones.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token. To disable authentication, pass False.
week (str, optional) – Week in ISO format (YYYY-Www) for which to fetch daily papers. Example, 2025-W09.
month (str, optional) – Month in ISO format (YYYY-MM) for which to fetch daily papers. Example, 2025-02.
submitter (str, optional) – Username of the submitter to filter daily papers.
sort (Literal[“publishedAt”, “trending”], optional) – Sort order for the daily papers. Can be either by publishedAt or by trending. Defaults to “publishedAt”
p (int, optional) – Page number for pagination. Defaults to 0.
limit (int, optional) – Limit of papers to fetch. Defaults to 50.
- Returns:
an iterable of [huggingface_hub.hf_api.PaperInfo] objects.
- Return type:
Iterable[PaperInfo]
Example:
```python
>>> from huggingface_hub import HfApi

>>> api = HfApi()
>>> list(api.list_daily_papers(date="2025-10-29"))
```
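The p and limit parameters suggest page-by-page retrieval. The sketch below walks pages until one comes back short or empty. It assumes that page `p` returns at most `limit` papers and that pages past the end are empty, which the text above does not state explicitly; `iter_daily_papers` is a hypothetical helper.

```python
def iter_daily_papers(api, *, date, page_size=50, max_pages=10):
    """Yield daily papers for `date`, requesting one page at a time."""
    for page in range(max_pages):
        batch = list(api.list_daily_papers(date=date, p=page, limit=page_size))
        if not batch:
            return
        yield from batch
        # Assumption: a short page means there is nothing left to fetch.
        if len(batch) < page_size:
            return
```

The `max_pages` cap keeps the loop bounded even if the pagination assumption turns out to be wrong.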
- list_datasets(*, filter=None, author=None, benchmark=None, dataset_name=None, gated=None, language_creators=None, language=None, multilinguality=None, size_categories=None, task_categories=None, task_ids=None, search=None, sort=None, direction=None, limit=None, expand=None, full=None, token=None)[source]#
List datasets hosted on the Huggingface Hub, given some filters.
- Parameters:
filter (str or Iterable[str], optional) – A string or list of strings to filter datasets on the Hub.
author (str, optional) – A string which identifies the author of the returned datasets.
benchmark (str or List, optional) – A string or list of strings that can be used to identify datasets on the Hub by their official benchmark.
dataset_name (str, optional) – A string or list of strings that can be used to identify datasets on the Hub by their name, such as SQAC or wikineural.
gated (bool, optional) – A boolean to filter datasets on the Hub that are gated or not. By default, all datasets are returned. If gated=True is passed, only gated datasets are returned. If gated=False is passed, only non-gated datasets are returned.
language_creators (str or List, optional) – A string or list of strings that can be used to identify datasets on the Hub with how the data was curated, such as crowdsourced or machine_generated.
language (str or List, optional) – A string or list of strings representing a two-character language code to filter datasets by on the Hub.
multilinguality (str or List, optional) – A string or list of strings representing a filter for datasets that contain multiple languages.
size_categories (str or List, optional) – A string or list of strings that can be used to identify datasets on the Hub by the size of the dataset such as 100K<n<1M or 1M<n<10M.
tags (str or List, optional) – Deprecated. Pass tags in filter to filter datasets by tags.
task_categories (str or List, optional) – A string or list of strings that can be used to identify datasets on the Hub by the designed task, such as audio_classification or named_entity_recognition.
task_ids (str or List, optional) – A string or list of strings that can be used to identify datasets on the Hub by the specific task such as speech_emotion_recognition or paraphrase.
search (str, optional) – A string that will be contained in the returned datasets.
sort (DatasetSort_T, optional) – The key with which to sort the resulting datasets. Possible values are “created_at”, “downloads”, “last_modified”, “likes” and “trending_score”.
direction (Literal[-1] or int, optional) – Deprecated. This parameter is not used and will be removed in version 1.5.
limit (int, optional) – The limit on the number of datasets fetched. Leaving this option to None fetches all datasets.
expand (list[ExpandDatasetProperty_T], optional) – List properties to return in the response. When used, only the properties in the list will be returned. This parameter cannot be used if full is passed. Possible values are “author”, “cardData”, “citation”, “createdAt”, “disabled”, “description”, “downloads”, “downloadsAllTime”, “gated”, “lastModified”, “likes”, “paperswithcode_id”, “private”, “siblings”, “sha”, “tags”, “trendingScore”, “usedStorage”, and “resourceGroup”.
full (bool, optional) – Whether to fetch all dataset data, including the last_modified, the card_data and the files. Can contain useful information such as the PapersWithCode ID.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
an iterable of [huggingface_hub.hf_api.DatasetInfo] objects.
- Return type:
Iterable[DatasetInfo]
Example usage with the filter argument:
```python
>>> from huggingface_hub import HfApi

>>> api = HfApi()

# List all datasets
>>> api.list_datasets()

# List only the text classification datasets
>>> api.list_datasets(filter="task_categories:text-classification")

# List only the datasets in Russian for language modeling
>>> api.list_datasets(
...     filter=("language:ru", "task_ids:language-modeling")
... )

# List FiftyOne datasets (identified by the tag "fiftyone" in dataset card)
>>> api.list_datasets(filter="fiftyone")
```
Example usage with the search argument:
```python
>>> from huggingface_hub import HfApi

>>> api = HfApi()

# List all datasets with "text" in their name
>>> api.list_datasets(search="text")

# List all datasets with "text" in their name made by google
>>> api.list_datasets(search="text", author="google")
```
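The filter examples above use "key:value" strings such as "language:ru". A small sketch of building such a tuple from keyword arguments follows; `build_filter` is a hypothetical helper, not part of huggingface_hub, and it only formats strings without checking that the keys are recognised by the Hub.

```python
def build_filter(**criteria):
    """Turn keyword criteria into a tuple of "key:value" filter strings."""
    parts = []
    for key, value in criteria.items():
        # Accept either a single string or an iterable of strings per key.
        values = [value] if isinstance(value, str) else list(value)
        parts.extend(f"{key}:{v}" for v in values)
    return tuple(parts)
```

The result could then be passed along as `api.list_datasets(filter=build_filter(language="ru", task_ids="language-modeling"))`.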
- list_inference_catalog(*, token=None)[source]#
List models available in the Hugging Face Inference Catalog.
The goal of the Inference Catalog is to provide a curated list of models that are optimized for inference and for which default configurations have been tested. See https://endpoints.huggingface.co/catalog for a list of available models in the catalog.
Use [create_inference_endpoint_from_catalog] to deploy a model from the catalog.
- Parameters:
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication).
- Returns:
A list of model IDs available in the catalog.
- Return type:
List[str]
> [!WARNING]
> list_inference_catalog is experimental. Its API is subject to change in the future. Please provide feedback if you have any suggestions or requests.
- list_inference_endpoints(namespace=None, *, token=None)[source]#
Lists all inference endpoints for the given namespace.
- Parameters:
namespace (str, optional) – The namespace to list endpoints for. Defaults to the current user. Set to “*” to list all endpoints from all namespaces (i.e. personal namespace and all orgs the user belongs to).
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
A list of all inference endpoints for the given namespace.
- Return type:
list[InferenceEndpoint]
Example:
```python
>>> from huggingface_hub import HfApi
>>> api = HfApi()
>>> api.list_inference_endpoints()
[InferenceEndpoint(name='my-endpoint', ...), ...]
```
- list_jobs(*, timeout=None, namespace=None, token=None)[source]#
List compute Jobs on Hugging Face infrastructure.
- Parameters:
timeout (float, optional) – Whether to set a timeout for the request to the Hub.
namespace (str, optional) – The namespace from where it lists the jobs. Defaults to the current user’s namespace.
token (bool or str, optional) – A valid user access token. If not provided, the locally saved token will be used, which is the recommended authentication method. Set to False to disable authentication. Refer to: https://huggingface.co/docs/huggingface_hub/quick-start#authentication.
- Return type:
list[JobInfo]
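As an illustration of working with the returned list, the sketch below tallies Jobs by stage. It assumes each JobInfo exposes `status.stage` as shown in the inspect_job output; `count_jobs_by_stage` is a hypothetical helper.

```python
from collections import Counter


def count_jobs_by_stage(jobs):
    """Count JobInfo-like objects per `status.stage` value."""
    return Counter(
        # Works whether `stage` is a plain string or an enum with a `.value`.
        str(getattr(job.status.stage, "value", job.status.stage))
        for job in jobs
    )
```

With the real client this would be called as `count_jobs_by_stage(api.list_jobs())`.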
- list_lfs_files(repo_id, *, repo_type=None, token=None)[source]#
List all LFS files in a repo on the Hub.
This is primarily useful to count how much storage a repo is using and to eventually clean up large files with [permanently_delete_lfs_files]. Note that deletion is a permanent action that affects all commits referencing the deleted files and cannot be undone.
- Parameters:
repo_id (str) – The repository for which you are listing LFS files.
repo_type (str, optional) – Type of repository. Set to “dataset” or “space” if listing from a dataset or space, None or “model” if listing from a model. Default is None.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
An iterator of [LFSFileInfo] objects.
- Return type:
Iterable[LFSFileInfo]
Example
```py
>>> from huggingface_hub import HfApi
>>> api = HfApi()
>>> lfs_files = api.list_lfs_files("username/my-cool-repo")

# Filter files to delete based on a combination of filename, pushed_at, ref or size.
# e.g. select only LFS files in the "checkpoints" folder
>>> lfs_files_to_delete = (lfs_file for lfs_file in lfs_files if lfs_file.filename.startswith("checkpoints/"))

# Permanently delete LFS files
>>> api.permanently_delete_lfs_files("username/my-cool-repo", lfs_files_to_delete)
```
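For the storage-counting use case mentioned above, a minimal sketch that totals LFS bytes per top-level folder. It assumes each LFSFileInfo has `filename` and `size` (in bytes) attributes, as the filtering comment in the example implies; `lfs_bytes_by_top_folder` is a hypothetical helper.

```python
from collections import defaultdict


def lfs_bytes_by_top_folder(lfs_files):
    """Sum LFS file sizes per top-level folder ("." for files at the repo root)."""
    totals = defaultdict(int)
    for lfs_file in lfs_files:
        name = lfs_file.filename
        folder = name.split("/", 1)[0] if "/" in name else "."
        totals[folder] += lfs_file.size
    return dict(totals)
```

Reviewing these totals before calling permanently_delete_lfs_files is a cheap sanity check, since the deletion cannot be undone.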
- list_liked_repos(user=None, *, token=None)[source]#
List all public repos liked by a user on huggingface.co.
This list is public so token is optional. If user is not passed, it defaults to the logged in user.
See also [unlike].
- Parameters:
user (str, optional) – Name of the user for which you want to fetch the likes.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
object containing the user name and 3 lists of repo ids (1 for models, 1 for datasets and 1 for Spaces).
- Return type:
[UserLikes]
- Raises:
[ValueError](https://docs.python.org/3/library/exceptions.html#ValueError) – If user is not passed and no token found (either from argument or from machine).
Example:
```python
>>> from huggingface_hub import list_liked_repos

>>> likes = list_liked_repos("julien-c")

>>> likes.user
"julien-c"

>>> likes.models
["osanseviero/streamlit_1.15", "Xhaheen/ChatGPT_HF", ...]
```
- list_models(*, filter=None, author=None, apps=None, gated=None, inference=None, inference_provider=None, model_name=None, trained_dataset=None, search=None, pipeline_tag=None, emissions_thresholds=None, sort=None, direction=None, limit=None, expand=None, full=None, cardData=False, fetch_config=False, token=None)[source]#
List models hosted on the Huggingface Hub, given some filters.
- Parameters:
filter (str or Iterable[str], optional) – A string or list of strings to filter models on the Hub. Models can be filtered by library, language, task, tags, and more.
author (str, optional) – A string which identifies the author (user or organization) of the returned models.
apps (str or List, optional) – A string or list of strings to filter models on the Hub that support the specified apps. Example values include “ollama” or [“ollama”, “vllm”].
gated (bool, optional) – A boolean to filter models on the Hub that are gated or not. By default, all models are returned. If gated=True is passed, only gated models are returned. If gated=False is passed, only non-gated models are returned.
inference (Literal[“warm”], optional) – If “warm”, filter models on the Hub currently served by at least one provider.
inference_provider (Literal[“all”] or str, optional) – A string to filter models on the Hub that are served by a specific provider. Pass “all” to get all models served by at least one provider.
model_name (str, optional) – A string that contains complete or partial names for models on the Hub, such as “bert” or “bert-base-cased”.
trained_dataset (str or List, optional) – A string tag or a list of string tags of the trained dataset for a model on the Hub.
search (str, optional) – A string that will be contained in the returned model ids.
pipeline_tag (str, optional) – A string pipeline tag to filter models on the Hub by, such as summarization.
emissions_thresholds (Tuple, optional) – A tuple of two ints or floats representing a minimum and maximum carbon footprint to filter the resulting models with in grams.
sort (ModelSort_T, optional) – The key with which to sort the resulting models. Possible values are “created_at”, “downloads”, “last_modified”, “likes” and “trending_score”.
direction (Literal[-1] or int, optional) – Deprecated. This parameter is not used and will be removed in version 1.5.
limit (int, optional) – The limit on the number of models fetched. Leaving this option to None fetches all models.
expand (list[ExpandModelProperty_T], optional) – List properties to return in the response. When used, only the properties in the list will be returned. This parameter cannot be used if full, cardData or fetch_config are passed. Possible values are “author”, “cardData”, “config”, “createdAt”, “disabled”, “downloads”, “downloadsAllTime”, “gated”, “gguf”, “inference”, “inferenceProviderMapping”, “lastModified”, “library_name”, “likes”, “mask_token”, “model-index”, “pipeline_tag”, “private”, “safetensors”, “sha”, “siblings”, “spaces”, “tags”, “transformersInfo”, “trendingScore”, “widgetData”, and “resourceGroup”.
full (bool, optional) – Whether to fetch all model data, including the last_modified, the sha, the files and the tags. This is set to True by default when using a filter.
cardData (bool, optional) – Whether to grab the metadata for the model as well. Can contain useful information such as carbon emissions, metrics, and datasets trained on.
fetch_config (bool, optional) – Whether to fetch the model configs as well. This is not included in full due to its size.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
an iterable of [huggingface_hub.hf_api.ModelInfo] objects.
- Return type:
Iterable[ModelInfo]
Example:
```python
>>> from huggingface_hub import HfApi

>>> api = HfApi()

# List all models
>>> api.list_models()

# List text classification models
>>> api.list_models(filter="text-classification")

# List models from the KerasHub library
>>> api.list_models(filter="keras-hub")

# List models served by Cohere
>>> api.list_models(inference_provider="cohere")

# List models with "bert" in their name
>>> api.list_models(search="bert")

# List models with "bert" in their name and pushed by google
>>> api.list_models(search="bert", author="google")
```
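Because list_models returns an iterable and leaving limit at None fetches every match, it is usually worth bounding the request. A minimal sketch, assuming `api` is an HfApi-like object and each ModelInfo has an `id` attribute; `take_models` is a hypothetical helper:

```python
from itertools import islice


def take_models(api, n, **filters):
    """Return the ids of the first `n` models matching `filters`."""
    # `limit` bounds what is fetched; `islice` also guards the client side.
    models = api.list_models(limit=n, **filters)
    return [model.id for model in islice(models, n)]
```

For example, `take_models(api, 5, search="bert", author="google")` would return at most five model ids.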
- list_organization_followers(organization, token=None)[source]#
List followers of an organization on the Hub.
- Parameters:
organization (str) – Name of the organization to get the followers of.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
A list of [User] objects with the followers of the organization.
- Return type:
Iterable[User]
- Raises:
[HfHubHTTPError] – HTTP 404 If the organization does not exist on the Hub.
- list_organization_members(organization, token=None)[source]#
List members of an organization on the Hub.
- Parameters:
organization (str) – Name of the organization to get the members of.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
A list of [User] objects with the members of the organization.
- Return type:
Iterable[User]
- Raises:
[HfHubHTTPError] – HTTP 404 If the organization does not exist on the Hub.
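The follower and member listings can be combined, for instance to find followers who are not yet members. A sketch under the assumption that each [User] object exposes a `username` attribute; `followers_not_members` is a hypothetical helper:

```python
def followers_not_members(api, organization):
    """Return sorted usernames that follow `organization` without being members."""
    members = {user.username for user in api.list_organization_members(organization)}
    return sorted(
        user.username
        for user in api.list_organization_followers(organization)
        if user.username not in members
    )
```

Both calls raise [HfHubHTTPError] for an unknown organization, so the helper propagates that error unchanged.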
- list_papers(*, query=None, token=None)[source]#
List daily papers on the Hugging Face Hub given a search query.
- Parameters:
query (str, optional) – A search query string to find papers. If provided, returns papers that match the query.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
an iterable of [huggingface_hub.hf_api.PaperInfo] objects.
- Return type:
Iterable[PaperInfo]
Example:
```python
>>> from huggingface_hub import HfApi

>>> api = HfApi()

# List all papers with "attention" in their title
>>> api.list_papers(query="attention")
```
- list_pending_access_requests(repo_id, *, repo_type=None, token=None)[source]#
Get pending access requests for a given gated repo.
A pending request means the user has requested access to the repo but the request has not been processed yet. If the approval mode is automatic, this list should be empty. Pending requests can be accepted or rejected using [accept_access_request] and [reject_access_request].
For more info about gated repos, see https://huggingface.co/docs/hub/models-gated.
- Parameters:
repo_id (str) – The id of the repo to get access requests for.
repo_type (str, optional) – The type of the repo to get access requests for. Must be one of model, dataset or space. Defaults to model.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
An iterable of [AccessRequest] objects. Each item contains a username, email, status and timestamp attribute. If the gated repo has a custom form, the fields attribute will be populated with the user’s answers.
- Return type:
Iterable[AccessRequest]
- Raises:
[HfHubHTTPError] – HTTP 400 if the repo is not gated.
[HfHubHTTPError] – HTTP 403 if you only have read-only access to the repo. This can be the case if you don’t have write or admin role in the organization the repo belongs to or if you passed a read token.
Example:
```py
>>> from huggingface_hub import list_pending_access_requests, accept_access_request

# List pending requests
>>> requests = list(list_pending_access_requests("meta-llama/Llama-2-7b"))
>>> len(requests)
411
>>> requests[0]
[
]

# Accept Clem's request
>>> accept_access_request("meta-llama/Llama-2-7b", "clem")
```
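Accepting requests one by one, as above, extends naturally to an allow-list. A minimal sketch, assuming `api` is an HfApi-like object; `accept_allowed` is a hypothetical helper, and usernames outside the allow-list are simply left pending:

```python
def accept_allowed(api, repo_id, allowed_usernames):
    """Accept pending access requests from users in `allowed_usernames`.

    Returns the usernames that were accepted, in the order they were processed.
    """
    allowed = set(allowed_usernames)
    accepted = []
    # Materialise the iterable first so accepting does not disturb iteration.
    for request in list(api.list_pending_access_requests(repo_id)):
        if request.username in allowed:
            api.accept_access_request(repo_id, request.username)
            accepted.append(request.username)
    return accepted
```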
- list_rejected_access_requests(repo_id, *, repo_type=None, token=None)[source]#
Get rejected access requests for a given gated repo.
A rejected request means the user has requested access to the repo and the request has been explicitly rejected by a repo owner (either you or another user from your organization). The user cannot download any file of the repo. Rejected requests can be accepted or cancelled at any time using [accept_access_request] and [cancel_access_request]. A cancelled request will go back to the pending list while an accepted request will go to the accepted list.
For more info about gated repos, see https://huggingface.co/docs/hub/models-gated.
- Parameters:
repo_id (str) – The id of the repo to get access requests for.
repo_type (str, optional) – The type of the repo to get access requests for. Must be one of model, dataset or space. Defaults to model.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
An iterable of [AccessRequest] objects. Each item contains a username, email, status and timestamp attribute. If the gated repo has a custom form, the fields attribute will be populated with the user’s answers.
- Return type:
Iterable[AccessRequest]
- Raises:
[HfHubHTTPError] – HTTP 400 if the repo is not gated.
[HfHubHTTPError] – HTTP 403 if you only have read-only access to the repo. This can be the case if you don’t have write or admin role in the organization the repo belongs to or if you passed a read token.
Example:
```py
>>> from huggingface_hub import list_rejected_access_requests

>>> requests = list(list_rejected_access_requests("meta-llama/Llama-2-7b"))
>>> len(requests)
411
>>> requests[0]
AccessRequest(
    username='clem',
    fullname='Clem 🤗',
    email='***',
    timestamp=datetime.datetime(2023, 11, 23, 18, 4, 53, 828000, tzinfo=datetime.timezone.utc),
    status='rejected',
    fields=None,
)
```
- list_repo_commits(repo_id, *, repo_type=None, token=None, revision=None, formatted=False)[source]#
Get the list of commits of a given revision for a repo on the Hub.
Commits are sorted by date (last commit first).
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
repo_type (str, optional) – Set to “dataset” or “space” if listing commits from a dataset or a Space, None or “model” if listing from a model. Default is None.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
revision (str, optional) – The git revision to list commits from. Defaults to the head of the “main” branch.
formatted (bool) – Whether to return the HTML-formatted title and description of the commits. Defaults to False.
- Return type:
list[GitCommitInfo]
Example:
```py
>>> from huggingface_hub import HfApi
>>> api = HfApi()

# Commits are sorted by date (last commit first)
>>> initial_commit = api.list_repo_commits("gpt2")[-1]

# Initial commit is always a system commit containing the .gitattributes file.
>>> initial_commit
GitCommitInfo(
    commit_id='9b865efde13a30c13e0a33e536cf3e4a5a9d71d8',
    authors=['system'],
    created_at=datetime.datetime(2019, 2, 18, 10, 36, 15, tzinfo=datetime.timezone.utc),
    title='initial commit',
    message='',
    formatted_title=None,
    formatted_message=None
)

# Create an empty branch by deriving from initial commit
>>> api.create_branch("gpt2", "new_empty_branch", revision=initial_commit.commit_id)
```
- Returns:
list of objects containing information about the commits for a repo on the Hub.
- Return type:
list[GitCommitInfo]
- Raises:
[RepositoryNotFoundError] – If repository is not found (error 404): wrong repo_id/repo_type, private but not authenticated or repo does not exist.
[RevisionNotFoundError] – If revision is not found (error 404) on the repo.
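Since commits come back newest first, a date cutoff can stop scanning early. A minimal sketch, assuming `created_at` is a timezone-aware datetime as in the example output; `commits_since` is a hypothetical helper, and `cutoff` must also be timezone-aware for the comparison to work.

```python
from datetime import datetime, timezone


def commits_since(api, repo_id, cutoff):
    """Return commits created at or after `cutoff`, newest first."""
    recent = []
    for commit in api.list_repo_commits(repo_id):
        if commit.created_at < cutoff:
            break  # commits are sorted newest first, so the rest are older
        recent.append(commit)
    return recent
```

A call might look like `commits_since(api, "gpt2", datetime(2024, 1, 1, tzinfo=timezone.utc))`.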
- list_repo_files(repo_id, *, revision=None, repo_type=None, token=None)[source]#
Get the list of files in a given repo.
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
revision (str, optional) – The revision of the repository from which to get the information.
repo_type (str, optional) – Set to “dataset” or “space” if listing files from a dataset or Space, None or “model” if listing from a model. Default is None.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
the list of files in a given repository.
- Return type:
list[str]
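The returned list of paths is plain strings, so standard-library tools apply directly. As an illustration, the sketch below counts files per extension; `count_by_extension` is a hypothetical helper and `"(none)"` is an arbitrary label for files without a suffix.

```python
from collections import Counter
from pathlib import PurePosixPath


def count_by_extension(filenames):
    """Count repo file paths by their final extension."""
    # Repo paths use "/" separators, hence PurePosixPath on every platform.
    return Counter(PurePosixPath(name).suffix or "(none)" for name in filenames)
```

With the real client this would be called as `count_by_extension(api.list_repo_files("gpt2"))`.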
- list_repo_likers(repo_id, *, repo_type=None, token=None)[source]#
List all users who liked a given repo on the Hugging Face Hub.
See also [list_liked_repos].
- Parameters:
repo_id (str) – The repository whose likers to retrieve. Example: “user/my-cool-model”.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
repo_type (str, optional) – Set to “dataset” or “space” if listing likers of a dataset or Space, None or “model” if listing likers of a model. Default is None.
- Returns:
an iterable of [huggingface_hub.hf_api.User] objects.
- Return type:
Iterable[User]
- list_repo_refs(repo_id, *, repo_type=None, include_pull_requests=False, token=None)[source]#
Get the list of refs of a given repo (both tags and branches).
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
repo_type (str, optional) – Set to “dataset” or “space” if listing refs from a dataset or a Space, None or “model” if listing from a model. Default is None.
include_pull_requests (bool, optional) – Whether to include refs from pull requests in the list. Defaults to False.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
Example:
```py
>>> from huggingface_hub import HfApi
>>> api = HfApi()
>>> api.list_repo_refs("gpt2")
GitRefs(branches=[GitRefInfo(name='main', ref='refs/heads/main', target_commit='e7da7f221d5bf496a48136c0cd264e630fe9fcc8')], converts=[], tags=[])

>>> api.list_repo_refs("bigcode/the-stack", repo_type='dataset')
GitRefs(
    branches=[
        GitRefInfo(name='main', ref='refs/heads/main', target_commit='18edc1591d9ce72aa82f56c4431b3c969b210ae3'),
        GitRefInfo(name='v1.1.a1', ref='refs/heads/v1.1.a1', target_commit='f9826b862d1567f3822d3d25649b0d6d22ace714')
    ],
    converts=[],
    tags=[
        GitRefInfo(name='v1.0', ref='refs/tags/v1.0', target_commit='c37a8cd1e382064d8aced5e05543c5f7753834da')
    ]
)
```
- Returns:
object containing all information about branches and tags for a repo on the Hub.
- Return type:
[GitRefs]
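The returned object groups refs into `branches` and `tags`, each entry carrying a `name` and a `target_commit` (as in the example output above). A minimal sketch of resolving a ref name to its commit hash follows; `resolve_ref` is a hypothetical helper, and checking branches before tags is a choice made for this example.

```python
from typing import Optional


def resolve_ref(refs, name: str) -> Optional[str]:
    """Return the commit hash a branch or tag name points to, or None if absent."""
    # Assumption: `refs` has `.branches` and `.tags`, whose items expose
    # `.name` and `.target_commit`, matching the GitRefs repr shown above.
    for ref_info in list(refs.branches) + list(refs.tags):
        if ref_info.name == name:
            return ref_info.target_commit
    return None
```

For instance, `resolve_ref(api.list_repo_refs("bigcode/the-stack", repo_type="dataset"), "v1.0")` would look up the tag shown in the example.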
- list_repo_tree(repo_id, path_in_repo=None, *, recursive=False, expand=False, revision=None, repo_type=None, token=None)[source]#
List a repo tree’s files and folders and get information about them.
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
path_in_repo (str, optional) – Relative path of the tree (folder) in the repo, for example: “checkpoints/1fec34a/results”. Will default to the root tree (folder) of the repository.
recursive (bool, optional, defaults to False) – Whether to list tree’s files and folders recursively.
expand (bool, optional, defaults to False) – Whether to fetch more information about the tree’s files and folders (e.g. last commit and files’ security scan results). This operation is more expensive for the server so only 50 results are returned per page (instead of 1000). As pagination is implemented in huggingface_hub, this is transparent for you except for the time it takes to get the results.
revision (str, optional) – The revision of the repository from which to get the tree. Defaults to “main” branch.
repo_type (str, optional) – The type of the repository from which to get the tree (“model”, “dataset” or “space”). Defaults to “model”.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
The information about the tree’s files and folders, as an iterable of [RepoFile] and [RepoFolder] objects. The order of the files and folders is not guaranteed.
- Return type:
Iterable[Union[RepoFile, RepoFolder]]
- Raises:
[RepositoryNotFoundError] – If repository is not found (error 404): wrong repo_id/repo_type, private but not authenticated or repo does not exist.
[RevisionNotFoundError] – If revision is not found (error 404) on the repo.
[RemoteEntryNotFoundError] – If the tree (folder) does not exist (error 404) on the repo.
Examples
Get information about a repo’s tree.
```py
>>> from huggingface_hub import list_repo_tree
>>> repo_tree = list_repo_tree("lysandre/arxiv-nlp")
>>> repo_tree
<generator object HfApi.list_repo_tree at 0x7fa4088e1ac0>
>>> list(repo_tree)
[
    RepoFile(path='.gitattributes', size=391, blob_id='ae8c63daedbd4206d7d40126955d4e6ab1c80f8f', lfs=None, last_commit=None, security=None),
    RepoFile(path='README.md', size=391, blob_id='43bd404b159de6fba7c2f4d3264347668d43af25', lfs=None, last_commit=None, security=None),
    RepoFile(path='config.json', size=554, blob_id='2f9618c3a19b9a61add74f70bfb121335aeef666', lfs=None, last_commit=None, security=None),
    RepoFile(
        path='flax_model.msgpack', size=497764107, blob_id='8095a62ccb4d806da7666fcda07467e2d150218e',
        lfs={'size': 497764107, 'sha256': 'd88b0d6a6ff9c3f8151f9d3228f57092aaea997f09af009eefd7373a77b5abb9', 'pointer_size': 134},
        last_commit=None, security=None
    ),
    RepoFile(path='merges.txt', size=456318, blob_id='226b0752cac7789c48f0cb3ec53eda48b7be36cc', lfs=None, last_commit=None, security=None),
    RepoFile(
        path='pytorch_model.bin', size=548123560, blob_id='64eaa9c526867e404b68f2c5d66fd78e27026523',
        lfs={'size': 548123560, 'sha256': '9be78edb5b928eba33aa88f431551348f7466ba9f5ef3daf1d552398722a5436', 'pointer_size': 134},
        last_commit=None, security=None
    ),
    RepoFile(path='vocab.json', size=898669, blob_id='b00361fece0387ca34b4b8b8539ed830d644dbeb', lfs=None, last_commit=None, security=None)
]
```
Get even more information about a repo’s tree (last commit and files’ security scan results).
```py
>>> from huggingface_hub import list_repo_tree
>>> repo_tree = list_repo_tree("prompthero/openjourney-v4", expand=True)
>>> list(repo_tree)
[
    RepoFolder(
        path='feature_extractor',
        tree_id='aa536c4ea18073388b5b0bc791057a7296a00398',
        last_commit={
            'oid': '47b62b20b20e06b9de610e840282b7e6c3d51190',
            'title': 'Upload diffusers weights (#48)',
            'date': datetime.datetime(2023, 3, 21, 9, 5, 27, tzinfo=datetime.timezone.utc)
        }
    ),
    RepoFolder(
        path='safety_checker',
        tree_id='65aef9d787e5557373fdf714d6c34d4fcdd70440',
        last_commit={
            'oid': '47b62b20b20e06b9de610e840282b7e6c3d51190',
            'title': 'Upload diffusers weights (#48)',
            'date': datetime.datetime(2023, 3, 21, 9, 5, 27, tzinfo=datetime.timezone.utc)
        }
    ),
    RepoFile(
        path='model_index.json',
        size=582,
        blob_id='d3d7c1e8c3e78eeb1640b8e2041ee256e24c9ee1',
        lfs=None,
        last_commit={
            'oid': 'b195ed2d503f3eb29637050a886d77bd81d35f0e',
            'title': 'Fix deprecation warning by changing CLIPFeatureExtractor to CLIPImageProcessor. (#54)',
            'date': datetime.datetime(2023, 5, 15, 21, 41, 59, tzinfo=datetime.timezone.utc)
        },
        security={
            'safe': True,
            'av_scan': {'virusFound': False, 'virusNames': None},
            'pickle_import_scan': None
        }
    )
]
```
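Since the tree iterable mixes file and folder entries, a common follow-up is to tally them. The sketch below is a hedged illustration: `summarize_tree` is a hypothetical helper, and it assumes that only file entries carry a `size` attribute, as the `RepoFile` and `RepoFolder` reprs above suggest.

```python
from typing import Iterable, Tuple


def summarize_tree(entries: Iterable) -> Tuple[int, int, int]:
    """Return (file_count, folder_count, total_file_bytes) for tree entries."""
    files = folders = total_bytes = 0
    for entry in entries:
        # Assumption: folder entries have no `size` attribute; file entries do.
        size = getattr(entry, "size", None)
        if size is None:
            folders += 1
        else:
            files += 1
            total_bytes += size
    return files, folders, total_bytes
```

The function consumes the iterable once, so it can be passed the generator returned by `list_repo_tree(...)` directly.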
- list_scheduled_jobs(*, timeout=None, namespace=None, token=None)[source]#
List scheduled compute Jobs on Hugging Face infrastructure.
- Parameters:
timeout (float, optional) – Whether to set a timeout for the request to the Hub.
namespace (str, optional) – The namespace from where it lists the jobs. Defaults to the current user’s namespace.
token (bool or str, optional) – A valid user access token. If not provided, the locally saved token will be used, which is the recommended authentication method. Set to False to disable authentication. Refer to: https://huggingface.co/docs/huggingface_hub/quick-start#authentication.
- Return type:
list[ScheduledJobInfo]
- list_spaces(*, filter=None, author=None, search=None, datasets=None, models=None, linked=False, sort=None, direction=None, limit=None, expand=None, full=None, token=None)[source]#
List Spaces hosted on the Hugging Face Hub, given some filters.
- Parameters:
filter (str or Iterable, optional) – A string tag or list of tags that can be used to identify Spaces on the Hub.
author (str, optional) – A string that identifies the author of the returned Spaces.
search (str, optional) – A string that will be contained in the returned Spaces.
datasets (str or Iterable, optional) – Whether to return Spaces that make use of a dataset. The name of a specific dataset can be passed as a string.
models (str or Iterable, optional) – Whether to return Spaces that make use of a model. The name of a specific model can be passed as a string.
linked (bool, optional) – Whether to return Spaces that make use of either a model or a dataset.
sort (SpaceSort_T, optional) – The key with which to sort the resulting spaces. Possible values are “created_at”, “last_modified”, “likes” and “trending_score”.
direction (Literal[-1] or int, optional) – Deprecated. This parameter is not used and will be removed in version 1.5.
limit (int, optional) – The limit on the number of Spaces fetched. Leaving this option to None fetches all Spaces.
expand (list[ExpandSpaceProperty_T], optional) – List properties to return in the response. When used, only the properties in the list will be returned. This parameter cannot be used if full is passed. Possible values are “author”, “cardData”, “datasets”, “disabled”, “lastModified”, “createdAt”, “likes”, “models”, “private”, “runtime”, “sdk”, “siblings”, “sha”, “subdomain”, “tags”, “trendingScore”, “usedStorage”, and “resourceGroup”.
full (bool, optional) – Whether to fetch all Spaces data, including the last_modified, siblings and card_data fields.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
an iterable of [huggingface_hub.hf_api.SpaceInfo] objects.
- Return type:
Iterable[SpaceInfo]
- list_user_followers(username, token=None)[source]#
Get the list of followers of a user on the Hub.
- Parameters:
username (str) – Username of the user to get the followers of.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
A list of [User] objects with the followers of the user.
- Return type:
Iterable[User]
- Raises:
[HfHubHTTPError] – HTTP 404 If the user does not exist on the Hub.
- list_user_following(username, token=None)[source]#
Get the list of users followed by a user on the Hub.
- Parameters:
username (str) – Username of the user to get the users followed by.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
A list of [User] objects with the users followed by the user.
- Return type:
Iterable[User]
- Raises:
[HfHubHTTPError] – HTTP 404 If the user does not exist on the Hub.
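The two listing methods above can be combined to find accounts that a user both follows and is followed by. This is a sketch under stated assumptions: `mutual_follows` is a hypothetical helper, `api` is any object exposing both methods as documented, and each returned `User` object is assumed to expose a `username` attribute.

```python
from typing import Set


def mutual_follows(api, username: str) -> Set[str]:
    """Return usernames that both follow `username` and are followed by them."""
    # Assumption: items yielded by both calls have a `.username` attribute.
    followers = {user.username for user in api.list_user_followers(username)}
    following = {user.username for user in api.list_user_following(username)}
    return followers & following
```

Both calls may raise `HfHubHTTPError` for an unknown user, as noted above; the sketch lets that propagate.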
- list_webhooks(*, token=None)[source]#
List all configured webhooks.
- Parameters:
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
List of webhook info objects.
- Return type:
list[WebhookInfo]
Example
```python
>>> from huggingface_hub import list_webhooks
>>> webhooks = list_webhooks()
>>> len(webhooks)
2
>>> webhooks[0]
WebhookInfo(
    id="654bbbc16f2ec14d77f109cc",
    watched=[WebhookWatchedItem(type="user", name="julien-c"), WebhookWatchedItem(type="org", name="HuggingFaceH4")],
    url="https://webhook.site/a2176e82-5720-43ee-9e06-f91cb4c91548",
    secret="my-secret",
    domains=["repo", "discussion"],
    disabled=False,
)
```
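The fields shown in the `WebhookInfo` repr above (`watched`, `disabled`) are enough to filter the returned list. A minimal sketch, with `active_webhooks_watching` as a hypothetical helper name:

```python
from typing import Iterable, List


def active_webhooks_watching(webhooks: Iterable, name: str) -> List:
    """Keep enabled webhooks whose watch list includes the given user or org name."""
    # Assumption: each webhook exposes `.disabled` and `.watched`, and each
    # watched item exposes `.name`, as in the example output above.
    return [
        hook
        for hook in webhooks
        if not hook.disabled and any(item.name == name for item in hook.watched)
    ]
```

A call such as `active_webhooks_watching(list_webhooks(), "julien-c")` would then return only the enabled webhooks that watch that account.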
- merge_pull_request(repo_id, discussion_num, *, token=None, comment=None, repo_type=None)[source]#
Merges a Pull Request.
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
discussion_num (int) – The number of the Discussion or Pull Request. Must be a strictly positive integer.
comment (str, optional) – An optional comment to post with the status change.
repo_type (str, optional) – Set to “dataset” or “space” if uploading to a dataset or space, None or “model” if uploading to a model. Default is None.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
the status change event
- Return type:
[DiscussionStatusChange]
> [!TIP]
> Raises the following errors:
>
> - [HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError) if the HuggingFace API returned an error
> - [ValueError](https://docs.python.org/3/library/exceptions.html#ValueError) if some parameter value is invalid
> - [~utils.RepositoryNotFoundError] If the repository to download from cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.
- model_info(repo_id, *, revision=None, timeout=None, securityStatus=None, files_metadata=False, expand=None, token=None)[source]#
Get info on one specific model on huggingface.co
Model can be private if you pass an acceptable token or are logged in.
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
revision (str, optional) – The revision of the model repository from which to get the information.
timeout (float, optional) – Whether to set a timeout for the request to the Hub.
securityStatus (bool, optional) – Whether to retrieve the security status from the model repository as well. The security status will be returned in the security_repo_status field.
files_metadata (bool, optional) – Whether or not to retrieve metadata for files in the repository (size, LFS metadata, etc). Defaults to False.
expand (list[ExpandModelProperty_T], optional) – List properties to return in the response. When used, only the properties in the list will be returned. This parameter cannot be used if securityStatus or files_metadata are passed. Possible values are “author”, “baseModels”, “cardData”, “childrenModelCount”, “config”, “createdAt”, “disabled”, “downloads”, “downloadsAllTime”, “gated”, “gguf”, “inference”, “inferenceProviderMapping”, “lastModified”, “library_name”, “likes”, “mask_token”, “model-index”, “pipeline_tag”, “private”, “safetensors”, “sha”, “siblings”, “spaces”, “tags”, “transformersInfo”, “trendingScore”, “widgetData”, “usedStorage”, and “resourceGroup”.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
The model repository information.
- Return type:
[huggingface_hub.hf_api.ModelInfo]
> [!TIP]
> Raises the following errors:
>
> - [~utils.RepositoryNotFoundError] If the repository to download from cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.
> - [~utils.RevisionNotFoundError] If the revision to download from cannot be found.
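The parameter list above states that `expand` cannot be combined with `securityStatus` or `files_metadata`. One way to catch that early in calling code is to validate the combination before making the request. The following is only a sketch: `model_info_kwargs` is a hypothetical helper, and treating "passed" as "truthy" is an assumption of this example rather than documented behaviour.

```python
from typing import Any, Dict, List, Optional


def model_info_kwargs(
    expand: Optional[List[str]] = None,
    security_status: Optional[bool] = None,
    files_metadata: bool = False,
) -> Dict[str, Any]:
    """Build keyword arguments for `model_info`, rejecting incompatible options."""
    # Assumption: an option counts as "passed" when it is truthy.
    if expand and (security_status or files_metadata):
        raise ValueError("`expand` cannot be combined with `securityStatus` or `files_metadata`.")
    kwargs: Dict[str, Any] = {}
    if expand:
        kwargs["expand"] = list(expand)
    else:
        if security_status is not None:
            kwargs["securityStatus"] = security_status
        kwargs["files_metadata"] = files_metadata
    return kwargs
```

The result could then be unpacked into a call such as `api.model_info(repo_id, **model_info_kwargs(expand=["downloads", "likes"]))`.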
- move_repo(from_id, to_id, *, repo_type=None, token=None)[source]#
Move a repository from namespace1/repo_name1 to namespace2/repo_name2.
Note there are certain limitations. For more information about moving repositories, please see https://hf.co/docs/hub/repositories-settings#renaming-or-transferring-a-repo.
- Parameters:
from_id (str) – A namespace (user or an organization) and a repo name separated by a /. Original repository identifier.
to_id (str) – A namespace (user or an organization) and a repo name separated by a /. Final repository identifier.
repo_type (str, optional) – Set to “dataset” or “space” if uploading to a dataset or space, None or “model” if uploading to a model. Default is None.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
> [!TIP]
> Raises the following errors:
>
> - [~utils.RepositoryNotFoundError] If the repository to download from cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.
- paper_info(id)[source]#
Get information for a paper on the Hub.
- Parameters:
id (str, optional) – ArXiv id of the paper.
- Returns:
A PaperInfo object.
- Return type:
PaperInfo
- Raises:
[HfHubHTTPError] – HTTP 404 If the paper does not exist on the Hub.
- parse_safetensors_file_metadata(repo_id, filename, *, repo_type=None, revision=None, token=None)[source]#
Parse metadata from a safetensors file on the Hub.
To parse metadata from all safetensors files in a repo at once, use [get_safetensors_metadata].
For more details regarding the safetensors format, check out https://huggingface.co/docs/safetensors/index#format.
- Parameters:
repo_id (str) – A user or an organization name and a repo name separated by a /.
filename (str) – The name of the file in the repo.
repo_type (str, optional) – Set to “dataset” or “space” if the file is in a dataset or space, None or “model” if in a model. Default is None.
revision (str, optional) – The git revision to fetch the file from. Can be a branch name, a tag, or a commit hash. Defaults to the head of the “main” branch.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
information related to a safetensors file.
- Return type:
[SafetensorsFileMetadata]
- Raises:
[NotASafetensorsRepoError] – If the repo is not a safetensors repo i.e. doesn’t have either a model.safetensors or a model.safetensors.index.json file.
[SafetensorsParsingError] – If a safetensors file header couldn’t be parsed correctly.
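For intuition about what is being parsed: a safetensors file starts with an 8-byte little-endian unsigned integer giving the header length, followed by that many bytes of UTF-8 JSON describing each tensor (`dtype`, `shape`, `data_offsets`) plus an optional `__metadata__` entry. The sketch below parses such a header from raw bytes. It illustrates the file format only and is not a description of how the library implements this method; both function names are hypothetical.

```python
import json
import struct


def parse_safetensors_header(data: bytes) -> dict:
    """Decode the JSON header at the start of a safetensors byte string."""
    if len(data) < 8:
        raise ValueError("Need at least 8 bytes for the header length prefix.")
    # First 8 bytes: header length as a little-endian unsigned 64-bit integer.
    (header_len,) = struct.unpack("<Q", data[:8])
    header_bytes = data[8 : 8 + header_len]
    if len(header_bytes) != header_len:
        raise ValueError("Header is truncated.")
    return json.loads(header_bytes.decode("utf-8"))


def count_parameters(header: dict) -> int:
    """Sum the element counts of all tensors described in a parsed header."""
    total = 0
    for name, info in header.items():
        if name == "__metadata__":  # free-form metadata, not a tensor
            continue
        elements = 1
        for dim in info["shape"]:
            elements *= dim
        total += elements
    return total
```

Only the first `8 + header_len` bytes of a file are needed for this, so the tensor data itself does not have to be read.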
- pause_inference_endpoint(name, *, namespace=None, token=None)[source]#
Pause an Inference Endpoint.
A paused Inference Endpoint will not be charged. It can be resumed at any time using [resume_inference_endpoint]. This is different than scaling the Inference Endpoint to zero with [scale_to_zero_inference_endpoint], which would be automatically restarted when a request is made to it.
For convenience, you can also pause an Inference Endpoint using [InferenceEndpoint.pause].
- Parameters:
name (str) – The name of the Inference Endpoint to pause.
namespace (str, optional) – The namespace in which the Inference Endpoint is located. Defaults to the current user.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
information about the paused Inference Endpoint.
- Return type:
[InferenceEndpoint]
- pause_space(repo_id, *, token=None)[source]#
Pause your Space.
A paused Space stops executing until manually restarted by its owner. This is different from the sleeping state that free Spaces enter after 48h of inactivity. Paused time is not billed to your account, no matter the hardware you’ve selected. To restart your Space, use [restart_space] or go to your Space settings page.
For more details, please visit [the docs](https://huggingface.co/docs/hub/spaces-gpus#pause).
- Parameters:
repo_id (str) – ID of the Space to pause. Example: “Salesforce/BLIP2”.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
Runtime information about your Space including stage=PAUSED and requested hardware.
- Return type:
[SpaceRuntime]
- Raises:
[RepositoryNotFoundError] – If your Space is not found (error 404). Most probably wrong repo_id or your space is private but you are not authenticated.
[HfHubHTTPError] – 403 Forbidden: only the owner of a Space can pause it. If you want to manage a Space that you don’t own, either ask the owner by opening a Discussion or duplicate the Space.
[BadRequestError] – If your Space is a static Space. Static Spaces are always running and never billed. If you want to hide a static Space, you can set it to private.
- permanently_delete_lfs_files(repo_id, lfs_files, *, rewrite_history=True, repo_type=None, token=None)[source]#
Permanently delete LFS files from a repo on the Hub.
> [!WARNING]
> This is a permanent action that will affect all commits referencing the deleted files and might corrupt your repository. This is a non-revertible operation. Use it only if you know what you are doing.
- Parameters:
repo_id (str) – The repository from which you are deleting LFS files.
lfs_files (Iterable[LFSFileInfo]) – An iterable of [LFSFileInfo] items to permanently delete from the repo. Use [list_lfs_files] to list all LFS files from a repo.
rewrite_history (bool, optional, defaults to True) – Whether to rewrite repository history to remove file pointers referencing the deleted LFS files (recommended).
repo_type (str, optional) – Type of repository. Set to “dataset” or “space” if listing from a dataset or space, None or “model” if listing from a model. Default is None.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
Example
```py
>>> from huggingface_hub import HfApi
>>> api = HfApi()
>>> lfs_files = api.list_lfs_files("username/my-cool-repo")

# Filter files to delete based on a combination of `filename`, `pushed_at`, `ref` or `size`.
# e.g. select only LFS files in the "checkpoints" folder
>>> lfs_files_to_delete = (lfs_file for lfs_file in lfs_files if lfs_file.filename.startswith("checkpoints/"))

# Permanently delete LFS files
>>> api.permanently_delete_lfs_files("username/my-cool-repo", lfs_files_to_delete)
```
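Because deletion is irreversible, it can help to build and inspect the selection separately before passing it on. The sketch below combines two of the attributes mentioned in the example (`filename` and `size`); `select_lfs_files` is a hypothetical helper and the thresholds are arbitrary.

```python
from typing import Iterable, List


def select_lfs_files(lfs_files: Iterable, prefix: str, min_size: int = 0) -> List:
    """Return LFS entries under `prefix` whose size is at least `min_size` bytes."""
    # Assumption: each entry exposes `.filename` and `.size`, as referenced above.
    return [
        lfs_file
        for lfs_file in lfs_files
        if lfs_file.filename.startswith(prefix) and lfs_file.size >= min_size
    ]
```

Returning a list rather than a generator makes it possible to print or count the candidates first, and only then call `permanently_delete_lfs_files` with them.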
- preupload_lfs_files(repo_id, additions, *, token=None, repo_type=None, revision=None, create_pr=None, num_threads=5, free_memory=True, gitignore_content=None)[source]#
Pre-upload LFS files to S3 in preparation for a future commit.
This method is useful if you are generating the files to upload on-the-fly and you don’t want to store them in memory before uploading them all at once.
> [!WARNING]
> This is a power-user method. You shouldn’t need to call it directly to make a normal commit. Use [create_commit] directly instead.

> [!WARNING]
> Commit operations will be mutated during the process. In particular, the attached path_or_fileobj will be removed after the upload to save memory (and replaced by an empty bytes object). Do not reuse the same objects except to pass them to [create_commit]. If you don’t want to remove the attached content from the commit operation object, pass free_memory=False.
- Parameters:
repo_id (str) – The repository in which you will commit the files, for example: “username/custom_transformers”.
additions (Iterable of [CommitOperationAdd]) – The list of files to upload. Warning: the objects in this list will be mutated to include information relative to the upload. Do not reuse the same objects for multiple commits.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
repo_type (str, optional) – The type of repository to upload to (e.g. “model” -default-, “dataset” or “space”).
revision (str, optional) – The git revision to commit from. Defaults to the head of the “main” branch.
create_pr (boolean, optional) – Whether or not you plan to create a Pull Request with that commit. Defaults to False.
num_threads (int, optional) – Number of concurrent threads for uploading files. Defaults to 5. Setting it to 2 means at most 2 files will be uploaded concurrently.
gitignore_content (str, optional) – The content of the .gitignore file to know which files should be ignored. The order of priority is to first check if gitignore_content is passed, then check if the .gitignore file is present in the list of files to commit and finally default to the .gitignore file already hosted on the Hub (if any).
Example:
```py
>>> from huggingface_hub import CommitOperationAdd, preupload_lfs_files, create_commit, create_repo

>>> repo_id = create_repo("test_preupload").repo_id

# Generate and preupload LFS files one by one
>>> operations = [] # List of all CommitOperationAdd objects that will be generated
>>> for i in range(5):
...     content = ... # generate binary content
...     addition = CommitOperationAdd(path_in_repo=f"shard_{i}_of_5.bin", path_or_fileobj=content)
...     preupload_lfs_files(repo_id, additions=[addition]) # upload + free memory
...     operations.append(addition)

# Create commit
>>> create_commit(repo_id, operations=operations, commit_message="Commit all shards")
```
- reject_access_request(repo_id, user, *, repo_type=None, rejection_reason, token=None)[source]#
Reject an access request from a user for a given gated repo.
A rejected request will go to the rejected list. The user cannot download any file of the repo. Rejected requests can be accepted or cancelled at any time using [accept_access_request] and [cancel_access_request]. A cancelled request will go back to the pending list while an accepted request will go to the accepted list.
For more info about gated repos, see https://huggingface.co/docs/hub/models-gated.
- Parameters:
repo_id (str) – The id of the repo to reject access request for.
user (str) – The username of the user whose access request should be rejected.
repo_type (str, optional) – The type of the repo to reject access request for. Must be one of model, dataset or space. Defaults to model.
rejection_reason (str, optional) – Optional rejection reason that will be visible to the user (max 200 characters).
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Raises:
[HfHubHTTPError] – HTTP 400 if the repo is not gated.
[HfHubHTTPError] – HTTP 403 if you only have read-only access to the repo. This can be the case if you don’t have write or admin role in the organization the repo belongs to or if you passed a read token.
[HfHubHTTPError] – HTTP 404 if the user does not exist on the Hub.
[HfHubHTTPError] – HTTP 404 if the user access request cannot be found.
[HfHubHTTPError] – HTTP 404 if the user access request is already in the rejected list.
- rename_discussion(repo_id, discussion_num, new_title, *, token=None, repo_type=None)[source]#
Renames a Discussion.
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
discussion_num (int) – The number of the Discussion or Pull Request. Must be a strictly positive integer.
new_title (str) – The new title for the discussion
repo_type (str, optional) – Set to “dataset” or “space” if uploading to a dataset or space, None or “model” if uploading to a model. Default is None.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
the title change event
- Return type:
[DiscussionTitleChange]
Examples
```python
>>> new_title = "New title, fixing a typo"
>>> HfApi().rename_discussion(
...     repo_id="username/repo_name",
...     discussion_num=34,
...     new_title=new_title
... )
# DiscussionTitleChange(id='deadbeef0000000', type='title-change', ...)
```
> [!TIP]
> Raises the following errors:
>
> - [HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError) if the HuggingFace API returned an error
> - [ValueError](https://docs.python.org/3/library/exceptions.html#ValueError) if some parameter value is invalid
> - [~utils.RepositoryNotFoundError] If the repository to download from cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.
- repo_exists(repo_id, *, repo_type=None, token=None)[source]#
Checks if a repository exists on the Hugging Face Hub.
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
repo_type (str, optional) – Set to “dataset” or “space” if getting repository info from a dataset or a space, None or “model” if getting repository info from a model. Default is None.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
True if the repository exists, False otherwise.
- Return type:
bool
Examples
```py
>>> from huggingface_hub import repo_exists
>>> repo_exists("google/gemma-7b")
True
>>> repo_exists("google/not-a-repo")
False
```
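A boolean check like this is handy for picking the first available repository from a list of candidates, for example a preferred checkpoint followed by fallbacks. The sketch below assumes `api` is any object exposing `repo_exists` as documented; `first_existing_repo` is a hypothetical helper.

```python
from typing import Iterable, Optional


def first_existing_repo(api, candidates: Iterable[str], repo_type: Optional[str] = None) -> Optional[str]:
    """Return the first candidate repo id that exists, or None if none do."""
    for repo_id in candidates:
        # One existence check per candidate; stops at the first hit.
        if api.repo_exists(repo_id, repo_type=repo_type):
            return repo_id
    return None
```

Candidates are checked in order, so later entries are not queried once a match is found.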
- repo_info(repo_id, *, revision=None, repo_type=None, timeout=None, files_metadata=False, expand=None, token=None)[source]#
Get the info object for a given repo of a given type.
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
revision (str, optional) – The revision of the repository from which to get the information.
repo_type (str, optional) – Set to “dataset” or “space” if getting repository info from a dataset or a space, None or “model” if getting repository info from a model. Default is None.
timeout (float, optional) – Whether to set a timeout for the request to the Hub.
expand (ExpandModelProperty_T or ExpandDatasetProperty_T or ExpandSpaceProperty_T, optional) – List properties to return in the response. When used, only the properties in the list will be returned. This parameter cannot be used if files_metadata is passed. For an exhaustive list of available properties, check out [model_info], [dataset_info] or [space_info].
files_metadata (bool, optional) – Whether or not to retrieve metadata for files in the repository (size, LFS metadata, etc). Defaults to False.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
The repository information, as a [huggingface_hub.hf_api.DatasetInfo], [huggingface_hub.hf_api.ModelInfo] or [huggingface_hub.hf_api.SpaceInfo] object.
- Return type:
Union[SpaceInfo, DatasetInfo, ModelInfo]
> [!TIP]
> Raises the following errors:
>
> - [~utils.RepositoryNotFoundError] If the repository to download from cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.
> - [~utils.RevisionNotFoundError] If the revision to download from cannot be found.
- request_space_hardware(repo_id, hardware, *, token=None, sleep_time=None)[source]#
Request new hardware for a Space.
- Parameters:
repo_id (str) – ID of the repo to update. Example: “bigcode/in-the-stack”.
hardware (str or [SpaceHardware]) – Hardware on which to run the Space. Example: “t4-medium”.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
sleep_time (int, optional) – Number of seconds of inactivity to wait before a Space is put to sleep. Set to -1 if you don’t want your Space to sleep (default behavior for upgraded hardware). For free hardware, you can’t configure the sleep time (value is fixed to 48 hours of inactivity). See https://huggingface.co/docs/hub/spaces-gpus#sleep-time for more details.
- Returns:
Runtime information about a Space including Space stage and hardware.
- Return type:
[SpaceRuntime]
> [!TIP] > It is also possible to request hardware directly when creating the Space repo! See [create_repo] for details.
- request_space_storage(repo_id, storage, *, token=None)[source]#
Request persistent storage for a Space.
- Parameters:
repo_id (str) – ID of the Space to update. Example: “open-llm-leaderboard/open_llm_leaderboard”.
storage (str or [SpaceStorage]) – Storage tier. Either ‘small’, ‘medium’, or ‘large’.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
Runtime information about a Space including Space stage and hardware.
- Return type:
[SpaceRuntime]
> [!TIP] > It is not possible to decrease persistent storage after it has been granted. To do so, you must delete it via [delete_space_storage].
- restart_space(repo_id, *, token=None, factory_reboot=False)[source]#
Restart your Space.
This is the only way to programmatically restart a Space that you have paused (see [pause_space]). You must be the owner of the Space to restart it. If you are using upgraded hardware, your account will be billed as soon as the Space is restarted. You can trigger a restart regardless of the current state of the Space.
For more details, please visit [the docs](https://huggingface.co/docs/hub/spaces-gpus#pause).
- Parameters:
repo_id (str) – ID of the Space to restart. Example: “Salesforce/BLIP2”.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
factory_reboot (bool, optional) – If True, the Space will be rebuilt from scratch without caching any requirements.
- Returns:
Runtime information about your Space.
- Return type:
[SpaceRuntime]
- Raises:
[RepositoryNotFoundError] – If your Space is not found (error 404). Most probably wrong repo_id or your space is private but you are not authenticated.
[HfHubHTTPError] – 403 Forbidden: only the owner of a Space can restart it. If you want to restart a Space that you don’t own, either ask the owner by opening a Discussion or duplicate the Space.
[BadRequestError] – If your Space is a static Space. Static Spaces are always running and never billed. If you want to hide a static Space, you can set it to private.
- resume_inference_endpoint(name, *, namespace=None, running_ok=True, token=None)[source]#
Resume an Inference Endpoint.
For convenience, you can also resume an Inference Endpoint using [InferenceEndpoint.resume].
- Parameters:
name (str) – The name of the Inference Endpoint to resume.
namespace (str, optional) – The namespace in which the Inference Endpoint is located. Defaults to the current user.
running_ok (bool, optional) – If True, the method will not raise an error if the Inference Endpoint is already running. Defaults to True.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
Information about the resumed Inference Endpoint.
- Return type:
[InferenceEndpoint]
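As a rough sketch of how resuming might be combined with waiting for deployment: the helper name `resume_and_wait` is hypothetical, and `api` is assumed to be an `HfApi` instance (or any object exposing the same method). It relies on the `wait` method and `url` attribute of the returned [InferenceEndpoint].

```python
def resume_and_wait(api, name, namespace=None, timeout=300):
    """Resume an Inference Endpoint and block until it is deployed.

    `api` is assumed to be a huggingface_hub.HfApi instance.
    """
    # running_ok=True means no error is raised if the endpoint is already running.
    endpoint = api.resume_inference_endpoint(name, namespace=namespace, running_ok=True)
    # wait() polls the endpoint status until it is deployed or the timeout (seconds) expires.
    endpoint.wait(timeout=timeout)
    return endpoint.url
```

A call such as `resume_and_wait(HfApi(), "my-endpoint")` would return the endpoint URL once the deployment is ready; the endpoint name here is a placeholder.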
- resume_scheduled_job(*, scheduled_job_id, namespace=None, token=None)[source]#
Resume (unpause) a scheduled compute Job on Hugging Face infrastructure.
- Parameters:
scheduled_job_id (str) – ID of the scheduled Job.
namespace (str, optional) – The namespace where the scheduled Job is. Defaults to the current user’s namespace.
token (bool or str, optional) – A valid user access token. If not provided, the locally saved token will be used, which is the recommended authentication method. Set to False to disable authentication. Refer to: https://huggingface.co/docs/huggingface_hub/quick-start#authentication.
- revision_exists(repo_id, revision, *, repo_type=None, token=None)[source]#
Checks if a specific revision exists on a repo on the Hugging Face Hub.
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
revision (str) – The revision of the repository to check.
repo_type (str, optional) – Set to “dataset” or “space” if getting repository info from a dataset or a space, None or “model” if getting repository info from a model. Default is None.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
True if the repository and the revision exist, False otherwise.
- Return type:
bool
Examples
```py
>>> from huggingface_hub import revision_exists
>>> revision_exists("google/gemma-7b", "float16")
True
>>> revision_exists("google/gemma-7b", "not-a-revision")
False
```
- run_as_future(fn, *args, **kwargs)[source]#
Run a method in the background and return a Future instance.
The main goal is to run methods without blocking the main thread (e.g. to push data during training). Background jobs are queued to preserve order but are not run in parallel. If you need to speed up your scripts by parallelizing many calls to the API, you must set up and use your own [ThreadPoolExecutor](https://docs.python.org/3/library/concurrent.futures.html#threadpoolexecutor).
Note: Most-used methods like [upload_file], [upload_folder] and [create_commit] have a run_as_future: bool argument to directly call them in the background. This is equivalent to calling api.run_as_future(…) on them but less verbose.
- Parameters:
fn (Callable) – The method to run in the background.
*args – Arguments with which the method will be called.
**kwargs – Keyword arguments with which the method will be called.
- Returns:
A [Future](https://docs.python.org/3/library/concurrent.futures.html#future-objects) instance to get the result of the task.
- Return type:
Future
Example
```py
>>> from huggingface_hub import HfApi
>>> api = HfApi()
>>> future = api.run_as_future(api.whoami)  # instant
>>> future.done()
False
>>> future.result()  # wait until complete and return result
(...)
>>> future.done()
True
```
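Because background jobs queued with run_as_future are not run in parallel, parallel calls need a separate executor. A minimal sketch using only the standard library follows; the helper name `call_in_parallel` is hypothetical, and `fn` stands for any API method that takes a single argument.

```python
from concurrent.futures import ThreadPoolExecutor


def call_in_parallel(fn, arg_list, max_workers=4):
    """Call `fn` once per item of `arg_list` using a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() runs the calls concurrently and yields results in input order.
        return list(pool.map(fn, arg_list))
```

For instance, `call_in_parallel(api.model_info, ["gpt2", "google/gemma-7b"])` would fetch both model descriptions concurrently, assuming `api` is an `HfApi` instance.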
- run_job(*, image, command, env=None, secrets=None, flavor=None, timeout=None, namespace=None, token=None)[source]#
Run compute Jobs on Hugging Face infrastructure.
- Parameters:
image (str) – The Docker image to use. Examples: “ubuntu”, “python:3.12”, “pytorch/pytorch:2.6.0-cuda12.4-cudnn9-devel”. Example with an image from a Space: “hf.co/spaces/lhoestq/duckdb”.
command (list[str]) – The command to run. Example: [“echo”, “hello”].
env (dict[str, Any], optional) – Defines the environment variables for the Job.
secrets (dict[str, Any], optional) – Defines the secret environment variables for the Job.
flavor (str, optional) – Flavor for the hardware, as in Hugging Face Spaces. See [SpaceHardware] for possible values. Defaults to “cpu-basic”.
timeout (Union[int, float, str], optional) – Max duration for the Job: int/float with s (seconds, default), m (minutes), h (hours) or d (days). Example: 300 or “5m” for 5 minutes.
namespace (str, optional) – The namespace where the Job will be created. Defaults to the current user’s namespace.
token (bool or str, optional) – A valid user access token. If not provided, the locally saved token will be used, which is the recommended authentication method. Set to False to disable authentication. Refer to: https://huggingface.co/docs/huggingface_hub/quick-start#authentication.
- Return type:
JobInfo
Example
Run your first Job:
```python
>>> from huggingface_hub import run_job
>>> run_job(image="python:3.12", command=["python", "-c", "print('Hello from HF compute!')"])
```
Run a GPU Job:
```python
>>> from huggingface_hub import run_job
>>> image = "pytorch/pytorch:2.6.0-cuda12.4-cudnn9-devel"
>>> command = ["python", "-c", "import torch; print(f'This code ran with the following GPU: {torch.cuda.get_device_name()}')"]
>>> run_job(image=image, command=command, flavor="a10g-small")
```
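To make the documented timeout format concrete, the sketch below converts a value such as 300 or "5m" into seconds. The helper `timeout_to_seconds` is hypothetical and only illustrates the format described for the timeout parameter; it is not the parser used by the library.

```python
# Multipliers for the documented unit suffixes.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def timeout_to_seconds(timeout):
    """Convert a Job timeout (number, or string with s/m/h/d suffix) to seconds."""
    if isinstance(timeout, (int, float)):
        # Bare numbers are interpreted as seconds (the documented default unit).
        return float(timeout)
    text = timeout.strip()
    if text and text[-1] in _UNIT_SECONDS:
        return float(text[:-1]) * _UNIT_SECONDS[text[-1]]
    # A numeric string without a suffix is also treated as seconds.
    return float(text)
```

With this reading, `timeout_to_seconds("5m")` and `timeout_to_seconds(300)` describe the same duration.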
- run_uv_job(script, *, script_args=None, dependencies=None, python=None, image=None, env=None, secrets=None, flavor=None, timeout=None, namespace=None, token=None)[source]#
Run a UV script Job on Hugging Face infrastructure.
- Parameters:
script (str) – Path or URL of the UV script, or a command.
script_args (list[str], optional) – Arguments to pass to the script or command.
dependencies (list[str], optional) – Dependencies to use to run the UV script.
python (str, optional) – Use a specific Python version. Default is 3.12.
image (str, optional, defaults to “ghcr.io/astral-sh/uv:python3.12-bookworm”) – Use a custom Docker image with uv installed.
env (dict[str, Any], optional) – Defines the environment variables for the Job.
secrets (dict[str, Any], optional) – Defines the secret environment variables for the Job.
flavor (str, optional) – Flavor for the hardware, as in Hugging Face Spaces. See [SpaceHardware] for possible values. Defaults to “cpu-basic”.
timeout (Union[int, float, str], optional) – Max duration for the Job: int/float with s (seconds, default), m (minutes), h (hours) or d (days). Example: 300 or “5m” for 5 minutes.
namespace (str, optional) – The namespace where the Job will be created. Defaults to the current user’s namespace.
token (bool or str, optional) – A valid user access token. If not provided, the locally saved token will be used, which is the recommended authentication method. Set to False to disable authentication. Refer to: https://huggingface.co/docs/huggingface_hub/quick-start#authentication.
- Return type:
JobInfo
Example
Run a script from a URL:
```python
>>> from huggingface_hub import run_uv_job
>>> script = "https://raw.githubusercontent.com/huggingface/trl/refs/heads/main/trl/scripts/sft.py"
>>> script_args = ["--model_name_or_path", "Qwen/Qwen2-0.5B", "--dataset_name", "trl-lib/Capybara", "--push_to_hub"]
>>> run_uv_job(script, script_args=script_args, dependencies=["trl"], flavor="a10g-small")
```
Run a local script:
```python
>>> from huggingface_hub import run_uv_job
>>> script = "my_sft.py"
>>> script_args = ["--model_name_or_path", "Qwen/Qwen2-0.5B", "--dataset_name", "trl-lib/Capybara", "--push_to_hub"]
>>> run_uv_job(script, script_args=script_args, dependencies=["trl"], flavor="a10g-small")
```
Run a command:
```python
>>> from huggingface_hub import run_uv_job
>>> script = "lighteval"
>>> script_args = ["endpoint", "inference-providers", "model_name=openai/gpt-oss-20b,provider=auto", "lighteval|gsm8k|0|0"]
>>> run_uv_job(script, script_args=script_args, dependencies=["lighteval"], flavor="a10g-small")
```
- scale_to_zero_inference_endpoint(name, *, namespace=None, token=None)[source]#
Scale Inference Endpoint to zero.
An Inference Endpoint scaled to zero will not be charged. It will be resumed on the next request to it, with a cold start delay. This is different from pausing the Inference Endpoint with [pause_inference_endpoint], which would require a manual resume with [resume_inference_endpoint].
For convenience, you can also scale an Inference Endpoint to zero using [InferenceEndpoint.scale_to_zero].
- Parameters:
name (str) – The name of the Inference Endpoint to scale to zero.
namespace (str, optional) – The namespace in which the Inference Endpoint is located. Defaults to the current user.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
Information about the scaled-to-zero Inference Endpoint.
- Return type:
[InferenceEndpoint]
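The difference between scaling to zero and pausing can be sketched as a single switch. The helper name `idle_endpoint` and its `auto_resume` flag are hypothetical; `api` is assumed to be an `HfApi` instance.

```python
def idle_endpoint(api, name, auto_resume=True, namespace=None):
    """Stop paying for an idle Inference Endpoint.

    `api` is assumed to be a huggingface_hub.HfApi instance.
    """
    if auto_resume:
        # Scaled to zero: not charged, resumed by the next request (with a cold start).
        return api.scale_to_zero_inference_endpoint(name, namespace=namespace)
    # Paused: stays down until resume_inference_endpoint is called manually.
    return api.pause_inference_endpoint(name, namespace=namespace)
```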
- set_space_sleep_time(repo_id, sleep_time, *, token=None)[source]#
Set a custom sleep time for a Space running on upgraded hardware.
Your Space will go to sleep after X seconds of inactivity. You are not billed when your Space is in “sleep” mode. If a new visitor lands on your Space, it will “wake it up”. Only upgraded hardware can have a configurable sleep time. To know more about the sleep stage, please refer to https://huggingface.co/docs/hub/spaces-gpus#sleep-time.
- Parameters:
repo_id (str) – ID of the repo to update. Example: “bigcode/in-the-stack”.
sleep_time (int) – Number of seconds of inactivity to wait before a Space is put to sleep. Set to -1 if you don’t want your Space to sleep (default behavior for upgraded hardware). For free hardware, you can’t configure the sleep time (value is fixed to 48 hours of inactivity). See https://huggingface.co/docs/hub/spaces-gpus#sleep-time for more details.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
Runtime information about a Space including Space stage and hardware.
- Return type:
[SpaceRuntime]
> [!TIP] > It is also possible to set a custom sleep time when requesting hardware with [request_space_hardware].
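Since sleep_time is expressed in seconds and -1 disables sleeping, a small wrapper can make call sites easier to read. This is a sketch only: the helper name `set_sleep_after` and its `minutes` argument are hypothetical, and `api` is assumed to be an `HfApi` instance.

```python
def set_sleep_after(api, repo_id, minutes=None):
    """Set the Space sleep time from a value in minutes; None means never sleep."""
    if minutes is None:
        # -1 is the documented value for a Space that never goes to sleep.
        sleep_time = -1
    else:
        if minutes <= 0:
            raise ValueError("minutes must be positive, or None to disable sleeping")
        sleep_time = int(minutes * 60)
    return api.set_space_sleep_time(repo_id, sleep_time)
```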
- snapshot_download(repo_id, *, repo_type=None, revision=None, cache_dir=None, local_dir=None, etag_timeout=10, force_download=False, token=None, local_files_only=False, allow_patterns=None, ignore_patterns=None, max_workers=8, tqdm_class=None, dry_run=False)[source]#
Download repo files.
Download a whole snapshot of a repo’s files at the specified revision. This is useful when you want all files from a repo, because you don’t know which ones you will need a priori. All files are nested inside a folder in order to keep their actual filename relative to that folder. You can also filter which files to download using allow_patterns and ignore_patterns.
If local_dir is provided, the file structure from the repo will be replicated in this location. When using this option, the cache_dir will not be used and a .cache/huggingface/ folder will be created at the root of local_dir to store some metadata related to the downloaded files. While this mechanism is not as robust as the main cache system, it’s optimized for regularly pulling the latest version of a repository.
An alternative would be to clone the repo but this requires git and git-lfs to be installed and properly configured. It is also not possible to filter which files to download when cloning a repository using git.
- Parameters:
repo_id (str) – A user or an organization name and a repo name separated by a /.
repo_type (str, optional) – Set to “dataset” or “space” if downloading from a dataset or space, None or “model” if downloading from a model. Default is None.
revision (str, optional) – An optional Git revision id which can be a branch name, a tag, or a commit hash.
cache_dir (str or Path, optional) – Path to the folder where cached files are stored.
local_dir (str or Path, optional) – If provided, the downloaded files will be placed under this directory.
etag_timeout (float, optional, defaults to 10) – How many seconds to wait for the server to send data when fetching the ETag before giving up. The value is passed to httpx.request.
force_download (bool, optional, defaults to False) – Whether the file should be downloaded even if it already exists in the local cache.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
local_files_only (bool, optional, defaults to False) – If True, avoid downloading the file and return the path to the local cached file if it exists.
allow_patterns (list[str] or str, optional) – If provided, only files matching at least one pattern are downloaded.
ignore_patterns (list[str] or str, optional) – If provided, files matching any of the patterns are not downloaded.
max_workers (int, optional) – Number of concurrent threads to download files (1 thread = 1 file download). Defaults to 8.
tqdm_class (tqdm, optional) – If provided, overwrites the default behavior for the progress bar. Passed argument must inherit from tqdm.auto.tqdm or at least mimic its behavior. Note that the tqdm_class is not passed to each individual download. Defaults to the custom HF progress bar that can be disabled by setting HF_HUB_DISABLE_PROGRESS_BARS environment variable.
dry_run (bool, optional, defaults to False) – If True, perform a dry run without actually downloading the files. Returns a list of [DryRunFileInfo] objects containing information about what would be downloaded.
- Returns:
If dry_run=False: Folder path of the repo snapshot.
If dry_run=True: A list of [DryRunFileInfo] objects containing download information.
- Return type:
str or list of [DryRunFileInfo]
- Raises:
[RepositoryNotFoundError] – If the repository to download from cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.
[RevisionNotFoundError] – If the revision to download from cannot be found.
[EnvironmentError](https://docs.python.org/3/library/exceptions.html#EnvironmentError) – If token=True and the token cannot be found.
[OSError](https://docs.python.org/3/library/exceptions.html#OSError) – If the ETag cannot be determined.
[ValueError](https://docs.python.org/3/library/exceptions.html#ValueError) – If some parameter value is invalid.
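The interaction between allow_patterns and ignore_patterns can be previewed offline. The sketch below assumes shell-style wildcard matching (as in Python's `fnmatch` module), which is an assumption about how the patterns are interpreted; `preview_selection` and `download_weights_only` are hypothetical helper names.

```python
from fnmatch import fnmatchcase


def preview_selection(paths, allow_patterns=None, ignore_patterns=None):
    """Approximate which repo files the two pattern filters would keep."""
    def as_list(patterns):
        # Both parameters accept a single string or a list of strings.
        return [patterns] if isinstance(patterns, str) else patterns

    allow, ignore = as_list(allow_patterns), as_list(ignore_patterns)
    kept = []
    for path in paths:
        # A file must match at least one allow pattern (when any are given)...
        if allow is not None and not any(fnmatchcase(path, p) for p in allow):
            continue
        # ...and must not match any ignore pattern.
        if ignore is not None and any(fnmatchcase(path, p) for p in ignore):
            continue
        kept.append(path)
    return kept


def download_weights_only(repo_id, local_dir=None):
    """Download only weight and JSON config files (requires network access)."""
    from huggingface_hub import snapshot_download  # imported lazily

    return snapshot_download(
        repo_id,
        allow_patterns=["*.safetensors", "*.json"],
        ignore_patterns="*.md",
        local_dir=local_dir,
    )
```

Passing dry_run=True to snapshot_download is the documented way to see what would actually be downloaded; the local preview is only a quick approximation.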
- space_info(repo_id, *, revision=None, timeout=None, files_metadata=False, expand=None, token=None)[source]#
Get info on one specific Space on huggingface.co.
Space can be private if you pass an acceptable token.
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
revision (str, optional) – The revision of the space repository from which to get the information.
timeout (float, optional) – Timeout, in seconds, for the request to the Hub.
files_metadata (bool, optional) – Whether or not to retrieve metadata for files in the repository (size, LFS metadata, etc). Defaults to False.
expand (list[ExpandSpaceProperty_T], optional) – List properties to return in the response. When used, only the properties in the list will be returned. This parameter cannot be used if files_metadata is passed. Possible values are “author”, “cardData”, “createdAt”, “datasets”, “disabled”, “lastModified”, “likes”, “models”, “private”, “runtime”, “sdk”, “siblings”, “sha”, “subdomain”, “tags”, “trendingScore”, “usedStorage”, and “resourceGroup”.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
The space repository information.
- Return type:
[~hf_api.SpaceInfo]
- Raises:
[RepositoryNotFoundError] – If the repository to download from cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.
[RevisionNotFoundError] – If the revision to download from cannot be found.
- super_squash_history(repo_id, *, branch=None, commit_message=None, repo_type=None, token=None)[source]#
Squash commit history on a branch for a repo on the Hub.
Squashing the repo history is useful when you know you’ll make hundreds of commits and you don’t want to clutter the history. Squashing commits can only be performed from the head of a branch.
> [!WARNING] > Once squashed, the commit history cannot be retrieved. This is a non-revertible operation.
> [!WARNING] > Once the history of a branch has been squashed, it is not possible to merge it back into another branch since their history will have diverged.
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
branch (str, optional) – The branch to squash. Defaults to the head of the “main” branch.
commit_message (str, optional) – The commit message to use for the squashed commit.
repo_type (str, optional) – Set to “dataset” or “space” if squashing the history of a dataset or a Space, None or “model” if squashing the history of a model. Default is None.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Raises:
[RepositoryNotFoundError] – If repository is not found (error 404): wrong repo_id/repo_type, private but not authenticated or repo does not exist.
[RevisionNotFoundError] – If the branch to squash cannot be found.
[BadRequestError] – If invalid reference for a branch. You cannot squash history on tags.
Example:
```py
>>> from huggingface_hub import HfApi
>>> api = HfApi()

# Create repo
>>> repo_id = api.create_repo("test-squash").repo_id

# Make a lot of commits.
>>> api.upload_file(repo_id=repo_id, path_in_repo="file.txt", path_or_fileobj=b"content")
>>> api.upload_file(repo_id=repo_id, path_in_repo="lfs.bin", path_or_fileobj=b"content")
>>> api.upload_file(repo_id=repo_id, path_in_repo="file.txt", path_or_fileobj=b"another_content")

# Squash history
>>> api.super_squash_history(repo_id=repo_id)
```
- suspend_scheduled_job(*, scheduled_job_id, namespace=None, token=None)[source]#
Suspend (pause) a scheduled compute Job on Hugging Face infrastructure.
- Parameters:
scheduled_job_id (str) – ID of the scheduled Job.
namespace (str, optional) – The namespace where the scheduled Job is. Defaults to the current user’s namespace.
token (bool or str, optional) – A valid user access token. If not provided, the locally saved token will be used, which is the recommended authentication method. Set to False to disable authentication. Refer to: https://huggingface.co/docs/huggingface_hub/quick-start#authentication.
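Suspending and resuming a scheduled Job pair naturally, for example around a maintenance window. A minimal sketch as a context manager follows; the name `scheduled_job_suspended` is hypothetical, and `api` is assumed to be an `HfApi` instance.

```python
from contextlib import contextmanager


@contextmanager
def scheduled_job_suspended(api, scheduled_job_id, namespace=None):
    """Suspend a scheduled Job for the duration of a `with` block."""
    api.suspend_scheduled_job(scheduled_job_id=scheduled_job_id, namespace=namespace)
    try:
        yield
    finally:
        # Always resume, even if the body of the `with` block raised.
        api.resume_scheduled_job(scheduled_job_id=scheduled_job_id, namespace=namespace)
```

Usage would look like `with scheduled_job_suspended(api, job_id): ...`, where `job_id` is the ID of an existing scheduled Job.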
- unlike(repo_id, *, token=None, repo_type=None)[source]#
Unlike a given repo on the Hub (e.g. remove from favorite list).
To prevent spam usage, it is not possible to like a repository from a script.
See also [list_liked_repos].
- Parameters:
repo_id (str) – The repository to unlike. Example: “user/my-cool-model”.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
repo_type (str, optional) – Set to “dataset” or “space” if unliking a dataset or space, None or “model” if unliking a model. Default is None.
- Raises:
[RepositoryNotFoundError] – If repository is not found (error 404): wrong repo_id/repo_type, private but not authenticated or repo does not exist.
Example:
```python
>>> from huggingface_hub import list_liked_repos, unlike
>>> "gpt2" in list_liked_repos().models  # we assume you have already liked gpt2
True
>>> unlike("gpt2")
>>> "gpt2" in list_liked_repos().models
False
```
- update_collection_item(collection_slug, item_object_id, *, note=None, position=None, token=None)[source]#
Update an item in a collection.
- Parameters:
collection_slug (str) – Slug of the collection to update. Example: “TheBloke/recent-models-64f9a55bb3115b4f513ec026”.
item_object_id (str) – ID of the item in the collection. This is not the id of the item on the Hub (repo_id or paper id). It must be retrieved from a [CollectionItem] object. Example: collection.items[0].item_object_id.
note (str, optional) – A note to attach to the item in the collection. The maximum size for a note is 500 characters.
position (int, optional) – New position of the item in the collection.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
Example:
```py
>>> from huggingface_hub import get_collection, update_collection_item

# Get collection first
>>> collection = get_collection("TheBloke/recent-models-64f9a55bb3115b4f513ec026")

# Update item based on its ID (add note + update position)
>>> update_collection_item(
...     collection_slug="TheBloke/recent-models-64f9a55bb3115b4f513ec026",
...     item_object_id=collection.items[-1].item_object_id,
...     note="Newly updated model!",
...     position=0,
... )
```
- update_collection_metadata(collection_slug, *, title=None, description=None, position=None, private=None, theme=None, token=None)[source]#
Update metadata of a collection on the Hub.
All arguments are optional. Only provided metadata will be updated.
- Parameters:
collection_slug (str) – Slug of the collection to update. Example: “TheBloke/recent-models-64f9a55bb3115b4f513ec026”.
title (str, optional) – Title of the collection to update.
description (str, optional) – Description of the collection to update.
position (int, optional) – New position of the collection in the list of collections of the user.
private (bool, optional) – Whether the collection should be private or not.
theme (str, optional) – Theme of the collection on the Hub.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Return type:
Collection
Example:
```py
>>> from huggingface_hub import update_collection_metadata
>>> collection = update_collection_metadata(
...     collection_slug="username/iccv-2023-64f9a55bb3115b4f513ec026",
...     title="ICCV Oct. 2023",
...     description="Portfolio of models, datasets, papers and demos I presented at ICCV Oct. 2023",
...     private=False,
...     theme="pink",
... )
>>> collection.slug
"username/iccv-oct-2023-64f9a55bb3115b4f513ec026"
# ^collection slug got updated but not the trailing ID
```
- update_inference_endpoint(name, *, accelerator=None, instance_size=None, instance_type=None, min_replica=None, max_replica=None, scale_to_zero_timeout=None, scaling_metric=None, scaling_threshold=None, repository=None, framework=None, revision=None, task=None, custom_image=None, env=None, secrets=None, domain=None, path=None, cache_http_responses=None, tags=None, namespace=None, token=None)[source]#
Update an Inference Endpoint.
This method allows the update of either the compute configuration, the deployed model, the route, or any combination. All arguments are optional but at least one must be provided.
For convenience, you can also update an Inference Endpoint using [InferenceEndpoint.update].
- Parameters:
name (str) – The name of the Inference Endpoint to update.
accelerator (str, optional) – The hardware accelerator to be used for inference (e.g. “cpu”).
instance_size (str, optional) – The size or type of the instance to be used for hosting the model (e.g. “x4”).
instance_type (str, optional) – The cloud instance type where the Inference Endpoint will be deployed (e.g. “intel-icl”).
min_replica (int, optional) – The minimum number of replicas (instances) to keep running for the Inference Endpoint.
max_replica (int, optional) – The maximum number of replicas (instances) to scale to for the Inference Endpoint.
scale_to_zero_timeout (int, optional) – The duration in minutes before an inactive endpoint is scaled to zero.
scaling_metric (str or [InferenceEndpointScalingMetric], optional) – The metric reference for scaling. Either “pendingRequests” or “hardwareUsage” when provided. Defaults to None.
scaling_threshold (float, optional) – The scaling metric threshold used to trigger a scale up. Ignored when scaling metric is not provided. Defaults to None.
repository (str, optional) – The name of the model repository associated with the Inference Endpoint (e.g. “gpt2”).
framework (str, optional) – The machine learning framework used for the model (e.g. “custom”).
revision (str, optional) – The specific model revision to deploy on the Inference Endpoint (e.g. “6c0e6080953db56375760c0471a8c5f2929baf11”).
task (str, optional) – The task on which to deploy the model (e.g. “text-classification”).
custom_image (dict, optional) – A custom Docker image to use for the Inference Endpoint. This is useful if you want to deploy an Inference Endpoint running on the text-generation-inference (TGI) framework (see examples).
env (dict[str, str], optional) – Non-secret environment variables to inject in the container environment.
secrets (dict[str, str], optional) – Secret values to inject in the container environment.
domain (str, optional) – The custom domain for the Inference Endpoint deployment. If set, the Inference Endpoint will be available at this domain (e.g. “my-new-domain.cool-website.woof”).
path (str, optional) – The custom path to the deployed model, should start with a / (e.g. “/models/google-bert/bert-base-uncased”).
cache_http_responses (bool, optional) – Whether to cache HTTP responses from the Inference Endpoint.
tags (list[str], optional) – A list of tags to associate with the Inference Endpoint.
namespace (str, optional) – The namespace where the Inference Endpoint will be updated. Defaults to the current user’s namespace.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
Information about the updated Inference Endpoint.
- Return type:
[InferenceEndpoint]
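Because every argument is optional but at least one must be provided, a thin wrapper can drop unset values and fail early on an empty update. This is a sketch under assumptions: the helper name `update_endpoint` is hypothetical, and `api` is assumed to be an `HfApi` instance.

```python
def update_endpoint(api, name, **changes):
    """Forward only the explicitly set fields to update_inference_endpoint."""
    # Keep falsy but meaningful values such as min_replica=0; drop only None.
    payload = {key: value for key, value in changes.items() if value is not None}
    if not payload:
        raise ValueError("Provide at least one setting to update.")
    return api.update_inference_endpoint(name, **payload)
```

For example, `update_endpoint(api, "my-endpoint", min_replica=0, max_replica=2)` would change only the replica bounds; the endpoint name is a placeholder.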
- update_repo_settings(repo_id, *, gated=None, private=None, token=None, repo_type=None)[source]#
Update the settings of a repository, including gated access and visibility.
To give more control over how repos are used, the Hub allows repo authors to enable access requests for their repos, and also to set the visibility of the repo to private.
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
gated (Literal[“auto”, “manual”, False], optional) – The gated status for the repository. If set to None (default), the gated setting of the repository won’t be updated. * “auto”: The repository is gated, and access requests are automatically approved or denied based on predefined criteria. * “manual”: The repository is gated, and access requests require manual approval. * False : The repository is not gated, and anyone can access it.
private (bool, optional) – Whether the repository should be private.
token (Union[str, bool, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
repo_type (str, optional) – The type of the repository to update settings from (“model”, “dataset” or “space”). Defaults to “model”.
- Raises:
[ValueError](https://docs.python.org/3/library/exceptions.html#ValueError) – If gated is not one of “auto”, “manual”, or False.
[ValueError](https://docs.python.org/3/library/exceptions.html#ValueError) – If repo_type is not one of the values in constants.REPO_TYPES.
[HfHubHTTPError] – If the request to the Hugging Face Hub API fails.
[RepositoryNotFoundError] – If the repository to download from cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.
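The two ValueError conditions listed above can be checked locally before any request is sent. The following is a minimal sketch under stated assumptions: `check_repo_settings_args` is a hypothetical helper, not part of the library, and the allowed repo types are copied from the parameter description rather than read from constants.REPO_TYPES.

```python
from typing import Optional, Union

# Assumption: values copied from the parameter descriptions above; the
# library's constants.REPO_TYPES is the authoritative list.
ALLOWED_REPO_TYPES = ("model", "dataset", "space")


def check_repo_settings_args(
    gated: Union[str, bool, None] = None,
    repo_type: Optional[str] = None,
) -> None:
    """Hypothetical pre-flight check mirroring the documented ValueErrors."""
    # None means "leave the gated setting unchanged", so it is accepted.
    # `is not False` is used so that 0 or True are rejected explicitly.
    if gated is not None and gated is not False and gated not in ("auto", "manual"):
        raise ValueError(f"gated must be 'auto', 'manual' or False, got {gated!r}")
    # None falls back to the documented default ("model").
    if repo_type is not None and repo_type not in ALLOWED_REPO_TYPES:
        raise ValueError(f"repo_type must be one of {ALLOWED_REPO_TYPES}, got {repo_type!r}")
```

Once the check passes, the same arguments can be forwarded to update_repo_settings together with repo_id.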
- update_webhook(webhook_id, *, url=None, watched=None, domains=None, secret=None, token=None)[source]#
Update an existing webhook.
- Parameters:
webhook_id (str) – The unique identifier of the webhook to be updated.
url (str, optional) – The URL to which the payload will be sent.
watched (list[WebhookWatchedItem], optional) – List of items to watch. It can be users, orgs, models, datasets, or spaces. Refer to [WebhookWatchedItem] for more details. Watched items can also be provided as plain dictionaries.
domains (list[Literal[“repo”, “discussion”]], optional) – The domains to watch. This can include “repo”, “discussion”, or both.
secret (str, optional) – A secret to sign the payload with, providing an additional layer of security.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
Info about the updated webhook.
- Return type:
[WebhookInfo]
Example
```python
>>> from huggingface_hub import update_webhook
>>> updated_payload = update_webhook(
...     webhook_id="654bbbc16f2ec14d77f109cc",
...     url="https://new.webhook.site/a2176e82-5720-43ee-9e06-f91cb4c91548",
...     watched=[{"type": "user", "name": "julien-c"}, {"type": "org", "name": "HuggingFaceH4"}],
...     domains=["repo"],
...     secret="my-secret",
... )
>>> print(updated_payload)
WebhookInfo(
    id="654bbbc16f2ec14d77f109cc",
    job=None,
    url="https://new.webhook.site/a2176e82-5720-43ee-9e06-f91cb4c91548",
    watched=[WebhookWatchedItem(type="user", name="julien-c"), WebhookWatchedItem(type="org", name="HuggingFaceH4")],
    domains=["repo"],
    secret="my-secret",
    disabled=False,
)
```
- upload_file(*, path_or_fileobj, path_in_repo, repo_id, token=None, repo_type=None, revision=None, commit_message=None, commit_description=None, create_pr=None, parent_commit=None, run_as_future=False)[source]#
Upload a local file (up to 50 GB) to the given repo. The upload is done through an HTTP POST request and doesn’t require git or git-lfs to be installed.
- Parameters:
path_or_fileobj (str, Path, bytes, or IO) – Path to a file on the local machine or binary data stream / fileobj / buffer.
path_in_repo (str) – Relative filepath in the repo, for example: “checkpoints/1fec34a/weights.bin”
repo_id (str) – The repository to which the file will be uploaded, for example: “username/custom_transformers”
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
repo_type (str, optional) – Set to “dataset” or “space” if uploading to a dataset or space, None or “model” if uploading to a model. Default is None.
revision (str, optional) – The git revision to commit from. Defaults to the head of the “main” branch.
commit_message (str, optional) – The summary / title / first line of the generated commit
commit_description (str, optional) – The description of the generated commit
create_pr (bool, optional) – Whether or not to create a Pull Request with that commit. Defaults to False. If revision is not set, PR is opened against the “main” branch. If revision is set and is a branch, PR is opened against this branch. If revision is set and is not a branch name (example: a commit oid), a RevisionNotFoundError is returned by the server.
parent_commit (str, optional) – The OID / SHA of the parent commit, as a hexadecimal string. Shorthands (7 first characters) are also supported. If specified and create_pr is False, the commit will fail if revision does not point to parent_commit. If specified and create_pr is True, the pull request will be created from parent_commit. Specifying parent_commit ensures the repo has not changed before committing the changes, and can be especially useful if the repo is updated / committed to concurrently.
run_as_future (bool, optional) – Whether or not to run this method in the background. Background jobs are run sequentially without blocking the main thread. Passing run_as_future=True will return a [Future](https://docs.python.org/3/library/concurrent.futures.html#future-objects) object. Defaults to False.
- Returns:
Instance of [CommitInfo] containing information about the newly created commit (commit hash, commit url, pr url, commit message,…). If run_as_future=True is passed, returns a Future object which will contain the result when executed.
- Return type:
[CommitInfo] or Future
> [!TIP]
> Raises the following errors:
>
> - [HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError) if the HuggingFace API returned an error
> - [ValueError](https://docs.python.org/3/library/exceptions.html#ValueError) if some parameter value is invalid
> - [~utils.RepositoryNotFoundError] If the repository to download from cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.
> - [~utils.RevisionNotFoundError] If the revision to download from cannot be found.

> [!WARNING]
> upload_file assumes that the repo already exists on the Hub. If you get a Client error 404, please make sure you are authenticated and that repo_id and repo_type are set correctly. If repo does not exist, create it first using [~hf_api.create_repo].
Example:
```python
>>> from huggingface_hub import upload_file

# Upload from an open binary file object
>>> with open("./local/filepath", "rb") as fileobj:
...     upload_file(
...         path_or_fileobj=fileobj,
...         path_in_repo="remote/file/path.h5",
...         repo_id="username/my-dataset",
...         repo_type="dataset",
...         token="my_token",
...     )

# Upload from a local path
>>> upload_file(
...     path_or_fileobj=".\\local\\file\\path",
...     path_in_repo="remote/file/path.h5",
...     repo_id="username/my-model",
...     token="my_token",
... )

# Upload from a local path and open a Pull Request
>>> upload_file(
...     path_or_fileobj=".\\local\\file\\path",
...     path_in_repo="remote/file/path.h5",
...     repo_id="username/my-model",
...     token="my_token",
...     create_pr=True,
... )
```
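The run_as_future parameter described above returns a Future instead of a CommitInfo. The sketch below shows one way to start an upload in the background and collect the result later. `upload_in_background` and `wait_for_commit` are hypothetical helper names; `api` is assumed to be any object exposing upload_file with the signature documented in this section (for example an HfApi instance).

```python
from concurrent.futures import Future
from typing import Any, Optional


def upload_in_background(api: Any, local_path: str, path_in_repo: str, repo_id: str) -> Future:
    """Start an upload without blocking and return the resulting Future."""
    return api.upload_file(
        path_or_fileobj=local_path,
        path_in_repo=path_in_repo,
        repo_id=repo_id,
        run_as_future=True,  # documented: a Future is returned instead of CommitInfo
    )


def wait_for_commit(future: Future, timeout: Optional[float] = None) -> Any:
    """Block until the background job finishes and return its result.

    Future.result() re-raises any exception raised by the job, so upload
    errors surface here rather than at submission time.
    """
    return future.result(timeout=timeout)
```

Because background jobs are run sequentially, several futures obtained this way complete in submission order.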
- upload_folder(*, repo_id, folder_path, path_in_repo=None, commit_message=None, commit_description=None, token=None, repo_type=None, revision=None, create_pr=None, parent_commit=None, allow_patterns=None, ignore_patterns=None, delete_patterns=None, run_as_future=False)[source]#
Upload a local folder to the given repo. The upload is done through HTTP requests and doesn’t require git or git-lfs to be installed.
The structure of the folder will be preserved. Files with the same name already present in the repository will be overwritten. Others will be left untouched.
Use the allow_patterns and ignore_patterns arguments to specify which files to upload. These parameters accept either a single pattern or a list of patterns. Patterns are Standard Wildcards (globbing patterns) as documented [here](https://tldp.org/LDP/GNU-Linux-Tools-Summary/html/x11655.htm). If both allow_patterns and ignore_patterns are provided, both constraints apply. By default, all files from the folder are uploaded.
Use the delete_patterns argument to specify remote files you want to delete. Input type is the same as for allow_patterns (see above). If path_in_repo is also provided, the patterns are matched against paths relative to this folder. For example, upload_folder(…, path_in_repo=”experiment”, delete_patterns=”logs/*”) will delete any remote file under ./experiment/logs/. Note that the .gitattributes file will not be deleted even if it matches the patterns.
Any .git/ folder present in any subdirectory will be ignored. However, please be aware that the .gitignore file is not taken into account.
Uses HfApi.create_commit under the hood.
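To make the allow_patterns / ignore_patterns semantics above concrete, here is a minimal sketch that filters a list of relative paths with the standard library's `fnmatch`. It is an approximation for illustration only: `select_paths` is a hypothetical helper, and the library's own matching may differ in edge cases.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List, Optional, Union

Patterns = Optional[Union[str, List[str]]]


def _as_list(patterns: Patterns) -> Optional[List[str]]:
    # Accept either a single pattern or a list of patterns, as documented.
    if patterns is None:
        return None
    return [patterns] if isinstance(patterns, str) else list(patterns)


def select_paths(
    paths: Iterable[str],
    allow_patterns: Patterns = None,
    ignore_patterns: Patterns = None,
) -> List[str]:
    """Keep paths matching at least one allow pattern and no ignore pattern."""
    allow = _as_list(allow_patterns)
    ignore = _as_list(ignore_patterns)
    kept = []
    for path in paths:
        if allow is not None and not any(fnmatchcase(path, p) for p in allow):
            continue  # not explicitly allowed
        if ignore is not None and any(fnmatchcase(path, p) for p in ignore):
            continue  # explicitly ignored; both constraints apply
        kept.append(path)
    return kept


files = ["weights.bin", "config.json", "run1/logs/train.txt", "run1/metrics.json"]
print(select_paths(files, allow_patterns=["*.json", "*.txt"], ignore_patterns="**/logs/*.txt"))
# → ['config.json', 'run1/metrics.json']
```

Note that in `fnmatch` the `*` wildcard also matches path separators, which is why `*.json` selects `run1/metrics.json` here.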
- Parameters:
repo_id (str) – The repository to which the file will be uploaded, for example: “username/custom_transformers”
folder_path (str or Path) – Path to the folder to upload on the local file system
path_in_repo (str, optional) – Relative path of the directory in the repo, for example: “checkpoints/1fec34a/results”. Will default to the root folder of the repository.
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
repo_type (str, optional) – Set to “dataset” or “space” if uploading to a dataset or space, None or “model” if uploading to a model. Default is None.
revision (str, optional) – The git revision to commit from. Defaults to the head of the “main” branch.
commit_message (str, optional) – The summary / title / first line of the generated commit. Defaults to: f”Upload {path_in_repo} with huggingface_hub”
commit_description (str, optional) – The description of the generated commit
create_pr (bool, optional) – Whether or not to create a Pull Request with that commit. Defaults to False. If revision is not set, PR is opened against the “main” branch. If revision is set and is a branch, PR is opened against this branch. If revision is set and is not a branch name (example: a commit oid), a RevisionNotFoundError is returned by the server.
parent_commit (str, optional) – The OID / SHA of the parent commit, as a hexadecimal string. Shorthands (7 first characters) are also supported. If specified and create_pr is False, the commit will fail if revision does not point to parent_commit. If specified and create_pr is True, the pull request will be created from parent_commit. Specifying parent_commit ensures the repo has not changed before committing the changes, and can be especially useful if the repo is updated / committed to concurrently.
allow_patterns (list[str] or str, optional) – If provided, only files matching at least one pattern are uploaded.
ignore_patterns (list[str] or str, optional) – If provided, files matching any of the patterns are not uploaded.
delete_patterns (list[str] or str, optional) – If provided, remote files matching any of the patterns will be deleted from the repo while committing new files. This is useful if you don’t know which files have already been uploaded. Note: to avoid discrepancies the .gitattributes file is not deleted even if it matches the pattern.
run_as_future (bool, optional) – Whether or not to run this method in the background. Background jobs are run sequentially without blocking the main thread. Passing run_as_future=True will return a [Future](https://docs.python.org/3/library/concurrent.futures.html#future-objects) object. Defaults to False.
- Returns:
Instance of [CommitInfo] containing information about the newly created commit (commit hash, commit url, pr url, commit message,…). If run_as_future=True is passed, returns a Future object which will contain the result when executed.
- Return type:
[CommitInfo] or Future
> [!TIP]
> Raises the following errors:
>
> - [HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError) if the HuggingFace API returned an error
> - [ValueError](https://docs.python.org/3/library/exceptions.html#ValueError) if some parameter value is invalid

> [!WARNING]
> upload_folder assumes that the repo already exists on the Hub. If you get a Client error 404, please make sure you are authenticated and that repo_id and repo_type are set correctly. If repo does not exist, create it first using [~hf_api.create_repo].

> [!TIP]
> When dealing with a large folder (thousands of files or hundreds of GB), we recommend using [~hf_api.upload_large_folder] instead.
Example:
```python
>>> from huggingface_hub import upload_folder

# Upload checkpoints folder except the log files
>>> upload_folder(
...     folder_path="local/checkpoints",
...     path_in_repo="remote/experiment/checkpoints",
...     repo_id="username/my-dataset",
...     repo_type="dataset",
...     token="my_token",
...     ignore_patterns="**/logs/*.txt",
... )

# Upload checkpoints folder including logs while deleting existing logs from the repo
# Useful if you don't know exactly which log files have already been pushed
>>> upload_folder(
...     folder_path="local/checkpoints",
...     path_in_repo="remote/experiment/checkpoints",
...     repo_id="username/my-dataset",
...     repo_type="dataset",
...     token="my_token",
...     delete_patterns="**/logs/*.txt",
... )

# Upload checkpoints folder while creating a PR
>>> upload_folder(
...     folder_path="local/checkpoints",
...     path_in_repo="remote/experiment/checkpoints",
...     repo_id="username/my-dataset",
...     repo_type="dataset",
...     token="my_token",
...     create_pr=True,
... )
```
- upload_large_folder(repo_id, folder_path, *, repo_type, revision=None, private=None, allow_patterns=None, ignore_patterns=None, num_workers=None, print_report=True, print_report_every=60)[source]#
Upload a large folder to the Hub in the most resilient way possible.
Several workers are started to upload files in an optimized way. Before being committed to a repo, files must be hashed and be pre-uploaded if they are LFS files. Workers will perform these tasks for each file in the folder. At each step, some metadata information about the upload process is saved in the folder under .cache/.huggingface/ to be able to resume the process if interrupted. The whole process might result in several commits.
- Parameters:
repo_id (str) – The repository to which the file will be uploaded. E.g. “HuggingFaceTB/smollm-corpus”.
folder_path (str or Path) – Path to the folder to upload on the local file system.
repo_type (str) – Type of the repository. Must be one of “model”, “dataset” or “space”. Unlike in all other HfApi methods, repo_type is explicitly required here. This is to avoid any mistake when uploading a large folder to the Hub, and therefore to avoid having to re-upload everything.
revision (str, optional) – The branch to commit to. If not provided, the main branch will be used.
private (bool, optional) – Whether the repository should be private. If None (default), the repo will be public unless the organization’s default is private.
allow_patterns (list[str] or str, optional) – If provided, only files matching at least one pattern are uploaded.
ignore_patterns (list[str] or str, optional) – If provided, files matching any of the patterns are not uploaded.
num_workers (int, optional) – Number of workers to start. Defaults to half of CPU cores (minimum 1). A higher number of workers may speed up the process if your machine allows it. However, on machines with a slower connection, it is recommended to keep the number of workers low to ensure better resumability. Indeed, partially uploaded files will have to be completely re-uploaded if the process is interrupted.
print_report (bool, optional) – Whether to print a report of the upload progress. Defaults to True. The report is printed to sys.stdout every print_report_every seconds (60 by default) and overwrites the previous report.
print_report_every (int, optional) – Frequency at which the report is printed. Defaults to 60 seconds.
> [!TIP]
> A few things to keep in mind:
> - Repository limits still apply: https://huggingface.co/docs/hub/repositories-recommendations
> - Do not start several processes in parallel.
> - You can interrupt and resume the process at any time.
> - Do not upload the same folder to several repositories. If you need to do so, you must delete the local .cache/.huggingface/ folder first.

> [!WARNING]
> While being much more robust to upload large folders, upload_large_folder is more limited than [upload_folder] feature-wise. In practice:
> - you cannot set a custom path_in_repo. If you want to upload to a subfolder, you need to set the proper structure locally.
> - you cannot set a custom commit_message and commit_description since multiple commits are created.
> - you cannot delete from the repo while uploading. Please make a separate commit first.
> - you cannot create a PR directly. Please create a PR first (from the UI or using [create_pull_request]) and then commit to it by passing revision.
Technical details:
- The upload_large_folder process is as follows:
1. (Check parameters and setup.)
2. Create repo if missing.
3. List local files to upload.
4. Run validation checks and display warnings if repository limits might be exceeded:
   - Warns if the total number of files exceeds 100k (recommended limit).
   - Warns if any folder contains more than 10k files (recommended limit).
   - Warns about files larger than 20GB (recommended) or 50GB (hard limit).
5. Start workers. Workers can perform the following tasks:
   - Hash a file.
   - Get upload mode (regular or LFS) for a list of files.
   - Pre-upload an LFS file.
   - Commit a bunch of files.
   Once a worker finishes a task, it will move on to the next task based on the priority list (see below) until all files are uploaded and committed.
6. While workers are up, regularly print a report to sys.stdout.
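The validation step in the process above can be sketched as a small pre-flight check over a mapping of relative paths to file sizes. This is an illustration, not the library's code: `check_limits` is a hypothetical helper, gigabytes are assumed to be decimal (10^9 bytes), and "files in a folder" is assumed to mean direct children only.

```python
from collections import Counter
from pathlib import PurePosixPath
from typing import Dict, List

GB = 1000**3  # assumption: decimal gigabytes


def check_limits(
    sizes: Dict[str, int],
    max_files: int = 100_000,
    max_files_per_folder: int = 10_000,
    recommended_file_size: int = 20 * GB,
    hard_file_size: int = 50 * GB,
) -> List[str]:
    """Return human-readable warnings for the limits listed above."""
    warnings = []
    if len(sizes) > max_files:
        warnings.append(f"total number of files ({len(sizes)}) exceeds {max_files}")
    # Count files per parent folder (direct children only, by assumption).
    per_folder = Counter(str(PurePosixPath(path).parent) for path in sizes)
    for folder, count in sorted(per_folder.items()):
        if count > max_files_per_folder:
            warnings.append(f"folder '{folder}' contains {count} files (limit {max_files_per_folder})")
    for path, size in sorted(sizes.items()):
        if size > hard_file_size:
            warnings.append(f"'{path}' exceeds the hard file size limit")
        elif size > recommended_file_size:
            warnings.append(f"'{path}' exceeds the recommended file size")
    return warnings
```

The thresholds are parameters so the check can be exercised on small inputs; the defaults are the values quoted in the list above.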
- Order of priority:
1. Commit if more than 5 minutes since last commit attempt (and at least 1 file).
2. Commit if at least 150 files are ready to commit.
3. Get upload mode if at least 10 files have been hashed.
4. Pre-upload LFS file if at least 1 file and no worker is pre-uploading.
5. Hash file if at least 1 file and no worker is hashing.
6. Get upload mode if at least 1 file and no worker is getting upload mode.
7. Pre-upload LFS file if at least 1 file.
8. Hash file if at least 1 file to hash.
9. Get upload mode if at least 1 file to get upload mode.
10. Commit if at least 1 file to commit and at least 1 min since last commit attempt.
11. Commit if at least 1 file to commit and all other queues are empty.
- Special rules:
Only one worker can commit at a time.
If no tasks are available, the worker waits for 10 seconds before checking again.
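The priority order and special rules above amount to a first-match decision function. The sketch below encodes them as written; it is not the library's scheduler. `QueueState` and `next_task` are hypothetical names, and "all other queues are empty" is interpreted here as "no pending items and no busy workers in the other queues", which is an assumption.

```python
from dataclasses import dataclass


@dataclass
class QueueState:
    to_hash: int = 0                 # files waiting to be hashed
    to_get_upload_mode: int = 0      # hashed files waiting for regular/LFS decision
    to_preupload: int = 0            # LFS files waiting to be pre-uploaded
    to_commit: int = 0               # files ready to commit
    hashing_workers: int = 0
    upload_mode_workers: int = 0
    preupload_workers: int = 0
    commit_in_progress: bool = False
    seconds_since_commit_attempt: float = 0.0


def next_task(s: QueueState) -> str:
    """Return the next task for an idle worker, following the listed priorities."""
    # Special rule: only one worker can commit at a time.
    can_commit = s.to_commit > 0 and not s.commit_in_progress
    if can_commit and s.seconds_since_commit_attempt > 5 * 60:
        return "commit"
    if can_commit and s.to_commit >= 150:
        return "commit"
    if s.to_get_upload_mode >= 10:
        return "get_upload_mode"
    if s.to_preupload > 0 and s.preupload_workers == 0:
        return "preupload_lfs"
    if s.to_hash > 0 and s.hashing_workers == 0:
        return "hash"
    if s.to_get_upload_mode > 0 and s.upload_mode_workers == 0:
        return "get_upload_mode"
    if s.to_preupload > 0:
        return "preupload_lfs"
    if s.to_hash > 0:
        return "hash"
    if s.to_get_upload_mode > 0:
        return "get_upload_mode"
    if can_commit and s.seconds_since_commit_attempt >= 60:
        return "commit"
    # Assumption: "other queues empty" also requires that no worker is busy.
    others_idle = (s.hashing_workers + s.upload_mode_workers + s.preupload_workers) == 0
    if can_commit and others_idle:
        return "commit"
    # Special rule: nothing to do, the worker waits 10 seconds and checks again.
    return "wait"
```

Writing the rules as an ordered chain of early returns keeps the code in one-to-one correspondence with the numbered list, which makes it easy to compare against the documentation.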
- verify_repo_checksums(repo_id, *, repo_type=None, revision=None, local_dir=None, cache_dir=None, token=None)[source]#
Verify local files for a repo against Hub checksums.
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
repo_type (str, optional) – The type of the repository from which to get the tree (“model”, “dataset” or “space”). Defaults to “model”.
revision (str, optional) – The revision of the repository from which to get the tree. Defaults to “main” branch.
local_dir (str or Path, optional) – The local directory to verify.
cache_dir (str or Path, optional) – The cache directory to verify.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
a structured result containing the verification details.
- Return type:
[FolderVerification]
- Raises:
[RepositoryNotFoundError] – If repository is not found (error 404): wrong repo_id/repo_type, private but not authenticated or repo does not exist.
[RevisionNotFoundError] – If revision is not found (error 404) on the repo.
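As a rough illustration of what verifying local files against checksums involves, the sketch below hashes local files and compares them with a mapping of expected digests. This section does not state which checksum algorithms the Hub reports, so SHA-256 is an assumption, and `sha256_of_file` and `find_mismatches` are hypothetical helpers rather than part of the API documented here.

```python
import hashlib
from pathlib import Path
from typing import Dict, Union


def sha256_of_file(path: Union[str, Path], chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(local_dir: Union[str, Path], expected: Dict[str, str]) -> Dict[str, str]:
    """Return {relative_path: reason} for files that are missing or differ."""
    problems = {}
    root = Path(local_dir)
    for rel_path, expected_digest in expected.items():
        candidate = root / rel_path
        if not candidate.is_file():
            problems[rel_path] = "missing"
        elif sha256_of_file(candidate) != expected_digest:
            problems[rel_path] = "checksum mismatch"
    return problems
```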
- whoami(token=None, *, cache=False)[source]#
Call HF API to know “whoami”.
If passing cache=True, the result will be cached for subsequent calls for the duration of the Python process. This is useful if you plan to call whoami multiple times as this endpoint is heavily rate-limited for security reasons.
- Parameters:
token (bool or str, optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
cache (bool, optional) – Whether to cache the result of the whoami call for subsequent calls. If an error occurs during the first call, it won’t be cached. Defaults to False.
- Return type:
dict
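The caching behaviour described for cache=True (results reused for the life of the process, failed calls not cached) can be sketched with a plain dictionary. This is not the library's implementation: `make_cached_whoami` is a hypothetical factory and `fetch` stands in for the rate-limited API call.

```python
from typing import Callable, Dict


def make_cached_whoami(fetch: Callable[[str], dict]) -> Callable[..., dict]:
    """Wrap `fetch` with an opt-in, per-token, in-process cache."""
    _cache: Dict[str, dict] = {}

    def whoami(token: str, *, cache: bool = False) -> dict:
        if cache and token in _cache:
            return _cache[token]  # served from memory, no API call
        info = fetch(token)  # may raise; nothing is stored in that case
        if cache:
            _cache[token] = info
        return info

    return whoami
```

Because the result is stored only after `fetch` returns, an error on the first call leaves the cache empty and the next call retries.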
- squadds.core.utils.get_token()[source]#
Get token if user is logged in.
- Note: in most cases, you should use [huggingface_hub.utils.build_hf_headers] instead. This method is only useful if you want to retrieve the token for other purposes than sending an HTTP request.
Token is retrieved in priority from the HF_TOKEN environment variable. Otherwise, we read the token file located in the Hugging Face home folder. Returns None if user is not logged in. To log in, use [login] or hf auth login.
- Returns:
The token, None if it doesn’t exist.
- Return type:
str or None
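The lookup order described above (HF_TOKEN environment variable first, then the token file, otherwise None) can be sketched as follows. `resolve_token` is a hypothetical helper; the token file location is passed in explicitly because its exact path is not given here, and stripping whitespace and treating empty values as missing are assumptions.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_token(token_file: Path, env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the token from HF_TOKEN, else from `token_file`, else None."""
    env_token = env.get("HF_TOKEN", "").strip()
    if env_token:
        return env_token  # the environment variable takes priority
    try:
        file_token = token_file.read_text().strip()
    except FileNotFoundError:
        return None  # user is not logged in
    return file_token or None
```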