orbitra.lake.client
Functions
get_lake_client
environment: Environment to use (“prod” or “dev”). Defaults to “prod”.
credential: Synchronous Azure credential for API operations.
- Configured lake client instance.
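A minimal sketch of obtaining a client. It assumes the function is importable from the module path shown at the top of this page and accepts `environment` and `credential` as keyword arguments under the names listed above. `DefaultAzureCredential` from the `azure-identity` package is one synchronous credential; whether it suits a given deployment is an assumption. `build_client` is a hypothetical helper, not part of the library.

```python
def build_client(environment="prod", credential=None):
    """Hypothetical wrapper around get_lake_client (not part of the library)."""
    # Only the two documented environments are accepted.
    if environment not in ("prod", "dev"):
        raise ValueError(f"environment must be 'prod' or 'dev', got {environment!r}")

    # Imported lazily so the validation above runs without the packages installed.
    from azure.identity import DefaultAzureCredential
    # Assumption: the import path mirrors the module name at the top of this page.
    from orbitra.lake.client import get_lake_client

    if credential is None:
        credential = DefaultAzureCredential()  # a synchronous credential
    # Assumption: parameters can be passed by keyword under their documented names.
    return get_lake_client(environment=environment, credential=credential)
```

Calling `build_client("dev")` would then return the configured client instance described above, provided both packages are installed and Azure authentication succeeds.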
Classes
OrbitraLakeClient
Client for interacting with the Orbitra Lake database.
Methods:
list_namespaces
- list[str]: A list of namespace names.
create_table
namespace: The namespace where the table should be created.
table: The schema of the table to create.
- The schema of the created table.
LakeError: If the table already exists or if the namespace does not exist.
list_tables
namespace: The namespace to list tables from.
- list[str]: A list of table names in the specified namespace.
LakeError: If the namespace does not exist.
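As a rough illustration of combining `list_namespaces` and `list_tables`, the sketch below builds a namespace-to-tables mapping. `lake_inventory` is a hypothetical helper; it assumes only that `client` exposes the two methods as documented and that `namespace` can be passed by keyword.

```python
def lake_inventory(client):
    """Return {namespace: [table names]} for every namespace the client lists."""
    inventory = {}
    for namespace in client.list_namespaces():
        # list_tables is documented to return a list of table names;
        # sorting only makes the output stable for display.
        inventory[namespace] = sorted(client.list_tables(namespace=namespace))
    return inventory
```

Because the helper only calls the two documented methods, it works with any object that implements them, which makes it easy to exercise with a stub in tests.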
get_table_metadata
namespace: The namespace where the table is located.
table_name: The name of the table to retrieve metadata for.
- The schema of the table if it exists.
LakeError: If the table does not exist.
add_column_to_table
namespace: The namespace where the table is located.
table_name: The name of the table to add the column to.
column: The name of the new column to add.
column_type: The data type of the new column.
- The updated schema of the table after adding the new column.
LakeError: If the column is invalid, already exists or if the table does not exist.
remove_column_from_table
namespace: The namespace where the table is located.
table_name: The name of the table to remove the column from.
column: The name of the column to remove.
- The updated schema of the table after removing the column.
LakeError: If the table or column does not exist or if it is a reserved column.
add_or_update_table
namespace: The namespace where the table is located.
table: The schema of the table to add or update.
allow_column_removal: Whether to allow column removal.
- The updated schema of the table after adding or updating it.
LakeError: If there are changes in partition columns or if a column is removed and allow_column_removal is False.
overwrite_data
namespace: The namespace where the table is located.
table_name: The name of the table to overwrite data in.
df: The DataFrame containing the data to overwrite in the table.
- A response object containing information about the modified partitions and inserted rows.
LakeError: If the table does not exist or if the DataFrame contains invalid data.
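One possible way to call `overwrite_data` is sketched below, with a local existence check so a missing table is reported before any data is sent. `overwrite_existing_table` and the `LookupError` it raises are hypothetical and not part of the library; the keyword names are assumed to match the documented parameter names.

```python
def overwrite_existing_table(client, namespace, table_name, df):
    """Hypothetical guard: overwrite only when the table is already listed."""
    # Fail early with a plain error instead of waiting for the write call.
    if table_name not in client.list_tables(namespace=namespace):
        raise LookupError(f"table {table_name!r} not found in namespace {namespace!r}")
    # Assumption: keyword names match the documented parameter names.
    return client.overwrite_data(namespace=namespace, table_name=table_name, df=df)
```

The return value is passed through unchanged, since the attributes of the response object are not described here.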
delete_data
namespace: The namespace where the table is located.
table_name: The name of the table to delete data from.
partition_filters: A list of partition filters to apply for the delete operation. Must be empty if the table has no partition columns.
- An operation ID for tracking the delete operation.
LakeError: If the table does not exist or if the partition filters are invalid.
overwrite_data_by_custom_columns
namespace: The namespace where the table is located.
table_name: The name of the table to overwrite data into.
custom_columns: A list of columns to use as custom columns.
df: The DataFrame containing the data to overwrite.
- A response object containing information about the modified custom values and inserted rows.
LakeError: If the table does not exist or if the custom columns are invalid.
get_table_data
namespace: The namespace where the table is located.
table_name: The name of the table to retrieve data from.
scan_filters: A list of column filters to apply for the query.
- pd.DataFrame: A DataFrame containing the data retrieved from the table.
LakeError: If the table does not exist or if the scan filters are invalid.
run_query
namespace: The namespace to run the query in.
query: The query to run.
engine: The engine to use for the query.
- pd.DataFrame: A DataFrame containing the retrieved data.
LakeError: If the query is invalid or if the engine is not supported.
save_raw_bytes_to_blob
bytes_io: The bytes object to persist.
full_filename: The blob path, including virtual directories, e.g. "finance/2025/09/transactions.parquet".
namespace: Namespace used to compose the container name. The effective container is settings.orbitra_lake_raw_container_prefix + namespace.
- True if the bytes object was stored, False if it already exists and is the same.
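The boolean result can be used to tell a fresh upload from a no-op. The sketch below wraps text in an `io.BytesIO` and reports which case occurred. `save_text_blob` is a hypothetical helper, and passing the parameters by keyword under their documented names is an assumption.

```python
import io


def save_text_blob(client, namespace, full_filename, text, encoding="utf-8"):
    """Hypothetical helper: upload text and report 'stored' or 'unchanged'."""
    buffer = io.BytesIO(text.encode(encoding))
    # Documented behavior: True if stored, False if an identical blob already exists.
    stored = client.save_raw_bytes_to_blob(
        bytes_io=buffer,
        full_filename=full_filename,
        namespace=namespace,
    )
    return "stored" if stored else "unchanged"
```

Uploading the same content twice to the same path would be expected to give `"stored"` and then `"unchanged"`.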
save_raw_df_to_blob
df: The DataFrame to persist.
full_filename: The blob path, including virtual directories, e.g. "finance/2025/09/transactions.parquet".
namespace: Namespace used to compose the container name. The effective container is settings.orbitra_lake_raw_container_prefix + namespace.
- True if the DataFrame was stored, False if it already exists and is the same.
read_raw_bytes_from_blob
full_filename: The full path and filename of the raw bytes object to read from the raw storage container.
namespace: Namespace used to compose the container name. The effective container is settings.orbitra_lake_raw_container_prefix + namespace.
- io.BytesIO: The raw bytes object read from the blob storage.
read_raw_df_from_blob
full_filename: The full path and filename of the Parquet file to read from the raw storage container.
namespace: Namespace used to compose the container name. The effective container is settings.orbitra_lake_raw_container_prefix + namespace.
- pd.DataFrame: The contents of the Parquet file as a pandas DataFrame.
get_raw_file_system
namespace: Logical namespace used to compose container/directory name.
- A filesystem interface for accessing raw storage.
set_processed_flag
full_filename: The full path and filename of the raw file to set the processed flag for.
namespace: Namespace used to compose the container name. The effective container is settings.orbitra_lake_raw_container_prefix + namespace.
is_processed: The processed flag value to set.
get_processed_flag
full_filename: The full path and filename of the raw file to get the processed flag for.
namespace: Namespace used to compose the container name. The effective container is settings.orbitra_lake_raw_container_prefix + namespace.
- The processed flag value.
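The processed flag lends itself to a process-at-most-once pattern over raw files. The sketch below is one way to combine `get_processed_flag`, `read_raw_bytes_from_blob`, and `set_processed_flag`. `process_once` and `handler` are hypothetical names; the sketch assumes the flag is truthy once a file has been processed and that parameters can be passed by keyword.

```python
def process_once(client, namespace, full_filename, handler):
    """Hypothetical helper: run handler on a raw file unless it is already flagged.

    Returns True if the handler ran, False if the file was skipped.
    """
    # Assumption: the flag value is truthy for files that were already processed.
    if client.get_processed_flag(full_filename=full_filename, namespace=namespace):
        return False

    # read_raw_bytes_from_blob is documented to return an io.BytesIO.
    payload = client.read_raw_bytes_from_blob(
        full_filename=full_filename, namespace=namespace
    )
    handler(payload)

    # Flag the file only after the handler finished without raising.
    client.set_processed_flag(
        full_filename=full_filename, namespace=namespace, is_processed=True
    )
    return True
```

Setting the flag after the handler returns means a failed run leaves the file unflagged, so it is picked up again on the next attempt.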