### Polars SQL FROM Clause Example

Source: https://docs.pola.rs/api/python/stable/reference/sql/clauses

Shows how to use the FROM clause in Polars SQL to specify the table from which to retrieve data. The example selects all columns from the current DataFrame, which is referenced as `self`.

```python
import polars as pl

df = pl.DataFrame(
    {
        "a": [1, 2, 3],
        "b": ["zz", "yy", "xx"],
    }
)
df.sql("""
  SELECT * FROM self
""")
```

--------------------------------

### Polars SQL JOIN Clause Examples

Source: https://docs.pola.rs/api/python/stable/reference/sql/clauses

Provides examples of different JOIN types in Polars SQL, including FULL JOIN and NATURAL INNER JOIN. The queries combine data from multiple DataFrames based on related columns.

```python
import polars as pl

df1 = pl.DataFrame(
    {
        "foo": [1, 2, 3],
        "ham": ["a", "b", "c"],
    }
)
df2 = pl.DataFrame(
    {
        "apple": ["x", "y", "z"],
        "ham": ["a", "b", "d"],
    }
)
pl.sql("""
  SELECT foo, apple, COALESCE(df1.ham, df2.ham) AS ham
  FROM df1 FULL JOIN df2
  USING (ham)
""").collect()
```

```python
# Raw string so that the regex escape `\w` reaches the SQL engine unchanged
pl.sql(r"""
  SELECT COLUMNS('^\w+$')
  FROM df1 NATURAL INNER JOIN df2
""").collect()
```

--------------------------------

### Select columns by name in Polars (Python)

Source: https://docs.pola.rs/api/python/stable/reference/selectors

Demonstrates column selection by name in Polars. By default every listed name must exist in the frame; passing `require_all=False` selects whichever of the listed names are present.

```python
df.select(cs.by_name("foo", "bar"))
```

```python
df.select(cs.by_name("baz", "moose", "foo", "bear", require_all=False))
```

--------------------------------

### Expr.list.first

Source: https://docs.pola.rs/api/python/stable/reference/expressions/list

Gets the first value from each sublist.

```APIDOC
## Expr.list.first

### Description
Get the first value of the sublists.

### Method
`first()`

### Endpoint
Not applicable (Python library method).
### Parameters
#### Path Parameters
None

#### Query Parameters
None

#### Request Body
None

### Request Example
```python
from polars import col

df.select(col("my_list").list.first())
```

### Response
#### Success Response (200)
- **First Element** (various types) - The first element of each sublist.
```

--------------------------------

### Polars SQL ORDER BY Clause Example

Source: https://docs.pola.rs/api/python/stable/reference/sql/clauses

Demonstrates sorting query results based on specified columns using the ORDER BY clause in Polars SQL. This example sorts the results by the 'bar' column in descending order.

```python
import polars as pl

df = pl.DataFrame(
    {
        "foo": ["b", "a", "c", "b"],
        "bar": [20, 10, 40, 30],
    }
)
df.sql("""
  SELECT foo, bar
  FROM self
  ORDER BY bar DESC
""")
```

--------------------------------

### Polars SQL LIMIT Clause Example

Source: https://docs.pola.rs/api/python/stable/reference/sql/clauses

Illustrates how to limit the number of rows returned by a query using the LIMIT clause in Polars SQL. This example restricts the output to the first 2 rows.

```python
import polars as pl

df = pl.DataFrame(
    {
        "foo": ["b", "a", "c", "b"],
        "bar": [20, 10, 40, 30],
    }
)
df.sql("""
  SELECT foo, bar
  FROM self
  LIMIT 2
""")
```

--------------------------------

### Polars SQL GROUP BY Clause Example

Source: https://docs.pola.rs/api/python/stable/reference/sql/clauses

Illustrates how to group rows with the same values in specified columns using the GROUP BY clause in Polars SQL. This example calculates the sum of 'bar' for each unique 'foo' value.

```python
import polars as pl

df = pl.DataFrame(
    {
        "foo": ["a", "b", "b"],
        "bar": [10, 20, 30],
    }
)
df.sql("""
  SELECT foo, SUM(bar)
  FROM self
  GROUP BY foo
""")
```

--------------------------------

### Create KDE plot from Series

Source: https://docs.pola.rs/api/python/stable/reference/series/plot

Generates a kernel density estimate (KDE) plot of a Series using the Altair backend. Equivalent to alt.Chart with a density transformation. Requires Polars and Altair installed.
```python
>>> s.plot.kde()
```

--------------------------------

### Create line plot from Series

Source: https://docs.pola.rs/api/python/stable/reference/series/plot

Generates a line plot of a Series using the Altair backend. Equivalent to alt.Chart with mark_line and index-based x-axis encoding. Requires Polars and Altair installed.

```python
>>> s.plot.line()
```

--------------------------------

### Expr.list.sample

Source: https://docs.pola.rs/api/python/stable/reference/expressions/list

Samples elements from each list.

```APIDOC
## Expr.list.sample

### Description
Sample from this list.

### Method
`sample(n=None, fraction=None, ...)`

### Endpoint
Not applicable (Python library method).

### Parameters
#### Path Parameters
None

#### Query Parameters
None

#### Request Body
None

### Request Example
```python
from polars import col

df.select(col("my_list").list.sample(n=2))
```

### Response
#### Success Response (200)
- **Sampled List** (list) - A list containing randomly sampled elements from the original list.
```

--------------------------------

### Get Upper Bound of Float Series (Polars)

Source: https://docs.pola.rs/api/python/stable/reference/series/api/polars.Series

This example shows how to obtain the upper bound for a floating-point Series using Polars. Calling upper_bound() on the Series returns a new Series containing the maximum representable value for its dtype, which is infinity for Float64.

```python
>>> s = pl.Series("s", [1.0, 2.5, 3.0], dtype=pl.Float64)
>>> s.upper_bound()
shape: (1,)
Series: 's' [f64]
[
    inf
]
```

--------------------------------

### Get Polars Thread Pool Size

Source: https://docs.pola.rs/api/python/stable/reference/api/polars

Returns the number of threads in the Polars thread pool. The size can be overridden by setting the POLARS_MAX_THREADS environment variable before the process starts. Note that the thread pool cannot be modified once set as it's not behind a lock.
```python
>>> pl.thread_pool_size()
16
```

--------------------------------

### Catalog API Overview

Source: https://docs.pola.rs/api/python/stable/reference/catalog/index

This section provides an overview of the Polars Catalog API, including its main class and related informational structures for Unity Catalog.

```APIDOC
## Catalog API

Interface with data catalogs, specifically Unity Catalog.

### Classes

- **`polars.Catalog`**: The main class for interacting with data catalogs.
- **`polars.catalog.unity.CatalogInfo`**: Represents information about a catalog.
- **`polars.catalog.unity.ColumnInfo`**: Represents information about a column within a table.
- **`polars.catalog.unity.DataSourceFormat`**: Enumerates supported data source formats.
- **`polars.catalog.unity.NamespaceInfo`**: Represents information about a namespace (schema) within a catalog.
- **`polars.catalog.unity.TableInfo`**: Represents information about a table.
- **`polars.catalog.unity.TableType`**: Enumerates supported table types.
```

--------------------------------

### Catalog Methods

Source: https://docs.pola.rs/api/python/stable/reference/catalog/index

These methods allow you to list and retrieve information about catalogs, namespaces, and tables, as well as scan tables.

```APIDOC
## Catalog Methods

### `Catalog.list_catalogs()`

#### Description
Lists all available catalogs.

#### Method
`GET`

#### Endpoint
`/catalogs`

#### Parameters
None

#### Response
##### Success Response (200)
- **catalogs** (list of str) - A list of catalog names.

#### Response Example
```json
{
  "catalogs": ["main", "default"]
}
```

### `Catalog.list_namespaces(catalog_name: str)`

#### Description
Lists all namespaces (schemas) within a specified catalog.

#### Method
`GET`

#### Endpoint
`/catalogs/{catalog_name}/namespaces`

#### Parameters
##### Path Parameters
- **catalog_name** (str) - Required - The name of the catalog.
#### Response
##### Success Response (200)
- **namespaces** (list of str) - A list of namespace names within the catalog.

#### Response Example
```json
{
  "namespaces": ["default", "analytics"]
}
```

### `Catalog.list_tables(catalog_name: str, namespace_name: str)`

#### Description
Lists all tables within a specified catalog and namespace.

#### Method
`GET`

#### Endpoint
`/catalogs/{catalog_name}/namespaces/{namespace_name}/tables`

#### Parameters
##### Path Parameters
- **catalog_name** (str) - Required - The name of the catalog.
- **namespace_name** (str) - Required - The name of the namespace.

#### Response
##### Success Response (200)
- **tables** (list of str) - A list of table names within the namespace.

#### Response Example
```json
{
  "tables": ["users", "orders"]
}
```

### `Catalog.get_table_info(catalog_name: str, namespace_name: str, table_name: str)`

#### Description
Retrieves detailed information about a specific table.

#### Method
`GET`

#### Endpoint
`/catalogs/{catalog_name}/namespaces/{namespace_name}/tables/{table_name}`

#### Parameters
##### Path Parameters
- **catalog_name** (str) - Required - The name of the catalog.
- **namespace_name** (str) - Required - The name of the namespace.
- **table_name** (str) - Required - The name of the table.

#### Response
##### Success Response (200)
- **table_info** (TableInfo) - An object containing detailed information about the table.

#### Response Example
```json
{
  "table_info": {
    "name": "users",
    "schema": "default",
    "catalog": "main",
    "type": "MANAGED",
    "columns": [
      {"name": "user_id", "type": "integer"},
      {"name": "username", "type": "string"}
    ]
  }
}
```

### `Catalog.scan_table(catalog_name: str, namespace_name: str, table_name: str)`

#### Description
Scans a table and returns it lazily as a Polars LazyFrame.
#### Method
`GET`

#### Endpoint
`/catalogs/{catalog_name}/namespaces/{namespace_name}/tables/{table_name}/scan`

#### Parameters
##### Path Parameters
- **catalog_name** (str) - Required - The name of the catalog.
- **namespace_name** (str) - Required - The name of the namespace.
- **table_name** (str) - Required - The name of the table.

#### Response
##### Success Response (200)
- **lazyframe** (polars.LazyFrame) - A Polars LazyFrame representing the table data.

#### Response Example
```python
# Conceptual example: the actual return value is a LazyFrame.
# `catalog` is assumed to be an existing polars.Catalog instance.
lf = catalog.scan_table("main", "default", "users")
print(lf.collect())
```
```

--------------------------------

### Select temporal columns except time columns with Polars (Python)

Source: https://docs.pola.rs/api/python/stable/reference/selectors

This snippet selects the date and datetime columns by subtracting the cs.time() selector from cs.temporal(). It demonstrates how to combine selectors to exclude specific temporal subtypes. The DataFrame setup is the same as the temporal example.

```python
import polars as pl
import polars.selectors as cs
from datetime import date, datetime, time

df = pl.DataFrame(
    {
        "dtm": [datetime(2001, 5, 7, 10, 25), datetime(2031, 12, 31, 0, 30)],
        "dt": [date(1999, 12, 31), date(2024, 8, 9)],
        "tm": [time(0, 0, 0), time(23, 59, 59)],
    }
)
result = df.select(cs.temporal() - cs.time())
print(result)
```

--------------------------------

### Skip rows with OFFSET in Polars DataFrame SQL

Source: https://docs.pola.rs/api/python/stable/reference/sql/clauses

This example shows how to use the OFFSET clause in a SQL query to skip the first two rows of a Polars DataFrame. Combined with `LIMIT 2`, the query returns up to two of the remaining rows, starting from the third row. The DataFrame is created with sample data for demonstration.
```python
import polars as pl

df = pl.DataFrame(
    {
        "foo": ["b", "a", "c", "b"],
        "bar": [20, 10, 40, 30],
    }
)
df.sql("""
  SELECT foo, bar
  FROM self
  LIMIT 2 OFFSET 2
""")
```

--------------------------------

### Create Index Column with Expressions - Python

Source: https://docs.pola.rs/api/python/stable/reference/lazyframe/api/polars.LazyFrame

This example demonstrates an alternative way to create an index column using the Polars expressions int_range and len. It operates on a LazyFrame, selecting the index followed by all other columns. The index starts from 0 and is of UInt32 type; this method avoids potential performance drawbacks of with_row_index.

```python
lf.select(
    pl.int_range(pl.len(), dtype=pl.UInt32).alias("index"),
    pl.all(),
).collect()
```

--------------------------------

### Initialize QueryOptFlags in Python

Source: https://docs.pola.rs/api/python/stable/reference/lazyframe/api/polars

Initializes a QueryOptFlags instance with optional optimization flags. These flags control various aspects of query optimization such as predicate pushdown, projection pushdown, and expression simplification.

```python
class polars.QueryOptFlags(
    *,
    predicate_pushdown: None | bool = None,
    projection_pushdown: None | bool = None,
    simplify_expression: None | bool = None,
    slice_pushdown: None | bool = None,
    comm_subplan_elim: None | bool = None,
    comm_subexpr_elim: None | bool = None,
    cluster_with_columns: None | bool = None,
    collapse_joins: None | bool = None,
    check_order_observe: None | bool = None,
    fast_projection: None | bool = None,
)
```

--------------------------------

### Get Upper Bound of Integer Series (Polars)

Source: https://docs.pola.rs/api/python/stable/reference/series/api/polars.Series

This example demonstrates how to retrieve the upper bound for an integer Series using Polars. The method is called on a Polars Series and returns a new Series representing the upper bound of its dtype.
This is useful for understanding the maximum representable value for the Series' data type.

```python
>>> s = pl.Series("s", [-1, 0, 1], dtype=pl.Int8)
>>> s.upper_bound()
shape: (1,)
Series: 's' [i8]
[
    127
]
```

--------------------------------

### Catalog.list_namespaces(catalog_name)

Source: https://docs.pola.rs/api/python/stable/reference/catalog/unity

Lists the available namespaces (unity schema) under the specified catalog. This allows for discovery of the schemas within a particular catalog.

```APIDOC
## GET /catalogs/{catalog_name}/namespaces

### Description
Lists the available namespaces (unity schema) under the specified catalog.

### Method
GET

### Endpoint
/catalogs/{catalog_name}/namespaces

### Parameters
#### Path Parameters
- **catalog_name** (string) - Required - The name of the catalog.

#### Query Parameters
None

#### Request Body
None

### Response
#### Success Response (200)
- **namespaces** (array) - A list of namespace names.
```

--------------------------------

### Initialize Polars Schema with Data Types

Source: https://docs.pola.rs/api/python/stable/reference/schema/index

Demonstrates creating a Polars Schema by providing column names and their corresponding Polars data types. This method is useful for defining the structure of a DataFrame.

```python
>>> schema = pl.Schema(
...     {
...         "foo": pl.String(),
...         "bar": pl.Duration("us"),
...         "baz": pl.Array(pl.Int8, 4),
...     }
... )
>>> schema
Schema({'foo': String, 'bar': Duration(time_unit='us'), 'baz': Array(Int8, shape=(4,))})
```

--------------------------------

### Select Columns Starting With a Prefix using polars.selectors

Source: https://docs.pola.rs/api/python/stable/reference/selectors

Selects columns from a DataFrame whose names start with a specified substring. This selector is versatile and can accept multiple prefixes to match columns starting with any of them. It can also be negated to select columns that do not start with the given prefixes.
```python
>>> import polars.selectors as cs
>>> df = pl.DataFrame({
...     "foo": [1.0, 2.0],
...     "bar": [3.0, 4.0],
...     "baz": [5, 6],
...     "zap": [7, 8],
... })
>>> df.select(cs.starts_with("b"))
shape: (2, 2)
┌─────┬─────┐
│ bar ┆ baz │
│ --- ┆ --- │
│ f64 ┆ i64 │
╞═════╪═════╡
│ 3.0 ┆ 5   │
│ 4.0 ┆ 6   │
└─────┴─────┘
>>> df.select(cs.starts_with("b", "z"))
shape: (2, 3)
┌─────┬─────┬─────┐
│ bar ┆ baz ┆ zap │
│ --- ┆ --- ┆ --- │
│ f64 ┆ i64 ┆ i64 │
╞═════╪═════╪═════╡
│ 3.0 ┆ 5   ┆ 7   │
│ 4.0 ┆ 6   ┆ 8   │
└─────┴─────┴─────┘
>>> df.select(~cs.starts_with("b"))
shape: (2, 2)
┌─────┬─────┐
│ foo ┆ zap │
│ --- ┆ --- │
│ f64 ┆ i64 │
╞═════╪═════╡
│ 1.0 ┆ 7   │
│ 2.0 ┆ 8   │
└─────┴─────┘
```

--------------------------------

### Import styling helpers and create DataFrame (Python)

Source: https://docs.pola.rs/api/python/stable/reference/dataframe/style

Shows how to import the required selectors and styling helpers from Great Tables and create a Polars DataFrame with sample data. This setup is required before applying any styling functions. No external dependencies beyond Polars and Great Tables.

```python
>>> import polars.selectors as cs
>>> from great_tables import loc, style
>>> df = pl.DataFrame(
...     {
...         "site_id": [0, 1, 2],
...         "measure_a": [5, 4, 6],
...         "measure_b": [7, 3, 3],
...     }
... )
```

--------------------------------

### Polars SQL WHERE Clause Example

Source: https://docs.pola.rs/api/python/stable/reference/sql/clauses

Demonstrates filtering rows using the WHERE clause in Polars SQL. The example selects rows from a DataFrame where the 'foo' column value is greater than 42.

```python
import polars as pl

df = pl.DataFrame(
    {
        "foo": [30, 40, 50],
        "ham": ["a", "b", "c"],
    }
)
df.sql("""
  SELECT * FROM self WHERE foo > 42
""")
```

--------------------------------

### TableInfo Constructor Python

Source: https://docs.pola.rs/api/python/stable/reference/catalog/api/polars.catalog.unity

Initializes a TableInfo object with details about a catalog table.
It requires the table's name, ID, type, and properties, with optional fields for comments, storage location, format, columns, and timestamps.

```python
polars.catalog.unity.TableInfo(
    name: str,
    comment: str | None,
    table_id: str,
    table_type: TableType,
    storage_location: str | None,
    data_source_format: DataSourceFormat | None,
    columns: list[ColumnInfo] | None,
    properties: dict[str, str],
    created_at: datetime | None,
    created_by: str | None,
    updated_at: datetime | None,
    updated_by: str | None,
)
```

--------------------------------

### Polars SQL HAVING Clause Example

Source: https://docs.pola.rs/api/python/stable/reference/sql/clauses

Shows how to filter groups based on conditions using the HAVING clause in conjunction with GROUP BY in Polars SQL. This example selects groups where the sum of 'bar' is 40 or greater.

```python
import polars as pl

df = pl.DataFrame(
    {
        "foo": ["a", "b", "b", "c"],
        "bar": [10, 20, 30, 40],
    }
)
df.sql("""
  SELECT foo, SUM(bar) FROM self GROUP BY foo HAVING bar >= 40
""")
```

--------------------------------

### Catalog.list_catalogs()

Source: https://docs.pola.rs/api/python/stable/reference/catalog/unity

Lists the available catalogs. This method provides a way to discover the catalogs within a Unity metastore.

```APIDOC
## GET /catalogs

### Description
Lists the available catalogs within a Unity metastore.

### Method
GET

### Endpoint
/catalogs

### Parameters
#### Path Parameters
None

#### Query Parameters
None

#### Request Body
None

### Response
#### Success Response (200)
- **catalogs** (array) - A list of catalog names.
```

--------------------------------

### Expr.list.last

Source: https://docs.pola.rs/api/python/stable/reference/expressions/list

Gets the last value from each sublist.

```APIDOC
## Expr.list.last

### Description
Get the last value of the sublists.

### Method
`last()`

### Endpoint
Not applicable (Python library method).
### Parameters
#### Path Parameters
None

#### Query Parameters
None

#### Request Body
None

### Request Example
```python
from polars import col

df.select(col("my_list").list.last())
```

### Response
#### Success Response (200)
- **Last Element** (various types) - The last element of each sublist.
```

--------------------------------

### Catalog.list_tables(catalog_name, namespace)

Source: https://docs.pola.rs/api/python/stable/reference/catalog/unity

Lists the available tables under the specified schema. This allows for discovery of the tables within a specific namespace in a catalog.

```APIDOC
## GET /catalogs/{catalog_name}/namespaces/{namespace}/tables

### Description
Lists the available tables under the specified schema.

### Method
GET

### Endpoint
/catalogs/{catalog_name}/namespaces/{namespace}/tables

### Parameters
#### Path Parameters
- **catalog_name** (string) - Required - The name of the catalog.
- **namespace** (string) - Required - The name of the namespace.

#### Query Parameters
None

#### Request Body
None

### Response
#### Success Response (200)
- **tables** (array) - A list of table names.
```

--------------------------------

### Database

Source: https://docs.pola.rs/api/python/stable/reference/io

Functions for reading from and writing to databases.

```APIDOC
## read_database

### Description
Read the results of a SQL query into a DataFrame, given a connection object.

### Method
`read_database`

### Parameters
#### Path Parameters
- **query** (str) - Required - The SQL query to execute.
- **connection** (DBAPI connection) - Required - A DBAPI 2.0 connection object.

#### Query Parameters
- **...** (other parameters) - Additional parameters for database reading.

### Request Example
```python
# Example usage would go here if a request body was applicable
```

### Response
#### Success Response (200)
- **DataFrame** (polars.DataFrame) - The data resulting from the SQL query.
#### Response Example
```json
{
  "column1": [1, 2, 3],
  "column2": ["a", "b", "c"]
}
```

## read_database_uri

### Description
Read the results of a SQL query into a DataFrame, given a URI.

### Method
`read_database_uri`

### Parameters
#### Path Parameters
- **query** (str) - Required - The SQL query to execute.
- **uri** (str) - Required - The database connection URI.

#### Query Parameters
- **...** (other parameters) - Additional parameters for database reading.

### Request Example
```python
# Example usage would go here if a request body was applicable
```

### Response
#### Success Response (200)
- **DataFrame** (polars.DataFrame) - The data resulting from the SQL query.

#### Response Example
```json
{
  "column1": [1, 2, 3],
  "column2": ["a", "b", "c"]
}
```

## DataFrame.write_database

### Description
Write the data in a Polars DataFrame to a database.

### Method
`DataFrame.write_database`

### Parameters
#### Path Parameters
- **table_name** (str) - Required - The name of the table to write to.
- **connection** (DBAPI connection | str) - Required - A DBAPI 2.0 connection object or a database connection URI.

#### Query Parameters
- **...** (other parameters) - Additional parameters for database writing.

### Request Example
```python
# Example usage would go here if a request body was applicable
```

### Response
#### Success Response (200)
- **None** - This method writes to a database and does not return a value.

#### Response Example
```json
// No response body
```
```

--------------------------------

### Extract substring with SUBSTR function

Source: https://docs.pola.rs/api/python/stable/reference/sql/functions/string

Returns a substring starting at a 1-indexed position with a specified length. Takes the string column, start position (1-indexed), and length as parameters. Useful for extracting specific portions of text.
```python
import polars as pl

df = pl.DataFrame({"foo": ["apple", "banana", "orange", "grape"]})
df.sql("""
  SELECT foo, SUBSTR(foo, 3, 4) AS foo_3_4 FROM self
""")
```

--------------------------------

### Clipboard

Source: https://docs.pola.rs/api/python/stable/reference/io

Functions for reading from and writing to the system clipboard.

```APIDOC
## read_clipboard

### Description
Read text from clipboard and pass to `read_csv`.

### Method
`read_clipboard`

### Parameters
#### Path Parameters
- **separator** (str) - Optional - The separator to use when parsing the clipboard text as CSV.

### Request Example
```python
# Example usage would go here if a request body was applicable
```

### Response
#### Success Response (200)
- **DataFrame** (polars.DataFrame) - The data read from the clipboard.

#### Response Example
```json
{
  "column1": [1, 2, 3],
  "column2": ["a", "b", "c"]
}
```

## DataFrame.write_clipboard

### Description
Copy `DataFrame` in csv format to the system clipboard with `write_csv`.

### Method
`DataFrame.write_clipboard`

### Parameters
#### Path Parameters
- **separator** (str) - Optional - The separator to use for the CSV format.

### Request Example
```python
# Example usage would go here if a request body was applicable
```

### Response
#### Success Response (200)
- **None** - This method copies to the clipboard and does not return a value.

#### Response Example
```json
// No response body
```
```

--------------------------------

### Avro

Source: https://docs.pola.rs/api/python/stable/reference/io

Functions for reading from and writing to Apache Avro format.

```APIDOC
## read_avro

### Description
Read into a DataFrame from Apache Avro format.

### Method
`read_avro`

### Parameters
#### Path Parameters
- **source** (str | Path | io.BytesIO | io.BufferedReader) - Required - Path to the Avro file or buffer.

#### Query Parameters
- **columns** (list[str] | None) - Optional - Subset of column names to push down to the reader.
- **n_rows** (int | None) - Optional - Stop reading after this many rows.

### Request Example
```python
# Example usage would go here if a request body was applicable
```

### Response
#### Success Response (200)
- **DataFrame** (polars.DataFrame) - The data read from the Avro file.

#### Response Example
```json
{
  "column1": [1, 2, 3],
  "column2": ["a", "b", "c"]
}
```

## DataFrame.write_avro

### Description
Write to Apache Avro file.

### Method
`DataFrame.write_avro`

### Parameters
#### Path Parameters
- **file** (str | Path | io.BufferedWriter) - Required - Path to write the Avro file to.

#### Query Parameters
- **compression** (str) - Optional - Compression codec to use. Options: "uncompressed", "snappy", "deflate".
- **name** (str | None) - Optional - The name of the table to embed in the file.

### Request Example
```python
# Example usage would go here if a request body was applicable
```

### Response
#### Success Response (200)
- **None** - This method writes to a file and does not return a value.

#### Response Example
```json
// No response body
```
```

--------------------------------

### Expr.list.get

Source: https://docs.pola.rs/api/python/stable/reference/expressions/list

Gets the value at a specific index from sublists.

```APIDOC
## Expr.list.get

### Description
Get the value by index in the sublists.

### Method
`get(index, null_on_oob=False)`

### Endpoint
Not applicable (Python library method).

### Parameters
#### Path Parameters
None

#### Query Parameters
None

#### Request Body
None

### Request Example
```python
from polars import col

df.select(col("my_list").list.get(1))
```

### Response
#### Success Response (200)
- **Element at Index** (various types) - The element at the specified index in each sublist.
```

--------------------------------

### TableInfo Methods

Source: https://docs.pola.rs/api/python/stable/reference/catalog/index

Methods for retrieving schema information from `TableInfo` objects.
```APIDOC
## TableInfo Methods

### `TableInfo.get_polars_schema()`

#### Description
Retrieves the schema of the table in a format compatible with Polars.

#### Method
`GET`

#### Endpoint
`/tables/{table_name}/polars_schema` (relative to TableInfo object)

#### Parameters
None

#### Response
##### Success Response (200)
- **polars_schema** (dict) - A dictionary representing the Polars schema.

#### Response Example
```json
{
  "polars_schema": {"user_id": "int64", "username": "string"}
}
```
```

--------------------------------

### Create and Apply Reusable Polars Config Instances

Source: https://docs.pola.rs/api/python/stable/reference/config

This Python code snippet shows how to create multiple `Config` instances with `apply_on_context_enter=True`. These instances can be reused and applied as decorators to functions, enabling different configurations for various code sections. Dependencies include the `polars` library and `sys` for standard output.

```python
import polars as pl
import sys

cfg_verbose = pl.Config(verbose=True, apply_on_context_enter=True)
cfg_markdown = pl.Config(tbl_formatting="MARKDOWN", apply_on_context_enter=True)

@cfg_markdown
def write_markdown_frame_to_stdout(df: pl.DataFrame) -> None:
    sys.stdout.write(str(df))

@cfg_verbose
def do_various_things():
    # Function implementation would go here
    pass
```

--------------------------------

### Select First Column with Polars Selectors

Source: https://docs.pola.rs/api/python/stable/reference/selectors

Shows how to select the first column of a DataFrame using `cs.first()`. It also demonstrates selecting all columns *except* the first one using the negation operator `~`.

```python
>>> import polars as pl
>>> import polars.selectors as cs
>>> df = pl.DataFrame({
...     "foo": ["x", "y"],
...     "bar": [123, 456],
...     "baz": [2.0, 5.5],
...     "zap": [0, 1],
... })
>>> df.select(cs.first())
shape: (2, 1)
┌─────┐
│ foo │
│ --- │
│ str │
╞═════╡
│ x   │
│ y   │
└─────┘
>>> df.select(~cs.first())
shape: (2, 3)
┌─────┬─────┬─────┐
│ bar ┆ baz ┆ zap │
│ --- ┆ --- ┆ --- │
│ i64 ┆ f64 ┆ i64 │
╞═════╪═════╪═════╡
│ 123 ┆ 2.0 ┆ 0   │
│ 456 ┆ 5.5 ┆ 1   │
└─────┴─────┴─────┘
```

--------------------------------

### Expr.list.gather_every

Source: https://docs.pola.rs/api/python/stable/reference/expressions/list

Takes every n-th value from sublists, starting from an offset.

```APIDOC
## Expr.list.gather_every

### Description
Take every n-th value in the sublists, starting from an offset.

### Method
`gather_every(n, offset=0)`

### Endpoint
Not applicable (Python library method).

### Parameters
#### Path Parameters
None

#### Query Parameters
None

#### Request Body
None

### Request Example
```python
from polars import col

df.select(col("my_list").list.gather_every(2, offset=1))
```

### Response
#### Success Response (200)
- **Gathered Every Nth** (list) - A list containing every n-th element starting from the offset.
```

--------------------------------

### Expr.list.item

Source: https://docs.pola.rs/api/python/stable/reference/expressions/list

Gets the single value from each sublist, expecting lists to contain only one element.

```APIDOC
## Expr.list.item

### Description
Get the single value of the sublists.

### Method
`item(allow_empty=False)`

### Endpoint
Not applicable (Python library method).

### Parameters
#### Path Parameters
None

#### Query Parameters
None

#### Request Body
None

### Request Example
```python
from polars import col

df.select(col("single_element_list").list.item())
```

### Response
#### Success Response (200)
- **Single Element** (various types) - The single element contained within each sublist.
```

--------------------------------

### Check binary prefix with polars Expr.bin.starts_with (Python)

Source: https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.bin

Demonstrates how to use polars Expr.bin.starts_with to test binary columns against a literal prefix and a dynamic column prefix. Requires the Polars library and a DataFrame with binary data. Returns a Boolean column indicating match results.

```python
>>> colors = pl.DataFrame(
...     {
...         "name": ["black", "yellow", "blue"],
...         "code": [b"\x00\x00\x00", b"\xff\xff\x00", b"\x00\x00\xff"],
...         "prefix": [b"\x00", b"\xff\x00", b"\x00\x00"],
...     }
... )
>>> colors.select(
...     "name",
...     pl.col("code").bin.starts_with(b"\xff").alias("starts_with_lit"),
...     pl.col("code")
...     .bin.starts_with(pl.col("prefix"))
...     .alias("starts_with_expr"),
... )
```

--------------------------------

### Import Polars Testing Assertions

Source: https://docs.pola.rs/api/python/stable/reference/testing

Demonstrates how to import specific assertion functions from the Polars testing module for use in unit tests. These functions help verify the equality or inequality of DataFrames and Series.

```python
from polars.testing import assert_frame_equal, assert_series_equal
```

--------------------------------

### Expr.cat.starts_with()

Source: https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.cat

Checks if string representations of values in a categorical column start with a specified substring. This method requires a literal string prefix.

```APIDOC
## Expr.cat.starts_with

### Description
Checks if string representations of values start with a substring. This is a method on the `Expr.cat` namespace for categorical data.

### Method
This is a method within the Polars expression API, not a direct HTTP request.

### Endpoint
N/A (Polars Expression API)

### Parameters
#### Path Parameters
None

#### Query Parameters
None

#### Request Body
This method operates on an expression, not a request body.
### Request Example
```python
# Example usage within a Polars DataFrame
import polars as pl

df = pl.DataFrame({
    "fruits": pl.Series(["apple", "mango", None], dtype=pl.Categorical)
})

# Applying starts_with to a column
result_df = df.with_columns(
    pl.col("fruits").cat.starts_with("app").alias("has_prefix")
)

print(result_df)
```

### Response
#### Success Response (Output of the expression)
- **has_prefix** (bool) - Returns `true` if the string starts with the prefix, `false` otherwise. Returns `null` if the input is `null`.

#### Response Example
```
shape: (3, 2)
┌────────┬────────────┐
│ fruits ┆ has_prefix │
│ ---    ┆ ---        │
│ cat    ┆ bool       │
╞════════╪════════════╡
│ apple  ┆ true       │
│ mango  ┆ false      │
│ null   ┆ null       │
└────────┴────────────┘
```

### See Also
- `contains`: Check if the string representation contains a substring that matches a pattern.
- `ends_with`: Check if the string representation ends with a substring.

### Notes
`cat.starts_with` requires a literal string value for the prefix, unlike `str.starts_with`, which can accept expression inputs.
```

--------------------------------

### Select all signed integer columns

Source: https://docs.pola.rs/api/python/stable/reference/selectors

This snippet demonstrates how to select columns that are signed integers. It also includes an example of selecting all columns that are not signed integers.

```python
import polars as pl
import polars.selectors as cs

df = pl.DataFrame({
    "foo": [-123, -456],
    "bar": [3456, 6789],
    "baz": [7654, 4321],
    "zap": ["ab", "cd"],
}, schema_overrides={"bar": pl.UInt32, "baz": pl.UInt64})

# Select all signed integer columns
print(df.select(cs.signed_integer()))

# Select all columns except for those that are signed integers
print(df.select(~cs.signed_integer()))
```

--------------------------------

### Assert Series Not Equal - Python

Source: https://docs.pola.rs/api/python/stable/reference/api/polars.testing

Demonstrates the usage of polars.testing.assert_series_not_equal to check inequality between two Polars Series.
The example shows how the function is called and the AssertionError it raises when the two Series are equal.

```python
>>> from polars.testing import assert_series_not_equal
>>> s1 = pl.Series([1, 2, 3])
>>> s2 = pl.Series([1, 2, 3])
>>> assert_series_not_equal(s1, s2)
Traceback (most recent call last):
...
AssertionError: Series are equal (but are expected not to be)
```

--------------------------------

### Register Custom Series Namespace for Mathematical Operations

Source: https://docs.pola.rs/api/python/stable/reference/api

Shows how to create a 'math' namespace on Series objects with square and cube operations. Demonstrates implementing mathematical shortcuts using the register_series_namespace decorator.

```python
import polars as pl


@pl.api.register_series_namespace("math")
class MathShortcuts:
    def __init__(self, s: pl.Series) -> None:
        self._s = s

    def square(self) -> pl.Series:
        return self._s * self._s

    def cube(self) -> pl.Series:
        return self._s * self._s * self._s


s = pl.Series("n", [1, 2, 3, 4, 5])

# The registered namespace is available as an attribute on any Series
s2 = s.math.square().rename("n2")
s3 = s.math.cube().rename("n3")
```

--------------------------------

### Import PyArrow Schema into Polars

Source: https://docs.pola.rs/api/python/stable/reference/schema/index

Demonstrates how to create a Polars Schema by importing an existing pyarrow schema. This facilitates interoperability between Polars and PyArrow.

```python
>>> import pyarrow as pa
>>> pl.Schema(pa.schema([pa.field("x", pa.int32())]))
Schema({'x': Int32})
```

--------------------------------

### Get Polars Schema Python

Source: https://docs.pola.rs/api/python/stable/reference/catalog/api/polars.catalog.unity

Retrieves the native Polars schema for the catalog table. This method does not take any arguments and returns the schema representation compatible with Polars DataFrames.
```python
table_info.get_polars_schema()
```

--------------------------------

### Partition LazyFrame using sink_ipc

Source: https://docs.pola.rs/api/python/stable/reference/api/polars.LazyFrame

This snippet illustrates how to partition a LazyFrame into multiple files using a hive-partitioning style based on a column. The `mkdir=True` argument ensures that the necessary directories are created. This is effective for organizing large datasets.

```python
>>> pl.LazyFrame({"x": [1, 2, 1], "y": [3, 4, 5]}).sink_ipc(
...     pl.PartitionByKey("./out/", by="x"),
...     mkdir=True
... )
```

--------------------------------

### Check if an Object is a Polars Selector

Source: https://docs.pola.rs/api/python/stable/reference/selectors

Provides an example of using `is_selector` to determine if a given object is a Polars selector. This is useful for validating inputs or understanding expression types.

```python
>>> from polars.selectors import is_selector
>>> import polars.selectors as cs
>>> is_selector(pl.col("colx"))
False
>>> is_selector(cs.first() | cs.last())
True
```

--------------------------------

### Use Config as a Context Manager (Direct Init)

Source: https://docs.pola.rs/api/python/stable/reference/config

Illustrates a cleaner way to use Polars Config as a context manager by setting options directly in the constructor. This approach simplifies the code for temporary configuration changes.

```python
import polars as pl

with pl.Config(verbose=True):
    # Code that benefits from verbose logging
    # ...
    do_various_things()
    ...

# Verbose logging is automatically disabled here
```

--------------------------------

### LOG10 - Logarithm Base 10 in Polars

Source: https://docs.pola.rs/api/python/stable/reference/sql/functions/math

Computes the logarithm of the given value in base 10. The example uses Polars DataFrame operations and SQL expressions.
```python
df = pl.DataFrame({"a": [1, 2, 4]})
df.sql("""
  SELECT a, LOG10(a) AS log10_a FROM self
""")
```

--------------------------------

### Iceberg

Source: https://docs.pola.rs/api/python/stable/reference/io

Functions for scanning and writing to Apache Iceberg tables.

```APIDOC
## scan_iceberg

### Description
Lazily read from an Apache Iceberg table.

### Method
`scan_iceberg`

### Parameters
#### Path Parameters
- **source** (str | Path) - Required - Path to the Iceberg table.

#### Query Parameters
- **snapshot_id** (int | None) - Optional - Specify a snapshot ID to read.
- **...** (other parameters) - Additional parameters for Iceberg scanning.

### Request Example
```python
# Example usage would go here if a request body was applicable
```

### Response
#### Success Response (200)
- **LazyFrame** (polars.LazyFrame) - A lazy representation of the data.

#### Response Example
```json
// Response is a lazy frame, not a direct data structure
```

## DataFrame.write_iceberg

### Description
Write DataFrame to an Iceberg table.

### Method
`DataFrame.write_iceberg`

### Parameters
#### Path Parameters
- **target** (str | Path) - Required - Path to the Iceberg table to write to.

#### Query Parameters
- **mode** (str) - Required - Write mode: "append", "overwrite".
- **...** (other parameters) - Additional parameters for Iceberg writing.

### Request Example
```python
# Example usage would go here if a request body was applicable
```

### Response
#### Success Response (200)
- **None** - This method writes to an Iceberg table and does not return a value.

#### Response Example
```json
// No response body
```
```

--------------------------------

### LN - Natural Logarithm in Polars

Source: https://docs.pola.rs/api/python/stable/reference/sql/functions/math

Computes the natural logarithm of the given value. The example shows usage on a Polars DataFrame through SQL syntax.
```python
df = pl.DataFrame({"a": [1, 2, 4]})
df.sql("""
  SELECT a, LN(a) AS ln_a FROM self
""")
```

--------------------------------

### Select decimal columns using Polars selectors

Source: https://docs.pola.rs/api/python/stable/reference/selectors

Shows how to select decimal columns using the cs.decimal() selector and how to exclude them using negation. Demonstrates selecting all decimal columns and selecting all non-decimal columns in a DataFrame.

```python
df.select(cs.decimal())
```

```python
df.select(~cs.decimal())
```

--------------------------------

### Get Polars Schema Column Names using names()

Source: https://docs.pola.rs/api/python/stable/reference/schema/index

Retrieves a list of all column names defined in the Polars Schema. This method is essential for accessing or referencing columns by their names.

```python
>>> s = pl.Schema({"x": pl.Float64(), "y": pl.Datetime(time_zone="UTC")})
>>> s.names()
['x', 'y']
```

--------------------------------

### Get Polars Config State (Python)

Source: https://docs.pola.rs/api/python/stable/reference/api/polars.Config

Retrieves the current state of Polars configuration variables. It can return all states or only those that have been explicitly set. Optionally, it can filter to include only environment variables.

```python
>>> set_state = pl.Config.state(if_set=True)
>>> all_state = pl.Config.state()
```