### Initialize DataFrame for examples Source: https://docs.pola.rs/api/python/stable/reference/dataframe/api/polars.DataFrame.map_rows.html Setup code used for the subsequent map_rows examples. ```python >>> df = pl.DataFrame({"foo": [1, 2, 3], "bar": [-1, 5, 8]}) ``` -------------------------------- ### Initialize LazyFrames for update examples Source: https://docs.pola.rs/api/python/stable/reference/lazyframe/api/polars.LazyFrame.update.html Set up the initial LazyFrame and the update source LazyFrame used in subsequent examples. ```python >>> lf = pl.LazyFrame( ... { ... "A": [1, 2, 3, 4], ... "B": [400, 500, 600, 700], ... } ... ) >>> lf.collect() shape: (4, 2) ┌─────┬─────┐ │ A ┆ B │ │ --- ┆ --- │ │ i64 ┆ i64 │ ╞═════╪═════╡ │ 1 ┆ 400 │ │ 2 ┆ 500 │ │ 3 ┆ 600 │ │ 4 ┆ 700 │ └─────┴─────┘ >>> new_lf = pl.LazyFrame( ... { ... "B": [-66, None, -99], ... "C": [5, 3, 1], ... } ... ) ``` -------------------------------- ### Initialize DataFrame for map_elements examples Source: https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.map_elements.html Set up a basic DataFrame to demonstrate element-wise mapping. ```python >>> df = pl.DataFrame( ... { ... "a": [1, 2, 3, 1], ... "b": ["a", "b", "c", "c"], ... } ... ) ``` -------------------------------- ### Initialize DataFrame for examples Source: https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.str.find.html Set up a sample DataFrame containing text and patterns for subsequent search operations. ```python >>> df = pl.DataFrame( ... { ... "txt": ["Crab", "Lobster", None, "Crustacean"], ... "pat": ["a[bc]", "b.t", "[aeiuo]", "(?i)A[BC]"], ... } ... ) ``` -------------------------------- ### Initialize LazyFrame for casting examples Source: https://docs.pola.rs/api/python/stable/reference/lazyframe/api/polars.LazyFrame.cast.html Set up a LazyFrame with integer, float, and date columns for demonstration. ```python >>> from datetime import date >>> lf = pl.LazyFrame( ... { ... 
"foo": [1, 2, 3], ... "bar": [6.0, 7.0, 8.0], ... "ham": [date(2020, 1, 2), date(2021, 3, 4), date(2022, 5, 6)], ... } ... ) ``` -------------------------------- ### Create DataFrame for Sorting Examples Source: https://docs.pola.rs/api/python/stable/reference/dataframe/api/polars.DataFrame.sort.html Initializes a Polars DataFrame used in the subsequent sorting examples. No specific setup is required beyond having Polars installed. ```python >>> df = pl.DataFrame( ... { ... "a": [1, 2, None], ... "b": [6.0, 5.0, 4.0], ... "c": ["a", "c", "b"], ... } ... ) ``` -------------------------------- ### Initialize LazyFrames for SQLContext Source: https://docs.pola.rs/api/python/stable/reference/sql/api/polars.SQLContext.register_many.html Setup multiple LazyFrame objects to be used in subsequent SQLContext registration examples. ```python >>> lf1 = pl.LazyFrame({"a": [1, 2, 3], "b": ["m", "n", "o"]}) >>> lf2 = pl.LazyFrame({"a": [2, 3, 4], "c": ["p", "q", "r"]}) >>> lf3 = pl.LazyFrame({"a": [3, 4, 5], "b": ["s", "t", "u"]}) >>> lf4 = pl.LazyFrame({"a": [4, 5, 6], "c": ["v", "w", "x"]}) ``` -------------------------------- ### Create DataFrame for Examples Source: https://docs.pola.rs/api/python/stable/reference/expressions/col.html Initializes a Polars DataFrame with 'foo' and 'bar' columns for subsequent examples. ```python >>> from polars import col >>> df = pl.DataFrame( ... { ... ``` -------------------------------- ### Initialize DataFrame for Selector Examples Source: https://docs.pola.rs/api/python/stable/reference/selectors.html Sets up a sample Polars DataFrame used in various selector examples. Ensure `polars.selectors` is imported as `cs`. ```python >>> import polars.selectors as cs >>> df = pl.DataFrame( ... { ... "foo": ["x", "y"], ... "bar": [123, 456], ... "baz": [2.0, 5.5], ... "zap": [0, 1], ... } ... 
) ``` -------------------------------- ### Create DataFrame Source: https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.fold.html Initializes a polars DataFrame for subsequent examples. No specific setup is required beyond importing polars. ```python >>> df = pl.DataFrame( ... { ... "a": [1, 2, 3], ... "b": [3, 4, 5], ... "c": [5, 6, 7], ... } ... ) >>> df shape: (3, 3) ┌─────┬─────┬─────┐ │ a ┆ b ┆ c │ │ --- ┆ --- ┆ --- │ │ i64 ┆ i64 ┆ i64 │ ╞═════╪═════╪═════╡ │ 1 ┆ 3 ┆ 5 │ │ 2 ┆ 4 ┆ 6 │ │ 3 ┆ 5 ┆ 7 │ └─────┴─────┴─────┘ ``` -------------------------------- ### Create a DataFrame with temporal and time columns Source: https://docs.pola.rs/api/python/stable/reference/selectors.html This setup code demonstrates creating a Polars DataFrame with date, time, and datetime columns for selector examples. ```python >>> from datetime import date, datetime, time >>> import polars.selectors as cs >>> df = pl.DataFrame( ... { ... "dtm": [datetime(2001, 5, 7, 10, 25), datetime(2031, 12, 31, 0, 30)], ... "dt": [date(1999, 12, 31), date(2024, 8, 9)], ... "tm": [time(0, 0, 0), time(23, 59, 59)], ... }, ... ) ``` -------------------------------- ### Draw Interactive Series Examples Source: https://docs.pola.rs/api/python/stable/reference/api/polars.testing.parametric.series.html Call the .example() method on a series strategy to get a concrete instance of a generated Series. This is useful for interactive development but should be avoided during test execution. ```python from polars.testing.parametric import series from polars.testing.parametric import lists s = series(strategy=lists(pl.String, select_from=["xx", "yy", "zz"])) s.example() ``` -------------------------------- ### Initialize DataFrame for Casting Examples Source: https://docs.pola.rs/api/python/stable/reference/dataframe/api/polars.DataFrame.cast.html This code initializes a sample DataFrame with integer, float, and date columns, which is used in subsequent casting examples. 
```python from datetime import date df = pl.DataFrame({ "foo": [1, 2, 3], "bar": [6.0, 7.0, 8.0], "ham": [date(2020, 1, 2), date(2021, 3, 4), date(2022, 5, 6)], }) ``` -------------------------------- ### Initialize DataFrame for top_k_by examples Source: https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.top_k_by.html Creates a sample DataFrame with columns 'a', 'b', and 'c' to demonstrate top_k_by functionality. ```python >>> df = pl.DataFrame( ... { ... "a": [1, 2, 3, 4, 5, 6], ... "b": [6, 5, 4, 3, 2, 1], ... "c": ["Apple", "Orange", "Apple", "Apple", "Banana", "Banana"], ... } ... ) >>> df shape: (6, 3) ┌─────┬─────┬────────┐ │ a ┆ b ┆ c │ │ --- ┆ --- ┆ --- │ │ i64 ┆ i64 ┆ str │ ╞═════╪═════╪════════╡ │ 1 ┆ 6 ┆ Apple │ │ 2 ┆ 5 ┆ Orange │ │ 3 ┆ 4 ┆ Apple │ │ 4 ┆ 3 ┆ Apple │ │ 5 ┆ 2 ┆ Banana │ │ 6 ┆ 1 ┆ Banana │ └─────┴─────┴────────┘ ``` -------------------------------- ### length Source: https://docs.pola.rs/api/python/stable/reference/expressions/index.html Get the length of a slice. If `None`, all rows starting from the offset are selected. ```APIDOC ## length ### Description Get the length of a slice. If `None`, all rows starting from the offset are selected. ### Parameters #### Path Parameters * **offset** (integer) - The starting index of the slice. * **length** (integer or None) - The number of elements to include in the slice. If `None`, all elements from the offset to the end are included. ### Example ```python df.select(pl.all().slice(1, 2)) ``` ``` -------------------------------- ### Initialize DataFrame for bottom_k examples Source: https://docs.pola.rs/api/python/stable/reference/dataframe/api/polars.DataFrame.bottom_k.html Create a sample DataFrame with columns 'a' and 'b' to demonstrate row selection methods. ```python >>> df = pl.DataFrame( ... { ... "a": ["a", "b", "a", "b", "b", "c"], ... "b": [2, 1, 1, 3, 2, 1], ... } ... 
) ``` -------------------------------- ### DataFrame.slice() Source: https://docs.pola.rs/api/python/stable/reference/dataframe/api/polars.DataFrame.slice.html Get a slice of this DataFrame by specifying a starting offset and an optional length. ```APIDOC ## DataFrame.slice() ### Description Get a slice of this DataFrame. This method allows you to extract a subset of rows starting from a specified offset and continuing for a given length. ### Method DataFrame.slice ### Endpoint Not applicable (this is a method of a DataFrame object, not a REST API endpoint). ### Parameters #### Path Parameters None #### Query Parameters None #### Request Body None ### Parameters * **offset** (int) - Required - Start index. Negative indexing is supported. * **length** (int | None) - Optional - Length of the slice. If set to `None`, all rows starting at the offset will be selected. ### Request Example ```python df = pl.DataFrame({ "foo": [1, 2, 3], "bar": [6.0, 7.0, 8.0], "ham": ["a", "b", "c"], }) df.slice(1, 2) ``` ### Response #### Success Response (DataFrame) Returns a new DataFrame containing the specified slice of rows. #### Response Example ``` shape: (2, 3) ┌─────┬─────┬─────┐ │ foo ┆ bar ┆ ham │ │ --- ┆ --- ┆ --- │ │ i64 ┆ f64 ┆ str │ ╞═════╪═════╪═════╡ │ 2 ┆ 7.0 ┆ b │ │ 3 ┆ 8.0 ┆ c │ └─────┴─────┴─────┘ ``` ``` -------------------------------- ### Initialize LazyFrame for top_k examples Source: https://docs.pola.rs/api/python/stable/reference/lazyframe/api/polars.LazyFrame.top_k.html Create a sample LazyFrame to demonstrate top_k functionality. ```python >>> lf = pl.LazyFrame( ... { ... "a": ["a", "b", "a", "b", "b", "c"], ... "b": [2, 1, 1, 3, 2, 1], ... } ... ) ``` -------------------------------- ### Check if strings start with a dynamic prefix from another column Source: https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.str.starts_with.html This example demonstrates checking if string values start with a prefix defined in another column. 
This is useful for more complex pattern matching. ```python >>> df = pl.DataFrame( ... {"fruits": ["apple", "mango", "banana"], "prefix": ["app", "na", "ba"]} ... ) >>> df.with_columns( ... pl.col("fruits").str.starts_with(pl.col("prefix")).alias("has_prefix"), ... ) shape: (3, 3) ┌────────┬────────┬────────────┐ │ fruits ┆ prefix ┆ has_prefix │ │ --- ┆ --- ┆ --- │ │ str ┆ str ┆ bool │ ╞════════╪════════╪════════════╡ │ apple ┆ app ┆ true │ │ mango ┆ na ┆ false │ │ banana ┆ ba ┆ true │ └────────┴────────┴────────────┘ ``` -------------------------------- ### Initialize DataFrame for examples Source: https://docs.pola.rs/api/python/stable/reference/dataframe/api/polars.DataFrame.rows_by_key.html Create a sample DataFrame used for demonstrating row grouping operations. ```python >>> df = pl.DataFrame( ... { ... "w": ["a", "b", "b", "a"], ... "x": ["q", "q", "q", "k"], ... "y": [1.0, 2.5, 3.0, 4.5], ... "z": [9, 8, 7, 6], ... } ... ) ``` -------------------------------- ### slice Source: https://docs.pola.rs/api/python/stable/reference/dataframe/index.html Get a slice of the DataFrame. This selects a contiguous subset of rows starting from a given offset. ```APIDOC ## slice ### Description Get a slice of the DataFrame. This selects a contiguous subset of rows starting from a given offset. ### Parameters #### Path Parameters - **offset** (int) - The starting row index of the slice. - **length** (int) - The number of rows to include in the slice. If `None`, all rows starting at the offset will be selected. ### Examples ```python >>> df = pl.DataFrame({ ... "foo": [1, 2, 3], ... "bar": [6.0, 7.0, 8.0], ... "ham": ["a", "b", "c"] ... 
}) >>> df.slice(1, 2) shape: (2, 3) ┌─────┬─────┬─────┐ │ foo ┆ bar ┆ ham │ │ --- ┆ --- ┆ --- │ │ i64 ┆ f64 ┆ str │ ╞═════╪═════╪═════╡ │ 2 ┆ 7.0 ┆ b │ │ 3 ┆ 8.0 ┆ c │ └─────┴─────┴─────┘ ``` ``` -------------------------------- ### Specify tables with FROM Source: https://docs.pola.rs/api/python/stable/reference/sql/clauses.html Demonstrates various ways to use the FROM clause, including as a leading clause. ```python df = pl.DataFrame( { "a": [1, 2, 3], "b": ["zz", "yy", "xx"], } ) for query in ( "SELECT * FROM self", "FROM self SELECT *", "FROM self", ): df.sql(query) ``` ```python df.sql(""" FROM self SELECT b, a """) ``` -------------------------------- ### Explain Query Plan Source: https://docs.pola.rs/api/python/stable/reference/lazyframe/index.html Illustrates how to generate a string representation of the query plan for a LazyFrame, with various optimization flags available. ```python explain( *, format: ExplainFormat = 'plain', optimized: bool = True, type_coercion: bool = True, predicate_pushdown: bool = True, projection_pushdown: bool = True, simplify_expression: bool = True, slice_pushdown: bool = True, comm_subplan_elim: bool = True, comm_subexpr_elim: bool = True, cluster_with_columns: bool = True, collapse_joins: bool = True, streaming: bool = False, engine: EngineType = 'auto', tree_format: bool | None = None, optimizations: QueryOptFlags = (), ) -> str ``` -------------------------------- ### Series.slice() Source: https://docs.pola.rs/api/python/stable/reference/series/api/polars.Series.slice.html Get a slice of this Series. Allows extracting a sub-section of the Series by specifying a starting offset and an optional length. ```APIDOC ## Series.slice() ### Description Get a slice of this Series. Negative indexing is supported for the offset. ### Method This is a method of the Series object, not a standalone API endpoint. 
### Endpoint N/A (Instance method) ### Parameters #### Path Parameters None #### Query Parameters None #### Request Body None ### Parameters - **offset** (int) - Required - Start index. Negative indexing is supported. - **length** (int | None) - Optional - Length of the slice. If set to `None`, all rows starting at the offset will be selected. ### Request Example ```python import polars as pl s = pl.Series("a", [1, 2, 3, 4]) result = s.slice(1, 2) print(result) ``` ### Response #### Success Response (Series) - **Series** - A new Series object containing the sliced data. #### Response Example ``` shape: (2,) Series: 'a' [i64] [ 2 3 ] ``` ``` -------------------------------- ### LazyFrame Explain Example Source: https://docs.pola.rs/api/python/stable/reference/lazyframe/api/polars.LazyFrame.explain.html Demonstrates how to use the explain() method on a LazyFrame after performing a group_by and aggregation. This shows the query plan for the operations performed. ```python lf = pl.LazyFrame({ "a": ["a", "b", "a", "b", "b", "c"], "b": [1, 2, 3, 4, 5, 6], "c": [6, 5, 4, 3, 2, 1], }) lf.group_by("a", maintain_order=True).agg(pl.all().sum()).sort( "a" ).explain() ``` -------------------------------- ### Select every nth value with offset using gather_every Source: https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.gather_every.html Use gather_every with an offset to select every nth value starting from a specific index. This example selects every 3rd value starting from the second element (index 1). ```python >>> df.select(pl.col("foo").gather_every(3, offset=1)) ``` -------------------------------- ### CASE Expression Examples Source: https://docs.pola.rs/api/python/stable/reference/sql/functions/conditional.html Demonstrates both searched and simple CASE expressions for conditional logic in SQL. 
```python df = pl.DataFrame({"n": [10, 20, 30, 40, 50]}) df.sql(""" SELECT n, CASE WHEN n <= 20 THEN 'small' WHEN n <= 40 THEN 'medium' ELSE 'large' END AS size FROM self """) ``` ```python df = pl.DataFrame({"lbl": ["a", "b", "c", "d"]}) df.sql(""" SELECT lbl, CASE lbl WHEN 'a' THEN 'alpha' WHEN 'b' THEN 'beta' ELSE 'other' END AS name FROM self """) ``` -------------------------------- ### Get Array Width with DataTypeExpr.arr.width() Source: https://docs.pola.rs/api/python/stable/reference/datatype_expr/api/polars.DataTypeExpr.arr.width.html Use DataTypeExpr.arr.width() to retrieve the width of an array. This example demonstrates its usage with a simple integer array. ```python >>> pl.select(pl.Array(pl.Int8, (1, 2, 3)).to_dtype_expr().arr.width()) shape: (1, 1) ┌─────────┐ │ literal │ │ --- │ │ u32 │ ╞═════════╡ │ 1 │ └─────────┘ ``` -------------------------------- ### Initialize DataFrame Source: https://docs.pola.rs/api/python/stable/reference/dataframe/api/polars.DataFrame.remove.html Creates a sample DataFrame for demonstrating the remove method. This setup is required before applying any remove operations. ```python >>> df = pl.DataFrame( ... { ... "foo": [2, 3, None, 4, 0], ... "bar": [5, 6, None, None, 0], ... "ham": ["a", "b", None, "c", "d"], ... } ... ) ``` -------------------------------- ### Execute SHOW TABLES SQL command Source: https://docs.pola.rs/api/python/stable/reference/sql/api/polars.SQLContext.tables.html Demonstrates how to execute the 'SHOW TABLES' SQL command within a SQLContext to retrieve registered table names. The result is returned as a DataFrame. 
```python >>> frame_data = pl.DataFrame({"hello": ["world"]}) >>> ctx = pl.SQLContext(hello_world=frame_data) >>> ctx.execute("SHOW TABLES", eager=True) shape: (1, 1) ┌─────────────┐ │ name │ │ --- │ │ str │ ╞═════════════╡ │ hello_world │ └─────────────┘ ``` -------------------------------- ### Series.rle_id() Source: https://docs.pola.rs/api/python/stable/reference/series/api/polars.Series.rle_id.html Get a distinct integer ID for each run of identical values. The ID starts at 0 and increases by one each time the value of the column changes. ```APIDOC ## Series.rle_id() ### Description Get a distinct integer ID for each run of identical values. The ID starts at 0 and increases by one each time the value of the column changes. ### Method This is a method of the Polars Series object, not a standalone API endpoint. ### Endpoint N/A (Instance method) ### Parameters This method does not accept any parameters. ### Request Body N/A ### Response #### Success Response - **Series** (Series[UInt32]) - A Series of data type `UInt32` containing the run-length encoding IDs. #### Response Example ```python >>> s = pl.Series("s", [1, 1, 2, 1, None, 1, 3, 3]) >>> s.rle_id() shape: (8,) Series: 's' [u32] [ 0 0 1 2 3 4 5 5 ] ``` ### Notes This functionality is especially useful for defining a new group for every time a column’s value changes, rather than for every distinct value of that column. ### See Also - `rle` ``` -------------------------------- ### Slice LazyFrame rows Source: https://docs.pola.rs/api/python/stable/reference/lazyframe/api/polars.LazyFrame.slice.html Use LazyFrame.slice to get a subset of rows starting from a specific offset with a defined length. Negative indexing is supported for the offset. ```python >>> lf = pl.LazyFrame( ... { ... "a": ["x", "y", "z"], ... "b": [1, 3, 5], ... "c": [2, 4, 6], ... } ... 
) >>> lf.slice(1, 2).collect() shape: (2, 3) ┌─────┬─────┬─────┐ │ a ┆ b ┆ c │ │ --- ┆ --- ┆ --- │ │ str ┆ i64 ┆ i64 │ ╞═════╪═════╪═════╡ │ y ┆ 3 ┆ 4 │ │ z ┆ 5 ┆ 6 │ └─────┴─────┴─────┘ ``` -------------------------------- ### Catalog Client Initialization Source: https://docs.pola.rs/api/python/stable/reference/catalog/unity.html Initialize the Unity Catalog client with workspace URL and optional bearer token. ```APIDOC ## Catalog Client Initialization ### Description Initializes the Unity Catalog client. Requires the workspace URL and optionally accepts a bearer token for authentication. ### Method Constructor ### Endpoint N/A ### Parameters #### Path Parameters None #### Query Parameters None #### Request Body None ### Request Example ```python import polars as pl # Initialize with workspace URL catalog = pl.Catalog("your_workspace_url") # Initialize with workspace URL and bearer token catalog = pl.Catalog("your_workspace_url", bearer_token="your_bearer_token") ``` ### Response #### Success Response (200) N/A (Initialization) #### Response Example N/A ``` -------------------------------- ### Expr.rle_id() Source: https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.rle_id.html Get a distinct integer ID for each run of identical values. The ID starts at 0 and increases by one each time the value of the column changes. ```APIDOC ## Expr.rle_id() ### Description Get a distinct integer ID for each run of identical values. The ID starts at 0 and increases by one each time the value of the column changes. ### Method This is a method of the `Expr` object in Polars. ### Endpoint N/A (This is a Python method, not a REST API endpoint) ### Parameters This method does not take any parameters. ### Request Body N/A ### Response #### Success Response - **Returns**: `Expr` - An expression of data type `UInt32`. 
### Response Example ```python # Example usage within a Polars DataFrame df = pl.DataFrame({ "a": [1, 2, 1, 1, 1], "b": ["x", "x", None, "y", "y"], }) df.with_columns( rle_id_a=pl.col("a").rle_id(), rle_id_ab=pl.struct("a", "b").rle_id(), ) ``` #### Output Example ``` shape: (5, 4) ┌─────┬──────┬──────────┬───────────┐ │ a ┆ b ┆ rle_id_a ┆ rle_id_ab │ │ --- ┆ --- ┆ --- ┆ --- │ │ i64 ┆ str ┆ u32 ┆ u32 │ ╞═════╪══════╪══════════╪═══════════╡ │ 1 ┆ x ┆ 0 ┆ 0 │ │ 2 ┆ x ┆ 1 ┆ 1 │ │ 1 ┆ null ┆ 2 ┆ 2 │ │ 1 ┆ y ┆ 2 ┆ 3 │ │ 1 ┆ y ┆ 2 ┆ 3 │ └─────┴──────┴──────────┴───────────┘ ``` ### Notes This functionality is especially useful for defining a new group for every time a column’s value changes, rather than for every distinct value of that column. ### See Also - `rle` ``` -------------------------------- ### Initialize DataFrame for Expr.exclude examples Source: https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.exclude.html Sets up a sample DataFrame with mixed types to demonstrate column exclusion. ```python >>> df = pl.DataFrame( ... { ... "aa": [1, 2, 3], ... "ba": ["a", "b", None], ... "cc": [None, 2.5, 1.5], ... } ... ) >>> df shape: (3, 3) ┌─────┬──────┬──────┐ │ aa ┆ ba ┆ cc │ │ --- ┆ --- ┆ --- │ │ i64 ┆ str ┆ f64 │ ╞═════╪══════╪══════╡ │ 1 ┆ a ┆ null │ │ 2 ┆ b ┆ 2.5 │ │ 3 ┆ null ┆ 1.5 │ └─────┴──────┴──────┘ ``` -------------------------------- ### Get Polars Index Type Source: https://docs.pola.rs/api/python/stable/reference/api/polars.get_index_type.html Use this function to determine the data type Polars uses for indexing. This can vary between regular and bigidx Polars installations. ```python >>> pl.get_index_type() UInt32 ``` -------------------------------- ### Initialize DataFrame for coalesce examples Source: https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.coalesce.html Sets up a sample DataFrame with null values to demonstrate coalesce behavior. ```python >>> df = pl.DataFrame( ... { ... 
"a": [1, None, None, None], ... "b": [1, 2, None, None], ... "c": [5, None, 3, None], ... } ... ) ``` -------------------------------- ### Strip characters as a set with strip_chars_start Source: https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.str.strip_chars_start.html The order of characters provided to strip_chars_start does not matter; they are treated as a set. This example demonstrates stripping 'c', 'b', and 'a' from the start of a string. ```python >>> pl.DataFrame({"foo": ["aabcdef"]}).with_columns( ... foo_strip_start=pl.col("foo").str.strip_chars_start("cba") ... ) shape: (1, 2) ┌─────────┬─────────────────┐ │ foo ┆ foo_strip_start │ │ --- ┆ --- │ │ str ┆ str │ ╞═════════╪═════════════════╡ │ aabcdef ┆ def │ └─────────┴─────────────────┘ ``` -------------------------------- ### Example: `all()` Selector Source: https://docs.pola.rs/api/python/stable/reference/selectors.html Illustrates the usage of the `all()` selector to select all columns in a DataFrame. ```APIDOC ## Example: `all()` Selector ```python >>> from datetime import date >>> import polars as pl >>> import polars.selectors as cs >>> df = pl.DataFrame({ ... "dt": [date(1999, 12, 31), date(2024, 1, 1)], ... "value": [1_234_500, 5_000_555], ... }, schema_overrides={"value": pl.Int32}) # Selecting all columns using cs.all() >>> df.select(cs.all()) shape: (2, 2) ┌──────────┬─────────┐ │ dt ┆ value │ │ --- ┆ --- │ │ date ┆ i32 │ ╞══════════╪═════════╡ │ 1999-12-31 ┆ 1234500 │ │ 2024-01-01 ┆ 5000555 │ └──────────┴─────────┘ ``` ``` -------------------------------- ### Check binary prefixes with Expr.bin.starts_with Source: https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.bin.starts_with.html Demonstrates checking binary columns against both literal binary prefixes and dynamic column-based prefixes. ```python >>> colors = pl.DataFrame( ... { ... "name": ["black", "yellow", "blue"], ... 
"code": [b"\x00\x00\x00", b"\xff\xff\x00", b"\x00\x00\xff"], ... "prefix": [b"\x00", b"\xff\x00", b"\x00\x00"], ... } ... ) >>> colors.select( ... "name", ... pl.col("code").bin.starts_with(b"\xff").alias("starts_with_lit"), ... pl.col("code") ... .bin.starts_with(pl.col("prefix")) ... .alias("starts_with_expr"), ... ) ``` -------------------------------- ### Slice DataFrame Rows Source: https://docs.pola.rs/api/python/stable/reference/dataframe/api/polars.DataFrame.slice.html Use `slice` to get a subset of rows from a DataFrame. Specify the starting offset and the desired length. Negative indexing is supported for the offset. ```python >>> df = pl.DataFrame( ... { ... "foo": [1, 2, 3], ... "bar": [6.0, 7.0, 8.0], ... "ham": ["a", "b", "c"], ... } ... ) >>> df.slice(1, 2) shape: (2, 3) ┌─────┬─────┬─────┐ │ foo ┆ bar ┆ ham │ │ --- ┆ --- ┆ --- │ │ i64 ┆ f64 ┆ str │ ╞═════╪═════╪═════╡ │ 2 ┆ 7.0 ┆ b │ │ 3 ┆ 8.0 ┆ c │ └─────┴─────┴─────┘ ``` -------------------------------- ### Slice a polars Series Source: https://docs.pola.rs/api/python/stable/reference/series/api/polars.Series.slice.html Use Series.slice to get a sub-section of the Series. Specify the starting offset and the desired length. If length is None, all elements from the offset to the end are returned. ```python >>> s = pl.Series("a", [1, 2, 3, 4]) >>> s.slice(1, 2) shape: (2,) Series: 'a' [i64] [ 2 3 ] ``` -------------------------------- ### Create a Polars DataFrame Source: https://docs.pola.rs/api/python/stable/reference/dataframe/api/polars.DataFrame.glimpse.html Initializes a Polars DataFrame with various data types, including floats, integers, booleans, strings, and dates. This setup is used for subsequent examples. 
```python from datetime import date df = pl.DataFrame({ "a": [1.0, 2.8, 3.0], "b": [4, 5, None], "c": [True, False, True], "d": [None, "b", "c"], "e": ["usd", "eur", None], "f": [date(2020, 1, 1), date(2021, 1, 2), date(2022, 1, 1)], }) ``` -------------------------------- ### cs.starts_with() - Select Columns by Prefix Source: https://docs.pola.rs/api/python/stable/reference/selectors.html Selects columns whose names start with a specified substring or a list of substrings. ```APIDOC ## SELECT Columns by Prefix ### Description Selects columns from a DataFrame whose names begin with the provided prefix string(s). ### Method `select` ### Endpoint N/A (DataFrame operation) ### Parameters #### Path Parameters None #### Query Parameters None #### Request Body None ### Request Example ```python >>> import polars.selectors as cs >>> df = pl.DataFrame({ ... "foo": [1.0, 2.0], ... "bar": [3.0, 4.0], ... "baz": [5, 6], ... "zap": [7, 8], ... }) # Match columns starting with a 'b' >>> df.select(cs.starts_with("b")) shape: (2, 2) ┌─────┬─────┐ │ bar ┆ baz │ │ --- ┆ --- │ │ f64 ┆ i64 │ ╞═════╪═════╡ │ 3.0 ┆ 5 │ │ 4.0 ┆ 6 │ └─────┴─────┘ # Match columns starting with either 'b' or 'z' >>> df.select(cs.starts_with("b", "z")) shape: (2, 3) ┌─────┬─────┬─────┐ │ bar ┆ baz ┆ zap │ │ --- ┆ --- ┆ --- │ │ f64 ┆ i64 ┆ i64 │ ╞═════╪═════╪═════╡ │ 3.0 ┆ 5 ┆ 7 │ │ 4.0 ┆ 6 ┆ 8 │ └─────┴─────┴─────┘ # Match all columns except those starting with 'b' >>> df.select(~cs.starts_with("b")) shape: (2, 2) ┌─────┬─────┐ │ foo ┆ zap │ │ --- ┆ --- │ │ f64 ┆ i64 │ ╞═════╪═════╡ │ 1.0 ┆ 7 │ │ 2.0 ┆ 8 │ └─────┴─────┘ ``` ### Response #### Success Response (200) DataFrame with columns matching the specified prefix(es). 
#### Response Example ```json { "bar": [3.0, 4.0], "baz": [5, 6] } ``` ``` -------------------------------- ### Retrieve Struct Fields by Regex Source: https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.struct.field.html Use a regular expression to select struct fields whose names match the pattern. This example selects fields whose names start with 'a' or contain 'b'. ```python >>> df.select(pl.col("struct_col").struct.field("^a.*|b.*$")) ``` -------------------------------- ### Initialize DataFrame for Signed Integer Selector Examples Source: https://docs.pola.rs/api/python/stable/reference/selectors.html Sets up a sample Polars DataFrame with signed and unsigned integers, and strings, for demonstrating `cs.signed_integer()`. ```python >>> import polars.selectors as cs >>> df = pl.DataFrame( ... { ... "foo": [-123, -456], ... "bar": [3456, 6789], ... "baz": [7654, 4321], ... "zap": ["ab", "cd"], ... }, ... schema_overrides={"bar": pl.UInt32, "baz": pl.UInt64}, ... ) ``` -------------------------------- ### Series.bin.starts_with Source: https://docs.pola.rs/api/python/stable/reference/series/api/polars.Series.bin.starts_with.html Checks if each element in a binary Series starts with the specified prefix. ```APIDOC ## Series.bin.starts_with ### Description Check if values in a binary Series start with a binary substring. ### Parameters #### Arguments - **prefix** (IntoExpr) - Required - The prefix substring to check against. ### Request Example ```python import polars as pl s = pl.Series("colors", [b"\x00\x00\x00", b"\xff\xff\x00", b"\x00\x00\xff"]) s.bin.starts_with(b"\x00") ``` ### Response - **Series** (bool) - A Series of boolean values indicating whether each element starts with the provided prefix. ``` -------------------------------- ### Select every nth value with gather_every Source: https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.gather_every.html Use gather_every to select every nth value from a Series. 
This example selects every 3rd value starting from the first element. ```python >>> df = pl.DataFrame({"foo": [1, 2, 3, 4, 5, 6, 7, 8, 9]}) >>> df.select(pl.col("foo").gather_every(3)) ``` -------------------------------- ### Create Column Expression with Standard Syntax Source: https://docs.pola.rs/api/python/stable/reference/expressions/col.html Demonstrates creating a new column 'baz' by combining 'foo' and 'bar' columns using standard column expression syntax. ```python >>> df.with_columns(baz=(col("foo") * col("bar")) / 2) shape: (2, 3) ┌─────┬─────┬─────┐ │ foo ┆ bar ┆ baz │ │ --- ┆ --- ┆ --- │ │ i64 ┆ i64 ┆ f64 │ ╞═════╪═════╪═════╡ │ 1 ┆ 3 ┆ 1.5 │ │ 2 ┆ 4 ┆ 4.0 │ └─────┴─────┴─────┘ ``` -------------------------------- ### Get distinct IDs for value runs Source: https://docs.pola.rs/api/python/stable/reference/series/index.html The `rle_id` method assigns a unique, incrementing integer ID to each consecutive run of identical values in a Series. The IDs start at 0. ```python >>> s = pl.Series("s", [1, 1, 2, 1, None, 1, 3, 3]) >>> s.rle_id() shape: (8,) Series: 's' [u32] [ 0 0 1 2 3 4 5 5 ] ``` -------------------------------- ### Partition by Year with File Size Limits Source: https://docs.pola.rs/api/python/stable/reference/api/polars.PartitionBy.html Combine partitioning by a key ('year') with limits on individual file sizes (max_rows_per_file and approximate_bytes_per_file). ```python pl.LazyFrame({"year": [2026, 2027, 1970], "month": [0, 0, 0]}).sink_parquet( pl.PartitionBy( "data/", key="year", max_rows_per_file=1000, approximate_bytes_per_file=100_000_000, ) ) ``` -------------------------------- ### Get local maximum peaks with Series.peak_max() Source: https://docs.pola.rs/api/python/stable/reference/series/api/polars.Series.peak_max.html Use `Series.peak_max()` to obtain a boolean mask indicating the positions of local maximum peaks in a Series. This method requires no special setup. 
```python
>>> s = pl.Series("a", [1, 2, 3, 4, 5])
>>> s.peak_max()
shape: (5,)
Series: 'a' [bool]
[
    false
    false
    false
    false
    true
]
```

--------------------------------

### Import Styling Helpers and Create DataFrame

Source: https://docs.pola.rs/api/python/stable/reference/dataframe/style.html

Imports the selector module from polars and the `loc` and `style` helpers from great_tables, then initializes a sample DataFrame for the styling examples. Ensure these imports are present before applying any styling.

```python
>>> import polars.selectors as cs
>>> from great_tables import loc, style
>>> df = pl.DataFrame(
...     {
...         "site_id": [0, 1, 2],
...         "measure_a": [5, 4, 6],
...         "measure_b": [7, 3, 3],
...     }
... )
```

--------------------------------

### Profile a LazyFrame with default settings

Source: https://docs.pola.rs/api/python/stable/reference/lazyframe/api/polars.LazyFrame.profile.html

This example profiles a LazyFrame query using the default settings. `profile()` returns the materialized result together with profiling information for the query execution nodes.

```python
lf = pl.LazyFrame(
    {
        "a": ["a", "b", "a", "b", "b", "c"],
        "b": [1, 2, 3, 4, 5, 6],
        "c": [6, 5, 4, 3, 2, 1],
    }
)
lf.group_by("a", maintain_order=True).agg(pl.all().sum()).sort("a").profile()
```

--------------------------------

### Create Date Ranges with polars.date_ranges

Source: https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.date_ranges.html

Use `polars.date_ranges` to create a column of date ranges within a DataFrame. This example generates one date range per row from the 'start' and 'end' columns; the single 'end' date is broadcast to both rows.

```python
>>> from datetime import date
>>> df = pl.DataFrame(
...     {
...         "start": [date(2022, 1, 1), date(2022, 1, 2)],
...         "end": date(2022, 1, 3),
...     }
... )
>>> with pl.Config(fmt_str_lengths=50):
...     df.with_columns(date_range=pl.date_ranges("start", "end"))
shape: (2, 3)
┌────────────┬────────────┬──────────────────────────────────────┐
│ start      ┆ end        ┆ date_range                           │
│ ---        ┆ ---        ┆ ---                                  │
│ date       ┆ date       ┆ list[date]                           │
╞════════════╪════════════╪══════════════════════════════════════╡
│ 2022-01-01 ┆ 2022-01-03 ┆ [2022-01-01, 2022-01-02, 2022-01-03] │
│ 2022-01-02 ┆ 2022-01-03 ┆ [2022-01-02, 2022-01-03]             │
└────────────┴────────────┴──────────────────────────────────────┘
```

--------------------------------

### Extract Capture Group with Inline Flags

Source: https://docs.pola.rs/api/python/stable/reference/series/api/polars.Series.str.extract.html

Use inline flags such as `(?m)` (multi-line matching) when extracting capture groups from strings. With `(?m)`, `^` matches at the start of every line, so this example extracts the first line-initial word starting with 'T' from each string. Only the first match per string is returned.

```python
>>> s = pl.Series(
...     name="lines",
...     values=[
...         "I Like\nThose\nOdds",
...         "This is\nThe Way",
...     ],
... )
>>> s.str.extract(r"(?m)^(T\w+)", 1).alias("matches")
shape: (2,)
Series: 'matches' [str]
[
    "Those"
    "This"
]
```

--------------------------------

### GPUEngine Constructor

Source: https://docs.pola.rs/api/python/stable/reference/lazyframe/api/polars.lazyframe.engine_config.GPUEngine.html

Initializes the configuration for the GPU execution engine.

```APIDOC
## class polars.lazyframe.engine_config.GPUEngine

### Description
Configuration options for the GPU execution engine. Use this if you want control over details of the execution.

### Parameters
- **device** (int) - Optional - Select the GPU used to run the query. If not provided, the query uses the current CUDA device.
- **memory_resource** (rmm.mr.DeviceMemoryResource) - Optional - Provide a memory resource for GPU memory allocations. Warning: If passing a memory_resource, you must ensure that it is valid for the selected device.
- **raise_on_fail** (bool) - Optional - If True, do not fall back to the Polars CPU engine if the GPU engine cannot execute the query, but instead raise an error.
- **kwargs** (Any) - Optional - Additional configuration options for the engine.
```

--------------------------------

### Draw Interactive DataFrame Examples

Source: https://docs.pola.rs/api/python/stable/reference/api/polars.testing.parametric.dataframes.html

Call `.example()` directly on the strategy to get a concrete instance of a generated DataFrame. This is useful for interactive development but should be avoided during test execution. Pass `allowed_dtypes` and `max_cols` to constrain what is generated.

```python
from polars.testing.parametric import dataframes

# `df` is a strategy here; `.example()` draws one concrete DataFrame from it.
df = dataframes(allowed_dtypes=[pl.Datetime, pl.Float64], max_cols=3)
df.example()
```

--------------------------------

### Create a DataFrame for null filling examples

Source: https://docs.pola.rs/api/python/stable/reference/dataframe/api/polars.DataFrame.fill_null.html

Initializes a DataFrame with two numeric columns, each containing a null value, to demonstrate null filling techniques.

```python
df = pl.DataFrame(
    {
        "a": [1, 2, None, 4],
        "b": [0.5, 4, None, 13],
    }
)
```

--------------------------------

### Get Local Minimum Peaks with Series.peak_min()

Source: https://docs.pola.rs/api/python/stable/reference/series/api/polars.Series.peak_min.html

Use `Series.peak_min()` to obtain a boolean mask marking the positions of local minimum peaks within a Series. This method requires no special setup beyond having a Polars Series.

```python
>>> s = pl.Series("a", [4, 1, 3, 2, 5])
>>> s.peak_min()
shape: (5,)
Series: 'a' [bool]
[
    false
    true
    false
    true
    false
]
```
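To make the peak masks easier to reason about, here is a minimal pure-Python sketch of the rule they appear to follow. It is not Polars' implementation: `peak_mask` is a hypothetical helper, and it assumes strict comparisons and that an edge element only needs to beat its single neighbor. Those assumptions are consistent with the `peak_max()` and `peak_min()` outputs shown above, but the source does not state them.

```python
def peak_mask(values, kind="min"):
    """Return a boolean mask marking local minima or maxima in `values`.

    Hypothetical helper, not part of Polars. Assumes strict comparisons and
    that an edge element is compared only against its one existing neighbor.
    """
    if kind == "min":
        beats = lambda a, b: a < b
    elif kind == "max":
        beats = lambda a, b: a > b
    else:
        raise ValueError("kind must be 'min' or 'max'")

    n = len(values)
    mask = []
    for i, v in enumerate(values):
        # A missing neighbor (at either edge) never disqualifies a peak.
        left_ok = i == 0 or beats(v, values[i - 1])
        right_ok = i == n - 1 or beats(v, values[i + 1])
        mask.append(left_ok and right_ok)
    return mask


# Same inputs as the documented examples above.
print(peak_mask([4, 1, 3, 2, 5], kind="min"))  # [False, True, False, True, False]
print(peak_mask([1, 2, 3, 4, 5], kind="max"))  # [False, False, False, False, True]
```

In the second call the last element counts as a maximum only because it has no right-hand neighbor, which matches the trailing `true` in the `peak_max()` output.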