### Test DataFrames.jl Package Installation

Source: https://dataframes.juliadata.org/stable/man/basics

This code snippet demonstrates how to run the bundled tests for the DataFrames.jl package to verify its installation. Be aware that this process can take over 30 minutes to complete.

```julia
using Pkg

Pkg.test("DataFrames") # Warning! This will take more than 30 minutes.
```

--------------------------------

### Install DataFrames.jl

Source: https://dataframes.juliadata.org/stable/man/sorting

Installs the DataFrames.jl package using the Julia package manager. This is the first step to using the package for data manipulation.

```julia
using Pkg
Pkg.add("DataFrames")
```

--------------------------------

### Install CSV.jl Package

Source: https://dataframes.juliadata.org/stable/man/basics

Shows how to install the CSV.jl package, a separate package used to read CSV files into DataFrames. This is typically done using Julia's package manager.

```julia
using Pkg

Pkg.add("CSV")
```

--------------------------------

### Setup DataFrame for manipulation

Source: https://dataframes.juliadata.org/stable/man/basics

Initializes a DataFrame named 'df' with three columns 'x', 'y', and 'z', each containing a range of integers. This setup is a prerequisite for demonstrating subsequent data manipulation and indexing operations.

```julia
julia> df = DataFrame(x = 1:3, y = 4:6, z = 7:9)  # define data frame
3×3 DataFrame
 Row │ x      y      z
     │ Int64  Int64  Int64
─────┼─────────────────────
   1 │     1      4      7
   2 │     2      5      8
   3 │     3      6      9
```

--------------------------------

### Install DataFrames.jl Package

Source: https://dataframes.juliadata.org/stable/man/basics

This code snippet shows how to add the DataFrames.jl package to your Julia environment using the Pkg manager. It requires no external dependencies beyond a Julia installation.

```julia
using Pkg

Pkg.add("DataFrames")
```

```julia
] # ']' should be pressed

(@v1.9) pkg> add DataFrames
```

--------------------------------

### Install DataFramesMeta.jl Package

Source: https://dataframes.juliadata.org/stable/man/querying_frameworks

This code snippet shows how to install the DataFramesMeta.jl package using the Julia Package manager.

```julia
using Pkg
Pkg.add("DataFramesMeta")
```

--------------------------------

### Install Query.jl Package

Source: https://dataframes.juliadata.org/stable/man/querying_frameworks

This code snippet shows how to install the Query.jl package using Julia's Pkg manager. It's a prerequisite for using Query.jl's data manipulation features.

```julia
using Pkg
Pkg.add("Query")
```

--------------------------------

### Install DataFrameMacros.jl Package

Source: https://dataframes.juliadata.org/stable/man/querying_frameworks

This snippet shows how to install the DataFrameMacros.jl package using the Pkg manager in Julia. It's a prerequisite for using the package's functionalities.

```julia
using Pkg
Pkg.add("DataFrameMacros")
```

--------------------------------

### Install TidierData.jl Package

Source: https://dataframes.juliadata.org/stable/man/querying_frameworks

This code snippet demonstrates how to install the TidierData.jl package using Julia's package manager. It ensures that the necessary functionalities for data manipulation are available in the Julia environment.

```julia
using Pkg
Pkg.add("TidierData")
```

--------------------------------

### Check DataFrames.jl Package Status

Source: https://dataframes.juliadata.org/stable/man/basics

This code snippet shows how to check the installed version and status of the DataFrames.jl package using the Pkg manager in Julia. This is useful for verifying the installation and managing package versions.

```julia
]

(@v1.9) pkg> status DataFrames
```

--------------------------------

### Install CSV.jl Package

Source: https://dataframes.juliadata.org/stable/man/importing_and_exporting

Installs the CSV.jl package using the Julia package manager. This is a prerequisite for using CSV.jl functions.

```julia
using Pkg
Pkg.add("CSV")
```

--------------------------------

### Create DataFrames for Joining Example

Source: https://dataframes.juliadata.org/stable/man/joins

Demonstrates the creation of two sample DataFrames, 'people' and 'jobs', which will be used for join operations. This requires the DataFrames package.

```julia
using DataFrames

people = DataFrame(ID=[20, 40], Name=["John Doe", "Jane Doe"])
jobs = DataFrame(ID=[20, 40], Job=["Lawyer", "Doctor"])
```
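
As a minimal sketch of what these two tables are for, the following joins them on their shared `ID` column with `innerjoin`. It assumes DataFrames.jl is installed; the variable name `joined` is illustrative.

```julia
using DataFrames

people = DataFrame(ID=[20, 40], Name=["John Doe", "Jane Doe"])
jobs = DataFrame(ID=[20, 40], Job=["Lawyer", "Doctor"])

# Keep only rows whose ID appears in both tables
joined = innerjoin(people, jobs, on=:ID)
```

The result has one row per matching `ID` and the columns `ID`, `Name`, and `Job`.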

--------------------------------

### Get Single Column View from DataFrame - Julia

Source: https://dataframes.juliadata.org/stable/man/basics

Creates a view of a single column from a DataFrame using the `@view` macro. This allows access to the column data without copying it, improving memory efficiency. The example selects the first column for the first 5 rows.

```julia
julia> @view german[1:5, 1]
5-element view(::Vector{Int64}, 1:5) with eltype Int64:
 0
 1
 2
 3
 4
```

--------------------------------

### DataFrames.subset! Function Documentation

Source: https://dataframes.juliadata.org/stable/lib/functions

Provides detailed documentation for the subset! function, including its signatures, behavior, parameters, and examples for both DataFrames and GroupedDataFrames.

```APIDOC
## `subset!(df::AbstractDataFrame, args...; skipmissing::Bool=false, threads::Bool=true)`
## `subset!(gdf::GroupedDataFrame{DataFrame}, args...; skipmissing::Bool=false, ungroup::Bool=true, threads::Bool=true)`

### Description
Updates data frame `df` or the parent of `gdf` in place to contain only rows for which all values produced by transformation(s) `args` for a given row are `true`. All transformations must produce vectors containing `true` or `false`.

When the first argument is a `GroupedDataFrame`, transformations are also allowed to return a single `true` or `false` value, which results in including or excluding a whole group.

If `skipmissing=false` (the default) `args` are required to produce results containing only `Bool` values. If `skipmissing=true`, additionally `missing` is allowed and it is treated as `false` (i.e. rows for which one of the conditions returns `missing` are skipped).

Each argument passed in `args` can be any specifier following the rules described for `select` with the restriction that:
  * specifying target column name is not allowed as `subset!` does not create new columns;
  * every passed transformation must return a scalar or a vector (returning `AbstractDataFrame`, `NamedTuple`, `DataFrameRow` or `AbstractMatrix` is not supported).

If `ungroup=false` the passed `GroupedDataFrame` `gdf` is updated (preserving the order of its groups) and returned.

If `threads=true` (the default) transformations may be run in separate tasks which can execute in parallel (possibly being applied to multiple rows or groups at the same time). Whether or not tasks are actually spawned and their number are determined automatically. Set to `false` if some transformations require serial execution or are not thread-safe.

If `GroupedDataFrame` is subsetted then it must include all groups present in the `parent` data frame, like in `select!`. In this case the passed `GroupedDataFrame` is updated to have correct groups after its parent is updated.

### Method
`subset!`

### Parameters
#### Positional Arguments
- **args** - Each argument can be a column name or a transformation function. Transformations must return a scalar or a vector of `Bool` (or `Bool?` if `skipmissing=true`).

#### Keyword Arguments
- **skipmissing** (Bool) - Optional - Defaults to `false`. If `true`, `missing` values in transformation results are treated as `false`.
- **threads** (Bool) - Optional - Defaults to `true`. If `true`, transformations may run in parallel.
- **ungroup** (Bool) - Optional - Defaults to `true`. If `false` when used with `GroupedDataFrame`, the `GroupedDataFrame` is updated and returned, preserving group order.

### Request Example
```julia
df = DataFrame(id=1:4, x=[true, false, true, false], y=[true, true, false, false])
subset!(df, :x, :y => ByRow(!));
# df is now 1x3 DataFrame with Row 3

df_grouped = DataFrame(id=1:4, y=[true, true, false, false], v=[1, 2, 11, 12])
subset!(groupby(df_grouped, :y), :v => x -> x .> minimum(x));
# df_grouped is now 2x3 DataFrame containing groups that satisfy the condition

df_missing = DataFrame(id=1:4, x=[true, false, true, false], z=[true, true, missing, missing])
subset!(df_missing, :x, :z, skipmissing=true);
# df_missing is now 1x3 DataFrame with Row 1
```

### Response
#### Return Value
Returns the data frame updated in place, or the updated `GroupedDataFrame` when `ungroup=false`.

#### Response Example
```julia
# Example for df after subset!(df, :x, :y => ByRow(!))
1×3 DataFrame
 Row │ id     x     y
     │ Int64  Bool  Bool
─────┼────────────────────
   1 │     3  true  false

# Example for df_grouped after subset!(groupby(df_grouped, :y), :v => x -> x .> minimum(x))
2×3 DataFrame
 Row │ id     y      v
     │ Int64  Bool   Int64
─────┼─────────────────────
   1 │     2   true      2
   2 │     4  false     12

# Example for df_missing after subset!(df_missing, :x, :z, skipmissing=true)
1×3 DataFrame
 Row │ id     x     z
     │ Int64  Bool  Bool?
─────┼────────────────────
   1 │     1  true   true
```

### Error Handling
- `ArgumentError`: Raised if `skipmissing=false` and a transformation returns `missing` values.
- Other errors may occur based on invalid transformation specifications or incompatible types.
```
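
The `skipmissing` behavior described above can be sketched as follows. This is a minimal example assuming DataFrames.jl is installed; the flag `threw` is illustrative, and the `catch` deliberately accepts any exception rather than asserting a specific error type.

```julia
using DataFrames

df = DataFrame(id=1:4, x=[true, false, true, false],
               z=[true, true, missing, missing])

# Without skipmissing=true, a `missing` condition value is an error.
# A copy is used so the failed call cannot affect `df`.
threw = try
    subset!(copy(df), :x, :z)
    false
catch
    true
end

# With skipmissing=true, `missing` counts as `false`, so only row 1 remains
subset!(df, :x, :z, skipmissing=true)
```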

--------------------------------

### StackedVector Constructor Example in Julia

Source: https://dataframes.juliadata.org/stable/lib/types

Demonstrates the construction of a `StackedVector`, which provides a linear, concatenated view into multiple AbstractVectors. It takes a collection of AbstractVectors as input.

```julia
StackedVector(Any[[1, 2], [9, 10], [11, 12]])  # [1, 2, 9, 10, 11, 12]
```

--------------------------------

### DataFrame Construction Examples

Source: https://dataframes.juliadata.org/stable/lib/types

Demonstrates various ways to construct and manipulate DataFrames using the `AsTable` type for column selection and transformation. This includes passing columns as a NamedTuple and expanding it back into columns.

```julia
julia> df1 = DataFrame(a=1:3, b=11:13)
3×2 DataFrame
 Row │ a      b
     │ Int64  Int64
─────┼──────────────
   1 │     1     11
   2 │     2     12
   3 │     3     13

julia> df2 = select(df1, AsTable([:a, :b]) => ByRow(identity))
3×1 DataFrame
 Row │ a_b_identity
     │ NamedTuple…
─────┼─────────────────
   1 │ (a = 1, b = 11)
   2 │ (a = 2, b = 12)
   3 │ (a = 3, b = 13)

julia> select(df2, :a_b_identity => AsTable)
3×2 DataFrame
 Row │ a      b
     │ Int64  Int64
─────┼──────────────
   1 │     1     11
   2 │     2     12
   3 │     3     13

julia> select(df1, AsTable([:a, :b]) => ByRow(nt -> map(x -> x^2, nt)) => AsTable)
3×2 DataFrame
 Row │ a      b
     │ Int64  Int64
─────┼──────────────
   1 │     1    121
   2 │     4    144
   3 │     9    169
```
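
A further minimal sketch, assuming DataFrames.jl is installed: because `AsTable` passes each row to the function as a `NamedTuple`, a reducing function such as `sum` can compute a row-wise total. The column name `:total` and the variable `totals` are illustrative.

```julia
using DataFrames

df1 = DataFrame(a=1:3, b=11:13)

# Each row reaches `sum` as a NamedTuple such as (a = 1, b = 11)
totals = select(df1, AsTable([:a, :b]) => ByRow(sum) => :total)
```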

--------------------------------

### Prepend Rows to DataFrame (Julia)

Source: https://dataframes.juliadata.org/stable/lib/functions

Demonstrates how to prepend rows from one DataFrame to another. Shows examples with different column matching strategies like :union.

```julia
df1 = DataFrame(A=1:3, B=1:3)
df2 = DataFrame(A=4.0:6.0, B=4:6)
prepend!(df1, df2)

df2 = DataFrame(A=4.0:6.0, B=4:6)
prepend!(df2, DataFrame(A=1), (; C=1:2), cols=:union)
```

--------------------------------

### Select Rows and All Columns in DataFrame (Julia)

Source: https://dataframes.juliadata.org/stable/man/basics

This example shows how to select a range of rows while retaining all columns from a DataFrame. The colon ':' is used as a shorthand for selecting all columns. The result is a DataFrame containing the specified rows and all original columns.

```julia
julia> german[1:5, :]
5×10 DataFrame
 Row │ id     Age    Sex      Job    Housing  Saving accounts  Checking accoun ⋯
     │ Int64  Int64  String7  Int64  String7  String15         String15        ⋯
─────┼──────────────────────────────────────────────────────────────────────────
   1 │     0     67  male         2  own      NA               little          ⋯
   2 │     1     22  female       2  own      little           moderate
   3 │     2     49  male         1  own      little           NA
   4 │     3     45  male         2  free     little           little
   5 │     4     53  male         2  free     little           little          ⋯
                                                               4 columns omitted
```

--------------------------------

### Create and Initialize DataFrame in Julia

Source: https://dataframes.juliadata.org/stable/man/working_with_dataframes

Initializes a DataFrame with three columns: 'A', 'B', and 'C', using ranges and repeated values. This is a common starting point for data manipulation tasks in Julia.

```julia
df = DataFrame(A=1:2:1000, B=repeat(1:10, inner=50), C=1:500)
```

--------------------------------

### Julia: Reorder columns using `select`

Source: https://dataframes.juliadata.org/stable/man/basics

This example illustrates how to reorder columns in a DataFrame using the `select` function. It explicitly lists the desired column order, effectively changing the DataFrame's column arrangement.

```julia
df = DataFrame(a = 1:4, b = [50,50,60,60], c = ["hat","bat","cat","dog"])
select(df, :c, :b, :a)
```

--------------------------------

### Getting Proportion of Rows per Group with DataFrames.jl

Source: https://dataframes.juliadata.org/stable/man/split_apply_combine

Illustrates how to calculate the proportion of rows for each group in a `GroupedDataFrame` using the `proprow` operation. Examples include using the default column name and specifying a custom target column name.

```julia
df = DataFrame(customer_id=["a", "b", "b", "b", "c", "c"],
                  transaction_id=[12, 15, 19, 17, 13, 11],
                  volume=[2, 3, 1, 4, 5, 9])
gdf = groupby(df, :customer_id, sort=true)

# Using default column name :proprow
combine(gdf, proprow)

# Using a custom target column name
combine(gdf, proprow => "transaction_fraction")
```
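
To make the result concrete, here is a small sketch (assuming a DataFrames.jl version that provides `proprow`) that checks the fractions for the data above: customer `a` has 1 of 6 rows, `b` has 3 of 6, and `c` has 2 of 6. The variable `shares` is illustrative.

```julia
using DataFrames

df = DataFrame(customer_id=["a", "b", "b", "b", "c", "c"],
               transaction_id=[12, 15, 19, 17, 13, 11],
               volume=[2, 3, 1, 4, 5, 9])
gdf = groupby(df, :customer_id, sort=true)

# One row per group; the fractions sum to 1 across groups
shares = combine(gdf, proprow => "transaction_fraction")
```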

--------------------------------

### Get Single Cell View from DataFrame - Julia

Source: https://dataframes.juliadata.org/stable/man/basics

Retrieves a view of a single cell within a DataFrame using the `@view` macro. This is highly memory-efficient as it only provides a reference to the existing data. The example accesses the data at row 2, column 2.

```julia
julia> @view german[2, 2]
0-dimensional view(::Vector{Int64}, 2) with eltype Int64:
22
```

--------------------------------

### Get Single Row View from DataFrame - Julia

Source: https://dataframes.juliadata.org/stable/man/basics

Creates a view of a single row from a DataFrame using the `@view` macro. The result is a `DataFrameRow` that gives access to the selected columns of that row without copying data. The example selects row 3 and columns 2 through 5.

```julia
julia> @view german[3, 2:5]
DataFrameRow
 Row │ Age    Sex      Job    Housing
     │ Int64  String7  Int64  String7
─────┼────────────────────────────────
   3 │    49  male         1  own
```

--------------------------------

### Create DataFrame from Matrix and Column Names

Source: https://dataframes.juliadata.org/stable/man/basics

Demonstrates creating a DataFrame by providing a matrix of data and a vector of column names to the DataFrame constructor. This method is useful when data is already in memory as a matrix.

```julia
mat = [1 2 4 5; 15 58 69 41; 23 21 26 69]
nms = ["a", "b", "c", "d"]
DataFrame(mat, nms)
```

--------------------------------

### Benchmark Indexing vs. View Creation - Julia

Source: https://dataframes.juliadata.org/stable/man/basics

Compares the performance and memory allocation of standard DataFrame indexing versus creating a DataFrame view using the `@view` macro. Benchmarking is done using the BenchmarkTools.jl package. The results show that view creation is significantly faster and allocates much less memory.

```julia
julia> using BenchmarkTools

julia> @btime $german[1:end-1, 1:end-1];
  9.900 μs (44 allocations: 57.56 KiB)

julia> @btime @view $german[1:end-1, 1:end-1];
  67.332 ns (2 allocations: 32 bytes)
```

--------------------------------

### Split-Apply-Combine with DataFramesMeta.jl - Grouping and Aggregation

Source: https://dataframes.juliadata.org/stable/man/querying_frameworks

This example illustrates the split-apply-combine pattern using DataFramesMeta.jl. It filters data, groups by a key, calculates minimum and maximum values for each group, and then selects a derived range column.

```julia
julia> df = DataFrame(key=repeat(1:3, 4), value=1:12)
12×2 DataFrame
 Row │ key    value 
     │ Int64  Int64 
─────┼──────────────
   1 │     1      1 
   2 │     2      2 
   3 │     3      3 
   4 │     1      4 
   5 │     2      5 
   6 │     3      6 
   7 │     1      7 
   8 │     2      8 
   9 │     3      9 
  10 │     1     10 
  11 │     2     11 
  12 │     3     12 

julia> @chain df begin
           @rsubset :value > 3 
           @by(:key, :min = minimum(:value), :max = maximum(:value))
           @select(:key, :range = :max - :min)
        end
3×2 DataFrame
 Row │ key    range 
     │ Int64  Int64 
─────┼──────────────
   1 │     1      6 
   2 │     2      6 
   3 │     3      6
```
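
For comparison, the same pipeline can be written with DataFrames.jl functions alone. This is one possible translation, not the form used by DataFramesMeta.jl internally; the intermediate names `filtered`, `stats`, and `result` are illustrative.

```julia
using DataFrames

df = DataFrame(key=repeat(1:3, 4), value=1:12)

# Keep rows where value > 3 (row-wise condition)
filtered = subset(df, :value => ByRow(>(3)))

# Per-key minimum and maximum of the remaining values
stats = combine(groupby(filtered, :key),
                :value => minimum => :min,
                :value => maximum => :max)

# Derive the range column and drop the helper columns
result = select(stats, :key, [:max, :min] => ByRow(-) => :range)
```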

--------------------------------

### Get DataFrame Dimensions with size()

Source: https://dataframes.juliadata.org/stable/man/basics

The `size` function returns the dimensions (number of rows and columns) of a DataFrame. It can be called with one argument to get a tuple of (rows, columns), or with a second argument (1 for rows, 2 for columns) to get a specific dimension.

```julia
julia> german = copy(german_ref);

julia> size(german)
(1000, 10)

julia> size(german, 1)
1000

julia> size(german, 2)
10
```

--------------------------------

### Get DataFrame Column Names (Julia)

Source: https://dataframes.juliadata.org/stable/man/basics

Shows how to retrieve column names from a DataFrame as a vector of strings using the `names` function. It also demonstrates filtering column names based on their element type, such as `AbstractString`.

```julia
julia> names(german)
10-element Vector{String}:
 "id"
 "Age"
 "Sex"
 "Job"
 "Housing"
 "Saving accounts"
 "Checking account"
 "Credit amount"
 "Duration"
 "Purpose"

julia> names(german, AbstractString)
5-element Vector{String}:
 "Sex"
 "Housing"
 "Saving accounts"
 "Checking account"
 "Purpose"
```

--------------------------------

### Load DataFrames.jl Package

Source: https://dataframes.juliadata.org/stable/man/basics

This code snippet shows the command to load the DataFrames.jl package into your Julia session, making its functionalities available for use. This is a prerequisite for working with DataFrames.

```julia
using DataFrames
```

--------------------------------

### Broadcasting Functions for Vector Operations in Julia

Source: https://dataframes.juliadata.org/stable/man/basics

Explains how to define functions that broadcast over vectors, allowing direct application to columns without needing ByRow. Examples include element-wise addition and a function operating on two columns.

```julia
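using DataFrames
df = DataFrame(a = [1, 2, 3], b = [4, 5, 4])
g(x) = x .+ 1             # receives the whole column :a as a vector
transform(df, :a => g)
h(x, y) = x .+ y .+ 1     # receives columns :a and :b as two vectors
transform(df, [:a, :b] => h)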
```
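
A minimal sketch of what these calls produce, assuming DataFrames.jl is installed and a small illustrative `df` with integer columns `a` and `b`. The output columns get automatically generated names built from the source columns and the function name.

```julia
using DataFrames

df = DataFrame(a=[1, 2, 3], b=[4, 5, 4])

g(x) = x .+ 1
h(x, y) = x .+ y .+ 1

res_g = transform(df, :a => g)         # adds column a_g
res_h = transform(df, [:a, :b] => h)   # adds column a_b_h
```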

--------------------------------

### Create DataFrames with Keyword Arguments and Pairs

Source: https://context7.com/context7/dataframes_juliadata_stable/llms.txt

Demonstrates creating DataFrames.jl DataFrames using keyword arguments, named tuples, pairs, dictionaries, and matrices. These methods allow for flexible initialization of tabular data structures in Julia.

```julia
using DataFrames

# Keyword argument constructor
df = DataFrame(a=1:4, b=["M", "F", "F", "M"])
# 4×2 DataFrame
#  Row │ a      b
#      │ Int64  String
# ─────┼───────────────
#    1 │     1  M
#    2 │     2  F
#    3 │     3  F
#    4 │     4  M

# Named tuple of vectors
df = DataFrame((a=[1, 2], b=[3, 4]))

# Vector of named tuples
df = DataFrame([(a=1, b=0), (a=2, b=0)])

# Pair constructor
df = DataFrame("a" => 1:2, "b" => 0)

# Dictionary constructor
df = DataFrame(Dict(:a => 1:2, :b => 0))

# Matrix constructor with automatic column names
df = DataFrame([1 0; 2 0], :auto)
# 2×2 DataFrame
#  Row │ x1     x2
#      │ Int64  Int64
# ─────┼──────────────
#    1 │     1      0
#    2 │     2      0
```

--------------------------------

### Advanced DataFrame Column Selection with Not, Between, Cols, All (Julia)

Source: https://dataframes.juliadata.org/stable/man/working_with_dataframes

Provides examples of using advanced column selectors like `Not`, `Between`, `Cols`, and `All` for more complex DataFrame subsetting. `Not` excludes, `Between` selects a range, `All` selects all, and `Cols` selects based on a predicate.

```julia
julia> df = DataFrame(r=1, x1=2, x2=3, y=4)
1×4 DataFrame
 Row │ r      x1     x2     y
     │ Int64  Int64  Int64  Int64
─────┼────────────────────────────
   1 │     1      2      3      4

julia> df[:, Not(:r)] # drop :r column
1×3 DataFrame
 Row │ x1     x2     y
     │ Int64  Int64  Int64
─────┼─────────────────────
   1 │     2      3      4

julia> df[:, Between(:r, :x2)] # keep columns between :r and :x2
1×3 DataFrame
 Row │ r      x1     x2
     │ Int64  Int64  Int64
─────┼─────────────────────
   1 │     1      2      3

julia> df[:, All()] # keep all columns
1×4 DataFrame
 Row │ r      x1     x2     y
     │ Int64  Int64  Int64  Int64
─────┼────────────────────────────
   1 │     1      2      3      4

julia> df[:, Cols(x -> startswith(x, "x"))] # keep columns whose name starts with "x"
1×2 DataFrame
 Row │ x1     x2
     │ Int64  Int64
─────┼──────────────
   1 │     2      3
```

--------------------------------

### Copy DataFrame

Source: https://dataframes.juliadata.org/stable/man/basics

Demonstrates creating a copy of an existing DataFrame. This is a common practice to preserve the original data before performing modifications.

```julia
german = copy(german_ref)
```

--------------------------------

### Initialize DataFrame with Named Columns in Julia

Source: https://dataframes.juliadata.org/stable/man/basics

Shows how to initialize a DataFrame by passing columns as keyword arguments; each keyword name (e.g., `A`, `B`) becomes a column name. A scalar value such as `fixed=1` is broadcast to fill the entire column.

```julia
julia> DataFrame(A=1:3, B=5:7, fixed=1)
3×3 DataFrame
 Row │ A      B      fixed
     │ Int64  Int64  Int64
─────┼─────────────────────
   1 │     1      5      1
   2 │     2      6      1
   3 │     3      7      1
```

--------------------------------

### RepeatedVector Constructor Example in Julia

Source: https://dataframes.juliadata.org/stable/lib/types

Provides examples of how to construct a `RepeatedVector`, which is a view into an AbstractVector with repeated elements. It takes a parent vector and specifies inner and outer repetition counts.

```julia
RepeatedVector([1, 2], 3, 1)   # [1, 1, 1, 2, 2, 2]
RepeatedVector([1, 2], 1, 3)   # [1, 2, 1, 2, 1, 2]
RepeatedVector([1, 2], 2, 2)   # [1, 1, 2, 2, 1, 1, 2, 2]
```

--------------------------------

### Create and Display DataFrame in Julia

Source: https://dataframes.juliadata.org/stable/man/working_with_dataframes

Demonstrates creating a DataFrame in Julia using the DataFrames.jl package and displays its default summarized output. It also shows how to adjust printing options to display all rows or columns.

```julia
using DataFrames

df = DataFrame(A=1:2:1000, B=repeat(1:10, inner=50), C=1:500)

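# Display the default truncated output; the `:limit => true` IO property
# asks `show` to fit the table to the screen
show(IOContext(stdout, :limit => true), df)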

# Display all rows
show(df, allrows=true)

# Display all columns
show(df, allcols=true)
```

--------------------------------

### Get Last N Rows of DataFrame

Source: https://dataframes.juliadata.org/stable/lib/functions

`last(df, n)` returns the last `n` rows of a DataFrame. With `view=false` (the default) it returns a newly allocated DataFrame; with `view=true` it returns a SubDataFrame view into `df`. Metadata is preserved.

```julia
last(df::AbstractDataFrame, n::Integer; view::Bool=false)
```
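
A minimal usage sketch of the signature above, assuming DataFrames.jl is installed; the data and the names `tail_copy` and `tail_view` are illustrative.

```julia
using DataFrames

df = DataFrame(a=1:5, b='a':'e')

tail_copy = last(df, 2)             # freshly allocated DataFrame
tail_view = last(df, 2, view=true)  # SubDataFrame pointing into df
```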

--------------------------------

### Create DataFrame and SubDataFrame View

Source: https://dataframes.juliadata.org/stable/lib/types

Demonstrates creating a DataFrame and then creating a SubDataFrame by selecting a range of rows and specific columns using the `view` function.

```julia
df = DataFrame(a=repeat([1, 2, 3, 4], outer=[2]),
              b=repeat([2, 1], outer=[4]),
              c=1:8)

view(df, 1:4, [:a, :c])
```
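
Because a `SubDataFrame` is a view, writes through it reach the parent data frame. The following sketch illustrates this, assuming DataFrames.jl is installed; the name `sdf` and the value `100` are illustrative.

```julia
using DataFrames

df = DataFrame(a=repeat([1, 2, 3, 4], outer=[2]),
               b=repeat([2, 1], outer=[4]),
               c=1:8)

sdf = view(df, 1:4, [:a, :c])

# Assigning through the view modifies the parent data frame
sdf[1, :c] = 100
```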

--------------------------------

### Get First N Rows of DataFrame

Source: https://dataframes.juliadata.org/stable/lib/functions

`first(df, n)` returns the first `n` rows of a DataFrame. With `view=false` (the default) it returns a newly allocated DataFrame; with `view=true` it returns a SubDataFrame view into `df`. Metadata is preserved.

```julia
first(df::AbstractDataFrame, n::Integer; view::Bool=false)
```
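
A short sketch of two edge cases, assuming DataFrames.jl is installed and using illustrative data: asking for more rows than exist returns all rows, and calling `first` without `n` returns a single `DataFrameRow` rather than a data frame.

```julia
using DataFrames

df = DataFrame(a=1:3)

head_all = first(df, 10)  # n exceeds the row count, so all 3 rows are returned
head_row = first(df)      # no n: the first row as a DataFrameRow
```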

--------------------------------

### Basic DataFrame Operations in Julia

Source: https://dataframes.juliadata.org/stable/man/basics

Demonstrates basic DataFrame creation and operations like sum, maximum, and vector subtraction using combine and transform. It highlights how scalar results are broadcasted and how vector operations behave.

```julia
df = DataFrame(a = [1, 2, 3], b = [4, 5, 4])
combine(df, :a => sum)
transform(df, :b => maximum) # `transform` and `select` copy scalar result to all rows
transform(df, [:b, :a] => -) # vector subtraction is okay
```
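
The sketch below, assuming DataFrames.jl is installed, captures the three results so their shapes can be compared; the names `total`, `with_max`, and `with_diff` are illustrative.

```julia
using DataFrames

df = DataFrame(a = [1, 2, 3], b = [4, 5, 4])

total = combine(df, :a => sum)             # 1 row: the scalar is kept as is
with_max = transform(df, :b => maximum)    # 3 rows: the scalar is repeated
with_diff = transform(df, [:b, :a] => -)   # 3 rows: element-wise b - a
```

`combine` collapses to a single row, while `transform` keeps the original rows and repeats the scalar in the new column.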

--------------------------------

### Extract First Two Columns as DataFrame (Julia)

Source: https://dataframes.juliadata.org/stable/man/basics

This example demonstrates extracting the first two columns of a DataFrame into a new DataFrame. It shows multiple ways to achieve this, including using a range of column indices or vectors of column names/symbols. The distinction between copying and non-copying extraction is also illustrated.

```julia
julia> german[:, 1:2]        # Copies the columns
julia> german[:, [:id, :Age]]  # Copies the columns
julia> german[:, ["id", "Age"]] # Copies the columns

julia> german[!, 1:2]        # Reuses columns without copying
julia> german[!, [:id, :Age]]  # Reuses columns without copying
julia> german[!, ["id", "Age"]] # Reuses columns without copying
```
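
The difference between `:` and `!` can be checked directly with identity comparisons. This is a minimal sketch on a small illustrative data frame (not the `german` dataset), assuming DataFrames.jl is installed.

```julia
using DataFrames

df = DataFrame(id=1:3, Age=[67, 22, 49])

copied = df[:, [:id, :Age]]  # new column vectors
shared = df[!, [:id, :Age]]  # same column vectors as df

# Mutating a shared column is visible in the original data frame
shared.Age[1] = 0
```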

--------------------------------

### Julia: Select specific columns and all others using `select`

Source: https://dataframes.juliadata.org/stable/man/basics

This code snippet shows how to use the `select` function to pick a specific column followed by all remaining columns using the `:` selector. Here `:b` is moved to the front and the other columns keep their original order.

```julia
df = DataFrame(a = 1:4, b = [50,50,60,60], c = ["hat","bat","cat","dog"])
select(df, :b, :)
```

--------------------------------

### Get Element Types of DataFrame Columns (Julia)

Source: https://dataframes.juliadata.org/stable/man/basics

Demonstrates how to get the element types of each column in a DataFrame. It uses `eachcol` to iterate over the columns and then broadcasts the `eltype` function to determine the data type of elements within each column.

```julia
julia> eltype.(eachcol(german))
10-element Vector{DataType}:
 Int64
 Int64
 String7
 Int64
 String7
 String15
 String15
 Int64
 Int64
 String31
```

--------------------------------

### Julia: Perform multi-step transformation sequentially

Source: https://dataframes.juliadata.org/stable/man/basics

This code demonstrates the correct way to perform sequential transformations where a new column created in one step is used in a subsequent step. It first creates column `:d` by summing `:a` and `:b`, then transforms `:d` in a separate `transform!` call.

```julia
df = DataFrame(a = 1:4, b = [50,50,60,60], c = ["hat","bat","cat","dog"])
new_df = transform(df, [:a, :b] => ByRow(+) => :d)
transform!(new_df, :d => (x -> x ./ 2) => :d_2)
```
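
To show why two steps are needed, the sketch below first tries both transformations in a single call, which fails because `:d` does not exist in the source data frame when the second transformation is evaluated. It assumes DataFrames.jl is installed; the flag `failed` is illustrative, and the `catch` accepts any exception rather than asserting a specific error type.

```julia
using DataFrames

df = DataFrame(a = 1:4, b = [50, 50, 60, 60], c = ["hat", "bat", "cat", "dog"])

# Single call: every transformation reads from the original columns only
failed = try
    transform(df, [:a, :b] => ByRow(+) => :d, :d => (x -> x ./ 2) => :d_2)
    false
catch
    true
end

# Two calls: :d exists by the time it is used
new_df = transform(df, [:a, :b] => ByRow(+) => :d)
transform!(new_df, :d => (x -> x ./ 2) => :d_2)
```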

--------------------------------

### Julia: Broadcasting Multiple Functions to Different DataFrame Columns

Source: https://dataframes.juliadata.org/stable/man/basics

Demonstrates how to apply different functions to different columns within a DataFrame. This example creates two simple functions, `f1` and `f2`, and broadcasts them to columns 'a' and 'b' respectively, creating new columns for the results.

```julia
julia> df = DataFrame(a=1:4, b=5:8)
4×2 DataFrame
 Row │ a      b
     │ Int64  Int64
─────┼──────────────
   1 │     1      5
   2 │     2      6
   3 │     3      7
   4 │     4      8

julia> f1(x) = x .+ 1
f1 (generic function with 1 method)

julia> f2(x) = x ./ 10
f2 (generic function with 1 method)

julia> transform(df, [:a, :b] .=> [f1, f2])
4×4 DataFrame
 Row │ a      b      a_f1   b_f2
     │ Int64  Int64  Int64  Float64
─────┼──────────────────────────────
   1 │     1      5      2      0.5
   2 │     2      6      3      0.6
   3 │     3      7      4      0.7
   4 │     4      8      5      0.8
```

--------------------------------

### Get Group Indices with DataFrames.jl

Source: https://dataframes.juliadata.org/stable/man/split_apply_combine

Demonstrates how to retrieve the group number for each row in a grouped DataFrame using the `groupindices` operation. This can be used with `combine` or `transform` to add a group index column, or directly as a function to get a vector of indices.

```julia
julia> combine(gdf, groupindices)
3×2 DataFrame
 Row │ customer_id  groupindices
     │ String       Int64
─────┼───────────────────────────
   1 │ a                       1
   2 │ b                       2
   3 │ c                       3

```

```julia
julia> transform(gdf, groupindices)
6×4 DataFrame
 Row │ customer_id  transaction_id  volume  groupindices
     │ String       Int64           Int64   Int64
─────┼───────────────────────────────────────────────────
   1 │ a                        12       2             1
   2 │ b                        15       3             2
   3 │ b                        19       1             2
   4 │ b                        17       4             2
   5 │ c                        13       5             3
   6 │ c                        11       9             3

```

```julia
julia> combine(gdf, groupindices => "group_number")
3×2 DataFrame
 Row │ customer_id  group_number
     │ String       Int64
─────┼───────────────────────────
   1 │ a                       1
   2 │ b                       2
   3 │ c                       3

```

```julia
julia> groupindices(gdf)
6-element Vector{Union{Missing, Int64}}:
 1
 2
 2
 2
 3
 3

```

--------------------------------

### Applying Custom Functions Element-wise in Julia

Source: https://dataframes.juliadata.org/stable/man/basics

Illustrates applying custom defined functions element-wise to DataFrame columns using ByRow. It demonstrates a simple addition function and how it transforms a column.

```julia
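using DataFrames
df = DataFrame(a = [1, 2, 3], b = [4, 5, 4])
f(x) = x + 1                    # scalar function, applied to each element of :a
transform(df, :a => ByRow(f))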
```
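
A minimal sketch of why `ByRow` matters here, assuming DataFrames.jl is installed and an illustrative `df`: without `ByRow`, `f` receives the whole column, and adding a scalar to a vector with `+` is not defined in Julia, so the call fails.

```julia
using DataFrames

df = DataFrame(a=[1, 2, 3], b=[4, 5, 4])
f(x) = x + 1

# With ByRow, f is called once per element of :a
by_row = transform(df, :a => ByRow(f))

# Without ByRow, f is called once with the whole vector and errors
whole_column_failed = try
    transform(df, :a => f)
    false
catch
    true
end
```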

--------------------------------

### Get DataFrame Row and Column Counts with nrow() and ncol()

Source: https://dataframes.juliadata.org/stable/man/basics

The `nrow` function returns the number of rows in a DataFrame, while the `ncol` function returns the number of columns. These provide a more direct way to get specific dimension counts compared to `size()`.

```julia
julia> nrow(german)
1000

julia> ncol(german)
10
```

--------------------------------

### Select DataFrame columns using collections and patterns in Julia

Source: https://dataframes.juliadata.org/stable/man/basics

Illustrates advanced column selection using collections (like vectors), regular expressions, and special selectors (`Not`, `Between`, `All`, `Cols`). This provides flexible ways to select subsets of columns based on various criteria.

```julia
df = DataFrame(
    id = [1, 2, 3],
    first_name = ["José", "Emma", "Nathan"],
    last_name = ["Garcia", "Marino", "Boyer"],
    age = [61, 24, 33]
)

select(df, [:last_name, :first_name])

select(df, r"name")

select(df, Not(:id))

select(df, Between(2,4))
```
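
The `Cols` and `All` selectors mentioned in the description can be tried in the same way. This sketch reuses the data frame from the block above and inspects only the resulting column names.

```julia
using DataFrames

df = DataFrame(
    id = [1, 2, 3],
    first_name = ["José", "Emma", "Nathan"],
    last_name = ["Garcia", "Marino", "Boyer"],
    age = [61, 24, 33]
)

# Cols takes the union of its selectors, in the order they are given
by_cols = names(select(df, Cols(:age, r"name")))

# All() selects every column
by_all = names(select(df, All()))

# Between also accepts column names instead of positions
by_between = names(select(df, Between(:first_name, :age)))

# Not accepts a vector of columns to exclude
by_not = names(select(df, Not([:id, :age])))
```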

--------------------------------

### Get DataFrameRow Element Count

Source: https://dataframes.juliadata.org/stable/lib/functions

Returns a 1-tuple containing the number of elements in a `DataFrameRow`. If the optional dimension `dim` is specified, it must be 1, and the number of elements is returned directly as a number.

```julia
# Signature: size(dfr::DataFrameRow[, dim])
dfr = DataFrame(a=1:3, b='a':'c')[1, :]
size(dfr)     # (2,)
size(dfr, 1)  # 2
```

--------------------------------

### Get Number of Dimensions of DataFrameRow with Base.ndims

Source: https://dataframes.juliadata.org/stable/lib/functions

Returns the number of dimensions for a `DataFrameRow` or its type, which is always 1, reflecting its structure as a single row.

```julia
ndims(::DataFrameRow)
ndims(::Type{<:DataFrameRow})
```
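
A short usage sketch, assuming a hypothetical two-column data frame, shows both methods next to `length` and `size` for comparison:

```julia
using DataFrames

dfr = DataFrame(a = 1:3, b = 'a':'c')[1, :]  # first row as a DataFrameRow

n_value = ndims(dfr)          # called on an instance
n_type = ndims(typeof(dfr))   # called on the type
len = length(dfr)             # number of elements in the row
shape = size(dfr)             # 1-tuple, consistent with ndims == 1
```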

--------------------------------

### Select Rows and Specific Columns in DataFrame (Julia)

Source: https://dataframes.juliadata.org/stable/man/basics

Selects rows 1 through 5 and the `Sex` and `Age` columns of the `german` DataFrame using `df[rows, cols]` indexing with a row range and a vector of column names. The result is a new DataFrame holding a copy of the selected data.

```julia
julia> german[1:5, [:Sex, :Age]]
5×2 DataFrame
 Row │ Sex      Age
     │ String7  Int64
─────┼────────────────
   1 │ male        67
   2 │ female      22
   3 │ male        49
   4 │ male        45
   5 │ male        53
```
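
Because this form of indexing produces a new DataFrame, changes to the result do not reach the source. The sketch below illustrates this with a small hypothetical stand-in for `german`, and contrasts it with `@view`, which shares data with the parent.

```julia
using DataFrames

# Hypothetical stand-in for the `german` data frame
df = DataFrame(Sex = ["male", "female", "male"], Age = [67, 22, 49])

sub = df[1:2, [:Sex, :Age]]   # new DataFrame with copied columns
sub.Age[1] = 0                # modifies only the copy
after_copy = df.Age[1]        # parent value is unchanged

v = @view df[1:2, [:Sex, :Age]]  # SubDataFrame sharing data with df
v[1, :Age] = 68                  # writes through to the parent
after_view = df.Age[1]
```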

--------------------------------

### Get Table Metadata

Source: https://dataframes.juliadata.org/stable/lib/functions

Retrieves the value of a table-level metadata key from a DataFrame with the `metadata` function. If a default is supplied, it is returned when the key does not exist; with `style=true` the function returns a tuple of the metadata value and its style.

```julia
metadata(df::AbstractDataFrame, key::AbstractString, [default]; style::Bool=false)

# key     - the metadata key to retrieve (required)
# default - optional value returned if `key` does not exist
# style   - if true, return a tuple (value, style); defaults to false
#
# For a table whose "name" key holds "example" with style :note:
#   metadata(df, "name")              returns "example"
#   metadata(df, "name", style=true)  returns ("example", :note)
```
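
In Julia, table-level metadata is read with the `metadata` function rather than over HTTP. The following sketch sets a key with `metadata!` and reads it back; the data frame and the `"unknown_key"` / `"n/a"` values are hypothetical.

```julia
using DataFrames

df = DataFrame(a = 1, b = 2)  # hypothetical data frame

# Attach table-level metadata with an explicit :note style
metadata!(df, "name", "example", style = :note)

value = metadata(df, "name")                        # value only
value_and_style = metadata(df, "name", style = true)  # (value, style) tuple
fallback = metadata(df, "unknown_key", "n/a")       # default for a missing key
```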

--------------------------------

### Constructing DataFrame Column by Column in Julia

Source: https://dataframes.juliadata.org/stable/man/getting_started

Demonstrates creating an empty DataFrame and adding columns sequentially. It shows different syntaxes for column assignment and modification, including broadcasting scalar values. Note the difference between `df.col` and `df[:, :col]` for replacement vs. in-place updates.

```julia
df = DataFrame()
df.A = 1:8
df[:, :B] = ["M", "F", "F", "M", "F", "M", "M", "F"]
df[!, :C] .= 0
println(df)
println("Size: ", size(df))
```

```julia
df.B = df.B .== "F"
println(df)
```
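
The replacement versus in-place distinction can be made visible by checking object identity. This is a minimal sketch on a hypothetical one-column data frame, not part of the original example.

```julia
using DataFrames

df = DataFrame(A = [1, 2, 3])
original = df.A                  # df.A returns the stored vector without copying

# In-place update: values are written into the existing vector
df[:, :A] = [10, 20, 30]
same_vector = original === df.A

# Replacement: a new vector is stored, so the element type may change
df.A = ["x", "y", "z"]
replaced = original !== df.A
```

After the replacement, `original` still holds `[10, 20, 30]`, because it refers to the old vector that is no longer part of `df`.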

--------------------------------

### Get Single Row DataFrame

Source: https://dataframes.juliadata.org/stable/lib/functions

The `only` function returns a DataFrameRow if the input DataFrame has exactly one row, otherwise it throws an ArgumentError. It preserves metadata.

```julia
only(df::AbstractDataFrame)
```
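
A typical use is extracting the single match of a filter. The sketch below uses hypothetical data and also captures the error raised when the data frame has more than one row.

```julia
using DataFrames

df = DataFrame(id = [1, 2, 3], name = ["x", "y", "z"])  # hypothetical data

# Exactly one row matches, so `only` returns it as a DataFrameRow
row = only(df[df.id .== 2, :])

# With three rows, `only` throws; capture the exception for inspection
err = try
    only(df)
catch e
    e
end
```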

--------------------------------

### Get Last Row of DataFrame

Source: https://dataframes.juliadata.org/stable/lib/functions

The `last` function retrieves the last row of a DataFrame and returns it as a `DataFrameRow`. This function preserves table-level and column-level metadata.

```julia
last(df::AbstractDataFrame)
```

--------------------------------

### Get First Row of DataFrame

Source: https://dataframes.juliadata.org/stable/lib/functions

The `first` function retrieves the first row of a DataFrame and returns it as a `DataFrameRow`. This function preserves table-level and column-level metadata.

```julia
first(df::AbstractDataFrame)
```
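
As a brief illustration on hypothetical data, `first` and `last` without a count return a `DataFrameRow`, while the two-argument form `first(df, n)` returns a `DataFrame`:

```julia
using DataFrames

df = DataFrame(a = 1:5, b = 'a':'e')  # hypothetical data

head_row = first(df)     # DataFrameRow for row 1
tail_row = last(df)      # DataFrameRow for row 5
head_df = first(df, 2)   # DataFrame with the first two rows
```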

--------------------------------

### Getting Column Names from DataFrameColumns (Julia)

Source: https://dataframes.juliadata.org/stable/lib/functions

Retrieves a vector of column names as Symbols from a DataFrameColumns object, typically representing columns within a DataFrame.

```julia
keys(dfc::DataFrameColumns)
```
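
A `DataFrameColumns` object is what `eachcol` returns, so a minimal usage sketch on a hypothetical data frame looks like this:

```julia
using DataFrames

df = DataFrame(a = 1:3, b = 'a':'c')  # hypothetical data
dfc = eachcol(df)                     # DataFrameColumns iterator

col_names = keys(dfc)   # column names as Symbols
first_col = dfc[:a]     # columns can be looked up by name
```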

--------------------------------

### Constructing DataFrame from Tables.jl Interface in Julia

Source: https://dataframes.juliadata.org/stable/man/getting_started

DataFrames.jl implements the Tables.jl interface, so a DataFrame can be converted to and from other table-like data structures that follow it. This example passes a DataFrame to two Tables.jl consumers: it writes the DataFrame to a CSV file and loads it into an SQLite database table.

```julia
using DataFrames
using CSV
using SQLite

df = DataFrame(a=[1, 2, 3], b=[:a, :b, :c])

# write DataFrame out to CSV file
CSV.write("dataframe.csv", df)

# open (or create) the database file, then store the DataFrame as a table
db = SQLite.DB("mydatabase.sqlite")
SQLite.load!(df, db, "dataframe_table")
DBInterface.close!(db)
```
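
The opposite direction, building a DataFrame from a Tables.jl source, needs no extra packages for the two sources below, since a vector of `NamedTuple`s and a `NamedTuple` of vectors both satisfy the interface. The data is hypothetical.

```julia
using DataFrames

# Row-oriented source: a vector of NamedTuples
rows = [(a = 1, b = :x), (a = 2, b = :y)]

# Column-oriented source: a NamedTuple of vectors
cols = (a = [1, 2], b = [:x, :y])

df_rows = DataFrame(rows)
df_cols = DataFrame(cols)
```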

--------------------------------

### Create and Filter/Select DataFrame with TidierData.jl

Source: https://dataframes.juliadata.org/stable/man/querying_frameworks

This example shows how to create a DataFrame using DataFrames.jl and then apply filtering and selection operations using TidierData.jl's @chain, @filter, and @select macros. It demonstrates a common data wrangling pattern.

```julia
using TidierData
using DataFrames

df = DataFrame(
    name = ["John", "Sally", "Roger"],
    age = [54.0, 34.0, 79.0],
    children = [0, 2, 4]
)

@chain df begin
    @filter(children != 2)
    @select(name, num_children = children)
end
```
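
For comparison, roughly the same filter-and-rename step can be sketched with DataFrames.jl functions alone. This is an assumed equivalent written for illustration, not taken from the TidierData.jl documentation.

```julia
using DataFrames

df = DataFrame(
    name = ["John", "Sally", "Roger"],
    age = [54.0, 34.0, 79.0],
    children = [0, 2, 4]
)

# Keep rows where children != 2, then keep :name and rename :children
result = select(
    subset(df, :children => ByRow(!=(2))),
    :name,
    :children => :num_children
)
```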

--------------------------------

### Get DataFrame Dimensions

Source: https://dataframes.juliadata.org/stable/lib/functions

Retrieves the dimensions (number of rows and columns) of an AbstractDataFrame. Optionally, a specific dimension (1 for rows, 2 for columns) can be requested.

```julia
# Signature: size(df::AbstractDataFrame[, dim])
df = DataFrame(a=1:3, b='a':'c')
size(df)     # (3, 2)
size(df, 1)  # 3
```

--------------------------------

### Get Number of Rows with DataAPI.nrow

Source: https://dataframes.juliadata.org/stable/lib/functions

Returns the number of rows in an `AbstractDataFrame`. It complements `ncol`, which returns the number of columns, and `size`, which returns both dimensions.

```julia
nrow(df::AbstractDataFrame)

# Example:
df = DataFrame(i=1:10, x=rand(10), y=rand(["a", "b", "c"], 10))
nrow(df)
```