### CLI Example: Parse SQL File to JSON Source: https://context7.com/apache/datafusion-sqlparser-rs/llms.txt The included 'cli' binary parses a .sql file and prints the AST as JSON. This requires the 'json_example' feature flag. Examples show building and running the CLI with different SQL files and dialects. ```bash # Build and run the CLI example cargo run --features json_example --example cli path/to/query.sql [--dialectname] # Examples: cargo run --features json_example --example cli query.sql cargo run --features json_example --example cli query.sql --postgresql cargo run --features json_example --example cli query.sql --mysql ``` -------------------------------- ### Install honggfuzz Source: https://github.com/apache/datafusion-sqlparser-rs/blob/main/docs/fuzzing.md Installs the honggfuzz binary using cargo. Ensure you have installed system dependencies separately. ```shell cargo install honggfuzz ``` -------------------------------- ### Example Fuzzing Commands Source: https://github.com/apache/datafusion-sqlparser-rs/blob/main/docs/fuzzing.md Demonstrates running the fuzzer and debugging crashes for the `fuzz_parse_sql` target. ```shell cargo hfuzz run fuzz_parse_sql ``` ```shell cargo hfuzz run-debug fuzz_parse_sql hfuzz_workspace/fuzz_parse_sql/*.fuzz ``` -------------------------------- ### Install PyGitHub for Changelog Generation Source: https://github.com/apache/datafusion-sqlparser-rs/blob/main/dev/release/README.md Install the PyGitHub library using pip, which is required for the changelog generation script. ```shell pip3 install PyGitHub ``` -------------------------------- ### Generate Flamegraph for SQLParser Source: https://github.com/apache/datafusion-sqlparser-rs/blob/main/sqlparser_bench/README.md Create a flamegraph visualization to profile the performance of the sqlparser-rs library. This requires the `flamegraph` Cargo subcommand to be installed. ```shell cargo flamegraph --bench sqlparser_bench ``` -------------------------------- ### Tokenize SQL with Location Tracking Source: https://context7.com/apache/datafusion-sqlparser-rs/llms.txt Use Tokenizer directly for raw tokens, enabling custom parsers or lexical analysis. This example demonstrates tokenizing SQL with source location tracking and unescaping. ```rust use sqlparser::dialect::GenericDialect; use sqlparser::tokenizer::{Tokenizer, Token}; fn main() { let dialect = GenericDialect {}; let sql = "SELECT 42, 'hello' FROM t"; // Tokenize with source location tracking let tokens = Tokenizer::new(&dialect, sql) .with_unescape(true) .tokenize_with_location() .expect("tokenization failed"); for tok in &tokens { match &tok.token { Token::Word(w) => println!("WORD {:?}", w.value), Token::Number(n, _) => println!("NUMBER {n}"), Token::SingleQuotedString(s) => println!("STRING '{s}'"), Token::Comma => println!("COMMA"), Token::Whitespace(_) => {} // skip whitespace Token::EOF => println!("EOF"), other => println!("OTHER {other:?}"), } } // WORD "SELECT" // NUMBER 42 // STRING 'hello' // WORD "FROM" // WORD "t" // EOF } ``` -------------------------------- ### Generate Changelog with GitHub Token Source: https://github.com/apache/datafusion-sqlparser-rs/blob/main/dev/release/README.md Generate a changelog by setting the GITHUB_TOKEN environment variable and running the Python script with commit IDs or tags and the release version. This example generates changes between v0.51.0 and HEAD for version 0.52.0. ```shell export GITHUB_TOKEN= python ./dev/release/generate-changelog.py v0.51.0 HEAD 0.52.0 > changelog/0.52.0.md ``` -------------------------------- ### Custom Visitor Method Call with `visit` attribute Source: https://github.com/apache/datafusion-sqlparser-rs/blob/main/derive/README.md Use `#[visit(with = "method_name")]` to specify a custom visitor method to call for a type. This example shows calling `pre_visit_expr`. ```rust #[derive(Visit, VisitMut)] #[visit(with = "visit_expr")] enum Expr { IsNull(Box), .. } ``` -------------------------------- ### Enable Serde for AST Serialization/Deserialization Source: https://context7.com/apache/datafusion-sqlparser-rs/llms.txt Enable the 'serde' feature in Cargo.toml to serialize and deserialize AST nodes to/from JSON, MessagePack, etc. This example shows serializing an AST to JSON and deserializing it back. ```toml # Cargo.toml [dependencies] sqlparser = { version = "0.62", features = ["serde"] } serde_json = "1" ``` ```rust use sqlparser::dialect::GenericDialect; use sqlparser::parser::Parser; use sqlparser::ast::Statement; fn main() { let sql = "SELECT a, b FROM orders WHERE id > 100"; let stmts = Parser::parse_sql(&GenericDialect {}, sql).unwrap(); // Serialize AST to JSON let json = serde_json::to_string_pretty(&stmts[0]).unwrap(); println!("{json}"); // Deserialize back let recovered: Statement = serde_json::from_str(&json).unwrap(); assert_eq!(recovered.to_string(), stmts[0].to_string()); } ``` -------------------------------- ### Get Source Location Information with Spanned Trait Source: https://context7.com/apache/datafusion-sqlparser-rs/llms.txt Illustrates how to retrieve the start and end source code locations (line and column) for AST nodes using the `Spanned` trait. This is useful for error reporting or code analysis. ```rust use sqlparser::dialect::GenericDialect; use sqlparser::parser::Parser; use sqlparser::ast::Spanned; use sqlparser::tokenizer::Location; fn main() { let sql = "SELECT A\n FROM table_1"; let stmts = Parser::parse_sql(&GenericDialect {}, sql).unwrap(); let span = stmts[0].span(); assert_eq!(span.start, Location::new(1, 1)); // line 1, col 1 assert_eq!(span.end, Location::new(2, 15)); // line 2, col 15 println!("Statement spans lines {}-{}", span.start.line, span.end.line); } ``` -------------------------------- ### Run SQL Parser Benchmarks Source: https://github.com/apache/datafusion-sqlparser-rs/blob/main/README.md Use these commands to run micro benchmarks and save a baseline for comparison. Ensure you checkout the main branch before running the initial benchmark and your feature branch before running the comparison benchmark. ```bash git checkout main cd sqlparser_bench cargo bench -- --save-baseline main git checkout cargo bench -- --baseline main ``` -------------------------------- ### Create Custom SQL Dialects with derive_dialect Source: https://context7.com/apache/datafusion-sqlparser-rs/llms.txt Demonstrates how to create custom SQL dialects by extending existing ones using the derive_dialect macro. This allows for overriding specific dialect features or preserving base dialect properties. ```rust use sqlparser::derive_dialect; use sqlparser::dialect::{Dialect, GenericDialect, PostgreSqlDialect}; use sqlparser::parser::{Parser, ParserError}; // Override a single flag: enable ORDER BY ALL on top of GenericDialect derive_dialect!(EnhancedGeneric, GenericDialect, overrides = { supports_order_by_all = true, supports_nested_comments = true, }); // Preserve the base dialect's TypeId so the parser's dialect-specific // branches still fire (e.g., Postgres-specific parsing paths) derive_dialect!(BacktickPostgres, PostgreSqlDialect, preserve_type_id = true, overrides = { identifier_quote_style = '`' } ); fn main() -> Result<(), ParserError> { let d = EnhancedGeneric::new(); assert!(d.supports_order_by_all()); assert!(d.supports_nested_comments()); let stmts = Parser::parse_sql(&d, "SELECT * FROM t ORDER BY ALL")?; println!("{}", stmts[0]); // BacktickPostgres is still recognized as PostgreSqlDialect inside the parser let pg: &dyn Dialect = &BacktickPostgres::new(); assert!(pg.is::()); Ok(()) } ``` -------------------------------- ### Publish SQL Parser Derive Crate Source: https://github.com/apache/datafusion-sqlparser-rs/blob/main/dev/release/README.md If necessary, navigate to the 'derive' directory and run this command to publish the sqlparser_derive crate. This is a separate step from publishing the main crate. ```shell (cd derive && cargo publish) ``` -------------------------------- ### Redundant Visitor Calls with Nested Attributes Source: https://github.com/apache/datafusion-sqlparser-rs/blob/main/derive/README.md Annotating both the type and a field with `#[visit(with = ...)]` can lead to redundant visitor method calls. This example shows the resulting calls. ```rust #[derive(Visit, VisitMut)] #[visit(with = "visit_expr")] enum Expr { IsNull(#[visit(with = "visit_expr")] Box), .. } ``` -------------------------------- ### Dynamically Lookup Dialect from String Source: https://context7.com/apache/datafusion-sqlparser-rs/llms.txt Use `dialect_from_str` to get a boxed dialect by its lowercase name. This is useful for runtime configuration. It returns `None` for unknown dialect names. ```rust use sqlparser::dialect::{dialect_from_str, AnsiDialect, PostgreSqlDialect, GenericDialect}; fn main() { // Supported names: "generic", "mysql", "postgresql"/"postgres", "hive", // "sqlite", "snowflake", "redshift", "mssql", "clickhouse", "bigquery", // "ansi", "duckdb", "databricks", "spark"/"sparksql", "oracle", "teradata" let d = dialect_from_str("postgres").expect("unknown dialect"); assert!(d.is::()); let d2 = dialect_from_str("ANSI").expect("unknown dialect"); assert!(d2.is::()); assert!(dialect_from_str("unknown").is_none()); } ``` -------------------------------- ### Field-Specific Custom Visitor Method Call Source: https://github.com/apache/datafusion-sqlparser-rs/blob/main/derive/README.md Use `#[visit(with = "method_name")]` on a field to call a specific visitor method only for that field. This example uses `visit_relation` for the `name` field. ```rust #[derive(Visit, VisitMut)] #[visit(with = "visit_table_factor")] pub enum TableFactor { Table { #[visit(with = "visit_relation")] name: ObjectName, alias: Option, }, .. } ``` -------------------------------- ### Verify Release Candidate Artifacts Source: https://github.com/apache/datafusion-sqlparser-rs/blob/main/dev/release/README.md Use the verify-release-candidate.sh script to assist in the verification process of the release candidate artifacts. ```shell ./dev/release/verify-release-candidate.sh 0.52.0 1 ``` -------------------------------- ### Create and Push Release Candidate Git Tag Source: https://github.com/apache/datafusion-sqlparser-rs/blob/main/dev/release/README.md Create a git tag for a release candidate (e.g., v0.52.0-rc1) from the 'apache/main' branch and push it to the GitHub remote. ```shell git fetch apache git tag v0.52.0-rc1 apache/main # push tag to Github remote git push apache v0.52.0-rc1 ``` -------------------------------- ### Run SQLParser Benchmarks Source: https://github.com/apache/datafusion-sqlparser-rs/blob/main/sqlparser_bench/README.md Execute the performance benchmarks for the sqlparser-rs project using Cargo. This command runs all defined benchmarks within the sqlparser_bench crate. ```shell cargo bench --bench sqlparser_bench ``` -------------------------------- ### Retrieve Source Locations from AST Nodes Source: https://github.com/apache/datafusion-sqlparser-rs/blob/main/README.md This snippet demonstrates how to retrieve the source span (start and end locations) of an AST node after parsing SQL. Note that source location recovery is a work in progress. ```rust use sqlparser::dialect::GenericDialect; use sqlparser::parser::Parser; use sqlparser::ast::Span; use sqlparser::parser::Location; // Parse SQL let ast = Parser::parse_sql(&GenericDialect, "SELECT A FROM B").unwrap(); // The source span can be retrieved with start and end locations assert_eq!(ast[0].span(), Span { start: Location::of(1, 1), end: Location::of(1, 16), }); ``` -------------------------------- ### Create and Upload Release Candidate Artifacts Source: https://github.com/apache/datafusion-sqlparser-rs/blob/main/dev/release/README.md Execute the create-tarball.sh script with the version tag and release candidate number to create and upload release candidate artifacts to the Apache distribution SVN server. ```shell GITHUB_TOKEN= ./dev/release/create-tarball.sh 0.52.0 1 ``` -------------------------------- ### Publish to Crates.io Source: https://github.com/apache/datafusion-sqlparser-rs/blob/main/derive/README.md Use this command to publish the latest checked-in version of the crate to crates.io. Ensure your local main branch is up-to-date before running. ```bash cargo publish ``` -------------------------------- ### Basic Unit Test for SQL Statement Verification Source: https://github.com/apache/datafusion-sqlparser-rs/blob/main/AGENTS.md Use this template for simple unit tests where the SQL generated from the AST should match the input SQL. Ensure the dialect is correctly specified. ```rust dialect().verified_stmt(r#"..."#); ``` -------------------------------- ### Pre-commit Checks for Rust Projects Source: https://github.com/apache/datafusion-sqlparser-rs/blob/main/AGENTS.md Run these commands to ensure your code changes adhere to project standards and pass the CI process before committing. This includes running all tests, formatting, and linting. ```bash cargo test --all-features ``` ```bash cargo fmt --all ``` ```bash cargo clippy --all-targets --all-features -- -D warnings ``` -------------------------------- ### Fetch Apache Main Branch and Checkout Source: https://github.com/apache/datafusion-sqlparser-rs/blob/main/dev/release/README.md Fetch the latest changes from the 'apache/main' branch and checkout the main commit to be released. ```shell git fetch apache git checkout apache/main ``` -------------------------------- ### AST Traversal with Visitor Pattern Source: https://context7.com/apache/datafusion-sqlparser-rs/llms.txt Demonstrates read-only AST traversal using the `visitor` feature. Includes convenience functions like `visit_relations` and `visit_expressions`, and a custom `Visitor` implementation for targeted traversal with early exit. ```toml # Cargo.toml [dependencies] sqlparser = { version = "0.62", features = ["visitor"] } ``` ```rust use sqlparser::dialect::GenericDialect; use sqlparser::parser::Parser; use sqlparser::ast::{ visit_relations, visit_expressions, visit_statements, Expr, ObjectName, Statement, Visit, Visitor, }; use core::ops::ControlFlow; fn main() { let sql = "SELECT a FROM orders WHERE id IN (SELECT id FROM customers); \ CREATE TABLE audit (ts TIMESTAMP)"; let stmts = Parser::parse_sql(&GenericDialect {}, sql).unwrap(); // Collect all table names let mut tables = vec![]; visit_relations(&stmts, |rel: &ObjectName| { tables.push(rel.to_string()); ControlFlow::<()>::Continue(()) }); assert_eq!(tables, ["orders", "customers", "audit"]); // Collect all expressions (pre-order) let mut exprs = vec![]; visit_expressions(&stmts, |e: &Expr| { exprs.push(e.to_string()); ControlFlow::<()>::Continue(()) }); println!("expressions: {:?}", exprs); // ["a", "id IN (SELECT id FROM customers)", "id", "id"] // Full visitor with early exit on first SELECT #[derive(Default)] struct FindFirstSelect { found: bool } impl Visitor for FindFirstSelect { type Break = (); fn pre_visit_statement(&mut self, stmt: &Statement) -> ControlFlow<()> { if matches!(stmt, Statement::Query(_)) { self.found = true; ControlFlow::Break(()) } else { ControlFlow::Continue(()) } } } let mut v = FindFirstSelect::default(); stmts.visit(&mut v); assert!(v.found); } ``` -------------------------------- ### AST Round-trip and Pretty-Printing Source: https://context7.com/apache/datafusion-sqlparser-rs/llms.txt Shows how to convert parsed SQL statements back into SQL strings. Implements `Display` for compact, normalized output and uses the alternate formatter (`{:#}`) for indented, pretty-printed output. ```rust use sqlparser::dialect::GenericDialect; use sqlparser::parser::Parser; fn main() { let sql = "select A,B from foo where X > 1 ORDER BY A DESC"; let stmts = Parser::parse_sql(&GenericDialect {}, sql).unwrap(); // Compact normalized output let compact = stmts[0].to_string(); assert_eq!(compact, "SELECT A, B FROM foo WHERE X > 1 ORDER BY A DESC"); println!("compact: {}", compact); // Pretty-printed with newlines and indentation let pretty = format!("{{:#}}", stmts[0]); println!("pretty:\n{}", pretty); // SELECT // A, // B // FROM // foo // WHERE // X > 1 // ORDER BY // A DESC } ``` -------------------------------- ### Promote RC Tag to Official Release Tag Source: https://github.com/apache/datafusion-sqlparser-rs/blob/main/dev/release/README.md Promote the release candidate tag to the official release tag by creating a new tag (e.g., v0.52.0) pointing to the same commit and pushing it to the Apache remote. ```shell git tag v0.52.0 v0.52.0-rc3 git push apache v0.52.0 ``` -------------------------------- ### Parse and Round Trip SQL Syntax Source: https://github.com/apache/datafusion-sqlparser-rs/blob/main/README.md Use this snippet to parse SQL text into an Abstract Syntax Tree (AST) and then regenerate the original SQL text. It also shows how to pretty-print the AST. ```rust use sqlparser::dialect::GenericDialect; use sqlparser::parser::Parser; // Parse SQL let sql = "SELECT 'hello'"; let ast = Parser::parse_sql(&GenericDialect, sql).unwrap(); // The original SQL text can be generated from the AST assert_eq!(ast[0].to_string(), sql); // The SQL can also be pretty-printed with newlines and indentation assert_eq!(format!({"#"}), ast[0]), "SELECT\n 'hello'"); ``` -------------------------------- ### Migrate ObjectName construction Source: https://github.com/apache/datafusion-sqlparser-rs/blob/main/changelog/0.55.0.md When constructing an `ObjectName`, use the new `From` implementation for convenience. ```diff ```diff - name: ObjectName(vec![Ident::new("f")]), + name: ObjectName::from(vec![Ident::new("f")]), ``` ``` -------------------------------- ### Parse SQL with Specific Dialect Structs Source: https://context7.com/apache/datafusion-sqlparser-rs/llms.txt Instantiate specific dialect structs (e.g., `MySqlDialect`, `PostgreSqlDialect`) to parse SQL according to their respective syntaxes. Each dialect struct implements the `Dialect` trait. ```rust use sqlparser::parser::{Parser, ParserError}; use sqlparser::dialect::{ AnsiDialect, BigQueryDialect, ClickHouseDialect, DatabricksDialect, DuckDbDialect, GenericDialect, HiveDialect, MsSqlDialect, MySqlDialect, OracleDialect, PostgreSqlDialect, RedshiftSqlDialect, SnowflakeDialect, SparkSqlDialect, SQLiteDialect, TeradataDialect, }; fn parse_with(dialect: &D, sql: &str) -> Result<(), ParserError> { let stmts = Parser::parse_sql(dialect, sql)?; println!("[{}] {}", std::any::type_name::(), stmts[0]); Ok(()) } fn main() -> Result<(), ParserError> { // MySQL: backtick identifiers, LIMIT x,y syntax parse_with(&MySqlDialect {}, "SELECT `id` FROM `users` LIMIT 10, 20")?; // PostgreSQL: double-colon cast, $1 placeholders parse_with(&PostgreSqlDialect {}, "SELECT col::TEXT, $1 FROM tbl")?; // Snowflake: QUALIFY clause, PIVOT, MATCH_RECOGNIZE parse_with( &SnowflakeDialect {}, "SELECT * FROM t QUALIFY ROW_NUMBER() OVER (ORDER BY id) = 1", )?; // BigQuery: pipe operator, SELECT AS STRUCT parse_with(&BigQueryDialect {}, "SELECT AS STRUCT 1 AS a, 'b' AS b")?; // DuckDB: ORDER BY ALL, struct literals parse_with(&DuckDbDialect {}, "SELECT * FROM tbl ORDER BY ALL")?; // GenericDialect: union of most dialect-specific syntax when unambiguous parse_with(&GenericDialect {}, "SELECT 1")?; Ok(()) } ``` -------------------------------- ### Parser::new / Parser::try_with_sql / Parser::parse_statements Source: https://context7.com/apache/datafusion-sqlparser-rs/llms.txt Provides a builder-style API for parsing SQL, allowing fine-grained control over options like recursion depth and trailing commas before parsing. ```APIDOC ## Parser::new / Parser::try_with_sql / Parser::parse_statements ### Description Provides a builder-style API for parsing SQL, allowing fine-grained control over options like recursion depth and trailing commas before parsing. ### Method `fn new(dialect: &dyn Dialect) -> Parser<'_>` `fn try_with_sql(self, sql: &str) -> Result, ParserError>` `fn parse_statements(self) -> Result, ParserError>` ### Parameters #### Path Parameters None #### Query Parameters None #### Request Body None ### Request Example ```rust use sqlparser::dialect::PostgreSqlDialect; use sqlparser::parser::{Parser, ParserError, ParserOptions}; fn main() -> Result<(), ParserError> { let dialect = PostgreSqlDialect {}; let options = ParserOptions::new().with_trailing_commas(true); let stmts = Parser::new(&dialect) .with_options(options) .try_with_sql("SELECT a, b FROM foo,")? // trailing commas OK .parse_statements()?; Ok(()) } ``` ### Response #### Success Response (200) - **stmts** (Vec) - A vector of parsed SQL statements. #### Response Example ```json [ { "kind": "Select", "projection": [ {"kind": "Unnamed", "expr": {"kind": "Ident", "value": "a", "quote_style": null}}, {"kind": "Unnamed", "expr": {"kind": "Ident", "value": "b", "quote_style": null}} ], "from": [ { "relation": {"kind": "Table", "relation": {"kind": "Ident", "value": "foo", "quote_style": null}}, "alias": null, "left_join_conditions": [] } ], "selection": null, "group_by": [], "having": null, "order_by": [] } ] ``` ERROR HANDLING: - Returns `ParserError` if parsing fails or recursion limit is exceeded. ``` -------------------------------- ### Generated Visitor Calls for Custom Method Source: https://github.com/apache/datafusion-sqlparser-rs/blob/main/derive/README.md When `#[visit(with = "visit_expr")]` is used, the generated code calls `visitor.pre_visit_expr` before and after visiting the inner fields. ```text visitor.pre_visit_expr() visitor.pre_visit_expr() visitor.post_visit_expr() visitor.post_visit_expr() ``` -------------------------------- ### Run honggfuzz fuzzer Source: https://github.com/apache/datafusion-sqlparser-rs/blob/main/docs/fuzzing.md Executes the honggfuzz fuzzer against a specified target within the fuzz directory. List available targets using `cargo read-manifest`. ```shell cd fuzz cargo hfuzz run ```