Insyra
https://github.com/hazelnutparadise/insyra
# Insyra

Insyra is a next-generation data analysis library for Go that provides intuitive data structures, statistical analysis, data visualization, and seamless Python integration. The library centers around two core data structures: `DataList` (one-dimensional data container) and `DataTable` (two-dimensional tabular data), offering methods for data manipulation, statistical calculations, filtering, and transformation. Insyra supports parallel processing, multiple file formats (CSV, Excel, JSON, Parquet), and features a powerful Column Calculation Language (CCL) that works like Excel formulas.

The library is designed for developers handling complex data structures who need fast, lovely, and easy-to-use tools for data analysis. It provides thread-safe operations through an actor-style serialization pattern, defensive copies for all public data accessors, and comprehensive error handling. With sub-packages for statistics (`stats`), visualization (`plot`, `gplot`), parallel computing (`parallel`), linear programming (`lp`, `lpgen`), and marketing analytics (`mkt`), Insyra covers a wide range of data analysis needs.

## Installation

```bash
go get github.com/HazelnutParadise/insyra/allpkgs
```

## Creating DataList

DataList is a dynamic, generic container for storing and manipulating collections of data with built-in statistical analysis capabilities.
```go
package main

import (
	"fmt"

	"github.com/HazelnutParadise/insyra"
)

func main() {
	// Create DataList with values
	dl := insyra.NewDataList(1, 2, 3, 4, 5)

	// Create with mixed types
	dlMixed := insyra.NewDataList("Alice", 25, true, 3.14)
	dlMixed.Show()

	// Create with nested slices (automatically flattened)
	dlFlat := insyra.NewDataList([]int{1, 2}, []string{"a", "b"})
	dlFlat.Show() // Result: [1, 2, "a", "b"]

	// Set name and append data
	dl.SetName("Scores").Append(6, 7, 8)

	// Access elements (supports negative indexing)
	fmt.Println(dl.Get(0))  // 1 (first element)
	fmt.Println(dl.Get(-1)) // 8 (last element)

	// Display data
	dl.Show()
	// Output:
	// DataList: [1 2 3 4 5 6 7 8]
}
```

## DataList Statistical Analysis

DataList provides comprehensive statistical methods including mean, median, standard deviation, quartiles, and more.

```go
package main

import (
	"fmt"

	"github.com/HazelnutParadise/insyra"
)

func main() {
	dl := insyra.NewDataList(10, 20, 30, 40, 50, 60, 70, 80, 90, 100)

	// Basic statistics
	fmt.Printf("Sum: %.2f\n", dl.Sum())       // 550.00
	fmt.Printf("Mean: %.2f\n", dl.Mean())     // 55.00
	fmt.Printf("Median: %.2f\n", dl.Median()) // 55.00
	fmt.Printf("Min: %.2f\n", dl.Min())       // 10.00
	fmt.Printf("Max: %.2f\n", dl.Max())       // 100.00
	fmt.Printf("Range: %.2f\n", dl.Range())   // 90.00

	// Variance and standard deviation
	fmt.Printf("Stdev: %.2f\n", dl.Stdev()) // Sample standard deviation
	fmt.Printf("Var: %.2f\n", dl.Var())     // Sample variance

	// Quartiles and percentiles
	fmt.Printf("Q1: %.2f\n", dl.Quartile(1))
	fmt.Printf("Q2: %.2f\n", dl.Quartile(2))
	fmt.Printf("Q3: %.2f\n", dl.Quartile(3))
	fmt.Printf("IQR: %.2f\n", dl.IQR())
	fmt.Printf("75th Percentile: %.2f\n", dl.Percentile(75))

	// Time series analysis
	ma := dl.MovingAverage(3)
	ma.Show() // 3-period moving average

	smoothed := dl.ExponentialSmoothing(0.3)
	smoothed.Show()

	// Display comprehensive summary
	dl.Summary()
}
```

## Creating DataTable

DataTable provides a tabular data structure with rich manipulation capabilities including reading from files,
filtering, sorting, and statistical aggregation.

```go
package main

import (
	"fmt"
	"log"

	"github.com/HazelnutParadise/insyra"
)

func main() {
	// Create DataTable from DataLists
	names := insyra.NewDataList("Alice", "Bob", "Charlie").SetName("Name")
	ages := insyra.NewDataList(28, 34, 29).SetName("Age")
	cities := insyra.NewDataList("NYC", "LA", "Chicago").SetName("City")

	dt := insyra.NewDataTable(names, ages, cities)
	dt.SetName("Employees")

	// Display table
	dt.Show()

	// Access elements
	fmt.Println(dt.GetElement(0, "A"))         // "Alice"
	fmt.Println(dt.GetCol("B").Data())         // [28 34 29]
	fmt.Println(dt.GetRow(1).Data())           // ["Bob" 34 "LA"]
	fmt.Println(dt.GetColByName("Age").Mean()) // 30.33

	// Read from CSV file
	dtCSV, err := insyra.ReadCSV_File("data.csv", false, true)
	if err != nil {
		log.Fatal(err)
	}
	dtCSV.Show()

	// Read from JSON
	dtJSON, err := insyra.ReadJSON_File("data.json")
	if err != nil {
		log.Fatal(err)
	}
	dtJSON.Show()

	// Save to files
	dt.ToCSV("output.csv", false, true, false)
	dt.ToJSON("output.json", true)
}
```

## Column Calculation Language (CCL)

CCL provides Excel-like formula syntax for column calculations with support for conditionals, aggregations, and row/column references.
```go
package main

import (
	"github.com/HazelnutParadise/insyra"
)

func main() {
	// Create sample data
	dt := insyra.NewDataTable(
		insyra.NewDataList(85, 92, 78, 95, 88).SetName("Score"),
		insyra.NewDataList(100, 200, 150, 300, 250).SetName("Sales"),
		insyra.NewDataList(10, 20, 15, 30, 25).SetName("Cost"),
	)

	// Add calculated columns using CCL

	// Conditional classification
	dt.AddColUsingCCL("Grade", "IF(A > 90, 'A', IF(A > 80, 'B', 'C'))")

	// Basic arithmetic
	dt.AddColUsingCCL("Profit", "['Sales'] - ['Cost']")

	// Percentage calculation
	dt.AddColUsingCCL("Margin", "(['Sales'] - ['Cost']) / ['Sales'] * 100")

	// Aggregate functions
	dt.AddColUsingCCL("TotalSales", "SUM(['Sales'])")
	dt.AddColUsingCCL("AvgScore", "AVG(['Score'])")

	// Row-wise calculations
	dt.AddColUsingCCL("RowSum", "SUM(@.#)")

	// Range checks with chained comparisons
	dt.AddColUsingCCL("InRange", "IF(80 <= A <= 90, 'Target', 'Outside')")

	// CASE function for multiple conditions
	dt.AddColUsingCCL("Performance", "CASE(A > 90, 'Excellent', A > 80, 'Good', A > 70, 'Fair', 'Poor')")

	// Execute multiple CCL statements
	dt.ExecuteCCL(`
		NEW('Bonus') = IF(['Score'] > 90, ['Sales'] * 0.1, 0)
		NEW('FinalSales') = ['Sales'] + Bonus
	`)

	dt.Show()
}
```

## DataTable Filtering and Sorting

DataTable provides flexible filtering by rows, columns, values, and custom predicates, plus multi-column sorting.
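To make the semantics concrete before the library calls, here is a minimal standard-library sketch of predicate filtering followed by a two-key sort (first key ascending, second key descending). The `Employee` type and both helper functions are hypothetical illustrations, not part of Insyra, and say nothing about how Insyra implements these operations internally.

```go
package main

import (
	"fmt"
	"sort"
)

// Employee is a hypothetical row type used only for this sketch.
type Employee struct {
	Name       string
	Department string
	Salary     int
}

// Filter returns the rows for which keep reports true.
// The input slice is left untouched.
func Filter(rows []Employee, keep func(Employee) bool) []Employee {
	out := make([]Employee, 0, len(rows))
	for _, r := range rows {
		if keep(r) {
			out = append(out, r)
		}
	}
	return out
}

// SortByDeptThenSalary orders rows by Department ascending and,
// within the same department, by Salary descending.
func SortByDeptThenSalary(rows []Employee) {
	sort.SliceStable(rows, func(i, j int) bool {
		if rows[i].Department != rows[j].Department {
			return rows[i].Department < rows[j].Department
		}
		return rows[i].Salary > rows[j].Salary
	})
}

func main() {
	rows := []Employee{
		{"Alice", "Sales", 5000},
		{"Bob", "Engineering", 7000},
		{"Charlie", "Sales", 6500},
		{"Dana", "Engineering", 8000},
	}

	wellPaid := Filter(rows, func(e Employee) bool { return e.Salary > 5500 })
	SortByDeptThenSalary(wellPaid)

	for _, e := range wellPaid {
		fmt.Println(e.Department, e.Name, e.Salary)
	}
	// Engineering Dana 8000
	// Engineering Bob 7000
	// Sales Charlie 6500
}
```

With Insyra, the same kind of filtering and ordering goes through `FilterRows` and `SortBy`: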
```go
package main

import (
	"github.com/HazelnutParadise/insyra"
)

func main() {
	dt, _ := insyra.ReadCSV_File("employees.csv", false, true)

	// Filter rows using custom function
	filtered := dt.FilterRows(func(colIndex, colName string, x any) bool {
		if colName == "Age" {
			if age, ok := x.(int); ok {
				return age > 30
			}
		}
		return false
	})

	// Filter by row index range
	first5 := dt.FilterRowsByRowIndexLessThan(5)
	last5 := dt.FilterRowsByRowIndexGreaterThan(dt.Count() - 6)
	first5.Show()
	last5.Show()

	// Filter columns by name pattern
	salaryData := dt.FilterColsByColNameContains("salary")
	salaryData.Show()

	// Drop rows/columns containing specific values
	cleaned := dt.DropRowsContainNil().DropColsContainNaN()
	cleaned.Show()

	// Sort by single column
	dt.SortBy(insyra.DataTableSortConfig{
		ColumnName: "Age",
		Descending: false,
	})

	// Multi-column sort
	dt.SortBy(
		insyra.DataTableSortConfig{ColumnName: "Department", Descending: false},
		insyra.DataTableSortConfig{ColumnName: "Salary", Descending: true},
	)

	// Simple random sampling
	sample := dt.SimpleRandomSample(100)

	filtered.Show()
	sample.Show()
}
```

## Syntactic Sugar with isr Package

The `isr` package provides a fluent, method-chaining API for more concise and readable code.
```go
package main

import (
	"fmt"

	"github.com/HazelnutParadise/insyra/isr"
)

func main() {
	// Create DataList with fluent syntax
	dl := isr.DL.Of(1, 2, 3, 4, 5).Push(6, 7, 8)
	dl.Show()
	fmt.Println("Mean:", dl.Mean())

	// Create DataTable from various sources
	dt := isr.DT.From(isr.Rows{
		{"Name": "Alice", "Age": 28, "City": "NYC"},
		{"Name": "Bob", "Age": 34, "City": "LA"},
		{"Name": "Charlie", "Age": 29, "City": "Chicago"},
	})

	// Access data
	row := dt.Row(0)
	col := dt.Col("Name")
	value := dt.At(0, "Age")
	fmt.Println(row.Data())
	fmt.Println(col.Data())
	fmt.Println(value)

	// From CSV file
	dtCSV := isr.DT.From(isr.CSV{
		FilePath: "data.csv",
		InputOpts: isr.CSV_inOpts{
			FirstRow2ColNames: true,
		},
	})
	dtCSV.Show()

	// From JSON
	dtJSON := isr.DT.From(isr.JSON{FilePath: "data.json"})
	dtJSON.Show()

	// Execute CCL with method chaining
	result := isr.DT.From(isr.Rows{
		{"A": 10, "B": 20},
		{"A": 30, "B": 40},
	}).CCL("NEW('Sum') = A + B").Col(isr.Name("Sum"))
	result.Show()
}
```

## Statistical Analysis with stats Package

The `stats` package provides comprehensive statistical functions including correlation, hypothesis testing, regression, and ANOVA.
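For readers who want to sanity-check the correlation and regression figures such functions report, the following standard-library sketch computes Pearson's r and the ordinary least-squares line by hand. The function names `Pearson` and `FitLine` are hypothetical and are not Insyra APIs; the sketch assumes both slices are non-empty, of equal length, and not constant.

```go
package main

import (
	"fmt"
	"math"
)

// mean returns the arithmetic mean of xs (xs must be non-empty).
func mean(xs []float64) float64 {
	sum := 0.0
	for _, x := range xs {
		sum += x
	}
	return sum / float64(len(xs))
}

// Pearson returns Pearson's correlation coefficient for paired samples.
func Pearson(x, y []float64) float64 {
	mx, my := mean(x), mean(y)
	var sxy, sxx, syy float64
	for i := range x {
		dx, dy := x[i]-mx, y[i]-my
		sxy += dx * dy
		sxx += dx * dx
		syy += dy * dy
	}
	return sxy / math.Sqrt(sxx*syy)
}

// FitLine returns the slope and intercept of the least-squares line
// y = slope*x + intercept.
func FitLine(x, y []float64) (slope, intercept float64) {
	mx, my := mean(x), mean(y)
	var sxy, sxx float64
	for i := range x {
		sxy += (x[i] - mx) * (y[i] - my)
		sxx += (x[i] - mx) * (x[i] - mx)
	}
	slope = sxy / sxx
	return slope, my - slope*mx
}

func main() {
	x := []float64{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
	y := []float64{2.1, 4.2, 5.8, 8.1, 10.2, 11.9, 14.1, 16.0, 18.2, 20.1}

	slope, intercept := FitLine(x, y)
	fmt.Printf("r = %.4f\n", Pearson(x, y))
	fmt.Printf("slope = %.4f, intercept = %.4f\n", slope, intercept)
}
```

The `stats` package reports these quantities together with p-values and confidence intervals: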
```go
package main

import (
	"fmt"

	"github.com/HazelnutParadise/insyra"
	"github.com/HazelnutParadise/insyra/stats"
)

func main() {
	// Sample data
	x := insyra.NewDataList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
	y := insyra.NewDataList(2.1, 4.2, 5.8, 8.1, 10.2, 11.9, 14.1, 16.0, 18.2, 20.1)

	// Correlation analysis
	corr := stats.Correlation(x, y, stats.PearsonCorrelation)
	fmt.Printf("Correlation: %.4f, P-value: %.4f\n", corr.Statistic, corr.PValue)

	// Linear regression
	reg := stats.LinearRegression(y, x)
	fmt.Printf("Slope: %.4f, Intercept: %.4f\n", reg.Slope, reg.Intercept)
	fmt.Printf("R-squared: %.4f\n", reg.RSquared)
	fmt.Printf("95%% CI for Slope: [%.4f, %.4f]\n",
		reg.ConfidenceIntervalSlope[0], reg.ConfidenceIntervalSlope[1])

	// T-test
	group1 := insyra.NewDataList(23, 25, 28, 30, 32)
	group2 := insyra.NewDataList(30, 32, 35, 37, 40)
	ttest := stats.TwoSampleTTest(group1, group2, true, 0.95)
	fmt.Printf("T-test: t=%.4f, p=%.4f\n", ttest.Statistic, ttest.PValue)

	// One-way ANOVA
	g1 := insyra.NewDataList(4, 5, 6)
	g2 := insyra.NewDataList(7, 8, 9)
	g3 := insyra.NewDataList(1, 2, 3)
	anova := stats.OneWayANOVA(g1, g2, g3)
	fmt.Printf("ANOVA: F=%.4f, p=%.4f\n", anova.Factor.F, anova.Factor.P)

	// Chi-square test
	observed := insyra.NewDataList("A", "B", "A", "C", "A", "B", "B", "C")
	chiResult := stats.ChiSquareGoodnessOfFit(observed, nil, true)
	chiResult.Show()

	// Skewness and Kurtosis
	data := insyra.NewDataList(1, 2, 2, 3, 3, 3, 4, 4, 5)
	fmt.Printf("Skewness: %.4f\n", stats.Skewness(data))
	fmt.Printf("Kurtosis: %.4f\n", stats.Kurtosis(data))
}
```

## Interactive Charts with plot Package

The `plot` package creates interactive web-based charts using go-echarts, with support for HTML and PNG export.
```go
package main

import (
	"github.com/HazelnutParadise/insyra"
	"github.com/HazelnutParadise/insyra/plot"
)

func main() {
	// Create sample data
	sales := insyra.NewDataList(100, 150, 120, 180, 200).SetName("Sales")
	costs := insyra.NewDataList(60, 80, 70, 100, 110).SetName("Costs")

	// Bar Chart
	barConfig := plot.BarChartConfig{
		Title:      "Monthly Performance",
		XAxis:      []string{"Jan", "Feb", "Mar", "Apr", "May"},
		ShowLabels: true,
		Theme:      plot.ThemeWesteros,
	}
	barChart := plot.CreateBarChart(barConfig, sales, costs)
	plot.SaveHTML(barChart, "bar_chart.html")

	// Line Chart with smooth lines
	lineConfig := plot.LineChartConfig{
		Title:    "Sales Trend",
		XAxis:    []string{"Jan", "Feb", "Mar", "Apr", "May"},
		Smooth:   true,
		FillArea: true,
	}
	lineChart := plot.CreateLineChart(lineConfig, sales)
	plot.SaveHTML(lineChart, "line_chart.html")

	// Pie Chart
	pieConfig := plot.PieChartConfig{
		Title:       "Market Share",
		ShowLabels:  true,
		ShowPercent: true,
	}
	pieChart := plot.CreatePieChart(pieConfig,
		plot.PieItem{Name: "Product A", Value: 45},
		plot.PieItem{Name: "Product B", Value: 30},
		plot.PieItem{Name: "Product C", Value: 25},
	)
	plot.SaveHTML(pieChart, "pie_chart.html")

	// Scatter Chart
	scatterConfig := plot.ScatterChartConfig{
		Title:     "X-Y Correlation",
		XAxisName: "Variable X",
		YAxisName: "Variable Y",
	}
	scatterData := map[string][]plot.ScatterPoint{
		"Series1": {{X: 1, Y: 2}, {X: 2, Y: 4}, {X: 3, Y: 5}, {X: 4, Y: 8}},
	}
	scatterChart := plot.CreateScatterChart(scatterConfig, scatterData)
	plot.SaveHTML(scatterChart, "scatter_chart.html")

	// Save as PNG (requires Chrome/Chromium)
	plot.SavePNG(barChart, "bar_chart.png")
}
```

## Parquet File Operations

The `parquet` package provides efficient read/write support for Apache Parquet files with streaming and CCL integration.
```go
package main

import (
	"context"
	"fmt"

	"github.com/HazelnutParadise/insyra"
	"github.com/HazelnutParadise/insyra/parquet"
)

func main() {
	ctx := context.Background()

	// Inspect Parquet file metadata
	info, _ := parquet.Inspect("data.parquet")
	fmt.Printf("Rows: %d, RowGroups: %d\n", info.NumRows, info.NumRowGroups)

	// Read entire file
	dt, _ := parquet.Read(ctx, "data.parquet", parquet.ReadOptions{})
	dt.Show()

	// Read specific columns and row groups
	dtPartial, _ := parquet.Read(ctx, "data.parquet", parquet.ReadOptions{
		Columns:   []string{"id", "name", "value"},
		RowGroups: []int{0, 1},
	})
	dtPartial.Show()

	// Stream large files in batches
	dtChan, errChan := parquet.Stream(ctx, "large.parquet", parquet.ReadOptions{}, 1000)
stream:
	for {
		select {
		case batch, ok := <-dtChan:
			if !ok {
				// All batches consumed; leave the loop (not main) so the
				// remaining examples still run.
				break stream
			}
			rows, _ := batch.Size()
			fmt.Printf("Processed batch with %d rows\n", rows)
		case err := <-errChan:
			if err != nil {
				panic(err)
			}
			break stream
		}
	}

	// Read single column
	col, _ := parquet.ReadColumn(ctx, "data.parquet", "price", parquet.ReadColumnOptions{})
	fmt.Printf("Price mean: %.2f\n", col.Mean())

	// Filter with CCL
	filtered, _ := parquet.FilterWithCCL(ctx, "data.parquet", "(['price'] > 100) && (['status'] == 'active')")
	filtered.Show()

	// Write DataTable to Parquet
	newDT := insyra.NewDataTable(
		insyra.NewDataList(1, 2, 3).SetName("ID"),
		insyra.NewDataList("A", "B", "C").SetName("Code"),
	)
	parquet.Write(newDT, "output.parquet")
}
```

## Parallel Processing

The `parallel` package enables simple concurrent execution of multiple functions with result collection.
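The underlying idea, running several functions at once and collecting their results in call order, can be sketched with goroutines and a `sync.WaitGroup`. The `RunAll` helper below is a hypothetical illustration written against the standard library only; it is not how the `parallel` package is implemented, and it simplifies every task to the shape `func() any`.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// RunAll runs every task in its own goroutine and returns the results in
// the same order as the tasks were passed in.
func RunAll(tasks ...func() any) []any {
	results := make([]any, len(tasks))
	var wg sync.WaitGroup
	for i, task := range tasks {
		wg.Add(1)
		go func(i int, task func() any) {
			defer wg.Done()
			results[i] = task() // each goroutine writes only its own slot
		}(i, task)
	}
	wg.Wait() // block until every task has finished
	return results
}

func main() {
	start := time.Now()
	results := RunAll(
		func() any { time.Sleep(100 * time.Millisecond); return []string{"Alice", "Bob"} },
		func() any { time.Sleep(150 * time.Millisecond); return 42 },
	)
	for i, r := range results {
		fmt.Printf("Task %d: %v\n", i, r)
	}
	// The total is close to the slowest task, not the sum of both.
	fmt.Println("elapsed under 250ms:", time.Since(start) < 250*time.Millisecond)
}
```

Because each goroutine writes to a distinct slice index, no mutex is needed for the results. Insyra wraps this pattern behind `GroupUp`, `Run`, and `AwaitResult`: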
```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"

	"github.com/HazelnutParadise/insyra/parallel"
)

func main() {
	// Define functions to run in parallel
	fetchUsers := func() ([]string, error) {
		time.Sleep(100 * time.Millisecond)
		return []string{"Alice", "Bob", "Charlie"}, nil
	}

	fetchOrders := func() (int, error) {
		time.Sleep(150 * time.Millisecond)
		return 42, nil
	}

	fetchStats := func() map[string]float64 {
		time.Sleep(80 * time.Millisecond)
		return map[string]float64{"avg": 3.14, "sum": 100.5}
	}

	// Run in parallel and await results
	pg := parallel.GroupUp(fetchUsers, fetchOrders, fetchStats).Run()
	results := pg.AwaitResult()

	fmt.Printf("All tasks completed in parallel\n")
	for i, result := range results {
		fmt.Printf("Task %d: %v\n", i, result)
	}

	// For functions with no return values.
	// The tasks run concurrently, so shared state is updated atomically
	// and with order-independent operations to avoid a data race.
	var counter atomic.Int64
	addOne := func() { counter.Add(1) }
	addTen := func() { counter.Add(10) }

	parallel.GroupUp(addOne, addTen).Run().AwaitNoResult()
	fmt.Printf("Counter: %d\n", counter.Load())
}
```

## DataTable Merge Operations

DataTable supports SQL-like merge operations including inner, outer, left, and right joins.
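The four join modes differ only in which unmatched keys survive. The sketch below shows that rule on two small keyed tables using plain Go; `Record`, `Joined`, `Mode`, and `Join` are hypothetical names for illustration, keys are assumed unique within each table, and the row order produced here is a choice of this sketch rather than a statement about Insyra's output order.

```go
package main

import "fmt"

// Record is one row of a two-column table: a key and a value.
type Record struct {
	ID    string
	Value int
}

// Joined is one output row. A nil pointer means "no match on this side".
type Joined struct {
	ID          string
	Left, Right *int
}

// Mode selects which unmatched keys are kept.
type Mode int

const (
	InnerJoin Mode = iota // keys present on both sides
	LeftJoin              // all left keys
	RightJoin             // all right keys
	OuterJoin             // keys from either side
)

// Join merges left and right on ID, assuming IDs are unique per table.
func Join(left, right []Record, mode Mode) []Joined {
	rightByID := make(map[string]int, len(right)) // ID -> index in right
	for i, r := range right {
		rightByID[r.ID] = i
	}

	matched := make(map[string]bool)
	var out []Joined
	for i := range left {
		l := &left[i]
		if j, ok := rightByID[l.ID]; ok {
			matched[l.ID] = true
			out = append(out, Joined{l.ID, &l.Value, &right[j].Value})
		} else if mode == LeftJoin || mode == OuterJoin {
			out = append(out, Joined{l.ID, &l.Value, nil})
		}
	}
	if mode == RightJoin || mode == OuterJoin {
		for i := range right {
			if !matched[right[i].ID] {
				out = append(out, Joined{right[i].ID, nil, &right[i].Value})
			}
		}
	}
	return out
}

func main() {
	left := []Record{{"A", 1}, {"B", 2}, {"C", 3}}
	right := []Record{{"B", 10}, {"C", 20}, {"D", 30}}

	for _, m := range []Mode{InnerJoin, LeftJoin, RightJoin, OuterJoin} {
		var ids []string
		for _, row := range Join(left, right, m) {
			ids = append(ids, row.ID)
		}
		fmt.Println(ids)
	}
	// [B C]
	// [A B C]
	// [B C D]
	// [A B C D]
}
```

In Insyra the mode is selected with the `MergeMode*` constants and the key column is passed by name: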
```go
package main

import (
	"github.com/HazelnutParadise/insyra"
)

func main() {
	// Create two tables
	dt1 := insyra.NewDataTable(
		insyra.NewDataList("A", "B", "C").SetName("ID"),
		insyra.NewDataList(1, 2, 3).SetName("Value1"),
	)
	dt2 := insyra.NewDataTable(
		insyra.NewDataList("B", "C", "D").SetName("ID"),
		insyra.NewDataList(10, 20, 30).SetName("Value2"),
	)

	// Inner join - only matching rows
	inner, _ := dt1.Merge(dt2, insyra.MergeDirectionHorizontal, insyra.MergeModeInner, "ID")
	inner.Show()
	// Result: B, C rows with both Value1 and Value2

	// Left join - keep all from dt1
	left, _ := dt1.Merge(dt2, insyra.MergeDirectionHorizontal, insyra.MergeModeLeft, "ID")
	left.Show()
	// Result: A, B, C rows; A has nil for Value2

	// Right join - keep all from dt2
	right, _ := dt1.Merge(dt2, insyra.MergeDirectionHorizontal, insyra.MergeModeRight, "ID")
	right.Show()
	// Result: B, C, D rows; D has nil for Value1

	// Outer join - keep all rows
	outer, _ := dt1.Merge(dt2, insyra.MergeDirectionHorizontal, insyra.MergeModeOuter, "ID")
	outer.Show()
	// Result: A, B, C, D rows with nil for missing values

	// Vertical merge (append rows)
	dt3 := insyra.NewDataTable(
		insyra.NewDataList("E", "F").SetName("ID"),
		insyra.NewDataList(4, 5).SetName("Value1"),
	)
	vertical, _ := dt1.Merge(dt3, insyra.MergeDirectionVertical, insyra.MergeModeOuter)
	vertical.Show()
}
```

## Error Handling

Insyra provides both instance-level error tracking for method chaining and a global error buffer for monitoring.
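Instance-level tracking exists because chained methods return the receiver and therefore cannot also return an `error`. A common way to reconcile the two, sketched here with a hypothetical `Series` type, is to store the first error on the object and let the caller inspect it once the chain ends. In this sketch later steps are skipped after a failure; that is an assumption of the example, and Insyra's behavior after an error may differ.

```go
package main

import (
	"errors"
	"fmt"
)

// Series is a hypothetical chainable container that remembers the first
// error raised by any of its methods.
type Series struct {
	data []float64
	err  error
}

// NewSeries copies the given values into a new Series.
func NewSeries(values ...float64) *Series {
	return &Series{data: append([]float64(nil), values...)}
}

// Add adds delta to every element unless an earlier step has failed.
func (s *Series) Add(delta float64) *Series {
	if s.err != nil {
		return s
	}
	for i := range s.data {
		s.data[i] += delta
	}
	return s
}

// DivideBy divides every element by d. Dividing by zero records an error
// and leaves the data unchanged.
func (s *Series) DivideBy(d float64) *Series {
	if s.err != nil {
		return s
	}
	if d == 0 {
		s.err = errors.New("DivideBy: division by zero")
		return s
	}
	for i := range s.data {
		s.data[i] /= d
	}
	return s
}

// Err returns the first error recorded since the last ClearErr.
func (s *Series) Err() error { return s.err }

// ClearErr resets the recorded error so the chain can be used again.
func (s *Series) ClearErr() *Series { s.err = nil; return s }

// Data returns a copy of the values so callers cannot mutate the Series.
func (s *Series) Data() []float64 { return append([]float64(nil), s.data...) }

func main() {
	s := NewSeries(2, 4, 6).Add(1).DivideBy(0).Add(100)
	if err := s.Err(); err != nil {
		fmt.Println("chain failed:", err) // chain failed: DivideBy: division by zero
	}
	fmt.Println(s.ClearErr().Add(1).Data()) // [4 6 8]
}
```

Insyra exposes the same check-after-the-chain workflow through `Err` and `ClearErr`, plus a package-level buffer: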
```go
package main

import (
	"fmt"

	"github.com/HazelnutParadise/insyra"
)

func main() {
	dl := insyra.NewDataList(1, 2, 3)

	// Chained operations with error checking
	dl.Append(4, 5).Sort().Reverse()

	// Check for errors after chain
	if err := dl.Err(); err != nil {
		fmt.Printf("Error: %s\n", err.Message)
	}
	dl.ClearErr() // Clear for next operations

	// DataTable error handling
	dt := insyra.NewDataTable(
		insyra.NewDataList(1, 2, 3),
		insyra.NewDataList("a", "b", "c"),
	)
	dt.SortBy(insyra.DataTableSortConfig{ColumnName: "NonExistent"})
	if err := dt.Err(); err != nil {
		fmt.Printf("DataTable error: %s\n", err.Message)
	}
	dt.ClearErr()

	// Global error buffer
	if insyra.HasError() {
		allErrors := insyra.GetAllErrors()
		for _, e := range allErrors {
			fmt.Printf("[%s] %s.%s: %s\n", e.Level, e.PackageName, e.FuncName, e.Message)
		}
		insyra.ClearErrors()
	}

	// Check for specific error levels
	if insyra.HasErrorAboveLevel(insyra.LogLevelWarning) {
		fmt.Println("Warnings or higher detected")
	}
}
```

Insyra provides a comprehensive toolkit for Go developers needing data analysis capabilities, from simple statistical calculations to complex data transformations and visualizations. The main use cases include data cleaning and preprocessing, statistical analysis and hypothesis testing, time series analysis, report generation with charts, ETL pipelines with Parquet support, and parallel data processing. The library integrates well with existing Go codebases through its fluent API design and can also execute Python code for leveraging Python's data science ecosystem.

The integration pattern typically involves loading data from files (CSV, Excel, JSON, Parquet) into DataTable structures, performing transformations using CCL or method chains, running statistical analysis with the stats package, and outputting results to files or visualizations. For large datasets, the streaming Parquet reader and parallel processing capabilities ensure efficient memory usage and performance.
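Of the use cases listed above, time series analysis rests on two helpers shown earlier, `MovingAverage` and `ExponentialSmoothing`. As a reference for what such helpers conventionally compute, here is a standard-library sketch. It assumes a trailing window for the moving average and the common initialisation s₀ = x₀ for exponential smoothing; Insyra's own conventions (output length, initial value) are not confirmed by this document and may differ.

```go
package main

import "fmt"

// MovingAverage returns the simple moving average over a trailing window.
// The result has len(xs)-window+1 elements, or is nil if the window is
// non-positive or longer than the input.
func MovingAverage(xs []float64, window int) []float64 {
	if window <= 0 || window > len(xs) {
		return nil
	}
	out := make([]float64, 0, len(xs)-window+1)
	sum := 0.0
	for i, x := range xs {
		sum += x
		if i >= window {
			sum -= xs[i-window] // drop the value that left the window
		}
		if i >= window-1 {
			out = append(out, sum/float64(window))
		}
	}
	return out
}

// ExponentialSmoothing computes s[0] = x[0] and
// s[t] = alpha*x[t] + (1-alpha)*s[t-1] for t >= 1.
func ExponentialSmoothing(xs []float64, alpha float64) []float64 {
	if len(xs) == 0 {
		return nil
	}
	out := make([]float64, len(xs))
	out[0] = xs[0]
	for t := 1; t < len(xs); t++ {
		out[t] = alpha*xs[t] + (1-alpha)*out[t-1]
	}
	return out
}

func main() {
	xs := []float64{10, 20, 30, 40, 50}
	fmt.Println(MovingAverage(xs, 3))          // [20 30 40]
	fmt.Println(ExponentialSmoothing(xs, 0.5)) // [10 15 22.5 31.25 40.625]
}
```

A larger `alpha` makes the smoothed series follow recent observations more closely, while a larger window makes the moving average smoother but shorter.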