### Go SDK Examples Source: https://github.com/apache/beam/blob/master/examples/notebooks/interactive-overview/getting-started.ipynb Code examples demonstrating common data processing tasks using the Apache Beam Go SDK. ```Go package main import ( "context" "fmt" "github.com/apache/beam/sdks/v2/go/pkg/beam" "github.com/apache/beam/sdks/v2/go/pkg/beam/io/textio" "github.com/apache/beam/sdks/v2/go/pkg/beam/runners/direct" ) func main() { ctx := context.Background() beam.Init() // Define pipeline options opts := []string{} // Create a pipeline p, _ := beam.NewPipeline(beam.NewPipelineOptions(opts...)) // Create a PCollection of strings words := beam.CreateList(p, []string{"hello", "world", "apache", "beam"}) // Apply transformations upperWords := beam.Map(p, func(word string) string { return strings.ToUpper(word) }) prefixedWords := beam.Map(p, func(word string) string { return fmt.Sprintf("Processed: %s", word) }) // Output the results textio.Write(p, "output.txt", prefixedWords) // Run the pipeline direct.Run(p) } ``` -------------------------------- ### Apache Beam Go SDK Examples Source: https://github.com/apache/beam/blob/master/examples/notebooks/interactive-overview/getting-started.ipynb Go language code examples for Apache Beam, illustrating pipeline construction and data manipulation. ```Go package main import ( "context" "fmt" "github.com/apache/beam/sdks/v2/go/pkg/beam" "github.com/apache/beam/sdks/v2/go/pkg/beam/testing/ptest" ) func init() { beam.RegisterFunction(squareInt) } func squareInt(x int) int { return x * x } func main() { ctx := context.Background() pipeline, _ := beam.NewPipeline() numbers := beam.CreateList1(pipeline, []int{1, 2, 3, 4, 5}) squaredNumbers := beam.ParDo(pipeline, squareInt, numbers) beam.Map(pipeline, func(x int) { fmt.Println(x) }, squaredNumbers) ptest.Run(pipeline) } ``` -------------------------------- ### Apache Beam Go SDK Examples Source: https://github.com/apache/beam/blob/master/examples/notebooks/interactive-overview/getting-started.ipynb Provides examples of using the Apache Beam Go SDK for pipeline development. ```go package main import ( "context" "fmt" "github.com/apache/beam/sdks/v2/go/pkg/beam" "github.com/apache/beam/sdks/v2/go/pkg/beam/io/textio" ) func main() { ctx := context.Background() beam.Init() p, err := beam.NewPipeline() if err != nil { panic(fmt.Sprintf("Failed to create pipeline: %v", err)) } numbers := beam.CreateList1(ctx, []int{1, 2, 3}) processed := beam.Map(ctx, numbers, func(x int) int { return x * x }) textio.Write(ctx, p, "output.txt", processed) err = p.Run(ctx) if err != nil { panic(fmt.Sprintf("Failed to run pipeline: %v", err)) } } ``` -------------------------------- ### Apache Beam Go SDK Example Source: https://github.com/apache/beam/blob/master/examples/notebooks/interactive-overview/getting-started.ipynb Example demonstrating the usage of the Apache Beam Go SDK for data processing. ```go package main import ( "context" "fmt" "github.com/apache/beam/sdks/v2/go/pkg/beam" "github.com/apache/beam/sdks/v2/go/pkg/beam/runners/direct" ) func main() { ctx := context.Background() beam.Init() // Initialize Beam SDK p := beam.NewPipeline() elements := []string{"Hello", "World"} col := beam.CreateList(elements) processed := beam.Map(ctx, func(s string) string { return "Processed: " + s }, col) beam.Output(ctx, processed, func(s string) { fmt.Println(s) }) direct.Run(ctx, p) } ``` -------------------------------- ### Apache Beam Quickstart Examples Source: https://github.com/apache/beam/blob/master/website/www/site/content/en/get-started/resources/learning-resources.md Guides for setting up and running a WordCount pipeline using different Apache Beam SDKs. These examples are essential for beginners to understand the basic workflow. ```Java /get-started/quickstart-java/ ``` ```Python /get-started/quickstart-py/ ``` ```Go /get-started/quickstart-go/ ``` -------------------------------- ### Apache Beam Go SDK Examples Source: https://github.com/apache/beam/blob/master/examples/notebooks/interactive-overview/getting-started.ipynb Showcases data processing with the Apache Beam Go SDK. Includes examples for creating pipelines and applying transformations. ```Go package main import ( "context" "fmt" "github.com/apache/beam/sdks/v2/go/pkg/beam" "github.com/apache/beam/sdks/v2/go/pkg/beam/io/textio" "github.com/apache/beam/sdks/v2/go/pkg/beam/transforms/elements" ) func init() { beam.RegisterFunction(squareInt) } func squareInt(x int) int { return x * x } func main() { ctx := context.Background() p, err := beam.NewPipeline() if err != nil { panic(fmt.Sprintf("Failed to create pipeline: %v", err)) } numbers := beam.CreateList1(ctx, p, []int{1, 2, 3, 4, 5}) squaredNumbers := beam.ParDo(ctx, elements.Map(squareInt), numbers) textio.Write(ctx, p, "output.txt", squaredNumbers) err = beam.Run(ctx, p) if err != nil { panic(fmt.Sprintf("Failed to run pipeline: %v", err)) } } ``` -------------------------------- ### Apache Beam Go SDK Examples Source: https://github.com/apache/beam/blob/master/examples/notebooks/interactive-overview/getting-started.ipynb Provides examples of using the Apache Beam Go SDK to define and execute data processing pipelines. ```Go package main import ( "context" "fmt" "github.com/apache/beam/sdks/v2/go/pkg/beam" "github.com/apache/beam/sdks/v2/go/pkg/beam/io/textio" "github.com/apache/beam/sdks/v2/go/pkg/beam/runners/direct" ) func main() { ctx := context.Background() beam.Init() p := beam.NewPipeline() options := direct.NewPipelineOptions() lines := textio.Read(p, "input.txt") upperCaseLines := beam.Map(p, func(line string) string { return strings.ToUpper(line) }, lines) textio.Write(p, "output.txt", upperCaseLines) err := beam.Run(ctx, p, options) if err != nil { fmt.Printf("Error running pipeline: %v\n", err) } } ``` -------------------------------- ### Apache Beam Go SDK Example Source: https://github.com/apache/beam/blob/master/examples/notebooks/interactive-overview/getting-started.ipynb A basic example demonstrating the use of the Apache Beam Go SDK to create and run a simple data processing pipeline. ```Go package main import ( "context" "fmt" "github.com/apache/beam/sdks/v2/go/pkg/beam" "github.com/apache/beam/sdks/v2/go/pkg/beam/runners/direct" ) func main() { ctx := context.Background() beam.Init() p := beam.NewPipeline() numbers := beam.CreateList1(p, []int{1, 2, 3, 4, 5}) squaredNumbers := beam.Map(p, func(x int) int { return x * x }, numbers) beam.ParDo0(p, &printFn{}, squaredNumbers) err := direct.Run(ctx, p) if err != nil { panic(err) } } type printFn struct {} func (f *printFn) Process(element int) { fmt.Println(element) } ``` -------------------------------- ### Apache Beam Go SDK Examples Source: https://github.com/apache/beam/blob/master/examples/notebooks/interactive-overview/getting-started.ipynb Go language code examples demonstrating the usage of the Apache Beam SDK for creating and running data processing pipelines. ```Go package main import ( "context" "fmt" "github.com/apache/beam/sdks/v2/go/pkg/beam" "github.com/apache/beam/sdks/v2/go/pkg/beam/io/textio" "github.com/apache/beam/sdks/v2/go/pkg/beam/transforms/elements" ) func main() { ctx := context.Background() beam.Init() p := beam.NewPipeline() col := p.Root() // Create a collection of numbers numbers := beam.CreateList(p, []int{1, 2, 3, 4, 5}) // Square each number squared := beam.Map(p, func(x int) int { return x * x }, numbers) // Print the squared numbers textio.Write(p, "output.txt", beam.Map(p, func(x int) string { return fmt.Sprintf("%d", x) }, squared)) beam.Run(ctx, p) } ``` -------------------------------- ### Apache Beam Go SDK Examples Source: https://github.com/apache/beam/blob/master/examples/notebooks/interactive-overview/getting-started.ipynb Provides examples of using the Apache Beam Go SDK for building data processing pipelines, showcasing its capabilities. ```Go package main import ( "context" "fmt" "github.com/apache/beam/sdks/v2/go/pkg/beam" "github.com/apache/beam/sdks/v2/go/pkg/beam/io/textio" "github.com/apache/beam/sdks/v2/go/pkg/beam/runners/direct" ) func main() { ctx := context.Background() beam.Init() p, s := beam.NewPipelineWithRoot() lines := textio.Read(s, "input.txt") upperCaseLines := beam.Map(s, func(line string) string { return strings.ToUpper(line) }, lines) textio.Write(s, "output.txt", upperCaseLines) err := direct.Run(ctx, p) if err != nil { fmt.Printf("Failed to run pipeline: %v", err) } } ``` -------------------------------- ### Apache Beam SDKs: Java Examples Source: https://github.com/apache/beam/blob/master/examples/notebooks/interactive-overview/getting-started.ipynb Provides examples of how to implement Apache Beam pipelines using the Java SDK. This covers basic pipeline construction and common transforms. ```Java import org.apache.beam.sdk.Pipeline; import org.apache.beam.sdk.options.PipelineOptions; import org.apache.beam.sdk.options.PipelineOptionsFactory; import org.apache.beam.sdk.transforms.Create; import org.apache.beam.sdk.transforms.MapElements; import org.apache.beam.sdk.values.PCollection; import org.apache.beam.sdk.values.TypeDescriptors; public class SimplePipeline { public static void main(String[] args) { PipelineOptions options = PipelineOptionsFactory.create(); Pipeline pipeline = Pipeline.create(options); PCollection numbers = pipeline.apply(Create.of(1, 2, 3, 4, 5)); PCollection squaredNumbers = numbers.apply( "SquareNumbers", MapElements.into(TypeDescriptors.integers()).via(x -> x * x) ); squaredNumbers.apply("PrintResults", MapElements.into(TypeDescriptors.void_()).via(System.out::println)); pipeline.run().waitUntilFinish(); } } ``` -------------------------------- ### Python SDK Quickstart Source: https://github.com/apache/beam/blob/master/sdks/python/README.md Guide to setting up the Python development environment, installing the Beam SDK for Python, and running an example pipeline. ```Python print('This section refers to the Python SDK quickstart guide.') # Further details can be found at /get-started/quickstart-py ``` -------------------------------- ### Beam Python SDK Quickstart Source: https://github.com/apache/beam/blob/master/website/www/site/content/en/documentation/sdks/python.md Guides users through setting up their Python development environment, installing the Beam SDK for Python, and running an example pipeline. ```bash # Install Apache Beam SDK for Python pip install apache-beam[gcp] # Example pipeline execution (conceptual) # python your_pipeline_script.py --runner=DirectRunner ``` -------------------------------- ### Apache Beam Go SDK - Basic Pipeline Source: https://github.com/apache/beam/blob/master/examples/notebooks/interactive-overview/getting-started.ipynb Illustrates how to set up and run a simple pipeline using the Apache Beam Go SDK. This example creates a pipeline and applies a basic transformation. ```Go package main import ( "context" "fmt" "github.com/apache/beam/sdks/v2/go/pkg/beam" "github.com/apache/beam/sdks/v2/go/pkg/beam/io/textio" "github.com/apache/beam/sdks/v2/go/pkg/beam/runners/direct" ) func init() { beam.RegisterFunction(formatFn) } func formatFn(element string) string { return "Formatted: " + element } func main() { ctx := context.Background() p, err := beam.NewPipeline(beam.NewPipelineOptions(nil)) if err != nil { panic(fmt.Sprintf("Failed to create pipeline: %v", err)) } elements := []string{"one", "two", "three"} PCollection := beam.CreateList(p, elements) formatted := beam.ParDo(p, formatFn, PCollection) textio.Write(p, "output.txt", formatted) err = direct.Run(ctx, p) if err != nil { panic(fmt.Sprintf("Failed to run pipeline: %v", err)) } } ``` -------------------------------- ### Apache Beam Go SDK Examples Source: https://github.com/apache/beam/blob/master/examples/notebooks/interactive-overview/getting-started.ipynb Provides examples of data processing workflows using the Apache Beam Go SDK, including pipeline construction and element transformation. ```Go package main import ( "context" "fmt" "github.com/apache/beam/sdks/v2/go/pkg/beam" "github.com/apache/beam/sdks/v2/go/pkg/beam/io/textio" "github.com/apache/beam/sdks/v2/go/pkg/beam/transforms/elements" ) func main() { ctx := context.Background() beam.Init() // Define pipeline options opts := beam.NewPipelineOptions(ctx) // Create a pipeline p, s := beam.NewPipelineWithRoot() // Example: Read lines from a file, transform, and print lines := textio.Read(s, "input.txt") upperLines := elements.Map(s, elements.StringToUpper) textio.Write(s, "output.txt", upperLines) // Run the pipeline err := beam.Run(ctx, opts, p) if err != nil { fmt.Printf("Failed to run pipeline: %v\n", err) } } ``` -------------------------------- ### Deploy examples setup Source: https://github.com/apache/beam/blob/master/playground/README.md Steps to set up the environment for deploying examples, including navigating to the infrastructure directory, creating and activating a Python virtual environment, and installing dependencies. ```shell cd playground/infrastucture ``` ```shell python3 -m venv venv ``` ```shell source venv/bin/activate ``` ```shell pip install -r requirements.txt ``` -------------------------------- ### Apache Beam Go SDK: Simple Pipeline Source: https://github.com/apache/beam/blob/master/examples/notebooks/interactive-overview/getting-started.ipynb An introductory example of creating and running a basic pipeline using the Apache Beam Go SDK. It demonstrates the core concepts of defining a pipeline and applying simple transformations. ```Go package main import ( "context" "fmt" "github.com/apache/beam/sdks/v2/go/pkg/beam" "github.com/apache/beam/sdks/v2/go/pkg/beam/runners/direct" ) func init() { beam.RegisterFunction(doubleInt) } func doubleInt(x int) int { return x * 2 } func main() { ctx := context.Background() p, err := beam.NewPipeline(beam.NewPipelineOptions(nil)) if err != nil { panic(fmt.Sprintf("Failed to create pipeline: %v", err)) } nums := beam.CreateList([]int{1, 2, 3, 4, 5}) doubled := beam.ParDo(ctx, doubleInt, nums) beam.Output(ctx, doubled) err = direct.Run(ctx, p) if err != nil { panic(fmt.Sprintf("Failed to run pipeline: %v", err)) } } ``` -------------------------------- ### Apache Beam SDKs: Go Examples Source: https://github.com/apache/beam/blob/master/examples/notebooks/interactive-overview/getting-started.ipynb Illustrates how to build Apache Beam pipelines using the Go SDK. This includes setting up a pipeline and applying basic transformations. ```Go package main import ( "context" "fmt" "github.com/apache/beam/sdks/v2/go/pkg/beam" "github.com/apache/beam/sdks/v2/go/pkg/beam/io/textio" "github.com/apache/beam/sdks/v2/go/pkg/beam/runners/direct" ) func main() { ctx := context.Background() beam.Init() p := direct.NewRunner() input := []string{"hello world", "apache beam go"} col := beam.CreateList(input) processed := beam.Map(col, func(s string) string { return s + " processed" }) beam.Output(processed, "output.txt") p.Run(ctx, beam.Pipeline{Root: processed}) fmt.Println("Pipeline finished.") } ``` -------------------------------- ### Apache Beam Python SDK Example Source: https://github.com/apache/beam/blob/master/examples/notebooks/interactive-overview/getting-started.ipynb Example demonstrating the usage of the Apache Beam Python SDK for data processing. ```python import apache_beam as beam with beam.Pipeline() as pipeline: (pipeline | 'CreateData' >> beam.Create(['Hello', 'World']) | 'ProcessData' >> beam.Map(lambda x: f'Processed: {x}') | 'PrintOutput' >> beam.Map(print)) ``` -------------------------------- ### Apache Beam SDK - Go Examples Source: https://github.com/apache/beam/blob/master/examples/notebooks/interactive-overview/getting-started.ipynb Code snippets demonstrating Apache Beam pipeline construction and execution using the Go SDK. ```go package main import ( "context" "fmt" "github.com/apache/beam/sdks/v2/go/pkg/beam" "github.com/apache/beam/sdks/v2/go/pkg/beam/io/textio" "github.com/apache/beam/sdks/v2/go/pkg/beam/transforms/elements" ) func main() { ctx := context.Background() beam.Init() p := beam.NewPipeline() options := beam.NewPipelineOptions() input := textio.Read(p, "input.txt") processed := elements.Map(p, input, func(line string) string { return "Processed: " + line }) textio.Write(p, "output.txt", processed) err := beam.Run(ctx, options, p) if err != nil { fmt.Printf("Failed to run pipeline: %v\n", err) } } ``` -------------------------------- ### Apache Beam Java SDK Example Source: https://github.com/apache/beam/blob/master/examples/notebooks/interactive-overview/getting-started.ipynb Example demonstrating the usage of the Apache Beam Java SDK for data processing. ```java package com.example.beam; import org.apache.beam.sdk.Pipeline; import org.apache.beam.sdk.options.PipelineOptions; import org.apache.beam.sdk.options.PipelineOptionsFactory; import org.apache.beam.sdk.transforms.Create; import org.apache.beam.sdk.transforms.DoFn; import org.apache.beam.sdk.transforms.ParDo; public class SimplePipeline { public static void main(String[] args) { PipelineOptions options = PipelineOptionsFactory.create(); Pipeline pipeline = Pipeline.create(options); pipeline.apply(Create.of("Hello", "World")) .apply(ParDo.of(new DoFn() { @ProcessElement public void processElement(@Element String element, OutputReceiver out) { out.output("Processed: " + element); } })) .forEach(System.out::println); } } ``` -------------------------------- ### Apache Beam Go SDK Examples Source: https://github.com/apache/beam/blob/master/examples/notebooks/interactive-overview/getting-started.ipynb Illustrative code snippets for the Apache Beam Go SDK, demonstrating pipeline construction and execution. ```go package main import ( "context" "fmt" "github.com/apache/beam/sdks/v2/go/pkg/beam" "github.com/apache/beam/sdks/v2/go/pkg/beam/runners/direct" ) func main() { ctx := context.Background() beam.Init() // Initializes Beam. // Create a pipeline. p := beam.NewPipeline() // Define a transform. numbers := beam.CreateList1(p, 1, 2, 3, 4, 5) squared := beam.Map(p, func(x int) int { return x * x }, numbers) // Process the data. beam.Println(ctx, squared) // Run the pipeline. err := direct.Run(ctx, p) if err != nil { panic(fmt.Sprintf("Failed to run pipeline: %v", err)) } } ``` -------------------------------- ### Apache Beam Go SDK Examples Source: https://github.com/apache/beam/blob/master/examples/notebooks/interactive-overview/getting-started.ipynb Illustrative code snippets for the Apache Beam Go SDK, demonstrating pipeline construction and execution. ```go package main import ( "context" "fmt" "github.com/apache/beam/sdks/v2/go/pkg/beam" "github.com/apache/beam/sdks/v2/go/pkg/beam/runners/direct" ) func init() { beam.RegisterFunction(square) } func square(x int) int { return x * x } func main() { ctx := context.Background() pipeline, _ := beam.NewPipeline() elements := beam.CreateList([]int{1, 2, 3, 4, 5}) mapped := beam.Map(pipeline.Root(), elements, square) beam.Println(ctx, mapped) direct.Run(ctx, pipeline) } ``` -------------------------------- ### Install Apache Beam Source: https://github.com/apache/beam/blob/master/examples/notebooks/interactive-overview/getting-started.ipynb Installs the apache-beam package quietly using pip. This is the first step to using Apache Beam. ```python # Install apache-beam with pip. !pip install --quiet apache-beam ``` -------------------------------- ### Python SDK: WordCount Example Source: https://github.com/apache/beam/blob/master/examples/notebooks/interactive-overview/getting-started.ipynb A WordCount example implemented using the Apache Beam Python SDK. It mirrors the functionality of the Java example, demonstrating text processing and counting. ```Python import apache_beam as beam from apache_beam.options.pipeline_options import PipelineOptions with beam.Pipeline(options=PipelineOptions()) as pipeline: lines = pipeline | 'ReadLines' >> beam.io.ReadFromText('input.txt') words = lines | 'SplitWords' >> beam.FlatMap(lambda line: line.split()) word_counts = words | 'CountWords' >> beam.combiners.Count.PerElement() output = word_counts | 'FormatCounts' >> beam.Map(lambda word_count: f'{word_count[0]}: {word_count[1]}') output | 'WriteCounts' >> beam.io.WriteToText('output.txt') ``` -------------------------------- ### Go Apache Beam Pipeline Example Source: https://github.com/apache/beam/blob/master/examples/notebooks/interactive-overview/getting-started.ipynb An example of an Apache Beam pipeline written in Go, showcasing basic data processing. ```go package main import ( "context" "fmt" "github.com/apache/beam/sdks/v2/go/pkg/beam" "github.com/apache/beam/sdks/v2/go/pkg/beam/runners/direct" ) func main() { ctx := context.Background() beam.InitBackend(direct.NewRunner()) p := beam.NewPipeline() words := beam.CreateList(p, "hello", "world", "apache", "beam") uppercasedWords := beam.Map(p, words, func(s string) string { return strings.ToUpper(s) }) beam.ParDo0(p, &printFn{}, uppercasedWords) err := beam.Run(ctx, p) if err != nil { fmt.Printf("Error running pipeline: %v\n", err) } } type printFn struct {} func (f *printFn) Process(element string, output beam.Output) { fmt.Println(element) } ``` -------------------------------- ### Apache Beam Go SDK Examples Source: https://github.com/apache/beam/blob/master/examples/notebooks/interactive-overview/getting-started.ipynb Go code snippets showcasing Apache Beam pipeline development. ```Go package main import ( "context" "fmt" "github.com/apache/beam/sdks/v2/go/pkg/beam" "github.com/apache/beam/sdks/v2/go/pkg/beam/io/textio" "github.com/apache/beam/sdks/v2/go/pkg/beam/runners/direct" ) func main() { ctx := context.Background() beam.Init() p := beam.NewPipeline() col := beam.CreateList(p, []int{1, 2, 3, 4, 5}) mapped := beam.Map(p, col, func(x int) int { return x * x }) textio.Write(p, "output.txt", mapped) direct.Run(p) } ``` -------------------------------- ### Go: Creating a Simple Beam Pipeline Source: https://github.com/apache/beam/blob/master/examples/notebooks/interactive-overview/getting-started.ipynb Demonstrates how to create a basic Apache Beam pipeline in Go. This includes setting up the pipeline options and running a simple transformation. ```go package main import ( "context" "fmt" "github.com/apache/beam/sdks/v2/go/pkg/beam" "github.com/apache/beam/sdks/v2/go/pkg/beam/io/textio" "github.com/apache/beam/sdks/v2/go/pkg/beam/transforms/elements" ) func main() { ctx := context.Background() beam.Init() // Initialize Beam SDK // Define pipeline options (can be customized) options := beam.NewPipelineOptions(ctx, nil) // Create a new pipeline p, s := beam.NewPipelineWithRoot() // p is the pipeline, s is the scope // Create a PCollection from a slice of strings words := []string{"hello", "world", "apache", "beam"} input := beam.CreateList(s, words) // Apply a Map transform to convert words to uppercase uppercasedWords := elements.Map(s, input, func(word string) string { return strings.ToUpper(word) }) // Apply a Map transform to print each word elements.Map(s, uppercasedWords, func(word string) { fmt.Println(word) }) // Run the pipeline err := beam.Run(ctx, options, p) if err != nil { fmt.Printf("Error running pipeline: %v\n", err) } } ``` -------------------------------- ### Apache Beam Go SDK Examples Source: https://github.com/apache/beam/blob/master/examples/notebooks/interactive-overview/getting-started.ipynb Provides examples of building data processing pipelines using the Apache Beam Go SDK. Showcases pipeline creation, element processing, and output. ```Go package main import ( "context" "fmt" "github.com/apache/beam/sdks/v2/go/pkg/beam" "github.com/apache/beam/sdks/v2/go/pkg/beam/io/textio" "github.com/apache/beam/sdks/v2/go/pkg/beam/runners/direct" ) func init() { beam.RegisterFunction(square) } func square(x int) int { return x * x } func main() { ctx := context.Background() pipeline, _ := beam.NewPipeline() numbers := beam.CreateList1(ctx, pipeline, []int{1, 2, 3, 4, 5}) processed := beam.Map(ctx, numbers, square) textio.Write(ctx, pipeline, "output.txt", processed) direct.Run(ctx, pipeline) } ``` -------------------------------- ### Python SDK Examples Source: https://github.com/apache/beam/blob/master/examples/notebooks/interactive-overview/getting-started.ipynb Code examples demonstrating common data processing tasks using the Apache Beam Python SDK. ```Python import apache_beam as beam with beam.Pipeline() as pipeline: words = pipeline | 'CreateWords' >> beam.Create(['hello', 'world', 'apache', 'beam']) processed_words = words | 'ToUpper' >> beam.Map(str.upper) final_output = processed_words | 'AddPrefix' >> beam.Map(lambda word: f'Processed: {word}') final_output | 'Print' >> beam.Map(print) ``` -------------------------------- ### Apache Beam Go SDK Example Source: https://github.com/apache/beam/blob/master/examples/notebooks/interactive-overview/getting-started.ipynb Illustrates a basic pipeline creation and execution using the Apache Beam Go SDK. This example demonstrates reading data and applying a simple transformation. ```Go package main import ( "context" "fmt" "github.com/apache/beam/sdks/v2/go/pkg/beam" "github.com/apache/beam/sdks/v2/go/pkg/beam/io/textio" "github.com/apache/beam/sdks/v2/go/pkg/beam/runners/direct" ) func main() { ctx := context.Background() // Create a pipeline. p, err := beam.NewPipeline(direct.NewRunner()) if err != nil { panic(fmt.Sprintf("Failed to create pipeline: %v", err)) } // Define the pipeline. words := []string{"hello", "world", "apache", "beam"} wordCollection := beam.CreateList(words) processedCollection := beam.Map(ctx, func(word string) string { return strings.ToUpper(word) }, wordCollection) textio.Write(ctx, p, "output.txt", processedCollection) // Run the pipeline. if err := p.Run(ctx); err != nil { panic(fmt.Sprintf("Failed to run pipeline: %v", err)) } } ``` -------------------------------- ### Java SDK Examples Source: https://github.com/apache/beam/blob/master/examples/notebooks/interactive-overview/getting-started.ipynb Code examples demonstrating common data processing tasks using the Apache Beam Java SDK. ```Java import org.apache.beam.sdk.Pipeline; import org.apache.beam.sdk.options.PipelineOptions; import org.apache.beam.sdk.options.PipelineOptionsFactory; import org.apache.beam.sdk.transforms.Create; import org.apache.beam.sdk.transforms.MapElements; import org.apache.beam.sdk.values.PCollection; import org.apache.beam.sdk.values.TypeDescriptors; public class SimplePipeline { public static void main(String[] args) { PipelineOptions options = PipelineOptionsFactory.create(); Pipeline pipeline = Pipeline.create(options); PCollection words = pipeline.apply( Create.of("hello", "world", "apache", "beam")); words.apply( MapElements.into(TypeDescriptors.strings()).via((String word) -> word.toUpperCase())) .apply( MapElements.into(TypeDescriptors.strings()).via((String word) -> "Processed: " + word)); pipeline.run().waitUntilFinish(); } } ``` -------------------------------- ### Install Node.js and npm Source: https://github.com/apache/beam/blob/master/playground/README.md Installs Node.js and npm, the package manager for JavaScript, on Ubuntu/Debian systems. Includes a link to the official npm installation guide for other Linux distributions. ```shell sudo apt install npm # Other Linux variants: Follow manual at https://docs.npmjs.com/downloading-and-installing-node-js-and-npm ``` -------------------------------- ### Apache Beam Go SDK Example Source: https://github.com/apache/beam/blob/master/examples/notebooks/interactive-overview/getting-started.ipynb Demonstrates a simple pipeline using the Apache Beam Go SDK, showcasing data creation and transformation. ```go package main import ( "context" "fmt" "github.com/apache/beam/sdks/v2/go/pkg/beam" "github.com/apache/beam/sdks/v2/go/pkg/beam/runners/direct" ) func main() { ctx := context.Background() beam.Init() p := beam.NewPipeline() options := beam.PipelineOptions{} input := []string{"hello", "world", "apache", "beam"} col := beam.CreateList(p, input) transformed := beam.Map(p, col, func(s string) string { return strings.ToUpper(s) }) beam.Nil.Output(p, transformed) err := direct.Run(ctx, p, options) if err != nil { fmt.Printf("Error running pipeline: %v\n", err) } } ``` -------------------------------- ### Apache Beam Java SDK Examples Source: https://github.com/apache/beam/blob/master/examples/notebooks/interactive-overview/getting-started.ipynb Code examples showcasing the Apache Beam Java SDK for building data processing pipelines. ```java import org.apache.beam.sdk.Pipeline; import org.apache.beam.sdk.options.PipelineOptions; import org.apache.beam.sdk.transforms.Create; import org.apache.beam.sdk.transforms.MapElements; import org.apache.beam.sdk.values.PCollection; import org.apache.beam.sdk.values.TypeDescriptors; public class SimplePipeline { public static void main(String[] args) { Pipeline pipeline = Pipeline.create(); PCollection numbers = pipeline.apply(Create.of(1, 2, 3, 4, 5)); PCollection squaredNumbers = numbers.apply( MapElements.into(TypeDescriptors.integers()).via((Integer x) -> x * x)); squaredNumbers.apply(MapElements.into(TypeDescriptors.strings()).via(Object::toString)).apply(beam.transforms.Print.toOut()); pipeline.run().waitUntilFinish(); } } ``` -------------------------------- ### Apache Beam Go SDK Examples Source: https://github.com/apache/beam/blob/master/examples/notebooks/interactive-overview/getting-started.ipynb Demonstrates the usage of the Apache Beam Go SDK for creating and running data processing pipelines. Covers pipeline creation and basic transformations. ```go package main import ( "context" "fmt" "github.com/apache/beam/sdks/v2/go/pkg/beam" "github.com/apache/beam/sdks/v2/go/pkg/beam/runners/direct" ) func main() { ctx := context.Background() // Create a pipeline. p := beam.NewPipeline() // Define a transform. input := []string{"hello", "world", "apache", "beam"} words := beam.CreateList(p, input) lengths := beam.Map(p, func(s string) int { return len(s) }, words) // Process the data. beam.Map(p, func(l int) { fmt.Printf("Length: %d\n", l) }, lengths) // Run the pipeline. err := direct.Run(ctx, p) if err != nil { panic(err) } } ``` -------------------------------- ### Apache Beam Java SDK Examples Source: https://github.com/apache/beam/blob/master/examples/notebooks/interactive-overview/getting-started.ipynb Code examples showcasing the Apache Beam Java SDK for building data processing pipelines. ```java import org.apache.beam.sdk.Pipeline; import org.apache.beam.sdk.options.PipelineOptions; import org.apache.beam.sdk.options.PipelineOptionsFactory; import org.apache.beam.sdk.transforms.MapElements; import org.apache.beam.sdk.values.PCollection; import org.apache.beam.sdk.values.TypeDescriptors; public class SimplePipeline { public static void main(String[] args) { PipelineOptions options = PipelineOptionsFactory.create(); Pipeline pipeline = Pipeline.create(options); PCollection numbers = pipeline.apply(Create.of(1, 2, 3, 4, 5)); PCollection squaredNumbers = numbers.apply( "Square", MapElements.into(TypeDescriptors.integers()).via(x -> x * x)); squaredNumbers.apply("Print", MapElements.into(TypeDescriptors.voids()).via(System.out::println)); pipeline.run().waitUntilFinish(); } } ``` -------------------------------- ### Apache Beam Go SDK Examples Source: https://github.com/apache/beam/blob/master/examples/notebooks/interactive-overview/getting-started.ipynb Illustrative code snippets for the Apache Beam Go SDK, demonstrating pipeline creation and data manipulation. ```go package main import ( "context" "fmt" "github.com/apache/beam/sdks/v2/go/pkg/beam" "github.com/apache/beam/sdks/v2/go/pkg/beam/runners/direct" ) func main() { beam.Init() ctx := context.Background() p := beam.NewPipeline() elements := beam.CreateList(ctx, p, "hello", "world") beam.ParDo(ctx, &printElement{}, elements) err := direct.Run(ctx, p) if err != nil { fmt.Printf("Error running pipeline: %v\n", err) } } type printElement struct {} func (p *printElement) ProcessElement(element string) { fmt.Println(element) } ``` -------------------------------- ### Go SDK: Simple Pipeline Source: https://github.com/apache/beam/blob/master/examples/notebooks/interactive-overview/getting-started.ipynb A basic example of creating and running a Beam pipeline using the Go SDK. ```Go package main import ( "context" "fmt" "github.com/apache/beam/sdks/v2/go/pkg/beam" "github.com/apache/beam/sdks/v2/go/pkg/beam/runners/direct" ) func init() { beam.RegisterFunction(formatFn) } func formatFn(element string) string { return "Formatted: " + element } func main() { ctx := context.Background() pipeline, err := beam.NewPipeline() if err != nil { panic(fmt.Sprintf("Failed to create pipeline: %v", err)) } elements := []string{"one", "two", "three"} col := beam.CreateList(pipeline, elements) formatted := beam.ParDo(pipeline, formatFn, col) beam.Impulse(pipeline) beam.DisplayData(pipeline, "Formatted Elements", formatted) err = direct.Run(ctx, pipeline) if err != nil { panic(fmt.Sprintf("Failed to run pipeline: %v", err)) } } ``` -------------------------------- ### Apache Beam SDK for Go - Basic Pipeline Source: https://github.com/apache/beam/blob/master/examples/notebooks/interactive-overview/getting-started.ipynb A simple example demonstrating how to create and run a basic Apache Beam pipeline using the Go SDK. It includes creating a pipeline and applying a simple transformation. ```Go package main import ( "context" "fmt" "github.com/apache/beam/sdks/v2/go/pkg/beam" "github.com/apache/beam/sdks/v2/go/pkg/beam/io/textio" "github.com/apache/beam/sdks/v2/go/pkg/beam/runners/direct" ) func main() { ctx := context.Background() pipeline, _ := beam.NewPipeline() // Define a simple transformation transform := func(s string) string { return "Hello, " + s } // Create a PCollection from a slice of strings input := []string{"World", "Beam"} col := beam.CreateList(ctx, pipeline, input) // Apply the transformation processed := beam.Map(ctx, pipeline, col, transform) // Output the results textio.Write(ctx, pipeline, "output.txt", processed) // Run the pipeline direct.Run(ctx, pipeline) fmt.Println("Pipeline finished.") } ``` -------------------------------- ### Install Flutter and Dart Protobuf Dependencies Source: https://github.com/apache/beam/blob/master/playground/README.md Installs Flutter and its protobuf dependencies, required for Dart-based gRPC services. Provides installation commands for Ubuntu and links to Linux installation guides, plus PATH instructions. ```shell sudo apt install flutter # Other Linux variants: Follow manual at https://flutter.dev/docs/get-started/install/linux # Dart protobuf dependencies: https://grpc.io/docs/languages/dart/quickstart/ # Do not forget to update your PATH environment variable ``` -------------------------------- ### Apache Beam Go SDK Examples Source: https://github.com/apache/beam/blob/master/examples/notebooks/interactive-overview/getting-started.ipynb Illustrates how to use the Apache Beam Go SDK for building data processing pipelines. This includes examples of creating pipelines, applying transformations, and handling data. ```Go package main import ( "context" "fmt" "github.com/apache/beam/sdks/v2/go/pkg/beam" "github.com/apache/beam/sdks/v2/go/pkg/beam/io/textio" "github.com/apache/beam/sdks/v2/go/pkg/beam/runners/direct" ) func init() { beam.RegisterType(MyInts{}) } type MyInts []int func main() { p := beam.NewPipeline() options := direct.NewOptions() options.Output = "output" lines := textio.Read(p, "input.txt") // Example: Filter lines that contain "error" filteredLines := beam.Filter(p, lines, func(line string) bool { return strings.Contains(line, "error") }) // Example: Count the number of filtered lines count := beam.Count(p, filteredLines) textio.Write(p, "output.txt", count) err := beam.Run(p, options) if err != nil { fmt.Printf("Failed to run pipeline: %v\n", err) } } ``` -------------------------------- ### Apache Beam JavaScript SDK Examples Source: https://github.com/apache/beam/blob/master/examples/notebooks/interactive-overview/getting-started.ipynb Code examples for the Apache Beam JavaScript SDK, showing how to define and run data processing pipelines. ```javascript const beam = require('@apache-beam/wasm'); async function run() { const pipeline = await beam.Pipeline.create(); const numbers = pipeline.create([1, 2, 3, 4, 5]); const squaredNumbers = numbers.map(x => x * x); squaredNumbers.forEach(console.log); await pipeline.run(); } run(); ``` -------------------------------- ### Apache Beam JavaScript SDK Examples Source: https://github.com/apache/beam/blob/master/examples/notebooks/interactive-overview/getting-started.ipynb Code examples for the Apache Beam JavaScript SDK, showing how to define and execute data processing pipelines. ```javascript const beam = require('@apache-beam/wasm'); async function runPipeline() { const pipeline = await beam.Pipeline.create(); const numbers = pipeline.create([1, 2, 3, 4, 5]); const squared = pipeline.map(numbers, x => x * x); await pipeline.print(squared); await pipeline.run(); } runPipeline().catch(err => { console.error('Pipeline failed:', err); }); ``` -------------------------------- ### Apache Beam Go SDK - Pipeline Example Source: https://github.com/apache/beam/blob/master/examples/notebooks/interactive-overview/getting-started.ipynb An example demonstrating how to build a simple pipeline using the Apache Beam Go SDK. It showcases the SDK's structure for defining data processing steps. ```Go package main import ( "context" "fmt" "github.com/apache/beam/sdks/v2/go/pkg/beam" "github.com/apache/beam/sdks/v2/go/pkg/beam/io/filesystem" "github.com/apache/beam/sdks/v2/go/pkg/beam/transforms/elements" ) func init() { beam.RegisterFunction(square) } func square(x int) int { return x * x } func main() { ctx := context.Background() p, err := beam.NewPipeline() if err != nil { panic(fmt.Sprintf("Failed to create pipeline: %v", err)) } numbers := []int{1, 2, 3, 4, 5} col := beam.CreateList(p, numbers) squared := beam.ParDo(p, square, col) beam.Output(p, squared) if err := beam.Run(ctx, p); err != nil { panic(fmt.Sprintf("Failed to run pipeline: %v", err)) } } ``` -------------------------------- ### Anomaly Detection Example Source: https://github.com/apache/beam/blob/master/website/www/site/content/en/documentation/ml/overview.md Build an anomaly detection pipeline using Apache Beam. This example demonstrates how to identify unusual patterns in data streams. ```Python # Conceptual example for anomaly detection with Beam # import apache_beam as beam # import numpy as np # class AnomalyDetector(beam.DoFn): # def __init__(self, threshold=3.0): # self.threshold = threshold # self.data_points = [] # def process(self, element): # self.data_points.append(element['value']) # if len(self.data_points) > 100: # mean = np.mean(self.data_points) # std = np.std(self.data_points) # z_score = (element['value'] - mean) / std # if abs(z_score) > self.threshold: # yield {'id': element['id'], 'anomaly': True, 'value': element['value']} # else: # yield {'id': element['id'], 'anomaly': False, 'value': element['value']} # with beam.Pipeline() as pipeline: # (pipeline # | 'CreateSensorData' >> beam.Create([{'id': i, 'value': np.random.randn() * 5 + 10} for i in range(200)]) # | 'DetectAnomalies' >> beam.ParDo(AnomalyDetector(threshold=2.0)) # | 'FilterAnomalies' >> beam.Filter(lambda x: x['anomaly']) # | 'PrintAnomalies' >> beam.Map(print) # ) ``` -------------------------------- ### Apache Beam Go SDK Examples Source: https://github.com/apache/beam/blob/master/examples/notebooks/interactive-overview/getting-started.ipynb Provides examples of using the Apache Beam Go SDK for building data processing pipelines. Demonstrates pipeline creation and basic transformations. ```Go package main import ( "context" "fmt" "github.com/apache/beam/sdks/v2/go/pkg/beam" "github.com/apache/beam/sdks/v2/go/pkg/beam/runners/direct" ) func main() { ctx := context.Background() beam.Init() // Initializes Beam. p := beam.NewPipeline() col := beam.Impulse(p) // Example transform: Print elements beam.ParDo(p, &printElement{}, beam.Impulse(p), ) err := direct.Run(ctx, p) if err != nil { fmt.Printf("Error running pipeline: %v\n", err) } } type printElement struct {} func (p *printElement) ProcessElement(element string) { fmt.Println(element) } ``` -------------------------------- ### Python Apache Beam Pipeline Example Source: https://github.com/apache/beam/blob/master/examples/notebooks/interactive-overview/getting-started.ipynb A Python example demonstrating how to create and run a simple Apache Beam pipeline. This includes data creation and transformation. ```python import apache_beam as beam with beam.Pipeline() as pipeline: (pipeline | 'CreateData' >> beam.Create(['hello', 'world', 'apache', 'beam']) | 'Uppercase' >> beam.Map(str.upper) | 'Print' >> beam.Map(print)) ``` -------------------------------- ### Apache Beam Go SDK Example Source: https://github.com/apache/beam/blob/master/examples/notebooks/interactive-overview/getting-started.ipynb Illustrates a simple data processing pipeline using the Apache Beam Go SDK, demonstrating pipeline construction and execution. ```Go package main import ( "context" "fmt" "github.com/apache/beam/sdks/v2/go/pkg/beam" "github.com/apache/beam/sdks/v2/go/pkg/beam/runners/direct" ) func main() { ctx := context.Background() beam.Init() // Initialize Beam SDK // Create a pipeline p := beam.NewPipeline() // Define a data source (e.g., a list of strings) words := []string{"hello", "world", "apache", "beam"} // Create a PCollection from the data pcol := beam.CreateList(p, words) // Apply a transformation (e.g., uppercase) uppercased := beam.Map(pcol, func(s string) string { return strings.ToUpper(s) }) // Output the results beam.Output(uppercased, func(s string) { fmt.Println(s) }) // Run the pipeline err := direct.Run(p) if err != nil { panic(err) } } ```