### Start Local Development Server

Source: https://github.com/simplexspatial/osm4scala/blob/master/website/README.md

Starts a local development server for the website. Changes are reflected live without server restart.

```console
yarn start
```

--------------------------------

### NPM Run Commands Overview

Source: https://github.com/simplexspatial/osm4scala/blob/master/website/README.md

Lists available lifecycle scripts for the project, including start, build, deploy, and more.

```shell
$ npm run (base)
Lifecycle scripts included in documentation:
  start
    docusaurus start

available via `npm run-script`:
  docusaurus
    docusaurus
  build
    docusaurus build
  swizzle
    docusaurus swizzle
  deploy
    docusaurus deploy
  clear
    docusaurus clear
  serve
    docusaurus serve
  write-translations
    docusaurus write-translations
  write-heading-ids
    docusaurus write-heading-ids

$ npm run clear
$ npm run start
```

--------------------------------

### Install Website Dependencies

Source: https://github.com/simplexspatial/osm4scala/blob/master/website/README.md

Installs the necessary dependencies for the website project using Yarn.

```console
yarn install
```

--------------------------------

### SQL Dependency for Spark 3.x and Scala 2.12

Source: https://github.com/simplexspatial/osm4scala/blob/master/website/docs/spark-connector.mdx

Use this command to start Spark SQL with the osm4scala connector dependency for Spark 3.x and Scala 2.12.

```shell
bin/spark-sql --packages 'com.acervera.osm4scala:osm4scala-spark3-shaded_2.12:1.0.11'
```

--------------------------------

### Run Docker Image for Jupyter Notebook

Source: https://github.com/simplexspatial/osm4scala/blob/master/website/docs/spark-connector.mdx

This command starts the jupyter/all-spark-notebook Docker image, exposing the ports needed for Spark and Jupyter. It is useful for setting up a Spark-enabled notebook environment.
```bash
docker run -e JUPYTER_ENABLE_LAB=yes -d -p 8888:8888 -p 4040:4040 -p 4041:4041 jupyter/all-spark-notebook
```

--------------------------------

### Compile Protobuf Source Code

Source: https://github.com/simplexspatial/osm4scala/blob/master/README.md

Execute this command to compile the protobuf source code, which is a special requirement for the project setup.

```shell
sbt compile
```

--------------------------------

### PySpark Dependency for Spark 3.x and Scala 2.12

Source: https://github.com/simplexspatial/osm4scala/blob/master/website/docs/spark-connector.mdx

Use this command to start PySpark with the osm4scala connector dependency for Spark 3.x and Scala 2.12.

```shell
bin/pyspark --packages 'com.acervera.osm4scala:osm4scala-spark3-shaded_2.12:1.0.11'
```

--------------------------------

### Spark Shell Dependency for Spark 3.x and Scala 2.12

Source: https://github.com/simplexspatial/osm4scala/blob/master/website/docs/spark-connector.mdx

Use this command to start the Spark Shell with the osm4scala connector dependency for Spark 3.x and Scala 2.12.

```shell
bin/spark-shell --packages 'com.acervera.osm4scala:osm4scala-spark3-shaded_2.12:1.0.11'
```

--------------------------------

### Count Node Primitives in a PBF File

Source: https://github.com/simplexspatial/osm4scala/blob/master/website/docs/standalone-scala-library.mdx

This example demonstrates how to count all node primitives in a PBF file using EntityIterator.fromPbf. It leverages Scala's functional programming features for concise data processing.

```scala
EntityIterator.fromPbf(inputStream).count(_.osmModel == OSMTypes.Node)
```

--------------------------------

### Create DataFrame from OSM PBF (Scala)

Source: https://github.com/simplexspatial/osm4scala/blob/master/website/docs/spark-connector.mdx

Read OSM PBF data into a Spark DataFrame using the 'osm.pbf' format. This example demonstrates counting primitives by type.
```scala
import com.acervera.osm4scala.spark.OsmSqlEntity
import org.apache.spark.sql.SparkSession

object PrimitivesCounter {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession
      .builder()
      .appName("Primitives counter")
      .getOrCreate()

    spark.read
      .format("osm.pbf")
      .load(args(0))
      .groupBy(OsmSqlEntity.FIELD_TYPE)
      .count
      .show
  }
}
```

--------------------------------

### Build Website for Deployment

Source: https://github.com/simplexspatial/osm4scala/blob/master/website/README.md

Generates static content for the website into the 'build' directory, ready for hosting.

```console
yarn build
```

--------------------------------

### Deploy Documentation Site

Source: https://github.com/simplexspatial/osm4scala/blob/master/README.md

Deploy the project documentation and site. Ensure Node.js is managed with nvm and set the GIT_USER and USE_SSH environment variables.

```bash
git checkout v1.*.*
cd website
nvm use
export GIT_USER=; export USE_SSH=true; npm run deploy
```

--------------------------------

### Initialize Spark with osm4scala Package

Source: https://github.com/simplexspatial/osm4scala/blob/master/website/static/notebooks/spylon_notebook_example.ipynb

Use this cell to initialize a Spark environment and include the osm4scala package. Ensure the package version matches your Spark and Scala versions.

```python
%%init_spark
launcher.packages = ["com.acervera.osm4scala:osm4scala-spark3-shaded_2.12:1.0.11"]
```

--------------------------------

### Deploy Website to GitHub Pages

Source: https://github.com/simplexspatial/osm4scala/blob/master/website/README.md

Builds and deploys the website to the 'gh-pages' branch, suitable for GitHub Pages hosting.

```console
GIT_USER= USE_SSH=true yarn deploy
```

--------------------------------

### Release Project

Source: https://github.com/simplexspatial/osm4scala/blob/master/README.md

Check out the master branch and execute the release command to start the project release process.
```shell
git checkout master
sbt release
```

--------------------------------

### Bundle and Release to Sonatype

Source: https://github.com/simplexspatial/osm4scala/blob/master/README.md

After publishing signed artifacts, execute this command to bundle and release them to Sonatype.

```shell
sbt sonatypeBundleRelease
```

--------------------------------

### Configure SBT for Dependency Shading

Source: https://github.com/simplexspatial/osm4scala/blob/master/website/docs/spark-connector.mdx

This Scala code configures SBT to shade the 'com.google.protobuf' package to 'shadeproto' to resolve dependency conflicts with older versions of the Google Protobuf library used by Spark and Hadoop.

```scala
assemblyShadeRules in assembly := Seq(
  ShadeRule
    .rename("com.google.protobuf.**" -> "shadeproto.@1")
    .inAll
)
```

--------------------------------

### Submit Spark Application (Scala)

Source: https://github.com/simplexspatial/osm4scala/blob/master/website/docs/spark-connector.mdx

Submit a Spark application that uses the osm4scala connector. The '--packages' argument is optional if the dependency is included in the deployable artifact.

```shell
bin/spark-submit \
  --packages 'com.acervera.osm4scala:osm4scala-spark3-shaded_2.12:1.0.11' \
  examples/spark-documentation/target/scala-2.12/osm4scala-examples-spark-documentation_2.12-1.0.11.jar \
  /tmp/osm/monaco-anonymized.osm.pbf
```

--------------------------------

### Create Table for OSM Data in Spark SQL

Source: https://github.com/simplexspatial/osm4scala/blob/master/website/docs/spark-connector.mdx

Create a table in Spark SQL to read OSM data from PBF files using the osm.pbf provider.

```sql
CREATE TABLE osm USING osm.pbf LOCATION '';
```

--------------------------------

### Run osm4scala-spark-utilities Counter Command

Source: https://github.com/simplexspatial/osm4scala/blob/master/examples/spark-utilities/README.md

Submit a Spark job to count OSM primitives (Nodes, Ways, Relations) from PBF files.
Specify input path, output path, coalesce factor, and output format.

```shell
bin/spark-submit \
  --packages 'com.github.scopt:scopt_2.12:3.7.1,com.acervera.osm4scala:osm4scala-spark-shaded_2.12:1.0.11' \
  --class com.acervera.osm4scala.examples.spark.Driver \
  "osm4scala-examples-spark-utilities_2.12-1.0.11.jar" \
  counter \
  -i \
  -o \
  -c 1 \
  -f csv
```

--------------------------------

### Count Primitives by Type using SQL (Scala)

Source: https://github.com/simplexspatial/osm4scala/blob/master/website/docs/spark-connector.mdx

Execute a SQL query on the 'osm' temporary view to count the number of primitives for each type. Assumes the 'osm' view has been created.

```scala
spark.sql("select type, count(*) as num_primitives from osm group by type").show()
```

--------------------------------

### Run osm4scala-spark-utilities Tag Keys Command

Source: https://github.com/simplexspatial/osm4scala/blob/master/examples/spark-utilities/README.md

Submit a Spark job to extract tag keys from OSM primitives (Nodes, Ways, Relations) in PBF files. Specify input path, output path, coalesce factor, and output format.

```shell
bin/spark-submit \
  --packages 'com.github.scopt:scopt_2.12:3.7.1,com.acervera.osm4scala:osm4scala-spark-shaded_2.12:1.0.11' \
  --class com.acervera.osm4scala.examples.spark.Driver \
  "osm4scala-examples-spark-utilities_2.12-1.0.11.jar" \
  tag_keys \
  -i \
  -o \
  -c 1 \
  -f csv
```

--------------------------------

### Run Tests for Scala Versions

Source: https://github.com/simplexspatial/osm4scala/blob/master/website/docs/contributing.mdx

Execute tests for different Scala versions. Set PATCH_211 to true or false to include or exclude Scala 2.11.

```shell
PATCH_211=false sbt clean +test
```

```shell
PATCH_211=true sbt clean +test
```

--------------------------------

### Submit Spark Application (PySpark)

Source: https://github.com/simplexspatial/osm4scala/blob/master/website/docs/spark-connector.mdx

Submit a PySpark application using the osm4scala connector.
The '--packages' flag can be omitted if the JAR is part of the deployment.

```shell
bin/spark-submit \
  --packages 'com.acervera.osm4scala:osm4scala-spark3-shaded_2.12:1.0.11' \
  examples/spark-documentation/src/main/scala/com/acervera/osm4scala/examples/spark/documentation/PrimiriveCounter.py \
  /tmp/osm/monaco-anonymized.osm.pbf
```

--------------------------------

### Create Table for OSM PBF with Split Disabled (SQL)

Source: https://github.com/simplexspatial/osm4scala/blob/master/website/docs/spark-connector.mdx

Use this SQL snippet to create a table for OSM PBF files with the 'split' option disabled, preventing Spark from splitting individual PBF files for parallelization.

```sql
spark-sql> CREATE TABLE osm USING osm.pbf OPTIONS ( 'split' = 'false' ) LOCATION '';
```

--------------------------------

### Load OSM PBF into SQL View (Scala)

Source: https://github.com/simplexspatial/osm4scala/blob/master/website/docs/spark-connector.mdx

Load OSM PBF files into a Spark DataFrame and create a temporary SQL view named 'osm'. This allows for SQL-based querying.

```scala
val osmDF = spark.sqlContext.read.format("osm.pbf").load("")
osmDF.createOrReplaceTempView("osm")
```

--------------------------------

### Create DataFrame from OSM PBF (PySpark)

Source: https://github.com/simplexspatial/osm4scala/blob/master/website/docs/spark-connector.mdx

Load OSM PBF files into a PySpark DataFrame and perform a count by type. Ensure the 'osm.pbf' format is available at runtime.

```python
from pyspark.sql import SparkSession
import sys

if __name__ == '__main__':
    spark = SparkSession.builder.appName("Primitives counter").getOrCreate()
    spark.read.format("osm.pbf")\
        .load(sys.argv[1])\
        .groupBy("type")\
        .count()\
        .show()
```

--------------------------------

### Count Primitives in Full Planet PBF

Source: https://github.com/simplexspatial/osm4scala/blob/master/website/docs/performance.mdx

Measures the time taken to count all primitives in the full planet PBF file (approx.
4 billion elements). Memory usage is negligible.

```text
Found [3,976,885,170] primitives in /media/angelcervera/My Passport/osm/planet-latest.osm.pbf in 2,566.11 sec.
```

--------------------------------

### Add Bintray Repository for osm4scala

Source: https://github.com/simplexspatial/osm4scala/blob/master/website/docs/standalone-scala-library.mdx

If you encounter issues resolving osm4scala dependencies, add this Bintray repository to your resolvers in sbt. This is typically only needed if direct resolution fails.

```scala
resolvers += "osm4scala repo" at "http://dl.bintray.com/angelcervera/maven"
```

--------------------------------

### Load OSM Data in Spark Shell (Scala)

Source: https://github.com/simplexspatial/osm4scala/blob/master/website/docs/spark-connector.mdx

Load OSM data from PBF files into a Spark DataFrame using the osm.pbf format in Scala.

```scala
val osmDF = spark.read.format("osm.pbf").load("")
```

--------------------------------

### Load and Query OSM PBF Data with Spark (Python)

Source: https://github.com/simplexspatial/osm4scala/blob/master/website/static/notebooks/spylon_notebook_example.ipynb

Load an OSM PBF file into a Spark DataFrame and filter for elements with a 'highway' tag equal to 'traffic_signals'. This is useful for extracting specific geographic features.

```python
%%python
osm_df = spark.read.format("osm.pbf").load("/home/jovyan/work/monaco-anonymized.osm.pbf")
osm_df.select("latitude", "longitude").where("element_at(tags, 'highway') == 'traffic_signals'").show()
```

--------------------------------

### Count Primitives in Spain PBF

Source: https://github.com/simplexspatial/osm4scala/blob/master/website/docs/performance.mdx

Measures the time taken to count all primitives in the Spain PBF file. Memory usage is negligible.

```text
Found [67,976,861] primitives in /home/angelcervera/projects/osm/spain-latest.osm.pbf in 32.44 sec.
```

--------------------------------

### Add osm4scala-core Dependency with Maven

Source: https://github.com/simplexspatial/osm4scala/blob/master/website/docs/standalone-scala-library.mdx

Integrate the osm4scala-core library into your Maven project by adding the provided dependency configuration to your pom.xml file.

```xml
<dependency>
    <groupId>com.acervera.osm4scala</groupId>
    <artifactId>osm4scala-core_${scala-version}</artifactId>
    <version>${version}</version>
</dependency>
```

--------------------------------

### Add osm4scala-core Dependency with sbt

Source: https://github.com/simplexspatial/osm4scala/blob/master/website/docs/standalone-scala-library.mdx

Include the osm4scala-core library in your Scala project using sbt by adding the specified dependency to your build.sbt file.

```scala
libraryDependencies += "com.acervera.osm4scala" %% "osm4scala-core" % ""
```

--------------------------------

### Publish Signed Artifacts

Source: https://github.com/simplexspatial/osm4scala/blob/master/README.md

Publish signed artifacts to Maven Central. Ensure GPG keys and Sonatype credentials are set up correctly. Use the PATCH_211 flag to manage Scala 2.11 compatibility.

```shell
git checkout v1.*.*
sbt clean
PATCH_211=false sbt +publishSigned
```

```shell
PATCH_211=true sbt +publishSigned
```

--------------------------------

### Testing with PATCH_211 Flag

Source: https://github.com/simplexspatial/osm4scala/blob/master/README.md

Run tests for different Scala versions using the PATCH_211 flag. Set to 'false' for default behavior or 'true' to enable Scala 2.11 compatibility.

```shell
PATCH_211=false sbt +test
```

```shell
PATCH_211=true sbt +test
```

--------------------------------

### Extract Traffic Lights as POIs in Spark Shell

Source: https://github.com/simplexspatial/osm4scala/blob/master/website/docs/spark-connector.mdx

Extract latitude, longitude, and tags for all traffic lights from the dataset using Spark SQL.
```Scala
scala> osmDF.select("latitude", "longitude", "tags").where("element_at(tags, 'highway') == 'traffic_signals'").show(10,false)
+------------------+-------------------+------------------------------------------------------------------------------+
|latitude          |longitude          |tags                                                                          |
+------------------+-------------------+------------------------------------------------------------------------------+
|54.59766649999997 |-5.8889806000000045|[highway -> traffic_signals]                                                  |
|54.58006689999997 |-5.938683200000003 |[highway -> traffic_signals, traffic_signals -> signal]                       |
|54.58260049999997 |-5.946187600000005 |[direction -> backward, highway -> traffic_signals, traffic_signals -> signal]|
|51.90097769999996 |-8.470285700000005 |[highway -> traffic_signals]                                                  |
|51.901616299999965|-8.470139700000004 |[highway -> traffic_signals]                                                  |
|51.89978239999997 |-8.465829200000002 |[highway -> traffic_signals]                                                  |
|51.89707529999997 |-8.474892800000001 |[highway -> traffic_signals]                                                  |
|51.89784849999997 |-8.466895200000002 |[highway -> traffic_signals]                                                  |
|51.89547809999997 |-8.476100900000002 |[highway -> traffic_signals]                                                  |
|51.89772569999997 |-8.477145100000003 |[highway -> traffic_signals]                                                  |
+------------------+-------------------+------------------------------------------------------------------------------+
only showing top 10 rows
```

```PySpark
>>> osmDF.select("latitude", "longitude", "tags").where("element_at(tags, 'highway') == 'traffic_signals'").show(10,False)
+------------------+-------------------+------------------------------------------------------------------------------+
|latitude          |longitude          |tags                                                                          |
+------------------+-------------------+------------------------------------------------------------------------------+
|54.59766649999997 |-5.8889806000000045|[highway -> traffic_signals]                                                  |
|54.58006689999997 |-5.938683200000003 |[highway -> traffic_signals, traffic_signals -> signal]                       |
|54.58260049999997 |-5.946187600000005 |[direction -> backward, highway -> traffic_signals, traffic_signals -> signal]|
|51.90097769999996 |-8.470285700000005 |[highway -> traffic_signals]                                                  |
|51.901616299999965|-8.470139700000004 |[highway -> traffic_signals]                                                  |
|51.89978239999997 |-8.465829200000002 |[highway -> traffic_signals]                                                  |
|51.89707529999997 |-8.474892800000001 |[highway -> traffic_signals]                                                  |
|51.89784849999997 |-8.466895200000002 |[highway -> traffic_signals]                                                  |
|51.89547809999997 |-8.476100900000002 |[highway -> traffic_signals]                                                  |
|51.89772569999997 |-8.477145100000003 |[highway -> traffic_signals]                                                  |
+------------------+-------------------+------------------------------------------------------------------------------+
only showing top 10 rows
```

```SQL
spark-sql> select latitude, longitude, tags from osm where type = 0 and element_at(tags, "highway") == 'traffic_signals' limit 10;
40.42125 -3.6844500000000004 {"crossing":"traffic_signals","crossing_ref":"zebra","highway":"traffic_signals"}
```

--------------------------------

### Read OSM PBF with Split Disabled (Scala)

Source: https://github.com/simplexspatial/osm4scala/blob/master/website/docs/spark-connector.mdx

Use this Scala snippet to read OSM PBF files with the 'split' option disabled, preventing Spark from splitting individual PBF files for parallelization.

```scala
scala> val osmDF = spark.read.format("osm.pbf").option("split", "false").load("")
```

--------------------------------

### Add osm4scala Spark Dependency (SBT)

Source: https://github.com/simplexspatial/osm4scala/blob/master/website/docs/spark-connector.mdx

Include the osm4scala Spark shaded dependency in your SBT project to use it in Spark applications. This is necessary for Scala and Python environments.
```sbt
libraryDependencies += "com.acervera.osm4scala" % "osm4scala-spark3-shaded_2.12" % "1.0.11"
```

--------------------------------

### Extract Unique Tags by Primitive Type from Spain PBF

Source: https://github.com/simplexspatial/osm4scala/blob/master/website/docs/performance.mdx

Measures the time taken to extract unique tags from specific primitive types (Way) in the Spain PBF file. The list of tags is stored in a text file.

```text
Found [2,451] different tags in primitives of type [Way] in /home/angelcervera/projects/osm/spain-latest.osm.pbf. List stored in /home/angelcervera/projects/osm/spain-latest.tags.txt. Time to process: 33.47 sec.
```

--------------------------------

### Load OSM Data in PySpark

Source: https://github.com/simplexspatial/osm4scala/blob/master/website/docs/spark-connector.mdx

Load OSM data from PBF files into a Spark DataFrame using the osm.pbf format in PySpark.

```python
osmDF = spark.read.format("osm.pbf").load("")
```

--------------------------------

### Load and Query OSM PBF Data with Spark (Scala)

Source: https://github.com/simplexspatial/osm4scala/blob/master/website/static/notebooks/spylon_notebook_example.ipynb

Load an OSM PBF file into a Spark DataFrame and filter for elements with a 'highway' tag equal to 'traffic_signals'. This is useful for extracting specific geographic features.

```scala
val osmDF = spark.read.format("osm.pbf").load("/home/jovyan/work/monaco-anonymized.osm.pbf")
osmDF.select("latitude", "longitude")
  .where("element_at(tags, 'highway') == 'traffic_signals'")
  .show
```

--------------------------------

### Extract Relation Data with Spark SQL

Source: https://github.com/simplexspatial/osm4scala/blob/master/website/docs/spark-connector.mdx

This snippet demonstrates how to query OSM data using Spark SQL, extracting the id, type, version, user information, and a formatted timestamp for every entry in the 'osm' view whose user ID is not null (the query does not filter by primitive type). It shows the first 5 rows of the result.
```scala
spark.sql("select id, type, info.version, info.userId, info.userName, date_format(info.timestamp, \"dd-MMM-y kk:mm:ss z\") as timestamp from osm where info.userId IS NOT NULL").show(5, false)
```

--------------------------------

### Extract Unique Tags from Spain PBF

Source: https://github.com/simplexspatial/osm4scala/blob/master/website/docs/performance.mdx

Measures the time taken to extract all unique tags from the Spain PBF file. The list of tags is stored in a text file.

```text
Found [4,166] different tags in /home/angelcervera/projects/osm/spain-latest.osm.pbf. List stored in /home/angelcervera/projects/osm/spain-latest.tags.txt. Time to process: 39.22 sec.
```

--------------------------------

### Extract Way Data with Spark SQL

Source: https://github.com/simplexspatial/osm4scala/blob/master/website/docs/spark-connector.mdx

Query all ways from the 'osm' table, selecting their ID, the list of nodes they contain, and their tags. This is useful for analyzing linear features like roads.

```scala
spark.sql("select id, nodes, tags from osm where type = 1").show()
```

--------------------------------

### Count Specific Primitive Types in Spain PBF

Source: https://github.com/simplexspatial/osm4scala/blob/master/website/docs/performance.mdx

Measures the time taken to count specific primitive types (Way, Node, Relation) in the Spain PBF file. Memory usage is negligible.

```text
Found [4,839,505] primitives of type [Way] in /home/angelcervera/projects/osm/spain-latest.osm.pbf in 31.72 sec.
Found [63,006,432] primitives of type [Node] in /home/angelcervera/projects/osm/spain-latest.osm.pbf in 32.70 sec.
Found [130,924] primitives of type [Relation] in /home/angelcervera/projects/osm/spain-latest.osm.pbf in 32.66 sec.
```

--------------------------------

### Count OSM Primitives in Spark Shell

Source: https://github.com/simplexspatial/osm4scala/blob/master/website/docs/spark-connector.mdx

Count the number of different OSM primitive types (nodes, ways, relations) in the dataset using Spark SQL.

```Scala
scala> osmDF.groupBy("type").count().show()
+----+--------+
|type|   count|
+----+--------+
|   1| 2096455|
|   2|   91971|
|   0|19426617|
+----+--------+
```

```PySpark
>>> osmDF.groupBy("type").count().show()
+----+--------+
|type|   count|
+----+--------+
|   1| 2096455|
|   2|   91971|
|   0|19426617|
+----+--------+
```

```SQL
spark-sql> select type, count(type) from osm group by type
1 338795
2 10357
0 2328075
```

--------------------------------

### Read OSM PBF with Split Disabled (PySpark)

Source: https://github.com/simplexspatial/osm4scala/blob/master/website/docs/spark-connector.mdx

Use this PySpark snippet to read OSM PBF files with the 'split' option disabled, preventing Spark from splitting individual PBF files for parallelization.

```python
>>> osmDF = spark.read.format("osm.pbf").option("split", "false").load("")
```

--------------------------------

### Extract All Unique Tag Keys using SQL (Scala)

Source: https://github.com/simplexspatial/osm4scala/blob/master/website/docs/spark-connector.mdx

Query the 'osm' SQL view to extract all distinct keys used in the 'tags' map column. This query uses the `map_keys` and `explode` functions.

```scala
spark.sql("select distinct(explode(map_keys(tags))) as tag_key from osm order by tag_key asc").show()
```

--------------------------------

### Extract Node Data with Spark SQL

Source: https://github.com/simplexspatial/osm4scala/blob/master/website/docs/spark-connector.mdx

Query all nodes from the 'osm' table, selecting their ID, coordinates, and tags. This is useful for analyzing point-based features.
```scala
spark.sql("select id, latitude, longitude, tags from osm where type = 0").show()
```

--------------------------------

### Add osm4scala Spark Dependency (Maven)

Source: https://github.com/simplexspatial/osm4scala/blob/master/website/docs/spark-connector.mdx

Include the osm4scala Spark shaded dependency in your Maven project for use in Spark applications. This applies to both Scala and Python.

```xml
<dependency>
    <groupId>com.acervera.osm4scala</groupId>
    <artifactId>osm4scala-spark3-shaded_2.12</artifactId>
    <version>1.0.11</version>
</dependency>
```

--------------------------------

### Extract Relation Data with Spark SQL

Source: https://github.com/simplexspatial/osm4scala/blob/master/website/docs/spark-connector.mdx

Query all relations from the 'osm' table, selecting their ID, the members they relate, and their tags. This is useful for analyzing complex geographic structures.

```scala
spark.sql("select id, relations, tags from osm where type = 2").show()
```

--------------------------------

### Parallel Counting with Scala Future.traverse

Source: https://github.com/simplexspatial/osm4scala/blob/master/website/docs/performance.mdx

Process data in parallel using Scala's Future.traverse. This method is suitable only for datasets that fit into memory: it creates a Future for each element, which can lead to high memory consumption.

```scala
val counter = new AtomicLong()
def count(pbfIS: InputStream): Long = {
  val result = Future.traverse(BlobTupleIterator.fromPbf(pbfIS))(tuple => Future {
    counter.addAndGet( count(tuple._2) )
  })
  Await.result(result, Duration.Inf)
  counter.longValue()
}
```
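
--------------------------------

### Estimate Throughput from Reported Timings

The performance snippets report a primitive count and an elapsed time on each line, so a rough throughput figure follows from dividing one by the other. The sketch below is not part of osm4scala: the `REPORTED` list, the `LINE` pattern, and the `throughput` helper are hypothetical names introduced here, and the pattern assumes the "Found [N] primitives ... in T sec." layout shown in those snippets.

```python
import re

# Timing lines copied verbatim from the performance snippets.
REPORTED = [
    "Found [3,976,885,170] primitives in /media/angelcervera/My Passport/osm/planet-latest.osm.pbf in 2,566.11 sec.",
    "Found [67,976,861] primitives in /home/angelcervera/projects/osm/spain-latest.osm.pbf in 32.44 sec.",
]

# Assumed layout: bracketed count first, elapsed seconds at the end of the line.
LINE = re.compile(r"Found \[([\d,]+)\] primitives.* in ([\d,.]+) sec\.$")


def throughput(line: str) -> float:
    """Return primitives per second for one reported timing line."""
    match = LINE.search(line)
    if match is None:
        raise ValueError(f"unrecognised timing line: {line!r}")
    # Strip thousands separators before converting to numbers.
    count = int(match.group(1).replace(",", ""))
    seconds = float(match.group(2).replace(",", ""))
    return count / seconds


for line in REPORTED:
    print(f"{throughput(line):,.0f} primitives/sec")
```

With the two lines above this works out to roughly 1.5 million primitives per second for the planet file and roughly 2.1 million for the Spain file. The two runs read from different disks, so the figures are indicative rather than directly comparable.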