cybedtools
https://github.com/ryanstraight/cybedtools
# cybedtools

cybedtools is an R package for reproducible cross-framework analysis of cybersecurity workforce and learning standards. It ingests eight major frameworks (NICE v2, DCWF v5.1, SFIA 9, ENISA ECSF v1, Cyber.org K-12, CSTA K-12 CS, ACM/IEEE CSEC2017, and JRC DigComp 2.2) and expresses them all in a shared `cybed:` semantic schema backed by JSON-LD and queryable via SPARQL. The package adds a comparison layer without replacing or re-authoring any upstream framework content.

The core pipeline is: ingest source files into tidy CSVs, verify against declared invariants, assemble per-framework JSON-LD documents under a two-tier namespace architecture, export to N-Triples for fast SPARQL execution, and query the combined graph using a single-basic-graph-pattern (single-BGP) discipline composed with R-side dplyr joins. A shipped `framework_summary` data object and a `make_demo_graph()` sandbox let users explore the package without staging source data. Classed error conditions (`cybedtools_file_not_found`, `cybedtools_framework_not_found`, `cybedtools_unknown_prefix`) enable programmatic error handling throughout.

---

## Installation

```r
# install.packages("remotes")
remotes::install_github("ryanstraight/cybedtools")
```

To run the full pipeline with real framework data, clone the repository and stage source files:

```bash
git clone https://github.com/ryanstraight/cybedtools
cd cybedtools
# Stage framework source files under data/raw/<slug>/ per docs/framework-data-sources.md
Rscript scripts/000-build.R   # ingestion + verification + assembly + export
```

---

## make_demo_graph: Build a small in-memory demo RDF graph

Returns a synthetic two-framework `rdf` object (≈30 triples) with the structural shape all domain helpers expect. Useful for verifying installation, trying helpers without staged data, and developing new queries.

```r
library(cybedtools)
library(dplyr)

rdf <- make_demo_graph()

# Verify installation: framework metadata
framework_metadata(rdf) |> arrange(jurisdiction)
#> # A tibble: 2 × 5
#>   framework                             name             jurisdiction sector   specificity
#>   <chr>                                 <chr>            <chr>        <chr>    <chr>
#> 1 https://w3id.org/cybed/ontology#fr... Demo Framework B EU           general  general-IT
#> 2 https://w3id.org/cybed/ontology#fr... Demo Framework A US           civilian cybersecurity-specific

# All role-to-framework bindings in the demo graph
role_framework_bindings(rdf)
#> # A tibble: 3 × 4
#>   role             role_name          framework               framework_name
#>   <chr>            <chr>              <chr>                   <chr>
#> 1 .../role/demo-a1 Security Architect .../framework/demo-fw-a Demo Framework A
#> 2 .../role/demo-a2 Incident Responder .../framework/demo-fw-a Demo Framework A
#> 3 .../role/demo-b1 IT Generalist      .../framework/demo-fw-b Demo Framework B
```

---

## load_combined_ntriples_graph: Load the combined N-Triples graph

Parses the pre-exported `_combined.nt` file (produced by `scripts/025-export-ntriples.R`) into an rdflib graph object. This is the recommended data source for all SPARQL work: faster than JSON-LD and handles large graphs correctly under librdf.

```r
library(cybedtools)
library(dplyr)

# Requires staged data: Rscript scripts/025-export-ntriples.R
rdf <- load_combined_ntriples_graph()

# Custom path
rdf <- load_combined_ntriples_graph(file_path = "/path/to/_combined.nt")

# Error handling: classed condition for missing file
tryCatch(
  load_combined_ntriples_graph(),
  cybedtools_file_not_found = function(e) {
    message("Graph not staged yet. Run scripts/025-export-ntriples.R first.")
  }
)
```

---

## load_combined_rdf_graph: Load the combined JSON-LD graph

Parses the `_combined.jsonld` file into an rdflib graph. Use for canonical semantic-web interop (e.g., publishing to a Fuseki endpoint); prefer `load_combined_ntriples_graph()` for interactive SPARQL work.

```r
rdf <- load_combined_rdf_graph()

# Custom path
rdf <- load_combined_rdf_graph(file_path = "data/processed/jsonld/_combined.jsonld")
```

---

## load_single_framework_graph: Load one framework's JSON-LD

Loads exactly one framework's JSON-LD for per-framework diagnostics or isolated queries. Raises `cybedtools_framework_not_found` when the slug is absent.

```r
# Load only the ECSF framework
rdf_ecsf <- load_single_framework_graph("ecsf")

# Available slugs: "nice", "sfia", "dcwf", "ecsf",
#                  "cyberorg-k12", "csta", "csec2017", "digcomp"
rdf_nice <- load_single_framework_graph("nice")

# Error handling
tryCatch(
  load_single_framework_graph("unknown-slug"),
  cybedtools_framework_not_found = function(e) {
    message("Slug not found. Check valid slugs.")
  }
)
```

---

## load_unified_rdf_graph: Load all frameworks into one graph iteratively

Iterates a list of framework slugs and parses each JSON-LD file into a shared rdf object. Missing slugs emit a warning and are skipped. Marked experimental because the default slug list will track future framework revisions.

```r
# Load all eight frameworks
rdf <- load_unified_rdf_graph()

# Subset: only the two EU frameworks
rdf_eu <- load_unified_rdf_graph(
  framework_slugs = c("ecsf", "digcomp"),
  jsonld_dir = here::here("data", "processed", "jsonld")
)
```

---

## framework_metadata: Tibble of framework metadata

Calls one single-BGP SPARQL query per property and left-joins the results in R. Returns one row per `cybed:Framework` node with name, jurisdiction, sector, and specificity.

```r
rdf <- load_combined_ntriples_graph()

framework_metadata(rdf) |> arrange(jurisdiction, name)
#> # A tibble: 8 × 5
#>   framework        name                          jurisdiction sector            specificity
#>   <chr>            <chr>                         <chr>        <chr>             <chr>
#> 1 .../digcomp-2.2  DigComp 2.2                   EU           citizen-education general-digital-competence
#> 2 .../ecsf-v1      ECSF v1                       EU           civilian          cybersecurity-specific
#> 3 .../csec2017-v1  CSEC2017 Curricular...        global       higher-education  cybersecurity-specific
#> 4 .../sfia-9       SFIA 9                        global       general           general-IT
#> 5 .../csta-2017    CSTA K-12 Computer Science... US           K-12-education    general-computing
#> 6 .../cyberorg-k12 Cyber.org K-12 Learning...    US           K-12-education    cybersecurity-specific
#> 7 .../dcwf-v51     DCWF v5.1                     US           defense           cybersecurity-specific
#> 8 .../nice-v2      NICE v2 (NIST SP 800-181...)  US           civilian          cybersecurity-specific
```

---

## role_framework_bindings: Role-to-framework bindings

One row per (role, framework) pair. Roles without a `cybed:partOf` triple pointing to a `cybed:Framework` node are excluded. Foundation for role-count queries and pairwise framework comparisons.

```r
rdf <- load_combined_ntriples_graph()

# Role counts per framework, descending
role_framework_bindings(rdf) |>
  count(framework_name, sort = TRUE, name = "role_count")
#> # A tibble: 8 × 2
#>   framework_name                          role_count
#>   <chr>                                        <int>
#> 1 SFIA 9                                         147
#> 2 Cyber.org K-12 Learning Standards v1.0         116
#> 3 DCWF v5.1                                       74
#> 4 NICE v2 (NIST SP 800-181 Rev 1...)              41
#> 5 CSTA K-12 Computer Science Standards...         25
#> 6 ECSF v1                                         12
#> 7 CSEC2017 Curricular Guidelines v1.0              8
#> 8 DigComp 2.2                                      5
```

---

## element_framework_bindings: Element-to-framework bindings

One row per (element, framework) pair via `cybed:partOf` triples on `cybed:RoleElement` nodes. Foundation for element-count queries and jurisdictional pivot analyses.

```r
rdf <- load_combined_ntriples_graph()

# Element counts per framework, sorted
element_framework_bindings(rdf) |>
  count(framework_name, sort = TRUE, name = "element_count")
#> # A tibble: 8 × 2
#>   framework_name                          element_count
#>   <chr>                                           <int>
#> 1 DCWF v5.1                                        2945
#> 2 NICE v2 (NIST SP 800-181 Rev 1...)               2111
#> 3 SFIA 9                                            672
#> 4 ECSF v1                                           374
#> 5 Cyber.org K-12 Learning Standards v1.0            123
#> 6 CSTA K-12 Computer Science Standards...           120
#> 7 CSEC2017 Curricular Guidelines v1.0                38
#> 8 DigComp 2.2                                        21

# Pivot element volume by jurisdiction (surfaces 13:1 US vs EU asymmetry)
element_framework_bindings(rdf) |>
  left_join(
    framework_metadata(rdf) |> transmute(framework, jurisdiction),
    by = "framework"
  ) |>
  count(jurisdiction, name = "element_count") |>
  arrange(desc(element_count))
#> # A tibble: 3 × 2
#>   jurisdiction element_count
#>   <chr>                <int>
#> 1 US                    5299
#> 2 global                 710
#> 3 EU                     395
```

---

## role_element_bindings: Role-to-element bindings

One row per (role, element) pair from `cybed:hasElement` triples. Pairs with `role_framework_bindings()` to produce element-load statistics per role.

```r
rdf <- load_combined_ntriples_graph()

reb <- role_element_bindings(rdf)
rfb <- role_framework_bindings(rdf)

# Top 5 roles by element count across all eight frameworks
reb |>
  count(role, name = "element_count") |>
  left_join(rfb |> select(role, role_name, framework_name), by = "role") |>
  arrange(desc(element_count)) |>
  slice_head(n = 5)
#> # A tibble: 5 × 4
#>   role  element_count role_name                   framework_name
#>   <chr>         <int> <chr>                       <chr>
#> 1 ...             307 Security Control Assessment NICE v2 (NIST SP 800-181...)
#> 2 ...             232 Secure Systems Development  NICE v2 (NIST SP 800-181...)
#> 3 ...             219 Cybersecurity Architecture  NICE v2 (NIST SP 800-181...)
#> 4 ...             206 Defensive Cybersecurity     NICE v2 (NIST SP 800-181...)
#> 5 ...             204 Systems Security Management NICE v2 (NIST SP 800-181...)
```

---

## sparql_pairs: Single-BGP SPARQL returning subject-object pairs

Issues `SELECT ?s ?o WHERE { ?s <predicate> ?o }`. The foundational primitive for all domain helpers. Returns a tibble with columns `s` and `o`.

```r
rdf <- make_demo_graph()

# All jurisdictions in the graph
sparql_pairs(rdf, "cybed:jurisdiction")
#> # A tibble: 2 × 2
#>   s                                             o
#>   <chr>                                         <chr>
#> 1 https://w3id.org/cybed/ontology#framework/... US
#> 2 https://w3id.org/cybed/ontology#framework/... EU

# All schema:name values
sparql_pairs(rdf, "schema:name")
```

---

## sparql_subjects: Single-BGP SPARQL returning subjects

Issues `SELECT ?s WHERE { ?s <predicate> <object> }`. Used to enumerate nodes of a given type (e.g., all `cybed:Framework` nodes).

```r
rdf <- make_demo_graph()

# All Framework URIs
sparql_subjects(rdf, "a", "cybed:Framework")
#> # A tibble: 2 × 1
#>   s
#>   <chr>
#> 1 https://w3id.org/cybed/ontology#framework/demo-fw-a
#> 2 https://w3id.org/cybed/ontology#framework/demo-fw-b

# All Role URIs
sparql_subjects(rdf, "a", "cybed:Role")
```

---

## build_jsonld_context: Build a single-framework JSON-LD `@context`

Returns a named list suitable for use as JSON-LD `@context`. Includes the four base vocabularies (schema, skos, rdfs, cybed) plus the specified framework prefix.

```r
ctx <- build_jsonld_context("nice")

names(ctx)
#> [1] "schema" "skos"   "rdfs"   "cybed"  "nice"

ctx$cybed
#> [1] "https://w3id.org/cybed/ontology#"

ctx$nice
#> [1] "https://nice.nist.gov/framework/terms#"

# Valid workforce prefixes:   "nice", "dcwf", "ecf", "sfia", "ecsf"
# Valid pedagogical prefixes: "cyberorg", "csta", "csec", "digcomp"
build_jsonld_context("ecsf")
```

---

## build_multi_framework_context: Build a multi-framework JSON-LD `@context`

Covers multiple framework prefixes in one `@context` block. Required when assembling combined graphs for cross-framework SPARQL. Raises `cybedtools_unknown_prefix` on invalid prefixes.

```r
ctx <- build_multi_framework_context(c("nice", "sfia", "ecsf"))

names(ctx)
#> [1] "schema" "skos"   "rdfs"   "cybed"  "nice"   "sfia"   "ecsf"

# Error handling
tryCatch(
  build_multi_framework_context(c("nice", "invalid-prefix")),
  cybedtools_unknown_prefix = function(e) {
    message("Unknown prefix supplied. Check valid_framework_prefixes.")
  }
)
```

---

## build_framework_node: Construct a `cybed:Framework` JSON-LD node

Every framework in the corpus has exactly one top-level Framework node. Downstream Role and RoleElement nodes reference it via `cybed:partOf`.

```r
fw <- build_framework_node(
  framework_id     = "nice-v2",
  framework_name   = "NICE v2 (NIST SP 800-181 Rev 1 components)",
  framework_prefix = "nice",
  version          = "2.0.0",
  publisher        = "NIST",
  jurisdiction     = "US",
  sector           = "civilian",
  specificity      = "cybersecurity-specific",
  license          = "https://www.nist.gov/open/copyright-fair-use-and-licensing-statements",
  date_published   = "2022-11-07"
)

fw[["@id"]]
#> [1] "cybed:framework/nice-v2"

fw[["@type"]]
#> [1] "nice:Framework"  "cybed:Framework"

fw[["cybed:jurisdiction"]]
#> [1] "US"
```

---

## build_role_node: Construct a `cybed:Role` JSON-LD node

A "role" generalizes across framework structural types: work roles (NICE, DCWF), role profiles (ECSF), competences (e-CF), skill-at-responsibility-levels (SFIA), grade-band clusters (Cyber.org, CSTA), knowledge areas (CSEC2017), and competence areas (DigComp).

```r
role <- build_role_node(
  role_id             = "OG-WRL-015",
  role_name           = "Cybersecurity Architecture",
  framework_prefix    = "nice",
  framework_role_type = "WorkRole",
  description         = "Designs and develops enterprise security architecture.",
  element_ids         = c("T0001", "T0002", "K0001", "K0002", "S0001"),
  framework_id        = "nice-v2"
)

role[["@id"]]
#> [1] "nice:OG-WRL-015"

role[["@type"]]
#> [1] "nice:WorkRole" "cybed:Role"

role[["cybed:partOf"]]
#> $`@id`
#> [1] "cybed:framework/nice-v2"

length(role[["cybed:hasElement"]])
#> [1] 5
```

---

## build_role_element_node: Construct a `cybed:RoleElement` JSON-LD node

One atomic statement attached to a role: a task, knowledge statement, skill statement, learning standard, competence description, etc. Framework-specific element types subclass `cybed:RoleElement`.

```r
el <- build_role_element_node(
  element_id             = "T0001",
  framework_prefix       = "nice",
  framework_element_type = "TaskStatement",
  element_text           = "Acquire and manage the necessary resources, including leadership support, financial resources, and key security personnel, to support information technology (IT) security goals and objectives.",
  source_section         = "OG-WRL-015",
  framework_id           = "nice-v2"
)

el[["@id"]]
#> [1] "nice:T0001"

el[["@type"]]
#> [1] "nice:TaskStatement" "cybed:RoleElement"

el[["cybed:elementText"]]
#> [1] "Acquire and manage the necessary resources..."

el[["cybed:partOf"]]
#> $`@id`
#> [1] "cybed:framework/nice-v2"
```

---

## assemble_framework_document: Assemble a complete JSON-LD document

Wraps framework, role, and element nodes into a single top-level JSON-LD document with the appropriate `@context`. The output of this function is what `write_jsonld_document()` serializes to disk.

```r
fw <- build_framework_node(
  framework_id     = "ecsf-v1",
  framework_name   = "ECSF v1",
  framework_prefix = "ecsf",
  version          = "1.0",
  publisher        = "ENISA",
  jurisdiction     = "EU",
  sector           = "civilian",
  specificity      = "cybersecurity-specific"
)

role <- build_role_node(
  role_id             = "CISO",
  role_name           = "Chief Information Security Officer",
  framework_prefix    = "ecsf",
  framework_role_type = "RoleProfile",
  framework_id        = "ecsf-v1"
)

el <- build_role_element_node(
  element_id             = "CISO-M01",
  framework_prefix       = "ecsf",
  framework_element_type = "Competence",
  element_text           = "Define and maintain the cybersecurity strategy.",
  framework_id           = "ecsf-v1"
)

doc <- assemble_framework_document(fw, list(role), list(el), "ecsf")

names(doc)
#> [1] "@context" "@graph"

length(doc[["@graph"]])
#> [1] 3   # framework node + 1 role + 1 element
```

---

## validate_jsonld_node: Validate a JSON-LD node's minimum structure

Checks presence of `@id` and `@type` (and `@context` when `require_context = TRUE`). Does not perform full JSON-LD 1.1 compliance. Use before passing nodes to downstream pipeline steps.

```r
good_node <- list(`@id` = "nice:T0001", `@type` = "nice:TaskStatement")
validate_jsonld_node(good_node)
#> $valid
#> [1] TRUE
#> $missing_fields
#> character(0)

bad_node <- list(`@id` = "nice:T0001")   # missing @type
validate_jsonld_node(bad_node)
#> $valid
#> [1] FALSE
#> $missing_fields
#> [1] "@type"

# Require @context for top-level document validation
doc_node <- list(`@context` = list(), `@id` = "x", `@type` = "Y")
validate_jsonld_node(doc_node, require_context = TRUE)
#> $valid
#> [1] TRUE
```

---

## write_jsonld_document / read_jsonld_document: File I/O

`write_jsonld_document()` serializes a JSON-LD named list to disk with pretty-printing and `auto_unbox = TRUE`. Creates missing parent directories automatically. `read_jsonld_document()` reads it back with `simplifyVector = FALSE` to preserve list-of-objects structure. Both raise classed conditions on I/O failures.

```r
library(cybedtools)

# Assemble a minimal document
doc <- list(
  `@context` = build_jsonld_context("dcwf"),
  `@graph` = list(
    build_framework_node(
      framework_id     = "dcwf-v51",
      framework_name   = "DCWF v5.1",
      framework_prefix = "dcwf",
      version          = "5.1",
      publisher        = "DoD",
      jurisdiction     = "US",
      sector           = "defense",
      specificity      = "cybersecurity-specific"
    )
  )
)

tmp <- tempfile(fileext = ".jsonld")
write_jsonld_document(doc, tmp)
#> JSON-LD written: /tmp/RtmpXXX/fileXXX.jsonld

# Round-trip: read back and inspect
doc_back <- read_jsonld_document(tmp)
doc_back[["@graph"]][[1]][["@id"]]
#> [1] "cybed:framework/dcwf-v51"

# Error handling for missing file
tryCatch(
  read_jsonld_document("/nonexistent.jsonld"),
  cybedtools_file_not_found = function(e) {
    message("File not found: ", conditionMessage(e))
  }
)

unlink(tmp)
```

---

## framework_summary: Shipped data object

A tibble with 8 rows (one per framework) shipped with the package. Provides display names, type, jurisdiction, role counts, element counts, elements-per-role density, and license.
Available immediately without staged data, which makes it useful for quick exploratory plots.

```r
library(cybedtools)
library(dplyr)

framework_summary
#> # A tibble: 8 × 8
#>   framework_slug framework_name                     framework_type jurisdiction role_count element_count elements_per_role license
#>   <chr>          <chr>                              <chr>          <chr>             <int>         <int>             <dbl> <chr>
#> 1 nice-v2        NICE v2 (NIST SP 800-181 Rev 1...) workforce      US                   41          2111              51.5 public domain
#> 2 dcwf-v51       DCWF v5.1                          workforce      US                   74          2945              39.8 public domain
#> ...

# Filter to workforce frameworks only
subset(framework_summary, framework_type == "workforce")

# Density comparison without staging any data
framework_summary |>
  arrange(desc(elements_per_role)) |>
  select(framework_name, jurisdiction, elements_per_role)
```

---

## Cross-Framework Analysis Pattern

cybedtools is designed for reproducible empirical comparisons. The primary workflow loads the combined graph, calls domain helpers, and composes results in dplyr.

```r
library(cybedtools)
library(dplyr)

rdf <- load_combined_ntriples_graph()

# Element density per framework (one expression, reproducible result)
role_framework_bindings(rdf) |>
  count(framework_name, name = "role_count") |>
  left_join(
    element_framework_bindings(rdf) |>
      count(framework_name, name = "element_count"),
    by = "framework_name"
  ) |>
  mutate(elements_per_role = round(element_count / role_count, 1)) |>
  arrange(desc(elements_per_role))
#> # A tibble: 8 × 4
#>   framework_name                          role_count element_count elements_per_role
#>   <chr>                                        <int>         <int>             <dbl>
#> 1 NICE v2 (NIST SP 800-181 Rev 1...)              41          2111              51.5
#> 2 DCWF v5.1                                       74          2945              39.8
#> 3 ECSF v1                                         12           374              31.2
#> 4 CSEC2017 Curricular Guidelines v1.0              8            38               4.8
#> 5 CSTA K-12 Computer Science Standards...         25           120               4.8
#> 6 SFIA 9                                         147           672               4.6
#> 7 DigComp 2.2                                      5            21               4.2
#> 8 Cyber.org K-12 Learning Standards v1.0         116           123               1.1

# Pairwise comparison: NICE (role-first) vs SFIA (skill-first)
rfb <- role_framework_bindings(rdf)
reb <- role_element_bindings(rdf)

rfb |>
  filter(framework_name %in% c("NICE v2 (NIST SP 800-181 Rev 1 components)", "SFIA 9")) |>
  left_join(reb |> count(role, name = "element_count"), by = "role") |>
  group_by(framework_name) |>
  summarize(
    role_count      = n(),
    mean_elements   = round(mean(element_count, na.rm = TRUE), 1),
    median_elements = median(element_count, na.rm = TRUE),
    max_elements    = max(element_count, na.rm = TRUE)
  )
```

---

## Adding a New Framework

Extend the corpus by following the six-step pattern: register a slug and prefix, write an ingestion script, declare invariants, add verification field mappings, write an assembly adapter, then run the pipeline.

```r
# Step 1: Register prefix in R/jsonld-helpers.R
cybed_namespaces <- list(
  # ... existing prefixes ...
  fx = "https://frameworkx.example.org/ontology#"
)
valid_framework_prefixes <- c(
  "nice", "dcwf", "ecf", "sfia", "ecsf",
  "cyberorg", "csta", "csec", "digcomp",
  "fx"   # new framework
)

# Step 5: Assembly adapter in scripts/020-assemble-jsonld.R
assemble_frameworkx <- function() {
  prov     <- load_framework_provenance("frameworkx")
  roles    <- read_framework_table("frameworkx", "roles")
  elements <- read_framework_table("frameworkx", "elements")

  framework_node <- build_framework_node(
    framework_id     = "frameworkx-v1",
    framework_name   = prov$framework_version,
    framework_prefix = "fx",
    version          = prov$framework_version,
    publisher        = prov$source$publisher,
    jurisdiction     = "US",
    sector           = "civilian",
    specificity      = "cybersecurity-specific",
    license          = prov$licensing$source_license,
    date_published   = prov$framework_date
  )

  role_nodes <- purrr::pmap(roles, function(role_id, role_name, ...) {
    build_role_node(
      role_id             = role_id,
      role_name           = role_name,
      framework_prefix    = "fx",
      framework_role_type = "WorkRole",
      framework_id        = "frameworkx-v1"
    )
  })

  element_nodes <- purrr::pmap(elements, function(element_id, text, ...) {
    build_role_element_node(
      element_id             = element_id,
      framework_prefix       = "fx",
      framework_element_type = "Element",
      element_text           = text,
      framework_id           = "frameworkx-v1"
    )
  })

  list(framework = framework_node, roles = role_nodes,
       elements = element_nodes, prefix = "fx")
}

framework_assemblers[["frameworkx"]] <- assemble_frameworkx

# Step 6: Run pipeline. Existing SPARQL queries include Framework X automatically
# because they match on cybed:Framework, cybed:Role, cybed:RoleElement
```

```bash
Rscript scripts/010-ingest-frameworkx.R
Rscript scripts/015-verify-ingestion.R
Rscript scripts/020-assemble-jsonld.R
Rscript scripts/040-run-sparql.R
```

---

cybedtools is primarily used in two settings: empirical cybersecurity education research requiring reproducible cross-framework claims, and workforce-development analysis mapping job roles to training requirements across jurisdictions. In both settings, the core pattern is the same: load the N-Triples graph, call domain helpers (`framework_metadata`, `role_framework_bindings`, `element_framework_bindings`, `role_element_bindings`), and compose findings with dplyr. This makes results scriptable, version-controllable, and citable via the package's Zenodo DOI.

Integration into existing R workflows requires only `rdflib` and standard tidyverse packages. The package's single-BGP SPARQL discipline works around known librdf limitations with large graphs, so users never need to write multi-pattern SPARQL directly. The shipped `framework_summary` data object and `make_demo_graph()` enable immediate exploration without staging upstream framework data, while the six-step extension protocol and per-framework provenance manifests ensure the pipeline remains auditable and license-compliant as new frameworks are added.
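
The single-BGP discipline described above (one triple pattern per query, with joins done in R) can be illustrated without the package installed. The following is a minimal base-R sketch, not cybedtools code: the two data frames are hand-written stand-ins for the `s`/`o` tibbles that `sparql_pairs()` is documented to return, the shortened URIs are hypothetical, and base `merge()` is used in place of dplyr only to keep the example dependency-free.

```r
# Stand-ins for two single-pattern query results. With cybedtools these would
# come from sparql_pairs(rdf, "schema:name") and
# sparql_pairs(rdf, "cybed:jurisdiction"); the URIs here are illustrative only.
names_tbl <- data.frame(
  s = c("fw/demo-fw-a", "fw/demo-fw-b"),
  o = c("Demo Framework A", "Demo Framework B"),
  stringsAsFactors = FALSE
)
juris_tbl <- data.frame(
  s = c("fw/demo-fw-a", "fw/demo-fw-b"),
  o = c("US", "EU"),
  stringsAsFactors = FALSE
)

# Rename the generic s/o columns so the join key and payload are explicit.
names(names_tbl) <- c("framework", "name")
names(juris_tbl) <- c("framework", "jurisdiction")

# The join that a multi-pattern SPARQL query would perform happens in R instead.
# all.x = TRUE makes this a left join: a framework with no jurisdiction triple
# is kept with NA rather than dropped.
combined <- merge(names_tbl, juris_tbl, by = "framework", all.x = TRUE)
print(combined)
```

Keeping each query to a single pattern means the graph engine only ever scans for one predicate at a time, and a missing property shows up as an `NA` after the left join rather than silently removing the row.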