### Installation Guide

Source: https://github.com/marian-nmt/marian/blob/master/doc/README.md

Instructions for installing the necessary packages and setting up a Python environment for building the Marian NMT documentation.

```APIDOC
## Installation

On Ubuntu 20.04, install the following packages:

    sudo apt-get install python3 python3-pip python3-setuptools doxygen

Then set up a Python environment and install modules:

    pip3 install virtualenv
    virtualenv venv -p python3
    source venv/bin/activate
    pip3 install -r requirements.txt

Documentation building should also work on Windows, but it has not been tested.
```

--------------------------------

### Install Mio with Static Generators

Source: https://github.com/marian-nmt/marian/blob/master/src/3rd_party/mio/README.md

Steps to configure Mio for installation using static generators like Make or Ninja, including the optional installation prefix and testing configuration.

```bash
cd <mio source directory>
mkdir build
cd build
cmake [-D CMAKE_INSTALL_PREFIX="path/to/installation"] [-D BUILD_TESTING=False] -D CMAKE_BUILD_TYPE=Release -G <"Unix Makefiles" | "Ninja"> ..
```

--------------------------------

### CMake Project Setup and Options

Source: https://github.com/marian-nmt/marian/blob/master/CMakeLists.txt

Initializes the CMake project, sets the C++ standard, and defines build options such as CPU/GPU compilation, examples, server, tests, and third-party library support (CUDA, MKL, MPI, SentencePiece, etc.). It also includes platform-specific options like Apple Accelerate.
```cmake
cmake_minimum_required(VERSION 3.5.1)
set(CMAKE_MODULE_PATH ${CMAKE_CURRENT_SOURCE_DIR}/cmake)

if (POLICY CMP0074)
  cmake_policy(SET CMP0074 NEW) # CMake 3.12
endif ()

project(marian CXX C)
set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_STANDARD_REQUIRED ON)
set(BUILD_ARCH native CACHE STRING "Compile for this CPU architecture.")

# Custom CMake options
option(COMPILE_CPU "Compile CPU version" ON)
option(COMPILE_CUDA "Compile GPU version" ON)
option(COMPILE_EXAMPLES "Compile examples" OFF)
option(COMPILE_SERVER "Compile marian-server" OFF)
option(COMPILE_TESTS "Compile tests" OFF)
if(APPLE)
  option(USE_APPLE_ACCELERATE "Compile with Apple Accelerate" ON)
else(APPLE)
  option(USE_APPLE_ACCELERATE "Compile with Apple Accelerate" OFF)
endif(APPLE)
option(USE_CCACHE "Use ccache compiler cache (https://ccache.dev)" OFF)
option(USE_CUDNN "Use CUDNN library" OFF)
option(USE_DOXYGEN "Build documentation with Doxygen" ON)
option(USE_FBGEMM "Use FBGEMM" OFF)
option(USE_MKL "Compile with MKL support" ON)
option(USE_MPI "Use MPI library" OFF)
option(USE_NCCL "Use NCCL library" ON)
option(USE_SENTENCEPIECE "Download and compile SentencePiece" ON)
option(USE_STATIC_LIBS "Link statically against non-system libs" OFF)
option(GENERATE_MARIAN_INSTALL_TARGETS "Generate Marian install targets (requires CMake 3.12+)" OFF)
option(DETERMINISTIC "Try to make training results as deterministic as possible (e.g. for testing)" OFF)
```

--------------------------------

### Install Documentation Dependencies on Ubuntu

Source: https://github.com/marian-nmt/marian/blob/master/doc/README.md

Commands to install system-level dependencies and set up a Python virtual environment for building the project documentation.
```bash
sudo apt-get install python3 python3-pip python3-setuptools doxygen
pip3 install virtualenv
virtualenv venv -p python3
source venv/bin/activate
pip3 install -r requirements.txt
```

--------------------------------

### Configure Mio Installation with Dynamic Generators

Source: https://github.com/marian-nmt/marian/blob/master/src/3rd_party/mio/README.md

Steps to configure Mio for installation using dynamic generators like Visual Studio or Xcode.

```bash
cd <mio source directory>
mkdir build
cd build
cmake [-D CMAKE_INSTALL_PREFIX="path/to/installation"] [-D BUILD_TESTING=False] -G <"Visual Studio 14 2015 Win64" | "Xcode"> ..
```

--------------------------------

### Configuration File Format Example

Source: https://github.com/marian-nmt/marian/blob/master/src/3rd_party/CLI/README.md

Provides an example of the INI file format accepted by the CLI11 library bundled with Marian NMT, including comments, sections, and value assignments.

````APIDOC
## Configuration File Format Example

### Description
This example demonstrates the INI file format that the bundled CLI11 library supports for configuration.

### Method
N/A (This is a file format example)

### Endpoint
N/A

### Parameters
N/A

### Request Example
```ini
; Comments are supported, using a ;
; The default section is [default], case insensitive

value = 1
str = "A string"
vector = 1 2 3
str_vector = "one" "two" "and three"

; Sections map to subcommands
[subcommand]
in_subcommand = Wow
sub.subcommand = true
```

### Response
N/A

#### Response Example
N/A
````

--------------------------------

### spdlog CMake Installation Rules

Source: https://github.com/marian-nmt/marian/blob/master/src/3rd_party/spdlog/CMakeLists.txt

Defines the installation paths and rules for the spdlog library, including headers, CMake configuration files, and pkgconfig files. It ensures that the library and its associated build system components are correctly installed.
```cmake
set(generated_dir "${CMAKE_CURRENT_BINARY_DIR}/generated")

set(config_install_dir "lib/cmake/${PROJECT_NAME}")
set(include_install_dir "include")
set(pkgconfig_install_dir "lib/pkgconfig")

set(version_config "${generated_dir}/${PROJECT_NAME}ConfigVersion.cmake")
set(project_config "${generated_dir}/${PROJECT_NAME}Config.cmake")
set(pkg_config "${generated_dir}/${PROJECT_NAME}.pc")
set(targets_export_name "${PROJECT_NAME}Targets")
set(namespace "${PROJECT_NAME}::")

include(CMakePackageConfigHelpers)
write_basic_package_version_file(
    "${version_config}" COMPATIBILITY SameMajorVersion
)

# Note: use 'targets_export_name'
configure_file("cmake/Config.cmake.in" "${project_config}" @ONLY)
configure_file("cmake/spdlog.pc.in" "${pkg_config}" @ONLY)

install(
    TARGETS spdlog
    EXPORT "${targets_export_name}"
    INCLUDES DESTINATION "${include_install_dir}"
)

install(DIRECTORY "include/spdlog" DESTINATION "${include_install_dir}")

install(
    FILES "${project_config}" "${version_config}"
    DESTINATION "${config_install_dir}"
)

install(
    FILES "${pkg_config}"
    DESTINATION "${pkgconfig_install_dir}"
)

install(
    EXPORT "${targets_export_name}"
    NAMESPACE "${namespace}"
    DESTINATION "${config_install_dir}"
)
```

--------------------------------

### Install mio Library via CMake

Source: https://github.com/marian-nmt/marian/blob/master/src/3rd_party/mio/README.md

Executes the installation process for the mio library using CMake. This command copies header files and configuration files to the installation root.

```bash
cmake --build . --config Release --target install
```

--------------------------------

### Build and Install Marian NMT with CMake

Source: https://github.com/marian-nmt/marian/blob/master/contrib/triton-aml/marian_backend/README.md

This snippet demonstrates the command-line steps to configure, build, and install Marian NMT locally using CMake. It involves creating a build directory, setting the installation prefix, and executing the make install command.
Dependencies can be managed by overriding default repository tags.

```bash
mkdir build
cd build
cmake -DCMAKE_INSTALL_PREFIX:PATH=`pwd`/install ..
make install
```

--------------------------------

### Train Factored Models with Marian

Source: https://github.com/marian-nmt/marian/blob/master/doc/factors.md

Command-line examples for training factored models in Marian NMT. These examples demonstrate different configurations for source/target factors, embedding combination methods, and lemma dependencies.

```bash
# Using factors on both source and target with sum combination
path_to/build/marian -t corpus.fact.{src,trg} \
    -v vocab.{src,trg}.fsv

# Using factors only on source with concat combination and tied embeddings
path_to/build/marian -t corpus.fact.src corpus.trg \
    -v vocab.src.fsv vocab.trg.yml \
    --factors-combine concat \
    --factors-dim-emb 8 \
    --tied-embeddings-all

# Using factors only on target with soft-transformer-layer dependency
path_to/build/marian -t corpus.src corpus.fact.trg \
    -v vocab.src.yml vocab.trg.fsv \
    --tied-embeddings \
    --lemma-dependency soft-transformer-layer
```

--------------------------------

### Configure Marian Installation Targets

Source: https://github.com/marian-nmt/marian/blob/master/src/CMakeLists.txt

Sets up the installation rules for the Marian library, allowing it to be installed into system directories via standard build tools.

```cmake
if(GENERATE_MARIAN_INSTALL_TARGETS)
  include(GNUInstallDirs)
  install(TARGETS marian
    EXPORT marian-targets
    ARCHIVE DESTINATION ${CMAKE_INSTALL_LIBDIR}
  )
endif()
```

--------------------------------

### Configure Marian Training Experiments

Source: https://context7.com/marian-nmt/marian/llms.txt

Example YAML configuration for defining model architecture, training parameters, and device allocation, along with command-line execution instructions.
```yaml
model: model/model.npz
train-sets:
  - data/train.src
  - data/train.trg
type: transformer
dim-emb: 512
optimizer: adam
learn-rate: 0.0003
devices: [0, 1, 2, 3]
```

```bash
marian --config config.yml
marian --config config.yml --devices 0 1 --workspace 4000
```

--------------------------------

### Build with Custom C++ Flags

Source: https://github.com/marian-nmt/marian/blob/master/src/3rd_party/phf/README.md

Example of how to configure the build process using GNU Make, specifically demonstrating how to disable standard C++ library dependencies using the PHF_NO_LIBCXX macro.

```shell
make CPPFLAGS="-DPHF_NO_LIBCXX" \
     CXXFLAGS="-std=c++11 -fno-rtti -fno-exceptions -O3 -march=native" \
     LDFLAGS="-nostdlib" \
     LIBS="-lSystem"
```

--------------------------------

### Train Models with Factored Vocabularies

Source: https://context7.com/marian-nmt/marian/llms.txt

Command-line examples for training Marian models using factored vocabulary files (.fsv), covering scenarios like source/target factors, concatenation, and lemma dependencies.

```bash
# Factors on both source and target
marian -t corpus.fact.src corpus.fact.trg \
    -v vocab.src.fsv vocab.trg.fsv

# Factors on source with concatenation
marian -t corpus.fact.src corpus.trg \
    -v vocab.src.fsv vocab.trg.yml \
    --factors-combine concat \
    --factors-dim-emb 8 \
    --tied-embeddings-all

# Factors on target with lemma dependency
marian -t corpus.src corpus.fact.trg \
    -v vocab.src.yml vocab.trg.fsv \
    --tied-embeddings \
    --lemma-dependency soft-transformer-layer
```

--------------------------------

### Configure Installation and Export

Source: https://github.com/marian-nmt/marian/blob/master/contrib/triton-aml/marian_backend/CMakeLists.txt

Configures the installation paths and generates CMake package configuration files to allow other projects to consume the Triton Marian backend library.
```cmake
install(TARGETS triton-marian-backend
  EXPORT triton-marian-backend-targets
  LIBRARY DESTINATION ${CMAKE_INSTALL_PREFIX}/backends/marian)

include(CMakePackageConfigHelpers)
configure_package_config_file(${CMAKE_CURRENT_LIST_DIR}/cmake/TritonMarianBackendConfig.cmake.in
  ${CMAKE_CURRENT_BINARY_DIR}/TritonMarianBackendConfig.cmake
  INSTALL_DESTINATION ${INSTALL_CONFIGDIR})
```

--------------------------------

### Configure Triton Marian Backend Build

Source: https://github.com/marian-nmt/marian/blob/master/contrib/triton-aml/marian_backend/CMakeLists.txt

This snippet demonstrates the initial project setup, including defining the minimum CMake version, project options for GPU and statistics, and declaring external dependencies using FetchContent.

```cmake
cmake_minimum_required(VERSION 3.17)
project(tritonmarianbackend LANGUAGES C CXX)

option(TRITON_ENABLE_GPU "Enable GPU support in backend" OFF)
option(TRITON_ENABLE_STATS "Include statistics collections in backend" ON)

include(FetchContent)
FetchContent_Declare(repo-common
  GIT_REPOSITORY https://github.com/triton-inference-server/common.git
  GIT_TAG main
  GIT_SHALLOW ON)
FetchContent_MakeAvailable(repo-common)
```

--------------------------------

### Start Marian Translation WebSocket Server

Source: https://context7.com/marian-nmt/marian/llms.txt

Starts a WebSocket server using marian-server for real-time translation. Configurable with port, models, vocabs, beam size, normalization, and devices. Requires model and vocabulary files.

```bash
# Start translation server
marian-server \
    --port 8080 \
    --models model.npz \
    --vocabs vocab.src.yml vocab.trg.yml \
    --beam-size 6 \
    --normalize 0.6 \
    --devices 0
```

--------------------------------

### Command-Line Argument Parsing Examples in C++

Source: https://github.com/marian-nmt/marian/blob/master/src/3rd_party/CLI/README.md

Illustrates different ways to provide command-line arguments to a C++ application using the CLI11 library.
This covers flags, combined flags, options with spaces, options without spaces, and long options.

```C++
// Flags
//   -a
//   -abc (combined flags)
// Options
//   -f filename
//   -ffilename (no space)
//   -abcf filename (combined flags and option)
// Long options
//   --long
//   --file filename (space)
//   --file=filename (equals)
```

--------------------------------

### Configure Factored Vocabulary File

Source: https://github.com/marian-nmt/marian/blob/master/doc/factors.md

An example of a .fsv vocabulary file structure used in Marian NMT. It defines factors and lemmas to support factored model training.

```text
# factors
_lemma

_d
d0 : _d
d1 : _d
d2 : _d
d3 : _d
d4 : _d
_has_d

# lemmas
</s> : _lemma
<unk> : _lemma
, : _lemma
. : _lemma
le : _lemma
pour : _lemma
```

--------------------------------

### Install Marian NMT Dependencies (install.sh)

Source: https://github.com/marian-nmt/marian/blob/master/scripts/shortlist/README.md

The install.sh script is a helper utility for the Marian NMT project. It automates downloading and compiling external tools like fastalign and extract-lex, and then places the necessary binaries into a local `./bin` directory for easy access.

```bash
#!/bin/bash
# This script downloads and compiles fastalign and extract-lex
# and copies required binaries into ./bin

# Example usage (not part of the script itself):
# ./install.sh
```

--------------------------------

### Creating a mlp::mlp Network in C++

Source: https://github.com/marian-nmt/marian/blob/master/doc/layer.md

Illustrates how to build a complete MLP network by stacking multiple layers using `mlp::mlp`. This example shows adding a `mlp::dense` layer followed by an `mlp::output` layer, configuring their parameters, and then constructing the entire network within a graph.
```cpp
// construct a mlp::mlp network
auto mlp_networks = mlp::mlp()                   // construct an mlp container
    .push_back(mlp::dense()                      // construct a dense layer
        ("prefix", "dense")                      // prefix name is dense
        ("dim", 5)                               // dimension is 5
        ("activation", (int)mlp::act::tanh))     // activation function is tanh
    .push_back(mlp::output()                     // construct an output layer
        ("dim", 5))                              // dimension is 5
    ("prefix", "mlp_network")                    // prefix name is mlp_network
    .construct(graph);                           // construct these mlp layers in the graph
```

--------------------------------

### Makefile for Marian NMT Build

Source: https://github.com/marian-nmt/marian/wiki/AmuNMT-for-Automatic-Post-Editing

This Makefile is used to build the Marian NMT system. It requires the path to the AmuNMT tool and can be executed by simply typing 'make'. It handles the setup of input files, model files, and scripts needed for submission generation. Users may need to adjust GPU device configurations.

```makefile
AMUNMT=/home/marcinj/Badania/amunmt make
```

--------------------------------

### Initialize and Configure spdlog Loggers

Source: https://github.com/marian-nmt/marian/blob/master/src/3rd_party/spdlog/README.md

Demonstrates how to set up console, basic file, rotating, and daily loggers. It also covers formatting, runtime log level adjustments, and global registry management.
```c++
#include "spdlog/spdlog.h"
#include <iostream>
#include <memory>

namespace spd = spdlog;

int main(int, char*[]) {
    try {
        auto console = spd::stdout_color_mt("console");
        console->info("Welcome to spdlog!");

        auto my_logger = spd::basic_logger_mt("basic_logger", "logs/basic.txt");
        auto rotating_logger = spd::rotating_logger_mt("some_logger_name", "logs/mylogfile", 1048576 * 5, 3);
        auto daily_logger = spd::daily_logger_mt("daily_logger", "logs/daily", 2, 30);

        spd::set_pattern("*** [%H:%M:%S %z] [thread %t] %v ***");
        spd::set_level(spd::level::info);
        spd::drop_all();
    } catch (const spd::spdlog_ex& ex) {
        std::cout << "Log init failed: " << ex.what() << std::endl;
        return 1;
    }
}
```

--------------------------------

### Initialize and Parse CLI Options

Source: https://github.com/marian-nmt/marian/blob/master/src/3rd_party/CLI/README.md

Demonstrates the standard way to initialize a CLI11 application, add an option, and parse command line arguments using the provided macro.

```cpp
CLI::App app{"App description"};

std::string filename = "default";
app.add_option("-f,--file", filename, "A help string");

CLI11_PARSE(app, argc, argv);
```

--------------------------------

### Set Up Optimiser (C++)

Source: https://github.com/marian-nmt/marian/blob/master/doc/graph.md

Shows how to initialize an optimiser with a specified algorithm (Adam, Sgd) and learning rate. The optimiser is responsible for updating model parameters based on computed gradients.

```cpp
// Choose optimizer (Sgd, Adagrad, Adam) and initial learning rate
auto opt = Optimizer<Adam>(0.01);
```

```cpp
// set up Sgd optimiser with 0.005 learning rate
auto opt = Optimizer<Sgd>(0.005);
```

--------------------------------

### Build Documentation with Make

Source: https://github.com/marian-nmt/marian/blob/master/doc/README.md

Command to trigger the documentation build process, which generates static HTML files in the build directory.
```bash
make html
```

--------------------------------

### Generate Packaged Installation with CPack

Source: https://github.com/marian-nmt/marian/blob/master/src/3rd_party/mio/README.md

Creates a relocatable installation package for the library using CPack. The generator name depends on the target platform.

```bash
cpack -G <generator name> -C Release
```

--------------------------------

### Configure Optimizers and Training Loop in C++

Source: https://context7.com/marian-nmt/marian/llms.txt

Demonstrates how to instantiate optimizers like Adam, Sgd, or Adagrad and integrate them into a training loop to update model parameters based on graph backpropagation.

```cpp
// Create optimizer
auto opt = Optimizer<Adam>(
    0.0003,             // Learning rate
    {0.9, 0.98, 1e-9}   // Adam parameters (beta1, beta2, eps)
);

// Alternative optimizers
auto sgd = Optimizer<Sgd>(0.01);
auto adagrad = Optimizer<Adagrad>(0.01);

// Training loop
for(int epoch = 0; epoch < numEpochs; epoch++) {
    for(auto batch : batches) {
        // Build graph for batch
        auto loss = buildGraph(graph, batch);

        // Forward and backward pass
        graph->backprop();

        // Update parameters
        opt->update(graph);

        // Clear graph for next iteration
        graph->clear();
    }
}
```

--------------------------------

### Create and Initialize ExpressionGraph in C++

Source: https://github.com/marian-nmt/marian/blob/master/doc/graph.md

Demonstrates the creation of an ExpressionGraph object and its initialization with device options and workspace memory. Proper initialization is crucial to prevent runtime crashes.

```cpp
// create a graph
auto graph = New<ExpressionGraph>();
// initialise graph with device options
// here we specify device no. is 0
// device type can be DeviceType::cpu or DeviceType::gpu
graph->setDevice({0, DeviceType::cpu});
// preallocate workspace memory (MB) for the graph
graph->reserveWorkspaceMB(128);
```

--------------------------------

### Example Debug Output (C++)

Source: https://github.com/marian-nmt/marian/blob/master/doc/graph.md

Provides an example of the output generated when debugging a node, showing the value and gradient information printed during the forward and backward passes, typically when the Marian logger is enabled.

```cpp
[2021-02-16 15:10:51] [memory] Reserving 256 B, device gpu0
[2021-02-16 15:10:51] Debug: Parameter x op=param
[2021-02-16 15:10:51] shape=1x1 size=1 type=float32 device=gpu0 ptr=140505547538432 bytes=256
min: 2.00000000 max: 2.00000000 l2-norm: 2.00000000
[[ 2.00000000 ]]

[2021-02-16 15:10:51] [memory] Reserving 256 B, device gpu0
[2021-02-16 15:10:51] Debug Grad: Parameter x op=param
[2021-02-16 15:10:51] shape=1x1 size=1 type=float32 device=gpu0 ptr=140505547538944 bytes=256
min: 2.58385324 max: 2.58385324 l2-norm: 2.58385324
[[ 2.58385324 ]]
```

--------------------------------

### Build and Test Mio with Static Generators

Source: https://github.com/marian-nmt/marian/blob/master/src/3rd_party/mio/README.md

Instructions for building and testing Mio using static configuration tools like GNU Make or Ninja. It involves creating a build directory, configuring with CMake, and executing build and test commands.

```bash
cd <mio source directory>
mkdir build
cd build
cmake -D CMAKE_BUILD_TYPE=<Debug | Release> -G <"Unix Makefiles" | "Ninja"> ..
< make | ninja | cmake --build . >
< make test | ninja test | cmake --build . --target test | ctest >
```

--------------------------------

### Configure and Link Marian NMT Example Executables

Source: https://github.com/marian-nmt/marian/blob/master/src/examples/CMakeLists.txt

This CMake script defines executable targets for iris and mnist examples.
It iterates through the targets to link the Marian core library and conditionally adds CUDA support if detected.

```cmake
add_executable(iris_example iris/iris.cpp)
add_executable(mnist_example mnist/mnist_ffnn.cpp)

foreach(exec iris_example mnist_example)
  target_link_libraries(${exec} marian ${EXT_LIBS})
  if(CUDA_FOUND)
    target_link_libraries(${exec} marian ${EXT_LIBS} marian_cuda ${EXT_LIBS})
  endif(CUDA_FOUND)
  set_target_properties(${exec} PROPERTIES RUNTIME_OUTPUT_DIRECTORY "${CMAKE_BINARY_DIR}")
endforeach(exec)
```

--------------------------------

### Subclassing the App Class

Source: https://github.com/marian-nmt/marian/blob/master/src/3rd_party/CLI/README.md

Explains how to subclass the `App` class to provide preset default options, setup/teardown code, and customize help flags.

````APIDOC
## Subclassing the App Class

### Description
The `App` class is designed to be subclassed, allowing for the provision of preset default options, custom setup/teardown code, and modifications to help flags.

### Method
N/A (This involves subclassing)

### Endpoint
N/A

### Parameters
N/A

### Request Example
```cpp
// Example of subclassing App
class MyCustomApp : public CLI::App {
public:
    MyCustomApp() {
        // Set preset defaults or perform setup
        option_defaults()->required();
        set_help_flag("--my-help", "Custom help flag");
    }
};
```

### Response
N/A

#### Response Example
N/A
````

--------------------------------

### CMake Install Target Generation

Source: https://github.com/marian-nmt/marian/blob/master/CMakeLists.txt

Handles the generation of installation targets for the Marian project, including necessary includes for system libraries and package configuration. This section includes a compatibility check for CMake versions older than 3.12, issuing a warning and disabling the feature if the version is insufficient.
```cmake
if(GENERATE_MARIAN_INSTALL_TARGETS AND ${CMAKE_VERSION} VERSION_LESS "3.12")
  message(WARNING "Marian install targets cannot be generated on CMake <3.12.\n Please upgrade your CMake version or set GENERATE_MARIAN_INSTALL_TARGETS=OFF to remove this warning. Disabling installation targets.")
  set(GENERATE_MARIAN_INSTALL_TARGETS OFF CACHE BOOL "Forcing disabled installation targets due to CMake <3.12." FORCE)
endif()

if(GENERATE_MARIAN_INSTALL_TARGETS)
  include(GNUInstallDirs)                 # This defines default values for installation directories (all platforms even if named GNU)
  include(InstallRequiredSystemLibraries) # Tell CMake that the `install` target needs to install required system libraries (eg: Windows SDK)
  include(CMakePackageConfigHelpers)      # Helper to create relocatable packages

  install(EXPORT marian-targets           # Installation target
          DESTINATION ${CMAKE_INSTALL_LIBDIR}/cmake)
endif(GENERATE_MARIAN_INSTALL_TARGETS)
```

--------------------------------

### GET /subcommand/status

Source: https://github.com/marian-nmt/marian/blob/master/src/3rd_party/CLI/README.md

Checks if a specific subcommand was parsed from the command line or retrieves the list of active subcommands. These are C++ calls on a CLI11 `App`, described here in endpoint form; CLI11 itself exposes no HTTP interface.

```APIDOC
## GET /subcommand/status

### Description
Determines if a subcommand was provided during execution or retrieves the collection of subcommands parsed.

### Method
GET

### Parameters
#### Query Parameters
- **name** (string) - Optional - The name of the subcommand to check via `got_subcommand`.

### Response
#### Success Response (200)
- **parsed** (boolean) - Whether the subcommand was present.
- **subcommands** (array) - List of pointers to subcommands found on the command line.
```

--------------------------------

### GET /database/query

Source: https://github.com/marian-nmt/marian/blob/master/src/3rd_party/SQLiteCpp/README.md

Executes a parameterized SQL query against a SQLite database file and retrieves result rows. SQLiteC++ is a C++ library, so the endpoint form below is descriptive rather than an HTTP interface.
```APIDOC
## GET /database/query

### Description
Executes a SELECT query with bound parameters to safely retrieve data from the database.

### Method
GET

### Endpoint
/database/query

### Parameters
#### Query Parameters
- **db_path** (string) - Required - Path to the SQLite database file
- **sql** (string) - Required - The SQL SELECT statement with '?' placeholders
- **bind_value** (integer) - Required - Value to bind to the first parameter

### Request Example
{
  "db_path": "example.db3",
  "sql": "SELECT * FROM test WHERE size > ?",
  "bind_value": 6
}

### Response
#### Success Response (200)
- **rows** (array) - List of objects containing column values

#### Response Example
{
  "rows": [
    {"id": 1, "value": "sample", "size": 10}
  ]
}
```

--------------------------------

### Initialize Git Submodules

Source: https://github.com/marian-nmt/marian/blob/master/src/3rd_party/SQLiteCpp/README.md

Commands to clone the repository and initialize the required googletest submodule.

```shell
git clone https://github.com/SRombauts/SQLiteCpp.git
cd SQLiteCpp
git submodule init
git submodule update
```

--------------------------------

### CastNodeOp Constructor (C++)

Source: https://github.com/marian-nmt/marian/blob/master/doc/operators.md

A simple example of a NodeOp that requires explicit type specification. The constructor takes an expression and a type, initializing the UnaryNodeOp.

```cpp
// CastNodeOp in src/graph/node_operators_unary.h
CastNodeOp(Expr a, Type type) : UnaryNodeOp(a, type) {}
```

--------------------------------

### Implement Sin Unary Node Operator

Source: https://github.com/marian-nmt/marian/blob/master/doc/operators.md

An example implementation of a sine function operator using UnaryNodeOp, demonstrating forward and backward pass definitions.
```cpp
struct SinNodeOp : public UnaryNodeOp {
  SinNodeOp(Expr x) : UnaryNodeOp(x) {}

  NodeOps forwardOps() override {
    using namespace functional;
    return {NodeOp(Element(_1 = sin(_2), val_, child(0)->val()))};
  }

  NodeOps backwardOps() override {
    using namespace functional;
    return {NodeOp(Add(_1 * cos(_2), child(0)->grad(), adj_, child(0)->val()))};
  }

  const std::string type() override { return "sin"; }
};
```

--------------------------------

### Build and Test Mio with Dynamic Generators

Source: https://github.com/marian-nmt/marian/blob/master/src/3rd_party/mio/README.md

Instructions for building and testing Mio using dynamic configuration tools like Visual Studio or Xcode. It includes configuration, build, and test execution steps specific to IDE-based generators.

```bash
cd <mio source directory>
mkdir build
cd build
cmake -G <"Visual Studio 14 2015 Win64" | "Xcode"> ..
cmake --build . --config <Debug | Release>
ctest --build-config <Debug | Release>
cmake --build . --config <Debug | Release> --target test
```

--------------------------------

### Consume mio via CMake find_package

Source: https://github.com/marian-nmt/marian/blob/master/src/3rd_party/mio/README.md

Configures a downstream CMake project to use the installed mio library. It links the mio target to a specified project target.

```cmake
find_package( mio REQUIRED )
target_link_libraries( MyTarget PUBLIC mio::mio )
```

--------------------------------

### Handle Assertions in SQLiteC++

Source: https://github.com/marian-nmt/marian/blob/master/src/3rd_party/SQLiteCpp/README.md

Provides an example of how to define a custom assertion handler for SQLiteC++ when exceptions should not be used in destructors. This requires defining SQLITECPP_ENABLE_ASSERT_HANDLER during compilation and implementing the assertion_failed function.
```C++
#ifdef SQLITECPP_ENABLE_ASSERT_HANDLER
namespace SQLite
{
/// definition of the assertion handler enabled when SQLITECPP_ENABLE_ASSERT_HANDLER is defined in the project (CMakeLists.txt)
void assertion_failed(const char* apFile, const long apLine, const char* apFunc, const char* apExpr, const char* apMsg)
{
    // Print a message to the standard error output stream, and abort the program.
    std::cerr << apFile << ":" << apLine << ":" << " error: assertion failed (" << apExpr << ") in " << apFunc << "() with message \"" << apMsg << "\"\n";
    std::abort();
}
}
#endif
```

--------------------------------

### Using Perfect Hash Functions in Lua

Source: https://github.com/marian-nmt/marian/blob/master/src/3rd_party/phf/README.md

Demonstrates how to initialize a perfect hash function object using a set of keys and retrieve unique hash values for those keys.

```lua
local phf = require"phf"

local lambda = 4 -- how many keys per intermediate bucket
local alpha = 80 -- output hash space loading in percentage.

local keys = { "apple", "banana", "cherry", "date", "eggplant", "fig",
               "guava", "honeydew", "jackfruit", "kiwi", "lemon", "mango" }

local F = phf.new(keys, lambda, alpha)

for i=1,#keys do
    print(keys[i], F(keys[i]))
end
```

--------------------------------

### Python Client for Marian WebSocket Server

Source: https://context7.com/marian-nmt/marian/llms.txt

A Python client example using the 'websocket-client' library to connect to a marian-server and perform translations. It sends text to the server and receives the translated output.

```python
#!/usr/bin/env python3
# Client example for marian-server
from websocket import create_connection
import sys

# Connect to server
ws = create_connection("ws://localhost:8080/translate")

# Translate text
batch = "Hello, how are you?\nThis is a test."
ws.send(batch)
result = ws.recv()
print(result.rstrip())

ws.close()
```

--------------------------------

### Retrieve Directory Children and Parent

Source: https://github.com/marian-nmt/marian/blob/master/src/3rd_party/pathie-cpp/README.md

Demonstrates how to list directory contents and navigate to parent directories using Pathie's Path class.

```cpp
std::vector<Pathie::Path> children = your_path.children();

Pathie::Path yourpath("foo/bar/baz");
Pathie::Path parent = yourpath.parent();
```

--------------------------------

### ExpressionGraph: Constants and Parameters (C++)

Source: https://context7.com/marian-nmt/marian/llms.txt

Shows how to create constant tensors (immutable) and parameter tensors (trainable) within an ExpressionGraph. Covers various initialization methods like fromVector, glorotUniform, zeros, and uniform.

```cpp
// Create constant nodes (immutable during training)
auto x = graph->constant({batchSize, inputDim}, inits::fromVector(inputData));
auto ones = graph->ones({10, 10});
auto zeros = graph->zeros({10, 10});

// Create parameter nodes (trainable weights)
auto W = graph->param("W", {inputDim, hiddenDim}, inits::glorotUniform());
auto b = graph->param("b", {1, hiddenDim}, inits::zeros());

// Parameter with specific initialization
auto W_init = graph->param("W_init", {512, 256}, inits::uniform(-0.1f, 0.1f));

// Fixed (non-trainable) parameter
auto W_fixed = graph->param("W_fixed", {100, 100}, inits::fromValue(1.0f));
W_fixed->setTrainable(false);
```

--------------------------------

### Create and Manage Temporary Directory (C++)

Source: https://github.com/marian-nmt/marian/blob/master/src/3rd_party/pathie-cpp/README.md

Shows how to create a temporary directory using `Pathie::Tempdir`. The directory is automatically removed when the `Tempdir` object goes out of scope. A fragment can be provided to customize the temporary directory name.

```cpp
#include <cstdlib>
#include <ctime>
#include <iostream>
#include <pathie/tempdir.hpp>

// ...
{
    srand(time(NULL)); // Needs random number generator
    Pathie::Tempdir tmpdir("foo"); // Pass a fragment to use as part of filename
    std::cout << "Temporary dir is: " << tmpdir.path() << std::endl;
}
// When `tmpdir' is destroyed, the destructor recursively
// deletes the directory that was created.
```

--------------------------------

### Constructing a Convolution Layer

Source: https://github.com/marian-nmt/marian/blob/master/doc/layer.md

Illustrates the construction of a 2D convolution layer in Marian. It requires an NVIDIA cuDNN installation and is configured through options such as the kernel dimensions and the number of kernels.

```cpp
// construct a convolution layer
auto conv_1 = convolution(graph)             // pass graph pointer to the layer
      ("prefix", "conv_1")                   // prefix name is conv_1
      ("kernel-dims", std::make_pair(3,3))   // kernel is 3*3
      ("kernel-num", 32)                     // kernel no. is 32
      .apply(x);                             // link node x as the input
```

--------------------------------

### Map Existing File Descriptor with mio

Source: https://github.com/marian-nmt/marian/blob/master/src/3rd_party/mio/README.md

Shows how to initialize a memory mapping from an existing file descriptor instead of a file path. This is useful when the application has already opened the file.

```cpp
// The include targets were lost in extraction; the headers below are
// reconstructed from the calls used and may differ from the original README.
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>       // open, O_RDONLY
#include <mio/mmap.hpp>  // mio::mmap_source
#include <algorithm>

int main()
{
    // Error handling omitted for brevity.
    const int fd = open("file.txt", O_RDONLY);
    mio::mmap_source mmap(fd, 0, mio::map_entire_file);
}
```

--------------------------------

### Define Doxygen Javadoc-style Comment Block

Source: https://github.com/marian-nmt/marian/blob/master/doc/doc_guide.rst

Standard C-style comment block format used for documenting classes, functions, and methods in Marian. The block opens with `/**` (a slash followed by two asterisks) and holds a brief description, which ends at the first dot, followed by a detailed description.

```cpp
/**
 * Brief description which ends at this dot. Details follow
 * here.
 */
```

--------------------------------

### Defining CLI11 Options and Flags

Source: https://github.com/marian-nmt/marian/blob/master/src/3rd_party/CLI/README.md

Shows the call syntax for adding the various kinds of command-line arguments: standard options, flags, set-based options, and subcommands. The lines are signature summaries, not compilable code.

```cpp
app.add_option(option_name, variable_to_bind_to, help_string="", default=false);
app.add_complex(...);
app.add_flag(option_name, int_or_bool = nothing, help_string="");
app.add_flag_function(option_name, function, help_string="");
app.add_set(option_name, variable_to_bind_to, set_of_possible_options, help_string="", default=false);
app.add_set_ignore_case(...);
App* subcom = app.add_subcommand(name, description);
```

--------------------------------

### Implement Sin Operation in C++

Source: https://github.com/marian-nmt/marian/blob/master/doc/operators.md

Provides the implementation of the 'sin' expression operator. It uses the generic 'Expression' helper function to create a 'SinNodeOp' instance and add it to the graph, which makes it a straightforward example of defining a unary operation.

```cpp
// src/graph/expression_operators.h
Expr sin(Expr x);

// src/graph/expression_operators.cpp
Expr sin(Expr x) {
  return Expression<SinNodeOp>(x);
}
```

--------------------------------

### ExpressionGraph: Creating and Initializing Graphs (C++)

Source: https://context7.com/marian-nmt/marian/llms.txt

Demonstrates the creation and initialization of an ExpressionGraph, the core data structure for building neural network computations in Marian NMT. Includes setting the device, reserving workspace memory, and enabling gradient checkpointing.
```cpp
#include "marian.h"

using namespace marian;

// Create a new expression graph
auto graph = New<ExpressionGraph>();

// Initialize with device (CPU or GPU)
graph->setDevice({0, DeviceType::cpu});    // CPU device 0
// graph->setDevice({0, DeviceType::gpu}); // GPU device 0

// Reserve workspace memory (MB)
graph->reserveWorkspaceMB(128);

// Enable gradient checkpointing for memory efficiency
graph->setCheckpointing(true);
```

--------------------------------

### Configuration File Usage

Source: https://github.com/marian-nmt/marian/blob/master/src/3rd_party/CLI/README.md

Explains how to use configuration files (INI format by default) to set application options, including sections, comments, and data types.

```APIDOC
## Configuration File Usage

### Description
This section describes how to use configuration files to set application options. CLI11, the command-line library bundled with Marian NMT, reads INI files by default, with support for comments, sections, and various data types.

### Method
`app.set_config(option_name, default_file_name, help_string, required)`

### Endpoint
N/A (This is a configuration method)

### Parameters
#### Path Parameters
None

#### Query Parameters
None

#### Request Body
- **option_name** (string) - The name of the configuration option.
- **default_file_name** (string) - The default name of the configuration file.
- **help_string** (string) - Help text for the configuration option.
- **required** (boolean) - If true, the configuration file must exist.

### Request Example
```cpp
app.set_config(option_name="", default_file_name="", help_string="Read an ini file", required=false)
```

### Response
#### Success Response (200)
N/A (This is a configuration method, not an API endpoint)

#### Response Example
N/A
```

--------------------------------

### Implement RNN and LSTM Layers in C++

Source: https://context7.com/marian-nmt/marian/llms.txt

Provides examples for creating a single RNN cell and a stacked RNN. Both support several cell types, such as LSTM or GRU, and the stacked RNN adds settings for direction and layer normalization.

```cpp
// Single RNN cell
auto cell = rnn::cell()
    ("type", "lstm")        // Options: gru, lstm, tanh, relu, sru
    ("prefix", "rnn_cell")
    ("dimInput", 512)
    ("dimState", 512)
    ("dropout", 0.1f);

// Stacked RNN
auto rnn = rnn::rnn()
    ("type", "lstm")
    ("prefix", "encoder")
    ("dimInput", 512)
    ("dimState", 512)
    ("direction", (int)rnn::dir::forward)  // or backward, alternating_forward
    ("dropout", 0.2f)
    ("layer-normalization", true)
    .push_back(rnn::cell())
    .push_back(rnn::cell())
    .construct(graph);

// Process sequence
auto output = rnn->transduce(input);
```

--------------------------------

### POST /app/add_option

Source: https://github.com/marian-nmt/marian/blob/master/src/3rd_party/CLI/README.md

Defines how to initialize the CLI application and register command-line options using the CLI11 library.

```APIDOC
## POST /app/add_option

### Description
Initializes the CLI application instance and binds a command-line option to a local variable.

### Method
POST (Conceptual representation of configuration)

### Endpoint
CLI::App::add_option

### Parameters
#### Request Body
- **flags** (string) - Required - The command line flags (e.g., "-f,--file").
- **variable** (reference) - Required - The variable to store the parsed value.
- **help** (string) - Optional - Help text displayed in the auto-generated help menu.

### Request Example
```cpp
CLI::App app{"App description"};
std::string filename = "default";
app.add_option("-f,--file", filename, "A help string");
```

### Response
#### Success Response (200)
- **option** (`CLI::Option*`) - `add_option` returns a pointer to the newly registered option, not a boolean status.

#### Response Example
```cpp
// No explicit JSON response; library modifies the bound variable directly after parsing.
```
```

--------------------------------

### ExpressionGraph: Tensor Manipulation (C++)

Source: https://context7.com/marian-nmt/marian/llms.txt

Provides examples of essential tensor manipulation operations in the Expression Graph API: reshaping, transposing, concatenating, slicing, selecting elements by index, and repeating tensor elements.

```cpp
// Reshape tensor
auto reshaped = reshape(x, {batchSize, seqLen, hiddenDim});

// Transpose
auto transposed = transpose(x);              // Swap last two axes
auto permuted = transpose(x, {0, 2, 1, 3});  // Custom permutation

// Concatenate tensors
auto concat = concatenate({a, b, c}, 0);  // Along axis 0

// Slice tensor
auto sliced = slice(x, 0, Slice(0, 10));  // First 10 elements along axis 0

// Select specific indices
auto selected = select(x, 1, indices);

// Repeat elements
auto repeated = repeat(x, 3, 0);  // Repeat 3 times along axis 0
```

--------------------------------

### Retrieve Native Path Representation (C++)

Source: https://github.com/marian-nmt/marian/blob/master/src/3rd_party/pathie-cpp/README.md

Demonstrates how to get the native representation of a path. On Windows, this is a `std::wstring` for Unicode support, while on UNIX-like systems it is a `std::string`, which typically uses UTF-8 encoding.

```cpp
std::wstring native_utf16 = mypath.native();
```

--------------------------------

### AmuNMT Configuration File (YAML)

Source: https://github.com/marian-nmt/marian/wiki/AmuNMT-for-Automatic-Post-Editing

This is a sample configuration file for the AmuNMT system, written in YAML format.
It specifies relative paths, scorer configurations (including model paths and types), vocabulary files, weights for the individual scorers, beam size, normalization settings, and device assignments. The 'devices' parameter can be adjusted based on available hardware.

```yaml
# amunn config file
relative-paths: yes

# Scorer configuration
scorers:
  F0:
    type: Nematus
    path: ../mt-pe/model.iter260000.npz
  F1:
    type: Nematus
    path: ../mt-pe/model.iter270000.npz
  F2:
    type: Nematus
    path: ../mt-pe/model.iter280000.npz
  F3:
    type: Nematus
    path: ../mt-pe/model.iter290000.npz
  F4:
    type: Nematus
    path: ../src-pe/model.iter340000.npz
    tab: 1
  F5:
    type: Nematus
    path: ../src-pe/model.iter350000.npz
    tab: 1
  F6:
    type: Nematus
    path: ../src-pe/model.iter360000.npz
    tab: 1
  F7:
    type: Nematus
    path: ../src-pe/model.iter370000.npz
    tab: 1
  F8:
    type: APE

source-vocab:
  - ../mt-pe/vocab.mt.json
  - ../src-pe/vocab.src.json
target-vocab: ../mt-pe/vocab.pe.json

weights:
  F0: 0.0679875234050288
  F1: 0.136272622440232
  F2: 0.0447424881348462
  F3: 0.0505810091549122
  F4: 0.119029214497868
  F5: -0.0291262004966649
  F6: -0.0348248568202612
  F7: 0.131424048800743
  F8: 0.386012036249443

beam-size: 12
normalize: yes
n-best: no
devices: [0, 1, 2]
threads-per-device: 1
```
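
The configuration above pairs each scorer (F0 to F8) with an entry under `weights`. The sketch below is not part of AmuNMT. It is a small, hypothetical sanity check that copies the scorer names and weights from the config into plain Python data and confirms that every scorer has a weight. The `combine` helper further assumes that the decoder mixes per-scorer scores as a weighted sum, which is a common way to ensemble models but is not confirmed by the source; the example scores are made up.

```python
import math

# Scorer names and weights copied by hand from the YAML config above.
scorers = ["F0", "F1", "F2", "F3", "F4", "F5", "F6", "F7", "F8"]
weights = {
    "F0": 0.0679875234050288,
    "F1": 0.136272622440232,
    "F2": 0.0447424881348462,
    "F3": 0.0505810091549122,
    "F4": 0.119029214497868,
    "F5": -0.0291262004966649,
    "F6": -0.0348248568202612,
    "F7": 0.131424048800743,
    "F8": 0.386012036249443,
}


def check_weights(scorer_names, weight_map):
    """Return (scorers without a weight, weights without a scorer)."""
    missing = [name for name in scorer_names if name not in weight_map]
    unused = [name for name in weight_map if name not in scorer_names]
    return missing, unused


def combine(scores, weight_map):
    """Weighted sum of per-scorer scores (assumed combination rule)."""
    return sum(weight_map[name] * score for name, score in scores.items())


missing, unused = check_weights(scorers, weights)

# Hypothetical scores: every scorer assigns the same log-probability of -1.0.
example_scores = {name: -1.0 for name in scorers}
total = combine(example_scores, weights)

print("scorers without a weight:", missing)
print("weights without a scorer:", unused)
print("combined score:", total)
```

A check like this can catch a renamed or removed scorer before a long decoding run. Note that some weights in the sample config are negative, so a combined score does not have to lie between the individual scorer scores.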