### Example Metadata Configuration File

Source: https://github.com/infinitensor/gguf/blob/main/xtask/README.md

Provides an example metadata configuration file that sets several numeric fields and one multi-line string field.

```plaintext
'llama.block_count' u64 22
'llama.context_length' u64 2048
'llama.embedding_length' u64 2048
'llama.feed_forward_length' u64 5632
'llama.attention.head_count' u64 32
'llama.attention.head_count_kv' u64 4
'llama.rope.dimension_count' u64 64
'tokenizer.chat_template' str|
| {%- for message in messages -%}
| {%- if message['role'] == 'user' -%}
| {{ '<|user|>
| ' + message['content'] + eos_token }}
| {%- elif message['role'] == 'system' -%}
| {{ '<|system|>
| ' + message['content'] + eos_token }}
| {%- elif message['role'] == 'assistant' -%}
| {{ '<|assistant|>
| ' + message['content'] + eos_token }}
| {%- endif -%}
| {%- if loop.last and add_generation_prompt -%}
| {{ '<|assistant|>
| ' }}
| {%- endif -%}
| {%- endfor -%}
```

--------------------------------

### Quantization and Dequantization Example (Rust)

Source: https://github.com/infinitensor/gguf/blob/main/ggml-quants/README.md

Demonstrates how to use the ggml-quants library to quantize f32 data into Q8_1 format and then dequantize it back to f32. This round trip is the library's core functionality: compressing data and restoring it.

```rust
use ggml_quants::{Quantize, Q8_1};

fn main() {
    // Original floating-point data
    let data: [f32; 32] = [
        0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8,
        0.9, 1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6,
        1.7, 1.8, 1.9, 2.0, 2.1, 2.2, 2.3, 2.4,
        2.5, 2.6, 2.7, 2.8, 2.9, 3.0, 3.1, 3.2,
    ];

    // Quantized data
    let quantized = Q8_1::quantize(&data);

    // Dequantized data
    let dequantized: [f32; 32] = quantized.dequantize();

    // Print the restored values so they can be compared with `data`
    println!("{dequantized:?}");
}
```

--------------------------------

### gguf-utils show Command Help

Source: https://github.com/infinitensor/gguf/blob/main/xtask/README.md

Displays the help for the gguf-utils `show` command, which shows the contents of GGUF files. It takes a file pattern as its argument and allows filtering of metadata and tensors.

```shell
gguf-utils show --help
```

--------------------------------

### gguf-utils show-data Command Help

Source: https://github.com/infinitensor/gguf/blob/main/xtask/README.md

Displays the help for the gguf-utils `show-data` command, which shows the data of a specific tensor in a GGUF file. It requires a file name and a tensor name as arguments.

```shell
gguf-utils show-data --help
```

--------------------------------

### gguf-utils set-meta Command Help

Source: https://github.com/infinitensor/gguf/blob/main/xtask/README.md

Displays the help for the gguf-utils `set-meta` command, which sets the metadata of GGUF files. How the metadata is specified (for example, as key-value pairs) is covered in the command's detailed documentation.

```shell
gguf-utils set-meta --help
```

--------------------------------

### gguf-utils convert Command Help

Source: https://github.com/infinitensor/gguf/blob/main/xtask/README.md

Displays the help for the gguf-utils `convert` command, which converts GGUF files to a different format. The conversion options are not listed here; they are specified through additional parameters.

```shell
gguf-utils convert --help
```

--------------------------------

### gguf-utils Help

Source: https://github.com/infinitensor/gguf/blob/main/xtask/README.md

Displays the global help for the gguf-utils tool, listing all available commands and options.

```shell
gguf-utils --help
```

--------------------------------

### gguf-utils merge Command Help

Source: https://github.com/infinitensor/gguf/blob/main/xtask/README.md

Displays the help for the gguf-utils `merge` command, which merges multiple GGUF shards into a single file. It supports matching shards with a file glob pattern and specifying an output directory.

```shell
gguf-utils merge --help
```

--------------------------------

### Create and Write GGUF File

Source: https://github.com/infinitensor/gguf/blob/main/ggus/README.md

Demonstrates how to create a new GGUF file, write
its header, metadata key-value pairs, and tensor data, and then read the header back. It uses `GGufFileWriter` and `GGufTensorWriter` for writing and `GGufReader` for reading.

```rust
use ggus::{
    DataFuture, GGufFileHeader, GGufFileWriter, GGufMetaDataValueType, GGufTensorWriter,
    GGmlType, GGufReader, GGufReadError,
};
use memmap2::Mmap;
use std::fs::File;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    const FILE_NAME: &str = "new_model.gguf";

    // ========== Write section ==========
    let file = File::create(FILE_NAME)?;
    let header = GGufFileHeader::new(3, 0, 0);
    let mut writer = GGufFileWriter::new(file, header)?;

    // Write metadata
    writer.write_alignment(32)?;
    writer.write_meta_kv(
        "general.architecture",
        GGufMetaDataValueType::String,
        b"llama\0",
    )?;
    writer.write_meta_kv("general.name", GGufMetaDataValueType::String, b"My Model\0")?;
    writer.write_meta_kv(
        "llm.context_length",
        GGufMetaDataValueType::U32,
        &2048u32.to_le_bytes(),
    )?;

    // Write tensors
    let mut tensor_writer = writer.finish::<Vec<u8>>(true);
    let shape = [4, 4];
    // 16 little-endian f32 values (64 bytes) for the 4x4 tensor
    let data_bytes = vec![
        1u8, 0, 0, 0, // bits 0x0000_0001: a tiny subnormal, not 1.0
        0, 0, 128, 63, // f32 = 1.0
        0, 0, 0, 64, // f32 = 2.0
        0, 0, 64, 64, // f32 = 3.0
        0, 0, 128, 64, // f32 = 4.0
        0, 0, 160, 64, // f32 = 5.0
        0, 0, 176, 64, // f32 = 5.5
        0, 0, 192, 64, // f32 = 6.0
        0, 0, 208, 64, // f32 = 6.5
        0, 0, 224, 64, // f32 = 7.0
        0, 0, 240, 64, // f32 = 7.5
        0, 0, 0, 65, // f32 = 8.0
        0, 0, 16, 65, // f32 = 9.0
        0, 0, 32, 65, // f32 = 10.0
        0, 0, 48, 65, // f32 = 11.0
        0, 0, 64, 65, // f32 = 12.0
    ];
    tensor_writer.write_tensor("weight", GGmlType::F32, &shape, data_bytes)?;
    tensor_writer.finish()?;

    // ========== Read section ==========
    let file = File::open(FILE_NAME)?;
    let data = unsafe { Mmap::map(&file) }?;
    let mut reader = GGufReader::new(&data);
    let header = match reader.read_header() {
        Ok(h) => h,
        Err(e) => {
            eprintln!("Failed to read GGUF header: {:?}", e);
            return Err(Box::new(std::io::Error::new(
                std::io::ErrorKind::Other,
                "Failed to read GGUF header",
            )));
        }
    };

    println!(
        "Endianness: {}",
        if header.is_native_endian() {
            "Native"
        } else {
            "Swapped"
        }
    );
    println!("Version: {}", header.version);
    println!("Metadata key-value count: {}", header.metadata_kv_count);
    println!("Tensor count: {}", header.tensor_count);

    Ok(())
}
```

--------------------------------

### gguf-utils Command List

Source: https://github.com/infinitensor/gguf/blob/main/xtask/README.md

Lists all commands supported by the gguf-utils tool with a brief description of each: show, show-data, split, merge, convert, and set-meta.

```plaintext
Usage: gguf-utils

Commands:
  show       Show the contents of gguf files
  show-data  Show tensor data in gguf file
  split      Split gguf files into shards
  merge      Merge shards into a single gguf file
  convert    Convert gguf files to different format
  set-meta   Set metadata of gguf files
  help       Print this message or the help of the given subcommand(s)

Options:
  -h, --help     Print help
  -V, --version  Print version
```

--------------------------------

### gguf-utils split Command Help

Source: https://github.com/infinitensor/gguf/blob/main/xtask/README.md

Displays the help for the gguf-utils `split` command, which splits a GGUF file into multiple shards. It supports specifying the output directory and the maximum number of tensors or bytes per shard.

```shell
gguf-utils split --help
```

--------------------------------

### gguf-utils show Command Arguments

Source: https://github.com/infinitensor/gguf/blob/main/xtask/README.md

Shows the detailed arguments of the gguf-utils `show` command: the file pattern, the array detail level, the metadata filter, and the tensor filter.

```plaintext
Usage: gguf-utils show [OPTIONS]

Arguments:
  The file to show

Options:
  -n, --array-detail   How many elements to show in arrays, `all` for all elements [default: 8]
  -m, --filter-meta    Meta to show [default: *]
  -t, --filter-tensor  Tensors to show [default: *]
  -h, --help           Print help
```

--------------------------------

### gguf-utils split Command Arguments

Source: https://github.com/infinitensor/gguf/blob/main/xtask/README.md

Shows the arguments of the gguf-utils `split` command: the input file, the output directory, the maximum tensor count, the maximum byte size, and whether tensor data is written.

```plaintext
Usage: gguf-utils split [OPTIONS]

Arguments:
  File to split

Options:
  -o, --output-dir       Output directory for converted files
  -t, --max-tensors      Max count of tensors per shard
  -s, --max-bytes        Max size in bytes per shard
      --no-tensor-first  If set, the first shard will not contain any tensor
      --no-data          If set, tensor data will not be written to output files
      --log              Log level, may be "off", "trace", "debug", "info" or "error"
  -h, --help             Print help
```

--------------------------------

### gguf-utils merge Command Arguments

Source: https://github.com/infinitensor/gguf/blob/main/xtask/README.md

Shows the arguments of the gguf-utils `merge` command: the glob pattern used to match shards, the output directory, and whether tensor data is written.

```plaintext
Usage: gguf-utils merge [OPTIONS]

Arguments:
  Glob pattern to match shards

Options:
  -o, --output-dir  Output directory for merged file
      --no-data     If set, tensor data will not be written to output files
      --log         Log level, may be "off", "trace", "debug", "info" or "error"
  -h, --help        Print help
```

--------------------------------

### ggml Quantization Types and Conversion Algorithms

Source: https://github.com/infinitensor/gguf/blob/main/README.md

Links to the ggml-quants crate, which defines the ggml quantization types and the algorithms used to convert between them.

```markdown
- [ggml quantization type definitions and conversion algorithms](ggml-quants)
```

--------------------------------

### gguf-utils show-data Command Arguments

Source: https://github.com/infinitensor/gguf/blob/main/xtask/README.md

Shows the arguments of the gguf-utils `show-data` command, which requires the name of the file and the name of the tensor whose data is shown.

```plaintext
Usage: gguf-utils show-data

Arguments:
  Name of file to show
  Name of tensor to show

Options:
  -h, --help  Print help
```

--------------------------------

### GGUF Metadata Configuration Formats

Source: https://github.com/infinitensor/gguf/blob/main/xtask/README.md

Details the formats for configuring metadata key-value pairs: algebraic types, single-line strings, and multi-line strings with a custom separator. In the example configuration file, an algebraic entry is written as a quoted key, a type, and a value (`'llama.block_count' u64 22`), and a multi-line string opens with `str|` and continues on lines that start with `| `.

```APIDOC
Meta Key-Value Configuration:

1. Algebraic Type Metadata:
   Format: '' can be integer, unsigned integer, float, or boolean.

2. String Metadata:
   Single-line: ''str ""
   Multi-line:
     ''str
     [Content]
     [Content]
     [Content]
   - is a delimiter string without whitespace, immediately following 'str'.
   - Subsequent lines starting with + space are part of the string.
   - Any line not starting with terminates the multi-line string.

3. Array Metadata:
   TODO: Currently not implemented.
```

--------------------------------

### gguf Utility Tools

Source: https://github.com/infinitensor/gguf/blob/main/README.md

Links to the xtask crate, which provides the utility tools for working with gguf files: showing, splitting, merging, converting, and setting metadata.

```markdown
- [gguf utility tools](xtask)
```

--------------------------------

### gguf File Definitions and Operations

Source: https://github.com/infinitensor/gguf/blob/main/README.md

Links to the ggus crate, which defines the structure of gguf files and the operations that can be performed on them. It is the starting point for understanding how models are stored and accessed in the gguf format.

```markdown
- [gguf file definitions and operations](ggus)
```

--------------------------------

### Read GGUF File Header

Source: https://github.com/infinitensor/gguf/blob/main/ggus/README.md

Demonstrates how to read the header of an existing GGUF file using `GGufReader`. It maps the file into memory and extracts the endianness, the version, and the counts of metadata entries and tensors.

```rust
use ggus::GGufReader;
use memmap2::Mmap;
use std::fs::File;

fn read_gguf_header(file_path: &str) -> Result<(), Box<dyn std::error::Error>> {
    // Map the file into memory and wrap it in a reader
    let file = File::open(file_path)?;
    let data = unsafe { Mmap::map(&file) }?;
    let mut reader = GGufReader::new(&data);

    let header = match reader.read_header() {
        Ok(h) => h,
        Err(e) => {
            eprintln!("Failed to read GGUF header: {:?}", e);
            return Err(Box::new(std::io::Error::new(
                std::io::ErrorKind::Other,
                "Failed to read GGUF header",
            )));
        }
    };

    println!(
        "Endianness: {}",
        if header.is_native_endian() {
            "Native"
        } else {
            "Swapped"
        }
    );
    println!("Version: {}", header.version);
    println!("Metadata key-value count: {}", header.metadata_kv_count);
    println!("Tensor count: {}", header.tensor_count);

    Ok(())
}
```

--------------------------------

### GGUF File Conversion

Source: https://github.com/infinitensor/gguf/blob/main/xtask/README.md

Converts GGUF files by applying a series of specified steps.
Supported steps include sorting, permuting, merging, splitting, casting, and filtering.

```shell
cargo convert --help
```

```plaintext
Convert gguf files to different format

Usage: gguf-utils convert [OPTIONS] --steps

Arguments:
  File to convert

Options:
  -x, --steps            Steps to apply, separated by "->", maybe "sort", "permute-qk", "merge-linear", "split-linear", "to-llama:", "cast:", "filter-meta:" or "filter-tensor:"
  -o, --output-dir       Output directory for converted files
  -t, --max-tensors      Max count of tensors per shard
  -s, --max-bytes        Max size in bytes per shard
      --no-tensor-first  If set, the first shard will not contain any tensor
      --no-data          If set, tensor data will not be written to output files
      --log              Log level, may be "off", "trace", "debug", "info" or "error"
  -h, --help             Print help
```

--------------------------------

### GGUF Metadata Modification

Source: https://github.com/infinitensor/gguf/blob/main/xtask/README.md

Sets metadata for GGUF files. Metadata values may be integers, floats, booleans, single-line strings, or multi-line strings.

```shell
gguf-utils set-meta --help
```

```shell
# in project dir
cargo set-meta --help
```

```plaintext
Set metadata of gguf files

Usage: gguf-utils set-meta [OPTIONS]

Arguments:
  File to set metadata
  Meta data to set for the file

Options:
  -o, --output-dir       Output directory for converted files
  -t, --max-tensors      Max count of tensors per shard
  -s, --max-bytes        Max size in bytes per shard
      --no-tensor-first  If set, the first shard will not contain any tensor
      --no-data          If set, tensor data will not be written to output files
      --log              Log level, may be "off", "trace", "debug", "info" or "error"
  -h, --help             Print help
```
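
The metadata passed to `set-meta` follows the configuration format described under GGUF Metadata Configuration Formats. The rules for that format can be sketched as a small standalone parser. This is an illustration only and not the xtask implementation: `MetaValue` and `parse_meta` are hypothetical names, scalar values are kept as unparsed text, and the sketch assumes that whitespace may separate the quoted key from the type and that no algebraic type name begins with `str`.

```rust
/// A parsed metadata value (hypothetical type, for illustration only).
#[derive(Debug, PartialEq)]
enum MetaValue {
    /// Algebraic entry, kept as unparsed (type name, value text).
    Scalar { ty: String, raw: String },
    /// Single-line or multi-line string entry.
    Str(String),
}

/// Parses configuration text into (key, value) pairs.
/// Assumption: keys are single-quoted and fields may be separated by whitespace.
fn parse_meta(text: &str) -> Result<Vec<(String, MetaValue)>, String> {
    let mut entries = Vec::new();
    let mut lines = text.lines().peekable();
    while let Some(line) = lines.next() {
        let line = line.trim();
        if line.is_empty() {
            continue;
        }
        // The key is everything between the first pair of single quotes.
        let rest = line
            .strip_prefix('\'')
            .ok_or_else(|| format!("expected a quoted key: {line}"))?;
        let (key, rest) = rest
            .split_once('\'')
            .ok_or_else(|| format!("unterminated key: {line}"))?;
        let rest = rest.trim_start();
        let value = match rest.strip_prefix("str") {
            // `str` followed by a double-quoted body: single-line string.
            Some(tail) if tail.trim_start().starts_with('"') => {
                let body = tail
                    .trim_start()
                    .strip_prefix('"')
                    .and_then(|s| s.strip_suffix('"'))
                    .ok_or_else(|| format!("unterminated string: {line}"))?;
                MetaValue::Str(body.to_string())
            }
            // `str` immediately followed by a separator: multi-line string.
            Some(sep) if !sep.is_empty() && !sep.contains(char::is_whitespace) => {
                let prefix = format!("{sep} ");
                let mut body = Vec::new();
                // Consume lines while they start with the separator plus a space.
                while let Some(&next) = lines.peek() {
                    match next.strip_prefix(prefix.as_str()) {
                        Some(content) => body.push(content),
                        None => break,
                    }
                    lines.next();
                }
                MetaValue::Str(body.join("\n"))
            }
            Some(_) => return Err(format!("malformed string entry: {line}")),
            // Anything else is treated as a type followed by a value.
            None => {
                let mut parts = rest.split_whitespace();
                match (parts.next(), parts.next()) {
                    (Some(ty), Some(raw)) => MetaValue::Scalar {
                        ty: ty.into(),
                        raw: raw.into(),
                    },
                    _ => return Err(format!("expected a type and a value: {line}")),
                }
            }
        };
        entries.push((key.to_string(), value));
    }
    Ok(entries)
}

fn main() {
    let text = "\
'llama.block_count' u64 22
'general.name' str \"My Model\"
'tokenizer.chat_template' str|
| {%- for message in messages -%}
| {%- endfor -%}
";
    for (key, value) in parse_meta(text).expect("config should parse") {
        println!("{key}: {value:?}");
    }
}
```

Keeping scalar values as text avoids guessing the full list of supported type names; a real tool would convert each value according to its declared type before writing it into the GGUF file.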