### Install and Configure Dependencies in Livebook/Scripts

Source: https://github.com/elixir-nx/bumblebee/blob/main/README.md

Use `Mix.install/2` in notebooks or scripts to install Bumblebee and EXLA and, in the same call, configure Nx to use the EXLA backend.

```elixir
Mix.install(
  [
    {:bumblebee, "~> 0.6.0"},
    {:exla, ">= 0.0.0"}
  ],
  config: [nx: [default_backend: EXLA.Backend]]
)
```

--------------------------------

### Start Nx.Serving as a Batched Inference Server

Source: https://context7.com/elixir-nx/bumblebee/llms.txt

Configures and starts a Bumblebee serving as a supervised OTP process. The process automatically batches concurrent requests from multiple clients to maximize throughput. Ensure the serving configuration includes batch size and sequence length if applicable.

```elixir
# In your application supervisor (application.ex)
defmodule MyApp.Application do
  use Application

  def start(_type, _args) do
    {:ok, bert} = Bumblebee.load_model({:hf, "google-bert/bert-base-uncased"})
    {:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "google-bert/bert-base-uncased"})

    serving =
      Bumblebee.Text.fill_mask(bert, tokenizer,
        compile: [batch_size: 8, sequence_length: 100],
        defn_options: [compiler: EXLA]
      )

    children = [
      {Nx.Serving, serving: serving, name: MyApp.Serving, batch_size: 8, batch_timeout: 100}
    ]

    Supervisor.start_link(children, strategy: :one_for_one)
  end
end

# Call from anywhere in the application (requests are automatically batched)
Nx.Serving.batched_run(MyApp.Serving, "The [MASK] sat on the mat.")
#=> %{predictions: [%{token: "cat", score: 0.87}, ...]}
```

--------------------------------

### Nx.Serving as a Production Process

Source: https://context7.com/elixir-nx/bumblebee/llms.txt

Starts a supervised process that automatically batches concurrent requests from multiple clients, maximizing throughput on CPU or GPU.
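The batching behaviour can be observed without downloading a model. The following is a minimal sketch, not part of the Bumblebee docs: it assumes Nx is installed on its own (the version constraint is illustrative), uses `Nx.Serving.jit/1` with a trivial doubling function as a stand-in for a Bumblebee serving, and registers the process under the hypothetical name `DemoServing`.

```elixir
# Assumption: standalone Nx install; the version constraint is illustrative.
Mix.install([{:nx, "~> 0.9"}])

# A trivial serving that doubles its input. It stands in for a Bumblebee serving.
serving = Nx.Serving.jit(&Nx.multiply(&1, 2))

# DemoServing is a hypothetical name used only in this sketch.
children = [
  {Nx.Serving, serving: serving, name: DemoServing, batch_size: 8, batch_timeout: 100}
]

{:ok, _supervisor} = Supervisor.start_link(children, strategy: :one_for_one)

# Four concurrent callers; the serving process may group their inputs into one batch
# and hands each caller back only its own slice of the result.
results =
  1..4
  |> Task.async_stream(fn i ->
    batch = Nx.Batch.stack([Nx.tensor([i, i])])
    Nx.Serving.batched_run(DemoServing, batch)
  end)
  |> Enum.map(fn {:ok, tensor} -> Nx.to_flat_list(tensor) end)

IO.inspect(results)
#=> [[2, 2], [4, 4], [6, 6], [8, 8]]
```

Callers never see each other's inputs: each `batched_run/2` call returns only the rows that belong to that caller, regardless of how requests were grouped.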
```APIDOC
## `Nx.Serving` as a Production Process - Batched inference server

Any Bumblebee serving can be started as a supervised process that automatically batches concurrent requests from multiple clients, maximizing throughput on CPU or GPU.

```elixir
# In your application supervisor (application.ex)
defmodule MyApp.Application do
  use Application

  def start(_type, _args) do
    {:ok, bert} = Bumblebee.load_model({:hf, "google-bert/bert-base-uncased"})
    {:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "google-bert/bert-base-uncased"})

    serving =
      Bumblebee.Text.fill_mask(bert, tokenizer,
        compile: [batch_size: 8, sequence_length: 100],
        defn_options: [compiler: EXLA]
      )

    children = [
      {Nx.Serving, serving: serving, name: MyApp.Serving, batch_size: 8, batch_timeout: 100}
    ]

    Supervisor.start_link(children, strategy: :one_for_one)
  end
end

# Call from anywhere in the application (requests are automatically batched)
Nx.Serving.batched_run(MyApp.Serving, "The [MASK] sat on the mat.")
#=> %{predictions: [%{token: "cat", score: 0.87}, ...]}
```
```

--------------------------------

### Compile Bumblebee Serving with EXLA

Source: https://github.com/elixir-nx/bumblebee/blob/main/examples/phoenix/README.md

Creates a Bumblebee serving that uses EXLA as the compiler, so the computation is compiled upfront. This ensures efficient use of the GPU for neural network models.

```elixir
serving =
  Bumblebee.Text.text_embedding(model_info, tokenizer,
    compile: [batch_size: 1, sequence_length: 512],
    defn_options: [compiler: EXLA]
  )
```

--------------------------------

### Load Bumblebee Models

Source: https://github.com/elixir-nx/bumblebee/blob/main/examples/phoenix/README.md

Example of loading multiple Bumblebee models from Hugging Face. This function can be called during application startup or deployment to pre-cache models.
```elixir
# `load_xyz` and the repository names are placeholders from the upstream README;
# substitute the loader each repository needs (load_model/2, load_tokenizer/2, ...).
def load_all do
  Bumblebee.load_xyz({:hf, "microsoft/resnet"})
  Bumblebee.load_xyz({:hf, "foo/bar/baz"})
end
```

--------------------------------

### Perform Text Fill-Mask Task with Bumblebee

Source: https://github.com/elixir-nx/bumblebee/blob/main/README.md

Load a pre-trained BERT model and tokenizer from Hugging Face Hub, create a text fill-mask serving, and use it to predict masked words in a sentence. This example demonstrates a common NLP task using Bumblebee's high-level APIs.

```elixir
{:ok, model_info} = Bumblebee.load_model({:hf, "google-bert/bert-base-uncased"})
{:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "google-bert/bert-base-uncased"})

serving = Bumblebee.Text.fill_mask(model_info, tokenizer)

Nx.Serving.run(serving, "The capital of [MASK] is Paris.")
#=> %{
#=>   predictions: [
#=>     %{score: 0.9279842972755432, token: "france"},
#=>     %{score: 0.008412551134824753, token: "brittany"},
#=>     %{score: 0.007433671969920397, token: "algeria"},
#=>     %{score: 0.004957548808306456, token: "department"},
#=>     %{score: 0.004369721747934818, token: "reunion"}
#=>   ]
#=> }
```

--------------------------------

### Bumblebee.load_tokenizer/2

Source: https://context7.com/elixir-nx/bumblebee/llms.txt

Loads a tokenizer from a specified repository, downloading and building a fast (Rust-based) tokenizer from `tokenizer.json`. The tokenizer type is inferred from the model's `config.json`.

```APIDOC
## Bumblebee.load_tokenizer/2

### Description
Loads a tokenizer from a repository. Downloads and builds a fast (Rust-based) tokenizer from `tokenizer.json`. The tokenizer type is inferred from the model's `config.json`.

### Method
`Bumblebee.load_tokenizer/2`

### Parameters
#### Path Parameters
- `source` (tuple): Specifies the source of the tokenizer. Can be `{:hf, "repository_name"}` for Hugging Face Hub or `{:local, "/path/to/tokenizer/dir"}` for a local directory.
- `opts` (keyword list): Optional keyword list for configuration.
### Request Example
```elixir
{:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "google-bert/bert-base-uncased"})

# Apply tokenizer directly
inputs = Bumblebee.apply_tokenizer(tokenizer, "The capital of France is Paris.")
# inputs => %{"input_ids" => #Nx.Tensor<...>, "attention_mask" => #Nx.Tensor<...>, ...}

# Tokenize a batch
inputs = Bumblebee.apply_tokenizer(tokenizer, [
  "Hello world",
  "Elixir is great"
])

# Configure tokenizer options via Bumblebee.configure/2
tokenizer = Bumblebee.configure(tokenizer, length: 128)
```

### Response
#### Success Response
Returns `{:ok, tokenizer}` where `tokenizer` is a `%Bumblebee.Text.PreTrainedTokenizer{}` struct.

#### Response Example
```elixir
# The loaded tokenizer struct
%Bumblebee.Text.PreTrainedTokenizer{...}
```
```

--------------------------------

### Configure Nx to Use EXLA Backend

Source: https://github.com/elixir-nx/bumblebee/blob/main/README.md

Configure Nx to use the EXLA backend by default in your `config/config.exs` file. This ensures that models are compiled and run using EXLA for better performance.

```elixir
import Config

config :nx, default_backend: EXLA.Backend
```

--------------------------------

### Configure Nx Default Backend

Source: https://github.com/elixir-nx/bumblebee/blob/main/examples/phoenix/README.md

Sets the default Nx backend to the CPU (the EXLA `:host` client) for one-off operations, so that the GPU is reserved for large computations.

```elixir
config :nx, :default_backend, {EXLA.Backend, client: :host}
```

--------------------------------

### Load Tokenizer with Bumblebee

Source: https://context7.com/elixir-nx/bumblebee/llms.txt

Use `Bumblebee.load_tokenizer/2` to download and build a fast tokenizer. The tokenizer type is inferred from the model's `config.json`. You can apply the tokenizer directly or configure its options.
```elixir
{:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "google-bert/bert-base-uncased"})
```

```elixir
# Apply tokenizer directly
inputs = Bumblebee.apply_tokenizer(tokenizer, "The capital of France is Paris.")
# inputs => %{"input_ids" => #Nx.Tensor<...>, "attention_mask" => #Nx.Tensor<...>, ...}
```

```elixir
# Tokenize a batch
inputs = Bumblebee.apply_tokenizer(tokenizer, [
  "Hello world",
  "Elixir is great"
])
```

```elixir
# Configure tokenizer options via Bumblebee.configure/2
tokenizer = Bumblebee.configure(tokenizer, length: 128)
```

--------------------------------

### Add Bumblebee and EXLA Dependencies

Source: https://github.com/elixir-nx/bumblebee/blob/main/README.md

Add Bumblebee and the optional EXLA dependency to your `mix.exs` file. EXLA is recommended for just-in-time model compilation and optimized CPU/GPU performance.

```elixir
def deps do
  [
    {:bumblebee, "~> 0.6.0"},
    {:exla, ">= 0.0.0"}
  ]
end
```

--------------------------------

### Create Local Model Checkpoints with Python

Source: https://github.com/elixir-nx/bumblebee/blob/main/AGENTS.md

This Python script generates local checkpoints for various SmolLM3 model types using a specified configuration. It saves these checkpoints to be used for testing.
```python
from transformers import (
    SmolLM3Config,
    SmolLM3Model,
    SmolLM3ForCausalLM,
    SmolLM3ForQuestionAnswering,
    SmolLM3ForSequenceClassification,
    SmolLM3ForTokenClassification,
)

config = SmolLM3Config(
    vocab_size=1024,
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=4,
    intermediate_size=37,
    hidden_act="gelu",
    hidden_dropout_prob=0.1,
    attention_probs_dropout_prob=0.1,
    max_position_embeddings=512,
    type_vocab_size=16,
    is_decoder=False,
    initializer_range=0.02,
    pad_token_id=0,
    no_rope_layers=[0, 1]
)

for c in [
    SmolLM3Model,
    SmolLM3ForCausalLM,
    SmolLM3ForQuestionAnswering,
    SmolLM3ForSequenceClassification,
    SmolLM3ForTokenClassification,
]:
    name = c.__name__
    c(config).save_pretrained(
        f"bumblebee-testing/tiny-random-{name}",
        repo_id=f"bumblebee-testing/tiny-random-{name}",
    )
```

--------------------------------

### Load Generation Configuration with Bumblebee

Source: https://context7.com/elixir-nx/bumblebee/llms.txt

Use `Bumblebee.load_generation_config/2` to load sampling and decoding configuration for text generation models. Generation parameters can be adjusted using `Bumblebee.configure/2`.

```elixir
{:ok, generation_config} = Bumblebee.load_generation_config({:hf, "openai-community/gpt2"})
```

```elixir
# Adjust generation parameters
generation_config =
  Bumblebee.configure(generation_config,
    max_new_tokens: 100,
    min_new_tokens: 10
  )
```

--------------------------------

### Bumblebee.load_generation_config/2

Source: https://context7.com/elixir-nx/bumblebee/llms.txt

Loads the generation configuration for text generation models, which includes sampling and decoding parameters.

```APIDOC
## Bumblebee.load_generation_config/2

### Description
Loads sampling and decoding configuration for text generation models.

### Method
`Bumblebee.load_generation_config/2`

### Parameters
#### Path Parameters
- `source` (tuple): Specifies the source of the generation configuration.
Can be `{:hf, "repository_name"}` for Hugging Face Hub or `{:local, "/path/to/config/dir"}` for a local directory.
- `opts` (keyword list): Optional keyword list for configuration.

### Request Example
```elixir
{:ok, generation_config} = Bumblebee.load_generation_config({:hf, "openai-community/gpt2"})

# Adjust generation parameters
generation_config =
  Bumblebee.configure(generation_config,
    max_new_tokens: 100,
    min_new_tokens: 10
  )
```

### Response
#### Success Response
Returns `{:ok, generation_config}` where `generation_config` is a `%Bumblebee.Text.GenerationConfig{}` struct.

#### Response Example
```elixir
%Bumblebee.Text.GenerationConfig{max_new_tokens: 100, min_new_tokens: 10, ...}
```
```

--------------------------------

### Load Pre-trained Model with Bumblebee

Source: https://context7.com/elixir-nx/bumblebee/llms.txt

Use `Bumblebee.load_model/2` to download and cache model parameters and configuration. Supports loading from Hugging Face Hub or local directories, with options for architecture override, custom parameter types, and specific subdirectories.
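Because downloaded files are cached, it can help to check or relocate the cache before loading large models. The following is a minimal sketch, assuming the `Bumblebee.cache_dir/0` function and the `BUMBLEBEE_CACHE_DIR` and `BUMBLEBEE_OFFLINE` environment variables behave as described in the Bumblebee README; the directory path is hypothetical.

```elixir
Mix.install([{:bumblebee, "~> 0.6.0"}])

# Print where downloaded files are cached on this machine.
IO.puts(Bumblebee.cache_dir())

# Assumption: per the Bumblebee README, BUMBLEBEE_CACHE_DIR overrides the cache
# location and BUMBLEBEE_OFFLINE=true restricts loading to files already cached.
# The path below is hypothetical.
System.put_env("BUMBLEBEE_CACHE_DIR", "/tmp/bumblebee_cache")
System.put_env("BUMBLEBEE_OFFLINE", "true")
```

In a deployment, these variables would more commonly be set in the environment of the release than from application code.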
```elixir
# Load BERT from Hugging Face (auto-infers architecture)
{:ok, bert} = Bumblebee.load_model({:hf, "google-bert/bert-base-uncased"})
%{model: model, params: params, spec: spec} = bert
```

```elixir
# Load with explicit architecture
{:ok, resnet} = Bumblebee.load_model({:hf, "microsoft/resnet-50"}, architecture: :base)
```

```elixir
# Load with bfloat16 precision for GPU efficiency
{:ok, llama} = Bumblebee.load_model({:hf, "meta-llama/Llama-2-7b-hf"}, type: :bf16)
```

```elixir
# Load a specific subdir (e.g., for Stable Diffusion components)
{:ok, unet} = Bumblebee.load_model({:hf, "CompVis/stable-diffusion-v1-4", subdir: "unet"})
```

```elixir
# Load from local directory
{:ok, model_info} = Bumblebee.load_model({:local, "/path/to/model/dir"})
```

```elixir
# Customise spec at load time
{:ok, resnet} =
  Bumblebee.load_model({:hf, "microsoft/resnet-50"},
    spec_overrides: [num_labels: 10]
  )
```

--------------------------------

### Load Featurizer with Bumblebee

Source: https://context7.com/elixir-nx/bumblebee/llms.txt

Use `Bumblebee.load_featurizer/2` to load a featurizer for preprocessing images or audio into model-compatible tensors. Apply the featurizer to input data like images or audio files.

```elixir
{:ok, featurizer} = Bumblebee.load_featurizer({:hf, "microsoft/resnet-50"})
```

```elixir
# Apply featurizer to an image
{:ok, img} = StbImage.read_file("/path/to/image.jpg")
inputs = Bumblebee.apply_featurizer(featurizer, [img])
# inputs => %{"pixel_values" => #Nx.Tensor<...>}
```

```elixir
# Whisper audio featurizer
{:ok, featurizer} = Bumblebee.load_featurizer({:hf, "openai/whisper-tiny"})
```

--------------------------------

### Bumblebee.load_featurizer/2

Source: https://context7.com/elixir-nx/bumblebee/llms.txt

Loads a featurizer, which is used for preprocessing images or audio into tensors compatible with machine learning models.
```APIDOC
## Bumblebee.load_featurizer/2

### Description
Loads a featurizer (image/audio preprocessor) for preprocessing images or audio into model-compatible tensors.

### Method
`Bumblebee.load_featurizer/2`

### Parameters
#### Path Parameters
- `source` (tuple): Specifies the source of the featurizer. Can be `{:hf, "repository_name"}` for Hugging Face Hub or `{:local, "/path/to/featurizer/dir"}` for a local directory.
- `opts` (keyword list): Optional keyword list for configuration.

### Request Example
```elixir
{:ok, featurizer} = Bumblebee.load_featurizer({:hf, "microsoft/resnet-50"})

# Apply featurizer to an image
{:ok, img} = StbImage.read_file("/path/to/image.jpg")
inputs = Bumblebee.apply_featurizer(featurizer, [img])
# inputs => %{"pixel_values" => #Nx.Tensor<...>}

# Whisper audio featurizer
{:ok, featurizer} = Bumblebee.load_featurizer({:hf, "openai/whisper-tiny"})
```

### Response
#### Success Response
Returns `{:ok, featurizer}` where `featurizer` is the loaded featurizer struct. The concrete struct depends on the repository.

#### Response Example
```elixir
# For "microsoft/resnet-50" the featurizer is a ConvNeXT-style image featurizer
%Bumblebee.Vision.ConvNextFeaturizer{...}
```
```

--------------------------------

### Generate Reference Values with Python

Source: https://github.com/elixir-nx/bumblebee/blob/main/AGENTS.md

Use this Python script to obtain reference output values from Hugging Face Transformers models for testing purposes. It prints the shape and a slice of the last hidden state.
```python
from transformers import BertModel
import torch

model = BertModel.from_pretrained("hf-internal-testing/tiny-random-BertModel")

inputs = {
    "input_ids": torch.tensor([[10, 20, 30, 40, 50, 60, 70, 80, 0, 0]]),
    "attention_mask": torch.tensor([[1, 1, 1, 1, 1, 1, 1, 1, 0, 0]])
}

outputs = model(**inputs)

print(outputs.last_hidden_state.shape)
print(outputs.last_hidden_state[:, 1:4, 1:4])

#=> torch.Size([1, 10, 32])
#=> tensor([[[-0.2331,  1.7817,  1.1736],
#=>          [-1.1001,  1.3922, -0.3391],
#=>          [ 0.0408,  0.8677, -0.0779]]], grad_fn=<SliceBackward0>)
```

--------------------------------

### Generate Image Captions with BLIP

Source: https://context7.com/elixir-nx/bumblebee/llms.txt

Use this serving to generate natural language descriptions for images. Ensure the BLIP model, featurizer, tokenizer, and generation config are loaded.

```elixir
{:ok, blip} = Bumblebee.load_model({:hf, "Salesforce/blip-image-captioning-base"})
{:ok, featurizer} = Bumblebee.load_featurizer({:hf, "Salesforce/blip-image-captioning-base"})
{:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "Salesforce/blip-image-captioning-base"})

{:ok, generation_config} =
  Bumblebee.load_generation_config({:hf, "Salesforce/blip-image-captioning-base"})

serving =
  Bumblebee.Vision.image_to_text(blip, featurizer, tokenizer, generation_config,
    defn_options: [compiler: EXLA]
  )

image = StbImage.read_file!("/path/to/cat_on_chair.jpg")

Nx.Serving.run(serving, image)
#=> %{results: [%{text: "a cat sitting on a chair"}]}
```

--------------------------------

### Bumblebee.configure/2

Source: https://context7.com/elixir-nx/bumblebee/llms.txt

A versatile function to build or update configuration structs for various Bumblebee components, including model specs, featurizers, tokenizers, and generation configurations.

```APIDOC
## Bumblebee.configure/2

### Description
Builds or updates a configuration struct.
This function is used to create new configurations or modify existing ones for model specs, featurizers, schedulers, tokenizers, or generation configs with custom options.

### Method
`Bumblebee.configure/2`

### Parameters
#### Path Parameters
- `target` (module or struct): The target configuration struct or module to build/update (e.g., `Bumblebee.Vision.ResNet`, an existing `tokenizer`, or `generation_config`).
- `opts` (keyword list): Keyword list of options to set or update in the configuration.

### Request Example
```elixir
# Build a model spec from scratch
spec =
  Bumblebee.configure(Bumblebee.Vision.ResNet,
    architecture: :for_image_classification,
    num_labels: 200
  )

# Build a featurizer config from its module (passing a module creates a new struct)
featurizer = Bumblebee.configure(Bumblebee.Vision.ConvNextFeaturizer, resize_method: :bilinear)

# Update an existing generation config
generation_config =
  Bumblebee.configure(generation_config,
    max_new_tokens: 50,
    min_new_tokens: 5
  )

# Build model from spec
model = Bumblebee.build_model(spec)
```

### Response
#### Success Response
Returns the updated or newly created configuration struct.

#### Response Example
```elixir
# Depending on the target, returns the corresponding configured struct
%Bumblebee.Vision.ResNet{...}
# or
%Bumblebee.Text.PreTrainedTokenizer{...}
# or
%Bumblebee.Text.GenerationConfig{max_new_tokens: 50, min_new_tokens: 5, ...}
```
```

--------------------------------

### Text Generation with Bumblebee

Source: https://context7.com/elixir-nx/bumblebee/llms.txt

Generate text continuations from a prompt using `Bumblebee.Text.generation/4`. Supports streaming output for real-time applications. Load the model, tokenizer, and generation configuration first.
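The generation configuration also controls the decoding strategy. The following is a minimal sketch, assuming Bumblebee is already installed (for example through the `Mix.install/2` call from the first section) and that the `:strategy` option and the `%{text: ..., seed: ...}` input shape match the installed Bumblebee version; the `top_p` value and seed are arbitrary illustration values.

```elixir
{:ok, model_info} = Bumblebee.load_model({:hf, "openai-community/gpt2"})
{:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "openai-community/gpt2"})
{:ok, generation_config} = Bumblebee.load_generation_config({:hf, "openai-community/gpt2"})

# Assumption: sampling is enabled through the :strategy option of the generation config.
generation_config =
  Bumblebee.configure(generation_config,
    max_new_tokens: 15,
    strategy: %{type: :multinomial_sampling, top_p: 0.6}
  )

serving = Bumblebee.Text.generation(model_info, tokenizer, generation_config)

# Assumption: passing a seed alongside the text makes the sampled output repeatable.
Nx.Serving.run(serving, %{text: "Elixir is a functional", seed: 0})
```

Sampled output varies with the seed, so no expected result is shown here.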
```elixir
{:ok, model_info} = Bumblebee.load_model({:hf, "openai-community/gpt2"})
{:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "openai-community/gpt2"})
{:ok, generation_config} = Bumblebee.load_generation_config({:hf, "openai-community/gpt2"})

generation_config = Bumblebee.configure(generation_config, max_new_tokens: 15)

# Standard batch generation
serving = Bumblebee.Text.generation(model_info, tokenizer, generation_config)

Nx.Serving.run(serving, "Elixir is a functional")
#=> %{
#=>   results: [
#=>     %{text: " programming language that is designed to be used in a variety of applications. It", token_summary: %{input: 5, output: 15, padding: 0}}
#=>   ]
#=> }

# Streaming generation - returns a lazy stream of text chunks
serving = Bumblebee.Text.generation(model_info, tokenizer, generation_config, stream: true)

Nx.Serving.run(serving, "Elixir is a functional") |> Enum.to_list()
#=> [" programming", " language", " that", " is", " designed", " to", " be", " used",
#=>  " in", " a", " variety", " of", " applications.", " It"]
```

--------------------------------

### Bumblebee.load_model/2

Source: https://context7.com/elixir-nx/bumblebee/llms.txt

Loads a pre-trained model from Hugging Face Hub or a local directory. It downloads and caches model parameters and configuration, returning a map containing the model, parameters, and specification. Supports architecture overrides, custom parameter types, and custom backends.

```APIDOC
## Bumblebee.load_model/2

### Description
Loads a pre-trained model from Hugging Face Hub or local disk. Returns a `model_info` map with `:model`, `:params`, and `:spec` keys. Supports architecture override, custom parameter types (e.g., `:bf16`), and custom backends.

### Method
`Bumblebee.load_model/2`

### Parameters
#### Path Parameters
- `source` (tuple): Specifies the source of the model. Can be `{:hf, "repository_name"}` for Hugging Face Hub or `{:local, "/path/to/model/dir"}` for a local directory.
For Hugging Face sources, an optional `subdir` key can be provided within the tuple.
- `opts` (keyword list): Optional keyword list for configuration. Supported options include `:architecture`, `:type`, `:spec_overrides`, and `:backend`.

### Request Example
```elixir
# Load BERT from Hugging Face (auto-infers architecture)
{:ok, bert} = Bumblebee.load_model({:hf, "google-bert/bert-base-uncased"})
%{model: model, params: params, spec: spec} = bert

# Load with explicit architecture
{:ok, resnet} = Bumblebee.load_model({:hf, "microsoft/resnet-50"}, architecture: :base)

# Load with bfloat16 precision for GPU efficiency
{:ok, llama} = Bumblebee.load_model({:hf, "meta-llama/Llama-2-7b-hf"}, type: :bf16)

# Load a specific subdir (e.g., for Stable Diffusion components)
{:ok, unet} = Bumblebee.load_model({:hf, "CompVis/stable-diffusion-v1-4", subdir: "unet"})

# Load from local directory
{:ok, model_info} = Bumblebee.load_model({:local, "/path/to/model/dir"})

# Customise spec at load time
{:ok, resnet} =
  Bumblebee.load_model({:hf, "microsoft/resnet-50"},
    spec_overrides: [num_labels: 10]
  )
```

### Response
#### Success Response
Returns `{:ok, model_info}` where `model_info` is a map containing:
- `:model`: The loaded model (an `%Axon{}` struct).
- `:params`: The model parameters.
- `:spec`: The model specification (a model-specific struct such as `%Bumblebee.Text.Bert{}`).

#### Response Example
```elixir
%{model: %Axon{...}, params: %{...}, spec: %Bumblebee.Text.Bert{...}}
```
```

--------------------------------

### Load a Diffusion Scheduler with Bumblebee

Source: https://context7.com/elixir-nx/bumblebee/llms.txt

Loads a noise scheduler (DDIM, PNDM, or LCM) for controlling the denoising process in diffusion models. Schedulers can be initialized and stepped through using the provided functions.
```elixir
{:ok, scheduler} =
  Bumblebee.load_scheduler({:hf, "CompVis/stable-diffusion-v1-4", subdir: "scheduler"})

# Schedulers can also be used directly
{state, timesteps} = Bumblebee.scheduler_init(scheduler, 50, sample_template, prng_key)
{state, prev_sample} = Bumblebee.scheduler_step(scheduler, state, sample, prediction)
```

--------------------------------

### Configure Bumblebee Components

Source: https://context7.com/elixir-nx/bumblebee/llms.txt

Use `Bumblebee.configure/2` to build or update configuration structs for model specs, featurizers, schedulers, tokenizers, or generation configs. Passing a module builds a new struct; passing an existing struct updates it.

```elixir
# Build a model spec from scratch
spec =
  Bumblebee.configure(Bumblebee.Vision.ResNet,
    architecture: :for_image_classification,
    num_labels: 200
  )
```

```elixir
# Build a featurizer config from its module (passing a module creates a new struct)
featurizer = Bumblebee.configure(Bumblebee.Vision.ConvNextFeaturizer, resize_method: :bilinear)
```

```elixir
# Update an existing generation config
generation_config =
  Bumblebee.configure(generation_config,
    max_new_tokens: 50,
    min_new_tokens: 5
  )
```

```elixir
# Build model from spec
model = Bumblebee.build_model(spec)
```

--------------------------------

### Load Model with Subdirectory

Source: https://github.com/elixir-nx/bumblebee/blob/main/README.md

When a repository contains multiple models, specify the subdirectory containing the desired model.

```elixir
Bumblebee.load_model({:hf, "model-repo", subdir: "..."})
```

--------------------------------

### Qwen3 Text Reranking with Bumblebee

Source: https://context7.com/elixir-nx/bumblebee/llms.txt

Perform text reranking using Qwen3 instruction-tuned models to score query-document relevance. Outputs normalized relevance probabilities and supports batching.
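Relevance scores are usually post-processed into a ranked list. The following is a minimal sketch in plain Elixir, assuming the `%{scores: [...]}` result shape shown in the example that follows; the sample data is hand-written for illustration (not real model output) and the `0.5` cutoff is an arbitrary assumption.

```elixir
# Hand-written sample in the shape the reranking example returns.
# These are illustrative numbers, not real model output.
result = %{
  scores: [
    %{
      score: 0.02,
      query: "What is the capital of France?",
      document: "Berlin is the capital of Germany."
    },
    %{
      score: 0.98,
      query: "What is the capital of France?",
      document: "Paris is the capital of France."
    }
  ]
}

# Sort from most to least relevant, then keep documents above the cutoff.
# The 0.5 threshold is an arbitrary choice for this sketch.
ranked =
  result.scores
  |> Enum.sort_by(& &1.score, :desc)
  |> Enum.filter(&(&1.score >= 0.5))
  |> Enum.map(& &1.document)

IO.inspect(ranked)
#=> ["Paris is the capital of France."]
```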
```elixir
{:ok, model_info} = Bumblebee.load_model({:hf, "Qwen/Qwen3-Reranker-0.6B"})
{:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "Qwen/Qwen3-Reranker-0.6B"})

serving =
  Bumblebee.Text.text_reranking_qwen3(model_info, tokenizer,
    compile: [batch_size: 4, sequence_length: 512],
    defn_options: [compiler: EXLA]
  )

query = "What is the capital of France?"

documents = [
  "Paris is the capital of France.",
  "Berlin is the capital of Germany."
]

pairs = Enum.map(documents, &{query, &1})

Nx.Serving.run(serving, pairs)
#=> %{scores: [%{score: 0.98, query: "...", document: "Paris is..."}, ...]}
```

--------------------------------

### Generate Images from Text with Stable Diffusion

Source: https://context7.com/elixir-nx/bumblebee/llms.txt

Generate images from text prompts using Stable Diffusion. Requires loading CLIP text encoder, UNet, VAE decoder, scheduler, and optionally a safety checker. Supports negative prompts and fixed seeds for reproducibility.

```elixir
repository_id = "CompVis/stable-diffusion-v1-4"

{:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "openai/clip-vit-large-patch14"})
{:ok, clip} = Bumblebee.load_model({:hf, repository_id, subdir: "text_encoder"})
{:ok, unet} = Bumblebee.load_model({:hf, repository_id, subdir: "unet"})
{:ok, vae} = Bumblebee.load_model({:hf, repository_id, subdir: "vae"}, architecture: :decoder)
{:ok, scheduler} = Bumblebee.load_scheduler({:hf, repository_id, subdir: "scheduler"})
{:ok, featurizer} = Bumblebee.load_featurizer({:hf, repository_id, subdir: "feature_extractor"})
{:ok, safety_checker} = Bumblebee.load_model({:hf, repository_id, subdir: "safety_checker"})

serving =
  Bumblebee.Diffusion.StableDiffusion.text_to_image(clip, unet, vae, tokenizer, scheduler,
    num_steps: 20,
    num_images_per_prompt: 2,
    guidance_scale: 7.5,
    safety_checker: safety_checker,
    safety_checker_featurizer: featurizer,
    compile: [batch_size: 1, sequence_length: 60],
    defn_options: [compiler: EXLA]
  )

Nx.Serving.run(serving, "numbat in forest, detailed, digital art")
#=> %{
#=>   results: [
#=>     %{image: #Nx.Tensor<...>, is_safe: true},
#=>     %{image: #Nx.Tensor<...>, is_safe: true}
#=>   ]
#=> }

# With negative prompt and fixed seed for reproducibility
Nx.Serving.run(serving, %{
  prompt: "a serene mountain lake at sunset",
  negative_prompt: "ugly, blurry, low quality",
  seed: 42
})
```

--------------------------------

### Load Model from Hugging Face Hub

Source: https://github.com/elixir-nx/bumblebee/blob/main/README.md

Use this function to load a model from the Hugging Face Hub. Ensure the model type is implemented in Bumblebee.

```elixir
Bumblebee.load_model({:hf, "model-repo"})
```

--------------------------------

### Configure Progress Bar Step

Source: https://github.com/elixir-nx/bumblebee/blob/main/README.md

Change how often the download progress bar updates. Set the step to 10 to update every 10% instead of every 1%.

```elixir
config :bumblebee, :progress_bar_step, 10
```

--------------------------------

### Bumblebee.load_scheduler/2

Source: https://context7.com/elixir-nx/bumblebee/llms.txt

Loads a noise scheduler (DDIM, PNDM, or LCM) used to control the denoising process in diffusion models.

```APIDOC
## `Bumblebee.load_scheduler/2` - Load a diffusion scheduler

Loads a noise scheduler (DDIM, PNDM, or LCM) used to control the denoising process in diffusion models.

```elixir
{:ok, scheduler} =
  Bumblebee.load_scheduler({:hf, "CompVis/stable-diffusion-v1-4", subdir: "scheduler"})

# Schedulers can also be used directly
{state, timesteps} = Bumblebee.scheduler_init(scheduler, 50, sample_template, prng_key)
{state, prev_sample} = Bumblebee.scheduler_step(scheduler, state, sample, prediction)
```
```

--------------------------------

### Bumblebee.Vision.image_to_text/5

Source: https://context7.com/elixir-nx/bumblebee/llms.txt

Generates natural language descriptions of images using multimodal models like BLIP. This function takes the loaded model, featurizer, tokenizer, and generation configuration as arguments.
```APIDOC
## Bumblebee.Vision.image_to_text/5

### Description
Generates natural language descriptions of images using multimodal models such as BLIP.

### Function Signature
`Bumblebee.Vision.image_to_text(blip, featurizer, tokenizer, generation_config, opts \\ [])`

### Parameters
- `blip`: The loaded BLIP model.
- `featurizer`: The featurizer for the model.
- `tokenizer`: The tokenizer for the model.
- `generation_config`: The generation configuration.
- `opts`: Optional arguments, such as `defn_options`.

### Request Example
```elixir
{:ok, blip} = Bumblebee.load_model({:hf, "Salesforce/blip-image-captioning-base"})
{:ok, featurizer} = Bumblebee.load_featurizer({:hf, "Salesforce/blip-image-captioning-base"})
{:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "Salesforce/blip-image-captioning-base"})

{:ok, generation_config} =
  Bumblebee.load_generation_config({:hf, "Salesforce/blip-image-captioning-base"})

serving =
  Bumblebee.Vision.image_to_text(blip, featurizer, tokenizer, generation_config,
    defn_options: [compiler: EXLA]
  )

image = StbImage.read_file!("/path/to/cat_on_chair.jpg")

Nx.Serving.run(serving, image)
```

### Response Example
```elixir
%{results: [%{text: "a cat sitting on a chair"}]}
```
```

--------------------------------

### Bumblebee.Text.text_reranking_qwen3/3

Source: https://context7.com/elixir-nx/bumblebee/llms.txt

Scores query-document relevance using Qwen3 instruction-tuned reranker models, outputting normalized relevance probabilities.

```APIDOC
## Bumblebee.Text.text_reranking_qwen3/3 - Text reranking with Qwen3 reranker models

### Description
Scores query-document relevance using Qwen3 instruction-tuned reranker models, outputting normalized relevance probabilities.

### Method Signature
`Bumblebee.Text.text_reranking_qwen3(model_info, tokenizer, opts)`

### Parameters
- `model_info`: Loaded model information.
- `tokenizer`: Loaded tokenizer.
- `opts`: Compilation and definition options, e.g., `compile: [batch_size: 4, sequence_length: 512]`, `defn_options: [compiler: EXLA]`.

### Request Example
```elixir
{:ok, model_info} = Bumblebee.load_model({:hf, "Qwen/Qwen3-Reranker-0.6B"})
{:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "Qwen/Qwen3-Reranker-0.6B"})

serving =
  Bumblebee.Text.text_reranking_qwen3(model_info, tokenizer,
    compile: [batch_size: 4, sequence_length: 512],
    defn_options: [compiler: EXLA]
  )

query = "What is the capital of France?"

documents = [
  "Paris is the capital of France.",
  "Berlin is the capital of Germany."
]

pairs = Enum.map(documents, &{query, &1})

Nx.Serving.run(serving, pairs)
```

### Response Example
```elixir
%{scores: [%{score: 0.98, query: "...", document: "Paris is..."}, ...]}
```
```

--------------------------------

### Bumblebee.Text.generation/4

Source: https://context7.com/elixir-nx/bumblebee/llms.txt

Generates text continuations from a given prompt. This function supports both standard batch generation and streaming output for real-time applications.

```APIDOC
## `Bumblebee.Text.generation/4` - Prompt-driven text generation

Generates text continuations from a prompt. Supports streaming output for real-time use cases.

### Standard Generation
```elixir
{:ok, model_info} = Bumblebee.load_model({:hf, "openai-community/gpt2"})
{:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "openai-community/gpt2"})
{:ok, generation_config} = Bumblebee.load_generation_config({:hf, "openai-community/gpt2"})

generation_config = Bumblebee.configure(generation_config, max_new_tokens: 15)

serving = Bumblebee.Text.generation(model_info, tokenizer, generation_config)

Nx.Serving.run(serving, "Elixir is a functional")
#=> %{
#=>   results: [
#=>     %{text: " programming language that is designed to be used in a variety of applications. It", token_summary: %{input: 5, output: 15, padding: 0}}
#=>   ]
#=> }
```

### Streaming Generation
```elixir
# Streaming generation - returns a lazy stream of text chunks
serving = Bumblebee.Text.generation(model_info, tokenizer, generation_config, stream: true)

Nx.Serving.run(serving, "Elixir is a functional") |> Enum.to_list()
#=> [" programming", " language", " that", " is", " designed", " to", " be", " used",
#=>  " in", " a", " variety", " of", " applications.", " It"]
```
```

--------------------------------

### Load Tokenizer with Revision

Source: https://github.com/elixir-nx/bumblebee/blob/main/README.md

Load a tokenizer at a specific revision. This is useful when the repository's `tokenizer.json` was generated in a PR that has not been merged.

```elixir
Bumblebee.load_tokenizer({:hf, "model-repo", revision: "..."})
```

--------------------------------

### Bumblebee.Text.zero_shot_classification/4

Source: https://context7.com/elixir-nx/bumblebee/llms.txt

Classifies text into arbitrary user-supplied labels without any task-specific fine-tuning, using natural language inference.

```APIDOC
## Bumblebee.Text.zero_shot_classification/4 - Zero-shot classification

### Description
Classifies text into arbitrary user-supplied labels without any task-specific fine-tuning, using natural language inference.

### Method Signature
`Bumblebee.Text.zero_shot_classification(model, tokenizer, labels, opts \\ [])`

### Parameters
- `model`: A loaded zero-shot classification model.
- `tokenizer`: A loaded tokenizer.
- `labels`: A list of arbitrary labels to classify the text into.
- `opts`: Optional keyword list of serving options.
### Request Example
```elixir
{:ok, model} = Bumblebee.load_model({:hf, "facebook/bart-large-mnli"})
{:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "facebook/bart-large-mnli"})

labels = ["cooking", "traveling", "dancing"]
serving = Bumblebee.Text.zero_shot_classification(model, tokenizer, labels)

Nx.Serving.run(serving, "One day I will see the world")
```

### Response Example
```elixir
%{predictions: [
  %{label: "cooking", score: 0.0070497458800673485},
  %{label: "traveling", score: 0.985000491142273},
  %{label: "dancing", score: 0.007949736900627613}
]}
```
```

--------------------------------

### Zero-Shot Text Classification with Bumblebee

Source: https://context7.com/elixir-nx/bumblebee/llms.txt

Classify text into user-defined labels without fine-tuning, leveraging natural language inference. Requires a model trained on NLI tasks and a list of target labels.

```elixir
{:ok, model} = Bumblebee.load_model({:hf, "facebook/bart-large-mnli"})
{:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "facebook/bart-large-mnli"})

labels = ["cooking", "traveling", "dancing"]
serving = Bumblebee.Text.zero_shot_classification(model, tokenizer, labels)

Nx.Serving.run(serving, "One day I will see the world")
#=> %{
#=>   predictions: [
#=>     %{label: "cooking", score: 0.0070497458800673485},
#=>     %{label: "traveling", score: 0.985000491142273},
#=>     %{label: "dancing", score: 0.007949736900627613}
#=>   ]
#=> }
```

--------------------------------

### Bumblebee.Diffusion.StableDiffusion.text_to_image/6

Source: https://context7.com/elixir-nx/bumblebee/llms.txt

Generates images from text prompts using Stable Diffusion. This function requires multiple submodels including CLIP text encoder, UNet, VAE decoder, and a noise scheduler, with optional safety checker integration.

```APIDOC
## Bumblebee.Diffusion.StableDiffusion.text_to_image/6

### Description
Generates images from text prompts using Stable Diffusion.
Requires loading multiple submodels: CLIP text encoder, UNet, VAE decoder, and a noise scheduler. Optionally integrates a safety checker.

### Function Signature
`Bumblebee.Diffusion.StableDiffusion.text_to_image(clip, unet, vae, tokenizer, scheduler, opts \\ [])`

### Parameters
- `clip`: The loaded CLIP text encoder model.
- `unet`: The loaded UNet model.
- `vae`: The loaded VAE decoder model.
- `tokenizer`: The loaded tokenizer.
- `scheduler`: The loaded noise scheduler.
- `opts`: Optional arguments, including `num_steps`, `num_images_per_prompt`, `guidance_scale`, `safety_checker`, `safety_checker_featurizer`, `compile`, and `defn_options`.

### Request Example (Basic)
```elixir
repository_id = "CompVis/stable-diffusion-v1-4"

{:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "openai/clip-vit-large-patch14"})
{:ok, clip} = Bumblebee.load_model({:hf, repository_id, subdir: "text_encoder"})
{:ok, unet} = Bumblebee.load_model({:hf, repository_id, subdir: "unet"})
{:ok, vae} = Bumblebee.load_model({:hf, repository_id, subdir: "vae"}, architecture: :decoder)
{:ok, scheduler} = Bumblebee.load_scheduler({:hf, repository_id, subdir: "scheduler"})
{:ok, featurizer} = Bumblebee.load_featurizer({:hf, repository_id, subdir: "feature_extractor"})
{:ok, safety_checker} = Bumblebee.load_model({:hf, repository_id, subdir: "safety_checker"})

serving = Bumblebee.Diffusion.StableDiffusion.text_to_image(clip, unet, vae, tokenizer, scheduler,
  num_steps: 20,
  num_images_per_prompt: 2,
  guidance_scale: 7.5,
  safety_checker: safety_checker,
  safety_checker_featurizer: featurizer,
  compile: [batch_size: 1, sequence_length: 60],
  defn_options: [compiler: EXLA]
)

Nx.Serving.run(serving, "numbat in forest, detailed, digital art")
```

### Response Example (Basic)
```elixir
%{results: [%{image: #Nx.Tensor, is_safe: true}, %{image: #Nx.Tensor, is_safe: true}]}
```

### Request Example (With Negative Prompt and Seed)
```elixir
# Assuming 'serving' is already defined as above
Nx.Serving.run(serving, %{
  prompt: "a serene mountain lake at sunset",
  negative_prompt: "ugly, blurry, low quality",
  seed: 42
})
```
```

--------------------------------

### Bumblebee.Text.question_answering/3

Source: https://context7.com/elixir-nx/bumblebee/llms.txt

Extracts the answer span directly from a context passage given a question. Returns ranked answer candidates with positional offsets.

```APIDOC
## Bumblebee.Text.question_answering/3 - Extractive question answering

### Description
Extracts the answer span directly from a context passage given a question. Returns ranked answer candidates with positional offsets.

### Method Signature
`Bumblebee.Text.question_answering(roberta, tokenizer, opts \\ [])`

### Parameters
- `roberta`: A loaded question answering model.
- `tokenizer`: A loaded tokenizer.
- `opts`: Optional keyword list of options.

### Request Example
```elixir
{:ok, roberta} = Bumblebee.load_model({:hf, "deepset/roberta-base-squad2"})
{:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "FacebookAI/roberta-base"})

serving = Bumblebee.Text.question_answering(roberta, tokenizer)

Nx.Serving.run(serving, %{
  question: "What's my name?",
  context: "My name is Sarah and I live in London."
})
```

### Response Example
```elixir
%{results: [%{end: 16, score: 0.81039959192276, start: 11, text: "Sarah"}]}
```
```

--------------------------------

### Extract Image Embeddings with CLIP

Source: https://context7.com/elixir-nx/bumblebee/llms.txt

Extract dense vector representations of images for retrieval or similarity search. Requires a CLIP model and a featurizer. Setting the `embedding_processor` option to `:l2_norm` L2-normalizes the resulting embedding.
```elixir
{:ok, clip} =
  Bumblebee.load_model({:hf, "openai/clip-vit-base-patch32"},
    module: Bumblebee.Vision.ClipVision
  )

{:ok, featurizer} = Bumblebee.load_featurizer({:hf, "openai/clip-vit-base-patch32"})

serving = Bumblebee.Vision.image_embedding(clip, featurizer,
  embedding_processor: :l2_norm
)

image = StbImage.read_file!("/path/to/image.jpg")
Nx.Serving.run(serving, image)
#=> %{
#=>   embedding: #Nx.Tensor<
#=>     f32[768]
#=>     [-0.43403682112693787, 0.09786412119865417, ...]
#=>   >
#=> }
```

--------------------------------

### Fill Mask Task with Bumblebee

Source: https://context7.com/elixir-nx/bumblebee/llms.txt

Use `Bumblebee.Text.fill_mask/3` to predict tokens for a `[MASK]` placeholder. Load the model and tokenizer first. The `top_k` option controls the number of predictions.

```elixir
{:ok, bert} = Bumblebee.load_model({:hf, "google-bert/bert-base-uncased"})
{:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "google-bert/bert-base-uncased"})

serving = Bumblebee.Text.fill_mask(bert, tokenizer, top_k: 5)

Nx.Serving.run(serving, "The capital of [MASK] is Paris.")
#=> %{
#=>   predictions: [
#=>     %{score: 0.9279842972755432, token: "france"},
#=>     %{score: 0.008412551134824753, token: "brittany"},
#=>     %{score: 0.007433671969920397, token: "algeria"},
#=>     %{score: 0.004957548808306456, token: "department"},
#=>     %{score: 0.004369721747934818, token: "reunion"}
#=>   ]
#=> }
```
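
--------------------------------

### Post-Process Fill Mask Predictions (Illustrative Sketch)

The fill-mask result is an ordinary map, so it can be post-processed with plain Elixir. The sketch below is illustrative only: `FillMaskExample.fill_best/3` is a hypothetical helper that is not part of Bumblebee, and it assumes the result has the `%{predictions: [%{score: ..., token: ...}]}` shape shown in the fill-mask output. The `result` map is hard-coded from that output so the snippet runs without loading a model.

```elixir
defmodule FillMaskExample do
  # Hypothetical helper (not a Bumblebee function): picks the highest-scoring
  # prediction and substitutes its token for the first mask placeholder.
  # Raises Enum.EmptyError if the predictions list is empty.
  def fill_best(text, %{predictions: predictions}, mask \\ "[MASK]") do
    best = Enum.max_by(predictions, & &1.score)
    String.replace(text, mask, best.token, global: false)
  end
end

# Hard-coded stand-in for the map returned by Nx.Serving.run/2 (assumed shape).
result = %{
  predictions: [
    %{score: 0.9279842972755432, token: "france"},
    %{score: 0.008412551134824753, token: "brittany"}
  ]
}

FillMaskExample.fill_best("The capital of [MASK] is Paris.", result)
#=> "The capital of france is Paris."
```

In a real pipeline, `result` would be the value returned by `Nx.Serving.run(serving, text)`. The token is lowercase because the example model, `bert-base-uncased`, uses an uncased vocabulary.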