### Install PromptSource

Source: https://context7.com/bigscience-workshop/promptsource/llms.txt

Install PromptSource from PyPI for using existing prompts. For creating new prompts, clone the repository and install from source.

```bash
pip install promptsource
```

```bash
git clone https://github.com/bigscience-workshop/promptsource.git
cd promptsource
pip install -e .
```

--------------------------------

### Apply a prompt to a dataset example

Source: https://github.com/bigscience-workshop/promptsource/blob/main/README.md

Demonstrates applying a prompt template to a specific example and printing the resulting input and target.

```python
>>> result = prompt.apply(example)
>>> print("INPUT: ", result[0])
INPUT: What label best describes this news article?
Carlyle Looks Toward Commercial Aerospace (Reuters) Reuters - Private investment firm Carlyle Group,\ which has a reputation for making well-timed and occasionally\ controversial plays in the defense industry, has quietly placed\ its bets on another part of the market.
>>> print("TARGET: ", result[1])
TARGET: Business
```

--------------------------------

### Install PromptSource

Source: https://github.com/bigscience-workshop/promptsource/blob/main/README.md

Commands for installing the PromptSource package for usage or local development.

```bash
pip install promptsource
```

```bash
pip install -e .
```

--------------------------------

### Launch the PromptSource app locally

Source: https://github.com/bigscience-workshop/promptsource/blob/main/README.md

Command to start the Streamlit application from the repository root.

```bash
streamlit run promptsource/app.py
```

--------------------------------

### Launch PromptSource Application

Source: https://context7.com/bigscience-workshop/promptsource/llms.txt

Commands to start the Streamlit application with various configuration options.
```bash
streamlit run promptsource/app.py
```

```bash
streamlit run promptsource/app.py -- --read-only
```

```bash
export PROMPTSOURCE_MANUAL_DATASET_DIR=/path/to/datasets
streamlit run promptsource/app.py
```

--------------------------------

### Apply Template to Dataset Example

Source: https://context7.com/bigscience-workshop/promptsource/llms.txt

Use the apply() method to render a Jinja template with data from a dataset example, generating input and target strings. Truncation and variable highlighting can be controlled.

```python
from datasets import load_dataset
from promptsource.templates import DatasetTemplates

dataset = load_dataset("ag_news", split="train")
example = dataset[1]
print(f"Example: {example}")

ag_news_prompts = DatasetTemplates('ag_news')
template = ag_news_prompts["classify_question_first"]

input_text, target_text = template.apply(example)
print(f"INPUT: {input_text}")
print(f"TARGET: {target_text}")

input_text, target_text = template.apply(example, truncate=False)
input_text, target_text = template.apply(example, highlight_variables=True)
```

--------------------------------

### Control template behavior with logic

Source: https://github.com/bigscience-workshop/promptsource/blob/main/CONTRIBUTING.md

Use lists or dictionaries to map example values to human-readable labels.

```jinja2
The label for this example is {{ ["Label A", "Label B"][label] }}.
```

```jinja2
The label for this example is {{ {"a": "Label A", "b": "Label B" }[label] }}.
```

--------------------------------

### Instantiate and Use DatasetTemplates

Source: https://github.com/bigscience-workshop/promptsource/blob/main/API_DOCUMENTATION.md

Instantiate DatasetTemplates to access prompts for a given dataset and subset. Use it to get the number of prompts or a list of all template names.
```python
from promptsource.templates import DatasetTemplates

dataset_name, subset_name = "super_glue", "rte"  # example values; subset_name may be None
template_key = f"{dataset_name}/{subset_name}" if subset_name is not None else dataset_name
prompts = DatasetTemplates(template_key)
len(prompts)  # Returns the number of prompts for the given dataset
prompts.all_template_names  # Returns a sorted list of all template names for this dataset
```

--------------------------------

### Access dataset fields

Source: https://github.com/bigscience-workshop/promptsource/blob/main/CONTRIBUTING.md

Use double curly braces to print values from the example dictionary.

```jinja2
The text in this example is {{ text }}.
```

--------------------------------

### Write Jinja Templates for Prompts

Source: https://context7.com/bigscience-workshop/promptsource/llms.txt

Examples of Jinja templates used to define prompts, including conditional logic, list operations, and answer choice mapping.

```jinja2
{# Basic template accessing dataset fields #}
{{ premise }}
Question: Does this imply that "{{ hypothesis }}"? Yes, no, or maybe? ||| {{ answer_choices[label] }}

{# Template with conditional logic #}
{% if label_coarse == 0 %}
Is this question asking for a {{"definition"}}, {{"description"}}, {{"manner"}}, or {{"reason"}}?
{{ text }}
|||
{{ {0: "Manner", 7: "Definition", 9: "Reason", 12: "Description"}[label_fine] }}
{% endif %}

{# Template with list operations #}
Given these statements about {{ category }}:
{{ answers | map(attribute="atext") | map("lower") | join(", ") }}.
Which is the most appropriate answer?
{{ qtext }}
|||
{% for answer in answers if answer["aid"] == ra -%}
{{ answer["atext"] }}
{%- endfor %}

{# Template using answer_choices variable #}
{{ text }}
Which of the following sections would this article appear in?
{{ answer_choices[0] }}, {{ answer_choices[1] }}, {{ answer_choices[2] }}, or {{ answer_choices[3] }}?
||| {{ answer_choices[label] }}
```

--------------------------------

### Define a prompt template in Jinja

Source: https://github.com/bigscience-workshop/promptsource/blob/main/README.md

Example of a Jinja template used to format SNLI dataset entries into a prompt structure.

```jinja2
{{premise}}

Question: Does this imply that "{{hypothesis}}"? Yes, no, or maybe? ||| {{answer_choices[label]}}
```

--------------------------------

### Render literal text

Source: https://github.com/bigscience-workshop/promptsource/blob/main/CONTRIBUTING.md

Use plain text for content that remains constant across all examples.

```jinja2
This is just literal text that will be printed the same way every time.
```

--------------------------------

### Filter Prompts with Jinja for TREC Dataset

Source: https://github.com/bigscience-workshop/promptsource/blob/main/CONTRIBUTING.md

Use Jinja if statements to apply prompts only to a subset of examples based on specific labels. This example applies the prompt only to TREC examples whose coarse-grained label is 0.

```jinja2
{% if label_coarse == 0 %}
Is this question asking for a {{"definition"}}, a {{"description"}}, a {{"manner of action"}}, or a {{"reason"}}?
{{text}}
|||
{{ {0: "Manner", 7: "Definition", 9: "Reason", 12: "Description"}[label_fine] }}
{% endif %}
```

--------------------------------

### Template.apply() Method

Source: https://context7.com/bigscience-workshop/promptsource/llms.txt

The apply method renders a Jinja template using data from a specific dataset example.

```APIDOC
## Template.apply()

### Description
Creates a prompted example by rendering the Jinja template with data from a dataset example, returning a tuple of input and target strings.

### Parameters
- **example** (dict) - Required - A dictionary representing a single row from a dataset.
- **truncate** (bool) - Optional - Whether to truncate the input text. Defaults to True.
- **highlight_variables** (bool) - Optional - Whether to highlight variables for debugging. Defaults to False.

### Returns
- **tuple** - (input_text, target_text) strings.

### Example
```python
input_text, target_text = template.apply(example, truncate=False, highlight_variables=True)
```
```

--------------------------------

### Convert Infix Prompt to Conditional Generation Format

Source: https://github.com/bigscience-workshop/promptsource/blob/main/CONTRIBUTING.md

Transform an 'infix' prompt format into a conditional generation format using Jinja. This example converts a premise-hypothesis relationship prompt into a format suitable for generative models.

```jinja2
Given that {{premise}}, it {{ ["must be true", "might be true", "must be false"][label] }} that {{hypothesis}}
```

```jinja2
Given that {{premise}}, it {{ "must be true, might be true, or must be false" }} that {{hypothesis}}?|||
{{ ["must be true", "might be true", "must be false"][label] }}
```

--------------------------------

### Handle Splits Without Labels using Jinja

Source: https://github.com/bigscience-workshop/promptsource/blob/main/CONTRIBUTING.md

Wrap the target side in a conditional statement to handle datasets with splits lacking labels, such as test splits without ground truth. This example for super_glue/boolq ensures the target is generated only when a label exists.

```jinja2
{{ passage }}
Question: {{ question }}
Answer:
|||
{% if label != -1 %}
{{ answer_choices[label] }}
{% endif %}
```

--------------------------------

### Retrieve prompts for a dataset subset

Source: https://github.com/bigscience-workshop/promptsource/blob/main/README.md

Shows how to load a specific dataset subset and initialize the corresponding DatasetTemplates.
```python
from datasets import load_dataset
from promptsource.templates import DatasetTemplates

dataset_name, subset_name = "super_glue", "rte"
# load_dataset takes the subset (configuration name) as its second argument
dataset = load_dataset(dataset_name, subset_name, split="train")
example = dataset[0]
prompts = DatasetTemplates(f"{dataset_name}/{subset_name}")
```

--------------------------------

### Apply prompts to datasets

Source: https://github.com/bigscience-workshop/promptsource/blob/main/README.md

Demonstrates loading a dataset from Hugging Face and applying templates using the PromptSource API.

```python
# Load an example from the datasets ag_news
>>> from datasets import load_dataset
>>> dataset = load_dataset("ag_news", split="train")
>>> example = dataset[1]

# Load prompts for this dataset
>>> from promptsource.templates import DatasetTemplates
>>> ag_news_prompts = DatasetTemplates('ag_news')

# Print all the prompts available for this dataset. The keys of the dict are the UUIDs that uniquely identify each prompt, and the values are instances of `Template`, which wraps prompts
>>> print(ag_news_prompts.templates)
{'24e44a81-a18a-42dd-a71c-5b31b2d2cb39': <promptsource.templates.Template object at 0x...>,
 '8fdc1056-1029-41a1-9c67-354fc2b8ceaf': <promptsource.templates.Template object at 0x...>,
 '918267e0-af68-4117-892d-2dbe66a58ce9': <promptsource.templates.Template object at 0x...>,
 '9345df33-4f23-4944-a33c-eef94e626862': <promptsource.templates.Template object at 0x...>,
 '98534347-fff7-4c39-a795-4e69a44791f7': <promptsource.templates.Template object at 0x...>,
 'b401b0ee-6ffe-4a91-8e15-77ee073cd858': <promptsource.templates.Template object at 0x...>,
 'cb355f33-7e8c-4455-a72b-48d315bd4f60': <promptsource.templates.Template object at 0x...>}

# Select a prompt by its name
>>> prompt = ag_news_prompts["classify_question_first"]
```

--------------------------------

### Initialize DatasetTemplates

Source: https://context7.com/bigscience-workshop/promptsource/llms.txt

Load prompts for a specific dataset using the DatasetTemplates class. Access template names and count available prompts.
```python
from datasets import load_dataset
from promptsource.templates import DatasetTemplates

ag_news_prompts = DatasetTemplates('ag_news')
print(f"Number of prompts: {len(ag_news_prompts)}")
print(f"Template names: {ag_news_prompts.all_template_names}")

template = ag_news_prompts["classify_question_first"]

rte_prompts = DatasetTemplates("super_glue/rte")
print(f"RTE prompts available: {len(rte_prompts)}")
```

--------------------------------

### Head_QA Prompt Template

Source: https://github.com/bigscience-workshop/promptsource/blob/main/CONTRIBUTING.md

Demonstrates using filters like map and join, alongside conditional for-loops to process complex dataset schemas.

```jinja2
Given this list of statements about {{category}}: {{ answers | map(attribute="atext") | map("lower") | map("trim", ".") | join(", ") }}.
Which one is the most appropriate answer/completion for the paragraph that follows?
{{qtext}}
|||
{% for answer in answers if answer["aid"]==ra -%}
{{answer["atext"]}}
{%- endfor %}
```

--------------------------------

### Initialize TemplateCollection

Source: https://context7.com/bigscience-workshop/promptsource/llms.txt

Aggregate all prompts from all datasets using TemplateCollection. Access dataset templates and counts per dataset.

```python
from promptsource.templates import TemplateCollection

collection = TemplateCollection()
print(f"Total datasets: {len(collection)}")

for key in list(collection.datasets_templates.keys())[:5]:
    print(key)

rte_templates = collection.get_dataset("super_glue", "rte")
print(f"RTE templates: {rte_templates.all_template_names}")

template_counts = collection.get_templates_count()
print(f"AG News templates: {template_counts.get('ag_news', 0)}")
```

--------------------------------

### Collect all available prompts

Source: https://github.com/bigscience-workshop/promptsource/blob/main/README.md

Initializes a TemplateCollection to access all prompts associated with datasets.
```python
>>> from promptsource.templates import TemplateCollection

# Get all the prompts available in PromptSource
>>> collection = TemplateCollection()

# Print a dict where the key is the pair (dataset name, subset name)
# and the value is an instance of DatasetTemplates
>>> print(collection.datasets_templates)
{('poem_sentiment', None): <promptsource.templates.DatasetTemplates object at 0x...>, ('common_gen', None): <promptsource.templates.DatasetTemplates object at 0x...>, ('anli', None): <promptsource.templates.DatasetTemplates object at 0x...>, ('cc_news', None): <promptsource.templates.DatasetTemplates object at 0x...>, ('craigslist_bargains', None): <promptsource.templates.DatasetTemplates object at 0x...>,...}
```

--------------------------------

### DatasetTemplates Class Usage

Source: https://github.com/bigscience-workshop/promptsource/blob/main/API_DOCUMENTATION.md

Demonstrates how to use the DatasetTemplates class to manage prompts for a specific dataset.

```APIDOC
## DatasetTemplates Class Usage

### Description
Manages all prompts for a specific dataset/subset, providing functionality to read/write prompts from/to YAML files.

### Instantiation and Usage
To get existing prompts and their names for a given dataset:

```python
>>> from promptsource.templates import DatasetTemplates
>>> dataset_name = "your_dataset_name"
>>> subset_name = "your_subset_name"  # Optional
>>> template_key = f"{dataset_name}/{subset_name}" if subset_name is not None else dataset_name
>>> prompts = DatasetTemplates(template_key)
```

### Methods

#### `len(prompts)`
- **Description**: Returns the number of prompts for the given dataset.

#### `prompts.all_template_names`
- **Description**: Returns a sorted list of all template names for this dataset.
```

--------------------------------

### Creating Custom Templates

Source: https://context7.com/bigscience-workshop/promptsource/llms.txt

Define a new template using Jinja syntax and configure its associated metadata.

```python
from promptsource.templates import Template, DatasetTemplates
from datasets import load_dataset

# Create a new template
custom_template = Template(
    name="my_custom_prompt",
    jinja="""Classify the following news article into one of these categories:
{{ answer_choices | join(", ") }}.

Article: {{ text }}

Category: ||| {{ answer_choices[label] }}""",
    reference="Custom prompt by developer",
    answer_choices="Politics ||| Sports ||| Business ||| Technology"
)

# Set metadata
custom_template.metadata.original_task = True
custom_template.metadata.choices_in_prompt = True
custom_template.metadata.metrics = ["Accuracy"]
custom_template.metadata.languages = ["en"]

# Test the template
dataset = load_dataset("ag_news", split="train")
example = dataset[0]
input_text, target_text = custom_template.apply(example)

print(f"INPUT:\n{input_text}")
# Output: INPUT:
# Classify the following news article into one of these categories:
# Politics, Sports, Business, Technology.
#
# Article: Wall St. Bears Claw Back Into the Black (Reuters)...
#
# Category:

print(f"TARGET: {target_text}")
# Output: TARGET: Business
```

--------------------------------

### PAWS Prompt Template

Source: https://github.com/bigscience-workshop/promptsource/blob/main/CONTRIBUTING.md

A simple template for binary classification tasks where answer choices do not require escaping.

```jinja2
Sentence 1: {{sentence1}}
Sentence 2: {{sentence2}}
Question: Does Sentence 1 paraphrase Sentence 2? Yes or No?
|||
{{answer_choices[label]}}
```

--------------------------------

### Define Static Answer Choices

Source: https://github.com/bigscience-workshop/promptsource/blob/main/CONTRIBUTING.md

Use a string with choices separated by three vertical bars (`|||`) to define a fixed set of answer choices for a template.

```jinja2
World News ||| Sports ||| Business ||| Science and Technology
```

--------------------------------

### SNLI Prompt: Task Description and Input

Source: https://github.com/bigscience-workshop/promptsource/blob/main/CONTRIBUTING.md

Format a prompt with a task description followed by the input, suitable for tasks like SNLI. The target is separated by '|||'.

```jinja2
Determine the relation between the following two sentences. The relations are entailment, contradiction, or neutral.
{{premise}}
{{hypothesis}} ||| {{label}}
```

--------------------------------

### Hellaswag Prompt Template

Source: https://github.com/bigscience-workshop/promptsource/blob/main/CONTRIBUTING.md

Uses string manipulation functions to format context and answer choices for the Hellaswag dataset.

```jinja2
First, {{ ctx_a.lower() }} Then, {{ ctx_b.lower() }}...
Complete the above description with a chosen ending:
(a) {{ answer_choices[0] }}
(b) {{ answer_choices[1] }}
(c) {{ answer_choices[2] }}
(d) {{ answer_choices[3] }}
||| {{ answer_choices[label | int()] }}
```

--------------------------------

### Define input and target

Source: https://github.com/bigscience-workshop/promptsource/blob/main/CONTRIBUTING.md

Use three vertical bars to separate the input string from the target string.

```jinja2
I'm working on the final exam for my class and am trying to figure out the answer to the question "{{question}}" I found the following info on Wikipedia and I think it has the answer. Can you tell me the answer?
{{context}}
|||
{{answers["text"][0]}}
```

--------------------------------

### Template Class Methods

Source: https://github.com/bigscience-workshop/promptsource/blob/main/API_DOCUMENTATION.md

Details on the methods available for the Template class, used to wrap and apply prompts.

```APIDOC
## Template Class Methods

### Description
Provides methods to apply a prompt to an example, retrieve prompt identifiers, and get answer choices.

### Methods

#### `apply(example, truncate=True, highlight_variables=False)`
- **Description**: Creates a prompted example by applying the template to the given example.
- **Parameters**:
  - `example` (Dict) - Required - The dataset example to create a prompt for.
  - `truncate` (Bool) - Optional - Defaults to `True`. If True, example fields will be truncated to `TEXT_VAR_LENGTH` chars.
  - `highlight_variables` (Bool) - Optional - Defaults to `False`. Highlight the added variables (internal use for the app rendering).

#### `get_id()`
- **Description**: Gets the UUID of the prompt.

#### `get_name()`
- **Description**: Gets the name of the prompt.

#### `get_reference()`
- **Description**: Gets any additional information about the prompt (such as bibliographic reference).

#### `get_answer_choices_list(example)`
- **Description**: If applicable, returns a list of answer choices for a given example.
- **Parameters**:
  - `example` (Dict) - Required - The dataset example to retrieve answer choices for.
```

--------------------------------

### Exporting Templates to DataFrame

Source: https://context7.com/bigscience-workshop/promptsource/llms.txt

Load all available templates into a Pandas DataFrame for filtering and analysis.

```python
from promptsource.templates import get_templates_data_frame

# Get all templates as a DataFrame
df = get_templates_data_frame()
print(f"Total templates: {len(df)}")
# Output: Total templates: ~2000

print(f"Columns: {df.columns.tolist()}")
# Output: Columns: ['id', 'dataset', 'subset', 'name', 'reference',
#                   'original_task', 'choices_in_prompt', 'metrics', 'languages',
#                   'answer_choices', 'jinja']

# Filter templates by dataset
ag_news_templates = df[df['dataset'] == 'ag_news']
print(f"AG News templates:\n{ag_news_templates[['name', 'original_task', 'metrics']]}")

# Find all templates with accuracy metric
accuracy_templates = df[df['metrics'].apply(lambda x: 'Accuracy' in x if x else False)]
print(f"Templates with Accuracy metric: {len(accuracy_templates)}")

# Find English-only templates
english_templates = df[df['languages'].apply(lambda x: x == ['en'] if x else False)]
print(f"English-only templates: {len(english_templates)}")
```

--------------------------------

### Jinja Cookbook Patterns

Source: https://github.com/bigscience-workshop/promptsource/blob/main/CONTRIBUTING.md

Common Jinja2 patterns for accessing attributes, joining lists, conditional logic, and zipping lists.
```jinja
{{ answers_spans.spans }}
```

```jinja
{{ spans_list | join(", ") }}
```

```jinja
{% if label==0 %}
do_something
{% elif condition %}
do_something_else
{% endif %}
```

```jinja
{% for a, b in zip(list_A, list_B) %}
do_something_with_a_and_b
{% endfor %}
```

--------------------------------

### Accessing Template Answer Choices

Source: https://context7.com/bigscience-workshop/promptsource/llms.txt

Retrieve dynamic or static answer choices and raw Jinja expressions from a template.

```python
from datasets import load_dataset
from promptsource.templates import DatasetTemplates

dataset = load_dataset("ag_news", split="train")
example = dataset[0]

ag_news_prompts = DatasetTemplates('ag_news')
template = ag_news_prompts["classify_question_first"]

# Get available answer choices for the template
choices = template.get_answer_choices_list(example)
print(f"Answer choices: {choices}")
# Output: Answer choices: ['World politics', 'Sports', 'Business', 'Science and technology']

# For templates with static answer choices (not dependent on example)
fixed_choices = template.get_fixed_answer_choices_list()
print(f"Fixed choices: {fixed_choices}")
# Output: Fixed choices: ['World politics', 'Sports', 'Business', 'Science and technology']

# Access the raw Jinja expression for answer choices
answer_expr = template.get_answer_choices_expr()
print(f"Answer choices expression: {answer_expr}")
# Output: Answer choices expression: World politics ||| Sports ||| Business ||| Science and technology
```

--------------------------------

### DatasetTemplates Class

Source: https://context7.com/bigscience-workshop/promptsource/llms.txt

The DatasetTemplates class is used to load and manage all prompt templates associated with a specific dataset or subset.

```APIDOC
## DatasetTemplates Class

### Description
Wraps all prompts for a specific dataset or subset and provides methods for reading, accessing, and managing prompt templates stored in YAML files.

### Initialization
`DatasetTemplates(dataset_name)`

### Parameters
- **dataset_name** (string) - Required - The name of the dataset (e.g., 'ag_news') or 'dataset_name/subset_name' for datasets with subsets.

### Properties
- **all_template_names** (list) - A sorted list of all available template names for the dataset.

### Example
```python
from promptsource.templates import DatasetTemplates

ag_news_prompts = DatasetTemplates('ag_news')

# Access a specific template
template = ag_news_prompts['classify_question_first']
```
```

--------------------------------

### SNLI Prompt: Question-Answer with Choices

Source: https://github.com/bigscience-workshop/promptsource/blob/main/CONTRIBUTING.md

Format a prompt as a question-answer pair with optional multiple choices, suitable for tasks like SNLI. The target is separated by '|||'.

```jinja2
{{premise}}
Is it the case that {{hypothesis}}?
{{ "Yes" }}, {{ "No" }}, {{ "Maybe" }} ||| {{ ["Yes", "No", "Maybe"][label] }}
```

--------------------------------

### TemplateCollection Class Methods

Source: https://github.com/bigscience-workshop/promptsource/blob/main/API_DOCUMENTATION.md

Details on the methods of the TemplateCollection class, which aggregates all prompts available in PromptSource.

```APIDOC
## TemplateCollection Class Methods

### Description
Aggregates all prompts available under PromptSource by wrapping `DatasetTemplates` objects.

### Methods

#### `get_dataset(dataset_name, subset_name=None)`
- **Description**: Returns the `DatasetTemplates` object corresponding to the dataset name.
- **Parameters**:
  - `dataset_name` (String) - Required - Name of the dataset to get.
  - `subset_name` (String) - Optional - Defaults to `None`. Name of the subset.

#### `get_templates_count()`
- **Description**: Returns the number of templates per dataset, as a dictionary keyed by dataset name. Note: the count for a dataset includes the templates of all its subsets.
```

--------------------------------

### Accessing Template Metadata

Source: https://context7.com/bigscience-workshop/promptsource/llms.txt

Retrieve identification details and task-specific metadata from a template object.

```python
from promptsource.templates import DatasetTemplates

ag_news_prompts = DatasetTemplates('ag_news')
template = ag_news_prompts["classify_with_choices"]

# Access template identification
print(f"Template ID: {template.get_id()}")
# Output: Template ID: b401b0ee-6ffe-4a91-8e15-77ee073cd858

print(f"Template name: {template.get_name()}")
# Output: Template name: classify_with_choices

print(f"Reference: {template.get_reference()}")
# Output: Reference: (empty string or citation)

# Access metadata
metadata = template.metadata
print(f"Original task: {metadata.original_task}")
# Output: Original task: True

print(f"Choices in prompt: {metadata.choices_in_prompt}")
# Output: Choices in prompt: True

print(f"Evaluation metrics: {metadata.metrics}")
# Output: Evaluation metrics: ['Accuracy']

print(f"Languages: {metadata.languages}")
# Output: Languages: ['en']
```

--------------------------------

### Extract Dynamic Answer Choices

Source: https://github.com/bigscience-workshop/promptsource/blob/main/CONTRIBUTING.md

Use Jinja expressions to extract example-specific choices directly from the underlying dataset.

```jinja2
{{choices.text | join("|||")}}
```

--------------------------------

### Template.get_answer_choices_list()

Source: https://context7.com/bigscience-workshop/promptsource/llms.txt

Retrieves the list of valid answer choices for a classification task by rendering the Jinja expression.

```APIDOC
## Template.get_answer_choices_list()

### Description
Returns the list of valid answer choices for classification tasks, rendering the Jinja answer_choices expression for a given example.

### Parameters
- **example** (dict) - Required - The dataset example used to render the answer choices expression.
```

--------------------------------

### TemplateCollection Class

Source: https://context7.com/bigscience-workshop/promptsource/llms.txt

The TemplateCollection class provides an interface to access all templates across all datasets in the PromptSource library.

```APIDOC
## TemplateCollection Class

### Description
Aggregates all prompts from all datasets in PromptSource, providing access to the complete collection of templates and counts.

### Methods
- **get_dataset(dataset_name, subset_name)** - Returns a DatasetTemplates object for the specified dataset.
- **get_templates_count()** - Returns a dictionary mapping dataset names to the number of available templates.

### Properties
- **datasets_templates** (dict) - A dictionary where keys are tuples of (dataset_name, subset_name) and values are DatasetTemplates objects.
```

--------------------------------

### get_templates_data_frame()

Source: https://context7.com/bigscience-workshop/promptsource/llms.txt

Exports all available templates and their associated metadata into a Pandas DataFrame.

```APIDOC
## get_templates_data_frame()

### Description
Exports all templates and their metadata into a Pandas DataFrame for analysis and exploration.

### Response
- **DataFrame** (pandas.DataFrame) - A DataFrame containing columns: id, dataset, subset, name, reference, original_task, choices_in_prompt, metrics, languages, answer_choices, and jinja.
```

--------------------------------

### Reference Answer Choices in Templates

Source: https://github.com/bigscience-workshop/promptsource/blob/main/CONTRIBUTING.md

Access the answer_choices list within Jinja templates to ensure consistency between input and target fields.

```jinja2
{{text}} Which of the following sections of a newspaper would this article likely appear in? {{answer_choices[0]}}, {{answer_choices[1]}}, {{answer_choices[2]}}, or {{answer_choices[3]}}?
||| {{ answer_choices[label] }}
```

--------------------------------

### Protect static content

Source: https://github.com/bigscience-workshop/promptsource/blob/main/CONTRIBUTING.md

Surround content with double curly braces and quotes to prevent modification by data augmentation.

```jinja2
The choices are {{"a"}}, {{"b"}}, and {{"c"}}.
```

--------------------------------

### Metadata Attributes

Source: https://github.com/bigscience-workshop/promptsource/blob/main/API_DOCUMENTATION.md

Details on the attributes of the Metadata class, which encapsulates prompt-specific information.

```APIDOC
## Metadata Attributes

### Description
Encapsulates metadata associated with a prompt, including task originality, choice inclusion, and evaluation metrics.

### Attributes
- **`original_task`** (Boolean) - If True, this prompt asks a model to perform the original task designed for this dataset.
- **`choices_in_prompt`** (Boolean) - If True, the answer choices are included in the templates such that models see those choices in the input. Only applicable to classification tasks.
- **`metrics`** (List[String]) - List of strings denoting metrics to use for evaluation.
```

--------------------------------

### Template Metadata Access

Source: https://context7.com/bigscience-workshop/promptsource/llms.txt

Accessing metadata properties of a template object such as task type, metrics, and language support.

```APIDOC
## Template Metadata Access

### Description
Access annotations about the task type, metrics, and whether answer choices are shown in the prompt via the metadata attribute.

### Response
- **original_task** (bool) - Indicates if the template represents the original task.
- **choices_in_prompt** (bool) - Indicates if answer choices are included in the prompt.
- **metrics** (list) - List of evaluation metrics associated with the template.
- **languages** (list) - List of languages supported by the template.
```
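
--------------------------------

### Sketch: how the `|||` separator divides choices and targets

Several snippets above rely on the same convention: three vertical bars separate static answer choices from one another, and separate the input from the target in a rendered prompt. The following is a minimal, dependency-free sketch of that convention only. The helpers `split_answer_choices` and `split_rendered_prompt` are hypothetical names introduced for illustration; they are not part of the promptsource API and do not reproduce how the library renders Jinja templates.

```python
from typing import List, Tuple

SEPARATOR = "|||"  # separator convention described in the snippets above


def split_answer_choices(choices_expr: str) -> List[str]:
    """Split a static answer-choices string into a list of stripped labels.

    Hypothetical helper for illustration; not a promptsource function.
    """
    return [choice.strip() for choice in choices_expr.split(SEPARATOR)]


def split_rendered_prompt(rendered: str) -> Tuple[str, str]:
    """Split an already-rendered prompt into (input, target) at the first separator.

    Hypothetical helper for illustration; the target is empty if no separator is present.
    """
    input_text, _, target_text = rendered.partition(SEPARATOR)
    return input_text.strip(), target_text.strip()


choices = split_answer_choices(
    "World News ||| Sports ||| Business ||| Science and Technology"
)
label = 2  # assumed integer label, as in a classification example

# Stand-in for a rendered template: the choices appear in the input,
# and the target is looked up from the same list by label.
rendered = (
    "Which section would this article appear in? "
    f"{', '.join(choices)}? ||| {choices[label]}"
)
input_text, target_text = split_rendered_prompt(rendered)

print(choices)
print(input_text)
print(target_text)
```

Looking the target up in the same list that is shown in the input mirrors the reason the snippets above reference `answer_choices` on both sides of the separator: the two cannot drift apart.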