### Execute Shell Commands for Setup
Source: https://github.com/visualcomments/top-papers-graph.git/blob/main/notebooks/task3_dual_local_models_blind_ab_kaggle_offline.ipynb
Shows example shell commands executed during the setup process, including cloning a git repository and installing Python packages using pip.
```shell
git clone --depth 1 https://github.com/top-papers/top-papers-graph.git /content/top-papers-graph
/usr/bin/python3 -m pip install -q ipywidgets pyyaml requests unidecode nbformat
/usr/bin/python3 -m pip install -q -e .[task3]
```
--------------------------------
### Local Setup and Server Start
Source: https://github.com/visualcomments/top-papers-graph.git/blob/main/src/scireason/mcp/USAGE_RU.md
Steps to set up a local Python environment and start the MCP server for testing.
```bash
cd /home/karaluv/proga/top-papers-graph
python -m venv .venv
source .venv/bin/activate
python -m pip install -U pip
python -m pip install -e '.[mcp]'
```
```bash
PYTHONPATH=src python -m scireason.mcp.server
```
--------------------------------
### Save Quick Setup
Source: https://github.com/visualcomments/top-papers-graph.git/blob/main/notebooks/task3_dual_local_models_blind_ab_colab.ipynb
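Prints a confirmation that the quick setup has been saved (the message reads: "Quick setup saved. You can now run the repository installation cell."). Run this before the repository installation cell.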
```python
print("Быстрый сетап сохранён. Теперь можно запускать ячейку установки репозитория.")
```
--------------------------------
### Quick Setup UI for Repository Installation
Source: https://github.com/visualcomments/top-papers-graph.git/blob/main/task3_dual_local_models_blind_ab_guarded_fixed.ipynb
This code initializes and displays a UI for quick data input before repository installation. It includes an accordion for different settings and buttons for applying or resetting. Use this for early data entry; verification is in a separate cell.
```python
basic_box = W.VBox([
    input_mode,
    W.HBox([task1_yaml_path, task1_yaml_upload]),
    W.HBox([processed_path, processed_upload]),
    W.HBox([commands_file_path, commands_upload]),
    query,
    identifiers_text,
    commands_text,
])
expert_box = W.VBox([expert_last, expert_first, expert_pat, domain_id, out_dir])
model_alpha_box = W.VBox([model_a_owner_label, model_a_vlm_backend, model_a_vlm_model_id, model_a_local_text_model])
model_beta_box = W.VBox([model_b_owner_label, model_b_vlm_backend, model_b_vlm_model_id, model_b_local_text_model])
flags_box = W.VBox([hf_offline, include_multimodal, run_vlm, raw_yaml_override])
accordion = W.Accordion(children=[
    basic_box,
    W.HBox([expert_box, flags_box]),
    W.HBox([model_alpha_box, model_beta_box]),
])
# Section titles: "Input data", "Expert and flags", "Models α / β"
accordion.set_title(0, 'Входные данные')
accordion.set_title(1, 'Эксперт и флаги')
accordion.set_title(2, 'Модели α / β')
display(W.VBox([
    # "This cell is only for early data entry. Validation lives in the next, separate cell."
    W.HTML('Эта ячейка нужна только для раннего ввода данных. Проверка вынесена в следующую отдельную ячейку.'),
    accordion,
    W.HBox([quick_apply_btn, quick_reset_btn]),
    quick_status_html,
    quick_summary_html,
]))
_task3_quick_apply_to_globals(show_message=False)
if TASK3_QUICK_INITIAL_YAML_ERROR:
    # Warning title: "There is a problem in the saved YAML override"
    quick_status_html.value = _task3_quick_render_box(
        'Есть проблема в сохранённом YAML override',
        lines=[TASK3_QUICK_INITIAL_YAML_ERROR],
        tone='warning',
    )
```
--------------------------------
### Install Dependencies and Setup Environment
Source: https://github.com/visualcomments/top-papers-graph.git/blob/main/utils/telegram_bot/README.md
Navigate to the telegram_bot directory, create a virtual environment, activate it, install requirements, and copy the environment file. Remember to fill in your bot token in the .env file.
```bash
cd telegram_bot
python3 -m venv venv && source venv/bin/activate
pip install -r requirements.txt
cp .env.example .env # then add your bot token to .env
```
--------------------------------
### Run PostgreSQL Setup Script
Source: https://github.com/visualcomments/top-papers-graph.git/blob/main/third_party/top-papers-bot/README.md
Executes the script to install and configure PostgreSQL, create a user, database, and subscriptions table. May require sudo privileges.
```bash
sudo ./make_postgres.sh
```
--------------------------------
### Install Project Dependencies
Source: https://github.com/visualcomments/top-papers-graph.git/blob/main/course/weeks/week01.md
Use this script to install and configure project dependencies. It handles setup automatically. Available for both bash and PowerShell environments.
```shell
./scripts/bootstrap.sh
```
```powershell
./scripts/bootstrap.ps1
```
--------------------------------
### Install Project Dependencies
Source: https://github.com/visualcomments/top-papers-graph.git/blob/main/docs/quickstart.md
Sets up a virtual environment and installs the project with development dependencies. Use the Windows-specific command for activation if on that OS.
```bash
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -e ".[dev]"
```
--------------------------------
### Local Environment Setup with Pip
Source: https://github.com/visualcomments/top-papers-graph.git/blob/main/TUTORIAL_QWEN3VL_DATASPHERE_BUDGET_RU.md
Commands to create a virtual environment, activate it, and install necessary Python packages (datasphere, pyyaml, huggingface_hub) for local development.
```bash
python3 -m venv .venv
source .venv/bin/activate
python -m pip install -U pip
python -m pip install -U datasphere pyyaml huggingface_hub
datasphere version
```
--------------------------------
### Install and Activate DataSphere CLI
Source: https://github.com/visualcomments/top-papers-graph.git/blob/main/experiments/vlm_finetuning/datasphere/TUTORIAL_FULL_EXPERIMENT_RU.md
Activate the virtual environment and install or upgrade the DataSphere CLI. Verify the installation with the version command.
```bash
source .venv/bin/activate
python -m pip install -U datasphere
datasphere version
```
--------------------------------
### Apply Setup to Widgets
Source: https://github.com/visualcomments/top-papers-graph.git/blob/main/task3_dual_local_models_blind_ab_guarded_fixed.ipynb
Applies a given setup configuration to the various widgets in the UI. If no setup is provided, it first refreshes the setup from global variables. This function is crucial for initializing or updating the UI based on configuration.
```python
def _apply_setup_to_widgets(setup=None, *, announce=True):
    setup = setup or _refresh_setup_from_globals()
    input_mode.value = setup['input_mode']
    trajectory_path.value = setup['task1_yaml_path']
    processed_path.value = setup['processed_path']
    commands_path.value = setup['commands_file_path']
    query.value = setup['query']
    identifiers_text.value = '\n'.join(setup['identifiers'])
    commands_text.value = setup['commands_text']
    expert_last.value = setup['expert']['last_name']
    expert_first.value = setup['expert']['first_name']
    expert_pat.value = setup['expert']['patronymic']
    domain_id.value = setup['domain_id']
    out_dir.value = setup['out_dir']
    search_limit.value = int(setup['search_limit'])
    top_papers.value = int(setup['top_papers'])
    top_hypotheses.value = int(setup['top_hypotheses'])
    candidate_top_k.value = int(setup['candidate_top_k'])
    top_pairs.value = int(setup['top_pairs'])
    annoy_n_trees.value = int(setup['annoy_n_trees'])
    annoy_top_k.value = int(setup['annoy_top_k'])
    include_multimodal.value = bool(setup['include_multimodal'])
    run_vlm.value = bool(setup['run_vlm'])
    edge_mode.value = setup['edge_mode']
    link_prediction_backend.value = setup['link_prediction_backend']
    model_a_owner_label.value = setup['model_a']['owner_label']
    model_a_vlm_backend.value = setup['model_a']['vlm_backend']
    model_a_vlm_model_id.value = setup['model_a']['vlm_model_id'] or TASK3_DEFAULT_LOCAL_VLM_MODEL
```
--------------------------------
### Initial Setup and Imports
Source: https://github.com/visualcomments/top-papers-graph.git/blob/main/notebooks/task3_dual_local_models_blind_ab_kaggle_offline.ipynb
Ensures the setup is validated before proceeding with imports and environment variable configurations. Sets up environment variables for asynchronous operations and timeouts.
```python
if '_task3_require_validated_setup' not in globals():
    # Message: "Run the quick setup cell and the separate validation cell first."
    raise RuntimeError('Сначала запустите ячейку быстрого сетапа и отдельную ячейку проверки.')
_task3_require_validated_setup()
```
```python
import gc, io, json, os, sys, tempfile, subprocess, zipfile, shutil, traceback, inspect
from pathlib import Path
import importlib.util
os.environ.setdefault('G4F_ASYNC_ENABLED', '1')
os.environ.setdefault('G4F_ASYNC_MAX_CONCURRENCY', '3')
os.environ.setdefault('G4F_ASYNC_RETRIES', '3')
os.environ.setdefault('G4F_ASYNC_MAX_MODELS_PER_REQUEST', '3')
os.environ.setdefault('LLM_REQUEST_TIMEOUT_SECONDS', '25')
os.environ.setdefault('VLM_REQUEST_TIMEOUT_SECONDS', '45')
```
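Because the cell uses `os.environ.setdefault`, a value exported before the notebook starts takes precedence over these defaults. A small illustration of that behavior; the variable names `DEMO_TIMEOUT_SECONDS` and `DEMO_RETRIES` are made up for the example and are not read by the project:

```python
import os

# Pretend the user exported one variable before launching the notebook.
os.environ["DEMO_TIMEOUT_SECONDS"] = "60"
os.environ.pop("DEMO_RETRIES", None)

# setdefault only writes when the key is missing and returns the effective value.
timeout = os.environ.setdefault("DEMO_TIMEOUT_SECONDS", "25")
retries = os.environ.setdefault("DEMO_RETRIES", "3")

print(timeout, retries)
```

To override a default such as `LLM_REQUEST_TIMEOUT_SECONDS`, set it in the environment before this cell runs rather than editing the cell.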
--------------------------------
### Apply Quick Setup Configuration
Source: https://github.com/visualcomments/top-papers-graph.git/blob/main/notebooks/task3_dual_local_models_blind_ab_kaggle_offline.ipynb
Applies the current values from interactive widgets to global variables, updating the quick setup summary and status messages. This function is triggered when the 'Apply Quick Setup' button is clicked.
```python
def _task3_quick_on_apply(_):
    _task3_quick_apply_to_globals(show_message=True)

quick_apply_btn.on_click(_task3_quick_on_apply)
```
--------------------------------
### Apply Setup Configuration
Source: https://github.com/visualcomments/top-papers-graph.git/blob/main/notebooks/task3_dual_local_models_blind_ab_colab.ipynb
Callback function to apply the current setup configuration to the form widgets. It's triggered when the 'Apply Setup' button is clicked.
```python
def _on_apply_setup(_):
    _apply_setup_to_widgets(_refresh_setup_from_globals(), announce=True)
```
--------------------------------
### Apply Setup Configuration to Widgets
Source: https://github.com/visualcomments/top-papers-graph.git/blob/main/notebooks/task3_dual_local_models_blind_ab_kaggle_offline.ipynb
Applies a predefined setup configuration to the notebook's widgets. This is useful for quickly populating form fields with known values. It also provides feedback on whether the setup was applied successfully and if there were any parsing errors.
```python
    # Fragment: continues the body of a widget-sync function such as _apply_setup_to_widgets.
    model_a_local_text_model.value = setup['model_a']['local_text_model']
    model_b_owner_label.value = setup['model_b']['owner_label']
    model_b_vlm_backend.value = setup['model_b']['vlm_backend']
    model_b_vlm_model_id.value = setup['model_b']['vlm_model_id']
    model_b_local_text_model.value = setup['model_b']['local_text_model']
    hf_offline.value = bool(setup['hf_offline'])
    create_offline_form.value = bool(setup['create_offline_form'])
    create_expert_bundle.value = bool(setup['create_expert_bundle'])
    auto_download_offline.value = bool(setup['auto_download_offline'])
    auto_download_bundle.value = bool(setup['auto_download_bundle'])
    auto_download_owner_key.value = bool(setup['auto_download_owner_key'])
    report = _validate_task3dual_snapshot(_collect_form_snapshot())
    # Title: "Validation after applying the quick setup"
    _set_validation_box(report, title='Проверка после применения быстрого сетапа')
    if announce:
        # "Values from the top cell have been applied to the form."
        note_lines = ['Значения из верхней ячейки применены к форме.']
        if TASK3_DUAL_SETUP_PARSE_ERROR:
            # "The YAML override could not be parsed; dictionary values and defaults were used."
            note_lines.append('YAML override не разобрался — использованы значения из словаря и дефолты.')
        _set_status_box('Форма синхронизирована', lines=note_lines, tone='neutral')
```
--------------------------------
### Require Validated Setup Function
Source: https://github.com/visualcomments/top-papers-graph.git/blob/main/notebooks/task3_dual_local_models_blind_ab_kaggle_offline.ipynb
Checks if the quick setup has been validated and is in a usable state. Raises a `Task3SetupBlocked` exception if the setup is dirty, not yet validated, failed validation, or has an outdated fingerprint.
```python
def _task3_require_validated_setup():
    if globals().get('TASK3_QUICK_SETUP_DIRTY'):
        raise Task3SetupBlocked(
            'Запуск заблокирован: сетап изменён после последней проверки',
            [
                'Нажмите «Сохранить быстрый сетап».',
                'Затем снова запустите ячейку «Проверка быстрого сетапа».',
            ],
        )
    report = globals().get('TASK3_QUICK_VALIDATION_REPORT')
    if not report:
        raise Task3SetupBlocked(
            'Запуск заблокирован: быстрый сетап ещё не проверен',
            [
                'Сначала запустите ячейку «Проверка быстрого сетапа».',
            ],
        )
    if not globals().get('TASK3_QUICK_SETUP_VALIDATED_OK'):
        lines = list((report or {}).get('errors') or [])
        reason = globals().get('TASK3_QUICK_SETUP_BLOCK_REASON')
        if reason:
            lines = [reason] + lines
        if not lines:
            lines = ['Исправьте конфиг и заново запустите ячейку проверки.']
        raise Task3SetupBlocked('Запуск заблокирован: быстрый сетап не прошёл проверку', lines[:8])
    current_fp = str(globals().get('TASK3_QUICK_SETUP_CURRENT_FINGERPRINT') or '')
    validated_fp = str(globals().get('TASK3_QUICK_SETUP_VALIDATED_FINGERPRINT') or '')
    if current_fp and validated_fp and current_fp != validated_fp:
        raise Task3SetupBlocked(
            'Запуск заблокирован: проверка устарела',
            [
                'Сохранённая конфигурация изменилась после последней проверки.',
                'Снова запустите ячейку «Проверка быстрого сетапа».',
            ],
        )
    return True
```
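The function above raises `Task3SetupBlocked` with a title and a list of hint lines, but the snippet does not include the class itself. A minimal sketch of what such an exception could look like, assuming it only needs to store the title and the lines; the attribute names `title` and `lines` and the helper `describe_block` are hypothetical, not taken from the notebook:

```python
class Task3SetupBlocked(RuntimeError):
    """Hypothetical stand-in: carries a title plus hint lines for the user."""

    def __init__(self, title, lines=None):
        self.title = str(title)
        self.lines = [str(line) for line in (lines or [])]
        # Join everything into one message so plain tracebacks stay readable.
        super().__init__("\n".join([self.title, *self.lines]))


def describe_block(exc):
    """Render the exception as a short bulleted text block."""
    return "\n".join([exc.title] + [f"- {line}" for line in exc.lines])


try:
    raise Task3SetupBlocked("Setup is not validated", ["Run the validation cell first."])
except Task3SetupBlocked as exc:
    message = describe_block(exc)
    print(message)
```

Subclassing `RuntimeError` keeps the guard compatible with cells that catch generic runtime failures, while the separate fields let a notebook render the hints as a styled box.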
--------------------------------
### Install and Import Libraries
Source: https://github.com/visualcomments/top-papers-graph.git/blob/main/notebooks/task3_dual_local_models_blind_ab_colab.ipynb
Imports standard-library modules and sets default environment variables for asynchronous operation and request timeouts; no packages are installed in this cell. It also configures mock providers if a smoke test is enabled via `TASK3_DUAL_NOTEBOOK_SMOKE`.
```python
import gc, io, json, os, sys, tempfile, subprocess, zipfile, shutil, traceback, inspect
from pathlib import Path
os.environ.setdefault('G4F_ASYNC_ENABLED', '1')
os.environ.setdefault('G4F_ASYNC_MAX_CONCURRENCY', '3')
os.environ.setdefault('G4F_ASYNC_RETRIES', '3')
os.environ.setdefault('G4F_ASYNC_MAX_MODELS_PER_REQUEST', '3')
os.environ.setdefault('LLM_REQUEST_TIMEOUT_SECONDS', '25')
os.environ.setdefault('VLM_REQUEST_TIMEOUT_SECONDS', '45')
if os.environ.get('TASK3_DUAL_NOTEBOOK_SMOKE') == '1':
    os.environ.setdefault('LLM_PROVIDER', 'mock')
    os.environ.setdefault('LLM_MODEL', 'mock')
    os.environ.setdefault('EMBED_PROVIDER', 'hash')
    os.environ.setdefault('MM_EMBED_BACKEND', 'none')
    os.environ.setdefault('VLM_BACKEND', 'none')
    os.environ.setdefault('HF_HUB_OFFLINE', '1')
    os.environ.setdefault('TRANSFORMERS_OFFLINE', '1')
    os.environ.setdefault('HF_DATASETS_OFFLINE', '1')
```
--------------------------------
### Install Dependencies and Clone Repository
Source: https://github.com/visualcomments/top-papers-graph.git/blob/main/top_papers_graph_scidatapipe_hf_colab_from_csv_only_fixed_gdown_scope_fixed.ipynb
Installs necessary system packages and Python libraries, then clones the specified top-papers-graph repository and installs it in editable mode. This sets up the environment for subsequent operations.
```python
# @title
import os
import sys
import shutil
import subprocess
from pathlib import Path
def run(cmd, cwd=None):
    print("+ ", " ".join(cmd))
    subprocess.run(cmd, check=True, cwd=cwd)

run(["apt-get", "update", "-qq"])
run(["apt-get", "install", "-y", "-qq", "git", "p7zip-full", "unzip"])
run([sys.executable, "-m", "pip", "install", "-q", "-U", "pip", "setuptools", "wheel"])
run([sys.executable, "-m", "pip", "install", "-q", "-U",
     "gdown", "huggingface_hub", "pandas", "openpyxl", "PyYAML", "Unidecode", "requests"])
repo_dir = Path("/content/top-papers-graph")
if repo_dir.exists():
    shutil.rmtree(repo_dir)
run(["git", "clone", "--depth", "1", "--branch", REPO_BRANCH, REPO_URL, str(repo_dir)])
run([sys.executable, "-m", "pip", "install", "-q", "-e", "."], cwd=str(repo_dir))
print("Репозиторий установлен:", repo_dir)
```
--------------------------------
### Python Function to Apply Quick Setup to Globals
Source: https://github.com/visualcomments/top-papers-graph.git/blob/main/task3_dual_local_models_blind_ab_guarded_kaggle_offline.ipynb
Applies a collected snapshot to global variables, merging it with default setup and handling YAML overrides. It also generates a fingerprint for the current setup.
```python
def _task3_quick_apply_to_globals(show_message=True):
    snapshot = _task3_quick_collect_snapshot()
    merged = _task3_quick_deep_update(_task3_quick_default_setup(), snapshot)
    merged.pop('yaml_override_text', None)
    merged.pop('upload_notes', None)
    yaml_text = str(snapshot.get('yaml_override_text') or '').strip()
    yaml_payload, yaml_error = _task3_quick_parse_yaml(yaml_text)
    if yaml_payload:
        merged = _task3_quick_deep_update(merged, yaml_payload)
    fingerprint = _task3_quick_snapshot_fingerprint(snapshot)
    globals()['TASK3_DUAL_SETUP'] = merged
    globals()['TASK3_DUAL_SETUP_YAML'] = yaml_text
    globals()['TASK3_QUICK_LAST_SNAPSHOT'] = snapshot
    globals()['TASK3_QUICK_LAST_YAML_ERROR'] = yaml_error
    globals()['TASK3_QUICK_SETUP_CURRENT_FINGERPRINT'] = fingerprint
    globals()['TASK3_QUICK_SETUP_DIRTY'] = False
    globals()['TASK3_QUICK_SETUP_VALIDATED_OK'] = False
    globals()['TASK3_QUICK_SETUP_VALIDATED_FINGERPRINT'] = ''
    globals()['TASK3_QUICK_SETUP_VALIDATION_REQUIRED'] = True
    globals()['TASK3_QUICK_SETUP_BLOCK_REASON'] = 'После сохранения быстрый сетап нужно проверить отдельной ячейкой.'
    summary = {
        'input_mode': merged['input_mode'],
        'task1_yaml_path': merged['task1_yaml_path'],
        'processed_path': merged['processed_path'],
        'commands_file_path': merged['commands_file_path'],
        'query': merged['query'],
        'identifiers_count': len(merged['identifiers']),
        'model_a': merged['model_a'],
        'model_b': merged['model_b'],
        'hf_offline': merged['hf_offline'],
    }
```
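The helpers `_task3_quick_deep_update` and `_task3_quick_snapshot_fingerprint` are called above but not shown. The sketch below illustrates one plausible way to implement a recursive merge and a stable fingerprint; the names `deep_update` and `snapshot_fingerprint`, the use of SHA-256, and the sample data are assumptions for illustration, not the notebook's actual code:

```python
import copy
import hashlib
import json


def deep_update(base, override):
    """Return a copy of `base` with `override` merged in recursively."""
    result = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = deep_update(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


def snapshot_fingerprint(snapshot):
    """Hash a JSON-serializable snapshot; key order does not affect the result."""
    payload = json.dumps(snapshot, sort_keys=True, ensure_ascii=False, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


defaults = {"query": "", "model_a": {"vlm_backend": "none", "vlm_model_id": ""}}
snapshot = {"query": "graphene", "model_a": {"vlm_model_id": "local-vlm"}}
merged = deep_update(defaults, snapshot)
print(merged)
print(snapshot_fingerprint(snapshot)[:12])
```

Sorting keys before hashing means two snapshots with the same content always produce the same fingerprint, which is what a staleness check like the one in `_task3_require_validated_setup` would need.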
--------------------------------
### Launch VLM Smoke Example
Source: https://github.com/visualcomments/top-papers-graph.git/blob/main/DATASPHERE_SMOKE_EVAL_IMAGES_COLUMN_FIX_RU.md
This command sequence is used to launch the VLM smoke example after setting up the environment and project.
```bash
cd top-papers-graph-main
source .venv/bin/activate
export DATASPHERE_PROJECT_ID='bt18pnosk97i8n24ddnv'
datasphere project get --id "$DATASPHERE_PROJECT_ID"
bash experiments/vlm_finetuning/datasphere/launch_examples.sh hf-smoke-managed
```
--------------------------------
### Install smolagents with Agents Support
Source: https://github.com/visualcomments/top-papers-graph.git/blob/main/docs/smolagents.md
Install the project with basic agent support. This is the minimum requirement for agent functionality.
```bash
pip install -e ".[agents]"
```
--------------------------------
### Apply Setup to Widgets
Source: https://github.com/visualcomments/top-papers-graph.git/blob/main/notebooks/task3_dual_local_models_blind_ab_colab.ipynb
Applies a given setup configuration to the various widgets on the page. Use this to pre-fill form fields based on a saved or default configuration.
```python
def _apply_setup_to_widgets(setup=None, *, announce=True):
    setup = setup or _refresh_setup_from_globals()
    input_mode.value = setup['input_mode']
    trajectory_path.value = setup['task1_yaml_path']
    processed_path.value = setup['processed_path']
    commands_path.value = setup['commands_file_path']
    query.value = setup['query']
    identifiers_text.value = '\n'.join(setup['identifiers'])
    commands_text.value = setup['commands_text']
    expert_last.value = setup['expert']['last_name']
    expert_first.value = setup['expert']['first_name']
    expert_pat.value = setup['expert']['patronymic']
    domain_id.value = setup['domain_id']
    out_dir.value = setup['out_dir']
    search_limit.value = int(setup['search_limit'])
    top_papers.value = int(setup['top_papers'])
    top_hypotheses.value = int(setup['top_hypotheses'])
    candidate_top_k.value = int(setup['candidate_top_k'])
    top_pairs.value = int(setup['top_pairs'])
    annoy_n_trees.value = int(setup['annoy_n_trees'])
    annoy_top_k.value = int(setup['annoy_top_k'])
    include_multimodal.value = bool(setup['include_multimodal'])
    run_vlm.value = bool(setup['run_vlm'])
    edge_mode.value = setup['edge_mode']
    link_prediction_backend.value = setup['link_prediction_backend']
    model_a_owner_label.value = setup['model_a']['owner_label']
    model_a_vlm_backend.value = setup['model_a']['vlm_backend']
    model_a_vlm_model_id.value = setup['model_a']['vlm_model_id'] or TASK3_DEFAULT_LOCAL_VLM_MODEL
    model_a_local_text_model.value = setup['model_a']['local_text_model']
    model_b_owner_label.value = setup['model_b']['owner_label']
    model_b_vlm_backend.value = setup['model_b']['vlm_backend']
    model_b_vlm_model_id.value = setup['model_b']['vlm_model_id']
```
--------------------------------
### Key-Value Store for Form Fields
Source: https://github.com/visualcomments/top-papers-graph.git/blob/main/scripts/healthboard/frontend/output/singlepage.html
The `xg` class provides a key-value store for form fields, using a specific format for keys to manage field data efficiently. It includes methods for setting, getting, updating, deleting, and mapping entries.
```javascript
var bg="__@field_split__";
function yg(e){return e.map(function(e){return"".concat(y(e),":").concat(e)}).join(bg)}
var xg=function(){function e(){ft(this,e),R(this,"kvs",new Map)}return mt(e,[
{key:"set",value:function(e,t){this.kvs.set(yg(e),t)}},
{key:"get",value:function(e){return this.kvs.get(yg(e))}},
{key:"update",value:function(e,t){var n=t(this.get(e));n?this.set(e,n):this.delete(e)}},
{key:"delete",value:function(e){this.kvs.delete(yg(e))}},
{key:"map",value:function(e){return l(this.kvs.entries()).map(function(t){var n=N(t,2),r=n[0],o=n[1],i=r.split(bg);
return e({key:i.map(function(e){var t=N(e.match(/^([^:]*):(.*)$/),3),n=t[1],r=t[2];return"number"===n?Number(r):r}),value:o})})}},
{key:"toJSON",value:function(){var e={};
return this.map(function(t){var n=t.key,r=t.value;
return e[n.join(".")] = r,null}),e}}
]),e}
();
const wg=xg;
```
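The minified bundle is hard to read, so here is a rough Python rendering of the key-encoding idea only: each part of a field path is stored as `type:value`, and the parts are joined with the split token. This is an illustration, not a port of the bundle's API; `encode_path`, `decode_path`, and the sample store are names invented here:

```python
SPLIT = "__@field_split__"


def encode_path(path):
    """Encode a name path such as ["users", 0, "name"] into one string key."""
    return SPLIT.join(
        f"{'number' if isinstance(part, int) else 'string'}:{part}" for part in path
    )


def decode_path(key):
    """Invert encode_path, restoring numeric parts as integers."""
    path = []
    for cell in key.split(SPLIT):
        # Split on the first colon only, so values may themselves contain ":".
        kind, _, raw = cell.partition(":")
        path.append(int(raw) if kind == "number" else raw)
    return path


store = {}
store[encode_path(["users", 0, "name"])] = "Ada"
restored = decode_path(next(iter(store)))
print(restored)
```

Tagging each part with its type keeps the list index `0` distinct from the string key `"0"`, which a plain dotted path would conflate.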
--------------------------------
### Install and Update Python Dependencies
Source: https://github.com/visualcomments/top-papers-graph.git/blob/main/experiments/vlm_finetuning/datasphere/TUTORIAL_FULL_EXPERIMENT_RU.md
Installs and updates necessary Python packages including datasphere, pyyaml, and huggingface_hub. This is for local environment setup.
```bash
python3 -m venv .venv
source .venv/bin/activate
python -m pip install -U pip
python -m pip install -U datasphere pyyaml huggingface_hub
```
--------------------------------
### Run Course Demo
Source: https://github.com/visualcomments/top-papers-graph.git/blob/main/course/README.md
Execute the bootstrap script and then run the demo for the project. This is the recommended starting point for the course.
```bash
./scripts/bootstrap.sh
top-papers-graph demo-run
```
--------------------------------
### Make PostgreSQL Script Executable
Source: https://github.com/visualcomments/top-papers-graph.git/blob/main/third_party/top-papers-bot/README.md
Grants execute permissions to the PostgreSQL setup script. This script automates the installation and configuration of PostgreSQL.
```bash
chmod +x make_postgres.sh
```
--------------------------------
### Python: Collect Snapshot for Quick Setup
Source: https://github.com/visualcomments/top-papers-graph.git/blob/main/notebooks/task3_dual_local_models_blind_ab_kaggle_offline.ipynb
Gathers current settings and file paths from various input widgets and global variables to create a snapshot of the setup. Handles file uploads and appends notes about saved files.
```python
def _task3_quick_collect_snapshot():
    yaml_upload_path = ''
    processed_upload_path = ''
    commands_upload_path = ''
    upload_notes = []
    try:
        if task1_yaml_upload.value:
            yaml_upload_path = _task3_quick_save_upload(task1_yaml_upload.value, 'task1.yaml')
            upload_notes.append(f'Task1 YAML сохранён: {yaml_upload_path}')
    except Exception as e:
        upload_notes.append(f'Не удалось сохранить Task1 YAML upload: {type(e).__name__}: {e}')
    try:
        if processed_upload.value:
            processed_upload_path = _task3_quick_save_upload(processed_upload.value, 'processed_papers.zip')
            upload_notes.append(f'Processed ZIP сохранён: {processed_upload_path}')
    except Exception as e:
        upload_notes.append(f'Не удалось сохранить processed upload: {type(e).__name__}: {e}')
    try:
        if commands_upload.value:
            commands_upload_path = _task3_quick_save_upload(commands_upload.value, 'commands.yaml')
            upload_notes.append(f'Commands file сохранён: {commands_upload_path}')
    except Exception as e:
        upload_notes.append(f'Не удалось сохранить commands upload: {type(e).__name__}: {e}')
    return {
        'input_mode': input_mode.value,
        'task1_yaml_path': (yaml_upload_path or task1_yaml_path.value or '').strip(),
        'processed_path': (processed_upload_path or processed_path.value or '').strip(),
        'commands_file_path': (commands_upload_path or commands_file_path.value or '').strip(),
        'query': query.value.strip(),
        'identifiers': [line.strip() for line in identifiers_text.value.splitlines() if line.strip()],
        'commands_text': commands_text.value.strip(),
        'expert': {
            'last_name': expert_last.value.strip(),
            'first_name': expert_first.value.strip(),
            'patronymic': expert_pat.value.strip() or '-',
        },
        'domain_id': domain_id.value.strip() or 'science',
        'out_dir': out_dir.value.strip() or 'runs/task3_dual_local_blind_ab',
        'hf_offline': bool(hf_offline.value),
        'include_multimodal': bool(include_multimodal.value),
        'run_vlm': bool(run_vlm.value),
        'model_a': {
            'owner_label': model_a_owner_label.value.strip() or 'base_local_model',
            'vlm_backend': model_a_vlm_backend.value,
            'vlm_model_id': model_a_vlm_model_id.value.strip(),
            'local_text_model': model_a_local_text_model.value.strip(),
        },
        'model_b': {
            'owner_label': model_b_owner_label.value.strip() or 'finetuned_local_model',
            'vlm_backend': model_b_vlm_backend.value,
            'vlm_model_id': model_b_vlm_model_id.value.strip(),
            'local_text_model': model_b_local_text_model.value.strip(),
        },
        'yaml_override_text': raw_yaml_override.value,
        'upload_notes': upload_notes,
    }
```
--------------------------------
### Unpack and Navigate to Project Directory
Source: https://github.com/visualcomments/top-papers-graph.git/blob/main/TUTORIAL_QWEN3_VL_8B_DATASPHERE_SMOKE_FIX_RU.md
Create a directory, change into it, unzip the provided archive, and navigate into the extracted project folder. This sets up the project files for use.
```bash
mkdir -p ~/Documents/top-papers-graph-fixed
cd ~/Documents/top-papers-graph-fixed
unzip /path/to/top-papers-graph-main-datasphere-smoke-fix.zip
cd top-papers-graph-main
```
--------------------------------
### Environment Setup and Path Initialization
Source: https://github.com/visualcomments/top-papers-graph.git/blob/main/notebooks/task3_dual_local_models_blind_ab_kaggle_gdown_script_clone_models_fixed.ipynb
Initializes workspace, output, and Hugging Face cache directories, and sets environment variables for timeouts, DPI, and VLM parameters. Ensures necessary directories are created.
```python
import importlib
import json
import os
import shutil
import subprocess
import sys
import zipfile
from pathlib import Path
WORKSPACE_DIR = Path(CFG["workspace_dir"]).expanduser()
OUTPUT_DIR = Path(CFG["output_dir"]).expanduser()
HF_HOME = Path(str(CFG.get("hf_home") or "/kaggle/working/.hf")).expanduser()
HF_HOME.mkdir(parents=True, exist_ok=True)
os.environ.setdefault("HF_HOME", str(HF_HOME))
os.environ.setdefault("HF_HUB_CACHE", str(HF_HOME / "hub"))
os.environ.setdefault("TRANSFORMERS_CACHE", str(HF_HOME / "hub"))
os.environ.setdefault("LOCAL_VLM_REQUEST_TIMEOUT_SECONDS", str(CFG.get("local_vlm_request_timeout_seconds") or 120))
os.environ.setdefault("SCIREASON_LOCAL_VLM_REQUEST_TIMEOUT_SECONDS", os.environ.get("LOCAL_VLM_REQUEST_TIMEOUT_SECONDS", "300"))
os.environ.setdefault("LOCAL_VLM_STARTUP_TIMEOUT_SECONDS", str(CFG.get("local_vlm_startup_timeout_seconds") or 900))
os.environ.setdefault("SCIREASON_LOCAL_VLM_STARTUP_TIMEOUT_SECONDS", os.environ.get("LOCAL_VLM_STARTUP_TIMEOUT_SECONDS", "900"))
os.environ.setdefault("PDF_RENDER_DPI", str(CFG.get("pdf_render_dpi") or 110))
os.environ.setdefault("VLM_MAX_PIXELS", str(CFG.get("vlm_max_pixels") or (768 * 28 * 28)))
os.environ.setdefault("VLM_MAX_NEW_TOKENS", str(CFG.get("vlm_max_new_tokens") or 192))
os.environ.setdefault("HF_HUB_DISABLE_PROGRESS_BARS", "1")
os.environ.setdefault("TOKENIZERS_PARALLELISM", "false")
WORKSPACE_DIR.mkdir(parents=True, exist_ok=True)
OUTPUT_DIR.mkdir(parents=True, exist_ok=True)
```
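The cell relies on the `str(CFG.get(key) or default)` pattern, which falls back to the default for any falsy value, not only for a missing key. A short sketch of that behavior; the helper `cfg_value` and the sample `CFG` contents are hypothetical:

```python
def cfg_value(cfg, key, default):
    """Mirror the `str(CFG.get(key) or default)` pattern used above."""
    return str(cfg.get(key) or default)


# Sample configuration invented for the example.
CFG = {"pdf_render_dpi": 150, "vlm_max_new_tokens": None, "local_vlm_request_timeout_seconds": 0}

dpi = cfg_value(CFG, "pdf_render_dpi", 110)            # explicit value wins
tokens = cfg_value(CFG, "vlm_max_new_tokens", 192)     # None falls back
timeout = cfg_value(CFG, "local_vlm_request_timeout_seconds", 120)  # 0 also falls back
pixels = cfg_value(CFG, "vlm_max_pixels", 768 * 28 * 28)            # missing key falls back

print(dpi, tokens, timeout, pixels)
```

One consequence is that a configured value of `0` cannot be expressed this way; it is silently replaced by the default.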
--------------------------------
### Install Task3 Dependencies
Source: https://github.com/visualcomments/top-papers-graph.git/blob/main/notebooks/task3_dual_local_models_blind_ab_kaggle_offline.ipynb
Installs or verifies the installation of required Python packages for the task3 runtime. It handles both online and offline scenarios, including optional dependency installation.
```python
_required_boot_modules = ['ipywidgets', 'yaml', 'requests', 'nbformat']
_missing_boot_modules = [name for name in _required_boot_modules if not _task3_module_available(name)]
if _missing_boot_modules and not TASK3_RUNTIME_OFFLINE:
    run_optional([sys.executable, '-m', 'pip', 'install', '-q', 'ipywidgets', 'pyyaml', 'requests', 'unidecode', 'nbformat'])
elif _missing_boot_modules:
    print('[warn] offline mode: пропускаю pip install для базовых notebook-зависимостей ->', ', '.join(_missing_boot_modules))
```
```python
_task3_editable_install = [sys.executable, '-m', 'pip', 'install', '-q', '-e', '.[task3]']
if TASK3_RUNTIME_OFFLINE or os.environ.get('TASK3_NOTEBOOK_SMOKE') == '1' or os.environ.get('TASK3_DUAL_NOTEBOOK_SMOKE') == '1' or os.environ.get('TASK3_NOTEBOOK_SKIP_OPTIONAL_DEPS') == '1':
    _task3_editable_install = [sys.executable, '-m', 'pip', 'install', '-q', '-e', '.', '--no-deps']
if not run_optional(_task3_editable_install, cwd=REPO_DIR, label='pip install editable task3 notebook runtime'):
    if _task3_editable_install[-1:] != ['--no-deps']:
        run_optional([sys.executable, '-m', 'pip', 'install', '-q', '-e', '.', '--no-deps'], cwd=REPO_DIR, label='pip install editable task3 notebook runtime --no-deps fallback')
```
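The cells above depend on two helpers that are not shown: `_task3_module_available` and `run_optional`. A minimal sketch of how they might behave, assuming the first checks importability without importing and the second returns a boolean instead of raising; the names `module_available` and `run_optional_sketch` are stand-ins, not the notebook's definitions:

```python
import importlib.util
import subprocess
import sys


def module_available(name):
    """True if `name` can be found on the import path (without importing it)."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        return False


def run_optional_sketch(cmd, cwd=None, label=None):
    """Run a command; report failure with a warning instead of raising."""
    try:
        subprocess.run(cmd, check=True, cwd=cwd)
        return True
    except (subprocess.CalledProcessError, OSError) as exc:
        print(f"[warn] {label or ' '.join(cmd)} failed: {exc}")
        return False


ok = run_optional_sketch([sys.executable, "-c", "pass"], label="no-op")
failed = run_optional_sketch([sys.executable, "-c", "raise SystemExit(1)"], label="always fails")
print(module_available("json"), ok, failed)
```

Returning `False` rather than raising is what lets the install cell fall through to the `--no-deps` retry.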
--------------------------------
### Launch Full Experiment (Recommended)
Source: https://github.com/visualcomments/top-papers-graph.git/blob/main/TUTORIAL_QWEN3VL_DATASPHERE_BUDGET_RU.md
Launches the full Qwen3VL experiment using the managed launcher. Requires the `DATASPHERE_PROJECT_ID` environment variable to be set to your DataSphere project ID; the value is left empty in the snippet and must be filled in before running.
```bash
export DATASPHERE_PROJECT_ID=''
bash experiments/vlm_finetuning/datasphere/launch_examples.sh hf-full-managed
```
--------------------------------
### Install Python Dependencies
Source: https://github.com/visualcomments/top-papers-graph.git/blob/main/third_party/top-papers-bot/README.md
Installs all necessary Python libraries for the bot. Ensure you have Python 3.8+ and pip installed.
```bash
pip install -r requirements.txt
```
--------------------------------
### Install Project Dependencies
Source: https://github.com/visualcomments/top-papers-graph.git/blob/main/task3_dual_local_models_blind_ab.ipynb
Installs necessary project dependencies using pip. This includes optional packages and an editable install for the task-specific components.
```python
run_optional([sys.executable, '-m', 'pip', 'install', '-q', 'ipywidgets', 'pyyaml', 'requests', 'unidecode', 'nbformat'])
_task3_editable_install = [sys.executable, '-m', 'pip', 'install', '-q', '-e', '.[task3]']
```
--------------------------------
### Initialize Repository Directory
Source: https://github.com/visualcomments/top-papers-graph.git/blob/main/notebooks/task3_dual_local_models_blind_ab_colab.ipynb
Sets up the repository directory, cloning it from GitHub if it doesn't exist. It prioritizes the directory found by `find_repo_root` or defaults to a path within /content or the current directory.
```python
REPO_DIR = find_repo_root()
REPO_URL = 'https://github.com/top-papers/top-papers-graph.git'
if REPO_DIR is None:
    REPO_DIR = Path('/content/top-papers-graph') if Path('/content').exists() else Path.cwd() / 'top-papers-graph'
if not REPO_DIR.exists():
    run(['git', 'clone', '--depth', '1', REPO_URL, str(REPO_DIR)])
SRC_DIR = REPO_DIR / 'src'
```
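`find_repo_root` is not included in the snippet. One common way to implement such a helper is to walk upward from the working directory until a marker file is found; the sketch below assumes `pyproject.toml` plus a `src/` directory as the marker, which is a guess rather than the notebook's actual rule, and `find_repo_root_sketch` is a made-up name:

```python
import tempfile
from pathlib import Path


def find_repo_root_sketch(start=None):
    """Walk upward from `start` until a directory with pyproject.toml and src/ is found."""
    current = Path(start or Path.cwd()).resolve()
    for candidate in [current, *current.parents]:
        if (candidate / "pyproject.toml").is_file() and (candidate / "src").is_dir():
            return candidate
    return None


with tempfile.TemporaryDirectory() as tmp:
    # Build a throwaway repository layout to search in.
    repo = Path(tmp).resolve() / "top-papers-graph"
    (repo / "src").mkdir(parents=True)
    (repo / "pyproject.toml").write_text("[project]\nname = 'demo'\n")
    notebooks = repo / "notebooks"
    notebooks.mkdir()
    matched = find_repo_root_sketch(notebooks) == repo

print(matched)
```

Returning `None` when nothing matches fits the `if REPO_DIR is None` fallback in the cell above.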
--------------------------------
### Display UI for Quick Setup
Source: https://github.com/visualcomments/top-papers-graph.git/blob/main/task3_dual_local_models_blind_ab_guarded_kaggle_offline.ipynb
Output of the cell that renders the quick setup UI: an accordion for input data, expert settings, flags, and model configurations, along with apply and reset buttons. The block below is the truncated text representation of the resulting widget, not runnable code.
```text
Result:
VBox(children=(HTML(value='Быстрый сетап до у…
```
--------------------------------
### Install Task 3 Dependencies
Source: https://github.com/visualcomments/top-papers-graph.git/blob/main/notebooks/task3_multimodal_temporal_hypothesis_generation_colab.ipynb
Install the necessary packages for Task 3 using pip. This command installs the package in editable mode.
```bash
pip install -e ".[task3]"
```
--------------------------------
### Environment Setup for Repository
Source: https://github.com/visualcomments/top-papers-graph.git/blob/main/notebooks/task3_dual_local_models_blind_ab_kaggle_gdown_script_vlm_timeout_fixed.ipynb
Initializes the repository source, extracts archives if necessary, and sets up the Python path to include the repository's source and root directories.
```python
kind, source_path = _ensure_repo_source()
# Always start from a clean extraction directory.
repo_extract_dir = WORKSPACE_DIR / "repo"
if repo_extract_dir.exists():
    shutil.rmtree(repo_extract_dir)
repo_extract_dir.mkdir(parents=True, exist_ok=True)
if kind == "archive":
    print(f"[repo] extract archive: {source_path}")
    with zipfile.ZipFile(source_path, "r") as zf:
        zf.extractall(repo_extract_dir)
    repo_candidates = [p for p in repo_extract_dir.rglob("*") if p.is_dir() and (p / "pyproject.toml").exists() and (p / "src" / "scireason").exists()]
    if not repo_candidates:
        raise FileNotFoundError("Repository root not found after extracting the archive")
    # Prefer the candidate with the shortest path.
    REPO_ROOT = sorted(repo_candidates, key=lambda p: len(str(p)))[0]
else:
    REPO_ROOT = source_path
SRC_DIR = REPO_ROOT / "src"
# Expose the sources to child processes (PYTHONPATH) and to this process (sys.path).
os.environ["PYTHONPATH"] = os.pathsep.join([str(SRC_DIR), str(REPO_ROOT), str(os.environ.get("PYTHONPATH") or "")]).strip(os.pathsep)
if str(SRC_DIR) not in sys.path:
    sys.path.insert(0, str(SRC_DIR))
if str(REPO_ROOT) not in sys.path:
    sys.path.insert(0, str(REPO_ROOT))
print("[repo] root =", REPO_ROOT)
```
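The `PYTHONPATH` line above joins the new entries with the existing value and then strips the separator, so an empty existing value does not leave a trailing `:` (or `;` on Windows). A small sketch of the same idea as a reusable function; the name `prepend_pythonpath` is hypothetical:

```python
import os


def prepend_pythonpath(existing, *new_entries):
    """Join new entries ahead of the existing value, dropping stray separators."""
    parts = [str(entry) for entry in new_entries] + [existing or ""]
    return os.pathsep.join(parts).strip(os.pathsep)


# With no previous value, only the new entries remain.
fresh = prepend_pythonpath("", "/repo/src", "/repo")
# With a previous value, it is kept at the end so the new entries take priority.
extended = prepend_pythonpath("/old/libs", "/repo/src", "/repo")
```

The strip matters because an empty `PYTHONPATH` entry is interpreted as the current directory, which can silently change what gets imported.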
--------------------------------
### Install GNN Dependencies
Source: https://github.com/visualcomments/top-papers-graph.git/blob/main/course/weeks/week11.md
Install the necessary dependencies for the GNN mode. This command installs the project in editable mode with optional GNN extras.
```bash
pip install -e ".[gnn]"
# Optional extended extras:
# pip install -e ".[gnn_ext]"
```
--------------------------------
### Refresh Setup From Globals
Source: https://github.com/visualcomments/top-papers-graph.git/blob/main/notebooks/task3_dual_local_models_blind_ab_colab.ipynb
Refreshes the global effective setup and parse error variables. Call this function when the setup configuration might have changed.
```python
def _refresh_setup_from_globals():
    global TASK3_DUAL_EFFECTIVE_SETUP, TASK3_DUAL_SETUP_PARSE_ERROR
    TASK3_DUAL_EFFECTIVE_SETUP, TASK3_DUAL_SETUP_PARSE_ERROR = _build_task3_dual_setup()
    return TASK3_DUAL_EFFECTIVE_SETUP
```
--------------------------------
### Repository Setup Script
Source: https://github.com/visualcomments/top-papers-graph.git/blob/main/notebooks/task3_dual_local_models_blind_ab_kaggle_gdown_script.ipynb
This script sets up the project repository by locating and extracting it. It handles finding the repository from archives or directories and creates necessary workspace and output directories. Ensure the repository is accessible via specified paths or Kaggle datasets.
```python
import json
import os
import shutil
import subprocess
import sys
import zipfile
from pathlib import Path
WORKSPACE_DIR = Path(CFG["workspace_dir"]).expanduser()
OUTPUT_DIR = Path(CFG["output_dir"]).expanduser()
WORKSPACE_DIR.mkdir(parents=True, exist_ok=True)
OUTPUT_DIR.mkdir(parents=True, exist_ok=True)
def _find_repo_source():
    # 1. An explicitly configured archive wins.
    explicit_archive = str(CFG.get("repo_archive_path") or "").strip()
    if explicit_archive:
        p = Path(explicit_archive)
        if p.exists():
            return ("archive", p)
    # 2. Then an explicitly configured directory that looks like the repo root.
    explicit_dir = str(CFG.get("repo_dir_path") or "").strip()
    if explicit_dir:
        p = Path(explicit_dir)
        if (p / "pyproject.toml").exists() and (p / "src" / "scireason").exists():
            return ("dir", p)
    # 3. Otherwise search the usual Kaggle and local locations.
    search_roots = [Path("/kaggle/input"), Path("/kaggle/working"), Path.cwd(), Path("/mnt/data")]
    patterns = ("top-papers-graph*.zip", "top-papers-graph-main*", "top-papers-graph*")
    for base in search_roots:
        if not base.exists():
            continue
        for pattern in patterns:
            for candidate in sorted(base.rglob(pattern)):
                if candidate.is_file() and candidate.suffix == ".zip":
                    return ("archive", candidate)
                if candidate.is_dir() and (candidate / "pyproject.toml").exists() and (candidate / "src" / "scireason").exists():
                    return ("dir", candidate)
    raise FileNotFoundError("Could not find a top-papers-graph archive or folder. Attach the dataset or set repo_archive_path / repo_dir_path.")


kind, source_path = _find_repo_source()
repo_extract_dir = WORKSPACE_DIR / "repo"
if repo_extract_dir.exists():
    shutil.rmtree(repo_extract_dir)
repo_extract_dir.mkdir(parents=True, exist_ok=True)
if kind == "archive":
    print(f"[repo] extract archive: {source_path}")
    with zipfile.ZipFile(source_path, "r") as zf:
        zf.extractall(repo_extract_dir)
    repo_candidates = [p for p in repo_extract_dir.rglob("*") if p.is_dir() and (p / "pyproject.toml").exists() and (p / "src" / "scireason").exists()]
    if not repo_candidates:
        raise FileNotFoundError("Repository root not found after extracting the archive")
    REPO_ROOT = sorted(repo_candidates, key=lambda p: len(str(p)))[0]
else:
    REPO_ROOT = source_path
print("[repo] root =", REPO_ROOT)
```
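The root-detection step after extraction can be tried in isolation. The sketch below builds a throwaway archive with a single wrapping folder, extracts it, and picks the candidate with the shortest path, as the script does; the function name `locate_repo_root` and the archive contents are made up for the demonstration:

```python
import tempfile
import zipfile
from pathlib import Path


def locate_repo_root(extract_dir):
    """Return the shortest-path directory under extract_dir that looks like the repo root."""
    candidates = [
        p for p in Path(extract_dir).rglob("*")
        if p.is_dir() and (p / "pyproject.toml").exists() and (p / "src" / "scireason").exists()
    ]
    if not candidates:
        raise FileNotFoundError("repository root not found after extraction")
    return sorted(candidates, key=lambda p: len(str(p)))[0]


with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp) / "top-papers-graph-main.zip"
    with zipfile.ZipFile(archive, "w") as zf:
        # Demo archive: everything sits inside one top-level folder.
        zf.writestr("top-papers-graph-main/pyproject.toml", "")
        zf.writestr("top-papers-graph-main/src/scireason/__init__.py", "")
    extract_dir = Path(tmp) / "repo"
    extract_dir.mkdir()
    with zipfile.ZipFile(archive, "r") as zf:
        zf.extractall(extract_dir)
    found_root_name = locate_repo_root(extract_dir).name
```

Note that `rglob("*")` yields only descendants of `extract_dir`, not `extract_dir` itself, so an archive whose files sit at the top level with no wrapping folder would not be matched by this check.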
--------------------------------
### Install Base Notebook Dependencies
Source: https://github.com/visualcomments/top-papers-graph.git/blob/main/task3_dual_local_models_blind_ab_guarded_fixed.ipynb
Installs essential Python packages required for the notebook runtime if they are missing. It conditionally skips the installation in offline mode.
```python
_required_boot_modules = ['ipywidgets', 'yaml', 'requests', 'nbformat']
_missing_boot_modules = [name for name in _required_boot_modules if not _task3_module_available(name)]
if _missing_boot_modules and not TASK3_RUNTIME_OFFLINE:
    # Online: install the base packages (pip names, e.g. 'pyyaml' provides the 'yaml' module).
    run_optional([sys.executable, '-m', 'pip', 'install', '-q', 'ipywidgets', 'pyyaml', 'requests', 'unidecode', 'nbformat'])
elif _missing_boot_modules:
    # Offline: pip cannot reach the index, so only report what is missing.
    print('[warn] offline mode: skipping pip install for base notebook dependencies ->', ', '.join(_missing_boot_modules))
```
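`_task3_module_available` is defined elsewhere in the notebook. A minimal sketch of such a check, assuming it only needs to tell whether a top-level module can be imported; the name `module_available` and the body are hypothetical:

```python
import importlib.util


def module_available(name):
    """Return True if `name` can be imported, without actually importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        return False


missing = [name for name in ("json", "no_such_module_xyz_123") if not module_available(name)]
```

The check takes import names, which can differ from pip package names; this is why the snippet above tests for `yaml` but installs `pyyaml`.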
--------------------------------
### Apply Quick Setup Configuration
Source: https://github.com/visualcomments/top-papers-graph.git/blob/main/notebooks/task3_dual_local_models_blind_ab_kaggle_offline.ipynb
Applies the current settings from the quick setup UI to global variables. It also checks for initial YAML override errors and displays a warning if any are found.
```python
_task3_quick_apply_to_globals(show_message=False)
if TASK3_QUICK_INITIAL_YAML_ERROR:
    quick_status_html.value = _task3_quick_render_box(
        'There is a problem in the saved YAML override',
        lines=[TASK3_QUICK_INITIAL_YAML_ERROR],
        tone='warning',
    )
```
--------------------------------
### Build Initial Setup for Task 3
Source: https://github.com/visualcomments/top-papers-graph.git/blob/main/notebooks/task3_dual_local_models_blind_ab_kaggle_offline.ipynb
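Builds the initial Task 3 setup by layering overrides on top of the defaults, in this order: the `TASK3_DUAL_SETUP` dict, a JSON string (`TASK3_DUAL_SETUP_JSON`), a YAML file (`TASK3_DUAL_SETUP_PATH`), and inline YAML text (`TASK3_DUAL_SETUP_YAML`). Errors in the JSON string and the YAML file are silently ignored; a parse error in the inline YAML is returned to the caller.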
```python
def _task3_quick_build_initial_setup():
    merged = _task3_quick_default_setup()
    raw_setup = globals().get('TASK3_DUAL_SETUP')
    if isinstance(raw_setup, dict):
        merged = _task3_quick_deep_update(merged, raw_setup)
    json_text = str(globals().get('TASK3_DUAL_SETUP_JSON') or os.environ.get('TASK3_DUAL_SETUP_JSON') or '').strip()
    if json_text:
        try:
            payload = json.loads(json_text)
            if isinstance(payload, dict):
                merged = _task3_quick_deep_update(merged, payload)
        except Exception:
            pass
    setup_path = str(globals().get('TASK3_DUAL_SETUP_PATH') or os.environ.get('TASK3_DUAL_SETUP_PATH') or '').strip()
    if setup_path:
        setup_file = Path(setup_path).expanduser()
        if setup_file.exists():
            try:
                payload = yaml.safe_load(setup_file.read_text(encoding='utf-8')) or {}
                if isinstance(payload, dict):
                    merged = _task3_quick_deep_update(merged, payload)
            except Exception:
                pass
    yaml_text = str(globals().get('TASK3_DUAL_SETUP_YAML') or '').strip()
    yaml_payload, yaml_error = _task3_quick_parse_yaml(yaml_text)
    if yaml_payload:
        merged = _task3_quick_deep_update(merged, yaml_payload)
    return merged, yaml_text, yaml_error


TASK3_QUICK_INITIAL_SETUP, TASK3_QUICK_INITIAL_YAML_TEXT, TASK3_QUICK_INITIAL_YAML_ERROR = _task3_quick_build_initial_setup()
```
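`_task3_quick_deep_update` is not shown here. A minimal sketch of a recursive merge it might perform, assuming nested dicts are merged key by key and any other value is overwritten; the function name `deep_update` and the sample keys are hypothetical:

```python
import copy


def deep_update(base, override):
    """Return a copy of `base` with `override` merged in; nested dicts merge key by key."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_update(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


defaults = {"models": {"a": "model-a", "b": "model-b"}, "offline": False}
from_json = {"models": {"b": "model-b-json"}}
from_yaml = {"offline": True}

setup = defaults
for layer in (from_json, from_yaml):  # later layers take precedence
    setup = deep_update(setup, layer)
```

Because each layer is applied on top of the previous result, a later source only needs to list the keys it changes; `models.a` survives both overrides in this example.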
--------------------------------
### Install and Import Libraries for Task 3
Source: https://github.com/visualcomments/top-papers-graph.git/blob/main/task3_multimodal_temporal_hypothesis_generation.ipynb
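Installs the package in editable mode and imports the core libraries; run this cell first. When a smoke-test or skip-optional-deps environment variable is set to `1`, the install command is replaced with a `--no-deps` variant; otherwise the `_task3_editable_install` command defined earlier in the notebook is used.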
```python
if os.environ.get('TASK3_NOTEBOOK_SMOKE') == '1' or os.environ.get('TASK3_DUAL_NOTEBOOK_SMOKE') == '1' or os.environ.get('TASK3_NOTEBOOK_SKIP_OPTIONAL_DEPS') == '1':
    _task3_editable_install = [sys.executable, '-m', 'pip', 'install', '-q', '-e', '.', '--no-deps']
run_optional(_task3_editable_install, cwd=REPO_DIR, label='pip install editable task3 notebook runtime')
import yaml
import ipywidgets as W
from IPython.display import HTML, Markdown, display, clear_output
from unidecode import unidecode
from scireason.config import settings
from scireason.llm import list_available_g4f_models
from scireason.pipeline.task3_hypothesis_generation import prepare_task3_hypothesis_bundle
from scireason.task3_offline_review import (
    build_task3_expert_artifact_bundle,
    build_task3_offline_review_package,
)
TASK3_DEFAULT_G4F_MODEL = getattr(settings, 'task2_default_g4f_model', 'auto') or 'auto'
TASK3_DEFAULT_LOCAL_VLM_MODEL = getattr(settings, 'vlm_model_id', 'Qwen/Qwen2.5-VL-3B-Instruct') or 'Qwen/Qwen2.5-VL-3B-Instruct'
TASK3_LAST_RUN = None
TASK3_LAST_TASK_META = None
TASK3_LAST_ARTIFACTS = None
print('REPO_DIR =', REPO_DIR)
print('SRC_DIR =', SRC_DIR)
print('task3 default g4f model =', TASK3_DEFAULT_G4F_MODEL)
print('task3 default local VLM model =', TASK3_DEFAULT_LOCAL_VLM_MODEL)
```
--------------------------------
### Environment Setup and Path Initialization
Source: https://github.com/visualcomments/top-papers-graph.git/blob/main/notebooks/task3_dual_local_models_blind_ab_kaggle_gdown_script_fixed.ipynb
Initializes environment variables and directories for Kaggle operations, including Hugging Face cache, VLM timeouts, and DPI settings. Creates necessary workspace and output directories.
```python
import importlib
import json
import os
import shutil
import subprocess
import sys
import zipfile
from pathlib import Path
WORKSPACE_DIR = Path(CFG["workspace_dir"]).expanduser()
OUTPUT_DIR = Path(CFG["output_dir"]).expanduser()
HF_HOME = Path(str(CFG.get("hf_home") or "/kaggle/working/.hf")).expanduser()
HF_HOME.mkdir(parents=True, exist_ok=True)
os.environ.setdefault("HF_HOME", str(HF_HOME))
os.environ.setdefault("HF_HUB_CACHE", str(HF_HOME / "hub"))
os.environ.setdefault("TRANSFORMERS_CACHE", str(HF_HOME / "hub"))
os.environ.setdefault("LOCAL_VLM_REQUEST_TIMEOUT_SECONDS", str(CFG.get("local_vlm_request_timeout_seconds") or 120))
os.environ.setdefault("SCIREASON_LOCAL_VLM_REQUEST_TIMEOUT_SECONDS", os.environ.get("LOCAL_VLM_REQUEST_TIMEOUT_SECONDS", "120"))
os.environ.setdefault("PDF_RENDER_DPI", str(CFG.get("pdf_render_dpi") or 110))
os.environ.setdefault("VLM_MAX_PIXELS", str(CFG.get("vlm_max_pixels") or (1024 * 28 * 28)))
os.environ.setdefault("HF_HUB_DISABLE_PROGRESS_BARS", "1")
os.environ.setdefault("TOKENIZERS_PARALLELISM", "false")
WORKSPACE_DIR.mkdir(parents=True, exist_ok=True)
OUTPUT_DIR.mkdir(parents=True, exist_ok=True)
```
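Every variable above is set with `os.environ.setdefault`, so a value exported before the notebook starts always wins over the configured default. A short sketch of that behaviour; the `DEMO_*` names and the `cfg` dict are placeholders, not variables used by the project:

```python
import os

cfg = {"demo_dpi": None}  # hypothetical config with the value left unset

os.environ["DEMO_TIMEOUT_SECONDS"] = "300"  # pretend the user exported this earlier
os.environ.pop("DEMO_RENDER_DPI", None)     # and left this one unset

# setdefault only writes when the key is absent.
os.environ.setdefault("DEMO_TIMEOUT_SECONDS", "120")
# `cfg.get(...) or 110` falls back to 110 for a missing, None, or empty value.
os.environ.setdefault("DEMO_RENDER_DPI", str(cfg.get("demo_dpi") or 110))

timeout_value = os.environ["DEMO_TIMEOUT_SECONDS"]
dpi_value = os.environ["DEMO_RENDER_DPI"]
```

Here `timeout_value` keeps the pre-existing `"300"`, while `dpi_value` receives the fallback `"110"`. Environment values are always strings, which is why the defaults are wrapped in `str(...)`.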
--------------------------------
### Editable Install Task 3 Package
Source: https://github.com/visualcomments/top-papers-graph.git/blob/main/task3_dual_local_models_blind_ab_guarded_fixed.ipynb
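Installs the Task 3 package in editable mode with the `task3` extras. In offline mode, or when a smoke-test or skip-optional-deps environment variable is set to `1`, it installs without dependencies instead. If the chosen command fails, it retries with `--no-deps`.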
```python
_task3_editable_install = [sys.executable, '-m', 'pip', 'install', '-q', '-e', '.[task3]']
if TASK3_RUNTIME_OFFLINE or os.environ.get('TASK3_NOTEBOOK_SMOKE') == '1' or os.environ.get('TASK3_DUAL_NOTEBOOK_SMOKE') == '1' or os.environ.get('TASK3_NOTEBOOK_SKIP_OPTIONAL_DEPS') == '1':
    _task3_editable_install = [sys.executable, '-m', 'pip', 'install', '-q', '-e', '.', '--no-deps']
if not run_optional(_task3_editable_install, cwd=REPO_DIR, label='pip install editable task3 notebook runtime'):
    run_optional([sys.executable, '-m', 'pip', 'install', '-q', '-e', '.', '--no-deps'], cwd=REPO_DIR, label='pip install editable task3 notebook runtime --no-deps fallback')
```
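The command selection above can be factored into a small pure function, which makes the mode logic easy to check without running pip. This is only an illustrative refactoring; `choose_editable_install` and `SKIP_FLAGS` are hypothetical names, while the flag names and pip arguments are taken from the snippet:

```python
import sys

SKIP_FLAGS = (
    "TASK3_NOTEBOOK_SMOKE",
    "TASK3_DUAL_NOTEBOOK_SMOKE",
    "TASK3_NOTEBOOK_SKIP_OPTIONAL_DEPS",
)


def choose_editable_install(offline, env):
    """Return the pip command for the current mode (task3 extras vs. --no-deps)."""
    base = [sys.executable, "-m", "pip", "install", "-q", "-e"]
    if offline or any(env.get(flag) == "1" for flag in SKIP_FLAGS):
        return base + [".", "--no-deps"]
    return base + [".[task3]"]


online_cmd = choose_editable_install(False, {})
smoke_cmd = choose_editable_install(False, {"TASK3_NOTEBOOK_SMOKE": "1"})
```

Passing the environment as a plain mapping (for example `os.environ` in the notebook, or a dict in a test) keeps the function free of global state.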