### Install Scrapinghub library

Source: https://python-scrapinghub.readthedocs.io/en/latest/quickstart.html

Commands to install the scrapinghub package via pip, including an optional version with MessagePack support for improved performance.

```bash
pip install scrapinghub
pip install scrapinghub[msgpack]
```
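
To confirm whether the optional MessagePack dependency is present in the current environment, a check along these lines may help. It is a minimal sketch that assumes the extra provides a module importable as `msgpack`; the helper name `has_msgpack` is illustrative and not part of the library.

```python
import importlib.util


def has_msgpack():
    """Return True if a module named 'msgpack' can be found (assumed name)."""
    return importlib.util.find_spec("msgpack") is not None


if __name__ == "__main__":
    print("MessagePack available:", has_msgpack())
```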

--------------------------------

### Install Scrapinghub Python Library

Source: https://python-scrapinghub.readthedocs.io/en/latest/_sources/quickstart.rst.txt

Installs the Scrapinghub Python library. It is recommended to install with MessagePack support for better performance and bandwidth usage.

```bash
pip install scrapinghub
```

```bash
pip install scrapinghub[msgpack]
```

--------------------------------

### Get a Specific Project Setting Value (Python)

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Illustrates how to retrieve the value of a specific project setting by its key. The `get` method takes the setting key as a string parameter.

```python
>>> project = client.get_project(123)
>>> project.settings.get('default_job_units')
2
```

--------------------------------

### Initialize Scrapinghub Client and Manage Projects

Source: https://python-scrapinghub.readthedocs.io/en/latest/quickstart.html

Demonstrates how to instantiate the ScrapinghubClient using an API key and how to list available projects.

```python
from scrapinghub import ScrapinghubClient
apikey = '84c87545607a4bc0****************'
client = ScrapinghubClient(apikey)
client.projects.list()
```

--------------------------------

### Access Project Settings Instance (Python)

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Demonstrates how to get a Settings instance for a specific project. This is accessed via the `settings` attribute of a Project object.

```python
>>> project = client.get_project(123)
>>> project.settings
<scrapinghub.client.projects.Settings at 0x10ecf1250>
```

--------------------------------

### Python: Manage Collection Items with Scrapinghub Client

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Demonstrates common operations on a Scrapinghub collection using the Python client. This includes setting items with a key, counting all items, retrieving a specific item by key, iterating over all items, iterating over key-value pairs, getting item keys, filtering by multiple keys, deleting items by key, and truncating the entire collection. Ensure the Scrapinghub client is installed and authenticated.

```python
>>> foo_store.set({'_key': '002d050ee3ff6192dcbecc4e4b4457d7',
...                'value': '1447221694537'})

>>> foo_store.count()
1

>>> foo_store.get('002d050ee3ff6192dcbecc4e4b4457d7')
{'value': '1447221694537'}

>>> foo_store.iter()
<generator object jldecode at 0x1049eef10>

>>> for elem in foo_store.iter(count=1):
...     print(elem)
[{'_key': '002d050ee3ff6192dcbecc4e4b4457d7', 'value': '1447221694537'}]

>>> keys = foo_store.iter(nodata=True, meta=['_key'])
>>> next(keys)
{'_key': '002d050ee3ff6192dcbecc4e4b4457d7'}

>>> foo_store.list(key=['002d050ee3ff6192dcbecc4e4b4457d7', 'blah'])
[{'_key': '002d050ee3ff6192dcbecc4e4b4457d7', 'value': '1447221694537'}]

>>> foo_store.delete('002d050ee3ff6192dcbecc4e4b4457d7')

>>> foo_store.truncate()
```
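
Because `iter()` yields dictionaries that carry their own `_key`, it can be convenient to index the results by that key. The sketch below is one possible way to do this; `items_by_key` is a hypothetical helper, and it only assumes that `store.iter(**params)` yields dicts containing a `_key` entry, as in the examples above.

```python
def items_by_key(store, **params):
    """Map each item's '_key' to the rest of its fields.

    `store` is any object whose iter(**params) yields dicts with a '_key'
    entry (an assumption based on the collection examples).
    """
    indexed = {}
    for item in store.iter(**params):
        fields = dict(item)        # copy so the yielded dict is left untouched
        key = fields.pop("_key")
        indexed[key] = fields
    return indexed
```

With a real collection this might be called as `items_by_key(foo_store)`; the whole result is held in memory, so it suits small collections only.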

--------------------------------

### Get a Project Instance using Scrapinghub Client (Python)

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Demonstrates how to obtain a Project instance using the ScrapinghubClient. This is the recommended way to interact with individual projects.

```python
>>> client = ScrapinghubClient()
>>> project = client.get_project(123)
>>> project
<scrapinghub.client.projects.Project at 0x106cdd6a0>
>>> project.key
'123'
```

--------------------------------

### Run a New Job with Scrapinghub Client

Source: https://python-scrapinghub.readthedocs.io/en/latest/_sources/quickstart.rst.txt

Illustrates how to run a new job for a specific project, providing the spider name and optional job arguments.

```python
project = client.get_project(123)
project.jobs.run('spider1', job_args={'arg1': 'val1'})
```

--------------------------------

### Instantiate Scrapinghub Client

Source: https://python-scrapinghub.readthedocs.io/en/latest/_sources/quickstart.rst.txt

Demonstrates how to instantiate the Scrapinghub client using an API key. The API key can be found in the Zyte account settings.

```python
from scrapinghub import ScrapinghubClient
apikey = '84c87545607a4bc0****************' # your API key as a string
client = ScrapinghubClient(apikey)
```

--------------------------------

### Run Integration Tests

Source: https://python-scrapinghub.readthedocs.io/en/latest/quickstart.html

Commands to execute integration tests using pytest, including flags to ignore or update existing VCR.py cassettes.

```bash
py.test --ignore-cassettes
py.test --update-cassettes
```

--------------------------------

### GET /projects/summary

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Get a summary of job statuses across all projects.

```APIDOC
## GET /projects/summary

### Description
Returns a list of dictionaries containing job counts (pending, running, finished) and capacity status for user projects.

### Method
GET

### Endpoint
/projects/summary

### Parameters
#### Query Parameters
- **state** (string/list) - Optional - Filter projects by specific job state.

### Response
#### Success Response (200)
- **summaries** (list[dict]) - List of project status summaries.

#### Response Example
[
  {
    "project": 123,
    "pending": 0,
    "running": 1,
    "finished": 674,
    "has_capacity": true
  }
]
```

--------------------------------

### List Deployed Projects with Scrapinghub Client

Source: https://python-scrapinghub.readthedocs.io/en/latest/_sources/quickstart.rst.txt

Shows how to list all deployed projects associated with the provided API key using the Scrapinghub client.

```python
client.projects.list()
```

--------------------------------

### Run Jobs and Retrieve Data

Source: https://python-scrapinghub.readthedocs.io/en/latest/quickstart.html

Shows how to trigger a spider job for a specific project and iterate through the resulting items collected by a job.

```python
project = client.get_project(123)
project.jobs.run('spider1', job_args={'arg1': 'val1'})

job = client.get_job('123/1/2')
for item in job.items.iter():
    print(item)
```
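
When only a preview of a job's output is needed, the iterator can be cut short instead of consumed in full. This is a small sketch under the assumption that `job.items.iter()` returns a lazy iterator, as the loop above suggests; `first_items` is a hypothetical helper name.

```python
from itertools import islice


def first_items(job, limit=10):
    """Return at most `limit` items from job.items.iter() without reading the rest."""
    return list(islice(job.items.iter(), limit))
```

For example, `first_items(client.get_job('123/1/2'), limit=5)` would return up to five item dictionaries.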

--------------------------------

### Start a Scraping Job (Python)

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Illustrates how to move a job to a running state. This method can accept optional keyword meta parameters and returns the previous string state of the job.

```python
>>> job.start()
'pending'
```

--------------------------------

### GET /projects

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Retrieve a list of project IDs available to the current user.

```APIDOC
## GET /projects

### Description
Returns a list of project IDs associated with the current user account.

### Method
GET

### Endpoint
/projects

### Response
#### Success Response (200)
- **projects** (list[int]) - A list of numeric project IDs.

#### Response Example
[123, 456]
```

--------------------------------

### Retrieving a Spider Instance

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Demonstrates the usage of the get method to fetch a specific spider object by its name.

```python
spider = project.spiders.get('spider2')
```

--------------------------------

### Get a Specific Project by ID using Projects Collection (Python)

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Illustrates how to retrieve a specific project object using its ID from the Projects collection. The `get` method takes an integer or string numeric project ID.

```python
>>> client = ScrapinghubClient()
>>> project = client.projects.get(123)
>>> project
<scrapinghub.client.projects.Project at 0x106cdd6a0>
```

--------------------------------

### Access Job Output Data with Scrapinghub Client

Source: https://python-scrapinghub.readthedocs.io/en/latest/_sources/quickstart.rst.txt

Demonstrates how to retrieve and iterate over the output data of a specific job using its ID.

```python
job = client.get_job('123/1/2')
for item in job.items.iter():
    print(item)
```

--------------------------------

### List All Project IDs using Projects Collection (Python)

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Demonstrates how to get a list of all project IDs available to the current user. The `list` method returns a list of integers representing project IDs.

```python
>>> client = ScrapinghubClient()
>>> client.projects.list()
[123, 456]
```

--------------------------------

### GET /projects/{project_id}

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Retrieve a specific project object to access its resources like jobs, spiders, and settings.

```APIDOC
## GET /projects/{project_id}

### Description
Retrieves a project object for a given project ID, providing access to nested resources.

### Method
GET

### Endpoint
/projects/{project_id}

### Parameters
#### Path Parameters
- **project_id** (string/int) - Required - The unique identifier for the project.

### Response
#### Success Response (200)
- **project** (Object) - A project instance containing activity, collections, frontiers, jobs, settings, and spiders.

#### Response Example
{
  "key": "123",
  "has_capacity": true
}
```

--------------------------------

### Get Project Summaries using Projects Collection (Python)

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Shows how to retrieve summaries for all available user projects. The `summary` method can optionally filter by project state and returns a list of dictionaries, each containing project status details.

```python
>>> client = ScrapinghubClient()
>>> client.projects.summary()
[{'finished': 674,
  'has_capacity': True,
  'pending': 0,
  'project': 123,
  'running': 1},
 {'finished': 33079,
  'has_capacity': True,
  'pending': 0,
  'project': 456,
  'running': 2}]
```
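
Since `summary()` returns plain dictionaries, aggregating across projects needs no further API calls. The following sketch totals the job counts from the example output above; `totals_by_state` is an illustrative helper, not a library function.

```python
def totals_by_state(summaries, states=("pending", "running", "finished")):
    """Sum the per-project job counts for each state."""
    return {state: sum(s.get(state, 0) for s in summaries) for state in states}


# Data copied from the example output above.
example = [
    {"finished": 674, "has_capacity": True, "pending": 0, "project": 123, "running": 1},
    {"finished": 33079, "has_capacity": True, "pending": 0, "project": 456, "running": 2},
]
print(totals_by_state(example))
# → {'pending': 0, 'running': 3, 'finished': 33753}
```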

--------------------------------

### List Job Metadata (Python)

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Demonstrates how to get a list of all job metadata key/value pairs using the `list()` method of the JobMeta class. Be aware that this can consume significant memory for large datasets.

```python
>>> job.metadata.list()
[('project', 123), ('units', 1), ('state', 'finished'), ...]
```

--------------------------------

### Manage Collections with Scrapinghub Client (Python)

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/overview.html

Demonstrates the workflow for using Scrapinghub Collections to store and retrieve records. It covers getting a store, setting key-value pairs, counting items, getting specific items, iterating, filtering, and deleting.

```python
>>> collections = project.collections
>>> foo_store = collections.get_store('foo_store')
>>> foo_store.set({'_key': '002d050ee3ff6192dcbecc4e4b4457d7', 'value': '1447221694537'})
>>> foo_store.count()
1
>>> foo_store.get('002d050ee3ff6192dcbecc4e4b4457d7')
{u'value': u'1447221694537'}
>>> # iterate over _key & value pair
... list(foo_store.iter())
[{u'_key': u'002d050ee3ff6192dcbecc4e4b4457d7', u'value': u'1447221694537'}]
>>> # filter by multiple keys - only values for keys that exist will be returned
... list(foo_store.iter(key=['002d050ee3ff6192dcbecc4e4b4457d7', 'blah']))
[{u'_key': u'002d050ee3ff6192dcbecc4e4b4457d7', u'value': u'1447221694537'}]
>>> foo_store.delete('002d050ee3ff6192dcbecc4e4b4457d7')
>>> foo_store.count()
0
```

--------------------------------

### GET /jobs

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Methods for managing and retrieving information about scraping jobs within a project.

```APIDOC
## GET /jobs

### Description
Retrieves a list or summary of scraping jobs associated with a project.

### Method
GET

### Endpoint
/jobs

### Parameters
#### Query Parameters
- **project_id** (string) - Required - The ID of the project to query.

### Response
#### Success Response (200)
- **jobs** (array) - List of job objects.

#### Response Example
{
  "jobs": [{"id": "123/45/67", "state": "finished"}]
}
```

--------------------------------

### Get and Access Job Information (Python)

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Demonstrates how to retrieve a specific job using its key and access its properties like key and metadata. It requires an initialized Scrapinghub client and a project object.

```python
>>> job = project.jobs.get('123/1/2')
>>> job.key
'123/1/2'
>>> job.metadata.get('state')
'finished'
```

--------------------------------

### Get Project Instance by ID

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Fetches a `Project` instance associated with a given project ID. This method serves as a shortcut for accessing projects via the `client.projects.get()` method. The project ID can be provided as an integer or a numeric string.

```python
>>> project = client.get_project(123)
>>> project
<scrapinghub.client.projects.Project at 0x106cdd6a0>
```

--------------------------------

### Get Job Collection Instance (Python)

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Demonstrates how to access the Jobs collection object associated with a project or a spider. This object is used to manage multiple jobs.

```python
>>> project.jobs
<scrapinghub.client.jobs.Jobs at 0x10477f0b8>
>>> spider = project.spiders.get('spider1')
>>> spider.jobs
<scrapinghub.client.jobs.Jobs at 0x104767e80>
```

--------------------------------

### Iterate Project Frontiers

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Shows how to get an iterator for all frontiers within a project. This allows for sequential processing of all frontiers associated with the project.

```python
>>> project.frontiers.iter()
<list_iterator at 0x103c93630>
```

--------------------------------

### Iterate Frontier Slots

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Demonstrates how to get an iterator for all slots within a frontier. This is useful for processing all available frontier slots sequentially.

```python
>>> frontier.iter()
<list_iterator at 0x1030736d8>
```

--------------------------------

### Getting a Versioned Store Collection (Python)

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Retrieves a collection that retains up to three copies of each item. This is suitable for scenarios where historical data or rollbacks are needed.

```python
versioned_store = collections.get_versioned_store('my_versioned_data')
```

--------------------------------

### Iterate Through Job Metadata (Python)

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Provides an example of iterating through job metadata key/value pairs using the `iter()` method of the JobMeta class. This is recommended for large amounts of metadata to avoid memory issues.

```python
>>> job.metadata.iter()
<dict_itemiterator at 0x104adbd18>
```
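
The iterator yields key/value pairs, so a subset of fields can be collected without building the full list first. A minimal sketch, assuming only that the pairs look like the `(key, value)` tuples shown for `job.metadata.list()`; `pick_metadata` is a hypothetical helper.

```python
def pick_metadata(pairs, wanted):
    """Collect only the `wanted` keys from an iterable of (key, value) pairs."""
    wanted = set(wanted)
    return {key: value for key, value in pairs if key in wanted}
```

A call such as `pick_metadata(job.metadata.iter(), ['state', 'units'])` would return a small dictionary holding just those two fields.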

--------------------------------

### Getting a Versioned Cached Store Collection (Python)

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Retrieves a collection that retains multiple copies of items, with each copy expiring after a month. This offers a balance between versioning and caching.

```python
versioned_cached_store = collections.get_versioned_cached_store('my_versioned_cached_data')
```

--------------------------------

### Retrieve and Filter Job Logs using Python

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Demonstrates how to access job logs using the iter() and list() methods. Includes examples of iterating through logs, limiting results, and applying filters based on log levels and message content.

```python
# Retrieve all logs from a job
job.logs.iter()

# Iterate through first 100 log entries
for log in job.logs.iter(count=100):
    print(log)

# Retrieve a single log entry
job.logs.list(count=1)

# Retrieve logs with a specific level and filter by keyword
filters = [("message", "contains", ["mymessage"])]
job.logs.list(level='WARNING', filter=filters)
```
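
The filter argument above is a list of `(field, operator, values)` tuples. To avoid retyping that structure, a small builder can produce it. This is only a sketch that reproduces the shape shown in the example; `message_contains` is a hypothetical name, and no other operators or fields are assumed.

```python
def message_contains(keyword):
    """Build a filter list in the (field, operator, values) shape used above."""
    if not keyword:
        raise ValueError("keyword must be a non-empty string")
    return [("message", "contains", [keyword])]
```

It could then be passed as `job.logs.list(level='WARNING', filter=message_contains('mymessage'))`.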

--------------------------------

### Getting a Collection by Type and Name (Python)

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Provides the base method for retrieving a specific collection using its type and name. This method is fundamental for accessing any type of collection.

```python
collection = collections.get('s', 'my_store')  # positional arguments: collection type, then name
```

--------------------------------

### Filter and Paginate Jobs

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/overview.html

Shows how to filter jobs by state, tags, and custom metadata fields, as well as how to handle pagination using the start parameter.

```python
>>> jobs_summary = project.jobs.iter(has_tag=['new', 'verified'], lacks_tag='obsolete')
>>> jobs_summary = project.jobs.iter(spider='foo', state='finished', count=3)
>>> jobs_summary = spider.jobs.iter(start=1000)
>>> jobs_summary = project.jobs.iter(jobmeta=['scheduled_by'])
```
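
Pagination with `start` can be wrapped in a generator that keeps requesting pages until a short page comes back. The sketch below assumes that `start` is a zero-based offset and that `count` caps the page size; `iter_all_jobs` and the default `page_size` are illustrative choices, not library behavior.

```python
def iter_all_jobs(jobs, page_size=1000, **filters):
    """Yield every job summary by requesting successive pages.

    Assumes `start` is a zero-based offset and `count` limits the page size.
    """
    start = 0
    while True:
        page = list(jobs.iter(start=start, count=page_size, **filters))
        yield from page
        if len(page) < page_size:   # a short page means nothing is left
            break
        start += page_size
```

Usage might look like `for summary in iter_all_jobs(project.jobs, state='finished'): ...`.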

--------------------------------

### Project Settings Management

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/overview.html

Endpoints for interacting with project settings, including listing all settings, getting specific values, and updating single or multiple settings.

```APIDOC
## GET /project/settings

### Description
Retrieves a list of all current project settings and their values.

### Method
GET

### Endpoint
/project/settings

### Response
#### Success Response (200)
- **settings** (list) - A list of tuples containing setting names and their values.

#### Response Example
[('default_job_units', 2), ('job_runtime_limit', 24)]

---

## GET /project/settings/{name}

### Description
Retrieves the value of a specific project setting by its name.

### Method
GET

### Endpoint
/project/settings/{name}

### Parameters
#### Path Parameters
- **name** (string) - Required - The name of the setting to retrieve.

### Response
#### Success Response (200)
- **value** (any) - The value associated with the setting.

---

## POST /project/settings/{name}

### Description
Updates the value of a specific project setting.

### Method
POST

### Endpoint
/project/settings/{name}

### Parameters
#### Path Parameters
- **name** (string) - Required - The name of the setting to update.

#### Request Body
- **value** (any) - Required - The new value for the setting.

---

## PATCH /project/settings

### Description
Updates multiple project settings simultaneously using a dictionary of key-value pairs.

### Method
PATCH

### Endpoint
/project/settings

### Request Body
- **settings** (object) - Required - A dictionary containing the settings to update.

### Request Example
{
  "default_job_units": 1,
  "job_runtime_limit": 20
}
```

--------------------------------

### List Job Samples by Timestamp with Scrapinghub Client

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Shows how to list job samples filtered by a timestamp using the `list()` method. This example retrieves samples with a timestamp greater than or equal to the provided value. The output is a list of lists, where each inner list contains sample data.

```python
>>> job.samples.list(startts=1484570043851)
[[1484570043851, 554, 576, 1777, 821, 0],
 [1484570046673, 561, 583, 1782, 821, 0]]
```
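
The first value in each sample row looks like a Unix timestamp in milliseconds, matching the `startts` argument. Under that assumption (the source does not name the columns), a row's time can be made readable as follows; `sample_time` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone


def sample_time(row):
    """Interpret row[0] as epoch milliseconds (an assumption) and return UTC time."""
    seconds, millis = divmod(row[0], 1000)
    return datetime.fromtimestamp(seconds, tz=timezone.utc) + timedelta(milliseconds=millis)


print(sample_time([1484570043851, 554, 576, 1777, 821, 0]).isoformat())
```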

--------------------------------

### List Project Settings (Python)

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Demonstrates how to retrieve all project settings as a list. The `list` method is a convenient shortcut but may consume significant memory for large numbers of settings.

```python
>>> project = client.get_project(123)
>>> project.settings.list()
[(u'default_job_units', 2), (u'job_runtime_limit', 20)]
```

--------------------------------

### Python: List Collection Items with Scrapinghub Client

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Uses the `list()` method to retrieve items from a Scrapinghub collection as a list. It accepts the same filtering parameters as `iter()`. For large collections `list()` can consume a substantial amount of memory, so `iter()` is the better choice there.

```python
all_items = foo_store.list(key=['key1', 'key2'], prefix='data_')
print(all_items)
```

--------------------------------

### Get Job Summaries by State (Python)

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Retrieves a summary of jobs, optionally filtered by state or spider. The returned data is a list of dictionaries, grouped by job state. If a specific state is provided, it returns a single dictionary for that state.

```python
>>> spider.jobs.summary()
[{'count': 0, 'name': 'pending', 'summary': []},
 {'count': 0, 'name': 'running', 'summary': []},
 {'count': 5, 'name': 'finished', 'summary': [...]}]

>>> project.jobs.summary('pending')
{'count': 0, 'name': 'pending', 'summary': []}
```
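
The unfiltered summary is a list with one entry per state, so it reduces naturally to a name-to-count mapping. A minimal sketch, assuming each entry carries `name` and `count` keys as in the output above; `job_counts` is an illustrative helper.

```python
def job_counts(summary):
    """Turn the per-state summary list into a {state_name: count} dictionary."""
    return {entry["name"]: entry["count"] for entry in summary}
```

For instance, `job_counts(spider.jobs.summary())` would give a dictionary keyed by `'pending'`, `'running'`, and `'finished'`.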

--------------------------------

### Manage Data with Hubstorage in Python

Source: https://python-scrapinghub.readthedocs.io/en/latest/_sources/legacy/hubstorage.rst.txt

Demonstrates basic CRUD operations on a key-value collection through a `foo_store` object from the legacy Hubstorage client. It shows how to set, count, get, iterate over, and delete entries. Requires the `scrapinghub` library.

```python
>>> foo_store.set({'_key': '002d050ee3ff6192dcbecc4e4b4457d7', 'value': '1447221694537'})
>>> foo_store.count()
1
>>> foo_store.get('002d050ee3ff6192dcbecc4e4b4457d7')
{u'value': u'1447221694537'}
>>> # iterate over _key & value pair
... list(foo_store.iter_values())
[{u'_key': u'002d050ee3ff6192dcbecc4e4b4457d7', u'value': u'1447221694537'}]
>>> # filter by multiple keys - only values for keys that exist will be returned
... list(foo_store.iter_values(key=['002d050ee3ff6192dcbecc4e4b4457d7', 'blah']))
[{u'_key': u'002d050ee3ff6192dcbecc4e4b4457d7', u'value': u'1447221694537'}]
>>> foo_store.delete('002d050ee3ff6192dcbecc4e4b4457d7')
>>> foo_store.count()
0
```

--------------------------------

### Get job information in Python

Source: https://python-scrapinghub.readthedocs.io/en/latest/_sources/legacy/connection.rst.txt

Retrieves metadata and information about a specific job. This includes details like the spider name, start time, tags, and field counts.

```python
print(job.info['spider'])
print(job.info['started_time'])
print(job.info['tags'])
print(job.info['fields_count']['description'])
```

--------------------------------

### Get Specific Frontier Slot

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Illustrates how to get a specific FrontierSlot object by its name. This allows for targeted operations on a particular slot.

```python
>>> frontier.get('example.com')
<scrapinghub.client.frontiers.FrontierSlot at 0x1049d8978>
```

--------------------------------

### Get Specific Job Metadata Field (Python)

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Shows how to retrieve the value of a specific metadata field by its name using the `get()` method of the JobMeta class.

```python
>>> job.metadata.get('version')
'test'
```

--------------------------------

### Manage Frontiers with Scrapinghub Client (Python)

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/overview.html

Explains how to manage frontiers using the Scrapinghub client. It covers iterating through all frontiers, listing them, getting a specific frontier, iterating through frontier slots, listing slots, getting a specific slot, and adding/deleting requests and fingerprints.

```python
>>> frontiers = project.frontiers

>>> frontiers.iter()
<list_iterator at 0x103c93630>

>>> frontiers.list()
['test', 'test1', 'test2']

>>> frontier = frontiers.get('test')
>>> frontier
<scrapinghub.client.Frontier at 0x1048ae4a8>

>>> frontier.iter()
<list_iterator at 0x1030736d8>

>>> frontier.list()
['example.com', 'example.com2']

>>> slot = frontier.get('example.com')
>>> slot
<scrapinghub.client.FrontierSlot at 0x1049d8978>

>>> slot.queue.add([{'fp': '/some/path.html'}])
>>> slot.flush()
>>> slot.newcount
1

>>> frontier.newcount
1
>>> frontiers.newcount
3

>>> slot.fingerprints.add(['fp1', 'fp2'])
>>> slot.flush()

>>> slot.q.add([{'fp': '/'}, {'fp': 'page1.html', 'p': 1, 'qdata': {'depth': 1}}])
>>> slot.flush()

>>> reqs = slot.q.iter()

>>> fps = slot.f.iter()

>>> reqs = slot.q.list()

>>> slot.q.delete('00013967d8af7b0001')

>>> slot.delete()

>>> frontier.flush()

>>> frontiers.flush()

>>> frontiers.close()
```
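
When many requests need to be queued, the add-then-flush pattern above can be applied in chunks. This is a sketch under stated assumptions: it relies only on `slot.q.add(list_of_dicts)` and `slot.flush()` as used in the example, and the `batch_size` default is arbitrary. `add_requests_in_batches` is a hypothetical helper.

```python
def add_requests_in_batches(slot, requests, batch_size=100):
    """Add request dicts to slot.q in chunks, flushing after each chunk.

    Returns the number of requests submitted.
    """
    batch, total = [], 0
    for request in requests:
        batch.append(request)
        if len(batch) == batch_size:
            slot.q.add(batch)
            slot.flush()
            total += len(batch)
            batch = []
    if batch:                      # submit the final partial chunk
        slot.q.add(batch)
        slot.flush()
        total += len(batch)
    return total
```

Flushing per chunk keeps memory use bounded; whether that is the best trade-off depends on how the client buffers writes, which is not covered here.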

--------------------------------

### Jobs Summary API

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Get a summary of jobs, optionally filtered by state or spider.

```APIDOC
## GET /jobs/summary

### Description
Retrieves a summary of jobs, optionally filtered by state or spider.

### Method
GET

### Endpoint
/jobs/summary

### Parameters
#### Query Parameters
- **state** (str) - Optional - Filter jobs by a specific state.
- **spider** (str) - Optional - Filter jobs by spider name (not needed if instantiated with `Spider`).
- **params** (dict) - Optional - Additional keyword arguments.

### Response
#### Success Response (200)
- **list[dict]** - A list of dictionaries containing job summaries, grouped by job state.

#### Response Example
# Example for spider.jobs.summary()
[{'count': 0, 'name': 'pending', 'summary': []},
 {'count': 0, 'name': 'running', 'summary': []},
 {'count': 5, 'name': 'finished', 'summary': [...]}]

# Example for project.jobs.summary('pending')
{'count': 0, 'name': 'pending', 'summary': []}
```

--------------------------------

### Initialize HubstorageClient and Project

Source: https://python-scrapinghub.readthedocs.io/en/latest/_sources/legacy/hubstorage.rst.txt

Demonstrates how to authenticate with the HubstorageClient using an API key and retrieve project-level information such as settings and job summaries.

```python
from scrapinghub import HubstorageClient
hc = HubstorageClient(auth='apikey')
hc.server_timestamp()

project = hc.get_project('1111111')
print(project.settings['botgroups'])
print(project.jobsummary())
```

--------------------------------

### Initialize Scrapinghub Client

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Demonstrates how to create an instance of the ScrapinghubClient. Authentication can be provided via an API key or environment variables. Optional parameters for the API endpoint and additional HubstorageClient arguments can also be passed.

```python
>>> from scrapinghub import ScrapinghubClient
>>> client = ScrapinghubClient('APIKEY')
>>> client
<scrapinghub.client.ScrapinghubClient at 0x1047af2e8>
```
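
To keep the key out of source code, it can be resolved from the environment before the client is created. The sketch below assumes the environment variable is named `SH_APIKEY`, which the text above does not state; `resolve_apikey` and `make_client` are hypothetical helpers. The library import is deferred so the module loads even where `scrapinghub` is not installed.

```python
import os


def resolve_apikey(explicit=None, environ=None):
    """Prefer an explicit key, else fall back to SH_APIKEY (assumed variable name)."""
    environ = os.environ if environ is None else environ
    apikey = explicit or environ.get("SH_APIKEY")
    if not apikey:
        raise RuntimeError("no API key given and SH_APIKEY is not set")
    return apikey


def make_client(explicit=None):
    """Create a client; scrapinghub is imported lazily inside the function."""
    from scrapinghub import ScrapinghubClient
    return ScrapinghubClient(resolve_apikey(explicit))
```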

--------------------------------

### GET /jobs/{job_key}

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Retrieves a specific job instance using its unique job key.

```APIDOC
## GET /jobs/{job_key}

### Description
Retrieves a Job object associated with the provided job key.

### Method
GET

### Endpoint
/jobs/{job_key}

### Parameters
#### Path Parameters
- **job_key** (string) - Required - Format: 'project_id/spider_id/job_id'

### Response
#### Success Response (200)
- **Job** (object) - A job instance object.
```
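
The job key format `'project_id/spider_id/job_id'` can be validated before a request is made. A minimal sketch, assuming all three components are non-negative integers as in `'123/1/2'`; `JobKey` and `parse_job_key` are illustrative names, not part of the client.

```python
from typing import NamedTuple


class JobKey(NamedTuple):
    project_id: int
    spider_id: int
    job_id: int


def parse_job_key(job_key):
    """Split 'project_id/spider_id/job_id' into integers, or raise ValueError."""
    parts = job_key.split("/")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError("expected 'project_id/spider_id/job_id', got %r" % (job_key,))
    return JobKey(*(int(p) for p in parts))


print(parse_job_key("123/1/2"))
# → JobKey(project_id=123, spider_id=1, job_id=2)
```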

--------------------------------

### Update Multiple Project Settings at Once (Python)

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Demonstrates how to update multiple project settings simultaneously using a dictionary. The `update` method allows for partial updates to settings.

```python
>>> project = client.get_project(123)
>>> project.settings.update({'default_job_units': 1,
...                          'job_runtime_limit': 20})
```
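
Since `update` accepts a partial dictionary, it can be given only the settings that actually differ from the current values. The helper below is a sketch of that idea; `changed_settings` is a hypothetical name, and it assumes the current settings are available as `(key, value)` pairs such as those from `project.settings.list()`.

```python
def changed_settings(current_pairs, desired):
    """Return only the entries of `desired` that differ from the current settings."""
    current = dict(current_pairs)
    missing = object()   # sentinel so a stored None is not confused with "absent"
    return {k: v for k, v in desired.items() if current.get(k, missing) != v}


current = [("default_job_units", 2), ("job_runtime_limit", 20)]
print(changed_settings(current, {"default_job_units": 1, "job_runtime_limit": 20}))
# → {'default_job_units': 1}
```

The result could then be passed to `project.settings.update(...)` when it is non-empty.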

--------------------------------

### GET /collections

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Methods for interacting with data collections, including CRUD operations on stored items.

```APIDOC
## GET /collections/{collection_name}

### Description
Retrieves items or metadata from a specific data collection.

### Method
GET

### Endpoint
/collections/{collection_name}

### Parameters
#### Path Parameters
- **collection_name** (string) - Required - The name of the collection to access.

### Response
#### Success Response (200)
- **items** (array) - List of items stored in the collection.

#### Response Example
{
  "items": [{"key": "value"}]
}
```

--------------------------------

### GET /logs

Source: https://python-scrapinghub.readthedocs.io/en/latest/legacy/hubstorage.html

Retrieve logs associated with a specific job.

```APIDOC
## GET /jobs/{job_id}/logs

### Description
Iterate through logs for a specific job.

### Method
GET

### Parameters
#### Query Parameters
- **count** (integer) - Optional - Number of log entries to retrieve.

### Response
#### Success Response (200)
- **logs** (list) - A list of dictionaries containing log level, message, and timestamp.

#### Response Example
{
  "logs": [{"level": "INFO", "message": "Started", "time": 1447221694537}]
}
```

--------------------------------

### Iterate Through Project Settings (Python)

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Shows how to iterate through the key/value pairs of a project's settings. The `iter` method returns an iterator over these pairs.

```python
>>> project = client.get_project(123)
>>> project.settings.iter()
<dictionary-itemiterator at 0x10ed11578>
```

--------------------------------

### Initialize Scrapinghub Client

Source: https://python-scrapinghub.readthedocs.io/en/latest/_sources/client/overview.rst.txt

Demonstrates how to instantiate the ScrapinghubClient using a valid API key. This client serves as the entry point for all platform interactions.

```python
from scrapinghub import ScrapinghubClient
apikey = '84c87545607a4bc0****************'
client = ScrapinghubClient(apikey)
```

--------------------------------

### Write Sample Item with Scrapinghub Client

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Illustrates how to write a new sample item to the collection using the `write()` method. The `item` parameter should be a dictionary containing the sample data.

```python
>>> job.samples.write({'data': 'sample_data'})
```

--------------------------------

### GET /jobs/count

Source: https://python-scrapinghub.readthedocs.io/en/latest/legacy/connection.html

Returns the total count of jobs for a project based on applied filters.

```APIDOC
## GET /jobs/count

### Description
Returns the total number of jobs matching the specified criteria.

### Method
GET

### Endpoint
/jobs/count

### Parameters
#### Query Parameters
- **project** (string) - Required - The project ID.

### Request Example
GET /jobs/count?project=12345

### Response
#### Success Response (200)
- **count** (integer) - Total number of jobs.

#### Response Example
{
  "count": 42
}
```

--------------------------------

### Python: Get Item from Collection by Key using Scrapinghub Client

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Shows how to retrieve a single item from a Scrapinghub collection by its unique key. The `get()` method takes the item's key as a string and optionally accepts additional query parameters. If the item exists, it returns the item as a dictionary. Behavior for a missing key is not shown here, so guard against both an exception and a `None` result.

```python
item = foo_store.get('item_key', param1='value1')
```

--------------------------------

### Delete a Project Setting by Key (Python)

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Illustrates how to delete a specific project setting using its key. The `delete` method takes the setting key as a string parameter.

```python
>>> project = client.get_project(123)
>>> project.settings.delete('job_runtime_limit')
```

--------------------------------

### Accessing and Managing Spiders

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Demonstrates how to retrieve a spider instance from a project and access its attributes like key and name.

```python
spider = project.spiders.get('spider1')
print(spider.key)
print(spider.name)
```

--------------------------------

### GET /jobs/list

Source: https://python-scrapinghub.readthedocs.io/en/latest/legacy/connection.html

Retrieves a list of jobs associated with a specific project based on provided filters.

```APIDOC
## GET /jobs/list

### Description
Retrieves a list of jobs for a given project. Supports filtering via parameters.

### Method
GET

### Endpoint
/jobs/list

### Parameters
#### Query Parameters
- **project** (string) - Required - The project ID.
- **state** (string) - Optional - Filter jobs by state (e.g., running, finished).

### Request Example
GET /jobs/list?project=12345

### Response
#### Success Response (200)
- **jobs** (array) - List of job objects.

#### Response Example
{
  "jobs": [{"id": "12345/1/1", "state": "finished"}]
}
```

--------------------------------

### Retrieve Job Summaries

Source: https://python-scrapinghub.readthedocs.io/en/latest/_sources/client/overview.rst.txt

Provides methods to get a summary of job states or the most recent jobs for each spider.

```python
>>> spider.jobs.summary()
>>> list(spider.jobs.iter_last())
```

--------------------------------

### Initialize HubstorageClient

Source: https://python-scrapinghub.readthedocs.io/en/latest/legacy/hubstorage.html

Demonstrates how to authenticate and initialize the HubstorageClient using an API key.

```python
from scrapinghub import HubstorageClient
hc = HubstorageClient(auth='apikey')
hc.server_timestamp()
```

--------------------------------

### List Project Frontiers

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Demonstrates how to list all available frontiers within a project. This provides a simple list of frontier names.

```python
>>> project.frontiers.list()
['test', 'test1', 'test2']
```

--------------------------------

### Manage Projects and Spiders

Source: https://python-scrapinghub.readthedocs.io/en/latest/legacy/connection.html

Demonstrates how to list available projects, select a specific project, list spiders within that project, and schedule a new spider run.

```python
# List projects
conn.project_ids()

# Select project
project = conn[123]

# Schedule spider
project.schedule('myspider', arg1='val1')

# List spiders
project.spiders()
```

--------------------------------

### GET /jobs/{job_id}/logs

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Retrieves log entries for a specific job. Supports filtering by log level and content, and provides both list and iterator interfaces.

```APIDOC
## GET /jobs/{job_id}/logs

### Description
Retrieves log entries associated with a specific job. Use `iter()` for large datasets to optimize memory usage or `list()` for smaller, immediate results.

### Method
GET

### Endpoint
/jobs/{job_id}/logs

### Parameters
#### Query Parameters
- **count** (integer) - Optional - Limit the number of log entries returned.
- **level** (string) - Optional - Filter logs by severity level (e.g., WARNING, INFO).
- **filter** (list) - Optional - List of tuples for advanced filtering (e.g., [("message", "contains", ["text"])]).

### Request Example
job.logs.list(level='WARNING', count=10)

### Response
#### Success Response (200)
- **level** (integer) - The log severity level.
- **message** (string) - The log content.
- **time** (integer) - UNIX timestamp in milliseconds.

#### Response Example
[
  {
    "level": 30,
    "message": "Some warning: mymessage",
    "time": 1486375511188
  }
]
```
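
Because each log entry carries an integer level and a millisecond timestamp, a small formatter can make entries easier to read. The sketch below uses a hypothetical helper, `format_log_entry`, and assumes the integer levels follow Python's `logging` numbering (30 is `WARNING`), which matches the example response but is not confirmed for every level.

```python
import logging
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def format_log_entry(entry):
    """Render one log dict as 'ISO-time LEVEL message' (hypothetical helper)."""
    # 'time' is a UNIX timestamp in milliseconds; timedelta keeps it exact.
    ts = EPOCH + timedelta(milliseconds=entry["time"])
    # Assumption: levels use the same integers as Python's logging module.
    level = logging.getLevelName(entry["level"])
    return f"{ts.isoformat(timespec='milliseconds')} {level} {entry['message']}"


entry = {"level": 30, "message": "Some warning: mymessage", "time": 1486375511188}
print(format_log_entry(entry))
# prints: 2017-02-06T10:05:11.188+00:00 WARNING Some warning: mymessage
```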

--------------------------------

### PUT /projects/{project_id}/settings

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Update project-level configuration settings.

```APIDOC
## PUT /projects/{project_id}/settings

### Description
Updates one or multiple configuration settings for a specific project.

### Method
PUT

### Endpoint
/projects/{project_id}/settings

### Request Body
- **values** (dict) - Required - Key-value pairs of settings to update.

### Request Example
{
  "default_job_units": 1,
  "job_runtime_limit": 20
}

### Response
#### Success Response (200)
- **status** (string) - Confirmation of update.
```

--------------------------------

### Getting a Cached Store Collection (Python)

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Retrieves a collection from the cached store, where items expire after a month. This suits data that can be regenerated and does not need to be kept indefinitely.

```python
cached_store = collections.get_cached_store('my_cached_data')
```

--------------------------------

### Manage Projects

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/overview.html

Shows how to list available projects, retrieve a summary of project activity, and access a specific project instance by its ID.

```python
client.projects.list()
client.projects.summary()
project = client.get_project(123)
```

--------------------------------

### Get Frontier New Request Count

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Shows how to access the `newcount` property of a frontier to see the number of new requests that have been added. This is useful for monitoring frontier activity.

```python
>>> frontier.newcount
3
```

--------------------------------

### Accessing and Listing Collections (Python)

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Demonstrates how to access the collections object from a project instance and list available collections. This is the primary way to interact with collections in the Scrapinghub client.

```python
>>> collections = project.collections
>>> collections.list()
[{'name': 'Pages', 'type': 's'}]
>>> foo_store = collections.get_store('foo_store')
```

--------------------------------

### Manage Project Settings in Python

Source: https://python-scrapinghub.readthedocs.io/en/latest/_sources/client/overview.rst.txt

Shows how to retrieve, update, and list project settings using the Settings class. This allows for dynamic configuration of project parameters like job units and runtime limits.

```python
project.settings.list()
project.settings.get('job_runtime_limit')
project.settings.set('job_runtime_limit', 20)
project.settings.update({'default_job_units': 1, 'job_runtime_limit': 20})
```

--------------------------------

### Initialize HubStorage Client

Source: https://python-scrapinghub.readthedocs.io/en/latest/legacy/hubstorage.html

Shows how to initialize the HubStorage client with optional parameters for authentication, endpoint, timeouts, retries, and user agent. The default values are used if not specified.

```python
class scrapinghub.hubstorage.HubstorageClient(auth=None, endpoint=None, connection_timeout=None, max_retries=None, max_retry_time=None, user_agent=None, use_msgpack=True)
```

--------------------------------

### Get all finished jobs for a project in Python

Source: https://python-scrapinghub.readthedocs.io/en/latest/_sources/legacy/connection.rst.txt

Retrieves a set of all jobs that have completed execution within a project. The result is a JobSet object, which is iterable.

```python
jobs = project.jobs(state='finished')
for job in jobs:
    pass  # process each finished job here
print([x.id for x in jobs])
```

--------------------------------

### POST /activity

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Adds new activity events to a project.

```APIDOC
## POST /activity

### Description
Adds one or more activity events to the project's activity log.

### Method
POST

### Endpoint
project.activity.add(values)

### Request Body
- **values** (dict or list) - Required - A dictionary or list of dictionaries containing 'event', 'job', and 'user' keys.

### Request Example
{
  "event": "job:completed",
  "job": "123/2/4",
  "user": "jobrunner"
}
```

--------------------------------

### Update a Single Project Setting Value (Python)

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Shows how to update the value of a single project setting. The `set` method takes the setting key and the new value as parameters. Note that some settings are read-only.

```python
>>> project = client.get_project(123)
>>> project.settings.set('default_job_units', 2)
```

--------------------------------

### ScrapinghubClient Initialization

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Initializes the main client to interact with the Scrapy Cloud API using authentication credentials.

```APIDOC
## ScrapinghubClient Initialization

### Description
Initializes the connection to the Scrapy Cloud API. If credentials are not provided, it attempts to read them from the `SH_APIKEY` or `SHUB_JOBAUTH` environment variables.

### Parameters
- **auth** (string) - Optional - Scrapy Cloud API key or credentials.
- **dash_endpoint** (string) - Optional - The API URL (defaults to https://app.zyte.com/api/).

### Request Example
from scrapinghub import ScrapinghubClient
client = ScrapinghubClient('YOUR_API_KEY')
```

--------------------------------

### Listing Project Spiders

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Shows how to retrieve a list of all spiders within a project, returning their metadata as dictionaries.

```python
spiders_list = project.spiders.list()
for spider in spiders_list:
    print(spider)
```

--------------------------------

### Get Slot New Request Count

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Demonstrates accessing the `newcount` property of a slot to retrieve the number of new requests added to that specific slot. This helps in monitoring slot-specific activity.

```python
>>> slot.newcount
2
```

--------------------------------

### List Slot Requests

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Shows how to list all request batches currently in a slot's queue. The output is a list of dictionaries, each representing a batch with its ID and requests.

```python
>>> slot.q.list()
[{'id': '0115a8579633600006',
  'requests': [['page1.html', {'depth': 1}]]}]
```

--------------------------------

### Manage Projects and Spiders

Source: https://python-scrapinghub.readthedocs.io/en/latest/legacy/hubstorage.html

Shows how to retrieve project settings, job summaries, and spider IDs.

```python
project = hc.get_project('1111111')
project.settings['botgroups']
project.jobsummary()
project.ids.spider('foo')
summaries = project.spiders.lastjobsummary(count=3)
```

--------------------------------

### Query and Filter Jobs

Source: https://python-scrapinghub.readthedocs.io/en/latest/_sources/legacy/hubstorage.rst.txt

Demonstrates how to list job metadata using filters like tags, state, and pagination to manage large sets of job data.

```python
jobs_metadata = project.jobq.list(has_tag=['new', 'verified'], lacks_tag='obsolete')
jobs_metadata_filtered = project.jobq.list(spider='foo', state='finished', count=3)
jobs_paginated = project.jobq.list(start=1000)
```

--------------------------------

### Update Job State (Python)

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Demonstrates how to update the state of a job to a new value, optionally with additional meta parameters. It returns the previous string state of the job.

```python
>>> job.update('finished')
'running'
```

--------------------------------

### List Job Summaries with Filters (Python)

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Provides a convenient shortcut to list job results based on various filter parameters. It returns a list of dictionaries, each representing a job summary. For large datasets, using `iter()` is recommended to avoid high memory consumption.

```python
list(count=None, start=None, spider=None, state=None, has_tag=None, lacks_tag=None, startts=None, endts=None, meta=None, **params)
```

--------------------------------

### Retrieve and iterate job items using Python

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Demonstrates how to retrieve items from a job using iterators, list methods, and chunked processing. These methods support filtering by count, timestamp, and custom criteria to manage memory usage efficiently.

```python
# Retrieve all items as a generator
job.items.iter()

# Iterate through first 100 items
for item in job.items.iter(count=100):
    print(item)

# Retrieve items with timestamp filter
job.items.list(startts=1447221694537)

# Retrieve items in chunks
gen = job.items.list_iter(chunksize=2)
next(gen)

# Retrieve items with complex filters
filters = [("size", ">", [30000]), ("size", "<", [40000])]
job.items.list(count=1, filter=filters)
```
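
To illustrate what chunked retrieval buys you, here is a rough local analogue of batching an item stream. It is a sketch, not the library's implementation of `list_iter`: `iter_chunks` is a hypothetical helper, and the generator of fake items stands in for `job.items.iter()`.

```python
from itertools import islice


def iter_chunks(iterable, chunksize):
    """Yield lists of up to `chunksize` items from any iterable."""
    if chunksize < 1:
        raise ValueError("chunksize must be at least 1")
    iterator = iter(iterable)
    while True:
        # Pull at most `chunksize` items; an empty list means we are done.
        chunk = list(islice(iterator, chunksize))
        if not chunk:
            return
        yield chunk


# Stand-in data; with the client you could pass job.items.iter() instead.
fake_items = ({"id": n} for n in range(5))
chunks = list(iter_chunks(fake_items, chunksize=2))
print(chunks)
# prints: [[{'id': 0}, {'id': 1}], [{'id': 2}, {'id': 3}], [{'id': 4}]]
```

Only one chunk is held in memory at a time while iterating, which is the same motivation given above for preferring iterators over `list()` on large jobs.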

--------------------------------

### Update Job Tags

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/overview.html

Illustrates how to add tags to a job with the `update_tags()` method. Tags passed in `add` are appended to the job's existing tags. For example, to add the tag 'consumed':

```python
>>> job.update_tags(add=['consumed'])
```

--------------------------------

### Get a Specific Job by Key using Scrapinghub API

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Retrieves a single `Job` object using its unique job key. The job key must match the project and spider context. Returns a `Job` object.

```python
>>> job = project.jobs.get('123/1/2')
>>> job.key
'123/1/2'
```

--------------------------------

### Manage Project Settings in Scrapinghub

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/overview.html

Demonstrates how to interact with the project settings object to list all settings, retrieve a specific value, update a single setting, or perform bulk updates. These methods allow for dynamic configuration of project parameters via the Scrapinghub API.

```python
# List all project settings
project.settings.list()

# Get a specific setting value
project.settings.get('job_runtime_limit')

# Update a single setting
project.settings.set('job_runtime_limit', 20)

# Update multiple settings at once
project.settings.update({'default_job_units': 1, 'job_runtime_limit': 20})
```
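
When the desired configuration lives in code, it can help to send only the values that actually differ. The sketch below builds on the `get()` and `update()` calls shown above. `apply_settings` and `FakeSettings` are hypothetical names, and the stub assumes `get()` returns `None` for an unset key, which may not match the real client.

```python
def apply_settings(settings, desired):
    """Update only the settings whose current value differs from `desired`."""
    changes = {key: value for key, value in desired.items()
               if settings.get(key) != value}
    if changes:
        settings.update(changes)
    return changes


class FakeSettings:
    """In-memory stand-in exposing the same get()/update() calls."""

    def __init__(self, initial):
        self._data = dict(initial)

    def get(self, key):
        # Assumption: a missing key yields None.
        return self._data.get(key)

    def update(self, values):
        self._data.update(values)


settings = FakeSettings({"default_job_units": 1, "job_runtime_limit": 10})
changed = apply_settings(settings, {"default_job_units": 1, "job_runtime_limit": 20})
print(changed)
# prints: {'job_runtime_limit': 20}
```

With a real project, `project.settings` would be passed in place of the stub.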

--------------------------------

### Get a Specific Job by Key

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Retrieves a `Job` object using its unique job key. The job key must be in the format 'project_id/spider_id/job_id', where all components are integers. This method is essential for accessing and managing individual job data.

```python
>>> job = client.get_job('123/1/1')
>>> job
<scrapinghub.client.jobs.Job at 0x10afe2eb1>
```
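
Since the key format is fixed at three integer components, it can be validated locally before calling the API. This is a minimal sketch. `JobKey` and `parse_job_key` are hypothetical names and not part of the scrapinghub package.

```python
from typing import NamedTuple


class JobKey(NamedTuple):
    project_id: int
    spider_id: int
    job_id: int


def parse_job_key(key):
    """Split 'project_id/spider_id/job_id' into integers, or raise ValueError."""
    parts = key.split("/")
    if len(parts) != 3 or not all(part.isdigit() for part in parts):
        raise ValueError(f"expected 'project_id/spider_id/job_id', got {key!r}")
    return JobKey(*(int(part) for part in parts))


parsed = parse_job_key("123/1/1")
print(parsed.project_id, parsed.spider_id, parsed.job_id)
# prints: 123 1 1
```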

--------------------------------

### Iterate Through Project Activity Events

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Provides a way to iterate over all activity events for a given project. It's recommended to use the `iter()` method for large amounts of activity to avoid excessive memory usage. The `list()` method is a convenient shortcut but may consume more memory.

```python
>>> project.activity.iter()
<generator object jldecode at 0x1049ee990>
```

--------------------------------

### Python: Create a Collection Writer with Scrapinghub Client

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Shows how to create a writer for a Scrapinghub collection using the Python client. This method allows for efficient batch uploading of items. It accepts several optional parameters to configure the writer's behavior, such as initial offset, authentication, queue size, and content encoding. The function returns a batch writer object.

```python
writer = foo_store.create_writer(start=0, size=1000, interval=15, qsize=None, content_encoding='identity', maxitemsize=1048576, callback=None)
```

--------------------------------

### Iterate Last Job Summaries (Python)

Source: https://python-scrapinghub.readthedocs.io/en/latest/client/apidocs.html

Retrieves a generator object yielding dictionaries of job summaries for a given filter. This is useful for fetching recent job data efficiently. It can be used to get all last job summaries for a project or for a specific spider.

```python
>>> project.jobs.iter_last()
<generator object jldecode at 0x1048a95c8>

>>> list(spider.jobs.iter_last())
[{'close_reason': 'success',
  'elapsed': 3062444,
  'errors': 1,
  'finished_time': 1482911633089,
  'key': '123/1/3',
  'logs': 8,
  'pending_time': 1482911596566,
  'running_time': 1482911598909,
  'spider': 'spider1',
  'state': 'finished',
  'ts': 1482911615830,
  'version': 'some-version'}]
```
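
The summary dictionaries carry several timestamps, so queue and run durations can be derived from them. The sketch below assumes `pending_time`, `running_time`, and `finished_time` are UNIX timestamps in milliseconds, which is consistent with the values above but is an assumption. `job_timings` is a hypothetical helper.

```python
def job_timings(summary):
    """Return seconds spent queued and running for one job summary dict."""
    # Assumption: all three fields are epoch timestamps in milliseconds.
    queued_ms = summary["running_time"] - summary["pending_time"]
    run_ms = summary["finished_time"] - summary["running_time"]
    return {"queued_s": queued_ms / 1000, "run_s": run_ms / 1000}


summary = {
    "key": "123/1/3",
    "pending_time": 1482911596566,
    "running_time": 1482911598909,
    "finished_time": 1482911633089,
}
timings = job_timings(summary)
print(timings)
# prints: {'queued_s': 2.343, 'run_s': 34.18}
```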

--------------------------------

### Connect to Scrapinghub API

Source: https://python-scrapinghub.readthedocs.io/en/latest/legacy/connection.html

Initializes a connection to the Scrapinghub service using an API key. This is the entry point for all subsequent project and job operations.

```python
from scrapinghub import Connection
conn = Connection('APIKEY')
conn
```

--------------------------------

### HubStorage Client Methods

Source: https://python-scrapinghub.readthedocs.io/en/latest/legacy/hubstorage.html

Lists common methods available on the HubStorage client, including closing the client, getting job or project objects, pushing new jobs, and executing HTTP requests with retry policies.

```python
close(timeout=None)
get_job(*args, **kwargs)
get_project(*args, **kwargs)
push_job(projectid, spidername, auth=None, **jobparams)
request(is_idempotent=False, **kwargs)
server_timestamp()
```