### Install CloudPathlib with Cloud SDKs

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/README.md

Installs cloudpathlib along with specific cloud service SDKs using pip extras. Quote the argument in shells such as zsh, where unquoted square brackets are treated as glob patterns.

```bash
pip install cloudpathlib[s3,gs,azure]
```

```bash
pip install "cloudpathlib[s3,gs,azure]"
```

--------------------------------

### Install Development Version with All SDKs

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/README.md

Installs the latest development version of cloudpathlib from GitHub, including all available cloud service SDKs.

```bash
pip install "cloudpathlib[all] @ git+https://github.com/drivendataorg/cloudpathlib.git"
```

--------------------------------

### PR Checklist Example

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/CONTRIBUTING.md

This is a sample checklist provided in the PR template to ensure all necessary steps are completed before submitting a pull request.

```markdown
 - [ ] I have read and understood `CONTRIBUTING.md`
 - [ ] Confirmed an issue exists for the PR, and the text `Closes #issue` appears in the PR summary (e.g., `Closes #123`).
 - [ ] Confirmed PR is rebased onto the latest base
 - [ ] Confirmed failure before change and success after change
 - [ ] Any generic new functionality is replicated across cloud providers if necessary
 - [ ] Tested manually against live server backend for at least one provider
 - [ ] Added tests for any new functionality
 - [ ] Linting passes locally
 - [ ] Tests pass locally
 - [ ] Updated `HISTORY.md` with the issue that is addressed and the PR you are submitting. If the top section is not `## UNRELEASED`, then you need to add a new section to the top of the document for your change.
```

--------------------------------

### Get Client Instance for Rig

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/CONTRIBUTING.md

Instantiate a client class for use with the testing rig. This is necessary when testing functionality directly on the `*Client` classes.

```python
new_client = rig.client_class()
```

--------------------------------

### Setup Interactive Testing Environment

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/CONTRIBUTING.md

Configure a Jupyter notebook for interactive testing of the library. Ensure autoreload is enabled to pick up code changes immediately.

```python
%load_ext autoreload
%autoreload 2
```

```python
from cloudpathlib import CloudPath

cp = CloudPath("s3://my-test-bucket/")
```

--------------------------------

### Install Development Requirements

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/CONTRIBUTING.md

Installs all development dependencies and an editable version of cloudpathlib. Ensure your Python environment is active before running.

```bash
make reqs
```

--------------------------------

### Open and Write to Cloud Path

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/README.md

Opens a cloud path for writing and writes content to it. Ensure the necessary cloud SDK dependencies are installed.

```python
with CloudPath("s3://bucket/filename.txt").open("w+") as f:
    f.write("Send my changes to the cloud!")
```

--------------------------------

### Serve Documentation Locally

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/CONTRIBUTING.md

Start a local development server to preview documentation changes. The server typically auto-reloads for most file changes, but requires a restart for changes in `index.md`, `HISTORY.md`, or `CONTRIBUTING.md` after running `make docs-setup`.

```bash
make docs-serve
```

--------------------------------

### Cross-platform path manipulation with pathlib

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/docs/docs/why_cloudpathlib.ipynb

Illustrates how pathlib handles cross-platform path construction, including getting the user's home directory and joining path components.

```python
path = Path.home()
path

docs = path / 'Documents'
docs
```
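The cross-platform behaviour can also be seen without touching the file system. The following is a minimal sketch using the standard library's pure path classes; the directory names are placeholders chosen for illustration.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Pure paths do no I/O, so both flavours can be built on any operating system.
posix_docs = PurePosixPath("/home/user") / "Documents"
windows_docs = PureWindowsPath("C:/Users/user") / "Documents"

# The same `/` operator produces the separator each platform expects.
print(posix_docs)    # /home/user/Documents
print(windows_docs)  # C:\Users\user\Documents
```

`Path` picks the matching flavour for the running platform, which is why the snippet above needs no platform-specific code.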

--------------------------------

### Get path information with pathlib

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/docs/docs/why_cloudpathlib.ipynb

Demonstrates how to retrieve various attributes of a file path, such as name, stem, suffix, parent, and read its content.

```python
notebook = Path("why_cloudpathlib.ipynb").resolve()

print(f"{ 'Path:':15}{notebook}")
print(f"{ 'Name:':15}{notebook.name}")
print(f"{ 'Stem:':15}{notebook.stem}")
print(f"{ 'Suffix:':15}{notebook.suffix}")
print(f"{ 'With suffix:':15}{notebook.with_suffix('.cpp')}")
print(f"{ 'Parent:':15}{notebook.parent}")
print(f"{ 'Read_text:'}\n{notebook.read_text()[:200]}\n")
```

--------------------------------

### Install CloudPathlib with Conda

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/README.md

Installs cloudpathlib with specific cloud service SDKs using conda from conda-forge. The suffix indicates the SDK to install.

```bash
conda install cloudpathlib-s3 -c conda-forge
```

--------------------------------

### Using `os.path` Functions with Patched `CloudPath`

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/docs/docs/patching_builtins.ipynb

Shows `os.path` functions (`isdir`, `basename`, `dirname`, `join`) accepting `CloudPath` objects inside a `patch_os_functions()` context. The `cp` and `folder` variables are `CloudPath` objects created beforehand, as in the unpatched `os.path` example.

```python
with patch_os_functions():
    result = os.path.isdir(folder)
    print("Patched version of `os.path.isdir` returns: ", result)

    print("basename:", os.path.basename(cp))

    print("dirname:", os.path.dirname(cp))

    joined = os.path.join(folder, "dir", "sub", "name.txt")
    print("join:", joined)
```

--------------------------------

### Set and Get Default S3Client

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/docs/docs/authentication.md

Illustrates how to set an explicitly instantiated client as the default for future CloudPath objects and how to retrieve the current default client.

```python
client = S3Client(aws_access_key_id="myaccesskey", aws_secret_access_key="mysecretkey")
client.set_as_default_client()

S3Client.get_default_client()
#> <cloudpathlib.s3.s3client.S3Client at 0x7feac3d1fb90>
```

--------------------------------

### Create CloudPath Instance (File Exists)

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/CONTRIBUTING.md

Use `rig.create_cloud_path` to get a `CloudPath` instance that refers to a file expected to exist on the provider. This is useful for testing scenarios where the file's presence is a prerequisite.

```python
cp = rig.create_cloud_path("dir_0/file0_0.txt")
```

--------------------------------

### List Files in S3 Bucket Directory

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/docs/docs/caching.ipynb

Iterate through files in a specified S3 directory using `iterdir()`. This example shows how to list the first 5 images for a given incident.

```python
from cloudpathlib import CloudPath
from itertools import islice

ladi = CloudPath("s3://ladi/Images/FEMA_CAP/2020/70349")

# list first 5 images for this incident
for p in islice(ladi.iterdir(), 5):
    print(p)
```
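The same `islice` pattern can be tried locally, since `pathlib.Path.iterdir()` is also lazy. This is a sketch against a temporary directory with made-up file names, not against the S3 bucket above.

```python
import tempfile
from itertools import islice
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    directory = Path(tmp)
    # Create 20 placeholder files to stand in for a large listing
    for n in range(20):
        (directory / f"image_{n:02d}.jpg").touch()

    # iterdir() is a generator; islice stops consuming it after 5 entries
    first_five = list(islice(directory.iterdir(), 5))

print(len(first_five))
```

Because the iterator is only partly consumed, the remaining entries are never materialised, which matters most when a directory holds many objects.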

--------------------------------

### Configuring S3 Client with ExtraArgs

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/docs/docs/authentication.md

You can configure an S3Client with various boto3 ExtraArgs that will be passed to upload, download, or copy operations. This example demonstrates setting 'ChecksumMode' for downloads and 'ACL' for uploads on a client that will be used as the default.

```python
from cloudpathlib import S3Client

c = S3Client(extra_args={
    "ChecksumMode": "ENABLED",  # download extra arg, only used when downloading
    "ACL": "public-read",       # upload extra arg, only used when uploading
})

# use these extras for all CloudPaths
c.set_as_default_client()
```

--------------------------------

### Pandas to CSV (Using .open() Workaround)

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/docs/docs/patching_builtins.ipynb

Provides the recommended workaround for writing pandas DataFrames to `CloudPath` by using `CloudPath.open()` to get a file-like buffer, which pandas can then write to.

```python
# instead, use .open
with cloud_path.open("w") as f:
    df.to_csv(f)

assert cloud_path.exists()
print("Successfully wrote to ", cloud_path)
```

--------------------------------

### Using `library_function` with Patched `open` in Jupyter

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/docs/docs/patching_builtins.ipynb

This example demonstrates successful use of `library_function` with a `CloudPath` after patching the notebook's `open` function. It verifies that the file is written and can be read back correctly.

```python
from cloudpathlib import CloudPath, patch_open

# enable patch and rebind notebook's open
open = patch_open().patched

# create file to read
cp = CloudPath("s3://cloudpathlib-test-bucket/patching_builtins/file.txt")

library_function(cp)
assert cp.read_text() == "hello!"
print("Succeeded!")
```

--------------------------------

### Azure Blob Storage Path Operations with CloudPathLib

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/docs/docs/why_cloudpathlib.ipynb

This snippet demonstrates file operations on an Azure Blob Storage container. It mirrors the S3 example, showing path creation, writing, existence checks, and deletion.

```python
from cloudpathlib import CloudPath

# Changing this root path is the ONLY change!
cloud_directory = CloudPath("az://cloudpathlib-test-container/why_cloudpathlib/")

upload = cloud_directory / "user_upload.txt"
upload.write_text("A user made this file!")

assert upload.exists()
upload.unlink()
assert not upload.exists()
```

--------------------------------

### Configure Basic HTTP Authentication

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/docs/docs/http.md

Pass a urllib.request.BaseHandler implementation, such as HTTPBasicAuthHandler, to the HttpClient constructor to enable authentication for requests. This example shows how to add credentials for a specific realm and URI.

```python
import urllib.request

auth_handler = urllib.request.HTTPBasicAuthHandler()
auth_handler.add_password(
    realm="Some Realm",
    uri="http://www.example.com",
    user="username",
    passwd="password"
)

client = HttpClient(auth=auth_handler)
my_path = client.CloudPath("http://www.example.com/secret/data.txt")

# Now GET requests will include basic auth headers
content = my_path.read_text()
```
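To check which credentials a handler would supply for a URL without sending a request, the password manager can be queried directly. The sketch below uses only `urllib.request`; the realm, URI, and credentials are the same placeholders as above, and it does not involve `HttpClient` at all.

```python
import urllib.request

# Standard-library password manager; realm, URI, and credentials are placeholders
password_mgr = urllib.request.HTTPPasswordMgr()
password_mgr.add_password(
    realm="Some Realm",
    uri="http://www.example.com",
    user="username",
    passwd="password",
)
auth_handler = urllib.request.HTTPBasicAuthHandler(password_mgr)

# Credentials registered for a URI also apply to URLs underneath it
found = password_mgr.find_user_password(
    "Some Realm", "http://www.example.com/secret/data.txt"
)
# A realm with no entry yields (None, None)
missing = password_mgr.find_user_password(
    "Other Realm", "http://www.example.com/secret/data.txt"
)
print(found, missing)
```

Building the manager explicitly and passing it to `HTTPBasicAuthHandler` keeps a reference you can inspect in tests.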

--------------------------------

### Accessing Requester Pays S3 Buckets

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/docs/docs/authentication.md

When accessing a Requester Pays S3 bucket, you must explicitly indicate that you will pay for the operations. The first snippet uses the default client and fails with `AccessDenied`; the second lists the bucket contents through an `S3Client` created with the `RequestPayer` extra argument set to `requester`.

```python
from cloudpathlib import CloudPath

tars = list(CloudPath("s3://arxiv/src/").iterdir())
print(tars)

#> ClientError: An error occurred (AccessDenied) ...
```

```python
from cloudpathlib import S3Client

c = S3Client(extra_args={"RequestPayer": "requester"})

# use the client we created to build the path
tars = list(c.CloudPath("s3://arxiv/src/").iterdir())
print(tars)
```

--------------------------------

### Build Documentation

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/CONTRIBUTING.md

Run this command in the project's root directory to build the latest version of the documentation.

```bash
make docs
```

--------------------------------

### Developer Commands with Make

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/CONTRIBUTING.md

Lists available commands for local development and maintenance tasks. Run `make` to see all options.

```text
clean                remove all build, test, coverage and Python artifacts
clean-build          remove build artifacts
clean-pyc            remove Python file artifacts
clean-test           remove test and coverage artifacts
dist                 builds source and wheel package
docs-setup           setup docs pages based on README.md and HISTORY.md
docs                 build the static version of the docs
docs-serve           serve documentation to livereload while you work
format               run black to format codebase
install              install the package to the active Python's site-packages
lint                 check style with black, flake8, and mypy
release              package and upload a release
reqs                 install development requirements
test                 run tests with mocked cloud SDKs
test-debug           rerun tests that failed in last run and stop with pdb at failures
test-live-cloud      run tests on live cloud backends
perf                 run performance measurement suite for s3 and save results to perf-results.csv
```

--------------------------------

### Joining Paths and Creating New File Paths

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/README.md

Demonstrates how to join path components to create a new file path within the cloud storage. The file does not need to exist beforehand.

```python
new_file_copy = root_dir / "nested_dir/copy_file.txt"
new_file_copy.exists()
new_file_copy.write_text(text_data)
```

--------------------------------

### Explicitly Instantiate S3Client with Credentials

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/docs/docs/authentication.md

Shows how to create an S3Client instance directly using access key and secret key. This is useful for authentication methods other than environment variables.

```python
from cloudpathlib import CloudPath, S3Client

client = S3Client(aws_access_key_id="myaccesskey", aws_secret_access_key="mysecretkey")

# these next two commands are equivalent
# use client's factory method
cp1 = client.CloudPath("s3://cloudpathlib-test-bucket/")
# or pass client as keyword argument
cp2 = CloudPath("s3://cloudpathlib-test-bucket/", client=client)
```

--------------------------------

### Get S3 file statistics

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/docs/docs/why_cloudpathlib.ipynb

Use the stat() method to retrieve file metadata, such as size, modification time, etc., for an S3 object.

```python
stat = s3p.stat()
print(f"File size in bytes: {stat.st_size}")
stat
```
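For comparison, here is the equivalent call on a local file with plain `pathlib`. It is a sketch on a temporary file with placeholder content; the fields a cloud provider actually populates may differ from the local ones.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    local_file = Path(tmp) / "file.txt"
    local_file.write_text("A user made this file!")

    info = local_file.stat()
    size_in_bytes = info.st_size        # size of the file contents
    modified_timestamp = info.st_mtime  # seconds since the epoch

print(f"File size in bytes: {size_in_bytes}")
```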

--------------------------------

### Run Live Cloud Tests

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/CONTRIBUTING.md

Executes tests against actual live cloud provider servers. Ensure you have the necessary credentials configured for each provider before running.

```bash
make test-live-cloud
```

--------------------------------

### Create Cloud Paths with Test Rigs

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/CONTRIBUTING.md

Use the `create_cloud_path` method of a test rig to instantiate cloud paths. This method can create paths to existing assets or non-existent locations. You can also obtain a client instance from the rig.

```python
def test_file_operations(rig):
    # Create a path to an existing file in the test assets
    cp = rig.create_cloud_path("dir_0/file0_0.txt")
    
    # Create a path to a non-existent file
    cp2 = rig.create_cloud_path("path/that/does/not/exist.txt")
    
    # Get a client instance
    client = rig.client_class()
```

--------------------------------

### Basic CloudPath Usage with HTTP URLs

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/docs/docs/http.md

Demonstrates creating CloudPath objects for HTTP/HTTPS URLs and performing common file operations like reading, joining paths, checking existence, and listing directory contents.

```python
from cloudpathlib import CloudPath

# Create a path object
path = CloudPath("https://example.com/data/file.txt")

# Read file contents
text = path.read_text()
binary = path.read_bytes()

# Get parent directory
parent = path.parent  # https://example.com/data/

# Join paths
subpath = path.parent / "other.txt"  # https://example.com/data/other.txt

# Check if file exists
if path.exists():
    print("File exists!")

# Get file name and suffix
print(path.name)      # "file.txt"
print(path.suffix)    # ".txt"

# List directory contents (if server supports directory listings)
data_dir = CloudPath("https://example.com/data/")
for child_path in data_dir.iterdir():
    print(child_path)
```

--------------------------------

### Run Full Test Suite (Mocked)

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/CONTRIBUTING.md

Executes the complete test suite using mocked cloud SDKs, ensuring no network calls are made. This is the most common command during development.

```bash
make test
```

--------------------------------

### Open and display an image from cache

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/docs/docs/caching.ipynb

Opening a file with `.open()` downloads it to the cache if it's not already present. Subsequent opens will use the cached version, leading to faster access.

```python
%%time
with flood_image.open("rb") as f:
    i = Image.open(f)
    plt.imshow(i)
```

--------------------------------

### tmp_dir File Cache Mode Example

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/docs/docs/caching.ipynb

Demonstrates the 'tmp_dir' file cache mode. Cached files are available while the CloudPath object exists and after it's deleted, but are removed when the Client object is garbage collected.

```python
tmp_dir_client = S3Client()

flood_image = tmp_dir_client.CloudPath(
    "s3://ladi/Images/FEMA_CAP/2020/70349/DSC_0002_a89f1b79-786f-4dac-9dcc-609fb1a977b1.jpg"
)

with flood_image.open("rb") as f:
    i = Image.open(f)
    print("Image loaded...")

# cache exists while the CloudPath object persists
local_cached_file = flood_image._local
print("Cache file exists after finished reading: ", local_cached_file.exists())

# decrement reference count so garbage collection runs
del flood_image

# file still exists
print("Cache file exists after CloudPath is no longer referenced: ", local_cached_file.exists())

# decrement reference count so garbage collector removes the client
del tmp_dir_client

# file has been removed along with the client's temporary cache directory
print("Cache file exists after Client is no longer referenced: ", local_cached_file.exists())
```
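The lifecycle described above resembles that of the standard library's `tempfile.TemporaryDirectory`. The sketch below is only an analogy: it assumes a cache directory owned by one object and removed when that owner is cleaned up, and it makes no claim about how cloudpathlib manages its cache internally.

```python
import tempfile
from pathlib import Path

# TemporaryDirectory stands in for a client-owned cache directory (an assumption)
cache_dir = tempfile.TemporaryDirectory()
cached_file = Path(cache_dir.name) / "image.jpg"
cached_file.write_bytes(b"fake image bytes")

exists_while_owner_alive = cached_file.exists()

# Explicit cleanup plays the role of the owner being finalised
cache_dir.cleanup()
exists_after_cleanup = cached_file.exists()

print(exists_while_owner_alive, exists_after_cleanup)
```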

--------------------------------

### Simulating Library Function Using `open`

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/docs/docs/patching_builtins.ipynb

This code simulates a function within a third-party library that uses the built-in `open` to write to a file. It demonstrates a scenario where patching `open` would be necessary for `CloudPath` compatibility.

```python
# Imagine that deep in a third-party library a function is implemented like this
def library_function(filepath: str):
    with open(filepath, "w") as f:
        f.write("hello!")
```

--------------------------------

### Instantiate CloudPath and Access Client

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/docs/docs/authentication.md

Demonstrates how a default client is automatically instantiated when creating a CloudPath object. Subsequent instances of the same service's paths will reuse this client.

```python
from cloudpathlib import CloudPath

cloud_path = CloudPath("s3://cloudpathlib-test-bucket/")   # same for S3Path(...)
cloud_path.client
#> <cloudpathlib.s3.s3client.S3Client at 0x7feac3d1fb90>
```

--------------------------------

### Instantiate HTTP/HTTPS Paths with AnyPath or CloudPath

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/docs/docs/http.md

Use `AnyPath` or `CloudPath` to automatically handle `http://` or `https://` URLs. `AnyPath` also supports local file paths.

```python
from cloudpathlib import AnyPath, CloudPath

# AnyPath will automatically detect "http://" or "https://" (or local file paths)
my_path = AnyPath("https://www.example.com/files/info.txt")

# CloudPath will dispatch to the correct subclass
my_path = CloudPath("https://www.example.com/files/info.txt")
```

--------------------------------

### Create CloudPath Instance (File May Not Exist)

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/CONTRIBUTING.md

Use `rig.create_cloud_path` to create a `CloudPath` instance for a path that does not necessarily need to exist. This is suitable for testing operations that do not require the target file or directory to be present beforehand.

```python
cp2 = rig.create_cloud_path("path/that/does/not/exist.txt")
```

--------------------------------

### Persistent File Cache Mode Example

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/docs/docs/caching.ipynb

Illustrates the 'persistent' file cache mode. Cached files persist even after both CloudPath and Client objects are deleted, requiring manual cleanup. This mode requires `local_cache_dir` to be specified.

```python
persistent_client = S3Client(local_cache_dir="./cache")

# cache mode set automatically to persistent if local_cache_dir and not explicit
print("Client cache mode set to: ", persistent_client.file_cache_mode)

# build the path from the persistent client created above
flood_image = persistent_client.CloudPath(
    "s3://ladi/Images/FEMA_CAP/2020/70349/DSC_0002_a89f1b79-786f-4dac-9dcc-609fb1a977b1.jpg"
)

with flood_image.open("rb") as f:
    i = Image.open(f)
    print("Image loaded...")

# cache exists while the CloudPath object persists
local_cached_file = flood_image._local
print("Cache file exists after finished reading: ", local_cached_file.exists())

# decrement reference count so garbage collection runs
del flood_image

# file still exists
print("Cache file exists after CloudPath is no longer referenced: ", local_cached_file.exists())

# decrement reference count so garbage collector removes the client
client_cache_dir = persistent_client._local_cache_dir
del persistent_client

# file still exists
print("Cache file exists after Client is no longer referenced: ", local_cached_file.exists())


# explicitly remove persistent cache file
import shutil

shutil.rmtree(client_cache_dir)
```

--------------------------------

### LocalClient

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/docs/docs/api-reference/local.md

A generic client for local file system interactions.

```APIDOC
## cloudpathlib.local.LocalClient

### Description
A generic client for interacting with the local file system.

### Usage
Instantiate this class to perform operations on local file system paths.

```

--------------------------------

### Performance Test Report Structure

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/CONTRIBUTING.md

Example structure of a performance test report, showing metrics like Mean, Std, Max, and N Items for different test scenarios. This helps in understanding the performance impact of code changes.

```text
                                  Performance suite results: (2023-10-08T13:18:04.774823)                                  
┏━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━┓
┃ Test Name      ┃ Config Name                ┃ Iterations ┃           Mean ┃              Std ┃            Max ┃ N Items ┃
┡━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━┩
│ List Folders   │ List shallow recursive     │         10 │ 0:00:00.862476 │ ± 0:00:00.020222 │ 0:00:00.898143 │   5,500 │
│ List Folders   │ List shallow non-recursive │         10 │ 0:00:00.884997 │ ± 0:00:00.086678 │ 0:00:01.117775 │   5,500 │
│ List Folders   │ List normal recursive      │         10 │ 0:00:01.248844 │ ± 0:00:00.095575 │ 0:00:01.506868 │   7,877 │
│ List Folders   │ List normal non-recursive  │         10 │ 0:00:00.060042 │ ± 0:00:00.003986 │ 0:00:00.064052 │     113 │
│ List Folders   │ List deep recursive        │         10 │ 0:00:02.004731 │ ± 0:00:00.130264 │ 0:00:02.353263 │   7,955 │
│ List Folders   │ List deep non-recursive    │         10 │ 0:00:00.054268 │ ± 0:00:00.003314 │ 0:00:00.062116 │      31 │
│ Glob scenarios │ Glob shallow recursive     │         10 │ 0:00:01.056946 │ ± 0:00:00.160470 │ 0:00:01.447082 │   5,500 │
│ Glob scenarios │ Glob shallow non-recursive │         10 │ 0:00:00.978217 │ ± 0:00:00.091849 │ 0:00:01.230822 │   5,500 │
│ Glob scenarios │ Glob normal recursive      │         10 │ 0:00:01.510334 │ ± 0:00:00.101108 │ 0:00:01.789393 │   7,272 │
│ Glob scenarios │ Glob normal non-recursive  │         10 │ 0:00:00.058301 │ ± 0:00:00.002621 │ 0:00:00.063299 │      12 │
│ Glob scenarios │ Glob deep recursive        │         10 │ 0:00:02.784629 │ ± 0:00:00.099764 │ 0:00:02.981882 │   7,650 │
│ Glob scenarios │ Glob deep non-recursive    │         10 │ 0:00:00.051322 │ ± 0:00:00.002653 │ 0:00:00.054844 │      25 │
│ Walk scenarios │ Walk shallow               │         10 │ 0:00:00.905571 │ ± 0:00:00.076332 │ 0:00:01.113957 │   5,500 │
│ Walk scenarios │ Walk normal                │         10 │ 0:00:01.441215 │ ± 0:00:00.014923 │ 0:00:01.470414 │   7,272 │
│ Walk scenarios │ Walk deep                  │         10 │ 0:00:02.461520 │ ± 0:00:00.031832 │ 0:00:02.539132 │   7,650 │
└────────────────┴────────────────────────────┴────────────┴────────────────┴──────────────────┴────────────────┴─────────┘
```

--------------------------------

### Run Code Formatting

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/CONTRIBUTING.md

Execute the Black formatter to ensure code adheres to project style guidelines. This command should be run before submitting changes.

```bash
make format
```

--------------------------------

### Implement a Custom Client Class

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/CONTRIBUTING.md

Define a new client class by inheriting from `cloudpathlib.client.Client`. This class will handle the specific interactions with a cloud storage provider.

```python
from cloudpathlib.client import Client

class MyClient(Client):
    ...  # implementation here
```

--------------------------------

### Custom Directory Listing Parser

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/docs/docs/http.md

Override the default directory listing parser by providing a custom function to HttpClient. This function should accept HTML content and yield strings representing file or directory names. This example uses BeautifulSoup to find links with a specific class.

```python
from typing import Iterable

from bs4 import BeautifulSoup


def my_parser(html_content: str) -> Iterable[str]:
    # for example, yield the href of every <a> element with class "file-link",
    # parsed with BeautifulSoup
    soup = BeautifulSoup(html_content, "html.parser")
    for link in soup.find_all("a", class_="file-link"):
        yield link.get("href")

client = HttpClient(custom_list_page_parser=my_parser)
my_dir = client.CloudPath("http://example.com/public/")

for subpath, is_dir in my_dir.list_dir(recursive=False):
    print(subpath, "dir" if is_dir else "file")
```
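If BeautifulSoup is not available, a parser with the same shape can be sketched with the standard library's `html.parser`. The names `_FileLinkCollector` and `stdlib_parser` and the sample HTML are illustrative assumptions; the function only mirrors the signature described above (HTML string in, iterable of names out).

```python
from html.parser import HTMLParser
from typing import Iterable, List


class _FileLinkCollector(HTMLParser):
    """Collects href values from <a> tags whose class list includes "file-link"."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: List[str] = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        classes = (attributes.get("class") or "").split()
        if tag == "a" and "file-link" in classes and attributes.get("href"):
            self.hrefs.append(attributes["href"])


def stdlib_parser(html_content: str) -> Iterable[str]:
    collector = _FileLinkCollector()
    collector.feed(html_content)
    yield from collector.hrefs


sample = """
<ul>
  <li><a class="file-link" href="report.csv">report.csv</a></li>
  <li><a class="nav" href="/home">Home</a></li>
  <li><a class="file-link" href="archive/">archive/</a></li>
</ul>
"""
links = list(stdlib_parser(sample))
print(links)
```

Testing the parser against a small HTML string like this is a quick way to validate it before pointing it at a real server.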

--------------------------------

### Accessing cached file via fspath (read-only)

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/docs/docs/caching.ipynb

Use the `.fspath` property to get the local path to the cached file. This is useful for libraries that do not accept `PathLike` objects. Note that this operation downloads the file if it's not in the cache and should be treated as read-only, as changes will not be uploaded to the cloud.

```python
# Warning: Using the `.fspath` property will download the file from the cloud if it does not exist yet in the cache.
# Warning: Since we are no longer in control of opening/closing the file, we cannot upload any changes when the file is closed. Therefore, you should treat any code where you use fspath as _read only_. Writes directly to fspath will not be uploaded to the cloud.
```

--------------------------------

### Initialize S3Path

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/docs/docs/why_cloudpathlib.ipynb

Create an S3Path object representing a file in an S3 bucket. This requires the 's3://' prefix.

```python
from cloudpathlib import S3Path

s3p = S3Path("s3://cloudpathlib-test-bucket/why_cloudpathlib/file.txt")
s3p.name
```

--------------------------------

### Create an empty file on S3 with touch()

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/docs/docs/why_cloudpathlib.ipynb

The touch() method creates an empty file at the specified S3 path, similar to the pathlib equivalent.

```python
# Touch (just like with `pathlib.Path`)
s3p.touch()
```

--------------------------------

### Instantiate AnyPath for Local and Cloud Paths

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/docs/docs/anypath-polymorphism.md

Use AnyPath to create path objects. It automatically resolves to a pathlib.Path for local paths or a CloudPath subclass (e.g., S3Path) for cloud URIs.

```python
from cloudpathlib import AnyPath

path = AnyPath("mydir/myfile.txt")
path
#> PosixPath('mydir/myfile.txt')

cloud_path = AnyPath("s3://mybucket/myfile.txt")
cloud_path
#> S3Path('s3://mybucket/myfile.txt')

isinstance(path, AnyPath)
#> True
isinstance(cloud_path, AnyPath)
#> True
```

--------------------------------

### LocalGSImplementation

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/docs/docs/api-reference/local.md

Details the implementation for local Google Cloud Storage interactions.

```APIDOC
## cloudpathlib.local.local_gs_implementation

### Description
Provides the implementation details for interacting with Google Cloud Storage (GCS) in a local context.

### Usage
This module is typically used internally by cloudpathlib to handle GCS operations.

```

--------------------------------

### Initialize AzureBlobPath

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/docs/docs/why_cloudpathlib.ipynb

Create an AzureBlobPath object representing a file in an Azure Blob Storage container. This requires the 'az://' prefix.

```python
from cloudpathlib import AzureBlobPath

azp = AzureBlobPath("az://cloudpathlib-test-container/file.txt")
azp.name
```

--------------------------------

### List files in a directory with pathlib

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/docs/docs/why_cloudpathlib.ipynb

Use the glob method to list all files and directories within the current directory.

```python
list(Path(".").glob("*"))
```

--------------------------------

### Run Performance Tests

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/CONTRIBUTING.md

Execute performance tests using 'make perf'. This command generates a report detailing the performance of various listing, globbing, and walking operations across different configurations. Include these results in your Pull Request description.

```bash
make perf
```

--------------------------------

### Demonstrating `os.path` Functions with `CloudPath` (Unpatched)

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/docs/docs/patching_builtins.ipynb

This snippet illustrates the failure when using `os.path` functions like `isdir` with a `CloudPath` object before patching. It highlights the need for patching to enable `CloudPath` compatibility.

```python
import os

from cloudpathlib import patch_os_functions, CloudPath

cp = CloudPath("s3://cloudpathlib-test-bucket/patching_builtins/file.txt")
folder = cp.parent

try:
    print(os.path.isdir(folder))
except Exception as e:
    print("Unpatched version fails:")
    print(e)
```

--------------------------------

### Import pathlib

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/docs/docs/why_cloudpathlib.ipynb

Import the Path object from the standard pathlib library.

```python
from pathlib import Path
```

--------------------------------

### Listing Files After Writing

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/README.md

Lists all text files under `root_dir`, including its subdirectories, after a new file has been written, confirming that the new file is now present. `root_dir` is the `CloudPath` created in the Instantiate CloudPath example.

```python
list(root_dir.glob('**/*.txt'))
```

--------------------------------

### LocalGSClient

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/docs/docs/api-reference/local.md

Represents a client for interacting with Google Cloud Storage locally.

```APIDOC
## cloudpathlib.local.LocalGSClient

### Description
A client class for managing Google Cloud Storage (GCS) resources locally.

### Usage
Instantiate this class to perform operations on GCS paths.

```

--------------------------------

### List Available FileCacheMode Options

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/docs/docs/caching.ipynb

Prints all available file cache modes from the FileCacheMode enum. Use these strings or enum members when configuring the cache mode.

```python
from cloudpathlib.enums import FileCacheMode

print("\n".join(FileCacheMode))
```

```text
persistent
tmp_dir
cloudpath_object
close_file
```
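
Passing an enum straight to `str.join` only works when its members are also `str` instances, which is what the snippet above relies on. The sketch below shows that pattern with the standard library alone, using a hypothetical `DemoCacheMode` enum as a stand-in (it is not the real `FileCacheMode`), and also shows converting a plain string back into a member.

```python
from enum import Enum


class DemoCacheMode(str, Enum):
    """Hypothetical stand-in for illustration; not cloudpathlib's FileCacheMode."""

    persistent = "persistent"
    tmp_dir = "tmp_dir"


# Members are str instances, so str.join can consume the enum directly.
joined = "\n".join(DemoCacheMode)
print(joined)

# A plain string can be converted back to the matching member by value.
mode = DemoCacheMode("tmp_dir")
print(mode is DemoCacheMode.tmp_dir)
```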

--------------------------------

### Register Custom Client and Path Classes

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/CONTRIBUTING.md

Register your custom client and path classes with `cloudpathlib` using decorators. This allows `CloudPath` to correctly dispatch to your provider based on the URI scheme.

```python
from cloudpathlib.client import Client, register_client_class
from cloudpathlib.cloudpath import CloudPath, register_path_class

@register_client_class("my-prefix")
class MyClient(Client):
    # implementation here...
    ...  # placeholder body so the class definition is valid Python

@register_path_class("my-prefix")
class MyPath(CloudPath):
    cloud_prefix: str = "my-prefix://"
    client: "MyClient"

    # implementation here...
```

--------------------------------

### Pillow with CloudPath (Patched)

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/docs/docs/patching_builtins.ipynb

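Demonstrates successful image saving to a `CloudPath` using Pillow after patching all built-ins with `patch_all_builtins()`. This confirms that patching enables compatibility. The snippet reuses `Image`, `patch_all_builtins`, and `img_path` from the unpatched Pillow example.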

```python
# Patched: success with patching builtins
with patch_all_builtins():
    Image.new("RGB", (10, 10), color=(255, 0, 0)).save(img_path)

    assert img_path.read_bytes()
    print("With patches, Pillow successfully writes to a CloudPath")
```

--------------------------------

### Implement a Custom Path Class

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/CONTRIBUTING.md

Define a new path class by inheriting from `cloudpathlib.cloudpath.CloudPath`. This class represents paths within the custom cloud provider and should specify its `cloud_prefix` and `client` type.

```python
from cloudpathlib.cloudpath import CloudPath

class MyPath(CloudPath):
    cloud_prefix: str = "my-prefix://"
    client: "MyClient"

    # implementation here...
```

--------------------------------

### Run Linting and Type Checking

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/CONTRIBUTING.md

Execute Flake8 for linting and MyPy for type checking to ensure code quality and correctness. This command should be run before submitting changes.

```bash
make lint
```

--------------------------------

### Cloud Provider Abstraction in Client Class

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/CONTRIBUTING.md

The generic functionality for cloud providers, such as setting defaults and caching, is implemented in the `Client` class. This class also defines the interface that provider-specific `*Client` backends must implement.

```python
from cloudpathlib.client import Client

class S3Client(Client):
    # ... implementation for S3 ...
    pass
```
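
The split between shared behaviour and provider-specific hooks can be sketched with an abstract base class. This is a toy model under stated assumptions: `DemoClient`, `InMemoryClient`, `read`, and `_fetch` are hypothetical names chosen for the example and do not mirror the real `Client` interface.

```python
from abc import ABC, abstractmethod
from typing import Dict


class DemoClient(ABC):
    """Hypothetical base class: owns the generic caching logic."""

    def __init__(self) -> None:
        self._cache: Dict[str, bytes] = {}

    def read(self, key: str) -> bytes:
        # Generic behaviour shared by every backend: fetch once, then reuse.
        if key not in self._cache:
            self._cache[key] = self._fetch(key)
        return self._cache[key]

    @abstractmethod
    def _fetch(self, key: str) -> bytes:
        """Provider-specific hook that each backend must implement."""


class InMemoryClient(DemoClient):
    """Hypothetical backend that serves data from a dict."""

    def __init__(self, store: Dict[str, bytes]) -> None:
        super().__init__()
        self._store = store
        self.fetch_count = 0

    def _fetch(self, key: str) -> bytes:
        self.fetch_count += 1
        return self._store[key]


client = InMemoryClient({"a.txt": b"hello"})
client.read("a.txt")
client.read("a.txt")  # second read is served from the base-class cache
print(client.fetch_count)
```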

--------------------------------

### Accessing Public S3 Bucket With `no_sign_request=True`

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/docs/docs/authentication.md

Instantiate an `S3Client` with `no_sign_request=True` to access public S3 buckets without credentials. Use the created client to instantiate `CloudPath` objects for operations.

```python
from cloudpathlib import S3Client

c = S3Client(no_sign_request=True)

# use this client object to create the CloudPath
c.CloudPath("s3://ladi/Images/FEMA_CAP/2020/70349/DSC_0001_5a63d42e-27c6-448a-84f1-bfc632125b8e.jpg").exists()
#> True
```

--------------------------------

### LocalS3Implementation

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/docs/docs/api-reference/local.md

Details the implementation for local Amazon S3 interactions.

```APIDOC
## cloudpathlib.local.local_s3_implementation

### Description
Provides the implementation details for interacting with Amazon S3 in a local context.

### Usage
This module is typically used internally by cloudpathlib to handle S3 operations.

```

--------------------------------

### Glob with CloudPath (Patched)

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/docs/docs/patching_builtins.ipynb

Shows how `glob.glob` and `glob.iglob` work with `CloudPath` when `patch_glob` is used. This allows `CloudPath` objects to be used as patterns or `root_dir` arguments.

```python
from glob import glob

from cloudpathlib import CloudPath, patch_glob

with patch_glob():
    print("Patched succeeds:")
    print(glob(CloudPath("s3://cloudpathlib-test-bucket/manual-tests/**/*dir*/**/*")))

    # or equivalently
    print(glob("**/*dir*/**/*", root_dir=CloudPath("s3://cloudpathlib-test-bucket/manual-tests/")))
```

--------------------------------

### Update Support Table

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/CONTRIBUTING.md

Execute this script and update `README.md` if you add or remove methods from the `CloudPath` class or its subclasses. This ensures the support table in the README is accurate.

```bash
python docs/make_support_table.py
```

--------------------------------

### Instantiate S3Client with a local cache directory

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/docs/docs/caching.ipynb

To maintain a persistent cache, instantiate `S3Client` with the `local_cache_dir` argument. This ensures downloaded files are stored in the specified directory, even after Python restarts.

```python
from cloudpathlib import S3Client

# explicitly instantiate a client that always uses the local cache
client = S3Client(local_cache_dir="data")

ladi = client.CloudPath("s3://ladi/Images/FEMA_CAP/2020/70349")
```

--------------------------------

### Instantiate CloudPath

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/README.md

Creates a CloudPath object, which dispatches to the appropriate cloud service path class based on the URI prefix. Authentication is handled by default via environment variables.

```python
from cloudpathlib import CloudPath

# dispatches to S3Path based on prefix
root_dir = CloudPath("s3://drivendata-public-assets/")
root_dir
#> S3Path('s3://drivendata-public-assets/')
```

--------------------------------

### LocalAzureBlobImplementation

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/docs/docs/api-reference/local.md

Details the implementation for local Azure Blob storage interactions.

```APIDOC
## cloudpathlib.local.local_azure_blob_implementation

### Description
Provides the implementation details for interacting with Azure Blob storage in a local context.

### Usage
This module is typically used internally by cloudpathlib to handle Azure Blob operations.

```

--------------------------------

### Azure Live Backend Test Environment Variables

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/CONTRIBUTING.md

Set these environment variables to enable live testing against Azure Blob Storage and Azure Data Lake Storage Gen2. If `AZURE_STORAGE_GEN2_CONNECTION_STRING` is not set, only blob storage will be tested.

```bash
AZURE_STORAGE_CONNECTION_STRING=your_connection_string
AZURE_STORAGE_GEN2_CONNECTION_STRING=your_connection_string
```
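
As a rough way to check a shell environment before running live tests, the sketch below reports which of the two variables are set. `azure_backends_to_test` is a hypothetical helper written for this example; it is not part of cloudpathlib or its test rig.

```python
import os
from typing import List, Mapping


def azure_backends_to_test(env: Mapping[str, str] = os.environ) -> List[str]:
    """Hypothetical helper: report which Azure live backends have credentials."""
    backends = []
    if env.get("AZURE_STORAGE_CONNECTION_STRING"):
        backends.append("blob")
    if env.get("AZURE_STORAGE_GEN2_CONNECTION_STRING"):
        backends.append("gen2")
    return backends


# Example with only the blob storage variable present.
only_blob = azure_backends_to_test(
    {"AZURE_STORAGE_CONNECTION_STRING": "your_connection_string"}
)
print(only_blob)
```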

--------------------------------

### Pillow with CloudPath (Unpatched)

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/docs/docs/patching_builtins.ipynb

Illustrates that third-party libraries like Pillow fail when attempting to save directly to a `CloudPath` without patching built-ins, due to expecting string or bytes file paths.

```python
from cloudpathlib import CloudPath, patch_all_builtins
from PIL import Image

base = CloudPath("s3://cloudpathlib-test-bucket/patching_builtins/third_party/")

img_path = base / "pillow_demo.png"

# Unpatched: using CloudPath directly fails
try:
    Image.new("RGB", (10, 10), color=(255, 0, 0)).save(img_path)
except Exception as e:
    print("Pillow without patch: FAILED:", e)
```

--------------------------------

### Check Cache Status

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/docs/docs/caching.ipynb

Verify whether files have been downloaded to the local cache. Listing files does not download them, so the cache directory is initially empty. The leading `!` runs `tree` as a shell command from a Jupyter/IPython cell.

```bash
!tree {ladi.fspath}
```

--------------------------------

### Instantiate S3Client with Custom Endpoint

Source: https://github.com/drivendataorg/cloudpathlib/blob/master/docs/docs/authentication.md

Use this snippet to create an S3Client instance that connects to a custom S3-compatible object store endpoint. You can then use this client to create CloudPath objects or set it as the default client for all future paths.

```python
from cloudpathlib import S3Client, CloudPath

# create a client pointing to the endpoint
client = S3Client(endpoint_url="http://my.s3.server:1234")

# option 1: use the client to create paths
cp1 = client.CloudPath("s3://cloudpathlib-test-bucket/")

# option 2: pass the client as keyword argument
cp2 = CloudPath("s3://cloudpathlib-test-bucket/", client=client)

# option 3: set this client as the default so it is used in any future paths
client.set_as_default_client()
cp3 = CloudPath("s3://cloudpathlib-test-bucket/")
```