### Configuration File Example for Image Format

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/README.md

Example of a YAML configuration file that sets the suffix of downloaded images, which converts them to the given format (PNG here).

```yaml
download:
  image:
    suffix: .png # converts downloaded images to PNG
```

--------------------------------

### Example program output

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/8_pick_domain.md

Console output produced by the domain accessibility test.

```text
获取到7个域名,开始测试
18comic.vip: ok
18comic.org: ok
18comic-palworld.vip: ok
18comic-c.art: ok
jmcomic1.me: ok
jmcomic.me: ok
18comic-palworld.club: ok
```

--------------------------------

### Install jmcomic from source

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/readme/README-en.md

Install the jmcomic library directly from its GitHub repository.

```shell
pip install git+https://github.com/hect0x7/JMComic-Crawler-Python
```

--------------------------------

### jmv Output Example

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/2_command_line.md

Example of the detailed information the jmv command prints for a comic.

```text
🔍 正在查询 禁漫车号 - [350234] 的详情...
──────────────────────────────────────────────────
📖 标题: xxx
🆔 ID: JM350234
🔗 链接: https://18comic.vip/album/350234/
✍️ 作者: Author1, Author2
──────────────────────────────────────────────────
📅 发布日期: 2022-06-15
📅 更新日期: 2023-01-01
📄 总页数: 50
👀 观看: 2M
❤️ 点赞: 77K
💬 评论: 9801
──────────────────────────────────────────────────
🏷️ 标签: 标签1, 标签2, ...
🎭 人物: 角色A, 角色B, ...
📚 作品: 作品1, 作品2, ...
──────────────────────────────────────────────────
📑 章节 (2):
   第1話 上 (id: 350234)
   第2話 下 (id: 350235)
──────────────────────────────────────────────────
[运行结束] 请按回车键关闭窗口...
(下次运行可附加 -y 参数跳过确认)
```

--------------------------------

### Install jmcomic via pip

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/readme/README-en.md

Install the jmcomic library from the official pip source. The -U flag upgrades it if it is already installed.

```shell
pip install jmcomic -U
```

--------------------------------

### Filter: Download First 3 Images of a Chapter

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/5_filter.md

This example filters the images within a chapter, keeping only the first three. It checks whether the detail is a photo (chapter) and then slices it.

```python
from jmcomic import *


class First3ImageDownloader(JmDownloader):

    def do_filter(self, detail):
        if detail.is_photo():
            photo: JmPhotoDetail = detail
            # Supports [start,end,step]
            return photo[:3]

        return detail
```

--------------------------------

### Filter: Download Specific Chapters After an Update

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/5_filter.md

This example filters chapters, downloading only those that appear after a specific chapter ID within a given album. It uses a dictionary `album_after_photo` that maps each album ID to the chapter ID after which downloading should start.

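Stripped of the library types, the filter just described is a single pass over an ordered sequence that starts collecting once a marker has been seen. A minimal sketch under that reading, using plain strings instead of chapter objects; `items_after` and the sample IDs are hypothetical and not part of jmcomic:

```python
from typing import Dict, List, Sequence


def items_after(ids: Sequence[str], marker: str) -> List[str]:
    """Return every id that appears strictly after `marker` in `ids`."""
    seen_marker = False
    result: List[str] = []
    for item in ids:
        if seen_marker:
            result.append(item)
        elif item == marker:
            seen_marker = True
    return result


# Hypothetical data: album id -> last chapter id that was already downloaded.
album_after_photo: Dict[str, str] = {"100": "c2"}
chapters = ["c1", "c2", "c3", "c4"]

new_chapters = items_after(chapters, album_after_photo["100"])
print(new_chapters)
```

If the marker never occurs, the result is empty, so nothing would be downloaded. The library-specific version of the filter is the snippet that follows.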
```python
from jmcomic import *


# Reference: https://github.com/hect0x7/JMComic-Crawler-Python/issues/95
class FindUpdateDownloader(JmDownloader):
    album_after_photo = {
        'xxx': 'yyy'
    }

    def do_filter(self, detail):
        if not detail.is_album():
            return detail

        return self.find_update(detail)

    # Given an album and the ID of its x-th chapter (from album_after_photo),
    # collect every chapter that comes after the x-th chapter
    def find_update(self, album: JmAlbumDetail):
        if album.album_id not in self.album_after_photo:
            return album

        photo_ls = []
        photo_begin = self.album_after_photo[album.album_id]
        is_new_photo = False

        for photo in album:
            if is_new_photo:
                photo_ls.append(photo)

            if photo.photo_id == photo_begin:
                is_new_photo = True

        return photo_ls
```

--------------------------------

### Custom Directory Naming with Specific Album Data

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/9_custom_download_dir_name.md

Use a custom function to generate directory names for specific album IDs. This example maps album IDs to custom names, overriding the default naming.

```python
from jmcomic import JmModuleConfig, download_album

dic = {
    '248965': '社团学姐(爆赞韩漫)'
}

# Amyname
JmModuleConfig.AFIELD_ADVICE['myname'] = lambda album: dic[album.id]
download_album(248965)
```

--------------------------------

### Get Entity Classes and Download Images

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/0_common_usage.md

Retrieve the entity objects for albums, chapters, and images, then download a cover or individual pictures. An image can be downloaded through its image detail object, or directly by URL if it is known not to be obfuscated.

```python
from jmcomic import *

# Client
client = JmOption.default().new_jm_client()

# Album entity
album: JmAlbumDetail = client.get_album_detail('427413')

# Download the album cover and save it as cover.png (other suffixes such as jpg or webp also work)
client.download_album_cover('427413', './cover.png')


def fetch(photo: JmPhotoDetail):
    # Chapter entity
    photo = client.get_photo_detail(photo.photo_id, False)
    print(f'章节id: {photo.photo_id}')

    # Image entity
    image: JmImageDetail
    for image in photo:
        print(f'图片url: {image.img_url}')

    # Download a single image
    client.download_by_image_detail(image, './a.jpg')
    # If the image is known to be unobfuscated, it can also be downloaded directly by URL
    random_image_domain = JmModuleConfig.DOMAIN_IMAGE_LIST[0]
    client.download_image(f'https://{random_image_domain}/media/albums/416130.jpg', './a.jpg')


# Issue the requests from multiple threads
multi_thread_launcher(
    iter_objs=album,
    apply_each_obj_func=fetch
)
```

--------------------------------

### Download a JM Album

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/readme/README-en.md

Download all chapter images of a JM album by providing its ID. Requires the jmcomic module to be installed.

```python
import jmcomic

jmcomic.download_album('123')
```

--------------------------------

### Initialize Option with Plugins

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/6_plugin.md

Load an option file to trigger the configured plugins, such as the login plugin, before performing download operations.

```python
import jmcomic

option = jmcomic.create_option_by_file('xxx.yml')  # create the option object
# By this point the login plugin has already been invoked,
# so every later album download runs in the logged-in state
option.download_album(123)
```

--------------------------------

### Create Option Object from File

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/README.md

Create a configuration object by loading settings from a YAML file. This enables customized download options such as image format conversion.

```python
import jmcomic

# Create the configuration object
option = jmcomic.create_option_by_file('Path to your configuration file, e.g. D:/option.yml')
```

--------------------------------

### Initialize Custom Plugin

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/6_plugin.md

Create an option instance to trigger the registered custom plugin.

```python
from jmcomic import create_option

option = create_option('xxx')
```

--------------------------------

### Download Album with Option File via Command Line

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/README.md

Download an album from the command line and specify a custom option file with the --option parameter.

```shell
jmcomic 123 --option="D:/a.yml"
```

--------------------------------

### Customize Downloads with Options

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/0_common_usage.md

Customize download behavior, such as specifying domains, using proxies, logging in, or converting image formats, by creating an option object and passing it to the download API. Creating the option from a configuration file is recommended.

```python
from jmcomic import *

# 1. Before calling the download API, create and use an option object to customize download behavior.
# Creating the option object from a configuration file is recommended;
# many things can be configured, such as proxies, cookies, and download rules.
# Configuration file syntax: https://jmcomic.readthedocs.io/en/latest/option_file_syntax/
option = create_option_by_file('op.yml')  # create the option object from a configuration file

# 2. Call the download API and pass the option as an argument
download_album(123, option)

# The object-oriented form below is equivalent
option.download_album(123)
```

--------------------------------

### Configure Custom Plugin in YAML

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/6_plugin.md

Reference the custom plugin's key and parameters in the configuration file to activate the plugin.

```yaml
# 3. Use the plugin in the configuration file
plugins:
  after_init: # event
    - plugin: myplugin # key of your custom plugin
      kwargs:
        word: hello jmcomic # parameters of your custom plugin
```

--------------------------------

### Download Album via Command Line

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/README.md

Download a specific album from the command line, without writing a Python script.

```shell
jmcomic 123
```

--------------------------------

### Download Album with Default Settings

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/README.md

Download all chapters and images of a given album ID using the default settings. The `jmcomic` module must be imported first.

```python
import jmcomic  # import this module; it must be installed first

jmcomic.download_album('123')  # pass the ID of the album to download the whole album locally
```

--------------------------------

### Activate Custom Downloader Class

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/5_filter.md

After defining your custom downloader class, activate it with the `use()` method so that its filtering logic is applied during downloads.

```python
MyDownloader.use()
```

--------------------------------

### Download Album with Custom Options

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/readme/README-en.md

Download a JM album using a configuration file that specifies options such as image format conversion. The path to the configuration file must be provided.

```python
import jmcomic

# Create configuration object
option = jmcomic.create_option_by_file('Path to your configuration file, e.g. D:/option.yml')
# Download the album using the option configured
jmcomic.download_album(123, option)
# Equivalent to: option.download_album(123)
```

--------------------------------

### Download Album with Custom Options

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/README.md

Download an album using a previously created option object. This gives fine-grained control over the download process, including network proxies and image processing.

```python
# Download the album using the option object
jmcomic.download_album(123, option)
# Equivalent to: option.download_album(123)
```

--------------------------------

### Configure Download Directory Rule

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/9_custom_download_dir_name.md

Define a custom directory structure for downloads with the `dir_rule` configuration, which takes a base directory and a rule built from album or photo properties.

```yaml
dir_rule:
  # Set the root directory base_dir
  base_dir: D:/a/b/c/

  rule: Bd / Ptitle # P means chapter (photo); title means the chapter's title field
  # This rule downloads images to the path {base_dir}/{Ptitle}/
  # i.e. root directory / chapter title / image files
```

--------------------------------

### Download Specific Chapter via Command Line

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/README.md

Download an album and a specific chapter from the command line. A chapter ID is written with a `p` prefix.

```shell
jmcomic 123 p456
```

--------------------------------

### Create JmApiClient Manually

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/0_common_usage.md

Manually instantiates a `JmApiClient` for API access. Requires specifying the `postman`, `domain_list`, and `retry_times`.

```python
from jmcomic import *

# Web (HTML) client
cl = JmHtmlClient(
    postman=JmModuleConfig.new_postman(),
    domain_list=['18comic.vip'],
    retry_times=1
)

# API (app) client
cl = JmApiClient(
    postman=JmModuleConfig.new_postman(),
    domain_list=JmModuleConfig.DOMAIN_API_LIST,
    retry_times=1
)
```

--------------------------------

### Implement Asynchronous Plugin Lifecycle

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/6_plugin.md

Manage asynchronous tasks in plugins by calling enter_wait_list and overriding wait_until_finish for graceful shutdown.

```python
import threading
import time
from jmcomic import JmOptionPlugin, JmModuleConfig


class MyAsyncPlugin(JmOptionPlugin):
    plugin_key = 'my_async_plugin'

    def invoke(self, **kwargs) -> None:
        # 1. Tell the option that an async plugin is running, so the main thread
        #    stops it or waits for it before exiting
        self.enter_wait_list()

        self.is_running = True
        # 2. Start a new thread...
        self.thread = threading.Thread(target=self.do_async_work)
        self.thread.start()

    def do_async_work(self):
        while self.is_running:
            print('异步工作运行中...')
            time.sleep(1)

    def wait_until_finish(self):
        # 3. Override wait_until_finish to implement graceful shutdown or a blocking wait.
        #    The main program always calls it on exit (if this plugin entered the wait list earlier)
        self.is_running = False  # signal shutdown
        if hasattr(self, 'thread') and self.thread.is_alive():
            self.thread.join()  # block until the thread has exited safely


JmModuleConfig.register_plugin(MyAsyncPlugin)
```

--------------------------------

### Directory Naming by Author and Title

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/9_custom_download_dir_name.md

Configure directory names to include the author and title of the album. The built-in `authoroname` field achieves a similar result.

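Conceptually, a custom naming field is a lookup table from a field name to a function that computes the value for one album. The sketch below illustrates that pattern with a stand-in `Album` dataclass; `field_advice`, `resolve_field`, and the rule that a registered handler wins over a plain attribute are assumptions made for illustration, not the library's actual implementation:

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Album:
    """Hypothetical stand-in for an album entity; not the library's class."""
    author: str
    title: str


# Field name -> function that computes the field's value from an album.
field_advice: Dict[str, Callable[[Album], str]] = {}


def resolve_field(album: Album, name: str) -> str:
    """Prefer a registered handler, otherwise fall back to a plain attribute."""
    handler = field_advice.get(name)
    if handler is not None:
        return handler(album)
    return str(getattr(album, name))


field_advice["myname"] = lambda album: f"【{album.author}】{album.title}"

album = Album(author="Author1", title="Some Title")
print(resolve_field(album, "myname"))
print(resolve_field(album, "title"))
```

The snippet that follows shows the registration as the tutorial writes it against `JmModuleConfig.AFIELD_ADVICE`.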
```python
from jmcomic import JmModuleConfig

# Amyname
JmModuleConfig.AFIELD_ADVICE['myname'] = lambda album: f'【{album.author}】{album.title}'
# album has a built-in field, authoroname, with a similar effect
```

--------------------------------

### Initialize jmcomic Configuration

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/7_advance.md

Use the create_option function to load the given YAML configuration file and start the download task.

```python
from jmcomic import create_option

create_option('myoption.yml')
```

--------------------------------

### Define Custom Downloader Class

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/5_filter.md

To use the filter mechanism, define a custom class that inherits from JmDownloader and override its do_filter method. The return value of do_filter must support len(); return an empty list `[]` if nothing should be downloaded.

```python
from jmcomic import *


class MyDownloader(JmDownloader):
    def do_filter(self, detail):
        # How to override? Refer to JmDownloader.do_filter and the examples below
        ...
```

--------------------------------

### Customize Client Class

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/4_module_custom.md

Define a custom client class that inherits from JmHtmlClient, give it a unique client_key, and register it with JmModuleConfig.register_client. Then reference your client's key in the configuration file.

```python
from jmcomic import *


def custom_client_class():
    """
    This file demonstrates a custom client class
    """

    # By default, the client class JmOption uses is decided by the `client.impl` setting:
    # JmOption looks `client.impl` up in JmModuleConfig.CLASS_CLIENT_IMPL

    # Steps to customize the client:

    # 1. Define the custom Client class
    class MyClient(JmHtmlClient):
        client_key = 'myclient'
        pass

    # 2. Make MyClient take effect
    JmModuleConfig.register_client(MyClient)

    # 3. Use your client.impl in the configuration file, then use that option from then on
    """
    client:
      impl: myclient
    """
```

--------------------------------

### Configure the Advanced Retry Plugin

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/12_domain_strategy.md

Enable AdvancedRetryPlugin in the YAML configuration file to support more complex domain rotation and failure retry strategies.

```yaml
plugins:
  after_init:
    - plugin: advanced_retry # declare and enable the advanced retry plugin
      kwargs:
        retry_config:
          retry_rounds: 3 # number of rounds the whole domain list may be cycled through
          retry_domain_max_times: 5 # maximum number of failures allowed for a single domain
```

--------------------------------

### View Album Details via Command Line

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/README.md

View the details of an album with the `jmv` command. It extracts the album ID from various text formats and displays the information without downloading anything.

```shell
# Enter the album number directly
jmv 350234

# Extract the digits from mixed text (yields 350234)
jmv 350谁还没看过234

# Specify an option file (environment variables are also supported, same usage as above)
jmv 350234 --option="D:/a.yml"

# -y: exit as soon as the command finishes, without waiting for Enter
jmv 350234 -y
```

--------------------------------

### Create and Register Custom Plugin

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/6_plugin.md

Define a custom plugin by inheriting from JmOptionPlugin and registering it with JmModuleConfig.

```python
# 1. Define the custom plugin class
from jmcomic import JmOptionPlugin, JmModuleConfig


# Define a class that inherits from JmOptionPlugin
class MyPlugin(JmOptionPlugin):
    # Key of your plugin
    plugin_key = 'myplugin'

    # Implement the invoke method.
    # Its parameters are up to you; here it takes a single parameter, word
    def invoke(self, word) -> None:
        print(word)


# 2. Make the plugin class take effect
JmModuleConfig.register_plugin(MyPlugin)
```

--------------------------------

### Dynamically Change Search Query with Generator send()

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/0_common_usage.md

Demonstrates advanced generator usage with `send()` for site search, which lets the caller change search parameters such as the search query in the middle of iteration. This requires a `while` loop, because the next page is the return value of `send()` rather than of a `for` loop.

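The mechanism underneath is Python's standard generator protocol: the value passed to `send()` becomes the result of the `yield` expression inside the generator, and `send()` itself returns the next yielded value. A self-contained sketch of that protocol; `page_gen`, its `last_page` limit, and the rule that the page number advances on every step are assumptions for illustration and do not describe how `search_gen` is implemented:

```python
from typing import Dict, Generator, Optional, Tuple

Params = Dict[str, object]


def page_gen(search_query: str, page: int = 1, last_page: int = 3
             ) -> Generator[Tuple[str, int], Optional[Params], None]:
    """Yield (query, page) pairs; a dict passed via send() overrides params."""
    params: Params = {"search_query": search_query, "page": page}
    while int(params["page"]) <= last_page:
        update = yield (str(params["search_query"]), int(params["page"]))
        params["page"] = int(params["page"]) + 1  # advance by default
        if update:
            params.update(update)  # apply caller-supplied overrides


gen = page_gen("mana")
seen = [next(gen)]  # prime the generator before the first send()
try:
    seen.append(gen.send({"search_query": "nana"}))
    seen.append(gen.send(None))  # equivalent to next(gen)
    gen.send(None)  # past last_page, so StopIteration is raised
except StopIteration:
    pass
print(seen)
```

A `for` loop cannot pass values in, which is why both this sketch and the library example that follows drive the generator by hand and catch `StopIteration`.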
```python
generator = html_cl.search_gen('mana')
try:
    page = next(generator)
    while True:
        for aid, atitle in page.iter_id_title():
            print(aid, atitle)

        # Pass new parameters to change the search, e.g. switch the next page to searching 'nana'
        page = generator.send({"search_query": 'nana'})
except StopIteration:
    pass
```

--------------------------------

### Dynamically Get the Full Domain List

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/12_domain_strategy.md

Fetch the latest domain list from the official release page and update the global configuration. Suited to scenarios where domains need to be refreshed dynamically.

```python
from jmcomic import *

# Get the full domain list
domain_list = JmModuleConfig.get_html_domain_all()
print(f"全量域名列表:{domain_list}")

# Replace the global default domain list with the fetched domains
JmModuleConfig.DOMAIN_HTML_LIST = domain_list

op = create_option('option.yml')
# A newly created Client uses the freshly updated DOMAIN_HTML_LIST by default
cl = op.new_jm_client()
```

--------------------------------

### Directory Naming by Chapter Index and Title

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/9_custom_download_dir_name.md

Use the built-in `indextitle` field to name folders after the chapter number and title.

```yaml
# Just use the built-in field indextitle
dir_rule:
  rule: Bd_Pindextitle
```

--------------------------------

### Create JmHtmlClient Manually

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/0_common_usage.md

Manually instantiates a `JmHtmlClient` for web scraping. Requires specifying the `postman`, `domain_list`, and `retry_times`.

```python
# The default approach is to create an option first: the option wraps all configuration,
# option.new_jm_client() creates the client, and the client accesses the JM endpoints.
# The code below shows how to construct a client directly instead
from jmcomic import *

"""
Create a JM client

:param postman: object that performs the HTTP requests; holds cookies, headers, proxies, etc.
:param domain_list: JM domains
:param retry_times: number of retries
"""

# Web (HTML) client
cl = JmHtmlClient(
    postman=JmModuleConfig.new_postman(),
    domain_list=['18comic.vip'],
    retry_times=1
)
```

--------------------------------

### jmcomic.jm_client_impl

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/api/client.md

Overview of the client implementation classes available in the jmcomic library.

```APIDOC
## Client Implementation Modules

### Description
The `jmcomic.jm_client_impl` module provides the core client classes for interacting with the JM comic platform. It includes both HTML-based and API-based client implementations.

### Members
- **JmHtmlClient**: A client implementation that interacts with the platform via HTML parsing.
- **JmApiClient**: A client implementation that interacts with the platform via its internal API endpoints.
```

--------------------------------

### Configure Login Plugin in YAML

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/6_plugin.md

Define the login plugin in the plugins section of an option configuration file to authenticate automatically during initialization.

```yaml
plugins: # plugin configuration
  after_init: # run the plugins automatically on the after_init event
    - plugin: login # key of the plugin
      kwargs: # parameters passed to the plugin (kwargs), defined by the plugin class
        username: un # JM account
        password: pw # password
```

--------------------------------

### Configure Static Domains and Client

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/12_domain_strategy.md

Define the domain list in a YAML configuration file and load it with the jmcomic library to get automatic domain rotation.

```yaml
# Example option.yml
client:
  impl: html
  domain:
    html:
      - 18comic.vip
      - 18comic.org
```

```python
from jmcomic import *

# Build the Option and Client from the configuration file.
# The Option loads the domain list above; if a request to the first domain fails,
# the next domain in the list is retried automatically.
op = create_option('option.yml')
cl = op.new_jm_client()
```

--------------------------------

### Directory Naming by Comic ID and Title

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/9_custom_download_dir_name.md

Set directory names to include an ID and title. This example registers a custom field handler for photo (chapter) details.

```python
from jmcomic import JmModuleConfig

# Pmyname
JmModuleConfig.PFIELD_ADVICE['myname'] = lambda photo: f'【{photo.id}】{photo.title}'
```

--------------------------------

### Search Site with Category and Sub-Category

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/0_common_usage.md

Performs a site search that specifies both a main category (e.g., Doujin) and a sub-category (e.g., CG). This functionality is specific to the web client.

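Judging by the URL quoted in the snippet that follows, this search is a plain GET request whose path carries the category and sub-category and whose query string carries the remaining parameters. A minimal sketch of assembling that URL with the standard library; `build_search_url` is a hypothetical helper rather than a jmcomic function, and reading `o` as the sort order and `t` as the time range is an assumption:

```python
from urllib.parse import urlencode


def build_search_url(domain: str, category: str, sub_category: str,
                     search_query: str, page: int = 1,
                     order: str = "mr", time: str = "a") -> str:
    """Assemble a search URL shaped like the one quoted in the tutorial."""
    path = f"/search/photos/{category}/sub/{sub_category}"
    query = urlencode({
        "main_tag": 0,
        "search_query": search_query,
        "page": page,
        "o": order,  # assumed: sort order
        "t": time,   # assumed: time range
    })
    return f"https://{domain}{path}?{query}"


url = build_search_url("18comic.vip", "doujin", "CG", "mana")
print(url)
```

`urlencode` also percent-encodes non-ASCII queries, which matters for searches in Chinese or Japanese. In practice the library call shown next builds and sends the request for you.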
```python
from jmcomic import *

op = create_option_by_file('op.yml')
# Create the web (HTML) client
html_cl = op.new_jm_client(impl='html')

# Site search with a category and a sub-category
# category = JmMagicConstants.CATEGORY_DOUJIN = doujin
# sub-category = JmMagicConstants.SUB_DOUJIN_CG = CG
# Actual URL: https://18comic.vip/search/photos/doujin/sub/CG?main_tag=0&search_query=mana&page=1&o=mr&t=a
page = html_cl.search_site(search_query='mana',
                           category=JmMagicConstants.CATEGORY_DOUJIN,
                           sub_category=JmMagicConstants.SUB_DOUJIN_CG,
                           page=1,
                           )
# Print the contents of the page
for aid, atitle in page.iter_id_title():
    print(aid, atitle)
```

--------------------------------

### Configure jmcomic Download Rules and Plugins

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/7_advance.md

YAML configuration that defines the download path, image format, login credentials, incremental update logic, and the behavior of the compression plugin.

```yaml
dir_rule: # download path rule
  rule: Bd_Aid
  base_dir: D:/jmcomic

download:
  image:
    suffix: .jpg # convert images to jpg

client:
  domain:
    - 18comic.vip # specify the domain

plugins:
  after_init:
    - plugin: login # login plugin
      kwargs:
        username: un
        password: pw

    - plugin: find_update # plugin that downloads only new chapters
      kwargs:
        145504: 290266 # download the chapters of album 145504 that come after chapter 290266

  after_album:
    - plugin: zip # compression plugin
      kwargs:
        level: photo # per chapter: one archive for each chapter
        filename_rule: Ptitle # naming rule for the archives
        zip_dir: D:/jmcomic # folder where the archives are stored
        delete_original_file: true # after successful compression, delete all original files and folders
```

--------------------------------

### Customize Entity Classes (Album, Photo, Image)

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/4_module_custom.md

Replace default entity classes such as JmAlbumDetail or JmPhotoDetail with custom subclasses. Override a property such as `custom`, or the `get_dirname` method, to change entity behavior, then assign your classes to JmModuleConfig.CLASS_ALBUM and JmModuleConfig.CLASS_PHOTO.

```python
from jmcomic import *


def custom_album_photo_image_detail_class():
    """
    This function demonstrates custom entity classes (album/chapter/image)

    When using the path rule DirRule, you may need custom entity attributes, for example:

    dir_rule:
      base_dir: ${workspace}
      rule: Bd_Acustom_Pcustom
      # Optional: normalize directory names to Traditional/Simplified Chinese
      # (None/zh-cn/zh-tw); disabled by default
      # normalize_zh: zh-cn

    Acustom and Pcustom above are both custom fields.
    To use such custom fields, replace the default entity classes as shown below
    """

    # Custom album entity class
    class MyAlbum(JmAlbumDetail):
        # Custom `custom` property
        @property
        def custom(self):
            return f'custom_{self.title}'

    # Custom chapter entity class
    class MyPhoto(JmPhotoDetail):
        # Custom `custom` property
        @property
        def custom(self):
            return f'custom_{self.title}'

    """
    v2.3.3: a more flexible way is supported; a function can be used to the same effect, see below
    """

    class MyAlbum2(JmAlbumDetail):

        def get_dirname(self, ref: str) -> str:
            if ref == 'custom':
                return f'custom_{self.name}'

            return super().get_dirname(ref)

    # Finally, replace the default entity classes so that your custom classes take effect
    JmModuleConfig.CLASS_ALBUM = MyAlbum
    JmModuleConfig.CLASS_PHOTO = MyPhoto
```

--------------------------------

### Customize Download Callback

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/4_module_custom.md

Override download callback methods by creating a custom downloader class that inherits from JmDownloader and assigning it to JmModuleConfig.CLASS_DOWNLOADER.

```python
from jmcomic import *


def custom_download_callback():
    """
    This function demonstrates custom callbacks during download
    """

    # The download work in jmcomic is carried out by the class JmModuleConfig.CLASS_DOWNLOADER.
    # By default this is JmDownloader, which inherits from DownloadCallback.
    # You can write a custom class that inherits from JmDownloader and overrides
    # the DownloadCallback methods to implement custom callbacks
    class MyDownloader(JmDownloader):
        # Override the callback that runs after an album has finished downloading
        def after_album(self, album: JmAlbumDetail):
            print(f'album下载完毕: {album}')
            pass

    # Finally, make your custom class take effect
    JmModuleConfig.CLASS_DOWNLOADER = MyDownloader
```

--------------------------------

### Get Domains via GitHub as a Fallback

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/12_domain_strategy.md

When the official release page is unreachable, resolve the latest domains through a GitHub repository and combine them with the retry mechanism for downloading.

```python
from jmcomic import *

# This request goes to github.com, which stays reachable on most ordinary networks
domains = JmModuleConfig.get_html_domain_all_via_github()

op = JmOption.default()
# Can be combined with the retry mechanism to rotate several times on failure
op.client.retry_times = 3

# Build a new Client from the domain pool (remember to specify impl='html')
# and assign it back to op so that it takes effect in later downloads
op.client = op.new_jm_client(domain_list=domains, impl='html')

download_album('438696', op)
```

--------------------------------

### Download Comics and Chapters

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/0_common_usage.md

Use these functions to download entire albums or specific chapters. Several albums can be downloaded at once by passing a list of IDs.

```python
from jmcomic import *

# Download the album with id 438696 (https://18comic.vip/album/438696)
download_album(438696)

# Download a chapter (https://18comic.vip/photo/438696)
download_photo(438696)

# Download several albums at once
download_album([123, 456, 789])
```

--------------------------------

### Get a Single Redirect Domain

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/12_domain_strategy.md

Get a single usable web domain by visiting the permanent redirect page.

```python
from jmcomic import *

# Get the single web domain that is currently usable
domain = JmModuleConfig.get_html_domain()

op = JmOption.default()
op.client = op.new_jm_client(domain_list=[domain], impl='html')
```

--------------------------------

### Specify jmv Options via Command-line Argument

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/2_command_line.md

Like jmcomic, jmv accepts the --option argument to specify the path of a custom option configuration file.

```sh
jmv 350234 --option="D:/a.yml"
```

--------------------------------

### Specify jmcomic Options via Command-line Argument

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/2_command_line.md

Pass the path of a custom option configuration file to jmcomic with the --option argument.

```sh
jmcomic 123 --option="D:/a.yml"
```

--------------------------------

### Filter Comics by Category and Sort by Views

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/0_common_usage.md

Retrieves the first page of comics across all categories, sorted by view count. The time range, category, and order are given as JmMagicConstants values.

```python
page: JmCategoryPage = cl.categories_filter(
    page=1,
    time=JmMagicConstants.TIME_ALL,  # all time; see JmMagicConstants for the accepted values
    category=JmMagicConstants.CATEGORY_ALL,  # all categories; see JmMagicConstants for the accepted values
    order_by=JmMagicConstants.ORDER_BY_VIEW,  # sort by view count; see JmMagicConstants for the accepted values
)
```

--------------------------------

### Iterate Through Paginated Site Search Results

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/0_common_usage.md

Uses a generator to loop through the paginated results of a site search with category and sub-category filters. This is for the web client only.

```python
for page in html_cl.search_gen(search_query='mana',
                               category=JmMagicConstants.CATEGORY_DOUJIN,
                               sub_category=JmMagicConstants.SUB_DOUJIN_CG,
                               page=1,  # starting page number
                               ):
    # Print the contents of the page
    for aid, atitle in page.iter_id_title():
        print(aid, atitle)
```

--------------------------------

### Dynamically Modify Search Conditions with Generator send()

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/0_common_usage.md

Demonstrates advanced generator usage with `send()` to change query parameters in the middle of iteration. This requires a `while` loop, because the next page is the return value of `send()`.

```python
generator = cl.categories_filter_gen(page=1, time=JmMagicConstants.TIME_WEEK)
try:
    page = next(generator)  # prime the generator first
    while True:
        # Print the current page
        for aid, atitle in page:
            print(aid, atitle)

        # Suppose the next page should use a different sort order:
        # send a dict of new parameters to override the original query conditions
        page = generator.send({"order_by": JmMagicConstants.ORDER_BY_LATEST})
except StopIteration:
    pass
```

--------------------------------

### Download Comics with jmcomic

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/2_command_line.md

Use the jmcomic command to download albums and photos (chapters) by their IDs. Separate multiple IDs with spaces.

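In the command-line examples of this document, a bare number denotes an album and a `p` prefix denotes a photo (chapter). A small sketch of splitting an argument list along that convention; `split_ids` is a hypothetical helper written for illustration, and the real CLI parser may handle more cases than this:

```python
from typing import List, Tuple


def split_ids(args: List[str]) -> Tuple[List[str], List[str]]:
    """Split CLI-style arguments into (album_ids, photo_ids)."""
    albums: List[str] = []
    photos: List[str] = []
    for arg in args:
        if arg.startswith("p"):
            photos.append(arg[1:])  # drop the 'p' marker
        else:
            albums.append(arg)
    return albums, photos


albums, photos = split_ids("123 456 p333".split())
print(albums, photos)
```

Under this reading, the command shown next requests albums 123 and 456 plus chapter 333.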
```sh
jmcomic 123 456 p333
```

--------------------------------

### Define Custom Album Field for Directory Naming

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/9_custom_download_dir_name.md

Create a custom album field for use in directory naming rules. Reference the field name in `dir_rule.rule` and provide a handler function in `JmModuleConfig.AFIELD_ADVICE`.

```python
from jmcomic import JmModuleConfig

# Write a function and add it to the JmModuleConfig.AFIELD_ADVICE dict,
# with the field name as the key and the function as the value
JmModuleConfig.AFIELD_ADVICE['myname'] = lambda album: f'[{album.id}] {album.title}'
```

--------------------------------

### Specify jmcomic Options via Environment Variable

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/2_command_line.md

Set the JM_OPTION_PATH environment variable to the path of your option configuration file; jmcomic then picks it up without the --option argument.

```sh
jmcomic 123
```

--------------------------------

### Replace String in Folder Path

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/9_custom_download_dir_name.md

Use the `replace_path_string` plugin to substitute specific text within download folder paths, for example to keep names consistent.

```yaml
plugins:
  after_init:
    - plugin: replace_path_string
      kwargs:
        replace:
          # {original text to replace on the left}: {replacement text on the right}
          kyockcho: きょくちょ
```

--------------------------------

### Browse Categories and Rankings

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/0_common_usage.md

Access content by category or ranking. This is similar to site search but without a search query, so it retrieves every item within a category.

```python
from jmcomic import *

# Create the client
op = JmOption.default()
cl = op.new_jm_client()

# Call the category API
# With the parameters below, this call means:
```

--------------------------------

### Search and Download Comics

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/0_common_usage.md

Combine searching and downloading: search for comics by tag or other criteria, collect their IDs, and then download the albums that were found.

```python
from jmcomic import *

option = JmOption.default()
client = option.new_jm_client()

tag = '無修正'
# To search by tag, use search_tag.
# Search the first page.
page: JmSearchPage = client.search_tag(tag, page=1)

aid_list = []

for aid, atitle, tag_list in page.iter_id_title_tag():  # use the page's iter_id_title_tag iterator
    if tag in tag_list:
        print(f'[标签/{tag}] 发现目标: [{aid}]: [{atitle}]')
        aid_list.append(aid)

download_album(aid_list, option)
```

--------------------------------

### Customize Option Class

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/4_module_custom.md

Create a custom option class by inheriting from JmOption and overriding methods such as __init__ or default. Assign your class to JmModuleConfig.CLASS_OPTION to enable it.

```python
from jmcomic import *


def custom_option_class():
    """
    This function demonstrates a custom option class
    """

    # The jmcomic module supports a custom Option class:
    # write your own class that inherits from JmOption and override some of its methods.

    class MyOption(JmOption):

        def __init__(self, *args, **kwargs):
            print('MyOption 初始化开始')
            super().__init__(*args, **kwargs)

        @classmethod
        def default(cls):
            print('调用了MyOption.default()')
            return super().default()

    # Finally, replace the default Option class
    JmModuleConfig.CLASS_OPTION = MyOption
```

--------------------------------

### Manually Invoke Plugins

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/6_plugin.md

Trigger the plugins of a specific event manually by calling the call_all_plugin method on an existing option object.

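The idea behind event-keyed plugins can be pictured as a dictionary from event name to a list of callables, where firing an event runs every entry and an unknown event runs nothing. A minimal sketch of that dispatch; `plugins_by_event` and `fire_event` are hypothetical names, and the library's own bookkeeping may differ:

```python
from typing import Callable, Dict, List

# Hypothetical registry: event name -> plugins configured for that event.
plugins_by_event: Dict[str, List[Callable[[], None]]] = {}
calls: List[str] = []


def fire_event(event: str) -> int:
    """Run every plugin registered for `event`; return how many ran."""
    handlers = plugins_by_event.get(event, [])
    for handler in handlers:
        handler()
    return len(handlers)


plugins_by_event["after_init"] = [lambda: calls.append("login")]

ran_init = fire_event("after_init")  # one plugin is configured
ran_other = fire_event("my_event")   # nothing configured, so nothing happens
print(ran_init, ran_other, calls)
```

The snippet that follows shows the corresponding calls on a real option object.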
```python
# Assume you have already created an option object
from jmcomic import JmOption

option: JmOption

# Manually invoke the plugins registered for the after_init event
option.call_all_plugin('after_init')
# Manually invoke the plugins of a specific event
# (nothing happens if you have not configured plugins for that event)
option.call_all_plugin('my_event')
```

--------------------------------

### Test Jmcomic domain accessibility

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/8_pick_domain.md

Uses multi-threading to fetch domains from a template URL and verifies each one by attempting to retrieve album details.

```python
"""
Purpose of this script: test which JM domains the current IP can reach
"""

from jmcomic import *

option = JmOption.default()

meta_data = {
    # 'proxies': ProxyBuilder.clash_proxy()
}

disable_jm_log()


def get_all_domain():
    template = 'https://jmcmomic.github.io/go/{}.html'
    url_ls = [
        template.format(i)
        for i in range(300, 309)
    ]
    domain_set: Set[str] = set()

    def fetch_domain(url):
        from curl_cffi import requests as postman
        text = postman.get(url, allow_redirects=False, **meta_data).text
        for domain in JmcomicText.analyse_jm_pub_html(text):
            if domain.startswith('jm365.work'):
                continue
            domain_set.add(domain)

    # Fetch every publish page concurrently
    multi_thread_launcher(
        iter_objs=url_ls,
        apply_each_obj_func=fetch_domain,
    )
    return domain_set


domain_set = get_all_domain()
print(f'获取到{len(domain_set)}个域名,开始测试')
domain_status_dict = {}


def test_domain(domain: str):
    client = option.new_jm_client(impl='html', domain_list=[domain], **meta_data)
    status = 'ok'

    try:
        client.get_album_detail('123456')
    except Exception as e:
        # Record the error instead of 'ok' for unreachable domains
        status = str(e.args)

    domain_status_dict[domain] = status


# Test every domain concurrently
multi_thread_launcher(
    iter_objs=domain_set,
    apply_each_obj_func=test_domain,
)

for domain, status in domain_status_dict.items():
    print(f'{domain}: {status}')
```

--------------------------------

### Normalize Chinese Characters in Directory Names

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/9_custom_download_dir_name.md

Enable simplified or traditional Chinese normalization for directory names to ensure
consistency. This feature requires the optional `zhconv` library.

```yaml
dir_rule:
  base_dir: D:/a/b/c/
  rule: Bd / Ptitle
  normalize_zh: zh-cn # Allowed values: None (default, no conversion) / zh-cn / zh-tw
```

--------------------------------

### View Comic Details with jmv

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/2_command_line.md

Use the jmv command to view comic details without downloading. It can extract comic IDs from mixed text or directly from numbers.

```sh
jmv 350234
```

```sh
jmv JM350234
```

--------------------------------

### Filter Logs by Topic using Plugin

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/11_log_custom.md

Configure the `log_topic_filter` plugin to keep only the logs for specific topics, such as 'api' and 'html', so that the output stays focused on the messages you care about.

```yaml
log: true

plugins:
  after_init:
    - plugin: log_topic_filter # log topic filter plugin
      kwargs:
        whitelist: [ # keep only api and html, the two log topics the Client emits when sending requests
          'api',
          'html',
        ]
```

--------------------------------

### Iterate Through Filtered Comics with Generator

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/0_common_usage.md

Uses a generator to loop through paginated results of filtered comics. Basic usage involves a simple for loop.

```python
# cl is a client created earlier, e.g. via JmOption.default().new_jm_client()
for page in cl.categories_filter_gen(page=1,  # starting page number
                                     # category parameters below
                                     time=JmMagicConstants.TIME_WEEK,
                                     category=JmMagicConstants.CATEGORY_ALL,
                                     order_by=JmMagicConstants.ORDER_BY_VIEW,
                                     ):
    for aid, atitle in page:
        print(aid, atitle)
```

--------------------------------

### Handle jmcomic Exceptions

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/0_common_usage.md

Implement exception handling for various jmcomic errors, including missing albums, JSON resolution failures, and request retries.
The `check_exception=True` parameter in download functions can help detect partial download failures.

```python
from jmcomic import *

# Client
client = JmOption.default().new_jm_client()

# Catch exceptions that may occur while fetching album/chapter details
try:
    # Request the album entity
    album: JmAlbumDetail = client.get_album_detail('427413')
except MissingAlbumPhotoException as e:
    print(f'The album with id={e.error_jmid} does not exist')

except JsonResolveFailException as e:
    print('Failed to parse JSON')
    # Response object
    resp = e.resp
    print(f'resp.text: {resp.text}, resp.status_code: {resp.status_code}')

except RequestRetryAllFailException as e:
    print('Request failed, retries exhausted')

except JmcomicException as e:
    # Catch-all fallback for any other jmcomic exception
    print(f'jmcomic hit an exception: {e}')

# In multi-threaded downloads, a thread other than the current one may fail and raise,
# and JmDownloader has fields that record the exceptions raised in those threads.
# Passing check_exception=True makes the downloader actively check for download exceptions;
# if any exist, the current thread raises a PartialDownloadFailedException.
# This parameter mainly serves to detect partial download failures:
# the current thread cannot see exceptions raised in other threads
# (such as the chapter-download and image-download threads),
# so a try/except around download_album does not catch them.
try:
    album, downloader = download_album(123, check_exception=True)
except PartialDownloadFailedException as e:
    downloader: JmDownloader = e.downloader
    print(f'Partial download failure, failed chapters: {downloader.download_failed_photo}, '
          f'failed images: {downloader.download_failed_image}')
```

--------------------------------

### Search for Comics

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/0_common_usage.md

Search for comics using keywords and filters. The `search_site` function performs paginated site searches, and passing an album ID as the query looks that album up directly.
```python
from jmcomic import *

client = JmOption.default().new_jm_client()

# Paginated query; search_site corresponds to the site search on the JM website
page: JmSearchPage = client.search_site(search_query='+MANA +无修正', page=1)
print(f'Total results: {page.total}, page size: {page.page_size}, page count: {page.page_count}')

# Iterating a page defaults to page.iter_id_title(), which yields album_id, title each time
for album_id, title in page:
    print(f'[{album_id}]: {title}')

# Search directly by JM album ID
page = client.search_site(search_query='427413')
album: JmAlbumDetail = page.single_album
print(album.tags)
```

--------------------------------

### Disable Logs for a Specific Plugin

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/11_log_custom.md

Add `log: false` to a plugin's configuration to stop it from printing its own logs. This is useful for reducing noise from a specific plugin.

```yaml
plugins:
  after_init:
    - plugin: client_proxy
      log: false # the plugin itself prints no logs
      kwargs:
        proxy_client_key: photo_concurrent_fetcher_proxy
        whitelist: [ api, ]
```

--------------------------------

### Retrieve Monthly Ranking

Source: https://github.com/hect0x7/jmcomic-crawler-python/blob/master/assets/docs/sources/tutorial/0_common_usage.md

Fetches the monthly ranking of comics. This is a convenience method that internally calls `categories_filter`.

```python
page: JmCategoryPage = cl.month_ranking(1)
```
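
A ranking page such as the one returned by `cl.month_ranking(1)` is typically consumed by iterating over it. The sketch below is not part of jmcomic: `collect_album_ids` is a hypothetical helper, and it assumes only that each page yields `(album_id, title)` pairs, as the category pages in the examples above do. Plain lists stand in for real pages so the snippet runs without network access.

```python
from typing import Iterable, List, Tuple

# Hypothetical stand-in for a ranking page: any iterable of (album_id, title) pairs.
Page = Iterable[Tuple[str, str]]


def collect_album_ids(pages: Iterable[Page], limit: int) -> List[str]:
    """Collect up to `limit` album IDs from the given pages, preserving order."""
    ids: List[str] = []
    for page in pages:
        for album_id, _title in page:
            if len(ids) >= limit:
                return ids
            ids.append(album_id)
    return ids


# Fake data standing in for real pages (assumption: real pages iterate the same way).
fake_pages = [
    [('101', 'Title A'), ('102', 'Title B')],
    [('103', 'Title C'), ('104', 'Title D')],
]

top_three = collect_album_ids(fake_pages, limit=3)
print(top_three)
```

With a real client, the same helper could presumably be fed `[cl.month_ranking(1)]`, and the resulting list passed on to `download_album`.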