### Install Pyaxmlparser via pip

Source: https://context7.com/appknox/pyaxmlparser/llms.txt

The standard method for installing the pyaxmlparser library in a Python environment. This command fetches the package from PyPI and installs necessary dependencies.

```bash
pip install pyaxmlparser
```

--------------------------------

### AXMLParser CLI Usage Example

Source: https://github.com/appknox/pyaxmlparser/blob/master/README.rst

Demonstrates how to use the apkinfo command-line interface (CLI) tool to extract information from an Android APK file. This tool parses the AndroidManifest.xml within the APK.

```shell
➜ apkinfo --help
Usage: apkinfo [OPTIONS] FILENAME

Options:
  -s, --silent  Don't print any debug or warning logs
  --help        Show this message and exit.
```

```shell
$ apkinfo ~/Downloads/com.hardcodedjoy.roboremo.15.apk
APK: /home/chillaranand/Downloads/com.hardcodedjoy.roboremo.15.apk
App name: RoboRemo
Package: com.hardcodedjoy.roboremo
Version name: 2.0.0
Version code: 15
Is it Signed: True
Is it Signed with v1 Signatures: True
Is it Signed with v2 Signatures: True
Is it Signed with v3 Signatures: False
```

--------------------------------

### Pyaxmlparser Usage

Source: https://context7.com/appknox/pyaxmlparser/llms.txt

This section provides examples of how to use the pyaxmlparser library for common Android APK analysis tasks.

```APIDOC
## Pyaxmlparser Library Usage

### Description
This documentation outlines the primary ways to interact with the pyaxmlparser library for Android APK analysis.

### Core Classes
- **`APK`**: The highest-level interface for general APK analysis.
- **`AXMLPrinter`**: For lower-level XML parsing.
- **`ARSCParser`**: For parsing Android's resource tables.

### Installation
Install the library using pip:
```bash
pip install pyaxmlparser
```

For enhanced file type detection, install `python-magic`:
```bash
pip install python-magic
```

### Basic Usage Example (APK Class)
```python
from pyaxmlparser import APK

try:
    apk_file_path = "path/to/your/app.apk"
    apk = APK(apk_file_path)

    # Extract basic information
    package_name = apk.package
    version_name = apk.version_name
    version_code = apk.version_code
    permissions = apk.permissions

    print(f"Package Name: {package_name}")
    print(f"Version Name: {version_name}")
    print(f"Version Code: {version_code}")
    print(f"Permissions: {', '.join(permissions)}")

    # Extract certificate information
    certificates = apk.cert
    if certificates:
        print("Certificates:")
        for cert in certificates:
            print(f"  Subject: {cert['subject']}")
            print(f"  Issuer: {cert['issuer']}")
            print(f"  SHA1: {cert['sha1']}")

except FileNotFoundError:
    print(f"Error: APK file not found at {apk_file_path}")
except Exception as e:
    print(f"An error occurred: {e}")
```

### Use Cases
- **Metadata Extraction**: Quickly retrieve package name, version, and other manifest details.
- **Security Scanning**: Analyze declared permissions for potential risks.
- **MDM Integration**: Use extracted data within Mobile Device Management systems.
- **CI/CD Pipelines**: Automate checks and analysis in Android build processes.
- **Malware Analysis**: Identify suspicious patterns or permissions in APKs.
```

--------------------------------

### Handle DEX Files in APKs

Source: https://context7.com/appknox/pyaxmlparser/llms.txt

This section covers methods for accessing Dalvik Executable (DEX) files, which contain the application's compiled code. It supports both single-DEX and multi-DEX APKs, allowing you to check for multi-DEX presence, get DEX file names, retrieve the main DEX file's bytes, and iterate through all DEX files.

```python
from pyaxmlparser import APK

apk = APK('/path/to/app.apk')

# Check if APK has multiple DEX files
is_multidex = apk.is_multidex()
print(f"Multi-DEX: {is_multidex}")

# Get names of all DEX files
dex_names = list(apk.get_dex_names())
print(dex_names)
# Output: ['classes.dex', 'classes2.dex', 'classes3.dex']

# Get raw bytes of main classes.dex
main_dex = apk.get_dex()
print(f"Main DEX size: {len(main_dex)} bytes")

# Iterate over all DEX files
for dex_bytes in apk.get_all_dex():
    print(f"DEX size: {len(dex_bytes)} bytes")
```

--------------------------------

### Access APK Contents

Source: https://context7.com/appknox/pyaxmlparser/llms.txt

Access the files contained within the APK archive. You can list all files, retrieve specific file contents, or get file type information.

```APIDOC
## get_files() / get_file() - Access APK Contents

### Description
Access the files contained within the APK archive. You can list all files, retrieve specific file contents, or get file type information.

### Method
N/A (These are methods of the APK class)

### Endpoint
N/A

### Parameters
N/A

### Request Example
```python
from pyaxmlparser import APK

apk = APK('/path/to/app.apk')

# List all files in the APK
files = apk.get_files()
for filename in files:
    print(filename)

# Read a specific file's raw bytes
try:
    manifest_bytes = apk.get_file('AndroidManifest.xml')
    print(f"Manifest size: {len(manifest_bytes)} bytes")
except Exception as e:
    print(f"File not found: {e}")

# Get file types (requires python-magic library)
file_types = apk.get_files_types()
for filename, filetype in file_types.items():
    print(f"{filename}: {filetype}")

# Get CRC32 checksums for all files
crc_values = apk.get_files_crc32()
for filename, crc in crc_values.items():
    print(f"{filename}: {crc:08x}")
```

### Response
#### Success Response (200)
N/A (These are method return values)

#### Response Example
```
AndroidManifest.xml
classes.dex
resources.arsc
res/drawable/icon.png
lib/arm64-v8a/libnative.so
Manifest size: 1234 bytes
classes.dex: Dalvik dex file
resources.arsc: Android Resource
AndroidManifest.xml: XML document text
```
```

--------------------------------

### Get SDK Version Information from APK

Source: https://context7.com/appknox/pyaxmlparser/llms.txt

Retrieve the minimum, target, and maximum SDK versions from an APK's manifest. These versions define the Android compatibility range for the application. The `get_effective_target_sdk_version()` method provides an integer representation, defaulting to 1 if not explicitly set.

```python
from pyaxmlparser import APK

apk = APK('/path/to/app.apk')

# Get SDK version information
min_sdk = apk.get_min_sdk_version()
target_sdk = apk.get_target_sdk_version()
max_sdk = apk.get_max_sdk_version()

print(f"Min SDK: {min_sdk}")        # e.g., "21" (Android 5.0)
print(f"Target SDK: {target_sdk}")  # e.g., "33" (Android 13)
print(f"Max SDK: {max_sdk}")        # e.g., None (usually not set)

# Get effective target SDK (returns int, defaults to 1 if not set)
effective_target = apk.get_effective_target_sdk_version()
print(f"Effective Target SDK: {effective_target}")  # e.g., 33
```

--------------------------------

### Extract Certificate and Signature Information

Source: https://context7.com/appknox/pyaxmlparser/llms.txt

This functionality allows you to inspect the digital certificates and signature schemes (v1, v2, v3) used to sign an APK. You can check if the APK is signed, retrieve all unique certificates, get certificates from specific signature versions, list signature file names, and extract raw DER-encoded certificates.

```python
from pyaxmlparser import APK

apk = APK('/path/to/app.apk')

# Check signature presence
print(f"Is signed: {apk.is_signed()}")
print(f"V1 (JAR) signature: {apk.is_signed_v1()}")
print(f"V2 signature: {apk.is_signed_v2()}")
print(f"V3 signature: {apk.is_signed_v3()}")

# Get all unique certificates (from all signature versions)
certificates = apk.get_certificates()
for cert in certificates:
    print(f"Subject: {cert.subject.human_friendly}")
    print(f"Issuer: {cert.issuer.human_friendly}")
    print(f"Serial: {cert.serial_number}")
    print(f"Not Before: {cert.not_valid_before}")
    print(f"Not After: {cert.not_valid_after}")
    print(f"SHA256: {cert.sha256.hex()}")

# Get certificates from specific signature version
v1_certs = apk.get_certificates_v1()
v2_certs = apk.get_certificates_v2()
v3_certs = apk.get_certificates_v3()

# Get signature file names (v1 signatures)
sig_names = apk.get_signature_names()
print(sig_names)  # e.g., ['META-INF/CERT.RSA']

# Get raw DER-encoded certificate
for sig_name in sig_names:
    cert_der = apk.get_certificate_der(sig_name)
    print(f"Certificate DER size: {len(cert_der)} bytes")
```

--------------------------------

### AXMLParser Python Package Usage

Source: https://github.com/appknox/pyaxmlparser/blob/master/README.rst

Shows how to use the AXMLParser Python package to programmatically parse an Android APK file and access its attributes. It requires the 'pyaxmlparser' library to be installed.

```python
from pyaxmlparser import APK

apk = APK('/foo/bar.apk')
print(apk.package)
print(apk.version_name)
print(apk.version_code)
print(apk.icon_info)
print(apk.icon_data)
print(apk.application)
```

--------------------------------

### Initialize APK and Extract Metadata

Source: https://context7.com/appknox/pyaxmlparser/llms.txt

Demonstrates how to load an APK file from a path or raw bytes and access core application metadata such as package name, version, and signature status.

```python
from pyaxmlparser import APK

# Load an APK file from path
apk = APK('/path/to/app.apk')

# Or load from raw bytes
with open('/path/to/app.apk', 'rb') as f:
    apk = APK(f.read(), raw=True)

# Access basic app information via properties
print(f"App Name: {apk.application}")
print(f"Package: {apk.packagename}")
print(f"Version Name: {apk.version_name}")
print(f"Version Code: {apk.version_code}")

# Check signature status
print(f"Is Signed: {apk.signed}")
print(f"Signed v1: {apk.signed_v1}")
print(f"Signed v2: {apk.signed_v2}")
print(f"Signed v3: {apk.signed_v3}")

# Get app icon information and data
icon_bytes = apk.icon_data
if icon_bytes:
    with open('app_icon.png', 'wb') as f:
        f.write(icon_bytes)
```

--------------------------------

### APK Class - Initialization and Metadata

Source: https://context7.com/appknox/pyaxmlparser/llms.txt

The APK class is the primary entry point for loading an APK file and accessing basic application metadata like package name, version, and signature status.

```APIDOC
## CLASS APK

### Description
Initializes the APK object from a file path or raw bytes and parses the AndroidManifest.xml.

### Method
Constructor

### Parameters
#### Path Parameters
- **path** (string) - Required - File system path to the APK file.

#### Request Body
- **raw_data** (bytes) - Optional - Raw APK bytes if loading from memory.
- **raw** (boolean) - Optional - Set to True when passing raw bytes.

### Response
- **application** (string) - The app name.
- **packagename** (string) - The unique package identifier.
- **version_name** (string) - Human-readable version string.
- **version_code** (int) - Internal version integer.
- **signed** (boolean) - Signature verification status.
```

--------------------------------

### Inspect APK Metadata via CLI

Source: https://context7.com/appknox/pyaxmlparser/llms.txt

Usage of the apkinfo command-line tool to quickly inspect APK metadata, including signature status and version information.

```bash
apkinfo /path/to/app.apk
apkinfo --silent /path/to/app.apk
apkinfo --help
```

--------------------------------

### Initialize APK Parser

Source: https://context7.com/appknox/pyaxmlparser/llms.txt

Basic usage of the APK class to load an Android package file. This allows access to high-level metadata such as package name, version, and manifest details.

```python
from pyaxmlparser import APK

apk = APK('path/to/your/app.apk')
print(apk.package)
print(apk.version_name)
```

--------------------------------

### SDK Version Methods

Source: https://context7.com/appknox/pyaxmlparser/llms.txt

Retrieve the minimum, target, and maximum SDK versions from the manifest, which determine Android version compatibility for the application.

```APIDOC
## SDK Version Methods

### Description
Retrieve the minimum, target, and maximum SDK versions from the manifest, which determine Android version compatibility for the application.

### Method
N/A (These are methods of the APK class)

### Endpoint
N/A

### Parameters
N/A

### Request Example
```python
from pyaxmlparser import APK

apk = APK('/path/to/app.apk')

# Get SDK version information
min_sdk = apk.get_min_sdk_version()
target_sdk = apk.get_target_sdk_version()
max_sdk = apk.get_max_sdk_version()

print(f"Min SDK: {min_sdk}")        # e.g., "21" (Android 5.0)
print(f"Target SDK: {target_sdk}")  # e.g., "33" (Android 13)
print(f"Max SDK: {max_sdk}")        # e.g., None (usually not set)

# Get effective target SDK (returns int, defaults to 1 if not set)
effective_target = apk.get_effective_target_sdk_version()
print(f"Effective Target SDK: {effective_target}")  # e.g., 33
```

### Response
#### Success Response (200)
N/A (These are method return values)

#### Response Example
```
Min SDK: 21
Target SDK: 33
Max SDK: None
Effective Target SDK: 33
```
```

--------------------------------

### Advanced APK Directory Processing

Source: https://context7.com/appknox/pyaxmlparser/llms.txt

Demonstrates how to manually parse AndroidManifest.xml and resolve resource references by combining AXMLPrinter and ARSCParser for deep analysis of extracted APK directories.

```python
from pyaxmlparser.arscparser import ARSCParser
from pyaxmlparser.axmlprinter import AXMLPrinter

app_root = '/path/to/extracted/apk'

# Parse the binary manifest and the resource table
with open(f"{app_root}/AndroidManifest.xml", 'rb') as f:
    xml = AXMLPrinter(f.read()).get_xml_obj()
with open(f"{app_root}/resources.arsc", 'rb') as f:
    rsc = ARSCParser(f.read())

# android:label is declared on <application>, not on the root <manifest> element
application = xml.find('application')
app_label = None
if application is not None:
    app_label = application.get('{http://schemas.android.com/apk/res/android}label')

# A value starting with '@' is a hex resource ID that still has to be resolved
if app_label and app_label.startswith('@'):
    res_id = int('0x' + app_label[1:], 0)
    package_name = rsc.get_packages_names()[0]
    res_type, res_name, _ = rsc.get_id(package_name, res_id)
    string_value = rsc.get_string(package_name, res_name)
    print(f"App Name: {string_value[1]}")
```

--------------------------------

### DEX File Methods

Source: https://context7.com/appknox/pyaxmlparser/llms.txt

Access the Dalvik Executable (DEX) files containing the application's compiled code. Supports both single-DEX and multi-DEX APKs.

```APIDOC
## DEX File Methods

### Description
Access the Dalvik Executable (DEX) files containing the application's compiled code. Supports both single-DEX and multi-DEX APKs.

### Method
N/A (These are methods of the APK class)

### Endpoint
N/A

### Parameters
N/A

### Request Example
```python
from pyaxmlparser import APK

apk = APK('/path/to/app.apk')

# Check if APK has multiple DEX files
is_multidex = apk.is_multidex()
print(f"Multi-DEX: {is_multidex}")

# Get names of all DEX files
dex_names = list(apk.get_dex_names())
print(dex_names)

# Get raw bytes of main classes.dex
main_dex = apk.get_dex()
print(f"Main DEX size: {len(main_dex)} bytes")

# Iterate over all DEX files
for dex_bytes in apk.get_all_dex():
    print(f"DEX size: {len(dex_bytes)} bytes")
```

### Response
#### Success Response (200)
N/A (These are method return values)

#### Response Example
```
Multi-DEX: True
['classes.dex', 'classes2.dex', 'classes3.dex']
Main DEX size: 5678 bytes
DEX size: 5678 bytes
DEX size: 4321 bytes
DEX size: 1098 bytes
```
```

--------------------------------

### Parse Android Resources with ARSCParser

Source: https://context7.com/appknox/pyaxmlparser/llms.txt

Demonstrates how to load and parse a resources.arsc file to extract strings, colors, dimensions, and other resource types. It covers retrieving package names, locales, and specific resource values.

```python
from pyaxmlparser.arscparser import ARSCParser

with open('resources.arsc', 'rb') as f:
    arsc_data = f.read()

parser = ARSCParser(arsc_data)
packages = parser.get_packages_names()
package_name = packages[0]
locales = parser.get_locales(package_name)
strings_xml = parser.get_string_resources(package_name)
app_name = parser.get_string(package_name, 'app_name')
```

--------------------------------

### Certificate and Signature Methods

Source: https://context7.com/appknox/pyaxmlparser/llms.txt

Extract and analyze the digital certificates used to sign the APK. Supports v1 (JAR), v2, and v3 signature schemes.

```APIDOC
## Certificate and Signature Methods

### Description
Extract and analyze the digital certificates used to sign the APK. Supports v1 (JAR), v2, and v3 signature schemes.

### Method
N/A (These are methods of the APK class)

### Endpoint
N/A

### Parameters
N/A

### Request Example
```python
from pyaxmlparser import APK

apk = APK('/path/to/app.apk')

# Check signature presence
print(f"Is signed: {apk.is_signed()}")
print(f"V1 (JAR) signature: {apk.is_signed_v1()}")
print(f"V2 signature: {apk.is_signed_v2()}")
print(f"V3 signature: {apk.is_signed_v3()}")

# Get all unique certificates (from all signature versions)
certificates = apk.get_certificates()
for cert in certificates:
    print(f"Subject: {cert.subject.human_friendly}")
    print(f"Issuer: {cert.issuer.human_friendly}")
    print(f"Serial: {cert.serial_number}")
    print(f"Not Before: {cert.not_valid_before}")
    print(f"Not After: {cert.not_valid_after}")
    print(f"SHA256: {cert.sha256.hex()}")

# Get certificates from specific signature version
v1_certs = apk.get_certificates_v1()
v2_certs = apk.get_certificates_v2()
v3_certs = apk.get_certificates_v3()

# Get signature file names (v1 signatures)
sig_names = apk.get_signature_names()
print(sig_names)

# Get raw DER-encoded certificate
for sig_name in sig_names:
    cert_der = apk.get_certificate_der(sig_name)
    print(f"Certificate DER size: {len(cert_der)} bytes")
```

### Response
#### Success Response (200)
N/A (These are method return values)

#### Response Example
```
Is signed: True
V1 (JAR) signature: True
V2 signature: True
V3 signature: False
Subject: CN=Android Debug, OU=Android, O=Android, C=US
Issuer: CN=Android Debug, OU=Android, O=Android, C=US
Serial: 1234567890abcdef
Not Before: 2023-01-01 00:00:00
Not After: 2053-12-31 23:59:59
SHA256: a1b2c3d4e5f67890...
['META-INF/CERT.RSA']
Certificate DER size: 1024 bytes
```
```

--------------------------------

### Access APK Contents with get_files() and get_file()

Source: https://context7.com/appknox/pyaxmlparser/llms.txt

This functionality allows you to interact with the files contained within an APK archive.
You can list all files, read the raw bytes of a specific file, determine file types (requires `python-magic`), or obtain CRC32 checksums for each file.

```python
from pyaxmlparser import APK

apk = APK('/path/to/app.apk')

# List all files in the APK
files = apk.get_files()
for filename in files:
    print(filename)
# Output:
# AndroidManifest.xml
# classes.dex
# resources.arsc
# res/drawable/icon.png
# lib/arm64-v8a/libnative.so

# Read a specific file's raw bytes
try:
    manifest_bytes = apk.get_file('AndroidManifest.xml')
    print(f"Manifest size: {len(manifest_bytes)} bytes")
except Exception as e:
    print(f"File not found: {e}")

# Get file types (requires python-magic library)
file_types = apk.get_files_types()
for filename, filetype in file_types.items():
    print(f"{filename}: {filetype}")
# Output:
# classes.dex: Dalvik dex file
# resources.arsc: Android Resource

# Get CRC32 checksums for all files
crc_values = apk.get_files_crc32()
for filename, crc in crc_values.items():
    print(f"{filename}: {crc:08x}")
```

--------------------------------

### Retrieve App Permissions

Source: https://context7.com/appknox/pyaxmlparser/llms.txt

Shows how to extract requested permissions, detailed permission levels, and distinguish between AOSP and third-party permissions defined in the AndroidManifest.xml.

```python
from pyaxmlparser import APK

apk = APK('/path/to/app.apk')

# Get all requested permissions
permissions = apk.get_permissions()
for perm in permissions:
    print(perm)

# Get detailed permission info
details = apk.get_details_permissions()
for perm, info in details.items():
    protection_level, label, description = info
    print(f"{perm}: {protection_level}")

# Get AOSP and third-party permissions
aosp_perms = apk.get_requested_aosp_permissions()
third_party_perms = apk.get_requested_third_party_permissions()
```

--------------------------------

### Component Inspection Methods

Source: https://context7.com/appknox/pyaxmlparser/llms.txt

Methods to retrieve Android components such as activities, services, receivers, and providers defined in the manifest.

```APIDOC
## METHOD get_activities()

### Description
Returns a list of all activities declared in the manifest.

### Response
- **activities** (list) - List of fully qualified class names.

## METHOD get_intent_filters()

### Description
Retrieves intent filters for a specific component type and class name.

### Parameters
- **component_type** (string) - Required - Type of component (e.g., 'activity').
- **class_name** (string) - Required - Fully qualified class name.

### Response
- **filters** (dict) - Dictionary of actions and categories associated with the component.
```

--------------------------------

### Quick APK Identification with get_apkid

Source: https://context7.com/appknox/pyaxmlparser/llms.txt

A lightweight utility to extract essential APK metadata such as package name, version code, and version name without performing a full parse of the APK structure.

```python
from pyaxmlparser.core import get_apkid

appid, version_code, version_name = get_apkid('/path/to/app.apk')
print(f"Package: {appid}")
print(f"Version Code: {version_code}")
print(f"Version Name: {version_name}")
```

--------------------------------

### Permission Retrieval Methods

Source: https://context7.com/appknox/pyaxmlparser/llms.txt

Methods to extract requested, declared, and detailed permission information from the APK manifest.

```APIDOC
## METHOD get_permissions()

### Description
Retrieves a list of all permissions requested by the application.

### Method
N/A (This is a method of the APK class)

### Response
- **permissions** (list) - List of permission strings (e.g., android.permission.INTERNET).

## METHOD get_details_permissions()

### Description
Returns a dictionary containing detailed information about requested permissions including protection levels and descriptions.

### Response
- **details** (dict) - Mapping of permission names to (protection_level, label, description) tuples.
```

--------------------------------

### Parse Binary XML with AXMLPrinter

Source: https://context7.com/appknox/pyaxmlparser/llms.txt

The `AXMLPrinter` class is used to convert Android's binary XML format (AXML) into a human-readable XML string. This is essential for parsing files like `AndroidManifest.xml` or other binary XML resources. It can output pretty-printed XML, raw XML bytes, or an lxml ElementTree object for programmatic access to XML attributes and structure.

```python
from pyaxmlparser.axmlprinter import AXMLPrinter

# Read binary XML file
with open('AndroidManifest.xml', 'rb') as f:
    binary_xml = f.read()

# Parse and convert to XML
axml = AXMLPrinter(binary_xml)

# Check if parsing was successful
if axml.is_valid():
    # Get XML as bytes (UTF-8 encoded, pretty-printed)
    xml_bytes = axml.get_xml(pretty=True)
    print(xml_bytes.decode('utf-8'))

    # Get XML without pretty printing
    raw_xml = axml.get_buff()

    # Get lxml ElementTree object for programmatic access
    xml_obj = axml.get_xml_obj()
    print(f"Root tag: {xml_obj.tag}")

    # Access attributes
    package = xml_obj.get('package')
    print(f"Package: {package}")

    # Check for packer/obfuscation warnings
    if axml.is_packed():
        print("Warning: XML appears to be packed or obfuscated")
else:
    print("Failed to parse binary XML")
```

--------------------------------

### Extract Android Components

Source: https://context7.com/appknox/pyaxmlparser/llms.txt

Retrieves application components including activities, services, receivers, and providers, as well as intent filters for specific components.

```python
from pyaxmlparser import APK

apk = APK('/path/to/app.apk')

# Get main activity
main_activity = apk.get_main_activity()

# Get all activities and services
activities = apk.get_activities()
services = apk.get_services()

# Get intent filters for a specific component
filters = apk.get_intent_filters('activity', 'com.example.myapp.MainActivity')
print(filters)
```

--------------------------------

### AXMLPrinter - Parse Binary XML

Source: https://context7.com/appknox/pyaxmlparser/llms.txt

The `AXMLPrinter` class converts Android's binary XML format (AXML) into standard XML. This is useful for parsing extracted AndroidManifest.xml files or other binary XML resources.

```APIDOC
## AXMLPrinter - Parse Binary XML

### Description
The `AXMLPrinter` class converts Android's binary XML format (AXML) into standard XML. This is useful for parsing extracted AndroidManifest.xml files or other binary XML resources.

### Method
N/A (This is a class for parsing)

### Endpoint
N/A

### Parameters
N/A

### Request Example
```python
from pyaxmlparser.axmlprinter import AXMLPrinter

# Read binary XML file
with open('AndroidManifest.xml', 'rb') as f:
    binary_xml = f.read()

# Parse and convert to XML
axml = AXMLPrinter(binary_xml)

# Check if parsing was successful
if axml.is_valid():
    # Get XML as bytes (UTF-8 encoded, pretty-printed)
    xml_bytes = axml.get_xml(pretty=True)
    print(xml_bytes.decode('utf-8'))

    # Get XML without pretty printing
    raw_xml = axml.get_buff()

    # Get lxml ElementTree object for programmatic access
    xml_obj = axml.get_xml_obj()
    print(f"Root tag: {xml_obj.tag}")

    # Access attributes
    package = xml_obj.get('package')
    print(f"Package: {package}")

    # Check for packer/obfuscation warnings
    if axml.is_packed():
        print("Warning: XML appears to be packed or obfuscated")
else:
    print("Failed to parse binary XML")
```

### Response
#### Success Response (200)
N/A (This is a parsing utility)

#### Response Example
```xml
...
Root tag: manifest
Package: com.example.app
```
```
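
--------------------------------

### Sketch: Decoding a Resource Reference

The Advanced APK Directory Processing snippet turns a label reference into an integer with `int('0x' + app_label[1:], 0)` before looking it up in the resource table. The sketch below uses only the standard library to show that conversion and how a 32-bit Android resource ID splits into package, type, and entry fields. The helper names `parse_resource_reference` and `split_resource_id` and the sample value `@7F0B0001` are hypothetical and not part of pyaxmlparser. References that carry a package prefix after the `@` are not handled here.

```python
from typing import Tuple


def parse_resource_reference(ref: str) -> int:
    """Convert a reference such as '@7F0B0001' to its integer resource ID."""
    if not ref.startswith('@'):
        raise ValueError(f"not a resource reference: {ref!r}")
    # Everything after '@' is assumed to be a bare hexadecimal ID
    return int(ref[1:], 16)


def split_resource_id(res_id: int) -> Tuple[int, int, int]:
    """Split a 32-bit resource ID (0xPPTTEEEE) into (package, type, entry)."""
    package_id = (res_id >> 24) & 0xFF  # PP: 0x7F for the app, 0x01 for the framework
    type_id = (res_id >> 16) & 0xFF     # TT: resource type index (string, drawable, ...)
    entry_id = res_id & 0xFFFF          # EEEE: entry index within that type
    return package_id, type_id, entry_id


res_id = parse_resource_reference('@7F0B0001')  # hypothetical sample value
package_id, type_id, entry_id = split_resource_id(res_id)
print(f"0x{res_id:08X} -> package=0x{package_id:02X}, "
      f"type=0x{type_id:02X}, entry=0x{entry_id:04X}")
# prints: 0x7F0B0001 -> package=0x7F, type=0x0B, entry=0x0001
```

The package byte gives a quick way to tell whether a reference points into the app's own `resources.arsc` (0x7F) or into the Android framework (0x01), which is worth checking before passing the ID to `ARSCParser.get_id()`.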