### Basic Regex Matching with Joni (Java)

Source: https://github.com/jruby/joni/blob/master/README.md

Compiles a regex pattern and searches a byte string with the Joni library. `search()` returns the starting index of the first match, or -1 if no match is found.

```java
byte[] pattern = "a*".getBytes();
byte[] str = "aaa".getBytes();

Regex regex = new Regex(pattern, 0, pattern.length, Option.NONE, UTF8Encoding.INSTANCE);
Matcher matcher = regex.matcher(str);
int result = matcher.search(0, str.length, Option.DEFAULT);
```

--------------------------------

### Joni Regex Matching with Named Captures (Java)

Source: https://github.com/jruby/joni/blob/master/README.md

Illustrates named capture groups in Joni regular expressions. After a successful match, the example iterates over the named back-references to find each group's number, which indexes into the match region.

```java
import java.util.Iterator;
import org.joni.NameEntry;
import org.joni.Region;

byte[] pattern = "(?<name>a*)".getBytes();
byte[] str = "aaa".getBytes();

Regex regex = new Regex(pattern, 0, pattern.length, Option.NONE, UTF8Encoding.INSTANCE);
Matcher matcher = regex.matcher(str);
int result = matcher.search(0, str.length, Option.DEFAULT);
if (result != -1) {
    Region region = matcher.getEagerRegion();
    for (Iterator<NameEntry> entry = regex.namedBackrefIterator(); entry.hasNext();) {
        NameEntry e = entry.next();
        int number = e.getBackRefs()[0]; // can have many refs per name
        // int begin = region.beg[number];
        // int end = region.end[number];
    }
}
```

--------------------------------

### Switch Regex Syntax Modes in Java with Joni

Source: https://context7.com/jruby/joni/llms.txt

This Java example compiles patterns under different regular expression syntax modes (Ruby, Java, POSIX Extended). The `Syntax` object passed to the `Regex` constructor controls how the pattern is interpreted. It requires the Joni classes `Regex`, `Matcher`, `Option`, `Syntax`, and `WarnCallback`.
```java
import org.jcodings.specific.UTF8Encoding;
import org.joni.Regex;
import org.joni.Matcher;
import org.joni.Option;
import org.joni.Syntax;
import org.joni.WarnCallback;

// Ruby syntax (default); the group name "word" is illustrative
byte[] rubyPattern = "(?<word>\\w+)".getBytes();
Regex rubyRegex = new Regex(rubyPattern, 0, rubyPattern.length,
        Option.NONE, UTF8Encoding.INSTANCE, Syntax.RUBY, WarnCallback.DEFAULT);

// Java syntax
byte[] javaPattern = "\\b\\w+\\b".getBytes();
Regex javaRegex = new Regex(javaPattern, 0, javaPattern.length,
        Option.NONE, UTF8Encoding.INSTANCE, Syntax.Java, WarnCallback.DEFAULT);

// POSIX Extended syntax
byte[] posixPattern = "[[:alpha:]]+".getBytes();
Regex posixRegex = new Regex(posixPattern, 0, posixPattern.length,
        Option.NONE, UTF8Encoding.INSTANCE, Syntax.PosixExtended, WarnCallback.DEFAULT);

byte[] testStr = "test123".getBytes();
Matcher matcher = rubyRegex.matcher(testStr);
int result = matcher.search(0, testStr.length, Option.DEFAULT);
System.out.println("Ruby syntax matched: " + (result != -1));

// Output: Ruby syntax matched: true
```

--------------------------------

### Configure Pattern Matching Options in Java

Source: https://context7.com/jruby/joni/llms.txt

This Java code configures Joni matching behavior with `Option` flags, here `Option.IGNORECASE` and `Option.MULTILINE`. It compiles `Regex` objects with specific options, searches with a `Matcher`, and prints the result of each configuration. It requires the Joni classes `Regex`, `Matcher`, and `Option`, plus `UTF8Encoding` from jcodings.
```java
import org.jcodings.specific.UTF8Encoding;
import org.joni.Matcher;
import org.joni.Option;
import org.joni.Regex;

// Case-insensitive matching
byte[] pattern1 = "hello".getBytes();
byte[] str1 = "HELLO World".getBytes();
Regex regex1 = new Regex(pattern1, 0, pattern1.length, Option.IGNORECASE, UTF8Encoding.INSTANCE);
Matcher matcher1 = regex1.matcher(str1);
int result1 = matcher1.search(0, str1.length, Option.DEFAULT);
System.out.println("Case-insensitive match: " + (result1 != -1));

// Option.MULTILINE (Ruby's /m) lets '.' match newlines. Under the default Ruby
// syntax, ^ and $ match at line boundaries with or without this option.
byte[] pattern2 = "^line\\d+$".getBytes();
byte[] str2 = "text\nline123\nmore".getBytes();
Regex regex2 = new Regex(pattern2, 0, pattern2.length, Option.MULTILINE, UTF8Encoding.INSTANCE);
Matcher matcher2 = regex2.matcher(str2);
int result2 = matcher2.search(0, str2.length, Option.DEFAULT);
System.out.println("Multiline match: " + (result2 != -1));

// Output: Case-insensitive match: true
//         Multiline match: true
```

--------------------------------

### Implement Timeout for Regex Matching in Java

Source: https://context7.com/jruby/joni/llms.txt

This Java code guards against regex denial-of-service (catastrophic backtracking) by giving the matching operation a timeout. It creates a `Matcher` with a timeout expressed in nanoseconds and catches a `TimeoutException` if the search runs longer than that. It requires the Joni classes `Regex`, `Matcher`, `Option`, and `TimeoutException`, plus `UTF8Encoding` from jcodings.
```java
import org.jcodings.specific.UTF8Encoding;
import org.joni.Matcher;
import org.joni.Option;
import org.joni.Regex;
import org.joni.exception.TimeoutException;

// Potentially slow pattern (catastrophic backtracking)
byte[] pattern = "(a+)+b".getBytes();
byte[] str = "aaaaaaaaaaaaaaaaaaaaaaaaaX".getBytes(); // No 'b' at end

Regex regex = new Regex(pattern, 0, pattern.length, Option.NONE, UTF8Encoding.INSTANCE);

// Create matcher with 100ms timeout (in nanoseconds)
long timeoutNanos = 100_000_000L;
Matcher matcher = regex.matcher(str, 0, str.length, timeoutNanos);

try {
    int result = matcher.search(0, str.length, Option.DEFAULT);
    System.out.println("Match result: " + result);
} catch (Exception e) {
    if (e instanceof TimeoutException) {
        System.out.println("Matching timed out!");
    }
}

// Output: Matching timed out!
```

--------------------------------

### Joni Regex Matching with Captures (Java)

Source: https://github.com/jruby/joni/blob/master/README.md

Shows how to run a regex search and retrieve capture groups with Joni. If a match is found, `getEagerRegion()` returns a `Region` holding the begin and end offsets of each captured group.

```java
byte[] pattern = "(a*)".getBytes();
byte[] str = "aaa".getBytes();

Regex regex = new Regex(pattern, 0, pattern.length, Option.NONE, UTF8Encoding.INSTANCE);
Matcher matcher = regex.matcher(str);
int result = matcher.search(0, str.length, Option.DEFAULT);
if (result != -1) {
    Region region = matcher.getEagerRegion();
}
```

--------------------------------

### Compile and Search Regex in Byte Arrays (Java)

Source: https://context7.com/jruby/joni/llms.txt

Compiles a regular expression from a byte array and searches for it within another byte array, using Joni's `Regex` and `Matcher` classes with an explicit encoding and options. The output shows the match position and its begin and end boundaries.
```java
import org.jcodings.specific.UTF8Encoding;
import org.joni.Matcher;
import org.joni.Option;
import org.joni.Regex;

// Compile pattern from byte array
byte[] pattern = "a*b+".getBytes();
byte[] str = "aaabbb".getBytes();

Regex regex = new Regex(pattern, 0, pattern.length, Option.NONE, UTF8Encoding.INSTANCE);
Matcher matcher = regex.matcher(str);

// Search for match (returns byte position or -1 if not found)
int result = matcher.search(0, str.length, Option.DEFAULT);
if (result != -1) {
    System.out.println("Match found at position: " + result);
    System.out.println("Match begins at: " + matcher.getBegin());
    System.out.println("Match ends at: " + matcher.getEnd());
}

// Output: Match found at position: 0
//         Match begins at: 0
//         Match ends at: 6
```

--------------------------------

### Joni: Distinguish Match vs Search Operations in Java

Source: https://context7.com/jruby/joni/llms.txt

Demonstrates the difference between `match()`, which requires the pattern to match starting exactly at the given position, and `search()`, which finds the pattern anywhere within a given range. Both operate on byte arrays, here with `UTF8Encoding`.
```java
import org.jcodings.specific.UTF8Encoding;
import org.joni.Matcher;
import org.joni.Option;
import org.joni.Regex;

byte[] pattern = "\\d+".getBytes();
byte[] str = "abc 123 def".getBytes();

Regex regex = new Regex(pattern, 0, pattern.length, Option.NONE, UTF8Encoding.INSTANCE);
Matcher matcher = regex.matcher(str);

// match() - must match at the specified position (position 0 here)
int matchResult = matcher.match(0, str.length, Option.DEFAULT);
System.out.println("match() at position 0: " + matchResult); // -1 (no match)

// search() - finds pattern anywhere in the range
int searchResult = matcher.search(0, str.length, Option.DEFAULT);
System.out.println("search() result: " + searchResult); // 4 (found at "123")

// match() at the location where search() found it; on success match()
// returns the matched length in bytes, not the position
int matchResult2 = matcher.match(4, str.length, Option.DEFAULT);
System.out.println("match() at position 4: " + matchResult2); // 3 (length of "123")

// Output: match() at position 0: -1
//         search() result: 4
//         match() at position 4: 3
```

--------------------------------

### Create Regex from String and Match (Java)

Source: https://context7.com/jruby/joni/llms.txt

Shows how to create a Joni `Regex` directly from a Java `String`, which the constructor encodes for you. The snippet then searches a byte array of text for a pattern (a phone number) and prints the matched portion.
```java
import org.jcodings.specific.UTF8Encoding;
import org.joni.Regex;
import org.joni.Matcher;
import org.joni.Option;

// Simplified constructor for strings
String patternStr = "\\d{3}-\\d{4}";
Regex regex = new Regex(patternStr, UTF8Encoding.INSTANCE);

byte[] text = "Call me at 555-1234".getBytes();
Matcher matcher = regex.matcher(text);
int result = matcher.search(0, text.length, Option.DEFAULT);

if (result != -1) {
    byte[] matched = new byte[matcher.getEnd() - matcher.getBegin()];
    System.arraycopy(text, matcher.getBegin(), matched, 0, matched.length);
    System.out.println("Phone: " + new String(matched));
}

// Output: Phone: 555-1234
```

--------------------------------

### Joni: Matcher Without Region for Memory Optimization in Java

Source: https://context7.com/jruby/joni/llms.txt

Illustrates how `matcherNoRegion()` creates a matcher that skips capture group tracking, reducing memory overhead and improving performance. This is useful when only the overall match is needed.

```java
import org.jcodings.specific.UTF8Encoding;
import org.joni.Matcher;
import org.joni.Option;
import org.joni.Regex;

byte[] pattern = "\\w+@\\w+\\.\\w+".getBytes();
byte[] str = "Contact: user@example.com for info".getBytes();

Regex regex = new Regex(pattern, 0, pattern.length, Option.NONE, UTF8Encoding.INSTANCE);

// Create matcher without region tracking (faster, less memory)
Matcher matcher = regex.matcherNoRegion(str, 0, str.length);
int result = matcher.search(0, str.length, Option.DEFAULT);

if (result != -1) {
    // Can still get overall match bounds
    System.out.println("Found at: " + matcher.getBegin() + " to " + matcher.getEnd());

    // But getEagerRegion() will only have the full match, no capture groups
    byte[] matched = new byte[matcher.getEnd() - matcher.getBegin()];
    System.arraycopy(str, matcher.getBegin(), matched, 0, matched.length);
    System.out.println("Email: " + new String(matched));
}

// Output: Found at: 9 to 25
//         Email: user@example.com
```
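The two snippets above build the matched text by allocating a temporary array and calling `System.arraycopy`. As a minimal sketch, the same result can be had from the standard `String(byte[], int, int, Charset)` constructor. The class `ByteSlices` and its `slice` method are hypothetical helpers, not part of Joni, and the hard-coded offsets stand in for `matcher.getBegin()` and `matcher.getEnd()`.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Joni: decodes a byte range as UTF-8 text.
final class ByteSlices {
    private ByteSlices() {}

    /** Decodes bytes[begin, end) as UTF-8 without an intermediate array copy. */
    static String slice(byte[] bytes, int begin, int end) {
        return new String(bytes, begin, end - begin, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] text = "Call me at 555-1234".getBytes(StandardCharsets.UTF_8);
        // 11 and 19 stand in for matcher.getBegin() and matcher.getEnd()
        System.out.println("Phone: " + slice(text, 11, 19)); // Phone: 555-1234
    }
}
```

Passing the charset explicitly also avoids depending on the platform default encoding, which `new String(matched)` and the no-argument `getBytes()` both use.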
--------------------------------

### Joni Imports for Regex Operations (Java)

Source: https://github.com/jruby/joni/blob/master/README.md

These imports are needed to use the Joni library for regular expression matching in Java. They provide the encoding, regex, option, and matcher classes.

```java
import org.jcodings.specific.UTF8Encoding;
import org.joni.Matcher;
import org.joni.Option;
import org.joni.Regex;
```

--------------------------------

### Joni: Handle Different Text Encodings (ASCII, UTF-8) in Java

Source: https://context7.com/jruby/joni/llms.txt

Shows how to work with different text encodings in Joni, including ASCII and UTF-8 with non-ASCII characters. Each pattern is compiled with the encoding of the byte array it will search, and the reported match position is a byte offset.

```java
import java.nio.charset.StandardCharsets;
import org.jcodings.specific.UTF8Encoding;
import org.jcodings.specific.ASCIIEncoding;
import org.joni.Regex;
import org.joni.Matcher;
import org.joni.Option;

// ASCII encoding
byte[] asciiPattern = "[a-z]+".getBytes(StandardCharsets.US_ASCII);
byte[] asciiStr = "hello".getBytes(StandardCharsets.US_ASCII);
Regex asciiRegex = new Regex(asciiPattern, 0, asciiPattern.length, Option.NONE, ASCIIEncoding.INSTANCE);
Matcher asciiMatcher = asciiRegex.matcher(asciiStr);
System.out.println("ASCII match: " + (asciiMatcher.search(0, asciiStr.length, Option.DEFAULT) != -1));

// UTF-8 with Unicode (StandardCharsets avoids the checked exception of getBytes("UTF-8"))
String unicodeStr = "Hello 世界 World";
byte[] utf8Str = unicodeStr.getBytes(StandardCharsets.UTF_8);
byte[] utf8Pattern = "世界".getBytes(StandardCharsets.UTF_8);
Regex utf8Regex = new Regex(utf8Pattern, 0, utf8Pattern.length, Option.NONE, UTF8Encoding.INSTANCE);
Matcher utf8Matcher = utf8Regex.matcher(utf8Str);
int result = utf8Matcher.search(0, utf8Str.length, Option.DEFAULT);
System.out.println("UTF-8 Unicode match at: " + result);

// Output: ASCII match: true
//         UTF-8 Unicode match at: 6
```

--------------------------------

### Extract Data with Named Capture Groups in Java

Source: https://context7.com/jruby/joni/llms.txt

This Java snippet uses named capture groups to extract specific parts of a string. It defines a Joni pattern with named groups, then iterates over those groups and reads each one's value from the matched bytes. It requires the Joni classes `Regex`, `Matcher`, `NameEntry`, and `Region`, plus `UTF8Encoding` from jcodings.

```java
import org.jcodings.specific.UTF8Encoding;
import org.joni.Matcher;
import org.joni.NameEntry;
import org.joni.Option;
import org.joni.Regex;
import org.joni.Region;
import java.util.Iterator;

byte[] pattern = "(?<year>\\d{4})-(?<month>\\d{2})-(?<day>\\d{2})".getBytes();
byte[] str = "Date: 2025-12-05".getBytes();

Regex regex = new Regex(pattern, 0, pattern.length, Option.NONE, UTF8Encoding.INSTANCE);
Matcher matcher = regex.matcher(str, 6, str.length); // Start at position 6
int result = matcher.search(6, str.length, Option.DEFAULT);

if (result != -1) {
    Region region = matcher.getEagerRegion();

    // Iterate through named groups
    for (Iterator<NameEntry> entry = regex.namedBackrefIterator(); entry.hasNext();) {
        NameEntry e = entry.next();
        String name = new String(e.name, e.nameP, e.nameEnd - e.nameP);
        int[] backRefs = e.getBackRefs();

        if (backRefs.length > 0) {
            int groupNum = backRefs[0];
            int begin = region.getBeg(groupNum);
            int end = region.getEnd(groupNum);

            byte[] value = new byte[end - begin];
            System.arraycopy(str, begin, value, 0, value.length);
            System.out.println(name + " = " + new String(value));
        }
    }
}

// Output: year = 2025
//         month = 12
//         day = 05
```

--------------------------------

### Extract Capture Groups from Byte Array Matches (Java)

Source: https://context7.com/jruby/joni/llms.txt

Illustrates how to use Joni's `Region` API to extract capture groups from a regex match in a byte array. After finding a match, it retrieves the `Region` object and loops over its groups, converting each matched byte segment back into a string.
```java
import org.jcodings.specific.UTF8Encoding;
import org.joni.Matcher;
import org.joni.Option;
import org.joni.Regex;
import org.joni.Region;

// Every regex backslash must be doubled inside a Java string literal
byte[] pattern = "(\\w+)@(\\w+\\.\\w+)".getBytes();
byte[] str = "user@example.com".getBytes();

Regex regex = new Regex(pattern, 0, pattern.length, Option.NONE, UTF8Encoding.INSTANCE);
Matcher matcher = regex.matcher(str);
int result = matcher.search(0, str.length, Option.DEFAULT);

if (result != -1) {
    Region region = matcher.getEagerRegion();

    // Group 0 is the full match; groups 1, 2, ... are the capture groups
    for (int i = 0; i < region.getNumRegs(); i++) {
        int begin = region.getBeg(i);
        int end = region.getEnd(i);

        if (begin != -1) { // -1 means the group did not participate in the match
            byte[] captured = new byte[end - begin];
            System.arraycopy(str, begin, captured, 0, captured.length);
            System.out.println("Group " + i + ": " + new String(captured));
        }
    }
}

// Output: Group 0: user@example.com
//         Group 1: user
//         Group 2: example.com
```
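The positions reported throughout these snippets (`search()` results, `getBegin()`, `getEnd()`, and the `Region` offsets) are byte offsets into the input array. For UTF-8 text with multi-byte characters they do not equal Java `String` indices. The sketch below shows one way to convert, assuming the offset falls on a character boundary, which should hold for a match in valid UTF-8. `ByteOffsets` and `toCharIndex` are hypothetical names, not part of Joni, and the hard-coded offset stands in for a real search result.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Joni: maps UTF-8 byte offsets to String indices.
final class ByteOffsets {
    private ByteOffsets() {}

    /**
     * Converts a UTF-8 byte offset into a Java String (UTF-16) index by
     * decoding the prefix. Assumes byteOffset lies on a character boundary.
     */
    static int toCharIndex(byte[] utf8, int byteOffset) {
        return new String(utf8, 0, byteOffset, StandardCharsets.UTF_8).length();
    }

    public static void main(String[] args) {
        String text = "世界 Hello";
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);

        // "世界 " occupies 3 + 3 + 1 = 7 bytes but only 3 chars, so a search
        // for "Hello" would report byte offset 7 (hard-coded here).
        int byteOffset = 7;
        int charIndex = toCharIndex(bytes, byteOffset);

        System.out.println(charIndex);                 // 3
        System.out.println(text.substring(charIndex)); // Hello
    }
}
```

Decoding the prefix costs time proportional to the offset, so for many conversions over the same input it may be cheaper to slice and decode only the matched bytes.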