### Install StringSimilarity.NET Package

Source: https://context7.com/feature23/stringsimilarity.net/llms.txt

Installs the F23.StringSimilarity NuGet package using either the .NET CLI or the Package Manager Console. This is the first step to using the library in your .NET project.

```bash
# .NET CLI
dotnet add package F23.StringSimilarity

# Package Manager Console (Visual Studio)
Install-Package F23.StringSimilarity
```

--------------------------------

### Compute Ratcliff-Obershelp Similarity and Distance in C#

Source: https://context7.com/feature23/stringsimilarity.net/llms.txt

Illustrates the usage of the RatcliffObershelp class for calculating string similarity based on matching characters. Examples include handling transposed characters, comparing names, and dealing with identical or completely different strings.

```csharp
using F23.StringSimilarity;

var ratcliff = new RatcliffObershelp();

// Similarity for transposed characters
double sim1 = ratcliff.Similarity("My string", "My tsring");
Console.WriteLine($"Similarity: {sim1:F4}"); // Output: ~0.8889

// More different strings
double sim2 = ratcliff.Similarity("My string", "My ntrisg");
Console.WriteLine($"Similarity: {sim2:F4}"); // Output: ~0.7778

// Distance is 1 - similarity
double dist1 = ratcliff.Distance("My string", "My tsring");
Console.WriteLine($"Distance: {dist1:F4}"); // Output: ~0.1111

// Comparing names
double sim3 = ratcliff.Similarity("Pennsylvania", "Pencilvaneya");
Console.WriteLine($"Name similarity: {sim3:F4}");

// Identical strings
double sim4 = ratcliff.Similarity("identical", "identical");
Console.WriteLine($"Identical: {sim4}"); // Output: 1.0

// Completely different strings
double sim5 = ratcliff.Similarity("abc", "xyz");
Console.WriteLine($"Different: {sim5}"); // Output: 0.0
```

--------------------------------

### Calculate SIFT4 Distance in C# (Experimental)

Source: https://context7.com/feature23/stringsimilarity.net/llms.txt

Shows how to use the experimental SIFT4 algorithm for calculating string distance.
It covers default usage, custom MaxOffset for controlling the search range, and examples with short strings, transpositions, identical strings, and empty strings.

```csharp
using F23.StringSimilarity.Experimental;

var sift4 = new Sift4();

// Default MaxOffset is 10
double dist1 = sift4.Distance("hello", "hallo");
Console.WriteLine($"Distance: {dist1}");

// Custom MaxOffset for controlling search range
sift4.MaxOffset = 5;
string s1 = "This is the first string";
string s2 = "And this is another string";
double dist2 = sift4.Distance(s1, s2);
Console.WriteLine($"Sentence distance: {dist2}"); // Output: 11

// Short strings
double dist3 = sift4.Distance("abc", "adc");
Console.WriteLine($"Short string distance: {dist3}");

// Transposition detection
double dist4 = sift4.Distance("ab", "ba");
Console.WriteLine($"Transposition: {dist4}");

// Identical strings
double dist5 = sift4.Distance("same", "same");
Console.WriteLine($"Identical: {dist5}"); // Output: 0

// Empty string handling
double dist6 = sift4.Distance("", "test");
Console.WriteLine($"Empty vs non-empty: {dist6}"); // Output: 4
```

--------------------------------

### Calculate N-Gram Distance in C#

Source: https://github.com/feature23/stringsimilarity.net/blob/main/README.md

Computes the normalized N-Gram distance between two strings using the F23.StringSimilarity library. It utilizes affixing with newline characters to weight the start of strings and normalizes the score by the length of the longest word.
```C#
using System;
using F23.StringSimilarity;

public class Program
{
    public static void Main(string[] args)
    {
        // produces 0.583333
        var twogram = new NGram(2);
        Console.WriteLine(twogram.Distance("ABCD", "ABTUIO"));

        // produces roughly 0.0278: the strings differ only in their last
        // three characters, so the distance is close to 0
        // (0.97222 is the corresponding similarity, 1 - distance)
        string s1 = "Adobe CreativeSuite 5 Master Collection from cheap 4zp";
        string s2 = "Adobe CreativeSuite 5 Master Collection from cheap d1x";
        var ngram = new NGram(4);
        Console.WriteLine(ngram.Distance(s1, s2));
    }
}
```

--------------------------------

### Generate N-gram Profiles for String Similarity in C#

Source: https://context7.com/feature23/stringsimilarity.net/llms.txt

Demonstrates how to generate n-gram profiles from text using shingle-based algorithms like Cosine. It shows how to retrieve the profile and then use cached profiles for efficient comparison between documents.

```csharp
using F23.StringSimilarity;

// Use any shingle-based algorithm to generate profiles
var cosine = new Cosine(3); // 3-grams

string text = "hello world";
var profile = cosine.GetProfile(text);

Console.WriteLine("N-gram profile:");
foreach (var kvp in profile)
{
    Console.WriteLine($"  '{kvp.Key}': {kvp.Value}");
}
// Output:
// 'hel': 1
// 'ell': 1
// 'llo': 1
// 'lo ': 1
// 'o w': 1
// ' wo': 1
// 'wor': 1
// 'orl': 1
// 'rld': 1

// Profiles can be cached and reused for multiple comparisons
var profile1 = cosine.GetProfile("document one");
var profile2 = cosine.GetProfile("document two");
var profile3 = cosine.GetProfile("document three");

// Compare document one against all others efficiently
double sim12 = cosine.Similarity(profile1, profile2);
double sim13 = cosine.Similarity(profile1, profile3);
Console.WriteLine($"Doc1 vs Doc2: {sim12:F4}");
Console.WriteLine($"Doc1 vs Doc3: {sim13:F4}");
```

--------------------------------

### Compute Cosine Similarity via String Profiles in C#

Source: https://github.com/feature23/stringsimilarity.net/blob/main/README.md

Demonstrates how to pre-compute string profiles to calculate Cosine similarity.
This approach suits large datasets; reuse the same algorithm instance for every profile so that the profiles are built consistently.

```C#
using System;
using F23.StringSimilarity;

public class Program
{
    public static void Main(string[] args)
    {
        string s1 = "My first string";
        string s2 = "My other string...";

        // Let's work with sequences of 2 characters...
        var cosine = new Cosine(2);

        // For cosine similarity I need the profile of strings
        var profile1 = cosine.GetProfile(s1);
        var profile2 = cosine.GetProfile(s2);

        // Prints 0.516185
        Console.WriteLine(cosine.Similarity(profile1, profile2));
    }
}
```

--------------------------------

### Compute Q-Gram Distance in C#

Source: https://context7.com/feature23/stringsimilarity.net/llms.txt

Calculates the Q-Gram distance by computing the L1 norm of the difference between the n-gram occurrence counts of two strings. This distance measure provides a lower bound on the Levenshtein distance and runs in O(m+n) time. The default shingle size (k) is 3 (trigrams); a different size can be passed to the constructor. The library also supports pre-computing string profiles for faster distance calculations on large datasets.
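The L1-norm definition above can be sketched without the library. This is a minimal illustration, not the library's implementation: `QGramSketch`, `Profile`, and `Distance` are hypothetical names, and the sketch skips any input normalization the real `QGram` class may apply.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not part of F23.StringSimilarity.
public static class QGramSketch
{
    // Counts every k-character substring (shingle) of s.
    public static Dictionary<string, int> Profile(string s, int k)
    {
        var counts = new Dictionary<string, int>();
        for (int i = 0; i + k <= s.Length; i++)
        {
            string gram = s.Substring(i, k);
            counts.TryGetValue(gram, out int current); // 0 when the gram is new
            counts[gram] = current + 1;
        }
        return counts;
    }

    // L1 norm of the difference between the two count vectors.
    public static int Distance(string s1, string s2, int k)
    {
        var p1 = Profile(s1, k);
        var p2 = Profile(s2, k);

        var grams = new HashSet<string>(p1.Keys);
        grams.UnionWith(p2.Keys);

        int total = 0;
        foreach (string gram in grams)
        {
            p1.TryGetValue(gram, out int c1);
            p2.TryGetValue(gram, out int c2);
            total += Math.Abs(c1 - c2);
        }
        return total;
    }
}

public class Program
{
    public static void Main(string[] args)
    {
        // Bigrams AB, BC, CD versus AB, BC, CE: only CD and CE differ.
        Console.WriteLine(QGramSketch.Distance("ABCD", "ABCE", 2)); // prints 2
    }
}
```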
```csharp
using F23.StringSimilarity;

// Default k=3 (trigrams)
var qgram = new QGram();
double dist1 = qgram.Distance("hello", "hallo");
Console.WriteLine($"Distance: {dist1}");

// Custom shingle size - bigrams
var qgram2 = new QGram(2);
// "ABCD" bigrams: AB, BC, CD
// "ABCE" bigrams: AB, BC, CE
// Difference: CD appears 1 time in first, 0 in second
//             CE appears 0 times in first, 1 in second
// L1 norm = |1-0| + |0-1| = 2
double dist2 = qgram2.Distance("ABCD", "ABCE");
Console.WriteLine($"Bigram distance: {dist2}"); // Output: 2

// For large datasets, pre-compute profiles
var qgramProfiler = new QGram(2);
var profile1 = qgramProfiler.GetProfile("first string");
var profile2 = qgramProfiler.GetProfile("second string");
double dist3 = qgramProfiler.Distance(profile1, profile2);
Console.WriteLine($"Profile distance: {dist3}");

// Identical strings
double dist4 = qgram.Distance("same", "same");
Console.WriteLine($"Identical: {dist4}"); // Output: 0
```

--------------------------------

### Calculate Q-Gram Distance in C#

Source: https://github.com/feature23/stringsimilarity.net/blob/main/README.md

Computes the Q-Gram distance between two strings by calculating the L1 norm of the difference between their n-gram profiles. This provides an efficient O(m+n) lower bound on Levenshtein distance.

```C#
using System;
using F23.StringSimilarity;

public class Program
{
    public static void Main(string[] args)
    {
        var dig = new QGram(2);

        // AB BC CD CE
        // 1  1  1  0
        // 1  1  0  1
        // Total: 2
        Console.WriteLine(dig.Distance("ABCD", "ABCE"));
    }
}
```

--------------------------------

### Calculate Sorensen-Dice Similarity and Distance in C#

Source: https://context7.com/feature23/stringsimilarity.net/llms.txt

Demonstrates how to use the SorensenDice class to calculate the similarity and distance between strings. It shows default usage with trigrams, custom shingle sizes, and comparisons of sentences and identical/different strings.
```csharp
using F23.StringSimilarity;

// Default k=3 (trigrams)
var sorensen = new SorensenDice();
double sim1 = sorensen.Similarity("night", "nacht");
Console.WriteLine($"Similarity: {sim1:F4}");

// Distance is 1 - similarity
double dist1 = sorensen.Distance("night", "nacht");
Console.WriteLine($"Distance: {dist1:F4}");

// Custom shingle size
var sorensen2 = new SorensenDice(2);
double sim2 = sorensen2.Similarity("hello", "hallo");
Console.WriteLine($"2-gram similarity: {sim2:F4}");

// Comparing sentences
double sim3 = sorensen.Similarity("the quick brown fox", "the fast brown fox");
Console.WriteLine($"Sentence similarity: {sim3:F4}");

// Identical strings
double sim4 = sorensen.Similarity("identical", "identical");
Console.WriteLine($"Identical: {sim4}"); // Output: 1.0

// Completely different strings
double dist2 = sorensen.Distance("abc", "xyz");
Console.WriteLine($"Different: {dist2}"); // Output: 1.0
```

--------------------------------

### Commit Action Staging View Test in Java

Source: https://github.com/feature23/stringsimilarity.net/blob/main/test/F23.StringSimilarity.Tests/71816-2.txt

This Java code implements a JUnit test for the EGit 'Commit' action, focusing on its behavior with the Staging View. It sets up the test environment by configuring EGit preferences, creating a local repository, and then tests the scenario where the Staging View is opened without selection synchronization. The test asserts that the correct repository is associated with the opened Staging View.
```java
package org.eclipse.egit.ui.test.team.actions;

import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertNotNull;
import static org.junit.Assert.assertTrue;

import java.io.File;

import org.eclipse.core.resources.IProject;
import org.eclipse.core.resources.ResourcesPlugin;
import org.eclipse.egit.ui.Activator;
import org.eclipse.egit.ui.UIPreferences;
import org.eclipse.egit.ui.common.LocalRepositoryTestCase;
import org.eclipse.egit.ui.internal.staging.StagingView;
import org.eclipse.egit.ui.test.ContextMenuHelper;
import org.eclipse.egit.ui.test.TestUtil;
import org.eclipse.jgit.lib.Repository;
import org.eclipse.swtbot.swt.finder.junit.SWTBotJunit4ClassRunner;
import org.eclipse.swtbot.swt.finder.widgets.SWTBotTree;
import org.eclipse.ui.PartInitException;
import org.eclipse.ui.PlatformUI;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;
import org.junit.runner.RunWith;

/**
 * Tests for the Team->Commit action
 */
@RunWith(SWTBotJunit4ClassRunner.class)
public class CommitActionStagingViewTest extends LocalRepositoryTestCase {
	private File repositoryFile;

	private boolean initialLinkWithSelection;

	private boolean initialUseStagingView;

	@Before
	public void setup() throws Exception {
		initialUseStagingView = Activator.getDefault().getPreferenceStore()
				.getBoolean(UIPreferences.ALWAYS_USE_STAGING_VIEW);
		initialLinkWithSelection = Activator.getDefault().getPreferenceStore()
				.getBoolean(UIPreferences.STAGING_VIEW_SYNC_SELECTION);
		Activator.getDefault().getPreferenceStore()
				.setValue(UIPreferences.ALWAYS_USE_STAGING_VIEW, true);
		Activator.getDefault().getPreferenceStore()
				.setDefault(UIPreferences.STAGING_VIEW_SYNC_SELECTION, false);
		Activator.getDefault().getPreferenceStore()
				.setValue(UIPreferences.STAGING_VIEW_SYNC_SELECTION, false);
		repositoryFile = createProjectAndCommitToRepository();
		Repository repo = lookupRepository(repositoryFile);
		TestUtil.configureTestCommitterAsUser(repo);
		// TODO delete the second project for the time being (.gitignore is
		// currently not hiding the .project file from commit)
		IProject project = ResourcesPlugin.getWorkspace().getRoot().getProject(PROJ2);
		File dotProject = new File(project.getLocation().toOSString(), ".project");
		project.delete(false, false, null);
		assertTrue(dotProject.delete());
		TestUtil.hideView(StagingView.VIEW_ID);
	}

	@After
	public void tearDown() {
		Activator.getDefault().getPreferenceStore().setValue(
				UIPreferences.ALWAYS_USE_STAGING_VIEW, initialUseStagingView);
		Activator.getDefault().getPreferenceStore()
				.setDefault(UIPreferences.STAGING_VIEW_SYNC_SELECTION, true);
		Activator.getDefault().getPreferenceStore().setValue(
				UIPreferences.STAGING_VIEW_SYNC_SELECTION,
				initialLinkWithSelection);
	}

	@Test
	public void testOpenStagingViewNoLinkWithSelection() throws Exception {
		setTestFileContent("I have changed this");
		SWTBotTree projectExplorerTree = TestUtil.getExplorerTree();
		util.getProjectItems(projectExplorerTree, PROJ1)[0].select();
		String menuString = util.getPluginLocalizedValue("CommitAction_label");
		ContextMenuHelper.clickContextMenu(projectExplorerTree, "Team",
				menuString);
		TestUtil.waitUntilViewWithGivenIdShows(StagingView.VIEW_ID);
		final Repository[] repo = { null };
		PlatformUI.getWorkbench().getDisplay().syncExec(new Runnable() {

			@Override
			public void run() {
				StagingView view;
				try {
					view = (StagingView) PlatformUI.getWorkbench()
							.getActiveWorkbenchWindow().getActivePage()
							.showView(StagingView.VIEW_ID);
					repo[0] = view.getCurrentRepository();
				} catch (PartInitException e) {
					// Ignore, repo[0] remains null
				}
			}
		});
		Repository repository = lookupRepository(repositoryFile);
		assertNotNull("No repository found", repository);
		assertEquals("Repository mismatch", repository, repo[0]);
	}
}
```

--------------------------------

### Calculate Weighted Levenshtein Distance

Source: https://context7.com/feature23/stringsimilarity.net/llms.txt

Implements Levenshtein distance with customizable substitution costs, defined by an
`ICharacterSubstitution` implementation. This is useful for scenarios like OCR or keyboard typo correction where certain substitutions are more likely than others. Supports an optional limit for early termination.

```csharp
using F23.StringSimilarity;

var weighted = new WeightedLevenshtein(new KeyboardSubstitution());

// 't' and 'r' are adjacent, so lower cost
double distance1 = weighted.Distance("string", "srring");
Console.WriteLine($"Adjacent key typo: {distance1}"); // Output: 0.5

// Non-adjacent substitution has full cost
double distance2 = weighted.Distance("string", "sxring");
Console.WriteLine($"Non-adjacent substitution: {distance2}"); // Output: 1.0

// With early termination limit
double distance3 = weighted.Distance("hello", "world", limit: 3.0);
Console.WriteLine($"With limit: {distance3}"); // Output: 3.0 (limit reached)

// Single t->r substitution at the start of the word
double distance4 = weighted.Distance("test", "rest");
Console.WriteLine($"t->r substitution: {distance4}"); // Output: 0.5

// Define custom substitution costs.
// In a single file, type declarations must come after the top-level statements.
public class KeyboardSubstitution : ICharacterSubstitution
{
    public double Cost(char c1, char c2)
    {
        // Adjacent keys on keyboard have lower substitution cost
        if ((c1 == 't' && c2 == 'r') || (c1 == 'r' && c2 == 't')) return 0.5;
        if ((c1 == 'e' && c2 == 'r') || (c1 == 'r' && c2 == 'e')) return 0.5;
        if ((c1 == 'o' && c2 == 'p') || (c1 == 'p' && c2 == 'o')) return 0.5;
        return 1.0; // Default cost
    }
}
```

--------------------------------

### Compute Cosine Similarity in C#

Source: https://context7.com/feature23/stringsimilarity.net/llms.txt

Calculates the cosine of the angle between two string vectors in n-gram space, returning a similarity score between 0 and 1. It uses an efficient O(m+n) algorithm based on profile comparison. The default k-shingle size is 3 (trigrams), which can be adjusted. For large datasets, pre-computing string profiles can significantly improve performance for multiple comparisons.
```csharp
using F23.StringSimilarity;

// Default k=3 (trigrams)
var cosine = new Cosine();
double sim1 = cosine.Similarity("hello world", "hello there");
Console.WriteLine($"Similarity: {sim1:F4}");

// Distance is 1 - similarity
double dist1 = cosine.Distance("hello world", "hello there");
Console.WriteLine($"Distance: {dist1:F4}");

// Custom k-shingle size
var cosine2 = new Cosine(2);
double sim2 = cosine2.Similarity("night", "nacht");
Console.WriteLine($"2-gram similarity: {sim2:F4}");

// For large datasets, pre-compute profiles for efficiency
var cosineProfiler = new Cosine(2);
string text1 = "My first string";
string text2 = "My other string...";
var profile1 = cosineProfiler.GetProfile(text1);
var profile2 = cosineProfiler.GetProfile(text2);

// Compute similarity between profiles (faster for multiple comparisons)
double sim3 = cosineProfiler.Similarity(profile1, profile2);
Console.WriteLine($"Profile similarity: {sim3:F4}"); // Output: ~0.5162

// Identical strings
double sim4 = cosine.Similarity("same text", "same text");
Console.WriteLine($"Identical: {sim4}"); // Output: 1.0
```

--------------------------------

### Calculate String Similarity using Ratcliff-Obershelp in C#

Source: https://github.com/feature23/stringsimilarity.net/blob/main/README.md

The Ratcliff-Obershelp algorithm, or Gestalt Pattern Matching, computes a similarity score between two strings ranging from 0.0 to 1.0. This implementation uses the F23.StringSimilarity library to compare strings and identify substitution patterns.
```csharp
using System;
using F23.StringSimilarity;

public class Program
{
    public static void Main(string[] args)
    {
        var ro = new RatcliffObershelp();

        // substitution of s and t
        Console.WriteLine(ro.Similarity("My string", "My tsring"));

        // substitution of s and n
        Console.WriteLine(ro.Similarity("My string", "My ntrisg"));
    }
}
```

--------------------------------

### Calculate Normalized Levenshtein Distance and Similarity

Source: https://context7.com/feature23/stringsimilarity.net/llms.txt

Provides normalized Levenshtein distance (between 0 and 1) and similarity (1 - distance). The distance is calculated by dividing the Levenshtein distance by the length of the longer string. Supports `ReadOnlySpan<char>` inputs.

```csharp
using F23.StringSimilarity;

var normalizedLev = new NormalizedLevenshtein();

// Distance returns value between 0 and 1
double distance = normalizedLev.Distance("hello", "hallo");
Console.WriteLine($"Normalized distance: {distance}"); // Output: 0.2

// Similarity returns 1 - distance
double similarity = normalizedLev.Similarity("hello", "hallo");
Console.WriteLine($"Similarity: {similarity}"); // Output: 0.8

// Comparing very different strings
double distance2 = normalizedLev.Distance("abc", "xyz");
Console.WriteLine($"Different strings distance: {distance2}"); // Output: 1.0

// Identical strings
double similarity2 = normalizedLev.Similarity("same", "same");
Console.WriteLine($"Identical similarity: {similarity2}"); // Output: 1.0

// Using spans
double spanDistance = normalizedLev.Distance("test".AsSpan(), "text".AsSpan());
Console.WriteLine($"Span distance: {spanDistance}"); // Output: 0.25
```

--------------------------------

### Calculate String Distance using SIFT4 in C#

Source: https://github.com/feature23/stringsimilarity.net/blob/main/README.md

SIFT4 is a general-purpose string distance algorithm designed to mimic human perception of string differences.
It accounts for character substitutions and common subsequences, requiring a MaxOffset parameter to define the search window.

```csharp
using System;
using System.Diagnostics;
using F23.StringSimilarity.Experimental; // Sift4 lives in the Experimental namespace

public class Program
{
    public static void Main(string[] args)
    {
        var s1 = "This is the first string";
        var s2 = "And this is another string";
        var sift4 = new Sift4();
        sift4.MaxOffset = 5;
        double expResult = 11.0;
        double result = sift4.Distance(s1, s2);
        Debug.Assert(Math.Abs(result - expResult) < 0.1);
    }
}
```

--------------------------------

### Calculate Weighted Levenshtein Distance in C#

Source: https://github.com/feature23/stringsimilarity.net/blob/main/README.md

Implements the Levenshtein distance with customizable substitution costs for characters. This is particularly useful for OCR applications or keyboard auto-correction where certain character substitutions are more likely than others. Requires an implementation of the ICharacterSubstitution interface.

```csharp
using System;
using F23.StringSimilarity;

public class Program
{
    public static void Main(string[] args)
    {
        var l = new WeightedLevenshtein(new ExampleCharSub());

        Console.WriteLine(l.Distance("String1", "String1"));
        Console.WriteLine(l.Distance("String1", "Srring1"));
        Console.WriteLine(l.Distance("String1", "Srring2"));
    }

    // Nested inside Program so that the private modifier is legal;
    // a top-level class cannot be declared private.
    private class ExampleCharSub : ICharacterSubstitution
    {
        public double Cost(char c1, char c2)
        {
            // The cost for substituting 't' and 'r' is considered smaller as these 2
            // are located next to each other on a keyboard
            if (c1 == 't' && c2 == 'r') return 0.5;

            // For most cases, the cost of substituting 2 characters is 1.0
            return 1.0;
        }
    }
}
```

--------------------------------

### Calculate Metric LCS Distance in C#

Source: https://context7.com/feature23/stringsimilarity.net/llms.txt

Provides a normalized metric distance based on the Longest Common Subsequence.
It returns a value between 0 (identical) and 1 (completely different) based on the ratio of the LCS length to the maximum string length.

```csharp
using F23.StringSimilarity;

var metricLcs = new MetricLCS();

// Example 1:
// LCS of "ABCDEFG" and "ABCDEFHJKL" is "ABCDEF" (length 6)
// max length = 10
// Distance = 1 - 6/10 = 0.4
double distance1 = metricLcs.Distance("ABCDEFG", "ABCDEFHJKL");
Console.WriteLine($"Distance: {distance1}"); // Output: 0.4

// Example 2:
// LCS of "ABDEF" and "ABDIF" is "ABDF" (length 4)
// max length = 5
// Distance = 1 - 4/5 = 0.2
double distance2 = metricLcs.Distance("ABDEF", "ABDIF");
Console.WriteLine($"Distance: {distance2}"); // Output: 0.2

// Identical strings
double distance3 = metricLcs.Distance("identical", "identical");
Console.WriteLine($"Identical: {distance3}"); // Output: 0

// Completely different strings
double distance4 = metricLcs.Distance("abc", "xyz");
Console.WriteLine($"Different: {distance4}"); // Output: 1.0

// Using spans
double spanDist = metricLcs.Distance("hello".AsSpan(), "hallo".AsSpan());
Console.WriteLine($"Span distance: {spanDist}"); // Output: 0.2
```

--------------------------------

### Calculate N-Gram Distance in C#

Source: https://context7.com/feature23/stringsimilarity.net/llms.txt

Computes the normalized N-Gram distance between two strings using affixing with special characters. It returns a value between 0 and 1, where 0 indicates identical strings and 1 indicates completely different strings. The default n-gram size is 2 (bigrams), but it can be customized for longer strings or specific use cases like near-duplicate detection.
```csharp
using F23.StringSimilarity;

// Default n=2 (bigrams)
var bigram = new NGram();
double dist1 = bigram.Distance("ABCD", "ABCE");
Console.WriteLine($"Bigram distance: {dist1:F4}");

// Explicit n-gram size (2 is also the default)
var twogram = new NGram(2);
double dist2 = twogram.Distance("ABCD", "ABTUIO");
Console.WriteLine($"2-gram distance: {dist2:F4}"); // Output: ~0.5833

// 4-gram for longer strings (good for near-duplicate detection)
var fourgram = new NGram(4);
string s1 = "Adobe CreativeSuite 5 Master Collection from cheap 4zp";
string s2 = "Adobe CreativeSuite 5 Master Collection from cheap d1x";
double dist3 = fourgram.Distance(s1, s2);
Console.WriteLine($"4-gram distance: {dist3:F4}"); // Output: ~0.0278

// Identical strings
double dist4 = bigram.Distance("test", "test");
Console.WriteLine($"Identical: {dist4}"); // Output: 0

// Completely different strings
double dist5 = bigram.Distance("abc", "xyz");
Console.WriteLine($"Different: {dist5}"); // Output: 1.0
```

--------------------------------

### Calculate Levenshtein Distance in C#

Source: https://github.com/feature23/stringsimilarity.net/blob/main/README.md

Calculates the Levenshtein distance between two strings using dynamic programming. This is a metric distance, suitable for algorithms relying on the triangle inequality. The implementation has O(m) space and O(m·n) time complexity.

```csharp
using System;
using F23.StringSimilarity;

public class Program
{
    public static void Main(string[] args)
    {
        var l = new Levenshtein();

        Console.WriteLine(l.Distance("My string", "My $tring"));
        Console.WriteLine(l.Distance("My string", "My $tring"));
        Console.WriteLine(l.Distance("My string", "My $tring"));
    }
}
```

--------------------------------

### Calculate Jaro-Winkler Similarity in C#

Source: https://context7.com/feature23/stringsimilarity.net/llms.txt

Calculates the Jaro-Winkler similarity score, ideal for short strings like names.
It provides a normalized score between 0 and 1, with higher values for strings sharing common prefixes.

```csharp
using F23.StringSimilarity;

// Default threshold (0.7)
var jaroWinkler = new JaroWinkler();

// High similarity for similar names
double sim1 = jaroWinkler.Similarity("MARTHA", "MARHTA");
Console.WriteLine($"Name similarity: {sim1:F4}"); // Output: ~0.9611

// Common prefix boosts score
double sim2 = jaroWinkler.Similarity("DWAYNE", "DUANE");
Console.WriteLine($"Prefix boost: {sim2:F4}"); // Output: ~0.8400

// Distance is 1 - similarity
double dist1 = jaroWinkler.Distance("My string", "My tsring");
Console.WriteLine($"Distance: {dist1:F4}"); // Output: ~0.0259

// Identical strings
double sim3 = jaroWinkler.Similarity("identical", "identical");
Console.WriteLine($"Identical: {sim3}"); // Output: 1.0

// Custom threshold (set negative for pure Jaro distance without Winkler boost)
var jaro = new JaroWinkler(-1);
double sim4 = jaro.Similarity("MARTHA", "MARHTA");
Console.WriteLine($"Pure Jaro: {sim4:F4}"); // Output: ~0.9444

// Using spans for performance
double spanSim = jaroWinkler.Similarity("john".AsSpan(), "jon".AsSpan());
Console.WriteLine($"Span similarity: {spanSim:F4}");
```

--------------------------------

### Calculate Levenshtein Distance

Source: https://context7.com/feature23/stringsimilarity.net/llms.txt

Computes the Levenshtein edit distance between two strings. It calculates the minimum number of single-character edits (insertions, deletions, or substitutions) required to change one string into the other. Supports an optional limit for early termination and can use `ReadOnlySpan<char>` for performance.
```csharp
using System;
using F23.StringSimilarity;

var levenshtein = new Levenshtein();

// Basic distance calculation
double distance1 = levenshtein.Distance("kitten", "sitting");
Console.WriteLine($"Distance: {distance1}"); // Output: 3

// Identical strings have distance 0
double distance2 = levenshtein.Distance("hello", "hello");
Console.WriteLine($"Distance: {distance2}"); // Output: 0

// With early termination limit for performance optimization
double distance3 = levenshtein.Distance("My string", "My $tring", limit: 2);
Console.WriteLine($"Distance (with limit): {distance3}"); // Output: 1

// Using ReadOnlySpan<char> for high-performance scenarios
ReadOnlySpan<char> span1 = "algorithm".AsSpan();
ReadOnlySpan<char> span2 = "altruistic".AsSpan();
double distance4 = levenshtein.Distance(span1, span2);
Console.WriteLine($"Span distance: {distance4}"); // Output: 6
```

--------------------------------

### Metric Longest Common Subsequence Distance in C#

Source: https://github.com/feature23/stringsimilarity.net/blob/main/README.md

Implements a string distance metric based on the Longest Common Subsequence (LCS). The distance is computed as 1 minus the ratio of the LCS length to the length of the longer of the two input strings. This metric provides a normalized distance value between 0.0 and 1.0. It uses the F23.StringSimilarity library.
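The formula in the description above (1 minus LCS length over the longer string's length) can be sketched from scratch as follows. This is only an illustration under stated assumptions: `MetricLcsSketch`, `LcsLength`, and `Distance` are hypothetical names, the LCS length uses the textbook O(m·n) dynamic program, and treating two empty strings as distance 0 is an assumption rather than documented library behavior.

```csharp
using System;

// Hypothetical helper for illustration only; not part of F23.StringSimilarity.
public static class MetricLcsSketch
{
    // Textbook dynamic program: dp[i, j] is the LCS length of
    // the first i characters of a and the first j characters of b.
    public static int LcsLength(string a, string b)
    {
        var dp = new int[a.Length + 1, b.Length + 1];
        for (int i = 1; i <= a.Length; i++)
        {
            for (int j = 1; j <= b.Length; j++)
            {
                dp[i, j] = a[i - 1] == b[j - 1]
                    ? dp[i - 1, j - 1] + 1
                    : Math.Max(dp[i - 1, j], dp[i, j - 1]);
            }
        }
        return dp[a.Length, b.Length];
    }

    // 1 - |LCS(a, b)| / max(|a|, |b|)
    public static double Distance(string a, string b)
    {
        int longest = Math.Max(a.Length, b.Length);
        if (longest == 0) return 0.0; // assumption: two empty strings are identical
        return 1.0 - (double)LcsLength(a, b) / longest;
    }
}

public class Program
{
    public static void Main(string[] args)
    {
        // LCS "ABCDEF" has length 6, the longer string has length 10: about 0.4
        Console.WriteLine(MetricLcsSketch.Distance("ABCDEFG", "ABCDEFHJKL"));

        // LCS "ABDF" has length 4, both strings have length 5: about 0.2
        Console.WriteLine(MetricLcsSketch.Distance("ABDEF", "ABDIF"));
    }
}
```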
```csharp
using System;
using F23.StringSimilarity;

public class Program
{
    public static void Main(string[] args)
    {
        var lcs = new MetricLCS();

        string s1 = "ABCDEFG";
        string s2 = "ABCDEFHJKL";
        // LCS: ABCDEF => length = 6
        // longest = s2 => length = 10
        // => 1 - 6/10 = 0.4
        Console.WriteLine(lcs.Distance(s1, s2));

        // LCS: ABDF => length = 4
        // longest = ABDEF => length = 5
        // => 1 - 4 / 5 = 0.2
        Console.WriteLine(lcs.Distance("ABDEF", "ABDIF"));
    }
}
```

--------------------------------

### Longest Common Subsequence (LCS) Distance in C#

Source: https://github.com/feature23/stringsimilarity.net/blob/main/README.md

Calculates the Longest Common Subsequence (LCS) distance between two strings using a dynamic programming approach. The LCS distance is defined as the sum of the lengths of the two strings minus twice the length of their longest common subsequence. This is equivalent to an edit distance in which only insertions and deletions are allowed, or in which a substitution costs twice as much as an insertion or deletion. It uses the F23.StringSimilarity library.

```csharp
using System;
using F23.StringSimilarity;

public class Program
{
    public static void Main(string[] args)
    {
        var lcs = new LongestCommonSubsequence();

        // Will produce 4.0
        Console.WriteLine(lcs.Distance("AGCAT", "GAC"));

        // Will produce 1.0
        Console.WriteLine(lcs.Distance("AGCAT", "AGCT"));
    }
}
```

--------------------------------

### Calculate Damerau-Levenshtein Distance in C#

Source: https://context7.com/feature23/stringsimilarity.net/llms.txt

Computes the Damerau-Levenshtein distance, which accounts for insertions, deletions, substitutions, and transpositions of adjacent characters. It returns the minimum number of operations required to transform one string into another.
```csharp
using F23.StringSimilarity;

var damerau = new Damerau();

// Transposition counts as 1 operation (not 2 substitutions)
double distance1 = damerau.Distance("ab", "ba");
Console.WriteLine($"Transposition: {distance1}"); // Output: 1

// Single transposition (C and D swapped)
double distance2 = damerau.Distance("ABCDEF", "ABDCEF");
Console.WriteLine($"One transposition: {distance2}"); // Output: 1

// Multiple transpositions
double distance3 = damerau.Distance("ABCDEF", "BACDFE");
Console.WriteLine($"Two transpositions: {distance3}"); // Output: 2

// Deletion
double distance4 = damerau.Distance("ABCDEF", "ABCDE");
Console.WriteLine($"Deletion: {distance4}"); // Output: 1

// Insertion
double distance5 = damerau.Distance("ABCDEF", "ABCGDEF");
Console.WriteLine($"Insertion: {distance5}"); // Output: 1

// Completely different strings
double distance6 = damerau.Distance("ABCDEF", "POIU");
Console.WriteLine($"Completely different: {distance6}"); // Output: 6

// Using spans
double spanDist = damerau.Distance("recieve".AsSpan(), "receive".AsSpan());
Console.WriteLine($"Span transposition: {spanDist}"); // Output: 1
```

--------------------------------

### Jaro-Winkler Similarity Calculation in C#

Source: https://github.com/feature23/stringsimilarity.net/blob/main/README.md

Computes the Jaro-Winkler similarity between two strings, a metric primarily used for record linkage and detecting transposition errors in short strings like names. The similarity score ranges from 0.0 to 1.0, with higher values indicating greater similarity. This implementation uses the F23.StringSimilarity library.
```csharp
using System;
using F23.StringSimilarity;

public class Program
{
    public static void Main(string[] args)
    {
        var jw = new JaroWinkler();

        // substitution of s and t
        Console.WriteLine(jw.Similarity("My string", "My tsring"));

        // substitution of s and n
        Console.WriteLine(jw.Similarity("My string", "My ntrisg"));
    }
}
```

--------------------------------

### Calculate Longest Common Subsequence (LCS) in C#

Source: https://context7.com/feature23/stringsimilarity.net/llms.txt

Computes the distance based on the Longest Common Subsequence. This metric measures the edit distance by identifying the length of the longest sequence of characters that appear in both strings in the same relative order.

```csharp
using F23.StringSimilarity;

var lcs = new LongestCommonSubsequence();

// Distance calculation
// LCS of "AGCAT" and "GAC" is "GA" or "AC" (length 2)
// Distance = 5 + 3 - 2*2 = 4
double distance1 = lcs.Distance("AGCAT", "GAC");
Console.WriteLine($"Distance: {distance1}"); // Output: 4

// Another example
// LCS of "AGCAT" and "AGCT" is "AGCT" (length 4)
// Distance = 5 + 4 - 2*4 = 1
double distance2 = lcs.Distance("AGCAT", "AGCT");
Console.WriteLine($"Distance: {distance2}"); // Output: 1

// Get the length of LCS directly
int lcsLength = lcs.Length("ABCDGH", "AEDFHR");
Console.WriteLine($"LCS length: {lcsLength}"); // Output: 3 (ADH)

// Identical strings have distance 0
double distance3 = lcs.Distance("same", "same");
Console.WriteLine($"Identical: {distance3}"); // Output: 0

// Using spans
double spanDist = lcs.Distance("algorithm".AsSpan(), "altruistic".AsSpan());
Console.WriteLine($"Span distance: {spanDist}");
```

--------------------------------

### Calculate Jaccard Index Similarity in C#

Source: https://context7.com/feature23/stringsimilarity.net/llms.txt

Computes the Jaccard Index, a set-based similarity measure defined as the size of the intersection divided by the size of the union of n-gram sets derived from the strings.
It returns a similarity score between 0 and 1. The default shingle size (k) is 3 (trigrams), but this can be customized. The distance is calculated as 1 minus the similarity. Because the sets are built from character n-grams, the measure also suits longer strings such as sentences.

```csharp
using System;
using F23.StringSimilarity;

// Default k=3 (trigrams)
var jaccard = new Jaccard();
double sim1 = jaccard.Similarity("hello", "hallo");
Console.WriteLine($"Similarity: {sim1:F4}");

// Distance is 1 - similarity
double dist1 = jaccard.Distance("hello", "hallo");
Console.WriteLine($"Distance: {dist1:F4}");

// Custom shingle size
var jaccard2 = new Jaccard(2);
double sim2 = jaccard2.Similarity("night", "nacht");
Console.WriteLine($"2-gram similarity: {sim2:F4}");

// Longer strings such as sentences are compared by their character n-grams
double sim3 = jaccard.Similarity("the quick brown fox", "the fast brown fox");
Console.WriteLine($"Sentence similarity: {sim3:F4}");

// Identical strings
double sim4 = jaccard.Similarity("identical", "identical");
Console.WriteLine($"Identical: {sim4}"); // Output: 1

// Completely different strings
double dist2 = jaccard.Distance("abc", "xyz");
Console.WriteLine($"Different: {dist2}"); // Output: 1
```

--------------------------------

### Calculate Normalized Levenshtein Distance in C#

Source: https://github.com/feature23/stringsimilarity.net/blob/main/README.md

Computes the Normalized Levenshtein distance by dividing the Levenshtein distance by the length of the longer string, resulting in a value between 0.0 and 1.0. Because of this normalization it is not a true metric: the triangle inequality no longer holds. Similarity is calculated as 1 - normalized distance.
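To make the normalization concrete, here is a minimal self-contained sketch that does not use the library. The class `NormalizedLevenshteinSketch` and its method names are hypothetical, introduced only for illustration, and the library's own implementation may differ in detail (for example in how it treats two empty strings, which this sketch assumes to be at distance 0).

```csharp
using System;

// Hypothetical illustration class; not part of F23.StringSimilarity.
public static class NormalizedLevenshteinSketch
{
    // Classic two-row dynamic-programming Levenshtein distance.
    public static int Levenshtein(string a, string b)
    {
        var prev = new int[b.Length + 1];
        var curr = new int[b.Length + 1];
        for (int j = 0; j <= b.Length; j++) prev[j] = j;

        for (int i = 1; i <= a.Length; i++)
        {
            curr[0] = i;
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                curr[j] = Math.Min(
                    Math.Min(prev[j] + 1, curr[j - 1] + 1), // deletion, insertion
                    prev[j - 1] + cost);                    // match or substitution
            }
            (prev, curr) = (curr, prev); // the finished row becomes "prev"
        }
        return prev[b.Length];
    }

    // Edit distance divided by the length of the longer string.
    public static double Distance(string a, string b)
    {
        int longest = Math.Max(a.Length, b.Length);
        if (longest == 0) return 0.0; // assumption: two empty strings are identical
        return (double)Levenshtein(a, b) / longest;
    }

    public static void Main()
    {
        // One substitution ('s' -> '$') over 9 characters: 1/9, about 0.111
        double d = Distance("My string", "My $tring");
        Console.WriteLine($"Distance: {d}");
        Console.WriteLine($"Similarity: {1 - d}");
    }
}
```

Dividing by the longer length keeps the result in the range 0.0 to 1.0, since the raw edit distance can never exceed the length of the longer string.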
```csharp
using System;
using F23.StringSimilarity;

public class Program
{
    public static void Main(string[] args)
    {
        var l = new NormalizedLevenshtein();

        // 1 substitution over 9 characters: 1/9, about 0.111
        Console.WriteLine(l.Distance("My string", "My $tring"));

        // Similarity is 1 - normalized distance
        Console.WriteLine(l.Similarity("My string", "My $tring"));
    }
}
```

--------------------------------

### Damerau-Levenshtein Distance Calculation in C#

Source: https://github.com/feature23/stringsimilarity.net/blob/main/README.md

Calculates the Damerau-Levenshtein distance between two strings, which is the minimum number of single-character edits (insertions, deletions, substitutions, or transpositions of adjacent characters) required to change one string into the other. This example uses the F23.StringSimilarity library.

```csharp
using System;
using F23.StringSimilarity;

public class Program
{
    public static void Main(string[] args)
    {
        var d = new Damerau();

        // 1 transposition (C and D)
        Console.WriteLine(d.Distance("ABCDEF", "ABDCEF"));

        // 2 transpositions (A/B and E/F)
        Console.WriteLine(d.Distance("ABCDEF", "BACDFE"));

        // 1 deletion (last character, then first character)
        Console.WriteLine(d.Distance("ABCDEF", "ABCDE"));
        Console.WriteLine(d.Distance("ABCDEF", "BCDEF"));

        // 1 insertion
        Console.WriteLine(d.Distance("ABCDEF", "ABCGDEF"));

        // All different
        Console.WriteLine(d.Distance("ABCDEF", "POIU"));
    }
}
```
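The effect of counting an adjacent transposition as a single edit, as the Damerau-Levenshtein description states, can be seen in a small self-contained sketch. It uses the optimal string alignment recurrence, a simpler restricted variant of Damerau-Levenshtein, and is not the library's implementation. The class `EditDistanceSketch` and the `allowTransposition` flag are hypothetical names introduced for illustration.

```csharp
using System;

// Hypothetical illustration class; not part of F23.StringSimilarity.
public static class EditDistanceSketch
{
    // Full-matrix edit distance. With allowTransposition = false this is
    // plain Levenshtein; with true it is the optimal string alignment
    // variant, where swapping two adjacent characters costs 1.
    public static int Distance(string a, string b, bool allowTransposition)
    {
        var d = new int[a.Length + 1, b.Length + 1];
        for (int i = 0; i <= a.Length; i++) d[i, 0] = i;
        for (int j = 0; j <= b.Length; j++) d[0, j] = j;

        for (int i = 1; i <= a.Length; i++)
        {
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                int best = Math.Min(
                    Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1), // deletion, insertion
                    d[i - 1, j - 1] + cost);                    // match or substitution

                // Adjacent transposition: "xy" in a lines up with "yx" in b.
                if (allowTransposition && i > 1 && j > 1
                    && a[i - 1] == b[j - 2] && a[i - 2] == b[j - 1])
                {
                    best = Math.Min(best, d[i - 2, j - 2] + 1);
                }
                d[i, j] = best;
            }
        }
        return d[a.Length, b.Length];
    }

    public static void Main()
    {
        // Without transpositions, swapping C and D needs two substitutions.
        Console.WriteLine(Distance("ABCDEF", "ABDCEF", allowTransposition: false)); // 2
        // With transpositions, the same swap is a single edit.
        Console.WriteLine(Distance("ABCDEF", "ABDCEF", allowTransposition: true));  // 1
    }
}
```

The restricted variant never edits a substring more than once, so it can return a larger value than the unrestricted Damerau-Levenshtein distance for some inputs. For the simple swaps shown here the two agree.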