Provides multiple text-based similarity algorithms to measure the similarity of input text pairs. The provided algorithms are tuned to measure similarity both in the representation (syntax) and the meaning (semantics) of the text content.
Performs a similarity analysis, either syntactic or semantic, on the given sentence pair.
text1 required | string non-empty The first text, encoded as UTF-8 |
lang1 required | string The two-letter language code of text1 |
text2 required | string non-empty The second text, encoded as UTF-8 |
lang2 required | string The two-letter language code of text2 |
algo | string Similarity Algorithms
Syntactic Similarity
The syntactic similarity algorithms focus exclusively on the representational features of text. The most prominent of these features is the set of tokens (characters and words) used. Different syntactic similarity algorithms exploit these features differently to measure the similarity between an input text pair. Similarity is measured on a scale of 0 to 1, where 1 represents the best possible match and 0 indicates no match. In addition to the base algorithms, we also use character-based and/or word-based shingles to add context and increase similarity accuracy. The following syntactic similarity algorithms are supported:
Semantic Similarity
The semantic similarity algorithms compare the input text pair based on the main concepts present in the text, regardless of the words used to represent those concepts. Roughly speaking, this amounts to comparing the meaning of the two sentences independently of their wording. Our semantic similarity algorithms are built on modern deep-learning-based word embeddings trained on an enterprise corpus of sample documents. The models are trained on single sentences and/or short paragraphs, and therefore work best for content of that size. All of our semantic similarity algorithms support multilingual and cross-lingual scenarios, where each text of the input pair can be expressed in any of the supported languages (for example en-en, en-fr, en-es, fr-fr, fr-es, etc.). The following semantic similarity algorithms are supported:
|
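To make the syntactic approach more concrete, here is a minimal sketch of a word-level cosine similarity with optional word shingles. It illustrates the general technique only and is not the service's implementation; the function names `word_shingles` and `cosine_similarity`, the whitespace tokenization, and the lowercasing are all assumptions made for this example.

```python
import math
from collections import Counter


def word_shingles(text: str, size: int = 1) -> list[str]:
    """Split text into lowercase word shingles (word n-grams) of the given size."""
    words = text.lower().split()
    if len(words) < size:
        # Text shorter than one shingle: treat the whole text as a single shingle.
        return [" ".join(words)] if words else []
    return [" ".join(words[i:i + size]) for i in range(len(words) - size + 1)]


def cosine_similarity(text1: str, text2: str, size: int = 1) -> float:
    """Cosine similarity between shingle-count vectors, on a scale of 0 to 1."""
    a = Counter(word_shingles(text1, size))
    b = Counter(word_shingles(text2, size))
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


# Single words versus two-word shingles on the sample sentence pair.
unigram_score = cosine_similarity("how are you", "how old are you")
bigram_score = cosine_similarity("how are you", "how old are you", size=2)
print(round(unigram_score, 3), round(bigram_score, 3))  # 0.866 0.408
```

With single words the two sentences share three of their tokens and score high; with two-word shingles only "are you" is shared, so the score drops. This is the sense in which shingles add context to a token-based comparison.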
200 response
Invalid request body
The request is forbidden (Please input a valid API key)
{
  "text1": "how are you",
  "lang1": "en",
  "text2": "how old are you",
  "lang2": "en",
  "algo": "syn.cosine-word"
}
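A request body like the sample above could be assembled and checked on the client side before sending. The sketch below is one possible approach; the helper name `build_similarity_request` and its validation rules are hypothetical and are derived only from the field descriptions in this reference. The endpoint URL and the way the API key is passed are not given here, so the sketch stops at producing the UTF-8 encoded JSON payload.

```python
import json


def build_similarity_request(text1, lang1, text2, lang2, algo=None):
    """Validate the documented fields and return the request body as a dict."""
    for name, text in (("text1", text1), ("text2", text2)):
        if not isinstance(text, str) or not text:
            raise ValueError(f"{name} must be a non-empty string")
    for name, lang in (("lang1", lang1), ("lang2", lang2)):
        if not isinstance(lang, str) or len(lang) != 2 or not lang.isalpha():
            raise ValueError(f"{name} must be a two-letter language code")
    body = {"text1": text1, "lang1": lang1, "text2": text2, "lang2": lang2}
    if algo is not None:
        # algo is not marked as required, so it is only included when given.
        body["algo"] = algo
    return body


body = build_similarity_request(
    "how are you", "en", "how old are you", "en", algo="syn.cosine-word"
)
# Serialize as UTF-8 JSON, ready to be sent as the request body.
payload = json.dumps(body, ensure_ascii=False).encode("utf-8")
```

Validating locally is a design choice that avoids a round trip for requests the service would presumably reject with an invalid request body response.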
{
  "status": {
    "success": true,
    "code": 200
  },
  "result": {
    "text1": "string",
    "text2": "string",
    "score": 1,
    "prediction": {
      "match": true,
      "conf": 1
    }
  }
}
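A client might unpack a response of this shape as sketched below. The `SimilarityResult` class and the `parse_similarity_response` function are hypothetical names introduced for illustration. The sketch also assumes that an unsuccessful response carries the same `status` object with `success` set to false, which this reference does not confirm.

```python
import json
from dataclasses import dataclass


@dataclass
class SimilarityResult:
    score: float  # similarity score from the "result" object
    match: bool   # prediction.match
    conf: float   # prediction.conf


def parse_similarity_response(raw: str) -> SimilarityResult:
    """Parse a JSON response body and extract the score and prediction."""
    doc = json.loads(raw)
    status = doc.get("status", {})
    if not status.get("success"):
        # Assumption: failures report success=false inside the same envelope.
        raise RuntimeError(f"similarity request failed with code {status.get('code')}")
    result = doc["result"]
    prediction = result["prediction"]
    return SimilarityResult(
        score=float(result["score"]),
        match=bool(prediction["match"]),
        conf=float(prediction["conf"]),
    )


# The sample response from this reference, as a single JSON string.
sample = (
    '{"status": {"success": true, "code": 200}, '
    '"result": {"text1": "string", "text2": "string", "score": 1, '
    '"prediction": {"match": true, "conf": 1}}}'
)
parsed = parse_similarity_response(sample)
```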