Class ParsedDocumentMetadata
Metadata about a parsed document.
Namespace: Textkernel.Tx.Models
Assembly: Textkernel.Tx.SDK.dll
Syntax
public class ParsedDocumentMetadata
Properties
DocumentCulture
An RFC 3066 language tag that represents the cultural context of the document regarding formatting of
numbers, dates, character symbols, etc. This value is usually a simple concatenation of the
language and country codes, such as en-US for US English. However, culture can be set
independently of language and country to achieve fine-tuned cultural control over parsing,
so if you use this value, do not assume that it always matches the language and country.
Declaration
public string DocumentCulture { get; set; }
Property Value
Type | Description |
---|---|
System.String |
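Because the culture can differ from the language and country, formatting decisions are safest when driven by DocumentCulture alone. The following is a minimal sketch of that idea, not SDK code: the ParsedDocumentMetadata class here is a local stand-in with the same property names so the sample compiles without the SDK, and ResolveCulture and CultureExample are hypothetical names introduced only for illustration.

```csharp
using System;
using System.Globalization;

// Local stand-in mirroring two of the documented properties (assumption:
// used here only so the sketch compiles without referencing the SDK).
public class ParsedDocumentMetadata
{
    public string DocumentCulture { get; set; }
    public string DocumentLanguage { get; set; }
}

public static class CultureExample
{
    // Hypothetical helper: resolves a CultureInfo from DocumentCulture only,
    // never from the language. Falls back to the invariant culture when the
    // value is missing or the runtime does not recognize it.
    public static CultureInfo ResolveCulture(ParsedDocumentMetadata metadata)
    {
        if (string.IsNullOrWhiteSpace(metadata.DocumentCulture))
        {
            return CultureInfo.InvariantCulture;
        }

        try
        {
            return CultureInfo.GetCultureInfo(metadata.DocumentCulture);
        }
        catch (CultureNotFoundException)
        {
            return CultureInfo.InvariantCulture;
        }
    }

    public static void Main()
    {
        var metadata = new ParsedDocumentMetadata
        {
            DocumentLanguage = "en",
            DocumentCulture = "de-DE" // culture deliberately differs from the language
        };

        CultureInfo culture = ResolveCulture(metadata);

        // The number is interpreted with the separators of the resolved culture,
        // regardless of what DocumentLanguage says.
        decimal amount = decimal.Parse("1.234,56", NumberStyles.Number, culture);
        Console.WriteLine(amount.ToString(CultureInfo.InvariantCulture));
    }
}
```

Falling back to the invariant culture is a design choice of this sketch; an application may prefer to surface an error instead.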
DocumentLanguage
An ISO 639-1 code that represents the primary language of the parsed text. When the
language could not be automatically determined, it is reported as the special value
iv (invariant/unknown). Note that the two-letter ISO codes reported by the
parser (such as zh for Chinese) do not differentiate between language
variants, such as Mandarin and Cantonese.
Declaration
public string DocumentLanguage { get; set; }
Property Value
Type | Description |
---|---|
System.String |
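Code that branches on the language should treat iv as "not determined" rather than as a real language code. A minimal sketch of such a check follows; LanguageExample and HasKnownLanguage are hypothetical names that are not part of the SDK, and the case-insensitive comparison is an assumption made for robustness.

```csharp
using System;

public static class LanguageExample
{
    // "iv" is the special value described above for an undetermined language.
    private const string UnknownLanguage = "iv";

    // Hypothetical helper: returns true only when a real language code was reported.
    public static bool HasKnownLanguage(string documentLanguage)
    {
        return !string.IsNullOrWhiteSpace(documentLanguage)
            && !string.Equals(documentLanguage, UnknownLanguage, StringComparison.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        foreach (string code in new[] { "en", "zh", "iv" })
        {
            string status = HasKnownLanguage(code) ? "language detected" : "language unknown";
            Console.WriteLine(code + ": " + status);
        }
    }
}
```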
DocumentLastModified
The last-revised/last-modified date that was provided for the document. This date was used to calculate all of the important metrics about skills and jobs.
Declaration
public DateTime DocumentLastModified { get; set; }
Property Value
Type | Description |
---|---|
System.DateTime |
ParserSettings
The full parser settings that were used during parsing.
Declaration
public string ParserSettings { get; set; }
Property Value
Type | Description |
---|---|
System.String |
PlainText
The plain text that was used for parsing.
Declaration
public string PlainText { get; set; }
Property Value
Type | Description |
---|---|
System.String |