Parse the XML

Overview

OpenAmplify's linguistic analysis generates signals about the supplied text and returns them in one of several formats. The most universally accessible format is XML, which this page covers. If you're familiar with reading a WSDL, ours can be found here.

Our signals identify the prominent Topics (nouns) and Actions (verbs) from the text. They also provide useful data about those Topics and Actions. The Demographics and Style analyses generate information about the author and their tone based on the style of the writing.

Data structure

All data in the XML come in paired <name> and <value> tags.

In some cases, when the <name> and <value> tags fall inside a <topic> or <action> tag, <name> is the name of the topic or action (e.g. "world" or "say hello") and <value> is the prominence of the topic or action in the text. As you might expect, a higher prominence score means the topic or action figures more prominently in the text.

In other cases, <value> contains a score that applies to the containing node or set of nodes, and <name> is a textual label for that score.

For example, in the extract below, <value> is the maximum polarity (attitude) score for the topic held in the sibling <topic> branch of the containing <topicResult> node, and <name> is the textual label for that score.

<topicResult>
	<topic>…</topic>
	<polarity>
		<max>
			<name>Very Positive</name>
			<value>1.00000</value>
		</max>
		….
	</polarity>
	…
</topicResult>
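
The name/value pairing can be read with any XML parser. What follows is a minimal sketch in Python using the standard library's xml.etree.ElementTree. It assumes a complete fragment shaped like the extract above, with <name> and <value> as direct children of <topic>. The topic name "coder" and its prominence value are invented for illustration, because the extract elides them. read_pair and parse_topic_result are hypothetical helper names, not part of OpenAmplify.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment only: the <topic> contents are invented placeholders,
# because the extract in the text elides them.
SAMPLE = """
<topicResult>
    <topic>
        <name>coder</name>
        <value>10.0</value>
    </topic>
    <polarity>
        <max>
            <name>Very Positive</name>
            <value>1.00000</value>
        </max>
    </polarity>
</topicResult>
"""


def read_pair(node):
    """Return (name, value) from a node with <name> and <value> children."""
    name = node.findtext("name")
    raw = node.findtext("value")
    value = float(raw) if raw is not None else None
    return name, value


def parse_topic_result(xml_text):
    """Read the topic pair and the max-polarity pair from one <topicResult>."""
    root = ET.fromstring(xml_text.strip())
    topic, prominence = read_pair(root.find("topic"))
    label, score = read_pair(root.find("polarity/max"))
    return {
        "topic": topic,                # <name> inside <topic>: the topic itself
        "prominence": prominence,      # <value> inside <topic>: its prominence
        "max_polarity_label": label,   # <name> inside <max>: label for the score
        "max_polarity_score": score,   # <value> inside <max>: the score
    }


if __name__ == "__main__":
    print(parse_topic_result(SAMPLE))
```

The same read_pair helper serves both cases described above, because the meaning of <name> and <value> depends only on which node contains them.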

Data Dictionary and Examples

Below is a list of the data points returned by the different analyses and some explanation of what they mean.

For all the examples, the submitted text is:

"We like coders. We want our coders to write good code. Good code helps make valuable software. Should we give them a helpful API?"

  • analysis=All: All combines all the signals from the following analyses.
  • analysis=Topics:
    • Domain: Broad domains suggested by the most relevant topics (e.g. Automobiles, Medicine).
    • Topic: Nouns from the text.
      • Name = the topic (e.g. coder).
      • Value = the prominence of the word in the text.
    • Polarity: The attitude expressed toward a Topic.
      • OpenAmplify returns a Min, Mean, and Max Polarity name and score.
      • Value to Name mappings:
        • value < 0 » "Negative"
        • value = 0 » "Neutral"
        • value > 0 » "Positive"
    • OfferingGuidance/RequestingGuidance: A measure of how much guidance the author is offering or requesting about each Topic.
      • Value to Name mappings:
        • value = 1 » "Not At All"
        • value = 2 » "To Some Extent"
        • value = 3 » "A Lot"
  • analysis=Actions:
    • Action: Verbs & verb phrases from the text.
    • ActionType: A classification of the action (e.g. "buy" and "purchase" are both classified as "buy").
    • Temporality: The time associated with each action (past, recent past, present, or future).
      • Value to Name mappings:
        • value = 0 » "NA"
        • value = 1 » "Past"
        • value = 2 » "Recent Past"
        • value = 3 » "Present"
        • value = 4 » "Future"
    • Decisiveness: The degree to which the action in question is likely to happen in the future.
      • Value to Name mappings:
        • value < 0 » "NA"
        • value 0-1 » "Low"
        • value 1.000001 - 2 » "Medium Low"
        • value 2.000001 - 3 » "Medium"
        • value 3.000001 - 4 » "Medium High"
        • value 4.000001 - 5 » "High"
    • OfferingGuidance/RequestingGuidance: A measure of how much the author is offering or requesting guidance about the action in question.
      • Value to Name mappings:
        • value = 1 » "Not At All"
        • value = 2 » "To Some Extent"
        • value = 3 » "A Lot"
  • analysis=Demographics:
    • Age: Estimates the age of the author in broad categories. This analysis improves as the length of the content increases.
      • Value to Name mappings:
        • value < -0.02 » "Young"
        • -0.02 <= value <= 0.02 » "Adult"
        • value > 0.02 » "Senior"
    • Gender: Estimates the gender of the author.
      • Names: "Male", "Female", "Neutral"
        • The value corresponds to the degree of certainty about Gender
    • Education: Estimates the degree of education of the author.
      • Value to Name mappings:
        • value = 0 » "Undecided"
        • value = 1 » "Pre-Secondary"
        • value = 2 » "Secondary"
        • value = 3 » "College"
        • value = 4 » "Post Graduate"
  • analysis=Style:
    • Slang: Identifies how much slang is used in the text.
      • Value to Name mappings:
        • value >= 3 » "Slang"
        • value < 3 » "No Slang"
    • Flamboyance: Identifies how flamboyant, lively, or colorful the language being used is.
      • Value to Name mappings:
        • value < 2 » "Not Flamboyant"
        • 2 <= value < 3 » "Not Very Flamboyant"
        • 3 <= value < 4 » "Somewhat Flamboyant"
        • 4 <= value < 5 » "Flamboyant"
        • value >= 5 » "Very Flamboyant"
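
The value-to-name mappings listed above can be reproduced client-side, for instance to label scores you have stored without their names. The sketch below covers Polarity, Temporality, Decisiveness, and Flamboyance as they are listed in this data dictionary. All function and constant names are hypothetical. Two behaviours are assumptions, because the list does not specify them: Decisiveness values above 5 are labelled "High", and values between 1 and 1.000001 (and the other such gaps) fall into the higher band.

```python
from bisect import bisect_left, bisect_right

# Temporality uses exact integer codes.
TEMPORALITY_NAMES = {0: "NA", 1: "Past", 2: "Recent Past", 3: "Present", 4: "Future"}

# Decisiveness bands include their upper bound (0-1 is "Low", so 1 is "Low").
DECISIVENESS_UPPER_BOUNDS = [1, 2, 3, 4]
DECISIVENESS_NAMES = ["Low", "Medium Low", "Medium", "Medium High", "High"]

# Flamboyance bands include their lower bound (2 <= value < 3, and so on).
FLAMBOYANCE_LOWER_BOUNDS = [2, 3, 4, 5]
FLAMBOYANCE_NAMES = [
    "Not Flamboyant",
    "Not Very Flamboyant",
    "Somewhat Flamboyant",
    "Flamboyant",
    "Very Flamboyant",
]


def polarity_name(value):
    """Map a polarity score to its name using the sign of the value."""
    if value < 0:
        return "Negative"
    if value > 0:
        return "Positive"
    return "Neutral"


def decisiveness_name(value):
    """Map a decisiveness score to its band; negative scores are "NA"."""
    if value < 0:
        return "NA"
    # bisect_left keeps each upper bound inside its own band (1 -> "Low").
    # Assumption: values above 5 are also labelled "High".
    return DECISIVENESS_NAMES[bisect_left(DECISIVENESS_UPPER_BOUNDS, value)]


def flamboyance_name(value):
    """Map a flamboyance score to its band."""
    # bisect_right puts each lower bound in the band it opens (2 -> "Not Very").
    return FLAMBOYANCE_NAMES[bisect_right(FLAMBOYANCE_LOWER_BOUNDS, value)]


if __name__ == "__main__":
    print(polarity_name(-0.4), TEMPORALITY_NAMES[3])
    print(decisiveness_name(2.5), flamboyance_name(4.2))
```

Keeping the band boundaries in plain lists makes it easy to adjust them if the service's thresholds differ from those listed here.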