Codeq NLP API Tutorial
Part 8. Semantic Roles
In this tutorial of Codeq’s NLP API we will focus on a single module that can be used to extract semantic roles from texts. Previous tutorials of this series can be found here:
- Part 1. Getting started and sending requests to the API.
- Part 2. Calling NLP annotators for linguistic analysis.
- Part 3. Using text classifiers to classify sentiments and emotions, and detect sarcasm in texts.
- Part 4. Detecting abusive and harmful content from texts.
- Part 5. Extracting and disambiguating named entities.
- Part 6. Extracting Speech Acts, Questions & Tasks.
- Part 7. Summarizing texts.
The complete list of modules in the Codeq NLP API can be found here:
Define a NLP pipeline and analyze a text
As usual, the first step is to declare an instance of the NLP API client and use it to send a text along with a pipeline variable indicating the name of the Semantic Roles annotator. The output is a Document object that contains a list of analyzed Sentences. A quick look of the output can be found with the method document.pretty_print().
from codeq_nlp_api import CodeqClientclient = CodeqClient(user_id="USER_ID", user_key="USER_KEY")pipe = [
"semantic_roles"
]
text = "A pneumonia outbreak was reported in Wuhan, China in December 2019."
document = client.analyze(text, pipeline=pipe)
print(document.pretty_print())
Semantic Role Labelling
The goal of this module is to identify the main events and participants in sentences and classify the different types of relations between them. In the extraction of Semantic Roles, events are called predicates, while the participants are known as the arguments of a given predicate. Arguments can denote specific types of relations, for example they can be an Agent, a Patient or a Location in relation to the predicate.
More details about the Semantic Role Labeler and an example of its application can be found here:
- KEY: semantic_roles
- ATTR: sentence.semantic_roles
Output Labels:
- Agent/Experiencer
- Patient/Theme
- Instrument/Beneficiary/Goal
- StartingPoint/Attribute
- EndingPoint
- Location
- Purpose
- Cause
- Temporal
- Modifier
- Negative
- GenericArgument
pipe = [
"semantic_roles"
]
text = "A pneumonia outbreak was reported in Wuhan, China in December 2019."
document = client.analyze(text, pipeline=pipe)
for sentence in document.sentences:
raw_sentence = sentence.raw_sentence
semantic_roles = sentence.semantic_roles
print(raw_sentence)
for sr in semantic_roles:
predicate_lemma = sr['predicate_lemma']
predicate_token = sr['predicate_token']
predicate_position = sr['predicate_position']
print("")
print("predicate_lemma: %s" % predicate_lemma)
print("predicate_token: %s" % predicate_token)
print("predicate_position: %s" % predicate_position)
if 'arguments' in sr:
print("arguments:")
for arg in sr['arguments']:
arg_type = arg['type']
arg_tokens = arg['tokens']
arg_tokens_position = arg['positions']
print("- type: %s" % arg_type)
print("- tokens: %s" % arg_tokens)
print("- positions: %s\n" % arg_tokens_position)# Output:
#
# sentence: A pneumonia outbreak was reported in Wuhan, China in December 2019.
#
# semantic_roles:
#
# predicate_lemma: be
# predicate_token: was
# predicate_position: 4
#
# predicate_lemma: report
# predicate_token: reported
# predicate_position: 5
# arguments:
# - type: Patient/Theme
# - tokens: ['A', 'pneumonia', 'outbreak']
# - positions: [1, 2, 3]
#
# - type: Location
# - tokens: ['in', 'Wuhan', ',', 'China']
# - positions: [6, 7, 8, 9]
#
# - type: Temporal
# - tokens: ['in', 'December', '2019']
# - positions: [10, 11, 12]
From the output above we can observe the following:
- All semantic roles contain a predicate and, if present, a list of arguments for that predicate.
- Predicates include the token, lemma (inflected form) and position in the sentence.
- Each argument will contain the type (see Output labels above), the tokens of that argument and the position of the tokens in the sentence.
- All token positions in the sentence start from 1 (instead of 0, as lists in Python).
Wrap up
In this tutorial we described how to use the Semantic Role Labeler of the Codeq NLP API. The code below summarizes how to iterate over its output:
- Take a look at our documentation to learn more about the NLP tools we provide.
- Do you need inspiration? Go to our use case demos and see how you can integrate different tools.
- In our NLP demos section you can also try our tools and find examples of the output of each module.