Requirements Quality Factor Ontology

RQFO

Guideline

The following sections describe how to identify and extract eligible objects from primary studies. Mandatory attributes are marked with an asterisk.


factor

A quality factor (QF) represents a normative metric which maps a textual requirement of a specific granularity to a scale and therefore informs about the quality of that input.

name

scope note*

The name of a quality factor shall reflect the quality factor in relation to the whole set of quality factors. The name can be extracted from the paper directly, but if it does not sufficiently demarcate the quality factor from others, a new name may be chosen. This is necessary because several publications propose and name a quality factor without awareness of other quality factors, which makes demarcation by design impossible.

aspect

dimension cluster*

This dimension cluster is built on the conceptual notion that a quality factor is a normative rule, where a violation against it is hypothesized to have an impact of the type aspect on an activity in which the requirement is used. It captures all quality aspects that are explicitly mentioned in the description of the impact of the quality factor.

Extraction rule: A violation against the quality factor has an impact on

Dimension | Extraction Rule
adequacy | the appropriateness of the requirement in its respective context
atomicity | the confinement of the requirement's scope to only one, not further splittable element
completeness | the explicit availability of all relevant information
compliance | the adherence to external rules
consistency | the satisfiability of all requirements in conjunction
correctness | the alignment of the stated text with the intended objects
designindependence | the confinement of the requirement to the problem space
feasibility | the chance of realistically implementing the requirement
maintainability | the ability to continuously ensure the quality of the requirement
modifiability | the ability to change the requirement
necessity | the singularity of written text
precision | the level of unique specification of the text
reusability | the ease of reuse
simplicity | the intricacy of the written text
traceability | the explicit connections to other artifacts
unambiguousness | the unique interpretation of a requirement
understandability | the comprehensibility of written text
verifiability | the ability to assess whether a requirement is met

Characteristic | Extraction Rule
- | in a negative way
+ | in a positive way
? | in an unknown way
(blank) | no way at all
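
As an illustration, the aspect cluster of one quality factor could be recorded as a mapping from each dimension to one characteristic. The following is a minimal sketch in Python; the names `Impact`, `ASPECTS`, and `aspect_cluster` are hypothetical and not part of the ontology, and the sketch assumes that aspects not mentioned in a primary study are recorded with the blank characteristic.

```python
from __future__ import annotations

from enum import Enum


class Impact(Enum):
    """Characteristics of the aspect cluster (symbols as used in the table)."""
    NEGATIVE = "-"
    POSITIVE = "+"
    UNKNOWN = "?"
    NONE = ""  # blank cell: no impact on this aspect


# The 18 dimensions of the aspect cluster, as listed in the guideline.
ASPECTS = (
    "adequacy", "atomicity", "completeness", "compliance", "consistency",
    "correctness", "designindependence", "feasibility", "maintainability",
    "modifiability", "necessity", "precision", "reusability", "simplicity",
    "traceability", "unambiguousness", "understandability", "verifiability",
)


def aspect_cluster(mentioned: dict[str, str]) -> dict[str, Impact]:
    """Build a full cluster from the aspects explicitly mentioned in a paper.

    Assumption: aspects that are not mentioned default to the blank
    characteristic (no impact).
    """
    unknown = set(mentioned) - set(ASPECTS)
    if unknown:
        raise ValueError(f"unknown aspects: {sorted(unknown)}")
    return {aspect: Impact(mentioned.get(aspect, "")) for aspect in ASPECTS}


# Hypothetical extraction: a factor whose violation harms understandability
# and affects verifiability in an unknown way.
cluster = aspect_cluster({"understandability": "-", "verifiability": "?"})
```

Validating against the fixed list of dimensions catches misspelled aspect names at extraction time rather than during later analysis.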

linguistic complexity

dimension*

The linguistic complexity classifies a quality factor by the type of information that needs to be available to determine a violation against it. This indicates how complex it is to detect the quality factor automatically: lexical factors can be decided using, for example, regular expressions; syntactic factors using POS tagging, constituency parsing, or dependency parsing; and structural factors using metadata. Semantic factors require an understanding of the input, which might only be approximated using thesauri or a relationship to an ontology.

Extraction rule: In order to determine a violation against the quality factor, one must know at least …

Characteristic | Extraction Rule
lexical | the literal words (i.e., a regular expression can be used to automate the rule).
structural | the structure of sentences (i.e., metadata (headings, emphasis, …) is necessary to automate the rule).
syntactic | the grammatical relationships between words (i.e., POS tags, constituency tags, dependency tags, etc. can be used to automate the rule).
semantic | the meaning of the words (i.e., a semantic comprehension of the text is necessary to automate the rule).
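
To illustrate why lexical factors are the simplest to automate, the sketch below detects violations against a made-up lexical quality factor ("avoid vague quantifiers") with a single regular expression. The word list and the function name `find_vague_terms` are illustrative assumptions and are not taken from any primary study.

```python
from __future__ import annotations

import re

# Hypothetical word list for an illustrative lexical quality factor.
VAGUE_TERMS = ("some", "several", "many", "few", "various")

# Word boundaries (\b) keep "some" from matching inside "somewhat".
_PATTERN = re.compile(r"\b(" + "|".join(VAGUE_TERMS) + r")\b", re.IGNORECASE)


def find_vague_terms(requirement: str) -> list[str]:
    """Return every vague term found in the text, lower-cased, in order."""
    return [match.group(1).lower() for match in _PATTERN.finditer(requirement)]


violations = find_vague_terms("The system shall support several users.")
# violations == ["several"]
```

Only the literal words are inspected here. A syntactic or semantic factor could not be decided this way, because the pattern has no access to grammatical relationships or meaning.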

scope

dimension*

The scope classifies a quality factor regarding the extent of information that is necessary in order to determine a violation against the formal rule of the quality factor. The minimal scope shall be chosen, i.e., for one violation against a given quality factor, how much textual information must be seen to detect that violation? The classification of the scope shall not be approached from a standpoint of 'ensuring that a given input document is free of violations against that factor', because that would always entail a global/document scope.

Extraction rule: To determine one violation against the formal rule of the quality attribute, it suffices to see …

Characteristic | Extraction Rule
word | a single token/word
phrase | multiple, coherent words
sentence | a full, grammatically correct sentence
structured/tabular text | a structured text (use case specification, user story, feature table)
user story | a structured text following the Cohn/Connextra template ('As a <user> I want to <goal> so that <justification>.')
use case | a structured text describing a set of connected scenarios in the form of consecutive steps
requirement | a structured functional, non-functional, or process requirement
section | a full, coherent section
document | a full, coherent document
global | all textual requirements artifacts associated to the product/service

description

A description object explains (a) what the quality factor means and (b) how this quality factor is hypothesized to inform about the quality of the requirement. A quality attribute can be associated with multiple descriptions, which may result from parallel work or from updated definitions or impact descriptions.

definition

scope note*

A definition is an informal rule, which must be complied with in order to ensure good quality of the requirement according to the authors. The definition may be simply postulated, but can also be derived empirically from data or developed in collaboration with industry.

impact

scope note

The impact scope note explicitly describes how the quality factor affects the actual quality of the requirements. A manuscript should make the hypothesized impact explicit, but does not need to do so in order to be included.

empirical evidence

dimension*

This dimension captures whether the given description and/or impact is rooted in any sort of empirical evidence. This may simply be practitioners reporting violations against the quality factor as a challenge, or an investigation of requirements artifacts. Empirical evidence for the description or impact corroborates the relevance of this QF.

Characteristic | Extraction Rule
true | An empirical method has been applied to validate the definition or the impact (or both) of the quality attribute.
false | The quality attribute has simply been postulated without any empirical validation of its definition or impact.

practitioners involved

dimension*

This dimension indicates whether practitioners (collaborators working primarily in industry) were involved in the creation or validation of the quality attribute.

Characteristic | Extraction Rule
true | Practitioners were involved in the validation of the description or impact of the quality attribute.
false | No practitioners were involved in the validation of the description or impact of the quality attribute, or there was no empirical method applied at all.
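
The extraction rules of the description object imply a consistency constraint: if no empirical method was applied, practitioners involved must be false. A minimal sketch of a record type that enforces this reading is shown below. The class name `Description` and its field names are hypothetical, and treating the constraint as a hard error is an assumption of this sketch rather than a rule stated in the guideline.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Description:
    """One description of a quality factor (field names are illustrative)."""
    definition: str                # mandatory scope note
    empirical_evidence: bool       # mandatory dimension
    practitioners_involved: bool   # mandatory dimension
    impact: Optional[str] = None   # optional scope note

    def __post_init__(self) -> None:
        if not self.definition.strip():
            raise ValueError("definition is mandatory and must not be empty")
        # Per the 'false' rule of 'practitioners involved': without any
        # empirical method, practitioners cannot have been involved.
        if self.practitioners_involved and not self.empirical_evidence:
            raise ValueError(
                "practitioners_involved requires empirical_evidence to be true"
            )


# A description that is postulated without validation and states no impact.
postulated = Description(
    definition="A requirement shall not contain vague quantifiers.",
    empirical_evidence=False,
    practitioners_involved=False,
)
```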

dataset

A data set object is an arbitrarily large set of natural language requirements, which may make one or more specific quality factors explicit (e.g., through annotations) and is usable as a gold standard to evaluate newly proposed approaches.

description

scope note*

The description of a data set object contains information about the origin of this data. Descriptions may be vague in the case of confidential data or explicit in the case of open-source data.

origin

dimension*

This dimension classifies a data set object regarding the type of the author. If no author is explicitly mentioned in the reference and accessing the reference does not reveal the author either, the data set's author must be recorded as unknown.

Characteristic | Extraction Rule
practitioner data | Data that was extracted from contexts in which practitioners work.
student data | Data that was created or extracted in the context of student work.
mocked data | Data that was fabricated for the purpose of being studied.
unknown | A dataset exists, but it is not clear who created the data.

ground truth annotators

dimension*

This dimension classifies an object regarding who is responsible for annotating the ground truth embedded in the data set, if such an annotation exists. Data that is used as is, without any additional information embedded into it, has a ground truth annotator of none.

Characteristic | Extraction Rule
practitioners | Practitioners annotated the data.
researchers | Ph.D., PostDocs, Professors, or independent researchers annotated the data.
students | BSc or MSc students annotated the data.
authors | Researchers listed as authors on the paper.
inherent | The truth is embedded in the data in some way. Could be just analysing the data the way it is, or the truth was added to the data in the way it was created.
none | Data was not annotated.
unknown | The dataset was annotated, but it is not clear who annotated the data.

size

numeric

This dimension quantifies an object regarding the number of contained elements, which helps to estimate whether a data set contains enough entries for specific training tasks.

granularity

dimension*

The granularity classifies an object regarding the scope of the elements contained in the data set.

Characteristic | Extraction Rule
word | a single token/word
phrase | a substring of a sentence
sentence | a full, grammatically correct sentence
structured/tabular text | a structured text (use case specification, user story, feature table)
user story | a structured text following the Cohn/Connextra template ('As a <user> I want to <goal> so that <justification>.')
use case | a structured text describing a set of connected scenarios in the form of consecutive steps
requirement | a structured functional, non-functional, or process requirement
section | a full, coherent section
document | a full, coherent document
global | all textual requirements artifacts associated to the product/service

accessibility

dimension*

The accessibility classifies an object regarding the degree to which it is currently available and usable.

Characteristic | Extraction Rule
open access | The dataset is hosted in a service that satisfies all of the following criteria: (1) Immutable URL: cannot be altered by the author or someone else, (2) Permanent: the hosting organization has a mission to maintain artefacts for the foreseeable future, (3) Accessible: There is a DOI pointing to the real datasource URL, (4) Open-Source License: The dataset has a proper licence which grants access and re-use of data, material, and source code.
available in paper | The dataset is small enough that the authors disclose the entire dataset in the paper itself (e.g. a set of 14 requirements, listed in a table).
reachable link | The dataset is reachable now, but is missing some aspect above to be considered Open Access.
broken link | A link is given in the paper, but does not resolve.
no link | A dataset is discussed, but no link is provided.
upon request | Authors say the dataset is available upon request.
private | The authors say that a dataset exists, but is private for some reasons (such as industry collaboration with private data, etc.).
proprietary | The dataset is available but proprietary.
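
The four link-related characteristics can be told apart mechanically once a link has been checked by hand. The sketch below encodes that decision. The names `LinkCheck` and `classify_link` are hypothetical, and the remaining characteristics (available in paper, upon request, private, proprietary) are deliberately left out because they depend on statements by the authors rather than on the link itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LinkCheck:
    """Manually collected facts about a dataset link (illustrative fields)."""
    url: Optional[str]            # None if the paper provides no link
    resolves: bool = False        # the link currently resolves
    immutable_url: bool = False   # criterion (1)
    permanent_host: bool = False  # criterion (2)
    has_doi: bool = False         # criterion (3)
    open_license: bool = False    # criterion (4)


def classify_link(check: LinkCheck) -> str:
    """Map a link check to one of the four link-related characteristics."""
    if not check.url:
        return "no link"
    if not check.resolves:
        return "broken link"
    criteria = (
        check.immutable_url,
        check.permanent_host,
        check.has_doi,
        check.open_license,
    )
    # Open access requires all four criteria; anything less is only reachable.
    return "open access" if all(criteria) else "reachable link"


# Placeholder URL: a reachable repository without DOI or permanence guarantee.
label = classify_link(
    LinkCheck(url="https://example.org/dataset", resolves=True, open_license=True)
)
# label == "reachable link"
```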

source link

scope note

The source or link is the pointer towards the location where the object can be found.


approach

An approach is an implementation of automatic detection of a violation against the formal rule which the quality attribute entails.

type

dimension*

The type of proposed solution classifies an approach regarding the general paradigm utilized to implement a detection algorithm.

Characteristic | Extraction Rule
rule-based | Violations against the quality attribute are detected based on a static set of predefined rules.
supervised ml | The detection of violations against the quality attribute is realized through a supervised machine learning approach.
unsupervised ml | The detection of violations against the quality attribute is realized through an unsupervised machine learning approach.
supervised dl | The detection of violations against the quality attribute is realized through a supervised deep learning approach.
unsupervised dl | The detection of violations against the quality attribute is realized through an unsupervised deep learning approach.

accessibility

dimension*

The accessibility classifies an approach regarding the degree to which it is available.

Characteristic | Extraction Rule
open access | The approach is hosted in a service that satisfies all of the following criteria: (1) Immutable URL: cannot be altered by the author or someone else, (2) Permanent: the hosting organization has a mission to maintain artefacts for the foreseeable future, (3) Accessible: There is a DOI pointing to the real approach URL, (4) Open-Source License: The approach has a proper licence which grants access and re-use of data, material, and source code.
open source | The approach is available for all to use and the codebase has been disclosed.
reachable link | The approach is reachable now, but is missing some aspect above to be considered Open Access.
broken link | A link is given in the paper, but does not resolve.
no link | An approach is discussed, but no link is provided.
upon request | Authors say the approach is available upon request.
private | The authors say that an approach exists, but is private for some reasons (such as industry collaboration with private data, etc.).
proprietary | The approach is available but proprietary.

source link

scope note

The source or link is the pointer towards the location where the object can be found.

empirical method applied

dimension*

This dimension determines whether an approach has been evaluated with some sort of empirical method: this can be a formal experiment comparing the efficiency of the approach, but may also appear in the form of interviews confirming the findings of the approach.

Characteristic | Extraction Rule
true | An empirical method has been applied to validate the approach.
false | The approach has simply been postulated without any empirical validation.

practitioners involved

dimension*

This dimension captures whether the application/evaluation of the approach involved actual practitioners. We currently do not differentiate whether the practitioners involved with the evaluation were also the practitioners who worked with the data set used for the evaluation.

Characteristic | Extraction Rule
true | The evaluation of the approach involved practitioners, who primarily work in industry.
false | The evaluation of the approach involved no practitioners (hence instead: authors, research staff, students, etc.).

releases

dimension cluster

The release classifies an approach regarding the type of solution that was disclosed to the public. While some approaches are disclosed in the form of executable tools, publishing the source code as well is encouraged in order to improve reuse and maintenance.

Dimension | Extraction Rule
tool | A standalone tool
webservice | An online interface hosted as a webservice
library | A library
api | An API or library
code | The source code of the approach
notebook | A (Jupyter) notebook demonstrating the approach
model | A pre-trained model (resulting from an ML/DL solution)

Characteristic | Extraction Rule
y | has been released
(blank) | has not been released

necessary information

dimension cluster

The necessary information classifies an approach regarding the type of information that needs to be available in order to automatically determine a violation against the formal rule.

Extraction rule: To automatically determine a violation against the formal rule of the quality factor,

Dimension | Extraction Rule
part-of-speech tags | an association of each token with its corresponding part-of-speech tag
dependency tags | an association of each token with the token it depends on
constituency tags | an association of each token with its parenting constituent
lemmatization | an association of each token with its lemmatized form
stemming | an association of each token with its word stem
phrase chunks | an association of phrases to containing chunks
stop word removal | the automatic removal of words that do not add value to the text
semantic role labeling | the annotation of semantic roles to parts of the text
thesaurus | a graph connecting words with synonyms
named entity recognition | the automatic recognition of named entities from noun phrases
parse tree | an acyclic graph representing the syntactical hierarchy of a sentence

Characteristic | Extraction Rule
y | is necessary
? | is unclear whether it is necessary
(blank) | is not necessary
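
When the extracted cells of this cluster are analysed later, it can help to group the dimensions by characteristic. The following sketch does so for one approach; the names `NECESSARY_INFORMATION` and `summarize` are hypothetical, and the sketch assumes that a missing cell is equivalent to a blank one.

```python
from __future__ import annotations

# The 11 dimensions of the necessary information cluster.
NECESSARY_INFORMATION = (
    "part-of-speech tags", "dependency tags", "constituency tags",
    "lemmatization", "stemming", "phrase chunks", "stop word removal",
    "semantic role labeling", "thesaurus", "named entity recognition",
    "parse tree",
)

# Cell symbol -> reading of the characteristic.
_LABELS = {"y": "necessary", "?": "unclear", "": "not necessary"}


def summarize(cells: dict[str, str]) -> dict[str, list[str]]:
    """Group the cluster's dimensions by their extracted characteristic."""
    unknown = set(cells) - set(NECESSARY_INFORMATION)
    if unknown:
        raise ValueError(f"unknown dimensions: {sorted(unknown)}")
    summary: dict[str, list[str]] = {label: [] for label in _LABELS.values()}
    for dimension in NECESSARY_INFORMATION:
        symbol = cells.get(dimension, "").strip()  # missing cell == blank
        if symbol not in _LABELS:
            raise ValueError(f"invalid characteristic {symbol!r} for {dimension}")
        summary[_LABELS[symbol]].append(dimension)
    return summary


# Hypothetical approach that needs POS tags and possibly a thesaurus.
result = summarize({"part-of-speech tags": "y", "thesaurus": "?"})
```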