QAParser.settings describes how fields can be parsed and extracted out of QAPairs. The basis of QAParser is the QAField, which describes an individual normalized value to be extracted.
During extraction, QAFields are iterated in sequence for each QAPair to attempt extraction. As such, for non-array fields (i.e., is_array = false), values extracted from later QAFields override earlier ones.
For array QAFields (i.e., is_array = true), the extracted values are concatenated together.
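The precedence rule above can be sketched as follows. This is a minimal illustration, not the parser's actual implementation: the function name merge_values and the (key, is_array, value) tuple shape are hypothetical, chosen only to show override versus concatenation.

```python
from typing import Any


def merge_values(extractions: list[tuple[str, bool, Any]]) -> dict[str, Any]:
    """Combine extraction results in the order the fields were iterated.

    Each tuple is (key, is_array, value); this shape is an assumption
    made for the example, not part of the documented schema.
    """
    result: dict[str, Any] = {}
    for key, is_array, value in extractions:
        if is_array:
            # Array fields: concatenate onto what was collected so far.
            values = value if isinstance(value, list) else [value]
            result.setdefault(key, []).extend(values)
        else:
            # Non-array fields: a later extraction overrides an earlier one.
            result[key] = value
    return result


merged = merge_values([
    ("name", False, "Alice"),
    ("tags", True, ["a"]),
    ("name", False, "Bob"),
    ("tags", True, ["b", "c"]),
])
print(merged)  # {'name': 'Bob', 'tags': ['a', 'b', 'c']}
```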
An array of Field items describing how and what to extract from a QAPair.
All items must be unique.
No additional items.
A field describes how to identify and extract the value from a Question-Answer pair.
No additional properties.
The extracted value will be associated with this key.
Must be at least 1 character long.
Regular expression pattern that will be used to search the question text. We use Python regular expressions and re.search for matching.
Must be at least 1 character long.
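Because matching uses re.search, a pattern can hit anywhere in the question text and does not need to cover the whole string. The short sketch below contrasts this with re.match; the question text and pattern are made-up examples.

```python
import re

# Hypothetical question text and pattern, for illustration only.
question = "What is your date of birth?"
pattern = r"date of birth"

# re.search scans the whole string for the first location that matches.
found_anywhere = re.search(pattern, question) is not None

# re.match only succeeds if the pattern matches at the start of the string.
found_at_start = re.match(pattern, question) is not None

print(found_anywhere, found_at_start)  # True False
```

To require a match at the start or end of the question, anchor the pattern explicitly with ^ or $.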
Type of the extracted value. Defaults to string.
- For number, we will convert the value to float.
- For boolean, we support handling of non-zero numbers, yes, and true. All other values are deemed as false.
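The conversion rules for number and boolean could look roughly like the sketch below. The function name coerce is hypothetical, and several details are assumptions not stated above: case-insensitive comparison, whitespace trimming, and treating numeric strings such as "2" as numbers.

```python
from typing import Any


def coerce(value: Any, value_type: str = "string") -> Any:
    """Convert a raw answer to the requested type (illustrative only)."""
    if value_type == "number":
        return float(value)
    if value_type == "boolean":
        if isinstance(value, bool):
            return value
        if isinstance(value, (int, float)):
            return value != 0  # non-zero numbers count as true
        text = str(value).strip().lower()  # assumption: case-insensitive
        if text in ("yes", "true"):
            return True
        try:
            return float(text) != 0  # assumption: numeric strings count too
        except ValueError:
            return False  # all other values are deemed false
    if value_type == "string":
        return str(value)
    # Other types are outside the scope of this sketch.
    return value


print(coerce("3.5", "number"))   # 3.5
print(coerce("Yes", "boolean"))  # True
print(coerce("no", "boolean"))   # False
```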
If is_array and there are multiple question-field matches, the entries will be grouped together into a list. If not is_array and the answer value is already a list, only the first element will be used. Defaults to false.
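A minimal sketch of how a single answer might be shaped by is_array. The helper name apply_is_array is hypothetical; wrapping a scalar in a list for array fields and returning None for an empty list are assumptions, since the description does not cover those cases.

```python
from typing import Any


def apply_is_array(answer: Any, is_array: bool = False) -> Any:
    """Shape one answer value according to is_array (illustrative only)."""
    if is_array:
        # Assumption: scalars are wrapped so array fields always yield a list.
        return answer if isinstance(answer, list) else [answer]
    if isinstance(answer, list):
        # Not an array field: only the first element is used.
        # Assumption: an empty list yields None.
        return answer[0] if answer else None
    return answer


print(apply_is_array(["red", "blue"]))        # red
print(apply_is_array("red", is_array=True))   # ['red']
```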
Set this to true to use case-sensitive regex search. Defaults to false.
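One plausible way to implement this switch is to pass re.IGNORECASE whenever case sensitivity is off. The sketch below assumes that approach; the function name question_matches and its parameter names are hypothetical.

```python
import re


def question_matches(pattern: str, question: str, case_sensitive: bool = False) -> bool:
    """Return True if the pattern is found anywhere in the question text."""
    # Assumption: the case-insensitive default maps to re.IGNORECASE.
    flags = 0 if case_sensitive else re.IGNORECASE
    return re.search(pattern, question, flags) is not None


print(question_matches("email", "Email address"))                       # True
print(question_matches("email", "Email address", case_sensitive=True))  # False
```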
Action to take during parser error. Defaults to raise.
Default value to use when this field is not set. Defaults to null.
Normalize value to a specific set of choices using regex. The first matching regex in this array will be used as the normalized choice. If no matches are found, the original string value is used. Only applies to string type.
Must contain a minimum of 0 items.
Regular expression to match for normalization.
Must be at least 1 character long.
Choice to be returned by the parser.
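The first-match-wins normalization described above can be sketched as below. The dictionary keys "regex" and "choice" and the function name normalize_choice are hypothetical, and the use of re.search for this step is an assumption; the description only states that the first matching regex determines the returned choice.

```python
import re


def normalize_choice(value: str, choices: list[dict]) -> str:
    """Return the choice of the first matching regex, else the original value."""
    for entry in choices:
        # Assumption: re.search is used here, as it is for question matching.
        if re.search(entry["regex"], value):
            return entry["choice"]
    return value  # no match: keep the original string


# Hypothetical choice list; order matters because the first match wins.
gender_choices = [
    {"regex": r"(?i)^m(ale)?$", "choice": "male"},
    {"regex": r"(?i)^f(emale)?$", "choice": "female"},
]

print(normalize_choice("M", gender_choices))       # male
print(normalize_choice("Female", gender_choices))  # female
print(normalize_choice("other", gender_choices))   # other
```

Anchoring the patterns with ^ and $ matters in this example: an unanchored "male" pattern would also match "female" and shadow the second entry.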
Use the given format string for parsing date/datetime. If not set, parsing falls back to dateutil.parser.parse(value, dayfirst=True, default=today_midnight).
Must be at least 1 character long.
Use the given format string for parsing time. If not set, parsing falls back to dateutil.parser.parse(value, dayfirst=True, default=today_midnight).
Must be at least 1 character long.
For date-time fields that are date only, this will be the default time added to it. Defaults to 08:00.
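As a rough illustration of how an explicit date format and the default time could combine, the sketch below parses a date-only value and attaches 08:00. It assumes the format string follows datetime.strptime conventions, which is not stated above, and the function name is hypothetical. The dateutil fallback is not shown.

```python
from datetime import datetime


def parse_date_with_default_time(
    value: str, date_format: str, default_time: str = "08:00"
) -> datetime:
    """Parse a date-only string and attach the default time (illustrative only)."""
    # Assumption: the configured format uses strptime-style directives.
    parsed_date = datetime.strptime(value, date_format).date()
    fallback_time = datetime.strptime(default_time, "%H:%M").time()
    return datetime.combine(parsed_date, fallback_time)


print(parse_date_with_default_time("25/12/2023", "%d/%m/%Y"))
# 2023-12-25 08:00:00
```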
Ensures that the timezone of any parsed DATETIME or TIME field is localized to this timezone. Defaults to UTC.
Must be at least 1 character long.
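The exact localization behavior is not spelled out above. The sketch below shows one plausible reading, under the assumption that naive values have the timezone attached while already-aware values are converted. The function name localize is hypothetical, and only the UTC default is demonstrated.

```python
from datetime import datetime, timedelta, timezone, tzinfo


def localize(value: datetime, tz: tzinfo = timezone.utc) -> datetime:
    """Ensure a datetime carries the target timezone (illustrative only)."""
    if value.tzinfo is None:
        # Assumption: naive values are interpreted as already being in tz.
        return value.replace(tzinfo=tz)
    # Assumption: aware values are converted to tz, preserving the instant.
    return value.astimezone(tz)


naive = datetime(2023, 1, 1, 8, 0)
aware = datetime(2023, 1, 1, 8, 0, tzinfo=timezone(timedelta(hours=8)))

print(localize(naive))  # 2023-01-01 08:00:00+00:00
print(localize(aware))  # 2023-01-01 00:00:00+00:00
```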