sec_parser.processing_steps.empty_element_classifier
Exceptions
- InvalidIterationError: Raised when an invalid iteration value is encountered.
Classes
- AbstractElementwiseProcessingStep: Iterates over all semantic elements, with or without applying transformations.
- ElementProcessingContext: Provides context information for elementwise processing steps.
- EmptyElement: Represents an HTML element that does not contain any content.
- EmptyElementClassifier: Converts suitable elements into EmptyElement instances, a subclass of IrrelevantElement.
Module Contents
- class sec_parser.processing_steps.empty_element_classifier.AbstractElementwiseProcessingStep(*, types_to_process: set[type[sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement]] | None = None, types_to_exclude: set[type[sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement]] | None = None)
Bases: sec_parser.processing_steps.abstract_classes.abstract_processing_step.AbstractProcessingStep
The AbstractElementwiseProcessingStep class is used to iterate over all semantic elements, with or without applying transformations.
- _NUM_ITERATIONS = 1
- abstract _process_element(element: sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement, context: sec_parser.processing_steps.abstract_classes.processing_context.ElementProcessingContext) -> sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement
_process_element method is responsible for transforming a single semantic element into another.
It can also be utilized to simply iterate over all elements without applying any transformations.
- _process_recursively(elements: list[sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement], *, _context: sec_parser.processing_steps.abstract_classes.processing_context.ElementProcessingContext) -> list[sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement]
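The elementwise pattern described above can be sketched with local stand-in classes. This is a minimal illustration, not the library's implementation: `SemanticElement`, `TitleElement`, `Context`, `ElementwiseStep`, `UppercaseStep`, the public `process` driver, and the `_dispatch` helper are all hypothetical names, and the sketch works on a flat list rather than recursing into nested elements.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from dataclasses import dataclass


class SemanticElement:
    """Stand-in for AbstractSemanticElement (hypothetical, simplified)."""

    def __init__(self, text: str) -> None:
        self.text = text


class TitleElement(SemanticElement):
    """Hypothetical element type used to demonstrate type filtering."""


@dataclass(frozen=True)
class Context:
    """Stand-in for ElementProcessingContext."""

    iteration: int


class ElementwiseStep(ABC):
    """Simplified sketch of an elementwise processing step."""

    _NUM_ITERATIONS = 1

    def __init__(self, *, types_to_process=None, types_to_exclude=None):
        self._types_to_process = types_to_process or set()
        self._types_to_exclude = types_to_exclude or set()

    def process(self, elements):
        # Assumed driver: run each pass over the full (flat) element list.
        for iteration in range(self._NUM_ITERATIONS):
            context = Context(iteration=iteration)
            elements = [self._dispatch(e, context) for e in elements]
        return elements

    def _dispatch(self, element, context):
        # Elements filtered out by either type set pass through unchanged.
        if self._types_to_process and not isinstance(
            element, tuple(self._types_to_process)
        ):
            return element
        if isinstance(element, tuple(self._types_to_exclude)):
            return element
        return self._process_element(element, context)

    @abstractmethod
    def _process_element(self, element, context):
        """Return the element unchanged or a replacement for it."""


class UppercaseStep(ElementwiseStep):
    """Toy subclass: replaces each element with an uppercased copy."""

    def _process_element(self, element, context):
        return type(element)(element.text.upper())


step = UppercaseStep(types_to_exclude={TitleElement})
result = step.process([SemanticElement("body"), TitleElement("title")])
print([e.text for e in result])  # ['BODY', 'title']
```

A subclass only supplies `_process_element`; iteration and type filtering stay in the base class, which is the division of responsibilities the method descriptions above suggest.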
- class sec_parser.processing_steps.empty_element_classifier.ElementProcessingContext
The ElementProcessingContext class is designed to provide context information for elementwise processing steps.
- iteration: int
- class sec_parser.processing_steps.empty_element_classifier.EmptyElement(html_tag: sec_parser.processing_engine.html_tag.HtmlTag, *, processing_log: sec_parser.processing_engine.processing_log.ProcessingLog | None = None, log_origin: sec_parser.processing_engine.processing_log.LogItemOrigin | None = None)
Bases: IrrelevantElement
The EmptyElement class represents an HTML element that does not contain any content. It is a subclass of the IrrelevantElement class and is used to identify and handle empty HTML tags in the document.
- exception sec_parser.processing_steps.empty_element_classifier.InvalidIterationError
Bases: ValueError
Raised when an invalid iteration value is encountered.
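Because InvalidIterationError derives from ValueError, callers can catch it under either name. The sketch below uses local stand-ins to show this; the `check_iteration` helper and the range check it performs are assumptions made for illustration, since this page does not state when the library raises the exception.

```python
from dataclasses import dataclass


class InvalidIterationError(ValueError):
    """Local stand-in mirroring the documented exception."""


@dataclass(frozen=True)
class Context:
    """Stand-in for ElementProcessingContext."""

    iteration: int


def check_iteration(context: Context, num_iterations: int) -> None:
    # Hypothetical validation: the iteration index must fall in
    # the half-open range [0, num_iterations).
    if not 0 <= context.iteration < num_iterations:
        msg = f"Invalid iteration: {context.iteration}"
        raise InvalidIterationError(msg)


caught = None
try:
    check_iteration(Context(iteration=2), num_iterations=1)
except ValueError as error:  # also catches InvalidIterationError
    caught = error

print(type(caught).__name__)  # InvalidIterationError
```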
- class sec_parser.processing_steps.empty_element_classifier.EmptyElementClassifier(*, types_to_process: set[type[sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement]] | None = None, types_to_exclude: set[type[sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement]] | None = None)
EmptyElementClassifier class for converting elements into EmptyElement instances (EmptyElement is a subclass of IrrelevantElement).
This step scans through a list of semantic elements and modifies it, primarily by replacing suitable candidates with EmptyElement instances.
- _process_element(element: sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement, _: sec_parser.processing_steps.abstract_classes.abstract_elementwise_processing_step.ElementProcessingContext) -> sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement
Transform a single semantic element into an EmptyElement if applicable.
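The classification step can be sketched as follows, again with local stand-ins rather than the library's classes. The criterion used here (the tag's text is empty after stripping whitespace) is an assumption, as are the simplified `HtmlTag` with a `text` attribute and the `classify_empty` function; the page above does not specify how "applicable" is decided.

```python
class HtmlTag:
    """Hypothetical stand-in that exposes only a tag's text."""

    def __init__(self, text: str) -> None:
        self.text = text


class SemanticElement:
    """Stand-in for AbstractSemanticElement."""

    def __init__(self, html_tag: HtmlTag) -> None:
        self.html_tag = html_tag


class IrrelevantElement(SemanticElement):
    """Stand-in for the documented IrrelevantElement base class."""


class EmptyElement(IrrelevantElement):
    """Stand-in for the documented EmptyElement class."""


def classify_empty(element: SemanticElement) -> SemanticElement:
    # Already classified: leave the element as it is.
    if isinstance(element, EmptyElement):
        return element
    # Assumed criterion: no text once whitespace is stripped.
    if element.html_tag.text.strip() == "":
        return EmptyElement(element.html_tag)
    return element


elements = [SemanticElement(HtmlTag("Item 1")), SemanticElement(HtmlTag("  \n"))]
classified = [classify_empty(e) for e in elements]
print([type(e).__name__ for e in classified])  # ['SemanticElement', 'EmptyElement']
```

Non-empty elements are returned as the same object, so the step can run over a whole document without copying elements it does not change.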