sec_parser.processing_steps.introductory_section_classifier
Classes
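- AbstractElementwiseProcessingStep: Iterates over all semantic elements, with or without applying transformations.
- ElementProcessingContext: Provides context information for elementwise processing steps.
- IntroductorySectionElement: Represents elements that are part of the introductory sections of a document, such as the title page, disclaimers, or other preliminary information.
- TopSectionTitle: Represents the title and the beginning of a top-level section of a document.
- IntroductorySectionElementClassifier: A processing step that classifies elements located before the actual contents of the document.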
Module Contents
- class sec_parser.processing_steps.introductory_section_classifier.AbstractElementwiseProcessingStep(*, types_to_process: set[type[sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement]] | None = None, types_to_exclude: set[type[sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement]] | None = None)
Bases: sec_parser.processing_steps.abstract_classes.abstract_processing_step.AbstractProcessingStep
The AbstractElementwiseProcessingStep class iterates over all semantic elements, with or without applying transformations.
- _NUM_ITERATIONS = 1
- abstract _process_element(element: sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement, context: sec_parser.processing_steps.abstract_classes.processing_context.ElementProcessingContext) -> sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement
The _process_element method transforms a single semantic element into another.
It can also be used simply to iterate over all elements without applying any transformation.
- _process_recursively(elements: list[sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement], *, _context: sec_parser.processing_steps.abstract_classes.processing_context.ElementProcessingContext) -> list[sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement]
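The elementwise pattern described above can be sketched without the library itself. The following is a minimal, self-contained illustration, not the sec_parser implementation: Element, Title, Context, ElementwiseStep, UppercaseTitleClassifier, and the process method are hypothetical stand-ins, and the filtering rule for types_to_process and types_to_exclude is an assumption based only on the parameter names.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Element:
    """Hypothetical stand-in for AbstractSemanticElement."""

    text: str


@dataclass
class Title(Element):
    """Hypothetical element subtype used to show a transformation."""


@dataclass
class Context:
    """Hypothetical stand-in for ElementProcessingContext."""

    iteration: int


class ElementwiseStep(ABC):
    """Simplified stand-in for an elementwise processing step."""

    _NUM_ITERATIONS = 1

    def __init__(self, *, types_to_process=None, types_to_exclude=None):
        self._types_to_process = tuple(types_to_process or ())
        self._types_to_exclude = tuple(types_to_exclude or ())

    def process(self, elements):
        # Run the configured number of passes over the whole element list.
        for iteration in range(self._NUM_ITERATIONS):
            context = Context(iteration=iteration)
            elements = [self._maybe_process(e, context) for e in elements]
        return elements

    def _maybe_process(self, element, context):
        # Assumed rule: an allow-list, when given, restricts processing;
        # excluded types are always passed through unchanged.
        if self._types_to_process and not isinstance(element, self._types_to_process):
            return element
        if isinstance(element, self._types_to_exclude):
            return element
        return self._process_element(element, context)

    @abstractmethod
    def _process_element(self, element, context):
        """Return the element unchanged or a replacement element."""


class UppercaseTitleClassifier(ElementwiseStep):
    """Hypothetical subclass: treats all-uppercase text as a title."""

    def _process_element(self, element, context):
        if element.text.isupper():
            return Title(element.text)
        return element


elements = [Element("RISK FACTORS"), Element("Some paragraph.")]
result = UppercaseTitleClassifier(types_to_exclude={Title}).process(elements)
```

In this sketch a subclass only has to implement _process_element; iteration, context creation, and type filtering stay in the base class.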
- class sec_parser.processing_steps.introductory_section_classifier.ElementProcessingContext
The ElementProcessingContext class is designed to provide context information for elementwise processing steps.
- iteration: int
- class sec_parser.processing_steps.introductory_section_classifier.IntroductorySectionElement(html_tag: sec_parser.processing_engine.html_tag.HtmlTag, *, processing_log: sec_parser.processing_engine.processing_log.ProcessingLog | None = None, log_origin: sec_parser.processing_engine.processing_log.LogItemOrigin | None = None)
Bases: IrrelevantElement
The IntroductorySectionElement class represents elements that are part of the introductory sections of a document, such as the title page, disclaimers, or other preliminary information that precedes the main content of the document. This class is a subclass of the IrrelevantElement class, as these introductory sections are typically not part of the core financial data to be extracted.
- class sec_parser.processing_steps.introductory_section_classifier.TopSectionTitle(html_tag: sec_parser.processing_engine.html_tag.HtmlTag, *, processing_log: sec_parser.processing_engine.processing_log.ProcessingLog | None = None, log_origin: sec_parser.processing_engine.processing_log.LogItemOrigin | None = None, level: int | None = None, section_type: sec_parser.semantic_elements.top_section_title_types.TopSectionType | None = None)
Bases: sec_parser.semantic_elements.mixins.dict_text_content_mixin.DictTextContentMixin, sec_parser.semantic_elements.top_section_start_marker.TopSectionStartMarker
The TopSectionTitle class represents the title and the beginning of a top-level section of a document. For instance, in SEC 10-Q reports, a top-level section could be “Part I, Item 3. Quantitative and Qualitative Disclosures About Market Risk”.
- classmethod create_from_element(source: sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement, log_origin: sec_parser.processing_engine.processing_log.LogItemOrigin, *, level: int | None = None, section_type: sec_parser.semantic_elements.top_section_title_types.TopSectionType | None = None) -> sec_parser.semantic_elements.abstract_semantic_element.AbstractLevelElement
- to_dict(*, include_previews: bool = False, include_contents: bool = False) -> dict[str, Any]
- class sec_parser.processing_steps.introductory_section_classifier.IntroductorySectionElementClassifier(*, types_to_process: set[type[sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement]] | None = None, types_to_exclude: set[type[sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement]] | None = None)
The IntroductorySectionElementClassifier is a processing step that classifies elements located before the actual contents of the document.
For example, in an SEC EDGAR 10-Q report, this processing step marks all elements that appear before the ‘part1’ section.
- _NUM_ITERATIONS = 2
- _process_element(element: sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement, context: sec_parser.processing_steps.abstract_classes.abstract_elementwise_processing_step.ElementProcessingContext) -> sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement
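One way a classifier with _NUM_ITERATIONS = 2 could work is a two-pass scheme: the first pass records whether a ‘part1’ title exists, and the second pass marks everything that precedes it. The sketch below illustrates that idea only; whether the library follows this exact scheme is an assumption. The Element class, the section_id field, TwoPassIntroClassifier, and the process method are hypothetical stand-ins, not sec_parser APIs.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Element:
    """Hypothetical stand-in for AbstractSemanticElement."""

    text: str


@dataclass
class TopSectionTitle(Element):
    """Simplified stand-in; section_id is a hypothetical field."""

    section_id: str = ""


@dataclass
class IntroductorySectionElement(Element):
    """Simplified stand-in for the introductory element type."""


class TwoPassIntroClassifier:
    """Illustrative two-pass classifier (assumed scheme, not the library's code)."""

    _NUM_ITERATIONS = 2

    def __init__(self):
        self._has_part1 = False   # set during the first pass
        self._seen_part1 = False  # reset at the start of every pass

    def process(self, elements):
        for iteration in range(self._NUM_ITERATIONS):
            self._seen_part1 = False
            elements = [self._process_element(e, iteration) for e in elements]
        return elements

    def _process_element(self, element, iteration):
        is_part1 = (
            isinstance(element, TopSectionTitle) and element.section_id == "part1"
        )
        if is_part1:
            self._seen_part1 = True
        if iteration == 0:
            # First pass: only record whether a 'part1' title exists.
            self._has_part1 = self._has_part1 or is_part1
            return element
        # Second pass: mark elements that come before the 'part1' title.
        if self._has_part1 and not self._seen_part1:
            return IntroductorySectionElement(element.text)
        return element


report = [
    Element("Cover page"),
    Element("Table of contents"),
    TopSectionTitle("Part I", section_id="part1"),
    Element("Item 1. Financial Statements"),
]
classified = TwoPassIntroClassifier().process(report)

# A document without a 'part1' title is left unchanged.
untouched = TwoPassIntroClassifier().process([Element("Only text")])
```

The first pass matters for documents that lack a ‘part1’ title: a single-pass version would have no way to know that, and would mark every element as introductory.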