sec_parser.processing_steps.introductory_section_classifier

Classes

`AbstractElementwiseProcessingStep`	The AbstractElementwiseProcessingStep class is used to iterate over all Semantic Elements with or without applying transformations.
`ElementProcessingContext`	The ElementProcessingContext class is designed to provide context information for elementwise processing steps.
`IntroductorySectionElement`	The IntroductorySectionElement class represents elements that are part of the introductory sections of a document.
`TopSectionTitle`	The TopSectionTitle class represents the title and the beginning of a top-level section of a document.
`IntroductorySectionElementClassifier`	The IntroductorySectionElementClassifier is a processing step designed to classify elements that are located before the actual contents of the document.

Module Contents

class sec_parser.processing_steps.introductory_section_classifier.AbstractElementwiseProcessingStep(*, types_to_process: set[type[sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement]] | None = None, types_to_exclude: set[type[sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement]] | None = None)

Bases: sec_parser.processing_steps.abstract_classes.abstract_processing_step.AbstractProcessingStep

The AbstractElementwiseProcessingStep class is used to iterate over all Semantic Elements with or without applying transformations.

_NUM_ITERATIONS = 1

abstract _process_element(element: sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement, context: sec_parser.processing_steps.abstract_classes.processing_context.ElementProcessingContext) → sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement

The _process_element method transforms a single semantic element into another.

It can also be utilized to simply iterate over all elements without applying any transformations.

_process_recursively(elements: list[sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement], *, _context: sec_parser.processing_steps.abstract_classes.processing_context.ElementProcessingContext) → list[sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement]

_process(elements: list[sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement]) → list[sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement]
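The following is a minimal, self-contained sketch of the elementwise pattern described above. It does not import sec_parser: every class name here (SketchElement, SketchTitle, SketchContext, SketchElementwiseStep, UppercaseStep) is a hypothetical stand-in, and the isinstance-based handling of types_to_process and types_to_exclude is an assumption about how such filters could work, not a description of the library's internals.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class SketchContext:
    """Hypothetical stand-in for ElementProcessingContext."""

    iteration: int


class SketchElement:
    """Hypothetical stand-in for a semantic element."""

    def __init__(self, text: str) -> None:
        self.text = text


class SketchTitle(SketchElement):
    """Hypothetical element type that we will exclude from processing."""


class SketchElementwiseStep(ABC):
    """Visits every element once per iteration and delegates to _process_element."""

    _NUM_ITERATIONS = 1

    def __init__(
        self,
        *,
        types_to_process: set[type] | None = None,
        types_to_exclude: set[type] | None = None,
    ) -> None:
        self._types_to_process = types_to_process or set()
        self._types_to_exclude = types_to_exclude or set()

    def process(self, elements: list[SketchElement]) -> list[SketchElement]:
        for iteration in range(self._NUM_ITERATIONS):
            context = SketchContext(iteration=iteration)
            elements = [self._maybe_process(e, context) for e in elements]
        return elements

    def _maybe_process(self, element: SketchElement, context: SketchContext) -> SketchElement:
        # Assumption: filters are isinstance checks; skipped elements pass through unchanged.
        if self._types_to_process and not isinstance(element, tuple(self._types_to_process)):
            return element
        if self._types_to_exclude and isinstance(element, tuple(self._types_to_exclude)):
            return element
        return self._process_element(element, context)

    @abstractmethod
    def _process_element(self, element: SketchElement, context: SketchContext) -> SketchElement:
        """Return a transformed element, or the same element to leave it as-is."""


class UppercaseStep(SketchElementwiseStep):
    def _process_element(self, element: SketchElement, context: SketchContext) -> SketchElement:
        return type(element)(element.text.upper())


step = UppercaseStep(types_to_exclude={SketchTitle})
result = step.process([SketchElement("intro"), SketchTitle("Part I")])
print([e.text for e in result])  # ['INTRO', 'Part I']
```

In this sketch a subclass only implements _process_element; the base class owns the loop, the filters, and the per-iteration context.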

class sec_parser.processing_steps.introductory_section_classifier.ElementProcessingContext

The ElementProcessingContext class is designed to provide context information for elementwise processing steps.

iteration: int

class sec_parser.processing_steps.introductory_section_classifier.IntroductorySectionElement(html_tag: sec_parser.processing_engine.html_tag.HtmlTag, *, processing_log: sec_parser.processing_engine.processing_log.ProcessingLog | None = None, log_origin: sec_parser.processing_engine.processing_log.LogItemOrigin | None = None)

Bases: IrrelevantElement

The IntroductorySectionElement class represents elements that are part of the introductory sections of a document, such as the title page, disclaimers, or other preliminary information that precedes the main content of the document. This class is a subclass of the IrrelevantElement class, as these introductory sections are typically not part of the core financial data to be extracted.
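Because introductory elements are a kind of irrelevant element, a consumer can drop them with a single isinstance check on the parent type. The sketch below illustrates that idea with hypothetical stand-in classes (IrrelevantSketch, IntroductorySketch, TextSketch); it does not use the real sec_parser classes or constructors.

```python
class IrrelevantSketch:
    """Hypothetical stand-in for IrrelevantElement."""

    def __init__(self, text: str) -> None:
        self.text = text


class IntroductorySketch(IrrelevantSketch):
    """Hypothetical stand-in for IntroductorySectionElement."""


class TextSketch:
    """Hypothetical stand-in for an element that carries real content."""

    def __init__(self, text: str) -> None:
        self.text = text


elements = [
    IntroductorySketch("FORM 10-Q cover page"),
    IrrelevantSketch("page footer"),
    TextSketch("Net sales increased during the quarter."),
]

# One check against the parent type filters out both kinds of irrelevant element.
relevant = [e for e in elements if not isinstance(e, IrrelevantSketch)]
print([e.text for e in relevant])  # ['Net sales increased during the quarter.']
```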

class sec_parser.processing_steps.introductory_section_classifier.TopSectionTitle(html_tag: sec_parser.processing_engine.html_tag.HtmlTag, *, processing_log: sec_parser.processing_engine.processing_log.ProcessingLog | None = None, log_origin: sec_parser.processing_engine.processing_log.LogItemOrigin | None = None, level: int | None = None, section_type: sec_parser.semantic_elements.top_section_title_types.TopSectionType | None = None)

Bases: sec_parser.semantic_elements.mixins.dict_text_content_mixin.DictTextContentMixin, sec_parser.semantic_elements.top_section_start_marker.TopSectionStartMarker

The TopSectionTitle class represents the title and the beginning of a top-level section of a document. For instance, in SEC 10-Q reports, a top-level section could be “Part I, Item 3. Quantitative and Qualitative Disclosures About Market Risk”.

classmethod create_from_element(source: sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement, log_origin: sec_parser.processing_engine.processing_log.LogItemOrigin, *, level: int | None = None, section_type: sec_parser.semantic_elements.top_section_title_types.TopSectionType | None = None) → sec_parser.semantic_elements.abstract_semantic_element.AbstractLevelElement

to_dict(*, include_previews: bool = False, include_contents: bool = False) → dict[str, Any]

class sec_parser.processing_steps.introductory_section_classifier.IntroductorySectionElementClassifier(*, types_to_process: set[type[sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement]] | None = None, types_to_exclude: set[type[sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement]] | None = None)

Bases: sec_parser.processing_steps.abstract_classes.abstract_elementwise_processing_step.AbstractElementwiseProcessingStep

The IntroductorySectionElementClassifier is a processing step designed to classify elements that are located before the actual contents of the document.

For example, in an SEC EDGAR 10-Q report, this processing step marks all elements that appear before the ‘part1’ section.

_NUM_ITERATIONS = 2

_process_element(element: sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement, context: sec_parser.processing_steps.abstract_classes.abstract_elementwise_processing_step.ElementProcessingContext) → sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement
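One plausible reason for a classifier like this to declare two iterations is that it must first locate where the main content starts and only then mark what precedes it. The sketch below illustrates that two-pass idea under stated assumptions: all class names (Element, TopTitle, Introductory, Context, TwoPassIntroClassifier) are hypothetical stand-ins, and the logic is an illustration rather than the library's actual implementation.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Element:
    """Hypothetical stand-in for a generic semantic element."""

    text: str


class TopTitle(Element):
    """Hypothetical stand-in for TopSectionTitle."""


class Introductory(Element):
    """Hypothetical stand-in for IntroductorySectionElement."""


@dataclass
class Context:
    """Hypothetical stand-in for ElementProcessingContext."""

    iteration: int


class TwoPassIntroClassifier:
    _NUM_ITERATIONS = 2

    def __init__(self) -> None:
        self._first_title: Element | None = None
        self._reached_title = False

    def _process_element(self, element: Element, context: Context) -> Element:
        if context.iteration == 0:
            # Pass 1: remember the first top-level title, change nothing.
            if self._first_title is None and isinstance(element, TopTitle):
                self._first_title = element
            return element

        # Pass 2: everything before the remembered title becomes introductory.
        if self._first_title is None:
            return element  # no top-level section found, so nothing is marked
        if element is self._first_title:
            self._reached_title = True
        if self._reached_title:
            return element
        return Introductory(element.text)

    def process(self, elements: list[Element]) -> list[Element]:
        for iteration in range(self._NUM_ITERATIONS):
            context = Context(iteration=iteration)
            elements = [self._process_element(e, context) for e in elements]
        return elements


document = [
    Element("UNITED STATES SECURITIES AND EXCHANGE COMMISSION"),
    Element("FORM 10-Q"),
    TopTitle("PART I"),
    Element("Item 1. Financial Statements"),
]
classified = TwoPassIntroClassifier().process(document)
print([type(e).__name__ for e in classified])
# ['Introductory', 'Introductory', 'TopTitle', 'Element']
```

The first pass returns each element unchanged, so the identity check in the second pass still matches the remembered title.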