sec_parser.processing_steps.introductory_section_classifier

Classes

AbstractElementwiseProcessingStep

The AbstractElementwiseProcessingStep class is used to iterate over all semantic elements, with or without applying transformations.

ElementProcessingContext

The ElementProcessingContext class is designed to provide context information for elementwise processing steps.

IntroductorySectionElement

The IntroductorySectionElement class represents elements that are part of the introductory sections of a document.

TopSectionTitle

The TopSectionTitle class represents the title and the beginning of a top-level section of a document.

IntroductorySectionElementClassifier

The IntroductorySectionElementClassifier is a processing step designed to classify elements that are located before the actual contents of the document.

Module Contents

class sec_parser.processing_steps.introductory_section_classifier.AbstractElementwiseProcessingStep(*, types_to_process: set[type[sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement]] | None = None, types_to_exclude: set[type[sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement]] | None = None)

Bases: sec_parser.processing_steps.abstract_classes.abstract_processing_step.AbstractProcessingStep

The AbstractElementwiseProcessingStep class is used to iterate over all semantic elements, with or without applying transformations.

_NUM_ITERATIONS = 1
abstract _process_element(element: sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement, context: sec_parser.processing_steps.abstract_classes.processing_context.ElementProcessingContext) → sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement

The _process_element method transforms a single semantic element into another.

It can also be used simply to iterate over all elements without applying any transformation.

_process_recursively(elements: list[sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement], *, _context: sec_parser.processing_steps.abstract_classes.processing_context.ElementProcessingContext) → list[sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement]
_process(elements: list[sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement]) → list[sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement]
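The elementwise pattern described above can be illustrated with a minimal, self-contained sketch. It does not import sec_parser: Element, TextElement, TableElement, ElementwiseStep and UppercaseText are hypothetical stand-ins, and the filtering behaviour of types_to_process and types_to_exclude is an assumption inferred from the parameter names, not a copy of the library's implementation.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class Element:
    """Hypothetical stand-in for AbstractSemanticElement."""

    def __init__(self, text: str) -> None:
        self.text = text


class TextElement(Element):
    pass


class TableElement(Element):
    pass


class ElementwiseStep(ABC):
    """Hypothetical stand-in for AbstractElementwiseProcessingStep."""

    _NUM_ITERATIONS = 1  # number of passes over the element list

    def __init__(self, *, types_to_process=None, types_to_exclude=None) -> None:
        self._types_to_process = tuple(types_to_process or ())
        self._types_to_exclude = tuple(types_to_exclude or ())

    @abstractmethod
    def _process_element(self, element: Element, iteration: int) -> Element:
        """Return a transformed element, or the same element unchanged."""

    def _should_process(self, element: Element) -> bool:
        # Assumed semantics: an empty include set means "process everything".
        if self._types_to_process and not isinstance(element, self._types_to_process):
            return False
        return not isinstance(element, self._types_to_exclude)

    def process(self, elements: list[Element]) -> list[Element]:
        for iteration in range(self._NUM_ITERATIONS):
            elements = [
                self._process_element(e, iteration) if self._should_process(e) else e
                for e in elements
            ]
        return elements


class UppercaseText(ElementwiseStep):
    """Toy step that replaces each processed element with an upper-cased copy."""

    def _process_element(self, element: Element, iteration: int) -> Element:
        return TextElement(element.text.upper())


elements = [TextElement("intro"), TableElement("t1")]
result = UppercaseText(types_to_exclude={TableElement}).process(elements)
result_texts = [e.text for e in result]
```

In this sketch the excluded TableElement passes through untouched, while the TextElement is replaced by a transformed copy.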
class sec_parser.processing_steps.introductory_section_classifier.ElementProcessingContext

The ElementProcessingContext class is designed to provide context information for elementwise processing steps.

iteration: int
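As a rough illustration of how an iteration field can be carried through several passes, the sketch below uses a hypothetical Context dataclass and a hypothetical run_passes driver. The zero-based numbering and the frozen dataclass are assumptions made for the example, not documented properties of ElementProcessingContext.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    """Hypothetical stand-in for ElementProcessingContext."""

    iteration: int  # assumed to start at 0 for the first pass


def run_passes(items, handler, num_iterations):
    """Call handler(item, context) for every item, once per pass."""
    for iteration in range(num_iterations):
        context = Context(iteration=iteration)
        items = [handler(item, context) for item in items]
    return items


def tag_with_pass(item, context):
    # Append the pass number so the effect of each pass is visible.
    return f"{item}@{context.iteration}"


tagged = run_passes(["a", "b"], tag_with_pass, num_iterations=2)
```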
class sec_parser.processing_steps.introductory_section_classifier.IntroductorySectionElement(html_tag: sec_parser.processing_engine.html_tag.HtmlTag, *, processing_log: sec_parser.processing_engine.processing_log.ProcessingLog | None = None, log_origin: sec_parser.processing_engine.processing_log.LogItemOrigin | None = None)

Bases: IrrelevantElement

The IntroductorySectionElement class represents elements that are part of the introductory sections of a document, such as the title page, disclaimers, or other preliminary information that precedes the main content of the document. This class is a subclass of the IrrelevantElement class, as these introductory sections are typically not part of the core financial data to be extracted.

class sec_parser.processing_steps.introductory_section_classifier.TopSectionTitle(html_tag: sec_parser.processing_engine.html_tag.HtmlTag, *, processing_log: sec_parser.processing_engine.processing_log.ProcessingLog | None = None, log_origin: sec_parser.processing_engine.processing_log.LogItemOrigin | None = None, level: int | None = None, section_type: sec_parser.semantic_elements.top_section_title_types.TopSectionType | None = None)

Bases: sec_parser.semantic_elements.mixins.dict_text_content_mixin.DictTextContentMixin, sec_parser.semantic_elements.top_section_start_marker.TopSectionStartMarker

The TopSectionTitle class represents the title and the beginning of a top-level section of a document. For instance, in SEC 10-Q reports, a top-level section could be “Part I, Item 3. Quantitative and Qualitative Disclosures About Market Risk”.

classmethod create_from_element(source: sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement, log_origin: sec_parser.processing_engine.processing_log.LogItemOrigin, *, level: int | None = None, section_type: sec_parser.semantic_elements.top_section_title_types.TopSectionType | None = None) → sec_parser.semantic_elements.abstract_semantic_element.AbstractLevelElement
to_dict(*, include_previews: bool = False, include_contents: bool = False) → dict[str, Any]
class sec_parser.processing_steps.introductory_section_classifier.IntroductorySectionElementClassifier(*, types_to_process: set[type[sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement]] | None = None, types_to_exclude: set[type[sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement]] | None = None)

Bases: sec_parser.processing_steps.abstract_classes.abstract_elementwise_processing_step.AbstractElementwiseProcessingStep

The IntroductorySectionElementClassifier is a processing step designed to classify elements that are located before the actual contents of the document.

For example, consider an SEC EDGAR 10-Q report. This processing step will mark all elements that appear before the ‘part1’ section.

_NUM_ITERATIONS = 2
_process_element(element: sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement, context: sec_parser.processing_steps.abstract_classes.abstract_elementwise_processing_step.ElementProcessingContext) → sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement
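One plausible way to read _NUM_ITERATIONS = 2 is as a two-pass scheme: the first pass locates the ‘part1’ title, and the second marks everything before it. The sketch below illustrates that idea only. Element, its kind and section_id fields, IntroClassifierSketch, and the explicit index argument are all hypothetical and do not reflect how sec_parser actually implements this step.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Element:
    """Hypothetical stand-in for a semantic element."""

    text: str
    kind: str = "text"  # "text", "top_section_title" or "introductory"
    section_id: str | None = None


class IntroClassifierSketch:
    """Two-pass sketch: find 'part1' first, then mark what precedes it."""

    NUM_ITERATIONS = 2

    def __init__(self) -> None:
        self._part1_index: int | None = None

    def _process_element(self, element: Element, iteration: int, index: int) -> Element:
        if iteration == 0:
            # Pass 1: remember where the first 'part1' title sits.
            is_part1 = element.kind == "top_section_title" and element.section_id == "part1"
            if self._part1_index is None and is_part1:
                self._part1_index = index
            return element
        # Pass 2: reclassify everything located before the 'part1' title.
        if self._part1_index is not None and index < self._part1_index:
            return Element(element.text, kind="introductory")
        return element

    def process(self, elements: list[Element]) -> list[Element]:
        for iteration in range(self.NUM_ITERATIONS):
            elements = [
                self._process_element(e, iteration, i) for i, e in enumerate(elements)
            ]
        return elements


doc = [
    Element("FORM 10-Q"),
    Element("Table of Contents"),
    Element("PART I", kind="top_section_title", section_id="part1"),
    Element("Item 1. Financial Statements"),
]
kinds = [e.kind for e in IntroClassifierSketch().process(doc)]
```

If no ‘part1’ title is found, this sketch leaves every element unchanged; whether the real classifier behaves the same way is not stated in this reference.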