sec_parser.processing_steps.title_classifier
============================================

.. py:module:: sec_parser.processing_steps.title_classifier


Classes
-------

.. autoapisummary::

   sec_parser.processing_steps.title_classifier.TitleClassifier


Module Contents
---------------

.. py:class:: TitleClassifier(types_to_process: set[type[sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement]] | None = None, types_to_exclude: set[type[sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement]] | None = None)

   Bases: :py:obj:`sec_parser.processing_steps.abstract_classes.abstract_elementwise_processing_step.AbstractElementwiseProcessingStep`

   TitleClassifier converts elements into TitleElement instances by scanning
   a list of semantic elements and replacing suitable candidates.

   The ``_unique_styles_by_order`` tuple:

   - Represents an ordered set of unique styles found in the document.
   - Preserves the order of insertion, which determines the hierarchical
     level of each style.
   - Assumes that earlier "highlight" styles correspond to higher-level
     paragraph or section headings.


   .. py:attribute:: _unique_styles_by_order
      :type:  tuple[sec_parser.semantic_elements.highlighted_text_element.TextStyle, Ellipsis]
      :value: ()



   .. py:method:: _add_unique_style(style: sec_parser.semantic_elements.highlighted_text_element.TextStyle) -> None

      Add a new unique style if not already present.



   .. py:method:: _process_element(element: sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement, _: sec_parser.processing_steps.abstract_classes.abstract_elementwise_processing_step.ElementProcessingContext) -> sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement

      Process each element and convert to TitleElement if necessary.
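
The insertion-order behaviour described for ``_unique_styles_by_order`` can be illustrated with a small, self-contained sketch. It does not import ``sec_parser``. The ``TextStyle`` dataclass here is a hypothetical stand-in with assumed fields, and ``StyleRegistry``, ``add_unique_style`` and ``level_of`` are illustrative names rather than part of the library. The sketch only shows how a tuple of first-seen styles can map each style to a heading level. It is not the library's actual implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextStyle:
    """Hypothetical stand-in for sec_parser's TextStyle (fields are assumed)."""

    is_bold: bool = False
    is_italic: bool = False
    is_all_uppercase: bool = False


class StyleRegistry:
    """Illustrative helper: tracks unique styles in first-seen order."""

    def __init__(self) -> None:
        self._unique_styles_by_order: tuple[TextStyle, ...] = ()

    def add_unique_style(self, style: TextStyle) -> None:
        # Append the style only if it has not been seen before,
        # so the tuple keeps its insertion order.
        if style not in self._unique_styles_by_order:
            self._unique_styles_by_order = (*self._unique_styles_by_order, style)

    def level_of(self, style: TextStyle) -> int:
        # Styles seen earlier get lower indices, which this sketch
        # treats as higher-level headings.
        self.add_unique_style(style)
        return self._unique_styles_by_order.index(style)


registry = StyleRegistry()
bold = TextStyle(is_bold=True)
italic = TextStyle(is_italic=True)

# The bold style appears first, so it is assigned level 0 every time it recurs.
levels = [registry.level_of(style) for style in (bold, italic, bold)]
print(levels)  # [0, 1, 0]
```

Rebuilding the immutable tuple on each addition keeps lookups simple and makes the ordering explicit, at the cost of a linear membership check that is negligible for the handful of distinct styles a document typically contains.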