sec_parser.semantic_elements.title_element

Classes

AbstractLevelElement

The AbstractLevelElement class provides a level attribute to semantic elements.

DictTextContentMixin

In the domain of HTML parsing, especially in the context of SEC EDGAR documents, a semantic element refers to a meaningful unit within the document that serves a specific purpose.

TitleElement

The TitleElement class represents the title of a paragraph or other content object.

Module Contents

class sec_parser.semantic_elements.title_element.AbstractLevelElement(html_tag: sec_parser.processing_engine.html_tag.HtmlTag, *, processing_log: sec_parser.processing_engine.processing_log.ProcessingLog | None = None, level: int | None = None, log_origin: sec_parser.processing_engine.processing_log.LogItemOrigin | None = None)

Bases: AbstractSemanticElement

The AbstractLevelElement class provides a level attribute to semantic elements. It represents hierarchical levels in the document structure. For instance, a main section title might be at level 1, a subsection at level 2, etc.

MIN_LEVEL = 0

classmethod create_from_element(source: AbstractSemanticElement, log_origin: sec_parser.processing_engine.processing_log.LogItemOrigin, *, level: int | None = None) -> AbstractLevelElement

Convert the semantic element into another semantic element type.

to_dict(*, include_previews: bool = False, include_contents: bool = False) -> dict[str, Any]

__repr__() -> str

Return repr(self).
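
To make the level semantics concrete, here is a minimal, self-contained sketch. It does not import sec_parser: `PlainElement` and `LevelledElement` are hypothetical stand-ins, and the rules shown (a missing level falling back to `MIN_LEVEL`, values below `MIN_LEVEL` being rejected) are assumptions made for illustration rather than confirmed library behavior. The real `create_from_element` also takes a `log_origin` argument, which is omitted here.

```python
from typing import Optional


class PlainElement:
    """Hypothetical stand-in for a generic semantic element."""

    def __init__(self, text: str) -> None:
        self.text = text


class LevelledElement(PlainElement):
    """Hypothetical stand-in for an element that carries a hierarchy level."""

    MIN_LEVEL = 0

    def __init__(self, text: str, *, level: Optional[int] = None) -> None:
        super().__init__(text)
        # Assumption: a missing level falls back to MIN_LEVEL.
        resolved = self.MIN_LEVEL if level is None else level
        # Assumption: levels below MIN_LEVEL are rejected.
        if resolved < self.MIN_LEVEL:
            raise ValueError(f"level must be >= {self.MIN_LEVEL}, got {resolved}")
        self.level = resolved

    @classmethod
    def create_from_element(
        cls, source: PlainElement, *, level: Optional[int] = None
    ) -> "LevelledElement":
        # Re-wrap the source element's content in the target type.
        return cls(source.text, level=level)

    def __repr__(self) -> str:
        return f"{type(self).__name__}<L{self.level}>({self.text!r})"


heading = LevelledElement.create_from_element(
    PlainElement("Item 1A. Risk Factors"), level=1
)
print(heading)  # LevelledElement<L1>('Item 1A. Risk Factors')
```

Using a classmethod for the conversion lets subclasses inherit it unchanged, since `cls` resolves to whichever subclass the method is called on.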

class sec_parser.semantic_elements.title_element.DictTextContentMixin(html_tag: sec_parser.processing_engine.html_tag.HtmlTag, *, processing_log: sec_parser.processing_engine.processing_log.ProcessingLog | None = None, log_origin: sec_parser.processing_engine.processing_log.LogItemOrigin | None = None)

Bases: sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement

In the domain of HTML parsing, especially in the context of SEC EDGAR documents, a semantic element refers to a meaningful unit within the document that serves a specific purpose. For example, a paragraph or a table might be considered a semantic element. Unlike syntactic elements, which merely exist to structure the HTML, semantic elements carry information that is vital to the understanding of the document’s content.

This class serves as a foundational representation of such semantic elements, containing an HtmlTag object that stores the raw HTML tag information. Subclasses will implement additional behaviors based on the type of the semantic element.

to_dict(*, include_previews: bool = False, include_contents: bool = False) -> dict[str, Any]

class sec_parser.semantic_elements.title_element.TitleElement

Bases: sec_parser.semantic_elements.mixins.dict_text_content_mixin.DictTextContentMixin, sec_parser.semantic_elements.abstract_semantic_element.AbstractLevelElement

The TitleElement class represents the title of a paragraph or other content object. It serves as a semantic marker, providing context and structure to the document.