sec_parser.semantic_tree.tree_builder
Classes
- TitleElement: The TitleElement class represents the title of a paragraph or other content object.
- TopSectionStartMarker: The TopSectionStartMarker class represents the beginning of a top-level section of a document.
- AbstractNestingRule: AbstractNestingRule is a base class for defining rules for nesting semantic elements.
- AlwaysNestAsParentRule: Nesting rule derived from AbstractNestingRule (inherits its description).
- NestSameTypeDependingOnLevelRule: Nesting rule derived from AbstractNestingRule (inherits its description).
- TreeNode: The TreeNode class is a fundamental part of the semantic tree structure.
- TreeBuilder: Builds a semantic tree from a list of semantic elements.
Module Contents
- class sec_parser.semantic_tree.tree_builder.TitleElement
Bases:
sec_parser.semantic_elements.mixins.dict_text_content_mixin.DictTextContentMixin, sec_parser.semantic_elements.abstract_semantic_element.AbstractLevelElement
The TitleElement class represents the title of a paragraph or other content object. It serves as a semantic marker, providing context and structure to the document.
- class sec_parser.semantic_tree.tree_builder.TopSectionStartMarker(html_tag: sec_parser.processing_engine.html_tag.HtmlTag, *, processing_log: sec_parser.processing_engine.processing_log.ProcessingLog | None = None, log_origin: sec_parser.processing_engine.processing_log.LogItemOrigin | None = None, level: int | None = None, section_type: sec_parser.semantic_elements.top_section_title_types.TopSectionType | None = None)
Bases:
sec_parser.semantic_elements.abstract_semantic_element.AbstractLevelElement
The TopSectionStartMarker class represents the beginning of a top-level section of a document. It is used to mark the start of sections such as “Part I, Item 1. Business” in SEC 10-Q reports.
- classmethod create_from_element(source: sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement, log_origin: sec_parser.processing_engine.processing_log.LogItemOrigin, *, level: int | None = None, section_type: sec_parser.semantic_elements.top_section_title_types.TopSectionType | None = None) -> sec_parser.semantic_elements.abstract_semantic_element.AbstractLevelElement
Convert the semantic element into another semantic element type.
- to_dict(*, include_previews: bool = False, include_contents: bool = False) -> dict[str, Any]
- class sec_parser.semantic_tree.tree_builder.AbstractNestingRule(*, exclude_parents: set[type[sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement]] | None = None, exclude_children: set[type[sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement]] | None = None)
Bases:
abc.ABC
AbstractNestingRule is a base class for defining rules for nesting semantic elements. Each rule should ideally mention at most one or two types of semantic elements to reduce coupling and complexity.
In case of conflicts between rules, they should be resolved through parameters like exclude_parents and exclude_children.
- should_be_nested_under(parent: sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement, child: sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement) -> bool
- abstract _should_be_nested_under(parent: sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement, child: sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement) -> bool
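The split between the public should_be_nested_under and the abstract _should_be_nested_under suggests a template-method design in which the exclusions are checked before the rule-specific logic runs. The following is a minimal sketch of that idea under that assumption, not the library's implementation. Element, Title, Text, Table, NestingRule and NestUnderTitle are hypothetical stand-ins defined here so the example runs on its own.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class Element:
    """Hypothetical stand-in for AbstractSemanticElement."""


class Title(Element):
    """Hypothetical title-like element."""


class Text(Element):
    """Hypothetical text-like element."""


class Table(Element):
    """Hypothetical table-like element."""


class NestingRule(ABC):
    """Simplified stand-in for AbstractNestingRule (assumed behaviour)."""

    def __init__(self, *, exclude_parents=None, exclude_children=None):
        self._exclude_parents = tuple(exclude_parents or ())
        self._exclude_children = tuple(exclude_children or ())

    def should_be_nested_under(self, parent: Element, child: Element) -> bool:
        # Exclusions are checked first, so a conflict between rules can be
        # settled through parameters instead of by editing the rule itself.
        if isinstance(parent, self._exclude_parents):
            return False
        if isinstance(child, self._exclude_children):
            return False
        return self._should_be_nested_under(parent, child)

    @abstractmethod
    def _should_be_nested_under(self, parent: Element, child: Element) -> bool:
        """Rule-specific decision, implemented by each subclass."""


class NestUnderTitle(NestingRule):
    """Hypothetical rule that mentions a single element type: Title."""

    def _should_be_nested_under(self, parent: Element, child: Element) -> bool:
        return isinstance(parent, Title)


# Tables are excluded as children, so they stay out of title sections.
rule = NestUnderTitle(exclude_children={Table})
print(rule.should_be_nested_under(Title(), Text()))   # True
print(rule.should_be_nested_under(Title(), Table()))  # False
```

Keeping each rule focused on one element type and pushing exceptions into exclude_parents and exclude_children is what lets two rules coexist without knowing about each other.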
- class sec_parser.semantic_tree.tree_builder.AlwaysNestAsParentRule(cls: type[sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement], /, *, exclude_parents: set[type[sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement]] | None = None, exclude_children: set[type[sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement]] | None = None)
Bases:
AbstractNestingRule
AbstractNestingRule is a base class for defining rules for nesting semantic elements. Each rule should ideally mention at most one or two types of semantic elements to reduce coupling and complexity.
In case of conflicts between rules, they should be resolved through parameters like exclude_parents and exclude_children.
- _should_be_nested_under(parent: sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement, child: sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement) -> bool
- class sec_parser.semantic_tree.tree_builder.NestSameTypeDependingOnLevelRule(*, exclude_parents: set[type[sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement]] | None = None, exclude_children: set[type[sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement]] | None = None)
Bases:
AbstractNestingRule
AbstractNestingRule is a base class for defining rules for nesting semantic elements. Each rule should ideally mention at most one or two types of semantic elements to reduce coupling and complexity.
In case of conflicts between rules, they should be resolved through parameters like exclude_parents and exclude_children.
- _should_be_nested_under(parent: sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement, child: sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement) -> bool
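The name NestSameTypeDependingOnLevelRule suggests that two elements of the same type are nested according to their level values. The sketch below illustrates one plausible reading of that name and is an assumption, not the documented behaviour: LevelElement and nests_under are hypothetical, and the convention that a smaller level number sits higher in the hierarchy is assumed.

```python
from dataclasses import dataclass


@dataclass
class LevelElement:
    """Hypothetical stand-in for an element that carries a level."""

    kind: str
    level: int


def nests_under(parent: LevelElement, child: LevelElement) -> bool:
    # Assumed convention: a smaller level number means a higher position
    # in the hierarchy, so only a strictly deeper child is nested.
    return parent.kind == child.kind and parent.level < child.level


part = LevelElement("title", 0)
item = LevelElement("title", 1)

print(nests_under(part, item))  # True
print(nests_under(item, part))  # False
```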
- class sec_parser.semantic_tree.tree_builder.SemanticTree(root_nodes: list[sec_parser.semantic_tree.tree_node.TreeNode])
- __iter__() -> collections.abc.Iterator[sec_parser.semantic_tree.tree_node.TreeNode]
Iterate over the root nodes of the tree.
- __len__() -> int
- property nodes: collections.abc.Iterator[sec_parser.semantic_tree.tree_node.TreeNode]
Get all nodes in the semantic tree. This includes the root nodes and all their descendants.
- render(*, pretty: bool | None = True, ignored_types: tuple[type[sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement], Ellipsis] | None = None, char_display_limit: int | None = None, verbose: bool = False) -> str
Render the semantic tree as a human-readable string.
Syntactic sugar for a more convenient usage of render.
- print(*, pretty: bool | None = True, ignored_types: tuple[type[sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement], Ellipsis] | None = None, char_display_limit: int | None = None, verbose: bool = False, line_limit: int | None = None) -> None
Print the semantic tree as a human-readable string.
Syntactic sugar for a more convenient usage of render.
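To illustrate the difference between iterating over a tree (root nodes only) and reading its nodes property (root nodes plus all descendants), here is a small self-contained sketch. Node and Tree are hypothetical stand-ins, and the depth-first, parent-before-children order is an assumption; the reference above does not state the traversal order.

```python
from __future__ import annotations

from collections.abc import Iterator
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for TreeNode."""

    label: str
    children: list[Node] = field(default_factory=list)


class Tree:
    """Hypothetical stand-in for SemanticTree."""

    def __init__(self, root_nodes: list[Node]) -> None:
        self._root_nodes = list(root_nodes)

    def __iter__(self) -> Iterator[Node]:
        # Iteration only visits the top-level nodes.
        return iter(self._root_nodes)

    @property
    def nodes(self) -> Iterator[Node]:
        # Assumed order: depth-first, each parent before its children.
        for root in self._root_nodes:
            yield from self._walk(root)

    @staticmethod
    def _walk(node: Node) -> Iterator[Node]:
        yield node
        for child in node.children:
            yield from Tree._walk(child)


tree = Tree([Node("1", [Node("1.1", [Node("1.1.1")]), Node("1.2")]), Node("2")])
root_labels = [node.label for node in tree]
all_labels = [node.label for node in tree.nodes]
print(root_labels)  # ['1', '2']
print(all_labels)   # ['1', '1.1', '1.1.1', '1.2', '2']
```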
- class sec_parser.semantic_tree.tree_builder.TreeNode(semantic_element: sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement, *, parent: TreeNode | None = None, children: collections.abc.Iterable[TreeNode] | None = None)
The TreeNode class is a fundamental part of the semantic tree structure. Each TreeNode represents a node in the tree. It holds a reference to a semantic element, maintains a list of its child nodes, and a reference to its parent node. This class provides methods for managing the tree structure, such as adding and removing child nodes. Importantly, these methods ensure logical consistency as children/parents are being changed. For example, if a parent is removed from a child, the child is automatically removed from the parent.
- property semantic_element: sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement
- __repr__() -> str
Return repr(self).
- property text: str
Property text is a passthrough to the SemanticElement text property.
- get_source_code(*, pretty: bool = False) -> str
get_source_code is a passthrough to the SemanticElement method.
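The consistency guarantee described for TreeNode (changing one side of a parent/child link updates the other side) can be sketched as follows. This is an illustrative stand-in, not the library's code: the Node class and the names parent, children, add_child and remove_child are assumptions, since the reference above does not list those members.

```python
from __future__ import annotations


class Node:
    """Hypothetical stand-in for TreeNode that keeps both link ends in sync."""

    def __init__(self, label: str, *, parent: Node | None = None) -> None:
        self.label = label
        self._parent: Node | None = None
        self._children: list[Node] = []
        if parent is not None:
            self.parent = parent

    @property
    def parent(self) -> Node | None:
        return self._parent

    @parent.setter
    def parent(self, new_parent: Node | None) -> None:
        if self._parent is new_parent:
            return
        # Detach from the old parent before attaching to the new one,
        # so a node never appears in two child lists at once.
        if self._parent is not None:
            self._parent._children.remove(self)
        self._parent = new_parent
        if new_parent is not None:
            new_parent._children.append(self)

    @property
    def children(self) -> tuple[Node, ...]:
        # A tuple is returned so callers cannot bypass the bookkeeping.
        return tuple(self._children)

    def add_child(self, child: Node) -> None:
        child.parent = self

    def remove_child(self, child: Node) -> None:
        if child.parent is self:
            child.parent = None


a, c = Node("a"), Node("c")
b = Node("b", parent=a)
b.parent = c  # re-parenting also removes b from a's children
print([n.label for n in a.children], [n.label for n in c.children])  # [] ['b']
```

Routing every change through the parent setter keeps the invariant in a single place, which is one simple way to get the behaviour the description promises.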
- class sec_parser.semantic_tree.tree_builder.TreeBuilder(get_rules: Callable[[], list[sec_parser.semantic_tree.nesting_rules.AbstractNestingRule]] | None = None)
Builds a semantic tree from a list of semantic elements.
Why Use a Tree Structure?
Using a tree data structure allows for easier and more robust filtering of sections. With a tree, you can select specific branches to filter, making it straightforward to identify section boundaries. This approach is more maintainable and robust compared to attempting the same operations on a flat list of elements.
Overview:
Takes a list of semantic elements.
Applies nesting rules to these elements.
Customization:
The nesting process is customizable through a list of rules. These rules determine how new elements should be nested under existing ones.
Advanced Customization:
You can supply your own set of rules by providing a callable to get_rules, which should return a list of AbstractNestingRule instances.
- static get_default_rules() -> list[sec_parser.semantic_tree.nesting_rules.AbstractNestingRule]
- build(elements: list[sec_parser.semantic_elements.abstract_semantic_element.AbstractSemanticElement]) -> sec_parser.semantic_tree.semantic_tree.SemanticTree
- _find_parent_node(new_node: sec_parser.semantic_tree.tree_node.TreeNode, stack: list[sec_parser.semantic_tree.tree_node.TreeNode], rules: list[sec_parser.semantic_tree.nesting_rules.AbstractNestingRule]) -> sec_parser.semantic_tree.tree_node.TreeNode | None
- _should_nest_under(child_node: sec_parser.semantic_tree.tree_node.TreeNode, parent_node: sec_parser.semantic_tree.tree_node.TreeNode, rules: list[sec_parser.semantic_tree.nesting_rules.AbstractNestingRule]) -> bool
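The signatures of build, _find_parent_node and _should_nest_under hint at a stack-based algorithm: each new element looks for the nearest open ancestor that some rule accepts as its parent. The sketch below is one plausible reading of those signatures, not the library's implementation. Element, Node, the two rule functions, find_parent and build are all hypothetical, and rules are modelled as plain callables rather than AbstractNestingRule instances.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Element:
    """Hypothetical flat element: a kind, some text, and a level."""

    kind: str
    text: str
    level: int = 0


@dataclass
class Node:
    element: Element
    children: list[Node] = field(default_factory=list)


Rule = Callable[[Element, Element], bool]  # (parent, child) -> nest?


def title_is_parent(parent: Element, child: Element) -> bool:
    return parent.kind == "title" and child.kind != "title"


def same_kind_by_level(parent: Element, child: Element) -> bool:
    return parent.kind == child.kind and parent.level < child.level


def find_parent(new_node: Node, stack: list[Node], rules: list[Rule]) -> Node | None:
    # Walk back through the open ancestors; discard those that cannot
    # be a parent, because later elements cannot nest under them either.
    while stack:
        candidate = stack[-1]
        if any(rule(candidate.element, new_node.element) for rule in rules):
            return candidate
        stack.pop()
    return None


def build(elements: list[Element],
          get_rules: Callable[[], list[Rule]] | None = None) -> list[Node]:
    rules = get_rules() if get_rules else [title_is_parent, same_kind_by_level]
    roots: list[Node] = []
    stack: list[Node] = []
    for element in elements:
        node = Node(element)
        parent = find_parent(node, stack, rules)
        if parent is None:
            roots.append(node)
        else:
            parent.children.append(node)
        stack.append(node)
    return roots


elements = [
    Element("title", "Item 1", 0), Element("text", "intro"),
    Element("title", "Risk", 1), Element("text", "detail"),
    Element("title", "Item 2", 0), Element("text", "more"),
]
roots = build(elements)
print([r.element.text for r in roots])  # ['Item 1', 'Item 2']
```

Passing a different callable as get_rules swaps the whole rule set. With an empty list, for example, nothing is ever nested and every element becomes a root.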