sec_parser.utils.bs4_.text_styles_metrics

Exceptions

SecParserValueError

Base exception class for sec_parser.

Functions

compute_text_styles_metrics(→ dict[tuple[str, str], float])

Compute the percentage distribution of various CSS styles within the text content of a given HTML tag and its descendants.

_compute_effective_style(→ dict[str, str])

Aggregate the effective styles for a given tag by traversing up the parent hierarchy.

Module Contents

exception sec_parser.utils.bs4_.text_styles_metrics.SecParserValueError

Bases: SecParserError, ValueError

Base exception class for sec_parser. All custom exceptions in sec_parser are inherited from this class.

sec_parser.utils.bs4_.text_styles_metrics.compute_text_styles_metrics(tag: bs4.Tag) → dict[tuple[str, str], float]

Compute the percentage distribution of various CSS styles within the text content of a given HTML tag and its descendants.

This function iterates through all the text nodes within the tag, recursively includes text from child elements, and calculates the effective styles applied to each text segment.

It aggregates these styles and computes their percentage distribution based on the length of text they apply to.

The function uses BeautifulSoup’s recursive text search and parent traversal features. It returns a dictionary containing the aggregated style metrics (the percentage distribution of styles).

Each dictionary entry maps a unique style, expressed as a (property, value) tuple, to the percentage of text it affects.
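The aggregation described above can be sketched as follows. This is a minimal illustration, not the sec_parser implementation: the names `parse_style_attribute` and `sketch_text_styles_metrics` are hypothetical, and the sketch assumes that only inline `style` attributes are considered, that whitespace-only text nodes are skipped, and that the nearest ancestor wins when a property is declared more than once.

```python
from collections import defaultdict

from bs4 import BeautifulSoup, Tag


def parse_style_attribute(style: str) -> dict[str, str]:
    """Split an inline style string such as 'color: red; font-weight: bold'."""
    declarations: dict[str, str] = {}
    for declaration in style.split(";"):
        if ":" not in declaration:
            continue
        prop, value = declaration.split(":", 1)
        prop, value = prop.strip().lower(), value.strip()
        if prop and value:
            declarations[prop] = value
    return declarations


def sketch_text_styles_metrics(tag: Tag) -> dict[tuple[str, str], float]:
    """Hypothetical sketch: percentage of text covered by each (property, value)."""
    total_chars = 0
    styled_chars: dict[tuple[str, str], int] = defaultdict(int)

    # Recursive text search: visit every text node under the tag.
    for node in tag.find_all(string=True):
        text = str(node).strip()
        if not text:
            continue  # assumption: whitespace-only nodes do not count
        total_chars += len(text)

        # Parent traversal: nearest ancestor first, so setdefault keeps
        # the closest declaration of each property (an assumption).
        effective: dict[str, str] = {}
        for ancestor in node.parents:
            inline = parse_style_attribute(ancestor.get("style", ""))
            for prop, value in inline.items():
                effective.setdefault(prop, value)

        for style in effective.items():
            styled_chars[style] += len(text)

    if total_chars == 0:
        return {}
    return {style: count / total_chars * 100 for style, count in styled_chars.items()}


html = '<div><span style="font-weight: bold">abcd</span><span>efgh</span></div>'
soup = BeautifulSoup(html, "html.parser")
print(sketch_text_styles_metrics(soup.div))  # {('font-weight', 'bold'): 50.0}
```

Weighting by character count rather than by node count keeps a short bold label from dominating the metrics of a long paragraph, which matches the length-based distribution described above.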

sec_parser.utils.bs4_.text_styles_metrics._compute_effective_style(tag: bs4.Tag) → dict[str, str]

Aggregate the effective styles for a given tag by traversing up the parent hierarchy.
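A parent traversal of this kind might look like the sketch below. It is an illustration only: `sketch_effective_style` and `parse_style_attribute` are hypothetical names, and the precedence rule (a declaration on the tag itself or on a nearer ancestor overrides the same property on a more distant ancestor) is an assumption that the private helper in sec_parser may or may not share.

```python
from bs4 import BeautifulSoup, Tag


def parse_style_attribute(style: str) -> dict[str, str]:
    """Split an inline style string into a {property: value} mapping."""
    declarations: dict[str, str] = {}
    for declaration in style.split(";"):
        if ":" not in declaration:
            continue
        prop, value = declaration.split(":", 1)
        prop, value = prop.strip().lower(), value.strip()
        if prop and value:
            declarations[prop] = value
    return declarations


def sketch_effective_style(tag: Tag) -> dict[str, str]:
    """Hypothetical sketch: merge inline styles from the tag up to the root."""
    effective: dict[str, str] = {}
    current = tag
    # Walk upward until the parent chain ends (the root's parent is None).
    while isinstance(current, Tag):
        inline = parse_style_attribute(current.get("style", ""))
        for prop, value in inline.items():
            # Assumption: the first (nearest) declaration of a property wins.
            effective.setdefault(prop, value)
        current = current.parent
    return effective


html = '<div style="color: red; font-size: 10pt"><p style="color: blue"><b>x</b></p></div>'
soup = BeautifulSoup(html, "html.parser")
print(sketch_effective_style(soup.b))  # {'color': 'blue', 'font-size': '10pt'}
```

Here `<b>` carries no style of its own, so it picks up `color: blue` from `<p>` and `font-size: 10pt` from the outer `<div>`, while the more distant `color: red` is ignored.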