Перейти к содержанию

Yandex

Classes

YandexItem

Bases: BaseSearchItem

Represents a single Yandex search result item.

A structured representation of an individual image search result from Yandex, containing detailed information about the found image and its source.

Attributes:

Name Type Description
url str

Direct URL to the webpage containing the image.

title str

Title or heading associated with the image.

thumbnail str

URL of the image thumbnail, properly formatted with https if needed.

source str

Domain name of the website hosting the image.

content str

Descriptive text or context surrounding the image.

size str

Image dimensions in "widthxheight" format.

Source code in PicImageSearch/model/yandex.py
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
class YandexItem(BaseSearchItem):
    """Represents a single Yandex search result item.

    A structured representation of an individual image search result from Yandex,
    containing detailed information about the found image and its source.

    Attributes:
        url (str): Direct URL to the webpage containing the image.
        title (str): Title or heading associated with the image.
        thumbnail (str): URL of the image thumbnail, properly formatted with https if needed.
        source (str): Domain name of the website hosting the image.
        content (str): Descriptive text or context surrounding the image.
        size (str): Image dimensions in "widthxheight" format.
    """

    def __init__(self, data: dict[str, Any], **kwargs: Any):
        """Initializes a YandexItem with data from a search result.

        Args:
            data (dict[str, Any]): A dictionary containing the search result data.
        """
        super().__init__(data, **kwargs)

    def _parse_data(self, data: dict[str, Any], **kwargs: Any) -> None:
        """Parses raw search result data into structured attributes.

        Processes the raw dictionary data from Yandex and extracts relevant information
        into instance attributes.

        Args:
            data (dict[str, Any]): Dictionary containing the raw search result data from Yandex.
            **kwargs (Any): Additional keyword arguments (unused).

        Note:
            The thumbnail URL is automatically prefixed with 'https:' if it starts
            with '//' to ensure proper formatting.
        """
        self.url: str = data["url"]
        self.title: str = data["title"]
        thumb_url: str = data["thumb"]["url"]
        self.thumbnail: str = (
            f"https:{thumb_url}" if thumb_url.startswith("//") else thumb_url
        )
        self.source: str = data["domain"]
        self.content: str = data["description"]
        original_image = data["originalImage"]
        self.size: str = f"{original_image['width']}x{original_image['height']}"

Functions

__init__(data, **kwargs)

Initializes a YandexItem with data from a search result.

Parameters:

Name Type Description Default
data dict[str, Any]

A dictionary containing the search result data.

required
Source code in PicImageSearch/model/yandex.py
25
26
27
28
29
30
31
def __init__(self, data: dict[str, Any], **kwargs: Any):
    """Initializes a YandexItem with data from a search result.

    Args:
        data (dict[str, Any]): A dictionary containing the search result data.
    """
    super().__init__(data, **kwargs)
_parse_data(data, **kwargs)

Parses raw search result data into structured attributes.

Processes the raw dictionary data from Yandex and extracts relevant information into instance attributes.

Parameters:

Name Type Description Default
data dict[str, Any]

Dictionary containing the raw search result data from Yandex.

required
**kwargs Any

Additional keyword arguments (unused).

{}
Note

The thumbnail URL is automatically prefixed with 'https:' if it starts with '//' to ensure proper formatting.

Source code in PicImageSearch/model/yandex.py
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
def _parse_data(self, data: dict[str, Any], **kwargs: Any) -> None:
    """Parses raw search result data into structured attributes.

    Processes the raw dictionary data from Yandex and extracts relevant information
    into instance attributes.

    Args:
        data (dict[str, Any]): Dictionary containing the raw search result data from Yandex.
        **kwargs (Any): Additional keyword arguments (unused).

    Note:
        The thumbnail URL is automatically prefixed with 'https:' if it starts
        with '//' to ensure proper formatting.
    """
    self.url: str = data["url"]
    self.title: str = data["title"]
    thumb_url: str = data["thumb"]["url"]
    self.thumbnail: str = (
        f"https:{thumb_url}" if thumb_url.startswith("//") else thumb_url
    )
    self.source: str = data["domain"]
    self.content: str = data["description"]
    original_image = data["originalImage"]
    self.size: str = f"{original_image['width']}x{original_image['height']}"

YandexResponse

Bases: BaseSearchResponse[YandexItem]

Encapsulates a complete Yandex reverse image search response.

Processes and stores the full response from a Yandex reverse image search, including all found image results and metadata.

Attributes:

Name Type Description
raw list[YandexItem]

List of parsed search results as YandexItem instances.

url str

URL of the search results page.

origin PyQuery

PyQuery object containing the raw HTML response.

Source code in PicImageSearch/model/yandex.py
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
class YandexResponse(BaseSearchResponse[YandexItem]):
    """Encapsulates a complete Yandex reverse image search response.

    Processes and stores the full response from a Yandex reverse image search,
    including all found image results and metadata.

    Attributes:
        raw (list[YandexItem]): List of parsed search results as YandexItem instances.
        url (str): URL of the search results page.
        origin (PyQuery): PyQuery object containing the raw HTML response.
    """

    def __init__(self, resp_data: str, resp_url: str, **kwargs: Any):
        """Initializes with the response text and URL.

        Args:
            resp_data (str): the text of the response.
            resp_url (str): URL to the search result page.
        """
        super().__init__(resp_data, resp_url, **kwargs)

    def _parse_response(self, resp_data: str, **kwargs: Any) -> None:
        """Parses the raw HTML response from Yandex into structured data.

        Extracts search results from the HTML response by locating and parsing
        the JSON data embedded in the page.

        Args:
            resp_data (str): Raw HTML response from Yandex search.
            **kwargs (Any): Additional keyword arguments (unused).

        Note:
            The method looks for a specific div element containing the search results
            data in JSON format, then creates YandexItem instances for each result.
        """
        data = parse_html(resp_data)
        self.origin: PyQuery = data
        data_div = data.find('div.Root[id^="CbirSites_infinite"]')
        data_json = json_loads(data_div.attr("data-state"))
        sites = data_json["sites"]
        self.raw: list[YandexItem] = [YandexItem(site) for site in sites]

Functions

__init__(resp_data, resp_url, **kwargs)

Initializes with the response text and URL.

Parameters:

Name Type Description Default
resp_data str

the text of the response.

required
resp_url str

URL to the search result page.

required
Source code in PicImageSearch/model/yandex.py
71
72
73
74
75
76
77
78
def __init__(self, resp_data: str, resp_url: str, **kwargs: Any):
    """Initializes with the response text and URL.

    Args:
        resp_data (str): the text of the response.
        resp_url (str): URL to the search result page.
    """
    super().__init__(resp_data, resp_url, **kwargs)
_parse_response(resp_data, **kwargs)

Parses the raw HTML response from Yandex into structured data.

Extracts search results from the HTML response by locating and parsing the JSON data embedded in the page.

Parameters:

Name Type Description Default
resp_data str

Raw HTML response from Yandex search.

required
**kwargs Any

Additional keyword arguments (unused).

{}
Note

The method looks for a specific div element containing the search results data in JSON format, then creates YandexItem instances for each result.

Source code in PicImageSearch/model/yandex.py
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
def _parse_response(self, resp_data: str, **kwargs: Any) -> None:
    """Parses the raw HTML response from Yandex into structured data.

    Extracts search results from the HTML response by locating and parsing
    the JSON data embedded in the page.

    Args:
        resp_data (str): Raw HTML response from Yandex search.
        **kwargs (Any): Additional keyword arguments (unused).

    Note:
        The method looks for a specific div element containing the search results
        data in JSON format, then creates YandexItem instances for each result.
    """
    data = parse_html(resp_data)
    self.origin: PyQuery = data
    data_div = data.find('div.Root[id^="CbirSites_infinite"]')
    data_json = json_loads(data_div.attr("data-state"))
    sites = data_json["sites"]
    self.raw: list[YandexItem] = [YandexItem(site) for site in sites]

Functions