GitHub - djkelleher/search-engines: Query and scrape search engines (Google, Google News, Yahoo, Yahoo News, Bing, Bing News, Ask, Dogpile, Dogpile News)

Installation

pip install search_engines

Overview

Each search engine has a module {engine_name}.py which defines two functions:

extract_search_results(html: str, page_url: str) -> Tuple[List[Dict[str, str]], str]

and

get_search_url(query: str, latest: bool = True, country: str = 'us') -> str

Usage Example

Construct the URL for the first page of Bing Search results for the query "Tesla TSLA".

from search_engines import bing_search

url = bing_search.get_search_url('Tesla TSLA')

Load the URL using a simple HTTP client or web browser and extract the page HTML. This package places no restrictions on which client can be used. We'll use the requests library for this example.

import requests

resp = requests.get(url)
html = resp.text

We can now extract search results from the HTML. The returned results are a list of dictionaries with the keys url, title, preview_text, and page_number. To scrape multiple pages, load the next page using the returned next_page_url and extract its results with extract_search_results again.

results, next_page_url = bing_search.extract_search_results(html, url)
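Because each result is a plain dictionary, ordinary Python is enough for post-processing. The following is a minimal sketch that removes duplicate URLs from a results list. The helper name dedupe_by_url and the sample data are illustrative assumptions, not part of the package; the sample only mimics the documented keys, and page_number is written as a string to match the Dict[str, str] annotation.

```python
from typing import Dict, List


def dedupe_by_url(results: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep the first result seen for each URL, preserving order."""
    seen = set()
    unique = []
    for result in results:
        url = result["url"]
        if url not in seen:
            seen.add(url)
            unique.append(result)
    return unique


# Hand-written sample in the documented shape (not real search output).
sample_results = [
    {"url": "https://example.com/a", "title": "A",
     "preview_text": "first", "page_number": "1"},
    {"url": "https://example.com/b", "title": "B",
     "preview_text": "second", "page_number": "1"},
    {"url": "https://example.com/a", "title": "A again",
     "preview_text": "repeat", "page_number": "2"},
]

unique_results = dedupe_by_url(sample_results)
```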

Putting that all together, here's how we would scrape all pages of search results:

url = bing_search.get_search_url('Tesla TSLA')
while True:
    html = requests.get(url).text
    results, next_page_url = bing_search.extract_search_results(html, url)
    # do something with results...
    if next_page_url:
        url = next_page_url
    else:
        break
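The loop above runs until the engine stops returning a next page. One possible variation, sketched here under stated assumptions, wraps the same logic in a generator with a page cap and a guard against revisiting a URL. The name iter_result_pages, the max_pages parameter, and the fake fetch/extract functions are hypothetical and exist only so the sketch runs without network access.

```python
from typing import Callable, Dict, Iterator, List, Tuple

Result = Dict[str, str]


def iter_result_pages(
    first_url: str,
    fetch: Callable[[str], str],
    extract: Callable[[str, str], Tuple[List[Result], str]],
    max_pages: int = 5,
) -> Iterator[List[Result]]:
    """Yield one list of results per page, following next-page URLs."""
    url = first_url
    visited = set()
    for _ in range(max_pages):
        # Stop on a missing next page or on a URL we already loaded.
        if not url or url in visited:
            break
        visited.add(url)
        html = fetch(url)
        results, url = extract(html, url)
        yield results


# Stand-ins for an HTTP client and an engine module (illustrative only).
_FAKE_SITE = {
    "https://example.com/?page=1": (
        [{"url": "https://example.com/a"}], "https://example.com/?page=2"),
    "https://example.com/?page=2": (
        [{"url": "https://example.com/b"}], ""),
}


def fake_fetch(url: str) -> str:
    return url  # pretend the page HTML is just its URL


def fake_extract(html: str, page_url: str) -> Tuple[List[Result], str]:
    return _FAKE_SITE[page_url]


pages = list(iter_result_pages(
    "https://example.com/?page=1", fake_fetch, fake_extract))
first_only = list(iter_result_pages(
    "https://example.com/?page=1", fake_fetch, fake_extract, max_pages=1))
```

With the real package, fetch could be lambda u: requests.get(u).text and extract could be bing_search.extract_search_results, as in the loop above.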

Contributions

Add new search engines!
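As a starting point, here is a rough sketch of what a new engine module might look like, following the two-function interface described in the Overview. Everything engine-specific is an assumption made for illustration: the search.example.com URL, the sort and country query parameters, and the result and next class names do not correspond to any real engine. The sketch fills only url and title and leaves preview_text empty, and it uses the standard library's html.parser, which may differ from how the existing modules parse HTML.

```python
from html.parser import HTMLParser
from typing import Dict, List, Optional, Tuple
from urllib.parse import urlencode, urljoin

# Hypothetical engine endpoint, invented for this example.
BASE_URL = "https://search.example.com/search"


def get_search_url(query: str, latest: bool = True, country: str = "us") -> str:
    """Build the URL of the first results page for a query."""
    params = {"q": query, "country": country}
    if latest:
        params["sort"] = "date"  # assumed parameter name
    return f"{BASE_URL}?{urlencode(params)}"


class _ResultParser(HTMLParser):
    """Collect <a class="result"> links and the <a class="next"> link."""

    def __init__(self) -> None:
        super().__init__()
        self.results: List[Dict[str, str]] = []
        self.next_href = ""
        self._current: Optional[Dict[str, str]] = None

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        classes = (attr.get("class") or "").split()
        href = attr.get("href") or ""
        if "result" in classes:
            self._current = {"url": href, "title": ""}
        elif "next" in classes:
            self.next_href = href

    def handle_data(self, data):
        # Text inside a result link becomes its title.
        if self._current is not None:
            self._current["title"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            self.results.append(self._current)
            self._current = None


def extract_search_results(
    html: str, page_url: str
) -> Tuple[List[Dict[str, str]], str]:
    """Return (results, next_page_url); next_page_url is '' on the last page."""
    parser = _ResultParser()
    parser.feed(html)
    results = [
        {"url": urljoin(page_url, r["url"]),
         "title": r["title"].strip(),
         "preview_text": ""}
        for r in parser.results
    ]
    next_url = urljoin(page_url, parser.next_href) if parser.next_href else ""
    return results, next_url


# Tiny hand-written page to exercise the sketch.
sample_html = (
    '<a class="result" href="/a">First</a>'
    '<a class="next" href="/search?page=2">Next</a>'
)
demo_url = get_search_url("Tesla TSLA")
demo_results, demo_next = extract_search_results(sample_html, demo_url)
```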
