mirror of https://github.com/Bunsly/JobSpy
commit e4b925605d: [enh]: extract emails
parent c802c8c3b8
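The diff for the new `extract_emails_from_text` helper itself is not shown below; what follows is a plausible regex-based sketch of what such a helper does. This is a hypothetical reconstruction, and the actual code in JobSpy's `utils` module may differ.

```python
import re
from typing import Optional

# Hypothetical reconstruction of extract_emails_from_text; the real
# helper in JobSpy's utils module may use a different pattern.
EMAIL_REGEX = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails_from_text(text: Optional[str]) -> Optional[list[str]]:
    """Return all email-like substrings found in text, or None when there are none."""
    if not text:
        return None
    return EMAIL_REGEX.findall(text) or None


print(extract_emails_from_text("Apply via hr@example.com or jobs@example.org"))
# ['hr@example.com', 'jobs@example.org']
```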
@@ -7,27 +7,27 @@ jobs:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: "3.10"

      - name: Install poetry
        run: >-
          python3 -m
          pip install
          poetry
          --user

      - name: Build distribution 📦
        run: >-
          python3 -m
          poetry
          build

      - name: Publish distribution 📦 to PyPI
        if: startsWith(github.ref, 'refs/tags')
        uses: pypa/gh-action-pypi-publish@release/v1
        with:
          password: ${{ secrets.PYPI_API_TOKEN }}
README.md (71 lines changed)
@@ -4,26 +4,30 @@

**Not technical?** Try out the web scraping tool on our site at [usejobspy.com](https://usejobspy.com).

-*Looking to build a data-focused software product?* **[Book a call](https://calendly.com/zachary-products/15min)** *to work with us.*
+*Looking to build a data-focused software product?* **[Book a call](https://calendly.com/zachary-products/15min)** *to
+work with us.*
\
-Check out another project we wrote: ***[HomeHarvest](https://github.com/ZacharyHampton/HomeHarvest)** – a Python package for real estate scraping*
+Check out another project we wrote: ***[HomeHarvest](https://github.com/ZacharyHampton/HomeHarvest)** – a Python package
+for real estate scraping*

## Features

- Scrapes job postings from **LinkedIn**, **Indeed** & **ZipRecruiter** simultaneously
- Aggregates the job postings in a Pandas DataFrame
- Proxy support (HTTP/S, SOCKS)

-[Video Guide for JobSpy](https://www.youtube.com/watch?v=RuP1HrAZnxs&pp=ygUgam9icyBzY3JhcGVyIGJvdCBsaW5rZWRpbiBpbmRlZWQ%3D) - Updated for release v1.1.3
+[Video Guide for JobSpy](https://www.youtube.com/watch?v=RuP1HrAZnxs&pp=ygUgam9icyBzY3JhcGVyIGJvdCBsaW5rZWRpbiBpbmRlZWQ%3D) -
+Updated for release v1.1.3



### Installation

```
pip install --upgrade python-jobspy
```

_Python version >= [3.10](https://www.python.org/downloads/release/python-3100/) required_

### Usage

@@ -65,6 +69,7 @@ print(jobs)
```

### Output

```
SITE TITLE COMPANY_NAME CITY STATE JOB_TYPE INTERVAL MIN_AMOUNT MAX_AMOUNT JOB_URL DESCRIPTION
indeed Software Engineer AMERICAN SYSTEMS Arlington VA None yearly 200000 150000 https://www.indeed.com/viewjob?jk=5e409e577046... THIS POSITION COMES WITH A 10K SIGNING BONUS!...

@@ -74,7 +79,9 @@ linkedin Full-Stack Software Engineer Rain New York
zip_recruiter Software Engineer - New Grad ZipRecruiter Santa Monica CA fulltime yearly 130000 150000 https://www.ziprecruiter.com/jobs/ziprecruiter... We offer a hybrid work environment. Most US-ba...
zip_recruiter Software Developer TEKsystems Phoenix AZ fulltime hourly 65 75 https://www.ziprecruiter.com/jobs/teksystems-0... Top Skills' Details• 6 years of Java developme...
```

### Parameters for `scrape_jobs()`

```plaintext
Required
├── site_type (List[enum]): linkedin, zip_recruiter, indeed

@@ -91,8 +98,8 @@ Optional
├── offset (enum): starts the search from an offset (e.g. 25 will start the search from the 25th result)
```


### JobPost Schema

```plaintext
JobPost
├── title (str)

@@ -113,14 +120,15 @@ JobPost
```

### Exceptions

The following exceptions may be raised when using JobSpy:

* `LinkedInException`
* `IndeedException`
* `ZipRecruiterException`

## Supported Countries for Job Searching


### **LinkedIn**

LinkedIn searches globally & uses only the `location` parameter.

@@ -129,43 +137,45 @@ LinkedIn searches globally & uses only the `location` parameter.

ZipRecruiter searches for jobs in **US/Canada** & uses only the `location` parameter.


### **Indeed**
-Indeed supports most countries, but the `country_indeed` parameter is required. Additionally, use the `location` parameter to narrow down the location, e.g. city & state if necessary.
+
+Indeed supports most countries, but the `country_indeed` parameter is required. Additionally, use the `location`
+parameter to narrow down the location, e.g. city & state if necessary.

You can specify the following countries when searching on Indeed (use the exact name):

|                      |              |            |                |
-|----------------------|--------------|------------|----------------|
+|------|------|------|------|
| Argentina | Australia | Austria | Bahrain |
| Belgium | Brazil | Canada | Chile |
| China | Colombia | Costa Rica | Czech Republic |
| Denmark | Ecuador | Egypt | Finland |
| France | Germany | Greece | Hong Kong |
| Hungary | India | Indonesia | Ireland |
| Israel | Italy | Japan | Kuwait |
| Luxembourg | Malaysia | Mexico | Morocco |
| Netherlands | New Zealand | Nigeria | Norway |
| Oman | Pakistan | Panama | Peru |
| Philippines | Poland | Portugal | Qatar |
| Romania | Saudi Arabia | Singapore | South Africa |
| South Korea | Spain | Sweden | Switzerland |
| Taiwan | Thailand | Turkey | Ukraine |
| United Arab Emirates | UK | USA | Uruguay |
| Venezuela | Vietnam | | |


## Frequently Asked Questions

---

**Q: Encountering issues with your queries?**
-**A:** Try reducing the number of `results_wanted` and/or broadening the filters. If problems persist, [submit an issue](https://github.com/cullenwatson/JobSpy/issues).
+**A:** Try reducing the number of `results_wanted` and/or broadening the filters. If problems
+persist, [submit an issue](https://github.com/cullenwatson/JobSpy/issues).

---

**Q: Received a response code 429?**
-**A:** This indicates that you have been blocked by the job board site for sending too many requests. Currently, **LinkedIn** is particularly aggressive with blocking. We recommend:
+**A:** This indicates that you have been blocked by the job board site for sending too many requests. Currently,
+**LinkedIn** is particularly aggressive with blocking. We recommend:

- Waiting a few seconds between requests.
- Trying a VPN or proxy to change your IP address.

@@ -174,6 +184,7 @@ You can specify the following countries when searching on Indeed (use the exact

**Q: Experiencing a "Segmentation fault: 11" on macOS Catalina?**
**A:** This is due to the `tls_client` dependency not supporting your architecture. Solutions and workarounds include:

- Upgrade to a newer version of macOS
- Reach out to the maintainers of [tls_client](https://github.com/bogdanfinn/tls-client) for fixes
@@ -5,9 +5,9 @@ jobs: pd.DataFrame = scrape_jobs(
    site_name=["indeed", "linkedin", "zip_recruiter"],
    search_term="software engineer",
    location="Dallas, TX",
    results_wanted=50,  # be wary: the higher it is, the more likely you'll get blocked (a rotating proxy should work, though)
    country_indeed='USA',
    offset=25  # start jobs from an offset (use if a search failed and you want to continue)
    # proxy="http://jobspy:5a4vpWtj8EeJ2hoYzk@ca.smartproxy.com:20001",
)

@@ -29,5 +29,3 @@ print('outputted to jobs.csv')

# 4: display in Jupyter Notebook (1. pip install jupyter 2. jupyter notebook)
# display(jobs)
File diff suppressed because it is too large
@@ -26,18 +26,18 @@ def _map_str_to_site(site_name: str) -> Site:


def scrape_jobs(
    site_name: str | List[str] | Site | List[Site],
    search_term: str,
    location: str = "",
    distance: int = None,
    is_remote: bool = False,
    job_type: str = None,
    easy_apply: bool = False,  # linkedin
    results_wanted: int = 15,
    country_indeed: str = "usa",
    hyperlinks: bool = False,
    proxy: Optional[str] = None,
    offset: Optional[int] = 0
) -> pd.DataFrame:
    """
    Simultaneously scrapes job data from multiple job sites.

@@ -49,8 +49,8 @@ def scrape_jobs(
        if value_str in job_type.value:
            return job_type
        raise Exception(f"Invalid job type: {value_str}")
-    job_type = get_enum_from_value(job_type) if job_type else None

+    job_type = get_enum_from_value(job_type) if job_type else None

    if type(site_name) == str:
        site_type = [_map_str_to_site(site_name)]

@@ -162,6 +162,7 @@ def scrape_jobs(
        "min_amount",
        "max_amount",
        "currency",
+        "emails",
        "description",
    ]
    jobs_formatted_df = jobs_df[desired_order]
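The `desired_order` change above is plain pandas column selection. A dependency-light sketch with invented data shows how the new emails column rides along; the column names other than `emails` and the rows are made up for illustration.

```python
import pandas as pd

# Toy frame standing in for the scraped jobs; the emails column holds
# a list of addresses (or None) per posting, like the new JobPost field.
jobs_df = pd.DataFrame(
    {
        "description": ["Great role. Contact hr@example.com", "No contact given"],
        "emails": [["hr@example.com"], None],
        "title": ["Software Engineer", "Data Engineer"],
    }
)

# Selecting with a list of column names returns a reordered copy,
# mirroring jobs_df[desired_order] in scrape_jobs.
desired_order = ["title", "emails", "description"]
jobs_formatted_df = jobs_df[desired_order]
print(list(jobs_formatted_df.columns))  # ['title', 'emails', 'description']
```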
@@ -187,6 +187,7 @@ class JobPost(BaseModel):
    compensation: Optional[Compensation] = None
    date_posted: Optional[date] = None
    benefits: Optional[str] = None
+    emails: Optional[list[str]] = None


class JobResponse(BaseModel):
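The new field is an optional list with a `None` default. Sketched here with a stdlib dataclass to stay dependency-free; the repo's `JobPost` is a Pydantic `BaseModel`, and all fields other than `title` and `emails` are omitted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class JobPost:
    title: str
    # New in this commit: email addresses found in the job description.
    emails: Optional[list[str]] = None


post = JobPost(title="Software Engineer")
print(post.emails)  # None
post = JobPost(title="Software Engineer", emails=["hr@example.com"])
print(post.emails)  # ['hr@example.com']
```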
@@ -27,6 +27,7 @@ from ...jobs import (
    JobType,
)
from .. import Scraper, ScraperInput, Site
+from ...utils import extract_emails_from_text


class IndeedScraper(Scraper):

@@ -138,6 +139,7 @@ class IndeedScraper(Scraper):
            date_posted = date_posted.strftime("%Y-%m-%d")

            description = self.get_description(job_url, session)
+            emails = extract_emails_from_text(description)
            with io.StringIO(job["snippet"]) as f:
                soup_io = BeautifulSoup(f, "html.parser")
                li_elements = soup_io.find_all("li")

@@ -153,6 +155,7 @@ class IndeedScraper(Scraper):
                    state=job.get("jobLocationState"),
                    country=self.country,
                ),
+                emails=extract_emails_from_text(description),
                job_type=job_type,
                compensation=compensation,
                date_posted=date_posted,
@@ -17,13 +17,13 @@ from threading import Lock

from .. import Scraper, ScraperInput, Site
from ..exceptions import LinkedInException
-from ... import JobType
from ...jobs import (
    JobPost,
    Location,
    JobResponse,
    JobType,
)
+from ...utils import extract_emails_from_text


class LinkedInScraper(Scraper):

@@ -162,7 +162,7 @@ class LinkedInScraper(Scraper):
        benefits_tag = job_card.find("span", class_="result-benefits__text")
        benefits = " ".join(benefits_tag.get_text().split()) if benefits_tag else None

-        description, job_type = self.get_job_info_page(job_url)
+        description, job_type = self.get_job_description(job_url)

        return JobPost(
            title=title,

@@ -173,9 +173,10 @@ class LinkedInScraper(Scraper):
            job_url=job_url,
            job_type=job_type,
            benefits=benefits,
+            emails=extract_emails_from_text(description)
        )

-    def get_job_info_page(self, job_page_url: str) -> tuple[None, None] | tuple[
+    def get_job_description(self, job_page_url: str) -> tuple[None, None] | tuple[
        str | None, tuple[str | None, JobType | None]]:
        """
        Retrieves job description by going to the job page url

@@ -193,9 +194,9 @@ class LinkedInScraper(Scraper):
            "div", class_=lambda x: x and "show-more-less-html__markup" in x
        )

-        text_content = None
+        description = None
        if div_content:
-            text_content = " ".join(div_content.get_text().split()).strip()
+            description = " ".join(div_content.get_text().split()).strip()

        def get_job_type(
            soup_job_type: BeautifulSoup,

@@ -224,7 +225,7 @@ class LinkedInScraper(Scraper):

            return LinkedInScraper.get_enum_from_value(employment_type)

-        return text_content, get_job_type(soup)
+        return description, get_job_type(soup)

    @staticmethod
    def get_enum_from_value(value_str):
@@ -28,6 +28,7 @@ from ...jobs import (
    JobType,
    Country,
)
+from ...utils import extract_emails_from_text


class ZipRecruiterScraper(Scraper):

@@ -174,6 +175,7 @@ class ZipRecruiterScraper(Scraper):
            compensation=ZipRecruiterScraper.get_compensation(job),
            date_posted=date_posted,
            job_url=job_url,
+            emails=extract_emails_from_text(description),
        )
        return job_post

@@ -465,4 +467,3 @@ class ZipRecruiterScraper(Scraper):
        parsed_url = urlparse(url)

        return urlunparse((parsed_url.scheme, parsed_url.netloc, parsed_url.path, parsed_url.params, '', ''))
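The URL cleanup in the last hunk above keeps the scheme, host, path, and path params while dropping the query string and fragment. A standalone sketch of the same stdlib pattern (the function name `clean_job_url` is chosen here for illustration):

```python
from urllib.parse import urlparse, urlunparse


def clean_job_url(url: str) -> str:
    # Keep scheme, netloc, path, and params; blank out query and fragment.
    p = urlparse(url)
    return urlunparse((p.scheme, p.netloc, p.path, p.params, "", ""))


print(clean_job_url("https://www.ziprecruiter.com/jobs/abc?lvk=123#top"))
# https://www.ziprecruiter.com/jobs/abc
```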
@@ -1,4 +1,5 @@
from ..jobspy import scrape_jobs
+import pandas as pd


def test_all():

@@ -7,4 +8,5 @@ def test_all():
        search_term="software engineer",
        results_wanted=5,
    )
-    assert result is not None and result.errors.empty is True
+
+    assert isinstance(result, pd.DataFrame) and not result.empty, "Result should be a non-empty DataFrame"

@@ -1,4 +1,5 @@
from ..jobspy import scrape_jobs
+import pandas as pd


def test_indeed():

@@ -6,4 +7,4 @@ def test_indeed():
        site_name="indeed",
        search_term="software engineer",
    )
-    assert result is not None and result.errors.empty is True
+    assert isinstance(result, pd.DataFrame) and not result.empty, "Result should be a non-empty DataFrame"

@@ -1,4 +1,5 @@
from ..jobspy import scrape_jobs
+import pandas as pd


def test_linkedin():

@@ -6,4 +7,4 @@ def test_linkedin():
        site_name="linkedin",
        search_term="software engineer",
    )
-    assert result is not None and result.errors.empty is True
+    assert isinstance(result, pd.DataFrame) and not result.empty, "Result should be a non-empty DataFrame"

@@ -1,4 +1,5 @@
from ..jobspy import scrape_jobs
+import pandas as pd


def test_ziprecruiter():

@@ -7,4 +8,4 @@ def test_ziprecruiter():
        search_term="software engineer",
    )

-    assert result is not None and result.errors.empty is True
+    assert isinstance(result, pd.DataFrame) and not result.empty, "Result should be a non-empty DataFrame"
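The updated assertions only check for a non-empty DataFrame. A hedged extension that also exercises the new emails column could look like this; the column name comes from the diff, while the helper name and the sample data are invented and the real tests depend on live scraping.

```python
import pandas as pd


def check_jobs_frame(result: pd.DataFrame) -> None:
    # Mirrors the repo's new test style: a non-empty DataFrame means success.
    assert isinstance(result, pd.DataFrame) and not result.empty, "Result should be a non-empty DataFrame"
    # emails is Optional[list[str]] per JobPost, so each cell is a list or None.
    assert "emails" in result.columns
    assert all(e is None or isinstance(e, list) for e in result["emails"])


# Offline smoke check with fabricated data in place of scrape_jobs output.
check_jobs_frame(pd.DataFrame({"title": ["SWE"], "emails": [["hr@example.com"]]}))
```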