<img src="https://github.com/cullenwatson/JobSpy/assets/78247585/ae185b7e-e444-4712-8bb9-fa97f53e896b" width="400">

**JobSpy** is a simple, yet comprehensive, job scraping library.

**Not technical?** Try out the web scraping tool on our site at [usejobspy.com](https://usejobspy.com).

*Looking to build a data-focused software product?* **[Book a call](https://bunsly.com/)** *to work with us.*

## Features

- Scrapes job postings from **LinkedIn**, **Indeed**, **Glassdoor**, & **ZipRecruiter** simultaneously
- Aggregates the job postings in a Pandas DataFrame
- Proxy support (HTTP/S, SOCKS)

[Video Guide for JobSpy](https://www.youtube.com/watch?v=RuP1HrAZnxs&pp=ygUgam9icyBzY3JhcGVyIGJvdCBsaW5rZWRpbiBpbmRlZWQ%3D) -
Updated for release v1.1.3

![jobspy](https://github.com/cullenwatson/JobSpy/assets/78247585/ec7ef355-05f6-4fd3-8161-a817e31c5c57)

### Installation

```
pip install python-jobspy
```

_Python version >= [3.10](https://www.python.org/downloads/release/python-3100/) required_

### Usage
```python
from jobspy import scrape_jobs

# Search every supported job board for the same term and location
jobs = scrape_jobs(
    site_name=["indeed", "linkedin", "zip_recruiter", "glassdoor"],
    search_term="software engineer",
    location="Dallas, TX",
    results_wanted=10,  # per site
    country_indeed="USA",  # only needed for indeed / glassdoor
)

# `jobs` is a pandas DataFrame
print(f"Found {len(jobs)} jobs")
print(jobs.head())
jobs.to_csv("jobs.csv", index=False)  # or jobs.to_excel("jobs.xlsx", index=False)
```
### Output

```
SITE           TITLE                             COMPANY_NAME      CITY          STATE  JOB_TYPE  INTERVAL  MIN_AMOUNT  MAX_AMOUNT  JOB_URL                                            DESCRIPTION
indeed         Software Engineer                 AMERICAN SYSTEMS  Arlington     VA     None      yearly    200000      150000      https://www.indeed.com/viewjob?jk=5e409e577046...  THIS POSITION COMES WITH A 10K SIGNING BONUS!...
indeed         Senior Software Engineer          TherapyNotes.com  Philadelphia  PA     fulltime  yearly    135000      110000      https://www.indeed.com/viewjob?jk=da39574a40cb...  About Us TherapyNotes is the national leader i...
linkedin       Software Engineer - Early Career  Lockheed Martin   Sunnyvale     CA     fulltime  yearly    None        None        https://www.linkedin.com/jobs/view/3693012711      Description:By bringing together people that u...
linkedin       Full-Stack Software Engineer      Rain              New York      NY     fulltime  yearly    None        None        https://www.linkedin.com/jobs/view/3696158877      Rain’s mission is to create the fastest and ea...
zip_recruiter  Software Engineer - New Grad      ZipRecruiter      Santa Monica  CA     fulltime  yearly    130000      150000      https://www.ziprecruiter.com/jobs/ziprecruiter...  We offer a hybrid work environment. Most US-ba...
zip_recruiter  Software Developer                TEKsystems        Phoenix       AZ     fulltime  hourly    65          75          https://www.ziprecruiter.com/jobs/teksystems-0...  Top Skills' Details• 6 years of Java developme...
```

### Parameters for `scrape_jobs()`

```plaintext
Required
├── site_name (List[enum]): linkedin, zip_recruiter, indeed, glassdoor
└── search_term (str)
Optional
├── location (str)
├── distance (int): in miles
├── job_type (enum): fulltime, parttime, internship, contract
├── proxy (str): in format 'http://user:pass@host:port' or [https, socks]
├── is_remote (bool)
├── results_wanted (int): number of job results to retrieve for each site specified in 'site_name'
├── easy_apply (bool): filters for jobs that are hosted on LinkedIn
├── country_indeed (enum): filters the country on Indeed & Glassdoor (see below for correct spelling)
└── offset (int): starts the search from an offset (e.g. 25 will start the search from the 25th result)
```

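
Several of these parameters accept only a fixed set of values, so it can help to check them before calling `scrape_jobs()`. The sketch below is one possible way to do that. `build_search_kwargs` is a hypothetical helper, not part of JobSpy, and it uses the `site_name` keyword from the Usage example.

```python
# Hypothetical helper, not part of JobSpy: validates arguments against the
# values documented above before they are passed to scrape_jobs().
ALLOWED_SITES = {"linkedin", "zip_recruiter", "indeed", "glassdoor"}
ALLOWED_JOB_TYPES = {"fulltime", "parttime", "internship", "contract"}


def build_search_kwargs(site_name, search_term, job_type=None, **optional):
    """Return a kwargs dict for scrape_jobs(), raising ValueError on bad input."""
    unknown = sorted(set(site_name) - ALLOWED_SITES)
    if unknown:
        raise ValueError(f"unsupported site(s): {unknown}")
    if not search_term:
        raise ValueError("search_term is required")
    if job_type is not None and job_type not in ALLOWED_JOB_TYPES:
        raise ValueError(f"unsupported job_type: {job_type!r}")

    kwargs = {"site_name": list(site_name), "search_term": search_term}
    if job_type is not None:
        kwargs["job_type"] = job_type
    kwargs.update(optional)  # e.g. location, distance, results_wanted
    return kwargs


kwargs = build_search_kwargs(
    ["indeed", "linkedin"],
    "software engineer",
    job_type="fulltime",
    location="Dallas, TX",
    results_wanted=10,
)
# jobs = scrape_jobs(**kwargs)
```

Failing early with a `ValueError` keeps a misspelled site or job type from turning into a wasted request.
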
### JobPost Schema

```plaintext
JobPost
├── title (str)
├── company (str)
├── job_url (str)
├── location (object)
│   ├── country (str)
│   ├── city (str)
│   └── state (str)
├── description (str)
├── job_type (str): fulltime, parttime, internship, contract
├── compensation (object)
│   ├── interval (str): yearly, monthly, weekly, daily, hourly
│   ├── min_amount (int)
│   ├── max_amount (int)
│   └── currency (enum)
├── date_posted (date)
├── emails (str)
├── num_urgent_words (int)
└── is_remote (bool)
```

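
Because `interval` differs between postings, the `min_amount` and `max_amount` values are not directly comparable. A minimal sketch of converting them to approximate yearly figures follows. `annualize` is a hypothetical helper, and the conversion factors are assumptions (40-hour weeks, 52 weeks and 260 working days per year), not something JobSpy defines.

```python
# Assumed conversion factors (not defined by JobSpy): 40 h/week, 52 weeks/year,
# 260 working days/year.
PERIODS_PER_YEAR = {
    "yearly": 1,
    "monthly": 12,
    "weekly": 52,
    "daily": 260,
    "hourly": 2080,
}


def annualize(amount, interval):
    """Convert an amount paid per `interval` into an approximate yearly amount."""
    if amount is None or interval is None:
        return None  # postings without compensation data stay empty
    return amount * PERIODS_PER_YEAR[interval]


print(annualize(65, "hourly"))      # 135200
print(annualize(130000, "yearly"))  # 130000
print(annualize(None, "yearly"))    # None
```
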
### Exceptions

The following exceptions may be raised when using JobSpy:

* `LinkedInException`
* `IndeedException`
* `ZipRecruiterException`
* `GlassdoorException`

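
One way to keep a script running when a single scrape fails is to wrap the call, as sketched below. The import path of the exception classes is not given in this README, so the sketch matches on the class names listed above rather than importing them; if you know the import path, catching the classes directly is cleaner. `scrape_or_none` is a hypothetical helper, not part of JobSpy.

```python
# The import path of the exception classes is not shown in this README, so this
# sketch matches on the class names listed above instead of importing them.
SCRAPER_EXCEPTIONS = {
    "LinkedInException",
    "IndeedException",
    "ZipRecruiterException",
    "GlassdoorException",
}


def scrape_or_none(scrape_fn, **kwargs):
    """Call scrape_fn(**kwargs); return None if a documented scraper exception is raised."""
    try:
        return scrape_fn(**kwargs)
    except Exception as exc:
        if type(exc).__name__ in SCRAPER_EXCEPTIONS:
            print(f"Scrape failed with {type(exc).__name__}: {exc}")
            return None
        raise  # anything else is unexpected, so let it propagate


# Usage, assuming `scrape_jobs` has been imported as in the Usage section:
# jobs = scrape_or_none(
#     scrape_jobs,
#     site_name=["indeed"],
#     search_term="software engineer",
#     country_indeed="USA",
# )
```
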
## Supported Countries for Job Searching
### **LinkedIn**
LinkedIn searches globally & uses only the `location` parameter. You can fetch at most 1,000 jobs from the LinkedIn endpoint we're using.
### **ZipRecruiter**
ZipRecruiter searches for jobs in **US/Canada** & uses only the `location` parameter.
### **Indeed / Glassdoor**

Indeed & Glassdoor support most countries, but the `country_indeed` parameter is required. Additionally, use the `location`
parameter to narrow down the location, e.g. city & state if necessary.

You can specify the following countries when searching on Indeed (use the exact name, * indicates support for Glassdoor):

|                      |              |            |                |
|----------------------|--------------|------------|----------------|
| Argentina            | Australia*   | Austria*   | Bahrain        |
| Belgium*             | Brazil*      | Canada*    | Chile          |
| China                | Colombia     | Costa Rica | Czech Republic |
| Denmark              | Ecuador      | Egypt      | Finland        |
| France*              | Germany*     | Greece     | Hong Kong*     |
| Hungary              | India*       | Indonesia  | Ireland*       |
| Israel               | Italy*       | Japan      | Kuwait         |
| Luxembourg           | Malaysia     | Mexico*    | Morocco        |
| Netherlands*         | New Zealand* | Nigeria    | Norway         |
| Oman                 | Pakistan     | Panama     | Peru           |
| Philippines          | Poland       | Portugal   | Qatar          |
| Romania              | Saudi Arabia | Singapore* | South Africa   |
| South Korea          | Spain*       | Sweden     | Switzerland*   |
| Taiwan               | Thailand     | Turkey     | Ukraine        |
| United Arab Emirates | UK*          | USA*       | Uruguay        |
| Venezuela            | Vietnam      |            |                |

Glassdoor can only fetch 900 jobs from the endpoint we're using on a given search.

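
The country and result limits in this section can be checked up front. The sketch below is a hypothetical helper (`plan_sites` is not part of JobSpy) that drops Glassdoor when the chosen country is not starred in the table and reports which sites cannot return as many results as requested. The country set is copied from the starred entries above, and the caps are the 1000 (LinkedIn) and 900 (Glassdoor) figures stated in this section.

```python
# Starred entries from the table above (countries that also work with Glassdoor).
GLASSDOOR_COUNTRIES = {
    "Australia", "Austria", "Belgium", "Brazil", "Canada", "France", "Germany",
    "Hong Kong", "India", "Ireland", "Italy", "Mexico", "Netherlands",
    "New Zealand", "Singapore", "Spain", "Switzerland", "UK", "USA",
}

# Per-search caps stated in this section.
MAX_RESULTS = {"linkedin": 1000, "glassdoor": 900}


def plan_sites(site_name, country_indeed, results_wanted):
    """Return (usable sites, {site: cap} for sites whose cap is below results_wanted)."""
    sites = [
        site
        for site in site_name
        if site != "glassdoor" or country_indeed in GLASSDOOR_COUNTRIES
    ]
    capped = {
        site: MAX_RESULTS[site]
        for site in sites
        if site in MAX_RESULTS and results_wanted > MAX_RESULTS[site]
    }
    return sites, capped


sites, capped = plan_sites(["indeed", "glassdoor", "linkedin"], "Japan", 1200)
print(sites)   # ['indeed', 'linkedin']
print(capped)  # {'linkedin': 1000}
```
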
## Frequently Asked Questions
---

**Q: Encountering issues with your queries?**

**A:** Try reducing the number of `results_wanted` and/or broadening the filters. If problems
persist, [submit an issue](https://github.com/Bunsly/JobSpy/issues).

---
**Q: Received a response code 429?**

**A:** This indicates that you have been blocked by the job board site for sending too many requests. All of the job board sites are aggressive with blocking. We recommend:

- Waiting a few seconds between requests.
- Trying a VPN or proxy to change your IP address.

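
The first recommendation can be automated with a small retry wrapper. The following is a minimal sketch of retrying with exponential backoff. `call_with_backoff` is a hypothetical helper, not part of JobSpy, and the delays are arbitrary starting points. It retries on any exception because this README does not say which exception a 429 response produces.

```python
import time


def call_with_backoff(fn, attempts=3, base_delay=5.0, sleep=time.sleep):
    """Call fn(); on failure wait base_delay, 2*base_delay, ... seconds and retry."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries, surface the last error
            sleep(base_delay * 2 ** attempt)  # double the wait each time


# Usage, assuming `scrape_jobs` has been imported as in the Usage section:
# jobs = call_with_backoff(
#     lambda: scrape_jobs(site_name=["linkedin"], search_term="software engineer")
# )
```
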
---

**Q: Experiencing a "Segmentation fault: 11" on macOS Catalina?**

**A:** This is due to the `tls_client` dependency not supporting your architecture. Solutions and workarounds include:

- Upgrading to a newer version of macOS
- Reaching out to the maintainers of [tls_client](https://github.com/bogdanfinn/tls-client) for fixes