# <img src="https://github.com/cullenwatson/JobSpy/assets/78247585/2f61a059-9647-4a9c-bfb9-e3a9448bdc6a" style="vertical-align: sub; margin-right: 5px;"> JobSpy
**JobSpy** is a simple, yet comprehensive, job scraping library.
## Features
- Scrapes job postings from **LinkedIn**, **Indeed** & **ZipRecruiter** simultaneously
- Aggregates the job postings in a pandas DataFrame

![jobspy](https://github.com/cullenwatson/JobSpy/assets/78247585/ec7ef355-05f6-4fd3-8161-a817e31c5c57)
### Installation
`pip install python-jobspy`

_Python version >= [3.10](https://www.python.org/downloads/release/python-3100/) required_
### Usage
```python
from jobspy import scrape_jobs
import pandas as pd

jobs: pd.DataFrame = scrape_jobs(
    site_name=["indeed", "linkedin", "zip_recruiter"],
    search_term="software engineer",
    results_wanted=10
)

if jobs.empty:
    print("No jobs found.")
else:
    # 1: print
    pd.set_option('display.max_columns', None)
    pd.set_option('display.max_rows', None)
    pd.set_option('display.width', None)
    pd.set_option('display.max_colwidth', 50)  # set to None to see full job url / desc
    print(jobs)

    # 2: display in Jupyter Notebook
    # display(jobs)

    # 3: output to .csv
    # jobs.to_csv('jobs.csv', index=False)
```
### Output
```
SITE           TITLE                             COMPANY_NAME      CITY          STATE  JOB_TYPE  INTERVAL  MIN_AMOUNT  MAX_AMOUNT  JOB_URL                                            DESCRIPTION
indeed         Software Engineer                 AMERICAN SYSTEMS  Arlington     VA     None      yearly    200000      150000      https://www.indeed.com/viewjob?jk=5e409e577046...  THIS POSITION COMES WITH A 10K SIGNING BONUS!...
indeed         Senior Software Engineer          TherapyNotes.com  Philadelphia  PA     fulltime  yearly    135000      110000      https://www.indeed.com/viewjob?jk=da39574a40cb...  About Us TherapyNotes is the national leader i...
linkedin       Software Engineer - Early Career  Lockheed Martin   Sunnyvale     CA     fulltime  yearly    None        None        https://www.linkedin.com/jobs/view/3693012711      Description:By bringing together people that u...
linkedin       Full-Stack Software Engineer      Rain              New York      NY     fulltime  yearly    None        None        https://www.linkedin.com/jobs/view/3696158877      Rains mission is to create the fastest and ea...
zip_recruiter  Software Engineer - New Grad      ZipRecruiter      Santa Monica  CA     fulltime  yearly    130000      150000      https://www.ziprecruiter.com/jobs/ziprecruiter...  We offer a hybrid work environment. Most US-ba...
zip_recruiter  Software Developer                TEKsystems        Phoenix       AZ     fulltime  hourly    65          75          https://www.ziprecruiter.com/jobs/teksystems-0...  Top Skills' Details• 6 years of Java developme...
```
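The sample output mixes yearly and hourly pay, and some rows list the larger amount first, so comparing salaries across rows takes a little normalization. Below is a minimal sketch of one way to do that on plain dictionaries. The helper names (`to_yearly`, `yearly_range`) are hypothetical and not part of JobSpy, the dictionary keys simply copy the header names printed above (actual DataFrame column names may differ), and the conversion factors assume a 40-hour, 5-day, 52-week schedule.

```python
# Assumed conversion factors for a full-time schedule; adjust as needed.
PERIODS_PER_YEAR = {
    "yearly": 1,
    "monthly": 12,
    "weekly": 52,
    "daily": 260,    # 5 days x 52 weeks
    "hourly": 2080,  # 40 hours x 52 weeks
}


def to_yearly(amount, interval):
    """Return the yearly equivalent of amount, or None if it cannot be computed."""
    if amount is None or interval not in PERIODS_PER_YEAR:
        return None
    return amount * PERIODS_PER_YEAR[interval]


def yearly_range(row):
    """Return (low, high) in yearly terms for one dict-shaped row."""
    low = to_yearly(row.get("MIN_AMOUNT"), row.get("INTERVAL"))
    high = to_yearly(row.get("MAX_AMOUNT"), row.get("INTERVAL"))
    # Some rows list the larger figure first, so order the pair defensively.
    if low is not None and high is not None and low > high:
        low, high = high, low
    return low, high


# Three rows taken from the sample output above (only the pay columns).
rows = [
    {"TITLE": "Software Engineer", "INTERVAL": "yearly", "MIN_AMOUNT": 200000, "MAX_AMOUNT": 150000},
    {"TITLE": "Software Developer", "INTERVAL": "hourly", "MIN_AMOUNT": 65, "MAX_AMOUNT": 75},
    {"TITLE": "Full-Stack Software Engineer", "INTERVAL": "yearly", "MIN_AMOUNT": None, "MAX_AMOUNT": None},
]

for row in rows:
    print(row["TITLE"], yearly_range(row))
```

Rows without pay data come back as `(None, None)`, so they can be filtered out before sorting.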
### Parameters for `scrape_jobs()`
```plaintext
Required
├── site_name (List[enum]): linkedin, zip_recruiter, indeed
└── search_term (str)
Optional
├── location (str)
├── distance (int): in miles
├── job_type (enum): fulltime, parttime, internship, contract
├── is_remote (bool)
├── results_wanted (int): number of job results to retrieve for each site specified in 'site_name'
└── easy_apply (bool): filters for jobs on LinkedIn that have the 'Easy Apply' option
```
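As an illustration of how the optional parameters combine, here is a small sketch that checks arguments against the values listed above before any request is made. `build_query` is a hypothetical helper, not part of JobSpy. It uses the `site_name` keyword from the usage example and assumes `location` is a free-text string such as `"Austin, TX"`.

```python
ALLOWED_SITES = {"linkedin", "zip_recruiter", "indeed"}
ALLOWED_JOB_TYPES = {"fulltime", "parttime", "internship", "contract"}


def build_query(site_name, search_term, **optional):
    """Validate arguments against the parameter list and return a kwargs dict."""
    unknown_sites = set(site_name) - ALLOWED_SITES
    if unknown_sites:
        raise ValueError(f"unsupported site(s): {sorted(unknown_sites)}")
    if not search_term:
        raise ValueError("search_term is required")
    job_type = optional.get("job_type")
    if job_type is not None and job_type not in ALLOWED_JOB_TYPES:
        raise ValueError(f"unsupported job_type: {job_type!r}")
    return {"site_name": list(site_name), "search_term": search_term, **optional}


query = build_query(
    ["indeed", "linkedin"],
    "software engineer",
    location="Austin, TX",  # illustrative value, format is an assumption
    distance=25,            # miles
    job_type="fulltime",
    results_wanted=5,       # per site
)
print(query)

# With python-jobspy installed and network access available:
# jobs = scrape_jobs(**query)
```

Catching a typo such as `"ziprecruiter"` locally avoids spending a request on a query that cannot succeed.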
### JobPost Schema
```plaintext
JobPost
├── title (str)
├── company_name (str)
├── job_url (str)
├── location (object)
│   ├── country (str)
│   ├── city (str)
│   └── state (str)
├── description (str)
├── job_type (enum)
├── compensation (object)
│   ├── interval (CompensationInterval): yearly, monthly, weekly, daily, hourly
│   ├── min_amount (float)
│   ├── max_amount (float)
│   └── currency (str)
└── date_posted (datetime)
```
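For readers who prefer code to a tree, the schema can be mirrored with standard-library dataclasses. This is only a sketch of the shape shown above, not JobSpy's actual class definitions. Which fields are optional, the `JobType` enum name, the `"USD"` default, and the `"USA"` country value are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Optional


class CompensationInterval(Enum):
    YEARLY = "yearly"
    MONTHLY = "monthly"
    WEEKLY = "weekly"
    DAILY = "daily"
    HOURLY = "hourly"


class JobType(Enum):  # hypothetical name; values come from the parameter list
    FULLTIME = "fulltime"
    PARTTIME = "parttime"
    INTERNSHIP = "internship"
    CONTRACT = "contract"


@dataclass
class Location:
    country: str
    city: Optional[str] = None
    state: Optional[str] = None


@dataclass
class Compensation:
    interval: CompensationInterval
    min_amount: Optional[float] = None
    max_amount: Optional[float] = None
    currency: str = "USD"  # assumed default


@dataclass
class JobPost:
    title: str
    company_name: str
    job_url: str
    location: Location
    description: Optional[str] = None
    job_type: Optional[JobType] = None  # the sample output shows None here
    compensation: Optional[Compensation] = None
    date_posted: Optional[datetime] = None


# One row from the sample output, expressed with these classes.
post = JobPost(
    title="Software Developer",
    company_name="TEKsystems",
    job_url="https://www.ziprecruiter.com/jobs/teksystems-0...",  # truncated as printed
    location=Location(country="USA", city="Phoenix", state="AZ"),
    job_type=JobType("fulltime"),
    compensation=Compensation(CompensationInterval("hourly"), 65.0, 75.0),
)
print(post)
```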
## Frequently Asked Questions

---

**Q: Encountering issues with your queries?**

**A:** Try reducing `results_wanted` and/or broadening the filters. If problems persist, [submit an issue](#).

---

**Q: Received a response code 429?**

**A:** This means the job board site has blocked you for sending too many requests. Currently, **ZipRecruiter** is particularly aggressive with blocking. We recommend:

- Waiting a few seconds between requests.
- Trying a VPN to change your IP address.

**Note:** Proxy support is in development and coming soon!

---
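The advice to wait between requests can be sketched as a retry loop with exponential backoff. `retry_with_backoff` is a hypothetical helper, not part of JobSpy, and it assumes a blocked request surfaces as a Python exception, which this README does not confirm. The demo uses a stand-in function instead of a real scrape so that it runs without network access.

```python
import time


def retry_with_backoff(fetch, retries=3, base_delay=2.0, sleep=time.sleep):
    """Call fetch(); after each failure wait base_delay * 2**attempt seconds and retry."""
    for attempt in range(retries + 1):
        try:
            return fetch()
        except Exception:  # assumption: a 429 shows up as an exception
            if attempt == retries:
                raise  # out of retries, let the caller see the error
            sleep(base_delay * 2 ** attempt)


# Demo: a stand-in that fails twice, then succeeds.
calls = {"count": 0}
waits = []  # records the delays instead of actually sleeping


def flaky_fetch():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("429 Too Many Requests")
    return ["job-1", "job-2"]


result = retry_with_backoff(flaky_fetch, sleep=waits.append)
print(result, waits)

# With python-jobspy installed, the same wrapper could be tried around a real call:
# jobs = retry_with_backoff(
#     lambda: scrape_jobs(site_name=["zip_recruiter"], search_term="software engineer")
# )
```

Doubling the delay after each failure keeps the total number of requests low while the block is in effect.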