mirror of
https://github.com/Bunsly/JobSpy.git
synced 2026-03-05 03:54:31 -08:00
Compare commits: 1.1.75...d52e366ef7 (9 commits)
| Author | SHA1 | Date |
|---|---|---|
| | d52e366ef7 | |
| | 395ebf0017 | |
| | 63fddd9b7f | |
| | 58956868ae | |
| | 4fce836222 | |
| | 5ba25e7a7c | |
| | f7cb3e9206 | |
| | 3ad3f121f7 | |
| | ff3c782912 | |
109
README.md
@@ -8,8 +8,8 @@ work with us.*
 ## Features
 
 - Scrapes job postings from **LinkedIn**, **Indeed**, **Glassdoor**, **Google**, & **ZipRecruiter** simultaneously
-- Aggregates the job postings in a Pandas DataFrame
-- Proxies support
+- Aggregates the job postings in a dataframe
+- Proxies support to bypass blocking
 
 
 
@@ -33,12 +33,11 @@ jobs = scrape_jobs(
     google_search_term="software engineer jobs near San Francisco, CA since yesterday",
     location="San Francisco, CA",
     results_wanted=20,
-    hours_old=72, # (only Linkedin/Indeed is hour specific, others round up to days old)
-    country_indeed='USA', # only needed for indeed / glassdoor
+    hours_old=72,
+    country_indeed='USA',
 
-    # linkedin_fetch_description=True # get more info such as full description, direct job url for linkedin (slower)
+    # linkedin_fetch_description=True # gets more info such as description, direct job url (slower)
     # proxies=["208.195.175.46:65095", "208.195.175.45:65095", "localhost"],
-
 )
 print(f"Found {len(jobs)} jobs")
 print(jobs.head())
@@ -68,7 +67,7 @@ Optional
 ├── search_term (str)
 |
 ├── google_search_term (str)
-| search term for google jobs. This is is only param for filtering google jobs.
+| search term for google jobs. This is the only param for filtering google jobs.
 │
 ├── location (str)
 │
@@ -88,7 +87,7 @@ Optional
 | number of job results to retrieve for each site specified in 'site_name'
 │
 ├── easy_apply (bool):
-| filters for jobs that are hosted on the job board site
+| filters for jobs that are hosted on the job board site (LinkedIn easy apply filter no longer works)
 │
 ├── description_format (str):
 | markdown, html (Format type of the job descriptions. Default is markdown.)
@@ -133,46 +132,6 @@ Optional
 | - easy_apply
 ```
 
-
-### JobPost Schema
-
-```plaintext
-JobPost
-├── title
-├── company
-├── company_url
-├── job_url
-├── location
-│ ├── country
-│ ├── city
-│ ├── state
-├── description
-├── job_type: fulltime, parttime, internship, contract
-├── job_function
-│ ├── interval: yearly, monthly, weekly, daily, hourly
-│ ├── min_amount
-│ ├── max_amount
-│ ├── currency
-│ └── salary_source: direct_data, description (parsed from posting)
-├── date_posted
-├── emails
-└── is_remote
-
-Linkedin specific
-└── job_level
-
-Linkedin & Indeed specific
-└── company_industry
-
-Indeed specific
-├── company_country
-├── company_addresses
-├── company_employees_label
-├── company_revenue_label
-├── company_description
-└── company_logo
-```
-
 ## Supported Countries for Job Searching
 
 ### **LinkedIn**
@@ -219,7 +178,18 @@ You can specify the following countries when searching on Indeed (use the exact
 
 ---
 **Q: Why is Indeed giving unrelated roles?**
-**A:** Indeed is searching each one of your terms e.g. software intern, it searches software OR intern. Try search_term='"software intern"' in quotes for stricter searching
+**A:** Indeed searches the description too.
+
+- use - to remove words
+- "" for exact match
+
+Example of a good Indeed query
+
+```py
+search_term='"engineering intern" software summer (java OR python OR c++) 2025 -tax -marketing'
+```
+
+This searches the description/title and must include software, summer, 2025, one of the languages, engineering intern exactly, no tax, no marketing.
 
 ---
 
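
The Indeed query syntax added in the hunk above (quotes for an exact match, `-` to remove words, a parenthesized `OR` group) can be assembled programmatically. The following is only a sketch: `build_indeed_query` and its parameter names are hypothetical helpers, not part of JobSpy, and the term ordering it produces differs slightly from the README example.

```python
def build_indeed_query(exact=(), required=(), any_of=(), excluded=()):
    """Assemble a search_term string using the operators described in the FAQ.

    Hypothetical helper for illustration; not a JobSpy function.
    """
    parts = [f'"{phrase}"' for phrase in exact]           # "" for exact match
    parts.extend(required)                                 # plain required words
    if any_of:
        parts.append("(" + " OR ".join(any_of) + ")")     # one of these words
    parts.extend(f"-{word}" for word in excluded)          # - to remove words
    return " ".join(parts)


query = build_indeed_query(
    exact=["engineering intern"],
    required=["software", "summer", "2025"],
    any_of=["java", "python", "c++"],
    excluded=["tax", "marketing"],
)
print(query)
```

The resulting string could then be passed as `search_term` to `scrape_jobs`, as in the README usage example.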
@@ -231,8 +201,41 @@ You can specify the following countries when searching on Indeed (use the exact
 
 ---
 
-**Q: Encountering issues with your queries?**
-**A:** Try reducing the number of `results_wanted` and/or broadening the filters. If problems
-persist, [submit an issue](https://github.com/Bunsly/JobSpy/issues).
+### JobPost Schema
 
----
+```plaintext
+JobPost
+├── title
+├── company
+├── company_url
+├── job_url
+├── location
+│ ├── country
+│ ├── city
+│ ├── state
+├── description
+├── job_type: fulltime, parttime, internship, contract
+├── job_function
+│ ├── interval: yearly, monthly, weekly, daily, hourly
+│ ├── min_amount
+│ ├── max_amount
+│ ├── currency
+│ └── salary_source: direct_data, description (parsed from posting)
+├── date_posted
+├── emails
+└── is_remote
+
+Linkedin specific
+└── job_level
+
+Linkedin & Indeed specific
+└── company_industry
+
+Indeed specific
+├── company_country
+├── company_addresses
+├── company_employees_label
+├── company_revenue_label
+├── company_description
+└── company_logo
+```
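
The schema above lists an `interval` (yearly, monthly, weekly, daily, hourly) alongside `min_amount` and `max_amount`, so amounts from different postings are not directly comparable. A minimal sketch of normalizing them to a yearly figure follows. The `annualize` helper and the multipliers (52 weeks, 260 working days, 2080 working hours per year) are assumptions for illustration, not values taken from JobSpy.

```python
# Assumed working-time conventions; not taken from JobSpy itself.
PERIODS_PER_YEAR = {
    "yearly": 1,
    "monthly": 12,
    "weekly": 52,
    "daily": 260,    # 5 working days x 52 weeks
    "hourly": 2080,  # 40 hours x 52 weeks
}


def annualize(amount, interval):
    """Convert a pay amount to an approximate yearly figure.

    Returns None when the amount is missing or the interval is unknown.
    """
    if amount is None or interval not in PERIODS_PER_YEAR:
        return None
    return amount * PERIODS_PER_YEAR[interval]


print(annualize(50, "hourly"))
print(annualize(4000, "monthly"))
```

Returning `None` for unknown intervals keeps missing salary data distinguishable from a real value of zero.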