Compare commits

9 Commits

Author         SHA1        Message      Date
Cullen Watson  d52e366ef7  docs:readme  2024-11-26 15:51:26 -06:00
Cullen Watson  395ebf0017  docs:readme  2024-11-26 15:49:12 -06:00
Cullen Watson  63fddd9b7f  docs:readme  2024-11-26 15:48:22 -06:00
Cullen Watson  58956868ae  docs:readme  2024-11-26 15:47:10 -06:00
Cullen Watson  4fce836222  docs:readme  2024-10-28 03:53:59 -05:00
Cullen Watson  5ba25e7a7c  docs:readme  2024-10-28 03:42:19 -05:00
Cullen Watson  f7cb3e9206  docs:readme  2024-10-28 03:36:21 -05:00
Cullen Watson  3ad3f121f7  docs:readme  2024-10-28 03:34:52 -05:00
Cullen Watson  ff3c782912  docs:readme  2024-10-25 18:12:08 -05:00

109
README.md

@@ -8,8 +8,8 @@ work with us.*
 ## Features
 - Scrapes job postings from **LinkedIn**, **Indeed**, **Glassdoor**, **Google**, & **ZipRecruiter** simultaneously
-- Aggregates the job postings in a Pandas DataFrame
-- Proxies support
+- Aggregates the job postings in a dataframe
+- Proxies support to bypass blocking
 ![jobspy](https://github.com/cullenwatson/JobSpy/assets/78247585/ec7ef355-05f6-4fd3-8161-a817e31c5c57)
@@ -33,12 +33,11 @@ jobs = scrape_jobs(
 google_search_term="software engineer jobs near San Francisco, CA since yesterday",
 location="San Francisco, CA",
 results_wanted=20,
-hours_old=72, # (only Linkedin/Indeed is hour specific, others round up to days old)
-country_indeed='USA', # only needed for indeed / glassdoor
-# linkedin_fetch_description=True # get more info such as full description, direct job url for linkedin (slower)
+hours_old=72,
+country_indeed='USA',
+# linkedin_fetch_description=True # gets more info such as description, direct job url (slower)
 # proxies=["208.195.175.46:65095", "208.195.175.45:65095", "localhost"],
 )
 print(f"Found {len(jobs)} jobs")
 print(jobs.head())
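Reviewer note: the comment dropped from `hours_old` in this hunk stated that only LinkedIn and Indeed filter at hour granularity, while the other boards round up to whole days. A minimal sketch of that rounding, assuming a hypothetical helper `hours_to_days` that is not part of JobSpy (the library's internal conversion may differ):

```python
import math


def hours_to_days(hours_old: int) -> int:
    """Round an age limit in hours up to whole days (hypothetical helper)."""
    if hours_old <= 0:
        raise ValueError("hours_old must be positive")
    # Ceiling division: a 25-hour limit still needs a 2-day window to cover every posting.
    return math.ceil(hours_old / 24)


print(hours_to_days(72))  # 72 hours is exactly 3 days
print(hours_to_days(25))  # 25 hours rounds up to 2 days
```

Under this assumption, a day-granular board given `hours_old=25` may return postings up to 48 hours old, so results can be slightly older than requested.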
@@ -68,7 +67,7 @@ Optional
 ├── search_term (str)
 |
 ├── google_search_term (str)
-| search term for google jobs. This is is only param for filtering google jobs.
+| search term for google jobs. This is the only param for filtering google jobs.
 ├── location (str)
@@ -88,7 +87,7 @@ Optional
 | number of job results to retrieve for each site specified in 'site_name'
 ├── easy_apply (bool):
-| filters for jobs that are hosted on the job board site
+| filters for jobs that are hosted on the job board site (LinkedIn easy apply filter no longer works)
 ├── description_format (str):
 | markdown, html (Format type of the job descriptions. Default is markdown.)
@@ -133,46 +132,6 @@ Optional
 | - easy_apply
 ```
-### JobPost Schema
-```plaintext
-JobPost
-├── title
-├── company
-├── company_url
-├── job_url
-├── location
-│ ├── country
-│ ├── city
-│ ├── state
-├── description
-├── job_type: fulltime, parttime, internship, contract
-├── job_function
-│ ├── interval: yearly, monthly, weekly, daily, hourly
-│ ├── min_amount
-│ ├── max_amount
-│ ├── currency
-│ └── salary_source: direct_data, description (parsed from posting)
-├── date_posted
-├── emails
-└── is_remote
-Linkedin specific
-└── job_level
-Linkedin & Indeed specific
-└── company_industry
-Indeed specific
-├── company_country
-├── company_addresses
-├── company_employees_label
-├── company_revenue_label
-├── company_description
-└── company_logo
-```
 ## Supported Countries for Job Searching
 ### **LinkedIn**
@@ -219,7 +178,18 @@ You can specify the following countries when searching on Indeed (use the exact
 ---
 **Q: Why is Indeed giving unrelated roles?**
-**A:** Indeed is searching each one of your terms e.g. software intern, it searches software OR intern. Try search_term='"software intern"' in quotes for stricter searching
+**A:** Indeed searches the description too.
+- use - to remove words
+- "" for exact match
+Example of a good Indeed query
+```py
+search_term='"engineering intern" software summer (java OR python OR c++) 2025 -tax -marketing'
+```
+This searches the description/title and must include software, summer, 2025, one of the languages, engineering intern exactly, no tax, no marketing.
 ---
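Reviewer note: the new FAQ answer describes Indeed's query operators (quotes for an exact phrase, `OR` groups, a leading `-` to exclude a word). A minimal sketch of assembling such a string, assuming a hypothetical helper `build_indeed_query` that is not part of JobSpy:

```python
def build_indeed_query(exact=(), required=(), any_of=(), excluded=()):
    """Assemble an Indeed search_term string (hypothetical helper, not a JobSpy API)."""
    # Quoted phrases must match exactly.
    parts = [f'"{phrase}"' for phrase in exact]
    # Bare words must all appear in the title or description.
    parts.extend(required)
    # A parenthesized OR group matches if at least one alternative appears.
    if any_of:
        parts.append("(" + " OR ".join(any_of) + ")")
    # A leading minus removes postings containing the word.
    parts.extend(f"-{word}" for word in excluded)
    return " ".join(parts)


query = build_indeed_query(
    exact=["engineering intern"],
    required=["software", "summer", "2025"],
    any_of=["java", "python", "c++"],
    excluded=["tax", "marketing"],
)
print(query)
```

The resulting string contains the same terms as the README example (in a slightly different order) and could be passed as `search_term` to `scrape_jobs`.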
@@ -231,8 +201,41 @@ You can specify the following countries when searching on Indeed (use the exact
 ---
-**Q: Encountering issues with your queries?**
-**A:** Try reducing the number of `results_wanted` and/or broadening the filters. If problems
-persist, [submit an issue](https://github.com/Bunsly/JobSpy/issues).
----
+### JobPost Schema
+```plaintext
+JobPost
+├── title
+├── company
+├── company_url
+├── job_url
+├── location
+│ ├── country
+│ ├── city
+│ ├── state
+├── description
+├── job_type: fulltime, parttime, internship, contract
+├── job_function
+│ ├── interval: yearly, monthly, weekly, daily, hourly
+│ ├── min_amount
+│ ├── max_amount
+│ ├── currency
+│ └── salary_source: direct_data, description (parsed from posting)
+├── date_posted
+├── emails
+└── is_remote
+Linkedin specific
+└── job_level
+Linkedin & Indeed specific
+└── company_industry
+Indeed specific
+├── company_country
+├── company_addresses
+├── company_employees_label
+├── company_revenue_label
+├── company_description
+└── company_logo
+```
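Reviewer note: the relocated JobPost schema lists fields such as `is_remote` and `max_amount`. A minimal sketch of filtering on them, assuming each posting is available as a plain dict keyed by those field names (for example via pandas' `DataFrame.to_dict("records")`). The helper name and the sample rows are made up for illustration, and the sketch ignores `interval`, so it treats every amount as comparable:

```python
def remote_jobs_paying_at_least(rows, min_salary):
    """Keep remote postings whose max_amount meets a threshold (illustrative only)."""
    selected = []
    for row in rows:
        max_amount = row.get("max_amount")
        # Salary fields are often missing, so skip rows without a usable number.
        if row.get("is_remote") and max_amount is not None and max_amount >= min_salary:
            selected.append(row)
    return selected


# Made-up sample rows using field names from the schema above.
sample = [
    {"title": "Backend Engineer", "is_remote": True, "max_amount": 150000},
    {"title": "Data Analyst", "is_remote": False, "max_amount": 120000},
    {"title": "QA Intern", "is_remote": True, "max_amount": None},
]
matches = remote_jobs_paying_at_least(sample, 100000)
print([row["title"] for row in matches])
```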