# JobSpy AIO Scraper
## Features
- Scrapes job postings from **LinkedIn**, **Indeed** & **ZipRecruiter** simultaneously
- Returns jobs with title, location, company, description & other data
- Optional JWT authorization
### API
POST `/api/v1/jobs/`
### Request Schema
#### Example
<pre>
{
    "site_type": ["linkedin", "indeed"],
    "search_term": "software engineer",
    "location": "austin, tx",
    "distance": 10,
    "job_type": "fulltime",
    "results_wanted": 10
}
</pre>
#### Parameters:
##### Required
- **site_type**: _List[str]_ - `linkedin`, `zip_recruiter`, `indeed`
- **search_term**: _str_
##### Optional
- **location**: _str_
- **distance**: _int_
- **job_type**: _str_ - `fulltime`, `parttime`, `internship`, `contract`
- **is_remote**: _bool_
- **results_wanted**: _int_ (per `site_type`)
- **easy_apply**: _bool_ (only for `linkedin`)
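
A request like the example above can be built with Python's standard library. This is a minimal sketch and not part of the project: `build_jobs_request` and `fetch_jobs` are hypothetical helper names, and `BASE_URL` assumes the server is running locally on uvicorn's default port.

```python
import json
import urllib.request

# Assumption: the API is served locally on uvicorn's default port.
BASE_URL = "http://localhost:8000"


def build_jobs_request(site_type, search_term, **optional):
    """Build (without sending) a POST request for /api/v1/jobs/."""
    # Required parameters.
    payload = {"site_type": list(site_type), "search_term": search_term}
    # Optional parameters (location, distance, job_type, is_remote,
    # results_wanted, easy_apply) are included only when they are set.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return urllib.request.Request(
        BASE_URL + "/api/v1/jobs/",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_jobs(request, timeout=60):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With the server running, `fetch_jobs(build_jobs_request(["linkedin", "indeed"], "software engineer", location="austin, tx", results_wanted=10))` would send a request along the lines of the example. Building and sending are kept separate so the payload can be inspected without a live server.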
### Response Schema
#### Example
![image](https://github.com/cullenwatson/jobspy/assets/78247585/63b313db-ce25-41aa-9ffd-ae86af6f2a45)
#### JobResponse
- **success**: _bool_ - Indicates if the request was successful
- **error**: _str_
- **jobs**: _list[JobPost]_
  - #### JobPost
    - **title**: _str_
    - **company_name**: _str_
    - **job_url**: _str_
    - **location**: _object_ - (country, city, state, postal_code, address)
    - **description**: _str_
    - **job_type**: _str_ - `fulltime`, `parttime`, `internship`, `contract`
    - **compensation**: _object_ - Contains: `interval`, `min_amount`, `max_amount`, `currency`
    - **date_posted**: _str_
- **total_results**: _int_
- **returned_results**: _int_
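
A response with this shape can be consumed with plain dictionaries. The sketch below is illustrative only: `format_location` and `summarize_jobs` are hypothetical helpers, the sample data is made up rather than real API output, and it assumes that optional fields such as `location` may be missing or null.

```python
def format_location(location):
    """Join the location parts that are present, e.g. 'Austin, TX, USA'."""
    location = location or {}
    parts = [location.get("city"), location.get("state"), location.get("country")]
    return ", ".join(part for part in parts if part)


def summarize_jobs(response):
    """Return one 'title | company | location' line per job."""
    # A failed request carries its reason in the `error` field.
    if not response.get("success"):
        raise RuntimeError(response.get("error") or "request failed")
    return [
        f"{job['title']} | {job['company_name']} | {format_location(job.get('location'))}"
        for job in response.get("jobs") or []
    ]


# Made-up sample data following the JobResponse fields listed above.
sample = {
    "success": True,
    "error": None,
    "jobs": [
        {
            "title": "Software Engineer",
            "company_name": "Example Corp",
            "job_url": "https://example.com/jobs/1",
            "location": {"country": "USA", "city": "Austin", "state": "TX"},
        }
    ],
    "total_results": 1,
    "returned_results": 1,
}

print(summarize_jobs(sample))
```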
## Installation
_Python >= 3.10 required_

1. Clone this repository with `git clone https://github.com/cullenwatson/jobspy`
2. Install the dependencies with `pip install -r requirements.txt`
3. Run the server with `uvicorn main:app --reload`
## Usage
To see the interactive API documentation, visit [localhost:8000/docs](http://localhost:8000/docs).
For Postman integration:
- Import the Postman collection and environment JSON files from the `/postman/` folder.
## FAQ
### I'm having issues with my queries. What should I do?
Broadening your filters often helps. Also try lowering `results_wanted`.
If the problem persists, feel free to open an issue.
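
One way to apply this advice programmatically is to retry with progressively broader payloads. The sketch below is an assumption about how that could be done, not project behavior: `relaxed_payloads` is a hypothetical helper, and it assumes that omitting an optional filter means the filter is not applied.

```python
def relaxed_payloads(payload, min_results=1):
    """Yield progressively broader variants of a request payload."""
    current = dict(payload)
    # Step 1: ask for fewer results per site.
    wanted = current.get("results_wanted")
    if wanted and wanted > min_results:
        current["results_wanted"] = max(min_results, wanted // 2)
        yield dict(current)
    # Step 2: drop optional filters one at a time.
    for key in ("easy_apply", "is_remote", "job_type"):
        if key in current:
            del current[key]
            yield dict(current)
```

Each yielded payload can be sent in turn until one returns the results you need.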
### How do I enable auth?
Set `AUTH_REQUIRED` to `True` in `/settings.py`.

The auth uses [Supabase](https://supabase.com). Create a project with a `users` table and disable RLS (row-level security).

<img src="https://github.com/cullenwatson/jobspy/assets/78247585/03af18e1-5386-49ad-a2cf-d34232d9d747" width="500">

Add these three environment variables:
- `SUPABASE_URL`: go to project settings -> API -> Project URL
- `SUPABASE_KEY`: go to project settings -> API -> service_role secret
- `JWT_SECRET_KEY`: run `openssl rand -hex 32` in a terminal to generate a 32-byte secret key

Use these endpoints to register and get an access token:
![image](https://github.com/cullenwatson/jobspy/assets/78247585/c84c33ec-1fe8-4152-9c8c-6c4334aecfc3)
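
Before starting the server with auth enabled, it can help to confirm that the three environment variables listed above are set. The following is a minimal sketch and not part of the project: `missing_auth_vars` and `generate_jwt_secret` are hypothetical helpers, with `secrets.token_hex(32)` serving as a Python stand-in for `openssl rand -hex 32`.

```python
import os
import secrets

REQUIRED_VARS = ("SUPABASE_URL", "SUPABASE_KEY", "JWT_SECRET_KEY")


def missing_auth_vars(environ=os.environ):
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not environ.get(name)]


def generate_jwt_secret():
    """Return 32 random bytes encoded as 64 hex characters."""
    return secrets.token_hex(32)
```

Calling `missing_auth_vars()` with no arguments checks the current process environment; an empty list means all three variables are present.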