Jobs scraper library for LinkedIn, Indeed, Glassdoor & ZipRecruiter
Yariv Menachem 6e841ffc22 added step and updated some messages 2025-01-06 16:09:16 +02:00
.github/workflows fix yml 2024-12-04 16:52:15 -06:00
.vscode fixed runing main 2024-12-11 16:38:45 +02:00
src added step and updated some messages 2025-01-06 16:09:16 +02:00
tests restructure project 2025-01-06 15:10:03 +02:00
.gitignore added pydantic settings 2025-01-02 14:07:44 +02:00
.pre-commit-config.yaml format: Apply Black formatter to the codebase (#127) 2024-03-10 23:36:27 -05:00
LICENSE docs: Create LICENSE 2023-08-26 18:47:48 -05:00
README.md removed chat id env variable 2025-01-01 17:46:39 +02:00
increment_version.py enh:auto update version 2024-12-04 16:29:38 -06:00
poetry.lock build(deps): bump markdownify to 0.13.1 (#211) 2024-10-20 00:18:44 -05:00
pyproject.toml restructure project 2025-01-06 15:10:03 +02:00
requirements.txt all data saved to db at the last step, 2025-01-05 17:41:48 +02:00

README.md

JobSeekerTG Bot

JobSeekerTG is a Telegram bot that scrapes job postings from LinkedIn, Indeed, Glassdoor, and other platforms. The project is still under development. It searches by job title and location, converts the results into a structured format, and saves them to a MongoDB database. New job posts are automatically sent to a designated Telegram bot chat.

This project is based on the JobSpy project. Credit goes to its original creator.

Features

  • Job scraping: Collects job postings from multiple job platforms.
  • Structured data: Reformats job data into a structured format for easy processing and storage.
  • Database storage: Saves job data into a MongoDB database.
  • Telegram integration: Sends new job postings directly to a Telegram bot chat.
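
Taken together, the features above imply a simple flow: scrape, keep only the postings that have not been seen before, store them, and notify the chat. The sketch below illustrates that flow under stated assumptions. The names `select_new_jobs` and `process_batch` are hypothetical, an in-memory set stands in for the MongoDB collection, and `notify` stands in for the Telegram call. It is not the project's actual implementation.

```python
from typing import Callable, Dict, Iterable, List, Set

Job = Dict[str, str]


def select_new_jobs(jobs: Iterable[Job], seen_urls: Set[str]) -> List[Job]:
    """Return jobs whose job_url is not in seen_urls, marking each as seen."""
    new_jobs = []
    for job in jobs:
        url = job["job_url"]
        if url in seen_urls:
            continue  # already stored or sent earlier, or a duplicate in this batch
        seen_urls.add(url)
        new_jobs.append(job)
    return new_jobs


def process_batch(
    jobs: Iterable[Job],
    seen_urls: Set[str],
    notify: Callable[[Job], None],
) -> int:
    """Pass each unseen job to notify and return how many were passed on."""
    new_jobs = select_new_jobs(jobs, seen_urls)
    for job in new_jobs:
        notify(job)  # assumption: the real bot would send a Telegram message here
    return len(new_jobs)
```

Using `job_url` as the deduplication key is an assumption; any field that uniquely identifies a posting would serve the same purpose.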

Data Structure

The scraped job postings are stored in the following format:

JobPost
├── title
├── company
├── company_url
├── job_url
├── location
│   ├── country
│   ├── city
│   └── state
├── description
├── job_type: fulltime, parttime, internship, contract
├── job_function
│   ├── interval: yearly, monthly, weekly, daily, hourly
│   ├── min_amount
│   ├── max_amount
│   ├── currency
│   └── salary_source: direct_data, description (parsed from posting)
├── date_posted
├── emails
└── is_remote
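
One way to mirror the tree above in Python is with dataclasses. This is a minimal sketch, not the project's actual models. The class names `Location` and `JobFunction`, the field types, the default values, and the choice of which fields are required are all assumptions; only the field names come from the tree.

```python
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class Location:
    country: Optional[str] = None
    city: Optional[str] = None
    state: Optional[str] = None


@dataclass
class JobFunction:
    interval: Optional[str] = None  # yearly, monthly, weekly, daily, hourly
    min_amount: Optional[float] = None
    max_amount: Optional[float] = None
    currency: Optional[str] = None
    salary_source: Optional[str] = None  # direct_data or description


@dataclass
class JobPost:
    # Assumption: title, company, and job_url are treated as required here.
    title: str
    company: str
    job_url: str
    company_url: Optional[str] = None
    location: Location = field(default_factory=Location)
    description: Optional[str] = None
    job_type: Optional[str] = None  # fulltime, parttime, internship, contract
    job_function: JobFunction = field(default_factory=JobFunction)
    date_posted: Optional[str] = None  # assumption: ISO date string such as "2025-01-06"
    emails: List[str] = field(default_factory=list)
    is_remote: bool = False
```

Calling `asdict(post)` on an instance returns a nested dictionary with the same shape as the tree, which is convenient for document-style storage.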

Prerequisites

  • Python 3.8+
  • MongoDB
  • Telegram bot token (create a bot via BotFather)

Installation

  1. Clone the repository:

    git clone https://github.com/yariv245/JobSeeker.git
    cd JobSeeker
    
  2. Install dependencies:

    pip install -r requirements.txt
    
  3. Set up environment variables: Create a .env file in the root directory with the following:

    TELEGRAM_BOT_TOKEN=your_telegram_bot_token
    MONGO_URI=your_mongodb_connection_string
    
  4. Run the bot:

    python bot.py
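
Because the bot depends on both variables from step 3, it can help to fail early when one is missing. The following is a sketch using only the standard library; the helper names `parse_env_text` and `load_settings` are hypothetical, and the project itself may load its settings differently.

```python
import os
from typing import Dict, Mapping, Optional

REQUIRED_VARS = ("TELEGRAM_BOT_TOKEN", "MONGO_URI")


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may themselves contain "=".
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return the required settings, raising RuntimeError if any are missing or empty."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

For example, `load_settings(parse_env_text(open(".env").read()))` would validate the file from step 3, while `load_settings()` with no argument checks the process environment instead.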
    

Usage

  • Add the bot to a Telegram group or chat.
  • Start the bot to receive job postings as they are scraped.

Testing

This project includes tests that check that data scraping, formatting, and Telegram integration work as expected. Run the tests using:

pytest

Ensure you have the necessary environment variables and mock data set up before running the tests.
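
As an illustration of testing with mock data, the sketch below checks a message formatter against a hand-written posting, with no network or database access. The function `format_job_message` and the mock values are hypothetical and are not taken from the project's test suite.

```python
from typing import Dict

# Mock posting used in place of real scraped data.
MOCK_JOB = {
    "title": "Backend Developer",
    "company": "Acme",
    "job_url": "https://example.com/jobs/1",
}


def format_job_message(job: Dict[str, str]) -> str:
    """Build the text for a chat message: title and company, then the link."""
    return f"{job['title']} at {job['company']}\n{job['job_url']}"


def test_message_contains_title_company_and_link():
    message = format_job_message(MOCK_JOB)
    assert message.splitlines()[0] == "Backend Developer at Acme"
    assert message.splitlines()[1] == "https://example.com/jobs/1"


def test_message_has_exactly_two_lines():
    assert len(format_job_message(MOCK_JOB).splitlines()) == 2
```

Saved in a file whose name starts with `test_`, these functions are collected automatically by `pytest`.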

Contributing

Contributions are welcome! Please follow these steps:

  1. Fork the repository.
  2. Create a new branch (git checkout -b feature/your-feature-name).
  3. Commit your changes (git commit -m 'Add some feature').
  4. Push to the branch (git push origin feature/your-feature-name).
  5. Open a pull request.

Acknowledgments

  • JobSpy for inspiring this project.