Jobs scraper library for LinkedIn, Indeed, Glassdoor & ZipRecruiter
Yariv Menachem 61cb80839c updated command in Telegram 2024-12-29 11:10:17 +02:00
.github/workflows fix yml 2024-12-04 16:52:15 -06:00
.vscode fixed runing main 2024-12-11 16:38:45 +02:00
src updated command in Telegram 2024-12-29 11:10:17 +02:00
.gitignore created main 2024-12-08 17:30:06 +02:00
.pre-commit-config.yaml format: Apply Black formatter to the codebase (#127) 2024-03-10 23:36:27 -05:00
LICENSE docs: Create LICENSE 2023-08-26 18:47:48 -05:00
README.md Update README.md 2024-12-15 15:34:15 +02:00
increment_version.py enh:auto update version 2024-12-04 16:29:38 -06:00
poetry.lock build(deps): bump markdownify to 0.13.1 (#211) 2024-10-20 00:18:44 -05:00
pyproject.toml Increment version 2024-12-04 22:55:06 +00:00
requirements.txt requirements.txt added 2024-12-11 11:33:45 +02:00


JobSeekerTG Bot

JobSeekerTG is a Telegram bot that scrapes job postings from platforms such as LinkedIn, Indeed, and Glassdoor, with support for others currently under development. It collects job data by title and location, converts it into a structured format, and saves it to a MongoDB database. New job posts are automatically sent to a designated Telegram chat.

This project is based on the JobSpy project. Credit goes to its original creator.

Features

  • Job scraping: Collects job postings from multiple job platforms.
  • Structured data: Converts scraped job data into a consistent structure for easy processing and storage.
  • Database storage: Saves job data into a MongoDB database.
  • Telegram integration: Sends new job postings directly to a Telegram bot chat.
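
Taken together, the features above form a small pipeline: scrape, structure, store, notify. The sketch below illustrates only the "send new postings" step, under the assumption that a job's URL serves as its unique key. The in-memory `seen_urls` set stands in for the MongoDB collection, and `filter_new_jobs` is a hypothetical helper, not a function from this repository.

```python
from typing import Dict, Iterable, List, Set


def filter_new_jobs(jobs: Iterable[Dict[str, str]], seen_urls: Set[str]) -> List[Dict[str, str]]:
    """Return jobs whose job_url has not been seen yet, and record them as seen."""
    new_jobs = []
    for job in jobs:
        url = job["job_url"]
        if url in seen_urls:
            continue  # already stored, so it was already sent
        seen_urls.add(url)
        new_jobs.append(job)
    return new_jobs


# In-memory stand-in for the stored job URLs (assumption for illustration).
seen: Set[str] = set()

first_batch = [
    {"title": "Backend Developer", "job_url": "https://example.com/jobs/1"},
    {"title": "Data Engineer", "job_url": "https://example.com/jobs/2"},
]
second_batch = [
    {"title": "Data Engineer", "job_url": "https://example.com/jobs/2"},
    {"title": "QA Engineer", "job_url": "https://example.com/jobs/3"},
]

print([job["title"] for job in filter_new_jobs(first_batch, seen)])
print([job["title"] for job in filter_new_jobs(second_batch, seen)])
```

With a real database, the same check would more likely be a unique index or an upsert on the job URL, so that duplicates are rejected by the store itself.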

Data Structure

The scraped job postings are stored in the following format:

JobPost
├── title
├── company
├── company_url
├── job_url
├── location
│   ├── country
│   ├── city
│   └── state
├── description
├── job_type: fulltime, parttime, internship, contract
├── job_function
│   ├── interval: yearly, monthly, weekly, daily, hourly
│   ├── min_amount
│   ├── max_amount
│   ├── currency
│   └── salary_source: direct_data, description (parsed from posting)
├── date_posted
├── emails
└── is_remote
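
As a rough illustration, the tree above could be modeled with Python dataclasses. The field names follow the tree; the class names `Location` and `JobFunction`, the choice of optional fields, and the defaults are assumptions made for this sketch and may not match the project's actual model.

```python
from dataclasses import asdict, dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Location:
    country: Optional[str] = None
    city: Optional[str] = None
    state: Optional[str] = None


@dataclass
class JobFunction:
    # Field names mirror the tree; the class name is an assumption.
    interval: Optional[str] = None       # yearly, monthly, weekly, daily, hourly
    min_amount: Optional[float] = None
    max_amount: Optional[float] = None
    currency: Optional[str] = None
    salary_source: Optional[str] = None  # direct_data or description


@dataclass
class JobPost:
    title: str
    company: str
    job_url: str
    company_url: Optional[str] = None
    location: Location = field(default_factory=Location)
    description: Optional[str] = None
    job_type: Optional[str] = None       # fulltime, parttime, internship, contract
    job_function: JobFunction = field(default_factory=JobFunction)
    date_posted: Optional[date] = None
    emails: List[str] = field(default_factory=list)
    is_remote: bool = False


post = JobPost(
    title="Backend Developer",
    company="Example Ltd",
    job_url="https://example.com/jobs/1",
    location=Location(country="Israel", city="Tel Aviv"),
    job_type="fulltime",
    is_remote=True,
)

# asdict() converts nested dataclasses into nested dictionaries.
document = asdict(post)
print(document["location"])
```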

Prerequisites

  • Python 3.8+
  • MongoDB
  • Telegram bot token (create a bot via BotFather)

Installation

  1. Clone the repository:

    git clone https://github.com/yariv245/JobSeeker.git
    cd JobSeeker
    
  2. Install dependencies:

    pip install -r requirements.txt
    
  3. Set up environment variables: Create a .env file in the root directory with the following:

    TELEGRAM_BOT_TOKEN=your_telegram_bot_token
    MONGO_URI=your_mongodb_connection_string
    TELEGRAM_CHAT_ID=your_telegram_chat_id
    
  4. Run the bot:

    python bot.py
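
The three variables from step 3 must be present in the process environment when the bot starts. How the .env file is loaded is not stated here (a package such as python-dotenv is one common option, but that is an assumption). The sketch below only shows a fail-fast check for the variables; `load_settings` and `REQUIRED_VARS` are hypothetical names, not part of this repository.

```python
import os
from typing import Dict, Mapping

# Variable names taken from the .env example in step 3.
REQUIRED_VARS = ("TELEGRAM_BOT_TOKEN", "MONGO_URI", "TELEGRAM_CHAT_ID")


def load_settings(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return the required settings, or raise if any is missing or empty."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}


# Example with an explicit mapping instead of the real environment.
example_env = {
    "TELEGRAM_BOT_TOKEN": "your_telegram_bot_token",
    "MONGO_URI": "your_mongodb_connection_string",
    "TELEGRAM_CHAT_ID": "your_telegram_chat_id",
}
settings = load_settings(example_env)
print(sorted(settings))
```

Failing at startup with the list of missing names tends to be easier to debug than a connection error that surfaces later.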
    

Usage

  • Add the bot to a Telegram group or chat.
  • Start the bot to receive job postings as they are scraped.
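
For orientation, delivering a posting to a chat comes down to one call to the Telegram Bot API's `sendMessage` method. The sketch below uses only the standard library and may differ from how this project actually sends messages. The function names and the message layout are assumptions, and the token and chat ID are placeholders.

```python
import json
import urllib.parse
import urllib.request


def format_job_message(title: str, company: str, job_url: str) -> str:
    """Build a short plain-text message for one job posting (layout is an assumption)."""
    return f"{title} at {company}\n{job_url}"


def build_send_message_request(token: str, chat_id: str, text: str) -> urllib.request.Request:
    """Prepare a POST request for the Bot API sendMessage method."""
    url = f"https://api.telegram.org/bot{token}/sendMessage"
    data = urllib.parse.urlencode({"chat_id": chat_id, "text": text}).encode("utf-8")
    return urllib.request.Request(url, data=data, method="POST")


def send_job(token: str, chat_id: str, text: str) -> dict:
    """Send the message and return the decoded JSON response (performs a network call)."""
    request = build_send_message_request(token, chat_id, text)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


# Building the request does not touch the network; the values are placeholders.
message = format_job_message("Backend Developer", "Example Ltd", "https://example.com/jobs/1")
request = build_send_message_request("123:ABC", "-1001", message)
print(request.full_url)
```

Calling `send_job` with the real `TELEGRAM_BOT_TOKEN` and `TELEGRAM_CHAT_ID` values would post the message to the chat.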

Testing

This project includes tests that check data scraping, formatting, and Telegram integration. Run them with:

pytest

Ensure you have the necessary environment variables and mock data set up before running the tests.
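
As an illustration of the kind of test pytest would collect, here is a minimal sketch for a message formatter that uses mock data. `summarize_job` is a hypothetical function defined inline for the example; the project's real tests and helpers may look different.

```python
def summarize_job(job: dict) -> str:
    """Hypothetical formatter under test: one-line summary of a job posting."""
    location = job.get("location") or "Location not specified"
    return f"{job['title']} | {job['company']} | {location}"


def test_summary_includes_title_company_and_location():
    job = {"title": "Backend Developer", "company": "Example Ltd", "location": "Tel Aviv"}
    assert summarize_job(job) == "Backend Developer | Example Ltd | Tel Aviv"


def test_missing_location_falls_back_to_placeholder():
    job = {"title": "Data Engineer", "company": "Example Ltd"}
    assert summarize_job(job) == "Data Engineer | Example Ltd | Location not specified"
```

pytest discovers functions whose names start with `test_` in files named `test_*.py` or `*_test.py`, so no extra registration is needed.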

Contributing

Contributions are welcome! Please follow these steps:

  1. Fork the repository.
  2. Create a new branch (git checkout -b feature/your-feature-name).
  3. Commit your changes (git commit -m 'Add some feature').
  4. Push to the branch (git push origin feature/your-feature-name).
  5. Open a pull request.

Acknowledgments

  • JobSpy for inspiring this project.