docs:readme

Increment version
2026-03-06 20:44:30 -08:00 · 2025-02-09 13:42:18 -06:00 · 2025-01-17 21:44:49 -06:00 · 2024-12-04 22:55:06 +00:00 · 2024-12-04 16:54:52 -06:00 · 2024-12-04 16:52:15 -06:00
6 changed files with 124 additions and 85 deletions
--- a/.github/workflows/publish-to-pypi.yml
+++ b/.github/workflows/publish-to-pypi.yml
@@ -1,33 +1,50 @@
 name: Publish Python 🐍 distributions 📦 to PyPI
-on: push
+on:
+  pull_request:
+    types:
+      - closed
+
+permissions:
+  contents: write

 jobs:
  build-n-publish:
    name: Build and publish Python 🐍 distributions 📦 to PyPI
    runs-on: ubuntu-latest

+    if: github.event.pull_request.merged == true && github.event.pull_request.base.ref == 'main'
+
    steps:
      - uses: actions/checkout@v3
+
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: "3.10"

+      - name: Install dependencies
+        run: pip install toml
+
+      - name: Increment version
+        run: python increment_version.py
+
+      - name: Commit version increment
+        run: |
+          git config --global user.name 'github-actions'
+          git config --global user.email 'github-actions@github.com'
+          git add pyproject.toml
+          git commit -m 'Increment version'
+
+      - name: Push changes
+        run: git push
+
      - name: Install poetry
-        run: >-
-          python3 -m
-          pip install
-          poetry
-          --user
+        run: pip install poetry --user

      - name: Build distribution 📦
-        run: >-
-          python3 -m
-          poetry
-          build
+        run: poetry build

      - name: Publish distribution 📦 to PyPI
-        if: startsWith(github.ref, 'refs/tags')
        uses: pypa/gh-action-pypi-publish@release/v1
        with:
-          password: ${{ secrets.PYPI_API_TOKEN }}
+          password: ${{ secrets.PYPI_API_TOKEN }}
--- a/README.md
+++ b/README.md
@@ -2,14 +2,11 @@

 **JobSpy** is a simple, yet comprehensive, job scraping library.

-*Looking to build a data-focused software product?* **[Book a call](https://bunsly.com/)** *to
-work with us.*
-
 ## Features

 - Scrapes job postings from **LinkedIn**, **Indeed**, **Glassdoor**, **Google**, & **ZipRecruiter** simultaneously
- Aggregates the job postings in a Pandas DataFrame
- Proxies support
+- Aggregates the job postings in a dataframe
+- Proxies support to bypass blocking

 ![jobspy](https://github.com/cullenwatson/JobSpy/assets/78247585/ec7ef355-05f6-4fd3-8161-a817e31c5c57)

@@ -33,12 +30,11 @@ jobs = scrape_jobs(
    google_search_term="software engineer jobs near San Francisco, CA since yesterday",
    location="San Francisco, CA",
    results_wanted=20,
-    hours_old=72, # (only Linkedin/Indeed is hour specific, others round up to days old)
-    country_indeed='USA',  # only needed for indeed / glassdoor
+    hours_old=72,
+    country_indeed='USA',
    
-    # linkedin_fetch_description=True # get more info such as full description, direct job url for linkedin (slower)
+    # linkedin_fetch_description=True # gets more info such as description, direct job url (slower)
    # proxies=["208.195.175.46:65095", "208.195.175.45:65095", "localhost"],
-    
 )
 print(f"Found {len(jobs)} jobs")
 print(jobs.head())
@@ -68,7 +64,7 @@ Optional
 ├── search_term (str)
 |
 ├── google_search_term (str)
-|     search term for google jobs. This is is only param for filtering google jobs.
+|     search term for google jobs. This is the only param for filtering google jobs.
 │
 ├── location (str)
 │
@@ -88,7 +84,7 @@ Optional
 |    number of job results to retrieve for each site specified in 'site_name'
 │
 ├── easy_apply (bool): 
-|    filters for jobs that are hosted on the job board site
+|    filters for jobs that are hosted on the job board site (LinkedIn easy apply filter no longer works)
 │
 ├── description_format (str): 
 |    markdown, html (Format type of the job descriptions. Default is markdown.)
@@ -133,46 +129,6 @@ Optional
 |    - easy_apply
 ```

-
-### JobPost Schema
-
-```plaintext
-JobPost
-├── title
-├── company
-├── company_url
-├── job_url
-├── location
-│   ├── country
-│   ├── city
-│   ├── state
-├── description
-├── job_type: fulltime, parttime, internship, contract
-├── job_function
-│   ├── interval: yearly, monthly, weekly, daily, hourly
-│   ├── min_amount
-│   ├── max_amount
-│   ├── currency
-│   └── salary_source: direct_data, description (parsed from posting)
-├── date_posted
-├── emails
-└── is_remote
-
-Linkedin specific
-└── job_level
-
-Linkedin & Indeed specific
-└── company_industry
-
-Indeed specific
-├── company_country
-├── company_addresses
-├── company_employees_label
-├── company_revenue_label
-├── company_description
-└── company_logo
-```
-
 ## Supported Countries for Job Searching

 ### **LinkedIn**
@@ -219,7 +175,23 @@ You can specify the following countries when searching on Indeed (use the exact

 ---
 **Q: Why is Indeed giving unrelated roles?**  
-**A:** Indeed is searching each one of your terms e.g. software intern, it searches software OR intern. Try search_term='"software intern"' in quotes for stricter searching
+**A:** Indeed searches the description too.
+
+- use - to remove words
+- "" for exact match
+
+Example of a good Indeed query
+
+```py
+search_term='"engineering intern" software summer (java OR python OR c++) 2025 -tax -marketing'
+```
+
+This searches the description/title and must include software, summer, 2025, one of the languages, engineering intern exactly, no tax, no marketing.
+
+---
+
+**Q: No results when using "google"?**  
+**A:** You have to use super specific syntax. Search for google jobs on your browser and then whatever pops up in the google jobs search box after applying some filters is what you need to copy & paste into the google_search_term. 

 ---

@@ -231,8 +203,41 @@ You can specify the following countries when searching on Indeed (use the exact

 ---

-**Q: Encountering issues with your queries?**  
-**A:** Try reducing the number of `results_wanted` and/or broadening the filters. If problems
-persist, [submit an issue](https://github.com/Bunsly/JobSpy/issues).
+### JobPost Schema

---
+```plaintext
+JobPost
+├── title
+├── company
+├── company_url
+├── job_url
+├── location
+│   ├── country
+│   ├── city
+│   ├── state
+├── description
+├── job_type: fulltime, parttime, internship, contract
+├── job_function
+│   ├── interval: yearly, monthly, weekly, daily, hourly
+│   ├── min_amount
+│   ├── max_amount
+│   ├── currency
+│   └── salary_source: direct_data, description (parsed from posting)
+├── date_posted
+├── emails
+└── is_remote
+
+Linkedin specific
+└── job_level
+
+Linkedin & Indeed specific
+└── company_industry
+
+Indeed specific
+├── company_country
+├── company_addresses
+├── company_employees_label
+├── company_revenue_label
+├── company_description
+└── company_logo
+```
--- a/increment_version.py
+++ b/increment_version.py
@@ -0,0 +1,21 @@
+import toml
+
+def increment_version(version):
+    major, minor, patch = map(int, version.split('.'))
+    patch += 1
+    return f"{major}.{minor}.{patch}"
+
+# Load pyproject.toml
+with open('pyproject.toml', 'r') as file:
+    pyproject = toml.load(file)
+
+# Increment the version
+current_version = pyproject['tool']['poetry']['version']
+new_version = increment_version(current_version)
+pyproject['tool']['poetry']['version'] = new_version
+
+# Save the updated pyproject.toml
+with open('pyproject.toml', 'w') as file:
+    toml.dump(pyproject, file)
+
+print(f"Version updated from {current_version} to {new_version}")
--- a/poetry.toml
+++ b/poetry.toml
@@ -1,2 +0,0 @@
-[virtualenvs]
-in-project = true
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -1,15 +1,21 @@
+[build-system]
+requires = [ "poetry-core",]
+build-backend = "poetry.core.masonry.api"
+
 [tool.poetry]
 name = "python-jobspy"
-version = "1.1.75"
+version = "1.1.76"
 description = "Job scraper for LinkedIn, Indeed, Glassdoor & ZipRecruiter"
-authors = ["Zachary Hampton <zachary@bunsly.com>", "Cullen Watson <cullen@bunsly.com>"]
+authors = [ "Zachary Hampton <zachary@bunsly.com>", "Cullen Watson <cullen@bunsly.com>",]
 homepage = "https://github.com/Bunsly/JobSpy"
 readme = "README.md"
-keywords = ['jobs-scraper', 'linkedin', 'indeed', 'glassdoor', 'ziprecruiter']
+keywords = [ "jobs-scraper", "linkedin", "indeed", "glassdoor", "ziprecruiter",]
+[[tool.poetry.packages]]
+include = "jobspy"
+from = "src"

-packages = [
-    { include = "jobspy", from = "src" }
-]
+[tool.black]
+line-length = 88

 [tool.poetry.dependencies]
 python = "^3.10"
@@ -22,16 +28,8 @@ tls-client = "^1.0.1"
 markdownify = "^0.13.1"
 regex = "^2024.4.28"

-
 [tool.poetry.group.dev.dependencies]
 pytest = "^7.4.1"
 jupyter = "^1.0.0"
 black = "*"
 pre-commit = "*"
-
-[build-system]
-requires = ["poetry-core"]
-build-backend = "poetry.core.masonry.api"
-
-[tool.black]
-line-length = 88
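Review note: the `pyproject.toml` hunk above rewrites arrays with trailing commas and moves whole tables. That is consistent with `toml.dump` re-serializing the file in `increment_version.py`, although the diff alone does not confirm the cause. A minimal sketch of one alternative follows, which edits only the version line as text. The function name `bump_patch_in_text` and the sample input are hypothetical, and it assumes the first `version = "X.Y.Z"` line in the file is the package version.

```python
import re

# Matches the first line of the form: version = "MAJOR.MINOR.PATCH"
VERSION_LINE = re.compile(r'^(version\s*=\s*")(\d+)\.(\d+)\.(\d+)(")', re.MULTILINE)


def bump_patch_in_text(pyproject_text: str) -> str:
    """Return the text with the patch number of the first version line incremented."""

    def _bump(match: re.Match) -> str:
        prefix, major, minor, patch, suffix = match.groups()
        return f"{prefix}{major}.{minor}.{int(patch) + 1}{suffix}"

    new_text, replaced = VERSION_LINE.subn(_bump, pyproject_text, count=1)
    if replaced != 1:
        # Fail loudly instead of committing an unchanged file.
        raise ValueError('no version = "X.Y.Z" line found')
    return new_text


# Hypothetical excerpt; every line other than the version must survive untouched.
sample = (
    "[tool.poetry]\n"
    'name = "python-jobspy"\n'
    'version = "1.1.75"\n'
    "keywords = ['jobs-scraper', 'linkedin']\n"
)
bumped = bump_patch_in_text(sample)
print(bumped)
```

Because only the matched line changes, a later "Increment version" commit would touch a single line of `pyproject.toml`. The trade-off is that the regex does not understand TOML tables, so it relies on the layout assumption stated above.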
--- a/src/jobspy/scrapers/google/__init__.py
+++ b/src/jobspy/scrapers/google/__init__.py
@@ -232,7 +232,7 @@ class GoogleJobsScraper(Scraper):
    def _find_job_info_initial_page(html_text: str):
        pattern = (
            f'520084652":('
-            + r"\[(?:[^\[\]]|\[(?:[^\[\]]|\[(?:[^\[\]]|\[[^\[\]]*\])*\])*\])*\])"
+            + r"\[.*?\]\s*])\s*}\s*]\s*]\s*]\s*]\s*]"
        )
        results = []
        matches = re.finditer(pattern, html_text)
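Review note: the two patterns in this hunk differ in how they find the end of the captured array. The old one matches balanced brackets but only down to four levels of nesting. The new one matches lazily up to a fixed trailing sequence of `]` and `}` characters. The sketch below compares them on two synthetic strings. The real Google payload is not shown in this diff, so `shallow`, `deep`, the `capture` helper, and the `json.loads` step are assumptions made for illustration only.

```python
import json
import re

PREFIX = '520084652":('
# Pattern removed by this change: balanced brackets, at most four levels deep.
OLD_PATTERN = PREFIX + r"\[(?:[^\[\]]|\[(?:[^\[\]]|\[(?:[^\[\]]|\[[^\[\]]*\])*\])*\])*\])"
# Pattern added by this change: lazy match up to a fixed closing sequence.
NEW_PATTERN = PREFIX + r"\[.*?\]\s*])\s*}\s*]\s*]\s*]\s*]\s*]"

# Synthetic inputs (assumed shapes, not real Google responses).
shallow = '{"520084652":[["Engineer"]]}]]]]]'        # two levels of nesting
deep = '{"520084652":[[[[["Engineer"]]]]]}]]]]]'     # five levels of nesting


def capture(pattern: str, text: str):
    """Return the captured array parsed as JSON, or None when nothing matches."""
    match = re.search(pattern, text)
    return json.loads(match.group(1)) if match else None


for name, text in (("shallow", shallow), ("deep", deep)):
    print(name, "old:", capture(OLD_PATTERN, text), "new:", capture(NEW_PATTERN, text))
```

On these inputs both patterns capture the shallow array, while only the new pattern captures the five-level one. The new pattern in turn depends on the exact closing sequence after the array, so it may need updating if that surrounding structure changes.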
Author	SHA1	Message	Date
Cullen Watson	13c74a0fed	docs:readme	2025-02-09 13:42:18 -06:00
Cullen Watson	333e9e6760	docs:readme	2025-01-17 21:44:49 -06:00
github-actions	04032a0f91	Increment version	2024-12-04 22:55:06 +00:00
Cullen Watson	496896d0b5	enh:fix yml (#225)	2024-12-04 16:54:52 -06:00
Cullen Watson	87ba1ad1bf	fix yml	2024-12-04 16:52:15 -06:00
Jason Geffner	4e7ac9a583	Fix Google job search (#223): the previous regex did not capture all expected matches in the returned content	2024-12-04 16:45:59 -06:00
Cullen Watson	e44d13e1cf	enh:auto update version	2024-12-04 16:29:38 -06:00
Cullen Watson	d52e366ef7	docs:readme	2024-11-26 15:51:26 -06:00
Cullen Watson	395ebf0017	docs:readme	2024-11-26 15:49:12 -06:00
Cullen Watson	63fddd9b7f	docs:readme	2024-11-26 15:48:22 -06:00
Cullen Watson	58956868ae	docs:readme	2024-11-26 15:47:10 -06:00
Cullen Watson	4fce836222	docs:readme	2024-10-28 03:53:59 -05:00
Cullen Watson	5ba25e7a7c	docs:readme	2024-10-28 03:42:19 -05:00
Cullen Watson	f7cb3e9206	docs:readme	2024-10-28 03:36:21 -05:00
Cullen Watson	3ad3f121f7	docs:readme	2024-10-28 03:34:52 -05:00
Cullen Watson	ff3c782912	docs:readme	2024-10-25 18:12:08 -05:00