StJohnsLawReviewScraper

St. John's Law Review
maps to St. John's Law Review (id 165)
Latest Status: no_new_content #4377
Latest Metrics: d=0  |  skip=241  |  err=0  |  t=41.9s
Implementation: LightBaseScraper, bepress, Law Review
Uploads Pending: 0
Last Upload: 2026-03-08 22:07:05
uabox:Law_Review_Project/st_johns_law_review_20260308_220651.zip
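
The metrics strings on this page (for example `d=0  |  skip=241  |  err=0`) are compact enough to check by hand, but a small parser makes the bookkeeping explicit. The sketch below is illustrative only: `parse_metrics` and `is_consistent` are hypothetical helpers, not part of lrscraper, and reading `d` as "downloaded" is an assumption. The check that `d + skip + err` equals `processed` holds for the runs listed on this page, but it is an observed pattern here, not a documented invariant.

```python
def parse_metrics(line: str) -> dict[str, int]:
    """Parse a 'key=value  |  key=value' metrics line into a dict of ints."""
    metrics: dict[str, int] = {}
    for part in line.split("|"):
        key, _, value = part.strip().partition("=")
        metrics[key] = int(value)
    return metrics


def is_consistent(metrics_line: str, counts_line: str) -> bool:
    """Return True when d + skip + err accounts for every processed item.

    Assumption: 'd' counts downloads; the page itself does not define it.
    """
    m = parse_metrics(metrics_line)
    c = parse_metrics(counts_line)
    return m["d"] + m["skip"] + m["err"] == c["processed"]


# Values copied from runs #4377 and #955 on this page.
print(parse_metrics("d=0  |  skip=241  |  err=0"))
print(is_consistent("d=24  |  skip=0  |  err=6", "discovered=30  |  processed=30"))
```

Non-integer fields such as `t=41.9s` are deliberately out of scope for this sketch and would need separate handling.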

Definition

scraper_id: StJohnsLawReviewScraper
canonical_name: St. John's Law Review
institution_code: -
platform: bepress
base_class: LightBaseScraper
class_name: StJohnsLawReviewScraper
module_path: scrapers.stjohns_scraper
file_path: scrapers/stjohns_scraper.py
has_cli_entrypoint: true
is_abstract: false
discovered_at: 2026-03-30 20:11:12
updated_at: 2026-06-17 04:52:57

Run History

Showing 6 runs (law_review_id=165) — use ?limit=200 for more.
Run Status Start End Runtime Metrics Error / Details Logs
#4377 no_new_content 2026-06-01T16:25:23+00:00 2026-06-01T16:26:05+00:00 41.9s d=0  |  skip=241  |  err=0
discovered=241  |  processed=241
-
extra_json
{"automation_cycle_id": 1520, "canonical_name": "St. John's Law Review", "child_pid": 58563, "file_path": "scrapers/stjohns_scraper.py", "heartbeat_at": "2026-06-01T16:25:53+00:00", "heartbeat_source": "orchestrator", "law_review_id": 165, "orchestrator": "lrscraper", "orchestrator_started_at": "2026-06-01T16:25:23+00:00", "run_kind": "scheduled_active", "scraper_id": "StJohnsLawReviewScraper", "script_path": "scrapers/stjohns_scraper.py", "stderr_path": "logs/orchestrator_runs/1780331123_StJohnsLawReviewScraper.err.log", "stdout_path": "logs/orchestrator_runs/1780331123_StJohnsLawReviewScraper.out.log", "timeout_minutes": 45}
stdout | stderr
#3702 no_new_content 2026-05-01T19:54:47+00:00 2026-05-01T19:55:33+00:00 46.3s d=0  |  skip=241  |  err=0
discovered=241  |  processed=241
-
extra_json
{"automation_cycle_id": 609, "canonical_name": "St. John's Law Review", "child_pid": 990625, "file_path": "scrapers/stjohns_scraper.py", "heartbeat_at": "2026-05-01T19:55:17+00:00", "heartbeat_source": "orchestrator", "law_review_id": 165, "orchestrator": "lrscraper", "orchestrator_started_at": "2026-05-01T19:54:47+00:00", "run_kind": "scheduled_active", "scraper_id": "StJohnsLawReviewScraper", "script_path": "scrapers/stjohns_scraper.py", "stderr_path": "logs/orchestrator_runs/1777665287_StJohnsLawReviewScraper.err.log", "stdout_path": "logs/orchestrator_runs/1777665287_StJohnsLawReviewScraper.out.log", "timeout_minutes": 45}
stdout | stderr
#2862 success 2026-03-08T19:23:39+00:00 2026-03-08T19:24:53+00:00 74.7s d=12  |  skip=229  |  err=0
discovered=241  |  processed=241
-
extra_json
{"canonical_name": "St. John's Law Review", "child_pid": 2936362, "file_path": "scrapers/stjohns_scraper.py", "heartbeat_at": "2026-03-08T19:24:39+00:00", "heartbeat_source": "orchestrator", "law_review_id": 165, "orchestrator": "lrscraper", "orchestrator_started_at": "2026-03-08T19:23:39+00:00", "scraper_id": "StJohnsLawReviewScraper", "script_path": "scrapers/stjohns_scraper.py", "stderr_path": "logs/orchestrator_runs/1772997819_StJohnsLawReviewScraper.err.log", "stdout_path": "logs/orchestrator_runs/1772997819_StJohnsLawReviewScraper.out.log", "timeout_minutes": 45}
stdout | stderr
#959 success 2026-02-06T18:56:02+00:00 2026-02-06T18:56:25+00:00 23.1s d=6  |  skip=24  |  err=0
discovered=30  |  processed=30
-
extra_json
{"canonical_name": "St. John's Law Review", "child_pid": 4079233, "file_path": "scrapers/stjohns_scraper.py", "heartbeat_at": "2026-02-06T18:56:02+00:00", "heartbeat_source": "orchestrator", "law_review_id": 165, "orchestrator": "lrscraper", "orchestrator_started_at": "2026-02-06T18:56:02+00:00", "scraper_id": "StJohnsLawReviewScraper", "script_path": "scrapers/stjohns_scraper.py", "stderr_path": "logs/orchestrator_runs/1770404162_StJohnsLawReviewScraper.err.log", "stdout_path": "logs/orchestrator_runs/1770404162_StJohnsLawReviewScraper.out.log", "timeout_minutes": 30}
stdout | stderr
#955 partial 2026-02-06T18:34:32+00:00 2026-02-06T18:36:05+00:00 92.7s d=24  |  skip=0  |  err=6
discovered=30  |  processed=30
partial_download_errors: HTTP 403 for https://scholarship.law.stjohns.edu/cgi/viewcontent.cgi?article=7334&context=lawreview (fallback UA)
extra_json
{"canonical_name": "St. John's Law Review", "child_pid": 4041795, "file_path": "scrapers/stjohns_scraper.py", "heartbeat_at": "2026-02-06T18:36:02+00:00", "heartbeat_source": "orchestrator", "law_review_id": 165, "orchestrator": "lrscraper", "orchestrator_started_at": "2026-02-06T18:34:32+00:00", "scraper_id": "StJohnsLawReviewScraper", "script_path": "scrapers/stjohns_scraper.py", "stderr_path": "logs/orchestrator_runs/1770402872_StJohnsLawReviewScraper.err.log", "stdout_path": "logs/orchestrator_runs/1770402872_StJohnsLawReviewScraper.out.log", "timeout_minutes": 30}
stdout | stderr
#952 failed 2026-02-06T18:32:54+00:00 2026-02-06T18:32:55+00:00 0.7s d=0  |  skip=0  |  err=0
discovered=0  |  processed=0
error: unbalanced parenthesis at position 20
traceback
Traceback (most recent call last):
  File "/home/arbel/sites/lrscraper/light_base_scraper.py", line 672, in run
    items = await self.discover_urls()
            ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/arbel/sites/lrscraper/scrapers/stjohns_scraper.py", line 81, in discover_urls
    for entry in self._parse_issue_items(issue_html, issue_meta):
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/arbel/sites/lrscraper/scrapers/stjohns_scraper.py", line 181, in _parse_issue_items
    title = _clean_title(
            ^^^^^^^^^^^^^
  File "/home/arbel/sites/lrscraper/scrapers/stjohns_scraper.py", line 314, in _clean_title
    text = re.sub(r"\(\s*\d+\\s*kb\\s*\\)$", "", text, flags=re.IGNORECASE).strip()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/arbel/miniconda3/lib/python3.11/re/__init__.py", line 185, in sub
    return _compile(pattern, flags).sub(repl, string, count)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/arbel/miniconda3/lib/python3.11/re/__init__.py", line 294, in _compile
    p = _compiler.compile(pattern, flags)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/arbel/miniconda3/lib/python3.11/re/_compiler.py", line 743, in compile
    p = _parser.parse(p, flags)
        ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/arbel/miniconda3/lib/python3.11/re/_parser.py", line 987, in parse
    raise source.error("unbalanced parenthesis")
re.error: unbalanced parenthesis at position 20
extra_json
{"canonical_name": "St. John's Law Review", "child_pid": 4038783, "file_path": "scrapers/stjohns_scraper.py", "heartbeat_at": "2026-02-06T18:32:54+00:00", "heartbeat_source": "orchestrator", "law_review_id": 165, "orchestrator": "lrscraper", "orchestrator_started_at": "2026-02-06T18:32:54+00:00", "scraper_id": "StJohnsLawReviewScraper", "script_path": "scrapers/stjohns_scraper.py", "stderr_path": "logs/orchestrator_runs/1770402774_StJohnsLawReviewScraper.err.log", "stdout_path": "logs/orchestrator_runs/1770402774_StJohnsLawReviewScraper.out.log", "timeout_minutes": 30}
stdout | stderr
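
In the `extra_json` records above, the numeric prefix of each `stdout_path` and `stderr_path` matches the Unix epoch of `orchestrator_started_at` for the runs shown. The sketch below derives that prefix from the ISO timestamp, which can help locate log files for a given run. `expected_log_stem` is a hypothetical helper, and the naming rule is inferred from the records on this page rather than taken from the orchestrator's code.

```python
from datetime import datetime


def expected_log_stem(started_at: str, scraper_id: str) -> str:
    """Build '<epoch>_<scraper_id>' from an ISO 8601 start time with a UTC offset.

    Assumption: log files are named after the orchestrator start time in
    whole epoch seconds, as the records on this page suggest.
    """
    epoch = int(datetime.fromisoformat(started_at).timestamp())
    return f"{epoch}_{scraper_id}"


# Values copied from run #4377.
stem = expected_log_stem("2026-06-01T16:25:23+00:00", "StJohnsLawReviewScraper")
print(f"logs/orchestrator_runs/{stem}.out.log")
```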
