HarvardLawReviewScraper

Harvard Law Review (maps to Harvard Law Review, id 1)

Latest Status:    failed (run #521)
Latest Metrics:   d=0 | skip=0 | err=0, t=24.0s
Implementation:   LightBaseScraper (playwright), Law Review
Uploads Pending:  2,131
Last Upload:      2025-12-02 06:18:09

Definition

scraper_id:          HarvardLawReviewScraper
canonical_name:      Harvard Law Review
institution_code:    -
platform:            playwright
base_class:          LightBaseScraper
class_name:          HarvardLawReviewScraper
module_path:         scrapers.harvard_law_review_scraper
file_path:           scrapers/harvard_law_review_scraper.py
has_cli_entrypoint:  true
is_abstract:         false
discovered_at:       2026-01-19 01:12:58
updated_at:          2026-02-02 09:52:03
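
The module_path / class_name pair is what makes this entry loadable without hard-coding scraper imports: a runner can resolve the class at dispatch time. A minimal sketch, assuming the Definition row is available as a plain dict with the field names above (load_scraper_class is a hypothetical helper, not part of the project):

import importlib

def load_scraper_class(entry: dict) -> type:
    # Hypothetical helper: import entry["module_path"] and pull
    # entry["class_name"] off the module, mirroring the Definition fields.
    module = importlib.import_module(entry["module_path"])
    return getattr(module, entry["class_name"])

# Usage with the values from the Definition block above.
entry = {
    "module_path": "scrapers.harvard_law_review_scraper",
    "class_name": "HarvardLawReviewScraper",
}
ScraperCls = load_scraper_class(entry)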

Run History

Showing 3 runs (law_review_id=1) — use ?limit=200 for more.
Run #521 (failed)
  Start:    2026-01-28T05:32:57+00:00
  End:      2026-01-28T05:33:21+00:00
  Runtime:  24.0s
  Metrics:  d=0 | skip=0 | err=0 (discovered=0, processed=0)
  Error:    BrokenPipeError: [Errno 32] Broken pipe

Traceback (most recent call last):
  File "/home/arbel/sites/lrscraper/scrapers/harvard_law_review_scraper.py", line 103, in discover_urls
    self.print_status(f"Found: {metadata['title']} ({filename})")
  File "/home/arbel/sites/lrscraper/light_base_scraper.py", line 107, in print_status
    print(msg)
BrokenPipeError: [Errno 32] Broken pipe

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/arbel/sites/lrscraper/light_base_scraper.py", line 309, in run
    items = await self.discover_urls()
            ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/arbel/sites/lrscraper/smart_scraper_simple.py", line 385, in wrapped_discover
    items = await original_discover()
            ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/arbel/sites/lrscraper/scrapers/harvard_law_review_scraper.py", line 109, in discover_urls
    self.print_status(f"Error processing {article_url}: {e}", "error")
  File "/home/arbel/sites/lrscraper/light_base_scraper.py", line 107, in print_status
    print(msg)
BrokenPipeError: [Errno 32] Broken pipe
  extra_json: {"canonical_name": "Harvard Law Review"}
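
Both tracebacks bottom out in the same call: light_base_scraper.print_status (line 107) does a bare print(msg), so once stdout's reader goes away (output piped to a process that exits early, or a supervisor closing the pipe) the status line raises BrokenPipeError, and the error handler's own print_status call at scraper line 109 fails the same way. A minimal pipe-safe variant, written here as a standalone function; the stderr fallback is an assumption, not the project's actual fix:

import sys

def print_status(msg: str, level: str = "info") -> None:
    # Pipe-safe stand-in for LightBaseScraper.print_status; the level
    # argument is kept only for signature compatibility.
    try:
        print(msg)
    except BrokenPipeError:
        try:
            print(msg, file=sys.stderr)  # stdout's pipe is gone; try stderr
        except BrokenPipeError:
            pass  # both streams closed; losing a log line beats aborting the run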
Run #452 (timeout)
  Start:    2026-01-22T12:30:34+00:00
  End:      2026-01-22T13:15:34+00:00
  Runtime:  2700.0s
  Metrics:  d=0 | skip=0 | err=1 (discovered=-, processed=-)
  Error:    timeout: Timeout after 45 minutes
  extra_json: {"returncode": null}
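
The 2700.0s runtime matches the 45-minute cap exactly, and {"returncode": null} says the child never got to exit on its own before the run was recorded. A sketch of one way a runner could enforce such a cap; run_with_timeout, the command, and the None-on-timeout convention are assumptions, not the project's actual runner:

import asyncio

TIMEOUT_SECONDS = 45 * 60  # the 45-minute cap seen in run #452

async def run_with_timeout(cmd: list[str]) -> int | None:
    # Launch the scraper as a subprocess and kill it at the deadline.
    # Returns the exit code, or None on timeout, matching
    # {"returncode": null} in this run's extra_json.
    proc = await asyncio.create_subprocess_exec(*cmd)
    try:
        await asyncio.wait_for(proc.wait(), timeout=TIMEOUT_SECONDS)
    except asyncio.TimeoutError:
        proc.kill()
        await proc.wait()  # reap the killed child
        return None
    return proc.returncode

# Hypothetical invocation, given has_cli_entrypoint=true:
# asyncio.run(run_with_timeout(["python", "-m", "scrapers.harvard_law_review_scraper"]))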
Run #1 (success)
  Start:    2025-12-22T22:46:11.634716
  End:      2025-12-23T00:14:34.578532
  Runtime:  5302.9s
  Metrics:  d=11 | skip=0 | err=0 (discovered=-, processed=-)

Runs (scraper_name = HarvardLawReviewScraper)

These are runs recorded explicitly under this scraper name (as opposed to the law_review_id=1 mapping used above).
Runs #521 (failed) and #452 (timeout) appear here as well; their timings, metrics, extra_json, and the full BrokenPipeError traceback are identical to the entries in Run History above.