LRScraper - Scraper Detail

Latest Status

partial #4240

Latest Metrics

d=9 | skip=105 | err=0

t=557.2s

Implementation

LightBaseScraper

playwright

Law Review

Harvard Law Review

id=1

Uploads Pending

0

Last Upload

2026-05-31 21:25:20

uabox:Law_Review_Project/harvard_law_review_20260531_212500.zip

Definition

scraper_id

HarvardLawReviewScraper

canonical_name

Harvard Law Review

institution_code

-

platform

playwright

base_class

LightBaseScraper

class_name

HarvardLawReviewScraper

module_path

scrapers.harvard_law_review_scraper

file_path

scrapers/harvard_law_review_scraper.py

has_cli_entrypoint

true

is_abstract

false

discovered_at

2026-03-30 20:11:12

updated_at

2026-06-17 03:35:43

Run History

Showing 7 runs (law_review_id=1); use ?limit=200 for more.

Run	Status	Start	End	Runtime	Metrics	Error / Details	Logs
#4240	partial	2026-05-31T19:06:29+00:00	2026-05-31T19:15:47+00:00	557.2s	d=9 \| skip=105 \| err=0 discovered=114 \| processed=114	- extra_json {"automation_cycle_id": 1499, "canonical_name": "Harvard Law Review", "child_pid": 1238525, "discovery_cutoff": true, "discovery_cutoff_details": {"consecutive_duplicates": 40, "processed_articles": 114, "queued_items": 9}, "discovery_cutoff_elapsed_seconds": 487, "discovery_cutoff_max_runtime_seconds": null, "discovery_cutoff_phase": "discovery", "discovery_cutoff_reason": "duplicate_streak", "file_path": "scrapers/harvard_law_review_scraper.py", "heartbeat_at": "2026-05-31T19:15:29+00:00", "heartbeat_source": "orchestrator", "law_review_id": 1, "orchestrator": "lrscraper", "orchestrator_started_at": "2026-05-31T19:06:29+00:00", "run_kind": "scheduled_active", "scraper_id": "HarvardLawReviewScraper", "script_path": "scrapers/harvard_law_review_scraper.py", "stderr_path": "logs/orchestrator_runs/1780254389_HarvardLawReviewScraper.err.log", "stdout_path": "logs/orchestrator_runs/1780254389_HarvardLawReviewScraper.out.log", "timeout_minutes": 45}	stdout \| stderr
#3566	partial	2026-05-01T17:29:19+00:00	2026-05-01T17:30:27+00:00	67.2s	d=20 \| skip=87 \| err=0 discovered=107 \| processed=107	- extra_json {"automation_cycle_id": 588, "canonical_name": "Harvard Law Review", "child_pid": 856921, "discovery_cutoff": true, "discovery_cutoff_details": {"consecutive_duplicates": 40, "processed_articles": 107, "queued_items": 20}, "discovery_cutoff_elapsed_seconds": 52, "discovery_cutoff_max_runtime_seconds": null, "discovery_cutoff_phase": "discovery", "discovery_cutoff_reason": "duplicate_streak", "file_path": "scrapers/harvard_law_review_scraper.py", "heartbeat_at": "2026-05-01T17:30:19+00:00", "heartbeat_source": "orchestrator", "law_review_id": 1, "orchestrator": "lrscraper", "orchestrator_started_at": "2026-05-01T17:29:19+00:00", "run_kind": "scheduled_active", "scraper_id": "HarvardLawReviewScraper", "script_path": "scrapers/harvard_law_review_scraper.py", "stderr_path": "logs/orchestrator_runs/1777656559_HarvardLawReviewScraper.err.log", "stdout_path": "logs/orchestrator_runs/1777656559_HarvardLawReviewScraper.out.log", "timeout_minutes": 45}	stdout \| stderr
#2696	partial	2026-03-08T05:09:49+00:00	2026-03-08T05:10:31+00:00	42.0s	d=7 \| skip=75 \| err=0 discovered=82 \| processed=82	- extra_json {"canonical_name": "Harvard Law Review", "child_pid": 2439295, "discovery_cutoff": true, "discovery_cutoff_details": {"consecutive_duplicates": 40, "processed_articles": 82, "queued_items": 7}, "discovery_cutoff_elapsed_seconds": 36, "discovery_cutoff_max_runtime_seconds": null, "discovery_cutoff_phase": "discovery", "discovery_cutoff_reason": "duplicate_streak", "file_path": "scrapers/harvard_law_review_scraper.py", "heartbeat_at": "2026-03-08T05:10:19+00:00", "heartbeat_source": "orchestrator", "law_review_id": 1, "orchestrator": "lrscraper", "orchestrator_started_at": "2026-03-08T05:09:49+00:00", "scraper_id": "HarvardLawReviewScraper", "script_path": "scrapers/harvard_law_review_scraper.py", "stderr_path": "logs/orchestrator_runs/1772946589_HarvardLawReviewScraper.err.log", "stdout_path": "logs/orchestrator_runs/1772946589_HarvardLawReviewScraper.out.log", "timeout_minutes": 45}	stdout \| stderr
#590	success	2026-02-06T02:43:34+00:00	2026-02-06T02:52:11+00:00	516.5s	d=9 \| skip=0 \| err=0 discovered=9 \| processed=9	- extra_json {"canonical_name": "Harvard Law Review", "child_pid": 2256464, "file_path": "scrapers/harvard_law_review_scraper.py", "heartbeat_at": "2026-02-06T02:52:04+00:00", "heartbeat_source": "orchestrator", "law_review_id": 1, "orchestrator": "lrscraper", "orchestrator_started_at": "2026-02-06T02:43:34+00:00", "scraper_id": "HarvardLawReviewScraper", "script_path": "scrapers/harvard_law_review_scraper.py", "stderr_path": "logs/orchestrator_runs/1770345814_HarvardLawReviewScraper.err.log", "stdout_path": "logs/orchestrator_runs/1770345814_HarvardLawReviewScraper.out.log", "timeout_minutes": 25}	stdout \| stderr
#521	failed	2026-01-28T05:32:57+00:00	2026-01-28T05:33:21+00:00	24.0s	d=0 \| skip=0 \| err=0 discovered=0 \| processed=0	BrokenPipeError: [Errno 32] Broken pipe traceback Traceback (most recent call last): File "/home/arbel/sites/lrscraper/scrapers/harvard_law_review_scraper.py", line 103, in discover_urls self.print_status(f"Found: {metadata['title']} ({filename})") File "/home/arbel/sites/lrscraper/light_base_scraper.py", line 107, in print_status print(msg) BrokenPipeError: [Errno 32] Broken pipe During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/home/arbel/sites/lrscraper/light_base_scraper.py", line 309, in run items = await self.discover_urls() ^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/arbel/sites/lrscraper/smart_scraper_simple.py", line 385, in wrapped_discover items = await original_discover() ^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/arbel/sites/lrscraper/scrapers/harvard_law_review_scraper.py", line 109, in discover_urls self.print_status(f"Error processing {article_url}: {e}", "error") File "/home/arbel/sites/lrscraper/light_base_scraper.py", line 107, in print_status print(msg) BrokenPipeError: [Errno 32] Broken pipe extra_json {"canonical_name": "Harvard Law Review"}	-
#452	timeout	2026-01-22T12:30:34+00:00	2026-01-22T13:15:34+00:00	2700.0s	d=0 \| skip=0 \| err=1 discovered=- \| processed=-	timeout: Timeout after 45 minutes extra_json {"returncode": null}	-
#1	success	2025-12-22T22:46:11.634716	2025-12-23T00:14:34.578532	5302.9s	d=11 \| skip=0 \| err=0 discovered=- \| processed=-	-	-
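
The Metrics column packs several counters into one string. In every run above that reports `discovered`, the values satisfy d + skip + err = discovered (for example, 9 + 105 + 0 = 114 for run #4240). The following is a minimal Python sketch for parsing such a string and checking that relationship. The function names `parse_metrics` and `is_consistent` are hypothetical, and treating the sum as an invariant is an assumption drawn only from the rows shown here; the page does not define what `d` stands for.

```python
import re

# Matches key=value pairs such as "d=9" or "discovered=-".
_PAIR = re.compile(r"(\w+)=(-|\d+)")


def parse_metrics(text):
    """Parse a Metrics cell into a dict of counters.

    A value of "-" (not recorded) becomes None. Separators such as
    " | " or an escaped pipe are ignored.
    """
    return {key: (None if value == "-" else int(value))
            for key, value in _PAIR.findall(text)}


def is_consistent(metrics):
    """Return True if d + skip + err == discovered.

    Returns None when `discovered` was not recorded for the run.
    """
    if metrics.get("discovered") is None:
        return None
    return metrics["d"] + metrics["skip"] + metrics["err"] == metrics["discovered"]


if __name__ == "__main__":
    run_4240 = parse_metrics("d=9 | skip=105 | err=0 discovered=114 | processed=114")
    print(run_4240, is_consistent(run_4240))
```

Runs #452 and #1 report `discovered=-`, so the check returns None for them rather than guessing.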

Runs (scraper_name = HarvardLawReviewScraper)

These are runs recorded explicitly under this scraper_id.

Run	Status	Start	End	Runtime	Metrics	Error / Details	Logs
#4240	partial	2026-05-31T19:06:29+00:00	2026-05-31T19:15:47+00:00	557.2s	d=9 \| skip=105 \| err=0 discovered=114 \| processed=114	- extra_json {"automation_cycle_id": 1499, "canonical_name": "Harvard Law Review", "child_pid": 1238525, "discovery_cutoff": true, "discovery_cutoff_details": {"consecutive_duplicates": 40, "processed_articles": 114, "queued_items": 9}, "discovery_cutoff_elapsed_seconds": 487, "discovery_cutoff_max_runtime_seconds": null, "discovery_cutoff_phase": "discovery", "discovery_cutoff_reason": "duplicate_streak", "file_path": "scrapers/harvard_law_review_scraper.py", "heartbeat_at": "2026-05-31T19:15:29+00:00", "heartbeat_source": "orchestrator", "law_review_id": 1, "orchestrator": "lrscraper", "orchestrator_started_at": "2026-05-31T19:06:29+00:00", "run_kind": "scheduled_active", "scraper_id": "HarvardLawReviewScraper", "script_path": "scrapers/harvard_law_review_scraper.py", "stderr_path": "logs/orchestrator_runs/1780254389_HarvardLawReviewScraper.err.log", "stdout_path": "logs/orchestrator_runs/1780254389_HarvardLawReviewScraper.out.log", "timeout_minutes": 45}	stdout \| stderr
#3566	partial	2026-05-01T17:29:19+00:00	2026-05-01T17:30:27+00:00	67.2s	d=20 \| skip=87 \| err=0 discovered=107 \| processed=107	- extra_json {"automation_cycle_id": 588, "canonical_name": "Harvard Law Review", "child_pid": 856921, "discovery_cutoff": true, "discovery_cutoff_details": {"consecutive_duplicates": 40, "processed_articles": 107, "queued_items": 20}, "discovery_cutoff_elapsed_seconds": 52, "discovery_cutoff_max_runtime_seconds": null, "discovery_cutoff_phase": "discovery", "discovery_cutoff_reason": "duplicate_streak", "file_path": "scrapers/harvard_law_review_scraper.py", "heartbeat_at": "2026-05-01T17:30:19+00:00", "heartbeat_source": "orchestrator", "law_review_id": 1, "orchestrator": "lrscraper", "orchestrator_started_at": "2026-05-01T17:29:19+00:00", "run_kind": "scheduled_active", "scraper_id": "HarvardLawReviewScraper", "script_path": "scrapers/harvard_law_review_scraper.py", "stderr_path": "logs/orchestrator_runs/1777656559_HarvardLawReviewScraper.err.log", "stdout_path": "logs/orchestrator_runs/1777656559_HarvardLawReviewScraper.out.log", "timeout_minutes": 45}	stdout \| stderr
#2696	partial	2026-03-08T05:09:49+00:00	2026-03-08T05:10:31+00:00	42.0s	d=7 \| skip=75 \| err=0 discovered=82 \| processed=82	- extra_json {"canonical_name": "Harvard Law Review", "child_pid": 2439295, "discovery_cutoff": true, "discovery_cutoff_details": {"consecutive_duplicates": 40, "processed_articles": 82, "queued_items": 7}, "discovery_cutoff_elapsed_seconds": 36, "discovery_cutoff_max_runtime_seconds": null, "discovery_cutoff_phase": "discovery", "discovery_cutoff_reason": "duplicate_streak", "file_path": "scrapers/harvard_law_review_scraper.py", "heartbeat_at": "2026-03-08T05:10:19+00:00", "heartbeat_source": "orchestrator", "law_review_id": 1, "orchestrator": "lrscraper", "orchestrator_started_at": "2026-03-08T05:09:49+00:00", "scraper_id": "HarvardLawReviewScraper", "script_path": "scrapers/harvard_law_review_scraper.py", "stderr_path": "logs/orchestrator_runs/1772946589_HarvardLawReviewScraper.err.log", "stdout_path": "logs/orchestrator_runs/1772946589_HarvardLawReviewScraper.out.log", "timeout_minutes": 45}	stdout \| stderr
#590	success	2026-02-06T02:43:34+00:00	2026-02-06T02:52:11+00:00	516.5s	d=9 \| skip=0 \| err=0 discovered=9 \| processed=9	- extra_json {"canonical_name": "Harvard Law Review", "child_pid": 2256464, "file_path": "scrapers/harvard_law_review_scraper.py", "heartbeat_at": "2026-02-06T02:52:04+00:00", "heartbeat_source": "orchestrator", "law_review_id": 1, "orchestrator": "lrscraper", "orchestrator_started_at": "2026-02-06T02:43:34+00:00", "scraper_id": "HarvardLawReviewScraper", "script_path": "scrapers/harvard_law_review_scraper.py", "stderr_path": "logs/orchestrator_runs/1770345814_HarvardLawReviewScraper.err.log", "stdout_path": "logs/orchestrator_runs/1770345814_HarvardLawReviewScraper.out.log", "timeout_minutes": 25}	stdout \| stderr
#521	failed	2026-01-28T05:32:57+00:00	2026-01-28T05:33:21+00:00	24.0s	d=0 \| skip=0 \| err=0 discovered=0 \| processed=0	BrokenPipeError: [Errno 32] Broken pipe traceback Traceback (most recent call last): File "/home/arbel/sites/lrscraper/scrapers/harvard_law_review_scraper.py", line 103, in discover_urls self.print_status(f"Found: {metadata['title']} ({filename})") File "/home/arbel/sites/lrscraper/light_base_scraper.py", line 107, in print_status print(msg) BrokenPipeError: [Errno 32] Broken pipe During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/home/arbel/sites/lrscraper/light_base_scraper.py", line 309, in run items = await self.discover_urls() ^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/arbel/sites/lrscraper/smart_scraper_simple.py", line 385, in wrapped_discover items = await original_discover() ^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/arbel/sites/lrscraper/scrapers/harvard_law_review_scraper.py", line 109, in discover_urls self.print_status(f"Error processing {article_url}: {e}", "error") File "/home/arbel/sites/lrscraper/light_base_scraper.py", line 107, in print_status print(msg) BrokenPipeError: [Errno 32] Broken pipe extra_json {"canonical_name": "Harvard Law Review"}	-
#452	timeout	2026-01-22T12:30:34+00:00	2026-01-22T13:15:34+00:00	2700.0s	d=0 \| skip=0 \| err=1 discovered=- \| processed=-	timeout: Timeout after 45 minutes extra_json {"returncode": null}	-
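
Run #521 failed because `print(msg)` inside `print_status` raised `BrokenPipeError`, and the error handler in `discover_urls` then called `print_status` again and hit the same error. One common mitigation is to make status output tolerate a closed stream. The sketch below is an assumption about how that could look, not the project's actual `print_status`; `safe_print_status` and its parameters are hypothetical names.

```python
import sys


def safe_print_status(msg, level="info", stream=None):
    """Write a status line to `stream` (default: sys.stdout).

    Returns True on success and False if the stream's pipe is closed,
    instead of letting BrokenPipeError propagate to the caller.
    """
    if stream is None:
        stream = sys.stdout
    try:
        stream.write(f"[{level}] {msg}\n")
        stream.flush()
        return True
    except BrokenPipeError:
        # The process reading this pipe has gone away. Drop the message so
        # a logging failure does not abort the scraper's own work.
        return False
```

A caller that needs the message preserved could fall back to a log file when the function returns False.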