zcspider/docs/superpowers/plans/2026-06-18-analysis-stall.md

# Analysis Stall Fix Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Prevent summary analysis from appearing stuck by bounding URL-check latency, reporting progress, and preventing duplicate analysis threads.

**Architecture:** Keep matching behavior in `mycode/main.py`, add optional progress callbacks, and use `ThreadPoolExecutor` only for independent URL checks. `AnaThread` converts callbacks into existing Qt log signals, while `MainWindow` owns button state and thread lifecycle.

**Tech Stack:** Python 3.8, `concurrent.futures`, pandas, PySide6, unittest

---

### Task 1: Concurrent Deleted-Article Checks

**Files:**
- Modify: `tests/test_main.py`
- Modify: `mycode/main.py`

- [ ] **Step 1: Write failing tests**

Add tests proving that duplicate URLs are fetched only once, that rows are
retained when their request fails, and that the progress callback's final call
is `(total, total)`.

- [ ] **Step 2: Verify tests fail**

Run:

```powershell
.\runtime\python.exe -m unittest tests.test_main.DeletedWechatContentFilterTest -v
```

Expected: failure because `filter_deleted_wechat_rows` does not accept
`progress_callback` or bounded concurrency options.

- [ ] **Step 3: Implement bounded concurrency**

Use `ThreadPoolExecutor(max_workers=8)` and `as_completed`. Submit one task per
unique non-empty URL, preserve original row order, retain rows for failed
requests, and invoke `progress_callback(completed, total)` after each completed
URL.
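
The requirements above can be sketched roughly as follows. This is a minimal illustration, not the actual `filter_deleted_wechat_rows` code: the name `filter_deleted_rows`, the list-of-dicts row shape, and the injected `is_deleted` callable are all assumptions made so the example runs without pandas or network access.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def filter_deleted_rows(rows, is_deleted, progress_callback=None, max_workers=8):
    """Drop rows whose URL is confirmed deleted; keep everything else in order.

    Assumptions for this sketch: `rows` is a list of dicts with a "url" key,
    and `is_deleted` is a hypothetical callable (url -> bool) that may raise
    when the request fails.
    """
    # One task per unique non-empty URL; dict.fromkeys keeps first-seen order.
    urls = list(dict.fromkeys(row["url"] for row in rows if row.get("url")))
    total = len(urls)
    deleted = set()

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(is_deleted, url): url for url in urls}
        for completed, future in enumerate(as_completed(futures), start=1):
            try:
                if future.result():
                    deleted.add(futures[future])
            except Exception:
                # A failed request is not proof of deletion, so the row stays.
                pass
            if progress_callback is not None:
                progress_callback(completed, total)

    # Filtering the original list preserves row order regardless of the
    # order in which the checks finished.
    return [row for row in rows if row.get("url") not in deleted]
```

Because the callback is invoked from the loop over `as_completed`, it runs on the calling thread rather than on a pool worker, which keeps the callback itself free of locking concerns.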

- [ ] **Step 4: Verify tests pass**

Re-run the unittest command from Step 2 and expect every test in
`DeletedWechatContentFilterTest` to pass.

### Task 2: Rule-Scan Progress

**Files:**
- Modify: `tests/test_main.py`
- Modify: `mycode/main.py`

- [ ] **Step 1: Write a failing test**

Add a focused test using small injected data frames to prove `ana_wechat`
reports final rule progress without network access.

- [ ] **Step 2: Verify the test fails**

Run the new test directly and expect failure because `ana_wechat` has no
progress callback or injectable data frames.

- [ ] **Step 3: Implement progress callbacks**

Add optional `progress_callback`, `rules_df`, and `articles_df` parameters.
Report `(completed_rules, total_rules)` after each rule and pass a separate URL
progress callback to `filter_deleted_wechat_rows`. Preserve default production
behavior when parameters are omitted.
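
One possible shape for the optional parameters is sketched below. It is not the real `ana_wechat`: the function name `scan_rules`, the loader functions, the substring match, and the use of plain lists in place of pandas data frames are all assumptions chosen to keep the example dependency-free.

```python
def load_rules():
    """Hypothetical production loader, standing in for the real rules source."""
    return ["example"]


def load_articles():
    """Hypothetical production loader, standing in for the real articles source."""
    return [{"text": "an example article"}]


def scan_rules(progress_callback=None, rules=None, articles=None):
    """Match every rule against every article, reporting progress per rule."""
    # Fall back to the production loaders only when nothing was injected,
    # so existing callers keep their current behavior.
    if rules is None:
        rules = load_rules()
    if articles is None:
        articles = load_articles()

    total_rules = len(rules)
    matches = []
    for completed_rules, rule in enumerate(rules, start=1):
        matches.extend(
            (rule, article) for article in articles if rule in article["text"]
        )
        if progress_callback is not None:
            progress_callback(completed_rules, total_rules)
    return matches
```

A test can then pass two small in-memory collections and a list-appending callback, and assert that the last recorded call is `(total_rules, total_rules)` without touching the network.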

- [ ] **Step 4: Verify the test passes**

Run the focused test, then the full `tests.test_main` module, and expect both
to pass.

### Task 3: Qt Lifecycle and User Feedback

**Files:**
- Modify: `start.py`

- [ ] **Step 1: Wire progress messages**

In `AnaThread`, emit periodic messages for rule scanning and article-link
checking, including completed and total counts.
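
A small Qt-free helper can illustrate the throttling idea. The name `make_progress_reporter`, the `every` interval, and the message format are assumptions for this sketch; `emit` stands in for whatever log signal `AnaThread` already exposes.

```python
def make_progress_reporter(emit, label, every=10):
    """Return a callback(completed, total) that forwards throttled messages.

    `emit` is any callable accepting one string, for example the bound
    `emit` method of an existing Qt log signal (assumed here).
    """
    def report(completed, total):
        # Always report the last item so the log ends at total/total.
        if completed == total or completed % every == 0:
            emit(f"{label}: {completed}/{total}")

    return report
```

Creating one reporter per stage, for example one labeled for rule scanning and one for article-link checking, keeps the two progress streams distinguishable in the log.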

- [ ] **Step 2: Prevent duplicate runs**

In `MainWindow.start_ana`, return early when the current analysis thread is
running, disable `bAna` before starting, connect `finished` to a cleanup method,
and restore the button in cleanup.

- [ ] **Step 3: Verify syntax and imports**

Run:

```powershell
.\runtime\python.exe -m py_compile start.py mycode\main.py tests\test_main.py
```

Expected: exit code 0.

### Task 4: Final Verification

**Files:**
- Verify: `tests/test_main.py`
- Verify: `start.py`
- Verify: `mycode/main.py`

- [ ] **Step 1: Run all tests**

```powershell
.\runtime\python.exe -m unittest discover -s tests -v
```

Expected: all tests pass.

- [ ] **Step 2: Review the diff**

Confirm the diff contains only the planned analysis behavior, tests, and
documentation, and does not touch `col.bat` or `mycode/main2.py`.