# Analysis Stall Fix Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Prevent summary analysis from appearing stuck by bounding URL-check latency, reporting progress, and preventing duplicate analysis threads.

**Architecture:** Keep matching behavior in `mycode/main.py`, add optional progress callbacks, and use `ThreadPoolExecutor` only for independent URL checks. `AnaThread` converts callbacks into existing Qt log signals, while `MainWindow` owns button state and thread lifecycle.

**Tech Stack:** Python 3.8, `concurrent.futures`, pandas, PySide6, unittest

---

### Task 1: Concurrent Deleted-Article Checks

**Files:**
- Modify: `tests/test_main.py`
- Modify: `mycode/main.py`

- [ ] **Step 1: Write failing tests**

Add tests proving duplicate URLs are fetched once, request failures retain rows,
and the progress callback reaches `(total, total)`.

- [ ] **Step 2: Verify tests fail**

Run:

```powershell
.\runtime\python.exe -m unittest tests.test_main.DeletedWechatContentFilterTest -v
```

Expected: failure because `filter_deleted_wechat_rows` does not accept
`progress_callback` or bounded concurrency options.

- [ ] **Step 3: Implement bounded concurrency**

Use `ThreadPoolExecutor(max_workers=8)` and `as_completed`. Submit one task per
unique non-empty URL, preserve original row order, retain rows for failed
requests, and invoke `progress_callback(completed, total)` after each completed
URL.

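The requirements in Step 3 can be sketched as follows. This is only an illustration under stated assumptions, not the actual `filter_deleted_wechat_rows`: the function name `filter_deleted_rows_sketch`, the `is_deleted` callable, and the list-of-dicts row shape with a `"url"` key are all hypothetical stand-ins (the real code works on pandas data). Any per-request timeout is assumed to live inside `is_deleted`.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def filter_deleted_rows_sketch(rows, is_deleted, progress_callback=None, max_workers=8):
    """Return the rows whose URL is not confirmed deleted, in original order.

    Assumptions for illustration: `rows` is a list of dicts with a "url" key,
    and `is_deleted` is a callable that takes a URL and returns a bool.
    """
    # One task per unique non-empty URL, kept in first-seen order.
    unique_urls = list(dict.fromkeys(row["url"] for row in rows if row.get("url")))
    total = len(unique_urls)
    deleted = {}

    if total:
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            futures = {pool.submit(is_deleted, url): url for url in unique_urls}
            for completed, future in enumerate(as_completed(futures), start=1):
                url = futures[future]
                try:
                    deleted[url] = bool(future.result())
                except Exception:
                    # A failed request is not proof of deletion, so keep the row.
                    deleted[url] = False
                if progress_callback is not None:
                    progress_callback(completed, total)

    # Filtering the original sequence preserves row order; rows without a URL stay.
    return [row for row in rows if not deleted.get(row.get("url"), False)]
```

In this sketch the callback runs on the thread that iterates `as_completed`, not on the pool workers, so the caller does not need extra locking around it.
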
- [ ] **Step 4: Verify tests pass**

Rerun the unittest command from Step 2 and expect all deleted-content tests to pass.

### Task 2: Rule-Scan Progress

**Files:**
- Modify: `tests/test_main.py`
- Modify: `mycode/main.py`

- [ ] **Step 1: Write a failing test**

Add a focused test using small injected data frames to prove `ana_wechat`
reports final rule progress without network access.

- [ ] **Step 2: Verify the test fails**

Run the new test directly and expect failure because `ana_wechat` has no
progress callback or injectable data frames.

- [ ] **Step 3: Implement progress callbacks**

Add optional `progress_callback`, `rules_df`, and `articles_df` parameters.
Report `(completed_rules, total_rules)` after each rule and pass a separate URL
progress callback to `filter_deleted_wechat_rows`. Preserve default production
behavior when parameters are omitted.

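The optional-parameter pattern from Step 3 might look roughly like this. It is a simplified sketch, not `ana_wechat` itself: `ana_sketch`, `load_rules`, and `load_articles` are hypothetical names, plain lists stand in for the pandas data frames, and the separate URL progress callback is left out to keep the example short.

```python
def load_rules():
    """Hypothetical stand-in for the production rule loader."""
    return ["alpha", "beta"]


def load_articles():
    """Hypothetical stand-in for the production article loader."""
    return [{"title": "alpha release notes"}, {"title": "weekly digest"}]


def ana_sketch(progress_callback=None, rules=None, articles=None):
    """Match each rule keyword against article titles and report progress."""
    # Fall back to the production loaders only when nothing is injected,
    # so existing callers keep their current behavior.
    rules = load_rules() if rules is None else rules
    articles = load_articles() if articles is None else articles

    total = len(rules)
    matches = []
    for completed, keyword in enumerate(rules, start=1):
        matches.extend(
            (keyword, article["title"])
            for article in articles
            if keyword in article["title"]
        )
        # Report (completed_rules, total_rules) after each rule.
        if progress_callback is not None:
            progress_callback(completed, total)
    return matches
```

Because the inputs are injectable, a test can pass small in-memory data and assert that the last reported pair equals `(total, total)` without touching the network.
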
- [ ] **Step 4: Verify the test passes**

Run the focused test and the full `tests.test_main` module.

### Task 3: Qt Lifecycle and User Feedback

**Files:**
- Modify: `start.py`

- [ ] **Step 1: Wire progress messages**

In `AnaThread`, emit periodic messages for rule scanning and article-link
checking, including completed and total counts.

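One way to keep the messages periodic rather than per-item is a small throttling wrapper around the `(completed, total)` callback. The helper below is a hypothetical sketch: `make_progress_reporter`, its `step` parameter, and the message format are assumptions, and `emit` stands for whatever callable delivers a log line (in `AnaThread` that would presumably be the `emit` method of the existing log signal).

```python
def make_progress_reporter(emit, label, step=10):
    """Build a (completed, total) callback that emits throttled log lines.

    `emit` is any callable taking one string. `step` must be a positive
    integer; both names are illustrative assumptions.
    """
    def report(completed, total):
        # Always report the final count; otherwise report every `step` items.
        if completed == total or completed % step == 0:
            emit(f"{label}: {completed}/{total}")

    return report


# Usage example with a plain list standing in for the Qt log signal.
messages = []
reporter = make_progress_reporter(messages.append, "Checking links", step=10)
for done in range(1, 26):
    reporter(done, 25)
```

Separate reporters with different labels could then be passed as the rule-scan callback and the URL-check callback, which keeps the two kinds of progress distinguishable in the log.
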
- [ ] **Step 2: Prevent duplicate runs**

In `MainWindow.start_ana`, return early when the current analysis thread is
running, disable `bAna` before starting, connect `finished` to a cleanup method,
and restore the button in cleanup.

- [ ] **Step 3: Verify syntax and imports**

Run:

```powershell
.\runtime\python.exe -m py_compile start.py mycode\main.py tests\test_main.py
```

Expected: exit code 0.

### Task 4: Final Verification

**Files:**
- Verify: `tests/test_main.py`
- Verify: `start.py`
- Verify: `mycode/main.py`

- [ ] **Step 1: Run all tests**

```powershell
.\runtime\python.exe -m unittest discover -s tests -v
```

Expected: all tests pass.

- [ ] **Step 2: Review the diff**

Confirm the diff contains only the planned analysis behavior, tests, and
documentation, and does not touch `col.bat` or `mycode/main2.py`.