zcspider/docs/superpowers/plans/2026-06-18-analysis-stall.md


Analysis Stall Fix Implementation Plan

For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (- [ ]) syntax for tracking.

Goal: Prevent summary analysis from appearing stuck by bounding URL-check latency, reporting progress, and preventing duplicate analysis threads.

Architecture: Keep matching behavior in mycode/main.py, add optional progress callbacks, and use ThreadPoolExecutor only for independent URL checks. AnaThread converts callbacks into existing Qt log signals, while MainWindow owns button state and thread lifecycle.

Tech Stack: Python 3.8, concurrent.futures, pandas, PySide6, unittest


Task 1: Concurrent Deleted-Article Checks

Files:

  • Modify: tests/test_main.py

  • Modify: mycode/main.py

  • Step 1: Write failing tests

Add tests proving duplicate URLs are fetched once, request failures retain rows, and the progress callback reaches (total, total).

  • Step 2: Verify tests fail

Run:

.\runtime\python.exe -m unittest tests.test_main.DeletedWechatContentFilterTest -v

Expected: failure because filter_deleted_wechat_rows does not accept progress_callback or bounded concurrency options.

  • Step 3: Implement bounded concurrency

Use ThreadPoolExecutor(max_workers=8) and as_completed. Submit one task per unique non-empty URL, preserve original row order, retain rows for failed requests, and invoke progress_callback(completed, total) after each completed URL.
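The step above can be sketched roughly as follows. This is a minimal illustration, not the actual body of filter_deleted_wechat_rows: the helper name check_urls_concurrently, the injected is_deleted callable, and the list-of-flags return value are all assumptions made so the example runs without pandas or network access.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, List, Optional


def check_urls_concurrently(
    urls: List[str],
    is_deleted: Callable[[str], bool],  # hypothetical checker, injected for testing
    progress_callback: Optional[Callable[[int, int], None]] = None,
    max_workers: int = 8,
) -> List[bool]:
    """Return one keep flag per input URL, in the original row order."""
    # Deduplicate while keeping first-seen order; empty URLs are never submitted.
    unique_urls = list(dict.fromkeys(u for u in urls if u))
    total = len(unique_urls)
    deleted: Dict[str, bool] = {}

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(is_deleted, url): url for url in unique_urls}
        # as_completed yields futures in completion order, not submission order.
        for completed, future in enumerate(as_completed(futures), start=1):
            url = futures[future]
            try:
                deleted[url] = bool(future.result())
            except Exception:
                # A failed request is not proof of deletion, so the row is kept.
                deleted[url] = False
            if progress_callback is not None:
                progress_callback(completed, total)

    # Map results back onto the input order; rows with an empty URL are kept.
    return [not deleted.get(url, False) for url in urls]
```

Because results are stored by URL and only mapped back onto the rows at the end, completion order does not affect row order. The callback is invoked from the calling thread, so the last call is always (total, total) whenever at least one URL was checked.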

  • Step 4: Verify tests pass

Rerun the Step 2 unittest command; all deleted-content tests should now pass.

Task 2: Rule-Scan Progress

Files:

  • Modify: tests/test_main.py

  • Modify: mycode/main.py

  • Step 1: Write a failing test

Add a focused test using small injected data frames to prove ana_wechat reports final rule progress without network access.

  • Step 2: Verify the test fails

Run the new test directly and expect failure because ana_wechat has no progress callback or injectable data frames.

  • Step 3: Implement progress callbacks

Add optional progress_callback, rules_df, and articles_df parameters. Report (completed_rules, total_rules) after each rule and pass a separate URL progress callback to filter_deleted_wechat_rows. Preserve default production behavior when parameters are omitted.
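One possible shape for the optional parameters is sketched below. It is an assumption-laden illustration rather than the real ana_wechat: scan_rules and load_rules are hypothetical names, plain lists stand in for rules_df and articles_df so the sketch runs without pandas, and the matching rule (substring containment) is invented for the example. The separate URL progress callback is left out.

```python
from typing import Callable, List, Optional, Sequence

ProgressCallback = Callable[[int, int], None]


def load_rules() -> List[str]:
    """Hypothetical stand-in for the production rule loader."""
    raise NotImplementedError("production loading is out of scope for this sketch")


def scan_rules(
    articles: Sequence[str],
    rules: Optional[Sequence[str]] = None,
    progress_callback: Optional[ProgressCallback] = None,
) -> List[str]:
    """Return the articles that contain at least one rule keyword."""
    # Fall back to the production loader only when nothing is injected,
    # so existing callers keep their current behavior.
    if rules is None:
        rules = load_rules()

    total_rules = len(rules)
    matched = set()
    for completed_rules, keyword in enumerate(rules, start=1):
        matched.update(i for i, text in enumerate(articles) if keyword in text)
        # Report after every rule; the final call is (total_rules, total_rules).
        if progress_callback is not None:
            progress_callback(completed_rules, total_rules)

    # Preserve the original article order in the result.
    return [articles[i] for i in sorted(matched)]
```

A test can then inject two or three rows and a list-appending callback, and assert on the last reported pair without touching the network or the production data files.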

  • Step 4: Verify the test passes

Run the focused test, then the full tests.test_main module, and expect both to pass.

Task 3: Qt Lifecycle and User Feedback

Files:

  • Modify: start.py

  • Step 1: Wire progress messages

In AnaThread, emit periodic messages for rule scanning and article-link checking, including completed and total counts.

  • Step 2: Prevent duplicate runs

In MainWindow.start_ana, return early when the current analysis thread is running, disable bAna before starting, connect finished to a cleanup method, and restore the button in cleanup.
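The guard logic in this step can be modeled without Qt, as in the sketch below. This is deliberately not PySide6 code: threading.Thread stands in for the analysis QThread, a direct call in a finally block stands in for the finished signal, and FakeButton stands in for bAna. AnalysisController, FakeButton, and the boolean return value are hypothetical names and choices made for illustration only.

```python
import threading
from typing import Callable, Optional


class FakeButton:
    """Stand-in for the bAna button; it only tracks the enabled state."""

    def __init__(self) -> None:
        self.enabled = True

    def setEnabled(self, enabled: bool) -> None:
        self.enabled = enabled


class AnalysisController:
    """Models the start_ana guard and cleanup without any Qt dependency."""

    def __init__(self, work: Callable[[], None]) -> None:
        self.button = FakeButton()
        self._work = work
        self._thread: Optional[threading.Thread] = None

    def start_ana(self) -> bool:
        # Return early while a previous analysis is still running.
        if self._thread is not None and self._thread.is_alive():
            return False
        # Disable the button before the worker starts, not after.
        self.button.setEnabled(False)
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()
        return True

    def _run(self) -> None:
        try:
            self._work()
        finally:
            # Plays the role of the cleanup slot connected to `finished`;
            # it runs even if the analysis raises.
            self._on_finished()

    def _on_finished(self) -> None:
        self.button.setEnabled(True)
```

In this model the cleanup runs on the worker thread, which would not be safe for real widgets. In the Qt version the cleanup should be a slot on MainWindow connected to the thread's finished signal, so that the button is restored from the GUI thread.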

  • Step 3: Verify syntax and imports

Run:

.\runtime\python.exe -m py_compile start.py mycode\main.py tests\test_main.py

Expected: exit code 0.

Task 4: Final Verification

Files:

  • Verify: tests/test_main.py

  • Verify: start.py

  • Verify: mycode/main.py

  • Step 1: Run all tests

.\runtime\python.exe -m unittest discover -s tests -v

Expected: all tests pass.

  • Step 2: Review the diff

Confirm the diff contains only the planned analysis behavior, tests, and documentation, and does not touch col.bat or mycode/main2.py.