# paper_server

## What This Repo Is

`paper_server` is a Django 4.2 backend that mixes a general admin platform with a paper/resource acquisition pipeline.

Main stack:

- Django + DRF
- PostgreSQL
- Redis cache
- Celery + django-celery-beat + django-celery-results
- Channels + Daphne for WebSocket push

The repo is not just a "paper service". It is organized into seven apps:

- `apps.system`: users, departments, roles, permissions, files, schedules, config
- `apps.auth1`: login/auth flows based on JWT, session, SMS, WeChat, face login
- `apps.wf`: a configurable workflow/ticket engine
- `apps.ops`: ops endpoints for logs, backups, server metrics, cache, Celery, Redis
- `apps.resm`: paper metadata, abstract/fulltext fetch, PDF download pipeline
- `apps.utils`: shared base models, viewsets, permissions, middleware, pagination, helpers
- `apps.ws`: websocket consumers and routing

## Runtime Entry Points

- `manage.py` starts Django with `server.settings`
- `server/settings.py` is the central settings file and imports environment values from `config/conf.py`
- `server/urls.py` mounts all REST APIs, Swagger, Django admin, and the SPA entry (`dist/index.html`)
- `server/asgi.py` serves HTTP plus WebSocket traffic
- `server/celery.py` creates the Celery app using `config.conf.BASE_PROJECT_CODE`

## Environment And Config

This project expects local runtime config files under `config/`:

- `config/conf.py`: Django secret/config, database, cache, Celery broker, backup shell paths
- `config/conf.json`: runtime system config loaded through `server.settings.get_sysconfig()`

Important implications:

- the repo can start only when `config/conf.py` is valid for the target environment
- many ops tasks assume Linux paths from `BACKUP_PATH` and `SH_PATH`
- Redis is used by cache, Celery broker, and Channels

## URL Map

Primary REST prefixes:

- `api/auth/`
- `api/system/`
- `api/wf/`
- `api/ops/`
- `api/resm/`
- `api/utils/`

Other routes:

- `api/swagger/` and `api/redoc/`
- `django/admin/`
- `ws/my/`
- `ws/<room_name>/`

## Core Architectural Patterns

### Shared Base Models

`apps.utils.models` defines the core model layer:

- `BaseModel`: string primary key generated by Snowflake-style `idWorker`
- `SoftModel`: soft delete support
- `CommonAModel` / `CommonBModel`: standard audit fields
- `ParentModel`: tree-like parent linkage with a stored `parent_link`

Many business models inherit from these classes, so ID generation, soft deletion, and audit fields are cross-cutting behavior.

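
To make the "Snowflake-style" primary key concrete, here is a minimal standalone sketch of how such a generator usually works: a millisecond timestamp, a worker id, and a per-millisecond sequence packed into one integer and returned as a string. The class name `IdWorker`, the bit widths, and the epoch are assumptions for illustration; the real `idWorker` in `apps.utils` may differ in all of these.

```python
import threading
import time


class IdWorker:
    """Snowflake-style generator: timestamp | worker id | sequence (illustrative only)."""

    EPOCH_MS = 1_577_836_800_000  # 2020-01-01 UTC, an arbitrary custom epoch (assumption)
    WORKER_BITS = 10              # assumption: up to 1024 workers
    SEQUENCE_BITS = 12            # assumption: up to 4096 ids per millisecond

    def __init__(self, worker_id: int = 0) -> None:
        if not 0 <= worker_id < (1 << self.WORKER_BITS):
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def get_id(self) -> str:
        with self._lock:
            now = time.time_ns() // 1_000_000
            if now < self._last_ms:
                # Clock moved backwards: keep using the last timestamp we issued.
                now = self._last_ms
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & ((1 << self.SEQUENCE_BITS) - 1)
                if self._sequence == 0:
                    # Sequence exhausted for this millisecond: move to the next one.
                    now = self._last_ms + 1
            else:
                self._sequence = 0
            self._last_ms = now
            value = (
                ((now - self.EPOCH_MS) << (self.WORKER_BITS + self.SEQUENCE_BITS))
                | (self.worker_id << self.SEQUENCE_BITS)
                | self._sequence
            )
            # Returned as a string to match the "string primary key" described above.
            return str(value)
```

Because the timestamp occupies the high bits, ids from one worker sort in creation order, which is a common reason to prefer this scheme over random UUIDs for primary keys.
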
### Shared ViewSet Base

`apps.utils.viewsets.CustomGenericViewSet` is the main DRF base class. It adds:

- permission code registration through `perms_map`
- per-user/request cache protection for duplicate requests
- data-scope filtering based on RBAC and department range
- serializer switching per action
- `select_related` / `prefetch_related` hooks
- row locking behavior for mutable operations inside transactions

When adding endpoints, this class is usually the first place to check for inherited behavior.

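
The duplicate-request protection listed above can be pictured as a short-lived cache key per user and request. The sketch below is not the code in `CustomGenericViewSet`; it is a framework-free illustration in which `TTLCache`, `request_fingerprint`, `is_duplicate`, and the 2-second window are all hypothetical. `TTLCache.add` mimics the "store only if absent" semantics of Django's `cache.add`.

```python
import hashlib
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """Tiny in-memory stand-in for a Redis-backed cache (illustrative only)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._data: Dict[str, Tuple[float, object]] = {}

    def add(self, key: str, value: object, timeout: float) -> bool:
        """Store the key only if it is absent or expired; return True if stored."""
        now = self._clock()
        entry = self._data.get(key)
        if entry is not None and entry[0] > now:
            return False
        self._data[key] = (now + timeout, value)
        return True


def request_fingerprint(user_id: str, method: str, path: str, body: bytes) -> str:
    """Build a cache key that identifies one user's request payload."""
    digest = hashlib.sha256(body).hexdigest()
    return f"dup:{user_id}:{method.upper()}:{path}:{digest}"


def is_duplicate(cache: TTLCache, user_id: str, method: str, path: str,
                 body: bytes, window_seconds: float = 2.0) -> bool:
    """Return True when the same user repeats the same request inside the window."""
    key = request_fingerprint(user_id, method, path, body)
    return not cache.add(key, 1, window_seconds)
```

A view would call `is_duplicate(...)` before running a mutating action and reject the request when it returns `True`. Which HTTP methods are guarded and how long the window lasts are decisions made in the real base class, so check it before relying on this picture.
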
### Auth And Permissions

- default auth uses JWT plus DRF basic/session fallbacks
- global default permission is authenticated + `apps.utils.permission.RbacPermission`
- custom user model is `apps.system.models.User`
- websocket auth is handled in `apps.utils.middlewares.TokenAuthMiddleware` via `token` query param

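
As an illustration of the `token` query-param convention, the sketch below extracts the token from an ASGI connection scope, where `query_string` is a byte string. The helper name `token_from_scope` is hypothetical, and JWT validation and user lookup, which the real `TokenAuthMiddleware` would have to perform, are deliberately left out.

```python
from typing import Optional
from urllib.parse import parse_qs


def token_from_scope(scope: dict) -> Optional[str]:
    """Return the first `token` query parameter from an ASGI scope, or None."""
    raw = scope.get("query_string", b"")
    params = parse_qs(raw.decode("utf-8", errors="replace"))
    values = params.get("token")
    # parse_qs drops blank values by default, so "?token=" also yields None.
    return values[0] if values else None
```

A client would therefore connect to a URL of the form `ws/my/?token=<jwt>`; a connection without the parameter yields `None` here, which corresponds to the limited no-token behavior noted under Important Caveats.
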
## App Notes

### `apps.system`

This is the platform foundation layer.

Key models:

- `User`
- `Dept`
- `Role`
- `Permission`
- `Post` / `UserPost` / `PostRole`
- `Dictionary` / `DictType`
- `File`
- `MySchedule`

This app owns the RBAC structure used by the rest of the project.

### `apps.wf`

This app is a full workflow engine, not just a simple approval table.

Key models:

- `Workflow`
- `State`
- `Transition`
- `CustomField`
- `Ticket`
- `TicketFlow`

Important logic lives in `apps.wf.services.WfService`:

- initialize a workflow from its start state
- generate ticket serial numbers
- resolve next state from transition conditions
- resolve participants from person/role/dept/post/field/code/robot
- enforce handle permissions
- create transition logs
- send SMS notifications
- trigger robot tasks and on-reach hooks

When working on ticket behavior, read `apps/wf/services.py` before touching serializers or views.

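
To give a feel for "resolve next state from transition conditions", here is a minimal sketch of conditional routing: a transition carries a default destination plus an ordered list of conditions, and the first condition that matches the ticket data wins. The dataclass, its field names, and the use of Python callables as conditions are assumptions for illustration; `WfService` may store and evaluate conditions quite differently.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple

TicketData = Dict[str, Any]
# A condition pairs a predicate over the ticket data with a target state name.
Condition = Tuple[Callable[[TicketData], bool], str]


@dataclass
class Transition:
    """Hypothetical, simplified stand-in for the `Transition` model."""

    source: str
    destination: str  # default target when no condition matches
    conditions: List[Condition] = field(default_factory=list)


def resolve_next_state(transition: Transition, ticket_data: TicketData) -> str:
    """Return the target of the first matching condition, else the default."""
    for predicate, target in transition.conditions:
        if predicate(ticket_data):
            return target
    return transition.destination


# Usage: route large purchase tickets to an extra review step.
submit = Transition(
    source="draft",
    destination="manager_review",
    conditions=[(lambda data: data.get("amount", 0) > 10_000, "finance_review")],
)
```

With this shape, `resolve_next_state(submit, {"amount": 50_000})` selects `finance_review`, while a ticket with a smaller or missing amount falls through to `manager_review`.
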
### `apps.ops`

This app exposes runtime/maintenance APIs:

- git reload tasks
- database/media backup
- log browsing
- CPU/memory/disk inspection
- Celery info
- Redis info
- cache get/set
- DRF request log and third-party request log listing

Some behaviors depend on shell scripts and Linux-only paths from config.

### `apps.resm`

This is the paper pipeline.

Key models:

- `Paper`: stores DOI/OpenAlex metadata, OA flags, abstract/fulltext state, fetch status, failure reason, and local file save helpers
- `PaperAbstract`: separate abstract storage

The paper fetch pipeline in `apps/resm/tasks.py` currently includes:

- metadata ingestion from OpenAlex
- abstract/fulltext XML fetch from Elsevier
- PDF fetch from OA URL
- PDF fetch from OpenAlex content API
- PDF fetch from Elsevier
- Sci-Hub fallback
- task fan-out and stuck-download release

Download behavior is stateful:

- `fetch_status="downloading"` is used as a coarse lock
- `fail_reason` accumulates fetch failures
- files are stored under `media/papers/<year>/<month>/<day>/`

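
The stateful behavior above can be sketched as a fallback chain over PDF sources. This is an in-memory illustration, not the code in `apps/resm/tasks.py`: the `Paper` dataclass, the status values other than `"downloading"`, the source names, the zero-padded date folders, and the file naming are all assumptions.

```python
from dataclasses import dataclass
from datetime import date
from pathlib import PurePosixPath
from typing import Callable, List, Optional, Tuple


@dataclass
class Paper:
    """Hypothetical, simplified stand-in for the `Paper` model."""

    doi: str
    fetch_status: str = "pending"  # only "downloading" comes from the text above
    fail_reason: str = ""
    pdf_path: Optional[str] = None


# A source is a name plus a callable that returns PDF bytes or raises.
Source = Tuple[str, Callable[[Paper], bytes]]


def paper_dir(day: date) -> PurePosixPath:
    """Build media/papers/<year>/<month>/<day>; zero-padding is an assumption."""
    return PurePosixPath("media/papers") / f"{day.year}" / f"{day.month:02d}" / f"{day.day:02d}"


def download_pdf(paper: Paper, sources: List[Source], today: date) -> bool:
    """Try each source in order; record every failure in `fail_reason`."""
    if paper.fetch_status == "downloading":
        return False  # another worker already holds the coarse lock
    paper.fetch_status = "downloading"
    for name, fetch in sources:
        try:
            content = fetch(paper)
        except Exception as exc:
            paper.fail_reason += f"{name}: {exc}; "
            continue
        # A real task would write `content` to disk here; the sketch only records the path.
        filename = paper.doi.replace("/", "_") + ".pdf"
        paper.pdf_path = str(paper_dir(today) / filename)
        paper.fetch_status = "done"
        return True
    paper.fetch_status = "failed"
    return False
```

In a database-backed task the check-and-set on `fetch_status` would normally need to be a single conditional update so that two workers cannot both claim the same paper. The in-memory version above ignores that race, and a lock like this also needs some way to release rows left in `"downloading"` by a crashed worker, which is presumably the role of the stuck-download release step.
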
This app has recent local edits in the working tree, so read carefully before changing it.

### `apps.ws`

Two websocket patterns exist:

- `MyConsumer`: per-user channel (`user_<id>`) plus optional `event` group
- `RoomConsumer`: shared room chat group

The websocket layer depends on Redis-backed Channels and JWT token parsing in the query string.

## Startup Expectations

Typical local boot sequence:

1. Ensure `config/conf.py` and `config/conf.json` are present and valid.
2. Start PostgreSQL and Redis.
3. Install dependencies from `requirements.txt`.
4. Run `python manage.py migrate`.
5. Optionally run `python manage.py loaddata db.json`.
6. Start Django/Daphne.
7. Start Celery worker/beat separately if async tasks are needed.

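
Step 1 of the sequence above is easy to get wrong on a fresh checkout, so a small preflight check can help. The script below is a hypothetical helper, not something that exists in the repo: it only verifies that both config files are present and that `config/conf.json` parses as JSON. It does not validate the settings inside `config/conf.py`.

```python
import json
from pathlib import Path
from typing import List


def preflight(repo_root: Path) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems: List[str] = []
    conf_py = repo_root / "config" / "conf.py"
    conf_json = repo_root / "config" / "conf.json"

    if not conf_py.is_file():
        problems.append("missing config/conf.py")

    if not conf_json.is_file():
        problems.append("missing config/conf.json")
    else:
        try:
            json.loads(conf_json.read_text(encoding="utf-8"))
        except ValueError as exc:  # covers JSON and text-decoding errors
            problems.append(f"config/conf.json is not valid JSON: {exc}")

    return problems


if __name__ == "__main__":
    for problem in preflight(Path.cwd()):
        print(problem)
```

Running it from the repo root before `python manage.py migrate` surfaces a missing or malformed config file early instead of as an import error during Django startup.
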
## Important Caveats

- The repo currently has uncommitted user changes, especially under `apps/resm/`; do not revert them casually.
- `config/conf.py` contains environment-specific secrets and infrastructure paths; treat edits there as deployment-sensitive.
- Some source files display mojibake in this terminal because the project contains non-UTF8/legacy encoded Chinese comments, but the Python logic is still readable.
- `TokenAuthMiddleware` only proceeds when a token is present; websocket behavior without token is intentionally limited.
- `apps/resm/tasks.py` currently contains hard-coded third-party API credentials and source-specific logic; changing it needs extra caution.

## Good First Files To Read

- `server/settings.py`
- `server/urls.py`
- `apps/utils/models.py`
- `apps/utils/viewsets.py`
- `apps/system/models.py`
- `apps/wf/models.py`
- `apps/wf/services.py`
- `apps/resm/models.py`
- `apps/resm/tasks.py`

## Updating This File

Update `CLAUDE.md` when any of these change:

- startup/config entry points
- app/module boundaries
- workflow engine behavior
- paper download pipeline behavior
- shared base classes or permission patterns