- Python 49.2%
- JavaScript 23.9%
- PHP 11.1%
- Shell 9.2%
- CSS 4.3%
- Other 2.3%
The runbook had been stuck at "Step 3 pending, awaiting wp-admin wizard"
since 2026-05-10, but the actual import and Polylang language tagging
had been completed at some later, undocumented point. The
`wp_term_taxonomy.count` cache lagged at 0 for fr/el/nl, which made
the runbook look correct; the real `wp_term_relationships` rows tell
the true story.
Verified 2026-05-21 (cache refreshed via `wp term recount language`):
English 19,473 / Nederlands 314 / Français 210 / Ελληνικά 173
Per-source matches by guid LIKE: bloggr=171, blogfr=205, blognl=312
— exactly matching source DB published-post counts.
Cleanup also done 2026-05-21: 164 posts that still held legacy
blog{gr,fr,nl}.p2pfoundation.net references in post_content are now
all rewritten (uploads re-hosted at /wp-content/uploads/migrated-<src>/,
?p=NN permalinks remapped to new IDs via post_name join, bare-domain
links rewritten to /<lang>/). Final residual count: 0/0/0. guids
correctly preserved (immutable identifiers).
Pre-cleanup backup at /opt/backups/p2p-blog-pre-step3-2026-05-20/.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
| Name |
|---|
| app |
| tests |
| wp-plugin |
| .env.example |
| .gitignore |
| docker-compose.yml |
| Dockerfile |
| entrypoint.sh |
| README.md |
| requirements.txt |
p2p-translation-cache
Lazy-translation cache service for p2pwiki and p2pblog. A reader picks a target language;
the service translates that one article on first request via LiteLLM (qwen-coder by default)
and caches the result keyed by (source, id, revision, lang). Subsequent readers hit the cache.
API
POST /translate
{
  "source": "wiki",
  "id": "Commons",
  "revision": "75811",
  "lang": "fr",
  "html": "<p>The commons is...</p>"
}
Returns:
{
  "translated_html": "<p>Les communs sont...</p>",
  "cached": false,
  "model": "qwen-coder",
  "elapsed_ms": 4823
}
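A minimal client sketch for the POST /translate call above, using only the Python standard library. The base URL is taken from the health check in the Deploy section; the JSON Content-Type header, the 60-second timeout, and the function names are assumptions for illustration, not part of the documented API.

```python
import json
import urllib.request

# Assumed base URL, borrowed from the Deploy section's health check.
BASE_URL = "https://translate.p2pfoundation.net"


def build_translate_request(source, article_id, revision, lang, html, base_url=BASE_URL):
    """Build (but do not send) a POST /translate request."""
    payload = {
        "source": source,
        "id": article_id,
        "revision": revision,
        "lang": lang,
        "html": html,
    }
    return urllib.request.Request(
        url=f"{base_url}/translate",
        data=json.dumps(payload).encode("utf-8"),
        # Assumption: the service expects a JSON request body.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def translate(source, article_id, revision, lang, html, timeout=60):
    """Send the request and return the decoded JSON response as a dict."""
    req = build_translate_request(source, article_id, revision, lang, html)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)
```

A caller would read `translated_html` from the returned dict; the `cached` flag shows whether the LLM was actually invoked for this request.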
GET /health
{ "status": "ok", "redis": true, "llm": true }
GET /stats
{ "cache_hits": 142, "cache_misses": 38, "llm_errors": 0, "inflight": 0 }
Cache
- Key: translate:{source}:{lang}:{sha256(source|id|revision|lang)[:32]}
- No TTL; a revision change in the client produces a new key, so old entries are
  inert until evicted by Redis allkeys-lru (capped at 512 MB).
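The key format above could be computed roughly as follows. This is a sketch under stated assumptions: the four fields are joined with a literal "|", encoded as UTF-8, and the digest is the hex form truncated to 32 characters. The service's actual encoding may differ.

```python
import hashlib


def cache_key(source, article_id, revision, lang):
    """Build a cache key of the form translate:{source}:{lang}:{digest}."""
    # Assumption: fields are joined with "|" and hashed as UTF-8 text.
    raw = "|".join([source, article_id, revision, lang])
    digest = hashlib.sha256(raw.encode("utf-8")).hexdigest()[:32]
    return f"translate:{source}:{lang}:{digest}"
```

Because the revision is part of the hashed input, a new revision yields a different key and the old entry is simply never read again.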
Inflight dedup
When N readers request the same (source, id, revision, lang) simultaneously,
only one LiteLLM call fires; the others await the same future and return
together. Avoids duplicate spend on cold pages.
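One way to implement this kind of deduplication with asyncio is a dictionary of pending futures keyed by the request tuple. The class and method names below are hypothetical, and the sketch is not taken from the service's code.

```python
import asyncio


class InflightDeduper:
    """Let concurrent callers with the same key share one unit of work."""

    def __init__(self):
        self._inflight = {}  # key -> asyncio.Future for the pending result

    async def run(self, key, work):
        existing = self._inflight.get(key)
        if existing is not None:
            # Follower: wait for the leader's result instead of calling work().
            return await existing

        # Leader: register a future, then do the work exactly once.
        fut = asyncio.get_running_loop().create_future()
        self._inflight[key] = fut
        try:
            result = await work()
            fut.set_result(result)
            return result
        except Exception as exc:
            fut.set_exception(exc)
            fut.exception()  # mark as retrieved so asyncio does not log a warning
            raise
        finally:
            self._inflight.pop(key, None)
            if not fut.done():
                # Leader was cancelled: release any followers.
                fut.cancel()
```

Here `work` is any zero-argument coroutine function, for example a wrapper around the LiteLLM call. Followers receive the same result, or the same exception, as the leader.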
Output sanitization
LLM output is passed through bleach with a wiki-friendly tag/attribute
allowlist. <script> and event-handler attributes are stripped. Models
sometimes wrap output in ```html ... ``` despite instructions; the
fences are stripped before sanitization.
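The two steps described above might look like the following sketch. The allowlist shown is a hypothetical example, not the service's real list, and the function names are made up for illustration. `bleach` is a third-party package and is imported lazily, so the fence-stripping helper works without it. The fence marker is built programmatically only to avoid embedding a literal fence in this example.

```python
import re

FENCE = "`" * 3  # the Markdown code-fence marker
_FENCE_RE = re.compile(
    r"^\s*" + re.escape(FENCE) + r"[A-Za-z0-9_-]*[ \t]*\r?\n"
    r"(.*?)\r?\n?" + re.escape(FENCE) + r"\s*$",
    re.DOTALL,
)

# Hypothetical wiki-friendly allowlist; the real one may differ.
ALLOWED_TAGS = {
    "p", "a", "b", "i", "em", "strong", "ul", "ol", "li", "br",
    "h2", "h3", "h4", "blockquote", "code", "pre",
    "table", "thead", "tbody", "tr", "th", "td",
}
ALLOWED_ATTRIBUTES = {"a": ["href", "title"], "th": ["colspan"], "td": ["colspan"]}


def strip_code_fence(text):
    """Remove a single wrapping code fence (with optional language tag)."""
    match = _FENCE_RE.match(text)
    return match.group(1).strip() if match else text.strip()


def sanitize_llm_html(text):
    """Strip fences first, then sanitize against the allowlist."""
    import bleach  # third-party: pip install bleach

    return bleach.clean(
        strip_code_fence(text),
        tags=ALLOWED_TAGS,
        attributes=ALLOWED_ATTRIBUTES,
        strip=True,  # drop disallowed markup instead of escaping it
    )
```

Attributes not listed for a tag, including event handlers such as `onclick`, are removed by `bleach.clean` because they are absent from the allowlist.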
Deploy
cp .env.example .env
# fill in LITELLM_API_KEY
docker compose up -d --build
curl https://translate.p2pfoundation.net/health
Why qwen by default
Free via the project's existing LiteLLM stack. Acceptable quality for major
European languages (FR, ES, DE, NL, IT, PT). Long-tail languages may warrant
upgrading to a paid model: change LITELLM_MODEL in .env and restart the service.
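As an illustration of how the model choice might be resolved at startup, here is a small sketch. The LITELLM_MODEL variable and the qwen-coder default come from this README; the function name and the treatment of empty values are assumptions.

```python
import os

DEFAULT_MODEL = "qwen-coder"


def resolve_model(env=None):
    """Return the configured model name, falling back to the default."""
    env = os.environ if env is None else env
    # Assumption: an unset or empty LITELLM_MODEL means "use the default".
    return env.get("LITELLM_MODEL") or DEFAULT_MODEL
```

Passing a plain dict instead of `os.environ` makes the behavior easy to check without touching the process environment.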