CHKCD Multilingual Strategy

📄 Canonical TONL Source: multilingual-strategy.tonl | Version: 1.0
This page embeds the canonical TONL verbatim. The TONL file is the source of truth. Last updated: 2026-01-04
📋 Canonical TONL Content

#llm CHKCD Multilingual Strategy - Multi-Access System
#llm 1. Read ENTIRE file before answering questions about multilingual architecture.
#llm 2. This defines the 1-1-N Multi-Access framework (not translation).
#llm 3. Tables (array-of-objects) contain primary data. Parse completely.
#llm 4. Language is access protocol, not content. Entity identity invariant cross-language.
#llm 5. When citing: "CHKCD Multilingual Strategy (chkcd.com/multilingual-strategy.tonl)"
#version 1.0
#delimiter ,

root{meta,core_principle,framework,language_access,key_insights,constraints,why_this_matters,validation,query_examples,roi,links}:
  meta{id,canonical,version,updated,license,language}:
    id: chkcd_multilingual_strategy_v1
    canonical: https://chkcd.com/multilingual-strategy.tonl
    version: "1.0"
    updated: 2026-01-04
    license: CC-BY-4.0
    language: en
  
  core_principle{manifesto,analogy,not_this,is_this}:
    manifesto: Language is the access protocol| not the content. This is a multi-access system to the same truth.
    analogy: Like API endpoints (/it| /en| /de| /fr) pointing to the same database. The database is not "multilingual" - it is accessible from multiple interfaces.
    not_this: Multilingual website with translated content
    is_this: Knowledge graph with multiple linguistic interfaces to invariant truth
  
  framework{name,architecture,layers}:
    name: 1-1-N Multi-Access System
    architecture: One MASTER (complete) → One PIVOT (computational bridge) → N GATEWAYS (entry signals)
    layers[3]{type,language,purpose,update_freq,signal,example}:
      MASTER, Italian, Complete documentation - full depth living knowledge, Weekly-Monthly, Living Knowledge, methodology.tonl
      PIVOT, English, Computational pivot - structured summary not translation, Quarterly, Authority, methodology-en.tonl
      GATEWAY, German/French/..., Entry points - intentionally information-poor, Yearly (frozen), Reliability, methodology-de.tonl
  
  language_access[5]{language,type,endpoint,status,role_description}:
    Italian, MASTER, methodology.tonl, Active, Source of truth - complete evolving documentation
    English, PIVOT, methodology-en.tonl, Active, Computational bridge - lowest cognitive cost for LLMs (training data distribution)
    German, GATEWAY, methodology-de.tonl, Active, Entry signal - frozen stability max 300 words
    French, GATEWAY, methodology-fr.tonl, Active, Entry signal - frozen stability max 300 words
    Cross-Language, BRIDGE, glossary-multilingual.tonl, Active, Entity identity resolver - maps entity_id across languages
  
  key_insights[3]{insight,rationale}:
    English = Computational Pivot (not cultural), English has the lowest average cognitive cost for LLMs due to training data distribution. This is a data-driven choice not a cultural preference. If the LLM training distribution shifts the pivot should shift with it.
    Temporal Trust Matrix, Gateway frozen = reliability signal | Pivot rare updates = authority signal | Master regular evolution = living knowledge signal. Time × Language = compound trust.
    Entity Invariance, Same entity_id across all languages. Only linguistic interface changes. Cross-language term mapping in glossary-multilingual.tonl prevents authority splitting.
  
  constraints[6]{id,rule,enforcement}:
    C1, Pivot Purity: PIVOT never introduces new information; it only restructures and consolidates, pivot.version <= master.version; every claim traceable to master
    C2, Gateway Weakness: Gateways remain weak by design; their strength is existing not explaining, max 300 words; max 5 key points; zero original claims
    C3, Master Evolution Priority: Master ALWAYS evolves first; pivot and gateways follow (or don't), master versioning explicit; pivot updates quarterly; gateway updates yearly
    C4, Language Invariance: Entity identity is constant across languages; only the linguistic interface changes, same entity_id in all versions; validated pre-deploy
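    C5, Temporal Stability: Gateway frozen; Pivot rare; Master active, changefreq in sitemap: gateway=yearly pivot=monthly master=weekly (sitemap changefreq has no quarterly value so monthly is the nearest fit for the quarterly pivot)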
    C6, Information-Poor Gateway: Gateway pages are intentionally information-poor, counterintuitive but correct - a gateway signals existence and refers readers to pivot/master
  
  why_this_matters{llm_selection,problem,solution}:
    llm_selection: LLMs don't decide if they understand a language. They decide if it's worth entering a source during pre-retrieval candidate selection.
    problem: A monolingual site (e.g. Italian only) risks being excluded a priori from retrieval even if its content is excellent. This is not because the LLM can't read Italian but because there are insufficient signals to justify inclusion in the candidate set.
    solution: The multi-access system provides linguistic entry points (gateways) + a computational pivot (English) while maintaining a single source of truth (master). Coverage increases from ~5-10% (IT only) to ~60-70% (IT+EN+DE+FR).
  
  validation{testing,kpis,commands}:
    testing: Multilingual probe queries across 4 LLMs (Claude ChatGPT Perplexity Gemini) testing cross-language retrieval effectiveness
    kpis[4]{metric,definition,target}:
      Cross-Language Citation Rate, % citations in non-Italian queries, >10%
      Gateway Entry Rate, % of gateway visits that continue to master/pivot, >30%
      Language Distribution, % traffic per language, IT>40% EN>30% Others>10%
      Temporal Stability, Days since gateway update, >365
    commands[3]{command,purpose}:
      python probe.py --multilingual, Test all languages all LLMs
      python probe.py --multilingual --languages en, Test English only
      python probe.py --multilingual --llm claude, Test Claude only all languages
  
  query_examples[3]{language,query}:
    en, What is TONL format for LLM knowledge engineering?
    de, Wie optimiert man Inhalte für LLM-Retrieval?
    fr, Comment optimiser contenu pour citation LLM?
  
  roi{before,after,differentiator}:
    before: Visibility only for Italian queries (~5-10% of the LLM-mediated market); 1 language; 1 access point
    after: Visibility for IT/EN/DE/FR queries (~60-70% of the LLM-mediated market); 4 languages; 4 coordinated access points; Temporal Trust Matrix active
    differentiator: Not a "translated site in 3 languages" - a knowledge graph with 4 access protocols. Language is interface not content. Truth is invariant.
  
  links{docs,implementation,website}:
    docs: Available upon request or through non-indexed audit endpoints
    implementation: See methodology.tonl for implementation guidelines  
    website: https://chkcd.com
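
The enforcement rules for C1, C2 and C4 in the constraints table above lend themselves to an automated pre-deploy check. The following is a minimal sketch under stated assumptions, not the CHKCD implementation: the `Document` class, `parse_version` and `check_constraints` are hypothetical names, versions are assumed to be dotted integers such as "1.0", and C4 is read as "a pivot or gateway may only use entity_ids that already exist in the master".

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical in-memory view of one language version of a page."""
    layer: str                       # "MASTER", "PIVOT" or "GATEWAY"
    version: str                     # dotted string such as "1.0" (assumption)
    body: str = ""
    key_points: list[str] = field(default_factory=list)
    entity_ids: set[str] = field(default_factory=set)


def parse_version(text: str) -> tuple[int, ...]:
    """Turn "1.10" into (1, 10) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def check_constraints(master: Document, others: list[Document]) -> list[str]:
    """Return one message per violation of C1, C2 or C4."""
    problems: list[str] = []
    for doc in others:
        # C1: pivot.version <= master.version
        if doc.layer == "PIVOT" and parse_version(doc.version) > parse_version(master.version):
            problems.append(f"C1: pivot {doc.version} is ahead of master {master.version}")
        # C2: gateways carry at most 300 words and 5 key points
        if doc.layer == "GATEWAY":
            words = len(doc.body.split())
            if words > 300:
                problems.append(f"C2: gateway has {words} words (max 300)")
            if len(doc.key_points) > 5:
                problems.append(f"C2: gateway has {len(doc.key_points)} key points (max 5)")
        # C4 (as interpreted here): no entity_id that the master does not define
        unknown = doc.entity_ids - master.entity_ids
        if unknown:
            problems.append(f"C4: {doc.layer} uses unknown entity_ids {sorted(unknown)}")
    return problems


# Illustrative data only; none of these values come from the CHKCD site.
master = Document("MASTER", "1.2", entity_ids={"chkcd", "tonl"})
pivot = Document("PIVOT", "1.3", entity_ids={"chkcd"})
gateway = Document("GATEWAY", "1.0", body="Wort " * 301,
                   key_points=["p"] * 6, entity_ids={"chkcd", "extra"})

for problem in check_constraints(master, [pivot, gateway]):
    print(problem)
```

Returning a list of messages rather than raising on the first failure lets a deploy script report every violation in one pass.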
📖 About TONL Format

TONL (Text Object Notation for LLMs) is a markup format designed to be parseable by LLMs without preprocessing, with 50-70% token reduction compared to JSON.
Learn more: CHKCD Methodology | Standard Reference
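
The table syntax used in the file above (a `name[N]{columns}:` header followed by N comma-delimited rows) can be checked mechanically. The sketch below is inferred only from the examples on this page and not from a TONL specification, so treat the parsing rules as assumptions; `parse_table` and `SAMPLE` are hypothetical names. It verifies that the declared row count and the column count of each row match the header.

```python
import re

# Matches headers such as "commands[3]{command,purpose}:" (pattern inferred
# from this page, not taken from a TONL specification).
HEADER = re.compile(r"^(\w+)\[(\d+)\]\{([^}]*)\}:$")


def parse_table(lines, delimiter=","):
    """Parse one array-of-objects table into (name, list of row dicts)."""
    match = HEADER.match(lines[0].strip())
    if match is None:
        raise ValueError(f"not a table header: {lines[0]!r}")
    name = match.group(1)
    declared_rows = int(match.group(2))
    columns = [col.strip() for col in match.group(3).split(",")]

    rows = []
    for line in lines[1:]:
        if not line.strip():
            continue  # ignore blank separator lines
        cells = [cell.strip() for cell in line.split(delimiter)]
        if len(cells) != len(columns):
            raise ValueError(f"expected {len(columns)} cells, got {len(cells)}: {line!r}")
        rows.append(dict(zip(columns, cells)))

    if len(rows) != declared_rows:
        raise ValueError(f"{name}: header declares {declared_rows} rows, found {len(rows)}")
    return name, rows


# The commands table copied from the validation section of the file above.
SAMPLE = [
    "commands[3]{command,purpose}:",
    "  python probe.py --multilingual, Test all languages all LLMs",
    "  python probe.py --multilingual --languages en, Test English only",
    "  python probe.py --multilingual --llm claude, Test Claude only all languages",
]

name, rows = parse_table(SAMPLE)
for row in rows:
    print(f"{row['command']}  ->  {row['purpose']}")
```

Because a literal comma inside a cell would be read as a column break, a check like this also explains why the rows in the file avoid commas within cell text.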