AI & Data Use Policy
Version 1.0 · March 3, 2026
Purpose
This policy defines how CardWiki uses artificial intelligence and automated systems, how we handle data from external sources, and the commitments we make about responsible data practices.
1. How CardWiki Uses AI and Automation
| System | What It Does | How It's Used |
|---|---|---|
| CardWiki Scanner (powered by Tesseract.js) | Extracts text from card and slab label images | Pre-fills card identity fields during the add-card flow. Subject to a trust gate: low-confidence results are flagged, not accepted. |
| Deterministic matcher | Compares user-submitted card data against the catalog using exact field matching | Links user cards to catalog entries. Match confidence is always displayed. |
| Weighted scoring matcher (v2) | Scores market listings against catalog cards using multi-factor similarity | Routes market listings to card identities. Thresholds determine confirmed vs. needs-review vs. candidate. |
| Fingerprint generation | Creates deterministic hash signatures from card identity fields | Deduplication — prevents the same card from being added twice. |
| Normalization engine (planned) | Automates standardization of set names, variants, and manufacturer naming | Reduces manual taxonomy cleanup. All normalizations are reviewable. |
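
To make the fingerprint and weighted-scoring rows above more concrete, the following is a minimal illustrative sketch, not CardWiki's production code. The identity field names, the choice of SHA-256, the weights, and the two cut-off values are all assumptions made for this example. The policy itself only states that fingerprints are deterministic hashes of identity fields and that thresholds separate confirmed, needs-review, and candidate matches.

```python
import hashlib
import unicodedata

# Hypothetical identity fields; the policy does not list the real ones.
IDENTITY_FIELDS = ("manufacturer", "year", "set_name", "card_number", "variant")


def normalize(value: str) -> str:
    """Fold Unicode forms, case, and whitespace so equivalent spellings hash alike."""
    folded = unicodedata.normalize("NFKC", value).casefold()
    return " ".join(folded.split())


def fingerprint(card: dict) -> str:
    """Return a deterministic SHA-256 hex digest of the card's identity fields."""
    parts = [normalize(str(card.get(field, ""))) for field in IDENTITY_FIELDS]
    # The unit separator keeps ("ab", "c") distinct from ("a", "bc").
    return hashlib.sha256("\x1f".join(parts).encode("utf-8")).hexdigest()


# Hypothetical weights and cut-offs, chosen only for illustration.
WEIGHTS = {"set_name": 0.4, "card_number": 0.3, "year": 0.2, "variant": 0.1}
CONFIRMED_AT = 0.90
NEEDS_REVIEW_AT = 0.60


def route_listing(similarity: dict) -> tuple:
    """Combine per-field similarities (0.0 to 1.0) into a score and a routing label."""
    score = sum(weight * similarity.get(field, 0.0) for field, weight in WEIGHTS.items())
    if score >= CONFIRMED_AT:
        label = "confirmed"
    elif score >= NEEDS_REVIEW_AT:
        label = "needs-review"
    else:
        label = "candidate"
    return round(score, 4), label
```

Because the fingerprint is computed from normalized fields, two submissions that differ only in capitalization or spacing produce the same digest and can be treated as the same card. The routing label is derived from the score alone, so the score can always be displayed next to the decision it produced.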
2. AI Transparency Commitments
- Explainability: Automated systems never assert a card's identity without showing their reasoning. The match confidence and the match_explain field are always available.
- Human oversight: No AI output is published to the catalog without human review. OCR and matcher outputs are provisional until admin-approved.
- Source attribution: Every AI-assisted field shows its source (OCR, matcher, bulk import) so users know the provenance.
- Contestability: Users can flag and dispute any AI-generated match or identity through the correction flow.
- Confidence thresholds: Fields whose confidence score falls below the applicable threshold require manual entry. Thresholds are documented in internal product principles.
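
As an illustration of how these commitments could fit together, the sketch below gates a single automated field suggestion. It is a hedged example only: the `FieldSuggestion` type, the `gate` function, the status labels, and the 0.85 threshold are hypothetical and are not taken from CardWiki's codebase or its internal product principles.

```python
from dataclasses import dataclass

# Hypothetical threshold; the real values live in internal product principles.
OCR_FIELD_THRESHOLD = 0.85


@dataclass(frozen=True)
class FieldSuggestion:
    name: str          # e.g. "card_number"
    value: str         # text proposed by the automated system
    source: str        # provenance label: "OCR", "matcher", or "bulk import"
    confidence: float  # 0.0 to 1.0
    explain: str       # human-readable reasoning shown to the user


def gate(suggestion: FieldSuggestion, threshold: float = OCR_FIELD_THRESHOLD) -> dict:
    """Decide whether a suggestion may pre-fill a field or must be entered manually."""
    passed = suggestion.confidence >= threshold
    return {
        "field": suggestion.name,
        # Below threshold the value is withheld so the user types it in.
        "prefill": suggestion.value if passed else None,
        "status": "provisional" if passed else "manual-entry-required",
        "source": suggestion.source,
        "confidence": suggestion.confidence,
        "explain": suggestion.explain,
        # Nothing is published by the gate; admin approval is a separate step.
        "published": False,
    }
```

In this sketch every result carries its source, confidence, and explanation regardless of outcome, and even a passing suggestion stays provisional and unpublished, which mirrors the human-oversight and source-attribution commitments listed above.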
3. External Data Source Rules
3.1 Authorized Sources
CardWiki's catalog data may only come from these authorized source categories:
- Official manufacturer pages, checklists, product databases, and press releases
- Archived official sources (Wayback Machine captures of official pages, official PDFs)
- Grading company certification databases (public lookup tools)
- Admin-reviewed and approved community contributions
- Automated extraction from authorized sources, subject to trust gating
3.2 Prohibited Data Practices
- Scraping marketplaces (eBay, COMC, TCGplayer, Fanatics) for identity or pricing data
- Training AI models on partner-restricted or terms-of-service-protected datasets
- Reproducing copyrighted checklist content without authorization
- Bulk-extracting user data from third-party platforms
- Using card images from external databases without rights clearance
- Presenting third-party data as CardWiki-originated data
3.3 Partner Data Handling
When CardWiki partners with external services (marketplaces, grading companies, data providers), we commit to:
- Documenting the data rights agreement for each partnership
- Respecting partner-specific retention and deletion requirements
- Not re-distributing partner data beyond agreed scope
- Complying with API rate limits and terms of service
- Maintaining an internal data-rights register tracking all external data dependencies
4. User Data & AI
CardWiki does not use user collection data to train AI models. User-uploaded card images are used only for OCR processing (to identify the card being added) and, if the user opts in to mining, for catalog enrichment after admin approval. User collection data is never shared with third parties for AI training purposes.
5. Affiliate & Routing Transparency
When CardWiki routes users to third-party marketplaces or services (the Scout system), these links may include affiliate parameters. CardWiki will always disclose affiliate relationships where required by law. Routing decisions are based on user relevance and data availability, not on affiliate commission rates.
6. Policy Updates
This policy will be reviewed quarterly. Material changes will be communicated to users via email or in-app notification. The current version is always available on this page.