# World Spider Catalog Bot

> **Project:** Spider taxonomy search bot with Discord integration and web interface
> **Stack:** Python (Discord.py), TypeScript (React 19 + Bun/Hono), MongoDB, SQLite
> **Architecture:** Multi-service async system with bidirectional sync

---

## 📋 Quick Reference

| Component   | Technology      | Entry Point                | Port        |
| ----------- | --------------- | -------------------------- | ----------- |
| Discord Bot | Python 3.11+    | `main.py:19`               | 8080 (sync) |
| Web API     | Bun + Hono      | `web/server/src/server.ts` | 8000        |
| Web UI      | React 19 + Vite | `web/client/src/App.tsx`   | 3000        |
| Cache Layer | MongoDB 8       | -                          | 27017       |
| User Data   | SQLite          | `database/manager.py:108`  | -           |

---

## 🏗️ System Architecture

### Data Flow Overview

```mermaid
sequenceDiagram
    autonumber
    actor User as User (Discord/Web)
    participant CL as Commands Layer
    participant CQ as Command Queue
    participant SH as Search Handler
    participant CSV as CSV in-memory lookup
    participant AM as API Manager
    participant API as External APIs
    participant MDB as MongoDB Cache
    participant EC as Embed Creator

    User->>CL: Initiate Request
    CL->>CQ: Add to Queue (rate limiting)
    CQ->>SH: Process Request
    SH->>CSV: Query Species Data
    CSV-->>SH: Return Results
    alt Data not in CSV
        SH->>AM: Request External Data
        AM->>API: Call WSC, iNat, GBIF, CITES
        API-->>AM: Return Data
        AM->>MDB: Cache Response (24h TTL)
    else Data in Cache
        AM->>MDB: Retrieve Cached Data
    end
    AM-->>SH: Return Enriched Data
    SH->>EC: Format Data
    EC-->>User: Receive Rich Data
```

### Service Dependencies

- **wsc-bot** (Python) → MongoDB (cache), SQLite (user data), External APIs
- **wsc-api** (Bun) → MongoDB (species index, cache)
- **wsc-client** (React) → wsc-api (REST), Better-Auth (OAuth)

---

## 📂 Core Files & Key Line Numbers

### Bot Initialization

#### `main.py` - Entry Point

- **Line 19-30**: Main function - validates tokens, checks CSV
- **Line 31-33**: Discord intents configuration
- **Line 35-37**: Bot instantiation with `!` prefix
- **Line 39**: Command registration
- **Line 42**: Bot run with error handling

#### `core/bot.py` - WSCBot Class

- **Line 83-94**: WSCBot class definition & ownership check
- **Line 86-89**: Command queue & HTTP session initialization
- **Line 91-94**: Custom owner check (supports additional admin IDs)

##### Setup Hook (`setup_hook` - Line 96-147)

- **Line 100-107**: aiohttp session creation (50 conn pool, 20s timeout)
- **Line 108**: Database initialization
- **Line 110-113**: Sync server startup (port 8080)
- **Line 115-120**: MongoDB cache initialization
- **Line 122-125**: CSV data loading
- **Line 130**: Daily update task creation
- **Line 133**: Species of the Day task creation
- **Line 137**: Cache warming task
- **Line 143-147**: Slash command sync

##### Background Tasks

**Daily CSV Update** (`_daily_update_task` - Line 149-184)

- **Line 160**: Check if update needed today
- **Line 161-162**: Perform update
- **Line 164-170**: Reload CSV on success
- **Line 172**: Sync species index to MongoDB
- **Line 179**: Wait 24 hours between checks

**Species of the Day** (`_species_of_the_day_task` - Line 211-327)

- **Line 226-234**: Get configs due for posting at current hour
- **Line 242-258**: Resolve Discord channel
- **Line 261-263**: Select species for config
- **Line 266-270**: Create SOTD embed
- **Line 272-286**: Add favorite button view
- **Line 288-295**: Post to channel
- **Line 318-322**: Calculate sleep until next hour

**MongoDB Species Index Sync** (`_sync_species_index_if_needed` - Line 186-209)

- **Line 190**: Check if sync needed
- **Line 192**: Perform sync from CSV
- **Line 194-201**: Log statistics (upserts, deletes, aliases)

##### Message Handlers

**Temperature Conversion** (`on_message` - Line 408-470)

- **Line 416-425**: Pattern `!25c` → Celsius/Fahrenheit conversion
- **Line 428-432**: Multi-measurement handler `!m 2m x 3m`
- **Line 442-451**: Feet+inches pattern `!5'8"`
- **Line 462-469**: Single measurement `!10ft`

**Measurement Conversion** (`_handle_measurement_conversion` - Line 473-537)

- **Line 478-489**: Normalize input (×, *, ', ", ft+in combos)
- **Line 490**: Split by 'x' or spaces
- **Line 498**: Unit conversion table
- **Line 501-521**: Parse each measurement part
- **Line 524-528**: Format output for all units

##### Command Queue (`CommandQueue` class - Line 39-81)

- **Line 42-44**: Queue and active command tracking
- **Line 46-57**: `execute_command` - adds to guild queue
- **Line 59-80**: `_process_queue` - sequential execution with 0.1s delays

---

### Configuration & Environment

#### `core/config.py` - Configuration Management

- **Line 11-14**: Core API tokens (Discord, WSC, Species+, API Ninjas)
- **Line 17-21**: Web app base URL configuration
- **Line 24-26**: File paths (CSV, DB, logs)
- **Line 29-34**: MongoDB configuration
- **Line 36**: Log level from environment
- **Line 39-58**: Additional owner IDs parsing
- **Line 61-87**: Logging setup with fallbacks
- **Line 93-101**: Moltly sync configuration
- **Line 104**: Web API URL for outbound sync

---

### API Integration Layer

#### `api/manager.py` - External API Management

- **Line 1-18**: Imports & cache configuration
- **Line 27-32**: Weather cache TTL & versioning
- **Line 39-70**: Location code expansion (ISO 3166-1, ISO 3166-2, US states)

##### Key Functions (see file for full implementation)

**Location Normalization**

- **`_expand_location_code`** (Line 73-~): ISO code → full name
  - Handles ISO 3166-1 (US, MX, ES)
  - Handles ISO 3166-2 subdivisions (US-TX, MX-OAX)
  - US state codes (TX → Texas, United States)

**Species Data APIs** (`APIManager` class)

- **`get_wsc_data`**: World Spider Catalog API lookup
- **`find_inaturalist_taxon`**: iNaturalist taxon ID search
- **`get_inat_photos_for_species`**: Photo retrieval
- **`get_inat_observations`**: Observation records with location

**GBIF Integration**

- **`get_gbif_data`**: Species occurrence data
- **`get_coordinates_from_code`**: Location code → coordinates
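
The location-code expansion listed under **Location Normalization** can be sketched roughly as follows. This is a minimal illustration, not the code in `api/manager.py`: the lookup tables, the function name `expand_location_code` (without the leading underscore), the country-before-state precedence, and the pass-through of unknown codes are all assumptions.

```python
# Hypothetical lookup tables; the real mappings live in api/manager.py.
COUNTRIES = {"US": "United States", "MX": "Mexico", "ES": "Spain"}
SUBDIVISIONS = {"US-TX": "Texas", "MX-OAX": "Oaxaca"}
US_STATES = {"TX": "Texas", "CA": "California"}


def expand_location_code(code: str) -> str:
    """Expand an ISO-style location code; return the input unchanged if unknown."""
    normalized = code.strip().upper()

    # ISO 3166-2 subdivision, e.g. "US-TX" -> "Texas, United States"
    if "-" in normalized:
        country_code = normalized.partition("-")[0]
        region = SUBDIVISIONS.get(normalized)
        country = COUNTRIES.get(country_code)
        if region and country:
            return f"{region}, {country}"
        return code

    # ISO 3166-1 country code, e.g. "MX" -> "Mexico"
    # (assumed to take precedence over bare US state codes)
    if normalized in COUNTRIES:
        return COUNTRIES[normalized]

    # Bare US state code, e.g. "TX" -> "Texas, United States"
    if normalized in US_STATES:
        return f"{US_STATES[normalized]}, United States"

    return code
```

Returning the original string for unknown codes keeps free-text locations such as city names usable by downstream lookups.
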

**Weather APIs**

- **`get_weather_data`**: Historical & current weather (30-year archive)
- **`get_weather_snapshot`**: Point-in-time weather
- Caching: 3-day soft TTL for recent, permanent for species-level

**CITES/Species+**

- **`get_speciesplus_cites_status`**: CITES appendix & legislation

---

### Data Management Layer

#### `utils/data.py` - In-Memory CSV Management

- **Line 41-55**: Global cache & indices
  - `species_data_cache`: All CSV rows
  - `_idx_by_lsid`: Quick LSID lookup
  - `_idx_by_genus`: Species grouped by genus
  - `_set_genera`, `_set_families`: Fast existence checks
  - `_count_by_family`, `_count_by_family_genus`: Taxon counts
  - `_canonical_species`: Normalized species names
  - `species_aliases`: User-defined aliases

##### Alias Management

- **Line 54-63**: Alias file paths & fallbacks
- **Line 66-85**: `_resolve_alias_source` - copies seed file if needed
- **Line 88-~**: `load_species_aliases` - loads JSON aliases

##### DataManager Class Functions

- **`load_csv_data`**: Reads CSV, rebuilds all indices
- **`get_csv_header`**: Returns column names
- **`find_species_by_name`**: Exact or fuzzy match
- **`find_species_by_lsid`**: Direct LSID lookup
- **`find_species_by_genus`**: All species in genus
- **`find_genus_by_name`**: Genus-level lookup
- **`find_family_by_name`**: Family-level lookup

##### FuzzySearchManager Class

- **`search`**: Fuzzy matching via rapidfuzz or difflib fallback
- Returns sorted results with match scores

##### ConversionUtils Class

- **`celsius_to_fahrenheit`**, **`fahrenheit_to_celsius`**
- **`read_file_content`**: Reads article/help files
- **`list_articles`**: Enumerates available articles

---

### Database Layer

#### `database/manager.py` - SQLite Operations

- **Line 108**: `init_database` - creates tables & schema

##### Database Schema

**Favorites Table**

```sql
CREATE TABLE favorites (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL,
    species_canonical TEXT NOT NULL,
    species_lsid TEXT,
    added_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
```

**Molt Logs Table**

```sql
CREATE TABLE molt_logs (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL,
    species TEXT NOT NULL,
    specimen_name TEXT,
    date TIMESTAMP NOT NULL,
    stage TEXT,  -- 'pre', 'molt', 'post'
    notes TEXT,
    species_lsid TEXT,
    added_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
```

**SOTD Configs Table**

```sql
CREATE TABLE sotd_configs (
    config_id INTEGER PRIMARY KEY AUTOINCREMENT,
    guild_id INTEGER NOT NULL,
    channel_id INTEGER NOT NULL,
    name TEXT,
    priority_families TEXT,  -- comma-separated
    min_photos INTEGER DEFAULT 3,
    history_days INTEGER DEFAULT 90,
    posting_hour INTEGER DEFAULT 8,
    enabled BOOLEAN DEFAULT 1
);
```

**Blacklist Table**

```sql
CREATE TABLE blacklist (
    id INTEGER PRIMARY KEY,
    entity_id INTEGER NOT NULL,
    entity_type TEXT,  -- 'user' or 'guild'
    reason TEXT,
    added_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
```

##### Key Methods

**User Collections**

- **`add_favorite_async`**: Add to favorites
- **`get_user_favorites_async`**: Retrieve favorites
- **`remove_favorite_async`**: Remove favorite
- **`get_wishlist_async`**: User wishlist
- **`add_molt_log_async`**: Log molt event
- **`get_molt_history_async`**: Molt records
- **`add_note_async`**: Personal note
- **`get_user_notes_async`**: Retrieve notes

**SOTD Management**

- **`get_sotd_configs_async`**: All configs for guild
- **`create_sotd_config_async`**: Create new config
- **`update_sotd_config_async`**: Update settings
- **`get_recently_featured_species_for_config_async`**: Featured history

**Security**

- **`add_to_blacklist_async`**: Blacklist user/guild
- **`is_blacklisted_async`**: Check blacklist
- **`remove_from_blacklist_async`**: Remove from blacklist

---

### MongoDB Cache Layer

#### `utils/cache.py` - API Response Caching

- **`APICache` class**: Persistent API response cache

##### Collections

**`api_cache`**

```json
{
  "_id": ObjectId,
  "key": "unique_cache_key",
  "value": {...},
  "cachedAt": ISODate,
  "refreshAfter": ISODate,  // null = never refresh
  "lastCheckedAt": ISODate,
  "value_hash": "sha256"
}
```

**`api_calls`** (permanent archive)

```json
{
  "ts": ISODate,
  "method": "GET",
  "url": "https://...",
  "urlHash": "sha256",
  "request": {"headers": {}, "params": {}},
  "response_status": 200,
  "response": {"json": {}, "size": 1234},
  "durationMs": 123,
  "cacheKey": "...",
  "source": "wsc_api"
}
```

##### Key Methods

- **`init`**: Initialize MongoDB connection
- **`get`**: Returns cached value or None if expired
- **`set`**: Stores value with optional TTL
- **`delete`**: Removes cache entry
- **`clear`**: Clears all cache

##### Soft TTL Semantics

- Data never automatically deleted
- `get()` returns None if `refreshAfter` is past
- `set()` only updates if value changed
- Permanent historical archive

---

### Species Indexing

#### `utils/species_indexer.py` - MongoDB Sync

- **`SpeciesIndexer` class**: CSV → MongoDB sync

##### Collections

**`species_index`**

```json
{
  "_id": "lsid_or_canonical_lower",
  "lsid": "urn:lsid:...",
  "canonical": "Genus species",
  "canonical_lower": "genus species",
  "genus": "Genus",
  "genus_lower": "genus",
  "species": "species",
  "species_lower": "species",
  "family": "Family",
  "family_lower": "family",
  "author": "Author, Year",
  "year": 2020,
  "distribution": "Location1, Location2",
  "aliases": ["alias1", "alias2"],
  "tokens": ["genus", "species", "family"],
  "search_blob": "searchable fulltext",
  "synced_at": ISODate,
  "row": {...}  // full CSV row
}
```

**`species_alias_index`**

```json
{
  "_id": "alias_lower",
  "alias": "Common Name",
  "alias_lower": "common name",
  "canonical": "Genus species",
  "lsid": "urn:lsid:...",
  "synced_at": ISODate
}
```

##### Key Methods

- **`sync_from_csv`**: Bulk upserts CSV rows (500 batch size)
  - Creates indexes on canonical_lower, genus_lower, family_lower, tokens
  - Returns stats (upserts, deletions, alias counts)
- **`should_sync`**: Checks if stats indicate changes needed
- Cleanup: Deletes old entries not in current sync

---

### CSV Update System

#### `utils/csv_updater.py` - Daily Updates

- **`CSVUpdater` class**: Manages CSV updates & changelogs

##### Key Methods

**Update Management**

- **`get_latest_csv_url`**: Fetches URL from WSC (tries 7 days back)
- **`perform_update`**: Downloads, diffs, stores new CSV
- **`should_update_today`**: Checks if already attempted
- **`mark_update_attempted`**: Tracks update state

**Changelog Generation**

- **`generate_changelog`**: Diffs two CSV files
  - Categories: added, removed, modified, synonymy_changes
  - Human-readable summary

**Data Management**

- **`get_all_changelogs`**: Lists all changelogs
- **`get_changelog_by_id`**: Retrieves specific changelog
- **`get_changelog_range`**: Multi-changelog retrieval

##### File Structure

```
data/
├── changelogs/              # JSON summaries
│   └── changelog_20250115.json
├── diffs/                   # Detailed JSON diffs
│   └── diff_2025-01-15_12-34-56.json
└── last_update.json         # Tracking metadata
```

---

### Species of the Day

#### `utils/sotd_manager.py` - Automated Species Featuring

- **`SpeciesOfTheDayManager` class**: SOTD selection logic

##### Key Methods

**Species Selection**

- **`select_species_of_the_day_for_config`**: Per-config selection
  - Priority families (Theraphosidae, Salticidae, Lycosidae)
  - History tracking (30-180 days)
  - Minimum photo count (1-10)
  - Weighted random selection

**Configuration**

- **`get_configs_due_for_posting_async`**: Configs due at current hour
- **`_config_log_tag`**: Logging helper

##### Implementation Details

- **Family Filtering**: Wildcard ('*', 'all', empty) = all families
- **Recently Featured**: Excludes species posted within history_days
- **Photo Validation**: Checks iNaturalist for min_photos
- **Fallback**: If no priority family species, tries all families

---

### Sync Server

#### `core/sync.py` - Bidirectional Synchronization

- **Purpose**: Webhook integration with Moltly & web API

##### Inbound Sync (aiohttp server on port 8080)

- **POST `/sync/molt`**: Receive molt events from Moltly
- **POST `/sync/notes`**: Receive note sync events
- **POST `/sync/dm`**: Forward messages to bot admins

##### Outbound Notifications

**To Moltly**

- **`notify_moltly_of_molt`**: POST molt event
- **`notify_moltly_molt_update`**: Molt edit notification
- **`notify_moltly_molt_delete`**: Molt deletion
- **`notify_moltly_note_delete`**: Note deletion

**To Web API**

- **`notify_web_of_favorite_add`**: Sync favorites
- **`notify_web_of_favorite_remove`**: Remove favorites
- **`notify_web_of_history`**: Update search history

##### Security

- All endpoints require `WSCA_SYNC_SECRET` header
- `start_sync_server` function initializes aiohttp app

---

### Embed Creator

#### `embeds/creator.py` - Discord Embed Builders

- **`EmbedCreator` class**: Consistent embed styling

##### Base Embeds

- **`create_base_embed`**: Template for all embeds
- **`create_error_embed`**: Red embeds (errors)
- **`create_success_embed`**: Green embeds (success)
- **`create_processing_embed`**: Orange (loading)

##### Species-Specific

- **`create_species_embed`**: Full species card
  - Taxonomy (genus, species, family, author)
  - LSID and distribution
  - iNaturalist photos & observation count
  - Weather at species location
  - Observation locations on map
  - Similar species

##### Special Content

- **`create_cites_embed`**: CITES/Species+ status
- **`create_weather_embed`**: Weather details
- **`create_species_of_the_day_embed`**: SOTD card with facts
- **`create_image_embed`**: Image gallery
- **`create_similar_species_embed`**: Comparison embeds
- **`create_changelog_embed`**: CSV update summaries

##### Utilities

- **`_clean_reference_text`**: HTML → text formatting
- **`_paginate_fields`**: Splits long field lists

---

### Command Handlers

#### `commands/handlers.py` - Core Search Logic

- **`SearchHandler` class**: Species/genus/family queries

##### Methods

- **`search_species`**: Species lookup with fuzzy matching
- **`search_genus`**: Genus-level info
- **`search_family`**: Family-level stats
- **`search_updates`**: CSV updates search
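
The fuzzy lookup behind `search_species` can be illustrated with the standard library alone. The sketch below shows only a `difflib`-based path (the rapidfuzz matcher named elsewhere in this document is not used here); the function name `fuzzy_search`, the 0.7 threshold, and the result limit are assumptions made for illustration, not values taken from the bot.

```python
from difflib import SequenceMatcher


def fuzzy_search(
    query: str,
    names: list[str],
    threshold: float = 0.7,  # assumed cutoff, not the bot's configured value
    limit: int = 5,
) -> list[tuple[str, float]]:
    """Return (name, score) pairs scoring at least `threshold`, best match first."""
    normalized_query = query.strip().lower()
    scored = []
    for name in names:
        # ratio() is a similarity score between 0.0 and 1.0
        score = SequenceMatcher(None, normalized_query, name.lower()).ratio()
        if score >= threshold:
            scored.append((name, round(score, 3)))
    # Highest score first, then cut to the requested number of results
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:limit]


if __name__ == "__main__":
    species = ["Brachypelma hamorii", "Brachypelma smithi", "Phidippus audax"]
    for name, score in fuzzy_search("brachypelma hamori", species):
        print(f"{name}: {score}")
```

Lower-casing both sides keeps the comparison case-insensitive, so a misspelled, all-lowercase query can still surface the canonical `Genus species` name.
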

##### Features

- Fuzzy matching (configurable threshold)
- API enrichment (WSC, iNat, GBIF, CITES)
- Image fetching & composition
- Weather data integration
- Similar species recommendations
- Observation mapping

##### Discord UI Views

- **`FavoriteButtonView`**: Toggle favorite button
- **`ImageView`**: Image gallery navigation
- **`BulkSpeciesView`**: Multi-species selection
- **`SimpleListPaginatorView`**: Generic list pagination
- **`ArticlesHandler`**: Article browsing

##### Helper Functions

- **`build_web_app_url`**: Creates shareable web link
- **`create_web_link_view`**: Wraps URL in button view

---

### Prefix Commands

#### `commands/bot_commands.py` - ! Commands

**Search Commands**

- `!search [species|genus|family|updates] <query>` - Search spider data
- `!fuzzy <query>` - Fuzzy search
- `!random [species|genus|family] [filter]` - Random spider

**Media Commands**

- `!image <Genus species>` - Fetch photos
- `!recent <Genus species>` - iNaturalist observations
- `!map <Genus species>` - Distribution map

**Data Commands**

- `!weather <location>` - Weather lookup
- `!c / !convert` - Temperature conversion
- `!m / !measure <values>` - Unit conversion
- `!stats [families|genera|user]` - Statistics

**User Collections**

- `!fav [add|list|remove]` - Favorites
- `!wishlist [add|list|remove]` - Wishlist
- `!molt [add|list|remove]` - Molt logging
- `!notes [list|view|share|alias|search]` - Personal notes

**Admin/Meta**

- `!cl [list|<number>]` - Changelog
- `!updatecsv` - Force CSV update (owner)
- `!syncspecies` - Sync to MongoDB (owner)
- `!article [category]` - Browse articles

---

### Slash Commands

#### `commands/slash_commands.py` - / Commands

**Mirrors of Prefix Commands**

- `/species <query>` - Species search
- `/genus <query>` - Genus search
- `/family <query>` - Family search
- `/image <species>` - Image lookup
- `/recent <species>` - Observations
- `/map <species>` - Distribution map
- `/weather <location>` - Weather
- `/updates <query>` - CSV updates
- `/random [type] [filter]` - Random spider
- `/stats [type]` - Statistics
- `/fav [action] [species]` - Favorites
- `/wishlist [action] [species]` - Wishlist
- `/molt [action] [...]` - Molt tracking
- `/notes [action] [...]` - Notes
- `/sotd [action]` - Species of the Day management
- `/settings feature_channel` - Guild settings

---

### Changelog Handler

#### `commands/changelog.py` - Changelog UI

- **`ChangelogHandler` class**: Changelog display

##### Methods

- **`list_changelogs`**: Paginated changelog list
- **`show_changelog`**: Specific changelog details
- **`show_changelog_range`**: Multi-changelog comparison

##### Features

- Pagination for large lists
- Statistics formatting (new species, synonyms)
- Markdown for Discord

---

## 🌐 Web Stack

[Spiders Archive](https://spiders.invert.info)

### Web Server

#### `web/server/src/server.ts` - Bun + Hono API

- **Framework**: Hono (TypeScript for Bun)
- **Database**: MongoDB

##### Key Endpoints

**Species Search**

- `POST /api/search` - Full-text species search
  - Modes: species, genus, family, distribution, location, doi
  - Fuzzy matching & alias resolution
  - Paginated results

**Type Hints (Autocomplete)**

- `GET /api/hints?mode=<mode>&q=<query>`
  - Modes: distribution, location, reference, doi

**User Data Sync**

- `GET /api/user/profile` - Discord OAuth profile
- `GET /api/user/favorites` - User favorites
- `POST /api/user/favorites` - Add favorite
- `DELETE /api/user/favorites/<lsid>` - Remove
- `GET /api/user/history` - Search history
- `POST /api/user/history` - Log search

**Species Details**

- `GET /api/species/<query>` - Full species data
  - Enriched with WSC, iNat, GBIF, weather
  - Taxonomy, images, observations, similar species

**Changelog**

- `GET /api/changelogs` - List changelogs
- `GET /api/changelogs/<id>` - Single changelog
- `GET /api/changelogs/compare` - Compare two

**Authentication**

- OAuth via Better-Auth with Discord provider
- JWT token in Authorization header
- Session management & refresh

---

### Web Client

#### `web/client/src/App.tsx` - React 19 SPA

- **Framework**: React 19 + Vite + TypeScript
- **Maps**: Leaflet.js
- **Auth**: Better-Auth (Discord OAuth)

##### Key Components

**Search Interface** (`App.tsx`)

- Search bar with autocomplete
- Advanced filters (genus, family, distribution, location)
- Paginated results or single detail view
- Species detail card:
  - Taxonomy & LSID
  - Distribution map (Leaflet)
  - iNaturalist photos & observations
  - Weather at observation locations
  - Similar species
  - External resource links

**User Features** (`UserCollectionsContext.tsx`)

- **Favorites**: Save species
- **Wishlist**: Species to find/keep
- **Molt History**: Log molts
- **Pin Items**: Quick access
- **Search History**: Past searches

**Pages**

- **Species Detail**: Full info with media
- **Changelog List** (`ChangelogListPage.tsx`): Browse updates
- **Single Changelog** (`ChangelogPage.tsx`): Detailed view
- **Changelog Compare** (`ChangelogComparePage.tsx`): Side-by-side
- **Stats Page** (`StatsPage.tsx`): User achievements
- **Quick Access Panel** (`QuickAccessPanel.tsx`): Pinned items

**UI Components**

- **UserMenu** (`UserMenu.tsx`): Profile dropdown
- **ThemeContext** (`ThemeContext.tsx`): Light/dark theme
- **EnrichedSpeciesCard** (`EnrichedSpeciesCard.tsx`): Enhanced species card
- **ErrorBoundary**: Error handling

**Authentication** (`AuthContext.tsx`)

- Discord OAuth integration
- Profile fetching
- Token refresh

##### Styling

- **App.css**: Comprehensive responsive design (93 KB)
- **variables.css**: CSS custom properties
- **index.css**: Base styles (4.9 KB)
- **StatsPage.css**: Stats styling (8.2 KB)

---

## 🔧 Development & Deployment

### Local Development

**Python Bot**

```bash
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
python main.py
```

**Web Server**

```bash
cd web/server
bun install
bun run dev  # Hot reload
```

**Web Client**

```bash
cd web/client
npm install
npm run dev  # Vite dev server
```

### Docker Deployment

**Services** (`docker-compose.yml`)

1. **mongo** - MongoDB 8 with auth
2. **wsc-bot** - Python Discord bot
3. **wsc-api** - Bun/Hono web server

**Commands**

```bash
docker compose up -d      # Start all
docker compose down       # Stop
docker compose logs -f    # Follow logs
make build                # Rebuild
make restart              # Restart bot
```

---

## 🔐 Security & Performance

### Rate Limiting

- **Command Queue** (`core/bot.py:39`): Per-guild queuing, 0.1s delays
- **API Caching**: 24h TTL, reduces external load
- **MongoDB Archive**: Permanent API call history

### Blacklisting

- User/guild blacklist in SQLite
- `check_blacklist()` decorator on all commands
- Owner can manage via admin commands

### Authentication

- **Discord OAuth**: Better-Auth integration
- **Session Tokens**: MongoDB storage
- **Webhook Security**: `WSCA_SYNC_SECRET` header validation

### Performance Optimizations

- **In-Memory CSV**: Fast lookups with indices
- **Rapidfuzz**: C-based fuzzy matching
- **MongoDB Indexing**: Unique on LSID, canonical, genus, family
- **Async Throughout**: aiohttp, motor, aiosqlite
- **Batch Operations**: 500-doc batches for MongoDB

---

## 📊 Data Structures

### CSV Structure (`species.csv`)

```
LSID, Genus, Species, Subspecies, Family, Author, Year, Distribution, Status, Original_combination, Reference, Notes
```

### MongoDB Collections

- **`api_cache`**: API response cache with soft TTL
- **`api_calls`**: Permanent API call archive
- **`species_index`**: Full species data from CSV
- **`species_alias_index`**: User-defined aliases
- **`user_favorites`**: User favorite species
- **`user_wishlists`**: User wishlists
- **`user_molt_logs`**: Molt records
- **`user_notes`**: Personal notes
- **`user_history`**: Search history
- **`discord_users`**: OAuth user profiles

### SQLite Tables

- **`favorites`**: User favorites
- **`wishlists`**: User wishlists
- **`molt_logs`**: Molt tracking
- **`notes`**: Personal notes
- **`sotd_configs`**: Species of the Day configs
- **`blacklist`**: User/guild blacklist
- **`species_observations`**: Observation counts
- **`user_stats`**: User statistics

---

## 🔗 External APIs

### World Spider Catalog (WSC)

- **Endpoint**: `https://wsc.nmbe.ch/api/`
- **Auth**: WSC_API_KEY
- **Data**: Species taxonomy, references, LSID

### WSC Updates

- **Endpoint**: `https://wsc.nmbe.ch/api/updates`
- **Auth**: WSC_API_KEY
- **Data**: LSIDs of new or changed taxa. If no date parameter is provided, the response covers taxa that are new or changed in the last six months (default)

### iNaturalist

- **Endpoint**: `https://api.inaturalist.org/v1/`
- **Auth**: None (public API)
- **Data**: Photos, observations, taxon IDs

### GBIF

- **Endpoint**: `https://api.gbif.org/v1/`
- **Auth**: None
- **Data**: Species occurrences, distribution

### Species+ (CITES)

- **Endpoint**: `https://api.speciesplus.net/api/v1/`
- **Auth**: SPECIESPLUS_TOKEN
- **Data**: CITES appendix, legislation

### Open-Meteo

- **Endpoint**: `https://archive-api.open-meteo.com/v1/`
- **Auth**: None
- **Data**: Historical weather (30-year archive)

### API Ninjas

- **Endpoint**: `https://api.api-ninjas.com/v1/`
- **Auth**: API_NINJAS_KEY
- **Data**: Fun facts

---

## 📝 Environment Variables

**Essential**

```env
DISCORD_TOKEN=<bot_token>
WSC_API_KEY=<api_key>
```

**Optional**

```env
MONGO_URL=mongodb://mongo:27017
MONGO_DB=wsc_bot
CACHE_TTL_HOURS=24
WEB_APP_BASE_URL=https://spiders.invert.info
WEB_API_URL=http://wsc-api:8000
SPECIESPLUS_TOKEN=<token>
API_NINJAS_KEY=<key>
MOLTLY_SYNC_URL=https://moltly.xyz/api/sync/wsca
MOLTLY_SYNC_SECRET=<secret>
WSCA_SYNC_SECRET=<secret>
WSCA_SYNC_PORT=8080
BOT_ADMIN_IDS=
LOG_LEVEL=INFO
```

---

## 🧪 Testing & Quality

### Linting

```bash
# Python
ruff check .
ruff format .

# TypeScript (Server)
cd web/server && bun run lint

# TypeScript (Client)
cd web/client && npm run lint
```

### Type Checking

```bash
cd web/server && bun run check
```

---

## 📚 Key Architectural Decisions

### Modular Design

- Separation of concerns: commands, APIs, data, embeds
- Clear responsibility boundaries
- Easy to test individual components

### Async-First

- All I/O operations non-blocking
- Hundreds of concurrent users supported
- Long-running tasks don't freeze bot

### Multi-Tier Caching

- **In-memory CSV**: Instant lookups
- **MongoDB soft TTL**: Persistent API cache
- **API call archive**: Debugging & analytics

### Distributed Sync

- Webhook-based Moltly integration
- Bidirectional molt/note updates
- User data synced between Discord & web

### Flexible Search

- CSV exact/fuzzy matching
- MongoDB full-text for web
- User-defined alias system

### Automated SOTD

- Per-guild configurations
- Smart selection (priority families, min photos)
- Educational content integration

---

## 🗺️ Project File Organization

| Responsibility | Files |
|---|---|
| **Configuration** | `core/config.py`, `.env` |
| **Bot Lifecycle** | `main.py`, `core/bot.py` |
| **Commands** | `commands/bot_commands.py`, `commands/slash_commands.py`, `commands/handlers.py`, `commands/changelog.py` |
| **Data Layer** | `database/manager.py`, `utils/data.py`, `utils/csv_updater.py` |
| **Caching** | `utils/cache.py`, `utils/species_indexer.py` |
| **API Integration** | `api/manager.py` |
| **Display** | `embeds/creator.py` |
| **Automation** | `utils/sotd_manager.py`, `utils/facts_manager.py` |
| **Synchronization** | `core/sync.py` |
| **Web Frontend** | `web/client/src/App.tsx`, `web/client/src/*.tsx` |
| **Web Backend** | `web/server/src/server.ts` |
| **Scripts** | `scripts/*.py` |

---

## 🚀 Common Tasks

### Force CSV Update

```bash
# Via Discord (owner only)
!updatecsv

# Via script
python -m scripts.sync_species_index
```

### Sync Species Index

```bash
# Via Discord (owner only)
!syncspecies

# Via script
python -m scripts.sync_species_index
```

### Backup User Data

```bash
bash scripts/backup_bot_data.sh
```

### View Logs

```bash
# Docker
docker compose logs -f wsc-bot

# Local
tail -f bot_logs/bot.log
```

---

## 🎯 Feature Highlights

### Species Search

- 50,000+ spider species from WSC
- Fuzzy matching with rapidfuzz
- Alias support for common names
- Rich embeds with images, weather, observations

### Species of the Day

- Automated daily posting
- Per-guild configuration
- Priority family filtering
- Educational content integration

### User Collections

- Favorites, wishlists, molt logs
- Synced between Discord & web
- Search history tracking
- Personal notes with sharing

### CSV Update System

- Daily automatic updates
- Detailed changelogs
- Diff generation
- MongoDB index sync

### Weather Integration

- 30-year historical archive
- Location-based weather
- Species habitat weather
- Observation weather snapshots

### CITES Status

- Species+ integration
- Appendix listings
- Legislation tracking
- Conservation status

---

*Last Updated: 2026-01-31*
*Documentation Version: 1.0*