Internal development documentation, knowledge base planning, and production resources
(v 1.0 — 2025-07-29)
For each bucket you get 10-30 well-documented primary/secondary sources.
Licensing notes:
Bucket | Source (URL or citation) | Type | License |
---|---|---|---|
(a) Indiana State History | Indiana Historical Bureau “Statehood Timeline” (in.gov) | Website | OA |
(a) Indiana State History | Indiana Historical Society digital collections (images, manuscripts) | Archive | Perm |
(a) Indiana State History | Hoosiers and the American Story (IHS, 2015) ch. 1–10 | Book/PDF | OA |
(a) Indiana State History | Library of Congress “A Century of Lawmaking” territorial docs | Archive | OA |
(a) Indiana State History | Dunn, Indiana and Indianans (1919) | Book | OA |
(a) Indiana State History | Dunn, Greater Indianapolis (1910) | Book | OA |
(a) Indiana State History | Conover, “Rearview Mirror: 90-Year Retrospective on Indiana’s Economy” (IBRC) | Article | OA |
(a) Indiana State History | IN.gov “Introducing Indiana” PDF (1998) | Magazine PDF | OA |
(a) Indiana State History | ASCE Indiana Infrastructure Report Card (2025) | Report | OA |
(a) Indiana State History | IDEM Annual Environmental Reports (PDF series) | Gov reports | OA |
(b) Bloomington + Monroe Co. | Monroe County History Center archives | Archive | Perm |
(b) Bloomington + Monroe Co. | City of Bloomington “Furniture Factory District” page | Website | OA |
(b) Bloomington + Monroe Co. | Herald-Times digital archive (IU Library sub) | Newspaper | Pay |
(b) Bloomington + Monroe Co. | Bloomington: A Bicentennial History (Madison, 2018) | Book | Perm |
(b) Bloomington + Monroe Co. | GIS open data portal (monroecounty.gov) | Dataset | OA |
(c) Showers Brothers & Tech/Arts District | “A Walk Through the Showers Brothers Furniture Factory” PDF (bloomington.in.gov) | | OA |
(c) Showers Brothers & Tech/Arts District | DiscoverIndiana.org story “Showers Brothers Furniture Factory Historic District” | Article | OA |
(c) Showers Brothers & Tech/Arts District | City redevelopment master-plan docs (CRED) | Plan PDF | OA |
(d) Indiana University Lore | IU Libraries “Chronology 1820– ” site | Timeline | OA |
(d) Indiana University Lore | IU Sex-misconduct case filings (academicmisconductdatabase.org) | Dataset | OA |
(d) Indiana University Lore | Kinsey Institute digital archive highlights | Archive | Perm |
(d) Indiana University Lore | IU Archives photo collections | Archive | OA |
(d) Indiana University Lore | Little 500 sit-in oral histories (1968) | Audio transcripts | OA |
(e) Notable Figures | Benjamin Harrison bio (whitehouse.gov) | Gov bio | OA |
(e) Notable Figures | Eugene V. Debs House Museum docs | Archive | OA |
(e) Notable Figures | John Dillinger file (IN Historical Bureau) | Article | OA |
(e) Notable Figures | Madam C. J. Walker papers (IUPUI) | Archive | Perm |
(e) Notable Figures | Kurt Vonnegut Museum & Library digital exhibits | Archive | Perm |
(e) Notable Figures | Hoagy Carmichael collection (IUB) | Archive | Perm |
(e) Notable Figures | D. C. Stephenson KKK trial materials (IN State Archives) | Archive | Perm |
(f) Indigenous Nations | Indiana Historical Society “Myaamia Survivance” article | Article | OA |
(f) Indigenous Nations | IN.gov “Lesson 4: Indigenous Lands of Indiana” | Gov lesson | OA |
(f) Indigenous Nations | Treaty of Greenville text (Avalon Project) | Doc | OA |
(f) Indigenous Nations | Miami Tribe of Oklahoma language resources | Site | OA |
(f) Indigenous Nations | Potawatomi Nation cultural center resources | Site | OA |
(g) Labor/Industry/Farming/Racing/Music | NIRPC 2024 Obligated Projects report | | OA |
(g) Labor/Industry/Farming/Racing/Music | Indiana Limestone Company historical ads (LOC) | Images | OA |
(g) Labor/Industry/Farming/Racing/Music | U.S. Steel Gary Works centennial booklet | | OA |
(g) Labor/Industry/Farming/Racing/Music | Indianapolis Motor Speedway history timeline | Site | OA |
(g) Labor/Industry/Farming/Racing/Music | Gennett Records story (IHS) | Article | OA |
(g) Labor/Industry/Farming/Racing/Music | Indiana Humanities “Indiana Foodways” series | Articles | OA |
(g) Labor/Industry/Farming/Racing/Music | Purdue Ag Stats annual bulletins | Dataset | OA |
(h) Contemporary Issues | Indiana Drug Overdose Dashboard notes (IDOH, 2022) | | OA |
(h) Contemporary Issues | WFYI/IPB News environment desk articles (coal ash, PFAS, permit fees) | News | OA |
(h) Contemporary Issues | 2024 NIRPC climate & infrastructure plans | | OA |
(h) Contemporary Issues | IDEM Lake Michigan LAMP updates | Gov page | OA |
(add or swap sources quarterly—this starter list = 120+ items; prune if needed)
crawl_targets:
- all URLs above (respect robots.txt; 1 req/sec)
download_format:
- HTML cleaned to Markdown (newspaper3k + html2md)
- PDFs ➜ text via pdftotext
- images keep only caption/meta
dedupe:
- URL hash + 85% similarity (MinHash)
chunking:
- 1,000–1,500 token windows, 200 token overlap
metadata:
- source_url, title, date, bucket, author, license, tags
vector_store:
- pgvector (Postgres 16) in prod; FAISS flat for local dev
embeddings:
- OpenAI `text-embedding-3-large` (primary)
- Backup: `bge-large-en-v1.5`
re-index cadence:
- full rebuild yearly; incremental every upload
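
The dedupe and chunking settings above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the production pipeline: whitespace-split words stand in for model tokens, the default window of 1,200 is simply a value inside the 1,000–1,500 range, and `url_hash`, `chunk_tokens`, `minhash_signature`, and `is_duplicate` are hypothetical helper names.

```python
import hashlib


def url_hash(url):
    """Stable key for exact-URL dedupe (assumption: only whitespace is normalized)."""
    return hashlib.sha256(url.strip().encode("utf-8")).hexdigest()


def chunk_tokens(tokens, window=1200, overlap=200):
    """Split a token list into windows that share `overlap` tokens with their neighbor."""
    if overlap >= window:
        raise ValueError("overlap must be smaller than window")
    step = window - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + window])
        if start + window >= len(tokens):
            break  # this window already reaches the end of the document
    return chunks


def _shingles(text, size=3):
    """Set of overlapping word n-grams, the unit MinHash compares."""
    words = text.lower().split()
    if len(words) < size:
        return {" ".join(words)}
    return {" ".join(words[i:i + size]) for i in range(len(words) - size + 1)}


def minhash_signature(text, num_hashes=64):
    """One minimum hash value per seeded hash function."""
    shingles = _shingles(text)
    signature = []
    for seed in range(num_hashes):
        prefix = str(seed).encode("utf-8") + b":"
        signature.append(min(
            int.from_bytes(hashlib.sha1(prefix + s.encode("utf-8")).digest()[:8], "big")
            for s in shingles
        ))
    return signature


def is_duplicate(text_a, text_b, threshold=0.85):
    """Compare signatures; the matching fraction estimates Jaccard similarity."""
    sig_a, sig_b = minhash_signature(text_a), minhash_signature(text_b)
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a) >= threshold
```

In a real run the tokenizer would match the embedding model, and a library MinHash with locality-sensitive hashing would likely replace the pairwise comparison once the corpus grows.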
SYSTEM:
You are the Indiana Oracle, an interactive historical entity.
Speak in clear Midwestern English with occasional regional idioms.
NEVER claim divine authority; admit uncertainty when data gaps exist.
STYLE KNOBS
- temperature 0.6 default (raise to 0.9 for creative lore)
- max length 350 tokens per answer in kiosk mode
- vary openers: start with a date, an anecdote, or a direct answer, never a rote template
FORBIDDEN PHRASES
- "I am just an AI"
- "As an AI language model"
- absolute political endorsements
SAFETY / BIAS
- Decline hate or extremist praise
- Redirect modern medical/legal advice ("Consult a professional")
- Flag graphic violence; summarize instead
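
The style knobs and forbidden phrases above could be enforced in code along these lines. This is a hedged sketch: `generation_params` and `find_forbidden` are hypothetical names, and the `temperature` / `max_tokens` keys are generic placeholders rather than the parameters of any particular LLM API. The "absolute political endorsements" rule is not covered, since it cannot be caught by simple string matching.

```python
# Lowercased literal phrases from the FORBIDDEN PHRASES list
FORBIDDEN_PHRASES = ("i am just an ai", "as an ai language model")


def generation_params(creative=False, kiosk=True):
    """Translate the style knobs into sampling settings (key names are assumptions)."""
    params = {"temperature": 0.9 if creative else 0.6}
    if kiosk:
        params["max_tokens"] = 350  # kiosk-mode answer cap
    return params


def find_forbidden(text):
    """Return the forbidden phrases that appear in a draft answer, case-insensitively."""
    lowered = text.lower()
    return [phrase for phrase in FORBIDDEN_PHRASES if phrase in lowered]
```

A draft that returns a non-empty list from `find_forbidden` could be regenerated or edited before it reaches the kiosk.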
# | Question | Bucket | Sentiment |
---|---|---|---|
1 | “Why is Indiana called the Hoosier State?” | a | curious |
2 | “Which tribes lived here before statehood?” | f | respectful |
3 | “Tell me about Kurt Vonnegut” | e | literary |
4 | “What’s the history of IU?” | d | academic |
5 | “How did Bloomington get its name?” | b | local |
6 | “What happened to the Showers Brothers factory?” | c | historical |
7 | “Who was Madam C.J. Walker?” | e | inspirational |
8 | “What’s Indiana known for producing?” | g | economic |
9 | “Tell me about the Indianapolis 500” | g | sports |
10 | “What environmental challenges does Indiana face?” | h | serious |
(populate up to 60 questions)
Age-graded variants: kids, teens, adults, scholars
Sentiment markers: light / serious / critical (to tune response tone)
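
One way to turn the question list into a test matrix is to cross every question with the age grades and tone markers. The sketch below assumes a simple record type; `PromptCase` and `expand_cases` are hypothetical names, not part of any existing tooling.

```python
from dataclasses import dataclass
from itertools import product

AGE_GROUPS = ("kids", "teens", "adults", "scholars")
TONES = ("light", "serious", "critical")


@dataclass(frozen=True)
class PromptCase:
    question: str
    bucket: str
    age_group: str
    tone: str


def expand_cases(questions):
    """Cross each (question, bucket) pair with every age group and tone."""
    return [
        PromptCase(question, bucket, age, tone)
        for (question, bucket), age, tone in product(questions, AGE_GROUPS, TONES)
    ]
```

Each question yields 12 cases under this scheme, so the full matrix grows quickly and may need sampling rather than exhaustive review.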
/docs/
└── oracle-knowledge-base-plan.md
/public/spec/
└── clip_schema.json
Latest technical evolution incorporating video layers, GPU particles, and natural voice interaction
The particle-based approach to holographic personas offers both aesthetic and technical advantages over photorealistic methods. The ethereal particle aesthetic naturally masks processing delays and creates a more magical experience than traditional “talking head” installations.
Current Technical Stack Evolution:
Asset Library Structure:
/personas/vonnegut/
├── idle_loops/
│   ├── breathing_01.mp4 (20-30s)
│   ├── breathing_02.mp4
│   └── breathing_03.mp4
├── transitions/
│   ├── summon.mp4 (2-3s)
│   └── dissolve.mp4 (2-3s)
├── expressions/
│   ├── thinking.mp4
│   ├── amused.mp4
│   └── profound.mp4
└── faq_clips/
    ├── faq_001_dresden.mp4
    └── faq_002_writing_advice.mp4
    [30-50 more based on common questions]
AI Video Generation Prompts:
Two-Layer Composition:
Audio-Reactive Particle Parameters:
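# Simplified audio-reactive particles (pseudocode: analyze_audio, map_range,
# and map_emotion stand in for the real analysis and mapping helpers)
# Assumption: analyze_audio returns amplitude (0-1) and pitch (Hz) together
audio_amplitude, audio_pitch = analyze_audio(input_stream)
particle_params = {
    'mouth_density': map_range(audio_amplitude, 0, 1, 0.3, 1.0),
    'mouth_velocity': map_range(audio_pitch, 20, 400, 0.1, 2.0),
    # sentiment_analysis is expected from the emotion-analysis stage
    'color_shift': map_emotion(sentiment_analysis),
}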
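
The `map_range` helper used above is not defined in this document. A minimal version, assuming a clamped linear remap is what is intended, might look like this; `build_particle_params` is a hypothetical wrapper that reuses the ranges from the snippet.

```python
def map_range(value, in_min, in_max, out_min, out_max):
    """Linearly remap value from [in_min, in_max] to [out_min, out_max], clamping at the ends."""
    if in_max == in_min:
        raise ValueError("input range must not be empty")
    t = (value - in_min) / (in_max - in_min)
    t = min(max(t, 0.0), 1.0)  # keep out-of-range audio readings inside the output range
    return out_min + t * (out_max - out_min)


def build_particle_params(amplitude, pitch_hz):
    """Apply the same ranges as the parameter snippet: amplitude 0-1, pitch 20-400 Hz."""
    return {
        "mouth_density": map_range(amplitude, 0, 1, 0.3, 1.0),
        "mouth_velocity": map_range(pitch_hz, 20, 400, 0.1, 2.0),
    }
```

Clamping is a design choice here: it keeps a loud transient or a pitch-tracking glitch from pushing particle density or velocity outside the intended look.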
TouchDesigner Network Architecture:
Audio In → FFT Analysis → Particle Emitters
↓
Emotion Analysis → Color/Pattern Modulation
↓
Persona Templates → Unique Behaviors
↓
Render Pipeline → Pepper's Ghost Display
Base System:
Persona-Specific Signatures:
WebSocket Architecture Replacement:
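# Simple WebSocket voice handler
# stt, get_vonnegut_response, and tts_elevenlabs are left to the integration.
import asyncio

import numpy as np
import torch
import websockets
from silero_vad import VADIterator, load_silero_vad


class VoiceConversationHandler:
    def __init__(self):
        # VADIterator wraps a loaded Silero model
        self.vad = VADIterator(load_silero_vad(), threshold=0.5, sampling_rate=16000)
        self.buffer = []  # raw chunks of the utterance in progress
        self.processing = False

    async def handle_audio_stream(self, websocket):
        # Assumption: the client sends 16 kHz mono 16-bit PCM in 512-sample chunks
        async for audio_chunk in websocket:
            if self.processing:
                continue  # drop audio that arrives while a reply is being generated
            samples = np.frombuffer(audio_chunk, dtype=np.int16).astype(np.float32) / 32768.0
            # VADIterator returns {'start': n}, {'end': n}, or None for each chunk
            event = self.vad(torch.from_numpy(samples))
            if event and 'start' in event:
                self.buffer = []  # discard leading silence
            self.buffer.append(audio_chunk)
            # Detect speech end
            if event and 'end' in event:
                self.processing = True
                try:
                    # Process complete utterance
                    text = await self.stt(b''.join(self.buffer))
                    response = await self.get_vonnegut_response(text)
                    audio = await self.tts_elevenlabs(response)
                    # Stream back
                    await websocket.send(audio)
                finally:
                    # Always release the lock, even if a stage fails
                    self.buffer = []
                    self.vad.reset_states()
                    self.processing = False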
# FAQ Router
class OracleRouter:
    def __init__(self):
        self.faq_embeddings = load_embeddings('faq_database.pkl')
        self.cooldowns = {}

    async def route_query(self, audio_input):
        text = await self.stt(audio_input)
        embedding = self.encode(text)
        # Check FAQ match
        match, score = self.search_faqs(embedding)
        if score > 0.83 and self.can_play(match.id):
            # Play pre-rendered video with baked audio
            return ('play_faq', match.video_path)
        else:
            # Generate live response
            response_text = await self.llm_generate(text)
            response_audio = await self.tts(response_text)
            # Choose base video by length
            base_video = self.select_video_loop(len(response_audio))
            return ('play_live', base_video, response_audio)
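
The router relies on `search_faqs` and `can_play`, which are not shown. The sketch below illustrates one possible shape for them using cosine similarity and a per-clip cooldown. `FaqMatcher`, the 300-second default cooldown, and the injectable `clock` are all assumptions for illustration; only the 0.83 threshold comes from the router.

```python
import math
import time


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class FaqMatcher:
    def __init__(self, faq_vectors, threshold=0.83, cooldown_s=300.0, clock=time.monotonic):
        self.faq_vectors = faq_vectors  # {faq_id: embedding vector}
        self.threshold = threshold
        self.cooldown_s = cooldown_s    # assumed value; the plan does not specify one
        self.clock = clock              # injectable so the cooldown can be tested
        self.last_played = {}

    def search(self, query_vec):
        """Return the best-matching FAQ id and its similarity score."""
        best_id, best_score = None, -1.0
        for faq_id, vec in self.faq_vectors.items():
            score = cosine_similarity(query_vec, vec)
            if score > best_score:
                best_id, best_score = faq_id, score
        return best_id, best_score

    def can_play(self, faq_id):
        """A clip is playable if it has never played or its cooldown has elapsed."""
        last = self.last_played.get(faq_id)
        return last is None or self.clock() - last >= self.cooldown_s

    def route(self, query_vec):
        faq_id, score = self.search(query_vec)
        if faq_id is not None and score > self.threshold and self.can_play(faq_id):
            self.last_played[faq_id] = self.clock()
            return ("play_faq", faq_id)
        return ("play_live", None)
```

The cooldown keeps a visitor who repeats a question from seeing the identical pre-rendered clip twice in a row; the second ask falls through to a live response instead.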
# In TouchDesigner Execute DAT
def fit(value, in_min, in_max, out_min, out_max):
    # Clamped linear remap, defined here because Python has no built-in fit()
    t = min(max((value - in_min) / (in_max - in_min), 0.0), 1.0)
    return out_min + t * (out_max - out_min)


def onFrameStart(frame):
    # Get audio analysis (eval() turns each channel into a plain float)
    analysis = op('audioanalysis1')
    audio_level = analysis['level'].eval()
    audio_low = analysis['low'].eval()
    audio_mid = analysis['mid'].eval()
    audio_high = analysis['high'].eval()
    # Modulate particle parameters
    particles = op('particles1')
    # Mouth region density
    mouth_force = particles.par.force1
    mouth_force.val = fit(audio_level, 0, 0.8, 0.1, 2.0)
    # Color based on frequency
    # Note: these values are computed but not yet assigned to any parameter
    color_r = fit(audio_low, 0, 1, 0.0, 0.3)   # Warm on low
    color_g = fit(audio_mid, 0, 1, 0.5, 1.0)   # Cyan on mid
    color_b = fit(audio_high, 0, 1, 0.8, 1.0)  # Bright on high
    # Persona-specific modulation
    persona = parent().par.Persona.eval()
    if persona == 'vonnegut':
        # Add smoke wisps on thoughtful pauses
        if audio_level < 0.1:
            particles.par.birthrate = 500
            particles.par.velocity = 0.5
    elif persona == 'bub':
        # Sparkle on high frequencies (purrs)
        if audio_high > 0.7:
            particles.par.turbulence = 2.0
Phase 1: Foundation (Months 1-2)
Phase 2: Enhancement (Months 2-4)
Phase 3: Polish (Months 4-6)
Essential Skills:
Test Project Brief: “Create a 30-second particle face loop that responds to audio amplitude. Particles should feel weightless and ethereal. Use cyan/gold palette. Black background is TRUE black.”
Current Explorations:
Budget Reallocation:
This approach prioritizes the magical particle aesthetic while maintaining technical feasibility and cost efficiency.