Temporal reasoning accuracy measures whether the memory system correctly answers questions about event ordering, state at a specific time, recency, intervals, and sequences.
Methodology
Questions are categorized into the five temporal reasoning types listed above. Each method ingests time-stamped events and is scored by accuracy on each question type.
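As a rough illustration of per-type scoring, the sketch below groups graded answers by question type and computes accuracy for each group. The function name `accuracy_by_type` and the `(question_type, is_correct)` input shape are assumptions made for this example, not part of the benchmark's actual harness.

```python
from collections import defaultdict


def accuracy_by_type(results):
    """Compute accuracy per question type.

    `results` is an iterable of (question_type, is_correct) pairs.
    This input shape is a hypothetical choice for illustration.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for question_type, is_correct in results:
        total[question_type] += 1
        correct[question_type] += int(is_correct)
    # Only types that actually appear in `results` are reported.
    return {qtype: correct[qtype] / total[qtype] for qtype in total}


# Toy data for demonstration only; these are not benchmark results.
toy_results = [
    ("recency", True),
    ("recency", True),
    ("interval", True),
    ("interval", False),
]
print(accuracy_by_type(toy_results))
```

Reporting one number per type, rather than a single pooled accuracy, keeps a strong category from masking a weak one.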
Neocortex achieves perfect accuracy (100%) on recency questions, a result consistent with its Ebbinghaus time-decay model, under which recent memories naturally receive higher retention scores. The directfeed method (feeding the full context to the LLM) performs well on interval and sequence questions, where having the complete timeline helps, but this approach does not scale beyond context window limits.
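The link between recency and retention can be illustrated with the classic exponential form of the Ebbinghaus forgetting curve, R = exp(-t / S). This is a minimal sketch under stated assumptions: the exact formula Neocortex uses is not given here, and the `stability_hours` parameter, its default value, and both function names are hypothetical.

```python
import math


def retention(age_hours: float, stability_hours: float = 24.0) -> float:
    """Exponential forgetting curve R = exp(-t / S).

    `stability_hours` (S) is an assumed parameter with an arbitrary default;
    the real model's parameters are not specified in this document.
    """
    if age_hours < 0:
        raise ValueError("age_hours must be non-negative")
    return math.exp(-age_hours / stability_hours)


def highest_retention(memories, now_hours):
    """Return the (label, timestamp_hours) pair with the highest retention.

    With a single shared stability value, this is always the newest memory.
    """
    return max(memories, key=lambda m: retention(now_hours - m[1]))


# Hypothetical memories: (label, timestamp in hours since some epoch).
memories = [("old note", 0.0), ("new note", 10.0)]
print(highest_retention(memories, now_hours=12.0))
```

Because the curve decreases monotonically with age, ranking by retention score favors the newest memory, which is the behavior a recency question rewards.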