BigBrother

BigBrother monitors a folder for newly created text files and scans them for suspicious keywords loaded from a separate threat-list file. It must handle very large files (tens of gigabytes) without loading them fully into memory, and it must pick up changes to the keyword list while it is running.

Phase 1 — Keyword Loading & Single-File Scanner

Implement the core scanning logic.

  • Read keyword list from a file (one keyword per line).
  • Validate the keyword file (non-empty, no invalid entries).
  • Implement a streaming search for suspicious words inside a single file, reading in chunks.
  • When a match is found, extract up to 100 characters of context on each side of the match.
  • Print alerts to console or write them to an output file.
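
A minimal sketch of the keyword loading and validation steps. The plan does not define what an "invalid entry" is, so this sketch assumes it means a keyword containing non-printable characters; it also assumes blank lines are skipped and matching is case-insensitive. The names load_keywords and KeywordFileError are hypothetical.

```python
from pathlib import Path


class KeywordFileError(ValueError):
    """Raised when the keyword file is empty or contains an unusable entry."""


def load_keywords(path):
    """Return the keywords in `path` (one per line), lowercased and de-duplicated."""
    keywords = []
    seen = set()
    text = Path(path).read_text(encoding="utf-8")
    for line_no, raw in enumerate(text.splitlines(), start=1):
        word = raw.strip()
        if not word:
            continue  # assumption: blank lines are ignored rather than rejected
        if any(not ch.isprintable() for ch in word):
            # assumption: "invalid entry" means a keyword with control characters
            raise KeywordFileError(f"{path}:{line_no}: non-printable character in keyword")
        key = word.lower()
        if key not in seen:
            seen.add(key)
            keywords.append(key)
    if not keywords:
        raise KeywordFileError(f"{path}: keyword file contains no keywords")
    return keywords
```

Raising a dedicated exception lets the caller decide whether a bad keyword file is fatal (at startup) or merely logged (during a later reload).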

Functional Result: Running python bigbrother.py <keywords.txt> <file.txt> scans the file and prints every match with its surrounding context.
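
One possible shape for the chunked scanner, shown as a sketch rather than a definitive implementation. It keeps a small tail of the previous chunk in memory so that keywords straddling a chunk boundary are still found and so that context is available on both sides of a match. The names scan_file and Match, the 1 MiB default chunk size, UTF-8 decoding with replacement of bad bytes, and case-insensitive matching are all assumptions of this sketch.

```python
import re
from typing import Iterator, NamedTuple


class Match(NamedTuple):
    offset: int   # character offset of the match within the file
    matched: str  # the text exactly as it appears in the file
    context: str  # the match plus up to `context` characters on each side


def scan_file(path, keywords, chunk_size=1 << 20, context=100) -> Iterator[Match]:
    """Yield every keyword occurrence in `path` while holding only about one chunk in memory."""
    if not keywords:
        raise ValueError("keyword list is empty")
    # Longest first, so the alternation prefers the longest keyword at a position.
    ordered = sorted(set(keywords), key=len, reverse=True)
    # The lookahead tests every start position, so overlapping hits are not skipped.
    pattern = re.compile("(?=(" + "|".join(map(re.escape, ordered)) + "))", re.IGNORECASE)
    # A match this close to the buffer end may still lack trailing context.
    hold = len(ordered[0]) + context
    buf = ""
    buf_start = 0  # absolute offset of buf[0]
    next_pos = 0   # absolute offset; matches starting before it were already reported
    with open(path, "r", encoding="utf-8", errors="replace", newline="") as f:
        while True:
            chunk = f.read(chunk_size)
            eof = not chunk
            buf += chunk
            # Only matches starting before `limit` are final in this round.
            limit = len(buf) if eof else max(len(buf) - hold, 0)
            for m in pattern.finditer(buf, next_pos - buf_start):
                i = m.start()
                if i >= limit:
                    break  # will be picked up again once more data has arrived
                word = m.group(1)
                snippet = buf[max(i - context, 0): i + len(word) + context]
                yield Match(buf_start + i, word, snippet)
            if eof:
                return
            next_pos = buf_start + limit
            # Keep `context` characters before the next scan position for leading context.
            keep_from = max(limit - context, 0)
            buf = buf[keep_from:]
            buf_start += keep_from
```

Because the function is a generator, alerts can be printed as soon as each match is found instead of after the whole file has been read. Offsets are counted in decoded characters, not bytes.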


Phase 2 — Folder Monitoring Loop

Extend the tool to automatically process new incoming files.

  • Monitor a target folder for new files by polling it at a configurable interval (every X seconds).
  • Detect previously unseen files and scan them using Phase 1 logic.
  • Handle large files and partially written files safely (for example, by waiting until a file has stopped changing before scanning it).
  • Log all operations (file scanned, no matches, errors).

Functional Result: Starting BigBrother on a folder causes it to continuously watch for new files and scan each one when it appears.
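
The monitoring loop could be sketched as follows. This sketch assumes a file counts as "fully written" once its size and modification time are identical on two consecutive polls; that is a heuristic, not a guarantee. The names poll_once and watch, the scan callback (assumed to return a match count), and the max_cycles parameter (useful only for testing) are hypothetical.

```python
import logging
import os
import time

log = logging.getLogger("bigbrother")


def poll_once(folder, seen, pending):
    """Return unseen files whose size and mtime did not change since the previous poll."""
    ready = []
    with os.scandir(folder) as entries:
        for entry in entries:
            if not entry.is_file() or entry.path in seen:
                continue
            st = os.stat(entry.path)
            signature = (st.st_size, st.st_mtime_ns)
            if pending.get(entry.path) == signature:
                ready.append(entry.path)  # unchanged for one full interval
            else:
                pending[entry.path] = signature  # new or still being written
    for path in ready:
        pending.pop(path, None)
        seen.add(path)
    return sorted(ready)


def watch(folder, scan, interval=5.0, max_cycles=None):
    """Poll `folder` forever (or `max_cycles` times) and call `scan(path)` on each new file."""
    seen, pending = set(), {}
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        try:
            for path in poll_once(folder, seen, pending):
                try:
                    hits = scan(path)
                    log.info("scanned %s: %d match(es)", path, hits)
                except OSError as exc:
                    log.error("could not scan %s: %s", path, exc)
        except OSError as exc:
            log.error("could not list %s: %s", folder, exc)
        cycles += 1
        time.sleep(interval)
```

With this design a new file is scanned at the earliest on the second poll after it appears, which trades a little latency for not scanning half-written files.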


Phase 3 — Hot-Reload Threat Keywords

Add support for updating the keyword list without restarting the program.

  • Detect changes to the keyword file.
  • Reload keywords dynamically in-memory.
  • Use the updated keyword list for future scans.
  • Validate new keyword file contents and log errors.

Functional Result: Modifying keywords.txt while BigBrother is running updates the internal keyword list without a restart, and all subsequent scans use the new list.
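
A sketch of one way to hot-reload the keywords: compare the file's modification time and size on each check and reload only when they differ. If the new file fails validation, the previous list stays in use and the error is logged. KeywordStore, refresh, and read_keywords are hypothetical names, and the validation shown (non-empty only) is a simplified stand-in for the Phase 1 rules.

```python
import logging
import os

log = logging.getLogger("bigbrother")


def read_keywords(path):
    """Simplified loader: one keyword per line, blank lines skipped, lowercased."""
    with open(path, encoding="utf-8") as f:
        words = [line.strip().lower() for line in f if line.strip()]
    if not words:
        raise ValueError("keyword file is empty")
    return words


class KeywordStore:
    """Holds the current keyword list and reloads it when the file changes."""

    def __init__(self, path, load=read_keywords):
        self.path = path
        self._load = load
        self._signature = None
        self.keywords = []
        if not self.refresh():
            raise ValueError(f"could not load initial keywords from {path}")

    def refresh(self):
        """Reload if the file changed; return True only when a new list was adopted."""
        try:
            st = os.stat(self.path)
            signature = (st.st_mtime_ns, st.st_size)
            if signature == self._signature:
                return False
            # Remember the signature first so a bad file is reported once, not every poll.
            self._signature = signature
            new_keywords = self._load(self.path)
        except (OSError, ValueError) as exc:
            log.error("keyword reload failed, keeping previous list: %s", exc)
            return False
        self.keywords = new_keywords
        log.info("loaded %d keyword(s) from %s", len(new_keywords), self.path)
        return True
```

Calling refresh() once per polling cycle, before scanning new files, means a scan that is already in progress finishes with the list it started with.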


Phase 4 — Robustness, Alert Formatting, and Cleanup

Improve usability, safety, and output clarity.

  • Add structured alert formatting (timestamp, file name, location, context).
  • Add colored terminal output for matches.
  • Make the output mode configurable: console, log file, or both.
  • Handle errors gracefully: permission errors, missing folders, corrupted files.
  • Write a final README and usage instructions.

Functional Result: BigBrother runs for long periods without crashing, produces clearly formatted output, and handles real-world issues such as I/O errors, files that change while being processed, and keyword updates.
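
The alert formatting and output modes might look like the sketch below. The line layout, the function names format_alert and emit, the default log file name, and the choice of ANSI bold red for the highlighted keyword are assumptions; color is applied only when standard output is a terminal, so log files stay free of escape codes.

```python
import sys
from datetime import datetime

RED_BOLD = "\033[1;31m"  # ANSI escape sequence: bold red
RESET = "\033[0m"        # ANSI escape sequence: reset attributes


def _one_line(text):
    """Replace line breaks with spaces so each alert stays on a single line."""
    return text.replace("\r", " ").replace("\n", " ")


def format_alert(file_name, offset, before, matched, after, when=None, color=False):
    """Build one alert line: timestamp, file name, location, and context."""
    stamp = (when or datetime.now()).isoformat(timespec="seconds")
    shown = f"{RED_BOLD}{matched}{RESET}" if color else matched
    return f"[{stamp}] {file_name} offset={offset}: {_one_line(before)}{shown}{_one_line(after)}"


def emit(alert_fields, mode="console", log_path="bigbrother.log", when=None):
    """Send an alert to the console, a log file, or both.

    `alert_fields` is a (file_name, offset, before, matched, after) tuple.
    """
    if mode not in ("console", "file", "both"):
        raise ValueError(f"unknown output mode: {mode!r}")
    if mode in ("console", "both"):
        print(format_alert(*alert_fields, when=when, color=sys.stdout.isatty()))
    if mode in ("file", "both"):
        with open(log_path, "a", encoding="utf-8") as log_file:
            log_file.write(format_alert(*alert_fields, when=when, color=False) + "\n")
```

Keeping the formatter separate from the output routing makes it straightforward to add another destination later without touching the alert layout.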