BigBrother
BigBrother monitors a folder for newly appearing text files and scans them for suspicious keywords loaded from a separate threat-list file. It must handle very large files (tens of gigabytes) without loading them fully into memory, and it must pick up changes to the keyword list while it is running.
Phase 1 — Keyword Loading & Single-File Scanner
Implement the core scanning logic.
- Read the keyword list from a file (one keyword per line).
- Validate the keyword file (non-empty, no invalid entries).
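- Implement a streaming search for suspicious keywords inside a single file, reading it in chunks so that memory use stays bounded and matches spanning a chunk boundary are not missed.
- When a match is found, extract up to 100 characters of context on each side of it.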
- Print alerts to console or write them to an output file.
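The keyword loading and validation bullets above could be sketched as follows. This is a minimal sketch, not a prescribed implementation: the function name load_keywords is hypothetical, and the validation rules are assumptions, since the spec does not define what an "invalid entry" is. Here blank lines are skipped, duplicates are dropped, and entries containing non-printable characters are rejected.

```python
def load_keywords(path):
    """Return the unique keywords from a file with one keyword per line.

    Assumptions (not fixed by the spec): blank lines are ignored, duplicates
    are dropped, and an entry with non-printable characters is invalid.
    """
    keywords = []
    seen = set()
    # "utf-8-sig" also accepts files that start with a UTF-8 byte order mark.
    with open(path, encoding="utf-8-sig") as fh:
        for lineno, raw in enumerate(fh, start=1):
            word = raw.strip()
            if not word:
                continue
            if not word.isprintable():
                raise ValueError(
                    f"{path}:{lineno}: keyword contains non-printable characters"
                )
            if word not in seen:
                seen.add(word)
                keywords.append(word)
    if not keywords:
        raise ValueError(f"{path}: keyword file contains no keywords")
    return keywords
```

Raising ValueError keeps the decision about what to do with a bad file (exit, log, keep the old list) in the caller's hands.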
Functional Result: Running bigbrother.py <keywords.txt> <file.txt> scans the file and prints all matches with context.
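One possible shape for the chunked scanner is a sliding text window that holds back matches until their trailing context has been read. The sketch below assumes case-sensitive substring matching, UTF-8 text, and character (not byte) offsets; the name scan_file and its parameters are hypothetical.

```python
def scan_file(path, keywords, context=100, chunk_size=1 << 20):
    """Yield (offset, keyword, snippet) for every keyword occurrence.

    Only a window of roughly chunk_size + 2 * context characters is kept in
    memory. Offsets are character offsets into the decoded text.
    """
    if not keywords:
        return
    # A match is reported only once this many characters follow its start,
    # so its trailing context is already in the window.
    margin = context + max(len(k) for k in keywords)
    buf = ""        # sliding window over the file
    buf_start = 0   # absolute offset of buf[0]
    scan_from = 0   # index in buf where unreported matches may start
    with open(path, encoding="utf-8", errors="replace", newline="") as fh:
        while True:
            chunk = fh.read(chunk_size)
            eof = chunk == ""
            buf += chunk
            limit = len(buf) if eof else len(buf) - margin
            if limit > scan_from:
                hits = []
                for kw in keywords:
                    # Restrict the search to matches starting before limit.
                    end = min(len(buf), limit + len(kw) - 1)
                    pos = buf.find(kw, scan_from, end)
                    while pos != -1:
                        hits.append((pos, kw))
                        pos = buf.find(kw, pos + 1, end)
                for pos, kw in sorted(hits):
                    lo = max(0, pos - context)
                    hi = pos + len(kw) + context
                    yield buf_start + pos, kw, buf[lo:hi]
                # Keep only what later matches need as leading context.
                keep = max(0, limit - context)
                buf = buf[keep:]
                buf_start += keep
                scan_from = limit - keep
            if eof:
                break
```

Because matches near the end of the window are deferred to the next read, a keyword split across two chunks is still found, and each snippet carries its full context unless the match sits near the start or end of the file. The per-keyword str.find loop is simple but scales linearly with the number of keywords; a multi-pattern matcher may be worth considering for long threat lists.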
Phase 2 — Folder Monitoring Loop
Extend the tool to automatically process new incoming files.
- Monitor a target folder for new files by polling at a configurable interval (every X seconds).
- Detect previously unseen files and scan them using Phase 1 logic.
- Handle large files and partially written files safely (for example, do not scan a file while it is still being written).
- Log all operations (file scanned, no matches, errors).
Functional Result: Starting BigBrother on a folder causes it to continuously watch for new files and scan each one when it appears.
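A polling loop along these lines might look like the sketch below. It treats a file as ready only when its size and modification time are identical on two consecutive polls, which is one heuristic for skipping partially written files and is an assumption here, not something the spec mandates. The names find_ready_files, watch, and handle are hypothetical; handle stands in for the Phase 1 scan.

```python
import os
import time


def find_ready_files(folder, seen, pending):
    """Return unseen files whose size and mtime did not change since the last poll.

    seen    -- set of paths already handed out for scanning (updated in place)
    pending -- dict mapping path -> (size, mtime_ns) from the previous poll
    """
    ready = []
    with os.scandir(folder) as entries:
        for entry in entries:
            if not entry.is_file() or entry.path in seen:
                continue
            st = entry.stat()
            signature = (st.st_size, st.st_mtime_ns)
            if pending.get(entry.path) == signature:
                # Unchanged for one full interval: assume the writer is done.
                ready.append(entry.path)
                seen.add(entry.path)
                del pending[entry.path]
            else:
                pending[entry.path] = signature
    return sorted(ready)


def watch(folder, handle, interval=5.0):
    """Poll folder forever and call handle(path) once per new, stable file."""
    seen, pending = set(), {}
    while True:
        for path in find_ready_files(folder, seen, pending):
            handle(path)
        time.sleep(interval)
```

With this approach a new file is scanned no earlier than one interval after it stops changing. A writer that pauses for longer than the interval can still fool the check, so producers that write to a temporary name and rename on completion are safer where that is possible.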
Phase 3 — Hot-Reload Threat Keywords
Add support for updating the keyword list without restarting the program.
- Detect changes to the keyword file.
- Reload keywords dynamically in memory.
- Use the updated keyword list for future scans.
- Validate new keyword file contents and log errors.
Functional Result: Modifying keywords.txt while BigBrother is running updates the internal keyword list without a restart, and all subsequent scans use the new keywords.
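One way to approach hot-reloading is to compare the keyword file's modification time and size on each polling cycle and reload only when they differ. The sketch below is written under that assumption; KeywordStore, refresh, and read_keywords are hypothetical names, and the loader is a deliberately minimal stand-in for whatever validation Phase 1 settles on. If the new file fails validation, the previous list is kept and the error is logged.

```python
import logging
import os

log = logging.getLogger("bigbrother")


def read_keywords(path):
    """Hypothetical minimal loader: one keyword per line, blank lines skipped."""
    with open(path, encoding="utf-8") as fh:
        words = [line.strip() for line in fh if line.strip()]
    if not words:
        raise ValueError(f"{path}: keyword file is empty")
    return words


class KeywordStore:
    """Holds the current keyword list and reloads it when the file changes."""

    def __init__(self, path, loader=read_keywords):
        self.path = path
        self.loader = loader
        self.keywords = loader(path)  # a bad initial file raises immediately
        self._stamp = self._signature()

    def _signature(self):
        st = os.stat(self.path)
        return (st.st_mtime_ns, st.st_size)

    def refresh(self):
        """Reload if the file changed; return True only on a successful reload."""
        try:
            stamp = self._signature()
            if stamp == self._stamp:
                return False
            # Remember the new stamp first so a bad file is reported only once.
            self._stamp = stamp
            new_keywords = self.loader(self.path)
        except (OSError, ValueError) as exc:
            log.error("keyword reload failed, keeping previous list: %s", exc)
            return False
        self.keywords = new_keywords
        log.info("reloaded %d keywords from %s", len(new_keywords), self.path)
        return True
```

Calling refresh() at the top of each polling cycle and passing store.keywords to the scanner means a scan already in progress keeps its old list while later scans use the new one.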
Phase 4 — Robustness, Alert Formatting, and Cleanup
Improve usability, safety, and output clarity.
- Add structured alert formatting (timestamp, file name, location, context).
- Add colored terminal output for matches.
- Make the output mode configurable: console, log file, or both.
- Handle errors gracefully: permission errors, missing folders, corrupted files.
- Write a final README and usage instructions.
Functional Result: BigBrother runs for long periods without crashing, produces polished output, and handles real-world issues such as I/O errors, files changing during a scan, and keyword updates.
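The alert format and output modes could be sketched with the standard logging module, as below. The exact field layout, the names format_alert and make_logger, the mode strings, and the default log file name are all assumptions made for illustration; the spec only requires a timestamp, file name, location, and context.

```python
import logging
import sys
from datetime import datetime, timezone

RED, RESET = "\033[31m", "\033[0m"  # ANSI escape codes for red text


def format_alert(path, offset, keyword, snippet, color=False, now=None):
    """Build a one-line alert: timestamp, file, offset, keyword, context."""
    now = now or datetime.now(timezone.utc)
    # Collapse newlines and runs of whitespace so the alert stays on one line.
    flat = " ".join(snippet.split())
    if color:
        flat = flat.replace(keyword, f"{RED}{keyword}{RESET}")
    stamp = now.isoformat(timespec="seconds")
    return (
        f"{stamp} ALERT file={path} offset={offset} "
        f"keyword={keyword} context={flat}"
    )


def make_logger(mode="console", log_path="bigbrother.log"):
    """Return a logger writing to the console, a log file, or both."""
    if mode not in ("console", "file", "both"):
        raise ValueError(f"unknown output mode: {mode!r}")
    logger = logging.getLogger("bigbrother.alerts")
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls
    if mode in ("console", "both"):
        logger.addHandler(logging.StreamHandler(sys.stdout))
    if mode in ("file", "both"):
        logger.addHandler(logging.FileHandler(log_path, encoding="utf-8"))
    return logger
```

A caller might write make_logger("console").info(format_alert(path, offset, kw, snippet, color=sys.stdout.isatty())). In "both" mode the same string goes to every handler, so colored alerts would put escape codes in the log file; formatting the message separately per destination avoids that.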