Snaptrack

Snaptrack is a site-snapshot and change-tracking tool. It captures the HTML of any given site (or set of pages), stores snapshots (including HTTP headers) in a local SQLite database, and highlights differences between consecutive snapshots.

Overview

Snaptrack is a Go-based application designed to monitor websites for changes over time. It does this by:

- Fetching the raw HTML (and HTTP headers) from a URL or recursively crawling an entire domain.
- Storing each snapshot in a local SQLite database.
- Comparing each new snapshot to the previous version for that page and presenting a unified diff of what changed.

The tool can be run in CLI mode for batch usage (crawl, check, etc.) or in a TUI (Text-based User Interface) for interactive exploration of snapshots.

Features

- Recursive Crawl: Optionally follow links within the same domain to capture snapshots of multiple pages.
- SQLite Database: Stores snapshots locally; simple, lightweight, no external database required.
- Diff Highlights: Compares new HTML to the previous version for each page, generating a unified diff (with optional color).
- TUI: A text-based interface lets you browse tracked URLs, see changes, and re-check pages on demand.
- Stores Request/Response Headers: Helpful for auditing server headers, analyzing status codes, and monitoring security headers over time.
- Raw HTTP Approach (No Headless Browser): Faster and simpler for static or server-rendered pages.
  - (If needed, revert to a headless browser approach for JavaScript-heavy sites.)

Getting Started

1. Prerequisites

- Go (version 1.18+ recommended).
- A SQLite driver (e.g., github.com/mattn/go-sqlite3), installed automatically via go mod tidy.
- (Optional) A color-supporting terminal for color-coded diffs.

2. Installation

Clone this repository:

git clone https://github.com/copyleftdev/snaptrack.git

Change to the directory:
```
cd snaptrack
```
Install dependencies:
```
go mod tidy
```

3. Building from Source

Use the Makefile:

make build
make run

Or manually:

go build -o bin/snapstack ./cmd/snapstack

The executable snapstack is placed in ./bin/.

Usage

Snaptrack can be invoked via CLI subcommands or launched in a TUI if no arguments are provided.

1. Crawling a Domain

./bin/snapstack crawl https://example.com --max-depth=2

This command will:
- Crawl the specified domain (example.com) recursively up to 2 levels.
- Store HTML snapshots (and headers/status code) in snapshots.db.
- Show diff logs if changes are detected on subsequent crawls.
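
A depth-limited, same-domain crawl like the one described above might look roughly like this. It is a sketch under assumptions, not Snaptrack's crawler: `crawl`, `resolveSameHost`, and the injected `linksOf` function (a stand-in for "fetch the page and extract its hrefs") are hypothetical names, and the link graph in `main` is made up so the traversal can be shown without network access.

```go
package main

import (
	"fmt"
	"net/url"
)

// resolveSameHost resolves href against base, drops any fragment, and
// reports whether the result is an http(s) URL on the same host as base.
func resolveSameHost(base *url.URL, href string) (string, bool) {
	ref, err := url.Parse(href)
	if err != nil {
		return "", false
	}
	abs := base.ResolveReference(ref)
	abs.Fragment = ""
	abs.RawFragment = ""
	if abs.Scheme != "http" && abs.Scheme != "https" {
		return "", false
	}
	return abs.String(), abs.Host == base.Host
}

// crawl walks pages breadth-first from start, following same-host links up
// to maxDepth levels away from the start page.
func crawl(start string, maxDepth int, linksOf func(page string) []string) ([]string, error) {
	root, err := url.Parse(start)
	if err != nil {
		return nil, err
	}
	type item struct {
		url   string
		depth int
	}
	seen := map[string]bool{root.String(): true}
	queue := []item{{root.String(), 0}}
	var visited []string

	for len(queue) > 0 {
		cur := queue[0]
		queue = queue[1:]
		visited = append(visited, cur.url)
		if cur.depth == maxDepth {
			continue // do not expand links beyond the depth limit
		}
		base, err := url.Parse(cur.url)
		if err != nil {
			continue
		}
		for _, href := range linksOf(cur.url) {
			next, ok := resolveSameHost(base, href)
			if !ok || seen[next] {
				continue // off-domain, unparsable, or already queued
			}
			seen[next] = true
			queue = append(queue, item{next, cur.depth + 1})
		}
	}
	return visited, nil
}

func main() {
	// Made-up link graph standing in for real pages.
	graph := map[string][]string{
		"https://example.com/":  {"/a", "https://other.org/x", "/b#top"},
		"https://example.com/a": {"/c"},
		"https://example.com/c": {"/d"},
	}
	pages, err := crawl("https://example.com/", 2, func(p string) []string { return graph[p] })
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, p := range pages {
		fmt.Println(p)
	}
}
```

The `seen` set prevents revisiting pages that link to each other, and comparing hosts after resolving relative links keeps the crawl on the starting domain.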

2. Checking or Diffing a Single URL

./bin/snapstack check https://example.com

(Example usage; this subcommand may not be implemented. Checks a single page.)

./bin/snapstack diff https://example.com

(Example usage; this subcommand may not be implemented. Shows a unified diff for that page's last two snapshots.)

3. TUI (Interactive Mode)

./bin/snapstack

Launches a text-based interface to:
- List all tracked URLs in your DB.
- Select a URL to see if it changed.
- Press d for diff output, r to recapture, etc.
- Press q or Esc to quit.

Storing HTTP Headers & Status Codes

By default, Snaptrack captures and stores:

- Request Headers (the final headers sent, such as User-Agent).
- Response Headers (e.g. Content-Type, Set-Cookie, Cache-Control).
- HTTP Status Code (e.g. 200, 301, 404).

They’re stored as JSON in the snapshots table under request_headers and response_headers columns, along with an integer status_code. You can optionally parse or display this data in your TUI or logs to monitor header changes or track server responses over time.

Use Cases

1. Site Owners & Content Managers

Maintain a historical record of content changes over time.
Quickly identify any unapproved modifications or mistakes in text or layout.

2. Security Teams & Professionals

Monitor a public site for unexpected or malicious header changes or inserted scripts.
Diff after each deployment or scheduled check to confirm the site hasn’t been tampered with.
Detect defacement, backdoors, or suspicious header values if an attacker alters responses.

3. Testers & QA Engineers

Compare staging and production pages by capturing snapshots from each environment.
Confirm no undesired changes slipped into a new release, both in HTML and server headers.
Record each build’s output so you can see exactly what changed from one version to the next.

4. SEO & Marketing Analysts

Track how metadata, headings, or content changes might affect SEO.
Keep a historical log of keyword or content modifications.

Configuration & Customization

- Database Path: Defaults to snapshots.db in the current directory. Change it in main.go or via environment variables as desired.
- Crawl Depth & Concurrency: --max-depth plus internal concurrency settings let you control the scope and speed of crawling.
- Timeout: Each HTTP request uses a default timeout of about 15 seconds. Adjust it in capture.go if needed.
- Unified Diff: Snaptrack produces a standard unified diff. For color highlighting, make sure your terminal supports ANSI escape codes or integrate with external tools.

Contributing

We welcome contributions! Please:

Fork this repo & create a feature branch.
Submit a pull request when ready.
Open an issue to discuss features, request improvements, or report bugs.

License

Snaptrack is licensed under the MIT License. See the LICENSE file for more info.
