107 Commits

Author SHA1 Message Date
f2df64479c Fix S3 versioning (live-object VersionId, DM PUT/DELETE), harden DeleteObjects/ListObjects conformance, and run hot paths on blocking threads 2026-04-23 22:40:38 +08:00
bd405cc2fe Fix S3 versioning/delete markers, path-safety leaks, and error-code conformance; parallelize DeleteObjects; restore per-op rate limits 2026-04-23 20:23:11 +08:00
7ef3820f6e Fix SigV4/SHA256/TCP_NODELAY critical paths; tighten multipart, copy, versioning, and S3 error conformance 2026-04-23 17:52:30 +08:00
e1fb225034 csrf fixes 2026-04-22 23:01:32 +08:00
2767e7e79d Optimize bucket listing for 10K-100K objects
- Shallow listing: read per-directory _index.json once for eTags instead
  of N serial .meta.json reads. Validate prefix for path traversal and
  verify normalized target stays within bucket root.
- Recursive listing: cache full per-directory index during the walk so
  each _index.json is parsed at most once per call.
- Per-bucket listing cache with 5s TTL and per-bucket rebuild mutex.
  Invalidated on put/delete/copy/metadata/tags/multipart-complete.
  Pagination uses partition_point for O(log n) start lookup.
- UI stream endpoint now actually streams via mpsc + Body::from_stream
  instead of buffering into a Vec<String>. Cancels producer on client
  disconnect.
- UI JSON endpoint honors delimiter=/ and returns common_prefixes.
- run_blocking wrapper dispatches sync filesystem work via
  block_in_place on multi-threaded runtimes, falls back to inline on
  current-thread runtimes (unit tests).
2026-04-22 19:55:44 +08:00
217af6d1c6 Full migration and transition to Rust; Remove python artifacts 2026-04-22 17:19:19 +08:00
51d54b42ac Rust fixes 2026-04-22 15:41:18 +08:00
9ec5797919 Applied max-keys to combined current + archived ListObjectVersions output and reports truncation 2026-04-22 00:12:22 +08:00
8935188c8f Update static website 404 page 2026-04-21 21:26:50 +08:00
c77c592832 Update static website to include proper error handling; add missing features 2026-04-21 20:54:00 +08:00
501d563df2 Add missing features - notifications, object lock, acl 2026-04-21 00:27:50 +08:00
ddcdb4026c Fix domains mapping missing 2026-04-20 22:02:05 +08:00
3e7c0af019 Fix dockerfile issue 2026-04-20 21:39:06 +08:00
476b9bd2e4 First porting of Python to Rust - update docs and bug fixes 2026-04-20 21:27:02 +08:00
c2ef37b84e Separate Python and Rust into python/ and rust/ with per-stack Dockerfiles 2026-04-19 14:01:05 +08:00
be8e030940 Migrate more Python functions to Rust 2026-04-19 13:53:55 +08:00
ad7b2a02cb Add missing endpoints for Rust S3 API 2026-04-05 15:22:24 +08:00
72ddd9822c Add docker support for rust integration 2026-04-03 12:31:11 +08:00
4c30efd802 Update myfsio rust engines - added more implementations 2026-04-02 21:57:16 +08:00
926a7e6366 Add Rust storage engine foundation 2026-04-02 17:00:58 +08:00
1eadc7b75c Fix more-actions dropdown positioning: use Popper fixed strategy instead of raw CSS position:fixed 2026-04-01 16:24:42 +08:00
4a224a127b Fix more-actions dropdown triggering row selection on object list 2026-04-01 16:17:29 +08:00
c498fe7aee Add self-heal missing ETags and harden ETag index persistence 2026-03-31 21:10:47 +08:00
3838aed954 Fix presigned URL security vulnerabilities: enforce key/user status in SigV4 paths, remove duplicate verification, remove X-Forwarded-Host trust 2026-03-31 20:27:18 +08:00
6a193dbb1c Add --version option for run.py 2026-03-31 17:21:33 +08:00
e94b341a5b Add robust myfsio_core staleness detection with Python fallback; document Rust extension build in README 2026-03-31 17:13:05 +08:00
2ad3736852 Add intra-bucket cursor tracking to integrity scanner for progressive full coverage; Optimize integrity scanner: early batch exit, lazy sorted walk, cursor-aware index reads 2026-03-31 17:04:28 +08:00
f05b2668c0 Reduce per-request overhead: pre-compile SigV4 regex, in-memory etag index cache, 1MB GET chunks, configurable meta cache, skip fsync for rebuildable caches 2026-03-25 13:44:34 +08:00
f7c1c1f809 Update requirements.txt 2026-03-25 13:26:42 +08:00
0e392e18b4 Hide ghost details in object panel when preview fails to load 2026-03-24 15:15:03 +08:00
8996f1ce06 Fix folder selection not showing delete button in bucket browser 2026-03-24 12:10:38 +08:00
f60dbaf9c9 Respect DISPLAY_TIMEZONE in GC and integrity scanner history tables 2026-03-23 18:36:13 +08:00
1a5a7aa9e1 Auto-refresh Recent Scans/Executions tables after GC and integrity scan completion 2026-03-23 18:31:13 +08:00
326367ae4c Fix integrity scanner batch limit and add cursor-based rotation 2026-03-23 17:46:27 +08:00
a7f9b0a22f Convert GC to async with polling to prevent proxy timeouts 2026-03-23 17:14:04 +08:00
0e525713b1 Fix missing CSRF token on presigned URL request 2026-03-23 16:48:25 +08:00
f43fad02fb Replace fetch with XHR for multipart upload progress and add retry logic 2026-03-23 16:27:28 +08:00
eff3e378f3 Fix mobile infinite scroll on object list and ghost preview on fast object swap 2026-03-23 11:55:46 +08:00
5e32cef792 Add I/O throttling to GC and integrity scanner to prevent HDD starvation 2026-03-23 11:36:38 +08:00
9898167f8d Make integrity scan async with progress indicator in UI 2026-03-22 14:17:43 +08:00
4a553555d3 Clean up debug code 2026-03-22 11:38:29 +08:00
7a3202c996 Possible fix for the issue 2026-03-22 11:27:52 +08:00
bd20ca86ab Further debugging on s3 api issues on Granian 2026-03-22 11:22:24 +08:00
532cf95d59 Debug s3 api issues on Granian 2026-03-22 11:14:32 +08:00
366f8ce60d the middleware now also triggers when Content-Length is '0' but X-Amz-Decoded-Content-Length or aws-chunked headers indicate a body should be present 2026-03-22 00:24:04 +08:00
7612cb054a further fixes 2026-03-22 00:16:30 +08:00
966d524dca Fix 0-byte uploads caused by Granian stripping Expect header and missing CONTENT_LENGTH for chunked transfers 2026-03-22 00:04:55 +08:00
e84f1f1851 Fix SigV4 SignatureDoesNotMatch when Expect header is stripped by WSGI server 2026-03-21 23:48:19 +08:00
a059f0502d Fix 0-byte uploads caused by Granian default buffer size; Add SERVER_MAX_BUFFER_SIZE config 2026-03-21 22:57:48 +08:00
afd7173ba0 Fix buttons all showing Running state when only one action is triggered 2026-03-21 14:51:43 +08:00
c807bb2388 Update install/uninstall scripts for encrypted IAM config 2026-03-20 17:51:00 +08:00
aa4f9f5566 Bypass boto3 proxy for object streaming, read directly from storage layer; Add streaming object iterator to eliminate O(n²) directory rescanning on large buckets; Add iter_objects_shallow delegation to EncryptedObjectStorage 2026-03-20 17:35:10 +08:00
14786151e5 Fix selected object losing highlight on scroll in virtual list 2026-03-20 12:10:26 +08:00
a496862902 Fix stale object count on dashboard after deleting all objects in bucket 2026-03-17 23:25:30 +08:00
df4f27ca2e Fix IAM policy editor injecting prefix on existing policies without one 2026-03-15 16:04:35 +08:00
d72e0a347e Overhaul IAM: granular actions, multi-key users, prefix-scoped policies 2026-03-14 23:50:44 +08:00
6ed4b7d8ea Add System page: server info, feature flags, GC and integrity scanner UI 2026-03-14 20:27:57 +08:00
31ebbea680 Fix Docker healthcheck failure: Granian cannot run inside daemon process 2026-03-14 18:31:12 +08:00
d878134ebf Switch from Waitress to Granian (Rust/hyper WSGI server) for improved concurrency 2026-03-14 18:17:39 +08:00
55568d6892 Fix video seekbar in static website hosting by adding HTTP Range request support 2026-03-10 22:21:55 +08:00
a4ae81c77c Add integrity scanner: background detection and healing of corrupted objects, orphaned files, phantom metadata, stale versions, etag cache inconsistencies, and legacy metadata drift 2026-03-10 22:14:39 +08:00
9da7104887 Redesign tags UI: split pills, grid editor with column headers, ghost delete buttons 2026-03-10 17:48:17 +08:00
de5377e5ac Add garbage collection: background cleanup of orphaned temp files, multipart uploads, lock files, metadata, versions, and empty directories 2026-03-09 17:34:21 +08:00
80b77b64eb Fix bucket dashboard missing created date and incorrect object count badge in folder view 2026-03-09 15:27:08 +08:00
6c912a3d71 Add conditional GET/HEAD headers: If-Match, If-None-Match, If-Modified-Since, If-Unmodified-Since 2026-03-09 15:09:15 +08:00
c6e368324a Update docs.md and docs.html for credential expiry, IAM encryption, admin key env vars, and --reset-cred 2026-03-08 13:38:44 +08:00
7b6c096bb7 Remove the check out the documentation paragraph at login page 2026-03-08 13:18:03 +08:00
03353a0aec Add credential expiry support: per-user expires_at with UI management, presets, and badge indicators; Fix IAM card dropdown clipped by overflow: remove gradient bar, allow overflow visible 2026-03-08 13:08:57 +08:00
72f5d9d70c Restore data integrity guarantees: Content-MD5 validation, fsync durability, atomic metadata writes, concurrent write protection 2026-03-07 17:54:00 +08:00
be63e27c15 Reduce per-request CPU overhead: eliminate double stat(), cache content type and policy context, gate logging, configurable stat intervals 2026-03-07 14:08:23 +08:00
81ef0fe4c7 Fix stale object count in bucket header and metrics dashboard after deletes 2026-03-03 19:42:37 +08:00
5f24bd920d Reduce P99 tail latency: defer etag index writes, eliminate double cache rebuild, skip redundant stat() in bucket config 2026-03-02 22:39:37 +08:00
8552f193de Reduce CPU/lock contention under concurrent uploads: split cache lock, in-memory stats, dict copy, lightweight request IDs, defaultdict metrics 2026-03-02 22:05:54 +08:00
5536330aeb Move performance-critical Python functions to Rust: streaming I/O, multipart assembly, and AES-256-GCM encryption 2026-02-27 22:55:20 +08:00
d4657c389d Fix misleading default credentials in README to match actual random generation behavior 2026-02-27 21:58:10 +08:00
3827235232 Reduce CPU usage on heavy uploads: skip SHA256 body hashing in SigV4, use Rust md5_file post-write instead of per-chunk _HashingReader 2026-02-27 21:57:13 +08:00
dfc0058d0d Extend myfsio_core Rust extension with 7 storage hot paths (directory scanning, metadata I/O, object listing, search, bucket stats, cache building) 2026-02-27 12:22:39 +08:00
27aef84311 Fix rclone CopyObject SignatureDoesNotMatch caused by internal metadata leaking as X-Amz-Meta headers 2026-02-26 21:39:43 +08:00
5003514a3d Fix null ETags in shallow listing by updating etag index on store/delete 2026-02-26 18:09:08 +08:00
20a314e030 Fix incorrect Upgrading & Updates section in Docs 2026-02-26 17:49:59 +08:00
d8232340c3 Update docs 2026-02-26 17:38:44 +08:00
a356bb0c4e perf: shallow listing, os.scandir stats, server-side search for large buckets 2026-02-26 17:11:07 +08:00
1c328ee3af Fix list performance for large buckets: delimiter-aware shallow listing, cache TTL increase, UI delimiter streaming. header badge shows total bucket objects, fix status bar text concatenation 2026-02-26 16:29:28 +08:00
5bf7962c04 Fix UI: versioning modals and object browser panel showing 'null' 2026-02-24 20:41:39 +08:00
e06f653606 Fix version panel showing 'null' instead of timestamp, exclude current version from list, auto-refresh versions after upload 2026-02-24 17:19:12 +08:00
9c2809c195 Backwards compatibility for Proxy trust config 2026-02-22 18:03:38 +08:00
fb32ca0a7d Harden security: fail-closed policies, presigned URL time/expiry validation, SSRF DNS pinning, lockout cap, proxy trust config 2026-02-22 17:55:40 +08:00
6ab702a818 Use cached etag in HEAD instead of re-hashing entire file 2026-02-22 16:01:46 +08:00
550e7d435c Move SigV4 canonical request construction to Rust unified verify function 2026-02-22 14:03:12 +08:00
776967e80d Add Rust index reader, metadata read cache, and 256KB stream chunks 2026-02-19 23:01:40 +08:00
082a7fbcd1 Move index JSON read to Rust for GIL-released parsing (serde_json) 2026-02-19 22:43:28 +08:00
ff287cf67b Improve Sites page UI/UX: dropdown actions, collapsible forms, AJAX submissions, Check All Health, safer selectors 2026-02-16 22:04:46 +08:00
bddf36d52d Fix domain mapping cross-process staleness, filter bucket dropdown to website-enabled only 2026-02-16 17:48:21 +08:00
cf6cec9cab Add 5 missing S3 API operations: DeleteBucketEncryption, GetObjectAcl, PutObjectAcl, GetObjectAttributes, GetBucketPolicyStatus 2026-02-16 16:41:27 +08:00
d425839e57 Remove Rust build artifacts from tracking, update .gitignore 2026-02-16 16:06:42 +08:00
4c661477d5 Add Rust extension module (myfsio_core) for SigV4, hashing, and validation hot paths 2026-02-16 16:04:15 +08:00
f3f52f14a5 Fix domain mapping bugs and improve UI/UX: normalize domains, fix delete, add validation and search 2026-02-16 00:51:19 +08:00
d19ba3e305 UI/UX enhancements to IAM page: role badges, search, copy keys, improved policy display 2026-02-16 00:40:04 +08:00
c627f41f53 UI/UX enhancements to Metrics page 2026-02-15 23:56:18 +08:00
bcad0cd3da Improve web UI: sort/search/context menu, fix security and UX bugs 2026-02-15 23:30:26 +08:00
67f057ca1c Add static website hosting 2026-02-15 20:57:02 +08:00
01e79e6993 Fix object browser UI issues 2026-02-10 11:41:02 +08:00
1e3c4b545f Migrate UI backend from direct storage calls to S3 API proxy via boto3 2026-02-09 22:33:47 +08:00
4ecd32a554 Fix empty UI on large bucket first load: keep loading row during streaming, add progress indicator, throttle renders 2026-02-09 19:29:50 +08:00
aa6d7c4d28 Optimize replication failure caching, batch UI auth checks, add bulk download size limit, background parent cleanup 2026-02-09 18:23:45 +08:00
6e6d6d32bf Optimize KMS: cache AESGCM instance, remove duplicate get_provider 2026-02-09 17:01:19 +08:00
54705ab9c4 Fix Content-Length mismatch on range requests (206 Partial Content) 2026-02-06 16:14:35 +08:00
161 changed files with 49994 additions and 28195 deletions
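
Commit 2767e7e79d above describes a per-bucket listing cache with a 5s TTL and pagination that finds its start position in O(log n). The sketch below illustrates that idea in Python rather than in the project's Rust. `ListingCache`, `page`, and the injected clock are hypothetical names, `bisect_right` stands in for Rust's `partition_point`, and the per-bucket rebuild mutex from the commit message is left out. It is an illustration under those assumptions, not the project's implementation.

```python
import bisect
import time
from typing import Callable, Dict, List, Optional, Tuple


class ListingCache:
    """Per-bucket cache of sorted object keys with a time-to-live.

    Hypothetical illustration only; names and structure are assumptions.
    """

    def __init__(self, ttl_seconds: float = 5.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        # bucket -> (time the entry was built, sorted keys)
        self._entries: Dict[str, Tuple[float, List[str]]] = {}

    def get(self, bucket: str, rebuild: Callable[[], List[str]]) -> List[str]:
        """Return cached keys, rebuilding when the entry is missing or expired."""
        now = self._clock()
        entry = self._entries.get(bucket)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        keys = sorted(rebuild())
        self._entries[bucket] = (now, keys)
        return keys

    def invalidate(self, bucket: str) -> None:
        """Drop the entry, for example after a put, delete, or copy."""
        self._entries.pop(bucket, None)


def page(keys: List[str], start_after: Optional[str],
         max_keys: int) -> Tuple[List[str], bool]:
    """Return one page of keys strictly after start_after, plus a truncation flag."""
    # bisect_right finds the first index whose key is > start_after in O(log n).
    start = 0 if start_after is None else bisect.bisect_right(keys, start_after)
    end = start + max_keys
    return keys[start:end], end < len(keys)
```

Because the cached list is kept sorted, each page costs one binary search plus a slice, regardless of how deep into the bucket the continuation point is.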


@@ -1,13 +1,9 @@
 .git
 .gitignore
-.venv
-__pycache__
-*.pyc
-*.pyo
-*.pyd
-.pytest_cache
-.coverage
-htmlcov
 logs
 data
 tmp
+target
+crates/*/tests
+Dockerfile
+.dockerignore

3
.gitignore vendored

@@ -26,6 +26,9 @@ dist/
 *.egg-info/
 .eggs/
+# Rust engine build artifacts
+target/
 # Local runtime artifacts
 logs/
 *.log

5252
Cargo.lock generated Normal file

File diff suppressed because it is too large

62
Cargo.toml Normal file

@@ -0,0 +1,62 @@
+[workspace]
+resolver = "2"
+members = [
+    "crates/myfsio-common",
+    "crates/myfsio-auth",
+    "crates/myfsio-crypto",
+    "crates/myfsio-storage",
+    "crates/myfsio-xml",
+    "crates/myfsio-server",
+]
+[workspace.package]
+version = "0.4.4"
+edition = "2021"
+[workspace.dependencies]
+tokio = { version = "1", features = ["full"] }
+axum = { version = "0.8" }
+tower = { version = "0.5" }
+tower-http = { version = "0.6", features = ["cors", "trace", "fs", "compression-gzip", "timeout"] }
+hyper = { version = "1" }
+bytes = "1"
+serde = { version = "1", features = ["derive"] }
+serde_json = "1"
+quick-xml = { version = "0.37", features = ["serialize"] }
+hmac = "0.12"
+sha2 = "0.10"
+md-5 = "0.10"
+hex = "0.4"
+aes = "0.8"
+aes-gcm = "0.10"
+cbc = { version = "0.1", features = ["alloc"] }
+hkdf = "0.12"
+uuid = { version = "1", features = ["v4"] }
+parking_lot = "0.12"
+lru = "0.14"
+percent-encoding = "2"
+regex = "1"
+unicode-normalization = "0.1"
+tracing = "0.1"
+tracing-subscriber = { version = "0.3", features = ["env-filter"] }
+thiserror = "2"
+chrono = { version = "0.4", features = ["serde"] }
+base64 = "0.22"
+tokio-util = { version = "0.7", features = ["io"] }
+tokio-stream = "0.1"
+futures = "0.3"
+dashmap = "6"
+crc32fast = "1"
+duckdb = { version = "1", features = ["bundled"] }
+reqwest = { version = "0.12", default-features = false, features = ["stream", "rustls-tls", "json"] }
+aws-sdk-s3 = { version = "1", features = ["behavior-version-latest", "rt-tokio"] }
+aws-config = { version = "1", features = ["behavior-version-latest"] }
+aws-credential-types = "1"
+aws-smithy-runtime-api = "1"
+aws-smithy-types = "1"
+async-trait = "0.1"
+tera = "1"
+cookie = "0.18"
+subtle = "2"
+clap = { version = "4", features = ["derive"] }
+dotenvy = "0.15"


@@ -1,33 +1,50 @@
-FROM python:3.14.3-slim
+FROM rust:1-slim-bookworm AS builder
-ENV PYTHONDONTWRITEBYTECODE=1 \
-    PYTHONUNBUFFERED=1
+WORKDIR /build
+RUN apt-get update \
+    && apt-get install -y --no-install-recommends build-essential pkg-config libssl-dev \
+    && rm -rf /var/lib/apt/lists/*
+COPY Cargo.toml Cargo.lock ./
+COPY crates ./crates
+RUN cargo build --release --bin myfsio-server \
+    && strip target/release/myfsio-server
+FROM debian:bookworm-slim
 WORKDIR /app
 RUN apt-get update \
-    && apt-get install -y --no-install-recommends build-essential \
-    && rm -rf /var/lib/apt/lists/*
-COPY requirements.txt ./
-RUN pip install --no-cache-dir -r requirements.txt
-COPY . .
-RUN chmod +x docker-entrypoint.sh
-RUN mkdir -p /app/data \
+    && apt-get install -y --no-install-recommends ca-certificates curl \
+    && rm -rf /var/lib/apt/lists/* \
+    && mkdir -p /app/data \
     && useradd -m -u 1000 myfsio \
     && chown -R myfsio:myfsio /app
+COPY --from=builder /build/target/release/myfsio-server /usr/local/bin/myfsio-server
+COPY --from=builder /build/crates/myfsio-server/templates /app/templates
+COPY --from=builder /build/crates/myfsio-server/static /app/static
+COPY docker-entrypoint.sh /app/docker-entrypoint.sh
+RUN chmod +x /app/docker-entrypoint.sh \
+    && chown -R myfsio:myfsio /app
 USER myfsio
-EXPOSE 5000 5100
-ENV APP_HOST=0.0.0.0 \
-    FLASK_ENV=production \
-    FLASK_DEBUG=0
+EXPOSE 5000
+EXPOSE 5100
+ENV HOST=0.0.0.0 \
+    PORT=5000 \
+    UI_PORT=5100 \
+    STORAGE_ROOT=/app/data \
+    TEMPLATES_DIR=/app/templates \
+    STATIC_DIR=/app/static \
+    RUST_LOG=info
 HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
-    CMD python -c "import requests; requests.get('http://localhost:5000/myfsio/health', timeout=2)"
+    CMD curl -fsS "http://localhost:${PORT}/myfsio/health" || exit 1
-CMD ["./docker-entrypoint.sh"]
+CMD ["/app/docker-entrypoint.sh"]
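
The new `HEALTHCHECK` marks the container unhealthy when `curl -fsS` cannot reach `/myfsio/health` or receives an HTTP error status. For readers who want the same probe from a script, here is a minimal sketch using only the Python standard library. The function names are hypothetical, the `HOST` and `PORT` defaults are taken from the README in this change, and the check on the `status` field assumes the JSON body that README documents. It illustrates the probe logic and is not part of the image.

```python
import json
import os
import urllib.error
import urllib.request
from typing import Mapping


def health_url(env: Mapping[str, str] = os.environ) -> str:
    """Build the health URL from HOST and PORT, using the README defaults."""
    host = env.get("HOST", "127.0.0.1")
    port = env.get("PORT", "5000")
    return f"http://{host}:{port}/myfsio/health"


def is_healthy(body: bytes) -> bool:
    """Return True when the body is a JSON object whose status is "ok"."""
    try:
        payload = json.loads(body)
    except ValueError:
        # Covers malformed JSON and undecodable bytes.
        return False
    return isinstance(payload, dict) and payload.get("status") == "ok"


def probe(url: str, timeout: float = 2.0) -> bool:
    """Fetch the endpoint; any network or HTTP error counts as unhealthy."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200 and is_healthy(response.read())
    except (urllib.error.URLError, OSError):
        return False
```

A caller would typically exit with status 0 when `probe(health_url())` returns true and non-zero otherwise, which mirrors the `|| exit 1` in the Dockerfile.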

379
README.md

@@ -1,250 +1,205 @@
 # MyFSIO
-A lightweight, S3-compatible object storage system built with Flask. MyFSIO implements core AWS S3 REST API operations with filesystem-backed storage, making it ideal for local development, testing, and self-hosted storage scenarios.
+MyFSIO is an S3-compatible object storage server with a Rust runtime and a filesystem-backed storage engine. The repository root is the Cargo workspace; the server serves both the S3 API and the built-in web UI from a single process.
 ## Features
-**Core Storage**
-- S3-compatible REST API with AWS Signature Version 4 authentication
-- Bucket and object CRUD operations
-- Object versioning with version history
-- Multipart uploads for large files
-- Presigned URLs (1 second to 7 days validity)
+- S3-compatible REST API with Signature Version 4 authentication
+- Browser UI for buckets, objects, IAM users, policies, replication, metrics, and site administration
+- Filesystem-backed storage rooted at `data/`
+- Bucket versioning, multipart uploads, presigned URLs, CORS, object and bucket tagging
+- Server-side encryption and built-in KMS support
+- Optional background services for lifecycle, garbage collection, integrity scanning, operation metrics, and system metrics history
+- Replication, site sync, and static website hosting support
-**Security & Access Control**
-- IAM users with access key management and rotation
-- Bucket policies (AWS Policy Version 2012-10-17)
-- Server-side encryption (SSE-S3 and SSE-KMS)
-- Built-in Key Management Service (KMS)
-- Rate limiting per endpoint
+## Runtime Model
-**Advanced Features**
-- Cross-bucket replication to remote S3-compatible endpoints
-- Hot-reload for bucket policies (no restart required)
-- CORS configuration per bucket
+MyFSIO now runs as one Rust process:
-**Management UI**
-- Web console for bucket and object management
-- IAM dashboard for user administration
-- Inline JSON policy editor with presets
-- Object browser with folder navigation and bulk operations
-- Dark mode support
+- API listener on `HOST` + `PORT` (default `127.0.0.1:5000`)
+- UI listener on `HOST` + `UI_PORT` (default `127.0.0.1:5100`)
+- Shared state for storage, IAM, policies, sessions, metrics, and background workers
-## Architecture
-```
-+------------------+ +------------------+
-| API Server | | UI Server |
-| (port 5000) | | (port 5100) |
-| | | |
-| - S3 REST API |<------->| - Web Console |
-| - SigV4 Auth | | - IAM Dashboard |
-| - Presign URLs | | - Bucket Editor |
-+--------+---------+ +------------------+
-|
-v
-+------------------+ +------------------+
-| Object Storage | | System Metadata |
-| (filesystem) | | (.myfsio.sys/) |
-| | | |
-| data/<bucket>/ | | - IAM config |
-| <objects> | | - Bucket policies|
-| | | - Encryption keys|
-+------------------+ +------------------+
-```
+If you want API-only mode, set `UI_ENABLED=false`. There is no separate "UI-only" runtime anymore.
 ## Quick Start
+From the repository root:
 ```bash
-# Clone and setup
-git clone https://gitea.jzwsite.com/kqjy/MyFSIO
-cd s3
-python -m venv .venv
-# Activate virtual environment
-# Windows PowerShell:
-.\.venv\Scripts\Activate.ps1
-# Windows CMD:
-.venv\Scripts\activate.bat
-# Linux/macOS:
-source .venv/bin/activate
-# Install dependencies
-pip install -r requirements.txt
-# Start both servers
-python run.py
-# Or start individually
-python run.py --mode api # API only (port 5000)
-python run.py --mode ui # UI only (port 5100)
+cargo run -p myfsio-server --
 ```
-**Default Credentials:** `localadmin` / `localadmin`
+Useful URLs:
-- **Web Console:** http://127.0.0.1:5100/ui
-- **API Endpoint:** http://127.0.0.1:5000
+- UI: `http://127.0.0.1:5100/ui`
+- API: `http://127.0.0.1:5000/`
+- Health: `http://127.0.0.1:5000/myfsio/health`
+On first boot, MyFSIO creates `data/.myfsio.sys/config/iam.json` and prints the generated admin access key and secret key to the console.
+### Common CLI commands
+```bash
+# Show resolved configuration
+cargo run -p myfsio-server -- --show-config
+# Validate configuration and exit non-zero on critical issues
+cargo run -p myfsio-server -- --check-config
+# Reset admin credentials
+cargo run -p myfsio-server -- --reset-cred
+# API only
+UI_ENABLED=false cargo run -p myfsio-server --
+```
+## Building a Binary
+```bash
+cargo build --release -p myfsio-server
+```
+Binary locations:
+- Linux/macOS: `target/release/myfsio-server`
+- Windows: `target/release/myfsio-server.exe`
+Run the built binary directly:
+```bash
+./target/release/myfsio-server
+```
 ## Configuration
+The server reads environment variables from the process environment and also loads, when present:
+- `/opt/myfsio/myfsio.env`
+- `.env`
+- `myfsio.env`
+Core settings:
 | Variable | Default | Description |
-|----------|---------|-------------|
-| `STORAGE_ROOT` | `./data` | Filesystem root for bucket storage |
-| `IAM_CONFIG` | `.myfsio.sys/config/iam.json` | IAM user and policy store |
-| `BUCKET_POLICY_PATH` | `.myfsio.sys/config/bucket_policies.json` | Bucket policy store |
-| `API_BASE_URL` | `http://127.0.0.1:5000` | API endpoint for UI calls |
-| `MAX_UPLOAD_SIZE` | `1073741824` | Maximum upload size in bytes (1 GB) |
-| `MULTIPART_MIN_PART_SIZE` | `5242880` | Minimum multipart part size (5 MB) |
-| `UI_PAGE_SIZE` | `100` | Default page size for listings |
-| `SECRET_KEY` | `dev-secret-key` | Flask session secret |
-| `AWS_REGION` | `us-east-1` | Region for SigV4 signing |
-| `AWS_SERVICE` | `s3` | Service name for SigV4 signing |
-| `ENCRYPTION_ENABLED` | `false` | Enable server-side encryption |
-| `KMS_ENABLED` | `false` | Enable Key Management Service |
-| `LOG_LEVEL` | `INFO` | Logging verbosity |
-| `SIGV4_TIMESTAMP_TOLERANCE_SECONDS` | `900` | Max time skew for SigV4 requests |
-| `PRESIGNED_URL_MAX_EXPIRY_SECONDS` | `604800` | Max presigned URL expiry (7 days) |
-| `REPLICATION_CONNECT_TIMEOUT_SECONDS` | `5` | Replication connection timeout |
-| `SITE_SYNC_ENABLED` | `false` | Enable bi-directional site sync |
-| `OBJECT_TAG_LIMIT` | `50` | Maximum tags per object |
+| --- | --- | --- |
+| `HOST` | `127.0.0.1` | Bind address for API and UI listeners |
+| `PORT` | `5000` | API port |
+| `UI_PORT` | `5100` | UI port |
+| `UI_ENABLED` | `true` | Disable to run API-only |
+| `STORAGE_ROOT` | `./data` | Root directory for buckets and system metadata |
+| `IAM_CONFIG` | `<STORAGE_ROOT>/.myfsio.sys/config/iam.json` | IAM config path |
+| `API_BASE_URL` | unset | Public API base used by the UI and presigned URL generation |
+| `AWS_REGION` | `us-east-1` | Region used in SigV4 scope |
+| `SIGV4_TIMESTAMP_TOLERANCE_SECONDS` | `900` | Allowed request time skew |
+| `PRESIGNED_URL_MIN_EXPIRY_SECONDS` | `1` | Minimum presigned URL expiry |
+| `PRESIGNED_URL_MAX_EXPIRY_SECONDS` | `604800` | Maximum presigned URL expiry |
+| `SECRET_KEY` | loaded from `.myfsio.sys/config/.secret` if present | Session signing key and IAM-at-rest encryption key |
+| `ADMIN_ACCESS_KEY` | unset | Optional first-run or reset access key |
+| `ADMIN_SECRET_KEY` | unset | Optional first-run or reset secret key |
+Feature toggles:
+| Variable | Default |
+| --- | --- |
+| `ENCRYPTION_ENABLED` | `false` |
+| `KMS_ENABLED` | `false` |
+| `GC_ENABLED` | `false` |
+| `INTEGRITY_ENABLED` | `false` |
+| `LIFECYCLE_ENABLED` | `false` |
+| `METRICS_HISTORY_ENABLED` | `false` |
+| `OPERATION_METRICS_ENABLED` | `false` |
+| `WEBSITE_HOSTING_ENABLED` | `false` |
+| `SITE_SYNC_ENABLED` | `false` |
+Metrics and replication tuning:
+| Variable | Default |
+| --- | --- |
+| `OPERATION_METRICS_INTERVAL_MINUTES` | `5` |
+| `OPERATION_METRICS_RETENTION_HOURS` | `24` |
+| `METRICS_HISTORY_INTERVAL_MINUTES` | `5` |
+| `METRICS_HISTORY_RETENTION_HOURS` | `24` |
+| `REPLICATION_CONNECT_TIMEOUT_SECONDS` | `5` |
+| `REPLICATION_READ_TIMEOUT_SECONDS` | `30` |
+| `REPLICATION_MAX_RETRIES` | `2` |
+| `REPLICATION_STREAMING_THRESHOLD_BYTES` | `10485760` |
+| `REPLICATION_MAX_FAILURES_PER_BUCKET` | `50` |
+| `SITE_SYNC_INTERVAL_SECONDS` | `60` |
+| `SITE_SYNC_BATCH_SIZE` | `100` |
+| `SITE_SYNC_CONNECT_TIMEOUT_SECONDS` | `10` |
+| `SITE_SYNC_READ_TIMEOUT_SECONDS` | `120` |
+| `SITE_SYNC_MAX_RETRIES` | `2` |
+| `SITE_SYNC_CLOCK_SKEW_TOLERANCE_SECONDS` | `1.0` |
+UI asset overrides:
+| Variable | Default |
+| --- | --- |
+| `TEMPLATES_DIR` | built-in crate templates directory |
+| `STATIC_DIR` | built-in crate static directory |
+See [docs.md](./docs.md) for the full Rust-side operations guide.
 ## Data Layout
-```
+```text
 data/
-├── <bucket>/ # User buckets with objects
-└── .myfsio.sys/ # System metadata
-    ├── config/
-    │   ├── iam.json # IAM users and policies
-    │   ├── bucket_policies.json # Bucket policies
-        ├── replication_rules.json
-        └── connections.json # Remote S3 connections
-    ├── buckets/<bucket>/
-    │   ├── meta/ # Object metadata (.meta.json)
-    │   ├── versions/ # Archived object versions
-        └── .bucket.json # Bucket config (versioning, CORS)
-    ├── multipart/ # Active multipart uploads
-    └── keys/ # Encryption keys (SSE-S3/KMS)
+  <bucket>/
+  .myfsio.sys/
+    config/
+      iam.json
+      bucket_policies.json
+      connections.json
+      operation_metrics.json
+      metrics_history.json
+    buckets/<bucket>/
+      meta/
+      versions/
+    multipart/
+    keys/
 ```
-## API Reference
-All endpoints require AWS Signature Version 4 authentication unless using presigned URLs or public bucket policies.
-### Bucket Operations
-| Method | Endpoint | Description |
-|--------|----------|-------------|
-| `GET` | `/` | List all buckets |
-| `PUT` | `/<bucket>` | Create bucket |
-| `DELETE` | `/<bucket>` | Delete bucket (must be empty) |
-| `HEAD` | `/<bucket>` | Check bucket exists |
-### Object Operations
-| Method | Endpoint | Description |
-|--------|----------|-------------|
-| `GET` | `/<bucket>` | List objects (supports `list-type=2`) |
-| `PUT` | `/<bucket>/<key>` | Upload object |
-| `GET` | `/<bucket>/<key>` | Download object |
-| `DELETE` | `/<bucket>/<key>` | Delete object |
-| `HEAD` | `/<bucket>/<key>` | Get object metadata |
-| `POST` | `/<bucket>/<key>?uploads` | Initiate multipart upload |
-| `PUT` | `/<bucket>/<key>?partNumber=N&uploadId=X` | Upload part |
-| `POST` | `/<bucket>/<key>?uploadId=X` | Complete multipart upload |
-| `DELETE` | `/<bucket>/<key>?uploadId=X` | Abort multipart upload |
-### Bucket Policies (S3-compatible)
-| Method | Endpoint | Description |
-|--------|----------|-------------|
-| `GET` | `/<bucket>?policy` | Get bucket policy |
-| `PUT` | `/<bucket>?policy` | Set bucket policy |
-| `DELETE` | `/<bucket>?policy` | Delete bucket policy |
-### Versioning
-| Method | Endpoint | Description |
-|--------|----------|-------------|
-| `GET` | `/<bucket>/<key>?versionId=X` | Get specific version |
-| `DELETE` | `/<bucket>/<key>?versionId=X` | Delete specific version |
-| `GET` | `/<bucket>?versions` | List object versions |
-### Health Check
-| Method | Endpoint | Description |
-|--------|----------|-------------|
-| `GET` | `/myfsio/health` | Health check endpoint |
-## IAM & Access Control
-### Users and Access Keys
-On first run, MyFSIO creates a default admin user (`localadmin`/`localadmin`). Use the IAM dashboard to:
-- Create and delete users
-- Generate and rotate access keys
-- Attach inline policies to users
-- Control IAM management permissions
-### Bucket Policies
-Bucket policies follow AWS policy grammar (Version `2012-10-17`) with support for:
-- Principal-based access (`*` for anonymous, specific users)
-- Action-based permissions (`s3:GetObject`, `s3:PutObject`, etc.)
-- Resource patterns (`arn:aws:s3:::bucket/*`)
-- Condition keys
-**Policy Presets:**
-- **Public:** Grants anonymous read access (`s3:GetObject`, `s3:ListBucket`)
-- **Private:** Removes bucket policy (IAM-only access)
-- **Custom:** Manual policy editing with draft preservation
-Policies hot-reload when the JSON file changes.
-## Server-Side Encryption
-MyFSIO supports two encryption modes:
-- **SSE-S3:** Server-managed keys with automatic key rotation
-- **SSE-KMS:** Customer-managed keys via built-in KMS
-Enable encryption with:
-```bash
-ENCRYPTION_ENABLED=true python run.py
-```
-## Cross-Bucket Replication
-Replicate objects to remote S3-compatible endpoints:
-1. Configure remote connections in the UI
-2. Create replication rules specifying source/destination
-3. Objects are automatically replicated on upload
 ## Docker
+Build the Rust image from the repository root:
 ```bash
 docker build -t myfsio .
-docker run -p 5000:5000 -p 5100:5100 -v ./data:/app/data myfsio
+docker run --rm -p 5000:5000 -p 5100:5100 -v "${PWD}/data:/app/data" myfsio
 ```
+If the instance sits behind a reverse proxy, set `API_BASE_URL` to the public S3 endpoint.
+## Linux Installation
+The repository includes `scripts/install.sh` for systemd-style Linux installs. Build the Rust binary first, then pass it to the installer:
+```bash
+cargo build --release -p myfsio-server
+sudo ./scripts/install.sh --binary ./target/release/myfsio-server
+```
+The installer copies the binary into `/opt/myfsio/myfsio`, writes `/opt/myfsio/myfsio.env`, and can register a `myfsio.service` unit.
 ## Testing
+Run the Rust test suite from the workspace:
 ```bash
-# Run all tests
-pytest tests/ -v
-# Run specific test file
-pytest tests/test_api.py -v
-# Run with coverage
-pytest tests/ --cov=app --cov-report=html
+cargo test
 ```
-## References
+## Health Check
-- [Amazon S3 Documentation](https://docs.aws.amazon.com/s3/)
-- [AWS Signature Version 4](https://docs.aws.amazon.com/general/latest/gr/signature-version-4.html)
-- [S3 Bucket Policy Examples](https://docs.aws.amazon.com/AmazonS3/latest/userguide/example-bucket-policies.html)
+`GET /myfsio/health` returns:
+```json
+{
+"status": "ok",
+"version": "0.5.0"
+}
+```
+The `version` field comes from the Rust crate version in `crates/myfsio-server/Cargo.toml`.
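
The new README's Configuration section says the server reads the process environment and also loads up to three env files when present, but it does not say which source wins on a conflict. The sketch below shows one plausible resolution order in Python. It is labeled as an assumption throughout: the process environment overrides the files, and an earlier file overrides a later one. The function names are hypothetical, the parser handles only simple `KEY=VALUE` lines, and the defaults are copied from the core settings table. The inclusive expiry bounds follow the "1 second to 7 days" wording of the old README.

```python
from typing import Dict, Iterable, Mapping

# Defaults copied from the README's core settings table.
DEFAULTS: Dict[str, str] = {
    "HOST": "127.0.0.1",
    "PORT": "5000",
    "UI_PORT": "5100",
    "UI_ENABLED": "true",
    "PRESIGNED_URL_MIN_EXPIRY_SECONDS": "1",
    "PRESIGNED_URL_MAX_EXPIRY_SECONDS": "604800",
}


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve(process_env: Mapping[str, str],
            file_texts: Iterable[str]) -> Dict[str, str]:
    """Merge defaults, env files, and the process environment.

    ASSUMPTION: the process environment wins, and an earlier file wins
    over a later one. The README does not state the precedence.
    """
    merged = dict(DEFAULTS)
    from_files: Dict[str, str] = {}
    for text in file_texts:
        for key, value in parse_env_text(text).items():
            # setdefault keeps the value from the first file that set the key.
            from_files.setdefault(key, value)
    merged.update(from_files)
    merged.update(process_env)
    return merged


def presign_expiry_allowed(settings: Mapping[str, str], expires: int) -> bool:
    """Check a requested presigned URL lifetime against the configured bounds."""
    low = int(settings["PRESIGNED_URL_MIN_EXPIRY_SECONDS"])
    high = int(settings["PRESIGNED_URL_MAX_EXPIRY_SECONDS"])
    return low <= expires <= high
```

The `--show-config` command listed under the README's CLI commands is the authoritative way to see what the server actually resolved; this sketch only illustrates the general layering idea.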


@@ -1,501 +0,0 @@
from __future__ import annotations
import logging
import shutil
import sys
import time
import uuid
from logging.handlers import RotatingFileHandler
from pathlib import Path
from datetime import timedelta
from typing import Any, Dict, List, Optional
from flask import Flask, g, has_request_context, redirect, render_template, request, url_for
from flask_cors import CORS
from flask_wtf.csrf import CSRFError
from werkzeug.middleware.proxy_fix import ProxyFix
from .access_logging import AccessLoggingService
from .operation_metrics import OperationMetricsCollector, classify_endpoint
from .compression import GzipMiddleware
from .acl import AclService
from .bucket_policies import BucketPolicyStore
from .config import AppConfig
from .connections import ConnectionStore
from .encryption import EncryptionManager
from .extensions import limiter, csrf
from .iam import IamService
from .kms import KMSManager
from .lifecycle import LifecycleManager
from .notifications import NotificationService
from .object_lock import ObjectLockService
from .replication import ReplicationManager
from .secret_store import EphemeralSecretStore
from .site_registry import SiteRegistry, SiteInfo
from .storage import ObjectStorage
from .version import get_version
def _migrate_config_file(active_path: Path, legacy_paths: List[Path]) -> Path:
    """Migrate config file from legacy locations to the active path.

    Checks each legacy path in order and moves the first one found to the active path.
    This ensures backward compatibility for users upgrading from older versions.
    """
    active_path.parent.mkdir(parents=True, exist_ok=True)
    if active_path.exists():
        return active_path
    for legacy_path in legacy_paths:
        if legacy_path.exists():
            try:
                shutil.move(str(legacy_path), str(active_path))
            except OSError:
                shutil.copy2(legacy_path, active_path)
                try:
                    legacy_path.unlink(missing_ok=True)
                except OSError:
                    pass
            break
    return active_path


def create_app(
test_config: Optional[Dict[str, Any]] = None,
*,
include_api: bool = True,
include_ui: bool = True,
) -> Flask:
"""Create and configure the Flask application."""
config = AppConfig.from_env(test_config)
if getattr(sys, "frozen", False):
project_root = Path(sys._MEIPASS)
else:
project_root = Path(__file__).resolve().parent.parent
app = Flask(
__name__,
static_folder=str(project_root / "static"),
template_folder=str(project_root / "templates"),
)
app.config.update(config.to_flask_config())
if test_config:
app.config.update(test_config)
app.config.setdefault("APP_VERSION", get_version())
app.permanent_session_lifetime = timedelta(days=int(app.config.get("SESSION_LIFETIME_DAYS", 30)))
if app.config.get("TESTING"):
app.config.setdefault("WTF_CSRF_ENABLED", False)
# Trust X-Forwarded-* headers from proxies
app.wsgi_app = ProxyFix(app.wsgi_app, x_for=1, x_proto=1, x_host=1, x_prefix=1)
# Enable gzip compression for responses (10-20x smaller JSON payloads)
if app.config.get("ENABLE_GZIP", True):
app.wsgi_app = GzipMiddleware(app.wsgi_app, compression_level=6)
_configure_cors(app)
_configure_logging(app)
limiter.init_app(app)
csrf.init_app(app)
storage = ObjectStorage(
Path(app.config["STORAGE_ROOT"]),
cache_ttl=app.config.get("OBJECT_CACHE_TTL", 5),
object_cache_max_size=app.config.get("OBJECT_CACHE_MAX_SIZE", 100),
bucket_config_cache_ttl=app.config.get("BUCKET_CONFIG_CACHE_TTL_SECONDS", 30.0),
object_key_max_length_bytes=app.config.get("OBJECT_KEY_MAX_LENGTH_BYTES", 1024),
)
if app.config.get("WARM_CACHE_ON_STARTUP", True) and not app.config.get("TESTING"):
storage.warm_cache_async()
iam = IamService(
Path(app.config["IAM_CONFIG"]),
auth_max_attempts=app.config.get("AUTH_MAX_ATTEMPTS", 5),
auth_lockout_minutes=app.config.get("AUTH_LOCKOUT_MINUTES", 15),
)
bucket_policies = BucketPolicyStore(Path(app.config["BUCKET_POLICY_PATH"]))
secret_store = EphemeralSecretStore(default_ttl=app.config.get("SECRET_TTL_SECONDS", 300))
storage_root = Path(app.config["STORAGE_ROOT"])
config_dir = storage_root / ".myfsio.sys" / "config"
config_dir.mkdir(parents=True, exist_ok=True)
connections_path = _migrate_config_file(
active_path=config_dir / "connections.json",
legacy_paths=[
storage_root / ".myfsio.sys" / "connections.json",
storage_root / ".connections.json",
],
)
replication_rules_path = _migrate_config_file(
active_path=config_dir / "replication_rules.json",
legacy_paths=[
storage_root / ".myfsio.sys" / "replication_rules.json",
storage_root / ".replication_rules.json",
],
)
connections = ConnectionStore(connections_path)
replication = ReplicationManager(
storage,
connections,
replication_rules_path,
storage_root,
connect_timeout=app.config.get("REPLICATION_CONNECT_TIMEOUT_SECONDS", 5),
read_timeout=app.config.get("REPLICATION_READ_TIMEOUT_SECONDS", 30),
max_retries=app.config.get("REPLICATION_MAX_RETRIES", 2),
streaming_threshold_bytes=app.config.get("REPLICATION_STREAMING_THRESHOLD_BYTES", 10 * 1024 * 1024),
max_failures_per_bucket=app.config.get("REPLICATION_MAX_FAILURES_PER_BUCKET", 50),
)
site_registry_path = config_dir / "site_registry.json"
site_registry = SiteRegistry(site_registry_path)
if app.config.get("SITE_ID") and not site_registry.get_local_site():
site_registry.set_local_site(SiteInfo(
site_id=app.config["SITE_ID"],
endpoint=app.config.get("SITE_ENDPOINT") or "",
region=app.config.get("SITE_REGION", "us-east-1"),
priority=app.config.get("SITE_PRIORITY", 100),
))
encryption_config = {
"encryption_enabled": app.config.get("ENCRYPTION_ENABLED", False),
"encryption_master_key_path": app.config.get("ENCRYPTION_MASTER_KEY_PATH"),
"default_encryption_algorithm": app.config.get("DEFAULT_ENCRYPTION_ALGORITHM", "AES256"),
"encryption_chunk_size_bytes": app.config.get("ENCRYPTION_CHUNK_SIZE_BYTES", 64 * 1024),
}
encryption_manager = EncryptionManager(encryption_config)
kms_manager = None
if app.config.get("KMS_ENABLED", False):
kms_keys_path = Path(app.config.get("KMS_KEYS_PATH", ""))
kms_master_key_path = Path(app.config.get("ENCRYPTION_MASTER_KEY_PATH", ""))
kms_manager = KMSManager(
kms_keys_path,
kms_master_key_path,
generate_data_key_min_bytes=app.config.get("KMS_GENERATE_DATA_KEY_MIN_BYTES", 1),
generate_data_key_max_bytes=app.config.get("KMS_GENERATE_DATA_KEY_MAX_BYTES", 1024),
)
encryption_manager.set_kms_provider(kms_manager)
if app.config.get("ENCRYPTION_ENABLED", False):
from .encrypted_storage import EncryptedObjectStorage
storage = EncryptedObjectStorage(storage, encryption_manager)
acl_service = AclService(storage_root)
object_lock_service = ObjectLockService(storage_root)
notification_service = NotificationService(
storage_root,
allow_internal_endpoints=app.config.get("ALLOW_INTERNAL_ENDPOINTS", False),
)
access_logging_service = AccessLoggingService(storage_root)
access_logging_service.set_storage(storage)
lifecycle_manager = None
if app.config.get("LIFECYCLE_ENABLED", False):
base_storage = storage.storage if hasattr(storage, 'storage') else storage
lifecycle_manager = LifecycleManager(
base_storage,
interval_seconds=app.config.get("LIFECYCLE_INTERVAL_SECONDS", 3600),
storage_root=storage_root,
max_history_per_bucket=app.config.get("LIFECYCLE_MAX_HISTORY_PER_BUCKET", 50),
)
lifecycle_manager.start()
app.extensions["object_storage"] = storage
app.extensions["iam"] = iam
app.extensions["bucket_policies"] = bucket_policies
app.extensions["secret_store"] = secret_store
app.extensions["limiter"] = limiter
app.extensions["connections"] = connections
app.extensions["replication"] = replication
app.extensions["encryption"] = encryption_manager
app.extensions["kms"] = kms_manager
app.extensions["acl"] = acl_service
app.extensions["lifecycle"] = lifecycle_manager
app.extensions["object_lock"] = object_lock_service
app.extensions["notifications"] = notification_service
app.extensions["access_logging"] = access_logging_service
app.extensions["site_registry"] = site_registry
operation_metrics_collector = None
if app.config.get("OPERATION_METRICS_ENABLED", False):
operation_metrics_collector = OperationMetricsCollector(
storage_root,
interval_minutes=app.config.get("OPERATION_METRICS_INTERVAL_MINUTES", 5),
retention_hours=app.config.get("OPERATION_METRICS_RETENTION_HOURS", 24),
)
app.extensions["operation_metrics"] = operation_metrics_collector
system_metrics_collector = None
if app.config.get("METRICS_HISTORY_ENABLED", False):
from .system_metrics import SystemMetricsCollector
system_metrics_collector = SystemMetricsCollector(
storage_root,
interval_minutes=app.config.get("METRICS_HISTORY_INTERVAL_MINUTES", 5),
retention_hours=app.config.get("METRICS_HISTORY_RETENTION_HOURS", 24),
)
system_metrics_collector.set_storage(storage)
app.extensions["system_metrics"] = system_metrics_collector
site_sync_worker = None
if app.config.get("SITE_SYNC_ENABLED", False):
from .site_sync import SiteSyncWorker
site_sync_worker = SiteSyncWorker(
storage=storage,
connections=connections,
replication_manager=replication,
storage_root=storage_root,
interval_seconds=app.config.get("SITE_SYNC_INTERVAL_SECONDS", 60),
batch_size=app.config.get("SITE_SYNC_BATCH_SIZE", 100),
connect_timeout=app.config.get("SITE_SYNC_CONNECT_TIMEOUT_SECONDS", 10),
read_timeout=app.config.get("SITE_SYNC_READ_TIMEOUT_SECONDS", 120),
max_retries=app.config.get("SITE_SYNC_MAX_RETRIES", 2),
clock_skew_tolerance_seconds=app.config.get("SITE_SYNC_CLOCK_SKEW_TOLERANCE_SECONDS", 1.0),
)
site_sync_worker.start()
app.extensions["site_sync"] = site_sync_worker
@app.errorhandler(500)
def internal_error(error):
wants_html = request.accept_mimetypes.accept_html
path = request.path or ""
if include_ui and wants_html and (path.startswith("/ui") or path == "/"):
return render_template('500.html'), 500
error_xml = (
'<?xml version="1.0" encoding="UTF-8"?>'
'<Error>'
'<Code>InternalError</Code>'
'<Message>An internal server error occurred</Message>'
f'<Resource>{path}</Resource>'
f'<RequestId>{getattr(g, "request_id", "-")}</RequestId>'
'</Error>'
)
return error_xml, 500, {'Content-Type': 'application/xml'}
@app.errorhandler(CSRFError)
def handle_csrf_error(e):
wants_html = request.accept_mimetypes.accept_html
path = request.path or ""
if include_ui and wants_html and (path.startswith("/ui") or path == "/"):
return render_template('csrf_error.html', reason=e.description), 400
error_xml = (
'<?xml version="1.0" encoding="UTF-8"?>'
'<Error>'
'<Code>CSRFError</Code>'
f'<Message>{e.description}</Message>'
f'<Resource>{path}</Resource>'
f'<RequestId>{getattr(g, "request_id", "-")}</RequestId>'
'</Error>'
)
return error_xml, 400, {'Content-Type': 'application/xml'}
@app.template_filter("filesizeformat")
def filesizeformat(value: int) -> str:
"""Format bytes as human-readable file size."""
for unit in ["B", "KB", "MB", "GB", "TB", "PB"]:
if abs(value) < 1024.0 or unit == "PB":
if unit == "B":
return f"{int(value)} {unit}"
return f"{value:.1f} {unit}"
value /= 1024.0
return f"{value:.1f} PB"
@app.template_filter("timestamp_to_datetime")
def timestamp_to_datetime(value: float) -> str:
"""Format Unix timestamp as human-readable datetime in configured timezone."""
from datetime import datetime, timezone as dt_timezone
from zoneinfo import ZoneInfo
if not value:
return "Never"
try:
dt_utc = datetime.fromtimestamp(value, dt_timezone.utc)
display_tz = app.config.get("DISPLAY_TIMEZONE", "UTC")
if display_tz and display_tz != "UTC":
try:
tz = ZoneInfo(display_tz)
dt_local = dt_utc.astimezone(tz)
return dt_local.strftime("%Y-%m-%d %H:%M:%S")
except (KeyError, ValueError):
pass
return dt_utc.strftime("%Y-%m-%d %H:%M:%S UTC")
except (ValueError, OSError):
return "Unknown"
@app.template_filter("format_datetime")
def format_datetime_filter(dt, include_tz: bool = True) -> str:
"""Format datetime object as human-readable string in configured timezone."""
from datetime import datetime, timezone as dt_timezone
from zoneinfo import ZoneInfo
if not dt:
return ""
try:
display_tz = app.config.get("DISPLAY_TIMEZONE", "UTC")
if display_tz and display_tz != "UTC":
try:
tz = ZoneInfo(display_tz)
if dt.tzinfo is None:
dt = dt.replace(tzinfo=dt_timezone.utc)
dt = dt.astimezone(tz)
except (KeyError, ValueError):
pass
tz_abbr = dt.strftime("%Z") or "UTC"
if include_tz:
return f"{dt.strftime('%b %d, %Y %H:%M')} ({tz_abbr})"
return dt.strftime("%b %d, %Y %H:%M")
except (ValueError, AttributeError):
return str(dt)
if include_api:
from .s3_api import s3_api_bp
from .kms_api import kms_api_bp
from .admin_api import admin_api_bp
app.register_blueprint(s3_api_bp)
app.register_blueprint(kms_api_bp)
app.register_blueprint(admin_api_bp)
csrf.exempt(s3_api_bp)
csrf.exempt(kms_api_bp)
csrf.exempt(admin_api_bp)
if include_ui:
from .ui import ui_bp
app.register_blueprint(ui_bp)
if not include_api:
@app.get("/")
def ui_root_redirect():
return redirect(url_for("ui.buckets_overview"))
@app.errorhandler(404)
def handle_not_found(error):
wants_html = request.accept_mimetypes.accept_html
path = request.path or ""
if include_ui and wants_html:
if not include_api or path.startswith("/ui") or path == "/":
return render_template("404.html"), 404
return error
@app.get("/myfsio/health")
def healthcheck() -> Dict[str, str]:
return {"status": "ok"}
return app
def create_api_app(test_config: Optional[Dict[str, Any]] = None) -> Flask:
    return create_app(test_config, include_api=True, include_ui=False)


def create_ui_app(test_config: Optional[Dict[str, Any]] = None) -> Flask:
    return create_app(test_config, include_api=False, include_ui=True)


def _configure_cors(app: Flask) -> None:
    origins = app.config.get("CORS_ORIGINS", ["*"])
    methods = app.config.get("CORS_METHODS", ["GET", "PUT", "POST", "DELETE", "OPTIONS", "HEAD"])
    allow_headers = app.config.get("CORS_ALLOW_HEADERS", ["*"])
    expose_headers = app.config.get("CORS_EXPOSE_HEADERS", ["*"])
    CORS(
        app,
        resources={r"/*": {"origins": origins, "methods": methods, "allow_headers": allow_headers, "expose_headers": expose_headers}},
        supports_credentials=True,
    )


class _RequestContextFilter(logging.Filter):
    """Inject request-specific attributes into log records."""

    def filter(self, record: logging.LogRecord) -> bool:
        if has_request_context():
            record.request_id = getattr(g, "request_id", "-")
            record.path = request.path
            record.method = request.method
            record.remote_addr = request.remote_addr or "-"
        else:
            record.request_id = getattr(record, "request_id", "-")
            record.path = getattr(record, "path", "-")
            record.method = getattr(record, "method", "-")
            record.remote_addr = getattr(record, "remote_addr", "-")
        return True


def _configure_logging(app: Flask) -> None:
formatter = logging.Formatter(
"%(asctime)s | %(levelname)s | %(request_id)s | %(method)s %(path)s | %(message)s"
)
stream_handler = logging.StreamHandler(sys.stdout)
stream_handler.setFormatter(formatter)
stream_handler.addFilter(_RequestContextFilter())
logger = app.logger
for handler in logger.handlers[:]:
handler.close()
logger.handlers.clear()
logger.addHandler(stream_handler)
if app.config.get("LOG_TO_FILE"):
log_file = Path(app.config["LOG_FILE"])
log_file.parent.mkdir(parents=True, exist_ok=True)
file_handler = RotatingFileHandler(
log_file,
maxBytes=int(app.config.get("LOG_MAX_BYTES", 5 * 1024 * 1024)),
backupCount=int(app.config.get("LOG_BACKUP_COUNT", 3)),
encoding="utf-8",
)
file_handler.setFormatter(formatter)
file_handler.addFilter(_RequestContextFilter())
logger.addHandler(file_handler)
logger.setLevel(getattr(logging, app.config.get("LOG_LEVEL", "INFO"), logging.INFO))
@app.before_request
def _log_request_start() -> None:
g.request_id = uuid.uuid4().hex
g.request_started_at = time.perf_counter()
g.request_bytes_in = request.content_length or 0
app.logger.info(
"Request started",
extra={"path": request.path, "method": request.method, "remote_addr": request.remote_addr},
)
@app.after_request
def _log_request_end(response):
duration_ms = 0.0
if hasattr(g, "request_started_at"):
duration_ms = (time.perf_counter() - g.request_started_at) * 1000
request_id = getattr(g, "request_id", uuid.uuid4().hex)
response.headers.setdefault("X-Request-ID", request_id)
app.logger.info(
"Request completed",
extra={
"path": request.path,
"method": request.method,
"remote_addr": request.remote_addr,
},
)
response.headers["X-Request-Duration-ms"] = f"{duration_ms:.2f}"
operation_metrics = app.extensions.get("operation_metrics")
if operation_metrics:
bytes_in = getattr(g, "request_bytes_in", 0)
bytes_out = response.content_length or 0
error_code = getattr(g, "s3_error_code", None)
endpoint_type = classify_endpoint(request.path)
operation_metrics.record_request(
method=request.method,
endpoint_type=endpoint_type,
status_code=response.status_code,
latency_ms=duration_ms,
bytes_in=bytes_in,
bytes_out=bytes_out,
error_code=error_code,
)
return response


@@ -1,265 +0,0 @@
from __future__ import annotations
import io
import json
import logging
import queue
import threading
import time
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Dict, List, Optional
logger = logging.getLogger(__name__)
@dataclass
class AccessLogEntry:
    bucket_owner: str = "-"
    bucket: str = "-"
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    remote_ip: str = "-"
    requester: str = "-"
    request_id: str = field(default_factory=lambda: uuid.uuid4().hex[:16].upper())
    operation: str = "-"
    key: str = "-"
    request_uri: str = "-"
    http_status: int = 200
    error_code: str = "-"
    bytes_sent: int = 0
    object_size: int = 0
    total_time_ms: int = 0
    turn_around_time_ms: int = 0
    referrer: str = "-"
    user_agent: str = "-"
    version_id: str = "-"
    host_id: str = "-"
    signature_version: str = "SigV4"
    cipher_suite: str = "-"
    authentication_type: str = "AuthHeader"
    host_header: str = "-"
    tls_version: str = "-"

    def to_log_line(self) -> str:
        time_str = self.timestamp.strftime("[%d/%b/%Y:%H:%M:%S %z]")
        return (
            f'{self.bucket_owner} {self.bucket} {time_str} {self.remote_ip} '
            f'{self.requester} {self.request_id} {self.operation} {self.key} '
            f'"{self.request_uri}" {self.http_status} {self.error_code or "-"} '
            f'{self.bytes_sent or "-"} {self.object_size or "-"} {self.total_time_ms or "-"} '
            f'{self.turn_around_time_ms or "-"} "{self.referrer}" "{self.user_agent}" {self.version_id}'
        )

    def to_dict(self) -> Dict[str, Any]:
        return {
            "bucket_owner": self.bucket_owner,
            "bucket": self.bucket,
            "timestamp": self.timestamp.isoformat(),
            "remote_ip": self.remote_ip,
            "requester": self.requester,
            "request_id": self.request_id,
            "operation": self.operation,
            "key": self.key,
            "request_uri": self.request_uri,
            "http_status": self.http_status,
            "error_code": self.error_code,
            "bytes_sent": self.bytes_sent,
            "object_size": self.object_size,
            "total_time_ms": self.total_time_ms,
            "referrer": self.referrer,
            "user_agent": self.user_agent,
            "version_id": self.version_id,
        }


@dataclass
class LoggingConfiguration:
    target_bucket: str
    target_prefix: str = ""
    enabled: bool = True

    def to_dict(self) -> Dict[str, Any]:
        return {
            "LoggingEnabled": {
                "TargetBucket": self.target_bucket,
                "TargetPrefix": self.target_prefix,
            }
        }

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> Optional["LoggingConfiguration"]:
        logging_enabled = data.get("LoggingEnabled")
        if not logging_enabled:
            return None
        return cls(
            target_bucket=logging_enabled.get("TargetBucket", ""),
            target_prefix=logging_enabled.get("TargetPrefix", ""),
            enabled=True,
        )


class AccessLoggingService:
    def __init__(self, storage_root: Path, flush_interval: int = 60, max_buffer_size: int = 1000):
        self.storage_root = storage_root
        self.flush_interval = flush_interval
        self.max_buffer_size = max_buffer_size
        self._configs: Dict[str, LoggingConfiguration] = {}
        self._buffer: Dict[str, List[AccessLogEntry]] = {}
        self._buffer_lock = threading.Lock()
        self._shutdown = threading.Event()
        self._storage = None
        self._flush_thread = threading.Thread(target=self._flush_loop, name="access-log-flush", daemon=True)
        self._flush_thread.start()

    def set_storage(self, storage: Any) -> None:
        self._storage = storage

    def _config_path(self, bucket_name: str) -> Path:
        return self.storage_root / ".myfsio.sys" / "buckets" / bucket_name / "logging.json"

    def get_bucket_logging(self, bucket_name: str) -> Optional[LoggingConfiguration]:
        if bucket_name in self._configs:
            return self._configs[bucket_name]
        config_path = self._config_path(bucket_name)
        if not config_path.exists():
            return None
        try:
            data = json.loads(config_path.read_text(encoding="utf-8"))
            config = LoggingConfiguration.from_dict(data)
            if config:
                self._configs[bucket_name] = config
            return config
        except (json.JSONDecodeError, OSError) as e:
            logger.warning(f"Failed to load logging config for {bucket_name}: {e}")
            return None

    def set_bucket_logging(self, bucket_name: str, config: LoggingConfiguration) -> None:
        config_path = self._config_path(bucket_name)
        config_path.parent.mkdir(parents=True, exist_ok=True)
        config_path.write_text(json.dumps(config.to_dict(), indent=2), encoding="utf-8")
        self._configs[bucket_name] = config

    def delete_bucket_logging(self, bucket_name: str) -> None:
        config_path = self._config_path(bucket_name)
        try:
            if config_path.exists():
                config_path.unlink()
        except OSError:
            pass
        self._configs.pop(bucket_name, None)

    def log_request(
        self,
        bucket_name: str,
        *,
        operation: str,
        key: str = "-",
        remote_ip: str = "-",
        requester: str = "-",
        request_uri: str = "-",
        http_status: int = 200,
        error_code: str = "",
        bytes_sent: int = 0,
        object_size: int = 0,
        total_time_ms: int = 0,
        referrer: str = "-",
        user_agent: str = "-",
        version_id: str = "-",
        request_id: str = "",
    ) -> None:
        config = self.get_bucket_logging(bucket_name)
        if not config or not config.enabled:
            return
        entry = AccessLogEntry(
            bucket_owner="local-owner",
            bucket=bucket_name,
            remote_ip=remote_ip,
            requester=requester,
            request_id=request_id or uuid.uuid4().hex[:16].upper(),
            operation=operation,
            key=key,
            request_uri=request_uri,
            http_status=http_status,
            error_code=error_code,
            bytes_sent=bytes_sent,
            object_size=object_size,
            total_time_ms=total_time_ms,
            referrer=referrer,
            user_agent=user_agent,
            version_id=version_id,
        )
        target_key = f"{config.target_bucket}:{config.target_prefix}"
        should_flush = False
        with self._buffer_lock:
            if target_key not in self._buffer:
                self._buffer[target_key] = []
            self._buffer[target_key].append(entry)
            should_flush = len(self._buffer[target_key]) >= self.max_buffer_size
        if should_flush:
            self._flush_buffer(target_key)

    def _flush_loop(self) -> None:
        while not self._shutdown.is_set():
            self._shutdown.wait(timeout=self.flush_interval)
            if not self._shutdown.is_set():
                self._flush_all()

    def _flush_all(self) -> None:
        with self._buffer_lock:
            targets = list(self._buffer.keys())
        for target_key in targets:
            self._flush_buffer(target_key)

    def _flush_buffer(self, target_key: str) -> None:
        with self._buffer_lock:
            entries = self._buffer.pop(target_key, [])
        if not entries or not self._storage:
            return
        try:
            bucket_name, prefix = target_key.split(":", 1)
        except ValueError:
            logger.error(f"Invalid target key: {target_key}")
            return
        now = datetime.now(timezone.utc)
        log_key = f"{prefix}{now.strftime('%Y-%m-%d-%H-%M-%S')}-{uuid.uuid4().hex[:8]}"
        log_content = "\n".join(entry.to_log_line() for entry in entries) + "\n"
        try:
            stream = io.BytesIO(log_content.encode("utf-8"))
            self._storage.put_object(bucket_name, log_key, stream, enforce_quota=False)
            logger.info(f"Flushed {len(entries)} access log entries to {bucket_name}/{log_key}")
        except Exception as e:
            logger.error(f"Failed to write access log to {bucket_name}/{log_key}: {e}")
            with self._buffer_lock:
                if target_key not in self._buffer:
                    self._buffer[target_key] = []
                self._buffer[target_key] = entries + self._buffer[target_key]

    def flush(self) -> None:
        self._flush_all()

    def shutdown(self) -> None:
        self._shutdown.set()
        self._flush_all()
        self._flush_thread.join(timeout=5.0)

    def get_stats(self) -> Dict[str, Any]:
        with self._buffer_lock:
            buffered = sum(len(entries) for entries in self._buffer.values())
            return {
                "buffered_entries": buffered,
                "target_buckets": len(self._buffer),
            }


@@ -1,204 +0,0 @@
from __future__ import annotations
import json
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any, Dict, List, Optional, Set
ACL_PERMISSION_FULL_CONTROL = "FULL_CONTROL"
ACL_PERMISSION_WRITE = "WRITE"
ACL_PERMISSION_WRITE_ACP = "WRITE_ACP"
ACL_PERMISSION_READ = "READ"
ACL_PERMISSION_READ_ACP = "READ_ACP"

ALL_PERMISSIONS = {
    ACL_PERMISSION_FULL_CONTROL,
    ACL_PERMISSION_WRITE,
    ACL_PERMISSION_WRITE_ACP,
    ACL_PERMISSION_READ,
    ACL_PERMISSION_READ_ACP,
}

PERMISSION_TO_ACTIONS = {
    ACL_PERMISSION_FULL_CONTROL: {"read", "write", "delete", "list", "share"},
    ACL_PERMISSION_WRITE: {"write", "delete"},
    ACL_PERMISSION_WRITE_ACP: {"share"},
    ACL_PERMISSION_READ: {"read", "list"},
    ACL_PERMISSION_READ_ACP: {"share"},
}

GRANTEE_ALL_USERS = "*"
GRANTEE_AUTHENTICATED_USERS = "authenticated"


@dataclass
class AclGrant:
    grantee: str
    permission: str

    def to_dict(self) -> Dict[str, str]:
        return {"grantee": self.grantee, "permission": self.permission}

    @classmethod
    def from_dict(cls, data: Dict[str, str]) -> "AclGrant":
        return cls(grantee=data["grantee"], permission=data["permission"])


@dataclass
class Acl:
    owner: str
    grants: List[AclGrant] = field(default_factory=list)

    def to_dict(self) -> Dict[str, Any]:
        return {
            "owner": self.owner,
            "grants": [g.to_dict() for g in self.grants],
        }

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "Acl":
        return cls(
            owner=data.get("owner", ""),
            grants=[AclGrant.from_dict(g) for g in data.get("grants", [])],
        )

    def get_allowed_actions(self, principal_id: Optional[str], is_authenticated: bool = True) -> Set[str]:
        actions: Set[str] = set()
        if principal_id and principal_id == self.owner:
            actions.update(PERMISSION_TO_ACTIONS[ACL_PERMISSION_FULL_CONTROL])
        for grant in self.grants:
            if grant.grantee == GRANTEE_ALL_USERS:
                actions.update(PERMISSION_TO_ACTIONS.get(grant.permission, set()))
            elif grant.grantee == GRANTEE_AUTHENTICATED_USERS and is_authenticated:
                actions.update(PERMISSION_TO_ACTIONS.get(grant.permission, set()))
            elif principal_id and grant.grantee == principal_id:
                actions.update(PERMISSION_TO_ACTIONS.get(grant.permission, set()))
        return actions


CANNED_ACLS = {
    "private": lambda owner: Acl(
        owner=owner,
        grants=[AclGrant(grantee=owner, permission=ACL_PERMISSION_FULL_CONTROL)],
    ),
    "public-read": lambda owner: Acl(
        owner=owner,
        grants=[
            AclGrant(grantee=owner, permission=ACL_PERMISSION_FULL_CONTROL),
            AclGrant(grantee=GRANTEE_ALL_USERS, permission=ACL_PERMISSION_READ),
        ],
    ),
    "public-read-write": lambda owner: Acl(
        owner=owner,
        grants=[
            AclGrant(grantee=owner, permission=ACL_PERMISSION_FULL_CONTROL),
            AclGrant(grantee=GRANTEE_ALL_USERS, permission=ACL_PERMISSION_READ),
            AclGrant(grantee=GRANTEE_ALL_USERS, permission=ACL_PERMISSION_WRITE),
        ],
    ),
    "authenticated-read": lambda owner: Acl(
        owner=owner,
        grants=[
            AclGrant(grantee=owner, permission=ACL_PERMISSION_FULL_CONTROL),
            AclGrant(grantee=GRANTEE_AUTHENTICATED_USERS, permission=ACL_PERMISSION_READ),
        ],
    ),
    "bucket-owner-read": lambda owner: Acl(
        owner=owner,
        grants=[
            AclGrant(grantee=owner, permission=ACL_PERMISSION_FULL_CONTROL),
        ],
    ),
    "bucket-owner-full-control": lambda owner: Acl(
        owner=owner,
        grants=[
            AclGrant(grantee=owner, permission=ACL_PERMISSION_FULL_CONTROL),
        ],
    ),
}


def create_canned_acl(canned_acl: str, owner: str) -> Acl:
    factory = CANNED_ACLS.get(canned_acl)
    if not factory:
        return CANNED_ACLS["private"](owner)
    return factory(owner)


class AclService:
    def __init__(self, storage_root: Path):
        self.storage_root = storage_root
        self._bucket_acl_cache: Dict[str, Acl] = {}

    def _bucket_acl_path(self, bucket_name: str) -> Path:
        return self.storage_root / ".myfsio.sys" / "buckets" / bucket_name / ".acl.json"

    def get_bucket_acl(self, bucket_name: str) -> Optional[Acl]:
        if bucket_name in self._bucket_acl_cache:
            return self._bucket_acl_cache[bucket_name]
        acl_path = self._bucket_acl_path(bucket_name)
        if not acl_path.exists():
            return None
        try:
            data = json.loads(acl_path.read_text(encoding="utf-8"))
            acl = Acl.from_dict(data)
            self._bucket_acl_cache[bucket_name] = acl
            return acl
        except (OSError, json.JSONDecodeError):
            return None

    def set_bucket_acl(self, bucket_name: str, acl: Acl) -> None:
        acl_path = self._bucket_acl_path(bucket_name)
        acl_path.parent.mkdir(parents=True, exist_ok=True)
        acl_path.write_text(json.dumps(acl.to_dict(), indent=2), encoding="utf-8")
        self._bucket_acl_cache[bucket_name] = acl

    def set_bucket_canned_acl(self, bucket_name: str, canned_acl: str, owner: str) -> Acl:
        acl = create_canned_acl(canned_acl, owner)
        self.set_bucket_acl(bucket_name, acl)
        return acl

    def delete_bucket_acl(self, bucket_name: str) -> None:
        acl_path = self._bucket_acl_path(bucket_name)
        if acl_path.exists():
            acl_path.unlink()
        self._bucket_acl_cache.pop(bucket_name, None)

    def evaluate_bucket_acl(
        self,
        bucket_name: str,
        principal_id: Optional[str],
        action: str,
        is_authenticated: bool = True,
    ) -> bool:
        acl = self.get_bucket_acl(bucket_name)
        if not acl:
            return False
        allowed_actions = acl.get_allowed_actions(principal_id, is_authenticated)
        return action in allowed_actions

    def get_object_acl(self, bucket_name: str, object_key: str, object_metadata: Dict[str, Any]) -> Optional[Acl]:
        acl_data = object_metadata.get("__acl__")
        if not acl_data:
            return None
        try:
            return Acl.from_dict(acl_data)
        except (TypeError, KeyError):
            return None

    def create_object_acl_metadata(self, acl: Acl) -> Dict[str, Any]:
        return {"__acl__": acl.to_dict()}

    def evaluate_object_acl(
        self,
        object_metadata: Dict[str, Any],
        principal_id: Optional[str],
        action: str,
        is_authenticated: bool = True,
    ) -> bool:
        acl = self.get_object_acl("", "", object_metadata)
        if not acl:
            return False
        allowed_actions = acl.get_allowed_actions(principal_id, is_authenticated)
        return action in allowed_actions


@@ -1,675 +0,0 @@
from __future__ import annotations
import ipaddress
import json
import logging
import re
import socket
import time
from typing import Any, Dict, Optional, Tuple
from urllib.parse import urlparse
import requests
from flask import Blueprint, Response, current_app, jsonify, request
from .connections import ConnectionStore
from .extensions import limiter
from .iam import IamError, Principal
from .replication import ReplicationManager
from .site_registry import PeerSite, SiteInfo, SiteRegistry
def _is_safe_url(url: str, allow_internal: bool = False) -> bool:
    """Check if a URL is safe to make requests to (not internal/private).

    Args:
        url: The URL to check.
        allow_internal: If True, allows internal/private IP addresses.
            Use for self-hosted deployments on internal networks.
    """
    try:
        parsed = urlparse(url)
        hostname = parsed.hostname
        if not hostname:
            return False
        cloud_metadata_hosts = {
            "metadata.google.internal",
            "169.254.169.254",
        }
        if hostname.lower() in cloud_metadata_hosts:
            return False
        if allow_internal:
            return True
        blocked_hosts = {
            "localhost",
            "127.0.0.1",
            "0.0.0.0",
            "::1",
            "[::1]",
        }
        if hostname.lower() in blocked_hosts:
            return False
        try:
            resolved_ip = socket.gethostbyname(hostname)
            ip = ipaddress.ip_address(resolved_ip)
            if ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved:
                return False
        except (socket.gaierror, ValueError):
            return False
        return True
    except Exception:
        return False


def _validate_endpoint(endpoint: str) -> Optional[str]:
    """Validate endpoint URL format. Returns error message or None."""
    try:
        parsed = urlparse(endpoint)
        if not parsed.scheme or parsed.scheme not in ("http", "https"):
            return "Endpoint must be http or https URL"
        if not parsed.netloc:
            return "Endpoint must have a host"
        return None
    except Exception:
        return "Invalid endpoint URL"


def _validate_priority(priority: Any) -> Optional[str]:
    """Validate priority value. Returns error message or None."""
    try:
        p = int(priority)
        if p < 0 or p > 1000:
            return "Priority must be between 0 and 1000"
        return None
    except (TypeError, ValueError):
        return "Priority must be an integer"


def _validate_region(region: str) -> Optional[str]:
    """Validate region format. Returns error message or None."""
    if not re.match(r"^[a-z]{2,}-[a-z]+-\d+$", region):
        return "Region must match format like us-east-1"
    return None


def _validate_site_id(site_id: str) -> Optional[str]:
    """Validate site_id format. Returns error message or None."""
    if not site_id or len(site_id) > 63:
        return "site_id must be 1-63 characters"
    if not re.match(r'^[a-zA-Z0-9][a-zA-Z0-9_-]*$', site_id):
        return "site_id must start with alphanumeric and contain only alphanumeric, hyphens, underscores"
    return None


logger = logging.getLogger(__name__)
admin_api_bp = Blueprint("admin_api", __name__, url_prefix="/admin")
def _require_principal() -> Tuple[Optional[Principal], Optional[Tuple[Dict[str, Any], int]]]:
    from .s3_api import _require_principal as s3_require_principal
    return s3_require_principal()


def _require_admin() -> Tuple[Optional[Principal], Optional[Tuple[Dict[str, Any], int]]]:
    principal, error = _require_principal()
    if error:
        return None, error
    try:
        _iam().authorize(principal, None, "iam:*")
        return principal, None
    except IamError:
        return None, _json_error("AccessDenied", "Admin access required", 403)


def _site_registry() -> SiteRegistry:
    return current_app.extensions["site_registry"]


def _connections() -> ConnectionStore:
    return current_app.extensions["connections"]


def _replication() -> ReplicationManager:
    return current_app.extensions["replication"]


def _iam():
    return current_app.extensions["iam"]


def _json_error(code: str, message: str, status: int) -> Tuple[Dict[str, Any], int]:
    return {"error": {"code": code, "message": message}}, status


def _get_admin_rate_limit() -> str:
    return current_app.config.get("RATE_LIMIT_ADMIN", "60 per minute")


@admin_api_bp.route("/site", methods=["GET"])
@limiter.limit(lambda: _get_admin_rate_limit())
def get_local_site():
principal, error = _require_admin()
if error:
return error
registry = _site_registry()
local_site = registry.get_local_site()
if local_site:
return jsonify(local_site.to_dict())
config_site_id = current_app.config.get("SITE_ID")
config_endpoint = current_app.config.get("SITE_ENDPOINT")
if config_site_id:
return jsonify({
"site_id": config_site_id,
"endpoint": config_endpoint or "",
"region": current_app.config.get("SITE_REGION", "us-east-1"),
"priority": current_app.config.get("SITE_PRIORITY", 100),
"display_name": config_site_id,
"source": "environment",
})
return _json_error("NotFound", "Local site not configured", 404)
@admin_api_bp.route("/site", methods=["PUT"])
@limiter.limit(lambda: _get_admin_rate_limit())
def update_local_site():
principal, error = _require_admin()
if error:
return error
payload = request.get_json(silent=True) or {}
site_id = payload.get("site_id")
endpoint = payload.get("endpoint")
if not site_id:
return _json_error("ValidationError", "site_id is required", 400)
site_id_error = _validate_site_id(site_id)
if site_id_error:
return _json_error("ValidationError", site_id_error, 400)
if endpoint:
endpoint_error = _validate_endpoint(endpoint)
if endpoint_error:
return _json_error("ValidationError", endpoint_error, 400)
if "priority" in payload:
priority_error = _validate_priority(payload["priority"])
if priority_error:
return _json_error("ValidationError", priority_error, 400)
if "region" in payload:
region_error = _validate_region(payload["region"])
if region_error:
return _json_error("ValidationError", region_error, 400)
registry = _site_registry()
existing = registry.get_local_site()
site = SiteInfo(
site_id=site_id,
endpoint=endpoint or "",
region=payload.get("region", "us-east-1"),
priority=int(payload.get("priority", 100)),
display_name=payload.get("display_name", site_id),
created_at=existing.created_at if existing else None,
)
registry.set_local_site(site)
logger.info("Local site updated", extra={"site_id": site_id, "principal": principal.access_key})
return jsonify(site.to_dict())
@admin_api_bp.route("/sites", methods=["GET"])
@limiter.limit(lambda: _get_admin_rate_limit())
def list_all_sites():
principal, error = _require_admin()
if error:
return error
registry = _site_registry()
local = registry.get_local_site()
peers = registry.list_peers()
result = {
"local": local.to_dict() if local else None,
"peers": [peer.to_dict() for peer in peers],
"total_peers": len(peers),
}
return jsonify(result)
@admin_api_bp.route("/sites", methods=["POST"])
@limiter.limit(lambda: _get_admin_rate_limit())
def register_peer_site():
principal, error = _require_admin()
if error:
return error
payload = request.get_json(silent=True) or {}
site_id = payload.get("site_id")
endpoint = payload.get("endpoint")
if not site_id:
return _json_error("ValidationError", "site_id is required", 400)
site_id_error = _validate_site_id(site_id)
if site_id_error:
return _json_error("ValidationError", site_id_error, 400)
if not endpoint:
return _json_error("ValidationError", "endpoint is required", 400)
endpoint_error = _validate_endpoint(endpoint)
if endpoint_error:
return _json_error("ValidationError", endpoint_error, 400)
region = payload.get("region", "us-east-1")
region_error = _validate_region(region)
if region_error:
return _json_error("ValidationError", region_error, 400)
priority = payload.get("priority", 100)
priority_error = _validate_priority(priority)
if priority_error:
return _json_error("ValidationError", priority_error, 400)
registry = _site_registry()
if registry.get_peer(site_id):
return _json_error("AlreadyExists", f"Peer site '{site_id}' already exists", 409)
connection_id = payload.get("connection_id")
if connection_id:
if not _connections().get(connection_id):
return _json_error("ValidationError", f"Connection '{connection_id}' not found", 400)
peer = PeerSite(
site_id=site_id,
endpoint=endpoint,
region=region,
priority=int(priority),
display_name=payload.get("display_name", site_id),
connection_id=connection_id,
)
registry.add_peer(peer)
logger.info("Peer site registered", extra={"site_id": site_id, "principal": principal.access_key})
return jsonify(peer.to_dict()), 201
@admin_api_bp.route("/sites/<site_id>", methods=["GET"])
@limiter.limit(lambda: _get_admin_rate_limit())
def get_peer_site(site_id: str):
principal, error = _require_admin()
if error:
return error
registry = _site_registry()
peer = registry.get_peer(site_id)
if not peer:
return _json_error("NotFound", f"Peer site '{site_id}' not found", 404)
return jsonify(peer.to_dict())
@admin_api_bp.route("/sites/<site_id>", methods=["PUT"])
@limiter.limit(lambda: _get_admin_rate_limit())
def update_peer_site(site_id: str):
principal, error = _require_admin()
if error:
return error
registry = _site_registry()
existing = registry.get_peer(site_id)
if not existing:
return _json_error("NotFound", f"Peer site '{site_id}' not found", 404)
payload = request.get_json(silent=True) or {}
if "endpoint" in payload:
endpoint_error = _validate_endpoint(payload["endpoint"])
if endpoint_error:
return _json_error("ValidationError", endpoint_error, 400)
if "priority" in payload:
priority_error = _validate_priority(payload["priority"])
if priority_error:
return _json_error("ValidationError", priority_error, 400)
if "region" in payload:
region_error = _validate_region(payload["region"])
if region_error:
return _json_error("ValidationError", region_error, 400)
if "connection_id" in payload:
if payload["connection_id"] and not _connections().get(payload["connection_id"]):
return _json_error("ValidationError", f"Connection '{payload['connection_id']}' not found", 400)
peer = PeerSite(
site_id=site_id,
endpoint=payload.get("endpoint", existing.endpoint),
region=payload.get("region", existing.region),
priority=int(payload.get("priority", existing.priority)),
display_name=payload.get("display_name", existing.display_name),
connection_id=payload.get("connection_id", existing.connection_id),
created_at=existing.created_at,
is_healthy=existing.is_healthy,
last_health_check=existing.last_health_check,
)
registry.update_peer(peer)
logger.info("Peer site updated", extra={"site_id": site_id, "principal": principal.access_key})
return jsonify(peer.to_dict())
@admin_api_bp.route("/sites/<site_id>", methods=["DELETE"])
@limiter.limit(lambda: _get_admin_rate_limit())
def delete_peer_site(site_id: str):
principal, error = _require_admin()
if error:
return error
registry = _site_registry()
if not registry.delete_peer(site_id):
return _json_error("NotFound", f"Peer site '{site_id}' not found", 404)
logger.info("Peer site deleted", extra={"site_id": site_id, "principal": principal.access_key})
return Response(status=204)
@admin_api_bp.route("/sites/<site_id>/health", methods=["GET"])
@limiter.limit(lambda: _get_admin_rate_limit())
def check_peer_health(site_id: str):
principal, error = _require_admin()
if error:
return error
registry = _site_registry()
peer = registry.get_peer(site_id)
if not peer:
return _json_error("NotFound", f"Peer site '{site_id}' not found", 404)
is_healthy = False
error_message = None
if peer.connection_id:
connection = _connections().get(peer.connection_id)
if connection:
is_healthy = _replication().check_endpoint_health(connection)
else:
error_message = f"Connection '{peer.connection_id}' not found"
else:
error_message = "No connection configured for this peer"
registry.update_health(site_id, is_healthy)
result = {
"site_id": site_id,
"is_healthy": is_healthy,
"checked_at": time.time(),
}
if error_message:
result["error"] = error_message
return jsonify(result)
@admin_api_bp.route("/topology", methods=["GET"])
@limiter.limit(lambda: _get_admin_rate_limit())
def get_topology():
principal, error = _require_admin()
if error:
return error
registry = _site_registry()
local = registry.get_local_site()
peers = registry.list_peers()
sites = []
if local:
sites.append({
**local.to_dict(),
"is_local": True,
"is_healthy": True,
})
for peer in peers:
sites.append({
**peer.to_dict(),
"is_local": False,
})
sites.sort(key=lambda s: s.get("priority", 100))
return jsonify({
"sites": sites,
"total": len(sites),
"healthy_count": sum(1 for s in sites if s.get("is_healthy")),
})
@admin_api_bp.route("/sites/<site_id>/bidirectional-status", methods=["GET"])
@limiter.limit(lambda: _get_admin_rate_limit())
def check_bidirectional_status(site_id: str):
principal, error = _require_admin()
if error:
return error
registry = _site_registry()
peer = registry.get_peer(site_id)
if not peer:
return _json_error("NotFound", f"Peer site '{site_id}' not found", 404)
local_site = registry.get_local_site()
replication = _replication()
local_rules = replication.list_rules()
local_bidir_rules = []
for rule in local_rules:
if rule.target_connection_id == peer.connection_id and rule.mode == "bidirectional":
local_bidir_rules.append({
"bucket_name": rule.bucket_name,
"target_bucket": rule.target_bucket,
"enabled": rule.enabled,
})
result = {
"site_id": site_id,
"local_site_id": local_site.site_id if local_site else None,
"local_endpoint": local_site.endpoint if local_site else None,
"local_bidirectional_rules": local_bidir_rules,
"local_site_sync_enabled": current_app.config.get("SITE_SYNC_ENABLED", False),
"remote_status": None,
"issues": [],
"is_fully_configured": False,
}
if not local_site or not local_site.site_id:
result["issues"].append({
"code": "NO_LOCAL_SITE_ID",
"message": "Local site identity not configured",
"severity": "error",
})
if not local_site or not local_site.endpoint:
result["issues"].append({
"code": "NO_LOCAL_ENDPOINT",
"message": "Local site endpoint not configured (remote site cannot reach back)",
"severity": "error",
})
if not peer.connection_id:
result["issues"].append({
"code": "NO_CONNECTION",
"message": "No connection configured for this peer",
"severity": "error",
})
return jsonify(result)
connection = _connections().get(peer.connection_id)
if not connection:
result["issues"].append({
"code": "CONNECTION_NOT_FOUND",
"message": f"Connection '{peer.connection_id}' not found",
"severity": "error",
})
return jsonify(result)
if not local_bidir_rules:
result["issues"].append({
"code": "NO_LOCAL_BIDIRECTIONAL_RULES",
"message": "No bidirectional replication rules configured on this site",
"severity": "warning",
})
if not result["local_site_sync_enabled"]:
result["issues"].append({
"code": "SITE_SYNC_DISABLED",
"message": "Site sync worker is disabled (SITE_SYNC_ENABLED=false). Pull operations will not work.",
"severity": "warning",
})
if not replication.check_endpoint_health(connection):
result["issues"].append({
"code": "REMOTE_UNREACHABLE",
"message": "Remote endpoint is not reachable",
"severity": "error",
})
return jsonify(result)
allow_internal = current_app.config.get("ALLOW_INTERNAL_ENDPOINTS", False)
if not _is_safe_url(peer.endpoint, allow_internal=allow_internal):
result["issues"].append({
"code": "ENDPOINT_NOT_ALLOWED",
"message": "Peer endpoint is not allowed (internal, loopback, or cloud metadata address blocked by SSRF protection)",
"severity": "error",
})
return jsonify(result)
try:
admin_url = peer.endpoint.rstrip("/") + "/admin/sites"
resp = requests.get(
admin_url,
timeout=10,
headers={
"Accept": "application/json",
"X-Access-Key": connection.access_key,
"X-Secret-Key": connection.secret_key,
},
)
if resp.status_code == 200:
try:
remote_data = resp.json()
if not isinstance(remote_data, dict):
raise ValueError("Expected JSON object")
remote_local = remote_data.get("local")
if remote_local is not None and not isinstance(remote_local, dict):
raise ValueError("Expected 'local' to be an object")
remote_peers = remote_data.get("peers", [])
if not isinstance(remote_peers, list):
raise ValueError("Expected 'peers' to be a list")
except (ValueError, json.JSONDecodeError) as e:
logger.warning("Invalid JSON from remote admin API: %s", e)
result["remote_status"] = {"reachable": True, "invalid_response": True}
result["issues"].append({
"code": "REMOTE_INVALID_RESPONSE",
"message": "Remote admin API returned invalid JSON",
"severity": "warning",
})
return jsonify(result)
result["remote_status"] = {
"reachable": True,
"local_site": remote_local,
"site_sync_enabled": None,
"has_peer_for_us": False,
"peer_connection_configured": False,
"has_bidirectional_rules_for_us": False,
}
for rp in remote_peers:
if not isinstance(rp, dict):
continue
if local_site and (
rp.get("site_id") == local_site.site_id or
rp.get("endpoint") == local_site.endpoint
):
result["remote_status"]["has_peer_for_us"] = True
result["remote_status"]["peer_connection_configured"] = bool(rp.get("connection_id"))
break
if not result["remote_status"]["has_peer_for_us"]:
result["issues"].append({
"code": "REMOTE_NO_PEER_FOR_US",
"message": "Remote site does not have this site registered as a peer",
"severity": "error",
})
elif not result["remote_status"]["peer_connection_configured"]:
result["issues"].append({
"code": "REMOTE_NO_CONNECTION_FOR_US",
"message": "Remote site has us as peer but no connection configured (cannot push back)",
"severity": "error",
})
elif resp.status_code == 401 or resp.status_code == 403:
result["remote_status"] = {
"reachable": True,
"admin_access_denied": True,
}
result["issues"].append({
"code": "REMOTE_ADMIN_ACCESS_DENIED",
"message": "Cannot verify remote configuration (admin access denied)",
"severity": "warning",
})
else:
result["remote_status"] = {
"reachable": True,
"admin_api_error": resp.status_code,
}
result["issues"].append({
"code": "REMOTE_ADMIN_API_ERROR",
"message": f"Remote admin API returned status {resp.status_code}",
"severity": "warning",
})
except requests.RequestException as e:
logger.warning("Remote admin API unreachable: %s", e)
result["remote_status"] = {
"reachable": False,
"error": "Connection failed",
}
result["issues"].append({
"code": "REMOTE_ADMIN_UNREACHABLE",
"message": "Could not reach remote admin API",
"severity": "warning",
})
except Exception as e:
logger.warning("Error checking remote bidirectional status: %s", e, exc_info=True)
result["issues"].append({
"code": "VERIFICATION_ERROR",
"message": "Internal error during verification",
"severity": "warning",
})
error_issues = [i for i in result["issues"] if i["severity"] == "error"]
result["is_fully_configured"] = len(error_issues) == 0 and len(local_bidir_rules) > 0
return jsonify(result)
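
The address test that `_is_safe_url` applies to a resolved IP can be tried in isolation. The following is a minimal standalone sketch, not code from this module: `is_blocked_address` is a hypothetical helper name, and it classifies literal IP strings only, with no DNS resolution.

```python
import ipaddress


def is_blocked_address(ip_str: str) -> bool:
    """Return True if the literal IP falls in a range the SSRF check rejects.

    Hypothetical helper for illustration. It mirrors the is_private /
    is_loopback / is_link_local / is_reserved test used on resolved addresses.
    """
    try:
        ip = ipaddress.ip_address(ip_str)
    except ValueError:
        # Unparseable input is treated as unsafe, matching the fail-closed
        # behaviour of the original check.
        return True
    return ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved


if __name__ == "__main__":
    for candidate in ("10.0.0.5", "127.0.0.1", "169.254.169.254", "8.8.8.8"):
        print(candidate, "blocked" if is_blocked_address(candidate) else "allowed")
```

The cloud metadata address `169.254.169.254` is link-local, so it is rejected by this test even without an explicit host list.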


@@ -1,403 +0,0 @@
from __future__ import annotations
import ipaddress
import json
import re
import time
from dataclasses import dataclass, field
from fnmatch import fnmatch, translate
from functools import lru_cache
from pathlib import Path
from typing import Any, Dict, Iterable, List, Optional, Pattern, Sequence, Tuple
RESOURCE_PREFIX = "arn:aws:s3:::"
@lru_cache(maxsize=256)
def _compile_pattern(pattern: str) -> Pattern[str]:
return re.compile(translate(pattern), re.IGNORECASE)
def _match_string_like(value: str, pattern: str) -> bool:
compiled = _compile_pattern(pattern)
return bool(compiled.match(value))
def _ip_in_cidr(ip_str: str, cidr: str) -> bool:
try:
ip = ipaddress.ip_address(ip_str)
network = ipaddress.ip_network(cidr, strict=False)
return ip in network
except ValueError:
return False
def _evaluate_condition_operator(
    operator: str,
    condition_key: str,
    condition_values: List[str],
    context: Dict[str, Any],
) -> bool:
    context_value = context.get(condition_key)
    op_lower = operator.lower()
    if_exists = op_lower.endswith("ifexists")
    if if_exists:
        op_lower = op_lower[:-8]
    # "Null" tests for the absence of a key, so it must be evaluated before
    # the missing-key short-circuit below; otherwise it can never match.
    if op_lower == "null":
        is_null = context_value is None or context_value == ""
        expected_null = condition_values[0].lower() in ("true", "1", "yes") if condition_values else True
        return is_null == expected_null
    if context_value is None:
        return if_exists
    context_value_str = str(context_value)
    context_value_lower = context_value_str.lower()
    if op_lower == "stringequals":
        return context_value_str in condition_values
    elif op_lower == "stringnotequals":
        return context_value_str not in condition_values
    elif op_lower == "stringequalsignorecase":
        return context_value_lower in [v.lower() for v in condition_values]
    elif op_lower == "stringnotequalsignorecase":
        return context_value_lower not in [v.lower() for v in condition_values]
    elif op_lower == "stringlike":
        return any(_match_string_like(context_value_str, p) for p in condition_values)
    elif op_lower == "stringnotlike":
        return not any(_match_string_like(context_value_str, p) for p in condition_values)
    elif op_lower == "ipaddress":
        return any(_ip_in_cidr(context_value_str, cidr) for cidr in condition_values)
    elif op_lower == "notipaddress":
        return not any(_ip_in_cidr(context_value_str, cidr) for cidr in condition_values)
    elif op_lower == "bool":
        bool_val = context_value_lower in ("true", "1", "yes")
        return str(bool_val).lower() in [v.lower() for v in condition_values]
    # Unrecognized operators are treated as satisfied.
    return True
ACTION_ALIASES = {
"s3:listbucket": "list",
"s3:listallmybuckets": "list",
"s3:listbucketversions": "list",
"s3:listmultipartuploads": "list",
"s3:listparts": "list",
"s3:getobject": "read",
"s3:getobjectversion": "read",
"s3:getobjecttagging": "read",
"s3:getobjectversiontagging": "read",
"s3:getobjectacl": "read",
"s3:getbucketversioning": "read",
"s3:headobject": "read",
"s3:headbucket": "read",
"s3:putobject": "write",
"s3:createbucket": "write",
"s3:putobjecttagging": "write",
"s3:putbucketversioning": "write",
"s3:createmultipartupload": "write",
"s3:uploadpart": "write",
"s3:completemultipartupload": "write",
"s3:abortmultipartupload": "write",
"s3:copyobject": "write",
"s3:deleteobject": "delete",
"s3:deleteobjectversion": "delete",
"s3:deletebucket": "delete",
"s3:deleteobjecttagging": "delete",
"s3:putobjectacl": "share",
"s3:putbucketacl": "share",
"s3:getbucketacl": "share",
"s3:putbucketpolicy": "policy",
"s3:getbucketpolicy": "policy",
"s3:deletebucketpolicy": "policy",
"s3:getreplicationconfiguration": "replication",
"s3:putreplicationconfiguration": "replication",
"s3:deletereplicationconfiguration": "replication",
"s3:replicateobject": "replication",
"s3:replicatetags": "replication",
"s3:replicatedelete": "replication",
"s3:getlifecycleconfiguration": "lifecycle",
"s3:putlifecycleconfiguration": "lifecycle",
"s3:deletelifecycleconfiguration": "lifecycle",
"s3:getbucketlifecycle": "lifecycle",
"s3:putbucketlifecycle": "lifecycle",
"s3:getbucketcors": "cors",
"s3:putbucketcors": "cors",
"s3:deletebucketcors": "cors",
}
def _normalize_action(action: str) -> str:
action = action.strip().lower()
if action == "*":
return "*"
return ACTION_ALIASES.get(action, action)
def _normalize_actions(actions: Iterable[str]) -> List[str]:
values: List[str] = []
for action in actions:
canonical = _normalize_action(action)
if canonical == "*" and "*" not in values:
return ["*"]
if canonical and canonical not in values:
values.append(canonical)
return values
def _normalize_principals(principal_field: Any) -> List[str] | str:
if principal_field == "*":
return "*"
def _collect(values: Any) -> List[str]:
if values is None:
return []
if values == "*":
return ["*"]
if isinstance(values, str):
return [values]
if isinstance(values, dict):
aggregated: List[str] = []
for nested in values.values():
chunk = _collect(nested)
if "*" in chunk:
return ["*"]
aggregated.extend(chunk)
return aggregated
if isinstance(values, Iterable):
aggregated = []
for nested in values:
chunk = _collect(nested)
if "*" in chunk:
return ["*"]
aggregated.extend(chunk)
return aggregated
return [str(values)]
normalized: List[str] = []
for entry in _collect(principal_field):
token = str(entry).strip()
if token == "*":
return "*"
if token and token not in normalized:
normalized.append(token)
return normalized or "*"
def _parse_resource(resource: str) -> tuple[str | None, str | None]:
if not resource.startswith(RESOURCE_PREFIX):
return None, None
remainder = resource[len(RESOURCE_PREFIX) :]
if "/" not in remainder:
bucket = remainder or "*"
return bucket, None
bucket, _, key_pattern = remainder.partition("/")
return bucket or "*", key_pattern or "*"
@dataclass
class BucketPolicyStatement:
sid: Optional[str]
effect: str
principals: List[str] | str
actions: List[str]
resources: List[Tuple[str | None, str | None]]
conditions: Dict[str, Dict[str, List[str]]] = field(default_factory=dict)
_compiled_patterns: List[Tuple[str | None, Optional[Pattern[str]]]] | None = None
def _get_compiled_patterns(self) -> List[Tuple[str | None, Optional[Pattern[str]]]]:
if self._compiled_patterns is None:
self._compiled_patterns = []
for resource_bucket, key_pattern in self.resources:
if key_pattern is None:
self._compiled_patterns.append((resource_bucket, None))
else:
regex_pattern = translate(key_pattern)
self._compiled_patterns.append((resource_bucket, re.compile(regex_pattern)))
return self._compiled_patterns
def matches_principal(self, access_key: Optional[str]) -> bool:
if self.principals == "*":
return True
if access_key is None:
return False
return access_key in self.principals
def matches_action(self, action: str) -> bool:
action = _normalize_action(action)
return "*" in self.actions or action in self.actions
def matches_resource(self, bucket: Optional[str], object_key: Optional[str]) -> bool:
bucket = (bucket or "*").lower()
key = object_key or ""
for resource_bucket, compiled_pattern in self._get_compiled_patterns():
resource_bucket = (resource_bucket or "*").lower()
if resource_bucket not in {"*", bucket}:
continue
if compiled_pattern is None:
if not key:
return True
continue
if compiled_pattern.match(key):
return True
return False
def matches_condition(self, context: Optional[Dict[str, Any]]) -> bool:
if not self.conditions:
return True
if context is None:
context = {}
for operator, key_values in self.conditions.items():
for condition_key, condition_values in key_values.items():
if not _evaluate_condition_operator(operator, condition_key, condition_values, context):
return False
return True
class BucketPolicyStore:
"""Loads bucket policies from disk and evaluates statements."""
def __init__(self, policy_path: Path) -> None:
self.policy_path = Path(policy_path)
self.policy_path.parent.mkdir(parents=True, exist_ok=True)
if not self.policy_path.exists():
self.policy_path.write_text(json.dumps({"policies": {}}, indent=2))
self._raw: Dict[str, Any] = {}
self._policies: Dict[str, List[BucketPolicyStatement]] = {}
self._load()
self._last_mtime = self._current_mtime()
# Performance: Avoid stat() on every request
self._last_stat_check = 0.0
self._stat_check_interval = 1.0 # Only check mtime every 1 second
def maybe_reload(self) -> None:
# Performance: Skip stat check if we checked recently
now = time.time()
if now - self._last_stat_check < self._stat_check_interval:
return
self._last_stat_check = now
current = self._current_mtime()
if current is None or current == self._last_mtime:
return
self._load()
self._last_mtime = current
def _current_mtime(self) -> float | None:
try:
return self.policy_path.stat().st_mtime
except FileNotFoundError:
return None
def evaluate(
self,
access_key: Optional[str],
bucket: Optional[str],
object_key: Optional[str],
action: str,
context: Optional[Dict[str, Any]] = None,
) -> str | None:
bucket = (bucket or "").lower()
statements = self._policies.get(bucket) or []
decision: Optional[str] = None
for statement in statements:
if not statement.matches_principal(access_key):
continue
if not statement.matches_action(action):
continue
if not statement.matches_resource(bucket, object_key):
continue
if not statement.matches_condition(context):
continue
if statement.effect == "deny":
return "deny"
decision = "allow"
return decision
def get_policy(self, bucket: str) -> Dict[str, Any] | None:
return self._raw.get(bucket.lower())
def set_policy(self, bucket: str, policy_payload: Dict[str, Any]) -> None:
bucket = bucket.lower()
statements = self._normalize_policy(policy_payload)
if not statements:
raise ValueError("Policy must include at least one valid statement")
self._raw[bucket] = policy_payload
self._policies[bucket] = statements
self._persist()
def delete_policy(self, bucket: str) -> None:
bucket = bucket.lower()
self._raw.pop(bucket, None)
self._policies.pop(bucket, None)
self._persist()
def _load(self) -> None:
try:
content = self.policy_path.read_text(encoding='utf-8')
raw_payload = json.loads(content)
except FileNotFoundError:
raw_payload = {"policies": {}}
except json.JSONDecodeError as e:
raise ValueError(f"Corrupted bucket policy file (invalid JSON): {e}")
except PermissionError as e:
raise ValueError(f"Cannot read bucket policy file (permission denied): {e}")
except (OSError, ValueError) as e:
raise ValueError(f"Failed to load bucket policies: {e}")
policies: Dict[str, Any] = raw_payload.get("policies", {})
parsed: Dict[str, List[BucketPolicyStatement]] = {}
for bucket, policy in policies.items():
parsed[bucket.lower()] = self._normalize_policy(policy)
self._raw = {bucket.lower(): policy for bucket, policy in policies.items()}
self._policies = parsed
def _persist(self) -> None:
payload = {"policies": self._raw}
self.policy_path.write_text(json.dumps(payload, indent=2), encoding='utf-8')
def _normalize_policy(self, policy: Dict[str, Any]) -> List[BucketPolicyStatement]:
statements_raw: Sequence[Dict[str, Any]] = policy.get("Statement", [])
statements: List[BucketPolicyStatement] = []
for statement in statements_raw:
actions = _normalize_actions(statement.get("Action", []))
principals = _normalize_principals(statement.get("Principal", "*"))
resources_field = statement.get("Resource", [])
if isinstance(resources_field, str):
resources_field = [resources_field]
resources: List[tuple[str | None, str | None]] = []
for resource in resources_field:
bucket, pattern = _parse_resource(str(resource))
if bucket:
resources.append((bucket, pattern))
if not resources:
continue
effect = statement.get("Effect", "Allow").lower()
conditions = self._normalize_conditions(statement.get("Condition", {}))
statements.append(
BucketPolicyStatement(
sid=statement.get("Sid"),
effect=effect,
principals=principals,
actions=actions or ["*"],
resources=resources,
conditions=conditions,
)
)
return statements
def _normalize_conditions(self, condition_block: Dict[str, Any]) -> Dict[str, Dict[str, List[str]]]:
if not condition_block or not isinstance(condition_block, dict):
return {}
normalized: Dict[str, Dict[str, List[str]]] = {}
for operator, key_values in condition_block.items():
if not isinstance(key_values, dict):
continue
normalized[operator] = {}
for cond_key, cond_values in key_values.items():
if isinstance(cond_values, str):
normalized[operator][cond_key] = [cond_values]
elif isinstance(cond_values, list):
normalized[operator][cond_key] = [str(v) for v in cond_values]
else:
normalized[operator][cond_key] = [str(cond_values)]
return normalized
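
The decision rule in `BucketPolicyStore.evaluate` is that any matching deny wins immediately, a matching allow is remembered, and no match yields `None`. Below is a minimal sketch of that rule under simplifying assumptions: `Statement` is a hypothetical stand-in for `BucketPolicyStatement` that carries only an effect, a list of canonical actions, and one key pattern, and it leaves out principals and conditions.

```python
import re
from fnmatch import translate
from typing import List, NamedTuple, Optional


class Statement(NamedTuple):
    # Simplified, hypothetical stand-in for BucketPolicyStatement.
    effect: str          # "allow" or "deny"
    actions: List[str]   # canonical actions, or ["*"]
    key_pattern: str     # glob pattern for object keys


def evaluate(statements: List[Statement], action: str, key: str) -> Optional[str]:
    """Return "deny", "allow", or None when no statement matches."""
    decision: Optional[str] = None
    for st in statements:
        if "*" not in st.actions and action not in st.actions:
            continue
        # fnmatch.translate turns the glob into an anchored regex.
        if not re.compile(translate(st.key_pattern)).match(key):
            continue
        if st.effect == "deny":
            return "deny"  # an explicit deny overrides any allow
        decision = "allow"
    return decision


if __name__ == "__main__":
    policy = [
        Statement("allow", ["read"], "*"),
        Statement("deny", ["*"], "private/*"),
    ]
    print(evaluate(policy, "read", "public/a.txt"))
    print(evaluate(policy, "read", "private/a.txt"))
    print(evaluate(policy, "write", "public/a.txt"))
```

Returning `None` rather than `"deny"` for the no-match case lets a caller fall back to another authorization layer when the bucket policy has nothing to say.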


@@ -1,103 +0,0 @@
from __future__ import annotations
import gzip
import io
from typing import Callable, Iterable, List, Tuple
COMPRESSIBLE_MIMES = frozenset([
'application/json',
'application/javascript',
'application/xml',
'text/html',
'text/css',
'text/plain',
'text/xml',
'text/javascript',
'application/x-ndjson',
])
MIN_SIZE_FOR_COMPRESSION = 500
class GzipMiddleware:
    def __init__(self, app: Callable, compression_level: int = 6, min_size: int = MIN_SIZE_FOR_COMPRESSION):
        self.app = app
        self.compression_level = compression_level
        self.min_size = min_size
    def __call__(self, environ: dict, start_response: Callable) -> Iterable[bytes]:
        accept_encoding = environ.get('HTTP_ACCEPT_ENCODING', '')
        if 'gzip' not in accept_encoding.lower():
            return self.app(environ, start_response)
        response_started = False
        status_line = None
        response_headers: List[Tuple[str, str]] = []
        content_type = None
        content_length = None
        should_compress = False
        passthrough = False
        exc_info_holder = [None]
        def custom_start_response(status: str, headers: List[Tuple[str, str]], exc_info=None):
            nonlocal response_started, status_line, response_headers, content_type, content_length, should_compress, passthrough
            response_started = True
            # Keep the application's own status line so the reason phrase is preserved.
            status_line = status
            response_headers = list(headers)
            exc_info_holder[0] = exc_info
            for name, value in headers:
                name_lower = name.lower()
                if name_lower == 'content-type':
                    content_type = value.split(';')[0].strip().lower()
                elif name_lower == 'content-length':
                    content_length = int(value)
                elif name_lower in ('content-encoding', 'x-stream-response'):
                    # Already encoded or streaming: forward the response untouched.
                    # start_response is called here, so it must not be called again below.
                    passthrough = True
                    should_compress = False
                    return start_response(status, headers, exc_info)
            if content_type and content_type in COMPRESSIBLE_MIMES:
                if content_length is None or content_length >= self.min_size:
                    should_compress = True
            return None
        app_iter = self.app(environ, custom_start_response)
        if passthrough:
            return app_iter
        try:
            response_body = b''.join(app_iter)
        finally:
            # WSGI requires close() to be called on the iterable when it exists.
            close = getattr(app_iter, 'close', None)
            if close is not None:
                close()
        if not response_started:
            return [response_body]
        if should_compress and len(response_body) >= self.min_size:
            buf = io.BytesIO()
            with gzip.GzipFile(fileobj=buf, mode='wb', compresslevel=self.compression_level) as gz:
                gz.write(response_body)
            compressed = buf.getvalue()
            # Only swap in the compressed body when it is actually smaller.
            if len(compressed) < len(response_body):
                response_body = compressed
                new_headers = []
                for name, value in response_headers:
                    if name.lower() not in ('content-length', 'content-encoding'):
                        new_headers.append((name, value))
                new_headers.append(('Content-Encoding', 'gzip'))
                new_headers.append(('Content-Length', str(len(response_body))))
                new_headers.append(('Vary', 'Accept-Encoding'))
                response_headers = new_headers
        start_response(status_line, response_headers, exc_info_holder[0])
        return [response_body]
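
The compression decision in `GzipMiddleware` comes down to two checks: the body must reach a minimum size, and the gzip output must be smaller than the input. A minimal standalone sketch of those two checks follows. `compress_if_smaller` is a hypothetical helper name, and it uses `gzip.compress` instead of the `GzipFile`/`BytesIO` pair in the middleware.

```python
import gzip
from typing import Tuple


def compress_if_smaller(body: bytes, min_size: int = 500, level: int = 6) -> Tuple[bytes, bool]:
    """Return (payload, was_compressed).

    Hypothetical helper for illustration: bodies under min_size are returned
    unchanged, and compressed output is used only when it saves bytes.
    """
    if len(body) < min_size:
        return body, False
    compressed = gzip.compress(body, compresslevel=level)
    if len(compressed) < len(body):
        return compressed, True
    return body, False


if __name__ == "__main__":
    payload, was_compressed = compress_if_smaller(b"a" * 1000)
    print(was_compressed, len(payload))
    payload, was_compressed = compress_if_smaller(b"tiny")
    print(was_compressed, len(payload))
```

A caller would add `Content-Encoding: gzip` and recompute `Content-Length` only when the second element of the result is true.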


@@ -1,614 +0,0 @@
from __future__ import annotations
import os
import re
import secrets
import shutil
import sys
import warnings
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Dict, Optional
import psutil
def _calculate_auto_threads() -> int:
    cpu_count = psutil.cpu_count(logical=True) or 4
    return max(1, min(cpu_count * 2, 64))
def _calculate_auto_connection_limit() -> int:
    available_mb = psutil.virtual_memory().available / (1024 * 1024)
    calculated = int(available_mb / 5)
    return max(20, min(calculated, 1000))
def _calculate_auto_backlog(connection_limit: int) -> int:
    return max(64, min(connection_limit * 2, 4096))
def _validate_rate_limit(value: str) -> str:
    pattern = r"^\d+\s+per\s+(second|minute|hour|day)$"
    if not re.match(pattern, value):
        raise ValueError(f"Invalid rate limit format: {value}. Expected format: '200 per minute'")
    return value
if getattr(sys, "frozen", False):
    # Running in a PyInstaller bundle
    PROJECT_ROOT = Path(sys._MEIPASS)
else:
    # Running in a normal Python environment
    PROJECT_ROOT = Path(__file__).resolve().parent.parent
def _prepare_config_file(active_path: Path, legacy_path: Optional[Path] = None) -> Path:
    """Ensure config directories exist and migrate legacy files when possible."""
    active_path = Path(active_path)
    active_path.parent.mkdir(parents=True, exist_ok=True)
    if legacy_path:
        legacy_path = Path(legacy_path)
        if not active_path.exists() and legacy_path.exists():
            legacy_path.parent.mkdir(parents=True, exist_ok=True)
            try:
                shutil.move(str(legacy_path), str(active_path))
            except OSError:
                # Fall back to copy-then-delete when the move fails.
                shutil.copy2(legacy_path, active_path)
                try:
                    legacy_path.unlink(missing_ok=True)
                except OSError:
                    pass
    return active_path
@dataclass
class AppConfig:
storage_root: Path
max_upload_size: int
ui_page_size: int
secret_key: str
iam_config_path: Path
bucket_policy_path: Path
api_base_url: Optional[str]
aws_region: str
aws_service: str
ui_enforce_bucket_policies: bool
log_level: str
log_to_file: bool
log_path: Path
log_max_bytes: int
log_backup_count: int
ratelimit_default: str
ratelimit_storage_uri: str
ratelimit_list_buckets: str
ratelimit_bucket_ops: str
ratelimit_object_ops: str
ratelimit_head_ops: str
cors_origins: list[str]
cors_methods: list[str]
cors_allow_headers: list[str]
cors_expose_headers: list[str]
session_lifetime_days: int
auth_max_attempts: int
auth_lockout_minutes: int
bulk_delete_max_keys: int
secret_ttl_seconds: int
stream_chunk_size: int
multipart_min_part_size: int
bucket_stats_cache_ttl: int
object_cache_ttl: int
encryption_enabled: bool
encryption_master_key_path: Path
kms_enabled: bool
kms_keys_path: Path
default_encryption_algorithm: str
display_timezone: str
lifecycle_enabled: bool
lifecycle_interval_seconds: int
metrics_history_enabled: bool
metrics_history_retention_hours: int
metrics_history_interval_minutes: int
operation_metrics_enabled: bool
operation_metrics_interval_minutes: int
operation_metrics_retention_hours: int
server_threads: int
server_connection_limit: int
server_backlog: int
server_channel_timeout: int
server_threads_auto: bool
server_connection_limit_auto: bool
server_backlog_auto: bool
site_sync_enabled: bool
site_sync_interval_seconds: int
site_sync_batch_size: int
sigv4_timestamp_tolerance_seconds: int
presigned_url_min_expiry_seconds: int
presigned_url_max_expiry_seconds: int
replication_connect_timeout_seconds: int
replication_read_timeout_seconds: int
replication_max_retries: int
replication_streaming_threshold_bytes: int
replication_max_failures_per_bucket: int
site_sync_connect_timeout_seconds: int
site_sync_read_timeout_seconds: int
site_sync_max_retries: int
site_sync_clock_skew_tolerance_seconds: float
object_key_max_length_bytes: int
object_cache_max_size: int
bucket_config_cache_ttl_seconds: float
object_tag_limit: int
encryption_chunk_size_bytes: int
kms_generate_data_key_min_bytes: int
kms_generate_data_key_max_bytes: int
lifecycle_max_history_per_bucket: int
site_id: Optional[str]
site_endpoint: Optional[str]
site_region: str
site_priority: int
ratelimit_admin: str
num_trusted_proxies: int
allowed_redirect_hosts: list[str]
allow_internal_endpoints: bool
@classmethod
def from_env(cls, overrides: Optional[Dict[str, Any]] = None) -> "AppConfig":
overrides = overrides or {}
def _get(name: str, default: Any) -> Any:
return overrides.get(name, os.getenv(name, default))
storage_root = Path(_get("STORAGE_ROOT", PROJECT_ROOT / "data")).resolve()
max_upload_size = int(_get("MAX_UPLOAD_SIZE", 1024 * 1024 * 1024))
ui_page_size = int(_get("UI_PAGE_SIZE", 100))
auth_max_attempts = int(_get("AUTH_MAX_ATTEMPTS", 5))
auth_lockout_minutes = int(_get("AUTH_LOCKOUT_MINUTES", 15))
bulk_delete_max_keys = int(_get("BULK_DELETE_MAX_KEYS", 500))
secret_ttl_seconds = int(_get("SECRET_TTL_SECONDS", 300))
stream_chunk_size = int(_get("STREAM_CHUNK_SIZE", 64 * 1024))
multipart_min_part_size = int(_get("MULTIPART_MIN_PART_SIZE", 5 * 1024 * 1024))
lifecycle_enabled = str(_get("LIFECYCLE_ENABLED", "false")).lower() in ("true", "1", "yes")
lifecycle_interval_seconds = int(_get("LIFECYCLE_INTERVAL_SECONDS", 3600))
default_secret = "dev-secret-key"
secret_key = str(_get("SECRET_KEY", default_secret))
if not secret_key or secret_key == default_secret:
secret_file = storage_root / ".myfsio.sys" / "config" / ".secret"
if secret_file.exists():
secret_key = secret_file.read_text().strip()
else:
generated = secrets.token_urlsafe(32)
if secret_key == default_secret:
warnings.warn("Using insecure default SECRET_KEY. A random value has been generated and persisted; set SECRET_KEY for production", RuntimeWarning)
try:
secret_file.parent.mkdir(parents=True, exist_ok=True)
secret_file.write_text(generated)
try:
os.chmod(secret_file, 0o600)
except OSError:
pass
secret_key = generated
except OSError:
secret_key = generated
iam_env_override = "IAM_CONFIG" in overrides or "IAM_CONFIG" in os.environ
bucket_policy_override = "BUCKET_POLICY_PATH" in overrides or "BUCKET_POLICY_PATH" in os.environ
default_iam_path = storage_root / ".myfsio.sys" / "config" / "iam.json"
default_bucket_policy_path = storage_root / ".myfsio.sys" / "config" / "bucket_policies.json"
iam_config_path = Path(_get("IAM_CONFIG", default_iam_path)).resolve()
bucket_policy_path = Path(_get("BUCKET_POLICY_PATH", default_bucket_policy_path)).resolve()
iam_config_path = _prepare_config_file(
iam_config_path,
legacy_path=None if iam_env_override else storage_root / "iam.json",
)
bucket_policy_path = _prepare_config_file(
bucket_policy_path,
legacy_path=None if bucket_policy_override else storage_root / "bucket_policies.json",
)
api_base_url = _get("API_BASE_URL", None)
if api_base_url:
api_base_url = str(api_base_url)
aws_region = str(_get("AWS_REGION", "us-east-1"))
aws_service = str(_get("AWS_SERVICE", "s3"))
enforce_ui_policies = str(_get("UI_ENFORCE_BUCKET_POLICIES", "0")).lower() in {"1", "true", "yes", "on"}
log_level = str(_get("LOG_LEVEL", "INFO")).upper()
log_to_file = str(_get("LOG_TO_FILE", "1")).lower() in {"1", "true", "yes", "on"}
log_dir = Path(_get("LOG_DIR", storage_root.parent / "logs")).resolve()
log_dir.mkdir(parents=True, exist_ok=True)
log_path = log_dir / str(_get("LOG_FILE", "app.log"))
log_max_bytes = int(_get("LOG_MAX_BYTES", 5 * 1024 * 1024))
log_backup_count = int(_get("LOG_BACKUP_COUNT", 3))
ratelimit_default = _validate_rate_limit(str(_get("RATE_LIMIT_DEFAULT", "200 per minute")))
ratelimit_storage_uri = str(_get("RATE_LIMIT_STORAGE_URI", "memory://"))
ratelimit_list_buckets = _validate_rate_limit(str(_get("RATE_LIMIT_LIST_BUCKETS", "60 per minute")))
ratelimit_bucket_ops = _validate_rate_limit(str(_get("RATE_LIMIT_BUCKET_OPS", "120 per minute")))
ratelimit_object_ops = _validate_rate_limit(str(_get("RATE_LIMIT_OBJECT_OPS", "240 per minute")))
ratelimit_head_ops = _validate_rate_limit(str(_get("RATE_LIMIT_HEAD_OPS", "100 per minute")))
def _csv(value: str, default: list[str]) -> list[str]:
if not value:
return default
parts = [segment.strip() for segment in value.split(",") if segment.strip()]
return parts or default
cors_origins = _csv(str(_get("CORS_ORIGINS", "*")), ["*"])
cors_methods = _csv(str(_get("CORS_METHODS", "GET,PUT,POST,DELETE,OPTIONS,HEAD")), ["GET", "PUT", "POST", "DELETE", "OPTIONS", "HEAD"])
cors_allow_headers = _csv(str(_get("CORS_ALLOW_HEADERS", "*")), ["*"])
cors_expose_headers = _csv(str(_get("CORS_EXPOSE_HEADERS", "*")), ["*"])
session_lifetime_days = int(_get("SESSION_LIFETIME_DAYS", 30))
bucket_stats_cache_ttl = int(_get("BUCKET_STATS_CACHE_TTL", 60))
object_cache_ttl = int(_get("OBJECT_CACHE_TTL", 5))
encryption_enabled = str(_get("ENCRYPTION_ENABLED", "0")).lower() in {"1", "true", "yes", "on"}
encryption_keys_dir = storage_root / ".myfsio.sys" / "keys"
encryption_master_key_path = Path(_get("ENCRYPTION_MASTER_KEY_PATH", encryption_keys_dir / "master.key")).resolve()
kms_enabled = str(_get("KMS_ENABLED", "0")).lower() in {"1", "true", "yes", "on"}
kms_keys_path = Path(_get("KMS_KEYS_PATH", encryption_keys_dir / "kms_keys.json")).resolve()
default_encryption_algorithm = str(_get("DEFAULT_ENCRYPTION_ALGORITHM", "AES256"))
display_timezone = str(_get("DISPLAY_TIMEZONE", "UTC"))
metrics_history_enabled = str(_get("METRICS_HISTORY_ENABLED", "0")).lower() in {"1", "true", "yes", "on"}
metrics_history_retention_hours = int(_get("METRICS_HISTORY_RETENTION_HOURS", 24))
metrics_history_interval_minutes = int(_get("METRICS_HISTORY_INTERVAL_MINUTES", 5))
operation_metrics_enabled = str(_get("OPERATION_METRICS_ENABLED", "0")).lower() in {"1", "true", "yes", "on"}
operation_metrics_interval_minutes = int(_get("OPERATION_METRICS_INTERVAL_MINUTES", 5))
operation_metrics_retention_hours = int(_get("OPERATION_METRICS_RETENTION_HOURS", 24))
_raw_threads = int(_get("SERVER_THREADS", 0))
if _raw_threads == 0:
server_threads = _calculate_auto_threads()
server_threads_auto = True
else:
server_threads = _raw_threads
server_threads_auto = False
_raw_conn_limit = int(_get("SERVER_CONNECTION_LIMIT", 0))
if _raw_conn_limit == 0:
server_connection_limit = _calculate_auto_connection_limit()
server_connection_limit_auto = True
else:
server_connection_limit = _raw_conn_limit
server_connection_limit_auto = False
_raw_backlog = int(_get("SERVER_BACKLOG", 0))
if _raw_backlog == 0:
server_backlog = _calculate_auto_backlog(server_connection_limit)
server_backlog_auto = True
else:
server_backlog = _raw_backlog
server_backlog_auto = False
server_channel_timeout = int(_get("SERVER_CHANNEL_TIMEOUT", 120))
site_sync_enabled = str(_get("SITE_SYNC_ENABLED", "0")).lower() in {"1", "true", "yes", "on"}
site_sync_interval_seconds = int(_get("SITE_SYNC_INTERVAL_SECONDS", 60))
site_sync_batch_size = int(_get("SITE_SYNC_BATCH_SIZE", 100))
sigv4_timestamp_tolerance_seconds = int(_get("SIGV4_TIMESTAMP_TOLERANCE_SECONDS", 900))
presigned_url_min_expiry_seconds = int(_get("PRESIGNED_URL_MIN_EXPIRY_SECONDS", 1))
presigned_url_max_expiry_seconds = int(_get("PRESIGNED_URL_MAX_EXPIRY_SECONDS", 604800))
replication_connect_timeout_seconds = int(_get("REPLICATION_CONNECT_TIMEOUT_SECONDS", 5))
replication_read_timeout_seconds = int(_get("REPLICATION_READ_TIMEOUT_SECONDS", 30))
replication_max_retries = int(_get("REPLICATION_MAX_RETRIES", 2))
replication_streaming_threshold_bytes = int(_get("REPLICATION_STREAMING_THRESHOLD_BYTES", 10 * 1024 * 1024))
replication_max_failures_per_bucket = int(_get("REPLICATION_MAX_FAILURES_PER_BUCKET", 50))
site_sync_connect_timeout_seconds = int(_get("SITE_SYNC_CONNECT_TIMEOUT_SECONDS", 10))
site_sync_read_timeout_seconds = int(_get("SITE_SYNC_READ_TIMEOUT_SECONDS", 120))
site_sync_max_retries = int(_get("SITE_SYNC_MAX_RETRIES", 2))
site_sync_clock_skew_tolerance_seconds = float(_get("SITE_SYNC_CLOCK_SKEW_TOLERANCE_SECONDS", 1.0))
object_key_max_length_bytes = int(_get("OBJECT_KEY_MAX_LENGTH_BYTES", 1024))
object_cache_max_size = int(_get("OBJECT_CACHE_MAX_SIZE", 100))
bucket_config_cache_ttl_seconds = float(_get("BUCKET_CONFIG_CACHE_TTL_SECONDS", 30.0))
object_tag_limit = int(_get("OBJECT_TAG_LIMIT", 50))
encryption_chunk_size_bytes = int(_get("ENCRYPTION_CHUNK_SIZE_BYTES", 64 * 1024))
kms_generate_data_key_min_bytes = int(_get("KMS_GENERATE_DATA_KEY_MIN_BYTES", 1))
kms_generate_data_key_max_bytes = int(_get("KMS_GENERATE_DATA_KEY_MAX_BYTES", 1024))
lifecycle_max_history_per_bucket = int(_get("LIFECYCLE_MAX_HISTORY_PER_BUCKET", 50))
site_id_raw = _get("SITE_ID", None)
site_id = str(site_id_raw).strip() if site_id_raw else None
site_endpoint_raw = _get("SITE_ENDPOINT", None)
site_endpoint = str(site_endpoint_raw).strip() if site_endpoint_raw else None
site_region = str(_get("SITE_REGION", "us-east-1"))
site_priority = int(_get("SITE_PRIORITY", 100))
ratelimit_admin = _validate_rate_limit(str(_get("RATE_LIMIT_ADMIN", "60 per minute")))
num_trusted_proxies = int(_get("NUM_TRUSTED_PROXIES", 0))
allowed_redirect_hosts_raw = _get("ALLOWED_REDIRECT_HOSTS", "")
allowed_redirect_hosts = [h.strip() for h in str(allowed_redirect_hosts_raw).split(",") if h.strip()]
allow_internal_endpoints = str(_get("ALLOW_INTERNAL_ENDPOINTS", "0")).lower() in {"1", "true", "yes", "on"}
return cls(storage_root=storage_root,
max_upload_size=max_upload_size,
ui_page_size=ui_page_size,
secret_key=secret_key,
iam_config_path=iam_config_path,
bucket_policy_path=bucket_policy_path,
api_base_url=api_base_url,
aws_region=aws_region,
aws_service=aws_service,
ui_enforce_bucket_policies=enforce_ui_policies,
log_level=log_level,
log_to_file=log_to_file,
log_path=log_path,
log_max_bytes=log_max_bytes,
log_backup_count=log_backup_count,
ratelimit_default=ratelimit_default,
ratelimit_storage_uri=ratelimit_storage_uri,
ratelimit_list_buckets=ratelimit_list_buckets,
ratelimit_bucket_ops=ratelimit_bucket_ops,
ratelimit_object_ops=ratelimit_object_ops,
ratelimit_head_ops=ratelimit_head_ops,
cors_origins=cors_origins,
cors_methods=cors_methods,
cors_allow_headers=cors_allow_headers,
cors_expose_headers=cors_expose_headers,
session_lifetime_days=session_lifetime_days,
auth_max_attempts=auth_max_attempts,
auth_lockout_minutes=auth_lockout_minutes,
bulk_delete_max_keys=bulk_delete_max_keys,
secret_ttl_seconds=secret_ttl_seconds,
stream_chunk_size=stream_chunk_size,
multipart_min_part_size=multipart_min_part_size,
bucket_stats_cache_ttl=bucket_stats_cache_ttl,
object_cache_ttl=object_cache_ttl,
encryption_enabled=encryption_enabled,
encryption_master_key_path=encryption_master_key_path,
kms_enabled=kms_enabled,
kms_keys_path=kms_keys_path,
default_encryption_algorithm=default_encryption_algorithm,
display_timezone=display_timezone,
lifecycle_enabled=lifecycle_enabled,
lifecycle_interval_seconds=lifecycle_interval_seconds,
metrics_history_enabled=metrics_history_enabled,
metrics_history_retention_hours=metrics_history_retention_hours,
metrics_history_interval_minutes=metrics_history_interval_minutes,
operation_metrics_enabled=operation_metrics_enabled,
operation_metrics_interval_minutes=operation_metrics_interval_minutes,
operation_metrics_retention_hours=operation_metrics_retention_hours,
server_threads=server_threads,
server_connection_limit=server_connection_limit,
server_backlog=server_backlog,
server_channel_timeout=server_channel_timeout,
server_threads_auto=server_threads_auto,
server_connection_limit_auto=server_connection_limit_auto,
server_backlog_auto=server_backlog_auto,
site_sync_enabled=site_sync_enabled,
site_sync_interval_seconds=site_sync_interval_seconds,
site_sync_batch_size=site_sync_batch_size,
sigv4_timestamp_tolerance_seconds=sigv4_timestamp_tolerance_seconds,
presigned_url_min_expiry_seconds=presigned_url_min_expiry_seconds,
presigned_url_max_expiry_seconds=presigned_url_max_expiry_seconds,
replication_connect_timeout_seconds=replication_connect_timeout_seconds,
replication_read_timeout_seconds=replication_read_timeout_seconds,
replication_max_retries=replication_max_retries,
replication_streaming_threshold_bytes=replication_streaming_threshold_bytes,
replication_max_failures_per_bucket=replication_max_failures_per_bucket,
site_sync_connect_timeout_seconds=site_sync_connect_timeout_seconds,
site_sync_read_timeout_seconds=site_sync_read_timeout_seconds,
site_sync_max_retries=site_sync_max_retries,
site_sync_clock_skew_tolerance_seconds=site_sync_clock_skew_tolerance_seconds,
object_key_max_length_bytes=object_key_max_length_bytes,
object_cache_max_size=object_cache_max_size,
bucket_config_cache_ttl_seconds=bucket_config_cache_ttl_seconds,
object_tag_limit=object_tag_limit,
encryption_chunk_size_bytes=encryption_chunk_size_bytes,
kms_generate_data_key_min_bytes=kms_generate_data_key_min_bytes,
kms_generate_data_key_max_bytes=kms_generate_data_key_max_bytes,
lifecycle_max_history_per_bucket=lifecycle_max_history_per_bucket,
site_id=site_id,
site_endpoint=site_endpoint,
site_region=site_region,
site_priority=site_priority,
ratelimit_admin=ratelimit_admin,
num_trusted_proxies=num_trusted_proxies,
allowed_redirect_hosts=allowed_redirect_hosts,
allow_internal_endpoints=allow_internal_endpoints)
def validate_and_report(self) -> list[str]:
"""Validate configuration and return a list of warnings/issues.
Call this at startup to detect potential misconfigurations before
the application fully commits to running.
"""
issues = []
try:
test_file = self.storage_root / ".write_test"
test_file.touch()
test_file.unlink()
except (OSError, PermissionError) as e:
issues.append(f"CRITICAL: STORAGE_ROOT '{self.storage_root}' is not writable: {e}")
storage_str = str(self.storage_root).lower()
if "/tmp" in storage_str or "\\temp" in storage_str or "appdata\\local\\temp" in storage_str:
issues.append(f"WARNING: STORAGE_ROOT '{self.storage_root}' appears to be a temporary directory. Data may be lost on reboot!")
try:
self.iam_config_path.relative_to(self.storage_root)
except ValueError:
issues.append(f"WARNING: IAM_CONFIG '{self.iam_config_path}' is outside STORAGE_ROOT '{self.storage_root}'. Consider setting IAM_CONFIG explicitly or ensuring paths are aligned.")
try:
self.bucket_policy_path.relative_to(self.storage_root)
except ValueError:
issues.append(f"WARNING: BUCKET_POLICY_PATH '{self.bucket_policy_path}' is outside STORAGE_ROOT '{self.storage_root}'. Consider setting BUCKET_POLICY_PATH explicitly.")
try:
self.log_path.parent.mkdir(parents=True, exist_ok=True)
test_log = self.log_path.parent / ".write_test"
test_log.touch()
test_log.unlink()
except (OSError, PermissionError) as e:
issues.append(f"WARNING: Log directory '{self.log_path.parent}' is not writable: {e}")
log_str = str(self.log_path).lower()
if "/tmp" in log_str or "\\temp" in log_str or "appdata\\local\\temp" in log_str:
issues.append(f"WARNING: LOG_DIR '{self.log_path.parent}' appears to be a temporary directory. Logs may be lost on reboot!")
if self.encryption_enabled:
try:
self.encryption_master_key_path.relative_to(self.storage_root)
except ValueError:
issues.append(f"WARNING: ENCRYPTION_MASTER_KEY_PATH '{self.encryption_master_key_path}' is outside STORAGE_ROOT. Ensure proper backup procedures.")
if self.kms_enabled:
try:
self.kms_keys_path.relative_to(self.storage_root)
except ValueError:
issues.append(f"WARNING: KMS_KEYS_PATH '{self.kms_keys_path}' is outside STORAGE_ROOT. Ensure proper backup procedures.")
if self.secret_key == "dev-secret-key":
issues.append("WARNING: Using default SECRET_KEY. Set SECRET_KEY environment variable for production.")
if "*" in self.cors_origins:
issues.append("INFO: CORS_ORIGINS is set to '*'. Consider restricting to specific domains in production.")
if not (1 <= self.server_threads <= 64):
issues.append(f"CRITICAL: SERVER_THREADS={self.server_threads} is outside valid range (1-64). Server cannot start.")
if not (10 <= self.server_connection_limit <= 1000):
issues.append(f"CRITICAL: SERVER_CONNECTION_LIMIT={self.server_connection_limit} is outside valid range (10-1000). Server cannot start.")
if not (64 <= self.server_backlog <= 4096):
issues.append(f"CRITICAL: SERVER_BACKLOG={self.server_backlog} is outside valid range (64-4096). Server cannot start.")
if not (10 <= self.server_channel_timeout <= 300):
issues.append(f"CRITICAL: SERVER_CHANNEL_TIMEOUT={self.server_channel_timeout} is outside valid range (10-300). Server cannot start.")
if sys.platform != "win32":
try:
import resource
soft_limit, _ = resource.getrlimit(resource.RLIMIT_NOFILE)
threshold = int(soft_limit * 0.8)
if self.server_connection_limit > threshold:
issues.append(f"WARNING: SERVER_CONNECTION_LIMIT={self.server_connection_limit} exceeds 80% of system file descriptor limit (soft={soft_limit}). Consider running 'ulimit -n {self.server_connection_limit + 100}'.")
except (ImportError, OSError):
pass
try:
import psutil
available_mb = psutil.virtual_memory().available / (1024 * 1024)
estimated_mb = self.server_threads * 50
if estimated_mb > available_mb * 0.5:
issues.append(f"WARNING: SERVER_THREADS={self.server_threads} may require ~{estimated_mb}MB memory, exceeding 50% of available RAM ({int(available_mb)}MB).")
except ImportError:
pass
return issues
def print_startup_summary(self) -> None:
"""Print a summary of the configuration at startup."""
print("\n" + "=" * 60)
print("MyFSIO Configuration Summary")
print("=" * 60)
print(f" STORAGE_ROOT: {self.storage_root}")
print(f" IAM_CONFIG: {self.iam_config_path}")
print(f" BUCKET_POLICY: {self.bucket_policy_path}")
print(f" LOG_PATH: {self.log_path}")
if self.api_base_url:
print(f" API_BASE_URL: {self.api_base_url}")
if self.encryption_enabled:
print(f" ENCRYPTION: Enabled (Master key: {self.encryption_master_key_path})")
if self.kms_enabled:
print(f" KMS: Enabled (Keys: {self.kms_keys_path})")
def _auto(flag: bool) -> str:
return " (auto)" if flag else ""
print(f" SERVER_THREADS: {self.server_threads}{_auto(self.server_threads_auto)}")
print(f" CONNECTION_LIMIT: {self.server_connection_limit}{_auto(self.server_connection_limit_auto)}")
print(f" BACKLOG: {self.server_backlog}{_auto(self.server_backlog_auto)}")
print(f" CHANNEL_TIMEOUT: {self.server_channel_timeout}s")
print("=" * 60)
issues = self.validate_and_report()
if issues:
print("\nConfiguration Issues Detected:")
for issue in issues:
print(f"{issue}")
print()
else:
print(" ✓ Configuration validated successfully\n")
def to_flask_config(self) -> Dict[str, Any]:
return {
"STORAGE_ROOT": str(self.storage_root),
"MAX_CONTENT_LENGTH": self.max_upload_size,
"UI_PAGE_SIZE": self.ui_page_size,
"SECRET_KEY": self.secret_key,
"IAM_CONFIG": str(self.iam_config_path),
"BUCKET_POLICY_PATH": str(self.bucket_policy_path),
"API_BASE_URL": self.api_base_url,
"AWS_REGION": self.aws_region,
"AWS_SERVICE": self.aws_service,
"UI_ENFORCE_BUCKET_POLICIES": self.ui_enforce_bucket_policies,
"AUTH_MAX_ATTEMPTS": self.auth_max_attempts,
"AUTH_LOCKOUT_MINUTES": self.auth_lockout_minutes,
"BULK_DELETE_MAX_KEYS": self.bulk_delete_max_keys,
"SECRET_TTL_SECONDS": self.secret_ttl_seconds,
"STREAM_CHUNK_SIZE": self.stream_chunk_size,
"MULTIPART_MIN_PART_SIZE": self.multipart_min_part_size,
"BUCKET_STATS_CACHE_TTL": self.bucket_stats_cache_ttl,
"OBJECT_CACHE_TTL": self.object_cache_ttl,
"LOG_LEVEL": self.log_level,
"LOG_TO_FILE": self.log_to_file,
"LOG_FILE": str(self.log_path),
"LOG_MAX_BYTES": self.log_max_bytes,
"LOG_BACKUP_COUNT": self.log_backup_count,
"RATELIMIT_DEFAULT": self.ratelimit_default,
"RATELIMIT_STORAGE_URI": self.ratelimit_storage_uri,
"RATELIMIT_LIST_BUCKETS": self.ratelimit_list_buckets,
"RATELIMIT_BUCKET_OPS": self.ratelimit_bucket_ops,
"RATELIMIT_OBJECT_OPS": self.ratelimit_object_ops,
"RATELIMIT_HEAD_OPS": self.ratelimit_head_ops,
"CORS_ORIGINS": self.cors_origins,
"CORS_METHODS": self.cors_methods,
"CORS_ALLOW_HEADERS": self.cors_allow_headers,
"CORS_EXPOSE_HEADERS": self.cors_expose_headers,
"SESSION_LIFETIME_DAYS": self.session_lifetime_days,
"ENCRYPTION_ENABLED": self.encryption_enabled,
"ENCRYPTION_MASTER_KEY_PATH": str(self.encryption_master_key_path),
"KMS_ENABLED": self.kms_enabled,
"KMS_KEYS_PATH": str(self.kms_keys_path),
"DEFAULT_ENCRYPTION_ALGORITHM": self.default_encryption_algorithm,
"DISPLAY_TIMEZONE": self.display_timezone,
"LIFECYCLE_ENABLED": self.lifecycle_enabled,
"LIFECYCLE_INTERVAL_SECONDS": self.lifecycle_interval_seconds,
"METRICS_HISTORY_ENABLED": self.metrics_history_enabled,
"METRICS_HISTORY_RETENTION_HOURS": self.metrics_history_retention_hours,
"METRICS_HISTORY_INTERVAL_MINUTES": self.metrics_history_interval_minutes,
"OPERATION_METRICS_ENABLED": self.operation_metrics_enabled,
"OPERATION_METRICS_INTERVAL_MINUTES": self.operation_metrics_interval_minutes,
"OPERATION_METRICS_RETENTION_HOURS": self.operation_metrics_retention_hours,
"SERVER_THREADS": self.server_threads,
"SERVER_CONNECTION_LIMIT": self.server_connection_limit,
"SERVER_BACKLOG": self.server_backlog,
"SERVER_CHANNEL_TIMEOUT": self.server_channel_timeout,
"SITE_SYNC_ENABLED": self.site_sync_enabled,
"SITE_SYNC_INTERVAL_SECONDS": self.site_sync_interval_seconds,
"SITE_SYNC_BATCH_SIZE": self.site_sync_batch_size,
"SIGV4_TIMESTAMP_TOLERANCE_SECONDS": self.sigv4_timestamp_tolerance_seconds,
"PRESIGNED_URL_MIN_EXPIRY_SECONDS": self.presigned_url_min_expiry_seconds,
"PRESIGNED_URL_MAX_EXPIRY_SECONDS": self.presigned_url_max_expiry_seconds,
"REPLICATION_CONNECT_TIMEOUT_SECONDS": self.replication_connect_timeout_seconds,
"REPLICATION_READ_TIMEOUT_SECONDS": self.replication_read_timeout_seconds,
"REPLICATION_MAX_RETRIES": self.replication_max_retries,
"REPLICATION_STREAMING_THRESHOLD_BYTES": self.replication_streaming_threshold_bytes,
"REPLICATION_MAX_FAILURES_PER_BUCKET": self.replication_max_failures_per_bucket,
"SITE_SYNC_CONNECT_TIMEOUT_SECONDS": self.site_sync_connect_timeout_seconds,
"SITE_SYNC_READ_TIMEOUT_SECONDS": self.site_sync_read_timeout_seconds,
"SITE_SYNC_MAX_RETRIES": self.site_sync_max_retries,
"SITE_SYNC_CLOCK_SKEW_TOLERANCE_SECONDS": self.site_sync_clock_skew_tolerance_seconds,
"OBJECT_KEY_MAX_LENGTH_BYTES": self.object_key_max_length_bytes,
"OBJECT_CACHE_MAX_SIZE": self.object_cache_max_size,
"BUCKET_CONFIG_CACHE_TTL_SECONDS": self.bucket_config_cache_ttl_seconds,
"OBJECT_TAG_LIMIT": self.object_tag_limit,
"ENCRYPTION_CHUNK_SIZE_BYTES": self.encryption_chunk_size_bytes,
"KMS_GENERATE_DATA_KEY_MIN_BYTES": self.kms_generate_data_key_min_bytes,
"KMS_GENERATE_DATA_KEY_MAX_BYTES": self.kms_generate_data_key_max_bytes,
"LIFECYCLE_MAX_HISTORY_PER_BUCKET": self.lifecycle_max_history_per_bucket,
"SITE_ID": self.site_id,
"SITE_ENDPOINT": self.site_endpoint,
"SITE_REGION": self.site_region,
"SITE_PRIORITY": self.site_priority,
"RATE_LIMIT_ADMIN": self.ratelimit_admin,
"NUM_TRUSTED_PROXIES": self.num_trusted_proxies,
"ALLOWED_REDIRECT_HOSTS": self.allowed_redirect_hosts,
"ALLOW_INTERNAL_ENDPOINTS": self.allow_internal_endpoints,
}
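
The removed `AppConfig.from_env` resolves each setting from explicit overrides first, then the environment, then a default, and treats `0` for the server sizing options as "pick a value automatically". The sketch below isolates that logic so it can be run without `psutil`. The CPU count and available memory are passed in as plain arguments, which is an assumption made for testability. The names `lookup`, `as_flag` and `resolve_threads` are hypothetical and do not appear in the original file.

```python
from __future__ import annotations

import os
from typing import Any, Mapping, Optional, Tuple

TRUTHY = {"1", "true", "yes", "on"}


def lookup(name: str, default: Any,
           overrides: Optional[Mapping[str, Any]] = None,
           environ: Optional[Mapping[str, str]] = None) -> Any:
    """Resolve a setting: explicit override, then environment, then default."""
    overrides = overrides or {}
    environ = os.environ if environ is None else environ
    return overrides.get(name, environ.get(name, default))


def as_flag(value: Any) -> bool:
    """Treat booleans and strings such as '1', 'true', 'yes', 'on' as enabled."""
    return str(value).strip().lower() in TRUTHY


def auto_threads(cpu_count: int) -> int:
    """Two threads per logical CPU, clamped to the range 1-64."""
    return max(1, min(cpu_count * 2, 64))


def auto_connection_limit(available_mb: float) -> int:
    """One connection per 5 MB of available memory, clamped to 20-1000."""
    return max(20, min(int(available_mb / 5), 1000))


def auto_backlog(connection_limit: int) -> int:
    """Twice the connection limit, clamped to 64-4096."""
    return max(64, min(connection_limit * 2, 4096))


def resolve_threads(overrides: Mapping[str, Any], environ: Mapping[str, str],
                    cpu_count: int) -> Tuple[int, bool]:
    """Return (thread_count, was_auto); a raw value of 0 selects auto sizing."""
    raw = int(lookup("SERVER_THREADS", 0, overrides, environ))
    if raw == 0:
        return auto_threads(cpu_count), True
    return raw, False
```

Wrapping every flag in `str(...)` before lowercasing matters because an override may be passed as a real `bool` rather than a string.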


@@ -1,60 +0,0 @@
from __future__ import annotations

import json
from dataclasses import asdict, dataclass
from pathlib import Path
from typing import Dict, List, Optional

from .config import AppConfig


@dataclass
class RemoteConnection:
    id: str
    name: str
    endpoint_url: str
    access_key: str
    secret_key: str
    region: str = "us-east-1"


class ConnectionStore:
    """JSON-file-backed registry of remote connections, keyed by connection id."""

    def __init__(self, config_path: Path) -> None:
        self.config_path = config_path
        self._connections: Dict[str, RemoteConnection] = {}
        self.reload()

    def reload(self) -> None:
        if not self.config_path.exists():
            self._connections = {}
            return
        # Load into a fresh dict so entries removed from the file do not linger.
        loaded: Dict[str, RemoteConnection] = {}
        try:
            with open(self.config_path, "r") as f:
                data = json.load(f)
            for item in data:
                conn = RemoteConnection(**item)
                loaded[conn.id] = conn
        except (OSError, json.JSONDecodeError, TypeError):
            # Unreadable file, invalid JSON, or entries with unexpected fields.
            loaded = {}
        self._connections = loaded

    def save(self) -> None:
        self.config_path.parent.mkdir(parents=True, exist_ok=True)
        data = [asdict(conn) for conn in self._connections.values()]
        with open(self.config_path, "w") as f:
            json.dump(data, f, indent=2)

    def list(self) -> List[RemoteConnection]:
        return list(self._connections.values())

    def get(self, connection_id: str) -> Optional[RemoteConnection]:
        return self._connections.get(connection_id)

    def add(self, connection: RemoteConnection) -> None:
        self._connections[connection.id] = connection
        self.save()

    def delete(self, connection_id: str) -> None:
        if connection_id in self._connections:
            del self._connections[connection_id]
            self.save()
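
The connection store above persists its entries as a JSON list of objects, one per connection, and rebuilds the in-memory mapping from that list. The following sketch shows the round trip as two standalone functions. `save_connections` and `load_connections` are hypothetical names, and treating a malformed file as an empty store is carried over from the removed `reload()` behaviour rather than being a recommendation.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path
from typing import Dict, Iterable


@dataclass
class RemoteConnection:
    id: str
    name: str
    endpoint_url: str
    access_key: str
    secret_key: str
    region: str = "us-east-1"


def save_connections(path: Path, connections: Iterable[RemoteConnection]) -> None:
    """Write the connections as a JSON list of plain dictionaries."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps([asdict(c) for c in connections], indent=2))


def load_connections(path: Path) -> Dict[str, RemoteConnection]:
    """Read the JSON list back into an id -> connection mapping."""
    if not path.exists():
        return {}
    try:
        items = json.loads(path.read_text())
        return {item["id"]: RemoteConnection(**item) for item in items}
    except (OSError, json.JSONDecodeError, TypeError, KeyError):
        # Unreadable or malformed files yield an empty store.
        return {}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        store_path = Path(tmp) / "config" / "connections.json"
        # Placeholder endpoint and credentials, for illustration only.
        conn = RemoteConnection("c1", "backup", "http://localhost:9000", "AK", "SK")
        save_connections(store_path, [conn])
        print(load_connections(store_path))
```

Because the secret key is written in plain text, the file's location and permissions deserve the same care as any other credential store.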


@@ -1,278 +0,0 @@
from __future__ import annotations
import io
from pathlib import Path
from typing import Any, BinaryIO, Dict, Optional
from .encryption import EncryptionManager, EncryptionMetadata, EncryptionError
from .storage import ObjectStorage, ObjectMeta, StorageError
class EncryptedObjectStorage:
"""Object storage with transparent server-side encryption.
This class wraps ObjectStorage and provides transparent encryption/decryption
of objects based on bucket encryption configuration.
Encryption is applied when:
1. Bucket has default encryption configured (SSE-S3 or SSE-KMS)
2. Client explicitly requests encryption via headers
The encryption metadata is stored alongside object metadata.
"""
STREAMING_THRESHOLD = 64 * 1024
def __init__(self, storage: ObjectStorage, encryption_manager: EncryptionManager):
self.storage = storage
self.encryption = encryption_manager
@property
def root(self) -> Path:
return self.storage.root
def _should_encrypt(self, bucket_name: str,
server_side_encryption: str | None = None) -> tuple[bool, str, str | None]:
"""Determine if object should be encrypted.
Returns:
Tuple of (should_encrypt, algorithm, kms_key_id)
"""
if not self.encryption.enabled:
return False, "", None
if server_side_encryption:
if server_side_encryption == "AES256":
return True, "AES256", None
elif server_side_encryption.startswith("aws:kms"):
parts = server_side_encryption.split(":")
kms_key_id = parts[2] if len(parts) > 2 else None
return True, "aws:kms", kms_key_id
try:
encryption_config = self.storage.get_bucket_encryption(bucket_name)
if encryption_config and encryption_config.get("Rules"):
rule = encryption_config["Rules"][0]
# AWS format: Rules[].ApplyServerSideEncryptionByDefault.SSEAlgorithm
sse_default = rule.get("ApplyServerSideEncryptionByDefault", {})
algorithm = sse_default.get("SSEAlgorithm", "AES256")
kms_key_id = sse_default.get("KMSMasterKeyID")
return True, algorithm, kms_key_id
except StorageError:
pass
return False, "", None
def _is_encrypted(self, metadata: Dict[str, str]) -> bool:
"""Check if object is encrypted based on its metadata."""
return "x-amz-server-side-encryption" in metadata
def put_object(
self,
bucket_name: str,
object_key: str,
stream: BinaryIO,
*,
metadata: Optional[Dict[str, str]] = None,
server_side_encryption: Optional[str] = None,
kms_key_id: Optional[str] = None,
) -> ObjectMeta:
"""Store an object, optionally with encryption.
Args:
bucket_name: Name of the bucket
object_key: Key for the object
stream: Binary stream of object data
metadata: Optional user metadata
server_side_encryption: Encryption algorithm ("AES256" or "aws:kms")
kms_key_id: KMS key ID (for aws:kms encryption)
Returns:
ObjectMeta with object information
Performance: Uses streaming encryption for large files to reduce memory usage.
"""
should_encrypt, algorithm, detected_kms_key = self._should_encrypt(
bucket_name, server_side_encryption
)
# NOTE: kms_key_id is resolved here but is not forwarded to encrypt_stream() below.
if kms_key_id is None:
kms_key_id = detected_kms_key
if should_encrypt:
try:
# Performance: Use streaming encryption to avoid loading entire file into memory
encrypted_stream, enc_metadata = self.encryption.encrypt_stream(
stream,
algorithm=algorithm,
context={"bucket": bucket_name, "key": object_key},
)
combined_metadata = metadata.copy() if metadata else {}
combined_metadata.update(enc_metadata.to_dict())
result = self.storage.put_object(
bucket_name,
object_key,
encrypted_stream,
metadata=combined_metadata,
)
result.metadata = combined_metadata
return result
except EncryptionError as exc:
raise StorageError(f"Encryption failed: {exc}") from exc
else:
return self.storage.put_object(
bucket_name,
object_key,
stream,
metadata=metadata,
)
def get_object_data(self, bucket_name: str, object_key: str) -> tuple[bytes, Dict[str, str]]:
"""Get object data, decrypting if necessary.
Returns:
Tuple of (data, metadata)
Note: Decryption goes through the streaming API, but the full plaintext is read into memory before returning.
"""
path = self.storage.get_object_path(bucket_name, object_key)
metadata = self.storage.get_object_metadata(bucket_name, object_key)
enc_metadata = EncryptionMetadata.from_dict(metadata)
if enc_metadata:
try:
# Decrypt via the streaming API, then read the whole plaintext into memory
with path.open("rb") as f:
decrypted_stream = self.encryption.decrypt_stream(f, enc_metadata)
data = decrypted_stream.read()
except EncryptionError as exc:
raise StorageError(f"Decryption failed: {exc}") from exc
else:
with path.open("rb") as f:
data = f.read()
clean_metadata = {
k: v for k, v in metadata.items()
if not k.startswith("x-amz-encryption")
and k != "x-amz-encrypted-data-key"
}
return data, clean_metadata
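def get_object_stream(self, bucket_name: str, object_key: str) -> tuple[BinaryIO, Dict[str, str], int]:
"""Get object as a stream, decrypting if necessary.
The object is fully read (and decrypted) into memory first; the returned
stream is an in-memory buffer, not a lazy reader over the file.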
Returns:
Tuple of (stream, metadata, original_size)
"""
data, metadata = self.get_object_data(bucket_name, object_key)
return io.BytesIO(data), metadata, len(data)
def list_buckets(self):
return self.storage.list_buckets()
def bucket_exists(self, bucket_name: str) -> bool:
return self.storage.bucket_exists(bucket_name)
def create_bucket(self, bucket_name: str) -> None:
return self.storage.create_bucket(bucket_name)
def delete_bucket(self, bucket_name: str) -> None:
return self.storage.delete_bucket(bucket_name)
def bucket_stats(self, bucket_name: str, cache_ttl: int = 60):
return self.storage.bucket_stats(bucket_name, cache_ttl)
def list_objects(self, bucket_name: str, **kwargs):
return self.storage.list_objects(bucket_name, **kwargs)
def list_objects_all(self, bucket_name: str):
return self.storage.list_objects_all(bucket_name)
def get_object_path(self, bucket_name: str, object_key: str):
return self.storage.get_object_path(bucket_name, object_key)
def get_object_metadata(self, bucket_name: str, object_key: str):
return self.storage.get_object_metadata(bucket_name, object_key)
def delete_object(self, bucket_name: str, object_key: str) -> None:
return self.storage.delete_object(bucket_name, object_key)
def purge_object(self, bucket_name: str, object_key: str) -> None:
return self.storage.purge_object(bucket_name, object_key)
def is_versioning_enabled(self, bucket_name: str) -> bool:
return self.storage.is_versioning_enabled(bucket_name)
def set_bucket_versioning(self, bucket_name: str, enabled: bool) -> None:
return self.storage.set_bucket_versioning(bucket_name, enabled)
def get_bucket_tags(self, bucket_name: str):
return self.storage.get_bucket_tags(bucket_name)
def set_bucket_tags(self, bucket_name: str, tags):
return self.storage.set_bucket_tags(bucket_name, tags)
def get_bucket_cors(self, bucket_name: str):
return self.storage.get_bucket_cors(bucket_name)
def set_bucket_cors(self, bucket_name: str, rules):
return self.storage.set_bucket_cors(bucket_name, rules)
def get_bucket_encryption(self, bucket_name: str):
return self.storage.get_bucket_encryption(bucket_name)
def set_bucket_encryption(self, bucket_name: str, config_payload):
return self.storage.set_bucket_encryption(bucket_name, config_payload)
def get_bucket_lifecycle(self, bucket_name: str):
return self.storage.get_bucket_lifecycle(bucket_name)
def set_bucket_lifecycle(self, bucket_name: str, rules):
return self.storage.set_bucket_lifecycle(bucket_name, rules)
def get_object_tags(self, bucket_name: str, object_key: str):
return self.storage.get_object_tags(bucket_name, object_key)
def set_object_tags(self, bucket_name: str, object_key: str, tags):
return self.storage.set_object_tags(bucket_name, object_key, tags)
def delete_object_tags(self, bucket_name: str, object_key: str):
return self.storage.delete_object_tags(bucket_name, object_key)
def list_object_versions(self, bucket_name: str, object_key: str):
return self.storage.list_object_versions(bucket_name, object_key)
def restore_object_version(self, bucket_name: str, object_key: str, version_id: str):
return self.storage.restore_object_version(bucket_name, object_key, version_id)
def list_orphaned_objects(self, bucket_name: str):
return self.storage.list_orphaned_objects(bucket_name)
def initiate_multipart_upload(self, bucket_name: str, object_key: str, *, metadata=None) -> str:
return self.storage.initiate_multipart_upload(bucket_name, object_key, metadata=metadata)
def upload_multipart_part(self, bucket_name: str, upload_id: str, part_number: int, stream: BinaryIO) -> str:
return self.storage.upload_multipart_part(bucket_name, upload_id, part_number, stream)
def complete_multipart_upload(self, bucket_name: str, upload_id: str, ordered_parts):
return self.storage.complete_multipart_upload(bucket_name, upload_id, ordered_parts)
def abort_multipart_upload(self, bucket_name: str, upload_id: str) -> None:
return self.storage.abort_multipart_upload(bucket_name, upload_id)
def list_multipart_parts(self, bucket_name: str, upload_id: str):
return self.storage.list_multipart_parts(bucket_name, upload_id)
def get_bucket_quota(self, bucket_name: str):
return self.storage.get_bucket_quota(bucket_name)
def set_bucket_quota(self, bucket_name: str, *, max_bytes=None, max_objects=None):
return self.storage.set_bucket_quota(bucket_name, max_bytes=max_bytes, max_objects=max_objects)
def _compute_etag(self, path: Path) -> str:
return self.storage._compute_etag(path)
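The metadata filtering at the end of `get_object_data` above can be illustrated in isolation. The following is a minimal sketch, not the project's code: the helper name `strip_internal_encryption_metadata` and the sample values are assumptions made for illustration, while the prefix and key being filtered are taken from the method itself.

```python
from typing import Dict

# Hypothetical constants; the values mirror the filter in get_object_data.
INTERNAL_PREFIX = "x-amz-encryption"
INTERNAL_KEYS = {"x-amz-encrypted-data-key"}


def strip_internal_encryption_metadata(metadata: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of metadata without server-internal encryption fields."""
    return {
        key: value
        for key, value in metadata.items()
        if not key.startswith(INTERNAL_PREFIX) and key not in INTERNAL_KEYS
    }


# Sample stored metadata (made-up values) as written by an encrypted put_object.
stored = {
    "content-type": "text/plain",
    "x-amz-server-side-encryption": "AES256",
    "x-amz-encryption-key-id": "local",
    "x-amz-encryption-nonce": "AAAA",
    "x-amz-encrypted-data-key": "BBBB",
}
visible = strip_internal_encryption_metadata(stored)
```

Note that `x-amz-server-side-encryption` survives the filter because it does not start with `x-amz-encryption`, so callers can still see that the object is encrypted while the nonce and wrapped data key stay hidden.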


@@ -1,579 +0,0 @@
from __future__ import annotations
import base64
import io
import json
import logging
import os
import secrets
import subprocess
import sys
from dataclasses import dataclass
from pathlib import Path
from typing import Any, BinaryIO, Dict, Generator, Optional
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
from cryptography.hazmat.primitives.kdf.hkdf import HKDF
from cryptography.hazmat.primitives import hashes
if sys.platform != "win32":
import fcntl
logger = logging.getLogger(__name__)
def _set_secure_file_permissions(file_path: Path) -> None:
"""Set restrictive file permissions (owner read/write only)."""
if sys.platform == "win32":
try:
username = os.environ.get("USERNAME", "")
if username:
subprocess.run(
["icacls", str(file_path), "/inheritance:r",
"/grant:r", f"{username}:F"],
check=True, capture_output=True
)
else:
logger.warning("Could not set secure permissions on %s: USERNAME not set", file_path)
except (subprocess.SubprocessError, OSError) as exc:
logger.warning("Failed to set secure permissions on %s: %s", file_path, exc)
else:
os.chmod(file_path, 0o600)
class EncryptionError(Exception):
"""Raised when encryption/decryption fails."""
@dataclass
class EncryptionResult:
"""Result of encrypting data."""
ciphertext: bytes
nonce: bytes
key_id: str
encrypted_data_key: bytes
@dataclass
class EncryptionMetadata:
"""Metadata stored with encrypted objects."""
algorithm: str
key_id: str
nonce: bytes
encrypted_data_key: bytes
def to_dict(self) -> Dict[str, str]:
return {
"x-amz-server-side-encryption": self.algorithm,
"x-amz-encryption-key-id": self.key_id,
"x-amz-encryption-nonce": base64.b64encode(self.nonce).decode(),
"x-amz-encrypted-data-key": base64.b64encode(self.encrypted_data_key).decode(),
}
@classmethod
def from_dict(cls, data: Dict[str, str]) -> Optional["EncryptionMetadata"]:
algorithm = data.get("x-amz-server-side-encryption")
if not algorithm:
return None
try:
return cls(
algorithm=algorithm,
key_id=data.get("x-amz-encryption-key-id", "local"),
nonce=base64.b64decode(data.get("x-amz-encryption-nonce", "")),
encrypted_data_key=base64.b64decode(data.get("x-amz-encrypted-data-key", "")),
)
except Exception:
return None
class EncryptionProvider:
"""Base class for encryption providers."""
def encrypt(self, plaintext: bytes, context: Dict[str, str] | None = None) -> EncryptionResult:
raise NotImplementedError
def decrypt(self, ciphertext: bytes, nonce: bytes, encrypted_data_key: bytes,
key_id: str, context: Dict[str, str] | None = None) -> bytes:
raise NotImplementedError
def generate_data_key(self) -> tuple[bytes, bytes]:
"""Generate a data key and its encrypted form.
Returns:
Tuple of (plaintext_key, encrypted_key)
"""
raise NotImplementedError
def decrypt_data_key(self, encrypted_data_key: bytes, key_id: str | None = None) -> bytes:
"""Decrypt an encrypted data key.
Args:
encrypted_data_key: The encrypted data key bytes
key_id: Optional key identifier (used by KMS providers)
Returns:
The decrypted data key
"""
raise NotImplementedError
class LocalKeyEncryption(EncryptionProvider):
"""SSE-S3 style encryption using a local master key.
Uses envelope encryption:
1. Generate a unique data key for each object
2. Encrypt the data with the data key (AES-256-GCM)
3. Encrypt the data key with the master key
4. Store the encrypted data key alongside the ciphertext
"""
KEY_ID = "local"
def __init__(self, master_key_path: Path):
self.master_key_path = master_key_path
self._master_key: bytes | None = None
@property
def master_key(self) -> bytes:
if self._master_key is None:
self._master_key = self._load_or_create_master_key()
return self._master_key
def _load_or_create_master_key(self) -> bytes:
"""Load master key from file or generate a new one (with file locking)."""
lock_path = self.master_key_path.with_suffix(".lock")
lock_path.parent.mkdir(parents=True, exist_ok=True)
try:
with open(lock_path, "w") as lock_file:
if sys.platform == "win32":
import msvcrt
msvcrt.locking(lock_file.fileno(), msvcrt.LK_LOCK, 1)
else:
fcntl.flock(lock_file.fileno(), fcntl.LOCK_EX)
try:
if self.master_key_path.exists():
try:
return base64.b64decode(self.master_key_path.read_text().strip())
except Exception as exc:
raise EncryptionError(f"Failed to load master key: {exc}") from exc
key = secrets.token_bytes(32)
try:
self.master_key_path.write_text(base64.b64encode(key).decode())
_set_secure_file_permissions(self.master_key_path)
except OSError as exc:
raise EncryptionError(f"Failed to save master key: {exc}") from exc
return key
finally:
if sys.platform == "win32":
import msvcrt
msvcrt.locking(lock_file.fileno(), msvcrt.LK_UNLCK, 1)
else:
fcntl.flock(lock_file.fileno(), fcntl.LOCK_UN)
except OSError as exc:
raise EncryptionError(f"Failed to acquire lock for master key: {exc}") from exc
DATA_KEY_AAD = b'{"purpose":"data_key","version":1}'
def _encrypt_data_key(self, data_key: bytes) -> bytes:
"""Encrypt the data key with the master key."""
aesgcm = AESGCM(self.master_key)
nonce = secrets.token_bytes(12)
encrypted = aesgcm.encrypt(nonce, data_key, self.DATA_KEY_AAD)
return nonce + encrypted
def _decrypt_data_key(self, encrypted_data_key: bytes) -> bytes:
"""Decrypt the data key using the master key."""
if len(encrypted_data_key) < 12 + 32 + 16: # nonce + key + tag
raise EncryptionError("Invalid encrypted data key")
aesgcm = AESGCM(self.master_key)
nonce = encrypted_data_key[:12]
ciphertext = encrypted_data_key[12:]
try:
return aesgcm.decrypt(nonce, ciphertext, self.DATA_KEY_AAD)
except Exception:
# Retry without AAD, presumably for data keys that were wrapped before DATA_KEY_AAD was introduced
try:
return aesgcm.decrypt(nonce, ciphertext, None)
except Exception as exc:
raise EncryptionError(f"Failed to decrypt data key: {exc}") from exc
def decrypt_data_key(self, encrypted_data_key: bytes, key_id: str | None = None) -> bytes:
"""Decrypt an encrypted data key (key_id ignored for local encryption)."""
return self._decrypt_data_key(encrypted_data_key)
def generate_data_key(self) -> tuple[bytes, bytes]:
"""Generate a data key and its encrypted form."""
plaintext_key = secrets.token_bytes(32)
encrypted_key = self._encrypt_data_key(plaintext_key)
return plaintext_key, encrypted_key
def encrypt(self, plaintext: bytes, context: Dict[str, str] | None = None) -> EncryptionResult:
"""Encrypt data using envelope encryption."""
data_key, encrypted_data_key = self.generate_data_key()
aesgcm = AESGCM(data_key)
nonce = secrets.token_bytes(12)
aad = json.dumps(context, sort_keys=True).encode() if context else None
ciphertext = aesgcm.encrypt(nonce, plaintext, aad)
return EncryptionResult(
ciphertext=ciphertext,
nonce=nonce,
key_id=self.KEY_ID,
encrypted_data_key=encrypted_data_key,
)
def decrypt(self, ciphertext: bytes, nonce: bytes, encrypted_data_key: bytes,
key_id: str, context: Dict[str, str] | None = None) -> bytes:
"""Decrypt data using envelope encryption."""
data_key = self._decrypt_data_key(encrypted_data_key)
aesgcm = AESGCM(data_key)
aad = json.dumps(context, sort_keys=True).encode() if context else None
try:
return aesgcm.decrypt(nonce, ciphertext, aad)
except Exception as exc:
raise EncryptionError("Failed to decrypt data") from exc
class StreamingEncryptor:
"""Encrypts/decrypts data in streaming fashion for large files.
For large files, we encrypt in chunks. Each chunk is encrypted with the
same data key but a unique nonce derived from the base nonce + chunk index.
"""
CHUNK_SIZE = 64 * 1024
HEADER_SIZE = 4
def __init__(self, provider: EncryptionProvider, chunk_size: int = CHUNK_SIZE):
self.provider = provider
self.chunk_size = chunk_size
def _derive_chunk_nonce(self, base_nonce: bytes, chunk_index: int) -> bytes:
"""Derive a unique nonce for each chunk using HKDF."""
hkdf = HKDF(
algorithm=hashes.SHA256(),
length=12,
salt=base_nonce,
info=chunk_index.to_bytes(4, "big"),
)
return hkdf.derive(b"chunk_nonce")
def encrypt_stream(self, stream: BinaryIO,
context: Dict[str, str] | None = None) -> tuple[BinaryIO, EncryptionMetadata]:
"""Encrypt a stream and return encrypted stream + metadata.
Performance: Writes chunks directly to output buffer instead of accumulating in list.
Note: context is currently unused; chunks are encrypted without associated data.
"""
data_key, encrypted_data_key = self.provider.generate_data_key()
base_nonce = secrets.token_bytes(12)
aesgcm = AESGCM(data_key)
# Performance: Write directly to BytesIO instead of accumulating chunks
output = io.BytesIO()
output.write(b"\x00\x00\x00\x00") # Placeholder for chunk count
chunk_index = 0
while True:
chunk = stream.read(self.chunk_size)
if not chunk:
break
chunk_nonce = self._derive_chunk_nonce(base_nonce, chunk_index)
encrypted_chunk = aesgcm.encrypt(chunk_nonce, chunk, None)
# Write size prefix + encrypted chunk directly
output.write(len(encrypted_chunk).to_bytes(self.HEADER_SIZE, "big"))
output.write(encrypted_chunk)
chunk_index += 1
# Write actual chunk count to header
output.seek(0)
output.write(chunk_index.to_bytes(4, "big"))
output.seek(0)
metadata = EncryptionMetadata(
algorithm="AES256",
key_id=getattr(self.provider, "KEY_ID", "local"),
nonce=base_nonce,
encrypted_data_key=encrypted_data_key,
)
return output, metadata
def decrypt_stream(self, stream: BinaryIO, metadata: EncryptionMetadata) -> BinaryIO:
"""Decrypt a stream using the provided metadata.
Performance: Writes chunks directly to output buffer instead of accumulating in list.
"""
data_key = self.provider.decrypt_data_key(metadata.encrypted_data_key, metadata.key_id)
aesgcm = AESGCM(data_key)
base_nonce = metadata.nonce
chunk_count_bytes = stream.read(4)
if len(chunk_count_bytes) < 4:
raise EncryptionError("Invalid encrypted stream: missing header")
chunk_count = int.from_bytes(chunk_count_bytes, "big")
# Performance: Write directly to BytesIO instead of accumulating chunks
output = io.BytesIO()
for chunk_index in range(chunk_count):
size_bytes = stream.read(self.HEADER_SIZE)
if len(size_bytes) < self.HEADER_SIZE:
raise EncryptionError(f"Invalid encrypted stream: truncated at chunk {chunk_index}")
chunk_size = int.from_bytes(size_bytes, "big")
encrypted_chunk = stream.read(chunk_size)
if len(encrypted_chunk) < chunk_size:
raise EncryptionError(f"Invalid encrypted stream: incomplete chunk {chunk_index}")
chunk_nonce = self._derive_chunk_nonce(base_nonce, chunk_index)
try:
decrypted_chunk = aesgcm.decrypt(chunk_nonce, encrypted_chunk, None)
output.write(decrypted_chunk) # Write directly instead of appending to list
except Exception as exc:
raise EncryptionError(f"Failed to decrypt chunk {chunk_index}: {exc}") from exc
output.seek(0)
return output
class EncryptionManager:
"""Manages encryption providers and operations."""
def __init__(self, config: Dict[str, Any]):
self.config = config
self._local_provider: LocalKeyEncryption | None = None
self._kms_provider: Any = None # Set by KMS module
self._streaming_encryptor: StreamingEncryptor | None = None
@property
def enabled(self) -> bool:
return self.config.get("encryption_enabled", False)
@property
def default_algorithm(self) -> str:
return self.config.get("default_encryption_algorithm", "AES256")
def get_local_provider(self) -> LocalKeyEncryption:
if self._local_provider is None:
key_path = Path(self.config.get("encryption_master_key_path", "data/.myfsio.sys/keys/master.key"))
self._local_provider = LocalKeyEncryption(key_path)
return self._local_provider
def set_kms_provider(self, kms_provider: Any) -> None:
"""Set the KMS provider (injected from kms module)."""
self._kms_provider = kms_provider
def get_provider(self, algorithm: str, kms_key_id: str | None = None) -> EncryptionProvider:
"""Get the appropriate encryption provider for the algorithm."""
if algorithm == "AES256":
return self.get_local_provider()
elif algorithm == "aws:kms":
if self._kms_provider is None:
raise EncryptionError("KMS is not configured")
return self._kms_provider.get_provider(kms_key_id)
else:
raise EncryptionError(f"Unsupported encryption algorithm: {algorithm}")
def get_streaming_encryptor(self) -> StreamingEncryptor:
if self._streaming_encryptor is None:
chunk_size = self.config.get("encryption_chunk_size_bytes", 64 * 1024)
self._streaming_encryptor = StreamingEncryptor(self.get_local_provider(), chunk_size=chunk_size)
return self._streaming_encryptor
def encrypt_object(self, data: bytes, algorithm: str = "AES256",
kms_key_id: str | None = None,
context: Dict[str, str] | None = None) -> tuple[bytes, EncryptionMetadata]:
"""Encrypt object data."""
provider = self.get_provider(algorithm, kms_key_id)
result = provider.encrypt(data, context)
metadata = EncryptionMetadata(
algorithm=algorithm,
key_id=result.key_id,
nonce=result.nonce,
encrypted_data_key=result.encrypted_data_key,
)
return result.ciphertext, metadata
def decrypt_object(self, ciphertext: bytes, metadata: EncryptionMetadata,
context: Dict[str, str] | None = None) -> bytes:
"""Decrypt object data."""
provider = self.get_provider(metadata.algorithm, metadata.key_id)
return provider.decrypt(
ciphertext,
metadata.nonce,
metadata.encrypted_data_key,
metadata.key_id,
context,
)
def encrypt_stream(self, stream: BinaryIO, algorithm: str = "AES256",
context: Dict[str, str] | None = None) -> tuple[BinaryIO, EncryptionMetadata]:
"""Encrypt a stream for large files."""
encryptor = self.get_streaming_encryptor()
return encryptor.encrypt_stream(stream, context)
def decrypt_stream(self, stream: BinaryIO, metadata: EncryptionMetadata) -> BinaryIO:
"""Decrypt a stream."""
encryptor = self.get_streaming_encryptor()
return encryptor.decrypt_stream(stream, metadata)
class SSECEncryption(EncryptionProvider):
"""SSE-C: Server-Side Encryption with Customer-Provided Keys.
The client provides the encryption key with each request.
Server encrypts/decrypts but never stores the key.
Required headers for PUT:
- x-amz-server-side-encryption-customer-algorithm: AES256
- x-amz-server-side-encryption-customer-key: Base64-encoded 256-bit key
- x-amz-server-side-encryption-customer-key-MD5: Base64-encoded MD5 of key
"""
KEY_ID = "customer-provided"
def __init__(self, customer_key: bytes):
if len(customer_key) != 32:
raise EncryptionError("Customer key must be exactly 256 bits (32 bytes)")
self.customer_key = customer_key
@classmethod
def from_headers(cls, headers: Dict[str, str]) -> "SSECEncryption":
algorithm = headers.get("x-amz-server-side-encryption-customer-algorithm", "")
if algorithm.upper() != "AES256":
raise EncryptionError(f"Unsupported SSE-C algorithm: {algorithm}. Only AES256 is supported.")
key_b64 = headers.get("x-amz-server-side-encryption-customer-key", "")
if not key_b64:
raise EncryptionError("Missing x-amz-server-side-encryption-customer-key header")
key_md5_b64 = headers.get("x-amz-server-side-encryption-customer-key-md5", "")
try:
customer_key = base64.b64decode(key_b64)
except Exception as e:
raise EncryptionError(f"Invalid base64 in customer key: {e}") from e
if len(customer_key) != 32:
raise EncryptionError(f"Customer key must be 256 bits, got {len(customer_key) * 8} bits")
if key_md5_b64:
import hashlib
expected_md5 = base64.b64encode(hashlib.md5(customer_key).digest()).decode()
if key_md5_b64 != expected_md5:
raise EncryptionError("Customer key MD5 mismatch")
return cls(customer_key)
def encrypt(self, plaintext: bytes, context: Dict[str, str] | None = None) -> EncryptionResult:
aesgcm = AESGCM(self.customer_key)
nonce = secrets.token_bytes(12)
aad = json.dumps(context, sort_keys=True).encode() if context else None
ciphertext = aesgcm.encrypt(nonce, plaintext, aad)
return EncryptionResult(
ciphertext=ciphertext,
nonce=nonce,
key_id=self.KEY_ID,
encrypted_data_key=b"",
)
def decrypt(self, ciphertext: bytes, nonce: bytes, encrypted_data_key: bytes,
key_id: str, context: Dict[str, str] | None = None) -> bytes:
aesgcm = AESGCM(self.customer_key)
aad = json.dumps(context, sort_keys=True).encode() if context else None
try:
return aesgcm.decrypt(nonce, ciphertext, aad)
except Exception as exc:
raise EncryptionError("SSE-C decryption failed") from exc
def generate_data_key(self) -> tuple[bytes, bytes]:
return self.customer_key, b""
@dataclass
class SSECMetadata:
algorithm: str = "AES256"
nonce: bytes = b""
key_md5: str = ""
def to_dict(self) -> Dict[str, str]:
return {
"x-amz-server-side-encryption-customer-algorithm": self.algorithm,
"x-amz-encryption-nonce": base64.b64encode(self.nonce).decode(),
"x-amz-server-side-encryption-customer-key-MD5": self.key_md5,
}
@classmethod
def from_dict(cls, data: Dict[str, str]) -> Optional["SSECMetadata"]:
algorithm = data.get("x-amz-server-side-encryption-customer-algorithm")
if not algorithm:
return None
try:
nonce = base64.b64decode(data.get("x-amz-encryption-nonce", ""))
return cls(
algorithm=algorithm,
nonce=nonce,
key_md5=data.get("x-amz-server-side-encryption-customer-key-MD5", ""),
)
except Exception:
return None
class ClientEncryptionHelper:
"""Helpers for client-side encryption.
Client-side encryption is performed by the client, but this helper
provides key generation and materials for clients that need them.
"""
@staticmethod
def generate_client_key() -> Dict[str, str]:
"""Generate a new client encryption key."""
from datetime import datetime, timezone
key = secrets.token_bytes(32)
return {
"key": base64.b64encode(key).decode(),
"algorithm": "AES-256-GCM",
"created_at": datetime.now(timezone.utc).isoformat(),
}
@staticmethod
def encrypt_with_key(plaintext: bytes, key_b64: str, context: Dict[str, str] | None = None) -> Dict[str, str]:
"""Encrypt data with a client-provided key."""
key = base64.b64decode(key_b64)
if len(key) != 32:
raise EncryptionError("Key must be 256 bits (32 bytes)")
aesgcm = AESGCM(key)
nonce = secrets.token_bytes(12)
aad = json.dumps(context, sort_keys=True).encode() if context else None
ciphertext = aesgcm.encrypt(nonce, plaintext, aad)
return {
"ciphertext": base64.b64encode(ciphertext).decode(),
"nonce": base64.b64encode(nonce).decode(),
"algorithm": "AES-256-GCM",
}
@staticmethod
def decrypt_with_key(ciphertext_b64: str, nonce_b64: str, key_b64: str, context: Dict[str, str] | None = None) -> bytes:
"""Decrypt data with a client-provided key."""
key = base64.b64decode(key_b64)
nonce = base64.b64decode(nonce_b64)
ciphertext = base64.b64decode(ciphertext_b64)
if len(key) != 32:
raise EncryptionError("Key must be 256 bits (32 bytes)")
aesgcm = AESGCM(key)
aad = json.dumps(context, sort_keys=True).encode() if context else None
try:
return aesgcm.decrypt(nonce, ciphertext, aad)
except Exception as exc:
raise EncryptionError("Decryption failed") from exc
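The on-disk layout produced by `StreamingEncryptor` above (a 4-byte big-endian chunk count, then a 4-byte length prefix before each chunk) and its per-chunk nonce derivation can be sketched with the standard library alone. This is an illustrative sketch under stated assumptions: the function names are hypothetical, the chunks are treated as opaque bytes (no AES-GCM is performed here), and `derive_chunk_nonce` is a hand-written single-block HKDF-SHA256 per RFC 5869 that is intended to mirror the `HKDF(salt=base_nonce, info=chunk_index)` call rather than replace the `cryptography` implementation.

```python
import hashlib
import hmac
import io
from typing import BinaryIO, Iterable, List

HEADER_SIZE = 4  # bytes used for the chunk count and for each length prefix


def derive_chunk_nonce(base_nonce: bytes, chunk_index: int) -> bytes:
    """Single-block HKDF-SHA256: salt=base_nonce, ikm=b"chunk_nonce", info=index."""
    # Extract step: PRK = HMAC(salt, input key material).
    prk = hmac.new(base_nonce, b"chunk_nonce", hashlib.sha256).digest()
    # Expand step: T(1) = HMAC(PRK, info || 0x01); 12 bytes fit in one block.
    info = chunk_index.to_bytes(4, "big")
    okm = hmac.new(prk, info + b"\x01", hashlib.sha256).digest()
    return okm[:12]


def write_frames(chunks: Iterable[bytes]) -> bytes:
    """Serialize chunks as: count header, then (length prefix + chunk) pairs."""
    out = io.BytesIO()
    out.write(b"\x00" * HEADER_SIZE)  # placeholder, patched once the count is known
    count = 0
    for chunk in chunks:
        out.write(len(chunk).to_bytes(HEADER_SIZE, "big"))
        out.write(chunk)
        count += 1
    out.seek(0)
    out.write(count.to_bytes(HEADER_SIZE, "big"))
    return out.getvalue()


def read_frames(stream: BinaryIO) -> List[bytes]:
    """Parse the layout written by write_frames, rejecting truncated input."""
    header = stream.read(HEADER_SIZE)
    if len(header) < HEADER_SIZE:
        raise ValueError("missing header")
    frames = []
    for index in range(int.from_bytes(header, "big")):
        size_bytes = stream.read(HEADER_SIZE)
        if len(size_bytes) < HEADER_SIZE:
            raise ValueError(f"truncated at chunk {index}")
        size = int.from_bytes(size_bytes, "big")
        body = stream.read(size)
        if len(body) < size:
            raise ValueError(f"incomplete chunk {index}")
        frames.append(body)
    return frames


blob = write_frames([b"alpha", b"beta"])
restored = read_frames(io.BytesIO(blob))
```

Writing the count last, after seeking back to offset 0, is what lets the encoder consume the input in a single pass without knowing the number of chunks in advance.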


@@ -1,199 +0,0 @@
from __future__ import annotations
import logging
from dataclasses import dataclass, field
from typing import Optional, Dict, Any
from xml.etree.ElementTree import Element, SubElement, tostring
from flask import Response, jsonify, request, flash, redirect, url_for, g
from flask_limiter import RateLimitExceeded
logger = logging.getLogger(__name__)
@dataclass
class AppError(Exception):
"""Base application error with multi-format response support."""
code: str
message: str
status_code: int = 500
details: Optional[Dict[str, Any]] = field(default=None)
def __post_init__(self):
super().__init__(self.message)
def to_xml_response(self) -> Response:
"""Convert to S3 API XML error response."""
error = Element("Error")
SubElement(error, "Code").text = self.code
SubElement(error, "Message").text = self.message
request_id = getattr(g, 'request_id', None) if g else None
SubElement(error, "RequestId").text = request_id or "unknown"
xml_bytes = tostring(error, encoding="utf-8")
return Response(xml_bytes, status=self.status_code, mimetype="application/xml")
def to_json_response(self) -> tuple[Response, int]:
"""Convert to JSON error response for UI AJAX calls."""
payload: Dict[str, Any] = {
"success": False,
"error": {
"code": self.code,
"message": self.message
}
}
if self.details:
payload["error"]["details"] = self.details
return jsonify(payload), self.status_code
def to_flash_message(self) -> str:
"""Convert to user-friendly flash message."""
return self.message
@dataclass
class BucketNotFoundError(AppError):
"""Bucket does not exist."""
code: str = "NoSuchBucket"
message: str = "The specified bucket does not exist"
status_code: int = 404
@dataclass
class BucketAlreadyExistsError(AppError):
"""Bucket already exists."""
code: str = "BucketAlreadyExists"
message: str = "The requested bucket name is not available"
status_code: int = 409
@dataclass
class BucketNotEmptyError(AppError):
"""Bucket is not empty."""
code: str = "BucketNotEmpty"
message: str = "The bucket you tried to delete is not empty"
status_code: int = 409
@dataclass
class ObjectNotFoundError(AppError):
"""Object does not exist."""
code: str = "NoSuchKey"
message: str = "The specified key does not exist"
status_code: int = 404
@dataclass
class InvalidObjectKeyError(AppError):
"""Invalid object key."""
code: str = "InvalidKey"
message: str = "The specified key is not valid"
status_code: int = 400
@dataclass
class AccessDeniedError(AppError):
"""Access denied."""
code: str = "AccessDenied"
message: str = "Access Denied"
status_code: int = 403
@dataclass
class InvalidCredentialsError(AppError):
"""Invalid credentials."""
code: str = "InvalidAccessKeyId"
message: str = "The access key ID you provided does not exist"
status_code: int = 403
@dataclass
class MalformedRequestError(AppError):
"""Malformed request."""
code: str = "MalformedXML"
message: str = "The XML you provided was not well-formed"
status_code: int = 400
@dataclass
class InvalidArgumentError(AppError):
"""Invalid argument."""
code: str = "InvalidArgument"
message: str = "Invalid argument"
status_code: int = 400
@dataclass
class EntityTooLargeError(AppError):
"""Entity too large."""
code: str = "EntityTooLarge"
message: str = "Your proposed upload exceeds the maximum allowed size"
status_code: int = 413
@dataclass
class QuotaExceededAppError(AppError):
"""Bucket quota exceeded."""
code: str = "QuotaExceeded"
message: str = "The bucket quota has been exceeded"
status_code: int = 403
quota: Optional[Dict[str, Any]] = None
usage: Optional[Dict[str, int]] = None
def __post_init__(self):
if self.quota or self.usage:
self.details = {}
if self.quota:
self.details["quota"] = self.quota
if self.usage:
self.details["usage"] = self.usage
super().__post_init__()
def handle_app_error(error: AppError) -> Response:
"""Handle application errors with appropriate response format."""
log_extra = {"error_code": error.code}
if error.details:
log_extra["details"] = error.details
logger.error("%s: %s", error.code, error.message, extra=log_extra)
if request.path.startswith('/ui'):
wants_json = (
request.is_json or
request.headers.get('X-Requested-With') == 'XMLHttpRequest' or
'application/json' in request.accept_mimetypes.values()
)
if wants_json:
return error.to_json_response()
flash(error.to_flash_message(), 'danger')
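referrer = request.referrer
# Only follow same-host referrers; compare the parsed host instead of a substring match
from urllib.parse import urlparse
if referrer and urlparse(referrer).netloc == request.host:
return redirect(referrer)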
return redirect(url_for('ui.buckets_overview'))
else:
return error.to_xml_response()
def handle_rate_limit_exceeded(e: RateLimitExceeded) -> Response:
g.s3_error_code = "SlowDown"
error = Element("Error")
SubElement(error, "Code").text = "SlowDown"
SubElement(error, "Message").text = "Please reduce your request rate."
SubElement(error, "Resource").text = request.path
SubElement(error, "RequestId").text = getattr(g, "request_id", "")
xml_bytes = tostring(error, encoding="utf-8")
return Response(xml_bytes, status=429, mimetype="application/xml")
def register_error_handlers(app):
"""Register error handlers with a Flask app."""
app.register_error_handler(AppError, handle_app_error)
app.register_error_handler(RateLimitExceeded, handle_rate_limit_exceeded)
for error_class in [
BucketNotFoundError, BucketAlreadyExistsError, BucketNotEmptyError,
ObjectNotFoundError, InvalidObjectKeyError,
AccessDeniedError, InvalidCredentialsError,
MalformedRequestError, InvalidArgumentError, EntityTooLargeError,
QuotaExceededAppError,
]:
app.register_error_handler(error_class, handle_app_error)
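The XML body that `AppError.to_xml_response` builds can be reproduced without Flask. The sketch below is an assumption-laden illustration: `build_error_xml` is a hypothetical standalone function, and the request ID is passed in explicitly instead of being read from `flask.g`.

```python
from typing import Optional
from xml.etree.ElementTree import Element, SubElement, fromstring, tostring


def build_error_xml(code: str, message: str, request_id: Optional[str] = None) -> bytes:
    """Build an S3-style <Error> document with Code, Message and RequestId."""
    error = Element("Error")
    SubElement(error, "Code").text = code
    SubElement(error, "Message").text = message
    # Fall back to "unknown" when no request ID is available, as the handler does.
    SubElement(error, "RequestId").text = request_id or "unknown"
    return tostring(error, encoding="utf-8")


body = build_error_xml("NoSuchKey", "The specified key does not exist", "req-123")
parsed = fromstring(body)
```

Building the document with `ElementTree` rather than string formatting means that characters such as `<` or `&` in a message are escaped automatically.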


@@ -1,16 +0,0 @@
from flask import g
from flask_limiter import Limiter
from flask_limiter.util import get_remote_address
from flask_wtf import CSRFProtect
def get_rate_limit_key():
"""Generate rate limit key based on authenticated user."""
if hasattr(g, 'principal') and g.principal:
return g.principal.access_key
return get_remote_address()
# Shared rate limiter instance; configured in app factory.
limiter = Limiter(key_func=get_rate_limit_key)
# Global CSRF protection for UI routes.
csrf = CSRFProtect()
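The key selection in `get_rate_limit_key` above can be expressed as a pure function, which makes the fallback behavior easy to check. This is only a sketch: `FakePrincipal` and `rate_limit_key` are hypothetical names, and the remote address is passed in as a plain string rather than resolved through `flask_limiter`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakePrincipal:
    """Hypothetical stand-in for the authenticated principal stored on flask.g."""
    access_key: str


def rate_limit_key(principal: Optional[FakePrincipal], remote_address: str) -> str:
    """Prefer the caller's access key; fall back to the client address."""
    if principal is not None:
        return principal.access_key
    return remote_address


authenticated_key = rate_limit_key(FakePrincipal("AKIAEXAMPLE"), "203.0.113.7")
anonymous_key = rate_limit_key(None, "203.0.113.7")
```

Keying on the access key means that authenticated clients behind a shared address get separate limits, while anonymous traffic is still bucketed per address.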


@@ -1,598 +0,0 @@
from __future__ import annotations
import hashlib
import hmac
import json
import math
import os
import secrets
import threading
import time
from collections import deque
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from pathlib import Path
from typing import Any, Deque, Dict, Iterable, List, Optional, Sequence, Set, Tuple
class IamError(RuntimeError):
"""Raised when authentication or authorization fails."""
S3_ACTIONS = {"list", "read", "write", "delete", "share", "policy", "replication", "lifecycle", "cors"}
IAM_ACTIONS = {
"iam:list_users",
"iam:create_user",
"iam:delete_user",
"iam:rotate_key",
"iam:update_policy",
}
ALLOWED_ACTIONS = (S3_ACTIONS | IAM_ACTIONS) | {"iam:*"}
ACTION_ALIASES = {
"list": "list",
"s3:listbucket": "list",
"s3:listallmybuckets": "list",
"s3:listbucketversions": "list",
"s3:listmultipartuploads": "list",
"s3:listparts": "list",
"read": "read",
"s3:getobject": "read",
"s3:getobjectversion": "read",
"s3:getobjecttagging": "read",
"s3:getobjectversiontagging": "read",
"s3:getobjectacl": "read",
"s3:getbucketversioning": "read",
"s3:headobject": "read",
"s3:headbucket": "read",
"write": "write",
"s3:putobject": "write",
"s3:createbucket": "write",
"s3:putobjecttagging": "write",
"s3:putbucketversioning": "write",
"s3:createmultipartupload": "write",
"s3:uploadpart": "write",
"s3:completemultipartupload": "write",
"s3:abortmultipartupload": "write",
"s3:copyobject": "write",
"delete": "delete",
"s3:deleteobject": "delete",
"s3:deleteobjectversion": "delete",
"s3:deletebucket": "delete",
"s3:deleteobjecttagging": "delete",
"share": "share",
"s3:putobjectacl": "share",
"s3:putbucketacl": "share",
"s3:getbucketacl": "share",
"policy": "policy",
"s3:putbucketpolicy": "policy",
"s3:getbucketpolicy": "policy",
"s3:deletebucketpolicy": "policy",
"replication": "replication",
"s3:getreplicationconfiguration": "replication",
"s3:putreplicationconfiguration": "replication",
"s3:deletereplicationconfiguration": "replication",
"s3:replicateobject": "replication",
"s3:replicatetags": "replication",
"s3:replicatedelete": "replication",
"lifecycle": "lifecycle",
"s3:getlifecycleconfiguration": "lifecycle",
"s3:putlifecycleconfiguration": "lifecycle",
"s3:deletelifecycleconfiguration": "lifecycle",
"s3:getbucketlifecycle": "lifecycle",
"s3:putbucketlifecycle": "lifecycle",
"cors": "cors",
"s3:getbucketcors": "cors",
"s3:putbucketcors": "cors",
"s3:deletebucketcors": "cors",
"iam:listusers": "iam:list_users",
"iam:createuser": "iam:create_user",
"iam:deleteuser": "iam:delete_user",
"iam:rotateaccesskey": "iam:rotate_key",
"iam:putuserpolicy": "iam:update_policy",
"iam:*": "iam:*",
}
@dataclass
class Policy:
bucket: str
actions: Set[str]
@dataclass
class Principal:
access_key: str
display_name: str
policies: List[Policy]
class IamService:
"""Loads IAM configuration, manages users, and evaluates policies."""
def __init__(self, config_path: Path, auth_max_attempts: int = 5, auth_lockout_minutes: int = 15) -> None:
self.config_path = Path(config_path)
self.auth_max_attempts = auth_max_attempts
self.auth_lockout_window = timedelta(minutes=auth_lockout_minutes)
self.config_path.parent.mkdir(parents=True, exist_ok=True)
if not self.config_path.exists():
self._write_default()
self._users: Dict[str, Dict[str, Any]] = {}
self._raw_config: Dict[str, Any] = {}
self._failed_attempts: Dict[str, Deque[datetime]] = {}
self._last_load_time = 0.0
self._principal_cache: Dict[str, Tuple[Principal, float]] = {}
self._secret_key_cache: Dict[str, Tuple[str, float]] = {}
self._cache_ttl = float(os.environ.get("IAM_CACHE_TTL_SECONDS", "5.0"))
self._last_stat_check = 0.0
self._stat_check_interval = 1.0
self._sessions: Dict[str, Dict[str, Any]] = {}
self._session_lock = threading.Lock()
self._load()
self._load_lockout_state()
def _maybe_reload(self) -> None:
"""Reload configuration if the file has changed on disk."""
now = time.time()
if now - self._last_stat_check < self._stat_check_interval:
return
self._last_stat_check = now
try:
if self.config_path.stat().st_mtime > self._last_load_time:
self._load()
self._principal_cache.clear()
self._secret_key_cache.clear()
except OSError:
pass
def authenticate(self, access_key: str, secret_key: str) -> Principal:
self._maybe_reload()
access_key = (access_key or "").strip()
secret_key = (secret_key or "").strip()
if not access_key or not secret_key:
raise IamError("Missing access credentials")
if self._is_locked_out(access_key):
seconds = self._seconds_until_unlock(access_key)
raise IamError(
f"Access temporarily locked. Try again in {seconds} seconds."
)
record = self._users.get(access_key)
stored_secret = record["secret_key"] if record else secrets.token_urlsafe(24)
if not record or not hmac.compare_digest(stored_secret, secret_key):
self._record_failed_attempt(access_key)
raise IamError("Invalid credentials")
self._clear_failed_attempts(access_key)
return self._build_principal(access_key, record)
def _record_failed_attempt(self, access_key: str) -> None:
if not access_key:
return
attempts = self._failed_attempts.setdefault(access_key, deque())
self._prune_attempts(attempts)
attempts.append(datetime.now(timezone.utc))
self._save_lockout_state()
def _clear_failed_attempts(self, access_key: str) -> None:
if not access_key:
return
if self._failed_attempts.pop(access_key, None) is not None:
self._save_lockout_state()
def _lockout_file(self) -> Path:
return self.config_path.parent / "lockout_state.json"
def _load_lockout_state(self) -> None:
"""Load lockout state from disk."""
try:
if self._lockout_file().exists():
data = json.loads(self._lockout_file().read_text(encoding="utf-8"))
cutoff = datetime.now(timezone.utc) - self.auth_lockout_window
for key, timestamps in data.get("failed_attempts", {}).items():
valid = []
for ts in timestamps:
try:
dt = datetime.fromisoformat(ts)
if dt > cutoff:
valid.append(dt)
except (ValueError, TypeError):
continue
if valid:
self._failed_attempts[key] = deque(valid)
except (OSError, json.JSONDecodeError):
pass
def _save_lockout_state(self) -> None:
"""Persist lockout state to disk."""
data: Dict[str, Any] = {"failed_attempts": {}}
for key, attempts in self._failed_attempts.items():
data["failed_attempts"][key] = [ts.isoformat() for ts in attempts]
try:
self._lockout_file().write_text(json.dumps(data), encoding="utf-8")
except OSError:
pass
def _prune_attempts(self, attempts: Deque[datetime]) -> None:
cutoff = datetime.now(timezone.utc) - self.auth_lockout_window
while attempts and attempts[0] < cutoff:
attempts.popleft()
def _is_locked_out(self, access_key: str) -> bool:
if not access_key:
return False
attempts = self._failed_attempts.get(access_key)
if not attempts:
return False
self._prune_attempts(attempts)
return len(attempts) >= self.auth_max_attempts
def _seconds_until_unlock(self, access_key: str) -> int:
attempts = self._failed_attempts.get(access_key)
if not attempts:
return 0
self._prune_attempts(attempts)
if len(attempts) < self.auth_max_attempts:
return 0
oldest = attempts[0]
elapsed = (datetime.now(timezone.utc) - oldest).total_seconds()
return int(max(0, self.auth_lockout_window.total_seconds() - elapsed))
    def create_session_token(self, access_key: str, duration_seconds: int = 3600) -> str:
        """Create a temporary session token for an access key."""
        self._maybe_reload()
        record = self._users.get(access_key)
        if not record:
            raise IamError("Unknown access key")
        token = secrets.token_urlsafe(32)
        expires_at = time.time() + duration_seconds
        # Hold the same lock that validate_session_token() uses, so cleanup and
        # insertion cannot race with a concurrent validation.
        with self._session_lock:
            self._cleanup_expired_sessions()
            self._sessions[token] = {
                "access_key": access_key,
                "expires_at": expires_at,
            }
        return token
def validate_session_token(self, access_key: str, session_token: str) -> bool:
"""Validate a session token for an access key (thread-safe, constant-time)."""
dummy_key = secrets.token_urlsafe(16)
dummy_token = secrets.token_urlsafe(32)
with self._session_lock:
session = self._sessions.get(session_token)
if not session:
hmac.compare_digest(access_key, dummy_key)
hmac.compare_digest(session_token, dummy_token)
return False
key_match = hmac.compare_digest(session["access_key"], access_key)
if not key_match:
hmac.compare_digest(session_token, dummy_token)
return False
if time.time() > session["expires_at"]:
self._sessions.pop(session_token, None)
return False
return True
def _cleanup_expired_sessions(self) -> None:
"""Remove expired session tokens."""
now = time.time()
expired = [token for token, data in self._sessions.items() if now > data["expires_at"]]
for token in expired:
del self._sessions[token]
def principal_for_key(self, access_key: str) -> Principal:
now = time.time()
cached = self._principal_cache.get(access_key)
if cached:
principal, cached_time = cached
if now - cached_time < self._cache_ttl:
return principal
self._maybe_reload()
record = self._users.get(access_key)
if not record:
raise IamError("Unknown access key")
principal = self._build_principal(access_key, record)
self._principal_cache[access_key] = (principal, now)
return principal
def secret_for_key(self, access_key: str) -> str:
self._maybe_reload()
record = self._users.get(access_key)
if not record:
raise IamError("Unknown access key")
return record["secret_key"]
def authorize(self, principal: Principal, bucket_name: str | None, action: str) -> None:
action = self._normalize_action(action)
if action not in ALLOWED_ACTIONS:
raise IamError(f"Unknown action '{action}'")
bucket_name = bucket_name or "*"
normalized = bucket_name.lower() if bucket_name != "*" else bucket_name
if not self._is_allowed(principal, normalized, action):
raise IamError(f"Access denied for action '{action}' on bucket '{bucket_name}'")
def buckets_for_principal(self, principal: Principal, buckets: Iterable[str]) -> List[str]:
return [bucket for bucket in buckets if self._is_allowed(principal, bucket, "list")]
def _is_allowed(self, principal: Principal, bucket_name: str, action: str) -> bool:
bucket_name = bucket_name.lower()
for policy in principal.policies:
if policy.bucket not in {"*", bucket_name}:
continue
if "*" in policy.actions or action in policy.actions:
return True
if "iam:*" in policy.actions and action.startswith("iam:"):
return True
return False
def list_users(self) -> List[Dict[str, Any]]:
listing: List[Dict[str, Any]] = []
for access_key, record in self._users.items():
listing.append(
{
"access_key": access_key,
"display_name": record["display_name"],
"policies": [
{"bucket": policy.bucket, "actions": sorted(policy.actions)}
for policy in record["policies"]
],
}
)
return listing
def create_user(
self,
*,
display_name: str,
policies: Optional[Sequence[Dict[str, Any]]] = None,
access_key: str | None = None,
secret_key: str | None = None,
) -> Dict[str, str]:
access_key = (access_key or self._generate_access_key()).strip()
if not access_key:
raise IamError("Access key cannot be empty")
if access_key in self._users:
raise IamError("Access key already exists")
secret_key = secret_key or self._generate_secret_key()
sanitized_policies = self._prepare_policy_payload(policies)
record = {
"access_key": access_key,
"secret_key": secret_key,
"display_name": display_name or access_key,
"policies": sanitized_policies,
}
self._raw_config.setdefault("users", []).append(record)
self._save()
self._load()
return {"access_key": access_key, "secret_key": secret_key}
def rotate_secret(self, access_key: str) -> str:
user = self._get_raw_user(access_key)
new_secret = self._generate_secret_key()
user["secret_key"] = new_secret
self._save()
self._principal_cache.pop(access_key, None)
self._secret_key_cache.pop(access_key, None)
from .s3_api import clear_signing_key_cache
clear_signing_key_cache()
self._load()
return new_secret
def update_user(self, access_key: str, display_name: str) -> None:
user = self._get_raw_user(access_key)
user["display_name"] = display_name
self._save()
self._load()
def delete_user(self, access_key: str) -> None:
users = self._raw_config.get("users", [])
if len(users) <= 1:
raise IamError("Cannot delete the only user")
remaining = [user for user in users if user["access_key"] != access_key]
if len(remaining) == len(users):
raise IamError("User not found")
self._raw_config["users"] = remaining
self._save()
self._principal_cache.pop(access_key, None)
self._secret_key_cache.pop(access_key, None)
from .s3_api import clear_signing_key_cache
clear_signing_key_cache()
self._load()
def update_user_policies(self, access_key: str, policies: Sequence[Dict[str, Any]]) -> None:
user = self._get_raw_user(access_key)
user["policies"] = self._prepare_policy_payload(policies)
self._save()
self._load()
def _load(self) -> None:
try:
self._last_load_time = self.config_path.stat().st_mtime
content = self.config_path.read_text(encoding='utf-8')
raw = json.loads(content)
except FileNotFoundError:
raise IamError(f"IAM config not found: {self.config_path}")
except json.JSONDecodeError as e:
raise IamError(f"Corrupted IAM config (invalid JSON): {e}")
except PermissionError as e:
raise IamError(f"Cannot read IAM config (permission denied): {e}")
except (OSError, ValueError) as e:
raise IamError(f"Failed to load IAM config: {e}")
users: Dict[str, Dict[str, Any]] = {}
for user in raw.get("users", []):
policies = self._build_policy_objects(user.get("policies", []))
users[user["access_key"]] = {
"secret_key": user["secret_key"],
"display_name": user.get("display_name", user["access_key"]),
"policies": policies,
}
if not users:
raise IamError("IAM configuration contains no users")
self._users = users
self._raw_config = {
"users": [
{
"access_key": entry["access_key"],
"secret_key": entry["secret_key"],
"display_name": entry.get("display_name", entry["access_key"]),
"policies": entry.get("policies", []),
}
for entry in raw.get("users", [])
]
}
def _save(self) -> None:
try:
temp_path = self.config_path.with_suffix('.json.tmp')
temp_path.write_text(json.dumps(self._raw_config, indent=2), encoding='utf-8')
temp_path.replace(self.config_path)
except OSError as e:
raise IamError(f"Cannot save IAM config: {e}")
def config_summary(self) -> Dict[str, Any]:
return {
"path": str(self.config_path),
"user_count": len(self._users),
"allowed_actions": sorted(ALLOWED_ACTIONS),
}
def export_config(self, mask_secrets: bool = True) -> Dict[str, Any]:
payload: Dict[str, Any] = {"users": []}
for user in self._raw_config.get("users", []):
record = dict(user)
if mask_secrets and "secret_key" in record:
record["secret_key"] = "••••••••••"
payload["users"].append(record)
return payload
def _build_policy_objects(self, policies: Sequence[Dict[str, Any]]) -> List[Policy]:
entries: List[Policy] = []
for policy in policies:
bucket = str(policy.get("bucket", "*")).lower()
raw_actions = policy.get("actions", [])
if isinstance(raw_actions, str):
raw_actions = [raw_actions]
action_set: Set[str] = set()
for action in raw_actions:
canonical = self._normalize_action(action)
if canonical == "*":
action_set = set(ALLOWED_ACTIONS)
break
if canonical:
action_set.add(canonical)
if action_set:
entries.append(Policy(bucket=bucket, actions=action_set))
return entries
def _prepare_policy_payload(self, policies: Optional[Sequence[Dict[str, Any]]]) -> List[Dict[str, Any]]:
if not policies:
policies = (
{
"bucket": "*",
"actions": ["list", "read", "write", "delete", "share", "policy"],
},
)
sanitized: List[Dict[str, Any]] = []
for policy in policies:
bucket = str(policy.get("bucket", "*")).lower()
raw_actions = policy.get("actions", [])
if isinstance(raw_actions, str):
raw_actions = [raw_actions]
action_set: Set[str] = set()
for action in raw_actions:
canonical = self._normalize_action(action)
if canonical == "*":
action_set = set(ALLOWED_ACTIONS)
break
if canonical:
action_set.add(canonical)
if not action_set:
continue
sanitized.append({"bucket": bucket, "actions": sorted(action_set)})
if not sanitized:
raise IamError("At least one policy with valid actions is required")
return sanitized
def _build_principal(self, access_key: str, record: Dict[str, Any]) -> Principal:
return Principal(
access_key=access_key,
display_name=record["display_name"],
policies=record["policies"],
)
def _normalize_action(self, action: str) -> str:
if not action:
return ""
lowered = action.strip().lower()
if lowered == "*":
return "*"
candidate = ACTION_ALIASES.get(lowered, lowered)
return candidate if candidate in ALLOWED_ACTIONS else ""
def _write_default(self) -> None:
access_key = secrets.token_hex(12)
secret_key = secrets.token_urlsafe(32)
default = {
"users": [
{
"access_key": access_key,
"secret_key": secret_key,
"display_name": "Local Admin",
"policies": [
{"bucket": "*", "actions": list(ALLOWED_ACTIONS)}
],
}
]
}
self.config_path.write_text(json.dumps(default, indent=2))
print(f"\n{'='*60}")
print("MYFSIO FIRST RUN - ADMIN CREDENTIALS GENERATED")
print(f"{'='*60}")
print(f"Access Key: {access_key}")
print(f"Secret Key: {secret_key}")
print(f"{'='*60}")
print(f"Missed this? Check: {self.config_path}")
print(f"{'='*60}\n")
def _generate_access_key(self) -> str:
return secrets.token_hex(8)
def _generate_secret_key(self) -> str:
return secrets.token_urlsafe(24)
def _get_raw_user(self, access_key: str) -> Dict[str, Any]:
for user in self._raw_config.get("users", []):
if user["access_key"] == access_key:
return user
raise IamError("User not found")
def get_secret_key(self, access_key: str) -> str | None:
now = time.time()
cached = self._secret_key_cache.get(access_key)
if cached:
secret_key, cached_time = cached
if now - cached_time < self._cache_ttl:
return secret_key
self._maybe_reload()
record = self._users.get(access_key)
if record:
secret_key = record["secret_key"]
self._secret_key_cache[access_key] = (secret_key, now)
return secret_key
return None
def get_principal(self, access_key: str) -> Principal | None:
now = time.time()
cached = self._principal_cache.get(access_key)
if cached:
principal, cached_time = cached
if now - cached_time < self._cache_ttl:
return principal
self._maybe_reload()
record = self._users.get(access_key)
if record:
principal = self._build_principal(access_key, record)
self._principal_cache[access_key] = (principal, now)
return principal
return None
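
The lockout logic in IamService keeps a per-key deque of failure timestamps, prunes entries older than the lockout window, and locks the key once the deque reaches the attempt limit. A minimal stand-alone sketch of that sliding window follows. The class name `LockoutTracker` and the injectable `now` argument are assumptions made for illustration and do not appear in the original module; disk persistence is left out.

```python
from collections import deque
from datetime import datetime, timedelta, timezone
from typing import Deque, Dict, Optional


class LockoutTracker:
    """Hypothetical helper mirroring the sliding-window lockout idea."""

    def __init__(self, max_attempts: int = 5,
                 window: timedelta = timedelta(minutes=15)) -> None:
        self.max_attempts = max_attempts
        self.window = window
        self._attempts: Dict[str, Deque[datetime]] = {}

    def _prune(self, attempts: Deque[datetime], now: datetime) -> None:
        # Drop failures that fell out of the window (oldest are on the left).
        cutoff = now - self.window
        while attempts and attempts[0] < cutoff:
            attempts.popleft()

    def record_failure(self, key: str, now: Optional[datetime] = None) -> None:
        now = now or datetime.now(timezone.utc)
        attempts = self._attempts.setdefault(key, deque())
        self._prune(attempts, now)
        attempts.append(now)

    def is_locked_out(self, key: str, now: Optional[datetime] = None) -> bool:
        now = now or datetime.now(timezone.utc)
        attempts = self._attempts.get(key)
        if not attempts:
            return False
        self._prune(attempts, now)
        return len(attempts) >= self.max_attempts

    def seconds_until_unlock(self, key: str, now: Optional[datetime] = None) -> int:
        now = now or datetime.now(timezone.utc)
        if not self.is_locked_out(key, now):
            return 0
        # The lock lifts when the oldest remaining failure leaves the window.
        elapsed = (now - self._attempts[key][0]).total_seconds()
        return int(max(0, self.window.total_seconds() - elapsed))
```

Passing `now` explicitly keeps the sketch deterministic in tests; the original methods read the clock directly.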


@@ -1,438 +0,0 @@
from __future__ import annotations
import base64
import json
import logging
import os
import secrets
import subprocess
import sys
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Dict, List, Optional
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
from .encryption import EncryptionError, EncryptionProvider, EncryptionResult
if sys.platform != "win32":
import fcntl
logger = logging.getLogger(__name__)
def _set_secure_file_permissions(file_path: Path) -> None:
"""Set restrictive file permissions (owner read/write only)."""
if sys.platform == "win32":
try:
username = os.environ.get("USERNAME", "")
if username:
subprocess.run(
["icacls", str(file_path), "/inheritance:r",
"/grant:r", f"{username}:F"],
check=True, capture_output=True
)
else:
logger.warning("Could not set secure permissions on %s: USERNAME not set", file_path)
except (subprocess.SubprocessError, OSError) as exc:
logger.warning("Failed to set secure permissions on %s: %s", file_path, exc)
else:
os.chmod(file_path, 0o600)
@dataclass
class KMSKey:
"""Represents a KMS encryption key."""
key_id: str
description: str
created_at: str
enabled: bool = True
key_material: bytes = field(default_factory=lambda: b"", repr=False)
@property
def arn(self) -> str:
return f"arn:aws:kms:local:000000000000:key/{self.key_id}"
def to_dict(self, include_key: bool = False) -> Dict[str, Any]:
data = {
"KeyId": self.key_id,
"Arn": self.arn,
"Description": self.description,
"CreationDate": self.created_at,
"Enabled": self.enabled,
"KeyState": "Enabled" if self.enabled else "Disabled",
"KeyUsage": "ENCRYPT_DECRYPT",
"KeySpec": "SYMMETRIC_DEFAULT",
}
if include_key:
data["KeyMaterial"] = base64.b64encode(self.key_material).decode()
return data
@classmethod
def from_dict(cls, data: Dict[str, Any]) -> "KMSKey":
key_material = b""
if "KeyMaterial" in data:
key_material = base64.b64decode(data["KeyMaterial"])
return cls(
key_id=data["KeyId"],
description=data.get("Description", ""),
created_at=data.get("CreationDate", datetime.now(timezone.utc).isoformat()),
enabled=data.get("Enabled", True),
key_material=key_material,
)
class KMSEncryptionProvider(EncryptionProvider):
"""Encryption provider using a specific KMS key."""
def __init__(self, kms: "KMSManager", key_id: str):
self.kms = kms
self.key_id = key_id
@property
def KEY_ID(self) -> str:
return self.key_id
def generate_data_key(self) -> tuple[bytes, bytes]:
"""Generate a data key encrypted with the KMS key."""
return self.kms.generate_data_key(self.key_id)
def encrypt(self, plaintext: bytes, context: Dict[str, str] | None = None) -> EncryptionResult:
"""Encrypt data using envelope encryption with KMS."""
data_key, encrypted_data_key = self.generate_data_key()
aesgcm = AESGCM(data_key)
nonce = secrets.token_bytes(12)
ciphertext = aesgcm.encrypt(nonce, plaintext,
json.dumps(context, sort_keys=True).encode() if context else None)
return EncryptionResult(
ciphertext=ciphertext,
nonce=nonce,
key_id=self.key_id,
encrypted_data_key=encrypted_data_key,
)
def decrypt(self, ciphertext: bytes, nonce: bytes, encrypted_data_key: bytes,
key_id: str, context: Dict[str, str] | None = None) -> bytes:
"""Decrypt data using envelope encryption with KMS."""
data_key = self.kms.decrypt_data_key(key_id, encrypted_data_key, context=None)
if len(data_key) != 32:
raise EncryptionError("Invalid data key size")
aesgcm = AESGCM(data_key)
try:
return aesgcm.decrypt(nonce, ciphertext,
json.dumps(context, sort_keys=True).encode() if context else None)
except Exception as exc:
logger.debug("KMS decryption failed: %s", exc)
raise EncryptionError("Failed to decrypt data") from exc
def decrypt_data_key(self, encrypted_data_key: bytes, key_id: str | None = None) -> bytes:
"""Decrypt an encrypted data key using KMS."""
if key_id is None:
key_id = self.key_id
data_key = self.kms.decrypt_data_key(key_id, encrypted_data_key, context=None)
if len(data_key) != 32:
raise EncryptionError("Invalid data key size")
return data_key
class KMSManager:
"""Manages KMS keys and operations.
This is a local implementation that mimics AWS KMS functionality.
Keys are stored encrypted on disk.
"""
def __init__(
self,
keys_path: Path,
master_key_path: Path,
generate_data_key_min_bytes: int = 1,
generate_data_key_max_bytes: int = 1024,
):
self.keys_path = keys_path
self.master_key_path = master_key_path
self.generate_data_key_min_bytes = generate_data_key_min_bytes
self.generate_data_key_max_bytes = generate_data_key_max_bytes
self._keys: Dict[str, KMSKey] = {}
self._master_key: bytes | None = None
self._loaded = False
@property
def master_key(self) -> bytes:
"""Load or create the master key for encrypting KMS keys (with file locking)."""
if self._master_key is None:
lock_path = self.master_key_path.with_suffix(".lock")
lock_path.parent.mkdir(parents=True, exist_ok=True)
with open(lock_path, "w") as lock_file:
if sys.platform == "win32":
import msvcrt
msvcrt.locking(lock_file.fileno(), msvcrt.LK_LOCK, 1)
else:
fcntl.flock(lock_file.fileno(), fcntl.LOCK_EX)
try:
if self.master_key_path.exists():
self._master_key = base64.b64decode(
self.master_key_path.read_text().strip()
)
else:
self._master_key = secrets.token_bytes(32)
self.master_key_path.write_text(
base64.b64encode(self._master_key).decode()
)
_set_secure_file_permissions(self.master_key_path)
finally:
if sys.platform == "win32":
import msvcrt
msvcrt.locking(lock_file.fileno(), msvcrt.LK_UNLCK, 1)
else:
fcntl.flock(lock_file.fileno(), fcntl.LOCK_UN)
return self._master_key
def _load_keys(self) -> None:
"""Load keys from disk."""
if self._loaded:
return
if self.keys_path.exists():
try:
data = json.loads(self.keys_path.read_text(encoding="utf-8"))
for key_data in data.get("keys", []):
key = KMSKey.from_dict(key_data)
if key_data.get("EncryptedKeyMaterial"):
encrypted = base64.b64decode(key_data["EncryptedKeyMaterial"])
key.key_material = self._decrypt_key_material(encrypted)
self._keys[key.key_id] = key
except json.JSONDecodeError as exc:
logger.error("Failed to parse KMS keys file: %s", exc)
except (ValueError, KeyError) as exc:
logger.error("Invalid KMS key data: %s", exc)
self._loaded = True
def _save_keys(self) -> None:
"""Save keys to disk (with encrypted key material)."""
keys_data = []
for key in self._keys.values():
data = key.to_dict(include_key=False)
encrypted = self._encrypt_key_material(key.key_material)
data["EncryptedKeyMaterial"] = base64.b64encode(encrypted).decode()
keys_data.append(data)
self.keys_path.parent.mkdir(parents=True, exist_ok=True)
self.keys_path.write_text(
json.dumps({"keys": keys_data}, indent=2),
encoding="utf-8"
)
_set_secure_file_permissions(self.keys_path)
def _encrypt_key_material(self, key_material: bytes) -> bytes:
"""Encrypt key material with the master key."""
aesgcm = AESGCM(self.master_key)
nonce = secrets.token_bytes(12)
ciphertext = aesgcm.encrypt(nonce, key_material, None)
return nonce + ciphertext
def _decrypt_key_material(self, encrypted: bytes) -> bytes:
"""Decrypt key material with the master key."""
aesgcm = AESGCM(self.master_key)
nonce = encrypted[:12]
ciphertext = encrypted[12:]
return aesgcm.decrypt(nonce, ciphertext, None)
def create_key(self, description: str = "", key_id: str | None = None) -> KMSKey:
"""Create a new KMS key."""
self._load_keys()
if key_id is None:
key_id = str(uuid.uuid4())
if key_id in self._keys:
raise EncryptionError(f"Key already exists: {key_id}")
key = KMSKey(
key_id=key_id,
description=description,
created_at=datetime.now(timezone.utc).isoformat(),
enabled=True,
key_material=secrets.token_bytes(32),
)
self._keys[key_id] = key
self._save_keys()
return key
def get_key(self, key_id: str) -> KMSKey | None:
"""Get a key by ID."""
self._load_keys()
return self._keys.get(key_id)
def list_keys(self) -> List[KMSKey]:
"""List all keys."""
self._load_keys()
return list(self._keys.values())
def get_default_key_id(self) -> str:
"""Get the default KMS key ID, creating one if none exist."""
self._load_keys()
for key in self._keys.values():
if key.enabled:
return key.key_id
default_key = self.create_key(description="Default KMS Key")
return default_key.key_id
def get_provider(self, key_id: str | None = None) -> "KMSEncryptionProvider":
"""Get a KMS encryption provider for the specified key."""
if key_id is None:
key_id = self.get_default_key_id()
key = self.get_key(key_id)
if not key:
raise EncryptionError(f"Key not found: {key_id}")
if not key.enabled:
raise EncryptionError(f"Key is disabled: {key_id}")
return KMSEncryptionProvider(self, key_id)
def enable_key(self, key_id: str) -> None:
"""Enable a key."""
self._load_keys()
key = self._keys.get(key_id)
if not key:
raise EncryptionError(f"Key not found: {key_id}")
key.enabled = True
self._save_keys()
def disable_key(self, key_id: str) -> None:
"""Disable a key."""
self._load_keys()
key = self._keys.get(key_id)
if not key:
raise EncryptionError(f"Key not found: {key_id}")
key.enabled = False
self._save_keys()
def delete_key(self, key_id: str) -> None:
"""Delete a key (schedule for deletion in real KMS)."""
self._load_keys()
if key_id not in self._keys:
raise EncryptionError(f"Key not found: {key_id}")
del self._keys[key_id]
self._save_keys()
def encrypt(self, key_id: str, plaintext: bytes,
context: Dict[str, str] | None = None) -> bytes:
"""Encrypt data directly with a KMS key."""
self._load_keys()
key = self._keys.get(key_id)
if not key:
raise EncryptionError(f"Key not found: {key_id}")
if not key.enabled:
raise EncryptionError(f"Key is disabled: {key_id}")
aesgcm = AESGCM(key.key_material)
nonce = secrets.token_bytes(12)
aad = json.dumps(context, sort_keys=True).encode() if context else None
ciphertext = aesgcm.encrypt(nonce, plaintext, aad)
key_id_bytes = key_id.encode("utf-8")
return len(key_id_bytes).to_bytes(2, "big") + key_id_bytes + nonce + ciphertext
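    def decrypt(self, ciphertext: bytes,
                context: Dict[str, str] | None = None) -> tuple[bytes, str]:
        """Decrypt data directly with a KMS key.

        The blob layout is: 2-byte big-endian key ID length, key ID (UTF-8),
        12-byte nonce, then the AES-GCM ciphertext including its 16-byte tag.

        Returns:
            Tuple of (plaintext, key_id)
        """
        self._load_keys()
        if len(ciphertext) < 2:
            raise EncryptionError("Malformed ciphertext")
        key_id_len = int.from_bytes(ciphertext[:2], "big")
        rest = ciphertext[2 + key_id_len:]
        # Reject truncated blobs before slicing out the nonce and tag.
        if len(rest) < 12 + 16:
            raise EncryptionError("Malformed ciphertext")
        try:
            key_id = ciphertext[2:2 + key_id_len].decode("utf-8")
        except UnicodeDecodeError as exc:
            # Previously this escaped as an unhandled exception.
            raise EncryptionError("Malformed ciphertext") from exc
        key = self._keys.get(key_id)
        if not key:
            raise EncryptionError(f"Key not found: {key_id}")
        if not key.enabled:
            raise EncryptionError(f"Key is disabled: {key_id}")
        nonce = rest[:12]
        encrypted = rest[12:]
        aesgcm = AESGCM(key.key_material)
        aad = json.dumps(context, sort_keys=True).encode() if context else None
        try:
            plaintext = aesgcm.decrypt(nonce, encrypted, aad)
            return plaintext, key_id
        except Exception as exc:
            logger.debug("KMS decrypt operation failed: %s", exc)
            raise EncryptionError("Decryption failed") from exc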
def generate_data_key(self, key_id: str,
context: Dict[str, str] | None = None,
key_spec: str = "AES_256") -> tuple[bytes, bytes]:
"""Generate a data key and return both plaintext and encrypted versions.
Args:
key_id: The KMS key ID to use for encryption
context: Optional encryption context
key_spec: Key specification - AES_128 or AES_256 (default)
Returns:
Tuple of (plaintext_key, encrypted_key)
"""
self._load_keys()
key = self._keys.get(key_id)
if not key:
raise EncryptionError(f"Key not found: {key_id}")
if not key.enabled:
raise EncryptionError(f"Key is disabled: {key_id}")
key_bytes = 32 if key_spec == "AES_256" else 16
plaintext_key = secrets.token_bytes(key_bytes)
encrypted_key = self.encrypt(key_id, plaintext_key, context)
return plaintext_key, encrypted_key
def decrypt_data_key(self, key_id: str, encrypted_key: bytes,
context: Dict[str, str] | None = None) -> bytes:
"""Decrypt a data key."""
plaintext, _ = self.decrypt(encrypted_key, context)
return plaintext
def re_encrypt(self, ciphertext: bytes, destination_key_id: str,
source_context: Dict[str, str] | None = None,
destination_context: Dict[str, str] | None = None) -> bytes:
"""Re-encrypt data with a different key."""
plaintext, source_key_id = self.decrypt(ciphertext, source_context)
return self.encrypt(destination_key_id, plaintext, destination_context)
def generate_random(self, num_bytes: int = 32) -> bytes:
"""Generate cryptographically secure random bytes."""
if num_bytes < self.generate_data_key_min_bytes or num_bytes > self.generate_data_key_max_bytes:
raise EncryptionError(
f"Number of bytes must be between {self.generate_data_key_min_bytes} and {self.generate_data_key_max_bytes}"
)
return secrets.token_bytes(num_bytes)
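
KMSManager.encrypt and KMSManager.decrypt share a small binary framing: a 2-byte big-endian length prefix, the UTF-8 key ID, a 12-byte nonce, and the AES-GCM output. The sketch below isolates just that framing using only the standard library, with no encryption performed. The helper names `pack_blob` and `unpack_blob` are hypothetical and are not part of the original module.

```python
from typing import Tuple

NONCE_LEN = 12  # nonce size used by the encrypt/decrypt methods above


def pack_blob(key_id: str, nonce: bytes, ciphertext: bytes) -> bytes:
    """Frame a ciphertext as: len(key_id) | key_id | nonce | ciphertext."""
    key_id_bytes = key_id.encode("utf-8")
    if len(key_id_bytes) > 0xFFFF:
        raise ValueError("key ID does not fit in a 2-byte length prefix")
    if len(nonce) != NONCE_LEN:
        raise ValueError("nonce must be 12 bytes")
    return len(key_id_bytes).to_bytes(2, "big") + key_id_bytes + nonce + ciphertext


def unpack_blob(blob: bytes) -> Tuple[str, bytes, bytes]:
    """Split a framed blob back into (key_id, nonce, ciphertext)."""
    if len(blob) < 2:
        raise ValueError("blob too short for length prefix")
    key_id_len = int.from_bytes(blob[:2], "big")
    header_end = 2 + key_id_len
    if len(blob) < header_end + NONCE_LEN:
        raise ValueError("blob truncated before end of nonce")
    # UnicodeDecodeError is a ValueError subclass, so callers can catch one type.
    key_id = blob[2:header_end].decode("utf-8")
    nonce = blob[header_end:header_end + NONCE_LEN]
    ciphertext = blob[header_end + NONCE_LEN:]
    return key_id, nonce, ciphertext
```

Embedding the key ID in the blob is what lets decrypt be called without a key ID argument; the cost is that the header is unauthenticated unless it is also bound into the AEAD associated data.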


@@ -1,444 +0,0 @@
from __future__ import annotations
import base64
import uuid
from typing import Any, Dict
from flask import Blueprint, Response, current_app, jsonify, request
from .encryption import ClientEncryptionHelper, EncryptionError
from .extensions import limiter
from .iam import IamError
kms_api_bp = Blueprint("kms_api", __name__, url_prefix="/kms")
def _require_principal():
"""Require authentication for KMS operations."""
from .s3_api import _require_principal as s3_require_principal
return s3_require_principal()
def _kms():
"""Get KMS manager from app extensions."""
return current_app.extensions.get("kms")
def _encryption():
"""Get encryption manager from app extensions."""
return current_app.extensions.get("encryption")
def _error_response(code: str, message: str, status: int) -> tuple[Dict[str, Any], int]:
return {"__type": code, "message": message}, status
@kms_api_bp.route("/keys", methods=["GET", "POST"])
@limiter.limit("30 per minute")
def list_or_create_keys():
"""List all KMS keys or create a new key."""
principal, error = _require_principal()
if error:
return error
kms = _kms()
if not kms:
return _error_response("KMSNotEnabled", "KMS is not configured", 400)
if request.method == "POST":
payload = request.get_json(silent=True) or {}
key_id = payload.get("KeyId") or payload.get("key_id")
description = payload.get("Description") or payload.get("description", "")
try:
key = kms.create_key(description=description, key_id=key_id)
current_app.logger.info(
"KMS key created",
extra={"key_id": key.key_id, "principal": principal.access_key},
)
return jsonify({
"KeyMetadata": key.to_dict(),
})
except EncryptionError as exc:
return _error_response("KMSInternalException", str(exc), 400)
keys = kms.list_keys()
return jsonify({
"Keys": [{"KeyId": k.key_id, "KeyArn": k.arn} for k in keys],
"Truncated": False,
})
@kms_api_bp.route("/keys/<key_id>", methods=["GET", "DELETE"])
@limiter.limit("30 per minute")
def get_or_delete_key(key_id: str):
"""Get or delete a specific KMS key."""
principal, error = _require_principal()
if error:
return error
kms = _kms()
if not kms:
return _error_response("KMSNotEnabled", "KMS is not configured", 400)
if request.method == "DELETE":
try:
kms.delete_key(key_id)
current_app.logger.info(
"KMS key deleted",
extra={"key_id": key_id, "principal": principal.access_key},
)
return Response(status=204)
except EncryptionError as exc:
return _error_response("NotFoundException", str(exc), 404)
key = kms.get_key(key_id)
if not key:
return _error_response("NotFoundException", f"Key not found: {key_id}", 404)
return jsonify({"KeyMetadata": key.to_dict()})
@kms_api_bp.route("/keys/<key_id>/enable", methods=["POST"])
@limiter.limit("30 per minute")
def enable_key(key_id: str):
"""Enable a KMS key."""
principal, error = _require_principal()
if error:
return error
kms = _kms()
if not kms:
return _error_response("KMSNotEnabled", "KMS is not configured", 400)
try:
kms.enable_key(key_id)
current_app.logger.info(
"KMS key enabled",
extra={"key_id": key_id, "principal": principal.access_key},
)
return Response(status=200)
except EncryptionError as exc:
return _error_response("NotFoundException", str(exc), 404)
@kms_api_bp.route("/keys/<key_id>/disable", methods=["POST"])
@limiter.limit("30 per minute")
def disable_key(key_id: str):
"""Disable a KMS key."""
principal, error = _require_principal()
if error:
return error
kms = _kms()
if not kms:
return _error_response("KMSNotEnabled", "KMS is not configured", 400)
try:
kms.disable_key(key_id)
current_app.logger.info(
"KMS key disabled",
extra={"key_id": key_id, "principal": principal.access_key},
)
return Response(status=200)
except EncryptionError as exc:
return _error_response("NotFoundException", str(exc), 404)
@kms_api_bp.route("/encrypt", methods=["POST"])
@limiter.limit("60 per minute")
def encrypt_data():
"""Encrypt data using a KMS key."""
principal, error = _require_principal()
if error:
return error
kms = _kms()
if not kms:
return _error_response("KMSNotEnabled", "KMS is not configured", 400)
payload = request.get_json(silent=True) or {}
key_id = payload.get("KeyId")
plaintext_b64 = payload.get("Plaintext")
context = payload.get("EncryptionContext")
if not key_id:
return _error_response("ValidationException", "KeyId is required", 400)
if not plaintext_b64:
return _error_response("ValidationException", "Plaintext is required", 400)
try:
plaintext = base64.b64decode(plaintext_b64)
except Exception:
return _error_response("ValidationException", "Plaintext must be base64 encoded", 400)
try:
ciphertext = kms.encrypt(key_id, plaintext, context)
return jsonify({
"CiphertextBlob": base64.b64encode(ciphertext).decode(),
"KeyId": key_id,
"EncryptionAlgorithm": "SYMMETRIC_DEFAULT",
})
except EncryptionError as exc:
return _error_response("KMSInternalException", str(exc), 400)
@kms_api_bp.route("/decrypt", methods=["POST"])
@limiter.limit("60 per minute")
def decrypt_data():
"""Decrypt data using a KMS key."""
principal, error = _require_principal()
if error:
return error
kms = _kms()
if not kms:
return _error_response("KMSNotEnabled", "KMS is not configured", 400)
payload = request.get_json(silent=True) or {}
ciphertext_b64 = payload.get("CiphertextBlob")
context = payload.get("EncryptionContext")
if not ciphertext_b64:
return _error_response("ValidationException", "CiphertextBlob is required", 400)
try:
ciphertext = base64.b64decode(ciphertext_b64)
except Exception:
return _error_response("ValidationException", "CiphertextBlob must be base64 encoded", 400)
try:
plaintext, key_id = kms.decrypt(ciphertext, context)
return jsonify({
"Plaintext": base64.b64encode(plaintext).decode(),
"KeyId": key_id,
"EncryptionAlgorithm": "SYMMETRIC_DEFAULT",
})
except EncryptionError as exc:
return _error_response("InvalidCiphertextException", str(exc), 400)
@kms_api_bp.route("/generate-data-key", methods=["POST"])
@limiter.limit("60 per minute")
def generate_data_key():
    """Generate a data encryption key."""
    principal, error = _require_principal()
    if error:
        return error
    kms = _kms()
    if not kms:
        return _error_response("KMSNotEnabled", "KMS is not configured", 400)
    payload = request.get_json(silent=True) or {}
    key_id = payload.get("KeyId")
    context = payload.get("EncryptionContext")
    key_spec = payload.get("KeySpec", "AES_256")
    if not key_id:
        return _error_response("ValidationException", "KeyId is required", 400)
    if key_spec not in {"AES_256", "AES_128"}:
        return _error_response("ValidationException", "KeySpec must be AES_256 or AES_128", 400)
    try:
        plaintext_key, encrypted_key = kms.generate_data_key(key_id, context)
        if key_spec == "AES_128":
            # NOTE: only the returned plaintext is truncated. CiphertextBlob was
            # produced before truncation, so decrypting it later presumably
            # yields the original untruncated key rather than these 16 bytes.
            plaintext_key = plaintext_key[:16]
        return jsonify({
            "Plaintext": base64.b64encode(plaintext_key).decode(),
            "CiphertextBlob": base64.b64encode(encrypted_key).decode(),
            "KeyId": key_id,
        })
    except EncryptionError as exc:
        return _error_response("KMSInternalException", str(exc), 400)
@kms_api_bp.route("/generate-data-key-without-plaintext", methods=["POST"])
@limiter.limit("60 per minute")
def generate_data_key_without_plaintext():
"""Generate a data encryption key without returning the plaintext."""
principal, error = _require_principal()
if error:
return error
kms = _kms()
if not kms:
return _error_response("KMSNotEnabled", "KMS is not configured", 400)
payload = request.get_json(silent=True) or {}
key_id = payload.get("KeyId")
context = payload.get("EncryptionContext")
if not key_id:
return _error_response("ValidationException", "KeyId is required", 400)
try:
_, encrypted_key = kms.generate_data_key(key_id, context)
return jsonify({
"CiphertextBlob": base64.b64encode(encrypted_key).decode(),
"KeyId": key_id,
})
except EncryptionError as exc:
return _error_response("KMSInternalException", str(exc), 400)
@kms_api_bp.route("/re-encrypt", methods=["POST"])
@limiter.limit("30 per minute")
def re_encrypt():
"""Re-encrypt data with a different key."""
principal, error = _require_principal()
if error:
return error
kms = _kms()
if not kms:
return _error_response("KMSNotEnabled", "KMS is not configured", 400)
payload = request.get_json(silent=True) or {}
ciphertext_b64 = payload.get("CiphertextBlob")
destination_key_id = payload.get("DestinationKeyId")
source_context = payload.get("SourceEncryptionContext")
destination_context = payload.get("DestinationEncryptionContext")
if not ciphertext_b64:
return _error_response("ValidationException", "CiphertextBlob is required", 400)
if not destination_key_id:
return _error_response("ValidationException", "DestinationKeyId is required", 400)
try:
ciphertext = base64.b64decode(ciphertext_b64)
except Exception:
return _error_response("ValidationException", "CiphertextBlob must be base64 encoded", 400)
try:
plaintext, source_key_id = kms.decrypt(ciphertext, source_context)
new_ciphertext = kms.encrypt(destination_key_id, plaintext, destination_context)
return jsonify({
"CiphertextBlob": base64.b64encode(new_ciphertext).decode(),
"SourceKeyId": source_key_id,
"KeyId": destination_key_id,
})
except EncryptionError as exc:
return _error_response("KMSInternalException", str(exc), 400)
@kms_api_bp.route("/generate-random", methods=["POST"])
@limiter.limit("60 per minute")
def generate_random():
"""Generate random bytes."""
principal, error = _require_principal()
if error:
return error
kms = _kms()
if not kms:
return _error_response("KMSNotEnabled", "KMS is not configured", 400)
payload = request.get_json(silent=True) or {}
num_bytes = payload.get("NumberOfBytes", 32)
try:
num_bytes = int(num_bytes)
except (TypeError, ValueError):
return _error_response("ValidationException", "NumberOfBytes must be an integer", 400)
try:
random_bytes = kms.generate_random(num_bytes)
return jsonify({
"Plaintext": base64.b64encode(random_bytes).decode(),
})
except EncryptionError as exc:
return _error_response("ValidationException", str(exc), 400)
@kms_api_bp.route("/client/generate-key", methods=["POST"])
@limiter.limit("30 per minute")
def generate_client_key():
"""Generate a client-side encryption key."""
principal, error = _require_principal()
if error:
return error
key_info = ClientEncryptionHelper.generate_client_key()
return jsonify(key_info)
@kms_api_bp.route("/client/encrypt", methods=["POST"])
@limiter.limit("60 per minute")
def client_encrypt():
"""Encrypt data using client-side encryption."""
principal, error = _require_principal()
if error:
return error
payload = request.get_json(silent=True) or {}
plaintext_b64 = payload.get("Plaintext")
key_b64 = payload.get("Key")
if not plaintext_b64 or not key_b64:
return _error_response("ValidationException", "Plaintext and Key are required", 400)
try:
plaintext = base64.b64decode(plaintext_b64)
result = ClientEncryptionHelper.encrypt_with_key(plaintext, key_b64)
return jsonify(result)
except Exception as exc:
return _error_response("EncryptionError", str(exc), 400)
@kms_api_bp.route("/client/decrypt", methods=["POST"])
@limiter.limit("60 per minute")
def client_decrypt():
"""Decrypt data using client-side encryption."""
principal, error = _require_principal()
if error:
return error
payload = request.get_json(silent=True) or {}
ciphertext_b64 = payload.get("Ciphertext") or payload.get("ciphertext")
nonce_b64 = payload.get("Nonce") or payload.get("nonce")
key_b64 = payload.get("Key") or payload.get("key")
if not ciphertext_b64 or not nonce_b64 or not key_b64:
return _error_response("ValidationException", "Ciphertext, Nonce, and Key are required", 400)
try:
plaintext = ClientEncryptionHelper.decrypt_with_key(ciphertext_b64, nonce_b64, key_b64)
return jsonify({
"Plaintext": base64.b64encode(plaintext).decode(),
})
except Exception as exc:
return _error_response("DecryptionError", str(exc), 400)
@kms_api_bp.route("/materials/<key_id>", methods=["POST"])
@limiter.limit("60 per minute")
def get_encryption_materials(key_id: str):
"""Get encryption materials for client-side S3 encryption.
This is used by S3 encryption clients that want to use KMS for
key management but perform encryption client-side.
"""
principal, error = _require_principal()
if error:
return error
kms = _kms()
if not kms:
return _error_response("KMSNotEnabled", "KMS is not configured", 400)
payload = request.get_json(silent=True) or {}
context = payload.get("EncryptionContext")
try:
plaintext_key, encrypted_key = kms.generate_data_key(key_id, context)
return jsonify({
"PlaintextKey": base64.b64encode(plaintext_key).decode(),
"EncryptedKey": base64.b64encode(encrypted_key).decode(),
"KeyId": key_id,
"Algorithm": "AES-256-GCM",
"KeyWrapAlgorithm": "kms",
})
except EncryptionError as exc:
return _error_response("KMSInternalException", str(exc), 400)
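
The handlers above decode `Plaintext` and `CiphertextBlob` with `base64.b64decode` in its default mode, which discards characters outside the base64 alphabet instead of rejecting them. The following is a minimal sketch of a stricter decoder, assuming malformed input should surface as a validation error. `decode_b64_strict` is a hypothetical helper and not part of the original module.

```python
import base64
import binascii
from typing import Optional


def decode_b64_strict(value: object) -> Optional[bytes]:
    """Return the decoded bytes, or None if value is not valid base64 text.

    Hypothetical helper for illustration only.
    """
    if not isinstance(value, str):
        return None
    try:
        # validate=True makes non-alphabet characters raise binascii.Error
        # instead of being silently dropped before decoding.
        return base64.b64decode(value, validate=True)
    except (binascii.Error, ValueError):
        return None
```

A handler could then return its existing `ValidationException` response whenever the helper yields `None`, rather than relying on a broad `except Exception`.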


@@ -1,340 +0,0 @@
from __future__ import annotations
import json
import logging
import threading
import time
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from pathlib import Path
from typing import Any, Dict, List, Optional
from .storage import ObjectStorage, StorageError
logger = logging.getLogger(__name__)
@dataclass
class LifecycleResult:
bucket_name: str
objects_deleted: int = 0
versions_deleted: int = 0
uploads_aborted: int = 0
errors: List[str] = field(default_factory=list)
execution_time_seconds: float = 0.0
@dataclass
class LifecycleExecutionRecord:
timestamp: float
bucket_name: str
objects_deleted: int
versions_deleted: int
uploads_aborted: int
errors: List[str]
execution_time_seconds: float
def to_dict(self) -> dict:
return {
"timestamp": self.timestamp,
"bucket_name": self.bucket_name,
"objects_deleted": self.objects_deleted,
"versions_deleted": self.versions_deleted,
"uploads_aborted": self.uploads_aborted,
"errors": self.errors,
"execution_time_seconds": self.execution_time_seconds,
}
@classmethod
def from_dict(cls, data: dict) -> "LifecycleExecutionRecord":
return cls(
timestamp=data["timestamp"],
bucket_name=data["bucket_name"],
objects_deleted=data["objects_deleted"],
versions_deleted=data["versions_deleted"],
uploads_aborted=data["uploads_aborted"],
errors=data.get("errors", []),
execution_time_seconds=data["execution_time_seconds"],
)
@classmethod
def from_result(cls, result: LifecycleResult) -> "LifecycleExecutionRecord":
return cls(
timestamp=time.time(),
bucket_name=result.bucket_name,
objects_deleted=result.objects_deleted,
versions_deleted=result.versions_deleted,
uploads_aborted=result.uploads_aborted,
errors=result.errors.copy(),
execution_time_seconds=result.execution_time_seconds,
)
class LifecycleHistoryStore:
def __init__(self, storage_root: Path, max_history_per_bucket: int = 50) -> None:
self.storage_root = storage_root
self.max_history_per_bucket = max_history_per_bucket
self._lock = threading.Lock()
def _get_history_path(self, bucket_name: str) -> Path:
return self.storage_root / ".myfsio.sys" / "buckets" / bucket_name / "lifecycle_history.json"
def load_history(self, bucket_name: str) -> List[LifecycleExecutionRecord]:
path = self._get_history_path(bucket_name)
if not path.exists():
return []
try:
with open(path, "r") as f:
data = json.load(f)
return [LifecycleExecutionRecord.from_dict(d) for d in data.get("executions", [])]
except (OSError, ValueError, KeyError) as e:
logger.error(f"Failed to load lifecycle history for {bucket_name}: {e}")
return []
def save_history(self, bucket_name: str, records: List[LifecycleExecutionRecord]) -> None:
path = self._get_history_path(bucket_name)
path.parent.mkdir(parents=True, exist_ok=True)
data = {"executions": [r.to_dict() for r in records[:self.max_history_per_bucket]]}
try:
with open(path, "w") as f:
json.dump(data, f, indent=2)
except OSError as e:
logger.error(f"Failed to save lifecycle history for {bucket_name}: {e}")
def add_record(self, bucket_name: str, record: LifecycleExecutionRecord) -> None:
with self._lock:
records = self.load_history(bucket_name)
records.insert(0, record)
self.save_history(bucket_name, records)
def get_history(self, bucket_name: str, limit: int = 50, offset: int = 0) -> List[LifecycleExecutionRecord]:
records = self.load_history(bucket_name)
return records[offset:offset + limit]
class LifecycleManager:
def __init__(
self,
storage: ObjectStorage,
interval_seconds: int = 3600,
storage_root: Optional[Path] = None,
max_history_per_bucket: int = 50,
):
self.storage = storage
self.interval_seconds = interval_seconds
self.storage_root = storage_root
self._timer: Optional[threading.Timer] = None
self._shutdown = False
self._lock = threading.Lock()
self.history_store = LifecycleHistoryStore(storage_root, max_history_per_bucket) if storage_root else None
def start(self) -> None:
if self._timer is not None:
return
self._shutdown = False
self._schedule_next()
logger.info(f"Lifecycle manager started with interval {self.interval_seconds}s")
def stop(self) -> None:
self._shutdown = True
if self._timer:
self._timer.cancel()
self._timer = None
logger.info("Lifecycle manager stopped")
def _schedule_next(self) -> None:
if self._shutdown:
return
self._timer = threading.Timer(self.interval_seconds, self._run_enforcement)
self._timer.daemon = True
self._timer.start()
def _run_enforcement(self) -> None:
if self._shutdown:
return
try:
self.enforce_all_buckets()
except Exception as e:
logger.error(f"Lifecycle enforcement failed: {e}")
finally:
self._schedule_next()
def enforce_all_buckets(self) -> Dict[str, LifecycleResult]:
results = {}
try:
buckets = self.storage.list_buckets()
for bucket in buckets:
result = self.enforce_rules(bucket.name)
if result.objects_deleted > 0 or result.versions_deleted > 0 or result.uploads_aborted > 0:
results[bucket.name] = result
except StorageError as e:
logger.error(f"Failed to list buckets for lifecycle: {e}")
return results
def enforce_rules(self, bucket_name: str) -> LifecycleResult:
start_time = time.time()
result = LifecycleResult(bucket_name=bucket_name)
try:
lifecycle = self.storage.get_bucket_lifecycle(bucket_name)
if not lifecycle:
return result
for rule in lifecycle:
if rule.get("Status") != "Enabled":
continue
rule_id = rule.get("ID", "unknown")
prefix = rule.get("Prefix", rule.get("Filter", {}).get("Prefix", ""))
self._enforce_expiration(bucket_name, rule, prefix, result)
self._enforce_noncurrent_expiration(bucket_name, rule, prefix, result)
self._enforce_abort_multipart(bucket_name, rule, result)
except StorageError as e:
result.errors.append(str(e))
logger.error(f"Lifecycle enforcement error for {bucket_name}: {e}")
result.execution_time_seconds = time.time() - start_time
if result.objects_deleted > 0 or result.versions_deleted > 0 or result.uploads_aborted > 0 or result.errors:
logger.info(
f"Lifecycle enforcement for {bucket_name}: "
f"deleted={result.objects_deleted}, versions={result.versions_deleted}, "
f"aborted={result.uploads_aborted}, time={result.execution_time_seconds:.2f}s"
)
if self.history_store:
record = LifecycleExecutionRecord.from_result(result)
self.history_store.add_record(bucket_name, record)
return result
    def _enforce_expiration(
        self, bucket_name: str, rule: Dict[str, Any], prefix: str, result: LifecycleResult
    ) -> None:
        expiration = rule.get("Expiration", {})
        if not expiration:
            return
        days = expiration.get("Days")
        date_str = expiration.get("Date")
        if days:
            cutoff = datetime.now(timezone.utc) - timedelta(days=days)
        elif date_str:
            try:
                cutoff = datetime.fromisoformat(date_str.replace("Z", "+00:00"))
            except ValueError:
                return
            # A date without a UTC offset parses as naive. Treat it as UTC so the
            # cutoff is timezone-aware, as in the Days branch, and the comparison
            # below cannot raise TypeError.
            if cutoff.tzinfo is None:
                cutoff = cutoff.replace(tzinfo=timezone.utc)
        else:
            return
        try:
            objects = self.storage.list_objects_all(bucket_name)
            for obj in objects:
                if prefix and not obj.key.startswith(prefix):
                    continue
                if obj.last_modified < cutoff:
                    try:
                        self.storage.delete_object(bucket_name, obj.key)
                        result.objects_deleted += 1
                    except StorageError as e:
                        result.errors.append(f"Failed to delete {obj.key}: {e}")
        except StorageError as e:
            result.errors.append(f"Failed to list objects: {e}")
def _enforce_noncurrent_expiration(
self, bucket_name: str, rule: Dict[str, Any], prefix: str, result: LifecycleResult
) -> None:
noncurrent = rule.get("NoncurrentVersionExpiration", {})
noncurrent_days = noncurrent.get("NoncurrentDays")
if not noncurrent_days:
return
cutoff = datetime.now(timezone.utc) - timedelta(days=noncurrent_days)
try:
objects = self.storage.list_objects_all(bucket_name)
for obj in objects:
if prefix and not obj.key.startswith(prefix):
continue
try:
versions = self.storage.list_object_versions(bucket_name, obj.key)
for version in versions:
archived_at_str = version.get("archived_at", "")
if not archived_at_str:
continue
try:
archived_at = datetime.fromisoformat(archived_at_str.replace("Z", "+00:00"))
if archived_at < cutoff:
version_id = version.get("version_id")
if version_id:
self.storage.delete_object_version(bucket_name, obj.key, version_id)
result.versions_deleted += 1
except (ValueError, StorageError) as e:
result.errors.append(f"Failed to process version: {e}")
except StorageError:
pass
except StorageError as e:
result.errors.append(f"Failed to list objects: {e}")
try:
orphaned = self.storage.list_orphaned_objects(bucket_name)
for item in orphaned:
obj_key = item.get("key", "")
if prefix and not obj_key.startswith(prefix):
continue
try:
versions = self.storage.list_object_versions(bucket_name, obj_key)
for version in versions:
archived_at_str = version.get("archived_at", "")
if not archived_at_str:
continue
try:
archived_at = datetime.fromisoformat(archived_at_str.replace("Z", "+00:00"))
if archived_at < cutoff:
version_id = version.get("version_id")
if version_id:
self.storage.delete_object_version(bucket_name, obj_key, version_id)
result.versions_deleted += 1
except (ValueError, StorageError) as e:
result.errors.append(f"Failed to process orphaned version: {e}")
except StorageError:
pass
except StorageError as e:
result.errors.append(f"Failed to list orphaned objects: {e}")
def _enforce_abort_multipart(
self, bucket_name: str, rule: Dict[str, Any], result: LifecycleResult
) -> None:
abort_config = rule.get("AbortIncompleteMultipartUpload", {})
days_after = abort_config.get("DaysAfterInitiation")
if not days_after:
return
cutoff = datetime.now(timezone.utc) - timedelta(days=days_after)
try:
uploads = self.storage.list_multipart_uploads(bucket_name)
for upload in uploads:
created_at_str = upload.get("created_at", "")
if not created_at_str:
continue
try:
created_at = datetime.fromisoformat(created_at_str.replace("Z", "+00:00"))
if created_at < cutoff:
upload_id = upload.get("upload_id")
if upload_id:
self.storage.abort_multipart_upload(bucket_name, upload_id)
result.uploads_aborted += 1
except (ValueError, StorageError) as e:
result.errors.append(f"Failed to abort upload: {e}")
except StorageError as e:
result.errors.append(f"Failed to list multipart uploads: {e}")
def run_now(self, bucket_name: Optional[str] = None) -> Dict[str, LifecycleResult]:
if bucket_name:
return {bucket_name: self.enforce_rules(bucket_name)}
return self.enforce_all_buckets()
def get_execution_history(self, bucket_name: str, limit: int = 50, offset: int = 0) -> List[LifecycleExecutionRecord]:
if not self.history_store:
return []
return self.history_store.get_history(bucket_name, limit, offset)
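
The expiration checks in this module all follow the same pattern: parse an ISO 8601 timestamp, derive a cutoff from a day count, and compare. The sketch below isolates that pattern under the assumption that all timestamps are meant to be UTC. `parse_iso_utc` and `is_expired` are hypothetical names introduced for illustration and do not exist in the original code.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def parse_iso_utc(value: str) -> datetime:
    """Parse an ISO 8601 timestamp; a trailing 'Z' or a missing offset means UTC."""
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def is_expired(
    key: str,
    last_modified: datetime,
    *,
    prefix: str,
    days: int,
    now: Optional[datetime] = None,
) -> bool:
    """Return True if key is under prefix and older than the cutoff of `days` days."""
    now = now or datetime.now(timezone.utc)
    if prefix and not key.startswith(prefix):
        return False
    return last_modified < now - timedelta(days=days)
```

Passing `now` explicitly keeps the check deterministic in tests. The scheduler itself would rely on the default.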


@@ -1,381 +0,0 @@
from __future__ import annotations
import ipaddress
import json
import logging
import queue
import socket
import threading
import time
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Dict, List, Optional
from urllib.parse import urlparse
import requests
def _is_safe_url(url: str, allow_internal: bool = False) -> bool:
    """Check if a URL is safe to make requests to (not internal/private).

    Args:
        url: The URL to check.
        allow_internal: If True, allows internal/private IP addresses.
            Use for self-hosted deployments on internal networks.
    """
    try:
        parsed = urlparse(url)
        hostname = parsed.hostname
        if not hostname:
            return False
        cloud_metadata_hosts = {
            "metadata.google.internal",
            "169.254.169.254",
        }
        if hostname.lower() in cloud_metadata_hosts:
            return False
        if allow_internal:
            return True
        blocked_hosts = {
            "localhost",
            "127.0.0.1",
            "0.0.0.0",
            "::1",
            "[::1]",
        }
        if hostname.lower() in blocked_hosts:
            return False
        try:
            resolved_ip = socket.gethostbyname(hostname)
            ip = ipaddress.ip_address(resolved_ip)
            if ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved:
                return False
        except (socket.gaierror, ValueError):
            return False
        return True
    except Exception:
        return False
logger = logging.getLogger(__name__)
@dataclass
class NotificationEvent:
event_name: str
bucket_name: str
object_key: str
object_size: int = 0
etag: str = ""
version_id: Optional[str] = None
timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
request_id: str = field(default_factory=lambda: uuid.uuid4().hex)
source_ip: str = ""
user_identity: str = ""
def to_s3_event(self) -> Dict[str, Any]:
return {
"Records": [
{
"eventVersion": "2.1",
"eventSource": "myfsio:s3",
"awsRegion": "local",
"eventTime": self.timestamp.strftime("%Y-%m-%dT%H:%M:%S.000Z"),
"eventName": self.event_name,
"userIdentity": {
"principalId": self.user_identity or "ANONYMOUS",
},
"requestParameters": {
"sourceIPAddress": self.source_ip or "127.0.0.1",
},
"responseElements": {
"x-amz-request-id": self.request_id,
"x-amz-id-2": self.request_id,
},
"s3": {
"s3SchemaVersion": "1.0",
"configurationId": "notification",
"bucket": {
"name": self.bucket_name,
"ownerIdentity": {"principalId": "local"},
"arn": f"arn:aws:s3:::{self.bucket_name}",
},
"object": {
"key": self.object_key,
"size": self.object_size,
"eTag": self.etag,
"versionId": self.version_id or "null",
"sequencer": f"{int(time.time() * 1000):016X}",
},
},
}
]
}
@dataclass
class WebhookDestination:
url: str
headers: Dict[str, str] = field(default_factory=dict)
timeout_seconds: int = 30
retry_count: int = 3
retry_delay_seconds: int = 1
def to_dict(self) -> Dict[str, Any]:
return {
"url": self.url,
"headers": self.headers,
"timeout_seconds": self.timeout_seconds,
"retry_count": self.retry_count,
"retry_delay_seconds": self.retry_delay_seconds,
}
@classmethod
def from_dict(cls, data: Dict[str, Any]) -> "WebhookDestination":
return cls(
url=data.get("url", ""),
headers=data.get("headers", {}),
timeout_seconds=data.get("timeout_seconds", 30),
retry_count=data.get("retry_count", 3),
retry_delay_seconds=data.get("retry_delay_seconds", 1),
)
@dataclass
class NotificationConfiguration:
id: str
events: List[str]
destination: WebhookDestination
prefix_filter: str = ""
suffix_filter: str = ""
def matches_event(self, event_name: str, object_key: str) -> bool:
event_match = False
for pattern in self.events:
if pattern.endswith("*"):
base = pattern[:-1]
if event_name.startswith(base):
event_match = True
break
elif pattern == event_name:
event_match = True
break
if not event_match:
return False
if self.prefix_filter and not object_key.startswith(self.prefix_filter):
return False
if self.suffix_filter and not object_key.endswith(self.suffix_filter):
return False
return True
def to_dict(self) -> Dict[str, Any]:
return {
"Id": self.id,
"Events": self.events,
"Destination": self.destination.to_dict(),
"Filter": {
"Key": {
"FilterRules": [
{"Name": "prefix", "Value": self.prefix_filter},
{"Name": "suffix", "Value": self.suffix_filter},
]
}
},
}
@classmethod
def from_dict(cls, data: Dict[str, Any]) -> "NotificationConfiguration":
prefix = ""
suffix = ""
filter_data = data.get("Filter", {})
key_filter = filter_data.get("Key", {})
for rule in key_filter.get("FilterRules", []):
if rule.get("Name") == "prefix":
prefix = rule.get("Value", "")
elif rule.get("Name") == "suffix":
suffix = rule.get("Value", "")
return cls(
id=data.get("Id", uuid.uuid4().hex),
events=data.get("Events", []),
destination=WebhookDestination.from_dict(data.get("Destination", {})),
prefix_filter=prefix,
suffix_filter=suffix,
)
class NotificationService:
def __init__(self, storage_root: Path, worker_count: int = 2, allow_internal_endpoints: bool = False):
self.storage_root = storage_root
self._allow_internal_endpoints = allow_internal_endpoints
self._configs: Dict[str, List[NotificationConfiguration]] = {}
self._queue: queue.Queue[tuple[NotificationEvent, WebhookDestination]] = queue.Queue()
self._workers: List[threading.Thread] = []
self._shutdown = threading.Event()
self._stats = {
"events_queued": 0,
"events_sent": 0,
"events_failed": 0,
}
for i in range(worker_count):
worker = threading.Thread(target=self._worker_loop, name=f"notification-worker-{i}", daemon=True)
worker.start()
self._workers.append(worker)
def _config_path(self, bucket_name: str) -> Path:
return self.storage_root / ".myfsio.sys" / "buckets" / bucket_name / "notifications.json"
def get_bucket_notifications(self, bucket_name: str) -> List[NotificationConfiguration]:
if bucket_name in self._configs:
return self._configs[bucket_name]
config_path = self._config_path(bucket_name)
if not config_path.exists():
return []
try:
data = json.loads(config_path.read_text(encoding="utf-8"))
configs = [NotificationConfiguration.from_dict(c) for c in data.get("configurations", [])]
self._configs[bucket_name] = configs
return configs
except (json.JSONDecodeError, OSError) as e:
logger.warning(f"Failed to load notification config for {bucket_name}: {e}")
return []
def set_bucket_notifications(
self, bucket_name: str, configurations: List[NotificationConfiguration]
) -> None:
config_path = self._config_path(bucket_name)
config_path.parent.mkdir(parents=True, exist_ok=True)
data = {"configurations": [c.to_dict() for c in configurations]}
config_path.write_text(json.dumps(data, indent=2), encoding="utf-8")
self._configs[bucket_name] = configurations
def delete_bucket_notifications(self, bucket_name: str) -> None:
config_path = self._config_path(bucket_name)
try:
if config_path.exists():
config_path.unlink()
except OSError:
pass
self._configs.pop(bucket_name, None)
def emit_event(self, event: NotificationEvent) -> None:
configurations = self.get_bucket_notifications(event.bucket_name)
if not configurations:
return
for config in configurations:
if config.matches_event(event.event_name, event.object_key):
self._queue.put((event, config.destination))
self._stats["events_queued"] += 1
logger.debug(
f"Queued notification for {event.event_name} on {event.bucket_name}/{event.object_key}"
)
def emit_object_created(
self,
bucket_name: str,
object_key: str,
*,
size: int = 0,
etag: str = "",
version_id: Optional[str] = None,
request_id: str = "",
source_ip: str = "",
user_identity: str = "",
operation: str = "Put",
) -> None:
event = NotificationEvent(
event_name=f"s3:ObjectCreated:{operation}",
bucket_name=bucket_name,
object_key=object_key,
object_size=size,
etag=etag,
version_id=version_id,
request_id=request_id or uuid.uuid4().hex,
source_ip=source_ip,
user_identity=user_identity,
)
self.emit_event(event)
def emit_object_removed(
self,
bucket_name: str,
object_key: str,
*,
version_id: Optional[str] = None,
request_id: str = "",
source_ip: str = "",
user_identity: str = "",
operation: str = "Delete",
) -> None:
event = NotificationEvent(
event_name=f"s3:ObjectRemoved:{operation}",
bucket_name=bucket_name,
object_key=object_key,
version_id=version_id,
request_id=request_id or uuid.uuid4().hex,
source_ip=source_ip,
user_identity=user_identity,
)
self.emit_event(event)
def _worker_loop(self) -> None:
while not self._shutdown.is_set():
try:
event, destination = self._queue.get(timeout=1.0)
except queue.Empty:
continue
try:
self._send_notification(event, destination)
self._stats["events_sent"] += 1
except Exception as e:
self._stats["events_failed"] += 1
logger.error(f"Failed to send notification: {e}")
finally:
self._queue.task_done()
    def _send_notification(self, event: NotificationEvent, destination: WebhookDestination) -> None:
        # _is_safe_url rejects cloud metadata hosts and, unless internal endpoints
        # are allowed, loopback/private/link-local/reserved addresses as well.
        if not _is_safe_url(destination.url, allow_internal=self._allow_internal_endpoints):
            raise RuntimeError(f"Blocked request to unsafe URL (SSRF protection): {destination.url}")
        payload = event.to_s3_event()
        headers = {"Content-Type": "application/json", **destination.headers}
        last_error = None
        for attempt in range(destination.retry_count):
            try:
                response = requests.post(
                    destination.url,
                    json=payload,
                    headers=headers,
                    timeout=destination.timeout_seconds,
                )
                if response.status_code < 400:
                    logger.info(
                        f"Notification sent: {event.event_name} -> {destination.url} (status={response.status_code})"
                    )
                    return
                last_error = f"HTTP {response.status_code}: {response.text[:200]}"
            except requests.RequestException as e:
                last_error = str(e)
            # Linear backoff between attempts; no sleep after the final attempt.
            if attempt < destination.retry_count - 1:
                time.sleep(destination.retry_delay_seconds * (attempt + 1))
        raise RuntimeError(f"Failed after {destination.retry_count} attempts: {last_error}")
def get_stats(self) -> Dict[str, int]:
return dict(self._stats)
def shutdown(self) -> None:
self._shutdown.set()
for worker in self._workers:
worker.join(timeout=5.0)
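
`NotificationConfiguration.matches_event` combines two checks: an event-name pattern match, where a trailing `*` acts as a prefix wildcard, and optional key prefix/suffix filters. The sketch below restates that logic as two standalone functions so the behavior can be checked in isolation. `event_matches` and `key_matches` are hypothetical names, not part of the original module.

```python
from typing import Iterable


def event_matches(patterns: Iterable[str], event_name: str) -> bool:
    """Return True if any pattern equals event_name or prefix-matches it via a trailing '*'."""
    for pattern in patterns:
        if pattern.endswith("*"):
            # "s3:ObjectCreated:*" matches every event starting with "s3:ObjectCreated:".
            if event_name.startswith(pattern[:-1]):
                return True
        elif pattern == event_name:
            return True
    return False


def key_matches(object_key: str, prefix: str = "", suffix: str = "") -> bool:
    """Return True if object_key satisfies the prefix and suffix filters; empty filters match all."""
    if prefix and not object_key.startswith(prefix):
        return False
    if suffix and not object_key.endswith(suffix):
        return False
    return True
```

Note that the wildcard is only honored at the end of a pattern. A `*` anywhere else is compared literally, which mirrors the original method.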


@@ -1,234 +0,0 @@
from __future__ import annotations
import json
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from pathlib import Path
from typing import Any, Dict, Optional
class RetentionMode(Enum):
GOVERNANCE = "GOVERNANCE"
COMPLIANCE = "COMPLIANCE"
class ObjectLockError(Exception):
pass
@dataclass
class ObjectLockRetention:
mode: RetentionMode
retain_until_date: datetime
def to_dict(self) -> Dict[str, str]:
return {
"Mode": self.mode.value,
"RetainUntilDate": self.retain_until_date.isoformat(),
}
@classmethod
def from_dict(cls, data: Dict[str, Any]) -> Optional["ObjectLockRetention"]:
if not data:
return None
mode_str = data.get("Mode")
date_str = data.get("RetainUntilDate")
if not mode_str or not date_str:
return None
try:
mode = RetentionMode(mode_str)
retain_until = datetime.fromisoformat(date_str.replace("Z", "+00:00"))
return cls(mode=mode, retain_until_date=retain_until)
except (ValueError, KeyError):
return None
def is_expired(self) -> bool:
return datetime.now(timezone.utc) > self.retain_until_date
@dataclass
class ObjectLockConfig:
enabled: bool = False
default_retention: Optional[ObjectLockRetention] = None
def to_dict(self) -> Dict[str, Any]:
result: Dict[str, Any] = {"ObjectLockEnabled": "Enabled" if self.enabled else "Disabled"}
if self.default_retention:
result["Rule"] = {
"DefaultRetention": {
"Mode": self.default_retention.mode.value,
"Days": None,
"Years": None,
}
}
return result
@classmethod
def from_dict(cls, data: Dict[str, Any]) -> "ObjectLockConfig":
enabled = data.get("ObjectLockEnabled") == "Enabled"
default_retention = None
rule = data.get("Rule")
if rule and "DefaultRetention" in rule:
dr = rule["DefaultRetention"]
mode_str = dr.get("Mode", "GOVERNANCE")
days = dr.get("Days")
years = dr.get("Years")
if days or years:
from datetime import timedelta
now = datetime.now(timezone.utc)
if years:
delta = timedelta(days=int(years) * 365)
else:
delta = timedelta(days=int(days))
default_retention = ObjectLockRetention(
mode=RetentionMode(mode_str),
retain_until_date=now + delta,
)
return cls(enabled=enabled, default_retention=default_retention)
class ObjectLockService:
def __init__(self, storage_root: Path):
self.storage_root = storage_root
self._config_cache: Dict[str, ObjectLockConfig] = {}
def _bucket_lock_config_path(self, bucket_name: str) -> Path:
return self.storage_root / ".myfsio.sys" / "buckets" / bucket_name / "object_lock.json"
def _object_lock_meta_path(self, bucket_name: str, object_key: str) -> Path:
safe_key = object_key.replace("/", "_").replace("\\", "_")
return (
self.storage_root / ".myfsio.sys" / "buckets" / bucket_name /
"locks" / f"{safe_key}.lock.json"
)
def get_bucket_lock_config(self, bucket_name: str) -> ObjectLockConfig:
if bucket_name in self._config_cache:
return self._config_cache[bucket_name]
config_path = self._bucket_lock_config_path(bucket_name)
if not config_path.exists():
return ObjectLockConfig(enabled=False)
try:
data = json.loads(config_path.read_text(encoding="utf-8"))
config = ObjectLockConfig.from_dict(data)
self._config_cache[bucket_name] = config
return config
except (json.JSONDecodeError, OSError):
return ObjectLockConfig(enabled=False)
def set_bucket_lock_config(self, bucket_name: str, config: ObjectLockConfig) -> None:
config_path = self._bucket_lock_config_path(bucket_name)
config_path.parent.mkdir(parents=True, exist_ok=True)
config_path.write_text(json.dumps(config.to_dict()), encoding="utf-8")
self._config_cache[bucket_name] = config
def enable_bucket_lock(self, bucket_name: str) -> None:
config = self.get_bucket_lock_config(bucket_name)
config.enabled = True
self.set_bucket_lock_config(bucket_name, config)
def is_bucket_lock_enabled(self, bucket_name: str) -> bool:
return self.get_bucket_lock_config(bucket_name).enabled

    def get_object_retention(self, bucket_name: str, object_key: str) -> Optional[ObjectLockRetention]:
        meta_path = self._object_lock_meta_path(bucket_name, object_key)
        if not meta_path.exists():
            return None
        try:
            data = json.loads(meta_path.read_text(encoding="utf-8"))
            return ObjectLockRetention.from_dict(data.get("retention", {}))
        except (json.JSONDecodeError, OSError):
            return None

    def set_object_retention(
        self,
        bucket_name: str,
        object_key: str,
        retention: ObjectLockRetention,
        bypass_governance: bool = False,
    ) -> None:
        existing = self.get_object_retention(bucket_name, object_key)
        if existing and not existing.is_expired():
            if existing.mode == RetentionMode.COMPLIANCE:
                raise ObjectLockError(
                    "Cannot modify retention on object with COMPLIANCE mode until retention expires"
                )
            if existing.mode == RetentionMode.GOVERNANCE and not bypass_governance:
                raise ObjectLockError(
                    "Cannot modify GOVERNANCE retention without bypass-governance permission"
                )
        meta_path = self._object_lock_meta_path(bucket_name, object_key)
        meta_path.parent.mkdir(parents=True, exist_ok=True)
        existing_data: Dict[str, Any] = {}
        if meta_path.exists():
            try:
                existing_data = json.loads(meta_path.read_text(encoding="utf-8"))
            except (json.JSONDecodeError, OSError):
                pass
        existing_data["retention"] = retention.to_dict()
        meta_path.write_text(json.dumps(existing_data), encoding="utf-8")

    def get_legal_hold(self, bucket_name: str, object_key: str) -> bool:
        meta_path = self._object_lock_meta_path(bucket_name, object_key)
        if not meta_path.exists():
            return False
        try:
            data = json.loads(meta_path.read_text(encoding="utf-8"))
            return data.get("legal_hold", False)
        except (json.JSONDecodeError, OSError):
            return False

    def set_legal_hold(self, bucket_name: str, object_key: str, enabled: bool) -> None:
        meta_path = self._object_lock_meta_path(bucket_name, object_key)
        meta_path.parent.mkdir(parents=True, exist_ok=True)
        existing_data: Dict[str, Any] = {}
        if meta_path.exists():
            try:
                existing_data = json.loads(meta_path.read_text(encoding="utf-8"))
            except (json.JSONDecodeError, OSError):
                pass
        existing_data["legal_hold"] = enabled
        meta_path.write_text(json.dumps(existing_data), encoding="utf-8")

    def can_delete_object(
        self,
        bucket_name: str,
        object_key: str,
        bypass_governance: bool = False,
    ) -> tuple[bool, str]:
        if self.get_legal_hold(bucket_name, object_key):
            return False, "Object is under legal hold"
        retention = self.get_object_retention(bucket_name, object_key)
        if retention and not retention.is_expired():
            if retention.mode == RetentionMode.COMPLIANCE:
                return False, f"Object is locked in COMPLIANCE mode until {retention.retain_until_date.isoformat()}"
            if retention.mode == RetentionMode.GOVERNANCE:
                if not bypass_governance:
                    return False, f"Object is locked in GOVERNANCE mode until {retention.retain_until_date.isoformat()}"
        return True, ""

    def can_overwrite_object(
        self,
        bucket_name: str,
        object_key: str,
        bypass_governance: bool = False,
    ) -> tuple[bool, str]:
        return self.can_delete_object(bucket_name, object_key, bypass_governance)

    def delete_object_lock_metadata(self, bucket_name: str, object_key: str) -> None:
        meta_path = self._object_lock_meta_path(bucket_name, object_key)
        try:
            if meta_path.exists():
                meta_path.unlink()
        except OSError:
            pass
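
One property of `_object_lock_meta_path` above may be worth illustrating: because `/` and `\` are both flattened to `_`, two distinct keys such as `a/b` and `a_b` resolve to the same lock file. The sketch below reproduces that mapping and contrasts it with a percent-encoded file name. The helper names `flat_name` and `encoded_name` are hypothetical and do not appear in the removed module; percent-encoding is only one possible collision-free scheme, and adopting it would not find lock files written under the old names.

```python
from urllib.parse import quote


def flat_name(object_key: str) -> str:
    """Mirror of the flattening used by _object_lock_meta_path."""
    safe_key = object_key.replace("/", "_").replace("\\", "_")
    return f"{safe_key}.lock.json"


def encoded_name(object_key: str) -> str:
    """Hypothetical alternative: percent-encode separators so the mapping stays one-to-one."""
    # safe="" makes quote() encode "/" as well; "_" is left untouched.
    return f"{quote(object_key, safe='')}.lock.json"


if __name__ == "__main__":
    for key in ("a/b", "a_b"):
        print(f"{key!r}: flat={flat_name(key)} encoded={encoded_name(key)}")
```

With the flattened form, a legal hold set on `a/b` would also be reported for `a_b`; the encoded form keeps the two apart.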


@@ -1,271 +0,0 @@
from __future__ import annotations

import json
import logging
import threading
import time
from dataclasses import dataclass, field
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Dict, List, Optional

logger = logging.getLogger(__name__)


@dataclass
class OperationStats:
    count: int = 0
    success_count: int = 0
    error_count: int = 0
    latency_sum_ms: float = 0.0
    latency_min_ms: float = float("inf")
    latency_max_ms: float = 0.0
    bytes_in: int = 0
    bytes_out: int = 0

    def record(self, latency_ms: float, success: bool, bytes_in: int = 0, bytes_out: int = 0) -> None:
        self.count += 1
        if success:
            self.success_count += 1
        else:
            self.error_count += 1
        self.latency_sum_ms += latency_ms
        if latency_ms < self.latency_min_ms:
            self.latency_min_ms = latency_ms
        if latency_ms > self.latency_max_ms:
            self.latency_max_ms = latency_ms
        self.bytes_in += bytes_in
        self.bytes_out += bytes_out

    def to_dict(self) -> Dict[str, Any]:
        avg_latency = self.latency_sum_ms / self.count if self.count > 0 else 0.0
        min_latency = self.latency_min_ms if self.latency_min_ms != float("inf") else 0.0
        return {
            "count": self.count,
            "success_count": self.success_count,
            "error_count": self.error_count,
            "latency_avg_ms": round(avg_latency, 2),
            "latency_min_ms": round(min_latency, 2),
            "latency_max_ms": round(self.latency_max_ms, 2),
            "bytes_in": self.bytes_in,
            "bytes_out": self.bytes_out,
        }

    def merge(self, other: "OperationStats") -> None:
        self.count += other.count
        self.success_count += other.success_count
        self.error_count += other.error_count
        self.latency_sum_ms += other.latency_sum_ms
        if other.latency_min_ms < self.latency_min_ms:
            self.latency_min_ms = other.latency_min_ms
        if other.latency_max_ms > self.latency_max_ms:
            self.latency_max_ms = other.latency_max_ms
        self.bytes_in += other.bytes_in
        self.bytes_out += other.bytes_out


@dataclass
class MetricsSnapshot:
    timestamp: datetime
    window_seconds: int
    by_method: Dict[str, Dict[str, Any]]
    by_endpoint: Dict[str, Dict[str, Any]]
    by_status_class: Dict[str, int]
    error_codes: Dict[str, int]
    totals: Dict[str, Any]

    def to_dict(self) -> Dict[str, Any]:
        return {
            "timestamp": self.timestamp.isoformat(),
            "window_seconds": self.window_seconds,
            "by_method": self.by_method,
            "by_endpoint": self.by_endpoint,
            "by_status_class": self.by_status_class,
            "error_codes": self.error_codes,
            "totals": self.totals,
        }

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "MetricsSnapshot":
        return cls(
            timestamp=datetime.fromisoformat(data["timestamp"]),
            window_seconds=data.get("window_seconds", 300),
            by_method=data.get("by_method", {}),
            by_endpoint=data.get("by_endpoint", {}),
            by_status_class=data.get("by_status_class", {}),
            error_codes=data.get("error_codes", {}),
            totals=data.get("totals", {}),
        )


class OperationMetricsCollector:
    def __init__(
        self,
        storage_root: Path,
        interval_minutes: int = 5,
        retention_hours: int = 24,
    ):
        self.storage_root = storage_root
        self.interval_seconds = interval_minutes * 60
        self.retention_hours = retention_hours
        self._lock = threading.Lock()
        self._by_method: Dict[str, OperationStats] = {}
        self._by_endpoint: Dict[str, OperationStats] = {}
        self._by_status_class: Dict[str, int] = {}
        self._error_codes: Dict[str, int] = {}
        self._totals = OperationStats()
        self._window_start = time.time()
        self._shutdown = threading.Event()
        self._snapshots: List[MetricsSnapshot] = []
        self._load_history()
        self._snapshot_thread = threading.Thread(
            target=self._snapshot_loop, name="operation-metrics-snapshot", daemon=True
        )
        self._snapshot_thread.start()

    def _config_path(self) -> Path:
        return self.storage_root / ".myfsio.sys" / "config" / "operation_metrics.json"

    def _load_history(self) -> None:
        config_path = self._config_path()
        if not config_path.exists():
            return
        try:
            data = json.loads(config_path.read_text(encoding="utf-8"))
            snapshots_data = data.get("snapshots", [])
            self._snapshots = [MetricsSnapshot.from_dict(s) for s in snapshots_data]
            self._prune_old_snapshots()
        except (json.JSONDecodeError, OSError, KeyError) as e:
            logger.warning(f"Failed to load operation metrics history: {e}")

    def _save_history(self) -> None:
        config_path = self._config_path()
        config_path.parent.mkdir(parents=True, exist_ok=True)
        try:
            data = {"snapshots": [s.to_dict() for s in self._snapshots]}
            config_path.write_text(json.dumps(data, indent=2), encoding="utf-8")
        except OSError as e:
            logger.warning(f"Failed to save operation metrics history: {e}")

    def _prune_old_snapshots(self) -> None:
        if not self._snapshots:
            return
        cutoff = datetime.now(timezone.utc).timestamp() - (self.retention_hours * 3600)
        self._snapshots = [
            s for s in self._snapshots if s.timestamp.timestamp() > cutoff
        ]

    def _snapshot_loop(self) -> None:
        while not self._shutdown.is_set():
            self._shutdown.wait(timeout=self.interval_seconds)
            if not self._shutdown.is_set():
                self._take_snapshot()

    def _take_snapshot(self) -> None:
        with self._lock:
            now = datetime.now(timezone.utc)
            window_seconds = int(time.time() - self._window_start)
            snapshot = MetricsSnapshot(
                timestamp=now,
                window_seconds=window_seconds,
                by_method={k: v.to_dict() for k, v in self._by_method.items()},
                by_endpoint={k: v.to_dict() for k, v in self._by_endpoint.items()},
                by_status_class=dict(self._by_status_class),
                error_codes=dict(self._error_codes),
                totals=self._totals.to_dict(),
            )
            self._snapshots.append(snapshot)
            self._prune_old_snapshots()
            self._save_history()
            self._by_method.clear()
            self._by_endpoint.clear()
            self._by_status_class.clear()
            self._error_codes.clear()
            self._totals = OperationStats()
            self._window_start = time.time()

    def record_request(
        self,
        method: str,
        endpoint_type: str,
        status_code: int,
        latency_ms: float,
        bytes_in: int = 0,
        bytes_out: int = 0,
        error_code: Optional[str] = None,
    ) -> None:
        success = 200 <= status_code < 400
        status_class = f"{status_code // 100}xx"
        with self._lock:
            if method not in self._by_method:
                self._by_method[method] = OperationStats()
            self._by_method[method].record(latency_ms, success, bytes_in, bytes_out)
            if endpoint_type not in self._by_endpoint:
                self._by_endpoint[endpoint_type] = OperationStats()
            self._by_endpoint[endpoint_type].record(latency_ms, success, bytes_in, bytes_out)
            self._by_status_class[status_class] = self._by_status_class.get(status_class, 0) + 1
            if error_code:
                self._error_codes[error_code] = self._error_codes.get(error_code, 0) + 1
            self._totals.record(latency_ms, success, bytes_in, bytes_out)

    def get_current_stats(self) -> Dict[str, Any]:
        with self._lock:
            window_seconds = int(time.time() - self._window_start)
            return {
                "timestamp": datetime.now(timezone.utc).isoformat(),
                "window_seconds": window_seconds,
                "by_method": {k: v.to_dict() for k, v in self._by_method.items()},
                "by_endpoint": {k: v.to_dict() for k, v in self._by_endpoint.items()},
                "by_status_class": dict(self._by_status_class),
                "error_codes": dict(self._error_codes),
                "totals": self._totals.to_dict(),
            }

    def get_history(self, hours: Optional[int] = None) -> List[Dict[str, Any]]:
        with self._lock:
            snapshots = list(self._snapshots)
        if hours:
            cutoff = datetime.now(timezone.utc).timestamp() - (hours * 3600)
            snapshots = [s for s in snapshots if s.timestamp.timestamp() > cutoff]
        return [s.to_dict() for s in snapshots]

    def shutdown(self) -> None:
        self._shutdown.set()
        self._take_snapshot()
        self._snapshot_thread.join(timeout=5.0)


def classify_endpoint(path: str) -> str:
    if not path or path == "/":
        return "service"
    path = path.rstrip("/")
    if path.startswith("/ui"):
        return "ui"
    if path.startswith("/kms"):
        return "kms"
    if path.startswith("/myfsio"):
        return "service"
    parts = path.lstrip("/").split("/")
    if len(parts) == 0:
        return "service"
    elif len(parts) == 1:
        return "bucket"
    else:
        return "object"
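
`classify_endpoint` matches the reserved prefixes with `str.startswith`, so a path such as `/uistuff` is classified as `ui` even though its first segment is not `ui`. A minimal sketch of a segment-aware variant follows. The names `classify_endpoint_strict` and `RESERVED` are hypothetical, and whether such bucket names can actually occur depends on validation elsewhere in the codebase that this diff does not show.

```python
# Hypothetical mapping from a reserved first path segment to its endpoint class.
RESERVED = {"ui": "ui", "kms": "kms", "myfsio": "service"}


def classify_endpoint_strict(path: str) -> str:
    """Classify a request path by comparing whole path segments, not string prefixes."""
    # Dropping empty segments makes "", "/" and "//" all count as the service root.
    segments = [segment for segment in path.split("/") if segment]
    if not segments:
        return "service"
    first = segments[0]
    if first in RESERVED:
        return RESERVED[first]
    return "bucket" if len(segments) == 1 else "object"


if __name__ == "__main__":
    for sample in ("/", "/ui/buckets", "/uistuff", "/photos/2024/a.jpg"):
        print(sample, "->", classify_endpoint_strict(sample))
```

Comparing whole segments keeps reserved routes and bucket names from overlapping, at the cost of one list allocation per request.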


@@ -1,653 +0,0 @@
from __future__ import annotations

import json
import logging
import mimetypes
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any, Dict, List, Optional

import boto3
from botocore.config import Config
from botocore.exceptions import ClientError
from boto3.exceptions import S3UploadFailedError

from .connections import ConnectionStore, RemoteConnection
from .storage import ObjectStorage, StorageError

logger = logging.getLogger(__name__)

REPLICATION_USER_AGENT = "S3ReplicationAgent/1.0"
REPLICATION_MODE_NEW_ONLY = "new_only"
REPLICATION_MODE_ALL = "all"
REPLICATION_MODE_BIDIRECTIONAL = "bidirectional"


def _create_s3_client(
    connection: RemoteConnection,
    *,
    health_check: bool = False,
    connect_timeout: int = 5,
    read_timeout: int = 30,
    max_retries: int = 2,
) -> Any:
    """Create a boto3 S3 client for the given connection.

    Args:
        connection: Remote S3 connection configuration
        health_check: If True, use minimal retries for quick health checks
    """
    config = Config(
        user_agent_extra=REPLICATION_USER_AGENT,
        connect_timeout=connect_timeout,
        read_timeout=read_timeout,
        retries={'max_attempts': 1 if health_check else max_retries},
        signature_version='s3v4',
        s3={'addressing_style': 'path'},
        request_checksum_calculation='when_required',
        response_checksum_validation='when_required',
    )
    return boto3.client(
        "s3",
        endpoint_url=connection.endpoint_url,
        aws_access_key_id=connection.access_key,
        aws_secret_access_key=connection.secret_key,
        region_name=connection.region or 'us-east-1',
        config=config,
    )


@dataclass
class ReplicationStats:
    """Statistics for replication operations - computed dynamically."""
    objects_synced: int = 0
    objects_pending: int = 0
    objects_orphaned: int = 0
    bytes_synced: int = 0
    last_sync_at: Optional[float] = None
    last_sync_key: Optional[str] = None

    def to_dict(self) -> dict:
        return {
            "objects_synced": self.objects_synced,
            "objects_pending": self.objects_pending,
            "objects_orphaned": self.objects_orphaned,
            "bytes_synced": self.bytes_synced,
            "last_sync_at": self.last_sync_at,
            "last_sync_key": self.last_sync_key,
        }

    @classmethod
    def from_dict(cls, data: dict) -> "ReplicationStats":
        return cls(
            objects_synced=data.get("objects_synced", 0),
            objects_pending=data.get("objects_pending", 0),
            objects_orphaned=data.get("objects_orphaned", 0),
            bytes_synced=data.get("bytes_synced", 0),
            last_sync_at=data.get("last_sync_at"),
            last_sync_key=data.get("last_sync_key"),
        )


@dataclass
class ReplicationFailure:
    object_key: str
    error_message: str
    timestamp: float
    failure_count: int
    bucket_name: str
    action: str
    last_error_code: Optional[str] = None

    def to_dict(self) -> dict:
        return {
            "object_key": self.object_key,
            "error_message": self.error_message,
            "timestamp": self.timestamp,
            "failure_count": self.failure_count,
            "bucket_name": self.bucket_name,
            "action": self.action,
            "last_error_code": self.last_error_code,
        }

    @classmethod
    def from_dict(cls, data: dict) -> "ReplicationFailure":
        return cls(
            object_key=data["object_key"],
            error_message=data["error_message"],
            timestamp=data["timestamp"],
            failure_count=data["failure_count"],
            bucket_name=data["bucket_name"],
            action=data["action"],
            last_error_code=data.get("last_error_code"),
        )


@dataclass
class ReplicationRule:
    bucket_name: str
    target_connection_id: str
    target_bucket: str
    enabled: bool = True
    mode: str = REPLICATION_MODE_NEW_ONLY
    created_at: Optional[float] = None
    stats: ReplicationStats = field(default_factory=ReplicationStats)
    sync_deletions: bool = True
    last_pull_at: Optional[float] = None
    filter_prefix: Optional[str] = None

    def to_dict(self) -> dict:
        return {
            "bucket_name": self.bucket_name,
            "target_connection_id": self.target_connection_id,
            "target_bucket": self.target_bucket,
            "enabled": self.enabled,
            "mode": self.mode,
            "created_at": self.created_at,
            "stats": self.stats.to_dict(),
            "sync_deletions": self.sync_deletions,
            "last_pull_at": self.last_pull_at,
            "filter_prefix": self.filter_prefix,
        }

    @classmethod
    def from_dict(cls, data: dict) -> "ReplicationRule":
        # Work on a shallow copy so pop() does not mutate the caller's dict.
        data = dict(data)
        stats_data = data.pop("stats", {})
        # Fill defaults for any fields missing from the stored rule.
        data.setdefault("mode", REPLICATION_MODE_NEW_ONLY)
        data.setdefault("created_at", None)
        data.setdefault("sync_deletions", True)
        data.setdefault("last_pull_at", None)
        data.setdefault("filter_prefix", None)
        rule = cls(**data)
        rule.stats = ReplicationStats.from_dict(stats_data) if stats_data else ReplicationStats()
        return rule


class ReplicationFailureStore:
    def __init__(self, storage_root: Path, max_failures_per_bucket: int = 50) -> None:
        self.storage_root = storage_root
        self.max_failures_per_bucket = max_failures_per_bucket
        self._lock = threading.Lock()

    def _get_failures_path(self, bucket_name: str) -> Path:
        return self.storage_root / ".myfsio.sys" / "buckets" / bucket_name / "replication_failures.json"

    def load_failures(self, bucket_name: str) -> List[ReplicationFailure]:
        path = self._get_failures_path(bucket_name)
        if not path.exists():
            return []
        try:
            with open(path, "r") as f:
                data = json.load(f)
            return [ReplicationFailure.from_dict(d) for d in data.get("failures", [])]
        except (OSError, ValueError, KeyError) as e:
            logger.error(f"Failed to load replication failures for {bucket_name}: {e}")
            return []

    def save_failures(self, bucket_name: str, failures: List[ReplicationFailure]) -> None:
        path = self._get_failures_path(bucket_name)
        path.parent.mkdir(parents=True, exist_ok=True)
        data = {"failures": [f.to_dict() for f in failures[:self.max_failures_per_bucket]]}
        try:
            with open(path, "w") as f:
                json.dump(data, f, indent=2)
        except OSError as e:
            logger.error(f"Failed to save replication failures for {bucket_name}: {e}")

    def add_failure(self, bucket_name: str, failure: ReplicationFailure) -> None:
        with self._lock:
            failures = self.load_failures(bucket_name)
            existing = next((f for f in failures if f.object_key == failure.object_key), None)
            if existing:
                existing.failure_count += 1
                existing.timestamp = failure.timestamp
                existing.error_message = failure.error_message
                existing.last_error_code = failure.last_error_code
            else:
                failures.insert(0, failure)
            self.save_failures(bucket_name, failures)

    def remove_failure(self, bucket_name: str, object_key: str) -> bool:
        with self._lock:
            failures = self.load_failures(bucket_name)
            original_len = len(failures)
            failures = [f for f in failures if f.object_key != object_key]
            if len(failures) < original_len:
                self.save_failures(bucket_name, failures)
                return True
            return False

    def clear_failures(self, bucket_name: str) -> None:
        with self._lock:
            path = self._get_failures_path(bucket_name)
            if path.exists():
                path.unlink()

    def get_failure(self, bucket_name: str, object_key: str) -> Optional[ReplicationFailure]:
        failures = self.load_failures(bucket_name)
        return next((f for f in failures if f.object_key == object_key), None)

    def get_failure_count(self, bucket_name: str) -> int:
        return len(self.load_failures(bucket_name))


class ReplicationManager:
    def __init__(
        self,
        storage: ObjectStorage,
        connections: ConnectionStore,
        rules_path: Path,
        storage_root: Path,
        connect_timeout: int = 5,
        read_timeout: int = 30,
        max_retries: int = 2,
        streaming_threshold_bytes: int = 10 * 1024 * 1024,
        max_failures_per_bucket: int = 50,
    ) -> None:
        self.storage = storage
        self.connections = connections
        self.rules_path = rules_path
        self.storage_root = storage_root
        self.connect_timeout = connect_timeout
        self.read_timeout = read_timeout
        self.max_retries = max_retries
        self.streaming_threshold_bytes = streaming_threshold_bytes
        self._rules: Dict[str, ReplicationRule] = {}
        self._stats_lock = threading.Lock()
        self._executor = ThreadPoolExecutor(max_workers=4, thread_name_prefix="ReplicationWorker")
        self._shutdown = False
        self.failure_store = ReplicationFailureStore(storage_root, max_failures_per_bucket)
        self.reload_rules()

    def _create_client(self, connection: RemoteConnection, *, health_check: bool = False) -> Any:
        """Create an S3 client with the manager's configured timeouts."""
        return _create_s3_client(
            connection,
            health_check=health_check,
            connect_timeout=self.connect_timeout,
            read_timeout=self.read_timeout,
            max_retries=self.max_retries,
        )

    def shutdown(self, wait: bool = True) -> None:
        """Shutdown the replication executor gracefully.

        Args:
            wait: If True, wait for pending tasks to complete
        """
        self._shutdown = True
        self._executor.shutdown(wait=wait)
        logger.info("Replication manager shut down")

    def reload_rules(self) -> None:
        if not self.rules_path.exists():
            self._rules = {}
            return
        try:
            with open(self.rules_path, "r") as f:
                data = json.load(f)
            # Build a fresh mapping so rules removed from the file do not linger in memory.
            self._rules = {
                bucket: ReplicationRule.from_dict(rule_data)
                for bucket, rule_data in data.items()
            }
        except (OSError, ValueError) as e:
            logger.error(f"Failed to load replication rules: {e}")

    def save_rules(self) -> None:
        data = {b: rule.to_dict() for b, rule in self._rules.items()}
        self.rules_path.parent.mkdir(parents=True, exist_ok=True)
        with open(self.rules_path, "w") as f:
            json.dump(data, f, indent=2)

    def check_endpoint_health(self, connection: RemoteConnection) -> bool:
        """Check if a remote endpoint is reachable and responsive.

        Returns True if endpoint is healthy, False otherwise.
        Uses short timeouts to prevent blocking.
        """
        try:
            s3 = self._create_client(connection, health_check=True)
            s3.list_buckets()
            return True
        except Exception as e:
            logger.warning(f"Endpoint health check failed for {connection.name} ({connection.endpoint_url}): {e}")
            return False

    def get_rule(self, bucket_name: str) -> Optional[ReplicationRule]:
        return self._rules.get(bucket_name)

    def list_rules(self) -> List[ReplicationRule]:
        return list(self._rules.values())

    def set_rule(self, rule: ReplicationRule) -> None:
        old_rule = self._rules.get(rule.bucket_name)
        was_all_mode = old_rule is not None and old_rule.mode == REPLICATION_MODE_ALL
        self._rules[rule.bucket_name] = rule
        self.save_rules()
        # Only backfill existing objects when the rule newly switches to ALL mode.
        if rule.mode == REPLICATION_MODE_ALL and rule.enabled and not was_all_mode:
            logger.info(f"Replication mode ALL enabled for {rule.bucket_name}, triggering sync of existing objects")
            self._executor.submit(self.replicate_existing_objects, rule.bucket_name)

    def delete_rule(self, bucket_name: str) -> None:
        if bucket_name in self._rules:
            del self._rules[bucket_name]
            self.save_rules()

    def _update_last_sync(self, bucket_name: str, object_key: str = "") -> None:
        """Update last sync timestamp after a successful operation."""
        with self._stats_lock:
            rule = self._rules.get(bucket_name)
            if not rule:
                return
            rule.stats.last_sync_at = time.time()
            rule.stats.last_sync_key = object_key
            self.save_rules()

    def get_sync_status(self, bucket_name: str) -> Optional[ReplicationStats]:
        """Dynamically compute replication status by comparing source and destination buckets."""
        rule = self.get_rule(bucket_name)
        if not rule:
            return None
        connection = self.connections.get(rule.target_connection_id)
        if not connection:
            return rule.stats
        try:
            source_objects = self.storage.list_objects_all(bucket_name)
            source_keys = {obj.key: obj.size for obj in source_objects}
            s3 = self._create_client(connection)
            dest_keys = set()
            bytes_synced = 0
            paginator = s3.get_paginator('list_objects_v2')
            try:
                for page in paginator.paginate(Bucket=rule.target_bucket):
                    for obj in page.get('Contents', []):
                        dest_keys.add(obj['Key'])
                        if obj['Key'] in source_keys:
                            bytes_synced += obj.get('Size', 0)
            except ClientError as e:
                if e.response['Error']['Code'] == 'NoSuchBucket':
                    dest_keys = set()
                else:
                    raise
            synced = source_keys.keys() & dest_keys
            orphaned = dest_keys - source_keys.keys()
            if rule.mode == REPLICATION_MODE_ALL:
                pending = source_keys.keys() - dest_keys
            else:
                pending = set()
            rule.stats.objects_synced = len(synced)
            rule.stats.objects_pending = len(pending)
            rule.stats.objects_orphaned = len(orphaned)
            rule.stats.bytes_synced = bytes_synced
            return rule.stats
        except (ClientError, StorageError) as e:
            logger.error(f"Failed to compute sync status for {bucket_name}: {e}")
            return rule.stats

    def replicate_existing_objects(self, bucket_name: str) -> None:
        """Trigger replication for all existing objects in a bucket."""
        rule = self.get_rule(bucket_name)
        if not rule or not rule.enabled:
            return
        connection = self.connections.get(rule.target_connection_id)
        if not connection:
            logger.warning(f"Cannot replicate existing objects: Connection {rule.target_connection_id} not found")
            return
        if not self.check_endpoint_health(connection):
            logger.warning(f"Cannot replicate existing objects: Endpoint {connection.name} ({connection.endpoint_url}) is not reachable")
            return
        try:
            objects = self.storage.list_objects_all(bucket_name)
            logger.info(f"Starting replication of {len(objects)} existing objects from {bucket_name}")
            for obj in objects:
                self._executor.submit(self._replicate_task, bucket_name, obj.key, rule, connection, "write")
        except StorageError as e:
            logger.error(f"Failed to list objects for replication: {e}")

    def create_remote_bucket(self, connection_id: str, bucket_name: str) -> None:
        """Create a bucket on the remote connection."""
        connection = self.connections.get(connection_id)
        if not connection:
            raise ValueError(f"Connection {connection_id} not found")
        try:
            s3 = self._create_client(connection)
            s3.create_bucket(Bucket=bucket_name)
        except ClientError as e:
            logger.error(f"Failed to create remote bucket {bucket_name}: {e}")
            raise

    def trigger_replication(self, bucket_name: str, object_key: str, action: str = "write") -> None:
        rule = self.get_rule(bucket_name)
        if not rule or not rule.enabled:
            return
        connection = self.connections.get(rule.target_connection_id)
        if not connection:
            logger.warning(f"Replication skipped for {bucket_name}/{object_key}: Connection {rule.target_connection_id} not found")
            return
        if not self.check_endpoint_health(connection):
            logger.warning(f"Replication skipped for {bucket_name}/{object_key}: Endpoint {connection.name} ({connection.endpoint_url}) is not reachable")
            return
        self._executor.submit(self._replicate_task, bucket_name, object_key, rule, connection, action)

    def _replicate_task(self, bucket_name: str, object_key: str, rule: ReplicationRule, conn: RemoteConnection, action: str) -> None:
        if self._shutdown:
            return
        current_rule = self.get_rule(bucket_name)
        if not current_rule or not current_rule.enabled:
            logger.debug(f"Replication skipped for {bucket_name}/{object_key}: rule disabled or removed")
            return
        if ".." in object_key or object_key.startswith("/") or object_key.startswith("\\"):
            logger.error(f"Invalid object key in replication (path traversal attempt): {object_key}")
            return
        try:
            from .storage import ObjectStorage
            ObjectStorage._sanitize_object_key(object_key)
        except StorageError as e:
            logger.error(f"Object key validation failed in replication: {e}")
            return
        try:
            s3 = self._create_client(conn)
            if action == "delete":
                try:
                    s3.delete_object(Bucket=rule.target_bucket, Key=object_key)
                    logger.info(f"Replicated DELETE {bucket_name}/{object_key} to {conn.name} ({rule.target_bucket})")
                    self._update_last_sync(bucket_name, object_key)
                    self.failure_store.remove_failure(bucket_name, object_key)
                except ClientError as e:
                    error_code = e.response.get('Error', {}).get('Code')
                    logger.error(f"Replication DELETE failed for {bucket_name}/{object_key}: {e}")
                    self.failure_store.add_failure(bucket_name, ReplicationFailure(
                        object_key=object_key,
                        error_message=str(e),
                        timestamp=time.time(),
                        failure_count=1,
                        bucket_name=bucket_name,
                        action="delete",
                        last_error_code=error_code,
                    ))
                return
            try:
                path = self.storage.get_object_path(bucket_name, object_key)
            except StorageError:
                logger.error(f"Source object not found: {bucket_name}/{object_key}")
                return
            content_type, _ = mimetypes.guess_type(path)
            file_size = path.stat().st_size
            logger.info(f"Replicating {bucket_name}/{object_key}: Size={file_size}, ContentType={content_type}")

            def do_upload() -> None:
                """Upload object using appropriate method based on file size.

                For small files (< 10 MiB): Read into memory for simpler handling
                For large files: Use streaming upload to avoid memory issues
                """
                extra_args = {}
                if content_type:
                    extra_args["ContentType"] = content_type
                if file_size >= self.streaming_threshold_bytes:
                    s3.upload_file(
                        str(path),
                        rule.target_bucket,
                        object_key,
                        ExtraArgs=extra_args if extra_args else None,
                    )
                else:
                    file_content = path.read_bytes()
                    put_kwargs = {
                        "Bucket": rule.target_bucket,
                        "Key": object_key,
                        "Body": file_content,
                        **extra_args,
                    }
                    s3.put_object(**put_kwargs)

            try:
                do_upload()
            except (ClientError, S3UploadFailedError) as e:
                error_code = None
                if isinstance(e, ClientError):
                    error_code = e.response['Error']['Code']
                elif isinstance(e, S3UploadFailedError):
                    if "NoSuchBucket" in str(e):
                        error_code = 'NoSuchBucket'
                if error_code == 'NoSuchBucket':
                    logger.info(f"Target bucket {rule.target_bucket} not found. Attempting to create it.")
                    bucket_ready = False
                    try:
                        s3.create_bucket(Bucket=rule.target_bucket)
                        bucket_ready = True
                        logger.info(f"Created target bucket {rule.target_bucket}")
                    except ClientError as bucket_err:
                        if bucket_err.response['Error']['Code'] in ('BucketAlreadyExists', 'BucketAlreadyOwnedByYou'):
                            logger.debug(f"Bucket {rule.target_bucket} already exists (created by another thread)")
                            bucket_ready = True
                        else:
                            logger.error(f"Failed to create target bucket {rule.target_bucket}: {bucket_err}")
                            raise e
                    if bucket_ready:
                        do_upload()
                else:
                    raise e
            logger.info(f"Replicated {bucket_name}/{object_key} to {conn.name} ({rule.target_bucket})")
            self._update_last_sync(bucket_name, object_key)
            self.failure_store.remove_failure(bucket_name, object_key)
        except (ClientError, OSError, ValueError) as e:
            error_code = None
            if isinstance(e, ClientError):
                error_code = e.response.get('Error', {}).get('Code')
            logger.error(f"Replication failed for {bucket_name}/{object_key}: {e}")
            self.failure_store.add_failure(bucket_name, ReplicationFailure(
                object_key=object_key,
                error_message=str(e),
                timestamp=time.time(),
                failure_count=1,
                bucket_name=bucket_name,
                action=action,
                last_error_code=error_code,
            ))
        except Exception as e:
            logger.exception(f"Unexpected error during replication for {bucket_name}/{object_key}")
            self.failure_store.add_failure(bucket_name, ReplicationFailure(
                object_key=object_key,
                error_message=str(e),
                timestamp=time.time(),
                failure_count=1,
                bucket_name=bucket_name,
                action=action,
                last_error_code=None,
            ))

    def get_failed_items(self, bucket_name: str, limit: int = 50, offset: int = 0) -> List[ReplicationFailure]:
        failures = self.failure_store.load_failures(bucket_name)
        return failures[offset:offset + limit]

    def get_failure_count(self, bucket_name: str) -> int:
        return self.failure_store.get_failure_count(bucket_name)

    def retry_failed_item(self, bucket_name: str, object_key: str) -> bool:
        failure = self.failure_store.get_failure(bucket_name, object_key)
        if not failure:
            return False
        rule = self.get_rule(bucket_name)
        if not rule or not rule.enabled:
            return False
        connection = self.connections.get(rule.target_connection_id)
        if not connection:
            logger.warning(f"Cannot retry: Connection {rule.target_connection_id} not found")
            return False
        if not self.check_endpoint_health(connection):
            logger.warning(f"Cannot retry: Endpoint {connection.name} is not reachable")
            return False
        self._executor.submit(self._replicate_task, bucket_name, object_key, rule, connection, failure.action)
        return True

    def retry_all_failed(self, bucket_name: str) -> Dict[str, int]:
        failures = self.failure_store.load_failures(bucket_name)
        if not failures:
            return {"submitted": 0, "skipped": 0}
        rule = self.get_rule(bucket_name)
        if not rule or not rule.enabled:
            return {"submitted": 0, "skipped": len(failures)}
        connection = self.connections.get(rule.target_connection_id)
        if not connection:
            logger.warning(f"Cannot retry: Connection {rule.target_connection_id} not found")
            return {"submitted": 0, "skipped": len(failures)}
        if not self.check_endpoint_health(connection):
            logger.warning(f"Cannot retry: Endpoint {connection.name} is not reachable")
            return {"submitted": 0, "skipped": len(failures)}
        submitted = 0
        for failure in failures:
            self._executor.submit(self._replicate_task, bucket_name, failure.object_key, rule, connection, failure.action)
            submitted += 1
        return {"submitted": submitted, "skipped": 0}

    def dismiss_failure(self, bucket_name: str, object_key: str) -> bool:
        return self.failure_store.remove_failure(bucket_name, object_key)

    def clear_failures(self, bucket_name: str) -> None:
        self.failure_store.clear_failures(bucket_name)
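
The counting done in `get_sync_status` is plain set arithmetic over the source and destination key listings, and it can be exercised without a remote endpoint. The sketch below isolates that step; `compare_buckets` and its parameters are hypothetical names used only for illustration, and both listings are passed in as ordinary `key -> size` dicts rather than fetched through boto3.

```python
from typing import Dict


def compare_buckets(
    source_sizes: Dict[str, int],
    dest_sizes: Dict[str, int],
    replicate_all: bool,
) -> Dict[str, int]:
    """Summarise replication state from two key -> size listings."""
    source_keys = set(source_sizes)
    dest_keys = set(dest_sizes)
    synced = source_keys & dest_keys      # present on both sides
    orphaned = dest_keys - source_keys    # only on the destination
    # As in get_sync_status, missing keys count as pending only in "all" mode.
    pending = source_keys - dest_keys if replicate_all else set()
    return {
        "objects_synced": len(synced),
        "objects_pending": len(pending),
        "objects_orphaned": len(orphaned),
        # Sizes are taken from the destination listing, matching the method above.
        "bytes_synced": sum(dest_sizes[key] for key in synced),
    }


if __name__ == "__main__":
    source = {"a.txt": 1, "b.txt": 2, "c.txt": 3}
    dest = {"b.txt": 2, "c.txt": 30, "z.txt": 9}
    print(compare_buckets(source, dest, replicate_all=True))
    print(compare_buckets(source, dest, replicate_all=False))
```

Note that a key counts as synced whenever it exists on both sides, even if the sizes differ (`c.txt` in the sample data), so this comparison does not detect stale copies.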

File diff suppressed because it is too large


@@ -1,48 +0,0 @@
from __future__ import annotations

import secrets
import time
from typing import Any, Dict, Optional


class EphemeralSecretStore:
    """Keeps values in-memory for a short period and returns them once."""

    def __init__(self, default_ttl: int = 300) -> None:
        self._default_ttl = max(default_ttl, 1)
        self._store: Dict[str, tuple[Any, float]] = {}

    def remember(self, payload: Any, *, ttl: Optional[int] = None) -> str:
        token = secrets.token_urlsafe(16)
        expires_at = time.time() + (ttl or self._default_ttl)
        self._store[token] = (payload, expires_at)
        return token

    def peek(self, token: str | None) -> Any | None:
        if not token:
            return None
        entry = self._store.get(token)
        if not entry:
            return None
        payload, expires_at = entry
        if expires_at < time.time():
            self._store.pop(token, None)
            return None
        return payload

    def pop(self, token: str | None) -> Any | None:
        if not token:
            return None
        entry = self._store.pop(token, None)
        if not entry:
            return None
        payload, expires_at = entry
        if expires_at < time.time():
            return None
        return payload

    def purge_expired(self) -> None:
        now = time.time()
        stale = [token for token, (_, expires_at) in self._store.items() if expires_at < now]
        for token in stale:
            self._store.pop(token, None)


@@ -1,171 +0,0 @@
"""S3 SelectObjectContent SQL query execution using DuckDB."""
from __future__ import annotations

import json
import re
from pathlib import Path
from typing import Any, Dict, Generator, Optional

try:
    import duckdb
    DUCKDB_AVAILABLE = True
except ImportError:
    DUCKDB_AVAILABLE = False


class SelectError(Exception):
    """Error during SELECT query execution."""
    pass


def _sql_literal(value: str) -> str:
    """Quote a value as a SQL string literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def execute_select_query(
    file_path: Path,
    expression: str,
    input_format: str,
    input_config: Dict[str, Any],
    output_format: str,
    output_config: Dict[str, Any],
    chunk_size: int = 65536,
) -> Generator[bytes, None, None]:
    """Execute SQL query on object content."""
    if not DUCKDB_AVAILABLE:
        raise SelectError("DuckDB is not installed. Install with: pip install duckdb")
    conn = duckdb.connect(":memory:")
    try:
        if input_format == "CSV":
            _load_csv(conn, file_path, input_config)
        elif input_format == "JSON":
            _load_json(conn, file_path, input_config)
        elif input_format == "Parquet":
            _load_parquet(conn, file_path)
        else:
            raise SelectError(f"Unsupported input format: {input_format}")
        # The loaders create a table named "data"; map any casing of S3Object onto it.
        normalized_expression = re.sub(r"\bs3object\b", "data", expression, flags=re.IGNORECASE)
        try:
            result = conn.execute(normalized_expression)
        except duckdb.Error as exc:
            raise SelectError(f"SQL execution error: {exc}") from exc
        if output_format == "CSV":
            yield from _output_csv(result, output_config, chunk_size)
        elif output_format == "JSON":
            yield from _output_json(result, output_config, chunk_size)
        else:
            raise SelectError(f"Unsupported output format: {output_format}")
    finally:
        conn.close()


def _load_csv(conn, file_path: Path, config: Dict[str, Any]) -> None:
    """Load CSV file into DuckDB."""
    file_header_info = config.get("file_header_info", "NONE")
    delimiter = config.get("field_delimiter", ",")
    quote = config.get("quote_character", '"')
    header = file_header_info in ("USE", "IGNORE")
    path_str = str(file_path).replace("\\", "/")
    # Values are escaped as SQL literals so a quote in the path or config cannot break the statement.
    conn.execute(f"""
        CREATE TABLE data AS
        SELECT * FROM read_csv({_sql_literal(path_str)},
            header={header},
            delim={_sql_literal(delimiter)},
            quote={_sql_literal(quote)}
        )
    """)


def _load_json(conn, file_path: Path, config: Dict[str, Any]) -> None:
    """Load JSON file into DuckDB."""
    json_type = config.get("type", "DOCUMENT")
    path_str = str(file_path).replace("\\", "/")
    if json_type == "LINES":
        conn.execute(f"""
            CREATE TABLE data AS
            SELECT * FROM read_json_auto({_sql_literal(path_str)}, format='newline_delimited')
        """)
    else:
        conn.execute(f"""
            CREATE TABLE data AS
            SELECT * FROM read_json_auto({_sql_literal(path_str)}, format='array')
        """)


def _load_parquet(conn, file_path: Path) -> None:
    """Load Parquet file into DuckDB."""
    path_str = str(file_path).replace("\\", "/")
    conn.execute(f"CREATE TABLE data AS SELECT * FROM read_parquet({_sql_literal(path_str)})")


def _output_csv(
    result,
    config: Dict[str, Any],
    chunk_size: int,
) -> Generator[bytes, None, None]:
    """Output query results as CSV."""
    delimiter = config.get("field_delimiter", ",")
    record_delimiter = config.get("record_delimiter", "\n")
    quote = config.get("quote_character", '"')
    buffer = ""
    while True:
        rows = result.fetchmany(1000)
        if not rows:
            break
        for row in rows:
            fields = []
            for value in row:
                if value is None:
                    fields.append("")
                elif isinstance(value, str):
                    if delimiter in value or quote in value or record_delimiter in value:
                        escaped = value.replace(quote, quote + quote)
                        fields.append(f'{quote}{escaped}{quote}')
                    else:
                        fields.append(value)
                else:
                    fields.append(str(value))
            buffer += delimiter.join(fields) + record_delimiter
            while len(buffer) >= chunk_size:
                yield buffer[:chunk_size].encode("utf-8")
                buffer = buffer[chunk_size:]
    if buffer:
        yield buffer.encode("utf-8")


def _output_json(
    result,
    config: Dict[str, Any],
    chunk_size: int,
) -> Generator[bytes, None, None]:
    """Output query results as JSON Lines."""
    record_delimiter = config.get("record_delimiter", "\n")
    columns = [desc[0] for desc in result.description]
    buffer = ""
    while True:
        rows = result.fetchmany(1000)
        if not rows:
            break
        for row in rows:
            record = dict(zip(columns, row))
            buffer += json.dumps(record, default=str) + record_delimiter
            while len(buffer) >= chunk_size:
                yield buffer[:chunk_size].encode("utf-8")
                buffer = buffer[chunk_size:]
    if buffer:
        yield buffer.encode("utf-8")
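
The quoting rule applied by `_output_csv` can be isolated into a small helper, which makes it easier to check by hand. This is a minimal sketch and not part of the removed module: `quote_csv_field` is a hypothetical name, and its defaults simply mirror the ones `_output_csv` reads from its config.

```python
def quote_csv_field(value, delimiter=",", quote='"', record_delimiter="\n"):
    """Render one value the way _output_csv does (hypothetical helper)."""
    if value is None:
        return ""
    if not isinstance(value, str):
        return str(value)
    # Quote only when the field contains a character that would break parsing,
    # and double any embedded quote characters.
    if delimiter in value or quote in value or record_delimiter in value:
        return quote + value.replace(quote, quote + quote) + quote
    return value


row = ("plain", 'say "hi"', "a,b", None, 42)
line = ",".join(quote_csv_field(v) for v in row)
print(line)
```

Fields without special characters pass through unquoted, which keeps the output compact but means an empty string and `None` are indistinguishable in the result.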


@@ -1,177 +0,0 @@
from __future__ import annotations

import json
import time
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any, Dict, List, Optional


@dataclass
class SiteInfo:
    site_id: str
    endpoint: str
    region: str = "us-east-1"
    priority: int = 100
    display_name: str = ""
    created_at: Optional[float] = None
    updated_at: Optional[float] = None

    def __post_init__(self) -> None:
        if not self.display_name:
            self.display_name = self.site_id
        if self.created_at is None:
            self.created_at = time.time()

    def to_dict(self) -> Dict[str, Any]:
        return {
            "site_id": self.site_id,
            "endpoint": self.endpoint,
            "region": self.region,
            "priority": self.priority,
            "display_name": self.display_name,
            "created_at": self.created_at,
            "updated_at": self.updated_at,
        }

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> SiteInfo:
        return cls(
            site_id=data["site_id"],
            endpoint=data.get("endpoint", ""),
            region=data.get("region", "us-east-1"),
            priority=data.get("priority", 100),
            display_name=data.get("display_name", ""),
            created_at=data.get("created_at"),
            updated_at=data.get("updated_at"),
        )


@dataclass
class PeerSite:
    site_id: str
    endpoint: str
    region: str = "us-east-1"
    priority: int = 100
    display_name: str = ""
    created_at: Optional[float] = None
    updated_at: Optional[float] = None
    connection_id: Optional[str] = None
    is_healthy: Optional[bool] = None
    last_health_check: Optional[float] = None

    def __post_init__(self) -> None:
        if not self.display_name:
            self.display_name = self.site_id
        if self.created_at is None:
            self.created_at = time.time()

    def to_dict(self) -> Dict[str, Any]:
        return {
            "site_id": self.site_id,
            "endpoint": self.endpoint,
            "region": self.region,
            "priority": self.priority,
            "display_name": self.display_name,
            "created_at": self.created_at,
            "updated_at": self.updated_at,
            "connection_id": self.connection_id,
            "is_healthy": self.is_healthy,
            "last_health_check": self.last_health_check,
        }

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> PeerSite:
        return cls(
            site_id=data["site_id"],
            endpoint=data.get("endpoint", ""),
            region=data.get("region", "us-east-1"),
            priority=data.get("priority", 100),
            display_name=data.get("display_name", ""),
            created_at=data.get("created_at"),
            updated_at=data.get("updated_at"),
            connection_id=data.get("connection_id"),
            is_healthy=data.get("is_healthy"),
            last_health_check=data.get("last_health_check"),
        )


class SiteRegistry:
    def __init__(self, config_path: Path) -> None:
        self.config_path = config_path
        self._local_site: Optional[SiteInfo] = None
        self._peers: Dict[str, PeerSite] = {}
        self.reload()

    def reload(self) -> None:
        if not self.config_path.exists():
            self._local_site = None
            self._peers = {}
            return
        try:
            with open(self.config_path, "r", encoding="utf-8") as f:
                data = json.load(f)
            if data.get("local"):
                self._local_site = SiteInfo.from_dict(data["local"])
            else:
                self._local_site = None
            self._peers = {}
            for peer_data in data.get("peers", []):
                peer = PeerSite.from_dict(peer_data)
                self._peers[peer.site_id] = peer
        except (OSError, json.JSONDecodeError, KeyError):
            self._local_site = None
            self._peers = {}

    def save(self) -> None:
        self.config_path.parent.mkdir(parents=True, exist_ok=True)
        data = {
            "local": self._local_site.to_dict() if self._local_site else None,
            "peers": [peer.to_dict() for peer in self._peers.values()],
        }
        with open(self.config_path, "w", encoding="utf-8") as f:
            json.dump(data, f, indent=2)

    def get_local_site(self) -> Optional[SiteInfo]:
        return self._local_site

    def set_local_site(self, site: SiteInfo) -> None:
        site.updated_at = time.time()
        self._local_site = site
        self.save()

    def list_peers(self) -> List[PeerSite]:
        return list(self._peers.values())

    def get_peer(self, site_id: str) -> Optional[PeerSite]:
        return self._peers.get(site_id)

    def add_peer(self, peer: PeerSite) -> None:
        peer.created_at = peer.created_at or time.time()
        self._peers[peer.site_id] = peer
        self.save()

    def update_peer(self, peer: PeerSite) -> None:
        if peer.site_id not in self._peers:
            raise ValueError(f"Peer {peer.site_id} not found")
        peer.updated_at = time.time()
        self._peers[peer.site_id] = peer
        self.save()

    def delete_peer(self, site_id: str) -> bool:
        if site_id in self._peers:
            del self._peers[site_id]
            self.save()
            return True
        return False

    def update_health(self, site_id: str, is_healthy: bool) -> None:
        peer = self._peers.get(site_id)
        if peer:
            peer.is_healthy = is_healthy
            peer.last_health_check = time.time()
            self.save()


@@ -1,416 +0,0 @@
from __future__ import annotations

import json
import logging
import tempfile
import threading
import time
from dataclasses import dataclass, field
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Dict, List, Optional, TYPE_CHECKING

import boto3
from botocore.config import Config
from botocore.exceptions import ClientError

if TYPE_CHECKING:
    from .connections import ConnectionStore, RemoteConnection
    from .replication import ReplicationManager, ReplicationRule
    from .storage import ObjectStorage

logger = logging.getLogger(__name__)

SITE_SYNC_USER_AGENT = "SiteSyncAgent/1.0"


@dataclass
class SyncedObjectInfo:
    last_synced_at: float
    remote_etag: str
    source: str

    def to_dict(self) -> Dict[str, Any]:
        return {
            "last_synced_at": self.last_synced_at,
            "remote_etag": self.remote_etag,
            "source": self.source,
        }

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "SyncedObjectInfo":
        return cls(
            last_synced_at=data["last_synced_at"],
            remote_etag=data["remote_etag"],
            source=data["source"],
        )


@dataclass
class SyncState:
    synced_objects: Dict[str, SyncedObjectInfo] = field(default_factory=dict)
    last_full_sync: Optional[float] = None

    def to_dict(self) -> Dict[str, Any]:
        return {
            "synced_objects": {k: v.to_dict() for k, v in self.synced_objects.items()},
            "last_full_sync": self.last_full_sync,
        }

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "SyncState":
        synced_objects = {}
        for k, v in data.get("synced_objects", {}).items():
            synced_objects[k] = SyncedObjectInfo.from_dict(v)
        return cls(
            synced_objects=synced_objects,
            last_full_sync=data.get("last_full_sync"),
        )


@dataclass
class SiteSyncStats:
    last_sync_at: Optional[float] = None
    objects_pulled: int = 0
    objects_skipped: int = 0
    conflicts_resolved: int = 0
    deletions_applied: int = 0
    errors: int = 0

    def to_dict(self) -> Dict[str, Any]:
        return {
            "last_sync_at": self.last_sync_at,
            "objects_pulled": self.objects_pulled,
            "objects_skipped": self.objects_skipped,
            "conflicts_resolved": self.conflicts_resolved,
            "deletions_applied": self.deletions_applied,
            "errors": self.errors,
        }


@dataclass
class RemoteObjectMeta:
    key: str
    size: int
    last_modified: datetime
    etag: str

    @classmethod
    def from_s3_object(cls, obj: Dict[str, Any]) -> "RemoteObjectMeta":
        return cls(
            key=obj["Key"],
            size=obj.get("Size", 0),
            last_modified=obj["LastModified"],
            etag=obj.get("ETag", "").strip('"'),
        )


def _create_sync_client(
    connection: "RemoteConnection",
    *,
    connect_timeout: int = 10,
    read_timeout: int = 120,
    max_retries: int = 2,
) -> Any:
    config = Config(
        user_agent_extra=SITE_SYNC_USER_AGENT,
        connect_timeout=connect_timeout,
        read_timeout=read_timeout,
        retries={"max_attempts": max_retries},
        signature_version="s3v4",
        s3={"addressing_style": "path"},
        request_checksum_calculation="when_required",
        response_checksum_validation="when_required",
    )
    return boto3.client(
        "s3",
        endpoint_url=connection.endpoint_url,
        aws_access_key_id=connection.access_key,
        aws_secret_access_key=connection.secret_key,
        region_name=connection.region or "us-east-1",
        config=config,
    )


class SiteSyncWorker:
    def __init__(
        self,
        storage: "ObjectStorage",
        connections: "ConnectionStore",
        replication_manager: "ReplicationManager",
        storage_root: Path,
        interval_seconds: int = 60,
        batch_size: int = 100,
        connect_timeout: int = 10,
        read_timeout: int = 120,
        max_retries: int = 2,
        clock_skew_tolerance_seconds: float = 1.0,
    ):
        self.storage = storage
        self.connections = connections
        self.replication_manager = replication_manager
        self.storage_root = storage_root
        self.interval_seconds = interval_seconds
        self.batch_size = batch_size
        self.connect_timeout = connect_timeout
        self.read_timeout = read_timeout
        self.max_retries = max_retries
        self.clock_skew_tolerance_seconds = clock_skew_tolerance_seconds
        self._lock = threading.Lock()
        self._shutdown = threading.Event()
        self._sync_thread: Optional[threading.Thread] = None
        self._bucket_stats: Dict[str, SiteSyncStats] = {}

    def _create_client(self, connection: "RemoteConnection") -> Any:
        """Create an S3 client with the worker's configured timeouts."""
        return _create_sync_client(
            connection,
            connect_timeout=self.connect_timeout,
            read_timeout=self.read_timeout,
            max_retries=self.max_retries,
        )

    def start(self) -> None:
        if self._sync_thread is not None and self._sync_thread.is_alive():
            return
        self._shutdown.clear()
        self._sync_thread = threading.Thread(
            target=self._sync_loop, name="site-sync-worker", daemon=True
        )
        self._sync_thread.start()
        logger.info("Site sync worker started (interval=%ds)", self.interval_seconds)

    def shutdown(self) -> None:
        self._shutdown.set()
        if self._sync_thread is not None:
            self._sync_thread.join(timeout=10.0)
        logger.info("Site sync worker shut down")

    def trigger_sync(self, bucket_name: str) -> Optional[SiteSyncStats]:
        from .replication import REPLICATION_MODE_BIDIRECTIONAL
        rule = self.replication_manager.get_rule(bucket_name)
        if not rule or rule.mode != REPLICATION_MODE_BIDIRECTIONAL or not rule.enabled:
            return None
        return self._sync_bucket(rule)

    def get_stats(self, bucket_name: str) -> Optional[SiteSyncStats]:
        with self._lock:
            return self._bucket_stats.get(bucket_name)

    def _sync_loop(self) -> None:
        while not self._shutdown.is_set():
            self._shutdown.wait(timeout=self.interval_seconds)
            if self._shutdown.is_set():
                break
            self._run_sync_cycle()

    def _run_sync_cycle(self) -> None:
        from .replication import REPLICATION_MODE_BIDIRECTIONAL
        for bucket_name, rule in list(self.replication_manager._rules.items()):
            if self._shutdown.is_set():
                break
            if rule.mode != REPLICATION_MODE_BIDIRECTIONAL or not rule.enabled:
                continue
            try:
                stats = self._sync_bucket(rule)
                with self._lock:
                    self._bucket_stats[bucket_name] = stats
            except Exception as e:
                logger.exception("Site sync failed for bucket %s: %s", bucket_name, e)

    def _sync_bucket(self, rule: "ReplicationRule") -> SiteSyncStats:
        stats = SiteSyncStats()
        connection = self.connections.get(rule.target_connection_id)
        if not connection:
            logger.warning("Connection %s not found for bucket %s", rule.target_connection_id, rule.bucket_name)
            stats.errors += 1
            return stats
        try:
            local_objects = self._list_local_objects(rule.bucket_name)
        except Exception as e:
            logger.error("Failed to list local objects for %s: %s", rule.bucket_name, e)
            stats.errors += 1
            return stats
        try:
            remote_objects = self._list_remote_objects(rule, connection)
        except Exception as e:
            logger.error("Failed to list remote objects for %s: %s", rule.bucket_name, e)
            stats.errors += 1
            return stats
        sync_state = self._load_sync_state(rule.bucket_name)
        local_keys = set(local_objects.keys())
        remote_keys = set(remote_objects.keys())
        to_pull = []
        for key in remote_keys:
            remote_meta = remote_objects[key]
            local_meta = local_objects.get(key)
            if local_meta is None:
                to_pull.append(key)
            else:
                resolution = self._resolve_conflict(local_meta, remote_meta)
                if resolution == "pull":
                    to_pull.append(key)
                    stats.conflicts_resolved += 1
                else:
                    stats.objects_skipped += 1
        pulled_count = 0
        for key in to_pull:
            if self._shutdown.is_set():
                break
            if pulled_count >= self.batch_size:
                break
            remote_meta = remote_objects[key]
            success = self._pull_object(rule, key, connection, remote_meta)
            if success:
                stats.objects_pulled += 1
                pulled_count += 1
                sync_state.synced_objects[key] = SyncedObjectInfo(
                    last_synced_at=time.time(),
                    remote_etag=remote_meta.etag,
                    source="remote",
                )
            else:
                stats.errors += 1
        if rule.sync_deletions:
            for key in list(sync_state.synced_objects.keys()):
                if key not in remote_keys and key in local_keys:
                    tracked = sync_state.synced_objects[key]
                    if tracked.source == "remote":
                        local_meta = local_objects.get(key)
                        if local_meta and local_meta.last_modified.timestamp() <= tracked.last_synced_at:
                            success = self._apply_remote_deletion(rule.bucket_name, key)
                            if success:
                                stats.deletions_applied += 1
                                del sync_state.synced_objects[key]
        sync_state.last_full_sync = time.time()
        self._save_sync_state(rule.bucket_name, sync_state)
        with self.replication_manager._stats_lock:
            rule.last_pull_at = time.time()
        self.replication_manager.save_rules()
        stats.last_sync_at = time.time()
        logger.info(
            "Site sync completed for %s: pulled=%d, skipped=%d, conflicts=%d, deletions=%d, errors=%d",
            rule.bucket_name,
            stats.objects_pulled,
            stats.objects_skipped,
            stats.conflicts_resolved,
            stats.deletions_applied,
            stats.errors,
        )
        return stats

    def _list_local_objects(self, bucket_name: str) -> Dict[str, Any]:
        from .storage import ObjectMeta
        objects = self.storage.list_objects_all(bucket_name)
        return {obj.key: obj for obj in objects}

    def _list_remote_objects(self, rule: "ReplicationRule", connection: "RemoteConnection") -> Dict[str, RemoteObjectMeta]:
        s3 = self._create_client(connection)
        result: Dict[str, RemoteObjectMeta] = {}
        paginator = s3.get_paginator("list_objects_v2")
        try:
            for page in paginator.paginate(Bucket=rule.target_bucket):
                for obj in page.get("Contents", []):
                    meta = RemoteObjectMeta.from_s3_object(obj)
                    result[meta.key] = meta
        except ClientError as e:
            if e.response["Error"]["Code"] == "NoSuchBucket":
                return {}
            raise
        return result

    def _resolve_conflict(self, local_meta: Any, remote_meta: RemoteObjectMeta) -> str:
        local_ts = local_meta.last_modified.timestamp()
        remote_ts = remote_meta.last_modified.timestamp()
        if abs(remote_ts - local_ts) < self.clock_skew_tolerance_seconds:
            local_etag = local_meta.etag or ""
            if remote_meta.etag == local_etag:
                return "skip"
            return "pull" if remote_meta.etag > local_etag else "keep"
        return "pull" if remote_ts > local_ts else "keep"

    def _pull_object(
        self,
        rule: "ReplicationRule",
        object_key: str,
        connection: "RemoteConnection",
        remote_meta: RemoteObjectMeta,
    ) -> bool:
        s3 = self._create_client(connection)
        tmp_path = None
        try:
            tmp_dir = self.storage_root / ".myfsio.sys" / "tmp"
            tmp_dir.mkdir(parents=True, exist_ok=True)
            with tempfile.NamedTemporaryFile(dir=tmp_dir, delete=False) as tmp_file:
                tmp_path = Path(tmp_file.name)
            s3.download_file(rule.target_bucket, object_key, str(tmp_path))
            head_response = s3.head_object(Bucket=rule.target_bucket, Key=object_key)
            user_metadata = head_response.get("Metadata", {})
            with open(tmp_path, "rb") as f:
                self.storage.put_object(
                    rule.bucket_name,
                    object_key,
                    f,
                    metadata=user_metadata if user_metadata else None,
                )
            logger.debug("Pulled object %s/%s from remote", rule.bucket_name, object_key)
            return True
        except ClientError as e:
            logger.error("Failed to pull %s/%s: %s", rule.bucket_name, object_key, e)
            return False
        except Exception as e:
            logger.error("Failed to store pulled object %s/%s: %s", rule.bucket_name, object_key, e)
            return False
        finally:
            if tmp_path and tmp_path.exists():
                try:
                    tmp_path.unlink()
                except OSError:
                    pass

    def _apply_remote_deletion(self, bucket_name: str, object_key: str) -> bool:
        try:
            self.storage.delete_object(bucket_name, object_key)
            logger.debug("Applied remote deletion for %s/%s", bucket_name, object_key)
            return True
        except Exception as e:
            logger.error("Failed to apply remote deletion for %s/%s: %s", bucket_name, object_key, e)
            return False

    def _sync_state_path(self, bucket_name: str) -> Path:
        return self.storage_root / ".myfsio.sys" / "buckets" / bucket_name / "site_sync_state.json"

    def _load_sync_state(self, bucket_name: str) -> SyncState:
        path = self._sync_state_path(bucket_name)
        if not path.exists():
            return SyncState()
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
            return SyncState.from_dict(data)
        except (json.JSONDecodeError, OSError, KeyError) as e:
            logger.warning("Failed to load sync state for %s: %s", bucket_name, e)
            return SyncState()

    def _save_sync_state(self, bucket_name: str, state: SyncState) -> None:
        path = self._sync_state_path(bucket_name)
        path.parent.mkdir(parents=True, exist_ok=True)
        try:
            path.write_text(json.dumps(state.to_dict(), indent=2), encoding="utf-8")
        except OSError as e:
            logger.warning("Failed to save sync state for %s: %s", bucket_name, e)
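
The decision made by `_resolve_conflict` may be easier to follow when pulled out of the worker. The sketch below restates the same rule on plain values: it is an illustration only, `resolve_conflict` is a hypothetical free function, and the 1.0 second default mirrors `clock_skew_tolerance_seconds`.

```python
from datetime import datetime, timezone


def resolve_conflict(local_ts, local_etag, remote_ts, remote_etag, tolerance=1.0):
    """Return "pull", "keep", or "skip" for an object present on both sides."""
    if abs(remote_ts - local_ts) < tolerance:
        # Timestamps are too close to trust: fall back to comparing ETags.
        if remote_etag == local_etag:
            return "skip"
        # Lexicographic ETag order is an arbitrary but deterministic tie-break.
        return "pull" if remote_etag > local_etag else "keep"
    # Outside the tolerance window the newer write wins.
    return "pull" if remote_ts > local_ts else "keep"


base = datetime(2026, 4, 1, tzinfo=timezone.utc).timestamp()
newer = base + 30.0
print(resolve_conflict(base, "aaa", newer, "bbb"))       # remote is newer
print(resolve_conflict(newer, "aaa", base, "bbb"))       # local is newer
print(resolve_conflict(base, "aaa", base + 0.2, "aaa"))  # same content
print(resolve_conflict(base, "aaa", base + 0.2, "bbb"))  # tie broken by ETag order
```

Because both sites apply the same ETag ordering, two peers that see near-simultaneous writes should converge on the same copy rather than swapping objects back and forth. In `_sync_bucket`, any result other than `"pull"` is counted as skipped.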

File diff suppressed because it is too large


@@ -1,215 +0,0 @@
from __future__ import annotations

import json
import logging
import threading
import time
from dataclasses import dataclass
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Dict, List, Optional, TYPE_CHECKING

import psutil

if TYPE_CHECKING:
    from .storage import ObjectStorage

logger = logging.getLogger(__name__)


@dataclass
class SystemMetricsSnapshot:
    timestamp: datetime
    cpu_percent: float
    memory_percent: float
    disk_percent: float
    storage_bytes: int

    def to_dict(self) -> Dict[str, Any]:
        return {
            "timestamp": self.timestamp.strftime("%Y-%m-%dT%H:%M:%SZ"),
            "cpu_percent": round(self.cpu_percent, 2),
            "memory_percent": round(self.memory_percent, 2),
            "disk_percent": round(self.disk_percent, 2),
            "storage_bytes": self.storage_bytes,
        }

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "SystemMetricsSnapshot":
        timestamp_str = data["timestamp"]
        if timestamp_str.endswith("Z"):
            timestamp_str = timestamp_str[:-1] + "+00:00"
        return cls(
            timestamp=datetime.fromisoformat(timestamp_str),
            cpu_percent=data.get("cpu_percent", 0.0),
            memory_percent=data.get("memory_percent", 0.0),
            disk_percent=data.get("disk_percent", 0.0),
            storage_bytes=data.get("storage_bytes", 0),
        )


class SystemMetricsCollector:
    def __init__(
        self,
        storage_root: Path,
        interval_minutes: int = 5,
        retention_hours: int = 24,
    ):
        self.storage_root = storage_root
        self.interval_seconds = interval_minutes * 60
        self.retention_hours = retention_hours
        self._lock = threading.Lock()
        self._shutdown = threading.Event()
        self._snapshots: List[SystemMetricsSnapshot] = []
        self._storage_ref: Optional["ObjectStorage"] = None
        self._load_history()
        self._snapshot_thread = threading.Thread(
            target=self._snapshot_loop,
            name="system-metrics-snapshot",
            daemon=True,
        )
        self._snapshot_thread.start()

    def set_storage(self, storage: "ObjectStorage") -> None:
        with self._lock:
            self._storage_ref = storage

    def _config_path(self) -> Path:
        return self.storage_root / ".myfsio.sys" / "config" / "metrics_history.json"

    def _load_history(self) -> None:
        config_path = self._config_path()
        if not config_path.exists():
            return
        try:
            data = json.loads(config_path.read_text(encoding="utf-8"))
            history_data = data.get("history", [])
            self._snapshots = [SystemMetricsSnapshot.from_dict(s) for s in history_data]
            self._prune_old_snapshots()
        except (json.JSONDecodeError, OSError, KeyError) as e:
            logger.warning(f"Failed to load system metrics history: {e}")

    def _save_history(self) -> None:
        config_path = self._config_path()
        config_path.parent.mkdir(parents=True, exist_ok=True)
        try:
            data = {"history": [s.to_dict() for s in self._snapshots]}
            config_path.write_text(json.dumps(data, indent=2), encoding="utf-8")
        except OSError as e:
            logger.warning(f"Failed to save system metrics history: {e}")

    def _prune_old_snapshots(self) -> None:
        if not self._snapshots:
            return
        cutoff = datetime.now(timezone.utc).timestamp() - (self.retention_hours * 3600)
        self._snapshots = [
            s for s in self._snapshots if s.timestamp.timestamp() > cutoff
        ]

    def _snapshot_loop(self) -> None:
        while not self._shutdown.is_set():
            self._shutdown.wait(timeout=self.interval_seconds)
            if not self._shutdown.is_set():
                self._take_snapshot()

    def _take_snapshot(self) -> None:
        try:
            cpu_percent = psutil.cpu_percent(interval=0.1)
            memory = psutil.virtual_memory()
            disk = psutil.disk_usage(str(self.storage_root))
            storage_bytes = 0
            with self._lock:
                storage = self._storage_ref
            if storage:
                try:
                    buckets = storage.list_buckets()
                    for bucket in buckets:
                        stats = storage.bucket_stats(bucket.name, cache_ttl=60)
                        storage_bytes += stats.get("total_bytes", stats.get("bytes", 0))
                except Exception as e:
                    logger.warning(f"Failed to collect bucket stats: {e}")
            snapshot = SystemMetricsSnapshot(
                timestamp=datetime.now(timezone.utc),
                cpu_percent=cpu_percent,
                memory_percent=memory.percent,
                disk_percent=disk.percent,
                storage_bytes=storage_bytes,
            )
            with self._lock:
                self._snapshots.append(snapshot)
                self._prune_old_snapshots()
                self._save_history()
            logger.debug(f"System metrics snapshot taken: CPU={cpu_percent:.1f}%, Memory={memory.percent:.1f}%")
        except Exception as e:
            logger.warning(f"Failed to take system metrics snapshot: {e}")

    def get_current(self) -> Dict[str, Any]:
        cpu_percent = psutil.cpu_percent(interval=0.1)
        memory = psutil.virtual_memory()
        disk = psutil.disk_usage(str(self.storage_root))
        boot_time = psutil.boot_time()
        uptime_seconds = time.time() - boot_time
        uptime_days = int(uptime_seconds / 86400)
        total_buckets = 0
        total_objects = 0
        total_bytes_used = 0
        total_versions = 0
        with self._lock:
            storage = self._storage_ref
        if storage:
            try:
                buckets = storage.list_buckets()
                total_buckets = len(buckets)
                for bucket in buckets:
                    stats = storage.bucket_stats(bucket.name, cache_ttl=60)
                    total_objects += stats.get("total_objects", stats.get("objects", 0))
                    total_bytes_used += stats.get("total_bytes", stats.get("bytes", 0))
                    total_versions += stats.get("version_count", 0)
            except Exception as e:
                logger.warning(f"Failed to collect current bucket stats: {e}")
        return {
            "cpu_percent": round(cpu_percent, 2),
            "memory": {
                "total": memory.total,
                "available": memory.available,
                "used": memory.used,
                "percent": round(memory.percent, 2),
            },
            "disk": {
                "total": disk.total,
                "free": disk.free,
                "used": disk.used,
                "percent": round(disk.percent, 2),
            },
            "app": {
                "buckets": total_buckets,
                "objects": total_objects,
                "versions": total_versions,
                "storage_bytes": total_bytes_used,
                "uptime_days": uptime_days,
            },
        }

    def get_history(self, hours: Optional[int] = None) -> List[Dict[str, Any]]:
        with self._lock:
            snapshots = list(self._snapshots)
        if hours:
            cutoff = datetime.now(timezone.utc).timestamp() - (hours * 3600)
            snapshots = [s for s in snapshots if s.timestamp.timestamp() > cutoff]
        return [s.to_dict() for s in snapshots]

    def shutdown(self) -> None:
        self._shutdown.set()
        self._take_snapshot()
        self._snapshot_thread.join(timeout=5.0)

3351
app/ui.py

File diff suppressed because it is too large


@@ -1,8 +0,0 @@
from __future__ import annotations

APP_VERSION = "0.2.6"


def get_version() -> str:
    """Return the current application version."""
    return APP_VERSION


@@ -0,0 +1,27 @@
[package]
name = "myfsio-auth"
version.workspace = true
edition.workspace = true
[dependencies]
myfsio-common = { path = "../myfsio-common" }
hmac = { workspace = true }
sha2 = { workspace = true }
hex = { workspace = true }
aes = { workspace = true }
cbc = { workspace = true }
base64 = { workspace = true }
pbkdf2 = "0.12"
rand = "0.8"
lru = { workspace = true }
parking_lot = { workspace = true }
percent-encoding = { workspace = true }
serde = { workspace = true }
serde_json = { workspace = true }
thiserror = { workspace = true }
chrono = { workspace = true }
tracing = { workspace = true }
uuid = { workspace = true }
[dev-dependencies]
tempfile = "3"


@@ -0,0 +1,118 @@
use aes::cipher::{block_padding::Pkcs7, BlockDecryptMut, BlockEncryptMut, KeyIvInit};
use base64::{engine::general_purpose::URL_SAFE, Engine};
use hmac::{Hmac, Mac};
use rand::RngCore;
use sha2::Sha256;

type Aes128CbcDec = cbc::Decryptor<aes::Aes128>;
type Aes128CbcEnc = cbc::Encryptor<aes::Aes128>;
type HmacSha256 = Hmac<Sha256>;

pub fn derive_fernet_key(secret: &str) -> String {
    let mut derived = [0u8; 32];
    pbkdf2::pbkdf2_hmac::<Sha256>(
        secret.as_bytes(),
        b"myfsio-iam-encryption",
        100_000,
        &mut derived,
    );
    URL_SAFE.encode(derived)
}

pub fn decrypt(key_b64: &str, token: &str) -> Result<Vec<u8>, &'static str> {
    let key_bytes = URL_SAFE
        .decode(key_b64)
        .map_err(|_| "invalid fernet key base64")?;
    if key_bytes.len() != 32 {
        return Err("fernet key must be 32 bytes");
    }
    let signing_key = &key_bytes[..16];
    let encryption_key = &key_bytes[16..];
    let token_bytes = URL_SAFE
        .decode(token)
        .map_err(|_| "invalid fernet token base64")?;
    if token_bytes.len() < 57 {
        return Err("fernet token too short");
    }
    if token_bytes[0] != 0x80 {
        return Err("invalid fernet version");
    }
    let hmac_offset = token_bytes.len() - 32;
    let payload = &token_bytes[..hmac_offset];
    let expected_hmac = &token_bytes[hmac_offset..];
    let mut mac = HmacSha256::new_from_slice(signing_key).map_err(|_| "hmac key error")?;
    mac.update(payload);
    mac.verify_slice(expected_hmac)
        .map_err(|_| "HMAC verification failed")?;
    let iv = &token_bytes[9..25];
    let ciphertext = &token_bytes[25..hmac_offset];
    let plaintext = Aes128CbcDec::new(encryption_key.into(), iv.into())
        .decrypt_padded_vec_mut::<Pkcs7>(ciphertext)
        .map_err(|_| "AES-CBC decryption failed")?;
    Ok(plaintext)
}

pub fn encrypt(key_b64: &str, plaintext: &[u8]) -> Result<String, &'static str> {
    let key_bytes = URL_SAFE
        .decode(key_b64)
        .map_err(|_| "invalid fernet key base64")?;
    if key_bytes.len() != 32 {
        return Err("fernet key must be 32 bytes");
    }
    let signing_key = &key_bytes[..16];
    let encryption_key = &key_bytes[16..];
    let mut iv = [0u8; 16];
    rand::thread_rng().fill_bytes(&mut iv);
    let timestamp = std::time::SystemTime::now()
        .duration_since(std::time::UNIX_EPOCH)
        .map_err(|_| "system time error")?
        .as_secs();
    let ciphertext = Aes128CbcEnc::new(encryption_key.into(), (&iv).into())
        .encrypt_padded_vec_mut::<Pkcs7>(plaintext);
    let mut payload = Vec::with_capacity(1 + 8 + 16 + ciphertext.len());
    payload.push(0x80);
    payload.extend_from_slice(&timestamp.to_be_bytes());
    payload.extend_from_slice(&iv);
    payload.extend_from_slice(&ciphertext);
    let mut mac = HmacSha256::new_from_slice(signing_key).map_err(|_| "hmac key error")?;
    mac.update(&payload);
    let tag = mac.finalize().into_bytes();
    let mut token_bytes = payload;
    token_bytes.extend_from_slice(&tag);
    Ok(URL_SAFE.encode(&token_bytes))
}
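
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_derive_fernet_key_format() {
        let key = derive_fernet_key("test-secret");
        let decoded = URL_SAFE.decode(&key).unwrap();
        assert_eq!(decoded.len(), 32);
    }

    #[test]
    fn test_roundtrip_with_python_compat() {
        let key = derive_fernet_key("dev-secret-key");
        let decoded = URL_SAFE.decode(&key).unwrap();
        assert_eq!(decoded.len(), 32);
        // Exercise the round trip the test name promises: encrypt, then decrypt.
        let token = encrypt(&key, b"hello").unwrap();
        let plaintext = decrypt(&key, &token).unwrap();
        assert_eq!(plaintext, b"hello".to_vec());
    }
}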
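
For comparison with the Python side this module replaces, the key derivation and token layout used above can be reproduced with the standard library alone. This is a sketch under stated assumptions: `split_token` is a hypothetical helper, it only checks the HMAC and slices the fields (the standard library has no AES, so nothing is decrypted), and the demo token carries random placeholder bytes instead of real ciphertext.

```python
import base64
import hashlib
import hmac
import os
import struct
import time


def derive_fernet_key(secret: str) -> bytes:
    """PBKDF2-HMAC-SHA256 with the salt and iteration count from the Rust code."""
    raw = hashlib.pbkdf2_hmac("sha256", secret.encode(), b"myfsio-iam-encryption", 100_000, dklen=32)
    return base64.urlsafe_b64encode(raw)


def split_token(key_b64: bytes, token: bytes) -> dict:
    """Verify the trailing HMAC and return the token fields without decrypting."""
    signing_key = base64.urlsafe_b64decode(key_b64)[:16]
    raw = base64.urlsafe_b64decode(token)
    if len(raw) < 57 or raw[0] != 0x80:
        raise ValueError("not a Fernet token")
    payload, tag = raw[:-32], raw[-32:]
    expected = hmac.new(signing_key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        raise ValueError("HMAC verification failed")
    (timestamp,) = struct.unpack(">Q", payload[1:9])
    return {"timestamp": timestamp, "iv": payload[9:25], "ciphertext": payload[25:]}


key = derive_fernet_key("dev-secret-key")
signing_key = base64.urlsafe_b64decode(key)[:16]
# Layout: version byte, 8-byte big-endian timestamp, 16-byte IV, ciphertext.
# The last 16 random bytes are a placeholder, not real AES-CBC output.
payload = b"\x80" + struct.pack(">Q", int(time.time())) + os.urandom(16) + os.urandom(16)
token = base64.urlsafe_b64encode(payload + hmac.new(signing_key, payload, hashlib.sha256).digest())
fields = split_token(key, token)
print(len(base64.urlsafe_b64decode(key)), len(fields["iv"]), len(fields["ciphertext"]))
```

The first 16 bytes of the derived key sign the token and the last 16 encrypt it, which is the same split `decrypt` and `encrypt` perform on `key_bytes`.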

File diff suppressed because it is too large


@@ -0,0 +1,4 @@
mod fernet;
pub mod iam;
pub mod principal;
pub mod sigv4;


@@ -0,0 +1 @@
pub use myfsio_common::types::Principal;


@@ -0,0 +1,287 @@
use hmac::{Hmac, Mac};
use lru::LruCache;
use parking_lot::Mutex;
use percent_encoding::{percent_encode, AsciiSet, NON_ALPHANUMERIC};
use sha2::{Digest, Sha256};
use std::num::NonZeroUsize;
use std::sync::LazyLock;
use std::time::Instant;
type HmacSha256 = Hmac<Sha256>;
struct CacheEntry {
key: Vec<u8>,
created: Instant,
}
static SIGNING_KEY_CACHE: LazyLock<Mutex<LruCache<(String, String, String, String), CacheEntry>>> =
LazyLock::new(|| Mutex::new(LruCache::new(NonZeroUsize::new(256).unwrap())));
const CACHE_TTL_SECS: u64 = 60;
const AWS_ENCODE_SET: &AsciiSet = &NON_ALPHANUMERIC
.remove(b'-')
.remove(b'_')
.remove(b'.')
.remove(b'~');
fn hmac_sha256(key: &[u8], msg: &[u8]) -> Vec<u8> {
let mut mac = HmacSha256::new_from_slice(key).expect("HMAC key length is always valid");
mac.update(msg);
mac.finalize().into_bytes().to_vec()
}
fn sha256_hex(data: &[u8]) -> String {
let mut hasher = Sha256::new();
hasher.update(data);
hex::encode(hasher.finalize())
}
fn aws_uri_encode(input: &str) -> String {
percent_encode(input.as_bytes(), AWS_ENCODE_SET).to_string()
}
pub fn derive_signing_key_cached(
secret_key: &str,
date_stamp: &str,
region: &str,
service: &str,
) -> Vec<u8> {
let cache_key = (
secret_key.to_owned(),
date_stamp.to_owned(),
region.to_owned(),
service.to_owned(),
);
{
let mut cache = SIGNING_KEY_CACHE.lock();
if let Some(entry) = cache.get(&cache_key) {
if entry.created.elapsed().as_secs() < CACHE_TTL_SECS {
return entry.key.clone();
}
cache.pop(&cache_key);
}
}
let k_date = hmac_sha256(
format!("AWS4{}", secret_key).as_bytes(),
date_stamp.as_bytes(),
);
let k_region = hmac_sha256(&k_date, region.as_bytes());
let k_service = hmac_sha256(&k_region, service.as_bytes());
let k_signing = hmac_sha256(&k_service, b"aws4_request");
{
let mut cache = SIGNING_KEY_CACHE.lock();
cache.put(
cache_key,
CacheEntry {
key: k_signing.clone(),
created: Instant::now(),
},
);
}
k_signing
}
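/// Compares two byte slices without short-circuiting on the first mismatching byte.
/// Only a length difference returns early.
fn constant_time_compare_inner(a: &[u8], b: &[u8]) -> bool {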
if a.len() != b.len() {
return false;
}
let mut result: u8 = 0;
for (x, y) in a.iter().zip(b.iter()) {
result |= x ^ y;
}
result == 0
}
pub fn verify_sigv4_signature(
method: &str,
canonical_uri: &str,
query_params: &[(String, String)],
signed_headers_str: &str,
header_values: &[(String, String)],
payload_hash: &str,
amz_date: &str,
date_stamp: &str,
region: &str,
service: &str,
secret_key: &str,
provided_signature: &str,
) -> bool {
// SigV4 orders query parameters by their URI-encoded names (then values),
// so encode first and sort afterwards.
let mut encoded_params: Vec<(String, String)> = query_params
.iter()
.map(|(k, v)| (aws_uri_encode(k), aws_uri_encode(v)))
.collect();
encoded_params.sort();
let canonical_query_string = encoded_params
.iter()
.map(|(k, v)| format!("{}={}", k, v))
.collect::<Vec<_>>()
.join("&");
let mut canonical_headers = String::new();
for (name, value) in header_values {
let lower_name = name.to_lowercase();
let normalized = value.split_whitespace().collect::<Vec<_>>().join(" ");
let final_value = if lower_name == "expect" && normalized.is_empty() {
"100-continue"
} else {
&normalized
};
canonical_headers.push_str(&lower_name);
canonical_headers.push(':');
canonical_headers.push_str(final_value);
canonical_headers.push('\n');
}
let canonical_request = format!(
"{}\n{}\n{}\n{}\n{}\n{}",
method,
canonical_uri,
canonical_query_string,
canonical_headers,
signed_headers_str,
payload_hash
);
let credential_scope = format!("{}/{}/{}/aws4_request", date_stamp, region, service);
let cr_hash = sha256_hex(canonical_request.as_bytes());
let string_to_sign = format!(
"AWS4-HMAC-SHA256\n{}\n{}\n{}",
amz_date, credential_scope, cr_hash
);
let signing_key = derive_signing_key_cached(secret_key, date_stamp, region, service);
let calculated = hmac_sha256(&signing_key, string_to_sign.as_bytes());
let calculated_hex = hex::encode(&calculated);
constant_time_compare_inner(calculated_hex.as_bytes(), provided_signature.as_bytes())
}
pub fn derive_signing_key(
secret_key: &str,
date_stamp: &str,
region: &str,
service: &str,
) -> Vec<u8> {
derive_signing_key_cached(secret_key, date_stamp, region, service)
}
pub fn compute_signature(signing_key: &[u8], string_to_sign: &str) -> String {
let sig = hmac_sha256(signing_key, string_to_sign.as_bytes());
hex::encode(sig)
}
pub fn compute_post_policy_signature(signing_key: &[u8], policy_b64: &str) -> String {
let sig = hmac_sha256(signing_key, policy_b64.as_bytes());
hex::encode(sig)
}
pub fn build_string_to_sign(
amz_date: &str,
credential_scope: &str,
canonical_request: &str,
) -> String {
let cr_hash = sha256_hex(canonical_request.as_bytes());
format!(
"AWS4-HMAC-SHA256\n{}\n{}\n{}",
amz_date, credential_scope, cr_hash
)
}
pub fn constant_time_compare(a: &str, b: &str) -> bool {
constant_time_compare_inner(a.as_bytes(), b.as_bytes())
}
pub fn clear_signing_key_cache() {
SIGNING_KEY_CACHE.lock().clear();
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_derive_signing_key() {
let key = derive_signing_key(
"wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
"20130524",
"us-east-1",
"s3",
);
assert_eq!(key.len(), 32);
}
#[test]
fn test_derive_signing_key_cached() {
let key1 = derive_signing_key("secret", "20240101", "us-east-1", "s3");
let key2 = derive_signing_key("secret", "20240101", "us-east-1", "s3");
assert_eq!(key1, key2);
}
#[test]
fn test_constant_time_compare() {
assert!(constant_time_compare("abc", "abc"));
assert!(!constant_time_compare("abc", "abd"));
assert!(!constant_time_compare("abc", "abcd"));
}
#[test]
fn test_build_string_to_sign() {
let result = build_string_to_sign(
"20130524T000000Z",
"20130524/us-east-1/s3/aws4_request",
"GET\n/\n\nhost:example.com\n\nhost\nUNSIGNED-PAYLOAD",
);
assert!(result.starts_with("AWS4-HMAC-SHA256\n"));
assert!(result.contains("20130524T000000Z"));
}
#[test]
fn test_aws_uri_encode() {
assert_eq!(aws_uri_encode("hello world"), "hello%20world");
assert_eq!(aws_uri_encode("test-file_name.txt"), "test-file_name.txt");
assert_eq!(aws_uri_encode("a/b"), "a%2Fb");
}
#[test]
fn test_verify_sigv4_roundtrip() {
let secret = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY";
let date_stamp = "20130524";
let region = "us-east-1";
let service = "s3";
let amz_date = "20130524T000000Z";
let signing_key = derive_signing_key(secret, date_stamp, region, service);
let canonical_request =
"GET\n/\n\nhost:examplebucket.s3.amazonaws.com\n\nhost\nUNSIGNED-PAYLOAD";
let string_to_sign = build_string_to_sign(
amz_date,
&format!("{}/{}/{}/aws4_request", date_stamp, region, service),
canonical_request,
);
let signature = compute_signature(&signing_key, &string_to_sign);
let result = verify_sigv4_signature(
"GET",
"/",
&[],
"host",
&[(
"host".to_string(),
"examplebucket.s3.amazonaws.com".to_string(),
)],
"UNSIGNED-PAYLOAD",
amz_date,
date_stamp,
region,
service,
secret,
&signature,
);
assert!(result);
}
}


@@ -0,0 +1,11 @@
[package]
name = "myfsio-common"
version.workspace = true
edition.workspace = true
[dependencies]
thiserror = { workspace = true }
serde = { workspace = true }
serde_json = { workspace = true }
chrono = { workspace = true }
uuid = { workspace = true }


@@ -0,0 +1,21 @@
pub const SYSTEM_ROOT: &str = ".myfsio.sys";
pub const SYSTEM_BUCKETS_DIR: &str = "buckets";
pub const SYSTEM_MULTIPART_DIR: &str = "multipart";
pub const BUCKET_META_DIR: &str = "meta";
pub const BUCKET_VERSIONS_DIR: &str = "versions";
pub const BUCKET_CONFIG_FILE: &str = ".bucket.json";
pub const STATS_FILE: &str = "stats.json";
pub const ETAG_INDEX_FILE: &str = "etag_index.json";
pub const INDEX_FILE: &str = "_index.json";
pub const MANIFEST_FILE: &str = "manifest.json";
pub const DIR_MARKER_FILE: &str = ".__myfsio_dirobj__";
pub const INTERNAL_FOLDERS: &[&str] = &[".meta", ".versions", ".multipart"];
pub const DEFAULT_REGION: &str = "us-east-1";
pub const AWS_SERVICE: &str = "s3";
pub const DEFAULT_MAX_KEYS: usize = 1000;
pub const DEFAULT_OBJECT_KEY_MAX_BYTES: usize = 1024;
pub const DEFAULT_CHUNK_SIZE: usize = 65536;
pub const STREAM_CHUNK_SIZE: usize = 1_048_576;


@@ -0,0 +1,267 @@
use std::fmt;
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum S3ErrorCode {
AccessDenied,
BadDigest,
BucketAlreadyExists,
BucketAlreadyOwnedByYou,
BucketNotEmpty,
EntityTooLarge,
EntityTooSmall,
InternalError,
InvalidAccessKeyId,
InvalidArgument,
InvalidBucketName,
InvalidKey,
InvalidPart,
InvalidPartOrder,
InvalidPolicyDocument,
InvalidRange,
InvalidRequest,
InvalidTag,
MalformedXML,
MethodNotAllowed,
NoSuchBucket,
NoSuchBucketPolicy,
NoSuchKey,
NoSuchLifecycleConfiguration,
NoSuchUpload,
NoSuchVersion,
NoSuchTagSet,
PreconditionFailed,
NotModified,
QuotaExceeded,
RequestTimeTooSkewed,
ServerSideEncryptionConfigurationNotFoundError,
SignatureDoesNotMatch,
SlowDown,
}
impl S3ErrorCode {
pub fn http_status(&self) -> u16 {
match self {
Self::AccessDenied => 403,
Self::BadDigest => 400,
Self::BucketAlreadyExists => 409,
Self::BucketAlreadyOwnedByYou => 409,
Self::BucketNotEmpty => 409,
Self::EntityTooLarge => 413,
Self::EntityTooSmall => 400,
Self::InternalError => 500,
Self::InvalidAccessKeyId => 403,
Self::InvalidArgument => 400,
Self::InvalidBucketName => 400,
Self::InvalidKey => 400,
Self::InvalidPart => 400,
Self::InvalidPartOrder => 400,
Self::InvalidPolicyDocument => 400,
Self::InvalidRange => 416,
Self::InvalidRequest => 400,
Self::InvalidTag => 400,
Self::MalformedXML => 400,
Self::MethodNotAllowed => 405,
Self::NoSuchBucket => 404,
Self::NoSuchBucketPolicy => 404,
Self::NoSuchKey => 404,
Self::NoSuchLifecycleConfiguration => 404,
Self::NoSuchUpload => 404,
Self::NoSuchVersion => 404,
Self::NoSuchTagSet => 404,
Self::PreconditionFailed => 412,
Self::NotModified => 304,
Self::QuotaExceeded => 403,
Self::RequestTimeTooSkewed => 403,
Self::ServerSideEncryptionConfigurationNotFoundError => 404,
Self::SignatureDoesNotMatch => 403,
Self::SlowDown => 503,
}
}
pub fn as_str(&self) -> &'static str {
match self {
Self::AccessDenied => "AccessDenied",
Self::BadDigest => "BadDigest",
Self::BucketAlreadyExists => "BucketAlreadyExists",
Self::BucketAlreadyOwnedByYou => "BucketAlreadyOwnedByYou",
Self::BucketNotEmpty => "BucketNotEmpty",
Self::EntityTooLarge => "EntityTooLarge",
Self::EntityTooSmall => "EntityTooSmall",
Self::InternalError => "InternalError",
Self::InvalidAccessKeyId => "InvalidAccessKeyId",
Self::InvalidArgument => "InvalidArgument",
Self::InvalidBucketName => "InvalidBucketName",
Self::InvalidKey => "InvalidKey",
Self::InvalidPart => "InvalidPart",
Self::InvalidPartOrder => "InvalidPartOrder",
Self::InvalidPolicyDocument => "InvalidPolicyDocument",
Self::InvalidRange => "InvalidRange",
Self::InvalidRequest => "InvalidRequest",
Self::InvalidTag => "InvalidTag",
Self::MalformedXML => "MalformedXML",
Self::MethodNotAllowed => "MethodNotAllowed",
Self::NoSuchBucket => "NoSuchBucket",
Self::NoSuchBucketPolicy => "NoSuchBucketPolicy",
Self::NoSuchKey => "NoSuchKey",
Self::NoSuchLifecycleConfiguration => "NoSuchLifecycleConfiguration",
Self::NoSuchUpload => "NoSuchUpload",
Self::NoSuchVersion => "NoSuchVersion",
Self::NoSuchTagSet => "NoSuchTagSet",
Self::PreconditionFailed => "PreconditionFailed",
Self::NotModified => "NotModified",
Self::QuotaExceeded => "QuotaExceeded",
Self::RequestTimeTooSkewed => "RequestTimeTooSkewed",
Self::ServerSideEncryptionConfigurationNotFoundError => {
"ServerSideEncryptionConfigurationNotFoundError"
}
Self::SignatureDoesNotMatch => "SignatureDoesNotMatch",
Self::SlowDown => "SlowDown",
}
}
pub fn default_message(&self) -> &'static str {
match self {
Self::AccessDenied => "Access Denied",
Self::BadDigest => "The Content-MD5 or checksum value you specified did not match what we received",
Self::BucketAlreadyExists => "The requested bucket name is not available",
Self::BucketAlreadyOwnedByYou => "Your previous request to create the named bucket succeeded and you already own it",
Self::BucketNotEmpty => "The bucket you tried to delete is not empty",
Self::EntityTooLarge => "Your proposed upload exceeds the maximum allowed size",
Self::EntityTooSmall => "Your proposed upload is smaller than the minimum allowed object size",
Self::InternalError => "We encountered an internal error. Please try again.",
Self::InvalidAccessKeyId => "The access key ID you provided does not exist",
Self::InvalidArgument => "Invalid argument",
Self::InvalidBucketName => "The specified bucket is not valid",
Self::InvalidKey => "The specified key is not valid",
Self::InvalidPart => "One or more of the specified parts could not be found",
Self::InvalidPartOrder => "The list of parts was not in ascending order",
Self::InvalidPolicyDocument => "The content of the form does not meet the conditions specified in the policy document",
Self::InvalidRange => "The requested range is not satisfiable",
Self::InvalidRequest => "Invalid request",
Self::InvalidTag => "The Tagging header is invalid",
Self::MalformedXML => "The XML you provided was not well-formed",
Self::MethodNotAllowed => "The specified method is not allowed against this resource",
Self::NoSuchBucket => "The specified bucket does not exist",
Self::NoSuchBucketPolicy => "The bucket policy does not exist",
Self::NoSuchKey => "The specified key does not exist",
Self::NoSuchLifecycleConfiguration => "The lifecycle configuration does not exist",
Self::NoSuchUpload => "The specified multipart upload does not exist",
Self::NoSuchVersion => "The specified version does not exist",
Self::NoSuchTagSet => "The TagSet does not exist",
Self::PreconditionFailed => "At least one of the preconditions you specified did not hold",
Self::NotModified => "Not Modified",
Self::QuotaExceeded => "The bucket quota has been exceeded",
Self::RequestTimeTooSkewed => "The difference between the request time and the server's time is too large",
Self::ServerSideEncryptionConfigurationNotFoundError => "The server side encryption configuration was not found",
Self::SignatureDoesNotMatch => "The request signature we calculated does not match the signature you provided",
Self::SlowDown => "Please reduce your request rate",
}
}
}
impl fmt::Display for S3ErrorCode {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
f.write_str(self.as_str())
}
}
#[derive(Debug, Clone)]
pub struct S3Error {
pub code: S3ErrorCode,
pub message: String,
pub resource: String,
pub request_id: String,
}
impl S3Error {
pub fn new(code: S3ErrorCode, message: impl Into<String>) -> Self {
Self {
code,
message: message.into(),
resource: String::new(),
request_id: String::new(),
}
}
pub fn from_code(code: S3ErrorCode) -> Self {
Self::new(code, code.default_message())
}
pub fn with_resource(mut self, resource: impl Into<String>) -> Self {
self.resource = resource.into();
self
}
pub fn with_request_id(mut self, request_id: impl Into<String>) -> Self {
self.request_id = request_id.into();
self
}
pub fn http_status(&self) -> u16 {
self.code.http_status()
}
pub fn to_xml(&self) -> String {
format!(
"<?xml version=\"1.0\" encoding=\"UTF-8\"?>\
<Error>\
<Code>{}</Code>\
<Message>{}</Message>\
<Resource>{}</Resource>\
<RequestId>{}</RequestId>\
</Error>",
self.code.as_str(),
xml_escape(&self.message),
xml_escape(&self.resource),
xml_escape(&self.request_id),
)
}
}
impl fmt::Display for S3Error {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
write!(f, "{}: {}", self.code, self.message)
}
}
impl std::error::Error for S3Error {}
fn xml_escape(s: &str) -> String {
s.replace('&', "&amp;")
.replace('<', "&lt;")
.replace('>', "&gt;")
.replace('"', "&quot;")
.replace('\'', "&apos;")
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_error_codes() {
assert_eq!(S3ErrorCode::NoSuchKey.http_status(), 404);
assert_eq!(S3ErrorCode::AccessDenied.http_status(), 403);
assert_eq!(S3ErrorCode::NoSuchBucket.as_str(), "NoSuchBucket");
}
#[test]
fn test_error_to_xml() {
let err = S3Error::from_code(S3ErrorCode::NoSuchKey)
.with_resource("/test-bucket/test-key")
.with_request_id("abc123");
let xml = err.to_xml();
assert!(xml.contains("<Code>NoSuchKey</Code>"));
assert!(xml.contains("<Resource>/test-bucket/test-key</Resource>"));
assert!(xml.contains("<RequestId>abc123</RequestId>"));
}
#[test]
fn test_xml_escape() {
let err = S3Error::new(S3ErrorCode::InvalidArgument, "key <test> & \"value\"")
.with_resource("/bucket/key&amp");
let xml = err.to_xml();
assert!(xml.contains("&lt;test&gt;"));
assert!(xml.contains("&amp;"));
}
}


@@ -0,0 +1,3 @@
pub mod constants;
pub mod error;
pub mod types;


@@ -0,0 +1,191 @@
use std::collections::HashMap;
use chrono::{DateTime, Utc};
use serde::{Deserialize, Serialize};
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct ObjectMeta {
pub key: String,
pub size: u64,
pub last_modified: DateTime<Utc>,
pub etag: Option<String>,
pub content_type: Option<String>,
pub storage_class: Option<String>,
pub metadata: HashMap<String, String>,
#[serde(default)]
pub version_id: Option<String>,
#[serde(default)]
pub is_delete_marker: bool,
}
impl ObjectMeta {
pub fn new(key: String, size: u64, last_modified: DateTime<Utc>) -> Self {
Self {
key,
size,
last_modified,
etag: None,
content_type: None,
storage_class: Some("STANDARD".to_string()),
metadata: HashMap::new(),
version_id: None,
is_delete_marker: false,
}
}
}
#[derive(Debug, Clone, Default)]
pub struct DeleteOutcome {
pub version_id: Option<String>,
pub is_delete_marker: bool,
pub existed: bool,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct BucketMeta {
pub name: String,
pub creation_date: DateTime<Utc>,
}
#[derive(Debug, Clone, Default)]
pub struct BucketStats {
pub objects: u64,
pub bytes: u64,
pub version_count: u64,
pub version_bytes: u64,
}
impl BucketStats {
pub fn total_objects(&self) -> u64 {
self.objects + self.version_count
}
pub fn total_bytes(&self) -> u64 {
self.bytes + self.version_bytes
}
}
#[derive(Debug, Clone)]
pub struct ListObjectsResult {
pub objects: Vec<ObjectMeta>,
pub is_truncated: bool,
pub next_continuation_token: Option<String>,
}
#[derive(Debug, Clone)]
pub struct ShallowListResult {
pub objects: Vec<ObjectMeta>,
pub common_prefixes: Vec<String>,
pub is_truncated: bool,
pub next_continuation_token: Option<String>,
}
#[derive(Debug, Clone, Default)]
pub struct ListParams {
pub max_keys: usize,
pub continuation_token: Option<String>,
pub prefix: Option<String>,
pub start_after: Option<String>,
}
#[derive(Debug, Clone, Default)]
pub struct ShallowListParams {
pub prefix: String,
pub delimiter: String,
pub max_keys: usize,
pub continuation_token: Option<String>,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct PartMeta {
pub part_number: u32,
pub etag: String,
pub size: u64,
pub last_modified: Option<DateTime<Utc>>,
}
#[derive(Debug, Clone)]
pub struct PartInfo {
pub part_number: u32,
pub etag: String,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct MultipartUploadInfo {
pub upload_id: String,
pub key: String,
pub initiated: DateTime<Utc>,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct VersionInfo {
pub version_id: String,
pub key: String,
pub size: u64,
pub last_modified: DateTime<Utc>,
pub etag: Option<String>,
pub is_latest: bool,
#[serde(default)]
pub is_delete_marker: bool,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Tag {
pub key: String,
pub value: String,
}
#[derive(Debug, Clone, Default, Serialize, Deserialize)]
pub struct BucketConfig {
#[serde(default)]
pub versioning_enabled: bool,
#[serde(default)]
pub tags: Vec<Tag>,
#[serde(default)]
pub cors: Option<serde_json::Value>,
#[serde(default)]
pub encryption: Option<serde_json::Value>,
#[serde(default)]
pub lifecycle: Option<serde_json::Value>,
#[serde(default)]
pub website: Option<serde_json::Value>,
#[serde(default)]
pub quota: Option<QuotaConfig>,
#[serde(default)]
pub acl: Option<serde_json::Value>,
#[serde(default)]
pub notification: Option<serde_json::Value>,
#[serde(default)]
pub logging: Option<serde_json::Value>,
#[serde(default)]
pub object_lock: Option<serde_json::Value>,
#[serde(default)]
pub policy: Option<serde_json::Value>,
#[serde(default)]
pub replication: Option<serde_json::Value>,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct QuotaConfig {
pub max_bytes: Option<u64>,
pub max_objects: Option<u64>,
}
#[derive(Debug, Clone)]
pub struct Principal {
pub access_key: String,
pub user_id: String,
pub display_name: String,
pub is_admin: bool,
}
impl Principal {
pub fn new(access_key: String, user_id: String, display_name: String, is_admin: bool) -> Self {
Self {
access_key,
user_id,
display_name,
is_admin,
}
}
}


@@ -0,0 +1,24 @@
[package]
name = "myfsio-crypto"
version.workspace = true
edition.workspace = true
[dependencies]
myfsio-common = { path = "../myfsio-common" }
md-5 = { workspace = true }
sha2 = { workspace = true }
hex = { workspace = true }
aes-gcm = { workspace = true }
hkdf = { workspace = true }
thiserror = { workspace = true }
tokio = { workspace = true }
serde = { workspace = true }
serde_json = { workspace = true }
uuid = { workspace = true }
chrono = { workspace = true }
base64 = { workspace = true }
rand = "0.8"
[dev-dependencies]
tokio = { workspace = true, features = ["macros", "rt-multi-thread"] }
tempfile = "3"


@@ -0,0 +1,253 @@
use aes_gcm::aead::Aead;
use aes_gcm::{Aes256Gcm, KeyInit, Nonce};
use hkdf::Hkdf;
use sha2::Sha256;
use std::fs::File;
use std::io::{Read, Seek, SeekFrom, Write};
use std::path::Path;
use thiserror::Error;
const DEFAULT_CHUNK_SIZE: usize = 65536;
const HEADER_SIZE: usize = 4;
#[derive(Debug, Error)]
pub enum CryptoError {
#[error("IO error: {0}")]
Io(#[from] std::io::Error),
#[error("Invalid key size: expected 32 bytes, got {0}")]
InvalidKeySize(usize),
#[error("Invalid nonce size: expected 12 bytes, got {0}")]
InvalidNonceSize(usize),
#[error("Encryption failed: {0}")]
EncryptionFailed(String),
#[error("Decryption failed at chunk {0}")]
DecryptionFailed(u32),
#[error("HKDF expand failed: {0}")]
HkdfFailed(String),
}
fn read_exact_chunk(reader: &mut impl Read, buf: &mut [u8]) -> std::io::Result<usize> {
let mut filled = 0;
while filled < buf.len() {
match reader.read(&mut buf[filled..]) {
Ok(0) => break,
Ok(n) => filled += n,
Err(ref e) if e.kind() == std::io::ErrorKind::Interrupted => continue,
Err(e) => return Err(e),
}
}
Ok(filled)
}
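/// Derives a per-chunk 12-byte nonce with HKDF-SHA256: the base nonce is the salt,
/// `b"chunk_nonce"` is the input key material, and the big-endian chunk index is the info.
fn derive_chunk_nonce(base_nonce: &[u8], chunk_index: u32) -> Result<[u8; 12], CryptoError> {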
let hkdf = Hkdf::<Sha256>::new(Some(base_nonce), b"chunk_nonce");
let mut okm = [0u8; 12];
hkdf.expand(&chunk_index.to_be_bytes(), &mut okm)
.map_err(|e| CryptoError::HkdfFailed(e.to_string()))?;
Ok(okm)
}
pub fn encrypt_stream_chunked(
input_path: &Path,
output_path: &Path,
key: &[u8],
base_nonce: &[u8],
chunk_size: Option<usize>,
) -> Result<u32, CryptoError> {
if key.len() != 32 {
return Err(CryptoError::InvalidKeySize(key.len()));
}
if base_nonce.len() != 12 {
return Err(CryptoError::InvalidNonceSize(base_nonce.len()));
}
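let chunk_size = chunk_size.unwrap_or(DEFAULT_CHUNK_SIZE);
// A zero-length buffer would read nothing and silently produce an empty ciphertext.
if chunk_size == 0 {
return Err(CryptoError::EncryptionFailed("chunk size must be greater than zero".into()));
}
let key_arr: [u8; 32] = key.try_into().unwrap();
let nonce_arr: [u8; 12] = base_nonce.try_into().unwrap();
let cipher = Aes256Gcm::new(&key_arr.into());
let mut infile = File::open(input_path)?;
let mut outfile = File::create(output_path)?;
// Reserve the chunk-count header; it is back-filled once all chunks are written.
outfile.write_all(&[0u8; HEADER_SIZE])?;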
let mut buf = vec![0u8; chunk_size];
let mut chunk_index: u32 = 0;
loop {
let n = read_exact_chunk(&mut infile, &mut buf)?;
if n == 0 {
break;
}
let nonce_bytes = derive_chunk_nonce(&nonce_arr, chunk_index)?;
let nonce = Nonce::from_slice(&nonce_bytes);
let encrypted = cipher
.encrypt(nonce, &buf[..n])
.map_err(|e| CryptoError::EncryptionFailed(e.to_string()))?;
let size = encrypted.len() as u32;
outfile.write_all(&size.to_be_bytes())?;
outfile.write_all(&encrypted)?;
chunk_index += 1;
}
outfile.seek(SeekFrom::Start(0))?;
outfile.write_all(&chunk_index.to_be_bytes())?;
Ok(chunk_index)
}
pub fn decrypt_stream_chunked(
input_path: &Path,
output_path: &Path,
key: &[u8],
base_nonce: &[u8],
) -> Result<u32, CryptoError> {
if key.len() != 32 {
return Err(CryptoError::InvalidKeySize(key.len()));
}
if base_nonce.len() != 12 {
return Err(CryptoError::InvalidNonceSize(base_nonce.len()));
}
let key_arr: [u8; 32] = key.try_into().unwrap();
let nonce_arr: [u8; 12] = base_nonce.try_into().unwrap();
let cipher = Aes256Gcm::new(&key_arr.into());
let mut infile = File::open(input_path)?;
let mut outfile = File::create(output_path)?;
let mut header = [0u8; HEADER_SIZE];
infile.read_exact(&mut header)?;
let chunk_count = u32::from_be_bytes(header);
let mut size_buf = [0u8; HEADER_SIZE];
for chunk_index in 0..chunk_count {
infile.read_exact(&mut size_buf)?;
let chunk_size = u32::from_be_bytes(size_buf) as usize;
let mut encrypted = vec![0u8; chunk_size];
infile.read_exact(&mut encrypted)?;
let nonce_bytes = derive_chunk_nonce(&nonce_arr, chunk_index)?;
let nonce = Nonce::from_slice(&nonce_bytes);
let decrypted = cipher
.decrypt(nonce, encrypted.as_ref())
.map_err(|_| CryptoError::DecryptionFailed(chunk_index))?;
outfile.write_all(&decrypted)?;
}
Ok(chunk_count)
}
pub async fn encrypt_stream_chunked_async(
input_path: &Path,
output_path: &Path,
key: &[u8],
base_nonce: &[u8],
chunk_size: Option<usize>,
) -> Result<u32, CryptoError> {
let input_path = input_path.to_owned();
let output_path = output_path.to_owned();
let key = key.to_vec();
let base_nonce = base_nonce.to_vec();
tokio::task::spawn_blocking(move || {
encrypt_stream_chunked(&input_path, &output_path, &key, &base_nonce, chunk_size)
})
.await
.map_err(|e| CryptoError::Io(std::io::Error::new(std::io::ErrorKind::Other, e)))?
}
pub async fn decrypt_stream_chunked_async(
input_path: &Path,
output_path: &Path,
key: &[u8],
base_nonce: &[u8],
) -> Result<u32, CryptoError> {
let input_path = input_path.to_owned();
let output_path = output_path.to_owned();
let key = key.to_vec();
let base_nonce = base_nonce.to_vec();
tokio::task::spawn_blocking(move || {
decrypt_stream_chunked(&input_path, &output_path, &key, &base_nonce)
})
.await
.map_err(|e| CryptoError::Io(std::io::Error::new(std::io::ErrorKind::Other, e)))?
}
#[cfg(test)]
mod tests {
use super::*;
use std::io::Write as IoWrite;
#[test]
fn test_encrypt_decrypt_roundtrip() {
let dir = tempfile::tempdir().unwrap();
let input = dir.path().join("input.bin");
let encrypted = dir.path().join("encrypted.bin");
let decrypted = dir.path().join("decrypted.bin");
let data = b"Hello, this is a test of AES-256-GCM chunked encryption!";
std::fs::File::create(&input)
.unwrap()
.write_all(data)
.unwrap();
let key = [0x42u8; 32];
let nonce = [0x01u8; 12];
let chunks = encrypt_stream_chunked(&input, &encrypted, &key, &nonce, Some(16)).unwrap();
assert!(chunks > 0);
let chunks2 = decrypt_stream_chunked(&encrypted, &decrypted, &key, &nonce).unwrap();
assert_eq!(chunks, chunks2);
let result = std::fs::read(&decrypted).unwrap();
assert_eq!(result, data);
}
#[test]
fn test_invalid_key_size() {
let dir = tempfile::tempdir().unwrap();
let input = dir.path().join("input.bin");
std::fs::File::create(&input)
.unwrap()
.write_all(b"test")
.unwrap();
let result = encrypt_stream_chunked(
&input,
&dir.path().join("out"),
&[0u8; 16],
&[0u8; 12],
None,
);
assert!(matches!(result, Err(CryptoError::InvalidKeySize(16))));
}
#[test]
fn test_wrong_key_fails_decrypt() {
let dir = tempfile::tempdir().unwrap();
let input = dir.path().join("input.bin");
let encrypted = dir.path().join("encrypted.bin");
let decrypted = dir.path().join("decrypted.bin");
std::fs::File::create(&input)
.unwrap()
.write_all(b"secret data")
.unwrap();
let key = [0x42u8; 32];
let nonce = [0x01u8; 12];
encrypt_stream_chunked(&input, &encrypted, &key, &nonce, None).unwrap();
let wrong_key = [0x43u8; 32];
let result = decrypt_stream_chunked(&encrypted, &decrypted, &wrong_key, &nonce);
assert!(matches!(result, Err(CryptoError::DecryptionFailed(_))));
}
}


@@ -0,0 +1,404 @@
use base64::engine::general_purpose::STANDARD as B64;
use base64::Engine;
use rand::RngCore;
use std::collections::HashMap;
use std::path::Path;
use crate::aes_gcm::{decrypt_stream_chunked, encrypt_stream_chunked, CryptoError};
use crate::kms::KmsService;
#[derive(Debug, Clone, PartialEq)]
pub enum SseAlgorithm {
Aes256,
AwsKms,
CustomerProvided,
}
impl SseAlgorithm {
pub fn as_str(&self) -> &'static str {
match self {
SseAlgorithm::Aes256 => "AES256",
SseAlgorithm::AwsKms => "aws:kms",
SseAlgorithm::CustomerProvided => "AES256",
}
}
}
#[derive(Debug, Clone)]
pub struct EncryptionContext {
pub algorithm: SseAlgorithm,
pub kms_key_id: Option<String>,
pub customer_key: Option<Vec<u8>>,
}
#[derive(Debug, Clone)]
pub struct EncryptionMetadata {
pub algorithm: String,
pub nonce: String,
pub encrypted_data_key: Option<String>,
pub kms_key_id: Option<String>,
}
impl EncryptionMetadata {
pub fn to_metadata_map(&self) -> HashMap<String, String> {
let mut map = HashMap::new();
map.insert(
"x-amz-server-side-encryption".to_string(),
self.algorithm.clone(),
);
map.insert("x-amz-encryption-nonce".to_string(), self.nonce.clone());
if let Some(ref dk) = self.encrypted_data_key {
map.insert("x-amz-encrypted-data-key".to_string(), dk.clone());
}
if let Some(ref kid) = self.kms_key_id {
map.insert("x-amz-encryption-key-id".to_string(), kid.clone());
}
map
}
pub fn from_metadata(meta: &HashMap<String, String>) -> Option<Self> {
let algorithm = meta.get("x-amz-server-side-encryption")?;
let nonce = meta.get("x-amz-encryption-nonce")?;
Some(Self {
algorithm: algorithm.clone(),
nonce: nonce.clone(),
encrypted_data_key: meta.get("x-amz-encrypted-data-key").cloned(),
kms_key_id: meta.get("x-amz-encryption-key-id").cloned(),
})
}
pub fn is_encrypted(meta: &HashMap<String, String>) -> bool {
meta.contains_key("x-amz-server-side-encryption")
}
pub fn clean_metadata(meta: &mut HashMap<String, String>) {
meta.remove("x-amz-server-side-encryption");
meta.remove("x-amz-encryption-nonce");
meta.remove("x-amz-encrypted-data-key");
meta.remove("x-amz-encryption-key-id");
}
}
pub struct EncryptionService {
master_key: [u8; 32],
kms: Option<std::sync::Arc<KmsService>>,
config: EncryptionConfig,
}
#[derive(Debug, Clone, Copy)]
pub struct EncryptionConfig {
pub chunk_size: usize,
}
impl Default for EncryptionConfig {
fn default() -> Self {
Self { chunk_size: 65_536 }
}
}
impl EncryptionService {
pub fn new(master_key: [u8; 32], kms: Option<std::sync::Arc<KmsService>>) -> Self {
Self::with_config(master_key, kms, EncryptionConfig::default())
}
pub fn with_config(
master_key: [u8; 32],
kms: Option<std::sync::Arc<KmsService>>,
config: EncryptionConfig,
) -> Self {
Self {
master_key,
kms,
config,
}
}
pub fn generate_data_key(&self) -> ([u8; 32], [u8; 12]) {
let mut data_key = [0u8; 32];
let mut nonce = [0u8; 12];
rand::thread_rng().fill_bytes(&mut data_key);
rand::thread_rng().fill_bytes(&mut nonce);
(data_key, nonce)
}
pub fn wrap_data_key(&self, data_key: &[u8; 32]) -> Result<String, CryptoError> {
use aes_gcm::aead::Aead;
use aes_gcm::{Aes256Gcm, KeyInit, Nonce};
let cipher = Aes256Gcm::new((&self.master_key).into());
let mut nonce_bytes = [0u8; 12];
rand::thread_rng().fill_bytes(&mut nonce_bytes);
let nonce = Nonce::from_slice(&nonce_bytes);
let encrypted = cipher
.encrypt(nonce, data_key.as_slice())
.map_err(|e| CryptoError::EncryptionFailed(e.to_string()))?;
let mut combined = Vec::with_capacity(12 + encrypted.len());
combined.extend_from_slice(&nonce_bytes);
combined.extend_from_slice(&encrypted);
Ok(B64.encode(&combined))
}
pub fn unwrap_data_key(&self, wrapped_b64: &str) -> Result<[u8; 32], CryptoError> {
use aes_gcm::aead::Aead;
use aes_gcm::{Aes256Gcm, KeyInit, Nonce};
let combined = B64.decode(wrapped_b64).map_err(|e| {
CryptoError::EncryptionFailed(format!("Bad wrapped key encoding: {}", e))
})?;
if combined.len() < 12 {
return Err(CryptoError::EncryptionFailed(
"Wrapped key too short".to_string(),
));
}
let (nonce_bytes, ciphertext) = combined.split_at(12);
let cipher = Aes256Gcm::new((&self.master_key).into());
let nonce = Nonce::from_slice(nonce_bytes);
let plaintext = cipher
.decrypt(nonce, ciphertext)
.map_err(|_| CryptoError::DecryptionFailed(0))?;
if plaintext.len() != 32 {
return Err(CryptoError::InvalidKeySize(plaintext.len()));
}
let mut key = [0u8; 32];
key.copy_from_slice(&plaintext);
Ok(key)
}
pub async fn encrypt_object(
&self,
input_path: &Path,
output_path: &Path,
ctx: &EncryptionContext,
) -> Result<EncryptionMetadata, CryptoError> {
let (data_key, nonce) = self.generate_data_key();
let (encrypted_data_key, kms_key_id) = match ctx.algorithm {
SseAlgorithm::Aes256 => {
let wrapped = self.wrap_data_key(&data_key)?;
(Some(wrapped), None)
}
SseAlgorithm::AwsKms => {
let kms = self
.kms
.as_ref()
.ok_or_else(|| CryptoError::EncryptionFailed("KMS not available".into()))?;
let kid = ctx
.kms_key_id
.as_ref()
.ok_or_else(|| CryptoError::EncryptionFailed("No KMS key ID".into()))?;
let ciphertext = kms.encrypt_data(kid, &data_key).await?;
(Some(B64.encode(&ciphertext)), Some(kid.clone()))
}
SseAlgorithm::CustomerProvided => (None, None),
};
let actual_key = if ctx.algorithm == SseAlgorithm::CustomerProvided {
let ck = ctx
.customer_key
.as_ref()
.ok_or_else(|| CryptoError::EncryptionFailed("No customer key provided".into()))?;
if ck.len() != 32 {
return Err(CryptoError::InvalidKeySize(ck.len()));
}
let mut k = [0u8; 32];
k.copy_from_slice(ck);
k
} else {
data_key
};
let ip = input_path.to_owned();
let op = output_path.to_owned();
let ak = actual_key;
let n = nonce;
let chunk_size = self.config.chunk_size;
tokio::task::spawn_blocking(move || {
encrypt_stream_chunked(&ip, &op, &ak, &n, Some(chunk_size))
})
.await
.map_err(|e| CryptoError::Io(std::io::Error::new(std::io::ErrorKind::Other, e)))??;
Ok(EncryptionMetadata {
algorithm: ctx.algorithm.as_str().to_string(),
nonce: B64.encode(nonce),
encrypted_data_key,
kms_key_id,
})
}
pub async fn decrypt_object(
&self,
input_path: &Path,
output_path: &Path,
enc_meta: &EncryptionMetadata,
customer_key: Option<&[u8]>,
) -> Result<(), CryptoError> {
let nonce_bytes = B64
.decode(&enc_meta.nonce)
.map_err(|e| CryptoError::EncryptionFailed(format!("Bad nonce encoding: {}", e)))?;
if nonce_bytes.len() != 12 {
return Err(CryptoError::InvalidNonceSize(nonce_bytes.len()));
}
let data_key: [u8; 32] = if let Some(ck) = customer_key {
if ck.len() != 32 {
return Err(CryptoError::InvalidKeySize(ck.len()));
}
let mut k = [0u8; 32];
k.copy_from_slice(ck);
k
} else if enc_meta.algorithm == "aws:kms" {
let kms = self
.kms
.as_ref()
.ok_or_else(|| CryptoError::EncryptionFailed("KMS not available".into()))?;
let kid = enc_meta
.kms_key_id
.as_ref()
.ok_or_else(|| CryptoError::EncryptionFailed("No KMS key ID in metadata".into()))?;
let encrypted_dk = enc_meta.encrypted_data_key.as_ref().ok_or_else(|| {
CryptoError::EncryptionFailed("No encrypted data key in metadata".into())
})?;
let ct = B64.decode(encrypted_dk).map_err(|e| {
CryptoError::EncryptionFailed(format!("Bad data key encoding: {}", e))
})?;
let dk = kms.decrypt_data(kid, &ct).await?;
if dk.len() != 32 {
return Err(CryptoError::InvalidKeySize(dk.len()));
}
let mut k = [0u8; 32];
k.copy_from_slice(&dk);
k
} else {
let wrapped = enc_meta.encrypted_data_key.as_ref().ok_or_else(|| {
CryptoError::EncryptionFailed("No encrypted data key in metadata".into())
})?;
self.unwrap_data_key(wrapped)?
};
let ip = input_path.to_owned();
let op = output_path.to_owned();
let nb: [u8; 12] = nonce_bytes.try_into().unwrap();
tokio::task::spawn_blocking(move || decrypt_stream_chunked(&ip, &op, &data_key, &nb))
.await
.map_err(|e| CryptoError::Io(std::io::Error::new(std::io::ErrorKind::Other, e)))??;
Ok(())
}
}
#[cfg(test)]
mod tests {
use super::*;
use std::io::Write;
fn test_master_key() -> [u8; 32] {
[0x42u8; 32]
}
#[test]
fn test_wrap_unwrap_data_key() {
let svc = EncryptionService::new(test_master_key(), None);
let dk = [0xAAu8; 32];
let wrapped = svc.wrap_data_key(&dk).unwrap();
let unwrapped = svc.unwrap_data_key(&wrapped).unwrap();
assert_eq!(dk, unwrapped);
}
#[tokio::test]
async fn test_encrypt_decrypt_object_sse_s3() {
let dir = tempfile::tempdir().unwrap();
let input = dir.path().join("plain.bin");
let encrypted = dir.path().join("enc.bin");
let decrypted = dir.path().join("dec.bin");
let data = b"SSE-S3 encrypted content for testing!";
std::fs::File::create(&input)
.unwrap()
.write_all(data)
.unwrap();
let svc = EncryptionService::new(test_master_key(), None);
let ctx = EncryptionContext {
algorithm: SseAlgorithm::Aes256,
kms_key_id: None,
customer_key: None,
};
let meta = svc.encrypt_object(&input, &encrypted, &ctx).await.unwrap();
assert_eq!(meta.algorithm, "AES256");
assert!(meta.encrypted_data_key.is_some());
svc.decrypt_object(&encrypted, &decrypted, &meta, None)
.await
.unwrap();
let result = std::fs::read(&decrypted).unwrap();
assert_eq!(result, data);
}
#[tokio::test]
async fn test_encrypt_decrypt_object_sse_c() {
let dir = tempfile::tempdir().unwrap();
let input = dir.path().join("plain.bin");
let encrypted = dir.path().join("enc.bin");
let decrypted = dir.path().join("dec.bin");
let data = b"SSE-C encrypted content!";
std::fs::File::create(&input)
.unwrap()
.write_all(data)
.unwrap();
let customer_key = [0xBBu8; 32];
let svc = EncryptionService::new(test_master_key(), None);
let ctx = EncryptionContext {
algorithm: SseAlgorithm::CustomerProvided,
kms_key_id: None,
customer_key: Some(customer_key.to_vec()),
};
let meta = svc.encrypt_object(&input, &encrypted, &ctx).await.unwrap();
assert!(meta.encrypted_data_key.is_none());
svc.decrypt_object(&encrypted, &decrypted, &meta, Some(&customer_key))
.await
.unwrap();
let result = std::fs::read(&decrypted).unwrap();
assert_eq!(result, data);
}
#[test]
fn test_encryption_metadata_roundtrip() {
let meta = EncryptionMetadata {
algorithm: "AES256".to_string(),
nonce: "dGVzdG5vbmNlMTI=".to_string(),
encrypted_data_key: Some("c29tZWtleQ==".to_string()),
kms_key_id: None,
};
let map = meta.to_metadata_map();
let restored = EncryptionMetadata::from_metadata(&map).unwrap();
assert_eq!(restored.algorithm, "AES256");
assert_eq!(restored.nonce, meta.nonce);
assert_eq!(restored.encrypted_data_key, meta.encrypted_data_key);
}
#[test]
fn test_is_encrypted() {
let mut meta = HashMap::new();
assert!(!EncryptionMetadata::is_encrypted(&meta));
meta.insert(
"x-amz-server-side-encryption".to_string(),
"AES256".to_string(),
);
assert!(EncryptionMetadata::is_encrypted(&meta));
}
}
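
Note on the wrapped-key layout: `unwrap_data_key` above expects the Base64-decoded blob to be a 12-byte nonce followed by the AES-GCM ciphertext. The following is a minimal, std-only sketch of that framing with the encryption and Base64 steps left out. The names `NONCE_LEN`, `frame`, and `unframe` are hypothetical and do not appear in the crate.

```rust
/// Length of the nonce that prefixes every wrapped blob (assumed from the
/// `split_at(12)` call in `unwrap_data_key`).
const NONCE_LEN: usize = 12;

/// Concatenate `nonce || ciphertext` into a single buffer.
/// Hypothetical helper; Base64 encoding is intentionally omitted.
fn frame(nonce: &[u8; NONCE_LEN], ciphertext: &[u8]) -> Vec<u8> {
    let mut combined = Vec::with_capacity(NONCE_LEN + ciphertext.len());
    combined.extend_from_slice(nonce);
    combined.extend_from_slice(ciphertext);
    combined
}

/// Split a framed blob back into `(nonce, ciphertext)`.
/// Input shorter than the nonce is rejected before splitting, so the
/// `split_at` call cannot panic.
fn unframe(combined: &[u8]) -> Result<(&[u8], &[u8]), String> {
    if combined.len() < NONCE_LEN {
        return Err("wrapped key too short".to_string());
    }
    Ok(combined.split_at(NONCE_LEN))
}
```

Checking the length before `split_at` matters because `split_at` panics when the index exceeds the slice length; the guard turns malformed metadata into an error instead.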


@@ -0,0 +1,138 @@
use md5::{Digest, Md5};
use sha2::Sha256;
use std::io::Read;
use std::path::Path;
// Read buffer size (64 KiB) shared by the streaming hash helpers below.
const CHUNK_SIZE: usize = 65536;
pub fn md5_file(path: &Path) -> std::io::Result<String> {
let mut file = std::fs::File::open(path)?;
let mut hasher = Md5::new();
let mut buf = vec![0u8; CHUNK_SIZE];
loop {
let n = file.read(&mut buf)?;
if n == 0 {
break;
}
hasher.update(&buf[..n]);
}
Ok(format!("{:x}", hasher.finalize()))
}
pub fn md5_bytes(data: &[u8]) -> String {
let mut hasher = Md5::new();
hasher.update(data);
format!("{:x}", hasher.finalize())
}
pub fn sha256_file(path: &Path) -> std::io::Result<String> {
let mut file = std::fs::File::open(path)?;
let mut hasher = Sha256::new();
let mut buf = vec![0u8; CHUNK_SIZE];
loop {
let n = file.read(&mut buf)?;
if n == 0 {
break;
}
hasher.update(&buf[..n]);
}
Ok(format!("{:x}", hasher.finalize()))
}
pub fn sha256_bytes(data: &[u8]) -> String {
let mut hasher = Sha256::new();
hasher.update(data);
format!("{:x}", hasher.finalize())
}
/// Computes the MD5 and SHA-256 hex digests of a file in a single read pass.
pub fn md5_sha256_file(path: &Path) -> std::io::Result<(String, String)> {
let mut file = std::fs::File::open(path)?;
let mut md5_hasher = Md5::new();
let mut sha_hasher = Sha256::new();
let mut buf = vec![0u8; CHUNK_SIZE];
loop {
let n = file.read(&mut buf)?;
if n == 0 {
break;
}
md5_hasher.update(&buf[..n]);
sha_hasher.update(&buf[..n]);
}
Ok((
format!("{:x}", md5_hasher.finalize()),
format!("{:x}", sha_hasher.finalize()),
))
}
pub async fn md5_file_async(path: &Path) -> std::io::Result<String> {
let path = path.to_owned();
tokio::task::spawn_blocking(move || md5_file(&path))
.await
.map_err(|e| std::io::Error::new(std::io::ErrorKind::Other, e))?
}
pub async fn sha256_file_async(path: &Path) -> std::io::Result<String> {
let path = path.to_owned();
tokio::task::spawn_blocking(move || sha256_file(&path))
.await
.map_err(|e| std::io::Error::new(std::io::ErrorKind::Other, e))?
}
pub async fn md5_sha256_file_async(path: &Path) -> std::io::Result<(String, String)> {
let path = path.to_owned();
tokio::task::spawn_blocking(move || md5_sha256_file(&path))
.await
.map_err(|e| std::io::Error::new(std::io::ErrorKind::Other, e))?
}
#[cfg(test)]
mod tests {
use super::*;
use std::io::Write;
#[test]
fn test_md5_bytes() {
assert_eq!(md5_bytes(b""), "d41d8cd98f00b204e9800998ecf8427e");
assert_eq!(md5_bytes(b"hello"), "5d41402abc4b2a76b9719d911017c592");
}
#[test]
fn test_sha256_bytes() {
let hash = sha256_bytes(b"hello");
assert_eq!(
hash,
"2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824"
);
}
#[test]
fn test_md5_file() {
let mut tmp = tempfile::NamedTempFile::new().unwrap();
tmp.write_all(b"hello").unwrap();
tmp.flush().unwrap();
let hash = md5_file(tmp.path()).unwrap();
assert_eq!(hash, "5d41402abc4b2a76b9719d911017c592");
}
#[test]
fn test_md5_sha256_file() {
let mut tmp = tempfile::NamedTempFile::new().unwrap();
tmp.write_all(b"hello").unwrap();
tmp.flush().unwrap();
let (md5, sha) = md5_sha256_file(tmp.path()).unwrap();
assert_eq!(md5, "5d41402abc4b2a76b9719d911017c592");
assert_eq!(
sha,
"2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824"
);
}
#[tokio::test]
async fn test_md5_file_async() {
let mut tmp = tempfile::NamedTempFile::new().unwrap();
tmp.write_all(b"hello").unwrap();
tmp.flush().unwrap();
let hash = md5_file_async(tmp.path()).await.unwrap();
assert_eq!(hash, "5d41402abc4b2a76b9719d911017c592");
}
}
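
Note on the streaming pattern: every file hasher above reads into a fixed buffer and feeds each filled slice to an incremental hasher, so memory use stays constant regardless of file size. The sketch below shows the same loop over any `Read` source. It uses FNV-1a purely as a std-only stand-in for `Md5`/`Sha256`; `hash_reader`, `fnv1a_update`, and `FNV_OFFSET` are hypothetical names, not part of the crate.

```rust
use std::io::Read;

const CHUNK_SIZE: usize = 65536;

/// FNV-1a 64-bit offset basis (the hash's initial state).
const FNV_OFFSET: u64 = 0xcbf2_9ce4_8422_2325;

/// Fold `data` into an FNV-1a state. Stand-in for `hasher.update(..)`;
/// FNV-1a is NOT cryptographic and is used here only to avoid dependencies.
fn fnv1a_update(mut state: u64, data: &[u8]) -> u64 {
    for &b in data {
        state ^= u64::from(b);
        state = state.wrapping_mul(0x0000_0100_0000_01b3);
    }
    state
}

/// Hash everything from `reader` in CHUNK_SIZE reads and return lowercase hex.
fn hash_reader<R: Read>(mut reader: R) -> std::io::Result<String> {
    let mut state = FNV_OFFSET;
    let mut buf = vec![0u8; CHUNK_SIZE];
    loop {
        let n = reader.read(&mut buf)?;
        if n == 0 {
            break; // a zero-length read signals end of input
        }
        // Only the first `n` bytes of the buffer are valid for this read.
        state = fnv1a_update(state, &buf[..n]);
    }
    Ok(format!("{:016x}", state))
}
```

Because the state is updated incrementally, hashing a source in chunks gives the same result as hashing the whole byte slice at once, which is the property the file helpers rely on.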


@@ -0,0 +1,451 @@
use aes_gcm::aead::Aead;
use aes_gcm::{Aes256Gcm, KeyInit, Nonce};
use base64::engine::general_purpose::STANDARD as B64;
use base64::Engine;
use chrono::{DateTime, Utc};
use rand::RngCore;
use serde::{Deserialize, Serialize};
use std::path::{Path, PathBuf};
use std::sync::Arc;
use tokio::sync::RwLock;
use crate::aes_gcm::CryptoError;
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct KmsKey {
#[serde(rename = "KeyId")]
pub key_id: String,
#[serde(rename = "Arn")]
pub arn: String,
#[serde(rename = "Description")]
pub description: String,
#[serde(rename = "CreationDate")]
pub creation_date: DateTime<Utc>,
#[serde(rename = "Enabled")]
pub enabled: bool,
#[serde(rename = "KeyState")]
pub key_state: String,
#[serde(rename = "KeyUsage")]
pub key_usage: String,
#[serde(rename = "KeySpec")]
pub key_spec: String,
#[serde(rename = "EncryptedKeyMaterial")]
pub encrypted_key_material: String,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
struct KmsStore {
keys: Vec<KmsKey>,
}
pub struct KmsService {
keys_path: PathBuf,
master_key: Arc<RwLock<[u8; 32]>>,
keys: Arc<RwLock<Vec<KmsKey>>>,
}
impl KmsService {
pub async fn new(keys_dir: &Path) -> Result<Self, CryptoError> {
std::fs::create_dir_all(keys_dir).map_err(CryptoError::Io)?;
let keys_path = keys_dir.join("kms_keys.json");
let master_key = Self::load_or_create_master_key(&keys_dir.join("kms_master.key"))?;
let keys = if keys_path.exists() {
let data = std::fs::read_to_string(&keys_path).map_err(CryptoError::Io)?;
let store: KmsStore = serde_json::from_str(&data)
.map_err(|e| CryptoError::EncryptionFailed(format!("Bad KMS store: {}", e)))?;
store.keys
} else {
Vec::new()
};
Ok(Self {
keys_path,
master_key: Arc::new(RwLock::new(master_key)),
keys: Arc::new(RwLock::new(keys)),
})
}
fn load_or_create_master_key(path: &Path) -> Result<[u8; 32], CryptoError> {
if path.exists() {
let encoded = std::fs::read_to_string(path).map_err(CryptoError::Io)?;
let decoded = B64.decode(encoded.trim()).map_err(|e| {
CryptoError::EncryptionFailed(format!("Bad master key encoding: {}", e))
})?;
if decoded.len() != 32 {
return Err(CryptoError::InvalidKeySize(decoded.len()));
}
let mut key = [0u8; 32];
key.copy_from_slice(&decoded);
Ok(key)
} else {
let mut key = [0u8; 32];
rand::thread_rng().fill_bytes(&mut key);
let encoded = B64.encode(key);
std::fs::write(path, &encoded).map_err(CryptoError::Io)?;
Ok(key)
}
}
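/// Encrypts key material with AES-256-GCM under `master_key` using a fresh
/// random 12-byte nonce; returns Base64 of `nonce || ciphertext`.
fn encrypt_key_material(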
master_key: &[u8; 32],
plaintext_key: &[u8],
) -> Result<String, CryptoError> {
let cipher = Aes256Gcm::new(master_key.into());
let mut nonce_bytes = [0u8; 12];
rand::thread_rng().fill_bytes(&mut nonce_bytes);
let nonce = Nonce::from_slice(&nonce_bytes);
let ciphertext = cipher
.encrypt(nonce, plaintext_key)
.map_err(|e| CryptoError::EncryptionFailed(e.to_string()))?;
let mut combined = Vec::with_capacity(12 + ciphertext.len());
combined.extend_from_slice(&nonce_bytes);
combined.extend_from_slice(&ciphertext);
Ok(B64.encode(&combined))
}
fn decrypt_key_material(
master_key: &[u8; 32],
encrypted_b64: &str,
) -> Result<Vec<u8>, CryptoError> {
let combined = B64.decode(encrypted_b64).map_err(|e| {
CryptoError::EncryptionFailed(format!("Bad key material encoding: {}", e))
})?;
if combined.len() < 12 {
return Err(CryptoError::EncryptionFailed(
"Encrypted key material too short".to_string(),
));
}
let (nonce_bytes, ciphertext) = combined.split_at(12);
let cipher = Aes256Gcm::new(master_key.into());
let nonce = Nonce::from_slice(nonce_bytes);
cipher
.decrypt(nonce, ciphertext)
.map_err(|_| CryptoError::DecryptionFailed(0))
}
async fn save(&self) -> Result<(), CryptoError> {
let keys = self.keys.read().await;
let store = KmsStore { keys: keys.clone() };
let json = serde_json::to_string_pretty(&store)
.map_err(|e| CryptoError::EncryptionFailed(e.to_string()))?;
std::fs::write(&self.keys_path, json).map_err(CryptoError::Io)?;
Ok(())
}
pub async fn create_key(&self, description: &str) -> Result<KmsKey, CryptoError> {
let key_id = uuid::Uuid::new_v4().to_string();
let arn = format!("arn:aws:kms:local:000000000000:key/{}", key_id);
let mut plaintext_key = [0u8; 32];
rand::thread_rng().fill_bytes(&mut plaintext_key);
let master = self.master_key.read().await;
let encrypted = Self::encrypt_key_material(&master, &plaintext_key)?;
let kms_key = KmsKey {
key_id: key_id.clone(),
arn,
description: description.to_string(),
creation_date: Utc::now(),
enabled: true,
key_state: "Enabled".to_string(),
key_usage: "ENCRYPT_DECRYPT".to_string(),
key_spec: "SYMMETRIC_DEFAULT".to_string(),
encrypted_key_material: encrypted,
};
self.keys.write().await.push(kms_key.clone());
self.save().await?;
Ok(kms_key)
}
pub async fn list_keys(&self) -> Vec<KmsKey> {
self.keys.read().await.clone()
}
pub async fn get_key(&self, key_id: &str) -> Option<KmsKey> {
let keys = self.keys.read().await;
keys.iter()
.find(|k| k.key_id == key_id || k.arn == key_id)
.cloned()
}
pub async fn delete_key(&self, key_id: &str) -> Result<bool, CryptoError> {
let mut keys = self.keys.write().await;
let len_before = keys.len();
keys.retain(|k| k.key_id != key_id && k.arn != key_id);
let removed = keys.len() < len_before;
drop(keys);
if removed {
self.save().await?;
}
Ok(removed)
}
pub async fn enable_key(&self, key_id: &str) -> Result<bool, CryptoError> {
let mut keys = self.keys.write().await;
// Accept either the key ID or the ARN, matching get_key and delete_key.
if let Some(key) = keys
.iter_mut()
.find(|k| k.key_id == key_id || k.arn == key_id)
{
key.enabled = true;
key.key_state = "Enabled".to_string();
drop(keys);
self.save().await?;
Ok(true)
} else {
Ok(false)
}
}
pub async fn disable_key(&self, key_id: &str) -> Result<bool, CryptoError> {
let mut keys = self.keys.write().await;
// Accept either the key ID or the ARN, matching get_key and delete_key.
if let Some(key) = keys
.iter_mut()
.find(|k| k.key_id == key_id || k.arn == key_id)
{
key.enabled = false;
key.key_state = "Disabled".to_string();
drop(keys);
self.save().await?;
Ok(true)
} else {
Ok(false)
}
}
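/// Returns the decrypted key material of an enabled KMS key, looked up by key
/// ID or ARN. Fails if the key is missing or disabled.
pub async fn decrypt_data_key(&self, key_id: &str) -> Result<Vec<u8>, CryptoError> {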
let keys = self.keys.read().await;
let key = keys
.iter()
.find(|k| k.key_id == key_id || k.arn == key_id)
.ok_or_else(|| CryptoError::EncryptionFailed("KMS key not found".to_string()))?;
if !key.enabled {
return Err(CryptoError::EncryptionFailed(
"KMS key is disabled".to_string(),
));
}
let master = self.master_key.read().await;
Self::decrypt_key_material(&master, &key.encrypted_key_material)
}
pub async fn encrypt_data(
&self,
key_id: &str,
plaintext: &[u8],
) -> Result<Vec<u8>, CryptoError> {
let data_key = self.decrypt_data_key(key_id).await?;
if data_key.len() != 32 {
return Err(CryptoError::InvalidKeySize(data_key.len()));
}
let key_arr: [u8; 32] = data_key.try_into().unwrap();
let cipher = Aes256Gcm::new(&key_arr.into());
let mut nonce_bytes = [0u8; 12];
rand::thread_rng().fill_bytes(&mut nonce_bytes);
let nonce = Nonce::from_slice(&nonce_bytes);
let ciphertext = cipher
.encrypt(nonce, plaintext)
.map_err(|e| CryptoError::EncryptionFailed(e.to_string()))?;
let mut result = Vec::with_capacity(12 + ciphertext.len());
result.extend_from_slice(&nonce_bytes);
result.extend_from_slice(&ciphertext);
Ok(result)
}
pub async fn decrypt_data(
&self,
key_id: &str,
ciphertext: &[u8],
) -> Result<Vec<u8>, CryptoError> {
if ciphertext.len() < 12 {
return Err(CryptoError::EncryptionFailed(
"Ciphertext too short".to_string(),
));
}
let data_key = self.decrypt_data_key(key_id).await?;
if data_key.len() != 32 {
return Err(CryptoError::InvalidKeySize(data_key.len()));
}
let key_arr: [u8; 32] = data_key.try_into().unwrap();
let (nonce_bytes, ct) = ciphertext.split_at(12);
let cipher = Aes256Gcm::new(&key_arr.into());
let nonce = Nonce::from_slice(nonce_bytes);
cipher
.decrypt(nonce, ct)
.map_err(|_| CryptoError::DecryptionFailed(0))
}
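/// Generates `num_bytes` of random data-key material and returns
/// `(plaintext, wrapped)`, where `wrapped` is `nonce || ciphertext` encrypted
/// under the KMS key.
pub async fn generate_data_key(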
&self,
key_id: &str,
num_bytes: usize,
) -> Result<(Vec<u8>, Vec<u8>), CryptoError> {
let kms_key = self.decrypt_data_key(key_id).await?;
if kms_key.len() != 32 {
return Err(CryptoError::InvalidKeySize(kms_key.len()));
}
let mut plaintext_key = vec![0u8; num_bytes];
rand::thread_rng().fill_bytes(&mut plaintext_key);
let key_arr: [u8; 32] = kms_key.try_into().unwrap();
let cipher = Aes256Gcm::new(&key_arr.into());
let mut nonce_bytes = [0u8; 12];
rand::thread_rng().fill_bytes(&mut nonce_bytes);
let nonce = Nonce::from_slice(&nonce_bytes);
let encrypted = cipher
.encrypt(nonce, plaintext_key.as_slice())
.map_err(|e| CryptoError::EncryptionFailed(e.to_string()))?;
let mut wrapped = Vec::with_capacity(12 + encrypted.len());
wrapped.extend_from_slice(&nonce_bytes);
wrapped.extend_from_slice(&encrypted);
Ok((plaintext_key, wrapped))
}
}
pub async fn load_or_create_master_key(keys_dir: &Path) -> Result<[u8; 32], CryptoError> {
std::fs::create_dir_all(keys_dir).map_err(CryptoError::Io)?;
let path = keys_dir.join("master.key");
if path.exists() {
let encoded = std::fs::read_to_string(&path).map_err(CryptoError::Io)?;
let decoded = B64.decode(encoded.trim()).map_err(|e| {
CryptoError::EncryptionFailed(format!("Bad master key encoding: {}", e))
})?;
if decoded.len() != 32 {
return Err(CryptoError::InvalidKeySize(decoded.len()));
}
let mut key = [0u8; 32];
key.copy_from_slice(&decoded);
Ok(key)
} else {
let mut key = [0u8; 32];
rand::thread_rng().fill_bytes(&mut key);
let encoded = B64.encode(key);
std::fs::write(&path, &encoded).map_err(CryptoError::Io)?;
Ok(key)
}
}
#[cfg(test)]
mod tests {
use super::*;
#[tokio::test]
async fn test_create_and_list_keys() {
let dir = tempfile::tempdir().unwrap();
let kms = KmsService::new(dir.path()).await.unwrap();
let key = kms.create_key("test key").await.unwrap();
assert!(key.enabled);
assert_eq!(key.description, "test key");
assert!(!key.key_id.is_empty());
let keys = kms.list_keys().await;
assert_eq!(keys.len(), 1);
assert_eq!(keys[0].key_id, key.key_id);
}
#[tokio::test]
async fn test_enable_disable_key() {
let dir = tempfile::tempdir().unwrap();
let kms = KmsService::new(dir.path()).await.unwrap();
let key = kms.create_key("toggle").await.unwrap();
assert!(key.enabled);
kms.disable_key(&key.key_id).await.unwrap();
let k = kms.get_key(&key.key_id).await.unwrap();
assert!(!k.enabled);
kms.enable_key(&key.key_id).await.unwrap();
let k = kms.get_key(&key.key_id).await.unwrap();
assert!(k.enabled);
}
#[tokio::test]
async fn test_delete_key() {
let dir = tempfile::tempdir().unwrap();
let kms = KmsService::new(dir.path()).await.unwrap();
let key = kms.create_key("doomed").await.unwrap();
assert!(kms.delete_key(&key.key_id).await.unwrap());
assert!(kms.get_key(&key.key_id).await.is_none());
assert_eq!(kms.list_keys().await.len(), 0);
}
#[tokio::test]
async fn test_encrypt_decrypt_data() {
let dir = tempfile::tempdir().unwrap();
let kms = KmsService::new(dir.path()).await.unwrap();
let key = kms.create_key("enc-key").await.unwrap();
let plaintext = b"Hello, KMS!";
let ciphertext = kms.encrypt_data(&key.key_id, plaintext).await.unwrap();
assert_ne!(&ciphertext, plaintext);
let decrypted = kms.decrypt_data(&key.key_id, &ciphertext).await.unwrap();
assert_eq!(decrypted, plaintext);
}
#[tokio::test]
async fn test_generate_data_key() {
let dir = tempfile::tempdir().unwrap();
let kms = KmsService::new(dir.path()).await.unwrap();
let key = kms.create_key("data-key-gen").await.unwrap();
let (plaintext, wrapped) = kms.generate_data_key(&key.key_id, 32).await.unwrap();
assert_eq!(plaintext.len(), 32);
assert!(wrapped.len() > 32);
}
#[tokio::test]
async fn test_disabled_key_cannot_encrypt() {
let dir = tempfile::tempdir().unwrap();
let kms = KmsService::new(dir.path()).await.unwrap();
let key = kms.create_key("disabled").await.unwrap();
kms.disable_key(&key.key_id).await.unwrap();
let result = kms.encrypt_data(&key.key_id, b"test").await;
assert!(result.is_err());
}
#[tokio::test]
async fn test_persistence_across_reload() {
let dir = tempfile::tempdir().unwrap();
let key_id = {
let kms = KmsService::new(dir.path()).await.unwrap();
let key = kms.create_key("persistent").await.unwrap();
key.key_id
};
let kms2 = KmsService::new(dir.path()).await.unwrap();
let key = kms2.get_key(&key_id).await;
assert!(key.is_some());
assert_eq!(key.unwrap().description, "persistent");
}
#[tokio::test]
async fn test_master_key_roundtrip() {
let dir = tempfile::tempdir().unwrap();
let key1 = load_or_create_master_key(dir.path()).await.unwrap();
let key2 = load_or_create_master_key(dir.path()).await.unwrap();
assert_eq!(key1, key2);
}
}
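
Note on key lookup: `decrypt_data_key` resolves a key by either its ID or its ARN and refuses disabled keys, which is why `test_disabled_key_cannot_encrypt` expects an error. A minimal, std-only sketch of that gate follows. `KeyRecord` and `find_usable` are hypothetical names, and the error type is simplified to `String` rather than `CryptoError`.

```rust
/// Trimmed-down stand-in for `KmsKey` (hypothetical; only the fields the
/// lookup needs).
#[derive(Debug, Clone, PartialEq)]
struct KeyRecord {
    key_id: String,
    arn: String,
    enabled: bool,
}

/// Find a key by ID or ARN and reject it if it is disabled.
fn find_usable<'a>(keys: &'a [KeyRecord], id_or_arn: &str) -> Result<&'a KeyRecord, String> {
    let key = keys
        .iter()
        .find(|k| k.key_id == id_or_arn || k.arn == id_or_arn)
        .ok_or_else(|| "KMS key not found".to_string())?;
    if !key.enabled {
        return Err("KMS key is disabled".to_string());
    }
    Ok(key)
}
```

Keeping the enabled check inside the lookup means every caller that needs key material goes through the same gate, rather than each operation re-implementing it.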


@@ -0,0 +1,4 @@
pub mod aes_gcm;
pub mod encryption;
pub mod hashing;
pub mod kms;


@@ -0,0 +1,60 @@
[package]
name = "myfsio-server"
version.workspace = true
edition.workspace = true
[dependencies]
myfsio-common = { path = "../myfsio-common" }
myfsio-auth = { path = "../myfsio-auth" }
myfsio-crypto = { path = "../myfsio-crypto" }
myfsio-storage = { path = "../myfsio-storage" }
myfsio-xml = { path = "../myfsio-xml" }
base64 = { workspace = true }
md-5 = { workspace = true }
axum = { workspace = true }
tokio = { workspace = true }
tower = { workspace = true }
tower-http = { workspace = true }
hyper = { workspace = true }
bytes = { workspace = true }
serde = { workspace = true }
serde_json = { workspace = true }
serde_urlencoded = "0.7"
tracing = { workspace = true }
tracing-subscriber = { workspace = true }
tokio-util = { workspace = true }
tokio-stream = { workspace = true }
chrono = { workspace = true }
uuid = { workspace = true }
futures = { workspace = true }
http-body = "1"
http-body-util = "0.1"
percent-encoding = { workspace = true }
quick-xml = { workspace = true }
mime_guess = "2"
crc32fast = { workspace = true }
sha2 = { workspace = true }
hex = { workspace = true }
duckdb = { workspace = true }
roxmltree = "0.20"
parking_lot = { workspace = true }
regex = "1"
multer = "3"
reqwest = { workspace = true }
aws-sdk-s3 = { workspace = true }
aws-config = { workspace = true }
aws-credential-types = { workspace = true }
aws-smithy-types = { workspace = true }
async-trait = { workspace = true }
rand = "0.8"
tera = { workspace = true }
cookie = { workspace = true }
subtle = { workspace = true }
clap = { workspace = true }
dotenvy = { workspace = true }
sysinfo = "0.32"
aes-gcm = { workspace = true }
[dev-dependencies]
tempfile = "3"
tower = { workspace = true, features = ["util"] }


@@ -0,0 +1,621 @@
use std::net::SocketAddr;
use std::path::PathBuf;
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct RateLimitSetting {
pub max_requests: u32,
pub window_seconds: u64,
}
impl RateLimitSetting {
pub const fn new(max_requests: u32, window_seconds: u64) -> Self {
Self {
max_requests,
window_seconds,
}
}
}
#[derive(Debug, Clone)]
pub struct ServerConfig {
pub bind_addr: SocketAddr,
pub ui_bind_addr: SocketAddr,
pub storage_root: PathBuf,
pub region: String,
pub iam_config_path: PathBuf,
pub sigv4_timestamp_tolerance_secs: u64,
pub presigned_url_min_expiry: u64,
pub presigned_url_max_expiry: u64,
pub secret_key: Option<String>,
pub encryption_enabled: bool,
pub encryption_chunk_size_bytes: usize,
pub kms_enabled: bool,
pub kms_generate_data_key_min_bytes: usize,
pub kms_generate_data_key_max_bytes: usize,
pub gc_enabled: bool,
pub gc_interval_hours: f64,
pub gc_temp_file_max_age_hours: f64,
pub gc_multipart_max_age_days: u64,
pub gc_lock_file_max_age_hours: f64,
pub gc_dry_run: bool,
pub integrity_enabled: bool,
pub metrics_enabled: bool,
pub metrics_history_enabled: bool,
pub metrics_interval_minutes: u64,
pub metrics_retention_hours: u64,
pub metrics_history_interval_minutes: u64,
pub metrics_history_retention_hours: u64,
pub lifecycle_enabled: bool,
pub lifecycle_max_history_per_bucket: usize,
pub website_hosting_enabled: bool,
pub object_key_max_length_bytes: usize,
pub object_tag_limit: usize,
pub object_cache_max_size: usize,
pub bucket_config_cache_ttl_seconds: f64,
pub replication_connect_timeout_secs: u64,
pub replication_read_timeout_secs: u64,
pub replication_max_retries: u32,
pub replication_streaming_threshold_bytes: u64,
pub replication_max_failures_per_bucket: usize,
pub site_sync_enabled: bool,
pub site_sync_interval_secs: u64,
pub site_sync_batch_size: usize,
pub site_sync_connect_timeout_secs: u64,
pub site_sync_read_timeout_secs: u64,
pub site_sync_max_retries: u32,
pub site_sync_clock_skew_tolerance: f64,
pub site_id: Option<String>,
pub site_endpoint: Option<String>,
pub site_region: String,
pub site_priority: i32,
pub api_base_url: String,
pub num_trusted_proxies: usize,
pub allowed_redirect_hosts: Vec<String>,
pub allow_internal_endpoints: bool,
pub cors_origins: Vec<String>,
pub cors_methods: Vec<String>,
pub cors_allow_headers: Vec<String>,
pub cors_expose_headers: Vec<String>,
pub session_lifetime_days: u64,
pub log_level: String,
pub multipart_min_part_size: u64,
pub bulk_delete_max_keys: usize,
pub stream_chunk_size: usize,
pub request_body_timeout_secs: u64,
pub ratelimit_default: RateLimitSetting,
pub ratelimit_list_buckets: RateLimitSetting,
pub ratelimit_bucket_ops: RateLimitSetting,
pub ratelimit_object_ops: RateLimitSetting,
pub ratelimit_head_ops: RateLimitSetting,
pub ratelimit_admin: RateLimitSetting,
pub ratelimit_storage_uri: String,
pub ui_enabled: bool,
pub templates_dir: PathBuf,
pub static_dir: PathBuf,
}
impl ServerConfig {
pub fn from_env() -> Self {
let host = std::env::var("HOST").unwrap_or_else(|_| "127.0.0.1".to_string());
let port: u16 = std::env::var("PORT")
.unwrap_or_else(|_| "5000".to_string())
.parse()
.unwrap_or(5000);
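// HOST must be a literal IP address; hostnames such as "localhost" do not parse.
let host_ip: std::net::IpAddr = host
.parse()
.expect("HOST must be a valid IP address, e.g. 127.0.0.1");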
let bind_addr = SocketAddr::new(host_ip, port);
let ui_port: u16 = std::env::var("UI_PORT")
.unwrap_or_else(|_| "5100".to_string())
.parse()
.unwrap_or(5100);
let storage_root = std::env::var("STORAGE_ROOT").unwrap_or_else(|_| "./data".to_string());
let region = std::env::var("AWS_REGION").unwrap_or_else(|_| "us-east-1".to_string());
let storage_path = PathBuf::from(&storage_root);
let iam_config_path = std::env::var("IAM_CONFIG")
.map(PathBuf::from)
.unwrap_or_else(|_| {
storage_path
.join(".myfsio.sys")
.join("config")
.join("iam.json")
});
let sigv4_timestamp_tolerance_secs: u64 =
std::env::var("SIGV4_TIMESTAMP_TOLERANCE_SECONDS")
.unwrap_or_else(|_| "900".to_string())
.parse()
.unwrap_or(900);
let presigned_url_min_expiry: u64 = std::env::var("PRESIGNED_URL_MIN_EXPIRY_SECONDS")
.unwrap_or_else(|_| "1".to_string())
.parse()
.unwrap_or(1);
let presigned_url_max_expiry: u64 = std::env::var("PRESIGNED_URL_MAX_EXPIRY_SECONDS")
.unwrap_or_else(|_| "604800".to_string())
.parse()
.unwrap_or(604800);
let secret_key = {
let env_key = std::env::var("SECRET_KEY").ok();
match env_key {
Some(k) if !k.is_empty() && k != "dev-secret-key" => Some(k),
_ => {
let secret_file = storage_path
.join(".myfsio.sys")
.join("config")
.join(".secret");
std::fs::read_to_string(&secret_file)
.ok()
.map(|s| s.trim().to_string())
}
}
};
let encryption_enabled = parse_bool_env("ENCRYPTION_ENABLED", false);
let encryption_chunk_size_bytes = parse_usize_env("ENCRYPTION_CHUNK_SIZE_BYTES", 65_536);
let kms_enabled = parse_bool_env("KMS_ENABLED", false);
let kms_generate_data_key_min_bytes = parse_usize_env("KMS_GENERATE_DATA_KEY_MIN_BYTES", 1);
let kms_generate_data_key_max_bytes =
parse_usize_env("KMS_GENERATE_DATA_KEY_MAX_BYTES", 1024);
let gc_enabled = parse_bool_env("GC_ENABLED", false);
let gc_interval_hours = parse_f64_env("GC_INTERVAL_HOURS", 6.0);
let gc_temp_file_max_age_hours = parse_f64_env("GC_TEMP_FILE_MAX_AGE_HOURS", 24.0);
let gc_multipart_max_age_days = parse_u64_env("GC_MULTIPART_MAX_AGE_DAYS", 7);
let gc_lock_file_max_age_hours = parse_f64_env("GC_LOCK_FILE_MAX_AGE_HOURS", 1.0);
let gc_dry_run = parse_bool_env("GC_DRY_RUN", false);
let integrity_enabled = parse_bool_env("INTEGRITY_ENABLED", false);
let metrics_enabled = parse_bool_env("OPERATION_METRICS_ENABLED", false);
let metrics_history_enabled = parse_bool_env("METRICS_HISTORY_ENABLED", false);
let metrics_interval_minutes = parse_u64_env("OPERATION_METRICS_INTERVAL_MINUTES", 5);
let metrics_retention_hours = parse_u64_env("OPERATION_METRICS_RETENTION_HOURS", 24);
let metrics_history_interval_minutes = parse_u64_env("METRICS_HISTORY_INTERVAL_MINUTES", 5);
let metrics_history_retention_hours = parse_u64_env("METRICS_HISTORY_RETENTION_HOURS", 24);
let lifecycle_enabled = parse_bool_env("LIFECYCLE_ENABLED", false);
let lifecycle_max_history_per_bucket =
parse_usize_env("LIFECYCLE_MAX_HISTORY_PER_BUCKET", 50);
let website_hosting_enabled = parse_bool_env("WEBSITE_HOSTING_ENABLED", false);
let object_key_max_length_bytes = parse_usize_env("OBJECT_KEY_MAX_LENGTH_BYTES", 1024);
let object_tag_limit = parse_usize_env("OBJECT_TAG_LIMIT", 50);
let object_cache_max_size = parse_usize_env("OBJECT_CACHE_MAX_SIZE", 100);
let bucket_config_cache_ttl_seconds =
parse_f64_env("BUCKET_CONFIG_CACHE_TTL_SECONDS", 30.0);
let replication_connect_timeout_secs =
parse_u64_env("REPLICATION_CONNECT_TIMEOUT_SECONDS", 5);
let replication_read_timeout_secs = parse_u64_env("REPLICATION_READ_TIMEOUT_SECONDS", 30);
let replication_max_retries = parse_u64_env("REPLICATION_MAX_RETRIES", 2) as u32;
let replication_streaming_threshold_bytes =
parse_u64_env("REPLICATION_STREAMING_THRESHOLD_BYTES", 10_485_760);
let replication_max_failures_per_bucket =
parse_u64_env("REPLICATION_MAX_FAILURES_PER_BUCKET", 50) as usize;
let site_sync_enabled = parse_bool_env("SITE_SYNC_ENABLED", false);
let site_sync_interval_secs = parse_u64_env("SITE_SYNC_INTERVAL_SECONDS", 60);
let site_sync_batch_size = parse_u64_env("SITE_SYNC_BATCH_SIZE", 100) as usize;
let site_sync_connect_timeout_secs = parse_u64_env("SITE_SYNC_CONNECT_TIMEOUT_SECONDS", 10);
let site_sync_read_timeout_secs = parse_u64_env("SITE_SYNC_READ_TIMEOUT_SECONDS", 120);
let site_sync_max_retries = parse_u64_env("SITE_SYNC_MAX_RETRIES", 2) as u32;
let site_sync_clock_skew_tolerance: f64 =
std::env::var("SITE_SYNC_CLOCK_SKEW_TOLERANCE_SECONDS")
.ok()
.and_then(|s| s.parse().ok())
.unwrap_or(1.0);
let site_id = parse_optional_string_env("SITE_ID");
let site_endpoint = parse_optional_string_env("SITE_ENDPOINT");
let site_region = std::env::var("SITE_REGION").unwrap_or_else(|_| region.clone());
let site_priority = parse_i32_env("SITE_PRIORITY", 100);
let api_base_url = std::env::var("API_BASE_URL")
.unwrap_or_else(|_| format!("http://{}", bind_addr))
.trim_end_matches('/')
.to_string();
let num_trusted_proxies = parse_usize_env("NUM_TRUSTED_PROXIES", 0);
let allowed_redirect_hosts = parse_list_env("ALLOWED_REDIRECT_HOSTS", "");
let allow_internal_endpoints = parse_bool_env("ALLOW_INTERNAL_ENDPOINTS", false);
let cors_origins = parse_list_env("CORS_ORIGINS", "*");
let cors_methods = parse_list_env("CORS_METHODS", "GET,PUT,POST,DELETE,OPTIONS,HEAD");
let cors_allow_headers = parse_list_env("CORS_ALLOW_HEADERS", "*");
let cors_expose_headers = parse_list_env("CORS_EXPOSE_HEADERS", "*");
let session_lifetime_days = parse_u64_env("SESSION_LIFETIME_DAYS", 1);
let log_level = std::env::var("LOG_LEVEL").unwrap_or_else(|_| "INFO".to_string());
let multipart_min_part_size = parse_u64_env("MULTIPART_MIN_PART_SIZE", 5_242_880);
let bulk_delete_max_keys = parse_usize_env("BULK_DELETE_MAX_KEYS", 1000);
let stream_chunk_size = parse_usize_env("STREAM_CHUNK_SIZE", 1_048_576);
let request_body_timeout_secs = parse_u64_env("REQUEST_BODY_TIMEOUT_SECONDS", 60);
let ratelimit_default =
parse_rate_limit_env("RATE_LIMIT_DEFAULT", RateLimitSetting::new(500, 60));
let ratelimit_list_buckets =
parse_rate_limit_env("RATE_LIMIT_LIST_BUCKETS", ratelimit_default);
let ratelimit_bucket_ops =
parse_rate_limit_env("RATE_LIMIT_BUCKET_OPS", ratelimit_default);
let ratelimit_object_ops =
parse_rate_limit_env("RATE_LIMIT_OBJECT_OPS", ratelimit_default);
let ratelimit_head_ops =
parse_rate_limit_env("RATE_LIMIT_HEAD_OPS", ratelimit_default);
let ratelimit_admin =
parse_rate_limit_env("RATE_LIMIT_ADMIN", RateLimitSetting::new(60, 60));
let ratelimit_storage_uri =
std::env::var("RATE_LIMIT_STORAGE_URI").unwrap_or_else(|_| "memory://".to_string());
let ui_enabled = parse_bool_env("UI_ENABLED", true);
let templates_dir = std::env::var("TEMPLATES_DIR")
.map(PathBuf::from)
.unwrap_or_else(|_| default_templates_dir());
let static_dir = std::env::var("STATIC_DIR")
.map(PathBuf::from)
.unwrap_or_else(|_| default_static_dir());
Self {
bind_addr,
ui_bind_addr: SocketAddr::new(host_ip, ui_port),
storage_root: storage_path,
region,
iam_config_path,
sigv4_timestamp_tolerance_secs,
presigned_url_min_expiry,
presigned_url_max_expiry,
secret_key,
encryption_enabled,
encryption_chunk_size_bytes,
kms_enabled,
kms_generate_data_key_min_bytes,
kms_generate_data_key_max_bytes,
gc_enabled,
gc_interval_hours,
gc_temp_file_max_age_hours,
gc_multipart_max_age_days,
gc_lock_file_max_age_hours,
gc_dry_run,
integrity_enabled,
metrics_enabled,
metrics_history_enabled,
metrics_interval_minutes,
metrics_retention_hours,
metrics_history_interval_minutes,
metrics_history_retention_hours,
lifecycle_enabled,
lifecycle_max_history_per_bucket,
website_hosting_enabled,
object_key_max_length_bytes,
object_tag_limit,
object_cache_max_size,
bucket_config_cache_ttl_seconds,
replication_connect_timeout_secs,
replication_read_timeout_secs,
replication_max_retries,
replication_streaming_threshold_bytes,
replication_max_failures_per_bucket,
site_sync_enabled,
site_sync_interval_secs,
site_sync_batch_size,
site_sync_connect_timeout_secs,
site_sync_read_timeout_secs,
site_sync_max_retries,
site_sync_clock_skew_tolerance,
site_id,
site_endpoint,
site_region,
site_priority,
api_base_url,
num_trusted_proxies,
allowed_redirect_hosts,
allow_internal_endpoints,
cors_origins,
cors_methods,
cors_allow_headers,
cors_expose_headers,
session_lifetime_days,
log_level,
multipart_min_part_size,
bulk_delete_max_keys,
stream_chunk_size,
request_body_timeout_secs,
ratelimit_default,
ratelimit_list_buckets,
ratelimit_bucket_ops,
ratelimit_object_ops,
ratelimit_head_ops,
ratelimit_admin,
ratelimit_storage_uri,
ui_enabled,
templates_dir,
static_dir,
}
}
}
impl Default for ServerConfig {
fn default() -> Self {
Self {
bind_addr: "127.0.0.1:5000".parse().unwrap(),
ui_bind_addr: "127.0.0.1:5100".parse().unwrap(),
storage_root: PathBuf::from("./data"),
region: "us-east-1".to_string(),
iam_config_path: PathBuf::from("./data/.myfsio.sys/config/iam.json"),
sigv4_timestamp_tolerance_secs: 900,
presigned_url_min_expiry: 1,
presigned_url_max_expiry: 604_800,
secret_key: None,
encryption_enabled: false,
encryption_chunk_size_bytes: 65_536,
kms_enabled: false,
kms_generate_data_key_min_bytes: 1,
kms_generate_data_key_max_bytes: 1024,
gc_enabled: false,
gc_interval_hours: 6.0,
gc_temp_file_max_age_hours: 24.0,
gc_multipart_max_age_days: 7,
gc_lock_file_max_age_hours: 1.0,
gc_dry_run: false,
integrity_enabled: false,
metrics_enabled: false,
metrics_history_enabled: false,
metrics_interval_minutes: 5,
metrics_retention_hours: 24,
metrics_history_interval_minutes: 5,
metrics_history_retention_hours: 24,
lifecycle_enabled: false,
lifecycle_max_history_per_bucket: 50,
website_hosting_enabled: false,
object_key_max_length_bytes: 1024,
object_tag_limit: 50,
object_cache_max_size: 100,
bucket_config_cache_ttl_seconds: 30.0,
replication_connect_timeout_secs: 5,
replication_read_timeout_secs: 30,
replication_max_retries: 2,
replication_streaming_threshold_bytes: 10_485_760,
replication_max_failures_per_bucket: 50,
site_sync_enabled: false,
site_sync_interval_secs: 60,
site_sync_batch_size: 100,
site_sync_connect_timeout_secs: 10,
site_sync_read_timeout_secs: 120,
site_sync_max_retries: 2,
site_sync_clock_skew_tolerance: 1.0,
site_id: None,
site_endpoint: None,
site_region: "us-east-1".to_string(),
site_priority: 100,
api_base_url: "http://127.0.0.1:5000".to_string(),
num_trusted_proxies: 0,
allowed_redirect_hosts: Vec::new(),
allow_internal_endpoints: false,
cors_origins: vec!["*".to_string()],
cors_methods: vec![
"GET".to_string(),
"PUT".to_string(),
"POST".to_string(),
"DELETE".to_string(),
"OPTIONS".to_string(),
"HEAD".to_string(),
],
cors_allow_headers: vec!["*".to_string()],
cors_expose_headers: vec!["*".to_string()],
session_lifetime_days: 1,
log_level: "INFO".to_string(),
multipart_min_part_size: 5_242_880,
bulk_delete_max_keys: 1000,
stream_chunk_size: 1_048_576,
request_body_timeout_secs: 60,
ratelimit_default: RateLimitSetting::new(500, 60),
ratelimit_list_buckets: RateLimitSetting::new(500, 60),
ratelimit_bucket_ops: RateLimitSetting::new(500, 60),
ratelimit_object_ops: RateLimitSetting::new(500, 60),
ratelimit_head_ops: RateLimitSetting::new(500, 60),
ratelimit_admin: RateLimitSetting::new(60, 60),
ratelimit_storage_uri: "memory://".to_string(),
ui_enabled: true,
templates_dir: default_templates_dir(),
static_dir: default_static_dir(),
}
}
}
fn default_templates_dir() -> PathBuf {
let manifest_dir = PathBuf::from(env!("CARGO_MANIFEST_DIR"));
manifest_dir.join("templates")
}
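// Prefers `<crate>/static`, then a `static` directory three levels above the crate root.
// Falls back to `<crate>/static` even when neither candidate exists.
fn default_static_dir() -> PathBuf {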
let manifest_dir = PathBuf::from(env!("CARGO_MANIFEST_DIR"));
for candidate in [
manifest_dir.join("static"),
manifest_dir.join("..").join("..").join("..").join("static"),
] {
if candidate.exists() {
return candidate;
}
}
manifest_dir.join("static")
}
fn parse_u64_env(key: &str, default: u64) -> u64 {
std::env::var(key)
.ok()
.and_then(|s| s.parse().ok())
.unwrap_or(default)
}
fn parse_usize_env(key: &str, default: usize) -> usize {
std::env::var(key)
.ok()
.and_then(|s| s.parse().ok())
.unwrap_or(default)
}
fn parse_i32_env(key: &str, default: i32) -> i32 {
std::env::var(key)
.ok()
.and_then(|s| s.parse().ok())
.unwrap_or(default)
}
fn parse_f64_env(key: &str, default: f64) -> f64 {
std::env::var(key)
.ok()
.and_then(|s| s.parse().ok())
.unwrap_or(default)
}
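// Truthy values are "1", "true", "yes", and "on" (case-insensitive, trimmed).
// Any other value that is set yields `false`; `default` applies only when the variable is unset.
fn parse_bool_env(key: &str, default: bool) -> bool {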
std::env::var(key)
.ok()
.map(|value| {
matches!(
value.trim().to_ascii_lowercase().as_str(),
"1" | "true" | "yes" | "on"
)
})
.unwrap_or(default)
}
fn parse_optional_string_env(key: &str) -> Option<String> {
std::env::var(key)
.ok()
.map(|value| value.trim().to_string())
.filter(|value| !value.is_empty())
}
fn parse_list_env(key: &str, default: &str) -> Vec<String> {
std::env::var(key)
.unwrap_or_else(|_| default.to_string())
.split(',')
.map(|value| value.trim().to_string())
.filter(|value| !value.is_empty())
.collect()
}
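/// Parses a rate limit written as `<count>/<window>` or `<count> per <unit>`.
/// After `/`, the window is a number of seconds or a unit such as `s`, `min`, `hour`, or `day`;
/// the `per` form accepts only the full unit words. A zero count or zero window returns `None`.
pub fn parse_rate_limit(value: &str) -> Option<RateLimitSetting> {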
let trimmed = value.trim();
if let Some((requests, window)) = trimmed.split_once('/') {
let max_requests = requests.trim().parse::<u32>().ok()?;
if max_requests == 0 {
return None;
}
let window_str = window.trim().to_ascii_lowercase();
let window_seconds = if let Ok(n) = window_str.parse::<u64>() {
if n == 0 {
return None;
}
n
} else {
match window_str.as_str() {
"s" | "sec" | "second" | "seconds" => 1,
"m" | "min" | "minute" | "minutes" => 60,
"h" | "hr" | "hour" | "hours" => 3600,
"d" | "day" | "days" => 86_400,
_ => return None,
}
};
return Some(RateLimitSetting::new(max_requests, window_seconds));
}
let parts = trimmed.split_whitespace().collect::<Vec<_>>();
if parts.len() != 3 || !parts[1].eq_ignore_ascii_case("per") {
return None;
}
let max_requests = parts[0].parse::<u32>().ok()?;
if max_requests == 0 {
return None;
}
let window_seconds = match parts[2].to_ascii_lowercase().as_str() {
"second" | "seconds" => 1,
"minute" | "minutes" => 60,
"hour" | "hours" => 3600,
"day" | "days" => 86_400,
_ => return None,
};
Some(RateLimitSetting::new(max_requests, window_seconds))
}
fn parse_rate_limit_env(key: &str, default: RateLimitSetting) -> RateLimitSetting {
std::env::var(key)
.ok()
.and_then(|value| parse_rate_limit(&value))
.unwrap_or(default)
}
#[cfg(test)]
mod tests {
use super::*;
use std::sync::{Mutex, OnceLock};
fn env_lock() -> &'static Mutex<()> {
static LOCK: OnceLock<Mutex<()>> = OnceLock::new();
LOCK.get_or_init(|| Mutex::new(()))
}
#[test]
fn parses_rate_limit_text() {
assert_eq!(
parse_rate_limit("200 per minute"),
Some(RateLimitSetting::new(200, 60))
);
assert_eq!(
parse_rate_limit("3 per hours"),
Some(RateLimitSetting::new(3, 3600))
);
assert_eq!(
parse_rate_limit("50000/60"),
Some(RateLimitSetting::new(50000, 60))
);
assert_eq!(
parse_rate_limit("100/minute"),
Some(RateLimitSetting::new(100, 60))
);
assert_eq!(parse_rate_limit("0/60"), None);
assert_eq!(parse_rate_limit("0 per minute"), None);
assert_eq!(parse_rate_limit("bad"), None);
}
#[test]
fn env_defaults_and_invalid_values_fall_back() {
let _guard = env_lock().lock().unwrap();
std::env::remove_var("OBJECT_KEY_MAX_LENGTH_BYTES");
std::env::set_var("OBJECT_TAG_LIMIT", "not-a-number");
std::env::set_var("RATE_LIMIT_DEFAULT", "invalid");
let config = ServerConfig::from_env();
assert_eq!(config.object_key_max_length_bytes, 1024);
assert_eq!(config.object_tag_limit, 50);
assert_eq!(config.ratelimit_default, RateLimitSetting::new(500, 60));
std::env::remove_var("OBJECT_TAG_LIMIT");
std::env::remove_var("RATE_LIMIT_DEFAULT");
}
#[test]
fn env_overrides_new_values() {
let _guard = env_lock().lock().unwrap();
std::env::set_var("OBJECT_KEY_MAX_LENGTH_BYTES", "2048");
std::env::set_var("GC_DRY_RUN", "true");
std::env::set_var("RATE_LIMIT_ADMIN", "7 per second");
std::env::set_var("HOST", "127.0.0.1");
std::env::set_var("PORT", "5501");
std::env::remove_var("API_BASE_URL");
let config = ServerConfig::from_env();
assert_eq!(config.object_key_max_length_bytes, 2048);
assert!(config.gc_dry_run);
assert_eq!(config.ratelimit_admin, RateLimitSetting::new(7, 1));
assert_eq!(config.api_base_url, "http://127.0.0.1:5501");
std::env::remove_var("OBJECT_KEY_MAX_LENGTH_BYTES");
std::env::remove_var("GC_DRY_RUN");
std::env::remove_var("RATE_LIMIT_ADMIN");
std::env::remove_var("HOST");
std::env::remove_var("PORT");
}
}
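
The tests above serialize on `env_lock()` because every helper reads the process environment directly. One possible alternative, sketched here under assumptions, is to split parsing from lookup so the parsing rules can be tested without touching the environment. The names `parse_or`, `parse_bool_or`, and `env_or` are hypothetical and do not appear in the source. The boolean variant also differs from `parse_bool_env` on purpose: it falls back to `default` for unrecognized text, whereas the source returns `false` for any set value that is not truthy.

```rust
use std::str::FromStr;

/// Parses an optional raw value, falling back to `default` when it is
/// absent or does not parse. Hypothetical helper, not part of the source.
fn parse_or<T: FromStr>(raw: Option<&str>, default: T) -> T {
    raw.and_then(|s| s.parse().ok()).unwrap_or(default)
}

/// Boolean variant that treats unrecognized text like an unset variable.
/// The "0" / "false" / "no" / "off" spellings are an assumption of this sketch.
fn parse_bool_or(raw: Option<&str>, default: bool) -> bool {
    match raw.map(|s| s.trim().to_ascii_lowercase()).as_deref() {
        Some("1" | "true" | "yes" | "on") => true,
        Some("0" | "false" | "no" | "off") => false,
        _ => default,
    }
}

/// Thin wrapper that performs the actual environment lookup.
fn env_or<T: FromStr>(key: &str, default: T) -> T {
    parse_or(std::env::var(key).ok().as_deref(), default)
}

fn main() {
    // Pure parsing: no environment mutation, so no global lock is needed.
    assert_eq!(parse_or(Some("2048"), 1024u64), 2048);
    assert_eq!(parse_or(Some("not-a-number"), 50usize), 50);
    assert!(parse_bool_or(Some("maybe"), true));
    // Reads the real environment; the variable name is made up for this sketch.
    println!("{}", env_or("SKETCH_UNSET_VARIABLE", 7i32));
}
```

With this split, only `env_or` depends on global state, so most assertions can run in parallel without a shared mutex.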

@@ -0,0 +1,184 @@
use std::pin::Pin;
use std::task::{Context, Poll};
use bytes::{Buf, BytesMut};
use tokio::io::{AsyncRead, ReadBuf};
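// Decoder position within the aws-chunked framing.
enum State {
// Waiting for a complete `<hex-size>[;extensions]\r\n` header line.
ReadSize,
// Inside chunk data; the value is the number of payload bytes still owed (0 means only the trailing CRLF is pending).
ReadData(u64),
// After the zero-size chunk: skipping trailer lines until an empty line.
ReadTrailer,
// Terminating empty line consumed; further reads report EOF.
Finished,
}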
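/// `AsyncRead` adapter that strips aws-chunked framing (chunk headers, chunk extensions, trailers)
/// from `inner` and yields only the payload bytes. Chunk signatures are skipped, not verified.
pub struct AwsChunkedStream<S> {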
inner: S,
buffer: BytesMut,
state: State,
pending: BytesMut,
eof: bool,
}
impl<S> AwsChunkedStream<S> {
pub fn new(inner: S) -> Self {
Self {
inner,
buffer: BytesMut::with_capacity(8192),
state: State::ReadSize,
pending: BytesMut::new(),
eof: false,
}
}
fn find_crlf(&self) -> Option<usize> {
self.buffer.windows(2).position(|w| w == b"\r\n")
}
fn parse_chunk_size(line: &[u8]) -> std::io::Result<u64> {
let text = std::str::from_utf8(line).map_err(|_| {
std::io::Error::new(
std::io::ErrorKind::InvalidData,
"invalid chunk size encoding",
)
})?;
let head = text.split(';').next().unwrap_or("").trim();
u64::from_str_radix(head, 16).map_err(|_| {
std::io::Error::new(
std::io::ErrorKind::InvalidData,
format!("invalid chunk size: {}", head),
)
})
}
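// Drains as much decoded payload into `out` as the buffered input allows.
// Returns Ok(true) when `out` is full or the stream is finished, Ok(false) when more input is needed.
fn try_advance(&mut self, out: &mut ReadBuf<'_>) -> std::io::Result<bool> {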
loop {
if out.remaining() == 0 {
return Ok(true);
}
if !self.pending.is_empty() {
let take = std::cmp::min(self.pending.len(), out.remaining());
out.put_slice(&self.pending[..take]);
self.pending.advance(take);
continue;
}
match self.state {
State::Finished => return Ok(true),
State::ReadSize => {
let idx = match self.find_crlf() {
Some(i) => i,
None => return Ok(false),
};
let line = self.buffer.split_to(idx);
self.buffer.advance(2);
let size = Self::parse_chunk_size(&line)?;
if size == 0 {
self.state = State::ReadTrailer;
} else {
self.state = State::ReadData(size);
}
}
State::ReadData(remaining) => {
if self.buffer.is_empty() {
return Ok(false);
}
let avail = std::cmp::min(self.buffer.len() as u64, remaining) as usize;
let take = std::cmp::min(avail, out.remaining());
out.put_slice(&self.buffer[..take]);
self.buffer.advance(take);
let new_remaining = remaining - take as u64;
if new_remaining == 0 {
if self.buffer.len() < 2 {
self.state = State::ReadData(0);
return Ok(false);
}
if &self.buffer[..2] != b"\r\n" {
return Err(std::io::Error::new(
std::io::ErrorKind::InvalidData,
"malformed chunk terminator",
));
}
self.buffer.advance(2);
self.state = State::ReadSize;
} else {
self.state = State::ReadData(new_remaining);
}
}
State::ReadTrailer => {
let idx = match self.find_crlf() {
Some(i) => i,
None => return Ok(false),
};
if idx == 0 {
self.buffer.advance(2);
self.state = State::Finished;
} else {
self.buffer.advance(idx + 2);
}
}
}
}
}
}
impl<S> AsyncRead for AwsChunkedStream<S>
where
S: AsyncRead + Unpin,
{
fn poll_read(
mut self: Pin<&mut Self>,
cx: &mut Context<'_>,
buf: &mut ReadBuf<'_>,
) -> Poll<std::io::Result<()>> {
loop {
let before = buf.filled().len();
let done = match self.try_advance(buf) {
Ok(v) => v,
Err(e) => return Poll::Ready(Err(e)),
};
if buf.filled().len() > before {
return Poll::Ready(Ok(()));
}
if done {
return Poll::Ready(Ok(()));
}
if self.eof {
return Poll::Ready(Err(std::io::Error::new(
std::io::ErrorKind::UnexpectedEof,
"unexpected EOF in aws-chunked stream",
)));
}
let mut tmp = [0u8; 8192];
let mut rb = ReadBuf::new(&mut tmp);
match Pin::new(&mut self.inner).poll_read(cx, &mut rb) {
Poll::Ready(Ok(())) => {
let n = rb.filled().len();
if n == 0 {
self.eof = true;
continue;
}
self.buffer.extend_from_slice(rb.filled());
}
Poll::Ready(Err(e)) => return Poll::Ready(Err(e)),
Poll::Pending => return Poll::Pending,
}
}
}
}
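/// Wraps a request body in an `AwsChunkedStream`. Non-data frames (such as HTTP trailers)
/// are mapped to empty byte chunks before decoding.
pub fn decode_body(body: axum::body::Body) -> impl AsyncRead + Send + Unpin {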
use futures::TryStreamExt;
let stream = tokio_util::io::StreamReader::new(
http_body_util::BodyStream::new(body)
.map_ok(|frame| frame.into_data().unwrap_or_default())
.map_err(|e| std::io::Error::new(std::io::ErrorKind::Other, e)),
);
AwsChunkedStream::new(stream)
}
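
The framing that `AwsChunkedStream` strips can be illustrated with a synchronous, in-memory decoder. This is a minimal sketch under stated assumptions: the names `decode_aws_chunked` and `split_line` are hypothetical, the whole body is assumed to be in memory already, and chunk signatures are skipped rather than verified, mirroring the streaming code above. The sample header and trailer values are made up.

```rust
/// Splits `input` at the first CRLF, returning (line, rest-after-CRLF).
fn split_line(input: &[u8]) -> Option<(&[u8], &[u8])> {
    let idx = input.windows(2).position(|w| w == b"\r\n")?;
    Some((&input[..idx], &input[idx + 2..]))
}

/// Decodes a complete aws-chunked body:
/// `<hex-size>[;ext]\r\n<data>\r\n` repeated, then `0[;ext]\r\n<trailers>\r\n`.
fn decode_aws_chunked(mut input: &[u8]) -> Result<Vec<u8>, String> {
    let mut out = Vec::new();
    loop {
        let (line, rest) = split_line(input).ok_or("missing CRLF after chunk size")?;
        let text = std::str::from_utf8(line).map_err(|_| "invalid chunk size encoding")?;
        // Drop extensions such as `;chunk-signature=...` before parsing the size.
        let head = text.split(';').next().unwrap_or("").trim();
        let size = usize::from_str_radix(head, 16)
            .map_err(|_| format!("invalid chunk size: {head}"))?;
        input = rest;
        if size == 0 {
            break;
        }
        // Written this way to avoid overflow on `size + 2`.
        if size > input.len() || input.len() - size < 2 {
            return Err("truncated chunk".to_string());
        }
        out.extend_from_slice(&input[..size]);
        if &input[size..size + 2] != b"\r\n" {
            return Err("malformed chunk terminator".to_string());
        }
        input = &input[size + 2..];
    }
    // Skip trailer lines until the terminating empty line.
    loop {
        let (line, rest) = split_line(input).ok_or("missing final CRLF")?;
        input = rest;
        if line.is_empty() {
            return Ok(out);
        }
    }
}

fn main() {
    let body = b"5;chunk-signature=abc\r\nhello\r\n\
                 6;chunk-signature=def\r\n world\r\n\
                 0;chunk-signature=end\r\n\
                 x-amz-checksum-crc32:AAAAAA==\r\n\r\n";
    let decoded = decode_aws_chunked(body).expect("well-formed body");
    println!("{}", String::from_utf8_lossy(&decoded));
}
```

The streaming version needs the extra `ReadData(0)` state because the CRLF after a chunk may not have arrived yet; the in-memory sketch can simply report a truncated chunk instead.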

@@ -0,0 +1,559 @@
use aes_gcm::aead::Aead;
use aes_gcm::{Aes256Gcm, KeyInit, Nonce};
use axum::body::Body;
use axum::extract::State;
use axum::http::StatusCode;
use axum::response::{IntoResponse, Response};
use base64::engine::general_purpose::STANDARD as B64;
use base64::Engine;
use rand::RngCore;
use serde_json::{json, Value};
use crate::state::AppState;
fn json_ok(value: Value) -> Response {
(
StatusCode::OK,
[("content-type", "application/json")],
value.to_string(),
)
.into_response()
}
fn json_err(status: StatusCode, msg: &str) -> Response {
(
status,
[("content-type", "application/json")],
json!({"error": msg}).to_string(),
)
.into_response()
}
async fn read_json(body: Body) -> Result<Value, Response> {
let body_bytes = http_body_util::BodyExt::collect(body)
.await
.map_err(|_| json_err(StatusCode::BAD_REQUEST, "Invalid request body"))?
.to_bytes();
if body_bytes.is_empty() {
Ok(json!({}))
} else {
serde_json::from_slice(&body_bytes)
.map_err(|_| json_err(StatusCode::BAD_REQUEST, "Invalid JSON"))
}
}
fn require_kms(
state: &AppState,
) -> Result<&std::sync::Arc<myfsio_crypto::kms::KmsService>, Response> {
state
.kms
.as_ref()
.ok_or_else(|| json_err(StatusCode::SERVICE_UNAVAILABLE, "KMS not enabled"))
}
fn decode_b64(value: &str, field: &str) -> Result<Vec<u8>, Response> {
B64.decode(value).map_err(|_| {
json_err(
StatusCode::BAD_REQUEST,
&format!("Invalid base64 {}", field),
)
})
}
fn require_str<'a>(value: &'a Value, names: &[&str], message: &str) -> Result<&'a str, Response> {
for name in names {
if let Some(found) = value.get(*name).and_then(|v| v.as_str()) {
return Ok(found);
}
}
Err(json_err(StatusCode::BAD_REQUEST, message))
}
pub async fn list_keys(State(state): State<AppState>) -> Response {
let kms = match require_kms(&state) {
Ok(kms) => kms,
Err(response) => return response,
};
let keys = kms.list_keys().await;
let keys_json: Vec<Value> = keys
.iter()
.map(|k| {
json!({
"KeyId": k.key_id,
"Arn": k.arn,
"Description": k.description,
"CreationDate": k.creation_date.to_rfc3339(),
"Enabled": k.enabled,
"KeyState": k.key_state,
"KeyUsage": k.key_usage,
"KeySpec": k.key_spec,
})
})
.collect();
json_ok(json!({"keys": keys_json}))
}
pub async fn create_key(State(state): State<AppState>, body: Body) -> Response {
let kms = match require_kms(&state) {
Ok(kms) => kms,
Err(response) => return response,
};
let req = match read_json(body).await {
Ok(req) => req,
Err(response) => return response,
};
let description = req
.get("Description")
.or_else(|| req.get("description"))
.and_then(|d| d.as_str())
.unwrap_or("");
match kms.create_key(description).await {
Ok(key) => json_ok(json!({
"KeyId": key.key_id,
"Arn": key.arn,
"Description": key.description,
"CreationDate": key.creation_date.to_rfc3339(),
"Enabled": key.enabled,
"KeyState": key.key_state,
})),
Err(e) => json_err(StatusCode::INTERNAL_SERVER_ERROR, &e.to_string()),
}
}
pub async fn get_key(
State(state): State<AppState>,
axum::extract::Path(key_id): axum::extract::Path<String>,
) -> Response {
let kms = match require_kms(&state) {
Ok(kms) => kms,
Err(response) => return response,
};
match kms.get_key(&key_id).await {
Some(key) => json_ok(json!({
"KeyId": key.key_id,
"Arn": key.arn,
"Description": key.description,
"CreationDate": key.creation_date.to_rfc3339(),
"Enabled": key.enabled,
"KeyState": key.key_state,
"KeyUsage": key.key_usage,
"KeySpec": key.key_spec,
})),
None => json_err(StatusCode::NOT_FOUND, "Key not found"),
}
}
pub async fn delete_key(
State(state): State<AppState>,
axum::extract::Path(key_id): axum::extract::Path<String>,
) -> Response {
let kms = match require_kms(&state) {
Ok(kms) => kms,
Err(response) => return response,
};
match kms.delete_key(&key_id).await {
Ok(true) => StatusCode::NO_CONTENT.into_response(),
Ok(false) => json_err(StatusCode::NOT_FOUND, "Key not found"),
Err(e) => json_err(StatusCode::INTERNAL_SERVER_ERROR, &e.to_string()),
}
}
pub async fn enable_key(
State(state): State<AppState>,
axum::extract::Path(key_id): axum::extract::Path<String>,
) -> Response {
let kms = match require_kms(&state) {
Ok(kms) => kms,
Err(response) => return response,
};
match kms.enable_key(&key_id).await {
Ok(true) => json_ok(json!({"status": "enabled"})),
Ok(false) => json_err(StatusCode::NOT_FOUND, "Key not found"),
Err(e) => json_err(StatusCode::INTERNAL_SERVER_ERROR, &e.to_string()),
}
}
pub async fn disable_key(
State(state): State<AppState>,
axum::extract::Path(key_id): axum::extract::Path<String>,
) -> Response {
let kms = match require_kms(&state) {
Ok(kms) => kms,
Err(response) => return response,
};
match kms.disable_key(&key_id).await {
Ok(true) => json_ok(json!({"status": "disabled"})),
Ok(false) => json_err(StatusCode::NOT_FOUND, "Key not found"),
Err(e) => json_err(StatusCode::INTERNAL_SERVER_ERROR, &e.to_string()),
}
}
pub async fn encrypt(State(state): State<AppState>, body: Body) -> Response {
let kms = match require_kms(&state) {
Ok(kms) => kms,
Err(response) => return response,
};
let req = match read_json(body).await {
Ok(req) => req,
Err(response) => return response,
};
let key_id = match require_str(&req, &["KeyId", "key_id"], "Missing KeyId") {
Ok(value) => value,
Err(response) => return response,
};
let plaintext_b64 = match require_str(&req, &["Plaintext", "plaintext"], "Missing Plaintext") {
Ok(value) => value,
Err(response) => return response,
};
let plaintext = match decode_b64(plaintext_b64, "Plaintext") {
Ok(value) => value,
Err(response) => return response,
};
match kms.encrypt_data(key_id, &plaintext).await {
Ok(ct) => json_ok(json!({
"KeyId": key_id,
"CiphertextBlob": B64.encode(&ct),
})),
Err(e) => json_err(StatusCode::INTERNAL_SERVER_ERROR, &e.to_string()),
}
}
pub async fn decrypt(State(state): State<AppState>, body: Body) -> Response {
let kms = match require_kms(&state) {
Ok(kms) => kms,
Err(response) => return response,
};
let req = match read_json(body).await {
Ok(req) => req,
Err(response) => return response,
};
let key_id = match require_str(&req, &["KeyId", "key_id"], "Missing KeyId") {
Ok(value) => value,
Err(response) => return response,
};
let ciphertext_b64 = match require_str(
&req,
&["CiphertextBlob", "ciphertext_blob"],
"Missing CiphertextBlob",
) {
Ok(value) => value,
Err(response) => return response,
};
let ciphertext = match decode_b64(ciphertext_b64, "CiphertextBlob") {
Ok(value) => value,
Err(response) => return response,
};
match kms.decrypt_data(key_id, &ciphertext).await {
Ok(pt) => json_ok(json!({
"KeyId": key_id,
"Plaintext": B64.encode(&pt),
})),
Err(e) => json_err(StatusCode::INTERNAL_SERVER_ERROR, &e.to_string()),
}
}
pub async fn generate_data_key(State(state): State<AppState>, body: Body) -> Response {
generate_data_key_inner(state, body, true).await
}
pub async fn generate_data_key_without_plaintext(
State(state): State<AppState>,
body: Body,
) -> Response {
generate_data_key_inner(state, body, false).await
}
async fn generate_data_key_inner(state: AppState, body: Body, include_plaintext: bool) -> Response {
let kms = match require_kms(&state) {
Ok(kms) => kms,
Err(response) => return response,
};
let req = match read_json(body).await {
Ok(req) => req,
Err(response) => return response,
};
let key_id = match require_str(&req, &["KeyId", "key_id"], "Missing KeyId") {
Ok(value) => value,
Err(response) => return response,
};
let num_bytes = req
.get("NumberOfBytes")
.and_then(|v| v.as_u64())
.unwrap_or(32) as usize;
if num_bytes < state.config.kms_generate_data_key_min_bytes
|| num_bytes > state.config.kms_generate_data_key_max_bytes
{
return json_err(
StatusCode::BAD_REQUEST,
&format!(
"NumberOfBytes must be {}-{}",
state.config.kms_generate_data_key_min_bytes,
state.config.kms_generate_data_key_max_bytes
),
);
}
match kms.generate_data_key(key_id, num_bytes).await {
Ok((plaintext, wrapped)) => {
let mut value = json!({
"KeyId": key_id,
"CiphertextBlob": B64.encode(&wrapped),
});
if include_plaintext {
value["Plaintext"] = json!(B64.encode(&plaintext));
}
json_ok(value)
}
Err(e) => json_err(StatusCode::INTERNAL_SERVER_ERROR, &e.to_string()),
}
}
pub async fn re_encrypt(State(state): State<AppState>, body: Body) -> Response {
let kms = match require_kms(&state) {
Ok(kms) => kms,
Err(response) => return response,
};
let req = match read_json(body).await {
Ok(req) => req,
Err(response) => return response,
};
let ciphertext_b64 = match require_str(
&req,
&["CiphertextBlob", "ciphertext_blob"],
"CiphertextBlob is required",
) {
Ok(value) => value,
Err(response) => return response,
};
let destination_key_id = match require_str(
&req,
&["DestinationKeyId", "destination_key_id"],
"DestinationKeyId is required",
) {
Ok(value) => value,
Err(response) => return response,
};
let ciphertext = match decode_b64(ciphertext_b64, "CiphertextBlob") {
Ok(value) => value,
Err(response) => return response,
};
// The request names no source key, so try each enabled key until one decrypts the blob.
let keys = kms.list_keys().await;
let mut source_key_id: Option<String> = None;
let mut plaintext: Option<Vec<u8>> = None;
for key in keys {
if !key.enabled {
continue;
}
if let Ok(value) = kms.decrypt_data(&key.key_id, &ciphertext).await {
source_key_id = Some(key.key_id);
plaintext = Some(value);
break;
}
}
let Some(source_key_id) = source_key_id else {
return json_err(
StatusCode::BAD_REQUEST,
"Could not determine source key for CiphertextBlob",
);
};
let plaintext = plaintext.unwrap_or_default();
match kms.encrypt_data(destination_key_id, &plaintext).await {
Ok(new_ciphertext) => json_ok(json!({
"CiphertextBlob": B64.encode(&new_ciphertext),
"SourceKeyId": source_key_id,
"KeyId": destination_key_id,
})),
Err(e) => json_err(StatusCode::INTERNAL_SERVER_ERROR, &e.to_string()),
}
}
pub async fn generate_random(State(state): State<AppState>, body: Body) -> Response {
if let Err(response) = require_kms(&state) {
return response;
}
let req = match read_json(body).await {
Ok(req) => req,
Err(response) => return response,
};
let num_bytes = req
.get("NumberOfBytes")
.and_then(|v| v.as_u64())
.unwrap_or(32) as usize;
if num_bytes < state.config.kms_generate_data_key_min_bytes
|| num_bytes > state.config.kms_generate_data_key_max_bytes
{
return json_err(
StatusCode::BAD_REQUEST,
&format!(
"NumberOfBytes must be {}-{}",
state.config.kms_generate_data_key_min_bytes,
state.config.kms_generate_data_key_max_bytes
),
);
}
let mut bytes = vec![0u8; num_bytes];
rand::thread_rng().fill_bytes(&mut bytes);
json_ok(json!({
"Plaintext": B64.encode(bytes),
}))
}
pub async fn client_generate_key(State(state): State<AppState>) -> Response {
let _ = state;
let mut key = [0u8; 32];
rand::thread_rng().fill_bytes(&mut key);
json_ok(json!({
"Key": B64.encode(key),
"Algorithm": "AES-256-GCM",
"KeySize": 32,
}))
}
pub async fn client_encrypt(State(state): State<AppState>, body: Body) -> Response {
let _ = state;
let req = match read_json(body).await {
Ok(req) => req,
Err(response) => return response,
};
let plaintext_b64 =
match require_str(&req, &["Plaintext", "plaintext"], "Plaintext is required") {
Ok(value) => value,
Err(response) => return response,
};
let key_b64 = match require_str(&req, &["Key", "key"], "Key is required") {
Ok(value) => value,
Err(response) => return response,
};
let plaintext = match decode_b64(plaintext_b64, "Plaintext") {
Ok(value) => value,
Err(response) => return response,
};
let key_bytes = match decode_b64(key_b64, "Key") {
Ok(value) => value,
Err(response) => return response,
};
if key_bytes.len() != 32 {
return json_err(StatusCode::BAD_REQUEST, "Key must decode to 32 bytes");
}
let cipher = match Aes256Gcm::new_from_slice(&key_bytes) {
Ok(cipher) => cipher,
Err(_) => return json_err(StatusCode::BAD_REQUEST, "Invalid encryption key"),
};
let mut nonce_bytes = [0u8; 12];
rand::thread_rng().fill_bytes(&mut nonce_bytes);
let nonce = Nonce::from_slice(&nonce_bytes);
match cipher.encrypt(nonce, plaintext.as_ref()) {
Ok(ciphertext) => json_ok(json!({
"Ciphertext": B64.encode(ciphertext),
"Nonce": B64.encode(nonce_bytes),
"Algorithm": "AES-256-GCM",
})),
Err(e) => json_err(StatusCode::BAD_REQUEST, &e.to_string()),
}
}
pub async fn client_decrypt(State(state): State<AppState>, body: Body) -> Response {
let _ = state;
let req = match read_json(body).await {
Ok(req) => req,
Err(response) => return response,
};
let ciphertext_b64 = match require_str(
&req,
&["Ciphertext", "ciphertext"],
"Ciphertext is required",
) {
Ok(value) => value,
Err(response) => return response,
};
let nonce_b64 = match require_str(&req, &["Nonce", "nonce"], "Nonce is required") {
Ok(value) => value,
Err(response) => return response,
};
let key_b64 = match require_str(&req, &["Key", "key"], "Key is required") {
Ok(value) => value,
Err(response) => return response,
};
let ciphertext = match decode_b64(ciphertext_b64, "Ciphertext") {
Ok(value) => value,
Err(response) => return response,
};
let nonce_bytes = match decode_b64(nonce_b64, "Nonce") {
Ok(value) => value,
Err(response) => return response,
};
let key_bytes = match decode_b64(key_b64, "Key") {
Ok(value) => value,
Err(response) => return response,
};
if key_bytes.len() != 32 {
return json_err(StatusCode::BAD_REQUEST, "Key must decode to 32 bytes");
}
if nonce_bytes.len() != 12 {
return json_err(StatusCode::BAD_REQUEST, "Nonce must decode to 12 bytes");
}
let cipher = match Aes256Gcm::new_from_slice(&key_bytes) {
Ok(cipher) => cipher,
Err(_) => return json_err(StatusCode::BAD_REQUEST, "Invalid encryption key"),
};
let nonce = Nonce::from_slice(&nonce_bytes);
match cipher.decrypt(nonce, ciphertext.as_ref()) {
Ok(plaintext) => json_ok(json!({
"Plaintext": B64.encode(plaintext),
})),
Err(e) => json_err(StatusCode::BAD_REQUEST, &e.to_string()),
}
}
pub async fn materials(
State(state): State<AppState>,
axum::extract::Path(key_id): axum::extract::Path<String>,
body: Body,
) -> Response {
let kms = match require_kms(&state) {
Ok(kms) => kms,
Err(response) => return response,
};
let _ = match read_json(body).await {
Ok(req) => req,
Err(response) => return response,
};
match kms.generate_data_key(&key_id, 32).await {
Ok((plaintext, wrapped)) => json_ok(json!({
"PlaintextKey": B64.encode(plaintext),
"EncryptedKey": B64.encode(wrapped),
"KeyId": key_id,
"Algorithm": "AES-256-GCM",
"KeyWrapAlgorithm": "kms",
})),
Err(e) => json_err(StatusCode::INTERNAL_SERVER_ERROR, &e.to_string()),
}
}
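
`generate_data_key_inner` and `generate_random` repeat the same `NumberOfBytes` bounds check, and both convert with `as usize`, which would silently truncate a large value on a 32-bit target. A shared helper is one way to remove the duplication. The sketch below is an assumption, not the author's code: the name `validated_num_bytes` is hypothetical, and it takes a plain `Option<u64>` so it stays independent of `serde_json`.

```rust
use std::convert::TryFrom;

/// Validates an optional `NumberOfBytes` value against inclusive bounds.
/// A missing value defaults to 32, matching the handlers above.
fn validated_num_bytes(requested: Option<u64>, min: usize, max: usize) -> Result<usize, String> {
    let raw = requested.unwrap_or(32);
    // `try_from` rejects values that do not fit in `usize` instead of truncating them.
    match usize::try_from(raw) {
        Ok(n) if (min..=max).contains(&n) => Ok(n),
        _ => Err(format!("NumberOfBytes must be {}-{}", min, max)),
    }
}

fn main() {
    // Bounds taken from the defaults shown in `ServerConfig::default()`.
    println!("{:?}", validated_num_bytes(None, 1, 1024));
    println!("{:?}", validated_num_bytes(Some(4096), 1, 1024));
}
```

A handler could call it with `req.get("NumberOfBytes").and_then(|v| v.as_u64())` and map the `Err` message to a 400 response.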

@@ -0,0 +1,578 @@
use std::collections::HashMap;
use std::path::{Path, PathBuf};
use axum::body::Body;
use axum::http::{HeaderMap, HeaderName, StatusCode};
use axum::response::{IntoResponse, Response};
use base64::Engine;
use bytes::Bytes;
use crc32fast::Hasher;
use duckdb::types::ValueRef;
use duckdb::Connection;
use futures::stream;
use http_body_util::BodyExt;
use myfsio_common::error::{S3Error, S3ErrorCode};
use myfsio_storage::traits::StorageEngine;
use crate::state::AppState;
#[cfg(target_os = "windows")]
#[link(name = "Rstrtmgr")]
extern "system" {}
const CHUNK_SIZE: usize = 65_536;
pub async fn post_select_object_content(
state: &AppState,
bucket: &str,
key: &str,
headers: &HeaderMap,
body: Body,
) -> Response {
if let Some(resp) = require_xml_content_type(headers) {
return resp;
}
let body_bytes = match body.collect().await {
Ok(collected) => collected.to_bytes(),
Err(_) => {
return s3_error_response(S3Error::new(
S3ErrorCode::MalformedXML,
"Unable to parse XML document",
));
}
};
let request = match parse_select_request(&body_bytes) {
Ok(r) => r,
Err(err) => return s3_error_response(err),
};
let object_path = match state.storage.get_object_path(bucket, key).await {
Ok(path) => path,
Err(_) => {
return s3_error_response(S3Error::new(S3ErrorCode::NoSuchKey, "Object not found"));
}
};
let join_res =
tokio::task::spawn_blocking(move || execute_select_query(object_path, request)).await;
let chunks = match join_res {
Ok(Ok(chunks)) => chunks,
Ok(Err(message)) => {
return s3_error_response(S3Error::new(S3ErrorCode::InvalidRequest, message));
}
Err(_) => {
return s3_error_response(S3Error::new(
S3ErrorCode::InternalError,
"SelectObjectContent execution failed",
));
}
};
let bytes_returned: usize = chunks.iter().map(|c| c.len()).sum();
let mut events: Vec<Bytes> = Vec::with_capacity(chunks.len() + 2);
for chunk in chunks {
events.push(Bytes::from(encode_select_event("Records", &chunk)));
}
let stats_payload = build_stats_xml(0, bytes_returned);
events.push(Bytes::from(encode_select_event(
"Stats",
stats_payload.as_bytes(),
)));
events.push(Bytes::from(encode_select_event("End", b"")));
let stream = stream::iter(events.into_iter().map(Ok::<Bytes, std::io::Error>));
let body = Body::from_stream(stream);
let mut response = (StatusCode::OK, body).into_response();
response.headers_mut().insert(
HeaderName::from_static("content-type"),
"application/octet-stream".parse().unwrap(),
);
response.headers_mut().insert(
HeaderName::from_static("x-amz-request-charged"),
"requester".parse().unwrap(),
);
response
}
#[derive(Clone)]
struct SelectRequest {
expression: String,
input_format: InputFormat,
output_format: OutputFormat,
}
#[derive(Clone)]
enum InputFormat {
Csv(CsvInputConfig),
Json(JsonInputConfig),
Parquet,
}
#[derive(Clone)]
struct CsvInputConfig {
file_header_info: String,
field_delimiter: String,
quote_character: String,
}
#[derive(Clone)]
struct JsonInputConfig {
json_type: String,
}
#[derive(Clone)]
enum OutputFormat {
Csv(CsvOutputConfig),
Json(JsonOutputConfig),
}
#[derive(Clone)]
struct CsvOutputConfig {
field_delimiter: String,
record_delimiter: String,
quote_character: String,
}
#[derive(Clone)]
struct JsonOutputConfig {
record_delimiter: String,
}
fn parse_select_request(payload: &[u8]) -> Result<SelectRequest, S3Error> {
let xml = String::from_utf8_lossy(payload);
let doc = roxmltree::Document::parse(&xml)
.map_err(|_| S3Error::new(S3ErrorCode::MalformedXML, "Unable to parse XML document"))?;
let root = doc.root_element();
if root.tag_name().name() != "SelectObjectContentRequest" {
return Err(S3Error::new(
S3ErrorCode::MalformedXML,
"Root element must be SelectObjectContentRequest",
));
}
let expression = child_text(&root, "Expression")
.filter(|v| !v.is_empty())
.ok_or_else(|| S3Error::new(S3ErrorCode::InvalidRequest, "Expression is required"))?;
let expression_type = child_text(&root, "ExpressionType").unwrap_or_else(|| "SQL".to_string());
if !expression_type.eq_ignore_ascii_case("SQL") {
return Err(S3Error::new(
S3ErrorCode::InvalidRequest,
"Only SQL expression type is supported",
));
}
let input_node = child(&root, "InputSerialization").ok_or_else(|| {
S3Error::new(
S3ErrorCode::InvalidRequest,
"InputSerialization is required",
)
})?;
let output_node = child(&root, "OutputSerialization").ok_or_else(|| {
S3Error::new(
S3ErrorCode::InvalidRequest,
"OutputSerialization is required",
)
})?;
let input_format = parse_input_format(&input_node)?;
let output_format = parse_output_format(&output_node)?;
Ok(SelectRequest {
expression,
input_format,
output_format,
})
}
fn parse_input_format(node: &roxmltree::Node<'_, '_>) -> Result<InputFormat, S3Error> {
if let Some(csv_node) = child(node, "CSV") {
return Ok(InputFormat::Csv(CsvInputConfig {
file_header_info: child_text(&csv_node, "FileHeaderInfo")
.unwrap_or_else(|| "NONE".to_string())
.to_ascii_uppercase(),
field_delimiter: child_text(&csv_node, "FieldDelimiter")
.unwrap_or_else(|| ",".to_string()),
quote_character: child_text(&csv_node, "QuoteCharacter")
.unwrap_or_else(|| "\"".to_string()),
}));
}
if let Some(json_node) = child(node, "JSON") {
return Ok(InputFormat::Json(JsonInputConfig {
json_type: child_text(&json_node, "Type")
.unwrap_or_else(|| "DOCUMENT".to_string())
.to_ascii_uppercase(),
}));
}
if child(node, "Parquet").is_some() {
return Ok(InputFormat::Parquet);
}
Err(S3Error::new(
S3ErrorCode::InvalidRequest,
"InputSerialization must specify CSV, JSON, or Parquet",
))
}
fn parse_output_format(node: &roxmltree::Node<'_, '_>) -> Result<OutputFormat, S3Error> {
if let Some(csv_node) = child(node, "CSV") {
return Ok(OutputFormat::Csv(CsvOutputConfig {
field_delimiter: child_text(&csv_node, "FieldDelimiter")
.unwrap_or_else(|| ",".to_string()),
record_delimiter: child_text(&csv_node, "RecordDelimiter")
.unwrap_or_else(|| "\n".to_string()),
quote_character: child_text(&csv_node, "QuoteCharacter")
.unwrap_or_else(|| "\"".to_string()),
}));
}
if let Some(json_node) = child(node, "JSON") {
return Ok(OutputFormat::Json(JsonOutputConfig {
record_delimiter: child_text(&json_node, "RecordDelimiter")
.unwrap_or_else(|| "\n".to_string()),
}));
}
Err(S3Error::new(
S3ErrorCode::InvalidRequest,
"OutputSerialization must specify CSV or JSON",
))
}
fn child<'a, 'input>(
node: &'a roxmltree::Node<'a, 'input>,
name: &str,
) -> Option<roxmltree::Node<'a, 'input>> {
node.children()
.find(|n| n.is_element() && n.tag_name().name() == name)
}
fn child_text(node: &roxmltree::Node<'_, '_>, name: &str) -> Option<String> {
child(node, name)
.and_then(|n| n.text())
.map(|s| s.to_string())
}
fn execute_select_query(path: PathBuf, request: SelectRequest) -> Result<Vec<Vec<u8>>, String> {
let conn =
Connection::open_in_memory().map_err(|e| format!("DuckDB connection error: {}", e))?;
load_input_table(&conn, &path, &request.input_format)?;
let expression = request
.expression
.replace("s3object", "data")
.replace("S3Object", "data");
let mut stmt = conn
.prepare(&expression)
.map_err(|e| format!("SQL execution error: {}", e))?;
let mut rows = stmt
.query([])
.map_err(|e| format!("SQL execution error: {}", e))?;
let stmt_ref = rows
.as_ref()
.ok_or_else(|| "SQL execution error: statement metadata unavailable".to_string())?;
let col_count = stmt_ref.column_count();
let mut columns: Vec<String> = Vec::with_capacity(col_count);
for i in 0..col_count {
let name = stmt_ref
.column_name(i)
.map(|s| s.to_string())
.unwrap_or_else(|_| format!("_{}", i));
columns.push(name);
}
match request.output_format {
OutputFormat::Csv(cfg) => collect_csv_chunks(&mut rows, col_count, cfg),
OutputFormat::Json(cfg) => collect_json_chunks(&mut rows, col_count, &columns, cfg),
}
}
fn load_input_table(conn: &Connection, path: &Path, input: &InputFormat) -> Result<(), String> {
let path_str = path.to_string_lossy().replace('\\', "/");
match input {
InputFormat::Csv(cfg) => {
let header = cfg.file_header_info == "USE" || cfg.file_header_info == "IGNORE";
let delimiter = normalize_single_char(&cfg.field_delimiter, ',');
let quote = normalize_single_char(&cfg.quote_character, '"');
let sql = format!(
"CREATE TABLE data AS SELECT * FROM read_csv('{}', header={}, delim='{}', quote='{}')",
sql_escape(&path_str),
if header { "true" } else { "false" },
sql_escape(&delimiter),
sql_escape(&quote)
);
conn.execute_batch(&sql)
.map_err(|e| format!("Failed loading CSV data: {}", e))?;
}
InputFormat::Json(cfg) => {
let format = if cfg.json_type == "LINES" {
"newline_delimited"
} else {
"array"
};
let sql = format!(
"CREATE TABLE data AS SELECT * FROM read_json_auto('{}', format='{}')",
sql_escape(&path_str),
format
);
conn.execute_batch(&sql)
.map_err(|e| format!("Failed loading JSON data: {}", e))?;
}
InputFormat::Parquet => {
let sql = format!(
"CREATE TABLE data AS SELECT * FROM read_parquet('{}')",
sql_escape(&path_str)
);
conn.execute_batch(&sql)
.map_err(|e| format!("Failed loading Parquet data: {}", e))?;
}
}
Ok(())
}
fn sql_escape(value: &str) -> String {
value.replace('\'', "''")
}
fn normalize_single_char(value: &str, default_char: char) -> String {
value.chars().next().unwrap_or(default_char).to_string()
}
fn collect_csv_chunks(
rows: &mut duckdb::Rows<'_>,
col_count: usize,
cfg: CsvOutputConfig,
) -> Result<Vec<Vec<u8>>, String> {
let delimiter = cfg.field_delimiter;
let record_delimiter = cfg.record_delimiter;
let quote = cfg.quote_character;
let mut chunks: Vec<Vec<u8>> = Vec::new();
let mut buffer = String::new();
while let Some(row) = rows
.next()
.map_err(|e| format!("SQL execution error: {}", e))?
{
let mut fields: Vec<String> = Vec::with_capacity(col_count);
for i in 0..col_count {
let value = row
.get_ref(i)
.map_err(|e| format!("SQL execution error: {}", e))?;
if matches!(value, ValueRef::Null) {
fields.push(String::new());
continue;
}
let mut text = value_ref_to_string(value);
if text.contains(&delimiter)
|| text.contains(&quote)
|| text.contains(&record_delimiter)
{
text = text.replace(&quote, &(quote.clone() + &quote));
text = format!("{}{}{}", quote, text, quote);
}
fields.push(text);
}
buffer.push_str(&fields.join(&delimiter));
buffer.push_str(&record_delimiter);
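while buffer.len() >= CHUNK_SIZE {
// split_off panics off a char boundary; back up to the nearest one (assumes CHUNK_SIZE >= 4).
let mut cut = CHUNK_SIZE;
while !buffer.is_char_boundary(cut) {
cut -= 1;
}
let rest = buffer.split_off(cut);
chunks.push(buffer.into_bytes());
buffer = rest;
}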
}
if !buffer.is_empty() {
chunks.push(buffer.into_bytes());
}
Ok(chunks)
}
fn collect_json_chunks(
rows: &mut duckdb::Rows<'_>,
col_count: usize,
columns: &[String],
cfg: JsonOutputConfig,
) -> Result<Vec<Vec<u8>>, String> {
let record_delimiter = cfg.record_delimiter;
let mut chunks: Vec<Vec<u8>> = Vec::new();
let mut buffer = String::new();
while let Some(row) = rows
.next()
.map_err(|e| format!("SQL execution error: {}", e))?
{
let mut record: HashMap<String, serde_json::Value> = HashMap::with_capacity(col_count);
for i in 0..col_count {
let value = row
.get_ref(i)
.map_err(|e| format!("SQL execution error: {}", e))?;
let key = columns.get(i).cloned().unwrap_or_else(|| format!("_{}", i));
record.insert(key, value_ref_to_json(value));
}
let line = serde_json::to_string(&record)
.map_err(|e| format!("JSON output encoding failed: {}", e))?;
buffer.push_str(&line);
buffer.push_str(&record_delimiter);
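while buffer.len() >= CHUNK_SIZE {
// split_off panics off a char boundary; back up to the nearest one (assumes CHUNK_SIZE >= 4).
let mut cut = CHUNK_SIZE;
while !buffer.is_char_boundary(cut) {
cut -= 1;
}
let rest = buffer.split_off(cut);
chunks.push(buffer.into_bytes());
buffer = rest;
}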
}
if !buffer.is_empty() {
chunks.push(buffer.into_bytes());
}
Ok(chunks)
}
fn value_ref_to_string(value: ValueRef<'_>) -> String {
match value {
ValueRef::Null => String::new(),
ValueRef::Boolean(v) => v.to_string(),
ValueRef::TinyInt(v) => v.to_string(),
ValueRef::SmallInt(v) => v.to_string(),
ValueRef::Int(v) => v.to_string(),
ValueRef::BigInt(v) => v.to_string(),
ValueRef::UTinyInt(v) => v.to_string(),
ValueRef::USmallInt(v) => v.to_string(),
ValueRef::UInt(v) => v.to_string(),
ValueRef::UBigInt(v) => v.to_string(),
ValueRef::Float(v) => v.to_string(),
ValueRef::Double(v) => v.to_string(),
ValueRef::Decimal(v) => v.to_string(),
ValueRef::Text(v) => String::from_utf8_lossy(v).into_owned(),
ValueRef::Blob(v) => base64::engine::general_purpose::STANDARD.encode(v),
_ => format!("{:?}", value),
}
}
fn value_ref_to_json(value: ValueRef<'_>) -> serde_json::Value {
match value {
ValueRef::Null => serde_json::Value::Null,
ValueRef::Boolean(v) => serde_json::Value::Bool(v),
ValueRef::TinyInt(v) => serde_json::json!(v),
ValueRef::SmallInt(v) => serde_json::json!(v),
ValueRef::Int(v) => serde_json::json!(v),
ValueRef::BigInt(v) => serde_json::json!(v),
ValueRef::UTinyInt(v) => serde_json::json!(v),
ValueRef::USmallInt(v) => serde_json::json!(v),
ValueRef::UInt(v) => serde_json::json!(v),
ValueRef::UBigInt(v) => serde_json::json!(v),
ValueRef::Float(v) => serde_json::json!(v),
ValueRef::Double(v) => serde_json::json!(v),
ValueRef::Decimal(v) => serde_json::Value::String(v.to_string()),
ValueRef::Text(v) => serde_json::Value::String(String::from_utf8_lossy(v).into_owned()),
ValueRef::Blob(v) => {
serde_json::Value::String(base64::engine::general_purpose::STANDARD.encode(v))
}
_ => serde_json::Value::String(format!("{:?}", value)),
}
}
fn require_xml_content_type(headers: &HeaderMap) -> Option<Response> {
let value = headers
.get("content-type")
.and_then(|v| v.to_str().ok())
.unwrap_or("")
.trim();
if value.is_empty() {
return None;
}
let lowered = value.to_ascii_lowercase();
if lowered.starts_with("application/xml") || lowered.starts_with("text/xml") {
return None;
}
Some(s3_error_response(S3Error::new(
S3ErrorCode::InvalidRequest,
"Content-Type must be application/xml or text/xml",
)))
}
fn s3_error_response(err: S3Error) -> Response {
let status =
StatusCode::from_u16(err.http_status()).unwrap_or(StatusCode::INTERNAL_SERVER_ERROR);
let resource = if err.resource.is_empty() {
"/".to_string()
} else {
err.resource.clone()
};
let body = err
.with_resource(resource)
.with_request_id(uuid::Uuid::new_v4().simple().to_string())
.to_xml();
(status, [("content-type", "application/xml")], body).into_response()
}
fn build_stats_xml(bytes_scanned: usize, bytes_returned: usize) -> String {
format!(
"<Stats><BytesScanned>{}</BytesScanned><BytesProcessed>{}</BytesProcessed><BytesReturned>{}</BytesReturned></Stats>",
bytes_scanned,
bytes_scanned,
bytes_returned
)
}
fn encode_select_event(event_type: &str, payload: &[u8]) -> Vec<u8> {
let mut headers = Vec::new();
headers.extend(encode_select_header(":event-type", event_type));
if event_type == "Records" {
headers.extend(encode_select_header(
":content-type",
"application/octet-stream",
));
} else if event_type == "Stats" {
headers.extend(encode_select_header(":content-type", "text/xml"));
}
headers.extend(encode_select_header(":message-type", "event"));
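let headers_len = headers.len() as u32;
// Frame size: 8-byte prelude (total length + headers length), 4-byte prelude CRC,
// headers, payload, then a 4-byte CRC over everything before it.
let total_len = 4 + 4 + 4 + headers.len() + payload.len() + 4;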
let mut message = Vec::with_capacity(total_len);
let mut prelude = Vec::with_capacity(8);
prelude.extend((total_len as u32).to_be_bytes());
prelude.extend(headers_len.to_be_bytes());
let prelude_crc = crc32(&prelude);
message.extend(prelude);
message.extend(prelude_crc.to_be_bytes());
message.extend(headers);
message.extend(payload);
let msg_crc = crc32(&message);
message.extend(msg_crc.to_be_bytes());
message
}
fn encode_select_header(name: &str, value: &str) -> Vec<u8> {
let name_bytes = name.as_bytes();
let value_bytes = value.as_bytes();
let mut header = Vec::with_capacity(1 + name_bytes.len() + 1 + 2 + value_bytes.len());
header.push(name_bytes.len() as u8);
header.extend(name_bytes);
// Header value type tag: 7 marks a string value in the event-stream encoding.
header.push(7);
header.extend((value_bytes.len() as u16).to_be_bytes());
header.extend(value_bytes);
header
}
fn crc32(data: &[u8]) -> u32 {
let mut hasher = Hasher::new();
hasher.update(data);
hasher.finalize()
}
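
The framing done by `encode_select_event` and `encode_select_header` above can be checked in isolation. The following is a minimal, dependency-free sketch, not the project's code: `crc32_ieee`, `encode_string_header`, `frame_message`, and `verify_frame` are hypothetical names introduced here, and the hand-written CRC assumes the `Hasher` used in the source computes the standard IEEE CRC-32.

```rust
use std::convert::TryInto;

// Bitwise CRC-32 (IEEE, reflected polynomial 0xEDB88320). Written out by hand
// only so this sketch needs no external crate.
fn crc32_ieee(data: &[u8]) -> u32 {
    let mut crc = 0xFFFF_FFFFu32;
    for &byte in data {
        crc ^= byte as u32;
        for _ in 0..8 {
            // mask is all ones when the low bit is set, zero otherwise.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

// One string-valued header:
// name length (u8) | name | type tag 7 | value length (u16 BE) | value.
fn encode_string_header(name: &str, value: &str) -> Vec<u8> {
    let mut out = Vec::with_capacity(1 + name.len() + 1 + 2 + value.len());
    out.push(name.len() as u8);
    out.extend_from_slice(name.as_bytes());
    out.push(7);
    out.extend_from_slice(&(value.len() as u16).to_be_bytes());
    out.extend_from_slice(value.as_bytes());
    out
}

// Frame layout:
// total_len (u32 BE) | headers_len (u32 BE) | prelude CRC | headers | payload | message CRC.
fn frame_message(headers: &[u8], payload: &[u8]) -> Vec<u8> {
    let total_len = 16 + headers.len() + payload.len();
    let mut msg = Vec::with_capacity(total_len);
    msg.extend_from_slice(&(total_len as u32).to_be_bytes());
    msg.extend_from_slice(&(headers.len() as u32).to_be_bytes());
    let prelude_crc = crc32_ieee(&msg);
    msg.extend_from_slice(&prelude_crc.to_be_bytes());
    msg.extend_from_slice(headers);
    msg.extend_from_slice(payload);
    let message_crc = crc32_ieee(&msg);
    msg.extend_from_slice(&message_crc.to_be_bytes());
    msg
}

// Returns the payload when both length fields and both CRCs are consistent.
fn verify_frame(msg: &[u8]) -> Option<&[u8]> {
    if msg.len() < 16 {
        return None;
    }
    let total_len = u32::from_be_bytes(msg[0..4].try_into().ok()?) as usize;
    let headers_len = u32::from_be_bytes(msg[4..8].try_into().ok()?) as usize;
    let prelude_crc = u32::from_be_bytes(msg[8..12].try_into().ok()?);
    if total_len != msg.len() || 16 + headers_len > total_len {
        return None;
    }
    if prelude_crc != crc32_ieee(&msg[..8]) {
        return None;
    }
    let body_end = total_len - 4;
    let message_crc = u32::from_be_bytes(msg[body_end..].try_into().ok()?);
    if message_crc != crc32_ieee(&msg[..body_end]) {
        return None;
    }
    Some(&msg[12 + headers_len..body_end])
}
```

A decoder like `verify_frame` is useful mainly in tests: encoding an event and decoding it again catches an off-by-four in the length fields, which a client would otherwise report only as a generic stream error.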

@@ -0,0 +1,226 @@
use std::collections::HashMap;
use std::error::Error as StdError;
use axum::extract::{Extension, Form, State};
use axum::http::{header, HeaderMap, StatusCode};
use axum::response::{IntoResponse, Redirect, Response};
use tera::Context;
use crate::middleware::session::SessionHandle;
use crate::session::FlashMessage;
use crate::state::AppState;
pub async fn login_page(
State(state): State<AppState>,
Extension(session): Extension<SessionHandle>,
) -> Response {
if session.read(|s| s.is_authenticated()) {
return Redirect::to("/ui/buckets").into_response();
}
let mut ctx = base_context(&session, None);
let flashed = session.write(|s| s.take_flash());
inject_flash(&mut ctx, flashed);
render(&state, "login.html", &ctx)
}
#[derive(serde::Deserialize)]
pub struct LoginForm {
pub access_key: String,
pub secret_key: String,
#[serde(default)]
pub csrf_token: String,
#[serde(default)]
pub next: Option<String>,
}
pub async fn login_submit(
State(state): State<AppState>,
Extension(session): Extension<SessionHandle>,
Form(form): Form<LoginForm>,
) -> Response {
let access_key = form.access_key.trim();
let secret_key = form.secret_key.trim();
match state.iam.get_secret_key(access_key) {
Some(expected) if constant_time_eq_str(&expected, secret_key) => {
let display = state
.iam
.get_user(access_key)
.await
.and_then(|v| {
v.get("display_name")
.and_then(|d| d.as_str())
.map(|s| s.to_string())
})
.unwrap_or_else(|| access_key.to_string());
session.write(|s| {
s.user_id = Some(access_key.to_string());
s.display_name = Some(display);
s.rotate_csrf();
s.push_flash("success", "Signed in successfully.");
});
let next = form
.next
.as_deref()
.filter(|n| is_allowed_redirect(n, &state.config.allowed_redirect_hosts))
.unwrap_or("/ui/buckets")
.to_string();
Redirect::to(&next).into_response()
}
_ => {
session.write(|s| {
s.push_flash("danger", "Invalid access key or secret key.");
});
Redirect::to("/login").into_response()
}
}
}
fn is_allowed_redirect(target: &str, allowed_hosts: &[String]) -> bool {
if target == "/ui" || target.starts_with("/ui/") {
return true;
}
let Some(rest) = target
.strip_prefix("https://")
.or_else(|| target.strip_prefix("http://"))
else {
return false;
};
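// The authority ends at the first '/', '?', '#', or '\' (browsers treat '\' like '/').
// Splitting on '/' alone would let "https://evil.test?@allowed.host" pass as "allowed.host".
let host = rest
.split(|c: char| matches!(c, '/' | '?' | '#' | '\\'))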
.next()
.unwrap_or_default()
.split('@')
.last()
.unwrap_or_default()
.split(':')
.next()
.unwrap_or_default()
.to_ascii_lowercase();
allowed_hosts
.iter()
.any(|allowed| allowed.eq_ignore_ascii_case(&host))
}
pub async fn logout(Extension(session): Extension<SessionHandle>) -> Response {
session.write(|s| {
s.user_id = None;
s.display_name = None;
s.flash.clear();
s.rotate_csrf();
s.push_flash("info", "Signed out.");
});
Redirect::to("/login").into_response()
}
pub async fn root_redirect() -> Response {
Redirect::to("/ui/buckets").into_response()
}
pub async fn not_found_page(
State(state): State<AppState>,
Extension(session): Extension<SessionHandle>,
) -> Response {
let ctx = base_context(&session, None);
let mut resp = render(&state, "404.html", &ctx);
*resp.status_mut() = StatusCode::NOT_FOUND;
resp
}
pub async fn require_login(
Extension(session): Extension<SessionHandle>,
req: axum::extract::Request,
next: axum::middleware::Next,
) -> Response {
if session.read(|s| s.is_authenticated()) {
return next.run(req).await;
}
let path = req.uri().path().to_string();
let query = req
.uri()
.query()
.map(|q| format!("?{}", q))
.unwrap_or_default();
let next_url = format!("{}{}", path, query);
let encoded =
percent_encoding::utf8_percent_encode(&next_url, percent_encoding::NON_ALPHANUMERIC)
.to_string();
let target = format!("/login?next={}", encoded);
Redirect::to(&target).into_response()
}
pub fn render(state: &AppState, template: &str, ctx: &Context) -> Response {
let engine = match &state.templates {
Some(e) => e,
None => {
return (
StatusCode::INTERNAL_SERVER_ERROR,
"Templates not configured",
)
.into_response();
}
};
match engine.render(template, ctx) {
Ok(html) => {
let mut headers = HeaderMap::new();
headers.insert(
header::CONTENT_TYPE,
"text/html; charset=utf-8".parse().unwrap(),
);
(StatusCode::OK, headers, html).into_response()
}
Err(e) => {
let mut detail = format!("{}", e);
let mut src = StdError::source(&e);
while let Some(s) = src {
detail.push_str(" | ");
detail.push_str(&s.to_string());
src = s.source();
}
tracing::error!("Template render failed ({}): {}", template, detail);
let fallback_ctx = Context::new();
let body = if template != "500.html" {
engine
.render("500.html", &fallback_ctx)
.unwrap_or_else(|_| "Internal Server Error".to_string())
} else {
"Internal Server Error".to_string()
};
let mut headers = HeaderMap::new();
headers.insert(
header::CONTENT_TYPE,
"text/html; charset=utf-8".parse().unwrap(),
);
(StatusCode::INTERNAL_SERVER_ERROR, headers, body).into_response()
}
}
}
pub fn base_context(session: &SessionHandle, endpoint: Option<&str>) -> Context {
let mut ctx = Context::new();
let snapshot = session.snapshot();
ctx.insert("csrf_token_value", &snapshot.csrf_token);
ctx.insert("is_authenticated", &snapshot.user_id.is_some());
ctx.insert("current_user", &snapshot.user_id);
ctx.insert("current_user_display_name", &snapshot.display_name);
ctx.insert("current_endpoint", &endpoint.unwrap_or(""));
ctx.insert("request_args", &HashMap::<String, String>::new());
ctx.insert("null", &serde_json::Value::Null);
ctx.insert("none", &serde_json::Value::Null);
ctx
}
pub fn inject_flash(ctx: &mut Context, flashed: Vec<FlashMessage>) {
ctx.insert("flashed_messages", &flashed);
}
fn constant_time_eq_str(a: &str, b: &str) -> bool {
if a.len() != b.len() {
return false;
}
subtle::ConstantTimeEq::ct_eq(a.as_bytes(), b.as_bytes()).into()
}
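
The host check behind `is_allowed_redirect` is easy to get subtly wrong, so a standalone sketch may help when testing it. This is one possible hardening under stated assumptions, not the project's implementation: `redirect_host` and `is_allowed` are hypothetical names, only the standard library is used, and IPv6 literals are deliberately not handled.

```rust
// Extract the host a browser would navigate to from an absolute http(s) URL.
// Returns None for anything that is not an http(s) URL with a non-empty host.
fn redirect_host(target: &str) -> Option<String> {
    let rest = target
        .strip_prefix("https://")
        .or_else(|| target.strip_prefix("http://"))?;
    // The authority stops at the first '/', '?', '#', or '\'.
    // Browsers treat '\' like '/', so it has to end the authority as well.
    let authority = rest
        .split(|c: char| matches!(c, '/' | '?' | '#' | '\\'))
        .next()
        .unwrap_or_default();
    // Drop optional userinfo ("user:pass@"), then the optional port.
    let host_port = authority.rsplit('@').next().unwrap_or_default();
    let host = host_port.split(':').next().unwrap_or_default();
    if host.is_empty() {
        None
    } else {
        Some(host.to_ascii_lowercase())
    }
}

// Same policy shape as the handler: local /ui paths are always fine,
// absolute URLs only when their host is on the allow-list.
fn is_allowed(target: &str, allowed_hosts: &[&str]) -> bool {
    if target == "/ui" || target.starts_with("/ui/") {
        return true;
    }
    match redirect_host(target) {
        Some(host) => allowed_hosts.iter().any(|a| a.eq_ignore_ascii_case(&host)),
        None => false,
    }
}
```

The cases worth covering are the ones where the text after an `@` looks like an allowed host but sits in the query, fragment, or behind a backslash; a full URL parser would be the more robust choice if a dependency is acceptable.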

File diff suppressed because it is too large

File diff suppressed because it is too large

View File

@@ -0,0 +1,657 @@
pub mod config;
pub mod handlers;
pub mod middleware;
pub mod services;
pub mod session;
pub mod state;
pub mod stores;
pub mod templates;
use axum::Router;
pub const SERVER_HEADER: &str = concat!("MyFSIO-Rust/", env!("CARGO_PKG_VERSION"));
pub fn create_ui_router(state: state::AppState) -> Router {
use axum::routing::{delete, get, post, put};
use handlers::ui;
use handlers::ui_api;
use handlers::ui_pages;
let protected = Router::new()
.route("/", get(ui::root_redirect))
.route("/ui", get(ui::root_redirect))
.route("/ui/", get(ui::root_redirect))
.route(
"/ui/buckets",
get(ui_pages::buckets_overview).post(ui_pages::create_bucket),
)
.route("/ui/buckets/create", post(ui_pages::create_bucket))
.route("/ui/buckets/{bucket_name}", get(ui_pages::bucket_detail))
.route(
"/ui/buckets/{bucket_name}/delete",
post(ui_pages::delete_bucket),
)
.route(
"/ui/buckets/{bucket_name}/versioning",
post(ui_pages::update_bucket_versioning),
)
.route(
"/ui/buckets/{bucket_name}/quota",
post(ui_pages::update_bucket_quota),
)
.route(
"/ui/buckets/{bucket_name}/encryption",
post(ui_pages::update_bucket_encryption),
)
.route(
"/ui/buckets/{bucket_name}/policy",
post(ui_pages::update_bucket_policy),
)
.route(
"/ui/buckets/{bucket_name}/replication",
post(ui_pages::update_bucket_replication),
)
.route(
"/ui/buckets/{bucket_name}/website",
post(ui_pages::update_bucket_website),
)
.route(
"/ui/buckets/{bucket_name}/upload",
post(ui_api::upload_object),
)
.route(
"/ui/buckets/{bucket_name}/multipart/initiate",
post(ui_api::initiate_multipart_upload),
)
.route(
"/ui/buckets/{bucket_name}/multipart/{upload_id}/part",
put(ui_api::upload_multipart_part),
)
.route(
"/ui/buckets/{bucket_name}/multipart/{upload_id}/parts",
put(ui_api::upload_multipart_part),
)
.route(
"/ui/buckets/{bucket_name}/multipart/{upload_id}/complete",
post(ui_api::complete_multipart_upload),
)
.route(
"/ui/buckets/{bucket_name}/multipart/{upload_id}/abort",
delete(ui_api::abort_multipart_upload),
)
.route(
"/ui/buckets/{bucket_name}/multipart/{upload_id}",
delete(ui_api::abort_multipart_upload),
)
.route(
"/ui/buckets/{bucket_name}/objects",
get(ui_api::list_bucket_objects),
)
.route(
"/ui/buckets/{bucket_name}/objects/stream",
get(ui_api::stream_bucket_objects),
)
.route(
"/ui/buckets/{bucket_name}/folders",
get(ui_api::list_bucket_folders),
)
.route(
"/ui/buckets/{bucket_name}/copy-targets",
get(ui_api::list_copy_targets),
)
.route(
"/ui/buckets/{bucket_name}/list-for-copy",
get(ui_api::list_copy_targets),
)
.route(
"/ui/buckets/{bucket_name}/objects/bulk-delete",
post(ui_api::bulk_delete_objects),
)
.route(
"/ui/buckets/{bucket_name}/objects/bulk-download",
post(ui_api::bulk_download_objects),
)
.route(
"/ui/buckets/{bucket_name}/objects/{*rest}",
get(ui_api::object_get_dispatch).post(ui_api::object_post_dispatch),
)
.route(
"/ui/buckets/{bucket_name}/acl",
get(ui_api::bucket_acl).post(ui_api::update_bucket_acl),
)
.route(
"/ui/buckets/{bucket_name}/cors",
get(ui_api::bucket_cors).post(ui_api::update_bucket_cors),
)
.route(
"/ui/buckets/{bucket_name}/lifecycle",
get(ui_api::bucket_lifecycle).post(ui_api::update_bucket_lifecycle),
)
.route(
"/ui/buckets/{bucket_name}/lifecycle/history",
get(ui_api::lifecycle_history),
)
.route(
"/ui/buckets/{bucket_name}/replication/status",
get(ui_api::replication_status),
)
.route(
"/ui/buckets/{bucket_name}/replication/failures",
get(ui_api::replication_failures).delete(ui_api::clear_replication_failures),
)
.route(
"/ui/buckets/{bucket_name}/replication/failures/retry",
post(ui_api::retry_replication_failure),
)
.route(
"/ui/buckets/{bucket_name}/replication/failures/retry-all",
post(ui_api::retry_all_replication_failures),
)
.route(
"/ui/buckets/{bucket_name}/replication/failures/dismiss",
delete(ui_api::dismiss_replication_failure),
)
.route(
"/ui/buckets/{bucket_name}/replication/failures/clear",
delete(ui_api::clear_replication_failures),
)
.route(
"/ui/buckets/{bucket_name}/replication/failures/{*rest}",
post(ui_api::retry_replication_failure_path)
.delete(ui_api::dismiss_replication_failure_path),
)
.route(
"/ui/buckets/{bucket_name}/bulk-delete",
post(ui_api::bulk_delete_objects),
)
.route(
"/ui/buckets/{bucket_name}/bulk-download",
post(ui_api::bulk_download_objects),
)
.route(
"/ui/buckets/{bucket_name}/archived",
get(ui_api::archived_objects),
)
.route(
"/ui/buckets/{bucket_name}/archived/{*rest}",
post(ui_api::archived_post_dispatch),
)
.route("/ui/iam", get(ui_pages::iam_dashboard))
.route("/ui/iam/users", post(ui_pages::create_iam_user))
.route("/ui/iam/users/{user_id}", post(ui_pages::update_iam_user))
.route(
"/ui/iam/users/{user_id}/delete",
post(ui_pages::delete_iam_user),
)
.route(
"/ui/iam/users/{user_id}/update",
post(ui_pages::update_iam_user),
)
.route(
"/ui/iam/users/{user_id}/policies",
post(ui_pages::update_iam_policies),
)
.route(
"/ui/iam/users/{user_id}/expiry",
post(ui_pages::update_iam_expiry),
)
.route(
"/ui/iam/users/{user_id}/rotate-secret",
post(ui_pages::rotate_iam_secret),
)
.route(
"/ui/iam/users/{user_id}/rotate",
post(ui_pages::rotate_iam_secret),
)
.route("/ui/connections/create", post(ui_pages::create_connection))
.route("/ui/connections/test", post(ui_api::test_connection))
.route(
"/ui/connections/{connection_id}",
post(ui_pages::update_connection),
)
.route(
"/ui/connections/{connection_id}/update",
post(ui_pages::update_connection),
)
.route(
"/ui/connections/{connection_id}/delete",
post(ui_pages::delete_connection),
)
.route(
"/ui/connections/{connection_id}/health",
get(ui_api::connection_health),
)
.route("/ui/sites", get(ui_pages::sites_dashboard))
.route("/ui/sites/local", post(ui_pages::update_local_site))
.route("/ui/sites/peers", post(ui_pages::add_peer_site))
.route(
"/ui/sites/peers/{site_id}/update",
post(ui_pages::update_peer_site),
)
.route(
"/ui/sites/peers/{site_id}/delete",
post(ui_pages::delete_peer_site),
)
.route("/ui/sites/peers/{site_id}/health", get(ui_api::peer_health))
.route(
"/ui/sites/peers/{site_id}/sync-stats",
get(ui_api::peer_sync_stats),
)
.route(
"/ui/sites/peers/{site_id}/bidirectional-status",
get(ui_api::peer_bidirectional_status),
)
.route(
"/ui/connections",
get(ui_pages::connections_dashboard).post(ui_pages::create_connection),
)
.route("/ui/metrics", get(ui_pages::metrics_dashboard))
.route(
"/ui/metrics/settings",
get(ui_api::metrics_settings).put(ui_api::update_metrics_settings),
)
.route("/ui/metrics/api", get(ui_api::metrics_api))
.route("/ui/metrics/history", get(ui_api::metrics_history))
.route("/ui/metrics/operations", get(ui_api::metrics_operations))
.route(
"/ui/metrics/operations/history",
get(ui_api::metrics_operations_history),
)
.route("/ui/system", get(ui_pages::system_dashboard))
.route("/ui/system/gc/status", get(ui_api::gc_status_ui))
.route("/ui/system/gc/run", post(ui_api::gc_run_ui))
.route("/ui/system/gc/history", get(ui_api::gc_history_ui))
.route(
"/ui/system/integrity/status",
get(ui_api::integrity_status_ui),
)
.route("/ui/system/integrity/run", post(ui_api::integrity_run_ui))
.route(
"/ui/system/integrity/history",
get(ui_api::integrity_history_ui),
)
.route(
"/ui/website-domains",
get(ui_pages::website_domains_dashboard),
)
.route(
"/ui/website-domains/create",
post(ui_pages::create_website_domain),
)
.route(
"/ui/website-domains/{domain}",
post(ui_pages::update_website_domain),
)
.route(
"/ui/website-domains/{domain}/update",
post(ui_pages::update_website_domain),
)
.route(
"/ui/website-domains/{domain}/delete",
post(ui_pages::delete_website_domain),
)
.route("/ui/replication/new", get(ui_pages::replication_wizard))
.route(
"/ui/replication/create",
post(ui_pages::create_peer_replication_rules_from_query),
)
.route(
"/ui/sites/peers/{site_id}/replication-rules",
post(ui_pages::create_peer_replication_rules),
)
.route("/ui/docs", get(ui_pages::docs_page))
.layer(axum::middleware::from_fn(ui::require_login));
let public = Router::new()
.route("/login", get(ui::login_page).post(ui::login_submit))
.route("/logout", post(ui::logout).get(ui::logout));
let session_state = middleware::SessionLayerState {
store: state.sessions.clone(),
secure: false,
};
let static_service = tower_http::services::ServeDir::new(&state.config.static_dir);
protected
.merge(public)
.fallback(ui::not_found_page)
.layer(axum::middleware::from_fn_with_state(
state.clone(),
middleware::csrf_layer,
))
.layer(axum::middleware::from_fn_with_state(
session_state,
middleware::session_layer,
))
.layer(axum::middleware::from_fn_with_state(
state.clone(),
middleware::ui_metrics_layer,
))
.with_state(state)
.nest_service("/static", static_service)
.layer(axum::middleware::from_fn(middleware::server_header))
.layer(tower_http::compression::CompressionLayer::new())
}
pub fn create_router(state: state::AppState) -> Router {
let default_rate_limit = middleware::RateLimitLayerState::with_per_op(
state.config.ratelimit_default,
state.config.ratelimit_list_buckets,
state.config.ratelimit_bucket_ops,
state.config.ratelimit_object_ops,
state.config.ratelimit_head_ops,
state.config.num_trusted_proxies,
);
let admin_rate_limit = middleware::RateLimitLayerState::new(
state.config.ratelimit_admin,
state.config.num_trusted_proxies,
);
let mut api_router = Router::new()
.route("/myfsio/health", axum::routing::get(handlers::health_check))
.route("/", axum::routing::get(handlers::list_buckets))
.route(
"/{bucket}",
axum::routing::put(handlers::create_bucket)
.get(handlers::get_bucket)
.delete(handlers::delete_bucket)
.head(handlers::head_bucket)
.post(handlers::post_bucket),
)
.route(
"/{bucket}/",
axum::routing::put(handlers::create_bucket)
.get(handlers::get_bucket)
.delete(handlers::delete_bucket)
.head(handlers::head_bucket)
.post(handlers::post_bucket),
)
.route(
"/{bucket}/{*key}",
axum::routing::put(handlers::put_object)
.get(handlers::get_object)
.delete(handlers::delete_object)
.head(handlers::head_object)
.post(handlers::post_object),
);
if state.config.kms_enabled {
api_router = api_router
.route(
"/kms/keys",
axum::routing::get(handlers::kms::list_keys).post(handlers::kms::create_key),
)
.route(
"/kms/keys/{key_id}",
axum::routing::get(handlers::kms::get_key).delete(handlers::kms::delete_key),
)
.route(
"/kms/keys/{key_id}/enable",
axum::routing::post(handlers::kms::enable_key),
)
.route(
"/kms/keys/{key_id}/disable",
axum::routing::post(handlers::kms::disable_key),
)
.route("/kms/encrypt", axum::routing::post(handlers::kms::encrypt))
.route("/kms/decrypt", axum::routing::post(handlers::kms::decrypt))
.route(
"/kms/generate-data-key",
axum::routing::post(handlers::kms::generate_data_key),
)
.route(
"/kms/generate-data-key-without-plaintext",
axum::routing::post(handlers::kms::generate_data_key_without_plaintext),
)
.route(
"/kms/re-encrypt",
axum::routing::post(handlers::kms::re_encrypt),
)
.route(
"/kms/generate-random",
axum::routing::post(handlers::kms::generate_random),
)
.route(
"/kms/client/generate-key",
axum::routing::post(handlers::kms::client_generate_key),
)
.route(
"/kms/client/encrypt",
axum::routing::post(handlers::kms::client_encrypt),
)
.route(
"/kms/client/decrypt",
axum::routing::post(handlers::kms::client_decrypt),
)
.route(
"/kms/materials/{key_id}",
axum::routing::post(handlers::kms::materials),
);
}
api_router = api_router
.layer(axum::middleware::from_fn_with_state(
state.clone(),
middleware::auth_layer,
))
.layer(axum::middleware::from_fn_with_state(
default_rate_limit,
middleware::rate_limit_layer,
));
let admin_router = Router::new()
.route(
"/admin/site",
axum::routing::get(handlers::admin::get_local_site)
.put(handlers::admin::update_local_site),
)
.route(
"/admin/sites",
axum::routing::get(handlers::admin::list_all_sites)
.post(handlers::admin::register_peer_site),
)
.route(
"/admin/sites/{site_id}",
axum::routing::get(handlers::admin::get_peer_site)
.put(handlers::admin::update_peer_site)
.delete(handlers::admin::delete_peer_site),
)
.route(
"/admin/sites/{site_id}/health",
axum::routing::get(handlers::admin::check_peer_health)
.post(handlers::admin::check_peer_health),
)
.route(
"/admin/sites/{site_id}/bidirectional-status",
axum::routing::get(handlers::admin::check_bidirectional_status),
)
.route(
"/admin/topology",
axum::routing::get(handlers::admin::get_topology),
)
.route(
"/admin/site/local",
axum::routing::get(handlers::admin::get_local_site)
.put(handlers::admin::update_local_site),
)
.route(
"/admin/site/all",
axum::routing::get(handlers::admin::list_all_sites),
)
.route(
"/admin/site/peers",
axum::routing::post(handlers::admin::register_peer_site),
)
.route(
"/admin/site/peers/{site_id}",
axum::routing::get(handlers::admin::get_peer_site)
.put(handlers::admin::update_peer_site)
.delete(handlers::admin::delete_peer_site),
)
.route(
"/admin/site/peers/{site_id}/health",
axum::routing::post(handlers::admin::check_peer_health),
)
.route(
"/admin/site/topology",
axum::routing::get(handlers::admin::get_topology),
)
.route(
"/admin/site/peers/{site_id}/bidirectional-status",
axum::routing::get(handlers::admin::check_bidirectional_status),
)
.route(
"/admin/iam/users",
axum::routing::get(handlers::admin::iam_list_users),
)
.route(
"/admin/iam/users/{identifier}",
axum::routing::get(handlers::admin::iam_get_user),
)
.route(
"/admin/iam/users/{identifier}/policies",
axum::routing::get(handlers::admin::iam_get_user_policies),
)
.route(
"/admin/iam/users/{identifier}/access-keys",
axum::routing::post(handlers::admin::iam_create_access_key),
)
.route(
"/admin/iam/users/{identifier}/keys",
axum::routing::post(handlers::admin::iam_create_access_key),
)
.route(
"/admin/iam/users/{identifier}/access-keys/{access_key}",
axum::routing::delete(handlers::admin::iam_delete_access_key),
)
.route(
"/admin/iam/users/{identifier}/keys/{access_key}",
axum::routing::delete(handlers::admin::iam_delete_access_key),
)
.route(
"/admin/iam/users/{identifier}/disable",
axum::routing::post(handlers::admin::iam_disable_user),
)
.route(
"/admin/iam/users/{identifier}/enable",
axum::routing::post(handlers::admin::iam_enable_user),
)
.route(
"/admin/website-domains",
axum::routing::get(handlers::admin::list_website_domains)
.post(handlers::admin::create_website_domain),
)
.route(
"/admin/website-domains/{domain}",
axum::routing::get(handlers::admin::get_website_domain)
.put(handlers::admin::update_website_domain)
.delete(handlers::admin::delete_website_domain),
)
.route(
"/admin/gc/status",
axum::routing::get(handlers::admin::gc_status),
)
.route(
"/admin/gc/run",
axum::routing::post(handlers::admin::gc_run),
)
.route(
"/admin/gc/history",
axum::routing::get(handlers::admin::gc_history),
)
.route(
"/admin/integrity/status",
axum::routing::get(handlers::admin::integrity_status),
)
.route(
"/admin/integrity/run",
axum::routing::post(handlers::admin::integrity_run),
)
.route(
"/admin/integrity/history",
axum::routing::get(handlers::admin::integrity_history),
)
.layer(axum::middleware::from_fn_with_state(
state.clone(),
middleware::auth_layer,
))
.layer(axum::middleware::from_fn_with_state(
admin_rate_limit,
middleware::rate_limit_layer,
));
let request_body_timeout =
std::time::Duration::from_secs(state.config.request_body_timeout_secs);
api_router
.merge(admin_router)
.layer(axum::middleware::from_fn(middleware::server_header))
.layer(cors_layer(&state.config))
.layer(tower_http::compression::CompressionLayer::new())
.layer(tower_http::timeout::RequestBodyTimeoutLayer::new(
request_body_timeout,
))
.with_state(state)
}
fn cors_layer(config: &config::ServerConfig) -> tower_http::cors::CorsLayer {
use axum::http::{HeaderName, HeaderValue, Method};
use tower_http::cors::{Any, CorsLayer};
let mut layer = CorsLayer::new();
if config.cors_origins.iter().any(|origin| origin == "*") {
layer = layer.allow_origin(Any);
} else {
let origins = config
.cors_origins
.iter()
.filter_map(|origin| HeaderValue::from_str(origin).ok())
.collect::<Vec<_>>();
if !origins.is_empty() {
layer = layer.allow_origin(origins);
}
}
let methods = config
.cors_methods
.iter()
.filter_map(|method| method.parse::<Method>().ok())
.collect::<Vec<_>>();
if !methods.is_empty() {
layer = layer.allow_methods(methods);
}
if config.cors_allow_headers.iter().any(|header| header == "*") {
layer = layer.allow_headers(Any);
} else {
let headers = config
.cors_allow_headers
.iter()
.filter_map(|header| header.parse::<HeaderName>().ok())
.collect::<Vec<_>>();
if !headers.is_empty() {
layer = layer.allow_headers(headers);
}
}
if config
.cors_expose_headers
.iter()
.any(|header| header == "*")
{
layer = layer.expose_headers(Any);
} else {
let headers = config
.cors_expose_headers
.iter()
.filter_map(|header| header.parse::<HeaderName>().ok())
.collect::<Vec<_>>();
if !headers.is_empty() {
layer = layer.expose_headers(headers);
}
}
layer
}

@@ -0,0 +1,557 @@
use clap::{Parser, Subcommand};
use myfsio_server::config::ServerConfig;
use myfsio_server::state::AppState;
#[derive(Parser)]
#[command(
name = "myfsio",
version,
about = "MyFSIO S3-compatible storage engine"
)]
struct Cli {
#[arg(long, help = "Validate configuration and exit")]
check_config: bool,
#[arg(long, help = "Show configuration summary and exit")]
show_config: bool,
#[arg(long, help = "Reset admin credentials and exit")]
reset_cred: bool,
#[command(subcommand)]
command: Option<Command>,
}
#[derive(Subcommand)]
enum Command {
Serve,
Version,
}
#[tokio::main]
async fn main() {
load_env_files();
init_tracing();
let cli = Cli::parse();
let config = ServerConfig::from_env();
if !config
.ratelimit_storage_uri
.eq_ignore_ascii_case("memory://")
{
tracing::warn!(
"RATE_LIMIT_STORAGE_URI={} is not supported yet; using in-memory rate limits",
config.ratelimit_storage_uri
);
}
if cli.reset_cred {
reset_admin_credentials(&config);
return;
}
if cli.check_config || cli.show_config {
print_config_summary(&config);
if cli.check_config {
let issues = validate_config(&config);
for issue in &issues {
println!("{issue}");
}
if issues.iter().any(|issue| issue.starts_with("CRITICAL:")) {
std::process::exit(1);
}
}
return;
}
match cli.command.unwrap_or(Command::Serve) {
Command::Version => {
println!("myfsio {}", env!("CARGO_PKG_VERSION"));
return;
}
Command::Serve => {}
}
ensure_iam_bootstrap(&config);
let bind_addr = config.bind_addr;
let ui_bind_addr = config.ui_bind_addr;
tracing::info!("MyFSIO Rust Engine starting — API on {}", bind_addr);
if config.ui_enabled {
tracing::info!("UI will bind on {}", ui_bind_addr);
}
tracing::info!("Storage root: {}", config.storage_root.display());
tracing::info!("Region: {}", config.region);
tracing::info!(
"Encryption: {}, KMS: {}, GC: {}, Lifecycle: {}, Integrity: {}, Metrics History: {}, Operation Metrics: {}, UI: {}",
config.encryption_enabled,
config.kms_enabled,
config.gc_enabled,
config.lifecycle_enabled,
config.integrity_enabled,
config.metrics_history_enabled,
config.metrics_enabled,
config.ui_enabled
);
let state = if config.encryption_enabled || config.kms_enabled {
AppState::new_with_encryption(config.clone()).await
} else {
AppState::new(config.clone())
};
let mut bg_handles: Vec<tokio::task::JoinHandle<()>> = Vec::new();
if let Some(ref gc) = state.gc {
bg_handles.push(gc.clone().start_background());
tracing::info!("GC background service started");
}
if let Some(ref integrity) = state.integrity {
bg_handles.push(integrity.clone().start_background());
tracing::info!("Integrity checker background service started");
}
if let Some(ref metrics) = state.metrics {
bg_handles.push(metrics.clone().start_background());
tracing::info!("Metrics collector background service started");
}
if let Some(ref system_metrics) = state.system_metrics {
bg_handles.push(system_metrics.clone().start_background());
tracing::info!("System metrics history collector started");
}
if config.lifecycle_enabled {
let lifecycle =
std::sync::Arc::new(myfsio_server::services::lifecycle::LifecycleService::new(
state.storage.clone(),
config.storage_root.clone(),
myfsio_server::services::lifecycle::LifecycleConfig {
interval_seconds: 3600,
max_history_per_bucket: config.lifecycle_max_history_per_bucket,
},
));
bg_handles.push(lifecycle.start_background());
tracing::info!("Lifecycle manager background service started");
}
if let Some(ref site_sync) = state.site_sync {
let worker = site_sync.clone();
bg_handles.push(tokio::spawn(async move {
worker.run().await;
}));
tracing::info!("Site sync worker started");
}
let ui_enabled = config.ui_enabled;
let api_app = myfsio_server::create_router(state.clone());
let ui_app = if ui_enabled {
Some(myfsio_server::create_ui_router(state.clone()))
} else {
None
};
let api_listener = match tokio::net::TcpListener::bind(bind_addr).await {
Ok(listener) => listener,
Err(err) => {
if err.kind() == std::io::ErrorKind::AddrInUse {
tracing::error!("API port already in use: {}", bind_addr);
} else {
tracing::error!("Failed to bind API {}: {}", bind_addr, err);
}
for handle in bg_handles {
handle.abort();
}
std::process::exit(1);
}
};
tracing::info!("API listening on {}", bind_addr);
let ui_listener = if ui_app.is_some() {
match tokio::net::TcpListener::bind(ui_bind_addr).await {
Ok(listener) => {
tracing::info!("UI listening on {}", ui_bind_addr);
Some(listener)
}
Err(err) => {
if err.kind() == std::io::ErrorKind::AddrInUse {
tracing::error!("UI port already in use: {}", ui_bind_addr);
} else {
tracing::error!("Failed to bind UI {}: {}", ui_bind_addr, err);
}
for handle in bg_handles {
handle.abort();
}
std::process::exit(1);
}
}
} else {
None
};
let shutdown = shutdown_signal_shared();
let api_shutdown = shutdown.clone();
let api_listener = axum::serve::ListenerExt::tap_io(api_listener, |stream| {
if let Err(err) = stream.set_nodelay(true) {
tracing::trace!("failed to set TCP_NODELAY on api socket: {}", err);
}
});
let api_task = tokio::spawn(async move {
axum::serve(
api_listener,
api_app.into_make_service_with_connect_info::<std::net::SocketAddr>(),
)
.with_graceful_shutdown(async move {
api_shutdown.notified().await;
})
.await
});
let ui_task = if let (Some(listener), Some(app)) = (ui_listener, ui_app) {
let ui_shutdown = shutdown.clone();
let listener = axum::serve::ListenerExt::tap_io(listener, |stream| {
if let Err(err) = stream.set_nodelay(true) {
tracing::trace!("failed to set TCP_NODELAY on ui socket: {}", err);
}
});
Some(tokio::spawn(async move {
axum::serve(listener, app)
.with_graceful_shutdown(async move {
ui_shutdown.notified().await;
})
.await
}))
} else {
None
};
tokio::signal::ctrl_c()
.await
.expect("Failed to listen for Ctrl+C");
tracing::info!("Shutdown signal received");
shutdown.notify_waiters();
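match api_task.await {
Ok(Ok(())) => {}
Ok(Err(err)) => tracing::error!("API server exited with error: {}", err),
Err(err) => tracing::error!("API server task failed: {}", err),
}
if let Some(task) = ui_task {
match task.await {
Ok(Ok(())) => {}
Ok(Err(err)) => tracing::error!("UI server exited with error: {}", err),
Err(err) => tracing::error!("UI server task failed: {}", err),
}
}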
for handle in bg_handles {
handle.abort();
}
}
fn print_config_summary(config: &ServerConfig) {
println!("MyFSIO Rust Configuration");
println!("Version: {}", env!("CARGO_PKG_VERSION"));
println!("API bind: {}", config.bind_addr);
println!("UI bind: {}", config.ui_bind_addr);
println!("UI enabled: {}", config.ui_enabled);
println!("Storage root: {}", config.storage_root.display());
println!("IAM config: {}", config.iam_config_path.display());
println!("Region: {}", config.region);
println!("Encryption enabled: {}", config.encryption_enabled);
println!(
"Encryption chunk size: {} bytes",
config.encryption_chunk_size_bytes
);
println!("KMS enabled: {}", config.kms_enabled);
println!(
"KMS data key bounds: {}-{} bytes",
config.kms_generate_data_key_min_bytes, config.kms_generate_data_key_max_bytes
);
println!("GC enabled: {}", config.gc_enabled);
println!(
"GC interval: {} hours, dry run: {}",
config.gc_interval_hours, config.gc_dry_run
);
println!("Integrity enabled: {}", config.integrity_enabled);
println!("Lifecycle enabled: {}", config.lifecycle_enabled);
println!(
"Lifecycle history limit: {}",
config.lifecycle_max_history_per_bucket
);
println!(
"Website hosting enabled: {}",
config.website_hosting_enabled
);
println!("Site sync enabled: {}", config.site_sync_enabled);
println!("API base URL: {}", config.api_base_url);
println!(
"Object key max: {} bytes, tag limit: {}",
config.object_key_max_length_bytes, config.object_tag_limit
);
println!(
"Rate limits: default {} per {}s, admin {} per {}s",
config.ratelimit_default.max_requests,
config.ratelimit_default.window_seconds,
config.ratelimit_admin.max_requests,
config.ratelimit_admin.window_seconds
);
println!(
"Metrics history enabled: {}",
config.metrics_history_enabled
);
println!("Operation metrics enabled: {}", config.metrics_enabled);
}
fn validate_config(config: &ServerConfig) -> Vec<String> {
let mut issues = Vec::new();
if config.ui_enabled && config.bind_addr == config.ui_bind_addr {
issues.push(
"CRITICAL: API and UI bind addresses cannot be identical when UI is enabled."
.to_string(),
);
}
if config.presigned_url_min_expiry > config.presigned_url_max_expiry {
issues.push("CRITICAL: PRESIGNED_URL_MIN_EXPIRY_SECONDS cannot exceed PRESIGNED_URL_MAX_EXPIRY_SECONDS.".to_string());
}
if config.encryption_chunk_size_bytes == 0 {
issues.push("CRITICAL: ENCRYPTION_CHUNK_SIZE_BYTES must be greater than zero.".to_string());
}
if config.kms_generate_data_key_min_bytes == 0 {
issues.push(
"CRITICAL: KMS_GENERATE_DATA_KEY_MIN_BYTES must be greater than zero.".to_string(),
);
}
if config.kms_generate_data_key_min_bytes > config.kms_generate_data_key_max_bytes {
issues.push("CRITICAL: KMS_GENERATE_DATA_KEY_MIN_BYTES cannot exceed KMS_GENERATE_DATA_KEY_MAX_BYTES.".to_string());
}
if config.gc_interval_hours <= 0.0 {
issues.push("CRITICAL: GC_INTERVAL_HOURS must be greater than zero.".to_string());
}
if config.bucket_config_cache_ttl_seconds < 0.0 {
issues.push("CRITICAL: BUCKET_CONFIG_CACHE_TTL_SECONDS cannot be negative.".to_string());
}
if !config
.ratelimit_storage_uri
.eq_ignore_ascii_case("memory://")
{
issues.push(format!(
"WARNING: RATE_LIMIT_STORAGE_URI={} is not supported yet; using in-memory limits.",
config.ratelimit_storage_uri
));
}
if let Err(err) = std::fs::create_dir_all(&config.storage_root) {
issues.push(format!(
"CRITICAL: Cannot create storage root {}: {}",
config.storage_root.display(),
err
));
}
if let Some(parent) = config.iam_config_path.parent() {
if let Err(err) = std::fs::create_dir_all(parent) {
issues.push(format!(
"CRITICAL: Cannot create IAM config directory {}: {}",
parent.display(),
err
));
}
}
if config.encryption_enabled && config.secret_key.is_none() {
issues.push(
"WARNING: ENCRYPTION_ENABLED=true but SECRET_KEY is not configured; secure-at-rest config encryption is unavailable.".to_string(),
);
}
if config.site_sync_enabled && !config.website_hosting_enabled {
issues.push(
"INFO: SITE_SYNC_ENABLED=true without WEBSITE_HOSTING_ENABLED; the two features are independent, so this is valid.".to_string(),
);
}
issues
}
fn init_tracing() {
use tracing_subscriber::EnvFilter;
let filter = EnvFilter::try_from_env("RUST_LOG")
.or_else(|_| {
EnvFilter::try_new(std::env::var("LOG_LEVEL").unwrap_or_else(|_| "INFO".to_string()))
})
.unwrap_or_else(|_| EnvFilter::new("INFO"));
tracing_subscriber::fmt().with_env_filter(filter).init();
}
fn shutdown_signal_shared() -> std::sync::Arc<tokio::sync::Notify> {
std::sync::Arc::new(tokio::sync::Notify::new())
}
/// Loads env files from the system-wide path, the working directory, and up to
/// four parent directories. Each file is applied with override semantics, so
/// values from files loaded later replace values set earlier.
fn load_env_files() {
let cwd = std::env::current_dir().ok();
let mut candidates: Vec<std::path::PathBuf> = Vec::new();
candidates.push(std::path::PathBuf::from("/opt/myfsio/myfsio.env"));
if let Some(ref dir) = cwd {
candidates.push(dir.join(".env"));
candidates.push(dir.join("myfsio.env"));
for ancestor in dir.ancestors().skip(1).take(4) {
candidates.push(ancestor.join(".env"));
candidates.push(ancestor.join("myfsio.env"));
}
}
let mut seen = std::collections::HashSet::new();
for path in candidates {
if !seen.insert(path.clone()) {
continue;
}
if path.is_file() {
match dotenvy::from_path_override(&path) {
Ok(()) => eprintln!("Loaded env file: {}", path.display()),
Err(e) => eprintln!("Failed to load env file {}: {}", path.display(), e),
}
}
}
}
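The candidate ordering built by `load_env_files` can be looked at in isolation. The following is a minimal std-only sketch, not the project's code: `env_candidates` and its `system_file` parameter are hypothetical names introduced here, and the function only computes the list without touching the filesystem or the environment.

```rust
use std::collections::HashSet;
use std::path::{Path, PathBuf};

/// Builds the ordered, de-duplicated list of env-file candidates for `cwd`.
/// `system_file` stands in for the fixed system-wide path (hypothetical parameter).
fn env_candidates(system_file: &Path, cwd: &Path) -> Vec<PathBuf> {
    let mut candidates = vec![system_file.to_path_buf()];
    // `ancestors()` yields `cwd` itself first, then each parent directory,
    // so taking five items covers the working directory plus four parents.
    for dir in cwd.ancestors().take(5) {
        candidates.push(dir.join(".env"));
        candidates.push(dir.join("myfsio.env"));
    }
    // Keep the first occurrence of each path, preserving order.
    let mut seen = HashSet::new();
    candidates.retain(|path| seen.insert(path.clone()));
    candidates
}
```

Because every file is loaded with override semantics, a variable defined in several files ends up with the value from the last file in this list, which is the most distant ancestor rather than the working directory.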
fn ensure_iam_bootstrap(config: &ServerConfig) {
let iam_path = &config.iam_config_path;
if iam_path.exists() {
return;
}
let access_key = std::env::var("ADMIN_ACCESS_KEY")
.ok()
.map(|s| s.trim().to_string())
.filter(|s| !s.is_empty())
.unwrap_or_else(|| format!("AK{}", uuid::Uuid::new_v4().simple()));
let secret_key = std::env::var("ADMIN_SECRET_KEY")
.ok()
.map(|s| s.trim().to_string())
.filter(|s| !s.is_empty())
.unwrap_or_else(|| format!("SK{}", uuid::Uuid::new_v4().simple()));
let user_id = format!("u-{}", &uuid::Uuid::new_v4().simple().to_string()[..16]);
let created_at = chrono::Utc::now().to_rfc3339();
let body = serde_json::json!({
"version": 2,
"users": [{
"user_id": user_id,
"display_name": "Local Admin",
"enabled": true,
"access_keys": [{
"access_key": access_key,
"secret_key": secret_key,
"status": "active",
"created_at": created_at,
}],
"policies": [{
"bucket": "*",
"actions": ["*"],
"prefix": "*",
}]
}]
});
let json = match serde_json::to_string_pretty(&body) {
Ok(s) => s,
Err(e) => {
tracing::error!("Failed to serialize IAM bootstrap config: {}", e);
return;
}
};
if let Some(parent) = iam_path.parent() {
if let Err(e) = std::fs::create_dir_all(parent) {
tracing::error!(
"Failed to create IAM config dir {}: {}",
parent.display(),
e
);
return;
}
}
if let Err(e) = std::fs::write(iam_path, json) {
tracing::error!(
"Failed to write IAM bootstrap config {}: {}",
iam_path.display(),
e
);
return;
}
tracing::info!("============================================================");
tracing::info!("MYFSIO - ADMIN CREDENTIALS INITIALIZED");
tracing::info!("============================================================");
tracing::info!("Access Key: {}", access_key);
tracing::info!("Secret Key: {}", secret_key);
tracing::info!("Saved to: {}", iam_path.display());
tracing::info!("============================================================");
}
fn reset_admin_credentials(config: &ServerConfig) {
if let Some(parent) = config.iam_config_path.parent() {
if let Err(err) = std::fs::create_dir_all(parent) {
eprintln!(
"Failed to create IAM config directory {}: {}",
parent.display(),
err
);
std::process::exit(1);
}
}
if config.iam_config_path.exists() {
let backup = config
.iam_config_path
.with_extension(format!("bak-{}", chrono::Utc::now().timestamp()));
if let Err(err) = std::fs::rename(&config.iam_config_path, &backup) {
eprintln!(
"Failed to back up existing IAM config {}: {}",
config.iam_config_path.display(),
err
);
std::process::exit(1);
}
println!("Backed up existing IAM config to {}", backup.display());
prune_iam_backups(&config.iam_config_path, 5);
}
ensure_iam_bootstrap(config);
println!("Admin credentials reset.");
}
fn prune_iam_backups(iam_path: &std::path::Path, keep: usize) {
let parent = match iam_path.parent() {
Some(p) => p,
None => return,
};
let stem = match iam_path.file_stem().and_then(|s| s.to_str()) {
Some(s) => s,
None => return,
};
let prefix = format!("{}.bak-", stem);
let entries = match std::fs::read_dir(parent) {
Ok(entries) => entries,
Err(_) => return,
};
let mut backups: Vec<(i64, std::path::PathBuf)> = entries
.filter_map(|e| e.ok())
.filter_map(|e| {
let path = e.path();
let name = path.file_name()?.to_str()?;
let rest = name.strip_prefix(&prefix)?;
let ts: i64 = rest.parse().ok()?;
Some((ts, path))
})
.collect();
backups.sort_by(|a, b| b.0.cmp(&a.0));
for (_, path) in backups.into_iter().skip(keep) {
if let Err(err) = std::fs::remove_file(&path) {
eprintln!(
"Failed to remove old IAM backup {}: {}",
path.display(),
err
);
} else {
println!("Pruned old IAM backup {}", path.display());
}
}
}
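The retention rule in `prune_iam_backups` (keep the newest `keep` timestamped backups, drop the rest) can be sketched as a pure function over file names. This is an illustration under assumptions: `backups_to_prune` is a hypothetical helper, and it takes names as strings instead of reading a directory.

```rust
/// Returns the backup file names that fall outside the newest `keep` entries.
/// Names are expected to look like `<stem>.bak-<unix timestamp>`.
fn backups_to_prune(names: &[&str], stem: &str, keep: usize) -> Vec<String> {
    let prefix = format!("{stem}.bak-");
    let mut backups: Vec<(i64, &str)> = names
        .iter()
        .filter_map(|name| {
            // Skip anything that is not a timestamped backup of this file.
            let ts = name.strip_prefix(prefix.as_str())?.parse::<i64>().ok()?;
            Some((ts, *name))
        })
        .collect();
    // Newest first, so everything after the first `keep` entries is stale.
    backups.sort_by(|a, b| b.0.cmp(&a.0));
    backups
        .into_iter()
        .skip(keep)
        .map(|(_, name)| name.to_string())
        .collect()
}
```

Files whose suffix is not a plain integer (for example a hand-made `iam.bak-old`) are never selected, which matches the parse-or-skip behaviour of the directory scan above.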

File diff suppressed because it is too large

@@ -0,0 +1,88 @@
mod auth;
pub mod ratelimit;
pub mod session;
pub(crate) mod sha_body;
pub use auth::auth_layer;
pub use ratelimit::{rate_limit_layer, RateLimitLayerState};
pub use session::{csrf_layer, session_layer, SessionHandle, SessionLayerState};
use axum::extract::{Request, State};
use axum::middleware::Next;
use axum::response::Response;
use std::time::Instant;
use crate::state::AppState;
pub async fn server_header(req: Request, next: Next) -> Response {
let mut resp = next.run(req).await;
resp.headers_mut()
.insert("server", crate::SERVER_HEADER.parse().unwrap());
resp
}
pub async fn ui_metrics_layer(State(state): State<AppState>, req: Request, next: Next) -> Response {
let metrics = match state.metrics.clone() {
Some(m) => m,
None => return next.run(req).await,
};
let start = Instant::now();
let method = req.method().clone();
let path = req.uri().path().to_string();
let endpoint_type = classify_ui_endpoint(&path);
let bytes_in = req
.headers()
.get(axum::http::header::CONTENT_LENGTH)
.and_then(|v| v.to_str().ok())
.and_then(|v| v.parse::<u64>().ok())
.unwrap_or(0);
let response = next.run(req).await;
let latency_ms = start.elapsed().as_secs_f64() * 1000.0;
let status = response.status().as_u16();
let bytes_out = response
.headers()
.get(axum::http::header::CONTENT_LENGTH)
.and_then(|v| v.to_str().ok())
.and_then(|v| v.parse::<u64>().ok())
.unwrap_or(0);
let error_code = if status >= 400 { Some("UIError") } else { None };
metrics.record_request(
method.as_str(),
endpoint_type,
status,
latency_ms,
bytes_in,
bytes_out,
error_code,
);
response
}
fn classify_ui_endpoint(path: &str) -> &'static str {
if path.contains("/upload") {
"ui_upload"
} else if path.starts_with("/ui/buckets/") {
"ui_bucket"
} else if path.starts_with("/ui/iam") {
"ui_iam"
} else if path.starts_with("/ui/sites") {
"ui_sites"
} else if path.starts_with("/ui/connections") {
"ui_connections"
} else if path.starts_with("/ui/metrics") {
"ui_metrics"
} else if path.starts_with("/ui/system") {
"ui_system"
} else if path.starts_with("/ui/website-domains") {
"ui_website_domains"
} else if path.starts_with("/ui/replication") {
"ui_replication"
} else if path.starts_with("/login") || path.starts_with("/logout") {
"ui_auth"
} else {
"ui_other"
}
}

@@ -0,0 +1,313 @@
use std::collections::HashMap;
use std::net::SocketAddr;
use std::sync::Arc;
use std::time::{Duration, Instant};
use axum::extract::{ConnectInfo, Request, State};
use axum::http::{header, Method, StatusCode};
use axum::middleware::Next;
use axum::response::{IntoResponse, Response};
use parking_lot::Mutex;
use crate::config::RateLimitSetting;
#[derive(Clone)]
pub struct RateLimitLayerState {
default_limiter: Arc<FixedWindowLimiter>,
list_buckets_limiter: Option<Arc<FixedWindowLimiter>>,
bucket_ops_limiter: Option<Arc<FixedWindowLimiter>>,
object_ops_limiter: Option<Arc<FixedWindowLimiter>>,
head_ops_limiter: Option<Arc<FixedWindowLimiter>>,
num_trusted_proxies: usize,
}
impl RateLimitLayerState {
pub fn new(setting: RateLimitSetting, num_trusted_proxies: usize) -> Self {
Self {
default_limiter: Arc::new(FixedWindowLimiter::new(setting)),
list_buckets_limiter: None,
bucket_ops_limiter: None,
object_ops_limiter: None,
head_ops_limiter: None,
num_trusted_proxies,
}
}
pub fn with_per_op(
default: RateLimitSetting,
list_buckets: RateLimitSetting,
bucket_ops: RateLimitSetting,
object_ops: RateLimitSetting,
head_ops: RateLimitSetting,
num_trusted_proxies: usize,
) -> Self {
Self {
default_limiter: Arc::new(FixedWindowLimiter::new(default)),
list_buckets_limiter: (list_buckets != default)
.then(|| Arc::new(FixedWindowLimiter::new(list_buckets))),
bucket_ops_limiter: (bucket_ops != default)
.then(|| Arc::new(FixedWindowLimiter::new(bucket_ops))),
object_ops_limiter: (object_ops != default)
.then(|| Arc::new(FixedWindowLimiter::new(object_ops))),
head_ops_limiter: (head_ops != default)
.then(|| Arc::new(FixedWindowLimiter::new(head_ops))),
num_trusted_proxies,
}
}
fn select_limiter(&self, req: &Request) -> &Arc<FixedWindowLimiter> {
let path = req.uri().path();
let method = req.method();
if path == "/" && *method == Method::GET {
if let Some(ref limiter) = self.list_buckets_limiter {
return limiter;
}
}
let segments: Vec<&str> = path
.trim_start_matches('/')
.split('/')
.filter(|s| !s.is_empty())
.collect();
if *method == Method::HEAD {
if let Some(ref limiter) = self.head_ops_limiter {
return limiter;
}
}
if segments.len() == 1 {
if let Some(ref limiter) = self.bucket_ops_limiter {
return limiter;
}
} else if segments.len() >= 2 {
if let Some(ref limiter) = self.object_ops_limiter {
return limiter;
}
}
&self.default_limiter
}
}
#[derive(Debug)]
struct FixedWindowLimiter {
setting: RateLimitSetting,
state: Mutex<LimiterState>,
}
#[derive(Debug)]
struct LimiterState {
entries: HashMap<String, LimitEntry>,
last_sweep: Instant,
}
#[derive(Debug, Clone, Copy)]
struct LimitEntry {
window_started: Instant,
count: u32,
}
const SWEEP_MIN_INTERVAL: Duration = Duration::from_secs(60);
const SWEEP_ENTRY_THRESHOLD: usize = 1024;
impl FixedWindowLimiter {
fn new(setting: RateLimitSetting) -> Self {
Self {
setting,
state: Mutex::new(LimiterState {
entries: HashMap::new(),
last_sweep: Instant::now(),
}),
}
}
/// Counts one request for `key` in the current window. Returns
/// `Err(retry_after_secs)` once the window's quota is exhausted.
fn check(&self, key: &str) -> Result<(), u64> {
let now = Instant::now();
let window = Duration::from_secs(self.setting.window_seconds.max(1));
let mut state = self.state.lock();
if state.entries.len() >= SWEEP_ENTRY_THRESHOLD
&& now.duration_since(state.last_sweep) >= SWEEP_MIN_INTERVAL
{
state
.entries
.retain(|_, entry| now.duration_since(entry.window_started) < window);
state.last_sweep = now;
}
let entry = state.entries.entry(key.to_string()).or_insert(LimitEntry {
window_started: now,
count: 0,
});
if now.duration_since(entry.window_started) >= window {
entry.window_started = now;
entry.count = 0;
}
if entry.count >= self.setting.max_requests {
let elapsed = now.duration_since(entry.window_started);
let retry_after = window.saturating_sub(elapsed).as_secs().max(1);
return Err(retry_after);
}
entry.count += 1;
Ok(())
}
}
pub async fn rate_limit_layer(
State(state): State<RateLimitLayerState>,
req: Request,
next: Next,
) -> Response {
let key = rate_limit_key(&req, state.num_trusted_proxies);
let limiter = state.select_limiter(&req);
match limiter.check(&key) {
Ok(()) => next.run(req).await,
Err(retry_after) => {
let resource = req.uri().path().to_string();
too_many_requests(retry_after, &resource)
}
}
}
fn too_many_requests(retry_after: u64, resource: &str) -> Response {
let request_id = uuid::Uuid::new_v4().simple().to_string();
let body = myfsio_xml::response::rate_limit_exceeded_xml(resource, &request_id);
let mut response = (
StatusCode::SERVICE_UNAVAILABLE,
[
(header::CONTENT_TYPE, "application/xml".to_string()),
(header::RETRY_AFTER, retry_after.to_string()),
],
body,
)
.into_response();
if let Ok(value) = request_id.parse() {
response
.headers_mut()
.insert("x-amz-request-id", value);
}
response
}
fn rate_limit_key(req: &Request, num_trusted_proxies: usize) -> String {
format!("ip:{}", client_ip(req, num_trusted_proxies))
}
/// Resolves the client address. Forwarded headers are honored only when
/// `num_trusted_proxies > 0`; otherwise the socket peer address is used.
fn client_ip(req: &Request, num_trusted_proxies: usize) -> String {
if num_trusted_proxies > 0 {
if let Some(value) = req
.headers()
.get("x-forwarded-for")
.and_then(|v| v.to_str().ok())
{
let parts = value
.split(',')
.map(|part| part.trim())
.filter(|part| !part.is_empty())
.collect::<Vec<_>>();
if parts.len() > num_trusted_proxies {
let index = parts.len() - num_trusted_proxies - 1;
return parts[index].to_string();
}
}
if let Some(value) = req.headers().get("x-real-ip").and_then(|v| v.to_str().ok()) {
if !value.trim().is_empty() {
return value.trim().to_string();
}
}
}
req.extensions()
.get::<ConnectInfo<SocketAddr>>()
.map(|ConnectInfo(addr)| addr.ip().to_string())
.unwrap_or_else(|| "unknown".to_string())
}
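The `X-Forwarded-For` selection in `client_ip` reduces to an index calculation. The sketch below mirrors the indexing used above (`len - trusted - 1`) as a standalone function. `forwarded_client` is a hypothetical name, and the assumption that the trusted proxies' own addresses occupy the rightmost entries of the header is carried over from the tests in this file, not from any standard.

```rust
/// Picks the client address from an `X-Forwarded-For` value, assuming the
/// rightmost `trusted` entries belong to proxies we control.
/// Returns `None` when the header cannot be trusted or has too few hops.
fn forwarded_client(value: &str, trusted: usize) -> Option<&str> {
    let hops: Vec<&str> = value
        .split(',')
        .map(str::trim)
        .filter(|hop| !hop.is_empty())
        .collect();
    // There must be at least one entry to the left of the trusted hops.
    if trusted == 0 || hops.len() <= trusted {
        return None;
    }
    Some(hops[hops.len() - trusted - 1])
}
```

When this returns `None`, the caller is expected to fall back to another source such as the socket peer address, as `client_ip` does.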
#[cfg(test)]
mod tests {
use super::*;
use axum::body::Body;
#[test]
fn honors_trusted_proxy_count_for_forwarded_for() {
let req = Request::builder()
.header("x-forwarded-for", "198.51.100.1, 10.0.0.1, 10.0.0.2")
.body(Body::empty())
.unwrap();
assert_eq!(rate_limit_key(&req, 2), "ip:198.51.100.1");
assert_eq!(rate_limit_key(&req, 1), "ip:10.0.0.1");
}
#[test]
fn falls_back_to_connect_info_when_forwarded_for_has_too_few_hops() {
let mut req = Request::builder()
.header("x-forwarded-for", "198.51.100.1")
.body(Body::empty())
.unwrap();
req.extensions_mut()
.insert(ConnectInfo(SocketAddr::from(([203, 0, 113, 9], 443))));
assert_eq!(rate_limit_key(&req, 2), "ip:203.0.113.9");
}
#[test]
fn ignores_forwarded_headers_when_no_proxies_are_trusted() {
let mut req = Request::builder()
.header("x-forwarded-for", "198.51.100.1")
.header("x-real-ip", "198.51.100.2")
.body(Body::empty())
.unwrap();
req.extensions_mut()
.insert(ConnectInfo(SocketAddr::from(([203, 0, 113, 9], 443))));
assert_eq!(rate_limit_key(&req, 0), "ip:203.0.113.9");
}
#[test]
fn uses_connect_info_for_direct_clients() {
let mut req = Request::builder().body(Body::empty()).unwrap();
req.extensions_mut()
.insert(ConnectInfo(SocketAddr::from(([203, 0, 113, 10], 443))));
assert_eq!(rate_limit_key(&req, 0), "ip:203.0.113.10");
}
#[test]
fn fixed_window_rejects_after_quota() {
let limiter = FixedWindowLimiter::new(RateLimitSetting::new(2, 60));
assert!(limiter.check("k").is_ok());
assert!(limiter.check("k").is_ok());
assert!(limiter.check("k").is_err());
}
#[test]
fn sweep_removes_expired_entries() {
let limiter = FixedWindowLimiter::new(RateLimitSetting::new(10, 1));
let far_past = Instant::now() - (SWEEP_MIN_INTERVAL + Duration::from_secs(5));
{
let mut state = limiter.state.lock();
for i in 0..(SWEEP_ENTRY_THRESHOLD + 1024) {
state.entries.insert(
format!("stale-{}", i),
LimitEntry {
window_started: far_past,
count: 5,
},
);
}
state.last_sweep = far_past;
}
let seeded = limiter.state.lock().entries.len();
assert_eq!(seeded, SWEEP_ENTRY_THRESHOLD + 1024);
assert!(limiter.check("fresh").is_ok());
let remaining = limiter.state.lock().entries.len();
assert_eq!(
remaining, 1,
"expected sweep to leave only the fresh entry, got {}",
remaining
);
}
}
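The fixed-window logic in `FixedWindowLimiter::check` is easier to exercise when the clock is passed in. The following is a simplified std-only sketch under assumptions: `WindowLimiter` and `check_at` are hypothetical names, there is no locking, and there is no sweep of stale entries.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Single-threaded fixed-window counter; a shared limiter would wrap this in a mutex.
struct WindowLimiter {
    max_requests: u32,
    window: Duration,
    entries: HashMap<String, (Instant, u32)>, // key -> (window start, count)
}

impl WindowLimiter {
    fn new(max_requests: u32, window: Duration) -> Self {
        Self { max_requests, window, entries: HashMap::new() }
    }

    /// Records one request for `key` at time `now`.
    /// Returns `Err(seconds)` as a retry hint once the quota is used up.
    fn check_at(&mut self, key: &str, now: Instant) -> Result<(), u64> {
        let entry = self.entries.entry(key.to_string()).or_insert((now, 0));
        // Start a fresh window once the previous one has fully elapsed.
        if now.duration_since(entry.0) >= self.window {
            *entry = (now, 0);
        }
        if entry.1 >= self.max_requests {
            let remaining = self.window.saturating_sub(now.duration_since(entry.0));
            // Never report zero seconds, so clients always back off a little.
            return Err(remaining.as_secs().max(1));
        }
        entry.1 += 1;
        Ok(())
    }
}
```

Passing `now` explicitly lets a test step through a window boundary without sleeping. A fixed window can admit up to twice the quota across a boundary, which is the usual trade-off for its simplicity.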

@@ -0,0 +1,257 @@
use std::sync::Arc;
use axum::extract::{Request, State};
use axum::http::{header, HeaderValue, StatusCode};
use axum::middleware::Next;
use axum::response::{IntoResponse, Response};
use cookie::{Cookie, SameSite};
use parking_lot::Mutex;
use crate::session::{
csrf_tokens_match, SessionData, SessionStore, CSRF_FIELD_NAME, CSRF_HEADER_NAME,
SESSION_COOKIE_NAME,
};
#[derive(Clone)]
pub struct SessionLayerState {
pub store: Arc<SessionStore>,
pub secure: bool,
}
#[derive(Clone)]
pub struct SessionHandle {
pub id: String,
inner: Arc<Mutex<SessionData>>,
dirty: Arc<Mutex<bool>>,
}
impl SessionHandle {
pub fn new(id: String, data: SessionData) -> Self {
Self {
id,
inner: Arc::new(Mutex::new(data)),
dirty: Arc::new(Mutex::new(false)),
}
}
pub fn read<R>(&self, f: impl FnOnce(&SessionData) -> R) -> R {
let guard = self.inner.lock();
f(&guard)
}
pub fn write<R>(&self, f: impl FnOnce(&mut SessionData) -> R) -> R {
let mut guard = self.inner.lock();
let out = f(&mut guard);
*self.dirty.lock() = true;
out
}
pub fn snapshot(&self) -> SessionData {
self.inner.lock().clone()
}
pub fn is_dirty(&self) -> bool {
*self.dirty.lock()
}
}
pub async fn session_layer(
State(state): State<SessionLayerState>,
mut req: Request,
next: Next,
) -> Response {
let cookie_id = extract_session_cookie(&req);
let (session_id, session_data, is_new) =
match cookie_id.and_then(|id| state.store.get(&id).map(|data| (id.clone(), data))) {
Some((id, data)) => (id, data, false),
None => {
let (id, data) = state.store.create();
(id, data, true)
}
};
let handle = SessionHandle::new(session_id.clone(), session_data);
req.extensions_mut().insert(handle.clone());
let mut resp = next.run(req).await;
if handle.is_dirty() {
state.store.save(&handle.id, handle.snapshot());
}
if is_new {
let cookie = build_session_cookie(&session_id, state.secure);
if let Ok(value) = HeaderValue::from_str(&cookie.to_string()) {
resp.headers_mut().append(header::SET_COOKIE, value);
}
}
resp
}
pub async fn csrf_layer(
State(state): State<crate::state::AppState>,
req: Request,
next: Next,
) -> Response {
const CSRF_HEADER_ALIAS: &str = "x-csrftoken";
let method = req.method().clone();
let needs_check = matches!(
method,
axum::http::Method::POST
| axum::http::Method::PUT
| axum::http::Method::PATCH
| axum::http::Method::DELETE
);
if !needs_check {
return next.run(req).await;
}
let is_ui = req.uri().path().starts_with("/ui/")
|| req.uri().path() == "/ui"
|| req.uri().path() == "/login"
|| req.uri().path() == "/logout";
if !is_ui {
return next.run(req).await;
}
let handle = match req.extensions().get::<SessionHandle>() {
Some(h) => h.clone(),
None => return (StatusCode::FORBIDDEN, "Missing session").into_response(),
};
let expected = handle.read(|s| s.csrf_token.clone());
let header_token = req
.headers()
.get(CSRF_HEADER_NAME)
.or_else(|| req.headers().get(CSRF_HEADER_ALIAS))
.and_then(|v| v.to_str().ok())
.map(|s| s.to_string());
if let Some(token) = header_token.as_deref() {
if csrf_tokens_match(&expected, token) {
return next.run(req).await;
}
}
let content_type = req
.headers()
.get(header::CONTENT_TYPE)
.and_then(|v| v.to_str().ok())
.unwrap_or("")
.to_string();
let (parts, body) = req.into_parts();
// Buffers the entire request body; no size cap is applied at this layer.
let bytes = match axum::body::to_bytes(body, usize::MAX).await {
Ok(b) => b,
Err(_) => return (StatusCode::BAD_REQUEST, "Body read failed").into_response(),
};
let form_token = if content_type.starts_with("application/x-www-form-urlencoded") {
extract_form_token(&bytes)
} else if content_type.starts_with("multipart/form-data") {
extract_multipart_token(&content_type, &bytes)
} else {
None
};
if let Some(token) = form_token {
if csrf_tokens_match(&expected, &token) {
let req = Request::from_parts(parts, axum::body::Body::from(bytes));
return next.run(req).await;
}
}
tracing::warn!(
path = %parts.uri.path(),
content_type = %content_type,
expected_len = expected.len(),
header_present = header_token.is_some(),
"CSRF token mismatch"
);
let accept = parts
.headers
.get(header::ACCEPT)
.and_then(|v| v.to_str().ok())
.unwrap_or("");
let is_form_submit = content_type.starts_with("application/x-www-form-urlencoded")
|| content_type.starts_with("multipart/form-data");
let wants_json =
accept.contains("application/json") || content_type.starts_with("application/json");
if is_form_submit && !wants_json {
let ctx = crate::handlers::ui::base_context(&handle, None);
let mut resp = crate::handlers::ui::render(&state, "csrf_error.html", &ctx);
*resp.status_mut() = StatusCode::FORBIDDEN;
return resp;
}
(
StatusCode::FORBIDDEN,
[(header::CONTENT_TYPE, "application/json")],
r#"{"error":"Invalid CSRF token"}"#,
)
.into_response()
}
fn extract_multipart_token(content_type: &str, body: &[u8]) -> Option<String> {
let boundary = multer::parse_boundary(content_type).ok()?;
let delimiter = format!("\r\n--{}", boundary);
let text = std::str::from_utf8(body).ok()?;
let needle = "name=\"csrf_token\"";
let idx = text.find(needle)?;
let after = &text[idx + needle.len()..];
let body_start = after.find("\r\n\r\n")? + 4;
let tail = &after[body_start..];
// Boundaries often begin with dashes themselves, so match the full delimiter unmodified.
let end = tail
.find(&delimiter)
.or_else(|| tail.find("\r\n--"))
.unwrap_or(tail.len());
Some(tail[..end].trim().to_string())
}
fn extract_session_cookie(req: &Request) -> Option<String> {
let raw = req.headers().get(header::COOKIE)?.to_str().ok()?;
for pair in raw.split(';') {
if let Ok(cookie) = Cookie::parse(pair.trim().to_string()) {
if cookie.name() == SESSION_COOKIE_NAME {
return Some(cookie.value().to_string());
}
}
}
None
}
fn build_session_cookie(id: &str, secure: bool) -> Cookie<'static> {
let mut cookie = Cookie::new(SESSION_COOKIE_NAME, id.to_string());
cookie.set_http_only(true);
cookie.set_same_site(SameSite::Lax);
cookie.set_secure(secure);
cookie.set_path("/");
cookie
}
fn extract_form_token(body: &[u8]) -> Option<String> {
let text = std::str::from_utf8(body).ok()?;
let prefix = format!("{}=", CSRF_FIELD_NAME);
for pair in text.split('&') {
if let Some(rest) = pair.strip_prefix(&prefix) {
return urldecode(rest);
}
}
None
}
fn urldecode(s: &str) -> Option<String> {
percent_encoding::percent_decode_str(&s.replace('+', " "))
.decode_utf8()
.ok()
.map(|c| c.into_owned())
}

@@ -0,0 +1,107 @@
use axum::body::Body;
use bytes::Bytes;
use http_body::{Body as HttpBody, Frame};
use sha2::{Digest, Sha256};
use std::error::Error;
use std::fmt;
use std::pin::Pin;
use std::task::{Context, Poll};
#[derive(Debug)]
struct Sha256MismatchError {
expected: String,
computed: String,
}
impl Sha256MismatchError {
fn message(&self) -> String {
format!(
"The x-amz-content-sha256 you specified did not match what we received (expected {}, computed {})",
self.expected, self.computed
)
}
}
impl fmt::Display for Sha256MismatchError {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
write!(
f,
"XAmzContentSHA256Mismatch: expected {}, computed {}",
self.expected, self.computed
)
}
}
impl Error for Sha256MismatchError {}
pub struct Sha256VerifyBody {
inner: Body,
expected: String,
hasher: Option<Sha256>,
}
impl Sha256VerifyBody {
pub fn new(inner: Body, expected_hex: String) -> Self {
Self {
inner,
expected: expected_hex.to_ascii_lowercase(),
hasher: Some(Sha256::new()),
}
}
}
impl HttpBody for Sha256VerifyBody {
type Data = Bytes;
type Error = Box<dyn std::error::Error + Send + Sync>;
fn poll_frame(
mut self: Pin<&mut Self>,
cx: &mut Context<'_>,
) -> Poll<Option<Result<Frame<Self::Data>, Self::Error>>> {
let this = self.as_mut().get_mut();
match Pin::new(&mut this.inner).poll_frame(cx) {
Poll::Pending => Poll::Pending,
Poll::Ready(Some(Err(e))) => Poll::Ready(Some(Err(Box::new(e)))),
Poll::Ready(Some(Ok(frame))) => {
if let Some(data) = frame.data_ref() {
if let Some(h) = this.hasher.as_mut() {
h.update(data);
}
}
Poll::Ready(Some(Ok(frame)))
}
Poll::Ready(None) => {
if let Some(hasher) = this.hasher.take() {
let computed = hex::encode(hasher.finalize());
if computed != this.expected {
return Poll::Ready(Some(Err(Box::new(Sha256MismatchError {
expected: this.expected.clone(),
computed,
}))));
}
}
Poll::Ready(None)
}
}
}
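fn is_end_stream(&self) -> bool {
// Report end-of-stream only after the digest has been checked; otherwise a
// consumer that trusts this hint could skip the final poll and the mismatch error.
self.hasher.is_none() && self.inner.is_end_stream()
}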
fn size_hint(&self) -> http_body::SizeHint {
self.inner.size_hint()
}
}
pub fn is_hex_sha256(s: &str) -> bool {
s.len() == 64 && s.bytes().all(|b| b.is_ascii_hexdigit())
}
pub fn sha256_mismatch_message(err: &(dyn Error + 'static)) -> Option<String> {
if let Some(mismatch) = err.downcast_ref::<Sha256MismatchError>() {
return Some(mismatch.message());
}
err.source().and_then(sha256_mismatch_message)
}

@@ -0,0 +1,105 @@
use parking_lot::RwLock;
use serde::{Deserialize, Serialize};
use std::collections::HashMap;
use std::path::{Path, PathBuf};
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct LoggingConfiguration {
pub target_bucket: String,
#[serde(default)]
pub target_prefix: String,
#[serde(default = "default_enabled")]
pub enabled: bool,
}
fn default_enabled() -> bool {
true
}
#[derive(Serialize, Deserialize)]
struct StoredLoggingFile {
#[serde(rename = "LoggingEnabled")]
logging_enabled: Option<StoredLoggingEnabled>,
}
#[derive(Serialize, Deserialize)]
struct StoredLoggingEnabled {
#[serde(rename = "TargetBucket")]
target_bucket: String,
#[serde(rename = "TargetPrefix", default)]
target_prefix: String,
}
pub struct AccessLoggingService {
storage_root: PathBuf,
cache: RwLock<HashMap<String, Option<LoggingConfiguration>>>,
}
impl AccessLoggingService {
pub fn new(storage_root: &Path) -> Self {
Self {
storage_root: storage_root.to_path_buf(),
cache: RwLock::new(HashMap::new()),
}
}
fn config_path(&self, bucket: &str) -> PathBuf {
self.storage_root
.join(".myfsio.sys")
.join("buckets")
.join(bucket)
.join("logging.json")
}
pub fn get(&self, bucket: &str) -> Option<LoggingConfiguration> {
if let Some(cached) = self.cache.read().get(bucket).cloned() {
return cached;
}
let path = self.config_path(bucket);
let config = if path.exists() {
std::fs::read_to_string(&path)
.ok()
.and_then(|s| serde_json::from_str::<StoredLoggingFile>(&s).ok())
.and_then(|f| f.logging_enabled)
.map(|e| LoggingConfiguration {
target_bucket: e.target_bucket,
target_prefix: e.target_prefix,
enabled: true,
})
} else {
None
};
self.cache
.write()
.insert(bucket.to_string(), config.clone());
config
}
pub fn set(&self, bucket: &str, config: LoggingConfiguration) -> std::io::Result<()> {
let path = self.config_path(bucket);
if let Some(parent) = path.parent() {
std::fs::create_dir_all(parent)?;
}
let stored = StoredLoggingFile {
logging_enabled: Some(StoredLoggingEnabled {
target_bucket: config.target_bucket.clone(),
target_prefix: config.target_prefix.clone(),
}),
};
let json = serde_json::to_string_pretty(&stored).map_err(std::io::Error::other)?;
std::fs::write(&path, json)?;
self.cache.write().insert(bucket.to_string(), Some(config));
Ok(())
}
pub fn delete(&self, bucket: &str) {
let path = self.config_path(bucket);
if path.exists() {
let _ = std::fs::remove_file(&path);
}
self.cache.write().insert(bucket.to_string(), None);
}
}

@@ -0,0 +1,276 @@
use serde::{Deserialize, Serialize};
use serde_json::Value;
use std::collections::{HashMap, HashSet};
pub const ACL_METADATA_KEY: &str = "__acl__";
pub const GRANTEE_ALL_USERS: &str = "*";
pub const GRANTEE_AUTHENTICATED_USERS: &str = "authenticated";
const ACL_PERMISSION_FULL_CONTROL: &str = "FULL_CONTROL";
const ACL_PERMISSION_WRITE: &str = "WRITE";
const ACL_PERMISSION_WRITE_ACP: &str = "WRITE_ACP";
const ACL_PERMISSION_READ: &str = "READ";
const ACL_PERMISSION_READ_ACP: &str = "READ_ACP";
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq)]
pub struct AclGrant {
pub grantee: String,
pub permission: String,
}
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq)]
pub struct Acl {
pub owner: String,
#[serde(default)]
pub grants: Vec<AclGrant>,
}
impl Acl {
pub fn allowed_actions(
&self,
principal_id: Option<&str>,
is_authenticated: bool,
) -> HashSet<&'static str> {
let mut actions = HashSet::new();
if let Some(principal_id) = principal_id {
if principal_id == self.owner {
actions.extend(permission_to_actions(ACL_PERMISSION_FULL_CONTROL));
}
}
for grant in &self.grants {
if grant.grantee == GRANTEE_ALL_USERS {
actions.extend(permission_to_actions(&grant.permission));
} else if grant.grantee == GRANTEE_AUTHENTICATED_USERS && is_authenticated {
actions.extend(permission_to_actions(&grant.permission));
} else if let Some(principal_id) = principal_id {
if grant.grantee == principal_id {
actions.extend(permission_to_actions(&grant.permission));
}
}
}
actions
}
}
pub fn create_canned_acl(canned_acl: &str, owner: &str) -> Acl {
let owner_grant = AclGrant {
grantee: owner.to_string(),
permission: ACL_PERMISSION_FULL_CONTROL.to_string(),
};
match canned_acl {
"public-read" => Acl {
owner: owner.to_string(),
grants: vec![
owner_grant,
AclGrant {
grantee: GRANTEE_ALL_USERS.to_string(),
permission: ACL_PERMISSION_READ.to_string(),
},
],
},
"public-read-write" => Acl {
owner: owner.to_string(),
grants: vec![
owner_grant,
AclGrant {
grantee: GRANTEE_ALL_USERS.to_string(),
permission: ACL_PERMISSION_READ.to_string(),
},
AclGrant {
grantee: GRANTEE_ALL_USERS.to_string(),
permission: ACL_PERMISSION_WRITE.to_string(),
},
],
},
"authenticated-read" => Acl {
owner: owner.to_string(),
grants: vec![
owner_grant,
AclGrant {
grantee: GRANTEE_AUTHENTICATED_USERS.to_string(),
permission: ACL_PERMISSION_READ.to_string(),
},
],
},
// "private", "bucket-owner-read", "bucket-owner-full-control", and any
// unrecognized value all fall back to an owner-only ACL.
_ => Acl {
owner: owner.to_string(),
grants: vec![owner_grant],
},
}
}
pub fn acl_to_xml(acl: &Acl) -> String {
let mut xml = format!(
"<?xml version=\"1.0\" encoding=\"UTF-8\"?>\
<AccessControlPolicy xmlns=\"http://s3.amazonaws.com/doc/2006-03-01/\">\
<Owner><ID>{}</ID><DisplayName>{}</DisplayName></Owner>\
<AccessControlList>",
xml_escape(&acl.owner),
xml_escape(&acl.owner),
);
for grant in &acl.grants {
xml.push_str("<Grant>");
match grant.grantee.as_str() {
GRANTEE_ALL_USERS => {
xml.push_str(
"<Grantee xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" xsi:type=\"Group\">\
<URI>http://acs.amazonaws.com/groups/global/AllUsers</URI>\
</Grantee>",
);
}
GRANTEE_AUTHENTICATED_USERS => {
xml.push_str(
"<Grantee xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" xsi:type=\"Group\">\
<URI>http://acs.amazonaws.com/groups/global/AuthenticatedUsers</URI>\
</Grantee>",
);
}
other => {
xml.push_str(&format!(
"<Grantee xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" xsi:type=\"CanonicalUser\">\
<ID>{}</ID><DisplayName>{}</DisplayName>\
</Grantee>",
xml_escape(other),
xml_escape(other),
));
}
}
xml.push_str(&format!(
"<Permission>{}</Permission></Grant>",
xml_escape(&grant.permission)
));
}
xml.push_str("</AccessControlList></AccessControlPolicy>");
xml
}
pub fn acl_from_bucket_config(value: &Value) -> Option<Acl> {
match value {
Value::String(raw) => acl_from_xml(raw).or_else(|| serde_json::from_str(raw).ok()),
Value::Object(_) => serde_json::from_value(value.clone()).ok(),
_ => None,
}
}
pub fn acl_from_object_metadata(metadata: &HashMap<String, String>) -> Option<Acl> {
metadata
.get(ACL_METADATA_KEY)
.and_then(|raw| serde_json::from_str::<Acl>(raw).ok())
}
pub fn store_object_acl(metadata: &mut HashMap<String, String>, acl: &Acl) {
if let Ok(serialized) = serde_json::to_string(acl) {
metadata.insert(ACL_METADATA_KEY.to_string(), serialized);
}
}
fn acl_from_xml(xml: &str) -> Option<Acl> {
let doc = roxmltree::Document::parse(xml).ok()?;
let owner = doc
.descendants()
.find(|node| node.is_element() && node.tag_name().name() == "Owner")
.and_then(|node| {
node.children()
.find(|child| child.is_element() && child.tag_name().name() == "ID")
.and_then(|child| child.text())
})
.unwrap_or("myfsio")
.trim()
.to_string();
let mut grants = Vec::new();
for grant in doc
.descendants()
.filter(|node| node.is_element() && node.tag_name().name() == "Grant")
{
let permission = grant
.children()
.find(|child| child.is_element() && child.tag_name().name() == "Permission")
.and_then(|child| child.text())
.unwrap_or_default()
.trim()
.to_string();
if permission.is_empty() {
continue;
}
let grantee_node = grant
.children()
.find(|child| child.is_element() && child.tag_name().name() == "Grantee");
let grantee = grantee_node
.and_then(|node| {
let uri = node
.children()
.find(|child| child.is_element() && child.tag_name().name() == "URI")
.and_then(|child| child.text())
.map(|text| text.trim().to_string());
match uri.as_deref() {
Some("http://acs.amazonaws.com/groups/global/AllUsers") => {
Some(GRANTEE_ALL_USERS.to_string())
}
Some("http://acs.amazonaws.com/groups/global/AuthenticatedUsers") => {
Some(GRANTEE_AUTHENTICATED_USERS.to_string())
}
_ => node
.children()
.find(|child| child.is_element() && child.tag_name().name() == "ID")
.and_then(|child| child.text())
.map(|text| text.trim().to_string()),
}
})
.unwrap_or_default();
if grantee.is_empty() {
continue;
}
grants.push(AclGrant {
grantee,
permission,
});
}
Some(Acl { owner, grants })
}
fn permission_to_actions(permission: &str) -> &'static [&'static str] {
match permission {
ACL_PERMISSION_FULL_CONTROL => &["read", "write", "delete", "list", "share"],
ACL_PERMISSION_WRITE => &["write", "delete"],
ACL_PERMISSION_WRITE_ACP => &["share"],
ACL_PERMISSION_READ => &["read", "list"],
ACL_PERMISSION_READ_ACP => &["share"],
_ => &[],
}
}
fn xml_escape(s: &str) -> String {
s.replace('&', "&amp;")
.replace('<', "&lt;")
.replace('>', "&gt;")
.replace('"', "&quot;")
.replace('\'', "&apos;")
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn canned_acl_grants_public_read() {
let acl = create_canned_acl("public-read", "owner");
let actions = acl.allowed_actions(None, false);
assert!(actions.contains("read"));
assert!(actions.contains("list"));
assert!(!actions.contains("write"));
}
#[test]
fn xml_round_trip_preserves_grants() {
let acl = create_canned_acl("authenticated-read", "owner");
let parsed = acl_from_bucket_config(&Value::String(acl_to_xml(&acl))).unwrap();
assert_eq!(parsed.owner, "owner");
assert_eq!(parsed.grants.len(), 2);
assert!(parsed
.grants
.iter()
.any(|grant| grant.grantee == GRANTEE_AUTHENTICATED_USERS));
}
}

@@ -0,0 +1,315 @@
use serde_json::{json, Value};
use std::path::PathBuf;
use std::sync::Arc;
use std::time::Instant;
use tokio::sync::RwLock;
pub struct GcConfig {
pub interval_hours: f64,
pub temp_file_max_age_hours: f64,
pub multipart_max_age_days: u64,
pub lock_file_max_age_hours: f64,
pub dry_run: bool,
}
impl Default for GcConfig {
fn default() -> Self {
Self {
interval_hours: 6.0,
temp_file_max_age_hours: 24.0,
multipart_max_age_days: 7,
lock_file_max_age_hours: 1.0,
dry_run: false,
}
}
}
#[cfg(test)]
mod tests {
use super::*;
#[tokio::test]
async fn dry_run_reports_but_does_not_delete_temp_files() {
let tmp = tempfile::tempdir().unwrap();
let tmp_dir = tmp.path().join(".myfsio.sys").join("tmp");
std::fs::create_dir_all(&tmp_dir).unwrap();
let file_path = tmp_dir.join("stale.tmp");
std::fs::write(&file_path, b"temporary").unwrap();
tokio::time::sleep(std::time::Duration::from_millis(5)).await;
let service = GcService::new(
tmp.path().to_path_buf(),
GcConfig {
temp_file_max_age_hours: 0.0,
dry_run: true,
..GcConfig::default()
},
);
let result = service.run_now(false).await.unwrap();
assert_eq!(result["temp_files_deleted"], 1);
assert!(file_path.exists());
}
}
pub struct GcService {
storage_root: PathBuf,
config: GcConfig,
running: Arc<RwLock<bool>>,
started_at: Arc<RwLock<Option<Instant>>>,
history: Arc<RwLock<Vec<Value>>>,
history_path: PathBuf,
}
impl GcService {
pub fn new(storage_root: PathBuf, config: GcConfig) -> Self {
let history_path = storage_root
.join(".myfsio.sys")
.join("config")
.join("gc_history.json");
let history = if history_path.exists() {
std::fs::read_to_string(&history_path)
.ok()
.and_then(|s| serde_json::from_str::<Value>(&s).ok())
.and_then(|v| v.get("executions").and_then(|e| e.as_array().cloned()))
.unwrap_or_default()
} else {
Vec::new()
};
Self {
storage_root,
config,
running: Arc::new(RwLock::new(false)),
started_at: Arc::new(RwLock::new(None)),
history: Arc::new(RwLock::new(history)),
history_path,
}
}
pub async fn status(&self) -> Value {
let running = *self.running.read().await;
let scan_elapsed_seconds = self
.started_at
.read()
.await
.as_ref()
.map(|started| started.elapsed().as_secs_f64());
json!({
"enabled": true,
"running": running,
"scanning": running,
"scan_elapsed_seconds": scan_elapsed_seconds,
"interval_hours": self.config.interval_hours,
"temp_file_max_age_hours": self.config.temp_file_max_age_hours,
"multipart_max_age_days": self.config.multipart_max_age_days,
"lock_file_max_age_hours": self.config.lock_file_max_age_hours,
"dry_run": self.config.dry_run,
})
}
pub async fn history(&self) -> Value {
let history = self.history.read().await;
let mut executions: Vec<Value> = history.iter().cloned().collect();
executions.reverse();
json!({ "executions": executions })
}
pub async fn run_now(&self, dry_run: bool) -> Result<Value, String> {
{
let mut running = self.running.write().await;
if *running {
return Err("GC already running".to_string());
}
*running = true;
}
*self.started_at.write().await = Some(Instant::now());
let start = Instant::now();
let result = self.execute_gc(dry_run || self.config.dry_run).await;
let elapsed = start.elapsed().as_secs_f64();
*self.running.write().await = false;
*self.started_at.write().await = None;
let mut result_json = result.clone();
if let Some(obj) = result_json.as_object_mut() {
obj.insert("execution_time_seconds".to_string(), json!(elapsed));
}
let record = json!({
"timestamp": chrono::Utc::now().timestamp_millis() as f64 / 1000.0,
"dry_run": dry_run || self.config.dry_run,
"result": result_json,
});
{
let mut history = self.history.write().await;
history.push(record);
if history.len() > 50 {
let excess = history.len() - 50;
history.drain(..excess);
}
}
self.save_history().await;
Ok(result)
}
async fn execute_gc(&self, dry_run: bool) -> Value {
let mut temp_files_deleted = 0u64;
let mut temp_bytes_freed = 0u64;
let mut multipart_uploads_deleted = 0u64;
let mut lock_files_deleted = 0u64;
let mut empty_dirs_removed = 0u64;
let mut errors: Vec<String> = Vec::new();
let now = std::time::SystemTime::now();
let temp_max_age =
std::time::Duration::from_secs_f64(self.config.temp_file_max_age_hours * 3600.0);
let multipart_max_age =
std::time::Duration::from_secs(self.config.multipart_max_age_days * 86400);
let lock_max_age =
std::time::Duration::from_secs_f64(self.config.lock_file_max_age_hours * 3600.0);
let tmp_dir = self.storage_root.join(".myfsio.sys").join("tmp");
if tmp_dir.exists() {
match std::fs::read_dir(&tmp_dir) {
Ok(entries) => {
for entry in entries.flatten() {
if let Ok(metadata) = entry.metadata() {
if let Ok(modified) = metadata.modified() {
if let Ok(age) = now.duration_since(modified) {
if age > temp_max_age {
let size = metadata.len();
if !dry_run {
if let Err(e) = std::fs::remove_file(entry.path()) {
errors.push(format!(
"Failed to remove temp file: {}",
e
));
continue;
}
}
temp_files_deleted += 1;
temp_bytes_freed += size;
}
}
}
}
}
}
Err(e) => errors.push(format!("Failed to read tmp dir: {}", e)),
}
}
let multipart_dir = self.storage_root.join(".myfsio.sys").join("multipart");
if multipart_dir.exists() {
if let Ok(bucket_dirs) = std::fs::read_dir(&multipart_dir) {
for bucket_entry in bucket_dirs.flatten() {
if let Ok(uploads) = std::fs::read_dir(bucket_entry.path()) {
for upload in uploads.flatten() {
if let Ok(metadata) = upload.metadata() {
if let Ok(modified) = metadata.modified() {
if let Ok(age) = now.duration_since(modified) {
if age > multipart_max_age {
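if !dry_run {
// Record the failure and skip the counter, mirroring temp-file handling.
if let Err(e) = std::fs::remove_dir_all(upload.path()) {
errors.push(format!("Failed to remove multipart upload: {}", e));
continue;
}
}
multipart_uploads_deleted += 1;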
}
}
}
}
}
}
}
}
}
let buckets_dir = self.storage_root.join(".myfsio.sys").join("buckets");
if buckets_dir.exists() {
if let Ok(bucket_dirs) = std::fs::read_dir(&buckets_dir) {
for bucket_entry in bucket_dirs.flatten() {
let locks_dir = bucket_entry.path().join("locks");
if locks_dir.exists() {
if let Ok(locks) = std::fs::read_dir(&locks_dir) {
for lock in locks.flatten() {
if let Ok(metadata) = lock.metadata() {
if let Ok(modified) = metadata.modified() {
if let Ok(age) = now.duration_since(modified) {
if age > lock_max_age {
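if !dry_run {
// Record the failure and skip the counter, mirroring temp-file handling.
if let Err(e) = std::fs::remove_file(lock.path()) {
errors.push(format!("Failed to remove lock file: {}", e));
continue;
}
}
lock_files_deleted += 1;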
}
}
}
}
}
}
}
}
}
}
if !dry_run {
for dir in [&tmp_dir, &multipart_dir] {
if dir.exists() {
if let Ok(entries) = std::fs::read_dir(dir) {
for entry in entries.flatten() {
if entry.path().is_dir() {
if let Ok(mut contents) = std::fs::read_dir(entry.path()) {
if contents.next().is_none() {
// Count the directory only if it was actually removed.
if std::fs::remove_dir(entry.path()).is_ok() {
empty_dirs_removed += 1;
}
}
}
}
}
}
}
}
}
json!({
"temp_files_deleted": temp_files_deleted,
"temp_bytes_freed": temp_bytes_freed,
"multipart_uploads_deleted": multipart_uploads_deleted,
"lock_files_deleted": lock_files_deleted,
"empty_dirs_removed": empty_dirs_removed,
"errors": errors,
})
}
async fn save_history(&self) {
let history = self.history.read().await;
let data = json!({ "executions": *history });
if let Some(parent) = self.history_path.parent() {
let _ = std::fs::create_dir_all(parent);
}
let _ = std::fs::write(
&self.history_path,
serde_json::to_string_pretty(&data).unwrap_or_default(),
);
}
pub fn start_background(self: Arc<Self>) -> tokio::task::JoinHandle<()> {
let interval = std::time::Duration::from_secs_f64(self.config.interval_hours * 3600.0);
tokio::spawn(async move {
let mut timer = tokio::time::interval(interval);
timer.tick().await;
loop {
timer.tick().await;
tracing::info!("GC cycle starting");
match self.run_now(false).await {
Ok(result) => tracing::info!("GC cycle complete: {:?}", result),
Err(e) => tracing::warn!("GC cycle failed: {}", e),
}
}
})
}
}

@@ -0,0 +1,732 @@
use myfsio_common::constants::{
BUCKET_META_DIR, BUCKET_VERSIONS_DIR, INDEX_FILE, SYSTEM_BUCKETS_DIR, SYSTEM_ROOT,
};
use myfsio_storage::fs_backend::FsStorageBackend;
use serde_json::{json, Map, Value};
use std::collections::{HashMap, HashSet};
use std::path::{Path, PathBuf};
use std::sync::Arc;
use std::time::Instant;
use tokio::sync::RwLock;
const MAX_ISSUES: usize = 500;
const INTERNAL_FOLDERS: &[&str] = &[".meta", ".versions", ".multipart"];
pub struct IntegrityConfig {
pub interval_hours: f64,
pub batch_size: usize,
pub auto_heal: bool,
pub dry_run: bool,
}
impl Default for IntegrityConfig {
fn default() -> Self {
Self {
interval_hours: 24.0,
batch_size: 10_000,
auto_heal: false,
dry_run: false,
}
}
}
pub struct IntegrityService {
#[allow(dead_code)]
storage: Arc<FsStorageBackend>,
storage_root: PathBuf,
config: IntegrityConfig,
running: Arc<RwLock<bool>>,
started_at: Arc<RwLock<Option<Instant>>>,
history: Arc<RwLock<Vec<Value>>>,
history_path: PathBuf,
}
#[derive(Default)]
struct ScanState {
objects_scanned: u64,
buckets_scanned: u64,
corrupted_objects: u64,
orphaned_objects: u64,
phantom_metadata: u64,
stale_versions: u64,
etag_cache_inconsistencies: u64,
issues: Vec<Value>,
errors: Vec<String>,
}
impl ScanState {
fn batch_exhausted(&self, batch_size: usize) -> bool {
self.objects_scanned >= batch_size as u64
}
fn push_issue(&mut self, issue_type: &str, bucket: &str, key: &str, detail: String) {
if self.issues.len() < MAX_ISSUES {
self.issues.push(json!({
"issue_type": issue_type,
"bucket": bucket,
"key": key,
"detail": detail,
}));
}
}
fn into_json(self, elapsed: f64) -> Value {
json!({
"objects_scanned": self.objects_scanned,
"buckets_scanned": self.buckets_scanned,
"corrupted_objects": self.corrupted_objects,
"orphaned_objects": self.orphaned_objects,
"phantom_metadata": self.phantom_metadata,
"stale_versions": self.stale_versions,
"etag_cache_inconsistencies": self.etag_cache_inconsistencies,
"issues_healed": 0,
"issues": self.issues,
"errors": self.errors,
"execution_time_seconds": elapsed,
})
}
}
impl IntegrityService {
pub fn new(
storage: Arc<FsStorageBackend>,
storage_root: &Path,
config: IntegrityConfig,
) -> Self {
let history_path = storage_root
.join(SYSTEM_ROOT)
.join("config")
.join("integrity_history.json");
let history = if history_path.exists() {
std::fs::read_to_string(&history_path)
.ok()
.and_then(|s| serde_json::from_str::<Value>(&s).ok())
.and_then(|v| v.get("executions").and_then(|e| e.as_array().cloned()))
.unwrap_or_default()
} else {
Vec::new()
};
Self {
storage,
storage_root: storage_root.to_path_buf(),
config,
running: Arc::new(RwLock::new(false)),
started_at: Arc::new(RwLock::new(None)),
history: Arc::new(RwLock::new(history)),
history_path,
}
}
pub async fn status(&self) -> Value {
let running = *self.running.read().await;
let scan_elapsed_seconds = self
.started_at
.read()
.await
.as_ref()
.map(|started| started.elapsed().as_secs_f64());
json!({
"enabled": true,
"running": running,
"scanning": running,
"scan_elapsed_seconds": scan_elapsed_seconds,
"interval_hours": self.config.interval_hours,
"batch_size": self.config.batch_size,
"auto_heal": self.config.auto_heal,
"dry_run": self.config.dry_run,
})
}
pub async fn history(&self) -> Value {
let history = self.history.read().await;
let mut executions: Vec<Value> = history.iter().cloned().collect();
executions.reverse();
json!({ "executions": executions })
}
pub async fn run_now(&self, dry_run: bool, auto_heal: bool) -> Result<Value, String> {
{
let mut running = self.running.write().await;
if *running {
return Err("Integrity check already running".to_string());
}
*running = true;
}
*self.started_at.write().await = Some(Instant::now());
let start = Instant::now();
let storage_root = self.storage_root.clone();
let batch_size = self.config.batch_size;
let result =
tokio::task::spawn_blocking(move || scan_all_buckets(&storage_root, batch_size))
.await
.unwrap_or_else(|e| {
let mut st = ScanState::default();
st.errors.push(format!("scan task failed: {}", e));
st
});
let elapsed = start.elapsed().as_secs_f64();
*self.running.write().await = false;
*self.started_at.write().await = None;
let result_json = result.into_json(elapsed);
let record = json!({
"timestamp": chrono::Utc::now().timestamp_millis() as f64 / 1000.0,
"dry_run": dry_run,
"auto_heal": auto_heal,
"result": result_json.clone(),
});
{
let mut history = self.history.write().await;
history.push(record);
if history.len() > 50 {
let excess = history.len() - 50;
history.drain(..excess);
}
}
self.save_history().await;
Ok(result_json)
}
async fn save_history(&self) {
let history = self.history.read().await;
let data = json!({ "executions": *history });
if let Some(parent) = self.history_path.parent() {
let _ = std::fs::create_dir_all(parent);
}
let _ = std::fs::write(
&self.history_path,
serde_json::to_string_pretty(&data).unwrap_or_default(),
);
}
pub fn start_background(self: Arc<Self>) -> tokio::task::JoinHandle<()> {
let interval = std::time::Duration::from_secs_f64(self.config.interval_hours * 3600.0);
tokio::spawn(async move {
let mut timer = tokio::time::interval(interval);
timer.tick().await;
loop {
timer.tick().await;
tracing::info!("Integrity check starting");
match self.run_now(false, false).await {
Ok(result) => tracing::info!("Integrity check complete: {:?}", result),
Err(e) => tracing::warn!("Integrity check failed: {}", e),
}
}
})
}
}
fn scan_all_buckets(storage_root: &Path, batch_size: usize) -> ScanState {
let mut state = ScanState::default();
let buckets = match list_bucket_names(storage_root) {
Ok(b) => b,
Err(e) => {
state.errors.push(format!("list buckets: {}", e));
return state;
}
};
for bucket in &buckets {
if state.batch_exhausted(batch_size) {
break;
}
state.buckets_scanned += 1;
let bucket_path = storage_root.join(bucket);
let meta_root = storage_root
.join(SYSTEM_ROOT)
.join(SYSTEM_BUCKETS_DIR)
.join(bucket)
.join(BUCKET_META_DIR);
let index_entries = collect_index_entries(&meta_root);
check_corrupted(&mut state, bucket, &bucket_path, &index_entries, batch_size);
check_phantom(&mut state, bucket, &bucket_path, &index_entries, batch_size);
check_orphaned(&mut state, bucket, &bucket_path, &index_entries, batch_size);
check_stale_versions(&mut state, storage_root, bucket, batch_size);
check_etag_cache(&mut state, storage_root, bucket, &index_entries, batch_size);
}
state
}
fn list_bucket_names(storage_root: &Path) -> std::io::Result<Vec<String>> {
let mut names = Vec::new();
if !storage_root.exists() {
return Ok(names);
}
for entry in std::fs::read_dir(storage_root)? {
let entry = entry?;
let name = entry.file_name().to_string_lossy().to_string();
if name == SYSTEM_ROOT {
continue;
}
if entry.file_type().map(|t| t.is_dir()).unwrap_or(false) {
names.push(name);
}
}
Ok(names)
}
#[allow(dead_code)]
struct IndexEntryInfo {
entry: Value,
index_file: PathBuf,
key_name: String,
}
fn collect_index_entries(meta_root: &Path) -> HashMap<String, IndexEntryInfo> {
let mut out: HashMap<String, IndexEntryInfo> = HashMap::new();
if !meta_root.exists() {
return out;
}
let mut stack: Vec<PathBuf> = vec![meta_root.to_path_buf()];
while let Some(dir) = stack.pop() {
let rd = match std::fs::read_dir(&dir) {
Ok(r) => r,
Err(_) => continue,
};
for entry in rd.flatten() {
let path = entry.path();
let ft = match entry.file_type() {
Ok(t) => t,
Err(_) => continue,
};
if ft.is_dir() {
stack.push(path);
continue;
}
if entry.file_name().to_string_lossy() != INDEX_FILE {
continue;
}
let rel_dir = match path.parent().and_then(|p| p.strip_prefix(meta_root).ok()) {
Some(p) => p.to_path_buf(),
None => continue,
};
let dir_prefix = if rel_dir.as_os_str().is_empty() {
String::new()
} else {
rel_dir
.components()
.map(|c| c.as_os_str().to_string_lossy().to_string())
.collect::<Vec<_>>()
.join("/")
};
let content = match std::fs::read_to_string(&path) {
Ok(c) => c,
Err(_) => continue,
};
let index_data: Map<String, Value> = match serde_json::from_str(&content) {
Ok(Value::Object(m)) => m,
_ => continue,
};
for (key_name, entry_val) in index_data {
let full_key = if dir_prefix.is_empty() {
key_name.clone()
} else {
format!("{}/{}", dir_prefix, key_name)
};
out.insert(
full_key,
IndexEntryInfo {
entry: entry_val,
index_file: path.clone(),
key_name,
},
);
}
}
}
out
}
fn stored_etag(entry: &Value) -> Option<String> {
entry
.get("metadata")
.and_then(|m| m.get("__etag__"))
.and_then(|v| v.as_str())
.map(|s| s.to_string())
}
fn check_corrupted(
state: &mut ScanState,
bucket: &str,
bucket_path: &Path,
entries: &HashMap<String, IndexEntryInfo>,
batch_size: usize,
) {
let mut keys: Vec<&String> = entries.keys().collect();
keys.sort();
for full_key in keys {
if state.batch_exhausted(batch_size) {
return;
}
let info = &entries[full_key];
let object_path = bucket_path.join(full_key);
if !object_path.exists() {
continue;
}
state.objects_scanned += 1;
let Some(stored) = stored_etag(&info.entry) else {
continue;
};
match myfsio_crypto::hashing::md5_file(&object_path) {
Ok(actual) => {
if actual != stored {
state.corrupted_objects += 1;
state.push_issue(
"corrupted_object",
bucket,
full_key,
format!("stored_etag={} actual_etag={}", stored, actual),
);
}
}
Err(e) => state
.errors
.push(format!("hash {}/{}: {}", bucket, full_key, e)),
}
}
}
fn check_phantom(
state: &mut ScanState,
bucket: &str,
bucket_path: &Path,
entries: &HashMap<String, IndexEntryInfo>,
batch_size: usize,
) {
let mut keys: Vec<&String> = entries.keys().collect();
keys.sort();
for full_key in keys {
if state.batch_exhausted(batch_size) {
return;
}
state.objects_scanned += 1;
let object_path = bucket_path.join(full_key);
if !object_path.exists() {
state.phantom_metadata += 1;
state.push_issue(
"phantom_metadata",
bucket,
full_key,
"metadata entry without file on disk".to_string(),
);
}
}
}
fn check_orphaned(
state: &mut ScanState,
bucket: &str,
bucket_path: &Path,
entries: &HashMap<String, IndexEntryInfo>,
batch_size: usize,
) {
let indexed: HashSet<&String> = entries.keys().collect();
let mut stack: Vec<(PathBuf, String)> = vec![(bucket_path.to_path_buf(), String::new())];
while let Some((dir, prefix)) = stack.pop() {
if state.batch_exhausted(batch_size) {
return;
}
let rd = match std::fs::read_dir(&dir) {
Ok(r) => r,
Err(_) => continue,
};
for entry in rd.flatten() {
if state.batch_exhausted(batch_size) {
return;
}
let name = entry.file_name().to_string_lossy().to_string();
let ft = match entry.file_type() {
Ok(t) => t,
Err(_) => continue,
};
if ft.is_dir() {
if prefix.is_empty() && INTERNAL_FOLDERS.contains(&name.as_str()) {
continue;
}
let new_prefix = if prefix.is_empty() {
name
} else {
format!("{}/{}", prefix, name)
};
stack.push((entry.path(), new_prefix));
} else if ft.is_file() {
let full_key = if prefix.is_empty() {
name
} else {
format!("{}/{}", prefix, name)
};
state.objects_scanned += 1;
if !indexed.contains(&full_key) {
state.orphaned_objects += 1;
state.push_issue(
"orphaned_object",
bucket,
&full_key,
"file exists without metadata entry".to_string(),
);
}
}
}
}
}
fn check_stale_versions(
state: &mut ScanState,
storage_root: &Path,
bucket: &str,
batch_size: usize,
) {
let versions_root = storage_root
.join(SYSTEM_ROOT)
.join(SYSTEM_BUCKETS_DIR)
.join(bucket)
.join(BUCKET_VERSIONS_DIR);
if !versions_root.exists() {
return;
}
let mut stack: Vec<PathBuf> = vec![versions_root.clone()];
while let Some(dir) = stack.pop() {
if state.batch_exhausted(batch_size) {
return;
}
let rd = match std::fs::read_dir(&dir) {
Ok(r) => r,
Err(_) => continue,
};
let mut bin_stems: HashMap<String, PathBuf> = HashMap::new();
let mut json_stems: HashMap<String, PathBuf> = HashMap::new();
let mut subdirs: Vec<PathBuf> = Vec::new();
for entry in rd.flatten() {
let ft = match entry.file_type() {
Ok(t) => t,
Err(_) => continue,
};
let path = entry.path();
if ft.is_dir() {
subdirs.push(path);
continue;
}
let name = entry.file_name().to_string_lossy().to_string();
if let Some(stem) = name.strip_suffix(".bin") {
bin_stems.insert(stem.to_string(), path);
} else if let Some(stem) = name.strip_suffix(".json") {
json_stems.insert(stem.to_string(), path);
}
}
for (stem, path) in &bin_stems {
if state.batch_exhausted(batch_size) {
return;
}
state.objects_scanned += 1;
if !json_stems.contains_key(stem) {
state.stale_versions += 1;
let key = path
.strip_prefix(&versions_root)
.map(|p| p.to_string_lossy().replace('\\', "/"))
.unwrap_or_else(|_| path.display().to_string());
state.push_issue(
"stale_version",
bucket,
&key,
"version data without manifest".to_string(),
);
}
}
for (stem, path) in &json_stems {
if state.batch_exhausted(batch_size) {
return;
}
state.objects_scanned += 1;
if !bin_stems.contains_key(stem) {
state.stale_versions += 1;
let key = path
.strip_prefix(&versions_root)
.map(|p| p.to_string_lossy().replace('\\', "/"))
.unwrap_or_else(|_| path.display().to_string());
state.push_issue(
"stale_version",
bucket,
&key,
"version manifest without data".to_string(),
);
}
}
stack.extend(subdirs);
}
}
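/// Compares each ETag in `etag_index.json` with the ETag stored in the metadata
/// index and reports mismatches. Keys missing from the index are ignored here.
fn check_etag_cache(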
state: &mut ScanState,
storage_root: &Path,
bucket: &str,
entries: &HashMap<String, IndexEntryInfo>,
batch_size: usize,
) {
let etag_index_path = storage_root
.join(SYSTEM_ROOT)
.join(SYSTEM_BUCKETS_DIR)
.join(bucket)
.join("etag_index.json");
if !etag_index_path.exists() {
return;
}
let cache: HashMap<String, Value> = match std::fs::read_to_string(&etag_index_path)
.ok()
.and_then(|s| serde_json::from_str(&s).ok())
{
Some(Value::Object(m)) => m.into_iter().collect(),
_ => return,
};
for (full_key, cached_val) in cache {
if state.batch_exhausted(batch_size) {
return;
}
state.objects_scanned += 1;
let Some(cached_etag) = cached_val.as_str() else {
continue;
};
let Some(info) = entries.get(&full_key) else {
continue;
};
let Some(stored) = stored_etag(&info.entry) else {
continue;
};
if cached_etag != stored {
state.etag_cache_inconsistencies += 1;
state.push_issue(
"etag_cache_inconsistency",
bucket,
&full_key,
format!("cached_etag={} index_etag={}", cached_etag, stored),
);
}
}
}
#[cfg(test)]
mod tests {
use super::*;
use std::fs;
fn md5_hex(bytes: &[u8]) -> String {
myfsio_crypto::hashing::md5_bytes(bytes)
}
fn write_index(meta_dir: &Path, entries: &[(&str, &str)]) {
fs::create_dir_all(meta_dir).unwrap();
let mut map = Map::new();
for (name, etag) in entries {
map.insert(
name.to_string(),
json!({ "metadata": { "__etag__": etag } }),
);
}
fs::write(
meta_dir.join(INDEX_FILE),
serde_json::to_string(&Value::Object(map)).unwrap(),
)
.unwrap();
}
#[test]
fn scan_detects_each_issue_type() {
let tmp = tempfile::tempdir().unwrap();
let root = tmp.path();
let bucket = "testbucket";
let bucket_path = root.join(bucket);
let meta_root = root
.join(SYSTEM_ROOT)
.join(SYSTEM_BUCKETS_DIR)
.join(bucket)
.join(BUCKET_META_DIR);
fs::create_dir_all(&bucket_path).unwrap();
let clean_bytes = b"clean file contents";
let clean_etag = md5_hex(clean_bytes);
fs::write(bucket_path.join("clean.txt"), clean_bytes).unwrap();
let corrupted_bytes = b"actual content";
fs::write(bucket_path.join("corrupted.txt"), corrupted_bytes).unwrap();
fs::write(bucket_path.join("orphan.txt"), b"no metadata").unwrap();
write_index(
&meta_root,
&[
("clean.txt", &clean_etag),
("corrupted.txt", "00000000000000000000000000000000"),
("phantom.txt", "deadbeefdeadbeefdeadbeefdeadbeef"),
],
);
let versions_root = root
.join(SYSTEM_ROOT)
.join(SYSTEM_BUCKETS_DIR)
.join(bucket)
.join(BUCKET_VERSIONS_DIR)
.join("someobject");
fs::create_dir_all(&versions_root).unwrap();
fs::write(versions_root.join("v1.bin"), b"orphan bin").unwrap();
fs::write(versions_root.join("v2.json"), b"{}").unwrap();
let etag_index = root
.join(SYSTEM_ROOT)
.join(SYSTEM_BUCKETS_DIR)
.join(bucket)
.join("etag_index.json");
fs::write(
&etag_index,
serde_json::to_string(&json!({ "clean.txt": "stale-cached-etag" })).unwrap(),
)
.unwrap();
let state = scan_all_buckets(root, 10_000);
assert_eq!(state.corrupted_objects, 1, "corrupted");
assert_eq!(state.phantom_metadata, 1, "phantom");
assert_eq!(state.orphaned_objects, 1, "orphaned");
assert_eq!(state.stale_versions, 2, "stale versions");
assert_eq!(state.etag_cache_inconsistencies, 1, "etag cache");
assert_eq!(state.buckets_scanned, 1);
assert!(
state.errors.is_empty(),
"unexpected errors: {:?}",
state.errors
);
}
#[test]
fn skips_system_root_as_bucket() {
let tmp = tempfile::tempdir().unwrap();
fs::create_dir_all(tmp.path().join(SYSTEM_ROOT).join("config")).unwrap();
let state = scan_all_buckets(tmp.path(), 100);
assert_eq!(state.buckets_scanned, 0);
}
}
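
The `.bin`/`.json` pairing rule used by `check_stale_versions` above can be looked at in isolation. The following is a minimal sketch, not the project's implementation: `unpaired_version_files` is a hypothetical helper that takes plain file names instead of walking a directory, and it assumes a version is healthy only when both `<stem>.bin` and `<stem>.json` are present.

```rust
use std::collections::BTreeSet;

/// Hypothetical helper: returns the file names that lack a partner with the
/// same stem and the other extension (`.bin` data vs `.json` manifest).
/// Names with any other extension are ignored. The result is sorted.
fn unpaired_version_files(names: &[&str]) -> Vec<String> {
    // Collect the stems seen for each extension.
    let bins: BTreeSet<&str> = names
        .iter()
        .filter_map(|name| name.strip_suffix(".bin"))
        .collect();
    let jsons: BTreeSet<&str> = names
        .iter()
        .filter_map(|name| name.strip_suffix(".json"))
        .collect();

    let mut unpaired: Vec<String> = Vec::new();
    // Version data without a manifest.
    unpaired.extend(bins.difference(&jsons).map(|stem| format!("{stem}.bin")));
    // Version manifest without data.
    unpaired.extend(jsons.difference(&bins).map(|stem| format!("{stem}.json")));
    unpaired.sort();
    unpaired
}
```

Calling it with `["v1.bin", "v2.json", "v3.bin", "v3.json"]` flags `v1.bin` and `v2.json`, which mirrors the two stale versions the test above expects.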

@@ -0,0 +1,637 @@
use chrono::{DateTime, Duration, Utc};
use myfsio_storage::fs_backend::FsStorageBackend;
use myfsio_storage::traits::StorageEngine;
use serde::{Deserialize, Serialize};
use serde_json::{json, Value};
use std::collections::VecDeque;
use std::path::{Path, PathBuf};
use std::sync::Arc;
use tokio::sync::RwLock;
pub struct LifecycleConfig {
pub interval_seconds: u64,
pub max_history_per_bucket: usize,
}
impl Default for LifecycleConfig {
fn default() -> Self {
Self {
interval_seconds: 3600,
max_history_per_bucket: 50,
}
}
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct LifecycleExecutionRecord {
pub timestamp: f64,
pub bucket_name: String,
pub objects_deleted: u64,
pub versions_deleted: u64,
pub uploads_aborted: u64,
#[serde(default)]
pub errors: Vec<String>,
pub execution_time_seconds: f64,
}
#[derive(Debug, Clone, Default)]
struct BucketLifecycleResult {
bucket_name: String,
objects_deleted: u64,
versions_deleted: u64,
uploads_aborted: u64,
errors: Vec<String>,
execution_time_seconds: f64,
}
#[derive(Debug, Clone, Default)]
struct ParsedLifecycleRule {
status: String,
prefix: String,
expiration_days: Option<u64>,
expiration_date: Option<DateTime<Utc>>,
noncurrent_days: Option<u64>,
abort_incomplete_multipart_days: Option<u64>,
}
pub struct LifecycleService {
storage: Arc<FsStorageBackend>,
storage_root: PathBuf,
config: LifecycleConfig,
running: Arc<RwLock<bool>>,
}
impl LifecycleService {
pub fn new(
storage: Arc<FsStorageBackend>,
storage_root: impl Into<PathBuf>,
config: LifecycleConfig,
) -> Self {
Self {
storage,
storage_root: storage_root.into(),
config,
running: Arc::new(RwLock::new(false)),
}
}
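/// Runs one lifecycle pass over all buckets and returns a JSON summary.
/// Returns an error if another pass is already in progress.
pub async fn run_cycle(&self) -> Result<Value, String> {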
{
let mut running = self.running.write().await;
if *running {
return Err("Lifecycle already running".to_string());
}
*running = true;
}
let result = self.evaluate_rules().await;
*self.running.write().await = false;
Ok(result)
}
async fn evaluate_rules(&self) -> Value {
let buckets = match self.storage.list_buckets().await {
Ok(buckets) => buckets,
Err(err) => return json!({ "error": err.to_string() }),
};
let mut bucket_results = Vec::new();
let mut total_objects_deleted = 0u64;
let mut total_versions_deleted = 0u64;
let mut total_uploads_aborted = 0u64;
let mut errors = Vec::new();
for bucket in &buckets {
let started_at = std::time::Instant::now();
let mut result = BucketLifecycleResult {
bucket_name: bucket.name.clone(),
..Default::default()
};
let config = match self.storage.get_bucket_config(&bucket.name).await {
Ok(config) => config,
Err(err) => {
result.errors.push(err.to_string());
result.execution_time_seconds = started_at.elapsed().as_secs_f64();
self.append_history(&result);
errors.extend(result.errors.clone());
bucket_results.push(result);
continue;
}
};
let Some(lifecycle) = config.lifecycle.as_ref() else {
continue;
};
let rules = parse_lifecycle_rules(lifecycle);
if rules.is_empty() {
continue;
}
for rule in &rules {
if rule.status != "Enabled" {
continue;
}
if let Some(err) = self
.apply_expiration_rule(&bucket.name, rule, &mut result)
.await
{
result.errors.push(err);
}
if let Some(err) = self
.apply_noncurrent_expiration_rule(&bucket.name, rule, &mut result)
.await
{
result.errors.push(err);
}
if let Some(err) = self
.apply_abort_incomplete_multipart_rule(&bucket.name, rule, &mut result)
.await
{
result.errors.push(err);
}
}
result.execution_time_seconds = started_at.elapsed().as_secs_f64();
if result.objects_deleted > 0
|| result.versions_deleted > 0
|| result.uploads_aborted > 0
|| !result.errors.is_empty()
{
total_objects_deleted += result.objects_deleted;
total_versions_deleted += result.versions_deleted;
total_uploads_aborted += result.uploads_aborted;
errors.extend(result.errors.clone());
self.append_history(&result);
bucket_results.push(result);
}
}
json!({
"objects_deleted": total_objects_deleted,
"versions_deleted": total_versions_deleted,
"multipart_aborted": total_uploads_aborted,
"buckets_evaluated": buckets.len(),
"results": bucket_results.iter().map(result_to_json).collect::<Vec<_>>(),
"errors": errors,
})
}
async fn apply_expiration_rule(
&self,
bucket: &str,
rule: &ParsedLifecycleRule,
result: &mut BucketLifecycleResult,
) -> Option<String> {
// `Days` takes precedence over `Date` when a rule carries both; a rule with
// neither has no expiration action.
let cutoff = match rule.expiration_days {
Some(days) => Utc::now() - Duration::days(days as i64),
None => rule.expiration_date?,
};
let params = myfsio_common::types::ListParams {
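// NOTE: only a single page of at most 10_000 keys is listed here; keys
// beyond that page are not evaluated in this pass.
max_keys: 10_000,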
prefix: if rule.prefix.is_empty() {
None
} else {
Some(rule.prefix.clone())
},
..Default::default()
};
match self.storage.list_objects(bucket, &params).await {
Ok(objects) => {
for object in &objects.objects {
if object.last_modified < cutoff {
if let Err(err) = self.storage.delete_object(bucket, &object.key).await {
result
.errors
.push(format!("{}:{}: {}", bucket, object.key, err));
} else {
result.objects_deleted += 1;
}
}
}
None
}
Err(err) => Some(format!("Failed to list objects for {}: {}", bucket, err)),
}
}
async fn apply_noncurrent_expiration_rule(
&self,
bucket: &str,
rule: &ParsedLifecycleRule,
result: &mut BucketLifecycleResult,
) -> Option<String> {
let days = rule.noncurrent_days?;
let cutoff = Utc::now() - Duration::days(days as i64);
let versions_root = version_root_for_bucket(&self.storage_root, bucket);
if !versions_root.exists() {
return None;
}
let mut stack = VecDeque::from([versions_root]);
while let Some(current) = stack.pop_front() {
let entries = match std::fs::read_dir(&current) {
Ok(entries) => entries,
Err(err) => return Some(err.to_string()),
};
for entry in entries.flatten() {
let file_type = match entry.file_type() {
Ok(file_type) => file_type,
Err(_) => continue,
};
if file_type.is_dir() {
stack.push_back(entry.path());
continue;
}
if entry.path().extension().and_then(|ext| ext.to_str()) != Some("json") {
continue;
}
let contents = match std::fs::read_to_string(entry.path()) {
Ok(contents) => contents,
Err(_) => continue,
};
let Ok(manifest) = serde_json::from_str::<Value>(&contents) else {
continue;
};
let key = manifest
.get("key")
.and_then(|value| value.as_str())
.unwrap_or_default()
.to_string();
if !rule.prefix.is_empty() && !key.starts_with(&rule.prefix) {
continue;
}
let archived_at = manifest
.get("archived_at")
.and_then(|value| value.as_str())
.and_then(parse_datetime);
// Skip manifests without a parseable archive time, or newer than the cutoff.
let Some(archived_at) = archived_at else {
continue;
};
if archived_at >= cutoff {
continue;
}
let version_id = manifest
.get("version_id")
.and_then(|value| value.as_str())
.unwrap_or_default();
let data_path = entry.path().with_file_name(format!("{}.bin", version_id));
// The data file may already be missing, so its removal is best-effort.
let _ = std::fs::remove_file(&data_path);
// Count the version as deleted only once its manifest is actually gone.
match std::fs::remove_file(entry.path()) {
Ok(()) => result.versions_deleted += 1,
Err(err) => result
.errors
.push(format!("{}:{}: {}", bucket, key, err)),
}
}
}
None
}
async fn apply_abort_incomplete_multipart_rule(
&self,
bucket: &str,
rule: &ParsedLifecycleRule,
result: &mut BucketLifecycleResult,
) -> Option<String> {
let days = rule.abort_incomplete_multipart_days?;
let cutoff = Utc::now() - Duration::days(days as i64);
match self.storage.list_multipart_uploads(bucket).await {
Ok(uploads) => {
for upload in &uploads {
if upload.initiated < cutoff {
if let Err(err) = self
.storage
.abort_multipart(bucket, &upload.upload_id)
.await
{
result
.errors
.push(format!("abort {}: {}", upload.upload_id, err));
} else {
result.uploads_aborted += 1;
}
}
}
None
}
Err(err) => Some(format!(
"Failed to list multipart uploads for {}: {}",
bucket, err
)),
}
}
fn append_history(&self, result: &BucketLifecycleResult) {
let path = lifecycle_history_path(&self.storage_root, &result.bucket_name);
let mut history = load_history(&path);
history.insert(
0,
LifecycleExecutionRecord {
timestamp: Utc::now().timestamp_millis() as f64 / 1000.0,
bucket_name: result.bucket_name.clone(),
objects_deleted: result.objects_deleted,
versions_deleted: result.versions_deleted,
uploads_aborted: result.uploads_aborted,
errors: result.errors.clone(),
execution_time_seconds: result.execution_time_seconds,
},
);
history.truncate(self.config.max_history_per_bucket);
let payload = json!({
"executions": history,
});
if let Some(parent) = path.parent() {
let _ = std::fs::create_dir_all(parent);
}
let _ = std::fs::write(
&path,
serde_json::to_string_pretty(&payload).unwrap_or_else(|_| "{}".to_string()),
);
}
pub fn start_background(self: Arc<Self>) -> tokio::task::JoinHandle<()> {
let interval = std::time::Duration::from_secs(self.config.interval_seconds);
tokio::spawn(async move {
let mut timer = tokio::time::interval(interval);
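// The first tick of a tokio interval completes immediately; consume it so
// the first evaluation runs one full interval after startup.
timer.tick().await;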
loop {
timer.tick().await;
tracing::info!("Lifecycle evaluation starting");
match self.run_cycle().await {
Ok(result) => tracing::info!("Lifecycle cycle complete: {:?}", result),
Err(err) => tracing::warn!("Lifecycle cycle failed: {}", err),
}
}
})
}
}
pub fn read_history(storage_root: &Path, bucket_name: &str, limit: usize, offset: usize) -> Value {
let path = lifecycle_history_path(storage_root, bucket_name);
let history = load_history(&path);
let total = history.len();
// `skip` past the end yields an empty page, so no explicit bounds check is needed.
let executions = history
.into_iter()
.skip(offset)
.take(limit)
.collect::<Vec<_>>();
json!({
"executions": executions,
"total": total,
"limit": limit,
"offset": offset,
"enabled": true,
})
}
fn load_history(path: &Path) -> Vec<LifecycleExecutionRecord> {
if !path.exists() {
return Vec::new();
}
std::fs::read_to_string(path)
.ok()
.and_then(|contents| serde_json::from_str::<Value>(&contents).ok())
.and_then(|value| value.get("executions").cloned())
.and_then(|value| serde_json::from_value::<Vec<LifecycleExecutionRecord>>(value).ok())
.unwrap_or_default()
}
fn lifecycle_history_path(storage_root: &Path, bucket_name: &str) -> PathBuf {
storage_root
.join(".myfsio.sys")
.join("buckets")
.join(bucket_name)
.join("lifecycle_history.json")
}
fn version_root_for_bucket(storage_root: &Path, bucket_name: &str) -> PathBuf {
storage_root
.join(".myfsio.sys")
.join("buckets")
.join(bucket_name)
.join("versions")
}
fn parse_lifecycle_rules(value: &Value) -> Vec<ParsedLifecycleRule> {
match value {
Value::String(raw) => parse_lifecycle_rules_from_string(raw),
Value::Array(items) => items.iter().filter_map(parse_lifecycle_rule).collect(),
Value::Object(map) => map
.get("Rules")
.and_then(|rules| rules.as_array())
.map(|rules| rules.iter().filter_map(parse_lifecycle_rule).collect())
.unwrap_or_default(),
_ => Vec::new(),
}
}
fn parse_lifecycle_rules_from_string(raw: &str) -> Vec<ParsedLifecycleRule> {
if let Ok(json) = serde_json::from_str::<Value>(raw) {
return parse_lifecycle_rules(&json);
}
let Ok(doc) = roxmltree::Document::parse(raw) else {
return Vec::new();
};
doc.descendants()
.filter(|node| node.is_element() && node.tag_name().name() == "Rule")
.map(|rule| ParsedLifecycleRule {
status: child_text(&rule, "Status").unwrap_or_else(|| "Enabled".to_string()),
prefix: child_text(&rule, "Prefix")
.or_else(|| {
rule.descendants()
.find(|node| {
node.is_element()
&& node.tag_name().name() == "Filter"
&& node.children().any(|child| {
child.is_element() && child.tag_name().name() == "Prefix"
})
})
.and_then(|filter| child_text(&filter, "Prefix"))
})
.unwrap_or_default(),
expiration_days: rule
.descendants()
.find(|node| node.is_element() && node.tag_name().name() == "Expiration")
.and_then(|expiration| child_text(&expiration, "Days"))
.and_then(|value| value.parse::<u64>().ok()),
expiration_date: rule
.descendants()
.find(|node| node.is_element() && node.tag_name().name() == "Expiration")
.and_then(|expiration| child_text(&expiration, "Date"))
.as_deref()
.and_then(parse_datetime),
noncurrent_days: rule
.descendants()
.find(|node| {
node.is_element() && node.tag_name().name() == "NoncurrentVersionExpiration"
})
.and_then(|node| child_text(&node, "NoncurrentDays"))
.and_then(|value| value.parse::<u64>().ok()),
abort_incomplete_multipart_days: rule
.descendants()
.find(|node| {
node.is_element() && node.tag_name().name() == "AbortIncompleteMultipartUpload"
})
.and_then(|node| child_text(&node, "DaysAfterInitiation"))
.and_then(|value| value.parse::<u64>().ok()),
})
.collect()
}
fn parse_lifecycle_rule(value: &Value) -> Option<ParsedLifecycleRule> {
let map = value.as_object()?;
Some(ParsedLifecycleRule {
status: map
.get("Status")
.and_then(|value| value.as_str())
.unwrap_or("Enabled")
.to_string(),
prefix: map
.get("Prefix")
.and_then(|value| value.as_str())
.or_else(|| {
map.get("Filter")
.and_then(|value| value.get("Prefix"))
.and_then(|value| value.as_str())
})
.unwrap_or_default()
.to_string(),
expiration_days: map
.get("Expiration")
.and_then(|value| value.get("Days"))
.and_then(|value| value.as_u64()),
expiration_date: map
.get("Expiration")
.and_then(|value| value.get("Date"))
.and_then(|value| value.as_str())
.and_then(parse_datetime),
noncurrent_days: map
.get("NoncurrentVersionExpiration")
.and_then(|value| value.get("NoncurrentDays"))
.and_then(|value| value.as_u64()),
abort_incomplete_multipart_days: map
.get("AbortIncompleteMultipartUpload")
.and_then(|value| value.get("DaysAfterInitiation"))
.and_then(|value| value.as_u64()),
})
}
fn parse_datetime(value: &str) -> Option<DateTime<Utc>> {
DateTime::parse_from_rfc3339(value)
.ok()
.map(|value| value.with_timezone(&Utc))
}
fn child_text(node: &roxmltree::Node<'_, '_>, name: &str) -> Option<String> {
node.children()
.find(|child| child.is_element() && child.tag_name().name() == name)
.and_then(|child| child.text())
.map(|text| text.trim().to_string())
.filter(|text| !text.is_empty())
}
fn result_to_json(result: &BucketLifecycleResult) -> Value {
json!({
"bucket_name": result.bucket_name,
"objects_deleted": result.objects_deleted,
"versions_deleted": result.versions_deleted,
"uploads_aborted": result.uploads_aborted,
"errors": result.errors,
"execution_time_seconds": result.execution_time_seconds,
})
}
#[cfg(test)]
mod tests {
use super::*;
use chrono::Duration;
#[test]
fn parses_rules_from_xml() {
let xml = r#"<?xml version="1.0" encoding="UTF-8"?>
<LifecycleConfiguration>
<Rule>
<Status>Enabled</Status>
<Filter><Prefix>logs/</Prefix></Filter>
<Expiration><Days>10</Days></Expiration>
<NoncurrentVersionExpiration><NoncurrentDays>30</NoncurrentDays></NoncurrentVersionExpiration>
<AbortIncompleteMultipartUpload><DaysAfterInitiation>7</DaysAfterInitiation></AbortIncompleteMultipartUpload>
</Rule>
</LifecycleConfiguration>"#;
let rules = parse_lifecycle_rules(&Value::String(xml.to_string()));
assert_eq!(rules.len(), 1);
assert_eq!(rules[0].prefix, "logs/");
assert_eq!(rules[0].expiration_days, Some(10));
assert_eq!(rules[0].noncurrent_days, Some(30));
assert_eq!(rules[0].abort_incomplete_multipart_days, Some(7));
}
#[tokio::test]
async fn run_cycle_writes_history_and_deletes_noncurrent_versions() {
let tmp = tempfile::tempdir().unwrap();
let storage = Arc::new(FsStorageBackend::new(tmp.path().to_path_buf()));
storage.create_bucket("docs").await.unwrap();
storage.set_versioning("docs", true).await.unwrap();
storage
.put_object(
"docs",
"logs/file.txt",
Box::pin(std::io::Cursor::new(b"old".to_vec())),
None,
)
.await
.unwrap();
storage
.put_object(
"docs",
"logs/file.txt",
Box::pin(std::io::Cursor::new(b"new".to_vec())),
None,
)
.await
.unwrap();
let versions_root = version_root_for_bucket(tmp.path(), "docs")
.join("logs")
.join("file.txt");
let manifest = std::fs::read_dir(&versions_root)
.unwrap()
.flatten()
.find(|entry| entry.path().extension().and_then(|ext| ext.to_str()) == Some("json"))
.unwrap()
.path();
let old_manifest = json!({
"version_id": "ver-1",
"key": "logs/file.txt",
"size": 3,
"archived_at": (Utc::now() - Duration::days(45)).to_rfc3339(),
"etag": "etag",
});
std::fs::write(&manifest, serde_json::to_string(&old_manifest).unwrap()).unwrap();
std::fs::write(manifest.with_file_name("ver-1.bin"), b"old").unwrap();
let lifecycle_xml = r#"<?xml version="1.0" encoding="UTF-8"?>
<LifecycleConfiguration>
<Rule>
<Status>Enabled</Status>
<Filter><Prefix>logs/</Prefix></Filter>
<NoncurrentVersionExpiration><NoncurrentDays>30</NoncurrentDays></NoncurrentVersionExpiration>
</Rule>
</LifecycleConfiguration>"#;
let mut config = storage.get_bucket_config("docs").await.unwrap();
config.lifecycle = Some(Value::String(lifecycle_xml.to_string()));
storage.set_bucket_config("docs", &config).await.unwrap();
let service =
LifecycleService::new(storage.clone(), tmp.path(), LifecycleConfig::default());
let result = service.run_cycle().await.unwrap();
assert_eq!(result["versions_deleted"], 1);
let history = read_history(tmp.path(), "docs", 50, 0);
assert_eq!(history["total"], 1);
assert_eq!(history["executions"][0]["versions_deleted"], 1);
}
}
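
The execution history kept by `append_history` and served by `read_history` above is a newest-first list capped at `max_history_per_bucket` and read back with `offset`/`limit`. The following is a minimal sketch of that behaviour on a plain `Vec`, assuming nothing about the on-disk JSON format; `push_bounded` and `page` are hypothetical names that do not appear in the source.

```rust
/// Hypothetical helper: inserts `record` at the front (newest first) and
/// drops the oldest entries so that at most `max_len` records remain.
fn push_bounded<T>(history: &mut Vec<T>, record: T, max_len: usize) {
    history.insert(0, record);
    history.truncate(max_len);
}

/// Hypothetical helper: returns up to `limit` records starting at `offset`.
/// An offset past the end yields an empty page instead of panicking.
fn page<T: Clone>(history: &[T], offset: usize, limit: usize) -> Vec<T> {
    history.iter().skip(offset).take(limit).cloned().collect()
}
```

Inserting at index 0 is O(n), which is acceptable for a list capped at a few dozen entries; a `VecDeque` with `push_front` would be the usual choice if the cap were much larger.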

@@ -0,0 +1,368 @@
use chrono::{DateTime, Utc};
use parking_lot::Mutex;
use rand::Rng;
use serde::{Deserialize, Serialize};
use serde_json::{json, Value};
use std::collections::HashMap;
use std::path::{Path, PathBuf};
use std::sync::Arc;
use std::time::{SystemTime, UNIX_EPOCH};
const MAX_LATENCY_SAMPLES: usize = 5000;
pub struct MetricsConfig {
pub interval_minutes: u64,
pub retention_hours: u64,
}
impl Default for MetricsConfig {
fn default() -> Self {
Self {
interval_minutes: 5,
retention_hours: 24,
}
}
}
#[derive(Debug, Clone)]
struct OperationStats {
count: u64,
success_count: u64,
error_count: u64,
latency_sum_ms: f64,
latency_min_ms: f64,
latency_max_ms: f64,
bytes_in: u64,
bytes_out: u64,
latency_samples: Vec<f64>,
}
impl Default for OperationStats {
fn default() -> Self {
Self {
count: 0,
success_count: 0,
error_count: 0,
latency_sum_ms: 0.0,
latency_min_ms: f64::INFINITY,
latency_max_ms: 0.0,
bytes_in: 0,
bytes_out: 0,
latency_samples: Vec::new(),
}
}
}
impl OperationStats {
fn record(&mut self, latency_ms: f64, success: bool, bytes_in: u64, bytes_out: u64) {
self.count += 1;
if success {
self.success_count += 1;
} else {
self.error_count += 1;
}
self.latency_sum_ms += latency_ms;
if latency_ms < self.latency_min_ms {
self.latency_min_ms = latency_ms;
}
if latency_ms > self.latency_max_ms {
self.latency_max_ms = latency_ms;
}
self.bytes_in += bytes_in;
self.bytes_out += bytes_out;
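// Keep a uniform random sample of latencies via reservoir sampling
// (Algorithm R): fill the reservoir first, then replace a random slot with
// probability MAX_LATENCY_SAMPLES / count.
if self.latency_samples.len() < MAX_LATENCY_SAMPLES {
self.latency_samples.push(latency_ms);
} else {
let mut rng = rand::thread_rng();
// `count` was already incremented above, so `j` is drawn from 0..count.
let j = rng.gen_range(0..self.count as usize);
if j < MAX_LATENCY_SAMPLES {
self.latency_samples[j] = latency_ms;
}
}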
}
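/// Returns the `p`-th percentile (0 to 100) of an ascending-sorted slice,
/// interpolating linearly between the two nearest ranks. Empty input yields 0.0.
fn compute_percentile(sorted: &[f64], p: f64) -> f64 {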
if sorted.is_empty() {
return 0.0;
}
let k = (sorted.len() - 1) as f64 * (p / 100.0);
let f = k.floor() as usize;
let c = (f + 1).min(sorted.len() - 1);
let d = k - f as f64;
sorted[f] + d * (sorted[c] - sorted[f])
}
fn to_json(&self) -> Value {
let avg = if self.count > 0 {
self.latency_sum_ms / self.count as f64
} else {
0.0
};
let min = if self.latency_min_ms.is_infinite() {
0.0
} else {
self.latency_min_ms
};
let mut sorted = self.latency_samples.clone();
sorted.sort_by(|a, b| a.partial_cmp(b).unwrap_or(std::cmp::Ordering::Equal));
json!({
"count": self.count,
"success_count": self.success_count,
"error_count": self.error_count,
"latency_avg_ms": round2(avg),
"latency_min_ms": round2(min),
"latency_max_ms": round2(self.latency_max_ms),
"latency_p50_ms": round2(Self::compute_percentile(&sorted, 50.0)),
"latency_p95_ms": round2(Self::compute_percentile(&sorted, 95.0)),
"latency_p99_ms": round2(Self::compute_percentile(&sorted, 99.0)),
"bytes_in": self.bytes_in,
"bytes_out": self.bytes_out,
})
}
}
fn round2(v: f64) -> f64 {
(v * 100.0).round() / 100.0
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct MetricsSnapshot {
pub timestamp: DateTime<Utc>,
pub window_seconds: u64,
pub by_method: HashMap<String, Value>,
pub by_endpoint: HashMap<String, Value>,
pub by_status_class: HashMap<String, u64>,
pub error_codes: HashMap<String, u64>,
pub totals: Value,
}
struct Inner {
by_method: HashMap<String, OperationStats>,
by_endpoint: HashMap<String, OperationStats>,
by_status_class: HashMap<String, u64>,
error_codes: HashMap<String, u64>,
totals: OperationStats,
window_start: f64,
snapshots: Vec<MetricsSnapshot>,
}
pub struct MetricsService {
config: MetricsConfig,
inner: Arc<Mutex<Inner>>,
snapshots_path: PathBuf,
}
impl MetricsService {
pub fn new(storage_root: &Path, config: MetricsConfig) -> Self {
let snapshots_path = storage_root
.join(".myfsio.sys")
.join("config")
.join("operation_metrics.json");
let mut snapshots: Vec<MetricsSnapshot> = if snapshots_path.exists() {
std::fs::read_to_string(&snapshots_path)
.ok()
.and_then(|s| serde_json::from_str::<Value>(&s).ok())
.and_then(|v| {
v.get("snapshots").and_then(|s| {
serde_json::from_value::<Vec<MetricsSnapshot>>(s.clone()).ok()
})
})
.unwrap_or_default()
} else {
Vec::new()
};
let cutoff = now_secs() - (config.retention_hours * 3600) as f64;
snapshots.retain(|s| s.timestamp.timestamp() as f64 > cutoff);
Self {
config,
inner: Arc::new(Mutex::new(Inner {
by_method: HashMap::new(),
by_endpoint: HashMap::new(),
by_status_class: HashMap::new(),
error_codes: HashMap::new(),
totals: OperationStats::default(),
window_start: now_secs(),
snapshots,
})),
snapshots_path,
}
}
pub fn record_request(
&self,
method: &str,
endpoint_type: &str,
status_code: u16,
latency_ms: f64,
bytes_in: u64,
bytes_out: u64,
error_code: Option<&str>,
) {
let success = (200..400).contains(&status_code);
let status_class = format!("{}xx", status_code / 100);
let mut inner = self.inner.lock();
inner
.by_method
.entry(method.to_string())
.or_default()
.record(latency_ms, success, bytes_in, bytes_out);
inner
.by_endpoint
.entry(endpoint_type.to_string())
.or_default()
.record(latency_ms, success, bytes_in, bytes_out);
*inner.by_status_class.entry(status_class).or_insert(0) += 1;
if let Some(code) = error_code {
*inner.error_codes.entry(code.to_string()).or_insert(0) += 1;
}
inner
.totals
.record(latency_ms, success, bytes_in, bytes_out);
}
pub fn get_current_stats(&self) -> Value {
let inner = self.inner.lock();
let window_seconds = (now_secs() - inner.window_start).max(0.0) as u64;
let by_method: HashMap<String, Value> = inner
.by_method
.iter()
.map(|(k, v)| (k.clone(), v.to_json()))
.collect();
let by_endpoint: HashMap<String, Value> = inner
.by_endpoint
.iter()
.map(|(k, v)| (k.clone(), v.to_json()))
.collect();
json!({
"timestamp": Utc::now().to_rfc3339(),
"window_seconds": window_seconds,
"by_method": by_method,
"by_endpoint": by_endpoint,
"by_status_class": inner.by_status_class,
"error_codes": inner.error_codes,
"totals": inner.totals.to_json(),
})
}
pub fn get_history(&self, hours: Option<u64>) -> Vec<MetricsSnapshot> {
let inner = self.inner.lock();
let mut snapshots = inner.snapshots.clone();
if let Some(h) = hours {
let cutoff = now_secs() - (h * 3600) as f64;
snapshots.retain(|s| s.timestamp.timestamp() as f64 > cutoff);
}
snapshots
}
pub fn snapshot(&self) -> Value {
let current = self.get_current_stats();
let history = self.get_history(None);
json!({
"enabled": true,
"current": current,
"snapshots": history,
})
}
fn take_snapshot(&self) {
// Build the snapshot and reset the window while holding the lock, then
// release the lock before writing to disk.
{
let mut inner = self.inner.lock();
let window_seconds = (now_secs() - inner.window_start).max(0.0) as u64;
let by_method: HashMap<String, Value> = inner
.by_method
.iter()
.map(|(k, v)| (k.clone(), v.to_json()))
.collect();
let by_endpoint: HashMap<String, Value> = inner
.by_endpoint
.iter()
.map(|(k, v)| (k.clone(), v.to_json()))
.collect();
let snap = MetricsSnapshot {
timestamp: Utc::now(),
window_seconds,
by_method,
by_endpoint,
by_status_class: inner.by_status_class.clone(),
error_codes: inner.error_codes.clone(),
totals: inner.totals.to_json(),
};
inner.snapshots.push(snap);
let cutoff = now_secs() - (self.config.retention_hours * 3600) as f64;
inner
.snapshots
.retain(|s| s.timestamp.timestamp() as f64 > cutoff);
inner.by_method.clear();
inner.by_endpoint.clear();
inner.by_status_class.clear();
inner.error_codes.clear();
inner.totals = OperationStats::default();
inner.window_start = now_secs();
}
self.save_snapshots();
}
fn save_snapshots(&self) {
let snapshots = { self.inner.lock().snapshots.clone() };
if let Some(parent) = self.snapshots_path.parent() {
let _ = std::fs::create_dir_all(parent);
}
let data = json!({ "snapshots": snapshots });
let _ = std::fs::write(
&self.snapshots_path,
serde_json::to_string_pretty(&data).unwrap_or_default(),
);
}
pub fn start_background(self: Arc<Self>) -> tokio::task::JoinHandle<()> {
let interval = std::time::Duration::from_secs(self.config.interval_minutes * 60);
tokio::spawn(async move {
let mut timer = tokio::time::interval(interval);
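// The first tick of a tokio interval completes immediately; consume it so
// the first snapshot covers one full window.
timer.tick().await;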
loop {
timer.tick().await;
self.take_snapshot();
}
})
}
}
pub fn classify_endpoint(path: &str) -> &'static str {
if path.is_empty() || path == "/" {
return "service";
}
let trimmed = path.trim_end_matches('/');
if trimmed.starts_with("/ui") {
return "ui";
}
if trimmed.starts_with("/kms") {
return "kms";
}
if trimmed.starts_with("/myfsio") {
return "service";
}
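// Ignore empty segments so a path made up only of slashes classifies as "service".
let parts: Vec<&str> = trimmed
.split('/')
.filter(|segment| !segment.is_empty())
.collect();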
match parts.len() {
0 => "service",
1 => "bucket",
_ => "object",
}
}
fn now_secs() -> f64 {
SystemTime::now()
.duration_since(UNIX_EPOCH)
.map(|d| d.as_secs_f64())
.unwrap_or(0.0)
}
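
The percentile figures reported by `OperationStats::to_json` above use linear interpolation between the two nearest ranks of the sorted sample. The following is a standalone sketch of that formula with a worked example; `percentile` is a hypothetical free function restating the same arithmetic, and the input values are made up for illustration.

```rust
/// Hypothetical standalone version of the interpolation used above.
/// `sorted` must be in ascending order; `p` is a percentage from 0 to 100.
fn percentile(sorted: &[f64], p: f64) -> f64 {
    if sorted.is_empty() {
        return 0.0;
    }
    // Fractional rank into the sample, e.g. 3.8 for p95 of five values.
    let rank = (sorted.len() - 1) as f64 * (p / 100.0);
    let lower = rank.floor() as usize;
    let upper = (lower + 1).min(sorted.len() - 1);
    let fraction = rank - lower as f64;
    // Blend the two neighbouring samples by the fractional part of the rank.
    sorted[lower] + fraction * (sorted[upper] - sorted[lower])
}
```

For the sample `[10, 20, 30, 40, 50]`, p50 lands exactly on rank 2 and gives 30, while p95 has rank 3.8 and gives 40 + 0.8 × (50 − 40), which is approximately 48.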

@@ -0,0 +1,14 @@
pub mod access_logging;
pub mod acl;
pub mod gc;
pub mod integrity;
pub mod lifecycle;
pub mod metrics;
pub mod notifications;
pub mod object_lock;
pub mod replication;
pub mod s3_client;
pub mod site_registry;
pub mod site_sync;
pub mod system_metrics;
pub mod website_domains;

@@ -0,0 +1,296 @@
use crate::state::AppState;
use chrono::{DateTime, Utc};
use myfsio_storage::traits::StorageEngine;
use serde::Serialize;
use serde_json::json;
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct WebhookDestination {
pub url: String,
}
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct NotificationConfiguration {
pub id: String,
pub events: Vec<String>,
pub destination: WebhookDestination,
pub prefix_filter: String,
pub suffix_filter: String,
}
#[derive(Debug, Clone, Serialize)]
#[serde(rename_all = "camelCase")]
pub struct NotificationEvent {
event_version: &'static str,
event_source: &'static str,
aws_region: &'static str,
event_time: String,
event_name: String,
user_identity: serde_json::Value,
request_parameters: serde_json::Value,
response_elements: serde_json::Value,
s3: serde_json::Value,
}
impl NotificationConfiguration {
pub fn matches_event(&self, event_name: &str, object_key: &str) -> bool {
let event_match = self.events.iter().any(|pattern| {
if let Some(prefix) = pattern.strip_suffix('*') {
event_name.starts_with(prefix)
} else {
pattern == event_name
}
});
if !event_match {
return false;
}
if !self.prefix_filter.is_empty() && !object_key.starts_with(&self.prefix_filter) {
return false;
}
if !self.suffix_filter.is_empty() && !object_key.ends_with(&self.suffix_filter) {
return false;
}
true
}
}
pub fn parse_notification_configurations(
xml: &str,
) -> Result<Vec<NotificationConfiguration>, String> {
let doc = roxmltree::Document::parse(xml).map_err(|err| err.to_string())?;
let mut configs = Vec::new();
for webhook in doc
.descendants()
.filter(|node| node.is_element() && node.tag_name().name() == "WebhookConfiguration")
{
let id = child_text(&webhook, "Id").unwrap_or_else(|| uuid::Uuid::new_v4().to_string());
let events = webhook
.children()
.filter(|node| node.is_element() && node.tag_name().name() == "Event")
.filter_map(|node| node.text())
.map(|text| text.trim().to_string())
.filter(|text| !text.is_empty())
.collect::<Vec<_>>();
let destination = webhook
.children()
.find(|node| node.is_element() && node.tag_name().name() == "Destination");
let url = destination
.as_ref()
.and_then(|node| child_text(node, "Url"))
.unwrap_or_default();
if url.trim().is_empty() {
return Err("Destination URL is required".to_string());
}
let mut prefix_filter = String::new();
let mut suffix_filter = String::new();
if let Some(filter) = webhook
.children()
.find(|node| node.is_element() && node.tag_name().name() == "Filter")
{
if let Some(key) = filter
.children()
.find(|node| node.is_element() && node.tag_name().name() == "S3Key")
{
for rule in key
.children()
.filter(|node| node.is_element() && node.tag_name().name() == "FilterRule")
{
let name = child_text(&rule, "Name").unwrap_or_default();
let value = child_text(&rule, "Value").unwrap_or_default();
if name == "prefix" {
prefix_filter = value;
} else if name == "suffix" {
suffix_filter = value;
}
}
}
}
configs.push(NotificationConfiguration {
id,
events,
destination: WebhookDestination { url },
prefix_filter,
suffix_filter,
});
}
Ok(configs)
}
pub fn emit_object_created(
state: &AppState,
bucket: &str,
key: &str,
size: u64,
etag: Option<&str>,
request_id: &str,
source_ip: &str,
user_identity: &str,
operation: &str,
) {
emit_notifications(
state.clone(),
bucket.to_string(),
key.to_string(),
format!("s3:ObjectCreated:{}", operation),
size,
etag.unwrap_or_default().to_string(),
request_id.to_string(),
source_ip.to_string(),
user_identity.to_string(),
);
}
pub fn emit_object_removed(
state: &AppState,
bucket: &str,
key: &str,
request_id: &str,
source_ip: &str,
user_identity: &str,
operation: &str,
) {
emit_notifications(
state.clone(),
bucket.to_string(),
key.to_string(),
format!("s3:ObjectRemoved:{}", operation),
0,
String::new(),
request_id.to_string(),
source_ip.to_string(),
user_identity.to_string(),
);
}
fn emit_notifications(
state: AppState,
bucket: String,
key: String,
event_name: String,
size: u64,
etag: String,
request_id: String,
source_ip: String,
user_identity: String,
) {
tokio::spawn(async move {
let config = match state.storage.get_bucket_config(&bucket).await {
Ok(config) => config,
Err(_) => return,
};
let raw = match config.notification {
Some(serde_json::Value::String(raw)) => raw,
_ => return,
};
let configs = match parse_notification_configurations(&raw) {
Ok(configs) => configs,
Err(err) => {
tracing::warn!("Invalid notification config for bucket {}: {}", bucket, err);
return;
}
};
let record = NotificationEvent {
event_version: "2.1",
event_source: "myfsio:s3",
aws_region: "local",
event_time: format_event_time(Utc::now()),
event_name: event_name.clone(),
user_identity: json!({ "principalId": if user_identity.is_empty() { "ANONYMOUS" } else { &user_identity } }),
request_parameters: json!({ "sourceIPAddress": if source_ip.is_empty() { "127.0.0.1" } else { &source_ip } }),
response_elements: json!({
"x-amz-request-id": request_id,
"x-amz-id-2": request_id,
}),
s3: json!({
"s3SchemaVersion": "1.0",
"configurationId": "notification",
"bucket": {
"name": bucket,
"ownerIdentity": { "principalId": "local" },
"arn": format!("arn:aws:s3:::{}", bucket),
},
"object": {
"key": key,
"size": size,
"eTag": etag,
"versionId": "null",
"sequencer": format!("{:016X}", Utc::now().timestamp_millis()),
}
}),
};
let payload = json!({ "Records": [record] });
let client = reqwest::Client::new();
for config in configs {
if !config.matches_event(&event_name, &key) {
continue;
}
let result = client
.post(&config.destination.url)
.header("content-type", "application/json")
.json(&payload)
.send()
.await;
if let Err(err) = result {
tracing::warn!(
"Failed to deliver notification for {} to {}: {}",
event_name,
config.destination.url,
err
);
}
}
});
}
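// Formats the event time with second precision; the millisecond field is
// always written as `.000`.
fn format_event_time(value: DateTime<Utc>) -> String {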
value.format("%Y-%m-%dT%H:%M:%S.000Z").to_string()
}
fn child_text(node: &roxmltree::Node<'_, '_>, name: &str) -> Option<String> {
node.children()
.find(|child| child.is_element() && child.tag_name().name() == name)
.and_then(|child| child.text())
.map(|text| text.trim().to_string())
.filter(|text| !text.is_empty())
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn parse_webhook_configuration() {
let xml = r#"<?xml version="1.0" encoding="UTF-8"?>
<NotificationConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
<WebhookConfiguration>
<Id>upload</Id>
<Event>s3:ObjectCreated:*</Event>
<Destination><Url>https://example.com/hook</Url></Destination>
<Filter>
<S3Key>
<FilterRule><Name>prefix</Name><Value>logs/</Value></FilterRule>
<FilterRule><Name>suffix</Name><Value>.txt</Value></FilterRule>
</S3Key>
</Filter>
</WebhookConfiguration>
</NotificationConfiguration>"#;
let configs = parse_notification_configurations(xml).unwrap();
assert_eq!(configs.len(), 1);
assert!(configs[0].matches_event("s3:ObjectCreated:Put", "logs/test.txt"));
assert!(!configs[0].matches_event("s3:ObjectRemoved:Delete", "logs/test.txt"));
}
}
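
The matching rule used by `matches_event` can be exercised without the XML parser or the HTTP client. The following is a minimal, std-only sketch of that rule; `Rule` and `rule_matches` are hypothetical names introduced for illustration and are not part of the module. It relies on the fact that every string starts and ends with the empty string, so an empty filter needs no special case.

```rust
/// Hypothetical stand-in for the filter fields of `NotificationConfiguration`.
struct Rule<'a> {
    events: &'a [&'a str],
    prefix: &'a str,
    suffix: &'a str,
}

/// Checks the event pattern first, then the key filters.
fn rule_matches(rule: &Rule, event_name: &str, object_key: &str) -> bool {
    let event_match = rule.events.iter().any(|pattern| match pattern.strip_suffix('*') {
        // A trailing `*` turns the pattern into a prefix match.
        Some(prefix) => event_name.starts_with(prefix),
        // Without a wildcard the event name must match exactly.
        None => *pattern == event_name,
    });
    // An empty prefix or suffix matches every key.
    event_match && object_key.starts_with(rule.prefix) && object_key.ends_with(rule.suffix)
}
```

With the rule from the unit test above (`s3:ObjectCreated:*`, prefix `logs/`, suffix `.txt`), `s3:ObjectCreated:Put` on `logs/test.txt` matches, while the same event on `images/test.txt` does not.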

@@ -0,0 +1,128 @@
use chrono::{DateTime, Utc};
use serde::{Deserialize, Serialize};
use std::collections::HashMap;
pub const LEGAL_HOLD_METADATA_KEY: &str = "__legal_hold__";
pub const RETENTION_METADATA_KEY: &str = "__object_retention__";
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq)]
pub enum RetentionMode {
GOVERNANCE,
COMPLIANCE,
}
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq)]
pub struct ObjectLockRetention {
pub mode: RetentionMode,
pub retain_until_date: DateTime<Utc>,
}
impl ObjectLockRetention {
pub fn is_expired(&self) -> bool {
Utc::now() > self.retain_until_date
}
}
pub fn get_object_retention(metadata: &HashMap<String, String>) -> Option<ObjectLockRetention> {
metadata
.get(RETENTION_METADATA_KEY)
.and_then(|raw| serde_json::from_str::<ObjectLockRetention>(raw).ok())
}
pub fn set_object_retention(
metadata: &mut HashMap<String, String>,
retention: &ObjectLockRetention,
) -> Result<(), String> {
let encoded = serde_json::to_string(retention).map_err(|err| err.to_string())?;
metadata.insert(RETENTION_METADATA_KEY.to_string(), encoded);
Ok(())
}
pub fn get_legal_hold(metadata: &HashMap<String, String>) -> bool {
metadata
.get(LEGAL_HOLD_METADATA_KEY)
.map(|value| value.eq_ignore_ascii_case("ON") || value.eq_ignore_ascii_case("true"))
.unwrap_or(false)
}
pub fn set_legal_hold(metadata: &mut HashMap<String, String>, enabled: bool) {
metadata.insert(
LEGAL_HOLD_METADATA_KEY.to_string(),
if enabled { "ON" } else { "OFF" }.to_string(),
);
}
pub fn ensure_retention_mutable(
metadata: &HashMap<String, String>,
bypass_governance: bool,
) -> Result<(), String> {
let Some(existing) = get_object_retention(metadata) else {
return Ok(());
};
if existing.is_expired() {
return Ok(());
}
match existing.mode {
RetentionMode::COMPLIANCE => Err(
"Cannot modify retention on object with COMPLIANCE mode until retention expires"
.to_string(),
),
RetentionMode::GOVERNANCE if !bypass_governance => Err(
"Cannot modify GOVERNANCE retention without bypass-governance permission".to_string(),
),
RetentionMode::GOVERNANCE => Ok(()),
}
}
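/// Checks whether an object may be deleted. A legal hold always blocks
/// deletion. Unexpired COMPLIANCE retention always blocks it, and unexpired
/// GOVERNANCE retention blocks it unless `bypass_governance` is set.
pub fn can_delete_object(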
metadata: &HashMap<String, String>,
bypass_governance: bool,
) -> Result<(), String> {
if get_legal_hold(metadata) {
return Err("Object is under legal hold".to_string());
}
if let Some(retention) = get_object_retention(metadata) {
if !retention.is_expired() {
return match retention.mode {
RetentionMode::COMPLIANCE => Err(format!(
"Object is locked in COMPLIANCE mode until {}",
retention.retain_until_date.to_rfc3339()
)),
RetentionMode::GOVERNANCE if !bypass_governance => Err(format!(
"Object is locked in GOVERNANCE mode until {}",
retention.retain_until_date.to_rfc3339()
)),
RetentionMode::GOVERNANCE => Ok(()),
};
}
}
Ok(())
}
#[cfg(test)]
mod tests {
use super::*;
use chrono::Duration;
#[test]
fn legal_hold_blocks_delete() {
let mut metadata = HashMap::new();
set_legal_hold(&mut metadata, true);
let err = can_delete_object(&metadata, false).unwrap_err();
assert!(err.contains("legal hold"));
}
#[test]
fn governance_requires_bypass() {
let mut metadata = HashMap::new();
set_object_retention(
&mut metadata,
&ObjectLockRetention {
mode: RetentionMode::GOVERNANCE,
retain_until_date: Utc::now() + Duration::hours(1),
},
)
.unwrap();
assert!(can_delete_object(&metadata, false).is_err());
assert!(can_delete_object(&metadata, true).is_ok());
}
}
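
The order of checks in `can_delete_object` is easier to see with the metadata encoding stripped away. The sketch below is a simplified, std-only model under stated assumptions: `Mode`, `Lock`, and `can_delete` are hypothetical names, timestamps are plain Unix seconds instead of `DateTime<Utc>`, and the current time is passed in rather than read from the clock.

```rust
#[derive(Clone, Copy)]
enum Mode {
    Governance,
    Compliance,
}

/// Hypothetical, simplified view of the lock metadata; times are Unix seconds.
struct Lock {
    legal_hold: bool,
    /// `(mode, retain_until)` when a retention period is set.
    retention: Option<(Mode, u64)>,
}

fn can_delete(lock: &Lock, now: u64, bypass_governance: bool) -> Result<(), &'static str> {
    // A legal hold is checked first and cannot be bypassed.
    if lock.legal_hold {
        return Err("legal hold");
    }
    match lock.retention {
        // Retention that has already expired no longer protects the object.
        Some((_, until)) if now > until => Ok(()),
        // COMPLIANCE ignores the bypass flag.
        Some((Mode::Compliance, _)) => Err("compliance"),
        // GOVERNANCE blocks deletion unless the caller may bypass it.
        Some((Mode::Governance, _)) if !bypass_governance => Err("governance"),
        _ => Ok(()),
    }
}
```

Passing `now` as a parameter is a choice made for this sketch only; it lets the expiry branch be tested without waiting for a real retention period to elapse.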

@@ -0,0 +1,713 @@
use std::collections::HashMap;
use std::path::{Path, PathBuf};
use std::sync::Arc;
use std::time::{Duration, SystemTime, UNIX_EPOCH};
use aws_sdk_s3::primitives::ByteStream;
use parking_lot::Mutex;
use serde::{Deserialize, Serialize};
use tokio::sync::Semaphore;
use myfsio_common::types::ListParams;
use myfsio_storage::fs_backend::FsStorageBackend;
use myfsio_storage::traits::StorageEngine;
use crate::services::s3_client::{build_client, check_endpoint_health, ClientOptions};
use crate::stores::connections::{ConnectionStore, RemoteConnection};
pub const MODE_NEW_ONLY: &str = "new_only";
pub const MODE_ALL: &str = "all";
pub const MODE_BIDIRECTIONAL: &str = "bidirectional";
#[derive(Debug, Clone, Default, Serialize, Deserialize)]
pub struct ReplicationStats {
#[serde(default)]
pub objects_synced: u64,
#[serde(default)]
pub objects_pending: u64,
#[serde(default)]
pub objects_orphaned: u64,
#[serde(default)]
pub bytes_synced: u64,
#[serde(default)]
pub last_sync_at: Option<f64>,
#[serde(default)]
pub last_sync_key: Option<String>,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct ReplicationRule {
pub bucket_name: String,
pub target_connection_id: String,
pub target_bucket: String,
#[serde(default = "default_true")]
pub enabled: bool,
#[serde(default = "default_mode")]
pub mode: String,
#[serde(default)]
pub created_at: Option<f64>,
#[serde(default)]
pub stats: ReplicationStats,
#[serde(default = "default_true")]
pub sync_deletions: bool,
#[serde(default)]
pub last_pull_at: Option<f64>,
#[serde(default)]
pub filter_prefix: Option<String>,
}
fn default_true() -> bool {
true
}
fn default_mode() -> String {
MODE_NEW_ONLY.to_string()
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct ReplicationFailure {
pub object_key: String,
pub error_message: String,
pub timestamp: f64,
pub failure_count: u32,
pub bucket_name: String,
pub action: String,
#[serde(default)]
pub last_error_code: Option<String>,
}
pub struct ReplicationFailureStore {
storage_root: PathBuf,
max_failures_per_bucket: usize,
cache: Mutex<HashMap<String, Vec<ReplicationFailure>>>,
}
impl ReplicationFailureStore {
pub fn new(storage_root: PathBuf, max_failures_per_bucket: usize) -> Self {
Self {
storage_root,
max_failures_per_bucket,
cache: Mutex::new(HashMap::new()),
}
}
fn path(&self, bucket: &str) -> PathBuf {
self.storage_root
.join(".myfsio.sys")
.join("buckets")
.join(bucket)
.join("replication_failures.json")
}
fn load_from_disk(&self, bucket: &str) -> Vec<ReplicationFailure> {
let path = self.path(bucket);
if !path.exists() {
return Vec::new();
}
match std::fs::read_to_string(&path) {
Ok(text) => {
let parsed: serde_json::Value = match serde_json::from_str(&text) {
Ok(v) => v,
Err(_) => return Vec::new(),
};
parsed
.get("failures")
.and_then(|v| serde_json::from_value(v.clone()).ok())
.unwrap_or_default()
}
Err(_) => Vec::new(),
}
}
fn save_to_disk(&self, bucket: &str, failures: &[ReplicationFailure]) {
let path = self.path(bucket);
if let Some(parent) = path.parent() {
let _ = std::fs::create_dir_all(parent);
}
let trimmed = &failures[..failures.len().min(self.max_failures_per_bucket)];
let data = serde_json::json!({ "failures": trimmed });
let _ = std::fs::write(
&path,
serde_json::to_string_pretty(&data).unwrap_or_default(),
);
}
pub fn load(&self, bucket: &str) -> Vec<ReplicationFailure> {
let mut cache = self.cache.lock();
if let Some(existing) = cache.get(bucket) {
return existing.clone();
}
let loaded = self.load_from_disk(bucket);
cache.insert(bucket.to_string(), loaded.clone());
loaded
}
pub fn save(&self, bucket: &str, failures: Vec<ReplicationFailure>) {
let trimmed: Vec<ReplicationFailure> = failures
.into_iter()
.take(self.max_failures_per_bucket)
.collect();
self.save_to_disk(bucket, &trimmed);
self.cache.lock().insert(bucket.to_string(), trimmed);
}
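/// Records a failure. An existing entry for the same key has its count
/// incremented and its details refreshed. A new entry is inserted at the
/// front, so the per-bucket cap drops entries from the tail.
pub fn add(&self, bucket: &str, failure: ReplicationFailure) {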
let mut failures = self.load(bucket);
if let Some(existing) = failures
.iter_mut()
.find(|f| f.object_key == failure.object_key)
{
existing.failure_count += 1;
existing.timestamp = failure.timestamp;
existing.error_message = failure.error_message.clone();
existing.last_error_code = failure.last_error_code.clone();
} else {
failures.insert(0, failure);
}
self.save(bucket, failures);
}
pub fn remove(&self, bucket: &str, object_key: &str) -> bool {
let failures = self.load(bucket);
let before = failures.len();
let after: Vec<_> = failures
.into_iter()
.filter(|f| f.object_key != object_key)
.collect();
if after.len() != before {
self.save(bucket, after);
true
} else {
false
}
}
pub fn clear(&self, bucket: &str) {
self.cache.lock().remove(bucket);
let path = self.path(bucket);
let _ = std::fs::remove_file(path);
}
pub fn get(&self, bucket: &str, object_key: &str) -> Option<ReplicationFailure> {
self.load(bucket)
.into_iter()
.find(|f| f.object_key == object_key)
}
pub fn count(&self, bucket: &str) -> usize {
self.load(bucket).len()
}
}
pub struct ReplicationManager {
storage: Arc<FsStorageBackend>,
connections: Arc<ConnectionStore>,
rules_path: PathBuf,
rules: Mutex<HashMap<String, ReplicationRule>>,
client_options: ClientOptions,
streaming_threshold_bytes: u64,
pub failures: Arc<ReplicationFailureStore>,
semaphore: Arc<Semaphore>,
}
impl ReplicationManager {
pub fn new(
storage: Arc<FsStorageBackend>,
connections: Arc<ConnectionStore>,
storage_root: &Path,
connect_timeout: Duration,
read_timeout: Duration,
max_retries: u32,
streaming_threshold_bytes: u64,
max_failures_per_bucket: usize,
) -> Self {
let rules_path = storage_root
.join(".myfsio.sys")
.join("config")
.join("replication_rules.json");
let rules = load_rules(&rules_path);
let failures = Arc::new(ReplicationFailureStore::new(
storage_root.to_path_buf(),
max_failures_per_bucket,
));
let client_options = ClientOptions {
connect_timeout,
read_timeout,
max_attempts: max_retries,
};
Self {
storage,
connections,
rules_path,
rules: Mutex::new(rules),
client_options,
streaming_threshold_bytes,
failures,
semaphore: Arc::new(Semaphore::new(4)),
}
}
pub fn reload_rules(&self) {
*self.rules.lock() = load_rules(&self.rules_path);
}
pub fn list_rules(&self) -> Vec<ReplicationRule> {
self.rules.lock().values().cloned().collect()
}
pub fn get_rule(&self, bucket: &str) -> Option<ReplicationRule> {
self.rules.lock().get(bucket).cloned()
}
pub fn set_rule(&self, rule: ReplicationRule) {
{
let mut guard = self.rules.lock();
guard.insert(rule.bucket_name.clone(), rule);
}
self.save_rules();
}
pub fn delete_rule(&self, bucket: &str) {
{
let mut guard = self.rules.lock();
guard.remove(bucket);
}
self.save_rules();
}
pub fn save_rules(&self) {
let snapshot: HashMap<String, ReplicationRule> = self.rules.lock().clone();
if let Some(parent) = self.rules_path.parent() {
let _ = std::fs::create_dir_all(parent);
}
if let Ok(text) = serde_json::to_string_pretty(&snapshot) {
let _ = std::fs::write(&self.rules_path, text);
}
}
fn update_last_sync(&self, bucket: &str, key: &str) {
{
let mut guard = self.rules.lock();
if let Some(rule) = guard.get_mut(bucket) {
rule.stats.last_sync_at = Some(now_secs());
rule.stats.last_sync_key = Some(key.to_string());
}
}
self.save_rules();
}
pub async fn trigger(self: Arc<Self>, bucket: String, key: String, action: String) {
let rule = match self.get_rule(&bucket) {
Some(r) if r.enabled => r,
_ => return,
};
let connection = match self.connections.get(&rule.target_connection_id) {
Some(c) => c,
None => {
tracing::warn!(
"Replication skipped for {}/{}: connection {} not found",
bucket,
key,
rule.target_connection_id
);
return;
}
};
// Wait for a free slot; the semaphore bounds concurrent replication tasks.
let permit = match self.semaphore.clone().acquire_owned().await {
Ok(p) => p,
Err(_) => return,
};
let manager = self.clone();
tokio::spawn(async move {
let _permit = permit;
manager
.replicate_task(&bucket, &key, &rule, &connection, &action)
.await;
});
}
pub async fn replicate_existing_objects(self: Arc<Self>, bucket: String) -> usize {
let rule = match self.get_rule(&bucket) {
Some(r) if r.enabled => r,
_ => return 0,
};
let connection = match self.connections.get(&rule.target_connection_id) {
Some(c) => c,
None => {
tracing::warn!(
"Cannot replicate existing objects for {}: connection {} not found",
bucket,
rule.target_connection_id
);
return 0;
}
};
if !self.check_endpoint(&connection).await {
tracing::warn!(
"Cannot replicate existing objects for {}: endpoint {} is unreachable",
bucket,
connection.endpoint_url
);
return 0;
}
let mut continuation_token: Option<String> = None;
let mut submitted = 0usize;
loop {
let page = match self
.storage
.list_objects(
&bucket,
&ListParams {
max_keys: 1000,
continuation_token: continuation_token.clone(),
prefix: rule.filter_prefix.clone(),
start_after: None,
},
)
.await
{
Ok(page) => page,
Err(err) => {
tracing::error!(
"Failed to list existing objects for replication in {}: {}",
bucket,
err
);
break;
}
};
let next_token = page.next_continuation_token.clone();
let is_truncated = page.is_truncated;
for object in page.objects {
submitted += 1;
self.clone()
.trigger(bucket.clone(), object.key, "write".to_string())
.await;
}
if !is_truncated {
break;
}
continuation_token = next_token;
if continuation_token.is_none() {
break;
}
}
submitted
}
pub fn schedule_existing_objects_sync(self: Arc<Self>, bucket: String) {
tokio::spawn(async move {
let submitted = self
.clone()
.replicate_existing_objects(bucket.clone())
.await;
if submitted > 0 {
tracing::info!(
"Scheduled {} existing object(s) for replication in {}",
submitted,
bucket
);
}
});
}
async fn replicate_task(
&self,
bucket: &str,
object_key: &str,
rule: &ReplicationRule,
conn: &RemoteConnection,
action: &str,
) {
if object_key.contains("..") || object_key.starts_with('/') || object_key.starts_with('\\')
{
tracing::error!("Invalid object key (path traversal): {}", object_key);
return;
}
let client = build_client(conn, &self.client_options);
if action == "delete" {
match client
.delete_object()
.bucket(&rule.target_bucket)
.key(object_key)
.send()
.await
{
Ok(_) => {
tracing::info!(
"Replicated DELETE {}/{} to {} ({})",
bucket,
object_key,
conn.name,
rule.target_bucket
);
self.update_last_sync(bucket, object_key);
self.failures.remove(bucket, object_key);
}
Err(err) => {
let msg = format!("{:?}", err);
tracing::error!(
"Replication DELETE failed {}/{}: {}",
bucket,
object_key,
msg
);
self.failures.add(
bucket,
ReplicationFailure {
object_key: object_key.to_string(),
error_message: msg,
timestamp: now_secs(),
failure_count: 1,
bucket_name: bucket.to_string(),
action: "delete".to_string(),
last_error_code: None,
},
);
}
}
return;
}
let src_path = match self.storage.get_object_path(bucket, object_key).await {
Ok(p) => p,
Err(_) => {
tracing::error!("Source object not found: {}/{}", bucket, object_key);
return;
}
};
let file_size = tokio::fs::metadata(&src_path)
.await
.map(|m| m.len())
.unwrap_or(0);
let content_type = mime_guess::from_path(&src_path)
.first_raw()
.map(|s| s.to_string());
let upload_result = upload_object(
&client,
&rule.target_bucket,
object_key,
&src_path,
file_size,
self.streaming_threshold_bytes,
content_type.as_deref(),
)
.await;
let final_result = match upload_result {
Err(err) if is_no_such_bucket(&err) => {
tracing::info!(
"Target bucket {} not found, creating it",
rule.target_bucket
);
// The create result is ignored; the retried upload reports any remaining error.
let _ = client
.create_bucket()
.bucket(&rule.target_bucket)
.send()
.await;
upload_object(
&client,
&rule.target_bucket,
object_key,
&src_path,
file_size,
self.streaming_threshold_bytes,
content_type.as_deref(),
)
.await
}
other => other,
};
match final_result {
Ok(()) => {
tracing::info!(
"Replicated {}/{} to {} ({})",
bucket,
object_key,
conn.name,
rule.target_bucket
);
self.update_last_sync(bucket, object_key);
self.failures.remove(bucket, object_key);
}
Err(err) => {
let msg = err.to_string();
tracing::error!("Replication failed {}/{}: {}", bucket, object_key, msg);
self.failures.add(
bucket,
ReplicationFailure {
object_key: object_key.to_string(),
error_message: msg,
timestamp: now_secs(),
failure_count: 1,
bucket_name: bucket.to_string(),
action: action.to_string(),
last_error_code: None,
},
);
}
}
}
pub async fn check_endpoint(&self, conn: &RemoteConnection) -> bool {
let client = build_client(conn, &self.client_options);
check_endpoint_health(&client).await
}
pub async fn retry_failed(&self, bucket: &str, object_key: &str) -> bool {
let failure = match self.failures.get(bucket, object_key) {
Some(f) => f,
None => return false,
};
let rule = match self.get_rule(bucket) {
Some(r) if r.enabled => r,
_ => return false,
};
let conn = match self.connections.get(&rule.target_connection_id) {
Some(c) => c,
None => return false,
};
self.replicate_task(bucket, object_key, &rule, &conn, &failure.action)
.await;
true
}
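/// Retries every recorded failure for `bucket` sequentially and returns
/// `(submitted, skipped)`. All failures count as skipped when the rule is
/// missing or disabled, or when the target connection cannot be found.
pub async fn retry_all(&self, bucket: &str) -> (usize, usize) {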
let failures = self.failures.load(bucket);
if failures.is_empty() {
return (0, 0);
}
let rule = match self.get_rule(bucket) {
Some(r) if r.enabled => r,
_ => return (0, failures.len()),
};
let conn = match self.connections.get(&rule.target_connection_id) {
Some(c) => c,
None => return (0, failures.len()),
};
let mut submitted = 0;
for failure in failures {
self.replicate_task(bucket, &failure.object_key, &rule, &conn, &failure.action)
.await;
submitted += 1;
}
(submitted, 0)
}
pub fn get_failure_count(&self, bucket: &str) -> usize {
self.failures.count(bucket)
}
pub fn get_failed_items(
&self,
bucket: &str,
limit: usize,
offset: usize,
) -> Vec<ReplicationFailure> {
self.failures
.load(bucket)
.into_iter()
.skip(offset)
.take(limit)
.collect()
}
pub fn dismiss_failure(&self, bucket: &str, key: &str) -> bool {
self.failures.remove(bucket, key)
}
pub fn clear_failures(&self, bucket: &str) {
self.failures.clear(bucket);
}
pub fn rules_snapshot(&self) -> HashMap<String, ReplicationRule> {
self.rules.lock().clone()
}
pub fn update_last_pull(&self, bucket: &str, at: f64) {
{
let mut guard = self.rules.lock();
if let Some(rule) = guard.get_mut(bucket) {
rule.last_pull_at = Some(at);
}
}
self.save_rules();
}
pub fn client_options(&self) -> &ClientOptions {
&self.client_options
}
}
// Detects a missing target bucket by searching the error's Debug output for
// the S3 error code.
fn is_no_such_bucket<E: std::fmt::Debug>(err: &E) -> bool {
let text = format!("{:?}", err);
text.contains("NoSuchBucket")
}
async fn upload_object(
client: &aws_sdk_s3::Client,
bucket: &str,
key: &str,
path: &Path,
file_size: u64,
streaming_threshold: u64,
content_type: Option<&str>,
) -> Result<(), aws_sdk_s3::error::SdkError<aws_sdk_s3::operation::put_object::PutObjectError>> {
let mut req = client.put_object().bucket(bucket).key(key);
if let Some(ct) = content_type {
req = req.content_type(ct);
}
let body = if file_size >= streaming_threshold {
ByteStream::from_path(path).await.map_err(|e| {
aws_sdk_s3::error::SdkError::construction_failure(Box::new(std::io::Error::new(
std::io::ErrorKind::Other,
e,
)))
})?
} else {
let bytes = tokio::fs::read(path)
.await
.map_err(|e| aws_sdk_s3::error::SdkError::construction_failure(Box::new(e)))?;
ByteStream::from(bytes)
};
req.body(body).send().await.map(|_| ())
}
fn load_rules(path: &Path) -> HashMap<String, ReplicationRule> {
if !path.exists() {
return HashMap::new();
}
match std::fs::read_to_string(path) {
Ok(text) => serde_json::from_str(&text).unwrap_or_default(),
Err(_) => HashMap::new(),
}
}
fn now_secs() -> f64 {
SystemTime::now()
.duration_since(UNIX_EPOCH)
.map(|d| d.as_secs_f64())
.unwrap_or(0.0)
}
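
The bookkeeping in `ReplicationFailureStore::add` and `save` combines an upsert keyed on the object key with a size cap. The sketch below isolates that behavior on a plain `Vec`, without the cache or the JSON file. `Failure` and `record_failure` are hypothetical names used only here, and the struct carries a reduced set of fields.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Failure {
    object_key: String,
    error_message: String,
    failure_count: u32,
}

/// Upserts `incoming` into `failures`, then enforces the cap `max`.
fn record_failure(failures: &mut Vec<Failure>, incoming: Failure, max: usize) {
    if let Some(existing) = failures
        .iter_mut()
        .find(|f| f.object_key == incoming.object_key)
    {
        // Same key seen before: bump the counter and keep the latest message.
        existing.failure_count += 1;
        existing.error_message = incoming.error_message;
    } else {
        // New key: newest entries go to the front of the list.
        failures.insert(0, incoming);
    }
    // Entries beyond the cap are dropped from the tail.
    failures.truncate(max);
}
```

A repeated failure keeps its position in the list, so under this scheme an entry that fails again is not moved to the front and can still be dropped once enough new keys arrive.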

@@ -0,0 +1,64 @@
use std::time::Duration;
use aws_config::BehaviorVersion;
use aws_credential_types::Credentials;
use aws_sdk_s3::config::{Region, SharedCredentialsProvider};
use aws_sdk_s3::Client;
use crate::stores::connections::RemoteConnection;
pub struct ClientOptions {
pub connect_timeout: Duration,
pub read_timeout: Duration,
pub max_attempts: u32,
}
impl Default for ClientOptions {
fn default() -> Self {
Self {
connect_timeout: Duration::from_secs(5),
read_timeout: Duration::from_secs(30),
max_attempts: 2,
}
}
}
pub fn build_client(connection: &RemoteConnection, options: &ClientOptions) -> Client {
let credentials = Credentials::new(
connection.access_key.clone(),
connection.secret_key.clone(),
None,
None,
"myfsio-replication",
);
let timeout_config = aws_smithy_types::timeout::TimeoutConfig::builder()
.connect_timeout(options.connect_timeout)
.read_timeout(options.read_timeout)
.build();
let retry_config =
aws_smithy_types::retry::RetryConfig::standard().with_max_attempts(options.max_attempts);
let config = aws_sdk_s3::config::Builder::new()
.behavior_version(BehaviorVersion::latest())
.credentials_provider(SharedCredentialsProvider::new(credentials))
.region(Region::new(connection.region.clone()))
.endpoint_url(connection.endpoint_url.clone())
.force_path_style(true)
.timeout_config(timeout_config)
.retry_config(retry_config)
.build();
Client::from_conf(config)
}
pub async fn check_endpoint_health(client: &Client) -> bool {
match client.list_buckets().send().await {
Ok(_) => true,
Err(err) => {
tracing::warn!("Endpoint health check failed: {:?}", err);
false
}
}
}

@@ -0,0 +1,148 @@
use chrono::Utc;
use parking_lot::RwLock;
use serde::{Deserialize, Serialize};
use std::path::PathBuf;
use std::sync::Arc;
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct SiteInfo {
pub site_id: String,
pub endpoint: String,
#[serde(default = "default_region")]
pub region: String,
#[serde(default = "default_priority")]
pub priority: i32,
#[serde(default)]
pub display_name: String,
#[serde(default)]
pub created_at: Option<String>,
}
fn default_region() -> String {
"us-east-1".to_string()
}
fn default_priority() -> i32 {
100
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct PeerSite {
pub site_id: String,
pub endpoint: String,
#[serde(default = "default_region")]
pub region: String,
#[serde(default = "default_priority")]
pub priority: i32,
#[serde(default)]
pub display_name: String,
#[serde(default)]
pub connection_id: Option<String>,
#[serde(default)]
pub created_at: Option<String>,
#[serde(default)]
pub is_healthy: bool,
#[serde(default)]
pub last_health_check: Option<String>,
}
#[derive(Debug, Clone, Serialize, Deserialize, Default)]
struct RegistryData {
#[serde(default)]
local: Option<SiteInfo>,
#[serde(default)]
peers: Vec<PeerSite>,
}
pub struct SiteRegistry {
path: PathBuf,
data: Arc<RwLock<RegistryData>>,
}
impl SiteRegistry {
pub fn new(storage_root: &std::path::Path) -> Self {
let path = storage_root
.join(".myfsio.sys")
.join("config")
.join("site_registry.json");
let data = if path.exists() {
std::fs::read_to_string(&path)
.ok()
.and_then(|s| serde_json::from_str(&s).ok())
.unwrap_or_default()
} else {
RegistryData::default()
};
Self {
path,
data: Arc::new(RwLock::new(data)),
}
}
fn save(&self) {
let data = self.data.read();
if let Some(parent) = self.path.parent() {
let _ = std::fs::create_dir_all(parent);
}
if let Ok(json) = serde_json::to_string_pretty(&*data) {
let _ = std::fs::write(&self.path, json);
}
}
pub fn get_local_site(&self) -> Option<SiteInfo> {
self.data.read().local.clone()
}
pub fn set_local_site(&self, site: SiteInfo) {
self.data.write().local = Some(site);
self.save();
}
pub fn list_peers(&self) -> Vec<PeerSite> {
self.data.read().peers.clone()
}
pub fn get_peer(&self, site_id: &str) -> Option<PeerSite> {
self.data
.read()
.peers
.iter()
.find(|p| p.site_id == site_id)
.cloned()
}
pub fn add_peer(&self, peer: PeerSite) {
self.data.write().peers.push(peer);
self.save();
}
pub fn update_peer(&self, peer: PeerSite) {
let mut data = self.data.write();
if let Some(existing) = data.peers.iter_mut().find(|p| p.site_id == peer.site_id) {
*existing = peer;
}
drop(data);
self.save();
}
pub fn delete_peer(&self, site_id: &str) -> bool {
let mut data = self.data.write();
let len_before = data.peers.len();
data.peers.retain(|p| p.site_id != site_id);
let removed = data.peers.len() < len_before;
drop(data);
if removed {
self.save();
}
removed
}
pub fn update_health(&self, site_id: &str, is_healthy: bool) {
let mut data = self.data.write();
if let Some(peer) = data.peers.iter_mut().find(|p| p.site_id == site_id) {
peer.is_healthy = is_healthy;
peer.last_health_check = Some(Utc::now().to_rfc3339());
}
drop(data);
self.save();
}
}

@@ -0,0 +1,498 @@
use std::collections::HashMap;
use std::path::PathBuf;
use std::pin::Pin;
use std::sync::Arc;
use std::time::{Duration, SystemTime, UNIX_EPOCH};
use aws_sdk_s3::Client;
use parking_lot::Mutex;
use serde::{Deserialize, Serialize};
use tokio::io::AsyncRead;
use tokio::sync::Notify;
use myfsio_common::types::{ListParams, ObjectMeta};
use myfsio_storage::fs_backend::FsStorageBackend;
use myfsio_storage::traits::StorageEngine;
use crate::services::replication::{ReplicationManager, ReplicationRule, MODE_BIDIRECTIONAL};
use crate::services::s3_client::{build_client, ClientOptions};
use crate::stores::connections::ConnectionStore;
#[derive(Debug, Clone, Default, Serialize, Deserialize)]
pub struct SyncedObjectInfo {
pub last_synced_at: f64,
pub remote_etag: String,
pub source: String,
}
#[derive(Debug, Clone, Default, Serialize, Deserialize)]
pub struct SyncState {
#[serde(default)]
pub synced_objects: HashMap<String, SyncedObjectInfo>,
#[serde(default)]
pub last_full_sync: Option<f64>,
}
#[derive(Debug, Clone, Default, Serialize)]
pub struct SiteSyncStats {
pub last_sync_at: Option<f64>,
pub objects_pulled: u64,
pub objects_skipped: u64,
pub conflicts_resolved: u64,
pub deletions_applied: u64,
pub errors: u64,
}
#[derive(Debug, Clone)]
struct RemoteObjectMeta {
last_modified: f64,
etag: String,
}
pub struct SiteSyncWorker {
storage: Arc<FsStorageBackend>,
connections: Arc<ConnectionStore>,
replication: Arc<ReplicationManager>,
storage_root: PathBuf,
interval: Duration,
batch_size: usize,
clock_skew_tolerance: f64,
client_options: ClientOptions,
bucket_stats: Mutex<HashMap<String, SiteSyncStats>>,
shutdown: Arc<Notify>,
}
impl SiteSyncWorker {
pub fn new(
storage: Arc<FsStorageBackend>,
connections: Arc<ConnectionStore>,
replication: Arc<ReplicationManager>,
storage_root: PathBuf,
interval_seconds: u64,
batch_size: usize,
connect_timeout: Duration,
read_timeout: Duration,
max_retries: u32,
clock_skew_tolerance: f64,
) -> Self {
Self {
storage,
connections,
replication,
storage_root,
interval: Duration::from_secs(interval_seconds),
batch_size,
clock_skew_tolerance,
client_options: ClientOptions {
connect_timeout,
read_timeout,
max_attempts: max_retries,
},
bucket_stats: Mutex::new(HashMap::new()),
shutdown: Arc::new(Notify::new()),
}
}
pub fn shutdown(&self) {
self.shutdown.notify_waiters();
}
pub fn get_stats(&self, bucket: &str) -> Option<SiteSyncStats> {
self.bucket_stats.lock().get(bucket).cloned()
}
pub async fn run(self: Arc<Self>) {
tracing::info!(
"Site sync worker started (interval={}s)",
self.interval.as_secs()
);
loop {
tokio::select! {
_ = tokio::time::sleep(self.interval) => {}
_ = self.shutdown.notified() => {
tracing::info!("Site sync worker shutting down");
return;
}
}
self.run_cycle().await;
}
}
async fn run_cycle(&self) {
let rules = self.replication.rules_snapshot();
for (bucket, rule) in rules {
if rule.mode != MODE_BIDIRECTIONAL || !rule.enabled {
continue;
}
match self.sync_bucket(&rule).await {
Ok(stats) => {
self.bucket_stats.lock().insert(bucket, stats);
}
Err(e) => {
tracing::error!("Site sync failed for bucket {}: {}", bucket, e);
}
}
}
}
pub async fn trigger_sync(&self, bucket: &str) -> Option<SiteSyncStats> {
let rule = self.replication.get_rule(bucket)?;
if rule.mode != MODE_BIDIRECTIONAL || !rule.enabled {
return None;
}
match self.sync_bucket(&rule).await {
Ok(stats) => {
self.bucket_stats
.lock()
.insert(bucket.to_string(), stats.clone());
Some(stats)
}
Err(e) => {
tracing::error!("Site sync trigger failed for {}: {}", bucket, e);
None
}
}
}
async fn sync_bucket(&self, rule: &ReplicationRule) -> Result<SiteSyncStats, String> {
let mut stats = SiteSyncStats::default();
let connection = self
.connections
.get(&rule.target_connection_id)
.ok_or_else(|| format!("connection {} not found", rule.target_connection_id))?;
let local_objects = self
.list_local_objects(&rule.bucket_name)
.await
.map_err(|e| format!("list local failed: {}", e))?;
let client = build_client(&connection, &self.client_options);
let remote_objects = self
.list_remote_objects(&client, &rule.target_bucket)
.await
.map_err(|e| format!("list remote failed: {}", e))?;
let mut sync_state = self.load_sync_state(&rule.bucket_name);
let mut to_pull: Vec<String> = Vec::new();
for (key, remote_meta) in &remote_objects {
if let Some(local_meta) = local_objects.get(key) {
match self.resolve_conflict(local_meta, remote_meta) {
"pull" => {
to_pull.push(key.clone());
stats.conflicts_resolved += 1;
}
_ => {
stats.objects_skipped += 1;
}
}
} else {
to_pull.push(key.clone());
}
}
let mut pulled = 0usize;
for key in &to_pull {
if pulled >= self.batch_size {
break;
}
let remote_meta = match remote_objects.get(key) {
Some(m) => m,
None => continue,
};
if self
.pull_object(&client, &rule.target_bucket, &rule.bucket_name, key)
.await
{
stats.objects_pulled += 1;
pulled += 1;
sync_state.synced_objects.insert(
key.clone(),
SyncedObjectInfo {
last_synced_at: now_secs(),
remote_etag: remote_meta.etag.clone(),
source: "remote".to_string(),
},
);
} else {
stats.errors += 1;
}
}
if rule.sync_deletions {
let tracked_keys: Vec<String> = sync_state.synced_objects.keys().cloned().collect();
for key in tracked_keys {
if remote_objects.contains_key(&key) {
continue;
}
let local_meta = match local_objects.get(&key) {
Some(m) => m,
None => continue,
};
let tracked = match sync_state.synced_objects.get(&key) {
Some(t) => t.clone(),
None => continue,
};
if tracked.source != "remote" {
continue;
}
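// Keep sub-second precision so a local write made just after the last sync
// is not truncated to a timestamp at or before last_synced_at.
let local_ts = local_meta.last_modified.timestamp() as f64
+ local_meta.last_modified.timestamp_subsec_nanos() as f64 / 1_000_000_000.0;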
if local_ts <= tracked.last_synced_at
&& self.apply_remote_deletion(&rule.bucket_name, &key).await
{
stats.deletions_applied += 1;
sync_state.synced_objects.remove(&key);
}
}
}
sync_state.last_full_sync = Some(now_secs());
self.save_sync_state(&rule.bucket_name, &sync_state);
self.replication
.update_last_pull(&rule.bucket_name, now_secs());
stats.last_sync_at = Some(now_secs());
tracing::info!(
"Site sync completed for {}: pulled={}, skipped={}, conflicts={}, deletions={}, errors={}",
rule.bucket_name,
stats.objects_pulled,
stats.objects_skipped,
stats.conflicts_resolved,
stats.deletions_applied,
stats.errors,
);
Ok(stats)
}
async fn list_local_objects(
&self,
bucket: &str,
) -> Result<HashMap<String, ObjectMeta>, String> {
let mut result = HashMap::new();
let mut token: Option<String> = None;
loop {
let params = ListParams {
max_keys: 1000,
continuation_token: token.clone(),
prefix: None,
start_after: None,
};
let page = self
.storage
.list_objects(bucket, &params)
.await
.map_err(|e| e.to_string())?;
for obj in page.objects {
result.insert(obj.key.clone(), obj);
}
if !page.is_truncated {
break;
}
token = page.next_continuation_token;
if token.is_none() {
break;
}
}
Ok(result)
}
async fn list_remote_objects(
&self,
client: &Client,
bucket: &str,
) -> Result<HashMap<String, RemoteObjectMeta>, String> {
let mut result = HashMap::new();
let mut continuation: Option<String> = None;
loop {
let mut req = client.list_objects_v2().bucket(bucket);
if let Some(ref t) = continuation {
req = req.continuation_token(t);
}
let resp = match req.send().await {
Ok(r) => r,
Err(err) => {
if is_not_found_error(&err) {
return Ok(result);
}
return Err(format!("{:?}", err));
}
};
for obj in resp.contents() {
let key = match obj.key() {
Some(k) => k.to_string(),
None => continue,
};
let last_modified = obj
.last_modified()
.map(|t| t.secs() as f64 + t.subsec_nanos() as f64 / 1_000_000_000.0)
.unwrap_or(0.0);
let etag = obj.e_tag().unwrap_or("").trim_matches('"').to_string();
result.insert(
key,
RemoteObjectMeta {
last_modified,
etag,
},
);
}
if resp.is_truncated().unwrap_or(false) {
continuation = resp.next_continuation_token().map(|s| s.to_string());
if continuation.is_none() {
break;
}
} else {
break;
}
}
Ok(result)
}
fn resolve_conflict(&self, local: &ObjectMeta, remote: &RemoteObjectMeta) -> &'static str {
let local_ts = local.last_modified.timestamp() as f64
+ local.last_modified.timestamp_subsec_nanos() as f64 / 1_000_000_000.0;
let remote_ts = remote.last_modified;
if (remote_ts - local_ts).abs() < self.clock_skew_tolerance {
let local_etag = local.etag.clone().unwrap_or_default();
let local_etag_trim = local_etag.trim_matches('"');
if remote.etag == local_etag_trim {
return "skip";
}
if remote.etag.as_str() > local_etag_trim {
return "pull";
}
return "keep";
}
if remote_ts > local_ts {
"pull"
} else {
"keep"
}
}
async fn pull_object(
&self,
client: &Client,
remote_bucket: &str,
local_bucket: &str,
key: &str,
) -> bool {
let resp = match client
.get_object()
.bucket(remote_bucket)
.key(key)
.send()
.await
{
Ok(r) => r,
Err(err) => {
tracing::error!("Pull GetObject failed {}/{}: {:?}", local_bucket, key, err);
return false;
}
};
// GetObject already returns the user metadata, so a separate HeadObject
// round trip (and the race between the two calls) is avoided.
let metadata: Option<HashMap<String, String>> = resp
.metadata()
.map(|m| m.iter().map(|(k, v)| (k.clone(), v.clone())).collect());
let stream = resp.body.into_async_read();
let boxed: Pin<Box<dyn AsyncRead + Send>> = Box::pin(stream);
match self
.storage
.put_object(local_bucket, key, boxed, metadata)
.await
{
Ok(_) => {
tracing::debug!("Pulled object {}/{} from remote", local_bucket, key);
true
}
Err(err) => {
tracing::error!(
"Store pulled object failed {}/{}: {}",
local_bucket,
key,
err
);
false
}
}
}
async fn apply_remote_deletion(&self, bucket: &str, key: &str) -> bool {
match self.storage.delete_object(bucket, key).await {
Ok(_) => {
tracing::debug!("Applied remote deletion for {}/{}", bucket, key);
true
}
Err(err) => {
tracing::error!("Remote deletion failed {}/{}: {}", bucket, key, err);
false
}
}
}
fn sync_state_path(&self, bucket: &str) -> PathBuf {
self.storage_root
.join(".myfsio.sys")
.join("buckets")
.join(bucket)
.join("site_sync_state.json")
}
fn load_sync_state(&self, bucket: &str) -> SyncState {
let path = self.sync_state_path(bucket);
if !path.exists() {
return SyncState::default();
}
match std::fs::read_to_string(&path) {
Ok(text) => serde_json::from_str(&text).unwrap_or_default(),
Err(_) => SyncState::default(),
}
}
fn save_sync_state(&self, bucket: &str, state: &SyncState) {
let path = self.sync_state_path(bucket);
if let Some(parent) = path.parent() {
let _ = std::fs::create_dir_all(parent);
}
if let Ok(text) = serde_json::to_string_pretty(state) {
let _ = std::fs::write(&path, text);
}
}
}
fn now_secs() -> f64 {
SystemTime::now()
.duration_since(UNIX_EPOCH)
.map(|d| d.as_secs_f64())
.unwrap_or(0.0)
}
fn is_not_found_error<E: std::fmt::Debug>(err: &aws_sdk_s3::error::SdkError<E>) -> bool {
let msg = format!("{:?}", err);
msg.contains("NoSuchBucket")
|| msg.contains("code: Some(\"NotFound\")")
|| msg.contains("code: Some(\"NoSuchBucket\")")
|| msg.contains("status: 404")
}

@@ -0,0 +1,203 @@
use chrono::{DateTime, Utc};
use myfsio_storage::fs_backend::FsStorageBackend;
use myfsio_storage::traits::StorageEngine;
use serde::{Deserialize, Serialize};
use serde_json::json;
use std::path::{Path, PathBuf};
use std::sync::Arc;
use sysinfo::{Disks, System};
use tokio::sync::RwLock;
#[derive(Debug, Clone)]
pub struct SystemMetricsConfig {
pub interval_minutes: u64,
pub retention_hours: u64,
}
impl Default for SystemMetricsConfig {
fn default() -> Self {
Self {
interval_minutes: 5,
retention_hours: 24,
}
}
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct SystemMetricsSnapshot {
pub timestamp: DateTime<Utc>,
pub cpu_percent: f64,
pub memory_percent: f64,
pub disk_percent: f64,
pub storage_bytes: u64,
}
pub struct SystemMetricsService {
storage_root: PathBuf,
storage: Arc<FsStorageBackend>,
config: SystemMetricsConfig,
history: Arc<RwLock<Vec<SystemMetricsSnapshot>>>,
history_path: PathBuf,
}
impl SystemMetricsService {
pub fn new(
storage_root: &Path,
storage: Arc<FsStorageBackend>,
config: SystemMetricsConfig,
) -> Self {
let history_path = storage_root
.join(".myfsio.sys")
.join("config")
.join("metrics_history.json");
let mut history = if history_path.exists() {
std::fs::read_to_string(&history_path)
.ok()
.and_then(|s| serde_json::from_str::<serde_json::Value>(&s).ok())
.and_then(|v| {
v.get("history").and_then(|h| {
serde_json::from_value::<Vec<SystemMetricsSnapshot>>(h.clone()).ok()
})
})
.unwrap_or_default()
} else {
Vec::new()
};
prune_history(&mut history, config.retention_hours);
Self {
storage_root: storage_root.to_path_buf(),
storage,
config,
history: Arc::new(RwLock::new(history)),
history_path,
}
}
pub async fn get_history(&self, hours: Option<u64>) -> Vec<SystemMetricsSnapshot> {
let mut history = self.history.read().await.clone();
prune_history(&mut history, hours.unwrap_or(self.config.retention_hours));
history
}
async fn take_snapshot(&self) {
let snapshot = collect_snapshot(&self.storage_root, &self.storage).await;
let mut history = self.history.write().await;
history.push(snapshot);
prune_history(&mut history, self.config.retention_hours);
drop(history);
self.save_history().await;
}
async fn save_history(&self) {
let history = self.history.read().await;
let data = json!({ "history": *history });
if let Some(parent) = self.history_path.parent() {
let _ = std::fs::create_dir_all(parent);
}
let _ = std::fs::write(
&self.history_path,
serde_json::to_string_pretty(&data).unwrap_or_default(),
);
}
pub fn start_background(self: Arc<Self>) -> tokio::task::JoinHandle<()> {
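// tokio::time::interval panics on a zero period, so clamp to at least one minute.
let interval = std::time::Duration::from_secs(
self.config.interval_minutes.max(1).saturating_mul(60),
);
tokio::spawn(async move {
let mut timer = tokio::time::interval(interval);
loop {
// The first tick completes immediately, so this also takes the
// startup snapshot without recording it twice.
timer.tick().await;
self.take_snapshot().await;
}
})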
}
}
fn prune_history(history: &mut Vec<SystemMetricsSnapshot>, retention_hours: u64) {
let cutoff = Utc::now() - chrono::Duration::hours(retention_hours as i64);
history.retain(|item| item.timestamp > cutoff);
}
fn sample_system_now() -> (f64, f64) {
let mut system = System::new();
system.refresh_cpu_usage();
std::thread::sleep(sysinfo::MINIMUM_CPU_UPDATE_INTERVAL);
system.refresh_cpu_usage();
system.refresh_memory();
let cpu_percent = system.global_cpu_usage() as f64;
let memory_percent = if system.total_memory() > 0 {
(system.used_memory() as f64 / system.total_memory() as f64) * 100.0
} else {
0.0
};
(cpu_percent, memory_percent)
}
fn normalize_path_for_mount(path: &Path) -> String {
let canonical = path.canonicalize().unwrap_or_else(|_| path.to_path_buf());
let raw = canonical.to_string_lossy().to_string();
let stripped = raw.strip_prefix(r"\\?\").unwrap_or(&raw);
stripped.to_lowercase()
}
fn sample_disk(path: &Path) -> (u64, u64) {
let disks = Disks::new_with_refreshed_list();
let path_str = normalize_path_for_mount(path);
let mut best: Option<(usize, u64, u64)> = None;
for disk in disks.list() {
let mount_raw = disk.mount_point().to_string_lossy().to_string();
let mount = mount_raw
.strip_prefix(r"\\?\")
.unwrap_or(&mount_raw)
.to_lowercase();
let total = disk.total_space();
let free = disk.available_space();
if path_str.starts_with(&mount) {
let len = mount.len();
match best {
Some((best_len, _, _)) if len <= best_len => {}
_ => best = Some((len, total, free)),
}
}
}
best.map(|(_, total, free)| (total, free)).unwrap_or((0, 0))
}
async fn collect_snapshot(
storage_root: &Path,
storage: &Arc<FsStorageBackend>,
) -> SystemMetricsSnapshot {
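// sample_system_now sleeps between CPU readings, so run it on the blocking
// pool instead of stalling an async worker thread.
let (cpu_percent, memory_percent) = tokio::task::spawn_blocking(sample_system_now)
.await
.unwrap_or((0.0, 0.0));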
let (disk_total, disk_free) = sample_disk(storage_root);
let disk_percent = if disk_total > 0 {
(disk_total.saturating_sub(disk_free) as f64 / disk_total as f64) * 100.0
} else {
0.0
};
let mut storage_bytes = 0u64;
let buckets = storage.list_buckets().await.unwrap_or_default();
for bucket in buckets {
if let Ok(stats) = storage.bucket_stats(&bucket.name).await {
storage_bytes += stats.total_bytes();
}
}
SystemMetricsSnapshot {
timestamp: Utc::now(),
cpu_percent: round2(cpu_percent),
memory_percent: round2(memory_percent),
disk_percent: round2(disk_percent),
storage_bytes,
}
}
fn round2(value: f64) -> f64 {
(value * 100.0).round() / 100.0
}

@@ -0,0 +1,197 @@
use parking_lot::RwLock;
use serde::{Deserialize, Serialize};
use std::collections::HashMap;
use std::path::PathBuf;
use std::sync::Arc;
#[derive(Debug, Clone, Serialize, Deserialize, Default)]
#[serde(deny_unknown_fields)]
struct DomainData {
#[serde(default)]
mappings: HashMap<String, String>,
}
#[derive(Debug, Deserialize)]
#[serde(untagged)]
enum DomainDataFile {
Wrapped(DomainData),
Flat(HashMap<String, String>),
}
impl DomainDataFile {
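fn into_domain_data(self) -> DomainData {
// Normalize keys for both on-disk layouts so lookups via get_bucket
// (which normalizes its argument) always match.
let mappings = match self {
Self::Wrapped(data) => data.mappings,
Self::Flat(mappings) => mappings,
};
DomainData {
mappings: mappings
.into_iter()
.map(|(domain, bucket)| (normalize_domain(&domain), bucket))
.collect(),
}
}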
}
pub struct WebsiteDomainStore {
path: PathBuf,
data: Arc<RwLock<DomainData>>,
}
impl WebsiteDomainStore {
pub fn new(storage_root: &std::path::Path) -> Self {
let path = storage_root
.join(".myfsio.sys")
.join("config")
.join("website_domains.json");
let data = if path.exists() {
std::fs::read_to_string(&path)
.ok()
.and_then(|s| serde_json::from_str::<DomainDataFile>(&s).ok())
.map(DomainDataFile::into_domain_data)
.unwrap_or_default()
} else {
DomainData::default()
};
Self {
path,
data: Arc::new(RwLock::new(data)),
}
}
fn save(&self) {
let data = self.data.read();
if let Some(parent) = self.path.parent() {
let _ = std::fs::create_dir_all(parent);
}
if let Ok(json) = serde_json::to_string_pretty(&data.mappings) {
let _ = std::fs::write(&self.path, json);
}
}
pub fn list_all(&self) -> Vec<serde_json::Value> {
self.data
.read()
.mappings
.iter()
.map(|(domain, bucket)| {
serde_json::json!({
"domain": domain,
"bucket": bucket,
})
})
.collect()
}
pub fn get_bucket(&self, domain: &str) -> Option<String> {
let domain = normalize_domain(domain);
self.data.read().mappings.get(&domain).cloned()
}
pub fn set_mapping(&self, domain: &str, bucket: &str) {
let domain = normalize_domain(domain);
self.data
.write()
.mappings
.insert(domain, bucket.to_string());
self.save();
}
pub fn delete_mapping(&self, domain: &str) -> bool {
let domain = normalize_domain(domain);
let removed = self.data.write().mappings.remove(&domain).is_some();
if removed {
self.save();
}
removed
}
}
pub fn normalize_domain(domain: &str) -> String {
domain.trim().to_ascii_lowercase()
}
pub fn is_valid_domain(domain: &str) -> bool {
if domain.is_empty() || domain.len() > 253 {
return false;
}
let labels: Vec<&str> = domain.split('.').collect();
if labels.len() < 2 {
return false;
}
for label in &labels {
if label.is_empty() || label.len() > 63 {
return false;
}
if !label.chars().all(|c| c.is_ascii_alphanumeric() || c == '-') {
return false;
}
if label.starts_with('-') || label.ends_with('-') {
return false;
}
}
true
}
#[cfg(test)]
mod tests {
use super::WebsiteDomainStore;
use serde_json::json;
use tempfile::tempdir;
#[test]
fn loads_legacy_flat_mapping_file() {
let tmp = tempdir().expect("tempdir");
let config_dir = tmp.path().join(".myfsio.sys").join("config");
std::fs::create_dir_all(&config_dir).expect("create config dir");
std::fs::write(
config_dir.join("website_domains.json"),
r#"{"Example.COM":"site-bucket"}"#,
)
.expect("write config");
let store = WebsiteDomainStore::new(tmp.path());
assert_eq!(
store.get_bucket("example.com"),
Some("site-bucket".to_string())
);
}
#[test]
fn loads_wrapped_mapping_file() {
let tmp = tempdir().expect("tempdir");
let config_dir = tmp.path().join(".myfsio.sys").join("config");
std::fs::create_dir_all(&config_dir).expect("create config dir");
std::fs::write(
config_dir.join("website_domains.json"),
r#"{"mappings":{"example.com":"site-bucket"}}"#,
)
.expect("write config");
let store = WebsiteDomainStore::new(tmp.path());
assert_eq!(
store.get_bucket("example.com"),
Some("site-bucket".to_string())
);
}
#[test]
fn saves_in_shared_plain_mapping_format() {
let tmp = tempdir().expect("tempdir");
let store = WebsiteDomainStore::new(tmp.path());
store.set_mapping("Example.COM", "site-bucket");
let saved = std::fs::read_to_string(
tmp.path()
.join(".myfsio.sys")
.join("config")
.join("website_domains.json"),
)
.expect("read config");
let json: serde_json::Value = serde_json::from_str(&saved).expect("parse config");
assert_eq!(json, json!({"example.com": "site-bucket"}));
}
}

@@ -0,0 +1,133 @@
use std::collections::HashMap;
use std::sync::Arc;
use std::time::{Duration, Instant};
use base64::{engine::general_purpose::URL_SAFE_NO_PAD, Engine};
use parking_lot::RwLock;
use rand::RngCore;
use serde::{Deserialize, Serialize};
pub const SESSION_COOKIE_NAME: &str = "myfsio_session";
pub const CSRF_FIELD_NAME: &str = "csrf_token";
pub const CSRF_HEADER_NAME: &str = "x-csrf-token";
const SESSION_ID_BYTES: usize = 32;
const CSRF_TOKEN_BYTES: usize = 32;
#[derive(Clone, Debug, Serialize, Deserialize)]
pub struct FlashMessage {
pub category: String,
pub message: String,
}
#[derive(Clone, Debug)]
pub struct SessionData {
pub user_id: Option<String>,
pub display_name: Option<String>,
pub csrf_token: String,
pub flash: Vec<FlashMessage>,
pub extra: HashMap<String, String>,
last_accessed: Instant,
}
impl SessionData {
pub fn new() -> Self {
Self {
user_id: None,
display_name: None,
csrf_token: generate_token(CSRF_TOKEN_BYTES),
flash: Vec::new(),
extra: HashMap::new(),
last_accessed: Instant::now(),
}
}
pub fn is_authenticated(&self) -> bool {
self.user_id.is_some()
}
pub fn push_flash(&mut self, category: impl Into<String>, message: impl Into<String>) {
self.flash.push(FlashMessage {
category: category.into(),
message: message.into(),
});
}
pub fn take_flash(&mut self) -> Vec<FlashMessage> {
std::mem::take(&mut self.flash)
}
pub fn rotate_csrf(&mut self) {
self.csrf_token = generate_token(CSRF_TOKEN_BYTES);
}
}
impl Default for SessionData {
fn default() -> Self {
Self::new()
}
}
pub struct SessionStore {
sessions: RwLock<HashMap<String, SessionData>>,
ttl: Duration,
}
impl SessionStore {
pub fn new(ttl: Duration) -> Self {
Self {
sessions: RwLock::new(HashMap::new()),
ttl,
}
}
pub fn create(&self) -> (String, SessionData) {
let id = generate_token(SESSION_ID_BYTES);
let data = SessionData::new();
self.sessions.write().insert(id.clone(), data.clone());
(id, data)
}
pub fn get(&self, id: &str) -> Option<SessionData> {
let mut guard = self.sessions.write();
let entry = guard.get_mut(id)?;
if entry.last_accessed.elapsed() > self.ttl {
guard.remove(id);
return None;
}
entry.last_accessed = Instant::now();
Some(entry.clone())
}
pub fn save(&self, id: &str, data: SessionData) {
let mut guard = self.sessions.write();
let mut updated = data;
updated.last_accessed = Instant::now();
guard.insert(id.to_string(), updated);
}
pub fn destroy(&self, id: &str) {
self.sessions.write().remove(id);
}
pub fn sweep(&self) {
let ttl = self.ttl;
let mut guard = self.sessions.write();
guard.retain(|_, data| data.last_accessed.elapsed() <= ttl);
}
}
pub type SharedSessionStore = Arc<SessionStore>;
pub fn generate_token(bytes: usize) -> String {
let mut buf = vec![0u8; bytes];
rand::thread_rng().fill_bytes(&mut buf);
URL_SAFE_NO_PAD.encode(&buf)
}
pub fn csrf_tokens_match(a: &str, b: &str) -> bool {
if a.len() != b.len() {
return false;
}
subtle::ConstantTimeEq::ct_eq(a.as_bytes(), b.as_bytes()).into()
}

@@ -0,0 +1,240 @@
use std::sync::Arc;
use std::time::Duration;
use crate::config::ServerConfig;
use crate::services::access_logging::AccessLoggingService;
use crate::services::gc::GcService;
use crate::services::integrity::IntegrityService;
use crate::services::metrics::MetricsService;
use crate::services::replication::ReplicationManager;
use crate::services::site_registry::SiteRegistry;
use crate::services::site_sync::SiteSyncWorker;
use crate::services::system_metrics::SystemMetricsService;
use crate::services::website_domains::WebsiteDomainStore;
use crate::session::SessionStore;
use crate::stores::connections::ConnectionStore;
use crate::templates::TemplateEngine;
use myfsio_auth::iam::IamService;
use myfsio_crypto::encryption::{EncryptionConfig, EncryptionService};
use myfsio_crypto::kms::KmsService;
use myfsio_storage::fs_backend::{FsStorageBackend, FsStorageBackendConfig};
#[derive(Clone)]
pub struct AppState {
pub config: ServerConfig,
pub storage: Arc<FsStorageBackend>,
pub iam: Arc<IamService>,
pub encryption: Option<Arc<EncryptionService>>,
pub kms: Option<Arc<KmsService>>,
pub gc: Option<Arc<GcService>>,
pub integrity: Option<Arc<IntegrityService>>,
pub metrics: Option<Arc<MetricsService>>,
pub system_metrics: Option<Arc<SystemMetricsService>>,
pub site_registry: Option<Arc<SiteRegistry>>,
pub website_domains: Option<Arc<WebsiteDomainStore>>,
pub connections: Arc<ConnectionStore>,
pub replication: Arc<ReplicationManager>,
pub site_sync: Option<Arc<SiteSyncWorker>>,
pub templates: Option<Arc<TemplateEngine>>,
pub sessions: Arc<SessionStore>,
pub access_logging: Arc<AccessLoggingService>,
}
impl AppState {
pub fn new(config: ServerConfig) -> Self {
let storage = Arc::new(FsStorageBackend::new_with_config(
config.storage_root.clone(),
FsStorageBackendConfig {
object_key_max_length_bytes: config.object_key_max_length_bytes,
object_cache_max_size: config.object_cache_max_size,
bucket_config_cache_ttl: Duration::from_secs_f64(
config.bucket_config_cache_ttl_seconds,
),
},
));
let iam = Arc::new(IamService::new_with_secret(
config.iam_config_path.clone(),
config.secret_key.clone(),
));
let gc = if config.gc_enabled {
Some(Arc::new(GcService::new(
config.storage_root.clone(),
crate::services::gc::GcConfig {
interval_hours: config.gc_interval_hours,
temp_file_max_age_hours: config.gc_temp_file_max_age_hours,
multipart_max_age_days: config.gc_multipart_max_age_days,
lock_file_max_age_hours: config.gc_lock_file_max_age_hours,
dry_run: config.gc_dry_run,
},
)))
} else {
None
};
let integrity = if config.integrity_enabled {
Some(Arc::new(IntegrityService::new(
storage.clone(),
&config.storage_root,
crate::services::integrity::IntegrityConfig::default(),
)))
} else {
None
};
let metrics = if config.metrics_enabled {
Some(Arc::new(MetricsService::new(
&config.storage_root,
crate::services::metrics::MetricsConfig {
interval_minutes: config.metrics_interval_minutes,
retention_hours: config.metrics_retention_hours,
},
)))
} else {
None
};
let system_metrics = if config.metrics_history_enabled {
Some(Arc::new(SystemMetricsService::new(
&config.storage_root,
storage.clone(),
crate::services::system_metrics::SystemMetricsConfig {
interval_minutes: config.metrics_history_interval_minutes,
retention_hours: config.metrics_history_retention_hours,
},
)))
} else {
None
};
let site_registry = {
let registry = SiteRegistry::new(&config.storage_root);
if let (Some(site_id), Some(endpoint)) =
(config.site_id.as_deref(), config.site_endpoint.as_deref())
{
registry.set_local_site(crate::services::site_registry::SiteInfo {
site_id: site_id.to_string(),
endpoint: endpoint.to_string(),
region: config.site_region.clone(),
priority: config.site_priority,
display_name: site_id.to_string(),
created_at: Some(chrono::Utc::now().to_rfc3339()),
});
}
Some(Arc::new(registry))
};
let website_domains = if config.website_hosting_enabled {
Some(Arc::new(WebsiteDomainStore::new(&config.storage_root)))
} else {
None
};
let connections = Arc::new(ConnectionStore::new(&config.storage_root));
let replication = Arc::new(ReplicationManager::new(
storage.clone(),
connections.clone(),
&config.storage_root,
Duration::from_secs(config.replication_connect_timeout_secs),
Duration::from_secs(config.replication_read_timeout_secs),
config.replication_max_retries,
config.replication_streaming_threshold_bytes,
config.replication_max_failures_per_bucket,
));
let site_sync = if config.site_sync_enabled {
Some(Arc::new(SiteSyncWorker::new(
storage.clone(),
connections.clone(),
replication.clone(),
config.storage_root.clone(),
config.site_sync_interval_secs,
config.site_sync_batch_size,
Duration::from_secs(config.site_sync_connect_timeout_secs),
Duration::from_secs(config.site_sync_read_timeout_secs),
config.site_sync_max_retries,
config.site_sync_clock_skew_tolerance,
)))
} else {
None
};
let templates = init_templates(&config.templates_dir);
let access_logging = Arc::new(AccessLoggingService::new(&config.storage_root));
let session_ttl = Duration::from_secs(config.session_lifetime_days.saturating_mul(86_400));
Self {
config,
storage,
iam,
encryption: None,
kms: None,
gc,
integrity,
metrics,
system_metrics,
site_registry,
website_domains,
connections,
replication,
site_sync,
templates,
sessions: Arc::new(SessionStore::new(session_ttl)),
access_logging,
}
}
pub async fn new_with_encryption(config: ServerConfig) -> Self {
let mut state = Self::new(config.clone());
let keys_dir = config.storage_root.join(".myfsio.sys").join("keys");
let kms = if config.kms_enabled {
match KmsService::new(&keys_dir).await {
Ok(k) => Some(Arc::new(k)),
Err(e) => {
tracing::error!("Failed to initialize KMS: {}", e);
None
}
}
} else {
None
};
let encryption = if config.encryption_enabled {
match myfsio_crypto::kms::load_or_create_master_key(&keys_dir).await {
Ok(master_key) => Some(Arc::new(EncryptionService::with_config(
master_key,
kms.clone(),
EncryptionConfig {
chunk_size: config.encryption_chunk_size_bytes,
},
))),
Err(e) => {
tracing::error!("Failed to initialize encryption: {}", e);
None
}
}
} else {
None
};
state.encryption = encryption;
state.kms = kms;
state
}
}
fn init_templates(templates_dir: &std::path::Path) -> Option<Arc<TemplateEngine>> {
let glob = format!("{}/*.html", templates_dir.display()).replace('\\', "/");
match TemplateEngine::new(&glob) {
Ok(engine) => {
crate::handlers::ui_pages::register_ui_endpoints(&engine);
Some(Arc::new(engine))
}
Err(e) => {
tracing::error!("Template engine init failed: {}", e);
None
}
}
}

@@ -0,0 +1,94 @@
use std::path::{Path, PathBuf};
use std::sync::Arc;
use parking_lot::RwLock;
use serde::{Deserialize, Serialize};
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct RemoteConnection {
pub id: String,
pub name: String,
pub endpoint_url: String,
pub access_key: String,
pub secret_key: String,
#[serde(default = "default_region")]
pub region: String,
}
fn default_region() -> String {
"us-east-1".to_string()
}
pub struct ConnectionStore {
path: PathBuf,
inner: Arc<RwLock<Vec<RemoteConnection>>>,
}
impl ConnectionStore {
pub fn new(storage_root: &Path) -> Self {
let path = storage_root
.join(".myfsio.sys")
.join("config")
.join("connections.json");
let inner = Arc::new(RwLock::new(load_from_disk(&path)));
Self { path, inner }
}
pub fn reload(&self) {
let loaded = load_from_disk(&self.path);
*self.inner.write() = loaded;
}
pub fn list(&self) -> Vec<RemoteConnection> {
self.inner.read().clone()
}
pub fn get(&self, id: &str) -> Option<RemoteConnection> {
self.inner.read().iter().find(|c| c.id == id).cloned()
}
pub fn add(&self, connection: RemoteConnection) -> std::io::Result<()> {
{
let mut guard = self.inner.write();
if let Some(existing) = guard.iter_mut().find(|c| c.id == connection.id) {
*existing = connection;
} else {
guard.push(connection);
}
}
self.save()
}
pub fn delete(&self, id: &str) -> std::io::Result<bool> {
let removed = {
let mut guard = self.inner.write();
let before = guard.len();
guard.retain(|c| c.id != id);
guard.len() != before
};
if removed {
self.save()?;
}
Ok(removed)
}
fn save(&self) -> std::io::Result<()> {
if let Some(parent) = self.path.parent() {
std::fs::create_dir_all(parent)?;
}
let snapshot = self.inner.read().clone();
let bytes = serde_json::to_vec_pretty(&snapshot)
.map_err(|e| std::io::Error::new(std::io::ErrorKind::InvalidData, e))?;
std::fs::write(&self.path, bytes)
}
}
fn load_from_disk(path: &Path) -> Vec<RemoteConnection> {
if !path.exists() {
return Vec::new();
}
match std::fs::read_to_string(path) {
Ok(text) => serde_json::from_str(&text).unwrap_or_default(),
Err(_) => Vec::new(),
}
}

@@ -0,0 +1 @@
pub mod connections;

@@ -0,0 +1,355 @@
use std::collections::HashMap;
use std::sync::Arc;
use chrono::{DateTime, Utc};
use parking_lot::RwLock;
use serde_json::Value;
use tera::{Context, Error as TeraError, Tera};
pub type EndpointResolver =
Arc<dyn Fn(&str, &HashMap<String, Value>) -> Option<String> + Send + Sync>;
#[derive(Clone)]
pub struct TemplateEngine {
tera: Arc<RwLock<Tera>>,
endpoints: Arc<RwLock<HashMap<String, String>>>,
}
impl TemplateEngine {
pub fn new(template_glob: &str) -> Result<Self, TeraError> {
let mut tera = Tera::new(template_glob)?;
tera.set_escape_fn(html_escape);
register_filters(&mut tera);
let endpoints: Arc<RwLock<HashMap<String, String>>> = Arc::new(RwLock::new(HashMap::new()));
register_functions(&mut tera, endpoints.clone());
Ok(Self {
tera: Arc::new(RwLock::new(tera)),
endpoints,
})
}
pub fn register_endpoint(&self, name: &str, path_template: &str) {
self.endpoints
.write()
.insert(name.to_string(), path_template.to_string());
}
pub fn register_endpoints(&self, pairs: &[(&str, &str)]) {
let mut guard = self.endpoints.write();
for (n, p) in pairs {
guard.insert((*n).to_string(), (*p).to_string());
}
}
pub fn render(&self, name: &str, context: &Context) -> Result<String, TeraError> {
self.tera.read().render(name, context)
}
pub fn reload(&self) -> Result<(), TeraError> {
self.tera.write().full_reload()
}
}
fn html_escape(input: &str) -> String {
let mut out = String::with_capacity(input.len());
for c in input.chars() {
match c {
'&' => out.push_str("&amp;"),
'<' => out.push_str("&lt;"),
'>' => out.push_str("&gt;"),
'"' => out.push_str("&quot;"),
'\'' => out.push_str("&#x27;"),
_ => out.push(c),
}
}
out
}
fn register_filters(tera: &mut Tera) {
tera.register_filter("format_datetime", format_datetime_filter);
tera.register_filter("filesizeformat", filesizeformat_filter);
tera.register_filter("slice", slice_filter);
}
fn register_functions(tera: &mut Tera, endpoints: Arc<RwLock<HashMap<String, String>>>) {
let endpoints_for_url = endpoints.clone();
tera.register_function(
"url_for",
move |args: &HashMap<String, Value>| -> tera::Result<Value> {
let endpoint = args
.get("endpoint")
.and_then(|v| v.as_str())
.ok_or_else(|| tera::Error::msg("url_for requires endpoint"))?;
if endpoint == "static" {
let filename = args.get("filename").and_then(|v| v.as_str()).unwrap_or("");
return Ok(Value::String(format!("/static/{}", filename)));
}
let path = match endpoints_for_url.read().get(endpoint) {
Some(p) => p.clone(),
None => {
return Ok(Value::String(format!("/__missing__/{}", endpoint)));
}
};
Ok(Value::String(substitute_path_params(&path, args)))
},
);
tera.register_function(
"csrf_token",
|args: &HashMap<String, Value>| -> tera::Result<Value> {
if let Some(token) = args.get("token").and_then(|v| v.as_str()) {
return Ok(Value::String(token.to_string()));
}
Ok(Value::String(String::new()))
},
);
}
fn substitute_path_params(template: &str, args: &HashMap<String, Value>) -> String {
let mut path = template.to_string();
let mut query: Vec<(String, String)> = Vec::new();
for (k, v) in args {
if k == "endpoint" || k == "filename" {
continue;
}
let value_str = value_to_string(v);
let placeholder = format!("{{{}}}", k);
if path.contains(&placeholder) {
let encoded = urlencode_path(&value_str);
path = path.replace(&placeholder, &encoded);
} else {
query.push((k.clone(), value_str));
}
}
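if !query.is_empty() {
// HashMap iteration order is unspecified; sort so the generated URL is deterministic.
query.sort();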
let qs: Vec<String> = query
.into_iter()
.map(|(k, v)| format!("{}={}", urlencode_query(&k), urlencode_query(&v)))
.collect();
path.push('?');
path.push_str(&qs.join("&"));
}
path
}
fn value_to_string(v: &Value) -> String {
match v {
Value::String(s) => s.clone(),
Value::Number(n) => n.to_string(),
Value::Bool(b) => b.to_string(),
Value::Null => String::new(),
other => other.to_string(),
}
}
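// RFC 3986 unreserved set: ASCII letters and digits plus '-', '_', '.', '~' pass through; everything else is percent-encoded.
const UNRESERVED: &percent_encoding::AsciiSet = &percent_encoding::NON_ALPHANUMERIC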
.remove(b'-')
.remove(b'_')
.remove(b'.')
.remove(b'~');
fn urlencode_path(s: &str) -> String {
percent_encoding::utf8_percent_encode(s, UNRESERVED).to_string()
}
fn urlencode_query(s: &str) -> String {
percent_encoding::utf8_percent_encode(s, UNRESERVED).to_string()
}
fn format_datetime_filter(value: &Value, args: &HashMap<String, Value>) -> tera::Result<Value> {
let format = args
.get("format")
.and_then(|v| v.as_str())
.unwrap_or("%Y-%m-%d %H:%M:%S UTC");
let dt: Option<DateTime<Utc>> = match value {
Value::String(s) => DateTime::parse_from_rfc3339(s)
.ok()
.map(|d| d.with_timezone(&Utc))
.or_else(|| {
DateTime::parse_from_rfc2822(s)
.ok()
.map(|d| d.with_timezone(&Utc))
}),
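Value::Number(n) => n.as_f64().and_then(|f| {
// floor() keeps the fractional part non-negative, so pre-1970 timestamps keep their sub-second component.
let secs = f.floor() as i64;
let nanos = ((f - secs as f64) * 1_000_000_000.0) as u32;
DateTime::<Utc>::from_timestamp(secs, nanos)
}),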
_ => None,
};
match dt {
Some(d) => Ok(Value::String(d.format(format).to_string())),
None => Ok(value.clone()),
}
}
fn slice_filter(value: &Value, args: &HashMap<String, Value>) -> tera::Result<Value> {
let start = args.get("start").and_then(|v| v.as_i64()).unwrap_or(0);
let end = args.get("end").and_then(|v| v.as_i64());
match value {
Value::String(s) => {
let chars: Vec<char> = s.chars().collect();
let len = chars.len() as i64;
let norm = |i: i64| -> usize {
if i < 0 {
(len + i).max(0) as usize
} else {
i.min(len) as usize
}
};
let s_idx = norm(start);
let e_idx = match end {
Some(e) => norm(e),
None => len as usize,
};
let e_idx = e_idx.max(s_idx);
Ok(Value::String(chars[s_idx..e_idx].iter().collect()))
}
Value::Array(arr) => {
let len = arr.len() as i64;
let norm = |i: i64| -> usize {
if i < 0 {
(len + i).max(0) as usize
} else {
i.min(len) as usize
}
};
let s_idx = norm(start);
let e_idx = match end {
Some(e) => norm(e),
None => len as usize,
};
let e_idx = e_idx.max(s_idx);
Ok(Value::Array(arr[s_idx..e_idx].to_vec()))
}
Value::Null => Ok(Value::String(String::new())),
_ => Err(tera::Error::msg("slice: unsupported value type")),
}
}
fn filesizeformat_filter(value: &Value, _args: &HashMap<String, Value>) -> tera::Result<Value> {
let bytes = match value {
Value::Number(n) => n.as_f64().unwrap_or(0.0),
Value::String(s) => s.parse::<f64>().unwrap_or(0.0),
_ => 0.0,
};
const UNITS: [&str; 6] = ["B", "KB", "MB", "GB", "TB", "PB"];
let mut size = bytes;
let mut unit = 0;
while size >= 1024.0 && unit < UNITS.len() - 1 {
size /= 1024.0;
unit += 1;
}
let formatted = if unit == 0 {
format!("{} {}", size as u64, UNITS[unit])
} else {
format!("{:.1} {}", size, UNITS[unit])
};
Ok(Value::String(formatted))
}
#[cfg(test)]
mod tests {
use super::*;
fn test_engine() -> TemplateEngine {
let tmp = tempfile::TempDir::new().unwrap();
let tpl = tmp.path().join("t.html");
std::fs::write(&tpl, "").unwrap();
let glob = format!("{}/*.html", tmp.path().display());
let engine = TemplateEngine::new(&glob).unwrap();
engine.register_endpoints(&[
("ui.buckets_overview", "/ui/buckets"),
("ui.bucket_detail", "/ui/buckets/{bucket_name}"),
(
"ui.abort_multipart_upload",
"/ui/buckets/{bucket_name}/multipart/{upload_id}/abort",
),
]);
engine
}
fn render_inline(engine: &TemplateEngine, tpl: &str) -> String {
let mut tera = engine.tera.write();
tera.add_raw_template("__inline__", tpl).unwrap();
drop(tera);
engine.render("__inline__", &Context::new()).unwrap()
}
#[test]
fn static_url() {
let e = test_engine();
let out = render_inline(
&e,
"{{ url_for(endpoint='static', filename='css/main.css') }}",
);
assert_eq!(out, "/static/css/main.css");
}
#[test]
fn path_param_substitution() {
let e = test_engine();
let out = render_inline(
&e,
"{{ url_for(endpoint='ui.bucket_detail', bucket_name='my-bucket') }}",
);
assert_eq!(out, "/ui/buckets/my-bucket");
}
#[test]
fn extra_args_become_query() {
let e = test_engine();
let out = render_inline(
&e,
"{{ url_for(endpoint='ui.bucket_detail', bucket_name='b', tab='replication') }}",
);
assert_eq!(out, "/ui/buckets/b?tab=replication");
}
#[test]
fn filesizeformat_basic() {
let v = filesizeformat_filter(&Value::Number(1024.into()), &HashMap::new()).unwrap();
assert_eq!(v, Value::String("1.0 KB".into()));
let v = filesizeformat_filter(&Value::Number(1_048_576.into()), &HashMap::new()).unwrap();
assert_eq!(v, Value::String("1.0 MB".into()));
let v = filesizeformat_filter(&Value::Number(500.into()), &HashMap::new()).unwrap();
assert_eq!(v, Value::String("500 B".into()));
}
#[test]
fn project_templates_parse() {
let mut path = std::path::PathBuf::from(env!("CARGO_MANIFEST_DIR"));
path.push("templates");
path.push("*.html");
let glob = path.to_string_lossy().replace('\\', "/");
let engine = TemplateEngine::new(&glob).expect("Tera parse failed");
let names: Vec<String> = engine
.tera
.read()
.get_template_names()
.map(|s| s.to_string())
.collect();
assert!(
names.len() >= 10,
"expected 10+ templates, got {}",
names.len()
);
}
#[test]
fn format_datetime_rfc3339() {
let v = format_datetime_filter(
&Value::String("2024-06-15T12:34:56Z".into()),
&HashMap::new(),
)
.unwrap();
assert_eq!(v, Value::String("2024-06-15 12:34:56 UTC".into()));
}
}
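
The two `norm` closures in `slice_filter` apply the same clamping rule to strings and to arrays. A minimal std-only sketch of that rule, pulled into a shared helper, might look like the following. `normalize_index` and `slice_chars` are hypothetical names that do not appear in the diff, and the sketch leaves out the Tera `Value` plumbing.

```rust
/// Clamp a possibly negative index into `0..=len`, Python-slice style.
/// Hypothetical helper: the diff inlines this logic as a closure named `norm`.
fn normalize_index(i: i64, len: usize) -> usize {
    let len = len as i64;
    if i < 0 {
        // Negative indices count from the end and are clamped at 0.
        (len + i).max(0) as usize
    } else {
        // Positive indices are clamped at the length.
        i.min(len) as usize
    }
}

/// Slice a string by character position without panicking on
/// out-of-range or inverted bounds (hypothetical, for illustration).
fn slice_chars(s: &str, start: i64, end: Option<i64>) -> String {
    let chars: Vec<char> = s.chars().collect();
    let s_idx = normalize_index(start, chars.len());
    let e_idx = end
        .map(|e| normalize_index(e, chars.len()))
        .unwrap_or(chars.len())
        // An end before the start collapses to an empty range.
        .max(s_idx);
    chars[s_idx..e_idx].iter().collect()
}
```

Under these rules `slice_chars("abcdef", -3, None)` yields `"def"`, and an `end` smaller than `start` yields an empty string instead of panicking.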

View File

@@ -15,6 +15,12 @@
--myfsio-hover-bg: rgba(59, 130, 246, 0.12);
--myfsio-accent: #3b82f6;
--myfsio-accent-hover: #2563eb;
--myfsio-tag-key-bg: #e0e7ff;
--myfsio-tag-key-text: #3730a3;
--myfsio-tag-value-bg: #f0f1fa;
--myfsio-tag-value-text: #4338ca;
--myfsio-tag-border: #c7d2fe;
--myfsio-tag-delete-hover: #ef4444;
}
[data-theme='dark'] {
@@ -34,6 +40,12 @@
--myfsio-hover-bg: rgba(59, 130, 246, 0.2);
--myfsio-accent: #60a5fa;
--myfsio-accent-hover: #3b82f6;
--myfsio-tag-key-bg: #312e81;
--myfsio-tag-key-text: #c7d2fe;
--myfsio-tag-value-bg: #1e1b4b;
--myfsio-tag-value-text: #a5b4fc;
--myfsio-tag-border: #4338ca;
--myfsio-tag-delete-hover: #f87171;
}
[data-theme='dark'] body,
@@ -1081,6 +1093,26 @@ html.sidebar-will-collapse .sidebar-user {
letter-spacing: 0.08em;
}
[data-theme='dark'] .docs-table .table-secondary,
[data-theme='dark'] .docs-section .table-secondary {
--bs-table-bg: rgba(148, 163, 184, 0.14);
--bs-table-striped-bg: rgba(148, 163, 184, 0.16);
--bs-table-hover-bg: rgba(148, 163, 184, 0.2);
--bs-table-color: var(--myfsio-text);
color: var(--myfsio-text);
}
[data-theme='dark'] .docs-table .table-secondary th,
[data-theme='dark'] .docs-table .table-secondary td,
[data-theme='dark'] .docs-table .table-secondary strong,
[data-theme='dark'] .docs-table .table-secondary code,
[data-theme='dark'] .docs-section .table-secondary th,
[data-theme='dark'] .docs-section .table-secondary td,
[data-theme='dark'] .docs-section .table-secondary strong,
[data-theme='dark'] .docs-section .table-secondary code {
color: var(--myfsio-text);
}
.main-content:has(.docs-sidebar) {
overflow-x: visible;
}
@@ -1151,17 +1183,104 @@ html.sidebar-will-collapse .sidebar-user {
}
.iam-user-card {
- border: 1px solid var(--myfsio-card-border);
- border-radius: 0.75rem;
- transition: box-shadow 0.2s ease, transform 0.2s ease;
+ position: relative;
+ border: 1px solid var(--myfsio-card-border) !important;
+ border-radius: 1rem !important;
+ overflow: visible;
+ transition: all 0.2s cubic-bezier(0.4, 0, 0.2, 1);
}
.iam-user-card:hover {
- box-shadow: 0 4px 12px rgba(0, 0, 0, 0.1);
transform: translateY(-2px);
+ box-shadow: 0 8px 24px -4px rgba(0, 0, 0, 0.12), 0 4px 8px -4px rgba(0, 0, 0, 0.08);
+ border-color: var(--myfsio-accent) !important;
}
[data-theme='dark'] .iam-user-card:hover {
- box-shadow: 0 4px 12px rgba(0, 0, 0, 0.3);
+ box-shadow: 0 8px 24px -4px rgba(0, 0, 0, 0.4), 0 4px 8px -4px rgba(0, 0, 0, 0.3);
}
.iam-role-badge {
display: inline-flex;
align-items: center;
padding: 0.25em 0.65em;
border-radius: 999px;
font-size: 0.7rem;
font-weight: 600;
text-transform: uppercase;
letter-spacing: 0.03em;
}
.iam-role-admin {
background: rgba(245, 158, 11, 0.15);
color: #d97706;
}
[data-theme='dark'] .iam-role-admin {
background: rgba(245, 158, 11, 0.25);
color: #fbbf24;
}
.iam-role-user {
background: rgba(59, 130, 246, 0.12);
color: #2563eb;
}
[data-theme='dark'] .iam-role-user {
background: rgba(59, 130, 246, 0.2);
color: #60a5fa;
}
.iam-perm-badge {
display: inline-flex;
align-items: center;
gap: 0.25rem;
padding: 0.3em 0.6em;
border-radius: 999px;
font-size: 0.75rem;
font-weight: 500;
background: rgba(59, 130, 246, 0.08);
color: var(--myfsio-text);
border: 1px solid rgba(59, 130, 246, 0.15);
}
[data-theme='dark'] .iam-perm-badge {
background: rgba(59, 130, 246, 0.15);
border-color: rgba(59, 130, 246, 0.25);
}
.iam-copy-key {
display: inline-flex;
align-items: center;
justify-content: center;
width: 22px;
height: 22px;
padding: 0;
border: none;
background: transparent;
color: var(--myfsio-muted);
border-radius: 4px;
cursor: pointer;
transition: all 0.15s ease;
flex-shrink: 0;
}
.iam-copy-key:hover {
background: var(--myfsio-hover-bg);
color: var(--myfsio-text);
}
.iam-no-results {
text-align: center;
padding: 2rem 1rem;
color: var(--myfsio-muted);
}
@media (max-width: 768px) {
.iam-user-card:hover {
transform: none;
}
}
.user-avatar-lg {
@@ -1288,6 +1407,20 @@ html.sidebar-will-collapse .sidebar-user {
padding: 2rem 1rem;
}
#preview-text {
padding: 1rem 1.125rem;
max-height: 360px;
overflow: auto;
white-space: pre-wrap;
word-break: break-word;
font-family: 'SFMono-Regular', 'Menlo', 'Consolas', 'Liberation Mono', monospace;
font-size: .8rem;
line-height: 1.6;
tab-size: 4;
color: var(--myfsio-text);
background: transparent;
}
.upload-progress-stack {
display: flex;
flex-direction: column;
@@ -1463,6 +1596,11 @@ html.sidebar-will-collapse .sidebar-user {
border: 1px solid var(--myfsio-card-border);
}
.policy-editor-disabled {
opacity: 0.72;
cursor: not-allowed;
}
.objects-table-container {
max-height: 600px;
overflow-y: auto;
@@ -1875,7 +2013,9 @@ pre {
}
[data-theme='dark'] pre {
- background-color: rgba(248, 250, 252, 0.05);
+ background-color: #111827;
+ border-color: rgba(148, 163, 184, 0.24);
+ color: #e5eefb;
}
pre code {
@@ -1884,6 +2024,16 @@ pre code {
color: inherit;
}
[data-theme='dark'] .docs-section .bg-light {
background-color: #182235 !important;
border: 1px solid rgba(148, 163, 184, 0.18);
color: #e5eefb;
}
[data-theme='dark'] .docs-section .bg-light .text-muted {
color: #a9b6c8 !important;
}
.docs-section + .docs-section {
margin-top: 1.25rem;
}
@@ -2542,7 +2692,7 @@ pre code {
}
.objects-table-container {
- max-height: none;
+ max-height: 60vh;
}
.preview-card {
@@ -2805,6 +2955,195 @@ body:has(.login-card) .main-wrapper {
padding-top: 0 !important;
}
.context-menu {
position: fixed;
z-index: 1060;
min-width: 180px;
background: var(--myfsio-card-bg);
border: 1px solid var(--myfsio-card-border);
border-radius: 0.5rem;
box-shadow: 0 10px 25px -5px rgba(0, 0, 0, 0.15), 0 8px 10px -6px rgba(0, 0, 0, 0.1);
padding: 0.25rem 0;
font-size: 0.875rem;
}
[data-theme='dark'] .context-menu {
box-shadow: 0 10px 25px -5px rgba(0, 0, 0, 0.4), 0 8px 10px -6px rgba(0, 0, 0, 0.3);
}
.context-menu-item {
display: flex;
align-items: center;
gap: 0.625rem;
padding: 0.5rem 0.875rem;
color: var(--myfsio-text);
cursor: pointer;
transition: background-color 0.1s ease;
border: none;
background: none;
width: 100%;
text-align: left;
font-size: inherit;
}
.context-menu-item:hover {
background-color: var(--myfsio-hover-bg);
}
.context-menu-item.text-danger:hover {
background-color: rgba(239, 68, 68, 0.1);
}
.context-menu-divider {
height: 1px;
background: var(--myfsio-card-border);
margin: 0.25rem 0;
}
.context-menu-shortcut {
margin-left: auto;
font-size: 0.75rem;
color: var(--myfsio-muted);
}
.kbd-shortcuts-list {
display: flex;
flex-direction: column;
gap: 0.5rem;
}
.kbd-shortcuts-list .shortcut-row {
display: flex;
align-items: center;
justify-content: space-between;
padding: 0.375rem 0;
}
.kbd-shortcuts-list kbd {
display: inline-flex;
align-items: center;
justify-content: center;
min-width: 1.75rem;
padding: 0.2rem 0.5rem;
font-family: inherit;
font-size: 0.75rem;
font-weight: 600;
background: var(--myfsio-preview-bg);
border: 1px solid var(--myfsio-card-border);
border-radius: 0.25rem;
box-shadow: 0 1px 0 1px rgba(0, 0, 0, 0.05);
color: var(--myfsio-text);
}
[data-theme='dark'] .kbd-shortcuts-list kbd {
background: rgba(255, 255, 255, 0.1);
box-shadow: 0 1px 0 1px rgba(0, 0, 0, 0.2);
}
.sort-dropdown .dropdown-item.active,
.sort-dropdown .dropdown-item:active {
background-color: var(--myfsio-hover-bg);
color: var(--myfsio-text);
}
.sort-dropdown .dropdown-item {
font-size: 0.875rem;
padding: 0.375rem 1rem;
}
.tag-pill {
display: inline-flex;
border-radius: 9999px;
border: 1px solid var(--myfsio-tag-border);
overflow: hidden;
font-size: 0.75rem;
line-height: 1;
}
.tag-pill-key {
padding: 0.3rem 0.5rem;
background: var(--myfsio-tag-key-bg);
color: var(--myfsio-tag-key-text);
font-weight: 600;
}
.tag-pill-value {
padding: 0.3rem 0.5rem;
background: var(--myfsio-tag-value-bg);
color: var(--myfsio-tag-value-text);
font-weight: 400;
}
.tag-editor-card {
background: var(--myfsio-preview-bg);
border-radius: 0.5rem;
padding: 0.75rem;
}
.tag-editor-header,
.tag-editor-row {
display: grid;
grid-template-columns: 1fr 1fr 28px;
gap: 0.5rem;
align-items: center;
}
.tag-editor-header {
padding-bottom: 0.375rem;
border-bottom: 1px solid var(--myfsio-card-border);
margin-bottom: 0.5rem;
}
.tag-editor-header span {
font-size: 0.7rem;
font-weight: 600;
text-transform: uppercase;
color: var(--myfsio-muted);
letter-spacing: 0.05em;
}
.tag-editor-row {
margin-bottom: 0.375rem;
}
.tag-editor-delete {
display: inline-flex;
align-items: center;
justify-content: center;
width: 28px;
height: 28px;
border: none;
background: transparent;
color: var(--myfsio-muted);
border-radius: 0.375rem;
cursor: pointer;
transition: color 0.15s, background 0.15s;
}
.tag-editor-delete:hover {
color: var(--myfsio-tag-delete-hover);
background: rgba(239, 68, 68, 0.1);
}
.tag-editor-actions {
display: flex;
align-items: center;
gap: 0.5rem;
margin-top: 0.75rem;
padding-top: 0.5rem;
border-top: 1px solid var(--myfsio-card-border);
}
@media (prefers-reduced-motion: reduce) {
*,
*::before,
*::after {
animation-duration: 0.01ms !important;
animation-iteration-count: 1 !important;
transition-duration: 0.01ms !important;
}
}
@media print {
.sidebar,
.mobile-header {

View File

Binary image changed (200 KiB before and after).

View File

Binary image changed (872 KiB before and after).

View File

@@ -3,6 +3,8 @@ window.BucketDetailUpload = (function() {
const MULTIPART_THRESHOLD = 8 * 1024 * 1024;
const CHUNK_SIZE = 8 * 1024 * 1024;
const MAX_PART_RETRIES = 3;
const RETRY_BASE_DELAY_MS = 1000;
let state = {
isUploading: false,
@@ -204,6 +206,67 @@ window.BucketDetailUpload = (function() {
}
}
function uploadPartXHR(url, chunk, csrfToken, baseBytes, fileSize, progressItem, partNumber, totalParts) {
return new Promise((resolve, reject) => {
const xhr = new XMLHttpRequest();
xhr.open('PUT', url, true);
xhr.setRequestHeader('X-CSRFToken', csrfToken || '');
xhr.upload.addEventListener('progress', (e) => {
if (e.lengthComputable) {
updateProgressItem(progressItem, {
status: `Part ${partNumber}/${totalParts}`,
loaded: baseBytes + e.loaded,
total: fileSize
});
}
});
xhr.addEventListener('load', () => {
if (xhr.status >= 200 && xhr.status < 300) {
try {
resolve(JSON.parse(xhr.responseText));
} catch {
reject(new Error(`Part ${partNumber}: invalid response`));
}
} else {
try {
const data = JSON.parse(xhr.responseText);
reject(new Error(data.error || `Part ${partNumber} failed (${xhr.status})`));
} catch {
reject(new Error(`Part ${partNumber} failed (${xhr.status})`));
}
}
});
xhr.addEventListener('error', () => reject(new Error(`Part ${partNumber}: network error`)));
xhr.addEventListener('abort', () => reject(new Error(`Part ${partNumber}: aborted`)));
xhr.send(chunk);
});
}
async function uploadPartWithRetry(url, chunk, csrfToken, baseBytes, fileSize, progressItem, partNumber, totalParts) {
let lastError;
for (let attempt = 0; attempt <= MAX_PART_RETRIES; attempt++) {
try {
return await uploadPartXHR(url, chunk, csrfToken, baseBytes, fileSize, progressItem, partNumber, totalParts);
} catch (err) {
lastError = err;
if (attempt < MAX_PART_RETRIES) {
const delay = RETRY_BASE_DELAY_MS * Math.pow(2, attempt);
updateProgressItem(progressItem, {
status: `Part ${partNumber}/${totalParts} retry ${attempt + 1}/${MAX_PART_RETRIES}...`,
loaded: baseBytes,
total: fileSize
});
await new Promise(r => setTimeout(r, delay));
}
}
}
throw lastError;
}
async function uploadMultipart(file, objectKey, metadata, progressItem, urls) {
const csrfToken = document.querySelector('input[name="csrf_token"]')?.value;
@@ -233,26 +296,14 @@ window.BucketDetailUpload = (function() {
const end = Math.min(start + CHUNK_SIZE, file.size);
const chunk = file.slice(start, end);
- updateProgressItem(progressItem, {
- status: `Part ${partNumber}/${totalParts}`,
- loaded: uploadedBytes,
- total: file.size
- });
+ const partData = await uploadPartWithRetry(
+ `${partUrl}?partNumber=${partNumber}`,
+ chunk, csrfToken, uploadedBytes, file.size,
+ progressItem, partNumber, totalParts
+ );
- const partResp = await fetch(`${partUrl}?partNumber=${partNumber}`, {
- method: 'PUT',
- headers: { 'X-CSRFToken': csrfToken || '' },
- body: chunk
- });
- if (!partResp.ok) {
- const err = await partResp.json().catch(() => ({}));
- throw new Error(err.error || `Part ${partNumber} failed`);
- }
- const partData = await partResp.json();
parts.push({ part_number: partNumber, etag: partData.etag });
- uploadedBytes += chunk.size;
+ uploadedBytes += (end - start);
updateProgressItem(progressItem, {
loaded: uploadedBytes,
@@ -293,6 +344,7 @@ window.BucketDetailUpload = (function() {
const xhr = new XMLHttpRequest();
xhr.open('POST', formAction, true);
xhr.setRequestHeader('X-Requested-With', 'XMLHttpRequest');
xhr.setRequestHeader('X-CSRFToken', csrfToken || '');
xhr.upload.addEventListener('progress', (e) => {
if (e.lengthComputable) {
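
The retry schedule that `uploadPartWithRetry` introduces in this file (one initial attempt plus up to `MAX_PART_RETRIES` retries, with the delay doubling from `RETRY_BASE_DELAY_MS`) can be restated as a small std-only Rust sketch. It is an illustration of the schedule only, not code from the repository: `retry_delay` and `with_retry` are hypothetical names, and the sleep function is injected so the delays can be observed without waiting.

```rust
use std::time::Duration;

// Constants mirror the values added to the upload script in this diff.
const MAX_PART_RETRIES: u32 = 3;
const RETRY_BASE_DELAY_MS: u64 = 1000;

/// Delay applied after a failed zero-based `attempt`: base * 2^attempt.
fn retry_delay(attempt: u32) -> Duration {
    Duration::from_millis(RETRY_BASE_DELAY_MS * 2u64.pow(attempt))
}

/// Call `op` up to MAX_PART_RETRIES + 1 times and return the first
/// success or the last error. `sleep` is a parameter (an assumption of
/// this sketch) so callers decide how the waiting actually happens.
fn with_retry<T, E>(
    mut op: impl FnMut(u32) -> Result<T, E>,
    mut sleep: impl FnMut(Duration),
) -> Result<T, E> {
    let mut attempt = 0;
    loop {
        match op(attempt) {
            Ok(value) => return Ok(value),
            // Retries exhausted: surface the most recent error.
            Err(e) if attempt >= MAX_PART_RETRIES => return Err(e),
            Err(_) => {
                sleep(retry_delay(attempt));
                attempt += 1;
            }
        }
    }
}
```

With these constants a part that never succeeds is attempted four times, with waits of 1 s, 2 s, and 4 s between attempts.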

View File

@@ -78,7 +78,7 @@ window.ConnectionsManagement = (function() {
try {
var controller = new AbortController();
- var timeoutId = setTimeout(function() { controller.abort(); }, 15000);
+ var timeoutId = setTimeout(function() { controller.abort(); }, 10000);
var response = await fetch(endpoints.healthTemplate.replace('CONNECTION_ID', connectionId), {
signal: controller.signal
@@ -147,7 +147,7 @@ window.ConnectionsManagement = (function() {
'<button type="button" class="btn btn-outline-secondary" data-bs-toggle="modal" data-bs-target="#editConnectionModal" ' +
'data-id="' + window.UICore.escapeHtml(conn.id) + '" data-name="' + window.UICore.escapeHtml(conn.name) + '" ' +
'data-endpoint="' + window.UICore.escapeHtml(conn.endpoint_url) + '" data-region="' + window.UICore.escapeHtml(conn.region) + '" ' +
- 'data-access="' + window.UICore.escapeHtml(conn.access_key) + '" data-secret="' + window.UICore.escapeHtml(conn.secret_key || '') + '" title="Edit connection">' +
+ 'data-access="' + window.UICore.escapeHtml(conn.access_key) + '" title="Edit connection">' +
'<svg xmlns="http://www.w3.org/2000/svg" width="14" height="14" fill="currentColor" viewBox="0 0 16 16">' +
'<path d="M12.146.146a.5.5 0 0 1 .708 0l3 3a.5.5 0 0 1 0 .708l-10 10a.5.5 0 0 1-.168.11l-5 2a.5.5 0 0 1-.65-.65l2-5a.5.5 0 0 1 .11-.168l10-10zM11.207 2.5 13.5 4.793 14.793 3.5 12.5 1.207 11.207 2.5zm1.586 3L10.5 3.207 4 9.707V10h.5a.5.5 0 0 1 .5.5v.5h.5a.5.5 0 0 1 .5.5v.5h.293l6.5-6.5z"/></svg></button>' +
'<button type="button" class="btn btn-outline-danger" data-bs-toggle="modal" data-bs-target="#deleteConnectionModal" ' +
@@ -185,7 +185,9 @@ window.ConnectionsManagement = (function() {
document.getElementById('edit_endpoint_url').value = button.getAttribute('data-endpoint') || '';
document.getElementById('edit_region').value = button.getAttribute('data-region') || '';
document.getElementById('edit_access_key').value = button.getAttribute('data-access') || '';
- document.getElementById('edit_secret_key').value = button.getAttribute('data-secret') || '';
+ document.getElementById('edit_secret_key').value = '';
+ document.getElementById('edit_secret_key').placeholder = '(unchanged — leave blank to keep current)';
+ document.getElementById('edit_secret_key').required = false;
document.getElementById('editTestResult').innerHTML = '';
var form = document.getElementById('editConnectionForm');
@@ -288,9 +290,6 @@ window.ConnectionsManagement = (function() {
editBtn.setAttribute('data-endpoint', data.connection.endpoint_url);
editBtn.setAttribute('data-region', data.connection.region);
editBtn.setAttribute('data-access', data.connection.access_key);
- if (data.connection.secret_key) {
- editBtn.setAttribute('data-secret', data.connection.secret_key);
- }
}
var deleteBtn = row.querySelector('[data-bs-target="#deleteConnectionModal"]');

View File

@@ -11,16 +11,70 @@ window.IAMManagement = (function() {
var editUserModal = null;
var deleteUserModal = null;
var rotateSecretModal = null;
var expiryModal = null;
var currentRotateKey = null;
var currentEditKey = null;
var currentDeleteKey = null;
var currentEditAccessKey = null;
var currentDeleteAccessKey = null;
var currentExpiryKey = null;
var currentExpiryAccessKey = null;
var ALL_S3_ACTIONS = [
'list', 'read', 'write', 'delete', 'share', 'policy',
'replication', 'lifecycle', 'cors',
'create_bucket', 'delete_bucket',
'versioning', 'tagging', 'encryption', 'quota',
'object_lock', 'notification', 'logging', 'website'
];
var policyTemplates = {
- full: [{ bucket: '*', actions: ['list', 'read', 'write', 'delete', 'share', 'policy', 'replication', 'lifecycle', 'cors', 'iam:*'] }],
+ full: [{ bucket: '*', actions: ['list', 'read', 'write', 'delete', 'share', 'policy', 'create_bucket', 'delete_bucket', 'replication', 'lifecycle', 'cors', 'versioning', 'tagging', 'encryption', 'quota', 'object_lock', 'notification', 'logging', 'website', 'iam:*'] }],
readonly: [{ bucket: '*', actions: ['list', 'read'] }],
- writer: [{ bucket: '*', actions: ['list', 'read', 'write'] }]
+ writer: [{ bucket: '*', actions: ['list', 'read', 'write'] }],
+ operator: [{ bucket: '*', actions: ['list', 'read', 'write', 'delete', 'create_bucket', 'delete_bucket'] }],
+ bucketadmin: [{ bucket: '*', actions: ['list', 'read', 'write', 'delete', 'share', 'policy', 'create_bucket', 'delete_bucket', 'versioning', 'tagging', 'encryption', 'cors', 'lifecycle', 'quota', 'object_lock', 'notification', 'logging', 'website', 'replication'] }]
};
function isAdminUser(policies) {
if (!policies || !policies.length) return false;
return policies.some(function(p) {
return p.actions && (p.actions.indexOf('iam:*') >= 0 || p.actions.indexOf('*') >= 0);
});
}
function getPermissionLevel(actions) {
if (!actions || !actions.length) return 'Custom (0)';
if (actions.indexOf('*') >= 0) return 'Full Access';
if (actions.length >= ALL_S3_ACTIONS.length) {
var hasAll = ALL_S3_ACTIONS.every(function(a) { return actions.indexOf(a) >= 0; });
if (hasAll) return 'Full Access';
}
var has = function(a) { return actions.indexOf(a) >= 0; };
if (has('list') && has('read') && has('write') && has('delete')) return 'Read + Write + Delete';
if (has('list') && has('read') && has('write')) return 'Read + Write';
if (has('list') && has('read')) return 'Read Only';
return 'Custom (' + actions.length + ')';
}
function getBucketLabel(bucket) {
return bucket === '*' ? 'All Buckets' : bucket;
}
function buildUserUrl(template, userId) {
return template.replace('USER_ID', encodeURIComponent(userId));
}
function getUserByIdentifier(identifier) {
return users.find(function(u) {
return u.user_id === identifier || u.access_key === identifier;
}) || null;
}
function getUserById(userId) {
return users.find(function(u) { return u.user_id === userId; }) || null;
}
function init(config) {
users = config.users || [];
currentUserKey = config.currentUserKey || null;
@@ -38,7 +92,10 @@ window.IAMManagement = (function() {
setupEditUserModal();
setupDeleteUserModal();
setupRotateSecretModal();
setupExpiryModal();
setupFormHandlers();
setupSearch();
setupCopyAccessKeyButtons();
}
function initModals() {
@@ -46,11 +103,13 @@ window.IAMManagement = (function() {
var editModalEl = document.getElementById('editUserModal');
var deleteModalEl = document.getElementById('deleteUserModal');
var rotateModalEl = document.getElementById('rotateSecretModal');
var expiryModalEl = document.getElementById('expiryModal');
if (policyModalEl) policyModal = new bootstrap.Modal(policyModalEl);
if (editModalEl) editUserModal = new bootstrap.Modal(editModalEl);
if (deleteModalEl) deleteUserModal = new bootstrap.Modal(deleteModalEl);
if (rotateModalEl) rotateSecretModal = new bootstrap.Modal(rotateModalEl);
if (expiryModalEl) expiryModal = new bootstrap.Modal(expiryModalEl);
}
function setupJsonAutoIndent() {
@@ -68,6 +127,15 @@ window.IAMManagement = (function() {
});
});
var accessKeyCopyButton = document.querySelector('[data-access-key-copy]');
if (accessKeyCopyButton) {
accessKeyCopyButton.addEventListener('click', async function() {
var accessKeyInput = document.getElementById('disclosedAccessKeyValue');
if (!accessKeyInput) return;
await window.UICore.copyToClipboard(accessKeyInput.value, accessKeyCopyButton, 'Copy');
});
}
var secretCopyButton = document.querySelector('[data-secret-copy]');
if (secretCopyButton) {
secretCopyButton.addEventListener('click', async function() {
@@ -78,8 +146,8 @@ window.IAMManagement = (function() {
}
}
- function getUserPolicies(accessKey) {
- var user = users.find(function(u) { return u.access_key === accessKey; });
+ function getUserPolicies(identifier) {
+ var user = getUserByIdentifier(identifier);
return user ? JSON.stringify(user.policies, null, 2) : '';
}
@@ -91,7 +159,7 @@ window.IAMManagement = (function() {
function setupPolicyEditor() {
var userLabelEl = document.getElementById('policyEditorUserLabel');
- var userInputEl = document.getElementById('policyEditorUser');
+ var userInputEl = document.getElementById('policyEditorUserId');
var textareaEl = document.getElementById('policyEditorDocument');
document.querySelectorAll('[data-policy-template]').forEach(function(button) {
@@ -102,18 +170,35 @@ window.IAMManagement = (function() {
document.querySelectorAll('[data-policy-editor]').forEach(function(button) {
button.addEventListener('click', function() {
- var key = button.getAttribute('data-access-key');
- if (!key) return;
+ var userId = button.dataset.userId;
+ var accessKey = button.dataset.accessKey || userId;
+ if (!userId) return;
- userLabelEl.textContent = key;
- userInputEl.value = key;
- textareaEl.value = getUserPolicies(key);
+ userLabelEl.textContent = accessKey;
+ userInputEl.value = userId;
+ textareaEl.value = getUserPolicies(userId);
policyModal.show();
});
});
}
function generateSecureHex(byteCount) {
var arr = new Uint8Array(byteCount);
crypto.getRandomValues(arr);
return Array.from(arr).map(function(b) { return b.toString(16).padStart(2, '0'); }).join('');
}
function generateSecureBase64(byteCount) {
var arr = new Uint8Array(byteCount);
crypto.getRandomValues(arr);
var binary = '';
for (var i = 0; i < arr.length; i++) {
binary += String.fromCharCode(arr[i]);
}
return btoa(binary).replace(/\+/g, '-').replace(/\//g, '_').replace(/=+$/, '');
}
function setupCreateUserModal() {
var createUserPoliciesEl = document.getElementById('createUserPolicies');
@@ -122,6 +207,22 @@ window.IAMManagement = (function() {
applyPolicyTemplate(button.dataset.createPolicyTemplate, createUserPoliciesEl);
});
});
var genAccessKeyBtn = document.getElementById('generateAccessKeyBtn');
if (genAccessKeyBtn) {
genAccessKeyBtn.addEventListener('click', function() {
var input = document.getElementById('createUserAccessKey');
if (input) input.value = generateSecureHex(8);
});
}
var genSecretKeyBtn = document.getElementById('generateSecretKeyBtn');
if (genSecretKeyBtn) {
genSecretKeyBtn.addEventListener('click', function() {
var input = document.getElementById('createUserSecretKey');
if (input) input.value = generateSecureBase64(24);
});
}
}
function setupEditUserModal() {
@@ -130,11 +231,13 @@ window.IAMManagement = (function() {
document.querySelectorAll('[data-edit-user]').forEach(function(btn) {
btn.addEventListener('click', function() {
- var key = btn.dataset.editUser;
+ var key = btn.dataset.userId;
+ var accessKey = btn.dataset.accessKey || key;
var name = btn.dataset.displayName;
currentEditKey = key;
+ currentEditAccessKey = accessKey;
editUserDisplayName.value = name;
- editUserForm.action = endpoints.updateUser.replace('ACCESS_KEY', key);
+ editUserForm.action = buildUserUrl(endpoints.updateUser, key);
editUserModal.show();
});
});
@@ -147,12 +250,14 @@ window.IAMManagement = (function() {
document.querySelectorAll('[data-delete-user]').forEach(function(btn) {
btn.addEventListener('click', function() {
- var key = btn.dataset.deleteUser;
+ var key = btn.dataset.userId;
+ var accessKey = btn.dataset.accessKey || key;
currentDeleteKey = key;
- deleteUserLabel.textContent = key;
- deleteUserForm.action = endpoints.deleteUser.replace('ACCESS_KEY', key);
+ currentDeleteAccessKey = accessKey;
+ deleteUserLabel.textContent = accessKey;
+ deleteUserForm.action = buildUserUrl(endpoints.deleteUser, key);
- if (key === currentUserKey) {
+ if (accessKey === currentUserKey) {
deleteSelfWarning.classList.remove('d-none');
} else {
deleteSelfWarning.classList.add('d-none');
@@ -175,8 +280,8 @@ window.IAMManagement = (function() {
document.querySelectorAll('[data-rotate-user]').forEach(function(btn) {
btn.addEventListener('click', function() {
- currentRotateKey = btn.dataset.rotateUser;
- rotateUserLabel.textContent = currentRotateKey;
+ currentRotateKey = btn.dataset.userId;
+ rotateUserLabel.textContent = btn.dataset.accessKey || currentRotateKey;
rotateSecretConfirm.classList.remove('d-none');
rotateSecretResult.classList.add('d-none');
@@ -195,7 +300,7 @@ window.IAMManagement = (function() {
window.UICore.setButtonLoading(confirmRotateBtn, true, 'Rotating...');
try {
- var url = endpoints.rotateSecret.replace('ACCESS_KEY', currentRotateKey);
+ var url = buildUserUrl(endpoints.rotateSecret, currentRotateKey);
var response = await fetch(url, {
method: 'POST',
headers: {
@@ -242,23 +347,108 @@ window.IAMManagement = (function() {
}
}
function createUserCardHtml(accessKey, displayName, policies) {
function openExpiryModal(key, expiresAt) {
currentExpiryKey = key;
var user = getUserByIdentifier(key);
var label = document.getElementById('expiryUserLabel');
var input = document.getElementById('expiryDateInput');
var form = document.getElementById('expiryForm');
if (label) label.textContent = currentExpiryAccessKey || (user ? user.access_key : key);
if (expiresAt) {
try {
var dt = new Date(expiresAt);
var local = new Date(dt.getTime() - dt.getTimezoneOffset() * 60000);
if (input) input.value = local.toISOString().slice(0, 16);
} catch(e) {
if (input) input.value = '';
}
} else {
if (input) input.value = '';
}
if (form) form.action = buildUserUrl(endpoints.updateExpiry, key);
var modalEl = document.getElementById('expiryModal');
if (modalEl) {
var modal = bootstrap.Modal.getOrCreateInstance(modalEl);
modal.show();
}
}
function setupExpiryModal() {
document.querySelectorAll('[data-expiry-user]').forEach(function(btn) {
btn.addEventListener('click', function(e) {
e.preventDefault();
currentExpiryAccessKey = btn.dataset.accessKey || btn.dataset.userId;
openExpiryModal(btn.dataset.userId, btn.dataset.expiresAt || '');
});
});
document.querySelectorAll('[data-expiry-preset]').forEach(function(btn) {
btn.addEventListener('click', function() {
var preset = btn.dataset.expiryPreset;
var input = document.getElementById('expiryDateInput');
if (!input) return;
if (preset === 'clear') {
input.value = '';
return;
}
var now = new Date();
var ms = 0;
if (preset === '1h') ms = 3600000;
else if (preset === '24h') ms = 86400000;
else if (preset === '7d') ms = 7 * 86400000;
else if (preset === '30d') ms = 30 * 86400000;
else if (preset === '90d') ms = 90 * 86400000;
var future = new Date(now.getTime() + ms);
var local = new Date(future.getTime() - future.getTimezoneOffset() * 60000);
input.value = local.toISOString().slice(0, 16);
});
});
var expiryForm = document.getElementById('expiryForm');
if (expiryForm) {
expiryForm.addEventListener('submit', function(e) {
e.preventDefault();
window.UICore.submitFormAjax(expiryForm, {
successMessage: 'Expiry updated',
onSuccess: function() {
var modalEl = document.getElementById('expiryModal');
if (modalEl) bootstrap.Modal.getOrCreateInstance(modalEl).hide();
window.location.reload();
}
});
});
}
}
function createUserCardHtml(user) {
var userId = user.user_id || '';
var accessKey = user.access_key || userId;
var displayName = user.display_name || accessKey;
var policies = user.policies || [];
var expiresAt = user.expires_at || '';
var admin = isAdminUser(policies);
var cardClass = 'card h-100 iam-user-card' + (admin ? ' iam-admin-card' : '');
var roleBadge = admin
? '<span class="iam-role-badge iam-role-admin" data-role-badge>Admin</span>'
: '<span class="iam-role-badge iam-role-user" data-role-badge>User</span>';
var policyBadges = '';
if (policies && policies.length > 0) {
policyBadges = policies.map(function(p) {
var actionText = p.actions && p.actions.includes('*') ? 'full' : (p.actions ? p.actions.length : 0);
return '<span class="badge bg-primary bg-opacity-10 text-primary">' +
var bucketLabel = getBucketLabel(p.bucket);
var permLevel = getPermissionLevel(p.actions);
return '<span class="iam-perm-badge">' +
'<svg xmlns="http://www.w3.org/2000/svg" width="10" height="10" fill="currentColor" class="me-1" viewBox="0 0 16 16">' +
'<path d="M2.522 5H2a.5.5 0 0 0-.494.574l1.372 9.149A1.5 1.5 0 0 0 4.36 16h7.278a1.5 1.5 0 0 0 1.483-1.277l1.373-9.149A.5.5 0 0 0 14 5h-.522A5.5 5.5 0 0 0 2.522 5zm1.005 0a4.5 4.5 0 0 1 8.945 0H3.527z"/>' +
'</svg>' + window.UICore.escapeHtml(p.bucket) +
'<span class="opacity-75">(' + actionText + ')</span></span>';
'</svg>' + window.UICore.escapeHtml(bucketLabel) + ' &middot; ' + window.UICore.escapeHtml(permLevel) + '</span>';
}).join('');
} else {
policyBadges = '<span class="badge bg-secondary bg-opacity-10 text-secondary">No policies</span>';
}
return '<div class="col-md-6 col-xl-4">' +
'<div class="card h-100 iam-user-card">' +
var esc = window.UICore.escapeHtml;
return '<div class="col-md-6 col-xl-4 iam-user-item" data-user-id="' + esc(userId) + '" data-access-key="' + esc(accessKey) + '" data-display-name="' + esc(displayName.toLowerCase()) + '" data-access-key-filter="' + esc(accessKey.toLowerCase()) + '">' +
'<div class="' + cardClass + '">' +
'<div class="card-body">' +
'<div class="d-flex align-items-start justify-content-between mb-3">' +
'<div class="d-flex align-items-center gap-3 min-width-0 overflow-hidden">' +
@@ -267,8 +457,18 @@ window.IAMManagement = (function() {
'<path d="M8 8a3 3 0 1 0 0-6 3 3 0 0 0 0 6zm2-3a2 2 0 1 1-4 0 2 2 0 0 1 4 0zm4 8c0 1-1 1-1 1H3s-1 0-1-1 1-4 6-4 6 3 6 4zm-1-.004c-.001-.246-.154-.986-.832-1.664C11.516 10.68 10.289 10 8 10c-2.29 0-3.516.68-4.168 1.332-.678.678-.83 1.418-.832 1.664h10z"/>' +
'</svg></div>' +
'<div class="min-width-0">' +
'<h6 class="fw-semibold mb-0 text-truncate" title="' + window.UICore.escapeHtml(displayName) + '">' + window.UICore.escapeHtml(displayName) + '</h6>' +
'<code class="small text-muted d-block text-truncate" title="' + window.UICore.escapeHtml(accessKey) + '">' + window.UICore.escapeHtml(accessKey) + '</code>' +
'<div class="d-flex align-items-center gap-2 mb-0">' +
'<h6 class="fw-semibold mb-0 text-truncate" title="' + esc(displayName) + '">' + esc(displayName) + '</h6>' +
roleBadge +
'</div>' +
'<div class="d-flex align-items-center gap-1">' +
'<code class="small text-muted text-truncate" title="' + esc(accessKey) + '">' + esc(accessKey) + '</code>' +
'<button type="button" class="iam-copy-key" title="Copy access key" data-copy-access-key="' + esc(accessKey) + '">' +
'<svg xmlns="http://www.w3.org/2000/svg" width="12" height="12" fill="currentColor" viewBox="0 0 16 16">' +
'<path d="M4 1.5H3a2 2 0 0 0-2 2V14a2 2 0 0 0 2 2h10a2 2 0 0 0 2-2V3.5a2 2 0 0 0-2-2h-1v1h1a1 1 0 0 1 1 1V14a1 1 0 0 1-1 1H3a1 1 0 0 1-1-1V3.5a1 1 0 0 1 1-1h1v-1z"/>' +
'<path d="M9.5 1a.5.5 0 0 1 .5.5v1a.5.5 0 0 1-.5.5h-3a.5.5 0 0 1-.5-.5v-1a.5.5 0 0 1 .5-.5h3zm-3-1A1.5 1.5 0 0 0 5 1.5v1A1.5 1.5 0 0 0 6.5 4h3A1.5 1.5 0 0 0 11 2.5v-1A1.5 1.5 0 0 0 9.5 0h-3z"/>' +
'</svg></button>' +
'</div>' +
'</div></div>' +
'<div class="dropdown flex-shrink-0">' +
'<button class="btn btn-sm btn-icon" type="button" data-bs-toggle="dropdown" aria-expanded="false">' +
@@ -276,29 +476,36 @@ window.IAMManagement = (function() {
'<path d="M9.5 13a1.5 1.5 0 1 1-3 0 1.5 1.5 0 0 1 3 0zm0-5a1.5 1.5 0 1 1-3 0 1.5 1.5 0 0 1 3 0zm0-5a1.5 1.5 0 1 1-3 0 1.5 1.5 0 0 1 3 0z"/>' +
'</svg></button>' +
'<ul class="dropdown-menu dropdown-menu-end">' +
'<li><button class="dropdown-item" type="button" data-edit-user="' + window.UICore.escapeHtml(accessKey) + '" data-display-name="' + window.UICore.escapeHtml(displayName) + '">' +
'<li><button class="dropdown-item" type="button" data-edit-user data-user-id="' + esc(userId) + '" data-access-key="' + esc(accessKey) + '" data-display-name="' + esc(displayName) + '">' +
'<svg xmlns="http://www.w3.org/2000/svg" width="14" height="14" fill="currentColor" class="me-2" viewBox="0 0 16 16"><path d="M12.146.146a.5.5 0 0 1 .708 0l3 3a.5.5 0 0 1 0 .708l-10 10a.5.5 0 0 1-.168.11l-5 2a.5.5 0 0 1-.65-.65l2-5a.5.5 0 0 1 .11-.168l10-10zM11.207 2.5 13.5 4.793 14.793 3.5 12.5 1.207 11.207 2.5zm1.586 3L10.5 3.207 4 9.707V10h.5a.5.5 0 0 1 .5.5v.5h.5a.5.5 0 0 1 .5.5v.5h.293l6.5-6.5z"/></svg>Edit Name</button></li>' +
'<li><button class="dropdown-item" type="button" data-rotate-user="' + window.UICore.escapeHtml(accessKey) + '">' +
'<li><button class="dropdown-item" type="button" data-expiry-user data-user-id="' + esc(userId) + '" data-access-key="' + esc(accessKey) + '" data-expires-at="' + esc(expiresAt) + '">' +
'<svg xmlns="http://www.w3.org/2000/svg" width="14" height="14" fill="currentColor" class="me-2" viewBox="0 0 16 16"><path d="M8 3.5a.5.5 0 0 0-1 0V9a.5.5 0 0 0 .252.434l3.5 2a.5.5 0 0 0 .496-.868L8 8.71V3.5z"/><path d="M8 16A8 8 0 1 0 8 0a8 8 0 0 0 0 16zm7-8A7 7 0 1 1 1 8a7 7 0 0 1 14 0z"/></svg>Set Expiry</button></li>' +
'<li><button class="dropdown-item" type="button" data-rotate-user data-user-id="' + esc(userId) + '" data-access-key="' + esc(accessKey) + '">' +
'<svg xmlns="http://www.w3.org/2000/svg" width="14" height="14" fill="currentColor" class="me-2" viewBox="0 0 16 16"><path d="M11.534 7h3.932a.25.25 0 0 1 .192.41l-1.966 2.36a.25.25 0 0 1-.384 0l-1.966-2.36a.25.25 0 0 1 .192-.41zm-11 2h3.932a.25.25 0 0 0 .192-.41L2.692 6.23a.25.25 0 0 0-.384 0L.342 8.59A.25.25 0 0 0 .534 9z"/><path fill-rule="evenodd" d="M8 3c-1.552 0-2.94.707-3.857 1.818a.5.5 0 1 1-.771-.636A6.002 6.002 0 0 1 13.917 7H12.9A5.002 5.002 0 0 0 8 3zM3.1 9a5.002 5.002 0 0 0 8.757 2.182.5.5 0 1 1 .771.636A6.002 6.002 0 0 1 2.083 9H3.1z"/></svg>Rotate Secret</button></li>' +
'<li><hr class="dropdown-divider"></li>' +
'<li><button class="dropdown-item text-danger" type="button" data-delete-user="' + window.UICore.escapeHtml(accessKey) + '">' +
'<li><button class="dropdown-item text-danger" type="button" data-delete-user data-user-id="' + esc(userId) + '" data-access-key="' + esc(accessKey) + '">' +
'<svg xmlns="http://www.w3.org/2000/svg" width="14" height="14" fill="currentColor" class="me-2" viewBox="0 0 16 16"><path d="M5.5 5.5a.5.5 0 0 1 .5.5v6a.5.5 0 0 1-1 0v-6a.5.5 0 0 1 .5-.5zm2.5 0a.5.5 0 0 1 .5.5v6a.5.5 0 0 1-1 0v-6a.5.5 0 0 1 .5-.5zm3 .5v6a.5.5 0 0 1-1 0v-6a.5.5 0 0 1 1 0z"/><path fill-rule="evenodd" d="M14.5 3a1 1 0 0 1-1 1H13v9a2 2 0 0 1-2 2H5a2 2 0 0 1-2-2V4h-.5a1 1 0 0 1-1-1V2a1 1 0 0 1 1-1H6a1 1 0 0 1 1-1h2a1 1 0 0 1 1 1h3.5a1 1 0 0 1 1 1v1zM4.118 4 4 4.059V13a1 1 0 0 0 1 1h6a1 1 0 0 0 1-1V4.059L11.882 4H4.118zM2.5 3V2h11v1h-11z"/></svg>Delete User</button></li>' +
'</ul></div></div>' +
'<div class="mb-3">' +
'<div class="small text-muted mb-2">Bucket Permissions</div>' +
'<div class="d-flex flex-wrap gap-1">' + policyBadges + '</div></div>' +
'<button class="btn btn-outline-primary btn-sm w-100" type="button" data-policy-editor data-access-key="' + window.UICore.escapeHtml(accessKey) + '">' +
'<div class="d-flex flex-wrap gap-1" data-policy-badges>' + policyBadges + '</div></div>' +
'<button class="btn btn-outline-primary btn-sm w-100" type="button" data-policy-editor data-user-id="' + esc(userId) + '" data-access-key="' + esc(accessKey) + '">' +
'<svg xmlns="http://www.w3.org/2000/svg" width="14" height="14" fill="currentColor" class="me-1" viewBox="0 0 16 16"><path d="M8 4.754a3.246 3.246 0 1 0 0 6.492 3.246 3.246 0 0 0 0-6.492zM5.754 8a2.246 2.246 0 1 1 4.492 0 2.246 2.246 0 0 1-4.492 0z"/><path d="M9.796 1.343c-.527-1.79-3.065-1.79-3.592 0l-.094.319a.873.873 0 0 1-1.255.52l-.292-.16c-1.64-.892-3.433.902-2.54 2.541l.159.292a.873.873 0 0 1-.52 1.255l-.319.094c-1.79.527-1.79 3.065 0 3.592l.319.094a.873.873 0 0 1 .52 1.255l-.16.292c-.892 1.64.901 3.434 2.541 2.54l.292-.159a.873.873 0 0 1 1.255.52l.094.319c.527 1.79 3.065 1.79 3.592 0l.094-.319a.873.873 0 0 1 1.255-.52l.292.16c1.64.893 3.434-.902 2.54-2.541l-.159-.292a.873.873 0 0 1 .52-1.255l.319-.094c1.79-.527 1.79-3.065 0-3.592l-.319-.094a.873.873 0 0 1-.52-1.255l.16-.292c.893-1.64-.902-3.433-2.541-2.54l-.292.159a.873.873 0 0 1-1.255-.52l-.094-.319z"/></svg>Manage Policies</button>' +
'</div></div></div>';
}
function attachUserCardHandlers(cardElement, accessKey, displayName) {
function attachUserCardHandlers(cardElement, user) {
var userId = user.user_id;
var accessKey = user.access_key;
var displayName = user.display_name;
var expiresAt = user.expires_at || '';
var editBtn = cardElement.querySelector('[data-edit-user]');
if (editBtn) {
editBtn.addEventListener('click', function() {
currentEditKey = accessKey;
currentEditKey = userId;
currentEditAccessKey = accessKey;
document.getElementById('editUserDisplayName').value = displayName;
document.getElementById('editUserForm').action = endpoints.updateUser.replace('ACCESS_KEY', accessKey);
document.getElementById('editUserForm').action = buildUserUrl(endpoints.updateUser, userId);
editUserModal.show();
});
}
@@ -306,9 +513,10 @@ window.IAMManagement = (function() {
var deleteBtn = cardElement.querySelector('[data-delete-user]');
if (deleteBtn) {
deleteBtn.addEventListener('click', function() {
currentDeleteKey = accessKey;
currentDeleteKey = userId;
currentDeleteAccessKey = accessKey;
document.getElementById('deleteUserLabel').textContent = accessKey;
document.getElementById('deleteUserForm').action = endpoints.deleteUser.replace('ACCESS_KEY', accessKey);
document.getElementById('deleteUserForm').action = buildUserUrl(endpoints.deleteUser, userId);
var deleteSelfWarning = document.getElementById('deleteSelfWarning');
if (accessKey === currentUserKey) {
deleteSelfWarning.classList.remove('d-none');
@@ -322,7 +530,7 @@ window.IAMManagement = (function() {
var rotateBtn = cardElement.querySelector('[data-rotate-user]');
if (rotateBtn) {
rotateBtn.addEventListener('click', function() {
currentRotateKey = accessKey;
currentRotateKey = userId;
document.getElementById('rotateUserLabel').textContent = accessKey;
document.getElementById('rotateSecretConfirm').classList.remove('d-none');
document.getElementById('rotateSecretResult').classList.add('d-none');
@@ -333,15 +541,31 @@ window.IAMManagement = (function() {
});
}
var expiryBtn = cardElement.querySelector('[data-expiry-user]');
if (expiryBtn) {
expiryBtn.addEventListener('click', function(e) {
e.preventDefault();
currentExpiryAccessKey = accessKey;
openExpiryModal(userId, expiresAt);
});
}
var policyBtn = cardElement.querySelector('[data-policy-editor]');
if (policyBtn) {
policyBtn.addEventListener('click', function() {
document.getElementById('policyEditorUserLabel').textContent = accessKey;
document.getElementById('policyEditorUser').value = accessKey;
document.getElementById('policyEditorDocument').value = getUserPolicies(accessKey);
document.getElementById('policyEditorUserId').value = userId;
document.getElementById('policyEditorDocument').value = getUserPolicies(userId);
policyModal.show();
});
}
var copyBtn = cardElement.querySelector('[data-copy-access-key]');
if (copyBtn) {
copyBtn.addEventListener('click', function() {
copyAccessKey(copyBtn);
});
}
}
function updateUserCount() {
@@ -375,10 +599,15 @@ window.IAMManagement = (function() {
'</svg>' +
'<div class="flex-grow-1">' +
'<div class="fw-semibold">New user created: <code>' + window.UICore.escapeHtml(data.access_key) + '</code></div>' +
'<p class="mb-2 small">This secret is only shown once. Copy it now and store it securely.</p>' +
'<p class="mb-2 small">These credentials are only shown once. Copy them now and store them securely.</p>' +
'</div>' +
'<button type="button" class="btn-close" data-bs-dismiss="alert" aria-label="Close"></button>' +
'</div>' +
'<div class="input-group mb-2">' +
'<span class="input-group-text"><strong>Access key</strong></span>' +
'<input class="form-control font-monospace" type="text" value="' + window.UICore.escapeHtml(data.access_key) + '" readonly />' +
'<button class="btn btn-outline-primary" type="button" id="copyNewUserAccessKey">Copy</button>' +
'</div>' +
'<div class="input-group">' +
'<span class="input-group-text"><strong>Secret key</strong></span>' +
'<input class="form-control font-monospace" type="text" value="' + window.UICore.escapeHtml(data.secret_key) + '" readonly id="newUserSecret" />' +
@@ -387,6 +616,9 @@ window.IAMManagement = (function() {
var container = document.querySelector('.page-header');
if (container) {
container.insertAdjacentHTML('afterend', alertHtml);
document.getElementById('copyNewUserAccessKey').addEventListener('click', async function() {
await window.UICore.copyToClipboard(data.access_key, this, 'Copy');
});
document.getElementById('copyNewUserSecret').addEventListener('click', async function() {
await window.UICore.copyToClipboard(data.secret_key, this, 'Copy');
});
@@ -408,15 +640,18 @@ window.IAMManagement = (function() {
}
if (usersGrid) {
var cardHtml = createUserCardHtml(data.access_key, data.display_name, data.policies);
usersGrid.insertAdjacentHTML('beforeend', cardHtml);
var newCard = usersGrid.lastElementChild;
attachUserCardHandlers(newCard, data.access_key, data.display_name);
users.push({
var newUser = {
user_id: data.user_id,
access_key: data.access_key,
display_name: data.display_name,
expires_at: data.expires_at || '',
policies: data.policies || []
});
};
var cardHtml = createUserCardHtml(newUser);
usersGrid.insertAdjacentHTML('beforeend', cardHtml);
var newCard = usersGrid.lastElementChild;
attachUserCardHandlers(newCard, newUser);
users.push(newUser);
updateUserCount();
}
}
@@ -428,34 +663,50 @@ window.IAMManagement = (function() {
if (policyEditorForm) {
policyEditorForm.addEventListener('submit', function(e) {
e.preventDefault();
var userInputEl = document.getElementById('policyEditorUser');
var key = userInputEl.value;
if (!key) return;
var userInputEl = document.getElementById('policyEditorUserId');
var userId = userInputEl.value;
if (!userId) return;
var template = policyEditorForm.dataset.actionTemplate;
policyEditorForm.action = template.replace('ACCESS_KEY_PLACEHOLDER', key);
policyEditorForm.action = template.replace('USER_ID_PLACEHOLDER', encodeURIComponent(userId));
window.UICore.submitFormAjax(policyEditorForm, {
successMessage: 'Policies updated',
onSuccess: function(data) {
policyModal.hide();
var userCard = document.querySelector('[data-access-key="' + key + '"]');
var userCard = document.querySelector('.iam-user-item[data-user-id="' + userId + '"]');
if (userCard) {
var badgeContainer = userCard.closest('.iam-user-card').querySelector('.d-flex.flex-wrap.gap-1');
var cardEl = userCard.querySelector('.iam-user-card');
var badgeContainer = cardEl ? cardEl.querySelector('[data-policy-badges]') : null;
if (badgeContainer && data.policies) {
var badges = data.policies.map(function(p) {
return '<span class="badge bg-primary bg-opacity-10 text-primary">' +
var bl = getBucketLabel(p.bucket);
var pl = getPermissionLevel(p.actions);
return '<span class="iam-perm-badge">' +
'<svg xmlns="http://www.w3.org/2000/svg" width="10" height="10" fill="currentColor" class="me-1" viewBox="0 0 16 16">' +
'<path d="M2.522 5H2a.5.5 0 0 0-.494.574l1.372 9.149A1.5 1.5 0 0 0 4.36 16h7.278a1.5 1.5 0 0 0 1.483-1.277l1.373-9.149A.5.5 0 0 0 14 5h-.522A5.5 5.5 0 0 0 2.522 5zm1.005 0a4.5 4.5 0 0 1 8.945 0H3.527z"/>' +
'</svg>' + window.UICore.escapeHtml(p.bucket) +
'<span class="opacity-75">(' + (p.actions.includes('*') ? 'full' : p.actions.length) + ')</span></span>';
'</svg>' + window.UICore.escapeHtml(bl) + ' &middot; ' + window.UICore.escapeHtml(pl) + '</span>';
}).join('');
badgeContainer.innerHTML = badges || '<span class="badge bg-secondary bg-opacity-10 text-secondary">No policies</span>';
}
if (cardEl) {
var nowAdmin = isAdminUser(data.policies);
cardEl.classList.toggle('iam-admin-card', nowAdmin);
var roleBadgeEl = cardEl.querySelector('[data-role-badge]');
if (roleBadgeEl) {
if (nowAdmin) {
roleBadgeEl.className = 'iam-role-badge iam-role-admin';
roleBadgeEl.textContent = 'Admin';
} else {
roleBadgeEl.className = 'iam-role-badge iam-role-user';
roleBadgeEl.textContent = 'User';
}
}
}
}
var userIndex = users.findIndex(function(u) { return u.access_key === key; });
var userIndex = users.findIndex(function(u) { return u.user_id === userId; });
if (userIndex >= 0 && data.policies) {
users[userIndex].policies = data.policies;
}
@@ -475,7 +726,7 @@ window.IAMManagement = (function() {
editUserModal.hide();
var newName = data.display_name || document.getElementById('editUserDisplayName').value;
var editBtn = document.querySelector('[data-edit-user="' + key + '"]');
var editBtn = document.querySelector('[data-edit-user][data-user-id="' + key + '"]');
if (editBtn) {
editBtn.setAttribute('data-display-name', newName);
var card = editBtn.closest('.iam-user-card');
@@ -485,15 +736,19 @@ window.IAMManagement = (function() {
nameEl.textContent = newName;
nameEl.title = newName;
}
var itemWrapper = card.closest('.iam-user-item');
if (itemWrapper) {
itemWrapper.setAttribute('data-display-name', newName.toLowerCase());
}
}
}
var userIndex = users.findIndex(function(u) { return u.access_key === key; });
var userIndex = users.findIndex(function(u) { return u.user_id === key; });
if (userIndex >= 0) {
users[userIndex].display_name = newName;
}
if (key === currentUserKey) {
if (currentEditAccessKey === currentUserKey) {
document.querySelectorAll('.sidebar-user .user-name').forEach(function(el) {
var truncated = newName.length > 16 ? newName.substring(0, 16) + '...' : newName;
el.textContent = truncated;
@@ -518,12 +773,12 @@ window.IAMManagement = (function() {
onSuccess: function(data) {
deleteUserModal.hide();
if (key === currentUserKey) {
if (currentDeleteAccessKey === currentUserKey) {
window.location.href = '/ui/';
return;
}
var deleteBtn = document.querySelector('[data-delete-user="' + key + '"]');
var deleteBtn = document.querySelector('[data-delete-user][data-user-id="' + key + '"]');
if (deleteBtn) {
var cardCol = deleteBtn.closest('[class*="col-"]');
if (cardCol) {
@@ -531,7 +786,7 @@ window.IAMManagement = (function() {
}
}
users = users.filter(function(u) { return u.access_key !== key; });
users = users.filter(function(u) { return u.user_id !== key; });
updateUserCount();
}
});
@@ -539,6 +794,52 @@ window.IAMManagement = (function() {
}
}
function setupSearch() {
var searchInput = document.getElementById('iam-user-search');
if (!searchInput) return;
searchInput.addEventListener('input', function() {
var query = searchInput.value.toLowerCase().trim();
var items = document.querySelectorAll('.iam-user-item');
var noResults = document.getElementById('iam-no-results');
var visibleCount = 0;
items.forEach(function(item) {
var name = item.getAttribute('data-display-name') || '';
var key = item.getAttribute('data-access-key-filter') || '';
var matches = !query || name.indexOf(query) >= 0 || key.indexOf(query) >= 0;
item.classList.toggle('d-none', !matches);
if (matches) visibleCount++;
});
if (noResults) {
noResults.classList.toggle('d-none', visibleCount > 0);
}
});
}
function copyAccessKey(btn) {
var key = btn.getAttribute('data-copy-access-key');
if (!key) return;
var originalHtml = btn.innerHTML;
navigator.clipboard.writeText(key).then(function() {
btn.innerHTML = '<svg xmlns="http://www.w3.org/2000/svg" width="12" height="12" fill="currentColor" viewBox="0 0 16 16"><path d="M13.854 3.646a.5.5 0 0 1 0 .708l-7 7a.5.5 0 0 1-.708 0l-3.5-3.5a.5.5 0 1 1 .708-.708L6.5 10.293l6.646-6.647a.5.5 0 0 1 .708 0z"/></svg>';
btn.style.color = '#22c55e';
setTimeout(function() {
btn.innerHTML = originalHtml;
btn.style.color = '';
}, 1200);
}).catch(function() {});
}
function setupCopyAccessKeyButtons() {
document.querySelectorAll('[data-copy-access-key]').forEach(function(btn) {
btn.addEventListener('click', function() {
copyAccessKey(btn);
});
});
}
return {
init: init
};
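
Note on the expiry handling in the hunks above: both `openExpiryModal` and the preset buttons turn a timestamp into the `YYYY-MM-DDTHH:MM` string for the expiry date input by subtracting the timezone offset before calling `toISOString()`. The sketch below isolates that conversion. The helper names `toLocalInputValue`, `presetToInputValue`, and `PRESET_MS` are hypothetical and do not appear in the commit. It is an illustration of the technique, not the project's code.

```javascript
// Hypothetical helper (not in the commit): format a Date as the
// "YYYY-MM-DDTHH:MM" local wall-clock string used for the expiry input.
function toLocalInputValue(date) {
  // getTimezoneOffset() is in minutes and is positive west of UTC, so
  // subtracting it shifts the instant so that its UTC fields equal the
  // local wall-clock fields of the original date.
  var shifted = new Date(date.getTime() - date.getTimezoneOffset() * 60000);
  // toISOString() always prints UTC; keep "YYYY-MM-DDTHH:MM" only.
  return shifted.toISOString().slice(0, 16);
}

// Hypothetical lookup table mirroring the preset values in the diff.
var PRESET_MS = {
  '1h': 3600000,
  '24h': 86400000,
  '7d': 7 * 86400000,
  '30d': 30 * 86400000,
  '90d': 90 * 86400000
};

// Hypothetical helper: value the input would receive for a preset button.
// 'clear' empties the field; an unknown preset falls back to "now",
// which matches the ms = 0 default in the diff.
function presetToInputValue(preset, now) {
  if (preset === 'clear') return '';
  var ms = PRESET_MS[preset] || 0;
  return toLocalInputValue(new Date(now.getTime() + ms));
}
```

Passing `now` in as a parameter, rather than calling `new Date()` inside, is a choice made here only so the sketch can be exercised with a fixed time.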

@@ -35,6 +35,8 @@ window.UICore = (function() {
var successMessage = options.successMessage || 'Operation completed';
var formData = new FormData(form);
var hasFileInput = !!form.querySelector('input[type="file"]');
var requestBody = hasFileInput ? formData : new URLSearchParams(formData);
var csrfToken = getCsrfToken();
var submitBtn = form.querySelector('[type="submit"]');
var originalHtml = submitBtn ? submitBtn.innerHTML : '';
@@ -46,14 +48,18 @@ window.UICore = (function() {
}
var formAction = form.getAttribute('action') || form.action;
var response = await fetch(formAction, {
method: form.getAttribute('method') || 'POST',
headers: {
'X-CSRFToken': csrfToken,
var headers = {
'X-CSRF-Token': csrfToken,
'Accept': 'application/json',
'X-Requested-With': 'XMLHttpRequest'
},
body: formData,
};
if (!hasFileInput) {
headers['Content-Type'] = 'application/x-www-form-urlencoded;charset=UTF-8';
}
var response = await fetch(formAction, {
method: form.getAttribute('method') || 'POST',
headers: headers,
body: requestBody,
redirect: 'follow'
});
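
Note on the hunk above: the change sends forms without a file input as URL-encoded text with an explicit `Content-Type`, and keeps the `FormData` body (with no explicit `Content-Type`, so the browser can set the multipart boundary) when a file input is present. A minimal sketch of that branching follows. The function name `buildSubmitRequest` and its `fields` parameter are hypothetical and are not part of `UICore`.

```javascript
// Hypothetical helper (not in the commit): choose body and headers the
// way the updated submitFormAjax hunk appears to.
// `fields` is a FormData or any iterable of [name, value] pairs.
function buildSubmitRequest(fields, hasFileInput, csrfToken) {
  var headers = {
    'X-CSRF-Token': csrfToken,
    'Accept': 'application/json',
    'X-Requested-With': 'XMLHttpRequest'
  };
  if (hasFileInput) {
    // Multipart upload: leave Content-Type unset so the runtime can add
    // the boundary parameter itself.
    return { headers: headers, body: fields };
  }
  // No files: send application/x-www-form-urlencoded text.
  headers['Content-Type'] = 'application/x-www-form-urlencoded;charset=UTF-8';
  return { headers: headers, body: new URLSearchParams(fields) };
}
```

The returned object can be spread into a `fetch` options object alongside `method` and `redirect`.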
@@ -191,6 +197,10 @@ window.UICore = (function() {
}
});
window.addEventListener('beforeunload', function() {
pollingManager.stopAll();
});
return {
getCsrfToken: getCsrfToken,
formatBytes: formatBytes,

Some files were not shown because too many files have changed in this diff.