5 Commits

Author SHA1 Message Date
5cac7c6241 Fix replication failing on large objects: multipart upload, per-part timeout, background healer, status metadata
- Replication now uses CreateMultipartUpload/UploadPart/CompleteMultipartUpload
  for objects above REPLICATION_STREAMING_THRESHOLD_BYTES, with adaptive part
  size (max(8 MiB, size/10000)) and 4 concurrent part uploads. Previously a
  single put_object was used regardless of size. Because the AWS SDK
  read_timeout (default 30s) bounded the whole transfer, any object larger
  than roughly bandwidth * 30s could never be replicated.
- REPLICATION_READ_TIMEOUT_SECONDS default raised 30 -> 120 and documented as
  per-part / per-attempt rather than end-to-end.
- Added background healer worker that auto-retries persisted replication
  failures with exponential backoff (60s..3600s) capped at
  REPLICATION_HEALER_MAX_ATTEMPTS attempts. Configurable via
  REPLICATION_HEALER_ENABLED / REPLICATION_HEALER_INTERVAL_SECONDS /
  REPLICATION_HEALER_MAX_ATTEMPTS.
- Source objects now carry __replication_status__ (PENDING/COMPLETED/FAILED)
  metadata, surfaced as the x-amz-replication-status response header on
  GetObject and HeadObject.
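
The adaptive part-size rule quoted in the first bullet, max(8 MiB, size/10000), can be sketched as below. This is a minimal illustration, not the code from the commit: the function and constant names are hypothetical, and rounding the division up (so the part count stays at or below 10,000) is an assumption the commit message does not state.

```rust
const MIB: u64 = 1024 * 1024;
/// Lower bound on part size, taken from the commit message.
const MIN_PART_SIZE: u64 = 8 * MIB;
/// Divisor from the commit message; it matches the S3 limit of 10,000 parts.
const MAX_PARTS: u64 = 10_000;

/// Ceiling division without risking overflow on `a + b - 1`.
fn div_ceil(a: u64, b: u64) -> u64 {
    a / b + u64::from(a % b != 0)
}

/// Hypothetical helper: part size = max(8 MiB, size / 10000).
/// Rounding up is an assumption made here so that the resulting
/// part count can never exceed MAX_PARTS.
pub fn part_size(object_size: u64) -> u64 {
    MIN_PART_SIZE.max(div_ceil(object_size, MAX_PARTS))
}

/// Number of parts an object of `object_size` bytes would be split into.
pub fn part_count(object_size: u64) -> u64 {
    div_ceil(object_size, part_size(object_size))
}
```

Under this rule the 8 MiB floor applies to every object up to 80,000 MiB (about 78 GiB); only beyond that does the part size grow with the object.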
2026-04-29 00:21:50 +08:00
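
The healer's retry schedule described in the commit above (exponential backoff from 60s to 3600s, bounded by REPLICATION_HEALER_MAX_ATTEMPTS) might look roughly like this. The function name is hypothetical, and doubling the delay on each attempt is an assumption: the commit gives only the bounds, not the growth factor.

```rust
use std::time::Duration;

/// Bounds taken from the commit message ("60s..3600s").
const BASE_DELAY_SECS: u64 = 60;
const MAX_DELAY_SECS: u64 = 3600;

/// Hypothetical helper: delay before the next retry, or `None` once
/// the attempt budget (REPLICATION_HEALER_MAX_ATTEMPTS) is used up.
/// `attempts_so_far` counts the failed attempts already made.
pub fn next_retry_delay(attempts_so_far: u32, max_attempts: u32) -> Option<Duration> {
    if attempts_so_far >= max_attempts {
        return None;
    }
    // Assumption: the delay doubles per attempt. checked_shl and
    // saturating_mul keep very large attempt counts from overflowing.
    let factor = 1u64.checked_shl(attempts_so_far).unwrap_or(u64::MAX);
    let secs = BASE_DELAY_SECS.saturating_mul(factor).min(MAX_DELAY_SECS);
    Some(Duration::from_secs(secs))
}
```

With these assumptions the delays run 60s, 120s, 240s, 480s, 960s, 1920s and then stay at 3600s until the attempt budget is exhausted.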
46267b4f78 S3 API compliance: response headers, conditional writes, replication, security 2026-04-28 13:33:07 +08:00
4d923df16c Fix S3 read-path response-header gaps and replication-loop regression
S3 read path (handlers/mod.rs):
- Add Last-Modified + x-amz-meta-* to Range 206 responses
- Add x-amz-server-side-encryption to HEAD/?partNumber= responses
- 304 Not Modified now carries ETag, Last-Modified, version-id, cache headers
- Treat If-Match/If-Unmodified-Since and If-None-Match/If-Modified-Since
  as RFC 9110 pairs on both GET and CopyObject (date header ignored when
  ETag header is present)
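
The pairing rule in the last bullet above can be sketched as follows for the GET case. This is a simplified illustration rather than the code in handlers/mod.rs: the type and function names are hypothetical, timestamps are plain Unix seconds, and ETag comparison is exact string matching (plus the `*` wildcard) without weak-validator handling.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    Proceed,
    NotModified,        // 304
    PreconditionFailed, // 412
}

/// Hypothetical holder for the four conditional request headers.
#[derive(Debug, Default)]
pub struct Conditions<'a> {
    pub if_match: Option<&'a str>,
    pub if_unmodified_since: Option<u64>,
    pub if_none_match: Option<&'a str>,
    pub if_modified_since: Option<u64>,
}

/// Simplified comparison: `*` or any exact match in a comma-separated list.
fn etag_matches(header: &str, etag: &str) -> bool {
    header.trim() == "*" || header.split(',').any(|candidate| candidate.trim() == etag)
}

/// In each pair, the date header is consulted only when the
/// ETag header is absent.
pub fn evaluate_get(c: &Conditions, etag: &str, last_modified: u64) -> Outcome {
    match (c.if_match, c.if_unmodified_since) {
        (Some(header), _) => {
            if !etag_matches(header, etag) {
                return Outcome::PreconditionFailed;
            }
        }
        (None, Some(since)) => {
            if last_modified > since {
                return Outcome::PreconditionFailed;
            }
        }
        (None, None) => {}
    }
    match (c.if_none_match, c.if_modified_since) {
        (Some(header), _) => {
            if etag_matches(header, etag) {
                return Outcome::NotModified;
            }
        }
        (None, Some(since)) => {
            if last_modified <= since {
                return Outcome::NotModified;
            }
        }
        (None, None) => {}
    }
    Outcome::Proceed
}
```

For example, a request carrying a matching If-Match together with a stale If-Unmodified-Since proceeds, because the date is never examined once If-Match is present.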

Website hosting (middleware/auth.rs):
- Add ETag, Last-Modified, x-amz-server-side-encryption, and x-amz-meta-*
  to website HEAD/200/206 responses so CDN caches can revalidate

Replication (services/replication.rs, services/s3_client.rs,
middleware/auth.rs, services/site_registry.rs):
- Detect replicated incoming writes via the authenticated principal's
  access key against the site registry's peer_inbound_access_key set.
  The auth middleware inserts a ReplicationPeerRequest extension marker
  on matched requests; handlers skip trigger_replication when set.
  Replaces a forgeable User-Agent substring check.
- Replication retry preflight now probes HeadBucket on the actual target
  bucket (not ListBuckets) and treats any HTTP response as reachable, so
  bucket-scoped credentials no longer block valid retries
- Populate ReplicationFailure.last_error_code from SdkError metadata
- Health probes use a max_attempts=1 client (fast-fail) rather than the
  production retry budget
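
The loop-prevention check from the first replication bullet above can be sketched as follows. Only the names ReplicationPeerRequest, peer_inbound_access_key and trigger_replication come from the commit message; the SiteRegistry shape, its methods and should_trigger_replication are hypothetical stand-ins, and the real middleware attaches the marker as a request extension rather than returning it.

```rust
use std::collections::HashSet;

/// Marker named in the commit; modelled here as a plain value.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ReplicationPeerRequest;

/// Hypothetical registry holding the inbound access keys of peer sites.
pub struct SiteRegistry {
    peer_inbound_access_keys: HashSet<String>,
}

impl SiteRegistry {
    pub fn new<I, S>(keys: I) -> Self
    where
        I: IntoIterator<Item = S>,
        S: Into<String>,
    {
        SiteRegistry {
            peer_inbound_access_keys: keys.into_iter().map(Into::into).collect(),
        }
    }

    /// Called after authentication: the decision rests on the verified
    /// access key, not on a client-controlled header such as User-Agent.
    pub fn classify(&self, authenticated_access_key: &str) -> Option<ReplicationPeerRequest> {
        self.peer_inbound_access_keys
            .contains(authenticated_access_key)
            .then_some(ReplicationPeerRequest)
    }
}

/// Handler-side check: writes that arrived from a peer are not
/// replicated again, which breaks the replication loop.
pub fn should_trigger_replication(marker: Option<ReplicationPeerRequest>) -> bool {
    marker.is_none()
}
```

Keying the decision on the authenticated principal means a client cannot suppress replication by imitating a peer unless it also holds that peer's credentials.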
2026-04-27 23:41:30 +08:00
3a590e6639 Make auto_heal real: peer-fetch corrupted_object with verified swap, poison-fallback on no peer 2026-04-25 17:14:38 +08:00
217af6d1c6 Full migration and transition to Rust; Remove python artifacts 2026-04-22 17:19:19 +08:00