# MyFSIO Documentation

This document expands on the README to describe the full workflow for running, configuring, and extending MyFSIO. Use it as a playbook for local S3-style experimentation.

## 1. System Overview

MyFSIO ships two Flask entrypoints that share the same storage, IAM, and bucket-policy state:

- **API server** – Implements the S3-compatible REST API, policy evaluation, and Signature Version 4 presign service.
- **UI server** – Provides the browser console for buckets, IAM, and policies. It proxies to the API for presign operations.

Both servers read `AppConfig`, so editing the JSON stores on disk immediately affects both surfaces.

## 2. Quickstart

```bash
python -m venv .venv
. .venv/bin/activate             # Windows PowerShell: .\.venv\Scripts\Activate.ps1
pip install -r requirements.txt

# Run both API and UI
python run.py
```

Visit `http://127.0.0.1:5100/ui` to use the console and `http://127.0.0.1:5000/` (with IAM headers) for raw API calls.

### Run modes

You can run the services individually if needed:

```bash
python run.py --mode api   # API only (port 5000)
python run.py --mode ui    # UI only (port 5100)
```

### Docker quickstart

The repo ships a `Dockerfile` so you can run both services in one container:

```bash
docker build -t myfsio .
docker run --rm -p 5000:5000 -p 5100:5100 \
  -v "$PWD/data:/app/data" \
  -v "$PWD/logs:/app/logs" \
  -e SECRET_KEY="change-me" \
  --name myfsio myfsio
```

PowerShell (Windows) example:

```powershell
docker run --rm -p 5000:5000 -p 5100:5100 `
  -v ${PWD}\data:/app/data `
  -v ${PWD}\logs:/app/logs `
  -e SECRET_KEY="change-me" `
  --name myfsio myfsio
```

Key mount points:

- `/app/data` → persists buckets directly under `/app/data/<bucket>`, while system metadata (IAM config, bucket policies, versions, multipart uploads, etc.) lives under `/app/data/.myfsio.sys` (for example, `/app/data/.myfsio.sys/config/iam.json`).
- `/app/logs` → captures the rotating app log.
- `/app/tmp-storage` (optional) → needed only if you rely on the demo upload staging folders.

With these volumes attached you can rebuild or restart the container without losing stored objects or credentials.

### Versioning

The repo tracks a human-friendly release string in `app/version.py` (see the `APP_VERSION` constant). Edit that value whenever you cut a release. The constant flows into Flask as `APP_VERSION` and is exposed via `GET /healthz`, so you can monitor deployments or surface it in UIs.
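
A deployment check can poll `/healthz` and compare the reported version. A minimal stdlib sketch — the exact `/healthz` response shape and the `version` field name are assumptions, not confirmed by this document:

```python
import json
import urllib.request

def parse_health(payload: str) -> str:
    # Assumes the health payload is JSON with the release string under a
    # "version" key; adjust to the actual /healthz response shape.
    return json.loads(payload)["version"]

def fetch_version(base_url: str = "http://127.0.0.1:5000") -> str:
    # Query a running API server's health endpoint.
    with urllib.request.urlopen(f"{base_url}/healthz") as resp:
        return parse_health(resp.read().decode("utf-8"))
```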

## 3. Configuration Reference

| Variable | Default | Notes |
| --- | --- | --- |
| `STORAGE_ROOT` | `<repo>/data` | Filesystem home for all buckets/objects. |
| `MAX_UPLOAD_SIZE` | `1073741824` | Bytes. Caps incoming uploads in both API + UI. |
| `UI_PAGE_SIZE` | `100` | `MaxKeys` hint shown in listings. |
| `SECRET_KEY` | `dev-secret-key` | Flask session key for UI auth. |
| `IAM_CONFIG` | `<repo>/data/.myfsio.sys/config/iam.json` | Stores users, secrets, and inline policies. |
| `BUCKET_POLICY_PATH` | `<repo>/data/.myfsio.sys/config/bucket_policies.json` | Bucket policy store (auto hot-reload). |
| `API_BASE_URL` | `None` | Used by the UI to hit API endpoints (presign/policy). If unset, the UI will auto-detect the host or use `X-Forwarded-*` headers. |
| `AWS_REGION` | `us-east-1` | Region embedded in the SigV4 credential scope. |
| `AWS_SERVICE` | `s3` | Service string for SigV4. |
| `ENCRYPTION_ENABLED` | `false` | Enable server-side encryption support. |
| `KMS_ENABLED` | `false` | Enable KMS key management for encryption. |
| `KMS_KEYS_PATH` | `data/kms_keys.json` | Path to store KMS key metadata. |
| `ENCRYPTION_MASTER_KEY_PATH` | `data/master.key` | Path to the master encryption key file. |

Set env vars (or pass overrides to `create_app`) to point the servers at custom paths.
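
For example, applying overrides from Python before launching — the paths and sizes below are illustrative, not defaults:

```python
import os

# Point the servers at custom locations before they start
# (variable names from the configuration table above).
os.environ.update({
    "STORAGE_ROOT": "/srv/myfsio/data",
    "MAX_UPLOAD_SIZE": str(256 * 1024 * 1024),  # lower the cap to 256 MB
    "SECRET_KEY": "replace-with-a-real-secret",
})

# `python run.py` (or create_app()) started after this picks up the overrides.
```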

### Proxy Configuration

If running behind a reverse proxy (e.g., Nginx, Cloudflare, or a tunnel), ensure the proxy sets the standard forwarding headers:

- `X-Forwarded-Host`
- `X-Forwarded-Proto`

The application trusts these headers to generate correct presigned URLs (e.g., `https://s3.example.com/...` instead of `http://127.0.0.1:5000/...`). Alternatively, you can explicitly set `API_BASE_URL` to your public endpoint.

## 4. Authentication & IAM

1. On first boot, `data/.myfsio.sys/config/iam.json` is seeded with `localadmin / localadmin`, which has wildcard access.
2. Sign in to the UI with those credentials, then open **IAM**:
   - **Create user**: supply a display name and an optional JSON inline policy array.
   - **Rotate secret**: generates a new secret key; the UI surfaces it once.
   - **Policy editor**: select a user, paste an array of objects (`{"bucket": "*", "actions": ["list", "read"]}`), and submit. Alias support includes AWS-style verbs (e.g., `s3:GetObject`).
3. The wildcard action `iam:*` is supported for admin user definitions.

The API expects every request to include `X-Access-Key` and `X-Secret-Key` headers. The UI persists them in the Flask session after login.
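
A raw API call from Python just needs those two headers attached. A minimal stdlib sketch (the credentials are placeholders):

```python
import urllib.request

def auth_headers(access_key: str, secret_key: str) -> dict:
    # MyFSIO reads these two headers on every API request.
    return {"X-Access-Key": access_key, "X-Secret-Key": secret_key}

# List buckets on a local API server (fill in real credentials)
req = urllib.request.Request(
    "http://127.0.0.1:5000/",
    headers=auth_headers("your-access-key", "your-secret-key"),
)
# urllib.request.urlopen(req)  # uncomment against a running server
```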

### Available IAM Actions

| Action | Description | AWS Aliases |
| --- | --- | --- |
| `list` | List buckets and objects | `s3:ListBucket`, `s3:ListAllMyBuckets`, `s3:ListBucketVersions`, `s3:ListMultipartUploads`, `s3:ListParts` |
| `read` | Download objects | `s3:GetObject`, `s3:GetObjectVersion`, `s3:GetObjectTagging`, `s3:HeadObject`, `s3:HeadBucket` |
| `write` | Upload objects, create buckets | `s3:PutObject`, `s3:CreateBucket`, `s3:CreateMultipartUpload`, `s3:UploadPart`, `s3:CompleteMultipartUpload`, `s3:AbortMultipartUpload`, `s3:CopyObject` |
| `delete` | Remove objects and buckets | `s3:DeleteObject`, `s3:DeleteObjectVersion`, `s3:DeleteBucket` |
| `share` | Manage ACLs | `s3:PutObjectAcl`, `s3:PutBucketAcl`, `s3:GetBucketAcl` |
| `policy` | Manage bucket policies | `s3:PutBucketPolicy`, `s3:GetBucketPolicy`, `s3:DeleteBucketPolicy` |
| `replication` | Configure and manage replication | `s3:GetReplicationConfiguration`, `s3:PutReplicationConfiguration`, `s3:ReplicateObject`, `s3:ReplicateTags`, `s3:ReplicateDelete` |
| `iam:list_users` | View IAM users | `iam:ListUsers` |
| `iam:create_user` | Create IAM users | `iam:CreateUser` |
| `iam:delete_user` | Delete IAM users | `iam:DeleteUser` |
| `iam:rotate_key` | Rotate user secrets | `iam:RotateAccessKey` |
| `iam:update_policy` | Modify user policies | `iam:PutUserPolicy` |
| `iam:*` | All IAM actions (admin wildcard) | — |

### Example Policies

**Full Control (admin):**
```json
[{"bucket": "*", "actions": ["list", "read", "write", "delete", "share", "policy", "replication", "iam:*"]}]
```

**Read-Only:**
```json
[{"bucket": "*", "actions": ["list", "read"]}]
```

**Single Bucket Access (no listing other buckets):**
```json
[{"bucket": "user-bucket", "actions": ["read", "write", "delete"]}]
```

**Bucket Access with Replication:**
```json
[{"bucket": "my-bucket", "actions": ["list", "read", "write", "delete", "replication"]}]
```

## 5. Bucket Policies & Presets

- **Storage**: Policies are persisted in `data/.myfsio.sys/config/bucket_policies.json` under `{"policies": {"bucket": {...}}}`.
- **Hot reload**: Both API and UI call `maybe_reload()` before evaluating policies. Editing the JSON on disk is immediately reflected—no restarts required.
- **UI editor**: Each bucket detail page includes:
  - A preset selector: **Private** detaches the policy (delete mode), **Public** injects an allow policy granting anonymous `s3:ListBucket` + `s3:GetObject`, and **Custom** restores your draft.
  - A read-only preview of the attached policy.
  - Autosave behavior for custom drafts while you type.

### Editing via CLI

```bash
curl -X PUT http://127.0.0.1:5000/bucket-policy/test \
  -H "Content-Type: application/json" \
  -H "X-Access-Key: ..." -H "X-Secret-Key: ..." \
  -d '{
    "Version": "2012-10-17",
    "Statement": [
      {
        "Effect": "Allow",
        "Principal": "*",
        "Action": ["s3:ListBucket"],
        "Resource": ["arn:aws:s3:::test"]
      }
    ]
  }'
```

The UI will reflect this change as soon as the request completes, thanks to the hot reload.

## 6. Presigned URLs

- Trigger from the UI using the **Presign** button after selecting an object.
- Or call `POST /presign/<bucket>/<key>` with JSON `{ "method": "GET", "expires_in": 900 }`.
- Supported methods: `GET`, `PUT`, `DELETE`; expiration must be `1..604800` seconds.
- The service signs requests using the caller’s IAM credentials and enforces bucket policies both when issuing and when the presigned URL is used.
- Legacy share links have been removed; presigned URLs now handle both private and public workflows.
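
Requesting a presigned URL from Python can be sketched as follows — the endpoint and JSON body come from the list above, while the bucket/key and credentials are placeholders:

```python
import json
import urllib.request

def presign_body(method: str = "GET", expires_in: int = 900) -> bytes:
    # JSON body accepted by POST /presign/<bucket>/<key>
    assert method in {"GET", "PUT", "DELETE"}
    assert 1 <= expires_in <= 604800
    return json.dumps({"method": method, "expires_in": expires_in}).encode()

req = urllib.request.Request(
    "http://127.0.0.1:5000/presign/test/report.pdf",  # hypothetical bucket/key
    data=presign_body("GET", 900),
    headers={
        "Content-Type": "application/json",
        "X-Access-Key": "...",  # fill in real credentials
        "X-Secret-Key": "...",
    },
    method="POST",
)
# urllib.request.urlopen(req)  # run against a live server to get the signed URL
```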

### Multipart Upload Example

```python
import boto3

s3 = boto3.client(
    's3',
    endpoint_url='http://localhost:5000',
    aws_access_key_id='...',      # IAM access key
    aws_secret_access_key='...',  # IAM secret key
)

# Initiate
response = s3.create_multipart_upload(Bucket='mybucket', Key='large.bin')
upload_id = response['UploadId']

# Upload parts
parts = []
chunks = [b'chunk1', b'chunk2']  # Example data chunks
for part_number, chunk in enumerate(chunks, start=1):
    response = s3.upload_part(
        Bucket='mybucket',
        Key='large.bin',
        PartNumber=part_number,
        UploadId=upload_id,
        Body=chunk
    )
    parts.append({'PartNumber': part_number, 'ETag': response['ETag']})

# Complete
s3.complete_multipart_upload(
    Bucket='mybucket',
    Key='large.bin',
    UploadId=upload_id,
    MultipartUpload={'Parts': parts}
)
```

## 7. Encryption

MyFSIO supports **server-side encryption at rest** to protect your data. When enabled, objects are encrypted using AES-256-GCM before being written to disk.

### Encryption Types

| Type | Description |
|------|-------------|
| **AES-256 (SSE-S3)** | Server-managed encryption using a local master key |
| **KMS (SSE-KMS)** | Encryption using customer-managed keys via the built-in KMS |

### Enabling Encryption

#### 1. Set Environment Variables

```powershell
# PowerShell
$env:ENCRYPTION_ENABLED = "true"
$env:KMS_ENABLED = "true"  # Optional, for KMS key management
python run.py
```

```bash
# Bash
export ENCRYPTION_ENABLED=true
export KMS_ENABLED=true
python run.py
```

#### 2. Configure Bucket Default Encryption (UI)

1. Navigate to your bucket in the UI
2. Click the **Properties** tab
3. Find the **Default Encryption** card
4. Click **Enable Encryption**
5. Choose an algorithm:
   - **AES-256**: Uses the server's master key
   - **aws:kms**: Uses a KMS-managed key (select from the dropdown)
6. Save changes

Once enabled, all **new objects** uploaded to the bucket will be automatically encrypted.

### KMS Key Management

When `KMS_ENABLED=true`, you can manage encryption keys via the KMS API:

```bash
# Create a new KMS key
curl -X POST http://localhost:5000/kms/keys \
  -H "Content-Type: application/json" \
  -H "X-Access-Key: ..." -H "X-Secret-Key: ..." \
  -d '{"alias": "my-key", "description": "Production encryption key"}'

# List all keys
curl http://localhost:5000/kms/keys \
  -H "X-Access-Key: ..." -H "X-Secret-Key: ..."

# Get key details
curl http://localhost:5000/kms/keys/{key-id} \
  -H "X-Access-Key: ..." -H "X-Secret-Key: ..."

# Rotate a key (creates new key material)
curl -X POST http://localhost:5000/kms/keys/{key-id}/rotate \
  -H "X-Access-Key: ..." -H "X-Secret-Key: ..."

# Disable/enable a key
curl -X POST http://localhost:5000/kms/keys/{key-id}/disable \
  -H "X-Access-Key: ..." -H "X-Secret-Key: ..."

curl -X POST http://localhost:5000/kms/keys/{key-id}/enable \
  -H "X-Access-Key: ..." -H "X-Secret-Key: ..."

# Schedule key deletion (30-day waiting period)
curl -X DELETE "http://localhost:5000/kms/keys/{key-id}?waiting_period_days=30" \
  -H "X-Access-Key: ..." -H "X-Secret-Key: ..."
```

### How It Works

1. **Envelope encryption**: Each object is encrypted with a unique Data Encryption Key (DEK)
2. **Key wrapping**: The DEK is encrypted (wrapped) by the master key or a KMS key
3. **Storage**: The encrypted DEK is stored alongside the encrypted object
4. **Decryption**: On read, the DEK is unwrapped and used to decrypt the object
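
The four steps above can be sketched with the `cryptography` package. This is an illustrative sketch of the general envelope technique, not MyFSIO's actual implementation — the nonce handling and storage layout here are assumptions:

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

# The master key would normally be loaded from ENCRYPTION_MASTER_KEY_PATH.
master_key = AESGCM.generate_key(bit_length=256)

def encrypt_object(data: bytes) -> dict:
    dek = AESGCM.generate_key(bit_length=256)            # 1. unique DEK per object
    obj_nonce, key_nonce = os.urandom(12), os.urandom(12)
    ciphertext = AESGCM(dek).encrypt(obj_nonce, data, None)
    wrapped_dek = AESGCM(master_key).encrypt(key_nonce, dek, None)  # 2. wrap DEK
    # 3. everything returned here is stored alongside the object
    return {"ciphertext": ciphertext, "wrapped_dek": wrapped_dek,
            "obj_nonce": obj_nonce, "key_nonce": key_nonce}

def decrypt_object(blob: dict) -> bytes:
    # 4. unwrap the DEK with the master key, then decrypt the object
    dek = AESGCM(master_key).decrypt(blob["key_nonce"], blob["wrapped_dek"], None)
    return AESGCM(dek).decrypt(blob["obj_nonce"], blob["ciphertext"], None)
```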

### Client-Side Encryption

For additional security, you can use client-side encryption. The `ClientEncryptionHelper` class provides utilities:

```python
from app.encryption import ClientEncryptionHelper

# Generate a client-side key
key = ClientEncryptionHelper.generate_key()
key_b64 = ClientEncryptionHelper.key_to_base64(key)

# Encrypt before upload
plaintext = b"sensitive data"
encrypted, metadata = ClientEncryptionHelper.encrypt_for_upload(plaintext, key)

# Upload with metadata headers
# x-amz-meta-x-amz-key: <wrapped-key>
# x-amz-meta-x-amz-iv: <iv>
# x-amz-meta-x-amz-matdesc: <material-description>

# Decrypt after download
decrypted = ClientEncryptionHelper.decrypt_from_download(encrypted, metadata, key)
```

### Important Notes

- **Existing objects are NOT encrypted** - only new uploads after enabling encryption are encrypted
- **Master key security** - the master key file (`master.key`) should be backed up securely and protected
- **Key rotation** - rotating a KMS key creates new key material; existing objects remain encrypted with the old material
- **Disabled keys** - objects encrypted with a disabled key cannot be decrypted until the key is re-enabled
- **Deleted keys** - once a key is deleted (after the waiting period), objects encrypted with it are permanently inaccessible

### Verifying Encryption

To verify an object is encrypted:

1. Check the raw file in `data/<bucket>/` - it should be unreadable binary
2. Look for `.meta` files containing encryption metadata
3. Download via the API/UI - the object should be automatically decrypted

## 8. Bucket Quotas

MyFSIO supports **storage quotas** to limit how much data a bucket can hold. Quotas are enforced on uploads and multipart completions.

### Quota Types

| Limit | Description |
|-------|-------------|
| **Max Size (MB)** | Maximum total storage in megabytes (includes current objects + archived versions) |
| **Max Objects** | Maximum number of objects (includes current objects + archived versions) |

### Managing Quotas (Admin Only)

Quota management is restricted to administrators (users with `iam:*` or `iam:list_users` permissions).

#### Via UI

1. Navigate to your bucket in the UI
2. Click the **Properties** tab
3. Find the **Storage Quota** card
4. Enter limits:
   - **Max Size (MB)**: Leave empty for unlimited
   - **Max Objects**: Leave empty for unlimited
5. Click **Update Quota**

To remove a quota, click **Remove Quota**.

#### Via API

```bash
# Set quota (max 100 MB, max 1000 objects)
curl -X PUT "http://localhost:5000/bucket/<bucket>?quota" \
  -H "Content-Type: application/json" \
  -H "X-Access-Key: ..." -H "X-Secret-Key: ..." \
  -d '{"max_bytes": 104857600, "max_objects": 1000}'

# Get current quota
curl "http://localhost:5000/bucket/<bucket>?quota" \
  -H "X-Access-Key: ..." -H "X-Secret-Key: ..."

# Remove quota
curl -X PUT "http://localhost:5000/bucket/<bucket>?quota" \
  -H "Content-Type: application/json" \
  -H "X-Access-Key: ..." -H "X-Secret-Key: ..." \
  -d '{"max_bytes": null, "max_objects": null}'
```

### Quota Behavior

- **Version counting**: When versioning is enabled, archived versions count toward the quota
- **Enforcement points**: Quotas are checked during `PUT` object and `CompleteMultipartUpload` operations
- **Error response**: When a quota is exceeded, the API returns `HTTP 400` with error code `QuotaExceeded`
- **Visibility**: All users can view quota usage on the bucket detail page, but only admins can modify quotas

### Example Error

```xml
<Error>
  <Code>QuotaExceeded</Code>
  <Message>Bucket quota exceeded: storage limit reached</Message>
  <BucketName>my-bucket</BucketName>
</Error>
```
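
When handling this error programmatically, the XML body parses cleanly with the standard library:

```python
import xml.etree.ElementTree as ET

def parse_s3_error(body: str) -> dict:
    # Flatten an S3-style XML error body into a {tag: text} dict.
    root = ET.fromstring(body)
    return {child.tag: child.text for child in root}

error_xml = """
<Error>
  <Code>QuotaExceeded</Code>
  <Message>Bucket quota exceeded: storage limit reached</Message>
  <BucketName>my-bucket</BucketName>
</Error>
"""

err = parse_s3_error(error_xml)
if err["Code"] == "QuotaExceeded":
    # e.g. surface the message to the user, or stop retrying the upload
    print(err["Message"])
```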

## 9. Site Replication

### Permission Model

Replication uses a two-tier permission system:

| Role | Capabilities |
|------|--------------|
| **Admin** (users with `iam:*` permissions) | Create/delete replication rules, configure connections and target buckets |
| **Users** (with `replication` permission) | Enable/disable (pause/resume) existing replication rules |

> **Note:** The Replication tab is hidden for users without the `replication` permission on the bucket.

This separation lets administrators pre-configure where data should replicate, while authorized users can toggle replication on and off without ever seeing the connection credentials.

### Architecture

- **Source instance**: The MyFSIO instance where you upload files. It runs the replication worker.
- **Target instance**: Another MyFSIO instance (or any S3-compatible service such as AWS S3 or MinIO) that receives the copies.

Replication is **asynchronous** (happens in the background) and **one-way** (source → target).

### Setup Guide

#### 1. Prepare the Target Instance

If your target is another MyFSIO server (e.g., running on a different machine or port), you need to create a destination bucket and a user with write permissions.

**Option A: Using the UI (easiest)**

If you have access to the UI of the target instance:

1. Log in to the target UI.
2. Create a new bucket (e.g., `backup-bucket`).
3. Go to **IAM**, create a new user (e.g., `replication-user`), and copy the access/secret keys.

**Option B: Headless Setup (API only)**

If the target server is only running the API (`run_api.py`) and has no UI access, you can bootstrap the credentials and bucket by running a Python script on the server itself.

Run this script on the **target server**:

```python
# setup_target.py
from pathlib import Path
from app.iam import IamService
from app.storage import ObjectStorage

# Initialize services (paths match the default config)
data_dir = Path("data")
iam = IamService(data_dir / ".myfsio.sys" / "config" / "iam.json")
storage = ObjectStorage(data_dir)

# 1. Create the bucket
bucket_name = "backup-bucket"
try:
    storage.create_bucket(bucket_name)
    print(f"Bucket '{bucket_name}' created.")
except Exception as e:
    print(f"Bucket creation skipped: {e}")

# 2. Create the user
try:
    # Create user with full access (or restrict the policy as needed)
    creds = iam.create_user(
        display_name="Replication User",
        policies=[{"bucket": bucket_name, "actions": ["write", "read", "list"]}]
    )
    print("\n--- CREDENTIALS GENERATED ---")
    print(f"Access Key: {creds['access_key']}")
    print(f"Secret Key: {creds['secret_key']}")
    print("-----------------------------")
except Exception as e:
    print(f"User creation failed: {e}")
```

Save and run: `python setup_target.py`

#### 2. Configure the Source Instance

Now configure the primary instance to replicate to the target.

1. **Access the console**: Log in to the UI of your source instance.
2. **Add a connection**:
   - Navigate to **Connections** in the top menu.
   - Click **Add Connection**.
   - **Name**: `Secondary Site`.
   - **Endpoint URL**: The URL of your target instance's API (e.g., `http://target-server:5002`).
   - **Access Key**: The key you generated on the target.
   - **Secret Key**: The secret you generated on the target.
   - Click **Add Connection**.
3. **Enable replication** (admin):
   - Navigate to **Buckets** and select the source bucket.
   - Switch to the **Replication** tab.
   - Select the `Secondary Site` connection.
   - Enter the target bucket name (`backup-bucket`).
   - Click **Enable Replication**.

Once configured, users with the `replication` permission on this bucket can pause/resume replication without needing access to connection details.

### Verification

1. Upload a file to the source bucket.
2. Check the target bucket (via UI, CLI, or API). The file should appear shortly.

```bash
# Verify on target using the AWS CLI
aws --endpoint-url http://target-server:5002 s3 ls s3://backup-bucket
```

### Pausing and Resuming Replication

Users with the `replication` permission (but not admin rights) can pause and resume existing replication rules:

1. Navigate to the bucket's **Replication** tab.
2. If replication is **Active**, click **Pause Replication** to temporarily stop syncing.
3. If replication is **Paused**, click **Resume Replication** to continue syncing.

While paused, new objects uploaded to the source are not replicated; they are picked up and replicated once replication is resumed.

> **Note:** Only admins can create new replication rules, change the target connection/bucket, or delete rules entirely.

### Bidirectional Replication (Active-Active)

To set up two-way replication (Server A ↔ Server B):

1. Follow the steps above to replicate **A → B**.
2. Repeat the process on Server B to replicate **B → A**:
   - Create a connection on Server B pointing to Server A.
   - Enable replication on the target bucket on Server B.

**Loop prevention**: The system detects replication traffic via a custom User-Agent (`S3ReplicationAgent`). This prevents infinite loops where an object replicated from A to B is immediately replicated back to A.

**Deletes**: Deleting an object on one server propagates the deletion to the other server.

**Note**: Deleting a bucket automatically removes its associated replication configuration.
## 10. Running Tests

```bash
pytest -q
```

The suite includes a boto3 integration test that spins up a live HTTP server and drives the API through the official AWS SDK. To skip it (for faster unit-only loops), run `pytest -m "not integration"`.

The suite covers bucket CRUD, presigned downloads, bucket policy enforcement, and regression tests for anonymous reads when a Public policy is attached.

## 11. Troubleshooting

| Symptom | Likely Cause | Fix |
| --- | --- | --- |
| 403 from API despite Public preset | Policy didn’t save, or bucket key path mismatch | Reapply the Public preset; confirm the bucket name in `Resource` matches `arn:aws:s3:::bucket/*`. |
| UI still shows old policy text | Browser cached the view before hot reload | Refresh; the JSON is already reloaded on the server. |
| Presign modal errors with 403 | IAM user lacks `read`/`write`/`delete` for the target bucket, or the bucket policy denies it | Update IAM inline policies or remove conflicting deny statements. |
| Large upload rejected immediately | File exceeds `MAX_UPLOAD_SIZE` | Increase the env var or shrink the object. |

## 12. API Matrix

```
GET    /                           # List buckets
PUT    /<bucket>                   # Create bucket
DELETE /<bucket>                   # Remove bucket
GET    /<bucket>                   # List objects
PUT    /<bucket>/<key>             # Upload object
GET    /<bucket>/<key>             # Download object
DELETE /<bucket>/<key>             # Delete object
POST   /presign/<bucket>/<key>     # Generate SigV4 URL
GET    /bucket-policy/<bucket>     # Fetch policy
PUT    /bucket-policy/<bucket>     # Upsert policy
DELETE /bucket-policy/<bucket>     # Delete policy
GET    /<bucket>?quota             # Get bucket quota
PUT    /<bucket>?quota             # Set bucket quota (admin only)
```

## 13. Next Steps

- Tailor the IAM and policy JSON files into team-ready presets.
- Wrap `run_api.py` with gunicorn or another WSGI server for long-running workloads.
- Extend `bucket_policies.json` with Deny statements that simulate production security controls.