# MyFSIO Documentation This document expands on the README to describe the full workflow for running, configuring, and extending MyFSIO. Use it as a playbook for local S3-style experimentation. ## 1. System Overview MyFSIO ships two Flask entrypoints that share the same storage, IAM, and bucket-policy state: - **API server** – Implements the S3-compatible REST API, policy evaluation, and Signature Version 4 presign service. - **UI server** – Provides the browser console for buckets, IAM, and policies. It proxies to the API for presign operations. Both servers read `AppConfig`, so editing JSON stores on disk instantly affects both surfaces. ## 2. Quickstart ```bash python -m venv .venv . .venv/Scripts/activate # PowerShell: .\.venv\Scripts\Activate.ps1 pip install -r requirements.txt # Run both API and UI python run.py ``` Visit `http://127.0.0.1:5100/ui` to use the console and `http://127.0.0.1:5000/` (with IAM headers) for raw API calls. ### Run modes You can run services individually if needed: ```bash python run.py --mode api # API only (port 5000) python run.py --mode ui # UI only (port 5100) ``` ### Docker quickstart The repo now ships a `Dockerfile` so you can run both services in one container: ```bash docker build -t myfsio . docker run --rm -p 5000:5000 -p 5100:5100 \ -v "$PWD/data:/app/data" \ -v "$PWD/logs:/app/logs" \ -e SECRET_KEY="change-me" \ --name myfsio myfsio ``` PowerShell (Windows) example: ```powershell docker run --rm -p 5000:5000 -p 5100:5100 ` -v ${PWD}\data:/app/data ` -v ${PWD}\logs:/app/logs ` -e SECRET_KEY="change-me" ` --name myfsio myfsio ``` Key mount points: - `/app/data` → persists buckets directly under `/app/data/` while system metadata (IAM config, bucket policies, versions, multipart uploads, etc.) lives under `/app/data/.myfsio.sys` (for example, `/app/data/.myfsio.sys/config/iam.json`). - `/app/logs` → captures the rotating app log. - `/app/tmp-storage` (optional) if you rely on the demo upload staging folders. With these volumes attached you can rebuild/restart the container without losing stored objects or credentials. ### Versioning The repo now tracks a human-friendly release string inside `app/version.py` (see the `APP_VERSION` constant). Edit that value whenever you cut a release. The constant flows into Flask as `APP_VERSION` and is exposed via `GET /healthz`, so you can monitor deployments or surface it in UIs. ## 3. Configuration Reference | Variable | Default | Notes | | --- | --- | --- | | `STORAGE_ROOT` | `/data` | Filesystem home for all buckets/objects. | | `MAX_UPLOAD_SIZE` | `1073741824` | Bytes. Caps incoming uploads in both API + UI. | | `UI_PAGE_SIZE` | `100` | `MaxKeys` hint shown in listings. | | `SECRET_KEY` | `dev-secret-key` | Flask session key for UI auth. | | `IAM_CONFIG` | `/data/.myfsio.sys/config/iam.json` | Stores users, secrets, and inline policies. | | `BUCKET_POLICY_PATH` | `/data/.myfsio.sys/config/bucket_policies.json` | Bucket policy store (auto hot-reload). | | `API_BASE_URL` | `None` | Used by the UI to hit API endpoints (presign/policy). If unset, the UI will auto-detect the host or use `X-Forwarded-*` headers. | | `AWS_REGION` | `us-east-1` | Region embedded in SigV4 credential scope. | | `AWS_SERVICE` | `s3` | Service string for SigV4. | | `ENCRYPTION_ENABLED` | `false` | Enable server-side encryption support. | | `KMS_ENABLED` | `false` | Enable KMS key management for encryption. | | `KMS_KEYS_PATH` | `data/kms_keys.json` | Path to store KMS key metadata. | | `ENCRYPTION_MASTER_KEY_PATH` | `data/master.key` | Path to the master encryption key file. | Set env vars (or pass overrides to `create_app`) to point the servers at custom paths. ### Proxy Configuration If running behind a reverse proxy (e.g., Nginx, Cloudflare, or a tunnel), ensure the proxy sets the standard forwarding headers: - `X-Forwarded-Host` - `X-Forwarded-Proto` The application automatically trusts these headers to generate correct presigned URLs (e.g., `https://s3.example.com/...` instead of `http://127.0.0.1:5000/...`). Alternatively, you can explicitly set `API_BASE_URL` to your public endpoint. ## 4. Authentication & IAM 1. On first boot, `data/.myfsio.sys/config/iam.json` is seeded with `localadmin / localadmin` that has wildcard access. 2. Sign into the UI using those credentials, then open **IAM**: - **Create user**: supply a display name and optional JSON inline policy array. - **Rotate secret**: generates a new secret key; the UI surfaces it once. - **Policy editor**: select a user, paste an array of objects (`{"bucket": "*", "actions": ["list", "read"]}`), and submit. Alias support includes AWS-style verbs (e.g., `s3:GetObject`). 3. Wildcard action `iam:*` is supported for admin user definitions. The API expects every request to include `X-Access-Key` and `X-Secret-Key` headers. The UI persists them in the Flask session after login. ### Available IAM Actions | Action | Description | AWS Aliases | | --- | --- | --- | | `list` | List buckets and objects | `s3:ListBucket`, `s3:ListAllMyBuckets`, `s3:ListBucketVersions`, `s3:ListMultipartUploads`, `s3:ListParts` | | `read` | Download objects | `s3:GetObject`, `s3:GetObjectVersion`, `s3:GetObjectTagging`, `s3:HeadObject`, `s3:HeadBucket` | | `write` | Upload objects, create buckets | `s3:PutObject`, `s3:CreateBucket`, `s3:CreateMultipartUpload`, `s3:UploadPart`, `s3:CompleteMultipartUpload`, `s3:AbortMultipartUpload`, `s3:CopyObject` | | `delete` | Remove objects and buckets | `s3:DeleteObject`, `s3:DeleteObjectVersion`, `s3:DeleteBucket` | | `share` | Manage ACLs | `s3:PutObjectAcl`, `s3:PutBucketAcl`, `s3:GetBucketAcl` | | `policy` | Manage bucket policies | `s3:PutBucketPolicy`, `s3:GetBucketPolicy`, `s3:DeleteBucketPolicy` | | `replication` | Configure and manage replication | `s3:GetReplicationConfiguration`, `s3:PutReplicationConfiguration`, `s3:ReplicateObject`, `s3:ReplicateTags`, `s3:ReplicateDelete` | | `iam:list_users` | View IAM users | `iam:ListUsers` | | `iam:create_user` | Create IAM users | `iam:CreateUser` | | `iam:delete_user` | Delete IAM users | `iam:DeleteUser` | | `iam:rotate_key` | Rotate user secrets | `iam:RotateAccessKey` | | `iam:update_policy` | Modify user policies | `iam:PutUserPolicy` | | `iam:*` | All IAM actions (admin wildcard) | — | ### Example Policies **Full Control (admin):** ```json [{"bucket": "*", "actions": ["list", "read", "write", "delete", "share", "policy", "replication", "iam:*"]}] ``` **Read-Only:** ```json [{"bucket": "*", "actions": ["list", "read"]}] ``` **Single Bucket Access (no listing other buckets):** ```json [{"bucket": "user-bucket", "actions": ["read", "write", "delete"]}] ``` **Bucket Access with Replication:** ```json [{"bucket": "my-bucket", "actions": ["list", "read", "write", "delete", "replication"]}] ``` ## 5. Bucket Policies & Presets - **Storage**: Policies are persisted in `data/.myfsio.sys/config/bucket_policies.json` under `{"policies": {"bucket": {...}}}`. - **Hot reload**: Both API and UI call `maybe_reload()` before evaluating policies. Editing the JSON on disk is immediately reflected—no restarts required. - **UI editor**: Each bucket detail page includes: - A preset selector: **Private** detaches the policy (delete mode), **Public** injects an allow policy granting anonymous `s3:ListBucket` + `s3:GetObject`, and **Custom** restores your draft. - A read-only preview of the attached policy. - Autosave behavior for custom drafts while you type. ### Editing via CLI ```bash curl -X PUT http://127.0.0.1:5000/bucket-policy/test \ -H "Content-Type: application/json" \ -H "X-Access-Key: ..." -H "X-Secret-Key: ..." \ -d '{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Principal": "*", "Action": ["s3:ListBucket"], "Resource": ["arn:aws:s3:::test"] } ] }' ``` The UI will reflect this change as soon as the request completes thanks to the hot reload. ## 6. Presigned URLs - Trigger from the UI using the **Presign** button after selecting an object. - Or call `POST /presign//` with JSON `{ "method": "GET", "expires_in": 900 }`. - Supported methods: `GET`, `PUT`, `DELETE`; expiration must be `1..604800` seconds. - The service signs requests using the caller’s IAM credentials and enforces bucket policies both when issuing and when the presigned URL is used. - Legacy share links have been removed; presigned URLs now handle both private and public workflows. ### Multipart Upload Example ```python import boto3 s3 = boto3.client('s3', endpoint_url='http://localhost:5000') # Initiate response = s3.create_multipart_upload(Bucket='mybucket', Key='large.bin') upload_id = response['UploadId'] # Upload parts parts = [] chunks = [b'chunk1', b'chunk2'] # Example data chunks for part_number, chunk in enumerate(chunks, start=1): response = s3.upload_part( Bucket='mybucket', Key='large.bin', PartNumber=part_number, UploadId=upload_id, Body=chunk ) parts.append({'PartNumber': part_number, 'ETag': response['ETag']}) # Complete s3.complete_multipart_upload( Bucket='mybucket', Key='large.bin', UploadId=upload_id, MultipartUpload={'Parts': parts} ) ``` ## 7. Encryption MyFSIO supports **server-side encryption at rest** to protect your data. When enabled, objects are encrypted using AES-256-GCM before being written to disk. ### Encryption Types | Type | Description | |------|-------------| | **AES-256 (SSE-S3)** | Server-managed encryption using a local master key | | **KMS (SSE-KMS)** | Encryption using customer-managed keys via the built-in KMS | ### Enabling Encryption #### 1. Set Environment Variables ```powershell # PowerShell $env:ENCRYPTION_ENABLED = "true" $env:KMS_ENABLED = "true" # Optional, for KMS key management python run.py ``` ```bash # Bash export ENCRYPTION_ENABLED=true export KMS_ENABLED=true python run.py ``` #### 2. Configure Bucket Default Encryption (UI) 1. Navigate to your bucket in the UI 2. Click the **Properties** tab 3. Find the **Default Encryption** card 4. Click **Enable Encryption** 5. Choose algorithm: - **AES-256**: Uses the server's master key - **aws:kms**: Uses a KMS-managed key (select from dropdown) 6. Save changes Once enabled, all **new objects** uploaded to the bucket will be automatically encrypted. ### KMS Key Management When `KMS_ENABLED=true`, you can manage encryption keys via the KMS API: ```bash # Create a new KMS key curl -X POST http://localhost:5000/kms/keys \ -H "Content-Type: application/json" \ -H "X-Access-Key: ..." -H "X-Secret-Key: ..." \ -d '{"alias": "my-key", "description": "Production encryption key"}' # List all keys curl http://localhost:5000/kms/keys \ -H "X-Access-Key: ..." -H "X-Secret-Key: ..." # Get key details curl http://localhost:5000/kms/keys/{key-id} \ -H "X-Access-Key: ..." -H "X-Secret-Key: ..." # Rotate a key (creates new key material) curl -X POST http://localhost:5000/kms/keys/{key-id}/rotate \ -H "X-Access-Key: ..." -H "X-Secret-Key: ..." # Disable/Enable a key curl -X POST http://localhost:5000/kms/keys/{key-id}/disable \ -H "X-Access-Key: ..." -H "X-Secret-Key: ..." curl -X POST http://localhost:5000/kms/keys/{key-id}/enable \ -H "X-Access-Key: ..." -H "X-Secret-Key: ..." # Schedule key deletion (30-day waiting period) curl -X DELETE http://localhost:5000/kms/keys/{key-id}?waiting_period_days=30 \ -H "X-Access-Key: ..." -H "X-Secret-Key: ..." ``` ### How It Works 1. **Envelope Encryption**: Each object is encrypted with a unique Data Encryption Key (DEK) 2. **Key Wrapping**: The DEK is encrypted (wrapped) by the master key or KMS key 3. **Storage**: The encrypted DEK is stored alongside the encrypted object 4. **Decryption**: On read, the DEK is unwrapped and used to decrypt the object ### Client-Side Encryption For additional security, you can use client-side encryption. The `ClientEncryptionHelper` class provides utilities: ```python from app.encryption import ClientEncryptionHelper # Generate a client-side key key = ClientEncryptionHelper.generate_key() key_b64 = ClientEncryptionHelper.key_to_base64(key) # Encrypt before upload plaintext = b"sensitive data" encrypted, metadata = ClientEncryptionHelper.encrypt_for_upload(plaintext, key) # Upload with metadata headers # x-amz-meta-x-amz-key: # x-amz-meta-x-amz-iv: # x-amz-meta-x-amz-matdesc: # Decrypt after download decrypted = ClientEncryptionHelper.decrypt_from_download(encrypted, metadata, key) ``` ### Important Notes - **Existing objects are NOT encrypted** - Only new uploads after enabling encryption are encrypted - **Master key security** - The master key file (`master.key`) should be backed up securely and protected - **Key rotation** - Rotating a KMS key creates new key material; existing objects remain encrypted with the old material - **Disabled keys** - Objects encrypted with a disabled key cannot be decrypted until the key is re-enabled - **Deleted keys** - Once a key is deleted (after the waiting period), objects encrypted with it are permanently inaccessible ### Verifying Encryption To verify an object is encrypted: 1. Check the raw file in `data//` - it should be unreadable binary 2. Look for `.meta` files containing encryption metadata 3. Download via the API/UI - the object should be automatically decrypted ## 8. Bucket Quotas MyFSIO supports **storage quotas** to limit how much data a bucket can hold. Quotas are enforced on uploads and multipart completions. ### Quota Types | Limit | Description | |-------|-------------| | **Max Size (MB)** | Maximum total storage in megabytes (includes current objects + archived versions) | | **Max Objects** | Maximum number of objects (includes current objects + archived versions) | ### Managing Quotas (Admin Only) Quota management is restricted to administrators (users with `iam:*` or `iam:list_users` permissions). #### Via UI 1. Navigate to your bucket in the UI 2. Click the **Properties** tab 3. Find the **Storage Quota** card 4. Enter limits: - **Max Size (MB)**: Leave empty for unlimited - **Max Objects**: Leave empty for unlimited 5. Click **Update Quota** To remove a quota, click **Remove Quota**. #### Via API ```bash # Set quota (max 100MB, max 1000 objects) curl -X PUT "http://localhost:5000/bucket/?quota" \ -H "Content-Type: application/json" \ -H "X-Access-Key: ..." -H "X-Secret-Key: ..." \ -d '{"max_bytes": 104857600, "max_objects": 1000}' # Get current quota curl "http://localhost:5000/bucket/?quota" \ -H "X-Access-Key: ..." -H "X-Secret-Key: ..." # Remove quota curl -X PUT "http://localhost:5000/bucket/?quota" \ -H "Content-Type: application/json" \ -H "X-Access-Key: ..." -H "X-Secret-Key: ..." \ -d '{"max_bytes": null, "max_objects": null}' ``` ### Quota Behavior - **Version Counting**: When versioning is enabled, archived versions count toward the quota - **Enforcement Points**: Quotas are checked during `PUT` object and `CompleteMultipartUpload` operations - **Error Response**: When quota is exceeded, the API returns `HTTP 400` with error code `QuotaExceeded` - **Visibility**: All users can view quota usage in the bucket detail page, but only admins can modify quotas ### Example Error ```xml QuotaExceeded Bucket quota exceeded: storage limit reached my-bucket ``` ## 9. Site Replication ### Permission Model Replication uses a two-tier permission system: | Role | Capabilities | |------|--------------| | **Admin** (users with `iam:*` permissions) | Create/delete replication rules, configure connections and target buckets | | **Users** (with `replication` permission) | Enable/disable (pause/resume) existing replication rules | > **Note:** The Replication tab is hidden for users without the `replication` permission on the bucket. This separation allows administrators to pre-configure where data should replicate, while allowing authorized users to toggle replication on/off without accessing connection credentials. ### Architecture - **Source Instance**: The MyFSIO instance where you upload files. It runs the replication worker. - **Target Instance**: Another MyFSIO instance (or any S3-compatible service like AWS S3, MinIO) that receives the copies. Replication is **asynchronous** (happens in the background) and **one-way** (Source -> Target). ### Setup Guide #### 1. Prepare the Target Instance If your target is another MyFSIO server (e.g., running on a different machine or port), you need to create a destination bucket and a user with write permissions. **Option A: Using the UI (Easiest)** If you have access to the UI of the target instance: 1. Log in to the Target UI. 2. Create a new bucket (e.g., `backup-bucket`). 3. Go to **IAM**, create a new user (e.g., `replication-user`), and copy the Access/Secret keys. **Option B: Headless Setup (API Only)** If the target server is only running the API (`run_api.py`) and has no UI access, you can bootstrap the credentials and bucket by running a Python script on the server itself. Run this script on the **Target Server**: ```python # setup_target.py from pathlib import Path from app.iam import IamService from app.storage import ObjectStorage # Initialize services (paths match default config) data_dir = Path("data") iam = IamService(data_dir / ".myfsio.sys" / "config" / "iam.json") storage = ObjectStorage(data_dir) # 1. Create the bucket bucket_name = "backup-bucket" try: storage.create_bucket(bucket_name) print(f"Bucket '{bucket_name}' created.") except Exception as e: print(f"Bucket creation skipped: {e}") # 2. Create the user try: # Create user with full access (or restrict policy as needed) creds = iam.create_user( display_name="Replication User", policies=[{"bucket": bucket_name, "actions": ["write", "read", "list"]}] ) print("\n--- CREDENTIALS GENERATED ---") print(f"Access Key: {creds['access_key']}") print(f"Secret Key: {creds['secret_key']}") print("-----------------------------") except Exception as e: print(f"User creation failed: {e}") ``` Save and run: `python setup_target.py` #### 2. Configure the Source Instance Now, configure the primary instance to replicate to the target. 1. **Access the Console**: Log in to the UI of your Source Instance. 2. **Add a Connection**: - Navigate to **Connections** in the top menu. - Click **Add Connection**. - **Name**: `Secondary Site`. - **Endpoint URL**: The URL of your Target Instance's API (e.g., `http://target-server:5002`). - **Access Key**: The key you generated on the Target. - **Secret Key**: The secret you generated on the Target. - Click **Add Connection**. 3. **Enable Replication** (Admin): - Navigate to **Buckets** and select the source bucket. - Switch to the **Replication** tab. - Select the `Secondary Site` connection. - Enter the target bucket name (`backup-bucket`). - Click **Enable Replication**. Once configured, users with `replication` permission on this bucket can pause/resume replication without needing access to connection details. ### Verification 1. Upload a file to the source bucket. 2. Check the target bucket (via UI, CLI, or API). The file should appear shortly. ```bash # Verify on target using AWS CLI aws --endpoint-url http://target-server:5002 s3 ls s3://backup-bucket ``` ### Pausing and Resuming Replication Users with the `replication` permission (but not admin rights) can pause and resume existing replication rules: 1. Navigate to the bucket's **Replication** tab. 2. If replication is **Active**, click **Pause Replication** to temporarily stop syncing. 3. If replication is **Paused**, click **Resume Replication** to continue syncing. When paused, new objects uploaded to the source will not replicate until replication is resumed. Objects uploaded while paused will be replicated once resumed. > **Note:** Only admins can create new replication rules, change the target connection/bucket, or delete rules entirely. ### Bidirectional Replication (Active-Active) To set up two-way replication (Server A ↔ Server B): 1. Follow the steps above to replicate **A → B**. 2. Repeat the process on Server B to replicate **B → A**: - Create a connection on Server B pointing to Server A. - Enable replication on the target bucket on Server B. **Loop Prevention**: The system automatically detects replication traffic using a custom User-Agent (`S3ReplicationAgent`). This prevents infinite loops where an object replicated from A to B is immediately replicated back to A. **Deletes**: Deleting an object on one server will propagate the deletion to the other server. **Note**: Deleting a bucket will automatically remove its associated replication configuration. ## 11. Running Tests ```bash pytest -q ``` The suite now includes a boto3 integration test that spins up a live HTTP server and drives the API through the official AWS SDK. If you want to skip it (for faster unit-only loops), run `pytest -m "not integration"`. The suite covers bucket CRUD, presigned downloads, bucket policy enforcement, and regression tests for anonymous reads when a Public policy is attached. ## 12. Troubleshooting | Symptom | Likely Cause | Fix | | --- | --- | --- | | 403 from API despite Public preset | Policy didn’t save or bucket key path mismatch | Reapply Public preset, confirm bucket name in `Resource` matches `arn:aws:s3:::bucket/*`. | | UI still shows old policy text | Browser cached view before hot reload | Refresh; JSON is already reloaded on server. | | Presign modal errors with 403 | IAM user lacks `read/write/delete` for target bucket or bucket policy denies | Update IAM inline policies or remove conflicting deny statements. | | Large upload rejected immediately | File exceeds `MAX_UPLOAD_SIZE` | Increase env var or shrink object. | ## 13. API Matrix ``` GET / # List buckets PUT / # Create bucket DELETE / # Remove bucket GET / # List objects PUT // # Upload object GET // # Download object DELETE // # Delete object POST /presign// # Generate SigV4 URL GET /bucket-policy/ # Fetch policy PUT /bucket-policy/ # Upsert policy DELETE /bucket-policy/ # Delete policy GET /?quota # Get bucket quota PUT /?quota # Set bucket quota (admin only) ```