# Vault Email Processor - Smart Proxy
This is a Go rewrite of the PHP `vault.php` script that processes emails for the CMC Sales system. It supports three modes: local file processing, Gmail indexing, and an HTTP streaming proxy.
## Key Features
1. **Gmail Integration**: Index Gmail emails without downloading their content
2. **Smart Proxy**: Stream email content on-demand without storing to disk
3. **No ripmime dependency**: Uses the enmime Go library for robust MIME parsing instead of the external ripmime tool
4. **Better error handling**: Proper error handling and database transactions
5. **Type safety**: Strongly typed Go structures
## Operating Modes
### 1. Local Mode (Original functionality)
Processes emails from local filesystem directories.
```bash
./vault --mode=local \
--emaildir=/var/www/emails \
--vaultdir=/var/www/vaultmsgs/new \
--processeddir=/var/www/vaultmsgs/cur \
--dbhost=127.0.0.1 \
--dbuser=cmc \
--dbpass='your-db-password' \
--dbname=cmc
```
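
To illustrate what processing one message file involves, here is a minimal sketch that extracts the sender, recipients, and subject from a raw message. The tool itself uses enmime; this sketch swaps in the standard library's `net/mail` so that it runs without extra dependencies. The `Summary` type and `summarize` function are hypothetical names, not part of the vault code.

```go
package main

import (
	"fmt"
	"net/mail"
	"strings"
)

// Summary holds the header fields a metadata record might need.
// Hypothetical type, for illustration only.
type Summary struct {
	From    string
	To      []string
	Subject string
}

// summarize parses one RFC 5322 message and extracts basic headers.
// The subject is returned as-is, without decoding MIME encoded-words.
func summarize(raw string) (Summary, error) {
	msg, err := mail.ReadMessage(strings.NewReader(raw))
	if err != nil {
		return Summary{}, fmt.Errorf("read message: %w", err)
	}
	from, err := mail.ParseAddress(msg.Header.Get("From"))
	if err != nil {
		return Summary{}, fmt.Errorf("parse From: %w", err)
	}
	to, err := msg.Header.AddressList("To")
	if err != nil {
		return Summary{}, fmt.Errorf("parse To: %w", err)
	}
	s := Summary{From: from.Address, Subject: msg.Header.Get("Subject")}
	for _, addr := range to {
		s.To = append(s.To, addr.Address)
	}
	return s, nil
}

func main() {
	raw := "From: Alice <alice@example.com>\r\n" +
		"To: sales@example.com, Bob <bob@example.com>\r\n" +
		"Subject: Quote request\r\n" +
		"\r\n" +
		"Hello\r\n"
	s, err := summarize(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", s)
}
```

A full MIME parser such as enmime additionally walks multipart bodies and attachments, which `net/mail` alone does not do.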
### 2. Gmail Index Mode
Indexes Gmail emails without downloading content. Creates database references only.
```bash
./vault --mode=index \
--gmail-query="is:unread" \
--credentials=credentials.json \
--token=token.json \
--dbhost=127.0.0.1 \
--dbuser=cmc \
--dbpass='your-db-password' \
--dbname=cmc
```
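
The idea of indexing without downloading can be sketched as a loop that asks a source for message metadata and stores only references that are not yet known. Everything named here (`MessageMeta`, `MetadataSource`, `MetadataStore`, `indexMessages`) is a hypothetical stand-in; in the real tool the source would presumably be backed by the Gmail API and the store by the MySQL database.

```go
package main

import "fmt"

// MessageMeta is the metadata kept for one indexed message.
type MessageMeta struct {
	ID       string // Gmail message ID
	ThreadID string // Gmail thread ID
	Subject  string
}

// MetadataSource lists metadata for messages matching a query.
// Hypothetical interface; no message bodies pass through it.
type MetadataSource interface {
	ListMetadata(query string) ([]MessageMeta, error)
}

// MetadataStore persists message references.
type MetadataStore interface {
	HasMessage(id string) (bool, error)
	SaveMessage(m MessageMeta) error
}

// indexMessages stores a reference for every message that is not yet
// indexed and returns how many references were added.
func indexMessages(src MetadataSource, store MetadataStore, query string) (int, error) {
	metas, err := src.ListMetadata(query)
	if err != nil {
		return 0, fmt.Errorf("list messages: %w", err)
	}
	added := 0
	for _, m := range metas {
		known, err := store.HasMessage(m.ID)
		if err != nil {
			return added, fmt.Errorf("lookup %s: %w", m.ID, err)
		}
		if known {
			continue // already indexed on an earlier run
		}
		if err := store.SaveMessage(m); err != nil {
			return added, fmt.Errorf("save %s: %w", m.ID, err)
		}
		added++
	}
	return added, nil
}

// In-memory fakes so the sketch runs without Gmail or a database.
type fakeSource []MessageMeta

func (f fakeSource) ListMetadata(string) ([]MessageMeta, error) { return f, nil }

type memStore map[string]MessageMeta

func (s memStore) HasMessage(id string) (bool, error) { _, ok := s[id]; return ok, nil }
func (s memStore) SaveMessage(m MessageMeta) error    { s[m.ID] = m; return nil }

func main() {
	src := fakeSource{
		{ID: "m1", ThreadID: "t1", Subject: "Invoice"},
		{ID: "m2", ThreadID: "t1", Subject: "Re: Invoice"},
	}
	store := memStore{}
	first, _ := indexMessages(src, store, "is:unread")
	second, _ := indexMessages(src, store, "is:unread")
	fmt.Println(first, second)
}
```

Skipping known IDs makes repeated runs with the same query idempotent, which matters if index mode is run on a schedule.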
### 3. HTTP Server Mode
Runs an HTTP server that streams Gmail content on-demand.
```bash
./vault --mode=serve \
--port=8080 \
--credentials=credentials.json \
--token=token.json \
--dbhost=127.0.0.1 \
--dbuser=cmc \
--dbpass='your-db-password' \
--dbname=cmc
```
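
The `--mode` switch shown in the three commands above could be handled roughly as follows. This is a minimal sketch using the standard `flag` package, not the tool's actual `main.go`: it defines only a subset of the flags, and the default values and the `Config` and `parseConfig` names are assumptions.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// Config holds a subset of the command-line options (hypothetical type).
type Config struct {
	Mode       string
	Port       int
	GmailQuery string
}

// parseConfig parses args and rejects unknown modes.
// The defaults below are assumptions, not documented behaviour.
func parseConfig(args []string) (Config, error) {
	var cfg Config
	fs := flag.NewFlagSet("vault", flag.ContinueOnError)
	fs.StringVar(&cfg.Mode, "mode", "local", "operating mode: local, index, or serve")
	fs.IntVar(&cfg.Port, "port", 8080, "HTTP port (serve mode)")
	fs.StringVar(&cfg.GmailQuery, "gmail-query", "", "Gmail search query (index mode)")
	if err := fs.Parse(args); err != nil {
		return cfg, err
	}
	switch cfg.Mode {
	case "local", "index", "serve":
		return cfg, nil
	default:
		return cfg, fmt.Errorf("unknown mode %q (want local, index, or serve)", cfg.Mode)
	}
}

func main() {
	cfg, err := parseConfig(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	// Each branch stands in for the real mode implementation.
	switch cfg.Mode {
	case "local":
		fmt.Println("would process local email directories")
	case "index":
		fmt.Printf("would index Gmail messages matching %q\n", cfg.GmailQuery)
	case "serve":
		fmt.Printf("would serve HTTP on port %d\n", cfg.Port)
	}
}
```

Go's `flag` package accepts both `-mode=serve` and `--mode=serve`, so the double-dash style used in this README works unchanged.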
## Gmail Setup
1. Enable Gmail API in Google Cloud Console
2. Create OAuth 2.0 credentials
3. Download credentials as `credentials.json`
4. Run vault in a Gmail mode (`index` or `serve`); on first run it prompts for authorization
5. The token is saved as `token.json` and reused on later runs
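
The "authorize once, then reuse `token.json`" behaviour in steps 4 and 5 can be sketched as a load-or-create helper. This is an illustration only: `savedToken` is a local stand-in for the token type provided by `golang.org/x/oauth2`, and the `authorize` callback stands in for the interactive consent flow.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"time"
)

// savedToken is a simplified, hypothetical token record.
type savedToken struct {
	AccessToken  string    `json:"access_token"`
	RefreshToken string    `json:"refresh_token"`
	Expiry       time.Time `json:"expiry"`
}

func loadToken(path string) (*savedToken, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, err
	}
	var tok savedToken
	if err := json.Unmarshal(data, &tok); err != nil {
		return nil, fmt.Errorf("decode %s: %w", path, err)
	}
	return &tok, nil
}

func saveToken(path string, tok *savedToken) error {
	data, err := json.MarshalIndent(tok, "", "  ")
	if err != nil {
		return err
	}
	// 0600: the token grants mailbox access, so keep it owner-only.
	return os.WriteFile(path, data, 0o600)
}

// tokenFor returns the cached token, or runs authorize once and caches it.
func tokenFor(path string, authorize func() (*savedToken, error)) (*savedToken, error) {
	tok, err := loadToken(path)
	if err == nil {
		return tok, nil
	}
	if !errors.Is(err, fs.ErrNotExist) {
		return nil, err
	}
	if tok, err = authorize(); err != nil {
		return nil, err
	}
	if err := saveToken(path, tok); err != nil {
		return nil, err
	}
	return tok, nil
}

// runDemo calls tokenFor twice and reports how often authorize ran.
func runDemo() (int, error) {
	dir, err := os.MkdirTemp("", "vault-token")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)
	calls := 0
	authorize := func() (*savedToken, error) {
		calls++
		return &savedToken{AccessToken: "example", Expiry: time.Now().Add(time.Hour)}, nil
	}
	for i := 0; i < 2; i++ {
		if _, err := tokenFor(filepath.Join(dir, "token.json"), authorize); err != nil {
			return calls, err
		}
	}
	return calls, nil
}

func main() {
	calls, err := runDemo()
	fmt.Println("authorize calls:", calls, "error:", err)
}
```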
## API Endpoints (Server Mode)
- `GET /api/emails` - List indexed emails (metadata only)
- `GET /api/emails/:id` - Get email metadata
- `GET /api/emails/:id/content` - Stream email HTML/text from Gmail
- `GET /api/emails/:id/attachments` - List attachment metadata
- `GET /api/emails/:id/attachments/:attachmentId` - Stream attachment from Gmail
- `GET /api/emails/:id/raw` - Stream raw email (for email clients)
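
As an illustration of how a streaming endpoint such as `GET /api/emails/:id/content` can avoid buffering or writing to disk, the sketch below copies an upstream reader straight into the HTTP response. It uses only `net/http` with manual path parsing to stay dependency-free, whereas the dependency list shows the tool routes with gorilla/mux. The `ContentSource` interface, `contentHandler`, and the choice of status codes are assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

var errNotFound = errors.New("email not found")

// ContentSource opens a reader over an email body (hypothetical;
// the real tool would fetch from Gmail here).
type ContentSource interface {
	OpenContent(id string) (io.ReadCloser, error)
}

// contentHandler serves GET /api/emails/<id>/content.
func contentHandler(src ContentSource) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodGet {
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		rest := strings.TrimPrefix(r.URL.Path, "/api/emails/")
		id := strings.TrimSuffix(rest, "/content")
		// Reject paths lacking the prefix, the suffix, or a single-segment ID.
		if rest == r.URL.Path || id == rest || id == "" || strings.Contains(id, "/") {
			http.NotFound(w, r)
			return
		}
		body, err := src.OpenContent(id)
		if errors.Is(err, errNotFound) {
			http.NotFound(w, r)
			return
		}
		if err != nil {
			http.Error(w, "upstream error", http.StatusBadGateway)
			return
		}
		defer body.Close()
		// io.Copy forwards the body in chunks; nothing touches the disk.
		if _, err := io.Copy(w, body); err != nil {
			fmt.Println("stream aborted:", err) // headers already sent
		}
	})
}

// fakeSource serves canned bodies so the sketch runs offline.
type fakeSource map[string]string

func (f fakeSource) OpenContent(id string) (io.ReadCloser, error) {
	s, ok := f[id]
	if !ok {
		return nil, errNotFound
	}
	return io.NopCloser(strings.NewReader(s)), nil
}

// fetch runs one GET request through the handler in memory.
func fetch(h http.Handler, path string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	h := contentHandler(fakeSource{"42": "<p>Hello</p>"})
	fmt.Println(fetch(h, "/api/emails/42/content"))
	fmt.Println(fetch(h, "/api/emails/99/content"))
}
```

Because the response is written as the upstream data arrives, memory use stays roughly constant regardless of message or attachment size.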
## Database Schema Changes
Required migrations for Gmail support:
```sql
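-- Gmail identifiers and download state for each indexed email.
ALTER TABLE emails
    ADD COLUMN gmail_message_id VARCHAR(255) UNIQUE,
    ADD COLUMN gmail_thread_id VARCHAR(255),
    ADD COLUMN is_downloaded BOOLEAN DEFAULT FALSE,
    ADD COLUMN raw_headers TEXT;
-- Optional in MySQL: the UNIQUE constraint above already creates an index
-- on gmail_message_id, so this second index is redundant.
CREATE INDEX idx_gmail_message_id ON emails(gmail_message_id);
-- Gmail identifiers needed to stream an attachment on demand.
ALTER TABLE email_attachments
    ADD COLUMN gmail_attachment_id VARCHAR(255),
    ADD COLUMN gmail_message_id VARCHAR(255);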
```
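
With the unique `gmail_message_id` column in place, re-indexing the same message can be made idempotent with MySQL's `INSERT ... ON DUPLICATE KEY UPDATE`. The sketch below shows one way to do this from Go. It writes only the new Gmail columns and so assumes the remaining `emails` columns are nullable or have defaults, which the schema may not guarantee. The `execer` interface and `upsertGmailRef` function are hypothetical names.

```go
package main

import (
	"context"
	"database/sql"
	"fmt"
)

// execer is satisfied by *sql.DB and *sql.Tx, so the same helper can be
// used inside or outside a transaction.
type execer interface {
	ExecContext(ctx context.Context, query string, args ...any) (sql.Result, error)
}

const upsertSQL = `
INSERT INTO emails (gmail_message_id, gmail_thread_id, is_downloaded, raw_headers)
VALUES (?, ?, FALSE, ?)
ON DUPLICATE KEY UPDATE
    gmail_thread_id = VALUES(gmail_thread_id),
    raw_headers     = VALUES(raw_headers)`

// upsertGmailRef inserts a metadata-only row, or refreshes it if the
// message ID is already indexed.
func upsertGmailRef(ctx context.Context, db execer, messageID, threadID, rawHeaders string) error {
	if _, err := db.ExecContext(ctx, upsertSQL, messageID, threadID, rawHeaders); err != nil {
		return fmt.Errorf("upsert gmail message %s: %w", messageID, err)
	}
	return nil
}

// recorder is a fake execer that captures arguments instead of
// talking to a database.
type recorder struct {
	args [][]any
}

func (r *recorder) ExecContext(_ context.Context, _ string, args ...any) (sql.Result, error) {
	r.args = append(r.args, args)
	return nil, nil
}

// demo runs one upsert against the fake and returns the bound arguments.
func demo() ([]any, error) {
	rec := &recorder{}
	if err := upsertGmailRef(context.Background(), rec, "m1", "t1", "Subject: Invoice"); err != nil {
		return nil, err
	}
	return rec.args[0], nil
}

func main() {
	args, err := demo()
	fmt.Println(args, err)
}
```

Using placeholders rather than string concatenation keeps header text, which is attacker-controlled, out of the SQL statement itself.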
## Architecture
### Smart Proxy Benefits
- **No Disk Storage**: Emails/attachments streamed directly from Gmail
- **Low Storage Footprint**: Only metadata stored in database
- **Fresh Content**: Always serves latest version from Gmail
- **Scalable**: No file management overhead
- **On-Demand**: Content fetched only when requested
### Processing Flow
1. **Index Mode**: Scans Gmail, stores metadata, creates associations
2. **Server Mode**: Receives HTTP requests, fetches from Gmail, streams to client
3. **Local Mode**: Original file-based processing (backwards compatible)
## Build
```bash
go build -o vault ./cmd/vault
```
## Dependencies
- `github.com/jhillyerd/enmime` - MIME email parsing
- `github.com/google/uuid` - UUID generation
- `github.com/go-sql-driver/mysql` - MySQL driver
- `github.com/gorilla/mux` - HTTP router
- `golang.org/x/oauth2` - OAuth2 support
- `google.golang.org/api/gmail/v1` - Gmail API client
## Database Tables Used
- `emails` - Main email records with Gmail metadata
- `email_recipients` - To/CC recipients
- `email_attachments` - Attachment metadata (no file storage)
- `emails_enquiries` - Email to enquiry associations
- `emails_invoices` - Email to invoice associations
- `emails_purchase_orders` - Email to PO associations
- `emails_jobs` - Email to job associations
- `users` - System users
- `enquiries`, `invoices`, `purchase_orders`, `jobs` - For identifier matching
## Gmail Query Examples
- `is:unread` - Unread emails
- `newer_than:1d` - Emails from last 24 hours
- `from:customer@example.com` - From specific sender
- `subject:invoice` - Subject contains "invoice"
- `has:attachment` - Emails with attachments
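
Gmail treats space-separated search terms as a logical AND, so the operators above can be combined into one `--gmail-query` value. A minimal sketch of a helper that assembles such a query follows; `buildQuery` is a hypothetical name, not part of the tool.

```go
package main

import (
	"fmt"
	"strings"
)

// buildQuery joins the non-empty terms with spaces, which Gmail
// interprets as "all of these must match".
func buildQuery(terms ...string) string {
	var kept []string
	for _, t := range terms {
		if t = strings.TrimSpace(t); t != "" {
			kept = append(kept, t)
		}
	}
	return strings.Join(kept, " ")
}

func main() {
	// Prints: is:unread has:attachment newer_than:1d
	fmt.Println(buildQuery("is:unread", "has:attachment", "newer_than:1d"))
}
```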