Understanding Cloud Storage Events from SFTP Gateway File Uploads
Overview
When uploading files through SFTP Gateway to Google Cloud Storage (GCS), you may notice multiple storage events triggered for a single file upload. This is expected behavior due to how the SFTP protocol translates file operations to GCS API calls.
This article explains why these events occur and provides solutions for filtering them in your Cloud Functions, Cloud Run services, or automation workflows.
How Many Events Are Generated?
Each file upload through SFTP Gateway generates 2-3 GCS events:
1. Initial file creation with metadata - A 0-byte object is created with timestamp metadata (mtime, ctime, atime).
2. File data upload - The object is overwritten with the actual file content.
3. Metadata update - Object metadata is updated after the upload completes (the MD5 checksum is added, and timestamps are updated if preservation is enabled).
Why Does This Happen?
This behavior is inherent to how SFTP operations are translated to GCS APIs:
CREATE FILE command - When an SFTP client initiates an upload, it sends a "create file" command, which SFTP Gateway translates into creating a 0-byte object in GCS with initial timestamp metadata.
PUT FILE command - Once the entire file data is transferred, the object is overwritten with the actual content.
SET ATTRIBUTES command - After the upload completes, SFTP Gateway updates metadata including the MD5 checksum and optionally updates timestamps if preservation is enabled.
Note: Different SFTP clients may behave slightly differently, which can affect the exact timing of events. This is normal and depends on the client's implementation of the SFTP protocol.
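To make the translation concrete, the sequence is roughly analogous to the client-library calls below. This is a minimal sketch using the google-cloud-storage Python library, not SFTP Gateway's actual implementation; the bucket name, object name, and metadata values are placeholders:

from google.cloud import storage

client = storage.Client()
bucket = client.bucket("your-bucket-name")     # placeholder bucket name
blob = bucket.blob("uploads/Thorn_Tech.pdf")

# 1. CREATE FILE: a 0-byte object is written with timestamp metadata
#    (fires the first storage.objects.create event)
blob.metadata = {"mtime": "1733852495", "ctime": "1733852495", "atime": "1733852495"}
blob.upload_from_string(b"")

# 2. PUT FILE: the object is overwritten with the actual content
#    (fires the second storage.objects.create event)
blob.upload_from_filename("Thorn_Tech.pdf")

# 3. SET ATTRIBUTES: metadata is updated after the upload completes
#    (fires the storage.objects.update event)
blob.metadata = {**(blob.metadata or {}), "md5": "placeholder-checksum"}
blob.patch()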
GCS Event Details
For a single file upload to GCS, you'll see:
| Event Type | Description |
|---|---|
| storage.objects.create | Initial 0-byte object creation with timestamp metadata |
| storage.objects.create | Object content uploaded (overwrites the 0-byte object) |
| storage.objects.update | Metadata update after the upload completes |
Example Cloud Logging entries:
1. 2025-12-10 17:41:35.013 storage.objects.create uploads/Thorn_Tech.pdf
2. 2025-12-10 17:41:35.198 storage.objects.create uploads/Thorn_Tech.pdf
3. 2025-12-10 17:41:35.536 storage.objects.update uploads/Thorn_Tech.pdf
How to Handle Multiple Events in Cloud Functions
If these duplicate events are causing issues in your Cloud Functions or Cloud Run services, here are the recommended approaches:
Solution 1: Filter by File Size (Recommended)
The simplest and most reliable solution is to ignore events for 0-byte or very small files (< 1 KB), as the initial creation event will always be minimal in size.
Python Example:
def process_gcs_event(data, context):
    """Triggered by a change to a Cloud Storage bucket.

    Args:
        data (dict): Event payload.
        context (google.cloud.functions.Context): Metadata for the event.
    """
    file_name = data['name']
    bucket_name = data['bucket']
    file_size = int(data.get('size', 0))

    # Ignore small files (0 bytes or minimal size)
    if file_size < 1024:  # Less than 1 KB
        print(f"Ignoring small file event: {file_name} ({file_size} bytes)")
        return

    # Process the actual file upload
    print(f"Processing file: gs://{bucket_name}/{file_name} ({file_size} bytes)")
    process_file(bucket_name, file_name)


def process_file(bucket_name, file_name):
    """Your file processing logic here."""
    print(f"Processing gs://{bucket_name}/{file_name}")
Why this works:
- The initial object creation is always 0 bytes
- Actual file uploads will be larger than 1 KB in most cases
- Simple, fast, and doesn't require additional GCS API calls
- Works reliably across all SFTP clients
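If you use 2nd-gen Cloud Functions (Cloud Run functions) triggered through Eventarc, the same size filter can be written against the CloudEvent payload. This is a sketch assuming the functions-framework library and the google.cloud.storage.object.v1.finalized event type; adapt the names to your deployment:

import functions_framework

@functions_framework.cloud_event
def process_gcs_event(cloud_event):
    """Triggered by a Cloud Storage object finalized event (2nd gen)."""
    data = cloud_event.data
    file_name = data["name"]
    bucket_name = data["bucket"]
    file_size = int(data.get("size", 0))

    # Ignore the initial 0-byte placeholder object
    if file_size < 1024:  # Less than 1 KB
        print(f"Ignoring small file event: {file_name} ({file_size} bytes)")
        return

    print(f"Processing file: gs://{bucket_name}/{file_name} ({file_size} bytes)")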
Solution 2: Add a Delay Before Processing
For simple workflows with files under 25 MB, adding a brief delay before processing can ensure you're working with the final version of the file.
Python Example:
import time

def process_gcs_event(data, context):
    """Triggered by a change to a Cloud Storage bucket."""
    file_name = data['name']
    bucket_name = data['bucket']

    # Wait 15 seconds to ensure all events have completed
    time.sleep(15)

    # Process the file
    print(f"Processing file: {file_name}")
    process_file(bucket_name, file_name)


def process_file(bucket_name, file_name):
    """Your file processing logic here."""
    print(f"Processing gs://{bucket_name}/{file_name}")
Note: This approach works well for smaller files (under 25 MB) where the upload completes quickly. For larger files or high-throughput systems, use Solution 1 instead.
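A slightly more robust variant of the delay approach re-reads the object after the wait and skips it if it is still a 0-byte placeholder. This is a sketch that assumes the google-cloud-storage client library is installed in the function's environment:

import time
from google.cloud import storage

def process_gcs_event(data, context):
    """Triggered by a change to a Cloud Storage bucket."""
    file_name = data['name']
    bucket_name = data['bucket']

    # Wait for the remaining upload and metadata events to complete
    time.sleep(15)

    # Re-read the object to check its final size before processing
    blob = storage.Client().bucket(bucket_name).get_blob(file_name)
    if blob is None or (blob.size or 0) < 1024:
        print(f"Skipping {file_name}: still a 0-byte placeholder or missing")
        return

    print(f"Processing file: gs://{bucket_name}/{file_name} ({blob.size} bytes)")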
Recommended Approach
For most use cases: Use Solution 1 (filter by file size) - it's simple, reliable, efficient, and doesn't require additional GCS API calls or delays.
For simple, low-traffic workflows: Use Solution 2 (delay approach) - only suitable for small files and scenarios where a 15-second delay is acceptable.
Viewing Events in Cloud Logging
To observe these events for debugging:
Step 1: Enable Data Access Audit Logs
- Go to IAM & Admin → Audit Logs in the Google Cloud Console
- Find Cloud Storage in the list of services
- Check the boxes for:
- Data Read (optional, for viewing read operations)
- Data Write (required for upload events)
- Click Save
Note: Data Access logs may incur additional costs. See Cloud Logging pricing for details.
Step 2: Query Events in Logs Explorer
- Go to Logging → Logs Explorer
- Use this query to see all storage events for your bucket:
resource.type="gcs_bucket"
resource.labels.bucket_name="your-bucket-name"
protoPayload.methodName=~"storage.objects.*"
To filter for specific operations:
resource.type="gcs_bucket"
resource.labels.bucket_name="your-bucket-name"
(protoPayload.methodName="storage.objects.create" OR protoPayload.methodName="storage.objects.update")
To filter for a specific file:
resource.type="gcs_bucket"
resource.labels.bucket_name="your-bucket-name"
protoPayload.resourceName:"objects/uploads/your-file.pdf"
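You can also run these queries from the command line with gcloud logging read, for example:

gcloud logging read \
  'resource.type="gcs_bucket" AND resource.labels.bucket_name="your-bucket-name" AND protoPayload.methodName=~"storage.objects.*"' \
  --limit=20 \
  --format="table(timestamp, protoPayload.methodName, protoPayload.resourceName)"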
Understanding the Log Structure
Each log entry contains useful information:
{
"protoPayload": {
"methodName": "storage.objects.create",
"resourceName": "projects/_/buckets/your-bucket/objects/uploads/file.pdf",
"request": {
"size": "937756"
}
},
"timestamp": "2025-12-10T17:41:35.013Z"
}
Cloud Function Configuration Tips
Event Trigger Setup
When creating a Cloud Function with a Cloud Storage trigger:
- Event Type: Choose google.storage.object.finalize for upload events
- Bucket: Select your GCS bucket
- Event Filters: Optional, but you can filter by prefix (e.g., uploads/)
Using gcloud:
gcloud functions deploy process-uploads \
--runtime python39 \
--trigger-event google.storage.object.finalize \
--trigger-resource your-bucket-name \
--entry-point process_gcs_event
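If you deploy 2nd-gen functions instead, the trigger is expressed with Eventarc event filters. A rough equivalent (region, runtime, and source directory are placeholders you'll need to adjust):

gcloud functions deploy process-uploads \
  --gen2 \
  --runtime=python312 \
  --region=us-central1 \
  --entry-point=process_gcs_event \
  --trigger-event-filters="type=google.cloud.storage.object.v1.finalized" \
  --trigger-event-filters="bucket=your-bucket-name"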
Function Timeout Considerations
Set appropriate timeouts based on your solution:
- Solution 1 (file size filter): 60 seconds is usually sufficient
- Solution 2 (delay approach): Minimum 60 seconds (15s delay + processing time)
Configure via gcloud:
gcloud functions deploy process-uploads \
--trigger-bucket=your-bucket-name \
--runtime=python39 \
--timeout=120s
Permissions Required
Your Cloud Function's service account needs these permissions:
- storage.objects.get
- storage.objects.list
These are included in the roles/storage.objectViewer role.
Grant permissions:
gcloud projects add-iam-policy-binding PROJECT_ID \
--member=serviceAccount:SERVICE_ACCOUNT_EMAIL \
--role=roles/storage.objectViewer
Additional Notes
- This behavior is expected and cannot be disabled, as it's how the SFTP protocol translates to GCS APIs.
- The extra events do not increase storage costs (only one object exists in GCS).
- Timestamp preservation can be disabled in your SFTP client settings if you don't need original timestamps, but the three-event pattern will still occur.
- Different SFTP clients may produce slightly different event patterns, but the core behavior remains the same.
- SFTP Gateway sets custom metadata on objects, including mtime, ctime, atime, and md5, for file verification (see the sketch below).
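If you need to read that custom metadata in your own code, a minimal sketch with the google-cloud-storage Python client looks like this (the exact metadata keys can vary by SFTP Gateway version and configuration):

from google.cloud import storage

client = storage.Client()
bucket = client.bucket("your-bucket-name")
blob = bucket.get_blob("uploads/Thorn_Tech.pdf")

# Custom (user-defined) metadata written by SFTP Gateway
print(blob.metadata)   # e.g. {'mtime': '...', 'ctime': '...', 'atime': '...', 'md5': '...'}

# Standard object properties maintained by GCS itself
print(blob.size)       # size in bytes
print(blob.md5_hash)   # base64-encoded MD5 computed by GCS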
Need Help?
If you're still experiencing issues with duplicate events or need assistance implementing these filtering strategies, please contact our support team at support@thorntech.com.