Lambda Function Setup
Configure smart serverless functions to handle file uploads, deduplication logic, and cleanup—powering the core of our efficient, user-aware storage system.
We will now set up the core Lambda functions that power the file deduplication logic. These serverless functions are the brain of the project — handling file validation, deduplication checks, S3 operations, and user tracking in DynamoDB.
There are two main Lambda functions in this project:
Upload Handler — triggered when a user uploads a file
Delete Handler — triggered when a user deletes/unlinks a file
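Both handlers expect an API Gateway-style event whose body is a JSON string. As a quick orientation, here are sketches of the two request bodies (the field names match the handlers below; the values are made-up placeholders):

Upload request body:

{
    "fileName": "report.pdf",
    "fileContent": "<base64-encoded file bytes>",
    "userID": "user-123"
}

Delete request body:

{
    "fileHash": "<SHA-256 hex digest of the file>",
    "userID": "user-123"
}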

Step 1: Create the Upload Lambda Function
Go to AWS Console → Lambda
Click Create function
Select:
Author from scratch
Function name: owncloud-dedup-function
Runtime: Python 3.9
Permissions: use an existing role, owncloud-dedup-role, with basic Lambda permissions
Click Create function
Once created, open the function editor and paste the upload logic below.
import json
import boto3
import hashlib
import base64
import uuid

# AWS clients
s3 = boto3.client('s3')
dynamodb = boto3.resource('dynamodb')

# Configuration
TABLE_NAME = "dedupTable"
DEDUP_BUCKET = "owncloud-dedup-files"
table = dynamodb.Table(TABLE_NAME)

def lambda_handler(event, context):
    try:
        # Parse and validate input
        body = json.loads(event['body'])
        file_name = body.get('fileName')
        file_content_b64 = body.get('fileContent')
        user_id = body.get('userID')

        if not file_name or not file_content_b64 or not user_id:
            return {
                'statusCode': 400,
                'body': json.dumps({'error': 'Missing fileName, fileContent, or userID'})
            }

        # Decode base64 and hash the file
        file_data = base64.b64decode(file_content_b64)
        file_hash = hashlib.sha256(file_data).hexdigest()

        # Check for an existing file with the same hash
        response = table.get_item(Key={'FileHash': file_hash})
        if 'Item' in response:
            item = response['Item']
            s3_key = item['S3Key']
            existing_users = item.get('Users', [])

            # Only update if the user is not already listed
            if user_id not in existing_users:
                table.update_item(
                    Key={'FileHash': file_hash},
                    UpdateExpression="SET #u = list_append(if_not_exists(#u, :empty_list), :new_user)",
                    ExpressionAttributeNames={'#u': 'Users'},
                    ExpressionAttributeValues={
                        ':new_user': [user_id],
                        ':empty_list': []
                    }
                )

            return {
                'statusCode': 200,
                'body': json.dumps({
                    'message': 'File already exists. Linked to you.',
                    'stored_path': f"s3://{DEDUP_BUCKET}/{s3_key}"
                })
            }

        # Upload the new file to S3
        unique_key = f"{uuid.uuid4()}_{file_name}"
        s3.put_object(Bucket=DEDUP_BUCKET, Key=unique_key, Body=file_data)

        # Store new metadata
        table.put_item(Item={
            'FileHash': file_hash,
            'S3Key': unique_key,
            'Users': [user_id]
        })

        return {
            'statusCode': 200,
            'body': json.dumps({
                'message': 'File stored successfully',
                'stored_path': f"s3://{DEDUP_BUCKET}/{unique_key}"
            })
        }

    except Exception as e:
        return {
            'statusCode': 500,
            'body': json.dumps({'error': str(e)})
        }
This function handles:
File hash computation (SHA-256)
Duplicate checks against DynamoDB
Upload to S3 (only if the file is new)
Appending a user reference to existing files
Returning the stored path (symlink-style)
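Before wiring up API Gateway, you can smoke-test the handler with a hand-built event. A minimal sketch, assuming your AWS credentials and region are configured, the dedupTable table and owncloud-dedup-files bucket exist, and the handler above is saved locally as lambda_function.py:

import base64
import json

from lambda_function import lambda_handler  # the upload handler above

# Build an API Gateway-style event around a small in-memory "file"
payload = {
    "fileName": "hello.txt",
    "fileContent": base64.b64encode(b"hello world").decode(),
    "userID": "user-123",
}
event = {"body": json.dumps(payload)}

# The first call stores the file; a repeat call with a different
# userID links that user to the existing object instead
result = lambda_handler(event, None)
print(result["statusCode"], result["body"])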
Step 2: Create the Delete Lambda Function
Go back to the Lambda console
Click Create function again
Select:
Function name: deleteFileLambda
Runtime: Python 3.11
Permissions: use an existing role with S3 + DynamoDB access, deleteFileLambda-role-wpp128e3
Paste the delete logic:
import json
import boto3

s3 = boto3.client('s3')
dynamodb = boto3.resource('dynamodb')

# Constants
TABLE_NAME = "dedupTable"
DEDUP_BUCKET = "owncloud-dedup-files"
table = dynamodb.Table(TABLE_NAME)

def lambda_handler(event, context):
    try:
        # Parse request
        body = json.loads(event['body'])
        file_hash = body.get('fileHash')
        user_id = body.get('userID')

        if not file_hash or not user_id:
            return {
                'statusCode': 400,
                'body': json.dumps({'error': 'Missing fileHash or userID'})
            }

        # Fetch the item from DynamoDB
        response = table.get_item(Key={'FileHash': file_hash})
        if 'Item' not in response:
            return {'statusCode': 404, 'body': json.dumps({'error': 'File not found'})}

        item = response['Item']
        users = item.get('Users', [])

        if user_id not in users:
            return {
                'statusCode': 400,
                'body': json.dumps({'error': 'User not linked to this file'})
            }

        # Remove the user from the list
        users.remove(user_id)

        if users:
            # Update the remaining users in the table
            table.update_item(
                Key={'FileHash': file_hash},
                UpdateExpression="SET #U = :u",
                ExpressionAttributeNames={"#U": "Users"},
                ExpressionAttributeValues={':u': users}
            )
            return {
                'statusCode': 200,
                'body': json.dumps({'message': 'User reference removed. File still linked to others.'})
            }
        else:
            # No users left: delete the file from S3 and remove the DB entry
            s3.delete_object(Bucket=DEDUP_BUCKET, Key=item['S3Key'])
            table.delete_item(Key={'FileHash': file_hash})
            return {
                'statusCode': 200,
                'body': json.dumps({'message': 'File deleted from system. No users remaining.'})
            }

    except Exception as e:
        return {
            'statusCode': 500,
            'body': json.dumps({'error': str(e)})
        }
This function:
Removes the user's reference from the Users list in DynamoDB
Deletes the file from S3 once no users remain
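Note that the client addresses the file by the same SHA-256 hex digest the upload handler computed. A minimal test sketch, under the same assumptions as the upload test (local lambda_function.py, existing table and bucket):

import hashlib
import json

from lambda_function import lambda_handler  # the delete handler above

# Recompute the hash exactly as the upload handler did: SHA-256 over the raw bytes
file_hash = hashlib.sha256(b"hello world").hexdigest()

event = {"body": json.dumps({"fileHash": file_hash, "userID": "user-123"})}
result = lambda_handler(event, None)
print(result["statusCode"], result["body"])  # S3 object is removed once no users remain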
(Optional) Step 3: Assign IAM Role Permissions (for verification purposes)
Ensure both Lambda functions have access to:
S3 (PutObject, DeleteObject)
DynamoDB (GetItem, PutItem, UpdateItem, DeleteItem)
KMS (if using SSE-KMS encryption)
IAM Inline Policy Example:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:DeleteObject"
            ],
            "Resource": "arn:aws:s3:::owncloud-dedup-files/*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "dynamodb:GetItem",
                "dynamodb:PutItem",
                "dynamodb:UpdateItem",
                "dynamodb:DeleteItem"
            ],
            "Resource": "arn:aws:dynamodb:<region>:<account-id>:table/dedupTable"
        }
    ]
}
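If the bucket uses SSE-KMS, the role also needs access to the key. A sketch of the extra statement, with a placeholder key ARN you would replace with your own:

{
    "Effect": "Allow",
    "Action": [
        "kms:GenerateDataKey",
        "kms:Decrypt"
    ],
    "Resource": "arn:aws:kms:<region>:<account-id>:key/<key-id>"
}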
Optional: Enable CloudWatch Logging
To help with debugging, make sure your logs reach CloudWatch:
Lambda writes function output to CloudWatch Logs automatically, as long as the execution role has the basic logging permissions
View the logs via Lambda → Monitor tab → View CloudWatch logs
Add print() statements inside your Lambda code to trace each step
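As a minimal sketch (separate from the handlers above), both print() and the logging module end up in the same log stream:

import logging

# The Lambda Python runtime preconfigures a handler on the root logger
logger = logging.getLogger()
logger.setLevel(logging.INFO)

def lambda_handler(event, context):
    logger.info("received event keys: %s", list(event.keys()))
    print("plain print() lines land in the same log stream")
    return {"statusCode": 200, "body": "ok"}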
Lambda Functions Ready!
These two serverless functions are now capable of:
Deduplicating files intelligently
Tracking user-level file ownership
Cleaning up storage as users unlink