AWS S3
This page explains the features and use cases of AWS S3 and provides step-by-step instructions for integrating it as a destination in HCL CDP.
AWS S3 (Simple Storage Service) is a cloud-based object storage service that allows businesses and individuals to securely store data within containers called buckets. S3 is versatile and supports use cases such as:
- Data lakes
- Websites
- Mobile applications
- Backups and archives
- IoT device data
- Big data analytics
Key Features of AWS S3
- Intelligent-Tiering: Automatically optimizes storage costs for changing access patterns.
- Storage Management: Tools for cost management, regulatory compliance, and latency reduction.
- Access Auditing: Features to manage and audit access to your buckets and objects.
For more details, refer to the AWS S3 Documentation.
Connection Modes
Connection | Web | Mobile |
---|---|---|
Device Mode | 🚫 | 🚫 |
Cloud Mode | ✅ | ✅ |
Setting Up AWS S3 as a Destination in HCL CDP
To configure AWS S3 in HCL CDP, follow the steps below:
- Navigate to Data Pipeline > Destinations and click +Create Destination.
- From the catalog pop-up, select Amazon S3 and click Continue.
- Select a source, and click Continue.
- In the Destination Name field, enter a unique name to identify the destination. Click Submit.
- Toggle the Status button to activate or deactivate the destination.
- Click +Add Sources to add sources, and select desired sources.
- Click Continue. In the Bucket Name text box, enter the bucket name.
  Note: To get the bucket name, log in to your AWS S3 console and create a new S3 bucket (or choose an existing one).
- Add an optional prefix path (e.g., if the bucket name is AbcdS3 and the prefix is AWS, the full name becomes AWS AbcdS3).
- Click Update Destination to save your settings.
Data Handling in AWS S3
Data Processing:
- HCL CDP collects data from sources, processes it in hourly batches, and stores it as gzipped, newline-separated JSON files.
- Files are first uploaded to an HCL CDP S3 bucket and then securely copied to your S3 bucket.
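To allow HCL CDP to copy files into your bucket, attach a bucket policy like the following, replacing <s3-bucket-name> with the name of your bucket:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowhclcdpUser",
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::821664204980:user/s3-copy" },
      "Action": [ "s3:PutObject", "s3:PutObjectAcl" ],
      "Resource": "arn:aws:s3:::<s3-bucket-name>/hclcdp-logs/*"
    }
  ]
}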
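The policy above contains a placeholder for the bucket name. As a minimal sketch, the following Python fills in that placeholder and prints the resulting JSON. The function name `build_bucket_policy` and the bucket name `my-example-bucket` are hypothetical and used only for illustration; the statement itself is copied from the policy shown above.

```python
import json

# Principal ARN copied from the policy shown above.
HCL_CDP_COPY_USER = "arn:aws:iam::821664204980:user/s3-copy"


def build_bucket_policy(bucket_name: str) -> dict:
    """Return the policy shown above with <s3-bucket-name> filled in."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowhclcdpUser",
                "Effect": "Allow",
                "Principal": {"AWS": HCL_CDP_COPY_USER},
                "Action": ["s3:PutObject", "s3:PutObjectAcl"],
                # Access is limited to objects under the hclcdp-logs/ prefix.
                "Resource": f"arn:aws:s3:::{bucket_name}/hclcdp-logs/*",
            }
        ],
    }


# "my-example-bucket" is a hypothetical bucket name.
policy = build_bucket_policy("my-example-bucket")
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

The printed JSON is what would typically be pasted into the bucket policy editor in the S3 console.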
Files are written to your bucket under the following path:
s3://{bucket}/hclcdp-logs/{source-id}/{received-day}/filename.gz
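To illustrate the path layout and the gzipped, newline-separated JSON format described above, here is a rough sketch using only the Python standard library. It builds a small file in memory rather than downloading from S3. The helper names (`object_uri`, `parse_batch`), the source ID `src_123`, and the date format used for the received day are assumptions, not values documented by HCL CDP.

```python
import gzip
import io
import json


def object_uri(bucket: str, source_id: str, received_day: str, filename: str) -> str:
    """Assemble an object URI following the path layout shown above."""
    return f"s3://{bucket}/hclcdp-logs/{source_id}/{received_day}/{filename}"


def parse_batch(raw_bytes: bytes) -> list:
    """Decode one gzipped, newline-separated JSON file into a list of dicts."""
    events = []
    with gzip.open(io.BytesIO(raw_bytes), mode="rt", encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:  # skip blank lines, e.g. a trailing newline
                events.append(json.loads(line))
    return events


# Build a small in-memory stand-in for a delivered file.
sample_events = [
    {"type": "track", "event": "otp_verified"},
    {"type": "page", "properties": {"url": "https://example.com/home"}},
]
sample_bytes = gzip.compress(
    "\n".join(json.dumps(e) for e in sample_events).encode("utf-8")
)

events = parse_batch(sample_bytes)
# The identifiers and date format below are hypothetical.
uri = object_uri("my-example-bucket", "src_123", "2024-01-15", "filename.gz")
print(uri, len(events))
```

In practice the bytes would come from an object downloaded from your bucket; the parsing step stays the same.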
Event Types
HCL CDP supports several event types, stored as JSON objects in AWS S3:
- Track Events: Track user actions with event-specific details.
{ "anonymousId": "viz_62200c405e6de", "event": "otp_verified", "type": "track", "properties": { "customerId": "1192900" } }
- Identify Events: Identify user attributes.
{ "anonymousId": "viz_62200ceef3fd6", "type": "identify", "traits": { "email": "user@example.com" } }
- Screen Events: Track app screen views.
{ "type": "screen", "event": "page_viewed", "context": { "app": { "name": "MyApp" } } }
- Page Events: Track website page views.
{ "type": "page", "properties": { "url": "https://example.com/home" } }
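As a sketch of how a consumer might tell these event types apart, the following Python reads the four sample events above as newline-separated JSON, counts them by their "type" field, and prints a one-line description of each. The `describe` helper is hypothetical, and the fields it reads are only those that appear in the samples on this page; real events may carry additional fields.

```python
import json
from collections import Counter

# The four sample events from this page, one JSON object per line.
SAMPLE_NDJSON = "\n".join([
    '{"anonymousId": "viz_62200c405e6de", "event": "otp_verified", '
    '"type": "track", "properties": {"customerId": "1192900"}}',
    '{"anonymousId": "viz_62200ceef3fd6", "type": "identify", '
    '"traits": {"email": "user@example.com"}}',
    '{"type": "screen", "event": "page_viewed", '
    '"context": {"app": {"name": "MyApp"}}}',
    '{"type": "page", "properties": {"url": "https://example.com/home"}}',
])


def describe(event: dict) -> str:
    """Return a short description based on the event's "type" field."""
    kind = event.get("type", "unknown")
    if kind == "track":
        return f"track: {event.get('event')}"
    if kind == "identify":
        return f"identify: {sorted(event.get('traits', {}))}"
    if kind == "screen":
        app = event.get("context", {}).get("app", {}).get("name")
        return f"screen: {event.get('event')} in {app}"
    if kind == "page":
        return f"page: {event.get('properties', {}).get('url')}"
    return f"unknown: {kind}"


events = [json.loads(line) for line in SAMPLE_NDJSON.splitlines() if line.strip()]
type_counts = Counter(e.get("type", "unknown") for e in events)
descriptions = [describe(e) for e in events]

for text in descriptions:
    print(text)
```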
FAQs
Q1. What connection does AWS S3 use?
AWS S3 is available in Cloud mode only, for both web and mobile. Device mode is not supported.
Q2. How do I configure AWS S3?
Log in to your Amazon AWS S3 console and then create a new S3 bucket or use an existing one.
Q3. How is the data processed in HCL CDP?
HCL CDP collects data in hourly batches, stores it in its S3 bucket, and securely copies it to your S3 bucket. There is no size limit.
Q4. How do I set up AWS S3 as a destination?
Navigate to Data Pipeline > Destinations, select Amazon S3 and follow the setup process.
Q5. How do I add a prefix path?
Add the prefix (e.g., AWS) before the bucket name in the Bucket Name text box. For example, with the bucket name AbcdS3, the full name becomes AWS AbcdS3.