DynamoDB
What is DynamoDB?
Amazon DynamoDB is a key-value and document database that delivers single-digit millisecond performance at any scale.
It’s a fully managed, multi-region, multi-active, durable database with built-in security,
backup and restore, and in-memory caching for internet-scale applications. - AWS
White paper published - 2007
First release - 2012
DynamoDB Features
- Fully Managed:
- No need to manage servers or software updates.
- Focus only on managing and using your data.
- Example: You don’t worry about server failures; AWS handles it.
- Fast:
- Delivers responses in single-digit milliseconds.
- Ideal for high-speed applications like real-time gaming leaderboards.
- Scalable:
- Maintains consistent performance as data and traffic grow.
- Can handle millions of requests per second without delays.
- Example: Social media apps that serve a growing user base.
- Durable and Highly Available:
- Data is automatically replicated across multiple servers for safety.
- Always available for writing new data.
- With eventual consistency, replicas typically converge within milliseconds.
- Example: Shopping cart updates in an e-commerce app.
- Cost-Effective:
- You pay primarily for the reads and writes you use.
- Throughput, not storage size, dominates the cost.
- Example: Small-scale apps with low traffic will incur low costs.
- Flexible and Schema-less:
- Only need to specify a unique primary key.
- Insert different types of data without worrying about rigid schemas.
- Can only retrieve data using primary keys or indices (scanning is slow).
- Example: A user profile table where some users have a "location" field while others don't.
AWS Value-added Services
- Provisioning of Hardware:
- DynamoDB is serverless. You don’t need to manage any physical servers or hardware.
- Example: You don’t worry about where the database runs; AWS handles it.
- Partitioning and Repartitioning:
- Data is automatically split (partitioned) across multiple servers for scalability.
- If servers are added or removed, DynamoDB re-partitions the data without downtime.
- Example: A large "Orders" table with 1 million entries will be split across multiple partitions.
- Replication:
- DynamoDB automatically creates copies of your data to ensure it is safe and durable.
- No need to manually set up or think about replication.
- Example: If a server in one data center fails, your data is still available from another replica.
- Scaling of Table and Throughput:
- DynamoDB supports auto-scaling by default, so it adjusts based on traffic volume.
- Tables grow or shrink automatically to meet demand without downtime.
- Example: During a flash sale, your "Products" table automatically scales to handle thousands of requests.
DynamoDB: When to Use and When to Avoid
- When to Use DynamoDB
- DynamoDB is great for applications that need simple and fast reads/writes on a large scale.
- It works well for use cases where access patterns are clear in advance.
- Suitable for storing data with a known primary key for efficient retrieval.
- When to Avoid DynamoDB
- Not for Complex Queries:
- DynamoDB is not ideal for OLAP (Online Analytical Processing) applications that need complex queries.
- For example, in a banking database analyzing customer behavior, where query types may evolve, DynamoDB is not a good fit.
- Reason: It doesn't support flexible queries on non-key attributes.
- Not for Large File Storage:
- Items are limited to 400 KB in size, so it’s not suitable for storing large files.
- Use AWS S3 to store files and save their links in DynamoDB instead.
- Example: For storing videos or images, upload them to S3 and keep their URLs in DynamoDB.
DynamoDB Basic Concepts
- Tables:
- Tables are where data is stored in DynamoDB.
- Multiple tables can exist in a DynamoDB database.
- Similar to relational databases, but DynamoDB tables are schema-less.
- No fixed column structure is required, and data can vary from one row to another.
- Each table has a primary key to ensure unique data entries.
- Items:
- Items are the individual records stored in a table, similar to rows in a relational database.
- Each item is uniquely identified by its primary key and consists of one or more attributes.
- Attributes:
- Attributes are the individual key-value pairs within an item.
- Attributes are similar to columns in relational databases, but flexible in DynamoDB.
- Each attribute has a name (key) and a value, and the value can be of different types (string, number, etc.).
- Example: "customer_name": "Alice" is an attribute with key "customer_name" and value "Alice".
- Primary Key:
- Uniquely identifies each item in the table; no two items can share the same primary key value.
- Example - Orders Table:
- Let's assume we have a table called "Orders" with the primary key as "order_id".
- Example item 1:
{"order_id": "12345", "customer_name": "Alice", "total_amount": 150}
- Example item 2:
{"order_id": "12346", "customer_name": "Bob", "total_amount": 200, "items": ["item1", "item2"]}
- In this example, the items in the table can have different attributes: one has an "items" list, while the other does not.
Data Types
- Scalar Types:
- String: Stores a string of any length (within the 400KB item size limit). If used as a partition key, the value must be at most 2048 bytes; as a sort key, at most 1024 bytes.
- Examples: "Manager", "Artist", "Read-only"
- Number: Stores any kind of number: positive, negative, integer, or decimal.
- Examples: 124, 42.34, -5634
- Boolean: Stores either true or false.
- Null: Represents a null value (unknown or undefined).
- Binary: Used to store binary data like images, files, or encrypted messages.
- Examples: Encrypted files, images (JPEG, PNG)
- Document Types:
- Map: A collection of key-value pairs (similar to JSON objects). The pairs are not ordered.
- Example:
{"name": "John", "age": 30}
- List: A collection of ordered values, which can be of different types. You can access elements by their index.
- Example:
[1, "apple", 3.14]
- Set Types:
- NumberSet: A set of unique numbers (no duplicates). Elements are of type Number.
- StringSet: A set of unique strings (no duplicates).
- Example:
{"apple", "banana", "orange"}
- BinarySet: A set of unique binary objects (no duplicates).
- Size Limitations:
- The maximum item size in DynamoDB is 400 KB (for all data types combined).
- For primary keys, a partition key value can be up to 2048 bytes, and a sort key value up to 1024 bytes.
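As a quick illustration (a hypothetical item using DynamoDB's low-level JSON type descriptors), a single item can mix several of these types; note that Number values are transmitted as strings and Binary values as base64:
{
  "user_id": {"S": "u-1001"},
  "age": {"N": "30"},
  "active": {"BOOL": true},
  "nickname": {"NULL": true},
  "avatar": {"B": "aGVsbG8="},
  "address": {"M": {"city": {"S": "Pune"}}},
  "scores": {"L": [{"N": "10"}, {"S": "pass"}]},
  "tags": {"SS": ["admin", "beta"]}
}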
Understanding DynamoDB Partitions
- How DynamoDB Distributes Data:
- DynamoDB uses consistent hashing to distribute data across multiple servers.
- The partition key determines how items are assigned to different servers.
- Balanced partition keys ensure even data distribution, avoiding overload on specific servers.
- Importance of Good Partition Keys:
- Partition key values should be spread uniformly (high cardinality) for efficient data distribution.
- If not evenly distributed, some servers may get overloaded while others remain underutilized.
- This imbalance slows down database performance and increases costs.
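A toy Python sketch of the idea (illustrative only; DynamoDB's internal hash function is not public):
import hashlib

def partition_for(key: str, num_partitions: int = 4) -> int:
    # Hash the partition key and map it onto one of the partitions.
    digest = hashlib.md5(key.encode('utf-8')).hexdigest()
    return int(digest, 16) % num_partitions

# Uniformly random keys spread evenly; a single hot key would overload one partition.
for user in ['user-1', 'user-2', 'user-3', 'user-4']:
    print(user, '-> partition', partition_for(user))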
Primary Key and Range Key
Why Do We Need a Primary Key?
- A primary key is like the key in a key-value store (like a hash map).
- It helps to uniquely identify and retrieve items from a database.
- The primary key decides where the data is saved and how it is retrieved.
Types of Primary Keys
- Single Key: Partition key (unique identifier for an item).
- Composite Key: Combination of partition key and sort key.
- In a composite key, the combination of both keys must be unique.
Why Use a Sort Key (Range Key)?
- A range key is optional but useful when:
- The partition key alone is not unique.
- More flexible access patterns are needed, like querying a range of data.
Example Table: Library Books
- Primary Key:
- Partition Key: genre
- Sort Key: author_year_published
- Items:
- {
  "genre": "Science Fiction",
  "author_year_published": "Asimov_1950",
  "title": "I, Robot",
  "copies_sold": 500000
}
- {
  "genre": "Fantasy",
  "author_year_published": "Tolkien_1954",
  "title": "The Lord of the Rings",
  "copies_sold": 150000000
}
What Can You Do with This Model?
- Get all books in a specific genre.
- Get all books in a genre by a specific author.
- Get all books in a genre by an author in a specific year.
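A minimal boto3 sketch of these access patterns (the table name LibraryBooks is hypothetical; keys as defined above):
import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource('dynamodb', region_name='us-east-1').Table('LibraryBooks')

# All books in a specific genre
r1 = table.query(KeyConditionExpression=Key('genre').eq('Science Fiction'))

# All books in a genre by a specific author (sort key prefix match)
r2 = table.query(
    KeyConditionExpression=Key('genre').eq('Science Fiction') &
                           Key('author_year_published').begins_with('Asimov_')
)

# All books in a genre by an author in a specific year
r3 = table.query(
    KeyConditionExpression=Key('genre').eq('Science Fiction') &
                           Key('author_year_published').eq('Asimov_1950')
)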
DynamoDB Capacity Units - Charges
How DynamoDB Charges
- DynamoDB charges primarily based on the frequency of reads and writes, not the storage used.
- Two options for managing table capacity:
- On-demand capacity:
- Automatically scales capacity up or down based on usage.
- No need to define read/write throughput upfront.
- Provisioned capacity:
- You specify the expected number of reads and writes per second.
- Useful for predictable workloads.
Write Capacity
- 1 Write Capacity Unit (WCU): Supports one write per second for an item of size 1KB.
- Items larger than 1KB:
- Divide item size by 1KB and round up. Example: 10KB item = 10 WCUs.
- Transactional writes:
- Require 2 WCUs for a 1KB item written in one second.
Read Capacity
- 1 Read Capacity Unit (RCU):
- Supports one strongly consistent read per second for an item of size 4KB.
- Or two eventually consistent reads per second for an item of size 4KB.
- Items larger than 4KB:
- Divide item size by 4KB and round up. Example: 15KB item = 4 RCUs for a strongly consistent read.
- For eventually consistent reads, it will need half the RCUs. Example: 15KB item = 2 RCUs.
Key Recommendations
- Use eventually consistent reads for lower costs if minor data delays are acceptable.
- In eventually consistent mode, writes typically propagate to all replicas within milliseconds.
Examples
- Write Example:
- Item size = 6KB → Needs 6 WCUs for one write per second.
- For a transactional write: Needs 12 WCUs.
- Read Example:
- Item size = 9KB → Needs 3 RCUs for a strongly consistent read.
- For eventually consistent read: Needs 1.5 RCUs (round up = 2).
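A small Python helper that applies these rounding rules (a sketch of the arithmetic above, not an AWS API):
import math

def wcus(item_kb: float, transactional: bool = False) -> int:
    # 1 WCU = one 1KB write per second; transactional writes cost double.
    units = math.ceil(item_kb / 1)
    return units * 2 if transactional else units

def rcus(item_kb: float, strongly_consistent: bool = True) -> int:
    # 1 RCU = one strongly consistent 4KB read per second,
    # or two eventually consistent 4KB reads per second.
    units = math.ceil(item_kb / 4)
    return units if strongly_consistent else math.ceil(units / 2)

print(wcus(6))                              # 6
print(rcus(9))                              # 3
print(rcus(9, strongly_consistent=False))   # 2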
Querying and Scanning in DynamoDB
Options for Reading Data
- Querying: Faster and more efficient way to retrieve data.
- Scanning: Slower but useful for specific use cases like copying entire tables.
Querying
- When the Table Has Only a Partition Key:
- Query one item at a time using the partition key.
- DynamoDB fetches the data by directly accessing the partition linked to the key.
- Example: Table with "userId" as the partition key. Query for a specific "userId" to get the user's details.
- When the Table Has a Partition Key and Sort Key:
- Query multiple items by specifying the partition key and a range for the sort key.
- Returns all items from the same partition that fall within the specified range.
- Results are returned in sorted order.
- Example: Table with "userId" (partition key) and "createdAt" (sort key). Query all items for a "userId" within a date range.
Scanning
- Reads data from the entire table, fetching up to 1 MB of data per request (larger results are paginated).
- Filters can be applied to narrow down results (filters are processed after data is read).
- Useful for small tables or copying all data.
- Example: Scan a "Products" table and filter items where "price" is greater than 100 (see the sketch below).
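A minimal boto3 sketch of that scan (Products is the table from the example; the filter is applied after the read, so the scanned capacity is still consumed):
import boto3
from boto3.dynamodb.conditions import Attr

table = boto3.resource('dynamodb', region_name='us-east-1').Table('Products')
response = table.scan(FilterExpression=Attr('price').gt(100))
print(response['Items'])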
Difference Between Query and Scan
- Query:
- Fetches items from the same partition.
- Faster and uses fewer read capacity units.
- Limited access patterns (requires partition key).
- Scan:
- Can fetch items from multiple partitions.
- Slower and consumes more read capacity units.
- Supports flexible access patterns (does not require partition key).
Perform DynamoDB Tasks in AWS
1. Find DynamoDB Service
- Log in to your AWS Management Console.
- In the AWS Management Console, search for "DynamoDB" in the search bar at the top of the page.
- Click on the DynamoDB service from the search results to open the DynamoDB Dashboard.
2. Create a Table
- In the DynamoDB Dashboard, click on the "Create table" button.
- Provide a table name, e.g., ExampleTable.
3. Add the Primary Key
- Under the "Primary key" section:
- Select the key type. For example, if you need a simple primary key, choose "Partition key".
- Enter the key name, e.g., UserID, and choose its data type (String, Number, or Binary).
- If you need a composite key, click "Add sort key" and provide the sort key name, e.g., OrderID, with its data type.
4. Set the Read/Write Capacity
- Choose the capacity mode:
- On-demand: Best for unpredictable workloads.
- Provisioned: Best for predictable workloads.
- If you choose provisioned capacity, set the "Read capacity units" and "Write capacity units". For example:
- Read capacity units: 5
- Write capacity units: 5
5. Clean Up
- Go to the DynamoDB Dashboard.
- Select the table you want to delete.
- Click on the "Actions" dropdown menu and choose "Delete table".
- Confirm the deletion by typing the table name when prompted.
Creating Table in DynamoDB
aws dynamodb create-table \
--table-name UserDetails \
--attribute-definitions \
AttributeName=userId,AttributeType=S \
--key-schema \
AttributeName=userId,KeyType=HASH \
--provisioned-throughput ReadCapacityUnits=1,WriteCapacityUnits=1
Creating Items in DynamoDB
1. Using the PutItem API
- Most commonly used method to insert items into DynamoDB.
- Requires the primary key to be specified for the item.
- If the primary key matches an existing item, the old item is replaced.
- Example:
{
"TableName": "UserDetails",
"Item": {
"userId": {"S": "12345"},
"name": {"S": "John Doe"},
"email": {"S": "johndoe@example.com"}
}
}
2. Creating Items from AWS Console
- Simple and manual way to add items using the AWS Management Console.
- Steps:
- Go to the AWS DynamoDB Console.
- Select the table where you want to insert the item.
- Click on the Create Item button.
- Not suitable for inserting large volumes of data.
3. Using the BatchWriteItem API
- Allows writing up to 25 items in a single batch request.
- Total size of the batch request should not exceed 16 MB.
- Each individual item should be smaller than 400 KB.
- Leverages parallel processing for faster data insertion.
- Useful for handling large datasets efficiently.
- Example:
{
"RequestItems": {
"UserDetails": [
{
"PutRequest": {
"Item": {
"userId": {"S": "12345"},
"name": {"S": "Alice"}
}
}
},
{
"PutRequest": {
"Item": {
"userId": {"S": "67890"},
"name": {"S": "Bob"}
}
}
}
]
}
}
Python Script to access DynamoDB
Perform the following tasks using Python:
- Create a table.
- Add data to the table.
- Query the table.
- Scan the table.
- Clean up.
1. Create a Table
- Use the boto3 library to create a DynamoDB table.
- Define table attributes such as partition key and sort key.
- Example:
import boto3
dynamodb = boto3.resource('dynamodb', region_name='us-east-1')
table = dynamodb.create_table(
TableName='Products',
KeySchema=[
{'AttributeName': 'productId', 'KeyType': 'HASH'} # Partition key
],
AttributeDefinitions=[
{'AttributeName': 'productId', 'AttributeType': 'S'}
],
ProvisionedThroughput={
'ReadCapacityUnits': 5,
'WriteCapacityUnits': 5
}
)
table.wait_until_exists()
print("Table created successfully!")
- Response: "Table created successfully!"
2. Add Data to the Table
- Use the put_item method to add items.
- Example:
table.put_item(
Item={
'productId': '101',
'name': 'Laptop',
'price': 1500,
'stock': 10
}
)
print("Item added successfully!")
- Response: "Item added successfully!"
3. Query the Table
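- Use the query method with a key condition on the partition key (a minimal sketch, reusing the table object from step 1).
- Example:
from boto3.dynamodb.conditions import Key
response = table.query(
    KeyConditionExpression=Key('productId').eq('101')
)
print(response['Items'])
- Response: a list containing the item with productId "101".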
4. Scan the Table
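- Use the scan method to read every item, up to 1 MB per request (a minimal sketch).
- Example:
response = table.scan()
print(f"Total items: {len(response['Items'])}")
- Response: all items currently in the table.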
5. Clean Up
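- Delete the table to avoid ongoing charges (a minimal sketch).
- Example:
table.delete()
table.wait_until_not_exists()
print("Table deleted successfully!")
- Response: "Table deleted successfully!"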
Python Script to insert bulk data into DynamoDB
- DynamoDB has a limit of 25 items per batch, and each batch cannot exceed 16 MB in size.
import boto3
import math
# Initialize DynamoDB resource
dynamodb = boto3.resource('dynamodb', region_name='us-east-1')
table = dynamodb.Table('Products') # Replace 'Products' with your table name
# Generate bulk data
bulk_data = [
{'productId': str(i), 'name': f'Product-{i}', 'price': 100 + i, 'stock': 50 + i}
for i in range(1, 501) # Example: Insert 500 items
]
# Function to insert bulk data
def insert_bulk_data(items, table, batch_size=25):
total_items = len(items)
num_batches = math.ceil(total_items / batch_size)
print(f"Total items: {total_items}, Batch size: {batch_size}, Total batches: {num_batches}")
for i in range(0, total_items, batch_size):
batch = items[i:i + batch_size]
with table.batch_writer() as batch_writer:
for item in batch:
batch_writer.put_item(Item=item)
print(f"Batch {i // batch_size + 1}/{num_batches} inserted successfully!")
# Call the function to insert data
insert_bulk_data(bulk_data, table)
Sample Input:
[
{"productId": "1", "name": "Product-1", "price": 101, "stock": 51},
{"productId": "2", "name": "Product-2", "price": 102, "stock": 52},
...
{"productId": "500", "name": "Product-500", "price": 600, "stock": 550}
]
Sample Output Logs
Total items: 500, Batch size: 25, Total batches: 20
Batch 1/20 inserted successfully!
Batch 2/20 inserted successfully!
...
Batch 20/20 inserted successfully!
Verification
Scan the table
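# Note: a single scan call returns at most 1 MB of data; for larger tables, paginate with LastEvaluatedKey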
response = table.scan()
print(f"Total items in the table: {len(response['Items'])}")
Expected Output
Total items in the table: 500
Secondary Index
Querying on the primary key is fast but supports only a limited set of access patterns.
A scan can serve any access pattern, but it is slow and costly.
A third option, a secondary index, is relatively cheap and provides additional access patterns.
| Reading Option | Access Patterns | Cost | Speed |
|---|---|---|---|
| Query on primary key | Limited | Low | High |
| Scan on table | Many (virtually all) | Very high | Slow |
| Query on secondary indexes | Moderate | Moderate | High |
- Secondary indexes allow additional ways to query data efficiently.
- Two types of secondary indexes:
- Global Secondary Index (GSI): Defines a new partition key and optional sort key.
- Local Secondary Index (LSI): Adds a new sort key but keeps the same partition key as the table.
Considerations for Secondary Indexes
- Secondary indexes increase read and write costs.
- Local Secondary Index (LSI):
- It provides strongly consistent reads.
- Must be created when the table is created, i.e. an LSI cannot be added to or deleted from an existing table.
- Uses the same read/write capacity units as the table.
- It must use the table's partition key; an alternate partition key cannot be chosen for an LSI.
- The size of an item collection is the total size of all the items in the table and the LSI.
- The item collection size for any given partition key should be less than 10 GB.
- Global Secondary Index (GSI):
- Can be created or deleted after the table is created.
- Requires separate provisioned throughput units.
- Has its own partition key.
- Like a table's primary key, the index key consists of a partition key (required) and an optional sort key.
Sample Data
- Table Name: MusicCollection
- Attributes:
- Artist: Partition Key
- Song: Sort Key
- Album: String
- Genre: String
- Year: Number
- Example Records:
- { "Artist": "The Beatles", "Song": "Hey Jude", "Album": "The White Album", "Genre": "Rock", "Year": 1968 }
- { "Artist": "The Beatles", "Song": "Let It Be", "Album": "Let It Be", "Genre": "Rock", "Year": 1970 }
- { "Artist": "Taylor Swift", "Song": "Shake It Off", "Album": "1989", "Genre": "Pop", "Year": 2014 }
- { "Artist": "Adele", "Song": "Hello", "Album": "25", "Genre": "Soul", "Year": 2015 }
1. Primary Index
- Partition Key: Artist
- Sort Key: Song
- Query Example: get all songs by a given artist, e.g. every song where Artist = "The Beatles".
2. Global Secondary Index (GSI)
- GSI Partition Key: Genre
- GSI Sort Key: Year
- Query Example: find songs in a genre within a year range, e.g. all "Rock" songs released after 1968.
3. Local Secondary Index (LSI)
- LSI Partition Key: Artist
- LSI Sort Key: Year
- Query Example: find an artist's songs by year, e.g. all songs by "The Beatles" from 1968 onward.
4. Scan
- Reads every item in the table regardless of keys, e.g. to export the entire collection.
Python Script for Secondary Index
Write a script for the following tasks:
- Create a table.
- Add items to the table.
- Check indexes and run a query.
- Run the global secondary index.
- Run the local secondary index.
- Clean up.
1. Create a Table
Python Script:
import boto3
dynamodb = boto3.client('dynamodb', region_name='us-east-1')
def create_table():
try:
response = dynamodb.create_table(
TableName='MusicCollection',
KeySchema=[
{'AttributeName': 'Artist', 'KeyType': 'HASH'}, # Partition key
{'AttributeName': 'Song', 'KeyType': 'RANGE'} # Sort key
],
AttributeDefinitions=[
{'AttributeName': 'Artist', 'AttributeType': 'S'},
{'AttributeName': 'Song', 'AttributeType': 'S'},
{'AttributeName': 'Genre', 'AttributeType': 'S'},
{'AttributeName': 'Year', 'AttributeType': 'N'}
],
ProvisionedThroughput={
'ReadCapacityUnits': 5,
'WriteCapacityUnits': 5
},
LocalSecondaryIndexes=[
    {
        'IndexName': 'ArtistYearIndex',  # used by query_lsi below; LSIs must be created with the table
        'KeySchema': [
            {'AttributeName': 'Artist', 'KeyType': 'HASH'},  # same partition key as the table
            {'AttributeName': 'Year', 'KeyType': 'RANGE'}    # alternate sort key
        ],
        'Projection': {'ProjectionType': 'ALL'}
    }
],
GlobalSecondaryIndexes=[
{
'IndexName': 'GenreYearIndex',
'KeySchema': [
{'AttributeName': 'Genre', 'KeyType': 'HASH'}, # GSI Partition key
{'AttributeName': 'Year', 'KeyType': 'RANGE'} # GSI Sort key
],
'Projection': {
'ProjectionType': 'ALL'
},
'ProvisionedThroughput': {
'ReadCapacityUnits': 5,
'WriteCapacityUnits': 5
}
}
]
)
print("Table created successfully.")
except Exception as e:
print(f"Error creating table: {e}")
create_table()
Sample Output:
Table created successfully.
2. Add Items to the Table
Python Script:
def add_item():
dynamodb = boto3.resource('dynamodb', region_name='us-east-1')
table = dynamodb.Table('MusicCollection')
response = table.put_item(
Item={
'Artist': 'The Beatles',
'Song': 'Hey Jude',
'Album': 'The White Album',
'Genre': 'Rock',
'Year': 1968
}
)
print("Item added:", response)
add_item()
Sample Output:
Item added: { 'ResponseMetadata': { 'HTTPStatusCode': 200 } }
3. Add Bulk Items with Batch Size = 25
Python Script:
def add_bulk_items():
dynamodb = boto3.resource('dynamodb', region_name='us-east-1')
table = dynamodb.Table('MusicCollection')
items = [
{'Artist': 'The Beatles', 'Song': f'Song {i}', 'Album': 'Album 1', 'Genre': 'Rock', 'Year': 1968 + i}
for i in range(1, 26)
]
with table.batch_writer() as batch:
for item in items:
batch.put_item(Item=item)
print("Bulk items added.")
add_bulk_items()
Sample Output:
Bulk items added.
4. Check Indexes and Run a Query
Python Script:
def query_table():
    from boto3.dynamodb.conditions import Key
    dynamodb = boto3.resource('dynamodb', region_name='us-east-1')
    table = dynamodb.Table('MusicCollection')
    response = table.query(
        KeyConditionExpression=Key('Artist').eq('The Beatles')
    )
print("Query result:", response['Items'])
query_table()
Sample Output:
Query result: [
{'Artist': 'The Beatles', 'Song': 'Hey Jude', 'Album': 'The White Album', 'Genre': 'Rock', 'Year': 1968},
...
]
5. Run the Global Secondary Index (GSI)
Python Script:
def query_gsi():
    from boto3.dynamodb.conditions import Key
    dynamodb = boto3.resource('dynamodb', region_name='us-east-1')
    table = dynamodb.Table('MusicCollection')
    response = table.query(
        IndexName='GenreYearIndex',
        KeyConditionExpression=Key('Genre').eq('Rock')
    )
print("GSI Query result:", response['Items'])
query_gsi()
Sample Output:
GSI Query result: [
{'Artist': 'The Beatles', 'Song': 'Hey Jude', 'Album': 'The White Album', 'Genre': 'Rock', 'Year': 1968},
...
]
6. Run the Local Secondary Index (LSI)
Python Script:
def query_lsi():
    from boto3.dynamodb.conditions import Key
    dynamodb = boto3.resource('dynamodb', region_name='us-east-1')
    table = dynamodb.Table('MusicCollection')
    response = table.query(
        IndexName='ArtistYearIndex',
        KeyConditionExpression=Key('Artist').eq('The Beatles') & Key('Year').gte(1968)
    )
print("LSI Query result:", response['Items'])
query_lsi()
Sample Output:
LSI Query result: [
{'Artist': 'The Beatles', 'Song': 'Hey Jude', 'Album': 'The White Album', 'Genre': 'Rock', 'Year': 1968},
...
]
7. Clean Up
Python Script:
def delete_table():
try:
response = dynamodb.delete_table(TableName='MusicCollection')
print("Table deleted successfully.")
except Exception as e:
print(f"Error deleting table: {e}")
delete_table()
Sample Output:
Table deleted successfully.
CloudWatch Monitoring for DynamoDB
The following technologies are available for logging and monitoring AWS services:
- CloudWatch alarms
- CloudWatch Logs
- CloudWatch Events
- Manual monitoring
Steps to Create a CloudWatch Alarm for DynamoDB Table
1. Choose the Metric
- Go to the AWS Management Console.
- Navigate to CloudWatch.
- In the CloudWatch Dashboard, click on Alarms in the left menu.
- Click on Create Alarm.
- Click on Select Metric.
- Choose DynamoDB from the list of AWS services.
- Select the metric you want to monitor, e.g., ConsumedReadCapacityUnits or ConsumedWriteCapacityUnits.
2. Define the Alarm Condition
- Set the threshold type (e.g., Static or Anomaly Detection).
- For example, you can monitor if the ConsumedReadCapacityUnits exceed a certain value, such as 1000.
- Choose a period for evaluation, e.g., 1 minute.
3. Configure Actions
- Choose the action to take when the alarm is triggered, such as sending a notification to an Amazon SNS topic.
- If you don't have an SNS topic, create one by entering a name and email address to receive alerts.
- Confirm the email subscription by checking your inbox and clicking on the confirmation link.
4. Add a Name and Review
- Enter a name for the alarm, e.g., DynamoDBReadCapacityAlarm.
- Review the details of the alarm configuration.
- Click Create Alarm.
Example: CloudWatch Alarm for DynamoDB Table
Use Case:
- Monitor the ConsumedReadCapacityUnits of a DynamoDB table named MusicCollection.
- Trigger an alarm if the read capacity exceeds 500 units in a 1-minute interval.
Sample Alarm Configuration:
- Metric: DynamoDB > By Table Name > MusicCollection > ConsumedReadCapacityUnits
- Threshold: Greater than 500 units
- Evaluation Period: 1 minute
- Actions: Notify via SNS topic DynamoDBAlarms
- Alarm Name: HighReadCapacityAlarm
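The same alarm can also be created programmatically; a hedged boto3 sketch (the SNS topic ARN is a placeholder to replace with your own):
import boto3

cloudwatch = boto3.client('cloudwatch', region_name='us-east-1')
cloudwatch.put_metric_alarm(
    AlarmName='HighReadCapacityAlarm',
    Namespace='AWS/DynamoDB',
    MetricName='ConsumedReadCapacityUnits',
    Dimensions=[{'Name': 'TableName', 'Value': 'MusicCollection'}],
    Statistic='Sum',
    Period=60,              # 1-minute evaluation period
    EvaluationPeriods=1,
    Threshold=500,
    ComparisonOperator='GreaterThanThreshold',
    AlarmActions=['arn:aws:sns:us-east-1:123456789012:DynamoDBAlarms']  # placeholder ARN
)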
Expected Notification:
Subject: AWS Notification - Alarm "HighReadCapacityAlarm" in ALARM
Body:
Alarm Name: HighReadCapacityAlarm
State: ALARM
Metric: ConsumedReadCapacityUnits
Threshold: Greater than 500 units
DynamoDB Table: MusicCollection
Verifying the Alarm
- Perform operations on the DynamoDB table to simulate a high read capacity.
- Monitor the alarm state in the CloudWatch Alarms console.
- When the condition is met, check your email for the notification.
- View CloudWatch Logs to analyze the DynamoDB table activity.
Cleaning Up
- Delete the CloudWatch alarm from the Alarms section in CloudWatch.
- Unsubscribe or delete the SNS topic if it's no longer needed.
Use Case - Instagram Stories using DynamoDB
We will design a DynamoDB database to store and manage the Instagram Stories feature. The project involves three primary steps:
- Creating the DynamoDB Database
- Populating the Database
- Retrieving and Managing Data
Create DynamoDB Table
import boto3
# Initialize DynamoDB client
dynamodb = boto3.resource('dynamodb')
# Create the DynamoDB table
table = dynamodb.create_table(
TableName='InstagramStories',
KeySchema=[
{
'AttributeName': 'user_id', # Partition Key
'KeyType': 'HASH'
},
{
'AttributeName': 'story_id', # Sort Key
'KeyType': 'RANGE'
}
],
AttributeDefinitions=[
    {
        'AttributeName': 'user_id',
        'AttributeType': 'S'
    },
    {
        'AttributeName': 'story_id',
        'AttributeType': 'S'
    }
    # Only key attributes are declared here. DynamoDB is schema-less, so
    # non-key attributes (story_type, timestamp, is_memory, likes_count) are
    # simply included on each item. Declaring them without using them in an
    # index would raise a ValidationException, and 'BOOL' is not a valid
    # AttributeDefinitions type (only 'S', 'N', and 'B' are allowed).
],
ProvisionedThroughput={
'ReadCapacityUnits': 5,
'WriteCapacityUnits': 5
}
)
print(f"Table {table.table_name} is being created...")
Populate the Table with Sample Data
from datetime import datetime  # needed for the timestamp below

# Add sample story data
def add_sample_story(user_id, story_id, story_type, title, description, is_memory, likes_count):
table.put_item(
Item={
'user_id': user_id,
'story_id': story_id,
'story_type': story_type,
'title': title,
'description': description,
'timestamp': str(datetime.now()),
'is_memory': is_memory,
'likes_count': likes_count
}
)
# Sample data for User U1 and U2
add_sample_story('U1', 'S1', 'image', 'Sunset at the beach', 'A beautiful sunset I captured while at the beach.', False, 0)
add_sample_story('U2', 'S2', 'video', 'Morning run', 'A short video of my morning run through the park.', True, 10)
Query the Data
def get_user_stories(user_id):
    from boto3.dynamodb.conditions import Key
    response = table.query(
        KeyConditionExpression=Key('user_id').eq(user_id)
    )
    return response['Items']
# Get stories for User U1
user_stories = get_user_stories('U1')
print(user_stories)
Automatic Cleanup for Expired Stories
import time
from datetime import datetime  # needed for strptime below
def remove_expired_stories():
current_time = time.time()
response = table.scan()
for item in response['Items']:
story_time = time.mktime(datetime.strptime(item['timestamp'], '%Y-%m-%d %H:%M:%S.%f').timetuple())
if current_time - story_time > 86400 and not item['is_memory']: # 86400 seconds = 24 hours
table.delete_item(
Key={
'user_id': item['user_id'],
'story_id': item['story_id']
}
)
print(f"Deleted expired story: {item['story_id']}")
# Remove expired stories
remove_expired_stories()
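Note: a scan-and-delete sweep works for small tables, but DynamoDB also has a built-in TTL feature that deletes expired items automatically at no extra cost. A sketch of enabling it (assumes each item carries a numeric epoch-seconds attribute named ttl, which this schema would need to add):
client = boto3.client('dynamodb')
client.update_time_to_live(
    TableName='InstagramStories',
    TimeToLiveSpecification={'Enabled': True, 'AttributeName': 'ttl'}
)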