Object Storage Classes
Explore the different object storage classes offered by S3 and their use cases.
S3 is a highly economical object storage service used for a wide variety of purposes, from website hosting and backups to data lakes and analytics. These use cases vary in access frequency and required latency, so using the same storage class for all of them is not only expensive but often impractical.
Amazon S3 storage classes offer a range of options to optimize costs, performance, and durability based on the specific needs of different data types and access patterns. These storage classes can be broadly categorized into general purpose, unknown or changing access patterns, high performance, infrequent access, and archival.
Let's learn how S3 helps us deal with these varied use cases.
Amazon S3 Standard
S3 Standard is a general-purpose storage class suitable for frequently accessed data that requires low latency and high throughput. It offers high durability, availability, and performance, making it ideal for a wide range of use cases, including website content, mobile applications, and data analytics.
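A minimal boto3 sketch of writing an object to S3 Standard follows; the bucket and key names are placeholders. S3 Standard is the default, so the StorageClass parameter is passed here only to make the choice explicit.

```python
import boto3

# Upload an object to S3 Standard. STANDARD is the default storage
# class, so passing it explicitly is optional.
s3 = boto3.client("s3")
s3.put_object(
    Bucket="example-bucket",    # placeholder bucket name
    Key="reports/latest.csv",   # placeholder object key
    Body=b"col1,col2\n1,2\n",
    StorageClass="STANDARD",
)
```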
Amazon S3 Express One Zone
S3 Express One Zone is a high-performance storage class built to deliver consistent single-digit-millisecond first-byte latency. It is ideal for frequently accessed, latency-sensitive applications, delivering data access up to 10 times faster, with request costs up to 50% lower, than S3 Standard.
When we create a standard S3 bucket, we select a Region, and the data is stored redundantly across a minimum of three Availability Zones. With S3 Express One Zone, we instead specify a single Availability Zone in which to create the bucket. It also uses a different kind of bucket, the Amazon S3 directory bucket, which supports hundreds of thousands of requests per second and gives S3 Express One Zone its high performance.
S3 Express One Zone offers high data durability, integrity, and security. The downside is that if the Availability Zone containing the S3 Express One Zone bucket is lost, the data in it is lost too, so it is necessary to take precautionary measures, such as keeping copies elsewhere.
This storage class integrates easily with other AWS services, such as Amazon SageMaker, Athena, and more. We can further enhance the performance of S3 Express One Zone by creating the compute resources in the same Availability Zone as the bucket.
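As a sketch, creating an S3 Express One Zone bucket means creating a directory bucket whose name embeds the Availability Zone ID. The names below are placeholders, and the call assumes a recent boto3 version that supports directory buckets.

```python
import boto3

# Create a directory bucket for S3 Express One Zone. Directory bucket
# names must follow the pattern <name>--<az-id>--x-s3; the AZ ID used
# here is a placeholder for the zone we choose.
s3 = boto3.client("s3", region_name="us-east-1")
s3.create_bucket(
    Bucket="example-express--use1-az5--x-s3",  # placeholder name and AZ ID
    CreateBucketConfiguration={
        "Location": {"Type": "AvailabilityZone", "Name": "use1-az5"},
        "Bucket": {"Type": "Directory", "DataRedundancy": "SingleAvailabilityZone"},
    },
)
```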
Infrequent Access
S3 provides two storage classes for infrequent access: S3 Standard-IA and S3 One Zone-IA.
Amazon S3 Standard-IA
The S3 Standard-Infrequent Access storage class is intended for data that is accessed less frequently but requires rapid access when needed. It offers the high throughput and low latency of S3 Standard at a lower storage cost, with a per-GB fee charged on retrieval. It's suitable for backups, disaster recovery, and long-term storage of infrequently accessed data.
Amazon S3 One Zone-IA
Amazon S3 One Zone-Infrequent Access, as the name suggests, combines the ideas behind S3 Standard-IA and S3 Express One Zone: it stores infrequently accessed data that still needs rapid access in a single Availability Zone. It has a lower per-GB storage cost than S3 Standard-IA. Thus, it is ideal for infrequently accessed data that does not require the resilience of multiple Availability Zones, such as secondary backup copies.
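Objects usually land in the IA classes through lifecycle rules rather than direct uploads. The sketch below, with a placeholder bucket and prefix, moves objects under backups/ to Standard-IA 30 days after creation.

```python
import boto3

# Lifecycle rule: transition objects under the backups/ prefix to
# Standard-IA once they are 30 days old.
s3 = boto3.client("s3")
s3.put_bucket_lifecycle_configuration(
    Bucket="example-bucket",  # placeholder bucket name
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "move-backups-to-ia",
                "Filter": {"Prefix": "backups/"},
                "Status": "Enabled",
                "Transitions": [{"Days": 30, "StorageClass": "STANDARD_IA"}],
            }
        ]
    },
)
```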
Archival
S3 offers the Glacier storage classes to provide high-performance, minimal-cost storage for archival purposes. They have the lowest storage costs but incur additional retrieval times and fees for accessing data. Glacier is further divided into three storage classes that cater to different storage durations and access patterns.
Amazon S3 Glacier Instant Retrieval
S3 Glacier Instant Retrieval is purpose-built for data that is rarely accessed but requires retrieval in milliseconds. It provides the same throughput as the Standard-IA storage class at a considerably lower storage cost for data accessed about once a quarter, for example, user-generated archives. Objects are billed for a minimum storage duration of 90 days.
Amazon S3 Glacier Flexible Retrieval
S3 Glacier Flexible Retrieval provides storage at up to 10% lower cost than Instant Retrieval for data accessed once or twice a year. It is typically used for data that does not require rapid access but benefits from retrieving large amounts of data at no cost via bulk retrievals, such as disaster recovery data. It offers three retrieval options: expedited (1–5 minutes), standard (3–5 hours), and bulk (5–12 hours). In simple terms, this storage class offers a fine balance between cost and access time. Objects are billed for a minimum storage duration of 90 days.
Amazon S3 Glacier Deep Archive
Deep Archive is the lowest-cost storage class and is designed for long-term retention. It is typically used by organizations and enterprises that must retain data for regulatory compliance. Retrieval takes up to 12 hours with the standard option or up to 48 hours with bulk. Objects are billed for a minimum storage duration of 180 days.
One important thing to note is that objects in S3 Glacier Flexible Retrieval and S3 Glacier Deep Archive are not available in real time. To access them, we must restore a temporary copy, which remains available only for the period we specify in the restore request.
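The sketch below shows such a restore request for an object in S3 Glacier Flexible Retrieval; the bucket and key are placeholders. The temporary copy stays available for the number of days given in the request.

```python
import boto3

# Restore a temporary copy of an archived object using the standard
# retrieval tier (typically 3-5 hours for Flexible Retrieval).
s3 = boto3.client("s3")
s3.restore_object(
    Bucket="example-bucket",           # placeholder bucket name
    Key="archives/2020-logs.tar.gz",   # placeholder object key
    RestoreRequest={
        "Days": 7,  # keep the restored copy available for 7 days
        "GlacierJobParameters": {"Tier": "Standard"},
    },
)
```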
Economics of archival storage
When transitioning data from S3 Standard to S3 Glacier Deep Archive, we should weigh two costs: the upfront cost of the transition requests and the storage cost we save after the transition.
Therefore, the most important factor in deciding whether a transition is cost-effective is the point at which the accumulated savings break even with the upfront cost. The graph below, with object size on the x-axis and retention period on the y-axis, can help us understand.
We have the following four cases:
When both the object size and the retention period are small, it's not worthwhile to transition data to Deep Archive because the cost of the transition requests exceeds what we save while the data is retained.
When the object size is small but the retention period is long, transitioning to the archive can be worthwhile if the savings accumulated over the retention period offset the transition cost.
Transition requests are charged per object, not per GB, so a large object costs the same to transition as a small one and is therefore far cheaper to transition per GB. Consequently, even when the retention period is short, a sufficiently large object can break even on its transition cost, making the move to the archive highly worthwhile.
If both the object size and the retention period are large, transitioning the data to Deep Archive is ideal.
In general, moving data from Standard storage to Deep Archive pays off when the data is large and rarely accessed, such as backup data.
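The sketch below makes the break-even point concrete. The prices are illustrative assumptions, not current AWS pricing: a transition request fee charged per object and a per-GB-month storage saving.

```python
# Illustrative break-even estimate for moving an object from S3 Standard
# to S3 Glacier Deep Archive. All prices are example values only.
STANDARD_GB_MONTH = 0.023        # assumed S3 Standard price ($/GB-month)
DEEP_ARCHIVE_GB_MONTH = 0.00099  # assumed Deep Archive price ($/GB-month)
TRANSITION_PER_1000 = 0.05       # assumed transition cost ($/1,000 requests)

def break_even_months(object_size_gb: float) -> float:
    """Months of retention before storage savings cover the one-time
    per-object transition cost."""
    transition_cost = TRANSITION_PER_1000 / 1000
    monthly_saving = object_size_gb * (STANDARD_GB_MONTH - DEEP_ARCHIVE_GB_MONTH)
    return transition_cost / monthly_saving

# Small objects take months to pay off; large objects break even almost
# immediately, which is why object size dominates the decision.
for size_gb in (0.001, 0.1, 10):
    print(f"{size_gb} GB object: break-even after {break_even_months(size_gb):.2f} months")
```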
Amazon S3 Intelligent-Tiering
S3 Intelligent-Tiering is designed to optimize storage costs by automatically moving objects between three access tiers: Frequent Access, Infrequent Access, and Archive Instant Access. These tiers have the same low latency and high-throughput performance as S3 Standard.
S3 Intelligent-Tiering monitors access patterns and moves objects that haven't been accessed for 30 consecutive days to the Infrequent Access tier and, after 90 consecutive days without access, to the Archive Instant Access tier.
In addition, we can optionally configure archive tiers backed by S3 Glacier Flexible Retrieval and S3 Glacier Deep Archive: objects can be moved automatically to the Archive Access tier after 90–700+ days without access and to the Deep Archive Access tier after 180–700+ days.
S3 Intelligent-Tiering moves data from one tier to another without impacting performance, incurring retrieval charges, or causing operational overhead. It also automatically moves objects back to the Frequent Access tier if their access frequency increases. It charges a small monthly per-object fee for monitoring and automation.
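The optional archive tiers are enabled per bucket. A sketch with a placeholder bucket name and configuration ID:

```python
import boto3

# Enable the optional Archive Access and Deep Archive Access tiers of
# S3 Intelligent-Tiering for a bucket.
s3 = boto3.client("s3")
s3.put_bucket_intelligent_tiering_configuration(
    Bucket="example-bucket",  # placeholder bucket name
    Id="archive-config",      # placeholder configuration ID
    IntelligentTieringConfiguration={
        "Id": "archive-config",
        "Status": "Enabled",
        "Tierings": [
            {"Days": 90, "AccessTier": "ARCHIVE_ACCESS"},        # Glacier Flexible Retrieval-backed
            {"Days": 180, "AccessTier": "DEEP_ARCHIVE_ACCESS"},  # Deep Archive-backed
        ],
    },
)
```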
Comparison of storage classes
The table below summarizes the important values for comparing the different storage options. Memorizing these numbers is not important from the exam's perspective. However, understanding them can help in choosing the right option.
|  | S3 Standard | S3 Intelligent-Tiering | Standard-IA | One Zone-IA | Glacier Instant Retrieval | Glacier Flexible Retrieval | Glacier Deep Archive |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Minimum storage duration charge | None | None | 30 days | 30 days | 90 days | 90 days | 180 days |
| Minimum billable object size | None | None | 128 KB | 128 KB | 128 KB | 40 KB | 40 KB |
| Retrieval fee | None | None | Per GB retrieved | Per GB retrieved | Per GB retrieved | Per GB retrieved | Per GB retrieved |
S3 Outposts
AWS Outposts is a fully managed service from Amazon Web Services (AWS) that extends AWS infrastructure, services, APIs, and tools to customer premises or co-location facilities. It allows customers to run AWS compute, storage, database, and other services locally, ensuring low-latency access to data and applications while meeting specific data residency, compliance, and latency requirements.
Some enterprises need to keep their data on premises for multiple reasons, such as latency sensitivity, data residency regulations, and local data processing. AWS caters to these needs by bringing S3 to Outposts.
Amazon S3 on Outposts delivers object storage in the on-premises Outposts environment. Outposts provides a single S3 storage class, called OUTPOSTS, which stores data redundantly across multiple servers on the Outpost. To keep the user experience consistent, S3 on Outposts offers the same API calls, automation, and security features as S3 in AWS Regions. It also comes in multiple sizes tailored to common use cases.
S3 on Outposts also supports storing EBS and RDS backups on the Outpost. These features bring data closer to latency-sensitive, high-performance applications.
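Because the APIs match, writing to S3 on Outposts looks like a regular put_object call addressed through an S3 on Outposts access point ARN. The account ID, Outpost ID, and access point name below are placeholders.

```python
import boto3

# Write an object to S3 on Outposts by passing the access point ARN
# where a bucket name would normally go.
s3 = boto3.client("s3")
s3.put_object(
    Bucket=(
        "arn:aws:s3-outposts:us-east-1:123456789012:outpost/"
        "op-0123456789abcdef0/accesspoint/example-ap"  # placeholder ARN
    ),
    Key="local-data/record.json",
    Body=b"{}",
)
```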