Back-of-the-envelope estimation involves making quick, rough calculations to estimate system capacity or performance early in the design process. Though informal, it can offer valuable insights into whether a system can meet its expected demands.

Why Use Back-of-the-Envelope Estimation?

Early Assessment: Helps in assessing the feasibility of a system design before committing significant resources.
Quick Decision Making: Facilitates faster decision-making by providing rough estimates that guide further exploration.
Initial Budgeting: Assists in early budgeting by estimating the resources needed, such as computing power, storage, and network bandwidth.

Prerequisites for Effective Estimation

Before performing a back-of-the-envelope estimation, it's essential to gather key information that will impact the accuracy of your calculations.

1. Understand the System Architecture

Components Involved: Identify the core components of the system, such as servers, databases, caches, and load balancers.
Interactions Between Components: Understand how these components interact, including the data flow and communication patterns.

2. Identify the Critical Metrics

Throughput: The number of transactions or requests the system must handle per second, minute, or hour.
Latency: The acceptable delay between a request and the response.
Storage Requirements: The amount of data that needs to be stored, including current data and projected growth.
Bandwidth: The data transfer rate required between different components or between the system and its users.

3. Determine the User Behavior

Number of Users: Estimate the total number of users who will access the system.
User Activity Patterns: Understand when users are most active and how they interact with the system (e.g., peak usage times, frequency of requests).
Request Size and Frequency: Estimate the size of requests and the frequency at which they are made.

4. Consider the Data Size

Average Data Size per Transaction: Estimate the size of data processed in each transaction, such as request size, response size, and any associated metadata.
Total Data Size: Calculate the total volume of data the system needs to manage, including raw data, metadata, logs, and backups.

5. Estimate the Processing Power

CPU Requirements: Estimate the number of CPU cycles required to process each request.
Memory Requirements: Assess how much memory is needed to handle data processing and caching.

6. Evaluate Network Requirements

Network Traffic: Calculate the expected amount of network traffic between different components and between users and the system.
Bandwidth Allocation: Determine the bandwidth needed to ensure smooth data transfer without bottlenecks.

7. Plan for Scalability

Load Balancing: Consider how the system will distribute traffic across multiple servers.
Horizontal vs. Vertical Scaling: Decide whether to scale out (add more servers) or scale up (enhance existing server capacity).
Redundancy and Fault Tolerance: Account for the need for backup systems to ensure high availability.

Power of Two Table: From Byte to Petabyte

Unit	Power of 2	Bytes	Conversion	Description
Byte (B)	2⁰	1	-	The smallest unit of digital information storage.
Kilobyte (KB)	2¹⁰	1,024	1 KB = 2¹⁰ B	1,024 bytes, often used to measure small text files or low-resolution images.
Megabyte (MB)	2²⁰	1,048,576	1 MB = 2¹⁰ KB	1,024 kilobytes, commonly used to measure medium-sized files like images or MP3s.
Gigabyte (GB)	2³⁰	1,073,741,824	1 GB = 2¹⁰ MB	1,024 megabytes, used for larger files such as videos or software applications.
Terabyte (TB)	2⁴⁰	1,099,511,627,776	1 TB = 2¹⁰ GB	1,024 gigabytes, often used for data storage in hard drives and databases.
Petabyte (PB)	2⁵⁰	1,125,899,906,842,624	1 PB = 2¹⁰ TB	1,024 terabytes, typically used in large-scale data centers and cloud storage.
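The conversions in the table can be checked with a few lines of Python. This is a minimal sketch that follows the table's binary convention (1 KB = 2¹⁰ B); the helper names `unit_bytes` and `to_unit` are illustrative and not part of the original text.

```python
# Binary unit sizes as powers of two, in the same order as the table.
UNITS = ["B", "KB", "MB", "GB", "TB", "PB"]


def unit_bytes(unit: str) -> int:
    """Return the number of bytes in one `unit` (1 KB = 2**10 B, 1 MB = 2**20 B, ...)."""
    return 2 ** (10 * UNITS.index(unit))


def to_unit(num_bytes: int, unit: str) -> float:
    """Convert a raw byte count into the given unit."""
    return num_bytes / unit_bytes(unit)


if __name__ == "__main__":
    for unit in UNITS:
        print(f"1 {unit} = {unit_bytes(unit):,} bytes")
    # 5 GB expressed in MB
    print(to_unit(5 * unit_bytes("GB"), "MB"))
```

For quick mental math, each step up the table is roughly a factor of 1,000, which is usually accurate enough for a rough estimate.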

Latency Numbers Every Programmer Should Know

Time Unit	Seconds (s)	Microseconds (μs)	Nanoseconds (ns)
1 Nanosecond	10^-9 s	10^-3 μs	1 ns
1 Microsecond	10^-6 s	1 μs	1,000 ns
1 Millisecond	10^-3 s	1,000 μs	1,000,000 ns

Operation	Latency	Seconds (Power of 10)	Description
L1 Cache Access	~0.5 ns	5 × 10^-10 s	Accessing data from the L1 cache, the fastest and closest storage to the CPU cores.
L2 Cache Access	~7 ns	7 × 10^-9 s	Accessing data from the L2 cache, which is slightly slower but larger than L1 cache.
RAM Access	~100 ns	10^-7 s	Accessing data from the main memory (RAM), which is slower than the CPU caches.
SSD Random Read	~150 μs	1.5 × 10^-4 s	Random read access from an SSD, faster than HDD but slower than RAM.
HDD Random Read	~10 ms	10^-2 s	Random read access from a traditional hard disk drive (HDD), significantly slower than SSD.
Network Round Trip (within data center)	~500 μs	5 × 10^-4 s	Time taken for a round trip within a data center, typically involving multiple switches.
Network Round Trip (between data centers)	~150 ms	1.5 × 10^-1 s	Time taken for a round trip between geographically distant data centers.
Read 1 MB from SSD	~1 ms	10^-3 s	Time to read 1 MB of sequential data from an SSD; sequential reads cost less per byte than random reads.
Read 1 MB from HDD	~20 ms	2 × 10^-2 s	Time to read 1 MB of sequential data from an HDD, slower than SSDs.
Send 1 KB over 1 Gbps Network	~10 μs	10^-5 s	Time to send 1 KB of data over a 1 Gbps network, not including additional network overhead.
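One way to use these figures is to add up the operations on a request path. The sketch below stores the latencies from the table in nanoseconds and sums a hypothetical path; the dictionary keys, the function name `path_latency_ms`, and the example path are assumptions made for illustration only.

```python
# Approximate latencies from the table above, expressed in nanoseconds.
LATENCY_NS = {
    "l1_cache": 0.5,
    "l2_cache": 7,
    "ram": 100,
    "ssd_random_read": 150_000,
    "hdd_random_read": 10_000_000,
    "dc_round_trip": 500_000,
    "cross_dc_round_trip": 150_000_000,
    "ssd_read_1mb": 1_000_000,
    "hdd_read_1mb": 20_000_000,
    "send_1kb_1gbps": 10_000,
}


def path_latency_ms(steps):
    """Sum the latency of (operation, count) pairs and return milliseconds."""
    total_ns = sum(LATENCY_NS[name] * count for name, count in steps)
    return total_ns / 1_000_000


# Hypothetical request: one round trip inside the data center,
# one random SSD read, then a 1 MB sequential SSD read.
example_ms = path_latency_ms([
    ("dc_round_trip", 1),
    ("ssd_random_read", 1),
    ("ssd_read_1mb", 1),
])

if __name__ == "__main__":
    print(example_ms)  # 1.65
```

The sum ignores queuing, contention, and any overlap between steps, so it is best read as a rough lower bound for a strictly sequential path.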

Twitter QPS and Storage Estimation

Assumptions

300 million monthly active users
50% of users use Twitter daily
Users post 2 tweets per day on average
10% of tweets contain media
Data is stored for 5 years

QPS Estimate

Description	Calculation	Result
Daily Active Users (DAU)	300 million * 50%	150 million
Tweets QPS	150 million * 2 tweets / 24 hours / 3600 seconds	~3500
Peak QPS	2 * Tweets QPS (assuming peak traffic is twice the average)	~7000
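The QPS arithmetic can be reproduced directly. This is a small sketch using the assumptions listed above; the variable names are illustrative, and the peak factor of 2 is the multiplier used in the table.

```python
# Assumptions taken from the list above.
MONTHLY_ACTIVE_USERS = 300_000_000
DAILY_ACTIVE_RATIO = 0.5          # 50% of users use Twitter daily
TWEETS_PER_USER_PER_DAY = 2
PEAK_FACTOR = 2                   # peak traffic assumed to be twice the average
SECONDS_PER_DAY = 24 * 3600

daily_active_users = MONTHLY_ACTIVE_USERS * DAILY_ACTIVE_RATIO
tweets_qps = daily_active_users * TWEETS_PER_USER_PER_DAY / SECONDS_PER_DAY
peak_qps = PEAK_FACTOR * tweets_qps

if __name__ == "__main__":
    print(f"DAU:        {daily_active_users:,.0f}")
    print(f"Tweets QPS: {tweets_qps:,.0f}")   # rounds up to ~3500 in the table
    print(f"Peak QPS:   {peak_qps:,.0f}")     # rounds up to ~7000 in the table
```

The exact results are about 3,472 and 6,944; the table rounds them because the inputs are only estimates to begin with.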

Media Storage Estimation

Description	Calculation	Result
Size of a Tweet with Media	tweet_id (64 bytes) + text (140 bytes) + media (1 MB)	~1 MB
Daily Media Storage	150 million * 2 tweets * 10% * 1 MB	30 TB per day
5-Year Media Storage	30 TB * 365 days * 5 years	~55 PB
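The storage figures follow the same pattern. The sketch below assumes decimal units (1 TB = 1,000,000 MB, 1 PB = 1,000 TB), which is what the rounded table values imply; the constant names are illustrative.

```python
# Inputs from the assumptions and the QPS table above.
DAILY_ACTIVE_USERS = 150_000_000
TWEETS_PER_USER_PER_DAY = 2
MEDIA_FRACTION = 0.10             # 10% of tweets contain media
MEDIA_SIZE_MB = 1                 # tweet_id and text are negligible next to 1 MB
RETENTION_YEARS = 5

# Decimal unit conversions, chosen for easy mental math.
MB_PER_TB = 1_000_000
TB_PER_PB = 1_000

media_tweets_per_day = DAILY_ACTIVE_USERS * TWEETS_PER_USER_PER_DAY * MEDIA_FRACTION
daily_media_tb = media_tweets_per_day * MEDIA_SIZE_MB / MB_PER_TB
five_year_media_pb = daily_media_tb * 365 * RETENTION_YEARS / TB_PER_PB

if __name__ == "__main__":
    print(f"Daily media storage:  {daily_media_tb:.0f} TB")
    print(f"5-year media storage: {five_year_media_pb:.2f} PB")
```

With binary units the totals would be a few percent lower, which is well within the error of the input assumptions.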

Estimation of Image Results Page Generation

Assumptions:

Time per image generation: 100 ms (0.1 seconds)
Number of thumbnails: 30

Serial Processing:

In serial processing, each thumbnail is generated one after the other. The total time is the sum of the time taken to generate each thumbnail.

Total Time (Serial) = Time per Image * Number of Images

Total Time (Serial) = 100 ms * 30 = 3000 ms = 3 seconds

Parallel Processing:

In parallel processing, all thumbnails are generated simultaneously. Assuming there are enough workers to run all 30 at once and that coordination overhead is negligible, the total time is roughly the time taken to generate a single thumbnail.

Total Time (Parallel) = Time per Image

Total Time (Parallel) = 100 ms = 0.1 seconds

Estimation Summary:

Processing Method	Total Time
Serial Processing	3 seconds
Parallel Processing	0.1 seconds
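The two cases above are the extremes of a more general estimate with a limited number of workers. The sketch below assumes every image takes the same time and that work is done in full batches; the function name `total_time_ms` and the `workers` parameter are illustrative and not part of the original example.

```python
import math


def total_time_ms(num_images: int, time_per_image_ms: float, workers: int = 1) -> float:
    """Estimate total generation time when `workers` images are processed at once.

    Assumes identical per-image time and no scheduling overhead.
    """
    if workers < 1:
        raise ValueError("workers must be at least 1")
    batches = math.ceil(num_images / workers)
    return batches * time_per_image_ms


if __name__ == "__main__":
    print(total_time_ms(30, 100, workers=1))    # serial: 3000 ms
    print(total_time_ms(30, 100, workers=30))   # fully parallel: 100 ms
    print(total_time_ms(30, 100, workers=8))    # 4 batches: 400 ms
```

With 8 workers, for example, the 30 thumbnails need 4 batches, so the estimate is 400 ms rather than the ideal 100 ms.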