Advanced: Decentralized Solution
This solution is better suited for large-scale platforms with multiple geographically distributed warehouses, and for high-traffic environments. Here, each warehouse has its own independent inventory node.
High-level architecture:
Inventory Service:
This service is responsible for maintaining accurate stock levels across multiple warehouses, and it should support distributed transactions for multi-warehouse operations.
Race conditions: To avoid overselling and race conditions, it implements optimistic concurrency control. Databases rely on concurrency control to prevent conflicting transactions from committing at the same time. There are two major concurrency control techniques:
Pessimistic approach
Optimistic approach
In the pessimistic approach, the database takes a lock (on a record, table, or file) to prevent subsequent transactions from writing to the same record. The main disadvantage of this approach is resource usage, which increases storage cost and latency; it can also result in deadlocks. It is well suited for banking applications, where a user must not be allowed to perform transactions in two or more places at the same time.
The optimistic approach, on the other hand, assumes conflicts are rare, so it allows concurrent access and checks for conflicts at the end. If the data is unchanged, the commit succeeds; if there is a conflict, the transaction is rolled back quickly. It uses no locks and simply verifies data integrity at commit time. This is well suited for read-heavy systems and improves performance.
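The optimistic approach can be sketched with a version number per record: read without locking, then verify the version at commit time. This is an illustrative in-memory sketch (the class and method names are hypothetical); a real system would issue the compare-and-swap inside the database, e.g. `UPDATE ... WHERE version = ?`.

```python
class ConflictError(Exception):
    pass

class InventoryStore:
    def __init__(self):
        # sku -> (stock, version)
        self.rows = {"sku-1": (10, 0)}

    def read(self, sku):
        return self.rows[sku]

    def write(self, sku, new_stock, expected_version):
        stock, version = self.rows[sku]
        if version != expected_version:
            # Another transaction committed first: the caller must
            # retry with fresh data (the "fast rollback" case).
            raise ConflictError(f"{sku}: expected v{expected_version}, found v{version}")
        self.rows[sku] = (new_stock, version + 1)

def reserve(store, sku, qty):
    # Read without locking, then verify the version at commit time.
    stock, version = store.read(sku)
    if stock < qty:
        raise ValueError("insufficient stock")
    store.write(sku, stock - qty, expected_version=version)

store = InventoryStore()
reserve(store, "sku-1", 3)
print(store.read("sku-1"))  # (7, 1)
```

A second writer that read version 0 before the commit above would now fail its `write` and retry, which is exactly how overselling is avoided without holding a lock.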
Order Service:
This service ensures atomic order placement with distributed sagas or two-phase commit protocols.
What are distributed sagas?
In distributed systems, business transactions spanning multiple services require a mechanism to ensure data consistency across services. The Saga Pattern can help ensure that the overall transaction is either completed successfully or is rolled back to its initial state if any individual microservice encounters an error.
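The saga idea can be shown in a few lines: each step is paired with a compensating action, and on failure the completed steps are undone in reverse order. This is a minimal sketch under assumed step names, not a production saga orchestrator.

```python
def run_saga(steps):
    # steps: list of (action, compensate) pairs.
    done = []
    try:
        for action, compensate in steps:
            action()
            done.append(compensate)
    except Exception:
        # Roll back only the steps that completed, in reverse order.
        for compensate in reversed(done):
            compensate()
        return False
    return True

def fail_payment():
    raise RuntimeError("payment failed")

log = []
ok = run_saga([
    (lambda: log.append("reserve stock"), lambda: log.append("release stock")),
    (fail_payment,                        lambda: log.append("refund")),
])
print(ok, log)  # False ['reserve stock', 'release stock']
```

Note that the failed step's own compensation never runs; only already-completed steps are compensated, which is what returns the overall transaction to its initial state.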
The order service is also responsible for maintaining eventual consistency between the inventory service and the payment service using the distributed outbox pattern.
Outbox Pattern:
This is a design pattern used in distributed systems to ensure reliable communication and eventual consistency between services. It helps avoid the issues that arise from dual writes, where a single operation involves both a database write and a message or event notification.
The main components of this pattern are the outbox table, the message, and the message dispatcher.
How does it work ?
Customer places the order
Order service calls the inventory service for a stock update
The inventory service reduces the stock in the database and adds a message to the outbox table, in the same transaction
The message dispatcher reads the outbox table and publishes the message
The order service and other services consume the event to maintain consistency. Other services, such as the notification service, can consume it to alert customers when stock is low.
The benefits of this pattern: it avoids dual writes, ensures eventual consistency, and improves reliability.
Outbox table: A dedicated table in the same database as the application data, used to temporarily store messages that need to be sent to other services or systems.
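The steps above can be sketched end to end. Here sqlite3 stands in for the service's database and a plain list stands in for the broker (Kafka/RabbitMQ); the table and column names are illustrative assumptions.

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE inventory (sku TEXT PRIMARY KEY, stock INTEGER)")
db.execute("CREATE TABLE outbox (id INTEGER PRIMARY KEY, payload TEXT, published INTEGER DEFAULT 0)")
db.execute("INSERT INTO inventory VALUES ('sku-1', 10)")
db.commit()

def reduce_stock(sku, qty):
    # One local transaction covers both writes, so the stock update and
    # the outbox insert succeed or fail together -- no dual write.
    with db:
        db.execute("UPDATE inventory SET stock = stock - ? WHERE sku = ?", (qty, sku))
        db.execute("INSERT INTO outbox (payload) VALUES (?)",
                   (json.dumps({"event": "StockReduced", "sku": sku, "qty": qty}),))

broker = []  # stand-in for Kafka/RabbitMQ

def dispatch():
    # The message dispatcher polls the outbox and publishes unsent rows.
    rows = db.execute("SELECT id, payload FROM outbox WHERE published = 0").fetchall()
    for row_id, payload in rows:
        broker.append(json.loads(payload))
        db.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    db.commit()

reduce_stock("sku-1", 2)
dispatch()
print(broker)  # [{'event': 'StockReduced', 'sku': 'sku-1', 'qty': 2}]
```

If the process crashes after the transaction but before publishing, the message is still safe in the outbox table and the dispatcher picks it up on the next poll, which is where the eventual consistency comes from.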
Notification Service:
This is responsible for sending notifications to customers.
It uses a decoupled, event-driven architecture.
It supports notifications via multiple channels.
Analytics and Reporting:
Leverages ELK stack (Elasticsearch, Logstash, Kibana) for real-time inventory insights.
Uses Apache Flink or Kafka Streams for near real-time aggregation and anomaly detection.
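The core of such a Flink or Kafka Streams job is a windowed aggregation over the event stream. This plain-Python sketch (the window size, threshold, and event shape are all assumptions) groups stock-change events into 60-second tumbling windows and flags a window whose total exceeds a threshold as an anomaly:

```python
from collections import defaultdict

WINDOW = 60        # seconds per tumbling window
THRESHOLD = 100    # flag windows where total units sold exceeds this

def aggregate(events):
    # events: iterable of (timestamp_seconds, sku, qty_sold).
    windows = defaultdict(int)  # (sku, window_start) -> units sold
    for ts, sku, qty in events:
        windows[(sku, ts - ts % WINDOW)] += qty
    return windows

def anomalies(windows):
    return [key for key, total in windows.items() if total > THRESHOLD]

events = [(5, "sku-1", 40), (30, "sku-1", 70), (65, "sku-1", 10)]
print(anomalies(aggregate(events)))  # [('sku-1', 0)] -- 110 units in the first window
```

A real streaming job would do the same grouping incrementally with watermarks and state backends rather than over a finished list.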
Worker Service: This is responsible for managing incoming shipments asynchronously. It is integrated with external APIs and uses idempotent operations to avoid duplicate shipments.
Idempotent Operations:
In the happy path everything works as expected, but what happens if the network fails, a service fails, or the same message is delivered again and again?
There are several techniques to handle this:
Deduplication using idempotency keys: Assign a unique key to every order. If the key is not already present, store it in the database or cache and process the order; otherwise, skip it. This prevents duplicate shipments when multiple messages are queued.
Outbox pattern - discussed above
How does this work?
The order service inserts the order request into the outbox table
The message dispatcher reads the outbox table and sends the message to Kafka/RabbitMQ
The shipping service processes the event
This ensures eventual consistency without duplicate event processing
Token-based idempotency
A unique token, known as the idempotency key, is generated and stored in the database
If the key is already present in the system, return the previously processed response
Otherwise, process the request and store the key
This is especially useful in payment systems or stateless APIs where we need exactly-once execution
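The token flow above can be sketched in a few lines. A dict stands in for the idempotency-key table; the handler name and response shape are illustrative assumptions.

```python
processed = {}  # idempotency_key -> stored response

def handle_payment(idempotency_key, amount):
    if idempotency_key in processed:
        # Duplicate delivery: return the original result, do not charge twice.
        return processed[idempotency_key]
    response = {"status": "charged", "amount": amount}  # side effect happens once
    processed[idempotency_key] = response
    return response

first = handle_payment("order-42", 99)
retry = handle_payment("order-42", 99)
print(first is retry)  # True -- the retry got the stored response
```

In a real system the key check and the side effect must happen atomically (e.g. a unique-constraint insert), otherwise two concurrent retries could both pass the `in processed` check.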
Distributed Locks
Use Redis SETNX (Set If Not Exists) or ZooKeeper locks to allow only one worker to process an order at a time.
Only the first worker gets the lock and processes the shipment.
If the process fails before completion, the lock expires, allowing a retry.
This prevents multiple workers from shipping the same order.
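The lock-then-ship flow can be sketched like this. To keep the example self-contained, the `RedisLike` class simulates just enough of Redis's `SET key value NX EX ttl` semantics (set only if absent, with an expiry); with a real Redis client the same call would be a single `SET ... NX EX`.

```python
import time

class RedisLike:
    def __init__(self):
        self.data = {}  # key -> (value, expires_at)

    def set_nx_ex(self, key, value, ttl):
        # Succeeds only if the key is absent or its TTL has expired,
        # mirroring Redis SET with the NX and EX options.
        entry = self.data.get(key)
        if entry and entry[1] > time.time():
            return False  # someone else holds the lock
        self.data[key] = (value, time.time() + ttl)
        return True

r = RedisLike()

def ship_order(worker, order_id):
    # Only the worker that wins the lock processes the shipment; the TTL
    # frees the lock automatically if that worker dies mid-shipment.
    if not r.set_nx_ex(f"lock:ship:{order_id}", worker, ttl=30):
        return f"{worker}: skipped (locked)"
    return f"{worker}: shipped {order_id}"

print(ship_order("w1", "order-7"))  # w1: shipped order-7
print(ship_order("w2", "order-7"))  # w2: skipped (locked)
```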
Choosing between databases:
When selecting a database, it is crucial to understand the factors the choice depends on, such as scalability, availability, consistency, and query patterns.
Questions you should ask yourself while designing a database:
Does the system require structured or unstructured data storage?
How often do you need complex queries with joins and aggregations ?
Do you need real time updates or eventual consistency ?
Do we need secondary indexes for lookups ?
Does the inventory system need horizontal scaling?
How frequently does data change? High read-heavy or write-heavy workloads?
Do you need low-latency reads/writes across multiple regions?
Do you need multi-region replication for disaster recovery?
Can the system tolerate occasional downtimes?
Is the inventory model evolving rapidly?
Do you frequently add new attributes?
(These questions were generated with the help of AI.)
In the system we are designing:
Inventory Service: This service tracks stock levels and updates inventory in real time, handles stock reservations when orders are placed, and notifies other services when stock is low.
For this requirement, Azure Cosmos DB, Google Firestore, DynamoDB, PostgreSQL, or Cassandra are good choices, as they support OCC with transactions.
Order Service: This service receives new orders and validates payment, requests inventory reservation before confirming the order, and triggers shipment processing after confirmation.
PostgreSQL + Citus, CockroachDB, Google Spanner, or Cosmos DB to support distributed sagas or 2PC
Worker Service: This service processes queued jobs, handles retry logic if an order fails, triggers external APIs for fulfillment.
Redis SETNX, DynamoDB, PostgreSQL, or Cosmos DB to support idempotent operations
What are the main requirements of the database during a flash sale?
Strong consistency and availability
Caching Strategies:
To ensure fast inventory lookups while maintaining data accuracy, an optimized caching strategy is essential. Instead of caching all inventory items, only frequently accessed (hot) inventory items should be stored in the cache. This ensures optimal memory usage while providing quick reads for high-demand products.
For product listing and search scenarios, implementing the cache-aside pattern helps maintain a balance between freshness and performance. When a request is made, the system first checks the cache. If the data is not found, it fetches it from the database, caches the result, and returns it to the user. This prevents stale data issues while improving read latency.
Using Redis with expiration policies ensures that cached inventory items are automatically evicted after a set time, reducing the risk of serving outdated stock information. Cache invalidation should be triggered when stock levels change, ensuring real-time accuracy.
For even better performance, a hybrid caching model combining Redis (distributed cache) and local in-memory caching can be used. This reduces the load on Redis by serving frequently accessed items from memory, further improving response times for repeated queries.
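The cache-aside flow with TTL expiry and invalidation described above can be sketched as follows. A dict with expiry timestamps stands in for Redis; the function names and the 30-second TTL are illustrative assumptions.

```python
import time

db = {"sku-1": 10}   # source of truth (the inventory database)
cache = {}           # sku -> (stock, expires_at)
TTL = 30             # seconds before a cached entry is considered stale

def get_stock(sku):
    # Cache-aside read: check the cache first, fall back to the database,
    # then populate the cache for subsequent reads.
    entry = cache.get(sku)
    if entry and entry[1] > time.time():
        return entry[0]                      # cache hit
    stock = db[sku]                          # cache miss: read the database
    cache[sku] = (stock, time.time() + TTL)  # populate with an expiry
    return stock

def update_stock(sku, new_stock):
    # Invalidate on stock changes so the next read sees fresh data.
    db[sku] = new_stock
    cache.pop(sku, None)

print(get_stock("sku-1"))   # 10 (miss, then cached)
update_stock("sku-1", 7)
print(get_stock("sku-1"))   # 7 (invalidation forced a fresh read)
```

The hybrid model mentioned above would add a second, in-process dict in front of this one with a much shorter TTL, so repeated reads of hot items never leave the process.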