Database Sharding: Horizontal Scaling Guide
What Is Database Sharding?
Database sharding is a horizontal scaling technique that distributes data across multiple database instances (shards) based on a shard key. Each shard holds a subset of the data and operates independently. Instagram, which reports more than 2 billion monthly active users, has publicly described sharding its PostgreSQL data across thousands of logical shards.
Sharding is the go-to answer when an interviewer asks "How would you scale this database beyond a single server?" It enables near-linear horizontal scaling, provided most queries can be served by a single shard.
Sharding Strategies
- Range-Based: Shard by value ranges (users A-M on shard 1, N-Z on shard 2). Simple, but skewed or sequential keys can create hotspots.
- Hash-Based: Hash the shard key (e.g., user_id % num_shards). Even distribution, but resharding is complex because changing num_shards remaps most keys.
- Directory-Based: A lookup table maps each key to its shard. Flexible, but the directory becomes a single point of failure unless it is replicated.
- Geographic: Shard by region for data locality and compliance (EU data on EU shards).
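The routing logic behind the first three strategies can be sketched in a few lines of Python. This is a minimal illustration, not a production router: the function names, the `RANGE_UPPER_BOUNDS` table, and the `DIRECTORY` contents are all hypothetical, and shards are numbered from 0 rather than 1. Geographic sharding would usually look like the directory case, with a region as the key.

```python
import bisect
import hashlib


def hash_shard(key: str, num_shards: int) -> int:
    """Hash-based: stable digest of the key, modulo the shard count."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Range-based: each entry is the last first-letter a shard accepts.
RANGE_UPPER_BOUNDS = ["m", "z"]  # shard 0 holds a-m, shard 1 holds n-z


def range_shard(username: str) -> int:
    """Range-based: route on the first letter of a non-empty username."""
    index = bisect.bisect_left(RANGE_UPPER_BOUNDS, username[0].lower())
    if index == len(RANGE_UPPER_BOUNDS):
        raise ValueError(f"no shard covers {username!r}")
    return index


# Directory-based: an explicit lookup table (hypothetical tenants).
DIRECTORY = {"tenant-a": 0, "tenant-b": 1}


def directory_shard(tenant: str) -> int:
    return DIRECTORY[tenant]


def moved_fraction(user_ids, old_shards: int, new_shards: int) -> float:
    """Share of integer keys whose shard changes under plain modulo routing."""
    ids = list(user_ids)
    moved = sum(1 for uid in ids if uid % old_shards != uid % new_shards)
    return moved / len(ids)
```

With plain modulo routing, `moved_fraction(range(1000), 4, 5)` returns 0.8: adding a fifth shard relocates 80% of the keys. That is the resharding cost the hash-based strategy pays, and consistent hashing is a common way to reduce it.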
Choosing a Shard Key
The shard key determines data distribution and query routing. A good shard key has high cardinality (many unique values), even distribution, and aligns with your most common query patterns. For a social media app, user_id is often the best shard key because most queries are user-scoped.
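One way to compare candidate shard keys before committing is to measure how unevenly they would load the shards. The sketch below is an assumption-laden illustration: `shard_of` and `skew` are hypothetical helpers, `zlib.crc32` stands in for whatever hash a real system uses, and the sample data is invented.

```python
import zlib
from collections import Counter


def shard_of(key: str, num_shards: int) -> int:
    """Stable hash routing; crc32 is used only because it is deterministic."""
    return zlib.crc32(key.encode("utf-8")) % num_shards


def skew(keys, num_shards: int) -> float:
    """Busiest shard's row count divided by the ideal even share.

    1.0 means perfectly even; num_shards means everything is on one shard.
    """
    counts = Counter(shard_of(k, num_shards) for k in keys)
    ideal = len(keys) / num_shards
    return max(counts.values()) / ideal


# Invented sample: 1,000 rows keyed two different ways.
country_keys = ["US"] * 900 + ["DE"] * 100           # low cardinality
user_keys = [f"user-{i}" for i in range(1000)]       # high cardinality

country_skew = skew(country_keys, 4)  # at least 3.6: one shard holds 900+ rows
user_skew = skew(user_keys, 4)        # much closer to 1.0
```

A low-cardinality key such as country cannot spread load no matter how good the hash is, because every row with the same key value lands on the same shard.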
Cross-shard queries (JOINs across shards) are expensive because each one fans out to several shards and the results must be merged afterward, so minimize them through denormalization. See designing scalable databases for schema strategies.
Deep Dive: Advanced Database Sharding Concepts
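The cost difference between a single-shard query and a cross-shard query shows up even in a toy model. In this sketch the two in-memory lists stand in for real shards routed by `user_id % 2`; the data and the function names `posts_for_user` and `top_posts` are hypothetical.

```python
import heapq

# Hypothetical in-memory stand-ins for two shards, routed by user_id % 2.
SHARDS = [
    [{"user_id": 2, "post": "a", "likes": 5}, {"user_id": 4, "post": "b", "likes": 9}],
    [{"user_id": 1, "post": "c", "likes": 7}, {"user_id": 3, "post": "d", "likes": 1}],
]


def posts_for_user(user_id: int) -> list:
    """User-scoped query: the shard key tells us the one shard to ask."""
    shard = SHARDS[user_id % len(SHARDS)]
    return [row for row in shard if row["user_id"] == user_id]


def top_posts(n: int) -> list:
    """Global query: scatter to every shard, then gather and merge."""
    partials = [
        heapq.nlargest(n, shard, key=lambda row: row["likes"])
        for shard in SHARDS
    ]
    merged = [row for part in partials for row in part]
    return heapq.nlargest(n, merged, key=lambda row: row["likes"])
```

`posts_for_user` touches one shard, while `top_posts` touches all of them and does extra merge work, so its latency is bounded by the slowest shard. Denormalizing data so that common queries look like the first function is the usual goal.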
Technical interview preparation requires going beyond surface-level understanding. Interviewers at top companies probe for depth — they want to see that you understand not just what something is, but why it works that way, when to use it, and what trade-offs it involves. This section covers the advanced concepts that separate candidates who get offers from those who get politely rejected.
The most common failure mode in technical interviews is shallow knowledge: knowing the name of a concept without being able to apply it or explain its trade-offs. For every concept you list on your resume, prepare a three-part explanation: definition, implementation pattern, and a real example from your experience or a well-known system.
Problem-Solving Framework for Technical Interviews
Step 1: Clarify Requirements (2-3 minutes)
Never start coding immediately. Ask clarifying questions about scale, constraints, and requirements. "How many users are we designing for?" "What are the latency requirements?" "Is this read-heavy or write-heavy?" Interviewers reward candidates who think like engineers, not just coders. Missing a critical constraint and building the wrong solution is a common failure pattern.
Step 2: Propose an Approach (3-5 minutes)
Describe your approach before writing code. "I'm thinking of using X because Y. The trade-off is Z. Does that direction make sense?" This communicates your thought process, invites feedback, and ensures alignment before you invest time in implementation.
Step 3: Implement with Commentary (15-20 minutes)
Code while explaining your choices. Use clean variable names, structure your solution logically, and handle edge cases explicitly. When you encounter a decision point, explain your reasoning out loud: "I'm using a hash map here instead of an array because lookup time is O(1) vs O(n), which matters when this function is called thousands of times."
Step 4: Test and Optimize (5 minutes)
After completing a working solution, test it with edge cases (empty input, single element, maximum size) and analyze time/space complexity. If time permits, discuss optimizations. Interviewers respect candidates who identify their own solution's limitations.
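As an illustration of what testing with edge cases can look like, here is a small binary search followed by the checks a candidate might walk through out loud. The function name and the chosen inputs are just one possible example.

```python
from typing import Sequence


def binary_search(items: Sequence[int], target: int) -> int:
    """Return the index of target in sorted items, or -1 if absent.

    Time O(log n), extra space O(1).
    """
    lo, hi = 0, len(items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if items[mid] == target:
            return mid
        if items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1


# Edge cases: empty input, single element, boundaries, missing value.
assert binary_search([], 3) == -1
assert binary_search([3], 3) == 0
assert binary_search([3], 4) == -1
assert binary_search([1, 3, 5, 7], 1) == 0
assert binary_search([1, 3, 5, 7], 7) == 3
assert binary_search([1, 3, 5, 7], 4) == -1
```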
Time and Space Complexity Quick Reference
| Algorithm/Structure | Time (Average) | Space | Common Interview Use |
|---|---|---|---|
| Hash Map lookup | O(1) | O(n) | Two-sum, grouping, deduplication |
| Binary Search | O(log n) | O(1) | Sorted arrays, rotation detection |
| BFS/DFS | O(V+E) | O(V) | Graphs, trees, shortest path in unweighted graphs (BFS) |
| Merge Sort | O(n log n) | O(n) | Stable sorting, external sort |
| Quick Sort | O(n log n) | O(log n) | In-place sorting; worst case is O(n²) |
| Dynamic Programming | O(n*m) typical | O(n*m) | Optimization, counting, subsequences |
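To make one row of the table concrete, the following sketch shows breadth-first search finding the fewest edges between two nodes in an unweighted graph. The adjacency-list format and the function name `bfs_distance` are assumptions for illustration.

```python
from collections import deque


def bfs_distance(graph: dict, start, goal) -> int:
    """Fewest edges from start to goal in an unweighted graph, or -1.

    Each vertex is enqueued at most once and each edge is scanned at most
    once, which gives O(V+E) time and O(V) space.
    """
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt in seen:
                continue
            if nxt == goal:
                return dist + 1
            seen.add(nxt)
            queue.append((nxt, dist + 1))
    return -1


# Hypothetical directed graph as an adjacency list.
GRAPH = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"], "e": []}
```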
Most Common Mistakes in Technical Interviews
- Not clarifying the problem: Jumping directly to code without understanding requirements leads to solving the wrong problem.
- Silence: Thinking quietly without verbalizing your thought process makes interviewers nervous and prevents them from helping you when you're stuck.
- Overcomplicating: Reaching for the optimal solution immediately when the interviewer expects a simple brute-force approach first. State the brute-force solution (often O(n²)) up front, then optimize.
- Ignoring edge cases: Not testing with null, empty, or boundary inputs signals incomplete thinking.
- Not asking for hints: Most interviewers will help if you're stuck and ask for a hint. Struggling silently wastes time.
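The advice about stating the brute-force solution before optimizing can be illustrated with the classic two-sum problem. Both functions below are sketches with hypothetical names; each returns the indices of one valid pair, and they may pick different pairs when several exist.

```python
def two_sum_brute(nums, target):
    """Brute force: check every pair. O(n^2) time, O(1) extra space."""
    for i in range(len(nums)):
        for j in range(i + 1, len(nums)):
            if nums[i] + nums[j] == target:
                return (i, j)
    return None


def two_sum_hash(nums, target):
    """Optimized: remember seen values in a hash map. O(n) time, O(n) space."""
    seen = {}  # value -> index where it was last seen
    for j, value in enumerate(nums):
        i = seen.get(target - value)
        if i is not None:
            return (i, j)
        seen[value] = j
    return None
```

Starting with `two_sum_brute` gives the interviewer a correct baseline, and the move to `two_sum_hash` makes the trade-off explicit: less time in exchange for more memory.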
Practice Resources
The most effective preparation combines deliberate practice with AI-powered feedback:
- LeetCode: Use the company tag filter to practice company-specific questions. 75-100 medium problems is a solid preparation baseline.
- NeetCode 150: Curated list of 150 essential problems covering all major patterns. Available with video explanations.
- AissenceAI Coding Copilot: Real-time hints and approach suggestions during live coding practice sessions. Available in AissenceAI coding mode.
- AissenceAI Mock Interviews: Full coding interview simulations with AI feedback on clarity, approach, and edge case handling.
Frequently Asked Questions
How many LeetCode problems should I solve before a technical interview?
Quality over quantity. 50-75 problems solved thoroughly with pattern recognition beats 200 problems solved by looking up solutions. Focus on understanding the underlying pattern, not memorizing specific solutions. Common patterns: sliding window, two pointers, depth-first search, dynamic programming, binary search, heap.
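As an example of what recognizing a pattern means in practice, here is the sliding window pattern applied to a well-known problem: the length of the longest substring without repeating characters. The function name is hypothetical, and this is one common formulation rather than the only one.

```python
def longest_unique_substring(s: str) -> int:
    """Length of the longest substring of s with no repeated character.

    The window [start, end] always contains distinct characters; when a
    repeat appears, start jumps past the previous occurrence. O(n) time.
    """
    last_seen = {}  # character -> most recent index
    start = 0
    best = 0
    for end, ch in enumerate(s):
        if ch in last_seen and last_seen[ch] >= start:
            start = last_seen[ch] + 1
        last_seen[ch] = end
        best = max(best, end - start + 1)
    return best
```

Once the window invariant is understood, the same shape (expand the right edge, shrink the left edge when a condition breaks) transfers to many other substring and subarray problems.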
What if I get stuck during a coding interview?
Say so: "I'm not immediately seeing the optimal approach. Can I think through a brute force solution and then optimize?" Or ask a targeted question: "Is it safe to assume the input is always sorted?" Showing structured problem-solving under pressure is itself a positive signal.
How important is code quality vs. correctness?
Both matter, but in this order: correct algorithm > working code > clean code > optimal code. An elegant but wrong solution scores worse than a messy but correct one. Clean code and optimizations matter most at senior levels.
Next Steps
Combine technical practice with real interview experience. Use AissenceAI mock technical interviews to simulate the pressure of a real interview. For live interviews, AissenceAI's coding copilot provides real-time hints and approach suggestions. Check best coding practice platforms for a full comparison of preparation resources.