Database Expert Labs - Module 9

Lab 25: Cassandra Distributed Database

NoSQL / Expert

Scenario: IoT Sensor Data Platform

TechSensors needs a highly available, distributed database for billions of IoT sensor readings. You must write CQL commands to create keyspaces, design wide-row tables with proper partitioning, and configure replication strategies. Each command is validated for correct syntax.

Learning Objectives:

Keyspace Creation: Define the replication strategy and replication factor
Table Design: Create wide-row tables with partition and clustering keys
Data Modeling: Write time-series data models with TTL
Queries: Use WHERE clauses with partition keys properly

📋 Step-by-Step Instructions

Step 1: Create Keyspace with Replication
Create a keyspace with NetworkTopologyStrategy and RF=3.

Required Syntax:
CREATE KEYSPACE iot_sensors WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter1': 3};

• Keyspace name: iot_sensors
• Strategy: NetworkTopologyStrategy
• RF: 3

💡 Tip: Use NetworkTopologyStrategy in production; SimpleStrategy is intended for development and testing only.

Step 2: Switch to Keyspace
Use the keyspace to work with tables.

Required Command:
USE iot_sensors;

💡 Tip: Run USE iot_sensors; before creating tables, or qualify each table name with the keyspace (for example, iot_sensors.sensor_data).

Step 3: Create Time-Series Table
Design a wide-row table for sensor readings with proper keys.

Required Elements:
• Table name: sensor_data
• Partition key: sensor_id
• Clustering key: timestamp (DESC order)
• Columns: temperature, humidity, battery_level
• Must use: PRIMARY KEY ((sensor_id), timestamp)

💡 Tip: Partition by sensor_id, cluster by timestamp for efficient time-series queries.
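
The required elements above can be combined into one statement. The following is a minimal sketch, held in a Python string so it can be checked before pasting into cqlsh. The column types for temperature, humidity, and battery_level are assumptions, since the lab names those columns but not their types. The DESC order is expressed with WITH CLUSTERING ORDER BY.

```python
# Sketch only: the double/int column types are assumptions, not lab requirements.
CREATE_SENSOR_DATA = """
CREATE TABLE sensor_data (
    sensor_id text,
    timestamp timestamp,
    temperature double,
    humidity double,
    battery_level int,
    PRIMARY KEY ((sensor_id), timestamp)
) WITH CLUSTERING ORDER BY (timestamp DESC);
""".strip()

# Print the statement so it can be copied into the cqlsh prompt.
print(CREATE_SENSOR_DATA)
```

With this layout, all readings for one sensor live in a single partition, stored newest first.
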
Step 4: Insert Sensor Reading
Insert a sensor reading with all required fields.

Required Fields:
• sensor_id: text (e.g., 'SENSOR-001')
• timestamp: timestamp
• temperature, humidity, battery_level

💡 Tip: Every INSERT must supply the full primary key: the partition key (sensor_id) and the clustering key (timestamp).
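
As an illustration, the sketch below assembles one possible INSERT for the sensor_data table. The reading values and the timestamp literal are made-up sample data, and the numeric literals assume numeric column types.

```python
# Sketch only: the values below are hypothetical sample data.
columns = ["sensor_id", "timestamp", "temperature", "humidity", "battery_level"]
values = ["'SENSOR-001'", "'2024-01-15 10:30:00+0000'", "22.5", "45.0", "87"]

# Column and value lists must line up one-to-one, primary key columns included.
INSERT_READING = (
    f"INSERT INTO sensor_data ({', '.join(columns)}) "
    f"VALUES ({', '.join(values)});"
)

print(INSERT_READING)
```
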
Step 5: Query with WHERE Clause
Query sensor data using the partition key properly.

Required Query:
• Must use: SELECT
• Must have: WHERE sensor_id = <value> (the partition key is required)
• Optional: LIMIT clause

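💡 Tip: Restrict every query by the partition key so Cassandra reads a single partition. Filtering on other columns without it requires ALLOW FILTERING and scans the whole table, which is slow at this scale.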
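
A query of this shape might be built as follows. This is a sketch: select_latest is a hypothetical helper name, and it assumes the sensor_data table from Step 3.

```python
def select_latest(sensor_id: str, limit: int = 10) -> str:
    """Build a SELECT restricted to one partition (hypothetical helper)."""
    # CQL escapes a single quote inside a string literal by doubling it.
    escaped = sensor_id.replace("'", "''")
    return (
        f"SELECT * FROM sensor_data WHERE sensor_id = '{escaped}' "
        f"LIMIT {int(limit)};"
    )

print(select_latest("SENSOR-001", 5))
```

If the table clusters by timestamp in DESC order, the LIMIT returns the most recent readings first.
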
Step 6: Create Table with TTL
Create a table for temporary data with automatic expiration.

Required Elements:
• Table name: alerts
• Must have: default_time_to_live = 86400 (24 hours)
• Primary key: alert_id

💡 Tip: Setting default_time_to_live = 86400 makes each row expire, by default, 24 hours after it is written, which suits temporary alerts.
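
One way to satisfy these requirements is sketched below. Only alert_id and the TTL setting come from the lab; the uuid type and the sensor_id, message, and created_at columns are assumptions added to make the table usable.

```python
# 24 hours expressed in seconds, matching the required TTL value.
TTL_SECONDS = 24 * 60 * 60

# Sketch only: every column except alert_id is an assumption.
CREATE_ALERTS = f"""
CREATE TABLE alerts (
    alert_id uuid PRIMARY KEY,
    sensor_id text,
    message text,
    created_at timestamp
) WITH default_time_to_live = {TTL_SECONDS};
""".strip()

print(CREATE_ALERTS)
```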

Cassandra CQL Shell

cqlsh -- localhost:9042

Connected to Cassandra Cluster
CQL Version: 3.4.5 [cqlsh 6.0.0 | Cassandra 4.0.0]
Type CQL commands at the cqlsh> prompt.

CQL Quick Reference

• CREATE KEYSPACE name WITH replication = {...};

• USE keyspace_name;

• CREATE TABLE name (col type, PRIMARY KEY ((partition), clustering));

• INSERT INTO table (cols) VALUES (vals);

• SELECT * FROM table WHERE partition_key = value;

Lab 26: Elasticsearch Full-Text Search

Search / Expert

Scenario: News Article Search Engine

MediaCorp needs a powerful search engine for millions of news articles. You must write Elasticsearch REST API commands to create indices with mappings, configure analyzers for full-text search, and write complex queries with aggregations.

Learning Objectives:

Index Creation: Create indices with custom mappings and analyzers
Document Indexing: Index documents with proper field types
Full-Text Search: Write match, bool, and multi_match queries
Aggregations: Create bucket and metric aggregations

📋 Step-by-Step Instructions

Step 1: Create Index with Mappings
Create an index with explicit field mappings and analyzers.

Required API Call:
PUT /news_articles

• Index name: news_articles
• Must define: mappings with properties
• Fields: title (text), content (text), author (keyword), published_date (date)

💡 Tip: Use 'text' for analyzed full-text search and 'keyword' for exact matching, sorting, and aggregations.
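
A request body with these four fields could look like the sketch below, built as a Python dict and rendered as JSON for the console. The built-in english analyzer on the text fields is an assumption; the lab asks for analyzers but does not name one.

```python
import json

# Sketch only: the "english" analyzer choice is an assumption.
create_index_body = {
    "mappings": {
        "properties": {
            "title": {"type": "text", "analyzer": "english"},
            "content": {"type": "text", "analyzer": "english"},
            "author": {"type": "keyword"},
            "published_date": {"type": "date"},
        }
    }
}

# Render the request line followed by the JSON body.
print("PUT /news_articles")
print(json.dumps(create_index_body, indent=2))
```
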
Step 2: Index a Document
Add a news article document to the index.

Required API Call:
POST /news_articles/_doc

• Must include: title, content, author, published_date

💡 Tip: POST /news_articles/_doc auto-generates a document ID; use PUT /news_articles/_doc/<id> to set a specific ID.
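
The sketch below shows one possible document carrying the four required fields. The article text, author name, and date are invented sample values.

```python
import json

# Sketch only: all field values are hypothetical sample data.
article = {
    "title": "City council approves new transit plan",
    "content": "The council voted on Monday to fund three new bus lines.",
    "author": "Jane Doe",
    "published_date": "2024-01-15",
}

# POST without an ID lets Elasticsearch generate one.
print("POST /news_articles/_doc")
print(json.dumps(article, indent=2))
```
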
Step 3: Full-Text Match Query
Search articles using full-text match query.

Required Query:
GET /news_articles/_search

• Must use: match query
• Search in: title or content field

💡 Tip: A match query runs the search text through the field's analyzer, then ranks the matching documents by relevance.
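
A minimal match query on the title field might look like this sketch. The search text is a made-up example.

```python
import json

# Sketch only: "transit plan" is a hypothetical search phrase.
match_query = {
    "query": {
        "match": {
            "title": "transit plan"
        }
    }
}

print("GET /news_articles/_search")
print(json.dumps(match_query, indent=2))
```
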
Step 4: Bool Query with Filters
Write a compound query with must, should, and filter clauses.

Required Elements:
• Must use: bool query
• Include: must or should clause
• Include: filter for date range

💡 Tip: filter clauses do not affect relevance scoring; use them for date ranges and exact matches.
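
The required elements can be combined as in the following sketch: a must clause for the scored text match and a filter clause with a range on published_date. The search term and the date bounds are illustrative assumptions.

```python
import json

# Sketch only: the search term and date bounds are hypothetical.
bool_query = {
    "query": {
        "bool": {
            # Scored clause: contributes to relevance ranking.
            "must": [{"match": {"content": "election"}}],
            # Unscored clause: only includes or excludes documents.
            "filter": [
                {
                    "range": {
                        "published_date": {
                            "gte": "2024-01-01",
                            "lte": "2024-12-31",
                        }
                    }
                }
            ],
        }
    }
}

print("GET /news_articles/_search")
print(json.dumps(bool_query, indent=2))
```
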
Step 5: Multi-Match Query
Search across multiple fields at once.

Required Query:
• Must use: multi_match query
• Search fields: ["title", "content"]
• Include: type parameter (best_fields or cross_fields)

💡 Tip: best_fields scores each document by its single best-matching field; cross_fields treats the listed fields as one combined field.
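
For reference, a multi_match query with the required fields and type parameter could be written as below. This is a sketch, and the search text is a made-up example.

```python
import json

# Sketch only: "renewable energy" is a hypothetical search phrase.
multi_match_query = {
    "query": {
        "multi_match": {
            "query": "renewable energy",
            "fields": ["title", "content"],
            "type": "best_fields",
        }
    }
}

print("GET /news_articles/_search")
print(json.dumps(multi_match_query, indent=2))
```
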
Step 6: Aggregation Query
Create aggregations to analyze article data.

Required Elements:
• Must use: aggs or aggregations
• Include: terms aggregation on author
• Or: date_histogram on published_date

💡 Tip: terms agg creates buckets by field value - great for faceted search!
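
The sketch below shows both aggregation options in one request. The aggregation names (articles_per_author, articles_per_month) are arbitrary labels chosen for this example, and the monthly interval and "size": 0 (which suppresses search hits so only buckets are returned) are assumptions.

```python
import json

# Sketch only: aggregation names and the monthly interval are assumptions.
aggs_query = {
    "size": 0,  # return aggregation buckets only, no search hits
    "aggs": {
        "articles_per_author": {
            "terms": {"field": "author"}
        },
        "articles_per_month": {
            "date_histogram": {
                "field": "published_date",
                "calendar_interval": "month",
            }
        },
    },
}

print("GET /news_articles/_search")
print(json.dumps(aggs_query, indent=2))
```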

Elasticsearch REST Console

elasticsearch:9200

Elasticsearch 8.11.0
Cluster: news-cluster (green)
Connected to localhost:9200
Type REST commands at the ES> prompt (e.g., GET /_cluster/health).

Elasticsearch API Reference

• PUT /index_name - Create index

• POST /index/_doc - Index document

• GET /index/_search - Search documents

• GET /_cluster/health - Cluster status

Lab 27: Neo4j Graph Database

Graph / Expert

Scenario: Social Network Analysis

SocialNet needs a graph database to model user relationships, recommendations, and fraud detection. You must write Cypher queries to create nodes, relationships, and perform graph traversals and pattern matching.

Learning Objectives:

Node Creation: Create labeled nodes with properties
Relationships: Define typed relationships between nodes
Pattern Matching: Use MATCH to find graph patterns
Graph Algorithms: Find shortest paths and recommendations

📋 Step-by-Step Instructions

Step 1: Create User Nodes
Create user nodes with the Person label and properties.

Required Syntax:
CREATE (n:Person {name: 'Alice', age: 30})

• Label: Person
• Properties: name, age

💡 Tip: Labels categorize nodes (Person, Product, etc). Properties store data.
Step 2: Create Relationships
Create FOLLOWS relationship between users.

Required Syntax:
MATCH (a:Person {name:'Alice'}), (b:Person {name:'Bob'}) CREATE (a)-[:FOLLOWS]->(b)

• Relationship type: FOLLOWS
• Direction matters: (a)-[:REL]->(b)

💡 Tip: Relationships are typed and directional. Use MERGE to avoid duplicates.
Step 3: MATCH Pattern Query
Query the graph to find all people someone follows.

Required Query:
• Must use: MATCH
• Pattern: (p:Person)-[:FOLLOWS]->(followed)
• Must use: RETURN

💡 Tip: MATCH finds patterns in the graph. A read-only query must end with RETURN.
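
One query that fits this pattern is sketched below, held in a Python string for checking before it is pasted into the Cypher shell. Filtering on the name 'Alice' reuses the sample node from Step 1; returning followed.name is an assumption about the desired output.

```python
# Sketch only: the start node and returned property are illustrative choices.
FIND_FOLLOWED = (
    "MATCH (p:Person {name: 'Alice'})-[:FOLLOWS]->(followed) "
    "RETURN followed.name"
)

print(FIND_FOLLOWED)
```
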
Step 4: Variable-Length Paths
Find friends-of-friends using variable-length paths.

Required Query:
• Must use: *1..2 or similar variable length
• Pattern: -[:FOLLOWS*1..2]->
• Find 2nd degree connections

💡 Tip: *1..2 means 1 to 2 hops. *..3 means up to 3 hops.
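
A possible variable-length query is sketched below. Note that *1..2 returns people one or two hops away, so direct follows are included; using *2 instead would match exactly two hops. The start node 'Alice' and the variable name other are illustrative choices.

```python
# Sketch only: start node and variable names are illustrative.
FRIENDS_OF_FRIENDS = (
    "MATCH (p:Person {name: 'Alice'})-[:FOLLOWS*1..2]->(other) "
    "WHERE other <> p "  # exclude paths that loop back to the start node
    "RETURN DISTINCT other.name"
)

print(FRIENDS_OF_FRIENDS)
```
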
Step 5: Aggregation Query
Count followers per user using aggregation.

Required Query:
• Must use: COUNT() aggregation
• Must use: ORDER BY
• Find most followed users

💡 Tip: count() aggregates per group. ORDER BY DESC for top results.
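
The following sketch counts incoming FOLLOWS relationships per person and sorts the result. The aliases name and followers and the LIMIT 5 cutoff are assumptions added for readability.

```python
# Sketch only: aliases and the LIMIT value are illustrative.
MOST_FOLLOWED = (
    "MATCH (follower:Person)-[:FOLLOWS]->(p:Person) "
    "RETURN p.name AS name, count(follower) AS followers "  # grouped by p.name
    "ORDER BY followers DESC "
    "LIMIT 5"
)

print(MOST_FOLLOWED)
```
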
Step 6: Shortest Path Algorithm
Find the shortest path between two users.

Required Query:
• Must use: shortestPath function
• Pattern: shortestPath((a)-[*]-(b))
• Find connection between two people

💡 Tip: shortestPath finds minimum hops. Use [*..10] to limit depth.
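
A shortest-path query between the two sample users might be written as in this sketch. The 10-hop bound follows the tip above; the names 'Alice' and 'Bob' reuse the earlier examples, and returning length(path) as hops is an optional addition.

```python
# Sketch only: endpoints and the hops alias are illustrative.
SHORTEST_PATH = (
    "MATCH (a:Person {name: 'Alice'}), (b:Person {name: 'Bob'}), "
    "path = shortestPath((a)-[*..10]-(b)) "  # undirected, at most 10 hops
    "RETURN path, length(path) AS hops"
)

print(SHORTEST_PATH)
```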

Neo4j Cypher Shell

neo4j://localhost:7687

Neo4j 5.15.0 Community Edition
Connected to neo4j://localhost:7687
Database: neo4j
Type Cypher queries at the neo4j> prompt.

Cypher Quick Reference

• CREATE (n:Label {prop: val}) - Create node

• (a)-[:REL]->(b) - Relationship pattern

• MATCH (n) RETURN n - Query pattern

• WHERE n.prop = val - Filter results

Advanced Database Systems

Distributed & Graph Databases - Module 9
