Duration
- Basic Level: 2 days
- Advanced Level: 3 days
Course Outline
Developer Training on Apache Cassandra
Level: Basic
Introduction to RDBMS and NoSQL Databases
- Traditional Database Management System
- Limitations of RDBMS
- NoSQL databases
- Common characteristics of NoSQL databases
- CAP theorem
- Different Types of NoSQL Databases
- How Cassandra addresses these limitations
- History of Cassandra
- Use Cases
- Quiz
Introduction To Apache Cassandra
- Features of Cassandra
- Distributed
- Horizontal Scalability
- Fault Tolerance
- High Availability
- Cassandra Architecture
- Gossip Protocol
- Partitioning
- Data Model vs. Query Model
- How to approach Cassandra for an application
- Quiz
Designing Cassandra Database
- Installation of Single Node Cluster on Windows/Linux
- Cassandra Configuration File
- Communication with Cassandra
- Understanding Ways to Communicate with Cassandra
- Running the Command-Line Client Interface - Using cqlsh
- Database Concepts
- Keyspace
- Keyspace Properties - Replication Factor and Replication Strategy
- Column Family/Table
- Column Family Properties
- Data Types - Primitives and Collection
- Primary Key: Partition Key, Compound Key, Clustering Key and Composite Key
- Purpose of Primary Key in Data Partitioning and Storage
- Create, Read, Update and Delete (CRUD)
- Wide Rows vs. Skinny Rows
- Time To Live
- Secondary Indexes in Cassandra
- Difference between Custom Indexes and Secondary Indexes
- Materialized Views
- Lightweight transactions
- Triggers
- Practical Session
- Quiz
Understanding Cassandra Database Approach
- Difference between Relational Modeling and Cassandra Modeling
- Key Points to note while modeling a Cassandra Database
- Patterns and Anti-Patterns in Cassandra Modeling
- Practical Session
- Quiz
- Conclusion of First Day
Deep Dive into Apache Cassandra
- Tokens and vnodes
- initial_token
- num_tokens
- How data is distributed across multiple nodes and datacenters
- How data is written on nodes
- Tunable Consistency Level
- Hinted Handoff
- Commit Log
- Memtables
- SSTables
- Compaction
- How data is read from nodes
- Tunable Consistency Level
- Read Direct
- Read Digest
- Read Repair
- Bloom Filter and its properties
- Key Space and its properties
- Row Space and its properties
- Tombstones - How Data is deleted
- Understanding How Update Works
Understanding Multi-Node Cassandra Cluster
- Installation of Multi-Node Cluster on Linux
- Cassandra Configuration File - seed nodes and bootstrap node
- Monitoring and Status Info of Cluster using nodetool
- Importing data into a table from a CSV file and exporting table data to a CSV file
- Other DDL and DML Commands of Cassandra
- Data movement from one node to another
- Adding, removing or replacing a live or dead node in the cluster, and its effect
- Practical Session
- Quiz
- Conclusion of the Second Day
Level: Advanced
Introduction and Recap of Apache Cassandra
Configuration
- Different Types of Gossip Protocols
- How to define Tokens
- Advantages of dynamic tokens using vnodes over static tokens
- Rebalancing cluster
- Practical Session
- Setup of Multi-Node Cluster on Linux
- cassandra.yaml configuration file
- Configuring gossip settings
- Configuring the heap dump directory
- Token setting
- Configuring/Enabling virtual nodes
- Commit log archive configuration
- Quiz
Backup and Restore
- Taking a snapshot
- Deleting snapshot files
- Enabling incremental backups
- Restoring from a Snapshot
- Restoring a snapshot into a new cluster
- Quiz
- Practice Session
- Creating Snapshots
- Restore Snapshot
Performance Tuning
- How data modeling affects performance
- Workload characterization
- Performance characteristics of your cluster
- Drill-down analysis
- Latency analysis
- Thread state analysis
- Easy performance tuning wins
- I/O (including hardware)
- JVM & memory
- Compaction
- Compression
- Quiz
- Practice Session
Operations
- Snitches
- Cassandra nodes
- Specifying seed nodes
- Bootstrapping a node
- Adding a node (Commissioning) in Cluster
- Removing (Decommissioning) a node
- Removing a dead node
- Repair
- Read Repair
- Run a Repair Operation
- Tuning Bloom filters
- Data caching
- Configuring memtable throughput
- Configuring compaction
- Compression
- Tuning Java resources
- Purging gossip state on a node
- Using cassandra-stress
- Moving a node from one rack to another
- Decommissioning a data center
- Switching snitches
- Edge cases for transitioning or migrating a cluster
- Modifying cassandra-rackdc.properties
- Changing Replication Strategy
- Security
- Quiz
- Practice Session
- Cassandra Configuration File Setting
- nodetool commands to add, remove and replace nodes
- nodetool commands to move data between nodes
- nodetool commands to determine cluster and node status
- Creating Roles and granting permissions
Introduction To DataStax Cassandra
- The need for DataStax Enterprise (DSE)
- DSE Architecture
- Installation of the Community Version of DataStax Cassandra
- Understanding OpsCenter
Managing and Monitoring Cluster
- Logging
- Tailing
- Using the nodetool Utility
- Using JConsole
- Learning about OpsCenter
- Runtime Analysis Tools
- Practice Session
- JMX and JConsole
- OpsCenter
- nodetool
- Quiz
- Conclusion of the Training
- Questions and Answers
Test & Evaluation
Each lecture ends with a quiz made up of multiple-choice questions. In addition, there is a final multiple-choice test.
Your evaluation is based on your combined scores across the lecture quizzes and the final test.