More than a year after its 4.0 major upgrade, Apache Cassandra is set to release its 4.1 iteration next week, promising pluggable schema management and new guardrails to help ops professionals keep those devs in line.
In July last year, the 4.0 release of the popular NoSQL database was the first major upgrade in six years, with an emphasis on stability, speed and consistency.
Building on that foundation for enterprise-grade systems, Mick Semb Wever, Apache Cassandra PMC chairman, told The Register: “Four point one is the next natural step. We’re looking at who our users are, and what the major concerns are.”
He said these included stability, as this is a “high availability technology that scales, and that comes with its own set of challenges.”
Cassandra was first developed at Facebook in 2008 as a wide-column database that could support highly distributed systems where writes exceed reads, and so-called ACID-compliance is not important. It has been deployed in customer-facing applications built by businesses including Netflix, eBay and Apple, which runs a Cassandra DB with more than 160,000 instances storing over 100 petabytes of data across more than a thousand clusters. It became an Apache Top-Level Project in February 2010.
“Often, it’s deployed in situations in companies and platforms where there are so many developers involved, DevOps kind of breaks. The feedback we get from ops teams is they couldn’t get anything else to work at this scale, but that comes with a whole new set of problems and so we’ve been addressing them and trying to make it easier for operators, which in turn is about reducing cost, and reducing risk,” Semb Wever said.
To this end, 4.1 includes configurable system-level guardrails to guide users toward scalable use of the database, a partition denylisting tool for reducing the impact of overloaded partitions, and improvements to nodetool, backup and restore.
Semb Wever said the list of guardrails included whether developers can use secondary indexes at all and how many secondary indexes they can create on a table. “It could be warnings when certain features are used, or it could deny their use or limit the extent of their use,” he said.
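To make the warn, deny and limit behaviour concrete, here is a minimal sketch in Python of how such a guardrail might behave. It is not Cassandra's implementation: the `Threshold` class, the `GuardrailViolation` exception and the example limits of 2 and 4 are all hypothetical, chosen only to illustrate the idea of a soft limit that warns and a hard limit that refuses.

```python
from dataclasses import dataclass
from typing import Optional


class GuardrailViolation(Exception):
    """Raised when a guardrail denies an operation outright (hypothetical)."""


@dataclass
class Threshold:
    # Hypothetical soft and hard limits; None disables that level.
    warn: Optional[int] = None
    fail: Optional[int] = None

    def check(self, what: str, value: int) -> Optional[str]:
        """Return a warning message, raise on a hard violation, else None."""
        if self.fail is not None and value > self.fail:
            raise GuardrailViolation(
                f"{what}: {value} exceeds fail threshold {self.fail}"
            )
        if self.warn is not None and value > self.warn:
            return f"{what}: {value} exceeds warn threshold {self.warn}"
        return None


# Illustrative limits only: warn above 2 indexes, refuse above 4.
indexes_per_table = Threshold(warn=2, fail=4)
print(indexes_per_table.check("secondary indexes per table", 3))
```

In this sketch a table with three secondary indexes is allowed but flagged, while a fifth index would be rejected, which mirrors the distinction Semb Wever draws between warning about a feature and denying it.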
Another area the Apache Cassandra developers have focused on relates to the down-stream ecosystem of companies that support and build on the open-source system. These include DataStax, where Semb Wever is a solutions architect.
New features include pluggable persistent memory providers via a new Memtable API and pluggable external schema manager services. Semb Wever explained that pluggable storage had been “called for for a long time”.
“We know that different people will use different storage engines,” he said. By allowing developers to plug in their own storage engine more easily, the system could reduce the storage on disk and reduce the lookup times, he added.
Apache Cassandra 4.1 is set for general availability next week. ®