cassandra - Big data analysis in NoSQL


I'm trying to migrate our Postgres database, which contains millions of clicks (a few years of click history), to a better-performing system. Our current analytic queries running on Postgres take forever to complete and degrade the performance of the whole database. I've been investigating possible solutions and have decided to look closely at two options:

  • HBase with Hadoop (MapReduce)
  • Cassandra with Spark

I have worked with NoSQL before, but never used it for analytical purposes. At first I was a bit disappointed by how few analytical query options these databases provide (no GROUP BY, COUNT, ...). After reading many articles and presentations, I found out that I need to design the schema according to how I intend to read the data, and that the storage layer is separated from the query layer. This adds more redundant data, but in the NoSQL world that is not an issue.
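To make the query-first idea concrete, here is a minimal sketch in plain Python rather than any real database driver. The two dictionaries stand in for two denormalized tables, and every name in it (`clicks_by_user`, `clicks_by_url_day`, `record_click`) is hypothetical, not taken from Cassandra or HBase.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical in-memory stand-ins for two denormalized tables,
# one per read pattern.
clicks_by_user = defaultdict(list)      # partition key: user_id
clicks_by_url_day = defaultdict(list)   # partition key: (url, day)


def record_click(user_id, url, ts):
    """Write the same click once per read pattern (deliberate redundancy)."""
    day = ts.date().isoformat()
    clicks_by_user[user_id].append((ts, url))
    clicks_by_url_day[(url, day)].append((ts, user_id))


record_click("u1", "/home", datetime(2015, 3, 1, 10, 0))
record_click("u2", "/home", datetime(2015, 3, 1, 11, 30))
record_click("u1", "/pricing", datetime(2015, 3, 2, 9, 15))

# Each read is a single key lookup; no GROUP BY is needed at query time.
home_clicks = clicks_by_url_day[("/home", "2015-03-01")]
u1_history = clicks_by_user["u1"]
```

The cost is that every write touches one structure per supported query, which is the redundancy mentioned above.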

Eventually I found a nice Grails plugin, cassandra-orm, which internally encapsulates the orderby feature in Cassandra counters. But I'm still worried about how to make this design extensible. For queries that will come in the future, which I have no clue about today, how can I design a schema that is prepared for them? One option is to use Spark, but Spark doesn't provide data in real time.
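A rough sketch of how the extensibility concern might be approached, assuming the raw click log is kept alongside the counters: known queries are served from counters updated at write time, and a query nobody anticipated is answered by replaying the log in a batch job (the role Spark would play). This is plain Python with hypothetical names (`raw_events`, `clicks_per_url`, `backfill`), not the cassandra-orm API.

```python
from collections import Counter

raw_events = []              # append-only log of every click (hypothetical)
clicks_per_url = Counter()   # pre-aggregated counter, updated on every write


def record_click(event):
    """Store the raw event and bump the counters for the known queries."""
    raw_events.append(event)
    clicks_per_url[event["url"]] += 1


def backfill(key_fn):
    """Build a counter for a new query by replaying the raw log (batch step)."""
    counter = Counter()
    for event in raw_events:
        counter[key_fn(event)] += 1
    return counter


for event in [
    {"url": "/home", "country": "CZ"},
    {"url": "/home", "country": "DE"},
    {"url": "/pricing", "country": "CZ"},
]:
    record_click(event)

# A query that was not planned for: clicks per country.
clicks_per_country = backfill(lambda e: e["country"])
```

Once backfilled, the new counter could be maintained on write like the others, so only the first computation pays the batch cost.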

Could you give me some insight or advice on the best possible options for big data analysis? Should I use a combination of real-time queries and pre-aggregated ones?

Thanks,

  1. If you are looking at near-real-time data analysis, the Spark + HBase combination is one of the solutions.

  2. If you can compromise on throughput, the Solr + Cassandra combination from DataStax can be used.

I am using Solr + Cassandra from DataStax for my use case, which does not require real-time processing. The performance of the search option is not great with this combo, but I am OK with the throughput.

The Spark + HBase combination seems promising. Depending on your business requirements and expertise, you can choose the right combination.

