Cassandra - Big data analysis in NoSQL
I'm trying to migrate our Postgres database, which contains millions of clicks (a few years of click history), to a better-performing system. Our current analytic queries, running on Postgres, take forever to complete and degrade the performance of the whole database. I've been investigating possible solutions and I've decided to look more closely at two options:
- HBase with Hadoop (MapReduce)
- Cassandra with Spark
I have worked with NoSQL before, but I have never used it for analytical purposes. At first I was a bit disappointed by how few analytical query options these databases provide (missing GROUP BY, COUNT, ...). After reading many articles and presentations I found out that I need to design the schema according to how I intend to read the data, and that the storage layer is separated from the query layer. That adds more redundant data, but in the world of NoSQL this is not an issue.
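To illustrate what I mean by designing the schema around the reads, here is a minimal sketch in plain Python. It does not use Cassandra or HBase at all: the two `Counter` objects are only hypothetical stand-ins for query-specific tables, and the names (`record_click`, `clicks_by_campaign_day`, `campaign-1`) and the sample data are made up for illustration.

```python
from collections import Counter
from datetime import datetime

# Hypothetical stand-ins for two query-specific tables.
# Each one is keyed by exactly what one known query will look up.
clicks_by_campaign_day = Counter()      # answers "clicks per campaign per day"
clicks_by_campaign_country = Counter()  # answers "clicks per campaign per country"


def record_click(campaign_id: str, country: str, ts: datetime) -> None:
    """Write one click into every read-optimised view (redundant on purpose)."""
    day = ts.strftime("%Y-%m-%d")
    clicks_by_campaign_day[(campaign_id, day)] += 1
    clicks_by_campaign_country[(campaign_id, country)] += 1


# Three sample clicks (made-up data, for illustration only).
record_click("campaign-1", "DE", datetime(2015, 3, 1, 10, 0))
record_click("campaign-1", "US", datetime(2015, 3, 1, 11, 30))
record_click("campaign-1", "DE", datetime(2015, 3, 2, 9, 15))

# Reads become plain key lookups; no GROUP BY or COUNT is needed at query time.
per_day = clicks_by_campaign_day[("campaign-1", "2015-03-01")]
per_country = clicks_by_campaign_country[("campaign-1", "DE")]
```

The trade-off is visible here: every click is written twice, and a query that was not anticipated (say, clicks per hour) has no view to read from.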
Eventually I found a nice Grails plugin, cassandra-orm, which internally encapsulates the order-by feature in Cassandra counters. I'm still worried about how to make the design extensible, though. For the queries that will come in the future, which I have no clue about today, how can I design a schema that is prepared? One option is to use Spark, but Spark doesn't provide data in real time.
Could you give me some insight or advice on the best possible options for big data analysis? Should I use a combination of real-time queries and pre-aggregated ones?
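To make that last question concrete, this is roughly what I imagine by combining the two, as a minimal sketch in plain Python. It assumes a batch job has already produced totals up to some cutoff time and that newer raw clicks can be scanned cheaply; the function `total_clicks`, its parameters, and the sample data are hypothetical, not part of any library.

```python
from datetime import datetime


def total_clicks(campaign_id, batch_totals, batch_cutoff, recent_clicks):
    """Combine a pre-aggregated total with a live count of newer clicks.

    batch_totals:  dict of campaign_id -> number of clicks up to batch_cutoff
    recent_clicks: iterable of (campaign_id, timestamp) raw events
    """
    precomputed = batch_totals.get(campaign_id, 0)
    # Only count raw events the batch job has not seen yet.
    fresh = sum(
        1
        for cid, ts in recent_clicks
        if cid == campaign_id and ts > batch_cutoff
    )
    return precomputed + fresh


# Made-up data, for illustration only.
batch_cutoff = datetime(2015, 3, 2, 0, 0)
batch_totals = {"campaign-1": 1200}
recent_clicks = [
    ("campaign-1", datetime(2015, 3, 2, 8, 0)),
    ("campaign-2", datetime(2015, 3, 2, 8, 5)),
    ("campaign-1", datetime(2015, 3, 1, 23, 59)),  # already in the batch total
]

result = total_clicks("campaign-1", batch_totals, batch_cutoff, recent_clicks)
```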
Thanks,
If you are looking at near-real-time data analysis, the Spark + HBase combination is one of the solutions.
If you are willing to compromise on throughput, the Solr + Cassandra combination from DataStax can be used.
I am using Solr + Cassandra from DataStax for a use case that does not require real-time processing. The search performance of this combination is not great, but I am OK with the throughput.
The Spark + HBase combination seems promising. Depending on your business requirements and expertise, you can choose the right combination.