What's faster for Stata: manipulating data in a flat database (i.e. Excel) or in a relational database?


I'm an entry-level optimization analyst at a company that publishes risk ratings data on various companies. We have tons of data (to the point that our history is limited solely by the number of rows possible in Excel).

We use many .do files in Stata to perform manipulations and statistical analyses (the largest production run takes 9 hours, with a single insheet taking half a minute). I'm trying to convince my company to move away from using a flat database to using a relational database, but I have been having trouble finding information online on whether flat or relational is better in Stata. So, which is better, and why?

I would hypothesise that you have answered your own question by emphasising the limitations of Excel that prevent you from capitalising on the full potential of your data. Excel is not a proper analytical tool or data warehousing solution and, as such, there is no point in using it in analytical projects involving anything more complex than doing basic sums for small business / household needs.

To answer your question:

  1. Flat file databases are an archaic technology dating back to the beginnings of computer science: they were never designed to meet the modern analytical needs of working with big data, live data streams, etc.

  2. Relational databases:

    • help avoid data duplication
    • help avoid inconsistent records
    • are easier to work with when the data format changes
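
To make the points about duplication and inconsistent records concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names, column names, and sample rows are all hypothetical and are not taken from the question; the sketch only illustrates the general idea, under the assumption that each rating belongs to exactly one company.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two related tables (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE companies (
            company_id INTEGER PRIMARY KEY,
            name       TEXT NOT NULL UNIQUE
        );
        CREATE TABLE ratings (
            rating_id   INTEGER PRIMARY KEY,
            company_id  INTEGER NOT NULL REFERENCES companies(company_id),
            rating_date TEXT NOT NULL,
            rating      TEXT NOT NULL
        );
        """
    )
    # The company name is stored once, however many ratings it has.
    conn.execute("INSERT INTO companies (company_id, name) VALUES (1, 'Acme Corp')")
    conn.executemany(
        "INSERT INTO ratings (company_id, rating_date, rating) VALUES (?, ?, ?)",
        [(1, "2014-01-31", "BBB"), (1, "2014-02-28", "BB")],
    )
    conn.commit()
    return conn


def ratings_with_names(conn):
    """Join the two tables back into the flat shape an analysis would use."""
    rows = conn.execute(
        """
        SELECT c.name, r.rating_date, r.rating
        FROM ratings AS r
        JOIN companies AS c ON c.company_id = r.company_id
        ORDER BY r.rating_date
        """
    )
    return rows.fetchall()


def try_insert_orphan(conn):
    """Try to add a rating for a company that does not exist; return True if accepted."""
    try:
        conn.execute(
            "INSERT INTO ratings (company_id, rating_date, rating) "
            "VALUES (999, '2014-03-31', 'A')"
        )
    except sqlite3.IntegrityError:
        conn.rollback()
        return False
    conn.commit()
    return True


if __name__ == "__main__":
    conn = build_demo_db()
    print(ratings_with_names(conn))
    print("orphan rating accepted:", try_insert_orphan(conn))
    # Renaming the company touches one row; every joined rating picks it up.
    conn.execute("UPDATE companies SET name = 'Acme Holdings' WHERE company_id = 1")
    conn.commit()
    print(ratings_with_names(conn))
```

In a spreadsheet the company name would be repeated on every rating row, so a rename means editing many cells and any missed cell becomes an inconsistent record. Here the name lives in one row, and the foreign key rejects ratings that point at a company that does not exist. Stata can read the result of a query like the one in `ratings_with_names` through its `odbc load` command, assuming an ODBC data source has been configured for the database.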
