How to read lines of a file with Node.js or JavaScript with a delay, not in non-blocking behavior?


I am reading a file (300,000 lines) in Node.js. I want to send the lines in batches of 5,000 lines to another application (Elasticsearch) to store them. So whenever I finish reading 5,000 lines, I want to send them in bulk to Elasticsearch through an API, then keep reading the rest of the file and send every 5,000 lines in bulk.

If I wanted to use Java (or another blocking language such as C, C++, Python, etc.) for this task, I would do this:

int countLines = 0;
String bulkString = "";
String currentLine;
BufferedReader br = new BufferedReader(new InputStreamReader(new FileInputStream("filepath.txt")));
while ((currentLine = br.readLine()) != null) {
    countLines++;
    bulkString += currentLine;
    if (countLines >= 5000) {
        // send bulkString to Elasticsearch via its API (a blocking call)
        countLines = 0;
        bulkString = "";
    }
}

If I want to do the same thing in Node.js, I would do:

var countLines = 0;
var bulkString = "";
var instream = fs.createReadStream('filepath.txt');
var rl = readline.createInterface(instream, outstream);
rl.on('line', function(line) {
    countLines++;
    bulkString += line;
    if (countLines >= 5000) {
        // send bulkString via the bulk API
        client.bulk({
            index: 'indexName',
            type: 'type',
            body: [bulkString]
        }, function (error, response) {
            // task done
        });
        countLines = 0;
        bulkString = "";
    }
});

The problem is that Node.js is non-blocking: it does not wait for the first API response before sending the next batch of lines. I know this counts as a benefit of Node.js, because it does not wait for I/O, but the problem is that it sends all of the data to Elasticsearch at once. Therefore Elasticsearch's queue gets full and it throws exceptions.

My question is: how can I make Node.js wait for the API response before it continues reading the next lines, or before it sends the next batch of lines to Elasticsearch?

I know I can set parameters in Elasticsearch to increase the queue size, but I am interested in the blocking behavior of Node.js for this issue. I am familiar with the concept of callbacks, but I cannot think of a way to use callbacks in this scenario to prevent Node.js from calling the Elasticsearch API in non-blocking mode.

Pierre's answer is correct. I just want to submit code that shows how you can benefit from the non-blocking concept of Node.js and, at the same time, not overwhelm Elasticsearch with too many requests at one time.

Here is pseudo code you can use to give the code some flexibility by setting a limit on the queue size:

var countLines = 0;
var bulkString = "";
var queueSize = 3; // maximum of 3 requests in flight to the Elasticsearch server
var batchesAlreadyInQueue = 0;
var instream = fs.createReadStream('filepath.txt');
var rl = readline.createInterface(instream, outstream);
rl.on('line', function(line) {
    countLines++;
    bulkString += line;
    if (countLines >= 5000) {
        // send bulkString via the bulk API
        batchesAlreadyInQueue++; // one more request is now waiting for a response
        client.bulk({
            index: 'indexName',
            type: 'type',
            body: [bulkString]
        }, function (error, response) {
            // task done
            batchesAlreadyInQueue--; // decrease the count of pending requests when Elasticsearch answers one of them
            rl.resume();
        });
        if (batchesAlreadyInQueue >= queueSize) {
            rl.pause();
        }
        countLines = 0;
        bulkString = "";
    }
});
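For reference, on current Node.js versions (10 and later) you can get the same "wait before reading more" behavior with async iteration over the readline interface, because the loop body does not ask for the next line until its await has resolved. The sketch below is only an illustration, not part of the original answer: it assumes an Elasticsearch client whose bulk() returns a promise when no callback is given, and, like the question's code, it assumes each line of the file is already formatted as a valid bulk action; the client, index name, and batch size are placeholders taken from the example above.

const fs = require('fs');
const readline = require('readline');

async function indexFile(filePath, client) {
    const rl = readline.createInterface({
        input: fs.createReadStream(filePath),
        crlfDelay: Infinity // treat \r\n as a single line break
    });

    let batch = [];
    for await (const line of rl) {
        batch.push(line);
        if (batch.length >= 5000) {
            // awaiting here means no further lines are processed
            // until Elasticsearch has acknowledged this batch
            await client.bulk({ index: 'indexName', type: 'type', body: batch });
            batch = [];
        }
    }
    if (batch.length > 0) {
        await client.bulk({ index: 'indexName', type: 'type', body: batch }); // flush the last partial batch
    }
}

If you still want a few requests in flight at the same time, as in the pause()/resume() version above, you would collect the bulk promises into an array and await them in groups instead of one by one.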
