Implement parse data from AsyncIterator #447

Open
@uasan

Hello, it's me again )

The streams work well, but I decided to optimize further by removing every unnecessary abstraction and keeping only the native JS AsyncIterator. The result is worth it.

I propose adding a simple new interface for parsing data from an asynchronous iterator (generator). It performs very well and the implementation is very simple.

User land example:

import { parse } from 'csv-parse/iterator';

async function* iterator() {
  try {
    yield Buffer.from('A,B,C\n');
  } catch (error) {
    console.error(error);
  }
}

for await (const records of parse(iterator()))
  console.log(records);

Library implementation:

async function* parse(iterator) {
  let result = null;
  const setResult = records => {
    result = records;
  };

  for await (const chunk of iterator) {
    const error = api.parse(chunk, false, setResult);

    if (error) {
      await iterator.throw(error);
    } else if (result) {
      yield result;
      result = null;
    }
  }

  // Flush
  const error = api.parse(undefined, true, setResult);

  if (error) {
    await iterator.throw(error);
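  } else if (result) {
    // Yield rather than return: `for await` discards a generator's return value,
    // so returning here would silently drop the last batch of records.
    yield result;
  }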
}
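
To make the pattern above easy to try in isolation, here is a minimal, self-contained sketch. It is not the library's code: `createToyParser` is a hypothetical stand-in for the internal `api.parse` (it only splits on newlines and commas, with no quoting support), and `parseChunks` and `chunks` are illustrative names. As a simplification, errors are thrown to the consumer instead of being forwarded with `iterator.throw()`.

```javascript
// Hypothetical stand-in for the library's internal `api.parse`: a tiny
// stateful parser that splits on newlines and commas (no quoting support).
function createToyParser() {
  let pending = '';
  return {
    // Same call shape as in the proposal: (chunk, isLast, push) -> Error | undefined
    parse(chunk, isLast, push) {
      if (chunk !== undefined) pending += chunk.toString();
      const lines = pending.split('\n');
      // Keep the trailing partial line until more data (or the flush) arrives.
      pending = isLast ? '' : lines.pop();
      const records = lines
        .filter(line => line !== '')
        .map(line => line.split(','));
      if (records.length > 0) push(records);
      return undefined;
    },
  };
}

async function* parseChunks(source) {
  const api = createToyParser();
  let result = null;
  const setResult = records => {
    result = records;
  };

  for await (const chunk of source) {
    const error = api.parse(chunk, false, setResult);
    if (error) throw error; // simplification: surface the error to the consumer
    if (result) {
      yield result;
      result = null;
    }
  }

  // Flush: yield (not return) so that `for await` consumers see the last batch.
  const error = api.parse(undefined, true, setResult);
  if (error) throw error;
  if (result) yield result;
}

// Example source: a record split across two chunks, and no trailing newline.
async function* chunks() {
  yield 'A,B,C\n1,2';
  yield ',3\n4,5,6';
}
```

Consuming `parseChunks(chunks())` with `for await` yields one batch of records per chunk, plus a final batch from the flush step for the unterminated last line.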

Asynchronous iterators are a great fit for this task: they work in all JavaScript environments and consume less memory and CPU than stream implementations.

Many sources are already available as asynchronous iterators, and streams generally expose an async iterator interface as well. Here is an example with fetch:

import { parse } from 'csv-parse/iterator';

const response = await fetch('file.csv');

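// The body (a ReadableStream) is the async iterable, not the Response itself.
// This requires a runtime where ReadableStream supports async iteration.
for await (const record of parse(response.body[Symbol.asyncIterator]())) {
  console.log(record);
}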
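
One detail to keep in mind: the library sketch above calls `iterator.throw(error)`, but `throw()` is an optional part of the async iterator protocol. Generators implement it, while iterators obtained from other sources may only offer `next()` and `return()`. A possible guard could look like the following, where `forwardError` is a hypothetical helper name and not part of csv-parse:

```javascript
// Hypothetical helper: forward an error to the source if it supports throw(),
// otherwise close the source and rethrow the error to the consumer.
async function forwardError(iterator, error) {
  if (typeof iterator.throw === 'function') {
    // Generators: let the source's own try/catch observe the error.
    return iterator.throw(error);
  }
  if (typeof iterator.return === 'function') {
    // Other iterators: release the underlying resource first.
    await iterator.return();
  }
  throw error;
}
```

With this in place, `await forwardError(iterator, error)` could replace the direct `await iterator.throw(error)` calls, so that a source without `throw()` is closed cleanly instead of failing with a `TypeError`.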

According to my local measurements, async iterators are 20% faster than streams. For users, writing a generator function is also much quicker than coding against stream interfaces.

If you like it, I can open a PR.
