Description
Hello, it's me again )
The streams work well, but I decided to optimize further by removing all unnecessary abstractions, leaving only native JS (AsyncIterator), and the result is worth it.
I propose adding a new, simple interface for parsing data from an asynchronous iterator (generator); it offers great performance and a very simple implementation.
User-land example:
import { parse } from 'csv-parse/iterator';

async function* iterator() {
  try {
    yield Buffer.from('A,B,C\n');
  } catch (error) {
    console.error(error);
  }
}

for await (const records of parse(iterator())) {
  console.log(records);
}
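For illustration, here is the same proposed API with a record split across chunks; the parser buffers the partial row until the next chunk (or the final flush) completes it. The chunks generator name is just for this example:

import { parse } from 'csv-parse/iterator';

async function* chunks() {
  // The second row arrives split across two chunks.
  yield Buffer.from('A,B,C\n1,');
  yield Buffer.from('2,3\n');
}

for await (const records of parse(chunks())) {
  console.log(records);
}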
Library implementation:
async function* parse(iterator) {
  let result = null;
  const setResult = records => {
    result = records;
  };
  for await (const chunk of iterator) {
    // Feed the chunk to the parser; records are delivered via setResult.
    const error = api.parse(chunk, false, setResult);
    if (error) {
      // Report the error back into the source generator.
      await iterator.throw(error);
    } else if (result) {
      yield result;
      result = null;
    }
  }
  // Flush whatever the parser is still buffering.
  const error = api.parse(undefined, true, setResult);
  if (error) {
    await iterator.throw(error);
  } else if (result) {
    // Yield (not return) so `for await` consumers see the final records.
    yield result;
  }
}
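For anyone trying the pattern outside the library, here is a hypothetical stand-in for the internal api object used above. It only splits complete lines on commas, but it honors the same parse(chunk, flush, setResult) contract the generator relies on:

const api = {
  buffered: '',
  parse(chunk, flush, setResult) {
    if (chunk !== undefined) this.buffered += chunk.toString();
    const lines = this.buffered.split('\n');
    // Keep an incomplete trailing line buffered until the next chunk or flush.
    this.buffered = flush ? '' : lines.pop();
    const records = lines.filter(line => line !== '').map(line => line.split(','));
    if (records.length > 0) setResult(records);
    return undefined; // a real parser would return an Error on invalid input
  },
};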
Asynchronous iterators are a great fit for this task: they work in all JavaScript environments and, in my measurements, consume less memory and CPU than the stream implementation.
In fact, many data sources already come in the form of asynchronous iterators, and all streams expose an async-iterator interface. Here is an example with fetch:
import { parse } from 'csv-parse/iterator';

const response = await fetch('file.csv');
// The Response itself is not async iterable; its body stream is.
for await (const record of parse(response.body[Symbol.asyncIterator]())) {
  console.log(record);
}
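The same works for Node.js streams, which implement Symbol.asyncIterator directly (a sketch; whether a given stream's iterator also supports throw for error reporting depends on the source):

import { createReadStream } from 'node:fs';
import { parse } from 'csv-parse/iterator';

for await (const record of parse(createReadStream('file.csv')[Symbol.asyncIterator]())) {
  console.log(record);
}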
According to my local measurements, async iterators are about 20% faster than streams, and at the user level, writing a generator function is much simpler than coding against stream interfaces.
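The rough shape of that comparison (a sketch, not the exact benchmark I ran; the csv-parse/iterator import is the proposed API and the input size is arbitrary):

import { Readable } from 'node:stream';
import { parse as parseStream } from 'csv-parse';
import { parse as parseIterator } from 'csv-parse/iterator'; // proposed API

const csv = Buffer.from('A,B,C\n'.repeat(1_000_000));

let start = process.hrtime.bigint();
for await (const record of Readable.from([csv]).pipe(parseStream())) {
  // consume
}
console.log(`streams:  ${(process.hrtime.bigint() - start) / 1_000_000n} ms`);

async function* source() {
  yield csv;
}

start = process.hrtime.bigint();
for await (const records of parseIterator(source())) {
  // consume
}
console.log(`iterator: ${(process.hrtime.bigint() - start) / 1_000_000n} ms`);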
If you like it, I can open a PR.