Skip to content

Conversation

@jacoscaz
Copy link
Collaborator

This PR aims to extend the gains found in #48 to the wrap function, skipping any kind of buffering when possible (i.e. when no transform options are provided). @jeswr suggested this in #44 .

@RubenVerborgh
Copy link
Owner

Do we have any performance evidence here?

@jacoscaz
Copy link
Collaborator Author

With main as it is right now, after merging #46 , going through an array of 200k items with wrap(new ArrayIterator(arr)) takes ~103 ms (MacBook Pro, Apple Silicon, M1, 16 GB RAM). With this PR that goes down to ~10 ms, so roughly 10x faster.

@coveralls
Copy link

coveralls commented Mar 26, 2022

Coverage Status

Coverage remained the same at 100.0% when pulling 1030891 on jacoscaz:faster-wrapping into 358608f on RubenVerborgh:main.

@jeswr jeswr mentioned this pull request Mar 27, 2022
@jacoscaz
Copy link
Collaborator Author

@jeswr this PR now includes your work from #55 , as discussed on Gitter.

@jeswr @RubenVerborgh I've tried reworking the wrap() function to return the most appropriate iterator for all sorts of sources while maintaining backward compat. with the current implementation. Seems to work well but looking forward to your feedbacks. This PR now builds on #48 , diff here jacoscaz/AsyncIterator@faster-filtering-mapping...jacoscaz:faster-wrapping .

@jacoscaz
Copy link
Collaborator Author

I realize that the resulting PR might get pretty big but perhaps we could merge this one into #48 so as to have all of the recent performance work in one branch. That would simplify testing in other projects.

@jeswr
Copy link
Collaborator

jeswr commented Mar 27, 2022

I think the _wrap function still needs an option where you can select to run the isIterable tests first within the wrap function.

I'm also concerned that this is a little heavyweight if you do lots of wrapping, for instance within the map or transform of another iterator. I think it's worth having custom wrap functions for each of those source types that the user can choose to use if they know the nature of what they are wrapping in advance.

Great work on this btw @jacoscaz !

@jacoscaz
Copy link
Collaborator Author

@jeswr I should have addressed both points in my latest commit. The names are a bit confusing, though, particularly the fromIterator function (for ES2015 Iterator) and fromIteratorLike function (for AsyncIterator, stream.Readable, ...). Suggestions?

@jeswr
Copy link
Collaborator

jeswr commented Mar 28, 2022

@jacoscaz , all of the iteratorLike stuff could potentially be AsyncIteratorLike? Because the fromIterator is strictly for ES2015 synchronous iterables?

@jacoscaz
Copy link
Collaborator Author

All right, done! Let's see what @RubenVerborgh says; if things look good I'll add tests for both this and #48 and we should be all set.

Comment on lines +1331 to +1374
map<T>(map: (item: D) => T, self?: any): AsyncIterator<T> {
return new MultiMapFilterTransformIterator(this._source, {
filter: false,
function: self ? map.bind(self) : map,
next: {
filter: false,
function: this._map,
},
});
}

filter(filter: (item: D) => boolean, self?: any): AsyncIterator<D> {
return new MultiMapFilterTransformIterator(this._source, {
filter: true,
function: self ? filter.bind(self) : filter,
next: {
filter: false,
function: this._map,
},
});
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per this comment it still may be worth removing these lines. @jacoscaz it might be worth doing some perf. tests on your machine to see if your results align with mine.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the cause of the marginal slowdown observed there was the fact that we are essentially adding an extra function for every map operation in order to compose them (item: any) => t(_func(item));

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curiously enough, I am getting 10% - 15% faster iteration by keeping those lines in. Removing them seems to slow things down. On some level this makes sense as V8 should optimize execution of the "compiled" transformation more effectively. Another differentiating factor might be the number of item vs. the number of iterators. These lines likely produce higher setup latencies, so they might have a more significant impact on scenarios with low numbers of items and high numbers of iterators. I would keep these in and postpone further tuning to a later stage as we're currently lacking dedicated scaffolding for this sort of testing.

…atorLike, adds option to prioritize ES2015 Iterable and Iterator while wrapping
@jacoscaz
Copy link
Collaborator Author

jacoscaz commented Mar 28, 2022

Happy to report that not only all tests are passing but with this branch we're not that far from having all tests in Comunica pass, too! There's only 8 tests that are breaking. Issues so far:

@jacoscaz
Copy link
Collaborator Author

Superseded by #57

@jacoscaz jacoscaz closed this Mar 30, 2022
@jacoscaz jacoscaz deleted the faster-wrapping branch May 9, 2022 08:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants