[WIP] Add include_for_find support via 'expand' #877
@@ -185,6 +185,7 @@ def collection_filterer(res, type, klass, is_subcollection = false)
 options[:filter] = miq_expression if miq_expression
 options[:offset] = params['offset'] if params['offset']
 options[:limit] = params['limit'] if params['limit']
+options[:include_for_find] = determine_include_for_find(klass)
😍
Nice! Do you have some performance comparisons? |
@NickLaMuro do we have to opt-in or something? I don't see improvements with applying this change on this request: Before:
After:
|
@jrafanie you won't see an improvement with that request (attributes=...) as this enhancement is for expanding subcollections. i.e. expand=... see method used https://github.com/ManageIQ/manageiq-api/pull/877/files#diff-8caecf6616f6fd1e2602842ad7dd51f0R108. Once this is expanded to also include virtual attributes and associations (via attributes), then you'd see the improvement with your query. |
@abellotti no, I don't think that is correct. The N+1 was happening because However, I think the issue is that @jrafanie is seeing I was probably testing locally with this patch also applied: So I think they were working together to get a better result overall, and without it, the N+1 causing a refetch of the VMs obliterates any This report output is a single run where I was modifying a few things in place:
(ignore the And the last one, I tweaked the I will mark this PR as |
If the intent is to know what to prefetch for a virtual attributes specified in attributes, looking at expand does not seem right as callers will not be specifying those via expand (only for well defined subcollections). Is there a way to find what to preload for a virtual attribute specified via attributes ? |
@abellotti Thanks, I figured you would have had some issue with how I was designing this 😄
Not really, or at least it wasn't my intent, though it could be. The idea was more that this was overloading |
I can investigate more today what this would take, though it is significantly harder to do with This was the explicit approach, where a user has to opt in to do it, whereas the latter would be more complex, but would be automatically applied. |
As LJ showed, without my patch from #874 in place, I got the following performance numbers:
$ bundle exec miqperf benchmark -ac 5 "/api/vms?expand=resources,hardware&attributes=num_cpu,name"
D, [2020-07-29T15:16:48.277412 #13338] DEBUG -- : --> logging in...
D, [2020-07-29T15:16:54.638950 #13338] DEBUG -- : --> making GET request: /api/vms?expand=resources,hardware&attributes=num_cpu,name
D, [2020-07-29T15:16:59.451455 #13338] DEBUG -- : --> making GET request: /api/vms?expand=resources,hardware&attributes=num_cpu,name
D, [2020-07-29T15:17:02.135731 #13338] DEBUG -- : --> making GET request: /api/vms?expand=resources,hardware&attributes=num_cpu,name
D, [2020-07-29T15:17:04.765458 #13338] DEBUG -- : --> making GET request: /api/vms?expand=resources,hardware&attributes=num_cpu,name
D, [2020-07-29T15:17:07.750053 #13338] DEBUG -- : --> making GET request: /api/vms?expand=resources,hardware&attributes=num_cpu,name
$ bundle exec miqperf report --last
/api/vms
| ms | queries | query (ms) | rows |
| ---: | ---: | ---: | ---: |
| 4241 | 1118 | 527.5 | 3147 |
| 2192 | 1116 | 460.4 | 1665 |
| 2132 | 1116 | 512.0 | 1665 |
| 2463 | 1116 | 469.7 | 1665 |
| 2592 | 1116 | 549.0 | 1665 |
However, after applying the patch manually to that branch in the API:
$ git apply <(curl -L https://github.com/ManageIQ/manageiq-api/pull/874.patch)
$ git apply <(curl -L https://github.com/ManageIQ/manageiq-api/pull/877.patch)
I got the numbers I expected:
$ bundle exec miqperf benchmark -ac 5 "/api/vms?expand=resources,hardware&attributes=num_cpu,name"
D, [2020-07-29T15:18:04.040054 #13731] DEBUG -- : --> logging in...
D, [2020-07-29T15:18:09.667021 #13731] DEBUG -- : --> making GET request: /api/vms?expand=resources,hardware&attributes=num_cpu,name
D, [2020-07-29T15:18:11.642624 #13731] DEBUG -- : --> making GET request: /api/vms?expand=resources,hardware&attributes=num_cpu,name
D, [2020-07-29T15:18:12.000060 #13731] DEBUG -- : --> making GET request: /api/vms?expand=resources,hardware&attributes=num_cpu,name
D, [2020-07-29T15:18:12.357015 #13731] DEBUG -- : --> making GET request: /api/vms?expand=resources,hardware&attributes=num_cpu,name
D, [2020-07-29T15:18:12.777287 #13731] DEBUG -- : --> making GET request: /api/vms?expand=resources,hardware&attributes=num_cpu,name
$ bundle exec miqperf report --last
/api/vms
| ms | queries | query (ms) | rows |
| ---: | ---: | ---: | ---: |
| 1738 | 12 | 35 | 2041 |
| 260 | 10 | 9.8 | 559 |
| 262 | 10 | 9.2 | 559 |
| 257 | 10 | 9.6 | 559 |
| 259 | 10 | 10.0 | 559 | |
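As a rough summary of the two report tables above, the arithmetic can be sketched in a few lines of Ruby. The numbers are copied from the tables, skipping the first request of each run, which is visibly slower than the rest. The helper names `mean` and `speedup` are only illustrative.

```ruby
# Average of a list of numbers, as a float.
def mean(values)
  values.sum.to_f / values.size
end

# How many times faster the "after" timings are than the "before" timings.
def speedup(before_ms, after_ms)
  mean(before_ms) / mean(after_ms)
end

# Timings (ms) for requests 2-5, copied from the two report tables.
before_ms = [2192, 2132, 2463, 2592] # without #874 and #877
after_ms  = [260, 262, 257, 259]     # with #874 and #877 applied

puts format("mean before: %.1f ms", mean(before_ms))
puts format("mean after:  %.1f ms", mean(after_ms))
puts format("speedup:     %.1fx", speedup(before_ms, after_ms))
puts "queries:     1116 -> 10" # query counts for the same requests
```

The drop in query count (1116 to 10) is the more telling figure, since it is what indicates that the N+1 is gone.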
98e33df
to
ddd28eb
I've confirmed the patches on jansa improve the results as @NickLaMuro reported...
|
FYI: I am working on getting specs put together for this patch, but there were also some concerns I raised about |
d6e3f60
to
d2270d4
@NickLaMuro tests are failing, can you take a look? |
Also, is this still WIP now that #874 is merged? |
Yes I can. I was looking into some other potential improvements all of yesterday, so I will give this a look today and finish anything that needs to be finished up. |
@jrafanie Okay, there are a few things here:
So the test that is failing is the one that I wrote, and I am not sure it is worth keeping now that it already isn't working. Basically, I am testing that the number of queries executed is what I expect. It is possible we could test in a different way, where we only match the SELECTs that we want to test instead of what I was doing, which was ignoring the first N queries that generally run on all requests (see However, before even answering that question...
I forgot there was another reason for the WIP flag, and that was I wanted to have a larger discussion of whether or not this is the right architectural approach to take for this, or at least the one we want to take. I think somehow using I don't think that has been answered since though, so I think this is still WIP. |
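The alternative test strategy mentioned above (counting only the SELECTs of interest rather than skipping the first N queries) could look roughly like the sketch below. It is plain Ruby working on an array of SQL strings. In a Rails spec that array would typically be collected by subscribing to the `sql.active_record` notification. The helper name `count_selects` and the sample SQL are hypothetical and not taken from this PR.

```ruby
# Counts only the SELECT statements that read from one of the given tables,
# ignoring everything else (session lookups, settings, and so on).
def count_selects(sql_log, tables)
  names   = tables.map { |t| Regexp.escape(t) }.join("|")
  pattern = /\ASELECT\b.*\bFROM\s+"?(?:#{names})\b"?/im
  sql_log.count { |sql| sql.match?(pattern) }
end

# Hypothetical captured SQL, for illustration only.
sql_log = [
  'SELECT "users".* FROM "users" WHERE "users"."id" = 1',
  'SELECT "vms".* FROM "vms" LIMIT 10',
  'SELECT "hardwares".* FROM "hardwares" WHERE "hardwares"."vm_or_template_id" IN (1, 2)',
  'UPDATE "sessions" SET "updated_at" = now()'
]

p count_selects(sql_log, %w[vms hardwares]) # => 2
```

Matching on table names keeps the expectation stable when unrelated per-request queries are added or removed.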
Re: #877 (comment), the concern being: since we're overloading the meaning of expand, i.e. expand (and return) a subcollection and now also prefetch an association, a problem emerges if a subcollection has the same name as an association. The user specifying expand= would get the perf boost, but would also get the subcollection returned (not what we want). While we can go through all currently exposed subcollections (via OPTIONS /api/:collection) and verify that this is not a problem today, there is no guarantee that it won't bite us later. Hopefully there's another option we can contemplate. |
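The collision described above could be checked mechanically. The following is a minimal sketch, assuming the subcollection and association names for a collection are already available as two lists. The helper name `ambiguous_expand_names` and the sample data are hypothetical and do not reflect the real API configuration.

```ruby
# Returns the names that are both an exposed subcollection and an association,
# i.e. the cases where `expand=<name>` would be ambiguous.
def ambiguous_expand_names(subcollections, associations)
  subcollections.map(&:to_s) & associations.map(&:to_s)
end

# Hypothetical example data, not taken from the real API configuration.
subcollections = %w[tags policies snapshots]
associations   = %i[hardware snapshots host]

p ambiguous_expand_names(subcollections, associations) # => ["snapshots"]
```

A check like this only covers what is exposed today. It does not remove the risk of a later addition reintroducing the ambiguity.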
Definitely, as I mentioned before:
Doing that is definitely another option, but this was probably meant to be more of a [POC] PR (which I probably should have labeled it as) and show that doing I would be happy to look into doing just that in a different branch, assuming making such an addition is something you agree with, because I will already be re-working this PR again as a result, and I don't fancy doing it a third time if you didn't like the proposed concept from the beginning. Edit: "proposed concept" being adding a new URL param |
Oh crap, I forgot you had mentioned that previously:
My bad. I am fine with either approach, I just tend to favor "performance knobs" since there is less of a chance it is breaking existing functionality or user queries in doing so, and if they need to be backported, they can be (safely). However, I am fine moving forward with an implicit approach as well. |
Allows for opt in of the `:include_for_find` option for `Rbac::Filterer` via the `expand` request parameter. Include for find will effectively add a `.includes` to the base query done in `collection_search`, which in cases like this: /api/vms?expand=resources,hardware&attributes=num_cpu,name can avoid an N+1 on the associated resource.
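To illustrate the N+1 that this commit message refers to, here is a minimal, self-contained sketch in plain Ruby. It does not use ActiveRecord or `Rbac::Filterer`. `FakeDb`, `build_db`, `fetch_lazily` and `fetch_with_include` are hypothetical stand-ins whose only job is to count how many "queries" each strategy issues.

```ruby
# Hypothetical in-memory stand-in for a database; it only counts "queries".
class FakeDb
  attr_reader :query_count

  def initialize(vms, hardwares)
    @vms = vms
    @hardwares = hardwares
    @query_count = 0
  end

  # One query returning every VM row.
  def all_vms
    @query_count += 1
    @vms
  end

  # One query returning the hardware row for a single VM.
  def hardware_for(vm_id)
    @query_count += 1
    @hardwares.find { |h| h[:vm_id] == vm_id }
  end

  # One query returning the hardware rows for many VMs at once.
  def hardwares_for(vm_ids)
    @query_count += 1
    @hardwares.select { |h| vm_ids.include?(h[:vm_id]) }
  end
end

def build_db(count)
  vms       = (1..count).map { |i| {:id => i, :name => "vm#{i}"} }
  hardwares = (1..count).map { |i| {:vm_id => i, :num_cpu => 2} }
  FakeDb.new(vms, hardwares)
end

# N+1: one query for the VMs, then one more per VM for its hardware.
def fetch_lazily(db)
  db.all_vms.map { |vm| vm.merge(:num_cpu => db.hardware_for(vm[:id])[:num_cpu]) }
end

# Preloaded: one query for the VMs, one for all of their hardware.
def fetch_with_include(db)
  vms   = db.all_vms
  by_vm = db.hardwares_for(vms.map { |vm| vm[:id] }).group_by { |h| h[:vm_id] }
  vms.map { |vm| vm.merge(:num_cpu => by_vm[vm[:id]].first[:num_cpu]) }
end

lazy_db = build_db(5)
fetch_lazily(lazy_db)
puts lazy_db.query_count  # 1 query for VMs + 5 for hardware

eager_db = build_db(5)
fetch_with_include(eager_db)
puts eager_db.query_count # 1 query for VMs + 1 for hardware
```

The lazy variant grows by one query per VM, while the preloaded variant stays at two queries regardless of how many VMs are returned.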
d2270d4
to
886dd97
Checked commit NickLaMuro@886dd97 with ruby 2.5.7, rubocop 0.69.0, haml-lint 0.28.0, and yamllint app/controllers/api/base_controller/parameters.rb
spec/support/shared_examples/expand_resources_includes.rb
|
I admit not reading the whole thread, so apologies if I'm repeating, but offhand, there's no way an end user will know that adding hardware will speed up num_cpu, so that doesn't feel like a good interface. This is what :uses is for. Under the covers, we can ask what num_cpu uses, and then tack that into the includes chain, or even better just include num_cpu directly, which in turn will include the uses clause. Either way we can know what to include without putting the onus on the end user. I am generally against more knobs because that just increases complexity for the end user. So I'd be against adding anything to the interface if we can auto-determine things. If we can't, then I'd be ok with it, but I'd rather be sure we really need it. |
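The implicit approach suggested above can be sketched as a lookup from requested attributes to the associations they rely on. This is only an illustration in plain Ruby. `VIRTUAL_ATTRIBUTE_USES` and `includes_for_attributes` are hypothetical names, and the table is hard-coded here, whereas in the application the information would come from each virtual attribute's `:uses` declaration.

```ruby
# Hypothetical lookup table: virtual attribute name => associations it relies on.
# Hard-coded for illustration; the real data would come from `:uses`.
VIRTUAL_ATTRIBUTE_USES = {
  "num_cpu" => [:hardware]
}.freeze

# Returns the associations to preload for a comma-separated `attributes`
# parameter. Plain columns (such as "name") contribute nothing.
def includes_for_attributes(attributes_param)
  attributes_param.to_s.split(",").flat_map do |attr|
    VIRTUAL_ATTRIBUTE_USES.fetch(attr.strip, [])
  end.uniq
end

p includes_for_attributes("num_cpu,name") # => [:hardware]
p includes_for_attributes("name")         # => []
```

With something along these lines, a request for `attributes=num_cpu,name` would preload `hardware` without the caller having to know about it.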
yeah, I agree with @abellotti and @Fryguy in that if we correctly specify the |
Opened up #887 which I think fits what everyone else would like out of this one, so closing this one. |
Note: Requires #874 to be fully effective.
Allows for opt in of the `:include_for_find` option for `Rbac::Filterer` via the `expand` request parameter.
Include for find will effectively add a `.includes` to the base query done in `collection_search`, which in cases like this:
/api/vms?expand=resources,hardware&attributes=num_cpu,name
can avoid an N+1 on the associated resource (in the above case, `hardware`).
Example response
For the above query
Performance Results
Note that to gain any performance benefit from this PR using the `/api/vms` endpoint, #874 must be included as well.
Before
After #874 only (not this PR)
After #874 + #877
Links