Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performance of reading cache entries #198

Merged
merged 7 commits into from
Sep 2, 2024

Conversation

djfpaagman
Copy link
Contributor

@djfpaagman djfpaagman commented Aug 22, 2024

I saw the release of 1.0.0 and noticed the code for reading queries from the database.

I found it a curious approach (took a while to figure out what the gsub did) and decide to have a try at making it more readable and in the end I also had some fun benchmarking the query and made it about 50% faster!


Here's my benchmark script:

require 'benchmark/ips'

require_relative "./test/test_helper"

Benchmark.ips do |x|
  x.report("Entry.read") { SolidCache::Entry.read('bla') }
  x.report("Entry.read_multi") { SolidCache::Entry.read_multi(['bla', 'blabla']) }
end

And here are some of my results, ran on a 2021 MBP M1 Pro.

With SQLite:

What Entry.read (i/s) % Entry.read_multi (i/s) %
Current implementation 15.682k (± 2.8%) - 14.543k (± 4.8%) -
With direct AR query (1) 9.958k (± 3.0%) -36.5% 8.910k (± 3.7%) -38.7%
With encrypted column (2) 16.871k (± 3.1%) +7,6% 14.972k (± 5.6%) +2,94%
With regular column (3) 19.740k (± 4.9%) +25,9% 18.320k (± 7.2%) +26,0%

With Postgres (running from the Docker container) there's a even bigger performance gain, although the variability is also very high and the overal numbers are way lower, so not sure how reliable the numbers are. They are consistently higher when running the benchmark multiple times, though.

What Entry.read (i/s) % Entry.read_multi (i/s) %
Current implementation 1.029k (±24.1%) - 1.025k (±23.0%) -
With regular column (3) 1.818k (±42.6%) +76,7% 1.921k (±33.3%) +87,4%

(1) I decided to see how it would perform with a plain and simple Active Record query, also as a reference:

def read_multi(keys)
  without_query_cache do
    where(key_hash: key_hashes_for(keys)).pluck(:key, :value).to_h
  end
end

(2) with encrypted columns it uses the current implementation, so that performance will be similar to the current.

(3) is the code as in this PR.

The main performance gain is gained by querying directly and not instantiating any SolidCache::Entry objects when the data is not encrypted.


Curious to hear your thoughts, happy to make improvements in any direction.

@djfpaagman
Copy link
Contributor Author

Oh this fails miserably for the encrypted and unprepared_statements setups, I will have another look at it later.

We can't assume prepared statements are available, so we go through
`Arel.sql` to construct the query, which will handle that part,
depending on availablity.
The different databases don't like the manually constructed query, AR
takes very nice care of that.
They need to be cast to values to be deserialized
@djfpaagman
Copy link
Contributor Author

I've changed it quite a bit since my original PR. I left the commits for now as a paper trail on how I got to the current solution, happy to squash them later.

What's changed is that it's hard to manually construct a query string that works on all databases, so I've reverted it to use AR again (with some minor changes) and added a comment that explains how it works.

It now uses a more performant way to grab the data from the database if there's no encryption, else it will use the current implementation.

There are some benchmarks (updated as well) in the description above.

Copy link
Collaborator

@djmb djmb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@djfpaagman - this is great, thank you for this!

If you can make the change to pass the types when casting, I think we are good to go 👍

app/models/solid_cache/entry.rb Outdated Show resolved Hide resolved
@djfpaagman
Copy link
Contributor Author

Sweet, I didn't realize you could also that do that! All tests are green now, do you want me to squash all commits before you merge it?

@djmb djmb merged commit b610333 into rails:main Sep 2, 2024
19 checks passed
@djfpaagman djfpaagman deleted the performance_read_queries branch September 4, 2024 15:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants