zhuoyuan-liu
Contributor

Implement batch creation of node queries to handle more than 10,000 nodes efficiently.

It's a quick fix, and I think it would be a good idea to apply the batch size as a global setting in the DB config.

It can be done by:

db, err := gorm.Open(postgres.Open(dsn), &gorm.Config{
    CreateBatchSize: 1000, 
})

@javuto javuto added the queries (On-demand queries related issues) and backend (Backend related issues) labels Mar 27, 2025
@javuto javuto requested review from Copilot and javuto March 27, 2025 15:48
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR addresses an error encountered when creating node queries for large datasets by applying batch creation to prevent performance issues.

  • Replaces q.DB.Create with q.DB.CreateInBatches using a hard-coded batch size (1000).
  • Aims to improve performance for datasets with more than 10,000 nodes.
Comments suppressed due to low confidence (1)

pkg/queries/queries.go:411

  • [nitpick] Consider using a global configuration variable for the batch size instead of hardcoding 1000, in order to align with the intended DB configuration.
if err := q.DB.CreateInBatches(&nodeQueries, 1000).Error; err != nil {
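
One way to act on this nitpick is to read the batch size from configuration and fall back to the current value when nothing is set. The sketch below is only an illustration under assumptions: `DBConfig`, its `CreateBatchSize` field, and the `BatchSize` method are hypothetical names, not part of this project or of GORM.

```go
package main

import "fmt"

// defaultBatchSize mirrors the value hard-coded in this PR.
const defaultBatchSize = 1000

// DBConfig is a hypothetical settings struct; the project's real
// configuration type may look different.
type DBConfig struct {
	// CreateBatchSize is the number of rows per INSERT statement.
	// Zero or a negative value means "use the default".
	CreateBatchSize int
}

// BatchSize returns the configured batch size, falling back to the default.
func (c DBConfig) BatchSize() int {
	if c.CreateBatchSize <= 0 {
		return defaultBatchSize
	}
	return c.CreateBatchSize
}

func main() {
	fmt.Println(DBConfig{}.BatchSize())                     // 1000
	fmt.Println(DBConfig{CreateBatchSize: 500}.BatchSize()) // 500
}
```

With something like this in place, the call site could become `q.DB.CreateInBatches(&nodeQueries, cfg.BatchSize())`, where `cfg` is an assumed value of the hypothetical `DBConfig` type.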

@javuto
Collaborator

javuto commented Mar 27, 2025

Is there something missing in this PR?

@javuto
Collaborator

javuto commented Mar 27, 2025

Never mind, I found the documentation: https://gorm.io/docs/create.html#Batch-Insert

@javuto
Collaborator

javuto commented Mar 27, 2025

If we apply the configuration using the gorm.Config statement, then every Create will be affected by that batch size. Not sure if that could have a negative impact or not.

@zhuoyuan-liu
Contributor Author

Here is the source code:
[two screenshots of source code, not reproduced here]

When creating a single value, it behaves the same as a regular create operation. I don't see batch creation anywhere else in the database operations, so this change should have no impact on them.
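
To make that reasoning concrete, here is a simplified model of the dispatch, written with the standard library only. It is a sketch of the idea, not GORM's code: `insertPlan` and `NodeQuery` are hypothetical names, and the model assumes that a non-slice value is always inserted as one row while a slice is split into chunks of at most the batch size.

```go
package main

import (
	"fmt"
	"reflect"
)

// NodeQuery is a hypothetical stand-in for the real model.
type NodeQuery struct {
	NodeID uint
}

// insertPlan returns the number of rows each INSERT statement would carry.
// Simplified model: a non-slice value is a single row, and a slice is
// split into chunks of at most batchSize rows.
func insertPlan(value any, batchSize int) []int {
	v := reflect.Indirect(reflect.ValueOf(value))
	if v.Kind() != reflect.Slice && v.Kind() != reflect.Array {
		return []int{1}
	}
	n := v.Len()
	if batchSize <= 0 {
		batchSize = n // no batching: everything in one statement
	}
	var plan []int
	for start := 0; start < n; start += batchSize {
		end := start + batchSize
		if end > n {
			end = n
		}
		plan = append(plan, end-start)
	}
	return plan
}

func main() {
	single := NodeQuery{NodeID: 1}
	many := make([]NodeQuery, 10500)

	fmt.Println(len(insertPlan(&single, 1000))) // 1 statement
	fmt.Println(len(insertPlan(&many, 1000)))   // 11 statements
}
```

Under this model, a global batch size only changes behavior for slices longer than the batch size; single-value creates still produce exactly one statement.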

@javuto javuto merged commit 4292fae into jmpsec:main Mar 28, 2025
25 checks passed
@zhuoyuan-liu zhuoyuan-liu deleted the fix-batch branch April 8, 2025 07:52