How To Use Django bulk_update

Hi Guys!
In our previous lesson, we learnt how to use bulk_create in Django and saw its benefits over the usual way of saving objects in bulk. If you have not seen that, kindly click here to read it.

In this lesson, we are going to discuss how to use bulk_update in Django!

How to use bulk_update

Django’s bulk_update has the same goal as bulk_create: to reduce the number of queries to the database. Time to say goodbye to looping through and updating one by one! 🎉

bulk_update(objs, fields, batch_size=None)

  • objs – List of objects that should be updated.
  • fields – Defines the fields you want to update.
  • batch_size – Defines how many instances are updated in a single database query. Defaults to None, which updates everything in one batch (see the sketch just below).
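
As a quick illustration, here is a minimal sketch (not this lesson’s example) of all three arguments together, using the Blogs model defined just below; batch_size caps how many rows each query touches:

posts = list(Blogs.objects.all())
for post in posts:
    post.title = "Updated title"

# One bulk_update call, sent to the database in chunks of 500 objects per query
Blogs.objects.bulk_update(posts, ['title'], batch_size=500)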

Example:

For bulk_update, I’ll use the same Blogs model we used for bulk_create:

class Blogs(models.Model):
    title = models.CharField(max_length=200, blank=True, null=True)
    description = models.TextField(blank=True, null=True)
    date_created = models.DateTimeField(auto_now_add=True)
    
    def __str__(self) -> str:
        return f"{self.title}"
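
In case you skipped that lesson, here is a rough sketch (not the exact code from it) of how 1000 posts could be created in a single query with bulk_create:

# Build 1000 unsaved Blogs instances in memory, then insert them with one call
Blogs.objects.bulk_create(
    [Blogs(title=f"Post {i}", description="some text") for i in range(1000)]
)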

We already have the 1000 posts created in the previous lesson, so we will reuse them here. First we will update them the normal way: looping over all 1000 posts and calling save() on each one, which means 1000 database calls once again:

import cProfile

def function_without_bulk():
    posts = Blogs.objects.all()
    for post in posts:
        post.title = "Post 1"
        post.save()  # one UPDATE query per post


p = cProfile.Profile()
p.runcall(function_without_bulk)
p.print_stats(sort='tottime')

With this, the output is:

(env) project/project$ python3.8 django_test.py 
          517128 function calls (513069 primitive calls) in 5.328 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     2001    3.906    0.002    3.906    0.002 {function SQLiteCursorWrapper.execute at 0x7fd393c40c10}
     3000    0.043    0.000    0.081    0.000 query.py:314(clone)
     1000    0.038    0.000    0.226    0.000 query.py:1361(build_filter)
     1000    0.037    0.000    0.377    0.000 compiler.py:1913(as_sql)
     1000    0.032    0.000    5.276    0.005 base.py:835(save_base)
     2002    0.029    0.000    0.110    0.000 utils.py:108(debug_sql)

More than half of the time was spent executing SQLite queries, one for every post we updated. Let's now look at the same task with bulk_update:

def function_with_bulk():
    posts = Blogs.objects.all()
    for post in posts:
        post.title = "Post 1"  # change in memory only; no query yet
    Blogs.objects.bulk_update(posts, ['title'])  # one call sends batched UPDATEs


p = cProfile.Profile()
p.runcall(function_with_bulk)
p.print_stats(sort="tottime")

Let's see the result:

(env) project/project$ python3.8 django_test.py 
          480604 function calls (422457 primitive calls) in 0.205 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     6021    0.010    0.000    0.039    0.000 copy.py:66(copy)
15060/9036    0.008    0.000    0.009    0.000 deconstruct.py:15(__new__)
     1004    0.007    0.000    0.039    0.000 query.py:1361(build_filter)
       10    0.007    0.001    0.007    0.001 {function SQLiteCursorWrapper.execute at 0x7f78d2d40c10}
     6021    0.005    0.000    0.017    0.000 copy.py:258(_reconstruct)
28182/28174    0.005    0.000    0.005    0.000 {built-in method builtins.getattr}
11051/1037    0.005    0.000    0.021    0.000 functional.py:49(__get__)

Notice that the first method took about 5.328 seconds to update 1000 records in the database, while the second took only 0.205 seconds to perform the same task. That is a significant improvement. Imagine updating billions of records; that would take almost forever with the first method.
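
For tables that are too big to load into memory at once, a common pattern (a sketch under that assumption, not code from this lesson; the helper name update_titles_in_chunks is just for illustration) is to stream the rows with iterator() and update them chunk by chunk, letting batch_size keep each query a manageable size:

from itertools import islice

def update_titles_in_chunks(chunk_size=1000):
    # Stream rows from the database instead of loading them all at once
    rows = Blogs.objects.all().iterator(chunk_size=chunk_size)
    while True:
        chunk = list(islice(rows, chunk_size))
        if not chunk:
            break
        for post in chunk:
            post.title = "Post 1"
        # One bulk_update per chunk keeps both memory use and query size bounded
        Blogs.objects.bulk_update(chunk, ['title'], batch_size=chunk_size)

And if every row really gets the exact same value, as in this demo, Blogs.objects.update(title="Post 1") does it in a single UPDATE query; bulk_update shines when each object needs its own value.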
