Living the stream: Reducing memory usage in Java and Kotlin

Anders Sveen
5 min read · Sep 17, 2020



TL;DR: Change how you fetch lists of information in your HTTP endpoints, and your application can often survive on 300–600MB of heap. Ours usually sits somewhere between 200 and 350MB, though we allow a max of 850MB to handle any high spikes.

Doing this does not change the code much and even helps against HTTP timeouts. Most applications I have worked on can benefit from it, and even if you cannot do it on all endpoints, it helps to do it where you can.

It will make your application more scalable and robust.

Oh yeah, and there is code at the bottom. :)

Update 28.10.2020: Spring 5.3 just added streaming support in JdbcTemplate, so it’s easier if you’re in Java land. :)
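For those on that stack, something along these lines should work (a minimal sketch, assuming Spring 5.3+ and an already configured JdbcTemplate; the query and the printing are just for illustration):

```kotlin
import org.springframework.jdbc.core.JdbcTemplate

// queryForStream returns a lazy Stream; set a fetch size, run it inside a
// transaction if your driver needs a cursor, and close the stream when done.
fun printOrderIds(template: JdbcTemplate) {
    template.fetchSize = 500
    template.queryForStream("SELECT id FROM orders") { rs, _ -> rs.getString("id") }
        .use { ids -> ids.forEach { println(it) } }
}
```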

The problem

In our main back end application we are handling quite a few requests, but none that should take much memory. So when we started getting OutOfMemoryErrors we had to do some digging. We could of course increase memory, but memory in the cloud is expensive. And besides… I hate not understanding why stuff happens… ;)

We use some caches, but none of them were really big. And when consumption rose, it would always go down again on a later GC. So… not a permanent leak, but peaks related to specific actions…

We zeroed in on a couple of endpoints that fetch the history of orders, with dates as parameters. Every once in a while the requested time range was increased and we could see clear issues in our back end application.

It made sense: To fetch a large list of orders from the DB we had to:

  • Read from the database
  • Transform into objects
  • Transform into JSON
  • Write JSON to the response

This gives you at least two (maybe more?) full representations of your objects in memory at the same time. So the larger the list (and the DB) gets, the more memory you need. Not very scalable. :)
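In code, the non-streaming shape looks roughly like this (a sketch with placeholder names): the full list of objects and a full JSON copy of it both exist in memory before a single byte reaches the client.

```kotlin
import com.fasterxml.jackson.databind.ObjectMapper

// Illustration only: fetchAll stands in for "read from DB and map to objects".
fun <T> ordersResponse(fetchAll: () -> List<T>, mapper: ObjectMapper): ByteArray {
    val orders = fetchAll()                  // every row mapped to an object in memory
    return mapper.writeValueAsBytes(orders)  // plus the complete JSON representation of it
}
```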

While you shouldn’t really fetch large lists, it felt like an unnecessary scaling limitation. And techniques like paging are less effective if you really have to fetch large volumes.

So on a hunch we started looking into streaming… If you’re not doing sum/sort/groupBy etc., it should be possible to read one item, send it to the client, then read another one. Right? Filter/map etc. should also work one piece at a time. Streaming would prevent having more than one object in memory at the same time.

So we started digging… And it was a rabbit hole… But we got back out. :)

The solution

Because of the Stream API introduced in Java 8 we thought it would be possible. It lets you compose operations that are only applied once something starts pulling items off the stream.
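For example, in the snippet below nothing is read or mapped until the terminal forEach starts pulling items through the pipeline, one at a time:

```kotlin
import java.util.stream.Stream

fun main() {
    // map and filter only compose a pipeline here; nothing runs yet.
    val lengths = Stream.of("a", "bb", "ccc")
        .map { it.length }
        .filter { it > 1 }
    // Items are pulled through the pipeline one at a time, only now.
    lengths.forEach { println(it) }
}
```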

We are using JDBI and Postgres, so there might be special variations that don’t apply to you, but I hope you can easily find similar things in your stack.

After a lot of fumbling we found these things to be key:

  • The Handle (transaction) has to be opened and closed in your view layer, the same level where you have access to the output stream to write JSON to. In our case that is the KTor mapping code.
  • Set the fetch size of your query; it will make the driver use a cursor. What is most efficient here varies a lot, but for one of our main streams we have set it to 500.
  • The connection’s autocommit has to be false. At least in Postgres, fetching in chunks (with a cursor) won’t work if autocommit is true. In JDBI this is done by starting a transaction, usually via Handle.useTransaction { … } (there is a small JDBC-level sketch of this right after the list).
  • Use Handle.stream() (instead of execute()) in JDBI. If you use a different library, hopefully it has a similar operation. :)
  • Use the Jackson Streaming API to write elements. When you map one element off the DB stream, you write it to the Jackson streaming API before you pull another off the DB stream. Rinse and repeat.
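To make the cursor requirements concrete, here is a minimal sketch at the plain JDBC level of what the JDBI settings boil down to on Postgres (the table, column and function names are placeholders):

```kotlin
import java.sql.Connection

// Autocommit off plus a fetch size is what makes the Postgres driver use a
// cursor and fetch rows in chunks instead of loading everything up front.
fun streamOrderIds(connection: Connection, handleRow: (String) -> Unit) {
    connection.autoCommit = false                     // cursors only work inside a transaction
    connection.prepareStatement("SELECT id FROM orders").use { stmt ->
        stmt.fetchSize = 500                          // fetch rows 500 at a time
        stmt.executeQuery().use { rs ->
            while (rs.next()) {
                handleRow(rs.getString("id"))         // handle one row at a time
            }
        }
    }
    connection.commit()
}
```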

We fumbled around with this, and even forgot the autocommit in a refactoring. We noticed the next time our application crashed, though. ;)

With this implemented we can now stream the entire order history from our database in parallel from multiple clients. Stupid, but we can. ;)

Bonus: Because the first item is sent back to the client long before the last one is fetched from the DB, converted to an object and serialized to JSON, bytes will start to arrive “immediately”, and your HTTP connection will not get timed out by your proxy/router.

After implementing the streaming, our memory consumption looks like the image below when fetching several months of orders. The resulting JSON is about 1GB and would definitely have crashed our back end before:

It hovers around a heap size of 250MB. And you can see the load and the GC time go up when fetching, but not much happening with the memory. :)

The juicy stuff (code)

You can find a fully working example with other stuff from previous posts at https://github.com/PorterAS/dependency-injection.

Here I will extract the main parts. :) First up: the repository doesn’t look very different:
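Roughly, such a repository can look like this with JDBI 3 (the Order type, query and column names are illustrative, not the exact code from the repo linked above):

```kotlin
import org.jdbi.v3.core.Handle
import org.jdbi.v3.core.mapper.RowMapper
import java.time.LocalDate
import java.util.stream.Stream

// Hypothetical order type, just for the sketch.
data class Order(val id: String, val createdDate: LocalDate)

class OrderRepository {
    // The handle is passed in: the caller (the view layer) owns the transaction.
    // setFetchSize is what makes Postgres use a cursor instead of materializing
    // the whole result set.
    fun ordersBetween(handle: Handle?, from: LocalDate, to: LocalDate): Stream<Order> =
        handle!!
            .createQuery("SELECT id, created_date FROM orders WHERE created_date BETWEEN :from AND :to")
            .bind("from", from)
            .bind("to", to)
            .setFetchSize(500)
            .map(RowMapper { rs, _ -> Order(rs.getString("id"), rs.getDate("created_date").toLocalDate()) })
            .stream()
}
```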

The code to stream the result back in a repository

The code above looks a lot like any regular repository. But notice that the handle is passed in (the transaction has to be managed from the view layer), and the fetch size is set.

I have made the handle nullable here, but that’s just a minor detail to be able to create stubbed repositories without mocking the whole JDBI API.

The more complex and juicy stuff is in the view layer. There is an OrderService in between here, but that basically just passes the call on to OrderRepository for this specific case.
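Roughly, that view layer can look like the sketch below, using KTor 1.x-style routing, JDBI and Jackson; the route, parameter names and the orderRoutes function are illustrative, and for simplicity it calls the repository from the previous sketch directly instead of going through an OrderService:

```kotlin
import com.fasterxml.jackson.databind.ObjectMapper
import io.ktor.application.call
import io.ktor.http.ContentType
import io.ktor.response.respondOutputStream
import io.ktor.routing.Route
import io.ktor.routing.get
import org.jdbi.v3.core.Jdbi
import java.time.LocalDate

fun Route.orderRoutes(jdbi: Jdbi, mapper: ObjectMapper, orders: OrderRepository) {
    get("/orders") {
        val from = LocalDate.parse(call.parameters["from"]!!)
        val to = LocalDate.parse(call.parameters["to"]!!)

        // The lambda gives us the response OutputStream as "this".
        call.respondOutputStream(ContentType.Application.Json) {
            // Open the handle here, in the view layer, and manage the
            // transaction manually.
            val handle = jdbi.open()
            try {
                // Starting a transaction turns autocommit off, which Postgres
                // needs before it will honour the fetch size (cursor).
                handle.begin()
                orders.ordersBetween(handle, from, to).use { stream ->
                    val generator = mapper.factory.createGenerator(this)
                    generator.writeStartArray()
                    // Pull one order off the DB stream, write it to the
                    // response as JSON, then pull the next one.
                    stream.forEach { order -> mapper.writeValue(generator, order) }
                    generator.writeEndArray()
                    generator.flush()
                }
                handle.commit()
            } finally {
                // Closing the handle rolls back if we never reached commit.
                handle.close()
            }
        }
    }
}
```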

In the above we do:

  • We get an output stream to write to from KTor (call.respondOutputStream(…)).
  • We open a JDBI handle and start a transaction. Here we use a quite manual way of setting up the transaction, as we can’t use all of the JDBI API with KTor and async. The handle/transaction is passed to the service and repository to perform operations on.
  • We run the actual query via the repository.
  • Once we have a resulting stream, we create a Jackson Generator and write the start of the JSON array.
  • We then iterate one by one on the orders on the stream (pulling off) and write the JSON to the generator.
  • Write the end of the JSON array and clean up by committing the transaction and closing the streams that need to be closed. Streams are mostly handled with .use { … }; see the comments in the code for the exception.

Because JDBI is told to stream as well, it will pull one record, map it to an object and then pass it on before doing another one.

The above code is inlined to make it clear what happens. In our real code this is separated into utility methods so that it is wrapped and handled properly everywhere we use it. Our real code would look something like this:
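Something along the lines of the hypothetical helper below; the respondJsonStream name and signature are made up for illustration, not the actual utility from our code:

```kotlin
import com.fasterxml.jackson.databind.ObjectMapper
import io.ktor.application.ApplicationCall
import io.ktor.http.ContentType
import io.ktor.response.respondOutputStream
import org.jdbi.v3.core.Handle
import org.jdbi.v3.core.Jdbi
import java.util.stream.Stream

// Hypothetical utility that wraps the handle/transaction/generator plumbing
// from the inlined example, so each endpoint only has to state its query.
suspend fun <T> ApplicationCall.respondJsonStream(
    jdbi: Jdbi,
    mapper: ObjectMapper,
    query: (Handle) -> Stream<T>
) {
    respondOutputStream(ContentType.Application.Json) {
        val handle = jdbi.open()
        try {
            handle.begin()
            query(handle).use { stream ->
                val generator = mapper.factory.createGenerator(this)
                generator.writeStartArray()
                stream.forEach { mapper.writeValue(generator, it) }
                generator.writeEndArray()
                generator.flush()
            }
            handle.commit()
        } finally {
            handle.close()
        }
    }
}

// An endpoint then shrinks to roughly:
//
//   get("/orders") {
//       call.respondJsonStream(jdbi, mapper) { handle ->
//           orders.ordersBetween(handle, from, to)
//       }
//   }
```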

Please let me know if there are any issues or improvements to be made. :)
