Efficient Data Handling in Java 23: Compact Strings, Off-Heap Storage, Weak References and Zero-Copy Techniques

As software grows more complex and runs in increasingly varied environments, there is a growing need to manage computing resources efficiently while still meeting users’ latency and throughput requirements.

In this post, we will discuss the constructs available in Java 23 for handling data efficiently and managing memory better.

Shallow Heap vs Retained Heap

A JVM running Java programs allocates memory for its code, class metadata, objects and data. Each object occupies some space in memory for its header and its own fields. This space is called the shallow space of the object.

The term Shallow Heap is used to refer to the memory directly occupied by an object, i.e. the object’s shallow space.

But an object could also hold references to other objects, let’s say its associated objects, and these associated objects in turn occupy some memory themselves. 

If this object is garbage collected, then any associated objects that are reachable only through it are garbage collected too. Associated objects that are still referenced from elsewhere stay alive.

So, the memory that will be freed when that object, and all the objects reachable only through it, are garbage collected is called the retained heap.

The Retained Heap consists of the shallow size of the object itself, plus the shallow sizes of all objects that are only reachable from that object.
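
To make the definition concrete, here is a minimal sketch that computes retained size on a toy object graph. Everything in it (the node names, the byte sizes, the RetainedHeapSketch class) is made up for illustration; it models the definition above and is not how a JVM or a heap analyzer measures memory.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy heap model: node names, shallow sizes in bytes, and outgoing references. */
final class RetainedHeapSketch {

    // Hypothetical shallow sizes, chosen only for the example.
    static final Map<String, Long> SHALLOW = Map.of(
            "root", 16L, "order", 24L, "items", 80L, "catalog", 32L, "product", 40L);

    // "product" is referenced by both "order" and "catalog".
    static final Map<String, List<String>> REFS = Map.of(
            "root", List.of("order", "catalog"),
            "order", List.of("items", "product"),
            "catalog", List.of("product"));

    /** Returns every node reachable from the roots without passing through 'excluded'. */
    static Set<String> reachable(Map<String, List<String>> refs, List<String> roots, String excluded) {
        Set<String> seen = new HashSet<>();
        Deque<String> todo = new ArrayDeque<>();
        for (String root : roots) {
            if (!root.equals(excluded)) todo.push(root);
        }
        while (!todo.isEmpty()) {
            String node = todo.pop();
            if (!seen.add(node)) continue; // already visited
            for (String next : refs.getOrDefault(node, List.of())) {
                if (!next.equals(excluded)) todo.push(next);
            }
        }
        return seen;
    }

    /** Retained size = shallow sizes of everything that becomes unreachable once 'target' is gone. */
    static long retainedSize(Map<String, Long> shallow, Map<String, List<String>> refs,
                             List<String> roots, String target) {
        Set<String> freed = reachable(refs, roots, null);  // everything alive today
        freed.removeAll(reachable(refs, roots, target));   // minus what survives without target
        long total = 0;
        for (String node : freed) total += shallow.get(node);
        return total;
    }

    public static void main(String[] args) {
        List<String> roots = List.of("root");
        System.out.println(retainedSize(SHALLOW, REFS, roots, "order"));   // 104
        System.out.println(retainedSize(SHALLOW, REFS, roots, "catalog")); // 32
    }
}
```

Here "order" retains itself (24) plus "items" (80), but not "product", because "catalog" still references it.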

You can read more about Shallow Heap VS Retained Heap here. 

How to efficiently handle data in Java 23 

Sometimes, a lot of memory is wasted by frameworks and carelessly written programs. For example, the PetClinic app stores thousands of Strings containing the same content, one for every object that references it.

The sample program below uses some of the older techniques to allocate memory, store variables and copy files.

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class TraditionalDataHandling {

    public static void main(String[] args) {
        long startTime = System.currentTimeMillis();
        try {
            String asciiText = "Hello, Compact Strings!"; // ASCII text
            String unicodeText = "你好, Compact Strings!"; // Unicode text

            heapStorageExample();
            traditionalFileCopy();

            Thread.sleep(20000); // Keep the JVM alive for 20 seconds before finishing
            doFinalCleanup();
            long endTime = System.currentTimeMillis();
            System.out.println("All tasks completed in " + (endTime - startTime) + " ms");
        } catch (Exception e) {
            System.out.println("Exception occurred");
            e.printStackTrace();
        }
    }

    // Stores 100,000 separate 1KB arrays on the Java heap
    public static void heapStorageExample() {
        List<byte[]> dataList = new ArrayList<>();
        for (int i = 0; i < 100000; i++) {
            dataList.add(new byte[1024]); // 1KB per object
        }
        System.out.println("Data stored in heap. High GC pressure!");
    }

    // Copies the file by reading every chunk into a heap buffer and writing it back out
    public static void traditionalFileCopy() throws IOException {
        try (FileInputStream fis = new FileInputStream("nodes.csv");
             FileOutputStream fos = new FileOutputStream("copyNodes.csv")) {

            byte[] buffer = new byte[8192]; // 8KB buffer
            int bytesRead;
            while ((bytesRead = fis.read(buffer)) != -1) {
                fos.write(buffer, 0, bytesRead);
            }
        }
        System.out.println("File copied with traditional approach.");
    }

    private static void doFinalCleanup() {
        System.out.println("Final cleanup task completed.");
    }
}

A better way would have been to store just one String and have every object that uses the same content keep a reference to that single String object.

This is what we call a canonicalizing mapping (or canonical mapping). It ensures that only one instance of each distinct value exists in memory, and all subsequent references point to that single instance. This saves storage space and potentially improves performance.

A weak reference is a reference that does not prevent the garbage collector from reclaiming the object it points to.

Weak references are commonly used to implement canonicalizing mappings: the mapping holds each canonical instance weakly, so an instance can be collected once no other part of the program still uses it.

So, it’s good if an object itself uses less memory, and if caches and lookup tables hold their entries via weak references instead of keeping them alive indefinitely.
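
A canonicalizing mapping built on weak references can be sketched as follows. The Canonicalizer class and its method names are hypothetical, written for this post; the sketch relies only on the standard WeakHashMap and WeakReference classes and is not meant as a production-grade interner.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

/** Hypothetical helper: hands out one shared instance per distinct value. */
final class Canonicalizer<T> {

    // Keys are held weakly by WeakHashMap. The value wraps the same object in a
    // WeakReference so that the map's value does not keep its own key alive.
    private final Map<T, WeakReference<T>> pool = new WeakHashMap<>();

    /** Returns the canonical instance equal to 'value', registering 'value' if none exists. */
    public synchronized T canonical(T value) {
        WeakReference<T> ref = pool.get(value);
        T existing = (ref == null) ? null : ref.get();
        if (existing != null) {
            return existing;
        }
        pool.put(value, new WeakReference<>(value));
        return value;
    }
}

final class CanonicalizerDemo {
    public static void main(String[] args) {
        Canonicalizer<String> names = new Canonicalizer<>();

        // new String(...) forces two distinct objects with the same content
        String first = names.canonical(new String("Labrador"));
        String second = names.canonical(new String("Labrador"));

        System.out.println(first == second); // true: both point to one instance
    }
}
```

Once nothing else in the program references a canonical instance, the garbage collector is free to reclaim it and the map drops the entry, so the pool does not grow without bound.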

Some of the memory management enhancements available in Java 23 build on this concept. Others rely on common-sense principles, namely:

1. Off-Heap Storage – Not all the data that your Java program processes needs to be loaded into the Java heap. Sometimes it’s better to keep voluminous data outside the heap, and reserve heap memory for the Java objects that need it.

Also, as this data is kept outside the heap, the garbage collector does not manage it, hence no GC cycles are spent on it. You can interact with off-heap memory through a MemorySegment object.

2. Zero-Copy Techniques – While copying large files, there is no need to read the file contents into the heap. Methods such as FileChannel.transferTo() ask the operating system to move the bytes directly from the source to the destination, and many operating systems can do this without copying the data through application buffers. Thus, CPU cycles and memory are saved, as redundant copying is avoided.

3. Compact Strings – Before Java 9, Java stored string contents internally in a char[] using 2 bytes (16 bits) per character. But characters in the Latin-1 range, which covers English text, need only 1 byte. So, since Java 9, String contents are stored in a byte[] instead, using 1 byte (8 bits) per character unless the string contains a character that requires UTF-16 encoding.

4. Arena – For managing off-heap memory, the Foreign Function & Memory API (a preview API in Java 21, finalized in Java 22) provides the concept of an Arena. You can easily create and manage native MemorySegments using the Arena.

You can create a confined arena, using Arena.ofConfined(), which allows only the thread that created it to access its memory segments. You’ll get an exception if you try to close a confined arena from any other thread. If you need multiple threads to access the off-heap memory segments, you can create those segments from a shared arena, obtained via the Arena.ofShared() method.

When an Arena is closed, all its associated memory segments are invalidated and the memory regions backing them are deallocated.
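
The arena rules described above can be observed with a short sketch. It assumes Java 22 or later, where the java.lang.foreign API is final; the ArenaLifecycleSketch class and its method names are made up for illustration.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

final class ArenaLifecycleSketch {

    /** Writes 1..n as ints into off-heap memory, sums them back, and frees the memory on exit. */
    static long sumOffHeap(int n) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment segment = arena.allocate(
                    n * ValueLayout.JAVA_INT.byteSize(), ValueLayout.JAVA_INT.byteAlignment());
            for (int i = 0; i < n; i++) {
                segment.setAtIndex(ValueLayout.JAVA_INT, i, i + 1);
            }
            long sum = 0;
            for (int i = 0; i < n; i++) {
                sum += segment.getAtIndex(ValueLayout.JAVA_INT, i);
            }
            return sum;
        } // segment memory is deallocated here
    }

    /** Returns true if a segment rejects access once its arena has been closed. */
    static boolean accessAfterCloseFails() {
        MemorySegment segment;
        try (Arena arena = Arena.ofConfined()) {
            segment = arena.allocate(8, 8);
            segment.set(ValueLayout.JAVA_LONG, 0, 42L);
        }
        try {
            segment.get(ValueLayout.JAVA_LONG, 0);
            return false;
        } catch (IllegalStateException expected) {
            return true; // the segment was invalidated by closing the arena
        }
    }

    /** Returns true if a second thread can read a segment allocated from the given arena. */
    static boolean readableFromOtherThread(Arena arena) throws InterruptedException {
        MemorySegment segment = arena.allocate(8, 8);
        segment.set(ValueLayout.JAVA_LONG, 0, 7L);
        boolean[] readable = {false};
        Thread reader = new Thread(() -> {
            try {
                readable[0] = segment.get(ValueLayout.JAVA_LONG, 0) == 7L;
            } catch (RuntimeException rejected) {
                readable[0] = false; // confined segments reject foreign threads
            }
        });
        reader.start();
        reader.join();
        return readable[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(sumOffHeap(100));         // 5050
        System.out.println(accessAfterCloseFails()); // true
        try (Arena confined = Arena.ofConfined(); Arena shared = Arena.ofShared()) {
            System.out.println(readableFromOtherThread(confined)); // false
            System.out.println(readableFromOtherThread(shared));   // true
        }
    }
}
```

A confined arena is the cheaper choice when one thread owns the data; a shared arena is needed only when several threads must touch the same segments.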

Sample Program Using Improved Memory Techniques in Java 23

The program below is deliberately simple, but it shows how the memory enhancements in recent Java versions can be put to use.

import java.io.IOException;
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class EfficientDataHandling {

    public static void main(String[] args) {
        long startTime = System.currentTimeMillis();
        try {
            String asciiText = "Hello, Compact Strings!"; // Latin-1 text: stored in a byte[], 1 byte per character
            String unicodeText = "你好, Compact Strings!"; // Contains non-Latin-1 characters: stored in a byte[] as UTF-16, 2 bytes per character

            offHeapStorageExample();
            zeroCopyFileTransfer();
            allocateMemorySegment();

            Thread.sleep(10000); // Keep the JVM alive for 10 seconds before finishing
            doFinalCleanup();
            long endTime = System.currentTimeMillis();
            System.out.println("All tasks completed in " + (endTime - startTime) + " ms");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public static void offHeapStorageExample() {
        int size = 100_000 * 1024; // 100,000 KB
        ByteBuffer buffer = ByteBuffer.allocateDirect(size); // Allocates off-heap memory
        for (int i = 0; i < size; i++) {
            buffer.put((byte) 1); // Simulating data storage
        }
    }

    public static void zeroCopyFileTransfer() throws IOException {
        try (FileChannel sourceChannel = FileChannel.open(Path.of("nodes.csv"), StandardOpenOption.READ);
             FileChannel destChannel = FileChannel.open(Path.of("copyNodesEfficient.csv"),
                     StandardOpenOption.CREATE, StandardOpenOption.WRITE,
                     StandardOpenOption.TRUNCATE_EXISTING)) {

            // transferTo may move fewer bytes than requested, so loop until the whole file is sent
            long position = 0;
            long size = sourceChannel.size();
            while (position < size) {
                position += sourceChannel.transferTo(position, size - position, destChannel);
            }
        }
        System.out.println("File copied using zero-copy.");
    }

    private static void doFinalCleanup() {
        System.out.println("Final cleanup task completed.");
    }

    public static void allocateMemorySegment() {
        String myStr = "My string in Off-Heap memory";
        try (Arena arena = Arena.ofConfined()) {

            // Allocate off-heap memory
            // Converts the Java string myStr into a null-terminated C string using the UTF-8 charset,
            // storing the result into a memory segment.
            MemorySegment nativeMemory = arena.allocateFrom(myStr);

            // Access off-heap memory
            // Reading one byte per character works here only because myStr is pure ASCII
            for (int i = 0; i < myStr.length(); i++) {
                System.out.print((char) nativeMemory.get(ValueLayout.JAVA_BYTE, i));
            }
            System.out.println();
        } // Off-heap memory is deallocated as soon as the try-with-resources block ends
    }
}
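
The asciiText and unicodeText variables in the program above illustrate Compact Strings, but the internal layout of a String cannot be inspected directly. The sketch below estimates it instead. It assumes Compact Strings are enabled (the default, which the -XX:-CompactStrings JVM flag turns off), and the CompactStringCheck class is a hypothetical helper, not a JDK API. The figures cover only the character payload, not object or array headers.

```java
import java.nio.charset.StandardCharsets;

final class CompactStringCheck {

    /** True if every character fits in Latin-1, i.e. the string qualifies for 1 byte per character. */
    static boolean fitsLatin1(String s) {
        return StandardCharsets.ISO_8859_1.newEncoder().canEncode(s);
    }

    /** Estimated size of the character payload in bytes, assuming Compact Strings are enabled. */
    static long estimatedPayloadBytes(String s) {
        return fitsLatin1(s) ? s.length() : 2L * s.length();
    }

    public static void main(String[] args) {
        String asciiText = "Hello, Compact Strings!";
        // Same text as unicodeText in the article, with the two Chinese characters written as escapes
        String unicodeText = "\u4f60\u597d, Compact Strings!";

        System.out.println(estimatedPayloadBytes(asciiText));   // 23 (23 chars x 1 byte)
        System.out.println(estimatedPayloadBytes(unicodeText)); // 40 (20 chars x 2 bytes)
    }
}
```

A single character outside Latin-1 switches the whole string to 2 bytes per character, which is why the shorter unicodeText needs more space than asciiText.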

Comparing memory management using the HeapHero tool  

When we compare the memory usage using the HeapHero tool, we can clearly see that the percentage difference between the shallow heap and the retained heap is better for EfficientDataHandling.java 


Fig: HeapHero Memory Analytics Showing Improved Retained Memory

Source: https://heaphero.io/heap-report-wc.jsp?p=ZzM2a1FrdWpWbXd4ajVsTTVUdDVXdzBWcDh1SVI2TFhnQ1dnM3J5Z3pUSHhJN2pMMXpmV2JaeG40TURJSmtleW8rM1dwL29QcURBU2FSVTBBcDFnWVE9PQ==

A retained heap size that stays close to the shallow heap size is better: it indicates that an object is not holding onto a large number of other objects. This potentially leads to more efficient memory usage and garbage collection.

Conclusion

You can make good use of the enhanced memory management and off-heap storage techniques of Java 23, and make your programs run better.

Thanks.
