Java supports concurrent programming through threads. Using threads, an application can perform multiple tasks at the same time.
Traditionally, Java developers relied on native threads, or platform threads, for developing multi-threaded applications.
What are Platform Threads?
A platform thread is implemented as a thin wrapper around an operating system (OS) thread. A platform thread captures its corresponding OS thread for its entire lifetime, and runs its associated Java code on this underlying OS thread. Consequently, the number of available platform threads is limited to the number of OS threads.
Platform threads typically have a large thread stack and other resources that are maintained by the operating system. Because they are relatively expensive and finite in number, they are usually managed through thread pools.
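To make the pooling idea concrete, here is a minimal sketch in which many small tasks share a handful of platform threads. The class and method names (PlatformPoolSketch, distinctWorkers) and the pool size of 4 are assumptions chosen for illustration; they do not come from the programs later in this article.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class PlatformPoolSketch {

    // Runs taskCount trivial tasks on a fixed pool of platform threads and
    // returns how many distinct worker threads executed them.
    static int distinctWorkers(int poolSize, int taskCount) {
        Set<Thread> workers = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        for (int i = 0; i < taskCount; i++) {
            pool.execute(() -> workers.add(Thread.currentThread()));
        }
        pool.shutdown(); // stop accepting new tasks; queued tasks still run
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return workers.size();
    }

    public static void main(String[] args) {
        // 1,000 tasks are served by at most 4 OS-backed threads.
        System.out.println("Distinct worker threads: " + distinctWorkers(4, 1000));
    }
}
```

The count returned never exceeds the pool size, however many tasks are submitted, which is exactly the constraint that pooling imposes.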
What are Virtual Threads?
To provide a better alternative to platform threads, Java 19 introduced virtual threads as a preview feature under Project Loom; they became a standard feature in Java 21. A virtual thread is also an instance of java.lang.Thread, but isn’t permanently tied to a specific OS thread. A virtual thread still runs its code on an OS thread, but when it hits a blocking I/O operation, the Java runtime suspends the virtual thread and frees the associated OS thread to do work for other virtual threads.
Later, when the suspended virtual thread is ready to resume, it’s assigned an available OS thread, which need not be the one it ran on before. In essence, the Java runtime shares a limited number of OS threads among a large number of virtual threads: whenever a virtual thread is waiting or blocked, its underlying OS thread becomes available as a vehicle for other virtual threads to run their code.
Virtual threads are plentiful, and are not bounded by the number of OS threads assigned to the JVM. So we should not pool virtual threads: pooling exists to share a scarce resource, and virtual threads are not scarce. Instead, assign a virtual thread to each concurrent task and let the JVM manage the group of virtual threads.
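As a rough illustration of "one virtual thread per task", the sketch below starts virtual threads directly through the java.lang.Thread API (Java 21 or later). The class name VirtualThreadSketch and its methods are hypothetical names made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class VirtualThreadSketch {

    // Starts one virtual thread per task, waits for all of them,
    // and returns how many tasks ran to completion.
    static int runTasks(int taskCount) {
        AtomicInteger completed = new AtomicInteger();
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < taskCount; i++) {
            threads.add(Thread.ofVirtual().start(() -> {
                try {
                    Thread.sleep(10); // blocking call; the OS thread is freed meanwhile
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                completed.incrementAndGet();
            }));
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return completed.get();
    }

    // A virtual thread is an ordinary java.lang.Thread that reports isVirtual() == true.
    static boolean startsVirtual() {
        Thread thread = Thread.startVirtualThread(() -> { });
        return thread.isVirtual();
    }

    public static void main(String[] args) {
        System.out.println("Completed tasks: " + runTasks(10_000));
        System.out.println("Is virtual: " + startsVirtual());
    }
}
```

Creating ten thousand threads this way is unremarkable for virtual threads, whereas the same loop with platform threads would create ten thousand OS threads.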
The Difference Between Implementation of Virtual Threads and Platform Threads
Instead of using a shared thread pool executor to return a finite pool of threads, and assigning the next task in the list to a free thread in this finite pool:
Future<ResultA> future1 = sharedThreadPoolExecutor.submit(task1);
Future<ResultB> future2 = sharedThreadPoolExecutor.submit(task2);
// ... use futures
Rather, use a virtual thread executor, which creates a new virtual thread for each submitted task. Each task in your list gets a dedicated virtual thread assigned to it: a one-to-one mapping.
try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
Future<ResultA> future1 = executor.submit(task1);
Future<ResultB> future2 = executor.submit(task2);
// ... use futures
}
Remember to close the ExecutorService when you are done with it, either by calling the ExecutorService.close() method explicitly or by using a try-with-resources construct as in the example above. Closing the executor waits for the submitted tasks to finish and then shuts it down.
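The two ways of closing an executor can be sketched side by side. This assumes Java 21 or later, where ExecutorService implements AutoCloseable; the class and method names are invented for the example.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ExecutorCloseSketch {

    // Option 1: try-with-resources calls close() automatically at the end of the block.
    static int sumWithTryWithResources() {
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            Future<Integer> first = executor.submit(() -> 20);
            Future<Integer> second = executor.submit(() -> 22);
            return first.get() + second.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }

    // Option 2: call close() explicitly in a finally block.
    static boolean terminatedAfterExplicitClose() {
        ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor();
        try {
            executor.submit(() -> System.out.println("working"));
        } finally {
            executor.close(); // waits for submitted tasks, then shuts the executor down
        }
        return executor.isTerminated();
    }

    public static void main(String[] args) {
        System.out.println("Sum: " + sumWithTryWithResources());
        System.out.println("Terminated: " + terminatedAfterExplicitClose());
    }
}
```

The try-with-resources form is usually the safer choice because the executor is closed even if the body throws.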
The number of virtual threads is always equal to the number of concurrent tasks in your application. It’s conceptually easier and clearer when you assign each task in your program to its own virtual thread. You can focus more on your business logic, rather than managing the multithreading intricacies. For workloads dominated by blocking calls this also tends to improve throughput, and because every task has its own thread, stack traces and thread dumps map directly onto tasks, which helps observability.
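One way to check the "one thread per task" claim for yourself is to record which thread ran each task. The sketch below is only an illustration; OneThreadPerTaskSketch and distinctThreads are hypothetical names, and it assumes Java 21 or later.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class OneThreadPerTaskSketch {

    // Submits taskCount tasks and returns the number of distinct threads that ran them.
    static int distinctThreads(int taskCount) {
        Set<Thread> seen = ConcurrentHashMap.newKeySet();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < taskCount; i++) {
                executor.execute(() -> seen.add(Thread.currentThread()));
            }
        } // close() blocks here until every submitted task has finished
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println("Distinct threads for 1000 tasks: " + distinctThreads(1000));
    }
}
```

Because the executor creates a fresh virtual thread for every submission, the number of distinct threads matches the number of tasks.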
Our Test Results: Virtual vs. Platform Threads
Now let’s run some actual tests, take thread dumps and check what’s happening inside the JVM.
The program we will run has a function that generates a random number and calculates the sum of all primes up to that number. It then simulates an I/O operation, which sends the thread into a waiting state. The tasks run this function on virtual or platform threads, and the main thread waits for all of them to finish.
The first program uses traditional platform threads.
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.concurrent.*;
import java.util.stream.IntStream;
public class TraditionalThreadProcessor {
private static final int TASK_COUNT = 10000;
// Limit thread count as each thread will be mapped to an OS thread.
private static final int THREAD_POOL_SIZE = Runtime.getRuntime().availableProcessors();
private static final Random RANDOM = new Random();
public static void main(String[] args) {
long startTime = System.currentTimeMillis();
ExecutorService myExecutor = Executors.newFixedThreadPool(THREAD_POOL_SIZE);
List<Future<?>> futures = new ArrayList<>();
for (int i = 0; i < TASK_COUNT; i++) {
int taskId = i;
futures.add(myExecutor.submit(() -> processTask(taskId)));
}
// Wait for all tasks to complete
futures.forEach(future -> {
try {
future.get(); // Block until task completes
} catch (InterruptedException | ExecutionException e) {
e.printStackTrace();
}
});
myExecutor.shutdown();
long endTime = System.currentTimeMillis();
System.out.println("All tasks completed in " + (endTime - startTime) + " ms");
}
private static void processTask(int taskId) {
// Simulating CPU-bound work
int limit = RANDOM.nextInt(5000) + 1000; // Random upper bound for prime calculation
int primeSum = calculateSumOfPrimes(limit);
// Simulating a blocking/IO-bound operation
simulateBlockingOperation();
// Print every 1000th task
if (taskId % 1000 == 0) {
System.out.printf("Task %d processed: Sum of primes up to %d = %d%n", taskId, limit, primeSum);
}
}
private static int calculateSumOfPrimes(int limit) {
return IntStream.rangeClosed(2, limit)
.filter(TraditionalThreadProcessor::isPrime)
.sum();
}
private static boolean isPrime(int num) {
if (num < 2) return false;
return IntStream.rangeClosed(2, (int) Math.sqrt(num))
.noneMatch(n -> (num % n == 0));
}
//Simulating a blocking/IO-bound operation
private static void simulateBlockingOperation() {
try {
Thread.sleep(10); // Simulate a small delay (e.g., database query)
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
}
}
}
Our second program will use the new virtual threads.
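import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.IntStream;
public class VirtualThreadProcessor {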
private static final int TASK_COUNT = 400000;
private static final Random RANDOM = new Random();
public static void main(String[] args) {
long startTime = System.currentTimeMillis();
try (ExecutorService myExecutor = Executors.newVirtualThreadPerTaskExecutor()) {
List<CompletableFuture<Void>> futures = new ArrayList<>();
for (int i = 0; i < TASK_COUNT; i++) {
int taskId = i;
futures.add(CompletableFuture.runAsync(() -> processTask(taskId), myExecutor));
}
// Wait for all tasks to complete
futures.forEach(CompletableFuture::join);
}
long endTime = System.currentTimeMillis();
System.out.println("All tasks completed in " + (endTime - startTime) + " ms");
}
private static void processTask(int taskId) {
// Simulating CPU-bound work
int limit = RANDOM.nextInt(5000) + 1000; // Random upper bound for prime calculation
int primeSum = calculateSumOfPrimes(limit);
// Simulating a blocking/IO-bound operation
simulateBlockingOperation();
if (taskId % 1000 == 0) { // Print every 1000th task
System.out.printf("Task %d processed: Sum of primes up to %d = %d%n", taskId, limit, primeSum);
}
}
private static int calculateSumOfPrimes(int limit) {
return IntStream.rangeClosed(2, limit)
.filter(VirtualThreadProcessor::isPrime)
.sum();
}
private static boolean isPrime(int num) {
if (num < 2) return false;
return IntStream.rangeClosed(2, (int) Math.sqrt(num))
.noneMatch(n -> (num % n == 0));
}
private static void simulateBlockingOperation() {
try {
Thread.sleep(10); // Simulate a small delay (e.g. a database query or Network Latency)
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
}
}
}
We use Random, and not SecureRandom, as we do not want to draw directly on the underlying OS entropy to generate random numbers. Random uses a pseudo-random number generator (PRNG) algorithm, which starts from a seed value that, by default, is derived in part from the current system time. This makes it predictable, but that is acceptable for our test programs. Programs that depend on the OS’s entropy source might also behave differently on different operating systems at different times.
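The predictability of Random can be seen with a small sketch. Note that the test programs above call new Random() without a seed; passing a fixed seed, as done here, is an assumption added purely to make the repeatability visible. SeededRandomSketch and limits are hypothetical names.

```java
import java.util.Arrays;
import java.util.Random;

class SeededRandomSketch {

    // Generates 'count' upper bounds the same way processTask does
    // (nextInt(5000) + 1000), but from an explicitly seeded Random.
    static int[] limits(long seed, int count) {
        Random random = new Random(seed);
        int[] result = new int[count];
        for (int i = 0; i < count; i++) {
            result[i] = random.nextInt(5000) + 1000; // value in [1000, 5999]
        }
        return result;
    }

    public static void main(String[] args) {
        int[] firstRun = limits(42L, 5);
        int[] secondRun = limits(42L, 5);
        // The same seed yields the same sequence on every run.
        System.out.println("Identical sequences: " + Arrays.equals(firstRun, secondRun));
    }
}
```

If you want benchmark runs to process exactly the same workload each time, seeding the generator like this is one option.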
These tests were run on an Amazon Linux machine using OpenJDK 23 (java-23-amazon-corretto). Our findings for the time taken to run these programs were:
| Number of tasks | Platform threads | Virtual threads |
|---|---|---|
| 10,000 | 51125 ms | 922 ms |
| 100,000 | 510300 ms (about 8.5 minutes) | 63915 ms (about 1 minute) |
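To put the table into perspective, the speedup is simply the platform-thread time divided by the virtual-thread time. The helper below (SpeedupSketch, a name made up for this example) does that arithmetic for the two measured rows.

```java
class SpeedupSketch {

    // Ratio of platform-thread time to virtual-thread time.
    static double speedup(long platformMillis, long virtualMillis) {
        return (double) platformMillis / virtualMillis;
    }

    public static void main(String[] args) {
        // Timings taken from the table above.
        System.out.println(String.format("10,000 tasks:  %.2fx", speedup(51125, 922)));
        System.out.println(String.format("100,000 tasks: %.2fx", speedup(510300, 63915)));
    }
}
```

This works out to roughly 55 times faster for 10,000 tasks and roughly 8 times faster for 100,000 tasks.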
To take an even deeper look, we used the FastThread tool. With platform threads, the numbers of daemon and non-daemon threads were almost equal.

Fig: Daemon vs Non-daemon Threads: Platform Threads
With virtual threads, the difference is more pronounced: there were fewer non-daemon threads, which suggests that threads were quickly doing their work and releasing the CPU.

Fig: Daemon vs Non-daemon Threads: Virtual Threads
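One JDK fact is worth keeping in mind when reading daemon counts: a virtual thread always reports itself as a daemon thread, whereas a platform thread is a daemon only if configured (or inherited) that way. The sketch below shows just that flag, using hypothetical names; it says nothing about how a particular thread-dump tool counts virtual threads.

```java
class DaemonFlagSketch {

    // Virtual threads are always daemon threads.
    static boolean virtualIsDaemon() {
        Thread thread = Thread.ofVirtual().unstarted(() -> { });
        return thread.isDaemon();
    }

    // A platform thread explicitly built as a non-daemon thread.
    static boolean nonDaemonPlatformIsDaemon() {
        Thread thread = Thread.ofPlatform().daemon(false).unstarted(() -> { });
        return thread.isDaemon();
    }

    public static void main(String[] args) {
        System.out.println("Virtual thread daemon flag:  " + virtualIsDaemon());
        System.out.println("Platform thread daemon flag: " + nonDaemonPlatformIsDaemon());
    }
}
```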
Also, with platform threads, we see comparatively more threads in the WAITING state.

Fig: Thread State: Platform Threads
But with virtual threads, there are fewer WAITING threads, and since virtual threads finish their work quickly, the thread dump is not able to determine the state of 2 threads (because they are done and about to release the CPU).

Fig: Thread State: Virtual Threads
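Thread states can also be inspected from code with Thread.getState(). The following sketch observes a virtual thread while it is blocked in Thread.sleep. It is an illustration with made-up names, and it accepts either TIMED_WAITING or WAITING, on the assumption that the exact value reported for a sleeping virtual thread can depend on the JDK build.

```java
class VirtualThreadStateSketch {

    // Starts a virtual thread that sleeps, polls its state until it is
    // reported as waiting (or five seconds pass), then interrupts it.
    static Thread.State stateWhileSleeping() {
        Thread sleeper = Thread.ofVirtual().start(() -> {
            try {
                Thread.sleep(60_000); // long simulated blocking call
            } catch (InterruptedException e) {
                // Interrupted by the observer below; nothing left to do.
            }
        });
        Thread.State observed = sleeper.getState();
        long deadline = System.nanoTime() + 5_000_000_000L;
        try {
            while (observed != Thread.State.TIMED_WAITING
                    && observed != Thread.State.WAITING
                    && System.nanoTime() < deadline) {
                Thread.sleep(1);
                observed = sleeper.getState();
            }
            sleeper.interrupt(); // wake the sleeper so it can finish
            sleeper.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return observed;
    }

    public static void main(String[] args) {
        System.out.println("State while sleeping: " + stateWhileSleeping());
    }
}
```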
Conclusion
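In our runs, virtual threads were roughly 55 times faster at 10,000 tasks and about 8 times faster at 100,000 tasks. Much of that gap comes from blocking: the fixed pool has only as many platform threads as there are CPU cores, and each of them is held for the full 10 ms of simulated I/O, whereas a blocked virtual thread releases its OS thread for other tasks. Virtual threads shine when the tasks are known in advance, high in number and fairly predictable. Traditional threads might be the better choice only if the number of tasks is not very high and the tasks vary a lot in the processing they need to do, or if each computation is so CPU-intensive that it’s better mapped to an actual OS thread.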
Code written in the synchronous style and using simple blocking I/O will benefit greatly from virtual threads. Code written in the non-blocking, asynchronous style, e.g. using chained Futures, won’t benefit much from virtual threads, because it already avoids holding a thread while waiting. Programs or frameworks that dedicate a thread per task should expect to see a significant benefit from virtual threads.
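As a rough picture of the synchronous style, the sketch below runs plain, sequential blocking calls inside one virtual thread per request, with no Futures or callbacks in the task body. blockingLookup is a hypothetical stand-in for a database or HTTP call, and all names here are assumptions made for the example (Java 21 or later).

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class BlockingStyleSketch {

    // Hypothetical blocking call standing in for a database query or HTTP request.
    static String blockingLookup(String key) {
        try {
            Thread.sleep(10);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return key.toUpperCase(Locale.ROOT);
    }

    // Handles every key on its own virtual thread using straight-line blocking code.
    static Map<String, String> handleAll(List<String> keys) {
        Map<String, String> results = new ConcurrentHashMap<>();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (String key : keys) {
                executor.execute(() -> {
                    String user = blockingLookup(key);                 // step 1 blocks
                    String detail = blockingLookup(user + "-detail");  // step 2 blocks
                    results.put(key, detail);
                });
            }
        } // close() waits for all requests to finish
        return results;
    }

    public static void main(String[] args) {
        System.out.println(handleAll(List.of("alice", "bob")));
    }
}
```

Each request reads top to bottom like single-threaded code, and the blocking calls cost little because only the virtual thread waits, not an OS thread.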
Let us know your experiences and thoughts. Thanks for your time.
