
Possible to force my java app to use more CPU utilization?

If you're not looking for that major of a change, you could consider changing your array lists to arrays. Array lists have a huge performance penalty when they have to expand their internal arrays.
The correct way to do this in Java to maximize concurrency would be to use a stream IMHO.
expecting efficient cpu usage from java is sorta like expecting a 300lb man not to use the handicap electric carts at the store
it's possible but extremely unlikely, and generally an inconvenience for all
While your comment is colorful, it's not true or helpful. Java has performance similar to other JIT-compiled languages like C# and tends to be eclipsed only by languages like C and C++, which isn't a low bar. I'll concede that the JVM tends to consume a lot of memory, but that's mostly because if you give it more heap space, it will use it (much like how the OS does disk caching with free memory: it's there, so why not use it?). Plenty of applications still run fine even if you reduce the maximum heap size. The JVM is also more than capable of using multiple hardware threads.
Essentially I made use of the "interrupt" and "join" methods inherited from the Thread class.
That's probably not the best way to go about it. The best way to handle mutable state across multiple threads is .wait and .notify. Join is appropriate if you're waiting for thread termination to indicate that processing is done, but spinning up threads is a relatively expensive operation and you don't want to be spawning one every time you need to render a frame.

I suggest using a stream because it decouples the consumer from the producer and doesn't require spinning up new threads often. It also opens the possibility of having multiple workers on either the producer or consumer side, which could further improve performance depending on where the bottleneck is and whether you can just throw more threads at the problem (OpenGL calls won't benefit from this.)

Interrupt isn't a good option because you're basically telling the thread to halt execution even if its task isn't complete. If you're not getting an InterruptedException, the thread has likely already finished executing by the time .interrupt is invoked; otherwise, something might not be completing every time.
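To make the producer/consumer idea concrete, here's a minimal sketch (all class and variable names here are illustrative, not from the OP's code) using a bounded BlockingQueue so a simulation thread can hand work to a render thread without spawning a new thread per frame:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class FrameQueueDemo {
    // A bounded queue decouples the producer (simulation) from the
    // consumer (renderer); both threads are created once and reused.
    public static int runFrames(int frames) throws InterruptedException {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(16);
        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < frames; i++) {
                    queue.put(i); // blocks if the consumer falls behind
                }
                queue.put(-1); // poison pill: signals end of stream
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        int rendered = 0;
        while (true) {
            int frame = queue.take(); // blocks until work is available
            if (frame == -1) break;
            rendered++; // stand-in for the actual render call
        }
        producer.join();
        return rendered;
    }
}
```

The queue's capacity provides natural back-pressure: neither side busy-waits, and the .wait/.notify handshaking is handled internally by the queue.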
 
While your comment is colorful, it's not true or helpful.

I know. I go grocery shopping weekly without issues, and Java is pretty much as performant as any JIT while we're at it. ;)
 
Thanks for the comments.

I'm not sure if I mentioned this in a prior reply, but I prefer Java because I find it to be sufficiently fast and cross platform compatible. I installed Ubuntu Linux on my Windows 10 PC. I'm able to run my app on it without modification or recompiling. Besides, Java is a fun language to program in, IMO.

Aquinus:

I'm using Interrupt and Join only when I restart the simulation (by clicking on a button). I don't use it upon every iteration. During execution, no new threads are spawned.

I would love to have access to the Nvidia GPU on my PC. The problem I have is that my laptop has two graphics adapters. It always defaults to the Intel GPU when running apps. I wish I could use the discrete GPU without jumping through too many hoops, but I've yet to find a way to do that.
 
Ever tried C#? It's like Java without the stupidity. If you target .NET Framework 3.0 and conform to the platform's good practices, the same binary can run on Mono on Linux. Another user here contributed OpenCL C# code that executes on as many GPUs as you have and balances the load amongst them: https://www.techpowerup.com/forums/...ultiple-gpus-and-cpu-at-the-same-time.232334/

HTML5 is the most cross-platform language there is. If rendering, it uses the GPU that owns the screen. Don't have experience with NVIDIA Optimus so I'm not sure how it handles browsers. Not that it matters...if you're doing it on CPU now, even an integrated GPU should be significantly faster at the task. CPUs really suck at rendering. They're built for complex logic. GPUs are built for doing simple tasks, over and over, across many processors.

HTML5 mostly builds off of JavaScript which should be familiar to you (not all that different from Java).
 
Ever tried C#? It's like Java without the stupidity. If you target .NET Framework 3.0 and conform to the platform's good practices, the same binary can run on Mono on Linux. Another user here contributed OpenCL C# code that executes on as many GPUs as you have and balances the load amongst them: https://www.techpowerup.com/forums/...ultiple-gpus-and-cpu-at-the-same-time.232334/

I haven't tried C#. I'll give it a whirl. I'm downloading VS 2017 now. How difficult is it to use DirectX utilizing a discrete GPU? I'd like to start as simple as possible. Say, animating circles that bounce off the edges of a "canvas".

HTML5 is the most cross-platform language there is. If rendering, it uses the GPU that owns the screen. Don't have experience with NVIDIA Optimus so I'm not sure how it handles browsers. Not that it matters...if you're doing it on CPU now, even an integrated GPU should be significantly faster at the task. CPUs really suck at rendering. They're built for complex logic. GPUs are built for doing simple tasks, over and over, across many processors.

So, HTML5 renders to the current GPU? That was the problem I was having in Java. It would render to the default GPU and not the more powerful, discrete GPU.

HTML5 mostly builds off of JavaScript which should be familiar to you (not all that different from Java).

I'm familiar with JavaScript. I've never used it to create simulations, though. Mainly for other web development purposes, such as validation, DOM manipulation, jQuery, AJAX, etc. In terms of pure logic and arithmetic calculations, I don't believe it to be faster than Java. Perhaps I'm mistaken?
 
I haven't tried C#. I'll give it a whirl. I'm downloading VS 2017 now. How difficult is it to use DirectX utilizing a discrete GPU? I'd like to start as simple as possible. Say, animating circles that bounce off the edges of a "canvas".
DirectDraw is for 2D, Direct3D is for 3D. If you want to animate canvases and don't care about cross-platform, I'd recommend using Windows Presentation Foundation which is GPU accelerated but only available on Windows. Here's basic WPF animations:
https://www.codeproject.com/articles/364529/animation-using-storyboards-in-wpf

So, HTML5 renders to the current GPU? That was the problem I was having in Java. It would render to the default GPU and not the more powerful, discrete GPU.
I imagine you'd have to go into Optimus settings and tell it to give the browser NVIDIA priority.

I'm familiar with JavaScript. I've never used it to create simulations, though. Mainly for other web development purposes, such as validation, DOM manipulation, jQuery, AJAX, etc. In terms of pure logic and arithmetic calculations, I don't believe it to be faster than Java. Perhaps I'm mistaken?
Java swing = CPU
HTML5 = GPU
Visual Studio WinForms = CPU
Windows Presentation Foundation = GPU

Example of bouncing box in HTML5 canvas:
http://jsfiddle.net/n2derqgw/4/
 
Now, now. I don't make fun of you for using .NET so I would keep statements like this to yourself. ;)
Two examples: 1) "Swing," what part of that sounds like UI to you? 2) the lack of unsigned integers makes any kind of binary data handling needlessly complicated (you constantly have to convert types).

Not talking about programmers; just that Java in general is haphazard and poorly thought out. The only thing going for it is its cross-platform nature.

Just don't use Swing. LWJGL gives you access to OpenGL bindings in Java.
https://www.lwjgl.org/
There's the Java approach to GPU rendering. If insisting on sticking with Java, probably the best path.
 
It would render to the default GPU and not the more powerful, discrete GPU.

Didn't read the entire thread, but what is it exactly that you are trying to render? Java and graphics do not go well together if performance is your goal, no matter what you do. If whatever it is that you are trying to do can be packed into simple calculations over large vectors (likely the case if you are drawing pixels), OpenCL could be a solution and you wouldn't have to worry about any dedicated graphics API or cumbersome multithreading.

There is no single compiler directive or library that can instantly make your app multi-threaded, because that would be too easy ;)

I know it's a year-old comment, but believe it or not, such a thing exists if you use C/C++: it's OpenMP.
 
Didn't read the entire thread, but what is it exactly that you are trying to render?

I'm rendering polygons. Nothing fancy. The polygons interact with each other when they overlap.


OpenCL could be a solution and you wouldn't have to worry about any dedicated graphics API or cumbersome multithreading.

Thanks for the recommendation. I'll look into it.


Easiest way to get the job done: use an Executor and start throwing jobs at it. https://docs.oracle.com/javase/tutorial/essential/concurrency/executors.html
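To sketch what that looks like (the class and method names here are made up for illustration), split the work into chunks and submit each as a task to a fixed-size pool:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ExecutorDemo {
    // Splits [0, n) into chunks and sums i*i across a thread pool.
    public static double sumOfSquares(int n) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(
                Runtime.getRuntime().availableProcessors());
        List<Future<Double>> futures = new ArrayList<>();
        int chunk = 1000;
        for (int start = 0; start < n; start += chunk) {
            final int lo = start;
            final int hi = Math.min(start + chunk, n);
            futures.add(pool.submit(() -> {
                double acc = 0;
                for (int i = lo; i < hi; i++) acc += (double) i * i;
                return acc;
            }));
        }
        double total = 0;
        for (Future<Double> f : futures) total += f.get(); // blocks per chunk
        pool.shutdown();
        return total;
    }
}
```

The pool is created once and reused for every job, which avoids the per-thread startup cost discussed earlier in the thread.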

Also, get off Netbeans, IntelliJ Idea is vastly superior (and free for personal use). Its code analyzer is nothing short of miraculous.

I've been using Netbeans for a while now and have grown used to it. I'll look into IntelliJ, sounds interesting.

I've since downloaded Visual C# and created a quick benchmark app consisting of a billion iterations, doing trigonometric computations. Here's what the code looks like:

Code:
...
for (int i=0; i<1000000000; i++)
{
          x = r * Math.cos(theta);
          y = r * Math.sin(theta);
          theta += Math.PI * 0.001;
          r += 0.1;
}
...

The time it took to execute is as follows:

C# (Fastest): 78715 ms

Java (Fastest): 792166 ms


So (not too surprisingly), C# is effectively 10 times faster with the given code. Now to create a simple benchmark with animation...
 
2) lack of unsigned integers make any kind of binary data handling needlessly complicated (constantly have to convert types).
Why are you converting between signed and unsigned in the first place? If you run out of values, use a long instead of an int. If that's still too small, use a BigInteger. Just because the human-readable value is negative doesn't make it useless for the same operations, and it doesn't really change anything. It's "just bits"; the sign only says a particular bit has a particular value. Bitwise operators and everything else still work as expected on the signed variant versus the unsigned one. I see no difference between having 0 to 255 or -128 to 127 for a byte, as they're representing the same thing. The only impact is how it's represented to a person; as far as the computer is concerned, a byte is a byte.
1) "Swing," what part of that sounds like UI to you?
It doesn't and I have absolutely no intention of defending anything about Swing. :)
C# (Fastest): 78715 ms

Java (Fastest): 792166 ms
Trig functions in Java are a weird animal, as they're intended to be accurate enough to do useful things with, which means going a lot of decimal places out. There are faster implementations at the cost of accuracy. Depending on your use case, you might not need something as accurate as what is in java.lang.Math. Since theta only advances by a fixed step, you could just make a lookup table for those values, since a table lookup will be a lot faster than doing trig functions.
https://stackoverflow.com/questions/523531/fast-transcendent-trigonometric-functions-for-java
 
Why are you converting between signed and unsigned in the first place?
Because files are almost exclusively handled as unsigned byte streams. Java has to handle them as signed byte streams (who seriously ever uses a signed byte? so rare). This especially gets complicated in things like animations where you have two unsigned bytes in a struct that can either be instructions or a signed short value. Java has to do many steps when it should only be one or two.

I see no difference between having 0-255 or -128-127 for a byte as they're representing the same thing.
Except that in the case of files, a negative value is usually impossible and because Java condemned that 8th bit to be useless, you've halved the upper bound of the value doubling the risk of overflow.


I'm trying to pin down a WPF physics sample that works but no luck so far...

Edit: I sort of broke Visual Studio so it's gonna be a while...
 
Except that in the case of files, a negative value is usually impossible and because Java condemned that 8th bit to be useless, you've halved the upper bound of the value doubling the risk of overflow.
It still doesn't make a difference. It's two's complement, so the value that comes after 0111 1111 (127) is 1000 0000 (-128, or 128 unsigned), and adding one to either the signed or unsigned representation results in the same bit pattern of 1000 0001 (if you let it overflow or use a bitwise op instead of an arithmetic op). It doesn't change any bitwise operation you would use on it; it only changes how it's displayed to you. How you use the bits is still completely up to you. Just because it's different from what you're used to doesn't mean it's bad.
 
I'm rendering polygons. Nothing fancy. The polygons interact with each other when they overlap.

Thanks for the recommendation. I'll look into it.

I've been using Netbeans for a while now and have grown used to it. I'll look into IntelliJ, sounds interesting.

I've since downloaded Visual C# and created a quick benchmark app consisting of a billion iterations, doing trigonometric computations. Here's what the code looks like

Code:
...
for (int i=0; i<1000000000; i++)
{
          x = r * Math.cos(theta);
          y = r * Math.sin(theta);
          theta += Math.PI * 0.001;
          r += 0.1;
}
...

The time it took to execute is as follows:

C# (Fastest): 78715 ms

Java (Fastest): 792166 ms


So (not too surprisingly), C# is effectively 10 times faster with the given code. Now to create a simple benchmark with animation...
Interesting test there. Depending on what you do with those values, the compiler could decide it doesn't have to do anything if the values are never used. Out of curiosity, what were the initial values?

One other thing, while I'm not familiar with C#, Java defaults to double which is in most cases overkill. I ran your code using float instead of double. It's almost 4x faster ;)
 
It still doesn't make a difference. It's twos compliment so the value that comes after 0111 1111 (127) is 1000 0000 (-128, or 128 unsigned,) and adding one to either the signed or unsigned representation will result in the same byte ordering of 1000 0001 (if you let it overflow or use a bitwise op instead of an arithmetic op.) It doesn't change any bitwise operation you would be using on it. It only changes how it's displayed to you. How you use them is still completely up to you. Just because it's different than what you're used to doesn't mean it's bad.
Just think about trying to read a GIF in Java. You have a palette of 256 indexed colors and you have an unsigned byte array for each pixel that references a color in a specific frame. In Java, you can't just read the bytes and pull the indexed color for each. You have to read the byte, convert it to a signed short, and use that signed short to pull the indexed color from the array. If you don't take that extra step, any index > 127 will give an out of bounds exception because negative indexes are invalid. There's an extra step here Java forces on developers because it doesn't give access to the full range of primitives processors support.

That stupidity carries back to writing the updated GIF too. All of those signed shorts Java forced you to convert to must be converted back to signed bytes for writing, or else the file structure will become non-compliant with the standard.

C# not only lets you use all of the primitives, they're also classed (e.g. int.MaxValue, int.MinValue, and int.ToString() is a thing), they can be casted and converted between types, and you can change between signed and unsigned without changing the binary via (unchecked). Literally everything anyone needs to do with binary data is baked in and practical.
 
expecting efficient cpu usage from java is sorta like expecting a 300lb man not to use the handicap electric carts at the store
it's possible but extremely unlikely, and generally an inconvenience for all

Off topic but I used to be 330lb (now 240) and could walk just fine.
 
Interesting test there. Depending on what you do with those values, the compiler could decide it doesn't have to do anything if the values are never used. Out of curiosity, what were the initial values?

One other thing, while I'm not familiar with C#, Java defaults to double which is in most cases overkill. I ran your code using float instead of double. It's almost 4x faster ;)

(Kindly FYI) This has nothing to do with my Polygon simulation. But the entire (Java) "benchmark" code is as follows:

Code:
        System.out.println("Calculations started...");
        long startTime = System.currentTimeMillis();
        double x, y;
        double r = 10;
        double theta = 0;

        for (int i=0; i<1000000000; i++)
        {
            x = r * Math.cos(theta);
            y = r * Math.sin(theta);
            theta += Math.PI * 0.001;
            r += 0.1;
        }
     
        long finishTime = System.currentTimeMillis();
        long elapsedTime = (finishTime-startTime);
     
        System.out.println("Calculations complete in: " + elapsedTime + " MilliSeconds");

When running it in a thread, I was getting results as fast as 513055 ms.
 
(Kindly FYI) This has nothing to do with my Polygon simulation. But the entire (Java) "benchmark" code is as follows:

Code:
        System.out.println("Calculations started...");
        long startTime = System.currentTimeMillis();
        double x, y;
        double r = 10;
        double theta = 0;

        for (int i=0; i<1000000000; i++)
        {
            x = r * Math.cos(theta);
            y = r * Math.sin(theta);
            theta += Math.PI * 0.001;
            r += 0.1;
        }
    
        long finishTime = System.currentTimeMillis();
        long elapsedTime = (finishTime-startTime);
    
        System.out.println("Calculations complete in: " + elapsedTime + " MilliSeconds");

When running it in a thread, I was getting results as fast as 513055 ms.
Print x and y when you're done (for both Java and C#). If the compiler figures out you don't use those values, it may decide not to compute them at all. That way you can be 100% sure you're testing the math and not some compiler optimization.
And if you want to dig more into benchmarking, look up JMH ;) Java is way trickier to benchmark than it would seem.
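One way to guarantee the loop isn't optimized away (a sketch of the idea, not the OP's exact code) is to accumulate the results and return or print the sum:

```java
public class TrigBench {
    // Accumulating x and y into a sink forces the JIT to actually
    // perform the trig calls; returning it keeps the work observable.
    public static double run(int iterations) {
        double x, y, sink = 0;
        double r = 10;
        double theta = 0;
        for (int i = 0; i < iterations; i++) {
            x = r * Math.cos(theta);
            y = r * Math.sin(theta);
            theta += Math.PI * 0.001;
            r += 0.1;
            sink += x + y; // consume the values so they must be computed
        }
        return sink;
    }
}
```

JMH formalizes the same trick with its Blackhole class, which "consumes" values so dead-code elimination can't skew the measurement.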
 
Just think about trying to read a GIF in Java. You have a palette of 256 indexed colors and you have an unsigned byte array for each pixel that references a color in a specific frame. In Java, you can't just read the bytes and pull the indexed color for each. You have to read the byte, convert it to a signed short, and use that signed short to pull the indexed color from the array. If you don't take that extra step, any index > 127 will give an out of bounds exception because negative indexes are invalid. There's an extra step here Java forces on developers because it doesn't give access to the full range of primitives processors support.
...or you could use a map of bytes to some object that describes everything needed for that color? There are other ways to do what you suggest without forcing the language to do it the way you want. There are right and wrong ways to do things in just about every language. If you need to look something up and you can't directly use an array, you just use a map.
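For what it's worth, the usual Java idiom for the GIF-index case is a single bitmask rather than a map or a short conversion (a minimal sketch; the names are illustrative):

```java
public class UnsignedByteDemo {
    // Java stores bytes as signed, but (raw & 0xFF) widens to an int
    // holding the unsigned value 0-255, which can index an array directly.
    public static int lookup(int[] palette, byte raw) {
        return palette[raw & 0xFF];
    }
}
```

Since Java 8 there's also Byte.toUnsignedInt, which performs the same widening in one call.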

As for the OP, the same logic can apply, but with a slightly different problem.
(Kindly FYI) This has nothing to do with my Polygon simulation. But the entire (Java) "benchmark" code is as follows:

Code:
        System.out.println("Calculations started...");
        long startTime = System.currentTimeMillis();
        double x, y;
        double r = 10;
        double theta = 0;

        for (int i=0; i<1000000000; i++)
        {
            x = r * Math.cos(theta);
            y = r * Math.sin(theta);
            theta += Math.PI * 0.001;
            r += 0.1;
        }

        long finishTime = System.currentTimeMillis();
        long elapsedTime = (finishTime-startTime);

        System.out.println("Calculations complete in: " + elapsedTime + " MilliSeconds");

When running it in a thread, I was getting results as fast as 513055 ms.
The code here is going to run the same cos and sin methods with the same arguments 500,000 times each, since the angles repeat after you pass 2π (360 degrees). The OP should be generating a lookup table for all values of cos(n) and sin(n) over a range of 0 to 2π, with a step of 0.001π. One thousandth can be represented as an integer (1 = 1/1000), and that's your key. If you really felt so inclined, you could use an array.
Code:
public class Test {

  public static class TrigWrapper {
    public double sin;
    public double cos;

    public TrigWrapper(double s, double c) {
      this.sin = s;
      this.cos = c;
    }
  }

  public static TrigWrapper[] makeTrigMap() {
    double step = 0.001;
    double currentAngle;
    TrigWrapper[] output = new TrigWrapper[2000];
    for(int i = 0; i < 2000; i++) {
      currentAngle = i * (step * Math.PI);
      output[i] = new Test.TrigWrapper(Math.sin(currentAngle), Math.cos(currentAngle));
    }
    return output;
  }
}

With something like that, instead of having:
Code:
        for (int i=0; i<1000000000; i++)
        {
            x = r * Math.cos(theta);
            y = r * Math.sin(theta);
            theta += Math.PI * 0.001;
            r += 0.1;
        }
You could have something like this:
Code:
        Test.TrigWrapper[] lookupTable = Test.makeTrigMap();
        for (int i=0; i<1000000000; i++)
        {
            int n = i % 2000;
            x = r * lookupTable[n].cos;
            y = r * lookupTable[n].sin;
            r += 0.1;
        }
Edit: Don't mind my Java. I tend to write Clojure when I'm using the JVM. Logic might not be perfect but, you get the idea.

Edit 2: Making the lookup table (in Clojure, which runs on the JVM) takes very little time.
Code:
(defn make-lookup-table []
  (vec
    (for [i (range 2000)]
      (let [n (* (/ i 1000.0) Math/PI)]
        {:cos (Math/cos n)
         :sin (Math/sin n)}))))
Code:
(time (def table (make-lookup-table)))
> "Elapsed time: 2.466348 msecs"
> #'some-math.core/table

Edit 3: Lookup time, not too bad.
Code:
(time (get-in table [1373 :cos]))
> "Elapsed time: 0.079673 msecs"
> -0.3884807466313663
 
Print x and y when you're done (for both Java and C#). If the compiler figures out you don't use those values, it may decide not to compute them at all. That way you can be 100% sure you're testing the math and not some compiler optimization.
And if you want to dig more into benchmarking, look up JMH ;) Java is way trickier to benchmark than it would seem.

I updated the code and got the same result.

For the moment, what I'm doing (drawing polygons as objects that interact) is more perceptual than functional. It's doing what it's supposed to and appears as I envisioned it. I'll see if I can port it to its C# equivalent for comparison purposes.
 
I got this code working!
https://chriscavanagh.wordpress.com/2006/10/23/wpf-2d-physics/

Stuff attached. I had to delete the compiled stuff in Source.zip so it was small enough to upload. Open the NewtonDynamics solution file first and compile that. Then open the WPFPhysics solution and compile that. You should have a working example then. CodePlex didn't bother to save a unified solution file, so I had to rebuild them.

...or you could use a map of bytes to some object that describes everything needed for that color? There are other ways to do what you suggest without forcing the language to do it the way you want. There are right and wrong ways to do things in just about every language. If you need to look something up and you can't directly use an array, you just use a map.
A dictionary? Seriously? The memory footprint balloons and performance plummets using a dictionary compared to an array for no reason other than Java's design stupidity. No, just no. Better off converting to and from short and sticking to an array.
 


The memory footprint balloons and performance plummets using a dictionary compared to an array
Since those are obviously huge problems with a map with 256 values. :kookoo:
 