Inside: SereneScreen Fan Forum (https://www.feldoncentral.com/forums/index.php)
-   Goldfish Aquarium 2 for Mac OS X (https://www.feldoncentral.com/forums/forumdisplay.php?f=44)
-   -   Performance isn't up to expectations (https://www.feldoncentral.com/forums/showthread.php?t=2557)

johnblommers 08-20-2004 04:38 PM

Performance isn't up to expectations
 
Today I downloaded and purchased the Goldfish Aquarium for Mac OS X beta and it sure is pretty. And I like it. I like it but would like to raise some questions about performance.

My system is a dual-processor 2GHz G5 with 1.5GB RAM. The main monitor is a 22-inch Apple Cinema Display set to millions of colors at 1600x1024. When I run the Goldfish Aquarium screen saver with all 5 fish in a lush tank, I get frame rates just over 100 fps. That's OK, and reducing it to one fish gives maybe a 10% improvement. Yes, I enabled the dual-processor option (thanks for providing it!).

But I have a second, dual-headed Radeon 7000 video card with 32MB VRAM. Attached are a 15-inch LCD at 1024x768 and a 17-inch LCD at 1280x1024. Each runs at millions of colors and gets 16MB of VRAM from the card.

When I configure Goldfish Aquarium to use all three monitors the frame rate on each monitor drops down to about ONE frame per second. OK I know this product is still in beta.

BTW the sound preferences don't take effect until you EXIT the setup - now that's odd. Feedback should be instantaneous.

Best ...

AKcrab 08-20-2004 05:47 PM

Re: Performance isn't up to expectations
 
Quote:

Originally posted by johnblommers
But I have a second, dual-headed Radeon 7000 video card with 32MB VRAM. Attached are a 15-inch LCD at 1024x768 and a 17-inch LCD at 1280x1024. Each runs at millions of colors and gets 16MB of VRAM from the card.
I think the 17" is part of the problem... From the system requirements for GA:
Quote:

* VIDEO: OpenGL accelerated drivers with 16MB VRAM (32MB VRAM for larger than 1024x768)
So in order to drive 1280x1024 you'll be needing 32MB, and you've only got 16.

I must say, I *wish* I had your problem! :D
Nice setup.
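
A rough sketch of why 16MB gets tight at 1280x1024. It assumes "millions of colors" means 4 bytes per pixel and that the card holds three full-screen surfaces per display (front buffer, back buffer, and a 32-bit depth buffer). The real driver allocation may differ, and textures and geometry need room on top of this. The name framebuffer_mb is made up for the illustration.

```python
BYTES_PER_PIXEL = 4  # assumption: "millions of colors" stored as 32 bits per pixel


def framebuffer_mb(width, height, buffers=3):
    """Estimate VRAM (in MB) for `buffers` full-screen surfaces.

    The default of 3 assumes a front buffer, a back buffer and a depth buffer.
    """
    return width * height * BYTES_PER_PIXEL * buffers / (1024 * 1024)


displays = {"15-inch": (1024, 768), "17-inch": (1280, 1024)}
for name, (w, h) in displays.items():
    print(f"{name} {w}x{h}: about {framebuffer_mb(w, h):.1f} MB of the 16 MB share")
```

Under these assumptions the buffers for the 17-inch display alone take about 15MB of its 16MB share, while the 15-inch display needs about 9MB.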

JimO'Connor 08-20-2004 06:20 PM

AKcrab is right. You are overstressing your video card's memory. I have an almost identical set-up: same CPU, same memory, and the same Cinema Display at millions. However, I'm running my 2nd display (1280x1024 at millions) on the built-in video card and perf is pretty good. I don't have a third display.

Probably -- if you put your Cinema Display and one other display on the same card you'd be okay, and let your non-factory card handle a single display.

Come back often!

johnblommers 08-20-2004 08:33 PM

So I removed the Apple Studio Display ...
 
OK, I simply removed the 17-inch 1280x1024 Apple Studio Display from the add-on ATI card.

I took the remaining 15-inch Cintiq LCD, switched it to the DVI interface, and plugged it into the DVI port that the Apple Studio Display formerly occupied. Now the single 15-inch monitor, at 1024x768 and millions of colors, has all 32MB of VRAM available. I rebooted just to be sure. The ATI Displays program agrees.

So the main Apple Cinema Display at 1600x1024 is still on the main 64MB VRAM ATI 9600 Pro. Using only this main display I can get 100+ fps.

Switching to all-screens mode, things got a little bit better. I now get a hair over 13 fps on BOTH monitors. The frame rates are virtually identical. I was expecting the big screen to stay at 100+ fps, with only the screen on the weaker ATI 7000 card slowing down.

BTW I am running Mac OS X 10.3.5 so all my drivers are current.

I guess my question is: why is there this coupling between the two screens? I have two CPUs, so the second screen saver could have a whole CPU to itself, just as it has a whole video card to itself.

BTW there may be more under the hood here, as my Apple Studio Display decided to quit working on me just after using the screen saver! It may well be the video card going bad. It's not the first ATI card to go South on me.

Anyhow, please share your thoughts with me.

This is fun troubleshooting, and I am enjoying the beta software and looking forward to upcoming improvements.

Good luck!

JimO'Connor 08-20-2004 11:08 PM

Three displays/two video cards
 
Hi John,

Please make me a table of the configurations and the speed. Following it in the text of a paragraph is hard for me. What I see you have here is:

rate    22"             15" Cintiq    17" ASD
100+    ATI 9600 Pro    Off           Off
13      ATI 9600 Pro    ATI 7000      Off

This is as I would expect once I thought more about what you have and how we work. You only draw as fast as your slowest video card. Try this:
40+     ATI 9600 Pro    Off           ATI 9600 Pro

We use multiple CPUs for tweening the fish geometries. Threading the entire drawing model would not be economical for the small portion of the market which has multiple CPUs and multiple monitors and isn't using factory video cards, though I'd enjoy doing it. Drawing to OpenGL on multiple threads wasn't something I wanted to tackle on the all-nighter when I added the threaded computation; what I read indicated it wasn't straightforward.
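
As a minimal sketch of the split described here, the snippet below tweens each fish on a worker thread and leaves all drawing to the calling thread. It is not the product's code: the names tween and tween_all, the use of linear interpolation, and the data layout (lists of (x, y, z) tuples) are assumptions made for illustration. In CPython the threads show the structure rather than a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def tween(key_a, key_b, t):
    """Linearly interpolate two keyframes (lists of (x, y, z) vertices)."""
    return [
        tuple(a + (b - a) * t for a, b in zip(va, vb))
        for va, vb in zip(key_a, key_b)
    ]


def tween_all(fish, t, workers=2):
    """Tween every fish on a pool of worker threads; results keep input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda pair: tween(pair[0], pair[1], t), fish))


# Two hypothetical fish, each given as a (start keyframe, end keyframe) pair.
fish = [
    ([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)], [(0.0, 2.0, 0.0), (1.0, 2.0, 0.0)]),
    ([(0.0, 0.0, 1.0)], [(4.0, 0.0, 1.0)]),
]
frames = tween_all(fish, 0.5)
# Drawing the tweened geometry would still happen here, on a single thread.
print(frames)
```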

The quick (and therefore likely to happen in the near term) solution would be to have the ability to pick which displays are used in multiple display mode, so you could block the ATI 7000.

The way to get things changed in the program (like re-writing the drawing model to make it fully threaded) is to:
1) buy the product (thanks for doing that)
2) talk nicely about the product in public forums (saw your post on VT, thanks for doing that)
3) get other people to buy the product (you now know what to give family for birthday, right?)
4) talk to me about it here from time to time
5) be patient and friendly (thanks for doing that)

You are most of the way there!

Thanks,
Jim

johnblommers 08-21-2004 01:19 AM

Aha! I understand
 
OK, one thing you said explains everything.

"You can only draw as fast as your slowest video card."

The ATI 7000 with just the 15" Cintiq is limited to about 13fps.
The ATI 7000 with two monitors is overdriven and reduces the fps to about 2. That slower card is the limiting factor. What I really need to do is put a more modern card in there!
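
The rule can be put into numbers with a small sketch. It uses the per-card rates quoted in this thread (about 100 fps for the 9600 Pro and about 13 fps for the ATI 7000 driving only the 15-inch display) and compares two assumed models, since the thread does not say how the drawing loop is scheduled: lockstep_fps assumes the cards work in parallel and the frame waits for the slowest one, while serial_fps assumes the displays are drawn one after another. Both function names are hypothetical.

```python
def lockstep_fps(per_display_fps):
    """Frame rate if all displays render in parallel and the slowest sets the pace."""
    return min(per_display_fps.values())


def serial_fps(per_display_fps):
    """Frame rate if displays are drawn one after another within each frame."""
    return 1.0 / sum(1.0 / fps for fps in per_display_fps.values())


single = {"22-inch on ATI 9600 Pro": 100.0}
mixed = {"22-inch on ATI 9600 Pro": 100.0, "15-inch on ATI 7000": 13.0}

for label, setup in (("one display", single), ("two displays", mixed)):
    print(f"{label}: lockstep {lockstep_fps(setup):.1f} fps, "
          f"serial {serial_fps(setup):.1f} fps")
```

The "hair over 13 fps" reported for both screens is closer to the lockstep figure, though the measurements are too coarse to settle which model applies.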

BTW I am not rich, I just find the extra monitors to be a tremendous productivity booster. Some of my friends shake their heads, and my students are amazed that multiple monitors are even possible.

Now the SereneScreen Marine Aquarium 2.0.6 turns out to follow the same performance rule. But Marine Aquarium can put out about 47 fps on the ATI 7000, so it's not an issue.

Goldfish Aquarium seems to run at about 1/3 the speed of Marine Aquarium, but then Marine Aquarium has been out a long time and is polished & optimized.

This has been a good and worthwhile exchange. Thank you.

AKcrab 08-21-2004 02:29 AM

Re: Aha! I understand
 
Quote:

Originally posted by johnblommers
Goldfish Aquarium seems to run at about 1/3 the speed of Marine Aquarium, but then Marine Aquarium has been out a long time and is polished & optimized.
If you look at the wire frames (press W) while running each of the simulations, I think it will be pretty obvious why the Goldfish takes more power.

The complexity of the goldfish mesh is amazing.

JimO'Connor 08-21-2004 06:55 AM

Re: Re: Aha! I understand
 
Quote:

Originally posted by AKcrab
If you look at the wire frames (press W) while running each of the simulations, I think it will be pretty obvious why the Goldfish takes more power.

The complexity of the goldfish mesh is amazing.

AKcrab has this right. Goldfish is slower primarily because there is just more data to push around. When the "Vertex Array Range" check box is hooked up to something then the number of times the data is pushed around will be reduced and both MA and GA will receive a speed boost (size yet to be determined, but I expect it to be larger for GA than MA because GA has so much more data). I won't promise when this will happen, but I have every intention of making it happen.

There is one algorithmic black hole in GA that MA doesn't have, and it has so far resisted optimization because it depends on accessing a large table and has lots of ifs in it. The large table means a lot of memory access (a bad thing), and the ifs give the instruction pipeline trouble. An old optimization trick was to store commonly used values (such as sin, cos, tan, sqrt) in a table rather than recompute them. This is now probably a bad idea: recalculating them is often more efficient than looking them up because of the speed disparity between the CPU and memory (unless the entire table can be kept in cache and not shoved out).
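
For readers who have not met the old trick, here is a small sketch of the two approaches side by side. The table size and the function names sin_lookup and sin_compute are assumptions for illustration. Python will not reproduce the cache behavior described above; the snippet only shows what each approach does and how much precision a nearest-entry table gives up.

```python
import math

TABLE_SIZE = 1024  # hypothetical table size, one entry per 1/1024 of a full turn
SIN_TABLE = [math.sin(2 * math.pi * i / TABLE_SIZE) for i in range(TABLE_SIZE)]


def sin_lookup(angle):
    """Nearest-entry table lookup: one memory access, limited precision."""
    index = round(angle / (2 * math.pi) * TABLE_SIZE) % TABLE_SIZE
    return SIN_TABLE[index]


def sin_compute(angle):
    """Direct computation: no table, so nothing to be pushed out of the cache."""
    return math.sin(angle)


# Compare the two over one period, in steps of 0.01 radians.
worst = max(abs(sin_lookup(a / 100) - sin_compute(a / 100)) for a in range(629))
print(f"largest difference over one period: {worst:.4f}")
```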

G4 optimization (AltiVec) has so far not been done because it requires us to pad our data by 1/3 in order to get proper alignment, which then increases the amount of data we have to push around by 1/3, which then costs far more than the computational savings. Since we aren't primarily computationally bound, but memory bound, this is a bad thing. In a single aquarium, notice that turning "multiple CPUs" on and off doesn't have a HUGE effect. Enough to be worthwhile, but not 30%. After we go to version 1.0 I expect to take another look at AltiVec. That is the neat thing about having a line of similar products. Each one gives us a chance to improve on the previous one, and then the improvements get rolled back to the first product (eventually).
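
One way to arrive at the 1/3 figure, as a sketch: assume each vertex is stored as three 32-bit floats (x, y, z) and must be padded to four floats to line up with AltiVec's 16-byte vectors. The vertex layout is an assumption; the post does not describe the actual data format.

```python
import struct

packed = struct.calcsize("<3f")  # x, y, z as 32-bit floats
padded = struct.calcsize("<4f")  # x, y, z plus one padding float (16 bytes)
growth = (padded - packed) / packed

print(f"{packed} bytes per vertex packed, {padded} bytes padded, "
      f"{growth:.0%} more data to move")
```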

johnblommers 08-21-2004 02:44 PM

Performance numbers
 
This is so interesting I decided to characterize the performance of the Goldfish Aquarium (GA) application (not the screen saver). Here are the summary findings:

(1) In a 0-fish unpopulated tank, the frames per second (FPS) varies dramatically from 190 fps for a lush planting to an incredible 542 fps for a large clear tank. The amount of non-fish stuff in the tank is what impacts the performance most. This is the area ripe for optimization.

(2) In a lush planting tank, the performance varies slightly from 190 fps for an empty tank to 124 fps for a 5-fish tank. The number of fish has a minimal impact on the performance. Therefore can we please allow more fish in the tank?

(3) The size of the GA application window barely affects the performance until the window size exceeds 1024x768. This is very interesting! Good job!

(4) The GA application works very nicely when it spans monitors. When the monitors are connected to different performing video cards, the fps is higher when more of the window overlaps the faster card, and vice versa. When the application overlaps monitors connected to the same card, the performance remains constant. Also good job!

(5) It is possible to duplicate the GA application and run them all at the same time on multiple screens if desired. Then you can look at the punishment being visited upon the graphics card using the free utility GET_ATI_NVIDIA_RAM_V059 available from:

http://people.freenet.de/amichalak/A...NOINSTALL.sitx

-------------------------------------------------
The raw data supporting the above conclusions follows:

Application mode 1024x768 with bubbles
124 fps 5 monstros lush planting
142 fps 4 monstros lush planting
142 fps 3 monstros lush planting (same as for 4, I know)
166 fps 2 monstros lush planting
181 fps 1 monstros lush planting
190 fps 0 monstros lush planting

Application mode 1024x768 with bubbles
190 fps 0 monstros lush planting
232 fps 0 monstros medium planting
250 fps 0 monstros rocks only
500 fps 0 monstros pond
542 fps 0 monstros large clear tank

Application mode 1024x768 with bubbles
542 fps 0 monstros large clear tank
477 fps 1 monstros large clear tank
333 fps 2 monstros large clear tank
300 fps 3 monstros large clear tank
250 fps 4 monstros large clear tank
227 fps 5 monstros large clear tank

Application windowed mode with bubbles lush planting 5 monstros
123 fps 1024x768
124 fps 800x600
124 fps 320x240
124 fps 160x240

86 fps when hit F key for full screen
90 fps in window 1450x906

app window crosses screen boundary (314x96)
11.55 fps 100% on slow monitor
9.25 fps some on both
76 fps about 1/2 and 1/2
99 fps most on big monitor
125 fps all on big monitor

and when both monitors are on the same fast card the app runs at the same speed.

BTW the application can be duplicated so that many instances can run at once
------------------------------------------------------
System is a Dual processor G5 2Ghz unit
with 1.5Gig RAM,
one ATI Radeon 9600 Pro w/64meg VRAM
split across two attached LCD monitors
(ACD 1600x1024 and
ASD 1280x1024), and
one ATI Radeon 7000 with
one 1024x768 Wacom Cintiq LCD
running Mac OS X 10.3.5 with all the patches.
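
The frame rates above are easier to compare once converted to time per frame. The sketch below does that for the two 0-to-5-fish series in the raw data. The helper names frame_ms and per_fish_ms are made up for the illustration, and the per-fish figure is only an average between the empty and 5-fish runs.

```python
def frame_ms(fps):
    """Convert frames per second to milliseconds per frame."""
    return 1000.0 / fps


def per_fish_ms(runs):
    """Average extra frame time per fish between the empty and 5-fish runs."""
    return (frame_ms(runs[5]) - frame_ms(runs[0])) / 5


# fps by number of fish, copied from the raw data above
LUSH = {0: 190, 1: 181, 2: 166, 3: 142, 4: 142, 5: 124}
CLEAR = {0: 542, 1: 477, 2: 333, 3: 300, 4: 250, 5: 227}

for name, runs in (("lush planting", LUSH), ("large clear tank", CLEAR)):
    print(f"{name}: {frame_ms(runs[0]):.2f} ms empty, "
          f"{frame_ms(runs[5]):.2f} ms with 5 fish, "
          f"{per_fish_ms(runs):.2f} ms per fish")
```

On these figures each fish adds roughly half a millisecond per frame in both tanks, and the difference between the two empty-tank times is the cost of the lush planting itself.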

JimO'Connor 08-21-2004 03:52 PM

Re: Performance numbers
 
Quote:

Originally posted by johnblommers
This is so interesting I decided to characterize the performance of the Goldfish Aquarium (GA) application (not the screen saver). Here are the summary findings:

(1) In a 0-fish unpopulated tank, the frames per second (FPS) varies dramatically from 190 fps for a lush planting to an incredible 542 fps for a large clear tank. The amount of non-fish stuff in the tank is what impacts the performance most. This is the area ripe for optimization.

Drawing an empty screen (pond with no fish, debris off, bubbles off or large clear tank with same conditions) means just clearing the screen and drawing a grad fill rectangle. This takes some small number of milliseconds, requires almost no data be transferred to the video card, and requires the video card to do almost no work because we turn off depth testing, lighting, and most everything complicated to draw the background. This is like "while (true) ;" I have a debug build which outputs the average draw in milliseconds, including the extremes each second. Obviously this will vary between machines/monitors/build styles because of optimizations, etc, but it gives us hard numbers to compare.

                     Oreo      Jack      Monstro
Tank     Empty     1 fish    2 fish    3 fish     + bubbles    + debris
Clear    1.2 ms    3.0 ms    3.7 ms    4.3 ms     4.8 ms       NA
Rocks    3.0 ms    3.5 ms    4.0 ms    5.0 ms     6.0 ms       8.0 ms

Add in something to do in the loop and the loop takes a LOT longer (>2x!), but still doesn't take much time in absolute terms.

The rocks, with 22 textures and the complex lightplay model, cost about the same as the first fish, which has two textures but more polygons. Algorithmically, there isn't much of anything going on with the rocks. They mostly happen on the video card. Also, the rocks, being stationary, are hugely important to selling the illusion since the viewer can examine every wart in detail.

The debris is where an unbelievable amount of time goes. I can't make the same optimization for the debris that I did for the bubbles because the debris are actually in the tank instead of behind it, so we are stuck with the cost until I figure out something else.
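
The step-by-step cost of each element can be read off the table of draw times by taking differences between neighboring columns. The sketch below does that, assuming the columns are cumulative (each one adds something to the previous configuration) and that "NA" means no measurement. The function name marginal_costs is hypothetical.

```python
STAGES = ["empty", "1 fish", "2 fish", "3 fish", "+ bubbles", "+ debris"]
DRAW_MS = {
    "Clear": [1.2, 3.0, 3.7, 4.3, 4.8, None],  # None marks the NA entry
    "Rocks": [3.0, 3.5, 4.0, 5.0, 6.0, 8.0],
}


def marginal_costs(times):
    """Extra milliseconds added by each stage over the previous one."""
    return {
        stage: round(cur - prev, 1)
        for stage, prev, cur in zip(STAGES[1:], times, times[1:])
        if cur is not None
    }


for tank, times in DRAW_MS.items():
    print(tank, marginal_costs(times))

# Cost of the rocks themselves: empty rock tank minus empty clear tank.
rocks_ms = round(DRAW_MS["Rocks"][0] - DRAW_MS["Clear"][0], 1)
print("rocks:", rocks_ms, "ms")
```

Read this way, the rocks and the first fish in the clear tank each come to 1.8 ms, and the debris is the largest single step at 2.0 ms.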

Quote:


(2) In a lush planting tank, the performance varies slightly from 190 fps for an empty tank to 124 fps for a 5-fish tank. The number of fish has a minimal impact on the performance. Therefore can we please allow more fish in the tank?

The number of fish is more a concern of how the fish interact in the collision detection logic than computational considerations. The pond gives you 10 fish, but there is a lot of room for them to move around and nothing for them to hit except each other. Put more fish in a tank with rocks and other objects and they tend to just bounce off each other which isn't very appealing. Eric, ultimately, gets to decide that sort of thing.

Quote:


(3) The size of the GA application window barely affects the performance until the window size exceeds 1024x768. This is very interesting! Good job!

That is because we don't do anything different based on the size of the window. The hardware takes care of it, and the number of pixels the hardware must deal with is dwarfed by the incoming data (on your machine) until you get to about that size; then deciding what each pixel should be gets to be a larger issue. I wish I could take credit... :)

Quote:


(4) The GA application works very nicely when it spans monitors. When the monitors are connected to different performing video cards, the fps is higher when more of the window overlaps the faster card, and vice versa. When the application overlaps monitors connected to the same card, the performance remains constant. Also good job!

Apple handles all of that in the OpenGL layer and below. We don't do anything special. Again, I wish I could take credit...

Quote:


(5) It is possible to duplicate the GA application and run them all at the same time on multiple screens if desired. Then you can look at the punishment being visited upon the graphics card using the free utility GET_ATI_NVIDIA_RAM_V059 available from:

For testing purposes we have a build which will create an arbitrary number of windows and put the aquarium in each one. It is useless as a product, but it allows us to stress test the rendering code and simulate people with a whole lot of monitors. It is much more efficient than multiple copies of the application running because the different tanks get to share textures, models and other data.

Looking forward to the next installment! :)

johnblommers 10-16-2004 08:58 PM

Mac OpenGL Performance Tool
 
1 Attachment(s)
So it's Saturday evening and it's cold and damp out in the Pacific Northwest. Nothing to do but stay warm and cozy in the den with my trusty G5. I decide to learn a few things about OpenGL, so I point my browser at the Apple developer site at:

http://developer.apple.com/samplecod...enGL-date.html

and play with some of the code. I stumble across a developer tool called OpenGL Driver Monitor already on my system, part of the free Xcode developer kit. It lets you sample all kinds of performance statistics for each graphics card. If you're not happy with the performance of your graphics card, maybe this tool will help you pass the time.

JimO'Connor 11-12-2004 01:32 PM

How many out there have two video cards?

AKcrab 11-12-2004 02:40 PM

Quote:

Originally posted by JimO'Connor
How many out there have two video cards?
He's back!

I only have one. I would expect the number of users with multiple cards is going to be small.

JimO'Connor 11-12-2004 03:09 PM

Yes, that is probably true. I need some volunteers to test a change which will speed up drawing to multiple monitors. Probably the change will be most noticeable to people with cards of vastly differing capabilities.

We call this the "John Blommers" fix. Congrats John! :)

ESHIREY 11-12-2004 03:17 PM

I have 2 cards. The AGP is an FX5900XT and the PCI is an old, old Matrox Mystique. I am going to be getting a better PCI card. The AGP card is running my dual 21" Nokia Multigraph 445Xpro monitors and the PCI is running my 17" Dell.

JimO'Connor 11-12-2004 03:23 PM

Sorry Ed,

You need a Mac to hold them.


Jim

ESHIREY 11-12-2004 03:24 PM

I have a mac. But it only has 1 card. :(

johnblommers 11-12-2004 05:02 PM

You know I do
 
Quote:

Originally posted by JimO'Connor
How many out there have two video cards?
Count me in as a two-card player

ATI Radeon 9800 Pro Macintosh special edition (2 heads, 256Meg VRAM)
ATI Radeon 7000 Macintosh edition (two heads)

JimO'Connor 11-12-2004 11:19 PM

Re: You know I do
 
Quote:

Originally posted by johnblommers
Count me in as a two-card player

ATI Radeon 9800 Pro Macintosh special edition (2 heads, 256Meg VRAM)
ATI Radeon 7000 Macintosh edition (two heads)

Yes, and that is why this fix is in your honor. :)

johnblommers 11-12-2004 11:43 PM

Re: Re: You know I do
 
Quote:

Originally posted by JimO'Connor
Yes, and that is why this fix is in your honor. :)
And that is why you rock!

Some others don't.

Take RealMyst - it crashes on my system unless I take out two of my monitors.

Take The Incredibles Demo - it hangs up unless I take out two of my monitors.

Take HomeWorld2 - it thinks my ATI 9800 cannot support multiple rendering contexts and disables shadows and hyperspace effects. If I disable the monitor on the ATI 7000 card it works fine.


All times are GMT -6.