JVM crash with si_signo=SIGBUS: si_errno=Not enough space

8 replies [Last post]
neighbour
Offline
Joined: 2007-11-27

We've been facing a JVM crash that usually reproduces within 12 to 24 hours.
This happens on Solaris/T1 and Solaris/Opteron hardware.
The environment is:
- JDK1.6.0_13, Solaris, Sun-Fire-T1000 (1 CPU, 1 GHz, 8 cores)
- JDK1.6.0_13, Solaris, Sun-Fire X4200 (2 CPU, 2.2 GHz, dual core AMD Opteron 275)

Unfortunately, we have not yet been able to isolate the bug (or its cause) or to provide a simple test that reproduces the problem.

But the hs_err_pid*.log files in the attachment will probably provide some clue.

I've found only one bug in the SUN bug DB similar to this one: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4197092, which was submitted in 1998 and relates to JDK 1.2.

Message was edited by: neighbour

jjburke
Offline
Joined: 2004-03-16

Yes demonduck, internal QA should be happening. In the absence of any articles about this, I suppose internal QA is at best minimal. Something perhaps like a medium sized lab with mostly Sun equipment.

A fully funded lab would have a spectrum of video cards and PC brands, plus test cases that run repetitively day after day. The test harness, maybe a C++ program, would watch the environment and report back anomalies. It's really old hat. I hope Sun has been doing this for a long time, because a test harness is a growing creature.

I used to write installation code for my large department's program installs. This was at the HDQ of a major retail company in Plano TX. There, more than half my code would check and report the success of some department's install process (dlls, exes, programs, and registry entries existed with correct version) then report back the install results to a database. Also checked memory usage for Performance Monitor stair step memory leaks. My installs went to 1,500 PCs. Automated installs only had 5-6 PCs with errors. On those errors, I was often there before the user realized a problem existed.

Microsoft has certainly had QA labs for its operating systems for a long, long time. I think this is easier because MS owns the base environment. With Java, which runs on top of the OS and alongside other programs, it's a greater challenge.

Thanks,
Jim

demonduck
Offline
Joined: 2008-03-14

It is indeed a mystery why one of the premier hw/sw companies in the World would not have at least an automated regression test system where each bug ever found is tested for in each release and no release is scheduled until all regression tests are passed.

You are right, exhaustive testing of every possible hw system is impossible and hw caused bugs have to be dealt with ad hoc. But each release should at least conform to their own specs. I've also worked at a lot of different companies both as a salaried employee and as a consultant and the successful companies have at least a token QA dept. and run regression tests automatically. And a release is not scheduled until those tests are passed. It's not that hard to do.

It is a mystery why bugs go away and come back, and why some are simply not fixed for years. I understand how bugs can slip through in newly developed features, but once found they should be automatically added to the regression tests.

[big shrug]

Idunno....

linuxhippy
Offline
Joined: 2004-01-07

> It is indeed a mystery why one of the premier hw/sw
> companies in the World would not have at least an
> automated regression test system where each bug ever
> found is tested for in each release and no release is
> scheduled until all regression tests are passed.
If you had fixed a single Java bug yourself instead of spreading FUD around here, you would know that every single bug fix that is merged in comes along with a regression test - for - you name it - automated regression testing.
So for now there are hundreds of thousands of regression tests, run on almost any build you can download (even the non official ones).

After all, why do you conclude Sun has no QA lab?! Bugs happen dude, well, except for some childish mini applets ;)

- Clemens

demonduck
Offline
Joined: 2008-03-14

Well, I offered my consulting services to both you and SUN for a reasonable rate and was turned down so I guess I'm just not good enough -- not as good as you. And you have a huge financial incentive to fix bugs for SUN so why do you compare what you do to what I offered to do and was not allowed to do?

My applets are just the best I can do --

http://pancyl.com/DarkSnowCreek1.htm (fullscreen F1/ESC -- click drag to pan/tilt -- mousewheel zooms)

not as good as your stuff -- which I would like to look at again. Did you do the Gears applet? Where is your stuff again?

If I sound harsh in my posts -- sometimes you have to hit the mule over the head with a board to get its attention. That's all I'm doing. Prodding the mule to work harder. And I guess what I'm doing is working because of all the extra work you and others have done. So I guess I am helping -- but in a way that looks like I'm not. Some people don't understand that.

linuxhippy
Offline
Joined: 2004-01-07

> And you have a huge financial incentive to fix
> bugs for SUN so why do you compare what you do to
> what I offered to do and was not allowed to do?
I worked on bugfixes before the challenge, if that's what you mean.

> http://pancyl.com/DarkSnowCreek1.htm (fullscreen
> F1/ESC -- click drag to pan/tilt -- mousewheel
> zooms)
Nice and really cool :)

> not as good as your stuff -- which I would like to
> look at again. Did you do the Gears applet?
In all its glory, yeah I know, the Gears applet rocks ;)

> If I sound harsh in my posts -- sometimes you have to
> hit the mule over the head with a board to get its
> attention. That's all I'm doing. Prodding the mule
> to work harder. And I guess what I'm doing is
> working because of all the extra work you and others
> have done. So I guess I am helping -- but in a way
> that looks like I'm not. Some people don't
> understand that.
I get your point. The question is whether you do more good than harm?!
E.g. in this post you spread wrong facts that SUN doesn't do regression testing and has no QA lab (they have a huge one) - why and what for?
Also, I didn't look into the server-flag-ignore-all-command-line-args issue because you shouted that loud, but because I thought this was an ugly thing. After all ... nobody has had a look at my fix for now ;)

- Clemens

frajt
Offline
Joined: 2005-01-05

This is a known JVM bug for us. Your system ran out of virtual memory. Check your available memory and possibly add more swap space. Some applications report "Cannot allocate memory!"; others (like the JVM) crash with SIGBUS.

We have the very same problem. It was reported to SUN along with all the other problems we had with >JDK1.6.0_04. It was the only issue SUN declined to fix. Their answer was that there is not much to be done when the OS is running out of memory, and that it only concerns the way the JVM exits. They simply consider it proper to fail with the SIGBUS signal instead of running the normal JVM exit method.
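
To watch for this condition from inside the application, something like the following rough sketch could log swap headroom periodically. It assumes a Sun/Oracle/OpenJDK JVM that implements the `com.sun.management.OperatingSystemMXBean` extension; the class name `SwapHeadroom`, the `isLow` helper, and the 10% threshold are made up for illustration and are not taken from any Sun recommendation.

```java
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of the JDK: reports how much swap is left
// so that a low-memory condition can be logged before the OS runs dry.
class SwapHeadroom {

    // Pure helper: true when free swap is below the given fraction of total.
    static boolean isLow(long freeBytes, long totalBytes, double minFreeFraction) {
        if (totalBytes <= 0) {
            return false; // no swap configured or value unavailable; nothing to compare
        }
        return (double) freeBytes / totalBytes < minFreeFraction;
    }

    public static void main(String[] args) {
        java.lang.management.OperatingSystemMXBean os =
                ManagementFactory.getOperatingSystemMXBean();
        // The swap getters live on the com.sun.management extension interface;
        // JVMs from other vendors may not implement it.
        if (os instanceof com.sun.management.OperatingSystemMXBean) {
            com.sun.management.OperatingSystemMXBean sunOs =
                    (com.sun.management.OperatingSystemMXBean) os;
            long free = sunOs.getFreeSwapSpaceSize();
            long total = sunOs.getTotalSwapSpaceSize();
            System.out.println("swap free/total (bytes): " + free + "/" + total);
            if (isLow(free, total, 0.10)) { // 10% threshold is an arbitrary assumption
                System.err.println("WARNING: swap space is nearly exhausted");
            }
        } else {
            System.out.println("swap figures not available on this JVM");
        }
    }
}
```

This only reports the numbers; it does not prevent the crash. Whether a low reading actually precedes the SIGBUS on a given box would have to be checked against the timestamps in the hs_err_pid*.log files.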

Michal

jjburke
Offline
Joined: 2004-03-16

I dream too much, but it would be nice if we would try to provide simple code examples of the errors that we find. I know it's a fair amount of extra work to distill working code down to an example, but this would do tremendous good.

1) The code example would serve as a decent test case for a Sun fix.
2) Saves time for Sun code engineers to build their own test case. We are more familiar with the error at that point than Sun is.
3) New Java releases would come out faster.
4) The rest of us can take a look at the test case. Maybe we have seen and conquered that error. Maybe we could come up with fix or work-around suggestions.
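
As a starting point for such a distilled example, here is one possible skeleton. It is only a sketch: the class name `CrashRepro` and the empty `scenario()` method are placeholders, and the actual failing calls still have to be filled in by whoever hits the bug. What it does show is how to print the environment details (JDK version, OS, CPU count, heap limit) that a test case should carry with it.

```java
// Hypothetical bug-report skeleton; the class and method names are illustrative.
class CrashRepro {

    // Collects the environment details a bug report usually needs.
    static String environmentReport() {
        Runtime rt = Runtime.getRuntime();
        String[] keys = {"java.version", "java.vm.name", "os.name", "os.version", "os.arch"};
        StringBuilder sb = new StringBuilder();
        for (String key : keys) {
            sb.append(key).append('=').append(System.getProperty(key)).append('\n');
        }
        sb.append("processors=").append(rt.availableProcessors()).append('\n');
        sb.append("maxHeapBytes=").append(rt.maxMemory()).append('\n');
        return sb.toString();
    }

    // Placeholder: the smallest sequence of calls that still triggers
    // the problem goes here.
    static void scenario() {
    }

    public static void main(String[] args) {
        System.out.print(environmentReport());
        scenario();
        System.out.println("scenario completed without error");
    }
}
```

Keeping the whole thing in one file with no dependencies means anyone on the forum can compile and run it against their own JDK build.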

Thanks,
Jim

demonduck
Offline
Joined: 2008-03-14

Simply put, SUN needs to do better Quality Assurance and more rigorous testing to reduce the need for the greater Java community to be in continual beta test mode.

Fewer, better-quality releases are better engineering. The kinds of bugs reported in these forums shouldn't be in the releases. They should be caught by Quality Assurance.

If and when Oracle or somebody buys SUN, maybe a more rigorous engineering standard will be imposed.