Project 2682 malloc error

Re: Project 2682 malloc error

Postby Tedster » Sat Aug 07, 2010 3:28 am

I'm currently sitting at 33% on one, with no errors. FahCore_a3 is currently using 2.1 GB of RAM.

Re: Project 2682 malloc error

Postby toTOW » Sat Aug 07, 2010 2:16 pm

I think the memory usage might be the issue here ... my machine only has 2 GB ...

Or it's an allocation error on 32-bit systems which can't allocate pages bigger than 2 GB ...

Re: Project 2682 malloc error

Postby PantherX » Sat Aug 07, 2010 2:22 pm

toTOW wrote:...Or it's an allocation error on 32-bit systems which can't allocate pages bigger than 2 GB ...

If that is the case, then the minimum requirement for running bigadv would be a 64-bit OS.

Re: Project 2682 malloc error

Postby tear » Sat Aug 07, 2010 3:03 pm

PantherX wrote:
toTOW wrote:...Or it's an allocation error on 32-bit systems which can't allocate pages bigger than 2 GB ...

If that is the case, then the minimum requirement for running bigadv would be a 64-bit OS.

Not really. The Windows client and Core A3 are 32-bit binaries.

Re: Project 2682 malloc error

Postby Parja » Sat Aug 07, 2010 3:07 pm

toTOW wrote:I think the memory usage might be the issue here ... my machine only has 2 GB ...

Or it's an allocation error on 32-bit systems which can't allocate pages bigger than 2 GB ...


12GB of system memory on Win 7 x64 for me and it's a dedicated Folding system, so I know memory can't be the problem.

I ran Memtest for about 4 hours last night just in case and it didn't come up with any errors.

Re: Project 2682 malloc error

Postby tear » Sat Aug 07, 2010 3:12 pm

toTOW wrote:I think the memory usage might be the issue here ... my machine only has 2 GB ...

18 GB on WS2K8 R2 here. It's a heisenbug that some folks are just lucky not to trip.

toTOW wrote:Or it's an allocation error on 32-bit systems which can't allocate pages bigger than 2 GB ...

You mean 32-bit Windows...

But even so!

See this http://msdn.microsoft.com/en-us/library ... spx?ppud=4 and this http://www.microsoft.com/whdc/system/pl ... aemem.mspx.

FahCore_a3 does not carry the IMAGE_FILE_LARGE_ADDRESS_AWARE flag, so it is subject to the same 2 GB user address space limit on both 32-bit and 64-bit Windows.

Q.E.D.
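
One way to verify a claim like this without special tools, sketched in Python: read the COFF Characteristics field from the PE header and test bit 0x0020 (IMAGE_FILE_LARGE_ADDRESS_AWARE). The helper names are made up for this example, and the offsets assume the published PE/COFF layout; treat it as a rough check, not an official FAH utility.

```python
import struct

# Bit in the COFF "Characteristics" field, as defined in the PE/COFF format.
IMAGE_FILE_LARGE_ADDRESS_AWARE = 0x0020


def pe_characteristics(data: bytes) -> int:
    """Return the COFF Characteristics field of a PE image held in memory."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew (offset 0x3C in the DOS header) points at the "PE\0\0" signature.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # Characteristics is the last 2 bytes of the 20-byte COFF header that
    # follows the 4-byte signature: 4 + 18 = 22 bytes past e_lfanew.
    (characteristics,) = struct.unpack_from("<H", data, e_lfanew + 22)
    return characteristics


def is_large_address_aware(data: bytes) -> bool:
    """True if the image is marked as able to handle addresses above 2 GB."""
    return bool(pe_characteristics(data) & IMAGE_FILE_LARGE_ADDRESS_AWARE)


# Hypothetical usage (the path is only an example):
# with open("FahCore_a3.exe", "rb") as f:
#     print(is_large_address_aware(f.read()))
```

Reading the field this way does not modify the binary, so it stays clear of any "no code modifications" concern.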

Re: Project 2682 malloc error

Postby toTOW » Sat Aug 07, 2010 5:23 pm

Well, that's a bit different from what I experienced ...

I already tried to enable PAE, and the core should be compiled to take advantage of this, but as far as I know, it never made any difference in my tests ...

But maybe I was doing it wrong ...

tear> do you think making it work is as simple as recompiling the core with the IMAGE_FILE_LARGE_ADDRESS_AWARE flag enabled?

Re: Project 2682 malloc error

Postby vladh4x0r » Sat Aug 07, 2010 7:48 pm

Should not need a recompile, just a patch of the PE header:
Code:
 editbin.exe /LARGEADDRESSAWARE FahCore_a3.exe

This would probably run afoul of the "no code modifications" policy of FAH though. Too bad, since it's not modifying any executable code, just the header flag...

Re: Project 2682 malloc error

Postby tear » Sat Aug 07, 2010 9:38 pm

I thought of sticking to 666 for a while longer but... what the hell (so to speak).

toTOW,
The issue in question is not really related to FahCore running out of memory. Is there anything that makes you think it is?
Just before the crash, FahCore memory use is at 300 MB max (plus one thread saturating the CPU). Perhaps you've hit a different issue?

vlad,
Yeah, if FahCore doesn't make any assumptions about the two most significant bits, then that *could* work. But again, I'm pretty confident memory shortage is not the issue (I'll try it anyway in a while if I can still get the offending unit).


tear

Re: Project 2682 malloc error

Postby toTOW » Sat Aug 07, 2010 10:54 pm

When I had some issues with memory assignments, I saw the FahCore process allocating more and more memory, up to something close to 2 GB (more likely something like 19xx MB), until it died with an error.

But you're right: in those days, it triggered a Windows error ("the address at blahblah cannot be read" (or written, not sure)).

I think the error is different, but anyway, it wouldn't hurt to make the SMP core able to use more than 2 GB of memory, since it could become an issue again on future projects.

But I think tear is right, and the issue here is different ... I have a 64-bit machine (Windows 7 64-bit, 12 GB of RAM) stuck on a p2682 (a different unit than before), so the problem with this project is not memory (or 32- vs 64-bit) related ... I'll report the error when I get a chance to access this machine.

Re: Project 2682 malloc error

Postby tear » Sat Aug 07, 2010 11:07 pm

An interesting experiment would be:
1) Identifying the RTLs (run-time libraries) that core A3 requires
2) Copying them from an unaffected system to an affected system and seeing what happens

tear

Re: Project 2682 malloc error

Postby tear » Sun Aug 08, 2010 12:15 am

vladh4x0r wrote:Should not need a recompile, just a patch of the PE header:
Code:
 editbin.exe /LARGEADDRESSAWARE FahCore_a3.exe

This would probably run afoul of the "no code modifications" policy of FAH though. Too bad, since it's not modifying any executable code, just the header flag...


That's actually my bad. I was looking at the wrong spot -- flag's already there.

Anyway, back to the subject.
I'm not a Windows hacker, so additional insight is very welcome.

I checked the RTLs with Dependency Walker and there are gobs of them, so... I'd like to take it easy and...

zero2dash --
What OS are you running on the machine that's not exhibiting issues? Any service packs?
Can you please make available (or send over e-mail -- tear@braxis.org) your C:\WINDOWS\SYSWOW64\MSVCRT.DLL* (replace C:\WINDOWS\ as appropriate) ?


I'm also thinking of checking debug versions of some DLLs...


Any other ideas?

Re: Project 2682 malloc error

Postby vladh4x0r » Sun Aug 08, 2010 12:33 am

Well, we've had an issue on OS X and on WinXP 32-bit at the start of the thread, while Win7 x64 had no issue. I don't know whether the OS X client binary is 32-bit or 64-bit, or what the process address space for 32-bit processes looks like on OS X.

Given that the Large Address Aware flag is already set on the Windows client binary, and XP32 crashed while Win7 x64 worked, how were we able to rule out an out-of-address-space condition already? IMHO the way to tell would be to boot XP32 with the /3GB flag and see if it passes (or alternatively, remove the LAA flag from the binary and see if it fails).

Since DLLs load into the process address space, perhaps it's crashing in the RTL (BTW, as a hardware guy I'm used to this acronym expanding to something completely different!) during decompression. It doesn't have to use 2 GB of RAM, only to attempt to allocate 2 GB of VM.
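
The address-space argument above can be summed up in a few lines of Python. The numbers are the standard user-mode limits for a 32-bit process on Windows; the function name and parameters are invented for this sketch and do not correspond to any real API.

```python
def user_address_space_gib(os_bits: int, large_address_aware: bool,
                           boot_3gb: bool = False) -> int:
    """User-mode virtual address space (in GiB) for a 32-bit Windows process.

    os_bits             -- 32 or 64, the bitness of the Windows install
    large_address_aware -- whether the EXE carries the LAA header flag
    boot_3gb            -- whether 32-bit Windows was booted with /3GB
                           (ignored on 64-bit Windows)
    """
    if os_bits not in (32, 64):
        raise ValueError("os_bits must be 32 or 64")
    if not large_address_aware:
        return 2          # same limit on 32-bit and 64-bit Windows
    if os_bits == 64:
        return 4          # LAA 32-bit process on 64-bit Windows
    return 3 if boot_3gb else 2


# The cases discussed in the thread, with the LAA flag set:
for label, args in [("XP32", (32, True)),
                    ("XP32 /3GB", (32, True, True)),
                    ("Win7 x64", (64, True))]:
    print(label, user_address_space_gib(*args), "GiB")
```

With the flag set, XP32 without /3GB still gets 2 GiB while Win7 x64 gets 4 GiB, which is why the /3GB boot (or clearing the flag on x64) would separate an address-space problem from something else.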

Re: Project 2682 malloc error

Postby Grandpa_01 » Sun Aug 08, 2010 12:33 am

The WU is currently at 52% with no problems so far.

@tear I don't know if you need it or not, but I sent you the MSVCR.DLL from my machine.

Re: Project 2682 malloc error

Postby tear » Sun Aug 08, 2010 12:56 am

vladh4x0r wrote:Well we've had an issue on OS X and on WinXP 32-bit at the start of the thread, while Win7 x64 had no issue.

Parja's system and mine are 64-bit.

vladh4x0r wrote:I don't know whether the OS X client binary is 32-bit or 64-bit, and what the process address space for 32-bit processes looks like on OS X.

Neither do I -- that would be an additional data point.

vladh4x0r wrote:Given that Large Address Aware flag is already set on the Windows client binary, and XP32 crashed while Win7x64 worked, how were we able to rule out out-of-address-space condition already?

zero2dash's systems (per his hardware profile) are 64-bit; I'm not sure about others'. Who else has it working, and on what system? Grandpa?

vladh4x0r wrote:IMHO the way to tell would be to boot XP32 with /3GB flag and see if it passes (or alternatively, remove the LAA flag from the binary, and see if it fails).

That won't hurt, I suppose. Though see my first point.

vladh4x0r wrote:Since DLLs load into the process address space, perhaps it's crashing in RTL (BTW as a hardware guy I'm used to this acronym expanding to something completely different!) during decompression.

And for a compiler guy RTL expands to something yet different; aren't TLAs fun?

I don't know enough about Windows to draw solid conclusions. The crash error codes look like they come from the NT kernel (0xCxxxxxxx). But do they really?

Code:
Problem signature:
  Problem Event Name:   APPCRASH
  Application Name:   FahCore_a3.exe
  Application Version:   0.0.0.0
  Application Timestamp:   4c192f33
  Fault Module Name:   ntdll.dll
  Fault Module Version:   6.1.7600.16385
  Fault Module Timestamp:   4a5bdb3b
  Exception Code:   c0000029
  Exception Offset:   00090526
  OS Version:   6.1.7600.2.0.0.274.10
  Locale ID:   1033
  Additional Information 1:   0a9e
  Additional Information 2:   0a9e372d3b4ad19135b953a78882e789
  Additional Information 3:   0a9e
  Additional Information 4:   0a9e372d3b4ad19135b953a78882e789

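
As a rough aid for reading the Exception Code above, here is a Python sketch that splits a 32-bit value into the fields of the NTSTATUS layout (severity, customer bit, facility, code). The function name is illustrative only, and the decoding describes the bit layout; it says nothing about which component actually raised the error.

```python
SEVERITY_NAMES = {0: "success", 1: "informational", 2: "warning", 3: "error"}


def decode_ntstatus(value: int) -> dict:
    """Split a 32-bit NTSTATUS value into its documented bit fields."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("NTSTATUS must fit in 32 bits")
    return {
        "severity": SEVERITY_NAMES[(value >> 30) & 0x3],   # bits 31-30
        "customer_defined": bool((value >> 29) & 0x1),      # bit 29
        "facility": (value >> 16) & 0xFFF,                  # bits 27-16
        "code": value & 0xFFFF,                             # bits 15-0
    }


# Exception Code field from the crash report, given there as hex text.
print(decode_ntstatus(int("c0000029", 16)))
```

For c0000029 this yields severity "error", a non-customer code, facility 0 and code 0x29 (listed in ntstatus.h as STATUS_INVALID_UNWIND_TARGET). Any 0xCxxxxxxx value decodes to severity "error" with the customer bit clear, and user-mode code such as ntdll.dll can raise these statuses too, so the prefix alone does not show that the kernel was the source.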


vladh4x0r wrote:It doesn't have to use 2GB of RAM, only to attempt to allocate 2GB of VM.

True.


tear
