Topic: SIGSEGV: Segmentation fault - invalid memory reference

I am baffled by some behavior I see from my code. I get the above system message inside a subroutine that I thought was benign and that worked fine (and still does) in my 32-bit executable. This subroutine does little other than check the status of data directories, create directories, and copy files as needed. Even more confusing, the debugger steps right through this code and it works as it should. Here is the full output that I get, followed by the relevant portion of the code. Note that the output string indicates it is about to create the directory, but it does not complete.

  Enter Subroutine pdata
   132000000           0       66000          66           0           0
Path Name Needed pos\00\66
Creating Lower directory pos\00\66                       mkdir pos\00\66


Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
        Backtrace is unavailable.  Please use
        the Simply Fortran debugger for similar
        functionality


      subroutine pdata(icn,con,Iter8)

       dimension icn(50)

       character*3 path
       character*11 fullpath     
       character*30 fname
       character*8 numb1
       character*2 numb2
       character*2 numb3
       character*3 numb3L
       character*100 ShellCommand
       logical*4 exists
       integer*4 k1,k2,k3
       integer*8 Iter8, iprod8
       Write(6,*)'Enter Subroutine pdata'

       path='pos'
!
       iprod8=icn(9)*icn(11)
       k1=int(iter8/icn(9)) 
       k2=mod(int(iter8/iprod8),100)
       k3=int(iter8/(iprod8*100))
       
       write(6,*)icn(1),icn(19),k1,k2,k3,icn(45)

       if(k3.lt.100)then
        write(numb3,'(i2.2)')k3
       else
        write(numb3L,'(i3.3)')k3
       endif
       write(numb2,'(i2.2)')k2
       write(numb1,'(i8.8)')k1
       fullpath='pos' // '\' // numb3 // '\' // numb2
       if(k3.ge.100)then
        fullpath='pos' // '\' // numb3L // '\' // numb2
       endif
       Write(6,*)'Path Name Needed ', fullpath

       if(icn(45).ne.k3)then
        call sdata(icn,con,sm,sz,x,y,z,vx,vy,vz,dvx,dvy,dvz)
        Write(6,*)'Need new directories'
         if(icn(45).eq.-1)then
! First Time Creating directory structure and file copies 
           Write(6,*)'First time in this subroutine, need to move files'
           inquire(file=path,exist=exists)
           if(.not.exists)then             
             ShellCommand='mkdir ' // path
             write(6,*)'Creating ' // path // ' directory', ShellCommand
             iresult=system(ShellCommand)
             if(iresult.ne.0)write(6,*)'**** Create directory FAILED'
           endif
!     Copy Riod2.ini to POS
           ShellCommand='copy riod2.ini ' // path
           write(6,*)ShellCommand
            iresult=system(ShellCommand)
            write(6,*)ShellCommand
            if(iresult.ne.0)write(6,*)'**** Copy Riod2.ini FAILED'
!     Copy Creation.txt to POS   
            ShellCommand='copy Creation.txt ' // path
            iresult=system(ShellCommand)
            if(iresult.ne.0)write(6,*)'**** Copy Creation.txt FAILED'
            write(6,*)'Copied Riod2.ini and Creation.txt to pos'
! End First time creating directories
         endif
         
!     Check to see if top directory POS directory exists
         fname= path // '\' // numb3
         inquire(file=fname,exist=exists)
         if(.not.exists)then
           Write(6,*)'Need top directory'
           ShellCommand='mkdir ' // fname
           write(6,*)'Creating ' // fname // ' directory ', ShellCommand
           iresult=system(ShellCommand)
           if(iresult.ne.0)write(6,*)'**** Directory create FAILED'
         else
          Write(6,*)'Top Directory Exists'
         endif
         icn(45)=k3
         Write(6,*)'icn(45)= ',icn(45)
       endif
!   Check to see if we are at a new sublevel POS directory
       if(icn(46).ne.k2)then
!     Need new bottom POS directory and then do other stuff
         fname= path // '\' // numb3 // '\' // numb2
         if(k3.ge.100)then
          fname= path // '\' // numb3L // '\' // numb2
         endif
         inquire(file=fname,exist=exists)
         if(.not.exists)then
           ShellCommand='mkdir ' // fname
           write(6,*)'Creating Lower directory ',fname,'  ',ShellCommand 
           iresult=system(ShellCommand)
           if(iresult.ne.0)write(6,*)'**** Directory create FAILED'
         else
          Write(6,*)'Lower directory exists'
         endif
         Write(6,*)'Check for DatFiles directory'
!  Check if DatFiles Directory exists
         inquire(file='pos\DatFiles',exist=exists)
         if(.not.exists)then
           ShellCommand='mkdir ' // 'pos\DatFiles'
           write(6,*)'Creating DatFiles directory ', ShellCommand
           iresult=system(ShellCommand)
          if(iresult.ne.0)write(6,*)'**** Datfiles Directory create FAILED'
         else
          Write(6,*)'DatFiles Directory exists'
         endif
!
..... More of the same follows

Here is the makefile:

#
# Automagically generated by Approximatrix Simply Fortran 2.41
#
FC="C:\Program Files (x86)\Simply Fortran 2\mingw-w64\bin\gfortran.exe"
CC="C:\Program Files (x86)\Simply Fortran 2\mingw-w64\bin\gcc.exe"
AR="C:\Program Files (x86)\Simply Fortran 2\mingw-w64\bin\ar.exe"
WRC="C:\Program Files (x86)\Simply Fortran 2\mingw-w64\bin\windres.exe"
RM=rm -f


OPTFLAGS= -g -mtune=broadwell

SPECIALFLAGS=$(IDIR)

RCFLAGS=-O coff

PRJ_FFLAGS= -fopenmp

PRJ_CFLAGS=

PRJ_LFLAGS=-Wl,--stack,1500000000 -lgomp

FFLAGS=$(SPECIALFLAGS) $(OPTFLAGS) $(PRJ_FFLAGS) -Jmodules

CFLAGS=$(SPECIALFLAGS) $(OPTFLAGS) $(PRJ_CFLAGS)

"build\riod.o": ".\riod.f90"
    @echo Compiling .\riod.f90
    @$(FC) -c -o "build\riod.o" $(FFLAGS) ".\riod.f90"

clean: .SYMBOLIC
    @echo Deleting build\riod.o and related files
    @$(RM) "build\riod.o"
    @echo Deleting default icon resource
    @$(RM) "build\sf_default_resource.res"
    @echo Deleting riod.exe
    @$(RM) "riod.exe"

"riod.exe":  "build\riod.o" "build\Riod-MP-F90.prj.target"
    @echo Generating riod.exe
    @$(FC) -o "riod.exe" -static -fopenmp "build\riod.o" $(LDIR) $(PRJ_LFLAGS)

all: "riod.exe" .SYMBOLIC
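
A side note on the path handling in pdata above: the lower-level path is built twice (once for fullpath, once for fname), and the top-level check on the line `fname= path // '\' // numb3` uses numb3 even when k3 is 100 or more, where only numb3L has been written. A minimal sketch of building the names in one place follows. The module name `pos_paths` and the functions `top_dir` and `lower_dir` are hypothetical, not part of the original program, and the sketch assumes 0 <= k3 <= 999 and 0 <= k2 <= 99 (outside that range the edit descriptors produce asterisks).

```fortran
module pos_paths
  implicit none
contains

  ! Zero-padded directory number: two digits below 100, three digits otherwise.
  function top_number(k3) result(num)
    integer, intent(in) :: k3
    character(len=:), allocatable :: num
    character(len=8) :: buf

    if (k3 < 100) then
      write(buf, '(i2.2)') k3
    else
      write(buf, '(i3.3)') k3
    end if
    num = trim(buf)
  end function top_number

  ! Top-level directory, e.g. 'pos\00' or 'pos\123'.
  function top_dir(k3) result(dir)
    integer, intent(in) :: k3
    character(len=:), allocatable :: dir

    dir = 'pos' // '\' // top_number(k3)
  end function top_dir

  ! Lower-level directory, e.g. 'pos\00\66' or 'pos\123\05'.
  function lower_dir(k3, k2) result(dir)
    integer, intent(in) :: k3, k2
    character(len=:), allocatable :: dir
    character(len=8) :: buf

    write(buf, '(i2.2)') k2
    dir = top_dir(k3) // '\' // trim(buf)
  end function lower_dir

end module pos_paths
```

With `use pos_paths` in pdata, both `fullpath` and `fname` could then be assigned from `lower_dir(k3, k2)`, and the top-level check could use `top_dir(k3)`, so the two- and three-digit cases cannot drift apart.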

2 (edited by grogley 2018-10-20 13:04:43)

Re: SIGSEGV: Segmentation fault - invalid memory reference

More on this: after further testing I discovered that I get the memory reference fault only when using 12 threads. With any fewer (1-11 threads) it works fine. Note that OpenMP is used in this program, but this subroutine call is in a single-threaded segment of the code.

Also, if I make sure the directory structure is already in place, the next call to a system command within that subroutine (it uses many system calls to copy and move data files around) fails with the same memory fault. I also just discovered that when the system call happens, it opens a CMD.EXE process (as expected), but that process is orphaned after the crash. I had many of these showing in the Task Manager from my testing. I have closed all of them and retested; it still crashes with 12 threads but not with 11.

I have been running this code on other systems with fewer processors/threads and it has not crashed after many days and thus many entries into this subroutine.

Any suggestions (other than run with fewer than 12 threads, LOL!) would be helpful. Thanks.

Rod
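
One way to narrow down which shell call fails might be to route the commands through the standard Fortran 2008 intrinsic `execute_command_line` instead of the `system` extension, since it reports separately whether the command could be launched (`cmdstat`) and what it returned (`exitstat`). The following is only a sketch under that assumption; the module name `shell_util` and the function `run_command` are hypothetical and not part of the original program.

```fortran
module shell_util
  implicit none
contains

  ! Run a shell command and wait for it to finish.
  ! Returns .true. only if the command could be started and exited with status 0.
  logical function run_command(cmd) result(ok)
    character(len=*), intent(in) :: cmd
    integer :: exit_code, cmd_status
    character(len=256) :: msg

    exit_code = 0
    cmd_status = 0
    msg = ''

    ! trim() drops the blank padding carried by fixed-length strings
    ! such as a character*100 command buffer.
    call execute_command_line(trim(cmd), wait=.true., exitstat=exit_code, &
                              cmdstat=cmd_status, cmdmsg=msg)

    ok = (cmd_status == 0 .and. exit_code == 0)

    if (cmd_status /= 0) then
      ! The command could not be executed at all.
      write(*,*) '**** Could not run: ', trim(cmd), ' (', trim(msg), ')'
    else if (exit_code /= 0) then
      ! The command ran but reported failure.
      write(*,*) '**** Exit code ', exit_code, ' from: ', trim(cmd)
    end if
  end function run_command

end module shell_util
```

A call such as `if (.not. run_command('mkdir ' // fname)) write(6,*) '**** Directory create FAILED'` would then replace each `iresult=system(ShellCommand)` pair. This does not by itself explain the fault; it only makes each failure easier to attribute.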

Re: SIGSEGV: Segmentation fault - invalid memory reference

Rod,

First, do you experience this issue if you compile as a 64-bit executable?

Second, with the stack size you've set, you're requesting a stack that is 1.4GB in size.  With the 12-thread overhead, there is a very good chance you're exceeding what 32-bit Windows will allow for memory allocation for a single process (~1.5GB by default).  You might be getting lucky at 11 threads.

Jeff Armstrong
Approximatrix, LLC
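
For reference, the 1.4GB figure follows from the `--stack,1500000000` value in the makefile: 1500000000 bytes divided by 1024^3 is about 1.4 GiB. The sketch below does that arithmetic and also computes the total if every thread were to reserve the same amount. That last part is an assumption for illustration only; whether each OpenMP thread actually reserves the full linker-specified stack depends on the threading runtime. The module and function names are hypothetical.

```fortran
module stack_math
  use, intrinsic :: iso_fortran_env, only: int64, real64
  implicit none
contains

  ! Convert a byte count to GiB (1 GiB = 1024**3 bytes).
  pure function bytes_to_gib(nbytes) result(gib)
    integer(int64), intent(in) :: nbytes
    real(real64) :: gib

    gib = real(nbytes, real64) / 1024.0_real64**3
  end function bytes_to_gib

  ! Total bytes reserved if each of nthreads reserves stack_bytes.
  ! ASSUMPTION: every thread reserves the full amount.
  pure function total_reserved(stack_bytes, nthreads) result(total)
    integer(int64), intent(in) :: stack_bytes
    integer, intent(in) :: nthreads
    integer(int64) :: total

    total = stack_bytes * int(nthreads, int64)
  end function total_reserved

end module stack_math
```

Under that assumption, `bytes_to_gib(total_reserved(1500000000_int64, 12))` comes to roughly 16.8 GiB of reserved address space, against about 1.4 GiB for a single stack.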

Re: SIGSEGV: Segmentation fault - invalid memory reference

Jeff,

Not sure if you saw the Makefile but I think it indicates that this is a 64 bit executable. I have completely abandoned 32 bit development (Finally!).

I really don't need that large a stack allocation anymore, since I am now allocating the largest array dynamically within the program, reducing the memory footprint from gigabytes to under 100 megabytes. I have reduced that allocation number to 100 MB and recompiled. I will test the code with this change. Here is the new makefile:

#
# Automagically generated by Approximatrix Simply Fortran 2.41
#
FC="C:\Program Files (x86)\Simply Fortran 2\mingw-w64\bin\gfortran.exe"
CC="C:\Program Files (x86)\Simply Fortran 2\mingw-w64\bin\gcc.exe"
AR="C:\Program Files (x86)\Simply Fortran 2\mingw-w64\bin\ar.exe"
WRC="C:\Program Files (x86)\Simply Fortran 2\mingw-w64\bin\windres.exe"
RM=rm -f


OPTFLAGS= -O3 -fgraphite-identity -floop-interchange -floop-strip-mine -floop-block -floop-parallelize-all -mtune=broadwell

SPECIALFLAGS=$(IDIR)

RCFLAGS=-O coff

PRJ_FFLAGS= -fopenmp

PRJ_CFLAGS=

PRJ_LFLAGS=-Wl,--stack,100000000 -lgomp

FFLAGS=$(SPECIALFLAGS) $(OPTFLAGS) $(PRJ_FFLAGS) -Jmodules

CFLAGS=$(SPECIALFLAGS) $(OPTFLAGS) $(PRJ_CFLAGS)

"build\riod.o": ".\riod.f90"
    @echo Compiling .\riod.f90
    @$(FC) -c -o "build\riod.o" $(FFLAGS) ".\riod.f90"

clean: .SYMBOLIC
    @echo Deleting build\riod.o and related files
    @$(RM) "build\riod.o"
    @echo Deleting default icon resource
    @$(RM) "build\sf_default_resource.res"
    @echo Deleting riod.exe
    @$(RM) "riod.exe"

"riod.exe":  "build\riod.o" "build\Riod-MP-F90.prj.target"
    @echo Generating riod.exe
    @$(FC) -o "riod.exe" -static -fopenmp "build\riod.o" $(LDIR) $(PRJ_LFLAGS)

all: "riod.exe" .SYMBOLIC

Re: SIGSEGV: Segmentation fault - invalid memory reference

Rod,

Sorry, I think your first email mentioned 32-bit, but the Makefile was clearly 64-bit.  I wasn't careful looking at it.

It works in the debugger, but not as a standalone application?  I'm trying to understand the issue fully.  Those types of bugs are exceptionally frustrating.  I would also try enabling runtime diagnostics.  Perhaps it will catch the issue without the debugger.

Jeff Armstrong
Approximatrix, LLC

Re: SIGSEGV: Segmentation fault - invalid memory reference

Jeff,

No problem, I did mention 32 bit code and I should have been more specific in my original discussion.

I am testing with the changes you suggested in your first note in this thread. The code has now entered and exited that subroutine normally twice after reducing the memory request to 100 MB at compile time. Because the code calls this subroutine only rarely, and it does different file movements on different intervals, I won't know if this has fixed the problem for a day or two.

I guess what concerns me is that I don't understand (recall I am not too bright) why the memory allocation set at compile time would affect what happens in this particular subroutine, as it is single threaded and the large arrays have been deallocated. This suggests to me there are other issues with my code or somewhere else in the chain.

My current development platform has a maximum of 12 threads but I will hopefully get access to machines with even more threads and the code will need to work on those too.

Re: SIGSEGV: Segmentation fault - invalid memory reference

Jeff,

Following up on this thread to close it out. It appears that my code is no longer crashing with that memory fault. The code has been running for several days without issue (I hope; I had a nasty system crash on Tuesday, but I think it is unrelated to my program. I tend to do a lot of things on my system while my simulation runs in the background using 100% of the CPU cycles).

Thanks for the help.
Rod