Peter Rival wrote:
>
> W Bauske wrote:
>
> > Jay Estabrook wrote:
> > >
> > > That's important - that you see them primarily during boot, rather
> > > than often during normal operation. That's indicative of boot tasks
> > > that simply take a long time and happen to need to hold a lock during
> > > that time. I'd be much more concerned if these happened frequently
> > > during normal operation.
> > >
> >
> > They happen a lot on my UP2Ks in production.
> >
> > socket.c:43 spinlock grabbed in pvmd3 at fffffc00003c4368(0) 2032 ticks
> > select.c:43 spinlock stuck in pvmd3 at fffffc000035f630(0) owner zm32mig_s_pvm
> > at fffffc000034fb78(1) read_write.c:43
> > select.c:43 spinlock grabbed in pvmd3 at fffffc000035f630(0) 2033 ticks
> > select.c:43 spinlock stuck in pvmd3 at fffffc000035f630(0) owner zm32mig_s_pvm
> > at fffffc000034fb78(1) read_write.c:43
> > select.c:43 spinlock grabbed in pvmd3 at fffffc000035f630(0) 1949 ticks
> > socket.c:43 spinlock stuck in pvmd3 at fffffc00003c4368(1) owner zm32mig_s_pvm
> > at fffffc000034fb78(0) read_write.c:43
> > socket.c:43 spinlock grabbed in pvmd3 at fffffc00003c4368(1) 2024 ticks
> > sched.c:30 spinlock stuck in pvmd3 at fffffc000032cea4(1) owner zm32mig_s_pvm at
> > fffffc000034fb78(0) read_write.c:43
> > sched.c:30 spinlock grabbed in pvmd3 at fffffc000032cea4(1) 1943 ticks
> > socket.c:43 spinlock stuck in pvmd3 at fffffc00003c4368(1) owner zm32mig_s_pvm
> > at fffffc000034fb78(0) read_write.c:43
> > socket.c:43 spinlock grabbed in pvmd3 at fffffc00003c4368(1) 2062 ticks
> > socket.c:43 spinlock stuck in pvmd3 at fffffc00003c4368(0) owner zm32mig_s_pvm
> > at fffffc000034fb78(1) read_write.c:43
> > socket.c:43 spinlock grabbed in pvmd3 at fffffc00003c4368(0) 2267 ticks
> >
>
> Which release was this? I'm trying to collect a hit list of locks to go take a
> look at, and at first glance, it appears this is another lock_kernel() call.
> The interesting thing is, it's being held in what appears to be llseek(), and
> tried for in something I can't find (socket.c:43 is still in the comments, at
> least in 2.2, as is select.c:43...). Seeing as this file:line# pair is generated
> from the gcc __FILE__ and __LINE__ macros, I'm not sure what's wrong there - Jay?
>
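As a rough illustration of the "hit list" idea above, here is a minimal Python sketch that tallies the debug messages by the site that was waiting and the site that held the lock. The regular expressions are an assumption inferred only from the messages quoted in this thread (with wrapped lines rejoined), not from a documented kernel format, and the names `STUCK`, `GRABBED`, and `tally` are made up for the example.

```python
import re
from collections import Counter, defaultdict

# Patterns inferred from the messages quoted in this thread (assumption,
# not a documented format). The number in parentheses is matched but ignored.
STUCK = re.compile(
    r"(?P<site>\S+:\d+) spinlock stuck in (?P<task>\S+) at [0-9a-f]+\(\d+\) "
    r"owner (?P<owner>\S+) at [0-9a-f]+\(\d+\) (?P<owner_site>\S+:\d+)"
)
GRABBED = re.compile(
    r"(?P<site>\S+:\d+) spinlock grabbed in (?P<task>\S+) at [0-9a-f]+\(\d+\) "
    r"(?P<ticks>\d+) ticks"
)

def tally(lines):
    """Count 'stuck' reports per (waiting site, owner site) and collect
    the tick counts reported by 'grabbed' messages per waiting site."""
    stuck = Counter()
    ticks = defaultdict(list)
    for line in lines:
        m = STUCK.search(line)
        if m:
            stuck[(m["site"], m["owner_site"])] += 1
            continue
        m = GRABBED.search(line)
        if m:
            ticks[m["site"]].append(int(m["ticks"]))
    return stuck, ticks

# A few of the messages from this thread, with wrapped lines rejoined.
SAMPLE = [
    "socket.c:43 spinlock grabbed in pvmd3 at fffffc00003c4368(0) 2032 ticks",
    "select.c:43 spinlock stuck in pvmd3 at fffffc000035f630(0) owner zm32mig_s_pvm at fffffc000034fb78(1) read_write.c:43",
    "select.c:43 spinlock grabbed in pvmd3 at fffffc000035f630(0) 2033 ticks",
    "sched.c:30 spinlock stuck in pvmd3 at fffffc000032cea4(1) owner zm32mig_s_pvm at fffffc000034fb78(0) read_write.c:43",
    "sched.c:30 spinlock grabbed in pvmd3 at fffffc000032cea4(1) 1943 ticks",
]

if __name__ == "__main__":
    stuck, ticks = tally(SAMPLE)
    for (site, owner_site), count in stuck.most_common():
        print(f"{site} waited on lock held at {owner_site}: {count}")
    for site, values in sorted(ticks.items()):
        print(f"{site}: max wait {max(values)} ticks over {len(values)} grabs")
```

Fed a full log, the `stuck` counter would show which owner sites turn up most often, which is one way to decide where to look first.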
This is kernel 2.2.15pre17 with a couple of minor patches, nfsv3, and
IDE DMA patches. llseek makes no sense; pvmd shouldn't write to disk.
It is more likely socket calls only, both local and remote. I believe
PVM uses a UNIX-domain socket to talk to local processes and an INET
socket to talk to the world.

If Greg can reproduce similar messages, then I'll let him work with you
all to sort it out. My point in posting was just to confirm that this is
not a boot-time-only problem.

> BTW, it appears that this is happening when the pvmd3 process is reading
> (apparently large amounts of) data from a socket and writing it to the file, or
> perhaps the inverse. This could _perhaps_ be improved by writing the data out
> faster, rather than batching it up in cache. It's only a guess, but it might help
> if your disk is fast enough. *shrug*
>
> - Pete (still muttering about the !@#$*$@#^ kernel lock...)

Large is relative. I'm sending around 16MB of data every 10-50
secs. Total input dataset is around 300-400GB for this run. Each
system has a fairly fast disk; hdparm shows them at 15+MB/sec.
There is practically no disk I/O in this code on the worker nodes
except at the end, when they save their results.

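For scale, a back-of-the-envelope calculation using only the figures above (16MB per transfer, one transfer every 10-50 seconds, 15MB/sec as the hdparm floor). This is a sketch of the arithmetic, not a measurement; the function name is made up for the example.

```python
def rate_mb_per_s(megabytes, seconds):
    """Average transfer rate for one batch of the given size and interval."""
    return megabytes / seconds

BATCH_MB = 16.0          # "around 16MB of data"
DISK_MB_PER_S = 15.0     # hdparm figure quoted above, taken as a lower bound

slow = rate_mb_per_s(BATCH_MB, 50)   # one batch every 50 seconds
fast = rate_mb_per_s(BATCH_MB, 10)   # one batch every 10 seconds

if __name__ == "__main__":
    print(f"data rate: {slow:.2f} to {fast:.2f} MB/sec")
    print(f"share of disk bandwidth: {slow / DISK_MB_PER_S:.1%} to {fast / DISK_MB_PER_S:.1%}")
```

Even if all of that data were written to disk, which it is not on the worker nodes, it would use only a small fraction of what the disks can sustain.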
Wes
This archive was generated by hypermail version 2a22 on Sat Jul 1 05:31:30 2000 PDT