Potential error

manalog

Joined: 7 Dec 20
Posts: 8
Credit: 210,342
RAC: 0
Message 2057 - Posted: 18 Apr 2023, 12:31:49 UTC
Last modified: 18 Apr 2023, 12:38:36 UTC

Hello,
I'd like to report an error that I saw in the stderr output of a workunit:
<core_client_version>7.18.1</core_client_version>
<![CDATA[
<stderr_txt>
20:16:32 (10581): wrapper (7.17.26016): starting
20:16:33 (10581): wrapper (7.17.26016): starting
20:16:33 (10581): wrapper: running cmdock (-c -j 1 -b 1 -r target.prm -p "/var/lib/boinc-client/slots/15/data/scripts/dock.prm" -f htvs.ptc -i ligands.sdf -o docking_out)
SIGSEGV: segmentation violation
Stack trace (14 frames):
../../projects/www.sidock.si_sidock/cmdock-l_wrapper_2.02_x86_64-pc-linux-gnu(+0x3ff8c)[0x559934e3ff8c]
/lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7f6f4bddf520]
/lib/x86_64-linux-gnu/libc.so.6(free+0x1e)[0x7f6f4be4247e]
../../projects/www.sidock.si_sidock/cmdock-l_wrapper_2.02_x86_64-pc-linux-gnu(+0x5c3dd)[0x559934e5c3dd]
../../projects/www.sidock.si_sidock/cmdock-l_wrapper_2.02_x86_64-pc-linux-gnu(+0x7b8e4)[0x559934e7b8e4]
/lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7f6f4bddf520]
/lib/x86_64-linux-gnu/libc.so.6(clock_nanosleep+0xc8)[0x7f6f4be82868]
/lib/x86_64-linux-gnu/libc.so.6(__nanosleep+0x17)[0x7f6f4be876e7]
/lib/x86_64-linux-gnu/libc.so.6(usleep+0x4f)[0x7f6f4beb90df]
../../projects/www.sidock.si_sidock/cmdock-l_wrapper_2.02_x86_64-pc-linux-gnu(+0x51adf)[0x559934e51adf]
../../projects/www.sidock.si_sidock/cmdock-l_wrapper_2.02_x86_64-pc-linux-gnu(+0x1e784)[0x559934e1e784]
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90)[0x7f6f4bdc6d90]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80)[0x7f6f4bdc6e40]
../../projects/www.sidock.si_sidock/cmdock-l_wrapper_2.02_x86_64-pc-linux-gnu(+0x1aa3a)[0x559934e1aa3a]

Exiting...
12:57:29 (237073): wrapper (7.17.26016): starting
12:57:29 (237073): wrapper (7.17.26016): starting
12:57:29 (237073): wrapper: running cmdock (-c -j 1 -b 1 -r target.prm -p "/var/lib/boinc-client/slots/15/data/scripts/dock.prm" -f htvs.ptc -i ligands.sdf -o docking_out)
SIGSEGV: segmentation violation
Stack trace (14 frames):
../../projects/www.sidock.si_sidock/cmdock-l_wrapper_2.02_x86_64-pc-linux-gnu(+0x3ff8c)[0x55f6dce3ff8c]
/lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7f2d1bb6c520]
/lib/x86_64-linux-gnu/libc.so.6(free+0x1e)[0x7f2d1bbcf47e]
../../projects/www.sidock.si_sidock/cmdock-l_wrapper_2.02_x86_64-pc-linux-gnu(+0x5c3dd)[0x55f6dce5c3dd]
../../projects/www.sidock.si_sidock/cmdock-l_wrapper_2.02_x86_64-pc-linux-gnu(+0x7b8e4)[0x55f6dce7b8e4]
/lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7f2d1bb6c520]
/lib/x86_64-linux-gnu/libc.so.6(clock_nanosleep+0xc8)[0x7f2d1bc0f868]
/lib/x86_64-linux-gnu/libc.so.6(__nanosleep+0x17)[0x7f2d1bc146e7]
/lib/x86_64-linux-gnu/libc.so.6(usleep+0x4f)[0x7f2d1bc460df]
../../projects/www.sidock.si_sidock/cmdock-l_wrapper_2.02_x86_64-pc-linux-gnu(+0x51adf)[0x55f6dce51adf]
../../projects/www.sidock.si_sidock/cmdock-l_wrapper_2.02_x86_64-pc-linux-gnu(+0x1e784)[0x55f6dce1e784]
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90)[0x7f2d1bb53d90]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80)[0x7f2d1bb53e40]
../../projects/www.sidock.si_sidock/cmdock-l_wrapper_2.02_x86_64-pc-linux-gnu(+0x1aa3a)[0x55f6dce1aa3a]

Exiting...
00:57:46 (15865): wrapper (7.17.26016): starting
00:57:46 (15865): wrapper (7.17.26016): starting
00:57:46 (15865): wrapper: running cmdock (-c -j 1 -b 1 -r target.prm -p "/var/lib/boinc-client/slots/15/data/scripts/dock.prm" -f htvs.ptc -i ligands.sdf -o docking_out)
13:46:42 (15865): cmdock exited; CPU time 44584.810259
13:46:42 (15865): called boinc_finish(0)

</stderr_txt>
]]>

The workunit was validated and credit was granted, but I still doubt whether it is doing useful science or just producing damaged output. Since the error could be occurring on other hosts too, I preferred to report it.
This host currently has 7 more workunits in the queue; I will check whether they produce these errors too.
https://www.sidock.si/sidock/results.php?hostid=43371

EDIT: Unfortunately, I checked the stderr of the running processes through the /proc/ folder, and yes, the other running tasks are also producing segmentation violations.
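For anyone who wants to repeat the /proc check, here is a minimal Python sketch of one way to do it on Linux. It assumes the process ID of the running wrapper is already known (for example from ps); the function names and the proc_root parameter are hypothetical, introduced only for this illustration, and reading another user's process may require root privileges.

```python
import os
from pathlib import Path


def stderr_target(pid: int, proc_root: str = "/proc") -> str:
    """Return the path that file descriptor 2 (stderr) of `pid` points to."""
    # On Linux, /proc/<pid>/fd/2 is a symbolic link to whatever the process
    # has open as stderr (a regular file, a pipe, a terminal, ...).
    return os.readlink(Path(proc_root) / str(pid) / "fd" / "2")


def count_segfaults(log_path: str) -> int:
    """Count the lines of a log file that mention SIGSEGV."""
    with open(log_path, errors="replace") as log:
        return sum(1 for line in log if "SIGSEGV" in line)
```

If stderr_target returns a regular file, count_segfaults can be called on it directly; if it returns a pipe or a terminal, there is no file to read and the count cannot be obtained this way.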
manalog

Joined: 7 Dec 20
Posts: 8
Credit: 210,342
RAC: 0
Message 2058 - Posted: 18 Apr 2023, 12:43:59 UTC - in response to Message 2057.  

I quickly checked some hosts from the "Top Computer" statistics page and found a host with specs similar to mine (Ryzen + Linux Mint) that is producing the same output:
https://www.sidock.si/sidock/results.php?hostid=21673
Brian Nixon

Joined: 10 Feb 21
Posts: 21
Credit: 4,683,097
RAC: 6,512
Message 2059 - Posted: 18 Apr 2023, 19:40:57 UTC

TL;DR: I believe this is harmless.

It is the BOINC wrapper that has crashed here, not the science app (CmDock). My best guess for why that did not cause the task to fail with an error is that it happened while the client was asking the task to exit (either at shutdown, or when suspending activity with the preference "Leave non-GPU tasks in memory while suspended" unchecked). If that is the case, this looks like a race condition while the wrapper is exiting (which would be a bug on BOINC’s part, not SiDock’s), but it seems to be benign because the science app had exited before the wrapper crashed. That means that when the client resumed the task later, it was able to restart correctly from the last checkpoint.
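As a rough way to check a stderr log against this pattern, the sketch below splits the log into runs at each "wrapper: running" line and records whether each run reported SIGSEGV and whether it reached "called boinc_finish(0)". The function names are hypothetical and the marker strings are taken only from the log quoted in this thread; this is a heuristic under those assumptions, and it says nothing about whether the docking output itself is correct.

```python
def summarize_runs(stderr_text: str) -> list:
    """Return one dict per wrapper run found in the stderr text."""
    runs = []
    for line in stderr_text.splitlines():
        if "wrapper: running" in line:
            # A new (re)start of the science app begins a new run.
            runs.append({"crashed": False, "finished": False})
        elif not runs:
            # Ignore anything logged before the first run starts.
            continue
        elif "SIGSEGV" in line:
            runs[-1]["crashed"] = True
        elif "called boinc_finish(0)" in line:
            runs[-1]["finished"] = True
    return runs


def last_run_finished_cleanly(stderr_text: str) -> bool:
    """True if the final run called boinc_finish(0) without a SIGSEGV."""
    runs = summarize_runs(stderr_text)
    return bool(runs) and runs[-1]["finished"] and not runs[-1]["crashed"]
```

Applied to the log in the first post, this would report three runs: two that crashed and a final one that finished, which matches the sequence described above of interrupted runs followed by a successful restart.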

©2024 SiDock@home Team