Posts by marmot

1) Message boards : News : СmDock "long" and "short" tasks applications (Message 1932)
Posted 23 Jan 2023 by marmot
Post:
SiDock has a beta test server that one of my machines is still waiting on tasks from.

The first SiDock@home server (sidocktest) was frozen after credit was moved to the main project and is now stopped. Before deployment, this application was tested for several days and no problems were found. For example: when one short ("Sprot") task hung on my computer, I ran several dozen (~40) copies of this task on my machine and all of them completed successfully, without any problem. Only with the help of an auxiliary server were we able to get a similar hang for a long ("RdRp_v2") task.

Actually, finding this problem is a side result of our project, but also a very important one, and we have it already.


Trying to clarify.
So the WU that went on for 3-4 days, and maybe hung, were sent intentionally, and the bug in the application that you were trying to uncover was found from the results returned by the BOINC community running SiDock last week?
And the beta server is down, so you now only beta test in house and then send the new apps straight out?
2) Message boards : News : СmDock "long" and "short" tasks applications (Message 1931)
Posted 23 Jan 2023 by marmot
Post:
After a special "restart test" under Ubuntu, I did the same test for Windows 10 + BOINC 7.16.11. About 1 hour before the tasks completed, I restarted a VM with Windows. The first task, for workunit 49655115, is complete. And as you can see, CPU time is not lost:
77614604	49655066	20 Jan 2023, 19:14:36 UTC	21 Jan 2023, 20:29:48 UTC	Completed and validated	85,387.91	85,211.20	1,004.96	CurieMarieDock 0.2.0 long tasks v2.00 windows_x86_64
77614663	49655123	20 Jan 2023, 19:14:36 UTC	21 Jan 2023, 20:00:16 UTC	Completed and validated	76,331.40	76,198.72	898.37	CurieMarieDock 0.2.0 long tasks v2.00 windows_x86_64
77614655	49655115	20 Jan 2023, 19:14:35 UTC	21 Jan 2023, 18:28:11 UTC	Completed and validated	76,226.16	76,041.19	897.14	CurieMarieDock 0.2.0 long tasks v2.00 windows_x86_64
77614657	49655117	20 Jan 2023, 19:14:35 UTC	21 Jan 2023, 20:29:20 UTC	Completed and validated	81,194.31	80,985.95	955.61	CurieMarieDock 0.2.0 long tasks v2.00 windows_x86_64

Maybe it does not work under some systems, but under the Windows system nearest to me it works.


With these results, could you please comment on my post about checkpointing and some WU that still refuse to checkpoint here: https://www.sidock.si/sidock/forum_thread.php?id=231
3) Message boards : Number crunching : checkpointing (Message 1930)
Posted 23 Jan 2023 by marmot
Post:
Most of the newer WUs are checkpointing within 10 minutes.
Some of them are still checkpointing once when first entering RAM then never again.

This one Workunit 49653004
has been running 2d 19h on my 2700X with 3d 17h reported to go and only a checkpoint at the 1st second.
(going to abort it since restarting the client could lose credit on several more WU's)

The WU reports itself as its 1st time sent to a BOINCer, not a resend.

So are these from the data set before https://www.sidock.si/sidock/forum_thread.php?id=225&postid=1913#1913 where some app routines were dropped?

Will these WU that refuse to checkpoint be exhausted or deprecated soon?

EDIT: (If I look through the WU by time received, the answer may emerge. All the ones sent to a Xeon server at 22 Jan 2023, 6:25:11 UTC refuse to checkpoint. Ones sent later to my Intel laptops and AMD desktops are almost all checkpointing.)
4) Message boards : News : СmDock "long" and "short" tasks applications (Message 1929)
Posted 23 Jan 2023 by marmot
Post:
Folks,

I was away for two days, so no babysitting for the WUs.
Guess what, some tasks claimed to be > 100 hrs and still have to run.
I suspended and resumed those items and they instantly returned to only a fraction of the already reported and wasted CPU time.
That's it for me, I will quit the project.
However I might return from time to time to review the postings. If something improved I might consider coming back.

Take care and best of success.

Chris


It's some improvement.

If the tasks report they are checkpointing every 10 min or less then those are the 'good' WU.

The ones that only create a checkpoint at the 1st second, and never again, are risky and will lose you credit if you restart to get them moving again.

Not sure how many more of these non-checkpointing WU are to come (I'm posting a question about it in Crunching forum).

You could abort every WU that refuses to checkpoint within 11 minutes after they start?

They still have a high chance of completing as long as you do not pause the client (my electric company has peak hours of 9x pricing to avoid).

I'm going to abort 1 of about 40 received on my 2700X in the last day.
5) Message boards : Number crunching : how long can "long tasks" be? (Message 1919)
Posted 21 Jan 2023 by marmot
Post:

Post name of this task and restart BOINC. If this results does not get appropriate credit, I simply add it. :)


I reset BOINC.exe controlling that last, above, WU and it completed in minutes and here is the result:
77561932	44887	15 Jan 2023, 18:36:22 UTC	21 Jan 2023, 1:48:59 UTC	Completed and validated	309,214.12	298,826.10	1,129.15	CurieMarieDock 0.2.0 long tasks v2.00 windows_x86_64


At least it's not 10 credit; not what you'd expect for 309k sec but over 1k.

Thanks for the above strategy to investigate run time and checkpoint in properties.
I'm going to restart all the remaining BOINC clients with 2d+ CPU time WUs and see what happens.
6) Message boards : Number crunching : how long can "long tasks" be? (Message 1915)
Posted 21 Jan 2023 by marmot
Post:
But if you see tasks with an estimate like "days", you can check their properties in BOINC Manager: "CPU time at last checkpoint" and "CPU time". If these values differ greatly (by 1 hour, for example), the task is hung and needs a restart.


My one task is 4 days and 5 hours run time, at 99.4% complete, and stalled.
If I follow your direction then it will get credit for the last 30 minutes, according to Mad_Max.

Actually, all of the newer WUs (Feb 3rd deadline) that are progressing have:
CPU time at last checkpoint: of under 30 minutes.
The WU above, at 99.4% complete shows 1d 5 hours since last checkpoint.

So it seems that if the checkpoints are over 60 minutes ago then the task is hung.
Or is that because these new WU are checkpointing and the old ones weren't and so the run time should equal the checkpoint time?
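
The rule of thumb discussed above (compare "CPU time" with "CPU time at last checkpoint" in the task properties) can be written down as a small check. This is only a sketch: the function name and the 1 hour threshold are assumptions taken from the discussion in this thread, not anything SiDock or BOINC ships, and the stalled example treats run time as CPU time.

```python
from datetime import timedelta

# Hypothetical threshold: the thread suggests a gap of about 1 hour means "hung".
HUNG_GAP = timedelta(hours=1)


def looks_hung(cpu_time: timedelta, cpu_time_at_last_checkpoint: timedelta,
               gap: timedelta = HUNG_GAP) -> bool:
    """Return True when CPU time has run ahead of the last checkpoint by `gap` or more."""
    return cpu_time - cpu_time_at_last_checkpoint >= gap


# A newer WU that checkpointed 20 minutes ago (illustrative numbers).
healthy = looks_hung(timedelta(hours=10), timedelta(hours=9, minutes=40))

# The stalled WU from the post: 4d 5h of run time, last checkpoint 1d 5h ago.
stalled_total = timedelta(days=4, hours=5)
stalled = looks_hung(stalled_total, stalled_total - timedelta(days=1, hours=5))

print(healthy, stalled)
```

A WU that only ever checkpoints in its first second would always trip this check, which is exactly the ambiguity raised in the question above.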

7) Message boards : News : СmDock "long" and "short" tasks applications (Message 1914)
Posted 21 Jan 2023 by marmot
Post:
I suggest you move on to a different project (like I and others) or simply stop until the issues are resolved. It's not worth getting bent out of shape over.

I have WU's still in progress, that I'm spending an hour or more per day baby sitting, and want to keep working on this project.
They need to hear how much time we are spending on managing their project WU's.
We're not directly being paid for this work; but they are.
COVID research is highly paid research and the vaccine companies have made record profits the last 2 years.

I already watch top contributors leaving this project and I will shortly follow if there is no improvement.
Greger, the top contributor of our team at 230k RAC, has pulled out.


To put this in perspective; SiDock has a beta test server that one of my machines is still waiting on tasks from.
They have beta test capability, with volunteers willing to accept the risks of beta WU's, and it could have been used to prevent what happened last week.
8) Message boards : Number crunching : how long can "long tasks" be? (Message 1903)
Posted 20 Jan 2023 by marmot
Post:
but slower PCs estimate up to 9 days run time.

Paul.


My one machine had 20 WU with over 7 days left and the time left kept advancing 5 seconds every second.

Pausing/unpausing didn't help.
I gave up on them as they appeared to be hung.

Anyway, might as well d/l fresh WUs as they will have more realistic deadlines and the credit gained from a 9 day task won't be worth it.
9) Message boards : Number crunching : how long can "long tasks" be? (Message 1902)
Posted 20 Jan 2023 by marmot
Post:
My KabyLake 14nm Laptop shows 3d 4 hours left to complete which I'm not sure it can make by Jan 21.

4 days have been added to the deadline for these tasks on the server side (if I found the right computer). :) The deadline will probably be extended for the next batches of tasks.


That laptop was running an Einstein Intel GPU task.
It lowered the CPU effective frequencies from 2400 to 1300 (which is a far more severe drain than I realized).
I stopped all new Einstein work and all but 2 are going to make the deadline. I wish I'd known about the ability to ask for deadline extensions:

It would be very kind if you could also extend the deadline for these 2 WUs

https://www.sidock.si/sidock/workunit.php?wuid=49613180
https://www.sidock.si/sidock/workunit.php?wuid=49610715

Added. :)


I could use another 2 days to complete the rest of these unaborted tasks on the one machine. I'm user 279.
Thank you.
10) Message boards : Number crunching : Tasks hanging - (Message 1901)
Posted 20 Jan 2023 by marmot
Post:
77522270 49580149 15 Jan 2023, 3:32:33 UTC 17 Jan 2023, 22:39:26 UTC Completed and validated 241,613.00 3,915,919.00 1,647.69 CurieMarieDock 0.2.0 long tasks v2.00 windows_x86_64

This reported runtime is IMPOSSIBLE because it was running on a single thread and the machine returned 48 other WU in the same time period.
Mistakes can occur at different stages: on the machine, in sending, or in processing on the server. Known anomalies usually relate to Windows hosts. Maybe some influence of antivirus, defenders, etc. is at play.

In any case, the "problem" with this result isn't related to hangs. It looks like a mistake in the task's metadata only.


This machine is dedicated to BOINC 20/7 and any service or 3rd party app that can drain resources is disabled.
No anti-virus, no task schedules, no workstation, no local DNS server, only basic IP 4 packeting.
The 3rd party task scheduler was just added yesterday and couldn't be the cause.
I am very careful to examine all new projects' WUs and have never seen this kind of run time reporting.

If the machine's local clock was off 5 minutes on a WU that ran only 5 minutes, maybe the negative time reporting would be interpreted as 3,915,919.00 sec, but the local clock isn't off by 235k seconds...
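
To make the "impossible" claim concrete, here is a rough plausibility check in Python. It assumes that 3,915,919.00 in the result row is a time in seconds accumulated by a single thread, and that one thread cannot accumulate more seconds than the wall-clock window between "sent" and "reported". The function name is made up for illustration.

```python
from datetime import datetime

SECONDS_PER_DAY = 86_400


def max_single_thread_seconds(sent: datetime, reported: datetime) -> float:
    """Upper bound on the time one thread can accumulate between send and report."""
    return (reported - sent).total_seconds()


# Values copied from the result row for task 77522270 (times are UTC).
sent = datetime(2023, 1, 15, 3, 32, 33)
reported = datetime(2023, 1, 17, 22, 39, 26)
claimed_seconds = 3_915_919.00  # the figure questioned in the post

window = max_single_thread_seconds(sent, reported)
plausible = claimed_seconds <= window

print(f"window between sent and reported: {window:,.0f} s")
print(f"claimed value in days: {claimed_seconds / SECONDS_PER_DAY:.1f}")
print(f"plausible on one thread: {plausible}")
```

The window works out to 241,613 s, which happens to equal the other time figure in the same row, while the claimed value is roughly 45 days, so it cannot have been accumulated by one thread inside that window.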
11) Message boards : News : СmDock "long" and "short" tasks applications (Message 1900)
Posted 20 Jan 2023 by marmot
Post:
You all could make management of these huge jobs easier if you multi-threaded the problem and set run limits.

Look at Amicable Numbers user settings.
We get to choose number of threads and run time length.

I've spent about 12 hours over the last 3 days baby sitting these WU's.
And I foresee more management hours to come.
12) Message boards : News : СmDock "long" and "short" tasks applications (Message 1899)
Posted 20 Jan 2023 by marmot
Post:
This has been a bad SiDock day for me.
31 tasks completed successfully across 9 computers.
27 WU of 32 on the daytime BOINC installation on the one server that is running dual BOINC (10 hours morning/ 10 Hours night for avoiding peak electric rate hours)
are increasing expectation time 3 seconds every second.
I paused them but they are set to NOT leave RAM because that loses credit.
They have 35+ hours of runtime already and are looking at another 50+ hours (some said 5-9 days left) and can't beat their deadline, so I aborted them.
The nighttime BOINC install appears to be working but another 5 appear unable to meet the deadline.

*Why did 29 of 32 SiDock WU stop advancing on a machine that pauses 2x a day for 2 hours each period?*
I am switching to a single BOINC install but it will still need to pause 2x a day with a cron job: boinccmd --set_run_mode never 7320
So I need assurances that these WU won't keep stalling out because they paused.
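
As a sketch of the pause described above, the Python helper below only builds the "boinccmd --set_run_mode never <seconds>" command line quoted in the post; it does not run it. The helper name is an assumption for illustration, and how the job gets scheduled (cron, Windows Task Scheduler) is left out.

```python
import shlex


def pause_command(duration_seconds: int) -> list[str]:
    """Build the boinccmd call that suspends computing for a fixed duration."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    # "--set_run_mode never <duration>" is the form used in the post. With a
    # non-zero duration the client should revert to its last permanent run
    # mode once the duration has elapsed.
    return ["boinccmd", "--set_run_mode", "never", str(duration_seconds)]


cmd = pause_command(7320)  # 7320 s = 2 h 2 min
print(shlex.join(cmd))
```

A scheduled job could pass this list to subprocess.run(cmd, check=True) at the start of each peak period. Whether the WU survive the pause depends on the "leave in memory" setting discussed elsewhere in the thread.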

Also, still getting a few ending in error after very long runs.

So today I'm at 31 success and 23 failures.
42% failure rate is abominable!
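
The percentage above is failures divided by all returned tasks. A minimal Python check of that arithmetic, using the counts from this post (the function name is just for illustration):

```python
def failure_rate(successes: int, failures: int) -> float:
    """Fraction of returned tasks that failed."""
    total = successes + failures
    if total == 0:
        raise ValueError("no tasks returned")
    return failures / total


rate = failure_rate(successes=31, failures=23)
print(f"{rate:.1%}")
```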


77279883	49343723	44965	11 Jan 2023, 23:44:08 UTC	18 Jan 2023, 16:16:43 UTC	Aborted	493391.9	486944.8	---	CurieMarieDock 0.2.0 long tasks v2.00 windows_x86_64
77337873	49399835	44888	12 Jan 2023, 19:26:44 UTC	16 Jan 2023, 17:47:08 UTC	Error while computing	13412.96	13299.98	---	CurieMarieDock 0.2.0 long tasks v2.00 windows_x86_64
77337845	49399809	44888	12 Jan 2023, 19:26:44 UTC	16 Jan 2023, 17:16:13 UTC	Error while computing	11594.1	11437.84	---	CurieMarieDock 0.2.0 long tasks v2.00 windows_x86_64
77398977	49459298	44888	13 Jan 2023, 15:36:16 UTC	16 Jan 2023, 16:29:53 UTC	Error while computing	8851.1	8775.33	---	CurieMarieDock 0.2.0 long tasks v2.00 windows_x86_64
77398985	49459314	44888	13 Jan 2023, 15:36:16 UTC	16 Jan 2023, 16:53:02 UTC	Error while computing	10180.91	10069.2	---	CurieMarieDock 0.2.0 long tasks v2.00 windows_x86_64
77399065	49459325	44888	13 Jan 2023, 15:38:10 UTC	16 Jan 2023, 17:00:55 UTC	Error while computing	10682.11	10552.55	---	CurieMarieDock 0.2.0 long tasks v2.00 windows_x86_64
77440386	49499316	44898	14 Jan 2023, 3:44:08 UTC	19 Jan 2023, 2:10:12 UTC	Aborted	180865.64	161821.8	---	CurieMarieDock 0.2.0 long tasks v2.00 windows_x86_64
77440796	49499732	44900	14 Jan 2023, 3:50:29 UTC	15 Jan 2023, 12:52:36 UTC	Error while computing	238.91	209.97	---	CurieMarieDock 0.2.0 long tasks v2.00 windows_x86_64
77446164	49505088	44965	14 Jan 2023, 5:27:09 UTC	18 Jan 2023, 16:16:43 UTC	Aborted	331879.66	327318.9	---	CurieMarieDock 0.2.0 long tasks v2.00 windows_x86_64
77557978	49611032	44887	15 Jan 2023, 16:19:56 UTC	20 Jan 2023, 13:30:06 UTC	Aborted	278329.64	268281.6	---	CurieMarieDock 0.2.0 long tasks v2.00 windows_x86_64
77558574	49611655	44886	15 Jan 2023, 16:39:03 UTC	17 Jan 2023, 14:04:34 UTC	Error while computing	91007.37	80447.98	---	CurieMarieDock 0.2.0 long tasks v2.00 windows_x86_64
77561448	49614522	44886	15 Jan 2023, 18:16:18 UTC	18 Jan 2023, 2:09:25 UTC	Error while computing	95553.32	85968.58	---	CurieMarieDock 0.2.0 long tasks v2.00 windows_x86_64
77561513	49614582	44886	15 Jan 2023, 18:17:09 UTC	20 Jan 2023, 2:23:25 UTC	Aborted	140384.21	122982.7	---	CurieMarieDock 0.2.0 long tasks v2.00 windows_x86_64
77575916	49628894	44888	16 Jan 2023, 14:00:54 UTC	20 Jan 2023, 14:54:31 UTC	Aborted	160172.45	70500.13	---	CurieMarieDock 0.2.0 long tasks v2.00 windows_x86_64
77575917	49628899	44888	16 Jan 2023, 14:00:54 UTC	20 Jan 2023, 14:54:31 UTC	Aborted	164651.93	74874.86	---	CurieMarieDock 0.2.0 long tasks v2.00 windows_x86_64
77575927	49628909	44888	16 Jan 2023, 14:00:55 UTC	17 Jan 2023, 18:37:29 UTC	Error while computing	4871.7	4784.16	---	CurieMarieDock 0.2.0 long tasks v2.00 windows_x86_64
77575864	49628849	44888	16 Jan 2023, 14:01:56 UTC	20 Jan 2023, 14:46:51 UTC	Aborted	140432.11	35749.41	---	CurieMarieDock 0.2.0 long tasks v2.00 windows_x86_64
77576808	49629786	44888	16 Jan 2023, 16:29:53 UTC	20 Jan 2023, 14:54:31 UTC	Aborted	142096.72	37286.89	---	CurieMarieDock 0.2.0 long tasks v2.00 windows_x86_64
77576812	49629790	44888	16 Jan 2023, 16:29:53 UTC	20 Jan 2023, 14:54:31 UTC	Aborted	152900.31	73801.61	---	CurieMarieDock 0.2.0 long tasks v2.00 windows_x86_64
77576971	49629949	44888	16 Jan 2023, 16:53:02 UTC	20 Jan 2023, 14:54:31 UTC	Aborted	151144.96	70712.69	---	CurieMarieDock 0.2.0 long tasks v2.00 windows_x86_64
77576972	49629952	44888	16 Jan 2023, 16:53:02 UTC	20 Jan 2023, 14:46:51 UTC	Aborted	152046.29	71703.56	---	CurieMarieDock 0.2.0 long tasks v2.00 windows_x86_64
77576991	49629969	44888	16 Jan 2023, 17:00:55 UTC	20 Jan 2023, 14:54:31 UTC	Aborted	148436.82	68058.72	---	CurieMarieDock 0.2.0 long tasks v2.00 windows_x86_64
77576993	49629971	44888	16 Jan 2023, 17:00:55 UTC	20 Jan 2023, 14:46:51 UTC	Aborted	149894.3	45036	---	CurieMarieDock 0.2.0 long tasks v2.00 windows_x86_64
77577026	49630004	44888	16 Jan 2023, 17:16:13 UTC	20 Jan 2023, 14:54:31 UTC	Aborted	146550.13	66754.64	---	CurieMarieDock 0.2.0 long tasks v2.00 windows_x86_64
77577052	49630030	44888	16 Jan 2023, 17:16:13 UTC	20 Jan 2023, 14:54:31 UTC	Aborted	144242.85	65149.5	---	CurieMarieDock 0.2.0 long tasks v2.00 windows_x86_64
77577165	49630145	44888	16 Jan 2023, 17:39:47 UTC	20 Jan 2023, 14:54:31 UTC	Aborted	146132.22	65806.09	---	CurieMarieDock 0.2.0 long tasks v2.00 windows_x86_64
77577227	49630209	44888	16 Jan 2023, 17:47:08 UTC	20 Jan 2023, 14:54:31 UTC	Aborted	145234.13	64319.44	---	CurieMarieDock 0.2.0 long tasks v2.00 windows_x86_64
77577289	49630271	44888	16 Jan 2023, 17:55:21 UTC	20 Jan 2023, 14:55:11 UTC	Aborted	144869.21	64534.8	---	CurieMarieDock 0.2.0 long tasks v2.00 windows_x86_64
77577296	49630270	44888	16 Jan 2023, 17:55:21 UTC	20 Jan 2023, 14:55:11 UTC	Aborted	144408.69	63322.3	---	CurieMarieDock 0.2.0 long tasks v2.00 windows_x86_64
77577308	49630289	44888	16 Jan 2023, 17:58:26 UTC	20 Jan 2023, 14:54:31 UTC	Aborted	144100.1	62643.67	---	CurieMarieDock 0.2.0 long tasks v2.00 windows_x86_64
77577263	49630238	44888	16 Jan 2023, 17:59:20 UTC	20 Jan 2023, 14:54:31 UTC	Aborted	144041.11	63361.22	---	CurieMarieDock 0.2.0 long tasks v2.00 windows_x86_64
77577321	49630296	44888	16 Jan 2023, 18:00:14 UTC	20 Jan 2023, 14:54:31 UTC	Aborted	143210.83	63137.41	---	CurieMarieDock 0.2.0 long tasks v2.00 windows_x86_64
77585195	49638144	13277	17 Jan 2023, 17:47:15 UTC	19 Jan 2023, 2:09:39 UTC	Aborted	10668.42	10534.66	---	CurieMarieDock 0.2.0 long tasks v2.00 windows_x86_64
77585527	49638492	44888	17 Jan 2023, 18:37:29 UTC	20 Jan 2023, 14:54:31 UTC	Aborted	142162.91	52501.22	---	CurieMarieDock 0.2.0 long tasks v2.00 windows_x86_64
77580116	49633069	44893	17 Jan 2023, 3:07:02 UTC	20 Jan 2023, 14:29:05 UTC	Aborted	192815.89	179682.8	---	CurieMarieDock 0.2.0 long tasks v2.00 windows_x86_64
77580105	49633104	44895	17 Jan 2023, 3:09:24 UTC	20 Jan 2023, 14:29:18 UTC	Aborted	185638.03	180320.3	---	CurieMarieDock 0.2.0 long tasks v2.00 windows_x86_64
77581948	49634901	13277	17 Jan 2023, 8:53:04 UTC	19 Jan 2023, 2:09:39 UTC	Aborted	28511.6	26879.02	---	CurieMarieDock 0.2.0 long tasks v2.00 windows_x86_64
77588447	49641410	44886	18 Jan 2023, 4:06:39 UTC	20 Jan 2023, 4:28:35 UTC	Error while computing	128506.93	112809.9	---	CurieMarieDock 0.2.0 long tasks v2.00 windows_x86_64
 
13) Message boards : News : СmDock "long" and "short" tasks applications (Message 1873)
Posted 18 Jan 2023 by marmot
Post:
Look at this, this machine ran the hottest temperature LLR SRBase WU's for weeks without a single failure.

77561448 49614522 44886 15 Jan 2023, 18:16:18 UTC 18 Jan 2023, 2:09:25 UTC Error while computing 95,553.32 85,968.58 --- CurieMarieDock 0.2.0 long tasks v2.00 windows_x86_64
77558574 49611655 44886 15 Jan 2023, 16:39:03 UTC 17 Jan 2023, 14:04:34 UTC Error while computing 91,007.37 80,447.98 --- CurieMarieDock 0.2.0 long tasks v2.00 windows_x86_64

That's an entire wasted day for each of those cores.

Some of these WU are going to run 2-3 days ending in errors without any hardware cause?

This is not acceptable.

And the results are being purged way too quickly so we can not evaluate the run results and find the issues.

6 tasks all report less than 5% complete, have been running for 12-15 hours, and somehow they are going to complete in under 3 days (according to BOINC, which is using past WU run time data)?
15 hours for 5% works out to 12.5 days of total run time, and for the one at 1.67% after 13 hours the total would be roughly 32 days.
And the failure rate was 19% per day on my 5 machines.
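
The estimate above is a linear extrapolation from percent complete. A small Python sketch of that arithmetic, assuming progress is linear in time (which BOINC's progress figure may not be) and using a made-up function name:

```python
def projected_hours(elapsed_hours: float, fraction_done: float) -> tuple[float, float]:
    """Return (total, remaining) hours, assuming progress is linear in time."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    total = elapsed_hours / fraction_done
    return total, total - elapsed_hours


# (elapsed hours, fraction complete) for the two cases mentioned in the post.
for elapsed, done in [(15, 0.05), (13, 0.0167)]:
    total, remaining = projected_hours(elapsed, done)
    print(f"{done:.2%} after {elapsed} h: "
          f"total {total / 24:.1f} d, remaining {remaining / 24:.1f} d")
```

Note that dividing elapsed time by the fraction done gives the total run time; the time still to go is that total minus the hours already spent.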

Maybe this is the result of the percentages not being accurate after a BOINC restart, as Mad Max found; but Mad Max reported that the percentages displayed corrected themselves from checkpointing.

I'm concerned.
14) Message boards : Number crunching : Tasks hanging - (Message 1871)
Posted 18 Jan 2023 by marmot
Post:

This reported runtime is IMPOSSIBLE because it was running on a single thread and the machine returned 48 other WU in the same time period.

Something is wrong with these work units.

It loses (resets to zero) CPU time stats after each restart (a full restart without leaving in RAM). So only the CPU/elapsed time since the last app restart is counted. Looks like another bug...
I already posted about it in detail in another thread before I saw your message: https://www.sidock.si/sidock/forum_thread.php?id=225&postid=1866#1866


Did you see any results reporting 3,915,919 seconds?

(Oh no! All my valid results have been purged! There was another and I was trying to check if it was an identical 3,915,919. It was over 3.9 million seconds).

So, I have to go and edit all my machines' BOINC settings to retain apps in RAM.... :sigh:


Although such long tasks can be a problem in themselves - admins need to at least increase the BOINC deadline setting for them, because weaker computers (or modern but not working 24/7, but only a few hours a day) simply will not have enough time to finish all calculations before the deadline.


Agreed, and I made that point several times in several messages.
15) Message boards : News : СmDock "long" and "short" tasks applications (Message 1868)
Posted 18 Jan 2023 by marmot
Post:
This new app loses CPU/elapsed time stats if restarted (full restarts, without leaving in memory), and so loses points/credits as well.
At the same time, actual progress is NOT lost. That is, checkpoints are working. After restarting the app (a BOINC restart, or BOINC Manager just switching to another project without the "leave in memory while suspended" option active), calculations continue from the last checkpoint as intended, but all the stats counters reported to BOINC (elapsed time, CPU time and time elapsed since the last checkpoint) are reset to zero.

The BOINC progress bar (% of task completed) also resets to zero after each restart, but it returns to the correct value after some time (usually a few minutes). The time counters are not restored.


So we are in a catch-22.
Hoar Frost says we need to restart the stuck tasks to get them to work but if we do we lose all the earned credit so far.

I have to pause my BOINC from 6-8am and 6-8pm every weekday because the electric company charges 9x normal rates during those periods.
The SiDock WU have been removed from RAM 2x per day. I have had 15 WU fail and 7 not validate of 113 total: 19.5% failure rate.

These WU are not ready for prime time... Too many unresolved issues. Use the SiDock test server for this and let the issues get worked out by people who know it's a beta test.
16) Message boards : Number crunching : how long can "long tasks" be? (Message 1867)
Posted 18 Jan 2023 by marmot
Post:
All of my tasks are this long; which Bryn Mawr said was the intent.

My KabyLake 14nm Laptop shows 3d 4 hours left to complete which I'm not sure it can make by Jan 21.

The electric company put 4 hours of high cost peak periods that I have to pause the machines for daily. BOINC does not support 2 pauses per day. Only through task scheduler can boinccmd be used to dual pause boinc.

The laptop is low power so I pause it for one period but the servers got a dual BOINC install and so half the SiDock long run for 10 hours in daylight and the other half for 10 hours at night.

Not sure any of those can complete by the 21st running only 10 hours a day. They are server Xeons, but 4th gen and older.

They easily completed even the longest SRBase the last month on the 10 hour/10 hour dual plan.

SRBase provided us a multithread option we can set up in an app_config.xml to assure we'd meet the deadlines.

Is multithreading planned here?
17) Message boards : Number crunching : can this result be correct? (Message 1865)
Posted 18 Jan 2023 by marmot
Post:
There were 2 results with that impossible run time of over 45 days within 1 day on a single thread.

Also, there were 15 that ended in error states yesterday.
7 that couldn't validate.
Given that 91 completed, that's a 19.5% failure rate.
18) Message boards : Number crunching : Tasks hanging - (Message 1862)
Posted 18 Jan 2023 by marmot
Post:
I haven't looked at the logs but most all my WU's are showing 2d+ left till completion and the returned credit at Free-DC for this project has taken a sharp nosedive today implying it's a systemic problem in the WU's.


No, they’ve moved from tasks taking an hour or so to tasks taking a day or so. Credits will pick up when the long tasks finish and the average 1,500 credits per task kick in.


Except for two fake results like this one from my machines:

77522270 49580149 15 Jan 2023, 3:32:33 UTC 17 Jan 2023, 22:39:26 UTC Completed and validated 241,613.00 3,915,919.00 1,647.69 CurieMarieDock 0.2.0 long tasks v2.00 windows_x86_64

This reported runtime is IMPOSSIBLE because it was running on a single thread and the machine returned 48 other WU in the same time period.

Something is wrong with these work units.
There were 15 that ended in error states yesterday.
7 that couldn't validate.
Given that 91 completed, that's a 19.5% failure rate.

Also, the deadline is too close. Our local electric company forced new rate programs and meters on us. 6-8am and 6-8pm are 31 cents per kwh the rest of the day is 4 cents.
BOINC doesn't support 2 pause periods so moved to dual installs.
One runs 10 hours in the day the other 10 hours at night.
These new peak/off-peak programs are a paradigm shift in USA electric power companies, so others crunching BOINC will have to face this soon.

An 8th gen laptop should be able to complete one of these before a deadline, but with 4 hours lost per day to the rate plan, and it showing 3 days 4 hours till a Jan 21 deadline, it looks unlikely to finish.
We'll need longer deadlines or a switch to multi-thread these WU's.
19) Message boards : Number crunching : Tasks hanging - (Message 1859)
Posted 18 Jan 2023 by marmot
Post:
I haven't looked at the logs but most all my WU's are showing 2d+ left till completion and the returned credit at Free-DC for this project has taken a sharp nosedive today implying it's a systemic problem in the WU's.
20) Message boards : Number crunching : can this result be correct? (Message 1858)
Posted 18 Jan 2023 by marmot
Post:
This machine was running multiple WUs and none were taking up all the cores, and 40+ other WU finished today, so how can this WU have 45 days of run time on a single core in 2 days?

77522270 49580149 15 Jan 2023, 3:32:33 UTC 17 Jan 2023, 22:39:26 UTC Completed and validated 241,613.00 3,915,919.00 1,647.69 CurieMarieDock 0.2.0 long tasks v2.00 windows_x86_64


Are the WU's multithreaded?

Still, 45 other WU's completed in the same day so not sure where it found cores to multithread to.


